02-28-2012 02:05 AM - edited 03-01-2019 10:18 AM
Hi,
I installed Red Hat 5u7 x86_64 on a UCS blade using SAN boot and then mapped 15 volumes to run I/O. The OS hits a kernel panic in less than 2 hours,
and then I have to force a power cycle of the server.
The card used is M81KR.
Does anyone know what the problem is? Thanks.
========================
Feb 26 19:43:59 arcucsb200095e0 kernel: INFO: task kjournald:6427 blocked for more than 120 seconds.
Feb 26 19:43:59 arcucsb200095e0 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 26 19:43:59 arcucsb200095e0 kernel: kjournald D ffffffff801546d1 0 6427 465 6430 6424 (L-TLB)
Feb 26 19:43:59 arcucsb200095e0 kernel: ffff81066b6f1cf0 0000000000000046 000000006e420048 ffff8106853ea9d0
Feb 26 19:43:59 arcucsb200095e0 kernel: 00000001ffffffff 000000000000000a ffff810c6c31b080 ffff8106853f07a0
Feb 26 19:43:59 arcucsb200095e0 kernel: 00000184ac3ecc0b 0000000000000edf ffff810c6c31b268 0000000b1693b738
Feb 26 19:43:59 arcucsb200095e0 kernel: Call Trace:
Feb 26 19:43:59 arcucsb200095e0 kernel: [<ffffffff8006ec8f>] do_gettimeofday+0x40/0x90
Feb 26 19:43:59 arcucsb200095e0 kernel: [<ffffffff800155d3>] sync_buffer+0x0/0x3f
Feb 26 19:43:59 arcucsb200095e0 kernel: [<ffffffff800637ce>] io_schedule+0x3f/0x67
Feb 26 19:43:59 arcucsb200095e0 kernel: [<ffffffff8001560e>] sync_buffer+0x3b/0x3f
Feb 26 19:43:59 arcucsb200095e0 kernel: [<ffffffff800639fa>] __wait_on_bit+0x40/0x6e
Feb 26 19:43:59 arcucsb200095e0 kernel: [<ffffffff800155d3>] sync_buffer+0x0/0x3f
Feb 26 19:43:59 arcucsb200095e0 kernel: [<ffffffff80063a94>] out_of_line_wait_on_bit+0x6c/0x78
Feb 26 19:43:59 arcucsb200095e0 kernel: [<ffffffff800a2e8b>] wake_bit_function+0x0/0x23
Feb 26 19:43:59 arcucsb200095e0 kernel: [<ffffffff88033a41>] :jbd:journal_commit_transaction+0x553/0x10aa
Feb 26 19:43:59 arcucsb200095e0 kernel: [<ffffffff8003d85b>] lock_timer_base+0x1b/0x3c
Feb 26 19:43:59 arcucsb200095e0 kernel: [<ffffffff8004ad98>] try_to_del_timer_sync+0x7f/0x88
Feb 26 19:43:59 arcucsb200095e0 kernel: [<ffffffff88037662>] :jbd:kjournald+0xc1/0x213
Feb 26 19:43:59 arcucsb200095e0 kernel: [<ffffffff800a2e5d>] autoremove_wake_function+0x0/0x2e
Feb 26 19:43:59 arcucsb200095e0 kernel: [<ffffffff880375a1>] :jbd:kjournald+0x0/0x213
Feb 26 19:43:59 arcucsb200095e0 kernel: [<ffffffff800a2c45>] keventd_create_kthread+0x0/0xc4
Feb 26 19:43:59 arcucsb200095e0 kernel: [<ffffffff80032722>] kthread+0xfe/0x132
Feb 26 19:43:59 arcucsb200095e0 kernel: [<ffffffff8005dfb1>] child_rip+0xa/0x11
Feb 26 19:43:59 arcucsb200095e0 kernel: [<ffffffff800a2c45>] keventd_create_kthread+0x0/0xc4
Feb 26 19:43:59 arcucsb200095e0 kernel: [<ffffffff80032624>] kthread+0x0/0x132
Feb 26 19:43:59 arcucsb200095e0 kernel: [<ffffffff8005dfa7>] child_rip+0x0/0x11
====================
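The excerpt above is the kernel's hung-task warning ("blocked for more than 120 seconds") followed by a call trace for kjournald. When the log is long, it can help to list every such warning first. The following is a minimal sketch, assuming classic syslog-formatted lines like the ones quoted; the function name `find_hung_tasks` and the `sample` list are hypothetical and only for illustration.

```python
import re

# Matches the hung-task warning the kernel writes to syslog, for example:
#   "... kernel: INFO: task kjournald:6427 blocked for more than 120 seconds."
HUNG_TASK = re.compile(
    r"kernel: INFO: task (?P<name>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(lines):
    """Return (timestamp, task name, pid, seconds) for each hung-task warning."""
    hits = []
    for line in lines:
        m = HUNG_TASK.search(line)
        if m:
            # Classic syslog timestamps occupy the first 15 characters,
            # e.g. "Feb 26 19:43:59".
            hits.append(
                (line[:15], m.group("name"), int(m.group("pid")), int(m.group("secs")))
            )
    return hits


# Two lines copied from the log excerpt in this post.
sample = [
    "Feb 26 19:43:59 arcucsb200095e0 kernel: INFO: task kjournald:6427 blocked for more than 120 seconds.",
    "Feb 26 19:43:59 arcucsb200095e0 kernel: Call Trace:",
]

for hit in find_hung_tasks(sample):
    print(hit)
```

To scan a whole log, pass an open file object instead of `sample`; on RHEL the kernel messages normally land in /var/log/messages.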
Many thanks for your help.
02-28-2012 05:32 AM
Hello,
Please provide the following information:
UCSM version
Blade model
CIMC, BIOS and M81KR adapter firmware version
ENIC and FNIC driver version
What type of storage are you trying to mount?
Are you able to consistently reproduce the issue?
Padma
02-29-2012 02:10 AM
Hi
Thanks for your help.
UCSM version: 2.0(1s)
Blade model: Cisco UCS B2000 M1
CIMC: 2.0(1q)
BIOS: S5500.2.0.1a.0.081120111606
M81KR adapter firmware version: 2.0(1q)
ENIC and FNIC driver version: 5.0(3)N2(2.1s)
Type of storage: IBM SVC
Thanks
02-29-2012 02:40 AM
Hello
You can get the driver versions by running the following commands in RHEL:
modinfo enic
modinfo fnic
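If only the version strings are needed, `modinfo -F version <module>` prints just that field. Below is a rough sketch that collects it for both drivers. It assumes Python 3.7 or later and that `modinfo` is on the PATH; the stock interpreter on RHEL 5 is older, so this would need adapting there. The helper names `driver_version` and `report` are hypothetical.

```python
import subprocess


def driver_version(module):
    """Return the version string modinfo reports for a kernel module, or None."""
    try:
        # "modinfo -F version <module>" prints only the module's version field.
        result = subprocess.run(
            ["modinfo", "-F", "version", module],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        return None  # modinfo is not installed or not on PATH
    version = result.stdout.strip()
    # A non-zero exit status means modinfo could not find the module.
    return version if result.returncode == 0 and version else None


def report(modules=("enic", "fnic")):
    """Build one 'module: version' line per driver, ready to paste into a reply."""
    return ["%s: %s" % (m, driver_version(m) or "not found") for m in modules]
```

Calling `print("\n".join(report()))` on the blade gives one line per driver; "not found" means the module or the `modinfo` binary was not available to the script.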
Can you consistently reproduce the crash?
I am not aware of any known issues.
Please open a TAC service request with the above information and also upload the Red Hat logs (which can be collected by running the "sosreport -k rpm.rpmva=off" command).
Padma