athukral

Introduction:

This document describes how to convert a RAID configuration with multiple logical drives into a single RAID logical drive in order to resolve the rebooting issue.

Symptoms:

The device automatically reboots after roughly 10 to 20 minutes when it is configured in HA mode. This is a known issue that occurs when the RAID controller shows multiple logical drives and NAC is configured in HA. The workaround is to redo the RAID configuration so that the box has a single logical RAID drive, or to replace the box with one that has a single logical RAID drive.

The following logs confirm this issue:

Jun 14 16:24:13 CNIT-CAM-3a kernel: Software Watchdog Timer: 0.07 initialized. soft_noboot=0 soft_margin=600 sec (nowayout= 0)

Jun 14 16:24:14 CNIT-CAM-3a su(pam_unix)[4908]: session opened for user root by (uid=0)

Jun 14 16:24:14 CNIT-CAM-3a su(pam_unix)[4908]: session closed for user root

Jun 14 16:24:15 CNIT-CAM-3a su(pam_unix)[5007]: session opened for user postgres by(uid=0)

Jun 14 16:24:20 CNIT-CAM-3a su(pam_unix)[5007]: session closed for user postgres

Jun 14 16:24:21 CNIT-CAM-3a su(pam_unix)[5557]: session opened for user postgres by (uid=0)

Jun 14 16:24:24 CNIT-CAM-3a kernel: SoftDog: Unexpected close, not stopping watchdog!

Jun 14 16:24:25 CNIT-CAM-3a su(pam_unix)[5557]: session closed for user postgres

Jun 14 16:24:31 CNIT-CAM-3a su(pam_unix)[6472]: session opened for user postgres by(uid=0)

Jun 14 16:24:36 CNIT-CAM-3a su(pam_unix)[6472]: session closed for user postgres

Jun 14 16:24:37 CNIT-CAM-3a su(pam_unix)[6976]: session opened for user postgres by(uid=0)

Jun 14 16:25:03 CNIT-CAM-3a su(pam_unix)[6976]: session closed for user postgres

Jun 14 16:25:04 CNIT-CAM-3a su(pam_unix)[9083]: session opened for user postgres by(uid=0)

Jun 14 16:25:05 CNIT-CAM-3a su(pam_unix)[9083]: session closed for user postgres

Jun 14 16:25:06 CNIT-CAM-3a su(pam_unix)[9230]: session opened for user postgres by(uid=0)

Jun 14 16:25:13 CNIT-CAM-3a su(pam_unix)[9230]: session closed for user postgres

Jun 14 16:25:14 CNIT-CAM-3a su(pam_unix)[9889]: session opened for user postgres by(uid=0)

Jun 14 16:25:42 CNIT-CAM-3a su(pam_unix)[9889]: session closed for user postgres

Jun 14 16:25:43 CNIT-CAM-3a su(pam_unix)[12203]: session opened for user postgres by(uid=0)

Jun 14 16:25:43 CNIT-CAM-3a su(pam_unix)[12203]: session closed for user postgres

Jun 14 16:25:44 CNIT-CAM-3a su(pam_unix)[12331]: session opened for user postgres by(uid=0)

Jun 14 16:25:45 CNIT-CAM-3a su(pam_unix)[12331]: session closed for user postgres

Jun 14 16:25:46 CNIT-CAM-3a su(pam_unix)[12543]: session opened for user root by (uid=0)

Jun 14 16:25:46 CNIT-CAM-3a su(pam_unix)[12543]: session closed for user root

Jun 14 16:25:47 CNIT-CAM-3a su(pam_unix)[12643]: session opened for user root by (uid=0)

Jun 14 16:25:48 CNIT-CAM-3a su(pam_unix)[12643]: session closed for user root

Jun 14 16:25:49 CNIT-CAM-3a su(pam_unix)[12741]: session opened for user root by (uid=0)

Jun 14 16:25:49 CNIT-CAM-3a su(pam_unix)[12741]: session closed for user root

Jun 14 16:25:50 CNIT-CAM-3a su(pam_unix)[12840]: session opened for user root by (uid=0)

Jun 14 16:25:50 CNIT-CAM-3a su(pam_unix)[12840]: session closed for user root

Jun 14 16:34:24 CNIT-CAM-3a kernel:  [<c012cd94>] queue_work+0x40/0x52

Jun 14 16:34:24 CNIT-CAM-3a kernel:  [<f8935021>] watchdog_fire+0x21/0x44 [softdog]

Jun 14 16:34:24 CNIT-CAM-3a kernel:  [<c0126e8a>] run_timer_softirq+0xdd/0x1ad

Jun 14 16:34:24 CNIT-CAM-3a kernel:  [<c01232f2>] __do_softirq+0x62/0xcf

Jun 14 16:34:25 CNIT-CAM-3a kernel:  [<c012338c>] do_softirq+0x2d/0x33

Jun 14 16:34:25 CNIT-CAM-3a kernel:  [<c010370c>] apic_timer_interrupt+0x1c/0x24

Jun 14 16:34:25 CNIT-CAM-3a kernel:  [<c0101166>] mwait_idle+0x25/0x43

Jun 14 16:34:25 CNIT-CAM-3a kernel:  [<c01010c1>] cpu_idle+0x4b/0x60

Jun 14 16:36:33 CNIT-CAM-3a syslogd 1.4.1: restart.

Jun 14 16:36:33 CNIT-CAM-3a kernel: klogd 1.4.1, log source = /proc/kmsg started.
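
The pattern in these logs can also be checked with a short script. The following is only a minimal sketch, not a Cisco tool: the function names and the placeholder year are assumptions made for illustration, and the sample lines are copied from the excerpt above. It measures the time between the "SoftDog: Unexpected close" message and the first watchdog_fire trace line. In the logs above that gap equals the soft_margin=600 value reported when the watchdog was initialized, which is consistent with the watchdog no longer being refreshed after the unexpected close.

```python
import re
from datetime import datetime

# Sample lines copied from the log excerpt in this post.
SAMPLE = """\
Jun 14 16:24:13 CNIT-CAM-3a kernel: Software Watchdog Timer: 0.07 initialized.
Jun 14 16:24:24 CNIT-CAM-3a kernel: SoftDog: Unexpected close, not stopping watchdog!
Jun 14 16:34:24 CNIT-CAM-3a kernel:  [<f8935021>] watchdog_fire+0x21/0x44 [softdog]
Jun 14 16:36:33 CNIT-CAM-3a syslogd 1.4.1: restart.
"""

# Classic syslog timestamp: month name, day, HH:MM:SS (no year is logged).
STAMP = re.compile(r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s")

# Assumption: the year is irrelevant here, so a placeholder is used
# only to make the timestamp parseable.
PLACEHOLDER_YEAR = "2000"


def parse_stamp(line):
    """Return the line's timestamp as a datetime, or None if it has none."""
    match = STAMP.match(line)
    if match is None:
        return None
    return datetime.strptime(
        PLACEHOLDER_YEAR + " " + match.group(1), "%Y %b %d %H:%M:%S"
    )


def seconds_from_close_to_fire(log_text):
    """Seconds between the last 'Unexpected close' seen before the first
    watchdog_fire line and that watchdog_fire line; None if either is absent."""
    closed = None
    for line in log_text.splitlines():
        when = parse_stamp(line)
        if when is None:
            continue
        if "SoftDog: Unexpected close" in line:
            closed = when
        elif "watchdog_fire" in line:
            if closed is None:
                return None
            return int((when - closed).total_seconds())
    return None


if __name__ == "__main__":
    print(seconds_from_close_to_fire(SAMPLE))
```

To run it against a real box, pass the contents of the saved syslog file to seconds_from_close_to_fire instead of SAMPLE. A result of None only means the two messages were not both found in the text supplied.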

CHECKING WHETHER RAID IS CONFIGURED CORRECTLY OR NOT

During boot, press F8 to enter Option ROM Configuration for Arrays, then use "logicaldrive * show" to see whether the box has 2 logical drives. If it does, the RAID needs to be reconfigured.

On a properly configured box, "CLI>logicaldrive * show" should display a single line like the following:

Logical Drive #1, RAID 1+0, 136.7 GB, Status OK

If you see 2 lines like the following, the RAID is not properly configured:

Logical Drive #1, RAID 1+0, 86.3 GB, Status OK

Logical Drive #2, RAID 1+0, 86.3 GB, Status OK
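
If the console output is captured as text (for example, copied from a serial console session), the same check can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and the regular expression assumes the output looks exactly like the lines shown above, which may not hold for every controller firmware.

```python
import re

# Matches lines such as: Logical Drive #1, RAID 1+0, 136.7 GB, Status OK
# Assumption: the format is the one shown in this post.
DRIVE_LINE = re.compile(
    r"Logical Drive #(\d+),\s*RAID\s+([\d+]+),\s*([\d.]+)\s*GB,\s*Status\s+(\w+)"
)


def parse_logical_drives(output):
    """Return one dict per logical drive found in the captured output."""
    drives = []
    for match in DRIVE_LINE.finditer(output):
        drives.append(
            {
                "id": int(match.group(1)),
                "raid": match.group(2),
                "size_gb": float(match.group(3)),
                "status": match.group(4),
            }
        )
    return drives


def raid_needs_rebuild(output):
    """True when more than one logical drive is listed."""
    return len(parse_logical_drives(output)) > 1


GOOD = "Logical Drive #1, RAID 1+0, 136.7 GB, Status OK\n"
BAD = (
    "Logical Drive #1, RAID 1+0, 86.3 GB, Status OK\n"
    "Logical Drive #2, RAID 1+0, 86.3 GB, Status OK\n"
)

if __name__ == "__main__":
    print(raid_needs_rebuild(GOOD))
    print(raid_needs_rebuild(BAD))
```

Note that raid_needs_rebuild also returns False when no logical drive line is recognized at all, so an empty or differently formatted capture should be checked by hand.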

WORKAROUND

In this case, rebuilding the RAID array and reimaging the box resolves the problem. Use the following procedure.

Please note that this procedure erases and reconfigures the disks. CAM has to be reinstalled afterwards, so first take all the necessary steps to back up the DB and save the existing configuration.

This procedure applies to a box with the command-line Option ROM Configuration for Arrays. For a box with the GUI, follow the UI procedure to delete the 2 existing logical drives and then create 1 logical drive that contains all 4 disks.

During boot, press F8 to enter Option ROM Configuration for Arrays, then get into the CLI.

"CLI>logicaldrive * show" will list 2 logical drives

"CLI>logicaldrive 1 show" will list 2 disks under logicaldrive

"CLI>logicaldrive 2 show" will list 2 disks under logicaldrive

"CLI>logicaldrive 1 delete" to delete logical drive 1

"CLI>logicaldrive 2 delete" to delete logical drive 2

"CLI>controller create type=logicaldrive raid=10 drives=1,2,3,4" to create one logical drive that contains 4 disks

"CLI>exit" to save current configuration, continue the booting process. You may now install CAM/CAS

Hope this is informative. I want to thank you for viewing.
