07-03-2010 01:14 PM
I've had this issue with 2 RMA'd OE612s already. They originally shipped with 4.0.19, which worked fine, but failed on upgrade to 4.2.1.
Recovery attempt #1 with the 4.0.19 Rescue disk displayed:
MODEL: UNKNOWN
FLASH: found, directory validated
COOKIE: invalid
IMAGE: NONE
FLASHDEV: /dev/hda
Installer Main Menu:
1. Configure Network
2. Manufacture flash
3. Install flash cookie
4. Install flash image from network
5. Install flash image from cdrom
6. Install flash image from disk
7. Wipe out disks and install .bin image
8. Exit (and reboot)
Choice [0]: 3
Unknown model, cannot create cookie.
Subsequent recovery attempts using the Rescue disks from 4.1.7 and 4.2.1 detected the model as OE512. Option 2 creates a cookie for MODEL OE512, which I wrote with option 3. Option 5 works, but trying to write the .bin image on 4.1.7 displays this:
Choice [0]: 8
Enter full URL of .bin image to install.
ftp://[user:pass@]ip_addr/path/to/file
http://[user:pass@]ip_addr/path/to/file
URL for .bin image [file:/cdrom/images/WAAS41.bin]:
Continue? This will wipe out all disks! [n]: y
Scrogging done
Saving random seed...
dd: opening `/state/random-seed': No such file or directory
No volume groups found
/ruby/bin/ruby_disk: First disk is missing, bad, marked bad, disabled via shutdown, or it has unknown data.
/ruby/bin/ruby_disk: Second disk is bad, missing, or shutdown; bailing out
/ruby/bin/ruby_disk: First disk is missing, bad, marked bad, disabled via shutdown, or it has unknown data.
/ruby/bin/ruby_disk: Second disk is bad, missing, or shutdown; bailing out
ssmgr: open(/state/safestate, O_WRONLY|O_CREAT) failed: No such file or directory
ssmgr: op_open or media_open
ssmgr: open(/state/safestate, O_WRONLY|O_CREAT) failed: No such file or directory
ssmgr: open(/state/safestate, O_WRONLY|O_CREAT) failed: No such file or directory
ssmgr: op_create or media_init
SSMGR RETURNING: 10 (No such file or directory)
Reading and installing image, it may take a few minutes, please wait...
Error, image NOT installed.
MODEL: OE512
FLASH: found, directory validated
COOKIE: valid
IMAGE: 4.1.7.11
FLASHDEV: /dev/sda
Installer Main Menu:
1. Configure Network (done)
I suspect that the Installer program on the Rescue CD can't detect the Hardware properly, and thus can't find the proper drivers for the platform.
Is there any manual method of forcing the Rescue CD Installer to use a particular hardware model?
Thanks,
Tom
07-06-2010 06:59 AM
Hi Tom,
Do you happen to have the show hardware output from when the devices are running 4.0(19)?
Thanks,
Zach
07-07-2010 04:22 AM
Hi Zach,
Sorry for the delay. I've found that the MODEL detection is affected by the BIOS settings of the hard drive controller. If hardware RAID is enabled on the Adaptec, then 4.1 and 4.2 detect the OE612 as an OE512, and 4.0 shows the MODEL as UNKNOWN.
CHENWAE1#show disks tech-support
Disk drive not supported. (type 8 host 1 chan 0 id 128 lun 0)
Disk drive not supported. (type 8 host 1 chan 1 id 128 lun 0)
Unsupported disk hardware.
=== disk00 ===
Device: IBM-ESXS ST3146755SS Version: BA33
Serial number: 3LN1QPDW00009805QJK8
Device type: disk
Transport protocol: SAS
Local Time is: Wed Jul 7 16:26:55 2010 IST
Device supports SMART and is Enabled
Temperature Warning Enabled
SMART Health Status: OK
=== disk01 ===
Device: IBM-ESXS ST3146755SS Version: BA33
Serial number: 3LN1S1SH00009804D67D
Device type: disk
Transport protocol: SAS
Local Time is: Wed Jul 7 16:26:56 2010 IST
Device supports SMART and is Enabled
Temperature Warning Enabled
SMART Health Status: OK
CHENWAE1#show hardware
Cisco Wide Area Application Services Software (WAAS)
Copyright (c) 1999-2008 by Cisco Systems, Inc.
Cisco Wide Area Application Services Software Release 4.0.17 (build b14 Feb 27 2008)
Version: oe612-4.0.17.14
Compiled 14:42:31 Feb 27 2008 by cnbuild
System was restarted on Wed Jul 7 16:21:59 2010.
The system has been up for 5 minutes, 31 seconds.
CPU 0 is GenuineIntel Intel(R) Pentium(R) D CPU 3.00GHz (rev 6) running at 3002MHz.
CPU 1 is GenuineIntel Intel(R) Pentium(R) D CPU 3.00GHz (rev 6) running at 3002MHz.
Total 2 CPUs.
2048 Mbytes of Physical memory.
1 CD ROM drive (TSSTcorpCD-ROM TS-L162C)
2 GigabitEthernet interfaces
2 InlineGroup interfaces.
1 Console interface
Jul 7 10:57:30 CHENWAE1 kernel: %WAAS-SYS-6-900000: hda:
Jul 7 10:57:30 CHENWAE1 kernel: %WAAS-SYS-6-900000: hda1
Manufactured As: WAE-612-K9 [8849PAV]
BIOS Information:
Vendor :IBM
Version :IBM BIOS Version 1.06-[P6E106AUS-1.06]-
Rel. Date :04/13/2007
Cookie info:
SerialNumber: KQLFKPD
SerialNumber (raw): 75 81 76 70 75 80 68 0 0 0 0
TestDate: 7-7-2010
ExtModel: OE612
ModelNum (raw): 55 0 0 0 1
HWVersion: 1
PartNumber: 53 54 55 56 57
BoardRevision: 1
ChipRev: 1
VendID: 0
CookieVer: 2
Chksum: 0xfc1f
List of all disk drives:
Disk drive not supported. (type 8 host 1 chan 0 id 128 lun 0)
Disk drive not supported. (type 8 host 1 chan 1 id 128 lun 0)
Unsupported disk hardware.
Physical disk information:
disk00: Present 3LN1QPDW00009805QJK8 (h01 c00 i128 l00 - Int DAS-SAS) 140011MB(136.7GB)
disk01: Present 3LN1S1SH00009804D67D (h01 c01 i128 l00 - Int DAS-SAS) 140011MB(136.7GB)
Mounted file systems:
MOUNT POINT TYPE DEVICE SIZE INUSE FREE USE%
/sw internal /dev/md0 991MB 615MB 376MB 62%
/swstore internal /dev/md1 991MB 281MB 710MB 28%
/state internal /dev/md2 5951MB 172MB 5779MB 2%
/local/local1 SYSFS /dev/md5 6943MB 131MB 6812MB 1%
/disk00-04 CONTENT /dev/md4 117798MB 330MB 117468MB 0%
.../local1/spool PRINTSPOOL /dev/md6 991MB 16MB 975MB 1%
Software RAID devices:
DEVICE NAME TYPE STATUS PHYSICAL DEVICES AND STATUS
/dev/md0 RAID-1 NORMAL OPERATION disk00/00[GOOD] disk01/00[GOOD]
/dev/md1 RAID-1 NORMAL OPERATION disk00/01[GOOD] disk01/01[GOOD]
/dev/md2 RAID-1 NORMAL OPERATION disk00/02[GOOD] disk01/02[GOOD]
/dev/md3 RAID-1 NORMAL OPERATION disk00/03[GOOD] disk01/03[GOOD]
/dev/md4 RAID-1 REBUILDING disk00/04[GOOD] disk01/04[!!NOT UP TO DATE!!]
/dev/md5 RAID-1 REBUILDING disk00/05[GOOD] disk01/05[!!NOT UP TO DATE!!]
/dev/md6 RAID-1 NORMAL OPERATION disk00/06[GOOD] disk01/06[GOOD]
*** NOTE ***
One or more RAID constituent disk partitions appear to be abnormal.
This could be because of
problem in drive's partition(s) and it is being rebuilt
Possible I/O errors on a disk drive
Please run "show alarms critical detail support"
to check any critical disk errors.
RAID-1 volumes will continue to operate on the remaining disk drive.
Please refer to the product documentation for further
information on how to handle this situation.
Disk encryption feature is disabled.
My original problem persists, though: if I upgrade this platform to 4.2, the drives are no longer detected as "good":
Welcome to the installer. The installer will enable installation
of a new software image onto your system, or recover a previous image
in the event that the hardware was changed.
MODEL: OE612
FLASH: found, directory validated
COOKIE: valid
IMAGE: 4.2.1.38
FLASHDEV: /dev/sdc
Installer Main Menu:
1. Configure Network
2. Manufacture flash
3. Install flash cookie
4. Install flash image from network
5. Install flash image from cdrom
6. Install flash image from disk
7. Recreate RAID device (WAE-674/7341/7371 only)
8. Wipe out disks and install .bin image
9. Exit (Eject and reboot)
Choice [0]: 8
Enter full URL of .bin image to install.
ftp://[user:pass@]ip_addr/path/to/file
http://[user:pass@]ip_addr/path/to/file
file:/local/path/to/to/file
URL for .bin image [file:/cdrom/images/WAAS40.bin]:
Continue? This will wipe out all disks! [n]: y
Scrogging sda sdb done
Saving random seed...
dd: opening `/state/random-seed': No such file or directory
No volume groups found
scan_612_disk_map: Can't open . Errno=2
change_disknum_612 failed!
/ruby/bin/ruby_disk: First disk is missing, bad, marked bad, disabled via shutdown, or it has unknown data.
/ruby/bin/ruby_disk: Second disk is bad, missing, or shutdown; bailing out
scan_612_disk_map: Can't open . Errno=2
change_disknum_612 failed!
/ruby/bin/ruby_disk: First disk is missing, bad, marked bad, disabled via shutdown, or it has unknown data.
/ruby/bin/ruby_disk: Second disk is bad, missing, or shutdown; bailing out
ssmgr: open(/state/safestate, O_WRONLY|O_CREAT) failed: No such file or directory
ssmgr: op_open or media_open
ssmgr: open(/state/safestate, O_WRONLY|O_CREAT) failed: No such file or directory
ssmgr: open(/state/safestate, O_WRONLY|O_CREAT) failed: No such file or directory
ssmgr: op_create or media_init
SSMGR RETURNING: 10 (No such file or directory)
Reading and installing image, it may take a few minutes, please wait...
Error, image NOT installed.
Any insight would be greatly appreciated.
Sincerely,
Tom
07-14-2010 05:43 AM
What led you to change the Adaptec configuration? I was able to make the disks "disappear" by enabling RAID in the Adaptec settings, but this doesn't surprise me (we don't support hardware RAID on the 1RU platforms).
What is the state of the device prior to upgrading? Does the WAAS software boot? If so, can you send me a tech-support from the device?
Zach
07-14-2010 08:08 AM
Hi Zach,
>What led you to change the Adaptec configuration?
I was directed to do so by TAC, although I wasn't sure why, since our other WAE-612-k9s didn't have this option enabled.
>What is the state of the device prior to upgrading? Does the WAAS software boot?
After disabling the hardware RAID, the WAE could run any 4.0.x version, but could not upgrade to 4.1/4.2. I tried both the "rescue CDs" for 4.1/4.2 and the "copy ftp install" method. The rescue CDs would work right up to the "Wipe out disks and install .bin image" option, but would then report "no disks", as shown above. The "copy ftp install" from version 4.0.27 would install the 4.2.1.38 image, but upon reboot the WAE would fail during the boot sequence.
Frustrated, I looked at the boot logs and might have found an issue. On the "faulty" WAE-612, the boot script detects two SCSI adapters; only one shows up in BIOS POST.
The first adapter (Adaptec AIC79XX) has no drives, but is assigned the device name "scsi0".
Under the second adapter (Adaptec AIC94XX), the hard disks are detected, and it is assigned the device name "scsi1".
ACPI: PCI Interrupt 0000:03:03.0[A] -> GSI 17 (level, low) -> IRQ 17
scsi0 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 2.0.30
aic7901: Ultra320 Wide Channel A, SCSI Id=7, PCI-X 67-100Mhz, 512 SCBs
Adaptec aacraid driver 1.1-5[24495]custom-IBM
Loading AIC-94xx Linux SAS/SATA Family Driver, Rev: 1.0.8-12
Probing Adaptec AIC-94xx Controller(s)...
ACPI: PCI Interrupt 0000:03:04.0[A] -> GSI 19 (level, low) -> IRQ 19
scsi1 : Adaptec AIC-9405W SAS/SATA Host Adapter
scsi 1:0:128:0: Direct-Access IBM-ESXS ST3146755SS BA33 PQ: 0 ANSI: 5
adp94xx:0:128:0: Tagged Queuing enabled. Depth 32
scsi 1:1:128:0: Direct-Access IBM-ESXS ST3146755SS BA33 PQ: 0 ANSI: 5
adp94xx:1:128:0: Tagged Queuing enabled. Depth 32
AIC-94xx controller(s) attached = 1.
In contrast, our functioning WAE-612s do not show this additional Adaptec host adapter, and their AIC-94xx controller is properly assigned the device name "scsi0".
Adaptec aacraid driver 1.1-5[24495]custom-IBM
Loading AIC-94xx Linux SAS/SATA Family Driver, Rev: 1.0.8-12
Probing Adaptec AIC-94xx Controller(s)...
ACPI: PCI Interrupt 0000:03:04.0[A] -> GSI 19 (level, low) -> IRQ 19
scsi0 : Adaptec AIC-9405W SAS/SATA Host Adapter
scsi 0:0:128:0: Direct-Access IBM-ESXS MBA3300RC SA06 PQ: 0 ANSI: 5
adp94xx:0:128:0: Tagged Queuing enabled. Depth 32
scsi 0:1:128:0: Direct-Access IBM-ESXS MBA3300RC SA06 PQ: 0 ANSI: 5
adp94xx:1:128:0: Tagged Queuing enabled. Depth 32
AIC-94xx controller(s) attached = 1.
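For anyone comparing boot logs across several WAEs, the check above can be automated. The following is only a rough sketch: the regular expressions are modeled on the log lines quoted in this thread, and the function names (`summarize_scsi_hosts`, `hosts_without_disks`) are made up for illustration. Other releases or drivers may format these lines differently.

```python
import re

# Patterns modeled on the boot-log lines quoted in this thread (an assumption:
# other WAAS releases or drivers may print these lines in a different format).
HOST_RE = re.compile(r"^scsi(\d+)\s*:\s*(.+)$")
DEVICE_RE = re.compile(r"^scsi (\d+):\d+:\d+:\d+:\s*Direct-Access\s+(.+)$")


def summarize_scsi_hosts(log_text):
    """Map each SCSI host number to its adapter name and attached disks."""
    hosts = {}
    for raw in log_text.splitlines():
        line = raw.strip()
        host = HOST_RE.match(line)
        if host:
            entry = hosts.setdefault(int(host.group(1)), {"adapter": None, "disks": []})
            entry["adapter"] = host.group(2)
            continue
        dev = DEVICE_RE.match(line)
        if dev:
            entry = hosts.setdefault(int(dev.group(1)), {"adapter": None, "disks": []})
            entry["disks"].append(dev.group(2))
    return hosts


def hosts_without_disks(hosts):
    """Return the host numbers that registered an adapter but no disks."""
    return sorted(n for n, e in hosts.items() if not e["disks"])


# Excerpts taken from the two boot logs in this post.
FAULTY_LOG = """\
scsi0 : Adaptec AIC79XX PCI-X SCSI HBA DRIVER, Rev 2.0.30
scsi1 : Adaptec AIC-9405W SAS/SATA Host Adapter
scsi 1:0:128:0: Direct-Access IBM-ESXS ST3146755SS BA33 PQ: 0 ANSI: 5
scsi 1:1:128:0: Direct-Access IBM-ESXS ST3146755SS BA33 PQ: 0 ANSI: 5
"""

WORKING_LOG = """\
scsi0 : Adaptec AIC-9405W SAS/SATA Host Adapter
scsi 0:0:128:0: Direct-Access IBM-ESXS MBA3300RC SA06 PQ: 0 ANSI: 5
scsi 0:1:128:0: Direct-Access IBM-ESXS MBA3300RC SA06 PQ: 0 ANSI: 5
"""

if __name__ == "__main__":
    for label, log in (("faulty", FAULTY_LOG), ("working", WORKING_LOG)):
        summary = summarize_scsi_hosts(log)
        print(label, "hosts without disks:", hosts_without_disks(summary))
```

A host that registers an adapter but no disks ahead of the AIC-94xx controller is the pattern seen on the faulty unit.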
On our faulty WAE-612, in the section that tries to mount the detected drives, the script appears to look for the physical drives under the "scsi0" device name, using the AIC94XX driver:
2010 Jul 8 08:58:41 (none) kernel: %WAAS-SYS-3-900000: Attached scsi disk sda at scsi1, channel 0, id 128, lun 0
2010 Jul 8 08:58:41 (none) kernel: %WAAS-SYS-3-900000: Attached scsi disk sdb at scsi1, channel 1, id 128, lun 0
2010 Jul 8 08:58:41 (none) kernel: %WAAS-SYS-3-900000: Attached scsi removable disk sdc at scsi2, channel 0, id 0, lun 0
BOOT-100: disk apply
scan_612_disk_map: Can't open . Errno=2
change_disknum_612 failed!
/ruby/bin/ruby_disk: First disk is missing, bad, marked bad, disabled via shutdown, or it has unknown data.
/ruby/bin/ruby_disk: Second disk is bad, missing, or shutdown; bailing out
I'm guessing this fails because there are no physical drives on "scsi0"; they are attached to "scsi1". If the disk mapping script would've tried , I think the drives would've been mounted properly.
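To illustrate the guess, here is a small sketch that reads the "Attached scsi disk" kernel messages and reports which fixed disks landed on which SCSI host. It does not reproduce the real disk mapping logic, which I have not seen; the idea that the script only looks at host 0 is my assumption, and `parse_attached_disks` and `fixed_disks_on_host` are hypothetical helper names. The sample text uses the messages above with the console line wrapping removed.

```python
import re
from typing import NamedTuple


class AttachedDisk(NamedTuple):
    name: str
    host: int
    channel: int
    target: int
    lun: int
    removable: bool


# Pattern modeled on the kernel messages quoted in this post (an assumption:
# other kernels may word these messages differently).
ATTACH_RE = re.compile(
    r"Attached scsi (removable )?disk (\w+) at scsi(\d+), "
    r"channel (\d+), id (\d+), lun (\d+)"
)


def parse_attached_disks(log_text):
    """Return one AttachedDisk per 'Attached scsi ... disk' message."""
    disks = []
    for m in ATTACH_RE.finditer(log_text):
        removable, name, host, channel, target, lun = m.groups()
        disks.append(
            AttachedDisk(name, int(host), int(channel), int(target), int(lun),
                         removable is not None)
        )
    return disks


def fixed_disks_on_host(disks, host):
    """Names of non-removable disks attached to the given SCSI host."""
    return [d.name for d in disks if d.host == host and not d.removable]


SAMPLE_LOG = """\
2010 Jul 8 08:58:41 (none) kernel: %WAAS-SYS-3-900000: Attached scsi disk sda at scsi1, channel 0, id 128, lun 0
2010 Jul 8 08:58:41 (none) kernel: %WAAS-SYS-3-900000: Attached scsi disk sdb at scsi1, channel 1, id 128, lun 0
2010 Jul 8 08:58:41 (none) kernel: %WAAS-SYS-3-900000: Attached scsi removable disk sdc at scsi2, channel 0, id 0, lun 0
"""

if __name__ == "__main__":
    disks = parse_attached_disks(SAMPLE_LOG)
    for host in (0, 1):
        print(f"scsi{host}:", fixed_disks_on_host(disks, host))
```

On the faulty unit, nothing is attached to host 0 and both hard disks sit on host 1, which matches the failure if the mapping step really does expect them on "scsi0".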
I was hoping there was a modified script out there somewhere, but after our 3rd RMA'd WAE exhibited the same characteristics (these were all in Chennai, India), I directed the onsite tech to remove the Adaptec 29320ALP Ultra320 SCSI adapter.
With the adapter removed, the 4.2.1.38 code installed and booted properly.
When I forwarded my results to TAC, our rep mentioned that the WAE-612-k9 ships with both ACNS and WAE capabilities. He said that ACNS uses that additional SCSI card.
I'm fairly sure our configuration is not unique; our environment consists of 32 WAEs, and this is the first time that I've encountered this issue.
I'm just happy it's resolved now.
Thanks for your insight, Zach
Tom
07-20-2010 07:54 AM
Hi Tom,
What is the TAC case # you have been working under?
Thanks,
Zach
07-20-2010 08:05 AM
Hi Zach,
It was 614759583.
Tom