
**Updated 16 February 2011** IBM 7816-I4 782x-I4 filesystem errors


 

Summary

Cisco Media Convergence Servers 7816-I4, 7825-I4, and 7828-I4 (and the IBM x3250-M2 equivalent) have recently been experiencing file-system errors. These servers are used by Cisco Unified Communications Manager and various other Cisco Collaboration software products.

The symptom is that the file system on the local disk drives goes into read-only mode. This can manifest as application services going down, the server becoming unresponsive via the network or the management interfaces, or, in the worst case, data corruption that necessitates a reinstall and restore from backup.

 

Root cause has been identified by Cisco and its suppliers as a disk drive issue stemming from interaction with system firmware. 

 

Field Notice 63374 has been published and includes more technical details regarding this  issue.  Cisco and its suppliers are committed to high quality and  apologize for any disruptions or impact caused by this issue.

 

Solution

The read-only file-system issue that has recently been affecting server models MCS-7816-I4, MCS-7825-I4, and MCS-7828-I4 (or their IBM equivalents) in the field is addressed by CSCti52867 - "IBM 7816-I4 and 782x-I4 READONLY file system".

 

The fix for CSCti52867 is now available and requires the application of two patch files.  Install both of these patch files in the order listed below.

 

1. First install ciscocm.ibm-diskex-1.0.cop.sgn
     The Readme file ciscocm.ibm-diskex-1.0.cop.sgn includes installation instructions for this .cop.sgn.

     Install this utility only when the output of the show hardware CLI command indicates that the array is in a healthy state.

     If the file system on your server has never gone read-only, this step is optional.
2. Next install Cisco-HDD-FWUpdate-3.0.1-I.ISO.
     The Readme file Cisco-HDD-FWUpdate-3.0.1-I.Readme.pdf includes installation instructions for this ISO.

     This installer is completely independent of the OS installed on the server.

Note:  Installing the FWUpdate v3.0(1) or later will get you firmware with the fix for this defect.  It is always recommended that you apply the latest FWUCD available for your server.

 

Refer to the Release Note of CSCti52867 and the Readme file for each of the above mentioned patch files for more details.
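
The ordering rules above can be sketched as a small decision helper. This is only an illustration in Python: the function name plan_patch_steps and its parameters are hypothetical, and the health check assumes that the "Status of volume" field from show hardware reads "Okay (OKY)" on a healthy array, as in the sample output in this document. The sketch also treats the optional disk exerciser step as skipped when the file system has never gone read-only.

```python
from typing import List

COP_FILE = "ciscocm.ibm-diskex-1.0.cop.sgn"
FIRMWARE_ISO = "Cisco-HDD-FWUpdate-3.0.1-I.ISO"


def plan_patch_steps(volume_status: str, has_gone_readonly: bool) -> List[str]:
    """Return the actions to take, in order (hypothetical helper)."""
    steps = []
    # Assumption: a healthy array reports a status starting with "Okay".
    array_healthy = volume_status.strip().startswith("Okay")
    if has_gone_readonly:
        if not array_healthy:
            # The disk exerciser should only be installed on a healthy array.
            steps.append("wait until 'show hardware' reports the array as Okay (OKY)")
        steps.append(COP_FILE)
    # The firmware update is applied in every case, after the .cop.sgn file.
    steps.append(FIRMWARE_ISO)
    return steps
```

For example, plan_patch_steps("Resyncing (RSY)", True) lists the wait step first, then the .cop.sgn file, then the ISO.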

 

Symptoms

  • The file system goes READONLY. CUCM services may then go down and the server may become "unresponsive", meaning that it is not possible to SSH into the server, log in to the console, or reach the web interface, although the server may still respond to pings.
  • Traces from all services stop writing (including syslog)
  • You see the following error on the server console
 
EXT3-fs error (device sda6) in start_transaction: Journal has aborted

 

 

 

 

 

  • If you are able to log in to the server via SSH, the following output may be displayed.
 
Last login: Mon Aug  X XX:XX:XX XXXX from XXX.XXX.XXX.XXX
Command Line Interface is starting up, please wait ...
java.io.FileNotFoundException: /var/log/active/platform/log/cli.bin (Read-only file system)
        at java.io.RandomAccessFile.open(Native Method)
        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
    :
    :
    :
        at org.apache.log4j.Category.info(Category.java:674)
        at sdMain.main(sdMain.java:611)
log4j:ERROR No output stream or file set for the appender named [CLI_LOG].

   Welcome to the Platform Command Line Interface
    WARNING:
        The /common file system is mounted read only.  <<<<<<<<<<<<<<<<<<

        Please use Recovery Disk to check the file system using fsck.
admin:
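
If you have saved console or SSH session text to a file, the symptoms above can be searched for mechanically. The following Python sketch is an assumption-laden illustration: the function name find_readonly_symptoms and the pattern names are hypothetical, and the patterns are taken only from the messages quoted in this document, so other releases may word them differently.

```python
import re

# Patterns derived from the symptom text quoted in this document.
SYMPTOM_PATTERNS = {
    "journal_aborted": re.compile(r"EXT3-fs error \(device (\w+)\).*has aborted"),
    "readonly_exception": re.compile(r"\(Read-only file system\)"),
    "cli_warning": re.compile(r"The (\S+) file system is mounted read only"),
}


def find_readonly_symptoms(text: str) -> list:
    """Return the names of the symptom patterns found in captured output."""
    return [name for name, pattern in SYMPTOM_PATTERNS.items() if pattern.search(text)]
```

An empty list only means that none of these specific messages appeared in the captured text; it does not prove the file system is healthy.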

 

 

 

How to determine the current version of firmware on the hard drive

 

 

 

  • For MCS-7825-I4 and MCS-7828-I4, running Cisco UCM 7.1 and above, you can use the CLI command 'show hardware' to verify the firmware version.

 

 



admin:show hardware

HW Platform       : 7828I4
Processors        : 1
Type              : Intel(R) Core(TM)2 Quad CPU    Q9400  @ 2.66GHz
CPU Speed         : 2660
Memory            : 8192 MBytes
Object ID         : 1.3.6.1.4.1.9.1.899
OS Version        : UCOS 4.0.0.0-34
Serial Number     : KQRBVVB

RAID Version      :
Raid firmware version: 1.26.81.00
Raid Bios version: 6.16.00.00

BIOS Information  :
IBM IBMBIOSVersion1.44-[M9E144AUS-1.44]- 06/11/2009

RAID Details      :
LSI Logic IR Configuration Utility 2.00.15
Read configuration has been initiated for controller 0
------------------------------------------------------------------------
Controller information
------------------------------------------------------------------------
  Controller type                         : SAS1064E
  BIOS version                            : 6.16.00.00
  Firmware version                        : 1.26.81.00
  Channel description                     : 1 Serial Attached SCSI
  Initiator ID                            : 112
  Maximum physical devices                : 62
  Concurrent commands supported           : 266
  Slot                                    : 0
  Bus                                     : 1
  Device                                  : 0
  Function                                : 0
  RAID Support                            : Yes
------------------------------------------------------------------------
IR Volume information
------------------------------------------------------------------------
IR volume 1
  Volume ID                               : 7
  Status of volume                        : Okay (OKY)
  RAID level                              : 1
  Size (in MB)                            : 237464
  Physical hard disks (Target ID)         : 9 8
------------------------------------------------------------------------
Physical device information
------------------------------------------------------------------------
Initiator at ID #112
Target on ID #8
  Device is a Hard disk
  Enclosure #                             : 1
  Slot #                                  : 1
  Target ID                               : 8
  State                                   : Online (ONL)
  Size (in MB)/(in sectors)               : 238475/488397168
  Manufacturer                            : ATA
  Model Number                            : WD2502ABYS-23B7A
  Firmware Revision                       : 3B04
  Serial No                               :      WD-WCAT1D712130
  Drive Type                              : SATA
Target on ID #9
  Device is a Hard disk
  Enclosure #                             : 1
  Slot #                                  : 0
  Target ID                               : 9
  State                                   : Online (ONL)
  Size (in MB)/(in sectors)               : 238475/488397168
  Manufacturer                            : ATA
  Model Number                            : WD2502ABYS-23B7A
  Firmware Revision                       : 3B04
  Serial No                               :      WD-WCAT1D723848
  Drive Type                              : SATA
------------------------------------------------------------------------
Enclosure information
------------------------------------------------------------------------
  Enclosure#                              : 1
  Logical ID                              : 5005076b:0648afc0
  Numslots                                : 4
  StartSlot                               : 0
  Start TargetID                          : 0
  Start Bus                               : 0
------------------------------------------------------------------------



 

The fields you need are the Model Number and Firmware Revision listed under each "Target on ID" entry (highlighted in red in the original output). This output shows a server with two drives, model number WD2502ABYS, on 3B04 firmware. These drives should be upgraded as soon as possible.

 

  • For MCS-7825-I4 and MCS-7828-I4 models running Cisco UCM versions earlier than 7.1, as well as any version of Cisco UCM running on an MCS-7816-I4 model server, you must download Cisco-HDD-FWUpdate-3.0.1-I.ISO, burn it to a CD, and boot from that CD (refer to the "Solution" section for links to download the ISO and readme). After the Cisco-HDD-FWUpdate-3.0.1-I.ISO CD boots successfully, you are shown the current HDD FW version and given the opportunity to upgrade to HDD FW version 02.03B06.
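
Where show hardware is available, the drive model and firmware revision can be pulled out of saved output with a short script. The Python sketch below is illustrative only: the function names are hypothetical, it assumes each field appears on one line as "Label : value" under a "Target on ID #N" header, and it assumes firmware revisions of the form 3Bnn compare correctly as plain strings against the fixed revision 3B06.

```python
import re

FIXED_REVISION = "3B06"  # revision delivered by the HDD firmware update, per this document


def drives_from_show_hardware(text: str) -> list:
    """Extract (target_id, model, firmware) for each 'Target on ID #N' block."""
    drives = []
    # re.split with one capture group yields [preamble, id1, body1, id2, body2, ...]
    blocks = re.split(r"Target on ID #(\d+)", text)
    for target_id, body in zip(blocks[1::2], blocks[2::2]):
        model = re.search(r"Model Number\s*:\s*(\S+)", body)
        firmware = re.search(r"Firmware Revision\s*:\s*(\S+)", body)
        if model and firmware:
            drives.append((int(target_id), model.group(1), firmware.group(1)))
    return drives


def needs_firmware_update(firmware: str) -> bool:
    """Assumption: revisions such as 3B02, 3B04, 3B06 sort lexically."""
    return firmware < FIXED_REVISION
```

Applied to the sample output above, this would report targets 8 and 9 on 3B04, both of which need the update.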

What should be done if filesystem issues persist after applying the patches?


As of 16 February 2011, if you encounter any further filesystem or hard drive issues after applying both the firmware update and the disk exerciser, you should proceed to replace the affected drive(s).

 

There are three ways you can replace the drive(s).

  1. If you have an MCS server with an active Cisco support contract, open a TAC service request.
  2. If you purchased the IBM x3250 M2 MCS equivalent and have an IBM support contract, contact IBM support.
  3. If you do not have any support contract for the server, you can purchase a new drive from Cisco or IBM.
    • The Cisco part number for the hard drive is HDD-7825-I4-250=.
    • Contact your IBM reseller to confirm the correct part number for the 250GB SATA simple-swap HD for the x3250 M2 server.

 

If you have any questions you can leave a comment on this document. The ibm-fs-failure@cisco.com email address is no longer active as of 1 September 2014.

Sending the email will not generate a TAC SR but will allow us to collect more information.  This is an informal submission with no associated SLA and we will make every effort to follow up submissions but cannot guarantee a response.

 

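Related Defects

CSCtn17205 Drive Exerciser Utility can not be installed on UCCX VOS platform

CSCti28336 Need document using the recovery CD when file system mounted read only
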

Related Links

Comments
Phillip Ratliff
Cisco Employee

The Readme for the 3.6(1) FWUCD shows that it includes hard drive firmware 3B05.  While it is a good idea to keep all of the firmware on the server up to date, you will still need to run the standalone 3B06 firmware update, in addition to the ciscocm.ibm-diskex-1.0.cop.sgn file, to get the complete fix for CSCti52867.

Since a reboot is required to apply the 3B06 firmware that presents an ideal time to apply the FWUCD as well.

-Ryan

We hit the bug CSCti52867.

I found a new version, 3.6.1:

FWUCD-3.6.1-I.iso

Release Date: 30/NOV/2010

Size: 334360.00 KB (342384640 bytes)

Questions:

1) Can someone tell me whether I should use this 3.6 or the old 3.0.1?

2) If I use 3.6, do I first have to install ciscocm.ibm-diskex-1.0.cop.sgn, or is that not necessary with version 3.6?

http://www.cisco.com/cisco/software/release.html?mdfid=282152197&catid=278875240&flowid=20295&softwareid=283046743&release=3.6%281%29&rellifecycle=&relind=AVAILABLE&reltype=all

kind regards,

d.haeni
Enthusiast

Ryan,

We have sorted this out with TAC and your help. Since the R/O issue has not yet occurred on the UCCX 8.0 server, we will just upgrade the firmware. Nonetheless, you might want to reference the defects that were opened while resolving the SR:

CSCtn17205 Drive Exerciser Utility can not be installed on UCCX VOS platform

CSCti28336 Need document using the recovery CD when file system mounted read only

/David

d.haeni
Enthusiast

Ryan,

We recently had an issue with a UCCX 7.x installation, running on a 7816-I4 (and Windows, of course):

- Windows Event Logs every now and then showed "bad blocks", and the server restarted itself automatically

- In one case, the server was unresponsive and manually had to be powered off and on in order to restore its function.

- I attributed it to a bad Drive and had the HDD RMAed.

My questions:

- The info on this R/O filesystem issue with respect to Windows OS seems a bit vague. Any chance this was related to the issue we're discussing here?

- Is HDD FW Upgrade supported on MCSes running Windows OS?

- Would you even recommend it?

- How do you accomplish this (step-by-step instructions)

Thanks for your help

Phillip Ratliff
Cisco Employee

Thanks David I've updated the document with the bugs you cited.  Thanks for your patience working through the UCCX issues.

We never saw any complaints of this issue on a Windows server, so I cannot confirm whether the issue you saw was due to this problem.  I would encourage anyone with one of these servers to apply the hard drive firmware update regardless of whether you have seen issues.  The installer is a self-contained patch utility from IBM that does not rely on any data on the hard drives.  It can be run on a server with no OS at all.

If you are seeing bad blocks reported on a hard drive from Windows then I would replace the drive regardless of firmware.  You can also confirm using the IBM DSA utility if the drive is showing SMART errors.

Andreas Thamm
Beginner

Yesterday I also hit the "read-only" issue when trying to upgrade to v8.5.1 of CUCM, even though I had already installed the B06 firmware fix in December 2010. So I contacted Cisco TAC and, as described above, requested new HDDs to replace the old ones.

In the meantime, I would like to know if there is any possibility of getting CUCM working again until the new HDDs are delivered. I'm a little afraid of trying a simple restart of CUCM, because at the moment about eighty percent of our phones are working, since they were logged in with Extension Mobility when the error occurred. After a restart, perhaps none of them will work properly, because login with ExMo is not available. I would be pleased if someone could give me any good advice to follow.

Kind regards from Germany,
Andi

Phillip Ratliff
Cisco Employee

Most of the time simply rebooting the server will recover it.  If this doesn't work for you then a filesystem check may get you up enough to proceed but you may be stuck until you get the HDD(s) replaced.

Andreas Thamm
Beginner

Phillip,

I followed your advice and reset the CUCM server (right at the front of the machine). Now ExMo and all other services are working again! So I'm very happy and have to thank you very, very much.

The only thing that I'm missing now is the inactive partition in the "CUCM OS => Settings => Version" window. Normally there should be the option to switch to the displayed inactive partition with V.x.x.x- installed on it. Maybe you have some useful advice on this.

Kind regards,

Andreas

Phillip Ratliff
Cisco Employee

If it happened during an upgrade then it's likely your inactive partition got wiped in preparation for it to become the new active partition.

Arash Tabarestani
Contributor

I have the same issue with a 7828I3. Is there any HDD firmware update for the I3?

HW Platform       : 7828I3

Processors        : 1

Type              : Family: Core 2

CPU Speed         : 2130

Memory            : 6144 MBytes

Object ID         : 1.3.6.1.4.1.9.1.899

OS Version        : UCOS 4.0.0.0-44

Serial Number     : KQFZKTV

RAID Version      :

RAID Firmware Version:  1.18.83.00

RAID BIOS Version:  6.0e.00.00

BIOS Information  :

1.45

RAID Details      :

LSI Logic IR Configuration Utility 2.00.15

Read configuration has been initiated for controller 0

------------------------------------------------------------------------

Controller information

------------------------------------------------------------------------

  Controller type                         : SAS1064E

  BIOS version                            : 6.0e.00.00

  Firmware version                        : 1.18.83.00

  Channel description                     : 1 Serial Attached SCSI

  Initiator ID                            : 112

  Maximum physical devices                : 62

  Concurrent commands supported           : 511

  Slot                                    : 0

  Bus                                     : 5

  Device                                  : 0

  Function                                : 0

  RAID Support                            : Yes

------------------------------------------------------------------------

IR Volume information

------------------------------------------------------------------------

IR volume 1

  Volume ID                               : 0

  Status of volume                        : Resyncing (RSY)

  RAID level                              : 1

  Size (in MB)                            : 237464

  Physical hard disks (Target ID)         : 4 1

------------------------------------------------------------------------

Physical device information

------------------------------------------------------------------------

Initiator at ID #112

Target on ID #1

  Device is a Hard disk

  Enclosure #                             : 1

  Slot #                                  : 1

  Target ID                               : 1

  State                                   : Out of Sync (OSY)

  Size (in MB)/(in sectors)               : 238475/488397168

  Manufacturer                            : ATA    

  Model Number                            : WD2502ABYS-23B7A

  Firmware Revision                       : 3B02

  Serial No                               :      WD-WCAT14597708

  Drive Type                              : SATA

Target on ID #4

  Device is a Hard disk

  Enclosure #                             : 1

  Slot #                                  : 0

  Target ID                               : 4

  State                                   : Online (ONL)

  Size (in MB)/(in sectors)               : 238475/488397168

  Manufacturer                            : ATA    

  Model Number                            : WD2502ABYS-23B7A

  Firmware Revision                       : 3B02

  Serial No                               :      WD-WCAT14631340

  Drive Type                              : SATA

Phillip Ratliff
Cisco Employee

I wasn't aware the 7828I3 had those drives but yours sure does.  The firmware update is specific to the hard drives, not the server so you should be able to apply it successfully to your server.

If your system went read-only then you should also install the cop.sgn file just as if you had an I4.

Arash Tabarestani
Contributor

Thanks Phillip,

I was able to install both the cop file and the HDD update on the 7828-I3:

Initiator at ID #112

Target on ID #1

  Device is a Hard disk

  Enclosure #                             : 1

  Slot #                                  : 1

  Target ID                               : 1

  State                                   : Out of Sync (OSY)

  Size (in MB)/(in sectors)               : 238475/488397168

  Manufacturer                            : ATA    

  Model Number                            : WD2502ABYS-23B7A

  Firmware Revision                       : 3B06

  Serial No                               :      WD-WCAT14597708

  Drive Type                              : SATA

Target on ID #4

  Device is a Hard disk

  Enclosure #                             : 1

  Slot #                                  : 0

  Target ID                               : 4

  State                                   : Online (ONL)

  Size (in MB)/(in sectors)               : 238475/488397168

  Manufacturer                            : ATA    

  Model Number                            : WD2502ABYS-23B7A

  Firmware Revision                       : 3B06

  Serial No                               :      WD-WCAT14631340

  Drive Type                              : SATA

I just wonder if you could update the title to add the 7828-I3 as well, and whether it is possible to add the cop file and HDD update for the 7828-I3 to the Cisco download portal in case someone else has the same issue.
Cheers!
Arash

Phillip Ratliff
Cisco Employee

At this point you are the first I've heard of to report this on an I3 but I'll keep an eye out for others and update the document accordingly.

I noticed your array is out of sync.  This is expected after applying the firmware update, but the cop.sgn file should be run when the array sync is finished; otherwise it only runs on the single drive.

Arash Tabarestani
Contributor

So I should keep checking the array status, and once it has synced I should run the cop file again?

tiwhalen
Cisco Employee

Phillip

Will the fix for the issue be integrated into CUCM installation disks in order to relieve customers from having to separately upgrade the server firmware?

Tim

Phillip Ratliff
Cisco Employee

Unfortunately we don't have the ability to update hard drive firmware during software or OS install, so it needs to be done via the FWUCD.
