03-05-2011 07:27 PM
Hello-
We had a rather strange thing happen tonight in our data center. We have a multipathed VMware environment with two HP P2000 SANs (each with two dual-port controllers) and two MDS 9124s. One port from each controller on each SAN is connected to each switch, so there are 4 connections to MDS Switch A and 4 connections to MDS Switch B.
At about 6:00pm, I received 4 alerts (two from each SAN) simultaneously telling me that the port on each controller that was connected to MDS Switch B was down.
I looked in the log on that MDS switch and observed the following:
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifHCInOctets.16777216 = 1036 (was 198206400)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifHCOutOctets.16777216 = 2196 (was 86978036)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifInUcastPkts.16777216 = 11 (was 2884392)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifOutUcastPkts.16777216 = 11 (was 2777508)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifHCInOctets.16781312 = 1064 (was 198205184)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifHCOutOctets.16781312 = 2840 (was 86977548)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifInUcastPkts.16781312 = 12 (was 2884375)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifOutUcastPkts.16781312 = 12 (was 2777504)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifHCInOctets.16785408 = 124 (was 53013132)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifHCOutOctets.16785408 = 124 (was 124737700)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifInUcastPkts.16785408 = 1 (was 887000)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifOutUcastPkts.16785408 = 1 (was 3073597)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifHCInOctets.16789504 = 176 (was 53012256)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifHCOutOctets.16789504 = 176 (was 124736000)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifInUcastPkts.16789504 = 1 (was 886984)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifOutUcastPkts.16789504 = 1 (was 3073568)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifHCInOctets.16793600 = 172 (was 53031164)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifHCOutOctets.16793600 = 172 (was 124779604)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifInUcastPkts.16793600 = 1 (was 887239)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifOutUcastPkts.16793600 = 1 (was 3074033)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifHCInOctets.16797696 = 224 (was 53062184)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifHCOutOctets.16797696 = 224 (was 125096172)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifInUcastPkts.16797696 = 1 (was 888147)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifOutUcastPkts.16797696 = 1 (was 3074674)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifHCInOctets.16801792 = 220 (was 53084032)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifHCOutOctets.16801792 = 220 (was 125146192)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifInUcastPkts.16801792 = 1 (was 888449)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifOutUcastPkts.16801792 = 1 (was 3075226)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifHCInOctets.16805888 = 172 (was 224535392)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifHCOutOctets.16805888 = 172 (was 98024592)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifInUcastPkts.16805888 = 1 (was 3252712)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifOutUcastPkts.16805888 = 1 (was 2961667)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifHCInOctets.16809984 = 172 (was 224527920)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifHCOutOctets.16809984 = 172 (was 98021716)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifInUcastPkts.16809984 = 1 (was 3252601)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifOutUcastPkts.16809984 = 1 (was 2961611)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifHCInOctets.16814080 = 156 (was 60596136)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifHCOutOctets.16814080 = 156 (was 149168056)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifInUcastPkts.16814080 = 1 (was 1018232)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifOutUcastPkts.16814080 = 1 (was 3301871)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifHCInOctets.16818176 = 208 (was 53221708)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifHCOutOctets.16818176 = 208 (was 126034160)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifInUcastPkts.16818176 = 1 (was 891348)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifOutUcastPkts.16818176 = 1 (was 3077987)
It looks to me like all the counters on the connected interfaces were reset. As far as I can tell, the switch didn't go down per se, but it did drop its connections momentarily.
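To make it easier to see which interfaces were affected, here is a rough Python sketch that groups the "Counter decreased" lines by interface index. It assumes the log lines have exactly the format pasted above; the function name `summarize_resets` and the `sample` variable are just illustrative names, not anything provided by the switch or its tools.

```python
import re
from collections import defaultdict

# Matches lines like:
#   ... Counter decreased: ifHCInOctets.16777216 = 1036 (was 198206400)
# Assumption: the log format is exactly as shown in this post.
LINE_RE = re.compile(
    r"Counter decreased: (?P<counter>\w+)\.(?P<ifindex>\d+)"
    r" = (?P<new>\d+) \(was (?P<old>\d+)\)"
)


def summarize_resets(log_text):
    """Return {ifIndex: {counter_name: (old_value, new_value)}}."""
    resets = defaultdict(dict)
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match:
            resets[match["ifindex"]][match["counter"]] = (
                int(match["old"]),
                int(match["new"]),
            )
    return dict(resets)


# Three lines copied from the log above, used as a small sample.
sample = """\
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifHCInOctets.16777216 = 1036 (was 198206400)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifHCOutOctets.16777216 = 2196 (was 86978036)
2011.03.05 18:06:29 [dm.DM] 18:06:29 Counter decreased: ifInUcastPkts.16785408 = 1 (was 887000)
"""

for ifindex, counters in summarize_resets(sample).items():
    print(ifindex, sorted(counters))
```

Running this over the full log should list every interface index whose counters dropped at 18:06:29, which can then be compared against the ports that the SANs reported as down.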
Software version on the MDS 9124s is reported as:
Cisco SAN-OS(tm) m9100, Software (m9100-s2ek9-mz), Version 3.3(2), RELEASE SOFTWARE (fc2) Copyright (c) 2002-2005 by Cisco Systems, Inc. Compiled 10/3/2008 11:00:00
Can anyone shed any light on what might be going on here?
Many thanks,
Rick
03-07-2011 04:57 AM
I don't have any direct pointers to the issue here, but you seem to be running quite old code. An upgrade might help.