WS-X6708-10GE-3CXL

Oleg Volkov
Spotlight

Dear Sirs!

I have a WS-X6708-10GE module. It works fine, but when I run all tests with the command "diagnostic start module 3 test all", I see the following messages:

Router#sh mod

Mod Ports Card Type                              Model              Serial No.

--- ----- -------------------------------------- ------------------ -----------

  3    8  CEF720 8 port 10GE with DFC            WS-X6708-10GE      SAL16084SR5

  5    2  Supervisor Engine 720 (Active)         WS-SUP720-3B       SAD081500MC

Mod MAC addresses                       Hw    Fw           Sw           Status

--- ---------------------------------- ------ ------------ ------------ -------

  3  f0f7.556a.05b0 to f0f7.556a.05b7   3.5   12.2(18r)S1  15.1(2)S     Ok

  5  000d.ed92.a2c0 to 000d.ed92.a2c3   1.0   8.5(3)       15.1(2)S     Ok

Mod  Sub-Module                  Model              Serial       Hw     Status

---- --------------------------- ------------------ ----------- ------- -------

  3  Distributed Forwarding Card WS-F6700-DFC3CXL   SAL1436T3MM  1.6    Ok

  5  Policy Feature Card 3       WS-F6K-PFC3BXL     SAD084503T4  1.8    Ok

  5  MSFC3 Daughterboard         WS-SUP720          SAD102307UN  0.101  Ok

Mod  Online Diag Status

---- -------------------

  3  Pass

  5  Pass

Router#diagnostic start module 3 test all

*Oct 16 10:41:29.899: %DIAG-SP-6-TEST_RUNNING: Module 3: Running TestOBFL{ID=1} ...

*Oct 16 10:41:29.899: %DIAG-SP-6-TEST_OK: Module 3: TestOBFL{ID=1} has completed successfully

*Oct 16 10:41:29.899: %DIAG-SP-6-TEST_RUNNING: Module 3: Running TestFabricCh0Health{ID=2} ...

*Oct 16 10:41:30.263: %DIAG-SP-6-TEST_OK: Module 3: TestFabricCh0Health{ID=2} has completed successfully

*Oct 16 10:41:30.263: %DIAG-SP-6-TEST_RUNNING: Module 3: Running TestFabricCh1Health{ID=3} ...

*Oct 16 10:41:30.631: %DIAG-SP-6-TEST_OK: Module 3: TestFabricCh1Health{ID=3} has completed successfully

*Oct 16 10:41:30.631: %DIAG-SP-6-TEST_RUNNING: Module 3: Running TestTransceiverIntegrity{ID=4} ...

*Oct 16 10:41:30.631: %DIAG-SP-3-TEST_SKIPPED: Module 3: TestTransceiverIntegrity{ID=4} is skipped

*Oct 16 10:41:30.631: %DIAG-SP-6-TEST_RUNNING: Module 3: Running TestLoopback{ID=5} ...

*Oct 16 10:41:33.435: %DIAG-SP-6-TEST_OK: Module 3: TestLoopback{ID=5} has completed successfully

*Oct 16 10:41:33.435: %DIAG-SP-6-TEST_RUNNING: Module 3: Running TestScratchRegister{ID=6} ...

*Oct 16 10:41:33.459: %DIAG-SP-6-TEST_OK: Module 3: TestScratchRegister{ID=6} has completed successfully

*Oct 16 10:41:33.459: %DIAG-SP-6-TEST_RUNNING: Module 3: Running TestSynchedFabChannel{ID=7} ...

*Oct 16 10:41:33.459: %DIAG-SP-6-TEST_OK: Module 3: TestSynchedFabChannel{ID=7} has completed successfully

*Oct 16 10:41:33.459: %DIAG-SP-6-TEST_RUNNING: Module 3: Running TestDontLearn{ID=8} ...

*Oct 16 10:41:34.707: %DIAG-SP-6-TEST_OK: Module 3: TestDontLearn{ID=8} has completed successfully

*Oct 16 10:41:34.707: %DIAG-SP-6-TEST_RUNNING: Module 3: Running TestConditionalLearn{ID=9} ...

*Oct 16 10:41:35.399: %DIAG-SP-6-TEST_OK: Module 3: TestConditionalLearn{ID=9} has completed successfully

*Oct 16 10:41:35.399: %DIAG-SP-6-TEST_RUNNING: Module 3: Running TestNewLearn{ID=10} ...

*Oct 16 10:41:36.475: %DIAG-SP-6-TEST_OK: Module 3: TestNewLearn{ID=10} has completed successfully

*Oct 16 10:41:36.475: %DIAG-SP-6-TEST_RUNNING: Module 3: Running TestStaticEntry{ID=11} ...

*Oct 16 10:41:37.255: %DIAG-SP-6-TEST_OK: Module 3: TestStaticEntry{ID=11} has completed successfully

*Oct 16 10:41:37.255: %DIAG-SP-6-TEST_RUNNING: Module 3: Running TestIndexLearn{ID=12} ...

*Oct 16 10:41:38.031: %DIAG-SP-6-TEST_OK: Module 3: TestIndexLearn{ID=12} has completed successfully

*Oct 16 10:41:38.031: %DIAG-SP-6-TEST_RUNNING: Module 3: Running TestCapture{ID=13} ...

*Oct 16 10:41:39.135: %DIAG-SP-6-TEST_OK: Module 3: TestCapture{ID=13} has completed successfully

*Oct 16 10:41:39.135: %DIAG-SP-6-TEST_RUNNING: Module 3: Running TestTrap{ID=14} ...

*Oct 16 10:41:39.923: %DIAG-SP-6-TEST_OK: Module 3: TestTrap{ID=14} has completed successfully

*Oct 16 10:41:39.923: %DIAG-SP-6-TEST_RUNNING: Module 3: Running TestMacNotification{ID=15} ...

*Oct 16 10:41:40.299: %DIAG-SP-6-TEST_OK: Module 3: TestMacNotification{ID=15} has completed successfully

*Oct 16 10:41:40.299: %DIAG-SP-6-TEST_RUNNING: Module 3: Running TestFibDevices{ID=16} ...

*Oct 16 10:41:42.687: %DIAG-SP-6-TEST_OK: Module 3: TestFibDevices{ID=16} has completed successfully

*Oct 16 10:41:42.687: %DIAG-SP-6-TEST_RUNNING: Module 3: Running TestIPv4FibShortcut{ID=17} ...

*Oct 16 10:41:43.367: %DIAG-SP-6-TEST_OK: Module 3: TestIPv4FibShortcut{ID=17} has completed successfully

*Oct 16 10:41:43.367: %DIAG-SP-6-TEST_RUNNING: Module 3: Running TestIPv6FibShortcut{ID=18} ...

*Oct 16 10:41:44.263: %DIAG-SP-6-TEST_OK: Module 3: TestIPv6FibShortcut{ID=18} has completed successfully

*Oct 16 10:41:44.263: %DIAG-SP-6-TEST_RUNNING: Module 3: Running TestNATFibShortcut{ID=19} ...

*Oct 16 10:41:45.031: %DIAG-SP-6-TEST_OK: Module 3: TestNATFibShortcut{ID=19} has completed successfully

*Oct 16 10:41:45.031: %DIAG-SP-6-TEST_RUNNING: Module 3: Running TestMPLSFibShortcut{ID=20} ...

*Oct 16 10:41:45.935: %DIAG-SP-6-TEST_OK: Module 3: TestMPLSFibShortcut{ID=20} has completed successfully

*Oct 16 10:41:45.935: %DIAG-SP-6-TEST_RUNNING: Module 3: Running TestL3Capture{ID=21} ...

*Oct 16 10:41:46.939: %DIAG-SP-6-TEST_OK: Module 3: TestL3Capture{ID=21} has completed successfully

*Oct 16 10:41:46.939: %DIAG-SP-6-TEST_RUNNING: Module 3: Running TestL3VlanMet{ID=22} ...

*Oct 16 10:41:48.023: %DIAG-SP-6-TEST_OK: Module 3: TestL3VlanMet{ID=22} has completed successfully

*Oct 16 10:41:48.023: %DIAG-SP-6-TEST_RUNNING: Module 3: Running TestIngressSpan{ID=23} ...

*Oct 16 10:41:48.743: %DIAG-SP-6-TEST_OK: Module 3: TestIngressSpan{ID=23} has completed successfully

*Oct 16 10:41:48.743: %DIAG-SP-6-TEST_RUNNING: Module 3: Running TestEgressSpan{ID=24} ...

*Oct 16 10:41:48.975: %DIAG-SP-6-TEST_OK: Module 3: TestEgressSpan{ID=24} has completed successfully

*Oct 16 10:41:48.975: %DIAG-SP-6-TEST_RUNNING: Module 3: Running TestAclPermit{ID=25} ...

*Oct 16 10:41:49.863: %DIAG-SP-6-TEST_OK: Module 3: TestAclPermit{ID=25} has completed successfully

*Oct 16 10:41:49.863: %DIAG-SP-6-TEST_RUNNING: Module 3: Running TestAclDeny{ID=26} ...

*Oct 16 10:41:57.127: %DIAG-SP-6-TEST_OK: Module 3: TestAclDeny{ID=26} has completed successfully

*Oct 16 10:41:57.127: %DIAG-SP-6-TEST_RUNNING: Module 3: Running TestQos{ID=27} ...

*Oct 16 10:41:58.199: %DIAG-SP-6-TEST_OK: Module 3: TestQos{ID=27} has completed successfully

*Oct 16 10:41:58.199: %DIAG-SP-6-TEST_RUNNING: Module 3: Running TestNetflowShortcut{ID=28} ...

*Oct 16 10:41:58.891: %DIAG-SP-6-TEST_OK: Module 3: TestNetflowShortcut{ID=28} has completed successfully

*Oct 16 10:41:58.891: %DIAG-SP-6-TEST_RUNNING: Module 3: Running TestFirmwareDiagStatus{ID=31} ...

*Oct 16 10:41:58.903: %DIAG-SP-6-TEST_OK: Module 3: TestFirmwareDiagStatus{ID=31} has completed successfully

*Oct 16 10:41:58.903: %DIAG-SP-6-TEST_RUNNING: Module 3: Running TestAsicSync{ID=32} ...

*Oct 16 10:41:58.903: %DIAG-SP-3-TEST_SKIPPED: Module 3: TestAsicSync{ID=32} is skipped

*Oct 16 10:41:58.903: %DIAG-SP-6-TEST_RUNNING: Module 3: Running TestFibTcamSSRAM{ID=29} ...

*Oct 16 10:41:58.903: DFC3: ******************************************************************

*Oct 16 10:41:58.903: DFC3: * WARNING:

*Oct 16 10:41:58.903: DFC3: * FIB TCAM/SSRAM Memory test on module 3 may take up to .

*Oct 16 10:41:58.903: DFC3: * During this time, please DO NOT perform any packet switching.

*Oct 16 10:41:58.903: DFC3: ******************************************************************

*Oct 16 10:42:04.803: DFC3: test_144bit_lookup: Tcam test walking 0, dev 0

*Oct 16 10:48:41.975: DFC3: test_144bit_lookup: Tcam test walking 1, dev 0

*Oct 16 10:55:18.715: DFC3: test_144bit_lookup: Tcam test walking 0, dev 1

*Oct 16 11:01:55.363: DFC3: test_144bit_lookup: Tcam test walking 1, dev 1

*Oct 16 11:08:32.067: DFC3: test_144bit_lookup: Tcam test walking 0, dev 2

*Oct 16 11:15:08.727: DFC3: test_144bit_lookup: Tcam test walking 1, dev 2

*Oct 16 11:21:45.651: DFC3: test_144bit_lookup: Tcam test walking 0, dev 3

*Oct 16 11:28:22.339: DFC3: test_144bit_lookup: Tcam test walking 1, dev 3

*Oct 16 11:34:59.359: DFC3: test_72bit_lookup: Tcam test walking 0, dev 0

*Oct 16 11:43:38.027: DFC3: test_72bit_lookup: Tcam test walking 1, dev 0

*Oct 16 11:52:16.219: DFC3: test_72bit_lookup: Tcam test walking 0, dev 1

*Oct 16 12:00:54.671: DFC3: test_72bit_lookup: Tcam test walking 1, dev 1

*Oct 16 12:09:33.311: DFC3: test_72bit_lookup: Tcam test walking 0, dev 2

*Oct 16 12:18:11.823: DFC3: test_72bit_lookup: Tcam test walking 1, dev 2

*Oct 16 12:26:50.363: DFC3: test_72bit_lookup: Tcam test walking 0, dev 3

*Oct 16 12:35:28.291: DFC3: test_72bit_lookup: Tcam test walking 1, dev 3

*Oct 16 12:44:06.731: DFC3: FIB TCAM Test Passed.

*Oct 16 12:44:15.063: DFC3: FIB SSRAM Test Passed.

****************************************************************************

* WARNING: TCAM/SSRAM are filled with test pattern. Module 3 MUST be reset *

****************************************************************************

*Oct 16 12:44:15.190: %DIAG-SP-6-TEST_OK: Module 3: TestFibTcamSSRAM{ID=29} has completed successfully

*Oct 16 12:44:15.190: %DIAG-SP-6-TEST_RUNNING: Module 3: Running TestEobcStressPing{ID=30} ...

*Oct 16 12:44:15.190: %DIAG-SP-3-TEST_SKIPPED: Module 3: TestEobcStressPing{ID=30} is skipped

And then I get an error:


*Oct 16 12:46:40.187: %HA_EM-6-LOG: Mandatory.go_fabrich1.tcl: GOLD EEM TCL policy for TestFabricCh1Health

Router#sh diagnostic result module 3

Current bootup diagnostic level: minimal

Module 3: CEF720 8 port 10GE with DFC  SerialNo : SAL16084SR5

  Overall Diagnostic Result for Module 3 : MAJOR ERROR

  Diagnostic level at card bootup: minimal

  Test results: (. = Pass, F = Fail, U = Untested)

    1) TestOBFL ------------------------> .

    2) TestFabricCh0Health -------------> .

    3) TestFabricCh1Health -------------> F

    4) TestTransceiverIntegrity:

      Port  1  2  3  4  5  6  7  8

      ----------------------------

            U  U  U  U  U  U  U  U

    5) TestLoopback:

      Port  1  2  3  4  5  6  7  8

      ----------------------------

            .  .  .  .  .  .  .  .

    6) TestScratchRegister -------------> .

    7) TestSynchedFabChannel -----------> .

    8) TestDontLearn -------------------> .

    9) TestConditionalLearn ------------> .

   10) TestNewLearn --------------------> .

   11) TestStaticEntry -----------------> .

   12) TestIndexLearn ------------------> .

   13) TestCapture ---------------------> .

   14) TestTrap ------------------------> .

   15) TestMacNotification -------------> .

   16) TestFibDevices ------------------> .

   17) TestIPv4FibShortcut -------------> .

   18) TestIPv6FibShortcut -------------> .

   19) TestNATFibShortcut --------------> .

   20) TestMPLSFibShortcut -------------> .

   21) TestL3Capture -------------------> .

   22) TestL3VlanMet -------------------> .

   23) TestIngressSpan -----------------> .

   24) TestEgressSpan ------------------> .

   25) TestAclPermit -------------------> .

   26) TestAclDeny ---------------------> .

   27) TestQos -------------------------> .

   28) TestNetflowShortcut -------------> .

   29) TestFibTcamSSRAM ----------------> .

   30) TestEobcStressPing --------------> U

   31) TestFirmwareDiagStatus ----------> .

   32) TestAsicSync --------------------> U

But after reloading the module and running minimal diagnostics, I get:

Router#sh diagnostic result

Current bootup diagnostic level: minimal

Module 1: CEF720 8 port 10GE with DFC  SerialNo : SAL16084SR5

  Overall Diagnostic Result for Module 1 : PASS

  Diagnostic level at card bootup: minimal

  Test results: (. = Pass, F = Fail, U = Untested)

    1) TestOBFL ------------------------> .

    2) TestFabricCh0Health -------------> .

    3) TestFabricCh1Health -------------> .

    4) TestTransceiverIntegrity:

      Port  1  2  3  4  5  6  7  8

      ----------------------------

            U  U  U  U  U  U  U  U

    5) TestLoopback:

      Port  1  2  3  4  5  6  7  8

      ----------------------------

            .  .  .  .  .  .  .  .

    6) TestScratchRegister -------------> .

    7) TestSynchedFabChannel -----------> .

    8) TestDontLearn -------------------> U

    9) TestConditionalLearn ------------> .

   10) TestNewLearn --------------------> U

   11) TestStaticEntry -----------------> U

   12) TestIndexLearn ------------------> U

   13) TestCapture ---------------------> U

   14) TestTrap ------------------------> U

   15) TestMacNotification -------------> .

   16) TestFibDevices ------------------> .

   17) TestIPv4FibShortcut -------------> .

   18) TestIPv6FibShortcut -------------> .

   19) TestNATFibShortcut --------------> .

   20) TestMPLSFibShortcut -------------> .

   21) TestL3Capture -------------------> U

   22) TestL3VlanMet -------------------> .

   23) TestIngressSpan -----------------> .

   24) TestEgressSpan ------------------> .

   25) TestAclPermit -------------------> .

   26) TestAclDeny ---------------------> U

   27) TestQos -------------------------> .

   28) TestNetflowShortcut -------------> .

   29) TestFibTcamSSRAM ----------------> U

   30) TestEobcStressPing --------------> U

   31) TestFirmwareDiagStatus ----------> .

   32) TestAsicSync --------------------> U

I tried changing the IOS and the Sup720, but I still get this error.

But if I run this test with an RSP720-3C-GE, the test completes and all tests pass.

Current bootup diagnostic level: minimal

Module 1: CEF720 8 port 10GE with DFC  SerialNo : SAL16084SR5

  Overall Diagnostic Result for Module 1 : PASS

  Diagnostic level at card bootup: minimal

  Test results: (. = Pass, F = Fail, U = Untested)

    1) TestOBFL ------------------------> .

    2) TestFabricCh0Health -------------> .

    3) TestFabricCh1Health -------------> .

    4) TestTransceiverIntegrity:

      Port  1  2  3  4  5  6  7  8

      ----------------------------

            U  U  U  U  U  U  U  U

    5) TestLoopback:

      Port  1  2  3  4  5  6  7  8

      ----------------------------

            .  .  .  .  .  .  .  .

    6) TestScratchRegister -------------> .

    7) TestSynchedFabChannel -----------> .

    8) TestDontLearn -------------------> .

    9) TestConditionalLearn ------------> .

   10) TestNewLearn --------------------> .

   11) TestStaticEntry -----------------> .

   12) TestIndexLearn ------------------> .

   13) TestCapture ---------------------> .

   14) TestTrap ------------------------> .

   15) TestMacNotification -------------> .

   16) TestFibDevices ------------------> .

   17) TestIPv4FibShortcut -------------> .

   18) TestIPv6FibShortcut -------------> .

   19) TestNATFibShortcut --------------> .

   20) TestMPLSFibShortcut -------------> .

   21) TestL3Capture -------------------> .

   22) TestL3VlanMet -------------------> .

   23) TestIngressSpan -----------------> .

   24) TestEgressSpan ------------------> .

   25) TestAclPermit -------------------> .

   26) TestAclDeny ---------------------> .

   27) TestQos -------------------------> .

   28) TestNetflowShortcut -------------> .

   29) TestFibTcamSSRAM ----------------> .

   30) TestEobcStressPing --------------> U

   31) TestFirmwareDiagStatus ----------> .

   32) TestAsicSync --------------------> U

   33) TestErrorCounterMonitor ---------> .

What do you recommend I do?

Thanks!

------------------------------------------------------
Helping seriously ill children, all together. All information about this, is posted on my blog       


Accepted Solutions

Arumugam Muthaiah
Cisco Employee

Hi Oleg,

The test constantly monitors the health of the ingress and egress data paths for fabric channel 1 on 10-gigabit modules.

The test runs every five seconds. Ten consecutive failures are treated as fatal and the module resets; three consecutive reset cycles may result in a fabric switchover.

Three consecutive resets power down the module.

Your issue matches the known DDTS CSCtq54730:

CSCtq54730  - TestFabricCh1Health fails on WS-X6708-10GE

Symptom:

TestFabricCh1Health fails when "diagnostic start module <> test all" is executed, resulting in a Major failure.

Conditions:

When you execute "diagnostic start module 8 test all", it runs disruptive tests, including tests that require a reset of the card.

There is a clear console message that, when FIB TCAM memory is tested, the card has to be reset after the test and that normal operation may be affected; in this case the FabricCh1 HM (health monitoring) test is affected. The FabricCh1 health test also runs in parallel, as an HM test, alongside the manually started FibTcam test.

Ideally, while a test that requires a reset is running, all other tests should be skipped. But there is a software bug in the code: this particular FabricCh1 HM test is not skipped, so it fails and produces a Major error, which is misleading.

Workaround:

A workaround that you can suggest to your customer running SRD:

We run all these tests as part of boot-up, and I don't think they need to be run again. But if they still want to run them, I suggest disabling all the HM tests first and then running the tests manually, so that there are no conflicts, i.e. in configuration mode (conf t): no diagnostic monitor module <> test all.
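
On the console, the workaround above could look roughly like the following. This is a sketch only, assuming module 3 as in the original post. The module reset and the re-enabling of health monitoring at the end are assumptions about a sensible sequence, not part of the bug text, and command availability depends on the IOS release.

```
! Disable health-monitoring (HM) tests on the module before the manual run
Router# configure terminal
Router(config)# no diagnostic monitor module 3 test all
Router(config)# end

! Run the full on-demand suite (disruptive; TestFibTcamSSRAM took about two hours in the log above)
Router# diagnostic start module 3 test all
Router# show diagnostic result module 3

! Assumption: reset the module afterwards, as the TCAM/SSRAM warning requires
Router# hw-module module 3 reset

! Assumption: turn health monitoring back on once the module is up again
Router# configure terminal
Router(config)# diagnostic monitor module 3 test all
Router(config)# end
```

Comparing the output of "show diagnostic content module 3" before and after may help confirm which tests have health monitoring enabled, since re-enabling with "test all" may not match the original defaults.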

Regards,

Aru

*** Please rate if the post is useful ***



Thanks, dear Arumugam!

You really helped me!
