
ASR9000/XR optic related errors explained


The output of show controller <interface> can report a defect that helps identify why the interface is not transmitting, not receiving, or not coming up.

This is a bit of a bare doc, but hopefully it explains the basics behind the L1 related issues that can be identified with this command, and what remedy to apply for each.

 

Optics related defects: (as seen with show controller)
 
=======================
NO_OPTICS - Optics is not present
TX_FAULT - Optics is reporting Transmit Fault. Try replacing Optics.
MOD_NOT_RDY - Optics is reporting module not ready (Transmit laser,
              Clocking issue inside Optics). Try replacing Optics.
RX_LOS - Optics is seeing receive Loss-of-Signal (will cause LASI and
         HW_LINK defects). Check fiber.
XCVR_SECURITY - Not a Cisco supported Optics. Security check failed.
                Transmit laser is turned off. Use Cisco supported Optics.
SFP_HW_LINK - 100FX/Copper SFP is seeing receive link as DOWN.
              Check cable and speed/duplex/autoneg config.
XCVR_PID_UNSUP - Optics Product ID is not supported for ASR9k.
XCVR_TYPE_UNSUP - Not a correct Optics type for this port (e.g., a 1GE Optics
                  type is inserted in a 10GE port)
UNSUP_CFG - User has configured unsupported config (i.e., Autoneg configured
            for Copper/100FX SFP). Port transmit is disabled.
 
PHY related defect: (only in 10G port)
===================
LASI    - PHY TX/RX link is DOWN (because of RX_LOS/PCS/PMA layers DOWN)
  ('sh controllers tenGigE <> xgxs' provides additional information)
 
MAC related defects:
====================
HW_LINK - MAC RX link is DOWN (because of RX_LOS/PCS/PMA layers DOWN)
RFI - MAC is getting Remote fault (far-end must be seeing local fault)
AUTO_NEG - Auto-neg failure
 
Port transmit link is down because of Fabric related defect:
===========================================================
FIA_INI  - Fabric is not ready to receive (some initialization issue).
FIA_SHUTDOWN - Fabric is shutdown
 
G.709/OTN related defects: (more in 'sh controller dwdm <> g709')
=================================================================
G709_LICENSE - No G.709 license is available. Port transmit is disabled.
ADV_OPT_PIE  - Advanced Optics PIE is not installed. Port transmit is
               disabled.
DWDM_LASER_SHUT - Transmit laser is disabled because user has configured
                  'admin-state [Out-of-service | Out-of-service-Maintenance]'
                  under 'controller dwdm <>'.
 
Not a defect:
=============
MODE_CHANGE - Port transmit is disabled momentarily during mode change
             (lan/wanphy/otn changed using 'transport-mode <>'
              under 'interface tengig <>')
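
The defect names above lend themselves to a simple lookup when triaging a port. The Python sketch below is only an illustration: `DEFECTS` and `explain` are hypothetical names, not part of any Cisco tool, and the entries just restate a few descriptions from the list above.

```python
# Hypothetical lookup table; the wording restates entries from the list above.
DEFECTS = {
    "NO_OPTICS": ("optics", "Optics is not present."),
    "TX_FAULT": ("optics", "Optics reports a transmit fault; try replacing it."),
    "RX_LOS": ("optics", "Receive loss of signal; check the fiber."),
    "LASI": ("phy", "PHY TX/RX link is down (RX_LOS or PCS/PMA layers down)."),
    "HW_LINK": ("mac", "MAC RX link is down (RX_LOS or PCS/PMA layers down)."),
    "RFI": ("mac", "Remote fault received; the far end must be seeing a local fault."),
    "MODE_CHANGE": ("info", "Not a defect; transmit is disabled briefly during a mode change."),
}


def explain(flags):
    """Return one line per defect name, marking names missing from the table."""
    lines = []
    for flag in flags:
        layer, text = DEFECTS.get(flag, ("unknown", "Not in this table."))
        lines.append(f"{flag} [{layer}]: {text}")
    return lines


if __name__ == "__main__":
    # RX_LOS typically shows up together with LASI and HW_LINK, per the list above.
    for line in explain(["RX_LOS", "LASI", "HW_LINK"]):
        print(line)
```

Grouping by layer (optics, PHY, MAC) mirrors the sections above and makes it easier to start at the lowest layer that reports a defect.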
 
 
---------------
Xander Thuijs CCIE #6775
ASR9000 & XR Principal Engineer
 
Comments
Nan Bai
Cisco Employee

hi Xander

Thanks for sharing. This is Nan from China TAC.

A customer wants to know the meaning of the following error message about a detected signal failure, the possible causes that trigger the log, and what happens after the log is generated.

LC/0/3/CPU0:Feb 27 11:33:48.445 : vic_0_2[366]: %PLATFORM-VIC-4-SIGNAL : Interface HundredGigE0/3/0/2, Detected Signal failure

Is it a local fault, a remote fault, or possibly both?

smailmilak
Enthusiast

HI Nan,

use this command and check the Tx/Rx.

show controllers HundredGigE0/3/0/2 phy

You will find a lot of useful info there. 

This is the output of a proper link (TenG).

Temperature: 27.656
Voltage: 3.295 Volt
Tx Bias: 4.964 mAmps
Tx Power: 0.61130 mW (-2.13746 dBm)
Rx Power: 0.53820 mW (-2.69056 dBm)  --------> If Rx power is far outside this range, the fault may be on the remote side.
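
The mW and dBm figures in that output are two views of the same reading, related by dBm = 10 * log10(P / 1 mW). A minimal Python sketch of the conversion follows; `mw_to_dbm` is a hypothetical helper name, not something the router provides.

```python
import math


def mw_to_dbm(milliwatts):
    """Convert optical power in mW to dBm: 10 * log10(P / 1 mW)."""
    if milliwatts <= 0:
        # Zero or negative power has no dBm representation (no light at all).
        raise ValueError("power must be positive to express in dBm")
    return 10 * math.log10(milliwatts)


if __name__ == "__main__":
    # Values taken from the sample output above.
    for label, mw in (("Tx", 0.61130), ("Rx", 0.53820)):
        print(f"{label}: {mw:.5f} mW = {mw_to_dbm(mw):.3f} dBm")
```

Run against the Tx and Rx values quoted above, this should reproduce the dBm figures shown in parentheses to within rounding.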

xthuijs
Cisco Employee

Yeah, to add to Smail's reply, Nan: the VIC signal error is basically a generic error indicating that the optic has lost signal. There is a deeper underlying reason, for instance the receive power is lost (no light) or we experienced PCS or PHY errors. There is likely more syslog around it showing that the link or lane had BER (bit errors) or something similar.

The output below is helpful to determine that.

At times this can be caused by a dirty or faulty CPAK/optic, so try a different optic or fiber. Also try the same optic/cable in a different port to see whether it is the physical port or the optic that is acting up.

xander

sho controller HundredGigE0/9/0/7 phy
PHY data for interface: HundredGigE0/9/0/7


Rx        64B66B      Lane        Sync        PCS         Virt  PCS
Service   Block       Marker      Header      Lane        Lane  Lane
Lane      Lock        Sync        Err Cnt     BIP Errors  Error Mapping
--        ---------   ---------   ----------  ----------  ----- -------
0         Locked      Locked      0           0           Clean 1
1         Locked      Locked      0           0           Clean 2
2         Locked      Locked      0           0           Clean 4
3         Locked      Locked      0           0           Clean 7
4         Locked      Locked      0           0           Clean 8
5         Locked      Locked      0           0           Clean 10
6         Locked      Locked      0           0           Clean 13
7         Locked      Locked      0           0           Clean 15
8         Locked      Locked      0           0           Clean 17
9         Locked      Locked      0           0           Clean 19
10        Locked      Locked      0           0           Clean 0
11        Locked      Locked      0           0           Clean 3
12        Locked      Locked      0           0           Clean 5
13        Locked      Locked      0           0           Clean 6
14        Locked      Locked      0           0           Clean 9
15        Locked      Locked      0           0           Clean 11
16        Locked      Locked      0           0           Clean 12
17        Locked      Locked      0           0           Clean 14
18        Locked      Locked      0           0           Clean 16
19        Locked      Locked      0           0           Clean 18


*** PHY PCS PMA Statistics ***
Rx        Rx          Aligment    PCS         PCS
Service   Block       Marker      Lane        Lane
Lane      Lock        Lock        BIP Errors  Mapping
-------   ---------   ---------   ----------  -------
0         Locked      Locked      5          10
1         Locked      Locked      5          11
2         Locked      Locked      5          0
3         Locked      Locked      5          6
4         Locked      Locked      5          5
5         Locked      Locked      5          1
6         Locked      Locked      5          12
7         Locked      Locked      5          7
8         Locked      Locked      4          16
9         Locked      Locked      5          2
10        Locked      Locked      5          14
11        Locked      Locked      5          15
12        Locked      Locked      5          19
13        Locked      Locked      5          18
14        Locked      Locked      5          17
15        Locked      Locked      5          3
16        Locked      Locked      5          8
17        Locked      Locked      5          13
18        Locked      Locked      5          4
19        Locked      Locked      5          9
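
For anyone who wants to scan such output automatically, here is a rough Python sketch that flags lanes which are not locked or show BIP errors. It assumes the two row layouts pasted above (7 columns or 5 columns); `lanes_needing_attention` is a hypothetical helper, and the "Unlocked" row in the sample is invented to exercise that branch, so the real CLI wording may differ.

```python
def lanes_needing_attention(phy_output):
    """Return (lane, reason) tuples for rows that are unlocked or show BIP errors.

    Assumes the row layouts in the sample output above:
    7 columns (BIP errors in column 5) or 5 columns (BIP errors in column 4).
    """
    findings = []
    for line in phy_output.splitlines():
        fields = line.split()
        if len(fields) not in (5, 7) or not fields[0].isdigit():
            continue  # header, separator, or unrelated line
        lane = int(fields[0])
        if fields[1] != "Locked" or fields[2] != "Locked":
            findings.append((lane, "not locked"))
        bip_field = fields[4] if len(fields) == 7 else fields[3]
        if bip_field.isdigit() and int(bip_field) > 0:
            findings.append((lane, f"{bip_field} BIP errors"))
    return findings


# First three rows are copied from the output above; the last row is invented.
SAMPLE = """\
0         Locked      Locked      0           0           Clean 1
1         Locked      Locked      0           0           Clean 2
8         Locked      Locked      4          16
9         Unlocked    Locked      5          2
"""

if __name__ == "__main__":
    for lane, reason in lanes_needing_attention(SAMPLE):
        print(f"lane {lane}: {reason}")
```

A nonzero BIP counter may simply be historical, so it is probably more useful to run this twice and compare whether the counts are still incrementing.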

Aleksandar Vidakovic
Cisco Employee

Nan,

you can look into the "show controllers hu0/3/0/2 phy" to see what alerts are raised.

/Aleksandar

Nan Bai
Cisco Employee

Really appreciate everyone's help.

Nan Bai
Cisco Employee

I still have some questions.

The customer mentioned that an interface did not go down when SF was detected. Is that possible?

What is the relationship between the "Detected Signal failure" log, the BER SF threshold, and the port going down?

I mean, must the interface go down when the SF log is generated? Is the log generated only when the BER exceeds the SF threshold?

xthuijs
Cisco Employee

Correct, if the bit error rate is "bad enough" then the link will be declared down. The thresholds can also be seen in show controller <> phy.

xander

smailmilak
Enthusiast

I have many times had an issue with DWDM where I did a "shutdown" on one side and the link stayed UP/UP on the remote side.

It's important that the DWDM team enables "loss forwarding" (the term on Coriant) so that the other side knows that the link is down on the remote side.

Not sure if your customer has this issue, Nan.

Nan Bai
Cisco Employee

Hi all,

Thanks for your kind help. The customer is using the 8x100G Tomahawk LC.

Maybe I was not clear in my previous message, sorry for that.

In fact, the customer observed this error log:

LC/0/3/CPU0:Feb 27 11:33:48.445 : vic_0_2[366]: %PLATFORM-VIC-4-SIGNAL : Interface HundredGigE0/3/0/2, Detected Signal failure

but they did not observe any down or flap event on interface HundredGigE0/3/0/2.

The customer wants to know whether the interface can stay up even though SF has been detected.

If the interface goes down only when the error rate is "bad enough", does that mean the customer's link was not bad enough, because the interface did not go down? In other words, the log does not mean the SF rate exceeded the threshold; any SF event triggers the log, but only an event whose rate exceeds the threshold brings the interface down. Is my understanding correct?

Jon Berres
Enthusiast

Hey Xander,

Thanks for the details on 9k optic errors. Great information!

I have a question related to the VIC errors on the 9k. Do you have additional details on what they may mean from the 9k's perspective? We have run into a few lately on 10G links transported over optical networks. The confusing thing is that when we see the RFI error, the problem has been on the near side, not the far side as the error would seem to indicate. Is there any logic in how the 9k determines an RFI?

In the example below we had a 10G link flapping between (2) ASR9k routers.

Near end 9k example (TXP problem occurs on the local link on this end):

LC/0/3/CPU0:Aug  8 06:37:14.712 : vic[369]: %PLATFORM-VIC-4-RFI : Interface TenGigE0/3/0/4, Detected Remote Fault

Far end 9k example

LC/0/3/CPU0:Aug  4 20:39:46.421 : vic[369]: %PLATFORM-VIC-4-SIGNAL : Interface TenGigE0/3/0/4, Detected Signal failure 

Example of syslog when we actually unplug the fiber from the optic

LC/0/3/CPU0:Aug  4 20:40:06.786 : vic[369]: %PLATFORM-VIC-4-RX_LOS : Interface TenGigE0/3/0/4, Detected Rx Loss of Signal 

Thanks,

Jon
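
When several of these VIC messages show up on both ends of a link, it can help to pull out the mnemonic and interface and line them up by time. The Python sketch below is one way to do that; the regular expression is inferred only from the three sample lines quoted above, and `VIC_RE` and `parse_vic` are hypothetical names, so other releases or message types may need a different pattern.

```python
import re

# Pattern inferred from the sample syslog lines above; treat it as an assumption.
VIC_RE = re.compile(
    r"%PLATFORM-VIC-(?P<severity>\d)-(?P<mnemonic>[A-Z_]+)\s*:\s*"
    r"Interface (?P<interface>[^,\s]+), (?P<text>.+)"
)


def parse_vic(line):
    """Return a dict with severity, mnemonic, interface and text, or None."""
    match = VIC_RE.search(line)
    if match is None:
        return None
    event = match.groupdict()
    event["severity"] = int(event["severity"])
    event["text"] = event["text"].strip()
    return event


if __name__ == "__main__":
    sample = (
        "LC/0/3/CPU0:Aug  8 06:37:14.712 : vic[369]: %PLATFORM-VIC-4-RFI : "
        "Interface TenGigE0/3/0/4, Detected Remote Fault"
    )
    print(parse_vic(sample))
```

The mnemonics (RFI, SIGNAL, RX_LOS) correspond to the three situations described in the examples above, so sorting the parsed events from both routers by timestamp should show which end reported a fault first.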

xthuijs
Cisco Employee

Hi Jon! Thanks! :) You know, VIC is the Cisco-specific driver that allows us to use different PHY implementations, like what you get when you are using MOD cards with MPAs. The VIC is a sort of abstraction layer that sits between the Ethernet drivers and the physical hardware.

In this case, your messages are just the VIC reporting the L1 issues from the PHY.

cheers!

xander

jineshman
Beginner

Hi Experts,

I am Jinesh, from an optics background. We are facing a 10G interface flap issue whenever there is a switchover in the transport L1 ASON network. Transport testing with an Ethernet tester shows no defects, and switching takes less than 50 ms. Carrier delay is configured as 150 ms for both down and up. We observe "Detected Signal failure" at one end and "Detected Remote Fault" at the other end, and the interface goes down. This behaviour is intermittent. The SFP is a 10GLR and the router is an ASR9k.

Will the interface come down on detecting bit errors or stressed frames before the carrier-delay timers expire? The transport team claims they do not shut the laser until 300 ms have passed and send idle frames in the meantime.

Kindly share your views and suggestions to resolve this issue.

jigojar
Beginner

Hi xthuijs,

 

We are facing the same problem too. How can we overcome it, and why do we receive the signal failure message even though the physical port is up?