
9000v satellite orbit iccp behaviour


Hi,


I'm setting up 9000v2 satellites in dual-home as well as simple ring topologies with ICCP/ORBIT. The hosts are a 9904 and a 9006, running 6.2.3 and 6.4.2. I have been mostly successful in bringing up the setup and achieving active/standby, but some things are not clear and I did not find answers in the documentation, so I would appreciate some help.


Q1. Under normal conditions, ASR9904 is the active host servicing traffic to/from the satellite/s in my setup. It also seems to be the Primary ICCP device. (see below)

 

Now, if I either change the host-priority for the satellite or fail the ICL from the 9904 to the satellite, failover occurs and Satellite 100 becomes active on the 9006, which was previously standby. This works fine.

 

The question is: the ICCP group remains "Role: Primary" on the 9904 even when the satellite fails over to the other host. I also brought down the backbone link on the 9904, and while the logs show that the hosts lost their ICCP connection to each other, the 9904 still reports Primary and the 9006 still reports Secondary. This doesn't sound right. Should the 9006 not become Primary?

 

RP/0/RSP0/CPU0:A9904-LAB#sh nv satellite status
Thu Oct 24 11:14:59.225 CET
Satellite 100
-------------
Status: Connected (Stable)
Redundancy: Active (Group: 1)
Type: asr9000v2
MAC address: 042a.e2c9.8edc
IPv4 address: 10.0.100.1 (auto, VRF: **nVSatellite)
Serial Number: CAT2025U2SE
Remote version: Compatible (older version)
ROMMON: 127.1 (Available: 128.1)
FPGA: 1.1 (Available: 1.13)
IOS: 378.0 (Available: 653.100)
Configured satellite fabric links:
TenGigE0/0/0/3
--------------
Status: Satellite Ready
Remote ports: GigabitEthernet0/0/0-43

RP/0/RSP0/CPU0:A9904-LAB#sh nv satellite protocol redundancy
Thu Oct 24 11:15:10.108 CET
ICCP Group: 1
-------------
Status: Connected since 2019/10/23 12:11:59.902
Role: Primary (System MAC: 02a7.4201.c15e)
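To compare the two roles side by side on each host, the relevant fields can be pulled out of the outputs above with a short script. This is only a minimal sketch in Python, assuming the output format is exactly as pasted here; the function names are hypothetical and the sample strings are trimmed copies of the output above.

```python
import re

# Trimmed copies of the CLI output pasted above (assumed format).
STATUS_OUTPUT = """\
Satellite 100
-------------
Status: Connected (Stable)
Redundancy: Active (Group: 1)
"""

REDUNDANCY_OUTPUT = """\
ICCP Group: 1
-------------
Status: Connected since 2019/10/23 12:11:59.902
Role: Primary (System MAC: 02a7.4201.c15e)
"""


def satellite_redundancy(text):
    """Map satellite ID -> (redundancy state, ICCP group) from 'sh nv satellite status'."""
    result = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        m = re.match(r"Satellite (\d+)$", line)
        if m:
            current = int(m.group(1))
            continue
        m = re.match(r"Redundancy: (\w+) \(Group: (\d+)\)", line)
        if m and current is not None:
            result[current] = (m.group(1), int(m.group(2)))
    return result


def iccp_roles(text):
    """Map ICCP group -> role from 'sh nv satellite protocol redundancy'."""
    result = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        m = re.match(r"ICCP Group: (\d+)$", line)
        if m:
            current = int(m.group(1))
            continue
        m = re.match(r"Role: (\w+)", line)
        if m and current is not None:
            result[current] = m.group(1)
    return result


if __name__ == "__main__":
    roles = iccp_roles(REDUNDANCY_OUTPUT)
    for sat_id, (state, group) in satellite_redundancy(STATUS_OUTPUT).items():
        print(f"satellite {sat_id}: {state}; ICCP group {group} role: {roles.get(group)}")
```

Running the same two functions against the output collected from the 9006 makes it easy to see which host is Active for the satellite and which is Primary for the ICCP group at any point in the failover tests.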




Q2. Suppose I set up more than one simple ring topology: Satellite 100 connected to the 9904 and Satellite 101 connected to the 9006 form one simple ring, and I then want to set up Satellite 102 and Satellite 103 as a new, independent ring whose ICL ports connect to the same two ASR9k hosts (on different TenGigE ports, of course). Should I use a separate ICCP group for this new ring? In other words, does one ICCP group belong to one MC-LAG bundle, or to one simple ring?


Q3. The documentation says satellite IDs should be in the range 100-230, but the ASR9k lets you configure a much bigger range, and ID 300 does in fact work. Is this a limitation of older code that no longer applies, or should I strictly stick to this range even with XR 6.4?

 

thanks

 

Mark

1 Accepted Solution

smilstea
Cisco Employee

Answering in reverse order: the 100-230 range has to do with CRS, not the ASR9k, as rack numbers around 240 are used for fabric chassis in multichassis setups. I see no issues with using 300 or larger values for satellites.

 

There is really no reason to have multiple ICCP groups if all the satellites are used with the same two ASR9k hosts.

Here is documentation on that feature:

https://www.cisco.com/c/en/us/td/docs/routers/asr9000/software/asr9k_r6-0/nV/configuration/guide/b_nv_cg60xasr9k/b_nv_cg60xasr9k_chapter_010.html#concept_EA1604BAE1574BA48DA5B15DA2CF93EB

 

 

When we talk about redundancy, we have to think about ICCP redundancy and satellite redundancy separately. The satellite has a TCP control channel towards each host, one active and one standby. If the satellite determines that the active host is gone, or you change the priority on the host, a dataplane switchover to the other host occurs. This is independent of ICCP, although ICCP is used to sync state between the hosts.

Whether a host is Primary or Secondary for ICCP is irrelevant to the satellite; all that really matters is that ICCP is up. The host that is active for the satellite dataplane and the host that is Primary for the ICCP group don't need to be the same. Hopefully my rambling made some sense.
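The separation described above can be illustrated with a toy model. This is only a sketch in Python, not how IOS XR implements it: the `Host` class, the `active_host` function, the second hostname and the priority values are all hypothetical, and it assumes that a lower host-priority value is preferred and that the election is preemptive.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    iccp_role: str       # "Primary" or "Secondary"; deliberately ignored by the election
    host_priority: int   # assumption: lower value is preferred
    icl_up: bool = True  # whether the satellite can reach this host over its ICL


def active_host(hosts):
    """Pick the satellite's active host from the reachable hosts by priority only."""
    reachable = [h for h in hosts if h.icl_up]
    if not reachable:
        return None
    return min(reachable, key=lambda h: h.host_priority)


if __name__ == "__main__":
    a9904 = Host("A9904-LAB", "Primary", host_priority=10)
    a9006 = Host("A9006-LAB", "Secondary", host_priority=20)  # hypothetical name and values
    hosts = [a9904, a9006]

    print("normal:", active_host(hosts).name)

    # Fail the ICL on the 9904: the satellite moves, the ICCP role does not.
    a9904.icl_up = False
    print("ICL failed:", active_host(hosts).name, "| 9904 ICCP role:", a9904.iccp_role)

    # ICL restored: under the preemption assumption the preferred host takes over again.
    a9904.icl_up = True
    print("ICL restored:", active_host(hosts).name)
```

The point of the sketch is that `iccp_role` never appears in the election, so a host can stay Primary for the ICCP group while being standby for the satellite.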

 

Sam

 


One interesting aspect I ran into was that failover worked as advertised when the primary host failed, but when the primary came back up, the satellite would not fall back to it as the priority settings dictated. If I remember correctly, this was tied to running graceful-restart under LDP; removing it seemed to allow the satellite to move back to the primary dynamically.

That was on 9006s running 5.3.3. I've always been curious about the relationship between this behavior and LDP graceful-restart.

While this is a tangent from the question, I'll be interested to see if it comes into play here as well.

 

 

By default when the primary comes back up it has a higher priority and will take over again as the primary (causing a switchover).

 

I would be curious to see your setup when the primary comes back up, what the t-LDP looks like, the satellite status, etc. I did some searching and can't find anything to explain this behavior.

 

Sam

 

Maybe if I can find my old configs, I can replicate it in our lab.  Appreciate you looking into it.

Hi Sam,

Thanks. Yes, your rambles are just fine and filled in the blanks :)

I will post updates here when I complete my testing, to see whether any issue shows up with the t-LDP behaviour mentioned above.

cheers

Mark