
P2P Issues Between NX-OS and IOS

Hello all.

This is my first post and I'll try to be as detailed as possible.  I am upgrading the core of our network with two Nexus 6004s that connect north to two Catalyst 7606s.  The 6004s also have connections going south to two Nexus 6001s.  Everything is eBGP with all P2P links, detailed like this (for clarity's sake, I'm only going to reference one of each box):

7606 -> 6004 (Port-Channel - two 10Gb links on both sides)

6004 -> 6001 (40Gb P2P)

The eBGP peerings between the NX boxes come up just fine.  The peerings between the 6004 and the 7606 do not come up whatsoever.  After digging around and debugging some BGP packets, I noticed that TCP never establishes at all.  Setting BGP aside for the moment, I then noticed that when I run pings from the 7606 with the df-bit set, a size of 1500, and a count of 100 (for instance), every 15th packet is dropped, consistently.  If I change the size up or down, packets still drop, just at different intervals.  For example, send a packet size of 1100 and it's every 25th packet; send a size of 8000 (when trying to set the MTU manually on the interface) and every 3rd packet was dropped.  Here is what I have done so far (a sketch of the ping tests follows the list below):

Set MTU manually

Set P2P to a single link only

Wiresharked the link (no useful info aside from the missing TCP response)

Wiped the NX box clean and configured only the interface

ip tcp path-mtu-discovery was enabled globally on the 7606; I added it to the 6004 as well

Configured static speed and duplex settings
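
For reference, the ping tests I mentioned were along these lines from the 7606 (a sketch only; 10.251.177.2 is the 6004 side of the P2P, and depending on the code you may need the interactive extended ping to set the DF bit):

ping 10.251.177.2 size 1500 repeat 100 df-bit
ping 10.251.177.2 size 1100 repeat 100 df-bit
ping 10.251.177.2 size 8000 repeat 100 df-bit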

I'm certain I've done a lot more that I cannot think of at the moment (I have it documented at work).  When I run debug ip tcp transactions, I notice that the syn_sent to the neighbor (when originally trying to set up BGP) was timing out.  It almost appears as though this is some buffer or window issue with the NX box, but I am coming up short in my research on how to potentially fix it.  Before I call TAC, I figured I'd post this.

I'm 99% certain it's not a fiber or L1 issue, as both NX boxes, which have redundant P2P links to both 7606s, are having this exact same issue.  I'm also leaning toward a potential bug between IOS and NX-OS; not too sure.

Any help would be appreciated.

Thanks.

-Michael 

Accepted Solution

I was able to identify the issue.  The 7600 is a very old box, and at some point someone had configured a copp_management ACL on it.  I found this by pure luck: a coworker suggested peering the 6004 with our 4948, since it is also an IOS device.  The 4948 was clean and the peering came right up.

For that test, though, I had just used random 192.168 space to bring up the point-to-point between the 6004 and the 4948.  So I left that address space on the 6004 interface, and since we were running single-mode LC to SC, I swapped in a multimode cable between the 6004 and the 7606.  I then configured the 7606 interface with the 192.168 space as well.  Boom, BGP came up.

My initial thought was that something was funky with all eight single-mode runs.  It didn't make much sense to me, but whatever.  I then changed everything back to the 10 space and, what do you know, no BGP peering.  Now I was intrigued.  Next step: I changed the interfaces to 172.16 space.  No peering.  That screamed some type of ACL to me.  I went through the ACLs on the 7606 looking for one that permitted 192.168 space but not the 10 or 172.16 space.  I found one named copp_management that must have been created years ago.  The 192.168 entry had hits on it, and when I set the interfaces back to the 192.168 space the counter kept increasing.  Bingo, I knew this was it.  I added a permit for the 10 space, readdressed my interfaces, and BGP came up.
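
For anyone else who runs into this, the check and the fix on the 7600 side were roughly along these lines (a sketch only; I'm assuming copp_management is an extended ACL here, and the actual permit I added may have been narrower than this):

show policy-map control-plane
show ip access-lists copp_management

configure terminal
 ip access-list extended copp_management
  permit ip 10.0.0.0 0.255.255.255 any
 end

The hit counters on show ip access-lists were what gave it away, so checking those while traffic is failing is worth doing early.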

Sheer luck, because if I hadn't addressed my test between the 6004 and the 4948 strictly with 192.168 space, I wouldn't have gone down this path.  Good to know it was an isolated incident and not a bug between NX-OS and IOS, the code, etc.

Thanks for the assistance anyways!


18 Replies

Jon Marshall

Michael

I suspect the pings may be misleading because the Nexus switches have a default CoPP policy to protect the control plane.

See this thread for the same sort of results with ping you are seeing -

https://supportforums.cisco.com/discussion/11879276/cisco-nexus-6004-switch-drops-ip-packets-greater-300-bytes

so the BGP issue may not be related.
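
If you want to rule CoPP in or out on the 6004 side, something along these lines should show whether the relevant class counters increment while you run the pings (just a sketch; the class names depend on the default CoPP policy your code applies):

show copp status
show policy-map interface control-plane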

Do you have the BGP configuration to post?

Jon

Hey Jon. Thanks for the reply.

So I've read about the CoPP policy as well and suspected that may be the issue for the pings. I don't believe it's a BGP issue at all, unless there's something completely out of my league that I'm not seeing. The reason I say this is that I've stripped both sides down to the very basics and just put in router IDs along with the neighbor statements and remote AS. The config was certainly correct. I'll put what I have below just for clarity's sake.

One thing to note: the 7606 is a very old box. It's our aggregate router, so it already has a config in place. While we use peer groups on that box, and I set one up initially for this peering, I tore it down and just put in a plain neighbor statement. I'm curious if there's a setting on the 7606s that may be causing the issue. The BGP peering never establishes because the TCP handshake never completes, I would think; that is where I believe the root of my problem is, and that is why I went down the path of the dropped packets when I noticed them. Funny thing is, though, just to eliminate variables: I have no issues running the same kind of pings (packet sizes, counts, etc.) from the 6004 to the 6001 and back. No dropped packets at all. Would this be why the TCP handshake never establishes? I could post the debug output (debug ip tcp transactions) from the 7606 when I get to work if that would help.

The exact BGP config is as follows:

7606 (IOS):

router bgp 64594
 bgp router-id 10.252.68.0
 neighbor 10.251.177.2 remote-as 65002

6004 (NX-OS):

router bgp 65002
  router-id 10.251.177.128
  neighbor 10.251.177.1 remote-as 64594
    address-family ipv4 unicast

I've also tried peering up with the loopbacks, adding ebgp-multihop and update-source loopback0 (both using the router-id addresses) and adding static routes to both IPs, roughly as sketched below. Even though I'm able to ping across successfully, still no BGP and still the same TCP issue. I've had three senior network engineers look at it for over an hour each, all of them stumped, all of them thinking it's a bug of some sort (which it very well may be). I just can't get away from the TCP session not establishing, which may be due to a bug or some other underlying issue, possibly between IOS and NX-OS. I highly doubt I'm the first to try eBGP between the two platforms haha. Maybe the code on the boxes?
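
Roughly what the loopback attempt looked like, for reference (a sketch only; it assumes loopback0 on each box carries the router-id address shown above and that the statics point across the P2P link):

7606:

ip route 10.251.177.128 255.255.255.255 10.251.177.2
router bgp 64594
 neighbor 10.251.177.128 remote-as 65002
 neighbor 10.251.177.128 ebgp-multihop 2
 neighbor 10.251.177.128 update-source Loopback0

6004:

ip route 10.252.68.0/32 10.251.177.1
router bgp 65002
  neighbor 10.252.68.0 remote-as 64594
    ebgp-multihop 2
    update-source loopback0
    address-family ipv4 unicast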

For what it's worth, I know the code on the NX box - 7.0.2(2). I don't know the 7606's off the top of my head. We are running that version on all of our Nexus gear because HQ wants that code. I know it's older, but I just wanted to throw that out there.

Thanks

-Michael

That's odd...my spacing didn't come out.  Let me try to edit.

I just ran a ping from the 6004 to the 6001 with a size of 8000 and a count of 100.  Not one dropped packet.  That seems odd to me.  Just wanted to throw this out there.
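
That test was just along the lines of the following from the 6004 (the address is a placeholder for the 6001 side of the 40Gb P2P):

ping <6001-p2p-address> packet-size 8000 count 100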

Michael

Is there any chance of debugging BGP to see what each switch thinks?

I appreciate you may already have done this and/or you don't want to run it during production hours but it may show something.

Edit - sorry, just noticed you have already done this; what did they show?

Jon

No worries.  I don't mind running them again.  I already ran "show sockets all" on the Nexus and found something interesting, posted below.  Its MSS is set to 536 and for some reason the 7606 is sending different sizes.  I understand this is how MSS negotiation works, but could this be the reason why it times out on the 7606 side?

2015 Oct 30 14:07:53.784909 netstack: tcp_usr_connect: TCP: tcp_usr_connect to 10.251.177.1.45824
2015 Oct 30 14:07:53.785629 netstack: tcp_connect: Originating Connections with ports (Src 56377, Dst 179)
2015 Oct 30 14:07:53.786268 netstack: in_pcbrehash: PCB: Insert pcb in hash lists L: 10.251.177.2.56377, F: 10.251.177.1.179, C: 1
2015 Oct 30 14:07:53.786935 netstack: tcp_mssopt: tcp_mssopt:setting mss value 0 and 0
2015 Oct 30 14:07:53.787621 netstack: tcp_mssopt: sending out finally, mss 1460
2015 Oct 30 14:07:53.788280 netstack: tcp_output: tcp_output: preparing to send mss option 1460 bytes
2015 Oct 30 14:07:54.918058 netstack: in_pcblookup_hash: Flow Params: Ports(Src 179, Dst 36878)
2015 Oct 30 14:07:54.918802 netstack: tcp_mss: tcp_mss: pktiod 9 iodvalid 1 offer 1440 isipv6 0
2015 Oct 30 14:07:54.919435 netstack: tcp_mss: tcp_mss: client offered mtu 1440, our value 536

2015 Oct 30 14:07:56.601004 netstack: tcp_mss: tcp_mss: pktiod 51 iodvalid 1 offer 9138 isipv6 0
2015 Oct 30 14:07:56.601640 netstack: tcp_mss: tcp_mss: client offered mtu 9138, our value 536
2015 Oct 30 14:07:56.804479 netstack: tcp_mssopt: tcp_mssopt:setting mss value 0 and 0
2015 Oct 30 14:07:56.804566 netstack: tcp_mssopt: sending out finally, mss 1460
2015 Oct 30 14:07:56.804575 netstack: tcp_output: tcp_output: preparing to send mss option 1460 bytes
2015 Oct 30 14:07:56.915348 netstack: in_pcblookup_hash: Flow Params: Ports(Src 179, Dst 36878)
2015 Oct 30 14:07:56.916030 netstack: tcp_mss: tcp_mss: pktiod 9 iodvalid 1 offer 1440 isipv6 0
2015 Oct 30 14:07:56.916662 netstack: tcp_mss: tcp_mss: client offered mtu 1440, our value 536
2015 Oct 30 14:07:59.610177 netstack: in_pcblookup_hash: Flow Params: Ports(Src 179, Dst 50126)
2015 Oct 30 14:07:59.610897 netstack: tcp_mss: tcp_mss: pktiod 51 iodvalid 1 offer 9138 isipv6 0
2015 Oct 30 14:07:59.611531 netstack: tcp_mss: tcp_mss: client offered mtu 9138, our value 536
2015 Oct 30 14:07:59.824458 netstack: tcp_mssopt: tcp_mssopt:setting mss value 0 and 0
2015 Oct 30 14:07:59.824565 netstack: tcp_mssopt: sending out finally, mss 1460
2015 Oct 30 14:07:59.824575 netstack: tcp_output: tcp_output: preparing to send mss option 1460 bytes
2015 Oct 30 14:08:00.915395 netstack: in_pcblookup_hash: Flow Params: Ports(Src 179, Dst 36878)
2015 Oct 30 14:08:00.916099 netstack: tcp_mss: tcp_mss: pktiod 9 iodvalid 1 offer 1440 isipv6 0
2015 Oct 30 14:08:00.916735 netstack: tcp_mss: tcp_mss: client offered mtu 1440, our value 536

I'll get the debug ip bgp all info now

Thanks Jon.

-Michael

I suspect this may be part of the issue but you said you played around with the settings on the Nexus switch already.

Obviously you can't make changes to the 7600 as it already has BGP peerings in place, I expect.

Jon

Correct.  I cannot really make many changes on the 7600 as it has about 10 peerings that have been up for quite some time.  I can't really mess with any MSS settings on the Nexus as I don't believe there is an option for that.  I used the MSS option on the 7600 P2P interface and set it to 536 (roughly as below); still no luck.  The same sizes are still coming over (9138, 1460).
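
What I tried was along these lines (the interface name is a placeholder, and I'm assuming the interface-level adjust-mss knob is what's meant here):

interface TenGigabitEthernet1/1
 ip tcp adjust-mss 536

From what I've read, ip tcp adjust-mss only rewrites the MSS in TCP SYNs transiting the box, while the global ip tcp mss command is what applies to TCP sessions the router itself originates (like BGP), so that may be the next thing to test.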

Not sure what else to do here haha.  Waiting on Cisco TAC as well.

I keep missing your replies :-)

Not sure what else to suggest especially as you can't make changes to the 7600.

Would be very interested to hear what TAC have to say about it.

Sorry I can't be of more help.

Jon

Do you have any suggestions for the 7600 side?  If they won't affect the BGP peerings, I can work on it.  I'm administratively able to make changes; I just have to be careful since it's one of our aggregation routers, is all.

Not really, or at least I wouldn't want to suggest any changes that could affect your existing peerings, which I think they would, because it looks like the issue is with the BGP neighbor setup rather than a more general issue with the link.

So I'm really reluctant to suggest anything that could impact your production network.

Jon

No problem.  I'm actually installing the latest firmware on the Nexus box right now to see if that helps.  Either way, I'll update this thread so it can help someone else in the future.

Thanks anyways Jon.

Here is the output of debug ip bgp all on the Nexus box:

as01.ndceast# sh ip bgp summ
BGP summary information for VRF default, address family IPv4 Unicast
BGP router identifier 10.251.177.128, local AS number 65002
BGP table version is 2, IPv4 Unicast config peers 1, capable peers 0
0 network entries and 0 paths using 0 bytes of memory
BGP attribute entries [0/0], BGP AS path entries [0/0]
BGP community entries [0/0], BGP clusterlist entries [0/0]

Neighbor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd
10.251.177.1 4 64594 0 0 0 0 0 00:41:40 Idle
as01.ndceast# 2015 Oct 30 14:24:55.193253 bgp: 65002 [18165] EVT: Sent message 2 to bgp_cleanup_mq
2015 Oct 30 14:24:55.193304 bgp: 65002 [18165] EVT: Starting periodic BRIB processing
2015 Oct 30 14:24:55.193334 bgp: 65002 [18165] (default) EVT: [IPv4 Unicast] tbl_ctx cleanup, refcount 1
2015 Oct 30 14:25:03.893258 bgp: 65002 [18165] (default) EVT: 10.251.177.1 peer connection retry timer expired
2015 Oct 30 14:25:03.893330 bgp: 65002 [18165] (default) EVT: 10.251.177.1 Triggered active open for peer
2015 Oct 30 14:25:03.893377 bgp: 65002 [18165] (default) EVT: 10.251.177.1 went from Idle to Active (Active setup)
2015 Oct 30 14:25:03.894919 bgp: 65002 [18165] (default) EVT: 10.251.177.1 Schedule wait for connect
2015 Oct 30 14:25:03.894934 bgp: 65002 [18165] (default) EVT: 10.251.177.1 Wait (30 sec) for session setup response
2015 Oct 30 14:25:18.113806 bgp: 65002 [18165] EVT: Sent message 2 to bgp_cleanup_mq
2015 Oct 30 14:25:18.113864 bgp: 65002 [18165] EVT: Starting periodic BRIB processing
2015 Oct 30 14:25:18.113895 bgp: 65002 [18165] (default) EVT: [IPv4 Unicast] tbl_ctx cleanup, refcount 1
2015 Oct 30 14:25:33.913224 bgp: 65002 [18165] (default) EVT: 10.251.177.1 session setup (active) timed out, setup state Active busy 0
2015 Oct 30 14:25:33.913249 bgp: 65002 [18165] (default) EVT: 10.251.177.1 cleaning up active peer setup, thread id 0x0
2015 Oct 30 14:25:38.523229 bgp: 65002 [18165] EVT: Sent message 2 to bgp_cleanup_mq
2015 Oct 30 14:25:38.523283 bgp: 65002 [18165] EVT: Starting periodic BRIB processing
2015 Oct 30 14:25:38.523314 bgp: 65002 [18165] (default) EVT: [IPv4 Unicast] tbl_ctx cleanup, refcount 1
2015 Oct 30 14:26:01.523241 bgp: 65002 [18165] EVT: Sent message 2 to bgp_cleanup_mq
2015 Oct 30 14:26:01.523974 bgp: 65002 [18165] EVT: Starting periodic BRIB processing
2015 Oct 30 14:26:01.524630 bgp: 65002 [18165] (default) EVT: [IPv4 Unicast] tbl_ctx cleanup, refcount 1
2015 Oct 30 14:26:22.943235 bgp: 65002 [18165] EVT: Sent message 2 to bgp_cleanup_mq
2015 Oct 30 14:26:22.943290 bgp: 65002 [18165] EVT: Starting periodic BRIB processing
2015 Oct 30 14:26:22.943321 bgp: 65002 [18165] (default) EVT: [IPv4 Unicast] tbl_ctx cleanup, refcount 1
2015 Oct 30 14:26:37.083217 bgp: 65002 [18165] (default) EVT: 10.251.177.1 peer connection retry timer expired
2015 Oct 30 14:26:37.083296 bgp: 65002 [18165] (default) EVT: 10.251.177.1 Triggered active open for peer
2015 Oct 30 14:26:37.083346 bgp: 65002 [18165] (default) EVT: 10.251.177.1 went from Idle to Active (Active setup)
2015 Oct 30 14:26:37.084928 bgp: 65002 [18165] (default) EVT: 10.251.177.1 Schedule wait for connect
2015 Oct 30 14:26:37.084942 bgp: 65002 [18165] (default) EVT: 10.251.177.1 Wait (30 sec) for session setup response
2015 Oct 30 14:26:46.383239 bgp: 65002 [18165] EVT: Sent message 2 to bgp_cleanup_mq
2015 Oct 30 14:26:46.383289 bgp: 65002 [18165] EVT: Starting periodic BRIB processing
2015 Oct 30 14:26:46.383320 bgp: 65002 [18165] (default) EVT: [IPv4 Unicast] tbl_ctx cleanup, refcount 1
2015 Oct 30 14:27:07.103212 bgp: 65002 [18165] (default) EVT: 10.251.177.1 session setup (active) timed out, setup state Active busy 0
2015 Oct 30 14:27:07.103246 bgp: 65002 [18165] (default) EVT: 10.251.177.1 cleaning up active peer setup, thread id 0x0
2015 Oct 30 14:27:07.743230 bgp: 65002 [18165] EVT: Sent message 2 to bgp_cleanup_mq
2015 Oct 30 14:27:07.743281 bgp: 65002 [18165] EVT: Starting periodic BRIB processing
2015 Oct 30 14:27:07.743312 bgp: 65002 [18165] (default) EVT: [IPv4 Unicast] tbl_ctx cleanup, refcount 1
2015 Oct 30 14:27:32.503232 bgp: 65002 [18165] EVT: Sent message 2 to bgp_cleanup_mq
2015 Oct 30 14:27:32.503283 bgp: 65002 [18165] EVT: Starting periodic BRIB processing
2015 Oct 30 14:27:32.503314 bgp: 65002 [18165] (default) EVT: [IPv4 Unicast] tbl_ctx cleanup, refcount 1
un all
