
BGP Notification sent, invalid or corrupt AS path

Josiah Inubio
Level 1

Hi Everyone,

 

I've been trying to bring up a BGP peering with our counterpart, but this error keeps showing up. Every time the session is established, it drops again. Please see the logs below. I've already bounced the session and deleted and re-created my BGP config, but to no avail. Can anyone advise me on this?

 

RP/0/RP0/CPU0:Aug 25 09:20:31.043 : bgp[1045]:
%ROUTING-BGP-5-ADJCHANGE_DETAIL : neighbor 206.41.72.1 Down - BGP
Notification sent, invalid or corrupt AS path (VRF: default; AFI/SAFI: 1/1)

 

Connections established 145; dropped 145
  Local host: 206.41.72.39, Local port: 11458
  Foreign host: 206.41.72.1, Foreign port: 179
  Last reset 00:00:14, due to Peer closing down the session
  Peer reset reason: Remote closed the session (Connection timed out)
  Time since last notification sent to neighbor: 00:01:30
  Error Code: invalid or corrupt AS path
  Notification data sent:
    40020A02 02000051 CC000040 F1

 

Total malformed UPDATE 145
  Last malformed UPDATE 00:01:30
  Error subcode 11, attribute code 0, action reset session
  Malformed UPDATE: 133 bytes
    FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF
    00850200 00002E40 01010040 020A0202
    000051CC 000040F1 400304CE 294822C0
    07080000 51CC172D 4359C008 0851CC01
    9B51CC03 781317DD 6016B855 E41648F6
    A01648F6 CC1748F7 EA14B855 4014173E
    40141706 B014173E 501748F6 501417DE
    A016173E 3817173E 361417CD C017B81C
    2014173B 10


Josh Sprang
Level 1

This can happen when a really long AS_PATH is received in the update. Try setting the BGP maximum AS_PATH limit, or apply an inbound AS_PATH filter that blocks AS 0 and updates with an empty AS path (regex ^$).

I already tried removing our IN and OUT policies, but still to no avail.

Can you post the output of a debug ip bgp ipv4 updates

Please see above updates

Peter Paluch
Cisco Employee

Hi Josiah,

Your issue was interesting enough to me to decode the malformed BGP message by hand, and this is what I came up with (the malformed parts are labeled as errors):

    FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF - Marker
    0085 - Length (133)
    02 - Type (update)
 
    00 00 - Withdrawn Routes Length (0)

    002E - Total Path Attribute Length (46)
    
    40 01 - Transitive, Origin
    01 - Length (1)
    00 - IGP

    40 02 - Transitive, AS_PATH
    0A - Length (10)
    02 - Segment Type (AS_SEQUENCE)
    02 - Segment Length (2 ASNs)
    0000 51CC - Segment (0, 20940)
    00 - Segment Type (error)
    00 - Segment Length (error) 
    40F1 - Malformed value

    40 03 - Transitive, Next Hop
    04 - Length (4)
    CE294822 - 206.41.72.34

    C0 07 - Optional Transitive, Aggregator
    08 - Length (8, should have been 6!)
    0000 51CC - Probably ASN (20940)
    172D 4359 - Probably IP (23.45.67.89)

    C0 08 - Optional Transitive, Community
    08 - Length (8)
    51CC 019B - 20940:411 
    51CC 0378 - 20940:888

===============================

    13 - Prefix Length (/19)
    17DD 60 - 23.221.96.0/19

    16 - Prefix Length (/22)
    B855 E4 - 184.85.228.0/22

    16 - Prefix Length (/22)
    48F6 A0 - 72.246.160.0/22

    16 - Prefix Length (/22)
    48F6 CC - 72.246.204.0/22

    17 - Prefix Length (/23)
    48F7 EA - 72.247.234.0/23

    14 - Prefix Length (/20)
    B855 40 - 184.85.64.0/20

    14 - Prefix Length (/20)
    173E 40 - 23.62.64.0/20

    14 - Prefix Length (/20)
    1706 B0 - 23.6.176.0/20

    14 - Prefix Length (/20)
    173E 50 - 23.62.80.0/20

    17 - Prefix Length (/23)
    48F6 50 - 72.246.80.0/23

    14 - Prefix Length (/20)
    17DE A0 - 23.222.160.0/20

    16 - Prefix Length (/22)
    173E 38 - 23.62.56.0/22

    17 - Prefix Length (/23)
    173E 36 - 23.62.54.0/23

    14 - Prefix Length (/20)
    17CD C0 - 23.205.192.0/20

    17 - Prefix Length (/23)
    B81C 20 - 184.28.32.0/23

    14 - Prefix Length (/20)
    173B 10 - 23.59.16.0/20
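
For anyone who wants to repeat the prefix part of this decode without doing it by hand, here is a minimal Python sketch. It assumes plain IPv4 unicast NLRI, where each entry is a one-byte prefix length followed by just enough bytes to hold that many bits. The function name `decode_nlri` is made up for this illustration and is not part of any router tooling.

```python
import ipaddress

def decode_nlri(data: bytes) -> list[str]:
    """Decode IPv4 NLRI: repeated <prefix length in bits, prefix bytes>."""
    prefixes = []
    i = 0
    while i < len(data):
        plen = data[i]                      # prefix length in bits
        nbytes = (plen + 7) // 8            # bytes needed to carry plen bits
        raw = data[i + 1 : i + 1 + nbytes]
        if plen > 32 or len(raw) < nbytes:
            raise ValueError(f"malformed NLRI at offset {i}")
        # Pad the truncated prefix back to a full 4-byte address.
        addr = ipaddress.IPv4Address(raw.ljust(4, b"\x00"))
        prefixes.append(f"{addr}/{plen}")
        i += 1 + nbytes
    return prefixes

# First three NLRI entries taken from the hex dump in this thread.
nlri = bytes.fromhex("1317DD60" "16B855E4" "1648F6A0")
print(decode_nlri(nlri))
```

Feeding it the remaining bytes of the dump should reproduce the rest of the list above.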

 

The AS_PATH attribute appears to be truly malformed. Let me restate it once again:

    40 02 - Transitive, AS_PATH
    0A - Length (10)
    02 - Segment Type (AS_SEQUENCE)
    02 - Segment Length (2 ASNs)
    0000 51CC - Segment (0, 20940)
    00 - Segment Type (error)
    00 - Segment Length (error) 
    40F1 - Malformed value

 

An AS_PATH attribute consists of so-called path segments which can be of two types, AS_SET and AS_SEQUENCE. Each segment is encoded as <Segment Type, Segment Length, Segment>. Each Segment contains a list of ASNs encoded as two-byte-long numbers, and then the next segment (if any) follows. Notice above that this AS_PATH claims to start with an AS_SEQUENCE segment type with 2 ASNs inside, but the first ASN makes no sense - it is 0. The second ASN is 20940, and here, the first segment ends. Following should be another segment but its encoding makes no sense - its segment type is invalid, its length is invalid, and the remaining bytes cannot be interpreted.

It actually seems that your BGP peer is expressing the AS numbers in 4B-long format instead of 2B. If we assume this interpretation, the attribute suddenly starts making sense:

    40 02 - Transitive, AS_PATH
    0A - Length (10)
    02 - Segment Type (AS_SEQUENCE)
    02 - Segment Length (2 ASNs)
    000051CC 000040F1 - Segment (20940, 16625)

 

A well-behaved BGP speaker should, however, never do this. In an AS_PATH, the ASNs must be 2B-long values. For 4B-long values, another attribute, the AS4_PATH, must be used, but this attribute was not present in that message at all. Using 4B-long ASNs in an AS_PATH is definitely a gross violation of the BGP protocol specification, as the receiving peer assumes that the ASNs are in 2B format and decodes them differently, as shown previously.
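
To make the difference between the two readings concrete, here is a small Python sketch that decodes the same ten AS_PATH bytes with an assumed ASN width of 2 and then 4 bytes. The function `decode_as_path` and its `asn_size` parameter are invented for this illustration. Only the AS_SET and AS_SEQUENCE segment types discussed above are recognized; confederation segment types are deliberately left out.

```python
import struct

# Segment types discussed in this thread (confederation types omitted).
SEGMENT_TYPES = {1: "AS_SET", 2: "AS_SEQUENCE"}

def decode_as_path(value: bytes, asn_size: int):
    """Decode an AS_PATH attribute value, assuming ASNs are asn_size bytes wide."""
    fmt = {2: "!H", 4: "!I"}[asn_size]      # network byte order, 16 or 32 bit
    segments = []
    i = 0
    while i < len(value):
        if i + 2 > len(value):
            raise ValueError(f"truncated segment header at offset {i}")
        seg_type, count = value[i], value[i + 1]
        if seg_type not in SEGMENT_TYPES:
            raise ValueError(f"invalid segment type {seg_type} at offset {i}")
        i += 2
        end = i + count * asn_size
        if end > len(value):
            raise ValueError("segment runs past the end of the attribute")
        asns = [struct.unpack_from(fmt, value, j)[0] for j in range(i, end, asn_size)]
        segments.append((SEGMENT_TYPES[seg_type], asns))
        i = end
    return segments

# The ten AS_PATH value bytes from the malformed UPDATE in this thread.
as_path = bytes.fromhex("0202000051CC000040F1")

print(decode_as_path(as_path, 4))           # [('AS_SEQUENCE', [20940, 16625])]
try:
    decode_as_path(as_path, 2)
except ValueError as err:
    print("2-byte decode fails:", err)
```

With 4-byte ASNs the attribute parses cleanly and uses exactly its ten bytes. With 2-byte ASNs the first segment yields (0, 20940) and the decoder then trips over a segment type of 0, matching the hand decode.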

My suspicion is also corroborated by the AGGREGATOR attribute:

    C0 07 - Optional Transitive, Aggregator
    08 - Length (8, should have been 6!)
    0000 51CC - Probably ASN (20940)
    172D 4359 - Probably IP (23.45.67.89)

 

The BGP specification in RFC 4271 states very clearly that the Aggregator is a 6-byte attribute, consisting of the ASN that formed the aggregate route and the IP address of the router that formed the aggregate route. However, as you can see above, this attribute is 8 bytes long, not 6, and the difference is again in the ASN encoding, where AS 20940 seems to be encoded as a 4B value.
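
The same check can be sketched in Python. The snippet below assumes the only two plausible layouts are a 2-byte ASN plus IPv4 address (6 bytes) or a 4-byte ASN plus IPv4 address (8 bytes), and infers the ASN width from the attribute length. The name `decode_aggregator` is hypothetical.

```python
import ipaddress

def decode_aggregator(value: bytes):
    """Decode an AGGREGATOR value, inferring the ASN width from its length."""
    if len(value) == 6:
        asn_size = 2                        # 2-byte ASN + 4-byte IPv4 address
    elif len(value) == 8:
        asn_size = 4                        # 4-byte ASN + 4-byte IPv4 address
    else:
        raise ValueError(f"unexpected AGGREGATOR length {len(value)}")
    asn = int.from_bytes(value[:asn_size], "big")
    addr = ipaddress.IPv4Address(value[asn_size:])
    return asn, str(addr), asn_size

# The eight AGGREGATOR value bytes from the UPDATE in this thread.
print(decode_aggregator(bytes.fromhex("000051CC172D4359")))
```

For the bytes above this gives AS 20940, address 23.45.67.89 and an ASN width of 4, the same 4-byte encoding seen in the AS_PATH.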

There is definitely nothing you can do on your part about this. It seems that the BGP implementation of your peer is broken, and it has to be corrected in the code. Please have the administrator of your BGP peer see this post and contact the support for his router. This will require patching the operating system or the BGP daemon of your peer, which is something only a vendor can do.

Just wondering - what vendor and operating system is used on the misbehaving BGP peer?

Best regards,
Peter

Hi Peter, 

 

I've compared both working and non working BGP peer. Please see attached debug results.

 

Not working - refer to session.txt

Working - refer to session-sea.txt

 

Josiah,

I've had a look at the debugs you have posted. Those debugs describe the OPEN and KEEPALIVE messages being exchanged, and the NOTIFICATION being sent between the routers when something goes gravely wrong, but they do not include the UPDATE messages. I believe you need to use the debug bgp update command to activate debugging of UPDATE messages.

However, I do not believe we will find anything new. The analysis of the malformed message clearly shows that the misbehaving BGP peer is sending autonomous system numbers in 4B format in attributes where only 2B numbers are expected. As I indicated before, this is not a problem on your device. It is the problem of the BGP peer, and there is literally nothing you can do. It's a buggy BGP implementation on the misbehaving BGP peer, and it is up to its administrator to find out if there are any updates to the operating system he is using.

Best regards,
Peter

Peter,

 

Let me clarify this with my peer. I'll get back to you as soon as I have an update. Thank you for your assistance.

Hi Josiah,

Certainly. Please keep me updated - this issue is very interesting, and I would like to know your peer's response. Please have him read this thread as it contains important diagnostic information.

Looking forward to hearing from you.

Best regards,
Peter

Peter Paluch
Cisco Employee

Josiah,

I have to correct myself somewhat. According to RFC 4893 and its updated version RFC 6793, if both neighbors agree on using 4-byte ASNs, then all ASNs will be expressed in 4-byte format, including those in AS_PATH, and this would be correct behavior.

So now the questions boil down to:

  1. Did your two routers truly negotiate and agree on the use of 4-byte ASNs?
  2. If so, why is your router having a problem with them now that it should be accepting them?
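
On the first question: support for 4-byte ASNs is announced in the OPEN message as capability code 65 with a 4-byte value, carried inside a Capabilities optional parameter (parameter type 2). Below is a rough Python sketch of how one could look for it in a captured OPEN. The function name `find_four_octet_as` is made up, and the example bytes are hypothetical, built by hand for AS 63022 rather than taken from any capture in this thread.

```python
def find_four_octet_as(opt_params: bytes):
    """Return the ASN from capability code 65 in OPEN optional parameters, or None."""
    i = 0
    while i + 2 <= len(opt_params):
        ptype, plen = opt_params[i], opt_params[i + 1]
        body = opt_params[i + 2 : i + 2 + plen]
        if ptype == 2:                      # optional parameter type 2: Capabilities
            j = 0
            while j + 2 <= len(body):
                code, clen = body[j], body[j + 1]
                cval = body[j + 2 : j + 2 + clen]
                if code == 65 and clen == 4 and len(cval) == 4:
                    return int.from_bytes(cval, "big")
                j += 2 + clen
        i += 2 + plen
    return None

# HYPOTHETICAL optional parameters, not from the thread:
# param type 2, length 6, capability 65, length 4, ASN 63022 (0x0000F62E).
with_cap = bytes.fromhex("0206" "4104" "0000F62E")
# HYPOTHETICAL: only the route refresh capability (code 2, length 0).
without_cap = bytes.fromhex("0202" "0200")

print(find_four_octet_as(with_cap))     # 63022
print(find_four_octet_as(without_cap))  # None
```

Both sides have to advertise the capability for 4-byte encoding in AS_PATH to be expected, so the OPEN messages in each direction would need this check.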

Would it be possible for you to post the sanitized configuration of the BGP process from your router? Also, can you please post the output of show bgp neighbor 206.41.72.1?

Thank you - and apologies for being misleading in the previous posts!

Best regards,
Peter

Hi Peter,

 

Please see bgp config below.

 

router#sh run router bgp <AS> neigh 206.41.72.1
Tue Sep  1 01:08:58.013 LAX
router bgp <AS>
 neighbor 206.41.72.1
  remote-as 63022
  description :: PEERING
  address-family ipv4 unicast
   route-policy PEERING-IN in
   route-policy PEERING-OUT out
   remove-private-AS
   soft-reconfiguration inbound always

 

show bgp neighbor 206.41.72.1

BGP neighbor is 206.41.72.1
 Remote AS 63022, local AS, external link
 Description: ::PEERING
 Remote router ID 0.0.0.0
  BGP state = Idle
  Last read 00:00:00, Last read before reset 00:04:32
  Hold time is 180, keepalive interval is 60 seconds
  Configured hold time: 180, keepalive: 60, min acceptable hold time: 3
  Last write 00:00:18, attempted 53, written 53
  Second last write 00:01:00, attempted 53, written 53
  Last write before reset 00:04:32, attempted 72, written 72
  Second last write before reset 00:05:12, attempted 53, written 53
  Last write pulse rcvd  Sep  1 01:15:44.807 last full not set pulse count 46339
  Last write pulse rcvd before reset 00:04:32
  Socket not armed for io, not armed for read, not armed for write
  Last write thread event before reset 00:04:32, second last 00:04:32
  Last KA expiry before reset 00:00:00, second last 00:00:00
  Last KA error before reset 00:00:00, KA not sent 00:00:00
  Last KA start before reset 00:00:00, second last 00:00:00
  Precedence: internet
  Enforcing first AS is enabled
  Received 13663 messages, 0 notifications, 0 in queue
  Sent 25942 messages, 4559 notifications, 0 in queue
  Minimum time between advertisement runs is 30 secs

 For Address Family: IPv4 Unicast
  BGP neighbor version 0
  Update group: 0.20 Filter-group: 0.0  No Refresh request being processed
  Inbound soft reconfiguration allowed (override route-refresh)
  Private AS number removed from updates to this neighbor
  Route refresh request: received 0, sent 0
  Policy for incoming advertisements is PEERING-IN
  Policy for outgoing advertisements is PEERING-OUT
  0 accepted prefixes, 0 are bestpaths
  Cumulative no. of prefixes denied: 0. 
  Prefix advertised 0, suppressed 0, withdrawn 0
  Maximum prefixes allowed 524288
  Threshold for warning message 75%, restart interval 0 min
  An EoR was not received during read-only mode
  Last ack version 0, Last synced ack version 0
  Outstanding version objects: current 0, max 0
  Additional-paths operation: None

  Connections established 4550; dropped 4550
  Local host: 206.41.72.39, Local port: 34367
  Foreign host: 206.41.72.1, Foreign port: 179
  Last reset 00:00:18, due to Peer closing down the session
  Peer reset reason: Remote closed the session (Connection timed out)
  Time since last notification sent to neighbor: 00:04:32
  Error Code: invalid or corrupt AS path
  Notification data sent:
    40020A02 02000051 CC000040 F1


 Total malformed UPDATE 4550
  Last malformed UPDATE 00:04:32
  Error subcode 11, attribute code 0, action reset session
  Malformed UPDATE: 133 bytes
    FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF
    00850200 00002E40 01010040 020A0202
    000051CC 000040F1 400304CE 294822C0
    07080000 51CC172D 4359C008 0851CC01
    9B51CC03 781317DD 6016B855 E41648F6
    A01648F6 CC1748F7 EA14B855 4014173E
    40141706 B014173E 501748F6 501417DE
    A016173E 3817173E 361417CD C017B81C
    2014173B 10

Josiah,

Thanks!

Can you, for the time being, configure the following on your router? It will prevent your router from advertising the 4-byte ASN capability to the problematic peer. This should at least allow the session to come up and remain stable even though the 4-byte ASN capability will be disabled. Let's consider this to be just a workaround to find out if removing the 4-byte ASN capability will solve the problem. If so, we will discuss the options further.

router bgp <AS>
 neighbor 206.41.72.1
  capability suppress 4-byte-as

Best regards,
Peter

Hi Peter,

 

The session is still not established. We're coordinating with our peer right now so they can check on their end.

Hi Josiah,

Okay, I understand. Is the NOTIFICATION message the same (corrupt/malformed AS_PATH in an Update)?

Can you please post the output of show bgp neighbor 206.41.73.1? I understand that this is a working BGP peer and I would like to compare the outputs.

What XR version are you running on your router please?

Best regards,
Peter
