07-18-2019 01:55 AM
Hi,
I have just deployed a very small multicast network. At one site we have a VLAN with IPTV equipment (the multicast source), and across a few Layer 3 routers we have another site with IPTV receivers.
I have opted for a very simple configuration: PIM Sparse Mode on all Layer 3 links, with a statically assigned RP located at the IPTV equipment site. On the receiver site and the routers in between, the RP is locked down to the 239.192.0.0/15 range; the source-site router is the exception and is allowed to be RP for everything.
Multicast was making its way from source to receiver, until someone mentioned that the channel list isn't working. I looked into it, and it looks like the IPTV system uses SAP (Session Announcement Protocol) to announce the channels. SAP uses the group 239.255.255.255, which is the default and can't be changed.
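A quick way to see why the SAP group is not covered by the existing RP range is to test membership directly. This is only a sketch using Python's standard ipaddress module with the range and group quoted above; the helper name rp_serves is illustrative, and the real check is whatever ACL is attached to the RP configuration on each router.

```python
import ipaddress

# Group range the RP is locked down to (taken from the post above).
RP_GROUP_RANGE = ipaddress.ip_network("239.192.0.0/15")
# Default SAP group used by the IPTV system (taken from the post above).
SAP_GROUP = "239.255.255.255"


def rp_serves(group, allowed=RP_GROUP_RANGE):
    """Return True if the group falls inside the range the RP ACL permits."""
    return ipaddress.ip_address(group) in allowed


# A /15 starting at 239.192.0.0 ends at 239.193.255.255.
print(RP_GROUP_RANGE[0], "-", RP_GROUP_RANGE[-1])
print("IPTV stream group covered:", rp_serves("239.192.10.1"))
print("SAP group covered:", rp_serves(SAP_GROUP))
```

The SAP group sits well above the end of the /15, so any router using that ACL has no RP for it until the ACL is extended.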
So I thought, ok, cool, I will just update my ACL to allow 239.255.255.255. When I do that, I see all the equipment on both sides of the multicast network start to show up as (S,G) entries, and the RP now comes up with a message saying -
Received Register from (IPTV receiver VLAN IP) for (IPTV Receiver IP, 239.255.255.255), not willing to be RP
So both the source and the receivers of the video traffic also use SAP to communicate. They are all configured to find the channels at sap://239.255.255.255
I cannot find out whether this is a supported configuration, and I can't find a reason why the RP would refuse to be RP for this group.
Can multiple sites/hosts in a PIM domain send to an RP on the same group?
This works fine at the IPTV equipment site inside a single VLAN, but now that we are trying to make it work over a few PIM routers, it's a no go.
Am I using the wrong type of multicast deployment? Is a many-to-many protocol needed instead? Is the unidirectional nature of PIM-SM causing the issue?
Thanks for your time.
Brad
07-19-2019 11:10 PM
Hi,
An update and new questions. After resolving the RP messages I have been able to do captures all the way to the receiver.
What I have found is the following -
1. We have a layer 3 switch with 3 links to our WAN, all enabled with PIM Sparse Mode.
2. We use BGP to prefer one link only
3. On that link I see the multicast stream DEST IP - 239.255.255.255
4. I went to the end host and did a capture, no multicast seen.
I have done an IGMP debug, as I can only think that the switch receives the IGMP query from the source for the stream, but the switch is not forwarding it.
The debug reveals the following -
IP -- 10.141.241.100 (Receiver) - VLAN41
001811: Jul 20 16:07:24.461 AEST: IGMP(0): Received v2 Report on Vlan41 from 10.141.241.100 for 239.255.255.255
001812: Jul 20 16:07:24.461 AEST: IGMP(0): Received Group record for group 239.255.255.255, mode 2 from 10.141.241.100 for 0 sources
001813: Jul 20 16:07:24.462 AEST: IGMP(0): Updating EXCLUDE group timer for 239.255.255.255
001814: Jul 20 16:07:24.462 AEST: IGMP(0): MRT Add/Update Vlan41 for (*,239.255.255.255) by 0
These are the sources sending the SAP announcements. They are sent every 30 seconds.
Seen in local router -
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode
(*, 239.255.255.255), 23:52:35/stopped, RP 10.142.0.1, flags: SJCF
Incoming interface: TenGigabitEthernet1/0/12, RPF nbr 10.141.255.253
Outgoing interface list:
Vlan41, Forward/Sparse, 21:13:41/00:02:17
(10.142.5.56, 239.255.255.255), 23:52:04/00:02:08, flags: JT
Incoming interface: TenGigabitEthernet1/0/12, RPF nbr 10.141.255.253
Outgoing interface list:
Vlan41, Forward/Sparse, 21:13:41/00:02:17
(10.142.5.27, 239.255.255.255), 23:52:05/00:01:27, flags: JT
Incoming interface: TenGigabitEthernet1/0/12, RPF nbr 10.141.255.253
Outgoing interface list:
Vlan41, Forward/Sparse, 21:13:41/00:02:17
(10.142.5.66, 239.255.255.255), 23:52:05/00:02:05, flags: JT
Incoming interface: TenGigabitEthernet1/0/12, RPF nbr 10.141.255.253
Outgoing interface list:
Vlan41, Forward/Sparse, 21:13:41/00:02:17
(10.142.5.16, 239.255.255.255), 23:52:06/00:01:28, flags: JT
Incoming interface: TenGigabitEthernet1/0/12, RPF nbr 10.141.255.253
Outgoing interface list:
Vlan41, Forward/Sparse, 21:13:41/00:02:17
(10.142.5.50, 239.255.255.255), 23:52:06/00:01:24, flags: JT
Incoming interface: TenGigabitEthernet1/0/12, RPF nbr 10.141.255.253
I am struggling to work out why this stream is not being forwarded.
07-20-2019 03:55 AM
Sorry, but I need clarification. Are the debug and the local router output from where the sources are, rather than from the receiver end?
If that is the case, similar information from the receiver side would be helpful.
07-20-2019 02:56 PM
The debug and outputs are from the receiver side.
I was able to capture the actual traffic coming into this router on the WAN link, but that's as far as it goes.
07-20-2019 07:25 AM
Hello Bradley,
you should check the RPF status for all the sources sending packets to 239.255.255.255 on the layer3 switch.
Use
show ip route 10.142.5.56
show ip rpf 10.142.5.56
The L3 switch can forward traffic only if the source passes the RPF check: if multicast traffic is received on the interface that is the best path back to the source IP address, the traffic is accepted and forwarded; otherwise it is dropped.
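To make the RPF rule concrete, here is a minimal sketch in Python. It is not how the switch implements it; the route table is hypothetical (only TenGigabitEthernet1/0/12 and Vlan41 come from this thread, the prefixes and the Te1/0/10 default route are made up), and the lookup is a plain longest-prefix match.

```python
import ipaddress

# Hypothetical unicast routes as (prefix, outgoing interface).
# Only Te1/0/12 and Vlan41 are named in the thread; the rest is assumed.
ROUTES = [
    ("0.0.0.0/0", "TenGigabitEthernet1/0/10"),
    ("10.142.0.0/16", "TenGigabitEthernet1/0/12"),
    ("10.141.241.0/24", "Vlan41"),
]


def rpf_interface(source, routes=ROUTES):
    """Longest-prefix match: the interface the router would use to reach the source."""
    src = ipaddress.ip_address(source)
    best = None
    for prefix, interface in routes:
        net = ipaddress.ip_network(prefix)
        if src in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, interface)
    return best[1] if best else None


def rpf_check(source, arrival_interface, routes=ROUTES):
    """Accept the packet only if it arrived on the interface that leads back to the source."""
    return rpf_interface(source, routes) == arrival_interface


print(rpf_check("10.142.5.56", "TenGigabitEthernet1/0/12"))  # arrives on the best path
print(rpf_check("10.142.5.56", "TenGigabitEthernet1/0/10"))  # arrives on another WAN link
```

With several WAN links and BGP preferring one of them, the second case is the one to rule out: traffic that arrives on a link other than the one the unicast route points to is dropped.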
Hope to help
Giuseppe
07-21-2019 05:11 PM
Hi,
RPF is good, double checked.
I did further captures and found that the SAP messages coming from our head end are being sent across the WAN and they come into our WAN link.
I then did a capture on the end host, and saw no SAP messages.
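For anyone who wants to repeat the end-host check without a full packet capture, a small listener is one option. This is a sketch only: the port 9875 is an assumption on my part (it is the well-known SAP port from RFC 2974 and is not stated in this thread), the function names are made up, and the header decoding covers just the fixed fields.

```python
import ipaddress
import socket
import struct

SAP_GROUP = "239.255.255.255"  # SAP group from the thread
SAP_PORT = 9875                # assumption: well-known SAP port (RFC 2974)


def parse_sap_header(data):
    """Decode the fixed part of a SAP packet (layout per RFC 2974)."""
    if len(data) < 8:
        raise ValueError("too short for a SAP header")
    flags, auth_len, msg_id_hash = struct.unpack("!BBH", data[:4])
    addr_len = 16 if flags & 0x10 else 4  # A bit: 0 = IPv4 origin, 1 = IPv6 origin
    if len(data) < 4 + addr_len:
        raise ValueError("truncated originating source")
    return {
        "version": flags >> 5,
        "deletion": bool(flags & 0x04),  # T bit: 0 = announcement, 1 = deletion
        "msg_id_hash": msg_id_hash,
        "origin": str(ipaddress.ip_address(data[4:4 + addr_len])),
        # Auth data length is counted in 32-bit words.
        "payload_offset": 4 + addr_len + 4 * auth_len,
    }


def open_sap_socket(group=SAP_GROUP, port=SAP_PORT):
    """Join the SAP group on the default interface and return the UDP socket."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    mreq = struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton("0.0.0.0"))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    return sock


# Usage on the receiver (blocks until a packet arrives):
#   sock = open_sap_socket()
#   data, (src, _) = sock.recvfrom(2048)
#   print(src, parse_sap_header(data))
```

Joining the group this way also makes the host send its own IGMP report, so the report should show up in the switch debug at the same time.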
I did a debug on the switch and I see this -
001927: Jul 20 16:27:24.457 AEST: IGMP(0): Received v2 Report on Vlan41 from 10.141.241.103 for 239.255.255.255
001928: Jul 20 16:27:24.457 AEST: IGMP(0): Received Group record for group 239.255.255.255, mode 2 from 10.141.241.103 for 0 sources
001929: Jul 20 16:27:24.457 AEST: IGMP(0): Cancel report for 239.255.255.255 on Vlan41
001930: Jul 20 16:27:24.458 AEST: IGMP(0): Updating EXCLUDE group timer for 239.255.255.255
001931: Jul 20 16:27:24.458 AEST: IGMP(0): MRT Add/Update Vlan41 for (*,239.255.255.255) by 0
I then decided to join VLAN 41 to the group with ip igmp join-group 239.255.255.255, went back to the headend, and found that I can ping the group and VLAN 41 responds.
So the problem seems to be between the switch (layer 3 and also the local PIM router) and the local receivers like above - 10.141.241.103.
07-22-2019 12:11 AM - edited 07-22-2019 12:12 AM
Hello Bradley,
check whether the Layer 3 switch is running IGMP version 3 on Vlan 41, because of the following messages in the debug:
>>001927: Jul 20 16:27:24.457 AEST: IGMP(0): Received v2 Report on Vlan41 from 10.141.241.103 for 239.255.255.255
001928: Jul 20 16:27:24.457 AEST: IGMP(0): Received Group record for group 239.255.255.255, mode 2 from 10.141.241.103 for 0 sources
001929: Jul 20 16:27:24.457 AEST: IGMP(0): Cancel report for 239.255.255.255 on Vlan41
>>>>>>>>001930: Jul 20 16:27:24.458 AEST: IGMP(0): Updating EXCLUDE group timer for 239.255.255.255
001931: Jul 20 16:27:24.458 AEST: IGMP(0): MRT Add/Update Vlan41 for (*,239.255.255.255) by 0
IGMPv3 allows a receiver to specify desired or undesired sources with INCLUDE or EXCLUDE directives.
The switch states that it received an IGMPv2 report from the receiver, and then it updates the EXCLUDE timer for group 239.255.255.255.
Hope to help
Giuseppe
07-22-2019 03:31 PM
07-22-2019 11:40 PM - edited 07-22-2019 11:44 PM
Hello Bradley,
can you provide the following info:
the show version output of the multilayer switch, to see the model and the IOS XE image running on it.
what is the show command you used for the last output you provided?
The EXCLUDE directive with an empty source list should mean that no source is dropped or blacklisted.
However, you see that all the sources for SAP Group 239.255.255.255 present on the router are not reaching the receiver in vlan 41 on the multilayer switch.
>> I think it might be time to engage Cisco TAC, try and get some deep dive
debugging and maybe look for any possible bugs.
If you provide the show version for the multilayer switch we can look at the Bug Search tool.
For example, by searching for "IGMP Exclude" in the Bug Search Tool I found 30 bugs. One interesting one, although it applies to Nexus, is the following
However, it would be wise to open a ticket if you need this solved and would like official support and answers.
Hope to help
Giuseppe
07-23-2019 12:28 AM
07-23-2019 01:04 AM - edited 07-23-2019 01:10 AM
Hello Brad,
>> The RP sends a Register Stop for the group.....back to my WAN link IP which
is the PIM DR for the WAN link. Is this Register Stop to say I have no
sources, or is it to say don't send any more registers, I will send you the
stream via multicast?
The RP should send the Register-Stop message in two possible scenarios:
a) there are no downstream receivers (routers or end user devices) interested in group G traffic -> the shared tree oilist is empty.
b) the opposite: someone is interested, the RP itself has already joined the source-based tree rooted at the PIM DR near the source, and the Register messages are not needed anymore (each one carries a multicast packet encapsulated in a unicast PIM Register message addressed to the RP, so it is not very efficient)
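The two cases above can be written down as a small decision function. This is a simplified model for illustration, not router code: the names are made up, and I have added a third outcome for a router that is not acting as RP for the group, which is my reading of the PIM-SM specification rather than something stated in this thread.

```python
from enum import Enum


class RegisterAction(Enum):
    STOP_NOT_RP = "Register-Stop: not acting as RP for this group"
    STOP_ON_SPT = "Register-Stop: already receiving natively on the source tree"
    STOP_NO_RECEIVERS = "Register-Stop: shared tree oilist is empty"
    DECAPSULATE_AND_FORWARD = "decapsulate and forward down the shared tree"


def handle_register(is_rp_for_group, oilist, receiving_on_spt):
    """Simplified view of what an RP does with a PIM Register for (S,G)."""
    if not is_rp_for_group:
        return RegisterAction.STOP_NOT_RP
    if receiving_on_spt:
        # Case b): the RP joined the source tree, so Registers are redundant.
        return RegisterAction.STOP_ON_SPT
    if not oilist:
        # Case a): nobody downstream has asked for the group.
        return RegisterAction.STOP_NO_RECEIVERS
    return RegisterAction.DECAPSULATE_AND_FORWARD


print(handle_register(True, [], False).value)
print(handle_register(True, ["Te1/0/1"], True).value)
print(handle_register(True, ["Te1/0/1"], False).value)
```

So a Register-Stop on its own does not tell you which case applies; the mroute entry on the RP for the (S,G) is what shows whether it has receivers and whether it joined the source tree.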
By using a local receiver in another Vlan you should be able to see that in the local site address 239.255.255.255 is processed correctly.
Edit:
>> Good to know that exclude means no source blacklisted or blocked.
I mean EXCLUDE with an empty source list; otherwise the listed sources are blocked / blacklisted.
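As a small illustration of the INCLUDE / EXCLUDE semantics, the sketch below models the per-group source filter. The function name is made up; the only facts used are the IGMPv3 filter-mode rules and the "mode 2 ... for 0 sources" record seen in the debug, which corresponds to EXCLUDE with an empty list.

```python
def source_wanted(filter_mode, source_list, source):
    """Return True if traffic from `source` should be delivered for the group."""
    if filter_mode == "INCLUDE":
        # Only the listed sources are wanted.
        return source in source_list
    if filter_mode == "EXCLUDE":
        # Everything except the listed sources is wanted.
        return source not in source_list
    raise ValueError("unknown filter mode: %r" % filter_mode)


# EXCLUDE with an empty list: every source is wanted (same as an IGMPv2 join).
print(source_wanted("EXCLUDE", set(), "10.142.5.85"))
# EXCLUDE with a listed source: that source is blocked.
print(source_wanted("EXCLUDE", {"10.142.5.85"}, "10.142.5.85"))
# INCLUDE with an empty list: nothing is wanted (effectively a leave).
print(source_wanted("INCLUDE", set(), "10.142.5.85"))
```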
Hope to help
Giuseppe
07-23-2019 02:27 AM - edited 07-23-2019 03:07 AM
Ok, understand all points.
I can’t see any reason why this doesn’t work.
How can I capture the SAP traffic coming into the multilayer switch, being processed, and the switch deciding whether to send it to all ports that requested it? Will debugging multicast packets be enough, and do I need to disable all CEF processing?
I need to capture the moment the switch hardware or software gets the
stream and decides to flood or not into the local VLAN for the receivers to
see. That is exactly where it is broken it seems.
I have also captured outbound of the Layer 3 interface and the port channel
to the receiver and never see the traffic from the WAN.
Edit -
I decided to do a quick debug, I am remote so didn't want to overwhelm the switch.
I disabled route-cache on both the WAN interface (Te1/0/12) and VLAN41, and ran debug ip mfib ps 239.255.255.255
This is the source traffic coming in from the remote site -
02689: Jul 23 19:49:04.068 AEST: MFIBv4(0x0): Receive (10.142.5.52,239.255.255.255) from TenGigabitEthernet1/0/12 (PS): hlen 5 prot 17 len 165 ttl 4 frag 0x4000
002690: Jul 23 19:49:04.321 AEST: MFIBv4(0x0): Receive (10.142.5.21,239.255.255.255) from TenGigabitEthernet1/0/12 (PS): hlen 5 prot 17 len 165 ttl 4 frag 0x4000
002691: Jul 23 19:49:04.897 AEST: MFIBv4(0x0): Receive (10.142.5.38,239.255.255.255) from TenGigabitEthernet1/0/12 (PS): hlen 5 prot 17 len 165 ttl 4 frag 0x4000
002692: Jul 23 19:49:05.455 AEST: MFIBv4(0x0): Receive (10.142.5.35,239.255.255.255) from TenGigabitEthernet1/0/12 (PS): hlen 5 prot 17 len 165 ttl 4 frag 0x4000
002693: Jul 23 19:49:05.534 AEST: MFIBv4(0x0): Receive (10.142.5.14,239.255.255.255) from TenGigabitEthernet1/0/12 (PS): hlen 5 prot 17 len 165 ttl 4 frag 0x4000
002694: Jul 23 19:49:05.731 AEST: MFIBv4(0x0): Receive (10.142.5.31,239.255.255.255) from TenGigabitEthernet1/0/12 (PS): hlen 5 prot 17 len 165 ttl 4 frag 0x4000
002695: Jul 23 19:49:07.091 AEST: MFIBv4(0x0): Receive (10.142.5.50,239.255.255.255) from TenGigabitEthernet1/0/12 (PS): hlen 5 prot 17 len 165 ttl 4 frag 0x4000
002696: Jul 23 19:49:09.560 AEST: MFIBv4(0x0): Receive (10.142.5.85,239.255.255.255) from TenGigabitEthernet1/0/12 (PS): hlen 5 prot 17 len 445 ttl 252 frag 0x4000
002697: Jul 23 19:49:09.728 AEST: MFIBv4(0x0): Receive (10.142.5.89,239.255.255.255) from TenGigabitEthernet1/0/12 (PS): hlen 5 prot 17 len 436 ttl 252 frag 0x4000
Then on VLAN41 I see this; I think the receivers are also sending to this group -
002698: Jul 23 19:49:10.607 AEST: MFIBv4(0x0): Receive (10.141.241.101,239.255.255.255) from Vlan41 (PS): hlen 5 prot 17 len 165 ttl 6 frag 0x4000
002681: Jul 23 19:49:01.310 AEST: MFIBv4(0x0): Receive (10.141.241.100,239.255.255.255) from Vlan41 (PS): hlen 5 prot 17 len 165 ttl 6 frag 0x4000
I want to debug packet forwarding, but I need to put a change in to do deep-dive debugs (this site isn't live yet), and I don't want to lose my connection if the switch gets overwhelmed.
Brad
EDIT #2 -
I noticed that some packets have a TTL of 4 or 6 and others 252. It looks like the SAP traffic on 239.255.255.255 arrives with TTL 252, while the TTL 6 traffic is SNMP, which this app uses to manage devices on the same group. So the VLAN41 traffic above is SNMP, not SAP.
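To sort a longer debug capture the same way, the lines can be grouped by TTL with a short script. This is a sketch that assumes the line format shown in the debug above; the sample lines are copied from it, and the regular expression and function name are my own.

```python
import re
from collections import defaultdict

# Matches the "Receive (S,G) from <interface> (PS): ..." part of each debug line.
LINE_RE = re.compile(
    r"Receive \((?P<src>[\d.]+),(?P<grp>[\d.]+)\) from (?P<intf>\S+) \(PS\): "
    r"hlen \d+ prot (?P<prot>\d+) len (?P<len>\d+) ttl (?P<ttl>\d+)"
)

SAMPLE = [
    "002690: Jul 23 19:49:04.321 AEST: MFIBv4(0x0): Receive (10.142.5.21,239.255.255.255) from TenGigabitEthernet1/0/12 (PS): hlen 5 prot 17 len 165 ttl 4 frag 0x4000",
    "002696: Jul 23 19:49:09.560 AEST: MFIBv4(0x0): Receive (10.142.5.85,239.255.255.255) from TenGigabitEthernet1/0/12 (PS): hlen 5 prot 17 len 445 ttl 252 frag 0x4000",
    "002698: Jul 23 19:49:10.607 AEST: MFIBv4(0x0): Receive (10.141.241.101,239.255.255.255) from Vlan41 (PS): hlen 5 prot 17 len 165 ttl 6 frag 0x4000",
]


def group_by_ttl(lines):
    """Map each TTL value to the (source, interface, length) tuples seen with it."""
    by_ttl = defaultdict(list)
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            by_ttl[int(match["ttl"])].append(
                (match["src"], match["intf"], int(match["len"]))
            )
    return dict(by_ttl)


for ttl, packets in sorted(group_by_ttl(SAMPLE).items()):
    print("ttl", ttl, packets)
```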
Still need to see where the decision is made to flood the stream or drop it.
Thanks again,
Brad
08-05-2019 02:36 AM
Still waiting on an update from Cisco TAC on this.
Also still need to capture one more debug on the switch, with no CEF and fast switching to see what mpacket debug says.
Brad.
08-06-2019 10:37 PM
Hi again,
Thanks for your help with this post. I wanted to let you know that my colleague did some troubleshooting with Cisco TAC, and after disabling and re-enabling IGMP snooping, the SAP messages (group 239.255.255.255) are now being forwarded!
So the two final resolutions to this case -
Symptom 1 - RP not allowing a PIM router to register
Root Cause - PIM register ACL on the RP
Symptom 2 - SAP messages of 239.255.255.255 being sent from source over layer 3 network and not being forwarded by last hop router/layer 3 switch
Root Cause - Possible bug or software issue. Disabling and re-enabling IGMP snooping resolved issue.
08-06-2019 11:27 PM
Hello Brad,
thanks for your feedback on this very strange multicast issue.
So to solve it they disabled IGMP snooping and then enabled it again.
Was IGMP snooping disabled globally or only on the involved Vlan 41?
I suppose the second option, disabling it on a specific VLAN. This is just to complete the happy-ending story.
Best Regards
Giuseppe
08-07-2019 04:28 PM
Hi,
I just got further feedback and it looks like it worked for a bit then stopped again. They believe it could be this bug -
https://quickview.cloudapps.cisco.com/quickview/bug/CSCvn14836
We have shut down one of the etherchannel ports and are monitoring.
Very weird and strange issue.
Brad