cpbruton
just joined
Topic Author
Posts: 16
Joined: Fri Jun 07, 2013 4:23 am
Location: Pasadena, California, USA

PIM multicast forwarding stops after brief interface outage

Wed Dec 06, 2017 10:30 pm

Update 2017-12-14: I determined that this is not time-based as originally thought, but occurs after a brief outage of the GRE tunnel between the two routers. So the issue becomes "PIM multicast forwarding does not come back correctly after a temporary interface outage".

I have encountered an issue where PIM stops working correctly after a period of time. Everything will be working fine for a few days but eventually the MFC table empties out and multicast traffic no longer gets forwarded. I can still see the relevant groups listed in Joins and IGMP groups, they just disappear from the MFC table. I am working on generating supout.rifs (before and after on both routers) but I am wondering if anyone else has experienced this in the meantime. This happened on at least v6.40.5 and v6.41rc52. I am not sure if something triggers this or if it is spontaneous. I found some old forum threads describing a similar issue with IGMP proxy, but it sounds like it was resolved.

The attached image shows our setup. Unicast routing (via OSPF) works correctly end-to-end, and continues to work even after the multicast failure. Amazon Web Services VPC does not support multicast, which is why we have an individual GRE tunnel running from each instance to our MikroTik CHR.

I know this post is vague on specifics; I'll work on getting my configurations ready to post for more detail. But does this problem sound familiar to anyone? It's really hard to characterize the problem because it happens after 1-5 days and there's no obvious trigger. (Edit to add 2017-12-14: the trigger is the GRE interface going down briefly. PIM does not resume forwarding after it comes back up.)
 
cpbruton

Re: PIM multicast forwarding stops after brief interface outage

Thu Dec 14, 2017 9:28 pm

After investigation I found a trigger for this behavior. This happens when the GRE tunnel between CHR and CCR1009 goes down and then comes back online - brief or long outage, it doesn't matter.
Everything comes back online (OSPF, end-to-end unicast routing, etc.) except for PIM forwarding.

The only way to bring it back seems to be to completely reset PIM on one or both routers: disable the RP setting, disable the PIM interfaces, and wait for the dynamic PIM interfaces to disappear before re-enabling everything.

Obviously we could work around this issue by scripting some sort of PIM reset after outage, or by increasing the GRE keepalive timeout to smooth out brief outages, but that doesn't get to the core of the problem.
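
A minimal sketch of the keepalive workaround mentioned above, assuming RouterOS v6 and a GRE interface named gre-to-ccr. The interface name and the retry count are hypothetical, chosen only for illustration. This masks brief outages; it does not address why PIM fails to recover.

```
# Hypothetical interface name; substitute the actual GRE tunnel.
# keepalive=<interval>,<retries>: the running flag is only dropped after
# <retries> consecutive keepalives go unanswered.
# 10s,10 is believed to be the v6 default; 10s,30 is an assumed example.
/interface gre set [find name=gre-to-ccr] keepalive=10s,30
```

With these example values the tunnel would ride out roughly five minutes of loss before being marked down, which also delays detection of a genuine outage.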

Any ideas?
 
mickdoev
just joined
Posts: 14
Joined: Fri Mar 17, 2023 2:44 am

Re: PIM multicast forwarding stops after brief interface outage

Mon Sep 25, 2023 8:16 am

Hello,
While searching for an issue that I’ve been encountering with PIM in RouterOS 6, I came across the preceding post from 2017, which describes issues very similar to what I’m still seeing now, 6 years later.

I am wondering if the OP ever found a solution.

I initially came across the issue working with GRE tunnels, OSPF and PIM, but I have been able to configure a super simple design that consistently reproduces the issue using static routes and without needing GRE tunnels.

It would be great if someone could reproduce this scenario and confirm my findings (or point out any flaws I have made :? )

CONFIGS

Router1 config
/interface bridge
add name=localhost
/ip address
add address=192.168.1.1/24 interface=ether2 network=192.168.1.0
add address=10.1.1.1/30 interface=ether1 network=10.1.1.0
add address=10.0.1.1 interface=localhost network=10.0.1.1
/ip route
add distance=1 dst-address=192.168.2.0/24 gateway=10.1.1.2
/routing pim bsr-candidates
add interface=localhost
/routing pim rp-candidates
add interface=localhost
/routing pim interface
add interface=localhost protocols=pim
add interface=ether1 protocols=pim
add interface=ether2

Router2 config
/ip address
add address=192.168.2.1/24 interface=ether2 network=192.168.2.0
add address=10.1.1.2/30 interface=ether1 network=10.1.1.0
/ip route
add distance=1 dst-address=10.0.1.1/32 gateway=10.1.1.1
add distance=1 dst-address=192.168.1.0/24 gateway=10.1.1.1
/routing pim interface
add interface=ether2
add interface=ether1 protocols=pim
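
Since PIM on Router2 relies on the unicast route back to the RP address, a quick sanity check of the static routes from Router2's terminal might look like the sketch below. It uses only the addresses from the configs above and is not part of the original test procedure.

```
# Run on Router2: confirm the loopback (RP/BSR candidate) on Router1 answers
/ping 10.0.1.1 count=3
# Confirm the static route that is used to reach it
/ip route print where dst-address=10.0.1.1/32
```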


TOPOLOGY

Router1 is connected to Router2 by a CAT6 Ethernet cable running directly between each router's ether1 port.
pic1.png


PREAMBLE TO THE ISSUE

Once both routers are booted, we interrogate Router2 and see the following information.

We can see here that Interface ether1 (our WAN connection) has one PIM neighbor (as expected)
pic2.png
Router2 - \routing\PIM\Interface


Looking at the neighbors tab, we see the PIM neighbor is the ether1 address of Router1 (as expected)
pic3.png
Router2 - \routing\PIM\Neighbors


The elected bootstrap router is the loopback interface of Router1 (as expected)
pic4.png
Router2 - \routing\PIM\BSR


The elected rendezvous point is the loopback interface of Router1 (as expected)
pic5.png
Router2 - \Routing\PIM\RP


At this point all is working correctly. I can start a multicast source on either side of the two routers and the multicast data is received by the respective multicast listener.
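
For anyone reproducing this from the terminal rather than Winbox, the four views above should correspond roughly to the print commands below. The bsr and rp menu paths are inferred from the Winbox tab names in the captions, so treat them as assumptions rather than confirmed syntax.

```
# Run on Router2
# PIM interfaces and their neighbor counts
/routing pim interface print
# PIM neighbors and their remaining timeouts
/routing pim neighbor print
# Elected bootstrap router (menu path assumed from the Winbox tab name)
/routing pim bsr print
# Rendezvous point(s) (menu path assumed from the Winbox tab name)
/routing pim rp print
```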


GENERATING THE ISSUE

At this point, I introduce an interruption by unplugging the CAT6 cable from the ether1 port of Router2, then plugging it back in after a few seconds.

Re-interrogating Router2, we make the following observations.
pic6.png
Router2 - \routing\PIM\Neighbors

We see that when the ether1 connection was re-established, a new, additional entry was created in the neighbors list. Eventually the original neighbor entry counts down to zero and is removed from the list.

At this point all is still working correctly. Multicast traffic is still flowing between senders and receivers.

I then introduce a second interruption in the same way: I unplug the CAT6 cable from the ether1 port of Router2 and plug it back in after a few seconds.

Interrogating Router2, we make the following observations.
We can see here that interface ether1 has lost its PIM neighbor.
pic7.png
Router2 - \routing\PIM\Interface

We see that on this occasion, when the ether1 connection was re-established, a new neighbor entry was NOT created in the neighbors list.
pic8.png
Router2 - \routing\PIM\Neighbors

The “bootstrap router timeout” counts down to zero.
pic9.png
Router2 - \routing\PIM\BSR


The “rendezvous point timeout” counts down to zero and is then removed.
pic10.png
Router2 - \Routing\PIM\RP


At this point, the Multicast traffic stops.


This is reproducible 100% of the time.
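
To capture what PIM is doing on Router2 around the second unplug, it may help to enable PIM logging before reproducing. This is a sketch that assumes the standard pim logging topic and the default memory logging action; it was not part of the test above.

```
# Run on Router2 before pulling the cable
/system logging add topics=pim action=memory
# After reproducing, review only the PIM-related log entries
/log print where topics~"pim"
```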

To re-establish the multicast streams, I disable the ether1 PIM interface, leave it disabled long enough for the PIM neighbor entry to time out, and then re-enable it. The BSR and RP are then re-populated and multicast traffic resumes.
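
The recovery step above can be sketched as a short script for Router2. The 120s delay is an assumption: it only needs to exceed the remaining timeout shown for the stale entry in the Neighbors tab (105s is the standard PIM default hold time, 3.5 times the 30s hello period).

```
# Sketch only: bounce the PIM interface on ether1 so the stale neighbor
# entry can expire, then let PIM form the adjacency again.
/routing pim interface disable [find interface=ether1]
# Assumed wait; adjust to exceed the neighbor timeout actually shown
:delay 120s
/routing pim interface enable [find interface=ether1]
```

This is a workaround rather than a fix. Multicast forwarding stays down for the whole delay.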

Software = ARM OS 6 versions up to 6.49.10
Hardware = RB3011 UiAS-RM

Look forward to any replies.