[MX] RPD core dump during route change with BGP-L3VPN no-vrf-propagate-ttl and multipath

  [KB34563]


Summary:

In an L3VPN scenario with no-vrf-propagate-ttl and multipath enabled, the routing protocol process (RPD) might generate a core dump if the primary route is deleted from the multipath set.

This article explains why the core dump occurs, lists the releases in which the issue is resolved, and provides two workarounds.

 

Symptoms:

An RPD core dump might occur in a BGP-L3VPN network that has equal-cost multipath (ECMP) paths configured when the best path shifts from the primary path to a secondary path.

What is the trigger for the issue?

  • An ECMP path exists but the best path has changed.

  • Border Gateway Protocol (BGP) multipath is enabled for BGP L3VPN and no-vrf-propagate-ttl is enabled in the VRF.

  • A rib group is enabled under the BGP source peer with the given address family.
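
As a rough check for whether a router matches these trigger conditions, the configuration can be filtered for the relevant statements. This is a sketch only: the instance name vrf1 is taken from the sample configuration in this article, and the rib group is assumed to be configured under protocols bgp in the main instance; adjust both to match your configuration.

R1> show configuration routing-instances vrf1 | display set | match "multipath|no-vrf-propagate-ttl"
R1> show configuration protocols bgp | display set | match rib-group

If the first command returns both statements and the second returns a rib-group statement for the relevant address family, the configuration side of the trigger is present.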

 

Cause:

Sample VRF Configuration

R1# show routing-instances vrf1                       
instance-type vrf;
no-vrf-propagate-ttl;
interface ae0.0;
interface ae1.0;
<...>
routing-options {
    multipath;
}
 

Consider the following topology where R1, R2, and R3 belong to the same VRF.

 
               +----------------R2
               |
R1-------------+
               |
               +----------------R3
 

Here:

  1. R2 and R3 advertise the same VPNv4 prefix (in this example, 100.0.3.235/32 with the same route distinguisher).

  2. When these routes are received at R1, there are two ECMP paths for the same L3VPN prefix, as shown (let's call them r1 and r2).

  3. Due to multipath, these routes are copied into the VRF table as r1 and r2.

R1> show route 100.0.3.235/32

MAINTENANCE-1.inet.0: 20427 destinations, 20428 routes (20427 active, 0 holddown, 1 hidden)
+ = Active Route, - = Last Active, * = Both

100.0.3.235/32     *[BGP/170] 1w2d 09:21:57, localpref 100
                      AS path: 100 I, validation-state: unverified
                    > to 10.10.10.1 via ae0.0
100.0.3.235/32     [BGP/170] 1w2d 09:23:27, localpref 100
                      AS path: 100 I, validation-state: unverified
                    > to 10.10.10.1 via ae1.0
 

RPD core dump

  • At first, route r1 is the best path (primary route) and so is marked as the multipath leader route.

  • At some point, there is a route change on r1, and consequently, r2 is now better than r1.

  • Because of this multipath state change, BGP requests an update to route r1. However, because the no-vrf-propagate-ttl knob is configured, when BGP requests the change to the primary route r1, the RPD infrastructure calls back BGP with the secondary copy of r1 (the VRF route).

  • In the affected releases, BGP does not check whether this route is marked with an independent resolution flag, which might result in an RPD core dump.

 

Solution:

This issue has been resolved in Junos OS releases 17.2X75-D105, 17.2X75-D105-J1, 17.2X75-D110, 18.2X75-D40, 18.2X75-D50, 18.4R1-S3, 18.4R3, 19.1R2, 19.2R2, and 19.3R1 (PR1436465).
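
To compare the running release against the list above, the installed Junos OS version can be displayed from operational mode. The match filter is only a convenience and assumes the version line contains the string "Junos"; running show version without the filter works as well.

R1> show version | match Junos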

Until you can upgrade to a fixed release, use either of the following workarounds:

  • Remove multipath.

OR

  • Remove no-vrf-propagate-ttl.
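
Based on the sample VRF configuration shown earlier, either workaround could be applied from configuration mode roughly as follows. The instance name vrf1 comes from that sample; substitute the name of the affected routing instance. Only one of the two changes is needed.

To remove multipath:

[edit]
R1# delete routing-instances vrf1 routing-options multipath
R1# commit

To remove no-vrf-propagate-ttl:

[edit]
R1# delete routing-instances vrf1 no-vrf-propagate-ttl
R1# commit

Each change alters forwarding behavior for the VRF (the first stops load balancing across the equal-cost BGP paths, the second re-enables TTL propagation), so choose the one that has the least impact on your network.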

 
