LDP (Label Distribution Protocol) sessions flap after an upgrade to Junos 14.1 or later. The flapping occurs only when peering with H3C routers, not with Cisco or Junos devices.
This article explains how the advertisement of ELC (Entropy Label Capability) can cause the flapping and how to disable it to resolve the issue.
After upgrading the MX to Junos 14.1, LDP flaps with the H3C router. The H3C router reports that the LDP session is going up and down.
The logs on the MX show that the peer device (H3C) resets the LDP TCP session:
Aug 4 10:34:49.448715 Read from 172.16.1.18 failed: Connection reset by peer
Aug 4 10:34:49.448726 Error reading from 172.16.1.18: Connection reset by peer
Aug 4 10:34:49.448857 Session 172.16.1.18 GR state Operational -> Nonexistent
...
Aug 4 10:34:49.448867 Session 172.16.1.18 state Operational -> Closing
Aug 4 10:34:49.448875 LDP session 172.16.1.18 is down, reason: connection reset
Aug 4 10:34:49.448911 RPD_LDP_SESSIONDOWN: LDP session 172.16.1.18 is down, reason: connection reset
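To gauge how often a session flaps, the session-down events in a saved log can be counted per peer. The following is a minimal Python sketch, assuming the log lines follow the RPD_LDP_SESSIONDOWN format shown above; the function name count_session_downs and the sample list are illustrative and not part of Junos.

```python
import re
from collections import Counter

# Matches the RPD_LDP_SESSIONDOWN line format shown in the log excerpt above.
SESSION_DOWN = re.compile(
    r"RPD_LDP_SESSIONDOWN: LDP session (?P<peer>\S+) is down, reason: (?P<reason>.+)$"
)


def count_session_downs(lines):
    """Return a Counter keyed by (peer, reason), one count per session-down event."""
    counts = Counter()
    for line in lines:
        match = SESSION_DOWN.search(line.strip())
        if match:
            counts[(match.group("peer"), match.group("reason"))] += 1
    return counts


# Two lines copied from the excerpt; only the second carries the event tag.
sample = [
    "Aug 4 10:34:49.448875 LDP session 172.16.1.18 is down, reason: connection reset",
    "Aug 4 10:34:49.448911 RPD_LDP_SESSIONDOWN: LDP session 172.16.1.18 is down, reason: connection reset",
]

for (peer, reason), n in count_session_downs(sample).items():
    print(f"{peer}: {n} x {reason}")
```

A peer that shows repeated "connection reset" events shortly after each session comes up is consistent with the behavior described in this article.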
Junos 14.1 and later advertise ELC in LDP and RSVP by default. As a result, the first label mapping message includes an ELC TLV (type-length-value) whose length is "0". A peer device that does not support this feature may treat the zero length as invalid and reset the TCP session due to "bad message length".
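The zero-length TLV can be illustrated with a short Python sketch. It assumes the generic LDP TLV layout from RFC 5036 (U bit, F bit, 14-bit type, 16-bit length, then the value) and uses the type, U, and F values that appear in the traceoptions output in this article. The helper names encode_tlv and decode_tlv are hypothetical and do not represent any vendor's implementation.

```python
import struct

ELC_TLV_TYPE = 0x0206  # type value shown in the trace: "TLV ELC (0x206)"


def encode_tlv(tlv_type, value=b"", u_bit=0, f_bit=0):
    """Pack an LDP TLV: U bit, F bit, 14-bit type, 16-bit length, then the value."""
    first_word = (u_bit << 15) | (f_bit << 14) | (tlv_type & 0x3FFF)
    return struct.pack("!HH", first_word, len(value)) + value


def decode_tlv(data):
    """Unpack one TLV; a length of 0 is accepted as a valid, empty value."""
    first_word, length = struct.unpack("!HH", data[:4])
    if len(data) < 4 + length:
        raise ValueError("TLV value is truncated")
    return {
        "u_bit": first_word >> 15,
        "f_bit": (first_word >> 14) & 1,
        "type": first_word & 0x3FFF,
        "length": length,
        "value": data[4:4 + length],
    }


# ELC TLV as seen in the trace: U: 1, F: 1, len 0.
elc = encode_tlv(ELC_TLV_TYPE, u_bit=1, f_bit=1)
print(elc.hex())  # c2060000
print(decode_tlv(elc))
```

The whole TLV is four octets of header with no value. Under RFC 5036, a receiver that does not recognize a TLV with the U bit set is expected to ignore it silently, so a parser that rejects the zero length instead is the likely point of failure on the peer.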
Example:
Confirm that ELC is enabled on the MX:
labroot@lab1# run show ldp overview | match egress
Egress FEC capabilities enabled: entropy-label-capability <-- entropy label capability enabled
Notice the difference in the LDP label mapping messages between Junos 14.1 and Junos 13.3.
Junos 14.1 traceoptions output:
ELC is advertised by default, so the label mapping message carries an ELC (0x206) TLV.
Aug 4 10:34:49.233946 LDP sent TCP PDU 172.16.1.115 -> 172.16.1.18 (none)
Aug 4 10:34:49.233954 ver 1, pkt len 2982, PDU len 2978, ID 172.16.1.115:0
Aug 4 10:34:49.233962 Msg LabelMap (0x400), len 28, ID 640022
Aug 4 10:34:49.233969 TLV FEC (0x100), U: 0, F: 0, len 8
Aug 4 10:34:49.233978 Prefix, family 1, 172.16.1.115/32
Aug 4 10:34:49.233985 TLV Label (0x200), U: 0, F: 0, len 4
Aug 4 10:34:49.233991 Label 3
Aug 4 10:34:49.234002 TLV ELC (0x206), U: 1, F: 1, len 0 <-- TLV Length is "0"
Junos 13.3 traceoptions output:
Junos did not support the advertisement of ELC, so there is no ELC (0x206) TLV in the label mapping message.
Aug 9 01:54:46.076097 LDP sent TCP PDU 50.0.0.13 -> 172.0.0.1 (none)
Aug 9 01:54:46.076117 ver 1, pkt len 38, PDU len 34, ID 50.0.0.13:0
Aug 9 01:54:46.076123 Msg LabelMap (0x400), len 24, ID 4
Aug 9 01:54:46.076128 TLV FEC (0x100), U: 0, F: 0, len 8
Aug 9 01:54:46.077172 TLV Label (0x200), U: 0, F: 0, len 4
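The LabelMap message lengths in the two traces ("len 28" with the ELC TLV, "len 24" without it) can be cross-checked with simple arithmetic. The sketch below assumes the RFC 5036 rule that the message length covers the 4-octet message ID plus every TLV, each with a 4-octet header; the function name label_mapping_length is illustrative only.

```python
TLV_HEADER_LEN = 4  # U/F/type (2 octets) + length (2 octets)
MESSAGE_ID_LEN = 4  # the message ID is counted inside the message length


def label_mapping_length(tlv_value_lengths):
    """Message length = message ID + (header + value) for each TLV."""
    return MESSAGE_ID_LEN + sum(TLV_HEADER_LEN + n for n in tlv_value_lengths)


with_elc = label_mapping_length([8, 4, 0])  # FEC (8), Label (4), ELC (0)
without_elc = label_mapping_length([8, 4])  # FEC (8), Label (4)
print(with_elc, without_elc)  # 28 24
```

The 4-octet difference is the ELC TLV header alone, which matches a TLV with an empty value.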
Disable ELC with the no-load-balance-label-capability knob:
user@router# set forwarding-options no-load-balance-label-capability
For more information, refer to RFC 6790:
The Use of Entropy Labels in MPLS Forwarding
2019-07-31: Updated for clarity.
2017-03-22: Author corrected version number. Editor deleted reference to JTAC lab server.