[SRX] Fabric link status down after configuring CoS rewrite-rules on fab0 and fab1

  [KB33328]


Summary:

This article explains why the fabric (fab) link monitored status shows as Down after a class of service (CoS) configuration that includes rewrite-rules is applied to the fab0 and fab1 fabric links on SRX devices, and recommends how to avoid the problem.

 

Symptoms:

When the following Class of Service (CoS) configuration is applied on fab0 and fab1:

set class-of-service interfaces fab0 scheduler-map qos-scheduler
set class-of-service interfaces fab0 unit * classifiers dscp-ipv6 inet6-classifier
set class-of-service interfaces fab0 unit * classifiers inet-precedence inet-classifier
set class-of-service interfaces fab0 unit * rewrite-rules dscp inet-rewrite
set class-of-service interfaces fab0 unit * rewrite-rules dscp-ipv6 inet6-rewrite
set class-of-service interfaces fab1 scheduler-map qos-scheduler
set class-of-service interfaces fab1 unit * classifiers dscp-ipv6 inet6-classifier
set class-of-service interfaces fab1 unit * classifiers inet-precedence inet-classifier
set class-of-service interfaces fab1 unit * rewrite-rules dscp inet-rewrite
set class-of-service interfaces fab1 unit * rewrite-rules dscp-ipv6 inet6-rewrite

set class-of-service rewrite-rules dscp inet-rewrite forwarding-class be1 loss-priority low code-point 000000
set class-of-service rewrite-rules dscp inet-rewrite forwarding-class be1 loss-priority high code-point 000000
set class-of-service rewrite-rules dscp inet-rewrite forwarding-class af1 loss-priority low code-point 001000
set class-of-service rewrite-rules dscp inet-rewrite forwarding-class af1 loss-priority high code-point 001000
set class-of-service rewrite-rules dscp inet-rewrite forwarding-class af2 loss-priority low code-point 010000
set class-of-service rewrite-rules dscp inet-rewrite forwarding-class af2 loss-priority high code-point 010000
set class-of-service rewrite-rules dscp inet-rewrite forwarding-class af3 loss-priority low code-point 011000
set class-of-service rewrite-rules dscp inet-rewrite forwarding-class af3 loss-priority high code-point 011000
set class-of-service rewrite-rules dscp inet-rewrite forwarding-class af4 loss-priority low code-point 100000
set class-of-service rewrite-rules dscp inet-rewrite forwarding-class af4 loss-priority high code-point 101000
set class-of-service rewrite-rules dscp inet-rewrite forwarding-class nc1 loss-priority low code-point 110000
set class-of-service rewrite-rules dscp inet-rewrite forwarding-class nc1 loss-priority high code-point 111000
set class-of-service rewrite-rules dscp-ipv6 inet6-rewrite forwarding-class be1 loss-priority low code-point 000000
set class-of-service rewrite-rules dscp-ipv6 inet6-rewrite forwarding-class be1 loss-priority high code-point 000000
set class-of-service rewrite-rules dscp-ipv6 inet6-rewrite forwarding-class af1 loss-priority low code-point 001000
set class-of-service rewrite-rules dscp-ipv6 inet6-rewrite forwarding-class af1 loss-priority high code-point 001000
set class-of-service rewrite-rules dscp-ipv6 inet6-rewrite forwarding-class af2 loss-priority low code-point 010000
set class-of-service rewrite-rules dscp-ipv6 inet6-rewrite forwarding-class af2 loss-priority high code-point 010000
set class-of-service rewrite-rules dscp-ipv6 inet6-rewrite forwarding-class af3 loss-priority low code-point 011000
set class-of-service rewrite-rules dscp-ipv6 inet6-rewrite forwarding-class af3 loss-priority high code-point 011000
set class-of-service rewrite-rules dscp-ipv6 inet6-rewrite forwarding-class af4 loss-priority low code-point 100000
set class-of-service rewrite-rules dscp-ipv6 inet6-rewrite forwarding-class af4 loss-priority high code-point 100000
set class-of-service rewrite-rules dscp-ipv6 inet6-rewrite forwarding-class nc1 loss-priority low code-point 110000
set class-of-service rewrite-rules dscp-ipv6 inet6-rewrite forwarding-class nc1 loss-priority high code-point 111000
set class-of-service scheduler-maps qos-scheduler forwarding-class nc1 scheduler nc1
set class-of-service scheduler-maps qos-scheduler forwarding-class af4 scheduler af4
set class-of-service scheduler-maps qos-scheduler forwarding-class af3 scheduler af3
set class-of-service scheduler-maps qos-scheduler forwarding-class af2 scheduler af2
set class-of-service scheduler-maps qos-scheduler forwarding-class af1 scheduler af1
set class-of-service scheduler-maps qos-scheduler forwarding-class be1 scheduler be1
set class-of-service schedulers af4 buffer-size percent 22
set class-of-service schedulers af4 priority strict-high
set class-of-service schedulers nc1 buffer-size percent 2
set class-of-service schedulers nc1 priority high
set class-of-service schedulers af3 transmit-rate percent 94
set class-of-service schedulers af3 buffer-size percent 22
set class-of-service schedulers af3 priority low
set class-of-service schedulers af2 transmit-rate percent 3
set class-of-service schedulers af2 buffer-size percent 18
set class-of-service schedulers af2 priority low
set class-of-service schedulers af1 transmit-rate percent 2
set class-of-service schedulers af1 buffer-size percent 18
set class-of-service schedulers af1 priority low
set class-of-service schedulers be1 transmit-rate percent 1
set class-of-service schedulers be1 buffer-size percent 18
set class-of-service schedulers be1 priority low

The fabric link monitored status shows as Down, even on Junos OS 15.1X49-D130, the latest Junos OS version at the time of writing:

root# run show chassis cluster interfaces
Control link status: Up

Control interfaces:
    Index   Interface   Monitored-Status   Internal-SA   Security
    0       em0         Up                 Disabled      Disabled
    1       em1         Down               Disabled      Disabled  

Fabric link status: Down

Fabric interfaces:
    Name    Child-interface    Status                    Security
                               (Physical/Monitored)
    fab0    xe-2/0/8           Up   / Down               Disabled 
    fab0    xe-2/0/9           Up   / Down               Disabled 
    fab1    xe-5/0/8           Up   / Down               Disabled 
    fab1    xe-5/0/9           Up   / Down               Disabled   

In two successive outputs of show usp ha fabric statistics, the Pkt-sent counter keeps increasing (188504 to 188509), but the Pkt-rcvd counter does not increase (it stays at 188415):

root% vty node0.fpc0.pic0

BSD platform (XLP processor, 32491MB memory, 16384KB flash)

[flowd64]FPC0.PIC0(vty)# show usp ha fabric statistics  
Fabric monitor attributes:
   Interval : 500
   Packet loss threshold : 6
   Packet recovery threshold : 6

Number of status notifications sent to JSRPD: 4

Sending Fabric Hello Packets: Yes
    ADMIN DOWN: No
    RE DOWN: No

idx intf-name  Pkt-sent  Pkt-rcvd  Pkt-lost  Pkt-Recovery Pkt_Invalid Wait-Count
--- ---------- --------- --------- --------- ------------ ----------- ----------
  0 xe-2/0/8      188504    188415         0            0           0          0
  1 xe-2/0/9      188504    188415         0            0           0          0

[flowd64]FPC0.PIC0(vty)# show usp ha fabric statistics  
Fabric monitor attributes:
   Interval : 500
   Packet loss threshold : 6
   Packet recovery threshold : 6

Number of status notifications sent to JSRPD: 4

Sending Fabric Hello Packets: Yes
    ADMIN DOWN: No
    RE DOWN: No

idx intf-name  Pkt-sent  Pkt-rcvd  Pkt-lost  Pkt-Recovery Pkt_Invalid Wait-Count
--- ---------- --------- --------- --------- ------------ ----------- ----------
  0 xe-2/0/8      188509    188415         0            0           0          0
  1 xe-2/0/9      188509    188415         0            0           0          0

 

Cause:

This occurs because SRX devices do not support CoS rewrite-rules on the fabric link.

When a rewrite-rule is applied on the fabric link, ha_validate_pkt finds that the mbuf length, m_len(60), does not match the sum of msg_size(188), ha_pkt_header(4), and ha_msg_header(12), which is 204. This is shown in the trace output of debug usp ha fabric probes and debug usp ha fabric monitor:

[104] T11 ha_validate_pkt: mbuf m_len(60) doesn't match with sum total of msg_size(188), ha_pkt_header(4), ha_msg_header(12)
[105] T07 ha_validate_pkt: mbuf m_len(60) doesn't match with sum total of msg_size(188), ha_pkt_header(4), ha_msg_header(12)

The CoS configuration also changes some of the HA probe packet fields, so HA packet validation fails as well:

[106] T11 ha_rcv_packet: HA packet validation failed
[107] T07 ha_rcv_packet: HA packet validation failed

 

Solution:

Do not configure CoS rewrite-rules on the fabric links (fab0 and fab1).
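
If the rewrite-rules from the Symptoms section are already applied, one possible way to back them out is sketched below. This is a minimal sketch, not a verified procedure: it assumes the rewrite-rules were configured under the wildcard unit (unit *) exactly as shown in the Symptoms section, and that the classifiers and scheduler map are left in place because this article only identifies rewrite-rules as unsupported. Adjust the statement paths to match the actual configuration.

```
# Assumption: rewrite-rules were applied under "unit *" on fab0 and fab1, as in the Symptoms section
delete class-of-service interfaces fab0 unit * rewrite-rules
delete class-of-service interfaces fab1 unit * rewrite-rules
commit check
commit
```

After the commit, run show chassis cluster interfaces again to check whether the Monitored status of the fab0 and fab1 child interfaces returns to Up. This article does not state how long recovery takes.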

 

Related Links: