This article highlights the steps for troubleshooting a packet drop scenario due to incrementing input packet rejects on Juniper MPC linecards in MX Series routers.
Packet drop on the incoming interface due to parity error on the pre-classifier engine of the linecard
Packet loss or application-level slowness is often attributed to Layer 1 issues, where input/output errors increment because of a bad cable or port. CRC errors are the usual culprit in such scenarios. The solution in these cases is simple: isolate the part that is inducing the error and replace it. Normally, the isolation procedure involves replacing components one at a time, or moving the link to a different port to check whether the issue is at the port level.
In some cases, however, you may see the "Input packet reject" counter increment.
show interfaces xe-0/0/0 extensive | match reject
Input packet rejects 165466
Input DA rejects 165466
Input SA rejects 0
Note: The rejects counter can be seen only with extensive output.
Destination Address (DA) rejects are generally seen when an incoming packet has a destination address that is not in the accept list. Similarly, the Source Address (SA) rejects counter counts incoming packets whose source address is not in the accept list. On their own, therefore, these counters do not normally indicate an issue on the Juniper device.
Parity Errors in Pre-Classifier Engine of Linecard
However, there are instances in which these counters increment because parity errors in the pre-classifier engine of the linecard cause incoming packets to be dropped at the ingress of the Juniper device.
Ideally, in such cases, an alarm should be raised at the FPC level and the impacted PFE or the interfaces should be disabled to prevent traffic blackholing. But in versions prior to Junos OS Release 17.2R1, the packets are silently dropped instead.
Check the output of show interfaces xe-*/*/* extensive to verify whether the packet rejects counter is incrementing. (Replace the asterisks with the slot, PIC, and port number, respectively.)
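Deciding whether the counter is incrementing requires at least two samples taken some time apart. The following is a minimal sketch of that comparison, assuming the output has been captured as text in the format shown earlier. The function names and the second sample are hypothetical and used only for illustration.

```python
import re

# Matches lines such as "Input packet rejects 165466" (a trailing colon after
# "rejects" is tolerated in case the captured output includes one).
REJECT_LINE = re.compile(r"^\s*Input\s+(packet|DA|SA)\s+rejects:?\s+(\d+)\s*$")


def parse_rejects(text):
    """Return the reject counters found in text, keyed by "packet", "DA", "SA"."""
    counters = {}
    for line in text.splitlines():
        match = REJECT_LINE.match(line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def incrementing(before, after):
    """Return the counters that grew between two samples, with their deltas."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] > before[name]
    }


# First sample: the values shown in this article.
first = parse_rejects("""
Input packet rejects 165466
Input DA rejects 165466
Input SA rejects 0
""")

# Second sample: hypothetical values, as if captured a little later.
second = parse_rejects("""
Input packet rejects 165900
Input DA rejects 165900
Input SA rejects 0
""")

print(incrementing(first, second))
```

An empty result means that none of the reject counters moved between the two samples.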
-
If the packet rejects counter is incrementing, log in to the FPC shell and check whether there are errors on the pre-classifier engine:
start shell pfe network fpc1 << Replace the FPC number as per your case.
-
Next, check the ASICs for this FPC:
show jspec client << This command lists out the ASIC components of the PFE.
For example:
show jspec client
ID Name
1 LUCHIP[0]
2 LUCHIP[4]
3 XMCHIP[0]
4 LUCHIP[1]
5 LUCHIP[5]
6 XMCHIP[1]
Note: The commands differ for mqchip-based PFEs, so use show jspec client first to determine which ASICs the FPC has.
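Because the later commands depend on whether the PFE is xmchip-based or mqchip-based, it can help to extract that information from the show jspec client listing. The sketch below assumes the two-column "ID Name" layout shown in the example. The function names are hypothetical.

```python
import re

# Matches rows such as "3 XMCHIP[0]"; the "ID Name" header does not match.
CLIENT_ROW = re.compile(r"^\s*(\d+)\s+([A-Za-z]+)\[(\d+)\]\s*$")


def list_asics(text):
    """Return (name, instance) tuples, e.g. ("XMCHIP", 0), in listed order."""
    asics = []
    for line in text.splitlines():
        match = CLIENT_ROW.match(line)
        if match:
            asics.append((match.group(2).upper(), int(match.group(3))))
    return asics


def xm_or_mq_chips(asics):
    """Keep only the XMCHIP and MQCHIP entries, which decide the command family."""
    return [entry for entry in asics if entry[0] in ("XMCHIP", "MQCHIP")]


sample = """
ID Name
1 LUCHIP[0]
2 LUCHIP[4]
3 XMCHIP[0]
4 LUCHIP[1]
5 LUCHIP[5]
6 XMCHIP[1]
"""

print(xm_or_mq_chips(list_asics(sample)))
```

For the example listing, the result contains two XMCHIP instances, so the xmchip variants of the commands apply.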
-
Determine the PFE number that your interface resides on:
show xmchip 0 ifd list 0 << This command will give the interfaces on PFE 0.
show xmchip 1 ifd list 0 << This command will give the interfaces on PFE 1.
show xmchip 0 ifd list 0
Ingress IFD list
----------------
---------------------------------------------------------------------
IFD name IFD index PHY stream LU SID Traffic Class
---------------------------------------------------------------------
xe-1/0/0 797 1025 0 0 (High)
xe-1/0/0 797 1026 0 1 (Medium)
xe-1/0/0 797 1027 0 2 (Low)
xe-1/0/1 798 1029 33 0 (High)
xe-1/0/1 798 1030 33 1 (Medium)
xe-1/0/1 798 1031 33 2 (Low)
xe-1/0/2 799 1033 66 0 (High)
xe-1/0/2 799 1034 66 1 (Medium)
xe-1/0/2 799 1035 66 2 (Low)
xe-1/0/3 800 1037 99 0 (High)
xe-1/0/3 800 1038 99 1 (Medium)
xe-1/0/3 800 1039 99 2 (Low)
xe-1/0/0 797 1072 0 3 (Drop)
xe-1/0/1 798 1072 33 3 (Drop)
xe-1/0/2 799 1072 66 3 (Drop)
xe-1/0/3 800 1072 99 3 (Drop)
et-1/1/0 801 1181 1248 0 (High)
et-1/1/0 801 1182 1248 1 (Medium)
et-1/1/0 801 1183 1248 2 (Low)
et-1/1/0 801 1228 1248 3 (Drop)
---------------------------------------------------------------------
For mqchip-based linecards:
show mqchip 0 ifd
Input IFD IFD LU
Stream Index Name Sid TClass
------ ------ ---------- ------ ------
1025 658 xe-3/0/0 0 hi
1026 658 xe-3/0/0 0 med
1027 658 xe-3/0/0 0 lo
1029 659 xe-3/0/1 33 hi
1030 659 xe-3/0/1 33 med
1031 659 xe-3/0/1 33 lo
1033 660 xe-3/0/2 66 hi
1034 660 xe-3/0/2 66 med
1035 660 xe-3/0/2 66 lo
1037 661 xe-3/0/3 99 hi
1038 661 xe-3/0/3 99 med
1039 661 xe-3/0/3 99 lo
1040 658 xe-3/0/0 0 drop
1040 659 xe-3/0/1 33 drop
1040 660 xe-3/0/2 66 drop
1040 661 xe-3/0/3 99 drop
1121 753 pd-3/0/0 1121 N/A
1121 754 pe-3/0/0 1121 N/A
1121 755 gr-3/0/0 1121 N/A
1121 756 ip-3/0/0 1121 N/A
1121 757 vt-3/0/0 1121 N/A
1121 758 mt-3/0/0 1121 N/A
1121 759 lt-3/0/0 1121 N/A
1121 760 ut-3/0/0 1121 N/A
1121 761 ud-3/0/0 1121 N/A
Output IFD IFD Base
Stream Index Name Qsys Qnum
------ ------ ---------- ------ ------
1024 658 xe-3/0/0 MQ0 0
1025 659 xe-3/0/1 MQ0 256
1026 660 xe-3/0/2 MQ0 512
1027 661 xe-3/0/3 MQ0 776
1121 753 pd-3/0/0 MQ0 184
1121 754 pe-3/0/0 MQ0 184
1121 755 gr-3/0/0 MQ0 184
1121 756 ip-3/0/0 MQ0 184
1121 757 vt-3/0/0 MQ0 184
1121 758 mt-3/0/0 MQ0 184
1121 759 lt-3/0/0 MQ0 184
1121 760 ut-3/0/0 MQ0 184
1121 761 ud-3/0/0 MQ0 184
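Finding the PFE for an interface amounts to searching each PFE's IFD list for the interface name. The following sketch does this for the xmchip-style ingress IFD list shown above, assuming whitespace-separated columns with the traffic class label in parentheses. The function names and the PFE 1 rows are hypothetical; the mqchip output uses a different column order and would need its own pattern.

```python
import re

# Matches rows such as "xe-1/0/0 797 1025 0 0 (High)".
IFD_ROW = re.compile(
    r"^\s*(?P<name>[a-z]+-\d+/\d+/\d+)\s+(?P<index>\d+)\s+(?P<stream>\d+)"
    r"\s+(?P<sid>\d+)\s+(?P<tclass>\d+)\s+\((?P<label>\w+)\)\s*$"
)


def interfaces_on_pfe(text):
    """Return the unique interface names in one "ifd list" output, in order."""
    names = []
    for line in text.splitlines():
        match = IFD_ROW.match(line)
        if match and match.group("name") not in names:
            names.append(match.group("name"))
    return names


def find_pfe(outputs_by_pfe, interface):
    """Return the PFE number whose list contains interface, or None."""
    for pfe, text in outputs_by_pfe.items():
        if interface in interfaces_on_pfe(text):
            return pfe
    return None


outputs = {
    # Trimmed rows from the PFE 0 example in this article.
    0: """
xe-1/0/0 797 1025 0 0 (High)
xe-1/0/1 798 1029 33 0 (High)
et-1/1/0 801 1181 1248 0 (High)
et-1/1/0 801 1228 1248 3 (Drop)
""",
    # Hypothetical rows for PFE 1, for illustration only.
    1: """
xe-1/2/0 810 1025 0 0 (High)
xe-1/2/1 811 1029 33 0 (High)
""",
}

print(find_pfe(outputs, "et-1/1/0"))
```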
-
After you isolate the PFE number, check whether the pre-classifier engine has any parity errors:
For xmchip-based linecards:
show jspec xmchip[0] registers precl 0 error_int
Offset Name Current
0x08200100 xm.precl[0].error_int.status 00000000
0x08200108 xm.precl[0].error_int.diag 00000000
0x08200110 xm.precl[0].error_int.enable 00000000
0x08200118 xm.precl[0].error_int.poll_enable 00000000
0x08200120 xm.precl[0].error_int.enable_set 00000000
0x08200128 xm.precl[0].error_int.enable_clr 00000000
0x08200130 xm.precl[0].error_int.poll_enable_set 00000000
0x08200138 xm.precl[0].error_int.poll_enable_clr 00000000
For mqchip-based linecards:
show jspec mqchip[0] registers precl error_int
Offset Name Current
0x07180020 mq.precl.error_int.status 00000000
0x07180024 mq.precl.error_int.diag 00000000
0x07180028 mq.precl.error_int.enable 00000000
0x0718002c mq.precl.error_int.poll_enable 00000000
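The outputs above show the register values for the interrupts that result from parity errors. In both examples, all values are zero. The following example shows the error log registers for pre-classifier 1 on an xmchip-based linecard: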
show jspec xmchip[0] registers precl 1 error_log
Offset Name Current
0x08280040 xm.precl[1].error_log.instmem_parity_err_addr 0000003F
0x08280044 xm.precl[1].error_log.progerr_context_num 00000004
0x08280048 xm.precl[1].error_log.tcamerr_context_num 00000000
0x0828004c xm.precl[1].error_log.inqbuf_overflow_port 00000000
In the above output, the instmem_parity_err_addr and progerr_context_num registers hold non-zero values. This means that there are parity errors in the pre-classifier engine of the PFE.
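When several register dumps have to be checked, scanning for non-zero values can be scripted. The sketch below assumes the three-column "Offset Name Current" layout shown in the outputs above, with the current value printed as eight hexadecimal digits. The function name is hypothetical.

```python
import re

# Matches rows such as
# "0x08280040 xm.precl[1].error_log.instmem_parity_err_addr 0000003F".
REGISTER_ROW = re.compile(r"^\s*(0x[0-9a-fA-F]+)\s+(\S+)\s+([0-9a-fA-F]{8})\s*$")


def nonzero_registers(text):
    """Return {register name: integer value} for every non-zero register."""
    flagged = {}
    for line in text.splitlines():
        match = REGISTER_ROW.match(line)
        if match:
            value = int(match.group(3), 16)
            if value != 0:
                flagged[match.group(2)] = value
    return flagged


error_log = """
Offset Name Current
0x08280040 xm.precl[1].error_log.instmem_parity_err_addr 0000003F
0x08280044 xm.precl[1].error_log.progerr_context_num 00000004
0x08280048 xm.precl[1].error_log.tcamerr_context_num 00000000
0x0828004c xm.precl[1].error_log.inqbuf_overflow_port 00000000
"""

for name, value in nonzero_registers(error_log).items():
    print(f"{name} = 0x{value:08X}")
```

An empty result for a dump corresponds to the all-zero case shown in the error_int examples.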
Solution
To resolve the error, reboot the FPC. The reboot is service-affecting, so schedule it in a maintenance window. After the reboot, check the register values and the input rejects counter again.
If there are no increments, the issue has been cleared. If you still get reports of performance issues and see the rejects counter incrementing along with a non-zero value in the pre-classifier register, contact Support with your findings to confirm whether the FPC needs to be replaced.
Starting in Junos OS Release 17.2R1, the enhancement PR1059137 addresses this issue: a cmerror alarm is raised when any parity errors are observed on the pre-classifier engine.