[MX] MPC6E reports XR2 JGCI Major CRC error with error code 0x250001

Article ID: KB33336 KB Last Updated: 12 Nov 2018 Version: 1.0
Summary:

The MPC6E reports an XR2 JGCI Major CRC error with error code 0x250001. XR2 CRC errors indicate marginal hardware: typically a board problem on the MPC6E, but possibly an XR2 memory issue. This is not a software issue.

A small number of reported CRC errors is not disruptive, and no action is needed as long as the count does not increase. However, when a burst of CRC errors occurs, a Major alarm is raised and the XR2 links must be retrained by rebooting the MPC6E board. The event is a Major alarm from the viewpoint of the CM error infrastructure (Cmerror) and a Fatal error from the viewpoint of the ASIC.

JTAC recommends manually rebooting the board. If the CRC errors reappear, consider replacing the MPC6E board. To have the FPC restart automatically and clear the error condition immediately, set the configuration knob interasic-linkerror-recovery-enable.

Symptoms:

On the MPC6E, Cmerror reports fatal CRC errors with 'XR2 Error code: 0x250001', but the MPC6E does not reboot.

*** messages ***
Oct 22 09:29:15  MX <163>Oct 22 09:29:15 MX fpc0 jgci_intf_log_interrupt_status: interface XR2CHIP(1)-intf-1 chan0 crc_count current 15 accumulated 15
Oct 22 09:29:15  MX fpc0 jgci_intf_log_interrupt_status: interface XR2CHIP(1)-intf-1 chan0 crc_count current 15 accumulated 15
Oct 22 09:29:15  MX fpc0 JGCI[XR2CHIP(1)-intf-1] JGCI_INT_REG_LKR_1_FORCED_RETRAIN seen
Oct 22 09:29:15  MX fpc0 JGCI[XR2CHIP(1)-intf-1] JGCI_INT_REG_LKR_0_FORCED_RETRAIN seen
Oct 22 09:29:15  MX tnp.tftpd[4558]: TFTPD_CONNECT_INFO: TFTP read from address 16 port 8 file /var/tmp/pfe_debug_commands
Oct 22 09:29:15  MX tnp.tftpd[4558]: TFTPD_SENDCOMPLETE_INFO: Sent 0 blocks of 1024 and 1 block of 42 for file '/var/tmp/pfe_debug_commands'
Oct 22 09:29:15  MX fpc0 Cmerror: Draining ASIC error message queue
Oct 22 09:29:15  MX fpc0 cmerror_process_queue: module = XR2CHIP(1)
Oct 22 09:29:16  MX fpc0 Cmerror: processing the task op_type 1 for level 1 level_count 2 occur_count 2 clear_count 0 level_threshold 1 level_action 0x6   item errid 2424833 item_threshold 1 item_count 0  sub_item errid 0 sub_item_state 0 item_timestamp 0 current times
Oct 22 09:29:16  MX fpc0 Cmerror: Level 1 count increment 3 occur_count 3 clear_count 0
Oct 22 09:29:16  MX fpc0 Error (0x250001), module: XR2CHIP(1), type: XR2 JGCI Major CRC error
Oct 22 09:29:16  MX fpc0 Cmerror: Level 1 count 3 (occur_count 3 clear_count 0)crossed threshold 1 action 0x6
Oct 22 09:29:16  MX fpc0 cmerror_take_action_helper: performing action 2 for level 1 err_id 0x250001
Oct 22 09:29:16  MX tnp.tftpd[4560]: TFTPD_CONNECT_INFO: TFTP write from address 16 port 9 file /var/tmp/pfe_debug_info_RMPC0
Oct 22 09:29:16  MX fpc0 cmerror_take_action_helper: performing action 4 for level 1 err_id 0x250001
Oct 22 09:29:16  MX fpc0 Cmerror Op Set: XR2CHIP(1): CRC Errors:XR2CHIP(1):1 on jgci rx channel id 2
Oct 22 09:29:17  MX fpc0 cmerror_process_queue: module = XR2CHIP(1)
Oct 22 09:29:17  MX fpc0 Cmerror: processing the task op_type 1 for level 1 level_count 3 occur_count 3 clear_count 0 level_threshold 1 level_action 0x6 item errid 2424833 item_threshold 1 item_count 1  sub_item errid 0 sub_item_state 0 item_timestamp -313367889 curr
Oct 22 09:29:17  MX fpc0 Cmerror: Level 1 count increment 4 occur_count 4 clear_count 0
Oct 22 09:29:17  MX fpc0 Error (0x250001), module: XR2CHIP(1), type: XR2 JGCI Major CRC error <-- Major alarm on CM error infra 
Oct 22 09:29:17  MX fpc0 Cmerror: Level 1 count 4 (occur_count 4 clear_count 0)crossed threshold 1 action 0x6
Oct 22 09:29:17  MX fpc0 cmerror_take_action_helper: performing action 2 for level 1 err_id 0x250001
Oct 22 09:29:17  MX tnp.tftpd[4562]: TFTPD_CONNECT_INFO: TFTP read from address 16 port 10 file /var/tmp/pfe_debug_commands
Oct 22 09:29:17  MX tnp.tftpd[4562]: TFTPD_SENDCOMPLETE_INFO: Sent 0 blocks of 1024 and 1 block of 42 for file '/var/tmp/pfe_debug_commands'
Oct 22 09:29:17  MX tnp.tftpd[4560]: TFTPD_RECVCOMPLETE_INFO: Received 67 blocks of 1024 size for file '/var/tmp/pfe_debug_info_RMPC0.2'
Oct 22 09:29:18  MX tnp.tftpd[4564]: TFTPD_CONNECT_INFO: TFTP write from address 16 port 11 file /var/tmp/pfe_debug_info_RMPC0
Oct 22 09:29:18  MX fpc0 cmerror_take_action_helper: performing action 4 for level 1 err_id 0x250001
Oct 22 09:29:18  MX <163>Oct 22 09:29:18 MX fpc0 Cmerror Op Set: XR2CHIP(1): Fatal Errors:XR2CHIP(1):1 on jgci rx channel id 3 <-- Fatal asic error
Oct 22 09:29:18  MX fpc0 Cmerror Op Set: XR2CHIP(1): Fatal Errors:XR2CHIP(1):1 on jgci rx channel id 3
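
The guidance that a few CRC errors are harmless as long as the count does not increase can be checked against collected syslog. The following is a minimal Python sketch, assuming the jgci_intf_log_interrupt_status message format shown in the excerpt above. The helper names (crc_samples, accumulated_increased) are hypothetical and not part of Junos, and the second sample line is invented for illustration.

```python
import re

# Matches the jgci_intf_log_interrupt_status lines as they appear in the excerpt above.
CRC_LINE = re.compile(
    r"interface (?P<intf>\S+) chan(?P<chan>\d+) "
    r"crc_count current (?P<current>\d+)\s+accumulated (?P<accumulated>\d+)"
)


def crc_samples(lines):
    """Yield (interface, channel, current, accumulated) for each matching log line."""
    for line in lines:
        m = CRC_LINE.search(line)
        if m:
            yield (m["intf"], int(m["chan"]), int(m["current"]), int(m["accumulated"]))


def accumulated_increased(samples):
    """Return the (interface, channel) pairs whose accumulated count grew between samples."""
    last = {}
    growing = set()
    for intf, chan, _current, accumulated in samples:
        key = (intf, chan)
        if key in last and accumulated > last[key]:
            growing.add(key)
        last[key] = accumulated
    return growing


sample_log = [
    # Taken from the log excerpt above.
    "Oct 22 09:29:15  MX fpc0 jgci_intf_log_interrupt_status: interface "
    "XR2CHIP(1)-intf-1 chan0 crc_count current 15 accumulated 15",
    # Hypothetical later sample (not from the article) to show a growing counter.
    "Oct 22 09:35:00  MX fpc0 jgci_intf_log_interrupt_status: interface "
    "XR2CHIP(1)-intf-1 chan0 crc_count current 5 accumulated 20",
]

samples = list(crc_samples(sample_log))
print(samples)
print(accumulated_increased(samples))
```

Because the comparison is strictly "greater than", a message that is logged twice with the same accumulated value (as happens in the excerpt) is not reported as an increase.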

 

Solution:

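To trigger an automatic FPC reboot, the configuration knob interasic-linkerror-recovery-enable is required. Configuring it on MX routers is recommended so that the error condition is cleared immediately. The lab test below triggers the error with a PIO poke, first without the knob and then again after enabling it.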

Pio Poking

RMPC6(mx2020-re0 vty)# test tpio poke 1 long 0x02303e88 0x12  
0x0002303e88: 0x12

RMPC6(mx2020-re0 vty)#
[Oct 24 09:49:53.536 LOG: Err] JGCI[XLCHIP(42)-intf-0] JGCI_INT_REG_LKR_0_FORCED_RETRAIN seen
[Oct 24 09:49:53.536 LOG: Err] Fatal JGCI error....FPC will restart to recover....  <-- FPC auto reboot
[Oct 24 09:49:53.536 LOG: Debug] Cmerror: Draining ASIC error message queue
[Oct 24 09:49:53.536 LOG: Debug] cmerror_process_queue: module = XL[0:0]
[Oct 24 09:49:53.536 LOG: Debug] Cmerror: processing the task op_type 1 for level 2 level_count 0 occur_count 0 clear_count 0 level_threshold 1 level_action 0x20 item errid 262232 item_threshold 1 item_count 0  sub_item errid 0 sub_item_state 0 item_timestamp 0 current times
[Oct 24 09:49:53.536 LOG: Debug] Cmerror: Level 2 count increment 1 occur_count 1 clear_count 0
[Oct 24 09:49:53.536 LOG: Info] Error (0x40058), module: XL[0:0], type: JGCI Fatal Errors
[Oct 24 09:49:53.536 LOG: Debug] Cmerror: Level 2 count 1 (occur_count 1 clear_count 0)crossed threshold 1 action 0x20
[Oct 24 09:49:53.536 LOG: Debug] cmerror_take_action_helper: performing action 20 for level 2 err_id 0x40058

lab@mx2020-re0> show chassis alarms  
9 alarms currently active
Alarm time               Class  Description
2018-10-24 18:47:03 JST  Major  FPC 6 Major Errors <-- major alarm is raised.
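
The Cmerror "processing the task" lines print the error ID in decimal ("item errid"), while the "Error (...)" lines print it in hexadecimal. The two are the same value: 2424833 is 0x250001 and 262232 is 0x40058. A small Python sketch of the conversion follows; the function name item_errid_hex is hypothetical, and the regular expression assumes the field layout seen in the logs above.

```python
import re

# "item errid <n>" but not "sub_item errid <n>", which appears later on the same line.
ITEM_ERRID = re.compile(r"(?<!sub_)item errid (\d+)")


def item_errid_hex(line):
    """Return the item errid from a Cmerror log line as a hex string, or None if absent."""
    m = ITEM_ERRID.search(line)
    if m is None:
        return None
    return f"0x{int(m.group(1)):x}"


xr2_line = (
    "level_action 0x6   item errid 2424833 item_threshold 1 item_count 0  "
    "sub_item errid 0 sub_item_state 0"
)
xl_line = (
    "level_threshold 1 level_action 0x20 item errid 262232 item_threshold 1 "
    "item_count 0  sub_item errid 0 sub_item_state 0"
)

print(item_errid_hex(xr2_line))  # 0x250001
print(item_errid_hex(xl_line))   # 0x40058
```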

Enabling Knob

[edit]
lab@mx2020-re0# set chassis fpc 6 interasic-linkerror-recovery-enable 

[edit]
lab@mx2020-re0# commit 
re0: 
configuration check succeeds
re1: 
commit complete
re0: 
commit complete

[edit]
lab@mx2020-re0# 
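
When several slots need the knob, the set commands can be generated rather than typed by hand. This is only a sketch: the command string is the one shown above for FPC 6, while the function name and the other slot numbers in the usage line are hypothetical. Whether the knob is appropriate for a given slot depends on the card installed there.

```python
def recovery_knob_commands(slots):
    """Build one 'set chassis fpc <slot> interasic-linkerror-recovery-enable' line per slot."""
    return [
        f"set chassis fpc {slot} interasic-linkerror-recovery-enable"
        for slot in sorted(set(slots))  # de-duplicate and keep a stable order
    ]


# Slot 6 is the one used in this article; 2 and 9 are hypothetical examples.
for command in recovery_knob_commands([6, 2, 9, 6]):
    print(command)
```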

Test Again

RMPC6(mx2020-re0 vty)# test tpio poke 1 long 0x02303e88 0x12   
0x0002303e88: 0x12

RMPC6(mx2020-re0 vty)#
[Oct 24 09:49:53.536 LOG: Err] JGCI[XLCHIP(42)-intf-0] JGCI_INT_REG_LKR_0_FORCED_RETRAIN seen
[Oct 24 09:49:53.536 LOG: Err] Fatal JGCI error....FPC will restart to recover....  <-- FPC auto reboot
[Oct 24 09:49:53.536 LOG: Debug] Cmerror: Draining ASIC error message queue
[Oct 24 09:49:53.536 LOG: Debug] cmerror_process_queue: module = XL[0:0]
[Oct 24 09:49:53.536 LOG: Debug] Cmerror: processing the task op_type 1 for level 2 level_count 0 occur_count 0 clear_count 0 level_threshold 1 level_action 0x20 item errid 262232 item_threshold 1 item_count 0  sub_item errid 0 sub_item_state 0 item_timestamp 0 current times
[Oct 24 09:49:53.536 LOG: Debug] Cmerror: Level 2 count increment 1 occur_count 1 clear_count 0
[Oct 24 09:49:53.536 LOG: Info] Error (0x40058), module: XL[0:0], type: JGCI Fatal Errors
[Oct 24 09:49:53.536 LOG: Debug] Cmerror: Level 2 count 1 (occur_count 1 clear_count 0)crossed threshold 1 action 0x20
[Oct 24 09:49:53.536 LOG: Debug] cmerror_take_action_helper: performing action 20 for level 2 err_id 0x40058

[edit]
lab@mx2020-re0# run show chassis fpc 
                     Temp  CPU Utilization (%)   CPU Utilization (%)  Memory    Utilization (%)
Slot State            (C)  Total  Interrupt      1min   5min   15min  DRAM (MB) Heap     Buffer
  0  Empty           
  1  Empty           
  2  Empty           
  3  Empty           
  4  Empty           
  5  Empty           
  6  Present          Testing
  7  Empty           
  8  Empty           
  9  Empty           
 10  Empty           
 11  Empty           
 12  Empty           
 13  Empty           
 14  Empty           
 15  Empty           
 16  Empty           
 17  Empty           
 18  Empty           
 19  Empty 
