[MX] Syslog message - XMCHIP_CMERROR_DDRIF_PROTECT_LLISTQ_SRAM_CHNK_CHKSUM (0x70068)

Article ID: KB37469  KB Last Updated: 13 Sep 2021  Version: 1.0

Summary:

This article explains the meaning of the syslog messages below, along with the corresponding major alarm, and clarifies what actions need to be taken.

"Performing action cmalarm for error /fpc/1/pfe/0/cm/0/XMCHIP(3)/3/XMCHIP_CMERROR_DDRIF_PROTECT_LLISTQ_SRAM_CHNK_CHKSUM (0x70068) in module: XMCHIP(3) with scope: pfe category: functional level: major" and,

"Cmerror Op Set: XMCHIP(3): XMCHIP(3): DDRIF: LLISTQ Protect: Parity error for chunk checksum SRAM (0x4) (URI: /fpc/1/pfe/0/cm/0/XMCHIP(3)/3/XMCHIP_CMERROR_DDRIF_PROTECT_LLISTQ_SRAM_CHNK_CHKSUM)" 

The "LLISTQ Protect: Multiple Errors" message reports a memory parity error in the XMCHIP.

This is a troubleshooting article for a PFE ASIC Syslog Event. To view other documented syslog events related to XMCHIP, XLCHIP, MQCHIP, LUCHIP, EACHIP, and PECHIP, see KB31893 - Index of Articles for Troubleshooting PFE ASIC Syslog Events.

Symptoms:

The show chassis alarms output shows a "Major Errors" alarm for the affected FPC:

root@device_re0> show chassis alarms no-forwarding
1 alarms currently active
Alarm time               Class  Description
2021-08-23 09:43:43 CEST Major  FPC 1 Major Errors

When a "LLISTQ Protect: Multiple Errors" event occurs, the following messages can be seen:

Aug 23 09:43:29 device_re0 fpc1 XMCHIP(3): XMCHIP(3): DDRIF: LLISTQ Protect: Multiple Errors 0x20
Aug 23 09:43:29 device_re0 fpc1 Error: /fpc/1/pfe/0/cm/0/XMCHIP(3)/3/XMCHIP_CMERROR_DDRIF_PROTECT_LLISTQ_SRAM_CHNK_CHKSUM (0x70068), scope: pfe, category: functional, severity: major, module: XMCHIP(3), type: DDRIF_PROTECT: Detected: Parity error for chnk checksum SRAM
Aug 23 09:43:30 device_re0 fpc1 Performing action get-state for error /fpc/1/pfe/0/cm/0/XMCHIP(3)/3/XMCHIP_CMERROR_DDRIF_PROTECT_LLISTQ_SRAM_CHNK_CHKSUM (0x70068) in module: XMCHIP(3) with scope: pfe category: functional level: major
Aug 23 09:43:43 device_re0 fpc1 Performing action cmalarm for error /fpc/1/pfe/0/cm/0/XMCHIP(3)/3/XMCHIP_CMERROR_DDRIF_PROTECT_LLISTQ_SRAM_CHNK_CHKSUM (0x70068) in module: XMCHIP(3) with scope: pfe category: functional level: major
Aug 23 09:43:43 device_re0 alarmd[40428]: Alarm set: FPC id=150995304, color=RED, class=CHASSIS, reason=FPC 1 Major Errors
Aug 23 09:43:43 device_re0 craftd[24337]: Major alarm set, FPC 1 Major Errors
Aug 23 09:43:43 device_re0 fpc1 Performing action disable-pfe for error /fpc/1/pfe/0/cm/0/XMCHIP(3)/3/XMCHIP_CMERROR_DDRIF_PROTECT_LLISTQ_SRAM_CHNK_CHKSUM (0x70068) in module: XMCHIP(3) with scope: pfe category: functional level: major
Aug 23 09:43:43 device_re0 fpc1 PFE 3: 'PFE Disable' action performed. Bringing down ifd xe-1/1/12 388
Aug 23 09:43:43 device_re0 fpc1 PFE 3: 'PFE Disable' action performed. Bringing down ifd xe-1/1/13 389
Aug 23 09:43:44 device_re0 fpc1 PFE 3: 'PFE Disable' action performed. Bringing down ifd xe-1/1/14 390
Aug 23 09:43:44 device_re0 fpc1 PFE 3: 'PFE Disable' action performed. Bringing down ifd xe-1/1/15 391
Aug 23 09:43:44 device_re0 fpc1 PFE 3: 'PFE Disable' action performed. Bringing down ifd xe-1/1/16 392
Aug 23 09:43:45 device_re0 fpc1 PFE 3: 'PFE Disable' action performed. Bringing down ifd xe-1/1/17 393
Aug 23 09:43:45 device_re0 fpc1 PFE 3: 'PFE Disable' action performed. Bringing down ifd xe-1/1/18 394
Aug 23 09:43:45 device_re0 fpc1 PFE 3: 'PFE Disable' action performed. Bringing down ifd xe-1/1/19 395
Aug 23 09:43:45 device_re0 fpc1 PFE 3: 'PFE Disable' action performed. Bringing down ifd xe-1/1/20 396
Aug 23 09:43:45 device_re0 fpc1 PFE 3: 'PFE Disable' action performed. Bringing down ifd xe-1/1/21 397
Aug 23 09:43:45 device_re0 fpc1 PFE 3: 'PFE Disable' action performed. Bringing down ifd xe-1/1/22 398
Aug 23 09:43:45 device_re0 fpc1 PFE 3: 'PFE Disable' action performed. Bringing down ifd xe-1/1/23 399
Aug 23 09:43:46 device_re0 fpc1 PFE 3: 'PFE Disable' action performed. Bringing down ifd xe-1/1/13 389
Aug 23 09:43:47 device_re0 fpc1 Cmerror Op Set: XMCHIP(3): XMCHIP(3): DDRIF: LLISTQ Protect: Parity error for chunk checksum SRAM (0x4) (URI: /fpc/1/pfe/0/cm/0/XMCHIP(3)/3/XMCHIP_CMERROR_DDRIF_PROTECT_LLISTQ_SRAM_CHNK_CHKSUM)
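When reviewing a log like the one above, it can help to list which interfaces the 'PFE Disable' action brought down. The following is a minimal Python sketch, not a Juniper tool. The regular expression is an assumption based only on the message format in the sample above, and `disabled_interfaces` is a hypothetical helper name.

```python
import re

# Assumed format, taken from the sample messages in this article:
#   "... fpc1 PFE 3: 'PFE Disable' action performed. Bringing down ifd xe-1/1/12 388"
IFD_DOWN = re.compile(
    r"fpc(?P<fpc>\d+) PFE (?P<pfe>\d+): 'PFE Disable' action performed\. "
    r"Bringing down ifd (?P<ifd>\S+) (?P<index>\d+)"
)

def disabled_interfaces(lines):
    """Return {(fpc, pfe): [ifd names]} in first-seen order, without duplicates."""
    result = {}
    for line in lines:
        m = IFD_DOWN.search(line)
        if not m:
            continue
        key = (int(m["fpc"]), int(m["pfe"]))
        names = result.setdefault(key, [])
        # The same interface can be logged more than once; keep it only once.
        if m["ifd"] not in names:
            names.append(m["ifd"])
    return result

sample = [
    "Aug 23 09:43:43 device_re0 fpc1 PFE 3: 'PFE Disable' action performed. Bringing down ifd xe-1/1/12 388",
    "Aug 23 09:43:43 device_re0 fpc1 PFE 3: 'PFE Disable' action performed. Bringing down ifd xe-1/1/13 389",
    "Aug 23 09:43:46 device_re0 fpc1 PFE 3: 'PFE Disable' action performed. Bringing down ifd xe-1/1/13 389",
]
print(disabled_interfaces(sample))  # {(1, 3): ['xe-1/1/12', 'xe-1/1/13']}
```

Keying the result by (FPC, PFE) keeps the output usable if more than one PFE is disabled in the same log.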


RMPC1(device_re0 vty)# show cmerror module 33 error 0x070068       
 
Error-id              : 0x70068
 
Error Name            : XMCHIP_CMERROR_DDRIF_PROTECT_LLISTQ_SRAM_CHNK_CHKSUM
Identifier            : /fpc/1/pfe/0/cm/0/XMCHIP(3)/3/XMCHIP_CMERROR_DDRIF_PROTECT_LLISTQ_SRAM_CHNK_CHKSUM
Description           : DDRIF_PROTECT: Detected: Parity error for chnk checksum SRAM
State                 : enabled
Scope                 : pfe
Category              : functional
PFE                   : 3
Configured Level      : Major
Default Level         : Major
Count                 : 1
Threshold             : 1
Error Limit           : 0
Occur Count           : 1
Clear Count           : 0
Last-occurred(ms ago) : 9972079

Logs:
----------------------------------------------------------
Index  Time                 Sub-Err   State    Description
----------------------------------------------------------
0      08/23/21 07:43:43    0         Set      unknown

Cause:

This message is caused by a transient hardware memory parity error in the linked list queue (LLISTQ) on the XMCHIP ASIC. Such errors are very rare. When the number of DDRIF_PROTECT_LLISTQ_SRAM_CHNK_CHKSUM errors reaches the threshold of 1, the affected PFE is disabled from further use.
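The Count and Threshold fields in the show cmerror output can be compared programmatically. The sketch below is a hedged illustration in Python: `parse_cmerror` and `threshold_reached` are hypothetical helpers, and the "Key : value" layout is assumed from the sample output in this article. In that sample, Count and Threshold are both 1 and the disable-pfe action has already run, so the sketch treats count >= threshold as "reached"; that reading is an interpretation of the sample, not documented behavior.

```python
def parse_cmerror(text):
    """Parse 'Key : value' lines from show cmerror output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        # Lines without ' : ' (headings, the log table) are skipped.
        if sep and key.strip():
            fields[key.strip()] = value.strip()
    return fields

def threshold_reached(fields):
    """True when the error count is at or above the configured threshold (assumed rule)."""
    return int(fields["Count"]) >= int(fields["Threshold"])

sample_output = """\
Error-id              : 0x70068
Scope                 : pfe
PFE                   : 3
Configured Level      : Major
Count                 : 1
Threshold             : 1
Occur Count           : 1
Clear Count           : 0
"""

fields = parse_cmerror(sample_output)
print(fields["PFE"], threshold_reached(fields))
```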

Indications:

  • A single occurrence of this syslog message, with no other accompanying events, indicates a one-time hardware error. Repeated occurrences indicate a persistent underlying issue.

  • If the errors are seen repeatedly, traffic may be affected or packet forwarding may be permanently impaired.
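The distinction between a single occurrence and repeated occurrences can be checked against a saved copy of the messages log. The following Python sketch is one possible approach, under stated assumptions: the timestamp layout matches the sample syslog lines in this article, the year is supplied by the caller because syslog omits it, and the 24-hour window is an arbitrary illustrative value rather than a Juniper recommendation. `occurrence_times` and `classify` are hypothetical names.

```python
from datetime import datetime, timedelta

ERROR_NAME = "XMCHIP_CMERROR_DDRIF_PROTECT_LLISTQ_SRAM_CHNK_CHKSUM"

def occurrence_times(lines, year):
    """Timestamps of 'Cmerror Op Set' lines for ERROR_NAME."""
    times = []
    for line in lines:
        if "Cmerror Op Set" in line and ERROR_NAME in line:
            # First three fields are month, day, time, e.g. "Aug 23 09:43:47".
            stamp = " ".join(line.split()[:3])
            # %b expects English month abbreviations in the default C locale.
            times.append(datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S"))
    return times

def classify(times, window=timedelta(hours=24)):
    """Return 'none', 'single', 'repeated' (two events within window) or 'sporadic'."""
    if not times:
        return "none"
    if len(times) == 1:
        return "single"
    ordered = sorted(times)
    for earlier, later in zip(ordered, ordered[1:]):
        if later - earlier <= window:
            return "repeated"
    return "sporadic"

log = [
    "Aug 23 09:43:47 device_re0 fpc1 Cmerror Op Set: XMCHIP(3): XMCHIP(3): DDRIF: "
    "LLISTQ Protect: Parity error for chunk checksum SRAM (0x4) "
    "(URI: /fpc/1/pfe/0/cm/0/XMCHIP(3)/3/" + ERROR_NAME + ")",
]
print(classify(occurrence_times(log, 2021)))
```

A result of "repeated" would correspond to the persistent case described above, and is the point at which contacting technical support becomes relevant.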

To summarize:

  • The DDRIF_PROTECT_LLISTQ_SRAM_CHNK_CHKSUM data error is a transient hardware problem.

  • An alarm is raised, and then the specific PFE is disabled.

  • Hardware PPE traps may be seen.

  • The amount of dropped traffic is given by the "Count" value in the HW Fault Trap message (HW Fault Trap: Count #).

Solution:

Perform the following steps to determine the cause and resolve the problem (if any). Continue through each step until the problem is resolved.

  1. Collect the following show command outputs.

Capture the output to a file in case you need to open a technical support case. To do this, configure your SSH client or terminal emulator to log the session.

  • show log messages
  • show log chassisd
  • start shell network pfe <fpc#>
  • show nvram
  • show syslog messages
  • exit

  2. Analyze the show command output.

In the show log messages output, review the events that occurred at or just before the appearance of the "LLISTQ Protect: Multiple Errors" message. Frequently, these events help identify the cause.

  • No RMA is required.

  • Restarting the FPC during a Maintenance Window should clear this error.

  • Contact your technical support representative if the issue is seen after an FPC restart. 

This article is indexed in KB31893 - Primary Index of Articles for Troubleshooting PFE ASIC Syslog Events; tag XMCHIPTSG

Tip: When looking at an event in the logs, it is important to focus on the first error message in a collection of syslog messages. The first error message is usually the cause of all the follow-on error messages. The follow-on collateral damage error messages can be ignored.
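As a small illustration of this tip, the Python sketch below returns the first message logged by a given source (for example, fpc1) in a saved log excerpt. It assumes the lines are already in chronological order and follow the "month day time hostname source message" layout seen in the sample syslog output in this article. `first_message` is a hypothetical helper name.

```python
def first_message(lines, source):
    """Return the first line whose syslog source field equals `source`, or None."""
    for line in lines:
        # Fields: month, day, time, hostname, source, remainder of the message.
        parts = line.split(maxsplit=5)
        if len(parts) >= 5 and parts[4].rstrip(":") == source:
            return line
    return None

excerpt = [
    "Aug 23 09:43:29 device_re0 fpc1 XMCHIP(3): XMCHIP(3): DDRIF: LLISTQ Protect: Multiple Errors 0x20",
    "Aug 23 09:43:43 device_re0 craftd[24337]: Major alarm set, FPC 1 Major Errors",
    "Aug 23 09:43:43 device_re0 fpc1 PFE 3: 'PFE Disable' action performed. Bringing down ifd xe-1/1/12 388",
]
print(first_message(excerpt, "fpc1"))
```

For the excerpt above, the function returns the "LLISTQ Protect: Multiple Errors" line, which is the message to focus on; the alarm and 'PFE Disable' lines follow from it.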
