
[MX] Syslog message: XMCHIP.*HOSTIF: Protect: Parity error for SRAM in bank.*


Article ID: KB32890 KB Last Updated: 29 Oct 2020 Version: 2.0
Summary:

This is a troubleshooting article for a PFE ASIC syslog event. It explains how to troubleshoot the "HOSTIF: Protect: Parity error for SRAM" message, which indicates a transient hardware memory error.

To view other documented syslog events related to XMCHIP, XLCHIP, MQCHIP, LUCHIP, and EACHIP, see KB31893 - Master Index of Articles for Troubleshooting PFE ASIC Syslog Events.

Symptoms:

When an "SRAM memory parity error" event occurs, a message similar to the following is reported:

<Host> <FPC#> XMCHIP(x): HOSTIF: Protect: Parity error for SRAM in bank 0
<Host> <FPC#> XMCHIP(x): HOSTIF: Protect: Log Error 0x1, Log Address 0x3c54, Multiple Errors 0x0

The following log messages accompany the above message:

alarmd[1607]: %DAEMON-4: Alarm set: FPC color=RED, class=CHASSIS, reason=FPC 19 Major Errors
craftd[1608]: %DAEMON-4:  Major alarm set, FPC 19 Major Errors
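
When scanning saved logs for this event, it can help to pull the fields out of the two XMCHIP lines shown above. The following Python sketch is one way to do that. The regular expressions are derived only from the sample messages in this article, and the function name parse_xmchip_line and the returned dictionary keys are hypothetical; actual log lines on your device may carry additional prefixes (timestamps, host names) that the search tolerates but does not parse.

```python
import re

# Patterns derived from the sample messages in this article (assumption:
# real log lines keep the same wording after the "XMCHIP(n):" tag).
PARITY_RE = re.compile(
    r"XMCHIP\((\d+)\): HOSTIF: Protect: Parity error for SRAM in bank (\d+)"
)
DETAIL_RE = re.compile(
    r"XMCHIP\((\d+)\): HOSTIF: Protect: Log Error (0x[0-9a-fA-F]+), "
    r"Log Address (0x[0-9a-fA-F]+), Multiple Errors (0x[0-9a-fA-F]+)"
)


def parse_xmchip_line(line):
    """Return a dict describing the line, or None if it is not one of the two messages."""
    m = PARITY_RE.search(line)
    if m:
        return {"kind": "parity", "chip": int(m.group(1)), "bank": int(m.group(2))}
    m = DETAIL_RE.search(line)
    if m:
        return {
            "kind": "detail",
            "chip": int(m.group(1)),
            "log_error": int(m.group(2), 16),
            "log_address": int(m.group(3), 16),
            "multiple_errors": int(m.group(4), 16),
        }
    return None
```

For example, parse_xmchip_line("router1 fpc19 XMCHIP(0): HOSTIF: Protect: Parity error for SRAM in bank 0") returns a dictionary with kind "parity", chip 0, and bank 0. The host name "router1" and the "fpc19" token in that example are illustrative only.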
 

Indications

  • A single occurrence of this syslog message, not accompanied by other events, indicates a one-time hardware error. Multiple continuous occurrences indicate a persistent underlying issue.

  • If the errors are seen repeatedly, traffic may be impacted or the PFE may experience permanent packet forwarding issues.
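
To tell a single occurrence from repeated ones, the parity messages in a saved log can be counted per FPC and XMCHIP instance. The Python sketch below is a minimal illustration, not a Juniper tool. It assumes the FPC appears in the log line as a token such as "fpc19" immediately before "XMCHIP(n)"; the function names and the rule "more than one means repeated" are assumptions made for the example.

```python
import re
from collections import Counter

# Assumption: the FPC is logged as "fpcN" directly before the XMCHIP tag.
PARITY_RE = re.compile(
    r"(fpc\d+)\s+XMCHIP\((\d+)\): HOSTIF: Protect: Parity error for SRAM"
)


def count_parity_errors(lines):
    """Count parity-error messages per (fpc, chip) pair."""
    counts = Counter()
    for line in lines:
        m = PARITY_RE.search(line)
        if m:
            counts[(m.group(1), int(m.group(2)))] += 1
    return counts


def classify(count):
    """Label a count using the single-versus-repeated distinction."""
    return "single occurrence" if count == 1 else "repeated occurrences"


if __name__ == "__main__":
    sample = [
        "router1 fpc19 XMCHIP(0): HOSTIF: Protect: Parity error for SRAM in bank 0",
        "router1 fpc19 XMCHIP(0): HOSTIF: Protect: Parity error for SRAM in bank 0",
    ]
    for key, n in count_parity_errors(sample).items():
        print(key, n, classify(n))
```

Because the messages are throttled (see Cause), the count obtained from the log can be lower than the number of errors that actually occurred.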

Cause:

The cause is a transient parity error detected in the internal XMCHIP memory. A software workaround that throttles the messages was implemented in PR958661. With the fix in this PR, the first three occurrences are reported. The throttling affects only the logging, however; the error itself can still be seen.

Solution:

Perform the following steps to determine the cause and resolve problems (if any). Continue through each step until the problem is resolved.

  1. Collect the show command output.

Attention 'Junos Space Service Now' users:

This 'show' command output is automatically collected for you by the Advanced Insight Scripts, so you may skip to the next step after reading the rest of this notification.

To see the 'show' command output, refer to the Attachment Details in the incident (Service Central > Incidents); see Viewing Incident Details. When a Technical Service Request is opened by Service Now, the 'show' command output is also attached to the request. 

The Advanced Insight Scripts also collect standard information, such as RSI (Request Support Information), log files, and core files. See KB29138 - [Service Now] Standard information collected by AI-Scripts.

To suggest additional commands or provide comments, please email us: ais-events-review@juniper.net.


 

Capture the output to a file (in case you need to open a technical service request). To do this, configure each SSH client/terminal emulator to log your session.

show log messages
show log chassisd
start shell pfe network <fpc#>
show nvram
show syslog messages
exit
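
If you collect this output from several FPCs, the command sequence above can be generated per FPC rather than retyped. The Python sketch below only builds the list of command strings; it does not connect to a device. The function name collection_commands is hypothetical, and the "fpc19" value used in the usage line is an assumed form for the <fpc#> placeholder.

```python
def collection_commands(fpc_target):
    """Return this article's command sequence with <fpc#> replaced by fpc_target.

    The exact form of fpc_target (for example "fpc19") is an assumption;
    the article only shows the <fpc#> placeholder.
    """
    if not fpc_target or any(ch.isspace() for ch in fpc_target):
        raise ValueError("fpc_target must be a single non-empty token")
    return [
        "show log messages",
        "show log chassisd",
        f"start shell pfe network {fpc_target}",
        "show nvram",
        "show syslog messages",
        "exit",
    ]


if __name__ == "__main__":
    # Print the sequence for one FPC so it can be pasted into a logged session.
    print("\n".join(collection_commands("fpc19")))
```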
  2. Analyze the show command output.

In the show log messages output, review the events that occurred at or just before the appearance of the "HOSTIF: Protect: Parity error for SRAM" message. These events frequently help identify the cause and the action to take next, such as the following:

  • No RMA is required.

  • Restart the MPC and monitor the errors.

  • Contact your technical support representative if the issue is seen after an FPC restart. 
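
To review the events that occurred just before the parity message, a saved copy of the show log messages output can be scanned for the first match and the lines preceding it. The Python sketch below is one minimal way to do this. The function name, the default of five context lines, and the marker string (taken from the sample message in Symptoms) are choices made for the example.

```python
from collections import deque

# Marker text taken from the sample message in the Symptoms section.
MARKER = "HOSTIF: Protect: Parity error for SRAM"


def context_before_first_match(lines, before=5, marker=MARKER):
    """Return (preceding_lines, matching_line) for the first line containing marker.

    Returns ([], None) when the marker never appears.
    """
    recent = deque(maxlen=before)  # keeps only the last `before` lines seen
    for line in lines:
        if marker in line:
            return list(recent), line
        recent.append(line)
    return [], None


if __name__ == "__main__":
    import sys

    # Usage: python script.py saved_messages.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        context, hit = context_before_first_match(line.rstrip("\n") for line in handle)
    for line in context:
        print(line)
    if hit is not None:
        print(">>>", hit)
```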

Note: This article is indexed in KB31893 - Master Index of Articles for Troubleshooting PFE ASIC Syslog Events; tag XMCHIPTSG

Tip: When looking at an event in the logs, it is important to focus on the first error message in a collection of syslog messages. The first error message is usually the cause of all follow-on error messages. The follow-on collateral damage error messages can be ignored.
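
The tip above can be approximated in a script by grouping log lines into bursts and keeping only the first line of each burst. The Python sketch below makes several assumptions that are not from this article: each line starts with a BSD-style syslog timestamp such as "Oct 28 10:15:32", the year is supplied by the caller because that timestamp format omits it, the lines are already in chronological order and limited to error messages, and a gap of more than 60 seconds starts a new burst. The function names are hypothetical.

```python
from datetime import datetime, timedelta


def parse_stamp(line, year=2020):
    """Parse an assumed 'Mon DD HH:MM:SS' prefix; the year is supplied by the caller."""
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def first_of_each_burst(lines, gap_seconds=60, year=2020):
    """Return the first line of every burst of messages.

    A new burst starts when the gap to the previous line exceeds gap_seconds.
    """
    firsts = []
    previous = None
    gap = timedelta(seconds=gap_seconds)
    for line in lines:
        stamp = parse_stamp(line, year)
        if previous is None or stamp - previous > gap:
            firsts.append(line)
        previous = stamp
    return firsts
```

The 60-second gap is arbitrary; choose a value that matches how closely the follow-on messages trail the first error in your logs.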

Modification History:
2020-10-28: Fixed typo in Solution item 1 (start shell network pfe <fpc#> To: start shell pfe network <fpc#>)