
Syslog message: EA.*HMCIF Rx: Link.: A response packet with a FATAL state is received from HMC - State: 0x7f


Article ID: KB32091
KB Last Updated: 11 Oct 2021
Version: 5.0
Summary:

The "response packet with a FATAL state is received from HMC" message reports a fatal error condition in the Hybrid Memory Cube (HMC).

This is a Troubleshooting Article for a PFE ASIC Syslog Event.
To view other documented syslog events related to XMCHIP, XLCHIP, MQCHIP, LUCHIP, EACHIP, and PECHIP, see KB31893 - Index of Articles for Troubleshooting PFE ASIC Syslog Events.


Symptoms:

When a "response packet with a FATAL state is received from HMC" event occurs, a message similar to the following is reported:

Mar 24 13:35:26.009 router-re0 fpc3 cmtfpc_hmc_fatal_dump: generating fatal register dump, HMC 64
Mar 24 13:35:26.013 router-re0 fpc3 Dumping Micron HMC 64 FATAL ERR DUMP 3181 entries ...
Mar 24 13:35:26.038 router-re0 fpc3 Addr , Data Addr , Data
Mar 24 13:35:26.062 router-re0 fpc3 0x002c8001, 0x609a9ee3 0x002c8002, 0xee854a32
Mar 24 13:35:26.067 router-re0 fpc3 0x002c8004, 0x0111222c 0x002c8003, 0x009b0000
Mar 24 13:35:26.068 router-re0 fpc3 0x002c8000, 0x00000002 0x00288000, 0x00000009
Mar 24 13:35:26.068 router-re0 fpc3 0x00288001, 0x00000009 0x00288002, 0x81ff0100
Mar 24 13:35:26.068 router-re0 fpc3 0x00288003, 0x00f8f3e9 0x00288004, 0x00f670ec
Mar 24 13:35:26.068 router-re0 fpc3 0x00240000, 0x00000ef9 0x00240002, 0x00010380
Mar 24 13:35:26.069 router-re0 fpc3 0x00240003, 0x40640000 0x00240004, 0x00000000
Mar 24 13:35:26.069 router-re0 fpc3 0x00240005, 0xffff8200 0x00240006, 0x00000040
Mar 24 13:35:26.069 router-re0 fpc3 Cmerror: Draining ASIC error message queue
Mar 24 13:35:26.074 router-re0 fpc3 cmerror_process_queue: module = EA[1:0]
Mar 24 13:35:26.079 router-re0 fpc3 Cmerror: processing the task op_type 1 for level 2 level_count 0 occur_count 0 clear_count 0 level_threshold 1 level_action 0x3 item errid 2359489 item_threshold 1 item_count 0 sub_item errid 0 sub_item_state 0 item_timestamp 0 current times
Mar 24 13:35:26.083 router-re0 fpc3 Cmerror: Level 2 count increment 1 occur_count 1 clear_count 0
Mar 24 13:35:26.088 router-re0 fpc3 Error (0x2400c1), module: EA[1:0], type: HMCIF RX link int reg HMC fatal err
Mar 24 13:35:26.092 router-re0 fpc3 Cmerror: Level 2 count 1 (occur_count 1 clear_count 0)crossed threshold 1 action 0x3
Mar 24 13:35:26.096 router-re0 chassisd[13184]: ASIC Error detected errorno 0x002400c1
Mar 24 13:35:26.096 router-re0 fpc3 cmerror_take_action_helper: performing action 1 for level 2 err_id 0x2400c1
Mar 24 13:35:26.098 router-re0 inetd[19496]: Number of tftp connections at max limit (1)
Mar 24 13:35:26.101 router-re0 fpc3 cmerror_take_action_helper: performing action 2 for level 2 err_id 0x2400c1
Mar 24 13:35:27.478 router-re0 fpc3 Cmerror Op Set: EA[1:0]::FATAL ERROR!! from EA[1:0]: HMCIF Rx: Link0: A response packet with a FATAL state is received from HMC - State: 0x7f, Count 1
<..>
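The final "Cmerror Op Set" line in the sample above carries the fields that identify the event. As a minimal sketch (the function name, group names, and regular expression are illustrative assumptions modeled on the sample line, not a Juniper tool), the line can be matched and its fields extracted as follows. The last statement also shows that the decimal errid 2359489 in the "processing the task" line is the same value as the hexadecimal error 0x2400c1 reported by the FPC and chassisd.

```python
import re

# Illustrative pattern modeled on the sample syslog line in this article.
HMC_FATAL = re.compile(
    r"(?P<fpc>fpc\d+) Cmerror Op Set: (?P<module>EA\[\d+:\d+\])::FATAL ERROR!! "
    r"from .*?HMCIF Rx: Link(?P<link>\d+): "
    r"A response packet with a FATAL state is received from HMC - "
    r"State: (?P<state>0x[0-9a-fA-F]+), Count (?P<count>\d+)"
)

def parse_hmc_fatal(line):
    """Return the event fields as a dict, or None if the line does not match."""
    match = HMC_FATAL.search(line)
    return match.groupdict() if match else None

sample = (
    "Mar 24 13:35:27.478 router-re0 fpc3 Cmerror Op Set: EA[1:0]::FATAL ERROR!! "
    "from EA[1:0]: HMCIF Rx: Link0: A response packet with a FATAL state is "
    "received from HMC - State: 0x7f, Count 1"
)

event = parse_hmc_fatal(sample)
print(event)

# The errid in the "processing the task" line is decimal; convert it to hex
# to compare it with the "Error (0x2400c1)" and chassisd "errorno" lines.
print(hex(2359489))
```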

Indications:

  • PFE forwarding is permanently impacted, and the MPC is restarted automatically to recover from this condition.
  • The fatal register dump is printed in Junos 15.1F7, 16.1R4 or 16.2R2 to provide additional information.
  • In Junos OS 17.3R1 or higher, the default action is disable-pfe instead of an MPC restart, and an alarm is raised.
  • If the inline-jflow application is used, the FPC might crash or cause high CPU utilization upon a convergence event; see PR1407506.
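The release-dependent default action listed above can be expressed as a small check. This is a rough sketch only: the function name is hypothetical, and it assumes that a release string such as "17.3R1" can be compared by its major, minor, and release numbers while ignoring the release-type letter and any service-release suffix.

```python
import re

def default_hmc_fatal_action(junos_version):
    """Return the default action described in this article for a release string.

    Assumption: versions look like '<major>.<minor><letter><number>', for
    example '17.3R1', and the release-type letter is ignored when comparing.
    """
    match = re.match(r"(\d+)\.(\d+)[A-Z](\d+)", junos_version)
    if not match:
        raise ValueError(f"unrecognized version string: {junos_version!r}")
    major, minor, number = map(int, match.groups())
    # 17.3R1 or higher: disable-pfe and an alarm; earlier: MPC restart.
    if (major, minor, number) >= (17, 3, 1):
        return "disable-pfe"
    return "MPC restart"

print(default_hmc_fatal_action("16.1R4"))
print(default_hmc_fatal_action("17.3R1"))
```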

 

Cause:

This indicates that the internal logic in the Hybrid Memory Cube (HMC) has encountered an error from which it cannot recover without a reset.

Solution:



Perform these steps to determine the cause and resolve the problem (if any).  Continue through each step until the problem is resolved.

  1. Collect the show command output.

    Capture the output to a file (in case you have to open a technical support case). To do this, configure your SSH client or terminal emulator to log the session.

    show log messages
    show log chassisd
    start shell pfe network <fpc#>
    show nvram
    show syslog messages
    exit

  2. Analyze the show command output.

    In the 'show log messages' output, review the events that occurred at or just before the appearance of the "response packet with a FATAL state is received from HMC" message. Frequently, these events help identify the cause.

This article is indexed in KB31893 - Primary Index of Articles for Troubleshooting PFE ASIC Syslog Events; tag EACHIPTSG


Tip: When looking at an event in the logs, it is important to focus on the first error message in a collection of syslog messages. The first error message is usually the cause of all the follow-on error messages. The follow-on collateral damage error messages can be ignored.
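To apply the tip to a saved copy of the 'show log messages' output, the following sketch returns the lines logged shortly before the fatal HMC message so that the earliest one can be inspected first. The function names, the five-second default window, and the placeholder year are assumptions made for illustration; the sample lines are taken from the Symptoms section.

```python
from datetime import datetime, timedelta

MARKER = "A response packet with a FATAL state is received from HMC"

def parse_ts(line, year=2000):
    """Parse the leading 'Mon DD HH:MM:SS.mmm' stamp; return None if absent."""
    stamp = " ".join(line.split()[:3])
    try:
        # Syslog lines carry no year, so a placeholder year is supplied.
        return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S.%f")
    except ValueError:
        return None

def events_before(lines, marker=MARKER, window_s=5.0):
    """Return timestamped lines logged up to window_s seconds before the
    first line containing marker, in original order (marker line last)."""
    for i, line in enumerate(lines):
        t0 = parse_ts(line)
        if marker in line and t0 is not None:
            start = t0 - timedelta(seconds=window_s)
            return [l for l in lines[:i + 1]
                    if (t := parse_ts(l)) is not None and start <= t <= t0]
    return []

lines = [
    "Mar 24 13:35:26.009 router-re0 fpc3 cmtfpc_hmc_fatal_dump: generating fatal register dump, HMC 64",
    "Mar 24 13:35:26.088 router-re0 fpc3 Error (0x2400c1), module: EA[1:0], type: HMCIF RX link int reg HMC fatal err",
    "Mar 24 13:35:26.096 router-re0 chassisd[13184]: ASIC Error detected errorno 0x002400c1",
    "Mar 24 13:35:27.478 router-re0 fpc3 Cmerror Op Set: EA[1:0]::FATAL ERROR!! from EA[1:0]: HMCIF Rx: Link0: A response packet with a FATAL state is received from HMC - State: 0x7f, Count 1",
]

context = events_before(lines)
print(context[0])  # earliest message in the window; start the review here
```

Widening window_s pulls in older events if nothing relevant appears in the first few seconds before the fatal message.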

 

Modification History:
2019-08-13: updated internal section for simulation and event-policy to restart FPC instead of disable-pfe
2019-03-01: Added impact of PR/1407506 upon HMC Fatal Error condition 0x7F