[MX] Syslog message - LUCHIP(0)/0/LKUP_ASIC_DOUBLE_BIT_ECC (0x40020)

Article ID: KB37350 KB Last Updated: 24 Sep 2021 Version: 2.0
Summary:

This article explains the meaning of the "LUCHIP(0)/0/LKUP_ASIC_DOUBLE_BIT_ECC (0x40020)" syslog message that may be observed on MX routers and details the action that must be taken to clear it. 

Symptoms:

An LKUP_ASIC_DOUBLE_BIT_ECC syslog message usually reports a transient hardware issue. When it occurs, the following logs and/or alarms are registered:

May 21 01:14:57  mx960_re0 tnp.tftpd[75119]: TFTPD_CONNECT_INFO: TFTP write from address 19 port 1 file /var/tmp/ppe_trap_fpc3_LU_0_00
May 21 01:14:57  mx960_re0 fpc3 LUCHIP(0) RMC 0  Uncorrectable ECC 0x0000162240000003 @ 0x6183a6, cnt 1, syn 0x0 - EDMEM[0x1860e8c] Correctable ECC @ 0x6183a6, cnt 1, syn 0x0 - EDMEM[0x1860e8c]
May 21 01:14:57  mx960_re0 fpc3 LUCHIP(0) PPE_10 Errors sync xtxn error
May 21 01:14:57  mx960_re0 fpc3 PPE Sync XTXN Err Trap:  Count 1, PC 6dd,     0x06dd:  mac_age_load_dw0
May 21 01:14:58  mx960_re0 fpc3 Error: /fpc/3/pfe/0/cm/0/LUCHIP(0)/0/LKUP_ASIC_DOUBLE_BIT_ECC (0x40020), scope: pfe, category: functional, severity: major, module: LUCHIP(0), type: Double-bit ECC error
May 21 01:14:58  mx960_re0 fpc3 Performing action cmalarm for error /fpc/3/pfe/0/cm/0/LUCHIP(0)/0/LKUP_ASIC_DOUBLE_BIT_ECC (0x40020) in module: LUCHIP(0) with scope: pfe category: functional level: major
May 21 01:14:58  mx960_re0 alarmd[92470]: Alarm set: FPC color=RED, class=CHASSIS, reason=FPC 3 Major Errors
May 21 01:14:58  mx960_re0 craftd[6286]:  Major alarm set, FPC 3 Major Errors
May 21 01:14:58  mx960_re0 mosquitto[92462]: Allocated node for mosq : 0x849780, Client : client-2-NA_periodic_subscriber, topic : /1002/1/0, max bytes in queue : 10485760, hash_size is 500, hashIndex is 0x880000
May 21 01:14:58  mx960_re0 fpc3 Performing action get-state for error /fpc/3/pfe/0/cm/0/LUCHIP(0)/0/LKUP_ASIC_DOUBLE_BIT_ECC (0x40020) in module: LUCHIP(0) with scope: pfe category: functional level: major
May 21 01:14:59  mx960_re0 fpc3 ttrace_tracer_entryp(503): TTRACE PPE5 Context3 now in-active
May 21 01:14:59  mx960_re0 fpc3 PPE Sync XTXN Err Trap:  Count 1, PC 6dd,     0x06dd:  mac_age_load_dw0
May 21 01:14:59  mx960_re0 fpc3 PPE Thread Timeout Trap:  Count 1, PC 71c,     0x071c:  hash_walk_check_bucket_empty
May 21 01:14:59  mx960_re0 fpc3 LUCHIP(0) PPE_5 Errors thread timeout error
May 21 01:14:59  mx960_re0 fpc3 PPE Sync XTXN Err Trap:  Count 1, PC 6dd,     0x06dd:  mac_age_load_dw0
May 21 01:14:59  mx960_re0 fpc3 PPE Thread Timeout Trap:  Count 1, PC 71c,     0x071c:  hash_walk_check_bucket_empty
May 21 01:14:59  mx960_re0 fpc3 LUCHIP(0) RMC 0  Uncorrectable ECC 0x0000162200000003 @ 0x6183a6, cnt 1, syn 0x0 - EDMEM[0x1860e8c]
May 21 01:15:00  mx960_re0 /usr/sbin/cron[75127]: (root) CMD (newsyslog -X)
May 21 01:15:00  mx960_re0 /usr/sbin/cron[75128]: (root) CMD (   /usr/libexec/atrun)
May 21 01:15:01  mx960_re0 fpc3 LUCHIP(0) PPE_4 Errors sync xtxn error

labroot@mx960_re0> show chassis alarms
1 alarms currently active
Alarm time               Class  Description
2021-05-21 01:14:58 UTC  Major  FPC 3 Major Errors
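
When many FPCs log to the same file, it can help to pull the affected FPC, PFE, and error ID out of the "Error:" line automatically. The following is a minimal Python sketch, not a Juniper tool. The regular expression and the function name parse_cmerror_line are assumptions based only on the single sample line shown above, and other Junos releases may format the line differently.

```python
import re

# Layout inferred from the sample "Error:" syslog line in this article.
# This is an assumption, not a documented syslog format.
CMERROR_RE = re.compile(
    r"fpc(?P<fpc>\d+) Error: "
    r"(?P<identifier>/fpc/\d+/pfe/(?P<pfe>\d+)/\S+/(?P<name>[A-Z0-9_]+)) "
    r"\((?P<error_id>0x[0-9a-fA-F]+)\), "
    r"scope: (?P<scope>\w+), category: (?P<category>\w+), "
    r"severity: (?P<severity>\w+), module: (?P<module>[^,]+), "
    r"type: (?P<type>.+)$"
)


def parse_cmerror_line(line):
    """Return a dict of fields for a cmerror 'Error:' line, or None."""
    match = CMERROR_RE.search(line.rstrip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["fpc"] = int(fields["fpc"])
    fields["pfe"] = int(fields["pfe"])
    fields["error_id"] = int(fields["error_id"], 16)
    return fields


sample = (
    "May 21 01:14:58  mx960_re0 fpc3 Error: "
    "/fpc/3/pfe/0/cm/0/LUCHIP(0)/0/LKUP_ASIC_DOUBLE_BIT_ECC (0x40020), "
    "scope: pfe, category: functional, severity: major, "
    "module: LUCHIP(0), type: Double-bit ECC error"
)

info = parse_cmerror_line(sample)
print(info["fpc"], info["pfe"], hex(info["error_id"]), info["name"])
```

Lines that do not carry the "Error:" summary, such as the PPE trap lines, return None and can be skipped.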

Cause:

The LKUP_ASIC_DOUBLE_BIT_ECC (Double-bit ECC) error on LUCHIP(x) is caused by a transient hardware memory error.

"ECC" stands for Error Checking and Correction, which permits detection of memory errors as well as correction of certain errors. On-chip memory is ECC protected: single-bit errors can be corrected, whereas double-bit errors can be detected but not corrected.

When a double-bit ECC error is detected, a major alarm is raised. Usually, restarting the affected FPC clears the error because the memory is re-initialized. If the error recurs after a restart, a hardware replacement is recommended.

The corrected errors are transparent to the system. If the error is encountered just once or infrequently, it can be ignored.
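
To illustrate why a single-bit error can be corrected while a double-bit error can only be detected, here is a small Python sketch of a textbook SECDED code (extended Hamming, 4 data bits in an 8-bit codeword). This article does not describe the actual ECC scheme used by the LU chip, so the code is a generic illustration only. The function names encode and decode are hypothetical.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit extended Hamming (SECDED) codeword."""
    data = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8  # index 0: overall parity, 1..7: Hamming(7,4) positions
    code[3], code[5], code[6], code[7] = data
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) % 2  # makes the total number of 1 bits even
    return code


def decode(code):
    """Return (status, nibble); nibble is None if the error is uncorrectable."""
    code = list(code)
    syndrome = 0
    for position in range(1, 8):
        if code[position]:
            syndrome ^= position  # XOR of the positions holding a 1 bit
    parity_odd = sum(code) % 2 == 1
    if syndrome == 0 and not parity_odd:
        status = "ok"
    elif parity_odd:
        # One bit flipped; the syndrome is its position (0 = parity bit).
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even overall parity: two bits flipped.
        return "uncorrectable", None
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, nibble


word = encode(0b1011)

single = list(word)
single[5] ^= 1                      # flip one bit
print(decode(single))               # ('corrected', 11)

double = list(word)
double[5] ^= 1
double[6] ^= 1                      # flip two bits
print(decode(double))               # ('uncorrectable', None)
```

In the single-bit case the syndrome points at the flipped position, so the data is recovered. With two flipped bits the syndrome no longer identifies a unique position, so the decoder can only report the error, which corresponds to the "Uncorrectable ECC" wording in the logs.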

Solution:

Collect the following command outputs from the affected FPC (in this case FPC 3 based on the above syslog) to verify its status:

  • RSI (output of "request support information")
  • Contents of /var/log
  • request pfe execute command "show nvram" target fpc3
  • request pfe execute command "show syslog messages" target fpc3
  • request pfe execute command "show cmerror module brief" target fpc3  

A snippet of the above command output is given here for reference:

labroot@mx960_re0> request pfe execute command "show cmerror module brief" target fpc3 -----> To identify the module ID
SENT: Ukern command: show cmerror module brief

-------------------------------------------------------------------------
Module  Name              Active Errors  PFE     Callback    ModuleData
                                         Specific  Function
-------------------------------------------------------------------------
1       PQ3 Chip          0              Yes       0x00000000  0x00000000
2       Host Loopback     0              No        0x00000000  0x4680e918
3       PCIe Error        0              No        0x00000000  0x00000000
4       CM[0]             0              No        0x421be25c  0x46666c44
5       CM[1]             0              No        0x421be25c  0x46666c7c
6       CM[2]             0              No        0x421be25c  0x46666cb4
7       CM[3]             0              No        0x421be25c  0x46666cec
8       LUCHIP(0)         12             No        0x00000000  0x4680e218 -------> LUCHIP 0 has 12 active errors and is using module ID 8.
9       TOE-LU-0:0:0      0              No        0x00000000  0x4680e1d8
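
When the module table is long, the rows with a non-zero "Active Errors" value can be picked out with a short script. The Python sketch below assumes the column layout shown in the table above. The regular expression and the function name modules_with_active_errors are hypothetical and may need adjusting for other platforms or releases.

```python
import re

# Row layout assumed from the sample table: ID, name (may contain spaces),
# active error count, PFE-specific flag, then two hexadecimal columns.
ROW_RE = re.compile(
    r"^\s*(?P<module_id>\d+)\s+(?P<name>.+?)\s+(?P<active>\d+)\s+"
    r"(?:Yes|No)\s+0x[0-9a-fA-F]+"
)


def modules_with_active_errors(output):
    """Return (module_id, name, active_errors) for rows with active errors."""
    hits = []
    for line in output.splitlines():
        match = ROW_RE.match(line)
        if match and int(match.group("active")) > 0:
            hits.append(
                (int(match.group("module_id")),
                 match.group("name"),
                 int(match.group("active")))
            )
    return hits


table = """\
1       PQ3 Chip          0              Yes       0x00000000  0x00000000
2       Host Loopback     0              No        0x00000000  0x4680e918
8       LUCHIP(0)         12             No        0x00000000  0x4680e218
9       TOE-LU-0:0:0      0              No        0x00000000  0x4680e1d8
"""

print(modules_with_active_errors(table))  # [(8, 'LUCHIP(0)', 12)]
```

The module ID returned here (8 in this example) is the value to use in the "show cmerror module ... detail" command.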

labroot@mx960_re0> request pfe execute command "show cmerror module x detail" target fpc3 -----> Where x is the module ID from the previous command output
SENT: Ukern command: show cmerror module 8 detail

Module (8) (LUCHIP(0))
PFE support 0
Get state cb  0x423a1edc platform action cb 0x0 control cb 0x0

Error-id     PFE  Scope   Category    Level  Threshold  Count  Occurred  Cleared  Last-occurred(ms ago)  Name
-------------------------------------------------------------------------------------------------------------
0x040040      0   pfe     functional  Major      1        0       0        0        0                   LKUP_ASIC_HSL2_MAJOR_CRC_ERROR
0x040048      0   pfe     functional  Fatal      1        0       0        0        0                   LKUP_ASIC_HSL2_FATAL_CRC_ERROR
0x040018      0   pfe     functional  Minor      1        0       0        0        0                   LKUP_ASIC_SINGLE_BIT_ECC
0x040038      0   pfe     functional  Major      1        0       0        0        0                   LKUP_ASIC_PPE_LMEM_ERR
0x040020      0   pfe     functional  Major      1        12      12       0        69066850            LKUP_ASIC_DOUBLE_BIT_ECC  ------> Shows that the error is major and exceeded its threshold of 1, whether any occurrences were cleared, the PFE ID that reported the error, and the time since its last occurrence
0x040030      0   pfe     functional  Major      1        0       0        0        0                   LKUP_ASIC_SINGLE_BIT_MAJOR_ECC
0x040008      0   board   functional  Major      1        0       0        0        0                   CMERROR_LUCHIP_ERROR


labroot@mx960_re0> request pfe execute command "show cmerror module 8 error ERROR-ID pfe x" target fpc3 -----> Where ERROR-ID is the Error-id and x is the PFE from the previous output
SENT: Ukern command: show cmerror module 8 error 0x040020 pfe 0
Error-id              : 0x40020
Error Name            : LKUP_ASIC_DOUBLE_BIT_ECC
Identifier            : /fpc/3/pfe/0/cm/0/LUCHIP(0)/0/LKUP_ASIC_DOUBLE_BIT_ECC
Description           : Double-bit ECC error
State                 : enabled
Scope                 : pfe
Category              : functional
PFE                   : 0
Configured Level      : Major
Default Level         : Major
Count                 : 12
Threshold             : 1
Error Limit           : 0
Occur Count           : 12
Clear Count           : 0
Last-occurred(ms ago) : 69116953

Logs:
----------------------------------------------------------
Index  Time            Sub-Err   State    Description -------> Shows the timestamp and details of the 12 errors that were logged for module 8 (LUCHIP(0)); any errors that were cleared appear as "cleared" under the State column
----------------------------------------------------------
0      05/21/21 01:15:14    0         Set      LUCHIP(0) RMC 0  Uncorrectable ECC 0x0000162200000003 @ 0x6183a6, cnt 2, syn 0x0 - EDMEM[0x1860e8c]
1      05/21/21 01:15:14    0         Set      LUCHIP(0) RMC 0  Uncorrectable ECC 0x0000162200000003 @ 0x6183a6, cnt 1, syn 0x0 - EDMEM[0x1860e8c]
2      05/21/21 01:15:14    0         Set      LUCHIP(0) RMC 0  Uncorrectable ECC 0x0000162200000003 @ 0x6183a6, cnt 1, syn 0x0 - EDMEM[0x1860e8c]
3      05/21/21 01:15:14    0         Set      LUCHIP(0) RMC 0  Uncorrectable ECC 0x0000162200000003 @ 0x6183a6, cnt 1, syn 0x0 - EDMEM[0x1860e8c]
4      05/21/21 01:15:14    0         Set      LUCHIP(0) RMC 0  Uncorrectable ECC 0x0000162200000003 @ 0x6183a6, cnt 2, syn 0x0 - EDMEM[0x1860e8c]
5      05/21/21 01:15:14    0         Set      LUCHIP(0) RMC 0  Uncorrectable ECC 0x0000162200000003 @ 0x6183a6, cnt 1, syn 0x0 - EDMEM[0x1860e8c]
6      05/21/21 01:15:14    0         Set      LUCHIP(0) RMC 0  Uncorrectable ECC 0x0000162200000003 @ 0x6183a6, cnt 1, syn 0x0 - EDMEM[0x1860e8c]
7      05/21/21 01:15:14    0         Set      LUCHIP(0) RMC 0  Uncorrectable ECC 0x0000162200000003 @ 0x6183a6, cnt 2, syn 0x0 - EDMEM[0x1860e8c]
8      05/21/21 01:15:14    0         Set      LUCHIP(0) RMC 0  Uncorrectable ECC 0x0000162200000003 @ 0x6183a6, cnt 1, syn 0x0 - EDMEM[0x1860e8c]
9      05/21/21 01:15:14    0         Set      LUCHIP(0) RMC 0  Uncorrectable ECC 0x0000162200000003 @ 0x6183a6, cnt 1, syn 0x0 - EDMEM[0x1860e8c]
10     05/21/21 01:15:14    0         Set      LUCHIP(0) RMC 0  Uncorrectable ECC 0x0000162200000003 @ 0x6183a6, cnt 1, syn 0x0 - EDMEM[0x1860e8c]
11     05/21/21 01:15:14    0         Set      LUCHIP(0) RMC 0  Uncorrectable ECC 0x0000162240000003 @ 0x6183a6, cnt 1, syn 0x0 - EDMEM[0x1860e8c]
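
The "Last-occurred(ms ago)" value is easier to read as a duration. The Python sketch below parses the "Key : value" portion of the detail output above and converts the millisecond value. The helper names parse_detail and last_occurred_at are hypothetical, and the collection time used in the last step is an assumed value for illustration, because the article does not state when the command was run.

```python
from datetime import datetime, timedelta


def parse_detail(output):
    """Parse 'Key : value' lines into a dict (key/value section only)."""
    fields = {}
    for line in output.splitlines():
        key, separator, value = line.partition(":")
        if separator and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def last_occurred_at(collected_at, ms_ago):
    """Estimate the wall-clock time of the last occurrence."""
    return collected_at - timedelta(milliseconds=ms_ago)


detail = """\
Error-id              : 0x40020
Error Name            : LKUP_ASIC_DOUBLE_BIT_ECC
Count                 : 12
Threshold             : 1
Occur Count           : 12
Clear Count           : 0
Last-occurred(ms ago) : 69116953
"""

fields = parse_detail(detail)
ms_ago = int(fields["Last-occurred(ms ago)"])
age = timedelta(milliseconds=ms_ago)
print(age)  # 19:11:56.953000

# Hypothetical collection time, used only to show the calculation.
collected_at = datetime(2021, 5, 21, 20, 27, 11)
print(last_occurred_at(collected_at, ms_ago))
```

In this example the last occurrence was a little over 19 hours before the output was collected, which helps when correlating the counter with the syslog timestamps.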

In the above example, restart FPC 3 to clear the errors. If the errors persist even after FPC reboot, a hardware replacement is recommended. Contact Support for any assistance. 

Modification History:
2021-09-24: Updated 'Cause' section for more clarity.