This document explains the meaning of the "FPC x Major Errors" alarm with MIC error code 0x1b0001 and describes how to troubleshoot it.
Active Alarms:
user@host> show system alarms
1 alarms currently active
Alarm time Class Description
2020-04-17 09:01:08 UTC Major FPC 11 Major Errors - MIC Error code: 0x1b0001
Syslog:
Apr 17 09:01:08 2020 craftd[15077]: Major alarm set, FPC 11 Major Errors - MIC Error code: 0x1b0001
Apr 17 09:01:08 2020 alarmd[16732]: Alarm set: FPC color=RED, class=CHASSIS, reason=FPC 11 Major Errors - MIC Error code: 0x1b0001
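When many alarms are logged, it can help to pull the FPC slot and the MIC error code out of the alarm text programmatically. The following is a minimal sketch, assuming the alarm text always follows the "FPC <slot> Major Errors - MIC Error code: <code>" form shown above; the function name parse_mic_alarm is hypothetical and is not part of Junos.

```python
import re

# Matches the alarm text as it appears in the alarm and syslog output above.
ALARM_RE = re.compile(
    r"FPC (?P<slot>\d+) Major Errors - MIC Error code: (?P<code>0x[0-9a-fA-F]+)"
)

def parse_mic_alarm(line):
    """Return (fpc_slot, error_code) from an alarm line, or None if no match."""
    match = ALARM_RE.search(line)
    if match is None:
        return None
    return int(match.group("slot")), match.group("code").lower()

sample = ("Apr 17 09:01:08 2020 craftd[15077]: Major alarm set, "
          "FPC 11 Major Errors - MIC Error code: 0x1b0001")
print(parse_mic_alarm(sample))  # prints (11, '0x1b0001')
```

The slot number returned here is the value used as the fpc target in the commands that follow.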
To troubleshoot and attempt to resolve the error, perform the following steps:
-
Check the alarm code.
Apr 17 09:01:08 2020 craftd[15077]: Major alarm set, FPC 11 Major Errors - MIC Error code: 0x1b0001
-
Identify the error module on the FPC/MIC:
user@host> request pfe execute command "show cmerror module" target fpc11
Example: the error module number for SMIC is 53.
-
Check the active errors for the module:
user@host> request pfe execute command "show cmerror module 53" target fpc11
Module (53) (SMIC(0/1))
Error-id PFE Level Threshold Count Occured Cleared Last-occurred(ms ago) Description
--------------------------------------------------------------------------------------------
0x1b0001 0 Major 1 1 1 0 965815587 Device Voltage Level High <-- Concerned error code
0x1b0002 0 Major 1 0 0 0 0 Device Temperature Level High
-
Check the Error Module for specific error code:
user@host> request pfe execute command "show cmerror module 53 error 0x1b0001" target fpc11
Error-id : 0x1b0001
Description : Device Voltage Level High
PFE : 0
Level : Major
Count : 1
Threshold : 1
Error Limit : 1
Occur Count : 1
Clear Count : 0
Last-occurred(ms ago) : 3893900384
Sub-Error Information:
---------------------------
Id State Description
---------------------------
65536 Inactive MIC8-PMBUS-3V3-MAX20731
65537 Active MIC8-PMBUS-1V0-MAX20731_A
Logs:
---------------------------------------------------------
Index Time Sub-Err State Description
----------------------------------------------------------
0 03/08/19 07:23:59 65537 Set MIC8-PMBUS-1V0-MAX20731_A voltage: (265.764) exceeded threshold (1.120)
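The sub-error log entry carries the rail name, the reported voltage, and the threshold. The sketch below extracts those three values so that they can be compared with a live reading later. It assumes the log line keeps the "<device> voltage: (<reading>) exceeded threshold (<threshold>)" wording shown above; parse_voltage_log is a hypothetical helper name.

```python
import re

# Matches the description field of the sub-error log entry shown above.
LOG_RE = re.compile(
    r"(?P<device>\S+) voltage: \((?P<reading>\d+\.\d+)\) "
    r"exceeded threshold \((?P<threshold>\d+\.\d+)\)"
)

def parse_voltage_log(line):
    """Return (device, reading, threshold) from a log line, or None if no match."""
    match = LOG_RE.search(line)
    if match is None:
        return None
    return (
        match.group("device"),
        float(match.group("reading")),
        float(match.group("threshold")),
    )

sample = ("0 03/08/19 07:23:59 65537 Set MIC8-PMBUS-1V0-MAX20731_A "
          "voltage: (265.764) exceeded threshold (1.120)")
device, reading, threshold = parse_voltage_log(sample)
print(device, reading > threshold)  # prints MIC8-PMBUS-1V0-MAX20731_A True
```

The logged reading only records what was measured when the error was raised; whether the rail is still high has to be confirmed against the current voltage level.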
This alarm may be software triggered, in which case it can be a false alarm, or it may reflect a transient issue with the MIC or FPC.
To confirm which, check the voltage levels for the MICs on FPC 11.
-
Check Voltage level on MIC:
user@host> request pfe execute command "show smic 0 faults" target fpc11
....
MIC VOLTAGE FAULTS :stout-12xqsfpp-mic-0
================================================================================================================
Device Current Volt Monitoring Volt Thres High Volt Action Taken
================================================================================================================
MIC8-PMBUS-3V3-MAX20731 3.301 Enabled 3.600 No None
MIC8-PMBUS-1V0-MAX20731_A 1.026 Enabled 1.120 No None <-- normal level
MIC8-PMBUS-1V0-MAX20731_B 1.035 Enabled 1.120 No None
user@host> request pfe execute command "show smic 1 faults" target fpc11
....
MIC VOLTAGE FAULTS :stout-12xqsfpp-mic-1
================================================================================================================
Device Current Volt Monitoring Volt Thres High Volt Action Taken
================================================================================================================
MIC8-PMBUS-3V3-MAX20731 3.301 Enabled 3.600 No None
MIC8-PMBUS-1V0-MAX20731_A 1.026 Enabled 1.120 No None <-- normal level
MIC8-PMBUS-1V0-MAX20731_B 1.035 Enabled 1.120 No None
The output above is what a healthy MIC looks like with respect to voltage levels. If the output instead shows a high voltage level, the alarm could be due to a hardware issue. To recover the FPC from a transient hardware issue, restart the MIC and the FPC during a safe maintenance window.
If the output shows a healthy MIC with no high-voltage triggers, the alarm is most likely caused by a software issue. Perform steps 6 and 7 (MIC restart and FPC restart) before moving to step 8, in order to rule out a hardware fault.
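The voltage check above can be automated by comparing each rail's current voltage with its threshold. This is a minimal sketch, assuming the table rows keep the column order shown in the "show smic ... faults" output (Device, Current Volt, Monitoring, Volt Thres, High Volt, Action Taken) and that "No" in the High Volt column means the rail is within limits. The function names are hypothetical.

```python
import re

# One data row: device, current volt, monitoring, threshold, high-volt flag, action.
ROW_RE = re.compile(
    r"^(?P<device>\S+)\s+(?P<volt>\d+\.\d+)\s+(?P<monitoring>\S+)\s+"
    r"(?P<threshold>\d+\.\d+)\s+(?P<high_volt>\S+)\s+(?P<action>\S+)"
)

def parse_fault_rows(output):
    """Return one dict per voltage row; headers and separators are skipped."""
    rows = []
    for line in output.splitlines():
        match = ROW_RE.match(line.strip())
        if match is None:
            continue
        rows.append({
            "device": match.group("device"),
            "volt": float(match.group("volt")),
            "threshold": float(match.group("threshold")),
            "high_volt": match.group("high_volt"),
        })
    return rows

def rails_over_threshold(rows):
    """Return the devices whose reading exceeds the threshold or is flagged."""
    return [
        row["device"]
        for row in rows
        if row["volt"] > row["threshold"] or row["high_volt"] != "No"
    ]

sample = """\
MIC VOLTAGE FAULTS :stout-12xqsfpp-mic-0
Device Current Volt Monitoring Volt Thres High Volt Action Taken
MIC8-PMBUS-3V3-MAX20731 3.301 Enabled 3.600 No None
MIC8-PMBUS-1V0-MAX20731_A 1.026 Enabled 1.120 No None
MIC8-PMBUS-1V0-MAX20731_B 1.035 Enabled 1.120 No None
"""
print(rails_over_threshold(parse_fault_rows(sample)))  # prints []
```

An empty list corresponds to the healthy case; any device name in the list points toward the hardware path described above.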
-
MIC Restart:
user@host> request chassis mic (offline | online) fpc-slot slot-number mic-slot slot-number
If the MIC restart does not resolve the issue, restart the FPC to check whether the issue is with the voltage detected at the MIC slot.
-
Restart the FPC:
user@host> request chassis fpc (offline | online | restart) slot slot-number
-
Possibility of a software issue:
If restarting the MIC and the FPC does not resolve the issue, check the FPC type. If the FPC is an MPC5E, MPC7, MPC8, or MPC9, this may be a software issue in which a transient high voltage is detected on the MPC and the alarm does not clear even after the voltage stabilizes.
Check the Junos version. On releases later than Junos 17.4, this false alarm should clear on its own once the voltage stabilizes.
If the device is running a release later than 17.4 and the error is still seen, or if the alarm persists after an MPC restart and the smic faults output shows a high voltage level, contact JTAC, as this could be an issue with the MPC card.
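The release check can be sketched as a comparison on the major and minor numbers of the Junos version string. This is only an illustration: it assumes that "above 17.4" means a major.minor release strictly later than 17.4, which the text does not spell out, and the version strings and function names below are hypothetical examples.

```python
import re

def junos_major_minor(version):
    """Return (major, minor) parsed from the start of a Junos version string."""
    match = re.match(r"(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized Junos version: {version!r}")
    return int(match.group(1)), int(match.group(2))

def is_later_than_17_4(version):
    """Assumption: 'above 17.4' means major.minor strictly greater than 17.4."""
    return junos_major_minor(version) > (17, 4)

# Illustrative version strings only.
for version in ("17.3R3", "17.4R2", "18.2R1"):
    print(version, is_later_than_17_4(version))
```

Under this reading, a 17.4 maintenance release is treated as not yet covered; adjust the comparison if the fix applies to 17.4 itself.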
Before contacting JTAC, collect the following logs in addition to the ones above:
user@host> show log messages
user@host> show log chassisd
user@host> show chassis environment
user@host> show chassis hardware
user@host> show chassis alarms