[QFX] FPC crashes on QFX5200 when high volume of broadcast traffic is received via em0

  [KB34098]


Summary:

This article describes a scenario in which the FPC crashes on a QFX5200 when a high volume of broadcast traffic is received via em0.

Symptoms:

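The forwarding plane resets when the QFX5200 receives a high volume of broadcast traffic via em0. The FPC uptime is much shorter than the system uptime, and chassisd logs an FPC restart caused by an NMI timeout: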

root@> show chassis fpc detail
Slot 0 information:
  State                               Online
  Total CPU DRAM                 3907 MB
  Total SRAM                        0 MB
  Total SDRAM                       0 MB
  Start time                          2019-03-01 16:18:12 UTC
  Uptime                              2 hours, 16 minutes, 25 seconds

Mar  1 16:16:46.418 2019   chassisd[4375]: CHASSISD_SNMP_TRAP7: SNMP trap generated: FRU removal (jnxFruContentsIndex 7, jnxFruL1Index 1, jnxFruL2Index 0, jnxFruL3Index 0, jnxFruName FPC: QFX5200-32C-32Q @ 0/*/*, jnxFruType 3, jnxFruSlot 0)
Mar  1 16:16:46.495 2019   mib2d[5048]: SNMP_TRAP_LINK_DOWN: ifIndex 515, ifAdminStatus up(1), ifOperStatus down(2), ifName et-0/0/1
Mar  1 16:16:46.421 2019   chassisd[4375]: fru_nmi_timer: Restart FPC 0 due to NMI timeout
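
The log signature above can be searched for automatically. The following is a minimal Python sketch, assuming the messages log has been exported as plain text in the format shown in this article; the function name find_nmi_restarts is hypothetical and is not part of any Juniper tooling.

```python
import re

# Matches the chassisd message shown in the log excerpt above.
NMI_RESTART = re.compile(r"fru_nmi_timer: Restart FPC (\d+) due to NMI timeout")


def find_nmi_restarts(lines):
    """Return (line, fpc_slot) pairs for every NMI-timeout FPC restart."""
    hits = []
    for line in lines:
        match = NMI_RESTART.search(line)
        if match:
            hits.append((line.strip(), int(match.group(1))))
    return hits


# Two lines copied from the log excerpt in this article.
sample = [
    "Mar  1 16:16:46.495 2019   mib2d[5048]: SNMP_TRAP_LINK_DOWN: ifIndex 515, "
    "ifAdminStatus up(1), ifOperStatus down(2), ifName et-0/0/1",
    "Mar  1 16:16:46.421 2019   chassisd[4375]: fru_nmi_timer: "
    "Restart FPC 0 due to NMI timeout",
]

for line, slot in find_nmi_restarts(sample):
    print(f"FPC {slot} restarted after an NMI timeout: {line}")
```

A match only shows that the FPC restarted because of an NMI timeout; it does not by itself prove that broadcast traffic on em0 was the trigger.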

The system itself stays up; only the FPC restarts:
root@> show system uptime no-forwarding
 
Current time: 2019-03-01 18:34:37 UTC
Time Source:  NTP CLOCK
System booted: 2018-10-25 22:10:38 UTC (18w0d 20:23 ago)
Protocols started: 2018-10-25 22:11:23 UTC (18w0d 20:23 ago)
Last configured: 2019-02-20 13:41:09 UTC (1w2d 04:53 ago) by cvicente
6:34PM  up 126 days, 20:24, 2 users, load averages: 0.28, 0.26, 0.25

Cause:

The high volume of broadcast traffic drives CPU utilization so high that Junos does not have enough CPU cycles to poll the status of the PEMs and fans. As a result, the system loses its connection to the fan trays and power supplies and reports them as removed:

Mar  1 16:18:29.590 2019   chassisd[4375]: CHASSISD_SNMP_TRAP7: SNMP trap generated: FRU removal (jnxFruContentsIndex 4, jnxFruL1Index 1, jnxFruL2Index 1, jnxFruL3Index 0, jnxFruName Fan Tray 0 @ 0/0/*, jnxFruType 13, jnxFruSlot 0)
Mar  1 16:18:29.590 2019   chassisd[4375]: CHASSISD_SNMP_TRAP7: SNMP trap generated: FRU removal (jnxFruContentsIndex 4, jnxFruL1Index 1, jnxFruL2Index 2, jnxFruL3Index 0, jnxFruName Fan Tray 1 @ 0/1/*, jnxFruType 13, jnxFruSlot 0)
Mar  1 16:18:29.591 2019   chassisd[4375]: CHASSISD_SNMP_TRAP7: SNMP trap generated: FRU removal (jnxFruContentsIndex 4, jnxFruL1Index 1, jnxFruL2Index 3, jnxFruL3Index 0, jnxFruName Fan Tray 2 @ 0/2/*, jnxFruType 13, jnxFruSlot 0)
Mar  1 16:18:29.591 2019   chassisd[4375]: CHASSISD_SNMP_TRAP7: SNMP trap generated: FRU removal (jnxFruContentsIndex 4, jnxFruL1Index 1, jnxFruL2Index 4, jnxFruL3Index 0, jnxFruName Fan Tray 3 @ 0/3/*, jnxFruType 13, jnxFruSlot 0)
Mar  1 16:18:29.591 2019   chassisd[4375]: CHASSISD_SNMP_TRAP7: SNMP trap generated: FRU removal (jnxFruContentsIndex 4, jnxFruL1Index 1, jnxFruL2Index 5, jnxFruL3Index 0, jnxFruName Fan Tray 4 @ 0/4/*, jnxFruType 13, jnxFruSlot 0)
Mar  1 16:18:29.591 2019   chassisd[4375]: CHASSISD_SNMP_TRAP7: SNMP trap generated: FRU removal (jnxFruContentsIndex 2, jnxFruL1Index 1, jnxFruL2Index 1, jnxFruL3Index 0, jnxFruName Power Supply 0 @ 0/0/*, jnxFruType 7, jnxFruSlot 0)
Mar  1 16:18:29.591 2019   chassisd[4375]: CHASSISD_SNMP_TRAP7: SNMP trap generated: FRU removal (jnxFruContentsIndex 2, jnxFruL1Index 1, jnxFruL2Index 2, jnxFruL3Index 0, jnxFruName Power Supply 1 @ 0/1/*, jnxFruType 7, jnxFruSlot 1)
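
In the excerpt above, every fan tray and both power supplies are reported as removed within the same second, which fits a loss of polling better than a physical removal. The Python sketch below groups FRU removal traps by timestamp to make such bursts visible. It assumes the log format shown in this article; the function names and the min_frus threshold are hypothetical choices made for illustration.

```python
import re
from collections import defaultdict

# Captures the timestamp (to the second) and the FRU name from a
# CHASSISD_SNMP_TRAP7 "FRU removal" line in the format shown above.
FRU_REMOVAL = re.compile(
    r"^(\w{3}\s+\d+ \d{2}:\d{2}:\d{2})\.\d+ .*FRU removal \(.*jnxFruName (.+?) @ "
)


def removals_by_second(lines):
    """Map each timestamp (one-second resolution) to the FRUs removed then."""
    grouped = defaultdict(list)
    for line in lines:
        match = FRU_REMOVAL.search(line.strip())
        if match:
            grouped[match.group(1)].append(match.group(2))
    return dict(grouped)


def looks_like_polling_loss(frus, min_frus=3):
    """Heuristic (an assumption, not a Juniper rule): fan trays and a power
    supply all reported as removed within the same second."""
    has_fan = any(name.startswith("Fan Tray") for name in frus)
    has_psu = any(name.startswith("Power Supply") for name in frus)
    return len(frus) >= min_frus and has_fan and has_psu


# Three lines copied from the log excerpt in this article.
PREFIX = "chassisd[4375]: CHASSISD_SNMP_TRAP7: SNMP trap generated: FRU removal "
sample = [
    "Mar  1 16:18:29.590 2019   " + PREFIX + "(jnxFruContentsIndex 4, jnxFruL1Index 1, "
    "jnxFruL2Index 1, jnxFruL3Index 0, jnxFruName Fan Tray 0 @ 0/0/*, jnxFruType 13, jnxFruSlot 0)",
    "Mar  1 16:18:29.590 2019   " + PREFIX + "(jnxFruContentsIndex 4, jnxFruL1Index 1, "
    "jnxFruL2Index 2, jnxFruL3Index 0, jnxFruName Fan Tray 1 @ 0/1/*, jnxFruType 13, jnxFruSlot 0)",
    "Mar  1 16:18:29.591 2019   " + PREFIX + "(jnxFruContentsIndex 2, jnxFruL1Index 1, "
    "jnxFruL2Index 1, jnxFruL3Index 0, jnxFruName Power Supply 0 @ 0/0/*, jnxFruType 7, jnxFruSlot 0)",
]

for stamp, frus in removals_by_second(sample).items():
    flag = "possible polling loss" if looks_like_polling_loss(frus) else "isolated"
    print(f"{stamp}: {', '.join(frus)} ({flag})")
```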
 
test@> show chassis environment
Class Item                           Status     Measurement
Power FPC 0 Power Supply 0           Absent
      FPC 0 Power Supply 1           Absent
Temp  FPC 0 Sensor TopLeft I         OK         30 degrees C / 86 degrees F
      FPC 0 Sensor TopRight E        OK         28 degrees C / 82 degrees F
      FPC 0 Sensor TopCenter I       OK         31 degrees C / 87 degrees F
      FPC 0 Sensor TopLeft E         OK         36 degrees C / 96 degrees F
      FPC 0 Sensor CPULeft I         OK         33 degrees C / 91 degrees F
      FPC 0 Sensor CPURight I        OK         32 degrees C / 89 degrees F
      FPC 0 Sensor CPU Die Temp      OK         37 degrees C / 98 degrees F
      FPC 0 Sensor TopCenter E       OK         28 degrees C / 82 degrees F
      FPC 0 Sensor BottomRight E     OK         35 degrees C / 95 degrees F
      FPC 0 Sensor TopRight I        OK         31 degrees C / 87 degrees F
      FPC 0 Sensor BottomLeft E      OK         41 degrees C / 105 degrees F
Fans  FPC 0 Fan Tray 0               Absent
      FPC 0 Fan Tray 1               Absent
      FPC 0 Fan Tray 2               Absent
      FPC 0 Fan Tray 3               Absent
      FPC 0 Fan Tray 4               Absent
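
The environment output can also be checked programmatically for the "everything Absent" pattern. The following Python sketch assumes the plain-text layout of show chassis environment shown in this article, with or without the class label at the start of a row; absent_items is a hypothetical helper name.

```python
import re

# Class labels seen in the output above; a row may or may not carry one.
CLASS_PREFIX = re.compile(r"^(Power|Temp|Fans)\s+")
# Item name, then at least two spaces, then the status word.
ROW = re.compile(r"^(FPC \d+ .+?)\s{2,}(\S+)")


def absent_items(output):
    """Return the item names whose status column reads 'Absent'."""
    absent = []
    for raw in output.splitlines():
        line = CLASS_PREFIX.sub("", raw.strip())
        match = ROW.match(line)
        if match and match.group(2) == "Absent":
            absent.append(match.group(1))
    return absent


# Rows copied from the output in this article.
sample = """\
Class Item                           Status     Measurement
Power FPC 0 Power Supply 0           Absent
FPC 0 Power Supply 1                 Absent
Temp  FPC 0 Sensor TopLeft I         OK         30 degrees C / 86 degrees F
Fans  FPC 0 Fan Tray 0               Absent
FPC 0 Fan Tray 1                     Absent
"""

for item in absent_items(sample):
    print(f"Absent: {item}")
```

A running switch that reports all power supplies and fan trays as Absent while its temperature sensors still read OK is consistent with the polling failure described under Cause, although other faults can produce similar output.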

Solution:

This issue is fixed in Junos 17.2X75-D41.

Workaround:

If a high volume of broadcast traffic must be sent to the management interface em0, traffic shaping can be applied to the management port. Issue the following command from the CLI to start shaping management port traffic:

> request app-engine host-cmd "/sbin/mgmt_port_shaper start"

To disable the traffic shaping on the management port, issue:

> request app-engine host-cmd "/sbin/mgmt_port_shaper stop"
