
[SRX] When temperature exceeds threshold, no SNMP trap generated and no shutdown is forced

  [KB29434]


Summary:

This article documents a problem seen on some branch SRX devices (SRX100/SRX110/SRX210) running Junos OS 12.1. When the chassis temperature exceeds the alarm thresholds, no alarm is raised and no shutdown is forced. There is no workaround.

Symptoms:

The Routing Engine monitors chassis temperature in three stages:

  • When chassis temperature exceeds 63 degrees C, a syslog message and an SNMP trap are generated, and a minor alarm (yellow alarm) is raised as a warning in "show chassis alarms".
  • When chassis temperature exceeds 72 degrees C, a syslog message and an SNMP trap are generated, and a major alarm (red alarm) is raised as a critical warning in "show chassis alarms".
  • When chassis temperature stays above 72 degrees C for 4 minutes, the chassis shuts down automatically.

For more information, refer to Monitoring the SRX100 Services Gateway Using Chassis Alarm Conditions.
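
The three stages above can be summarized in a short Python sketch. It only illustrates the expected behavior described in this article and is not the Junos OS implementation; the function name `classify` and its parameters are hypothetical.

```python
# Threshold values taken from this article (SRX100/SRX110/SRX210).
YELLOW_C = 63            # minor (yellow) alarm threshold, degrees C
RED_C = 72               # major (red) alarm threshold, degrees C
SHUTDOWN_AFTER_S = 240   # 4 minutes above the red threshold


def classify(temp_c: float, seconds_above_red: float = 0.0) -> str:
    """Return the expected stage for a temperature reading.

    Hypothetical helper: 'seconds_above_red' is how long the reading
    has continuously been above the red threshold.
    """
    if temp_c > RED_C:
        if seconds_above_red >= SHUTDOWN_AFTER_S:
            return "shutdown"
        return "red"
    if temp_c > YELLOW_C:
        return "yellow"
    return "normal"
```

For example, `classify(73, 240)` returns "shutdown", while `classify(73)` returns "red".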

Each threshold can be checked with the show chassis temperature-thresholds command.

------------------------------------------------------------------------------
root> show chassis temperature-thresholds
                           Fan speed      Yellow alarm      Red alarm      Fire Shutdown
                          (degrees C)      (degrees C)     (degrees C)      (degrees C)
Item                     Normal  High   Normal  Bad fan   Normal  Bad fan     Normal
Chassis default             N/A   N/A       63      N/A       72      N/A       90
Routing Engine              N/A   N/A       63      N/A       72      N/A       90
------------------------------------------------------------------------------
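
If the thresholds need to be checked from a script, the output above can be parsed as plain text. The following is a minimal sketch that assumes the column layout shown in this article; the function name `parse_thresholds` and the dictionary keys are hypothetical, and other platforms or releases may print different columns.

```python
# Sample output copied from this article.
SAMPLE = """\
root> show chassis temperature-thresholds
                           Fan speed      Yellow alarm      Red alarm      Fire Shutdown
                          (degrees C)      (degrees C)     (degrees C)      (degrees C)
Item                     Normal  High   Normal  Bad fan   Normal  Bad fan     Normal
Chassis default             N/A   N/A       63      N/A       72      N/A       90
Routing Engine              N/A   N/A       63      N/A       72      N/A       90
"""

# Hypothetical key names, one per value column in the table.
COLUMNS = [
    "fan_normal", "fan_high",
    "yellow_normal", "yellow_bad_fan",
    "red_normal", "red_bad_fan",
    "fire_shutdown",
]


def parse_thresholds(output: str) -> dict:
    """Map each item name to its thresholds (None where the CLI prints N/A)."""
    result = {}
    for line in output.splitlines():
        # Item names may contain spaces, so split the value columns off the right.
        parts = line.rsplit(None, len(COLUMNS))
        if len(parts) != len(COLUMNS) + 1:
            continue
        name, values = parts[0].strip(), parts[1:]
        # Skip header rows, whose cells are neither numbers nor N/A.
        if not all(v == "N/A" or v.isdigit() for v in values):
            continue
        result[name] = {
            key: (None if v == "N/A" else int(v))
            for key, v in zip(COLUMNS, values)
        }
    return result
```

With the sample above, `parse_thresholds(SAMPLE)["Routing Engine"]["red_normal"]` is 72.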

On the SRX100/SRX110/SRX210 platforms, none of this thermal alarm handling works.

Example (Junos OS 12.1X44-D35):

  1. When chassis temperature exceeds 63 degrees C, there is no syslog message, no SNMP trap, and no alarm raised.
  2. When chassis temperature exceeds 72 degrees C, there is no syslog message, no SNMP trap, and no alarm raised. In addition, the chassis does not shut down after the temperature stays above 72 degrees C for 4 minutes.

Cause:

In Junos OS 12.1, the code does not monitor the chassis temperature.

Note: This issue does not happen in Junos OS 11.4.

Solution:

There is no workaround for this issue. This section describes the expected behavior of the fixed version, 12.1X44-D40.

Affected Platforms: SRX100/110/210

Fixed Junos OS versions:

  • 12.1X44-D40
  • 12.1X46-D30 (Planned-Release)
  • 12.1X47-D20 (Planned-Release)
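
To check a release string against the list above, a small comparison can help. This is a sketch under stated assumptions: the helper `has_fix` is hypothetical, it assumes that later D-releases in the same train also carry the fix, and it treats the two planned releases as if they shipped as listed. Any release train not named in this article returns None (unknown).

```python
import re
from typing import Optional

# First fixed D-release per train, as listed in this article.
# 12.1X46-D30 and 12.1X47-D20 are marked "Planned-Release" in the article.
FIXED = {"12.1X44": 40, "12.1X46": 30, "12.1X47": 20}


def has_fix(version: str) -> Optional[bool]:
    """Return True/False for trains listed in the article, None otherwise."""
    m = re.fullmatch(r"(\d+\.\d+X\d+)-D(\d+)(?:\.\d+)?", version.strip())
    if m is None or m.group(1) not in FIXED:
        return None  # unknown train or unrecognized format
    return int(m.group(2)) >= FIXED[m.group(1)]
```

For example, `has_fix("12.1X44-D35")` returns False and `has_fix("12.1X44-D40")` returns True.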

In the fixed version, each stage works as expected:

  1. Yellow Alarm

    When chassis temperature exceeds 63 degrees C, a minor alarm is raised, and a syslog message and an SNMP trap are generated.

    -------------------------------------------------------
    root> show chassis alarms
    1 alarms currently active
    Alarm time               Class  Description
    2014-07-29 07:27:12 UTC  Minor  Host 0 Temperature Warm 
    -------------------------------------------------------
    <Syslog Messages>
    Jul 29 07:26:47   chassisd[1387]: CHASSISD_SNMP_TRAP6: SNMP trap generated: Over Temperature!

  2. Red Alarm

    When chassis temperature exceeds 72 degrees C, a major alarm is raised. Note: The same SNMP trap that was raised for the yellow alarm continues to be generated.

    -------------------------------------------------------
    root> show chassis alarms
    1 alarms currently active
    Alarm time               Class  Description
    2014-07-29 08:07:50 UTC  Major  Host 0 Temperature Hot 
    -------------------------------------------------------
    
  3. Red Alarm continues for 4 minutes

    When chassis temperature stays above 72 degrees C for 4 minutes, the SRX device is forced to shut down.

    -------------------------------------------------------
    <Syslog Messages>
    CHASSISD_RE_OVER_TEMP_WARNING: Routing Engine 0 temperature (73 C) over 72 degrees C, platform will shut down in 240 seconds if condition persists
    :
    :
    CHASSISD_RE_OVER_TEMP_WARNING: Routing Engine 0 temperature (73 C) over 72 degrees C, platform will shutdown in 13 seconds if condition persists
    CHASSISD_RE_OVER_TEMP_WARNING: Routing Engine 0 temperature (73 C) over 72 degrees C, platform will shutdown in 8 seconds if condition persists
    CHASSISD_RE_OVER_TEMP_WARNING: Routing Engine 0 temperature (73 C) over 72 degrees C, platform will shutdown in 3 seconds if condition persists
    CHASSISD_RE_OVER_TEMP_SHUTDOWN: Routing Engine 0 temperature above 72 degrees C for too long; powering down all FRUs 
    -------------------------------------------------------
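
When collecting these messages on a syslog server, the countdown can be extracted with a regular expression. The sketch below assumes the message format shown in this article, including both the "shut down" and "shutdown" spellings that appear above; the function name `parse_warning` and the returned keys are hypothetical.

```python
import re

# Pattern based on the CHASSISD_RE_OVER_TEMP_WARNING lines in this article.
WARNING_RE = re.compile(
    r"CHASSISD_RE_OVER_TEMP_WARNING: Routing Engine (\d+) temperature "
    r"\((\d+) C\) over (\d+) degrees C, "
    r"platform will shut ?down in (\d+) seconds"
)


def parse_warning(line: str):
    """Return the fields of an over-temperature warning, or None if no match."""
    m = WARNING_RE.search(line)
    if m is None:
        return None
    slot, temp_c, threshold_c, remaining_s = map(int, m.groups())
    return {
        "routing_engine": slot,
        "temp_c": temp_c,
        "threshold_c": threshold_c,
        "remaining_s": remaining_s,
    }
```

A monitoring script could call `parse_warning` on each incoming line and alert while `remaining_s` is still large enough to act on.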
    