[SRX] Finding out possible reasons for Chassis Cluster failover

Article ID: KB21164 KB Last Updated: 02 Aug 2017 Version: 5.0
Summary:

This article describes common methods for finding the cause of a Chassis Cluster failover and explains what to look for. It does not attempt to list every possible failover scenario and cause.

This article is linked to KB21905 - Resolution Guides and Articles - SRX - High Availability (Chassis Cluster).

 

Symptoms:

A failover occurred.

Solution:

If a failover occurred and the Chassis Cluster is not up, or you are unsure whether it is up, verify its status by referring to KB20641 - Troubleshooting steps when the Chassis Cluster does not come up.

Then review the following show commands and debug logs if you need more information on the probable cause.

Show commands

  • root@srx> show system core-dumps

    Check for core dumps generated around the time of the failover. If there is one, upload the file to the JTAC FTP server and open a case with JTAC so that the core dump can be analyzed. Refer to KB15585 - How to reliably and securely FTP files to JTAC.
     
  • root@srx> show chassis cluster information

    This is a hidden command that provides informational messages relating to parameter changes for the Chassis Cluster. It also lists the last 10 events that caused changes in the cluster behavior as well as current and previous state of the Chassis Cluster.
     
  • root@srx> show log jsrpd

    Check for interface or IP monitoring errors, heartbeat or threshold errors, warning messages, and so on.
    Refer to the JSRPD System Log Messages for a list of possible causes and descriptions.

Debug Logs

Common logs available for debugging in Junos:

  • chassisd: Shows events/information relating to hardware, chassis control and related logs

  • messages: Shows device event logs, critical and emergency messages

  • jsrpd: Shows events relating to the redundancy (Chassis Cluster) feature of Junos

  • idpd: Shows events relating to the IDP daemon, events and failures

  • interactive-commands: Shows the commands that were entered on the device and is useful to validate and check configuration/monitoring commands initiated by other users on the device

  • kmd: Shows the negotiations and messages that happen during IKE

  • utmd: For platforms that support UTM features, detailed event logs are stored here

The first three logs listed above need to be analyzed in detail to understand the behavior of the Chassis Cluster. Analyze these logs on each node in the cluster individually. They are written by Junos by default, so there is no need to enable them explicitly.

When analyzing these event log files, it may be helpful to use the output filtering tools that Junos offers.
Example: After creating a log file named "chassis_trace", check the file using any of the commands below:

    show log chassis_trace | match drop    ## show only the lines containing the word "drop"
    show log chassis_trace | last 100      ## show the last 100 lines written to the file

The following can also be used:

    show log <log file> | match <expression>
    show log <log file> | find <expression>
    show log <log file> | last <X number of lines>
    show log <log file> | trim
    show log <log file> | after <X number of lines>
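
When a log file has been copied off the device for offline review, the same kind of filtering can be approximated in a script. The following is a minimal Python sketch, not a Junos tool: the function names `match_lines`, `last_lines` and `find_lines` are hypothetical, the sample lines are invented placeholders rather than real Junos log output, and case-insensitive matching is a choice made in this sketch.

```python
import re
from typing import Iterable, List


def match_lines(lines: Iterable[str], expression: str) -> List[str]:
    """Keep only lines that contain the expression (similar in spirit to `| match`)."""
    pattern = re.compile(expression, re.IGNORECASE)
    return [line for line in lines if pattern.search(line)]


def last_lines(lines: Iterable[str], count: int) -> List[str]:
    """Keep the final `count` lines (similar in spirit to `| last`)."""
    buffered = list(lines)
    return buffered[-count:] if count > 0 else []


def find_lines(lines: Iterable[str], expression: str) -> List[str]:
    """Return everything from the first matching line onward (similar in spirit to `| find`)."""
    pattern = re.compile(expression, re.IGNORECASE)
    buffered = list(lines)
    for index, line in enumerate(buffered):
        if pattern.search(line):
            return buffered[index:]
    return []


# Invented placeholder lines; real log content will differ.
sample = [
    "line one: link up",
    "line two: packet drop seen",
    "line three: link down",
    "line four: drop counter cleared",
]

print(match_lines(sample, "drop"))
print(last_lines(sample, 2))
print(find_lines(sample, "down"))
```

To use it on a saved copy of a log, read the file with `open(path).read().splitlines()` and pass the resulting list to any of the three functions.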

Two log files report hardware failures: messages, which contains general-purpose log messages, and chassisd, which records hardware and chassis failures.

  • root@srx> show log messages

    Check for warning messages before and after the failover. Review this log on both nodes.
     
  • root@srx> show log chassisd

    Check for hardware errors, IOC down messages, FPC warnings, fan problems, temperature issues etc.
    For more detail, refer to the CHASSISD System Log Messages.
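
Reviewing the entries just before and after the failover, as suggested for the messages log, can be sketched offline as a time-window filter. This Python example assumes each line of a saved log copy begins with a syslog-style `Mon DD HH:MM:SS` timestamp without a year; that format, the helper names `parse_timestamp` and `lines_around`, the five-minute window and the sample lines are all assumptions for illustration, not actual device output.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Optional


def parse_timestamp(line: str, year: int) -> Optional[datetime]:
    """Parse a leading 'Mon DD HH:MM:SS' stamp; return None if the line has none."""
    try:
        # The first 15 characters hold the stamp, e.g. 'Aug  2 10:15:00'.
        return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
    except ValueError:
        return None


def lines_around(lines: Iterable[str], failover: datetime, minutes: int = 5) -> List[str]:
    """Keep lines stamped within `minutes` before or after the failover time."""
    window = timedelta(minutes=minutes)
    selected = []
    for line in lines:
        # The log stamp carries no year, so the failover's year is assumed.
        stamp = parse_timestamp(line, failover.year)
        if stamp is not None and abs(stamp - failover) <= window:
            selected.append(line)
    return selected


# Invented placeholder lines; real log content will differ.
sample_log = [
    "Aug  2 10:09:59 node0 example: before window",
    "Aug  2 10:12:30 node0 example: shortly before",
    "Aug  2 10:15:00 node0 example: failover moment",
    "Aug  2 10:18:45 node0 example: shortly after",
    "Aug  2 10:25:00 node0 example: after window",
]

failover_time = datetime(2017, 8, 2, 10, 15, 0)
for entry in lines_around(sample_log, failover_time, minutes=5):
    print(entry)
```

Running the same filter on the copy taken from each node keeps the per-node review separate while narrowing both to the same period.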

 
