[SRX] Troubleshooting Chassis Cluster Redundancy Group not failing over

Article ID: KB20987   KB Last Updated: 30 Sep 2020   Version: 10.0
Summary:

This article provides troubleshooting steps to determine why a Redundancy Group (RG) in an SRX Series Services Gateway High Availability Chassis Cluster is not failing over.

This article is part of the Resolution Guide -- SRX Chassis Cluster (High Availability).

Symptoms:

If you expect a Redundancy Group (RG) to fail over but it does not, follow the steps below to find the root cause.

Solution:
  1. On the SRX device, run the command:   show chassis cluster status

    Sample Output:

    > show chassis cluster status

    Cluster ID: 1
    Node                     Priority     Status     Preempt    Manual failover

    Redundancy group: 0 , Failover count: 0
    node0                       150       primary        no               no
    node1                       100       secondary      no               no

    Redundancy group: 1 , Failover count: 0
    node0                       150       primary        yes              no
    node1                       100       secondary      yes              no
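
The status output above can also be checked programmatically. The following is a minimal Python sketch, assuming the output has been captured as text in the column layout shown in the sample. The function name `parse_cluster_status` and the dictionary layout are illustrative choices, not part of Junos or any Juniper tool.

```python
import re

# Sample text copied from the "show chassis cluster status" output above.
SAMPLE = """\
Cluster ID: 1
Node                     Priority     Status     Preempt    Manual failover

Redundancy group: 0 , Failover count: 0
node0                       150       primary        no               no
node1                       100       secondary      no               no

Redundancy group: 1 , Failover count: 0
node0                       150       primary        yes              no
node1                       100       secondary      yes              no
"""

# Assumed line formats, based only on the sample output in this article.
RG_RE = re.compile(r"Redundancy group:\s*(\d+)\s*,\s*Failover count:\s*(\d+)")
NODE_RE = re.compile(r"(node\d+)\s+(\d+)\s+(\S+)\s+(yes|no)\s+(yes|no)")


def parse_cluster_status(text):
    """Return {rg_number: {"failover_count": int, "nodes": {name: fields}}}."""
    groups = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        rg = RG_RE.match(line)
        if rg:
            current = int(rg.group(1))
            groups[current] = {"failover_count": int(rg.group(2)), "nodes": {}}
            continue
        node = NODE_RE.match(line)
        if node and current is not None:
            name, priority, status, preempt, manual = node.groups()
            groups[current]["nodes"][name] = {
                "priority": int(priority),
                "status": status,
                "preempt": preempt == "yes",
                "manual_failover": manual == "yes",
            }
    return groups


if __name__ == "__main__":
    for rg, info in parse_cluster_status(SAMPLE).items():
        for name, fields in info["nodes"].items():
            print(rg, name, fields["priority"], fields["status"])
```

The parsed values for priority, status, and the Manual failover column are the ones the later steps refer to.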
  2. Are you trying to do a Redundancy Group Manual Failover?  For details regarding a Manual Failover, refer to Understanding Chassis Cluster Redundancy Group Manual Failover.

    • Yes: Continue to Step 3
    • No: Jump to Step 4
  3. Have you done a Redundancy Group Manual Failover before?

  4. Are the Control and Fabric links configured correctly and up?

    Run the following command:

    > show chassis cluster interfaces

    Sample Output for a Branch series SRX services gateway device:

    {primary:node0}
    root@SRX_Branch> show chassis cluster interfaces
    Control link 0 name: fxp1
    Control link status: Up

    Fabric interfaces:
    Name Child-interface Status
    fab0 ge-0/0/2 down
    fab0
    fab1 ge-9/0/2 down
    fab1

    Fabric link status: down

    Sample Output for a High End series SRX services gateway device:

    {primary:node0}
    root@SRX_HighEnd> show chassis cluster interfaces
    Control link 0 name: em0
    Control link 1 name: em1
    Control link status: up

    Fabric interfaces:
    Name Child-interface Status
    fab0 ge-0/0/5 down
    fab0

    Fabric link status: down

    If either link or both links are down, refer to the following articles:
    KB20698 - Troubleshooting Control Link
    KB20687 - Troubleshooting Fabric Link

    If both links are up, continue to Step 5.
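
The link check in Step 4 can be sketched in Python as follows, assuming the `show chassis cluster interfaces` output has been saved as text and contains the "Control link status" and "Fabric link status" lines shown in the samples. The function names `link_status` and `next_step` are hypothetical; the KB numbers are the ones listed in this step.

```python
import re

# Sample text copied from the Branch series output above.
SAMPLE = """\
Control link 0 name: fxp1
Control link status: Up

Fabric interfaces:
Name Child-interface Status
fab0 ge-0/0/2 down
fab0
fab1 ge-9/0/2 down
fab1

Fabric link status: down
"""

# The samples show both "Up" and "up", so the value is lowercased after matching.
LINK_RE = re.compile(r"^(Control|Fabric) link status:\s*(\S+)", re.IGNORECASE)

# Articles named in this step for each link type.
KB_FOR_LINK = {
    "control": "KB20698 - Troubleshooting Control Link",
    "fabric": "KB20687 - Troubleshooting Fabric Link",
}


def link_status(text):
    """Return a dict such as {"control": "up", "fabric": "down"}."""
    status = {}
    for raw in text.splitlines():
        match = LINK_RE.match(raw.strip())
        if match:
            status[match.group(1).lower()] = match.group(2).lower()
    return status


def next_step(status):
    """Return the articles to consult, or the pointer to Step 5 if both links are up."""
    down = [name for name in ("control", "fabric") if status.get(name) != "up"]
    if not down:
        return ["Both links are up: continue to Step 5"]
    return [KB_FOR_LINK[name] for name in down]


if __name__ == "__main__":
    for line in next_step(link_status(SAMPLE)):
        print(line)
```

A link whose status line is missing from the captured text is treated as not up, which errs on the side of checking that link.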
  5. Have you correctly configured Interface Monitoring or IP Address Monitoring?

    Either Interface Monitoring or IP Address Monitoring is required for RG1+ failover. For a detailed explanation of how Interface Monitoring and IP Address Monitoring work, refer to the following:
    Monitoring Chassis Cluster Interfaces
    Monitoring IP Addresses on a Chassis Cluster

    Cross-check your configuration against the following examples:
    Configuring IP Monitoring
    Step 6 for your specific SRX model in KB15650
    • Yes, the configuration is correct: Continue to Step 6
  6. What is the priority of each node in the output of show chassis cluster status?
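
As a rough aid for reading the Priority column, the Python sketch below classifies a displayed priority value. It assumes the usual Junos behavior that configured priorities fall in the range 1 to 254, that a displayed priority of 0 marks a node that is currently ineligible to become primary for that RG, and that 255 is shown on a node after a manual failover. The function name `classify_priority` and the hint wording are illustrative and are not taken from this article.

```python
def classify_priority(priority):
    """Map a displayed RG priority to a short label and a hint.

    Assumptions (not stated in this article): 0 means the node is
    currently ineligible for primary, 255 means a manual failover is
    in effect, and 1-254 is the configured range.
    """
    if priority == 0:
        return ("ineligible",
                "Node cannot become primary for this RG; check monitored "
                "objects and the control and fabric links.")
    if priority == 255:
        return ("manual-failover",
                "A manual failover is in effect; it can be cleared with "
                "'request chassis cluster failover reset redundancy-group <n>'.")
    if 1 <= priority <= 254:
        return ("configured", "Priority is a normal configured value.")
    raise ValueError(f"unexpected priority value: {priority}")


if __name__ == "__main__":
    # Priorities taken from the sample output in Step 1.
    for node, priority in (("node0", 150), ("node1", 100)):
        label, hint = classify_priority(priority)
        print(node, priority, label, hint)
```

With the sample output from Step 1, both nodes fall in the configured range, so the priority values alone do not explain a missing failover there.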

  7. If the above steps do not resolve your problem, KB15911 - SRX Getting Started -- Troubleshoot High Availability (HA) is a good reference for failover tips.

    Also, KB21164 - [SRX] Finding out possible reasons for Chassis Cluster failover contains tips on logs to review.

    If the problem is still not resolved, refer to KB21781 - [SRX] Data Collection Checklist - Logs/data to collect for troubleshooting to collect the necessary logs from BOTH devices, and then open a case with your technical support representative.

Modification History:
2020-09-26: Article reviewed for accuracy. Article is correct and complete.
