[SRX] Troubleshooting Chassis Cluster Redundancy Group not failing over

  [KB20987]


This article provides self-troubleshooting steps to determine why a Redundancy Group (RG) in an SRX Series Services Gateway High Availability chassis cluster is not failing over.

This article is part of the Resolution Guide -- SRX Chassis Cluster (High Availability).


If you expect a Redundancy Group (RG) to fail over but it does not, follow the steps below to troubleshoot the problem and find the root cause.



Step 1. On the SRX device, run the command:   show chassis cluster status

Sample Output:
> show chassis cluster status

Cluster ID: 1
Node                     Priority     Status     Preempt    Manual failover

Redundancy group: 0 , Failover count: 0
node0                       150       primary        no               no
node1                       100       secondary      no               no

Redundancy group: 1 , Failover count: 0
node0                       150       primary        yes              no
node1                       100       secondary      yes              no
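
When the same check has to be repeated across several clusters, the output above can be read programmatically. The following is a minimal Python sketch, not part of the original article: it assumes the command output has already been captured as plain text in the layout shown in the sample, and the function name parse_cluster_status is hypothetical.

```python
import re

def parse_cluster_status(text):
    """Parse 'show chassis cluster status' text into {rg_id: {...}}.

    Assumes the column layout shown in the sample output above.
    """
    groups = {}
    current = None
    for line in text.splitlines():
        rg = re.match(r"\s*Redundancy group:\s*(\d+)\s*,\s*Failover count:\s*(\d+)", line)
        if rg:
            # Start a new redundancy group section.
            current = {"failover_count": int(rg.group(2)), "nodes": {}}
            groups[int(rg.group(1))] = current
            continue
        node = re.match(r"\s*(node\d+)\s+(\d+)\s+(\S+)\s+(yes|no)\s+(yes|no)\s*$", line)
        if node and current is not None:
            current["nodes"][node.group(1)] = {
                "priority": int(node.group(2)),
                "status": node.group(3),
                "preempt": node.group(4) == "yes",
                "manual_failover": node.group(5) == "yes",
            }
    return groups

SAMPLE = """\
Cluster ID: 1
Node                     Priority     Status     Preempt    Manual failover

Redundancy group: 0 , Failover count: 0
node0                       150       primary        no               no
node1                       100       secondary      no               no

Redundancy group: 1 , Failover count: 0
node0                       150       primary        yes              no
node1                       100       secondary      yes              no
"""

status = parse_cluster_status(SAMPLE)
for rg_id, rg in status.items():
    for name, node in rg["nodes"].items():
        print(rg_id, name, node["priority"], node["status"])
```

The parsed structure keeps the priority, status, preempt, and manual failover columns per node, which are the values the later steps ask about.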

Step 2.  Are you trying to do a Redundancy Group Manual Failover?  For details regarding a Manual Failover, refer to Understanding Chassis Cluster Redundancy Group Manual Failover.

  • Yes: Continue to Step 3
  • No:   Jump to Step 4

Step 3.  Have you done a Redundancy Group Manual Failover before?

Step 4.  Are the Control and Fabric links configured correctly and up?

Run the following command:
> show chassis cluster interfaces

Sample Output for a Branch series SRX services gateway device:

root@SRX_Branch> show chassis cluster interfaces
Control link 0 name: fxp1
Control link status: Up

Fabric interfaces:
Name Child-interface Status
fab0 ge-0/0/2 down
fab1 ge-9/0/2 down
Fabric link status: down

Sample Output for a High End series SRX services gateway device:

root@SRX_HighEnd> show chassis cluster interfaces
Control link 0 name: em0
Control link 1 name: em1
Control link status: up

Fabric interfaces:
Name Child-interface Status
fab0 ge-0/0/5 down
Fabric link status: down

If either link is down, or both are, refer to the following articles:
KB20698 - Troubleshooting Control Link
KB20687 - Troubleshooting Fabric Link

If both links are up, continue to Step 5.
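
The up/down decision in this step can be sketched in Python as follows. This is an illustrative example only: it assumes the output of show chassis cluster interfaces has been captured as plain text with the "Control link status:" and "Fabric link status:" summary lines shown in the samples above, and the function name link_states is hypothetical.

```python
import re

def link_states(text):
    """Return the control and fabric summary states from
    'show chassis cluster interfaces' text, lowercased.

    A value of None means the summary line was not found.
    """
    states = {"control": None, "fabric": None}
    for line in text.splitlines():
        m = re.match(r"\s*(Control|Fabric) link status:\s*(\S+)", line, re.IGNORECASE)
        if m:
            states[m.group(1).lower()] = m.group(2).lower()
    return states

SAMPLE = """\
Control link 0 name: fxp1
Control link status: Up

Fabric interfaces:
Name Child-interface Status
fab0 ge-0/0/2 down
fab1 ge-9/0/2 down
Fabric link status: down
"""

states = link_states(SAMPLE)
# Any link that is not reported as "up" needs the matching KB article.
down = [name for name, state in states.items() if state != "up"]
print(states, down)
```

With the Branch series sample above, the control link is reported as up and the fabric link as down, so only the fabric link article would apply.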

Step 5.  Have you correctly configured Interface Monitoring or IP Address Monitoring (SRX3000/SRX5000)?

Either Interface Monitoring or IP Address Monitoring (SRX High End) is required for RG1+ failover. For a detailed explanation of how Interface Monitoring and IP Address Monitoring work, refer to the following:
Understanding Interface Monitoring
Understanding IP Address Monitoring

You can cross-check your configuration against the following examples:
Configuring IP Monitoring
Step 6 for your specific SRX model in KB15650
  • Yes, the configuration is correct: Continue to Step 6

Step 6.  What is the priority of each node in the output of the show chassis cluster status command?
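
As a rough aid for reading the Priority column, the sketch below classifies a priority value in Python. The interpretation of 0 and 255 reflects general Junos chassis cluster behavior as an assumption and is not stated in this article; the function name describe_priority and its return strings are hypothetical. Confirm the meaning against the Junos documentation for your release.

```python
def describe_priority(priority):
    """Classify a redundancy group priority value.

    Assumption (not from this article): Junos reports 0 when a node is
    ineligible to be primary, for example after a monitored object fails,
    and 255 on the node made primary by a manual failover. Configured
    priorities fall in the range 1 to 254.
    """
    if priority == 0:
        return "ineligible: check monitored interfaces or IP addresses"
    if priority == 255:
        return "manual failover in effect"
    if 1 <= priority <= 254:
        return "configured priority"
    raise ValueError(f"unexpected priority value: {priority}")

# Priorities taken from the sample output in Step 1.
for name, priority in {"node0": 150, "node1": 100}.items():
    print(name, priority, describe_priority(priority))
```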

Step 7. If the above steps do not resolve your problem, KB15911 - SRX Getting Started -- Troubleshoot High Availability (HA) is a good reference for failover tips.

Also, KB21164 - [SRX] Finding out possible reasons for Chassis Cluster failover contains tips on logs to review.

If the problem is still not resolved, refer to KB21781 - [SRX] Data Collection Checklist - Logs/data to collect for troubleshooting to collect the necessary logs from BOTH devices, and open a case with your technical support representative.
