[SRX] Troubleshooting Chassis Cluster Redundancy Group not failing over

Categories:
Knowledge Base ID: KB20987
Last Updated: 01 Jun 2014
Version: 7.0

Summary:

This article provides self-troubleshooting steps to determine why a Redundancy Group (RG) in a High Availability chassis cluster of SRX Series services gateways is not failing over.

This article is part of the Resolution Guide -- SRX Chassis Cluster (High Availability).

Problem or Goal:

If you expect a Redundancy Group (RG) to fail over but it does not, follow the steps below to troubleshoot and find the root cause.



Cause:

Solution:

Step 1. On the SRX device, run the command:   show chassis cluster status

Sample Output:
> show chassis cluster status

Cluster ID: 1
Node                     Priority     Status     Preempt    Manual failover

Redundancy group: 0 , Failover count: 0
node0                       150       primary        no               no
node1                       100       secondary      no               no

Redundancy group: 1 , Failover count: 0
node0                       150       primary        yes              no
node1                       100       secondary      yes              no
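When the same check has to be repeated on several clusters, the output above can be read by a script instead of by eye. The following is a minimal Python sketch, assuming the output has exactly the layout shown in the sample; the function names parse_cluster_status and primary_node are illustrative and are not part of any Juniper tool.

```python
import re

# Sample text copied from the "show chassis cluster status" output above.
SAMPLE = """\
Cluster ID: 1
Node                     Priority     Status     Preempt    Manual failover

Redundancy group: 0 , Failover count: 0
node0                       150       primary        no               no
node1                       100       secondary      no               no

Redundancy group: 1 , Failover count: 0
node0                       150       primary        yes              no
node1                       100       secondary      yes              no
"""

# Assumed line layouts, based only on the sample output.
RG_HEADER = re.compile(r"^Redundancy group:\s*(\d+)\s*,\s*Failover count:\s*(\d+)")
NODE_ROW = re.compile(r"^(node\d+)\s+(\d+)\s+(\S+)\s+(yes|no)\s+(yes|no)\s*$")


def parse_cluster_status(text):
    """Return {rg_id: {"failover_count": int, "nodes": {name: {...}}}}."""
    groups = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = RG_HEADER.match(line)
        if header:
            # Start a new redundancy group section.
            current = {"failover_count": int(header.group(2)), "nodes": {}}
            groups[int(header.group(1))] = current
            continue
        row = NODE_ROW.match(line)
        if row and current is not None:
            name, priority, status, preempt, manual = row.groups()
            current["nodes"][name] = {
                "priority": int(priority),
                "status": status,
                "preempt": preempt == "yes",
                "manual_failover": manual == "yes",
            }
    return groups


def primary_node(groups, rg_id):
    """Return the node reported as primary for the given RG, or None."""
    for name, info in groups[rg_id]["nodes"].items():
        if info["status"] == "primary":
            return name
    return None
```

Calling parse_cluster_status(SAMPLE) and then primary_node(result, 1) identifies which node currently holds RG1, which is the starting point for the remaining steps.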


Step 2.  Are you trying to do a Redundancy Group Manual Failover?  For details regarding a Manual Failover, refer to Understanding Chassis Cluster Redundancy Group Manual Failover.

  • Yes: Continue to Step 3
  • No: Jump to Step 4

Step 3.  Have you done a Redundancy Group Manual Failover before?


Step 4.  Are the Control and Fabric links configured correctly and up?

Run the following command:
> show chassis cluster interfaces

Sample Output for a Branch series SRX services gateway device:

{primary:node0}
root@SRX_Branch> show chassis cluster interfaces
Control link 0 name: fxp1
Control link status: Up

Fabric interfaces:
Name Child-interface Status
fab0 ge-0/0/2 down
fab0
fab1 ge-9/0/2 down
fab1
Fabric link status: down

Sample Output for a High End series SRX services gateway device:

{primary:node0}
root@SRX_HighEnd> show chassis cluster interfaces
Control link 0 name: em0
Control link 1 name: em1
Control link status: up

Fabric interfaces:
Name Child-interface Status
fab0 ge-0/0/5 down
fab0
Fabric link status: down


If one or both of the links are down, refer to the following articles:
KB20698 - Troubleshooting Control Link
KB20687 - Troubleshooting Fabric Link

If both links are up, continue to Step 5.
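The decision in Step 4 can be sketched in Python as follows. This is only an illustration, assuming the "Control link status" and "Fabric link status" lines appear as in the sample output above; link_states and next_action are hypothetical helper names.

```python
import re

# Sample text copied from the Branch series output above (prompt lines omitted).
SAMPLE = """\
Control link 0 name: fxp1
Control link status: Up

Fabric interfaces:
Name Child-interface Status
fab0 ge-0/0/2 down
fab0
fab1 ge-9/0/2 down
fab1
Fabric link status: down
"""

# The samples show both "Up" and "up", so the match ignores case.
STATUS_LINE = re.compile(r"^(Control|Fabric) link status:\s*(\S+)", re.IGNORECASE)


def link_states(text):
    """Return {"control": bool, "fabric": bool}; True means the link is up."""
    states = {}
    for raw in text.splitlines():
        match = STATUS_LINE.match(raw.strip())
        if match:
            states[match.group(1).lower()] = match.group(2).lower() == "up"
    return states


def next_action(states):
    """Map the link states to the follow-up named in Step 4."""
    actions = []
    # A link missing from the output is treated as down.
    if not states.get("control", False):
        actions.append("KB20698 - Troubleshooting Control Link")
    if not states.get("fabric", False):
        actions.append("KB20687 - Troubleshooting Fabric Link")
    return actions or ["Continue to Step 5"]
```

For the Branch series sample, link_states reports the control link as up and the fabric link as down, so next_action points to KB20687 only.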

Step 5.  Have you correctly configured Interface Monitoring or IP Address Monitoring (SRX3000/SRX5000)?

Either Interface Monitoring or IP Address Monitoring (SRX High End) is required for failover of RG1 and higher. For a detailed explanation of how Interface Monitoring and IP Address Monitoring work, refer to the following:
Understanding Interface Monitoring
Understanding IP Address Monitoring

You can cross-check your configuration against the following examples:
Configuring IP Monitoring
Step 6 for your specific SRX model in KB15650
  • Yes, the configuration is correct: Continue to Step 6
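As a rough aid when reviewing monitoring weights, the following Python sketch models the behavior described in the Junos interface monitoring documentation as understood here: each redundancy group starts with a threshold of 255, the weight of every failed monitored interface is subtracted from it, and the group fails over when the threshold reaches 0. The interface names and weights in the usage note are hypothetical, and this article itself does not state these details, so verify them against Understanding Interface Monitoring before relying on the result.

```python
# Assumption: starting threshold of a redundancy group, per the Junos
# interface monitoring documentation (not stated in this article).
RG_THRESHOLD = 255


def remaining_threshold(weights, failed):
    """Return the threshold left after subtracting the weights of failed interfaces.

    weights: {interface_name: weight} for the monitored interfaces.
    failed:  iterable of interface names that are currently down.
    Interfaces that are not monitored contribute nothing.
    """
    lost = sum(weights.get(name, 0) for name in failed)
    return max(RG_THRESHOLD - lost, 0)


def should_fail_over(weights, failed):
    """True when the modeled threshold has dropped to 0."""
    return remaining_threshold(weights, failed) == 0
```

For example, with hypothetical weights {"ge-0/0/3": 255, "ge-0/0/4": 100}, losing only ge-0/0/4 leaves a threshold of 155 and no failover in this model, while losing ge-0/0/3 brings it to 0. A weight that is too low for the interface you expect to trigger failover is one configuration error this check can surface.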

Step 6.  What is the priority of each node in the output of the show chassis cluster status command?


Step 7. If the above steps do not resolve your problem, KB15911 - SRX Getting Started -- Troubleshoot High Availability (HA) is a good reference for failover tips.

Also, KB21164 - [SRX] Finding out possible reasons for Chassis Cluster failover contains tips on logs to review.

If the issue is still not resolved, refer to KB21781 - [SRX] Data Collection Checklist - Logs/data to collect for troubleshooting, collect the necessary logs from BOTH devices, and open a case with your technical support representative.


Purpose:
Troubleshooting

Copyright © 1999-2012 Juniper Networks, Inc. All rights reserved.