[SRX] How to recover or prevent a chassis cluster from going into a Hold/Lost state in Branch SRX

  [KB27713]


Summary:

This article explains how to remove the configuration on the interfaces that will be used as fxp0 (out-of-band management) and fxp1 (control) in a chassis cluster.  

This solution can be used to:

  • recover a chassis cluster that is in the Hold/Lost state OR
  • prevent a new chassis cluster setup from going into the Hold/Lost state

Symptoms:

After the cables are connected and the SRX devices are rebooted in cluster mode, the 'show chassis cluster status' command on each node reports the local node in the hold state and the peer node in the lost state:

{hold:node0}
user@node0> show chassis cluster status
Cluster ID: 1, Redundancy-group: 0 
Node name  Priority Status Preempt Manual failover 
node0        100      hold     No    No 
node1        1        lost     No    No

{hold:node1}
user@node1> show chassis cluster status
Cluster ID: 1, Redundancy-group: 0
Node name  Priority Status Preempt Manual failover
node0        100      lost     No    No
node1        1        hold     No    No

The hold status means that the node is not ready to be in a chassis cluster.

Cause:

When a Branch SRX Series device is booted in cluster mode, two revenue interfaces (which ones depends on the model of the device) are designated as fxp0 (out-of-band management link) and fxp1 (control link) of the chassis cluster. These ports can no longer be used for transit traffic.

For more information on which interfaces are assigned to fxp0 and fxp1, refer to KB15356 - How are interfaces assigned on J-Series and SRX platforms when the chassis cluster is enabled?

Important: A prerequisite for chassis cluster configuration is that the interfaces that become fxp0 and fxp1 do not have any configuration. If they are configured, the chassis cluster will go into the hold/lost state.
This issue DOES NOT affect SRX High End devices because the high end devices have dedicated control and management ports.


Solution:

Remove the configuration on the interfaces that will be used as fxp0 (out-of-band management) and fxp1 (control) in the chassis cluster. Follow whichever set of instructions below matches your device: an SRX running the 'factory default config', or an existing stand-alone SRX.

For an SRX running 'factory default config':

The 'factory default config' contains configuration on the interfaces that are transformed into the fxp0 and fxp1 interfaces. You must therefore delete that configuration before enabling chassis cluster mode.

A device runs the 'factory default config' in two situations:

  1. A new device. In production environments, chassis clusters are usually built from new devices. These devices ship with the factory-default configuration, which includes configuration on these interfaces.

  2. A device that crashes and comes back with the factory default config. This is rare, but a device in chassis cluster mode that crashes may come up with the factory-default configuration.


To remove the configuration on the interfaces, it is typically easiest to delete the factory-default configuration and configure the device from scratch.

To delete the configuration:

Warning: The following procedure removes the current configuration.

  1. Enter configuration mode.
  2. Execute the delete command at the top level of the configuration. (This command deletes the current configuration from the device.)
    root# delete
    This will delete the entire configuration
    Delete everything under this level? [yes,no] (no) yes

  3. Configure the root password and commit. (The commit fails if no root authentication is configured.)
    root# set system root-authentication plain-text-password
    root# commit
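
Once the commit succeeds, the ports that will become fxp0 and fxp1 no longer carry configuration, and cluster mode can be enabled. The following is a minimal sketch, not a full cluster setup. It assumes cluster ID 1 and node IDs 0 and 1, which are the values shown in the Symptoms output; substitute your own. The commands are run from operational mode on each device.

```
## On the device that will become node0
root> set chassis cluster cluster-id 1 node 0 reboot

## On the device that will become node1
root> set chassis cluster cluster-id 1 node 1 reboot

## After both devices have rebooted, re-check the cluster state
root> show chassis cluster status
```

If the cleanup worked, the Status column should show primary and secondary rather than hold and lost.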


OR

For an existing stand-alone SRX:

If your SRX is currently running in a production environment, check whether there is configuration on the interfaces that will be transformed into fxp0 and fxp1. To determine which interfaces will become the management port (fxp0) and the control port (fxp1), refer to KB15356 - How are interfaces assigned on J-Series and SRX platforms when the chassis cluster is enabled? and to Node Interfaces on Active SRX Series Chassis Clusters.

Then, to find the configuration for those interfaces, run these commands in configuration mode. Delete every statement that references those interfaces, in each configuration hierarchy where they appear:

show | display set | match <physical interface of the control port (fxp1)>
show | display set | match <physical interface of the management port (fxp0)>
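
As an illustration of this cleanup, the sketch below assumes an SRX240, on which ge-0/0/0 is expected to become fxp0 and ge-0/0/1 to become fxp1; confirm the mapping for your model in KB15356. The zone name and the specific statements deleted are hypothetical. On a real device, delete whatever the two match commands actually return.

```
[edit]
user@srx# show | display set | match ge-0/0/0
user@srx# show | display set | match ge-0/0/1

## Hypothetical references: remove each statement that the match output lists
user@srx# delete interfaces ge-0/0/0
user@srx# delete interfaces ge-0/0/1
user@srx# delete security zones security-zone untrust interfaces ge-0/0/0.0

## Review the pending change, validate it, then commit
user@srx# show | compare
user@srx# commit check
user@srx# commit
```

Because match works on substrings, searching for ge-0/0/1 also returns statements for ge-0/0/10 and higher where those ports exist. Leave those statements in place. After deleting, re-run both match commands to confirm that nothing still references the two interfaces.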


If you wish to remove the entire config and reconfigure the device for chassis cluster, follow the procedure to delete the entire configuration, as shown in the 'factory default config' instructions.
