[SRX] How to move SPC cards to a different slot on high-end SRX devices (based on the minimal downtime procedure)

Article ID: KB26674 KB Last Updated: 15 Apr 2014 Version: 2.0
Summary:
This article describes how to move SPCs to a different slot on high-end SRX devices in a chassis cluster, using a procedure that keeps downtime to a minimum.
Symptoms:

The goal of this procedure is to change the SPC slot in an SRX cluster with the least possible downtime. The following events can be expected during this process:


  • All sessions that use network address translation (NAT) will be lost.

  • All sessions that use ALGs (such as FTP and SIP) will be lost.

  • Dynamic routing protocol adjacencies will have to be re-established when failover occurs between the devices.

  • All other existing sessions will be able to fail over between the devices.

  • Depending on the network configuration, traffic will fail over between the devices with minimal packet loss.

Consider the following configuration:
    xe-2/0/0 {
        gigether-options {
            redundant-parent reth0;
        }
    }
    xe-3/0/0 {
        gigether-options {
            redundant-parent reth2;
        }
    }
    xe-3/0/1 {
        gigether-options {
            redundant-parent reth3;
        }
    }
    xe-15/0/0 {
        gigether-options {
            redundant-parent reth0;
        }
    }
    xe-16/0/0 {
        gigether-options {
            redundant-parent reth2;
        }
    }
    xe-16/0/1 {
        gigether-options {
            redundant-parent reth3;
        }
    }
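
The example configuration pairs one node0 port and one node1 port under each reth interface. As a rough illustration only, the following Python sketch groups the physical interfaces by their redundant parent, which can help when preparing the per-node interface lists used later in the procedure. The function name reth_members and the CONFIG string are assumptions made for this sketch; it only parses text and does not connect to any device.

```python
import re
from collections import defaultdict

# Copy of the example interface configuration from this article.
CONFIG = """\
xe-2/0/0 {
    gigether-options {
        redundant-parent reth0;
    }
}
xe-3/0/0 {
    gigether-options {
        redundant-parent reth2;
    }
}
xe-3/0/1 {
    gigether-options {
        redundant-parent reth3;
    }
}
xe-15/0/0 {
    gigether-options {
        redundant-parent reth0;
    }
}
xe-16/0/0 {
    gigether-options {
        redundant-parent reth2;
    }
}
xe-16/0/1 {
    gigether-options {
        redundant-parent reth3;
    }
}
"""


def reth_members(config_text):
    """Map each reth name to the physical interfaces that reference it."""
    members = defaultdict(list)
    current = None
    for line in config_text.splitlines():
        stripped = line.strip()
        # An interface stanza opens with a name such as "xe-2/0/0 {".
        match = re.fullmatch(r"([a-z]+-\d+/\d+/\d+)\s*\{", stripped)
        if match:
            current = match.group(1)
            continue
        # The redundant-parent statement names the reth for that interface.
        match = re.fullmatch(r"redundant-parent\s+(reth\d+);", stripped)
        if match and current:
            members[match.group(1)].append(current)
    return dict(members)


if __name__ == "__main__":
    for reth, ports in sorted(reth_members(CONFIG).items()):
        print(reth, ports)
```

In this example each reth ends up with two members, one per node, which matches the interface names disabled and re-enabled in the steps of the procedure.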

Cause:

Solution:

For this procedure, assume that node0 is the primary for all RGs and node1 is the secondary for all RGs.


  1. Disable the network interfaces on the node1 backup device. This isolates the unit from the network so that it does not affect traffic while the procedure is in progress:
    # set interfaces xe-15/0/0 disable
    # set interfaces xe-16/0/0 disable
    # set interfaces xe-16/0/1 disable
  2. Disable SYN bit checking and TCP sequence number checking. This allows the secondary firewall to take over stateful, non-NAT, non-ALG traffic without requiring a new three-way TCP handshake:
    # set security flow tcp-options no-syn-check
    # set security flow tcp-options no-sequence-check
    # commit
    
    Note: Up to this point, the configuration has to be committed only on the primary node (node0).

    After the commit, open two console windows (one for each node) and verify that the configuration has been synchronized to both nodes. Also verify that the interfaces of the backup node are shown as down in the output of show interfaces terse on both nodes.

  3. Physically disconnect the control and fabric links between the two devices. This ensures that the nodes, which will temporarily have SPCs in different slots, do not communicate with each other.

  4. Power down the backup firewall (node1) and change the SPC slot. When done, power up this device.

  5. Check that the backup firewall is up and available to take over traffic. The boot process can take several minutes to complete. Run the show chassis cluster status command on node1 to verify the status; node0 should be shown as Lost and node1 should be shown as Primary for all RGs:
    root> show chassis cluster status
    Cluster ID: 1
    Node                  Priority          Status    Preempt  Manual failover
               Redundancy group: 0 , Failover count: 1
               node0                   0           lost           n/a      n/a
               node1                   100         Primary        no       no
    
               Redundancy group: 1 , Failover count: 1
               node0                   0           lost           n/a      n/a
               node1                   0           Primary        no       no
    
    Note: If you run the same command on node0, node0 will be shown as Primary and node1 will be shown as Lost. However, because the revenue ports have been disabled on node1, a split-brain condition will not occur.

    Also run the show chassis fpc pic-status command. All FPCs and PICs should be in the Online status. It takes around 5 to 10 minutes for all the cards to come up properly.

  6. The backup firewall (node1) is ready to take over for the primary. This is one of the crucial steps in this procedure. The traffic will now be switched between the two devices by simultaneously disabling the physical interfaces on the primary and enabling them on the secondary device. The following commands have to be separately committed on both of the nodes:
    # delete interfaces xe-15/0/0 disable
    # delete interfaces xe-16/0/0 disable
    # delete interfaces xe-16/0/1 disable
    
    # set interfaces xe-2/0/0 disable
    # set interfaces xe-3/0/0 disable
    # set interfaces xe-3/0/1 disable
         
    # commit check
    
    Perform a commit check on both of the nodes, before the actual commit, to ensure that there are no syntax errors or other issues. When the commit check is successful, simultaneously commit the configuration on both of the nodes:
    # commit
    This causes the interfaces of the backup node (node1) to send gratuitous ARP (GARP) to the switch and the node0 interfaces to go down because of the disable statement. Traffic immediately begins to flow through the secondary device (node1).

  7. To ensure that the secondary device (node1) is handling the traffic, look at the session table and check if the traffic is flowing through the device and that new sessions are being created. Use the following command on node1:
    > show security flow session summary
  8. Now, power down the primary firewall (node0) and then change the SPC slot. After this change, power up the device.

  9. Check if the primary firewall is up and available to take over traffic by using the same verification methods mentioned in Step 5:
    > show chassis cluster status
    node1 should be shown as Lost and node0 should be shown as the Primary for all of the RGs:
    root> show chassis cluster status
        Cluster ID: 1
        Node                  Priority          Status    Preempt  Manual failover
     
        Redundancy group: 0 , Failover count: 1
        node0                   100         primary        no       no
        node1                   0           lost           n/a      n/a
    
        Redundancy group: 1 , Failover count: 1
        node0                   0           primary        no       no
        node1                   0           lost           n/a      n/a
       
    > show chassis fpc pic-status
    
    All of the FPCs and PICs should be in the Online status. It takes around 5 to 10 minutes for all the cards to come up properly.

  10. At this point, the primary firewall (node0) is ready to take over for the backup. This is the second most crucial step in this procedure. The traffic will now be switched between the two devices by simultaneously disabling the physical interfaces on the backup and enabling them on the primary device. The following commands have to be separately committed on both of the nodes:
    # set interfaces xe-15/0/0 disable
    # set interfaces xe-16/0/0 disable
    # set interfaces xe-16/0/1 disable
    # delete interfaces xe-2/0/0 disable
    # delete interfaces xe-3/0/0 disable
    # delete interfaces xe-3/0/1 disable
    # commit check
    
    Perform a commit check on both of the nodes, before the actual commit, to make sure that there are no syntax errors or other issues. When the commit check is successful, simultaneously commit the configuration on both of the nodes.

  11. To ensure that the primary device (node0) is handling the traffic, look at the session table and check if traffic is flowing through the device and that new sessions are being created. Run the following command on node0:
    > show security flow session summary
  12. Now synchronize the cluster. First, reboot the backup device (node1). While it is rebooting, physically reconnect the control and fabric ports between the primary and backup devices. When the backup device comes back up, it will synchronize with the primary device.

  13. When the backup firewall (node1) is up and ready to process traffic, enable its physical interfaces. Enter the following configuration on the primary node (node0):
    # delete interfaces xe-15/0/0 disable
    # delete interfaces xe-16/0/0 disable
    # delete interfaces xe-16/0/1 disable
    # commit
  14. Re-enable SYN Check and Sequence Check:
    # delete security flow tcp-options no-syn-check
    # delete security flow tcp-options no-sequence-check
    # commit
  15. On the backup node's (node1) CLI window, check the configuration to ensure that the commit on the primary node has been properly synced on the backup node.
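
Steps 6 and 10 above apply the same pattern in opposite directions: the disable statement is deleted on one node's ports and set on the other node's ports, and the identical command list is then committed on both nodes. As a minimal sketch, assuming the interface names from the example configuration, the following Python helper generates both command lists so they can be reviewed and pasted consistently. The names NODE0_PORTS, NODE1_PORTS, and switchover_commands are hypothetical; the script only prints text and does not touch the devices.

```python
# Revenue ports per node, taken from the example configuration in this article.
NODE0_PORTS = ["xe-2/0/0", "xe-3/0/0", "xe-3/0/1"]
NODE1_PORTS = ["xe-15/0/0", "xe-16/0/0", "xe-16/0/1"]


def switchover_commands(enable, disable):
    """Build the configuration-mode commands that move traffic between nodes.

    Ports in `enable` have their disable statement removed;
    ports in `disable` are administratively disabled.
    """
    commands = [f"delete interfaces {port} disable" for port in enable]
    commands += [f"set interfaces {port} disable" for port in disable]
    return commands


if __name__ == "__main__":
    # Step 6: move traffic from node0 to node1.
    print("Step 6:")
    for line in switchover_commands(enable=NODE1_PORTS, disable=NODE0_PORTS):
        print("  # " + line)

    # Step 10: move traffic back from node1 to node0.
    print("Step 10:")
    for line in switchover_commands(enable=NODE0_PORTS, disable=NODE1_PORTS):
        print("  # " + line)
```

Generating both lists from the same two port lists reduces the chance of a typo that leaves a port enabled on both nodes at once. The commit check and the simultaneous commit on both nodes are still performed by hand, as the procedure describes.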