How to install SPC or SPC II modules in an SRX5000 chassis cluster


Article ID: KB34012  |  Last Updated: 03 Apr 2019  |  Version: 1.0
Summary:

This article explains how to install SPC or SPC II modules in an existing SRX5000 chassis cluster using the minimum downtime procedure.

Prerequisites:

  • If the chassis cluster is operating in active-active mode, you must transition it to active-passive mode before using this procedure. To do so, make one node primary for all redundancy groups:

root@srx5K> show chassis cluster status 
Mar 13 21:32:41
Monitor Failure codes:
    CS  Cold Sync monitoring        FL  Fabric Connection monitoring
    GR  GRES monitoring             HW  Hardware monitoring
    IF  Interface monitoring        IP  IP monitoring
    LB  Loopback monitoring         MB  Mbuf monitoring
    NH  Nexthop monitoring          NP  NPC monitoring              
    SP  SPU monitoring              SM  Schedule monitoring
    CF  Config Sync monitoring      RE  Relinquish monitoring
 
Cluster ID: 1
Node   Priority Status               Preempt Manual   Monitor-failures
 
Redundancy group: 0 , Failover count: 1
node0  100      primary              no      no       None           
node1  1        secondary            no      no       None           
 
Redundancy group: 1 , Failover count: 1
node0  100      primary              yes     no       None           
node1  1        secondary            yes     no       None           
 
Redundancy group: 2 , Failover count: 1
node0  100      secondary            yes     no       None           
node1  1        primary              yes     no       None    
 
>request chassis cluster failover redundancy-group 2 node 0

  • To install first-generation SRX5K-SPC-2-10-40 SPCs, both of the services gateways in the cluster must be running Junos OS Release 11.4R2S1, 12.1R2, or later.

  • To install next-generation SRX5K-SPC-4-15-320 SPCs, both of the services gateways in the cluster must be running Junos OS Release 12.1X44-D10, or later.

  • You must install SPCs of the same type and in the same slots in both of the services gateways in the cluster, so that both devices have identical physical configurations and slot locations after the upgrade.

  • If you are mixing first-generation SRX5K-SPC-2-10-40 SPCs and next-generation SRX5K-SPC-4-15-320 SPCs in the same chassis, you must install the SPCs so that an SRX5K-SPC-4-15-320 SPC occupies the lowest-numbered SPC slot. For example, if the chassis already has two first-generation SPCs installed in slots 2 and 3, install the SRX5K-SPC-4-15-320 SPCs in slot 0 or 1. This ensures that the center point (CP) functionality is performed by an SRX5K-SPC-4-15-320 SPC (see the verification example at the end of this list).

  • If you are installing next-generation SRX5K-SPC-4-15-320 SPCs, both services gateways must already be equipped with high-capacity power supplies and fan trays.

  • Console connections to both chassis cluster nodes are required, because each node needs its own configuration adjustments and because the nodes are powered off and halted during this procedure.
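
  • For a chassis with mixed SPC generations, you can confirm which SPC provides CP functionality with 'show chassis fpc pic-status'; the SPC hosting the CP reports an "SPU Cp" PIC. The output below is illustrative only, and slot numbers and card names depend on your hardware:

    >show chassis fpc pic-status
    node0:
    --------------------------------------------------------------------------
    Slot 0   Online       SRX5k SPC II
      PIC 0  Online       SPU Cp
      PIC 1  Online       SPU Flow
      PIC 2  Online       SPU Flow
      PIC 3  Online       SPU Flow
    Slot 2   Online       SRX5k SPC
      PIC 0  Online       SPU Flow
      PIC 1  Online       SPU Flow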

Solution:

NOTES:

  • This procedure assumes that node0 is primary for the control plane (RG0) and the data plane (RG1+) and is configured with a higher priority than the secondary node. Ensure that you have two separate console CLI sessions, one to each node, before proceeding with the steps below. Allow approximately 15 minutes after each reboot for the respective node to come up with all of its modules online.

  • The interface names below are examples only and depend on your current configuration.

  1. Disable all physical interfaces for transit traffic on node1 (the secondary node).
    (Alternatively, interface disabling may be accomplished by disabling or shutting down the node1 links on the peer devices.)

    set interfaces xe-13/0/0 disable
    set interfaces xe-13/1/0 disable

  2. Disable TCP SYN check and sequence check
     
    set security flow tcp-session no-syn-check
    set security flow tcp-session no-sequence-check

     
  3. Disable preempt for all RG1+ groups
     
    deactivate chassis cluster redundancy-group 1 preempt
    deactivate chassis cluster redundancy-group 2 preempt

  4. Disable interface-monitoring, ip-monitoring and control-link recovery features if used
     
    deactivate chassis cluster redundancy-group 1 interface-monitor
    deactivate chassis cluster redundancy-group 1 ip-monitoring
    deactivate chassis cluster redundancy-group 2 interface-monitor
    deactivate chassis cluster redundancy-group 2 ip-monitoring
    deactivate chassis cluster control-link-recovery  

  5. Commit the configuration from steps 1 through 4
     
    {primary:node0}[edit]
    root@srx5K# commit

     
  6. Change the control ports to non-existent/unused ports to simulate a control link failure.
    The control ports can be set to any SPC port on the device that does not have a physical connection.

    Note: Record the previously configured ports, as they will be needed later.
     
    delete chassis cluster control-ports
    set chassis cluster control-ports fpc 10 port 0    (dummy SPC port)
    set chassis cluster control-ports fpc 22 port 0    (dummy SPC port)

     
  7. Change the FAB link configuration to non-existent/unused revenue ports to simulate a fabric link failure.
    The fabric ports can be set to any IOC slot (existing or not) on the device.
    A simple approach is to change the fabric ports to undefined port numbers (e.g., port 40) on the same slot.

      Note: Record the previously configured ports, as they will be needed later.
     
    delete interfaces fab0
    delete interfaces fab1
    set interfaces fab0 fabric-options member-interfaces xe-1/4/0    (dummy unused port)
    set interfaces fab1 fabric-options member-interfaces xe-13/4/0   (dummy unused port)

     
  8. Commit the configuration changes from steps 6 and 7.

    NOTE: Upon commit, the following errors are generated because the control link is down. These error messages are expected.
     
      {primary:node0}[edit]
      root@srx5K# commit
      node0:
      configuration check succeeds
      error: remote commit configuration failed on node1
      error: commit failed
      error: Connection to node1 has been broken
     

    NOTE: If you want to exit configuration mode, execute "commit" again on node0. For example:

      {primary:node0}[edit]
      root@srx5K# exit
      The configuration has been changed but not committed
      Discard uncommitted changes? [yes,no] (yes) no      <<< answer "no"
      Exit aborted

      {primary:node0}[edit]
      root@srx5K# commit and-quit
      node0:
      commit complete
      Exiting configuration mode
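
    NOTE: At this point, each node should see its peer as lost. You can confirm the intentional split with 'show chassis cluster status' on either node; the output below is illustrative:

      {primary:node0}
      root@srx5K> show chassis cluster status
      Cluster ID: 1
      Node   Priority Status               Preempt Manual   Monitor-failures

      Redundancy group: 0 , Failover count: 1
      node0  100      primary              no      no       None
      node1  0        lost                 n/a     n/a      n/a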

     
  9. Upon ensuring that both nodes are isolated after the commit in step 8, power down node1, unplug the power cables, and install the new/updated SPC module(s) in accordance with the prerequisites given above. For more on how to install SPC modules, refer to Services Processing Card SRX5K-SPC-4-15-320 Specifications.

    {primary:node1}
    >request system power-off

  10. Once the modules are installed, plug the power cables back into node1 so that it boots up. After logging in, verify that the chassis has booted with all of its modules online by using the following commands:
     
    >show version
    >show chassis fpc pic-status
    >show chassis cluster status   (node0 should be in "lost" state)
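
     For example, on node1 (illustrative output; the priorities reflect the example configuration shown earlier):

      {primary:node1}
      root@srx5K> show chassis cluster status
      Cluster ID: 1
      Node   Priority Status               Preempt Manual   Monitor-failures

      Redundancy group: 0 , Failover count: 1
      node0  0        lost                 n/a     n/a      n/a
      node1  1        primary              no      no       None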

     
  11. Once all the modules are up and running, disable all physical revenue interfaces for transit traffic on node0 and enable all physical revenue interfaces on node1 (which were disabled in step 1).
    (Alternatively, interface disabling/enabling may be accomplished by shutting down the node0 links and re-enabling the node1 links on the peer devices.)
     When using the configuration changes in this step, ensure that the commits are done simultaneously on both nodes via the separate CLI sessions to each node.

     NOTE:  This step might incur some downtime as traffic is failed over from node0 to node1.

     Node 0
         set interfaces xe-1/0/0 disable
         set interfaces xe-1/1/0 disable
         delete interfaces xe-13/0/0 disable
         delete interfaces xe-13/1/0 disable
         commit check
       
     
    Node 1
         set interfaces xe-1/0/0 disable
         set interfaces xe-1/1/0 disable
         delete interfaces xe-13/0/0 disable
         delete interfaces xe-13/1/0 disable
         commit check

     
    If the commit check succeeds, issue a commit on both nodes:

      {primary:node0}[edit]
      root@srx5K# commit

      {primary:node1}[edit]
      root@srx5K# commit

  12. Ensure that services are running through node1 by checking the number of active sessions on node1.

    Verification command:
      >show security flow session summary   (the "Sessions-in-use" counter should be incrementing)
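
     For example (illustrative output; session counts will differ on your system):

      {primary:node1}
      root@srx5K> show security flow session summary
      node1:
      --------------------------------------------------------------------------
      Unicast-sessions: 12034
      Multicast-sessions: 0
      Failed-sessions: 0
      Sessions-in-use: 12040
      Maximum-sessions: 7549747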
     
  13. After ensuring that all traffic is successfully passing through node1, power down node0, unplug the power cables, and install the SPC module(s) in accordance with the prerequisites given above.
    Note that both services gateways in the cluster must have the same physical configuration, with the same SPCs in the same slots on both nodes.

    {primary:node0}
    >request system power-off

  14. Once the modules are installed, plug the power cables back into node0 so that it boots up. Use the same verification commands given in step 10 to confirm the new module installation and the sanity of node0.
     
  15. Re-configure the previously used CONTROL and FABRIC links that were changed in steps 6 and 7, and commit on node0.
    NOTE THAT THIS STEP SHOULD ONLY BE COMMITTED ON NODE0.
     
    Node 0
      delete chassis cluster control-ports
      set chassis cluster control-ports fpc 0 port 0
      set chassis cluster control-ports fpc 12 port 0
      delete interfaces fab0
      delete interfaces fab1
      set interfaces fab0 fabric-options member-interfaces xe-1/3/0
      set interfaces fab1 fabric-options member-interfaces xe-13/3/0
      commit
     
  16. After the above commit on node0, halt node0 and wait until the "The operating system has halted. Please press any key to reboot." message is displayed on the terminal session.
    NOTE: DO NOT PRESS ANY KEY at this stage, as doing so will boot up node0.

    {primary:node0}
    >request system halt
    Halt the system ? [yes,no] (no) yes

     

  17. Once node0 is halted, apply the same CONTROL and FABRIC link configuration (from step 15) on node1 and commit.

    Node 1
        delete chassis cluster control-ports
        set chassis cluster control-ports fpc 0 port 0
        set chassis cluster control-ports fpc 12 port 0
        delete interfaces fab0
        delete interfaces fab1
        set interfaces fab0 fabric-options member-interfaces xe-1/3/0
        set interfaces fab1 fabric-options member-interfaces xe-13/3/0
        commit

     

  18. After the successful commit on node1 in step 17, press any key on the node0 terminal session for it to boot up.

  19. Ensure that node0 is synchronized with node1 after boot up, using the following verification commands for a sanity check of the cluster:
     
    >show chassis cluster status
    >show chassis cluster interfaces
    >show chassis cluster information detail
    >show chassis fpc pic-status
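
     For example, a healthy cluster at this stage should look similar to the following (illustrative output; node1 remains primary because preempt is still disabled):

      >show chassis cluster status
      Cluster ID: 1
      Node   Priority Status               Preempt Manual   Monitor-failures

      Redundancy group: 0 , Failover count: 1
      node0  100      secondary            no      no       None
      node1  1        primary              no      no       None

      Redundancy group: 1 , Failover count: 1
      node0  100      secondary            no      no       None
      node1  1        primary              no      no       None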

     
  20. After a successful sanity check of the cluster with both nodes synchronized, enable the node0 revenue interfaces that were disabled in step 11.
    (Alternatively, interface enabling may be accomplished by re-enabling the node0 links on the peer devices.)

     delete interfaces xe-1/0/0 disable
     delete interfaces xe-1/1/0 disable
     commit

  21. After the above successful commit in step 20, reconfigure all features disabled in steps 2, 3, and 4 (interface monitoring, IP monitoring, control-link recovery, and the TCP SYN/sequence checks), except 'preempt'.

    delete security flow tcp-session no-syn-check
    delete security flow tcp-session no-sequence-check
    activate chassis cluster redundancy-group 1 interface-monitor
    activate chassis cluster redundancy-group 1 ip-monitoring
    activate chassis cluster redundancy-group 2 interface-monitor
    activate chassis cluster redundancy-group 2 ip-monitoring

    activate chassis cluster control-link-recovery
    commit

  22. Verify that the RG states are back online with the correct priorities and that interface monitoring and IP monitoring are working correctly:

    >show chassis cluster status
    >show chassis cluster ip-monitoring status
    >show chassis cluster interfaces
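
     The restored control and fabric links should both show as up, for example (illustrative output; the interface names match the example configuration in this article):

      >show chassis cluster interfaces
      Control link status: Up

      Control interfaces:
          Index   Interface        Status
          0       em0              Up

      Fabric link status: Up

      Fabric interfaces:
          Name    Child-interface    Status
          fab0    xe-1/3/0           Up / Up
          fab1    xe-13/3/0          Up / Up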

  23. Enable "preempt" for the RG1+ groups, if previously used.
    Note: Enabling preempt may result in an RG1+ failover, depending on the priorities configured for the data redundancy groups.

    activate chassis cluster redundancy-group 1 preempt
    activate chassis cluster redundancy-group 2 preempt

  24. Optional: Manually fail over RG0 and RG1 to node0 if you want to have all RGs primary on node0, and reset the failover count. (Repeat for any additional data redundancy groups, such as RG2, if configured.)

    >request chassis cluster failover redundancy-group 0 node 0
    >request chassis cluster failover redundancy-group 1 node 0
    >request chassis cluster failover reset redundancy-group 0
    >request chassis cluster failover reset redundancy-group 1
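
     After these commands, the cluster status should again resemble the baseline shown in the Prerequisites section, with node0 primary for all redundancy groups and the failover counts reset to 0 (illustrative output):

      >show chassis cluster status
      Cluster ID: 1
      Node   Priority Status               Preempt Manual   Monitor-failures

      Redundancy group: 0 , Failover count: 0
      node0  100      primary              no      no       None
      node1  1        secondary            no      no       None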

     