[QFX] NSSU checklist for QFabric

Article ID: KB26232 | KB Last Updated: 29 Mar 2021 | Version: 3.0
Summary:
This article shows how to check that all necessary services are running prior to starting the Nonstop Software Upgrade (NSSU).
Symptoms:
All required services must be running before NSSU is started on a QFabric system; otherwise, the upgrade can fail.
The NSSU enables you to upgrade a QFabric system with minimal packet loss and maximum uptime. This feature introduces several high availability improvements to the QFabric system software upgrade process.

When performing a nonstop upgrade, follow this sequence:
  1. Start with the Director group upgrade.
  2. Perform the fabric upgrade.
  3. End with the Node group upgrades.
Solution:
  • Nonstop Software Upgrade for Director devices in a director group:

    The NSSU process individually upgrades members of a Director-Group, so that one device in the group is always operational. It switches the primary role of the Routing Engine processes to the backup Director device, before upgrading the primary Director device.

    Before starting NSSU for the Director group (DG), ensure that the following criteria are met:
     
    • Run the sanity check script to confirm that the Director devices are ready to be upgraded:
      [root@router0 ~]# /opt/dcf/scripts/fabric_upgrade_sanity_check.pl
      Checking if director-device upgrade is currently in progress.
      Checking VM status.
      Checking for communication between director devices.
      Checking for aggregator ID mismatches on bond0/bond1.
      Checking links.
      Checking inventory status of ICs/FCs.
      Building inventory information for P6744-C/RE0.
      Building inventory information for P6749-C/RE0.
      Building inventory information for FC-0.
      Building inventory information for FC-1.
      Checking version for FC-0.
      Checking version for FC-1.
      Checking version for P6744-C/RE0.
      Checking version for P6749-C/RE0.
      Checking password for FC-0.
      Checking password for FC-1.
      Checking password for P6744-C/RE0.
      Checking password for P6749-C/RE0.
      ----------------------------------------
      The system appears to be ready for upgrade.
      Warnings:
      Aggregator ID mismatch in bond1: 4 / 6
      Aggregator ID mismatch in bond1: 4 / 6
      Link not detected on eth3
      Link not detected on eth7 
    • Verify that both Director devices in the DG are online:
      root@Qfabric-router> show fabric administration inventory director-group status 
      Director Group Status Fri Oct 26 05:32:35 CEST 2012 
       Member Status Role     Mgmt Address    CPU Free Memory VMs Up Time
       ------ ------ -------- --------------- --- ----------- --- -------------
       dg0    online master   10.209.73.123   23% 19545728k   4   32:43 mins       
       dg1    online backup   10.209.73.124   9%  21493160k   3   32:51 mins   
       
       Member Device Id/Alias  Status  Role
       ------ ---------------- ------- ---------
       dg0    0281052012000031 online  master   
        Master Services
        ---------------
        Database Server                online    
        Load Balancer Director         online    
        QFabric Partition Address      online    
       
       Director Group Managed Services
        -------------------------------
        Shared File System             online    
        Network File System            online    
        Virtual Machine Server         online    
        Load Balancer/DHCP             online    
       
       Hard Drive Status
        ----------------
        Volume ID:04AFE3427A00B004     optimal    
        Physical ID:0                  online     
        Physical ID:1                  online     
        SCSI ID:1                      100%       
        SCSI ID:0                      100%       
       
        Size  Used Avail Used% Mounted on 
        ----  ---- ----- ----- ----------
        423G 13G  389G  4%   /          
        99M  20M  75M   21%  /boot      
        93G  12G  82G   13%  /pbdata    
       
        Director Group Processes
        ------------------------
        Director Group Manager         online    
        Partition Manager              online    
        Software Mirroring             online    
        Shared File System master      online    
        Secure Shell Process           online    
        Network File System            online    
        DHCP Server master             online     primary                           
        FTP Server                     online    
        Syslog                         online    
        Distributed Management         online    
        SNMP Trap Forwarder            online    
        SNMP Process                   online    
        Platform Management            online    
       Interface Link Status
        ---------------------
        Management Interface           up        
        Control Plane Bridge           up        
        Control Plane LAG              up        
        CP Link [0/2]                  down      
        CP Link [0/1]                  up        
        CP Link [0/0]                  up        
        CP Link [1/2]                  down      
        CP Link [1/1]                  up        
        CP Link [1/0]                  up        
        Crossover LAG                  up        
        CP Link [0/3]                  up        
        CP Link [1/3]                  up        
       
      Member Device Id/Alias  Status  Role
       ------ ---------------- ------- ---------
       dg1    0281052012000027 online  backup   
      
      Director Group Managed Services
        -------------------------------
        Shared File System             online    
        Network File System            online    
        Virtual Machine Server         online    
        Load Balancer/DHCP             online    
      Hard Drive Status
        ----------------
        Volume ID:0045CF9047BE5C52     optimal    
        Physical ID:0                  online     
        Physical ID:1                  online     
        SCSI ID:1                      100%       
        SCSI ID:0                      100%       
        Size  Used Avail Used% Mounted on 
        ----  ---- ----- ----- ----------
        423G 13G  389G  4%   /          
        99M  20M  75M   21%  /boot      
        93G  12G  82G   13%  /pbdata    
       Director Group Processes
        ------------------------
        Director Group Manager         online    
        Partition Manager              online    
        Software Mirroring             online    
        Shared File System master      online    
        Secure Shell Process           online    
        Network File System            online    
        DHCP Server master             online     backup                           
        FTP Server                     online    
        Syslog                         online    
        Distributed Management         online    
        SNMP Trap Forwarder            online    
        SNMP Process                   online    
        Platform Management            online    
       
      Interface Link Status
        ---------------------
        Management Interface           up        
        Control Plane Bridge           up        
        Control Plane LAG              up        
        CP Link [0/2]                  down      
        CP Link [0/1]                  up        
        CP Link [0/0]                  up        
        CP Link [1/2]                  down      
        CP Link [1/1]                  up        
        CP Link [1/0]                  up        
        Crossover LAG                  up        
        CP Link [0/3]                  up        
        CP Link [1/3]                  up
    • All the configured components should be connected:
      root@Qfabric-router> show fabric administration inventory
      Item                    Identifier              Connection      Configuration
      Node group
        NW-NG-0                                       Connected       Configured    
          Node-4              P5293-C                 Connected                     
          Node-5              P5507-C                 Connected                     
        RSNG-1                                        Connected       Configured    
          Node-3              P4799-C                 Connected                     
      Interconnect device
        IC-P6744-C                                    Connected       Configured    
          P6744-C/RE0                                 Connected                     
        IC-P6749-C                                    Connected       Configured    
          P6749-C/RE0                                 Connected                     
      Fabric manager
        FM-0                                          Connected       Configured    
      Fabric control
        FC-0                                          Connected       Configured    
        FC-1                                          Connected       Configured    
      Diagnostic routing engine
        DRE-0                                         Connected       Configured
    • All the cluster services should be online:
      [root@router0 ~]# clustat
      Cluster Status for sfc_pb_cluster @ Tue Oct 23 23:30:42 2012
      Member Status: Quorate
       
       Member Name                             ID   Status
       ------ ----                             ---- ------
       dg0                                         1 Online, Local, RG-Master
       dg1                                         2 Online, RG-Worker
       Service Name                   Owner (Last)                   State         
       ------- ----                   ----- ------                   -----         
       dcfservice:dcf_svc             dg0                            started       
       service:pbccif_svc0            dg0                            started       
       service:pbccif_svc1            dg1                            started       
       service:pbgfs_svc0             dg0                            started       
       service:pbgfs_svc1             dg1                            started       
       service:pblb_dhcp_svc0         dg0                            started       
       service:pblb_dhcp_svc1         dg1                            started       
       service:pbnfs_svc0             dg0                            started       
       service:pbnfs_svc1             dg1                            started
    • All the VMs should be up and distributed across both Director devices:
      [root@router0 ~]# lsvm
      NODE    ACTIVE  TAG                             UUID
      dg1     1       _DCF_default___NW-INE-0_RE0_    e008ea46-f87d-11e1-929f-00e081cbcbfc
      dg1     1       _DCF_default___RR-INE-1_RE0_    f48cc334-f87d-11e1-abbe-00e081cbcbfc
      dg1     1       _TAG_DCF_ROOT_RE1_              650be992-f87d-11e1-9698-00e081cbcbfc
      dg0     1       _DCF_default___RR-INE-0_RE0_    eae52de4-f87d-11e1-a767-00e081cbcbfc
      dg0     1       _DCF_default___NW-INE-0_RE1_    e64c0776-f87d-11e1-b3a4-00e081cbcbfc
      dg0     1       _TAG_DCF_ROOT_RE0_              63ea2ac4-f87d-11e1-a230-00e081cbcbfc
      dg0     1       _TAG_DRE_                       786355ac-f87d-11e1-80d6-00e081cbcbfc
      If there is any discrepancy, contact your vendor before starting the upgrade.
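
The cluster service and VM checks above can be partly automated. The following is a minimal Python sketch, not a Juniper-supplied tool: it assumes you have saved the text output of clustat and lsvm yourself, and the function names services_not_started and vms_per_director are hypothetical. It relies only on the column layout shown in the sample output above, so treat the result as a hint and still review the raw output before upgrading.

```python
from collections import Counter


def services_not_started(clustat_output: str) -> list[str]:
    """Return cluster services whose State column is not 'started'.

    Assumes service rows have three columns (name, owner, state) and that
    the service name contains a colon, as in the sample clustat output.
    """
    bad = []
    for line in clustat_output.splitlines():
        fields = line.split()
        if len(fields) == 3 and ":" in fields[0] and fields[2] != "started":
            bad.append(fields[0])
    return bad


def vms_per_director(lsvm_output: str) -> Counter:
    """Count active VMs per Director device from lsvm output.

    Assumes rows have four columns (NODE, ACTIVE, TAG, UUID) and that an
    ACTIVE value of 1 means the VM is up, as in the sample lsvm output.
    """
    counts = Counter()
    for line in lsvm_output.splitlines():
        fields = line.split()
        if len(fields) == 4 and fields[0] != "NODE" and fields[1] == "1":
            counts[fields[0]] += 1
    return counts
```

For the sample output above, services_not_started would return an empty list, and vms_per_director would count four VMs on dg0 and three on dg1, which matches the VMs column of the director-group status output.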

      To upgrade the software on the Director devices in a Director Group, issue the request system software nonstop-upgrade director-group package-name command. For example:
      user@qfabric> request system software nonstop-upgrade director-group jinstall-qfabric-12.2X50-D20.4.rpm
      Note: The Director group upgrade takes almost two hours.

  • Nonstop Software Upgrade for the fabric (Interconnect devices and Fabric Control, FC):

    Before starting the fabric upgrade, verify the same criteria listed above for the Director group; if you notice any discrepancy, contact your vendor.

    Issue the request system software nonstop-upgrade fabric package-name command. For example:
    user@qfabric> request system software nonstop-upgrade fabric jinstall-qfabric-12.2X50-D20.4.rpm
  • Nonstop Software Upgrade on a Node Group:

    Before performing the Node group upgrade, ensure that the Interconnect devices (IC), Fabric Control (FC), and Director group (DG) devices are online after the fabric upgrade.
     
    • When NSSU is performed on a Network Node Group (NNG), the node devices in the NNG are upgraded serially, unless upgrade groups are configured.

    • If NSSU is performed on a Redundant Server Node Group (RSNG), both node devices must be online for a successful upgrade. If one of the node devices is no longer available, remove it from the configuration before performing the NSSU. You cannot configure upgrade groups for an RSNG; each node device is inherently its own upgrade group, so the devices are upgraded serially.

    • When NSSU is performed on a Server Node Group with only one node device, traffic loss occurs while the node device is rebooting.

    To perform a nonstop upgrade on one Node group, run the following command:
    user@qfabric> request system software nonstop-upgrade node-group nodegroup1 jinstall-qfabric-12.2X50-D20.4.rpm
    To perform a nonstop upgrade on more than one Node group, run the following command:
    user@qfabric> request system software nonstop-upgrade node-group [nodegroup1 nodegroup2 nodegroup3] jinstall-qfabric-12.2X50-D20.4.rpm
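
To keep the upgrade order straight, the three commands in this article can be generated in sequence. The following is a small Python sketch under stated assumptions: the function name nssu_command_sequence is hypothetical, it only builds command strings in the forms shown above, and it does not connect to or run anything on the QFabric system.

```python
def nssu_command_sequence(package: str, node_groups: list[str]) -> list[str]:
    """Build the NSSU commands in the order: Director group, fabric, Node groups.

    A single Node group is passed as a bare name; several Node groups are
    wrapped in square brackets, matching the two examples in this article.
    """
    if not node_groups:
        raise ValueError("at least one Node group name is required")
    base = "request system software nonstop-upgrade"
    if len(node_groups) == 1:
        target = node_groups[0]
    else:
        target = "[" + " ".join(node_groups) + "]"
    return [
        f"{base} director-group {package}",
        f"{base} fabric {package}",
        f"{base} node-group {target} {package}",
    ]


if __name__ == "__main__":
    # Print the commands for review; run each one only after the previous
    # stage has completed and the pre-checks have been repeated.
    for command in nssu_command_sequence(
        "jinstall-qfabric-12.2X50-D20.4.rpm", ["nodegroup1", "nodegroup2"]
    ):
        print(command)
```

Generating the strings rather than executing them leaves room to re-run the checks between stages, since each stage should start only when the earlier one has finished and all components are back online.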
Modification History:
2021-03-24: Updated the article terminology to align with Juniper's Inclusion & Diversity initiatives.