
[Contrail] Cloud 13.3: Ceph Storage Node Removal Process

Article ID: KB36108   KB Last Updated: 05 Aug 2020   Version: 1.0
Summary:

Use this article to remove a Ceph storage node from a Ceph cluster. The procedure demonstrates the removal of a storage node in a Contrail Cloud environment.

Before you begin, ensure that the remaining nodes in the cluster can hold the required number of placement groups (PGs) and replicas for your Ceph storage cluster, and that both the Ceph cluster and the overcloud stack are healthy. To check the health of your overcloud, refer to the technical documentation on Quorum and Node Health.

For additional Red Hat guidance and reference, see Removing a Node from the Overcloud and Handling a Node Failure.
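
A quick pre-check can be done with standard Ceph and OpenStack commands, for example as sketched below (the hostnames and prompts are illustrative; detailed health checks also appear in the procedure itself):

    # On an overcloud controller: confirm the cluster reports HEALTH_OK and compare
    # each pool's replica count ("size") with the number of OSD hosts that will remain
    [root@overcloud8st-ctrl-1 ~]# ceph -s
    [root@overcloud8st-ctrl-1 ~]# ceph osd pool ls detail

    # On the undercloud: confirm the overcloud stack is in a healthy (*_COMPLETE) state
    (undercloud) [stack@undercloud ~]$ openstack stack list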

Solution:

In the examples used throughout this procedure, “storage3” is the node targeted for removal.

Remove the storage node:

  1. Find the connection between the bare metal server and the overcloud server. The output of the command below shows that the server we are looking for is “overcloud8st-cephstorageblue1-0”. This information will be used later in the procedure.

    (undercloud) [stack@undercloud ~]$ openstack ccloud nodemap list
    +---------------------------------+----------------+------------+----------------+
    | Name                            | IP             | Hypervisor | Hypervisor IP  |
    +---------------------------------+----------------+------------+----------------+
    | overcloud8st-cc-2               | 192.168.213.54 | controler2 | 192.168.213.6  |
    | overcloud8st-cc-1               | 192.168.213.51 | controler3 | 192.168.213.7  |
    | overcloud8st-cc-0               | 192.168.213.55 | controler1 | 192.168.213.5  |
    | overcloud8st-ca-1               | 192.168.213.72 | controler1 | 192.168.213.5  |
    | overcloud8st-ca-0               | 192.168.213.60 | controler3 | 192.168.213.7  |
    | overcloud8st-ca-2               | 192.168.213.53 | controler2 | 192.168.213.6  |
    | overcloud8st-cadb-1             | 192.168.213.71 | controler1 | 192.168.213.5  |
    | overcloud8st-afxctrl-0          | 192.168.213.69 | controler2 | 192.168.213.6  |
    | overcloud8st-afxctrl-1          | 192.168.213.52 | controler3 | 192.168.213.7  |
    | overcloud8st-afxctrl-2          | 192.168.213.58 | controler1 | 192.168.213.5  |
    | overcloud8st-cadb-0             | 192.168.213.65 | controler2 | 192.168.213.6  |
    | overcloud8st-ctrl-0             | 192.168.213.73 | controler2 | 192.168.213.6  |
    | overcloud8st-ctrl-1             | 192.168.213.63 | controler1 | 192.168.213.5  |
    | overcloud8st-ctrl-2             | 192.168.213.59 | controler3 | 192.168.213.7  |
    | overcloud8st-cephstorageblue1-0 | 192.168.213.62 | storage3   | 192.168.213.62 |
    | overcloud8st-compdpdk-0         | 192.168.213.56 | compute1   | 192.168.213.56 |
    | overcloud8st-cephstorageblue2-0 | 192.168.213.61 | storage2   | 192.168.213.61 |
    | overcloud8st-cephstorageblue2-1 | 192.168.213.80 | storage1   | 192.168.213.80 |
    | overcloud8st-cadb-2             | 192.168.213.74 | controler3 | 192.168.213.7  |
    +---------------------------------+----------------+------------+----------------+
  2. Log in as the root user to any of the OpenStack controllers and verify that the Ceph cluster is healthy:

    [root@overcloud8st-ctrl-1 ~]# ceph -s
      cluster:
        id:     a98b1580-bb97-11ea-9f2b-525400882160
        health: HEALTH_OK
  3. Find the OSDs that reside on the server to be removed (overcloud8st-cephstorageblue1-0). In the example below, these are osd.2, osd.3, osd.6, and osd.7:

    [root@overcloud8st-ctrl-1 ~]# ceph osd tree
    ID CLASS WEIGHT   TYPE NAME                                STATUS REWEIGHT PRI-AFF
    -1       10.91638 root default
    -3        3.63879     host overcloud8st-cephstorageblue1-0
     2   hdd  0.90970         osd.2                                up  1.00000 1.00000
     3   hdd  0.90970         osd.3                                up  1.00000 1.00000
     6   hdd  0.90970         osd.6                                up  1.00000 1.00000
     7   hdd  0.90970         osd.7                                up  1.00000 1.00000
    -7        3.63879     host overcloud8st-cephstorageblue2-0
     1   hdd  0.90970         osd.1                                up  1.00000 1.00000
     4   hdd  0.90970         osd.4                                up  1.00000 1.00000
     8   hdd  0.90970         osd.8                                up  1.00000 1.00000
    10   hdd  0.90970         osd.10                               up  1.00000 1.00000
    -5        3.63879     host overcloud8st-cephstorageblue2-1
     0   hdd  0.90970         osd.0                                up  1.00000 1.00000
     5   hdd  0.90970         osd.5                                up  1.00000 1.00000
     9   hdd  0.90970         osd.9                                up  1.00000 1.00000
    11   hdd  0.90970         osd.11                               up  1.00000 1.00000
  4. While still logged in to the OpenStack controller, mark osd.2, osd.3, osd.6, and osd.7 out of the cluster:

    [root@overcloud8st-ctrl-1 ~]# ceph osd out 2
    marked out osd.2.
    [root@overcloud8st-ctrl-1 ~]# ceph osd out 3
    marked out osd.3.
    [root@overcloud8st-ctrl-1 ~]# ceph osd out 6
    marked out osd.6.
    [root@overcloud8st-ctrl-1 ~]# ceph osd out 7
    marked out osd.7.
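
     Optionally, the four commands above can be run as a single loop; this is an equivalent shorthand using the OSD IDs identified in step 3:

    [root@overcloud8st-ctrl-1 ~]# for id in 2 3 6 7; do ceph osd out $id; done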
  5. Check that the cluster health returns to the HEALTH_OK state by using the ceph -s command, as shown in the example below. Repeat this health check after every operation in this procedure.
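
     For example, from the same controller used in the previous steps (standard Ceph status commands; allow any data rebalancing to finish before continuing):

    # re-check overall cluster health and wait for HEALTH_OK
    [root@overcloud8st-ctrl-1 ~]# ceph -s
    # confirm how many OSDs are still reported up and in
    [root@overcloud8st-ctrl-1 ~]# ceph osd stat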

  6. Log in to the Ceph node (overcloud8st-cephstorageblue1-0) to stop the OSD services:

    [root@overcloud8st-cephstorageblue1-0 ~]# systemctl stop ceph-osd@2.service
    [root@overcloud8st-cephstorageblue1-0 ~]# systemctl stop ceph-osd@3.service
    [root@overcloud8st-cephstorageblue1-0 ~]# systemctl stop ceph-osd@6.service
    [root@overcloud8st-cephstorageblue1-0 ~]# systemctl stop ceph-osd@7.service
  7. Log back in to the controller and remove the remaining references to the OSDs from overcloud8st-cephstorageblue1-0. Remove them from the CRUSH map, delete their authentication keys, remove the OSD IDs, and finally remove the host itself from the CRUSH map:

    [root@overcloud8st-ctrl-1 ~]# ceph osd crush remove osd.2
    removed item id 2 name 'osd.2' from crush map
    [root@overcloud8st-ctrl-1 ~]# ceph osd crush remove osd.3
    removed item id 3 name 'osd.3' from crush map
    [root@overcloud8st-ctrl-1 ~]# ceph osd crush remove osd.6
    removed item id 6 name 'osd.6' from crush map
    [root@overcloud8st-ctrl-1 ~]# ceph osd crush remove osd.7
    removed item id 7 name 'osd.7' from crush map
     
    [root@overcloud8st-ctrl-1 ~]# ceph auth del osd.2
    updated
    [root@overcloud8st-ctrl-1 ~]# ceph auth del osd.3
    updated
    [root@overcloud8st-ctrl-1 ~]# ceph auth del osd.6
    updated
    [root@overcloud8st-ctrl-1 ~]# ceph auth del osd.7
    updated
     
    [root@overcloud8st-ctrl-1 ~]# ceph osd rm 2
    removed osd.2
    [root@overcloud8st-ctrl-1 ~]# ceph osd rm 3
    removed osd.3
    [root@overcloud8st-ctrl-1 ~]# ceph osd rm 6
    removed osd.6
    [root@overcloud8st-ctrl-1 ~]# ceph osd rm 7
    removed osd.7
     
    [root@overcloud8st-ctrl-1 ~]# ceph osd crush rm overcloud8st-cephstorageblue1-0
  8. From the undercloud VM, find the ID of the Ceph storage node:

    (undercloud) [stack@undercloud ~]$ openstack server list | grep overcloud8st-cephstorageblue1-0
    | 7ee9be4f-efda-4837-a597-a6554027d0c9 | overcloud8st-cephstorageblue1-0 | ACTIVE | ctlplane=192.168.213.62 | overcloud-full | CephStorageBlue1
  9. Initiate a removal using the node ID from the previous step:

    (undercloud) [stack@undercloud ~]$ openstack overcloud node delete --stack overcloud 7ee9be4f-efda-4837-a597-a6554027d0c9
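
     The delete command triggers an update of the overcloud stack. Progress can be monitored from the undercloud with the standard OpenStack client; the stack typically returns to an UPDATE_COMPLETE state once the removal has succeeded:

    (undercloud) [stack@undercloud ~]$ openstack stack list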
  10. Verify that the bare metal node is powered off and in the available state:

    (undercloud) [stack@undercloud ~]$ openstack baremetal node list | grep storage3
    | 05bbab4b-b968-4d1d-87bc-a26ac335303d | storage3 | None  | power off  | available | False |
  11. From the jump host, as the contrail user, mark the storage node with ‘status: deleting’ so that the Ceph profile is removed from it. Do this by adding ‘status: deleting’ to the storage3 entry in the storage-nodes.yml file and then running the storage-nodes-assign.sh script:

    [contrail@5a6s13-node1 contrail_cloud]$ cat config/storage-nodes.yml
    storage_nodes:
      - name: storage1
        profile: blue2
      - name: storage2
        profile: blue2
      - name: storage3
        profile: blue1
        status: deleting
    [contrail@5a6s13-node1 contrail_cloud]$ ./scripts/storage-nodes-assign.sh
  12. Run openstack-deploy.sh to regenerate the templates to reflect the current state:

    [contrail@5a6s13-node1 contrail_cloud]$ ./scripts/openstack-deploy.sh
  13. If the goal is to remove the bare metal node completely, assign ‘status: deleting’ to the node in the inventory.yml file and run inventory-assign.sh. The node’s information is then also removed from the storage-nodes.yml file:

    [contrail@5a6s13-node1 contrail_cloud]$ cat config/inventory.yml
    ...
    inventory_nodes:
      - name: "storage3"
        pm_addr: "10.84.129.184"
        status: deleting
        <<: *common
    
    
    [contrail@5a6s13-node1 contrail_cloud]$ ./scripts/inventory-assign.sh 
    
    [contrail@5a6s13-node1 contrail_cloud]$ cat config/storage-nodes.yml
    storage_nodes:
      - name: storage1
        profile: blue2
      - name: storage2
        profile: blue2