CSO 5.1.1 HA: etcd pod in CrashLoopBackOff state


Article ID: KB35552   KB Last Updated: 06 Mar 2020   Version: 1.0
Summary:
This article describes a known issue in which the etcd pods fail to start after the servers in a CSO 5.1.1 HA environment are rebooted, and provides the recovery steps.
Symptoms:
After the servers are rebooted, the etcd pods remain in the CrashLoopBackOff state:

root@startupserver1:/Contrail_Service_Orchestration_5.1.1# kubectl get pods -n infra | grep etcd
etcd-etcd-0                                    1/1     Running            1          73m
etcd-etcd-1                                    0/1     CrashLoopBackOff   9          135m
etcd-etcd-2                                    0/1     CrashLoopBackOff   8          112m
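If further confirmation is needed, the logs of an affected pod can be inspected (an optional check; this is not part of the captured session above, and the exact error messages may vary):

    kubectl logs -n infra etcd-etcd-1 --previous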
Solution:

Note: The recovery steps below must be performed on the startup server with JTAC assistance:

  • helm delete --purge etcd
root@startupserver1:/Contrail_Service_Orchestration_5.1.1# helm delete --purge etcd .
release "etcd" deleted
Error: invalid release name, must match regex ^(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])+$ and the length must not longer than 53
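The "invalid release name" error in the output above is caused by the stray trailing "." in the captured command and can be ignored; the etcd release itself is deleted. If desired, this can be confirmed before continuing (an optional check):

    helm ls --all | grep etcd        # should return no output once the release is purged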
  • kubectl get pvc -n infra | grep etcd
root@startupserver1:/Contrail_Service_Orchestration_5.1.1# kubectl get pvc -n infra | grep etcd
data-etcd-etcd-0                     Bound    pv-infra-etcd-2                                  8Gi        RWO            localsc        26h
data-etcd-etcd-1                     Bound    pv-infra-etcd                                    8Gi        RWO            localsc        26h
data-etcd-etcd-2                     Bound    pv-infra-etcd-1                                  8Gi        RWO            localsc        26h
  • kubectl delete pvc -n infra data-etcd-etcd-0 data-etcd-etcd-1 data-etcd-etcd-2
root@startupserver1:/Contrail_Service_Orchestration_5.1.1# kubectl delete pvc -n infra data-etcd-etcd-0 data-etcd-etcd-1 data-etcd-etcd-2
persistentvolumeclaim "data-etcd-etcd-0" deleted
persistentvolumeclaim "data-etcd-etcd-1" deleted
persistentvolumeclaim "data-etcd-etcd-2" deleted
  • kubectl get pv -n infra | grep etcd
root@startupserver1:/Contrail_Service_Orchestration_5.1.1# kubectl get pv -n infra | grep etcd
pv-infra-etcd                                    8Gi        RWO            Retain           Released   infra/data-etcd-etcd-1                     localsc                 26h
pv-infra-etcd-1                                  8Gi        RWO            Retain           Released   infra/data-etcd-etcd-2                     localsc                 26h
pv-infra-etcd-2                                  8Gi        RWO            Retain           Released   infra/data-etcd-etcd-0                     localsc                 26h
  • kubectl delete pv -n infra pv-infra-etcd pv-infra-etcd-1 pv-infra-etcd-2
root@startupserver1:/Contrail_Service_Orchestration_5.1.1# kubectl delete pv -n infra pv-infra-etcd pv-infra-etcd-1 pv-infra-etcd-2
persistentvolume "pv-infra-etcd" deleted
persistentvolume "pv-infra-etcd-1" deleted
persistentvolume "pv-infra-etcd-2" deleted
  • salt -C 'G@roles:kubeworker' cmd.run 'rm -rf /mnt/data/etcd/*'
root@startupserver1:/Contrail_Service_Orchestration_5.1.1# salt -C 'G@roles:kubeworker' cmd.run 'rm -rf /mnt/data/etcd/*'
csp-central-k8-microservices3.4NO3VD.central:
csp-central-k8-infra1.4NO3VD.central:
csp-central-k8-infra3.4NO3VD.central:
csp-central-k8-infra2.4NO3VD.central:
csp-central-k8-microservices1.4NO3VD.central:
csp-central-k8-microservices2.4NO3VD.central:
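Because this step permanently removes the etcd data from every kubeworker node, it may be worth confirming that the directories are now empty before recreating the persistent volumes (an optional check, using the same Salt targeting as above):

    salt -C 'G@roles:kubeworker' cmd.run 'ls -la /mnt/data/etcd/'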
  • kubectl --kubeconfig=/root/.kube/config apply -f /opt/charts/localpvc/local_pv_etcd.yaml
  • kubectl --kubeconfig=/root/.kube/config apply -f /opt/charts/localpvc/local_pv_etcd-1.yaml
  • kubectl --kubeconfig=/root/.kube/config apply -f /opt/charts/localpvc/local_pv_etcd-2.yaml
     
root@startupserver1:/Contrail_Service_Orchestration_5.1.1# kubectl --kubeconfig=/root/.kube/config apply -f /opt/charts/localpvc/local_pv_etcd.yaml
persistentvolume/pv-infra-etcd created
root@startupserver1:/Contrail_Service_Orchestration_5.1.1#  kubectl --kubeconfig=/root/.kube/config apply -f /opt/charts/localpvc/local_pv_etcd-1.yaml
persistentvolume/pv-infra-etcd-1 created
root@startupserver1:/Contrail_Service_Orchestration_5.1.1# kubectl --kubeconfig=/root/.kube/config apply -f /opt/charts/localpvc/local_pv_etcd-2.yaml
persistentvolume/pv-infra-etcd-2 created
  • kubectl get pv -n infra | grep etcd
root@startupserver1:/Contrail_Service_Orchestration_5.1.1#  k get pv -n infra | grep etcd
pv-infra-etcd                                    8Gi        RWO            Retain           Available                                              localsc                 31s
pv-infra-etcd-1                                  8Gi        RWO            Retain           Available                                              localsc                 18s
pv-infra-etcd-2                                  8Gi        RWO            Retain           Available                                              localsc                 9s
  • cd /opt/charts/etcd && helm install --name etcd --namespace infra .
root@startupserver1:/Contrail_Service_Orchestration_5.1.1# cd /opt/charts/etcd && helm install --name etcd --namespace infra .
NAME:   etcd
LAST DEPLOYED: Thu Mar  5 00:24:13 2020
NAMESPACE: infra
STATUS: DEPLOYED
 
RESOURCES:
==> v1/Service
NAME                TYPE          CLUSTER-IP     EXTERNAL-IP  PORT(S)                        AGE
etcd-etcd-headless  ClusterIP     None           <none>       2379/TCP,2380/TCP              1s
etcd-etcd           LoadBalancer  10.108.27.214  10.0.10.19   2379:31777/TCP,2380:32331/TCP  1s
 
==> v1beta2/StatefulSet
NAME       DESIRED  CURRENT  AGE
etcd-etcd  3        1        1s
 
==> v1/Pod(related)
NAME         READY  STATUS   RESTARTS  AGE
etcd-etcd-0  0/1    Pending  0         0s
 
 
NOTES:
 
-------------------------------------------------------------------------------
 WARNING
 
    By specifying "service.type=LoadBalancer" and "allowNoneAuthentication=true" you
    have most likely exposed the Redis service externally without any authentication
    mechanism.
 
    For security reasons, we strongly suggest that you switch to "ClusterIP" or
    "NodePort". As alternative, you can also switch to "usePassword=true"
    providing a valid password on "password" parameter.
 
-------------------------------------------------------------------------------
 
** Please be patient while the chart is being deployed **
 
etcd can be accessed via port 2379 on the following DNS name from within your cluster:
 
    etcd-etcd.infra.svc.cluster.local
 
To set a key run the following command:
 
    export POD_NAME=$(kubectl get pods --namespace infra -l "app=etcd" -o jsonpath="{.items[0].metadata.name}")
    kubectl exec -it $POD_NAME -- etcdctl set /message Hello
 
To get a key run the following command:
 
    export POD_NAME=$(kubectl get pods --namespace infra -l "app=etcd" -o jsonpath="{.items[0].metadata.name}")
    kubectl exec -it $POD_NAME -- etcdctl get /message
 
To connect to your etcd server from outside the cluster execute the following commands:
 
  NOTE: It may take a few minutes for the LoadBalancer IP to be available.
        Watch the status with: 'kubectl get svc --namespace infra -w etcd-etcd'
 
    export SERVICE_IP=$(kubectl get svc --namespace infra etcd-etcd --template "{{ range (index .status.loadBalancer.ingress 0) }}{{.}}{{ end }}")
    echo "etcd URL: http://$SERVICE_IP:2379/"
  • Copy the attached etcdkey_infra.sls.tgz to the startup server and extract it there using the command: tar -zxvf etcdkey_infra.sls.tgz
  • cp etcdkey_infra.sls  /Contrail_Service_Orchestration_5.1.1/deployments/central/file_root/helm_manager/
  • salt -C "G@roles:helm_manager" state.apply helm_manager.etcdkey_infra saltenv='central'
root@startupserver1:/opt/charts/etcd# salt -C "G@roles:helm_manager" state.apply helm_manager.etcdkey_infra saltenv='central'
csp-central-startupserver1.4NO3VD.central:
----------
          ID: /csp/infra/memcached/servers
    Function: etcd.set
      Result: True
     Comment: New key created
     Started: 00:33:40.228755
    Duration: 30.925 ms
     Changes:
              ----------
              /csp/infra/memcached/servers:
                  memcached.infra.svc.cluster.local
----------
          ID: /secureserver/token
    Function: etcd.set
      Result: True
     Comment: New key created
     Started: 00:33:40.260427
    Duration: 27.456 ms
     Changes:
              ----------
              /secureserver/token:
                  bWCPX8fckXqW1tiTWRCqhQ==
----------
          ID: /saml/server-url
    Function: etcd.set
      Result: True
     Comment: New key created
     Started: 00:33:40.288558
    Duration: 14.812 ms
     Changes:
              ----------
              /saml/server-url:
                  https://cso.combridge.ro
----------
          ID: /saml/admin-portal-url
    Function: etcd.set
      Result: True
     Comment: New key created
     Started: 00:33:40.304070
    Duration: 19.274 ms
     Changes:
              ----------
              /saml/admin-portal-url:
                  https://cso.combridge.ro
----------
          ID: /csp/infra/keystone/host
    Function: etcd.set
      Result: True
     Comment: New key created
     Started: 00:33:40.323984
    Duration: 17.732 ms
     Changes:
              ----------
              /csp/infra/keystone/host:
                  jkeystone.infra.svc.cluster.local
----------
          ID: /csp/infra/redis/tls
    Function: etcd.set
      Result: True
     Comment: New key created
     Started: 00:33:40.342644
    Duration: 22.165 ms
     Changes:
              ----------
              /csp/infra/redis/tls:
                  False
----------
          ID: /csp/infra/redis/authenabled
    Function: etcd.set
      Result: True
     Comment: New key created
     Started: 00:33:40.365584
    Duration: 14.143 ms
     Changes:
              ----------
              /csp/infra/redis/authenabled:
                  False
----------
          ID: /csp/infra/redis/password
    Function: etcd.set
      Result: True
     Comment: New key created
     Started: 00:33:40.380369
    Duration: 19.025 ms
     Changes:
              ----------
              /csp/infra/redis/password:
                  None
----------
          ID: /csp/infra/redis/host
    Function: etcd.set
      Result: True
     Comment: New key created
     Started: 00:33:40.400118
    Duration: 20.983 ms
     Changes:
              ----------
              /csp/infra/redis/host:
                  redis-ha-haproxy.infra.svc.cluster.local
----------
          ID: /csp/infra/redis/port
    Function: etcd.set
      Result: True
     Comment: New key created
     Started: 00:33:40.422254
    Duration: 13.453 ms
     Changes:
              ----------
              /csp/infra/redis/port:
                  6379
----------
          ID: /csp/infra/redis/expiration
    Function: etcd.set
      Result: True
     Comment: New key created
     Started: 00:33:40.436343
    Duration: 12.57 ms
     Changes:
              ----------
              /csp/infra/redis/expiration:
                  900
 
Summary for csp-central-startupserver1.4NO3VD.central
-------------
Succeeded: 11 (changed=11)
Failed:     0
-------------
Total states run:     11
Total run time:  212.538 ms
root@startupserver1:/opt/charts/etcd#
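To spot-check that the keys were written, one of them can be read back from etcd using the same approach shown in the chart notes above (the key path and expected value are taken from the Salt output):

    export POD_NAME=$(kubectl get pods --namespace infra -l "app=etcd" -o jsonpath="{.items[0].metadata.name}")
    kubectl exec -it $POD_NAME -- etcdctl get /csp/infra/redis/port        # expected value: 6379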
  • Find the names of the keystone pods using the command: kubectl get pods -n infra | grep jkeystone
root@startupserver1:/opt/charts/etcd# kubectl get pods -n infra | grep jkeystone*
jkeystone-64f47bf7c7-c7lp8                     0/1     Running            8          90m
jkeystone-64f47bf7c7-l9pl4                     0/1     Running            8          133m
jkeystone-64f47bf7c7-w8r6h                     0/1     CrashLoopBackOff   7          133m
  • Delete the keystone pods using the command: kubectl delete pods -n infra <pod name>
root@startupserver1:/opt/charts/etcd# k delete pods -n infra jkeystone-64f47bf7c7-c7lp8 jkeystone-64f47bf7c7-l9pl4 jkeystone-64f47bf7c7-w8r6h
pod "jkeystone-64f47bf7c7-c7lp8" deleted
pod "jkeystone-64f47bf7c7-l9pl4" deleted
pod "jkeystone-64f47bf7c7-w8r6h" deleted
  • Navigate to the CSO install directory and run the commands below:
  • ./python.sh micro_services/deploy_micro_services.py --deployment_env='central'
  • It is observed that the secmgt-monitoring and secmgt-sm pods need to be deleted manually.
  • Note down the pod names using the commands: kubectl get pods -n central | grep secmgt-monitoring and kubectl get pods -n central | grep secmgt-sm
  • Delete the above pods (secmgt-monitoring and secmgt-sm) using the command: kubectl delete pod -n central <pod name>
  • ./python.sh micro_services/deploy_micro_services.py --deployment_env='regional'
  • Run the component health check to verify the status of etcd (see the example sequence after this list).
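For illustration, the secmgt pod cleanup and the final etcd check look roughly like the following; the pod names shown in angle brackets are placeholders and will differ in each environment:

    kubectl get pods -n central | grep secmgt-monitoring
    kubectl get pods -n central | grep secmgt-sm
    kubectl delete pod -n central <secmgt-monitoring pod name>
    kubectl delete pod -n central <secmgt-sm pod name>
    kubectl get pods -n infra | grep etcd        # all three etcd pods should show 1/1 Running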

(Note: To keep this article concise, the lab output of deploy_micro_services.py is not shown.)
etcdkey_infra_sls.tgz

Getting Up and Running with Junos Security Alerts and Vulnerabilities Product Alerts and Software Release Notices Problem Report (PR) Search Tool EOL Notices and Bulletins JTAC User Guide Customer Care User Guide Pathfinder SRX High Availability Configurator SRX VPN Configurator Training Courses and Videos End User Licence Agreement Global Search