
[CSO] CSO 5.1 service pods in evicted state in on-prem deployment


Article ID: KB35843 | Last Updated: 03 Jun 2020 | Version: 1.0
Summary:

Contrail Service Orchestration (CSO) is a fully distributed, Docker container–based microservices architecture that consists of several infrastructure services and microservices deployed across central and regional nodes. The CSO microservices are packaged as Docker containers, orchestrated by Kubernetes, and grouped into pods.

Sometimes these pods are seen in the Evicted state.

This article explains why pods may enter the Evicted state and what must be done to resolve the issue.

 

Symptoms:

You may observe that the pods hosting the microservices are in the Evicted state:

# kubectl get pods -n central | grep Evic
csp.csp-cslm-signature-ims-central-74c7446f97-jm5gn 3/3 Running 0 12d
csp.csp-cslm-signature-ims-central-core-758896b9d-2878z 0/3 Evicted 0 6d22h
csp.csp-cslm-signature-ims-central-core-758896b9d-2fcmg 0/3 Evicted 0 8d
csp.csp-cslm-signature-ims-central-core-758896b9d-2j2xv 0/3 Evicted 0 68m
csp.csp-cslm-signature-ims-central-core-758896b9d-2mz4k 0/3 Evicted 0 81m
csp.csp-cslm-signature-ims-central-core-758896b9d-2ndzx 0/3 Evicted 0 8d
csp.csp-cslm-signature-ims-central-core-758896b9d-2nk8q 0/3 Evicted 0 81m
csp.csp-cslm-signature-ims-central-core-758896b9d-2p78w 0/3 Evicted 0 8d
csp.csp-cslm-signature-ims-central-core-758896b9d-2q47q 0/3 Evicted 0 6d22h
csp.csp-cslm-signature-ims-central-core-758896b9d-2vqjq 0/3 Evicted 0 8d
csp.csp-cslm-signature-ims-central-core-758896b9d-46zg2 0/3 Evicted 0 8d
csp.csp-cslm-signature-ims-central-core-758896b9d-47z62 0/3 Evicted 0 10m
 
 
Cause:

The above output is seen when no node in the cluster has enough resources to run the pod.

Use kubectl get events to check the events occurring in the cluster and determine the cause of the evictions:

root@k8-microservices1:# kubectl get events -n central
 
11m Normal Scheduled pod/csp.csp-cslm-signature-ims-central-core-758896b9d-6vrw4 Successfully assigned central/csp.csp-cslm-signature-ims-central-core-758896b9d-6vrw4 to k8-microservices1
11m Warning Evicted pod/csp.csp-cslm-signature-ims-central-core-758896b9d-6vrw4 The node had condition: [DiskPressure].
11m Normal Scheduled pod/csp.csp-cslm-signature-ims-central-core-758896b9d-7bbxf Successfully assigned central/csp.csp-cslm-signature-ims-central-core-758896b9d-7bbxf to k8-microservices1
18m Warning Evicted pod/csp.csp-cslm-signature-ims-central-core-758896b9d-9zknc The node was low on resource: ephemeral-storage. Container ims-central-core was using 132Ki, which exceeds its request of 0. Container signature-manager-core was using 727804Ki, which exceeds its request of 0. Container cslm-core was using 196Ki, which exceeds its request of 0.
18m Normal Killing pod/csp.csp-cslm-signature-ims-central-core-758896b9d-9zknc Stopping container cslm-core
18m Normal Killing pod/csp.csp-cslm-signature-ims-central-core-758896b9d-9zknc Stopping container ims-central-core
18m Normal Killing pod/csp.csp-cslm-signature-ims-central-core-758896b9d-9zknc Stopping container signature-manager-core
11m Normal Scheduled pod/csp.csp-cslm-signature-ims-central-core-758896b9d-bttm7 Successfully assigned central/csp.csp-cslm-signature-ims-central-core-758896b9d-bttm7 to k8-microservices1
11m Warning Evicted pod/csp.csp-cslm-signature-ims-central-core-758896b9d-bttm7 The node had condition: [DiskPressure].
11m Normal Scheduled pod/csp.csp-cslm-signature-ims-central-core-758896b9d-c2pvp Successfully assigned central/csp.csp-cslm-signature-ims-central-core-758896b9d-c2pvp to k8-microservices1
11m Warning Evicted pod/csp.csp-cslm-signature-ims-central-core-758896b9d-c2pvp The node had condition: [DiskPressure].
18m Normal Scheduled pod/csp.csp-cslm-signature-ims-central-core-758896b9d-dcsgd Successfully assigned central/csp.csp-cslm-signature-ims-central-core-758896b9d-dcsgd to k8-microservices1
18m Warning Evicted pod/csp.csp-cslm-signature-ims-central-core-758896b9d-dcsgd The node had condition: [DiskPressure].
11m Normal Scheduled pod/csp.csp-cslm-signature-ims-central-core-758896b9d-dlvc6 Successfully assigned central/csp.csp-cslm-signature-ims-central-core-758896b9d-dlvc6 to k8-microservices1
11m Warning Evicted pod/csp.csp-cslm-signature-ims-central-core-758896b9d-dlvc6 The node had condition: [DiskPressure].
18m Normal Scheduled pod/csp.csp-cslm-signature-ims-central-core-758896b9d-gmmml Successfully assigned central/csp.csp-cslm-signature-ims-central-core-758896b9d-gmmml to k8-microservices1
18m Warning Evicted pod/csp.csp-cslm-signature-ims-central-core-758896b9d-gmmml The node had condition: [DiskPressure].
18m Normal Scheduled pod/csp.csp-cslm-signature-ims-central-core-758896b9d-gz2jh Successfully assigned central/csp.csp-cslm-signature-ims-central-core-758896b9d-gz2jh to k8-microservices1
18m Warning Evicted pod/csp.csp-cslm-signature-ims-central-core-758896b9d-gz2jh The node had condition: [DiskPressure].
11m Warning Evicted pod/csp.csp-cslm-signature-ims-central-core-758896b9d-zn7lb The node was low on resource: ephemeral-storage. Container signature-manager-core was using 471432Ki, which exceeds its request of 0. Container cslm-core was using 184Ki, which exceeds its request of 0. Container ims-central-core was using 132Ki, which exceeds its request of 0.

The DiskPressure and ephemeral-storage messages above clearly indicate a disk space issue; freeing up space on the node resolves the problem.

 

Solution:

Perform the following steps to troubleshoot and resolve the issue:

  1. Log in to the Kubernetes node and run df -h to check the space used.

root@k8-microservices1:~# df -h | more
Filesystem      Size  Used Avail Use% Mounted on
udev             32G     0   32G   0% /dev
tmpfs           6.3G  719M  5.6G  12% /run
/dev/vda1        97G   82G   16G  84% /  --------------> root is 84% utilized
tmpfs            32G   16K   32G   1% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs            32G     0   32G   0% /sys/fs/cgroup
/dev/loop0       96G   17G   80G  18% /srv/node/swiftstorage1
/dev/vdb1       493G  3.5G  464G   1% /mnt/data

Utilization of the root filesystem should be below 70%. At 84%, it is clear that space must be freed before the evicted pods can be rescheduled.

Note: Reach out to Technical Support for assistance if required while freeing up space.
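The 70% guideline can be checked with a short script. The following is a minimal sketch, not part of CSO; the check_usage function name, the default threshold, and the parsing of df -P output are illustrative.

```shell
#!/bin/sh
# Sketch: warn when a filesystem's usage exceeds a threshold (default 70%,
# per the guideline above). Parses `df -P` output; illustrative only.
check_usage() {
    mount_point="$1"; threshold="${2:-70}"
    # -P forces POSIX single-line output; column 5 is the Use% value
    pct=$(df -P "$mount_point" | awk 'NR==2 { gsub(/%/, "", $5); print $5 }')
    if [ "$pct" -gt "$threshold" ]; then
        echo "WARNING: $mount_point at ${pct}% (threshold ${threshold}%)"
    else
        echo "OK: $mount_point at ${pct}%"
    fi
}

check_usage /
```

In the scenario above, check_usage / would warn at 84% before the fix and report OK at 51% afterwards.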

From the output below, you can see that the swiftstorage1 file under /opt has consumed 96G of disk space. swiftstorage is a temporary location where application signatures are stored when you download them via the GUI.

root@k8-microservices1:/opt# ls -ltrh
total 17G
drwx--x--x 4 root root 4.0K Feb 19 14:53 containerd
drwxr-xr-x 3 root root 4.0K Feb 19 14:53 cni
drwxrwxrwx 3 root root 4.0K Feb 19 14:54 plugin
drwxrwxrwx 7 root root 4.0K Apr 30 11:03 csp
-rw-r--r-- 1 root root  96G May 12 17:48 swiftstorage1
root@k8-microservices1:/opt#
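If the space consumer is not already known, the largest entries under a directory can be listed and sorted. This is a minimal sketch; the largest_entries helper name is illustrative.

```shell
#!/bin/sh
# Sketch: list the largest files and directories under a path, largest
# first. This is how a file such as /opt/swiftstorage1 can be found quickly.
largest_entries() {
    path="$1"; count="${2:-10}"
    # du -a includes files as well as directories; sizes are in KiB
    du -a "$path" 2>/dev/null | sort -rn | head -n "$count"
}

# Example (illustrative): largest_entries /opt 5
```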
  2. Move the swiftstorage1 file to /mnt/data, which is shown to have 464G of free space in the df -h output above.

Note: We are not deleting the file; we are moving it to a different location to prevent any data loss.
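The move itself is a single mv command, sketched here as a small helper. The relocate name is illustrative, and the paths in the comment come from the df and ls output above; adjust them to your environment.

```shell
#!/bin/sh
# Sketch: relocate a large file to a filesystem with free space instead of
# deleting it, so the data is preserved.
relocate() {
    src="$1"; dest_dir="$2"
    mv -- "$src" "$dest_dir"/ && echo "moved $src -> $dest_dir/"
}

# In this scenario (paths from the article's output):
#   relocate /opt/swiftstorage1 /mnt/data
```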

  3. Run df -h again to check space usage.

root@k8-microservices1:/# df -h | more
Filesystem      Size  Used Avail Use% Mounted on
udev             32G     0   32G   0% /dev
tmpfs           6.3G  719M  5.6G  12% /run
/dev/vda1        97G   50G   48G  51% /  ========> utilization reduced
tmpfs            32G   16K   32G   1% /dev/shm
tmpfs           5.0M     0  5.0M   0% /run/lock
tmpfs            32G     0   32G   0% /sys/fs/cgroup
/dev/loop0       96G   17G   80G  18% /srv/node/swiftstorage1
/dev/vdb1       493G   36G  432G   8% /mnt/data

As seen above, root filesystem utilization has dropped from 84% to 51%.

  4. After freeing up space, delete the evicted pods by using the following command:

Note: Do not delete pods from any namespace other than central and regional. Reach out to Technical Support if you have questions.

kubectl get pods -n central | grep -i evicted | awk '{print $1}' | xargs kubectl delete pods -n central
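The name-extraction stage of the pipeline above can be exercised on sample output without touching a cluster. The evicted_names helper is illustrative; the sample lines mirror the Symptoms section.

```shell
#!/bin/sh
# Sketch: the grep/awk stage that picks pod names out of `kubectl get pods`
# output lines whose STATUS column is Evicted (case-insensitive).
evicted_names() {
    grep -i evicted | awk '{ print $1 }'
}

printf '%s\n' \
  'csp.csp-cslm-signature-ims-central-74c7446f97-jm5gn 3/3 Running 0 12d' \
  'csp.csp-cslm-signature-ims-central-core-758896b9d-2878z 0/3 Evicted 0 6d22h' \
  | evicted_names
# prints only: csp.csp-cslm-signature-ims-central-core-758896b9d-2878z
```

Piping the names to xargs kubectl delete pods, as in the command above, then removes only the evicted pods.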
  5. Confirm that there are no pods in Evicted status:

# kubectl get pods -n central
csp.csp-reporting-service-69b9f8f459-xjjvf                1/1     Running   0          12d
csp.csp-schema-svc-5cc6d757cd-l97j6                       1/1     Running   0          12d
csp.csp-sse-597955c6c6-fgrmr                              1/1     Running   0          12d
csp.csp-topology-service-85dc6dcbcb-j6h4h                 1/1     Running   0          12d
csp.csp-topology-service-85dc6dcbcb-qs6dc                 1/1     Running   0          12d
csp.csp-topology-service-85dc6dcbcb-ss2fv                 1/1     Running   0          12d
csp.csp-topology-service-core-5b986b649b-26c4w            1/1     Running   0          12d
csp.csp-tssm-85dd95655-8f62m                              1/1     Running   0          12d
csp.csp-tssm-core-5d5766b7b8-j88js                        1/1     Running   0          12d
csp.csp-vim-66845b8db5-wdq5c                              1/1     Running   0          12d
csp.csp-vim-core-845c7d55f9-7rthm                         1/1     Running   0          12d
csp.ne-5b4587f8f6-lrppn                                   1/1     Running   0          12d
csp.ne-core-cfc99c7-mvsv7                                 1/1     Running   0          12d
csp.nsd-ui-799585464c-55xzr                               1/1     Running   0          12d
csp.sd-security-objects-697cbcd96c-n4cl6                  1/1     Running   0          12d
csp.secmgt-jingest-5b9c97bf9-mb22x                        1/1     Running   0          12d

 
