[MX] Multiple ways to monitor RPD CPU utilization

Article ID: KB33635 KB Last Updated: 01 May 2019 Version: 1.0
Summary:

The Routing Protocol Daemon (RPD) is a critical process that runs the routing protocols and performs best-path selection. During route convergence in a high-scale setup, RPD is expected to consume high CPU. However, if RPD consumes high CPU consistently, further investigation may be required.

This article demonstrates how to monitor RPD CPU utilization in such scenarios.

Note: The same method also applies to other processes, such as mib2d and mgd.

Solution:

To monitor CPU utilization:

From CLI

user@router> show system processes extensive | match rpd
 3035 root             5  20    0  5298M  4374M kqread  3  93.0H  26.76% rpd

Thread-level CPU utilization is available in Junos OS 17.2 and later:
user@router> show system processes extensive threads | match rpd
 3035 root           41    0  5298M  4374M kqread  3  86.6H  31.88% rpd 
 3035 root           20    0  5298M  4374M kqread  2 163:52   0.00% rpd{krtio-th}
 3035 root           20    0  5298M  4374M kqread  0  87:19   0.00% rpd{rsvp-io}
 3035 root           20    0  5298M  4374M kqread  1  76:29   0.00% rpd 
 3035 root           20    0  5298M  4374M kqread  3  56:16   0.00% rpd{bgpio-0-th}

From Shell

% top
last pid: 46414;  load averages:  0.53,  0.61,  0.63                           up 63+05:21:09  21:43:06
98 processes:  1 running, 97 sleeping
CPU 0:  7.5% user,  0.0% nice,  5.1% system,  0.0% interrupt, 87.5% idle
CPU 1:  4.3% user,  0.0% nice,  5.1% system,  0.0% interrupt, 90.6% idle
CPU 2:  7.1% user,  0.0% nice,  3.1% system,  0.4% interrupt, 89.4% idle
CPU 3:  7.5% user,  0.0% nice,  2.7% system,  0.0% interrupt, 89.8% idle
Mem: 320M Active, 7997M Inact, 1948M Wired, 179M Cache, 415M Buf, 5449M Free
Swap: 8192M Total, 300K Used, 8192M Free

  PID USERNAME       THR PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
3035 root             5  20    0  5298M  4374M kqread  2  93.0H  15.58% rpd
37582 root             2 -26  r26   825M 30348K nanslp  1  22.5H   3.66% chassisd
3108 root             1  22    0   792M 42136K select  1 262:44   2.29% mib2d
<snip>

Thread-level CPU utilization (17.2+):

% top -H
last pid: 46471;  load averages:  0.61,  0.62,  0.63                           up 63+05:22:48  21:44:45
127 processes: 2 running, 125 sleeping
CPU 0:     % user,     % nice,     % system,     % interrupt,     % idle
CPU 1:     % user,     % nice,     % system,     % interrupt,     % idle
CPU 2:     % user,     % nice,     % system,     % interrupt,     % idle
CPU 3:     % user,     % nice,     % system,     % interrupt,     % idle
Mem: 343M Active, 7997M Inact, 1949M Wired, 179M Cache, 416M Buf, 5425M Free
Swap: 8192M Total, 300K Used, 8192M Free

  PID USERNAME      PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
3035 root           40    0  5298M  4374M CPU0    0  86.6H  31.49% rpd 
64282 remote         29    0   768M 30936K ttyout  3  52:31   9.18% cli
64283 root           25    0  1393M 49532K sbwait  1  30:12   4.98% mgd
<snip>

Process/thread CPU utilization sampled every <x> seconds, <y> times (top -Hb -s <x> -d <y>). The example below samples every 1 second, 2 times:

% top -Hb -s 1 -d 2
last pid: 46807;  load averages:  0.54,  0.61,  0.62  up 63+05:33:25    21:55:22
128 processes: 1 running, 127 sleeping

Mem: 346M Active, 7998M Inact, 1950M Wired, 179M Cache, 422M Buf, 5420M Free
Swap: 8192M Total, 300K Used, 8192M Free

  PID USERNAME      PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
3035 root           38    0  5298M  4374M kqread  1  86.6H  18.90% rpd 
37582 root            4    0   825M 30348K select  0  22.4H   4.05% chassisd 
3085 root           21    0   722M 11020K select  3  99:51   1.37% xmlproxyd
3104 root           20    0   774M 26476K select  2 296:56   0.29% snmpd
3138 root           22    0  1393M 50196K select  1  29:40   0.20% mgd
40297 root           20    0   754M 16764K select  1  15:44   0.20% jsd 
3108 root           20    0   792M 42136K select  3 262:52   0.10% mib2d
3035 root           20    0  5298M  4374M kqread  2  76:31   0.10% rpd 
40297 root           20    0   754M 16764K select  0  15:50   0.10% jsd 
13647 root           20    0 52936K  8980K select  2  13:05   0.10% na-grpcd{na-grpcd}
13647 root           20    0 52936K  8980K select  3  12:56   0.10% na-grpcd{na-grpcd}
3038 root           35   15   729M  9716K select  2 244:31   0.00% sampled
3035 root           20    0  5298M  4374M kqread  1 163:52   0.00% rpd{krtio-th}
37579 root           20    0   725M  8940K select  3  99:58   0.00% clksyncd
3035 root           20    0  5298M  4374M kqread  3  87:22   0.00% rpd{rsvp-io}
6312 daemon         20    0 22736K  1892K select  0  71:39   0.00% mosquitto-nossl
3035 root           20    0  5298M  4374M kqread  0  56:16   0.00% rpd{bgpio-0-th}
6273 root           20    0   721M  7452K select  0  54:44   0.00% eventd

last pid: 46808;  load averages:  0.54,  0.61,  0.62  up 63+05:33:26    21:55:23
128 processes: 2 running, 126 sleeping
CPU 0: 14.1% user,  0.0% nice,  0.8% system,  0.8% interrupt, 84.4% idle
CPU 1: 28.9% user,  0.0% nice,  0.8% system,  0.8% interrupt, 69.5% idle
CPU 2:  2.3% user,  0.0% nice,  0.8% system,  0.8% interrupt, 96.1% idle
CPU 3:  8.6% user,  0.0% nice,  2.3% system,  0.0% interrupt, 89.1% idle
Mem: 346M Active, 7998M Inact, 1950M Wired, 179M Cache, 422M Buf, 5419M Free
Swap: 8192M Total, 300K Used, 8192M Free

  PID USERNAME      PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
3035 root           40    0  5298M  4374M CPU2    2  86.6H  21.09% rpd 
37582 root            4    0   825M 30348K select  0  22.4H   3.66% chassisd 
3085 root           21    0   722M 11020K select  1  99:51   1.46% xmlproxyd
3104 root           20    0   774M 26476K select  0 296:56   0.29% snmpd
3108 root            4    0   792M 42136K select  3 262:52   0.20% mib2d
3138 root           21    0  1393M 50196K select  0  29:40   0.20% mgd
40297 root           20    0   754M 16764K select  0  15:44   0.20% jsd 
40297 root           20    0   754M 16764K select  3  15:50   0.10% jsd 
13647 root           20    0 52936K  8980K select  1  13:05   0.10% na-grpcd{na-grpcd}
13647 root           20    0 52936K  8980K select  3  12:56   0.10% na-grpcd{na-grpcd}
46807 remote         20    0 25080K  3140K CPU1    1   0:00   0.10% top
3038 root           35   15   729M  9716K select  1 244:31   0.00% sampled
3035 root           20    0  5298M  4374M kqread  0 163:52   0.00% rpd{krtio-th}
37579 root           20    0   725M  8940K select  1  99:58   0.00% clksyncd
3035 root           20    0  5298M  4374M kqread  0  87:22   0.00% rpd{rsvp-io}
3035 root           20    0  5298M  4374M kqread  3  76:31   0.00% rpd 
6312 daemon         20    0 22736K  1892K select  3  71:39   0.00% mosquitto-nossl
3035 root           20    0  5298M  4374M kqread  1  56:16   0.00% rpd{bgpio-0-th}
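When batch output such as the above is saved to a file, it can help to pull out only the RPD lines and total them per sample. The following Python sketch does that for captured `top -Hb` text. The function name `rpd_thread_cpu` and the parsing rules are assumptions made for illustration (they are not part of Junos): each sample is assumed to begin with a "last pid:" line, and the last two columns are assumed to be WCPU and COMMAND, as in the output shown above.

```python
def rpd_thread_cpu(top_output, process="rpd"):
    """Return one list of (command, wcpu_percent) tuples per top sample.

    Hypothetical helper: assumes each sample starts with a 'last pid:' line
    and that the last two columns of a process line are WCPU and COMMAND.
    """
    snapshots = []
    for line in top_output.splitlines():
        if line.lstrip().startswith("last pid:"):
            snapshots.append([])          # a new sample begins here
            continue
        tokens = line.split()
        if not snapshots or len(tokens) < 2:
            continue
        command, wcpu = tokens[-1], tokens[-2]
        # 'rpd{krtio-th}' and plain 'rpd' both belong to the rpd process
        if command.split("{", 1)[0] == process and wcpu.endswith("%"):
            snapshots[-1].append((command, float(wcpu[:-1])))
    return snapshots


# A few lines copied from the two samples shown above
sample = """\
last pid: 46807;  load averages:  0.54,  0.61,  0.62  up 63+05:33:25    21:55:22
  PID USERNAME      PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
3035 root           38    0  5298M  4374M kqread  1  86.6H  18.90% rpd
37582 root            4    0   825M 30348K select  0  22.4H   4.05% chassisd
3035 root           20    0  5298M  4374M kqread  2  76:31   0.10% rpd
3035 root           20    0  5298M  4374M kqread  1 163:52   0.00% rpd{krtio-th}
last pid: 46808;  load averages:  0.54,  0.61,  0.62  up 63+05:33:26    21:55:23
3035 root           40    0  5298M  4374M CPU2    2  86.6H  21.09% rpd
37582 root            4    0   825M 30348K select  0  22.4H   3.66% chassisd
3035 root           20    0  5298M  4374M kqread  0 163:52   0.00% rpd{krtio-th}
"""

snapshots = rpd_thread_cpu(sample)
for number, threads in enumerate(snapshots, start=1):
    total = sum(wcpu for _, wcpu in threads)
    print(f"sample {number}: {len(threads)} rpd threads, total WCPU {total:.2f}%")
```

Passing a different `process` value (for example `"mgd"`) applies the same filter to another daemon.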

Using SNMP

  • Get the RPD process OID.

user@router> show snmp mib walk sysApplElmtRunName | match rpd
sysApplElmtRunName.5.5.3035 = /usr/libexec64/rpd

  • Pull the RPD CPU time counter.

user@router> show snmp mib get sysApplElmtRunCPU.5.5.3035
sysApplElmtRunCPU.5.5.3035 = 33511535

The above output shows the total CPU time consumed by RPD, in centiseconds (hundredths of a second). To get the utilization as a percentage, poll the value repeatedly at a known interval. The difference between two consecutive values, converted to seconds (divided by 100), then divided by the polling interval in seconds and multiplied by 100, is the utilization in percent. For example:

user@router> show snmp mib get sysApplElmtRunCPU.5.5.3035 | refresh 10
---(refreshed at 2018-12-19 22:13:10 PST)---
sysApplElmtRunCPU.5.5.3035 = 33521713
---(refreshed at 2018-12-19 22:13:20 PST)---
sysApplElmtRunCPU.5.5.3035 = 33521865

CPU utilization = ((33521865 - 33521713) / 100) / 10 * 100 = 15.2%
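
The same arithmetic can be applied to saved `| refresh` output. The Python sketch below is illustrative only: the function name `cpu_percent_from_refresh` and the regular expressions are assumptions based on the output format shown above. It converts each counter difference from centiseconds to seconds and divides by the elapsed time between the two refresh timestamps. The time zone label is ignored, so all samples are assumed to be taken in the same time zone, and counter wrap is not handled.

```python
import re
from datetime import datetime

# Patterns matched against the '| refresh' output format shown above (assumed)
STAMP = re.compile(r"refreshed at (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")
VALUE = re.compile(r"sysApplElmtRunCPU\.[\d.]+\s*=\s*(\d+)")


def cpu_percent_from_refresh(text):
    """Return CPU utilization (%) for each pair of consecutive samples."""
    samples = []
    stamp = None
    for line in text.splitlines():
        match = STAMP.search(line)
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            continue
        match = VALUE.search(line)
        if match and stamp is not None:
            samples.append((stamp, int(match.group(1))))
            stamp = None

    percentages = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        elapsed = (t1 - t0).total_seconds()
        if elapsed <= 0:
            continue                       # skip unusable pairs
        cpu_seconds = (c1 - c0) / 100.0    # centiseconds -> seconds
        percentages.append(cpu_seconds / elapsed * 100.0)
    return percentages


captured = """\
---(refreshed at 2018-12-19 22:13:10 PST)---
sysApplElmtRunCPU.5.5.3035 = 33521713
---(refreshed at 2018-12-19 22:13:20 PST)---
sysApplElmtRunCPU.5.5.3035 = 33521865
"""

for percent in cpu_percent_from_refresh(captured):
    print(f"{percent:.1f}%")   # prints 15.2%
```

Using the timestamps from the output, rather than the nominal refresh interval, keeps the result accurate if a refresh is delayed.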

 
