Understanding memory usage on a Juniper MX device

Article ID: KB36381 KB Last Updated: 25 Dec 2020 Version: 1.0
Summary:

This article explains memory usage on a Juniper MX device and how to troubleshoot memory-related issues.

Solution:

Memory-related issues on a Juniper device can trigger various performance-related problems in the network. To troubleshoot the underlying problem, it is important to understand the various components of physical memory.

First, run 'show chassis routing-engine' to verify the current usage. Run the command several times, about one minute apart, to confirm that high memory usage is sustained rather than momentary:

show chassis routing-engine 
Routing Engine status:
  Slot 0:
    Current state                  Master
    Election priority              Master (default)
    Temperature                 32 degrees C / 89 degrees F
    CPU temperature             31 degrees C / 87 degrees F
    DRAM                      16320 MB (16384 MB installed)  <-- This is the total physical memory available: 16GB
    Memory utilization           8 percent  <-- Current usage
    5 sec CPU utilization:
      User                       2 percent
      Background                 0 percent
      Kernel                     1 percent
      Interrupt                  0 percent
      Idle                      97 percent
    1 min CPU utilization:
      User                       0 percent
      Background                 0 percent
      Kernel                     1 percent
      Interrupt                  0 percent
      Idle                      99 percent
    5 min CPU utilization:
      User                       0 percent
      Background                 0 percent
      Kernel                     1 percent
      Interrupt                  0 percent
      Idle                      99 percent
    15 min CPU utilization:
      User                       0 percent
      Background                 0 percent
      Kernel                     1 percent
      Interrupt                  0 percent
      Idle                      99 percent
    Model                          RE-S-1800x4
    Serial ID                      9009094177
    Start time                     2020-12-02 15:53:37 PST
    Uptime                         13 days, 3 hours, 9 minutes, 41 seconds
    Last reboot reason             Router rebooted after a normal shutdown.
    Load averages:                 1 minute   5 minute  15 minute
                                       0.40       0.26       0.24
Routing Engine status:
  Slot 1:
    Current state                  Backup
    Election priority              Backup (default)
    Temperature                 32 degrees C / 89 degrees F
    DRAM                      16320 MB
    Memory utilization           6 percent
    5 sec CPU utilization:
      User                       0 percent
      Background                 0 percent
      Kernel                     0 percent
      Interrupt                  0 percent
      Idle                     100 percent
    Model                          RE-S-1800x4
    Serial ID                      9009102182
    Start time                     2020-12-02 16:02:36 PST
    Uptime                         13 days, 3 hours, 40 seconds
    Last reboot reason             Router rebooted after a normal shutdown.
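
The repeated check described above can be scripted once the command output has been collected as text. The following is a minimal sketch, not a Juniper tool: the function names and the 80 percent threshold are assumptions chosen for illustration, and the parser relies only on the 'Slot' and 'Memory utilization' lines shown in the output above.

```python
import re

def memory_utilization(output: str) -> dict[int, int]:
    """Map each Routing Engine slot to its 'Memory utilization' percentage."""
    usage = {}
    slot = None
    for line in output.splitlines():
        m = re.match(r"\s*Slot (\d+):", line)
        if m:
            slot = int(m.group(1))
            continue
        m = re.match(r"\s*Memory utilization\s+(\d+) percent", line)
        if m and slot is not None:
            usage[slot] = int(m.group(1))
    return usage

def sustained_high(samples: list[dict[int, int]], threshold: int = 80) -> set[int]:
    """Return slots at or above the threshold in every sample.

    The threshold is a hypothetical value, not a Juniper recommendation.
    """
    if not samples:
        return set()
    slots = set(samples[0])
    for sample in samples[1:]:
        slots &= set(sample)
    return {s for s in slots if all(sample[s] >= threshold for sample in samples)}
```

Each element of samples would be the result of memory_utilization() on one capture of the command, taken about a minute apart. A slot is reported only if it stays high in every capture, which filters out short spikes.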

If the 'show chassis routing-engine' output shows sustained high utilization, the next step is to identify the process consuming the most memory:

show system processes extensive 
last pid: 80591;  load averages:  0.22,  0.23,  0.23  up 13+03:11:42    19:05:19
370 processes: 5 running, 337 sleeping, 28 waiting

Mem: 99M Active, 3709M Inact, 909M Wired, 479M Buf, 11G Free
Swap: 8192M Total, 8192M Free

  PID USERNAME PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
   11 root     155 ki31     0K    64K CPU2    2 311.2H 100.00% idle{idle: cpu2}
   11 root     155 ki31     0K    64K CPU1    1 311.2H 100.00% idle{idle: cpu1}
   11 root     155 ki31     0K    64K CPU3    3 311.2H 100.00% idle{idle: cpu3}
   11 root     155 ki31     0K    64K RUN     0 310.7H  99.37% idle{idle: cpu0}
10410 root       4    0  1495M   701M select  1 722:58   2.59% chassisd 
10431 root      20    0   727M 13272K select  0  32:53   0.00% clksyncd
10551 root      20    0   900M 44620K nanslp  2  16:39   0.00% rep-serverd
10550 root      20    0   900M 44616K nanslp  0  16:24   0.00% rep-clientd
10416 root      20    0   800M 63536K select  1  15:48   0.00% mib2d

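The 'show system processes extensive' command shows both the memory usage and the CPU cycles consumed by each process. For memory troubleshooting, focus on the SIZE and RES columns; the TIME and WCPU columns describe CPU usage.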
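
To find the largest memory consumers in a long process list, the table can be parsed and sorted by the RES column. This is a minimal sketch that assumes the eleven-column layout shown in the sample output above; the function names are illustrative and not part of Junos.

```python
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}

def to_bytes(text: str) -> int:
    """Convert a size such as '701M' or '13272K' to bytes."""
    if text[-1] in UNITS:
        return int(float(text[:-1]) * UNITS[text[-1]])
    return int(text)

def parse_processes(output: str) -> list[dict]:
    """Parse the rows that follow the 'PID USERNAME ...' header line."""
    rows = []
    in_table = False
    for line in output.splitlines():
        # COMMAND may contain spaces, so split into at most 11 fields.
        fields = line.split(None, 10)
        if fields[:2] == ["PID", "USERNAME"]:
            in_table = True
            continue
        if in_table and len(fields) == 11 and fields[0].isdigit():
            rows.append({
                "pid": int(fields[0]),
                "size": to_bytes(fields[4]),
                "res": to_bytes(fields[5]),
                "wcpu": float(fields[9].rstrip("%")),
                "command": fields[10].strip(),
            })
    return rows

def top_by_res(rows: list[dict], n: int = 3) -> list[dict]:
    """Return the n rows with the largest resident size."""
    return sorted(rows, key=lambda r: r["res"], reverse=True)[:n]
```

Applied to the sample output above, top_by_res() would place chassisd first, since its 701M resident size is far larger than that of any other listed process.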

Understanding the output: 
last pid: 80591;  load averages:  0.22,  0.23,  0.23  up 13+03:11:42    19:05:19
370 processes: 5 running, 337 sleeping, 28 waiting

Mem: 99M Active, 3709M Inact, 909M Wired, 479M Buf, 11G Free
Swap: 8192M Total, 8192M Free

The first few lines of the output show the total number of processes, both active and inactive (sleeping).
The Mem line breaks down the total available memory into active, inactive, wired, buffer, and free. Adding all of these components gives approximately the 16 GB of total memory available.

  • Active: Memory that is allocated and actively used by programs.
  • Inactive: Either memory that is allocated but not recently used or memory that was freed by programs. Inactive memory is still mapped in the address space of one or more processes and, therefore, counts toward the resident set size of those processes.
  • Wired: Memory that is not eligible to be swapped, usually used for kernel memory structures and/or memory physically locked by a process.
  • Buffer: Size of the memory buffer used to hold data recently read from disk.
  • Free: Completely free memory not associated with any programs.

Swap is virtual memory that comes into use under memory pressure. Swapped memory pages are stored on disk and are used by a process when no physical memory is left to allocate. Access to these pages adds latency to the process, so swap usage is never a good sign for the memory health of the system.
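
The Mem and Swap header lines can be checked with a few lines of code. The sketch below is an illustration only: the function names are assumptions, and it expects the 'number, unit suffix, label' format shown in the sample output. It adds up the Mem components and computes swap in use as Total minus Free. Because the displayed figures are rounded (for example, 11G), the Mem total is only approximate.

```python
import re

# Multipliers that convert K/M/G suffixes to megabytes.
UNITS_MB = {"K": 1 / 1024, "M": 1, "G": 1024}

def parse_header_line(line: str, prefix: str) -> dict:
    """Parse 'Mem: 99M Active, ...' or 'Swap: 8192M Total, ...' into MB per label."""
    if not line.startswith(prefix):
        raise ValueError(f"expected a line starting with {prefix!r}")
    pairs = re.findall(r"(\d+)([KMG]) (\w+)", line)
    return {label: int(num) * UNITS_MB[unit] for num, unit, label in pairs}

def total_mb(mem_line: str):
    """Sum every component listed on the Mem line."""
    return sum(parse_header_line(mem_line, "Mem:").values())

def swap_used_mb(swap_line: str):
    """Swap in use, computed as Total minus Free."""
    swap = parse_header_line(swap_line, "Swap:")
    return swap["Total"] - swap["Free"]

mem = "Mem: 99M Active, 3709M Inact, 909M Wired, 479M Buf, 11G Free"
swap = "Swap: 8192M Total, 8192M Free"
print(total_mb(mem))       # 16460
print(swap_used_mb(swap))  # 0
```

A non-zero result from swap_used_mb() means that pages have been moved to disk, which is the memory-pressure condition described above.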

The kernel is the core component of the operating system and is responsible for assigning resources to the various processes. It uses the pageout daemon to scan current memory usage and free up memory based on the needs of the processes. When a process requests memory pages, the pageout daemon follows the sequence below:

Free memory >> Cache memory >> Inactive memory >> Active memory

Cache memory is freed memory that is not being used by any process and is ready to be reused.

As the sequence shows, the pageout daemon takes active memory away from an application only as a last resort, when no other choice exists.
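
The ordering can be illustrated with a toy model. This is not how the kernel is implemented; it is a simplified sketch with assumed names and page counts, showing only that a request is served from the free pool first and touches active memory last.

```python
# Order in which pools are drawn from, taken from the sequence above.
RECLAIM_ORDER = ["Free", "Cache", "Inactive", "Active"]

def satisfy_request(pools: dict, pages: int) -> dict:
    """Return how many pages would be taken from each pool, in order."""
    taken = {}
    remaining = pages
    for name in RECLAIM_ORDER:
        if remaining == 0:
            break
        grab = min(pools.get(name, 0), remaining)
        if grab:
            taken[name] = grab
            remaining -= grab
    if remaining:
        raise MemoryError(f"{remaining} pages could not be satisfied")
    return taken

# Hypothetical pool sizes, in pages.
pools = {"Free": 100, "Cache": 50, "Inactive": 200, "Active": 300}
print(satisfy_request(pools, 80))   # {'Free': 80}
print(satisfy_request(pools, 180))  # {'Free': 100, 'Cache': 50, 'Inactive': 30}
```

In the second call, the request exceeds the free and cache pools, so the remainder comes from inactive memory while active memory is left untouched.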

PID USERNAME PRI NICE   SIZE    RES STATE   C   TIME    WCPU COMMAND
10410 root       4    0  1495M   701M select  1 722:58   2.59% chassisd 

In the above output, the process ID for chassisd is 10410. SIZE is the total virtual memory size allocated to the process. RES is the resident size of the process in physical memory; it is the sum of the active and inactive memory currently allocated to the process.
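
Because each process has its own PID, one command name that appears under several different PIDs indicates several instances of the same daemon. The sketch below groups rows by command name and reports names seen under more than one PID. The function names are assumptions, and the parser expects the eleven-column layout shown above. Rows that share a PID, such as the per-CPU idle threads, are counted once.

```python
from collections import defaultdict

UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}

def res_bytes(text: str) -> int:
    """Convert a RES value such as '701M' to bytes."""
    if text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)

def group_by_command(table: str) -> dict:
    """Map command name -> {pid: resident bytes} for every process row."""
    groups = defaultdict(dict)
    for line in table.splitlines():
        fields = line.split(None, 10)
        if len(fields) != 11 or not fields[0].isdigit():
            continue  # skip headers, blank lines, and summary lines
        # Drop the '{thread name}' suffix so threads group with their process.
        name = fields[10].split("{")[0].strip()
        groups[name][int(fields[0])] = res_bytes(fields[5])
    return dict(groups)

def repeated_commands(groups: dict) -> dict:
    """Keep only commands that run under more than one PID."""
    return {name: pids for name, pids in groups.items() if len(pids) > 1}
```

Summing the values for one command, for example sum(groups["chassisd"].values()), gives the combined resident size of all of its instances.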

If any process is using a large amount of memory, or if multiple instances of the same daemon are being spawned, troubleshoot further to isolate the reason. In many cases, a daemon such as mosquitto (used for telemetry) is spawned multiple times, each instance with a different process ID. In such cases, a single parent process creates multiple child processes. Try the steps below to clear this issue:

  1. Validate, or check with JTAC, whether it is safe to restart the process. If it is, restart the process and check whether child processes are still being spawned.

  2. In many cases, the processes are spawned on only one RE, sometimes because of a sync issue with the other RE. Enable GRES and NSR, switch over the primary role to the RE that is not spawning the child processes, and reboot the other RE.

  3. If RPD is the process consuming high memory, check the output of 'show task memory'.

    show task memory 
    Memory                 Size (kB)  Percentage  When
      Currently In Use:        47145          0%  now
      Maximum Ever Used:       58716          0%  20/12/02 15:56:35
      Available:            17092010        100%  now

    This shows the current memory utilization of RPD, the highest memory RPD has ever used, and the timestamp at which that peak occurred.

  4. Check the output of 'show task memory detail' for a detailed version of the output.
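
To track RPD memory over time, the three rows of 'show task memory' can be parsed into numbers. The following is a minimal sketch that assumes the row labels and column order shown in the sample output in step 3; the function names are illustrative.

```python
import re

ROW = re.compile(
    r"^\s*(Currently In Use|Maximum Ever Used|Available):"
    r"\s+(\d+)\s+(\d+)%\s+(.*\S)"
)

def parse_task_memory(output: str) -> dict:
    """Map each row label to its size in kB, percentage, and 'When' value."""
    rows = {}
    for line in output.splitlines():
        m = ROW.match(line)
        if m:
            label, kb, percent, when = m.groups()
            rows[label] = {"kb": int(kb), "percent": int(percent), "when": when}
    return rows

def peak_ratio(rows: dict) -> float:
    """Peak usage as a fraction of the 'Available' figure."""
    return rows["Maximum Ever Used"]["kb"] / rows["Available"]["kb"]
```

Comparing the 'Currently In Use' value across several captures shows whether RPD memory is growing, and the 'When' field of 'Maximum Ever Used' gives the time of the peak to correlate with other events.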
