Troubleshooting High CPU on a firewall device

Categories:
Knowledge Base ID: KB9453
Last Updated: 28 Jul 2014
Version: 19.0

Summary:
CPU utilization is extremely high on the Juniper firewall. What is triggering the high CPU condition?

Problem or Goal:
Any packet sent to, passed through, or otherwise processed by the firewall consumes CPU. The firewall starts to experience problems as CPU utilization approaches 85%. The symptoms include:
  • High CPU utilization
  • Poor system or throughput performance
  • OSPF adjacencies or BGP peerings are failing
  • Device management is slower than normal
  • Ping to the management interface times out
  • Firewall is not passing traffic
  • Packet drops
  • The 'in overrun' counter (get counter stat) could increment.

Solution:
To troubleshoot a high CPU situation:

Step 1. Check the CPU utilization

CPU utilization is calculated from two components: Flow and Task. It is defined as the percentage of time the CPU spends processing rather than sitting idle. High CPU utilization means the CPU is busy processing network traffic; it does not by itself mean that the firewall cannot keep up and will start dropping packets. CPU utilization measures the load that traffic places on the firewall, not the throughput of the device.

Note: On all firewall appliances (NetScreen-5, 25, 50, 204, 208, and SSG Series), a single CPU handles all processing. ASIC-based hardware firewalls (NS-5000 and ISG devices) have two CPUs: one dedicated to flow and the other dedicated to task.

The CLI command get perf cpu detail shows an overview of CPU utilization: the overall average, followed by per-second averages for the last 60 seconds, per-minute averages for the last 60 minutes, and per-hour averages for the last 24 hours:

Sample:
ns5200-> get perf cpu detail
Average System Utilization:  2%
Last 60 seconds:
59:  2    58:  2    57:  2    56:  2    55:  2    54:  2    
53:  2    52:  2    51:  2    50:  2    49:  2    48:  2    
47:  2    46:  2    45:  2    44:  2    43:  2    42:  2    
41:  2    40:  2    39:  2    38:  2    37:  2    36:  2    
35:  2    34:  2    33:  2    32:  2    31:  2    30:  2    
29:  2    28:  2    27:  2    26:  2    25:  2    24:  2    
23:  2    22:  2    21:  2    20:  2    19:  2    18:  2    
17:  2    16:  2    15:  2    14:  2    13:  2    12:  2    
11:  2    10:  2     9:  2     8:  2     7:  2     6:  2    
 5:  2     4:  2     3:  2     2:  2     1:  2     0:  2    

Last 60 minutes:
59:  2    58:  2    57:  2    56:  2    55:  2    54:  2    
53:  2    52:  2    51:  2    50:  2    49:  2    48:  2    
47:  2    46:  2    45:  2    44:  2    43:  2    42:  2    
41:  2    40:  2    39:  2    38:  2    37:  2    36:  2    
35:  2    34:  2    33:  2    32:  2    31:  2    30:  2    
29:  2    28:  2    27:  2    26:  2    25:  2    24:  2    
23:  2    22:  2    21:  2    20:  2    19:  2    18:  2    
17:  2    16:  2    15:  2    14:  2    13:  2    12:  2    
11:  2    10:  2     9:  2     8:  2     7:  2     6:  2    
 5:  2     4:  2     3:  2     2:  2     1:  2     0:  2    

Last 24 hours:
23:  2    22:  2    21:  2    20:  2    19:  2    18:  2    
17:  2    16:  2    15:  2    14:  2    13:  2    12:  2    
11:  2    10:  2     9:  2     8:  2     7:  2     6:  2    
 5:  2     4:  2     3:  2     2:  2     1:  2     0:  2   
  • Average System Utilization is the average CPU utilization over the last 24 hours. For example, if the system uptime is 48 hours and 18 minutes, it is the average CPU utilization over the last 24 full hours, excluding the most recent 18 minutes.
    • If the system uptime is less than 24 hours but more than 1 hour, it is the average over the completed hours. For example, if the system has been up for 10 hours and 40 minutes, it is the average CPU utilization over 10 hours (excluding the 40 minutes).
    • If the system uptime is less than 1 hour (for example, 34 minutes and 26 seconds), it is the average CPU utilization over the last 34 minutes (excluding the 26 seconds).
    • If the system uptime is less than 1 minute (for example, 48 seconds), it is computed over those 48 seconds.
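
The rules above amount to truncating the uptime to the largest completed unit (hours, then minutes, then seconds), capped at 24 hours. The following Python sketch only illustrates that arithmetic; the function name averaging_window_seconds is hypothetical and is not part of ScreenOS or any Juniper tool, and the handling of uptimes that fall exactly on a boundary is an assumption.

```python
DAY = 24 * 3600
HOUR = 3600
MINUTE = 60

def averaging_window_seconds(uptime_seconds: int) -> int:
    """Length of the window, in seconds, that the average covers.

    Hypothetical helper: it mirrors the rules described in the text,
    not the firewall's actual implementation.
    """
    if uptime_seconds >= DAY:
        # Up for a day or more: the last 24 completed hours.
        return DAY
    if uptime_seconds >= HOUR:
        # Between 1 and 24 hours: completed hours only.
        return (uptime_seconds // HOUR) * HOUR
    if uptime_seconds >= MINUTE:
        # Between 1 minute and 1 hour: completed minutes only.
        return (uptime_seconds // MINUTE) * MINUTE
    # Less than 1 minute: the whole uptime.
    return uptime_seconds
```

For the examples in the text, an uptime of 10 hours and 40 minutes gives a 10-hour window, and an uptime of 34 minutes and 26 seconds gives a 34-minute window.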


Step 2. Determine whether the high CPU is caused by Flow or Task

The command get perf cpu all detail lists the utilization history of the CPU by Flow and Task. The first number within the parentheses is the Flow CPU utilization, and the second number is the Task CPU utilization.
---------------------------------------------------------------------
nsisg2000-> get perf cpu all detail
Average System Utilization: 55% (61  5)
Last 60 seconds:
59: 86(96  2)*** 58: 85(95  0)**  57: 86(96  2)*** 56: 85(95  0)**
55: 85(95  2)**  54: 86(96  0)*** 53: 86(96  2)*** 52: 86(96  0)***
51: 86(96  2)*** 50: 85(95  1)**  49: 86(96  2)*** 48: 86(96  0)***
47: 86(96  3)*** 46: 86(96  0)*** 45: 86(96  2)*** 44: 86(96  0)***
43: 86(96  2)*** 42: 86(96  0)*** 41: 86(96  2)*** 40: 86(96  0)***
39: 86(96  2)*** 38: 86(96  0)*** 37: 86(96  2)*** 36: 86(96  0)***
35: 86(96  2)*** 34: 86(96  1)*** 33: 85(95  4)**  32: 85(95  0)**
31: 86(96  2)*** 30: 86(96  0)*** 29: 86(96  2)*** 28: 86(96  1)***
27: 86(96  3)*** 26: 86(96  0)*** 25: 86(96  2)*** 24: 86(96  0)***
23: 86(96  2)*** 22: 86(96  0)*** 21: 86(96  2)*** 20: 86(96  0)***
19: 86(96  2)*** 18: 86(96  1)*** 17: 86(96  2)*** 16: 86(96 36)***
15: 86(96  2)*** 14: 86(96  0)*** 13: 85(95  2)**  12: 86(96  0)***
11: 86(96  3)*** 10: 86(96  0)***  9: 86(96  2)***  8: 86(96  0)***
 7: 86(96  3)***  6: 86(96  0)***  5: 85(95  3)**   4: 86(96  1)***
 3: 86(96  2)***  2: 86(96  1)***  1: 86(96  2)***  0: 85(95  0)**

Last 60 minutes:
59: 85(95  1)**  58: 85(95 24)**  57: 84(94  1)**  56: 84(94  1)**
55: 84(94  1)**  54: 84(94  1)**  53: 83(93  1)**  52: 83(93  1)**
51: 82(92  1)**  50: 82(92  1)**  49: 83(93  2)**  48: 82(92  1)**
47: 82(92  1)**  46: 81(91  1)**  45: 81(91  1)**  44: 80(90  2)**
43: 81(91 14)**  42: 79(89 72)**  41: 57(22 66)*   40: 53(19 63)*
39: 53( 1 63)*   38: 53(18 63)*   37: 61(57 65)*   36: 56(34 64)*
35: 59(58 66)*   34: 32(35 11)    33: 26(33  1)    32: 70(80  0)*
31: 66(76  0)*   30: 50(60  0)*   29: 48(58  1)    28: 26(36  4)
27: 24(32  2)    26: 45(54  1)    25: 55(65  1)*   24: 21(30  1)
23: 63(73  0)*   22: 33(40  0)    21: 11(13  1)    20: 53(63  0)*
19: 78(88  1)**  18:  9(13  2)    17: 46(56  2)    16: 19(29  1)
15: 38(48  0)    14: 35(45  1)    13: 63(73  0)*   12: 79(89  0)**
11: 78(88  1)**  10: 36(45  0)     9: 22(27  1)     8: 31(41  0)
 7: 71(81  1)**   6:  5( 6  2)     5:  4( 5  0)     4: 31(39  1)
 3:  5( 5  1)     2: 56(66  0)*    1: 42(52  0)     0: 25(34  2)

Last 24 hours:
23: 44(48 10)    22: 66(74  1)*   21: N/A          20: N/A     
19: N/A          18: N/A          17: N/A          16: N/A     
15: N/A          14: N/A          13: N/A          12: N/A     
11: N/A          10: N/A           9: N/A           8: N/A     
 7: N/A           6: N/A           5: N/A           4: N/A     
 3: N/A           2: N/A           1: N/A           0: N/A

----------------------------------------------------------------------
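
When a saved copy of this output is available, the Flow and Task numbers can be extracted with a short script to see which one dominates in each interval. The following Python sketch assumes the output format matches the sample above exactly; the names Sample, parse_samples, and dominant are illustrative and are not part of any Juniper tool. It does not distinguish the seconds, minutes, and hours sections, so it is best run on one section at a time.

```python
import re
from typing import List, NamedTuple

# Matches entries such as "59: 86(96  2)***" or "39: 53( 1 63)*".
# "N/A" entries and the "Average System Utilization" line do not match
# and are therefore skipped.
ENTRY = re.compile(r"(\d+):\s*(\d+)\(\s*(\d+)\s+(\d+)\)(\**)")

class Sample(NamedTuple):
    index: int   # position within the section (second, minute, or hour)
    total: int   # overall utilization in percent
    flow: int    # first number in parentheses
    task: int    # second number in parentheses
    marks: int   # number of asterisks printed after the entry

def parse_samples(text: str) -> List[Sample]:
    """Extract every utilization entry found in the given text."""
    return [
        Sample(int(m[1]), int(m[2]), int(m[3]), int(m[4]), len(m[5]))
        for m in ENTRY.finditer(text)
    ]

def dominant(sample: Sample) -> str:
    """Report which of the two numbers is larger (ties count as flow)."""
    return "flow" if sample.flow >= sample.task else "task"
```

Applied to the row "43: 81(91 14)**  42: 79(89 72)**  41: 57(22 66)*   40: 53(19 63)*" from the sample, dominant reports flow for minutes 43 and 42 and task for minutes 41 and 40.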
  • A single asterisk  *  indicates the CPU is nearing a warning threshold.  It is marked when utilization is ≥ 50% and ≤ 70%.

  • Double asterisks  **  indicate that the CPU is nearing a high level; the administrator should investigate why.  They are marked when utilization is > 70% and ≤ 85% (in the sample output, 70% is marked with one asterisk and 71% with two).

  • Triple asterisks  ***  indicate that CPU utilization is high; the administrator should investigate the cause.  They are marked when utilization is > 85% (in the sample output, 85% is marked with two asterisks and 86% with three).
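
The marker levels can be expressed as a small lookup. The following Python sketch resolves the boundary values the way the sample output does (70% shows one asterisk, 85% shows two); the function name marker is hypothetical, and the exact comparison used by the firewall is an assumption inferred from that sample.

```python
def marker(utilization: int) -> str:
    """Return the asterisk marker for an overall CPU utilization value.

    Hypothetical helper; boundaries are inferred from the sample output,
    where 50 and 70 show "*", 71 and 85 show "**", and 86 shows "***".
    """
    if utilization > 85:
        return "***"   # high: investigate the cause
    if utilization > 70:
        return "**"    # nearing a high level
    if utilization >= 50:
        return "*"     # nearing a warning threshold
    return ""          # below 50%: no marker
```

For example, marker(48) returns an empty string and marker(86) returns "***", matching the entries 48(58  1) and 86(96  2)*** in the sample.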

Step 3. Investigate what could be causing the high CPU:

Step 4. Collect data and open a case. See KB6987 - What data should I collect to troubleshoot High CPU issues on a Firewall device?

 

Purpose:
Troubleshooting

Copyright© 1999-2012 Juniper Networks, Inc. All rights reserved.