What is causing High "FLOW" CPU Utilization? (ScreenOS 5.x and later)

Categories:
Knowledge Base ID: KB21722
Last Updated: 23 Apr 2013
Version: 3.0

Summary:

High CPU in Flow indicates that the firewall is busy processing packets. This includes processing for functions such as:

  • Session creation and teardown
  • Traffic management features (e.g. logging, shaping)
  • Firewall protection features (i.e. Screen options)
  • ALG processing
  • Attacks

Note:  This article was previously documented as KB3858.


Problem or Goal:

What is the cause for High "FLOW" CPU Utilization?


Note:   If you have not yet determined whether the High CPU is due to Flow or Task, consult KB9453 - Troubleshooting High CPU on a firewall device before continuing.


If the Flow CPU is high, perform the following, depending on the ScreenOS version:

ScreenOS 6.0.0r2 or higher
-----------------------------------------
Go to KB11710 - How to run Packet Profiling on firewall with High Flow CPU to help identify the cause.  Packet Profiling was added to ScreenOS 6.x for troubleshooting high flow CPU.
If you already performed the steps in KB11710 and did not identify the cause, then you will be redirected back to this article to run the commands in the Solution below. 


ScreenOS 5.4.0 or lower
-----------------------------------------
Check and collect the command output in the Solution below.

Solution:

The following ScreenOS commands are helpful to identify the cause of High "FLOW" CPU.


Ramp-up rate  -  Run the CLI command ‘get perf session detail’ several times and view the values under Last 60 seconds; these represent the number of new sessions per second.

get perf session detail

 

Sample:
ns5400 > get perf session detail
Last 60 seconds:
0:  173   1: 164   2: 149   3: 155   4: 163   5: 159
6:  157   7: 152   8: 154   9: 155  10: 160  11: 162
12: 162  13: 156  14: 153  15: 155  16: 154  17: 155
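
The ramp-up values can be summarized with a short script rather than by eye. The following is a minimal sketch, assuming the output has been saved as text and that the data lines follow the "index: value" layout shown in the sample above; the function name parse_perf_session is illustrative and not part of ScreenOS.

```python
import re

def parse_perf_session(text):
    """Return the new-sessions-per-second values from 'get perf session detail' output."""
    rates = []
    for line in text.splitlines():
        # Data lines look like "0:  173   1: 164 ..."; skip prompts and headers.
        if not re.match(r"\s*\d+:", line):
            continue
        rates.extend(int(value) for _, value in re.findall(r"(\d+):\s*(\d+)", line))
    return rates

# The portion of the sample output shown in this article.
sample = """\
Last 60 seconds:
0:  173   1: 164   2: 149   3: 155   4: 163   5: 159
6:  157   7: 152   8: 154   9: 155  10: 160  11: 162
12: 162  13: 156  14: 153  15: 155  16: 154  17: 155
"""

rates = parse_perf_session(sample)
print(f"samples={len(rates)} peak={max(rates)} mean={sum(rates) / len(rates):.1f}")
```

Comparing the peak and mean across several runs makes a sudden jump in the new-session rate easier to spot.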



ASIC Counters
- (ISG-1000, ISG-2000, and NS-5000 devices)   Run the 'get sat 0 demux' command.   Issue the command several times (at approximately 10-second intervals) and note the counters that have a high PPS (packets per second).   In ScreenOS 6.0.0r2 and above, the PPS is calculated and displayed in the last column.  In ScreenOS 5.4 and below, the PPS has to be calculated manually.

Important:  The ASIC Counters output (i.e. the 'get sat 0 demux' output) displays what traffic is being processed by the CPU, so analysis of the output is very helpful in identifying the root cause of the high flow CPU.
If ISG or NS-5000, run the following 6 to 8 times:
        get sat 0 d
        <wait 10 sec>

If NS-5000 device, also enter the following 6 to 8 times for each ASIC (1-5).
        get sat <asic> d
        <wait 10 sec>

 


Sample of ScreenOS 5.4 and below:

ns5400 > get sat 0 demux
to_host_packet: 3080742  
(indicates the packets are related to existing sessions; this includes packets for ALG, ICMP, TCP Proxy, traffic shaping, and routing changes)
first_packet: 250560540   (indicates new sessions)
brcst: 16379
no_ip_ether_net: 35690
ttl_zero: 46545
tcp_chksum_err: 70
udp_chksum_err: 1

clsf counters:
fin no ack: 0x00000485
unknown protocol: 0x0ee63d10
icmp: 0x0ee62d3e
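
For ScreenOS 5.4 and below, the manual PPS calculation amounts to subtracting two snapshots and dividing by the interval between them. The sketch below assumes both outputs were saved as text in the "name: value" layout of the sample above, with the clsf counters in hexadecimal. The second snapshot's values are invented for illustration only, and the helper names are hypothetical.

```python
def parse_counters(text):
    """Map counter name -> integer for lines shaped like 'name: value'."""
    counters = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        tokens = rest.split()
        if not sep or not tokens:
            continue  # prompts, section headers, and annotation lines
        token = tokens[0]
        try:
            # clsf counters are printed in hex (0x...), the others in decimal.
            base = 16 if token.lower().startswith("0x") else 10
            counters[name.strip()] = int(token, base)
        except ValueError:
            continue
    return counters

def pps(first, second, interval_s):
    """Packets per second for every counter present in both snapshots."""
    return {name: (second[name] - first[name]) / interval_s
            for name in second if name in first}

# First snapshot: values taken from the sample above.
first_raw = "first_packet: 250560540   (indicates new sessions)\nclsf counters:\nicmp: 0x0ee62d3e\n"
# Second snapshot, 10 seconds later: HYPOTHETICAL values, not real device output.
second_raw = "first_packet: 250568540\nclsf counters:\nicmp: 0x0ee64c7e\n"

rates = pps(parse_counters(first_raw), parse_counters(second_raw), interval_s=10)
print(rates)
```

Counters with the highest resulting rate are the ones to investigate first.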

Sample of ScreenOS 6.0.0r2 and above:  Note that the PPS will be calculated and the statistics shown as follows:
TEST-1(M)-> get sat 0 d    

                Current(10d;10:12:00) Last(10d;10:12:00) PPS(  44s) 
to_host_packet:              29844841           29840077    107        (packets for existing sessions)
SYN/ACK: 159045 159036 0
FIN: 217824 217816 0
RST: 65392 65388 0
OTHERS: 29402580 29397837 107

first_packet: 631391645 631357189 777        (new sessions)
brcst: 1084386 1084357 0
no_ip_ether_net: 9681331 9680827 11
ipsec_no_sa: 5 5 0
sa_time_sec_expire: 11373803 11372599 27
sa_inactive: 3260 3260 0
required_vpn_done: 5140827 5140629 4
ipsec_auth_fail: 1216 1216 0
seq_out_window: 148080 148050 0
ttl_zero: 108117 108117 0
tcp_data_off_err: 210 210 0
tiny_tcp_err: 38 38 0
tcp_chksum_err: 215523 215522 0
udp_chksum_err: 897 897 0
defragged_proc: 35198 35198 0
merge_proc: 120103 20101 0
total packet: 689149480 689108292 929

clsf counters:
winnuke 7 7 0
tcp no flag 48 48 0
syn fin 97 97 0
fin no ack 6058 6058 0
ip opt record 7 7 0
ip opt time 7 7 0
ip opt security 9 9 0
ip opt loose src route 2 2 0
ip opt strict src route 7 7 0
ip opt stream 7 7 0
udp flood 7 7 0
icmp flood 6 6 0
fragment pak 140355 140353 0
unknown protocol 747330 747294 0
more than 4 options 4 4 0
tcp rst pak 2 2 0
xpt passthru ipsec 5 5 0
ipsec frag 5 5 0
icmp 523465465 523437720 626
unknown clsf tag 376 376 0
Note that all of the counters (for example, "first_packet" or "icmp") count packets that must be processed by the firewall CPU.
In the above example, compare the PPS value of the "icmp" counter (626) with the PPS value of "total packet" (929):  626 / 929 * 100% = 67%.

ICMP accounts for 626 packets per second out of a total of 929, so approximately 67% of the traffic being processed by the CPU is ICMP traffic.
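
The same percentage can be computed for any counter in the ScreenOS 6.0.0r2 and above output. This is a minimal sketch, assuming each data line ends with the three numeric columns (Current, Last, PPS) shown in the sample; the function names are illustrative.

```python
def pps_by_counter(text):
    """Map counter name -> PPS, read from the last of three numeric columns."""
    result = {}
    for line in text.splitlines():
        parts = line.split()
        # A data line is a name followed by Current, Last, and PPS values.
        if len(parts) >= 4 and all(p.isdigit() for p in parts[-3:]):
            name = " ".join(parts[:-3]).rstrip(":")
            result[name] = int(parts[-1])
    return result

def cpu_share(pps, counter, total="total packet"):
    """Percentage of CPU-processed packets attributed to one counter."""
    return 100.0 * pps[counter] / pps[total]

# Lines copied from the sample output above.
sample = """\
first_packet: 631391645 631357189 777
total packet: 689149480 689108292 929
icmp 523465465 523437720 626
"""

pps = pps_by_counter(sample)
print(f"icmp share: {cpu_share(pps, 'icmp'):.0f}%")
```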



Session Table – Check session table information to see the total number of sustained sessions and whether there are any session allocation failures. 

get session info

Sample:
S5400->  get session info
slot 1: sw alloc 0/max 1000000, alloc failed 24749314, di alloc failed 0
slot 2: hw0 alloc 0/max 1048576
slot 2: hw1 alloc 0/max 1048576


Attacks - Check whether the network is under any kind of attack or whether a high number of packets is being processed by the screen options.

get counter screen zone
get alarm event
get log event


Note: There is the possibility that an attack can be occurring, but is not being reported in the output of the above commands. This is because the firewall will only report attacks for the screen options configured on the firewall. To confirm an attack is not occurring, connect a packet capture tool to the firewall’s network segments and review the data.

For additional information, consult: KB8332 - ScreenOS: Which of the screening features can increase CPU utilization?

 

Malicious URL - Check if Mal-URL is enabled. Consult KB8603 - Impact of turning on 'mal-url' screen option on ASIC based system

 

Interface Counters - Check for errors, high policy deny values, high frag values or any other counters that are incrementing unusually.

It is best to clear the counters first and then take a fresh snapshot. To clear the counters, enter ‘clear counter all’. Then enter the following set of commands several times, leaving a 5-10 second interval between sets.

get clock
get counter stat


High volume of fragmentation can cause high CPU in flow. For firewall devices with a single CPU (i.e. NetScreen-200 models and below), fragmentation has a dramatic effect.  Run CLI command get session frag several times to check for packet fragmentation.  Consult the following article for recommendations on resolving the issue:

 

ALG - Identify any applications using the ALG function (e.g. FTP, H323)

Obtain the non-truncated  get session output from the firewall and run it through the Firewall Session Analyzer Tool to determine which applications are most commonly running through the firewall.  The  ‘Rank based on source IP with protocol and destination port information’ will help identify the ‘top’ applications.  Refer to KB8604 (below) to see if those ‘top’ applications will trigger the ALG. 

 

Debug / Snoop - Check if either debug or snoop is enabled. Consult KB4493 - Debug and Snoop Can Cause High CPU Utilization

 

Traffic -


Packet rate - If SSG series, NS500, NS200, NS25/50 or NS5 series, calculate the packets-per-second going through the firewall.   The easiest way to determine the packet-per-second rate is to get a 1-5 minute snapshot of the network by capturing a packet trace. Many of the packet capturing tools have an option to display the packet-per-second rate.

If you do not have or cannot set up a packet trace of the network, the next best effort is to calculate the total number of packets coming into the interfaces of the firewall.

This can be done by obtaining the output of ‘get counter stat’ twice over a set time period.  First issue a ‘get clock’, so that you have a time stamp to reference. Then issue a ‘get counter stat | include packet’. Total the ‘in packets’ values and divide by 2, because the output provides both the hardware and the flow counters for each interface. Repeat the process with another ‘get clock’ and another ‘get counter stat | include packet’.  Issue each pair of commands in quick succession, so that the time measurement is accurate. The difference between the two halved totals, divided by the number of seconds elapsed between the two ‘get clock’ outputs, is the packet-per-second rate.
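
The arithmetic for the counter-based estimate can be sketched as follows. This is an illustration only: it assumes the ‘in packets’ values have already been totalled by hand from each snapshot, and the timestamps and packet totals in the example are invented, not real device output.

```python
from datetime import datetime

def packets_per_second(in_total_first, in_total_second, t_first, t_second):
    """Estimate inbound packets per second from two 'get counter stat' snapshots."""
    elapsed = (t_second - t_first).total_seconds()
    if elapsed <= 0:
        raise ValueError("second snapshot must be later than the first")
    # Halve each total, as the article instructs, before taking the difference.
    return (in_total_second / 2 - in_total_first / 2) / elapsed

# HYPOTHETICAL values: clock readings and summed 'in packets' from two snapshots.
t1 = datetime(2013, 4, 23, 10, 0, 0)
t2 = datetime(2013, 4, 23, 10, 0, 10)
rate = packets_per_second(4_000_000, 4_300_000, t1, t2)
print(f"{rate:.0f} packets per second")
```

A longer interval between snapshots smooths out short bursts, while a shorter one makes timing errors between ‘get clock’ and ‘get counter stat’ more significant.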


Policy Ordering - Ensure the most frequently used policies are positioned near the top of the policy list. 

The following are possible ways to help determine the frequently used policies:

  • Use NSM.
  • If counting is enabled on the policies, analyze the data.
  • Obtain the non-truncated  get session output from the firewall, and run it through the Firewall Session Analyzer Tool to help determine the frequently used policies.

 

For additional Technical Support Assistance, consult: KB6987 - What data should I collect to troubleshoot High CPU issues on a Firewall device?



Purpose:
Implementation

Related Links:

 

 

Copyright© 1999-2012 Juniper Networks, Inc. All rights reserved.