[SRX] Low throughput when using IPv6

  [KB30951]


Summary:

This article explains the reason for low throughput when using IPv6 and how it is related to TCP Segmentation Offload.

Symptoms:

Throughput issues with IPv6 traffic.

Cause:

Host-to-host packet captures showed that many segments were lost before reaching the other side. This resulted in a large number of retransmissions and duplicate ACKs, which slowed throughput for transit traffic flowing through the SRX.

Solution:

The issue was resolved by disabling “TCP Segmentation Offload” (TSO) on the server's IPv6 stack.
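This article does not name the server's operating system. As a minimal sketch, assuming a Linux server where the `ethtool` utility is available, the commands to inspect and turn off TSO on an interface could be built as shown below. The interface name `eth0` and the helper function names are hypothetical; other operating systems expose this setting through their own adapter configuration tools.

```python
def tso_show_command(interface: str) -> list[str]:
    """Build the ethtool command that lists offload settings for an interface."""
    return ["ethtool", "-k", interface]


def tso_disable_command(interface: str) -> list[str]:
    """Build the ethtool command that turns TCP segmentation offload off."""
    return ["ethtool", "-K", interface, "tso", "off"]


if __name__ == "__main__":
    # "eth0" is a placeholder; substitute the server's actual interface name.
    # The commands are only printed here. Running them requires root, e.g.:
    #   subprocess.run(tso_disable_command("eth0"), check=True)
    print(" ".join(tso_show_command("eth0")))
    print(" ".join(tso_disable_command("eth0")))
```

The sketch only prints the commands so that they can be reviewed before being run on a production server.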

TCP segmentation offload (TSO) reduces the CPU overhead of TCP/IP processing. It breaks large blocks of data into segments that can pass through the network between the source and the destination. With this offload, the network interface controller (NIC), rather than the host CPU, segments the data and adds the Layer 4 (TCP), Layer 3 (IP), and Layer 2 (data link layer) headers to each segment. The NIC must support TSO. TSO is also known as large send offload (LSO).
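The segmentation step described above can be sketched in a few lines. This is an illustration under stated assumptions, not how any particular NIC implements it: the function name is hypothetical, the TCP header is assumed to carry no options (20 bytes), the IPv6 header is the fixed 40 bytes, and Layer 2 framing is ignored.

```python
IPV6_HEADER = 40  # fixed IPv6 header size in bytes
TCP_HEADER = 20   # TCP header size in bytes, assuming no TCP options


def segment_payload(payload_len: int, mtu: int) -> list[int]:
    """Return the IP packet size of each segment needed to carry payload_len bytes."""
    mss = mtu - IPV6_HEADER - TCP_HEADER  # payload bytes that fit in one packet
    if mss <= 0:
        raise ValueError("MTU too small for IPv6 and TCP headers")
    sizes = []
    remaining = payload_len
    while remaining > 0:
        chunk = min(mss, remaining)
        sizes.append(chunk + IPV6_HEADER + TCP_HEADER)
        remaining -= chunk
    return sizes


if __name__ == "__main__":
    packets = segment_payload(64_000, mtu=1500)
    # 64,000 bytes at 1,440 payload bytes per packet needs 45 packets.
    print(len(packets), packets[0], packets[-1])
```

With a 1500-byte MTU, each full packet carries 1440 bytes of payload, so a 64,000-byte send becomes 44 full packets and one shorter final packet.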

This feature should improve network performance. However, there is a catch:

For this feature to work as expected, the devices in the network, in this case the SRX through which all traffic flows, must agree with the server on the frame size. The server cannot send frames that are larger than the Maximum Transmission Unit (MTU) supported by the SRX.

The server can discover the MTU by using Path MTU Discovery (PMTUD), but it has no way to pass that value on to the Ethernet adapter. The LSO engine cannot change the frame size on the fly. It uses either the standard default value of 1500 bytes or, if jumbo frames are supported, the jumbo frame size configured for the Ethernet adapter. If the LSO engine transmits a frame that is larger than what the SRX supports, the SRX silently drops that frame. This is how a feature that is supposed to enhance performance becomes a serious bottleneck in the flow.
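The mismatch can be illustrated with a small sketch that compares the packet sizes produced when segmenting to the adapter's configured MTU against those produced when segmenting to the path MTU. All values are hypothetical: a 9000-byte jumbo frame size on the adapter, a 1500-byte MTU on the SRX interface, and 60 bytes of IPv6 and TCP headers with no options.

```python
HEADERS = 60  # IPv6 header (40) + TCP header without options (20), in bytes


def packet_sizes(payload_len: int, mtu: int) -> list[int]:
    """Packet sizes produced when payload_len bytes are segmented for a given MTU."""
    mss = mtu - HEADERS
    full, rest = divmod(payload_len, mss)
    sizes = [mtu] * full
    if rest:
        sizes.append(rest + HEADERS)
    return sizes


def count_dropped(sizes: list[int], path_mtu: int) -> int:
    """Count packets that exceed the path MTU and would be discarded."""
    return sum(1 for size in sizes if size > path_mtu)


if __name__ == "__main__":
    nic_mtu = 9000   # hypothetical jumbo frame size configured on the adapter
    path_mtu = 1500  # hypothetical MTU of the SRX interface

    offloaded = packet_sizes(64_000, nic_mtu)        # LSO engine uses the adapter value
    host_segmented = packet_sizes(64_000, path_mtu)  # host stack uses the discovered value

    print(count_dropped(offloaded, path_mtu), "of", len(offloaded), "dropped")
    print(count_dropped(host_segmented, path_mtu), "of", len(host_segmented), "dropped")
```

In this example, every full-size packet from the offload path exceeds the path MTU, while none of the host-segmented packets do.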

To understand how performance is affected, consider a large TCP packet that must traverse the SRX from the source to the destination:

  • With the feature enabled, the TCP/IP network stack on the server builds a large TCP packet.
  • The server sends this large TCP packet to the NIC for segmentation by its LSO engine. Because the LSO engine does not have the ability to discover the MTU supported by the SRX, it uses a standard default value.
  • The LSO engine sends each of the frame segments that make up the large TCP message to the SRX.
  • The SRX receives the frame segments, but because the LSO engine sent frames larger than the MTU of the SRX interface, they are discarded.
  • On the receiving host, which is waiting for the TCP packets, the flow times out when no packets are received, and the host sends a retransmission request. The time it takes for the flow to time out and for the retransmission request to be sent causes a huge delay in the flow.
  • The sending side receives the retransmission request and creates the packet again. Because this is a retransmission, the server does not send the TCP packet to the NIC to be segmented. It handles the segmentation process on its own. This behavior is designed to counter any failures introduced by the offloading hardware on the adapter.
  • The SRX receives the retransmitted frames from the server. These frames match the MTU of the link, because the retransmitted packet was segmented by the server itself and not offloaded to the adapter.
  • The other side receives the TCP message and processes it accordingly.

The cause of this issue is the delay introduced while the receiving side waits for its timeout to expire before asking the sender for a retransmission. This process is repeated each time a large TCP packet has to be sent.
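A rough model shows how strongly this timeout can dominate throughput. The sketch below assumes that every large send incurs one timeout before it is delivered, and it ignores the extra transfer time of the retransmission itself. The burst size, link rate, and timeout value are hypothetical and are not taken from the captures discussed in this article.

```python
def effective_throughput_mbps(burst_bytes: int, link_mbps: float, timeout_s: float) -> float:
    """Throughput in Mbps when each burst waits timeout_s before it gets through."""
    bits = burst_bytes * 8
    transfer_s = bits / (link_mbps * 1_000_000)  # time on the wire at full link rate
    return bits / (transfer_s + timeout_s) / 1_000_000


if __name__ == "__main__":
    # Hypothetical values: 64,000-byte bursts on a 1 Gbps link, 200 ms timeout.
    without_timeout = effective_throughput_mbps(64_000, 1000, 0.0)
    with_timeout = effective_throughput_mbps(64_000, 1000, 0.2)
    print(f"no timeout:   {without_timeout:.1f} Mbps")
    print(f"with timeout: {with_timeout:.1f} Mbps")
```

Under these assumptions, the time spent on the wire is well under a millisecond per burst, so nearly all of the elapsed time is spent waiting for the timeout.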

Moreover, IPv6 relies on Path MTU Discovery, because IPv6 routers do not fragment packets in transit. This behavior, combined with TSO, causes severe throughput degradation in networks using IPv6.

TSO/LSO is enabled by default for both the IPv4 and IPv6 stacks on the NIC.
