[QFabric] QFX3000-M and QFX3600-I IC replacement impact analysis

Article ID: KB34940   KB Last Updated: 28 Aug 2019   Version: 1.0
Summary:
This article discusses the impact of replacing a QFX3600-I InterConnect (IC) device on a QFX3000-M QFabric system.

The tests described in this article were carried out in a lab environment using an IXIA traffic generator. A lab environment is generally less complex than a production environment. This article aims to quantify the amount of packet loss; it does not examine how services might be impacted. Service impact depends on the design, the protocols used, and how they handle and recover from packet loss.

Note: The impact of an IC crash or replacement on a QFabric QFX3000-G is different.

Symptoms:

A possible symptom is that a fan fails and, even after it is replaced, the new fan does not work either. Troubleshooting then determines that the slot, not the fan, is faulty.

Cause:

The QFX3600-I chassis might need to be replaced or rebooted following troubleshooting or RMA.

Solution:

Impact of IC Replacement by Different Scenarios

Scenario 0 - Power off the IC

The recommended procedure can be found in the following technical document:
Adding or Replacing an Interconnect Device in a QFX3000-M QFabric System

Additionally, tests were conducted in the lab pushing 20 Gbps through the IC, with minimal packet loss. Powering off the device and halting the device produce the same results. In the lab, traffic failed over gracefully to the other IC device within 10 to 15 seconds, and no packet loss was observed.

Note: This was a lab with a clean install. You should expect and plan for some packet drops in a live working scenario.

The following commands were used on the IC; all of them produce the same results:

  • request system power-off in 0
  • request system halt in 0
  • request system reboot in 0

Scenario 1 - Disable the interfaces on the node devices 1 by 1

This may complicate and/or extend your maintenance window.
Once the IC chassis is replaced, you need to go back to each node device and enable the interfaces 1 by 1.

Scenario 2 - Disable the interfaces on the IC device 1 by 1

This may extend your maintenance window.
You need to log in to the IC device as root and then enter the CLI to be able to access the configuration from outside the Director Group (DG).
This is not the recommended method of replacing the IC device.
If you do this, traffic is impacted at several moments (once per interface, during each interface failover).

Scenario 3 - Disable all the fte interfaces on the IC side

This is comparable to powering off the device, but it gives you a quick way to restore traffic if the service impact is greater than expected.

Impact of IC Replacement by Disabling Interfaces

  • Frame drops are observed for L2 unicast/multicast traffic when the fte* interfaces from the interconnect device are disabled.
  • These frame drops stop when the traffic is failed over to the remaining interconnect device.
  • No frame drops are observed when the interfaces are enabled again, placing traffic back on the IC device.
  • Total frame drops for a given frame size is inversely proportional to the frame size.
  • The percentage of dropped frames during failover, compared to the total frames transmitted in a 60s interval, is roughly constant (about 0.021% to 0.028%) regardless of frame size, which suggests that the number of dropped frames is a function of frame size.
  • By measuring the total frame count transmitted in a 60s interval with a constant load, one can calculate the number of frames per second going through the IC device.
  • Failover time is calculated to be around 14 ms, based on the fps number and the total Dropped Frame Count.
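
The inverse-proportionality observation above can be checked with a short calculation. The following is a minimal Python sketch, not part of the original test procedure: it uses the dropped-frame counts from the results table in this article and verifies that frame size multiplied by dropped frames (the dropped byte volume) stays roughly constant. The variable names are illustrative only.

```python
# Frame size (bytes) -> dropped frame count, copied from the results table
# in this article. Names below are illustrative, not from the original test.
DROPPED = {
    128: 124363,
    300: 66438,
    500: 31668,
    1024: 18545,
    1518: 12139,
    4000: 4786,
    9000: 1934,
}

# If drops are inversely proportional to frame size, then
# frame_size * dropped_frames should be approximately constant.
dropped_bytes = {size: size * count for size, count in DROPPED.items()}
mean_bytes = sum(dropped_bytes.values()) / len(dropped_bytes)

# Relative deviation of each row from the mean dropped byte volume.
deviation = {size: b / mean_bytes - 1 for size, b in dropped_bytes.items()}

for size, b in dropped_bytes.items():
    print(f"{size:>5} B: {b / 1e6:6.2f} MB dropped "
          f"({deviation[size] * 100:+.1f}% vs mean)")
```

While the frame size varies by a factor of about 70, the dropped byte volume stays within roughly 12% of its mean (about 18 MB). Ignoring framing overhead, 18 MB at the stated 10 Gbps aggregate load corresponds to roughly 14 ms of traffic, which is in line with the calculated failover time.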

The drop counts in the table below were observed at the moment traffic failed over, under the following conditions:

  • ≈6s commit time (± 1s error) on the IC device
  • Constant load of 5 Gbps Tx and 5 Gbps Rx going through the IC device (10 Gbps of aggregate traffic)
  • A Total Frame Count span of ≈60s (± 1s error), during which all fte* interfaces in the IC device are disabled at the ≈30s mark (± 1s error)
 
Frame Size  Dropped Frame  Total Frame Count  % Dropped Packets  Frames per    Failover Time
(bytes)     Count          (in 60s)           (in 60s)           Second (fps)  (in ms)
128         124363         520366820          0.0238991          8672780.33    14.33946154
300         66438          240791244          0.02759153         4013187.4     16.55492091
500         31668          147839915          0.02142047         2463998.58    12.85228012
1024        18545          74386178           0.02493071         1239769.63    14.95842413
1518        12139          50165216           0.02419804         836086.933    14.51882516
4000        4786           19271480           0.02483463         321191.333    14.90077565
9000        1934           8529814            0.02267341         142163.567    13.60404811

% Dropped Packets (in 60s) = Dropped Frame Count × 100% ÷ Total Frame Count (in 60s)
fps = Total Frame Count (in 60s) ÷ 60s
Failover Time (in ms) = Dropped Frame Count × 1000 ms ÷ fps
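
The three formulas above can be applied to the raw counts to reproduce the derived columns of the table. The following Python sketch is one way to do that; the function and variable names (`derive`, `MEASUREMENTS`, `INTERVAL_S`) are assumptions made for illustration, and the 60-second interval is taken as exact even though the article states a ± 1s error.

```python
# Frame size (bytes) -> (dropped frame count, total frame count in 60 s),
# copied from the table above.
MEASUREMENTS = {
    128: (124363, 520366820),
    300: (66438, 240791244),
    500: (31668, 147839915),
    1024: (18545, 74386178),
    1518: (12139, 50165216),
    4000: (4786, 19271480),
    9000: (1934, 8529814),
}

INTERVAL_S = 60  # measurement window; treated as exact for this sketch


def derive(dropped, total, interval_s=INTERVAL_S):
    """Return (% dropped, frames per second, failover time in ms)."""
    pct_dropped = dropped * 100 / total
    fps = total / interval_s
    failover_ms = dropped * 1000 / fps
    return pct_dropped, fps, failover_ms


results = {size: derive(*counts) for size, counts in MEASUREMENTS.items()}

for size, (pct, fps, ms) in results.items():
    print(f"{size:>5} B  {pct:.7f} %  {fps:>12.2f} fps  {ms:.2f} ms")

# Average failover time across all frame sizes.
mean_failover_ms = sum(ms for _, _, ms in results.values()) / len(results)
print(f"mean failover time: {mean_failover_ms:.2f} ms")
```

The mean across the seven frame sizes comes out at about 14.5 ms, matching the "around 14 ms" failover time stated earlier. Because the ≈60s window carries a ± 1s error, each computed failover time inherits an uncertainty of roughly the same relative size.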

 