QFX3500 limitations with soft error recovery

Article ID: KB34081
KB Last Updated: 28 May 2019
Version: 1.0

Summary:

A soft error on a QFX3500 is a transient (non-permanent) corruption of data held in the device's internal memories and registers. The hardware itself is not damaged, and the error is correctable.

This article describes how soft error handling on the QFX3500 is limited in different software versions and lists some useful commands.

Symptoms:

When a QFX3500 device encounters a soft error (parity error) while the soft error recovery feature is not enabled, packets may be dropped silently and no error logs are generated.

You may have to reboot or power cycle the device to recover it, which restores service without identifying the root cause.

Cause:

A soft error may occur for the following reasons:

  1. Emission of alpha particles from tiny amounts of radioactive materials present in the chips.

  2. Cosmic rays creating energetic neutrons and protons.

Solution:

Version 12.3X50-D42.10 enables limited soft error detection on the QFX3500. Only errors in static registers and memory are detected; dynamic memory errors are not detected, and no recovery action is taken.

Versions 14.1X53-D49 and higher enable full soft error detection and recovery on the QFX3500, including a memory scan that detects dynamic memory errors. If the PFE encounters a parity error, the software attempts to recover from it automatically.

No configuration change is needed for this feature. Once the software is upgraded, the feature is enabled automatically.

The following are typical log messages generated when a parity error occurs and is recovered:

fpc0   SER Parity Check Error.
fpc0 unit 0 L3_DEFIP_DATA_ONLY entry 0 parity error
fpc0 Unit 0: mem: 2067=L3_DEFIP_DATA_ONLY blkoffset:6
fpc0 Unit 0: RESTORE[from Y pipe]: L3_DEFIP_DATA_ONLY[2067] blk: ipipe0 index: 0
fpc0 Unit 0: L3_DEFIP entry 0 TCAM parity error
fpc0 Unit 0: mem: 2046=L3_DEFIP blkoffset:6
fpc0 Unit 0: CACHE_RESTORE: L3_DEFIP[2046] blk: ipipe0 index: 0 : [0][0]
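
To make the error and restore pairing in the messages above easier to follow, here is a minimal Python sketch that counts parity errors and restore messages per memory table in collected log text. The patterns are derived only from the sample lines in this article, so other releases or platforms may log differently. The names `summarize_ser_events` and `unrecovered` are hypothetical helpers, not part of any Juniper tool.

```python
import re

# Patterns inferred from the sample log in this article only (an assumption).
ERROR_RE = re.compile(
    r"fpc\d+\s+unit\s+\d+:?\s+(?P<mem>\w+) entry \d+ (?:TCAM )?parity error",
    re.IGNORECASE,
)
RESTORE_RE = re.compile(
    r"fpc\d+\s+unit\s+\d+:\s+(?:CACHE_)?RESTORE(?:\[[^\]]*\])?:\s+(?P<mem>\w+)\[\d+\]",
    re.IGNORECASE,
)

SAMPLE_LOG = """\
fpc0   SER Parity Check Error.
fpc0 unit 0 L3_DEFIP_DATA_ONLY entry 0 parity error
fpc0 Unit 0: mem: 2067=L3_DEFIP_DATA_ONLY blkoffset:6
fpc0 Unit 0: RESTORE[from Y pipe]: L3_DEFIP_DATA_ONLY[2067] blk: ipipe0 index: 0
fpc0 Unit 0: L3_DEFIP entry 0 TCAM parity error
fpc0 Unit 0: mem: 2046=L3_DEFIP blkoffset:6
fpc0 Unit 0: CACHE_RESTORE: L3_DEFIP[2046] blk: ipipe0 index: 0 : [0][0]
"""


def summarize_ser_events(log_text):
    """Count parity-error and restore messages per memory table name."""
    summary = {}
    for line in log_text.splitlines():
        err = ERROR_RE.search(line)
        if err:
            entry = summary.setdefault(err["mem"], {"errors": 0, "restores": 0})
            entry["errors"] += 1
            continue
        rest = RESTORE_RE.search(line)
        if rest:
            entry = summary.setdefault(rest["mem"], {"errors": 0, "restores": 0})
            entry["restores"] += 1
    return summary


def unrecovered(summary):
    """Return memory names that logged more errors than restores."""
    return sorted(m for m, c in summary.items() if c["errors"] > c["restores"])


if __name__ == "__main__":
    result = summarize_ser_events(SAMPLE_LOG)
    print(result)
    print("Unrecovered:", unrecovered(result))
```

A memory name returned by `unrecovered` only means that no matching restore line was found in the supplied text. It does not by itself prove that recovery failed on the device.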
 

Use the following commands to check whether the soft error recovery feature is running. These commands may also apply to other QFX5k platforms:

root@qfx> start shell
root@qfx:RE:0% cprod -A fpc0 -c "set dc getconfig" | grep parity
parity_correction = 1   <-- "1" indicates that parity error correction is enabled
parity_enable = 1       <-- "1" indicates that parity error detection is enabled
 
root@qfx:RE:0% cprod -A fpc0 -c 'set dc bc "memscan"'
HW (unit 0)
MemSCAN: Running on unit 0   <-- "Running" indicates that the memory scan feature is enabled on the device
MemSCAN:   Interval: 10000000 usec
MemSCAN:   Rate: 4096
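
When the two checks above are run across many devices, their output can be interpreted off-box. The following Python sketch assumes the command output has already been captured as text and that it has the same shape as the output shown in this article. The function names (`parse_parity_config`, `memscan_running_units`, `ser_status`) are hypothetical and chosen for illustration.

```python
import re


def parse_parity_config(output):
    """Extract 'parity_<name> = <int>' settings from captured getconfig output."""
    settings = {}
    for line in output.splitlines():
        m = re.match(r"\s*(parity_\w+)\s*=\s*(\d+)", line)
        if m:
            settings[m.group(1)] = int(m.group(2))
    return settings


def memscan_running_units(output):
    """Return the unit numbers reported as 'MemSCAN: Running on unit N'."""
    return [int(n) for n in re.findall(r"MemSCAN:\s+Running on unit (\d+)", output)]


def ser_status(getconfig_output, memscan_output):
    """Summarize detection, correction, and memory scan state as booleans."""
    cfg = parse_parity_config(getconfig_output)
    return {
        "detection": cfg.get("parity_enable") == 1,
        "correction": cfg.get("parity_correction") == 1,
        "memscan": bool(memscan_running_units(memscan_output)),
    }


if __name__ == "__main__":
    # Sample text copied from the output shown in this article.
    getconfig = "parity_correction = 1\nparity_enable = 1\n"
    memscan = (
        "HW (unit 0)\n"
        "MemSCAN: Running on unit 0\n"
        "MemSCAN:   Interval: 10000000 usec\n"
        "MemSCAN:   Rate: 4096\n"
    )
    print(ser_status(getconfig, memscan))
```

A missing setting is reported as `False` rather than raising an error, so a device running a release without the feature simply shows all three values as disabled.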

Use the following command to test soft error recovery:

root@qfx> start shell
root@qfx:RE:0% cprod -A fpc0 -c 'set dc bc "ser inject memory=l3_defip"'  <-- Injects an error into the PFE memory l3_defip

HW (unit 0)
Error injected on l3_defip at index 0 pipe_x


Besides the soft error recovery feature, rebooting or power cycling the device is another way to recover from a parity error. If that does not work, the problem is most likely a hardware issue. For further assistance, contact your JTAC representative.
 
