[Junos] Network-services mismatch between Primary RE/Backup RE or between RE/line card can result in traffic blackholing/unexpected behavior

Article ID: KB31920 | KB Last Updated: 30 Sep 2021 | Version: 5.0
Summary:

The network-services mode defines how a router chassis recognizes and uses certain modules. For example, certain line cards and fabric cards are powered off/on only with certain types of network services. Besides power on/off of modules, this network-services mode also enables certain features with respective modes. For example, features such as multicast replication, logical interface scaling, adaptive load balancing and enhanced-lag depend on these modes.

On the MX platform, we have the following network-services modes available with specific modes enabled by default on specific chassis.

  • IP (default)

  • Enhanced-IP

  • Enhanced-Ethernet

  • Ethernet 

  • LAN

For more details on which features are available and which module is powered on/off with these specific modes, refer to Network Services Mode Overview.

On PTX Series and T Series routers, we have the following two modes of network services available:

  • enhanced-mode

  • non-enhanced mode or Normal mode

For more details on these network-services modes on PTX and T Series routers, refer to enhanced-mode (Network Services).

To configure these modes, you need to configure the following under [edit chassis network-services].

lab@lab-mx# set chassis network-services ?
Possible completions:
  enhanced-ethernet    Enhanced ethernet network services
  enhanced-ip          Enhanced IP network services
  ethernet             Ethernet network services
  ip                   IP network services
  lan                  Ethernet LAN services
[edit]

On certain platforms, the network-services mode is applied by using the junos-defaults group and there is no need to configure any network-services mode on those platforms.

lab@lab-ptx> show configuration groups junos-defaults chassis
##
## protect: groups junos-defaults
##
network-services enhanced-mode;

Because this is a chassis-level configuration, whenever this mode is changed, the entire router has to be rebooted for the new network-services mode to take effect. In case of dual Routing Engines, you need to make sure that both REs are rebooted after making any change. For the system to work, this mode configuration should be in sync on the control plane and the forwarding plane; that is, the RPD, kernel, and line card should all operate in the same mode. 

If there is any discrepancy between the configuration/CLI view, kernel view, and line card view with respect to this network-services mode, you can see unexpected behavior such as kernel crash, FPC crash, and even blackhole/outage for unicast/multicast traffic.

This article describes scenarios in which the network-services mode can go out of sync between the two Routing Engines, or between the Routing Engine and the line card, and explains how to detect the mismatch.

Symptoms:

Primary RE and Backup RE out of sync (example from MX router)

Primary RE

root@master-re> show configuration chassis network-services
network-services enhanced-ip;

labroot@master-re> show chassis network-services
Network Services Mode: Enhanced-IP

root@master-re% sysctl net.netsvc
net.netsvc: 2                               <<< 2 indicates enhanced-ip mode.

Backup RE

root@backup-re> show configuration chassis network-services
network-services enhanced-ip;

root@backup-re> show chassis network-services
Network Services Mode: IP

root@backup-re% sysctl net.netsvc
net.netsvc: 0                              <<< 0 indicates IP mode.
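The primary/backup comparison above can be automated once the command outputs have been collected as text. The following is a minimal sketch, not a Juniper tool: the function names are hypothetical, the outputs are assumed to have been captured by some other means, and the kernel value table contains only the three values quoted in this article.

```python
import re

# Kernel values quoted in this article only; any other value is reported as unknown.
NETSVC_NAMES = {0: "IP", 2: "Enhanced-IP", 6: "Enhanced-Mode"}


def parse_cli_mode(output):
    """Return the mode string from 'show chassis network-services' output, or None."""
    match = re.search(r"Network Services Mode:\s*(\S+)", output)
    return match.group(1) if match else None


def parse_kernel_value(output):
    """Return the integer from 'sysctl net.netsvc' output, or None."""
    match = re.search(r"net\.netsvc:\s*(\d+)", output)
    return int(match.group(1)) if match else None


def compare_routing_engines(primary, backup):
    """Compare captured outputs of two REs.

    Each argument is a dict with the keys 'cli' and 'sysctl' holding the raw
    command output. Returns a list of human-readable mismatch descriptions.
    """
    p_cli, b_cli = parse_cli_mode(primary["cli"]), parse_cli_mode(backup["cli"])
    p_ker = parse_kernel_value(primary["sysctl"])
    b_ker = parse_kernel_value(backup["sysctl"])
    if None in (p_cli, b_cli, p_ker, b_ker):
        return ["could not parse one or more outputs"]

    problems = []
    if p_cli.lower() != b_cli.lower():
        problems.append(f"CLI view differs: primary={p_cli}, backup={b_cli}")
    if p_ker != b_ker:
        p_name = NETSVC_NAMES.get(p_ker, "unknown")
        b_name = NETSVC_NAMES.get(b_ker, "unknown")
        problems.append(
            f"Kernel view differs: primary={p_ker} ({p_name}), backup={b_ker} ({b_name})"
        )
    return problems
```

Fed with the outputs shown in this section, the function reports both a CLI mismatch and a kernel mismatch. An empty list means the two REs agree in both views.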

Routing Engine and Line card out of sync (Example from PTX router)

From Routing Engine

lab@ptx> show configuration groups junos-defaults chassis               
##
## protect: groups junos-defaults
##
network-services enhanced-mode;

lab@ptx> show chassis network-services  
Network Services Mode: Enhanced-Mode

root@ptx:# sysctl net.netsvc
net.netsvc: 0               <<< For enhanced-mode, this value should be 6. 

From FPC

lab@ptx> start shell pfe network fpc0

FPC0(ptx vty)# show shim fpc enh-mode    

ENH-MODE: paradise mode

Due to a difference in the network-services mode that is operational on the RE and the FPC, the following log can be seen in some cases as well. This syslog message was seen on an MX Series device:

dfwc: Filter () configured for enhanced-mode but the chassis is running in NORMAL (All FPC) mode

This indicates that the filter is not getting programmed correctly to the PFEs due to a difference in the network-services mode. These logs can be accompanied by the following symptoms as well: 

  • Traffic blackholing

  • DHCP subscribers not coming online

  • Intermittent traffic disruptions
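The dfwc message quoted above can be searched for in collected syslog files. The sketch below assumes the message always follows the wording of the single sample in this article; the pattern and function name are illustrative, and real messages may differ between releases.

```python
import re

# Pattern derived from the one sample line in this article (an assumption,
# not a documented message format).
DFWC_PATTERN = re.compile(
    r"dfwc: Filter \((?P<filter>[^)]*)\) configured for (?P<configured>\S+) "
    r"but the chassis is running in (?P<running>.+?) mode\s*$"
)


def find_dfwc_mismatches(lines):
    """Return one dict per syslog line that matches the dfwc mismatch message."""
    hits = []
    for line in lines:
        match = DFWC_PATTERN.search(line)
        if match:
            hits.append(match.groupdict())
    return hits
```

Each hit carries the filter name (empty in the sample), the mode the filter was compiled for, and the mode the chassis reported, which helps correlate the log with the CLI, kernel, and FPC checks.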

Cause:

There are multiple scenarios where this out-of-sync state can occur in the field. Listed below are a few examples:

  1. Routing Engine replacement scenario

In the case of dual Routing Engines, let us say that network-services is configured as enhanced-ip on an MX router and due to a hardware issue, there is a need to replace the backup RE. Usually customers deactivate graceful-switchover and nonstop-routing, and then halt the backup RE and replace it. Post replacement, they push the configuration from the primary RE by using commit synchronize.

In this scenario, the default network-services mode on the backup RE is "IP" and when commit synchronize is executed from the primary RE, it triggers a network-services change and the new backup RE has to be rebooted for the network-services to take effect.

However, if this reboot is skipped, the primary RE and the backup RE end up operating in different network-services modes. The impact is seen when the new backup RE becomes the primary: because its network-services mode is not enhanced-ip, all the features and modules that depended on enhanced-ip mode can stop working. PR1287956 covers one such example. For the recommended steps when performing a Routing Engine replacement, refer to KB32992 - [MX] Recommendations for performing RE replacement on an MX router configured with graceful-switchover and 'network-services enhanced-ip/enhanced-ethernet'.

  2. Changing the network-services mode under [edit chassis network-services] in the case of dual REs

If the network-services mode has to be changed via configuration, two possible scenarios can result in out-of-sync states.

In the first scenario, only the primary RE is rebooted after the network-services change and the backup RE is not. The primary RE and the backup RE are then out of sync. If an RE switchover is performed much later, the impact appears only at that point, and tracing it back to the network-services change might not be easy.

The second scenario can occur when graceful-switchover is configured. If the network-services mode is changed and the REs are rebooted one at a time (by performing an RE switchover and rebooting the backup RE) such that the FPC is never rebooted, the FPC can go out of sync with both REs.

A few PRs where out-of-sync states were reported in the past are PR1149928 and PR1336378.

  3. Routing Engine booting up into Amnesiac mode

This scenario can happen in both single RE and dual RE systems.

Let us say that the customer has configured enhanced-ip mode and is using event-scripts in the configuration. For event-scripts, a slax file is copied under /var/db/scripts/event, /var/run/scripts/event, or /config/scripts/event/, depending on the Junos OS release. If, during RE replacement or while modifying event-scripts, you fail to copy the required slax script to the expected location but still reference that script in the configuration and reboot the RE, the RE can boot into Amnesiac mode because it fails to find the specified file during the boot process. Once the RE boots up in Amnesiac mode, it does not honor the network-services mode set via configuration or via the junos-defaults group. In such cases, the out-of-sync state can occur.
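Before rebooting an RE, it can help to confirm that every event script referenced in the configuration is present in one of the directories named above. The following is a minimal sketch: the function name is hypothetical, the caller supplies the script file names (the sketch does not parse the Junos configuration), and it assumes it runs where those paths are visible.

```python
import os

# Directories named in this article; which one applies depends on the Junos OS release.
EVENT_SCRIPT_DIRS = (
    "/var/db/scripts/event",
    "/var/run/scripts/event",
    "/config/scripts/event",
)


def find_missing_event_scripts(script_names, search_dirs=EVENT_SCRIPT_DIRS):
    """Return the script names that are not found in any of the search directories."""
    missing = []
    for name in script_names:
        found = any(os.path.isfile(os.path.join(d, name)) for d in search_dirs)
        if not found:
            missing.append(name)
    return missing
```

A non-empty result lists scripts that are referenced but absent, which is the condition this scenario describes as leading to Amnesiac mode.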

Solution:

From the router perspective, the network-services mode should be in sync from the configuration/CLI view, kernel view, and linecard/FPC view. Commands to check it in all three places on various platforms are given below:

On MX Platform

CLI View

lab@lab-mx> show chassis network-services
Network Services Mode: Enhanced-IP

Kernel View

start shell user root
root@lab-mx# sysctl net.netsvc
net.netsvc: 2              << 2 - enhanced-ip mode,  0 - ip mode

Line card or FPC view: Two ways to check

Here 0 is the PFE number.

SMPC0(lab-mx vty)# show jnh 0 vc mc state    

GLOBAL VC MULTICAST INFO
========================
  VC Mode Enabled:          ---> NO<---
  chassis id non-0:         ---> NO<---
  VC jnh structs allocated: ---> NO<---
  VC Member ID  (0 based): 0
  VC Chassis ID (1 based): 0

  VC BSDT target: No
  static mcast mode: TRINITY-ONLY             << TRINITY-ONLY indicates enhanced-ip mode; in IP mode, this shows ICHIP INTEROP.

SMPC11(lab-mx)# show nhdb hw mcast                          
Enhanced IP                            : Enabled                   <<<< 
Ingress replication latency fairness   : Disabled
Local replication latency fairness     : Disabled
   |-- Replication Threshold           : 10
   +-- Replication Thread Limit        : 256
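The two FPC checks above can be reduced to a yes/no answer per line card once the vty output has been captured as text. This sketch relies only on the strings shown in this article (TRINITY-ONLY, ICHIP INTEROP, and "Enhanced IP : Enabled"); the function names are hypothetical, and treating any value other than "Enabled" as not enhanced is an assumption.

```python
import re


def fpc_enhanced_ip_from_jnh(output):
    """Classify 'show jnh 0 vc mc state' output.

    Returns True for TRINITY-ONLY, False for ICHIP INTEROP, None if unrecognized.
    """
    match = re.search(r"static mcast mode:\s*(.+)", output)
    if not match:
        return None
    value = match.group(1).strip().upper()
    if value.startswith("TRINITY-ONLY"):
        return True
    if value.startswith("ICHIP INTEROP"):
        return False
    return None


def fpc_enhanced_ip_from_nhdb(output):
    """Classify 'show nhdb hw mcast' output.

    Returns True if the 'Enhanced IP' field reads Enabled, False for any other
    value (assumption), None if the field is absent.
    """
    match = re.search(r"Enhanced IP\s*:\s*(\w+)", output)
    if not match:
        return None
    return match.group(1).lower() == "enabled"
```

Comparing these results with the CLI and kernel views of the RE shows whether a given FPC is running in the same mode.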

PTX Platform

CLI View

lab@lab-ptx> show chassis network-services
Network Services Mode: Enhanced-Mode

Kernel View

lab@lab-ptx> start shell user root
root@lab-ptx# sysctl net.netsvc
net.netsvc: 6

Line card or FPC view

lab@lab-ptx> start shell pfe network fpc0
LNX-FPC0(lab-ptx vty)# show shim fpc enh-mode    

ENH-MODE: paradise mode

If you run into any of the scenarios mentioned in the "Cause" section, it is recommended to check if the network-services mode is in sync from the CLI/Kernel/Linecard perspective. If there is any discrepancy, it should be fixed first before troubleshooting further.
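For the PTX case, the three-way check can be sketched as a single function over captured outputs. The expected values below are taken from the in-sync PTX example in this article; treating "paradise mode" as the FPC string that corresponds to enhanced-mode is an inference from that example, and the function name is hypothetical.

```python
import re

# Expected values from the in-sync PTX example in this article.
EXPECTED_CLI = "Enhanced-Mode"
EXPECTED_KERNEL = 6
EXPECTED_FPC = "paradise mode"  # assumption: inferred from the article's example


def check_ptx_views(cli_output, sysctl_output, fpc_output):
    """Return a list naming every view (cli, kernel, fpc) that deviates from enhanced-mode."""
    findings = []

    cli = re.search(r"Network Services Mode:\s*(\S+)", cli_output)
    if not cli or cli.group(1) != EXPECTED_CLI:
        findings.append("cli")

    kernel = re.search(r"net\.netsvc:\s*(\d+)", sysctl_output)
    if not kernel or int(kernel.group(1)) != EXPECTED_KERNEL:
        findings.append("kernel")

    fpc = re.search(r"ENH-MODE:\s*(.+)", fpc_output)
    if not fpc or fpc.group(1).strip() != EXPECTED_FPC:
        findings.append("fpc")

    return findings
```

With the out-of-sync PTX outputs from the "Symptoms" section (kernel value 0), the function returns only the kernel view, matching the discrepancy called out there.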

An event-script is also available, which can be deployed on MX/PTX/T4000 platforms running Junos OS release 14.1 and later. This event script verifies the network-services mode between the primary/backup RE and all FPCs. If any mismatch is detected, an emergency-level syslog message and an SNMP trap are generated. Details on the script and its usage are available at check_netsvc.slax.

In case of a discrepancy, you can perform the following steps to synchronize the network-services mode between the RE and the FPC: 

  1. Reboot one of the FPCs to see if its network-services mode gets in sync with the RE. Use "show jnh 0 vc mc state" from the FPC shell to verify.

  2. If this does not help, reboot the device. Make sure to verify the status on all the FPCs by using the FPC shell command "show jnh 0 vc mc state".

Modification History:
  • 2020-06-16: Article checked for accuracy; minor non-technical changes made

  • 2021-03-25: Updated the article terminology to align with Juniper's Inclusion & Diversity initiatives

  • 2021-09-30: Removed "sysctl -a" commands (they are intrusive), replaced with targeted correct "sysctl net.netsvc" commands
