[SRX] Heartbeat errors incrementing on both nodes of SRX chassis cluster control link

Article ID: KB37129   KB Last Updated: 30 Jun 2021   Version: 1.0
Summary:

This article explains why users may see incrementing heartbeat errors on the control link of both nodes in an SRX chassis cluster and recommends defining each cluster in a separate VLAN to avoid such increments.

Symptoms:

Users may see heartbeat errors incrementing on the control link of an SRX chassis cluster as indicated below:

root@SRX-Node0> show chassis cluster information detail no-forwarding

Control link statistics:
    Control link 0:
    Heartbeat packets sent: 3050
    Heartbeat packets received: 2777
    Heartbeat packet errors: 6098
    Duplicate heartbeat packets received: 0

root@SRX-Node1> show chassis cluster information detail no-forwarding

Control link statistics:
    Control link 0:
    Heartbeat packets sent: 2794
    Heartbeat packets received: 2792
    Heartbeat packet errors: 5586
    Duplicate heartbeat packets received: 0
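
To confirm that the error counter is still climbing, rather than showing a stale value, the command can be run twice and the two outputs compared. The following is a minimal Python sketch of that comparison, assuming the output has already been captured as text and that only one control link is listed. The function names and the `NODE0_SAMPLE` constant are illustrative and not part of any Juniper tooling.

```python
import re

# Counter names as they appear in the "Control link statistics" block above.
COUNTER_LINE = re.compile(
    r"^\s*(Heartbeat packets sent|Heartbeat packets received|"
    r"Heartbeat packet errors|Duplicate heartbeat packets received):\s*(\d+)\s*$"
)

# Output copied from the node0 example in this article.
NODE0_SAMPLE = """\
Control link statistics:
    Control link 0:
    Heartbeat packets sent: 3050
    Heartbeat packets received: 2777
    Heartbeat packet errors: 6098
    Duplicate heartbeat packets received: 0
"""


def parse_control_link_stats(cli_output):
    """Return the heartbeat counters found in the captured CLI text.

    Assumption: a single control link is listed. With two control links,
    later values would overwrite earlier ones in this simple sketch.
    """
    stats = {}
    for line in cli_output.splitlines():
        match = COUNTER_LINE.match(line)
        if match:
            stats[match.group(1)] = int(match.group(2))
    return stats


def heartbeat_errors_incrementing(earlier_output, later_output):
    """True if 'Heartbeat packet errors' grew between the two captures."""
    key = "Heartbeat packet errors"
    earlier = parse_control_link_stats(earlier_output)
    later = parse_control_link_stats(later_output)
    return later[key] > earlier[key]


if __name__ == "__main__":
    print(parse_control_link_stats(NODE0_SAMPLE)["Heartbeat packet errors"])
```

Passing two captures taken a few seconds apart to `heartbeat_errors_incrementing` shows whether the problem is ongoing.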
Cause:

The errors increment because a cluster node receives heartbeats from a node that belongs to a different cluster but sits in the same broadcast domain.

Check the jsrpd logs for heartbeats that carry a cluster ID other than the one configured for this cluster. For example, you may see a log entry like the following:

"May 24 08:01:22 invalid cluster-id 4 in heartbeat"

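When the log is long, it can help to tally which foreign cluster IDs appear and how often. The sketch below is one way to do that in Python, assuming the jsrpd log has been exported as plain text and that the relevant entries contain the phrase shown in the example above. The function name `foreign_cluster_ids` is illustrative.

```python
import re
from collections import Counter

# Matches the message text shown in the example log entry above.
INVALID_ID = re.compile(r"invalid cluster-id (\d+) in heartbeat")


def foreign_cluster_ids(log_lines):
    """Count rejected heartbeats per cluster ID found in the log lines."""
    counts = Counter()
    for line in log_lines:
        match = INVALID_ID.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


if __name__ == "__main__":
    sample = ["May 24 08:01:22 invalid cluster-id 4 in heartbeat"]
    print(foreign_cluster_ids(sample))  # Counter({4: 1})
```

Each key in the result is the ID of another cluster whose heartbeats are reaching this control link.
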
The cluster ID is not part of the device configuration; it is stored in the CB. Run the following command to display the cluster ID:

root@SRX-Node0> show chassis cluster status

Monitor Failure codes:
    CS  Cold Sync monitoring        FL  Fabric Connection monitoring
    GR  GRES monitoring             HW  Hardware monitoring
    IF  Interface monitoring        IP  IP monitoring
    LB  Loopback monitoring         MB  Mbuf monitoring
    NH  Nexthop monitoring          NP  NPC monitoring
    SP  SPU monitoring              SM  Schedule monitoring
    CF  Config Sync monitoring      RE  Relinquish monitoring
    IS  IRQ storm

Cluster ID: 3                              
Node   Priority Status               Preempt Manual   Monitor-failures

Redundancy group: 0 , Failover count: 1
node0  100      primary              no      no       None
node1  1        secondary            no      no       None

Redundancy group: 1 , Failover count: 1
node0  100      primary              no      no       None
node1  1        secondary            no      no       None
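
The cluster ID reported here can then be compared with the ID seen in the jsrpd log. The following Python sketch shows that comparison, assuming the command output has been captured as text. The names `local_cluster_id`, `heartbeat_is_foreign`, and `STATUS_SAMPLE` are illustrative only.

```python
import re

# Relevant excerpt of the command output shown above.
STATUS_SAMPLE = """\
Cluster ID: 3
Node   Priority Status               Preempt Manual   Monitor-failures

Redundancy group: 0 , Failover count: 1
node0  100      primary              no      no       None
node1  1        secondary            no      no       None
"""


def local_cluster_id(status_output):
    """Return the integer after 'Cluster ID:', or None if it is absent."""
    match = re.search(r"^Cluster ID:\s*(\d+)", status_output, re.MULTILINE)
    return int(match.group(1)) if match else None


def heartbeat_is_foreign(heartbeat_cluster_id, status_output):
    """True if a heartbeat's cluster ID differs from the local cluster ID."""
    local_id = local_cluster_id(status_output)
    if local_id is None:
        raise ValueError("no 'Cluster ID:' line found in the output")
    return heartbeat_cluster_id != local_id


if __name__ == "__main__":
    # The log example in this article reports cluster-id 4; this cluster is 3.
    print(heartbeat_is_foreign(4, STATUS_SAMPLE))  # True
```

A `True` result for the logged ID is consistent with the cause described in this article: heartbeats from another cluster are arriving on this control link.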
Solution:

To prevent the heartbeat errors from incrementing as shown above, define each cluster in a separate VLAN so that the control links of different clusters do not share a broadcast domain.

For more information, see "Example: Setting the Node ID and Cluster ID for Security Devices in a Chassis Cluster."
