In a chassis cluster, a flow can come in on an interface of Node0 and leave from an interface of Node1. This situation is called Z-Mode: the flow has to cross the fabric link to reach the other node, which is active for the other redundancy group. Z-Mode can cause undesirable results such as low throughput and/or increased latency.
This article describes a TCP flow and an ICMP flow going through the cluster in Z-Mode and compares them with the same flows traversing a single node.
The goal is to identify the flows that are traversing the SRX in Z-Mode. This will help make the right decisions while troubleshooting.
The Junos code used here is 11.2R3.
{primary:node0}[edit]
root@D10_30-SRX240H-Node0-HQ# run show version
node0:
--------------------------------------------------------------------------
Hostname: D10_30-SRX240H-Node0-HQ
Model: srx240h
JUNOS Software Release [11.2R3.3]
node1:
--------------------------------------------------------------------------
Hostname: D10_32-SRX240H-Node1-HQ
Model: srx240h
JUNOS Software Release [11.2R3.3]
Asymmetric Flow:
================
- In this example, notice that "Redundancy group: 1" is primary on Node 1 and "Redundancy group: 2" is primary on Node 0.
- The source is located in "Redundancy group: 2" and the target is located in "Redundancy group: 1".
- This means that the flow enters the cluster on Node 0 and leaves from Node 1.
{primary:node0}[edit]
root@D10_30-SRX240H-Node0-HQ# run show chassis cluster status
Cluster ID: 1
Node Priority Status Preempt Manual failover
Redundancy group: 0 , Failover count: 1
node0 100 primary no no
node1 1 secondary no no
Redundancy group: 1 , Failover count: 2
node0 100 secondary yes yes
node1 255 primary yes yes <<<<< Notice Node1 is primary.
Redundancy group: 2 , Failover count: 1
node0 100 primary yes no <<<<< Notice Node0 is primary.
node1 1 secondary yes no
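The check done by eye above can be sketched in a few lines of Python. This is only an illustration, assuming the output of "show chassis cluster status" has been saved as text in the layout shown in this article. The function names primary_per_group and z_mode_possible are hypothetical helpers, not part of Junos. Redundancy group 0 is left out of the comparison because it covers the Routing Engine rather than the reth interfaces.

```python
import re

def primary_per_group(status_text):
    """Map each redundancy group number to the node listed as primary."""
    primaries = {}
    group = None
    for line in status_text.splitlines():
        header = re.search(r"Redundancy group:\s*(\d+)", line)
        if header:
            group = int(header.group(1))
            continue
        # Node rows look like: "node1 255 primary yes yes"
        row = re.match(r"\s*(node\d+)\s+\d+\s+(primary|secondary)\b", line)
        if row and group is not None and row.group(2) == "primary":
            primaries[group] = row.group(1)
    return primaries

def z_mode_possible(primaries):
    """True when redundancy groups 1 and above are primary on different nodes."""
    data_nodes = {node for group, node in primaries.items() if group >= 1}
    return len(data_nodes) > 1

# Sample text copied from the cluster status shown in this article.
SAMPLE = """\
Redundancy group: 0 , Failover count: 1
node0 100 primary no no
node1 1 secondary no no
Redundancy group: 1 , Failover count: 2
node0 100 secondary yes yes
node1 255 primary yes yes
Redundancy group: 2 , Failover count: 1
node0 100 primary yes no
node1 1 secondary yes no
"""

primaries = primary_per_group(SAMPLE)
print(primaries)
print(z_mode_possible(primaries))
```

A True result only says that Z-Mode is possible for flows between the two groups. Whether a given flow actually crosses the fabric link still has to be confirmed from the session table.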
TCP:
====
- The source initiated a Telnet session.
- Here is what is seen on the cluster:
{primary:node0}[edit]
root@D10_30-SRX240H-Node0-HQ# run show security flow session destination-port 23
node0:
--------------------------------------------------------------------------
Session ID: 7059, Policy name: CORP-INT-to-Internet/12, State: Backup, Timeout: 1790, Valid <<<< Notice the Backup state here.
In: 10.168.0.1/61529 --> 1.1.2.2/23;tcp, If: reth2.150, Pkts: 9, Bytes: 607
Out: 1.1.2.2/23 --> 1.1.2.63/61529;tcp, If: reth0.0, Pkts: 0, Bytes: 0 <<<<<< Notice there are no packets in the second wing of the session on Node 0.
Total sessions: 1
node1:
--------------------------------------------------------------------------
Session ID: 1127, Policy name: CORP-INT-to-Internet/12, State: Active, Timeout: 1790, Valid <<<< Notice the Active state here.
In: 10.168.0.1/61529 --> 1.1.2.2/23;tcp, If: reth2.150, Pkts: 0, Bytes: 0
Out: 1.1.2.2/23 --> 1.1.2.63/61529;tcp, If: reth0.0, Pkts: 9, Bytes: 647 <<<<<< Notice the return packets are seen on Node1.
Total sessions: 1
- Notice the "State" field. The handshake happened on the interface that was active on Node1, hence the session is labelled "Active" on Node1 and "Backup" on Node0.
- The TCP session was completely synchronized on both nodes.
- This was a scenario where a TCP flow was traversing the SRX in Z-Mode.
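The packet counters in the output above give a simple signature for Z-Mode: the "In" wing is counted on one node and the "Out" wing on the other. The following Python sketch applies that rule to saved "show security flow session" output. It assumes the text holds a single session per node (for example, filtered by destination port as above). The names wing_packets and is_z_mode are hypothetical helpers used for illustration only.

```python
import re

WING = re.compile(r"^\s*(In|Out):.*Pkts:\s*(\d+)")
NODE = re.compile(r"^\s*(node\d+):\s*$")

def wing_packets(session_text):
    """Return {node: {"In": pkts, "Out": pkts}} from per-node session output."""
    counts = {}
    node = None
    for line in session_text.splitlines():
        node_header = NODE.match(line)
        if node_header:
            node = node_header.group(1)
            counts[node] = {"In": 0, "Out": 0}
            continue
        wing = WING.match(line)
        if wing and node is not None:
            counts[node][wing.group(1)] += int(wing.group(2))
    return counts

def is_z_mode(counts):
    """True when the In and Out wings carry packets on different nodes."""
    in_nodes = {n for n, c in counts.items() if c["In"] > 0}
    out_nodes = {n for n, c in counts.items() if c["Out"] > 0}
    return bool(in_nodes) and bool(out_nodes) and in_nodes.isdisjoint(out_nodes)

# Wing lines copied from the Telnet session shown in this article.
Z_MODE_TCP = """\
node0:
In: 10.168.0.1/61529 --> 1.1.2.2/23;tcp, If: reth2.150, Pkts: 9, Bytes: 607
Out: 1.1.2.2/23 --> 1.1.2.63/61529;tcp, If: reth0.0, Pkts: 0, Bytes: 0
node1:
In: 10.168.0.1/61529 --> 1.1.2.2/23;tcp, If: reth2.150, Pkts: 0, Bytes: 0
Out: 1.1.2.2/23 --> 1.1.2.63/61529;tcp, If: reth0.0, Pkts: 9, Bytes: 647
"""

print(wing_packets(Z_MODE_TCP))
print(is_z_mode(wing_packets(Z_MODE_TCP)))
```

If both wings show packets on the same node, the function returns False, which corresponds to a flow that stays on a single node.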
ICMP:
=====
{primary:node0}[edit]
root@D10_30-SRX240H-Node0-HQ# run show security flow session protocol icmp
node0:
--------------------------------------------------------------------------
Session ID: 7053, Policy name: CORP-INT-to-Internet/12, State: Forward, Timeout: 8, Valid <<<< Notice the "Forward" state.
In: 10.168.0.1/0 --> 1.1.2.2/21116;icmp, If: reth2.150, Pkts: 1, Bytes: 84
Out: 1.1.2.2/21116 --> 1.1.2.63/0;icmp, If: reth0.0, Pkts: 0, Bytes: 0 <<<< No packets received on Node0.
Total sessions: 1
node1:
--------------------------------------------------------------------------
Session ID: 1122, Policy name: CORP-INT-to-Internet/12, State: Active, Timeout: 2, Valid <<<< Notice the "Active" state.
In: 10.168.0.1/0 --> 1.1.2.2/21116;icmp, If: reth2.150, Pkts: 0, Bytes: 0
Out: 1.1.2.2/21116 --> 1.1.2.63/0;icmp, If: reth0.0, Pkts: 1, Bytes: 84 <<<< Packets are received here.
Total sessions: 1
- The only difference between the TCP session and the ICMP flow is the state shown on Node0: "Forward" for ICMP, where the TCP session showed "Backup". Node1 is "Active" in both cases. It means that Node0 received the initial packets and forwarded them over the fabric link to the other node. There was no session sync for the ICMP flow.
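The state combinations seen in this article can be summarized with a small Python sketch. It assumes saved "show security flow session" output with at most one session per node, and it only covers the combinations observed here. The names states_by_node and sync_summary are hypothetical helpers, and the returned wording is this sketch's own, not Junos terminology.

```python
import re

def states_by_node(session_text):
    """Return {node: state} for output that holds one session per node."""
    states = {}
    node = None
    for line in session_text.splitlines():
        header = re.match(r"^\s*(node\d+):\s*$", line)
        if header:
            node = header.group(1)
            continue
        state = re.search(r"\bState:\s*(\w+)", line)
        if state and node is not None:
            states[node] = state.group(1)
    return states

def sync_summary(states):
    """Describe the Active/Backup/Forward combination seen across the nodes."""
    seen = set(states.values())
    if {"Active", "Backup"} <= seen:
        return "synchronized to the peer node"
    if {"Active", "Forward"} <= seen:
        return "forwarded over the fabric link, not synchronized"
    if seen == {"Active"}:
        return "local to one node, not synchronized"
    return "unrecognized combination"

# Session header lines copied from the ICMP output shown in this article.
Z_MODE_ICMP = """\
node0:
Session ID: 7053, Policy name: CORP-INT-to-Internet/12, State: Forward, Timeout: 8, Valid
node1:
Session ID: 1122, Policy name: CORP-INT-to-Internet/12, State: Active, Timeout: 2, Valid
"""

print(states_by_node(Z_MODE_ICMP))
print(sync_summary(states_by_node(Z_MODE_ICMP)))
```

Note that an Active/Backup pair on its own does not prove Z-Mode, because a synchronized TCP session shows the same pair when all traffic stays on one node. The per-wing packet counters are still needed to confirm which node handled each direction.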
Normal Flow:
============
- In this scenario Node 0 is the primary for both redundancy groups 1 and 2.
{primary:node0}[edit]
root@D10_30-SRX240H-Node0-HQ# run show chassis cluster status
Cluster ID: 1
Node Priority Status Preempt Manual failover
Redundancy group: 0 , Failover count: 1
node0 100 primary no no
node1 1 secondary no no
Redundancy group: 1 , Failover count: 3
node0 100 primary yes no
node1 1 secondary yes no
Redundancy group: 2 , Failover count: 1
node0 100 primary yes no
node1 1 secondary yes no
TCP:
====
- Notice that the session is still synchronized between the nodes, even though all the traffic is going through Node 0.
{primary:node0}[edit]
root@D10_30-SRX240H-Node0-HQ# run show security flow session destination-port 23
node0:
--------------------------------------------------------------------------
Session ID: 7071, Policy name: CORP-INT-to-Internet/12, State: Active, Timeout: 1792, Valid
In: 10.168.0.1/63196 --> 1.1.2.2/23;tcp, If: reth2.150, Pkts: 10, Bytes: 671
Out: 1.1.2.2/23 --> 1.1.2.63/63196;tcp, If: reth0.0, Pkts: 9, Bytes: 647
Total sessions: 1
node1:
--------------------------------------------------------------------------
Session ID: 1136, Policy name: CORP-INT-to-Internet/12, State: Backup, Timeout: 14408, Valid <<< Session is synchronized here
In: 10.168.0.1/63196 --> 1.1.2.2/23;tcp, If: reth2.150, Pkts: 0, Bytes: 0
Out: 1.1.2.2/23 --> 1.1.2.63/63196;tcp, If: reth0.0, Pkts: 0, Bytes: 0
Total sessions: 1
ICMP:
=====
- Notice that there is no session sync with ICMP. Node 0 didn't "Forward" anything to Node 1. This is expected behavior.
{primary:node0}[edit]
root@D10_30-SRX240H-Node0-HQ# run show security flow session protocol icmp
node0:
--------------------------------------------------------------------------
Session ID: 7152, Policy name: CORP-INT-to-Internet/12, State: Active, Timeout: 2, Valid
In: 10.168.0.1/0 --> 1.1.2.2/21296;icmp, If: reth2.150, Pkts: 1, Bytes: 84
Out: 1.1.2.2/21296 --> 1.1.2.63/0;icmp, If: reth0.0, Pkts: 1, Bytes: 84
Total sessions: 1
node1:
--------------------------------------------------------------------------
Total sessions: 0