[Contrail] Format of Zookeeper transaction ID

Article ID: KB35227 KB Last Updated: 16 Nov 2019 Version: 1.0
Summary:

Each write operation in ZooKeeper is a transaction and is assigned a transaction ID (zxid).

This article explains how the transaction ID is structured. When troubleshooting, you can use this information to read the latest zxid from each ZooKeeper server in the ensemble and determine whether a server has fallen behind or a split-brain condition has occurred.

Solution:

The ZooKeeper transaction ID is a 64-bit value made up of two parts: a 32-bit epoch in the high-order bits and a 32-bit counter in the low-order bits. The epoch changes each time a new leader is elected, and the counter increases with each write operation.
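
The split described above can be sketched in a few lines of Python. This is a minimal illustration, not part of ZooKeeper or Contrail: the helper names split_zxid and join_zxid are hypothetical, and the sample value 0x700000009b is the zxid used in the example in this article.

```python
from typing import Tuple

EPOCH_SHIFT = 32           # the epoch occupies the high-order 32 bits
COUNTER_MASK = 0xFFFFFFFF  # the counter occupies the low-order 32 bits


def split_zxid(zxid_hex: str) -> Tuple[int, int]:
    """Split a zxid string such as '0x700000009b' into (epoch, counter)."""
    zxid = int(zxid_hex, 16)
    return zxid >> EPOCH_SHIFT, zxid & COUNTER_MASK


def join_zxid(epoch: int, counter: int) -> str:
    """Combine an epoch and a counter back into a zxid hex string."""
    return hex((epoch << EPOCH_SHIFT) | counter)


epoch, counter = split_zxid("0x700000009b")
print(hex(epoch), hex(counter))  # 0x70 0x9b
print(join_zxid(epoch, counter))  # 0x700000009b
```

Because the epoch sits in the high-order bits, comparing two zxids as plain integers orders them by epoch first and by counter second.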

The following example shows how the transaction ID changes when a new leader is elected. At the beginning, all three servers report the same latest zxid, 0x700000009b, where 0x70 is the epoch part and 0x0000009b is the counter part. Concatenating the epoch part and the counter part gives the full zxid.

root@c101:~# echo stat | nc localhost 2181
Zookeeper version: 3.4.5--1, built on 06/10/2013 17:26 GMT
Clients:
 /172.18.101.101:34539[1](queued=0,recved=190881,sent=190881)
 /172.18.101.101:54438[1](queued=0,recved=299127,sent=299127)
 /172.18.101.102:47848[1](queued=0,recved=454,sent=454)
 /172.18.101.102:47830[1](queued=0,recved=4503,sent=4504)
 /172.18.101.103:41946[1](queued=0,recved=4496,sent=4496)
 /172.18.101.101:54406[1](queued=0,recved=4501,sent=4501)
 /172.18.101.102:47835[1](queued=0,recved=299112,sent=299112)
 /172.18.101.101:34518[1](queued=0,recved=190932,sent=190932)
 /172.18.101.103:41923[1](queued=0,recved=449,sent=449)
 /172.18.101.102:57197[1](queued=0,recved=190851,sent=190852)
 /172.18.101.101:54377[1](queued=0,recved=4502,sent=4502)
 /172.18.101.101:34699[1](queued=0,recved=190740,sent=190740)
 /172.18.101.103:38405[1](queued=0,recved=299135,sent=299135)
 /172.18.101.101:54349[1](queued=0,recved=456,sent=457)
 /172.18.101.102:57193[1](queued=0,recved=190935,sent=190936)
 /0:0:0:0:0:0:0:1:36134[0](queued=0,recved=1,sent=0)
 /172.18.101.101:59966[1](queued=0,recved=4495,sent=4495)
 /172.18.101.103:38384[1](queued=0,recved=4503,sent=4504)
 /172.18.101.103:41944[1](queued=0,recved=4496,sent=4496)
Latency min/avg/max: 0/0/15
Received: 1884938
Sent: 1884942
Connections: 19
Outstanding: 0
Zxid: 0x700000009b
Mode: follower
Node count: 2043
root@c102:~# echo stat | nc localhost 2181
Zookeeper version: 3.4.5--1, built on 06/10/2013 17:26 GMT
Clients:
 /127.0.0.1:51864[1](queued=0,recved=622,sent=622)
 /0:0:0:0:0:0:0:1:58776[0](queued=0,recved=1,sent=0)
 /127.0.0.1:51905[1](queued=0,recved=622,sent=622)
 /127.0.0.1:51852[1](queued=0,recved=622,sent=622)
Latency min/avg/max: 0/0/6
Received: 2181
Sent: 2180
Connections: 4
Outstanding: 0
Zxid: 0x700000009b
Mode: follower
Node count: 2043
root@c103:~# echo stat | nc localhost 2181
Zookeeper version: 3.4.5--1, built on 06/10/2013 17:26 GMT
Clients:
 /0:0:0:0:0:0:0:1:52529[0](queued=0,recved=1,sent=0)
 /172.18.101.103:52966[1](queued=0,recved=190887,sent=190888)
 /172.18.101.103:53371[1](queued=0,recved=190228,sent=190228)
 /172.18.101.102:51860[1](queued=0,recved=4495,sent=4495)
 /172.18.101.102:55437[1](queued=0,recved=190441,sent=190441)
 /172.18.101.102:51833[1](queued=0,recved=4495,sent=4495)
 /172.18.101.103:52964[1](queued=0,recved=190889,sent=190890)
Latency min/avg/max: 0/0/13
Received: 771444
Sent: 771445
Connections: 7
Outstanding: 0
Zxid: 0x700000009b
Mode: leader
Node count: 2043
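
When the same check has to be repeated across several servers, the stat output can be collected and parsed in a script instead of being read by eye. The following is a rough sketch under stated assumptions: fetch_stat and parse_stat are hypothetical helper names, the port 2181 matches the commands above, and the sample text is an abbreviated copy of the c103 output. Newer ZooKeeper releases may require the stat command to be explicitly enabled, so treat the network part as untested against your deployment.

```python
import socket
from typing import Dict


def fetch_stat(host: str, port: int = 2181, timeout: float = 5.0) -> str:
    """Send the 'stat' four-letter command and return the raw reply text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"stat")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_stat(text: str) -> Dict[str, str]:
    """Extract the 'Zxid' and 'Mode' fields from stat output."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in ("Zxid", "Mode"):
            fields[key.strip()] = value.strip()
    return fields


# Abbreviated copy of the c103 output shown above.
SAMPLE = """Zookeeper version: 3.4.5--1, built on 06/10/2013 17:26 GMT
Clients:
 /0:0:0:0:0:0:0:1:52529[0](queued=0,recved=1,sent=0)
Latency min/avg/max: 0/0/13
Outstanding: 0
Zxid: 0x700000009b
Mode: leader
Node count: 2043
"""

print(parse_stat(SAMPLE))  # {'Zxid': '0x700000009b', 'Mode': 'leader'}
```

In practice, parse_stat(fetch_stat("localhost")) would stand in for reading the output of echo stat | nc localhost 2181.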

The current leader is c103. We stop ZooKeeper on c103 for a while, then start it again. Once the old leader is gone, a new leader is elected and the epoch part of the zxid increases by 1.

root@c103:~# service zookeeper stop
zookeeper stop/waiting
root@c103:~# echo stat | nc localhost 2181
root@c103:~# service zookeeper start
zookeeper start/running, process 4126

Now we check again and find that the zxid has changed to 0x7100000012. The epoch has changed from 0x70 to 0x71, as expected, and the new leader is c102.

root@c101:~# echo stat | nc localhost 2181
Zookeeper version: 3.4.5--1, built on 06/10/2013 17:26 GMT
Clients:
 /172.18.101.102:39011[1](queued=0,recved=3,sent=3)
/172.18.101.101:35636[1](queued=0,recved=2,sent=2)
/172.18.101.103:39482[1](queued=0,recved=1,sent=1)
/0:0:0:0:0:0:0:1:36755[0](queued=0,recved=1,sent=0)
/172.18.101.103:39457[1](queued=0,recved=1,sent=1)
/172.18.101.101:35722[1](queued=0,recved=5,sent=5)
/172.18.101.103:39448[1](queued=0,recved=1,sent=1)
/172.18.101.101:35647[1](queued=0,recved=2,sent=2)
/172.18.101.103:39474[1](queued=0,recved=2,sent=2)
/172.18.101.102:39043[1](queued=0,recved=5,sent=5)
/172.18.101.101:35697[1](queued=0,recved=5,sent=5)
/172.18.101.102:38970[1](queued=0,recved=1,sent=1)
/172.18.101.103:39487[1](queued=0,recved=5,sent=5)
/172.18.101.101:35644[1](queued=0,recved=1,sent=1)
/172.18.101.102:38977[1](queued=0,recved=2,sent=2)
/172.18.101.101:35725[1](queued=0,recved=3,sent=3)
/172.18.101.102:38980[1](queued=0,recved=1,sent=1)
/172.18.101.103:39489[1](queued=0,recved=2,sent=2)
/172.18.101.102:39035[1](queued=0,recved=1,sent=1)
/172.18.101.103:39450[1](queued=0,recved=2,sent=2)
/172.18.101.101:35668[1](queued=0,recved=5,sent=5)
/172.18.101.102:39012[1](queued=0,recved=2,sent=2)
/172.18.101.102:39045[1](queued=0,recved=5,sent=5)
/172.18.101.101:35662[1](queued=0,recved=2,sent=2)
/172.18.101.103:39505[1](queued=0,recved=3,sent=3)

Latency min/avg/max: 0/0/8
Received: 81
Sent: 80
Connections: 25
Outstanding: 0
Zxid: 0x7100000012
Mode: follower
Node count: 2043

root@c102:~# echo stat | nc localhost 2181
Zookeeper version: 3.4.5--1, built on 06/10/2013 17:26 GMT
Clients:
/127.0.0.1:46443[1](queued=0,recved=1,sent=1)
/127.0.0.1:46468[1](queued=0,recved=1,sent=1)
/0:0:0:0:0:0:0:1:60333[0](queued=0,recved=1,sent=0)
/0:0:0:0:0:0:0:1:60172[1](queued=0,recved=1,sent=1)

Latency min/avg/max: 0/0/0
Received: 4
Sent: 3
Connections: 4
Outstanding: 0
Zxid: 0x7100000012
Mode: leader
Node count: 2043

root@c103:~# echo stat | nc localhost 2181
Zookeeper version: 3.4.5--1, built on 06/10/2013 17:26 GMT
Clients:
/0:0:0:0:0:0:0:1:53951[0](queued=0,recved=1,sent=0)

Latency min/avg/max: 0/0/0
Received: 1
Sent: 0
Connections: 1
Outstanding: 0
Zxid: 0x7100000012
Mode: follower
Node count: 2043
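
The before-and-after comparison in this example can be expressed as a small check. This is an illustrative sketch only; compare_zxids is a hypothetical helper name, and the two values are the zxids observed before and after the leader change above.

```python
from typing import Dict, Tuple, Union


def split_zxid(zxid_hex: str) -> Tuple[int, int]:
    """Return (epoch, counter): high-order and low-order 32 bits."""
    zxid = int(zxid_hex, 16)
    return zxid >> 32, zxid & 0xFFFFFFFF


def compare_zxids(before_hex: str, after_hex: str) -> Dict[str, Union[int, bool]]:
    """Summarize how a zxid changed between two observations."""
    before_epoch, _ = split_zxid(before_hex)
    after_epoch, _ = split_zxid(after_hex)
    return {
        "epoch_delta": after_epoch - before_epoch,
        "epoch_changed": after_epoch != before_epoch,
        # As integers, zxids order by epoch first, then by counter.
        "zxid_increased": int(after_hex, 16) > int(before_hex, 16),
    }


result = compare_zxids("0x700000009b", "0x7100000012")
print(result)
# {'epoch_delta': 1, 'epoch_changed': True, 'zxid_increased': True}
```

Note that the counter part is lower after the election (0x12 compared with 0x9b), yet the zxid as a whole is larger because the epoch increased.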

If the servers in the ensemble report more than one epoch, the ensemble may be experiencing a split-brain condition.
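
A check along these lines can be sketched as follows. The function name check_ensemble and the layout of its result are assumptions made for illustration. The first input uses the zxids from the example above; the second is a hypothetical scenario in which c103 still reports the old epoch. A short-lived difference between servers does not by itself prove a fault, so a result like the second one is a prompt to investigate rather than a diagnosis.

```python
from typing import Dict, List, Union


def check_ensemble(zxids: Dict[str, str]) -> Dict[str, Union[List[int], bool, List[str]]]:
    """Report the distinct epochs and any servers behind the newest zxid.

    `zxids` maps a server name to the Zxid string from its stat output.
    """
    values = {server: int(zxid, 16) for server, zxid in zxids.items()}
    epochs = sorted({value >> 32 for value in values.values()})
    newest = max(values.values())
    lagging = sorted(server for server, value in values.items() if value < newest)
    return {
        "epochs": epochs,
        "multiple_epochs": len(epochs) > 1,
        "lagging": lagging,
    }


# Values from the example above: one epoch, nobody behind.
healthy = check_ensemble(
    {"c101": "0x7100000012", "c102": "0x7100000012", "c103": "0x7100000012"}
)
print(healthy)  # {'epochs': [113], 'multiple_epochs': False, 'lagging': []}

# Hypothetical scenario: c103 still reports the old epoch 0x70.
suspect = check_ensemble(
    {"c101": "0x7100000012", "c102": "0x7100000012", "c103": "0x700000009b"}
)
print(suspect)  # {'epochs': [112, 113], 'multiple_epochs': True, 'lagging': ['c103']}
```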

For more on troubleshooting Zookeeper, refer to KB31144 - Contrail Getting Started - Administration, Configuration & Troubleshooting (JumpStation)