[CSO] How to verify Zookeeper status

Article ID: KB34625    KB Last Updated: 25 Jun 2019    Version: 2.0

Summary:

This article explains how to verify ZooKeeper status in Contrail Service Orchestration (CSO).

Symptoms:

The component health check in CSO reports ZooKeeper as unhealthy (see the snippet below). How do I check the status of ZooKeeper?

./components_health.sh
INFO Health Check for Infrastructure Component ZooKeeper Started
INFO Attempt: 1 - Retrying Health Check for Component ZooKeeper
INFO Attempt: 2 - Retrying Health Check for Component ZooKeeper
ERROR The Infra Component : ZooKeeper is Unhealthy
Cause:

The command "service zookeeper status" may report ZooKeeper as running while the CSO component health check script reports it as unhealthy, because the script uses zkServer.sh to fetch the status.

Solution:
  1. The ZooKeeper process runs on the infra VMs. Check its status on an infra VM with the command: /usr/share/zookeeper/bin/zkServer.sh status

    # /usr/share/zookeeper/bin/zkServer.sh status
    JMX enabled by default
    Using config: /etc/zookeeper/conf/zoo.cfg
    Mode: leader

    Note: The mode in the output above can be standalone (on an all-in-one server) or leader or follower (in an HA setup).
    In an HA setup, ZooKeeper runs as leader on one infra node and as follower on the remaining nodes.
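
For scripted checks, the Mode line can be extracted from the status output. The following is a minimal sketch and not part of CSO: the function name zk_mode is hypothetical, and it assumes the "Mode: <value>" output format shown above.

```shell
#!/usr/bin/env bash
# zk_mode: read `zkServer.sh status` output on stdin and print the mode
# (for example leader, follower, or standalone).
# Returns non-zero if no "Mode:" line is found.
# Hypothetical helper; assumes the output format shown in step 1.
zk_mode() {
    awk -F': *' '/^[[:space:]]*Mode:/ { print $2; found = 1 } END { exit !found }'
}

# Example usage on an infra VM (script path taken from step 1):
#   /usr/share/zookeeper/bin/zkServer.sh status 2>&1 | zk_mode
```

A non-zero exit status means the status output contained no Mode line, which is worth treating as unhealthy in a wrapper script.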

  2. To start the ZooKeeper service, use the command: /usr/share/zookeeper/bin/zkServer.sh start

  3. To check whether the process is running: ps -ef | grep zookeeper

  4. Error logs can be checked on the infra nodes: /var/log/zookeeper/zookeeper.log

    Log messages for a successful start of a follower ZooKeeper:

    2019-06-10 11:37:23,495 - INFO  [QuorumPeer[myid=209193126]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:os.arch=amd64
    2019-06-10 11:37:23,495 - INFO  [QuorumPeer[myid=209193126]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:os.version=3.13.0-141-generic
    2019-06-10 11:37:23,495 - INFO  [QuorumPeer[myid=209193126]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:user.name=zookeeper
    2019-06-10 11:37:23,496 - INFO  [QuorumPeer[myid=209193126]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:user.home=/var/lib/zookeeper
    2019-06-10 11:37:23,496 - INFO  [QuorumPeer[myid=209193126]/0:0:0:0:0:0:0:0:2181:Environment@100] - Server environment:user.dir=/
    2019-06-10 11:37:23,497 - INFO  [QuorumPeer[myid=209193126]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@162] - Created server with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 40000 datadir /mnt/data/zookeeper/version-2 snapdir /mnt/data/zookeeper/version-2
    2019-06-10 11:37:23,499 - INFO  [QuorumPeer[myid=209193126]/0:0:0:0:0:0:0:0:2181:Follower@63] - FOLLOWING - LEADER ELECTION TOOK - 45
    2019-06-10 11:37:23,506 - INFO  [QuorumPeer[myid=209193126]/0:0:0:0:0:0:0:0:2181:Learner@322] - Getting a diff from the leader 0x15000cb4f6
    2019-06-10 11:37:23,511 - WARN  [QuorumPeer[myid=209193126]/0:0:0:0:0:0:0:0:2181:Learner@373] - Got zxid 0x15000cb4f5 expected 0x1
    2019-06-10 11:37:23,514 - INFO  [QuorumPeer[myid=209193126]/0:0:0:0:0:0:0:0:2181:FileTxnSnapLog@240] - Snapshotting: 0x15000cb4f6 to /mnt/data/zookeeper/version-2/snapshot.15000cb4f6
    2019-06-10 11:40:00,006 - WARN  [QuorumPeer[myid=209193126]/0:0:0:0:0:0:0:0:2181:Follower@118] - Got zxid 0x15000cb4f7 expected 0x1
    2019-06-10 11:40:00,007 - INFO  [SyncThread:209193126:FileTxnLog@199] - Creating new log file: log.15000cb4f7
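
To confirm quickly that a node joined the quorum, the election message can be pulled out of the log. This is a minimal sketch: the function name zk_last_election is hypothetical, the default path is the log file named in step 4, and the search pattern is taken from the follower message shown above.

```shell
#!/usr/bin/env bash
# zk_last_election: print the most recent "LEADER ELECTION TOOK" line
# from a ZooKeeper log file. Prints nothing if no such line exists.
# Hypothetical helper; the default path is the log file named in step 4.
zk_last_election() {
    local log="${1:-/var/log/zookeeper/zookeeper.log}"
    grep 'LEADER ELECTION TOOK' "$log" | tail -n 1
}

# Example usage:
#   zk_last_election
#   zk_last_election /path/to/another/zookeeper.log
```
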
  5. Check the free memory: free -mh

    ZooKeeper, Cassandra, ArangoDB, and Elasticsearch share the same JDM, so if one process uses more memory, the other processes may run out of memory.
    Such issues can be resolved by restarting the process that is consuming the most memory.

    Out-of-memory message:

    + /usr/bin/java -Dzookeeper.log.dir=/var/log/zookeeper -Dzookeeper.root.logger=INFO,ROLLINGFILE -cp /etc/zookeeper/conf:/usr/share/java/jline.jar:/usr/share/java/log4j-1.2.jar:/usr/share/java/xercesImpl.jar:/usr/share/java/xmlParserAPIs.jar:/usr/share/java/netty.jar:/usr/share/java/slf4j-api.jar:/usr/share/java/slf4j-log4j12.jar:/usr/share/java/zookeeper.jar org.apache.zookeeper.client.FourLetterWordMain localhost 2181
    OpenJDK 64-Bit Server VM warning: INFO: os::commit_memory(0x00000003d1400000, 704643072, 0) failed; error='Cannot allocate memory' (errno=12)
  6. Check which process is using the most memory with the command: top (press Shift+M to sort by memory usage)
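
The memory check in step 5 can also be scripted. The sketch below parses the plain "free -m" output (without -h, so the values are whole MiB); the function names mem_free_mb and mem_low and the threshold value are hypothetical. Note that the "free" column excludes buffers and cache, so it understates the memory the kernel could reclaim.

```shell
#!/usr/bin/env bash
# mem_free_mb: read `free -m` output on stdin and print the "free" column
# (in MiB) of the "Mem:" row. Returns non-zero if no "Mem:" row is found.
mem_free_mb() {
    awk '$1 == "Mem:" { print $4; found = 1 } END { exit !found }'
}

# mem_low: read `free -m` output on stdin; exit 0 if free memory is below
# the given threshold in MiB, 1 if not, 2 if the input could not be parsed.
mem_low() {
    local threshold="$1" free_mb
    free_mb="$(mem_free_mb)" || return 2
    [ "$free_mb" -lt "$threshold" ]
}

# Example usage (the 256 MiB threshold is an arbitrary illustration):
#   free -m | mem_low 256 && echo "free memory is below 256 MiB"
```
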

  7. Run the following command against the leader node and check the values of zk_followers and zk_synced_followers:

    # echo mntr | nc <leader ip> 2181
    zk_version    3.4.5--1, built on 06/10/2013 17:26 GMT
    zk_avg_latency    0
    zk_max_latency    245
    zk_min_latency    0
    zk_packets_received    56220060
    zk_packets_sent    56220162
    zk_num_alive_connections    51
    zk_outstanding_requests    0
    zk_server_state    leader (standalone in a non-HA setup)
    zk_znode_count    467
    zk_watch_count    6
    zk_ephemerals_count    5
    zk_approximate_data_size    44299
    zk_open_file_descriptor_count    97
    zk_max_file_descriptor_count    8192
    zk_followers    2
    zk_synced_followers    2
    zk_pending_syncs    0

    The output above was captured from an HA setup with one leader and two followers.

    zk_followers and zk_synced_followers both showing 2 in the output above indicates that the leader node has two followers and that both are in sync (healthy status).

    In a standalone setup, zk_server_state shows standalone and the follower counters are not displayed.
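
The comparison of zk_followers and zk_synced_followers can be automated. The following is a minimal sketch that reads mntr output on stdin; the function name zk_followers_synced and its exit codes are hypothetical choices, not part of ZooKeeper or CSO.

```shell
#!/usr/bin/env bash
# zk_followers_synced: read `mntr` output on stdin and compare zk_followers
# with zk_synced_followers.
# Exit 0: counts match. Exit 1: counts differ.
# Exit 2: one or both keys are missing (for example on a standalone server).
zk_followers_synced() {
    awk '
        $1 == "zk_followers"        { f = $2; have_f = 1 }
        $1 == "zk_synced_followers" { s = $2; have_s = 1 }
        END {
            if (!have_f || !have_s) exit 2
            printf "followers=%s synced=%s\n", f, s
            exit (f == s) ? 0 : 1
        }'
}

# Example usage (replace <leader ip> with the leader node address):
#   echo mntr | nc <leader ip> 2181 | zk_followers_synced
```

For the sample output above, the function prints "followers=2 synced=2" and exits with status 0.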
