
[Contrail] "--connection-timeout" option in "db_manage" script to avoid timeout errors

[KB34818]


Summary:

The db_manage.py script is a common tool that customers run regularly to detect out-of-sync conditions in the Contrail database and to heal those inconsistencies.

However, if the Cassandra database is located on remote storage, underlay latency can delay database queries, causing db_manage.py to throw timeout errors and exit prematurely.

This article describes the script's built-in --connection-timeout option, which can help avoid such failures.

Symptoms:

Users report that db_manage.py fails with an error because Cassandra DB queries time out:

python /usr/lib/python2.7/dist-packages/vnc_cfg_api_server/db_manage.py --verbose --debug check

<snippet>

Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/vnc_cfg_api_server/db_manage.py", line 2876, in <module>
    main()
  File "/usr/lib/python2.7/dist-packages/vnc_cfg_api_server/db_manage.py", line 2857, in main
    return globals()['db_%s' % (verb)](args, api_args)
  File "/usr/lib/python2.7/dist-packages/vnc_cfg_api_server/db_manage.py", line 2772, in db_check
    db_checker.check_orphan_resources()
  File "/usr/lib/python2.7/dist-packages/vnc_cfg_api_server/db_manage.py", line 1130, in wrapper
    errors = func(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/vnc_cfg_api_server/db_manage.py", line 1648, in check_orphan_resources
    _, errors = self.audit_orphan_resources()
  File "/usr/lib/python2.7/dist-packages/vnc_cfg_api_server/db_manage.py", line 1113, in audit_orphan_resources
    obj_uuid_table.get(parent_uuid)
  File "/usr/lib/python2.7/dist-packages/pycassa/columnfamily.py", line 660, in get
    read_consistency_level or self.read_consistency_level)
  File "/usr/lib/python2.7/dist-packages/pycassa/pool.py", line 577, in execute
    return getattr(conn, f)(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/pycassa/pool.py", line 153, in new_f
    return new_f(self, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/pycassa/pool.py", line 153, in new_f
    return new_f(self, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/pycassa/pool.py", line 153, in new_f
    return new_f(self, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/pycassa/pool.py", line 153, in new_f
    return new_f(self, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/pycassa/pool.py", line 153, in new_f
    return new_f(self, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/pycassa/pool.py", line 148, in new_f
    (self._retry_count, exc.__class__.__name__, exc))
MaximumRetryException: Retried 6 times. Last failure was timeout: timed out

Cause:

Cassandra DB queries can time out because of underlay latency, DB timing issues, or other factors.
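
As a rough first check for underlay latency, the time needed to open a TCP connection to a Cassandra node can be measured from the host that runs db_manage.py. The following is a minimal sketch, not part of db_manage.py: the function name tcp_connect_time is hypothetical, and the host and port must be supplied for your own deployment. It measures only connection setup time, which is not the same as query latency, so treat the result as an indication only.

```python
import socket
import time


def tcp_connect_time(host, port, timeout_s=5.0):
    """Return the seconds taken to open a TCP connection, or None on failure.

    Hypothetical helper for illustration; host and port are supplied by the
    caller and are not taken from the article.
    """
    start = time.time()
    try:
        sock = socket.create_connection((host, port), timeout=timeout_s)
    except (socket.error, socket.timeout):
        # Connection refused, unreachable host, or connect timeout.
        return None
    sock.close()
    return time.time() - start
```

For example, tcp_connect_time("10.0.0.10", 9160) probes a placeholder address on Cassandra's default Thrift port; your deployment may listen on a different port. Connect times that are consistently high, or a None result, suggest that network latency or reachability is worth investigating alongside the timeout option described below.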

Solution:

To resolve the timeout issue, add the --connection-timeout <time_value> parameter (in seconds) when running db_manage.py so that the DB check can complete successfully.

For example, the following command makes db_manage.py wait up to 60 seconds before timing out:

python /usr/lib/python2.7/dist-packages/vnc_cfg_api_server/db_manage.py --verbose --debug --connection-timeout 60 check

Note: In some extreme cases, a wait time of up to 300 seconds may be necessary.
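
Because the required value can vary from 60 up to 300 seconds, one possible approach is to retry the check with increasing timeouts. The following is a minimal sketch of such a wrapper, not a Juniper-provided tool: the function names build_command and run_with_escalation and the intermediate value of 120 are assumptions chosen for illustration. It also assumes that a nonzero exit code means the run failed, which can include failures unrelated to timeouts.

```python
import subprocess

# Script path as shown in the article; adjust for your installation.
DB_MANAGE = "/usr/lib/python2.7/dist-packages/vnc_cfg_api_server/db_manage.py"


def build_command(timeout_s, verb="check"):
    """Return the db_manage.py command line with --connection-timeout set."""
    return [
        "python", DB_MANAGE,
        "--verbose", "--debug",
        "--connection-timeout", str(timeout_s),
        verb,
    ]


def run_with_escalation(timeouts=(60, 120, 300), runner=subprocess.call):
    """Try each timeout in turn; return the first one whose run exits with 0.

    `runner` is any callable that takes the argument list and returns an
    exit code (subprocess.call by default). Returns None if every run fails.
    """
    for timeout_s in timeouts:
        if runner(build_command(timeout_s)) == 0:
            return timeout_s
    return None
```

Calling run_with_escalation() runs the check with 60, then 120, then 300 seconds and stops at the first run that exits cleanly. If it returns None, review the script output, since the failure may not be timeout related.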
