This article describes one cause of the vRouter agent becoming unresponsive in Contrail release R1909 and identifies the release that contains the fix.
The following was seen in /var/log/contrail/contrail-vrouter-agent.log a few minutes before the issue was triggered:
2020-05-30 Mon 04:46:26:144.084 UTC [Thread 139809795421952, Pid 97274]: ensureCanWrite: Realloc size 0 FAILED
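To check whether a node has logged the same message, the agent log can be scanned for it. The following is a minimal Python sketch, not a supported tool: the helper names `find_signature_lines` and `scan_log` are hypothetical, and the default path is the one given in this article.

```python
from pathlib import Path
from typing import List

# Message text copied from the log line quoted in this article.
SIGNATURE = "ensureCanWrite: Realloc size 0 FAILED"


def find_signature_lines(log_text: str, signature: str = SIGNATURE) -> List[str]:
    """Return every log line that contains the signature message."""
    return [line for line in log_text.splitlines() if signature in line]


def scan_log(path: str = "/var/log/contrail/contrail-vrouter-agent.log") -> List[str]:
    """Read the agent log and return the lines carrying the signature."""
    # errors="replace" keeps the scan going if the log holds undecodable bytes.
    text = Path(path).read_text(errors="replace")
    return find_signature_lines(text)
```

Calling `scan_log()` on the compute node returns the matching lines with their timestamps, which can then be compared against the time the agent stopped responding.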
Backtraces of the agent threads, captured with gdb, show them waiting on a tcmalloc spin lock:
(gdb) bt
#0 0x00007f4a93129b8b in base::internal::SpinLockDelay(int volatile*, int, int) () from /lib64/libtcmalloc.so.4
#1 0x00007f4a93129a8f in SpinLock::SlowLock() () from /lib64/libtcmalloc.so.4
#2 0x00007f4a9311a248 in tcmalloc::CentralFreeList::Populate() ()
from /lib64/libtcmalloc.so.4
#3 0x00007f4a9311a338 in tcmalloc::CentralFreeList::FetchFromOneSpansSafe(int, void**, void**) () from /lib64/libtcmalloc.so.4
#4 0x00007f4a9311a3d0 in tcmalloc::CentralFreeList::RemoveRange(void**, void**, int) () from /lib64/libtcmalloc.so.4
#5 0x00007f4a9311d2a7 in tcmalloc::ThreadCache::FetchFromCentralCache(unsigned int, int) () from /lib64/libtcmalloc.so.4
#6 0x00007f4a90b39a45 in std::locale::_Impl::_Impl(std::locale::_Impl const&, unsigned long) () from /lib64/libstdc++.so.6
#7 0x0000000000a465a1 in std::locale::locale<boost::date_time::time_facet<boost::posix_time::ptime, char, std::ostreambuf_iterator<char, std::char_traits<char> > > > (this=0x7f4a934e2de8 <tcmalloc::Static::pageheap_lock_>, __other=..., __f=0x2) at /usr/include/c++/4.8.2/bits/locale_classes.tcc:47
#8 0x000000000f5a2a58 in ?? ()
#9 0x00007f4a804bd8c0 in ?? ()
#10 0x00007f4a804bd918 in ?? ()
#11 0x00007f4a804bd8c8 in ?? ()
#12 0x0000000000000000 in ?? ()
Thread 3 (Thread 0x7f4a8a300700 (LWP 451164)):
#0 0x00007f4a93129b8b in base::internal::SpinLockDelay(int volatile*, int, int) () from /lib64/libtcmalloc.so.4
#1 0x00007f4a93129a8f in SpinLock::SlowLock() () from /lib64/libtcmalloc.so.4
#2 0x00007f4a93119c58 in tcmalloc::CentralFreeList::ReleaseToSpans(void*) ()
from /lib64/libtcmalloc.so.4
#3 0x00007f4a93119cbe in tcmalloc::CentralFreeList::ReleaseListToSpans(void*)
() from /lib64/libtcmalloc.so.4
#4 0x00007f4a93119df4 in tcmalloc::CentralFreeList::ShrinkCache(int, bool) ()
from /lib64/libtcmalloc.so.4
#5 0x00007f4a93119f1d in tcmalloc::CentralFreeList::MakeCacheSpace() ()
from /lib64/libtcmalloc.so.4
#6 0x00007f4a93119f98 in tcmalloc::CentralFreeList::InsertRange(void*, void*, int) () from /lib64/libtcmalloc.so.4
#7 0x00007f4a9311d408 in tcmalloc::ThreadCache::ReleaseToCentralCache(tcmalloc::ThreadCache::FreeList*, unsigned int, int) () from /lib64/libtcmalloc.so.4
#8 0x00007f4a9311d798 in tcmalloc::ThreadCache::Scavenge() ()
from /lib64/libtcmalloc.so.4
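Both backtraces share one signature: the top frames sit in `SpinLock::SlowLock` inside libtcmalloc. When a saved gdb dump contains many threads, that signature can be searched for mechanically. The sketch below assumes the dump has the same layout as the output above (optional `Thread N (...)` headers followed by `#k` frames); the function names are hypothetical. A match shows only that a thread is waiting on the lock, not that a deadlock has occurred.

```python
import re
from typing import Dict, List

THREAD_HEADER = re.compile(r"^Thread (\d+) \(")
FRAME = re.compile(r"^#(\d+)\s+(.*)")


def frames_by_thread(bt_text: str) -> Dict[str, List[str]]:
    """Group backtrace frames under the thread that owns them."""
    threads: Dict[str, List[str]] = {}
    current = "current"  # frames from a plain `bt` have no thread header
    for raw in bt_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("(gdb)"):
            continue
        header = THREAD_HEADER.match(line)
        if header:
            current = header.group(1)
            threads.setdefault(current, [])
            continue
        frame = FRAME.match(line)
        if frame:
            threads.setdefault(current, []).append(frame.group(2))
        elif threads.get(current):
            # gdb wraps long frames; glue the continuation onto the last frame.
            threads[current][-1] += " " + line
    return threads


def threads_waiting_on_spinlock(
    bt_text: str, symbol: str = "SpinLock::SlowLock", depth: int = 3
) -> List[str]:
    """Return the threads whose top `depth` frames mention `symbol`."""
    return [
        tid
        for tid, frames in frames_by_thread(bt_text).items()
        if any(symbol in f for f in frames[:depth])
    ]
```

Applied to the output above, the function would report the unlabeled first backtrace (as `current`) and thread `3`.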
The cause is a deadlock in the tcmalloc code. Contrail release 1909 uses gperftools 2.1, in which tcmalloc's handling of large allocations performs poorly and becomes a bottleneck in scale scenarios.
Upgrade to the gperftools 2.7 RPM package. The fix is available in R1912L2 and later releases: https://review.opencontrail.org/c/Juniper/contrail-controller/+/59262
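To judge whether a given gperftools version predates the fixed one, the version strings can be compared numerically. This is a small sketch under stated assumptions: the helper names are hypothetical, and treating any version at or above 2.7 as fixed is an assumption, since this article names only 2.7.

```python
import re
from typing import Tuple

# Version named in this article as the one to upgrade to.
FIXED_GPERFTOOLS = (2, 7)


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a string such as '2.1' or '2.6.1' into a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if not match:
        raise ValueError(f"not a version string: {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def needs_upgrade(installed: str, fixed: Tuple[int, ...] = FIXED_GPERFTOOLS) -> bool:
    """True when the installed version is older than the fixed one."""
    return parse_version(installed) < fixed
```

The installed version string can be obtained, for example, by asking rpm which package owns the library path seen in the backtraces (`rpm -qf /lib64/libtcmalloc.so.4`), run in the environment where the agent process runs. This assumes the library was installed from an RPM.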
If you are running release 1909 and encounter this issue, contact your JTAC representative.