[Contrail] Workflow to troubleshoot a (UDF) User defined Function not working

Article ID: KB36771 KB Last Updated: 20 Apr 2021 Version: 1.0
Summary:

This article explains the workflow for troubleshooting a UDF on a Contrail Healthbot system.

Symptoms:

UDFs (user-defined functions) can be invoked from rules inside a playbook for a device group or a network group.

A rule in the playbook uses telemetry data received from the configured devices to detect an anomaly, that is, a deviation from the defined correct state. The rule then invokes a script that corrects the anomaly. The script is executed inside the UDF farm.

Example:
A playbook is in place to monitor the MTU on connected devices in a network group.
Once the Healthbot playbook detects an MTU change on one of the interfaces, it invokes a script written to correct the MTU.

Suppose the Healthbot playbook detects such a change. The UDF should then run and correct the MTU, but the correction does not happen.

Solution:

To troubleshoot, take the following steps:

  1. Navigate to Administration > Log Collection and enter the required details to download the alerta logs.

  2. Untar the downloaded file and open the "alerta.log" file.

  3. Search for the script name to find the log entry that shows whether the script was executed:

    2021-03-31 17:24:57,672 DEBG 'run_uda1' stdout output:
    script: change-mtu-config.py method: add_config args: {'device_group': 'vSRX-EveNG', 'device_id': '-', 'group_type': 'network', 'interface': 'ge-0/0/0', 'mtu': 1300, 'router': 'vSRXNG2', 'rule_name': 'compare-interfaces-mtu', 'topic_name': 'interfaces'}

    This confirms that Healthbot executed the script.

  4. Next, check the error entries that follow for a clue:

    2021-03-31 17:24:57,930 DEBG 'run_uda1' stdout output:
    HTTPConnectionPool(host='api_server', port=9000): Max retries exceeded with url: /api/v1/device/vSRXNG2/ (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f91d2860208>: Failed to establish a new connection: [Errno -2] Name or service not known',))
  5. Look in the script for the endpoint named in the error, "HTTPConnectionPool(host='api_server', port=9000)". The script contains the following call:

    r = requests.get('http://api_server:9000/api/v1/device/%s/' % router, verify=False)
  6. This system runs Healthbot version 3.2, in which the base API has moved to version 2. Refer to the device schema.

  7. According to the device schema documentation, the API must be queried on the config server. The UDF script therefore had to be corrected as follows:

    r = requests.get('http://config-server:9000/api/v2/config/device/%s/' % router, verify=False)
  8. After the script was corrected, the UDF performed the change and fixed the anomaly (the MTU in this case).
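
The log search in steps 3 and 4 can also be scripted. The following is a minimal sketch, assuming the alerta.log entries have the same shape as the excerpts above; the function names `find_udf_calls` and `find_connection_errors` are hypothetical helpers for illustration and are not part of Healthbot.

```python
import ast
import re

# Matches the invocation line format seen in the alerta.log excerpt above.
# Assumption: other UDF invocations are logged in the same shape.
CALL_RE = re.compile(
    r"script: (?P<script>\S+) method: (?P<method>\S+) args: (?P<args>\{.*\})"
)


def find_udf_calls(lines, script_name):
    """Return a list of (method, args_dict) for each logged call of script_name."""
    calls = []
    for line in lines:
        match = CALL_RE.search(line)
        if match and match.group("script") == script_name:
            # The args are logged as a Python dict literal, so parse them safely.
            args = ast.literal_eval(match.group("args"))
            calls.append((match.group("method"), args))
    return calls


def find_connection_errors(lines):
    """Return the log lines that report a failed HTTP connection."""
    markers = ("Max retries exceeded", "Failed to establish a new connection")
    return [line.strip() for line in lines if any(m in line for m in markers)]


# Example usage against the untarred log file:
# with open("alerta.log") as handle:
#     log_lines = handle.readlines()
# print(find_udf_calls(log_lines, "change-mtu-config.py"))
# print(find_connection_errors(log_lines))
```

An empty result from `find_udf_calls` would suggest that the rule never triggered the script, while a non-empty result from `find_connection_errors` points to the kind of endpoint problem described in steps 4 through 7.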



Another way to verify that the GET request in step 7 works is to run a curl request from inside the config-server container:

  1. Find the container ID of the config server:

    [root@localhost ~]# docker ps | grep config
    8a8a96309c2b   localhost:5000/healthbot_api_server          "python3 -m api_serv…"   3 weeks ago   Up 3 weeks             k8s_config-server_config-server-64b748bd58-gmmwg_healthbot_25813bfa-b43c-4d0c-a130-a8f4f5fe672c_0
  2. Run docker exec to log in to the container:

    [root@localhost ~]# docker exec -ti -e COLUMNS=$COLUMNS -e LINES=$LINES 8a8a96309c2b /bin/bash

  3. Run the GET request with curl:

    root@config-server-64b748bd58-gmmwg:/# curl -G http://config-server:9000/api/v2/config/device/
    [
      "vSRXNG",
      "vSRXNG2",
      "vSRXNG3"
    ]
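
The same check can be made from Python, which is closer to what the UDF script itself does. This is a sketch under stated assumptions: the host `config-server`, port 9000, and the `/api/v2/config/device/` path are taken from this article; `device_url` and `fetch_json` are hypothetical helper names; and the standard-library `urllib` is used here in place of the `requests` call in the original script so that the example has no third-party dependency. It only works where the name `config-server` resolves, for example inside the container.

```python
import json
import urllib.request
from urllib.parse import quote


def device_url(router=None, host="config-server", port=9000):
    """Build the v2 device URL; without a router, return the device list URL."""
    base = f"http://{host}:{port}/api/v2/config/device/"
    if router is None:
        return base
    # Percent-encode the router name so it is safe to use as a path segment.
    return f"{base}{quote(router, safe='')}/"


def fetch_json(url, timeout=5):
    """GET the URL and decode the JSON body; raises URLError on connection failure."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage (run where config-server resolves):
# print(fetch_json(device_url()))           # list of device names
# print(fetch_json(device_url("vSRXNG2")))  # configuration of one device
```

If the host name cannot be resolved, `fetch_json` raises `urllib.error.URLError`, which corresponds to the "Name or service not known" failure seen in the alerta log.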