[SRX] Example - IP Monitoring with route fail-over configuration and behavior

  [KB25052]


Summary:

This article explains how to set up route fail-over by using IP Monitoring, and describes the resulting fail-over behavior.


Symptoms:
Basic configuration:

The test window is defined by how many probes are sent, how often they are sent, and how long the device pauses between test windows. In this example, each test sends three probes 15 seconds apart, and test-interval sets a 10-second pause between test windows.

set services rpm probe example test test-name probe-count 3
set services rpm probe example test test-name probe-interval 15
set services rpm probe example test test-name test-interval 10

The fail-over requirement is set by configuring the successive-loss and/or total-loss thresholds. The conditions must be met within a single test window, so with this configuration all three probes in a test window must be lost to cause a fail-over.

set services rpm probe example test test-name thresholds successive-loss 3
set services rpm probe example test test-name thresholds total-loss 3
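
The threshold check described above can be sketched in Python. This is a simplified model for reasoning about the configuration, not the Junos implementation: the function name and its arguments are hypothetical, and it assumes that reaching either threshold within one test window marks the test as failed.

```python
def window_triggers_failover(results, successive_loss=3, total_loss=3):
    """Evaluate one test window against the loss thresholds.

    results: list of booleans, one per probe in the window
             (True = probe answered, False = probe lost).
    Assumption: reaching either threshold fails the test.
    """
    lost_total = 0    # all lost probes in the window
    lost_run = 0      # current run of consecutive losses
    longest_run = 0   # longest run of consecutive losses seen so far
    for answered in results:
        if answered:
            lost_run = 0
        else:
            lost_total += 1
            lost_run += 1
            longest_run = max(longest_run, lost_run)
    return longest_run >= successive_loss or lost_total >= total_loss


# With probe-count 3 and both thresholds set to 3, only a window in
# which every probe is lost triggers fail-over.
print(window_triggers_failover([False, False, False]))  # True
print(window_triggers_failover([True, False, False]))   # False
```

Because the loss counts start from zero in each call, losses from one test window never carry over into the next, which matches the requirement that the conditions be met inside a single window.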

To complete the RPM setup, specify where the probes are sent and which interface to use. This example also includes the optional next-hop statement, which is required only when the probe must use a different next hop than the one in the routing table.

set services rpm probe example test test-name target address 10.0.0.2
set services rpm probe example test test-name destination-interface fe-0/0/0.0
set services rpm probe example test test-name next-hop 10.0.0.2

The final step is to configure an IP monitoring policy that matches the RPM probe configured above (example); upon failure, the policy switches the next hop of the static route 50.0.0.0/8 to 20.0.0.2.

set services ip-monitoring policy test match rpm-probe example
set services ip-monitoring policy test then preferred-route route 50.0.0.0/8 next-hop 20.0.0.2

Cause:

Solution:

The examples below show the basic behavior with 15 seconds between probes and 10 seconds between tests. A test is complete as soon as the third probe response is received.


If the first or second probe is lost, but the target IP address becomes reachable before the final probe is sent, the test completes when the third probe response is received.


To cause a fail-over with the configuration above, all three probes must be lost during a single test window. When probe 3 gets no response, the device waits an entire probe-interval before it considers the probe unsuccessful.


The bright red bar indicates loss of connection right before the first probe is sent, so all three probes of Test 1 are unsuccessful. As a result, Test 1 runs for 45 seconds before fail-over is triggered.

The darker red bar indicates loss of connection after the first probe succeeds, so Test 1 does not meet the fail-over requirements. Test 2 then triggers fail-over because all three of its probes are unsuccessful, for a total of approximately 100 seconds before fail-over.

When working out timings for a network, refer to these scenarios and consider what behavior you want to see in your network.
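
The 45-second and 100-second figures can be reproduced with a short calculation. The sketch below is a simplified timing model with hypothetical function names; it assumes that every lost probe consumes one full probe-interval and that the round-trip time of a successful probe is negligible.

```python
def failed_window_seconds(probe_count, probe_interval):
    """Duration of a test window in which every probe is lost.

    Each lost probe is only declared unsuccessful after a full
    probe-interval has elapsed.
    """
    return probe_count * probe_interval


def second_window_failover_seconds(probe_count, probe_interval, test_interval):
    """Time to fail-over when the link drops just after the first probe
    of Test 1 succeeds: Test 1 runs to completion without failing, the
    device pauses for test-interval, then Test 2 loses every probe.
    """
    test1 = probe_count * probe_interval  # window still spans three probe slots
    test2 = failed_window_seconds(probe_count, probe_interval)
    return test1 + test_interval + test2


# Values from this article: probe-count 3, probe-interval 15, test-interval 10
print(failed_window_seconds(3, 15))               # 45
print(second_window_failover_seconds(3, 15, 10))  # 100
```

Substituting other probe-count, probe-interval, and test-interval values gives a rough estimate of how long traffic may follow the original route before the preferred route is applied.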

Use the following command to verify the state of the route under the IP monitoring policy:

> show services ip-monitoring status
Policy - test
RPM Probes:
   Probe name             Address          Status
   ---------------------- ---------------- ---------
   example                10.0.0.2         PASS
Route-Action:
   route-instance    route             next-hop         State
   ----------------- ----------------- ---------------- -------------
   inet.0            50.0.0.0          20.0.0.2         NOT-APPLIED

After the probe test fails, the output shows that the new next hop is in place:

> show services ip-monitoring status
Policy - test
RPM Probes:
  Probe name             Address          Status
  ---------------------- ---------------- ---------
  example                10.0.0.2         FAIL
Route-Action:
  route-instance    route             next-hop         State
  ----------------- ----------------- ---------------- -------------
  inet.0            50.0.0.0          20.0.0.2         APPLIED

Use the following command to see which probes were lost and to determine when the connection was lost:

> show services rpm history-results
Owner, Test            Probe received             Round trip time
example, test-name     Wed Jun 20 14:30:04 2012     1753 usec
example, test-name     Wed Jun 20 14:30:06 2012     7318 usec
example, test-name     Wed Jun 20 14:30:07 2012     1891 usec
example, test-name     Wed Jun 20 14:30:09 2012     7575 usec
example, test-name     Wed Jun 20 14:30:11 2012     1695 usec
example, test-name     Wed Jun 20 14:30:13 2012   Internal error
example, test-name     Wed Jun 20 14:30:15 2012   Internal error
example, test-name     Wed Jun 20 14:30:18 2012   Internal error
example, test-name     Wed Jun 20 14:30:20 2012   Internal error
example, test-name     Wed Jun 20 14:30:22 2012   Internal error

Note: The above test was performed by disconnecting the cable; if the failure is not a physical loss of connection, the message will not be 'Internal error'.
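
When the history is long, a saved copy of the output can be scanned for the first lost probe. The following is a minimal sketch that assumes the line layout shown above and an English locale for the timestamps; the function names are hypothetical. It treats any result that is not a round-trip time in usec as a lost probe, since the failure message varies with the cause.

```python
import re
from datetime import datetime

# Two lines copied from the sample output above.
SAMPLE = """\
example, test-name     Wed Jun 20 14:30:11 2012     1695 usec
example, test-name     Wed Jun 20 14:30:13 2012   Internal error
"""

# owner, test name, timestamp, then either "<n> usec" or a failure message
LINE = re.compile(
    r"^(?P<owner>[^,]+),\s+(?P<test>\S+)\s+"
    r"(?P<ts>\w{3} \w{3} +\d+ \d\d:\d\d:\d\d \d{4})\s+"
    r"(?P<result>.+?)\s*$"
)


def parse_history(text):
    """Return a list of (timestamp, rtt_usec or None) tuples."""
    rows = []
    for line in text.splitlines():
        match = LINE.match(line.strip())
        if not match:
            continue  # skip the header, prompt, and blank lines
        ts = datetime.strptime(match["ts"], "%a %b %d %H:%M:%S %Y")
        rtt = re.fullmatch(r"(\d+) usec", match["result"])
        rows.append((ts, int(rtt.group(1)) if rtt else None))
    return rows


def first_loss(rows):
    """Timestamp of the first probe without a round-trip time, or None."""
    for ts, rtt in rows:
        if rtt is None:
            return ts
    return None


print(first_loss(parse_history(SAMPLE)))  # 2012-06-20 14:30:13
```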
