Health Monitoring on the EX-series Switches

Article ID: KB16450  KB Last Updated: 07 Jan 2010  Version: 1.0
Summary:
The health monitor periodically checks key indicators to determine the health of an EX-series switch.
Symptoms:

Solution:

Health Monitoring:

The health monitor periodically (over the time you specify in the interval field) checks the following key indicators of switch health:
* Percentage of file storage used
* Percentage of Routing Engine CPU used
* Percentage of Routing Engine memory used
* Percentage of memory used for each system process
* Percentage of CPU used by the forwarding process
* Percentage of memory used for temporary storage by the forwarding process
The OIDs for these key indicators are shown in the output of the following command:
user@juniper# run show snmp health-monitor alarms | no-more 
Alarm
Index  Variable description                             Value  State
32768  Health Monitor: root file system utilization
       jnxHrStoragePercentUsed.1                           55  rising threshold
32769  Health Monitor: /config file system utilization
       jnxHrStoragePercentUsed.2                            0  active
32770  Health Monitor: RE 0 CPU utilization
       jnxOperatingCPU.9.1.0.0                              8  rising threshold
32771  Health Monitor: RE 1 CPU utilization
       jnxOperatingCPU.9.2.0.0                              0  instance not available
32772  Health Monitor: RE 0 memory utilization
       jnxOperatingBuffer.9.1.0.0                          21  rising threshold
32773  Health Monitor: RE 1 memory utilization
       jnxOperatingBuffer.9.2.0.0                           0  instance not available
32774  Health Monitor: Max Kernel Memory Used (%)
       jnxBoxKernelMemoryUsedPercent.0                      3  active
32775  Health Monitor: junos daemon memory usage
                                                            0  instance not available

Use the following command to check the logs pertaining to health-monitoring:

user@juniper# run show snmp health-monitor logs | no-more
Event Index: 32768
Description: Health Monitor: root file system utilization crossed rising threshold 5 (value: 55), (variable: jnxHrStoragePercentUsed.1)
Time: 2009-12-05 20:19:57
Description: Health Monitor: RE 0 CPU utilization crossed rising threshold 5 (value: 8), (variable: jnxOperatingCPU.9.1.0.0)
Time: 2009-12-05 20:20:01
Description: Health Monitor: RE 0 memory utilization crossed rising threshold 5 (value: 21), (variable: jnxOperatingBuffer.9.1.0.0)
Time: 2009-12-05 20:20:05

How do I enable Health Monitoring on the EX Switch?


Health monitoring can be enabled through J-Web (Configure > Services > SNMP > Health Monitoring).


Select the check box to enable the health monitor and configure options. Clear the check box to disable the health monitor.
Note: If you select the Enable Health Monitoring check box and do not specify options, then SNMP health monitoring is enabled with default values.

Interval: Specifies the sampling frequency, in seconds, over which the key health indicators are sampled and compared with the rising and falling thresholds.

For example, if you configure the interval as 100 seconds, the values are checked every 100 seconds.

Enter an interval time, in seconds, from 1 through 2147483647.

The default value is 300 seconds (5 minutes).

Rising Threshold:
Specifies the value at which SNMP generates an event (trap and system log message) when the value of a sampled indicator is increasing.

For example, if the rising threshold is 90 (the default), SNMP generates an event when the value of any key indicator reaches or exceeds 90 percent.

Enter a value from 0 through 100. The default value is 90.

Falling Threshold:
Specifies the value at which SNMP generates an event (trap and system log message) when the value of a sampled indicator is decreasing.

For example, if the falling threshold is 80 (the default), SNMP generates an event when the value of any key indicator falls back to 80 percent or less.

Enter a value from 0 through 100. The default value is 80.
NOTE: The falling threshold value must be less than the rising threshold value.
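
The interval and threshold rules above can be sketched in Python. This is a simplified model for illustration only, not the Junos implementation: the names HealthMonitorConfig and ThresholdTracker are hypothetical, and the handling of the very first sample (an event is reported if it already sits at or beyond either threshold) is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HealthMonitorConfig:
    """Hypothetical container for the three J-Web fields and their defaults."""
    interval: int = 300          # seconds (5 minutes)
    rising_threshold: int = 90   # percent
    falling_threshold: int = 80  # percent

    def __post_init__(self) -> None:
        # Ranges taken from the field descriptions in this article.
        if not 1 <= self.interval <= 2147483647:
            raise ValueError("interval must be 1 through 2147483647 seconds")
        for name in ("rising_threshold", "falling_threshold"):
            if not 0 <= getattr(self, name) <= 100:
                raise ValueError(f"{name} must be 0 through 100")
        if self.falling_threshold >= self.rising_threshold:
            raise ValueError("falling threshold must be less than rising threshold")


class ThresholdTracker:
    """Reports one event per crossing, so a value that stays high does not repeat."""

    def __init__(self, config: HealthMonitorConfig) -> None:
        self.config = config
        self.last_event: Optional[str] = None

    def sample(self, value: int) -> Optional[str]:
        # Rising event: value reaches or exceeds the rising threshold.
        if value >= self.config.rising_threshold and self.last_event != "rising":
            self.last_event = "rising"
            return "rising"
        # Falling event: value falls back to the falling threshold or less.
        if value <= self.config.falling_threshold and self.last_event != "falling":
            self.last_event = "falling"
            return "falling"
        return None


cfg = HealthMonitorConfig(interval=20, rising_threshold=20, falling_threshold=10)
tracker = ThresholdTracker(cfg)
events = [tracker.sample(v) for v in [8, 20, 15, 9, 25]]
print(events)  # ['falling', 'rising', None, 'falling', 'rising']
```

The sample at 15 produces no event because it lies between the two thresholds. That gap is why the falling threshold must be less than the rising threshold.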
Configuration required to receive traps and syslog messages when a health-monitoring threshold is crossed:

SNMP TRAP CONFIG:
==================

user@juniper# edit snmp
[edit snmp]
user@juniper# show
trap-group Juniper {
    categories {
        chassis;
        services;
        rmon-alarm;
    }
    targets {
        10.130.235.6;
    }
}
health-monitor {
    interval 20;
    rising-threshold 20;
    falling-threshold 10;
}
NOTE: To send traps to a port other than the default, include the destination-port statement in the trap group. The default destination port is 162.
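
For scripted deployments, the SNMP hierarchy shown above can be flattened into set-style commands. The Python sketch below is illustrative only: the helper name health_monitor_set_commands is hypothetical, and the command strings are derived mechanically from the hierarchy in this article, so verify them on your Junos release before committing.

```python
from typing import List, Sequence


def health_monitor_set_commands(
    group: str,
    targets: Sequence[str],
    categories: Sequence[str],
    interval: int,
    rising: int,
    falling: int,
) -> List[str]:
    """Flatten the [edit snmp] hierarchy into set-style commands (hypothetical helper)."""
    commands = [f"set snmp trap-group {group} categories {c}" for c in categories]
    commands += [f"set snmp trap-group {group} targets {t}" for t in targets]
    commands += [
        f"set snmp health-monitor interval {interval}",
        f"set snmp health-monitor rising-threshold {rising}",
        f"set snmp health-monitor falling-threshold {falling}",
    ]
    return commands


# Values copied from the example configuration in this article.
for line in health_monitor_set_commands(
    group="Juniper",
    targets=["10.130.235.6"],
    categories=["chassis", "services", "rmon-alarm"],
    interval=20,
    rising=20,
    falling=10,
):
    print(line)
```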
SYSLOG CONFIG:
==============

user@juniper# edit system syslog
[edit system syslog]
user@juniper# show
user * {
    any emergency;
}
host 10.130.235.6 {
    any any;
}
file messages {
    any notice;
    authorization info;
}
file interactive-commands {
    interactive-commands any;
}

Example syslog and system event log entries generated when a Health Monitor key indicator crosses a threshold:



Event Logs:
============

root@EX1# run show log messages | no-more

Dec 4 20:45:00 EX1 clear-log[9343]: logfile cleared
Dec 4 20:45:52 EX1 mgd[9356]: UI_COMMIT: User 'root' requested 'commit' operation (comment: Modified via SNMP Quick Configuration)
Dec 4 20:45:52 EX1 vccp[9369]: vcdb_extract_db_from_file reading file /config/vchassis/vc.tlv.db.1 second_try = 0
Dec 4 20:45:57 EX1 fpc0 (mrvl_cos_delete_scheduler_map): Error -32
Dec 4 20:45:57 EX1 /kernel: GENCFG: op 8 (CoS) failed; err 1 (Unknown)
Dec 4 20:45:55 EX1 init: remote-operations (PID 9575) started
Dec 4 20:45:55 EX1 snmpd[8522]: SNMPD_HEALTH_MON_THRESH_CROSS: Health Monitor: root file system utilization crossed rising threshold 20 (value: 55), (variable: jnxHrStoragePercentUsed.1)
Dec 4 20:45:56 EX1 rmopd[9575]: SNMP_SUBAGENT_IPC_REG_ROWS: ns_subagent_register_mibs: registering 20 rows
Dec 4 20:46:03 EX1 snmpd[8522]: SNMPD_HEALTH_MON_THRESH_CROSS: Health Monitor: RE 0 memory utilization crossed rising threshold 20 (value: 20), (variable: jnxOperatingBuffer.9.1.0.0)
Dec 4 20:46:59 EX1 snmpd[8522]: SNMPD_HEALTH_MON_THRESH_CROSS: Health Monitor: RE 0 CPU utilization crossed falling threshold 10 (value: 8), (variable: jnxOperatingCPU.9.1.0.0)
Dec 4 20:48:18 EX1 mgd[9583]: UI_DBASE_LOGIN_EVENT: User 'root' entering configuration mode
Dec 4 21:05:19 EX1 snmpd[8522]: SNMPD_HEALTH_MON_THRESH_CROSS: Health Monitor: RE 0 CPU utilization crossed rising threshold 20 (value: 20), (variable: jnxOperatingCPU.9.1.0.0)
Dec 4 21:05:39 EX1 snmpd[8522]: SNMPD_HEALTH_MON_THRESH_CROSS: Health Monitor: RE 0 CPU utilization crossed falling threshold 10 (value: 9), (variable: jnxOperatingCPU.9.1.0.0)
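
On the syslog host, the SNMPD_HEALTH_MON_THRESH_CROSS entries can be picked out and split into fields. The Python sketch below assumes the message layout is exactly as in the sample log above; the function name parse_threshold_cross and the returned field names are hypothetical.

```python
import re
from typing import Dict, Optional, Union

# Pattern built from the message layout seen in the sample log lines.
THRESH_CROSS = re.compile(
    r"SNMPD_HEALTH_MON_THRESH_CROSS: Health Monitor: (?P<indicator>.+?) "
    r"crossed (?P<direction>rising|falling) threshold (?P<threshold>\d+) "
    r"\(value: (?P<value>\d+)\), \(variable: (?P<variable>[^)]+)\)"
)


def parse_threshold_cross(line: str) -> Optional[Dict[str, Union[str, int]]]:
    """Return the fields of a threshold-crossing message, or None for other lines."""
    match = THRESH_CROSS.search(line)
    if match is None:
        return None
    return {
        "indicator": match.group("indicator"),
        "direction": match.group("direction"),
        "threshold": int(match.group("threshold")),
        "value": int(match.group("value")),
        "variable": match.group("variable"),
    }


sample = (
    "Dec 4 20:45:55 EX1 snmpd[8522]: SNMPD_HEALTH_MON_THRESH_CROSS: "
    "Health Monitor: root file system utilization crossed rising threshold 20 "
    "(value: 55), (variable: jnxHrStoragePercentUsed.1)"
)
print(parse_threshold_cross(sample))
```

Lines that do not carry the SNMPD_HEALTH_MON_THRESH_CROSS tag, such as the UI_COMMIT entry, return None and can be skipped.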



