[MX] "Host x failed to mount /var off HDD, emergency /var created" major alarm

Article ID: KB35827
KB Last Updated: 30 Mar 2021
Version: 2.0
Summary:

This article describes the actions to take when a "Host x failed to mount /var off HDD, emergency /var created" major alarm is triggered on MX Series devices.

 

Symptoms:

In certain cases, the Routing Engine (RE) fails to detect part of its storage, and the following alarm is raised on the device.

Note: Here, Host 1 refers to Routing Engine 1.

{MASTER}
root@router> show system alarms
2 alarms currently active
Alarm time               Class  Description
2020-04-04 03:25:44 CEST Major  Host 1 failed to mount /var off HDD, emergency /var created
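
When alarm output is collected from many devices, the check for this alarm can be scripted. The following is a minimal sketch, assuming the output of "show system alarms" has been saved as text in the layout shown above; the function name `find_var_mount_alarms` is hypothetical and is not part of Junos.

```python
import re

# Matches the alarm text quoted in this article and captures the host number.
# Assumption: one alarm per line, in the column layout shown above.
ALARM_RE = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} \S+)\s+"
    r"(?P<cls>Major|Minor)\s+"
    r"Host (?P<host>\d+) failed to mount /var off HDD, emergency /var created\s*$"
)


def find_var_mount_alarms(alarm_output):
    """Return one dict per matching alarm line in saved alarm output."""
    hits = []
    for line in alarm_output.splitlines():
        m = ALARM_RE.match(line.strip())
        if m:
            hits.append(
                {
                    "time": m.group("time"),
                    "class": m.group("cls"),
                    "host": int(m.group("host")),  # Host N is Routing Engine N
                }
            )
    return hits


sample = """\
2 alarms currently active
Alarm time               Class  Description
2020-04-04 03:25:44 CEST Major  Host 1 failed to mount /var off HDD, emergency /var created
"""

for alarm in find_var_mount_alarms(sample):
    print("Routing Engine", alarm["host"], "raised the alarm at", alarm["time"])
```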

 

Cause:
  1. The RE is unable to detect the disk.

  2. A transient issue exists with the storage system on the Routing Engine.

  3. The Routing Engine has failed.

 

Solution:

The solution depends on whether the RE detects the disk. Two cases are covered.

CASE 1: Disk is detected on the RE

Step 1: Check the hardware inventory list:

{MASTER}
root@router> show chassis hardware detail
Hardware inventory:
Item             Version  Part number  Serial number     Description
<snip>
Routing Engine 1 REV xx   7xxxxxxx2   9xxxxxxxxxx        RE-S-1800x4
  ad0    3998 MB  VTxxxxxxxxxxxxx2     4xxxxxxxxx        Compact Flash
  ad1   28496 MB  Stoxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx2xxx Disk 1     >>>> ad1 is present.
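
The inventory check can also be done on saved output. This is a rough sketch, assuming the indentation shown above (disks are listed on indented lines under their Routing Engine); `disks_for_routing_engine` is a hypothetical helper name.

```python
import re


def disks_for_routing_engine(inventory, slot):
    """List disk names (ad0, ad1, ...) printed under 'Routing Engine <slot>'."""
    disks = []
    in_re = False
    for line in inventory.splitlines():
        if not line.strip():
            continue
        if not line[0].isspace():
            # A non-indented line starts a new top-level inventory item.
            in_re = line.startswith(f"Routing Engine {slot} ")
            continue
        if in_re:
            m = re.match(r"\s+(ad\d+)\s+\d+ MB\b", line)
            if m:
                disks.append(m.group(1))
    return disks


sample = """\
Hardware inventory:
Item             Version  Part number  Serial number     Description
Routing Engine 1 REV xx   7xxxxxxx2   9xxxxxxxxxx        RE-S-1800x4
  ad0    3998 MB  VTxxxxxxxxxxxxx2     4xxxxxxxxx        Compact Flash
  ad1   28496 MB  Stoxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx2xxx Disk 1
"""

print(disks_for_routing_engine(sample, 1))  # ['ad0', 'ad1']
```

If ad1 is absent from the returned list, the situation matches the case in which the disk is not detected.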

Step 2: Check the log messages.

May 2 04:06:23 2020 /kernel: ad1: 28496MB<Stoxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx2xxx> at ata0-master UDMA33
May 2 04:06:23 2020 /kernel: ad0: 3998MB<VTxxxxxxxxxxxxx2 xxx 1xxxxx7> at ata0-slave UDMA33
…
May  2 04:07:33 2020 smartd[1892]: Device /dev/ad1, found and is SMART capable
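
To confirm from a saved log file that the kernel and smartd both saw the disk, the two message types above can be matched with regular expressions. This is a sketch under the assumption that the messages keep the format shown; the names `KERNEL_DISK_RE`, `SMART_RE`, and `disks_seen_in_log` are illustrative only.

```python
import re

# Kernel probe line, e.g. "/kernel: ad1: 28496MB<model> at ata0-master UDMA33"
KERNEL_DISK_RE = re.compile(r"/kernel: (ad\d+): (\d+)MB\s*<([^>]*)> at (\S+)")
# smartd line, e.g. "smartd[1892]: Device /dev/ad1, found and is SMART capable"
SMART_RE = re.compile(
    r"smartd\[\d+\]: Device /dev/(ad\d+), found and is SMART capable"
)


def disks_seen_in_log(log_text):
    """Map each disk name to what the log messages report about it."""
    seen = {}
    for line in log_text.splitlines():
        m = KERNEL_DISK_RE.search(line)
        if m:
            entry = seen.setdefault(m.group(1), {})
            entry["size_mb"] = int(m.group(2))
            entry["attached_at"] = m.group(4)
        m = SMART_RE.search(line)
        if m:
            seen.setdefault(m.group(1), {})["smart_capable"] = True
    return seen


sample = """\
May 2 04:06:23 2020 /kernel: ad1: 28496MB<Stoxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx2xxx> at ata0-master UDMA33
May 2 04:06:23 2020 /kernel: ad0: 3998MB<VTxxxxxxxxxxxxx2 xxx 1xxxxx7> at ata0-slave UDMA33
May  2 04:07:33 2020 smartd[1892]: Device /dev/ad1, found and is SMART capable
"""

for disk, info in sorted(disks_seen_in_log(sample).items()):
    print(disk, info)
```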

Step 3: Check the Boot List.

root@router> start shell                                       
% sysctl -a | grep bootdev
machdep.currbootdev: compact-flash
machdep.nextbootdev: usb
machdep.bootdevs: usb,compact-flash,disk1,disk2,lan
%
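
The boot list can be turned into a small dictionary for scripted checks. A minimal sketch, assuming the "sysctl -a | grep bootdev" output has been saved as text; `parse_bootdev` is a hypothetical helper.

```python
def parse_bootdev(sysctl_output):
    """Parse machdep.*bootdev* lines into a dict; bootdevs becomes a list."""
    info = {}
    for line in sysctl_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.startswith("machdep.") and "bootdev" in key:
            name = key[len("machdep."):]
            value = value.strip()
            info[name] = value.split(",") if name == "bootdevs" else value
    return info


sample = """\
machdep.currbootdev: compact-flash
machdep.nextbootdev: usb
machdep.bootdevs: usb,compact-flash,disk1,disk2,lan
%
"""

boot = parse_bootdev(sample)
print("current boot device:", boot["currbootdev"])
print("disk1 in boot list:", "disk1" in boot["bootdevs"])
```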

In this case, disk1 is present on the RE, yet the alarm indicates a problem with mounting /var. This could be a transient issue with the RE.

Reboot the RE to clear this alarm. Ensure that the RE you reboot is not the current primary: if the suspected faulty RE is the primary, switch the primary role to the other RE before rebooting. Connect to the console port as well, in case you lose management access.

If the reboot does not clear the alarm, there may be a hardware issue with the RE.

CASE 2: Disk is not detected on the RE

Step 1: Check the hardware inventory list:

{MASTER}
root@router> show chassis hardware detail
Hardware inventory:
Item             Version  Part number  Serial number     Description
<snip>
Routing Engine 1 REV xx   7xxxxxxx2   9xxxxxxxxxx        RE-S-1800x4
  ad0    3998 MB  VTxxxxxxxxxxxxx2     4xxxxxxxxx        Compact Flash          
CB 0             REV xx   7xxxxxxxx   Cxxxxxxxx          Enhanced MX SCB 2         >>>> ad1 is missing from the list.

Optional: You may choose to perform Step 2 and Step 3 from CASE 1 to be absolutely sure.

Step 2: Recover disk 1 by correcting any file system errors, then mount /var:

  1. Reboot the affected Routing Engine using console access.

  2. During boot, when the line below appears, immediately press the space bar to enter the command prompt.

"Hit [Enter] to boot immediately, or space bar for command prompt."

  3. Pressing the space bar brings you to the loader> prompt shown below, from which you can boot into single-user mode.

Type '?' for a list of commands, 'help' for more detailed help.

loader>

  4. Enter "boot -s" at the loader prompt.

  5. Press RETURN when the line below appears. You are taken to the single-user shell.

Enter full pathname of shell or 'recovery' for root password recovery or RETURN for /bin/sh:

  6. The line below confirms that you are in the single-user shell:

NOTE: to go to multi-user operation, exit the single-user shell (with ^D)
#

  7. Run the command "fsck -y /dev/da0s1f" a few times to correct any file system errors.

  8. Run the command "mount /var".

  9. Exit the single-user shell using ^D.

  10. Reboot the RE again to see if /var can be successfully mounted on /dev/ad0.

Example output of a healthy file system check

Enter full pathname of shell or 'recovery' for root password recovery or RETURN for /bin/sh:
NOTE: to go to multi-user operation, exit the single-user shell (with ^D)
#
# fsck -y /dev/da0s1f
** /dev/da0s1f
** Last Mounted on /var
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
1660 files, 1587823 used, 1245340 free (596 frags, 155593 blocks, 0.0% fragmentation)
# mount /var
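
The summary line of the fsck output can be sanity-checked when console sessions are logged. The sketch below parses that line and verifies that the free count equals the free fragments plus the free blocks multiplied by the fragments per block. The ratio of 8 fragments per block is an assumption (it is consistent with the numbers in the example above: 596 + 155593 × 8 = 1245340), and `parse_fsck_summary` is a hypothetical helper.

```python
import re

SUMMARY_RE = re.compile(
    r"(\d+) files, (\d+) used, (\d+) free "
    r"\((\d+) frags, (\d+) blocks, ([\d.]+)% fragmentation\)"
)


def parse_fsck_summary(fsck_output, frags_per_block=8):
    """Parse the fsck summary line; return None if no summary is found.

    frags_per_block=8 is an assumption, not taken from the article.
    """
    m = SUMMARY_RE.search(fsck_output)
    if m is None:
        return None
    files, used, free, frags, blocks = (int(g) for g in m.groups()[:5])
    return {
        "files": files,
        "used": used,
        "free": free,
        "free_frags": frags,
        "free_blocks": blocks,
        "fragmentation_pct": float(m.group(6)),
        # free is counted in fragments: loose frags plus whole free blocks
        "free_consistent": free == frags + blocks * frags_per_block,
        # fsck prints this marker when it changed the file system
        "modified": "FILE SYSTEM WAS MODIFIED" in fsck_output,
    }


sample = """\
** /dev/da0s1f
** Last Mounted on /var
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
1660 files, 1587823 used, 1245340 free (596 frags, 155593 blocks, 0.0% fragmentation)
"""

print(parse_fsck_summary(sample))
```

A run that reports no modification and a consistent free count resembles the healthy example above; a run flagged as modified is a reason to repeat fsck, as the procedure suggests.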

Step 3: Verify if the recovery process worked:

Run the "show system alarms" and "show system storage" commands in operational mode to verify that the issue is resolved.

If the same alarm persists after you manually repair the file system and mount /var, the RE may have a hardware failure that cannot be recovered.
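
The decision flow of this article can be summarized in a few lines. This is only an illustrative sketch of the steps described here, not a Juniper tool; the function name and its three boolean inputs are hypothetical and would come from the checks described above.

```python
def next_action(alarm_active, hdd_detected, recovery_attempted):
    """Suggest the next step from this article for one Routing Engine."""
    if not alarm_active:
        return "No action: the alarm is clear."
    if recovery_attempted:
        # Alarm persists after reboot or fsck: possible hardware failure.
        return "Suspect RE hardware failure: consult a Support representative."
    if hdd_detected:
        return "CASE 1: reboot the affected RE (not while it is primary)."
    return "CASE 2: boot single-user, run fsck, mount /var, then reboot."


print(next_action(alarm_active=True, hdd_detected=False, recovery_attempted=False))
```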

***WARNING: Before rebooting the RE or concluding that hardware failure is the cause of the above problem, make sure that you consult a Support representative.

 

Modification History:
2021-03-25: Updated the article terminology to align with Juniper's Inclusion & Diversity initiatives
