[SRX] File system recovery method using single user mode

Article ID: KB29069   KB Last Updated: 27 Mar 2020   Version: 3.0
Summary:

The NAND (NOT-AND) media check utility checks for bad blocks in the NAND flash memory that is used as the internal boot media and, if it finds any, attempts to recover them. On very rare occasions, the file system cannot be recovered even after the utility has recovered the bad blocks. This article describes how to recover from file system corruption by using single-user mode.

Symptoms:

Example 1:
fsck reports inconsistent file system information on a mounted partition (bo0s3f in this example).

root@SRX% fsck /dev/bo0s3f
** /dev/bo0s3f (NO WRITE)
** Last Mounted on /cf/var
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
FREE BLK COUNT(S) WRONG IN SUPERBLK
SALVAGE? no


SUMMARY INFORMATION BAD
SALVAGE? no

BLK(S) MISSING IN BIT MAPS
SALVAGE? no


529 files, 77049 used, 98275 free (539 frags, 12217 blocks, 0.3% fragmentation)
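
When fsck output like the above is captured from the console, it can be screened off-box. The following is a minimal Python sketch and is not part of the Juniper procedure: the function names are hypothetical, and the list of error messages is an assumption taken only from this example, so it is not exhaustive.

```python
from typing import List

# Error messages seen in Example 1. This list is an assumption based on
# that single example; fsck can print other messages as well.
ERROR_MARKERS = (
    "FREE BLK COUNT(S) WRONG IN SUPERBLK",
    "SUMMARY INFORMATION BAD",
    "BLK(S) MISSING IN BIT MAPS",
)


def fsck_findings(output: str) -> List[str]:
    """Return the known error messages that appear in captured fsck output."""
    return [marker for marker in ERROR_MARKERS if marker in output]


def is_no_write_run(output: str) -> bool:
    """True when fsck reported the device as (NO WRITE), so nothing was repaired."""
    return "(NO WRITE)" in output


if __name__ == "__main__":
    captured = (
        "** /dev/bo0s3f (NO WRITE)\n"
        "** Phase 5 - Check Cyl groups\n"
        "FREE BLK COUNT(S) WRONG IN SUPERBLK\n"
        "SALVAGE? no\n"
    )
    print(fsck_findings(captured), is_no_write_run(captured))
```

If findings are reported from a (NO WRITE) run, the partition was mounted and still needs to be repaired as described in the Solution section.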


Example 2:
The following is another case in which single-user mode is helpful.

#############################################
# During reboot
# "Bad read" and "Recovering block" messages are observed
# Usually, this recovery is sufficient for the device to work normally
#############################################

Trying to create bootdev, rootpartition da0s1a
Trying to mount root from ufs:/dev/da0s1a
Attaching /cf/packages/junos via /dev/mdctl...
Mounted junos package on /dev/md0...

Media check on da0
Zone 11 Block 0092 Addr 2c5c00 : Bad read
Recovering block
Zone 11 Block 0315 Addr 2d3b00 : Bad read
Recovering block
Zone 11 Block 0503 Addr 2df700 : Bad read
Recovering block
Zone 11 Block 0623 Addr 2e6f00 : Bad read
Recovering block
Zone 11 Block 0707 Addr 2ec300 : Bad read
Recovering block
Zone 11 Block 0889 Addr 2f7900 : Bad read
Recovering block
Zone 11 Block 1012 Addr 2ff400 : Bad read
Recovering block
Automatic reboot in progress...
** /dev/da0s1a
FILE SYSTEM CLEAN; SKIPPING CHECKS
clean, 200363 free (35 frags, 25041 blocks, 0.0% fragmentation)
Verified junos signed by PackageProduction_10_4_0
Verified jboot signed by PackageProduction_10_4_0
Verified junos-10.4R5.5-domestic signed by PackageProduction_10_4_0
** /dev/bo0s3e
FILE SYSTEM CLEAN; SKIPPING CHECKS
clean, 23543 free (15 frags, 2941 blocks, 0.1% fragmentation)
** /dev/bo0s3f
FILE SYSTEM CLEAN; SKIPPING CHECKS
clean, 247321 free (737 frags, 30823 blocks, 0.2% fragmentation)
Loading configuration ...
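
The media check lines in a boot log like the one above can be summarized off-box. This is a rough Python sketch under stated assumptions: the line format is inferred only from this example, the function name is hypothetical, and a "Recovering block" line directly after a "Bad read" line is treated as the recovery attempt for that block.

```python
import re

# Format inferred from the example: "Zone 11 Block 0092 Addr 2c5c00 : Bad read"
BAD_READ_RE = re.compile(
    r"Zone (\d+) Block (\d+) Addr ([0-9a-fA-F]+) : Bad read"
)


def parse_media_check(console_log):
    """Return one dict per 'Bad read' line found in the captured boot log."""
    lines = console_log.splitlines()
    results = []
    for i, line in enumerate(lines):
        match = BAD_READ_RE.search(line)
        if match is None:
            continue
        # Assumption: the recovery message, if any, is the very next line.
        next_line = lines[i + 1].strip() if i + 1 < len(lines) else ""
        results.append(
            {
                "zone": int(match[1]),
                "block": int(match[2]),
                "addr": int(match[3], 16),
                "recovering": next_line == "Recovering block",
            }
        )
    return results
```

A bad read with no recovery message after it would be worth a closer look before relying on the rest of the boot sequence.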


#############################################
# However, g_vfs_done() errors continue to appear on the console
#############################################

g_vfs_done():da0s3f[READ(offset=-2048, length=16384)]error = 5
g_vfs_done():da0s3f[READ(offset=-2048, length=16384)]error = 5
g_vfs_done():da0s3f[READ(offset=-2048, length=16384)]error = 5
g_vfs_done():da0s3f[READ(offset=-2048, length=16384)]error = 5
g_vfs_done():da0s3f[READ(offset=-2048, length=16384)]error = 5
g_vfs_done():da0s3f[READ(offset=-2048, length=16384)]error = 5
g_vfs_done():da0s3f[READ(offset=-2048, length=16384)]error = 5
g_vfs_done():da0s3f[READ(offset=-2048, length=16384)]error = 5
mgd: commit complete
:
g_vfs_done():da0s3f[READ(offset=-2048, length=16384)]error = 5
g_vfs_done():da0s3f[READ(offset=-2048, length=16384)]error = 5
g_vfs_done():da0s3f[READ(offset=-2048, length=16384)]error = 5
g_vfs_done():da0s3f[READ(offset=-2048, length=16384)]error = 5
g_vfs_done():da0s3f[READ(offset=-2048, length=16384)]error = 5
g_vfs_done():da0s3f[READ(offset=-2048, length=16384)]error = 5
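
To see which device the repeated g_vfs_done() messages point at, a captured console log can be tallied off-box. The following Python sketch is illustrative only: the regular expression is derived from the message format shown in this example, and the function name is hypothetical. On FreeBSD, error = 5 is EIO (input/output error).

```python
import re
from collections import Counter

# Pattern derived from the example line:
# g_vfs_done():da0s3f[READ(offset=-2048, length=16384)]error = 5
G_VFS_RE = re.compile(
    r"g_vfs_done\(\):(?P<dev>\w+)"
    r"\[(?P<op>READ|WRITE)\(offset=(?P<offset>-?\d+), length=(?P<length>\d+)\)\]"
    r"error = (?P<errno>\d+)"
)


def count_vfs_errors(console_log):
    """Count g_vfs_done() errors per (device, operation, errno)."""
    counts = Counter()
    for match in G_VFS_RE.finditer(console_log):
        key = (match["dev"], match["op"], int(match["errno"]))
        counts[key] += 1
    return counts
```

A device that keeps accumulating errors after the media check has finished is a candidate for the fsck steps in the Solution section.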

Note: In addition to the examples described above, this procedure is also useful when file system corruption is observed during a Junos OS upgrade.

Cause:

File system corruption is mainly caused by, but is not limited to, the following:

  • Ungraceful power-off (for example, operator action or a system crash).
  • Frequent writes to the flash (for example, traffic logging).



Solution:

Regardless of the cause, the procedure below can help recover the NAND flash so that the device continues to work normally.

  1. Perform the bad block recovery using the nand-mediack command (without the -C option).

    Note: At this point, the NAND media check utility checks for corrupted blocks and recovers them.

  2. Perform fsck -f on mounted partitions, and fsck -f -y on non-mounted partitions.

    There are four partitions in the flash media: Slice 1 (s1a), Slice 2 (s2a), and Slice 3 (s3e, s3f). fsck performs a "NO-WRITE" operation on mounted partitions and a "WRITE" operation on non-mounted partitions. Use the df command to check which partitions are mounted.

    Example: If da0s1a is the primary root partition, fsck performs a "NO-WRITE" operation on da0s1a and a "WRITE" operation on da0s2a. bo0s3e and bo0s3f are mounted partitions, so a "NO-WRITE" operation is performed on them.

    root@% df
    Filesystem   512-blocks   Used  Avail Capacity  Mounted on
    /dev/da0s1a     1248744 301376 847472    26%    /
    devfs                 2      2      0   100%    /dev
    /dev/md0         797400 797400      0   100%    /junos
    /cf             1248744 301376 847472    26%    /junos/cf
    devfs                 2      2      0   100%    /junos/dev/
    procfs                8      8      0   100%    /proc
    /dev/bo0s3e       94304     60  86700     0%    /config
    /dev/bo0s3f     1264808 956712 206912    82%    /cf/var
    /dev/md1         687744  36456 596272     6%    /mfs
    /cf/var/jail    1264808 956712 206912    82%    /jail/var
    /cf/var/log     1264808 956712 206912    82%    /jail/var/log
    devfs                 2      2      0   100%    /jail/dev
    /dev/md2         128728      8 118424     0%    /mfs/var/run/utm
    /dev/md3           3768      8   3460     0%    /jail/mfs

    Mounted Partitions
    root@% fsck -f /dev/da0s1a
    root@% fsck -f /dev/bo0s3e
    root@% fsck -f /dev/bo0s3f

    Non-Mounted Partition
    root@% fsck -f -y /dev/da0s2a

    At this point, the file system on the non-mounted partition is repaired by the "WRITE" operation.

  3. If a file system error is observed on a mounted partition, it is necessary to log in to single-user mode during the next reboot.
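
The rule above (fsck -f on mounted partitions, fsck -f -y on non-mounted partitions) can be sketched as a small off-box helper that reads captured df output. This is an illustration, not a Juniper tool: the function names are hypothetical, the partition list is taken from the example in this article, and the parser assumes one file system per line of df output.

```python
# Flash partitions named in this article's example; other platforms or
# layouts may differ (assumption).
FLASH_PARTITIONS = ("da0s1a", "da0s2a", "bo0s3e", "bo0s3f")


def mounted_devices(df_output):
    """Return the /dev/ device names that appear in the first df column."""
    mounted = set()
    for line in df_output.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("/dev/"):
            mounted.add(fields[0][len("/dev/"):])
    return mounted


def fsck_plan(df_output, partitions=FLASH_PARTITIONS):
    """Build the fsck command for each partition according to the rule above."""
    mounted = mounted_devices(df_output)
    plan = []
    for part in partitions:
        # Mounted: check only. Not mounted: check and repair (-y).
        flags = "-f" if part in mounted else "-f -y"
        plan.append("fsck {} /dev/{}".format(flags, part))
    return plan


if __name__ == "__main__":
    sample_df = (
        "Filesystem   512-blocks   Used  Avail Capacity  Mounted on\n"
        "/dev/da0s1a     1248744 301376 847472    26%    /\n"
        "/dev/bo0s3e       94304     60  86700     0%    /config\n"
        "/dev/bo0s3f     1264808 956712 206912    82%    /cf/var\n"
    )
    for command in fsck_plan(sample_df):
        print(command)
```

The helper only prints the commands; they still have to be run on the device as described in this article.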

How to log in to Single User Mode
  1. When the autoboot is completed, press the spacebar a few times to access the bootstrap loader prompt.
  2. At the bootstrap loader prompt, enter boot -s to start the system in single-user mode.
  3. When the following prompt appears, press Enter/Return (do not type 'recovery'):

    Enter full pathname of shell or 'recovery' for root password recovery or RETURN for /bin/sh:


If a bad file system is observed on a partition other than the primary root partition (e.g. bo0s3e or bo0s3f):
  1. Reboot the unit, and log in to Single User Mode (see above) during bootup.
  2. Check the mounted file systems from Single User Mode.

    # df
    Filesystem  512-blocks   Used  Avail Capacity  Mounted on
    /dev/da0s1a     598104 300084 250172    55%    /
    devfs                2      2      0   100%    /dev
    /dev/md0        795116 795116      0   100%    /junos
    /cf             598104 300084 250172    55%    /junos/cf
    devfs                2      2      0   100%    /junos/dev/

  3. Perform fsck -f -y on the non-mounted partitions (bo0s3e, bo0s3f).

    Non-Mounted Partitions
    root@% fsck -f -y /dev/bo0s3e
    root@% fsck -f -y /dev/bo0s3f

  4. Boot the unit.
  5. Install the upgrade image.
  6. Reboot to upgrade the unit.

If a bad file system is observed on the primary root partition (e.g. da0s1a):
  1. Install the upgrade image. As a result, the primary root partition is switched (e.g. to da0s2a).
  2. Reboot the unit, and log in to Single User Mode (see above) during bootup.
  3. Check the mounted file systems.

    # df
    Filesystem  512-blocks   Used  Avail Capacity  Mounted on
    /dev/da0s2a    1264552 288416 874972    25%    /
    devfs                2      2      0   100%    /dev
    /dev/md0        760316 760316      0   100%    /junos
    /cf            1264552 288416 874972    25%    /junos/cf
    devfs                2      2      0   100%    /junos/dev/

  4. Perform fsck -f -y on the former primary root partition.

    Non-Mounted Partition
    root@% fsck -f -y /dev/da0s1a

  5. Boot the unit.
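
The two procedures above differ only in when the upgrade image is installed. The following Python sketch spells out that choice as a checklist generator. It is a reading aid under stated assumptions: the function and parameter names are hypothetical, the step wording paraphrases this article, and nothing here runs on the device.

```python
def recovery_steps(bad_partition, primary_root, alternate_root):
    """Return the ordered checklist for repairing bad_partition.

    primary_root is the root partition currently in use (e.g. "da0s1a");
    alternate_root is the one an upgrade switches to (e.g. "da0s2a").
    """
    fsck = "fsck -f -y /dev/{}".format(bad_partition)
    if bad_partition == primary_root:
        # The root in use cannot be repaired while mounted, so the upgrade
        # is installed first to move the root to the alternate partition.
        return [
            "Install the upgrade image (root switches to /dev/{})".format(
                alternate_root
            ),
            "Reboot and log in to single-user mode (boot -s)",
            "Run df and confirm / is on /dev/{}".format(alternate_root),
            fsck,
            "Boot the unit",
        ]
    return [
        "Reboot and log in to single-user mode (boot -s)",
        "Run df and confirm /dev/{} is not mounted".format(bad_partition),
        fsck,
        "Boot the unit",
        "Install the upgrade image",
        "Reboot to upgrade the unit",
    ]


if __name__ == "__main__":
    for step in recovery_steps("bo0s3f", "da0s1a", "da0s2a"):
        print(step)
```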