[EX/SRX] Recovering from file system corruption during a system reboot, NAND media utility checks for bad blocks in the NAND flash memory

Article ID: KB20570 KB Last Updated: 09 May 2018 Version: 5.0
Summary:

This article discusses the NAND media check utility, which checks for bad blocks in the NAND flash memory that is used for the internal boot media in EX platforms. The utility can recover the bad blocks using SCSI protocol extension commands provided by the NAND flash controller vendor. The utility checks for the product model and runs only on EX Series and some SRX Branch Series using the ST72682 NAND flash controller as boot media. This utility recovers the bad blocks by erasing them and permits the system to boot successfully in most cases.

Symptoms:

Issue:
On very rare occasions, the file system on an EX Series switch can become corrupted after an abnormal shutdown caused by a power failure. The system might then fail to boot when the switch is powered on or rebooted. One possible cause is bad blocks in the boot media. If bad blocks are the cause of the corruption, messages similar to the examples shown at the end of this document appear during the boot process.

Platforms/Products Affected:

  • EX Series Switches: EX2200, EX3200, EX4200, EX4500, EX4550, EX8200
  • SRX Series: SRX100, SRX100H2, SRX210, SRX210HE2, SRX240, SRX240H2
  • (Other SRX Branch Series, such as SRX110, SRX110H2, SRX220, SRX220H2, SRX550, and SRX650, use CF flash memory)
Solution:

Requirement:

  • Junos OS Release 10.4R2, 11.1R1 or later
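
The release requirement can be checked programmatically before attempting the recovery. The following is a minimal Python sketch, not a Juniper tool. It assumes the requirement means "10.4R2 or later within the 10.4 release train, or any release from 11.1R1 onward", and that release strings begin with the `<major>.<minor>R<number>` pattern (for example `10.4R2.6`). The function names are hypothetical.

```python
import re
from typing import Tuple

# Matches the leading "<major>.<minor>R<number>" part of a Junos release string.
_RELEASE_RE = re.compile(r"^(\d+)\.(\d+)R(\d+)")


def parse_release(version: str) -> Tuple[int, int, int]:
    """Return (major, minor, R-number) from a string such as '10.4R2.6'."""
    match = _RELEASE_RE.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognized release string: {version!r}")
    major, minor, rnum = match.groups()
    return int(major), int(minor), int(rnum)


def meets_requirement(version: str) -> bool:
    """Assumed reading: 10.4R2+ in the 10.4 train, or 11.1R1 and later."""
    major, minor, rnum = parse_release(version)
    if (major, minor) == (10, 4):
        return rnum >= 2
    return (major, minor, rnum) >= (11, 1, 1)
```

For example, `meets_requirement("10.4R1.9")` returns `False` under this interpretation, while `meets_requirement("11.1R1")` returns `True`.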

Performing the recovery using nand-mediack.sh:

The nand-mediack.sh utility recovers bad blocks by erasing them. Recovery is the default behavior. To check for bad blocks without recovering them, use the -C option. For help, use the -h option.

Usage: nand-mediack [-Ch]

-C: Only check for bad blocks on media. Do not recover.
-h: Show this help message

The possible errors that this utility tries to recover are:

* Bad format
* Bad read
* Bad write
* Bad erase

If the recovery fails, you might see an error such as the following:

Erase <addr> failed: exit code <exit-code>

Examples of the nand-mediack utility:

root@ex-switch:RE:0% nand-mediack -C
Media check on da0 on non-srx platforms
Zone 02 Block 0172 Addr 08ac00 : Bad erase
Zone 07 Block 0520 Addr 1e0800 : Bad erase
Zone 07 Block 0631 Addr 1e7700 : Bad erase
Zone 08 Block 0856 Addr 235800 : Bad erase


root@ex-switch:RE:0% nand-mediack
Media check on da0 on non-srx platforms
Zone 02 Block 0172 Addr 08ac00 : Bad erase
Recovering block
Zone 07 Block 0520 Addr 1e0800 : Bad erase
Recovering block
Zone 07 Block 0631 Addr 1e7700 : Bad erase
Recovering block
Zone 08 Block 0856 Addr 235800 : Bad erase
Recovering block

root@ex-switch:RE:0% nand-mediack -C
Media check on da0 on non-srx platforms

Note: When there are no bad blocks, the utility reports no block entries.
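
When the output is captured from many switches, it can help to turn the bad-block lines into structured records. The sketch below is illustrative only. It assumes the line layout shown in the examples above (`Zone NN Block NNNN Addr XXXXXX : <error>`), treats the zone and block numbers as decimal and the address as hexadecimal, and marks a block as being recovered when the next line reads `Recovering block`. The `BadBlock` type and `parse_mediack` function are hypothetical names.

```python
import re
from typing import List, NamedTuple


class BadBlock(NamedTuple):
    zone: int
    block: int
    addr: int         # parsed as hexadecimal (assumption)
    error: str        # e.g. "Bad erase"
    recovering: bool  # True if followed by a "Recovering block" line


_LINE_RE = re.compile(
    r"^Zone\s+(\d+)\s+Block\s+(\d+)\s+Addr\s+([0-9a-fA-F]+)\s*:\s*(.+?)\s*$"
)


def parse_mediack(output: str) -> List[BadBlock]:
    """Collect bad-block entries from captured nand-mediack output."""
    blocks: List[BadBlock] = []
    for raw in output.splitlines():
        line = raw.strip()
        match = _LINE_RE.match(line)
        if match:
            zone, block, addr, error = match.groups()
            blocks.append(BadBlock(int(zone), int(block), int(addr, 16), error, False))
        elif line == "Recovering block" and blocks:
            # Attach the recovery marker to the most recent entry.
            blocks[-1] = blocks[-1]._replace(recovering=True)
    return blocks
```

An empty result from `parse_mediack` corresponds to the case where the utility lists no bad blocks.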

An example of an unrecoverable error during image installation, even with the --format option:

Formatting installation disk...
1+0 records in
1+0 records out
512 bytes transferred in 0.000303 secs (1689602 bytes/sec)
32+0 records in
32+0 records out
16384 bytes transferred in 0.032164 secs (509388 bytes/sec)
******* Working on device /dev/da0 *******
fdisk: invalid fdisk partition table found
fdisk: Geom not found
(da0:umass-sim0:0:0:0):READ(10). CDB: 28 0 0 13 80 1 0 0 1 0
(da0:umass-sim0:0:0:0):CAM Status: SCSI Status Error
(da0:umass-sim0:0:0:0):SCSI Status: Check Condition
(da0:umass-sim0:0:0:0):MEDIUM ERROR asc:11,0
(da0:umass-sim0:0:0:0):Unrecovered read error
(da0:umass-sim0:0:0:0):Retrying Command (per Sense Data)
(da0:umass-sim0:0:0:0):READ(10). CDB: 28 0 0 13 80 1 0 0 1 0
(da0:umass-sim0:0:0:0):CAM Status: SCSI Status Error
(da0:umass-sim0:0:0:0):SCSI Status: Check Condition
(da0:umass-sim0:0:0:0):ILLEGAL REQUEST asc:20,0
(da0:umass-sim0:0:0:0):Invalid command operation code
(da0:umass-sim0:0:0:0):Unretryable error
32+0 records in
32+0 records out
16384 bytes transferred in 0.040159 secs (407978 bytes/sec)
32+0 records in
32+0 records out
16384 bytes transferred in 0.181412 secs (90314 bytes/sec)
32+0 records in
32+0 records out
16384 bytes transferred in 0.180663 secs (90688 bytes/sec)
(da0:umass-sim0:0:0:0):READ(10). CDB: 28 0 0 13 80 1 0 0 1 0
(da0:umass-sim0:0:0:0):CAM Status: SCSI Status Error
(da0:umass-sim0:0:0:0):SCSI Status: Check Condition
(da0:umass-sim0:0:0:0):MEDIUM ERROR asc:11,0
(da0:umass-sim0:0:0:0):Unrecovered read error
(da0:umass-sim0:0:0:0):Retrying Command (per Sense Data)
(da0:umass-sim0:0:0:0):READ(10). CDB: 28 0 0 13 80 1 0 0 1 0
(da0:umass-sim0:0:0:0):CAM Status: SCSI Status Error
(da0:umass-sim0:0:0:0):SCSI Status: Check Condition
(da0:umass-sim0:0:0:0):ILLEGAL REQUEST asc:20,0
(da0:umass-sim0:0:0:0):Invalid command operation code
(da0:umass-sim0:0:0:0):Unretryable error
(da0:umass-sim0:0:0:0):READ(10). CDB: 28 0 0 13 80 0 0 0 10 0
(da0:umass-sim0:0:0:0):CAM Status: SCSI Status Error
(da0:umass-sim0:0:0:0):SCSI Status: Check Condition
(da0:umass-sim0:0:0:0): MEDIUM ERROR asc:11,0
(da0:umass-sim0:0:0:0):Unrecovered read error
(da0:umass-sim0:0:0:0):Retrying Command (per Sense Data)
(da0:umass-sim0:0:0:0):READ(10). CDB: 28 0 0 13 80 0 0 0 10 0
(da0:umass-sim0:0:0:0):CAM Status: SCSI Status Error
(da0:umass-sim0:0:0:0):SCSI Status: Check Condition
(da0:umass-sim0:0:0:0):ILLEGAL REQUEST asc:20,0
(da0:umass-sim0:0:0:0):Invalid command operation code
(da0:umass-sim0:0:0:0):Unretryable error
bsdlabel: /dev/da0s3 read: Invalid argument
*** The installer exited with status 4 ***
*** The installation is unsuccessful!!! ***
A shell has been started. type exit<cr> to reboot:
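
The `READ(10). CDB:` lines in the log above identify which sectors failed to read. In the standard SCSI READ(10) command, byte 0 is the opcode (0x28), bytes 2 to 5 hold the logical block address in big-endian order, and bytes 7 to 8 hold the transfer length in blocks. The following Python sketch decodes such a line, assuming the console prints each CDB byte in hexadecimal without leading zeros. The `decode_read10` helper is a hypothetical name.

```python
from typing import NamedTuple


class Read10(NamedTuple):
    lba: int     # starting logical block address
    blocks: int  # number of blocks requested


def decode_read10(cdb_text: str) -> Read10:
    """Decode the space-separated hex bytes of a READ(10) CDB."""
    cdb = [int(token, 16) for token in cdb_text.split()]
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a 10-byte READ(10) CDB")
    lba = int.from_bytes(bytes(cdb[2:6]), "big")
    blocks = int.from_bytes(bytes(cdb[7:9]), "big")
    return Read10(lba, blocks)


print(decode_read10("28 0 0 13 80 1 0 0 1 0"))
# prints: Read10(lba=1277953, blocks=1)
```

Under the same assumption, the CDB `28 0 0 13 80 0 0 0 10 0` decodes to a 16-block read starting at LBA 1277952, so the failing reads in this log all fall in the same region of the media.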

Another example of an error during image installation and boot:

Formatting installation disk...
1+0 records in
1+0 records out
512 bytes transferred in 0.000299 secs (1712507 bytes/sec)
32+0 records in
32+0 records out
16384 bytes transferred in 0.033161 secs (494072 bytes/sec)
******* Working on device /dev/da0 *******
fdisk: invalid fdisk partition table found
fdisk: Geom not found
32+0 records in
32+0 records out
16384 bytes transferred in 0.040162 secs (407947 bytes/sec)
32+0 records in
32+0 records out
16384 bytes transferred in 0.186408 secs (87893 bytes/sec)
32+0 records in
32+0 records out
16384 bytes transferred in 0.179908 secs (91069 bytes/sec)
/dev/da0s3d: 319.6MB(654524 sectors) block size 16384, fragment size 2048
using 4 cylinder groups of 79.91MB,5114 blks, 10240 inodes.
super-block backups(for fsck -b #) at:
32, 163680, 327328, 490976
(da0:umass-sim0:0:0:0):READ(10). CDB: 28 0 0 13 80 40 0 0 10 0
(da0:umass-sim0:0:0:0):CAM Status: SCSI Status Error
(da0:umass-sim0:0:0:0):SCSI Status: Check Condition
(da0:umass-sim0:0:0:0):MEDIUM ERROR asc:11,0
(da0:umass-sim0:0:0:0):Unrecovered read error
(da0:umass-sim0:0:0:0):Retrying Command (per Sense Data)
cg 0: bad magic number
*** The installer exited with status 31 ***
*** The installation is unsuccessful!!! ***
A shell has been started. type exit<cr> to reboot:
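
To triage captured console logs like the two examples above, a short script can count the media errors and extract the installer exit status. This is a minimal sketch that relies only on strings visible in these examples (`MEDIUM ERROR`, `Unretryable error`, and `The installer exited with status N`); the `summarize` function and `InstallSummary` type are hypothetical names, and other Junos releases may word these messages differently.

```python
import re
from typing import NamedTuple, Optional


class InstallSummary(NamedTuple):
    medium_errors: int               # lines containing "MEDIUM ERROR"
    unretryable: int                 # lines containing "Unretryable error"
    installer_status: Optional[int]  # None if no exit status line was found


_STATUS_RE = re.compile(r"The installer exited with status (\d+)")


def summarize(log: str) -> InstallSummary:
    """Summarize media errors and installer status from a console log."""
    lines = log.splitlines()
    medium = sum(1 for line in lines if "MEDIUM ERROR" in line)
    unretryable = sum(1 for line in lines if "Unretryable error" in line)
    match = _STATUS_RE.search(log)
    status = int(match.group(1)) if match else None
    return InstallSummary(medium, unretryable, status)
```

A nonzero `medium_errors` count together with a failed installation suggests that bad blocks, rather than the software image, are the likely cause.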

Recovery of file system corruption:

This recovery procedure requires booting from an external USB flash drive when the switch cannot boot from the internal NAND flash media. The procedure for booting the switch from an external USB flash drive is described in "Booting an EX Series Switch Using a Software Package Stored on a USB Flash Drive."

Modification History:

2018-05-09: Removed EX4300 from "Platforms/Products Affected"
