[EX] Legacy EX switches may bypass "fsck" during boot cycle, resulting in undetected file system corruptions

  [KB33996]


Summary:

Legacy EX switches may bypass the file system check (fsck) during the boot cycle. As a result, file system corruption may go undetected and uncorrected.

This article explains why the file system check is bypassed and provides a couple of ways to recover from the problem. It also lists the Junos OS releases in which the fix is available.

 

Symptoms:

File system issues may be observed even when the partition was shut down cleanly, as shown below:

root@switch% nand-mediack -C
Media check on da0 on ex platforms
Zone 05 Block 0349 Addr 155d00 : Bad read
Zone 05 Block 0355 Addr 156300 : Bad read
Zone 05 Block 0357 Addr 156500 : Bad read
Zone 05 Block 0361 Addr 156900 : Bad read
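
When this output has to be reviewed across many switches, the "Bad read" lines can be extracted with a short script. The following is a minimal sketch in Python, assuming the output format matches the sample above; the function name parse_bad_reads is hypothetical and is not part of Junos OS. Field values are kept as strings because the article does not state their numeric base.

```python
import re

# Matches lines such as "Zone 05 Block 0349 Addr 155d00 : Bad read".
# The pattern is inferred from the sample output only (an assumption).
BAD_READ = re.compile(
    r"Zone\s+(?P<zone>\S+)\s+Block\s+(?P<block>\S+)\s+"
    r"Addr\s+(?P<addr>\S+)\s*:\s*Bad read"
)


def parse_bad_reads(output):
    """Return a (zone, block, addr) tuple of strings for each 'Bad read' line."""
    hits = []
    for line in output.splitlines():
        match = BAD_READ.search(line)
        if match:
            hits.append((match["zone"], match["block"], match["addr"]))
    return hits


sample = """Media check on da0 on ex platforms
Zone 05 Block 0349 Addr 155d00 : Bad read
Zone 05 Block 0355 Addr 156300 : Bad read
Zone 05 Block 0357 Addr 156500 : Bad read
Zone 05 Block 0361 Addr 156900 : Bad read"""

bad_reads = parse_bad_reads(sample)
print(len(bad_reads), "bad read(s) reported")
```

A non-empty result suggests that the media check found unreadable blocks on the device.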

Log messages:

Jan  1 00:06:32.762 2015  switch eventd[1458]: SYSTEM_ABNORMAL_SHUTDOWN: System abnormally shut down
Jan  1 00:06:32.946 2015  switch eventd[1458]: SYSTEM_OPERATIONAL: System is operational
Jan  1 00:06:32.949 2015  switch /kernel: /mnt: update error: blocks 0 files 1
 
Mar  1 16:16:31.477 2019  switch /kernel: (da0:umass-sim0:0:0:0): CAM Status: SCSI Status Error
Mar  1 16:16:31.477 2019  switch /kernel: (da0:umass-sim0:0:0:0): SCSI Status: Check Condition
Mar  1 16:16:31.477 2019  switch /kernel: (da0:umass-sim0:0:0:0): MEDIUM ERROR asc:11,0
Mar  1 16:16:31.477 2019  switch /kernel: (da0:umass-sim0:0:0:0): Unrecovered read error
Mar  1 16:16:31.477 2019  switch /kernel: (da0:umass-sim0:0:0:0): Retrying Command (per Sense Data)
Mar  1 16:18:06.702 2019  switch sfid[1298]: JTASK_SCHED_SLIP_KEVENT: 95 sec 155493 usec kevent block
Mar  1 16:18:06.703 2019  switch chassism[1297]: JTASK_SCHED_SLIP_KEVENT: 95 sec 64013 usec kevent block
Mar  1 16:18:06.708 2019  switch eswd[1310]: JTASK_SCHED_SLIP_KEVENT: 95 sec 234127 usec kevent block
Mar  1 16:18:06.711 2019  switch lldpd[1335]: JTASK_SCHED_SLIP_KEVENT: 95 sec 233048 usec kevent block
Mar  1 16:18:06.712 2019  switch cfmd[1316]: JTASK_SCHED_SLIP_KEVENT: 97 sec 631027 usec kevent block
Mar  1 16:18:06.712 2019  switch mcsnoopd[1337]: JTASK_SCHED_SLIP_KEVENT: 96 sec 139675 usec kevent block
Mar  1 16:18:06.713 2019  switch sflowd[1336]: JTASK_SCHED_SLIP_KEVENT: 95 sec 663718 usec kevent block
Mar  1 16:18:06.713 2019  switch vccpd[1299]: JTASK_SCHED_SLIP_KEVENT: 95 sec 243301 usec kevent block
Mar  1 16:18:06.721 2019  switch /kernel: bad block -1, ino 85
Mar  1 16:18:06.721 2019  switch /kernel: pid 40 (softdepflush), uid 0 inumber 85 on /var: bad block
Mar  1 16:18:06.721 2019  switch /kernel: bad block -1, ino 85
Mar  1 16:18:06.721 2019  switch /kernel: pid 40 (softdepflush), uid 0 inumber 85 on /var: bad block
Mar  1 16:18:06.721 2019  switch /kernel: bad block -1, ino 85
Mar  1 16:18:06.721 2019  switch /kernel: pid 40 (softdepflush), uid 0 inumber 85 on /var: bad block
Mar  1 16:18:06.721 2019  switch /kernel: bad block -1, ino 85
 
Mar  1 16:18:06.747 2019  switch rpd[1319]: RPD_SCHED_SLIP_KEVENT: 95 sec 237260 usec kevent block
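
To check a saved log file for the same symptoms, the messages can be matched against a small set of signatures. The following is a minimal sketch in Python. The signature strings are copied from the log messages in this article; the labels and the function name count_signatures are hypothetical, and the list is not an exhaustive or official set of indicators.

```python
import re
from collections import Counter

# Signature strings taken from the log messages in this article.
# The labels on the left are illustrative names, not Junos identifiers.
SIGNATURES = {
    "abnormal_shutdown": re.compile(r"SYSTEM_ABNORMAL_SHUTDOWN"),
    "medium_error": re.compile(r"MEDIUM ERROR"),
    "unrecovered_read": re.compile(r"Unrecovered read error"),
    "bad_block": re.compile(r"bad block"),
    "sched_slip": re.compile(r"_SCHED_SLIP_KEVENT"),
}


def count_signatures(lines):
    """Count how many log lines match each symptom signature."""
    counts = Counter()
    for line in lines:
        for label, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[label] += 1
    return counts


sample = [
    "Mar  1 16:16:31.477 2019  switch /kernel: (da0:umass-sim0:0:0:0): MEDIUM ERROR asc:11,0",
    "Mar  1 16:16:31.477 2019  switch /kernel: (da0:umass-sim0:0:0:0): Unrecovered read error",
    "Mar  1 16:18:06.702 2019  switch sfid[1298]: JTASK_SCHED_SLIP_KEVENT: 95 sec 155493 usec kevent block",
    "Mar  1 16:18:06.721 2019  switch /kernel: bad block -1, ino 85",
    "Mar  1 16:18:06.747 2019  switch rpd[1319]: RPD_SCHED_SLIP_KEVENT: 95 sec 237260 usec kevent block",
]

counts = count_signatures(sample)
for label, count in sorted(counts.items()):
    print(label, count)
```

Read errors on da0 followed by scheduler slips and "bad block" messages on /var, as in the excerpt, are the combination to look for; any single message alone may have other causes.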

 

Cause:

On legacy EX switches, the file system check (fsck) is run with the -C option, which skips the file system corruption check if the partition has been marked clean during the boot-time "nand-media" check. As a result, there have been multiple instances where a partition had file system issues even though it was cleanly shut down, and those issues were not detected.
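
The effect of the -C option can be illustrated with a simplified model. The following Python sketch is not the Junos OS boot logic; the function and parameter names are hypothetical, and it only mirrors the behavior described above: when the clean flag is honored and the partition is marked clean, the check is skipped regardless of the actual state of the file system.

```python
def runs_corruption_check(marked_clean, honor_clean_flag):
    """Return True if the corruption check would run in this simplified model.

    marked_clean: the partition carries a "clean" marker at boot.
    honor_clean_flag: models running fsck with -C, as described in this article.
    """
    if honor_clean_flag and marked_clean:
        # The check is skipped on the strength of the clean marker alone.
        return False
    return True


# A partition that was shut down cleanly but is actually corrupt:
actually_corrupt = True
checked = runs_corruption_check(marked_clean=True, honor_clean_flag=True)
undetected = actually_corrupt and not checked
print("corruption goes undetected:", undetected)
```

In this model, corruption on a cleanly marked partition is found only when the clean flag is not honored, which matches the symptom described in this article.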

 

Solution:

This problem has been fixed in Junos OS releases 12.3R12-S7, 14.1X53-D46, and 15.1R6.

Meanwhile, to recover from the problem:

 

Related Links: