This article describes how to find out what is consuming disk space on an EX switch.
Running the command ‘show chassis alarms’ on an EX switch returns the following message:
Minor Host 0 /var partition usage is high.
What could be consuming the disk space?
Example:
When checking the device utilization, the following output is returned:
{master:0}
lab@EX> show system storage
/dev/da0s3e 123M 82M 31M 72% /var
This output does not reveal what could be consuming the disk space.
Running the command ‘file list /var/’ does not provide enough information:
.snap/
BSD.var.dist
account/
at/
backups/
bin/
crash/
cron/
db/
empty/
etc/
etcroot/
heimdal/
home/
jail/
log/
logical-systems/
mail/
mfs/
msgs/
named/
preserve/
root/
run/
rundb/
rwho/
spool/
sw/
tmp/
transfer/
validate/
yp/
Important Notes:
- Unless you are familiar with the Junos OS file system, or you have been advised to do so by a JTAC engineer, do not attempt to delete files outside the /var/home directory, as it can cause your switch to fail.
- Unless strictly necessary, do not log into the console as ‘root’, because the switch will allow you to delete critical system files.
- Remember that UNIX does not recognize file extensions, so it is common to have a file with no extension.
- KB22966 - How to resolve the '/var: filesystem full' issue which occurs as a result of the WTMP file not being archived. This article explains a similar condition related to the WTMP file taking up space.
The following example uses UNIX shell commands to find out what is taking up space on an EX switch and to remove the files responsible:
Note: The shell command “find / -size +100000” can be used to find large files under /. With no unit suffix, find measures size in 512-byte blocks, so this matches files larger than about 51 MB.
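As a hedged sketch of the find approach in the note above, the helper below wraps find so that it stays on one filesystem and hides permission errors. The name find_large is hypothetical (it is not a Junos command). The sketch assumes a POSIX sh (the % prompt in this article suggests a csh-style shell, so it would need to be run via sh) and a find that counts -size in 512-byte blocks when no unit suffix is given.

```shell
# Hypothetical helper (not part of Junos): list regular files under a
# directory that are larger than a given number of 512-byte blocks.
find_large() {
    dir=$1
    blocks=$2
    # -xdev   : do not descend into other mounted filesystems
    # -type f : regular files only
    # 2>/dev/null hides "Permission denied" messages for non-root users,
    # so unreadable directories are silently skipped.
    find "$dir" -xdev -type f -size +"$blocks" 2>/dev/null
}
```

For example, find_large /var 100000 would list regular files on the /var filesystem that are larger than roughly 51 MB.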
- Enter the command ‘start shell’:
{master:0}
lab@EX> start shell
%
By default, the shell starts in the user's home directory. Confirm this by typing pwd:
% pwd
/var/home/lab
- Go to the top level of the /var partition by using the following commands:
% cd /var/
% pwd
/var
- Check folder utilization:
% du -cks * | sort -rn
du: cron/tabs: Permission denied
du: db/entropy: Permission denied
du: db/certs/common: Permission denied
du: db/certs/system-key-pair: Permission denied
du: db/certs/system-cert: Permission denied
du: db/dhcp_snoop: Permission denied
du: heimdal: Permission denied
du: root/.ssh: Permission denied
du: run/ppp: Permission denied
du: spool/opielocks: Permission denied
407216 total
293838 tmp
53964 home
30430 rundb
20698 root
6888 log
378 db
332 mfs
204 run
204 etc
120 jail
100 etcroot
12 spool
6 at
4 transfer
4 sw
4 crash
4 BSD.var.dist
2 yp
2 validate
2 rwho
2 preserve
2 named
2 msgs
2 mail
2 logical-systems
2 empty
2 cron
2 bin
2 backups
2 account
In this case, most of the disk space appears to be consumed by tmp (a partition of its own) and home.
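The "Permission denied" lines in the output above are error messages, not sizes. A minimal sketch that hides them and prints only the largest entries is shown below. The name top_usage is hypothetical, and the sketch assumes a POSIX sh. Directories that cannot be read are undercounted unless the command is run with sufficient privileges.

```shell
# Hypothetical helper: show the N largest entries directly under a
# directory, sizes in kilobytes, largest first (default N is 5).
top_usage() {
    dir=$1
    count=${2:-5}
    # The subshell keeps the caller's working directory unchanged.
    (
        cd "$dir" || exit 1
        # -k : report sizes in 1 KB units
        # -s : one summary line per argument
        # 2>/dev/null drops the "Permission denied" messages.
        du -ks * 2>/dev/null | sort -rn | head -n "$count"
    )
}
```

For example, top_usage /var 3 would print the three largest entries under /var. Like the du command above, the * pattern skips hidden entries such as .snap.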
- The du output has a limitation: it does not show whether an entry is a file or a directory. Use the following command to confirm:
% file home
home: directory
Alternatively, the following command can be used:
% ls -l | grep home
drwxr-xr-x 37 root wheel 1024 Sep 5 18:22 home
The first column starts with a lowercase d, which in UNIX indicates a directory.
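The size check and the file-or-directory check can be combined in one pass. The following is a sketch only: usage_with_type is a hypothetical name, and a POSIX sh is assumed. It prints the size in KB, the type, and the name of each entry, separated by tabs.

```shell
# Hypothetical helper: print "size_kb<TAB>type<TAB>name" for every entry
# in a directory, largest first.
usage_with_type() {
    (
        cd "$1" || exit 1
        for entry in *; do
            # test -d / -f follow symbolic links, like "file -L" would.
            if [ -d "$entry" ]; then
                kind=dir
            elif [ -f "$entry" ]; then
                kind=file
            else
                kind=other
            fi
            # du prints "size<TAB>name"; keep only the size field.
            size=$(du -ks "$entry" 2>/dev/null | cut -f1)
            printf '%s\t%s\t%s\n' "${size:-?}" "$kind" "$entry"
        done | sort -rn
    )
}
```

For example, usage_with_type /var/home would show at a glance which large entries are directories that need to be inspected further.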
- Enter this directory and run the procedure again to find out what is using up the space inside /var/home:
% cd home/
% pwd
/var/home
% du -cks * | sort -rn
53962 total
53802 lab
46 daltamirano
16 remote
14 wheaslip
12 jrojas
4 wmoreira
% ls -l | grep lab
drwxr-xr-x 3 lab 20 512 Dec 17 09:30 lab
This reveals that most of the usage comes from the lab directory.
- Repeat the procedure once more to see files under the lab directory:
% cd lab/
% du -cks * | sort -rn
53796 total
28000 foo
25760 bar
14 DHCP-CONFIGURATION-MEMO
8 et inte
8 JORGE-TEST
4 h
2 DHCPclient.PCAP
0 LC-DHCP-LAB
0 DHCP.PCAP
% ls -l | grep foo
-rw-r--r-- 1 lab field 28655162 Dec 17 09:13 foo
% ls -l | grep bar
-rw-r--r-- 1 lab field 26361809 Dec 17 09:13 bar
The space is being consumed by two files: foo and bar.
- Run the file command on them to find out what they are:
% file foo
foo: gzip compressed data, from UNIX, max compression
% file bar
bar: gzip compressed data, from UNIX, max compression
This reveals they are gzip-compressed files, most likely tarballs. In most scenarios, /var utilization increases either because upgrade package tarballs were copied to the wrong directory, or because a packet capture ran for too long. The file command also identifies packet captures:
% file DHCPclient.PCAP
DHCPclient.PCAP: tcpdump capture file (little-endian) - version 2.4, capture length 96)
- Remove the files:
% rm foo
% rm bar
% ls
DHCP-CONFIGURATION-MEMO DHCPclient.PCAP LC-DHCP-LAB h
DHCP.PCAP JORGE-TEST et inte
This frees up the space on /var:
/dev/da0s3e 123M 29M 84M 26% /var
After about 20 minutes the alarm will clear by itself.
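The new utilization can also be read from the shell without returning to the CLI. The sketch below is illustrative only: percent_used is a hypothetical name, and it assumes a POSIX sh and a df that supports the POSIX -P option.

```shell
# Hypothetical helper: print the percentage-used column for the
# filesystem that holds the given path.
percent_used() {
    # -P : POSIX output format, exactly one line per filesystem
    # -k : sizes in 1 KB blocks
    # Line 1 is the header; field 5 of line 2 is the capacity column.
    df -Pk "$1" | awk 'NR == 2 { print $5 }'
}
```

For example, percent_used /var prints a single value such as the 26% shown in the output above.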
- Another helpful command is ‘request system storage cleanup dry-run’, which lists the files that a storage cleanup would delete without actually deleting them:
user@host> request system storage cleanup dry-run
Currently rotating log files, please wait.
This operation can take up to a minute.
List of files to delete:
Size Date Name
11.4K Mar 8 15:00 /var/log/messages.1.gz
7245B Feb 5 15:00 /var/log/messages.3.gz
11.8K Feb 22 13:00 /var/log/messages.2.gz
3926B Mar 16 13:57 /var/log/messages.0.gz
3962B Feb 22 12:47 /var/log/sampled.1.gz
4146B Mar 8 12:20 /var/log/sampled.0.gz
4708B Dec 21 11:39 /var/log/sampled.2.gz
7068B Jan 16 18:00 /var/log/messages.4.gz
13.7K Dec 27 22:00 /var/log/messages.5.gz
890B Feb 22 17:22 /var/tmp/sampled.pkts
65.8M Oct 26 09:10 /var/sw/pkg/jinstall-7.4R1.7-export-signed.tgz
63.1M Oct 26 09:13 /var/sw/pkg/jbundle-7.4R1.7.tgz
2020-03-19: Moved important notes to top of solution field.
2019-03-05: Minor edit. Non-technical