Resolving the Dreaded “Out of Disk Space” Errors on Linux Servers and Desktops
Running out of precious disk space continues to be one of the most common pain points faced by Linux administrators and users. A full root or home partition can quickly bring your entire Linux system to a grinding halt. Vital functions can fail with obscure errors when partitions fill up.
Recovering from a disk space crunch requires urgent troubleshooting and opens up a Pandora’s box of maintenance issues. Here we dive into the typical causes of disk space exhaustion on Linux and a comprehensive blueprint to diagnose, troubleshoot and resolve out of disk space errors.
Probing the Problem: Identifying Which Disk is Full
The very first step is to identify which particular disk partition is running low or out of space on your Linux machine.
On servers and headless Linux, the ‘df’ command is used to view disk usage statistics and free space available for all mounted partitions. The ‘df’ output will highlight the used and available space on each mounted filesystem. Pay close attention to partitions showing 100% or upwards of 90% usage, which signals critically insufficient free space.
By default ‘df’ reports sizes in 1K blocks, which is hard to read at a glance. Use ‘df -h’ to show sizes in human-readable units such as MB or GB instead.
df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1        99G   92G  2.1G  97% /
/dev/sda2       306G  289G   17G  95% /home
/dev/sda3       9.7G  4.2G  5.5G  44% /opt
Here we can clearly see that the root filesystem ‘/’ and home ‘/home’ partitions are nearly full at 97% and 95% usage respectively.
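Picking out the nearly full partitions by eye gets tedious on machines with many mounts. As a minimal sketch, the helper below filters ‘df -P’ style output for filesystems at or above a chosen percentage. The name over_threshold is made up for this example, and the parsing assumes mount points without spaces.

```shell
# Print "mountpoint usage%" for every filesystem at or above a threshold.
# Reads 'df -P' style output on stdin so it also works on saved output.
# over_threshold is a hypothetical helper name, not a standard command.
over_threshold() {
    awk -v limit="$1" 'NR > 1 {
        pct = $5
        sub(/%/, "", pct)                 # "97%" -> "97"
        if (pct + 0 >= limit + 0) print $6, $5
    }'
}

# Typical use on a live system:
#   df -P | over_threshold 90
```

The ‘-P’ switch asks ‘df’ for the POSIX output format, which keeps each filesystem on a single line and makes the column positions predictable.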
The ‘du’ command can be used to drill down and find which folders are consuming the most space. For example, scanning the top level directories using ‘du -sh /*’ will generate size totals for each folder under root. This can clue you into large space hogs.
On desktop Linux, GUI disk usage utilities can simplify the visualization:
- GNOME Disks provides a graphical overview of all disk partitions, their mount points and usage percentages.
- Filelight, QDirStat and similar tools let you interactively browse folders and drill down sorted by size. (DaisyDisk, often mentioned in this context, is a macOS application.)
- Baobab (Disk Usage Analyzer) scans the disk and displays an interactive map of folder sizes.
System administrators also rely on terminal-based tools. ncdu gives an interactive, sortable view of usage per directory, and re-running ‘df’ at intervals (for example under ‘watch’) helps spot usage spikes across partitions as they happen. Sudden upticks in usage and rapidly filling partitions raise red flags.
It’s also important to check log files and browse through system error messages. The “No space left on device” IO error clearly points to a disk space issue, as do warnings about disks filling up.
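To make the log check concrete, here is a small sketch that lists which log files mention the error. The helper name find_enospc and the /var/log default are assumptions for illustration. Keep in mind that the same message is also returned when a filesystem runs out of inodes rather than blocks, which ‘df -i’ will reveal.

```shell
# List files under a log directory that contain the ENOSPC message.
# find_enospc is a hypothetical helper name; /var/log is only a common default.
find_enospc() {
    grep -rl -- "No space left on device" "${1:-/var/log}" 2>/dev/null
}

# If df -h shows free space but the error persists, check inode usage:
#   df -i        # the IUse% column shows inode usage per filesystem
```

Reading some files under /var/log typically needs root privileges, so unreadable files are silently skipped here.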
Finding and Removing Large Space Hogging Files
Armed with the identity of the specific partition that is full, the next step is to dig deeper and find exactly which files and folders are occupying all the space. Remember – logs, temporary files, database dumps, and caches can silently balloon and choke up space.
Run ‘du -sh *’ from the root of the problem partition to get folder sizes. Then cd into each large folder and run ‘du -sh *’ again to narrow it down further. Eventually you can pinpoint the particular set of space hogging files.
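The ‘du’ command can also list individual files alongside folders when given the ‘-a’ switch:
du -ah --max-depth=1 /full/partition
This prints the size of every file and folder directly under that partition’s root folder. Note that ‘-s’ cannot be combined with ‘--max-depth=1’, because summarizing already implies a depth of zero. Pipe the output into ‘sort -rh’ to sort by size in descending order for quick detection of space hogs.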
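The drill-down can be wrapped in a small helper so it is easy to repeat one level at a time. This is only a sketch: largest_entries is a made-up name, and it assumes GNU ‘du’ (for ‘--max-depth’).

```shell
# Show the N largest entries directly under a directory, biggest first.
# largest_entries is a hypothetical helper name. Assumes GNU du.
largest_entries() (
    dir=${1:-.}
    count=${2:-10}
    # -a includes files as well as folders, -x stays on one filesystem,
    # -k reports KiB so a plain numeric sort is enough.
    du -akx --max-depth=1 -- "$dir" 2>/dev/null | sort -rn | head -n "$count"
)

# Example: the ten biggest entries under /var, then repeat on the winner
#   largest_entries /var 10
```

The first line of the output is normally the total for the directory itself, so the real candidates start on the second line.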
In a graphical desktop file manager, simply right click and sort by size to surface large files. You can also use visual disk mappers like Filelight and Baobab.
Once you’ve identified some candidate large files, it is time to remove the unnecessary ones. Common targets include:
- Application archives (.tar.gz, .rpm, .deb) after programs are installed.
- Temp files and folders that are usually safe to delete, like the contents of /tmp/ and browser caches.
- Big log files and archives of logs that just pile up over time.
- Crashed VM images and leftover VM snapshots hogging space.
- Downloaded ISOs and disk images you forgot to delete.
- Build artifacts and old software project directories.
- Packages that were pulled in as dependencies and are no longer needed after applications are uninstalled. Use ‘yum autoremove’ or ‘apt autoremove’ to clear them.
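Many of the targets above share two traits: they are large and nobody has touched them in a while. A rough way to surface them is sketched below. The name stale_big_files and the default thresholds are assumptions, and the size syntax relies on GNU ‘find’. Nothing is deleted; the function only prints candidates for review.

```shell
# List regular files larger than a size and not modified for N days.
# Nothing is deleted; review the list before removing anything.
# stale_big_files is a hypothetical helper name. Assumes GNU find.
stale_big_files() (
    dir=$1
    min_size=${2:-100M}   # find -size syntax, e.g. 100M or 1G
    min_age=${3:-30}      # days since last modification
    find "$dir" -xdev -type f -size "+$min_size" -mtime "+$min_age" -print 2>/dev/null
)

# Example: files over 1 GiB under /home untouched for 90 days
#   stale_big_files /home 1G 90
```

The ‘-xdev’ switch keeps the search on the filesystem that is actually full instead of wandering into other mounts.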
If large files are already covered by verified backups, consider removing the local copies. Compress existing files selectively where possible using gzip, bzip2, xz and other compression utilities. This can provide some quick wins.
Databases and log files in particular tend to balloon in size over time. Check what rotation and retention policies are in place. Trim historical data aggressively if the partition crisis demands.
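Where no rotation policy exists yet, compressing old logs by hand can buy time. The sketch below is a stopgap under stated assumptions: compress_old_logs is a made-up helper, it only matches files ending in .log, and it should be pointed at logs that are no longer being written, since a process that still has a file open will keep writing to the removed original.

```shell
# Compress .log files that have not been modified for N days.
# compress_old_logs is a hypothetical helper; logrotate is the usual
# long-term answer, so treat this as a stopgap for application logs.
compress_old_logs() (
    dir=$1
    min_age=${2:-7}       # days since last modification
    find "$dir" -xdev -type f -name '*.log' -mtime "+$min_age" \
        -exec gzip -9 {} +
)

# Example (the path is illustrative only):
#   compress_old_logs /srv/myapp/logs 14
```

gzip replaces each file with a .gz version, so the space saved shows up in ‘df’ as soon as the command finishes.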
For code repositories, explore shallow clones and prune large binary files to reclaim space. Removing unneeded Docker images, containers and volumes can also provide relief.
Resize Partitions and Filesystem Limits
If a partition stays full even after clearing out all unnecessary large files, more serious measures such as resizing partitions or raising quotas may be required.
Use a disk utility like GNOME Disks, GParted or fdisk to resize and expand partitions. Shrinking generally requires unmounting the partition first, and after a partition is enlarged the filesystem inside it must be grown as well (for example with ‘resize2fs’ for ext4 or ‘xfs_growfs’ for XFS) if the tool does not do so itself. Adding more physical disks and managing them with LVM are other ways to increase space.
If the Linux machine is a VM, extend the underlying virtual disk using the tools provided by the hypervisor, then grow the partition and filesystem from inside the guest.
For cloud servers, most cloud consoles provide options to expand the primary boot disk volume. On many platforms this can be done without rebooting, although the partition and filesystem often still need to be grown inside the instance before the extra space is usable.
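Which command grows the filesystem depends on its type, and the arguments differ too. The dry-run sketch below only prints the command it would suggest, so it is safe to experiment with. grow_fs_hint is a hypothetical helper, and it assumes the partition or logical volume has already been enlarged.

```shell
# Print (not run) the command that would grow a filesystem to fill its
# already-enlarged partition or volume. grow_fs_hint is a hypothetical helper.
grow_fs_hint() {
    # $1 = filesystem type, $2 = block device, $3 = mount point
    case "$1" in
        ext2|ext3|ext4) echo "resize2fs $2" ;;                      # takes the device
        xfs)            echo "xfs_growfs $3" ;;                     # takes the mount point
        btrfs)          echo "btrfs filesystem resize max $3" ;;
        *)              echo "no suggestion for filesystem type: $1" >&2
                        return 1 ;;
    esac
}

# Example, looking up type and device with findmnt:
#   grow_fs_hint "$(findmnt -no FSTYPE /home)" "$(findmnt -no SOURCE /home)" /home
```

Printing rather than executing is deliberate: resizing is the kind of operation worth reading twice and backing up for before running.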
When the root ‘/’ partition has no more space left but other partitions do have space, consider using symlinks to store data files on partitions with available capacity instead of root. Make sure the symlinks are created properly and point to the non-root partition’s directories.
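The symlink approach can be sketched as a copy, remove and link sequence. The helper name relocate_dir and the example paths are assumptions. Stop any service using the directory before moving it, and note that the copy happens first so a failed copy leaves the original untouched.

```shell
# Move a directory to another partition and leave a symlink at the old path.
# relocate_dir is a hypothetical helper; stop services using the directory first.
relocate_dir() (
    src=${1%/}            # directory on the full partition
    dest_parent=${2%/}    # existing directory on a partition with free space
    if [ ! -d "$src" ] || [ -L "$src" ]; then
        echo "relocate_dir: $src is not a real directory" >&2
        exit 1
    fi
    dest="$dest_parent/$(basename "$src")"
    if [ -e "$dest" ]; then
        echo "relocate_dir: $dest already exists" >&2
        exit 1
    fi
    # Copy with attributes preserved, and only then remove and link.
    cp -a "$src" "$dest" && rm -rf "$src" && ln -s "$dest" "$src"
)

# Example (paths are illustrative only):
#   relocate_dir /var/lib/myapp/data /home/storage
```

Afterwards ‘ls -l’ on the old path should show the link and its target, which is a quick way to confirm it points at the intended partition.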
Check whether user, group or project quotas are in effect and limiting space usage. Review and increase quotas as needed to the extent possible. Quotas are commonly used to prevent disk space abuse.
If the partition is used by multiple users or groups, enable soft and hard partition quotas using tools like quota, setquota, edquota and repquota. But use quotas judiciously and only where needed.
An Ongoing Battle: Proactive Monitoring and Tuning
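Quota limits for ‘setquota’ are given in 1 KiB blocks by default, which is easy to get wrong by a factor of a thousand. The sketch below converts limits given in GiB and prints the resulting command without running it. quota_cmd is a made-up helper, and the user name and mount point in the example are placeholders.

```shell
# Print (not run) a setquota command for a user, with limits given in GiB.
# setquota takes block limits in 1 KiB blocks by default; the two zeros
# leave the inode soft and hard limits unset.
# quota_cmd is a hypothetical helper name.
quota_cmd() {
    # $1 = user, $2 = soft limit in GiB, $3 = hard limit in GiB, $4 = filesystem
    echo "setquota -u $1 $(( $2 * 1024 * 1024 )) $(( $3 * 1024 * 1024 )) 0 0 $4"
}

# Example: 10 GiB soft and 12 GiB hard limit for a placeholder user
#   quota_cmd alice 10 12 /home
# Current usage against limits can then be reviewed with:
#   repquota -s /home
```

This assumes quotas are already enabled on the filesystem in question; the commands have no effect otherwise.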
Preventing out of disk errors should be an ongoing and proactive system administration task. Here are some tips:
- Set up disk space usage alerts to detect early warning signs before partitions fill up completely.
- Schedule periodic scans of disk usage and prune large old files. Do not let partitions reach 100%.
- Monitor users and applications that are consuming the most storage.
- Retain breathing room and allocate adequately for future growth needs.
- Move less accessed data to external disks or cloud storage to free primary partition space.
- Archive and compress old logs, dumps and files.
- Explore storage optimization technologies like deduplication, compression and thin provisioning.
- Upgrade to larger disks over time and migrate data across. A storage upgrade plan is essential.
- Virtualized environments make storage expandable. But watch for guest-host mapped partitions hitting capacity.
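Alerts on a fixed percentage are a good start, but the growth rate tells you how much time is left. As a back-of-the-envelope sketch, the helper below projects days until full from two usage samples. days_until_full is a hypothetical name, and it assumes growth stays linear, which real workloads rarely guarantee.

```shell
# Estimate whole days until a filesystem fills, from two usage samples.
# days_until_full is a hypothetical helper; linear growth is an assumption.
days_until_full() {
    # $1 = used KiB at first sample, $2 = used KiB at second sample,
    # $3 = days between the samples, $4 = KiB currently available
    awk -v a="$1" -v b="$2" -v d="$3" -v avail="$4" 'BEGIN {
        growth = (b - a) / d              # KiB per day
        if (growth <= 0) { print "not growing"; exit }
        printf "%d\n", avail / growth     # truncated to whole days
    }'
}

# Samples can be collected with, for example:
#   df -Pk /home | awk 'NR == 2 { print $3, $4 }'    # used and available KiB
```

Recording one sample per day from cron and feeding the oldest and newest values into the helper gives a rough but useful early warning.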
With diligence around identifying problem partitions early, pruning files selectively, and optimizing space usage, disk outages can be tackled. But the essential work is continuously monitoring, tuning and upgrading the storage landscape. An outage will happen if you take your eyes off the disk capacity meter.