Understanding Linux File Systems: A Beginner's Tutorial

If you’ve ever used a Linux-based operating system (like Ubuntu, Fedora, or Debian), you’ve interacted with its file system, whether you realized it or not. Unlike Windows, which assigns a drive letter to each volume, Linux organizes data in a unified, tree-like hierarchy where nearly everything is treated as a file (including hardware devices, process information, and directories). This design is both powerful and intuitive, but it can feel foreign to newcomers. In this tutorial, we’ll demystify Linux file systems. We’ll start with core concepts, explore the standard directory structure, break down common file system types, and teach you practical skills like mounting drives, checking disk space, and managing files. By the end, you’ll be able to navigate and manage Linux file systems with confidence.

What is a File System?

A file system is a method for organizing and storing data on a storage device (e.g., HDD, SSD, USB drive). It defines how files are named, stored, and retrieved, and manages metadata (e.g., file size, permissions, creation date). Without a file system, data would be a chaotic jumble of bits—like a bookshelf with no organization.

Linux supports multiple file system types, each optimized for specific use cases (e.g., speed, reliability, large files). Unlike some operating systems (e.g., Windows, which historically favored NTFS), Linux is flexible and can read/write to most major file systems (FAT32, exFAT, NTFS, etc.).
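To see which file system types the running kernel currently supports, one option is to read /proc/filesystems. This is a minimal sketch; the variable name is illustrative, and the list varies with the kernel build and the modules currently loaded.

```shell
# List the file system types the running kernel currently knows about.
# In /proc/filesystems, lines marked "nodev" are virtual file systems
# with no backing block device; the type name is always the last field.
supported_fs=$(awk '{ print $NF }' /proc/filesystems | sort)
echo "$supported_fs"
```

A type missing from this list (for example, exfat) may still become available once its kernel module is loaded.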

Linux File System Hierarchy

Linux follows the Filesystem Hierarchy Standard (FHS), a unified structure that ensures consistency across distributions. Unlike Windows (which uses drive letters like C: or D:), Linux has a single root directory (/), with all other directories branching from it.

Here’s a breakdown of key directories in the Linux hierarchy:

| Directory | Description | Examples of Contents |
| --- | --- | --- |
| / | The root directory, the top of the hierarchy. All other directories stem from here. | /home, /etc, /bin, etc. |
| /bin | Essential user binaries (executable programs) needed for system repair/recovery. | ls, cp, mv, sh (shell) |
| /boot | Files required to boot the system: kernel, initramfs, bootloader configs. | vmlinuz (kernel), initrd.img (initial RAM disk), grub (bootloader directory) |
| /dev | Device files: Linux represents hardware/software devices as files here. | /dev/sda (first SATA drive), /dev/null (null device), /dev/tty1 (terminal) |
| /etc | System-wide configuration files. | passwd (user accounts), fstab (mount points), apache2/ (web server configs) |
| /home | User home directories, where personal files are stored. | /home/alice, /home/bob |
| /lib | Shared libraries (code used by programs) and kernel modules. | libc.so (C standard library), kernel modules in /lib/modules/ |
| /media | Mount point for removable media (USB drives, CDs, external HDDs). | /media/usb, /media/cdrom |
| /mnt | Temporary mount point for manually mounting file systems. | /mnt/backup, /mnt/shared |
| /opt | Optional software (third-party apps not managed by the system package manager). | /opt/google/chrome, /opt/docker/ |
| /proc | Virtual file system (not on disk) exposing system/kernel information. | /proc/cpuinfo (CPU details), /proc/meminfo (memory usage), /proc/1/ (PID 1 process) |
| /root | Home directory for the root (administrative) user. | root’s personal files (not to be confused with /). |
| /sbin | System binaries (admin-only tools for system maintenance). | fdisk (disk partitioning), mkfs (file system creation), reboot, fsck (file system check) |
| /tmp | Temporary files (often cleared on reboot). | Caches, sockets, or scratch files used by running programs. |
| /usr | Secondary hierarchy for user utilities and applications. | /usr/bin (non-essential user binaries), /usr/share (shared data like icons), /usr/local (locally installed software) |
| /var | Variable data: files that grow/change over time (logs, databases, spool files). | /var/log (system logs), /var/www (web server files), /var/spool/mail (emails) |

Key Notes:

  • Virtual File Systems: Directories like /proc and /sys are not stored on disk—they are generated dynamically by the kernel to expose system state.
  • No Drive Letters: All storage devices (internal drives, USBs) are mounted under the root hierarchy (e.g., a USB drive might mount at /media/usb).
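One way to see the "generated dynamically" behaviour for yourself is to compare the size a /proc file reports with the number of bytes you actually get when reading it. The sketch below assumes GNU coreutils stat (the -c option); the variable names are illustrative.

```shell
# /proc/meminfo is produced by the kernel on each read, so its metadata
# records no stored size even though reading it returns data.
reported_size=$(stat -c '%s' /proc/meminfo)   # size recorded in metadata
actual_bytes=$(wc -c < /proc/meminfo)         # bytes actually returned by a read
echo "reported: $reported_size bytes, read: $actual_bytes bytes"
```

A regular file on disk would report the same number for both values.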

Common Linux File Systems

Linux supports dozens of file systems, but these are the most widely used:

1. ext4 (Fourth Extended File System)

  • Default for many Linux distributions (e.g., Ubuntu, Debian).
  • Features: Journaling (prevents data corruption after crashes), supports files up to 16 TiB, volumes up to 1 EiB, and backward compatibility with ext2/ext3.
  • Use Case: General-purpose desktops, laptops, and servers.

2. XFS

  • Default for RHEL/CentOS 7+; SUSE Linux Enterprise uses it by default for data partitions (its default root file system is Btrfs).
  • Features: High performance for large files/volumes, scalable to 8 EiB, parallel I/O support (ideal for multi-core systems).
  • Use Case: Servers handling large datasets (e.g., video editing, databases).

3. Btrfs (B-tree File System)

  • A modern, copy-on-write (CoW) file system with advanced features.
  • Features: Snapshots (point-in-time copies of a subvolume), RAID support, online resizing, and checksumming (detects data corruption and can repair it when a redundant copy exists).
  • Use Case: Systems needing flexibility (e.g., developers, power users) or data integrity (servers with critical data).

4. ZFS

  • Originally developed by Sun Microsystems, now popular in Linux for enterprise storage.
  • Features: RAID-Z (software RAID with parity), snapshots, compression, and deduplication. Not natively included in Linux kernels (licensing reasons) but available via third-party modules.
  • Use Case: High-availability servers, network-attached storage (NAS).

5. FAT32/exFAT

  • FAT32: Legacy file system for removable media (USBs, SD cards). Limited to 4 GB file sizes and 2 TiB volumes.
  • exFAT: Microsoft’s successor to FAT32, supporting large files/volumes (ideal for cross-platform USB drives).
  • Use Case: Removable media needing Windows/macOS/Linux compatibility.
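To find out which of these file systems holds a particular path on your own machine, a short sketch using df may help. It assumes a df that accepts -P and -T together (GNU coreutils does); the variable names are illustrative.

```shell
# Report which file system type holds a given path (default: current directory).
# -T adds a "Type" column; -P keeps each entry on one line so awk can parse it.
path="${1:-.}"
fs_type=$(df -PT "$path" | awk 'NR == 2 { print $2 }')
echo "$path is on a $fs_type file system"
```

Inside containers the answer is often overlay rather than one of the on-disk types listed above.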

Key Concepts: Inodes, Directories, and Links

To truly understand Linux file systems, you need to grasp three core concepts: inodes, directories, and links.

Inodes: The “Metadata Managers”

Every file/directory in Linux is associated with an inode (index node), a data structure that stores metadata about the file. Think of inodes as “digital ID cards” for files.

Inode metadata includes:

  • File size, owner, group, permissions.
  • Timestamps: atime (last access), mtime (last modification), ctime (last metadata change).
  • Pointers to data blocks (where the file’s actual content is stored on disk).

Key Fact: Filenames are not stored in inodes. Instead, filenames are labels that map to inodes (more on this below).

To view a file’s inode number, use ls -i:

ls -i example.txt  # Output: 12345 example.txt (12345 is the inode number)
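Beyond the inode number, the stat command can print most of the metadata listed above. The following is a small sketch on a scratch file, assuming GNU coreutils stat and its -c format option; the variable names are illustrative.

```shell
# Create a scratch file and print the metadata its inode holds.
scratch=$(mktemp)
echo "Hello" > "$scratch"

# The inode number is the same whichever tool reports it.
inode_from_ls=$(ls -i "$scratch" | awk '{ print $1 }')
inode_from_stat=$(stat -c '%i' "$scratch")
size_bytes=$(stat -c '%s' "$scratch")

# %i inode, %s size in bytes, %U owner, %G group, %A permissions, %y mtime
stat -c 'inode=%i size=%s owner=%U group=%G perms=%A mtime=%y' "$scratch"

rm -f "$scratch"
```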

Directories: Special Files for Organizing Inodes

A directory is a special file that contains a list of entries, where each entry maps a filename to an inode number. For example, the /home/alice directory might contain entries like:

"document.pdf" → inode 5678
"photo.jpg" → inode 9012

To list a directory’s contents with inode numbers, use ls -li:

ls -li /home/alice  # Lists filenames, inode numbers, permissions, etc.
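Because a directory entry only maps a name to an inode, renaming a file within the same file system changes the entry and leaves the inode untouched. A quick sketch to confirm this, assuming GNU stat and using made-up file names:

```shell
# Rename a file and compare its inode number before and after.
workdir=$(mktemp -d)
echo "report" > "$workdir/draft.txt"

inode_before=$(stat -c '%i' "$workdir/draft.txt")
mv "$workdir/draft.txt" "$workdir/final.txt"   # same file system: only the name changes
inode_after=$(stat -c '%i' "$workdir/final.txt")

echo "before: $inode_before, after: $inode_after"
rm -rf "$workdir"
```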

Links: Hard Links and Soft Links

Links are additional names that refer to a file. Linux supports two types: hard links and soft (symbolic) links.

A hard link is a direct reference to an inode. Creating a hard link to a file gives it a new filename, but both names point to the same inode (and thus the same data).

Key Properties:

  • Hard links share the same inode number as the original file.
  • Deleting the original file does not delete the hard link (the data remains until all links are deleted).
  • Hard links cannot cross file system boundaries (they only work within the same partition).

Example: Create a hard link to file.txt:

echo "Hello" > file.txt  # Create original file
ln file.txt file_link    # Create hard link
ls -li file.txt file_link  # Both show the same inode number
rm file.txt               # Delete original
cat file_link             # Still works! Output: "Hello"
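The reason the data survives is that each inode keeps a count of the names pointing at it, and the data is freed only when that count reaches zero. The sketch below reads the count with GNU stat's %h format; the variable names are illustrative.

```shell
# Watch the inode's link count change as names are added and removed.
workdir=$(mktemp -d)
echo "Hello" > "$workdir/file.txt"
ln "$workdir/file.txt" "$workdir/file_link"

links_with_both=$(stat -c '%h' "$workdir/file.txt")    # two names, one inode
rm "$workdir/file.txt"
links_after_rm=$(stat -c '%h' "$workdir/file_link")    # one name left

echo "link count: $links_with_both, then $links_after_rm"
rm -rf "$workdir"
```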

A soft link (symlink) is a shortcut to another file/directory. Unlike hard links, symlinks store the path to the target file, not its inode.

Key Properties:

  • Symlinks have their own inode number.
  • If the target file is deleted, the symlink becomes “broken” (points to nothing).
  • Symlinks can cross file system boundaries and link to directories.

Example: Create a symlink to file.txt:

echo "Hello" > file.txt  # Create original file
ln -s file.txt file_symlink  # Create symlink (-s for soft)
ls -li file.txt file_symlink  # Different inode numbers
rm file.txt               # Delete original
cat file_symlink          # Error: "No such file or directory"
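To inspect what a symlink stores, and to detect a broken one without triggering an error, something like the following may be useful. It is a sketch using readlink and the shell's -L and -e tests; the variable names are illustrative.

```shell
# Show the path stored in a symlink, then detect that it has become broken.
workdir=$(mktemp -d)
echo "Hello" > "$workdir/file.txt"
ln -s file.txt "$workdir/file_symlink"   # relative target, resolved from the link's directory

stored_path=$(readlink "$workdir/file_symlink")   # the path saved inside the symlink
rm "$workdir/file.txt"

# -L: the name itself is a symlink; -e: following it reaches an existing file.
if [ -L "$workdir/file_symlink" ] && [ ! -e "$workdir/file_symlink" ]; then
    link_state="broken"
else
    link_state="ok"
fi
echo "symlink stores '$stored_path' and is $link_state"
rm -rf "$workdir"
```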

Working with File Systems: Commands and Tools

Now that you understand the theory, let’s dive into practical commands for managing Linux file systems.

1. Check Disk Space: df and du

  • df (disk free): Shows free space on mounted file systems.
    Use -h for human-readable units (GB, MB):

    df -h
    # Output:
    # Filesystem      Size  Used Avail Use% Mounted on
    # /dev/sda1        20G   12G  7.5G  61% /
    # /dev/sdb1       100G   50G   50G  50% /mnt/external
  • du (disk usage): Shows space used by files/directories.
    Use -sh to get a summary of a directory:

    du -sh /home/alice  # Output: 15G	/home/alice (total size of alice’s home)
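When a disk is filling up, a common follow-up is to rank subdirectories by size. The sketch below builds a small scratch tree so it can run anywhere; on a real system you would point du at a directory such as /home or /var instead. The directory and variable names are made up for illustration.

```shell
# Build a small scratch tree, then rank its subdirectories by size.
workdir=$(mktemp -d)
mkdir "$workdir/big" "$workdir/small"
head -c 2097152 /dev/urandom > "$workdir/big/data.bin"    # 2 MiB
head -c 4096 /dev/urandom > "$workdir/small/data.bin"     # 4 KiB

# -s: one total per argument; -k: sizes in KiB so sort -n can compare them.
ranking=$(du -sk "$workdir"/*/ | sort -rn)
largest=$(echo "$ranking" | head -n 1 | awk '{ print $2 }')

echo "$ranking"
rm -rf "$workdir"
```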

2. Mounting File Systems: mount and /etc/fstab

To access a file system, you must mount it to a directory (the “mount point”).

Temporary Mounting with mount

Use mount to manually mount a device (e.g., a USB drive at /dev/sdb1):

sudo mkdir /mnt/usb       # Create a mount point
sudo mount /dev/sdb1 /mnt/usb  # Mount the drive to /mnt/usb

To unmount (when done):

sudo umount /mnt/usb  # Note: the command is "umount", not "unmount"
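Before mounting or unplugging, it can help to check whether a directory is currently a mount point. The helper below is a hypothetical sketch (is_mounted is not a standard command) that reads the kernel's mount table from /proc/mounts.

```shell
# Return success if the given directory is currently a mount point.
# Note: /proc/mounts escapes spaces in paths as \040, which this sketch ignores.
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' /proc/mounts
}

if is_mounted /mnt/usb; then
    echo "/mnt/usb is mounted; run 'sudo umount /mnt/usb' before unplugging"
else
    echo "/mnt/usb is not mounted"
fi
```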

Persistent Mounting with /etc/fstab

To auto-mount file systems at boot, edit /etc/fstab (file system table). Each line in fstab defines a mount point:

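# Format: <device> <mount point> <type> <options> <dump> <pass>
UUID=1234-ABCD  /mnt/usb  vfat  defaults  0  0

  • UUID: Unique identifier for the file system (more reliable than a device name like /dev/sdb1, which can change between boots). Find it with blkid:

    blkid /dev/sdb1  # Output: /dev/sdb1: UUID="1234-ABCD" TYPE="vfat"
  • defaults: The standard option set (read/write, mounted automatically at boot). For a removable drive, consider adding nofail so that boot does not fail or stall when the drive is absent.
  • dump/pass: The dump backup flag and the fsck order at boot. A pass value of 0 skips the boot-time check, 1 is for the root file system, and 2 is for other file systems.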
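A typo in /etc/fstab can prevent a clean boot, so a quick sanity check before rebooting is worthwhile. The sketch below checks that every active line has all six fields. The sample file and its entries (including the LABEL=backup line) are made up for illustration. Note that fstab treats missing dump and pass fields as 0, so this is a style check rather than a strict validity test.

```shell
# Check that every active line in an fstab-style file has six fields.
# Point fstab_file at /etc/fstab to check the real one.
fstab_file=$(mktemp)
cat > "$fstab_file" <<'EOF'
# <device>      <mount point>  <type>  <options>        <dump>  <pass>
UUID=1234-ABCD  /mnt/usb       vfat    defaults,nofail  0       0
LABEL=backup    /mnt/backup    ext4    defaults
EOF

# Skip comments and blank lines; report any line whose field count is not 6.
bad_lines=$(awk '!/^[[:space:]]*(#|$)/ && NF != 6 { print NR ": " $0 }' "$fstab_file")
echo "${bad_lines:-all entries have six fields}"
rm -f "$fstab_file"
```

After editing the real file, running sudo mount -a mounts everything listed in it and surfaces most errors without a reboot.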

3. File System Health: fsck

The fsck (file system check) tool checks a file system for inconsistencies and repairs them. Always unmount the file system first to avoid data loss!

Example: Check and repair /dev/sda1. If this is the root partition, it cannot be unmounted while the system is running from it, so boot from a live USB or recovery environment first:

sudo fsck /dev/sda1

  • Use -y to auto-approve repairs: sudo fsck -y /dev/sda1
  • Never run fsck on a mounted file system (risk of corruption!).
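If you want to practice without touching a real disk, one option is to create a throwaway ext4 image in a regular file and check that instead. This is a sketch under assumptions: it needs mkfs.ext4 and e2fsck (from e2fsprogs) on your PATH and skips itself otherwise; the variable names are illustrative.

```shell
# Practice a file system check on a throwaway image file, not a real disk.
fsck_demo_status="skipped"
if command -v mkfs.ext4 >/dev/null 2>&1 && command -v e2fsck >/dev/null 2>&1; then
    img=$(mktemp)
    truncate -s 16M "$img"     # 16 MiB empty file to hold the file system
    mkfs.ext4 -q -F "$img"     # -F: proceed even though this is not a block device

    # -f: check even if the file system looks clean; -n: read-only, answer "no" to fixes.
    if e2fsck -f -n "$img" >/dev/null 2>&1; then
        fsck_demo_status="clean"
    else
        fsck_demo_status="errors"
    fi
    rm -f "$img"
fi
echo "fsck demo: $fsck_demo_status"
```

No root privileges are needed here because the image is an ordinary file that is never mounted.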

Best Practices for Managing Linux File Systems

To keep your Linux system running smoothly, follow these best practices:

1. Organize Data with Separate Partitions

During installation, split the file system into partitions for critical directories like:

  • /home: Isolates user data (eases OS reinstalls).
  • /var: Prevents log/database growth from filling the root partition.
  • /tmp: Keeps temporary files from filling the root partition. Alternatively, mount /tmp as tmpfs (an in-memory file system) for faster temporary files.

2. Monitor Disk Space Regularly

Use tools like df -h or du -sh to track usage. Set up alerts (e.g., with cron jobs) that fire when usage exceeds a threshold such as 85%, so you can free space before the disk fills up and services start failing.
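A threshold check of this kind can be sketched in a few lines of shell. The threshold and target values below are illustrative, and the script only prints a message; wiring it to email or another notifier is left open.

```shell
# Warn when a file system's usage crosses a threshold.
threshold=85
target="/"

# df -P prints one line per file system; column 5 is the usage (e.g. 61%).
usage=$(df -P "$target" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }')

if [ "$usage" -ge "$threshold" ]; then
    echo "WARNING: $target is ${usage}% full (threshold ${threshold}%)"
else
    echo "OK: $target is ${usage}% full"
fi
```

Saved as a script, this could be run on a schedule from cron, for example once an hour.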

3. Use Journaled File Systems for Critical Data

For systems where data integrity is key (e.g., servers), use journaled file systems like ext4 or XFS. Journals log changes before applying them, reducing corruption risk during crashes.

4. Avoid Filling the Root Partition

The root partition (/) contains essential system files. If it fills up, the system may become unresponsive. Store large files on a separate file system, such as a dedicated /home partition or a drive mounted under /mnt.
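Whether a directory actually protects the root partition depends on whether it lives on a different file system. One way to check is to compare device IDs, as in this sketch. same_fs is a hypothetical helper name, and the %d format assumes GNU stat.

```shell
# Check whether two paths live on the same file system by comparing
# the device ID that each path's inode reports.
same_fs() {
    [ "$(stat -c '%d' "$1")" = "$(stat -c '%d' "$2")" ]
}

if same_fs / /home; then
    echo "/home shares the root file system: large files there still fill /"
else
    echo "/home is on its own file system"
fi
```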

5. Backup Important Data

Regularly back up files in /home, /etc, and /var. Use tools like rsync, tar, or cloud backups (e.g., Nextcloud).
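As a starting point, a dated tar archive might look like the sketch below. The scratch directories stand in for real data such as /home or /etc, and the archive name is only an example.

```shell
# Archive a directory with tar, then verify the archive by listing it.
src=$(mktemp -d)
dest=$(mktemp -d)
echo "important" > "$src/notes.txt"

archive="$dest/backup-$(date +%Y%m%d).tar.gz"
# -c create, -z gzip, -f archive file; -C switches into src so stored paths are relative.
tar -czf "$archive" -C "$src" .

# -t lists the archive's contents without extracting anything.
listing=$(tar -tzf "$archive")
echo "$listing"
rm -rf "$src" "$dest"
```

Listing the archive after creating it is a cheap way to catch an empty or truncated backup early.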

Conclusion

Linux file systems are a cornerstone of the OS, providing a unified, flexible way to organize data. By mastering the hierarchy, inodes, and key commands like mount, df, and fsck, you’ll be able to manage storage efficiently and troubleshoot issues with confidence.

Remember: The FHS ensures consistency across distributions, so once you learn the basics, you’ll feel at home on any Linux system. Practice with commands like ls -i (inode exploration) or df -h (disk space checks) to reinforce your skills!
