Understanding Backup Automation Basics

What is Backup Automation?

Backup automation is the process of configuring a system to create, store, and manage backups without manual intervention. It ensures backups run consistently (e.g., daily/weekly) and reduces human error.

Why Shell Scripts?

Shell scripts are a natural fit for backup automation: they are available on virtually every Unix-like system, need no extra runtime, and act as glue for the standard tools the job requires (tar, rsync, find, cron). Being plain text, they are also easy to version-control, review, and adapt.

Backup Types

  1. Full: a complete copy of the data every time. Simple to restore, but slow to create and storage-hungry.
  2. Incremental: copies only what changed since the last backup. Fast and space-efficient; this guide builds one with rsync below.
  3. Differential: copies everything changed since the last full backup, trading some storage for simpler restores.

Key Components of a Backup Shell Script

A robust backup script includes:

  1. Variables: Define sources, destinations, timestamps, and paths (e.g., SOURCE_DIR, DEST_DIR).
  2. Input Validation: Checks if the source exists or if the destination has free space.
  3. Backup Logic: Uses tools like tar (for archives) or rsync (for syncing) to create backups.
  4. Logging: Records success/failure, timestamps, and errors for debugging.
  5. Error Handling: Exits gracefully on failure and alerts the user.
  6. Cleanup: Removes old backups to avoid storage bloat.

Step-by-Step Guide to Creating a Backup Script

1. Basic Full Backup Script

Let’s start with a simple script to back up a directory to a local external drive using tar (tape archive) with gzip compression (the -z flag).

#!/bin/bash

# Basic Full Backup Script
# Backs up SOURCE_DIR to DEST_DIR as a compressed tarball

# Configuration
SOURCE_DIR="/home/user/documents"  # Directory to back up
DEST_DIR="/mnt/external_drive/backups"  # Where to store backups
TIMESTAMP=$(date +%Y%m%d_%H%M%S)  # Add timestamp to avoid overwrites
BACKUP_FILENAME="backup_${TIMESTAMP}.tar.gz"  # e.g., backup_20240520_143022.tar.gz

# Create destination directory if it doesn't exist
mkdir -p "$DEST_DIR"

# Create backup using tar (c=create, z=compress with gzip, f=file)
echo "Creating backup: $DEST_DIR/$BACKUP_FILENAME"
tar -czf "$DEST_DIR/$BACKUP_FILENAME" "$SOURCE_DIR"

# Check if backup succeeded
if [ $? -eq 0 ]; then
  echo "Backup completed successfully!"
else
  echo "ERROR: Backup failed!"
  exit 1
fi

How to Use:

  1. Save as basic_backup.sh.
  2. Make executable: chmod +x basic_backup.sh.
  3. Run: ./basic_backup.sh.

Explanation:

  1. The variables at the top keep all configuration in one place, so changing the source or destination means editing one line.
  2. The timestamp in the filename makes every run produce a unique archive instead of overwriting the last one.
  3. mkdir -p creates the destination if it is missing and is a harmless no-op if it already exists.
  4. tar -czf creates (c) a gzip-compressed (z) archive written to the named file (f).
  5. $? holds the exit status of the previous command; 0 means tar succeeded.

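A backup you have never restored is unproven. The drill below exercises the same tar flags in reverse (x to extract instead of c to create), entirely in throwaway temp directories, so it is safe to run anywhere:

```shell
#!/bin/bash
# Restore drill: back up a temp dir, extract it elsewhere, compare.
set -e
SRC=$(mktemp -d)      # stand-in for SOURCE_DIR
DEST=$(mktemp -d)     # stand-in for DEST_DIR
RESTORE=$(mktemp -d)  # scratch area for the restore

echo "important data" > "$SRC/file.txt"

# Same flags as the backup script; -C avoids absolute paths in the archive
tar -czf "$DEST/backup.tar.gz" -C "$SRC" .

# t = list contents without extracting; x = extract
tar -tzf "$DEST/backup.tar.gz" > /dev/null
tar -xzf "$DEST/backup.tar.gz" -C "$RESTORE"

diff "$SRC/file.txt" "$RESTORE/file.txt" && echo "Restore OK"
```

Note that when you archive an absolute path (as the script above does), tar strips the leading "/" from member names, so extracting recreates home/user/documents under whatever directory you extract into rather than overwriting the original.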
2. Enhanced Script with Logging and Error Handling

Improve the script by adding logging, input validation, and cleanup of old backups.

#!/bin/bash
set -euo pipefail  # Exit on error, undefined variable, or pipe failure

# Enhanced Backup Script with Logging and Cleanup
# Configuration
SOURCE_DIR="/home/user/documents"
DEST_DIR="/mnt/external_drive/backups"
LOG_FILE="/var/log/backup_script.log"
MAX_BACKUPS=30  # Delete backups older than 30 days

# Function to log messages with timestamps
log() {
  echo "[$(date +%Y-%m-%dT%H:%M:%S)] $1" >> "$LOG_FILE"
}

# Validate source directory exists
if [ ! -d "$SOURCE_DIR" ]; then
  log "ERROR: Source directory $SOURCE_DIR does not exist!"
  exit 1
fi

# Ensure destination exists and is writable
mkdir -p "$DEST_DIR"
if [ ! -w "$DEST_DIR" ]; then
  log "ERROR: Destination $DEST_DIR is not writable!"
  exit 1
fi

# Create backup
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
BACKUP_FILENAME="backup_${TIMESTAMP}.tar.gz"
BACKUP_PATH="$DEST_DIR/$BACKUP_FILENAME"

log "Starting backup: $BACKUP_PATH"
if tar -czf "$BACKUP_PATH" "$SOURCE_DIR"; then
  log "Backup succeeded: $BACKUP_PATH"
else
  log "ERROR: Backup failed for $BACKUP_PATH"
  exit 1
fi

# Clean up backups older than $MAX_BACKUPS days
log "Cleaning up backups older than $MAX_BACKUPS days..."
find "$DEST_DIR" -name "backup_*.tar.gz" -type f -mtime +"$MAX_BACKUPS" -delete >> "$LOG_FILE" 2>&1

log "Backup process completed."

Key Improvements:

  1. set -euo pipefail stops the script on any error, unset variable, or failed pipeline stage.
  2. The log function timestamps every message and appends it to $LOG_FILE for later debugging.
  3. Source and destination are validated before tar runs, so failures are caught early with a clear message.
  4. Old archives are pruned automatically with find, keeping storage under control.

One caveat: appending to /var/log/backup_script.log usually requires root, so point LOG_FILE at a user-writable path when running as a regular user.

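The effect of the set -euo pipefail line is easy to see in isolation. A small, safe-to-run demo (nothing here touches real backups):

```shell
#!/bin/bash
# Without pipefail, `false | cat` would report success, because only the
# last command's exit status counts; pipefail surfaces the failure.
set -o pipefail
if false | cat; then
  echo "pipeline reported success"
else
  echo "pipeline reported failure"
fi

# And -u turns a misspelled variable into a hard error instead of a
# silent empty string (run in a subshell so this script can continue):
( set -u; echo "${TYPO_VAR}" ) 2>/dev/null || echo "unbound variable caught"
```

Without these options, a failed tar in the middle of a pipeline could go unnoticed and the script would happily log success.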
3. Incremental Backups with rsync

For efficient incremental backups (only changed files), use rsync with --link-dest to create hard links to unchanged files (saves space).

#!/bin/bash
set -euo pipefail

# Incremental Backup with rsync
# Creates daily snapshots with hard links to unchanged files
SOURCE_DIR="/home/user/documents"
DEST_DIR="/mnt/external_drive/incremental_backups"
LATEST_LINK="$DEST_DIR/latest"  # Link to the most recent backup

# Create destination if needed
mkdir -p "$DEST_DIR"

# Create backup directory with timestamp
BACKUP_DIR="$DEST_DIR/backup_$(date +%Y%m%d_%H%M%S)"

# rsync options:
# -a: Archive mode (preserve permissions, timestamps, etc.)
# -v: Verbose (optional, for debugging)
# --link-dest: Hard-link unchanged files to $LATEST_LINK (saves space)
# On the first run there is no "latest" snapshot to link against,
# so fall back to a plain full copy
if [ -d "$LATEST_LINK" ]; then
  rsync -a --link-dest="$LATEST_LINK" "$SOURCE_DIR"/ "$BACKUP_DIR"/
else
  rsync -a "$SOURCE_DIR"/ "$BACKUP_DIR"/
fi

# Update "latest" link to point to the new backup (-n replaces the old link)
ln -sfn "$BACKUP_DIR" "$LATEST_LINK"

echo "Incremental backup created: $BACKUP_DIR"

Why This Works:

Each run creates a new timestamped directory, but rsync hard-links any file that is unchanged since the previous snapshot instead of copying it. A hard link is just another directory entry for the same data on disk, so every snapshot looks like a complete, browsable copy while unchanged files consume space only once. Restoring is simply copying files back out of whichever snapshot you need.

Common Practices and Scenarios

Scheduling with cron

Automate backups using cron (a time-based job scheduler).

Step 1: Open the crontab editor:

crontab -e

Step 2: Add a job (run daily at 2 AM):

0 2 * * * /home/user/scripts/enhanced_backup.sh  # Runs daily at 2:00 AM

Cron Syntax: the five fields are minute (0-59), hour (0-23), day of month (1-31), month (1-12), and day of week (0-7, where both 0 and 7 mean Sunday), followed by the command to run.
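A few more illustrative schedules (the script names and paths here are placeholders; substitute your own):

```shell
# m   h  dom mon dow  command
0     2  *   *   *    /home/user/scripts/enhanced_backup.sh   # daily at 02:00
0     3  *   *   0    /home/user/scripts/full_backup.sh       # weekly, Sunday 03:00
*/30  *  *   *   *    /home/user/scripts/sync_backup.sh       # every 30 minutes
0     1  1   *   *    /home/user/scripts/archive_backup.sh    # monthly, 1st at 01:00
```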

Backing Up Remote Systems

Use rsync over SSH to back up a remote server to your local machine.

# Backup remote directory to local drive via SSH
rsync -avz -e ssh user@remote.server:/home/user/data /local/backup/dir

Add to a Script:

REMOTE_USER="user"
REMOTE_SERVER="remote.server"
REMOTE_DIR="/home/user/data"
LOCAL_DEST="/local/backup/remote_data"

rsync -avz -e ssh "$REMOTE_USER@$REMOTE_SERVER:$REMOTE_DIR" "$LOCAL_DEST"

Excluding Files/Directories

Exclude temporary files or large caches using --exclude (works with tar and rsync).

Example with rsync:

rsync -av --exclude=".git" --exclude="node_modules" /source /dest

Example with tar:

tar -czf backup.tar.gz --exclude="*.tmp" /source/dir

Verifying Backups

Ensure backups are intact by verifying checksums or using rsync --checksum.

# Generate checksum for backup file
md5sum /backup/backup.tar.gz > /backup/backup.tar.gz.md5

# Verify later
md5sum -c /backup/backup.tar.gz.md5  # Outputs "OK" if valid
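Checksums detect corruption of the file itself; you can also verify the archive's internal structure without extracting it. Demonstrated here against a throwaway archive:

```shell
#!/bin/bash
set -e
TMP=$(mktemp -d)
echo "data" > "$TMP/f.txt"
tar -czf "$TMP/backup.tar.gz" -C "$TMP" f.txt

# gzip -t checks the compressed stream's CRC; tar -t walks the member index
gzip -t "$TMP/backup.tar.gz" && echo "gzip layer OK"
tar -tzf "$TMP/backup.tar.gz" > /dev/null && echo "tar index OK"
```

Either command exiting non-zero is a strong signal the backup is unusable and should be recreated.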

Rotating Old Backups

Delete backups older than 30 days (adjust -mtime +30 as needed):

# Delete files older than 30 days in DEST_DIR
find "$DEST_DIR" -name "backup_*.tar.gz" -type f -mtime +30 -delete
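An alternative to age-based rotation is count-based rotation: keep the newest N archives regardless of age. This works here because the timestamped filenames sort chronologically. A sketch, demonstrated in a temp dir with empty placeholder files:

```shell
#!/bin/bash
# Keep the newest $KEEP backups; delete the rest.
set -e
DEST_DIR=$(mktemp -d); KEEP=3
for day in 01 02 03 04 05; do
  touch "$DEST_DIR/backup_202401${day}_000000.tar.gz"
done

# Sort newest first by name, then delete everything past the first $KEEP
ls -1 "$DEST_DIR"/backup_*.tar.gz | sort -r | tail -n +$((KEEP + 1)) | xargs -r rm -f

ls "$DEST_DIR"   # the 3 newest remain
```

This pipeline is safe for the hyphen-free, space-free names the scripts above generate; for arbitrary filenames you would want a null-delimited approach instead.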

Best Practices for Reliable Backups

  1. Use Absolute Paths: Avoid relative paths and tilde expansion (e.g., /home/user/documents instead of ~/documents), especially in scripts run by cron, which starts with a minimal environment and a different working directory.
  2. Encrypt Sensitive Data: Use gpg or openssl to encrypt backups:
    # Encrypt with gpg (password-protected)
    gpg -c backup.tar.gz  # Creates backup.tar.gz.gpg
  3. Test Restores Regularly: A backup is useless if you can’t restore it!
  4. Limit Permissions: Restrict script access with chmod 700 backup_script.sh (only owner can execute).
  5. Monitor Success: Send email alerts on failure using mail:
    if ! ./backup_script.sh; then
      echo "Backup failed!" | mail -s "Backup Alert" admin@example.com
    fi
  6. Store Backups Offsite: Use external drives, network storage (NAS), or cloud storage (e.g., sync to Amazon S3 with s3cmd).

Conclusion

Automating backups with shell scripts transforms data protection from a chore into a reliable, hands-off process. By combining tools like tar, rsync, and cron, you can build flexible solutions tailored to your needs—whether full, incremental, local, or remote backups.

Start small (e.g., a basic tar script), then enhance with logging, encryption, and scheduling. Remember: the best backup is one you test regularly. With these practices, you’ll sleep easier knowing your data is safe.
