How to Debug Complex Shell Scripts Like a Pro

Table of Contents

  1. Understanding Shell Script Execution & Debugging Basics
  2. Built-in Debugging Tools: set Options & Flags
  3. Advanced Debugging Techniques
  4. Common Pitfalls & How to Debug Them
  5. Best Practices for Debug-Friendly Scripts
  6. Conclusion
  7. References

1. Understanding Shell Script Execution & Debugging Basics

Before diving into tools, it’s critical to understand how shell scripts run. Unlike programs in compiled languages, a shell script is interpreted at run time by a shell (e.g., bash, zsh): the shell reads one command at a time, parses it, expands variables and glob patterns, and executes the resulting command. Bugs often emerge during:

  • Parsing: Syntax errors (e.g., missing done in a loop).
  • Expansion: Unintended word splitting (e.g., unquoted variables with spaces).
  • Execution: Failed commands (e.g., grep returning no matches) or logic flaws (e.g., incorrect conditionals).

Debugging shell scripts involves tracing these stages to identify where the script deviates from expectations.
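
The three stages can be observed separately. The sketch below is a minimal illustration, not a diagnostic tool: the count_args helper, the temporary file, and the sample strings are all assumptions made up for the demo.

```shell
#!/bin/bash
# Hypothetical helper: reports how many arguments it received.
count_args() { echo "$#"; }

# Parsing: bash -n reads a script without running it; a missing 'done' fails here.
broken=$(mktemp)
printf 'for i in 1 2; do\n  echo "$i"\n' > "$broken"
if bash -n "$broken" 2>/dev/null; then
  parse_status="ok"
else
  parse_status="syntax error"
fi
rm -f "$broken"

# Expansion: an unquoted variable is split on whitespace into separate words.
name="Alice Smith"
unquoted_count=$(count_args $name)
quoted_count=$(count_args "$name")

# Execution: grep exits with status 1 when nothing matches.
grep_status=0
printf 'alpha\nbeta\n' | grep -q "gamma" || grep_status=$?

echo "parse=$parse_status unquoted=$unquoted_count quoted=$quoted_count grep=$grep_status"
```

Each variable isolates one stage, so a surprising value points at the stage to investigate first.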

2. Built-in Debugging Tools: set Options & Flags

The set command in Bash/Zsh controls shell behavior, including powerful debugging flags. These flags are the foundation of pro-level debugging.

Trace Execution with set -x (xtrace)

The set -x flag (short for “xtrace”) is the most widely used debugging tool. It prints each command after expansion to stderr, prefixed with +, allowing you to trace execution step-by-step.

Example Script (debug_demo.sh):

#!/bin/bash
name="Alice"
echo "Hello, $name"
for i in {1..2}; do
  echo "Count: $i"
done

Run with set -x:
Add set -x at the top of the script or run it via bash -x debug_demo.sh:

bash -x debug_demo.sh

Output:

+ name=Alice
+ echo 'Hello, Alice'
Hello, Alice
+ for i in '{1..2}'
+ echo 'Count: 1'
Count: 1
+ for i in '{1..2}'
+ echo 'Count: 2'
Count: 2

Pro Tip: Customize the trace prefix with PS4 (default: + ). For example, PS4='+ [${BASH_SOURCE}:${LINENO}]: ' adds the script name and line number to traces:

PS4='+ [${BASH_SOURCE}:${LINENO}]: ' bash -x debug_demo.sh

Output:

+ [debug_demo.sh:2]: name=Alice
+ [debug_demo.sh:3]: echo 'Hello, Alice'
Hello, Alice
...
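
In longer scripts the trace can drown out normal stderr output. One option, sketched here under the assumption of Bash 4.1 or newer, is the BASH_XTRACEFD variable, which redirects xtrace output to a file descriptor of your choice. The descriptor number 9 and the temporary file are arbitrary choices for this demo.

```shell
#!/bin/bash
# Assumption: Bash 4.1+, which introduced BASH_XTRACEFD.
trace_file=$(mktemp)          # hypothetical destination for the trace
exec 9>"$trace_file"          # open file descriptor 9 for writing
BASH_XTRACEFD=9               # send xtrace output to fd 9 instead of stderr
PS4='+ [${BASH_SOURCE}:${LINENO}]: '

set -x
name="Alice"
echo "Hello, $name"
set +x

unset BASH_XTRACEFD           # tracing goes back to stderr; bash closes fd 9
echo "Trace saved to $trace_file"
```

The script's own output stays on the terminal while the trace lands in the file, which can be inspected afterwards with any pager.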

Validate Syntax with set -n (noexec)

Use set -n (or bash -n script.sh) to check for syntax errors without executing the script. This is ideal for catching typos like missing fi in if statements.

Example:
A script with a missing done in a loop:

#!/bin/bash
for i in {1..2}; do
  echo $i
# Missing 'done' here!

Run with bash -n:

bash -n broken_loop.sh
broken_loop.sh: line 5: syntax error: unexpected end of file
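
The same check can be applied to several scripts at once, for example before committing. The following is a rough sketch; check_syntax is a hypothetical helper name and the two demo scripts are created only for illustration.

```shell
#!/bin/bash
# Hypothetical helper: syntax-check every script passed as an argument.
check_syntax() {
  local script failed=0
  for script in "$@"; do
    if bash -n "$script" 2>/dev/null; then
      echo "OK      $script"
    else
      echo "BROKEN  $script"
      failed=1
    fi
  done
  return "$failed"
}

# Demo with two throwaway scripts: one valid, one missing its 'done'.
demo_dir=$(mktemp -d)
printf 'echo "fine"\n' > "$demo_dir/good.sh"
printf 'for i in 1 2; do\n  echo "$i"\n' > "$demo_dir/broken_loop.sh"

check_syntax "$demo_dir"/*.sh || echo "At least one script has a syntax error"
rm -rf "$demo_dir"
```

Keep in mind that bash -n only validates syntax; a script that passes can still fail at run time.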

Catch Undefined Variables with set -u (nounset)

By default, the shell treats undefined variables as empty strings (e.g., echo $undefined_var prints nothing). set -u (or set -o nounset) makes the shell exit with an error when it encounters an undefined variable, preventing silent failures.

Example:

#!/bin/bash
set -u  # Enable nounset
echo "Hello, $username"  # 'username' is undefined

Output:

./script.sh: line 3: username: unbound variable
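
With nounset enabled, optional variables need an explicit fallback. The sketch below shows three standard parameter expansions that work under set -u; the variable name username and the messages are just the ones from the example above, reused for illustration.

```shell
#!/bin/bash
set -u
unset username   # start from a known state for the demo

# ${var:-default} substitutes a fallback without tripping nounset.
greeting="Hello, ${username:-guest}"
echo "$greeting"

# ${var:?message} aborts with a custom message when the variable is unset or empty.
# It runs in a subshell here so the demo itself keeps going.
if ! (: "${username:?username must be set}") 2>/dev/null; then
  echo "username is required"
fi

# ${var+x} expands to "x" only when the variable is set, which allows an explicit test.
if [ -z "${username+x}" ]; then
  username_state="unset"
else
  username_state="set"
fi
echo "username is $username_state"
```

Use the :- form for genuinely optional values and the :? form for required ones, so a missing input fails with a message that names the variable.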

Exit on Errors with set -e (errexit)

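By default, the shell continues executing after a failed command (non-zero exit code). set -e (or set -o errexit) makes the script exit as soon as a command fails, avoiding cascading errors. There are exceptions: a failure is ignored when the command is the condition of an if, while, or until, is part of a && or || list (other than the last command), or is negated with !.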

Example:

#!/bin/bash
set -e  # Exit on error
echo "Step 1"
false  # This command fails (exit code 1)
echo "Step 2"  # This line will NOT run

Output:

Step 1
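
errexit does not react to every non-zero exit code, which is a frequent source of confusion. The sketch below, using only standard commands, shows two contexts where a failure is tolerated and one where it ends the shell. The child bash process is there only so the demo can report the result.

```shell
#!/bin/bash
set -e

# A failing command used as an 'if' condition does not trigger errexit.
if grep -q "needle" /dev/null; then
  result="found"
else
  result="not found"
fi

# A failure on the left of || counts as handled; the fallback runs instead.
fallback="no"
false || fallback="yes"

# For contrast, the same failure on its own line ends a 'set -e' shell.
# It runs in a child bash here so this script can capture the exit status.
child_status=0
bash -c 'set -e; false; echo "unreachable"' || child_status=$?

echo "result=$result fallback=$fallback child_status=$child_status"
```

When a script under set -e seems to ignore a failure, check whether the command sits inside a condition or a && / || list.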

Handle Pipeline Failures with set -o pipefail

Pipelines (e.g., cmd1 | cmd2 | cmd3) return the exit code of the last command by default. If cmd1 fails but cmd3 succeeds, the pipeline exits with 0, hiding the error. set -o pipefail makes the pipeline return the exit code of the last (rightmost) command that failed, or 0 only if every command in the chain succeeds.

Example:

#!/bin/bash
set -o pipefail  # Catch pipeline errors
grep "nonexistent" file.txt | wc -l  # grep fails, but wc succeeds
echo "Exit code: $?"  # Without pipefail: 0; with pipefail: 1
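
To find out which stage of a pipeline failed, Bash also records every stage's exit code in the PIPESTATUS array. The sketch below uses subshells with fixed exit codes as stand-ins for real commands.

```shell
#!/bin/bash
set -o pipefail

# Stand-in stages: the first exits with 3, the second with 5, the third succeeds.
(exit 3) | (exit 5) | true

# Capture both values in one statement: any later command overwrites them.
pipeline_status=$? statuses=("${PIPESTATUS[@]}")

echo "pipeline exit: $pipeline_status"      # 5, from the rightmost failing stage
echo "per-stage exits: ${statuses[*]}"      # 3 5 0
```

PIPESTATUS is filled in whether or not pipefail is set, so it also helps when strict options cannot be enabled for an entire script.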

3. Advanced Debugging Techniques

Using trap to Catch Errors and Signals

The trap command lets you run code when the shell receives a signal (e.g., SIGINT from Ctrl+C) or when a command fails (via the ERR pseudo-signal). It’s invaluable for cleaning up temporary files or logging errors with context (e.g., line numbers).

Example: Log Errors with Line Numbers

#!/bin/bash
set -euo pipefail  # Combine safety flags

# Trap errors: print line number and exit
trap 'echo "ERROR: Failed at line $LINENO"; exit 1' ERR

# Simulate a failure
false  # Trigger ERR trap

Output:

ERROR: Failed at line 8
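
The cleanup use case mentioned above usually relies on the EXIT pseudo-signal rather than ERR. The following is a minimal sketch; the scratch directory, the cleanup function name, and the error message format are illustrative assumptions.

```shell
#!/bin/bash
set -Eeuo pipefail   # -E (errtrace) lets the ERR trap fire inside functions too

workdir=$(mktemp -d)   # hypothetical scratch directory for this run

# EXIT runs when the script ends, whether it finished normally or errexit stopped it.
cleanup() {
  rm -rf "$workdir"
}
trap cleanup EXIT

# ERR reports the failing command and its line number on stderr.
trap 'echo "ERROR: \"$BASH_COMMAND\" failed at line $LINENO" >&2' ERR

echo "scratch data" > "$workdir/data.txt"
echo "Working in $workdir"
```

Splitting the work this way keeps the ERR trap focused on reporting, while EXIT guarantees the temporary files are removed on every exit path.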

Structured Logging for Complex Scripts

For large scripts, echo statements are messy. Use a logging function to standardize output with timestamps, log levels (INFO/ERROR), and context.

Example Logging Function:

#!/bin/bash
set -euo pipefail

# Logging function: $1=level, $2=message (written to stderr so stdout stays clean)
log() {
  local level=$1
  local message=$2
  echo "[$(date +'%Y-%m-%d %H:%M:%S')] [$level] $message" >&2
}

log "INFO" "Starting script..."
grep "critical" data.log || log "ERROR" "Critical entries not found"
log "INFO" "Script finished"

Output:

[2024-03-20 14:30:00] [INFO] Starting script...
[2024-03-20 14:30:00] [ERROR] Critical entries not found
[2024-03-20 14:30:00] [INFO] Script finished

Interactive Debugging with bash -i or set -v

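For stubborn bugs, two more shell modes can help:

  • bash -i script.sh: Forces the shell into interactive mode while it runs the script, so interactive-only behavior (such as reading ~/.bashrc and expanding aliases) applies. It does not pause the script or open a prompt by itself; to inspect state mid-execution, add a temporary read or a trap '...' DEBUG handler that waits for input.
  • set -v (verbose mode): Prints each input line as the shell reads it, before expansion (useful for debugging variable expansion issues).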
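
The difference between verbose mode and xtrace is easiest to see side by side. The sketch below writes a two-line demo script to a temporary file (hypothetical content) and captures what each mode prints to stderr.

```shell
#!/bin/bash
# Write a two-line demo script to a temporary file.
demo=$(mktemp)
printf '%s\n' 'name="Alice"' 'echo "Hello, $name"' > "$demo"

# -v echoes each line as the shell reads it, before any expansion.
verbose_trace=$(bash -v "$demo" 2>&1 >/dev/null)

# -x echoes each command after expansion, prefixed with PS4.
# PS4 is pinned here so an inherited value cannot change the prefix.
xtrace=$(PS4='+ ' bash -x "$demo" 2>&1 >/dev/null)

rm -f "$demo"
printf 'set -v shows:\n%s\n\nset -x shows:\n%s\n' "$verbose_trace" "$xtrace"
```

Combining both (bash -xv) shows the source line followed by its expanded form, which helps when an expansion produces something unexpected.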

4. Common Pitfalls & How to Debug Them

Unquoted Variables and Word Splitting

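Problem: Unquoted variables whose values contain spaces (e.g., name="Alice Smith") get split into multiple arguments.
Debugging: Use set -x to trace expansion: a quoted value appears as one single-quoted word in the trace, while an unquoted one appears as separate words.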

Example:

#!/bin/bash
set -x
name="Alice Smith"
echo Hello, $name  # Unquoted: echo receives three words: "Hello," "Alice" "Smith"
echo "Hello, $name"  # Quoted: preserved as "Hello, Alice Smith"

Output:

+ name='Alice Smith'
+ echo Hello, Alice Smith
Hello, Alice Smith
+ echo 'Hello, Alice Smith'
Hello, Alice Smith
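
Because echo joins its arguments with spaces, both calls print the same text and the split is visible only in the trace. A command that prints one argument per line makes the difference obvious. In this sketch, show_args is a hypothetical helper written for the demo.

```shell
#!/bin/bash
# Hypothetical helper: prints each argument it receives in brackets, one per line.
show_args() {
  printf '[%s]\n' "$@"
}

name="Alice Smith"

echo "Unquoted:"
show_args Hello, $name      # three arguments: [Hello,] [Alice] [Smith]

echo "Quoted:"
show_args "Hello, $name"    # one argument: [Hello, Alice Smith]
```

The same splitting matters far more with commands such as rm, cp, or test, where each extra word is treated as a separate file or operand.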

Incorrect Loop Syntax

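Problem: Forgetting do/done, or storing a glob in a variable and quoting it in the for loop, which makes the loop run once over the literal pattern instead of over the matching files.
Debugging: Use bash -n to check syntax; set -x to trace loop execution.

Example Fix:

#!/bin/bash
set -x
# Put the glob directly in the for list: it expands to one word per matching
# file, so names with spaces stay intact.
for file in *.txt; do
  [ -e "$file" ] || continue  # No match leaves the literal pattern; skip it
  echo "Processing $file"
done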
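
When the list of files has to be stored and reused, an array is a safer container than a string. The sketch below assumes Bash and uses the nullglob option; the scratch directory and file names are invented for the demo.

```shell
#!/bin/bash
# Demo setup: a scratch directory with two .txt files, one containing a space.
workdir=$(mktemp -d)
touch "$workdir/report 1.txt" "$workdir/notes.txt"

shopt -s nullglob                # a pattern with no matches expands to nothing
files=("$workdir"/*.txt)         # one array element per file, spaces preserved
empty=("$workdir"/*.csv)         # no .csv files exist, so this array is empty
shopt -u nullglob

for file in "${files[@]}"; do
  echo "Processing ${file##*/}"
done
echo "txt files: ${#files[@]}, csv files: ${#empty[@]}"

rm -rf "$workdir"
```

Quoting "${files[@]}" expands to exactly one word per element, so the loop body never sees a split filename.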

Silent Failures in Pipelines

Problem: Pipelines hiding failed commands (e.g., curl ... | jq where curl fails).
Debugging: Use set -o pipefail to catch pipeline errors early.

5. Best Practices for Debug-Friendly Scripts

  1. Start with a Shebang: Always use #!/bin/bash (or #!/bin/zsh) to specify the shell, avoiding unexpected behavior from /bin/sh.
  2. Enable Strict Mode: Start scripts with set -euo pipefail to catch errors early:
    #!/bin/bash
    set -euo pipefail  # Exit on error, undefined vars, and pipeline failures
  3. Quote Variables: Always quote variables ("$var") to prevent word splitting.
  4. Modularize with Functions: Break scripts into functions; debug individual functions with set -x locally (e.g., my_func() { set -x; ...; set +x; }).
  5. Log Aggressively: Use a logging function (as shown earlier) to track execution flow and errors.
  6. Test Incrementally: Write and test small chunks before combining them into complex scripts.
  7. Version Control: Track changes with Git to revert if new code introduces bugs.
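
Several of these practices fit together in a small script skeleton. The following is one possible layout, not a canonical template: the log, parse_line, and main functions and the sample input are hypothetical.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical logging helper; writes to stderr so stdout stays clean for data.
log() {
  printf '[%s] [%s] %s\n' "$(date +'%Y-%m-%d %H:%M:%S')" "$1" "$2" >&2
}

# Hypothetical unit of work, traced locally while it is being debugged.
parse_line() {
  set -x
  local line=$1
  local key=${line%%=*}
  local value=${line#*=}
  { set +x; } 2>/dev/null     # stop tracing; the redirect hides the 'set +x' trace line
  echo "$key -> $value"
}

main() {
  log "INFO" "Starting"
  parse_line "color=blue"
  log "INFO" "Finished"
}

main "$@"
```

Keeping all top-level work inside main means a single function can be traced or tested without running the rest of the script.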

6. Conclusion

Debugging complex shell scripts is a skill that combines tool mastery, logical deduction, and attention to detail. By leveraging set flags (-x, -u, -e, pipefail), trap for error handling, and structured logging, you can systematically diagnose issues. Adopting best practices like strict mode, quoting variables, and modular design will prevent many bugs upfront.

Remember: The goal isn’t just to fix bugs, but to write scripts that are easy to debug in the first place. With these techniques, you’ll transform from a frustrated scripter to a pro who tames even the most unruly shell scripts.

7. References