dotlinux guide

Practical Shell Scripting for Cloud Environments

Table of Contents

  1. Fundamentals of Shell Scripting in Cloud Environments
    • Why Shell Scripting for Cloud?
    • Key Shells and Tools
    • Cloud CLI Integration
  2. Usage Methods: Getting Started
    • Environment Setup
    • Basic Script Structure
    • Interacting with Cloud Services
  3. Common Practices: Real-World Use Cases
    • Automating Deployments
    • Monitoring and Alerting
    • Cost Optimization
  4. Best Practices for Reliable and Secure Scripts
    • Error Handling
    • Idempotency
    • Security
    • Logging and Testing
  5. Conclusion

1. Fundamentals of Shell Scripting in Cloud Environments

Why Shell Scripting for Cloud?

Shell scripts excel in cloud environments for three key reasons:

  • Lightweight & Portable: They run natively on Linux/macOS systems (and via WSL on Windows) without needing runtime dependencies (e.g., Python interpreters).
  • CLI-First Integration: Cloud providers (AWS, Azure, GCP) offer robust CLIs (e.g., aws, az, gcloud), which shell scripts can directly invoke to interact with cloud resources.
  • Rapid Prototyping: Ideal for small-to-medium automation tasks (e.g., backup scripts, resource cleanup) where developing a full-fledged application would be overkill.

Key Shells and Tools

  • Bash: The most widely used shell (default on Linux and macOS). This blog focuses on Bash scripting, as it’s the de facto standard for cloud automation.
  • Zsh/Fish: Alternative shells with richer interactive features such as autocompletion; Zsh is largely Bash-compatible, while Fish uses its own syntax. Bash remains the most portable choice for cloud environments.
  • Critical Tools:
    • jq: Parses JSON output from cloud CLIs (essential for extracting data like instance IDs or bucket names).
    • curl/wget: Interact with REST APIs if CLIs aren’t available.
    • Cloud CLIs: aws (AWS), az (Azure), gcloud (GCP), oci (Oracle Cloud).
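Since jq does the heavy lifting when consuming CLI output, here is a minimal sketch of the extraction pattern. The JSON is a hand-written stand-in for aws ec2 describe-instances output, so no credentials are needed; real scripts would pipe the CLI's output directly into jq.

```shell
#!/bin/bash
# Sketch: parsing cloud-CLI-style JSON with jq.
# The JSON below mimics the shape of `aws ec2 describe-instances` output.
JSON='{"Reservations":[{"Instances":[{"InstanceId":"i-0abc1234"},{"InstanceId":"i-0def5678"}]}]}'

# Extract every instance ID, one per line
echo "$JSON" | jq -r '.Reservations[].Instances[].InstanceId'
```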

Cloud CLI Integration

Cloud CLIs are the bridge between shell scripts and cloud services. They authenticate via API keys, OAuth tokens, or IAM roles and expose operations like creating VMs, listing buckets, or scaling clusters. For example:

  • AWS CLI: aws ec2 start-instances --instance-ids i-0abc1234
  • Azure CLI: az vm stop --name my-vm --resource-group my-rg
  • GCP CLI: gcloud compute instances delete my-instance --zone us-central1-a

Shell scripts chain these commands to automate complex workflows.
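To illustrate the chaining pattern without real credentials, here is a minimal sketch in which a mock aws shell function stands in for the actual CLI; the stop/tag/start workflow is purely illustrative.

```shell
#!/bin/bash
# Sketch: chaining CLI calls into one workflow. The aws() function below is a
# mock that echoes its arguments, standing in for the real AWS CLI.
aws() {
  echo "mock-aws $*"
}

INSTANCE_ID="i-0abc1234"

# Hypothetical workflow: stop an instance, tag it, then start it again
aws ec2 stop-instances --instance-ids "$INSTANCE_ID"
aws ec2 create-tags --resources "$INSTANCE_ID" --tags Key=backup,Value=true
aws ec2 start-instances --instance-ids "$INSTANCE_ID"
```

With the real CLI in place of the mock, each command would return a nonzero exit code on failure, which the later sections use for error handling.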

2. Usage Methods: Getting Started

Environment Setup

Before writing scripts, configure your cloud environment:

1. Install Cloud CLIs

  • AWS: Use the official AWS CLI v2 installer (recommended; pip install awscli installs only the legacy v1).
  • Azure: curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash (Linux) or use the official installer.
  • GCP: curl https://sdk.cloud.google.com | bash (then run gcloud init).

2. Authenticate Securely

Never hardcode credentials in scripts. Use:

  • IAM Roles (preferred): Attach roles to cloud instances (e.g., AWS EC2 instance profiles) so scripts inherit permissions.
  • Environment Variables: Store credentials in AWS_ACCESS_KEY_ID, AZURE_CLIENT_SECRET, etc., (use a vault like HashiCorp Vault for production).
  • CLI Configuration Files: ~/.aws/credentials, the Azure CLI's ~/.azure/ directory, or ~/.config/gcloud/application_default_credentials.json (for local development only).
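When a script does rely on environment variables, it should fail fast if they are missing rather than proceed with empty credentials. A minimal sketch (the variable name follows the AWS CLI's convention; the inline demo value is for illustration only):

```shell
#!/bin/bash
# Sketch: fail fast when an expected credential variable is unset, instead of
# hardcoding secrets in the script itself.
require_env() {
  if [ -z "${!1:-}" ]; then          # ${!1} = indirect expansion (Bash-only)
    echo "Error: environment variable $1 is not set." >&2
    return 1
  fi
}

# Demo only: a real value would come from an IAM role or a vault, never inline
export AWS_ACCESS_KEY_ID="demo-value"
require_env AWS_ACCESS_KEY_ID && echo "AWS_ACCESS_KEY_ID is present"
```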

Basic Script Structure

A typical cloud shell script includes:

  • Shebang: Specifies the shell (e.g., #!/bin/bash).
  • Comments: Explain logic for readability.
  • Variables: Store reusable values (e.g., instance IDs, region).
  • Commands: Cloud CLI calls and logic (e.g., conditionals, loops).

Example: Simple S3 Bucket Check (AWS)

#!/bin/bash
# Purpose: Check if an S3 bucket exists in AWS

# Variables
BUCKET_NAME="my-unique-bucket-123"
REGION="us-east-1"

# Check if AWS CLI is installed
if ! command -v aws &> /dev/null; then
  echo "Error: aws cli is not installed. Exiting."
  exit 1
fi

# Check bucket existence
echo "Checking if bucket $BUCKET_NAME exists..."
if aws s3api head-bucket --bucket "$BUCKET_NAME" --region "$REGION" 2>/dev/null; then
  echo "Bucket $BUCKET_NAME exists."
else
  echo "Bucket $BUCKET_NAME does NOT exist."
fi

Explanation:

  • The shebang line #!/bin/bash ensures the script runs with Bash.
  • command -v aws checks if the AWS CLI is installed.
  • aws s3api head-bucket verifies bucket existence (silences errors with 2>/dev/null).

Interacting with Cloud Services

Shell scripts use cloud CLIs to perform actions like creating resources, modifying configurations, or deleting assets. Below is an example for Azure:

Example: Start/Stop Azure VM

#!/bin/bash
# Purpose: Start or stop an Azure VM based on input argument

# Check for required argument (start/stop)
if [ $# -ne 2 ]; then
  echo "Usage: $0 <start|stop> <vm-name>"
  exit 1
fi

ACTION="$1"
VM_NAME="$2"
RESOURCE_GROUP="my-resource-group"

# Validate action
if [ "$ACTION" != "start" ] && [ "$ACTION" != "stop" ]; then
  echo "Error: Action must be 'start' or 'stop'."
  exit 1
fi

# Perform action
echo "Attempting to $ACTION VM $VM_NAME in resource group $RESOURCE_GROUP..."
if az vm "$ACTION" --name "$VM_NAME" --resource-group "$RESOURCE_GROUP"; then
  echo "Successfully $ACTION VM $VM_NAME."
else
  echo "Failed to $ACTION VM $VM_NAME."
  exit 1
fi

Usage:

./manage_vm.sh start my-vm  # Starts the VM
./manage_vm.sh stop my-vm   # Stops the VM

3. Common Practices: Real-World Use Cases

Automating Deployments

Shell scripts simplify deploying applications to the cloud. For example, a script to deploy a static website to AWS S3 and invalidate CloudFront:

#!/bin/bash
# Purpose: Deploy static site to S3 and invalidate CloudFront cache

# Variables
S3_BUCKET="my-website-bucket"
CLOUDFRONT_DISTRIBUTION="E123456789ABC"
BUILD_DIR="./dist"  # Directory with static files (HTML, CSS, JS)

# Build the app (example: npm-based project)
echo "Building app..."
npm run build || { echo "Build failed!"; exit 1; }

# Sync files to S3 (--delete removes files no longer present locally).
# Note: --acl public-read fails on buckets that block ACLs (the S3 Object
# Ownership default); in that case, omit it and use a bucket policy instead.
echo "Syncing to S3..."
aws s3 sync "$BUILD_DIR" "s3://$S3_BUCKET" \
  --delete \
  --acl public-read || { echo "S3 sync failed!"; exit 1; }

# Invalidate CloudFront cache
echo "Invalidating CloudFront cache..."
aws cloudfront create-invalidation \
  --distribution-id "$CLOUDFRONT_DISTRIBUTION" \
  --paths "/*" || { echo "Invalidation failed!"; exit 1; }

echo "Deployment complete!"

Monitoring and Alerting

Scripts can monitor cloud resources (e.g., CPU usage, disk space) and trigger alerts. Example with AWS CloudWatch:

#!/bin/bash
# Purpose: Check EC2 instance CPU usage and send alert if >80%

INSTANCE_ID="i-0abc1234"
THRESHOLD=80  # CPU threshold (%)
ALARM_TOPIC="arn:aws:sns:us-east-1:123456789012:high-cpu-alerts"

# Get average CPU usage (last 5 minutes)
CPU_USAGE=$(aws cloudwatch get-metric-statistics \
  --namespace AWS/EC2 \
  --metric-name CPUUtilization \
  --dimensions Name=InstanceId,Value="$INSTANCE_ID" \
  --start-time "$(date -u +"%Y-%m-%dT%H:%M:%SZ" -d "5 minutes ago")" \
  --end-time "$(date -u +"%Y-%m-%dT%H:%M:%SZ")" \
  --period 300 \
  --statistics Average \
  --output json | jq -r '.Datapoints[0].Average // empty')

# Exit early if CloudWatch returned no datapoints (e.g., instance stopped)
if [ -z "$CPU_USAGE" ]; then
  echo "No CPU data available for $INSTANCE_ID."
  exit 0
fi

# Check if CPU exceeds threshold
if (( $(echo "$CPU_USAGE > $THRESHOLD" | bc -l) )); then
  echo "ALERT: CPU usage is $CPU_USAGE% (> $THRESHOLD%). Sending notification..."
  aws sns publish \
    --topic-arn "$ALARM_TOPIC" \
    --message "High CPU on $INSTANCE_ID: $CPU_USAGE%" \
    --subject "ALERT: High CPU Usage"
else
  echo "CPU usage is $CPU_USAGE% (below threshold). No action needed."
fi
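The threshold check pipes to bc because Bash arithmetic is integer-only and CloudWatch averages are floats. If bc is not installed on the host, awk (which is POSIX-mandated and nearly always present) is a drop-in alternative; a small sketch with a hard-coded sample value:

```shell
#!/bin/bash
# Sketch: floating-point threshold comparison without bc. awk evaluates the
# comparison and exits 0 (true) or 1 (false) for use in an if statement.
CPU_USAGE="85.3"   # hypothetical value; would come from CloudWatch as above
THRESHOLD=80

if awk -v c="$CPU_USAGE" -v t="$THRESHOLD" 'BEGIN { exit !(c > t) }'; then
  echo "above threshold"
else
  echo "below threshold"
fi
```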

Cost Optimization

Unused resources (e.g., idle VMs, old snapshots) quietly drive up cloud costs. Scripts can identify and clean them up:

#!/bin/bash
# Purpose: Delete AWS EBS snapshots older than 30 days

# Get snapshots owned by the current account, older than 30 days
echo "Finding old snapshots..."
CUTOFF=$(date -u +"%Y-%m-%dT%H:%M:%SZ" -d "30 days ago")
# Note: the start-time filter only matches exact timestamps, so the "older
# than" comparison is done with a JMESPath --query instead (ISO 8601 strings
# compare correctly as plain strings).
SNAPSHOTS=$(aws ec2 describe-snapshots \
  --owner-ids self \
  --query "Snapshots[?StartTime<='${CUTOFF}'].SnapshotId" \
  --output text)

# Delete snapshots
if [ -z "$SNAPSHOTS" ]; then
  echo "No old snapshots found."
  exit 0
fi

for SNAP_ID in $SNAPSHOTS; do
  echo "Deleting snapshot $SNAP_ID..."
  aws ec2 delete-snapshot --snapshot-id "$SNAP_ID" || {
    echo "Failed to delete $SNAP_ID. Skipping..."
  }
done

echo "Cleanup complete."
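For destructive scripts like the snapshot cleanup above, a --dry-run flag is a cheap safety net: it previews what would be deleted before anything is touched. A minimal sketch with hypothetical snapshot IDs and the delete step echoed for illustration:

```shell
#!/bin/bash
# Sketch: a --dry-run flag for destructive cleanup scripts. In the real
# script the "Deleting" branch would call `aws ec2 delete-snapshot`.
DRY_RUN=false
if [ "${1:-}" = "--dry-run" ]; then
  DRY_RUN=true
fi

SNAPSHOTS="snap-001 snap-002"   # hypothetical IDs

for SNAP_ID in $SNAPSHOTS; do
  if [ "$DRY_RUN" = true ]; then
    echo "[dry-run] would delete $SNAP_ID"
  else
    echo "Deleting $SNAP_ID..."
  fi
done
```

Running ./cleanup.sh --dry-run then lists the candidates without deleting anything; re-running without the flag performs the deletions.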

4. Best Practices for Reliable and Secure Scripts

Error Handling

  • Use set -euo pipefail: Makes scripts exit on errors, undefined variables, or failed pipeline commands.
    # Add at the top of scripts
    # -e: Exit on error; -u: Treat undefined variables as errors; -o pipefail: Exit if any command in a pipeline fails
    set -euo pipefail
  • Check Exit Codes: Explicitly verify command success with if statements:
    aws s3 cp file.txt s3://my-bucket || { echo "Copy failed!"; exit 1; }
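These two techniques combine: under set -e, a command that fails inside an if condition (or is followed by ||) does not abort the script, so specific failures can be handled while everything else still fails fast. A small sketch using a local cp failure in place of a cloud call:

```shell
#!/bin/bash
# Sketch: set -e aborts on unhandled failures, but a command tested in an
# `if` condition (or followed by ||) can be handled without aborting.
set -euo pipefail

safe_copy() {
  cp "$1" "$2" 2>/dev/null || { echo "Copy failed: $1 -> $2" >&2; return 1; }
}

if ! safe_copy /nonexistent/source /tmp/dest; then
  echo "Handled the failure; script continues."
fi
echo "Done."
```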

Idempotency

Scripts should run safely multiple times (e.g., creating a bucket only if it doesn’t exist). Use conditional checks:

# Create S3 bucket only if it doesn't exist
if ! aws s3api head-bucket --bucket "$BUCKET_NAME" 2>/dev/null; then
  aws s3api create-bucket --bucket "$BUCKET_NAME" --region "$REGION"
fi
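The same check-then-create pattern can be demonstrated locally (no credentials required) with a directory standing in for the bucket; running the function twice shows that the second invocation is a safe no-op:

```shell
#!/bin/bash
# Sketch: idempotent check-then-create, using a local directory as a
# stand-in for a cloud resource.
BASE=$(mktemp -d)
DIR="$BASE/demo"

create_once() {
  if [ ! -d "$DIR" ]; then
    mkdir "$DIR"
    echo "Created $DIR"
  else
    echo "Already exists; nothing to do"
  fi
}

create_once   # first run: creates the directory
create_once   # second run: safe no-op
rm -rf "$BASE"
```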

Security

  • Avoid Hardcoded Credentials: Use IAM roles (AWS), managed identities (Azure), or secure vaults (HashiCorp Vault).
  • Restrict Permissions: Run scripts with the least privilege (e.g., an IAM role with only s3:ListBucket for a bucket-check script).
  • Sanitize Inputs: Validate user inputs to prevent injection attacks (e.g., reject VM names with special characters).
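Input validation is typically a whitelist check before the value ever reaches a CLI command. A minimal sketch that allows only letters, digits, and hyphens (the exact allowed character set depends on the provider's naming rules):

```shell
#!/bin/bash
# Sketch: whitelist validation for user-supplied resource names before they
# are interpolated into a cloud CLI call.
validate_name() {
  [[ "$1" =~ ^[A-Za-z0-9-]+$ ]]   # Bash regex match; fails on any other char
}

validate_name "my-vm-01" && echo "ok: my-vm-01"
validate_name "bad;rm -rf /" || echo "rejected: unsafe name"
```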

Logging and Testing

  • Log to Files: Redirect output to a log file for debugging:
    LOG_FILE="/var/log/cloud-script.log"
    exec > >(tee -a "$LOG_FILE") 2>&1  # Log stdout/stderr to file and console
  • Lint with shellcheck: Catch syntax errors and bad practices:
    shellcheck my-script.sh  # Install with: sudo apt install shellcheck
  • Test with Bats: Write unit tests for scripts using the Bats framework (Bash Automated Testing System).

5. Conclusion

Shell scripting is a versatile tool for automating cloud workflows, offering simplicity, portability, and deep integration with cloud CLIs. By mastering fundamentals like CLI usage, error handling, and idempotency, and adhering to best practices for security and reliability, you can build robust scripts to manage deployments, monitoring, and cost optimization in the cloud.

Whether you’re a DevOps engineer, cloud administrator, or developer, shell scripting empowers you to turn repetitive tasks into scalable, maintainable automation—saving time and reducing human error.
