DSTW Notes

Bash Like an Engineer: Smarter Scripting for DevOps

When you think of Bash, you probably think of quick fixes, hacked-together cron jobs, or obscure one-liners you copied from Stack Overflow at 2 a.m. But Bash scripting doesn’t have to be messy or fragile. In fact, when treated with the same discipline we apply to other languages, Bash can be a powerful, safe, and maintainable tool—especially for DevOps workflows.

In this post, I’ll walk you through:

  • Bash tricks every DevOps engineer should know
  • How to write structured, testable, and debug-friendly scripts
  • Safety practices to avoid production disasters
  • A real-world example from my toolkit

Let’s dive in.


Thinking Like a Software Engineer in Bash

Many DevOps scripts start as a quick fix—and stay that way. But you can bring software engineering principles into Bash scripting with just a few simple habits:

Use Functions to Structure Your Code

Instead of writing everything in a top-down blob, split your logic into functions:

#!/usr/bin/env bash

set -euo pipefail

log_info() {
  echo "[INFO] $*"
}

backup_database() {
  # some logic here
  log_info "Starting database backup"
}

main() {
  backup_database
}

main "$@"

Why? Easier testing, readability, and future reuse. Think of main() as the script’s entrypoint, just like in C or Python.


Self-Debugging Bash: Know What Your Script is Doing

Ever had a script silently fail? Or worse, behave unpredictably? Here’s how to make it transparent and debuggable.

Enable Fail-Fast Mode

Start every serious Bash script with this:

set -euo pipefail
IFS=$'\n\t'
  • -e: exit on error
  • -u: error on unset variables
  • -o pipefail: fail if any command in a pipeline fails
  • IFS: safer word splitting

Add Traps for Debugging

Catch script exits, signals, or errors:

trap 'echo "ERROR: Script failed at line $LINENO"; exit 1' ERR
trap 'echo "Script exited cleanly."' EXIT

Now, when something goes wrong, you know where.


Trace with set -x — See What Your Script is Doing

Sometimes your Bash script doesn’t crash, but it still misbehaves. Maybe it’s skipping a step, passing the wrong variable, or silently doing something unexpected.

This is where set -x shines. It enables execution tracing, printing every command and its arguments before they run.

set -x  # Turn on command tracing

Or, if you’re running a script externally:

bash -x script.sh

Example:

#!/usr/bin/env bash
set -x

name="world"
echo "Hello, $name!"

Output:

+ name=world
+ echo 'Hello, world!'
Hello, world!

You’ll see exactly what the script is doing, which helps identify subtle bugs—like incorrect substitutions, misordered logic, or silently failing branches.

Heads-up: set -x is Extremely Verbose

⚠️ Warning: set -x can generate a flood of output, especially in larger scripts or when looping over files, parsing input, or running background processes. It can also inadvertently log sensitive data—such as passwords in command-line arguments or environment variables.

So when should you use it?

✔️ During development or debugging
❌ Not in production unless output is redirected and scrubbed
✔️ Only when explicitly enabled by a flag or environment variable

Best Practice: Conditional Tracing with DEBUG Mode

To avoid hardcoding set -x into your script, use a simple flag pattern like this:

[[ "${DEBUG:-0}" -eq 1 ]] && set -x

This checks whether the environment variable DEBUG is set to 1. If so, it enables tracing.

Now you can run:

./script.sh          # Normal mode
DEBUG=1 ./script.sh  # Debug mode with tracing

You get debug output only when you want it — and your logs stay clean by default.

Optional: Support a --debug Flag

If you prefer using CLI flags, add this:

DEBUG=0

while [[ $# -gt 0 ]]; do
  case "$1" in
    --debug) DEBUG=1 ;;
  esac
  shift
done

[[ "$DEBUG" -eq 1 ]] && set -x

Now you can run:

./script.sh --debug

Protect Sensitive Data

Always be mindful of what you log when set -x is active. For example:

PASSWORD="secret"
curl -u "admin:$PASSWORD" https://example.com  # Will leak password in trace!

If your script deals with secrets, avoid enabling full tracing, or mask critical values before logging them.
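One common pattern is to suspend tracing just around the sensitive command. Here's a sketch (the secret and the placeholder printf are hypothetical stand-ins for a real credential and curl call); redirecting the group's stderr keeps the set +x line itself out of the trace too:

```shell
#!/usr/bin/env bash

PASSWORD="s3cret"   # hypothetical secret; assigned before tracing starts

set -x
echo "doing traced work"

# Suspend tracing around the sensitive call; the redirection hides
# the 'set +x' command itself from the trace output.
{ set +x; } 2>/dev/null
printf 'auth as admin:%s\n' "$PASSWORD" >/dev/null   # stand-in for the real curl call
set -x

echo "tracing resumed"
```

The trace on stderr shows the surrounding work but never the secret.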

By using conditional DEBUG tracing, you maintain control over verbosity and security while keeping the powerful insights of set -x just a flag away.


Bash Safety 101: Guard Rails for Production Scripts

Writing production-ready Bash scripts requires a defensive mindset. Bash is powerful but unforgiving—small oversights can lead to major failures. Here’s how to reduce risk with safer scripting practices.

Quote Everything

Unquoted variables are one of the most common causes of unexpected behavior in Bash scripts. They can break commands when variables contain spaces, are empty, or have special characters.

rm -rf "$TARGET_DIR"    # Safe
rm -rf $TARGET_DIR      # Dangerous if empty or contains spaces

If $TARGET_DIR is unset or blank, the second line collapses to a bare rm -rf (an error at best, and a disaster if a path suffix follows the variable). If it contains spaces, each word is passed to rm as a separate path, so a value like "build output" would delete both build and output.
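A harmless way to see the splitting for yourself:

```shell
#!/usr/bin/env bash

# Demonstrates word splitting on an unquoted variable (no destructive commands).
dir="my docs"
printf '<%s>\n' $dir    # unquoted: splits into two arguments
printf '<%s>\n' "$dir"  # quoted: stays one argument
```

The unquoted line prints two bracketed words, <my> and <docs>; the quoted line prints a single <my docs>.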

Validate Input

Assume nothing. Scripts often fail due to missing or malformed input. Validate arguments before using them:

[[ -z "${1:-}" ]] && { echo "Usage: $0 <target_dir>"; exit 1; }

This ensures the required argument is present before the script proceeds.

Use mktemp for Temporary Files

Hardcoding temporary filenames invites collisions and race conditions. Instead, use mktemp to safely generate unique filenames:

TMP_FILE=$(mktemp)

Always clean up temporary files after use to avoid clutter or leaks.
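One way to make that cleanup automatic is an EXIT trap, so the file is removed on success and failure alike. A minimal sketch:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Create a unique temp file and remove it on any exit path.
TMP_FILE=$(mktemp)
trap 'rm -f "$TMP_FILE"' EXIT

echo "working data" > "$TMP_FILE"
wc -l < "$TMP_FILE"   # use the file...
# ...the EXIT trap deletes it when the script ends, even on error.
```

Because the trap fires on every exit, you can't forget a cleanup branch.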

Be Explicit with Destructive Commands

When using commands like rm, mv, or cp, add verbosity (-v) and consider a dry-run mode for testing. Log actions clearly before execution. It’s better to be overly cautious than to guess in production.
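One way to sketch such a dry-run mode is a wrapper function that logs every command and only executes it when dry-run is off (DRY_RUN and run are illustrative names, not a standard):

```shell
#!/usr/bin/env bash
set -euo pipefail

DRY_RUN="${DRY_RUN:-0}"

# Log each command; execute it only when not in dry-run mode.
run() {
  echo "[RUN] $*"
  [[ "$DRY_RUN" -eq 1 ]] || "$@"
}

run mkdir -pv /tmp/demo_dir
run rm -rfv /tmp/demo_dir
```

Invoking the script as DRY_RUN=1 ./cleanup.sh prints every command without touching anything, which makes rehearsing destructive runs cheap.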


Testing Your Bash Scripts

Testing isn’t just for compiled languages or high-level frameworks. Bash scripts can and should be tested—especially if they’re part of your infrastructure automation. The Bats framework makes this possible.

What is Bats?

Bats (Bash Automated Testing System) provides a simple, readable way to write test cases for your scripts. It helps you verify script behavior in isolation or as part of CI pipelines.

Example Test Case

@test "Backup completes successfully" {
  run ./my_backup.sh
  [ "$status" -eq 0 ]
}

This test runs the script and checks that it exits with a status code of 0 (success). You can also assert on the script’s output or any file system changes it performs.
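For example, a Bats test that also asserts on output might look like this (the script name and its completion message are hypothetical; run populates the $status and $output variables):

```shell
@test "Backup prints a completion message" {
  run ./my_backup.sh
  [ "$status" -eq 0 ]
  # $output holds the stdout/stderr captured by 'run'
  [[ "$output" == *"Backup complete"* ]]
}
```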

Setup Instructions

  1. Install Bats:

    git clone https://github.com/bats-core/bats-core.git
    sudo ./bats-core/install.sh /usr/local
    
  2. Create a test file:

    mkdir tests
    touch tests/test_my_script.bats
    
  3. Run tests:

    bats tests/
    

Incorporating Bats into your CI/CD pipeline allows you to catch regressions early and ensure your Bash scripts behave as expected across environments.

Example CI Testing with GitHub Actions

You can automate Bash script testing using Bats and GitHub Actions:

Example Structure:

.
├── my_script.sh
├── tests/
│   └── test_my_script.bats
└── .github/workflows/ci.yml

Sample Bats Test (tests/test_my_script.bats):

@test "Script runs without error" {
  run ./my_script.sh
  [ "$status" -eq 0 ]
}

GitHub Actions Workflow:

name: Test Bash Scripts
on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: git clone --depth=1 https://github.com/bats-core/bats-core.git
      - run: sudo ./bats-core/install.sh /usr/local
      - run: chmod +x *.sh
      - run: bats tests/

This lets you test Bash scripts like real software—automatically and reliably.


Real-World Toolkit Example: Safe Cleanup Script

Below is a production-safe Bash script to delete logs older than a certain number of days. It’s structured, safe, and ready for automation or CI integration:

#!/usr/bin/env bash

set -euo pipefail
IFS=$'\n\t'

LOG_DIR="/var/log/myapp"
DAYS_TO_KEEP=7

log() {
  echo "[$(date +'%F %T')] $*"
}

clean_old_logs() {
  log "Looking for files older than $DAYS_TO_KEEP days in $LOG_DIR"
  find "$LOG_DIR" -type f -mtime +"$DAYS_TO_KEEP" -print -delete
  log "Cleanup complete."
}

main() {
  [[ -d "$LOG_DIR" ]] || {
    log "Log directory not found: $LOG_DIR"
    exit 1
  }
  clean_old_logs
}

trap 'log "Script exited unexpectedly at line $LINENO"; exit 1' ERR
trap 'log "Script completed."' EXIT

main "$@"

Let’s walk through that script. It safely deletes old log files, a task common in log rotation and maintenance routines.

#!/usr/bin/env bash

set -euo pipefail
IFS=$'\n\t'

Fail-Fast Settings

  • -e: Exit immediately on any error.
  • -u: Treat unset variables as errors.
  • -o pipefail: Exit if any command in a pipeline fails.
  • IFS: Sets a safer internal field separator to handle file names with spaces or tabs.

LOG_DIR="/var/log/myapp"
DAYS_TO_KEEP=7

Configuration Block

Defines constants up front. Makes the script easy to adapt without editing logic later on.


log() { echo "[$(date +'%F %T')] $*"; }

Logging Function

Adds a timestamp to every message, improving traceability, especially when logs are redirected to files or viewed in CI output.


clean_old_logs() {
  log "Looking for files older than $DAYS_TO_KEEP days in $LOG_DIR"
  find "$LOG_DIR" -type f -mtime +"$DAYS_TO_KEEP" -print -delete
  log "Cleanup complete."
}

Cleanup Logic

  • find locates files older than DAYS_TO_KEEP.
  • -print shows which files will be removed.
  • -delete performs the deletion.

This combination gives visibility before destructive action is taken, and the modular design improves testability.
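To preview what would be removed before trusting the deletion, you can run the same find expression without -delete. Here's a self-contained sketch (it builds its own demo directory and assumes GNU touch, whose -d flag can backdate a file's mtime):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Demo setup: one backdated file and one fresh file (GNU touch -d).
LOG_DIR=$(mktemp -d)
touch -d '10 days ago' "$LOG_DIR/old.log"
touch "$LOG_DIR/new.log"

# Same expression as the cleanup, minus -delete: lists only old.log.
find "$LOG_DIR" -type f -mtime +7 -print

rm -rf "$LOG_DIR"
```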


main() {
  [[ -d "$LOG_DIR" ]] || { log "Log directory not found: $LOG_DIR"; exit 1; }
  clean_old_logs
}

Entrypoint

Encapsulates the logic and validates prerequisites. Ensures the script doesn’t run if the target directory doesn’t exist.


trap 'log "Script exited unexpectedly at line $LINENO"; exit 1' ERR
trap 'log "Script completed."' EXIT

Error and Exit Traps

  • Logs the line number on failure for easier debugging.
  • Confirms successful completion when the script ends.

main "$@"

Main Call

This line calls the main function, passing along all command-line arguments given to the script.

In short: main "$@" = run main() with all script arguments, safely and cleanly.
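A quick way to see why the quotes matter: "$@" preserves each argument as a single word, even when it contains spaces (args.sh is a hypothetical name for this demo):

```shell
#!/usr/bin/env bash

# Demonstrates that "$@" keeps each argument intact.
main() {
  echo "argument count: $#"
  for arg in "$@"; do
    echo "arg: $arg"
  done
}

main "$@"
```

Running ./args.sh one "two words" reports a count of 2 and keeps "two words" as one argument; an unquoted $@ would split it into three.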


Summary: Why This Script is Safe and Reliable

  • Uses fail-fast settings to avoid silent failures
  • Validates all inputs and environment dependencies
  • Provides clear, timestamped logging for observability
  • Is structured with functions, making it easier to maintain and test
  • Uses traps to improve debugging and error reporting

This pattern can be adapted for many use cases—from cleanup to backups to automation. It’s a strong foundation for writing safer Bash in real-world systems.


Final Thoughts

Bash is everywhere in DevOps—from provisioning to automation, deployment, and recovery. But too often, it’s treated like a second-class citizen. With a bit of structure, discipline, and tooling, you can make your Bash scripts:

  • Maintainable
  • Safe
  • Testable
  • Debuggable

And most importantly—trusted by your future self and your team.

Next time you write a script, don’t just make it work. Make it engineered.

#Bash #Scripting #ShellScript #30DaysOfDevOps