May 28, 2026

How to Monitor Cron Jobs: The Complete Guide

Cron jobs are the backbone of every server. Backups, cleanup scripts, certificate renewals, database dumps, report generation — all scheduled, all running in the background, all assumed to be working.

Until they're not.

The worst part about a failed cron job isn't the failure itself. It's the silence. Cron doesn't page you when a job stops running. It doesn't send a Slack message. If you're lucky, there's a log entry buried in /var/log/syslog. If you're not, you find out three weeks later when your backups are gone and you need them.

This guide covers everything you need to monitor cron jobs properly: the patterns, the tools, and the specific setup steps.

Why cron jobs fail silently

Cron has no built-in concept of "this job should have run but didn't." It fires jobs on schedule. If the job exits non-zero, cron doesn't retry. If the machine was off during the scheduled time, cron doesn't catch up. If someone deletes the crontab entry, there's no warning.

Common failure modes:

Script error — the job runs but fails partway through
Permission change — a file or directory becomes inaccessible
Dependency missing — a binary gets removed during an OS upgrade
Disk full — the job can't write output
Machine reboot — the scheduled time passes while the server is down
Crontab overwritten — another deploy or config management run wipes the entry
PATH issues — cron's environment is minimal, missing paths your shell has

All of these produce the same symptom: nothing. No output, no alert, no indication anything is wrong.
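
Several of these, PATH issues in particular, can be caught before a job ever reaches the crontab by running the command in a stripped-down environment. The following is a minimal sketch, assuming a cron whose default PATH is /usr/bin:/bin and whose shell is /bin/sh. Those values are common defaults rather than guarantees, and run_like_cron is a hypothetical helper name.

```shell
#!/bin/sh
# run_like_cron: hypothetical helper that runs a command string with an
# environment close to what cron typically provides. The PATH and SHELL
# values are assumptions based on common defaults; check your own system.
run_like_cron() {
    env -i \
        HOME="${HOME:-/}" \
        LOGNAME="${LOGNAME:-$(id -un)}" \
        PATH=/usr/bin:/bin \
        SHELL=/bin/sh \
        /bin/sh -c "$1"
}

# Example: this fails here, rather than at 3 AM, if backup.sh relies on
# binaries that live outside /usr/bin and /bin.
# run_like_cron '/usr/local/bin/backup.sh'
```

If a script passes in your login shell but fails under this helper, the difference is almost always an environment variable or a PATH entry that cron never sets.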

The dead man's switch pattern

The most reliable way to monitor cron jobs is to flip the problem around. Instead of watching for failures, expect a success signal and alert when it doesn't arrive.

This is called a dead man's switch (or heartbeat monitoring):

Your cron job runs normally
After completing successfully, it sends an HTTP ping to a monitoring endpoint
The monitor knows your job's schedule (e.g., "every day at 3 AM")
If no ping arrives within the expected window, the monitor triggers an alert

# Your crontab entry
0 3 * * * /usr/local/bin/backup.sh && curl -fsS https://monitor.example.com/ping/nightly-backup

The && is critical — curl only runs if backup.sh exits 0. A failed backup doesn't send a ping, and the monitor alerts you.

This catches every failure mode listed above. Script errors, permission issues, missing binaries, disk full, machine down, deleted crontab — all of them result in the same thing: no ping arrives, and you get alerted.
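
On the monitor side, the decision comes down to comparing the time since the last ping with the expected interval plus a grace period. The sketch below is a simplification, not how any particular tool implements it: it assumes a fixed interval in seconds rather than a parsed cron expression, and is_overdue and its arguments are hypothetical names.

```shell
#!/bin/sh
# is_overdue LAST_PING PERIOD GRACE [NOW]
# LAST_PING and NOW are Unix timestamps; PERIOD and GRACE are in seconds.
# Returns 0 (true) when more than PERIOD + GRACE seconds have passed since
# the last ping, which is the condition that should trigger an alert.
is_overdue() {
    last_ping="$1"
    period="$2"
    grace="$3"
    now="${4:-$(date +%s)}"
    [ $((now - last_ping)) -gt $((period + grace)) ]
}

# Example: a daily job (86400 s) with a 15-minute grace period (900 s).
# if is_overdue "$last_ping" 86400 900; then echo "ALERT: no ping"; fi
```

Note that the check needs no information about why the ping is missing, which is what lets one rule cover every failure mode.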

What to look for in a cron monitor

Not all monitoring tools handle cron well. General-purpose uptime monitors (UptimeRobot, Pingdom) check that a server is responding. Cron monitoring checks that a job ran on schedule. Different problem.

A good cron monitor needs:

Cron expression parsing — it should understand 0 3 * * * means "expect a ping daily around 3 AM" with a configurable grace period
Per-job tracking — separate checks for each job, not one global health endpoint
Multiple alert channels — email, webhook, Slack, PagerDuty
Failure capture — ability to receive and store job output on failure for debugging
Low overhead — adding monitoring shouldn't slow down your jobs

Option 1: Self-hosted monitoring

If you're already running your own infrastructure, a self-hosted cron monitor keeps everything under your control. No third-party dependency, no usage limits, no monthly fee.

cronguard

cronguard is a self-hosted cron monitor built for this exact use case. Single Go binary, no dependencies, runs on anything.

Setup:

# Download and run
./cronguard
# Open http://localhost:8099

Create a check with a name and schedule, then add the ping URL to your cron job:

# Basic: ping after success
0 3 * * * /usr/local/bin/backup.sh && curl -fsS http://localhost:8099/ping/nightly-backup

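# With output capture: send stdout/stderr as the ping body, but only on success.
# (A plain "backup.sh | curl" would ping even when backup.sh fails.)
0 3 * * * out=$(/usr/local/bin/backup.sh 2>&1) && echo "$out" | curl -fsS --data-binary @- http://localhost:8099/ping/nightly-backup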

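# Report failures explicitly: ping on success, hit the /fail endpoint otherwise
0 3 * * * if /usr/local/bin/backup.sh; then curl -fsS http://localhost:8099/ping/nightly-backup; else curl -fsS -X POST http://localhost:8099/ping/nightly-backup/fail; fi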

Add a webhook URL or email as the alert destination. When a job misses its window, you get notified.

Docker Compose setup for homelabs:

services:
  cronguard:
    image: narrowcastdev/cronguard:latest
    restart: unless-stopped
    ports:
      - "127.0.0.1:8099:8099"
    volumes:
      - cronguard-data:/data
    env_file:
      - .env

volumes:
  cronguard-data:

Healthchecks.io (self-hosted)

Healthchecks.io is open source and can be self-hosted. It's a Django app, so it needs Python and a database (SQLite, PostgreSQL, or MySQL), which means more operational overhead than a single binary. But it has a polished web UI, integrations with dozens of alert services, and a strong community.

The trade-off: more features, more infrastructure to maintain.

Option 2: Hosted services

If you don't want to run your own monitoring infrastructure:

Service	Free tier	Paid from	Notes
Healthchecks.io	20 checks	$20/mo	Open source, self-hostable
Cronitor	5 monitors	$24/mo	Mature, good integrations
Better Stack	5 monitors	$24/mo	Formerly Better Uptime
Dead Man's Snitch	1 snitch	$5/mo	Simple, focused

All of these work the same way: create a check, get a ping URL, add it to your cron job.

Setting up alerts properly

The monitoring tool is only as good as its alert pipeline. A few things to get right:

Grace periods

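Cron jobs don't finish at exactly the scheduled second. System load, other jobs, and clock drift all introduce variance, and because the ping is sent only after the job completes, the job's own runtime counts against the window too. For a job that finishes within a minute or two, a grace period of 5-15 minutes avoids false alerts from normal jitter. For longer jobs, add the expected runtime on top.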

Schedule: 0 3 * * *     (daily at 3:00 AM)
Grace period: 15 min     (alert if no ping by 3:15 AM)
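
Because the ping is sent only after the job finishes, the grace period has to cover the job's runtime as well as scheduling jitter. One way to pick a number is sketched below. The formula (runtime rounded up to a minute, plus a jitter margin) is an illustrative rule of thumb rather than something any monitoring tool prescribes, and grace_minutes is a hypothetical helper.

```shell
#!/bin/sh
# grace_minutes RUNTIME_SECONDS JITTER_MINUTES
# Prints a suggested grace period in minutes: the job's worst-case runtime
# rounded up to a whole minute, plus a margin for scheduling jitter.
grace_minutes() {
    runtime_seconds="$1"
    jitter_minutes="$2"
    echo $(( (runtime_seconds + 59) / 60 + jitter_minutes ))
}

# A backup that takes up to 25 minutes (1500 s), with a 10-minute margin:
grace_minutes 1500 10   # prints 35
```

The runtime figure should come from observation, for example by timing a few real runs, rather than from a guess.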

Alert fatigue

If you monitor 30 cron jobs and three of them are flaky, you'll start ignoring alerts. Fix or remove flaky jobs before adding monitoring. Every alert should mean "something is actually wrong."

Escalation

Set up two tiers:

First alert — email or Slack message. "Your nightly backup didn't run."
Repeated failure — after 2-3 missed pings, escalate to SMS or PagerDuty.

A single missed ping might be a transient network issue. Three in a row is a real problem.
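
Transient network failures can also be absorbed on the sending side, so that a brief outage between the job and the monitor is less likely to register as a missed run. The sketch below uses curl's own --max-time and --retry options. send_ping is a hypothetical helper, and the specific values (10 seconds, 3 retries) are illustrative.

```shell
#!/bin/sh
# send_ping URL
# Sends one heartbeat. Each attempt is abandoned after 10 seconds, and curl
# retries up to 3 times on errors it considers transient (such as timeouts).
send_ping() {
    curl -fsS --max-time 10 --retry 3 -o /dev/null "$1"
}

# The same flags work inline in a crontab entry:
# 0 3 * * * /usr/local/bin/backup.sh && curl -fsS --max-time 10 --retry 3 -o /dev/null https://monitor.example.com/ping/nightly-backup
```

The timeout matters as much as the retries: without it, an unresponsive monitor can keep the cron job's process alive long after the real work has finished.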

Monitoring wrapper script

Instead of adding curl to every crontab entry, use a wrapper:

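#!/bin/bash
# /usr/local/bin/cronwrap
# Usage: cronwrap <check-slug> <command...>
# Runs the command, then reports success or failure (with output) to the monitor.

if [ "$#" -lt 2 ]; then
    echo "Usage: cronwrap <check-slug> <command...>" >&2
    exit 2
fi

MONITOR_URL="${CRONGUARD_URL:-http://localhost:8099}"
SLUG="$1"
shift

# Capture stdout and stderr together, and remember the command's exit status
OUTPUT=$("$@" 2>&1)
EXIT_CODE=$?

if [ "$EXIT_CODE" -eq 0 ]; then
    ENDPOINT="${MONITOR_URL}/ping/${SLUG}"
else
    ENDPOINT="${MONITOR_URL}/ping/${SLUG}/fail"
fi

# --data-binary keeps newlines in the output (-d would strip them);
# --max-time stops an unresponsive monitor from hanging the cron job
printf '%s\n' "$OUTPUT" | curl -fsS --max-time 10 --data-binary @- "$ENDPOINT" > /dev/null

# Exit with the wrapped command's status, not curl's
exit "$EXIT_CODE"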

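Then your crontab stays clean. Use the full path to the wrapper, since /usr/local/bin is often not on cron's default PATH:

0 3 * * * /usr/local/bin/cronwrap nightly-backup /usr/local/bin/backup.sh
0 * * * * /usr/local/bin/cronwrap hourly-cleanup /usr/local/bin/cleanup.sh
0 0 1 * * /usr/local/bin/cronwrap monthly-report /usr/local/bin/report.sh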

This captures output on both success and failure, reports the correct status, and keeps the monitoring URL out of individual crontab entries.

Monitoring cron in Docker and Kubernetes

Docker

If your cron jobs run inside Docker containers, the ping needs to reach the monitor. Use Docker networking:

services:
  cronguard:
    image: narrowcastdev/cronguard:latest
    ports:
      - "127.0.0.1:8099:8099"

  backup:
    image: your-backup-image
    command: sh -c '/backup.sh && curl -fsS http://cronguard:8099/ping/backup'
    depends_on:
      - cronguard

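Services in the same Compose file share a default network and can reach each other by service name, which is why the ping URL uses cronguard as the hostname rather than localhost.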
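
One caveat with depends_on: it waits for the cronguard container to start, not for the service inside it to accept connections. If a job can fire right after startup, a short wait loop avoids a spurious failed ping. This is a minimal sketch. wait_for_monitor is a hypothetical helper, and it assumes the monitor answers a plain HTTP GET on its root URL with a success status.

```shell
#!/bin/sh
# wait_for_monitor URL [ATTEMPTS] [DELAY_SECONDS]
# Polls URL until curl gets a successful response or ATTEMPTS runs out.
# Returns 0 once the monitor is reachable, 1 otherwise.
wait_for_monitor() {
    url="$1"
    attempts="${2:-10}"
    delay="${3:-2}"
    attempt=1
    while [ "$attempt" -le "$attempts" ]; do
        if curl -fsS --max-time 5 -o /dev/null "$url"; then
            return 0
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
    return 1
}

# Inside the backup container, before the job itself:
# wait_for_monitor http://cronguard:8099/ && /backup.sh && curl -fsS http://cronguard:8099/ping/backup
```

The helper has to be available inside the job's image, along with curl itself.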

Kubernetes CronJobs

Kubernetes CronJobs have their own failure modes: pod scheduling delays, image pull failures, node pressure evictions. The dead man's switch pattern works the same way:

apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-backup
spec:
  schedule: "0 3 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: backup
            image: your-backup-image
            command:
            - sh
            - -c
            - |
              /backup.sh && \
              curl -fsS http://cronguard.monitoring:8099/ping/nightly-backup
          restartPolicy: Never

Checklist

Before you close this tab:

List every cron job running on your servers (crontab -l for each user, plus /etc/crontab, /etc/cron.d/, /etc/cron.daily/, etc.)
Set up a cron monitor (self-hosted or SaaS)
Add ping URLs to each job with && (only ping on success)
Set grace periods appropriate to each job's expected runtime
Configure at least two alert channels (email + webhook or Slack)
Test the alert pipeline — pause a job and verify you get notified
Add monitoring to your new-job checklist so future jobs are covered from day one
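
The first item on the list is the easiest to get wrong, because crontab -l shows only the current user's jobs. A rough inventory script might look like the sketch below. The function names are hypothetical, the directory list covers the usual Linux locations but may not match every distribution, and the per-user part only sees accounts listed in /etc/passwd.

```shell
#!/bin/sh
# list_system_cron [ROOT]
# Prints every regular file found in the usual system-wide cron locations.
# ROOT defaults to "/" and exists so the function can be pointed at a
# different directory tree (for example, a mounted backup or a test dir).
list_system_cron() {
    root="${1:-/}"
    for path in etc/crontab etc/cron.d etc/cron.hourly etc/cron.daily \
                etc/cron.weekly etc/cron.monthly; do
        target="${root%/}/$path"
        [ -e "$target" ] && find "$target" -type f
    done
    return 0
}

# list_user_crontabs
# Prints each local user's crontab under a "## user" header.
# Needs root, because "crontab -u" reads other users' crontabs.
list_user_crontabs() {
    cut -d: -f1 /etc/passwd | while read -r user; do
        entries=$(crontab -u "$user" -l 2>/dev/null) || continue
        printf '## %s\n%s\n' "$user" "$entries"
    done
}

# Usage (as root):
# list_system_cron
# list_user_crontabs
```

Each entry this turns up is a candidate for its own check in the monitor.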

Every server has cron jobs. Most of them aren't monitored. The gap between "my backup script runs every night" and "I know my backup script ran last night" is one HTTP ping.

→ cronguard — self-hosted cron monitoring. Single binary. Free and open source.