I spent days debugging a cron job that was "working fine"

Source: DEV Community
My storage bill kept climbing. The cron job that was supposed to clean up outdated file records was running on schedule, with no errors in the logs. But my app, which was supposed to automatically delete expired media files during the nightly cron job, wasn’t actually doing it. It took me days to figure out that the job was completing without deleting anything: it was failing silently when it tried to delete a database row that had become invalid after a migration update. I had it hosted on DigitalOcean, and even their logs showed no errors. Zero alerts. Zero indication anything was wrong. I only caught it when the bill got bad enough that I started digging.

After fixing it, I started thinking: how do I make sure this never happens again? I did what I always do. I reinvented the wheel. I built my own health check cron job wrapper: a daily cron job report card with a health check endpoint, alerting logic, everything. It took longer than I want to admit. Then after I built it, I figured someone must have