How do I fix a cron job that stopped due to clock drift?
#1
I just spent two hours trying to figure out why my app's scheduled task wasn't firing, and it turns out the system clock had drifted by a few seconds, throwing the entire cron job out of sync. Has anyone else run into this kind of subtle timing issue where a seemingly minor system clock drift caused a major failure in your scheduled processes?
#2
That exact thing happened to me too. A few seconds of clock drift a couple of days in a row pushed the nightly job just outside its window, and nothing ran. We started logging the system time against a reference and could see the drift in the timestamps.
#3
I spent days chasing code paths only to realize the problem was the worker getting blocked on GC for a few seconds. The cron part wasn't actually failing; the backlog just made it look like the job wasn't firing.
#4
We added a drift check that compares system time to a trusted NTP source every hour and sends an alert if it goes beyond a threshold. It helped us notice drift sooner, though it didn't fix the core timing once the drift happened.
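A minimal sketch of such a check, assuming Python and a raw SNTP query over UDP. The server `pool.ntp.org`, the 2-second threshold, and all function names are illustrative assumptions, not details from the setup described above. The local time is taken as the midpoint of send and receive to roughly cancel network delay.

```python
import socket
import struct
import time

NTP_EPOCH_OFFSET = 2208988800  # seconds between 1900-01-01 and 1970-01-01


def parse_transmit_time(packet: bytes) -> float:
    """Extract the server transmit timestamp (bytes 40-47) as Unix seconds."""
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_EPOCH_OFFSET + fraction / 2**32


def measure_offset(server: str = "pool.ntp.org", timeout: float = 5.0) -> float:
    """Return (local - reference) in seconds; positive means the local clock is ahead."""
    request = b"\x1b" + 47 * b"\0"  # LI=0, version 3, mode 3 (client)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sent = time.time()
        sock.sendto(request, (server, 123))
        packet, _ = sock.recvfrom(512)
        received = time.time()
    local_midpoint = (sent + received) / 2
    return local_midpoint - parse_transmit_time(packet)


def drift_exceeded(offset: float, threshold: float) -> bool:
    """True when the absolute offset is larger than the allowed threshold."""
    return abs(offset) > threshold


if __name__ == "__main__":
    THRESHOLD_SECONDS = 2.0  # assumed value; tune to the job's window
    offset = measure_offset()
    if drift_exceeded(offset, THRESHOLD_SECONDS):
        print(f"ALERT: clock offset {offset:+.3f}s exceeds {THRESHOLD_SECONDS}s")
    else:
        print(f"clock offset {offset:+.3f}s is within tolerance")
```

Run hourly (from a scheduler other than the one being monitored, ideally), this only detects drift. Correcting it is still the job of the NTP daemon.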
#5
I tried loosening the tolerance and letting a supervisor trigger the job if the queue was empty, but that just hid the symptom.
#6
Do you have a drift monitor in place, or do you rely on NTP alone?
#7
From a reliability angle, cron is brittle here; you end up chasing a non-deterministic clock rather than the code. It's a reminder that we should use monotonic clocks for measuring durations and keep the wall-clock time source for scheduling separate.
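A small sketch of that separation, assuming Python. The helper names `timed` and `next_run` are hypothetical. Durations come from `time.monotonic()`, which is unaffected by clock adjustments, while the next run time is computed from the wall clock only.

```python
import time
from datetime import datetime, timedelta


def timed(fn, *args, **kwargs):
    """Run fn and return (result, elapsed_seconds) measured on the monotonic clock."""
    start = time.monotonic()
    result = fn(*args, **kwargs)
    return result, time.monotonic() - start


def next_run(now: datetime, hour: int, minute: int) -> datetime:
    """Next wall-clock occurrence of hour:minute strictly after now."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)  # today's slot has passed, use tomorrow's
    return candidate


if __name__ == "__main__":
    result, elapsed = timed(sum, range(1_000_000))
    print(f"job took {elapsed:.4f}s")
    print("next run at", next_run(datetime.now(), hour=2, minute=0))
```

Recomputing `next_run` from the current wall-clock time on every loop, instead of sleeping for a fixed precomputed interval, means a clock correction shifts the trigger rather than skipping it.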

