2012-11-23

The Trouble with Leap Seconds, Part 3

"Okay, now let me just see if I can get this straight. You come from another planet and you're mortal there, but you're immortal here until you kill all the guys from there who have come here, and then you're mortal here unless you go back there, or some more guys from there came here, in which case you become immortal here. Again."
"Something like that."
Highlander 2

Like any other piece of software (and information generally), this comes with NO WARRANTY.

NTP is a suite of time synchronization tools, and it is not itself really "broken". Not entirely, anyway. NTP is a sophisticated piece of software implementing one of the oldest and most fundamental Internet protocols we have. It was written to expect a POSIX-compliant OS for the same reason that most of the other broken time tools out there expect it: it's pretty common. POSIX is a standard, after all. See, the problem with changing your computer clock to never worry about leap seconds is that you still want to keep it synchronized, and in real-world scenarios that means you will have to sync it with some other machine running NTP, either upstream or downstream. So if you're not playing by POSIX's rules, NTP won't understand your game. In practice this means that your TAI clock will be offset from NTP's UTC clock by an integer number of seconds, so the NTP algorithm will decide that your clock is always just a little bit off. The discrepancy exists because leap seconds are inserted into UTC but not into TAI, so UTC falls one second further behind TAI for every leap second added, on top of the 10-second offset that was already in place when leap seconds began in 1972.
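To make the size of that discrepancy concrete, here is a minimal Python sketch. It assumes a TAI minus UTC value of 35 seconds, which is the figure in effect as of this writing (after the leap second at the end of June 2012). The function and variable names are made up for illustration and are not part of NTP.

```python
from datetime import datetime, timezone

# Assumption: TAI - UTC in whole seconds as of late 2012. This number grows by
# one each time a leap second is inserted into UTC.
TAI_MINUS_UTC = 35


def apparent_offset(local_seconds, server_seconds):
    """Offset a simple clock comparison would see: local clock minus server clock."""
    return local_seconds - server_seconds


# The same instant as counted by a UTC-based server and by a TAI-based local clock.
server_utc = int(datetime(2012, 11, 23, tzinfo=timezone.utc).timestamp())
local_tai = server_utc + TAI_MINUS_UTC

# The local clock looks "fast" by a whole number of seconds.
print(apparent_offset(local_tai, server_utc))  # 35
```

Nothing is actually wrong with either clock here. They simply disagree about how to count, and a synchronization algorithm that only knows about UTC will keep trying to "correct" the difference.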

NTP was very meticulously designed. It isn't merely a wrapper around other operating system tools that tries to balance out clock offsets meekly and without a lot of fuss. It's a hulking behemoth of an application that defined a new timescale and a new epoch. Most UNIX clocks consider the beginning of time to be 1970-01-01 00:00:00 UTC. NTP selected its epoch to occur in 1900, meaning that the Titanic sinking, the stock market crash, D-Day, the moon landing, and Dennis Ritchie's birth can all be represented with positive NTP timestamps. The problem isn't so much that there is an issue with NTP or its design. It's actually really robust in many ways. The problem (one of them, anyway) is that NTP doesn't operate in a vacuum. It runs on a wide array of machines of different sizes, speeds, and architectures, with different ideas of what it said on the calendar when the zeroth second elapsed. On UNIX-like platforms, this means that NTP needs to interpret roughly when "now" is, communicate with other NTP servers doing the same thing on their local hardware, then figure out what "now" should really be. To do this, it needs to convert from UNIX time to NTP time and back, and therein lies the issue.
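The epoch conversion itself is just a fixed shift. The following Python sketch derives the shift from the two epochs and converts whole seconds in both directions. The function names are hypothetical, and the sketch ignores the fractional part of an NTP timestamp and the wraparound of its 32-bit seconds field, so treat it as an illustration rather than how any real NTP implementation does it.

```python
from datetime import datetime, timezone

NTP_EPOCH = datetime(1900, 1, 1, tzinfo=timezone.utc)
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# 70 years, 17 of them leap years. Neither scale counts leap seconds here.
NTP_UNIX_DELTA = int((UNIX_EPOCH - NTP_EPOCH).total_seconds())


def unix_to_ntp(unix_seconds):
    """Whole seconds since 1970 -> whole seconds since 1900."""
    return unix_seconds + NTP_UNIX_DELTA


def ntp_to_unix(ntp_seconds):
    """Whole seconds since 1900 -> whole seconds since 1970."""
    return ntp_seconds - NTP_UNIX_DELTA


# The day of the moon landing: negative in UNIX time, positive in NTP time.
moon_unix = int((datetime(1969, 7, 20, tzinfo=timezone.utc) - UNIX_EPOCH).total_seconds())
moon_ntp = unix_to_ntp(moon_unix)

print(NTP_UNIX_DELTA)  # 2208988800
print(moon_unix < 0, moon_ntp > 0)  # True True
```

Note what the arithmetic quietly assumes: every day is exactly 86400 seconds long on both sides of the conversion. That assumption is where leap seconds come back to bite.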

Next time: code!

See also:
Part 1
Part 2
