2012-12-08

The Trouble with Leap Seconds, Part 4

"Simple. Change the gravitational constant of the universe."
— Q, Star Trek: The Next Generation

Like any other piece of software (and information generally), this comes with NO WARRANTY.

NTP, as stated previously, requires a POSIX clock, which means it is already standing on a dodgy foundation of fictional timekeeping. If your time zone data files are all kept up-to-date, NTP should be able to figure out approximately how long ago 1970-01-01 00:00:00 UTC happened. If you aren't using POSIX-compliant time zone data, NTP concludes that your clock is wrong. It will assume your clock is off by however many leap seconds have elapsed since 1970.

There are a couple of ways, I assume, of getting around this problem. You can sandbox NTP and give it POSIX time zone files, but I've always found assembling, enforcing, and maintaining chrooted and jailed applications to be a huge administrative hassle. Additionally, ntpd needs incredibly high levels of control over your hardware in order to read and adjust the clock with low latency, so NTP doesn't work as well under sandbox conditions. This fact is one of the primary motivators that led to the formation of OpenNTPD.

Or, you know, you could change when the epoch happened.

This isn't as hard as it might seem. It took me about an hour on a Friday afternoon (after I'd already started drinking), so it's not like it takes a Herculean effort to make NTP function relatively intelligently on TAI-configured hardware.

First, I started with a recent NTP release, ntp-4.2.6p5. The source is littered with references to JAN_1970, which gets added to or subtracted from time measurements. Sure enough, JAN_1970 is defined as a constant: 0x83aa7e80, or 2,208,988,800 in decimal. This is the number of seconds between the NTP epoch (in 1900) and the UNIX epoch (in 1970). If you are so inclined, you can work out how many seconds the current TAI-UTC offset amounts to, edit JAN_1970 accordingly, and recompile. Easy!
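For anyone who wants to check that number, here is a minimal sketch of the arithmetic. The names SKETCH_JAN_1970, unix_to_ntp(), and ntp_to_unix() are made up for illustration and are not taken from the NTP source; the sketch only covers the seconds field of a timestamp in the first NTP era.

```c
#include <stdint.h>

/* 1900-01-01 to 1970-01-01 spans 70 years, 17 of which are leap years
 * (1904 through 1968; 1900 itself was not a leap year). */
#define SKETCH_DAYS_1900_TO_1970 (70u * 365u + 17u)

/* Illustrative stand-in for the JAN_1970 constant: days times 86400. */
#define SKETCH_JAN_1970 ((uint32_t)(SKETCH_DAYS_1900_TO_1970 * 86400u))

/* Hypothetical helper: UNIX seconds to NTP seconds. */
static uint32_t unix_to_ntp(uint32_t unix_secs)
{
    return unix_secs + SKETCH_JAN_1970;
}

/* Hypothetical helper: NTP seconds back to UNIX seconds. */
static uint32_t ntp_to_unix(uint32_t ntp_secs)
{
    return ntp_secs - SKETCH_JAN_1970;
}
```

Every place the real source adds or subtracts JAN_1970 is doing one of these two conversions, which is why shifting that single constant shifts the whole epoch.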

Of course, the very next time a leap second occurs, you will need to re-edit, recompile, and redeploy the updated binaries on a very narrow, one-second-long deployment schedule.

It would be better for NTP to have a lookup table that tells it how many leap seconds have transpired, so that it knows the TAI-UTC offset at all times. Rather than regularly recompiling a fundamental constant, you just update the leap second table and reload ntpd. Fortunately, such a leap second table exists. You just need to instruct NTP to use it to account for the TAI-UTC offset when it calculates when the UNIX epoch happened. I should point out that neither David Mills nor I advocate doing this. Furthermore, I should reiterate the common NTP hackers' guideline:
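To make the lookup-table idea concrete, here is a rough sketch of what such a table and its lookup could look like. This is not libtai's format or API: the struct, the table, and tai_minus_utc_at() are hypothetical, and the table is deliberately truncated to the three most recent entries (a complete one would start in 1972 with an offset of 10).

```c
#include <stddef.h>
#include <stdint.h>

/* One row: the UTC instant (as a POSIX timestamp) at which a new
 * TAI-UTC offset takes effect. Hypothetical layout for illustration. */
struct leap_entry {
    int64_t utc_start;
    int     tai_minus_utc;
};

/* Truncated table: only the three most recent changes are listed. */
static const struct leap_entry leap_table[] = {
    { 1136073600, 33 },  /* 2006-01-01 00:00:00 UTC */
    { 1230768000, 34 },  /* 2009-01-01 00:00:00 UTC */
    { 1341100800, 35 },  /* 2012-07-01 00:00:00 UTC */
};

/* Return TAI-UTC in seconds at the given UTC instant, or -1 if the
 * instant is earlier than the first entry of this truncated table. */
static int tai_minus_utc_at(int64_t utc_secs)
{
    int offset = -1;
    size_t i;

    for (i = 0; i < sizeof leap_table / sizeof leap_table[0]; i++) {
        if (utc_secs < leap_table[i].utc_start)
            break;
        offset = leap_table[i].tai_minus_utc;
    }
    return offset;
}
```

When the next leap second is announced, only the table grows by one row; nothing gets recompiled.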

If you edit how NTP works, that's great, just never put it somewhere where people could confuse it for NTP. Take it off the public network, put it behind a properly-configured packet filter, isolate it, block it, or change the port it runs on so that innocent NTP users don't accidentally run across your setup and suffer the consequences of your actions.

OK, fine. Now we're going to change history. To do this, JAN_1970 can no longer be used as a plain constant. Instead, every use of it needs to become a function call that returns a value after checking the leap seconds table. It's easy enough to search and replace this in every .c and .h file in the source. One way to do this is:

$ grep -rl JAN_1970 . | grep -E '\.[ch]$' | while IFS= read -r file; do perl -i.bak -pe 's/JAN_1970/get_jan_1970()/g' "$file"; done

$ mv -f include/ntp_unixtime.h.bak include/ntp_unixtime.h


You want to keep the JAN_1970 constant that's been #define'd (which is why the header gets restored from its backup above); you just don't want anything using it except your get_jan_1970() function. Create a new file, .../libntp/get_jan1970.c:

#include "ntp_fp.h"
#include "ntp_unixtime.h"
#include "leapsecs.h"
#include "tai.h"

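/*
 * Return the NTP-to-UNIX epoch offset, reduced by the number of leap
 * seconds that libtai's table says have elapsed as of now.
 * Returns 0 if the leap second table cannot be loaded.
 */
u_long
get_jan_1970(
       void
       )
{
       struct tai t0;
       struct tai t1;

       /* Load the leap second table; -1 signals failure */
       if (leapsecs_init() == -1)
       {
               return 0;
       }

       /* t0 = current time; t1 = same instant with leap seconds added */
       tai_now(&t0);
       t1.x = t0.x;
       leapsecs_add(&t1,0);

       /* (t1.x - t0.x) is the leap second count applied by leapsecs_add() */
       return JAN_1970 - (t1.x - t0.x);
}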

You can already see we're porting over large pieces of the libtai library, and this will break if /etc/leapsecs.dat is missing or out-of-date. Merging the new code into the NTP build system is left as a non-trivial exercise for the reader, but the end result is an NTP system that checks for leap seconds and can announce UTC-from-TAI timestamps on TAI hardware.
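If you'd rather see the arithmetic without libtai in the way, here is a stripped-down sketch. It assumes the system clock follows the convention where the clock reads the POSIX count plus the number of leap seconds inserted so far; that assumption, and the names NTP_UNIX_OFFSET, shifted_jan_1970(), and tai_clock_to_ntp(), are mine for illustration and are not part of NTP or libtai.

```c
#include <stdint.h>

/* Same value as the JAN_1970 constant discussed above. */
#define NTP_UNIX_OFFSET UINT32_C(0x83aa7e80)

/* Hypothetical helper: the epoch offset a TAI-running host would use,
 * given how many leap seconds have been inserted so far. */
static uint32_t shifted_jan_1970(uint32_t leap_count)
{
    return NTP_UNIX_OFFSET - leap_count;
}

/* Hypothetical helper: turn a TAI-style clock reading into the seconds
 * field of a UTC-based NTP timestamp.
 * ASSUMPTION: tai_clock_secs = POSIX seconds + leap_count. */
static uint32_t tai_clock_to_ntp(uint32_t tai_clock_secs, uint32_t leap_count)
{
    return tai_clock_secs + shifted_jan_1970(leap_count);
}
```

With 25 leap seconds inserted (the count as of late 2012), the shifted offset works out to 2,208,988,775, and a TAI-style clock reading comes out 25 seconds "earlier" on the wire, which is what a UTC-based NTP client expects to see.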

It's important to know that even though this seems to work in public scenarios, it should never be used for such purposes. Time synchronization for your TAI system should only be done with taiclock or sntpclock from the clockspeed package. So if you're not going to use this NTP hack to sync with other NTP servers, why do this at all? Because your edited ntpd can be used to distribute time on your private LAN to braindead machines with this ntp.conf:

server 127.127.1.0 # local clock
fudge  127.127.1.0 stratum 9

This way, even machines that aren't savvy enough to run clockspeed can get correct time from your TAI server using SNTP/NTP. Keep /etc/leapsecs.dat updated as usual and things should go smoothly. I haven't tested this extensively, but it works well enough for my purposes that I think I'm going to keep it.

See also:
Part 1
Part 2
Part 3
