86,400 Seconds Every Day
So Mills invented a sleight-of-hand, counting the last second twice, leaving 86,400 seconds in the day. Knowing it's going to happen, NTP is geared to adjust and add 36 seconds (instead of 35) when coordinating time with its atomic clock references. Atomic time is the time kept by precise atomic clocks used in the Global Positioning System (GPS) and other precise time-measurement services. Atomic clocks can't adjust to the solar day, but through the leap second, atomic time and solar time remain in sync.
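The effect of counting the last second twice can be sketched in a few lines of Python. This is only an illustration, not NTP's actual code: the function name `second_of_day` and the idea of indexing the day by elapsed atomic seconds are assumptions made for the example.

```python
SECONDS_PER_DAY = 86_400  # length of an ordinary day, in seconds


def second_of_day(elapsed_atomic_seconds: int) -> int:
    """Return the system clock's second-of-day counter on a leap-second day.

    A leap-second day lasts 86,401 atomic seconds (indices 0 through 86,400),
    but the counter may only take the values 0 through 86,399, so the final
    value is reported twice. (Hypothetical helper, for illustration only.)
    """
    if not 0 <= elapsed_atomic_seconds <= SECONDS_PER_DAY:
        raise ValueError("expected an index within a single leap-second day")
    return min(elapsed_atomic_seconds, SECONDS_PER_DAY - 1)


# The last ordinary second and the leap second share one counter value.
print(second_of_day(86_399))
print(second_of_day(86_400))
```

Because two distinct instants map to the same counter value, the counter alone cannot say which of the two a given time stamp belongs to.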
Stenn has checked with other maintainers of system time for their opinions on the Google smear and other methods. NTP's method of counting the last second twice got this endorsement from Poul-Henning Kamp, a Danish expert on computer time, who is working on improvements to NTP: "I believe this works the way Dave [Mills] originally envisioned it should, and [it] makes a semi-perverse kind of sense to do it that way."
Mills wanted system users to be able to find out when they had been caught in a leap second. Those who use a "timex" inquiry method during what would normally be the last second of June 30 will get a time stamp that reads 23:59:59. Those who inquire during the leap second will get a time stamp that reads 23:59:60, a time that normally can't occur, before the clock rolls over to the first second of July 1.
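As a rough illustration of those two readings, the Python sketch below labels each second of a leap-second day. The function `utc_label` is hypothetical and does not call the real timex interface; it only mimics the labels a caller would see, under the assumption that the day is indexed by elapsed seconds since midnight.

```python
LEAP_SECOND_INDEX = 86_400  # the extra second that follows 23:59:59


def utc_label(elapsed_seconds: int) -> str:
    """Format a second of a leap-second day as HH:MM:SS.

    Indices 0 through 86,399 are ordinary seconds; index 86,400 is the
    inserted leap second and is labeled 23:59:60. (Hypothetical helper.)
    """
    if not 0 <= elapsed_seconds <= LEAP_SECOND_INDEX:
        raise ValueError("expected an index within a single leap-second day")
    if elapsed_seconds == LEAP_SECOND_INDEX:
        return "23:59:60"
    hours, remainder = divmod(elapsed_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


print(utc_label(86_399))  # 23:59:59
print(utc_label(86_400))  # 23:59:60
```

The label is built by hand because Python's standard `datetime.time` type rejects a seconds value of 60.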
Kamp noted in an email message to Stenn, however: "So far, I have never found one single piece of not-written-by-me software that actually uses the timex API to find out what's going on, so it probably doesn't matter."
Pick Your Poison
At Google, it does matter. When the company added a second to its NTP servers in 2005, it ran into timekeeping problems on some of its widely distributed systems. The Mills sleight-of-hand confused some of its clusters, which fell out of sync with NTP time.
"Very large-scale distributed systems, like ours, demand that time be well-synchronized and expect that time always moves forwards," wrote Christopher Pascoe, Google site reliability engineer, in a blog post on Sept. 15, 2011, as another leap second adjustment approached. "Computers traditionally accommodate leap seconds by setting their clock backwards by one second at the very end of the day. But this 'repeated' second can be a problem. For example, what happens to write operations that happen during that second? Does email that comes in during that second get stored correctly? What about all the unforeseen problems that may come up with the massive number of systems and servers that we run?"
Google had already tried the smear approach in 2008. According to Pascoe's blog post: "The leap smear is talked about internally in the Site Reliability Engineering group as one of our coolest workarounds, that … ultimately saved us massive amounts of time and energy in inspecting and refactoring code. It meant that we didn't have to sweep our entire (large) codebase …"
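The general idea behind a smear is to spread the extra second across a window of time, so that clocks run very slightly slow instead of stepping back. The Python sketch below shows one way this could work, assuming a simple linear ramp and a 24-hour window. Both choices, along with the function names, are assumptions made for illustration and are not taken from Google's implementation.

```python
def smeared_offset(seconds_into_window: float, window: float = 86_400.0) -> float:
    """Return how much of the extra second has been absorbed so far.

    The result ramps linearly from 0.0 at the start of the window to 1.0
    at its end. (Linear ramp and window length are illustrative assumptions.)
    """
    if window <= 0:
        raise ValueError("window must be positive")
    clamped = min(max(seconds_into_window, 0.0), window)
    return clamped / window


def smeared_clock(true_elapsed: float, window: float = 86_400.0) -> float:
    """Return the reading of a smeared clock after true_elapsed seconds.

    The clock falls behind gradually, ending exactly one second behind the
    unsmeared count without ever moving backwards.
    """
    return true_elapsed - smeared_offset(true_elapsed, window)


print(smeared_offset(43_200.0))  # 0.5, halfway through the window
print(smeared_clock(86_400.0))   # 86399.0, one full second absorbed
```

Since each true second advances the smeared clock by slightly less than one second, time always moves forward and no second is repeated, at the cost of the clock being up to a second away from unsmeared time during the window.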
In an email message to InformationWeek on the subject earlier this month, Stenn conceded: "Operationally, this is a very nice solution." But he said he still can't accept imposing inaccurate clocks on all types of systems that use NTP just to satisfy Google's operational needs.
Stenn can see problems in both approaches. "Choose your poison," he advised at one point. But the real solution, he said, lies in more collaborative work by standards bodies and time experts. And the barrier to that, he said, is one with which he's already familiar -- no one is willing to devote money toward resolving what sometimes seem like obscure time issues.
The work done to date by two different standards bodies has resulted in two different philosophies: Either it's OK to add a leap second to any month; or it can only be added at the end of June or December.
So far, the latter holds as the convention.
The Network Time Foundation, a nonprofit umbrella group that includes the NTP project, has made one of its agenda items the creation of a General Timestamp API that would resolve such issues in the time stamp process it adopts. "Getting that implemented and accepted takes resources we do not yet have," Stenn wrote in his email exchange with InformationWeek.
How would you like to see the leap second handled? Does Google's smear approach make more sense to you, or does Mills's idea of counting the last second twice work better? Do you have a better idea of how to handle this? Tell us all about it in the comments section below.