Software developers sometimes feel the need to add temporal information to the systems they build: for instance, it might be useful to know the last time you logged in to a certain chat application, so your friends know exactly when to bug you. Maybe the system is a database and there is a strong need to maintain information about when actions take place, and that may even imply defining a total order of events. While there is nothing bad about depending on timestamps for particular features of your system, there are anomalies related to the way computers store that information that you should be aware of.
1. Realising computer clocks are weird
Time and, more importantly, its speed are generally perceived by us as constant. You can experience some anomalies when you have to wait for your doctor’s appointment or when you’re having fun, but every time you look at the clock you should see time passing at the same rate.
For the purpose of this post, let’s also assume that your relative speed is nowhere near the speed of light, and so time passes at the same rate for you as for your friendly neighbour.
How computers actually keep track of time
The device you’re using to read this post picks an arbitrary date, called the epoch, which it treats as the start of time. The current time can then be calculated as the number of ticks elapsed since that date (on Unix-like systems, these are seconds rather than CPU cycles). Here’s how you can see what the epoch is from a terminal, using the BSD/macOS form of the date command:
$ date -j -f %s 0
Thu Jan 1 01:00:00 CET 1970
This is asking: “Give me, as a string, the date at which 0 ticks have passed since the start”, which amounts to giving you the epoch.
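The same question can be asked from a program. Here is a minimal Python sketch, assuming a Unix-style epoch; it prints the result in UTC, which is the same instant as the 01:00:00 CET shown in the terminal output.

```python
from datetime import datetime, timezone

# Zero ticks (seconds) since the start of time, rendered in UTC.
epoch = datetime.fromtimestamp(0, tz=timezone.utc)
print(epoch.isoformat())  # 1970-01-01T00:00:00+00:00
```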
This number of ticks is stored somewhere in your operating system, and if you happen to store it in a fixed size variable, well, let’s just say that eventually you’ll run out of space and overflow the value. Here’s a good future example of this, and another more humorous one from 1999.
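To make the overflow concrete, here is a small Python sketch. It assumes the tick counter holds seconds in a signed 32-bit integer, which is one well-known case and not necessarily how your system stores it. The helper wrap_int32 is a made-up name for illustration; it only emulates the wrap-around, since Python integers do not overflow on their own.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit counter can hold


def wrap_int32(value: int) -> int:
    """Emulate storing value in a signed 32-bit integer (two's complement wrap)."""
    return (value + 2**31) % 2**32 - 2**31


# The last second that still fits, and what one more tick turns into.
last_good = EPOCH + timedelta(seconds=INT32_MAX)
after_overflow = EPOCH + timedelta(seconds=wrap_int32(INT32_MAX + 1))

print(last_good)       # 2038-01-19 03:14:07+00:00
print(after_overflow)  # 1901-12-13 20:45:52+00:00
```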
But I’m not here to tell you that you’ll have to worry about the future. No! If someone runs your software for more than 20 years, you should be flattered, but by that time AI will have taken over the world and maybe it can patch that error without your intervention. If not, humanity will probably have bigger problems (fighting T-1000 terminators).
Harambe the chimp
Meet Harambe. Harambe is a chimp that lives in an undisclosed zoo, whose job it is to log the time at which hippos take naps (they can’t get too much sleep). The zoo was kind enough to give her a watch with little bananas drawn on it, because they’re cool like that. But it’s one of those fancy digital clocks.
Harambe’s had an impeccable record of 10 years as a hippo nap supervisor. Her everyday log was something like this:
23:31:32 Alfred nap
23:32:13 Barold nap
23:37:52 Caesar nap
But there was one fateful day where Harambe’s record was corrupted. Zookeepers were baffled about what made Harambe write an obviously wrong timestamp. Here is her log for that day:
00:59:12 Barold nap
00:59:49 Caesar nap
00:00:13 Alfred nap
Luckily, one theoretical physicist came forward to vouch for Harambe. After a complex analysis, he said that she did not actually travel back in time to log a forgotten nap. Instead, that was the morning daylight saving time ended, and the clock was automatically set back one hour.
Harambe kept her job and was even offered a raise.
Harambe’s story is a rough reminder that programs are very literal. If you tell your program to look at the system time, be aware that it cannot infer time adjustments, and that you’ll run into anomalies like this one that can cause unexpected behaviour. It’s particularly chaotic in applications that absolutely rely on time, like calculating daily interest rates in the banking sector or buying airplane tickets for future dates.
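A program reading Harambe’s log is just as literal. The sketch below takes her entries from the fateful day and flags every one whose wall-clock timestamp is earlier than the one before it. The function name backward_jumps is made up for illustration, and since the log carries no dates, an ordinary midnight rollover would be flagged too.

```python
from datetime import time

# Harambe's log for the fateful day, in the order it was written.
log = [
    (time(0, 59, 12), "Barold nap"),
    (time(0, 59, 49), "Caesar nap"),
    (time(0, 0, 13), "Alfred nap"),
]


def backward_jumps(entries):
    """Return entries whose wall-clock timestamp is earlier than the previous one."""
    return [cur for prev, cur in zip(entries, entries[1:]) if cur[0] < prev[0]]


print(backward_jumps(log))  # only the "Alfred nap" entry is flagged
print(sorted(log) == log)   # False: sorting by timestamp changes the order
```

Note that sorting by timestamp would put Alfred’s nap first, even though it was the last one to happen. The wall clock alone cannot tell you the real order of events.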
If you want to think about very weird anomalies, think about time intervals. It’s not uncommon to check how long a piece of code takes to execute, and you usually do that by reading the system time before and after the code runs. If one of the aforementioned time adjustments were to happen while the code is executing, you might get a negative elapsed time and momentarily believe you’re looking at the fastest code in existence.
But some of you might not be convinced yet. After all, if something like daylight saving time is predictable, maybe there’s some smart trick to adjust the clock without users or programs noticing. Perhaps that could be true, but there are other time adjustments that systems developers cannot easily predict, like leap seconds, so the issue needs to be addressed in another way.
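Here is a rough Python sketch of that negative interval. Waiting for a real clock adjustment would be impractical, so it uses a fake wall clock (FakeWallClock and measure are hypothetical names) that gets set back one hour while the measured code is running.

```python
class FakeWallClock:
    """Stand-in for the system clock so the adjustment can be simulated."""

    def __init__(self, now: float):
        self.now = now

    def read(self) -> float:
        return self.now


def measure(read_clock, work) -> float:
    """Time `work` the usual way: read the clock before and after."""
    start = read_clock()
    work()
    return read_clock() - start


clock = FakeWallClock(now=1_000_000.0)


def work_during_clock_adjustment():
    clock.now += 2.0     # the work itself takes two seconds...
    clock.now -= 3600.0  # ...but the clock is set back one hour meanwhile


elapsed = measure(clock.read, work_during_clock_adjustment)
print(elapsed)  # -3598.0
```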
1.5 A brief monotonous pit-stop
Here’s a complimentary crash course in monotonicity. You’ll need it later! Let’s say you have a sequence of numbers. If it never decreases, you can call it monotonic. If it also has no repeated values, it’s strictly monotonic.
[-1,1,2,3,3,3,3,9] # monotonic
[-1,0,1,3,4,5,6,7] # strictly monotonic
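If you prefer code to definitions, the two properties can be checked like this. It is a minimal Python sketch, the function names are my own, and it only covers the increasing case discussed here.

```python
def is_monotonic(seq) -> bool:
    """True if the sequence never decreases (repeated values are allowed)."""
    return all(a <= b for a, b in zip(seq, seq[1:]))


def is_strictly_monotonic(seq) -> bool:
    """True if every value is larger than the one before it."""
    return all(a < b for a, b in zip(seq, seq[1:]))


with_repeats = [-1, 1, 2, 3, 3, 3, 3, 9]
without_repeats = [-1, 0, 1, 3, 4, 5, 6, 7]

print(is_monotonic(with_repeats), is_strictly_monotonic(with_repeats))        # True False
print(is_monotonic(without_repeats), is_strictly_monotonic(without_repeats))  # True True
```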
2. Engineers to the rescue
After probably getting burned by these pesky time anomalies a few times and hours of head scratching, operating system developers realised that a new time API was needed, and so monotonic system time was created. Programs that use such monotonic sources to obtain system time will not experience time travel to the past, and that sounds great!
You’ll be pleased to know that its implementation is also easy to understand. The operating system can return an arbitrary integer number in the following way:
// choose your preferred programming language
OS.getMonotonicTime(); // returns -123456789
OS.getMonotonicTime(); // returns -123456711
If the very large negative numbers annoy you, just imagine the returned values were 43. The point here is that these numbers don’t really make sense on their own, but you can get meaning out of them once you subtract one from the other. This solves the negative time intervals which, as I said, cause all sorts of chaos.
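In Python, for example, the standard library exposes such a clock as time.monotonic(). The sketch below times a short sleep with it. The individual readings have an undefined reference point, so only the difference between two of them is meaningful.

```python
import time

start = time.monotonic()  # arbitrary number, meaningless on its own
time.sleep(0.05)          # stand-in for the code being measured
elapsed = time.monotonic() - start

# The difference cannot be negative, even if the wall clock is adjusted meanwhile.
print(f"elapsed: {elapsed:.3f}s")
```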
A new question arises: how does the operating system adjust the monotonic time value to catch up with, or wait for, the real time? Protocols like NTP, which is widely used to synchronise the clocks of all sorts of electronic devices, can simply overwrite the old value, causing sudden jumps or discontinuities in time, which is exactly what causes the unexpected behaviours. So what are the alternatives?
If you’re curious about this sort of issue, I highly advise you to read these two articles from Google. It’s the type of thing that makes people want to work for them. They basically cast a not-so-complex spell that slows down the server clock by approximately 14 parts per million, spreading time anomalies like those damned leap seconds across a time window of your choosing (say, 20 hours).
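The 14 parts per million figure follows from simple arithmetic: one leap second spread over a 20-hour window. The Python sketch below works it out and, assuming a purely linear smear (an assumption of mine; the articles describe what Google actually does), shows how much of the leap second has been absorbed at a given point in the window. The name smeared_offset is made up for illustration.

```python
WINDOW_SECONDS = 20 * 3600  # the 20-hour window from the example above
LEAP_SECONDS = 1.0          # one leap second to hide

# How much slower the clock must run, in parts per million.
rate_ppm = LEAP_SECONDS / WINDOW_SECONDS * 1_000_000
print(round(rate_ppm, 2))  # 13.89


def smeared_offset(seconds_into_window: float) -> float:
    """Portion of the leap second already absorbed, assuming a linear smear."""
    clamped = min(max(seconds_into_window, 0.0), WINDOW_SECONDS)
    return LEAP_SECONDS * clamped / WINDOW_SECONDS


print(smeared_offset(WINDOW_SECONDS / 2))  # 0.5
```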
Outside of Google, some programming languages and runtimes add their own layer of abstraction, since this is a well-known source of bugs. C++, Java, Go and Erlang all include monotonic time in their standard library, but I’m sure there are more out there that I don’t know or care about. Despite my mixed feelings for Erlang, I have to recommend this post by Fred Hebert.
This was something I’ve been wanting to learn about for some time, and it’s possible that I might have to write some time-aware code in the near future, so that’s why this post exists. I think it conveys what I learned quite well, but I wouldn’t be surprised if there was an imprecision (or three). In that case, I’d appreciate it if you’d kindly yell at me below.