The 2038 problem is a Y2K-like scenario that will occur on January 19, 2038, at roughly 3:14 AM UTC, causing all programs running on an (unfixed) Unix system to believe that it is December 13, 1901.

The origin of this problem is the way that Unix counts time: the primary internal clock is kept as the number of seconds since the epoch, which is midnight on January 1, 1970. In the C language, these times are stored in a type called time_t, and can be manipulated using functions that convert the value into a string or into a broken-down form giving the second, minute, hour, day, month, and year. On nearly all current Unix systems, time_t is a signed 4-byte integer, and on the aforementioned date in 2038 the value of the clock will tick over from 0x7FFFFFFF to 0x80000000, which, because it is signed, is interpreted as a large negative number instead of a large positive one. So instead of adding 68 years and change to the epoch, 68 years and change will be subtracted from the epoch, putting us back to the days when men were men and women couldn't vote.
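To make the arithmetic concrete, here is a minimal C sketch of the rollover (not kernel code, just an illustration that uses a fixed-width int32_t to stand in for a 32-bit signed time_t):

    #include <stdio.h>
    #include <stdint.h>
    #include <inttypes.h>

    int main(void)
    {
        /* Stand-in for a 32-bit signed time_t: seconds since the epoch. */
        int32_t secs = INT32_MAX;   /* 0x7FFFFFFF -> 2038-01-19 03:14:07 UTC */

        printf("before tick: %" PRId32 " (0x%08" PRIX32 ")\n",
               secs, (uint32_t)secs);

        /* One more second flips the sign bit.  (The detour through uint32_t
           avoids signed-overflow undefined behaviour in the demo itself; the
           conversion back is implementation-defined but wraps on every
           two's complement machine.) */
        secs = (int32_t)((uint32_t)secs + 1);

        /* The counter now reads -2147483648: about 68 years *before* the
           epoch, i.e. 1901-12-13 20:45:52 UTC. */
        printf("after tick:  %" PRId32 " (0x%08" PRIX32 ")\n",
               secs, (uint32_t)secs);

        return 0;
    }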

Other operating systems have their own clock rollover issues, but the one affecting Unix is the only one that will happen in this century (Windows gets it in 2184, for example).

The fix is to change how time_t is defined: either to an unsigned integer type (allowing it to be used until sometime in 2106 or so), or to a 64-bit signed integer, leaving us with the less memorable 292277026596 problem. Making time_t unsigned would solve the problem, but might break applications which actually use negative time_t values (I have never seen one, but I suppose they are out there somewhere). On the other hand, increasing the size of time_t will break applications which assume time_t is 4 bytes. While Unix programmers are typically good about not making assumptions such as that pointers are 32 bits, or that long and int are the same size (because there are plenty of Unix systems out there for which these are already not true), time_t has been its current size for quite some time, and people may have made (soon-to-be-incorrect) assumptions about it.
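As a sketch of the kind of defensive coding that keeps those assumptions out of an application (one reasonable pattern, not the official fix), a program can format time_t through the widest signed integer type and do its arithmetic with difftime(), so nothing breaks whichever definition the system eventually settles on:

    #include <stdio.h>
    #include <stdint.h>
    #include <time.h>

    int main(void)
    {
        time_t now = time(NULL);

        /* Don't assume time_t fits in int or long: cast through intmax_t
           when formatting, and let the library worry about its width. */
        printf("seconds since the epoch: %jd\n", (intmax_t)now);

        /* For differences, difftime() returns a double and works no matter
           how time_t is actually represented. */
        time_t later = now + 60;
        printf("difference: %.0f seconds\n", difftime(later, now));

        return 0;
    }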

After this writeup was posted, scs noted that his DEC Alpha Linux box does use a 64-bit time_t, which I have since confirmed is the case on an Alpha Linux box I have access to. Realizing that my information about this might be out of date, I checked an AMD64 Linux machine and a Solaris 9 machine on UltraSPARC, and both of these also use 64-bit time counters. However, on Solaris 9 the time counter is 64 bits long only when the application is compiled in 64-bit mode, which is a rarity on UltraSPARC machines under any operating system (most applications do not benefit from it, and the userspace on Solaris and most SPARC Linux distributions is either 32-bit only, or primarily 32-bit with optional 64-bit support).
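For anyone who wants to check their own machine, a throwaway test like the following answers the question directly (compile it the same way your real applications are compiled, which is exactly what matters on Solaris as noted above):

    #include <stdio.h>
    #include <time.h>

    int main(void)
    {
        /* 4 bytes means the 2038 rollover applies; 8 bytes pushes the
           problem out by a few hundred billion years. */
        printf("sizeof(time_t) = %zu bytes (%zu bits)\n",
               sizeof(time_t), sizeof(time_t) * 8);
        return 0;
    }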

Compared to Y2K, the time_t wraparound is both more and less serious. On the one hand, fixing it should require significantly less work, because most applications can simply be recompiled. On the other hand, it affects every program that runs on Unix, because it is an issue with the kernel and libc, not with the application itself. The best an application can do is make sure that it doesn't assume much about how large time_t is, so that a simple recompile will fix the problem.

Many binary file formats and protocols (including gzip, PGP, and SSL) store the time as a 32-bit value, but these are nearly always specified as unsigned, meaning they won't become a major problem until 2106. Data created after 2038 may still cause problems on older systems, though, simply because the receiving application may well stuff the value into a signed 32-bit time_t (which silently overflows) and then get bad dates when it tries to process it.
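A sketch of that failure mode (the variable names here are illustrative, not taken from gzip or any real implementation): an unsigned 32-bit timestamp read from a file written after January 2038 wraps negative the moment it is stuffed into a 32-bit signed time_t, while a 64-bit time_t holds it just fine.

    #include <stdio.h>
    #include <stdint.h>
    #include <inttypes.h>

    int main(void)
    {
        /* Illustrative: an unsigned 32-bit timestamp, as a gzip-style header
           would store it, for a file written shortly after January 2038. */
        uint32_t mtime_on_disk = 0x80000001u;   /* 2038-01-19 03:14:09 UTC */

        /* On a system with a signed 32-bit time_t, this assignment silently
           wraps to a large negative value (-2147483647 on a two's
           complement machine)... */
        int32_t bad = (int32_t)mtime_on_disk;

        /* ...while a 64-bit (or unsigned) time_t keeps the value intact. */
        int64_t good = (int64_t)mtime_on_disk;

        printf("as 32-bit signed time_t: %" PRId32 "\n", bad);
        printf("as 64-bit time_t:        %" PRId64 "\n", good);
        return 0;
    }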

An old Unix curse/joke (mentioned in the Jargon File entry for tunafish) is that "<something bad> will hassle you until the time_t's wrap around". So if you have been hit with such a curse, you only have a short 33 years to survive before gaining relief from whatever has been dropped on you. Given the joke's age, it's obvious that this problem has been known for a long time and that nobody is doing much about it, primarily because it really shouldn't be too bad to fix. Probably starting around 2030 or 2035, systems will start using larger time_t values, and then everyone will have to spend a week or two fixing their applications.
