Random number generators are not the easiest technology to explain, and recently I was in the position of trying to explain a particular random number generator. It led me to some curious thoughts about cosmology and computing. Those are two topics that seem like they don’t go together, I admit, but bear with me for a minute and I promise I will draw meaningful comparisons between the two.
The paradox of random number generators is that they are, at heart, machines, which means their actions are not random at all. They generate numbers that appear to be random simply by stepping through a long, fixed sequence of values arranged to give the appearance of randomness. This sequence of numbers is called a random number stream, and given the nature of machines, it is finite, which means, eventually, it repeats. The length of a random number stream before it starts over is called the period of the random number generator.
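If that sounds abstract, here is a deliberately tiny sketch in Python. The constants are toy values chosen so the repetition is visible on one line; no real generator is this small, but the mechanical nature of the thing is the same.

```python
def tiny_lcg(seed, a=5, c=3, m=16):
    """A toy linear congruential generator: completely deterministic, period m."""
    state = seed
    while True:
        state = (a * state + c) % m   # a purely mechanical update rule
        yield state

gen = tiny_lcg(seed=7)
print([next(gen) for _ in range(20)])
# The first 16 values use up the whole stream; after that the same
# pattern starts over from the beginning.
```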
It used to be that most random number generators had a period no greater than 2^32, 2 to the 32nd power, or 4,294,967,296. That was a concession to the 32-bit integers that were the largest integer values readily available to programmers. But while 4,294,967,296 is a very large number, it is hardly an unimaginably large number. It is smaller, for example, than the number of people on Earth. It is also, and this is an important point, roughly equal to the number of clock cycles a computer runs in a single second. The computer I am writing on now, for example, operates at a frequency of 3 GHz, or 3,000,000,000 clock cycles per second. Put a traditional random number generator on a computer like this, and a program could loop through the entire random number stream every few seconds. It is a thought that makes statisticians uncomfortable.
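A quick back-of-the-envelope check, assuming the wildly optimistic case of one random number per clock cycle, shows how little room a 32-bit period leaves:

```python
period = 2**32            # the period of an old-style 32-bit generator
clock_hz = 3_000_000_000  # a 3 GHz machine, as above

print(period / clock_hz, "seconds")   # about 1.43 seconds to walk the whole stream
# A real program draws numbers far more slowly than one per clock cycle,
# but even so the stream wraps around within seconds or minutes.
```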
There is no such problem with the current-generation random number generator I was looking at. It has a period of 2^19,937, 2 to the 19,937th power, or a number so large it cannot comfortably be written. So after every 2^19,937 random numbers, the random number generator starts to repeat itself. Well, in theory, it does. In practice, that is impossible. No computer program could ever run that long.
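To get a feel for the scale: a period in this range is characteristic of the Mersenne Twister family of generators (the one behind Python's standard random module, whose period is 2^19,937 - 1), and the number itself runs to about six thousand decimal digits.

```python
digits = len(str(2**19937))
print(digits)   # 6002 decimal digits -- too long to write out comfortably
```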
I am not sticking my neck out when I say that. The age of the universe, as it turns out, is a much smaller number. It is estimated at 2^114, 2 to the 114th power, a very large number, but tiny in comparison to the capacity of the random number generator.
When I say that the age of the universe is the number 2^114, I am attempting to measure according to the Planck scale. The Planck scale, in physics, is the scale of the smallest units of space and time in which space, time, energy, and mass have any understandable meaning. It is also the smallest scale of any single physical action. The Planck scale thus puts a limit on any sequence of actions. If you could push the pace of action to its physical limit, you could still only have 2^114 actions in sequence from the origin of the universe to the present. For various physical reasons, such as temperature, the pace of any controllable sequence of actions is much slower than this. Still, the Planck scale provides a useful point of reference as the outer limit of physical action.
I am not saying that it would take a billion universes to generate all the available random numbers. The number in question is much, much larger than a billion. It would take more universes than you can imagine to provide enough time to generate the entire sequence of 2^19,937 random numbers.
It is a peculiar way of looking at something as commonplace as an ordinary computer, but it is absolutely true: the potential action of an ordinary computer is on a scale larger than that of the entire universe. That is, if you could somehow provide your computer with all the time and energy in the entire universe, it still could not get through everything it is digitally designed to do.
The age of the universe is one thing, but we are closer to the practical limits of digital data than that. We are, I dare say, bumping into them already.
Computers operate on a certain number of bits, or binary digits, at one time. This grouping of information is called a word. Current high-end computers have a 64-bit architecture and are operating on 64-bit words.
A 64-bit word is not all that big. It is literally the size of an 8-letter word. It is not even that hard to imagine all the 64-bit words that are possible. The easiest way to do this is to think in terms of hexadecimal digits. These are the familiar digits 0 through 9, plus the letters A through F. All the possible 64-bit words can be written as sequences of 16 hexadecimal digits. For example, 0123456789ABCDEF represents one of the possible 64-bit words; AAAA1111BBBB2222 is another. It is not hard to imagine the possible combinations of 16 hexadecimal digits.
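Here are the two example words above, written out both ways in a short Python sketch:

```python
w1 = 0x0123456789ABCDEF   # the two example words from the text
w2 = 0xAAAA1111BBBB2222

for w in (w1, w2):
    print(f"{w:016X}  =  {w:,}")   # 16 hex digits, then the same value in decimal

print(f"{2**64:,} possible 64-bit words in all")   # 18,446,744,073,709,551,616
```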
It is easy to imagine, yet you could never write them all out. The number of possible 16-hexadecimal-digit values, which is the same as the number of distinct 64-bit words in computing, is 2^64, 2 to the 64th power. By comparison, the number of seconds in a year is roughly 2^25, 2 to the 25th power. A theoretical person writing one value per second for a year, or a billion years for that matter, would not begin to approach the complete set of 64-bit words.
Computers take actions faster than people do. A person can do perhaps one distinct action in a second. A computer can do perhaps one distinct action (in sequence) in a nanosecond. But a year is about 2^55, 2 to the 55th power, nanoseconds, still far short of the number of distinct 64-bit word values. In its useful lifetime, even with theoretically ideal programming, a computer still could not generate all the possible 64-bit words.
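Working those last two comparisons out explicitly, with one value per second for the person and one per nanosecond for a machine running flat out:

```python
import math

seconds_per_year = 365 * 24 * 3600        # about 3.15e7
ns_per_year = seconds_per_year * 10**9    # about 3.15e16

print(math.log2(seconds_per_year))        # ~24.9, i.e. roughly 2**25
print(math.log2(ns_per_year))             # ~54.8, i.e. roughly 2**55

# Even at one 64-bit value per nanosecond, with no pauses at all:
print(2**64 / ns_per_year, "years")       # roughly 585 years of non-stop counting
```

Several centuries of uninterrupted counting is comfortably beyond any machine's useful lifetime.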
In practice, most of the 64-bit words don’t occur in computing, even though they could. There simply isn’t time to use them all.
And I haven’t even mentioned that the average computer spends 99.9 percent of its processing time waiting. That means it is executing the no-operation instruction, which is just a single 64-bit value if it is a 64-bit computer, over and over again. On top of that, most computers are actually powered off most of the time. In practice, by random chance, I could believe a computer might run through something like 2^40, 2 to the 40th power, or about a trillion, different 64-bit word values over the course of a year.
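That 2^40 figure is only a guess on my part, but it is easy to see what it amounts to:

```python
print(f"{2**40:,}")      # 1,099,511,627,776 -- about a trillion
print(2**40 / 2**64)     # about 6e-8: the fraction of all 64-bit words touched
```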
And the rest of the digital values? You could almost think of them as “vanity” values, included in the design of the computer processor in case they might occur, even though they don’t actually occur.
If 64-bit computing is already this far down the road to digital vanity, I have to imagine that 128-bit computing, though serious computer scientists are working on it, has to be some kind of cosmic mistake. The number of possible values of a 128-bit word is 2^128, 2 to the 128th power. But remember the age of the universe? It is a smaller number, 2^114. If a theoretically ideal computer ran for the entire life of the universe up to the present, it still could not possibly use more than a tiny fraction of the theoretically available 128-bit words.
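Using the same 2^114 figure as before, the arithmetic of that tiny fraction is simple enough:

```python
fraction = 2**114 / 2**128
print(fraction)          # 2**-14, about 0.006 percent of all possible 128-bit words
```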
A 64-bit processor is already comparable to the informational capacity of the physical universe. So how much could really be gained by expanding to a 128-bit processor?
The digital width of computer processors has doubled three times, from 8 bits to 16 to 32 to 64. But after comparing the theoretical digital capacity of a 64-bit processor to the informational capacity of the physical universe as a whole, I can’t believe that the move to 128-bit processing is the next step. Rather, it seems to me that it will only take some heady work by information theorists and engineers to compress 64-bit processing back into a 32-bit channel without giving up too much speed along the way. An ordinary MP3 file compresses digital music by a ratio of 7 to 1, so how hard could it be to compress the 40 bits of information we actually use in a 64-bit processor to fit into a 32-bit processor?