Basics of Crypto Security
What are entropy, random numbers, and pseudo-random numbers? It is helpful to understand the terms.
September 13, 2021
Entropy is a measure of unpredictability.
The more bits of entropy, the greater the randomness.
It is surprisingly hard for either humans or computers to pick truly random numbers.
Pseudo-random numbers may not be purely random, but they can be just as hard to guess, and are often used as a substitute for truly random numbers.
Many random number generators cannot be trusted, either because their output may not be random enough or because it could be intercepted.
What are entropy and randomness?
Entropy is unpredictability, or in other words, the measure of a system's degree of disorder. When a number is drawn from a set with perfect entropy, every number in the set is equally likely to be chosen, so the result is impossible to predict - any correct "guess" could occur only by sheer coincidence.
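One common way to put a number on unpredictability is Shannon entropy, measured in bits. The sketch below is illustrative only; the function name and the example distributions are assumptions chosen for this article, not part of any wallet standard. It shows that a fair coin carries one full bit per flip, while a biased coin carries less because its outcome is easier to guess.

```python
import math

def shannon_entropy_bits(probabilities):
    """Return the Shannon entropy, in bits, of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

fair_coin = [0.5, 0.5]
biased_coin = [0.9, 0.1]   # lands on one side 90% of the time
fair_die = [1 / 6] * 6

print(shannon_entropy_bits(fair_coin))    # 1.0
print(shannon_entropy_bits(biased_coin))  # roughly 0.469
print(shannon_entropy_bits(fair_die))     # roughly 2.585
```

The more evenly the probability is spread across the possible outcomes, the higher the entropy, and the harder each outcome is to predict.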
In academic circles, scientists debate whether truly random numbers exist, since physical laws of cause and effect produce changes in the physical world that humans cannot predict today - but theoretically could someday. In practical terms, however, there is consensus that random numbers can be derived from physical phenomena that vary unpredictably, such as radioactive decay, atmospheric noise, or patterns of wind in uncontrolled environments. Newer research explores how to obtain true randomness from mobile devices.
What is a Random Number Generator (RNG)?
It is challenging to program a computer to generate random numbers, since computers are designed to follow instructions predictably and generally work from predictable inputs. Random number generators (RNGs) are mechanisms built to address this challenge by producing random or seemingly random numbers. There are two main types of RNGs: non-deterministic and deterministic.
A non-deterministic RNG relies on inputs from unpredictable physical sources (such as radioactive decay rates, noise in an electrical circuit, or rolls of balanced dice). Some RNGs draw non-deterministic inputs from sources such as user mouse movements or the time gaps between keystrokes, although it is difficult to test the quality of such human-generated randomness sources.
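To make the dice example concrete, here is a rough sketch of how much entropy balanced dice can supply. It assumes perfectly fair six-sided dice, and the function names are hypothetical. Each fair roll contributes log2(6), or about 2.585 bits, so the number of rolls needed for a given number of bits follows directly.

```python
import math

BITS_PER_ROLL = math.log2(6)  # about 2.585 bits for one fair six-sided die

def rolls_needed(target_bits):
    """Smallest number of fair d6 rolls carrying at least target_bits of entropy."""
    return math.ceil(target_bits / BITS_PER_ROLL)

def rolls_to_int(rolls):
    """Interpret a list of rolls (each 1..6) as the digits of one base-6 integer."""
    value = 0
    for roll in rolls:
        if not 1 <= roll <= 6:
            raise ValueError("each roll must be between 1 and 6")
        value = value * 6 + (roll - 1)
    return value

print(rolls_needed(128))  # 50
print(rolls_needed(256))  # 100
```

This only counts entropy and packs the rolls into a single number. It does not turn that number into a seed phrase, and it cannot detect dice that are unbalanced.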
In contrast, deterministic RNGs perform algorithmic functions on "seed" input values in order to produce pseudo-random outputs that are difficult to distinguish from truly random numbers. Deterministic RNGs are sometimes referred to as pseudo-random number generators, or PRNGs. The quality of randomness produced by PRNGs varies, and the best PRNGs rely on truly random seeds as inputs to their calculations. (Note: There is a subset of PRNGs that is recognized as being secure enough for cryptographic use: the cryptographically secure PRNG (CSPRNG) - but this classification can be controversial.)
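The role of the seed can be seen in a few lines of Python. This is a minimal sketch using Python's standard library as a stand-in; it is not taken from any particular wallet. The `random` module is a general-purpose PRNG (Mersenne Twister) that its own documentation says is unsuitable for security purposes, while the `secrets` module draws on the operating system's cryptographically secure source. The helper name `prng_sequence` and the seed value are assumptions for illustration.

```python
import random
import secrets

def prng_sequence(seed, count=5):
    """Return `count` outputs of Python's general-purpose PRNG for a given seed."""
    rng = random.Random(seed)
    return [rng.randrange(100) for _ in range(count)]

# Same seed, same "random" sequence: the output is fully determined by the seed.
assert prng_sequence(2021) == prng_sequence(2021)

# A CSPRNG interface backed by the operating system; the caller supplies no seed.
key_material = secrets.token_bytes(16)  # 16 bytes = 128 bits
print(key_material.hex())
```

Anyone who learns or guesses the seed of a deterministic generator can reproduce every number it will ever output, which is why the seed itself must be unpredictable.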
Why are random numbers so important to cryptocurrency wallets?
Random numbers are essential to seed phrases because they are the starting point for the BIP39 standard algorithms, which turn them into the mnemonic phrase from which a wallet's private keys are calculated. If the original input numbers are predictable, an attacker may be able to derive the same keys. If a wallet's keys can be derived, the cryptocurrency they control can be stolen. This is why cryptocurrency security depends so heavily on the randomness (and confidentiality) of the numbers used to calculate a seed phrase.
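The danger of predictable inputs can be sketched with a toy example. Everything here is a simplified assumption: the 65,536-value seed space is hypothetical, `toy_key` uses a plain SHA-256 hash as a stand-in for real key derivation, and no actual BIP39 or wallet code is involved. The point is only that when the inputs come from a small, guessable set, an attacker can try them all.

```python
import hashlib
import random

SEED_SPACE = 2 ** 16  # hypothetical: the weak generator has only 65,536 possible seeds

def weak_entropy(seed):
    """128 'random' bits from a non-cryptographic PRNG with a small seed (insecure)."""
    return random.Random(seed).getrandbits(128).to_bytes(16, "big")

def toy_key(entropy):
    """Stand-in for key derivation; any deterministic function behaves the same way."""
    return hashlib.sha256(entropy).hexdigest()

def recover_seed(target_key):
    """Try every possible seed until the derived key matches the target."""
    for candidate in range(SEED_SPACE):
        if toy_key(weak_entropy(candidate)) == target_key:
            return candidate
    return None

victim_key = toy_key(weak_entropy(31337))
print(recover_seed(victim_key))  # the victim's seed is found by exhaustive search
```

The output looks like 128 random bits, but it carries at most 16 bits of entropy, so the search finishes almost instantly. With 128 truly random bits the same search would be computationally infeasible.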
The reliance of cryptographic keys on random inputs is not unique to cryptocurrency, or to the BIP39 standard, and it is not a design flaw - it is inherent in the broader mathematical challenge of how any unpredictable value may be chosen. The United States National Institute of Standards and Technology (NIST) states: "In cryptography, the unpredictability of secret values (such as cryptographic keys) is essential." NIST adds that "Specifying an entropy source is a complicated matter" (NIST Special Publication 800-90B, "Recommendation for the Entropy Sources Used for Random Bit Generation").
Both the quality and quantity of randomness provided as input are important to cryptographic seed phrases. The amount of random data included in a seed phrase calculation can be expressed in terms of "bits of entropy." The more random bits that are provided, the longer and less predictable the output can be. Under BIP39, a 12-word mnemonic seed phrase encodes 128 bits of entropy, while a 24-word phrase encodes 256 bits, which is why a 24-word phrase requires more random input than a shorter one.
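The relationship between bits of entropy and phrase length follows from the BIP39 rules: a checksum of one bit per 32 bits of entropy is appended, and each word then encodes 11 bits. The short sketch below works through that arithmetic; the function name `bip39_layout` is a hypothetical helper, not part of any library.

```python
def bip39_layout(entropy_bits):
    """Return (checksum_bits, word_count) for a BIP39 entropy length."""
    if entropy_bits not in (128, 160, 192, 224, 256):
        raise ValueError("BIP39 allows 128, 160, 192, 224, or 256 bits of entropy")
    checksum_bits = entropy_bits // 32
    word_count = (entropy_bits + checksum_bits) // 11  # each word encodes 11 bits
    return checksum_bits, word_count

for bits in (128, 160, 192, 224, 256):
    checksum, words = bip39_layout(bits)
    print(f"{bits} bits of entropy + {checksum}-bit checksum -> {words} words")
```

Run as written, this lists phrase lengths of 12, 15, 18, 21, and 24 words. Note that the checksum adds no unpredictability of its own; only the original random bits count as entropy.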
What are the risks when generating random numbers?
Beyond the technical challenges of producing random numbers, there is a risk that a computer that produces or otherwise communicates random numbers could be compromised (exploited) in a variety of subtle ways, including loss of integrity or confidentiality in file systems, source code, memory, network communications, or connected devices. A compromised computer could alter or leak the random numbers it generates. For this reason, many internet-based "random number generator" web pages warn users that they are for demonstration use only, and should not be used to produce inputs for cryptocurrency seeds.
The risk that a computer will be compromised increases with its level of connectivity to other computers, and with how heavily it is used. Secure computers perform limited tasks, have a small number of authorized users, and have restricted physical access. Highly secure computers are shipped directly from a trusted source in tamper-evident packaging, and once received, they are configured with no connections to other computers (sometimes called "air-gapped"). Because general-purpose household computers used for browsing and entertainment do not meet these rigorous standards, it is easy to see why carefully managed hardware wallets are regarded as the gold standard among trusted devices for entering random numbers for mnemonic phrase generation, and for storing the resultant cryptocurrency phrases and keys.