I asked GPT a question that had been sitting in the back of my head: how much digital storage would you need to capture everything the human genome actually produces when you run it inside a living body for eighty-odd years? Not the genome itself. That’s a separate and easier question. I wanted to know about everything it generates.
The human genome contains about 3.2 billion chemical letters. Each letter is one of four bases, A (adenine), T (thymine), G (guanine), or C (cytosine), which means you can encode each one in 2 bits of digital information. Do the arithmetic and the whole genome comes to roughly 0.8 gigabytes. Less than a cheap USB stick. You could back up the complete instruction set for a human being to something you could lose down the back of a couch.
That’s actually a generous estimate. A lot of the genome is repetitive filler or sequence that barely varies between species, so the genuinely useful information content is smaller still. But take 0.8 GB as the round number.
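The arithmetic is short enough to write out. A minimal sketch in Python, taking the 3.2 billion figure from above and assuming decimal gigabytes (10^9 bytes); the variable names are mine:

```python
import math

BASES = 3.2e9                      # letters in the genome, from the text
ALPHABET_SIZE = 4                  # A, T, G, C
BITS_PER_BASE = math.log2(ALPHABET_SIZE)   # 2 bits distinguish four options

genome_bits = BASES * BITS_PER_BASE
genome_bytes = genome_bits / 8
genome_gb = genome_bytes / 1e9     # assumption: 1 GB = 10^9 bytes

print(f"{genome_gb:.1f} GB")
```

This counts one copy of the sequence with no compression, which is why it works as an upper bound on the useful information rather than a measurement of it.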
A living body doesn’t read the genome like a computer reads a file. It runs it. The genome is more like a program than a document. It specifies how to build molecular machinery, how that machinery responds to signals, how cells develop into different types, and how the whole structure maintains itself for decades. The program is compact. What it produces is not.
At any given moment, a human body contains somewhere between ten trillion and a hundred trillion cells. Each cell is actively maintaining a state across thousands of variables: which genes are switched on, what proteins are present, what signals it’s sending and receiving, how its DNA is packaged. Even a rough tally, one byte per variable across a fraction of those cells, produces a number around 100 petabytes for a single snapshot in time. That’s 100 million gigabytes, and it’s a low-resolution estimate.
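To see how soft that 100 petabyte figure is, here is a rough sketch of the tally. The cell counts come from the paragraph above; reading "thousands of variables" as somewhere between a thousand and ten thousand is my assumption, as is the function name.

```python
PB = 10**15  # bytes in a petabyte (decimal)

def snapshot_petabytes(cells, variables_per_cell, bytes_per_variable=1):
    """Size of one whole-body snapshot, in petabytes, for a crude tally."""
    return cells * variables_per_cell * bytes_per_variable / PB

for cells in (10**13, 10**14):            # ten trillion to a hundred trillion
    for variables in (10**3, 10**4):      # assumed range for "thousands"
        size = snapshot_petabytes(cells, variables)
        print(f"{cells:.0e} cells x {variables:.0e} variables -> {size:,.0f} PB")
```

Under these assumptions the corners of the range run from 10 to 1,000 petabytes, and 100 petabytes sits in the middle of it on a log scale.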
Add time and the numbers get larger still. The body isn’t static. It’s a process running continuously over a lifetime: development, repair, immune response, ageing. If you wanted to record not just where things are now but how they got there and where they went, you’re looking at somewhere between one and a hundred exabytes. An exabyte is a billion gigabytes. So the rough relationship is: genome, around 0.8 GB; body state at one moment, tens to hundreds of petabytes; body process over a lifetime, one to a hundred exabytes.
Dividing one to a hundred exabytes by 0.8 GB gives an expansion of between about one billion and a hundred billion times, so call it roughly ten billion times, from the program to the processes it generates. But the genome isn’t storing that information. It’s producing the conditions under which chemistry and physics generate it.
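The ratio can be checked in a few lines. This is a sketch using the round figures from the text; taking the geometric mean as the "middle" of a range that spans two orders of magnitude is my choice, not something the estimates dictate.

```python
import math

GENOME_BYTES = 0.8e9             # 0.8 GB, from the text
LIFETIME_BYTES = (1e18, 1e20)    # one to a hundred exabytes, from the text

low, high = (size / GENOME_BYTES for size in LIFETIME_BYTES)
middle = math.sqrt(low * high)   # geometric mean of the two ends

print(f"low {low:.2e}, high {high:.2e}, middle {middle:.2e}")
```

The ends come out at about 1.25 billion and 125 billion, with a middle of about 12.5 billion, which is where "roughly ten billion" comes from.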
Not all of that complexity comes from the genome. A lot of it comes from the environment: light hitting your eyes, food you’ve eaten, infections you’ve fought off, random molecular noise inside cells. The genome doesn’t pre-specify every cellular state. What it does is build machinery capable of responding appropriately to whatever the environment throws at it. You also can’t treat every cell as independent. Cells of the same type behave similarly. Development follows constrained paths rather than wandering randomly through all possible states. The actual information in a biological trajectory is far more compressible than the raw numbers suggest.
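A toy illustration of the compressibility point, and nothing more than that: every number here is hypothetical, and zlib is only a crude stand-in for "actual information content". It builds a population of fake cells that each copy one of a few type templates with a little noise, then compares how well that compresses against the same volume of fully independent random bytes.

```python
import random
import zlib

rng = random.Random(0)
# All four constants are made up for the illustration.
N_CELLS, STATE_BYTES, N_TYPES, NOISY_BYTES = 2000, 64, 5, 2

def random_bytes(n):
    return bytes(rng.getrandbits(8) for _ in range(n))

# One reference state per "cell type".
templates = [random_bytes(STATE_BYTES) for _ in range(N_TYPES)]

def cell_state():
    """Copy a type template, then overwrite a couple of bytes as noise."""
    state = bytearray(rng.choice(templates))
    for _ in range(NOISY_BYTES):
        state[rng.randrange(STATE_BYTES)] = rng.getrandbits(8)
    return bytes(state)

structured = b"".join(cell_state() for _ in range(N_CELLS))
independent = random_bytes(N_CELLS * STATE_BYTES)

raw_size = len(structured)
structured_size = len(zlib.compress(structured, 9))
independent_size = len(zlib.compress(independent, 9))

print(f"raw: {raw_size} bytes")
print(f"cells sharing type templates: {structured_size} bytes compressed")
print(f"fully independent cells: {independent_size} bytes compressed")
```

The raw byte count is identical in both cases. Only the population with shared structure shrinks, which is the sense in which the petabyte and exabyte figures overstate the information in a real trajectory.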
The genome is a small program that sets up a very large and responsive system. The environment runs that system through one particular trajectory out of many it could have taken. The resulting complexity, the full state of a living human body across a lifetime, is not stored anywhere. It is continuously produced by running a compact set of instructions through wet chemistry for eighty years. The numbers are striking even after all the caveats. A sub-gigabyte program generating a process that would take exabytes to record is not a trivial ratio. It is also not compression in the usual sense: execution in a physical system is a fundamentally different thing from storage. The genome’s job is not to pre-specify the answer. Its job is to build something capable of finding one.
In round numbers: the genome is under a gigabyte, while the living body it generates, sampled across all its cells and processes over a lifetime, represents something in the range of exabytes of realised state. The expansion is roughly ten billion times, and none of it is stored in the genome. It is generated by running a compact program through physical chemistry for eighty years.
That’s not a curiosity. It’s an existence proof.
Something in the physical world can act as a generative substrate where a small specification produces vast useful complexity through execution. The complexity doesn’t live in the code. It emerges from running the code in the right kind of physical system. The body is not a digital computer poorly approximating the genome’s intentions. It is a quantum mechanical system doing what quantum mechanical systems do, and the genome is compact precisely because it only needs to specify initial conditions and interaction rules. The physics handles the rest.
This is worth sitting with before moving to quantum computing, because the standard framing of quantum computing has almost nothing to do with it.
The standard framing is about algorithmic advantage. Build enough qubits, correct enough errors, and you can factorise large numbers faster than any known classical method, or search unsorted databases with a quadratic speedup, or simulate molecular Hamiltonians efficiently. All of that is real. None of it is what the genome is doing.
The genome is not competing with classical chemistry on classical chemistry’s terms. It is not trying to compute the folded state of a protein by running a faster version of an algorithm a classical system could also run. It is setting up a physical process and reading the output. The protein folds because of electrostatic interactions, hydrophobic forces, hydrogen bonding, and van der Waals forces playing out in an aqueous environment. The genome’s job was to specify the amino acid sequence. Everything after that is substrate.
Quantum reservoir computing works on the same principle. A fixed quantum system with rich internal dynamics receives an input, processes it through those dynamics, and produces an output that a classical readout layer is trained to interpret. You do not program the quantum system exhaustively. You do not need to control every qubit. You run things through it and read what comes out. In this picture the input plays the genome’s part and the quantum dynamics play the chemistry’s. The readout layer is the organism deciding what the state of the body means.
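To make that concrete, here is a minimal classical simulation of the idea in Python with NumPy. It is a sketch under stated assumptions, not anyone's reference implementation: three simulated qubits, a fixed random Hamiltonian that is never trained, inputs injected by resetting one qubit, and a least-squares readout asked to recall the previous input. All names and parameter values are hypothetical.

```python
import numpy as np

N_QUBITS = 3                      # assumption: tiny, so exact simulation is cheap
DIM = 2 ** N_QUBITS
rng = np.random.default_rng(0)

# Fixed random Hermitian "Hamiltonian". It is drawn once and never adjusted.
A = rng.normal(size=(DIM, DIM)) + 1j * rng.normal(size=(DIM, DIM))
H = (A + A.conj().T) / 2
evals, evecs = np.linalg.eigh(H)
U = evecs @ np.diag(np.exp(-1j * evals)) @ evecs.conj().T   # exp(-iH t), t = 1

def z_on(k):
    """Pauli-Z acting on qubit k, identity on the others."""
    op = np.array([[1.0]])
    for j in range(N_QUBITS):
        op = np.kron(op, np.diag([1.0, -1.0]) if j == k else np.eye(2))
    return op

OBSERVABLES = [z_on(k) for k in range(N_QUBITS)]

def inject(rho, u):
    """Reset qubit 0 to sqrt(1-u)|0> + sqrt(u)|1>, keeping the other qubits."""
    psi = np.array([np.sqrt(1 - u), np.sqrt(u)])
    r = rho.reshape(2, DIM // 2, 2, DIM // 2)
    rho_rest = np.einsum("iaib->ab", r)          # partial trace over qubit 0
    return np.kron(np.outer(psi, psi), rho_rest)

def run_reservoir(inputs):
    """Feed inputs one at a time; return <Z_k> per step plus a bias column."""
    rho = np.eye(DIM, dtype=complex) / DIM       # start maximally mixed
    features = []
    for u in inputs:
        rho = U @ inject(rho, u) @ U.conj().T    # fixed dynamics, no control
        features.append([np.trace(rho @ O).real for O in OBSERVABLES] + [1.0])
    return np.array(features)

inputs = rng.uniform(0, 1, size=200)
X = run_reservoir(inputs)

# Only the classical readout is trained: a linear map fitted by least squares.
WASHOUT = 20
Xw = X[WASHOUT:]
target = inputs[WASHOUT - 1:-1]                  # the previous input, u_{k-1}
weights, *_ = np.linalg.lstsq(Xw, target, rcond=None)
mse = float(np.mean((Xw @ weights - target) ** 2))

print(f"readout MSE {mse:.4f} vs target variance {np.var(target):.4f}")
```

Nothing in `U` was tuned for the task. Whatever the readout manages to recover about the previous input is there because the dynamics carried it forward, and how much that is depends on the Hamiltonian, the evolution time, and which observables are read out.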
The scaling problem in conventional quantum computing is about maintaining coherence across large numbers of qubits long enough to run a deep circuit. It is genuinely hard and may remain hard for a long time. Quantum reservoir computing sidesteps most of it. The system does not need to be large. It does not need to be perfectly controlled. It needs to be rich enough in its dynamics that the mapping from input to output captures structure that would be expensive to compute classically. Ten to fifty noisy qubits in a physical system with interesting dynamics is potentially useful in this mode in a way that ten to fifty noisy qubits running Shor’s algorithm is not.
The genome did not solve biological complexity by getting bigger. It solved it by being the right kind of compact specification for the right kind of physical substrate. The question for quantum computing is the same one, stated plainly: what is the right kind of quantum system, and what is the right kind of compact input structure to run through it, such that the physics generates useful complexity without being asked to enumerate it?
That question is not new. Quantum reservoir computing researchers are working on pieces of it. What may be new is the framing. The genome is not an analogy. It is a physical system that already answered the question at scale, in noise, without error correction, running continuously for the lifetime of every person alive.
The genome framing is useful: it suggests the right question is not whether quantum reservoirs beat classical algorithms, but whether there are problems where running inputs through a physical quantum system and reading the output is simply the natural way to compute.
That seems worth looking at.