The Dream Machine for Customizing Biology is Almost Here
Protein designers are preparing for quantum computers to point the way to new cures and green materials.
Quantum computers aren’t quite ready for prime time, but in recent months biologists like me have begun bracing for impact. We may finally be close to getting the tool we’ve been dreaming of to design a cleaner and healthier world.
I work in a lab at the University of Washington that studies proteins — how they function and how they can be improved. Every cell in your body is packed with billions of these nanoscale machines, each performing vital tasks. Proteins repair, block, bind, catalyze, signal, transport, and pump. They are the molecular basis of nearly everything living things can do, from digestion and replication to vision and consciousness. DNA may be the more famous biomolecule, but its raison d’être is to tell the body how to make proteins.
Just what any given protein does is determined by its shape. Long, fibrous ones make up natural materials like silk and wool. Hemoglobin, the protein that makes blood red, has a shape that lets it capture individual molecules of oxygen and ferry them through the body.
So what is it that determines a protein’s shape? It’s entirely based on which amino acids are in it. Like letters arranged into words, the exact order in which amino acids are strung makes a protein unique. Some combinations are lethal, as evidenced by the toxic proteins found in some bacteria. But the last 40 years of biotechnology have also shown that the same building blocks can be rearranged to save lives. Indeed, proteins are now the fastest-growing category of new drugs and are already being used to treat cancer, diabetes, arthritis, migraines, and more.
Understanding how strings of amino acids encode functional shapes has driven researchers like me for decades. We use computers to model this process, because if we knew with certainty how different combinations of amino acids would structure themselves — a process called “protein folding” — the source code of biology would be laid bare. Genes would shift from objects of study to publishable prose. Cures for cancer, autoimmune diseases, Alzheimer’s, and more might be programmed. Scientists could engineer — at the most precise level possible — new biodegradable materials and catalysts for more efficient clean-energy systems.
The possibilities are far greater than what can be found in nature. How can we be sure? There are more ways to string 100 amino acids together than there are atoms in the universe. Just in the last five years, scientists in my lab have hit on new proteins that break down gluten, capture carbon dioxide, and block multiple strains of flu.
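The arithmetic behind that comparison is easy to check. The sketch below assumes the 20 standard amino acids and the commonly cited rough estimate of about 10^80 atoms in the observable universe; neither figure is stated above, so treat both as assumptions.

```python
import math

NUM_AMINO_ACIDS = 20        # assumption: the 20 standard amino acids
CHAIN_LENGTH = 100          # chain length used in the text
ATOMS_EXPONENT = 80         # assumption: roughly 10^80 atoms in the observable universe

# Each of the 100 positions can hold any of the 20 amino acids.
sequences = NUM_AMINO_ACIDS ** CHAIN_LENGTH

# Order of magnitude of the sequence count (Python handles the big integer exactly).
exponent = math.floor(math.log10(sequences))

print(f"Possible sequences: about 10^{exponent}")
print("More sequences than atoms:", sequences > 10 ** ATOMS_EXPONENT)
```

Under these assumptions the count of possible chains exceeds the atom estimate by roughly fifty orders of magnitude.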
Many other cures and new materials are surely out there. But the computers we have been using to model proteins can handle only so much complexity. Instead we need computers powerful enough to accurately model the chemical forces that govern protein shape. We might be about to get such machines, and with them the chance to master the language of life.
Ending the random walk
The promise of quantum computing has enticed technologists for decades. If the bizarre laws of quantum mechanics can be harnessed inside a processor, the thinking goes, then some of the most demanding problems in computing may give way. How strings of amino acids fold into functional proteins is just one example.
The power of quantum processors stems from their unusual physical behavior. Regular computers are based on switches that toggle between two states, each symbolized as either 1 or 0, which serves as a unit of data called a bit. String enough of these switches together — and toss in a program to control their switching — and you can store and process information.
Quantum bits, also known as qubits, are much stranger. While they too toggle between 1 and 0, a third state, called quantum superposition, essentially allows both states to exist at the same time. If all the qubits in a quantum computer are in a state of superposition, that computer is sampling every possible combination of states.
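To get a feel for how quickly "every possible combination" grows, here is a small classical bookkeeping sketch. It is not a quantum computation; it only lists the bitstrings that an equal superposition of a few qubits would span, each with the same amplitude. The function name `uniform_superposition` is made up for this illustration.

```python
import itertools
import math

def uniform_superposition(num_qubits):
    """Return a dict mapping each bitstring to its amplitude in an equal superposition."""
    num_states = 2 ** num_qubits
    amplitude = 1 / math.sqrt(num_states)
    return {
        "".join(bits): amplitude
        for bits in itertools.product("01", repeat=num_qubits)
    }

state = uniform_superposition(3)

# The squared amplitudes are probabilities and must add up to 1.
total_probability = sum(a ** 2 for a in state.values())

print(len(state), "combinations for 3 qubits")
print(2 ** 72, "combinations for 72 qubits")
```

Each added qubit doubles the number of combinations, which is why writing them all down on a classical machine stops being practical after a few dozen qubits.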
Sampling states is half the challenge in protein design. When we want to invent a new protein, we start by designing a shape that we believe will perform a certain function, like sticking to the flu. Then we sample different amino acids to find a combination that locks in that shape. Because classical computers can’t sample all states at once, we use software that tries combinations essentially at random. It’s known as a random-walk approach, and it’s a bit like trying to navigate to a scenic place in a dense forest by zigzagging until you get there. It might work, and it might not.
“The algorithm isn’t guaranteed to find the best solution,” says Vikram Mulligan, a senior software engineer in my lab. “But, for small problems, it is highly likely to find a pretty good solution if you let it run long enough.” As the problem gets bigger, however, the odds of staying lost in the woods grow exponentially.
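The random-walk idea can be sketched in a few lines. This is a toy illustration, not my lab's software: the scoring function simply counts mismatches against a hidden target string, standing in for a real energy calculation, and the acceptance rule is a simple Metropolis-style test. The names `toy_score` and `random_walk_design`, the example target, and the temperature value are all assumptions made for the example.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes for the 20 standard amino acids

def toy_score(sequence, target):
    """Hypothetical stand-in for an energy function: number of mismatches (lower is better)."""
    return sum(a != b for a, b in zip(sequence, target))

def random_walk_design(target, steps=20000, temperature=0.5, seed=0):
    """Mutate one random position at a time, keeping changes that help
    and occasionally keeping changes that hurt (Metropolis-style)."""
    rng = random.Random(seed)
    current = [rng.choice(AMINO_ACIDS) for _ in target]
    current_score = toy_score(current, target)
    best, best_score = current[:], current_score

    for _ in range(steps):
        pos = rng.randrange(len(current))
        old = current[pos]
        current[pos] = rng.choice(AMINO_ACIDS)   # try a random substitution
        delta = toy_score(current, target) - current_score

        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current_score += delta               # accept the step
            if current_score < best_score:
                best, best_score = current[:], current_score
        else:
            current[pos] = old                   # reject and step back

    return "".join(best), best_score

design, score = random_walk_design("MKTAYIAKQRQI")
print(design, score)
```

On a 12-letter toy problem like this the walk tends to settle on a good answer quickly. Nothing in the procedure guarantees the best one, and the search space multiplies by 20 with every added position.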
Conventional supercomputers have been custom built for protein-modeling calculations, including D.E. Shaw’s Anton machine and IBM’s Blue Gene. My lab in Washington currently uses six supercomputers (local, state, and federal) to churn through our design calculations. We have also created one of the largest distributed computing communities. But even with hundreds of thousands of computers slogging away, our random-walk algorithms still get stuck in the forest.
Quantum computers should shine on puzzles like that, by trying all paths at the same time.
Quantum computers have been creeping closer over the past decade. A company called D-Wave has been selling what it calls a quantum computer with 2,000 qubits, but it’s not clear that the system is fully harnessing quantum effects, and it’s considered useful for only certain computing problems. Meanwhile, though, companies like Google, Microsoft, and IBM and academic labs are making progress on general-purpose quantum machines. Quantum processors have to operate at ultra-cold temperatures to bring out their weird physical properties, so they are finicky—it’s hard to keep qubits stable long enough to perform calculations reliably. But researchers are figuring out ways to deal with and ultimately reduce such errors. Google revealed this year that it’s developing a chip with 72 qubits, which could be enough to outperform any classical computer.
In recent months, software engineers in my lab have been getting ready, retooling our protein-design software to run on quantum processors. Instead of going on random walks, we hope to zero in on new strings of amino acids that fold up into new proteins with bespoke properties.
In 2012, researchers at Harvard and D-Wave showed that quantum computers could speed up certain protein modeling calculations. But machines at the time were limited to modeling no more than six amino acids. Fast-forward to last September, when IBM’s Jerry Chow and colleagues carried out the largest chemical simulation ever performed on a quantum computer. It was not a protein, just one chemical. Yet even that work “shows a path” to designing full proteins, says Chow, who heads IBM’s experimental quantum computing group.
Chow adds that “there is a lot of work to be done to scale that up.” As long as quantum processors are so prone to glitches and noise, certain kinds of calculations will remain unworkable. In the near term at least, hybrid software that links both classical and quantum processors will be needed to solve any truly complex problems.
Effectively designing proteins might require a quantum computer with 1,000 stable qubits, Mulligan says. But even before that day comes, the protein designers in my lab expect breakthroughs. We’re focusing for now on a class of proteins called cyclic peptides, which are easier to design because they’re tiny. Many natural cyclic peptides have appealing medicinal properties. They are bigger than the molecules in conventional drugs like aspirin, but less complex than large protein therapeutics like antibodies.
“Once we can do peptides,” says Mulligan, “we can use the same algorithms to design any protein. We’ll just be waiting on hardware.”