The stories are turning into a steady trickle. DeepMind’s neural network again crushes the competition:
DeepMind entered AlphaFold into the Critical Assessment of Structure Prediction (CASP) competition, a biannual protein-folding olympics that attracts research groups from around the world. The aim of the competition is to predict the structures of proteins from lists of their amino acids which are sent to teams every few days over several months. The structures of these proteins have recently been cracked by laborious and costly traditional methods, but not made public. The team that submits the most accurate predictions wins. On its first foray into the competition, AlphaFold topped a table of 98 entrants, predicting the most accurate structure for 25 out of 43 proteins, compared with three out of 43 for the second placed team in the same category.
The result was initially depressing to at least one scientist:
Mohammed AlQuraishi, a biologist who has dedicated his career to this kind of research, flew in early December to Cancun, Mexico, where academics were gathering to discuss the results of the latest contest. As he checked into his hotel, a five-star resort on the Caribbean, he was consumed by melancholy.
The contest, the Critical Assessment of Structure Prediction, was not won by academics. It was won by DeepMind, the artificial intelligence lab owned by Google’s parent company.
“I was surprised and deflated,” said Dr. AlQuraishi, a researcher at Harvard Medical School. “They were way out in front of everyone else.”
. . . . .
After the conference in Cancun, Dr. AlQuraishi described his experience in a blog post. The melancholy he felt after losing to DeepMind gave way to what he called “a more rational assessment of the value of scientific progress.”
But he strongly criticized big pharmaceutical companies like Merck and Novartis, as well as his academic community, for not keeping pace.
This is good news! Yes, there will be disruption, but we have discovered new tools to crack the most computationally expensive problems. This is tremendous work and heralds a future of dramatic advances in energy, medicine, and automation.
Very, very complicated algorithms are starting to solve problems in ways we don’t fully understand. And it again raises the question of whether we as a species are headed into that incomprehensible future.
I think if it solves fusion, I’ll take it:
The number of design choices for optimizing this fusion plasma is enormous, because all aspects of the capsule’s dimensions and structure, as well as the details of the laser and the time dependence of the laser’s power, can be varied. Implosion performance can also be considerably affected by ‘hydrodynamic’ instabilities that are seeded by inevitable imperfections in the manufactured capsule and imbalances or instabilities in the applied laser light. Unsurprisingly, the complexity of this implosion system leads to fusion performance that is extremely sensitive to design details and instabilities.
With so many design choices, and with limited experimental data, the standard approach to optimizing fusion performance has been to use theoretical insights along with sophisticated radiation–hydrodynamic simulations that follow, as well as we know how, the physics of the implosions and their degradations.
. . . . .
The authors trained a statistical model to match an initial set of experimental data using simulation outputs. They then used this model to suggest changes to the implosion design that the model predicted would improve the fusion performance.
. . . . .
By consistently following this methodology to design a series of experimental campaigns, Gopalaswamy and colleagues improved the fusion yield by a remarkable factor of three compared with OMEGA’s previous record.
[I]t is humbling for scientists dedicated to understanding such complex systems to recognize how much they don’t understand. As a quote attributed to physicist Eugene Wigner states: “It is nice to know that the computer understands the problem. But I would like to understand it, too”.
Our wetware brains weren’t evolved to track all these variables. But we are building machines that can.
In a scientific first, Columbia neuroengineers have created a system that translates thought into intelligible, recognizable speech. By monitoring someone’s brain activity, the technology can reconstruct the words a person hears with unprecedented clarity.
This was done by tracking the brain activity created by someone listening to very simple words (the numbers 0-9), not imagining speaking those words. But it’s impressive it could be done at all using only brain signals, and it suggests that it is indeed possible to think a sentence and have a machine say it out loud.
Open questions: Would a neural network trained on one person work on many others? What kinds of other thoughts might be decoded?
Recently I posted about the phenomenal performance of the AlphaZero algorithm in computer chess. For the first time in history, an algorithm displayed human-like understanding of chess. AlphaZero seemed to understand what moves were best and spent its time focusing only on them. It didn’t mechanically crunch through millions of possible positions, run out of time, and then select the best move. The best moves emerged from its computer neural network, like a human grandmaster. How? Here’s the paper.
Here’s the TL;DR explanation
Input. AlphaZero receives the board position and the known rules of chess – that’s it.
Neural network. AlphaZero’s neural network generates a list of pretty good looking moves. THIS IS BLACK MAGIC.
Search. AlphaZero then spends most of its time searching pretty much at random through the deep combinations that begin with each of those “good looking” moves.
The random search is influenced by trying to get good coverage of all the suggested moves and focusing on moves that are starting to look good.
Output. When AlphaZero runs out of time it picks the best looking move.
So This is Crazy
It’s hard to overstate AlphaZero’s performance. It was given just the rules of chess and nine hours to play itself 44 million games, and then it learned something so deep about chess that it crushed the world champion computer chess program, Stockfish, 155 games to 6. (They played 1,000 total games; at this level most games are draws.)
Now first, you might say, wow 44 million games is a lot of chess. And it is. I wonder how good I might be at chess if I was able to play (and somehow remember) that many games. But it’s worth noting the enormous complexity of the game of chess. In the first two moves (white moves then black moves), there are 400 possible board positions. In the second two moves, there are ~200k possible board positions. In the third two moves, there are ~121 million possible board positions. The descent into insanity accelerates from there.
There are more possible games of chess than there are atoms in the universe multiplied by grains of sand on all the world’s beaches. There are so many possible games of chess that we don’t know how many games of chess there are: it’s literally too hard to compute. There are so many games of chess that the next time you play a game, it is quite probably the only time anyone in history has ever played that game and the only time in history anyone will ever play that game.
So, yes, 44 million games is a lot. But it’s also nothing. Like really, you’re not even scratching the surface.
AlphaZero Played Smarter
AlphaZero learned something deep about chess in those 44 million games, and that knowledge, encoded within its neural network, allowed it to play very, very efficiently. AlphaZero did not out-compute Stockfish, at least not in a traditional sense. In fact, AlphaZero evaluated fewer positions per second than Stockfish, crunching through around 60,000 positions per second compared to 60 million per second for Stockfish. Somehow, AlphaZero was focusing on moves that were so much better than Stockfish that it could crush the world champion computer program while operating on 1/1000 of the raw horsepower.
And here’s the crazy part: I’m burying the lede. AlphaZero shattered our understanding of the upper limits of chess performance with just nine hours of self-play. The same algorithm crushed world champion programs and people in the games of Shogi and Go. It is a generalized, intelligent game playing beast the likes of which we have never seen.
The Rough Details of How AlphaZero Works
Essentially AlphaZero has two components: (1) a neural network trained by playing itself chess; and (2) a Monte Carlo Tree Search algorithm to explore the moves suggested by the neural network. You can think of AlphaZero as a deep learning neural network that does an amazing job pruning the search space to focus only on the moves that matter.
What does AlphaZero take as input? AlphaZero takes the board position as input – that’s it. It understands the basic rules of chess, but does not use any domain-specific features or heuristics (e.g., material point values, pawn structure, king safety, etc.).
The input data representation for AlphaZero’s neural network is a set of planes representing the position of each type of piece (separate for each player), duplicated for a history of eight moves. So that’s 8 historical moves x 2 players x 6 kinds of pieces. In addition, a set of constant-valued planes denoting the player’s color, move number, and the state of special rules (e.g., legality of castling) is provided.
For chess playing, that’s 119 total planes provided in an image stack. Below is a table summary from the paper. Note that the 119 total planes is broken down as (6 planes for each of player one’s piece types + 6 planes for each of player two’s piece types + 2 repetition counts) * 8 historical moves + the 7 constant-valued planes = 119.
What does AlphaZero’s neural network output? AlphaZero’s neural network generates (1) a list of possible moves along with the probability that each of those moves will be made in this position; and (2) a simple win-lose-or-draw estimate of the outcome of the game starting from the given board position.
The output data representation is a bit unusual, at least from a human point of view. The list of possible “good moves” is provided in the form of a stack of 73 planes of size 8 x 8 that represent a probability distribution over 4,672 possible moves. These are “possible moves” simply because 8 x 8 x 73 = 4,672, but obviously most of these moves will be illegal under the rules of chess. Illegal move probabilities are set to 0%.
The first 56 planes encode “queen moves,” which are not necessarily moves by a queen piece; they are moves that a queen could make (horizontal, vertical, diagonal). The next 8 planes encode knight moves, and the final 9 planes encode underpromotions such as a pawn being promoted to a rook, bishop, or knight.
The paper doesn’t say expressly, but the “move probabilities” are likely determined simply by the magnitude of the network response thrown into a softmax function. The softmax function is just a way of converting a bunch of values into probabilities. For example, if a move is identified with magnitude 4 and two others are given magnitude 2’s (4, 2, 2), the move probabilities will be assigned such that the 4 is most probable and the 2’s are not, and they all sum to 100%. In this example, the softmax would be 78%, 11%, 11%.
How does AlphaZero choose a move from the output of the neural network? AlphaZero evaluates the move candidates from its neural network roughly in order of probability to find the best possible move from the given position. Each of these candidates may have position trees that extend into billions or trillions of subsequent possible combinations, and AlphaZero doesn’t have time to exhaustively evaluate all possible end results. So it uses a “Monte Carlo” search, which is a fancy way of saying that it searches more or less at random while trying to get good coverage of all the possibilities and then focusing on moves that are starting to look good.
This “sort-of-random” Monte Carlo search is unusual for chess-playing algorithms. Traditionally, computers have used an “alpha-beta” search, which focuses on the best-looking moves. Monte Carlo search strategies have faired poorly, as you might expect, because they spend precious time exploring moves that already seem like bad moves.
So why does the combination of AlphaZero’s neural network and a Monte Carlo search strategy work so well? The researchers have a theory:
AlphaZero evaluates positions non-linearly using deep neural networks, rather than the linear evaluation function used in typical chess programs. This provides a more powerful evaluation function, but may also introduce larger worst-case generalization errors. When combined with alpha-beta search, which computes an explicit minimax, the biggest errors are typically propagated directly to the root of the subtree. By contrast, AlphaZero’s MCTS [Monte Carlo Tree Search] averages over the position evaluations within a subtree, rather than computing the minimax evaluation of that subtree. We speculate that the approximation errors introduced by neural networks therefore tend to cancel out when evaluating a large subtree.
Preprint at 16.
In other words, the neural network may make big mistakes when evaluating some positions. If the searching function focuses on exactly what the neural network is saying, it could end up focusing on a mistake. If the searching function instead takes all the suggestions with a grain of salt and keeps its options open, the mistakes made by the neural network tend to cancel themselves out. This is akin to saying, “I’m going to point you in the general direction, but this is a gut feeling. You’re going to want to check the specifics.”
Sound familiar? To me this sounds like the grandmaster who immediately considers three possible moves in a position and spends the rest of his time double-checking the tactics of making these moves.
How does AlphaZero train its neural network? AlphaZero’s neural network starts with randomized parameters and then learns new parameters by playing games with itself. It plays each game by outputting a list of moves from its neural network (randomly at first) and then using the Monte Carlo search to try to find the best move. It plays the game all the way to completion and then takes the result (win, loss, or draw) and compares that against the predictions it had made during the game. The neural network parameters are updated to minimize the errors it made during play. Neural network parameters are updated by backpropagation.
AlphaZero played itself 44 million times in 9 hours and then crushed anything that had ever come before it. It learned something deep about the game of chess, and it did so without leveraging any human insight.
But AlphaZero can’t tell us what it learned any more than a grandmaster can tell us why a move feels right. That learning is embedded in the millions and millions of parameters of AlphaZero’s neural network. We can only stare at those numbers and hazard guesses about what they mean. How normal will that become? Are we on our way to an “incomprehensible future“?
These rules have a lot of overlap, but also a lot of noise. Of course systems should be safe and reliable and just and secure. This is marketing noise and no one disagrees. We need to figure out the hard rules. How transparent should we require AI systems to be? How explainable? This could be hard.
Yet this peculiar retreat was venomous: No matter how Stockfish replied, it was doomed. It was almost as if AlphaZero was waiting for Stockfish to realize, after billions of brutish calculations, how hopeless its position truly was, so that the beast could relax and expire peacefully, like a vanquished bull before a matador. Grandmasters had never seen anything like it. AlphaZero had the finesse of a virtuoso and the power of a machine. It was humankind’s first glimpse of an awesome new kind of intelligence.
When I was a graduate student in AI I often pointed to the Deep Blue v. Kasparov series as an illustration of how clueless we were in AI: grandmasters usually considered only a few moves in their two minutes of thinking, and Deep Blue churned through millions. Yet we had no idea how grandmasters came up with those few moves for consideration.
AlphaZero is a revelation. I hope to post more on it.