Google’s AI subsidiary DeepMind has unveiled the latest version of its Go-playing software, AlphaGo Zero. The new program is a significantly better player than the version that beat the game’s world champion earlier this year, but, more importantly, it’s also entirely self-taught. DeepMind says this means the company is one step closer to creating general purpose algorithms that can intelligently tackle some of the hardest problems in science, from designing new drugs to more accurately modeling the effects of climate change.
The original AlphaGo demonstrated superhuman Go-playing ability, but needed the expertise of human players to get there. Namely, it used a dataset of more than 100,000 Go games as a starting point for its own knowledge. AlphaGo Zero, by comparison, has only been programmed with the basic rules of Go.
“By not using human data — by not using human expertise in any fashion — we’ve actually removed the constraints of human knowledge,” said AlphaGo Zero’s lead programmer, David Silver, at a press conference. “It’s therefore able to create knowledge itself from first principles; from a blank slate […] This enables it to be much more powerful than previous versions.”