Deep learning has revolutionized artificial intelligence. We’ve shifted from telling computers how to do things to telling them what to do and letting them figure out the how. For many tasks (e.g., object identification) we can’t really articulate how to do them anyway. It’s easier to tell a system, “This is a ball. When you see this, identify it as a ball. Now here are 1M more examples.” And the system learns pretty well.
Except when it doesn’t. There is a burgeoning science of specifying exactly what we want artificial intelligence systems to do:
Told to optimize for speed while racing down a track in a computer game, a car pushes the pedal to the metal … and proceeds to spin in a tight little circle. Nothing in the instructions told the car to drive straight, and so it improvised.
[. . . . .]
The team’s new system for providing instruction to robots — known as reward functions — combines demonstrations, in which humans show the robot what to do, and user preference surveys, in which people answer questions about how they want the robot to behave.
“Demonstrations are informative but they can be noisy. On the other hand, preferences provide, at most, one bit of information, but are way more accurate,” said Sadigh. “Our goal is to get the best of both worlds, and combine data coming from both of these sources more intelligently to better learn about humans’ preferred reward function.”

Researchers teach robots what humans want
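The spinning car is a classic case of reward misspecification: the reward we stated and the reward we intended disagree, and the optimizer faithfully maximizes the wrong one. A toy sketch (hypothetical reward functions, not from the article) makes the gap concrete:

```python
# Toy illustration of reward misspecification. Both functions and their
# inputs are made up for this sketch; neither comes from the article.

def misspecified_reward(speed, progress):
    # Rewards raw speed alone -- the behavior we *asked* for.
    return speed

def intended_reward(speed, progress):
    # Rewards progress down the track -- the behavior we *wanted*.
    return progress

spinning_car = {"speed": 100.0, "progress": 0.0}   # pedal down, tight circle
straight_car = {"speed": 80.0, "progress": 80.0}   # actually racing

# Under the misspecified reward, spinning in place wins.
assert misspecified_reward(**spinning_car) > misspecified_reward(**straight_car)
# Under the intended reward, driving straight wins.
assert intended_reward(**spinning_car) < intended_reward(**straight_car)
```

Nothing in the misspecified reward penalizes going nowhere, so the car that goes nowhere fastest is, by that measure, the best car on the track.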
This is critical research, and probably under-reported. If robots (like people) are going to learn mainly by mimicking humans, what human behaviors should they mimic?
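The “best of both worlds” combination Sadigh describes can be sketched as a single likelihood over a linear reward model. Everything below is a hypothetical illustration under common modeling assumptions (a Boltzmann-noisy demonstrator for demonstrations, a Bradley-Terry model for the one-bit preference answers); it is not the team’s actual algorithm, and all names and weights are made up:

```python
import math

# Hypothetical sketch: learn a linear reward r(traj) = dot(w, features)
# from two evidence sources, noisy demonstrations and sparse-but-accurate
# pairwise preferences.

def dot(w, f):
    return sum(wi * fi for wi, fi in zip(w, f))

def preference_loglik(w, preferred, rejected):
    # Bradley-Terry model: P(preferred beats rejected) = sigmoid(r_p - r_r).
    # Each answered query contributes at most one bit of information.
    z = dot(w, preferred) - dot(w, rejected)
    return -math.log1p(math.exp(-z))

def demo_loglik(w, demo, candidates):
    # Boltzmann-noisy demonstrator: the shown trajectory is chosen with
    # probability proportional to exp(reward), among the candidate set.
    scores = [dot(w, demo)] + [dot(w, f) for f in candidates]
    m = max(scores)  # log-sum-exp shift for numerical stability
    return scores[0] - (m + math.log(sum(math.exp(s - m) for s in scores)))

def combined_loglik(w, demos, prefs, beta=1.0):
    # Sum both likelihoods; beta trades off the accurate preference bits
    # against the richer but noisier demonstrations.
    total = sum(demo_loglik(w, d, cands) for d, cands in demos)
    total += beta * sum(preference_loglik(w, p, r) for p, r in prefs)
    return total

# Synthetic check: data generated from "true" weights should be better
# explained by those weights than by their negation.
true_w = [1.0, -0.5]
trajs = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [-1.0, 0.2]]
prefs = [(a, b) if dot(true_w, a) > dot(true_w, b) else (b, a)
         for a, b in [(trajs[0], trajs[1]), (trajs[2], trajs[3])]]
demos = [(max(trajs, key=lambda f: dot(true_w, f)), trajs)]
neg_w = [-w for w in true_w]
assert combined_loglik(true_w, demos, prefs) > combined_loglik(neg_w, demos, prefs)
```

The `beta` knob is where the “combine more intelligently” happens in this sketch: it weights the sparse, accurate preference answers against the informative but noisy demonstrations.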
People want autonomous cars to drive less aggressively than they themselves do. Cars should also be less racist, sexist, and violent than we are. Getting the reward function right is critical. Getting it wrong may be immoral.