Researchers teach AI to play 'hard' games by showing it YouTube videos

Google DeepMind researchers report on how they successfully trained artificial intelligence to play "infamously hard exploration games" like Pitfall! with YouTube videos of human playthroughs.

A group of researchers at Google DeepMind have published a paper today outlining how they successfully trained artificial intelligence to play "infamously hard exploration games" with YouTube videos of human playthroughs.

Even if you (like me) aren't well-versed in the art of AI, the paper has some interesting takeaways, including the core premise that deep reinforcement learning algorithms apparently struggle to improve at tasks "where environment rewards are particularly sparse," in the words of the paper.

"One epitomizing example is Atari’s Montezuma's Revenge, which requires a human-like avatar to navigate a series of platforms and obstacles (the nature of which change substantially room-to-room) to collect point-scoring items," the paper reads. "For example, reaching the first environment reward in Montezuma's Revenge takes approximately 100 environment steps, equivalent to 100^18 possible action sequences."
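As a rough sanity check on the scale involved, here is a minimal sketch that counts action sequences, assuming the agent chooses from the full Atari action set of 18 discrete actions at every step (that assumption, and the names `NUM_ACTIONS`, `NUM_STEPS`, `count_action_sequences` and `decimal_digits`, are illustrative and not taken from the article). Under that assumption, 100 steps work out to 18^100 possible sequences, which is far larger than the 100^18 in the passage as quoted here.

```python
# Assumption: the agent picks from the full Atari action set of 18 discrete actions.
NUM_ACTIONS = 18
# From the quoted passage: roughly 100 environment steps to reach the first reward.
NUM_STEPS = 100


def count_action_sequences(num_actions: int, num_steps: int) -> int:
    """Return how many distinct action sequences of length num_steps exist."""
    return num_actions ** num_steps


def decimal_digits(n: int) -> int:
    """Return the number of decimal digits in a positive integer."""
    return len(str(n))


sequences = count_action_sequences(NUM_ACTIONS, NUM_STEPS)
as_quoted = 100 ** 18  # the expression exactly as it appears in the quote

print(decimal_digits(sequences))  # 126 digits, i.e. roughly 10^125
print(decimal_digits(as_quoted))  # 37 digits, i.e. exactly 10^36
```

Either way, the number is far too large for an agent to stumble onto the first reward by trying sequences at random, which is the point the paper is making about sparse rewards.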

While researchers say they can feed human demonstrations to an AI to teach it the correct way to play a game, this paper seems to suggest that an AI can learn to play in a more interesting way (i.e. not using artificial demonstrations) if it is shown a bunch of (mismatched) YouTube videos of people playing a game, with one earmarked as the example the AI should follow to receive a reward.

"Specifically, providing a standard RL agent with an imitation reward learnt from a single YouTube video, we are the first to convincingly exceed human-level performance on three of Atari’s hardest exploration games: Montezuma's Revenge, Pitfall! and Private Eye," the paper continues. "Despite the challenges of designing reward functions or learning them using inverse reinforcement learning, we also achieve human-level performance even in the absence of an environment reward signal."
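To make the idea of an imitation reward a little more concrete, here is a toy sketch of one way such a reward could work. It is not the paper's implementation: it assumes the demonstration video has already been turned into a short list of "checkpoint" vectors, and it pays the agent a bonus whenever its current observation looks similar enough to the next checkpoint. The class `CheckpointReward`, the helper `cosine_similarity`, the `threshold` and `bonus` values, and the hand-written two-dimensional vectors are all hypothetical; in a real system the vectors would come from a learned model.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class CheckpointReward:
    """Hypothetical imitation reward: pay a bonus for reaching demo checkpoints in order."""

    def __init__(self, checkpoints, threshold=0.9, bonus=1.0):
        self.checkpoints = checkpoints  # vectors sampled along the demonstration
        self.threshold = threshold      # assumed similarity needed to count as "reached"
        self.bonus = bonus              # assumed reward paid per checkpoint
        self.next_index = 0             # which checkpoint the agent should reach next

    def __call__(self, agent_embedding):
        # Once every checkpoint has been reached, there is nothing left to reward.
        if self.next_index >= len(self.checkpoints):
            return 0.0
        target = self.checkpoints[self.next_index]
        if cosine_similarity(agent_embedding, target) >= self.threshold:
            self.next_index += 1
            return self.bonus
        return 0.0


# Toy usage with made-up 2D "embeddings".
reward_fn = CheckpointReward(checkpoints=[[1.0, 0.0], [0.0, 1.0]])
rewards = [reward_fn(obs) for obs in ([0.9, 0.1], [0.9, 0.1], [0.0, 2.0], [0.0, 2.0])]
print(rewards)  # [1.0, 0.0, 1.0, 0.0]
```

The appeal of a reward like this is that the agent gets frequent feedback for following the demonstration, even when the game itself hands out points only rarely.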

It's fascinating research, and you can read the rest of the details in the appropriately titled paper "Playing hard exploration games by watching YouTube" on arXiv. Some demonstration videos (one of which is embedded above) are also available to watch on researcher Tobias Pfaff's YouTube channel.

 
