

Alex Wawro, Contributor

September 14, 2016


In advance of the European Conference on Computer Vision next month, a group of researchers has published a paper detailing their efforts to use open-world games like Grand Theft Auto V to help train computers to see and identify objects in the real world.

It's an intriguing read, as it outlines how modern open-world games are now realistic enough to be useful in training computer vision systems (which are critical to technologies like self-driving cars) to recognize things like lamp posts, pedestrians and sidewalks.

And that seems to be great news for computer vision researchers, because labeling objects in a video game is a lot easier than labeling objects in video footage of the real world.

"Using the acquired [video game] data to supplement real-world images significantly increases accuracy and...enables reducing the amount of hand-labeled real-world data," reads an excerpt of the paper. "Models trained with game data and just 1/3 of the CamVid training set outperform models trained on the complete CamVid training set."

The CamVid set the researchers refer to is the Cambridge-driving Labeled Video Database, which purports to be the first collection of real-world video that contains pixel-perfect "object class semantic labels" that tell a machine what things in the footage are (a mail box or a street sign, for example).

Actual human beings had to go through the time-intensive process of creating those labels, so it is notable that the researchers were able to pull similar data directly from open-world games and incorporate it into footage of said games (by way of a third-party program running outside the game itself).

"Although the source code and the internal operation of commercial games are inaccessible, we show that associations between image patches can be reconstructed from the communication between the game and the graphics hardware," reads another excerpt of the paper. "This enables rapid propagation of semantic labels within and across images synthesized by the game, with no access to the source code or the content."
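As a rough illustration of the idea in that excerpt (and not the researchers' actual implementation), the sketch below assumes each image patch can be tagged with an identifier for the rendering resources that drew it. Once a person labels one patch, every other patch with the same identifier inherits that label, even in other frames. The function name `propagate_labels`, the patch IDs, and the resource keys such as `mesh_17` are all hypothetical and chosen only for the example.

```python
def propagate_labels(patches, seed_labels):
    """Spread hand-assigned labels to every patch that shares a resource key.

    patches:     list of (patch_id, resource_key) pairs, where resource_key is
                 a hypothetical identifier for the resources used to draw it.
    seed_labels: dict mapping a few patch_ids to human-assigned labels.
    Returns a dict mapping patch_id -> label for every patch that could be
    labeled; patches with an unseen resource key are left out.
    """
    # Step 1: learn which resource key corresponds to which label.
    key_to_label = {}
    for patch_id, key in patches:
        if patch_id in seed_labels:
            key_to_label[key] = seed_labels[patch_id]

    # Step 2: give that label to every patch drawn with the same resources.
    return {
        patch_id: key_to_label[key]
        for patch_id, key in patches
        if key in key_to_label
    }


# Hypothetical patches from two frames; only one has been labeled by hand.
patches = [
    ("frame1/p0", ("mesh_17", "tex_4")),
    ("frame1/p1", ("mesh_02", "tex_9")),
    ("frame2/p0", ("mesh_17", "tex_4")),
    ("frame2/p1", ("mesh_33", "tex_1")),
]
labels = propagate_labels(patches, {"frame1/p0": "sidewalk"})
print(labels)
```

In this toy case, a single hand label on one patch in the first frame also labels the matching patch in the second frame, while the two patches with unfamiliar resource keys stay unlabeled until someone annotates them.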

Developers curious to learn more should read the rest of the paper, which also includes a helpful video (embedded above) breaking down how video games can potentially improve computer vision system training regimes.
