A new paper from Microsoft Research in Cambridge details how machine learning routines are being used to improve the speed and accuracy of limb-detection techniques for the company's Kinect sensor.
"Real-Time Human Pose Recognition in Parts from Single Depth Images" [PDF] lays out in detail new machine learning routines that could be used to improve Kinect performance in future software.
The basic process involves analyzing millions of 3D depth maps that were pre-labeled with identifiable body parts -- such as arms, legs and torso. A 1,000-core cluster analyzed roughly a million of these images each day, distilling the aggregate results into an ensemble of decision trees that can quickly identify body parts in new images without needing any of the original labels.
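The trees in the paper split on a simple depth-difference feature: compare the depth at two offsets around a pixel, with the offsets scaled by the pixel's own depth so the feature looks the same whether the body is near or far from the camera. A minimal sketch of that feature, using a synthetic depth map (the function name and values here are illustrative, not from the paper's code):

```python
import numpy as np

def depth_feature(depth, x, y, u, v, background=10.0):
    """Depth-difference feature in the style of the paper (hypothetical
    helper). Offsets u and v are divided by the depth at (x, y), making
    the response roughly depth-invariant."""
    d = depth[y, x]

    def probe(offset):
        px = x + int(round(offset[0] / d))
        py = y + int(round(offset[1] / d))
        if 0 <= py < depth.shape[0] and 0 <= px < depth.shape[1]:
            return depth[py, px]
        return background  # off-image probes read as far background

    return probe(u) - probe(v)

# Toy depth map: a near "torso" block (1.0 m) on a far background (10.0 m).
depth = np.full((40, 40), 10.0)
depth[10:30, 15:25] = 1.0

# For a pixel on the torso, probing a nearby point versus a distant one
# separates body from background with a single subtraction -- cheap
# enough to evaluate at every tree node for every pixel.
f = depth_feature(depth, x=20, y=20, u=(2.0, 0.0), v=(-30.0, 0.0))
# f is strongly negative here: body (1.0 m) minus background (10.0 m)
```

Each internal node of a trained tree holds one (u, v) offset pair and a threshold; training just searches for the splits that best separate the labeled body parts.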
Once the trees are fully built, they're used to probabilistically assign a body part to each depth pixel captured by Kinect. Finally, the system uses these labeled pixels to propose positions for the joints that make up the skeleton of the player's 3D model.
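That last step can be sketched simply: given each pixel's 3D position and its per-part probabilities from the forest, estimate a joint as a confidence-weighted centroid of the pixels for that part. (The paper itself uses mean shift on a probability-weighted density, which is more robust to outliers; the centroid below is the crude version of the same idea, and the part names are illustrative.)

```python
import numpy as np

def joint_proposal(points, probs, part):
    """Simplified joint estimate: confidence-weighted centroid of the
    3D pixel positions for one body part. `probs[i, part]` is the
    forest's probability that pixel i belongs to `part`."""
    w = probs[:, part]
    return (points * w[:, None]).sum(axis=0) / w.sum()

# Toy data: four 3D points (meters) and per-pixel probabilities over
# two hypothetical parts (columns: left_hand, right_hand).
points = np.array([[0.0, 0.0, 1.0],
                   [0.2, 0.0, 1.0],
                   [1.0, 1.0, 1.0],
                   [1.2, 1.0, 1.0]])
probs = np.array([[0.9, 0.1],
                  [0.9, 0.1],
                  [0.1, 0.9],
                  [0.1, 0.9]])

left = joint_proposal(points, probs, part=0)
# The estimate lands near the two high-confidence left-hand pixels,
# only slightly pulled toward the low-confidence ones.
```

Because every pixel is classified independently, the whole pipeline parallelizes cleanly, which is what makes the GPU timing below plausible.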
The researchers claim the Xbox 360 GPU can perform this pattern recognition process in under 5 ms -- a rate of over 200 frames per second -- which the team says is "at least one order of magnitude faster than existing approaches." What's more, the sheer number of training examples means the process reportedly works across a variety of different body types without any calibration poses, as seen in this explanatory video.
Early Kinect demos and software drew criticism from developers and observers for a noticeable lag -- which some measured at hundreds of milliseconds -- between real-world motions and on-screen reactions.