Motion Capture on a Budget - Animating Satellite Reign

When faced with the problem of having to create a large volume of animations on a tight budget, and with an animation team consisting of a single person, I turned to a cheap, markerless mocap solution for Satellite Reign.

support.satellitereign.com

Mitchell Clifford, Blogger

October 17, 2013


Something we’ve seen raised a number of times around the internet is the fact that a title like Satellite Reign is fairly ambitious for a small team, which is a fair point. Essentially, what this means is we each have to be as efficient as possible in order to achieve our goals. The five of us are currently in pre-production, figuring out our individual pipelines, trying to minimise headaches further down the line. Today, I’m going to give a rundown on how I’ll be approaching the animation aspect of Satellite Reign.

Animating an entire city population is no small task, especially when your entire animation team consists of one person. So, in order to get through the amount of work in front of me, I need a method to quickly create large volumes of animation. Animation is an incredibly time-consuming task, so animation-heavy games often have reasonably large animation teams. We’re on a very limited budget, so that isn’t really an option for us. This is what made me start looking at motion capture.

For anyone who isn’t familiar, motion capture (also referred to as ‘mocap’) is a process in which animation data is captured from real-life actors. You may have previously seen images or videos of people in funny looking lycra suits covered in dots, like this:

This is Andy Serkis performing in The Lord of the Rings as Gollum. The little dots all over his suit reflect infrared light, which is recorded and isolated by an array of cameras (numbering in the dozens) surrounding the entire stage. With this data, a software package can track the dots from all of the cameras in 3D space, and that tracking data is in turn used to control a 3D character, sort of like a puppet.

The great thing about motion capture is you can get huge amounts of incredibly realistic movement as quickly as it can be acted out. It’s never quite perfect, as the actor’s proportions usually differ from those of the 3D character, but an animator can fairly easily make the necessary adjustments, which is much quicker than creating the animations from scratch.

But of course, this sort of setup doesn’t come cheaply. Not only does it require a lot of expensive specialised equipment and software, but you also need a fairly generous amount of room, which is another limitation we have. But, there’s an alternative.

Back when the five of us were throwing around our first ideas for the Kickstarter, I started looking into what was being done with the Xbox 360 Kinect sensor. Some very clever people had managed to get their PCs to take the depth data from the Kinect and use it with standard 3D animation packages to produce homebrew mocap software, without the need for big stages or silly lycra suits.

And so, I jumped on my bike and set off to Dean’s house to borrow his Kinect. I took it home, blew the dust off, plugged it into my PC, pointed it out my bedroom door, and walked down the hallway. The software showed a 3D depth recording of my motion, which it was able to analyse and use to drive a character skeleton. Below is an example of a capture I did at a friend’s place (his house had a little bit more room to move than mine).

I was fairly impressed. The motion of me walking was viewable in 3D on my PC in a matter of minutes. However, there was a noticeable drop in quality compared to a professional mocap package. The feet were sliding around on the ground, and anything the Kinect couldn’t see (e.g. my far arm when turning side on) would flip out and go crazy.

But, the software I used (called iPi Mocap Studio) has the capability of using two Kinect sensors at the same time, meaning I could have a second one offset by about 90 degrees to the first, so there would always be line-of-sight to all of my limbs. Brent loaned me his Kinect to give it a go.

Unfortunately, calibrating the two Kinects to work together wasn’t as reliable as expected, and the final results weren’t much better. Performing actions on the spot was fine, but as soon as I wanted to take a few steps, the Kinects would quickly lose sight of my feet and head, due to a combination of the limited range of the depth sensor, the narrow field of view, and me being rather tall (I’m 196cm). It was enough for me to get away with for the Kickstarter video, along with a free-for-use mocap library from the internet, but it wasn’t going to cut it for actual production.

Fast-forward to post-Kickstarter, once we’d finally moved into our office. There was another option I had been planning to investigate. iPi Mocap Studio also allows for the use of standard webcams, from which it can analyse regular old video from multiple cameras to extract 3D motion data. It sounded iffy, since the whole reason the Kinect is able to track motion is due to its depth sensor. Regular cameras have no way of knowing how far away anything is from the lens, so how could it possibly produce 3D animation?

Regardless, I decided to give it a try. As it turns out, PlayStation Eye cameras are not only cheap (I got them for about $18 AUD each), but they’re actually significantly better than your standard webcam. They have great light-sensitivity, they record at up to 60 frames per second (many webcams are capable of less than half that), and they’ve got a nice wide (and adjustable) field of view.

Calibration isn’t quite as easy as with a single Kinect (where it was pretty much entirely unnecessary), but it still turned out to be fairly straightforward. Once I placed the four cameras around the office in a semi-circle arrangement, calibration was as simple as moving a bright light around the capture area. The software can easily pick the light out from the rest of the video, and by analysing its movement, it can determine the cameras’ positions relative to one another in 3D.
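To give a rough idea of why a bright light is so easy to pick out of ordinary video, here is a minimal sketch in Python. It is not how iPi Mocap Studio does it (I don't know its internals); it simply assumes a greyscale frame stored as a list of rows of 0-255 brightness values, and the `find_light` name and the threshold of 200 are made up for illustration.

```python
def find_light(frame, threshold=200):
    """Return the (x, y) centroid of all pixels at or above `threshold`.

    `frame` is a list of rows of 0-255 brightness values (an assumed,
    simplified stand-in for one video frame). Returns None if no pixel
    is bright enough, e.g. when the light is out of shot.
    """
    count = 0
    sum_x = 0
    sum_y = 0
    for y, row in enumerate(frame):
        for x, value in enumerate(row):
            if value >= threshold:
                count += 1
                sum_x += x
                sum_y += y
    if count == 0:
        return None
    return (sum_x / count, sum_y / count)


# A tiny 4x4 "frame": a dim room with a bright 2x2 blob in the middle.
frame = [
    [10, 12, 11, 10],
    [10, 250, 255, 12],
    [11, 252, 251, 10],
    [10, 11, 12, 10],
]
print(find_light(frame))  # (1.5, 1.5)
```

Doing this for every frame from every camera gives each camera its own 2D trail of the light, which is the raw material a calibration step can work from.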

The green trail is the final result of the calibration process. It’s a 3D representation of the path the flashlight took around the capture area. The lowest points are so the software can determine the location of the ground relative to the cameras. Below, the grey 3D film-cameras represent the locations of the PlayStation Eye cameras.
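Once the cameras' positions are known, a point seen by two or more of them can be located in 3D even without a depth sensor. The general idea is triangulation: each camera defines a ray from its lens towards the point, and the point sits where the rays (nearly) cross. The Python sketch below illustrates that idea only; it is not iPi Mocap Studio's algorithm, and the `triangulate` function, camera positions and ray directions are all invented for the example.

```python
def dot(a, b):
    """Dot product of two 3D vectors given as sequences."""
    return sum(x * y for x, y in zip(a, b))


def triangulate(p1, d1, p2, d2):
    """Estimate a 3D point from two camera rays.

    Each ray starts at a camera position (p1, p2) and points along a
    direction (d1, d2). Real rays rarely intersect exactly, so this
    returns the midpoint of the shortest segment joining them.
    """
    w = [a - b for a, b in zip(p1, p2)]
    a = dot(d1, d1)
    b = dot(d1, d2)
    c = dot(d2, d2)
    d = dot(d1, w)
    e = dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; cannot triangulate")
    t = (b * e - c * d) / denom  # distance along ray 1
    s = (a * e - b * d) / denom  # distance along ray 2
    q1 = [p + t * v for p, v in zip(p1, d1)]
    q2 = [p + s * v for p, v in zip(p2, d2)]
    return tuple((x + y) / 2 for x, y in zip(q1, q2))


# Two hypothetical cameras 4 units apart, both looking at the same point.
point = triangulate((0, 0, 0), (2, 1, 3), (4, 0, 0), (-2, 1, 3))
print(point)  # (2.0, 1.0, 3.0)
```

More cameras mean more rays per point, which helps when one camera's view is blocked.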

And from there, it’s ready to go. The actor (myself, in this case) then performs whatever motions are required, and the video streams are processed by iPi Mocap Studio. It wasn’t all clear-sailing, however.

The lack of depth information means the PlayStation Eyes are much more sensitive to environmental factors. The actor needs to contrast clearly with the environment, which proved doubly difficult for us, since our carpet is dark blue (like jeans), and our walls are white. Either the actor wears clothes which contrast with both, or you make the floors and walls the same colour. The latter was easy enough to achieve by buying some cheap white sheets to lay across the ground. Below, you can see one of my earlier tests. The floor and walls are light, and my clothes are dark, allowing me to contrast well with the environment.

The pink lines over me are called a skeleton. When 3D characters are animated, whether it’s for games or movies, what’s actually being animated is this skeleton, which the polygonal character model is attached to. The bones in the skeleton are placed in more or less the same places as in a real skeleton. As you can see, the software had no issues positioning the skeleton correctly, which it will then do for every frame of video, until you have a complete animation. However, the white sheets brought on another issue. Sheets on carpet lack friction…

The sheets worked well as long as I didn’t have to do any big, fast movements. But big, fast movements are what we want, so some more experimentation was required. That’s where my “Christmas clothes” came into it.

While the pants aren’t quite as “contrasty” against the floor as I’d like, they get the job done, and with this, there’s no need for the white sheets on the floor. The socks still aren’t ideal, but they give enough traction to move around. And with that, I’ve got data to transfer onto our placeholder in-game agent.

With some minor adjustment here and there, the animations are ready for export into the game. This is where animation begins to move from the creative end of the spectrum towards the more technical side of things.

As of release 4.0, Unity has a cool new animation system which they’ve named “Mecanim.” Basically, it gives animators complete control over how and when their animations play in game, rather than a programmer having to manually trigger every animation via a script. Now, Chris and Mike simply tell the animation system basic bits of information about the character, like current speed, direction, health, currently equipped weapon, etc. I take that information and use it to control animation playback.

So, when the agents spawn, they start in their default “idle” state, which is just an animation of them standing there, looking around. From there, I can transition to various other states, based on the previously mentioned values (speed, direction, etc.). Here’s a basic example:

Once the “speed” value goes above zero, the agent will start walking.
If the speed value continues to rise, they’ll transition into a “run” animation.
If their direction changes while running, they’ll blend into a “turning run” animation.
If their speed suddenly goes back to zero, they’ll play a “stop running” animation, which then transitions back to their default “idle” animation.
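For anyone who thinks better in code, the transitions above can be sketched as a small state machine. In Mecanim this is all set up visually rather than written by hand, so the Python below is purely an illustration of the logic: the state names, the `next_state` function and the threshold values are assumptions made up for the example, and the "slow down from run to walk" and "walk back to idle" rules are my own additions to make the sketch complete.

```python
RUN_THRESHOLD = 3.0    # assumed speed above which the agent runs
TURN_THRESHOLD = 15.0  # assumed change in direction (degrees) that counts as a turn


def next_state(state, speed, turn):
    """Return the next animation state given the current state and inputs."""
    if state == "idle":
        return "walk" if speed > 0 else "idle"
    if state == "walk":
        if speed <= 0:
            return "idle"  # assumption: walking stops straight to idle
        return "run" if speed > RUN_THRESHOLD else "walk"
    if state in ("run", "turning_run"):
        if speed <= 0:
            return "stop_running"
        if speed <= RUN_THRESHOLD:
            return "walk"  # assumption: slowing down drops back to a walk
        return "turning_run" if abs(turn) > TURN_THRESHOLD else "run"
    if state == "stop_running":
        return "idle"  # the stop animation plays, then hands back to idle
    raise ValueError(f"unknown state: {state}")


# Feed in a sequence of (speed, turn) values, as the gameplay code might.
inputs = [(0, 0), (1, 0), (5, 0), (5, 30), (5, 0), (0, 0), (0, 0)]
state = "idle"
history = []
for speed, turn in inputs:
    state = next_state(state, speed, turn)
    history.append(state)
print(history)
# ['idle', 'walk', 'run', 'turning_run', 'run', 'stop_running', 'idle']
```

The point is that the gameplay side only ever supplies values like speed and direction; which animation plays is decided entirely by the transition rules.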

Above, you can see what the Mecanim editor looks like. The boxes represent animation states, and the arrows between them are where I control the transition conditions. This is all handled by me, without Mike or Chris ever having to worry about the animation system. It also means that as they update the way the agent movement is handled, the animations will automatically adapt to their changes. It takes a significant amount of work off the programmers’ plates, while simultaneously allowing for much more fluid animation playback. It’s a win-win!

I hope this has been an interesting insight into the animation pipeline, and helps give an idea about some of the things going on here at 5 Lives. The other guys will give a run-down of their processes over the course of Satellite Reign’s production, and you’ll likely start to see where our disciplines overlap.

http://support.satellitereign.com/