Featured Blog | This community-written post highlights the best of what the game industry has to offer.
Game Analytics 101 Part 5: Data models and predictive analytics
Running a truly data-driven development studio is easy to say, but rare in practice. Most developers have some form of service--internal or from a vendor--that tells them the basic game analytics metrics. And basic analytics are just that--basic. What’s my average revenue per user (ARPU)? How many players do I have today (DAU) or this month (MAU), or what’s the most I have on at one time (peak concurrency)? What’s my basic virality (K-factor)? These are all useful. It’s good to know what happened last week, and yes it’s good to know what’s happening today. Both are important for understanding why some parts of your game are working better than others, how you’re doing, and whether you are trending in the right direction.
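To make those basic metrics concrete, here is a minimal sketch of how a few of them could be computed from a raw event log. The log layout (player ID, day, revenue), the sample rows, and the function names are all made up for illustration. The K-factor line uses the common definition of invites sent per user times the share of invites that convert, which is an assumption about how you measure virality.

```python
from collections import defaultdict
from datetime import date

# Hypothetical event log: (player_id, day, revenue_in_dollars)
events = [
    ("p1", date(2024, 1, 1), 0.00),
    ("p2", date(2024, 1, 1), 4.99),
    ("p1", date(2024, 1, 2), 1.99),
    ("p3", date(2024, 1, 2), 0.00),
]

def daily_active_users(events):
    """Count distinct players seen on each day (DAU)."""
    players_by_day = defaultdict(set)
    for player_id, day, _ in events:
        players_by_day[day].add(player_id)
    return {day: len(players) for day, players in players_by_day.items()}

def arpu(events):
    """Total revenue divided by the number of distinct players in the log."""
    players = {player_id for player_id, _, _ in events}
    revenue = sum(amount for _, _, amount in events)
    return revenue / len(players) if players else 0.0

def k_factor(invites_per_user, invite_conversion_rate):
    """Assumed definition: invites sent per user times the share that convert."""
    return invites_per_user * invite_conversion_rate

dau = daily_active_users(events)
print(dau[date(2024, 1, 2)])      # distinct players on January 2
print(round(arpu(events), 2))     # average revenue per user over the whole log
print(k_factor(2.0, 0.25))        # made-up invite numbers
```

MAU is the same distinct-player count taken over a month instead of a day.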
But this is of course all reactive, and as your mom told you, it’s always better to be proactive. If you know what’s coming, you can plan accordingly. If, for example, you know that player X is not going to spend any money for at least the next 10 days, you might think about taking some sort of action, right? If you know that player Y is going to quit in 4 days and she is worth $100 to you, you’d do something to intervene. (And if you don’t know how, we’ll cover that in a future post.)
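One way to turn forecasts like these into a to-do list is to weigh each player's chance of quitting against what she is worth. The sketch below is only an illustration: the `PlayerForecast` fields, the sample numbers, and the "value at risk" rule (churn probability times predicted value) are assumptions, not the output of any particular model or product.

```python
from dataclasses import dataclass

@dataclass
class PlayerForecast:
    player_id: str
    churn_probability: float  # hypothetical model output, 0.0 to 1.0
    days_until_churn: int     # hypothetical model output
    predicted_value: float    # dollars the player is expected to be worth

def value_at_risk(forecast):
    """Expected dollars lost if nobody intervenes (an assumed, simple rule)."""
    return forecast.churn_probability * forecast.predicted_value

def players_to_contact(forecasts, min_value_at_risk):
    """Keep players above the threshold, highest value at risk first."""
    at_risk = [f for f in forecasts if value_at_risk(f) >= min_value_at_risk]
    return sorted(at_risk, key=value_at_risk, reverse=True)

# Made-up forecasts for three players
forecasts = [
    PlayerForecast("player_x", 0.10, 30, 5.0),
    PlayerForecast("player_y", 0.90, 4, 100.0),
    PlayerForecast("player_z", 0.60, 7, 40.0),
]

queue = players_to_contact(forecasts, min_value_at_risk=10.0)
print([f.player_id for f in queue])
```

With these numbers, player Y lands at the top of the queue and player X drops out, because very little money is at stake there.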
As you delve into predictive analytics for games, what you quickly discover is that it’s not radically different from forecasting the weather. You have data, you have trends, and you have an estimate for what’s going to happen tomorrow. The smartest forecasters actually use scientific data-driven models. And of course, some forecasts (and forecasters) are better than others, and over time you learn which ones you can trust.
You are a developer or a marketer for one, and you’re not interested in the weather. You want to know about people, and specifically about their likelihood to take certain actions. What you care about (or should care about) in the future is:
Quitting (“Churn”)
Going from not spending to spending (“Conversion”)
How much they spend and when (In-Game “Monetization”)
How much they spend in total (Player LTV, or lifetime value)
Social Value (How much a user truly matters)
Time on site (“Time Spent” or in some quarters, “Engagement”)
Ads clicked (“Happy Advertisers”)
Now if we change from predicting the weather to predicting people, this may feel slightly science fiction-y. Maybe you’ve read Asimov and you remember the idea of psychohistory in the Foundation series (Asimov is pretty amazing if you’ve never been exposed to him). There, an advanced civilization made a science out of understanding humans and societies so well that they could accurately forecast the actions of individuals. But that’s just a book. I mean, it’s impossible to predict the future, right?
Technically, no, it’s not impossible. There isn’t magic, just--as Asimov predicted--math and typically some PhDs to figure out how to do it. Luckily, the science here is getting easier to deal with as more and better tools become available. Do you need to hire a bunch of PhDs? You can, but as I’ll note below, it’s not terribly practical (or cheap). Still, it’s important that you understand what actually happens so you understand how usable, how actionable, the results are. So how does this all work?
It works most of the time with machine learning. What is that? It’s pretty much what it sounds like. You let a program look at your data and it looks for patterns and tells you what it finds. Fitting the program to your historical data is called “training.” If you also give it some direction, such as examples labeled with the outcome you care about, that’s “supervised” learning, and it usually helps the program get better results, faster.
What does machine learning do? Let’s say there are many actions you track. The program sees this pattern showing up: A, followed by B, followed by C, followed by D. It seems to happen in that order a fair amount. Say the program saw A-B-C 50 times and 45 of those times the next action was D. That’s 90% of the time. Why should you care? Well, imagine that “D” is spending or quitting. Now you care a lot about what A, B and C are, right? And you care how often they happen, and how often D happens.
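The A-B-C-D counting can be sketched in a few lines. This is a toy version of the idea, not what a production machine learning system does: the session data is invented to match the 50 and 45 in the example, and the function names are hypothetical.

```python
from collections import Counter

def next_action_counts(sessions, pattern):
    """Count which action follows each occurrence of `pattern`."""
    pattern = tuple(pattern)
    n = len(pattern)
    counts = Counter()
    for actions in sessions:
        # Stop early enough that a "next action" always exists.
        for i in range(len(actions) - n):
            if tuple(actions[i:i + n]) == pattern:
                counts[actions[i + n]] += 1
    return counts

def follow_rate(sessions, pattern, outcome):
    """Share of pattern occurrences that were followed by `outcome`."""
    counts = next_action_counts(sessions, pattern)
    total = sum(counts.values())
    return counts[outcome] / total if total else 0.0

# Toy data: A-B-C appears 50 times, and D follows it 45 of those times.
sessions = [["A", "B", "C", "D"]] * 45 + [["A", "B", "C", "E"]] * 5

rate = follow_rate(sessions, ["A", "B", "C"], "D")
print(rate)  # 0.9
```

If “D” is quitting, a rate like that is a good reason to go look at what A, B and C actually are.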
We’re just talking brute-force pattern recognition. There are a lot of flavors of it, and they work best in the hands of a data scientist or a top-notch service. The results will usually either be very accurate but not understandable, or very understandable but not as accurate. Your scientists will ask you which you care more about.
If you’re into this sort of thing, I have a white paper on it that gets into more depth here.
The last thing you want to think about for predictive models is how often you want them. Let’s say you have some PhDs to crank these out for you. Are you paying them to do it one time, or every week or month? Maybe knowing the answers once is good enough, but if you aspire to be a modern big-data-driven developer, you want the answers daily. And that means the process is automated. So you want a system in place that’s telling you via some dashboard or email announcement today’s batch of results.
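As a rough picture of what "automated" means here, the sketch below scores every player and builds the text of a daily summary. Everything in it is a stand-in: `score_player` is a fixed rule playing the part of a trained model, and the player fields and the 0.5 threshold are made up. Scheduling the job each day and sending the result to a dashboard or inbox is left to whatever tooling you already use.

```python
from datetime import date

def score_player(player):
    """Placeholder for a trained model; a fixed rule stands in for the real thing."""
    return 0.8 if player["days_since_last_session"] >= 3 else 0.1

def build_daily_report(players, run_date, threshold=0.5):
    """Score every player and return a plain-text summary for that day."""
    flagged = [p["player_id"] for p in players if score_player(p) >= threshold]
    lines = [
        f"Churn report for {run_date.isoformat()}",
        f"Players scored: {len(players)}",
        f"Flagged as likely to quit: {len(flagged)}",
    ]
    lines.extend(f"  - {player_id}" for player_id in flagged)
    return "\n".join(lines)

# Made-up snapshot of player activity
players = [
    {"player_id": "p1", "days_since_last_session": 0},
    {"player_id": "p2", "days_since_last_session": 5},
    {"player_id": "p3", "days_since_last_session": 1},
]

report = build_daily_report(players, date(2024, 1, 2))
print(report)
```

The point of the design is that nobody has to ask for the numbers: the same function runs on each day's data and the output shows up on its own.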
That’s the end of 101. I hope it was a useful intro to the broad area of analytics. Please let me know what you think in the comments below, or visit our blog.