This post is part one of a five-part series on analytics in the gaming industry:
Part 1: Should I deal with this? How do I start?
Part 2: Basic Definitions
Part 3: Understanding Social in Games
Part 4: Cohort Analysis and Segmentation
Part 5: Predictive Analytics
The “need for big data” is buzzing around in almost every technology-enabled industry today. According to one survey, 77 percent of companies consider big data and analytics a priority in their organization. This so-called “big data revolution” has gotten to the point, though, where the term is more of a buzzword than a meaningful strategy.
I hear a lot of talk from game developers about becoming “data driven” and incorporating analytics into their business models, but the people I talk to generally can’t describe what that means for their company. The risk is that people implement an analytics program just for the sake of doing so, without incorporating it into a larger business intelligence strategy.
“Big data” shouldn’t be an abstract concept anymore: its use should be part of an actionable strategy that combines player data (which game developers are already collecting) with analysis to help developers achieve their business goals and create more successful games.
Before we get into the strategy, though, let’s take a step back and break down the buzzword. “Big data” refers to the idea that there’s a lot of data being produced now from many different sources, and we need to move beyond our conventional set of tools to be able to analyze it. This is especially true for the gaming industry, where gamers are producing 50 terabytes of data every day. To put it in perspective, the human brain can only hold the equivalent of about 1.25 terabytes of data, in total. It’s a lot of information being thrown around, and we need new tools to be able to turn it into something useful.
We do that through game analytics. In the gaming industry, the data we’re taking in is generally what players are doing in a game, plus their demographic information, social network information, etc. Analytics is taking that data and making sense of it: finding patterns, for example, or using tools to figure out who’s likely to leave the game. At its core, analytics is just manipulating your data, so as you could probably guess, there are countless ways to analyze it and many different metrics to choose from.
But Step 1 is getting this thing stood up. There are four pieces to understand: instrumenting the title, reviewing the SDK or APIs, getting comfortable with security, and understanding your need for real-time-ness.
Instrument the title
Instrumentation simply means you’re logging the events that will power all of the metrics. This doesn’t mean you need to build a database and store them, but there does need to be an event to capture. Analytics companies will tell you which events to capture, and how to report them. It’s pretty straightforward.
Why do you need to instrument the application? When an event happens, say a user logs in, the game system needs to be able to say “User 3482 logged in at time XX:XX:XX.” The analytics company supplies a piece of code that then fires this event off, typically to a URL in the cloud, where the data are collected, processed, and then displayed.
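To make that concrete, here is a minimal sketch in Python of what a single event might look like before it is sent. The field names (`user_id`, `event`, `timestamp`, `properties`) and the helper functions are assumptions for illustration, not any vendor’s actual schema, and the timestamp is an arbitrary example value. A real SDK would also handle the sending; this sketch stops at building the payload.

```python
import json
from datetime import datetime, timezone


def build_event(user_id, event_name, occurred_at=None, properties=None):
    """Build one analytics event as a plain dict (hypothetical schema)."""
    if occurred_at is None:
        occurred_at = datetime.now(timezone.utc)
    return {
        "user_id": user_id,
        "event": event_name,
        "timestamp": occurred_at.isoformat(),
        "properties": properties or {},
    }


def serialize_event(event):
    """Turn the event into the JSON string that would be sent to the vendor."""
    return json.dumps(event, sort_keys=True)


# "User 3482 logged in at time XX:XX:XX" as a structured event.
# The date below is an arbitrary example value.
login = build_event(3482, "login", datetime(2014, 1, 1, 12, 30, 0, tzinfo=timezone.utc))
payload = serialize_event(login)
print(payload)
```

The point is only that each event carries who did what, and when. Everything downstream is built from records like this one.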
Review the SDK or APIs
The SDK code is supplied in a library that essentially says, “wherever you have your event happening, put this line of code here.” That integration is very straightforward, and should come with instructions for your engineers. If there’s an event the video game analytics company wants that you don’t have, you’ll either need to start capturing it, or there will be some metric that you won’t get. For example, if you want to know how many players are on level 8 and you don’t collect an event like “advanced from level 7 to 8,” that metric isn’t going to show up. A good SDK will also let you capture an event you care about that it doesn’t foresee. Maybe your game has snowboarding and you want to capture and display half-pipe tricks. The SDK should support that. In general, this is not a complicated process. However, you will need to give your engineers the time to do it, or you simply won’t get even the basics.
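Here is a rough sketch of why a missing event means a missing metric. It assumes a hypothetical `level_advanced` event with a `to_level` field, plus a custom `halfpipe_trick` event for the snowboarding example; none of these names come from a real SDK.

```python
from collections import Counter

# Hypothetical event log: level-ups plus one custom event.
events = [
    {"user_id": 1, "event": "level_advanced", "to_level": 8},
    {"user_id": 2, "event": "level_advanced", "to_level": 7},
    {"user_id": 2, "event": "level_advanced", "to_level": 8},
    {"user_id": 3, "event": "level_advanced", "to_level": 5},
    {"user_id": 3, "event": "halfpipe_trick", "trick": "backside_540"},
]


def current_levels(events):
    """Map each player to the highest level they have advanced to."""
    levels = {}
    for e in events:
        if e["event"] == "level_advanced":
            uid = e["user_id"]
            levels[uid] = max(levels.get(uid, 0), e["to_level"])
    return levels


def players_per_level(events):
    """Count how many players currently sit on each level."""
    return Counter(current_levels(events).values())


counts = players_per_level(events)
print(counts[8])  # players on level 8

# With no level_advanced events in the log, the metric has nothing to count.
no_level_events = [e for e in events if e["event"] != "level_advanced"]
print(players_per_level(no_level_events)[8])
```

If the game never records the level-up, the second count is all you will ever see, no matter how good the dashboard is.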
Alternatively, you can send data from your servers via an API. The vendor should have that code available to review, as well as examples of it in action.
Security and Enterprise vs. SaaS
Your game may or may not have a database where you store event logs right now. If you do have one, where is it? Up in the cloud or local? What kind of security do you have on it? Your data are of course precious, and you can ill afford to expose personally identifiable information to the world. The good news is that metrics solutions shouldn’t need this info. However, any company still needs to get your data. You deliver it in one of two ways.
The first option is an enterprise deployment where the firm comes in and copies from your existing database to one they will set up on site. This is fairly complex because the firm will have to map from your data schema to theirs. For example, if they have an event called “Kill_Mob” and you call it “Slay_Foo,” then a line of code has to be written to translate it (this mapping is part of what’s called “ETL,” short for extract, transform, load). So, this method will take longer and cost more. The advantage is that the data don’t leave your shop. However, opinions vary as to whether that is more or less secure. Chances are, lawyers will weigh in, but if possible let engineers drive the decision since they are more likely to understand what’s safe and where.
The second, more common option is to install an SDK so that events fire from your client app into the cloud. Nearly every video game analytics company is using Amazon or the equivalent, and the security is typically excellent. Some engineers believe their data are safer there than locally.
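The translation step can be sketched in a few lines of Python. The “Slay_Foo” to “Kill_Mob” pair comes from the example above; the `EVENT_NAME_MAP` table, the row layout, and the `translate_event` helper are assumptions for illustration. A real mapping would cover every event and probably field names as well.

```python
# Maps your event names to the vendor's (hypothetical table).
EVENT_NAME_MAP = {
    "Slay_Foo": "Kill_Mob",
}


def translate_event(row, name_map=EVENT_NAME_MAP):
    """Return a copy of the row with its event name in the vendor's schema.

    Names with no mapping pass through unchanged.
    """
    translated = dict(row)
    translated["event"] = name_map.get(row["event"], row["event"])
    return translated


source_rows = [
    {"user_id": 3482, "event": "Slay_Foo"},
    {"user_id": 3482, "event": "login"},
]
vendor_rows = [translate_event(r) for r in source_rows]
print(vendor_rows)
```

One line per event sounds cheap, but someone has to find and verify every one of them, which is where the extra time and cost go.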
One last note on security. Since any decent dashboard allows exporting, giving access to that dashboard is giving the power to spread data. Now as a rule, the world is a much better place if you give more people access to information, and the same is true inside your company. For example, if marketing can see developer dashboards and vice versa, they are more likely to speak the same language and have common reference points. However, you may not want everyone in customer support to see the financial data. Or maybe you do. A good dashboard system will allow you to set those kinds of permissions. Consider your local culture and values before using them.
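As a sketch of what such permissions might look like, here is a simple role-to-dashboard table in Python. The role names, dashboard names, and both helper functions are hypothetical; a real dashboard product will have its own permission model.

```python
# Which dashboards each role may open (hypothetical roles and dashboards).
ROLE_DASHBOARDS = {
    "marketing": {"acquisition", "retention", "gameplay"},
    "developer": {"acquisition", "retention", "gameplay"},
    "customer_support": {"retention"},
    "finance": {"acquisition", "retention", "revenue"},
}


def can_view(role, dashboard):
    """True if the role is allowed to open the dashboard."""
    return dashboard in ROLE_DASHBOARDS.get(role, set())


def can_export(role, dashboard):
    """Anyone who can view a dashboard can also export its data."""
    return can_view(role, dashboard)


print(can_view("marketing", "gameplay"))
print(can_export("customer_support", "revenue"))
```

Note that `can_export` simply follows `can_view`. That is the point made above: whoever can see a dashboard can spread what is on it.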
Understand “real time”
This brings up one last issue, which is the “real time”-ness of your data. Nearly every video game analytics company says they offer “real time analytics.” To some extent, this is statistical and marketing sleight of hand. Yes, we can all process your users and tell you how many logged in, or are on level 5, or spent money. A big Excel spreadsheet in the cloud is indeed basically instantaneous. But modeling (vanilla or predictive) is not something this big Excel program can do, and it’s definitely not instantaneous. So, consider that if everything a firm can do is in “real time,” they’re not doing anything particularly special or powerful.
Models are special and powerful, and it takes time to run them. That’s going to vary by how complex they are and how many players and data points are involved. If your game has 10k players, those models are going to be basically instantaneous. If your game has 10m players, it might take a few hours. Heavy duty models like ours that also crunch social interactions take longer still. So, one thing you want to always understand is how often your predictive analytics are being run. We suggest daily because of the tradeoffs of processing cost, as well as developers’ typical inability to act on things faster. But, if you are nimble, or can afford extra costs, consider having your models run more frequently. It’s about actionability—if you would take action, consider springing for it.
OK, so now you’re set up and you need to know what to do first. This is where people start getting lost along the data trail, using the wrong metrics and ultimately ending up with worthless information. As a game developer, the most important question you need to ask before getting into analytics is, “What do I want to know?”
For example, you can ask what promotions are providing the highest ROI, how stable your player base is, or what mechanics are driving player conversion. From there, you can pick out the most appropriate tools to use. This way, you can avoid getting set up with a dashboard that measures K-factor (a measure of how viral your game is) when what you actually want to know is your churn rate. TL;DR: Know the goal before you pick the tool.
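To show how different those two questions are, here is a minimal sketch using common textbook definitions: K-factor as invites per inviting user times the invite conversion rate, and churn rate as players lost over a period divided by players at the start of it. The function names and input numbers are made up for illustration, and analytics vendors may define either metric somewhat differently.

```python
def k_factor(invites_sent, inviting_users, invite_conversions):
    """Virality: average invites per user times the share of invites that convert."""
    if inviting_users == 0 or invites_sent == 0:
        return 0.0
    invites_per_user = invites_sent / inviting_users
    conversion_rate = invite_conversions / invites_sent
    return invites_per_user * conversion_rate


def churn_rate(players_at_start, players_lost):
    """Share of the starting player base that left during the period."""
    if players_at_start == 0:
        return 0.0
    return players_lost / players_at_start


# Made-up inputs: the two metrics need entirely different data.
print(k_factor(invites_sent=500, inviting_users=100, invite_conversions=50))
print(churn_rate(players_at_start=1000, players_lost=150))
```

K-factor needs invite data and churn needs activity data, so a dashboard built around one tells you nothing about the other.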
In the next post, I’ll review the simplest tools: basic metrics.