Featured Blog | This community-written post highlights the best of what the game industry has to offer. Read more like it on the Game Developer Blogs.
Everyone wants to know their players. But to truly understand them you need a lot of data, and with lots of data come lots of problems. How do you manage this without losing your hair? This is how we did it...
April 16, 2015
7 Min Read
Speed kills (in a good way)
Change your game designers' lives with faster data.
Jozo Kovac, INFINARIO Vision Designer
Speed is important; most people will agree. But switching to a new system can carry overhead, and the detangling of legacy code can make you not want to switch at all. We've learned that when it comes to data, switching saves so much time in the long run that it's worth the overhead. We also learned that if you want anyone to try a new tool, you've got to reduce that overhead as much as possible.
Our very first client at Infinario had an in-house system that could spit out all the stats they wanted for their games. It was a great source of collected knowledge which tracked all their key metrics, and it was able to provide some very specific and deep design insights. The problem was that whenever somebody needed a new view on the data, their IT team had to create a new screen. Usually that took around a week. Not great.
When requests like this stacked up, the most important ones were processed first, naturally. Financial and accounting data always had the highest priority. Toward the bottom, you'd find requests for game-specific views that were important to game designers. Not surprisingly, this company's game designers eventually realized they were going to have to live without deep data insights. It took so long that by the time the results came the situation had already changed, and they needed more tweaks, for which they'd have to wait even longer still.
First attempt: A big enterprise-style data warehouse
So we knew what our task was - we had to create a tool for ad-hoc analysis. We created a data warehouse with some OLAP cubes and a query tool. It worked well enough on a good day. On other days, though, the company's game developers would change something like the definition of the player tables, and the ETL jobs would jam. Suddenly we had holes in our data that couldn't be repaired, and it happened far too often.
Adding new attributes was a painful process. The games didn't collect "unnecessary" data by default, and even when something new was added, the analytics team had to spend days updating code. Analysts wrote SQL instead of analyzing information. And the game designers weren't satisfied - all they got were excuses and promises that "it'll be better with the next revision."
New technology will save us! Maybe!
In our next iteration, we decided to move all collection of custom data from the backend to the front end. We figured we would create an SDK to enable simple event tracking directly into a central player database. Ideally we would do this without needing to define the structure of events beforehand, which would save us a lot of time. So we switched from SQL to NoSQL databases: in NoSQL, structure doesn't have to be defined in advance, and scaling out is easier.
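We can't publish the real SDK here, but the core idea - tracking arbitrary events without a predefined schema - fits in a few lines. This is only a sketch: `EventTracker`, `track`, and the event names below are invented for this post, not the actual Infinario API.

```python
import time
from collections import defaultdict

class EventTracker:
    """Toy sketch of schemaless event tracking: events are free-form dicts
    attached to a player id, with no table structure defined up front."""

    def __init__(self):
        # player_id -> list of event dicts (an in-process stand-in for a NoSQL store)
        self.events = defaultdict(list)

    def track(self, player_id, event_type, **properties):
        # Any properties are accepted; adding a new field needs no migration.
        self.events[player_id].append({
            "type": event_type,
            "timestamp": time.time(),
            **properties,
        })

tracker = EventTracker()
tracker.track("player-42", "level_up", level=7, gold=120)
tracker.track("player-42", "purchase", item="sword", price_usd=1.99)
```

The point is the `**properties` catch-all: when a designer wants a new attribute tracked, the game just starts sending it - nobody has to alter a table first.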
Even the initial results amazed us. It worked like a charm. We built a front end that could provide a set of analyses, and we wrote map-reduce jobs to produce funnels, retention cohorts, and everything else. But after that initial success, our users started to become less and less satisfied. Every day it took longer and longer to get results. The designers had no time or patience to wait for data. "Display it instantly or it's worthless," they said. And if we couldn't provide that, we'd be out of a job.
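For a sense of what those jobs computed, here is a minimal retention-cohort sketch (not our actual map-reduce code): group players by the day of their first event, then measure what fraction are still active N days later.

```python
from collections import defaultdict

def retention_cohorts(events):
    """Minimal retention-cohort sketch. `events` is a list of
    (player_id, day_number, event_type) tuples; a player's cohort
    is the day of their first recorded event."""
    # "Map" phase: each player's first day and the set of days they were active.
    first_day = {}
    active_days = defaultdict(set)
    for player, day, _ in sorted(events, key=lambda e: e[1]):
        first_day.setdefault(player, day)
        active_days[player].add(day)

    # "Reduce" phase: per cohort, the share of players active N days after joining.
    counts = defaultdict(lambda: defaultdict(int))
    cohort_size = defaultdict(int)
    for player, start in first_day.items():
        cohort_size[start] += 1
        for day in active_days[player]:
            counts[start][day - start] += 1

    return {start: {offset: n / cohort_size[start] for offset, n in offsets.items()}
            for start, offsets in counts.items()}

events = [("a", 0, "login"), ("a", 1, "login"), ("b", 0, "login"), ("b", 3, "login")]
result = retention_cohorts(events)
# result[0] maps day offsets to retention for the day-0 cohort
```

Correct and simple - but as a batch job over the full event log, it is exactly the kind of thing that gets slower every day as the log grows.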
“Display it instantly”
Our first shot at this was to run previews on a sample of our data; we went with 20% as an initial try. The game designers couldn't care less. "I need all my data or you may as well not show me any," they told us. This makes sense in the world of free-to-play games, where often just 2% of players actually pay. Why would you care about data from only 20% of those 2%? On top of that, sampling meant every AB test needed a population five times the size - and what if that AB test went wrong and turned off five times more players? So we had to agree with them (hello, Google Analytics!).
“I need all my data … instantly“
We started looking around for a faster database. Everybody wanted ad-hoc analyses, not predefined ones - there are already solid solutions for predefined statistics - so caching or any kind of pre-calculation wasn't a solution to this problem. Elasticsearch, Cassandra, Bigtable, Hadoop, Apache Spark... none of these could deliver results for 100 million rows instantly while matching the flexibility requirement. So we put all the data in memory, optimized for space and speed, and added scalability options.
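Our in-memory engine is proprietary, but the basic trade-off is easy to illustrate: if events live in memory as columns, an arbitrary ad-hoc filter is just a linear scan, with nothing precomputed. A toy illustration - the class and method names are made up for this post, and a real engine would add compression, partitioning, and parallel scans:

```python
class InMemoryEvents:
    """Toy illustration of the in-memory idea: events stored as columns,
    so any ad-hoc filter is a single linear scan with nothing precomputed."""

    def __init__(self):
        self.columns = {}  # column name -> list of values, one slot per event
        self.size = 0

    def append(self, event):
        # New fields just become new columns, back-filled with None.
        for name in set(self.columns) | set(event):
            self.columns.setdefault(name, [None] * self.size).append(event.get(name))
        self.size += 1

    def count_where(self, predicate):
        # An ad-hoc query: no index, no cache, just a scan over memory.
        rows = (dict(zip(self.columns, values))
                for values in zip(*self.columns.values()))
        return sum(1 for row in rows if predicate(row))

store = InMemoryEvents()
store.append({"type": "purchase", "price_usd": 1.99})
store.append({"type": "level_up", "level": 7})
paying_events = store.count_where(lambda row: row["type"] == "purchase")
```

Because no query shape is assumed in advance, any filter a designer dreams up runs at the same memory-bandwidth speed as any other - which is the flexibility the pre-aggregated solutions couldn't give us.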
Doing it right
Now everything is different. Game designers have created hundreds of custom ad-hoc analyses in our solution. They explore data from all points of view, and they experiment and analyze with far greater ease.
In fact, the analytics team doesn't even have to code anymore. They are in day-to-day communication with game designers, helping them design experiments and AB tests. Together they decide what new data to track, pass it to developers, then validate the results. They deliver answers to game designers' problems in a structured way: 1) what's the problem, 2) what does the data say, 3) what are the next steps based on this information.
Analysts' perceived value in the company is actually much higher now. They are also easier to hire, as no tech skills are required - just experience with design, UX, and so forth.
Game designers spend more time running experiments and doing deep analysis of the results, and they now have proof for their ideas. They can show how game tweaks improved key metrics, and can explain where others should improve. This lets their games grow in value.
One example to rule them all: 2 months or 90 minutes?
To recap, let's imagine the process of player segmentation in a traditional company. People from various departments (game design, marketing, finance, et cetera) meet in a room, discuss, and propose a segmentation on day 1. Then IT or the analysts have to spend a day implementing the rules and getting the numbers. They present results to the team and nobody is really satisfied: we'll need to remove one segment, add another, change some rules - and wait another day. Still crap. Again and again. It takes weeks, or it simply never happens, just because it's such a time-consuming process. Now imagine those people sitting in a room with a lightning-fast segmentation tool on a big screen. They create segments as they speak and see results immediately, fine-tune for a couple of minutes, and see it from all required angles. 90 minutes and the job is done.
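The reason 90 minutes is enough is that a segment is nothing more than a named rule over player attributes, so changing a rule and re-running costs seconds. A rough sketch of the loop such a tool runs - the segment names, attributes, and thresholds here are invented for illustration:

```python
def segment_players(players, segments):
    """Assign each player to the first segment whose rule matches.
    `segments` is an ordered mapping of name -> predicate over a player dict."""
    result = {name: [] for name in segments}
    result["unsegmented"] = []
    for player in players:
        for name, rule in segments.items():
            if rule(player):
                result[name].append(player["id"])
                break  # first matching segment wins
        else:
            result["unsegmented"].append(player["id"])
    return result

players = [
    {"id": "a", "spend_usd": 25.0, "sessions": 40},
    {"id": "b", "spend_usd": 0.0, "sessions": 55},
    {"id": "c", "spend_usd": 0.0, "sessions": 2},
]
segments = {
    "whales": lambda p: p["spend_usd"] >= 20,
    "engaged_free": lambda p: p["spend_usd"] == 0 and p["sessions"] >= 30,
}
groups = segment_players(players, segments)
```

Removing a segment, adding one, or tweaking a threshold is a one-line change to `segments` followed by an instant re-run - which is exactly what makes the meeting-room workflow possible.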
Don’t have time to switch?
If you're a smaller studio using a slower solution, the overhead of switching may seem prohibitive. But if you wait more than 2 hours for new data to be processed, I can guarantee you that switching to a faster solution will take even less time than that wait - and your new real-time solution will still have time to spare to deliver your data faster. You'll be glad you trusted in speed.