A few years ago, I built a shed. I could have picked up a prefab kit from Lowe’s or Home Depot, except that we live in the city, and our yard, while large by Seattle standards, limited placement to a single spot beneath our deck if we wanted to maximize land usage. This meant that the shed had to be built to very specific dimensions to fit this space, while also being large enough to store a lawnmower, rakes, some garden tools, and various other rarely-used stuff. Since it wasn’t my first carpentry project, I was able to do a tiny amount of research on roofing materials, then draw up some plans and get right to work. The process was intensive, but straightforward. The only unique bits of this project were the dimensions and the fact that I wanted the shed to be knock-down, as building in place would have been difficult due to the space. It was easier to build in our garage, in such a fashion that the four walls and roof could be taken apart, repositioned on site, and then reassembled and finished.
In some ways, many coding projects start out like the shed. There’s some adjustment to a tried-and-true pattern that differentiates the project, a plan of some specificity is drawn up, certain details are hand-waved over due to time or eagerness to begin work, and the developers dive in. It’s usually those waved-over details that result in the unforeseen overruns and bits of complexity. The Agile/Scrum process is one way of approaching these inevitable surprises, but with Agile it’s easy to fall into the trap of always finding new stuff to do, and early on in a project the goal of finishing every Sprint with something “shippable” depends on a very loose definition of that word.
Over the past few weeks, I’ve been driving to complete the Circuitboard Game Mode of our first title. While most of the main logic and display functionality is complete, I migrated a bunch of hardcoded values to data-driven values, which are read out of several XML files and resource files. This led down a path of writing code to load and parse these data files.
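To make the hardcoded-to-data-driven step concrete, here is a minimal sketch in Python (not the language or format of the actual game). The element names, the default values, and the `load_mode_settings` helper are all hypothetical, chosen only for illustration. It shows settings being read from XML with a fallback to the old hardcoded defaults when a value is missing.

```python
import xml.etree.ElementTree as ET

# Hypothetical defaults standing in for values that used to be hardcoded.
DEFAULTS = {"grid_width": 8, "grid_height": 8, "time_limit": 90}


def load_mode_settings(xml_text):
    """Parse integer settings from XML text, keeping defaults for missing elements."""
    settings = dict(DEFAULTS)
    root = ET.fromstring(xml_text)
    for name in DEFAULTS:
        node = root.find(name)
        if node is not None and node.text is not None:
            settings[name] = int(node.text.strip())
    return settings


# Hypothetical data file contents; grid_height is deliberately left out.
SAMPLE = """<mode name="circuitboard">
  <grid_width>10</grid_width>
  <time_limit>120</time_limit>
</mode>"""

settings = load_mode_settings(SAMPLE)
print(settings)
```

Keeping the defaults in one table means a missing or partial data file degrades to the old behavior rather than crashing, which is handy while the file format is still changing.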
One of the challenges of software (or hardware) is creating test situations or prototypes that correctly represent the final intended functionality. It’s easy to fool yourself into believing that a cobbled-together bit of code will be good enough once it’s on users’ machines or in customers’ hands. At Amazon, we had large prototype boards with multiple cameras on them that were supposed to be representative of the final four corner cameras on the Fire Phone, but we discovered that things like optical distortion, field of view, and alignment were critical factors in how well our head-finding algorithms worked. As Jeff pointed out on stage at the announcement, there were also many difficult problems to solve with how well the cameras worked in everyday environments, and we couldn’t detect those issues with a large board in a lab setting; it took an actual handheld device that could be carried around.
Back on my current project, following the path of loading the data files led to some peculiarities of how my platform deals with file systems, and how I intend to deliver episodic content. The content, of course, will be downloaded on demand once it’s available, but I understand that if I have a million users, making the content available only at the moment it’s meant to be played will result in a massive request spike, even if the content is distributed over many CDNs (Content Delivery Networks). Thus, I made the decision to make the content available to be preloaded. That put me on the path of setting up a server system where the game can check to see what content is available ahead of time, download it in an encrypted form (using AES), and then request a relatively tiny key to unlock the content at the moment it’s supposed to be playable.
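The preload-then-unlock flow described above can be sketched roughly as follows. This is an illustrative Python model, not the real client or server: `FakeServer`, `Episode`, `precache`, and `try_unlock` are hypothetical names, the server is an in-memory stand-in, and no actual AES or networking is involved. It only shows the states an episode moves through.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    NOT_DOWNLOADED = 1
    LOCKED = 2    # encrypted bytes are cached locally, but there is no key yet
    UNLOCKED = 3  # key received; content may now be decrypted and played


@dataclass
class Episode:
    episode_id: str
    state: State = State.NOT_DOWNLOADED
    encrypted: bytes = b""
    key: bytes = b""


class FakeServer:
    """In-memory stand-in for the content server (hypothetical interface)."""

    def __init__(self, blobs, keys):
        self.blobs = blobs     # episode_id -> encrypted bytes
        self.keys = keys       # episode_id -> key bytes
        self.released = set()  # episodes whose keys may be handed out

    def list_available(self):
        return sorted(self.blobs)

    def download(self, episode_id):
        return self.blobs[episode_id]

    def request_key(self, episode_id):
        # The small key is only released once the episode is playable.
        return self.keys[episode_id] if episode_id in self.released else None


def precache(server, library):
    """Download every available episode that is not cached yet."""
    for episode_id in server.list_available():
        episode = library.setdefault(episode_id, Episode(episode_id))
        if episode.state is State.NOT_DOWNLOADED:
            episode.encrypted = server.download(episode_id)
            episode.state = State.LOCKED


def try_unlock(server, episode):
    """Ask for the key; return True once the episode is unlocked."""
    if episode.state is State.LOCKED:
        key = server.request_key(episode.episode_id)
        if key is not None:
            episode.key = key
            episode.state = State.UNLOCKED
    return episode.state is State.UNLOCKED


server = FakeServer({"ep2": b"\x00\x01\x02"}, {"ep2": b"k" * 16})
library = {}
precache(server, library)
unlocked_early = try_unlock(server, library["ep2"])   # before release
server.released.add("ep2")
unlocked_on_time = try_unlock(server, library["ep2"])  # after release
print(unlocked_early, unlocked_on_time)
```

The point of the split is that the large download can be spread over days, while the release-time request is only a few bytes per user.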
This led to a different discovery, which was that decrypting content of any reasonable size takes a few seconds, during which my game was unresponsive. That’s unacceptable, so now I had to look into spinning off the decryption process onto a different thread. I’ve written multithreaded code before (my entire career, actually, since the first control systems on which I worked were multithreaded). If you’re not familiar with multithreading, the concept is that you spin up a background task to do some work, while the main task continues to function and provide responsiveness (otherwise it appears that your system has locked up). Pretty much every computer that people use on a regular basis these days, from your Mac to your smartphone, is multithreaded, rapidly switching between hundreds of tasks to provide the illusion that everything is running smoothly all the time. Multithreading is rife with issues, from synchronizing the tasks, to sharing memory, to threads that deadlock or livelock or become unresponsive in some other fashion, so this tradeoff for responsiveness added a bit of potential complexity. Still, with two days of investigation and some test applications, I was able to get this decryption functionality running happily in a new thread, waiting idly by until it was needed.
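As a rough illustration of the background-task idea, here is a small Python sketch. It is not the game’s code: `slow_decrypt` is a hypothetical stand-in that uses a repeating-key XOR and an artificial delay in place of real AES, and the “main loop” just counts ticks where a game would draw frames and handle input.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_decrypt(data, key):
    """Stand-in for AES decryption: repeating-key XOR plus an artificial delay."""
    time.sleep(0.2)  # simulates the seconds a real decrypt might take
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def run_with_responsive_loop(data, key, tick=0.02):
    """Decrypt on a worker thread while the main loop keeps ticking."""
    frames = 0
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(slow_decrypt, data, key)
        while not future.done():
            frames += 1       # a real game would update and render here
            time.sleep(tick)
        return future.result(), frames


key = b"demo-key"
plaintext = b"<mode name='circuitboard'/>"
# XOR is its own inverse, so "encrypting" uses the same operation.
ciphertext = bytes(b ^ key[i % len(key)] for i, b in enumerate(plaintext))

result, frames = run_with_responsive_loop(ciphertext, key)
print(result == plaintext, frames)
```

Handing the result back through a future keeps the shared state to a single value, which sidesteps most of the synchronization hazards mentioned above. Whether CPU-bound decryption truly runs in parallel with the main loop depends on the runtime and the crypto library in use.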
So now, to ensure that one mode of my game works, I have a pre-caching system, a validity check, a key request, and a background decryption of content that can finally be read to actually run that game mode. If I had just spent a week hacking together a workaround to read the XML files from some temporary location, I wouldn’t have discovered the need for all these other systems, nor the potential negative impact of the decryption slowdown, until much later in production. Those are the kinds of late “gotchas” that lead to late-in-the-schedule, customer-frustrating ship-date slips. While I had an investigation into this work scheduled for later on, by following the code now, I unblocked myself for a large set of work that’s still to come, and will hopefully see productivity improvements in getting the remaining game modes up and running.
My advice with this post is to follow your code to its conclusion as soon as you can. Go down the deep path, or as a former manager used to call it, create the stovepipe from coals to smoke that guarantees your entire system works end-to-end as soon as possible. This will point out the gotchas and bring potential issues to light so that you can deal with and account for them sooner, rather than be surprised by them later.