[In this pointed opinion piece, British games journalist and producer Simon Parkin looks critically at how the media rates and ranks games, suggesting that review scoring acts as more of an impediment than an advantage.]
Last month a British games journalist reviewed Xbox Live Arcade’s Penny Arcade Adventures for two different publications. In one of the magazines the game scored 4/10 while in the other it was awarded 68%. While it’s a discrepancy that caused some to raise their eyebrows, most commentators acknowledge that the difference simply reflects each publication’s own particular use of the numerical review scale.
Two weeks later Microsoft announced their plans to remove games with an average Metacritic score of 65% or lower from their XBLA service. If the decision on whether to keep Penny Arcade Adventures on the service were to be based solely on the judgement of this reviewer, its fate would swing on which review was looked at.
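To make that arithmetic concrete, here is a minimal sketch in Python. It assumes a score out of ten converts linearly to a percentage and that the cutoff is applied to a single review rather than an average; neither assumption reflects how Metacritic or Microsoft actually work, and the function names are hypothetical.

```python
# Hypothetical illustration only: the linear conversion and the function
# names are assumptions, not Metacritic's or Microsoft's actual method.
CUTOFF_PERCENT = 65  # the threshold quoted above: 65% or lower is removed


def to_percent(score: float, scale_max: float) -> float:
    """Convert a score on an arbitrary scale to a percentage (assumes linearity)."""
    return score / scale_max * 100


def survives_cutoff(percent: float, cutoff: float = CUTOFF_PERCENT) -> bool:
    """A game stays on the service only if it scores strictly above the cutoff."""
    return percent > cutoff


magazine_a = to_percent(4, 10)    # the 4/10 review
magazine_b = to_percent(68, 100)  # the 68% review

print(survives_cutoff(magazine_a))  # False
print(survives_cutoff(magazine_b))  # True
```

Same reviewer, same game, opposite outcomes: the result depends entirely on which publication’s scale is fed into the comparison.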
While a game’s Metacritic or GameRankings average score has often been used to dictate the size of a development staff’s bonuses, Microsoft’s decision to use numerical scores as the criterion for removal has elevated the numbers issue to a whole new level of consequence.
Some argue that scores represent different things to different publications, one title’s 4/10 being another’s 68%. Others question why, when scores rarely tally with a game’s commercial success, we should use them to make commercial decisions. Always, the question behind the question is: do review scores actually matter and, if so, what do they even mean?
At a glance, review scores seem to be the most harmless of things. While good critics will bemoan having to reduce a 1000-word piece of incisive criticism to a number on a 10-point scale (or, um, 19-point scale if you’re GameSpot), to the average consumer they offer a useful shorthand reference point with which to compare different titles and inform buying decisions.
But to fully understand the confusing tangle review scores have landed reviewers, consumers and the wider industry in, it’s important to understand their origins. Review scores are a system imported from those publications that review and rate consumer products like televisions and toasters. For example, look at this review of the Canon EOS400D camera. It’s 25 pages long and is as objective a dissection of this model of camera as it is possible to create.
Every aspect of the product is pulled apart, rated and weighed with statistical graphs and comparative data. By the end of the review you know every single detail about the camera and how it empirically compares to its rivals.
It’s a huge exercise in absolute objectivity and, at the end of the gigantic review, the author sums up the good points and the bad points and there is no shadow of a doubt that everything said is ‘factually correct’.
Additionally, there is a place on a defined scale of quality upon which the product sits at that moment in time. It compares to other cameras on the market in defined ways, despite being a complex product. Using the review data it would be possible to arrange all of the digital cameras into a ‘truth’ line of quality, with the ‘best’ camera sitting at 100 and the ‘worst’ at 1, and to place this camera somewhere along that line, thus communicating to a consumer its relative and inherent qualities in a single representative number.
It seems sensible then to believe that such an exercise could be applied to video games to construct a similar scale of quality. Indeed, this is exactly what many video game consumers want from their reviews.
The average reader (even if they don’t know it) is after a completely objective, scientific comparison between game x and game y with data and statistics and, finally, a numerical point on a linear scale by which they can compare, for example, Mass Effect with Rock Band and see which one is empirically better.
Except, of course, video games don’t work in the same way as toasters or digital cameras. Sure, they have mathematical elements and measurable mechanics and it’s possible to compare the number of polygons between this one and that and spin out ten thousand graphs detailing how two specimens compare. But, unlike with the Canon EOS400D, I would have no idea at the end of those 25 pages which game was better or where they would sit on the ‘true’ scale of quality.
Games are experiential and it is impossible to be wholly empirical or objective about them. Game reviewers instead present their experience of the game with, hopefully, lots of reference points and their weight of knowledge behind them. They might make empirical comparisons between game x and game y’s framerates but they will also argue whether they think this in any way affects the experience for better (in the case of bullet hell shooters such as DoDonPachi) or for worse. They have to argue their points because there isn’t data on the overall, indefinable quality of a game.
In the early days of magazine publishing, video game reviewers would often break a game down into all of its constituent parts (graphics, sound, ‘lastability’ etc), score each on a comparative line of quality and then present the average of those scores as the game’s overall measure of quality.
However, this approach presumes that it’s possible to put each of a game’s constituent parts on a definable scale of quality. The truth is that gauging a game’s graphical appeal is a subjective pursuit in the same way that trying to comparatively score a Monet against a Picasso would be. Call of Duty 4’s competent stab at sunset-drenched realism has a certain appeal, but then so does the 8-bit elegance of a Chuckie Egg or Geometry Wars.
Secondly, games are more than the sum of their parts. You could have a visually astounding videogame with a gut-wrenching soundtrack and astute, nuanced voice acting and it could still be terrible to play and vice versa. Aggregating scores from extrapolated game elements tells you nothing anyone would actually want to know about the game.
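The old component-averaging method can be sketched in a few lines of Python. The two games and all of their component scores below are invented purely for illustration, and the function name is hypothetical; the point is only to show how much an average throws away.

```python
# Hypothetical component scores out of 10. The games and every number here
# are invented for illustration and are not taken from any real review.
game_a = {"graphics": 9, "sound": 9, "lastability": 3}
game_b = {"graphics": 5, "sound": 7, "lastability": 9}


def overall_score(components: dict) -> float:
    """Average the component scores into a single overall figure."""
    return sum(components.values()) / len(components)


print(overall_score(game_a))  # 7.0
print(overall_score(game_b))  # 7.0
```

A gorgeous game that is over in an afternoon and a plain one that lasts for months come out with an identical overall score, which says nothing about what either is like to play.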
At this point, defenders of the review score will offer: ‘Why not just review the game on how fun it is, then?’
The problem with wanting a purely objective ‘review’ of a video game is made doubly complicated by the fact that a video game’s purpose is never so narrow nor so easily defined. Consumer goods have a very clearly defined job to do. A digital camera is there to take the best possible photographs; a toaster is there to make toast to whatever specification the consumer requires in the shortest and most efficient timescale. And because their purpose is tight and the measure of the product’s success easily calculable, they lend themselves to ‘review’ and ‘score’ testing.
In contrast, the purpose of a video game is much less narrowly defined. Most game ‘reviewers’ would say that the purpose of a game is to be fun and to entertain. But actually pinning down such abstract concepts is tricky, as there are as many criteria and understandings of what is entertaining and fun as there are humans. Thus, reviewing a video game in the same way as you’d review a digital camera or other similar consumer product is inappropriate or, at the very least, misleading.
All this is not to say that review scores are entirely meaningless or misleading. In fact, they do have a very clearly defined purpose; it’s just that it’s a different purpose to the one that’s widely understood.
Scores have come to represent whether a game overachieves or underachieves relative to the preview hype that the publication generated ahead of its release. As previews in the average video game magazine are so heavily influenced by advertisers (after all, a preview offers no judgment on the quality of a game, so a magazine or website can print riotously positive spin in it and maintain a clear conscience), this weighting of preview coverage sets imbalanced expectations in readers.
Rather than focusing on the most interesting, promising or innovative games coming out, readers are made to get excited about those whose publishers pay the most, be it directly through advertising or indirectly through the general marketing promotion of a title.
This is why, when a game like Koei’s Bladestorm gets 8/10 in some publications, readerships become incredulous. Their expectations for the game had not been set that high because they were being fed hype of a different flavour.
Then, conversely, when Metal Gear Solid 4 scored 8/10 on Eurogamer last week, the readership revolted the other way, because that was far below their expectations. Remember: in both cases nobody but the reviewer had played the game at the point the reviews came out. Why, then, were people so quick to damn each score (for opposing reasons) if they had no hands-on experience?
Scores then become a reference to a game’s preceding hype. An 8/10 for a game that was hugely hyped to hobbyist gamers is a punch in the stomach for excited fans (see the anguish exhibited in the MGS4 comments thread). Conversely, an 8/10 for a game nobody cares about is viewed as gross over-generosity.
And that is why video game review scores are pointless: they often answer a pertinent question that nobody realised they were asking.