Eurogamer recently announced that it was dropping review scores – pointing to, among other things, the increasingly fluid nature of development, the problems associated with using a single numerical score to summarise a series of words (and the ensuing arguments about that score), and the nefarious influence of review-aggregating sites like Metacritic on game development and the industry in general. I think it’s a good move – although for old lags like us who have the benefit of looking at games long after they’re released, with a small audience who hopefully come for the words rather than the numbers, scores are harmless enough as a quick summary of enjoyment and quality.
Anyway, the whole thing got me thinking about the many years I’ve spent reading reviews and the different systems various publications used. Without wanting to produce an exhaustive list or summary, I thought I’d go back and take a look at some of the magazines I grew up with, aided by a number of archived scans (and thanks, incidentally, to the folk who spend their time providing such a resource).
In the mid-to-late 80s – the era, in our household at least, of the Amstrad CPC – Amstrad Computer User was my first regular source of games coverage. With the CPC being a computer, though, it obviously wasn’t all about reviews of the latest games, and a significant proportion of the magazine was devoted to hardware, programming and the rest of it. Whether that contributed to the rather conservative and grown-up feel ACU had, at least for a time, I don’t know, although they did often recognise the front-cover appeal of gaming ahead of, say, a feature about accounting software.
Anyway, the reviews had a rather interesting approach – much of the descriptive element was covered in a main body of text, and then the opinions of three reviewers would appear below, each giving a score out of 20. There was no attempt at discussion, or at averaging the scores into some kind of ‘overall verdict’ – as if that would somehow be definitive – and no boastful preamble about how their scoring system was ‘the one to be trusted’. Each reviewer had limited space for their thoughts, and it was obvious at times that they hadn’t all had a chance to battle through to a game’s later levels, but looking back now, it seems rather ahead of its time – some information, some opinion, a rough score, and the opportunity for the player to find out about the rest for themselves.
Sadly, a couple of redesigns later, the magazine had drifted into more familiar territory, with a single reviewer awarding percentage ratings for (groan) ‘Graffix’, ‘Sonix’ and ‘Playability’. Bizarrely, the overall rating was not given as a percentage but as a cartoon picture.
By this point, we’d also started getting rival publication Amstrad Action (an early – in fact, the earliest – Future Publishing effort), a magazine that also referred to ‘Graphics’ and ‘Sonics’ but did at least spell them correctly. The other factors in the overall percentage score were ‘Grab Factor’ (the game’s immediate appeal) and ‘Staying Power’ (how long you were likely to keep playing). The main reviewer was named and awarded the score, but there was usually room for a short second-opinion boxout from another writer. Scores for individual elements, topped off with an overall percentage, were fast becoming the staple format.
Future’s ST magazine, ST Format, also identified immediate impact and lasting impression as key factors to be quantified, albeit with a slightly vaguer mark out of ten, along with a mark for ‘Intelligence’, which, as the introduction to the review page helpfully explained, meant “How clever do you need to be to tackle the game? Puzzle and strategy games should be tough; few shoot ‘em ups are mentally taxing.” Presumably this didn’t factor into the final score – and, as more and more elements were added into the mix, alongside unlikely claims of ‘accuracy’, review scores occasionally attracted the attention of the mathematically minded, whose correspondence appeared in the letters pages demanding to know exactly what calculations were used in coming up with the overall rating.
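For what it’s worth, the suspicion was usually that some kind of weighted average lurked behind the overall figure. As a purely hypothetical sketch – no magazine, as far as I know, ever published a real formula, and the category names, weights and scores below are entirely invented – the sums those readers were asking about might have looked something like this, in Python:

    # A purely hypothetical sketch of the weighted-average calculation readers
    # imagined might sit behind an 'overall' percentage. Category names, weights
    # and scores are all invented for illustration.
    sub_scores = {"graphics": 85, "sound": 72, "grab_factor": 90, "staying_power": 60}

    # Invented weights - the letters-page complaint was precisely that nobody
    # knew what these were, or whether they were applied consistently.
    weights = {"graphics": 0.2, "sound": 0.1, "grab_factor": 0.3, "staying_power": 0.4}

    overall = sum(score * weights[name] for name, score in sub_scores.items())
    print(f"Overall: {overall:.0f}%")  # 75% with these made-up numbers

Whether anything so systematic ever happened, rather than a reviewer simply plucking a number that felt about right, is another question entirely.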
During this time I was also introduced to the concept of a multi-format magazine, and Advanced Computer Entertainment (ACE, another Future effort) was briefly a regular in my home after it forged a reputation for having the most reliable reviews. Here, the instant and lasting appeal of the game was taken out of the scoring system altogether and plotted on a separate line graph predicting how interested you’d be in the game after an hour, a day, a week, a month and a year. Again, the amount of thinking required (the ‘IQ Factor’) was given a numerical score. An overall rating took the graph and the numerical ratings (including something called a ‘Fun Factor’ – as ex-PC Zone writer Steve Hill once said, “Fun is hardly a quantifiable constant”) into account, and was a mark out of 1000, which certainly had a whiff of Spinal Tap about it. My Dad became so incredibly annoyed by the rating given in ACE’s review of French adventure game B.A.T that the magazine was never seen in the house again.
I’d enjoyed having a games-only mag to read, though, and a Christmas gift of an Atari Lynx meant I dabbled with the console kids for a time, becoming a semi-regular reader of Computer and Video Games (CVG). The partisan Spectrum/Amiga baiting I was used to was replaced with a calmer, more egalitarian approach, albeit one motivated by a desire not to piss off readers – the fairly transparent pandering included the writers being asked to name their favourite console, with each naming a different one. Still, at least it meant someone said they liked the Lynx, even if they were lying.
CVG reviews circa 1993 were generally short descriptive paragraphs plopped around multiple screenshots and boxouts, including ones in which members of the reviewing team gave their opinion on the game in question. Ratings were presented a little like the output of a machine tasked with calculating a title’s overall merits – besides percentage ratings for graphics, sound, (dun dun duh) gameplay and value, there were also marks out of ten ascribed to ‘strategy’, ‘skill’, ‘action’ and ‘reflexes’.
Soon enough, we got our first PC, and I graduated to PC Zone, probably my favourite magazine of all time. As far as ratings went, Zone settled for a single overall percentage, which gradually became the norm as publications came to see complicated systems as largely meaningless. That still didn’t stop people quibbling about the odd 1% here and there, though, especially at the top end. After some correspondence from a reader about old games lingering too long in the Zone Buyer’s Guide, they trialled a system in which older games were rescored over time, measured against the new genre leaders. It was quietly abandoned a few issues later.
As a bona fide console owner, and frequent train traveller, in the early-to-mid 00s, I found myself an occasional reader of Edge, and for all of its po-faced ridiculousness, they never made too much of a meal of scores, nor used a complicated rating system. However much I despised their snooty nonsense, even I couldn’t suppress a grin when they updated the explanation of their scoring system to read, “10 = ten, 9 = nine…” and so on. Surely, dear reader, you don’t need us to explain the concept of a mark out of ten to you?
As many comments sections show, some people do need the explanation, the justification, the policy, the ‘calculation’, and for Eurogamer the problems probably started to outweigh the benefits. While music and film critics have also used – and continue to use – scoring systems (although some don’t), gaming stands almost alone in the range and complexity of its systems, and in the importance it seems to ascribe to a numerical value plopped at the end of a review. Unless we can get to a place where scores are seen as nothing more than a summary – and a flawed, often inconsistent one at that – they may well do more harm than good. Perhaps taking them out of the equation for a while is a necessary part of getting to that place.
If you’re now feeling nostalgic for magazines of old, check out some of the following:
Interesting read, thanks for sharing your thoughts!
Personally I think scores are an important part of a review, because they give you a very fast overview of how much the reviewer liked the game, and they can be really useful for filtering reviews into bashing, analysing and praising. And to me the actual problem isn’t the scoring: a review can be totally subjective and slanted without one, and even if companies can’t buy a score any more, they can still buy propaganda.
Still, the scoring can be taken to an extreme. Besides things like over-analysing every aspect of the game (that Intelligence thing sounds really arbitrary) or being outright ridiculous (like a per-mille scoring system), I think those long-term graphs are rather silly: while it’s already hard to judge a game by giving it a single score, I can’t imagine how you would determine the point where you lose interest in a game – especially if it’s in the region of months. Did they actually play the games that long? Sounds more like reading tea leaves to me.
Some German magazines took this to an extreme, though: they published charts plotting motivation against time… on an hourly scale, so you could tell in which minute they suddenly lost interest.
February 15, 2015 @ 6:21 pm
Those graphs in ACE were thoroughly ridiculous – I suppose the word ‘predicted’ gets them off the hook slightly, but, as you say, unless you actually play the game for that long, who knows? No-one would reasonably expect that, either, so why include it?
I think the more scientific you try to be, the dafter the whole thing looks. I do have to laugh at the idea of recording motivation levels somehow – *speaks into dictaphone* “Hour 3 – motivation, 7/10”
At the time, though, most people (myself included) bought into the importance of scores, and embraced a range of ludicrous systems without question!
February 16, 2015 @ 2:28 pm