The Legend of Zelda: The Silver Lining

I never thought I’d actually be posting anything in this WordPress blog, since I use my account almost exclusively for commenting on other people’s posts. But today I’ve got something to say that I’d like to say somewhere, and I don’t think I’d be able to say it anywhere else. I was initially intending to register at IGN and submit this as a blog post there, but IGN apparently doesn’t allow blog posts from new members until they’ve been registered for two weeks. And in two weeks, this story will be old news.

Last week, Nintendo announced that the next Legend of Zelda title for Wii U had been delayed from 2015 to 2016. Kotaku reminded us that this is nothing out of the ordinary for major releases in the series, but many gaming blogs still expressed disappointment. After all these years, why can’t Nintendo learn to release Zelda games on time?

(by Jesse Oldershaw)

In this post I’m going to argue that when Nintendo decides Zelda games need extra development time, we should view this as a good thing, because this extra time has been beneficial to games that received it in the past. I’ve suspected for over a decade that there’s a relationship between how long a Zelda game is delayed and how good it is when it’s eventually released, but it’s only in the past few days that I thought of analyzing this relationship statistically. The analysis was done by my friend Emily Willoughby, a fellow Zelda fan who’s knowledgeable about statistics. But before I present the analysis itself, it’s important to explain the methods and data I used for it.

First, the most important point: this analysis is limited to console releases. The development cycle has generally been different for games on handheld platforms, especially for the three of them (Oracle of Ages, Oracle of Seasons and The Minish Cap) that were developed by Capcom rather than Nintendo. Although I haven’t tested this, I suspect that the correlation between delay length and eventual quality does not exist for handheld games.

Measuring quality

How do you measure the quality of a game? This is especially difficult for a series like the Legend of Zelda, where nearly every game has a devoted group of fans who consider it the best of the series. It is fairly widely agreed that Ocarina of Time is the most critically acclaimed game of the series, and that Zelda II: The Adventure of Link is the least critically acclaimed. (That is, excluding the universally loathed CD-i games.) But that’s pretty much where the consensus ends. In order to run the analysis, though, I need a way to rank the games objectively.

Nearly any ranking I could come up with is going to be controversial among fans. What’s most important is to make sure I’m basing it on as objective a measurement as possible, so that even people who disagree with my results will understand how I arrived at them. When it comes to ranking video games, the most objective method available is to base the ranks on the games’ review scores as collected at Gamerankings. For every game of the series, the website has listed its review score from each of the major gaming publications, as well as its average score from all reviews.

I’ve applied one important transformation to the Gamerankings data. Several of the older games in my analysis were the subject of retrospective reviews upwards of a decade after they came out, and some of these later reviews scored the games lower because they seem dated by modern standards. This tends to skew the averages of older games towards lower scores, which gets in the way of what I want to measure. Even if the difference in technology makes a modern game seem more impressive than a game from the 1990s, that doesn’t tell us anything about how much each of them benefitted or not from its delay (or lack of delay). To avoid this bias, I’ve based my average score for each game only on reviews from within six years after the game’s release.
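The six-year filter can be sketched in a few lines of Python. This is only an illustration of the idea, not the exact procedure used for the analysis: the `Review` record, the `windowed_average` function, and the sample reviews are all hypothetical, the scores are made-up placeholders rather than real Gamerankings data, and treating "within six years" as an inclusive difference in calendar years is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Review:
    outlet: str   # hypothetical publication name
    year: int     # year the review was published
    score: float  # score normalized to a 0-100 scale


def windowed_average(reviews, release_year, window_years=6):
    """Average only the reviews published within `window_years` of release.

    Returns None when no review falls inside the window.
    """
    kept = [r.score for r in reviews
            if 0 <= r.year - release_year <= window_years]
    if not kept:
        return None
    return sum(kept) / len(kept)


# Made-up placeholder reviews for a fictional game released in 2000.
sample = [
    Review("Outlet A", 2000, 95.0),
    Review("Outlet B", 2001, 90.0),
    Review("Outlet C", 2009, 60.0),  # retrospective review, outside the window
]

raw_average = sum(r.score for r in sample) / len(sample)
filtered_average = windowed_average(sample, release_year=2000)
print(round(raw_average, 2), filtered_average)
```

In this toy case the late review drags the raw average down, while the windowed average ignores it. A game with no reviews inside the window gets `None` instead of a score.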

The most important effect of this transformation was to raise the score of A Link to the Past and (to a lesser extent) the score of Majora’s Mask. Without the transformation, A Link to the Past suffered more than any other game from having its score depressed by modern reviews. As for Majora’s Mask, its average score without the transformation was depressed by a 2009 review that gave it a score equivalent to 60%, about ten points lower than the lowest score it received from any other publication. The 60% score was an obvious outlier, so it was beneficial to use a method that excluded it.

(Photo by FPStanley)

There’s one downside to this method, which is that it means I can’t include the two NES Zelda games in the analysis. The reason for this is simple: Gamerankings does not include any reviews of either game from less than six years after their release. Even though I could potentially find reviews of these games in gaming magazines from the 1980s, an analysis like this depends on having consistent criteria across the board. If I were to use reviews of the NES games that weren’t included at Gamerankings, I’d have to do that for all of the other games in the series as well, and it would become more or less arbitrary which reviews I included or left out.

Measuring delay length

I initially thought measuring the amount of time that games were delayed would be easy, but in some cases it turned out to be harder than I anticipated. I’ll run through it game by game.

1: A Link to the Past – A Link to the Past was originally developed as a NES game, until development was eventually shifted to the SNES. However, I haven’t been able to find any documentation showing when the game would’ve been released if it had stayed on the NES. The one delay that’s clearly documented is mentioned in this article. Shigeru Miyamoto initially intended for its (Japanese) release to be in March 1991, but it was pushed back to November 1991, which amounts to a delay of 8 months.

2: Ocarina of Time – This was the toughest delay to calculate. The game had one very well-documented delay from fall of 1997 to fall of 1998, so there is no dispute that it was delayed at least a year. But some fans of the series have given its delay length as 2 years. What gives?

When footage of the game (known at the time as “Zelda 64”) was first shown at Spaceworld 1995, Nintendo heavily implied that it either would be a N64 launch title, or would be released within a few months after the console came out. As far as I know, Nintendo did not give it an actual release date this early on, but at that point even their lengthiest estimates of development time would not have had it coming out later than spring of 1997. Thus, going with the most conservative delay estimate that incorporates Nintendo’s predictions from 1995, Ocarina of Time was delayed a year and a half.

3: Majora’s Mask – Majora’s Mask was released on time.

4: The Wind Waker – The American release of Wind Waker was originally scheduled for November or December 2002, but was delayed to March 2003. This amounts to a delay of three or four months; I’ve used 3.5 months for my calculation.

5: Twilight Princess – This delay was the easiest to calculate. Twilight Princess was originally scheduled to be released in November 2005, but was postponed to 2006. The Wii version was eventually released in November 2006, almost exactly a year later than originally planned.

6: Skyward Sword – Although a lot of people assume that Skyward Sword was delayed, according to the Kotaku article this assumption is based on a misunderstanding of Nintendo’s plan for the game’s development. Nintendo evidently never planned to release the game earlier than 2011, so for the purpose of this analysis it was released on time.
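The delay figures above can be reproduced with simple month arithmetic. The sketch below is just one way to do it, and `months_between` is a hypothetical helper name. It uses only the dates given in the list, averages the two possible Wind Waker start months, and takes the Ocarina of Time figure directly from the year-and-a-half estimate, since only seasons are known for that game.

```python
def months_between(planned, actual):
    """Whole months from a planned (year, month) to an actual (year, month)."""
    (planned_year, planned_month), (actual_year, actual_month) = planned, actual
    return (actual_year - planned_year) * 12 + (actual_month - planned_month)


# Wind Waker was planned for November or December 2002, so average both cases.
wind_waker = (months_between((2002, 11), (2003, 3))
              + months_between((2002, 12), (2003, 3))) / 2

delays = {
    "A Link to the Past": months_between((1991, 3), (1991, 11)),
    "Ocarina of Time": 1.5 * 12,  # season-level estimate, no exact months
    "Majora's Mask": 0,           # released on time
    "The Wind Waker": wind_waker,
    "Twilight Princess": months_between((2005, 11), (2006, 11)),
    "Skyward Sword": 0,           # treated as released on time
}

for game, months in delays.items():
    print(f"{game}: {months} months")
```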

The data

Based on the above parameters, this is the data I’ve collected. Delay on this table is measured in months, and the score for Twilight Princess is the average of the scores for the Gamecube and Wii versions.

And here’s the same data in a graph.

Obviously, a lot of people are going to disagree with Majora’s Mask being the lowest-ranked Zelda game on the list. This is one of the reasons why it’s important for readers to understand the methods I used, and that I calculated the average scores using what I think is the most objective method possible. In fact, the MM score I’ve used in my analysis is slightly higher than its raw average score at Gamerankings, because their average includes the retrospective reviews that I omitted. It’s also important to remember that there are no bad games on this list — this analysis is measuring the difference between great games and extraordinary games. Even with its score lowered by the retrospective reviews, Majora’s Mask is still rated higher than all but six N64 games. On the other hand, Ocarina of Time is the second-most-highly rated video game ever released, surpassed only by Super Mario Galaxy.

Emily has calculated the correlation between delay length and eventual quality (as measured by review scores), and found a correlation of 0.92, or 92%. Squaring the correlation gives the coefficient of determination, meaning that 84.7% of the variance in game quality can be explained by variance in delay. The p-value of the correlation was 0.009. The p-value represents how likely it is that random chance alone could produce a correlation this strong; in this case, the odds of that are less than one in a hundred.
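For readers who want to check these figures, here is a rough sketch of how a p-value can be recovered from a correlation and a sample size. It is not Emily’s actual procedure. It assumes all six games listed above went into the correlation, that the test was the usual two-tailed t-test for a Pearson correlation, and that the rounded value 0.92 is close enough to the real one. The function names are hypothetical, and the t-distribution is integrated numerically so that only the standard library is needed.

```python
import math


def t_pdf(t, df):
    """Density of Student's t-distribution with `df` degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + t * t / df) ** (-(df + 1) / 2)


def two_tailed_p(r, n, steps=10_000):
    """Two-tailed p-value for a Pearson correlation r from n data points.

    Assumes |r| < 1 and an even number of Simpson steps.
    """
    df = n - 2
    t = abs(r) * math.sqrt(df / (1 - r * r))
    # Simpson's rule for the area under the density between 0 and t.
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 2 * (0.5 - area)


r, n = 0.92, 6  # rounded correlation; n = 6 games is an assumption
r_squared = r * r
p_value = two_tailed_p(r, n)
print(f"r^2 = {r_squared:.3f}, p = {p_value:.4f}")
```

Under those assumptions the p-value lands close to the reported 0.009. The squared correlation comes out slightly under 84.7%, presumably because the reported figure was computed from the unrounded correlation.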

The equation y=92.9 + 0.24*x describes the statistical relationship between a Zelda game’s delay length and its approximate score when it’s released, where x is the delay time in months, and y is the eventual score. If the Wii U Zelda is delayed for 12 months, then its eventual score will be somewhere in the range of 92.9 + 0.24*12, or 95.78%. On the other hand, if it had not been delayed, we could only expect its score to be around 92.9%.
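The same equation is easy to turn into a small function for trying other delay lengths. The sketch below uses only the intercept and slope quoted above; `predicted_score` is a name chosen for illustration, and the output is a point estimate with no error margin attached.

```python
INTERCEPT = 92.9  # expected score with no delay
SLOPE = 0.24      # score points gained per month of delay


def predicted_score(delay_months):
    """Point estimate of the review score for a given delay in months."""
    return INTERCEPT + SLOPE * delay_months


# Delay lengths (in months) that appear in this post.
for delay in (0, 3.5, 8, 12, 18):
    print(f"{delay:>4} months -> {predicted_score(delay):.2f}")
```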

Demonstrating this relationship probably will not come as much of a surprise to most long-term Zelda fans, but I think it’s good to have some empirical confirmation of what we already suspected was the case: delays to Zelda games are worth it, because in the end they make for better games. Or as Shigeru Miyamoto put it, “A delayed game is eventually good, but a rushed game is forever bad.”

Click here for the complete statistical analysis.