Microsoft subtweets aside, we’re sticking to our data until they prove us wrong.

When you try to extrapolate console usage patterns from randomly sampled server data, you're always looking for ways to test that data against reality. Microsoft shared a small slice of that reality this week, and it has us re-examining at least part of our recent, wide-ranging report on Xbox Live usage.
On Wednesday evening, Xbox Chief Marketing Officer Mike Nichols shared via Twitter that "roughly 50 percent of Xbox One owners have played" a backward compatible Xbox 360 game, putting in "over 508 million hrs" with the feature since it launched in late 2015.

Those numbers seem to contrast pretty heavily with what we see in our randomly sampled Xbox Live server data. Assuming about 28 million total Xbox One owners for back-of-the-envelope math, Nichols' number averages out to about 18 total hours of backward compatible play per owner, or about one hour per month over the roughly 18 months the feature has been available. In our approximately 4.5 month sample, we could expect that to translate to about 4.5 hours of backward compatible play per Xbox Live user. Instead, we measured a 25.9 minute average during that period.
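For readers who want to check that back-of-the-envelope math, here is a minimal Python sketch. It uses only the figures quoted above, and it assumes the feature has been live for roughly 18 months and that play is spread evenly across owners and months. The function and variable names are ours, for illustration only.

```python
def expected_hours_in_window(total_hours, owners, months_live, window_months):
    """Spread total hours evenly over owners and months, then scale to a window."""
    hours_per_owner = total_hours / owners
    hours_per_owner_per_month = hours_per_owner / months_live
    return hours_per_owner_per_month * window_months


# Figures quoted in the article; the even-spread model is a simplifying assumption.
TOTAL_BC_HOURS = 508e6    # "over 508 million hrs" from Nichols' tweet
OWNERS = 28e6             # assumed total Xbox One owners
MONTHS_LIVE = 18          # assumption: roughly 18 months since the late 2015 launch
SAMPLE_MONTHS = 4.5       # approximate length of our sample
MEASURED_MINUTES = 25.9   # average backward compatible play we measured

hours_per_owner = TOTAL_BC_HOURS / OWNERS
expected_hours = expected_hours_in_window(
    TOTAL_BC_HOURS, OWNERS, MONTHS_LIVE, SAMPLE_MONTHS
)
measured_hours = MEASURED_MINUTES / 60
gap_ratio = expected_hours / measured_hours

print(f"Hours per owner since launch: {hours_per_owner:.1f}")
print(f"Expected hours in sample window: {expected_hours:.1f}")
print(f"Measured hours in sample window: {measured_hours:.2f}")
print(f"Expected / measured: {gap_ratio:.1f}x")
```

Under these assumptions, the expected figure comes out roughly ten times higher than what we measured.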

While we didn't measure how many users have played at least one backward compatible game (and issues with how we store the data prevent us from getting at that number directly), almost every game in our sample showed up in the usage of less than 1 in 1,000 active users—often much less. That seems hard to reconcile with a world in which 50 percent of Xbox One users have put in some backward compatible time.

There are some timing-based explanations that can account for some of this difference. For one, if a player used backward compatibility only outside of our sample range (Sept. 26, 2016 through Feb. 12, 2017), they would be included in Nichols' 50 percent but not in our own players-per-game ratios. It's possible (though not all that likely) that backward compatibility was just abnormally unpopular during the few months we sampled.

There's also evidence that backward compatibility use has picked up in recent months, after our sample concluded. Back in November (right in the middle of our sample), Microsoft said Xbox One owners had only put 210 million hours into backward compatible games. Assuming about 25 million Xbox One owners at the time, that works out to 8.4 hours per owner over roughly 12 months, or about 0.7 hours per owner per month and 3.15 hours in a 4.5 month period.
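The comparison between the November figure and the more recent one can be sketched the same way. This is a rough illustration, not a precise model. The 12-month span for the November figure is inferred from the 0.7 hours per month result, the 18-month span for the latest figure is an approximation, and the helper name is ours.

```python
def hours_per_owner_per_month(total_hours, owners, months):
    """Average monthly play per owner, assuming an even spread over time."""
    return total_hours / owners / months


SAMPLE_MONTHS = 4.5

# November figures: 210 million hours, about 25 million owners, ~12 months (inferred).
november_rate = hours_per_owner_per_month(210e6, 25e6, 12)
november_window_hours = november_rate * SAMPLE_MONTHS

# Latest figures: 508 million hours, about 28 million owners, ~18 months (approximate).
latest_rate = hours_per_owner_per_month(508e6, 28e6, 18)

print(f"November rate: {november_rate:.2f} hours per owner per month")
print(f"November rate over the sample window: {november_window_hours:.2f} hours")
print(f"Latest rate: {latest_rate:.2f} hours per owner per month")
```

The higher cumulative rate in the latest figures is consistent with usage having risen after November, though averages like these can't show when or how sharply.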

The addition of Call of Duty: Black Ops 2 to the backward compatible lineup in April (after our sample) would go a long way to explain that increase in total and average usage. (That launch also helps explain Phil Spencer's tweeted statement that "one or two BC games [appear] in our daily top played games.") The July addition of Red Dead Redemption to backward compatibility could have also caused a brief surge in interest in the feature that largely predated our sample. Overall, it's hard to say how our 4.5 month chunk of usage might line up with 18 months of what could be temporary spikes in backward compatible interest.

Re-evaluating

Those potential explanations only go so far, though. As it stands, there seems to be a large gulf between the backward compatibility numbers shown in our data and the ones being reported by Microsoft.

It's possible this discrepancy is in the usage data we received from XboxAPI itself. Perhaps the service failed to capture all backward compatible usage or mischaracterized some of that usage as direct play on an Xbox 360. It's unclear whether direct usage numbers for Xbox 360 and Xbox One games on their original hardware would similarly be undercounted in the API.

In either case, we have no reason to believe that the relative performance of games in the sample would be significantly impacted by this kind of undercounting issue (all games would probably be undercounted at similar rates, in aggregate). The "ownership" data in our report, which is taken from a completely separate sample, also wouldn't necessarily be affected by this kind of problem. We're going over our coding and data compilation methodology with a fine-toothed comb to determine if there are any basic counting errors on our end.

In the end, though, potential errors like this are why we spent over 1,000 words of our analysis piece laying out the limitations and caveats associated with our sampling method. In short: we don't know every Gamertag, we miss offline users, we only sample a portion of the Gamertags we do have, and used and borrowed discs are not counted as sales. For backward compatibility, at least, it seems we need to add a new caveat about potential incompleteness or errors in the API data itself.

As we said in the piece, "all of these sources of potential error make us uncomfortable using our data to directly extrapolate total sales or usage numbers for the entire Xbox playerbase. The numbers and ratios presented in this report should only be considered representative, sampled estimates of the online portion of the Xbox community, which could be significantly different from the total community of Xbox owners."

We also said in our initial piece that "the backward compatibility usage numbers are so low that we almost doubted the reliability of our data." Apparently that's a doubt we should have taken to heart a bit more, at least as far as this portion of the data is concerned.

Prove us wrong

Microsoft Xbox Corporate Vice President Mike Ybarra subtweeted our data Tuesday by saying that "scraping some data off servers gives an inaccurate view of what people do." We actually agree with that sentiment, to some extent. For all the reasons we laid out in our piece (and above), our numbers might not be the same as the real numbers Microsoft has direct access to.

That said, even with all the potential issues, we still stand behind the statement of purpose we put forth in the initial piece: "We think these estimates of Xbox usage and game ownership are still superior to the utter lack of information we had about the world of Xbox usage before."

Microsoft and other publishers and platform holders remain incredibly tight-lipped about sales and usage data for their games. There are surely good, competitive and collective action reasons why gaming corporations are unwilling to share robust information on how their games are being played. Yet these issues don't seem to get in the way in the film industry—where public box office receipt estimates are available going back decades—or TV—where Nielsen audience estimates are published every single day.

Even in music, Spotify publishes relative popularity data for its songs that can feed larger analysis. Spotify also publishes its own analyses of that data.

Compared to this, the game industry is starving for quality, public information about what people are buying and playing. This is the kind of information that can be crucial for indie developers trying to develop a business case for a certain type of game, or for small publishers looking to see what the competitive landscape looks like. For the media, this kind of information is important to help frame our coverage decisions and decide what our audience might be interested in. And for plenty of consumers, there's a general interest in wanting to know how personal console usage stacks up to others and what games are driving interest in a platform of choice, as well as other platforms.

Microsoft has yet to respond to a direct request for comment on our reported numbers (aside from the tweets mentioned in this piece). If the company wanted to, though, it could share the precise percentage of total Xbox One usage time represented by backward compatible Xbox 360 games. It could give us a top 20 list of the most popular and/or best selling Xbox One games every month.

The company could confirm or deny the accuracy of the rest of Ars' reported numbers and help us correct our estimates where necessary. Or, it could obviate the need for those estimates altogether by just sharing at least some of this kind of data directly.

Until and unless the industry is willing to be more open with some of its data, we'll continue to look for other ways to get at this kind of information. And we'll continue to be upfront about the limitations of these methods, warts and all, for public consideration.