What I learned playing "SteamProphet"

May 9, 2017

Yesterday I tweeted a thread with some scary stats (click through to see the whole thing):

A quick updated summary of some of the things I noted:

  • At least 249 indie games have launched on Steam in the past 13 weeks

    • not including VR or F2P games

    • that's about 19 games every week, on average

  • In their first month...

    • 75% made at least $0

    • 10% made at least $1K-$9K

    • 7.5% made at least $10K-$49K

    • 2% made at least $50K-$99K

    • 5% made at least $100K-$999K

    • Exactly one made > $1M

Here's a graph:

What you're looking at is the minimum estimated earnings for each game in its first month on Steam. If a game is pegged at "$0", that doesn't mean it made $0; it just means I don't have enough data to confidently state it made any more than that. (Based on some spot checks with willing developers, games in the "$0" tier seem to make a few thousand dollars in their first month, at best.)

How did I get this data, you ask?

Let's Play SteamProphet

For the past thirteen weeks I've been playing a little game I like to call "SteamProphet" with a small group of friends -- it's basically fantasy football for indie games on Steam.

(UPDATE: we have a website now.)

Every Sunday, I put together a list of upcoming games releasing on Steam in the next week, with a few filters (must be tagged "indie", no VR, no F2P, must have a concrete release date, etc.). Each player picks a "portfolio" of five games from the list that they think will do well, and flags one of them as their top weekly pick.

Four weeks later we score the results and see who did best. We calculate a pessimistic minimum estimate of how much money each game has earned, and a player's score for the week is the sum of their five games' individual scores, plus the score of their weekly pick (since the weekly pick is one of the five regular picks, its score effectively counts twice).
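
To make the bookkeeping concrete, here's a minimal sketch of how one player's week might be tallied. The names are illustrative rather than any actual tooling we use, and "game_score" stands in for the estimation method described under "Scoring" below:

    def weekly_total(portfolio, weekly_pick, game_score):
        # Tally one player's score for a given week.
        #   portfolio   -- the five games the player picked from that week's list
        #   weekly_pick -- the one game flagged as the top pick (also in portfolio)
        #   game_score  -- callable returning a game's minimum-earnings score
        base = sum(game_score(game) for game in portfolio)
        # The weekly pick is one of the five regular picks, so its score is added again.
        return base + game_score(weekly_pick)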

Scoring

In the first iteration, we simply used the lower bound of SteamSpy's "players" metric as each game's score. We prefer the "players" metric over "owners" because it's less susceptible to distortions from giveaways, bundles, free weekends, and metric-inflation scams.

"Owners" counts anyone who has the game in their Steam library, whether that's a paying customer, a journalist who got a free review copy, or a smurf account trading greenlight votes for game keys. The "players" metric represents someone who actually bothered to install and run the game, and although this figure is still gameable, it's harder to do. Most importantly, we're pretty sure that "Players" will correlate closer to actual purchasers than "Owners", especially in a game's first month. To be extra conservative, we take the lower bound, so for a figure like "Players total: 58,263 ± 6,959" we count that as 51,304 players. Since SteamSpy uses a statistical confidence interval of 98%, taking the lower bound means we can be 98% confident the actual number of players is at least that high.

However, raw player counts make it hard to compare differently priced games -- a $2.99 game easily garners more players than a $19.99 one. To normalize scores, we multiply the player count by the lowest price the game sold for in its first month, then round the result down to the nearest 1,000. This makes it easier to compare the relative success of two different games.

But since we're dealing with estimates rather than hard figures from developers themselves, we insist on being conservative. This scoring method stacks four ruthless forms of pessimism:

  • Use players instead of owners (players is never more than owners)

  • Use the lower bound

  • Use the lowest price

  • Round down

What we're left with is a pretty reliable "hit detector." If a game scores 100,000 points with this estimation method, we can be pretty sure that it's actually earned at least $100,000 on Steam. The only major fly in the ointment is regional pricing -- if everyone who bought the game was from Russia or China, then the actual purchase price could be significantly lower, but as long as western buyers represent a significant chunk (a near certainty), the built-in pessimism should more than account for this.
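
In code, the whole pessimism stack boils down to something like the sketch below (the function and parameter names are made up for illustration; the 58,263 ± 6,959 example from above is worked through in the comments):

    def minimum_earnings_score(players_total, players_margin, lowest_price):
        # Pessimistic minimum-earnings score for one game.
        #   players_total  -- SteamSpy "players" estimate (not "owners")
        #   players_margin -- the "+/-" part of that estimate
        #   lowest_price   -- lowest price (USD) the game sold for in its first month
        players_lower_bound = players_total - players_margin  # e.g. 58263 - 6959 = 51304
        raw_score = players_lower_bound * lowest_price
        return int(raw_score // 1000) * 1000                  # round down to the nearest 1,000

    # Example: 58,263 +/- 6,959 players at a $19.99 floor price
    # -> 51,304 * 19.99 = 1,025,566.96 -> scores 1,025,000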

If we wanted to create a "miss detector" instead, we would probably do the opposite and go with the most optimistic estimate possible in each case: take the upper bound of the "owners" metric, multiply by the highest price, and count that as a pretty confident ceiling on a game's earnings (on Steam, at least). A low ceiling would reliably indicate a flop. But that wasn't our chief concern -- we wanted to know which games had almost certainly done well.
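
For completeness, the mirror-image ceiling estimate might look like this (again just a sketch with assumed inputs, not something we actually built):

    def maximum_earnings_ceiling(owners_total, owners_margin, highest_price):
        # Optimistic ceiling on a game's Steam earnings: the upper bound of the
        # "owners" metric times the highest price seen in its first month.
        # A low ceiling would be a reasonably confident sign of a flop.
        owners_upper_bound = owners_total + owners_margin
        return owners_upper_bound * highest_price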

Predicting

The predicting part of the game was inspired by the concept of "Superforecasters" -- a group of people who try to actually get better at predicting future events (a much-needed skill in our age of non-stop punditry). The basic gist of their method is:

  1. Make clear, quantifiable, falsifiable predictions.

    • Bad: The economy will do better.

    • Good: The S&P 500 will close higher than 3000 points.

  2. Set a maturation date for the prediction.

    • Bad: The economy will do better "soon"
