In my previous article, I discussed how Valve introduced changes to Steam's visibility algorithms on October 5th, 2018. It's not clear exactly what changed, but this seems to have resulted in less Discovery Queue traffic to smaller games. This is unfortunate, as Discovery Queue traffic is particularly important for these games. It provides visibility during sales. It provides an audience that is looking for new games. And it can also amplify the effect of whatever marketing efforts a developer can deliver.
Valve hasn't explained what changed back in October. But I wanted to get a better understanding of how the Discovery Queue works and what games it is showing. Between December 2nd and February 28th I viewed 672 games. For each game I was shown, I saved the store page HTML. I then spent far too many hours obsessing over the data. This is an in-depth breakdown of the statistics and an analysis of the results. I may have gotten a little carried away.
Limitations of This Approach
There are many caveats regarding the relevance of this data. For one, this only reflects my personal Steam account. It's a reasonable assumption that what I'm seeing will generalize to other accounts, but I have no way to know for certain.
Also, my data only goes back to December 2nd. It offers no insight into how the current algorithm compares to how things worked before October. I also believe that Valve made further tweaks to the algorithm in early March, which is why I am restricting my sample to the end of February. But I have no way to know if the algorithm has been consistent during the time I was collecting data. Valve could have been making subtle tweaks to their algorithms throughout that period, and we would not know it.
My Steam account preferences are set to allow mature content to be shown, with the exception of Adult Only Sexual Content. I haven't set any tags in Steam's Tags to Exclude feature. In my Discovery Queue preferences, Early Access Products, Software, Videos, and Unreleased Products are all allowed. Prior to starting this experiment, I was a semi-regular user of the Discovery Queue feature. I had seen 1323 products, and I marked 339 of them as "Not Interested." The impact of marking a product as Not Interested is unclear. One page states that "It does not change what kind of games will be recommended to you." A different page claims "We will also exclude these products from being used to recommend you other, similar items." During the experiment, I did not mark any additional products as Not Interested. The Discovery Queue does not repeat itself, so each of the games I saw was being recommended for the first time.
There are a lot of considerations to make in the design of a recommendation system. One of the most common and straightforward tools to apply is popularity. Basing discovery recommendations on popularity has its downsides, as people are likely already aware of popular things. And popularity-based recommendations mean that hidden gems will inevitably be missed. But when a lot of people like something, it is a strong signal that it might be interesting.
Another factor in creating good recommendations is quality. This can be difficult to pin down, since quality can be extremely subjective. Critic and user scores attempt to put a numeric value on quality. Review scores can gloss over a lot of subtlety, but they give a number that is easy to incorporate into an algorithm. While popularity tends to be correlated with quality, something doesn't have to be good to be popular. And critically, low popularity does not imply low quality.
One final tool that I think is key to a good recommendation system is personal relevance. Determining what things might fit a user's particular tastes is challenging but potentially very helpful.
Valve's page explaining the Discovery Queue provides a good overview of how to use the system, and I recommend giving it a read if you aren't familiar with it. It notes that the Discovery Queue "…tries to strike a balance between prioritizing products known to be good, and new products that you may find interesting." This stated approach of balancing quality and relevance seems like a good start, with no mention of popularity. The description also notes the importance of new products, introducing the idea of recency as an additional priority in making recommendations.
This Product is in Your Discovery Queue Because it is Popular
Valve has said that they do not want to share too many details of how their algorithms work, because they don't want people to cheat the system. But Steam does provide a lot of information about why things are in your Discovery Queue. This information starts with what I call the top line, or qualifying reason. The top line reason banner shows a general reason for why you are seeing a game, such as "because it is popular" or "because it has positive user reviews." These reasons seem to explain how a game qualified to be shown to you, but do not necessarily explain why this particular game was recommended.
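For readers who want to repeat this kind of tally on their own saved store pages, here is a minimal sketch of pulling the top line reason out of page text. It assumes the banner reads "This product is in your discovery queue because ...", following the wording in this section's heading. The real markup of the store page is not described in this article, so the pattern and the helper names `extract_reason` and `tally_reasons` are illustrative assumptions, not the method actually used to build the data set.

```python
import re
from collections import Counter
from html import unescape

# Assumption: the banner contains the phrase "in your discovery queue"
# followed by the reason, ending at a period or at the next HTML tag.
REASON_PATTERN = re.compile(r"in your discovery queue\s+([^.<]+)", re.IGNORECASE)


def extract_reason(page_html):
    """Return the top line reason found in a saved page, or None if absent."""
    match = REASON_PATTERN.search(unescape(page_html))
    if match is None:
        return None
    # Collapse runs of whitespace and capitalize the first letter.
    reason = " ".join(match.group(1).split())
    return reason[:1].upper() + reason[1:]


def tally_reasons(pages):
    """Count how often each top line reason appears across saved pages."""
    return Counter(extract_reason(page) for page in pages)
```

A reason that contains inline markup would be cut off at the first tag, so a real scraper would likely need a proper HTML parser instead of a regular expression.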
I shared a draft of this article with Alden Kroll of Valve. He explained that these top line reasons do not necessarily indicate a direct cause for the game appearing. He provided the following details on this system:
- When you see the “reasons” line (eg. “This product is in your discovery queue because it has positive user reviews”), that doesn’t actually mean that review score is the reason it is there (or at least not the only reason it is in your queue).
- Think of the “reasons” field being a suggestion to the user about why, in a vacuum, they might enjoy this particular title.
- We want to pick reasons that seem strong and easily understandable to a user. For example: if a title is very highly reviewed, that’s something that’s easy to explain, and so we’re more likely to pick that as thing to highlight for a customer. If a game is really popular, then we are more likely to pick that as the thing to explain why the game is shown in your queue.
- We want flexibility to adjust and experiment with the underlying algorithm and we can’t always explain all the reasons why a game is in your discovery queue. In some cases, the algorithm used to generate the recommendation may not come with easy to digest reasons for any given recommendation, but we still think it’s valuable to give some indication to a customer about why they might like that title.
The following table shows all of the types of top line reasons that I saw and how many times each of these reasons occurred.
| Top line reason | Games | Share |
|---|---|---|
| Because it is popular | 281 | 41.8% |
| Because it has positive user reviews | 255 | 37.9% |
| Just to see if you might be interested | 46 | 6.8% |
| Because it is new on Steam | 36 | 5.4% |
| Because it has a high Metacritic score | 26 | 3.9% |
| Because it is on sale | 21 | 3.1% |
| Because it is a top seller | 7 | 1.0% |
281 games were shown to me with a top line reason indicating that they were popular. Representing 42% of the games I viewed, this reason is the most common one. It is not clear what criteria are being used to determine popularity. It is likely a mix of factors, but my best guess is that it is mostly based on recent revenue.
As I said, popularity is a useful tool for making recommendations. But I am surprised to see it so heavily emphasized in the Discovery Queue, where users are looking to find new games they may not already be aware of.
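The percentages in the table above can be reproduced from the raw counts. The following is a small sketch using the counts reported in this article; the function name `reason_shares` is just an illustrative choice.

```python
# Counts of each top line reason, as reported in the table above.
reason_counts = {
    "Because it is popular": 281,
    "Because it has positive user reviews": 255,
    "Just to see if you might be interested": 46,
    "Because it is new on Steam": 36,
    "Because it has a high Metacritic score": 26,
    "Because it is on sale": 21,
    "Because it is a top seller": 7,
}


def reason_shares(counts):
    """Return each reason's share of the sample as a percentage (one decimal)."""
    total = sum(counts.values())
    return {reason: round(100 * n / total, 1) for reason, n in counts.items()}


total_games = sum(reason_counts.values())  # 672 games in the sample
for reason, share in reason_shares(reason_counts).items():
    print(f"{reason}: {share}%")
```

The counts sum to the 672 games in the sample, so every game I viewed carried exactly one of these seven reasons.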
Because it has positive user reviews
The second most common top line reason for this data set is because the game has positive user reviews. The threshold for this reason appearing is 80% positive. The following chart shows a histogram of user scores for the games I saw. The games with this top line reason are highlighted separately, and the stacked values represent the full distribution of user scores.
Stacked histogram of User Scores for games in my Discovery Queue
With this reason appearing on 38% of my Discovery Queue views, it seems like this represents an important visibility threshold in the system. However, Alden tells me that "...review score (above ‘negative’) has very little impact on whether a game gets recommended" and that there is no code that is explicitly selecting for games over 80%. He suggests that the discontinuity in the chart is due to various criteria for getting recommended happening to be correlated with having over 80% positive reviews. He wanted to make it clear that developers do not need to target a particular review score to get recommended.
Just to see if you might be interested
Surprisingly, the third most common qualifying reason is "just to see if you might be interested." While this was only 46 games (6.8%), I did not expect it to be so frequent. The logic behind this category makes sense. Valve wants to collect data about the games on the platform, and randomly recommending them gives an opportunity to learn more. However, the system seems to factor in only a very limited amount of data. If I buy a game, that will feed into Steam's popularity metric. If I play the game, then it will factor into future recommendations. But whether I Wishlist, Follow, mark as Not Interested, or simply click Next, my actions do not seem to factor into Valve's algorithms. I would think that these actions would reveal some information about the game and quite a lot about my personal preferences. But that data does not seem to be used currently, making these views much less effective than they could be.
Because it has a high Metacritic score
Metacritic score is another attempt to quantify quality. Again the threshold is 80, though this represents a much higher bar to overcome than an 80% User Score. For smaller games in particular it is difficult to even get a Metacritic score, let alone to stay above 80. And again, it seems that this is a hard yes/no threshold. There does not seem to be a trend towards higher Metacritic games being recommended more than lower-scored games. With only 26 games qualifying for this reason, the system is obviously placing a much lower emphasis on Metacritic compared to User Scores.
Stacked histogram of Metacritic Scores
Because it is new on Steam
The Discovery Queue provides some additional visibility to new games in the store. Of the 36 games that showed up under this reason, ten were products taking pre-orders, one was not available for purchase yet, and 26 were launched. The launched games were at most ten days old.
Because it is on sale
Currently being on sale also shows up as a top line reason for a game being recommended. It certainly makes sense that users would be interested in games that are discounted, and this is a feature that can help developers amplify their own marketing efforts.
Because it is a top seller
Finally, we have 7 games that were shown because they were on the top sellers list. Every game that showed up with this reason attached was taking pre-orders, whereas all of the games that were displayed because they were popular had, at a minimum, launched into early access. So this seems to be a system that is linked to popular pre-orders.
In these categories, we see popularity and quality factors playing the biggest role in what is qualified to be shown. Discounts, recency, and some random trials are also mixed in at a much lower rate. Something that does not show up in this at all though is personal relevance. This part of the algorithm doesn't seem to be qualifying any of the traffic according to what my personal tastes might be.
While browsing my Discovery Queue, it seemed like the games I was seeing tended to be recently released. The "new on Steam" qualifying reason only explains a small percentage of this, and the effect seems to go beyond that. This chart shows the distribution of games by age.
Histogram of days since release capped at 2000 days.
44 games (7%) had been released for a week or less, and 74 games (12%) had been available for a month or less. There does seem to be some recency effect, though it may be explained by recent games tending to be more popular than older games.
While browsing the queue, I felt like I was seeing a lot of unreleased games, but looking at the numbers, the reality is that most of the games were launched. Here is the breakdown of release status.
As part of Steam Direct, Valve introduced systems to prevent developers from creating games targeted at exploiting the achievements and trading cards systems. Games that have not reached some criteria for eligibility are marked as having restrictions on certain features. Initially, the indication would always use the text "Steam is learning about this game" but in early February, the text "Profile Features Limited" started to appear for some games. It is not clear what the difference in status indicates.
Valve has said that this "still learning" status only impacts achievements and trading cards and does not impact store visibility. I saw 71 games marked with the still learning status and 14 games marked with the features limited status. If we look at the breakdown of the top line reasons for how these games qualified for my Discovery Queue, we see a different distribution, indicating that these games tend to be either newer or less popular compared to the rest of the sample.
| Top line reason | Games | Share |
|---|---|---|
| Because it has positive user reviews | 36 | 41.4% |
| Just to see if you might be interested | 20 | 23.0% |
| Because it is new on Steam | 14 | 16.1% |
| Because it is popular | 14 | 16.1% |
| Because it has a high Metacritic score | 2 | 2.3% |
| Because it is on sale | 1 | 1.1% |
It has always seemed like the Discovery Queue tends to show me a lot of VR games, even though I don't own a headset, and I've never played a VR game on my account. In this sample, 46 games (6.8%) were VR. I estimate that 10% of the total Steam catalog has VR support, so despite the impression that I had, it does not seem like Valve is giving VR games an extra visibility boost in the Discovery Queue.
464 (60.9%) of the games in the sample had the Indie tag, whereas 72% of the total catalog has the Indie tag, indicating that indie games are relatively underperforming when competing for visibility in my Discovery Queue.
Another source of information about Steam's recommendations is the "Is this game relevant to you?" info. This feature of the Steam store is not specific to the Discovery Queue. Logged in users are shown a relevance section on every game page. As we will see, these relevance factors don't seem to be directly deciding the recommendations. It seems that the information shown in this section isn't showing why the Discovery Queue chose this particular game. Instead, it is a system that looks at a game and returns whatever information it has. This is the first area where we are seeing how the recommendation system is considering personal relevance as well as popularity and quality factors.
Is this relevant?
One of the most informative things about this section is that it sometimes provides no information at all. On occasion when exploring your Discovery Queue, this section will show text stating that "You've already looked at a lot of games that we have the best information on for you. Until new games release, you might see less relevant games as you explore more of your queue."
During my experiment, I was shown games with this text nine times. For eight of those games, the top line qualifying reason for being shown was that the game is popular. The ninth was shown because it was new on Steam. I found encountering these messages to be frustrating. The message seems to be stating that I have exhausted the algorithm's ability to provide personalized recommendations for me. Yet continuing to explore my queue revealed many games with reasonable personal relevance. This seems to indicate that the algorithm is showing me the game because popularity factors outweigh the lack of relevance. The text indicating that the algorithm has run out of personally relevant games is misleading. It is also interesting to see a reference to the importance of recency in this text.
I have been shown less than 10% of the Steam library by the Discovery Queue, but it seems like the systems are designed to emphasize recent and popular titles over exploring Steam's deep back catalog of games.
Similar by tags
The most common relevance explanation I was shown was the "Because you've played games tagged:" reason, where a game's tags match tags for other games I have played. This reason showed up on 532 (79%) of the games I viewed in the experiment.
Tags on Steam are primarily crowdsourced, though developers can set tags on their own games. Valve does do some moderation. Users can report misapplied tags for review, and only tags from a pre-approved list will show up. Tags are a mix of various types of metadata, covering genre information like puzzle-platformer, feature information like Online Multiplayer, and various other miscellaneous properties such as Indie, Crime, or Colorful. Tags can provide useful information about a game. But they tend to be a very noisy data source because of their crowdsourced nature and inconsistent application. While I have been told that Valve is making efforts to reduce the importance of tags in their algorithms, as of today tags are still a major factor in visibility on Steam.
When showing game relevance by tags, one to seven matching tags will be listed in this area. While looking at my queue, it often seemed like these tag matches weren't very relevant. A small number of tags would be listed, and those tags wouldn't be particularly related to my tastes. Here is the distribution of frequencies of tag counts that I saw.
As we can see, many of the games were matching on one or two tags, though the majority had three or four.
Beyond the number of tags, it often seemed like the particular tags that were showing up didn't capture a lot of useful information. To investigate this, I am borrowing a concept from Information Theory to calculate how much information a given tag conveys about a game. For example, the Strategy tag is present on 5894 of the 28268 games I was able to access using the SteamSpy API. Computing -log_2(5894/28268) gives 2.262 bits, which is the amount of information the presence of the tag conveys (strictly speaking, this is the tag's self-information, though I will refer to it loosely as entropy). If the relevance field shows multiple tags, we can sum these values to get the approximate amount of information the tag matches are conveying. This isn't a strictly correct measure of entropy, since tags aren't independent from each other. But it's still an interesting way to evaluate the results.
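The calculation above can be sketched in a few lines of Python. The Strategy count and catalog size come from the numbers quoted above; the function names `tag_bits` and `match_bits` are illustrative, and the summation carries the same simplifying assumption that tags are independent.

```python
import math


def tag_bits(games_with_tag, total_games):
    """Bits of information conveyed by a tag's presence: -log2(frequency)."""
    return -math.log2(games_with_tag / total_games)


def match_bits(tag_counts, total_games):
    """Sum the per-tag bits for a set of matched tags.

    Treats tags as independent, which is only an approximation.
    """
    return sum(tag_bits(count, total_games) for count in tag_counts)


TOTAL_GAMES = 28268    # games accessible through the SteamSpy API
STRATEGY_GAMES = 5894  # games carrying the Strategy tag

print(round(tag_bits(STRATEGY_GAMES, TOTAL_GAMES), 3))  # 2.262
```

Rare tags score high and common tags score low, so a match on a single very common tag carries well under one bit.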
The 14 games that showed a single match on the Indie tag are recommended based on the least amount of information at 0.467 bits. The average match is based on 7.07 bits of entropy.
Histogram of bits of data conveyed by tag matches
33 different tags appeared in this section during my experiment. The following table shows how many times each tag appeared.
|Horror||61||Platformer||19||Hack and Slash||3|