Model benchmarks

2013-11-02

A lot of people have asked me what models we use for recommendations at Spotify, so I wanted to share some insights. Here are benchmarks for some models. Note that we don't use all of them in production.

[Figure: Performance for recommender models]

This particular benchmark looks at how well we are able to rank “related artists”. More info about the models:

vector_exp: Our own method, a latent factor method trained on all log data using Hadoop (50B+ events).
word2vec: Google's open-source word2vec. We train a model on subsampled (5%) playlist data using skip-grams and 40 factors.
rnn: Recurrent Neural Networks trained on session data (users playing tracks in a sequence), with 40 nodes in each layer, Hierarchical Softmax for the output layer, and dropout for regularization.
koren: The method from the paper "Collaborative Filtering for Implicit Feedback Datasets". Trained on the same data as vector_exp. Running in Hadoop, 40 factors.
lda: Latent Dirichlet Allocation using 400 topics, same dataset as above, also running in Hadoop.
freebase: A latent factor model trained on artist entities in the Freebase dump.
plsa: Probabilistic Latent Semantic Analysis, using 40 factors and the same dataset/framework as above. More factors give significantly better results, but still nothing that can compete with the other models.
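
Most of the models above end up representing each artist as a vector of factors, and "related artists" can then be read off as nearest neighbors in that space. Here is a minimal sketch of that idea, assuming cosine similarity as the ranking score. The post doesn't say which similarity each model actually uses, and the artist names and 3-dimensional vectors below are made up for illustration (the models above use 40 factors or more).

```python
import math

# Hypothetical artist vectors; real models would learn these from data.
ARTIST_VECTORS = {
    "artist_a": [0.9, 0.1, 0.0],
    "artist_b": [0.8, 0.2, 0.1],
    "artist_c": [0.0, 0.9, 0.4],
    "artist_d": [0.1, 0.8, 0.5],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat an all-zero vector as unrelated to everything
    return dot / (norm_u * norm_v)


def related_artists(query, vectors, top_k=3):
    """Rank every other artist by similarity to `query`, best first."""
    q = vectors[query]
    scored = [(cosine(q, v), name) for name, v in vectors.items() if name != query]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]


if __name__ == "__main__":
    print(related_artists("artist_a", ARTIST_VECTORS, top_k=2))
```

A brute-force scan like this is fine for a toy example. With millions of items you would want some kind of approximate nearest neighbor index instead.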

Again, not all of these models are in production, and conversely, we have other algorithms not included above that are in production. This is just a selection of things we've experimented with. In particular, I think it's interesting to note that neither PLSA nor LDA performs very well. Taking sequence into account (rnn, word2vec) seems to add a lot of value, but our best model (vector_exp) is a pure bag-of-words model.
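
To make the bag-of-words versus sequence distinction concrete, here is a rough sketch of how the two views turn one listening session into training pairs. It is a simplification under my own assumptions: the function names are hypothetical, the fixed window is a stand-in for what skip-gram does (real word2vec also randomizes the window size), and this is not how vector_exp or the rnn are actually implemented.

```python
from itertools import combinations


def bag_of_words_pairs(session):
    """All unordered pairs of distinct items in a session; order is ignored."""
    return {tuple(sorted(pair)) for pair in combinations(set(session), 2)}


def skipgram_pairs(session, window=1):
    """(center, context) pairs for items at most `window` positions apart."""
    pairs = []
    for i, center in enumerate(session):
        lo = max(0, i - window)
        hi = min(len(session), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, session[j]))
    return pairs


if __name__ == "__main__":
    session = ["a", "b", "c", "d"]  # hypothetical track ids, in play order
    print(sorted(bag_of_words_pairs(session)))
    print(skipgram_pairs(session, window=1))
```

In the bag-of-words view, "a" and "d" co-occur just because they are in the same session. In the windowed view with window=1 they never form a pair, since they are three positions apart.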


Erik Bernhardsson

... is the founder of Modal Labs which is working on some ideas in the data/infrastructure space. I used to be the CTO at Better. A long time ago, I built the music recommendation system at Spotify. You can follow me on Twitter or see some more facts about me.
