Choosing an algorithm
Every recommender in the library implements the same fit(ratings) /
recommend(user, n) contract, so swapping one for another is a one-line
change. But they're not interchangeable on the same problem: each has a
setting where it shines and one where it falls over. Here's the short version.
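As a rough illustration of why the swap is cheap, here is a minimal sketch of such a contract. The `Recommender` protocol, the `(user, item, rating)` triple format, and the toy `MostPopularSketch` class are assumptions made for this example; the library's real signatures may differ.

```python
from collections import Counter
from typing import Hashable, Iterable, Protocol


class Recommender(Protocol):
    """Hypothetical shape of the shared contract (not the library's own definition)."""

    def fit(self, ratings: Iterable[tuple[Hashable, Hashable, float]]) -> "Recommender": ...

    def recommend(self, user: Hashable, n: int) -> list[Hashable]: ...


class MostPopularSketch:
    """Toy popularity baseline: ranks items by how many ratings they received."""

    def fit(self, ratings):
        # Count one vote per (user, item, rating) triple, ignoring the rating value.
        self.counts_ = Counter(item for _, item, _ in ratings)
        return self

    def recommend(self, user, n):
        # The user is ignored on purpose: everyone gets the same list.
        return [item for item, _ in self.counts_.most_common(n)]


ratings = [
    ("ann", "dune", 5.0),
    ("bob", "dune", 4.0),
    ("cat", "dune", 4.0),
    ("bob", "emma", 3.0),
    ("cat", "emma", 4.0),
    ("cat", "ulysses", 2.0),
]

# Swapping algorithms means changing the class on this line and nothing else.
model: Recommender = MostPopularSketch().fit(ratings)
print(model.recommend("ann", 2))  # ['dune', 'emma']
```

Any class with the same two methods would slot into the last two lines unchanged.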
At a glance
| Algorithm | What it needs | When it's the right call | When it isn't |
|---|---|---|---|
| MostPopular | Just ratings | Cold-start baseline; a reasonable default for new users | Personalization (it gives everyone the same list) |
| MeanRating | Ratings + an explicit-rating scale | When highly-rated long-tail items matter | Implicit feedback; low catalog coverage |
| ItemKNN | Ratings; sparse item-item cosine | Strong CF baseline; broad catalog coverage | Cold-start items (an item with no ratings is unreachable) |
| UserKNN | Ratings; per-user top-k neighbors via NearestNeighbors | Communities and tight clusters | When you specifically want a fully precomputed user-user matrix for downstream analysis |
| SVD | Ratings; TruncatedSVD over the sparse user-item matrix | Dense latent structure; smoothing | Implicit feedback where 0 is not negative |
| BPR | Implicit interactions only | Implicit feedback; learning what order items rank in | When you need closed-form solves rather than SGD (reach for ALS) |
| ALS | Implicit interactions only | Same setting as BPR; converges in many fewer epochs | Very high n_factors where the per-side solves dominate |
| ContentBased | Per-item features | Cold-start items, niche-content surfacing | A pure-CF win on accuracy is on the table |
| TwoTowerCF | Implicit interactions; PyTorch | Embedding models with future side-information towers | Most cases (BPR/ALS are simpler and often as strong) |
| HybridRecommender | Two or more fitted recommenders | Blending CF with content (cold-start) or model classes | One signal is dramatically stronger (weights matter) |
When pure collaborative filtering wins
If you have plenty of interactions per user and the catalog is reasonably
small/dense, neighborhood CF (ItemKNN) or matrix factorization (SVD,
ALS) is hard to beat on accuracy. The MovieLens benchmark is a canonical
example: ratings are dense, the catalog is bounded, and ItemKNN / UserKNN
/ SVD cluster within 1-2% of each other on precision and NDCG.
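For intuition about what neighborhood CF does with dense co-ratings, here is a small pure-Python sketch of item-item cosine scoring. It is not the library's ItemKNN; the function names, the triple format, and the scoring rule (similarity times the user's rating, summed over top-k neighbours) are assumptions chosen for brevity.

```python
import math
from collections import defaultdict


def item_cosine(ratings):
    """Cosine similarity between item rating vectors, over co-rating users only."""
    by_item = defaultdict(dict)
    for user, item, value in ratings:
        by_item[item][user] = value
    # Assumes strictly positive ratings, so no norm is zero.
    norms = {i: math.sqrt(sum(v * v for v in users.values())) for i, users in by_item.items()}
    sims = defaultdict(dict)
    items = list(by_item)
    for idx, a in enumerate(items):
        for b in items[idx + 1:]:
            shared = by_item[a].keys() & by_item[b].keys()
            if shared:
                dot = sum(by_item[a][u] * by_item[b][u] for u in shared)
                sims[a][b] = sims[b][a] = dot / (norms[a] * norms[b])
    return sims


def recommend_item_knn(ratings, sims, user, n, k=20):
    """Score unseen items by similarity to what the user already rated."""
    seen = {item: value for u, item, value in ratings if u == user}
    scores = defaultdict(float)
    for item, value in seen.items():
        neighbours = sorted(sims[item].items(), key=lambda kv: kv[1], reverse=True)[:k]
        for other, sim in neighbours:
            if other not in seen:
                scores[other] += sim * value
    return sorted(scores, key=scores.get, reverse=True)[:n]


ratings = [
    ("ann", "dune", 5), ("ann", "foundation", 4),
    ("bob", "dune", 4), ("bob", "foundation", 5), ("bob", "emma", 1),
    ("cat", "emma", 5), ("cat", "persuasion", 4),
    ("dan", "dune", 5),
]
sims = item_cosine(ratings)
print(recommend_item_knn(ratings, sims, "dan", 2))  # ['foundation', 'emma']
```

Note that "persuasion" never reaches dan: it shares no raters with anything dan has rated, which is the same mechanism that makes a zero-rating item unreachable.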
When you need content
Three signs:
- Cold-start items. A new item with zero ratings is invisible to CF; ContentBased can score it from features alone.
- Long-tail discovery. CF tends to collapse onto popular items. Content surfaces niches that share features with what a user already liked.
- Explainability. ContentBased.recommend_with_reasons returns a per-recommendation reason ("shared tags: fantasy, magic"), which CF can't.
recommender_systems.books.build_tag_recommender turns a goodbooks-style tag
table into a ContentBased via TF-IDF in one call. Compose it with a CF
recommender by hand through HybridRecommender(...) for a hybrid pipeline.
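To see how features alone can cover the cold-start and explainability cases listed above, here is a deliberately simple sketch. It uses plain Jaccard tag overlap rather than TF-IDF, and `tag_recommend_with_reasons` is a hypothetical helper, not the library's ContentBased.recommend_with_reasons.

```python
def tag_recommend_with_reasons(item_tags, liked, n):
    """Rank unseen items by Jaccard overlap with the tags of liked items."""
    # The user's profile is the union of tags across everything they liked.
    profile = set().union(*(item_tags[i] for i in liked))
    scored = []
    for item, tags in item_tags.items():
        if item in liked:
            continue
        shared = profile & tags
        if shared:
            score = len(shared) / len(profile | tags)
            reason = "shared tags: " + ", ".join(sorted(shared))
            scored.append((score, item, reason))
    # Highest score first; item name breaks ties deterministically.
    scored.sort(key=lambda row: (-row[0], row[1]))
    return [(item, reason) for _, item, reason in scored[:n]]


item_tags = {
    "the_hobbit": {"fantasy", "magic", "adventure"},
    "earthsea": {"fantasy", "magic"},
    "new_release": {"fantasy", "magic", "dragons"},  # zero ratings: invisible to CF
    "emma": {"romance", "regency"},
}
result = tag_recommend_with_reasons(item_tags, liked={"the_hobbit"}, n=2)
print(result)
# [('earthsea', 'shared tags: fantasy, magic'), ('new_release', 'shared tags: fantasy, magic')]
```

The unrated "new_release" is scored like any other item because nothing here depends on interaction history.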
When you need a learned latent space
When the data is large enough that n_factors << min(n_users, n_items) is a
useful compression, and you want a single dense user/item embedding you can
use elsewhere (clustering, search, downstream models), reach for SVD,
ALS, or TwoTowerCF. SVD is the simplest (truncated SVD on the rating
matrix); ALS converges in fewer epochs at the cost of more work per epoch;
TwoTowerCF extends naturally if you want to drop side information into
either tower.
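To make the compression concrete, here is the arithmetic for a made-up problem size. The user, item, and factor counts are illustrative assumptions, not figures from any benchmark in this project.

```python
def dense_cells(n_users, n_items):
    """Cells in the full user-item matrix."""
    return n_users * n_items


def factor_cells(n_users, n_items, n_factors):
    """Cells in one user-factor matrix plus one item-factor matrix."""
    return (n_users + n_items) * n_factors


# Hypothetical sizes, chosen only to show the effect.
n_users, n_items, n_factors = 50_000, 10_000, 50

full = dense_cells(n_users, n_items)
compact = factor_cells(n_users, n_items, n_factors)
ratio = full / compact

print(f"{full:,} cells vs {compact:,} cells ({ratio:.1f}x smaller)")
# 500,000,000 cells vs 3,000,000 cells (166.7x smaller)
```

The ratio shrinks as n_factors approaches min(n_users, n_items), which is why the inequality in the paragraph above is the condition that matters.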
When you need implicit-feedback ranking
BPR and ALS both target the implicit setting (presence of an
interaction, not its rating). They optimize different objectives —
sigmoid-margin ranking loss (BPR) vs confidence-weighted regularized least
squares (ALS) — and converge differently. ALS gets close to its final
quality in 10-15 epochs; BPR needs more passes but is simpler to extend
with custom samplers.
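The two objectives can be sketched per training example as follows. These are the textbook forms with regularization omitted; the function names and the `alpha=40.0` confidence scale are assumptions for illustration and may not match what the library's BPR and ALS classes use.

```python
import math


def bpr_pair_loss(score_pos, score_neg):
    """Ranking loss for one (interacted item, sampled non-interacted item) pair.

    Equals -log(sigmoid(score_pos - score_neg)); small when the interacted
    item already outranks the sampled one.
    """
    margin = score_pos - score_neg
    return math.log1p(math.exp(-margin))


def als_cell_loss(interacted, score, alpha=40.0):
    """Confidence-weighted squared error for one user-item cell."""
    preference = 1.0 if interacted else 0.0
    # Observed cells get a larger weight; unobserved cells still count, weakly.
    confidence = 1.0 + alpha * preference
    return confidence * (preference - score) ** 2


# BPR only cares about the order of the two scores.
print(bpr_pair_loss(2.0, 0.0) < bpr_pair_loss(0.0, 2.0))  # True

# ALS penalizes the same prediction error far more on an observed cell.
print(als_cell_loss(True, 0.8) > als_cell_loss(False, 0.2))  # True
```

BPR sees sampled pairs, which is why a custom sampler is a natural extension point, while the ALS objective covers every cell and is quadratic in each side's factors when the other side is held fixed.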
Composing
HybridRecommender fuses two or more fitted recommenders' top-N output via
weighted reciprocal-rank fusion. The hybrid never needs to peek inside its
components, so anything implementing the Recommender interface composes —
including another hybrid.
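A minimal sketch of weighted reciprocal-rank fusion, assuming the common `weight / (k + rank)` form with `k=60`. The function name, the constant, and the tie-breaking rule are assumptions here; HybridRecommender may use different ones.

```python
from collections import defaultdict


def weighted_rrf(ranked_lists, weights, n, k=60):
    """Fuse ranked lists: each item earns weight / (k + rank) per list it appears in."""
    scores = defaultdict(float)
    for ranking, weight in zip(ranked_lists, weights):
        for rank, item in enumerate(ranking, start=1):
            scores[item] += weight / (k + rank)
    # Highest fused score first; item name breaks ties deterministically.
    return sorted(scores, key=lambda item: (-scores[item], str(item)))[:n]


cf_top = ["dune", "foundation", "hyperion"]
content_top = ["new_release", "dune", "earthsea"]

print(weighted_rrf([cf_top, content_top], weights=[0.5, 0.5], n=3))
# ['dune', 'new_release', 'foundation']

print(weighted_rrf([cf_top, content_top], weights=[0.7, 0.3], n=3))
# ['dune', 'foundation', 'hyperion']
```

Only the ranked output of each component is consumed, so the fused list can itself be fed into another fusion step. The second call also shows how quickly a heavier weight crowds out the other component's items.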
Reasonable defaults to start
```python
# Strong CF baseline.
ItemKNN(k=20)

# Explicit-feedback latent factor.
SVD(n_factors=50, random_state=0)

# Implicit feedback, fast to fit.
ALS(n_factors=32, epochs=15, random_state=0)

# Content-only or content-aware hybrid.
ContentBased(item_features=tfidf_or_embeddings)
```
These match what scripts/benchmark.py and scripts/benchmark_goodbooks.py
ship with.