Context-Aware Recommendations

By 2016, YouTube was optimizing its recommendations for watch time. Users spent more time on the platform, but in surveys some said: "I feel like I'm wasting time". In 2019 the team published a paper on switching to multi-task learning, jointly optimizing watch time, satisfaction, and engagement. Context-aware, multi-objective ranking has since become the industry standard.

  • **Netflix Content Scheduling** - different content recommended on weekdays vs weekends; seasonal collections (summer blockbusters, holiday films)
  • **Foursquare/Swarm** - time+location aware venue recommendations; the "morning near office = coffee" model
  • **SASRec on Amazon data** - self-attentive sequential recommendation, evaluated on Amazon review datasets; a standard baseline in e-commerce sequential recommendation

Prerequisites

  • Collaborative filtering and the user-item rating matrix
  • Feature interactions and embeddings from deep recommendation models
  • The explore-exploit tradeoff behind bandits

Factorization Machines and the Rise of Context

In 2010, Steffen Rendle introduced Factorization Machines, a model that learns a weight for every pair of features as the dot product of per-feature latent vectors. The breakthrough was practical: context such as time of day, location, device, or weather could be added as ordinary features alongside user and item ids, and the model would still estimate reliable interaction weights even when most feature combinations were never observed, because each feature's latent vector is shared across all the pairs it takes part in. This made context a first-class signal rather than an afterthought and turned the rating-prediction problem into general feature-based prediction. Factorization Machines underpin much of context-aware recommendation and connect directly to contextual bandits, where the system also balances exploiting known preferences against exploring choices it knows little about in the current context.
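The model described above can be sketched in a few lines. This is a minimal illustration of the second-order Factorization Machine score, not a trained model: the function names, the feature layout, and all numeric values are made up for the example. The fast version uses the standard identity that reduces the pairwise sum to a cost linear in the number of features; the naive version is included only to check it.

```python
from typing import List


def fm_predict(x: List[float], w0: float, w: List[float], v: List[List[float]]) -> float:
    """Second-order FM score: bias + linear terms + factorized pairwise terms.

    v[i] is the latent vector of feature i; the weight of the pair (i, j)
    is the dot product of v[i] and v[j].
    """
    linear = w0 + sum(wi * xi for wi, xi in zip(w, x))
    k = len(v[0])
    pairwise = 0.0
    for f in range(k):
        # 0.5 * ((sum_i v_if x_i)^2 - sum_i (v_if x_i)^2) equals the sum over i < j
        s = sum(v[i][f] * x[i] for i in range(len(x)))
        s_sq = sum((v[i][f] * x[i]) ** 2 for i in range(len(x)))
        pairwise += 0.5 * (s * s - s_sq)
    return linear + pairwise


def fm_predict_naive(x: List[float], w0: float, w: List[float], v: List[List[float]]) -> float:
    """Same score with an explicit double loop over feature pairs (for checking)."""
    score = w0 + sum(wi * xi for wi, xi in zip(w, x))
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dot = sum(a * b for a, b in zip(v[i], v[j]))
            score += dot * x[i] * x[j]
    return score


# Hypothetical one-hot layout: [user_a, user_b, item_x, ctx_evening]
x = [1.0, 0.0, 1.0, 1.0]
w0 = 0.1
w = [0.2, -0.1, 0.3, 0.05]
v = [[0.1, 0.2], [0.3, -0.1], [-0.2, 0.4], [0.5, 0.1]]
print(fm_predict(x, w0, w, v))
```

Because context is just another active feature, adding a new context variable only adds rows to `w` and `v`; the scoring code does not change.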

Temporal Context: seasonality and interest decay

Netflix analyzed viewing patterns and found that on weekday evenings users watch short episodes (20-30 min), while on weekends they watch feature films and long drama series. The same user wants different content at different times, and a recommender that ignores context cannot represent this: it is a fundamental blind spot.

**Seasonality vs short-term patterns:** seasonality (summer -> beach films, December -> holiday content) is a long-period, yearly cycle. Circadian patterns (morning -> podcasts, evening -> series) repeat daily. A model needs to capture both scales, either through separate features or through a multi-scale architecture.
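Interest decay, the other half of this section's title, is often modeled by down-weighting old interactions. The sketch below assumes a simple exponential decay with a half-life; the function name and the 30-day default are illustrative choices, not values from any particular system.

```python
def decay_weight(age_days: float, half_life_days: float = 30.0) -> float:
    """Weight of an interaction that happened `age_days` ago.

    The weight halves every `half_life_days` (an assumed hyperparameter).
    """
    return 0.5 ** (age_days / half_life_days)


# Hypothetical history: (genre, days since the interaction)
history = [("drama", 2), ("drama", 5), ("comedy", 90), ("comedy", 120), ("comedy", 150)]

genre_scores = {}
for genre, age in history:
    genre_scores[genre] = genre_scores.get(genre, 0.0) + decay_weight(age)

print(genre_scores)
```

With these numbers, two recent drama views outweigh three old comedy views, which is the intended effect of the decay.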

Why encode the hour of day via sin/cos rather than as a number 0-23?
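One way to explore the question above is to compare distances under both encodings. The sketch below maps the hour onto a point on the unit circle; `encode_hour` and `distance` are names chosen for this example.

```python
import math
from typing import Tuple


def encode_hour(hour: float, period: float = 24.0) -> Tuple[float, float]:
    """Map an hour to (sin, cos) so that the end of the cycle meets its start."""
    angle = 2.0 * math.pi * hour / period
    return (math.sin(angle), math.cos(angle))


def distance(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    """Euclidean distance between two encoded points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


# As raw numbers, 23 and 0 are 23 units apart although they are adjacent hours.
print(abs(23 - 0))
# On the circle, 23:00 -> 00:00 is as close as 00:00 -> 01:00.
print(distance(encode_hour(23), encode_hour(0)))
print(distance(encode_hour(0), encode_hour(1)))
```

The same construction works for day of week (period 7) or day of year (period 365), which covers the seasonal scale as well.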

Location-Aware: geographic context and local relevance

Foursquare in 2012 discovered: a user at 8:00 AM near their office is very likely looking for coffee. The same user at 7:00 PM in the same area is more likely looking for a bar or restaurant. Location and time together create a context that neither variable describes on its own.

**Home vs Office vs Traveling:** one user has several "local contexts". Clustering visited locations (for example, k-means on geocoordinates) identifies recurring places such as home, office, and gym; a position far from every cluster can be treated as "traveling". Content preferences differ for each cluster.
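A minimal sketch of how a joint time+location context could be derived, assuming the cluster centers have already been found. The coordinates, the 0.5 km radius, the daypart boundaries, and all function names are assumptions made for illustration.

```python
import math
from typing import Dict, Tuple

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two (lat, lon) points in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def label_place(lat: float, lon: float, anchors: Dict[str, Tuple[float, float]],
                max_km: float = 0.5) -> str:
    """Nearest known place within max_km, otherwise 'traveling'."""
    best_label, best_dist = "traveling", max_km
    for label, (alat, alon) in anchors.items():
        d = haversine_km(lat, lon, alat, alon)
        if d <= best_dist:
            best_label, best_dist = label, d
    return best_label


def daypart(hour: int) -> str:
    """Coarse time-of-day bucket (boundaries are arbitrary for this sketch)."""
    if 5 <= hour < 11:
        return "morning"
    if 11 <= hour < 17:
        return "midday"
    if 17 <= hour < 23:
        return "evening"
    return "night"


def context_key(lat: float, lon: float, hour: int,
                anchors: Dict[str, Tuple[float, float]]) -> Tuple[str, str]:
    """Joint (place, daypart) context used to look up or condition preferences."""
    return (label_place(lat, lon, anchors), daypart(hour))


# Hypothetical cluster centers for one user
anchors = {"home": (52.5200, 13.4050), "office": (52.5300, 13.3800)}
print(context_key(52.5301, 13.3802, 8, anchors))
print(context_key(52.5301, 13.3802, 19, anchors))
```

The two calls share the same coordinates but produce different context keys, which is the "morning near office = coffee" versus "evening near office = bar" distinction.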

Why do location-aware recommendations require accounting for TIME in addition to coordinates?

Session-Based Recommendations: modeling short-term intent

Classic collaborative filtering uses the user's entire history. But user intent **within a session** shifts: a user who came looking for sneakers may be browsing socks and shorts five clicks later. Session-based recommendations model the current intent from the sequence of actions in the session, with little or no reliance on long-term history.
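To make the idea concrete, here is a deliberately simple stand-in for a session model: a first-order Markov baseline that counts which item tends to follow which inside a session. It is not GRU4Rec or SASRec, only the smallest model that predicts from the current session rather than from a user profile. The item names and function names are invented for the example.

```python
from collections import Counter, defaultdict
from typing import Dict, List


def fit_transitions(sessions: List[List[str]]) -> Dict[str, Counter]:
    """Count item -> next-item transitions inside each session."""
    transitions: Dict[str, Counter] = defaultdict(Counter)
    for session in sessions:
        for current_item, next_item in zip(session, session[1:]):
            transitions[current_item][next_item] += 1
    return transitions


def recommend_next(transitions: Dict[str, Counter], session: List[str], k: int = 3) -> List[str]:
    """Rank candidates by how often they followed the last item of the session."""
    if not session:
        return []
    seen = set(session)
    followers = transitions.get(session[-1], Counter())
    ranked = [item for item, _ in followers.most_common() if item not in seen]
    return ranked[:k]


sessions = [
    ["sneakers", "socks", "shorts"],
    ["sneakers", "socks"],
    ["sneakers", "laces"],
    ["socks", "shorts"],
]
transitions = fit_transitions(sessions)
print(recommend_next(transitions, ["sneakers"]))
print(recommend_next(transitions, ["sneakers", "socks"]))
```

No user id appears anywhere in the code: the prediction depends only on what happened in the current session.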

| Model | Architecture | Advantage | Disadvantage |
|---|---|---|---|
| GRU4Rec (2016) | GRU | Effective for short sessions | Poor long-range dependency capture |
| SASRec (2018) | Transformer (causal) | Long-range dependencies, parallel training | More parameters, slower |
| BERT4Rec (2019) | BERT (bidirectional) | Bidirectional context | Masked-item training differs from next-item prediction, so it cannot be used as-is for online inference |
| FMLP-Rec (2022) | MLP + FFT | Faster than Transformer, competitive quality | Less studied |
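The "causal" versus "bidirectional" distinction in the comparison above comes down to which positions each item in the sequence may attend to. The sketch below shows only the masking step with plain Python lists; it is not a full attention layer, and the helper names are chosen for this example.

```python
import math
from typing import List


def causal_mask(n: int) -> List[List[bool]]:
    """Position i may attend only to positions 0..i (SASRec-style)."""
    return [[j <= i for j in range(n)] for i in range(n)]


def bidirectional_mask(n: int) -> List[List[bool]]:
    """Every position may attend to every other position (BERT4Rec-style)."""
    return [[True] * n for _ in range(n)]


def masked_softmax(scores: List[float], mask_row: List[bool]) -> List[float]:
    """Softmax over the allowed positions; masked positions get weight 0."""
    exps = [math.exp(s) if allowed else 0.0 for s, allowed in zip(scores, mask_row)]
    total = sum(exps)
    return [e / total for e in exps]


scores = [0.0, 0.0, 0.0]  # equal raw attention scores for a 3-item session
print(masked_softmax(scores, causal_mask(3)[1]))         # second item cannot see the third
print(masked_softmax(scores, bidirectional_mask(3)[1]))  # second item sees all three
```

With a causal mask the model never looks at future items, so the same forward pass can be used to predict the next item at serving time.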

How do session-based recommendations differ from classic collaborative filtering?

Multi-Task Learning: joint optimization of multiple objectives

YouTube in 2019 found that optimizing only for clicks (CTR) promoted clickbait, while optimizing only for watch time promoted long, boring videos. **Multi-task learning** addresses this: jointly optimizing CTR + watch time + like probability + "no regret" produced content users actually wanted to see.

**Tasks in MTL:** CTR (click-through rate), CVR (conversion), watch time, skip rate, like/share/save. Each task has its own labels and loss. Final scoring is a weighted combination: `score = w1*CTR + w2*watch_time - w3*skip_rate`. Weights are tuned via A/B tests.
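The weighted combination above can be written out directly. In this sketch the weights, the example predictions, and the assumption that watch time is normalized to [0, 1] are all illustrative; in a real system the weights would come from tuning such as A/B tests.

```python
from dataclasses import dataclass


@dataclass
class TaskPredictions:
    ctr: float         # predicted click probability
    watch_time: float  # expected watch time, assumed normalized to [0, 1]
    skip_rate: float   # predicted probability of an early skip


def final_score(p: TaskPredictions, w1: float = 1.0, w2: float = 0.5, w3: float = 0.8) -> float:
    """score = w1*CTR + w2*watch_time - w3*skip_rate (weights are placeholders)."""
    return w1 * p.ctr + w2 * p.watch_time - w3 * p.skip_rate


clickbait = TaskPredictions(ctr=0.30, watch_time=0.10, skip_rate=0.60)
solid = TaskPredictions(ctr=0.15, watch_time=0.70, skip_rate=0.10)

# CTR-only ranking prefers the clickbait item...
print(final_score(clickbait, w2=0.0, w3=0.0), final_score(solid, w2=0.0, w3=0.0))
# ...while the combined score prefers the item people actually watch.
print(final_score(clickbait), final_score(solid))
```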

**Common misconception:** multi-task learning complicates the system without real gain, so it's better to build separate specialized models.

**In practice:** MTL can improve each task through shared representations: signal from one task helps another. MMoE (Multi-gate Mixture-of-Experts) gives each task its own gate over a set of shared experts, so conflicting tasks can weight the experts differently and follow different specialized paths. One inference pass instead of N independent models also means lower latency.

Shared lower layers in MTL act as regularization - rare signals (likes) get more training signal through shared features with frequent events (clicks). N separate models don't have this effect.
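The structure of a multi-gate mixture-of-experts model can be sketched as a forward pass with untrained random weights. This is only meant to show the data flow (shared experts, one softmax gate per task, one small tower per task); the class name, layer sizes, and task names are assumptions, and a real model would be trained with a deep learning framework.

```python
import math
import random
from typing import Dict, List, Tuple


def linear(x: List[float], weights: List[List[float]]) -> List[float]:
    """Matrix-vector product: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def relu(values: List[float]) -> List[float]:
    return [max(0.0, a) for a in values]


def softmax(values: List[float]) -> List[float]:
    m = max(values)
    exps = [math.exp(a - m) for a in values]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(a: float) -> float:
    return 1.0 / (1.0 + math.exp(-a))


class TinyMMoE:
    """Shared experts + one gate and one tower per task (random, untrained weights)."""

    def __init__(self, in_dim: int, expert_dim: int, n_experts: int,
                 tasks: List[str], seed: int = 0) -> None:
        rng = random.Random(seed)

        def rand_matrix(rows: int, cols: int) -> List[List[float]]:
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        self.experts = [rand_matrix(expert_dim, in_dim) for _ in range(n_experts)]
        self.gates = {t: rand_matrix(n_experts, in_dim) for t in tasks}
        self.towers = {t: rand_matrix(1, expert_dim) for t in tasks}

    def forward(self, x: List[float]) -> Tuple[Dict[str, float], Dict[str, List[float]]]:
        # Expert outputs are computed once and shared by all tasks.
        expert_out = [relu(linear(x, e)) for e in self.experts]
        preds, gate_weights = {}, {}
        for task, gate in self.gates.items():
            g = softmax(linear(x, gate))  # task-specific mixture over experts
            mixed = [sum(g[e] * expert_out[e][d] for e in range(len(expert_out)))
                     for d in range(len(expert_out[0]))]
            preds[task] = sigmoid(linear(mixed, self.towers[task])[0])
            gate_weights[task] = g
        return preds, gate_weights


model = TinyMMoE(in_dim=4, expert_dim=3, n_experts=2, tasks=["ctr", "like"])
preds, gate_weights = model.forward([1.0, 0.0, 0.5, -0.2])
print(preds)
print(gate_weights)
```

Both task predictions come out of a single pass over the shared experts, while each task's gate is free to weight those experts differently.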

YouTube moved to multi-task learning (CTR + watch time + satisfaction). Why is CTR-only optimization insufficient?

Summary: Context-Aware Recommendations

  • **Temporal context:** sin/cos encoding for cyclicity; time decay reduces weight of old interactions; seasonality vs circadian patterns
  • **Location context:** haversine distance + geohash for filtering; home/office/travel clustering; time+location together stronger than either alone
  • **Session-based:** GRU4Rec (2016, RNN) -> SASRec (2018, self-attention); current intent matters more than long-term history for short sessions
  • **Multi-task learning:** MMoE - shared experts + task-specific gates; CTR + watch time + satisfaction; joint optimization often beats N separate models

Related Topics

Context-aware recommendations build on baseline algorithms and enrich them with context.

  • Multi-Objective and Re-Ranking - Multi-task scores must be balanced during final ranking
  • Matrix Factorization - Contextual factors are added to MF as additional parameters (for example, CAMF from the context-aware recommender systems, or CARS, literature)

Questions for Reflection

  • How do session-based and long-term history recommendations complement each other - in which scenarios does each approach outperform the other?
  • What negative consequences can aggressive optimization for watch time have without accounting for satisfaction metrics?
  • Why is MMoE better than simple shared-bottom MTL when tasks conflict (e.g., CTR vs dwell time)?

Related Lessons

  • rec-04 — Deep models are the base for contextual recommendations
  • rec-05 — Sessions extend sequential recommendation ideas
  • rec-08 — Multi-task scores feed the re-ranking stage
  • ml-30-rnn-lstm — GRU4Rec uses RNN over session events
  • stat-13-time-series — Temporal context mirrors time series structure
  • ml-01-intro