DirL reframes ranking as token-native sequence modeling: user context, long history, and candidates are encoded in one token stream, so a generative backbone can preserve token semantics, model long-range behavior, and cut feature-plumbing complexity in favor of more direct end-to-end ranking.
GenRec Direct Learning (DirL) reframes ranking as token-native sequence modeling. It replaces feature pipelines with end-to-end generative sequence models that score candidates directly.
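To make the "one token stream" idea concrete, here is a minimal sketch of how user context, history, and a candidate might be assembled into a single sequence. It is an illustration under stated assumptions, not DirL's actual tokenizer: the `Token` type, the `build_token_stream` helper, the context-then-history-then-candidate layout, and the `max_history` truncation are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Token:
    kind: str   # "context", "history", or "candidate" (hypothetical labels)
    value: str  # hypothetical token identifier


def build_token_stream(
    context: List[str],
    history: List[Tuple[int, str]],  # (timestamp, item_id) pairs
    candidate: str,
    max_history: int = 512,          # assumed cap, not a value from the article
) -> List[Token]:
    """Assemble one sequence: context tokens, ordered history, then the candidate."""
    # Sort oldest-first so history tokens keep their temporal order.
    ordered = sorted(history, key=lambda event: event[0])
    # Keep only the most recent events when the history exceeds the cap.
    recent = ordered[-max_history:] if max_history > 0 else []

    tokens = [Token("context", c) for c in context]
    tokens += [Token("history", item_id) for _, item_id in recent]
    # Assumed layout: the candidate goes last so its position can see everything before it.
    tokens.append(Token("candidate", candidate))
    return tokens


stream = build_token_stream(
    context=["locale=en"],
    history=[(3, "c"), (1, "a"), (2, "b")],
    candidate="doc42",
    max_history=2,
)
```

In this sketch, placing the candidate last is a design choice that lets a causal backbone attend from the candidate position over the whole context and history; a real system may order or interleave tokens differently.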
Main feature/change and impact
DirL moves ranking from engineered feature vectors to a unified token sequence. User context, history, and candidates become a single tokenized input. A generative sequential backbone processes long histories, preserving token-level semantics. This reduces semantic loss from flattening and enables modeling of long-range dependencies without manual aggregation or late-stage fusion.
Practical implications
Operational complexity shifts from feature plumbing to sequence and systems engineering. Teams trade many feature joins and validations for token design, embedding management, and sequence backbone optimization. Deployment emphasizes memory and compute efficiency, dynamic batching, and quantization. Evaluation must prioritize training velocity and cost alongside predictive metrics to keep iteration cycles practical.
“treat ranking as native sequence learning, not as ‘MLP over engineered features.’”
DirL improves representational fidelity by keeping cross-token semantics inside the model. It also raises engineering priorities around embedding table consolidation and sequence-length tradeoffs. Practical rollout requires pruning oversized embedding tables and right-sizing the backbone to find the performance-cost knee.
DirL’s architecture uses a shared token embedding space, HSTU long-sequence layers, and an MMoE multi-task head. Candidate tokens come from projected document and interaction features. History tokens preserve temporal order, enabling long-horizon preference modeling. The candidate token hidden state drives multi-task scoring, supporting engagement and related objectives directly from the sequence.
Early experiments show improved in-house metrics and increased user time spent. Major barriers are slower training velocity, higher serving cost, and capacity limits on hardware. Current engineering directions include embedding pruning, minimal effective token sets, kernel fusion, and inference optimizations. These measures aim to retain token-native benefits while reducing cost and latency.
Next steps focus on production viability and cost-efficiency. Teams should empirically map sequence length against metric gains and optimize embedding consolidation. If DirL proves efficient at scale, it can simplify modeling workflows and unlock richer long-term behavior modeling across recommender systems.
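The multi-task scoring step described above, where the candidate token's hidden state feeds an MMoE head, can be sketched roughly as follows. This is a toy, dependency-free illustration of a generic multi-gate mixture-of-experts head, not DirL's implementation: the `MMoEHead` class, the layer sizes, the random weights, and the task names `click` and `dwell` are all assumptions, and the HSTU backbone that would produce the hidden state is omitted entirely.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rng, rows, cols):
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


class MMoEHead:
    """Shared experts, one softmax gate and one linear tower per task."""

    def __init__(self, hidden_dim, expert_dim, num_experts, tasks, seed=0):
        rng = random.Random(seed)
        self.tasks = list(tasks)
        self.experts = [rand_matrix(rng, expert_dim, hidden_dim)
                        for _ in range(num_experts)]
        self.gates = {t: rand_matrix(rng, num_experts, hidden_dim)
                      for t in self.tasks}
        self.towers = {t: [rng.gauss(0.0, 0.1) for _ in range(expert_dim)]
                       for t in self.tasks}

    def gate_weights(self, task, hidden):
        """Per-task mixture weights over the shared experts."""
        return softmax(matvec(self.gates[task], hidden))

    def score(self, hidden):
        """Map one candidate hidden state to a probability per task."""
        # Every expert sees the same hidden state (ReLU activation).
        expert_out = [[max(0.0, x) for x in matvec(w, hidden)]
                      for w in self.experts]
        scores = {}
        for task in self.tasks:
            gate = self.gate_weights(task, hidden)
            # Gate-weighted sum of expert outputs, dimension by dimension.
            mixed = [sum(g * out[i] for g, out in zip(gate, expert_out))
                     for i in range(len(expert_out[0]))]
            logit = sum(w * x for w, x in zip(self.towers[task], mixed))
            scores[task] = 1.0 / (1.0 + math.exp(-logit))  # sigmoid
        return scores


# Hypothetical hidden state standing in for the backbone's candidate-position output.
candidate_hidden = [0.1 * i for i in range(8)]
head = MMoEHead(hidden_dim=8, expert_dim=4, num_experts=3, tasks=["click", "dwell"])
scores = head.score(candidate_hidden)
```

The point of the per-task gates is that each objective can weight the shared experts differently while still reading from the same candidate representation, which is what lets several engagement-style objectives be scored from one sequence pass.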
From the Microsoft Developer Community Blog articles
