Post-Training Generative Recommenders with Advantage-Weighted Supervised Finetuning
Original URL: https://netflixtechblog.com/post-training-generative-recommenders-with-advantage-weighted-supervised-finetuning-61a538d717a9
Article Written: October 20, 2023
Added: October 27, 2025
Type: tech2
Summary
This article covers the challenges of post-training generative recommender systems and introduces an algorithm called Advantage-Weighted Supervised Finetuning (A-SFT). The authors explain why traditional reinforcement learning methods fit recommendation contexts poorly: counterfactual observations are unavailable, and reward models are noisy. A-SFT aims to improve recommendation quality by combining supervised fine-tuning with advantage-based weighting borrowed from reinforcement learning. The reported results show A-SFT outperforming existing methods at aligning generative models with user preferences.