Over the past couple of years, YouTube has come under fire for its recommender system, with the media suggesting that it promotes violent content or unfairly removes LGBT content for supposedly violating its terms of service. Seemingly in response to all of this, Google has finally released a paper explaining YouTube’s recommender system, including how it makes recommendations and the information it gathers in doing so.
[Related Article: Trust, Control, and Personalization Through Human-Centric AI]
The paper, by Zhe Zhao, Lichan Hong, Li Wei, Jilin Chen, Aniruddh Nath, Shawn Andrews, Aditee Kumthekar, Maheswaran Sathiamoorthy, Xinyang Yi, and Ed Chi, discusses some of the problems that recommender systems in general face, some of the challenges specific to a platform as big as YouTube, and the architecture the team used to build their system.
One of the biggest issues the team had to tackle was scalability. Simply put, no other recommender system has to serve such a large user base or so many individual pieces of content. This meant that the team at Google had to make sure their system would be “effective at training and efficient at serving.”
YouTube’s Recommender System Overview
YouTube’s system learns from two types of user feedback: engagement behaviors (clicks, watches, etc.) and satisfaction behaviors (likes, dislikes). They model their ranking problem as a “combination of classification problems and regression problems with multiple objectives. Given a query, candidate, and context, the ranking model predicts the probabilities of user taking actions such as clicks, watches, likes, and dismissals.”
This is a point-wise prediction system, rather than a pair-wise or list-wise one. While Google acknowledges that the latter two could improve the diversity of recommendations, for the time being a point-wise system is the most efficient for a platform of YouTube’s scale.
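The point-wise, multi-objective setup described above can be sketched as a shared representation feeding a separate head per objective, with each candidate scored independently of the others. This is a minimal illustration, not the paper’s actual architecture: the feature dimensions, task list, and combination weights below are all made-up assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions, not from the paper).
FEATURE_DIM = 16  # concatenated query, candidate, and context features
HIDDEN_DIM = 8
TASKS = ["click", "watch", "like", "dismiss"]  # engagement + satisfaction objectives

# A shared bottom layer plus one sigmoid head per binary objective.
W_shared = rng.normal(scale=0.1, size=(FEATURE_DIM, HIDDEN_DIM))
heads = {task: rng.normal(scale=0.1, size=HIDDEN_DIM) for task in TASKS}

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def predict(features):
    """Point-wise prediction: score one (query, candidate, context)
    example at a time, returning a probability per objective."""
    hidden = np.tanh(features @ W_shared)
    return {task: float(sigmoid(hidden @ w)) for task, w in heads.items()}

# Rank candidates by a weighted combination of the per-task probabilities.
# These weights are invented for illustration; the real system learns/tunes
# how objectives are traded off.
COMBINE_WEIGHTS = {"click": 0.3, "watch": 0.4, "like": 0.4, "dismiss": -0.5}

def ranking_score(features):
    probs = predict(features)
    return sum(COMBINE_WEIGHTS[t] * p for t, p in probs.items())

candidates = [rng.normal(size=FEATURE_DIM) for _ in range(5)]
ranked = sorted(candidates, key=ranking_score, reverse=True)
```

Because scoring is point-wise, each candidate’s probabilities depend only on its own features, which is what makes serving efficient at YouTube’s scale compared to pair-wise or list-wise scoring.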
Training the Model
The team trains their proposed models and baseline models sequentially on the YouTube platform itself, creating and testing the models in their intended environment rather than simulating it elsewhere. Moreover, by training sequentially, Google’s models can learn from and adapt to the most recent data as it arrives. They evaluate both offline (“AUC for classification task and squared error for regression tasks”) and online, through live A/B testing.
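The offline metrics the paper names, AUC for the classification tasks and squared error for the regression tasks, can be computed directly. The sketch below is a minimal hand-rolled version of both, not the team’s evaluation pipeline; the toy labels and scores are invented for illustration.

```python
import numpy as np

def auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a
    randomly chosen positive example outranks a randomly chosen negative."""
    labels = np.asarray(labels)
    scores = np.asarray(scores)
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    # Count pairwise wins; ties count as half a win.
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

def squared_error(targets, predictions):
    """Mean squared error for regression-style objectives
    (e.g. a predicted engagement quantity)."""
    targets = np.asarray(targets, dtype=float)
    predictions = np.asarray(predictions, dtype=float)
    return float(np.mean((targets - predictions) ** 2))

# Toy classification task: did the user click?
click_labels = [1, 0, 1, 0]
click_scores = [0.9, 0.2, 0.7, 0.4]
print(auc(click_labels, click_scores))         # 1.0: every positive outranks every negative

# Toy regression task.
print(squared_error([10.0, 0.0], [8.0, 1.0]))  # 2.5
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ordering, which is why it is a natural offline proxy for the classification objectives before a model is exposed to live A/B traffic.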
[Related Article: Designing Better Recommendation Systems with Machine Learning]
The paper concludes with proposed points of discussion, including limitations the authors see in their current model and directions for further research. Some of these include:
- Exploring new model architecture for multi-objective ranking which balances stability, trainability, and expressiveness
- Understanding and learning to factorize
- Model compression to reduce serving costs