Recommendation system
In this post, we look at the main approaches used in recommendation systems. For each approach, I will walk through the major points. This post is not an in-depth explanation, but a quick review of the approaches followed in research and industry.
The content of this post is as follows:
- Content Based
- Collaborative filtering
- Hybrid Approach
- Matrix factorization
- Deep Learning Based recSys
- Graph Based inference model
- Content Based Method
  - Based on the user's own history
  - Not helpful for the cold-start problem: a new user has no history to build on
  - Features would be things like products the user liked, location, attributes of the products they bought, etc.
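To make the content-based idea concrete, here is a minimal sketch: it averages the feature vectors of the items in a user's history into a profile and ranks the remaining items by cosine similarity to that profile. The item names, the three hand-made feature dimensions, and the function names are all hypothetical; a real system would derive features from product data.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def recommend_content_based(item_features, liked_items, top_n=2):
    """Rank unseen items by similarity to the average of the liked items."""
    dims = len(next(iter(item_features.values())))
    # User profile = mean feature vector of the items in the user's history.
    profile = [
        sum(item_features[i][d] for i in liked_items) / len(liked_items)
        for d in range(dims)
    ]
    scores = {
        item: cosine(profile, feats)
        for item, feats in item_features.items()
        if item not in liked_items
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

# Hypothetical items with hand-made features: [comedy, drama, sci-fi]
ITEM_FEATURES = {
    "sitcom_a": [1.0, 0.0, 0.0],
    "sitcom_b": [0.9, 0.1, 0.0],
    "space_opera": [0.0, 0.2, 1.0],
    "dramedy": [0.6, 0.6, 0.0],
}

RECOMMENDED = recommend_content_based(ITEM_FEATURES, liked_items=["sitcom_a"])
print(RECOMMENDED)
```

Note that the function needs at least one liked item, which is exactly why this approach has nothing to offer a user with an empty history.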
- Collaborative filtering Approach:
  - Interaction-based features
    - user-user interaction (I like sitcom TV series, so it finds users with the same taste as me and recommends the products/movies they liked)
    - item-item interaction (recommend similar products, as on Amazon)
  - Needs existing interactions, so it struggles with the cold-start problem for brand-new users or items
  - `KNN` algorithm to measure similarity
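The user-user variant can be sketched in a few lines: compute a similarity between users from their ratings, pick the nearest neighbours who rated the target item, and take a similarity-weighted average. This is only an illustration under simple assumptions (cosine similarity, tiny hypothetical rating data, made-up function names), not a production KNN.

```python
import math

def cosine_sparse(r1, r2):
    """Cosine similarity between two users' rating dicts (item -> rating)."""
    common = set(r1) & set(r2)
    if not common:
        return 0.0
    dot = sum(r1[i] * r2[i] for i in common)
    n1 = math.sqrt(sum(v * v for v in r1.values()))
    n2 = math.sqrt(sum(v * v for v in r2.values()))
    return dot / (n1 * n2)

def predict_rating(ratings, user, item, k=2):
    """Similarity-weighted average over the k nearest users who rated `item`."""
    neighbours = [
        (cosine_sparse(ratings[user], ratings[other]), ratings[other][item])
        for other in ratings
        if other != user and item in ratings[other]
    ]
    neighbours.sort(key=lambda pair: pair[0], reverse=True)
    top = [(s, r) for s, r in neighbours[:k] if s > 0]
    if not top:
        return None  # no similar user has rated this item
    return sum(s * r for s, r in top) / sum(s for s, _ in top)

# Hypothetical ratings on a 1-5 scale.
RATINGS = {
    "me":    {"friends": 5, "seinfeld": 4},
    "alice": {"friends": 5, "seinfeld": 5, "the_office": 5},
    "bob":   {"friends": 1, "the_office": 2, "dark": 5},
}

PREDICTION = predict_rating(RATINGS, "me", "the_office")
print(round(PREDICTION, 2))
```

Because "alice" rates sitcoms much like "me", her 5-star rating dominates the prediction. A user with no ratings has zero similarity to everyone, so the function returns `None` for them.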
- Hybrid Approach:
  - User history
  - User-user or item-item interactions
  - Most companies use this approach
- Matrix factorization:
  - We need to predict the missing entries in the user-item rating matrix. This can be done using `SVD` as `R = U S V'`, where `U` holds the `user features`, `S` holds the `singular values` and `V` holds the `item features`
  - It can also be done using an `alternating optimization approach` with the loss function `|r_{ij} - u_i S v_j|^2`
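As a rough illustration of filling in missing ratings, the sketch below fits `r_ij ≈ u_i · v_j` on the observed entries only. Two simplifications are my own assumptions: the diagonal `S` is folded into the factors, and plain stochastic gradient descent is used instead of the alternating scheme mentioned above, simply because it is shorter to write. The data, hyperparameters and function names are hypothetical.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def factorize(ratings, n_users, n_items, k=2, lr=0.02, reg=0.02,
              epochs=2000, seed=0):
    """Fit r_ij ~ u_i . v_j on observed (user, item, rating) triples with SGD."""
    rng = random.Random(seed)
    U = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for i, j, r in ratings:
            err = r - dot(U[i], V[j])
            for f in range(k):
                u, v = U[i][f], V[j][f]
                # Gradient step on the squared error plus L2 regularization.
                U[i][f] += lr * (err * v - reg * u)
                V[j][f] += lr * (err * u - reg * v)
    return U, V

def rmse(ratings, U, V):
    """Root mean squared error on the given observed triples."""
    total = sum((r - dot(U[i], V[j])) ** 2 for i, j, r in ratings)
    return math.sqrt(total / len(ratings))

# Hypothetical observed ratings: (user index, item index, rating).
RATINGS = [
    (0, 0, 5), (0, 1, 4), (0, 2, 1),
    (1, 0, 4), (1, 1, 5), (1, 3, 1),
    (2, 2, 5), (2, 3, 4), (2, 0, 1),
]

U, V = factorize(RATINGS, n_users=3, n_items=4)
print("train RMSE:", round(rmse(RATINGS, U, V), 3))
print("predicted rating for user 0, item 3:", round(dot(U[0], V[3]), 2))
```

The entry for user 0 and item 3 is never observed during training; its prediction comes purely from the learned latent factors.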
- Probabilistic Matrix Factorization:
  - We can include known `user` and `item` history/features as well, to learn a better `latent representation`.
  - Loss function is `|r_{mn} - u_n v_m|^2 + |u_n - W_u a_n|^2 + |v_m - W_v b_m|^2`, where `a_n` is the feature or history of the `nth user` and `b_m` is the history/feature of the `mth item`.
  - We can even add regularization on `u_n` and `v_m`
  - Works fantastically on `Netflix-movies`
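The loss above can be evaluated directly, which helps to see what each term penalizes. The sketch below is only an illustration: it assumes the rating term is summed over observed ratings and each side-information term is counted once per user or item, and all names and numbers are hypothetical.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def sq_dist(a, b):
    """Squared Euclidean distance |a - b|^2."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def pmf_loss(ratings, U, V, W_u, W_v, A, B):
    """Rating error + user side-information term + item side-information term."""
    rating_term = sum(
        (r - sum(x * y for x, y in zip(U[n], V[m]))) ** 2
        for n, m, r in ratings
    )
    # Each latent vector is pulled towards a linear map of its known features.
    user_term = sum(sq_dist(U[n], matvec(W_u, A[n])) for n in range(len(U)))
    item_term = sum(sq_dist(V[m], matvec(W_v, B[m])) for m in range(len(V)))
    return rating_term + user_term + item_term

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
LOSS = pmf_loss(
    ratings=[(0, 0, 3.0)],          # user 0 rated item 0 with 3 stars
    U=[[1.0, 0.0]], V=[[2.0, 1.0]],  # latent vectors u_0 and v_0
    W_u=IDENTITY, W_v=IDENTITY,
    A=[[1.0, 0.0]],                  # user features a_0 (match u_0 exactly)
    B=[[2.0, 0.0]],                  # item features b_0 (differ from v_0)
)
print(LOSS)
```

Here the predicted rating is 2 against an observed 3 (error term 1), the user term is 0, and the item term is 1, so the total is 2.0.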
- Deep Learning Based
  - deep and shallow network approach
    - The `deep network` uses `word embeddings` of the `product description`, and the `shallow network` uses the `user history` or `meta-data` as features
    - implemented in the `YouTube recSys`
  - There are many possible ways to build the network and use features; play with
    - `word embedding`
    - time series content
    - time distributed layer for parsing the documents on an item
    - tfidf features
    - svd features
    - can even use `doc2vec` for each document. Use gensim doc2vec for training. **Main idea** is that, while training, we use the same approach as `word2vec`, except that we add a document tag, which maintains the context for each doc as well
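Of the feature options listed, TF-IDF is the easiest to show without any library. The sketch below uses one common unsmoothed variant (term frequency times `log(N / document frequency)`); libraries often use smoothed formulas, so the exact numbers will differ. The item descriptions and function name are hypothetical.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Number of documents each term appears in.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    features = []
    for doc in docs:
        counts = Counter(doc)
        features.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return features

# Hypothetical item descriptions, already split into tokens.
DESCRIPTIONS = [
    "funny sitcom about friends".split(),
    "dark sitcom about office life".split(),
    "space drama about survival".split(),
]

FEATURES = tfidf(DESCRIPTIONS)
for term, weight in sorted(FEATURES[0].items(), key=lambda kv: -kv[1]):
    print(f"{term}: {weight:.3f}")
```

A word such as "about" that appears in every description gets weight zero, while words unique to one item get the highest weights. The resulting vectors can be fed to a network as item features.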
- Another Deep Learning Approach:
  - two networks, one for `users` and the other for `items`
  - Compute the `user-item` interaction by `dot or cosine product`
  - Build more dense layers to get a more complex representation
  - use `multiclass cross entropy loss` for `5-star rating`
  - We can also use `sparse implicit feedback features`, which are binary in nature (`implicit feedback` is generated by the system from `click-based` events, while `explicit feedback` is collected from `likes, reviews and purchasing history`)
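The forward pass of such a two-network model can be sketched without a deep learning framework. This is only an untrained toy under several assumptions of mine: one dense ReLU layer per tower, an element-wise product as the interaction (its sum is the dot product), a final dense layer producing five logits, and random hypothetical weights. Training by backpropagation is omitted.

```python
import math
import random

def dense(x, weights, bias, relu=True):
    """Fully connected layer, optionally followed by ReLU."""
    out = [sum(w * xi for w, xi in zip(row, x)) + b
           for row, b in zip(weights, bias)]
    return [max(0.0, o) for o in out] if relu else out

def softmax(logits):
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, target_class):
    """Multiclass cross-entropy for a single example."""
    return -math.log(probs[target_class])

def init_layer(rng, n_in, n_out):
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

rng = random.Random(0)
USER_LAYER = init_layer(rng, n_in=3, n_out=4)  # user tower
ITEM_LAYER = init_layer(rng, n_in=2, n_out=4)  # item tower
OUT_LAYER = init_layer(rng, n_in=4, n_out=5)   # one logit per star rating

def predict_star_probs(user_x, item_x):
    u = dense(user_x, *USER_LAYER)
    v = dense(item_x, *ITEM_LAYER)
    interaction = [a * b for a, b in zip(u, v)]  # summing this gives u . v
    logits = dense(interaction, *OUT_LAYER, relu=False)
    return softmax(logits)

PROBS = predict_star_probs(user_x=[1.0, 0.0, 1.0], item_x=[0.0, 1.0])
LOSS = cross_entropy(PROBS, target_class=3)  # true rating: 4 stars (index 3)
print([round(p, 3) for p in PROBS], round(LOSS, 3))
```

For implicit feedback, the five-way softmax would typically be replaced by a single sigmoid output with binary cross-entropy.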
- Graph Based Network
  - use graph embedding to build the NN (`deep walk`, `random-walk`)
  - use each feature as a node and each interaction as an edge. For example, for a movie recommendation system where `we have 3 users, 5 movies, 8 actors, 5-star ratings, 10 genres`, we can use each attribute as a node and then `for each interaction, we create an edge`, such as `user1 likes actor1 and gave 4 stars`
  - can use `node2vec`, where each node is represented by a `vector` which is trained on the same concept of `random-walk`
  - The current state-of-the-art `recSys platform` follows this.
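The first half of a DeepWalk-style pipeline, generating random walks over the interaction graph, can be sketched as follows. The edge list is a small hypothetical example, edges are treated as undirected and unweighted (so the star rating on an edge is ignored), and the second half, training a word2vec-style model on the walks to get node vectors, is left out.

```python
import random
from collections import defaultdict

def build_graph(edges):
    """Undirected adjacency list from (node, node) pairs."""
    graph = defaultdict(list)
    for a, b in edges:
        graph[a].append(b)
        graph[b].append(a)
    return graph

def random_walks(graph, walk_length=5, walks_per_node=2, seed=0):
    """Uniform random walks starting from every node."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        for start in sorted(graph):
            walk = [start]
            while len(walk) < walk_length:
                walk.append(rng.choice(graph[walk[-1]]))
            walks.append(walk)
    return walks

# Hypothetical interactions: users, movies, actors and genres are all nodes.
EDGES = [
    ("user1", "movie1"), ("user1", "actor1"), ("movie1", "actor1"),
    ("movie1", "genre_comedy"), ("user2", "movie1"), ("user2", "movie2"),
    ("movie2", "genre_comedy"),
]

GRAPH = build_graph(EDGES)
WALKS = random_walks(GRAPH)
print(WALKS[0])
```

Each walk plays the role of a sentence and each node the role of a word, which is why the same training idea as `word2vec` applies. `node2vec` differs mainly in biasing how the next step of the walk is chosen.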
Some practical insights on recommender systems
- Netflix uses hundreds of different base models; the final prediction is the weighted average of all of them. (Generally, non-linear blending is preferred.)
- `Weighted Hybrid`: Choose `10` items from the user's ratings using `collaborative filtering` and `10` using `content based filtering`. Now make one list, weighting the collaborative list by `60%` and the content list by `40%`. Finally, `sort` the list.
- `Mix Hybrid`: Take `5` items from content, `5` from user history, `5` from trending and `5` from others.
- `Switching` (confidence-based): If the user is logged in, switch to `collaborative filtering`; otherwise switch to `content filtering`.
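The three hybrid strategies are simple enough to sketch directly. The function names and toy scores are hypothetical, and the weighted version assumes both recommenders output scores on the same scale.

```python
def weighted_hybrid(collab_scores, content_scores,
                    w_collab=0.6, w_content=0.4, top_n=10):
    """Blend two {item: score} dicts and return the top items, best first."""
    items = set(collab_scores) | set(content_scores)
    blended = {
        item: w_collab * collab_scores.get(item, 0.0)
        + w_content * content_scores.get(item, 0.0)
        for item in items
    }
    return sorted(blended, key=blended.get, reverse=True)[:top_n]

def mixed_hybrid(sources, per_source=5):
    """Take up to `per_source` not-yet-chosen items from each ranked list."""
    result = []
    for ranked in sources:
        fresh = [item for item in ranked if item not in result]
        result.extend(fresh[:per_source])
    return result

def switching_hybrid(logged_in, collab_fn, content_fn):
    """Use collaborative filtering for known users, content filtering otherwise."""
    return collab_fn() if logged_in else content_fn()

COLLAB = {"a": 1.0, "b": 0.5}
CONTENT = {"b": 1.0, "c": 0.9}

print(weighted_hybrid(COLLAB, CONTENT))
print(mixed_hybrid([["a", "b", "c"], ["b", "d"]], per_source=2))
print(switching_hybrid(False, lambda: ["a", "b"], lambda: ["b", "c"]))
```

In the weighted example, item "b" ranks first because it scores reasonably in both lists, even though it tops only one of them.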