Cheatsheet of LaTeX Code for the Most Popular Recommendation and Advertisement Ranking Module Equations
rockingdingo #recommendation #advertisement #ranking #sequential modelling
Ranking is a crucial part of modern commercial recommendation and advertisement systems. It aims to predict the click-through rate (CTR) accurately. This article collects the LaTeX code for some of the most popular ranking equations used in commercial recommendation and ads systems.
Navigation
 1.Ranking model
 1.1 Matrix Factorization
 1.2 Factorization Machine(FM)
 1.3 Collaborative Filtering
 1.4 Wide&Deep
 1.5 DeepFM
 1.6 NFM
 1.7 xDeepFM
 1.8 Bayesian Personalized Ranking(BPR)
 2.Sequential Behavior Modelling
 2.1 DIN
 2.2 DIEN
 2.3 Bert4Rec
 2.4 GRU4Rec
Ranking model

1.1 Matrix Factorization
Equation
Latex Code
X_{m \times n} = A_{m \times k} \times B_{k \times n} \\ A^{*},B^{*} = \arg\min_{A,B} \| X - AB \|^{2}
Explanation
X denotes the preference matrix of dimension m \times n, covering m users and n items. A holds the users' latent representations and B holds the items' latent representations, each modelled as vectors of dimension k. One common approach to matrix factorization is singular value decomposition (SVD). The objective is to minimize the squared error between the preference matrix X and the product AB of the user matrix and the item matrix. See the post on matrix factorization for a detailed explanation.
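
The objective above can also be minimized numerically. The following is a minimal sketch, not the method the article refers to: it swaps SVD for plain gradient descent on \| X - AB \|^{2}, uses only the Python standard library, and the function names, learning rate, step count and toy matrix are all assumptions chosen for illustration.

```python
import random


def squared_error(X, A, B):
    """Return ||X - AB||^2, the sum of squared reconstruction errors."""
    k = len(B)
    return sum(
        (X[u][i] - sum(A[u][f] * B[f][i] for f in range(k))) ** 2
        for u in range(len(X))
        for i in range(len(X[0]))
    )


def factorize(X, k, steps=2000, lr=0.01, seed=0):
    """Approximate X (m x n) by A (m x k) times B (k x n).

    Uses full-batch gradient descent (an assumption of this sketch, not SVD).
    """
    rng = random.Random(seed)
    m, n = len(X), len(X[0])
    A = [[rng.uniform(0.1, 0.5) for _ in range(k)] for _ in range(m)]
    B = [[rng.uniform(0.1, 0.5) for _ in range(n)] for _ in range(k)]
    for _ in range(steps):
        # Residual E = X - AB, computed from the current A and B.
        E = [
            [X[u][i] - sum(A[u][f] * B[f][i] for f in range(k)) for i in range(n)]
            for u in range(m)
        ]
        # Gradients of ||X - AB||^2 with respect to A and B.
        grad_A = [
            [-2.0 * sum(E[u][i] * B[f][i] for i in range(n)) for f in range(k)]
            for u in range(m)
        ]
        grad_B = [
            [-2.0 * sum(E[u][i] * A[u][f] for u in range(m)) for i in range(n)]
            for f in range(k)
        ]
        A = [[A[u][f] - lr * grad_A[u][f] for f in range(k)] for u in range(m)]
        B = [[B[f][i] - lr * grad_B[f][i] for i in range(n)] for f in range(k)]
    return A, B


# Toy rank-1 preference matrix: 3 users, 2 items.
X = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
A, B = factorize(X, k=1)
print(squared_error(X, A, B))
```

Because the toy matrix has rank 1, a single latent dimension (k=1) is enough to reconstruct it almost exactly. Real preference matrices are sparse and usually need regularization, which this sketch leaves out.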

1.2 Factorization Machine(FM)
Equation
Latex Code
\hat{y}_{FM}(x) = w_{0} +\sum^{n}_{i=1} w_{i}x_{i} + \sum^{n}_{i=1}\sum^{n}_{j=i+1} v^{T}_{i}v_{j}x_{i}x_{j}
Explanation
Factorization Machine models the second-order interactions among n features. The prediction is the sum of three parts: the bias term w_{0}, the linear part w_{i}x_{i}, and the second-order interaction part, in which the weight of each feature pair (i, j) is the inner product v^{T}_{i}v_{j} of the two features' latent vectors.
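
As an illustration, the FM prediction can be sketched in plain Python. The function names and the toy parameters are assumptions. The first function follows the double sum literally; the second uses the well-known linear-time rewrite of the pairwise term, 0.5 \sum_{f} [(\sum_{i} v_{i,f}x_{i})^{2} - \sum_{i} v_{i,f}^{2}x_{i}^{2}], which should give the same value.

```python
def fm_predict(x, w0, w, V):
    """FM prediction with the pairwise term computed as a literal double sum."""
    n = len(x)
    linear = sum(w[i] * x[i] for i in range(n))
    pairwise = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(V[i], V[j]))  # v_i^T v_j
            pairwise += dot * x[i] * x[j]
    return w0 + linear + pairwise


def fm_predict_fast(x, w0, w, V):
    """Same prediction, with the pairwise term computed in O(k * n)."""
    n, k = len(x), len(V[0])
    linear = sum(w[i] * x[i] for i in range(n))
    pairwise = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(n))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise


# Toy example: 3 features, latent dimension 2.
x = [1.0, 0.0, 2.0]
w0 = 0.5
w = [0.1, 0.2, 0.3]
V = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(fm_predict(x, w0, w, V), fm_predict_fast(x, w0, w, V))
```

Only the pair (x_{1}, x_{3}) is active in this example, so the prediction is 0.5 + 0.7 + 2.0 = 3.2, up to floating-point rounding.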

1.4 Wide&Deep
Equation
Latex Code
P(Y=1|x) =\sigma(w^{T}_{wide}[x,\phi(x)] + w^{T}_{deep}a^{(l_{f})} + b) \\ a^{(l+1)} = f(W^{(l)}a^{(l)}+b^{(l)})
Explanation
The Wide&Deep model was first introduced by Google in 2016. It combines a wide linear part with a deep neural network. Y is the binary label, \sigma(.) is the sigmoid function, \phi(x) denotes the cross-product transformations of the raw features x, f is the activation function of a hidden layer, and a^{(l_{f})} is the activation of the last layer of the deep neural network. See the paper Wide & Deep Learning for Recommender Systems for more details.
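
A minimal forward-pass sketch of the equation above, under several assumptions: the activation f is taken to be ReLU, the deep input a^{(0)} is passed in directly (in practice it would be built from feature embeddings), and all names and weights are illustrative only.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def relu_layer(W, b, a):
    """One hidden layer: a^(l+1) = ReLU(W^(l) a^(l) + b^(l))."""
    return [
        max(0.0, sum(w_ij * a_j for w_ij, a_j in zip(row, a)) + b_i)
        for row, b_i in zip(W, b)
    ]


def wide_deep_predict(x, phi_x, a0, layers, w_wide, w_deep, b):
    """P(Y=1|x) = sigmoid(w_wide^T [x, phi(x)] + w_deep^T a^(l_f) + b)."""
    a = a0
    for W, bias in layers:  # run the deep part up to its last layer
        a = relu_layer(W, bias, a)
    wide = sum(w * f for w, f in zip(w_wide, x + phi_x))  # [x, phi(x)] is a concatenation
    deep = sum(w * h for w, h in zip(w_deep, a))
    return sigmoid(wide + deep + b)


# Toy example: two raw features and one cross-product feature x_1 * x_2.
x = [1.0, 0.0]
phi_x = [x[0] * x[1]]
a0 = [1.0, 2.0]
layers = [([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0])]
p = wide_deep_predict(x, phi_x, a0, layers,
                      w_wide=[0.2, 0.1, 0.3], w_deep=[1.0, 0.2], b=-0.5)
print(p)
```

In this toy setup the wide part contributes 0.2, the deep part contributes 0.3 and the bias is -0.5, so the logit is approximately zero and the predicted probability is close to 0.5.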

1.5 DeepFM
Equation
Latex Code
\hat{y} = \sigma (y_{FM} + y_{DNN}) \\ y_{FM} = \langle w, x \rangle + \sum^{d}_{j_{1}=1}\sum^{d}_{j_{2}=j_{1}+1} \langle V_{j_{1}}, V_{j_{2}} \rangle x_{j_{1}}x_{j_{2}}
Explanation
The DeepFM model combines a factorization machine (FM) component with the output of a deep neural network. \hat{y} is the predicted probability of the binary label, \sigma(.) is the sigmoid function, y_{FM} is the output of the FM component, and y_{DNN} is the output of the deep component, computed from the last hidden layer of the deep neural network. See the paper DeepFM: A Factorization-Machine based Neural Network for CTR Prediction for more details.
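
The combination step can be sketched as follows. This assumes y_{FM} is a first-order term \langle w, x \rangle plus pairwise terms weighted by inner products of latent vectors, as in the DeepFM paper, and it treats y_{DNN} as a number supplied by some deep network that is not implemented here. All names and values are illustrative.

```python
import math


def y_fm(x, w, V):
    """FM component: <w, x> plus pairwise terms <V_j1, V_j2> * x_j1 * x_j2."""
    d = len(x)
    first_order = sum(w_j * x_j for w_j, x_j in zip(w, x))
    second_order = sum(
        sum(a * b for a, b in zip(V[j1], V[j2])) * x[j1] * x[j2]
        for j1 in range(d)
        for j2 in range(j1 + 1, d)
    )
    return first_order + second_order


def deepfm_predict(x, w, V, y_dnn):
    """y_hat = sigmoid(y_FM + y_DNN); y_dnn is assumed to be given."""
    return 1.0 / (1.0 + math.exp(-(y_fm(x, w, V) + y_dnn)))


# Toy example with two features and latent dimension 2.
x = [1.0, 1.0]
w = [0.5, -0.5]
V = [[1.0, 2.0], [3.0, -1.0]]
print(y_fm(x, w, V), deepfm_predict(x, w, V, y_dnn=-1.0))
```

Here the first-order term is 0 and the single pairwise term is 1, so y_{FM} = 1. With y_{DNN} = -1 the logit is 0 and the prediction is 0.5.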

Sequential Behavior Modelling

2.1 Deep Interest Network(DIN)
Equation
Latex Code
v_{U}(A)=f(v_{A},e_{1},e_{2},...,e_{H})=\sum^{H}_{j=1} a(e_{j},v_{A})e_{j}=\sum^{H}_{j=1} w_{j}e_{j}
Explanation
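{e_{1}, e_{2}, ..., e_{H}} is the list of embedding vectors of the behaviors of user U, with length H, and v_{A} is the embedding vector of the target item A. a(.) is the attention function, whose scalar output w_{j} is the activation weight of behavior e_{j} with respect to the target item. The user representation v_{U}(A) is the weighted sum of the behavior embeddings, so it changes with the target item. See the paper Deep Interest Network for Click-Through Rate Prediction for more details.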
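
The weighted sum pooling can be sketched as below. The attention function here is a plain dot product, used only as a hypothetical stand-in so that the example runs; in the DIN paper a(.) is a learned feed-forward network. All names and values are illustrative.

```python
def dot_attention(e, v_a):
    """Stand-in attention function (an assumption of this sketch): dot product."""
    return sum(a * b for a, b in zip(e, v_a))


def din_pooling(behaviors, v_a, attention=dot_attention):
    """Return v_U(A) = sum_j w_j * e_j and the weights w_j = a(e_j, v_A)."""
    weights = [attention(e, v_a) for e in behaviors]
    dim = len(v_a)
    pooled = [
        sum(w_j * e[d] for w_j, e in zip(weights, behaviors))
        for d in range(dim)
    ]
    return pooled, weights


# Toy example: H = 3 behaviors with embedding dimension 2.
behaviors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
v_a = [2.0, 0.0]
pooled, weights = din_pooling(behaviors, v_a)
print(weights, pooled)
```

The second behavior is orthogonal to the target item, so its weight is 0 and it does not contribute to the user representation for this particular target.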

2.2 Deep Interest Evolution Network(DIEN)
Equation
Latex Code
L = L_{target} + \alpha \times L_{aux} \\ L_{aux} = -\frac{1}{N}(\sum^{N}_{i=1}\sum_{t} \log \sigma(h^{i}_{t},e^{i}_{b}[t+1]) + \log(1 - \sigma(h^{i}_{t}, \hat{e}^{i}_{b}[t+1])))
Explanation
Deep Interest Evolution Network (DIEN) uses a GRU (Gated Recurrent Unit) to model the dynamics of user behavior sequences. For the final loss function, it adds an auxiliary loss L_{aux}, weighted by \alpha, to the target cross-entropy loss L_{target}. e^{i}_{b} denotes the clicked item sequence of user i, and \hat{e}^{i}_{b} denotes the negative item sequence (exposed but not clicked); both are sampled from users' browsing history. h^{i}_{t} is the hidden state of the behavior GRU at step t, and N is the number of training samples. The auxiliary loss pushes each hidden state towards the item clicked at the next step and away from the sampled negative item. See the paper Deep Interest Evolution Network for Click-Through Rate Prediction for more details.
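
A minimal sketch of the auxiliary loss follows. It assumes that \sigma(h, e) means the sigmoid of the inner product of h and e, which is one possible reading of the notation and not confirmed by the text above. The GRU itself is not implemented: the hidden states are passed in as plain lists, and all names and values are illustrative.

```python
import math


def score(h, e):
    """Assumed form of sigma(h, e): sigmoid of the inner product."""
    return 1.0 / (1.0 + math.exp(-sum(a * b for a, b in zip(h, e))))


def aux_loss(hidden, clicked, negative):
    """L_aux over N users.

    hidden[i][t] is h_t^i; clicked[i][t] and negative[i][t] are the clicked
    and the sampled negative item embeddings at step t.
    """
    n_users = len(hidden)
    total = 0.0
    for h_seq, pos_seq, neg_seq in zip(hidden, clicked, negative):
        for t in range(len(h_seq) - 1):
            total += math.log(score(h_seq[t], pos_seq[t + 1]))
            total += math.log(1.0 - score(h_seq[t], neg_seq[t + 1]))
    return -total / n_users


def total_loss(l_target, l_aux, alpha):
    """L = L_target + alpha * L_aux."""
    return l_target + alpha * l_aux


# One user, two steps: the hidden state at t=0 points at the next clicked item.
hidden = [[[1.0, 0.0], [0.0, 0.0]]]
clicked = [[[0.0, 0.0], [1.0, 0.0]]]
negative = [[[0.0, 0.0], [-1.0, 0.0]]]
print(aux_loss(hidden, clicked, negative))
```

With all-zero hidden states every score is 0.5 and the loss for one user and one step is 2 \ln 2. In the example above the hidden state agrees with the clicked item and disagrees with the negative one, so the loss is lower than that baseline.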