## Cheatsheet of LaTeX Code for the Most Popular Recommendation and Advertisement Ranking Module Equations


Ranking is a crucial stage of modern commercial recommendation and advertisement systems. It aims to solve the problem of accurate click-through rate (CTR) prediction. In this article, we provide some of the most popular ranking equations used in commercial recommendation and ads systems.

### Ranking model

• #### Matrix Factorization

##### Latex Code
            X_{m \times n} = A_{m \times k} \times B_{k \times n} \\
A^{*},B^{*} = argmin_{A,B} |X-AB|^{2}

##### Explanation

X denotes a preference matrix of dimension m \times n, indicating m users and n items. Users' latent representations and items' latent representations are modelled as vectors of dimension k. One common approach to solving matrix factorization is singular value decomposition (SVD), which minimizes the difference between the preference matrix and the product of the user matrix and the item matrix. See this post on matrix factorization for a detailed explanation.
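A minimal numpy sketch of the SVD approach, using a small illustrative preference matrix (the values and the choice of k are made up for the example):

```python
import numpy as np

# Hypothetical 4-user x 3-item preference matrix X (values are illustrative).
X = np.array([[5.0, 3.0, 0.0],
              [4.0, 0.0, 1.0],
              [1.0, 1.0, 5.0],
              [0.0, 1.0, 4.0]])

# Truncated SVD with k latent factors: X ~= A @ B, where A (m x k) holds
# user latent vectors and B (k x n) holds item latent vectors.
k = 2
U, s, Vt = np.linalg.svd(X, full_matrices=False)
A = U[:, :k] * s[:k]   # m x k user factors (singular values folded in)
B = Vt[:k, :]          # k x n item factors
X_hat = A @ B          # rank-k reconstruction minimizing ||X - AB||^2

print(np.round(X_hat, 2))
```

By the Eckart–Young theorem, this truncated SVD gives the rank-k matrix closest to X in Frobenius norm, which is exactly the argmin in the equation above.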

• #### 1.2 Factorization Machine (FM)

##### Latex Code
            \hat{y}_{FM}(x) = w_{0} +\sum^{n}_{i=1} w_{i}x_{i} + \sum^{n}_{i=1}\sum^{n}_{j=i+1} v^{T}_{i}v_{j}x_{i}x_{j}

##### Explanation

Factorization Machine calculates the second-order interactions among n features. The prediction consists of three parts: the bias term w_{0}, the linear part \sum^{n}_{i=1} w_{i}x_{i}, and the second-order interaction part \sum^{n}_{i=1}\sum^{n}_{j=i+1} v^{T}_{i}v_{j}x_{i}x_{j}.
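A small numpy sketch of the FM prediction (feature values, weights, and sizes are illustrative). It also checks the O(nk) reformulation of the pairwise term from the original FM paper against the naive O(n²k) double sum:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 6, 4               # n features, k-dim factor vectors (illustrative sizes)
x = rng.random(n)         # feature values
w0 = 0.1                  # bias term
w = rng.random(n)         # linear weights
V = rng.random((n, k))    # row i is the factor vector v_i

# Naive O(n^2 k) second-order term: sum over pairs of <v_i, v_j> x_i x_j.
pairwise = sum(V[i] @ V[j] * x[i] * x[j]
               for i in range(n) for j in range(i + 1, n))

# Equivalent O(n k) form:
# 0.5 * sum_f ((sum_i v_{i,f} x_i)^2 - sum_i v_{i,f}^2 x_i^2)
fast = 0.5 * np.sum((V.T @ x) ** 2 - (V ** 2).T @ (x ** 2))

y_hat = w0 + w @ x + fast
print(np.isclose(pairwise, fast))  # the two forms agree
```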

• #### 1.4 Wide&Deep

##### Latex Code
            P(Y=1|x) =\sigma(w^{T}_{wide}[x,\phi(x)] + w^{T}_{deep}a^{(l_{f})} + b) \\
a^{(l+1)} = f(W^{(l)}a^{(l)}+b^{(l)})

##### Explanation

The Wide&Deep model was first introduced by Google in 2016. It combines a wide linear part with a deep neural network part. Y is the binary label, \sigma(.) denotes the sigmoid function, and a^{(l_{f})} is the activation of the final layer of the deep neural network. See the paper Wide & Deep Learning for Recommender Systems for more details.
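A toy numpy sketch of one Wide&Deep forward pass, with made-up input sizes and random weights (a real model learns these jointly and feeds embeddings to the deep part):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

x_wide = rng.random(10)  # [x, phi(x)]: raw plus cross-product transformed features
x_deep = rng.random(8)   # dense inputs fed to the deep part

w_wide = rng.normal(size=10)
W1, b1 = rng.normal(size=(4, 8)), np.zeros(4)  # one layer: a^{(l+1)} = f(W a + b)
w_deep = rng.normal(size=4)
b = 0.0

a_lf = np.maximum(0.0, W1 @ x_deep + b1)           # ReLU output of last deep layer
p = sigmoid(w_wide @ x_wide + w_deep @ a_lf + b)   # joint logistic output P(Y=1|x)
print(p)
```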

• #### 1.5 DeepFM

##### Latex Code
            \hat{y} = \sigma (y_{FM} + y_{DNN}) \\
y_{FM} = \langle w,x \rangle + \sum^{d}_{j_{1}=1}\sum^{d}_{j_{2}=j_{1}+1} \langle V_{j_{1}},V_{j_{2}} \rangle x_{j_{1}} x_{j_{2}}

##### Explanation

The DeepFM model combines a factorization machine (FM) component with a deep neural network component, with the two components sharing the same feature embeddings. \hat{y} is the predicted CTR, \sigma(.) denotes the sigmoid function, y_{FM} is the output of the FM component, and y_{DNN} is the output of the deep component. See the paper DeepFM: A Factorization-Machine based Neural Network for CTR Prediction for more details.
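A minimal numpy sketch of the DeepFM combination, with illustrative sizes and random weights; the key point it shows is both components reading the same embedding matrix V:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
d, k = 5, 3                   # d features, k-dim embeddings (illustrative)
x = rng.random(d)
w = rng.normal(size=d)        # first-order weights
V = rng.normal(size=(d, k))   # embeddings shared by the FM and deep parts

# FM component: linear term plus pairwise interactions (O(dk) identity).
y_fm = w @ x + 0.5 * np.sum((V.T @ x) ** 2 - (V ** 2).T @ (x ** 2))

# Deep component: flatten the same embeddings (scaled by feature values) into an MLP.
h0 = (V * x[:, None]).ravel()            # d*k input vector
W1, b1 = rng.normal(size=(4, d * k)), np.zeros(4)
w2 = rng.normal(size=4)
y_dnn = w2 @ np.maximum(0.0, W1 @ h0 + b1)

y_hat = sigmoid(y_fm + y_dnn)            # \hat{y} = sigma(y_FM + y_DNN)
print(0.0 < y_hat < 1.0)
```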

### Sequence Models

• #### 2.1 Deep Interest Network (DIN)

##### Latex Code
            v_{U}(A)=f(v_{A},e_{1},e_{2},...,e_{H})=\sum^{H}_{j=1} a(e_{j},v_{A})=\sum^{H}_{j=1} w_{j}e_{j}

##### Explanation

{e_{1}, e_{2}, ..., e_{H}} is the list of embedding vectors of the behaviors of user U with length H, and v_{A} is the embedding vector of candidate item A. a(.,.) is an attention function whose scalar output serves as the weight w_{j}. See the paper Deep Interest Network for Click-Through Rate Prediction for more details.
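A toy numpy sketch of the attention-weighted sum v_{U}(A). The dot-product score and softmax normalization here are simplifications for illustration: the paper uses a small MLP over [e_j, e_j * v_A, v_A] as a(.,.) and does not require the weights to sum to one:

```python
import numpy as np

rng = np.random.default_rng(0)
H, dim = 4, 3             # H behavior embeddings of size dim (illustrative)
E = rng.random((H, dim))  # e_1 ... e_H: user behavior embeddings
v_A = rng.random(dim)     # embedding of candidate ad A

# a(e_j, v_A): a toy dot-product attention score (DIN uses a learned MLP).
scores = E @ v_A
w = np.exp(scores) / np.exp(scores).sum()  # weights w_j (softmax for clarity only)
v_U = w @ E                                # v_U(A) = sum_j w_j e_j
print(v_U.shape)
```

The resulting v_{U}(A) varies with the candidate item A, which is DIN's core idea: different ads activate different parts of the user's behavior history.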

• #### 2.2 Deep Interest Evolution Network (DIEN)

##### Latex Code
L = L_{target} + \alpha \times L_{aux} \\
L_{aux} = -\frac{1}{N}\sum^{N}_{i=1}\sum_{t}\left( \log \sigma(h^{i}_{t},e^{i}_{b}[t+1]) + \log(1-\sigma(h^{i}_{t},\hat{e}^{i}_{b}[t+1])) \right)

##### Explanation

Deep Interest Evolution Network (DIEN) uses a GRU (Gated Recurrent Unit) to model the dynamics of user behavior sequences. For the final loss function, it adds an auxiliary loss to the target cross-entropy loss. e^{i}_{b} denotes the clicked item sequence, and \hat{e}^{i}_{b} denotes the negative item sequence (exposed but not clicked), which is sampled from users' browsing history. h^{i}_{t} represents the hidden state of the behavior GRU at step t. See the paper Deep Interest Evolution Network for Click-Through Rate Prediction for more details.
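A numpy sketch of the auxiliary loss with random stand-ins for the GRU states and item embeddings (sizes, the sigmoid-of-inner-product scorer, and the placeholder L_target and alpha values are all assumptions for illustration):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
N, T, dim = 2, 5, 4                   # N users, T steps, dim-size states (illustrative)
h = rng.normal(size=(N, T, dim))      # GRU hidden states h_t^i
e_pos = rng.normal(size=(N, T, dim))  # clicked item embeddings e_b^i[t+1]
e_neg = rng.normal(size=(N, T, dim))  # sampled negatives \hat{e}_b^i[t+1]

# sigma(h, e) modelled here as a sigmoid of the inner product, one simple scorer.
pos = sigmoid(np.einsum('ntd,ntd->nt', h, e_pos))
neg = sigmoid(np.einsum('ntd,ntd->nt', h, e_neg))

# L_aux = -1/N * sum_i sum_t [log sigma(h_t, e[t+1]) + log(1 - sigma(h_t, e_hat[t+1]))]
L_aux = -np.mean(np.sum(np.log(pos) + np.log(1.0 - neg), axis=1))

alpha, L_target = 0.5, 1.0            # placeholder weight and target loss
L = L_target + alpha * L_aux
print(L_aux > 0.0)
```

The auxiliary loss supervises every GRU step to score the actually clicked next item above a sampled negative, which gives the interest-extraction layer a direct training signal instead of relying only on the final click label.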