TABLE OF CONTENTS
machine learning
- Area Under Uplift Curve AUUC
- Average Treatment Effect ATE
- Bellman Equation
- Conditional Average Treatment Effect CATE
- Deep Kernel Learning
- Diffusion Model Forward Process
- Diffusion Model Forward Process Reparameterization
- Diffusion Model Reverse Process
- Diffusion Model Variational Lower Bound
- Diffusion Model Variational Lower Bound Loss
EQUATION LIST
Diffusion Model Forward Process Reparameterization
#machine learning #diffusion
$$x_{t}=\sqrt{\alpha_{t}}x_{t-1}+\sqrt{1-\alpha_{t}} \epsilon_{t-1}\\=\sqrt{\alpha_{t}\alpha_{t-1}}x_{t-2} + \sqrt{1-\alpha_{t}\alpha_{t-1}} \bar{\epsilon}_{t-2}\\=\text{...}\\=\sqrt{\bar{\alpha}_{t}}x_{0}+\sqrt{1-\bar{\alpha}_{t}}\epsilon \\\alpha_{t}=1-\beta_{t}, \bar{\alpha}_{t}=\prod_{i=1}^{t}\alpha_{i}$$
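Because $\bar{\alpha}_{t}$ is a cumulative product, $x_{t}$ can be sampled from $x_{0}$ in a single step. A minimal NumPy sketch; the linear beta schedule and shapes are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

T = 1000
betas = np.linspace(1e-4, 0.02, T)   # linear beta schedule (assumed)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)      # \bar{alpha}_t = prod_{i<=t} alpha_i

def q_sample(x0, t):
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(abar_t) x_0, (1 - abar_t) I)."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps

x0 = rng.standard_normal((4, 8))     # toy "clean" samples
xt = q_sample(x0, t=500)             # noisy samples at step t = 500
```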
Diffusion Model Reverse Process
#machine learning #diffusion
$$p_\theta(\mathbf{x}_{0:T}) = p(\mathbf{x}_T) \prod^T_{t=1} p_\theta(\mathbf{x}_{t-1} \vert \mathbf{x}_t) \\ p_\theta(\mathbf{x}_{t-1} \vert \mathbf{x}_t) = \mathcal{N}(\mathbf{x}_{t-1}; \boldsymbol{\mu}_\theta(\mathbf{x}_t, t), \boldsymbol{\Sigma}_\theta(\mathbf{x}_t, t))$$
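One ancestral-sampling step of the reverse chain, as a sketch; `mu_theta` and `sigma_theta` below are placeholder assumptions standing in for the learned network, not a trained model:

```python
import numpy as np

rng = np.random.default_rng(1)

def mu_theta(x_t, t):
    # Placeholder for the learned mean; a real model predicts this.
    return 0.99 * x_t

def sigma_theta(x_t, t):
    # Placeholder for the (often fixed) reverse-process std dev.
    return 0.01

def p_sample(x_t, t):
    """Draw x_{t-1} ~ N(mu_theta(x_t, t), Sigma_theta(x_t, t))."""
    noise = rng.standard_normal(x_t.shape) if t > 1 else 0.0
    return mu_theta(x_t, t) + sigma_theta(x_t, t) * noise

x = rng.standard_normal((4, 8))   # start from x_T ~ N(0, I)
for t in range(1000, 0, -1):      # iterate t = T, ..., 1
    x = p_sample(x, t)
```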
Diffusion Model Variational Lower Bound
#machine learning #diffusion
$$\begin{aligned} - \log p_\theta(\mathbf{x}_0) &\leq - \log p_\theta(\mathbf{x}_0) + D_\text{KL}(q(\mathbf{x}_{1:T}\vert\mathbf{x}_0) \| p_\theta(\mathbf{x}_{1:T}\vert\mathbf{x}_0) ) \\ &= -\log p_\theta(\mathbf{x}_0) + \mathbb{E}_{\mathbf{x}_{1:T}\sim q(\mathbf{x}_{1:T} \vert \mathbf{x}_0)} \Big[ \log\frac{q(\mathbf{x}_{1:T}\vert\mathbf{x}_0)}{p_\theta(\mathbf{x}_{0:T}) / p_\theta(\mathbf{x}_0)} \Big] \\ &= -\log p_\theta(\mathbf{x}_0) + \mathbb{E}_q \Big[ \log\frac{q(\mathbf{x}_{1:T}\vert\mathbf{x}_0)}{p_\theta(\mathbf{x}_{0:T})} + \log p_\theta(\mathbf{x}_0) \Big] \\ &= \mathbb{E}_q \Big[ \log \frac{q(\mathbf{x}_{1:T}\vert\mathbf{x}_0)}{p_\theta(\mathbf{x}_{0:T})} \Big] \\ \text{Let }L_\text{VLB} &= \mathbb{E}_{q(\mathbf{x}_{0:T})} \Big[ \log \frac{q(\mathbf{x}_{1:T}\vert\mathbf{x}_0)}{p_\theta(\mathbf{x}_{0:T})} \Big] \geq - \mathbb{E}_{q(\mathbf{x}_0)} \log p_\theta(\mathbf{x}_0) \end{aligned}$$
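The first inequality holds because the added KL term is nonnegative. A quick numerical sanity check of the bound with a toy discrete model (all distributions are random stand-ins, and $T=1$):

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy discrete model with T = 1: states x0, x1 in {0, ..., 4}.
K = 5
p_x1 = np.full(K, 1.0 / K)                         # p(x_1), the prior
p_x0_given_x1 = rng.dirichlet(np.ones(K), size=K)  # p_theta(x_0 | x_1), rows sum to 1
q_x1_given_x0 = rng.dirichlet(np.ones(K), size=K)  # q(x_1 | x_0)

x0 = 3
p_x0 = np.sum(p_x1 * p_x0_given_x1[:, x0])         # marginal p_theta(x_0)

q = q_x1_given_x0[x0]                              # q(x_1 | x_0 = 3)
joint = p_x1 * p_x0_given_x1[:, x0]                # p_theta(x_0, x_1)
L_vlb = np.sum(q * np.log(q / joint))              # E_q[log q(x_1|x_0) / p(x_0, x_1)]

assert -np.log(p_x0) <= L_vlb + 1e-12              # the variational bound holds
```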
Diffusion Model Variational Lower Bound Loss
#machine learning #diffusion
$$\begin{aligned} L_\text{VLB} &= L_T + L_{T-1} + \dots + L_0 \\ \text{where } L_T &= D_\text{KL}(q(\mathbf{x}_T \vert \mathbf{x}_0) \parallel p_\theta(\mathbf{x}_T)) \\ L_t &= D_\text{KL}(q(\mathbf{x}_t \vert \mathbf{x}_{t+1}, \mathbf{x}_0) \parallel p_\theta(\mathbf{x}_t \vert\mathbf{x}_{t+1})) \text{ for }1 \leq t \leq T-1 \\ L_0 &= - \log p_\theta(\mathbf{x}_0 \vert \mathbf{x}_1) \end{aligned}$$
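Each $L_t$ with $1 \leq t \leq T-1$ is a KL divergence between two Gaussians, so it has a closed form. A sketch for the diagonal case; the moments below are made-up numbers:

```python
import numpy as np

def gaussian_kl(mu_q, var_q, mu_p, var_p):
    """Closed-form KL(N(mu_q, var_q) || N(mu_p, var_p)) for diagonal
    Gaussians, summed over dimensions."""
    return 0.5 * np.sum(np.log(var_p / var_q)
                        + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0)

# Illustrative: L_t compares the Gaussian posterior q(x_t | x_{t+1}, x_0)
# against the learned Gaussian p_theta(x_t | x_{t+1}).
mu_q, var_q = np.array([0.1, -0.2]), np.array([0.5, 0.5])
mu_p, var_p = np.array([0.0, 0.0]), np.array([0.6, 0.6])
L_t = gaussian_kl(mu_q, var_q, mu_p, var_p)
```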
Variational AutoEncoder VAE
#machine learning #VAE
$$\log p_{\theta}(x)=\mathbb{E}_{q_{\phi}(z|x)}[\log p_{\theta}(x)] \\ =\mathbb{E}_{q_{\phi}(z|x)}[\log \frac{p_{\theta}(x,z)}{p_{\theta}(z|x)}] \\ =\mathbb{E}_{q_{\phi}(z|x)}[\log [\frac{p_{\theta}(x,z)}{q_{\phi}(z|x)} \times \frac{q_{\phi}(z|x)}{p_{\theta}(z|x)}]] \\ =\mathbb{E}_{q_{\phi}(z|x)}[\log [\frac{p_{\theta}(x,z)}{q_{\phi}(z|x)} ]] +D_{KL}(q_{\phi}(z|x) || p_{\theta}(z|x))\\$$
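The first bracketed term in the last line is the ELBO maximized during training; since the KL term is nonnegative, the ELBO lower-bounds $\log p_{\theta}(x)$. A minimal one-sample Monte Carlo sketch in NumPy; the `decode` function and the encoder outputs are toy assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)

def elbo(x, mu_z, logvar_z, decode):
    """One-sample ELBO for a Gaussian encoder q_phi(z|x) with a standard
    normal prior: E_q[log p_theta(x|z)] - KL(q_phi(z|x) || N(0, I))."""
    std = np.exp(0.5 * logvar_z)
    z = mu_z + std * rng.standard_normal(mu_z.shape)  # reparameterization trick
    recon = -0.5 * np.sum((x - decode(z)) ** 2)       # Gaussian log-lik (up to const)
    kl = 0.5 * np.sum(np.exp(logvar_z) + mu_z ** 2 - 1.0 - logvar_z)
    return recon - kl                                 # lower bound on log p_theta(x)

x = rng.standard_normal(8)
decode = lambda z: np.tanh(z).repeat(4)               # maps R^2 -> R^8, purely illustrative
print(elbo(x, mu_z=np.zeros(2), logvar_z=np.zeros(2), decode=decode))
```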
Bound on Target Domain Error
#machine learning #transfer learning
$$\epsilon_{T}(h) \le \hat{\epsilon}_{S}(h) + \sqrt{\frac{4}{m}(d \log \frac{2em}{d} + \log \frac{4}{\delta })} + d_{\mathcal{H}}(\tilde{\mathcal{D}}_{S}, \tilde{\mathcal{D}}_{T}) + \lambda \\ \lambda = \lambda_{S} + \lambda_{T}$$
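The second term is the usual VC complexity penalty, which shrinks roughly as $O(\sqrt{d \log m / m})$. A quick evaluation with illustrative numbers:

```python
import numpy as np

def vc_complexity_term(m, d, delta):
    """The sqrt((4/m)(d log(2em/d) + log(4/delta))) term of the bound."""
    return np.sqrt((4.0 / m) * (d * np.log(2.0 * np.e * m / d)
                                + np.log(4.0 / delta)))

# Illustrative: m source samples, hypothesis class of VC dimension d.
for m in (1_000, 10_000, 100_000):
    print(m, vc_complexity_term(m, d=50, delta=0.05))
```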
Domain-Adversarial Neural Networks DANN
#machine learning #transfer learning
$$\min_{f} [\frac{1}{m}\sum^{m}_{i=1}\mathcal{L}(f(\textbf{x}^{s}_{i}),y_{i})+\lambda \max_{o}(-\frac{1}{m}\sum^{m}_{i=1}\mathcal{L}^{d}(o(\textbf{x}^{s}_{i}),1)-\frac{1}{m^{'}}\sum^{m^{'}}_{i=1}\mathcal{L}^{d}(o(\textbf{x}^{t}_{i}),0))]$$
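Evaluating the saddle-point objective for fixed network outputs, with random stand-ins for the label predictor $f$ and the domain classifier $o$ (in practice the inner max over $o$ is implemented with a gradient-reversal layer):

```python
import numpy as np

rng = np.random.default_rng(4)

def bce(p, y):
    """Binary cross-entropy of probabilities p against labels y."""
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

m, m_prime, lam = 16, 16, 0.1
y = rng.integers(0, 2, m)                               # source labels
f_src = np.clip(rng.random(m), 1e-6, 1 - 1e-6)          # f(x^s_i), assumed precomputed
o_src = np.clip(rng.random(m), 1e-6, 1 - 1e-6)          # o(x^s_i)
o_tgt = np.clip(rng.random(m_prime), 1e-6, 1 - 1e-6)    # o(x^t_i)

label_loss = bce(f_src, y).mean()
domain_term = -bce(o_src, np.ones(m)).mean() - bce(o_tgt, np.zeros(m_prime)).mean()
objective = label_loss + lam * domain_term              # min over f, max over o
```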
Graph Attention Network GAT
#machine learning #graph #GNN
$$h=\{\vec{h}_{1},\vec{h}_{2},...,\vec{h}_{N}\}, \\ \vec{h}_{i} \in \mathbb{R}^{F} \\ W \in \mathbb{R}^{F \times F^{'}} \\ e_{ij}=a(W\vec{h}_{i},W\vec{h}_{j}) \\ \mathcal{N}_{i}: \text{the neighbourhood of node } i\\ \alpha_{ij}=\text{softmax}_{j}(e_{ij})=\frac{\exp(e_{ij})}{\sum_{k \in \mathcal{N}_{i}} \exp(e_{ik})}$$
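A NumPy sketch of one attention head, instantiating $a(\cdot,\cdot)$ with the paper's single-layer choice $\text{LeakyReLU}(a^{\top}[W\vec{h}_{i} \Vert W\vec{h}_{j}])$; the toy path graph and shapes are assumptions:

```python
import numpy as np

rng = np.random.default_rng(5)

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

N, F, F_out = 5, 4, 3
h = rng.standard_normal((N, F))       # node features h_i in R^F
W = rng.standard_normal((F, F_out))   # shared transform, R^{F x F'}
a = rng.standard_normal(2 * F_out)    # attention vector (concat form)
adj = (np.eye(N) + np.diag(np.ones(N - 1), 1)
       + np.diag(np.ones(N - 1), -1)) > 0          # toy path graph with self-loops

Wh = h @ W
left = Wh @ a[:F_out]                              # a^T [Wh_i || . ] part
right = Wh @ a[F_out:]                             # a^T [ . || Wh_j] part
e = leaky_relu(left[:, None] + right[None, :])     # e_ij for all pairs

e = np.where(adj, e, -np.inf)                      # keep only j in N_i
alpha = np.exp(e - e.max(axis=1, keepdims=True))   # softmax per neighbourhood
alpha /= alpha.sum(axis=1, keepdims=True)
h_out = alpha @ Wh                                 # new node representations
```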
GraphSage
#machine learning #graph #GNN
$$h^{0}_{v} \leftarrow x_{v}, \forall v \in V \\ \textbf{for } k \in \{1,2,...,K\} \textbf{ do}\\ \quad \textbf{for } v \in V \textbf{ do} \\ \qquad h^{k}_{\mathcal{N}(v)} \leftarrow \textbf{AGGREGATE}_{k}(\{h^{k-1}_{u}, \forall u \in \mathcal{N}(v)\}); \\ \qquad h^{k}_{v} \leftarrow \sigma (W^{k} \cdot \textbf{concat}(h^{k-1}_{v},h^{k}_{\mathcal{N}(v)})) \\ \quad \textbf{end} \\ \quad h^{k}_{v} \leftarrow h^{k}_{v}/||h^{k}_{v}||_{2},\forall v \in V \\ \textbf{end} \\ z_{v} \leftarrow h^{K}_{v}, \forall v \in V \\ J_{\textbf{z}_{u}}=-\log (\sigma (\textbf{z}_{u}^{T}\textbf{z}_{v})) - Q \cdot \mathbb{E}_{v_{n} \sim P_{n}(v)} \log(\sigma (-\textbf{z}_{u}^{T}\textbf{z}_{v_{n}}))$$
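A minimal NumPy rendering of the forward pass only (the unsupervised loss $J$ is omitted). The mean aggregator, ReLU as $\sigma$, and the toy graph are assumptions; the paper offers several aggregator choices:

```python
import numpy as np

rng = np.random.default_rng(6)

N, F, K = 6, 4, 2
x = rng.standard_normal((N, F))                          # input features x_v
adj = {0: [1, 2], 1: [0, 3], 2: [0, 4],
       3: [1, 5], 4: [2], 5: [3]}                        # toy graph
W = [rng.standard_normal((2 * F, F)) for _ in range(K)]  # one W^k per layer

h = x.copy()
for k in range(K):
    h_next = np.empty_like(h)
    for v in range(N):
        h_neigh = h[adj[v]].mean(axis=0)            # AGGREGATE_k: mean (assumed)
        z = np.concatenate([h[v], h_neigh]) @ W[k]  # W^k . concat(h_v, h_N(v))
        h_next[v] = np.maximum(z, 0.0)              # sigma = ReLU (assumed)
    h = h_next / np.linalg.norm(h_next, axis=1, keepdims=True)  # l2 normalize
z_v = h                                             # final embeddings z_v = h^K_v
```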
Hidden Markov Model
#machine learning #nlp
$$Q=\{q_{1},q_{2},...,q_{N}\}, V=\{v_{1},v_{2},...,v_{M}\} \\ I=\{i_{1},i_{2},...,i_{T}\},O=\{o_{1},o_{2},...,o_{T}\} \\ A=[a_{ij}]_{N \times N}, a_{ij}=P(i_{t+1}=q_{j}|i_{t}=q_{i}) \\ B=[b_{j}(k)]_{N \times M},b_{j}(k)=P(o_{t}=v_{k}|i_{t}=q_{j})$$
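Sampling a state sequence $I$ and observation sequence $O$ from $(A, B)$. The initial distribution $\pi$ is not part of the definitions above, so the uniform $\pi$ here is an assumption:

```python
import numpy as np

rng = np.random.default_rng(7)

N, M, T = 3, 4, 5
A = rng.dirichlet(np.ones(N), size=N)   # a_ij = P(i_{t+1} = q_j | i_t = q_i)
B = rng.dirichlet(np.ones(M), size=N)   # b_j(k) = P(o_t = v_k | i_t = q_j)
pi = np.full(N, 1.0 / N)                # initial distribution (assumed uniform)

states, obs = [], []
s = rng.choice(N, p=pi)
for t in range(T):
    states.append(s)                    # hidden state i_t
    obs.append(rng.choice(M, p=B[s]))   # emit o_t from b_{i_t}(.)
    s = rng.choice(N, p=A[s])           # transition to i_{t+1}
```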
Model-Agnostic Meta-Learning MAML
#machine learning #meta learning
$$\min_{\theta} \sum_{\mathcal{T}_{i} \sim p(\mathcal{T})} \mathcal{L}_{\mathcal{T}_{i}}(f_{\theta^{'}_{i}}) = \sum_{\mathcal{T}_{i} \sim p(\mathcal{T})} \mathcal{L}_{\mathcal{T}_{i}}(f_{\theta - \alpha \nabla_{\theta} \mathcal{L}_{\mathcal{T}_{i}} (f_{\theta})})$$
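The key detail is that the outer gradient differentiates through the inner update $\theta^{'}_{i} = \theta - \alpha \nabla_{\theta} \mathcal{L}_{\mathcal{T}_{i}}(f_{\theta})$. On a toy quadratic task loss this can be done by hand; the quadratic tasks and step sizes below are assumptions:

```python
import numpy as np

rng = np.random.default_rng(8)

alpha, beta = 0.1, 0.01              # inner / outer step sizes
tasks = rng.standard_normal((5, 2))  # task i: loss ||theta - c_i||^2 (toy)

def task_loss_grad(theta, c):
    return 2.0 * (theta - c)         # grad of ||theta - c||^2

theta = np.zeros(2)
for _ in range(100):                 # meta-training loop
    meta_grad = np.zeros_like(theta)
    for c in tasks:
        theta_i = theta - alpha * task_loss_grad(theta, c)  # inner adaptation
        # Outer gradient through the inner step; for this quadratic loss the
        # chain rule gives dL(theta_i)/dtheta = (1 - 2*alpha) * grad L(theta_i).
        meta_grad += (1.0 - 2.0 * alpha) * task_loss_grad(theta_i, c)
    theta -= beta * meta_grad        # meta-update of the initialization
```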
Progressive Layered Extraction PLE
#machine learning #multi task
$$g^{k}(x)=w^{k}(x)S^{k}(x) \\ w^{k}(x)=\text{softmax}(W^{k}_{g}x) \\ S^{k}(x)=[E^{T}_{(k,1)},E^{T}_{(k,2)},...,E^{T}_{(k,m_{k})},E^{T}_{(s,1)},E^{T}_{(s,2)},...,E^{T}_{(s,m_{s})}]^{T} \\ y^{k}(x)=t^{k}(g^{k}(x)) \\ g^{k,j}(x)=w^{k,j}(g^{k,j-1}(x))S^{k,j}(x) $$
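A single extraction layer's gate for task $k$, as a NumPy sketch: the gate softmax-weights the task's own experts together with the shared experts (linear experts and all sizes are assumptions; deeper layers feed $g^{k,j-1}(x)$ into the next gate):

```python
import numpy as np

rng = np.random.default_rng(9)

d, d_out = 8, 4
m_k, m_s = 2, 2                             # task-specific / shared expert counts
experts_k = [rng.standard_normal((d, d_out)) for _ in range(m_k)]
experts_s = [rng.standard_normal((d, d_out)) for _ in range(m_s)]
W_g = rng.standard_normal((m_k + m_s, d))   # gating weights W^k_g

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

x = rng.standard_normal(d)
S = np.stack([x @ E for E in experts_k + experts_s])  # S^k(x): stacked expert outputs
w = softmax(W_g @ x)                                  # w^k(x) = softmax(W^k_g x)
g = w @ S                                             # g^k(x), fed to the tower t^k
```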