## Cheatsheet of LaTeX Code for the Most Popular Causal Inference and Uplift Modelling Equations

rockingdingo 2022-07-10 #causal inference #uplift modelling #auuc #qini

This post summarizes the most fundamental concepts and equations of causal inference and uplift modelling, together with their LaTeX code. Causal inference has attracted growing attention in the machine learning and statistics communities. Generally speaking, uplift modelling is a group of methods for estimating the effect of an action (the treatment) on a final outcome. The basic concepts covered below include the ATE (Average Treatment Effect), the CATE (Conditional Average Treatment Effect), and the unconfoundedness assumption, also known as the CIA (Conditional Independence Assumption).

Notation: let W_i denote the indicator of whether instance i is assigned to the control group (W_i=0) or the treatment group (W_i=1). We denote by Y_i(0) the potential outcome of instance i, with features X_i, if it is assigned to the control group (W_i=0), and by Y_i(1) the potential outcome if it is assigned to the treatment group (W_i=1).

### 1. Basic Concepts of Causal Inference

• #### 1.1 Average Treatment Effect (ATE)

##### Latex Code
            \text{ATE}:=\mathbb{E}[Y(1)-Y(0)]

##### Explanation

The Average Treatment Effect (ATE) is defined as the expectation, over the whole population, of the difference between the potential outcome under treatment Y(1) and the potential outcome under control Y(0).
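
As a rough illustration of the definition, the sketch below simulates both potential outcomes for every unit (something that is only possible in a simulation) and compares the true ATE with the difference in group means under random assignment. The data-generating numbers (an average effect of 2.0, the noise levels, the 50/50 assignment) are assumptions made up for this example.

```python
import random

random.seed(0)
n = 20000

# Simulated potential outcomes (never jointly observable in real data).
y0 = [random.gauss(1.0, 1.0) for _ in range(n)]
y1 = [y + 2.0 + random.gauss(0.0, 0.5) for y in y0]  # effect averages 2.0

# Randomized assignment: W is independent of (Y(0), Y(1)).
w = [random.random() < 0.5 for _ in range(n)]

# Observed outcome: Y(1) for treated units, Y(0) for control units.
y_obs = [a if t else b for a, b, t in zip(y1, y0, w)]

# True ATE, computable here only because both potential outcomes are simulated.
true_ate = sum(a - b for a, b in zip(y1, y0)) / n

# Difference-in-means estimate, which uses observed data only.
treated = [y for y, t in zip(y_obs, w) if t]
control = [y for y, t in zip(y_obs, w) if not t]
ate_hat = sum(treated) / len(treated) - sum(control) / len(control)

print(f"true ATE = {true_ate:.3f}, difference in means = {ate_hat:.3f}")
```

With random assignment the two numbers should be close; without it, the difference in means can be biased by confounding.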

• #### 1.2 Individual Treatment Effect (ITE)

##### Latex Code
            \text{ITE}_{i}:=Y_{i}(1)-Y_{i}(0)

##### Explanation

The Individual Treatment Effect (ITE) is defined as the difference between the potential outcome under treatment Y_i(1) and the potential outcome under control Y_i(0) of the same instance i. The fundamental problem is that Y_i(1) and Y_i(0) can never be observed at the same time: each instance i is assigned to either the control group or the treatment group, never both. The ITE therefore cannot be observed directly for any instance i.
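
One common way to write this problem down is through the observed outcome, which reveals only one of the two potential outcomes. The symbol Y^{obs}_{i} is introduced here for illustration and is not part of the notation above.

```latex
Y^{obs}_{i} = W_{i}Y_{i}(1) + (1-W_{i})Y_{i}(0)
```

For W_i=1 the observed outcome equals Y_i(1) and Y_i(0) stays missing; for W_i=0 it is the other way round.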

• #### 1.3 Conditional Average Treatment Effect (CATE)

##### Latex Code
            \tau(x):=\mathbb{E}[Y(1)-Y(0)|X=x]

##### Explanation

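Since the ITE of instance i cannot be observed directly, most causal inference models instead estimate the Conditional Average Treatment Effect (CATE): the expected treatment effect among instances that share the same features, evaluated at the features of instance i (X=x_{i}).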
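
The CATE and the ATE are linked by the law of iterated expectations: averaging \tau(x) over the distribution of X recovers the ATE. A short sketch in the same notation:

```latex
\text{ATE}=\mathbb{E}[Y(1)-Y(0)]=\mathbb{E}_{X}\big[\mathbb{E}[Y(1)-Y(0)|X]\big]=\mathbb{E}_{X}[\tau(X)]
```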

• #### 1.4 Propensity Score

##### Latex Code
            e(x) := P(W=1|X=x)

##### Explanation

The propensity score is the probability that an instance with features X=x is assigned to the treatment group (W=1).
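
For a discrete feature, the propensity score can be estimated simply as the treated share within each feature value. The sketch below assumes a hypothetical `segment` feature with two values and made-up assignment probabilities; with continuous or high-dimensional features a classifier such as logistic regression would typically be used instead.

```python
import random
from collections import defaultdict

random.seed(1)

# Hypothetical assignment mechanism: treatment probability depends on the segment.
true_propensity = {"new": 0.2, "returning": 0.7}

data = []
for _ in range(10000):
    segment = random.choice(["new", "returning"])
    w = 1 if random.random() < true_propensity[segment] else 0
    data.append((segment, w))


def estimate_propensity(rows):
    """Estimate P(W=1 | X=x) as the treated share within each value of x."""
    treated = defaultdict(int)
    total = defaultdict(int)
    for x, w in rows:
        treated[x] += w
        total[x] += 1
    return {x: treated[x] / total[x] for x in total}


e_hat = estimate_propensity(data)
for segment in sorted(e_hat):
    print(segment, round(e_hat[segment], 3))
```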

• #### 1.5 Unconfoundedness Assumption

##### Latex Code
            \{Y_{i}(0),Y_{i}(1)\}\perp W_{i}|X_{i}

##### Explanation

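The unconfoundedness assumption, or CIA (Conditional Independence Assumption), states that, conditioned on the features X, the potential outcomes (Y(0),Y(1)) are independent of the treatment assignment W. In other words, there are no hidden confounders that influence both the treatment assignment and the potential outcomes.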
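
A short sketch of why this assumption matters: together with overlap, 0 < P(W=1|X=x) < 1 (an additional assumption not stated above), it lets the CATE be written in terms of observed quantities only. Here Y denotes the observed outcome.

```latex
\mathbb{E}[Y(1)|X=x]=\mathbb{E}[Y(1)|X=x,W=1]=\mathbb{E}[Y|X=x,W=1] \\
\mathbb{E}[Y(0)|X=x]=\mathbb{E}[Y(0)|X=x,W=0]=\mathbb{E}[Y|X=x,W=0] \\
\tau(x)=\mathbb{E}[Y|X=x,W=1]-\mathbb{E}[Y|X=x,W=0]
```

In each of the first two lines, the first equality uses unconfoundedness and the second uses the fact that the observed outcome equals the potential outcome of the assigned group.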

### 2. Models

• #### 2.1 S-Learner

##### Latex Code
            \mu(x,w):=\mathbb{E}[Y|X=x,W=w] \\
            \hat{\tau}(x)=\hat{\mu}(x,1)-\hat{\mu}(x,0)

##### Explanation

The S-Learner uses a single machine learning estimator \mu(x,w) to predict the outcome Y directly, with the treatment assignment variable W \in \{0,1\} included as an input feature alongside X. The CATE estimate is the difference between two predictions of the same model \hat{\mu} for the same x, once with w=1 and once with w=0.
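
A minimal S-Learner sketch, assuming a linear base model with features (1, x, w, x·w) fitted by least squares. The base model, the helper names (`solve_linear_system`, `features`, `fit_s_learner`) and the simulated data with true CATE 1 + 2x are all choices made for this example; any regressor that takes w as a feature could play the same role.

```python
import random


def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def features(x, w):
    # The treatment indicator w enters the single model as an ordinary feature.
    return [1.0, x, w, x * w]


def fit_s_learner(xs, ws, ys):
    """Least-squares fit of one model mu(x, w) via the normal equations."""
    rows = [features(x, w) for x, w in zip(xs, ws)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    return lambda x, w: sum(b * f for b, f in zip(beta, features(x, w)))


random.seed(2)
xs = [random.uniform(0.0, 1.0) for _ in range(5000)]
ws = [random.randint(0, 1) for _ in range(5000)]
# Simulated outcome whose true CATE is tau(x) = 1 + 2x.
ys = [0.5 + x + w * (1.0 + 2.0 * x) + random.gauss(0.0, 0.1)
      for x, w in zip(xs, ws)]

mu_hat = fit_s_learner(xs, ws, ys)


def tau_hat(x):
    # Same model, same x, two different values of the treatment feature.
    return mu_hat(x, 1) - mu_hat(x, 0)


print(round(tau_hat(0.5), 2))
```

The interaction feature x·w is what allows the estimated effect to vary with x; without it this linear S-Learner could only return a constant effect.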

• #### 2.2 T-Learner

##### Latex Code
            \mu_{0}(x)=\mathbb{E}[Y(0)|X=x], \mu_{1}(x)=\mathbb{E}[Y(1)|X=x], \\
            \hat{\tau}(x)=\hat{\mu}_{1}(x)-\hat{\mu}_{0}(x)

##### Explanation

The T-Learner fits two separate models: \mu_0 on the data of the control group (W=0) and \mu_1 on the data of the treatment group (W=1). The CATE estimate is the difference between the predictions of the two models \hat{\mu}_1 and \hat{\mu}_0 for the same input x.
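
A minimal T-Learner sketch, assuming a one-dimensional feature and a simple linear regression as the base model for each group. The helper `fit_line` and the simulated data (true CATE 1 + 2x) are illustrative assumptions, not part of any particular library.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns a prediction function."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


random.seed(3)
xs = [random.uniform(0.0, 1.0) for _ in range(4000)]
ws = [random.randint(0, 1) for _ in range(4000)]
# Simulated outcome whose true CATE is tau(x) = 1 + 2x.
ys = [0.5 + x + w * (1.0 + 2.0 * x) + random.gauss(0.0, 0.1)
      for x, w in zip(xs, ws)]

# Split the data by treatment group and fit one model per group.
x0 = [x for x, w in zip(xs, ws) if w == 0]
y0 = [y for y, w in zip(ys, ws) if w == 0]
x1 = [x for x, w in zip(xs, ws) if w == 1]
y1 = [y for y, w in zip(ys, ws) if w == 1]
mu0_hat = fit_line(x0, y0)
mu1_hat = fit_line(x1, y1)


def tau_hat(x):
    # Two different models evaluated at the same input x.
    return mu1_hat(x) - mu0_hat(x)


print(round(tau_hat(0.5), 2))
```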

• #### 2.3 X-Learner

##### Latex Code
            \tilde{D}^{1}_{i}:=Y^{1}_{i}-\hat{\mu}_{0}(X^{1}_{i}), \tilde{D}^{0}_{i}:=\hat{\mu}_{1}(X^{0}_{i})-Y^{0}_{i} \\
            \hat{\tau}(x)=g(x)\hat{\tau}_{0}(x) + (1-g(x))\hat{\tau}_{1}(x)

##### Explanation

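The X-Learner first fits \hat{\mu}_0 and \hat{\mu}_1 as in the T-Learner, then imputes the treatment effects \tilde{D}^{1}_{i} for treated instances and \tilde{D}^{0}_{i} for control instances. \hat{\tau}_1 and \hat{\tau}_0 are regressions of these imputed effects on the features of the treatment and control group respectively, and g(x) \in [0,1] is a weighting function, for which the propensity score is a common choice. For more details, see the paper "Metalearners for estimating heterogeneous treatment effects using machine learning".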
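
The steps behind the two formulas can be sketched as follows, assuming a one-dimensional feature, simple linear regression for every stage, and a constant weight g equal to the treated share. The helper `fit_line`, the 10% treatment rate and the simulated data (true CATE 1 + 2x) are assumptions made for this example.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns a prediction function."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return lambda x: (mean_y - slope * mean_x) + slope * x


random.seed(4)
n = 6000
xs = [random.uniform(0.0, 1.0) for _ in range(n)]
ws = [1 if random.random() < 0.1 else 0 for _ in range(n)]  # few treated units
ys = [0.5 + x + w * (1.0 + 2.0 * x) + random.gauss(0.0, 0.1)
      for x, w in zip(xs, ws)]

x1 = [x for x, w in zip(xs, ws) if w == 1]
y1 = [y for y, w in zip(ys, ws) if w == 1]
x0 = [x for x, w in zip(xs, ws) if w == 0]
y0 = [y for y, w in zip(ys, ws) if w == 0]

# Stage 1: outcome models per group, as in the T-Learner.
mu0_hat = fit_line(x0, y0)
mu1_hat = fit_line(x1, y1)

# Stage 2: imputed treatment effects D^1 (treated) and D^0 (control).
d1 = [y - mu0_hat(x) for x, y in zip(x1, y1)]
d0 = [mu1_hat(x) - y for x, y in zip(x0, y0)]
tau1_hat = fit_line(x1, d1)
tau0_hat = fit_line(x0, d0)

# Stage 3: weighted combination; g is a constant propensity estimate here.
g = len(x1) / n


def tau_hat(x):
    return g * tau0_hat(x) + (1.0 - g) * tau1_hat(x)


print(round(tau_hat(0.5), 2))
```

With linear models at every stage, as here, the result essentially coincides with the T-Learner estimate, so the example only illustrates the mechanics. The method is usually paired with more flexible base learners.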

### 3. Metrics

• #### 3.1 Area Under Uplift Curve (AUUC)

##### Latex Code
            f(t)=(\frac{Y^{T}_{t}}{N^{T}_{t}} - \frac{Y^{C}_{t}}{N^{C}_{t}})(N^{T}_{t}+N^{C}_{t})

##### Explanation

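The authors of the paper "Causal Inference and Uplift Modeling: A review of the literature" define the AUUC as the area under the uplift curve f(t). The subscript t means that a quantity is computed on the first t instances, sorted by predicted uplift in descending order: Y^{T}_{t} and Y^{C}_{t} are the sums of outcomes in the treatment and control group, and N^{T}_{t} and N^{C}_{t} are the corresponding group sizes.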
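
A small sketch of how f(t) and its area could be computed from model scores. Several details are assumptions of this example rather than part of the definition: a group rate is taken as 0 while that group is still empty, f(0) is set to 0, the area is computed with the trapezoidal rule over t, and the eight-row toy data set is made up.

```python
def uplift_curve(scores, ws, ys):
    """Return [f(0), f(1), ..., f(N)] with instances ranked by descending score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    y_t = n_t = y_c = n_c = 0
    curve = [0.0]  # f(0) = 0 by convention (assumption of this sketch)
    for i in order:
        if ws[i] == 1:
            y_t += ys[i]
            n_t += 1
        else:
            y_c += ys[i]
            n_c += 1
        # Guard against division by zero while one group is still empty.
        rate_t = y_t / n_t if n_t else 0.0
        rate_c = y_c / n_c if n_c else 0.0
        curve.append((rate_t - rate_c) * (n_t + n_c))
    return curve


def area_under_curve(curve):
    """Trapezoidal area with unit spacing between consecutive values of t."""
    return sum((a + b) / 2.0 for a, b in zip(curve, curve[1:]))


# Toy data: predicted uplift score, treatment indicator, binary outcome.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
ws = [1, 0, 1, 0, 1, 0, 1, 0]
ys = [1, 0, 1, 0, 0, 0, 0, 1]

curve = uplift_curve(scores, ws, ys)
auuc = area_under_curve(curve)
print([round(v, 3) for v in curve], round(auuc, 3))
```

Libraries differ in how they normalize this area and in how they handle ties, so values from different implementations are not directly comparable.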

• #### 3.2 QINI

##### Latex Code
            g(t)=Y^{T}_{t}-\frac{Y^{C}_{t}N^{T}_{t}}{N^{C}_{t}}, \\
            f(t)=g(t) \times \frac{N^{T}_{t}+N^{C}_{t}}{N^{T}_{t}}

##### Explanation

The author of the paper "Using control groups to target on predicted lift: Building and assessing uplift models" defines the Qini coefficient based on the area under the Qini curve g(t), which rescales the control group outcomes by N^{T}_{t}/N^{C}_{t} and is therefore better suited to unbalanced sample sizes between the control group and the treatment group.
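
The two formulas can be sketched in code as follows. As assumptions of this example, the control term is taken as 0 while the control group is still empty, f(t) is taken as 0 while the treatment group is still empty, the area under g(t) is computed with the trapezoidal rule starting from g(0)=0, and the toy data set is made up. Subtracting the area of a random-targeting baseline, as is commonly done for the Qini coefficient, is left out.

```python
def qini_curves(scores, ws, ys):
    """Return (g, f) lists for t = 1..N with instances ranked by descending score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    y_t = n_t = y_c = n_c = 0
    g_vals, f_vals = [], []
    for i in order:
        if ws[i] == 1:
            y_t += ys[i]
            n_t += 1
        else:
            y_c += ys[i]
            n_c += 1
        # Control outcomes rescaled to the size of the treatment group.
        scaled_control = y_c * n_t / n_c if n_c else 0.0
        g = y_t - scaled_control
        f = g * (n_t + n_c) / n_t if n_t else 0.0
        g_vals.append(g)
        f_vals.append(f)
    return g_vals, f_vals


def area_under_curve(values):
    """Trapezoidal area with unit spacing, starting from 0 at t = 0."""
    points = [0.0] + list(values)
    return sum((a + b) / 2.0 for a, b in zip(points, points[1:]))


# Toy data: predicted uplift score, treatment indicator, binary outcome.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
ws = [1, 0, 1, 0, 1, 0, 1, 0]
ys = [1, 0, 1, 0, 0, 0, 0, 1]

g_vals, f_vals = qini_curves(scores, ws, ys)
qini_area = area_under_curve(g_vals)
print(g_vals, [round(v, 3) for v in f_vals], qini_area)
```

On this toy data f(t) takes the same values as the uplift curve of section 3.1, since f(t) is g(t) rescaled by (N^{T}_{t}+N^{C}_{t})/N^{T}_{t}.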