Instrumental variables (IV) is an alternative causal inference method that does not rely on the ignorability assumption.
Let us think of an example. A: smoking during pregnancy (yes or no); Y: birthweight; X: parity, mother's age, weight, etc.
Concern: There could be unmeasured confounders.
Challenge: It is not ethical to randomly assign smoking to pregnant women.
So confounding may remain even after adjusting for measured covariates such as the mother's age, her weight, or whether she has given birth before. What we can do instead is use an encouragement design.
Adding Z: randomized to either receive encouragement to stop smoking (Z = 1) or receive usual care (Z = 0). An intention-to-treat analysis would focus on the causal effect of encouragement:
$$E(Y^{Z=1}) - E(Y^{Z=0})$$
This is a valid causal effect and would likely be of some interest.
What can we say about the causal effect of smoking itself? This is the focus of IV methods.
We can begin by imagining a randomized trial with non-compliance. Z: treatment assignment (randomized); A: treatment received; Y: outcome.
Essentially the non-compliance makes a randomized trial like an observational study. There could be confounding based on treatment received. It might be reasonable to assume that treatment assignment does not directly affect Y. Here Z can be thought of as (strong) encouragement to receive the treatment.
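We can classify people based on their potential treatments: $A^{Z=0}$, the treatment they would take if assigned Z = 0, and $A^{Z=1}$, the treatment they would take if assigned Z = 1.
$A^{Z=0}$ | $A^{Z=1}$ | Label |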
---|---|---|
0 | 0 | Never-takers |
0 | 1 | Compliers |
1 | 0 | Defiers |
1 | 1 | Always-takers |
The numbers in the table indicate whether a person actually takes the treatment (1) or not (0). Take the first row as an example: when this person is not assigned to receive the treatment, they do not take it (0), and when they are assigned to receive it, they still do not take it (0). Therefore we call such people never-takers. The other three rows are interpreted in the same way.
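The classification in the table can be written as a small lookup. This is only an illustrative sketch: `classify_compliance`, `a_z0` and `a_z1` are hypothetical names (not from the original post) for the treatment a person would take under Z = 0 and under Z = 1.

```r
# Hypothetical helper: label a compliance class from the two potential treatments.
# a_z0 = treatment taken if assigned Z = 0; a_z1 = treatment taken if assigned Z = 1.
classify_compliance <- function(a_z0, a_z1) {
  labels <- c("Never-takers", "Compliers", "Defiers", "Always-takers")
  # Encode the pair (a_z0, a_z1) as an index 1..4, in the same row order as the table
  labels[2 * a_z0 + a_z1 + 1]
}

# Enumerate all four combinations of potential treatments
grid <- expand.grid(a_z1 = c(0, 1), a_z0 = c(0, 1))
grid$label <- classify_compliance(grid$a_z0, grid$a_z1)
print(grid[, c("a_z0", "a_z1", "label")])
```

In real data this function could never be applied directly, because only one of the two potential treatments is observed for each person.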
A motivation for using IV methods in general is concern about possible unmeasured confounding. If there is unmeasured confounding, then we cannot marginalize over all confounders with matching, IPTW, etc.
IV methods do not focus on the average causal effect for the population. Instead, they focus on a local average treatment effect.
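The target of inference is
$$E(Y^{Z=1} \mid A^{Z=0}=0, A^{Z=1}=1) - E(Y^{Z=0} \mid A^{Z=0}=0, A^{Z=1}=1) = E(Y^{Z=1} - Y^{Z=0} \mid \text{compliers})$$
This is causal because it contrasts counterfactuals in a common population, the compliers. It is known as the complier average causal effect (CACE). Because compliers take the treatment exactly when they are assigned to it, this is also the effect of the treatment itself among compliers, $E(Y^{a=1} - Y^{a=0} \mid \text{compliers})$.
In the real world, for each person we observe an A and a Z, not both potential treatments $A^{Z=0}$ and $A^{Z=1}$.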
Z | A | $A^{Z=0}$ | $A^{Z=1}$ | Class |
---|---|---|---|---|
0 | 0 | 0 | ? | Never-takers or compliers |
0 | 1 | 1 | ? | Always-takers or defiers |
1 | 0 | ? | 0 | Never-takers or defiers |
1 | 1 | ? | 1 | Always-takers or compliers |
Without additional assumptions, we cannot classify each subject into one of these categories. However, we can narrow it down to two options.
Compliance classes are also known as principal strata. These are latent, not directly observable. In the next section we will talk about how we estimate the complier average causal effect and what assumptions are needed.
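A variable is an instrumental variable (IV) if:
* It is associated with the treatment.
* It affects the outcome only through its effect on the treatment (the exclusion restriction).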
Above we concluded that we are interested in the complier subgroup; defiers are not of interest. We therefore need one more assumption. The monotonicity assumption is that there are no defiers.
* No one consistently does the opposite of what they are told.
* It is called monotonicity because the assumption is that the probability of treatment should increase with more encouragement.
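To see what monotonicity buys us, the sketch below lists which compliance classes are consistent with an observed (Z, A) pair, with and without the no-defiers assumption. The function name `possible_classes` and its arguments are hypothetical, introduced only for illustration.

```r
# Hypothetical helper: which compliance classes are consistent with an observed (Z, A)?
possible_classes <- function(z, a, monotonicity = FALSE) {
  # Potential treatments (a_z0, a_z1) that define each class
  classes <- data.frame(
    label = c("Never-takers", "Compliers", "Defiers", "Always-takers"),
    a_z0  = c(0, 0, 1, 1),
    a_z1  = c(0, 1, 0, 1),
    stringsAsFactors = FALSE
  )
  # Monotonicity rules out defiers
  if (monotonicity) classes <- classes[classes$label != "Defiers", ]
  # The observed A must equal the potential treatment under the observed Z
  implied_a <- if (z == 1) classes$a_z1 else classes$a_z0
  classes$label[implied_a == a]
}

print(possible_classes(z = 0, a = 1))                       # two candidate classes
print(possible_classes(z = 0, a = 1, monotonicity = TRUE))  # narrowed to one class
```

Under monotonicity, people with Z = 0 and A = 1 can only be always-takers and people with Z = 1 and A = 0 can only be never-takers; the other two observed cells remain mixtures that include compliers.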
In this section we are going to discuss identification and estimation of causal effects from instrumental variable type of analysis.
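Recall that the goal is to estimate the CACE, $E(Y^{a=1} - Y^{a=0} \mid \text{compliers})$. Let's begin with something we can identify, the intention-to-treat (ITT) effect, which, because Z is randomized, can be written in terms of observed data:
$$E(Y^{Z=1} - Y^{Z=0}) = E(Y \mid Z=1) - E(Y \mid Z=0)$$
Given that Z has no effect on the outcomes of always-takers and never-takers (their treatment does not change with Z), and given the monotonicity assumption (no defiers), only compliers contribute to the ITT effect, so we can derive the following result:
$$E(Y \mid Z=1) - E(Y \mid Z=0) = E(Y^{Z=1} - Y^{Z=0} \mid \text{compliers}) \, P(\text{compliers})$$
which implies
$$\text{CACE} = E(Y^{Z=1} - Y^{Z=0} \mid \text{compliers}) = \frac{E(Y \mid Z=1) - E(Y \mid Z=0)}{P(\text{compliers})}$$
Note that $E(A \mid Z=1)$ is the proportion of people who are always-takers or compliers and $E(A \mid Z=0)$ is the proportion of people who are always-takers. Therefore P(compliers) is just $E(A \mid Z=1) - E(A \mid Z=0)$.
We can derive the expression of CACE as follows:
$$\text{CACE} = \frac{E(Y \mid Z=1) - E(Y \mid Z=0)}{E(A \mid Z=1) - E(A \mid Z=0)}$$
The denominator is the causal effect of treatment assignment on the treatment received. The numerator is the ITT: the causal effect of treatment assignment on the outcome.
Note:
* With perfect compliance the denominator is 1, so the CACE equals the ITT effect.
* Under monotonicity the denominator is between 0 and 1, so the CACE is at least as large in magnitude as the ITT effect.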
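The ratio of the ITT effect to the proportion of compliers can be checked on simulated data. The sketch below is not part of the original analysis: the class proportions, the true effect of 2, and all variable names are assumptions chosen for illustration, and the treatment effect is taken to be the same for everyone so that the CACE is known.

```r
set.seed(1)
n <- 100000

# Assumed population: 20% never-takers, 50% compliers, 30% always-takers, no defiers
class <- sample(c("never", "complier", "always"), n, replace = TRUE,
                prob = c(0.2, 0.5, 0.3))
z <- rbinom(n, 1, 0.5)                                # randomized encouragement
a <- ifelse(class == "always", 1,
            ifelse(class == "never", 0, z))           # treatment actually received
u <- rnorm(n) + (class == "always")                   # unmeasured confounder tied to class
true_cace <- 2                                        # assumed constant treatment effect
y <- 1 + true_cace * a + u + rnorm(n)                 # Z affects Y only through A

itt         <- mean(y[z == 1]) - mean(y[z == 0])      # effect of assignment on outcome
p_compliers <- mean(a[z == 1]) - mean(a[z == 0])      # effect of assignment on treatment
cace_hat    <- itt / p_compliers                      # ratio (Wald) estimate of the CACE
naive       <- mean(y[a == 1]) - mean(y[a == 0])      # as-treated comparison, confounded by u

cat("ITT:", itt, " P(compliers):", p_compliers, "\n")
cat("CACE estimate:", cace_hat, " naive as-treated estimate:", naive, "\n")
```

In this setup the ratio estimate should land close to the assumed effect, while the naive comparison by treatment received is pushed upward because always-takers differ on the unmeasured variable `u`.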
Two-stage least squares (2SLS) is a method for estimating a causal effect when you have an instrumental variable.
The first stage is to regress the treatment received, A, on the instrumental variable, Z:
$$A_i = \alpha_0 + Z_i \alpha_1 + \epsilon_i$$
where the error term $\epsilon_i$ has mean 0 and constant variance. By randomization, $Z_i$ and $\epsilon_i$ are independent.
After that we can obtain the predicted value of A given Z:
$$\hat{A}_i = \hat{\alpha}_0 + Z_i \hat{\alpha}_1$$
The second stage is to regress the outcome, Y, on the fitted value from stage 1, $\hat{A}_i$:
$$Y_i = \beta_0 + \hat{A}_i \beta_1 + e_i$$
The estimate of $\beta_1$ is the estimate of the causal effect.
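As a minimal sketch of the two stages, the simulation below runs them with base R's `lm` and compares the second-stage slope with the ratio of mean differences. The data-generating model (a threshold model for treatment, a true effect of 2) and all variable names are assumptions made for illustration.

```r
set.seed(2)
n <- 100000

z <- rbinom(n, 1, 0.5)                          # instrument, randomized
u <- rnorm(n)                                   # unmeasured confounder
a <- as.numeric(2 * z + u + rnorm(n) > 0.5)     # treatment depends on Z and on u
y <- 1 + 2 * a + u + rnorm(n)                   # outcome; Z enters only through a

# Stage 1: regress treatment received on the instrument, keep fitted values
stage1 <- lm(a ~ z)
a_hat  <- fitted(stage1)

# Stage 2: regress the outcome on the fitted treatment
stage2     <- lm(y ~ a_hat)
beta1_2sls <- unname(coef(stage2)["a_hat"])

# Ratio of mean differences (ITT divided by the first-stage difference)
wald <- (mean(y[z == 1]) - mean(y[z == 0])) / (mean(a[z == 1]) - mean(a[z == 0]))

cat("2SLS slope:", beta1_2sls, " ratio estimate:", wald, "\n")
```

With a single binary instrument the two numbers coincide up to floating-point error, since the fitted values take only two distinct values. Note that the standard errors reported by the second `lm` fit ignore the fact that `a_hat` was estimated, so they should not be used for inference.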
This section is about how to carry out an instrumental variable analysis
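in R. The variables are as follows:
* Z: nearc4, whether the person lived near a 4 year college (the IV)
* A: educ, the number of years of education (the 'treatment')
* Y: lwage, the log of wage (the outcome)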
More schooling is associated with higher income, but is that because
people with more schooling are different in other ways? That is, we are
concerned about measured and unmeasured confounding.
One proposal is to use proximity to college as an IV. Living near
a 4 year college is a type of encouragement.
# install.packages("ivpack")
library(ivpack)
data("card.data")
# IV is nearc4 (near 4 year college)
# outcome is lwage (log of wage)
# 'treatment' is educ (number of years of education)
# you can take a look at descriptive statistics of the variables, but we skip this step here.
# make education binary
educ12 = card.data$educ>12
# estimate proportion of compliers
compl_prop = mean(educ12[card.data$nearc4==1]) - mean(educ12[card.data$nearc4==0])
cat("The proportion of compliers is", compl_prop)
## The proportion of compliers is 0.1219293
We can see that the proportion of compliers is only about 12%. The
instrument is not extremely strong, but it is not very weak either.
Living near a four year college does seem to increase the chance of
getting more than 12 years of education.
# intention to treat effect
itt = mean(card.data$lwage[card.data$nearc4==1]) - mean(card.data$lwage[card.data$nearc4==0])
cat("The intention to treat effect is", itt)
## The intention to treat effect is 0.1559075
cat("The complier average causal effect is", itt/compl_prop)
## The complier average causal effect is 1.278672
# two stage least squares
## Stage 1: regress A on Z
s1 = lm(educ12 ~ card.data$nearc4)
## get predicted value of A given Z for each subject
predtx = predict(s1,type = 'response')
table(predtx)
## predtx
## 0.422152560083588 0.54408183146614
## 957 2053
We can see that people who had the encouragement had a probability of
0.54 of getting more than 12 years of education. That is about 0.12
higher than the 0.42 for those who did not have the encouragement.
# stage 2: regress Y on predicted value of A
lm(card.data$lwage~predtx)
##
## Call:
## lm(formula = card.data$lwage ~ predtx)
##
## Coefficients:
## (Intercept) predtx
## 5.616 1.279
We can see that the CACE is 1.279. It is the same as the one we
estimated before by hand.
That brings us to the end of the causal inference series. We have gone from the definition of a causal effect through to its estimation. Part 2 talked about matching, which is used when we do not have a randomized trial. Part 3 was about IPTW, an important method for dealing with imbalance between the two groups being compared. In this post, we introduced instrumental variables to cope with situations where the ignorability assumption is not fulfilled.
My biggest impression after the past 2 months of study is that statisticians have put so much effort into developing causal inference and making it as applicable to the real world as possible. And I hope readers will remember that correlation does not imply causation!