$\begingroup$

Long story short, I'm seeing in the literature that linear instrumental variables models are identifiable, even in the presence of unobserved confounders. The unobserved confounding aspect befuddles me, since it is not clear where this insight came from.

Briefly, consider a linear instrumental variables setup:

$X = Z\beta + U\theta + \epsilon, \; \epsilon \sim N(0, \sigma_{\epsilon})$

$Y = X\alpha + U\phi + \delta, \; \delta \sim N(0, \sigma_{\delta})$

Here $Z\in \mathbb{R}^z$ are the instruments, $X \in \mathbb{R}^x$ are the treatment variables (the endogenous regressors, since they depend on $U$), $Y \in \mathbb{R}^y$ are the outcomes, and $U \in \mathbb{R}^u$ are the unobserved confounders. In the instrumental variables setup, the average causal effect $\mathbb{E}[Y|do(x)]$ is of interest.

If $U$ is observed, I can see how this works out. But it isn't clear to me how an unobserved $U$ gets dropped / marginalized out in the linear setting, and I have not been successful in finding the original proof, even though this was hinted at in Bowden + Turkington 1984, Pearl 2008 and Imbens + Rubin 2015. Any references or pointers would be appreciated.

P.S. This question is similar to what was asked here, but the collider insight is only part of the story, since we know that ACE is not identifiable in a non-parametric setting.

P.P.S. This was originally posted on mathoverflow, before I realized that this forum was better suited for causal inference questions.

$\endgroup$

1 Answer

$\begingroup$

The causal effect can be identified with the right methodology

An instrumental variable (IV) can be used to estimate the causal effect even under hidden confounding. However, one has to use a suitable estimation procedure, and how best to estimate effects in an IV setting is a research question of its own. The Wikipedia article on Instrumental Variable Estimation has a good summary and cites many of the standard works in the literature.
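
As a sketch of why hidden confounding does not break identification in the linear model, the argument below uses the question's notation and adds two assumptions that the question does not state explicitly: the instrument is uncorrelated with the unobserved terms, $\mathrm{Cov}(Z, U) = 0$ and $\mathrm{Cov}(Z, \delta) = 0$, and $\mathrm{Cov}(Z, X)$ has full column rank. Treating $Z, X, Y, U$ as row vectors and taking the covariance of $Z$ with both sides of the outcome equation gives

$\mathrm{Cov}(Z, Y) = \mathrm{Cov}(Z, X)\alpha + \mathrm{Cov}(Z, U)\phi + \mathrm{Cov}(Z, \delta) = \mathrm{Cov}(Z, X)\alpha$

Both $\mathrm{Cov}(Z, Y)$ and $\mathrm{Cov}(Z, X)$ involve only observed variables, so under the rank assumption $\alpha$ is the unique solution of this linear system. When $z = x$ this reads

$\alpha = \mathrm{Cov}(Z, X)^{-1}\mathrm{Cov}(Z, Y)$

In the linear model, intervening on $X$ leaves the distribution of $U$ unchanged, so $\mathbb{E}[Y|do(x)] = x\alpha + \mathbb{E}[U]\phi$, and the effect of changing $x$ is governed by $\alpha$ alone. $U$ never has to be observed; it drops out because it is uncorrelated with $Z$.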

Probably the simplest and most common approach is what's called "two-stage least squares" (2SLS). The idea is to first regress $X$ on the instrument $Z$ to obtain fitted values $\hat{X}$, which depend only on $Z$ and are therefore free of the confounding, provided the instrument is valid. One can then regress $Y$ on $\hat{X}$ to obtain the causal effect. The Wikipedia entry on the subject also contains a short proof of the computation of the 2SLS estimator. See also this video for an instructive proof of the unbiasedness of the instrumental variable regression.
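
The two stages can be checked numerically. The following is a minimal simulation sketch, not a reference implementation: it assumes scalar $Z$, $X$, $Y$, $U$, an instrument drawn independently of $U$, and illustrative coefficient values. The function names (`simulate_iv`, `ols_slope`, `two_stage_least_squares`) are hypothetical helpers written for this example.

```python
import numpy as np


def simulate_iv(n=200_000, alpha=2.0, beta=1.0, theta=1.0, phi=1.0, seed=0):
    """Draw samples from the scalar linear IV model of the question.

    Assumption: the instrument z is independent of the confounder u.
    """
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)  # instrument
    u = rng.normal(size=n)  # confounder, never passed to the estimators
    x = beta * z + theta * u + rng.normal(size=n)
    y = alpha * x + phi * u + rng.normal(size=n)
    return z, x, y


def ols_slope(a, b):
    """Slope from a least-squares regression of b on a (with intercept)."""
    a_centered = a - a.mean()
    return float(a_centered @ (b - b.mean()) / (a_centered @ a_centered))


def two_stage_least_squares(z, x, y):
    """Stage 1: regress x on z. Stage 2: regress y on the fitted values."""
    beta_hat = ols_slope(z, x)
    x_hat = x.mean() + beta_hat * (z - z.mean())
    return ols_slope(x_hat, y)


true_alpha = 2.0
z, x, y = simulate_iv(alpha=true_alpha)
naive_estimate = ols_slope(x, y)  # ignores the confounder
iv_estimate = two_stage_least_squares(z, x, y)
print(f"true alpha: {true_alpha}")
print(f"naive OLS:  {naive_estimate:.3f}")
print(f"2SLS:       {iv_estimate:.3f}")
```

With these illustrative parameters the naive regression of $Y$ on $X$ converges to $\alpha + \phi\theta\,\mathrm{Var}(U)/\mathrm{Var}(X) = \alpha + 1/3$, whereas the 2SLS estimate converges to $\alpha$ itself, even though $U$ is never used by either estimator.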

$\endgroup$
  • $\begingroup$ Hi, thanks for commenting. But I was asking about a mathematical proof of identifiability. Why can 2SLS identify the causal effect, even in the presence of unobserved confounders? $\endgroup$ Commented Jun 7, 2023 at 14:47
  • $\begingroup$ I added a link to a video explaining the unbiasedness of the estimator. The same channel has a whole series introducing IV regression if you're looking for more details. $\endgroup$ Commented Jun 7, 2023 at 16:16
  • $\begingroup$ I feel that we are getting warmer, but that video doesn't have a discussion on confounders (or unobserved confounders). $\endgroup$ Commented Jun 8, 2023 at 17:12
