DISTRIBUTION-FREE ASSESSMENT OF POPULATION OVERLAP IN OBSERVATIONAL STUDIES

LIHUA LEI – STANFORD UNIVERSITY

ABSTRACT

The credibility of causal inference from observational studies relies crucially on the overlap of baseline covariates between treatment groups, also known as the positivity or common support assumption. In current practice, overlap is typically assessed empirically using estimated propensity scores. This approach is meaningful only when the propensity score model is correctly specified, and in general it carries no formal statistical guarantee because it lacks proper uncertainty quantification. In this work, we formally define a measure of population overlap inspired by the strict overlap condition (e.g., that propensity scores lie in [0.1, 0.9] almost surely), and develop a family of upper confidence bounds on this measure, which we call O-values. The O-values are valid in finite samples without any assumption on the data generating process, provided that the observations are independent and identically distributed. Technically, we construct the O-values using a non-standard partial identification approach, with the uncertainty quantification handled by computable concentration inequalities.
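
For concreteness, the conventional diagnostic described above (not the O-value construction, which this abstract does not specify) might look like the following minimal sketch. It fits a logistic propensity model on simulated data and reports the share of estimated scores falling outside [0.1, 0.9]. The function names, the simulated data generating process, and the choice of a logistic model are all assumptions made for illustration only.

```python
import numpy as np


def _design(X):
    """Prepend an intercept column to the covariate matrix."""
    return np.column_stack([np.ones(len(X)), X])


def fit_logistic(X, t, n_iter=50, ridge=1e-6):
    """Fit a logistic regression by Newton's method (hypothetical helper).

    A tiny ridge term keeps the Hessian invertible. Returns the
    coefficient vector with the intercept first.
    """
    Z = _design(X)
    beta = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Z @ beta))
        grad = Z.T @ (t - p) - ridge * beta
        hess = (Z * (p * (1.0 - p))[:, None]).T @ Z + ridge * np.eye(Z.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def estimated_propensity(X, t):
    """Estimated propensity scores under the assumed logistic model."""
    beta = fit_logistic(X, t)
    return 1.0 / (1.0 + np.exp(-_design(X) @ beta))


def share_outside(e_hat, eta=0.1):
    """Fraction of estimated scores outside [eta, 1 - eta]."""
    e_hat = np.asarray(e_hat)
    return float(np.mean((e_hat < eta) | (e_hat > 1.0 - eta)))


# Simulated example: the treatment probability depends strongly on the
# first covariate, so a sizeable share of units has extreme scores.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))
true_e = 1.0 / (1.0 + np.exp(-2.0 * X[:, 0]))
t = rng.binomial(1, true_e)

e_hat = estimated_propensity(X, t)
print("share of estimated scores outside [0.1, 0.9]:", share_outside(e_hat))
```

The printed share is a plug-in summary: it inherits any misspecification of the fitted model and comes with no confidence statement, which is the gap the O-values are meant to address.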