High-dimensional Statistics in the Inconsistency Regime



It is common in high-dimensional statistics to focus on the amount of structure required to recover high-dimensional parameters consistently. In this talk, I will instead focus on regimes in which consistency is impossible. I will present two vignettes illustrating the utility of analyzing such regimes.

In the first vignette, I study mean-field variational Bayesian methods in high-dimensional linear models. It has been widely observed, including in genetic association studies, that such methods underestimate posterior uncertainty, leading to inflated error rates. This underestimation results from insufficient accounting for high-dimensional estimation errors. I present a rigorous analysis of a method which correctly accounts for these errors, leading to well-calibrated inference.

In the second vignette, I study a problem from semiparametrics, the estimation of a population mean with data missing at random, in a setting in which nuisance parameters cannot be estimated consistently. In this challenging regime, even standard doubly-robust estimators can be inconsistent. I describe novel approaches which remain consistent for the population mean where standard approaches fail.

Finally, I will provide my perspective on the broader implications of this work for designing methods which are less sensitive to errors from high-dimensional prediction models.

Related Papers: