Trustworthy Post-selection Inference in A/B Testing

ALEX (SHAOJIE) DENG – MICROSOFT ANALYSIS AND EXPERIMENTATION TEAM

ABSTRACT

When consuming A/B tests, we typically focus only on the statistically significant results and take them at face value. This practice, termed post-selection inference in the statistical literature, may negatively affect both point estimation and uncertainty quantification in A/B testing, and therefore hinder trustworthy decision making. To address this issue, we explore two seemingly unrelated paths, one based on supervised machine learning and the other on empirical Bayes, and propose post-selection inferential approaches that combine the strengths of both. Via large-scale simulated and empirical examples, we demonstrate that our proposed methodologies stand out among existing methods in both reducing post-selection bias and improving confidence interval coverage rates, and discuss how they can be conveniently adapted to real-life scenarios.
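The post-selection bias the abstract describes can be illustrated with a small simulation. The sketch below is not the paper's proposed method; it assumes a simple normal-normal model (effect-size spread `tau` and standard error `se` are made-up values) to show that significant estimates overstate true effects, and that a textbook empirical Bayes shrinkage estimator reduces this exaggeration:

```python
import numpy as np

# Illustrative sketch (not the paper's method): simulate many A/B tests,
# keep only the statistically significant ones, and measure the
# winner's-curse bias; then apply normal-normal empirical Bayes shrinkage.
rng = np.random.default_rng(0)

n_exp = 100_000
tau = 0.5                        # assumed sd of true effects across experiments
se = 1.0                         # assumed standard error of each estimate
true_effect = rng.normal(0.0, tau, n_exp)
estimate = true_effect + rng.normal(0.0, se, n_exp)

selected = np.abs(estimate / se) > 1.96      # "significant" results only

# Post-selection bias: among significant results, the raw estimates
# overstate the magnitude of the true effects.
bias_raw = np.mean(np.abs(estimate[selected]) - np.abs(true_effect[selected]))

# Empirical Bayes: estimate the prior variance from the data, then shrink
# each estimate toward zero by the posterior-mean factor.
tau2_hat = max(np.var(estimate) - se**2, 0.0)
shrunk = estimate * tau2_hat / (tau2_hat + se**2)
bias_eb = np.mean(np.abs(shrunk[selected]) - np.abs(true_effect[selected]))

print(f"significant fraction: {selected.mean():.3f}")
print(f"bias of raw estimates (selected):   {bias_raw:+.3f}")
print(f"bias after EB shrinkage (selected): {bias_eb:+.3f}")
```

Under this toy model the raw bias is large and positive, while the shrunken estimates are far closer to the truth; the paper's contribution is to make such corrections reliable in realistic, large-scale settings.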