MODEL MISSPECIFICATION IN MICROBIOME STUDIES

AMY WILLIS – UNIVERSITY OF WASHINGTON

ABSTRACT

The relative abundances of bacterial species in a microbiome are important parameters to estimate, given the critical role that microbiomes play in human and environmental health. By analyzing data from artificially constructed microbiomes, we show that high-throughput sequencing distorts the true composition of microbial communities. We propose a statistical model for microbiome data that reflects this observation, and a stable algorithm for estimating the model parameters. Notably, our model and estimation procedure permit relative abundances to lie on the boundary of the simplex. We conclude with examples of the utility of the method, and recommendations for the design and analysis of microbiome studies. Our approach can be used to select experimental protocols, design experiments with appropriate control data, and remove sample-specific contamination. This is joint work with David Clausen.

Related Paper: https://arxiv3.org/abs/2204.1273

BIO

Amy Willis is the Principal Investigator of the Statistical Diversity Lab and a tenure-track Assistant Professor in the Department of Biostatistics at the University of Washington. Amy and the StatDivLab develop tools for the analysis of microbiome and biodiversity data. Amy is passionate about reproducible science, meaningful data analysis, ecosystem and host health, and collaborating with scientists who share these values. Amy is the recipient of an NIH Outstanding Investigator Award, a UW Outstanding Faculty Mentor Award, and a UW Outstanding Faculty Teaching Award.