465 Jon M. Huntsman Hall
3730 Walnut Street
Philadelphia, PA 19104
Research Interests: High-dimensional asymptotics, random matrix theory, multiple testing.
My research focuses on statistical methods for “big” data. On the theoretical side, I leverage results from random matrix theory for the analysis of multivariate data when the dimension and sample size are large. On the applied side, I have developed methods for multiple testing motivated by the genomics of exceptional human longevity.
I obtained my PhD in Statistics from Stanford University in June 2017. My PhD advisor was David Donoho, and I collaborated with Art Owen, with Stuart Kim’s lab, and with Amit Singer’s group. In 2012, I obtained a BA in mathematics from Princeton University.
Talk slides: GitHub.
Edgar Dobriban and Stefan Wager (2018), High-dimensional asymptotics of prediction: ridge regression and classification, The Annals of Statistics, 46 (1), pp. 247-279.
Edgar Dobriban (2017), Weighted mining of massive collections of p-values by convex optimization, Information and Inference: A Journal of the IMA, 7 (2), pp. 251-275.
Edgar Dobriban, William Leeb, Amit Singer (Under Review), Optimal prediction in the linearly transformed spiked model.
Description: This paper supersedes the older Dobriban, Leeb, Singer manuscript "PCA from noisy, linearly reduced data: the diagonal case".
Edgar Dobriban (2017), Sharp detection in PCA under correlations: all eigenvalues matter, The Annals of Statistics, 45 (4), pp. 1810-1833.
Kristen Fortney, Edgar Dobriban, Paolo Garagnani, Chiara Pirazzini, Daniela Monti, Daniela Mari, Gil Atzmon, Nir Barzilai, Claudio Franceschi, Art B. Owen, Stuart K. Kim (2015), Genome-wide scan informed by age-related disease identifies loci for exceptional human longevity, PLOS Genetics, 11 (12), pp. 1-23.
Edgar Dobriban (2015), Efficient computation of limit spectra of sample covariance matrices, Random Matrices: Theory and Applications, 4 (4).
Edgar Dobriban, Kristen Fortney, Stuart K. Kim, Art B. Owen (2015), Optimal multiple testing under a Gaussian prior on the effect sizes, Biometrika, 102 (4), pp. 753-766.
Discrete and continuous sample spaces and probability; random variables, distributions, independence; expectation and generating functions; Markov chains and recurrence theory.
Elements of matrix algebra. Discrete and continuous random variables and their distributions. Moments and moment generating functions. Joint distributions. Functions and transformations of random variables. Law of large numbers and the central limit theorem. Point estimation: sufficiency, maximum likelihood, minimum variance. Confidence intervals.
This seminar will be taken by doctoral candidates after the completion of most of their coursework. Topics vary from year to year and are chosen from advanced probability, statistical inference, robust methods, and decision theory, with principal emphasis on applications.
This page has links to software implementing methods developed in my papers. The software is usually hosted on GitHub, where I also provide code to reproduce the computational results of my publications.
Feel free to contact me if you are interested in using this software.
Contains the ePCA method for principal component analysis of exponential family data, e.g. Poisson-modeled count data.
Implements methods for denoising individual datapoints. (with L.T. Liu)
Related paper: Liu et al., 2016.
Contains methods for working with large random matrices, including
Implements p-value weighting techniques for multiple hypothesis testing. These methods can improve power in multiple testing when prior information about the individual effect sizes is available. Includes the iGWAS method, designed for Genome-Wide Association Studies.
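To illustrate the general idea of p-value weighting, here is a minimal sketch of the standard weighted Bonferroni procedure. It is not the software described above or the iGWAS method. The function name `weighted_bonferroni`, the example p-values, and the example weights are all hypothetical and chosen only for illustration. Each hypothesis is rejected when its p-value falls below `alpha * w_i / m`, where the non-negative weights are normalized to average one.

```python
def weighted_bonferroni(p_values, weights, alpha=0.05):
    """Return a list of booleans: True where hypothesis i is rejected.

    Hypothetical helper for illustration only. Hypothesis i is rejected
    when p_i <= alpha * w_i / m, with the weights rescaled to average 1.
    """
    m = len(p_values)
    if len(weights) != m:
        raise ValueError("p_values and weights must have the same length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")

    # Rescale so the weights average to one; the thresholds then sum to alpha.
    normalized = [w * m / total for w in weights]
    return [p <= alpha * w / m for p, w in zip(p_values, normalized)]


# Made-up p-values; the second hypothesis gets a larger weight,
# standing in for prior information that its effect is likely larger.
p_values = [0.001, 0.02, 0.04, 0.30]
weights = [1.0, 2.5, 0.25, 0.25]

weighted = weighted_bonferroni(p_values, weights)
unweighted = weighted_bonferroni(p_values, [1.0] * len(p_values))
print("weighted:  ", weighted)
print("unweighted:", unweighted)
```

In this toy example the second hypothesis is rejected only under weighting, which shows how informative weights can increase power. The weights must be chosen independently of the p-values being tested for the usual error control to hold.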