305 Wharton Academic Research Building
265 South 37th Street
Philadelphia, PA 19104
Research Interests: Statistics and machine learning
The main research interests of my group are:
The group is always looking to expand. We are recruiting PhD students at Penn to work on problems in statistics and machine learning. PhD applicants interested in working with me should mention this in their application. Please apply through both the Statistics department and the AMCS program, as this increases the chances of admission.
Seminar class in Fall 2019: Topics in Deep Learning (STAT-991), surveying advanced topics in deep learning research based on student presentations. See the GitHub page for the class materials.
Education (cv):
Recent news:
Miscellanea:
Talk slides: GitHub. Google Scholar.
David Hong, Yue Sheng, Edgar Dobriban, Selecting the number of components in PCA via random signflips.
Xiaoxia Wu, Edgar Dobriban, Tongzheng Ren, Shanshan Wu, Zhiyuan Li, Suriya Gunasekar, Rachel Ward, Qiang Liu (2020), Implicit regularization of normalization methods, Neural Information Processing Systems (NeurIPS) 2020.
Jonathan Lacotte, Sifan Liu, Edgar Dobriban, Mert Pilanci (2020), Limiting spectrum of the randomized Hadamard transform and optimal iterative sketching methods, Neural Information Processing Systems (NeurIPS) 2020.
Shuxiao Chen, Edgar Dobriban, Jane H Lee (2020), A group-theoretic framework for data augmentation, Journal of Machine Learning Research (JMLR) and Neural Information Processing Systems (NeurIPS) 2020.
Michal Derezinski, Zhenyu Liao, Edgar Dobriban, Michael W. Mahoney, Sparse sketches with small inversion bias.
Licong Lin and Edgar Dobriban, What causes the test error? Going beyond bias-variance via ANOVA.
Yinjun Wu, Edgar Dobriban, Susan Davidson (2020), DeltaGrad: Rapid retraining of machine learning models, International Conference on Machine Learning (ICML) 2020.
Fan Yang, Sifan Liu, Edgar Dobriban, David P. Woodruff, How to reduce dimension with PCA and random projections?
Sifan Liu and Edgar Dobriban (2020), Ridge regression: Structure, cross-validation, and sketching, International Conference on Learning Representations (ICLR).
Alnur Ali, Edgar Dobriban, Ryan J. Tibshirani (2020), The implicit regularization of stochastic gradient flow for least squares, International Conference on Machine Learning (ICML) 2020.
This page has links to software implementing methods from my papers. Feel free to contact me if you are interested in using them.
The ePCA method for principal component analysis of exponential family data, e.g., Poisson-modeled count data (with L.T. Liu).
Methods for working with large random data matrices, including:
P-value weighting techniques for multiple hypothesis testing. These can improve power when prior information about the individual effect sizes is available. Includes the iGWAS method for Genome-Wide Association Studies.
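As a generic illustration of the p-value weighting idea (a standard weighted Bonferroni procedure, not the iGWAS method itself), prior weights can redistribute the testing budget toward hypotheses believed to have larger effects; the p-values, weights, and significance level below are made up for the example:

```python
import numpy as np

def weighted_bonferroni(pvals, weights, alpha=0.05):
    """Weighted Bonferroni: reject hypothesis i if p_i <= alpha * w_i / m,
    with the weights normalized to average 1 so the thresholds sum to alpha
    (which preserves family-wise error control)."""
    w = np.asarray(weights, dtype=float)
    w = w * len(w) / w.sum()            # normalize weights to mean 1
    p = np.asarray(pvals, dtype=float)
    return p <= alpha * w / len(w)      # per-hypothesis thresholds

# Three hypotheses; prior information suggests the second has a large effect.
pvals = np.array([0.004, 0.03, 0.30])
weights = np.array([1.0, 4.0, 1.0])
print(weighted_bonferroni(pvals, weights))
```

With these numbers, the weighted procedure rejects the first two hypotheses, while unweighted Bonferroni (threshold 0.05/3) would reject only the first, illustrating the power gain when the prior weights are informative.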
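The covariance-debiasing idea underlying ePCA for Poisson counts can be sketched as follows. This is only a minimal illustration of the diagonal-debiasing step under a Poisson model, on simulated data with made-up dimensions; the full ePCA method involves further steps (e.g., shrinkage of the debiased covariance):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate count data: a low-rank nonnegative mean matrix plus Poisson noise.
n, p, k = 2000, 50, 2
U = rng.uniform(1.0, 2.0, size=(n, k))
V = rng.uniform(1.0, 2.0, size=(k, p))
X = rng.poisson(U @ V)                   # observed counts

# Plain sample covariance of the features.
S = np.cov(X, rowvar=False)

# For Poisson data, Var(X_j) = E[X_j] plus the variance of the underlying mean,
# so the sample covariance is inflated on the diagonal by roughly E[X_j].
# Subtracting the diagonal of feature means removes this noise bias.
S_debiased = S - np.diag(X.mean(axis=0))

# Eigendecompose the debiased covariance to estimate the principal components.
evals = np.linalg.eigvalsh(S_debiased)[::-1]
print(evals[:3])
```

The debiasing shrinks the diagonal, so the leading eigenvalues of the debiased matrix reflect the low-rank signal rather than the Poisson noise floor.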