305 Academic Research Building
265 South 37th Street
Philadelphia, PA 19104
Research Interests: Statistics and machine learning
The group is always looking to expand. We are recruiting PhD students at Penn to work on problems in statistics and machine learning. PhD applicants interested in working with me should mention this on their application. Please apply through the Department of Statistics and Data Science, the Department of Computer and Information Science, or the AMCS program; applying to more than one of these increases the chances of admission.
Education (cv):
Recent news:
Miscellanea:
Talk slides: GitHub. Google Scholar.
Preprints:
Xinmeng Huang, Shuo Li, Mengxin Yu, Matteo Sesia, Seyed Hamed Hassani, Insup Lee, Osbert Bastani, Edgar Dobriban, Uncertainty in Language Models: Assessment through Rank-Calibration.
Patrick Chao, Edoardo Debenedetti, Alexander Robey, Maksym Andriushchenko, Francesco Croce, Vikash Sehwag, Edgar Dobriban, Nicolas Flammarion, George J. Pappas, Florian Tramer, Seyed Hamed Hassani, Eric Wong, JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models.
Leda Wang, Zhixiang Zhang, Edgar Dobriban, Inference in Randomized Least Squares and PCA via Normality of Quadratic Forms.
Xianli Zeng, Guang Cheng, Edgar Dobriban, Minimax Optimal Fair Classification with Bounded Demographic Disparity.
Yonghoon Lee, Edgar Dobriban, Eric Tchetgen Tchetgen, Simultaneous Conformal Prediction of Missing Outcomes with Propensity Score ε-Discretization.
Xianli Zeng, Guang Cheng, Edgar Dobriban, Bayes-Optimal Fair Classification with Linear Disparity Constraints via Pre-, In-, and Post-processing.
Edgar Dobriban, Mengxin Yu, SymmPI: Predictive Inference for Data with Group Symmetries.
Wenwen Si, Sangdon Park, Insup Lee, Edgar Dobriban, Osbert Bastani, PAC Prediction Sets Under Label Shift.
Patrick Chao, Alexander Robey, Edgar Dobriban, Seyed Hamed Hassani, George J. Pappas, Eric Wong, Jailbreaking Black Box Large Language Models in Twenty Queries.
Behrad Moniri, Donghwan Lee, Seyed Hamed Hassani, Edgar Dobriban, A Theory of Non-Linear Feature Learning with One Gradient Step in Two-Layer Neural Networks.
Software:
This page has links to methods from my papers. Feel free to contact me if you are interested in using them.
The ePCA method for principal component analysis of exponential-family data, e.g., Poisson-distributed count data (with L.T. Liu); a minimal sketch of the core idea appears after this list.
Methods for working with large random data matrices, including
P-value weighting techniques for multiple hypothesis testing. These can improve power in multiple testing when prior information about the individual effect sizes is available. Includes the iGWAS method for Genome-Wide Association Studies; a generic weighted-Bonferroni sketch is given below.
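As a rough illustration of the ePCA idea in the Poisson case: the variance of a Poisson entry equals its mean, so the sample covariance of the counts is inflated on its diagonal by the column means, and subtracting that diagonal before the eigendecomposition debiases the estimate. The sketch below shows only this debiasing step, not the full method (which adds further corrections such as shrinkage); the function name, data shapes, and simulated example are illustrative assumptions, not the released ePCA code.

```python
import numpy as np

def epca_poisson_sketch(Y, k):
    """Diagonal-debiased PCA for Poisson counts (rows are samples).

    For Poisson data the noise variance equals the mean, so the sample
    covariance is debiased by subtracting diag(column means) before the
    eigendecomposition. This is one step of ePCA, not the full method.
    """
    Ybar = Y.mean(axis=0)                 # per-coordinate means
    S = np.cov(Y, rowvar=False)           # p x p sample covariance
    S_debiased = S - np.diag(Ybar)        # remove Poisson noise variance
    evals, evecs = np.linalg.eigh(S_debiased)
    top = np.argsort(evals)[::-1][:k]     # indices of the k largest eigenvalues
    return evals[top], evecs[:, top]

# Simulated example: a rank-one Poisson signal in 50 dimensions.
rng = np.random.default_rng(0)
u = rng.uniform(1.0, 3.0, size=50)                    # signal direction
rates = np.outer(rng.uniform(0.5, 2.0, size=500), u)  # 500 samples of Poisson means
Y = rng.poisson(rates)
vals, vecs = epca_poisson_sketch(Y, k=2)
```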
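Similarly, here is a minimal sketch of generic p-value weighting via the weighted Bonferroni procedure; this is a standard construction, not the specific iGWAS weighting scheme, and the function name and example weights are hypothetical. With nonnegative weights averaging to one, rejecting H_i when p_i <= alpha * w_i / m preserves family-wise error control while shifting power toward hypotheses believed a priori to have larger effects.

```python
import numpy as np

def weighted_bonferroni(pvals, weights, alpha=0.05):
    """Weighted Bonferroni: reject H_i when p_i <= alpha * w_i / m.

    The weights are normalized to average one, which preserves
    family-wise error rate control at level alpha.
    """
    pvals = np.asarray(pvals, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.mean()     # normalize to mean 1
    m = len(pvals)
    return pvals <= alpha * weights / m    # boolean rejection mask

# Hypothetical example: prior information upweights the first three tests.
pvals = np.array([0.0004, 0.02, 0.3, 0.0008, 0.9, 0.04])
weights = np.array([2.0, 2.0, 2.0, 0.5, 0.5, 0.5])
rejected = weighted_bonferroni(pvals, weights)
# -> [True, False, False, True, False, False]
```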