Weijie Su

Associate Professor of Statistics and Data Science
Associate Professor of Computer and Information Science (secondary appointment)

Contact Information

Primary Email:
suw@wharton.upenn.edu
Office Phone:
215-746-8565

office Address:
411 Academic Research Building
265 South 37th Street
Philadelphia, PA 19104

Research Interests: statistical machine learning, high-dimensional inference, large-scale multiple testing, optimization, and privacy-preserving data analysis.

Links: Personal Website

Research

Jinho Bok, Weijie Su, Jason Altschuler (Working), Shifted Interpolation for Differential Privacy.
Richard A. Berk, Andreas Buja, Lawrence D. Brown, Edward I. George, Arun Kumar Kuchibhotla, Weijie Su, Linda Zhao (2021), Assumption Lean Regression, American Statistician, 75 (1), pp. 76-84.
Matteo Sordello, Hangfeng He, Weijie Su (Working), Robust Learning Rate Selection for Stochastic Optimization via Splitting Diagnostic.
Abstract: This paper proposes SplitSGD, a new dynamic learning rate schedule for stochastic optimiza- tion. This method decreases the learning rate for better adaptation to the local geometry of the objective function whenever a stationary phase is detected, that is, the iterates are likely to bounce at around a vicinity of a local minimum. The detection is performed by splitting the single thread into two and using the inner product of the gradients from the two threads as a measure of stationarity. Owing to this simple yet provably valid stationarity detection, SplitSGD is easy-to-implement and essentially does not incur additional computational cost than standard SGD. Through a series of extensive experiments, we show that this method is appropriate for both convex problems and training (non-convex) neural networks, with performance compared favorably to other stochastic optimization methods. Importantly, this method is observed to be very robust with a set of default parameters for a wide range of problems and, moreover, yields better generalization performance than other adaptive gradient methods such as Adam.
Hangfeng He and Weijie Su (2020), The Local Elasticity of Neural Networks, International Conference on Learning Representations (ICLR), (to appear) ().
Zhiqi Bu, Jinshuo Dong, Qi Long, Weijie Su (Working), Deep Learning with Gaussian Differential Privacy.
Bin Shi, Simon S. Du, Weijie Su, Michael I. Jordan (2019), Acceleration via Symplectic Discretization of High-Resolution Differential Equations, Advances in Neural Information Processing Systems 32.
Zhiqi Bu, Jason Klusowski, Cynthia Rush, Weijie Su (2019), Algorithmic Analysis and Statistical Estimation of SLOPE via Approximate Message Passing, Advances in Neural Information Processing Systems 32.
Jinshuo Dong, Aaron Roth, Weijie Su (Working), Gaussian Differential Privacy.
Qingyuan Zhao, Dylan Small, Weijie Su (2019), Multiple Testing When Many p-Values are Uniformly Conservative, with Application to Testing Qualitative Interaction in Educational Interventions, Journal of the American Statistical Association, 114 (527), pp. 1291-1304.
Damian Brzyski, Alexej Gossmann, Weijie Su, Malgorzata Bogdan (2019), Group SLOPE – Adaptive Selection of Groups of Predictors, Journal of the American Statistical Association, 114 (525), pp. 419-433.

Teaching

All Courses

AMCS5999 - Independent Study
Independent Study allows students to pursue academic interests not available in regularly offered courses. Students must consult with their academic advisor to formulate a project directly related to the student’s research interests. All independent study courses are subject to the approval of the AMCS Graduate Group Chair.
AMCS9999 - Ind Study & Research
Study under the direction of a faculty member.
STAT4050 - Stat Computing with R
The goal of this course is to introduce students to the R programming language and related eco-system. This course will provide a skill-set that is in demand in both the research and business environments. In addition, R is a platform that is used and required in other advanced classes taught at Wharton, so that this class will prepare students for these higher level classes and electives.
STAT4310 - Statistical Inference
Graphical displays; one- and two-sample confidence intervals; one- and two-sample hypothesis tests; one- and two-way ANOVA; simple and multiple linear least-squares regression; nonlinear regression; variable selection; logistic regression; categorical data analysis; goodness-of-fit tests. A methodology course. This course does not have business applications but has significant overlap with STAT 1010 and 1020. This course may be taken concurrently with the prerequisite with instructor permission.
STAT5110 - Statistical Inference
Graphical displays; one- and two-sample confidence intervals; one- and two-sample hypothesis tests; one- and two-way ANOVA; simple and multiple linear least-squares regression; nonlinear regression; variable selection; logistic regression; categorical data analysis; goodness-of-fit tests. A methodology course.
STAT7050 - Stat Computing with R
The goal of this course is to introduce students to the R programming language and related eco-system. This course will provide a skill-set that is in demand in both the research and business environments. In addition, R is a platform that is used and required in other advanced classes taught at Wharton, so that this class will prepare students for these higher level classes and electives.
STAT9910 - Sem in Adv Appl of Stat
This seminar will be taken by doctoral candidates after the completion of most of their coursework. Topics vary from year to year and are chosen from advance probability, statistical inference, robust methods, and decision theory with principal emphasis on applications.
STAT9917 - Sem in Adv Appl of Stat
This seminar will be taken by doctoral candidates after the completion of most of their coursework. Topics vary from year to year and are chosen from advance probability, statistical inference, robust methods, and decision theory with principal emphasis on applications.
STAT9950 - Dissertation
Dissertation

Awards and Honors

Peter Gavin Hall IMS Early Career Prize, 2022 Description
For fundamental contributions to the development of privacy-preserving data analysis methodologies; for groundbreaking theoretical advancements in understanding gradient-based optimization methods; for outstanding contributions to high-dimensional statistics, including false discovery rate control and limits in sparsity estimation; for wide-ranging contributions to the theoretical foundation of deep learning.
Society for Industrial and Applied Mathematics (SIAM) Early Career Prize in Data Science, 2022
Facebook Faculty Research Award, 2020
Alfred P. Sloan Research Fellowship, 2020
NSF CAREER Award, 2019-2024
Theodore W. Anderson Stanford Dissertation Award in Theoretical Statistics, 2016

Activity

Latest Research

Jinho Bok, Weijie Su, Jason Altschuler (Working), Shifted Interpolation for Differential Privacy.

All Research

In the News

Can We Still Detect AI-generated Content?

As models like GPT-4 and Claude get better at mimicking humans, researchers offer a new way to test watermarking — the hidden markers used to identify machine-made text.…Read More

Knowledge at Wharton - 4/28/2025

All News

Weijie Su

Contact Information

Overview

Education

Research

Teaching

All Courses

AMCS5999 - Independent Study

AMCS9999 - Ind Study & Research

STAT4050 - Stat Computing with R

STAT4310 - Statistical Inference

STAT5110 - Statistical Inference

STAT7050 - Stat Computing with R

STAT9910 - Sem in Adv Appl of Stat

STAT9917 - Sem in Adv Appl of Stat

STAT9950 - Dissertation

Awards and Honors

In the News

Knowledge at Wharton

Activity

Latest Research

In the News