For Additional Information, Please Contact
the Wharton Department of Statistics
PhD, Cornell University, 1964
BS, California Institute of Technology, 1961
Member, National Academy of Sciences
DSc (honorary) Purdue University, 1993
Fellow, Institute of Mathematical Statistics and American Statistical Association
Recipient, Wilks Memorial Award (of the American Statistical Association), 2002
CR and B Rao Prize in statistics
Provost’s Award for Doctoral Education (UPenn)
Wharton: 1994-2018 (named Miers Busch, W’1885, Professor, 1994).
Previous appointments: Cornell University; Rutgers University; University of California, Berkeley.
Visiting appointments: University of California, Los Angeles; Hebrew University; Technion, Haifa, Israel; Birkbeck College, London; Peking University and Chinese National Academy of Sciences, Beijing
National Academy of Sciences, Section 32 Chairman (Applied Mathematical Sciences), 2000-2002
Member, NRC Select Committee to Review U.S. Census for 2000, 1998-2004
Member, NRC Committee on National Statistics, 1999-2005
Chairman, NRC Committee on National Statistics, 2010-2018
Member, NAS Report Review Committee
Chairman, NRC Committee to Review Research and Development Statistics program at NSF, 2002-2005
Member, NRC Panel on Coverage Evaluation in the 2010 Census, 2004-2008
Arun Kumar Kuchibhotla, Lawrence D. Brown, Andreas Buja, Edward I. George, Linda Zhao (2020), Valid Post-selection Inference in Assumption-lean Linear Regression, Annals of Statistics, (to appear).
Abstract: This paper provides multiple approaches to perform valid post-selection inference in an assumption-lean regression analysis. To the best of our knowledge, this is the first work that provides valid post-selection inference for regression analysis in such general settings, which include independent and m-dependent random variables.
Arun Kumar Kuchibhotla, Lawrence D. Brown, Andreas Buja, Edward I. George, Linda Zhao (2020), A Model Free Perspective for Linear Regression: Uniform-in-model Bounds for Post Selection Inference, Econometric Theory, (to appear).
Abstract: For the last two decades, high-dimensional data and methods have proliferated throughout the literature. The classical technique of linear regression, however, has not lost its touch in applications. Most high-dimensional estimation techniques can be seen as variable selection tools which lead to a smaller set of variables where the classical linear regression technique applies. In this paper, we prove estimation error and linear representation bounds for the linear regression estimator uniformly over (many) subsets of variables. Based on deterministic inequalities, our results provide “good” rates when applied to both independent and dependent data. These results are useful in correctly interpreting the linear regression estimator obtained after exploring the data and also in post model-selection inference. All the results are derived under no model assumptions and are non-asymptotic in nature.
Andreas Buja, Lawrence D. Brown, Richard A. Berk, Edward I. George, Emil Pitkin, Mikhail Traskin, Kai Zhang, Linda Zhao (2019), Models as Approximations I: Consequences Illustrated with Linear Regression, Statistical Science, 34 (4), pp. 523-544.
Andreas Buja, Lawrence D. Brown, Arun Kumar Kuchibhotla, Richard A. Berk, Edward I. George, Linda Zhao (2019), Models as Approximations II: A Model-Free Theory of Parametric Regression, Statistical Science, 34 (4), pp. 545-565.
Anru Zhang, Lawrence D. Brown, Tony Cai (2019), Semi-supervised Inference: General Theory and Estimation of Means, Annals of Statistics, 47 (5), pp. 2538-2566.
Daniel McCarthy, Kai Zhang, Lawrence D. Brown, Richard A. Berk, Andreas Buja, Edward I. George, Linda Zhao (2018), Calibrated Percentile Double Bootstrap For Robust Linear Regression Inference, Statistica Sinica, 28 (4), pp. 2565-2589.
Arun Kumar Kuchibhotla, Lawrence D. Brown, Andreas Buja (Working), Model-free Study of Ordinary Least Squares Linear Regression.
This course covers elements of (non-measure theoretic) probability necessary for the further study of statistics and biostatistics. Topics include set theory, axioms of probability, counting arguments, conditional probability, random variables and distributions, expectations, generating functions, families of distributions, joint and marginal distributions, hierarchical models, covariance and correlation, random sampling, sampling properties of statistics, modes of convergence, and random number generation. Prerequisites: two semesters of calculus (through multivariate calculus), linear algebra, or permission of the instructor.
This class will cover the fundamental concepts of statistical inference. Topics include sufficiency, consistency, finding and evaluating point estimators, finding and evaluating interval estimators, hypothesis testing, and asymptotic evaluations for point and interval estimation. Prerequisite: if course requirements are not met, permission of the instructor is required.
This class will cover the fundamental concepts of statistical inference. Topics include sufficiency, consistency, finding and evaluating point estimators, finding and evaluating interval estimators, hypothesis testing, and asymptotic evaluations for point and interval estimation.
Data summaries and descriptive statistics; introduction to a statistical computer package; Probability: distributions, expectation, variance, covariance, portfolios, central limit theorem; statistical inference of univariate data; Statistical inference for bivariate data: inference for intrinsically linear simple regression models. This course will have a business focus, but is not inappropriate for students in the college. This course may be taken concurrently with the prerequisite with instructor permission.
Continuation of STAT 101. A thorough treatment of multiple regression, model selection, analysis of variance, linear logistic regression; introduction to time series. Business applications. This course may be taken concurrently with the prerequisite with instructor permission.
Further development of the material in STAT 111, in particular the analysis of variance, multiple regression, non-parametric procedures and the analysis of categorical data. Data analysis via statistical packages. This course may be taken concurrently with the prerequisite with instructor permission.
Discrete and continuous sample spaces and probability; random variables, distributions, independence; expectation and generating functions; Markov chains and recurrence theory.
Graphical displays; one- and two-sample confidence intervals; one- and two-sample hypothesis tests; one- and two-way ANOVA; simple and multiple linear least-squares regression; nonlinear regression; variable selection; logistic regression; categorical data analysis; goodness-of-fit tests. A methodology course. This course does not have business applications but has significant overlap with STAT 101 and 102. This course may be taken concurrently with the prerequisite with instructor permission.
Elements of matrix algebra. Discrete and continuous random variables and their distributions. Moments and moment generating functions. Joint distributions. Functions and transformations of random variables. Law of large numbers and the central limit theorem. Point estimation: sufficiency, maximum likelihood, minimum variance. Confidence intervals. A one-year course in calculus is recommended.
Graphical displays; one- and two-sample confidence intervals; one- and two-sample hypothesis tests; one- and two-way ANOVA; simple and multiple linear least-squares regression; nonlinear regression; variable selection; logistic regression; categorical data analysis; goodness-of-fit tests. A methodology course.
An introduction to the mathematical theory of statistics. Estimation, with a focus on properties of sufficient statistics and maximum likelihood estimators. Hypothesis testing, with a focus on likelihood ratio tests and the consequent development of "t" tests and hypothesis tests in regression and ANOVA. Nonparametric procedures.
Written permission of instructor, the department MBA advisor and course coordinator required to enroll.
This seminar will be taken by doctoral candidates after the completion of most of their coursework. Topics vary from year to year and are chosen from advanced probability, statistical inference, robust methods, and decision theory, with principal emphasis on applications.
Written permission of instructor and the department course coordinator required to enroll.
For most of the past century, factories offered a path upward for Americans short on education. But millions of “good” manufacturing jobs have fallen victim to automation and global competition, leaving many low- and semi-skilled workers to turn to a 21st century replacement: the telephone call center. What are the advantages of call centers, how can the technology best be used and what is the outlook for call-center employment in the next decade? Knowledge @ Wharton - 2002/04/10