Edward I. George

  • Universal Furniture Professor
  • Professor of Statistics

Contact Information

  • Office Address:

    446 Jon M. Huntsman Hall
    3730 Walnut Street
    Philadelphia, PA 19104

Research Interests: hierarchical modeling, model uncertainty, shrinkage estimation, treed modeling, variable selection, wavelet regression

Links: CV

Overview

Education

PhD, Stanford University, 1981
MS, SUNY at Stony Brook, 1976
AB, Cornell University, 1972

Career and Recent Professional Awards

Elected Fellow of the International Society for Bayesian Analysis (2014); Elected Fellow of the American Statistical Association (1997); Elected Fellow of the Institute of Mathematical Statistics (1995).
CBA Foundation Award for Outstanding Research Contributions (1998) and the CBA Foundation Award for Research Excellence (1995), The University of Texas at Austin.
Excellence in Education Award (2001) and the Joe D. Beasley Award for Teaching Excellence (1996), The University of Texas at Austin.
McKinsey Award for Excellence in Teaching (1987) and the Emory Williams Award for Excellence in Teaching (1987), The University of Chicago.

Academic Positions Held

Wharton: 2001-present (Chairperson, Statistics Department, 2008-2014; named Universal Furniture Professor, 2002)
Previous appointments: University of Texas at Austin; University of Chicago.
Visiting Appointments: Cambridge University; University of Paris; University of Valencia

Professional Leadership

Editor, Annals of Statistics, 2016-2018; Executive Editor, Statistical Science, 2004-2007.

For more information, go to My Personal Page

Research

  • Daniel McCarthy, Kai Zhang, Lawrence D. Brown, Richard A. Berk, Andreas Buja, Edward I. George, Linda Zhao (2018), Calibrated Percentile Double Bootstrap For Robust Linear Regression Inference, Statistica Sinica, (in press).

  • Richard A. Berk, Lawrence D. Brown, Andreas Buja, Edward I. George, Linda Zhao (2018), Working with Misspecified Regression Models, Journal of Quantitative Criminology, (in press).

  • Veronika Rockova and Edward I. George (2018), The Spike-and-Slab LASSO, Journal of the American Statistical Association, Theory and Methods, 113 (521), pp. 431-444.

  • Arun Kumar Kuchibhotla, Lawrence D. Brown, Andreas Buja, Edward I. George, Linda Zhao (Working), A Model Free Perspective for Linear Regression: Uniform-in-model Bounds for Post Selection Inference.

    Abstract: For the last two decades, high-dimensional data and methods have proliferated throughout the literature. The classical technique of linear regression, however, has not lost its touch in applications. Most high-dimensional estimation techniques can be seen as variable selection tools which lead to a smaller set of variables where classical linear regression technique applies. In this paper, we prove estimation error and linear representation bounds for the linear regression estimator uniformly over (many) subsets of variables. Based on deterministic inequalities, our results provide “good” rates when applied to both independent and dependent data. These results are useful in correctly interpreting the linear regression estimator obtained after exploring the data and also in post model-selection inference. All the results are derived under no model assumptions and are non-asymptotic in nature.

  • Gemma Moran, Veronika Rockova, Edward I. George (Under Review), On Variance Estimation for Bayesian Variable Selection.

  • Matthew T. Pratola, Hugh A. Chipman, Edward I. George, Robert E. McCulloch (Under Review), Heteroscedastic BART Using Multiplicative Regression Trees.

  • Sameer Deshpande, Veronika Rockova, Edward I. George (Under Review), Simultaneous Variable and Covariance Selection with the Multivariate Spike-and-Slab Lasso.

  • Arun Kumar Kuchibhotla, Lawrence D. Brown, Andreas Buja, Richard A. Berk, Linda Zhao, Edward I. George (Working), Valid Post-selection Inference in Assumption-lean Linear Regression.

    Abstract: This paper provides multiple approaches to performing valid post-selection inference in an assumption-lean regression analysis. To the best of our knowledge, this is the first work that provides valid post-selection inference for regression analysis in such general settings, which include independent and m-dependent random variables.

  • Edward I. George, Veronika Rockova, Paul R. Rosenbaum, Ville Satopää, Jeffrey H. Silber (2017), Mortality Rate Estimation and Standardization for Public Reporting: Medicare’s Hospital Compare, Journal of the American Statistical Association, Applications and Case Studies, 112 (519), pp. 933-947.

    Abstract: Bayesian models are increasingly fit to large administrative data sets and then used to make individualized recommendations. In particular, Medicare's Hospital Compare webpage provides information to patients about specific hospital mortality rates for a heart attack or Acute Myocardial Infarction (AMI). Hospital Compare's current recommendations are based on a random-effects logit model with a random hospital indicator and patient risk factors. Except for the largest hospitals, these individual recommendations or predictions are not checkable against data, because data from smaller hospitals are too limited to provide a meaningful check. Before individualized Bayesian recommendations, people derived general advice from empirical studies of many hospitals; e.g., prefer hospitals of type 1 to type 2 because the risk is lower at type 1 hospitals. Here we calibrate these Bayesian recommendation systems by checking, out of sample, whether their predictions aggregate to give correct general advice derived from another sample. This process of calibrating individualized predictions against general empirical advice leads to substantial revisions in the Hospital Compare model for AMI mortality. In order to make appropriately calibrated predictions, our revised models incorporate information about hospital volume, nursing staff, medical residents, and the hospital's ability to perform cardiovascular procedures. For the ultimate purpose of comparisons, hospital mortality rates must be standardized to adjust for patient mix variation across hospitals. We find that indirect standardization, as currently used by Hospital Compare, fails to adequately control for differences in patient risk factors and systematically underestimates mortality rates at the low volume hospitals. To provide good control and correctly calibrated rates, we propose direct standardization instead.  

  • Veronika Rockova and Edward I. George (2017), Fast Bayesian Factor Analysis via Automatic Rotations to Sparsity, Journal of the American Statistical Association, Theory and Methods, 111, pp. 1608-1622.

    Abstract: Rotational transformations have traditionally played a key role in enhancing the interpretability of factor analysis via post-hoc modifications of the factor model orientation. Regularization methods also serve to achieve this goal by prioritizing sparse loading matrices. In this work, we cross-fertilize these two paradigms within a unifying Bayesian framework. Our approach deploys intermediate factor rotations throughout the learning process, greatly enhancing the effectiveness of sparsity inducing priors. These automatic rotations to sparsity are embedded within a PXL-EM algorithm, a Bayesian variant of parameter-expanded EM for posterior mode detection. By iterating between soft-thresholding of small factor loadings and transformations of the factor basis, we obtain (a) dramatic accelerations, (b) robustness against poor initializations and (c) better oriented sparse solutions. For accurate recovery of factor loadings, we deploy a two-component refinement of the Laplace prior, the spike-and-slab LASSO prior. The potential of the proposed procedure is demonstrated on both simulated and real high-dimensional data, which would render posterior simulation impractical.

Teaching

Current Courses

  • STAT621 - Accelerated Regression Analysis For Business

    STAT 621 is intended for students with recent, practical knowledge of the use of regression analysis in the context of business applications. This course covers the material of STAT 613, but omits the foundations to focus on regression modeling. The course reviews statistical hypothesis testing and confidence intervals for the sake of standardizing terminology and introducing software, and then moves into regression modeling. The pace presumes recent exposure to both the theory and practice of regression and will not be accommodating to students who have not seen or used these methods previously. The interpretation of regression models within the context of applications will be stressed, presuming knowledge of the underlying assumptions and derivations. The scope of regression modeling that is covered includes multiple regression analysis with categorical effects, regression diagnostic procedures, interactions, and time series structure. The presentation of the course relies on computer software that will be introduced in the initial lectures.

    STAT621001 ( Syllabus )

    STAT621003 ( Syllabus )

    STAT621005 ( Syllabus )

Past Courses

  • STAT399 - Independent Study

  • STAT613 - Regression Analysis for Business

    This course provides the fundamental methods of statistical analysis, the art and science of extracting information from data. The course will begin with a focus on the basic elements of exploratory data analysis, probability theory and statistical inference. With this as a foundation, it will proceed to explore the use of the key statistical methodology known as regression analysis for solving business problems, such as the prediction of future sales and the response of the market to price changes. The use of regression diagnostics and various graphical displays supplements the basic numerical summaries and provides insight into the validity of the models. Specific important topics covered include least squares estimation, residuals and outliers, tests and confidence intervals, correlation and autocorrelation, collinearity, and randomization. The presentation relies upon computer software for most of the needed calculations, and the resulting style focuses on construction of models, interpretation of results, and critical evaluation of assumptions.

  • STAT621 - Accelerated Regression Analysis for Business

  • STAT995 - Dissertation

  • STAT999 - Independent Study

Awards and Honors

  • Cornell University Distinguished Alumni for the Department of Statistical Sciences, 2018
  • Simons Fellowship, Isaac Newton Institute for Mathematical Sciences, Cambridge, 2018
  • Bohrer Lecturer, University of Illinois at Urbana-Champaign, 2017
  • Fellow, International Society for Bayesian Analysis, 2014
  • Challis Lecturer and Award for Outstanding Contributions to Statistics, University of Florida, 2012
  • Geisser Distinguished Lecturer, University of Minnesota, 2012
  • Palmetto Lecturer, University of South Carolina, 2012
  • Loeb Lecturer, Washington University in St. Louis, 2011
  • Medallion Lecturer, Institute of Mathematical Statistics, 2010
  • Penn IUR Faculty Fellow, 2009-2010
  • Hartley Memorial Lecturer, Texas A&M University, 2007
  • Wharton Core Professor Award, “Tough But We’ll Thank You in Five Years”, 2004
  • Wharton Core Professor Award, “Goes Above and Beyond”, 2004
  • ISI Highly Cited Researcher in Mathematics, 2004
  • Excellence in Education Award, The University of Texas at Austin, 2001
  • Faculty Honor Roll for Core Class Teaching, The University of Texas at Austin, 2000
  • Lawrence Baxter Memorial Lecturer, SUNY at Stony Brook, 1998
  • Dean’s Fellow, The University of Texas at Austin, 1998-1999
  • CBA Foundation Award for Outstanding Research Contributions, The University of Texas at Austin, 1998
  • Fellow, American Statistical Association, 1997
  • Fellow, Center for Management of Operations and Logistics, The University of Texas at Austin, 1996
  • Member, International Statistical Institute, 1996
  • Joe D. Beasley Award for Teaching Excellence, The University of Texas at Austin, 1996
  • Fellow, Institute of Mathematical Statistics, 1995
  • CBA Foundation Award for Research Excellence, The University of Texas at Austin, 1995
  • GBC Award for Excellence in Teaching of the Core Curriculum, Graduate School of Business, The University of Texas at Austin, 1993
  • Spurgeon Bell Centennial Fellowship, Graduate School of Business, The University of Texas at Austin, 1993-1994
  • McKinsey Award for Excellence in Teaching, The University of Chicago, 1987
  • The Emory Williams Award for Excellence in Teaching, The University of Chicago, 1987

In the News

Knowledge @ Wharton

‘Every Time a Bell Rings …’

It’s a Wonderful Life is a Christmas classic, but Wharton statistics professor Edward George says it should also be required viewing for business leaders.

Knowledge @ Wharton - 2013/04/8