Descriptions of MBA Level Courses

STAT613 - REGR ANALYSIS FOR BUS

This course provides the fundamental methods of statistical analysis, the art and science of extracting information from data. The course will begin with a focus on the basic elements of exploratory data analysis, probability theory and statistical inference. With this as a foundation, it will proceed to explore the use of the key statistical methodology known as regression analysis for solving business problems, such as the prediction of future sales and the response of the market to price changes. Regression diagnostics and various graphical displays supplement the basic numerical summaries and provide insight into the validity of the models. Specific important topics covered include least squares estimation, residuals and outliers, tests and confidence intervals, correlation and autocorrelation, collinearity, and randomization. The presentation relies upon computer software for most of the needed calculations, and the resulting style focuses on construction of models, interpretation of results, and critical evaluation of assumptions.

Prerequisites: The basic mathematical skills covered in STAT 611, Mathematics for Business Analysis

Other Information: Lecture and discussion, assigned exercises, data analysis project, quizzes and a final exam.


STAT 621 is intended for students with recent, practical knowledge of the use of regression analysis in the context of business applications. This course covers the material of STAT 613, but omits the foundations to focus on regression modeling. The course reviews statistical hypothesis testing and confidence intervals for the sake of standardizing terminology and introducing software, and then moves into regression modeling. The pace presumes recent exposure to both the theory and practice of regression and will not be accommodating to students who have not seen or used these methods previously. The interpretation of regression models within the context of applications will be stressed, presuming knowledge of the underlying assumptions and derivations. The scope of regression modeling that is covered includes multiple regression analysis with categorical effects, regression diagnostic procedures, interactions, and time series structure. The presentation of the course relies on computer software that will be introduced in the initial lectures.

Prerequisites: Recent exposure to the theory and practice of regression modeling.

Other Information: Lecture and discussion, assigned exercises, data analysis, quizzes, and a final exam.

STAT701 - MODERN DATA MINING

Modern Data Mining: Statistics or Data Science has been evolving rapidly to keep up with the modern world. While classical multiple regression and logistic regression techniques continue to be the major tools, we go beyond them to include methods built on top of linear models, such as LASSO and Ridge regression. Contemporary methods such as KNN (K nearest neighbors), Random Forest, Support Vector Machines, Principal Component Analysis (PCA), the bootstrap and others are also covered. Text mining, especially through PCA, is another topic of the course. While learning all the techniques, we keep in mind that our goal is to tackle real problems. Not only do we go through a large collection of interesting, challenging real-life data sets, but we also learn how to use the free, powerful software "R" in connection with each of the methods presented in the class.

Prerequisites: STAT 613 or equivalent

STAT705 - STAT COMPUTING WITH R

The goal of this course is to introduce students to the R programming language and its related ecosystem. This course will provide a skill set that is in demand in both research and business environments. In addition, R is a platform that is used and required in other advanced classes taught at Wharton, so this class will prepare students for those higher-level classes and electives.

Prerequisites: STAT 613 or STAT 621 or waiving the Statistics Core completely.


This course provides an introduction to the wide range of techniques available for statistical forecasting. Qualitative techniques, smoothing and decomposition of time series, regression, adaptive methods, autoregressive-moving average modeling, and ARCH and GARCH formulations will be surveyed. The emphasis will be on applications, rather than technical foundations and derivations. The techniques will be studied critically, with examination of their usefulness and limitations.

Prerequisites: STAT 613 or equivalent


This course follows from the introductory regression classes, STAT 102, STAT 112, and STAT 431 for undergraduates and STAT 613 for MBAs. It extends the ideas from regression modeling, focusing on the core business task of predictive analytics as applied to realistic business-related data sets. In particular, it introduces automated model selection tools, such as stepwise regression, and various current model selection criteria, such as AIC and BIC. It delves into classification methodologies such as logistic regression. It also introduces classification and regression trees (CART) and the popular predictive methodology known as the random forest. By the end of the course the student will be familiar with and have applied all these tools and will be ready to use them in a work setting. The methodologies can all be implemented in either the JMP or R software packages.

Prerequisites: STAT 613 or STAT 621 or having waived the statistics core completely


This course introduces methods for the analysis of unstructured data, focusing on statistical models for text. Techniques include those for sentiment analysis, topic models, and predictive analytics. The course includes topics from natural language processing (NLP), such as identifying parts of speech, parsing sentences (e.g., subject and predicate), and named entity recognition (people and places). Unsupervised techniques suited to feature creation provide variables for traditional statistical models (regression) and more recent approaches (regression trees). Examples that span the course illustrate the success of text analytics. Hierarchical generating models often associated with nonparametric Bayesian analysis supply theoretical foundations.

Prerequisites: Students should be familiar with regression models at the level of STAT 613 and the R statistics language at the level of STAT 705. Familiarity with the RStudio development environment is presumed, as well as common R packages such as stringr, dplyr and ggplot2. Those with more knowledge of Statistics, such as from STAT 722, or computing skills will benefit. The predominant software used in the course is R, with bits of JMP when helpful for interactive illustration. Familiarity with basic probability models is helpful but not presumed.

STAT770 - DATA ANALY & STAT COMP

This course will introduce a high-level programming language, called R, that is widely used for statistical data analysis. Using R, we will study and practice the following methodologies: data cleaning, feature extraction; web scraping, text analysis; data visualization; fitting statistical models; simulation of probability distributions and statistical models; statistical inference methods that use simulations (bootstrap, permutation tests).

Prerequisites: STAT 613 or STAT 621 or waiving the Statistics Core completely.

STAT776 - APPL PROB MODELS MKTG

This course will expose students to the theoretical and empirical "building blocks" that will allow them to develop and implement powerful models of customer behavior. Over the years, researchers and practitioners have used these methods for a wide variety of applications, such as new product sales forecasting, analyses of media usage, customer valuation, and targeted marketing programs. These same techniques are also very useful for other types of business (and non-business) problems. The course will be entirely lecture-based with a strong emphasis on real-time problem solving. Most sessions will feature sophisticated numerical investigations using Microsoft Excel. Much of the material is highly technical.

Prerequisites: Students must have a high comfort level with basic integral calculus, and recent exposure to a formal course in probability and statistics is strongly recommended.

Other Information: Format: Lecture, real-time problem solving


This course will build on the fundamental concepts introduced in the prerequisite courses to allow students to acquire knowledge and programming skills in large-scale data analysis, data visualization, and stochastic simulation.

Prerequisites: STAT 770 or STAT 705 or equivalent background acquired through a combination of online courses that teach the R language and practical experience.

STAT851 - FUND OF ACTUARIAL SCI I

This course is the usual entry point in the actuarial science program. It is required for students who plan to concentrate or minor in actuarial science. It can also be taken by others interested in the mathematics of personal finance and the use of mortality tables. For future actuaries, it provides the necessary knowledge of compound interest and its applications, and the basic life contingencies definitions to be used throughout their studies. Non-actuaries will be introduced to practical applications of finance mathematics, such as loan amortization and bond pricing, and premium calculation of typical life insurance contracts. Main topics include annuities, loans and bonds; basic principles of life contingencies; and determination of annuity and insurance benefits and premiums.

Prerequisites: One semester of calculus


This specialized course is usually taken only by Wharton students who plan to concentrate in actuarial science and Penn students who plan to minor in actuarial mathematics. It provides a comprehensive analysis of advanced life contingencies problems such as reserving, multiple life functions, and multiple decrement theory, with application to the valuation of pension plans.

Prerequisites: STAT 851 or BEPP 851


This course covers models for insurers' losses and applications of Markov chains. Poisson processes, including extensions such as non-homogeneous, compound, and mixed Poisson processes, are studied in detail. The compound model is then used to establish the distribution of losses. An extensive section on Markov chains provides the theory to forecast future states of the process, as well as numerous applications of Markov chains to insurance, finance, and genetics. The course is abundantly illustrated by examples from the insurance and finance literature. While most of the students taking the course are future actuaries, other students interested in applications of statistics may discover in class many fascinating applications of stochastic processes and Markov chains.

Prerequisites: Two semesters of Statistics


One half of the course is devoted to the study of time series, including ARIMA modeling and forecasting. The other half studies modifications in random variables due to deductibles, co-payments, policy limits, and elements of simulation. This course is a possible entry point into the actuarial science program. The Society of Actuaries has approved STAT 854 for VEE credit on the topic of time series.

Prerequisites: One semester of probability


Prerequisites: Written permission of instructor, the department MBA advisor and course coordinator.

Statistics Department

The Wharton School,
University of Pennsylvania
400 Jon M. Huntsman Hall
3730 Walnut Street
Philadelphia, PA 19104-6340

Phone: (215) 898-8222
Fax: (215) 898-1280