Research Interests: applications of statistics to public health, design and analysis of experiments and observational studies for comparing treatments, longitudinal data, measurement error, medicine and economics
Links: Personal Website
PhD, Stanford University, 2002
BA, Harvard University, 1997
Bikram Karmakar, Chyke A. Doubeni, Dylan Small (2020), Evidence Factors in a Case-control Study with Application to the Effect of Flexible Sigmoidoscopy Screening on Colorectal Cancer, Annals of Applied Statistics, (to appear).
Bikram Karmakar, Dylan Small, Paul R. Rosenbaum (2020), Using evidence factors to clarify exposure biomarkers, American Journal of Epidemiology, (to appear).
Bikram Karmakar and Dylan Small (2020), Assessment of the Extent of Corroboration of an Elaborate Theory of a Causal Hypothesis Using Partial Conjunctions of Evidence Factors, Annals of Statistics, (to appear).
Qingyuan Zhao, Jingshu Wang, Gibran Hemani, Jack Bowden, Dylan Small (2020), Statistical Inference in Two-sample Summary-data Mendelian Randomization using Robust Adjusted Profile Score, Annals of Statistics, (in press).
Edward H. Kennedy and Dylan Small (2020), Paradoxes in Instrumental Variable Studies with Missing Data and One-sided Noncompliance, Journal of the French Statistical Society, (in press).
Hyunseung Kang, Tony Cai, Dylan Small (Under Review), Robust Confidence Intervals for Causal Effects with Possibly Invalid Instruments.
Bo Zhang, Jordan Weiss, Dylan Small, Qingyuan Zhao (2020), Selecting and Ranking Individualized Treatment Rules With Unmeasured Confounding, Journal of the American Statistical Association, (to appear).
Timothy G. Gaulton, Sameer K. Deshpande, Dylan Small, Mark D. Neuman (2020), Observational Study of the Association between Participation in High School Football and Self-Rated Health, Obesity, and Pain in Adulthood, American Journal of Epidemiology, (to appear).
Study under the direction of a faculty member.
This course covers elements of (non-measure theoretic) probability necessary for the further study of statistics and biostatistics. Topics include set theory, axioms of probability, counting arguments, conditional probability, random variables and distributions, expectations, generating functions, families of distributions, joint and marginal distributions, hierarchical models, covariance and correlation, random sampling, sampling properties of statistics, modes of convergence, and random number generation. Prerequisites: two semesters of calculus (through multivariate calculus), linear algebra, or permission of the instructor.
This class will cover the fundamental concepts of statistical inference. Topics include sufficiency, consistency, finding and evaluating point estimators, finding and evaluating interval estimators, hypothesis testing, and asymptotic evaluations for point and interval estimation. Prerequisite: if course requirements are not met, permission of the instructor is required.
This class will cover the fundamental concepts of statistical inference. Topics include sufficiency, consistency, finding and evaluating point estimators, finding and evaluating interval estimators, hypothesis testing, and asymptotic evaluations for point and interval estimation.
Data summaries and descriptive statistics; introduction to a statistical computer package; probability: distributions, expectation, variance, covariance, portfolios, central limit theorem; statistical inference for univariate data; statistical inference for bivariate data: inference for intrinsically linear simple regression models. This course will have a business focus, but is not inappropriate for students in the college.
Continuation of STAT 101. A thorough treatment of multiple regression, model selection, analysis of variance, linear logistic regression; introduction to time series. Business applications.
Further development of the material in STAT 111, in particular the analysis of variance, multiple regression, non-parametric procedures and the analysis of categorical data. Data analysis via statistical packages.
This course will cover the design and analysis of sample surveys. Topics include simple random sampling, stratified sampling, cluster sampling, graphics, regression analysis using complex surveys, and methods for handling nonresponse bias.
Questions about cause are at the heart of many everyday decisions and public policies. Does eating an egg every day cause people to live longer or shorter, or does it have no effect? Do gun control laws cause more or fewer murders, or do they have no effect? Causal inference is the subfield of statistics that considers how we should make inferences about such questions. This course will cover the key concepts and methods of causal inference rigorously. The course is intended for statistics concentrators and minors.
Elements of matrix algebra. Discrete and continuous random variables and their distributions. Moments and moment generating functions. Joint distributions. Functions and transformations of random variables. Law of large numbers and the central limit theorem. Point estimation: sufficiency, maximum likelihood, minimum variance. Confidence intervals.
An introduction to the mathematical theory of statistics. Estimation, with a focus on properties of sufficient statistics and maximum likelihood estimators. Hypothesis testing, with a focus on likelihood ratio tests and the consequent development of "t" tests and hypothesis tests in regression and ANOVA. Nonparametric procedures.
This is a course in econometrics for graduate students. The goal is to prepare students for empirical research by studying econometric methodology and its theoretical foundations. Students taking the course should be familiar with elementary statistical methodology and basic linear algebra, and should have some programming experience. Topics include conditional expectation and linear projection, asymptotic statistical theory, ordinary least squares estimation, the bootstrap and jackknife, instrumental variables and two-stage least squares, specification tests, systems of equations, generalized least squares, and an introduction to the use of linear panel data models.
Topics include system estimation with instrumental variables, fixed effects and random effects estimation, M-estimation, nonlinear regression, quantile regression, maximum likelihood estimation, generalized method of moments estimation, minimum distance estimation, and binary and multinomial response models. Both theory and applications will be stressed.
Questions about cause are at the heart of many everyday decisions and public policies. Does eating an egg every day cause people to live longer or shorter, or does it have no effect? Do gun control laws cause more or fewer murders, or do they have no effect? Causal inference is the subfield of statistics that considers how we should make inferences about such questions. This course will cover the key concepts and methods of causal inference rigorously.
This course will cover the design and analysis of sample surveys. Topics include simple random sampling, stratified sampling, cluster sampling, graphics, regression analysis using complex surveys, and methods for handling nonresponse bias.
This course will cover statistical methods for the design and analysis of observational studies. Topics will include the potential outcomes framework for causal inference; randomized experiments; matching and propensity score methods for controlling confounding in observational studies; tests of hidden bias; sensitivity analysis; and instrumental variables.
This course is designed for Ph.D. students in statistics and will cover various advanced methods and models that are useful in applied statistics. Topics for the course will include missing data, measurement error, nonlinear and generalized linear regression models, survival analysis, experimental design, longitudinal studies, building R packages and reproducible research.
Decision theory and statistical optimality criteria, sufficiency, point estimation and hypothesis testing methods and theory.
Theory of the Gaussian Linear Model, with applications to illustrate and complement the theory. Distribution theory of standard tests and estimates in multiple regression and ANOVA models. Model selection and its consequences. Random effects, Bayes, empirical Bayes and minimax estimation for such models. Generalized (Log-linear) models for specific non-Gaussian settings.
This seminar will be taken by doctoral candidates after the completion of most of their coursework. Topics vary from year to year and are chosen from advanced probability, statistical inference, robust methods, and decision theory, with principal emphasis on applications.
New Wharton research examines the long-term impact of playing high school or college football. Knowledge @ Wharton - 2017/07/21