Nancy Zhang
  • Professor of Statistics

Contact Information

  • Office Address:

    456 Jon M. Huntsman Hall
    3730 Walnut Street
    Philadelphia, PA 19104

Research Interests: change-point methods, empirical Bayes estimation, genomics, model and variable selection, scan statistics, statistical modeling

Links: CV

Overview

I received a BSc in Mathematics (2001), an MSc in Computer Science (2001), and a PhD in Statistics, all from Stanford University. From 2005 to 2006, I was a postdoctoral researcher at UC Berkeley. In 2006, I joined the Department of Statistics at Stanford University as an assistant professor. I moved to the University of Pennsylvania in 2011.

Teaching

Past Courses

  • STAT102 - INTRO BUSINESS STAT

    Continuation of STAT 101. A thorough treatment of multiple regression, model selection, analysis of variance, linear logistic regression; introduction to time series. Business applications.

  • STAT405 - STAT COMPUTING WITH R

    The goal of this course is to introduce students to the R programming language and its related ecosystem. The course provides a skill set that is in demand in both research and business environments. In addition, R is used and required in other advanced classes taught at Wharton, so this class prepares students for those higher-level classes and electives.

  • STAT431 - STATISTICAL INFERENCE

    Graphical displays; one- and two-sample confidence intervals; one- and two-sample hypothesis tests; one- and two-way ANOVA; simple and multiple linear least-squares regression; nonlinear regression; variable selection; logistic regression; categorical data analysis; goodness-of-fit tests. A methodology course. This course does not have business applications but has significant overlap with STAT 101 and 102.

  • STAT471 - MODERN DATA MINING

    Modern Data Mining: Statistics, or Data Science, has been evolving rapidly to keep up with the modern world. While classical multiple regression and logistic regression techniques continue to be the major tools, we go beyond them to include methods built on top of linear models, such as LASSO and Ridge regression. Contemporary methods such as KNN (K nearest neighbors), Random Forest, Support Vector Machines, Principal Component Analysis (PCA), the bootstrap, and others are also covered. Text mining, especially through PCA, is another topic of the course. While learning all the techniques, we keep in mind that our goal is to tackle real problems. Not only do we go through a large collection of interesting, challenging real-life data sets, but we also learn how to use the free, powerful software "R" in connection with each of the methods covered in the class.

  • STAT701 - MODERN DATA MINING

    Modern Data Mining: Statistics, or Data Science, has been evolving rapidly to keep up with the modern world. While classical multiple regression and logistic regression techniques continue to be the major tools, we go beyond them to include methods built on top of linear models, such as LASSO and Ridge regression. Contemporary methods such as KNN (K nearest neighbors), Random Forest, Support Vector Machines, Principal Component Analysis (PCA), the bootstrap, and others are also covered. Text mining, especially through PCA, is another topic of the course. While learning all the techniques, we keep in mind that our goal is to tackle real problems. Not only do we go through a large collection of interesting, challenging real-life data sets, but we also learn how to use the free, powerful software "R" in connection with each of the methods covered in the class.

  • STAT705 - STAT COMPUTING WITH R

    The goal of this course is to introduce students to the R programming language and its related ecosystem. The course provides a skill set that is in demand in both research and business environments. In addition, R is used and required in other advanced classes taught at Wharton, so this class prepares students for those higher-level classes and electives.

  • STAT991 - SEM IN ADV APPL OF STAT

    This seminar will be taken by doctoral candidates after the completion of most of their coursework. Topics vary from year to year and are chosen from advanced probability, statistical inference, robust methods, and decision theory, with principal emphasis on applications.

Awards and Honors

  • Sloan Fellowship, 2011
  • New World Silver Medal for Best PhD Thesis in Mathematical Sciences, 2007

Activity

Latest Research

Yuchao Jiang, Rujin Wang, Eugene Urrutia, Ioannis N. Anastopoulos, Katherine L. Nathanson, Nancy Zhang (Under Review), CODEX2: full-spectrum copy number variation detection by high-throughput DNA sequencing.