Deep Learning for Censored Survival Data

JANE-LING WANG – UNIVERSITY OF CALIFORNIA, DAVIS

ABSTRACT

Unlike standard learning tasks, survival analysis requires modeling incomplete data, such as right-censored data, which must be treated with care. While deep neural networks excel in traditional supervised learning, it remains unclear how best to use them in survival analysis. A key question is which data-generating assumptions of traditional survival models should be retained and which should be relaxed via the function-approximating capabilities of neural networks. In addition, most existing deep survival methods are difficult to interpret, and a mathematical understanding of them is lacking. In this talk, we explore these issues from two directions. First, we study the partially linear Cox model, where the nonlinear component of the model is implemented using a deep neural network. The proposed approach is flexible and able to circumvent the curse of dimensionality, yet it retains interpretability of the effects of treatment covariates on survival. Next, we introduce a Deep Extended Hazard (DeepEH) model to provide a flexible and general framework for deep survival analysis. The extended hazard model includes the conventional Cox proportional hazards and accelerated failure time models as special cases, so DeepEH subsumes the popular Deep Cox proportional hazards (DeepSurv) and Deep Accelerated Failure Time (DeepAFT) models. We provide theoretical support for the proposed models, which underscores the attractive feature that deep learning is able to detect low-dimensional structure of data in high-dimensional space. Numerical experiments further provide evidence that the proposed methods outperform existing statistical and deep learning approaches to survival analysis.
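
For orientation, the two model classes named in the abstract can be written in their commonly used forms. The notation is an assumption for illustration and may differ from that used in the talk: \lambda_0 is an unspecified baseline hazard, Z holds the covariates of primary interest (such as treatment) with coefficient vector \theta, X holds the remaining covariates, and g, g_1, g_2 are unknown functions approximated by deep neural networks.

```latex
% Partially linear Cox model: linear in Z, nonparametric in X
\lambda(t \mid Z, X) = \lambda_0(t)\,\exp\{\theta^\top Z + g(X)\}

% Extended hazard model: covariates rescale time (g_1) and the hazard level (g_2)
\lambda(t \mid X) = \lambda_0\!\left(t\,e^{g_1(X)}\right) e^{g_2(X)}

% Special cases of the extended hazard model:
%   g_1 \equiv 0  gives the Cox proportional hazards model
%   g_1 = g_2     gives the accelerated failure time model
```

In the first form, \theta keeps its usual log-hazard-ratio interpretation while g(X) absorbs nonlinear effects; in the second, the data determine how far the fit sits from either special case.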

Time permitting, we will explore hypothesis testing for the significance of a covariate in a deep survival model.

*Based on joint work with Qixian Zhong (Xiamen University) and Jonas Mueller (Cleanlab)