Probabilistic Methods for Designing Functional Protein Structures

BRIAN TRIPPE – COLUMBIA UNIVERSITY

ABSTRACT

The biochemical functions of proteins, such as catalyzing a chemical reaction or binding to a virus, are typically conferred by the geometry of only a handful of atoms. This arrangement of atoms, known as a motif, is structurally supported by the rest of the protein, referred to as a scaffold. A central task in protein design is to identify a diverse set of stabilizing scaffolds to support a motif known or theorized to confer function. This long-standing challenge is known as the motif-scaffolding problem.

In this talk, I describe a statistical approach I have developed to address the motif-scaffolding problem. My approach involves (1) estimating a distribution supported on realizable protein structures and (2) sampling scaffolds from this distribution conditioned on a motif. For step (1), I adapt diffusion generative models to fit examples of protein structures found in nature. For step (2), I develop sequential Monte Carlo algorithms to sample from the conditional distributions of these models. Finally, I describe how, with experimental and computational collaborators, I have generalized and scaled this approach to generate and experimentally validate hundreds of proteins with various functional specifications.
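
To make the idea of sampling from a conditional distribution with sequential Monte Carlo concrete, here is a minimal sketch on a toy problem. It is not the algorithm presented in the talk: the diffusion model over protein structures is replaced by an assumed one-dimensional Gaussian model in which a scalar "scaffold" has a standard normal prior and a scalar "motif" is a noisy linear function of it. The function name `smc_conditional_sample`, the correlation parameter `rho`, and the likelihood-tempering schedule are all illustrative assumptions.

```python
import numpy as np


def smc_conditional_sample(motif, rho=0.8, n_particles=2000, n_steps=20, seed=0):
    """Toy SMC sampler for p(scaffold | motif) under an assumed Gaussian model.

    Assumed model (illustrative only):
        scaffold ~ N(0, 1)
        motif | scaffold ~ N(rho * scaffold, 1 - rho**2)
    """
    rng = np.random.default_rng(seed)
    noise_var = 1.0 - rho**2
    # Tempering schedule: beta = 0 is the prior, beta = 1 is the full conditional.
    betas = np.linspace(0.0, 1.0, n_steps + 1)

    def log_lik(s):
        # Log-likelihood of the observed motif given scaffold s (up to a constant).
        return -0.5 * (motif - rho * s) ** 2 / noise_var

    def log_target(s, beta):
        # Unnormalized log density of prior(s) * likelihood(s) ** beta.
        return -0.5 * s**2 + beta * log_lik(s)

    # Initialize particles from the prior (the unconditional model).
    s = rng.standard_normal(n_particles)

    for beta_prev, beta in zip(betas[:-1], betas[1:]):
        # 1. Reweight: incremental importance weights between successive targets.
        log_w = (beta - beta_prev) * log_lik(s)
        w = np.exp(log_w - log_w.max())
        w /= w.sum()

        # 2. Resample: multinomial resampling in proportion to the weights.
        s = s[rng.choice(n_particles, size=n_particles, p=w)]

        # 3. Move: one random-walk Metropolis step that leaves the current
        #    tempered target invariant and restores particle diversity.
        proposal = s + 0.5 * rng.standard_normal(n_particles)
        log_accept = log_target(proposal, beta) - log_target(s, beta)
        accept = np.log(rng.uniform(size=n_particles)) < log_accept
        s = np.where(accept, proposal, s)

    return s


if __name__ == "__main__":
    samples = smc_conditional_sample(motif=1.5)
    print("sample mean:", samples.mean())
    print("sample variance:", samples.var())
```

Under this assumed model the exact conditional is Gaussian with mean `rho * motif` and variance `1 - rho**2`, so the sample mean and variance can be checked against those values. In the setting described above, the prior would be a learned diffusion model and the particles would be entire protein backbones, but the reweight, resample, and move pattern is the same basic structure.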

Related Papers: