Alex Derry

Alex Derry’s Dissertation Defense 12/8

Friday, December 8th, 2023

10:00 am PST

Location: Y2E2 111

Deep learning on local sites for protein structure and function analysis

Understanding how the three-dimensional structure of a protein leads to its function is important for determining disease mechanisms, developing targeted therapeutics, and engineering new proteins with desired functional characteristics. The expansion of protein structure databases due to experimental and computational advances provides an unprecedented opportunity to learn structure-function relationships in a data-driven manner. Deep learning methods that operate on protein structures have shown promise for specific tasks, but their utility for functional analysis has been limited due to inconsistencies in model training and evaluation, lack of labeled protein function data, and an inability to reconcile global predictions with local biochemical mechanisms. In this dissertation, I explore these challenges and propose a framework for protein analysis based on learning on local sites rather than the entire protein structure. First, to establish standards for model development and evaluation, I present work on (1) developing a suite of benchmark datasets, processing tools, and baseline models, and (2) quantifying the effect of differing structure compositions in the training data. I then describe a self-supervised learning method that leverages evolutionary relationships to learn general-purpose representations of local structural sites and show how these representations enable improved performance on downstream tasks involving classification, search, and annotation of functional sites. By clustering millions of sites, I propose a framework for protein analysis based on conserved structural motifs which enables the discovery of functional relationships across protein classes. Finally, I present a method for explainable function annotation that predicts the overall function of a protein as well as the individual residues which are responsible.


(PW: 271506)

Kyle Daniels

Weekly Seminar: Kyle Daniels, 12/7/23


Speaker: Kyle Daniels, Assistant Professor of Genetics, Stanford University

Title: Decoding the language of signaling domains to control cell function

Abstract: Cell therapies are powerful technologies in which human cells are reprogrammed for therapeutic applications such as killing cancer cells or replacing defective cells. The technologies underlying cell therapies are increasing in complexity, making rational engineering of cell therapies more difficult. Creating the next generation of cell therapies will require improved experimental approaches and predictive models. Artificial intelligence (AI) and machine learning (ML) methods have revolutionized several fields in biology including genome annotation, protein structure prediction, and enzyme design. Combining experimental library screens and AI to build predictive models, design rules, and improved designs could accelerate the development of cell therapies. Chimeric antigen receptor (CAR) costimulatory domains derived from native immune receptors steer the phenotypic output of therapeutic T cells. We constructed a library of CARs containing ~2,300 synthetic costimulatory domains, built from combinations of 13 signaling motifs. These CARs promoted diverse cell fates, which were sensitive to motif combinations and configurations. Neural networks trained to decode the combinatorial grammar of CAR signaling motifs allowed extraction of key design rules. For example, non-native combinations of motifs which bind tumor necrosis factor receptor-associated factors (TRAFs) and phospholipase C gamma 1 (PLCg1) enhanced cytotoxicity and stemness associated with effective tumor killing. Thus, libraries built from minimal building blocks of signaling, combined with machine learning, can efficiently guide engineering of receptors with desired phenotypes.


Jean Fan

Weekly Seminar 11/30: Jean Fan

Date: 11/30/23

Speaker: Jean Fan, Assistant Professor of Biomedical Engineering at Johns Hopkins University

Title: Computational Methods for Comparative Spatial Omics Analysis

Abstract: Mammalian tissues are composed of many molecularly and functionally distinct cell-types and cell-states organized into meso-scale structures and patterns to achieve intricate biological functions. Likewise, cells within tissues regulate thousands of interacting genes and other molecules to sense, respond to, and shape their tissue microenvironments. In turn, extrinsic signals from the local microenvironment impact cell state and cell-type specification. Recent advances in high-throughput spatial transcriptomics (ST) technologies now enable the identification and characterization of these cell types and their molecular states in health versus disease while preserving the cell’s spatial context. Application of these ST technologies provides the opportunity to contribute to a more complete understanding of how cellular spatial organization relates to tissue function and how cellular spatial organization is altered in disease. New statistical approaches and scalable computational tools are needed to connect these molecular states and spatial-contextual differences. In this talk, I will provide an overview of the latest ST technologies as well as associated computational analysis methods developed by my lab and their applications. I will highlight our development of STalign to align 2D spatially resolved transcriptomics datasets within and across technologies and to a 3D common coordinate framework in order to make molecular and cell-type compositional comparisons at matched spatial locations across structurally similar tissues. I will present ongoing developments of CRAWDAD, Cell-type Relationship Analysis Workflow Done Across Distances, to quantitatively evaluate cell-type spatial relationships across different length scales to make cell-type relational comparisons.
We anticipate that such statistical approaches and computational methods for analyzing spatially resolved transcriptomic data will offer the potential to identify and characterize spatial organizational differences and contribute to important fundamental biological insights regarding how cell-type spatial organization differs in healthy and diseased settings.

