Translating Single Cell Genomics Data into Computable Knowledge about Cell Phenotypes using Artificial Intelligence

Yun (Renee) Zhang, Ph.D.

Assistant Professor

Informatics Department

J. Craig Venter Institute

Seminar Information

Seminar Date
May 17, 2024 - 2:00 PM

The FUNG Auditorium - PFBH



Cells are the most fundamental units of life in the human body and other multicellular organisms. A cell’s identity and function (hereinafter, cellular phenotype or cell type) are dictated by the subset of genes, proteins, and metabolites it expresses. Abnormalities in the expressed genome form the physical basis of disease. Understanding normal and abnormal cellular phenotypes is therefore essential biomedical knowledge for diagnosing diseases and identifying potential therapeutic targets. Single cell genomics technologies are revolutionizing our understanding of complex biological systems. In this talk, I will focus on our investigations of diverse cellular phenotypes in single cell genomics data using computational, statistical, and artificial intelligence approaches. Our group has developed a suite of software tools, NS-Forest and FR-Match, for characterizing the large number of cell types being discovered at an unprecedented rate by single cell/nucleus RNA sequencing (scRNA-seq). NS-Forest is a random forest machine learning method for identifying cell type-specific marker genes that are optimized for cell type classification. The NS-Forest marker genes serve as an informative reduced feature space for cell type matching with FR-Match, as probe genes for cell type localization in multiplex spatial transcriptomics, and as definitional characteristics for semantic cell type representation in the Provisional Cell Ontology (PCL). We have extensively validated these methods in human and mouse brain and in human lung, kidney, skin, and endometrial tissue.
In a recent pilot study, we implemented an integrated knowledge graph in Neo4j that combines the 400+ brain cell types from human primary motor cortex defined in PCL with the OMIM knowledgebase of human genetic disorders. The graph revealed a novel connection between a known drug target and a specific cell type: the Inh L1 PAX6 CHRFAM7A cell type (PCL:0015009), whose marker gene (CHRNA7) “BINDS” an approved Alzheimer’s disease drug compound (Galantamine).

Speaker Bio

Yun (Renee) Zhang, PhD, is an Assistant Professor in the Informatics Department at the J. Craig Venter Institute (JCVI). She received an MMath in Mathematics and Statistics from the University of Oxford, UK, and a PhD in Statistics from the University of Rochester Medical Center. Dr. Zhang’s research interests include statistical modeling and methodology development for big data produced by advanced biotechnologies. Her recent research focuses on applying machine learning (ML) and explainable artificial intelligence (XAI) approaches to single cell genomics.