Mining the Hidden Genome to Map Tumor Site of Origin

Ronglai Shen, Ph.D.

Associate Member

Department of Epidemiology and Biostatistics

Memorial Sloan-Kettering Cancer Center, New York NY

Seminar Information

Seminar Date
May 14, 2021 - 2:00 PM


The vast preponderance of somatic mutations in a typical cancer are either extremely rare or have never been recorded in the available databases that track somatic mutations. These constitute a hidden genome that contrasts with the relatively small number of frequently occurring mutations, the properties of which have been studied in depth. We demonstrate that this hidden genome contains much more accurate information than common mutations for identifying the site of origin of primary cancers in settings where it is unknown. We accomplish this using a projection-based statistical method that achieves highly effective signal condensation by leveraging DNA sequence and epigenetic contexts through a set of meta-features that embody the mutation contexts of rare variants throughout the genome.

Speaker Bio

Dr. Shen’s research interests lie in developing statistical and computational genomics approaches and applying them to translational cancer research. Her past work includes statistical methodologies for data integration and molecular subtype analysis of cancer across multiple data modalities, as well as tumor evolution and allele-specific copy number analysis using whole-genome, whole-exome, and targeted capture sequencing data. Her recent research also includes a novel investigation of somatic variant richness using statistical methodologies developed in ecology and computational linguistics. This project uses sophisticated statistical tools to extract information from rare variants in existing databases, with a view to identifying the site of origin for cancers of unknown primary and cancers detected from circulating cell-free DNA in the blood.