BMSIP Projects 2024

2024

Project Title | PI | Intern
Predictive Models for Alzheimer’s Disease Risk using Diverse Clinical and Demographic Data | Jinying Chen | Haochun Huang
Cancer Progression Markers in Melanoma using Multimodal Sequencing Datasets | Deborah Lang | Shripushkar Ganesh Krishnan
Deep Learning Methods for Cancer Epitranscriptomics | Ignaty Leshchiner | Xavier Roy
Alcohol Addiction Associated Spatial Gene Expression Characterization in the Brain | Phillip Mews | Andreea Soica
GPS2-mediated Signaling Crosstalk with Mitochondrial Unfolded Protein Response using ChIPSeq | Valentina Perissi | Jawahar Mahendran
Characterizing White Adipose Tissue Celltype Lineage Commitment with scRNASeq | Nabil Rabhi | Akhila Gundavelli
Neuronal Vulnerability in Alzheimer’s Disease using snRNASeq | Jean-Pierre Roussarie, Anatomy & Neurobiology at BUSM | Bhanu Shankar Dhulipalla
Molecular Mechanisms of Aortic Aneurysm using Multimodal Sequencing Datasets | Francesca Seta | Allison Madsen
Down Syndrome Epigenetics using iPSC-derived Cortical Organoids | Ella Zeldich | Shreya Nalluri
Bidirectional Encoder Representations Transformer (BERT)-based Microbial Identification Analysis Pipeline | Chao Zhang | Truman Fogler
Transmission Electron Microscopy Image Curation and Analysis in Chronic Kidney Diseases | Chao Zhang | Zach Derse
Whole Slide Image Analysis Algorithms in Ovarian and Breast Cancer | Chao Zhang | Saumya Pothukuchi
Novel Graph-based Model Algorithms for scRNASeq Analysis | Chao Zhang | Muxi Wang

Predictive Models for Alzheimer’s Disease Risk using Diverse Clinical and Demographic Data

PI: Jinying Chen
Intern: Haochun Huang

This internship is in collaboration with the Chobanian & Avedisian School of Medicine Data Science Core.

Many predictive models for Alzheimer’s disease (AD) risk assessment have been proposed; however, most use only a handful of selected input variables or features (i.e., risk factors) or combine features with simple linear models (e.g., logistic regression). Advanced machine learning (ML) models can incorporate many features and model complex, non-linear relationships between these features and the outcome. However, they are often criticized for a lack of interpretability (i.e., no obvious connection between some of the features that drive the model’s prediction accuracy and the outcome being predicted), which limits their application, especially in clinical settings. This internship seeks a summer intern to assist in developing interpretable machine learning models for risk prediction in AD progression.

Cancer Progression Markers in Melanoma using Multimodal Sequencing Datasets

PI: Deborah Lang
Intern: Shripushkar Ganesh Krishnan

We focus on melanocytes, melanocyte stem cells, and melanoma, and on pathways that promote stem cell maintenance and are, in parallel, corrupted during cancer progression. We work with cells, proteins, RNA, DNA, mouse models, and datasets derived from patient populations. We have a number of projects available: 1) We discovered that tumor-suppressive, differentiation, and growth-inhibitory pathways are targeted and inhibited through degradation of specific RNA transcripts. We will analyze global patterns and gene-specific targeting in RNA expression and splicing variation. 2) We study specific transcription factors and gene regulation through their binding to enhancers. 3) We analyze melanocyte stem cells, compare stem cells with differentiated cells, and examine how the stem cells change during aging.

Deep Learning Methods for Cancer Epitranscriptomics

PI: Ignaty Leshchiner
Intern: Xavier Roy

This internship is in collaboration with the Chobanian & Avedisian School of Medicine Data Science Core.

An ever-growing body of literature shows that modifications to DNA, both genetic and epigenetic, as well as modifications to proteins, are distinct in the setting of cancer. Changes to RNA, especially those that are epitranscriptomic in nature, have not yet been studied extensively. This is because, while the technologies for accurately assessing base-pair changes in DNA are well established, the detection of native epigenetic and epitranscriptomic changes in DNA and RNA is not yet robust. We are developing deep learning methods that call methylation modifications in both RNA and DNA with higher accuracy from native nanopore-based sequencing and that identify tumor-type-specific DNA/RNA modifications. The project will involve training and analyzing the models to improve call accuracy and detect biological changes within samples.

Alcohol Addiction Associated Spatial Gene Expression Characterization in the Brain

PI: Phillip Mews
Intern: Andreea Soica

This project addresses how alcohol reshapes the brain’s transcriptional landscape, leading to persistent neural and behavioral adaptations. Despite advances in sequencing technologies, current methods fall short in mapping the spatial distribution of gene expression changes within the brain’s complex architecture. This gap hinders our comprehension of the intricate ways in which alcohol exposure leads to addiction by altering gene expression across various brain regions and cell types. The project aims to overcome these limitations by employing spatial transcriptomics to provide a detailed view of the transcriptional changes induced by alcohol within the context of the brain’s cytoarchitecture, thereby offering new insights into the molecular mechanisms driving addiction.

GPS2-mediated Signaling Crosstalk with Mitochondrial Unfolded Protein Response using ChIPSeq

PI: Valentina Perissi
Intern: Jawahar Mahendran

This internship explores the crosstalk between GPS2-mediated retrograde signaling and the classic mitochondrial unfolded protein response (mtUPR) pathway mediated by the stress-inducible, nuclear-encoded bZIP transcription factors (TFs) ATF4 and ATF5. Our hypothesis is that GPS2 is recruited to a subset of nuclear stress response genes through its interaction with ATF4/ATF5. This hypothesis is based on the following preliminary observations: i) Numerous classic ATF4/5 target genes were found among the differentially expressed genes (DEGs) regulated by GPS2 upon mitochondria-to-nucleus translocation in 3T3-L1 cells; ii) Direct interaction between ATF4 and GPS2 is reported in BioGRID and was validated by co-immunoprecipitation in HeLa cells; iii) Preliminary overlap between GPS2 target genes and ATF3/ATF4/5-bound genes listed in ChIP-ATLAS suggests that a large number of genes may be co-regulated by both GPS2 and bZIP TFs, including ATF4 and ATF5.

Characterizing White Adipose Tissue Celltype Lineage Commitment with scRNASeq

PI: Nabil Rabhi
Intern: Akhila Gundavelli

We recently identified a new population of quiescent naïve cells that exist within white adipose tissue and that are capable of giving rise to committed adipose progenitor cells (APCs). To better understand the mechanisms that activate naïve cells and drive their molecular reprogramming toward committed APCs, we generated a mouse model that enables the in vivo labeling of both naïve cells and APCs. The goal of this proposal is to use data generated from this mouse model to elucidate the cellular dynamics that enable precursor cells to transition from a quiescent state to beige-primed APCs capable of differentiating into beige adipocytes.

Neuronal Vulnerability in Alzheimer’s Disease using snRNASeq

PI: Jean-Pierre Roussarie, Anatomy & Neurobiology at BUSM
Intern: Bhanu Shankar Dhulipalla

This internship will analyze data collected from postmortem entorhinal cortex tissue of individuals who had no symptoms at the time of death but had already accumulated some pathological lesions in the entorhinal cortex. We are performing single-nucleus RNA-seq on these samples and would like to identify genes that are differentially expressed across the different cell populations and that are associated with the appearance of AD pathologies.

Molecular Mechanisms of Aortic Aneurysm using Multimodal Sequencing Datasets

PI: Francesca Seta
Intern: Allison Madsen

My research seeks to understand the molecular mechanisms of aortic aneurysm, a vascular disease for which we want to identify novel therapeutic targets. We generally use vascular smooth muscle cells or aortas from transgenic versus control mice to discover novel gene and protein expression patterns. In the past, we generated RNA sequencing and proteomics data, which bioinformatics students helped analyze, to identify novel targets and pathways. My goal is to shift our focus to human databases and test whether some of the pathways we discovered in mice and cells translate to humans.

Down Syndrome Epigenetics using iPSC-derived Cortical Organoids

PI: Ella Zeldich
Intern: Shreya Nalluri

The internship will address how the Down syndrome (DS)-associated epigenetic landscape shapes the developmental and functional trajectories of oligodendrocytes and other cell types populating cortical organoids. We plan to uncover how the presence of an additional copy of chromosome 21 affects transcriptional dynamics and chromatin accessibility in DS.

Bidirectional Encoder Representations Transformer (BERT)-based Microbial Identification Analysis Pipeline

PI: Chao Zhang
Intern: Truman Fogler

This internship is in collaboration with the Chobanian & Avedisian School of Medicine Data Science Core.

The PI proposes developing a novel method that combines efficiency and accuracy for microbiome identification from standard sequencing data. This multi-step pipeline integrates a BERT-based model and an EM algorithm to achieve several goals: 1) filtering host DNA/RNA, 2) separating bacterial, viral, and fungal reads, and 3) accurately quantifying the abundance of each taxon. Additionally, this pipeline can serve as a general quality control tool for detecting laboratory contamination from other species based on sequencing data.

Transmission Electron Microscopy Image Curation and Analysis in Chronic Kidney Diseases

PI: Chao Zhang
Intern: Zach Derse

This internship is in collaboration with the Chobanian & Avedisian School of Medicine Data Science Core.

Collaborating with several different labs, we have collected transmission electron microscopy (TEM) images from mice (100+), rats (1500+), and humans (6000+). The current prototype works very well on mouse and rat data but is less accurate on human data due to a lack of labels. Potential projects during the summer internship include: 1) Assisting in labeling and organizing human data and improving the deep learning model to achieve better accuracy. 2) Implementing a transformer-based model to integrate the TEM images and pathological reports. This could involve generating pathological report templates from images to reduce the writing time for pathologists or building new classification models for disease and drug response prediction. 3) Utilizing data from The Kidney Precision Medicine Project to explore the potential of multimodal integration for TEM images, histology images, and omics data.

Whole Slide Image Analysis Algorithms in Ovarian and Breast Cancer

PI: Chao Zhang
Intern: Saumya Pothukuchi

This internship is in collaboration with the Chobanian & Avedisian School of Medicine Data Science Core.

Collaborating with several different labs, we have collected whole slide images (WSIs) from ovarian and breast cancers. This internship aims to tackle the critical challenge of processing and analyzing pathological WSIs of ovarian and breast cancers. The project’s goal is to develop and implement algorithms that can efficiently and accurately identify cancerous tissues, quantify tumor heterogeneity, and predict clinical outcomes. This involves overcoming obstacles in image segmentation, feature extraction, and the classification of cancer subtypes. By enhancing our capability to analyze WSIs, we aim to contribute to the early detection of ovarian and breast cancers, improve the accuracy of prognosis predictions, and assist in the formulation of personalized treatment plans.

Novel Graph-based Model Algorithms for scRNASeq Analysis

PI: Chao Zhang
Intern: Muxi Wang

This internship is in collaboration with the Chobanian & Avedisian School of Medicine Data Science Core.

Our lab is currently developing a novel graph-based model for single-cell RNA-seq (scRNA-seq) analysis. To evaluate the performance of our new method, we plan to compare it with various scRNA-seq data integration packages using diverse benchmarking datasets. During this summer internship, the student will: 1) Participate in optimizing and implementing the novel deep learning-based method. 2) Assist in running multiple packages, utilizing both R and Python. Additionally, the student will be involved in collecting benchmarking databases and selecting key benchmarking scRNA-seq datasets for data integration tasks.