Population Health Data Science (PHDS) affiliates conduct a diverse range of innovative, high-impact research. Their cutting-edge data science research informs policies and practices that lead to improvements in population-level health. Below are selected projects from our PHDS community of scholars.
Prior studies of the health effects of climate-related exposures have been limited in their ability to assess impacts on some of the most vulnerable populations, including people with low socioeconomic means across the US and those who rely on public health insurance. Using data on all Medicaid claims across the US from 2005-2012, along with new claims data from 2013-2018 obtained through a BUSPH Spark award, Dr. Nori-Sarma is building a research hub to better understand and quantify the health impacts of extreme environmental exposures associated with climate change in these populations. This research will also bring in collaborators across BUSPH to study a broad variety of exposures and outcomes associated with climate change, from physical to mental health.
A BUSPH Strategic Direction Spark award provided pilot funds for this project.
Explore Dr. Nori-Sarma's Research
Can algorithms see the effects of redlining on neighborhoods? An important cause of racial health disparities is the difference in built environment conditions that results from long-term disinvestment. Unfortunately, collecting observations about the built environment is costly and time-consuming, and few good measures of long-term disinvestment exist. To overcome these challenges, we trained machine learning models to identify associations between present-day aerial imagery and historical redlining (1930s lending policies that allocated investment according to a neighborhood’s racial composition). We used commonplace computer vision methods to extract data from the imagery and predict the redlining grade (i.e., A through D) in Philadelphia, PA. The predicted grade was our long-term disinvestment measure. On visual inspection, the model learned that worse 1930s grades are associated with less green space and higher housing density today, and these scores aligned with other measures of disinvestment. We also found that the disinvestment scores strongly predicted present-day firearm violence, even after controlling for other important neighborhood-level covariates. Our results indicate that this ML-based approach can help measure long-term disinvestment and potentially contribute to our understanding of health disparities.
Explore Dr. Jay's Research
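As a rough illustration of how a predicted redlining grade could serve as a numeric disinvestment measure, the sketch below turns a classifier's per-grade outputs into a single score. It is not the project's actual pipeline: the function names, the use of softmax probabilities, and the choice of an expected grade index as the score are all assumptions made for illustration.

```python
import numpy as np

# Redlining grades from best (A) to worst (D), as described in the text.
GRADES = ("A", "B", "C", "D")


def softmax(logits):
    """Convert raw classifier outputs into per-grade probabilities."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def predicted_grade(logits):
    """Most likely grade for each image tile (hypothetical helper)."""
    return [GRADES[i] for i in np.argmax(logits, axis=1)]


def disinvestment_score(logits):
    """Expected grade index, from 0 (A) to 3 (D).

    Assumption: a probability-weighted grade index is used as a
    continuous score; the source only states that the predicted
    grade was the disinvestment measure.
    """
    probs = softmax(np.asarray(logits, dtype=float))
    return probs @ np.arange(len(GRADES))


# Illustrative classifier outputs for two tiles (made-up numbers).
example_logits = np.array([[4.0, 1.0, 0.0, -1.0],
                           [-2.0, 0.0, 1.0, 5.0]])
grades = predicted_grade(example_logits)
scores = disinvestment_score(example_logits)
```

A continuous score of this kind keeps information about how confident the model is between adjacent grades, which a hard A-through-D label would discard.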
Distributed Analytics for Enhancing Fertility in Families: This project leverages information from self-administered surveys (such as SPH’s PRESTO study) and medical records to produce highly accurate personalized predictions regarding fertility potential, pregnancy, the success of an In Vitro Fertilization cycle, and the presence of specific reproductive health issues affecting fertility. In addition to predictions, the project develops methods to generate personalized recommendations, empowering individuals and their physicians to make the most appropriate, individualized health care decisions. The work is in line with the emergence of personalized medicine, aided by data and algorithmic advances.
Explore Dr. Paschalidis' Research
A graph convolutional network (GCN) uses the Laplacian matrix of a given graph as the kernel matrix to train the convolutional network. This approach assumes that the given graph is error-free. However, in real-world applications, a graph is usually constructed manually, which is error-prone. Thus, classifiers obtained by overlapping neighboring nodes around the origin node can be misleading, because the neighborhood defined by the graph is no longer error-free. To quantify the uncertainty, we assume that an edge between two nodes is sampled from a Bernoulli distribution whose probability parameter is generated by a graphon function. To address these issues with GCN, we propose the oracle graph convolutional network (orGCN), which replaces the kernel matrix with a graphon estimate. Because a graphon is the limit of a graph as the number of nodes n goes to infinity, it is less susceptible to connectivity errors. The superiority of the proposed method is demonstrated by various synthetic and real experiments.
Explore Dr. Cheng's Research
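The orGCN idea described above can be sketched in a few lines of NumPy: sample edges as Bernoulli draws from a graphon, estimate the edge-probability matrix, and use that estimate in place of the observed graph when building the propagation kernel. This is only a minimal sketch under stated assumptions. The source does not say which graphon estimator is used, so the degree-sorted block-averaging estimator here is a stand-in, and it is only sensible when the graphon's expected degree is monotone in the latent position. The example graphon, the symmetric normalization with self-loops, and all function names are likewise illustrative choices.

```python
import numpy as np


def sample_graph(graphon, n, rng):
    """Draw latent positions u_i ~ U(0, 1) and edges A_ij ~ Bernoulli(W(u_i, u_j))."""
    u = rng.uniform(size=n)
    probs = graphon(u[:, None], u[None, :])
    draws = (rng.uniform(size=(n, n)) < probs).astype(float)
    upper = np.triu(draws, k=1)          # keep one draw per unordered pair
    return upper + upper.T, probs        # symmetric adjacency, no self-loops


def estimate_edge_probs(adj, n_blocks):
    """Block-averaging graphon estimate (an assumed, simple estimator).

    Nodes are ranked by degree, split into equal-size blocks, and the
    edge density of each pair of blocks is used as the estimated
    connection probability for every node pair in those blocks.
    """
    n = adj.shape[0]
    order = np.argsort(adj.sum(axis=1))
    labels = np.empty(n, dtype=int)
    labels[order] = np.arange(n) * n_blocks // n
    onehot = np.eye(n_blocks)[labels]
    counts = onehot.sum(axis=0)
    block_sums = onehot.T @ adj @ onehot
    # Number of ordered node pairs per block pair, excluding i == j.
    pairs = np.outer(counts, counts) - np.diag(counts)
    block_means = block_sums / np.maximum(pairs, 1.0)
    return block_means[np.ix_(labels, labels)]


def normalized_kernel(weights):
    """Symmetric normalization D^{-1/2} (W + I) D^{-1/2} with self-loops."""
    w = weights + np.eye(weights.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(w.sum(axis=1))
    return d_inv_sqrt[:, None] * w * d_inv_sqrt[None, :]


def gcn_layer(kernel, features, weight):
    """One graph-convolution layer: ReLU(kernel @ features @ weight)."""
    return np.maximum(kernel @ features @ weight, 0.0)


# Usage: compare a kernel built from the observed graph with one built
# from the estimated edge probabilities (all values are synthetic).
rng = np.random.default_rng(0)
adj, true_probs = sample_graph(lambda x, y: 0.1 + 0.8 * x * y, 200, rng)
est_probs = estimate_edge_probs(adj, n_blocks=8)
features = rng.normal(size=(200, 5))
weight = rng.normal(size=(5, 3))
hidden_observed = gcn_layer(normalized_kernel(adj), features, weight)
hidden_smoothed = gcn_layer(normalized_kernel(est_probs), features, weight)
```

Because the estimated kernel averages over many node pairs, a single spurious or missing edge changes it far less than it changes the kernel built from the observed adjacency matrix, which is the intuition behind replacing one with the other.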
Green space, such as trees and natural vegetation, can provide mental health benefits and possibly lower the risk of neurodegenerative diseases such as Alzheimer’s disease and related dementias (ADRD); however, green space is often measured poorly in epidemiologic research. My research derives objective, ground-level measurements of green space from high-resolution Google Street View images and integrates them into the Multi-Ethnic Study of Atherosclerosis, a diverse prospective cohort study, using deep learning algorithms, mediation analysis, and geographic mixtures. This research will enable unprecedented perspectives on the exposures that drive the risk of ADRD and related cognitive decline, and will provide translational insights into potential interventions to optimize opportunities for ADRD prevention and reduce racial disparities in ADRD.
Explore Dr. Pescador Jimenez's Research
The World Health Organization’s global action plan emphasizes prompt and precise dementia diagnosis as a key strategic objective in public health. Our research lab is actively engaged in advancing AI-powered solutions to tackle this global challenge. We have recently created innovative and interpretable deep learning models capable of accurately assessing dementia using routinely gathered clinical data. Our results have been validated against established clinical standards and are further supported by post-mortem evidence.
Explore Dr. Kolachalama's Research