{"id":196,"date":"2017-01-09T13:12:10","date_gmt":"2017-01-09T18:12:10","guid":{"rendered":"https:\/\/sites.bu.edu\/britereu\/?page_id=196"},"modified":"2022-10-18T15:18:47","modified_gmt":"2022-10-18T19:18:47","slug":"196-2","status":"publish","type":"page","link":"https:\/\/sites.bu.edu\/britereu\/reu-faculty-projects\/196-2\/","title":{"rendered":"REU Faculty Past Projects"},"content":{"rendered":"<h3>2021<\/h3>\n<p><a href=\"http:\/\/tandem.bu.edu\/\">Gary Benson <\/a><br \/>\nDepartments of Biology and Computer Science<o:p><\/o:p><\/p>\n<p><strong><div class=\"bu_collapsible_container \" aria-live=\"polite\" data-customize-animation=\"false\"><h4 class=\"bu_collapsible\" aria-expanded=\"false\"tabindex=\"0\" role=\"button\">Genetic variation and linkage to phenotype<\/h4><div class=\"bu_collapsible_section\" style=\"display: none;\"><\/strong><\/p>\n<p>The Benson lab develops algorithms and software for biological sequence comparison and repeat detection in genomic sequences. The focus is understanding the occurrence and functional effects of tandem repeats (TRs), and especially, those with variable copy number, also known as variable number of tandem repeats (VNTRs). The lab has developed an analysis tool, VNTRseek, to identify VNTRs, using high-throughput sequencing data, but it is limited to those TRs that \ufb01t within a sequencing read. This project will develop new algorithmic and statistical methods to permit detection of longer VNTR repeats and the use of longer read sequencing technologies. Additionally, an online database will be created to store and analyze the variant data. 
Students will gain knowledge in human genetic variability and DNA repeats, and skills in analyzing high-throughput sequencing data, algorithm design and testing, and database development.<o:p><\/o:p><\/p>\n<p><\/div>\n<\/div>\n<\/p>\n<hr \/>\n<p><a href=\"http:\/\/www.bu.edu\/biology\/people\/profiles\/ethan-deyle\/\">Ethan Deyle<\/a><br \/>\nDepartment of Biology<o:p><\/o:p><\/p>\n<p><strong><div class=\"bu_collapsible_container \" aria-live=\"polite\" data-customize-animation=\"false\"><h4 class=\"bu_collapsible\" aria-expanded=\"false\"tabindex=\"0\" role=\"button\">Quantifying cross-scale interaction in complex natural systems<\/h4><div class=\"bu_collapsible_section\" style=\"display: none;\"><\/strong><\/p>\n<p>Many of the tools scientists use to quantitatively study the world were developed for engineered systems and laboratory experiments, where a single cause produces a single effect independent of other variables (&#8220;linear separability&#8221;). Natural systems, whether individual cells or entire ocean ecosystems, do not always follow these expectations. Instead, interactions are often state-dependent, where the action of a cause depends on the context around it (i.e. it depends on the state of other variables). This nonlinear state-dependence can interfere with the comfortable, correlative approaches to studying systems, but also presents rich opportunities. This project will center on applying nonlinear causal inference to identify interaction between scales of complexity in natural systems (e.g. single fish populations and ecosystem functioning or single cell expression and organism physiology). Options are available to focus on applied data study of neuronal gene expression, aquatic food-webs, or marine fishery management. It is also possible to focus entirely on numeric simulation data. Students will gain hands-on experience in data processing, non-parametric statistics, and time-series analysis using R or Python (based on preference). 
Previous coding experience is not a strict requirement but will affect the scope of the project.<o:p><\/o:p><\/p>\n<p><\/div>\n<\/div>\n<\/p>\n<hr \/>\n<p><a href=\"https:\/\/www.bu.edu\/sph\/profile\/josee-dupuis\/\">Josee Dupuis<\/a><br \/>\nDepartment of Biostatistics<o:p><\/o:p><\/p>\n<p><strong><div class=\"bu_collapsible_container \" aria-live=\"polite\" data-customize-animation=\"false\"><h4 class=\"bu_collapsible\" aria-expanded=\"false\"tabindex=\"0\" role=\"button\">Fine-mapping of genetic loci for quantitative traits<\/h4><div class=\"bu_collapsible_section\" style=\"display: none;\"><\/strong><\/p>\n<p>The Dupuis lab develops statistical approaches to identify specific genes or genetic variants that in\ufb02uence complex phenotypes through their associated quantitative traits, which are traits that can be measured numerically, such as height or blood pressure in humans, and seed size or oil content in plants. This project involves developing statistical analyses which combine genome wide association results with prior information from \u201comics\u201d studies (gene variant functionality, gene expression, methylation, metabolomic data, and proteomic data) to determine regions with common or rare genetic variants that are potentially causally associated with traits of interest. Students will become familiar with genetic studies and software for genetic analysis, and will explore publicly available databases to assign putative function to sets of variants.<o:p><\/o:p><\/p>\n<p><\/div>\n<\/div>\n<\/p>\n<hr \/>\n<p><a href=\"http:\/\/www.bumc.bu.edu\/compbiomed\/people\/faculty\/w-evan-johnson\/\">W. 
Evan Johnson<\/a><br \/>\nDepartments of Biostatistics and Medicine<\/p>\n<p><strong><div class=\"bu_collapsible_container \" aria-live=\"polite\" data-customize-animation=\"false\"><h4 class=\"bu_collapsible\" aria-expanded=\"false\"tabindex=\"0\" role=\"button\">Profiling human microbial communities<\/h4><div class=\"bu_collapsible_section\" style=\"display: none;\"><\/strong><\/p>\n<p>The Johnson lab studies the human microbiome, i.e., microbial communities which live in and on the human body and play a vital role in health and disease. This project involves the development of statistical tools and software for jointly analyzing microbial and host data from sequencing experiments, in order to determine community content, microbe-microbe interactions, and host-microbe relationships. Students will help develop tools and workflows to compile annotated libraries of genes and genomes, curate functional associations between genes and microbes in metabolism, and link microbial abundance to host gene\/pathway expression and other outcomes.<\/p>\n<p><\/div>\n<\/div>\n<\/p>\n<hr \/>\n<p><a href=\"https:\/\/microbesatbu.wordpress.com\/jenny\/\">Jennifer Bhatnagar<\/a><br \/>\nDepartment of Biology<\/p>\n<p><strong><div class=\"bu_collapsible_container \" aria-live=\"polite\" data-customize-animation=\"false\"><h4 class=\"bu_collapsible\" aria-expanded=\"false\"tabindex=\"0\" role=\"button\">Ecological forecasting: Predicting changes in soil microorganisms<\/h4><div class=\"bu_collapsible_section\" style=\"display: none;\"><\/strong><\/p>\n<p>The Bhatnagar lab studies soil microbiome variation in the context of changing environmental conditions. Soil microorganisms perform a variety of essential roles, including acting as plant symbionts, animal pathogens, and free-living decomposers that recycle nutrients and carbon through the biosphere. Yet, it is unclear which soil microorganisms will persist in a changing world. 
This project will develop new bioinformatics tools to predict which soil microorganisms will endure and remain active over space and time. Microbial DNA sequence data will be collected from soils obtained through a national sampling initiative \u2013 the National Ecological Observatory Network (NEON). The data will be analyzed for gene clusters involving biochemical pathways affecting microbial ecology (e.g., the ability to serve as pathogens or symbionts) and used to develop statistical models that predict variance in microbial function based on location, time, temperature, precipitation, soil nutrient content, and plant biomass. Students will learn key steps in metagenome analysis and methods for data visualization.<\/p>\n<p><\/div>\n<\/div>\n<\/p>\n<hr \/>\n<p><a href=\"http:\/\/people.bu.edu\/dietze\/index.html\">Michael Dietze<\/a><br \/>\nDepartment of Earth and Environment<\/p>\n<p><strong><div class=\"bu_collapsible_container \" aria-live=\"polite\" data-customize-animation=\"false\"><h4 class=\"bu_collapsible\" aria-expanded=\"false\"tabindex=\"0\" role=\"button\">Near-term ecological forecasting<\/h4><div class=\"bu_collapsible_section\" style=\"display: none;\"><\/strong><\/p>\n<p>The Dietze lab uses a combination of ecological theory, informatics, statistics, and cyberinfrastructure development to advance the field of predictive ecology and iterative forecasting, in which new data are used to refine predictive models. Current application areas include soil microbes, vegetation phenology, land carbon and water fluxes, aquatic productivity, and algal blooms. This project involves the development of computational forecasting workflows, including modules for expansion to new data repositories and forecasting data types, statistical model calibration and validation, and forecast visualization. 
Students will learn about ecological forecasting, high-performance and cloud computing, software containerization, real-time workflow automation, databases, and the statistics of iterative model-data assimilation.<\/p>\n<p><\/div>\n<\/div>\n<\/p>\n<hr \/>\n<p><a href=\"http:\/\/www.bumc.bu.edu\/compbiomed\/people\/faculty\/josh-campbell\/\">Joshua Campbell<\/a><br \/>\nDepartment of Computational Biomedicine<\/p>\n<p><strong><div class=\"bu_collapsible_container \" aria-live=\"polite\" data-customize-animation=\"false\"><h4 class=\"bu_collapsible\" aria-expanded=\"false\"tabindex=\"0\" role=\"button\">Single cell transcriptomics<\/h4><div class=\"bu_collapsible_section\" style=\"display: none;\"><\/strong><\/p>\n<p>The Campbell lab focuses on developing computational methods for characterizing cellular heterogeneity in gene expression using single cell RNA sequencing. Tools include CELDA (CEllular Latent Dirichlet Allocation), which identifies hidden transcriptional states and cellular subpopulations in count-based, single-cell RNA-seq data, and DecontX, which estimates contamination by ambient RNA in single cell data. This project involves analyzing publicly available single cell datasets to test and develop new methods of single cell analysis. Students will learn about RNA sequencing for bulk tissue and single cell samples, and will help develop analysis pipelines and data visualizations in the R programming language with the R\/shiny graphical user interface.<\/p>\n<p><\/div>\n<\/div>\n<\/p>\n<hr \/>\n<p><a href=\"http:\/\/sites.bu.edu\/davieslab\/\">Sarah W. 
Davies<\/a><br \/>\nDepartment of Biology<o:p><\/o:p><\/p>\n<p><strong><div class=\"bu_collapsible_container \" aria-live=\"polite\" data-customize-animation=\"false\"><h4 class=\"bu_collapsible\" aria-expanded=\"false\"tabindex=\"0\" role=\"button\">Genes and pathways regulating symbiosis in corals<\/h4><div class=\"bu_collapsible_section\" style=\"display: none;\"><\/strong><\/p>\n<p>The Davies lab studies how corals and their symbiotic algae maintain and lose symbiosis under varying environmental conditions. Corals meet the majority of their energy needs through life-long symbiotic relationships with single-celled algae. Loss of this relationship leads to coral bleaching and, eventually, colony death. Some corals have a facultative relationship with their algal symbionts, wherein both the host and symbiont can be cultured independently and manipulated in and out of symbiosis. This project involves analyzing gene expression, and in particular, orthologous gene covariance, in such facultative systems under baseline and stress conditions, to help elucidate maintenance and loss of symbiosis. It will use a holobiont (coral + algae) transcriptome developed in the Davies lab. Students will develop knowledge and skills related to ecology, evolution, and RNA-seq analysis. 
Since stress experiments are ongoing, the project may include an experimental component.<o:p><\/o:p><\/p>\n<p><\/div>\n<\/div>\n<\/p>\n<hr \/>\n<p><a href=\"http:\/\/chemweb.bu.edu\/groups\/allengroup\/\">Karen Allen<\/a><br \/>\nDepartment of Chemistry<o:p><\/o:p><\/p>\n<p><strong><div class=\"bu_collapsible_container \" aria-live=\"polite\" data-customize-animation=\"false\"><h4 class=\"bu_collapsible\" aria-expanded=\"false\"tabindex=\"0\" role=\"button\">Determining modes of protein-membrane interaction<\/h4><div class=\"bu_collapsible_section\" style=\"display: none;\"><\/strong><\/p>\n<p>The Allen lab explores the relationship between protein structure and function using X-ray diffraction and enzyme kinetic studies. In bacteria, a principal mechanism for glycan (complex sugar molecule) assembly on the cytoplasmic face of cell membranes involves polyprenol phosphate (PrenP) phosphoglycosyl transferases (PGTs). PGTs catalyze transfer of a phosphosugar to a membrane-bound PrenP acceptor. Recently, the lab solved the X-ray crystallographic structure of the PGT PglC from Campylobacter concisus, showing that it contains a re-entrant membrane helix (RMH) that penetrates only one lea\ufb02et of the bilayer then re-emerges on the cytoplasmic face. This contradicts computational prediction that the RMH is a transmembrane helix. This project involves developing hidden Markov models (HMM) to predict these \u201cmisannotated\u201d helices in other protein families using data from an in vivo cysteine labeling method to assess whether the N-terminus lies on the cytoplasm or periplasm side of the membrane. Students will gain exposure to protein chemistry, enzyme functional studies, chemoinformatic library analysis, sequence and structural alignment methods and HMM modeling techniques. 
Since the cysteine labeling studies are ongoing, this project may include an experimental component.<o:p><\/o:p><\/p>\n<p><\/div>\n<\/div>\n<\/p>\n<hr \/>\n<p><a href=\"https:\/\/profiles.bu.edu\/Prasad.Patil\">Prasad Patil<\/a><br \/>\nDepartment of Biostatistics<o:p><\/o:p><\/p>\n<p><strong><div class=\"bu_collapsible_container \" aria-live=\"polite\" data-customize-animation=\"false\"><h4 class=\"bu_collapsible\" aria-expanded=\"false\"tabindex=\"0\" role=\"button\">Multi-Study Feature Selection<\/h4><div class=\"bu_collapsible_section\" style=\"display: none;\"><\/strong><\/p>\n<p>I work with sets of datasets that measure the same outcome and overlapping sets of features in multiple patient cohorts. Generally, these are datasets within which patient survival (outcome) and gene expression measurements of 20,000+ genes (features) are recorded for ovarian or breast cancer patients. The goal is to train a prediction rule for risk of cancer progression or recurrence that performs well across datasets and generalizes well for new patients. Oftentimes, there are far more predictors than patients in each dataset, so feature filtration and selection prior to training a prediction rule is a necessary step. A prevailing question is how to ensure that we select features which predict well across studies and avoid features that only perform exceedingly well within a single study. 
Students will gain experience working with high-dimensional genomic datasets and feature selection and machine learning approaches implemented in the R programming language.<\/p>\n<p><\/div>\n<\/div>\n<\/p>\n<hr \/>\n<p><a href=\"https:\/\/profiles.bu.edu\/Chunyu.Liu\">Chunyu Liu<\/a><br \/>\nDepartment of Biostatistics<\/p>\n<p><strong><div class=\"bu_collapsible_container \" aria-live=\"polite\" data-customize-animation=\"false\"><h4 class=\"bu_collapsible\" aria-expanded=\"false\"tabindex=\"0\" role=\"button\">Genetic and Lifestyle Factors for Complex Phenotypes<\/h4><div class=\"bu_collapsible_section\" style=\"display: none;\"><\/strong><\/p>\n<p>The Liu lab develops statistical approaches and applies those methodologies to identify genetic and lifestyle factors that influence complex phenotypes. Two projects are available:<\/p>\n<p>1) Mitochondrial DNA (mtDNA) sequencing project: Mitochondria are the powerhouses of human cells. mtDNA is involved in the major pathway for energy production. Students will have the opportunity to use publicly available software to identify mutations in the mitochondrial genome (mtDNA) from human whole genome sequencing data. In addition, they will have the opportunity to perform association analyses of the mtDNA mutations with cardiovascular disease.<\/p>\n<p>2) Gene expression and alcohol consumption project: Gene expression is the process by which information in a gene is used to generate messenger RNA (mRNA) for protein production. Students will have the opportunity to identify genes that are related to alcohol consumption. 
In addition, students will explore gene pathway analyses to identify gene networks that are related to alcohol consumption and cardiovascular disease.<o:p><\/o:p><\/p>\n<p><\/div>\n<\/div>\n<\/p>\n<hr \/>\n<p><a href=\"http:\/\/www.bu.edu\/biology\/people\/profiles\/cynthia-a-bradham\/\">Cynthia Bradham<\/a><br \/>\nDepartment of Biology<o:p><\/o:p><\/p>\n<p><strong><div class=\"bu_collapsible_container \" aria-live=\"polite\" data-customize-animation=\"false\"><h4 class=\"bu_collapsible\" aria-expanded=\"false\"tabindex=\"0\" role=\"button\">Identifying Cell-Types Across Treatments in Single-cell RNA Sequencing Data<\/h4><div class=\"bu_collapsible_section\" style=\"display: none;\"><\/strong><\/p>\n<p>Single-cell RNA sequencing technologies are extraordinarily powerful for dissecting cell compositions in heterogeneous cell mixtures in single biological conditions. However, complications arise when multiple biological conditions are introduced &#8212; such as disease status or drug treatment. We have developed a novel algorithm ICAT to more accurately identify shared, as well as distinct, cell-types between treatments in scRNAseq data. Potential BRITE students could look forward to contributing to new features in ICAT including identifying stably expressed genes between treatments, testing performance across different implementations, and creating a Seurat wrapper. Students would get first-hand experience using machine learning to parse large datasets, implementing high performance Python code, and exposing Python packages to R using the reticulate package. Students will also make heavy use of the linux command line and git. 
This project will be perfect for students interested in algorithm development, Python and R programming, and machine learning, while working at the very intersection of mathematics, computer science, and developmental biology.<o:p><\/o:p><\/p>\n<p><\/div>\n<\/div>\n<\/p>\n<p>&nbsp;<\/p>\n<hr \/>\n<hr \/>\n<p>&nbsp;<\/p>\n<h3>2019<\/h3>\n<hr \/>\n<p><a href=\"http:\/\/www.bumc.bu.edu\/compbiomed\/people\/faculty\/josh-campbell\/\">Joshua Campbell<\/a><br \/>\nDepartment of Medicine<br \/>\nDivision of Computational Biomedicine<\/p>\n<p><strong><div class=\"bu_collapsible_container \" aria-live=\"polite\" data-customize-animation=\"false\"><h4 class=\"bu_collapsible\" aria-expanded=\"false\"tabindex=\"0\" role=\"button\"><\/strong><strong>Single Cell Sequencing Analysis<\/strong><\/h4><div class=\"bu_collapsible_section\" style=\"display: none;\"><\/p>\n<p>High-throughput genomic technologies are rapidly evolving including the areas of DNA and RNA sequencing. Novel types of complex data are being quickly generated and require novel methods for quality control and analysis. We are currently focused on developing and\/or applying methods for identifying genomic alterations in cancer, quantifying the mutagenic effect of carcinogens, and characterizing cellular heterogeneity using single cell RNA sequencing. For example, we have developed the Celda framework (CEllular Latent Dirichlet Allocation), which can be used to identify hidden transcriptional states and cellular populations in count-based single-cell RNA-seq data. 
This project will include application of Celda to new and\/or publicly available single cell datasets using the Single Cell Toolkit \u2013 an interactive R\/shiny app.<\/p>\n<p><\/div>\n<\/div>\n<\/p>\n<hr \/>\n<p><a href=\"http:\/\/sites.bu.edu\/davieslab\/\">Sarah Davies<\/a><br \/>\nDepartment of Biology<\/p>\n<p><strong><div class=\"bu_collapsible_container \" aria-live=\"polite\" data-customize-animation=\"false\"><h4 class=\"bu_collapsible\" aria-expanded=\"false\"tabindex=\"0\" role=\"button\"><\/strong><strong>Coral symbiosis and Climate Change<\/strong><\/h4><div class=\"bu_collapsible_section\" style=\"display: none;\"><\/p>\n<p>Our lab is interested in understanding the diversity and dynamics of coral-algal symbiosis. Unfortunately, this symbiotic relationship breaks down in response to thermal stressors associated with climate change in a phenomenon known as coral bleaching. The current project will use RNA-sequencing data to determine how gene expression responds to thermal stress in corals and their algal symbionts. This project will use various bioinformatic tools and statistical approaches to disentangle which genes and gene networks are affected by thermal stress. The student will ultimately help address questions as to how algal-coral symbiosis operates, and why this relationship breaks down under thermal stress, which is relevant now more than ever under a changing climate.<\/p>\n<p><\/div>\n<\/div>\n<\/p>\n<hr \/>\n<p><a href=\"http:\/\/jlab.bu.edu\/\">W. 
Evan Johnson\u00a0<\/a><br \/>\nDepartments of Medicine and Biostatistics<\/p>\n<p><strong><div class=\"bu_collapsible_container \" aria-live=\"polite\" data-customize-animation=\"false\"><h4 class=\"bu_collapsible\" aria-expanded=\"false\"tabindex=\"0\" role=\"button\"><\/strong><strong>Tools and software for profiling microbial communities in multiple human diseases<\/strong><\/h4><div class=\"bu_collapsible_section\" style=\"display: none;\"><\/p>\n<p>Microbial communities which live in and on the human body play a vital role in health and disease. The simultaneous study of microbial function and associated host response is key for understanding metabolism in healthy people and pathogenesis in diseased individuals. Recent innovations in sequencing technologies have enabled the profiling of microbial communities, their function, and the interrogation of host-pathogen interactions at a deeper resolution than ever before.<\/p>\n<p>However, due to the heterogeneity of the microbiome across individuals and the complexity of microbes\u2019 interactions with each other and their hosts, these analyses require advanced computational and statistical techniques and current tools for this purpose are not sufficient. We are developing flexible and powerful statistical tools and software for jointly analyzing microbial and host data from microbiome experiments.<\/p>\n<p>These tools include workflows to compile annotated libraries of genes and genomes, the curation of functional associations between genes and microbes in metabolism and immune response, and linkage of microbial abundance to host gene\/pathway expression and other outcomes. Our software package will also provide a user-friendly R\/Shiny interface with interactive graphics and analysis tools, making the pipeline accessible to users without strong computational backgrounds. 
We are applying our methods in the context of multiple existing and prospective studies in obesity, kidney disease, lung cancer, and infectious diseases such as HIV and tuberculosis. These studies will result in a deliberate and focused effort to develop a greater understanding of the microbial communities that cohabit human systems, including the community content, microbe-microbe interactions, and host-microbe relationships.<\/p>\n<p>Researchers on this project will aid in the development of statistical tools and software, and support the analysis of data from microbiome studies across multiple diseases.<\/p>\n<p><\/div>\n<\/div>\n<\/p>\n<hr \/>\n<p><a href=\"http:\/\/people.bu.edu\/sebas\/about.htm\">Paola Sebastiani<\/a><br \/>\nDepartment of Biostatistics<\/p>\n<p><strong><div class=\"bu_collapsible_container \" aria-live=\"polite\" data-customize-animation=\"false\"><h4 class=\"bu_collapsible\" aria-expanded=\"false\"tabindex=\"0\" role=\"button\"><\/strong><strong>Biomarkers of Healthy Aging<\/strong><\/h4><div class=\"bu_collapsible_section\" style=\"display: none;\"><\/p>\n<p>Dr. Sebastiani is the statistician for two studies of human longevity: the New England Centenarian Study and the Long Life Family Study. Both studies aim to discover the genetic and environmental factors that promote long and healthy lives. The two studies are complementary in terms of data structure (one is a population-based study with more than 2,000 centenarians and one is a family-based study with approximately 550 families demonstrating clustering for exceptional longevity). Both studies, funded by the National Institute on Aging, investigate genetic and environmental factors that affect aging and individual susceptibility to age-related diseases and disability. 
Key long-term objectives of the studies are:<\/p>\n<p>(1) Identify genetic risk factors as well as lifestyles that can affect aging and use this information to design interventions that reduce the burden of morbidity and mortality in older people.<br \/>\n(2) Translate the discoveries from genetic studies into risk prediction models that can help identify genetic signatures of healthy aging and their interaction with the environment.<br \/>\n(3) Identify biomarkers of aging that can be used as prognostic and diagnostic tools.<\/p>\n<p><\/div>\n<\/div>\n<\/p>\n<hr \/>\n<p><a href=\"https:\/\/microbesatbu.wordpress.com\/\">Jennifer Bhatnagar<\/a><br \/>\nDepartment of Biology<\/p>\n<p><strong><div class=\"bu_collapsible_container \" aria-live=\"polite\" data-customize-animation=\"false\"><h4 class=\"bu_collapsible\" aria-expanded=\"false\"tabindex=\"0\" role=\"button\"><\/strong><strong>Predicting changes in the Earth microbiome; near-term ecological forecasting of critical soil microorganisms<\/strong><\/h4><div class=\"bu_collapsible_section\" style=\"display: none;\"><\/p>\n<p>Soil microorganisms have a variety of functions in our natural ecosystems \u2013 from acting as plant symbionts to animal pathogens to free-living decomposers that break down dead material (like plant litter) and recycle nutrients and carbon through the biosphere. This project will develop new bioinformatics tools to answer the questions, \u201chow well will our natural ecosystems function in the coming century and do we know enough about microorganisms in the environment to predict which ones will persist \u2013 and how active they will be \u2013 over space and time?\u201d We will obtain microbial DNA sequence data collected from soils across a new national sampling initiative \u2013 the National Ecological Observatory Network (NEON). This data will be analyzed for biosynthetic clusters that code for specific biochemical pathways in microbial genomes that affect their ecology (e.g. 
ability to serve as pathogens or symbionts). The data will be used to develop a statistical model that partitions the variance in microbial function on orthogonal axes of space and time, as well as across environmental variables like climate (temperature, precipitation), soil nutrient content, and plant biomass.<\/p>\n<p><\/div>\n<\/div>\n<\/p>\n<hr \/>\n<p><a href=\"http:\/\/www.emililab.org\/\">Andrew Emili<\/a><br \/>\nDepartment of Biochemistry<\/p>\n<p><strong><div class=\"bu_collapsible_container \" aria-live=\"polite\" data-customize-animation=\"false\"><h4 class=\"bu_collapsible\" aria-expanded=\"false\"tabindex=\"0\" role=\"button\"><\/strong><strong>Biomolecular Interactomes<\/strong><\/h4><div class=\"bu_collapsible_section\" style=\"display: none;\"><\/p>\n<p>Dr. Emili develops and applies advanced proteomic, molecular genetic, genomic and bioinformatic technologies to investigate the molecular associations and biological roles of the many varied macromolecules present in different cells, tissues and organisms. His group aims to generate comprehensive maps of the physical and functional \u201cinteractomes\u201d of informative model organisms.\u00a0 These maps are expected to lead to breakthrough mechanistic understanding of how cells and tissues function at a fundamental molecular level, and serve as valuable resources for the broader research community. 
Ultimately, the aim is to translate this basic knowledge into novel diagnostic and therapeutic tools, with an emphasis on cancer, cardiovascular disease and neurodegenerative disorders.\u00a0 Students on this project will aid in the development of software and tools that support the analysis of interactome data.<\/p>\n<p><\/div>\n<\/div>\n<\/p>\n<p>&nbsp;<\/p>\n<hr \/>\n<hr \/>\n<p>&nbsp;<\/p>\n<h3>2018<\/h3>\n<hr \/>\n<p><a href=\"http:\/\/tandem.bu.edu\/home.html\">Gary Benson<\/a><br \/>\nDepartments of Biology and Computer Science<\/p>\n<p><strong><div class=\"bu_collapsible_container \" aria-live=\"polite\" data-customize-animation=\"false\"><h4 class=\"bu_collapsible\" aria-expanded=\"false\"tabindex=\"0\" role=\"button\"> Algorithms for detecting genetic variation<\/strong><\/h4><div class=\"bu_collapsible_section\" style=\"display: none;\"><\/p>\n<p>The Benson lab develops algorithms to detect genetic variation using next-generation sequencing data. Our main focus is human genetic variation in tandem repeat sequences, called variable number of tandem repeats (VNTRs) and large structural variations (SVs), including inversions, translocations, duplications, insertions and deletions. We use publicly available human sequencing data, especially high coverage, long read data, to detect and characterize variants. We are interested in where the variants occur, their frequency in populations, and how they may affect gene function or chromosome function. Besides detection tools, we also develop online databases to store and share information about the variants we detect. 
A variety of student projects related to this work are available.<\/p>\n<p><\/div>\n<\/div>\n<\/p>\n<hr \/>\n<p><a href=\"http:\/\/people.bu.edu\/dietze\/\">Michael Dietze<\/a><br \/>\nDepartment of Earth &amp; Environment<\/p>\n<p><strong><div class=\"bu_collapsible_container \" aria-live=\"polite\" data-customize-animation=\"false\"><h4 class=\"bu_collapsible\" aria-expanded=\"false\"tabindex=\"0\" role=\"button\"> Near-term forecasting of ecological processes: integrating multiple data streams and models<\/strong><\/h4><div class=\"bu_collapsible_section\" style=\"display: none;\"><\/p>\n<p>The world is changing in many ways that impact ecosystems and the essential services they provide to human health and well-being. In the face of such change, it is imperative that scientists provide the best available information about likely impacts, and responses to decision alternatives, to help society both anticipate and adapt to change. Ecological forecasting is a transformative, interdisciplinary research area concerned with predicting the future states and distributions of ecosystems and their services to humans. This project involves an automated workflow for integrating multiple data streams into predictive models. 
Students will have the opportunity to contribute to the overall cyberinfrastructure (R packages, Docker+RabbitMQ stack, database) and to active forecast projects on tick-borne disease, ticks, and small mammal hosts; harmful algal blooms; land surface CO2 and water fluxes; and leaf phenology.<\/p>\n<p><\/div>\n<\/div>\n<\/p>\n<hr \/>\n<p><a href=\"https:\/\/sites.google.com\/site\/kirillskorolev\/\">Kirill Korolev<\/a><br \/>\nDepartment of Physics<\/p>\n<p><strong><div class=\"bu_collapsible_container \" aria-live=\"polite\" data-customize-animation=\"false\"><h4 class=\"bu_collapsible\" aria-expanded=\"false\"tabindex=\"0\" role=\"button\"> Temporal and spatial variation of microbiota in inflammatory bowel disease<\/strong><\/h4><div class=\"bu_collapsible_section\" style=\"display: none;\"><\/p>\n<p>Humans, like all other animals and plants, are colonized by thousands of microbial species. This microbiome helps us digest food, trains our immune system, and protects us from pathogens, but it also plays an important role in disease. Detecting species responsible for diseases is a major goal in microbiome research. This project will study how the microbiome changes along the GI tract in controls and patients with different inflammatory diseases of the digestive system. 
One of the goals will be to develop prognostic and diagnostic biomarkers that can distinguish disease subtypes or guide the choice of treatment.<\/p>\n<p><\/div>\n<\/div>\n<\/p>\n<hr \/>\n<p><a href=\"https:\/\/profiles.bu.edu\/Paola.Sebastiani\">Paola Sebastiani<\/a><br \/>\nDepartment of Biostatistics<\/p>\n<p><strong><div class=\"bu_collapsible_container \" aria-live=\"polite\" data-customize-animation=\"false\"><h4 class=\"bu_collapsible\" aria-expanded=\"false\"tabindex=\"0\" role=\"button\"> Biomarkers of Healthy Aging<\/strong><\/h4><div class=\"bu_collapsible_section\" style=\"display: none;\"><\/p>\n<p>Dr. Sebastiani is the statistician for two studies of human longevity: the New England Centenarian Study and the Long Life Family Study. Both studies aim to discover the genetic and environmental factors that promote long and healthy lives. The two studies are complementary in terms of data structure (one is a population-based study with more than 2,000 centenarians and one is a family-based study with approximately 550 families demonstrating clustering for exceptional longevity). Both studies, funded by the National Institute on Aging, investigate genetic and environmental factors that affect aging and individual susceptibility to age-related diseases and disability. 
Key long-term objectives of the studies are:<br \/>\n(1) Identify genetic risk factors as well as lifestyles that can affect aging, and use this information to design interventions that reduce the burden of morbidity and mortality in older people.<br \/>\n(2) Translate the discoveries from genetic studies into risk prediction models that can help identify genetic signatures of healthy aging and their interaction with the environment.<br \/>\n(3) Identify biomarkers of aging that can be used as prognostic and diagnostic tools.<\/p>\n<p><\/div>\n<\/div>\n<\/p>\n<hr \/>\n<p><a href=\"https:\/\/microbesatbu.wordpress.com\/research\/\">Jennifer Bhatnagar<\/a><br \/>\nDepartment of Biology<\/p>\n<p><strong><div class=\"bu_collapsible_container \" aria-live=\"polite\" data-customize-animation=\"false\"><h4 class=\"bu_collapsible\" aria-expanded=\"false\"tabindex=\"0\" role=\"button\"> Predicting changes in the Earth microbiome; near-term ecological forecasting of critical soil microorganisms<\/strong><\/h4><div class=\"bu_collapsible_section\" style=\"display: none;\"><\/p>\n<p>Soil microorganisms have a variety of functions in our natural ecosystems \u2013 from acting as plant symbionts to animal pathogens to free-living decomposers that break down dead material (like plant litter) and recycle nutrients and carbon through the biosphere. This project will develop new bioinformatics tools to answer the questions, \u201chow well will our natural ecosystems function in the coming century and do we know enough about microorganisms in the environment to predict which ones will persist &#8211; and how active they will be \u2013 over space and time?\u201d We will obtain microbial DNA sequence data collected from soils across a new national sampling initiative \u2013 the National Ecological Observatory Network (NEON). This data will be analyzed for biosynthetic clusters that code for specific biochemical pathways in microbial genomes that affect their ecology (e.g. 
ability to serve as pathogens or symbionts). The data will be used to develop a statistical model that partitions the variance in microbial function on orthogonal axes of space and time, as well as across environmental variables like climate (temperature, precipitation), soil nutrient content, and plant biomass.<\/p>\n<p><\/div>\n<\/div>\n<\/p>\n<hr \/>\n<p><a href=\"https:\/\/www.bumc.bu.edu\/zaia\/\">Joe Zaia<\/a><br \/>\nDepartment of Biochemistry<\/p>\n<p><strong><div class=\"bu_collapsible_container \" aria-live=\"polite\" data-customize-animation=\"false\"><h4 class=\"bu_collapsible\" aria-expanded=\"false\"tabindex=\"0\" role=\"button\"> Glycan Abundance Imputation through Clustering and Machine Learning<\/strong><\/h4><div class=\"bu_collapsible_section\" style=\"display: none;\"><\/p>\n<p>The Zaia lab studies protein glycosylation, a mechanism whereby glycans (complex sugar molecules) are attached to proteins through a series of biosynthetic enzyme reactions. Because the reactions do not go to completion, the result is populations of mature protein molecules heterogeneous with respect to glycosylation and diverse with respect to biological activities. Many families of proteins contain domains (known as lectin domains) that bind to glycan substructures (known as epitopes, containing 1-5 monosaccharide units). Glycosylation of a protein strongly influences its binding to other proteins via lectin domains. The lab is studying the mechanisms whereby viruses alter glycosylation of surface glycoproteins in order to escape host immune response. We have generated large datasets measuring the expression of glycans and glycoproteins from virus evolution experiments, using liquid chromatography-mass spectrometry and tandem mass spectrometry.<br \/>\nREU students will design and implement computer programs to cluster glycan compositions, which they will then use as composite baselines for imputing missing glycopeptide abundances. 
They will perform and evaluate the imputation through machine learning algorithms; this will build off of published data and currently unpublished work within the lab. Students will learn how glycans modulate interactions with glycan binding proteins, and will gain experience with data structures, statistical evaluation, and scientific programming in the context of clustering algorithms and machine learning methods.<\/p>\n<p><\/div>\n<\/div>\n<\/p>\n<p>&nbsp;<\/p>\n<hr \/>\n<hr \/>\n<p>&nbsp;<\/p>\n<h3>2017<\/h3>\n<hr \/>\n<p><a target=\"_blank\" href=\"http:\/\/cidarlab.org\/\" rel=\"noopener noreferrer\"><strong>Douglas Densmore, PhD<\/strong><\/a><strong><br \/>\n<\/strong><\/p>\n<p><strong><div class=\"bu_collapsible_container \" aria-live=\"polite\" data-customize-animation=\"false\"><h4 class=\"bu_collapsible\" aria-expanded=\"false\"tabindex=\"0\" role=\"button\"> Data-Driven Methods for Automated Design of Lab on a Chip Devices<\/h4><div class=\"bu_collapsible_section\" style=\"display: none;\"><\/strong><\/p>\n<p>The project involves the engineering of biological systems in a field called synthetic biology. A key challenge in synthetic biology is to control numerous environmental conditions and genetic network interactions. Microfluidics has the potential to address this challenge when used as the platform on which synthetic biological systems are explicitly specified, designed, built, and tested. 
Undergraduate researchers will work on algorithmic methods for designing microfluidic primitives by extracting analytical models from the datasets of mass-prototyped microfluidic chips developed in the CIDAR Lab.<\/p>\n<p><\/div>\n<\/div>\n<\/p>\n<hr \/>\n<p><a target=\"_blank\" href=\"https:\/\/www.bu.edu\/sph\/profile\/josee-dupuis\/\" rel=\"noopener noreferrer\"><strong>Jos\u00e9e Dupuis, PhD<\/strong><\/a><strong><br \/>\n<\/strong><\/p>\n<p><strong><div class=\"bu_collapsible_container \" aria-live=\"polite\" data-customize-animation=\"false\"><h4 class=\"bu_collapsible\" aria-expanded=\"false\"tabindex=\"0\" role=\"button\"> An integrated genomics information resources platform <\/h4><div class=\"bu_collapsible_section\" style=\"display: none;\"><\/strong><br \/>\nGenome-wide association analyses have identified a large number of genetic variants associated with complex traits. However, most susceptibility variants discovered to date fall outside of gene regions, and identifying the true causal genetic variants remains a challenge. The goal of this project is to build an annotation pipeline that will integrate information from various public genomics databases to help prioritize functional studies on genetic variants found to be associated with one or several traits of interest.<\/p>\n<p><\/div>\n<\/div>\n<\/p>\n<hr \/>\n<p><a target=\"_blank\" href=\"http:\/\/www.kirillkorolev.com\/\" rel=\"noopener noreferrer\"><strong>Kirill Korolev, PhD<\/strong><\/a><strong><br \/>\n<\/strong><\/p>\n<p><strong><div class=\"bu_collapsible_container \" aria-live=\"polite\" data-customize-animation=\"false\"><h4 class=\"bu_collapsible\" aria-expanded=\"false\"tabindex=\"0\" role=\"button\"> Temporal and spatial variation of microbiota in inflammatory bowel disease <\/h4><div class=\"bu_collapsible_section\" style=\"display: none;\"><\/strong><br \/>\nHumans like all other animals and plants are colonized by thousands of microbial species. 
This microbiome helps us digest food, trains our immune system, protects us from pathogens, but also plays an important role in disease. Detecting species responsible for diseases is a major goal in microbiome research. This project will study how the microbiome contributes to inflammatory bowel disease (IBD) and develop prognostic and diagnostic biomarkers for this disease. Unlike most previous studies, we will analyze not only bacterial, but also fungal and archaeal communities because we previously found strong microbial imbalance in the fungal as well as bacterial communities.<\/p>\n<p><\/div>\n<\/div>\n<\/p>\n<hr \/>\n<p><a target=\"_blank\" href=\"http:\/\/dna.bu.edu\/tullius\/\" rel=\"noopener noreferrer\"><strong>Tom Tullius, PhD<\/strong><\/a><strong><br \/>\n<\/strong><\/p>\n<p><strong><div class=\"bu_collapsible_container \" aria-live=\"polite\" data-customize-animation=\"false\"><h4 class=\"bu_collapsible\" aria-expanded=\"false\"tabindex=\"0\" role=\"button\"> A structural map of the human genome<\/h4><div class=\"bu_collapsible_section\" style=\"display: none;\"><\/strong><br \/>\n
<\/p>\n<p><\/div>\n<\/div>\n<\/p>\n<hr \/>\n<p><a target=\"_blank\" href=\"https:\/\/www.bu.edu\/biology\/people\/profiles\/david-j-waxman\/\" rel=\"noopener noreferrer\"><strong>David Waxman, PhD<\/strong><\/a><strong><br \/>\n<\/strong><\/p>\n<p><strong><div class=\"bu_collapsible_container \" aria-live=\"polite\" data-customize-animation=\"false\"><h4 class=\"bu_collapsible\" aria-expanded=\"false\"tabindex=\"0\" role=\"button\">High-throughput differential gene expression database <\/h4><div class=\"bu_collapsible_section\" style=\"display: none;\"><\/strong><br \/>\n
<\/p>\n<p><\/div>\n<\/div>\n<\/p>\n<p>&nbsp;<\/p>\n<hr \/>\n<hr \/>\n<p>&nbsp;<\/p>\n<h3>2016<\/h3>\n<hr \/>\n<p><a target=\"_blank\" href=\"http:\/\/www.bu.edu\/segrelab\/\" rel=\"noopener noreferrer\"><strong>Daniel Segre, PhD<\/strong><\/a><strong><br \/>\n<\/strong><\/p>\n<p><strong><div class=\"bu_collapsible_container \" aria-live=\"polite\" data-customize-animation=\"false\"><h4 class=\"bu_collapsible\" aria-expanded=\"false\"tabindex=\"0\" role=\"button\"> Synthetic ecology of microbes<\/h4><div class=\"bu_collapsible_section\" style=\"display: none;\"><\/strong><\/p>\n<p>Synthetic ecology of microbes is a young, fast-developing research area concerned with the design, construction and understanding of engineered microbial consortia. The idea of designing microbial consortia is inspired by the ubiquitous presence of microbial communities on our planet, and the key role that these communities play in many aspects of human life, including biogeochemical cycles, animal and plant physiology, and me<img loading=\"lazy\" src=\"\/britereu\/files\/2016\/07\/Segre-150x150.jpg\" alt=\"Segre\" class=\" wp-image-53 size-thumbnail alignleft\" width=\"150\" height=\"150\" \/>tabolic engineering. Synthetic ecology may allow us to perform specific tasks by understanding and embracing \u2013 rather than avoiding &#8211; properties that seem often inherent in the natural microbial world, such as diversity, resilience, competition for resources, division of labor, and obligate interdependence. Moreover, an engineered community of organisms may perform tasks that no individual species could possibly perform on its own. 
In recent years, the Segr\u00e8 lab has pioneered new computational methods for studying metabolic dynamics in natural and engineered microbial ecosystems, based on the knowledge of all metabolic reactions encoded in an organism\u2019s genome.<\/p>\n<p><img loading=\"lazy\" src=\"\/britereu\/files\/2016\/07\/COMETS-strip-636x152-636x152.jpg\" alt=\"COMETS-strip-636x152\" class=\" wp-image-54 aligncenter\" width=\"569\" height=\"136\" \/><\/p>\n<p>One of these approaches, an open source platform for the Computation of Microbial Ecosystems in Time and Space (COMETS), has been successfully tested on small synthetic communities. The specific project suggested for this summer program will involve simulating computationally and testing experimentally the interaction between different bacterial species on a Petri dish. In particular, based on available literature data on co-occurring species in natural microbial communities, and on predictions previously generated in the Segr\u00e8 lab, the student will select two or more species to use in the experiments. COMETS simulations will predict the growth of the species on their own and in co-culture, including possible changes in colony morphology. The predictions will be compared with experimental measurements of colony growth on a Petri dish, using an established protocol for automated acquisition and analysis of images taken with a regular slide scanner connected to a computer and an Arduino microcontroller. Thus the student will have a chance to learn about genome-scale models of artificial microbial communities, and to test predictions using a simple but quantitative and instructive experimental setup. 
The project will have the potential to probe mechanistically putative inter-species microbial interactions, and can be easily extended to multiple organisms and conditions.<strong>\u00a0<\/strong><\/p>\n<p><\/div>\n<\/div>\n<\/p>\n<hr \/>\n<p><strong><a target=\"_blank\" href=\"http:\/\/tandem.bu.edu\/\" rel=\"noopener noreferrer\">Gary Benson, PhD<\/a><\/strong><\/p>\n<p style=\"text-align: left;\"><div class=\"bu_collapsible_container \" aria-live=\"polite\" data-customize-animation=\"false\"><h4 class=\"bu_collapsible\" aria-expanded=\"false\"tabindex=\"0\" role=\"button\"><strong><\/strong><strong>Bit-Parallel Sequence Alignment Algorithms for Tandem Repeat Detection<\/strong><\/h4><div class=\"bu_collapsible_section\" style=\"display: none;\">The Benson lab develops algorithms and software for biological sequence comparison\u00a0and approximate repeat detection in genomic sequences. DNA repeats often\u00a0contain binding sites for regulatory proteins and polymorphic differences can lead to\u00a0differential gene expression. The lab is particularly interested in the occurrence, variability,\u00a0and evolution of tandem repeats and has developed a high-throughput sequencing analysis\u00a0tool, VNTRseek, to identify tandem repeats which<img loading=\"lazy\" src=\"\/britereu\/files\/2016\/07\/benson2.jpg\" alt=\"benson2\" class=\"size-full wp-image-64 alignleft\" width=\"171\" height=\"203\" \/> have variable copy numbers in a\u00a0population, also known as variable number of tandem repeats loci or VNTRs. In addition,\u00a0because genome analysis often requires extensive sequence comparison (read mapping,\u00a0homology detection, etc.) 
the lab is developing fast, bit-parallel sequence alignment\u00a0algorithms which use computer logic operations to emulate the score calculations in\u00a0standard serial alignment algorithms.<\/p>\n<p style=\"text-align: left;\">The REU project involves the development of new bit-parallel alignment algorithms and\u00a0their application to characterize VNTR occurrence in whole human genome sequencing\u00a0data. The student will help develop and implement a bit-parallel algorithm for tandem\u00a0global alignment (used when detecting tandem repeats of unknown copy number) using\u00a0SIMD instructions (single instruction, multiple data) and possibly parallel instructions\u00a0designed for Graphics Processing Units (GPUs). The student will also help analyze\u00a0VNTRseek results from long sequencing reads (~1000 bp per read) generated with Pacific\u00a0Biosciences sequencing technology. These tasks will ultimately help in the development of\u00a0an internet accessible public database for VNTRs. The student will gain knowledge in\u00a0human genetic variability and DNA repeats, and skills in analyzing high-throughput\u00a0sequencing data sets, algorithm design and testing, and parallel programming.<\/p>\n<p style=\"text-align: left;\"><\/div>\n<\/div>\n<\/p>\n<hr \/>\n<p style=\"text-align: left;\"><strong><a target=\"_blank\" href=\"http:\/\/math.bu.edu\/people\/mkon\/\" rel=\"noopener noreferrer\">Mark A. 
Kon, PhD<\/a><\/strong><\/p>\n<p style=\"text-align: left;\"><div class=\"bu_collapsible_container \" aria-live=\"polite\" data-customize-animation=\"false\"><h4 class=\"bu_collapsible\" aria-expanded=\"false\"tabindex=\"0\" role=\"button\"><strong> <\/strong><strong>Work on copy number versus gene expression biomarkers in cancer<\/strong><\/h4><div class=\"bu_collapsible_section\" style=\"display: none;\"><\/p>\n<p style=\"text-align: left;\">The use of copy number information in the analysis, separation, and prognostication\u00a0for cancer subtypes is becoming a common approach in cancer biomarker analysis.\u00a0However, the comparison of predictions arising from copy number biomarkers in\u00a0tissue samples to those arising from gene expression information has had some\u00a0difficulties. Among these is the fact that these two information subtypes are quite\u00a0incompatible in their formats. Our group has developed a formatting procedure for\u00a0copy num<img loading=\"lazy\" src=\"\/britereu\/files\/2016\/07\/image001.png\" alt=\"image001\" class=\" wp-image-66 alignleft\" width=\"171\" height=\"182\" \/>ber information that makes it similar in format to gene expression\u00a0information. This will allow the importation of a large number of gene expression\u00a0analysis tools to the study of copy number information, now in a parallel fashion.\u00a0One of the goals of this project will be to analyze the application of both toolkits\u00a0(gene expression and copy number) to subtyping and outcome prediction in cancer,\u00a0in order to compare their effectiveness as well as find whether they synergize as\u00a0predictive tools. A second goal will be to determine the methods\u2019 usefulness in\u00a0unsupervised learning. This involves the discovery of cancer subtypes from larger\u00a0groups of cancer samples using clustering and related methods. 
The use of parallel\u00a0data formats for both copy number and gene expression data may have some\u00a0interesting implications for such subtyping.<\/p>\n<p style=\"text-align: left;\">This project will initially involve the student\u2019s development of skills in\u00a0implementing machine learning programs for prediction of subtypes and outcomes\u00a0based on feature vectors involving genomic and\/or copy number information. Once\u00a0the skills involving tools such as support vector machine and random forest have\u00a0been developed, the student will be expected to apply these tools to predict\u00a0outcomes of ovarian and other cancer classes based on the two information types.\u00a0There will be a need for computational skills and for mathematical understanding\u00a0of basic concepts.\u00a0In addition to the application of machine learning to the prediction of cancer\u00a0outcomes and identification of subtypes, the participant will be expected to attend\u00a0the laboratory meetings of students\/postdocs affiliated with the DeLisi group at BU,\u00a0which is involved largely with uses of machine learning in computational biology.<\/p>\n<p><\/div>\n<\/div>\n<\/p>\n<hr \/>\n<p style=\"text-align: left;\"><a target=\"_blank\" href=\"http:\/\/chemweb.bu.edu\/groups\/allengroup\/\" rel=\"noopener noreferrer\"><strong>Karen Allen, Ph.D.<\/strong><\/a><\/p>\n<p style=\"text-align: left;\"><div class=\"bu_collapsible_container \" aria-live=\"polite\" data-customize-animation=\"false\"><h4 class=\"bu_collapsible\" aria-expanded=\"false\"tabindex=\"0\" role=\"button\"><strong>HALOALKANOIC ACID DEHALOGENASE SUPERFAMILY (HADSF)<\/strong><\/h4><div class=\"bu_collapsible_section\" style=\"display: none;\"><\/p>\n<p style=\"text-align: left;\">The HAD superfamily is a large enzyme family (~120,000 non-redundant sequences) of\u00a0phosphotransferases (phosphomutases, ATPases and phosphatases) represented in all three\u00a0kingdoms of life, and, within each cell, by numerous 
homologues (28 in E. coli; 35 in Salmonella\u00a0typhimurium; 31 in Pseudomonas aeruginosa; 30 in Mycobacterium tuberculosis; 84 in Caenorhabdit<img loading=\"lazy\" src=\"\/britereu\/files\/2016\/07\/karen-pic-400x300.jpg\" alt=\"karen-pic-400x300\" class=\"wp-image-75 alignleft\" width=\"234\" height=\"175\" \/>is\u00a0elegans; 169 in Arabidopsis thaliana; and 183 in human). Approximately 80-90% of the HAD\u00a0superfamily members are phosphatases and it is estimated that 40% of the bacterial metabolome is\u00a0comprised of phosphorylated compounds. Although the HADSF fold is dominant among eukaryotic\u00a0and prokaryotic phosphatases, it has yet to be truly exploited for inhibitor discovery. Such inhibitors\u00a0would be invaluable to discovery and study of metabolic pathways involving phosphorylated\u00a0metabolites. This is in contrast to the phosphotyrosine phosphatase family of phosphatases, for which\u00a0great progress has been made in inhibitor design and focused library-screening-based discovery. To date there are just two reports of HADSF phosphatase inhibitor discovery. We aim to\u00a0focus on targeting the region of the HAD proteins responsible for phosphoryl-group binding\u00a0<span style=\"line-height: 1.5;\">(contributed by the catalytic domain). 
This may have the added benefit of producing a more <\/span><span style=\"line-height: 1.5;\">generalizable phospho-mimetic, one of the \u201choly grails\u201d of phosphoryl transfer.<\/span><\/p>\n<p style=\"text-align: left;\"><img loading=\"lazy\" src=\"\/britereu\/files\/2016\/07\/apgm-17D6_GS-PEG6K-24_1x_50-636x470.jpg\" alt=\"apgm-17[D6]_GS-PEG6K-24_1x_50\" class=\" wp-image-74 alignright\" width=\"250\" height=\"185\" srcset=\"https:\/\/sites.bu.edu\/britereu\/files\/2016\/07\/apgm-17D6_GS-PEG6K-24_1x_50-636x470.jpg 636w, https:\/\/sites.bu.edu\/britereu\/files\/2016\/07\/apgm-17D6_GS-PEG6K-24_1x_50.jpg 800w\" sizes=\"(max-width: 250px) 100vw, 250px\" \/><\/p>\n<p style=\"text-align: left;\">In order to identify such a mimetic, we will leverage a number of atomic resolution (~1 \u00c5) structures of\u00a0HAD members liganded to transition-state analogues. These enzymes invariably form a trigonal\u00a0bipyramid with the phosphoryl group together with an apical ligand from the nucleophilic aspartate in\u00a0the phosphatase. The REU student will utilize this data to make a template molecular model of an\u00a0inhibitor scaffold defined by hydrogen bond donors and acceptors which ignore the phosphorus atom\u00a0itself. This model scaffold will then be utilized to mine databases of known binding fragments and\u00a0inhibitors. The student will also utilize chemi-informatics and protein mapping algorithms in order to\u00a0analyze the chemical diversity of \u201chits\u201d and the match to the biophysical properties of the\u00a0corresponding binding sites. Ultimately, the compounds will be tested experimentally on a set of HAD\u00a0phosphotransferases for inhibitory activity and successful compounds will be studied for binding mode\u00a0by obtaining X-ray crystal structures of the complexes with the HAD enzymes. 
Through these studies,\u00a0students will gain exposure to chemi-informatic library searching and analysis, in silico docking and\u00a0structural analysis (with the possibility of experimental kinetics and structure analysis).<\/p>\n<p style=\"text-align: left;\"><\/div>\n<\/div>\n<\/p>\n<hr \/>\n<p style=\"text-align: left;\"><strong><a target=\"_blank\" href=\"http:\/\/www.programmingbiology.org\/\" rel=\"noopener noreferrer\">Douglas Densmore, PhD<\/a><\/strong><span><\/span><\/p>\n<p style=\"text-align: left;\"><div class=\"bu_collapsible_container \" aria-live=\"polite\" data-customize-animation=\"false\"><h4 class=\"bu_collapsible\" aria-expanded=\"false\"tabindex=\"0\" role=\"button\"><strong>Living Computing Project (LCP)<\/strong><\/h4><div class=\"bu_collapsible_section\" style=\"display: none;\"><\/p>\n<p style=\"text-align: left;\">The Living Computing Project (www.programmingbiology.org) investigates computing paradigms in\u00a0living organisms. Specifically, it explores if digital, analog, memory, and communication concepts can be\u00a0implemented in cellular environments. Understanding if quantitative approaches and standardized\u00a0metrics can broadly be applied to these systems will help us develop the formalized mechanisms we can\u00a0use to specify, design, and verify these systems. 
Solutions in medicine, materials, sensing, and\u00a0manufacturing will be able to be more easily<img loading=\"lazy\" src=\"\/britereu\/files\/2016\/07\/D.-Densmore-200x300.png\" alt=\"D.-Densmore-200x300\" class=\" wp-image-83 alignleft\" width=\"145\" height=\"218\" \/> created, efficiently implemented, and broadly distributed if\u00a0computing paradigms are found to be applicable.<\/p>\n<p style=\"text-align: left;\"><img loading=\"lazy\" src=\"\/britereu\/files\/2016\/07\/living-computing-project-logo.png\" alt=\"living-computing-project-logo\" class=\" size-full wp-image-84 aligncenter\" width=\"384\" height=\"125\" \/><\/p>\n<p style=\"text-align: left;\">The collection of 10 UROP students for the summer of 2016 will be aiding in this research. Specifically\u00a0they will be involved in one of four efforts:<\/p>\n<p style=\"text-align: left;\">2016 Boston University Wet Lab iGEM Team \u2013 Four Students \u2013 These students will take basic DNA\u00a0building blocks and assemble them into genetic circuits. These circuits will act as either digital or\u00a0memory based computing elements. The students will then characterize these circuits to extract\u00a0quantitative metrics related to their performance. This data will be archived physically and electronically\u00a0along with the biological DNA information to begin to curate a library of computational components for\u00a0the LCP. These components will be housed in the LCP Inventory of Composable Elements (ICE), the\u00a0Synthetic Biology Open Language (SBOL) Stack, and BTSync (for flow cytometer data). This collection of\u00a0information will be used to augment existing design software to predict the performance of future\u00a0circuits and search for optimized designs. This project will require molecular biology skills and\u00a0bioinformatics analysis. 
These students will be supervised by BME graduate student Divya Israni.<\/p>\n<p style=\"text-align: left;\">2016 Boston University Hardware iGEM Team \u2013 Three Students &#8211; These students will be creating a\u00a0microfluidic design environment to automate the testing of genetic logic circuits. The fabrication, control,\u00a0and data extraction for this platform will be automated with the use of software tools. The team will be\u00a0creating a genetic system that interfaces with off the shelf sensors and hardware so that it can be quickly\u00a0reconfigured for numerous environments and designs. It will consist of a set of input locations,\u00a0intermediate locations, switch fabric, and output locations. This will allow for a generic device that is\u00a0differentiated experiment by experiment. This project will involve embedded systems design, CNC\u00a0milling, 3D printing, and software programming. These students will be supervised by ECE graduate\u00a0student Ryan Silva.<\/p>\n<p style=\"text-align: left;\">Phagebook and CIDAR Software \u2013 Two Students \u2013 Synthetic biology software includes tools for\u00a0specification, design, assembly, verification, and data management activities. CIDAR lab\u00a0(www.cidarlab.org) has numerous software packages that need to be made either more robust, user\u00a0friendly, or more widely tested. These include a design environment for functional specification and\u00a0assembly of genetic circuits (Phoenix) as well as a social media platform for synthetic biology\u00a0(Phagebook). This project will require web design, Java\/Javascript, cloud computing, and database skills.\u00a0These students will be supervised by ECE graduate student Prashant Vaidyanathan.<\/p>\n<p style=\"text-align: left;\">Living Computing Project Research Intern \u2013 One Student \u2013 This student will work on fundamental\u00a0research questions related to models of computation in synthetic biology (e.g. 
state machines, data flow\u00a0networks) and how they can be formalized and assigned to biological elements. This project will require\u00a0computational interests, programming skills, and some computer science exposure. This student will be\u00a0supervised by ECE Research Assistant Professor Dr. Swapnil Bhatia.<\/p>\n<p style=\"text-align: left;\"><\/div>\n<\/div>\n<\/p>\n<hr \/>\n<p style=\"text-align: left;\"><a target=\"_blank\" href=\"https:\/\/microbesatbu.wordpress.com\/\" rel=\"noopener noreferrer\"><strong>Jennifer Talbot, PhD<\/strong><\/a><\/p>\n<p style=\"text-align: left;\"><div class=\"bu_collapsible_container \" aria-live=\"polite\" data-customize-animation=\"false\"><h4 class=\"bu_collapsible\" aria-expanded=\"false\"tabindex=\"0\" role=\"button\">Understanding variation in microbial community composition in both space and time<\/h4><div class=\"bu_collapsible_section\" style=\"display: none;\"><\/p>\n<p>New DNA sequencing technology has fundamentally transformed our understanding of microbial communities. We can now rapidly census the species composition of microbial communities in complex systems like soil, and relate them to the spatial and environmental factors that structure these communities. This technology has enabled us to test classic ecological the<img loading=\"lazy\" src=\"\/britereu\/files\/2016\/07\/talbot-e1410557129684-636x636.jpg\" alt=\"talbot-e1410557129684\" class=\"size-medium wp-image-46 alignleft\" width=\"200\" height=\"200\" srcset=\"https:\/\/sites.bu.edu\/britereu\/files\/2016\/07\/talbot-e1410557129684-636x636.jpg 636w, https:\/\/sites.bu.edu\/britereu\/files\/2016\/07\/talbot-e1410557129684-150x150.jpg 150w, https:\/\/sites.bu.edu\/britereu\/files\/2016\/07\/talbot-e1410557129684-1024x1024.jpg 1024w\" sizes=\"(max-width: 200px) 100vw, 200px\" \/>ories about how microbial communities change across space (Fierer &amp; Jackson, 2006). 
However, little work has characterized how microbial communities change over time (Shade et al., 2013). Filling this knowledge gap will increase our chances of accurately forecasting how microbial systems will respond to disturbance in the future. This is a critical need in Earth system science, because we are beginning to find that specific microbial species have unique activities in the cycling of elements and energy within the biosphere. Nevertheless, to date no work has tested the relative importance of space vs. time in shaping microbial community composition and activity.<\/p>\n<p><strong>Approach and learning outcomes <\/strong><br \/>\nWe propose a meta-analysis approach to determining the time vs. space variation in microbial community composition. To do this, we will collect DNA sequence data from previously identified publications that have resolution in both space and time. This sequence data comes from high-throughput sequencing platforms (e.g. 454-pyrosequencing, Illumina MiSeq runs) that generate Gb of sequence data for each sample set for a publication.<\/p>\n<p>Once collected, this data will be analyzed for community composition using the QIIME bioinformatic pipeline (Caporaso et al., 2010) through the BU SCC. The community data will then be used to develop a statistical model that partitions the variance in community composition data on orthogonal axes of space and time.<\/p>\n<p>We seek an REU student to collect DNA sequence data published online, work with the data through bioinformatics pipelines, and analyze the data using statistical software (R). 
The project will involve development of coding skills using Jupyter Notebooks and statistical training in analysis and visualization of microbial community composition data.<\/p>\n<p><strong>A successful project will result in training of: <\/strong><br \/>\n1) Front-to-back analysis of large DNA sequence-based microbial community datasets;<br \/>\n2) Analysis and visualization of data using R software packages (e.g. ggplot2);<br \/>\n3) Communicating results in a presentation and draft of a manuscript by the end of the internship<\/p>\n<p><\/div>\n<\/div>\n<\/p>\n","protected":false},"excerpt":{"rendered":"<p>2021 Gary Benson Departments of Biology and Computer Science Ethan Deyle Department of Biology Josee Dupuis Department of Biostatistics W. Evan Johnson Departments of Biostatistics and Medicine Jennifer Bhatnagar Department of Biology Michael Dietze Department of Earth and Environment Joshua Campbell Department of Computational Biomedicine Sarah W. Davies Department of Biology Karen Allen Department of 
[&hellip;]<\/p>\n","protected":false},"author":10810,"featured_media":0,"parent":45,"menu_order":1,"comment_status":"closed","ping_status":"closed","template":"","meta":[],"_links":{"self":[{"href":"https:\/\/sites.bu.edu\/britereu\/wp-json\/wp\/v2\/pages\/196"}],"collection":[{"href":"https:\/\/sites.bu.edu\/britereu\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/sites.bu.edu\/britereu\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/sites.bu.edu\/britereu\/wp-json\/wp\/v2\/users\/10810"}],"replies":[{"embeddable":true,"href":"https:\/\/sites.bu.edu\/britereu\/wp-json\/wp\/v2\/comments?post=196"}],"version-history":[{"count":39,"href":"https:\/\/sites.bu.edu\/britereu\/wp-json\/wp\/v2\/pages\/196\/revisions"}],"predecessor-version":[{"id":901,"href":"https:\/\/sites.bu.edu\/britereu\/wp-json\/wp\/v2\/pages\/196\/revisions\/901"}],"up":[{"embeddable":true,"href":"https:\/\/sites.bu.edu\/britereu\/wp-json\/wp\/v2\/pages\/45"}],"wp:attachment":[{"href":"https:\/\/sites.bu.edu\/britereu\/wp-json\/wp\/v2\/media?parent=196"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}