Associating growth conditions with cellular composition in Gram-negative bacteria

Funding Agency: Army Research Office (ARO), Biology.

Award Number: W911NF-12-1-0390.

Principal Investigators: C. Wilke, J. E. Barrick, E. M. Marcotte, P. Ravikumar, M. S. Trent at UT Austin, D. Segrè and Yannis Paschalidis at Boston Univ., and C. J. Marx at Harvard Univ.

Project Summary

Microbes impact many vital civilian and military activities. Naturally occurring pathogens, such as Legionella pneumophila (causative agent of Legionnaires’ disease) or Yersinia pestis (causative agent of bubonic plague), can incapacitate entire communities. When genetically manipulated or grown in the laboratory for deliberate attacks, microbes can pose national security risks, as in the anthrax mailings of 2001. Microbes can also be engineered for useful purposes: to serve as vaccines, to produce chemicals, or to break down toxins or other undesirable substances.

This project will develop mathematical methods and make systematic biological measurements to associate the conditions under which bacteria have grown with the resulting composition of the bacterial cell. This association has important applications both in bacterial forensics (e.g., identifying the source of a pathogen used in a deliberate attack) and in engineering. We will complete four specific tasks:
Task I: Develop methods to identify statistical association in multiple-input–multiple-output (MIMO) data. We will develop linear and non-linear methods to associate two high-dimensional data sets with each other and to predict values in one set from data points in the other.
Task II: Incorporate domain-specific knowledge into high-dimensional association models. We will leverage side-information (such as metabolic pathways) within the statistical framework of Task I. We will also develop inverse optimization methods as a source of side information.
Task III: Generate reference data sets of bacterial composition. We will grow bacteria under carefully controlled conditions and measure biomass composition, mRNA abundances, protein abundances, lipid abundances, and metabolic fluxes.
Task IV: Validate models using reference data sets. We will validate the models developed under Tasks I and II against the reference data sets. Validation results will feed back into further model development as well as subsequent experiments.
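
As a rough illustration of the linear case named in Task I, the sketch below fits a multi-output ridge regression that predicts one high-dimensional data set from another and scores it on held-out samples. It is not the project's method: the data are synthetic, and the function names (fit_ridge, predict, make_synthetic, r_squared), the dimensions, and the penalty value are all assumptions chosen for illustration.

```python
import numpy as np


def fit_ridge(X, Y, lam=1.0):
    """Multi-output ridge regression: solve (X^T X + lam * I) W = X^T Y."""
    n_features = X.shape[1]
    A = X.T @ X + lam * np.eye(n_features)
    return np.linalg.solve(A, X.T @ Y)


def predict(X, W):
    """Predict every output column at once from the input matrix."""
    return X @ W


def make_synthetic(n_samples=200, n_inputs=5, n_outputs=30, noise=0.1, seed=0):
    """Hypothetical stand-in data: rows are samples, X holds growth-condition
    variables and Y holds composition measurements (both invented here)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, n_inputs))
    W_true = rng.normal(size=(n_inputs, n_outputs))
    Y = X @ W_true + noise * rng.normal(size=(n_samples, n_outputs))
    return X, Y, W_true


def r_squared(Y_true, Y_pred):
    """Pooled coefficient of determination over all output columns."""
    ss_res = ((Y_true - Y_pred) ** 2).sum()
    ss_tot = ((Y_true - Y_true.mean(axis=0)) ** 2).sum()
    return 1.0 - ss_res / ss_tot


# Fit on the first 150 samples, evaluate on the remaining 50.
X, Y, W_true = make_synthetic()
X_train, X_test = X[:150], X[150:]
Y_train, Y_test = Y[:150], Y[150:]

W_hat = fit_ridge(X_train, Y_train, lam=1.0)
score = r_squared(Y_test, predict(X_test, W_hat))
print(f"held-out R^2: {score:.3f}")
```

Swapping the roles of X and Y in the same functions gives the inverse direction relevant to forensics, i.e. predicting growth conditions from measured composition. Held-out scoring of this kind is also one simple way the validation described in Task IV could be set up.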

To pursue this work, we have assembled an outstanding, highly interdisciplinary team of statisticians, computer scientists, microbiologists, and biochemists. Our work will be centered at The University of Texas at Austin, and it will involve experts in microbial physiology and computational modeling at Boston University and Harvard. Successful completion of this project will yield novel mathematical methods to associate bacterial growth conditions with cellular composition, identification of the types and ranges of growth conditions that lead to distinguishable cellular composition, identification of key compositional markers that are diagnostic of specific bacterial growth conditions, and assessments of model uncertainty, robustness, and computational cost.