Pittsburgh and Stanford teams will lead collaborations and provide computing, software and data infrastructure, as well as integrate unprecedented cellular-resolution data, as part of an international network of centers working to create a kind of cell-level Google Maps for the human body. Over the next four years, these teams will receive about $16 million and $4 million from the National Institutes of Health to fund further development of hybrid cloud infrastructure, transdisciplinary collaboration, data integration and software tools for the Human BioMolecular Atlas Program (HuBMAP). The teams are two of the six components of the HuBMAP Integration, Visualization & Engagement (HIVE) Collaboratory.
The HIVE Infrastructure and Engagement Component (HIVE IEC), co-led by Phil Blood of the Pittsburgh Supercomputing Center (PSC, a joint center of Carnegie Mellon University and the University of Pittsburgh) and Jonathan Silverstein of Pitt, will receive $16 million to lead community engagement and data collection, as well as provide ongoing hybrid cloud-based data storage, processing and analysis infrastructure to knit together the software tools and data produced by scientists in the HuBMAP Consortium and the broader scientific community. The Carnegie Mellon University (CMU) Tools Component of the HIVE (HIVE CMU TC), led by Matt Ruffalo of the CMU School of Computer Science, will be granted $4 million to continue to develop new software tools for uniformly processing, visualizing, searching and modeling data flowing from a vast collaboration of sites across the U.S. and elsewhere mapping specific proteins, genes and other cellular actors to cell function and location throughout the body.
The new funds are part of roughly $98 million in overall grants for the next four years of HuBMAP, a consortium composed of diverse research teams funded by the Common Fund at the National Institutes of Health. The HuBMAP Consortium (hubmapconsortium.org) is developing the tools needed to create an open, global atlas of the human body at the cellular level. These tools and maps will be openly available, to accelerate understanding of the relationships between cell and tissue organization and function and human health.
The HIVE IEC developed an innovative hybrid cloud microservices architecture to provide the underlying data infrastructure supporting the data and software tools generated by other teams within the HuBMAP Consortium. These data consist of hundreds of molecular and spatial datasets sampled from hundreds of human tissues across 14 different organs. Software tools deployed on this infrastructure by other HIVE teams will organize these data into reference tissue maps revealing essential life processes in microscopic structures within healthy human tissue. Key to the effort is providing mapping frameworks that ensure uniform incorporation of new data into the HuBMAP mapping project.
“Strong multi-institutional collaborations and flexibility in both the technologies and social aspects of the project were keys to the success of the first phase of HuBMAP,” said Blood, lead principal investigator (PI) of the HIVE IEC and Scientific Director at PSC. “Our successful approach with HuBMAP is now being leveraged in other NIH data projects, including the Cellular Senescence Network and the Common Fund Data Ecosystem.”
Over the next four years, the project will build a multiscale, open Human Reference Atlas (HRA), including tests for different life processes, which scientists across the world can use to answer questions about human health and disease. A fundamental part of this project will be to make importing new data easy while maintaining rigorous curation focused on scientific reliability and provenance. The effort will require new tools to search, visualize and explore the huge volume of biomolecular data. It will also involve engaging the larger scientific community so that the framework remains useful and available to researchers long after HuBMAP’s funding ceases.
“We look forward to continuing this terrific partnership among Pitt, PSC, CMU and Stanford into its production phase and the impact of our hybrid architecture on more biomedical science projects,” said Silverstein, PI of the HIVE IEC, Chief Research Informatics Officer at Pitt Health Sciences and the Institute for Precision Medicine and a professor in the Department of Biomedical Informatics (DBMI) at the Pitt School of Medicine.
The CMU TC, led by Matt Ruffalo at CMU’s School of Computer Science, has developed processing and computational analysis pipelines for HuBMAP datasets, providing uniform results across tissues, data providers and variants of an assay. These pipelines have been used to analyze molecular data for millions of cells across several modalities, with additional types of data to come during the production phase of the consortium. The CMU TC will continue to index uniformly processed HuBMAP datasets at the cellular level, allowing queries for individual cells of interest based on gene expression or protein specificity, and will work with other HIVE components to integrate this index into a molecular-resolution human reference atlas.
“It's very exciting to start the production phase of this consortium, using the technical and collaborative expertise we've developed over the last four years to integrate new and existing HuBMAP data into a true biomolecular atlas of the human body,” said Ruffalo, PI of the CMU TC and a systems scientist in CMU’s Computational Biology Department. (AKS/Newswise)