The Common Fund Data Ecosystem
The Common Fund Data Ecosystem (CFDE) is developing an online portal that will allow researchers to access and work across multiple Common Fund (CF) program data sets within a digital cloud environment. CF programs generate a wide range of diverse and valuable data sets designed to be used by the research community to accelerate discovery. However, because these data sets reside in different locations, working with several of them at once in an accessible and user-friendly way is challenging or even impossible. The CFDE portal will help remedy this problem by creating a shared resource that helps make CF data sets FAIR (Findable, Accessible, Interoperable, and Reusable) and enables researchers to ask scientific and clinical questions from a single access point.
The CFDE Coordinating Center oversees CFDE activities and works closely with participating CF data coordinating centers to include an initial subset of CF data sets, with plans to expand to additional data sets in the future. The CFDE will also develop and deploy resources and tools, including training materials, to empower the research community to use CF data sets for novel scientific research that was not previously possible. This may include hypothesis generation, discovery, or validation that leads to new insights in health and disease.
The CFDE includes several integrated efforts:
- CFDE Coordinating Center – The CFDE Coordinating Center oversees CFDE activities, engages with participating Common Fund programs, connects with user communities, supports training, develops tools and standards, and provides technical expertise. These activities are conducted in close partnership with relevant Common Fund programs.
- Participating Common Fund data coordinating centers (DCCs) – The DCCs are working with the CFDE Coordinating Center to understand their programs’ unique requirements for data storage and analysis, adopt/adapt guidelines and best practices, share resources and tools with other DCCs, develop use cases for cross-data analyses, and provide training. In January 2020, the Common Fund released an Engagement Opportunity Announcement for eligible DCCs to engage with the CFDE Coordinating Center and other DCCs to establish the CFDE. For more details, please view the Engagement Opportunity Announcement and Process for Rolling Submission of Engagement Opportunity Award Plans.
- Leveraging the Science and Technology Research Infrastructure for Discovery, Experimentation, and Sustainability (STRIDES) Initiative – The CFDE is working with the STRIDES Initiative from the NIH Office of Data Science Strategy (ODSS) to provide favorable pricing for cloud data storage and to develop guidelines that ensure data are stored and organized optimally for proper data versioning and upkeep.
Using STRIDES for in-kind cloud services for new Common Fund applicants: To fully leverage the Common Fund's investment in STRIDES, Common Fund award applicants will be asked to outline, in the Budget Justification section, the anticipated type, direct cost, and justification for activities related to cloud computing. These include, but are not limited to, data storage, computing, data movement/egress (see below), professional services, training, and related activities. To foster a cloud-centric model that minimizes data movement out of the cloud, data egress fees (i.e., charges for outgoing traffic from cloud environments) should be minimized, and any request to support egress fees incurred by large-scale data download functionalities should have strong justification. If the application is funded, NIH will use this cost estimate to provide in-kind services via STRIDES. The amount requested for cloud services will not be added to the requested budget total and will not count toward the direct cost limit for the award. Upon award, NIH staff will coordinate with awardees to work through logistical details associated with STRIDES accounts. For more information, please see Notice of Information: Leveraging STRIDES for Cloud Computing Activities in Common Fund Awards (NOT-RM-20-009).
Currently, 13 Common Fund programs are eligible to engage with the CFDE Coordinating Center to establish the CFDE: 4D Nucleome (4DN), Acute to Chronic Pain Signatures (A2CPS), Extracellular RNA Communication (ExRNA), Gabriella Miller Kids First Pediatric Research (Kids First), Genotype-Tissue Expression (GTEx), Human BioMolecular Atlas Program (HuBMAP), Illuminating the Druggable Genome (IDG), Knockout Mouse Phenotyping Program (KOMP2), Library of Integrated Network-based Cellular Signatures (LINCS), Metabolomics, Molecular Transducers of Physical Activity Consortium (MoTrPAC), Stimulating Peripheral Activity to Relieve Conditions (SPARC), and Undiagnosed Diseases Network (UDN). These programs offer different perspectives that will enable a deeper understanding of the issues around using and integrating diverse data types, identifying mutual needs for Common Fund programs, and collaborating across programs to enhance data utility. Applying best practices and lessons learned from these partnerships, the CFDE Coordinating Center will expand its activities to engage with future Common Fund programs as well.
More information can be found in presentations from the May NIH Council of Councils meeting where the CFDE and the ODSS efforts were discussed, as well as the September Council of Councils meeting where the concept for the upcoming DCC funding opportunity to establish the CFDE was approved.
This page last reviewed on September 30, 2020