
BD2K Training and Education

The Big Data to Knowledge (BD2K) Training activities were designed to improve big data skills of biomedical scientists and increase the number of biomedical data scientists. BD2K-funded grants have produced a number of educational resources to strengthen the role of data science in modern biomedical research.

Resources available through the Training Coordination Center

NIH-funded biomedical data science training programs represent a broad range of degree programs, career-development paths, in-person workshops, virtual events, and other unique activities. 

The BD2K Training Coordination Center (TCC) helps promote and support training and educational activities across the collection of NIH-funded Big Data to Knowledge (BD2K) grants. Learn more about the TCC. 

Click on items in the list below for links and descriptions of each resource produced by the TCC.

Resources available through BD2K Training Grants

BD2K training grants have produced a number of in-person courses, Massive Open Online Courses (MOOCs), workshops, summer training programs, and other activities, which can be accessed through the sunburst, an interactive display of NIH-funded biomedical data science training programs. Explore educational resources from the BD2K training grants through the sunburst.

BD2K Mentored Career Development Award in Biomedical Big Data Science for Clinicians and Doctorally Prepared Scientists (K01)

  • Project Tycho: A repository for global health data in a standardized format that is compliant with FAIR guidelines. Project Tycho contains case counts for notifiable conditions for the United States and includes data for dengue-related conditions for 100 countries obtained from the World Health Organization and national health agencies.
  • HashtagHealth: A resource that addresses the dearth of neighborhood data and offers novel characterizations of neighborhoods. Neighborhood indicators include food themes, healthiness of food mentions, frequency of exercise/recreation mentions, metabolic intensity of physical activities, and happiness levels.
  • genTB: An analysis tool for translational tuberculosis genomic data that offers a means for sharing, citing, and crediting tuberculosis data and metadata; prediction of resistance from genotype using a machine learning algorithm; geographic data mapping; and a user-friendly statistical analysis tool.

BD2K Open Educational Resources for Biomedical Big Data (R25)

  • Oregon Health & Science University (OHSU) Educational Materials: A repository of advanced introductory materials for individuals seeking to learn more about data science in order to expand their research programs, explore future career paths in data science, and understand and apply BD2K concepts in their present jobs.

Training Resources from the BD2K Centers

The BD2K Centers also produced training and educational resources including courses, workshops, webinars, lecture series, summer internships and training programs. Visit the Centers pages below for additional information.

BD2K Library of Integrated Network-based Cellular Signatures (LINCS) Data Coordination and Integration Center (DCIC)

The BD2K-LINCS DCIC delivers high-quality educational materials online, such as Massive Open Online Courses (MOOCs), as well as through mentoring, seminars, and symposia:

Center for Causal Discovery (CCD)

The CCD center offers courses, workshops, and lectures on causal relationships in big biomedical data: 

Center for Expanded Data Annotation and Retrieval (CEDAR)

The CEDAR center provides a list of educational resources for metadata training and offers tutorials on using the CEDAR software to create simple templates and metadata records.

The Mobilize Center

The Mobilize Center faculty have created a number of MOOCs and run workshops for individuals interested in data science:

Center for Predictive Computational Phenotyping (CPCP)

The CPCP conducts training activities on data science, predictive models for biomedicine, and computational phenotyping for a broad set of audiences:

Mobile Sensor Data-to-Knowledge (MD2K)

MD2K offers an annual training program to help investigators develop the multidisciplinary skills needed to generate high-quality mHealth research and solutions. Lectures from past training programs, training videos, and webinars on biomedical applications are available on the MD2K website:


The KnowEnG center offers an online resource that hosts prototypes of educational games for teaching sequence alignment, dynamic programming, and phylogenetic tree reconstruction algorithms. Through an R25 program partnership among the University of Illinois Urbana-Champaign, Mayo Clinic, and Fisk University, the KnowEnG center provides underrepresented minority undergraduate students with curricular training and experience in bioinformatics and big data.


PIC-SURE trains the next generation of biomedical big data scientists through its summer training program in biomedical informatics and by offering graduate-level courses in data science and precision medicine:

This page last reviewed on March 22, 2024