The World’s Most Comprehensive Baseline Dataset of Microbiome and Human Host Sequence Data

The Human Microbiome Project has transitioned from Common Fund support. For more information please visit

Please note that since the Human Microbiome Project is no longer being supported by the Common Fund, the program website is being maintained as an archive and will not be updated on a regular basis.

The mission of the NIH’s Human Microbiome Project (HMP) was to create the foundational research resources to support growing scientific interest in the role of the microbiome in human health and disease. A key resource created was a complete dataset of microbiome and host sequence data from a cohort of 300 adults who were verified to be free of disease and so considered healthy. This reference dataset includes over 2000 metagenomes and over 10 terabytes (TB) of DNA sequence data, making it the largest set of microbiome data from human or any other habitat. It is the world’s most comprehensive reference host/microbiome dataset because it includes the microbial community composition from five major body regions (nasal, oral, skin, gastrointestinal tract, and urogenital tract) of these subjects, together with the predicted metabolic pathways of the microbial communities in these body regions. All microbial members (bacterial, archaeal, bacteriophage, viral, and fungal) are included in this baseline dataset, and both phylogenetic marker gene sequence data [e.g., 16S rRNA, 18S rRNA, and internal transcribed spacer (ITS) region] and metagenomic whole genome shotgun sequence data were generated. Additional attention has been paid to the gut microbiome in this cohort: the research community has used the HMP reference dataset to analyze the mobile gene content, the antibiotic resistome, the bacteriophage composition, and the presence of putative pathogens in this key subset of the human microbiome reference dataset. To complete the overall dataset, the human genome sequence has also been analyzed for these subjects. Many broadly used databases have incorporated the complete HMP reference dataset. Two of these are notable: the HMP Data Analysis and Coordination Center and the Qiita web-based microbial study management platform.
The human sequence data are under controlled access but can be requested for appropriate research purposes from NIH’s database of Genotypes and Phenotypes (dbGaP).

The following nine key papers include the various datasets which comprise the complete HMP ‘healthy human’ microbiome reference dataset:

Structure, Function and Diversity of the Healthy Human Microbiome. The Human Microbiome Project Consortium. Nature. 2012 Jun 13. 486(7402): 207–14. doi: 10.1038/nature11234.

Country-Specific Antibiotic Use Practices Impact the Human Gut Resistome. Forslund K, Sunagawa S, Kultima JR, Mende DR, Arumugam M, Typas A, and Bork P. Genome Res. 2013 Jul. 23(7): 1163–69. doi: 10.1101/gr.155465.113.

Classification and Quantification of Bacteriophage Taxa in Human Gut Metagenomes. Waller AS, Yamada T, Kristensen DM, Kultima JR, Sunagawa S, Koonin EV, and Bork P. ISME J. 2014 Jul. 8(7): 1391–1402. doi: 10.1038/ismej.2014.30.

Metagenomic Analysis of Double-Stranded DNA Viruses in Healthy Adults. Wylie KM, Mihindukulasuriya KA, Zhou Y, Sodergren E, Storch GA, and Weinstock GM. BMC Biol. 2014 Sep 10. 12: 71. doi: 10.1186/s12915-014-0071-7.

Host Genetic Variation Impacts Microbiome Composition across Human Body Sites. Blekhman R, Goodrich JK, Huang K, Sun Q, Bukowski R, Bell JT, Spector TD, Keinan A, Ley RE, Gevers D, and Clark AG. Genome Biol. 2015 Sep 15. 16: 191. doi: 10.1186/s13059-015-0759-1.

Mobile Genes in the Human Microbiome Are Structured from Global to Individual Scales. Brito IL, Yilmaz S, Huang K, Xu L, Jupiter SD, Jenkins AP, Naisilisili W, Tamminen M, Smillie CS, Wortman JR, Birren BW, Xavier RJ, Blainey PC, Singh AK, Gevers D, and Alm EJ. Nature. 2016 Jul 21. 535(7612): 435–39. doi: 10.1038/nature18927.

Strains, Functions and Dynamics in the Expanded Human Microbiome Project. Lloyd-Price J, Mahurkar A, Rahnavard G, Crabtree J, Orvis J, Hall AB, Brady A, Creasy HH, McCracken C, Giglio MG, McDonald D, Franzosa EA, Knight R, White O and Huttenhower C. Nature. 2017 Oct 5. 550(7674): 61–66. doi: 10.1038/nature23889.

The Gut Mycobiome of the Human Microbiome Project Healthy Cohort. Nash AK, Auchtung TA, Wong MC, Smith DP, Gesell JR, Ross MC, Stewart CJ, Metcalf GA, Muzny DM, Gibbs RA, Ajami NJ and Petrosino JF. Microbiome. 2017 Nov 25. 5(1): 153. doi: 10.1186/s40168-017-0373-4.

Host Genetic Variation and its Microbiome Interactions within the Human Microbiome Project. Kolde R, Franzosa EA, Rahnavard G, Hall AB, Vlamakis H, Stevens C, Daly MJ, Xavier RJ, and Huttenhower C. Genome Med. 2018 Jan 29. 10(1): 6. doi: 10.1186/s13073-018-0515-8.


This page last reviewed on September 13, 2023