A catalog of reference genes is an essential starting point for analyzing the gut microbiome. Existing catalogs are based on samples from single cohorts or on reference genomes, which limits their coverage for our purposes. We therefore combined newly sequenced samples with previously sequenced ones into a single integrated catalog of just under 10 million genes. This work was published in Nature Biotechnology.


Many analyses of the human gut microbiome depend on a catalog of reference genes. Existing catalogs for the human gut microbiome are based on samples from single cohorts or on reference genomes or protein sequences, which limits coverage of global microbiome diversity. Here we combined 249 newly sequenced samples of the Metagenomics of the Human Intestinal Tract (MetaHIT) project with 1,018 previously sequenced samples to create a cohort from three continents that is at least threefold larger than cohorts used for previous gene catalogs. From this we established the integrated gene catalog (IGC) comprising 9,879,896 genes. The catalog includes close-to-complete sets of genes for most gut microbes, which are also of considerably higher quality than in previous catalogs. Analyses of a group of samples from Chinese and Danish individuals using the catalog revealed country-specific gut microbial signatures. This expanded catalog should facilitate quantitative characterization of metagenomic, metatranscriptomic and metaproteomic data from the gut microbiome to understand its variation across populations in human health and disease.


Li J, Jia H, Cai X, Zhong H, Feng Q, Sunagawa S, Arumugam M, Kultima JR, Prifti E, Nielsen T, Juncker AS, Manichanh C, Chen B, Zhang W, Levenez F, Wang J, Xu X, Xiao L, Liang S, Zhang D, Zhang Z, Chen W, Zhao H, Al-Aama JY, Edris S, Yang H, Wang J, Hansen T, Nielsen HB, Brunak S, Kristiansen K, Guarner F, Pedersen O, Doré J, Ehrlich SD, Bork P, Wang J; MetaHIT Consortium.


Pons N, Le Chatelier E, Batto JM, Kennedy S, Haimet F, Winogradski Y, Pelletier E, LePaslier D, Artiguenave F, Bruls T, Weissenbach J, Turner K, Parkhill J, Antolin M, Casellas F, Borruel N, Varela E, Torrejon A, Denariaz G, Derrien M, van Hylckama Vlieg JE, Viega P, Oozeer R, Knoll J, Rescigno M, Brechot C, M'Rini C, Mérieux A, Yamada T, Tims S, Zoetendal EG, Kleerebezem M, de Vos WM, Cultrone A, Leclerc M, Juste C, Guedon E, Delorme C, Layec S, Khaci G, van de Guchte M, Vandemeulebrouck G, Jamet A, Dervyn R, Sanchez N, Blottière H, Maguin E, Renault P, Tap J, Mende DR.

Journal and Citation

Nature Biotechnology 32(8): 834-841 (2014)

DOI: 10.1038/nbt.2942

Link: http://www.nature.com/nbt/journal/v32/n8/full/nbt.2942.html

