Title of proposed idea: Beyond GWAS (genome-wide association studies)
Nominator: Innovation Brainstorm participants
Major obstacle/challenge to overcome: Although GWAS (genome-wide association studies) have uncovered many genetic loci for a range of conditions and diseases, a major challenge is translating this knowledge into functional insights. One key roadblock is the inability to capture diverse environmental measurements precisely. Incomplete, nonstandardized, and shallow collection of phenotype data makes it difficult to use GWAS data to define mechanisms or suggest potential interventions. Insufficient sample sizes prevent clarification of the role and relevance of complex traits in health and disease. In some cases, valuable opportunities are missed, such as harnessing genotyping data from randomized clinical trials that already have rich phenotypic data. For the massive amounts of data that already exist, practical and effective integration strategies lag behind. Possible remedies include new algorithms for performing higher-order ‘omics studies, a repository of rare knockouts, and more complete sharing of data and biospecimens.
Fig. 1. Scientific progress over time
Emerging scientific opportunity ripe for Common Fund investment: Further progress in GWAS requires both persistence and innovation. While GWAS execution is routine and fairly well established (the “D” portion of the curve in Fig. 1), other areas are in a period of rapid growth (the “C” region of the curve: single-trait analysis and expression quantitative trait loci (eQTLs)), and still others require a substantial push to reach their potential (the “A” and “B” regions: functional annotation of genetic variants, annotation of a reference genome (ENCODE, the ENCyclopedia Of DNA Elements), whole-genome analyses in unrelated individuals and families, large-scale phenotyping, and clinical translation). This last group is likely to be the ripest for Common Fund (CF) investment.
Common Fund investment that could accelerate scientific progress in this field: Three proposed projects (each independent but complementary and potentially synergistic) could overcome some of the current roadblocks in this area.
- Human Phe-Ge Project. This proposed project represents a very large-scale effort to create a “National Cohort” of people (DNA plus phenotypic data) for discovery research in health and disease. The large sample size (1 million people across the United States) would permit sufficient coverage of the human genome, along with a diversity of participants that reflects the makeup of the U.S. population. Clinical data would be harvested from electronic medical records (EMRs) after participants opt in, and the cohort would be followed longitudinally. Web surveys (e.g., 23andWe) could harness the reach and power of social networking to gather data in real-world settings. Within the larger cohort, a sub-cohort of approximately 1,000 people would, upon consent, undergo deep phenotyping and clinical validation. Data would be shared freely and widely, with appropriate consent in place from volunteer participants.
- Functional Genome Project. This potential project would leverage functional information to find causal variants, employing ENCODE (http://www.genome.gov/10005107), epigenomics, and functional genomics strategies. Functional annotation of 1,000 individuals across multiple cell types and conditions would record transcription, DNA methylation and histone modifications, and DNA sequence (phased whole-genome sequencing). The project aims to advance GWAS science by yielding a more granular phenotype that will enable faster translation of genomic findings to clinical applications.
- Multidimensional Analyses for Genomic Studies. To further address the issue of GWAS data integration, this project would strive to provide context for genomic data by accessing environmental measures, incorporating population and family structure, and including epigenetic context. Higher-level interactions could be identified through the capture of functional interactions, pathway analyses, and novel combinatorics approaches. Candidate methodological innovations include more flexible analysis methods and study designs, whole-genome sequencing, and computational improvements that speed up and expand processing capabilities.
Potential impact of Common Fund investment: Moving GWAS beyond its current capability promises faster progress from association to function, which will likely accelerate discovery for multiple traits. Most GWAS to date lack clear clinical relevance. The proposed projects aim to lead to better clinical decision support, new diagnostics and therapeutics, improved coordination with industry, and the realization of the meaningful use criteria of the HITECH Act. (The Health Information Technology for Economic and Clinical Health (HITECH) Act, enacted as part of the American Recovery and Reinvestment Act of 2009, was signed into law on February 17, 2009, to promote the adoption and meaningful use of health information technology.)