Expanding Our View of The Genomic Landscape Using the Genotype-Tissue Expression (GTEx) Data Set
The study of human genetics can help us find answers to questions about what makes us unique and how various diseases develop. The understanding of changes and differences in our DNA, known as genetic variation, can help explain some differences in risk of disease and how people respond to drug treatment. Although there has been some success in identifying genetic variants linked to diseases, there remains a challenge of explaining the function of millions of genetic variants across the human genome. To help overcome this challenge, the NIH Common Fund’s Genotype-Tissue Expression (GTEx) Program developed a reference data set for studying how genetic variation impacts the way a gene behaves in various cell types, tissues, and across individuals.
A group of researchers supported by the GTEx program analyzed the eighth version (V8) of the GTEx data set, which includes genetic data from 17,382 samples from 54 tissues of 948 post-mortem donors, to catalogue genetic variants that influence the activity, or expression, of almost all genes. These variants can control how a gene behaves in a cell, much as a power button turns an electronic device on and off. The researchers found many examples of how individual genetic variation among GTEx donors affected gene expression. Part of what makes these findings valuable is that they account for individual genetic variation, and they do so in the context of specific tissues, like brain or liver, where different genes play different roles in cell function. Also, the breadth of GTEx tissues allows researchers to ask a variety of questions about many different topics in health and disease. For example, GTEx researchers used the V8 data to see how a person’s sex affects gene expression, to find better ways to identify rare genetic variants, to better link a genetic variant to disease, to study how multiple genetic variants are connected in complex diseases, and to account for natural genetic variation among diverse populations in studies linking genetics to a specific trait.
In the study to uncover genetic variants that affect gene expression based on sex and population, the researchers identified a genetic variant that increased the expression of the gene AURKA in skeletal muscle in males but not in females. This gene has been widely studied as a risk factor for several cancers. They also identified a variant that decreased the expression of the gene SLC44A5 in the esophagus of individuals of European ancestry, but the expression of this gene was lower in African Americans. The SLC44A5 gene has been linked to Alzheimer’s disease in prior large genetic studies. Findings that link genetics to specific traits, like disease risk, could inform efforts to make personalized medicine a reality.
Studies like these show how the GTEx V8 data are stimulating new discoveries. The data are available to the research community, so scientists can use them to dissect the effect of genetic variation on gene expression, and to improve our understanding of the role of genetic variation in most human diseases.
The GTEx V8 data are now available to view at https://gtexportal.org/home/.
Read news articles about GTEx V8 findings from the New York Genome Center, the Broad Institute, and Science.
The GTEx Consortium atlas of genetic regulatory effects across human tissues. The GTEx Consortium. Science. 2020 Sept. 11.
*access to full text may require institutional permission.
Gene Expression Changes are Observed Shortly After Death
For some biomedical research studies, human tissues (like brain) are impractical to obtain from living donors. The NIH Common Fund’s Genotype-Tissue Expression (GTEx) program relied on the generous donation of tissue samples from deceased donors. Post-mortem interval (PMI), which is the time between death and sample collection, can alter normal RNA levels in post-mortem tissue. The cause of death may also affect the quality of the collected tissues and RNA levels. The quality of RNA is a key factor in measuring gene expression (whether a gene is turned on or off). Therefore, gene expression measured in post-mortem tissue samples can be affected both by biological responses to death and by the loss of RNA that occurs because of cell death.
In a new study, Dr. Pedro Ferreira and colleagues used the GTEx data set to analyze how the length of PMI affects gene expression in various tissues. The investigators observed changes in gene expression in muscle tissue right after death (less than four hours), while expression changes in other tissues took place later. In the blood, most expression changes occurred between 7 and 14 hours after death, which could be related to changes in blood flow triggered by death. The investigators also developed a model to predict PMI based on gene expression changes in various tissues. The results showed that gene expression changes in a collection of just four tissues (fat, lung, thyroid, and skin) could be combined to effectively predict time since death. This study demonstrates that gene expression changes in post-mortem tissues could be informative when determining the time since death in forensic cases. It also shows how the GTEx data set could be used to better understand the biological processes that are triggered by death.
The Effects of Death and Post-mortem Cold Ischemia on Human Tissue Transcriptomes. Ferreira PG, Muñoz-Aguirre M, Reverter F, Sá Godinho CP, Sousa A, Amadoz A, Sodaei R, Hidalgo MR, Pervouchine D, Carbonell-Caballero J, Nurtdinov R, Breschi A, Amador R, Oliveira P, Çubuk C, Curado J, Aguet F, Oliveira C, Dopazo J, Sammeth M, Ardlie KG, Guigó R. Nat Commun. 2018 Feb 13;9(1):490.
GTEx Creates a Reference Data Set to Study Genetic Changes and Gene Expression
Research studies have identified links between many genetic variants (changes in DNA sequence) and common diseases (e.g. cancer, diabetes, hypertension, Alzheimer's disease). We are now aware that genetic variants can regulate whether genes are turned on or off, which may contribute to complex diseases. However, which genes are turned on or off varies a lot in healthy people depending on which tissue type (e.g. heart, lung, brain) is being examined, and this makes it even harder to link a specific genetic variant to disease. The NIH Common Fund’s Genotype-Tissue Expression (GTEx) project has developed a reference data set for studying genetic variants and gene activity in multiple healthy tissues. This catalogue has stimulated research that will enrich our understanding of how differences in our DNA sequence contribute to health and disease, and make us different from everyone else. GTEx researchers Eric Gamazon, Nancy Cox, and Hae Kyung Im used the GTEx reference data set to design a statistical method called PrediXcan that estimates how much of gene activity (whether a gene is turned on or off) is due to differences in DNA sequence [1]. PrediXcan then links this estimate with observable traits as a way to identify genes associated with disease. The authors used this method to identify specific genes associated with five diseases: bipolar disorder, coronary artery disease, Crohn's disease, rheumatoid arthritis, and type 1 diabetes. In another study, researchers used GTEx data to mathematically measure gene activity changes associated with a given genetic variant. This is a significant approach for investigating how genetic variants affect cellular processes [2]. All GTEx data are publicly available at the GTEx Portal, which gives researchers everywhere access to the reference data set. Access to the data set will create new opportunities to study links between genetics and disease and to investigate possible advanced treatment options.
1. A Gene-Based Association Method for Mapping Traits using Reference Transcriptome Data. Gamazon ER, Wheeler HE, Shah KP, Mozaffari SV, Aquino-Michaels K, Carroll RJ, Eyler AE, Denny JC; GTEx Consortium, Nicolae DL, Cox NJ, Im HK. Nat Genet. 2015 Sep;47(9):1091-8.
2. Quantifying the Regulatory Effect Size of Cis-Acting Genetic Variation using Allelic Fold Change. Mohammadi P, Castel SE, Brown AA, Lappalainen T. Genome Res. 2017 Nov;27(11):1872-1884.
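For readers who want to see the two-step idea behind a method like PrediXcan in concrete form, the following is a minimal sketch on simulated data, not the published implementation. It assumes a small reference panel with both genotypes and expression, and a larger study cohort with genotypes and a trait only. Ridge regression is swapped in here as a simple stand-in for the prediction model, and all function names, sample sizes, and effect sizes are hypothetical.

```python
import numpy as np


def fit_expression_weights(genotypes, expression, ridge_penalty=1.0):
    """Step 1: learn per-variant weights that predict expression from genotype.

    Uses closed-form ridge regression on mean-centered data (a stand-in model).
    """
    g = genotypes - genotypes.mean(axis=0)
    e = expression - expression.mean()
    n_variants = g.shape[1]
    return np.linalg.solve(g.T @ g + ridge_penalty * np.eye(n_variants), g.T @ e)


def predict_expression(genotypes, weights):
    """Step 2a: estimate the genetically determined part of expression."""
    return (genotypes - genotypes.mean(axis=0)) @ weights


def association_slope(predicted_expression, trait):
    """Step 2b: simple linear regression of the trait on predicted expression."""
    x = predicted_expression - predicted_expression.mean()
    y = trait - trait.mean()
    return float(x @ y / (x @ x))


def simulate_demo(seed=0):
    """Run both steps on simulated data (all numbers are illustrative)."""
    rng = np.random.default_rng(seed)
    true_weights = np.array([0.8, 0.0, -0.5, 0.0, 0.3])

    # Reference panel: genotypes (0, 1, or 2 allele copies) plus measured expression.
    ref_geno = rng.integers(0, 3, size=(300, 5)).astype(float)
    ref_expr = ref_geno @ true_weights + rng.normal(0.0, 0.5, 300)
    weights = fit_expression_weights(ref_geno, ref_expr)

    # Study cohort: genotypes and a trait only; expression is never measured.
    cohort_geno = rng.integers(0, 3, size=(2000, 5)).astype(float)
    hidden_expr = cohort_geno @ true_weights
    trait = 0.6 * hidden_expr + rng.normal(0.0, 1.0, 2000)

    predicted = predict_expression(cohort_geno, weights)
    return weights, association_slope(predicted, trait)


if __name__ == "__main__":
    learned_weights, slope = simulate_demo()
    print("learned weights:", np.round(learned_weights, 2))
    print("trait association slope:", round(slope, 2))
```

The point of the design is that expression only has to be measured once, in the reference panel; the learned weights can then be applied to any cohort that has genotype data.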
GTEx data are aiding researchers in uncovering how genetic alterations contribute to schizophrenia
Over the last decade, researchers have discovered more than 100 genetic loci associated with schizophrenia diagnosis; however, determining how these variants alter biology and contribute to schizophrenia development has been more challenging. With the expansion of large genetics databases like GTEx and ENCODE, researchers are beginning to move beyond genome-wide association studies to link genetic variation to alterations in gene expression and to more accurately determine the molecular basis of complex diseases like schizophrenia. In a recent paper in Nature Neuroscience, researchers reported that around 20% of the 108 previously known schizophrenia loci contain genetic variants that potentially contribute to altered gene expression in the brain, and thus could contribute to schizophrenia development. Furthermore, in five of the loci investigated, the genetic variants were located in a single gene: FURIN, TSNARE1, CNTN4, CLCN3 and SNAP91. The researchers went on to demonstrate that three of those genes (FURIN, TSNARE1 and CNTN4) were involved in neurodevelopment in zebrafish, strengthening the evidence that they may also play an important role in human brain development and that their altered expression may contribute to schizophrenia development. These discoveries were made possible by data from the CommonMind Consortium, a public-private partnership with a large brain sample collection, together with GTEx’s extensive collection of post-mortem brain donors and public set of expression quantitative trait loci from brain.
GTEx dataset helps researchers determine how gene duplications potentially lead to genes with new biological functions
A major source of new genes, which can lead to new biological functions through evolution, is the duplication of ancestral genes. Many genes function normally only when present as a single copy because the dosage of their resultant protein is tightly controlled. When a gene is duplicated, the dosage is also duplicated, and in many instances this negatively affects survival and creates evolutionary pressure to restore gene dosage. In the majority of cases, gene dosage is restored because the duplicated gene accumulates debilitating mutations that render it nonfunctional, an outcome that is faster and more likely to occur. Occasionally, the duplicated gene will develop a new function that provides an evolutionary advantage and then spreads through the population, an outcome that is slower and less likely to occur. Since gene loss is favored over preservation, what mechanisms support the persistence of new gene duplicates long enough for new biological functions to evolve? Using the GTEx dataset, Lan and Pritchard published an article in Science providing evidence that new gene duplicates are preserved because gene expression from both copies is downregulated, restoring gene dosage. They suggest that since this reduces the evolutionary pressure for either duplicate to become nonfunctional, both genes can evolve independently over time.
Read a highlight of this exciting research on NIH Director Dr. Francis Collins’s blog!
GTEx dataset helps researchers uncover biological functions for the small amount of Neandertal DNA present in modern humans
After modern humans left Africa ~60,000 years ago, they encountered now-extinct Neandertals, and on at least a few occasions interbreeding occurred. As a result, genomes of modern Eurasians contain ~1.5 to 4% Neandertal DNA. However, the contribution of this DNA to modern human physiology and to disease susceptibility and progression is only beginning to be understood. In a recent Science publication, Simonti and coworkers identified two single nucleotide polymorphisms (SNPs) within the introgressed Neandertal DNA that were associated with disease. A SNP in the intron of P-selectin (SELP) was associated with a hypercoagulable state, while a second SNP upstream of stromal interaction molecule 1 (STIM1) was associated with incontinence, bladder pain, and urinary tract disorders. Using the GTEx dataset, they were able to show that both Neandertal SNPs are associated with changes in gene expression (increased for SELP, decreased for STIM1), suggesting that the effects of modern human-Neandertal interbreeding are still with us today.
GTEx hopes to play a role in uncovering how genetic alterations contribute to psychiatric disorders
Genome-wide association studies (GWAS) have identified over 100 genetic loci associated with schizophrenia diagnosis. However, GWAS have limited ability to reveal how these loci alter biological processes to confer risk for or protection from schizophrenia, or whether they are amenable to interventions. The GTEx program is optimistic that its sequence database, which contains data from over 900 post-mortem donors (over 420 of them whole brain donors), will help to untangle how these loci actually function in the progression of the disease. This week, a highly publicized Nature paper uncovered the biological basis for why the major histocompatibility complex locus, a GWAS-identified genetic locus spanning several megabases, is associated with schizophrenia diagnosis. With the aid of GTEx data from brain frontal cortex, the researchers showed that each common complement component 4A (C4A) allele associates with schizophrenia in proportion to its tendency to generate greater expression of C4A mRNA.
GTEx Perspective: Understanding how non-coding genomic polymorphisms affect gene expression
Read a Washington Post article on the Nature paper
Read an NIH Press Release
GTEx Reaches Midpoint Milestone
The GTEx Portal was just updated! This latest version of sequence data encompasses roughly half of the anticipated 960 post-mortem donors. This release includes genotype data from approximately 450 donors and over 9,600 RNA-seq samples across 51 tissue sites and 2 cell lines, with adequate power to detect expression quantitative trait loci (eQTLs) in 44 tissues. Full gene and isoform expression datasets are available for download through the GTEx Portal, while genotypes and RNA-seq BAM files are available via dbGaP.
GTEx Scientists Investigate Sex Differences
Sex and gender play a role in how health and disease differ across individuals, and considering these factors during research informs the development of preventive and therapeutic interventions for both sexes. Learn how supplements to GTEx grants are enabling researchers to investigate sex as a biological variable.
This page last reviewed on September 15, 2020