FY22 X01 Projects for the Gabriella Miller Kids First Program

Project Number: HD110884-01 Contact PI / Project Leader: Chakravarti, Aravinda
Title: The genomic architecture of Hirschsprung Disease Awardee Organization: University Of Texas Health Science Center
Abstract Text:

Abstract: DESCRIPTION (provided by applicant): Hirschsprung disease (HSCR) is a male-biased developmental disorder associated with a lack of innervation of the gastrointestinal tract. Genetic studies have been instrumental in understanding its multifactorial inheritance, high heritability, syndromic associations, and genetic heterogeneity with variable penetrance and expressivity. 24 known genes and 9 loci with pathogenic alleles (PAs) underlie HSCR pathogenesis and explain 62% of its population attributable risk (PAR). Despite this heterogeneity, there is functional unity in HSCR: ~53% of HSCR PAs disrupt RET and EDNRB signaling in the developing enteric nervous system (ENS), with 11 HSCR genes comprising a gene regulatory network controlling RET and EDNRB gene expression. We propose to identify the remaining 30% PAR by studying 857 unrelated HSCR cases and their 125 affected and 1,446 unaffected first-degree relatives by whole genome sequencing, to increase statistical power of gene discovery through improved detection of all types of coding and regulatory PAs. HSCR arises from cell-autonomous defects in enteric neural crest cell precursors (ENCCs) affecting their proliferation, differentiation and migration in the ENS, a finding from functional studies that will guide our detection of novel genes. PUBLIC HEALTH RELEVANCE: Pathogenic allele (PA) diversity in HSCR is extensive and includes diverse molecular types of de novo mutations (DNMs) and segregating variants explaining 63% of its population attributable risk (PAR). We propose to identify the remaining 30% PAR by studying 857 unrelated HSCR cases and their 125 affected and 1,446 unaffected first-degree relatives using whole genome sequencing (WGS), increasing statistical power of gene discovery through improved detection of SNVs, INDELs/CNVs, and DNMs, and of coding and regulatory PAs.

Project Number: HD110887-01 Contact PI / Project Leader: Chung, Wendy
Title: Genomic Analysis of Esophageal Atresia and Tracheoesophageal Fistulas and Associated Congenital Anomalies Awardee Organization: Columbia University Health Sciences
Abstract Text:

Abstract: DESCRIPTION (provided by applicant): Esophageal atresia/tracheoesophageal fistula (EA/TEF) is a rare and complex aerodigestive congenital anomaly with an estimated incidence of 1 in 2500 to 1 in 4000 live births. There is a 45% incidence of associated congenital malformations, most commonly digestive, cardiovascular, urogenital, and musculoskeletal, often part of a syndrome or complex association, with VACTERL (vertebral defects, anal atresia, cardiac defects, tracheoesophageal fistula, renal anomalies, and limb abnormalities) being most frequently recognized. Advanced surgical techniques and pre- and post-operative care have improved the prognosis and survival of EA/TEF patients over the past decades. However, with improved survival, many of the long-term morbidities of EA/TEF have been exposed. It is likely that the outcome in EA/TEF patients is influenced by multiple genetic and clinical factors; however, determining which factors are critical has been limited by the lack of data, particularly genomic data. Many families and health care providers seek prognostic clinical information about other associated birth defects or genetic syndromes, but prognostic data are extremely limited unless a chromosomal anomaly is identified. Evidence is accumulating that many congenital anomalies can result from copy number variants, de novo mutations, and inherited rare mutations, often unique to the family. We propose to elucidate the underlying genomic architecture of EA/TEF and define new genes and conditions associated with EA/TEF by performing whole genome sequencing on 100 parent-child trios in a clinically well-characterized cohort to identify rare de novo mutations and inherited variants. We believe this information will improve genetic diagnostic methods and provide more accurate clinical prognostic information to guide clinical decisions and improve outcomes.
PUBLIC HEALTH RELEVANCE: Esophageal atresia/tracheoesophageal fistula (EA/TEF) is a rare and complex aerodigestive congenital anomaly with an estimated incidence of 1 in 2500 to 1 in 4000 live births. We propose to elucidate the underlying genomic architecture of EA/TEF by performing whole genome sequencing to characterize new clinical syndromes associated with EA/TEF to provide more accurate clinical prognostic information. 

Project Number: HD110902-01 Contact PI / Project Leader: Espinosa, Joaquin M.
Title: Epigenome analysis in the Human Trisome Project Awardee Organization: University of Colorado Denver
Abstract Text:

Abstract: DESCRIPTION (provided by applicant): Despite decades of research, the mechanisms by which trisomy 21 (T21) causes the myriad developmental and clinical hallmarks of Down syndrome (DS) are poorly understood, creating an obvious challenge in the clinical management of DS. T21 causes a different disease spectrum among those with DS, protecting these individuals from developing certain conditions, such as most solid malignancies, while strongly predisposing them to others, such as congenital heart disease, leukemias, and autoimmune disorders. Therefore, elucidating the mechanisms by which T21 causes this novel disease spectrum will greatly serve both people with DS and the general population affected by the numerous conditions modulated by T21. To accelerate research in this area, our team launched a pan-omics cohort study of people with DS known as the Crnic Institute Human Trisome Project (HTP). Supported by previous X01 awards, the HTP has become one of the deepest studies of people with DS to date, having completed matched analysis of the genome, transcriptome, proteome, metabolome, immune maps, and microbiome, along with deep clinical data annotation for hundreds of research participants. These efforts produced several discoveries about the impact of immune dysregulation in DS, leading to a novel clinical trial funded by the INCLUDE Project to test the safety and efficacy of a JAK inhibitor to improve health outcomes in DS. Now, we propose to complete a comprehensive analysis of epigenetic variation in DS with a focus on the immune system through the following Specific Aims:
Aim 1. To complete a cross-sectional analysis of epigenetic variation in Down syndrome. We propose to complete DNA methylation analysis via bisulfite sequencing for 400+ participants with T21 versus 200+ age- and sex-matched euploid controls, most of whom have matched transcriptome and immune mapping data.
Aim 2. To complete a longitudinal analysis of cell type-specific epigenetic variation in Down syndrome. We propose to complete an analysis of the epigenome of monocytes, a key immune cell type with major roles in inflammation in DS, via matched analysis of DNA methylation, chromatin accessibility (ATAC-seq), and transcriptome in a three-year longitudinal sample set.
Aim 3. To complete a comprehensive analysis of the T and B cell receptor repertoires in Down syndrome. A key source of epigenetic variation highly relevant to the study of DS resides in the repertoire of rearranged genomic sequences encoding the T and B cell receptors (TCRs, BCRs). We propose to complete targeted long read sequencing to elucidate the TCR and BCR repertoires in 400+ participants with T21 versus 200+ controls.
Altogether, this proposal is likely to generate the most comprehensive analysis of epigenetic variation in individuals with T21 to date, with a strong focus on the immune system, a key player in the etiology of many co-occurring conditions of DS.

Project Number: HD110998-01 Contact PI / Project Leader: Gleeson, Joseph G
Title: Whole Genome Sequencing in Structural Defects of the Neural Tube Awardee Organization: University of California, San Diego
Abstract Text:

Abstract: DESCRIPTION (provided by applicant): Myelomeningocele (aka meningomyelocele, MM) is the most severe form of spina bifida, a neural tube defect (NTD) in humans and the most common CNS birth defect. This defect occurs in 3.72 per 10,000 live US births, and is partly preventable with prenatal folate, but the genetic basis and the mechanisms by which folate works to reduce disease incidence remain obscure. MM is associated nearly uniformly with prenatal hydrocephalus and the Arnold-Chiari malformation, as well as paraplegia and lifelong disability. The genes for several syndromic forms of NTDs are known, but the causes for the majority with sporadic clinical presentation remain unknown. Despite the importance of MM, previous research has been limited to targeted sequencing and association studies of folate metabolism genes, or very small-scale exome sequencing. We hypothesize that likely gene-disrupting (LGD) de novo mutations (DNMs) contribute to MM risk. Using conservative estimates of 50 to 100 recurrently mutated discoverable genes, and given our preliminary data demonstrating an excess of LGD DNMs in MM compared with control individuals, we estimate that with a cohort size of 1000 trios, we should uncover between 5 and 20 new recurrently mutated genes underlying MM, with minimal false discovery. With this in mind, we formed the Spina Bifida Sequencing Consortium, and established a platform for data and sample sharing. We recently completed submission of 333 trios for WGS at GMKF and are awaiting return of data. We have also more recently embarked on a new recruitment effort of an additional cohort of 400 new simplex MM trios, in collaboration with the US Spina Bifida Association, consented trios to allow for data sharing, and have performed detailed quality control on samples. To achieve recruitment success at this scale, we have had to emphasize saliva rather than blood sampling.
This cohort is now halfway assembled and ready for sequencing, and the remaining cohort will be ascertained in the next 6 months. Here we propose to perform WGS on the new 400 trios from saliva-derived DNA for this X01 effort to continue this discovery. We have established a workflow for de novo SNP/INDEL/SV detection from WGS and have ample computer storage and nodes to see the project to completion. We also plan to continue recruitment into the future with the goal of 2000 trios in the next 2 years. We propose a detailed bioinformatics workflow to identify gene mutations within a statistical framework, taking into account detailed RNA expression profiling from developing mouse neural tube, and have developed a robust functional validation workflow using Xenopus and mouse gene targeting. Our project has the potential to uncover a host of causes for this most common of the CNS birth defects, paving the way for future breakthroughs in detection, treatment and prevention. PUBLIC HEALTH RELEVANCE: This work will identify new genetic disease genes predisposing to myelomeningocele, the most common pediatric structural brain disease.

Project Number: HD110886-01 Contact PI / Project Leader: Helbig, Ingo
Title: The Genetic Basis of Structural Pediatric Epilepsies Awardee Organization: Children's Hospital Of Philadelphia
Abstract Text:

Abstract: DESCRIPTION (provided by applicant): Childhood epilepsies are the most common neurological causes for hospital admissions in children. Gene identification has soared in the epilepsies over the last decade, and these discoveries have already led to novel therapies. In contrast, individuals with structural developmental brain anomalies who often benefit from epilepsy surgery are typically excluded from genetic testing. However, recent findings clearly suggest a strong genetic component. Accordingly, there is an essential need for a detailed understanding of the contributing genomic factors, which will be critical to improve patient care. We have recruited >700 patient-parent trios, 1,200 singletons, and 200 tissue samples derived from resective brain surgery. Our goal is to improve patient care by characterizing the genomic and transcriptional landscape of pediatric epilepsies. We hypothesize that known or presumed structural genetic epilepsies have a high frequency of disease-causing variants that can be identified through whole-genome sequencing and RNA sequencing and the analysis of longitudinal outcomes data through an established Electronic Medical Record (EMR) pipeline within the Kids First framework. Our study has two aims. First, we will analyze germline, somatic, and transcriptional contributions to structural pediatric epilepsies. We propose comprehensive profiling of 3,500 samples through Whole Genome Sequencing and 200 brain tissue samples derived from resective epilepsy surgery through parallel Whole Genome Sequencing and RNA Sequencing. We expect that this analysis will identify recurrent germline and somatic disease-causing variants, using the gene discovery expertise of our team to interpret identified genetic variants. Second, we aim to characterize the EMR-based longitudinal disease history of structural pediatric epilepsies. Our team has built frameworks and concepts for the use of longitudinal, de-identified EMR data.
We will extract, de-identify, and map EMR data using the Human Phenotype Ontology (HPO) that we have been involved with for the last decade. We expect that implementing clinical data harmonization and analysis of complex phenotypic information will allow us to characterize the natural history and treatment response in known or presumed structural pediatric epilepsies which have not received attention in the past. In summary, our proposed project will have the possibility of systematically providing evidence for causative factors and impact on patient outcomes in known or presumed structural pediatric epilepsies, taking advantage of one of the largest pediatric epilepsy biobanks in conjunction with our expertise in cloud-based bioinformatic analysis and HPO-based data harmonization. The suggested datasets deposited within the Kids First Data Resource will help put structural epilepsies on similar footing to pediatric brain tumors and structural birth defects, enabling us to understand underlying disease mechanisms and allowing us to provide more targeted and improved treatments. PUBLIC HEALTH RELEVANCE: The proposed research is relevant to public health as more precise measurement of clinical and outcome data in structural childhood epilepsies is ultimately expected to translate into improved diagnostics and treatment of one of the most common structural disorders of the brain, allowing for greater personalized treatment choices. This project will address the significant unmet needs of children with structural epilepsies through the generation and integrative analysis of genomic data, including trio-based cohorts of longitudinally followed patients with deep clinical and phenotypic characterization.

Project Number: DE032472-01 Contact PI / Project Leader: Marazita, Mary
Title: Kids First: Genomics of Isolated Cleft Lip Awardee Organization: University of Pittsburgh
Abstract Text:

Abstract: DESCRIPTION (provided by applicant): Isolated Cleft Lip (CL) is one of three subtypes of nonsyndromic orofacial clefts (OFCs), a heterogeneous group of anomalies that also includes cleft lip with cleft palate (CLP) and cleft palate alone (CP). Cumulatively, OFCs occur in about 1/700 live births worldwide, and thus comprise a significant proportion of human structural birth defects. Isolated CL (i.e., CL without CP) occurs in about 1 in every 2,800 babies in the U.S. (1). Individuals with CL face feeding, speech, and dental problems and undergo multiple corrective surgeries and ongoing therapy that come at a substantial personal and financial burden. Further, patients can experience lifelong psychosocial effects, increased mortality rates, and a higher risk of various cancer types. Because CL and CLP share a defect of the upper lip, these two OFC subtypes have historically been combined in genetic studies. However, most studies are dominated by CLP, the most common of all OFCs, and any contribution from the smaller CL sample alone is often undetectable. In recent years, increasing sample sizes have afforded the ability to analyze CL and CLP as separate entities. There is now evidence that the subtypes of OFC have distinct differences in their genetic risk patterns. Isolated CL has not yet been investigated in detail as a separate sub-phenotype, with only a few GWAS reports and no reported whole genome sequencing (WGS) studies to date. Therefore, critical gaps in our understanding of CL persist. The major goal of this proposal is to investigate the genome in isolated CL trios to begin to fill this knowledge gap. To do so, we request WGS for 762 CL trios from our large collaborative resources. There is already WGS for a total of 2,078 OFC proband trios from multiple ethnicities (many from our research collaborations and funded mostly through the Gabriella Miller Kids First Consortium, GMKF).
Of those trios, 272 have isolated CL; combined with the new trios we will have a total of 1,034 CL trios, a powerful sample size for CL risk variant discovery and equivalent to the discovery resources for CLP and CP. This larger resource of CL trios will fill a critical gap in data and resources necessary to deeply understand the genetic architectures of each OFC subtype. To accomplish the overall goal we will (i) identify risk variants for isolated CL by WGS of CL trios; (ii) compare and contrast the genetic architectures of CL to CLP and CP; and (iii) replicate variants/genes identified through (i) and (ii). PUBLIC HEALTH RELEVANCE: The goal of this project is to better understand the genetic architecture of isolated cleft lip (CL) birth defects by performing whole genome sequencing in multi-ethnic CL families.

Project Number: HD110862-01 Contact PI / Project Leader: Rios, Jonathan
Title: GMKF Project on Congenital Clubfoot Awardee Organization: University of Texas Southwestern Medical Center
Abstract Text:

Abstract: DESCRIPTION (provided by applicant): Talipes equinovarus (clubfoot) is a common congenital structural birth defect. Clubfoot occurs in ~1 in 1,000 live births in the United States, though the incidence varies in other worldwide populations. Clubfoot affects the structural components of the foot (tarsals), muscular and connective tissues including the Achilles and tibial tendons, and ligaments of the ankle. Males are affected twice as frequently as females, and at least half of patients present with bilateral disease. Surgical correction in infants requires tendoachilles lengthening, where the Achilles tendon is severed and the foot is casted to allow healing with the foot in a proper position. Although short-term results of clubfoot treatment are positive, long-term outcomes following surgical clubfoot correction are poor; complications include arthritis, reduced range of motion, weakness, pain and persistent deformity. Importantly, the number of surgical procedures required to produce a corrected clubfoot was significantly associated with poorer long-term outcomes. The pathogenesis of clubfoot remains largely unknown, though ~12% of patients report a family history. Targeted gene sequencing studies have failed to identify genetic loci associated with clubfoot. We recently reported FSTL5 as the first GWAS locus associated with clubfoot, and we showed this gene was associated with sexually dimorphic phenotypes in mice. Together with the Gabriella Miller Kids First Initiative (GMKF), we are poised to continue advancing the field’s understanding of the genetic causes of clubfoot. The GMKF provides comprehensive genome sequencing (GS) in individuals and families with congenital birth defects. However, no study of congenital limb malformations has yet been conducted as part of the GMKF; thus, our study will be the first GMKF study to investigate genetic causes of congenital limb defects.
For the past several decades, we have collected DNA samples from families with multiple relatives affected with clubfoot. Similar to our previously awarded GMKF studies (1X01HL132375 led by Dr. Rios and DE031445 co-led by Dr. Hecht), we will utilize the power of family-based inheritance mapping to discover genetic causes of clubfoot. Using the comprehensive GS provided by the GMKF, we will discover sequence variants and copy number variants co-segregating with clubfoot in these families. We present a systematic approach to discover these clubfoot-causing variants, including an integrated approach to investigate variants impacting potential non-coding regulatory elements, which have previously been implicated in other limb malformations. At the completion of this study, we will provide the field with the most comprehensive genomic analysis of families with clubfoot, which will provide novel hypotheses regarding the genetic etiology of this complex congenital birth defect. PUBLIC HEALTH RELEVANCE: Clubfoot is a common birth defect affecting ~1 per 1,000 births, where infants and children are treated with various surgical and nonsurgical methods to correct the foot deformity. Our goal is to better understand the genetic factors contributing to clubfoot through family-based and population-based genetic studies, and our research team recently reported the first GWAS-associated clubfoot locus. Here, we propose to use genome sequencing and robust analytical approaches to systematically evaluate the causes of clubfoot in large multi-generational families.

This page last reviewed on October 12, 2022