
2025 X01 Projects Abstracts

Contact PI/Project Leader: Azeez Butali
Project Number: HL140516
Awardee Organization: University of Iowa
Title: Whole Genome Sequencing of African and Asian Orofacial Clefts Case-Parent Triads
Anticipated Number of Samples: 492

Abstract:

DESCRIPTION (provided by applicant): 

Dr Azeez Butali is a tenure-track Assistant Professor at the Iowa Institute for Oral Health Research, College of Dentistry, University of Iowa. His primary research focus is the genetics and epidemiology of complex traits, including orofacial clefts. Dr Terri Beaty is a Professor at the Johns Hopkins University. Her research focus is genetic epidemiology studies of several chronic diseases with complex etiologies, where both genetic and environmental risk factors control risk of disease.

Co-investigators: Dr Adebowale Adeyemo is Deputy Director at the National Human Genome Research Institute. His focus is on the genetics and genomics of complex traits in African populations. Dr Marazita is a Professor at the University of Pittsburgh. She is an expert in the application of statistical genetics to complex traits and in the identification of sub-clinical cleft phenotypes. Dr Cao is an Assistant Professor at the University of Iowa. He uses bioinformatics tools to interrogate the human genome and to analyze gene-regulatory networks. Dr Ruczinski is a Professor at the Johns Hopkins University. His expertise is in statistical genetics, genomics and proteomics of complex traits. Dr Taub is an Assistant Scientist at the Johns Hopkins University. Her area of expertise is genomics and statistical genetics for gene expression data, genotyping data and DNA methylation data.

Environment: The University of Iowa is a leading institution with a strong reputation for excellence in teaching, research and healthcare. The Johns Hopkins University is one of the leaders in research, teaching and healthcare in the US. Both institutions are consistently among the centers supported by NIH grants.

Research Study: The focus of this study is to identify novel risk variants for orofacial clefts (OFC) in African and Asian OFC case-parent triads through analysis of whole genome sequencing (WGS) data.

PUBLIC HEALTH RELEVANCE:

TITLE: Whole Genome Sequencing of African and Asian Orofacial Clefts Case-Parent Triads. The long-term goal of this study is to identify specific genomic variants through WGS of OFC case-parent triads from African and Asian populations. The knowledge gained from these WGS studies will drive future research on OFC and should eventually lead to more effective interventions to reduce the risk of OFC.

 

Contact PI/Project Leader: John R Shaffer
Project Number: HD114124
Awardee Organization: University Of Pittsburgh at Pittsburgh
Title: Epigenomics of Orofacial Clefts
Anticipated Number of Samples: 2543

Abstract:

DESCRIPTION (provided by applicant): 

Orofacial cleft (OFC) birth defects are one of the most common structural birth defects in humans, and the most common craniofacial anomalies, with worldwide incidence of approximately 1 per 700 newborns. OFCs represent a major public health problem due to the associated morbidity, mortality, and significant medical care expenditures. Based on structures affected, OFCs have been categorized as three subtypes: clefts affecting the lip only (cleft lip, CL), clefts affecting both the lip and the palate (cleft lip and palate, CLP), and clefts affecting the palate only (cleft palate, CP). Historically, CL and CLP have been considered variations of the same malformation that differ in severity, whereas the developmental origins of the affected structures, epidemiology, and familial patterns suggest that CP has a separate etiology from CL and CLP. Both genetic and environmental factors play important roles in the development of OFCs, although understanding of these risk factors is incomplete. The proposed project aims to expand the Gabriella Miller Kids First (GMKF) resource by collecting data to investigate the role of DNA methylation – an epigenomic marker of gene activity – in the development of clefts. We propose to collect genome-wide DNA methylation assays in a large cohort of affected children as well as DNA methylation and transcriptomics assays in a subset of children with available discarded surgical tissue. Ultimately, these data will contribute new and complementary types of omics data to the GMKF resource for participants with already-available whole-genome sequencing data. This resource will allow us and others to perform analyses to identify the differentially methylated regions of the genome associated with OFCs and subtypes, and explore the functional roles of previously identified OFC-associated genetic loci. 
Successful completion of this project will expand and deepen our understanding of the genetic architecture and regulatory landscape of OFCs including identifying new risk loci and determining the mechanisms through which known risk loci influence the development of OFCs.

PUBLIC HEALTH RELEVANCE:

This project will expand the Gabriella Miller Kids First resource and deepen our understanding of the genetic architecture and regulatory landscape of non-syndromic orofacial clefts including identifying new risk loci and determining the mechanisms through which known risk loci influence the development of OFCs. This knowledge may ultimately be useful for applications such as recurrence prediction or personalized therapeutic interventions.

 

Contact PI/Project Leader: Bruce D Gelb
Project Number: HL161587
Awardee Organization: Icahn School Of Medicine At Mount Sinai
Title: Expanding our understanding of the role of noncoding variation causing congenital heart defects
Anticipated Number of Samples: 1201

Abstract:

DESCRIPTION (provided by applicant): 

The epidemiology of congenital heart defects (CHD) indicates that genetic variation is the overwhelmingly predominant cause of these commonest birth defects, but more than 50% of CHD cases remain unexplained even after trio exome sequencing (ES). The remaining large gap in genetic causality for CHD is what the Pediatric Cardiac Genomics Consortium (PCGC), a component of NHLBI’s Bench-to-Bassinet Program, seeks to address through the Gabriella Miller Kids First (GMKF) Pediatric Research Program. Under the auspices of prior GMKF awards for trio genome sequencing (GS), the PCGC began to elucidate the role of de novo noncoding damaging single nucleotide variants and small insertions and deletions (indels) in CHD causality. The cumulative mean attributed risk from noncoding de novo variants (DNVs) for exome-negative CHD was 17-45%. To further the understanding of the role of genetic variation in causing CHD, the PCGC is requesting GS for 500 probands born with tetralogy of Fallot (ToF) or hypoplastic left heart syndrome (HLHS), unsolved after exome sequencing, and their unaffected parents. In addition to increasing cohort size to improve statistical power, we will use a new and larger control trio GS dataset available through TOPMed (n = 1,758) and generate an improved version of our neural network, HeartENN, which predicts the functional impact of genetic variation, by incorporating more than double the cardiac noncoding regulatory feature data. Of note, the PCGC has the wherewithal to confirm relevant variants with other genomic methods as well as to perform functional cell-based assays to further support claims of pathogenicity. We will also use the GS data to expand our understanding of structural variation underlying CHD. We will use a best-in-class structural variant (SV) calling pipeline developed by the Talkowski group at the Broad Institute.
The analytic focus for SVs will include disruptions of known autosomal dominant CHD genes, second hits in trans to damaging coding variants in known autosomal recessive CHD genes, and burden analysis for apparently damaging SVs combined with existing data about putatively damaging coding variants (SNVs and indels) from >5,000 CHD trios. Finally, in an exploratory portion of this aim, we will attempt calling of SVs such as repeat expansions that are difficult to call with short-read GS, focusing on a limited number of regions of potential interest based on our analysis of PacBio long-read GS from 200 CHD probands, currently being generated under the auspices of the GMKF pilot program.

 

Contact PI/Project Leader: Wendy Chung
Project Number: HD110887
Awardee Organization: Boston Children's Hospital
Title: Genomic Analysis of Esophageal Atresia and Tracheoesophageal Fistulas and Associated Congenital Anoma
Anticipated Number of Samples: 360

Abstract:

DESCRIPTION (provided by applicant): 

Esophageal atresia/tracheoesophageal fistula (EA/TEF) is a rare and complex aerodigestive congenital anomaly with an estimated incidence of 1 in 2500 to 1 in 4000 live births. There is a 45% incidence of associated congenital malformations, most commonly digestive, cardiovascular, urogenital, and musculoskeletal, often part of a syndrome or complex association, with VACTERL (vertebral defects, anal atresia, cardiac defects, tracheoesophageal fistula, renal anomalies, and limb abnormalities) being most frequently recognized. Advanced surgical techniques and pre- and post-operative care have improved the prognosis and survival of EA/TEF patients over the past decades. However, with improved survival, many of the long-term morbidities of EA/TEF have been exposed. It is likely that the outcome in EA/TEF patients is influenced by multiple genetic and clinical factors; however, determining which factors are critical has been limited by the lack of data, particularly genomic data. Many families and health care providers seek prognostic clinical information about other associated birth defects or genetic syndromes, but prognostic data are extremely limited unless a chromosomal anomaly is identified. Evidence is accumulating that many congenital anomalies can result from copy number variants, de novo mutations, and inherited rare mutations, often unique to the family. We propose to elucidate the underlying genomic architecture of EA/TEF and define new genes and conditions associated with EA/TEF by performing whole genome sequencing on 100 parent-child trios in a clinically well-characterized cohort to identify rare de novo mutations and inherited variants. We believe this information will improve genetic diagnostic methods and provide more accurate clinical prognostic information to guide clinical decisions and improve outcomes.

PUBLIC HEALTH RELEVANCE:

Esophageal atresia/tracheoesophageal fistula (EA/TEF) is a rare and complex aerodigestive congenital anomaly with an estimated incidence of 1 in 2500 to 1 in 4000 live births. We propose to elucidate the underlying genomic architecture of EA/TEF by performing whole genome sequencing to characterize new clinical syndromes associated with EA/TEF to provide more accurate clinical prognostic information.
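
The trio design described in this abstract rests on a simple comparison: a variant is a candidate de novo mutation when the child carries an allele that neither parent carries. The following is a minimal sketch of that comparison only, not the applicants' pipeline. The function name, the site labels, and the genotype values are hypothetical, and genotypes are assumed to be coded as the count of alternate alleles (0, 1, or 2). Real analyses also filter on read depth, genotype quality, and population frequency.

```python
# Illustrative sketch: all names and genotype values below are hypothetical.
# Genotypes are coded as the number of alternate alleles at one site (0, 1, or 2).

def is_candidate_de_novo(child: int, mother: int, father: int) -> bool:
    """Return True when the child carries an alternate allele and neither parent does."""
    return child > 0 and mother == 0 and father == 0

# (child, mother, father) genotypes at three made-up sites
trio_genotypes = {
    "site_A": (1, 0, 0),  # child heterozygous, both parents reference
    "site_B": (1, 1, 0),  # allele is present in the mother, so it is inherited
    "site_C": (0, 0, 0),  # child carries no alternate allele
}

candidates = [
    site
    for site, (child, mother, father) in trio_genotypes.items()
    if is_candidate_de_novo(child, mother, father)
]
print(candidates)
```

Only the first site passes the filter, because it is the only one where the alternate allele appears in the child and in neither parent.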

 

Contact PI/Project Leader: Ariadne Letra
Project Number: DE031445
Awardee Organization: University of Pittsburgh
Title: Whole Genome Sequencing Studies Of Multiplex Nonsyndromic Cleft Lip/Palate Families
Anticipated Number of Samples: 923

Abstract:

DESCRIPTION (provided by applicant): 

In this proposal, we request whole genome sequencing (WGS) services for 923 individuals from our cohort of well-characterized, large multigenerational nonsyndromic cleft lip/palate (NSCLP) families of Hispanic and non-Hispanic white ethnicities. NSCLP is a common birth defect accounting for 65% of all birth defects and annually affecting approximately 135,000 newborns worldwide. Despite improvement in treatments, NSCLP imposes significant medical, psychosocial and financial burdens that affect the quality of life of affected individuals and their families. NSCLP is complex, caused by genetic and environmental factors and their interactions. Recent advances in genomic approaches have improved our knowledge of the genetic factors involved in NSCLP; however, most of the variants associated with NSCLP account for ~25% of the genetic liability and reflect common, modest-risk variants often located in noncoding regions of the genome. More recently, it has been suggested that some genetic risk for NSCLP lies in rare variants and this has contributed to the lack of consistent findings and difficulty in unraveling risk alleles. Further, it is likely that the missing heritability of NSCLP results in part from interactions between common, modest-risk variants and rare, high-risk variants. We will analyze WGS data and apply polygenic risk score analysis to identify novel, high-penetrance NSCLP variants and to systematically evaluate the contributions of both common and rare variants to NSCLP, and how they segregate individually and in concert within and between families. The results of this study will provide novel and important insights about the genetic architecture contributing to the complex etiology of NSCLP. 
Importantly, this proposal will translate into a rich and publicly available resource of NSCLP genotypic and phenotypic data that will be made available to the broader scientific community to foster additional studies on NSCLP, as well as other birth defects and/or associated co-morbidities. Further, this proposal will provide genotype and allele frequency data in Hispanics, for whom limited data are available in genetic databases. Additional follow-up studies proposed, although beyond the scope of this X01 application, include validating the variants identified in this study in our additional NSCLP trios as well as through joint analysis with data from additional existing Kids First datasets. Successful completion of this study will provide novel and important insights about the genetic architecture contributing to the complex etiology of NSCLP, and will translate into a large, rich resource for genetic and phenotypic information on NSCLP.
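
The polygenic risk score analysis mentioned in this abstract is, in its basic additive form, a weighted sum of risk-allele counts across variants. The sketch below illustrates that general formula only; it is not the applicants' method. The function name, the effect weights, and the allele dosages are all hypothetical values chosen for illustration.

```python
# Illustrative sketch of an additive polygenic risk score.
# The weights and dosages below are made-up numbers, not study data.

def polygenic_risk_score(dosages: list[float], weights: list[float]) -> float:
    """Sum each variant's risk-allele dosage multiplied by its effect weight."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))

# Hypothetical per-variant effect weights (for example, log odds ratios from a GWAS)
weights = [0.5, 0.25, -0.125]
# Hypothetical risk-allele counts for one individual (0, 1, or 2 per variant)
dosages = [2, 1, 0]

score = polygenic_risk_score(dosages, weights)
print(score)
```

A score computed this way captures only the common, modest-effect variants that have weights, which is why the abstract pairs it with a separate search for rare, high-risk variants.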

Contact PI/Project Leader: Andrew L Hong
Project Number: HD114129
Awardee Organization: Emory University
Title: Basis of Childhood Kidney Cancers and Birth Defects
Anticipated Number of Samples: 932

Abstract:

DESCRIPTION (provided by applicant): 

Wilms Tumor is the most common renal tumor of childhood. Although cure rates approach 90% after initial therapy that includes a combination of surgery, chemotherapy and radiation therapy, our understanding of the biology of how children develop this cancer remains limited due to small patient cohorts. Prior studies have uncovered a number of important genetic alterations associated with Wilms Tumor. However, these studies are based on small cohorts. Here, we propose to advance our prior studies with a multi-decade effort to obtain high-quality samples from over 200 pediatric institutions through Children’s Oncology Group Renal Tumor studies. With samples from approximately 2,946 patients, we propose to assess the whole genome, methylome and transcriptome of each patient’s germline, normal adjacent kidney and tumor kidney. Given the large sample size, we will be powered for the detection of rare variant alleles and validation of prior studies. Just as importantly, our patient cohort represents the diversity of the United States. The multi-PI team, along with senior leadership of the COG Renal Tumor studies, has deep expertise in the analyses of epidemiology, genetics, epigenetics and transcriptomics in childhood cancers, along with decades of experience in the care of children with renal tumors. This proposed study provides a timely opportunity to aid our understanding of cancer risk in children with genitourinary congenital anomalies and, more broadly, our understanding of Wilms Tumor in a diverse population. These data will provide a critical resource for cancer germline risk, congenital anomalies, developmental biology and cancer biology.

PUBLIC HEALTH RELEVANCE:

Wilms Tumor is the most common kidney cancer in children. Although some predisposition syndromes have been associated with Wilms Tumor (e.g., Beckwith Wiedemann Syndrome, Denys Drash Syndrome, Hemihypertrophy, WAGR Syndrome), recent studies suggest many more children with Wilms Tumor may have an underlying predisposition syndrome. This study will explore how these germline changes relate to the developing kidney or structural birth defects, in addition to the development of kidney cancer, which may lead to prevention strategies or enhance risk stratification and therapeutic target identification.

Contact PI/Project Leader: Erin Peckham-Gregory and Carl Allen
Project Number: CA267639
Awardee Organization: Texas Children's Hospital
Title: Genomic Analysis Of Histiocytosis
Anticipated Number of Samples: 1008

Abstract:

DESCRIPTION (provided by applicant): 

Langerhans cell histiocytosis (LCH) is an inflammatory myeloid neoplasm characterized by lesions including pathogenic CD207+ dendritic cells among an inflammatory infiltrate. The median age at diagnosis is 30 months, and up-front chemotherapy fails in ~50% of patients, resulting in multiple relapse events for 40-50% of cases and long-term sequelae. Sequencing studies have found recurrent, mutually exclusive somatic activating mutations in MAPK pathway genes in ~85% of LCH lesions, including BRAF V600E in 50-65%. Under the “Misguided Myelomonocytic Precursor Model,” specific somatic MAPK mutations at critical stages of myeloid differentiation determine the extent of disease. However, this model fails to explain the significant differences in LCH risk across ethnicities. Despite advances to elucidate the somatic mutational landscape underlying LCH pathogenesis, germline risk factors remain largely unknown. Therefore, we conducted the first genome-wide association study of LCH and identified a SMAD6 variant associated with increased risk. SMAD6 inhibits bone morphogenetic protein and transforming growth factor-beta/activin signaling, which are determinants of Langerhans cell differentiation. This variant appears to suppress SMAD6 protein expression without a decrease in SMAD6 messenger RNA expression in patients carrying the risk allele. This risk allele is also more common in Hispanics, who are at the highest risk of LCH, and absent in Black individuals, who experience the lowest LCH incidence. Our preliminary data also support the emerging observation that LCH somatic activating mutations vary by race/ethnicity. Specifically, sequencing of tumors from Black patients indicated that only 25% were BRAF V600E+ (compared to >60% in other populations), whereas 50% had mutations in MAP2K1 (compared to <10% in other populations). 
Therefore, the objective of this Kids First X01 application is to more fully elucidate LCH etiology by defining the role of de novo mutations (DNMs) in established LCH genes and novel susceptibility genes, and by comprehensively assessing somatic variation in LCH. Our central hypothesis is that penetrant DNMs and novel germline and somatic variation may contribute to, or in some cases drive, LCH tumorigenesis. 1) We will leverage our ongoing Children's Oncology Group study, Genetic Epidemiology of Childhood Histiocytosis (GECHO), to sequence 300 LCH case-parent trios to evaluate the impact of recurrent DNMs on inherited susceptibility to LCH. 2) We will leverage paired germline-tumor samples from 200 patients enrolled in the Texas Children's Histiocytosis Program protocol or GECHO study to comprehensively assess the somatic landscape of LCH and identify germline variation contributing to somatic mutational profiles. Successful completion of the proposed aims may (1) improve genetic testing and counseling strategies in LCH patients and families; (2) advance surveillance and chemoprevention protocols; and (3) identify novel therapeutic targets.

Contact PI/Project Leader: David Teachey
Project Number: HD114203
Awardee Organization: Children's Hospital of Philadelphia
Title: Somatic and Germline Variants in Childhood T-cell acute lymphoblastic leukemia
Anticipated Number of Samples: 730

Abstract:

DESCRIPTION (provided by applicant): 

The outcome for children with relapsed T-cell acute lymphoblastic leukemia (T-ALL) is dismal. Thus, the primary goal in treatment is to prevent relapse, which requires accurate risk stratification. Prior attempts to identify genetic aberrations that are prognostic independent of treatment response have failed. We recently performed comprehensive genomic profiling (whole genome sequencing (WGS), whole exome sequencing (WES), and whole transcriptome profiling (WTS) of tumor; WGS of germline) from >1300 patients with T-ALL treated on the AALL0434 clinical trial through a Gabriella Miller Kids First X01 award (X01HD100702) and made several novel, practice-changing observations. We found T-ALL can be classified into 15 distinct groups, many of which are novel. We found that leukemic drivers were in non-coding regions in 60% of cases, highlighting the importance of WGS. We identified multiple subtypes that were predictive of favorable and unfavorable outcome. The successor trial to AALL0434 was AALL1231. On AALL1231, several changes were made to the backbone to eliminate cranial radiation in most patients, and these changes had prognostic implications. Before we can prospectively incorporate genetic aberrations into risk stratification, we need to validate our results in an independent cohort treated with current therapy and identify genomic variants that are reproducibly prognostic irrespective of therapeutic backbone. In addition, we found the prognostic impact of some variants differed based on genetic ancestry; some genetic variants that were associated with higher cure rates in children of European ancestry were not associated with higher cure rates in children of African ancestry. We need to increase the number of patients studied from different racial and ethnic groups to ensure equity in future risk stratification. Finally, we were unable to identify the genomic driver in a small percentage of cases (5%), and long-read sequencing may be able to overcome this gap. 
We hypothesize that comprehensive genomic profiling will identify recurrent genetic alterations that can be used prospectively to risk classify patients with T-ALL. Genomic profiling of a large cohort of patients treated on the AALL1231 trial will serve as a natural extension of our initial X01 award, providing the power to assess the impact of genomic variants on outcome across genetic ancestral groups. We will test our hypothesis with the following specific aims: (1) validate prognostic variants in an independent cohort of patients with T-ALL; (2) identify novel genomic structural variants using long-read sequencing; and (3) determine the association between genetic ancestry, tumor biology and outcomes. The goal of the Kids First Program is to improve understanding of genetic mechanisms of disease, leading to improved diagnostic capabilities and ultimately more targeted therapies. Genomic profiling across two of the largest clinical trials ever performed in children with T-ALL will clearly meet these important goals. This work will not only fundamentally transform the understanding of T-ALL disease biology but also allow us to risk stratify patients accurately and equitably and to understand differences in tumor biology based on genetic ancestry.

PUBLIC HEALTH RELEVANCE:

Modern genetic tests have helped find better ways to identify children with T-cell acute lymphoblastic leukemia (T-ALL) who are less likely to be cured. Before we can use these tests in the clinic, we need to show they are helpful regardless of the therapy used to treat the leukemia.

Contact PI/Project Leader: Sharon Diskin
Project Number: CA268005
Awardee Organization: Children's Hospital of Philadelphia
Title: The Genetic Basis Of Treatment Outcomes And Late Effects After High-Risk Neuroblastoma
Anticipated Number of Samples: 407

Abstract:

DESCRIPTION (provided by applicant): 

Children diagnosed with high-risk neuroblastoma receive intensive multi-modal therapy, yet 40-50% die of their primary cancer, and those who survive experience substantial treatment-related morbidities. There is no reliable way to identify those at greatest risk of treatment failure (death) or late effects, and only a nascent understanding of underlying genetic determinants. Our long-term goal is to improve neuroblastoma outcomes by first characterizing the events driving tumorigenesis and treatment response so that evidence-based and less toxic therapies can be developed. We hypothesize that comprehensive whole genome sequencing (WGS) of high-risk neuroblastoma subjects treated with modern therapy and annotated with late effect phenotypes will identify genetic determinants of survival and treatment-related morbidities. Through an existing Gabriella Miller Kids First (GMKF) project, we performed WGS of neuroblastoma patient-parent triads/dyads (n=556) together with matched tumor DNA (n=336) and RNA-sequencing (n=207). These data have defined the heritable fraction of rare pathogenic variants in cancer predisposition genes and suggest carriers have worse survival. However, only a subset of the cases sequenced (n=178) are high-risk and none include phenotyping of late effects. Here, we will build on existing GMKF profiling to generate germline WGS for 1,100 total children (n=922 new) who received modern high-risk neuroblastoma therapy, along with additional WGS of matched tumor DNA (n=553 new) and RNA-sequencing (n=461 new). All subjects participated in the Children’s Oncology Group (COG) neuroblastoma biology study (ANBL00B1). The entire cohort is annotated with demographic (age, sex, race, ethnicity), clinical (e.g. age at diagnosis, stage, risk group, survival), and tumor biological (e.g. MYCN status) covariates. 
A subset (n=367) are 5+ year survivors enrolled in the COG ALTE15N2: Late Effects After High-Risk Neuroblastoma (LEAHRN) study and have undergone extensive clinical assessments, with excellent characterization of late toxicities. We will test our hypothesis through two Specific Aims: 1) Identify germline and somatic variants associated with high-risk neuroblastoma treatment failure. Using a phased approach, we will identify coding and non-coding germline variation, somatic alterations, and transcriptomic profiles predicting refractory disease and survival. 2) Discover genetic risk factors associated with late effects after high-risk neuroblastoma therapy. We will define the spectrum, prevalence, and association of rare pathogenic variants with respect to hearing loss, cardiomyopathy, growth impairment and primary gonadal failure in the LEAHRN subjects. Data from NCI-TARGET (n=1,108), our genome-wide association study (GWAS; n=6,202), and phenotyping in recent high-risk trials will be integrated to validate genetic associations with treatment outcomes. Sequencing of this unique and extensively phenotyped high-risk neuroblastoma cohort will provide an unparalleled opportunity to discover germline and somatic alterations that can be used to identify patients at risk for treatment failure and late effects. This will serve as the rationale for the design of future trials aimed at improved survival and reduction in late effects.
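
The germline cohort size quoted in this abstract appears consistent with adding the newly sequenced children to the high-risk cases already sequenced under the earlier GMKF project. The short check below makes that arithmetic explicit. It assumes, as a reading of the abstract rather than a statement from the applicants, that the 1,100 total is the sum of those two groups; the variable names are illustrative.

```python
# Counts taken from the abstract; the way they are combined is an assumption.
existing_high_risk_germline = 178  # high-risk cases already sequenced
new_germline = 922                 # newly requested germline WGS
stated_total_germline = 1100       # "1,100 total children"

computed_total = existing_high_risk_germline + new_germline
print(computed_total, computed_total == stated_total_germline)
```

Under this reading the two figures agree, so the request adds 922 children to the 178 existing high-risk cases.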

*Sequencing of this project is supported by the NIH Childhood Cancer Data Initiative, and the data will be shared through the NCI Cancer Data Service.

This page last reviewed on May 15, 2026