Project Number: HD107271-01 Contact PI / Project Leader: Cody, Jannine De Mars
Title: The genomic basis of structural birth defects associated with chromosome 18 copy number changes Awardee Organization:
University Of Texas Health Science Center


DESCRIPTION (provided by applicant): 

Chromosome abnormalities are a common cause of structural birth defects. However, because these genomic copy number changes involve multiple genes and because most of these conditions are individually rare, defining the specific causative genes behind specific phenotypes has been a challenge. We met that challenge by spending the last 28 years enrolling and evaluating anyone with a chromosome 18 abnormality. The cohort now includes over 700 individuals with a wide variety of chromosome 18 copy number changes (CNVs) as well as their parents. Because the vast majority of these participants have individually unique CNVs, we have been able to perform extensive genotype-phenotype correlations resulting in 58 publications. Of the 263 genes on chromosome 18, 28 are linked to specific hemizygous phenotypes. However, most have low penetrance. Genomic sequence data could identify variants in the extant allele that are hypermorphic and compensate for hemizygosity or are hypomorphic and exacerbate hemizygosity. Additionally, there are phenotypes within this cohort that are rare or an extreme version of one of the more common phenotypes. Genomic sequence data could help to discover new biallelic conditions by revealing functional sequence variants of the extant allele. In both of these cases there could also be variants in other genes on other chromosomes that confer susceptibility or resilience and may be associated with the phenotype. Inclusion of this unique cohort in the Gabriella Miller Kids First Pediatric Research Program can advance the goals of the program in several ways. First, it adds known susceptibility loci to the existing Kids First cohorts for the structural birth defects that are also found in our cohort. Second, for those structural birth defects known to be polygenic, this cohort has a single defined risk factor which can simplify the search for secondary factors.
Third, these studies can also bring clarity to people with chromosome 18 conditions and help to make the genotype more accurately predict phenotype. This approach could be a model for understanding the many other rare chromosome abnormalities which collectively are a common cause of structural birth defects.


Project Number: CA268005-01 Contact PI / Project Leader: Diskin, Sharon 
Title: The Genetic Basis of Treatment Outcomes and Late Effects After High-Risk Neuroblastoma Awardee Organization:
Children's Hospital Of Philadelphia


DESCRIPTION (provided by applicant): 

Children diagnosed with high-risk neuroblastoma receive intensive multi-modal therapy, yet 40-50% die of their primary cancer, and those who survive experience substantial treatment-related morbidities. There is no reliable way to identify those at greatest risk of treatment failure (death) or late effects, and only a nascent understanding of underlying genetic determinants. Our long-term goal is to improve neuroblastoma outcomes by first characterizing the events driving tumorigenesis and treatment response so that evidence-based and less toxic therapies can be developed. We hypothesize that comprehensive whole genome sequencing (WGS) of high-risk neuroblastoma subjects treated with modern therapy and annotated with late effect phenotypes will identify genetic determinants of survival and treatment-related morbidities. Through an existing Gabriella Miller Kids First (GMKF) project, we performed WGS of neuroblastoma patient-parent triads/dyads (n=556) together with matched tumor DNA (n=336) and RNA-sequencing (n=207). These data have defined the heritable fraction of rare pathogenic variants in cancer predisposition genes and suggest carriers have worse survival. However, only a subset of cases (n=178) sequenced are high-risk and none include phenotyping of late effects. Here, we will build on existing GMKF profiling to generate germline WGS for 1,100 total children (n=922 new) who received modern high-risk neuroblastoma therapy, along with additional WGS of matched tumor DNA (n=553 new) and RNA-sequencing (n=461 new). All subjects participated in the Children’s Oncology Group (COG) neuroblastoma biology study (ANBL00B1). The entire cohort is annotated with demographic (age, sex, race, ethnicity), clinical (e.g., age at diagnosis, stage, risk group, survival), and tumor biological (e.g., MYCN status) covariates.
A subset (n=367) are 5+ year survivors enrolled in the COG ALTE15N2: Late Effects After High-Risk Neuroblastoma (LEAHRN) study and have undergone extensive clinical assessments, with excellent characterization of late toxicities. We will test our hypothesis through two Specific Aims: 1) Identify germline and somatic variants associated with high-risk neuroblastoma treatment failure. Using a phased approach, we will identify coding and non-coding germline variation, somatic alterations, and transcriptomic profiles predicting refractory disease and survival. 2) Discover genetic risk factors associated with late effects after high-risk neuroblastoma therapy. We will define the spectrum, prevalence, and association of rare pathogenic variants with respect to hearing loss, cardiomyopathy, growth impairment and primary gonadal failure in the LEAHRN subjects. Data from NCI-TARGET (n=1,108), our genome-wide association study (GWAS; n=6,202), and phenotyping in recent high-risk trials will be integrated to validate genetic associations with treatment outcomes. Sequencing of this unique and extensively phenotyped high-risk neuroblastoma cohort will provide an unparalleled opportunity to discover germline and somatic alterations that can be used to identify patients at risk for treatment failure and late effects. This will serve as the rationale for the design of future trials aimed at improved survival and reduction in late effects.

Project Number: HL161587-01 Contact PI / Project Leader: Gelb, Bruce D
Title: Expanding our understanding of the role of noncoding variation causing congenital heart defects
Awardee Organization: Icahn School Of Medicine At Mount Sinai
Abstract Text:

Abstract: DESCRIPTION (provided by applicant): The epidemiology of congenital heart defects (CHD) indicates that genetic variation is the overwhelmingly predominant cause of these commonest birth defects, but more than 50% of CHD cases remain unexplained even after trio exome sequencing (ES). The remaining large gap in genetic causality for CHD is what the Pediatric Cardiac Genomics Consortium (PCGC), a component of NHLBI’s Bench-to-Bassinet Program, seeks to address through the Gabriella Miller Kids First (GMKF) Pediatric Research Program. Under the auspices of prior GMKF awards for trio genome sequencing (GS), the PCGC began to elucidate the role of de novo noncoding damaging single nucleotide variants and small insertions and deletions (indels) in CHD causality. The cumulative mean attributed risk from noncoding de novo variants (DNVs) for exome-negative CHD was 17-45%. To further the understanding of the role of genetic variation in causing CHD, the PCGC is requesting GS for 500 probands born with tetralogy of Fallot (ToF) or hypoplastic left heart syndrome (HLHS), unsolved after exome sequencing, and their unaffected parents. In addition to increasing cohort size to improve statistical power, we will use a new and larger control trio GS dataset available through TOPMed (n = 1,758) and generate an improved version of our neural network, HeartENN, which predicts functional impact of genetic variation, through incorporation of more than double the cardiac noncoding regulatory feature data. Of note, the PCGC has the wherewithal to confirm relevant variants with other genomic methods as well as to perform functional cell-based assays to further support claims of pathogenicity. We will also use the GS data to expand our understanding of structural variation underlying CHD. We will use a best-in-class SV calling pipeline, developed by the Talkowski group at the Broad Institute.
The analytic focus for SVs will include disruptions of known autosomal dominant CHD genes, second hits in trans to damaging coding variants in known autosomal recessive CHD genes, and burden analysis for apparently damaging SVs combined with existing data about putatively damaging coding variants (SNVs and indels) from >5,000 CHD trios. Finally, in an exploratory portion of this aim, we will attempt calling of SVs such as repeat expansions that are difficult with short-read GS, focusing on a limited number of regions of potential interest based on our analysis of PacBio long-read GS from 200 CHD probands, currently being generated under the auspices of the GMKF pilot program.

Project Number: HD107383-01 Contact PI / Project Leader: Krantz, Ian D
Title: RNAseq in Cornelia de Lange Syndrome, Related Diagnoses and Structural Birth Defects Awardee Organization: Children's Hospital Of Philadelphia
Abstract Text:

Abstract: DESCRIPTION (provided by applicant): Disorders of human morphogenesis are a major cause of human suffering for the affected individuals and their families. Congenital anomalies are identified in approximately 3% of term births, 10% of stillbirths, and in as many as 50% of first trimester spontaneous abortuses. While most, if not all, human structural birth defects have a significant genetic component, identification of genetic perturbations in isolated structural birth defects has been complicated by the complex nature of their underlying etiologies, likely involving disruption of regulatory elements that can act in a temporal and tissue-specific manner, as well as multi-gene, epigenetic and gene-environment interactions. Our approach to tease out genetic contributions to birth defects has been to identify the underlying causes of syndromic birth defects, which are often Mendelian in nature and therefore lend themselves more readily to genetic causal identification. Once identified, these genetic causes of syndromic forms of birth defects can be leveraged to understand the genetic contributions to isolated birth defects seen in constellation in these syndromes. We propose to use Cornelia de Lange Syndrome (CdLS), a dominant multisystem developmental disorder consisting of a constellation of structural birth defects involving most body systems and significant growth and cognitive impairment, as a prime example of this approach. We and others have shown that disruption of the cohesin and associated pathways causes CdLS and related diagnoses, which have more broadly been termed “cohesinopathies” or “disorders of transcriptional regulation (DTRs)”.
In this proposal we outline an initial plan to perform RNA sequencing on a unique cohort of 77 probands with clinically confirmed CdLS or a related diagnosis (and 74 unaffected family members) in whom genome sequencing performed as part of a previous X01 project was non-diagnostic, but who are strongly suspected of having an underlying genetic alteration to explain their clinical features. This work will lead to the identification of genes critical in human embryonic development, provide novel insights into transcriptional regulation and help to identify genetic causes and candidate genes for isolated birth defects seen in constellation in this group of diagnoses. Most critical developmental genes are also cancer genes and the genes known to cause CdLS are no exception. CdLS is not a cancer predisposition syndrome, so understanding the mutational mechanisms in these genes that lead to structural birth defects when present in the germ line and result in cancer when mutated somatically is a fundamental aspect of this research.

Project Number: DE031445-01 Contact PI / Project Leader: Letra, Ariadne M
Title: Whole genome sequencing studies of multiplex nonsyndromic cleft lip/palate families Awardee Organization:
University Of Texas Health Sci Ctr Houston
Abstract Text:

Abstract: DESCRIPTION (provided by applicant): In this proposal, we request whole genome sequencing (WGS) services for 923 individuals from our cohort of well-characterized, large multigenerational nonsyndromic cleft lip/palate (NSCLP) families of Hispanic and non-Hispanic white ethnicities. NSCLP is a common birth defect accounting for 65% of all birth defects and annually affecting approximately 135,000 newborns worldwide. Despite improvement in treatments, NSCLP imposes significant medical, psychosocial and financial burdens that affect quality of life of affected individuals and their families. NSCLP is complex, caused by genetic and environmental factors, and their interactions. Recent advances in genomic approaches have improved our knowledge of the genetic factors involved in NSCLP; however, most of the variants associated with NSCLP account for ~25% of the genetic liability and reflect common, modest-risk variants often located in noncoding regions of the genome. More recently, it has been suggested that some genetic risk for NSCLP lies in rare variants and this has contributed to the lack of consistent findings and difficulty in unraveling risk alleles. Further, it is likely that the missing heritability of NSCLP results in part from interactions between common, modest-risk variants and rare, high-risk variants. We will analyze WGS data and apply polygenic risk score analysis to identify novel, high-penetrance NSCLP variants and to systematically evaluate the contributions of both common and rare variants to NSCLP, and how they segregate individually and in concert within and between families. The results of this study will provide novel and important insights about the genetic architecture contributing to the complex etiology of NSCLP.
Importantly, this proposal will translate into a rich and publicly available resource of NSCLP genotypic and phenotypic data that will be made available to the broader scientific community to foster additional studies on NSCLP, as well as other birth defects and/or associated co-morbidities. Further, this proposal will provide genotype and allele frequency data in Hispanics, for whom limited data are available in genetic databases. Additional follow-up studies proposed, although beyond the scope of this X01 application, include validating the variants identified in this study in our additional NSCLP trios as well as through joint analysis with data from additional existing Kids First datasets. Successful completion of this study will provide novel and important insights about the genetic architecture contributing to the complex etiology of NSCLP, and will translate into a large, rich resource for genetic and phenotypic information on NSCLP.

Project Number: CA267638-01 Contact PI / Project Leader: Lupo, Philip J
Title: Genetic Overlap Between Anomalies and Cancer in Kids in the Childrens Oncology Group: The COG GOBACK Study Awardee Organization:
Baylor College Of Medicine
Abstract Text:

Abstract: DESCRIPTION (provided by applicant): One of the strongest risk factors for cancer in children and adolescents is being born with a congenital anomaly. This is true both for chromosomal abnormalities (e.g., Down syndrome) and non-chromosomal birth defects (e.g., non-syndromic congenital heart defects), as recently validated in our registry linkage study of over 10 million live births. Specifically, by linking data from population-based birth defects and cancer registries in four states included in the Genetic Overlap Between Anomalies and Cancer in Kids (GOBACK) Study, we identified multiple novel congenital anomaly-cancer associations that are not part of known cancer predisposition syndromes. We also observed increasing cancer risk with a corresponding increase in the number of non-chromosomal defects. Children with multiple congenital anomalies (MCAs) who develop cancer are likely a subset of individuals enriched for cancer predisposition syndromes that have yet to be identified. There are two unanswered questions that limit clinical translation of these findings. Specifically, it is not clear: 1) what proportion of these associations may be due to known cancer predisposition variants; and 2) if novel cancer predisposition syndromes might underlie these observed associations. These gaps limit genetic testing and counseling strategies for these families. Our central hypothesis is that the co-occurrence of congenital anomalies and cancer results from molecular etiologies identifiable through genomic evaluation. Our hypothesis is built on rigorous prior research, including findings from the GOBACK Registry Linkage Cohort and whole-genome sequencing (WGS) on a subset of children enrolled in the GOBACK Family Cohort.
Our proposed Kids First study leverages several robust and existing resources, including: 1) matched tumor-normal samples from children with congenital anomalies and cancer without a reported syndrome (N=500) enrolled as part of the Children's Oncology Group (COG) APEC14B1 protocol (Project:EveryChild or PEC); 2) an approved COG protocol that allows for WGS of these samples; 3) an ongoing study to recontact these families to obtain additional phenotypic data; 4) our established bioinformatic pipelines to evaluate the role of rare variants consistent with autosomal dominant, recessive, or X-linked disorders; and 5) the expanded GOBACK Registry Linkage Cohort representing >35% of the U.S. population, which can be used to evaluate associations and WGS findings observed in PEC. Our central hypothesis will be evaluated in two aims: 1) determine the frequency of known cancer predisposition variants among children with congenital anomalies and cancer; and 2) identify variants that underlie novel anomaly-cancer predisposition syndromes and describe the landscape of somatic alterations in these children. By extending our integrated population-based and genomic approach, this application has the potential to: 1) generate novel insights into the developmental pathways that lead to cancer; 2) identify new cancer susceptibility syndromes; and 3) subsequently lead to improved cancer surveillance strategies for children with congenital anomalies.

Project Number: CA267576-01 Contact PI / Project Leader: Meshinchi, Soheil 
Title: Germline and Somatic Variants in Pediatric AML Awardee Organization:
Fred Hutchinson Cancer Research Center
Abstract Text:

Abstract: DESCRIPTION (provided by applicant): Advances in genomic sequencing have allowed identification of somatic variants as potential therapeutic targets. Although myeloid disorders in children may show morphologic similarities to those seen in adults, the TARGET AML initiative (Meshinchi, PI) clearly demonstrated that somatic genomic and transcriptome variants are highly distinct in children and young adults. In fact, there are a number of variants that are uniquely restricted to younger children. The TARGET AML initiative, although modest in number, helped identify numerous somatic alterations with high therapeutic potential in younger AML patients. In addition to identification of somatic variants, analysis of the germline data provided a glimpse into the constitutional make-up of patients with AML. The identification of numerous “function altering” variants may provide an insight into possible interactions between the host and the disease, where these germline variants might alter AML risk (predisposition), response to therapy (altering target expression, drug metabolism), susceptibilities to short and long-term complications (including infectious and cardiac complications), or modify risk of secondary malignancies. Armed with data from initial sequencing efforts in AML, we are poised to take full advantage of the available sequencing technology to conduct the most comprehensive genome and transcriptome interrogation of myeloid disorders in children with specimens we have amassed over the last decade. To this end, we have put in place unparalleled specimen resources from children with de novo AML treated on COG AAML1031, to create the most comprehensive genome, transcriptome and epigenome profiling in AML. Our original X01 application provided funding support for whole genome sequencing of approximately half of the patients treated on AAML1031.
Given that transcriptome (mRNA, miRNA, lncRNA) and methylation data are available for the entire AAML1031 cohort, completing the genomic sequencing will complete the profiling effort in this single trial cohort. The prior X01 sequencing effort has yielded unparalleled data for defining novel therapeutic targets and prognostic biomarkers, and has revealed germline variants that are associated with cancer predisposition. Also, germline variants can be exploited to identify patients who might be at high risk of adverse events (cardiac complications, secondary malignancies, etc.) so that their therapy can be tailored to minimize anticipated complications. Thus, we propose that the optimum outcome can only be obtained through comprehensive interrogation of the somatic and germline genome to fully annotate the genomic makeup of the leukemia and its host.

Project Number: CA267587-01 Contact PI / Project Leader: Resnick, Adam Cain
Title: Germline and Somatic Disease Modifiers of Pediatric Brain Tumors Awardee Organization: Children's Hospital Of Philadelphia
Abstract Text:

Abstract: DESCRIPTION (provided by applicant): Brain tumors are the most common form of cancer in children aged 0-19 in the United States, and are the largest cause of cancer-related deaths. The estimated number of new cases in 2019 is nearly 3,800 and thus brain tumors are a rare disease. Despite their relative rarity, the years of potential life lost due to brain tumors in 2009 was estimated at 47,631 years for children and adolescents aged 0-19 in the United States; this is a disproportionate amount of life lost compared to adult cancers and represents an unrecognized societal threat. There is an urgent need to improve therapies for these children. Most of the high-grade glial and embryonal brain cancers still remain largely incurable despite decades of clinical and laboratory research. Existing non-targeted chemotherapies and radiation, while at times effective, often represent pyrrhic victories, leaving behind life-long health burdens and causing a significant risk of secondary malignancies. NIH-funded efforts to generate cohort-based genomic datasets for pediatric brain tumors have lagged behind those for other histologies, and these tumors have yet to be included in large-scale sequencing efforts. However, consortia-based initiatives like those supported by the Children's Brain Tumor Network (CBTN) have demonstrated the early potential of clinically annotated genomic cohorts, and their utility to and interest from both the pediatric cancer and structural birth defect communities, with more than 130 data access requests for a non-embargoed cohort of tumor/normal whole genomes and paired tumor RNAseq. Indeed, more than one quarter of this 800-subject initial sequencing cohort were identified as having birth-defect-associated annotations in their clinical records. However, to our knowledge, few if any trio-based genomics cohort studies exist for any one pediatric brain tumor histology.
The project's proposed sequencing cohort defines the largest, clinically annotated pediatric brain tumor cohort study to date and seeks to define the intersection of germline and somatic underpinnings of pediatric brain tumors across a shared developmental context of cancer and structural birth defects.

Project Number: CA267639-01
Contact PI / Project Leader: Scheurer, Michael E
Title: Genomic Analysis of Histiocytosis Awardee Organization: Baylor College Of Medicine
Abstract Text:

Abstract: DESCRIPTION (provided by applicant): Langerhans cell histiocytosis (LCH) is an inflammatory myeloid neoplasm characterized by lesions including pathogenic CD207+ dendritic cells among an inflammatory infiltrate. The median age at diagnosis is 30 months, and up-front chemotherapy fails in ~50% of patients resulting in multiple relapse events for 40-50% of cases, and long-term sequelae. Sequencing studies have found recurrent, mutually exclusive somatic activating mutations in MAPK pathway genes in ~85% of LCH lesions, including BRAF V600E in 50-65%. There is a “Misguided Myelomonocytic Precursor Model” in which specific somatic MAPK mutations at critical stages of myeloid differentiation determine extent of disease. However, this model fails to explain the significant differences in LCH risk across ethnicities. Despite advances to elucidate the somatic mutational landscape underlying LCH pathogenesis, germline risk factors remain largely unknown. Therefore, we conducted the first genome-wide association study of LCH and identified a SMAD6 variant associated with increased risk. SMAD6 inhibits bone morphogenetic protein and transforming growth factor-beta/activin signaling, which are determinants of Langerhans cell differentiation. This variant appears to suppress SMAD6 protein expression without a decrease in SMAD6 messenger RNA expression in patients carrying the risk allele. This risk allele is also more common in Hispanics who are at the highest risk of LCH, and absent in blacks who experience the lowest LCH incidence. Our preliminary data also support the emerging observation that LCH somatic activating mutations vary by race/ethnicity. Specifically, sequencing of tumors from black patients indicated that only 25% were BRAF V600E+ (compared to >60% in other populations), whereas 50% had mutations in MAP2K1 (compared to < 10% in other populations). 
Therefore, the objective of this Kids First X01 application is to more fully elucidate LCH etiology by defining the role of de novo mutations (DNMs) in established LCH genes and novel susceptibility genes, and by comprehensively assessing somatic variation in LCH. Our central hypothesis is that penetrant DNMs and novel germline and somatic variation may contribute to, or in some cases, drive LCH tumorigenesis. 1) We will leverage our ongoing Children's Oncology Group study, Genetic Epidemiology of Childhood Histiocytosis (GECHO), to sequence 300 LCH case-parent trios to evaluate the impact of recurrent DNMs on inherited susceptibility to LCH. 2) We will leverage paired germline-tumor samples from 200 patients enrolled to the Texas Children's Histiocytosis Program protocol or GECHO study to comprehensively assess the somatic landscape of LCH and identify germline variation contributing to somatic mutational profiles. Successful completion of the proposed aims may (1) improve genetic testing and counseling strategies in LCH patients and families; (2) advance surveillance and chemoprevention protocols; and (3) identify novel therapeutic targets.

Project Number: CA267565-01 Contact PI / Project Leader: Shlien, Adam
Title: Discovering the Timing and Origins of Bone and Soft Tissue Cancers
Awardee Organization:
Hospital For Sick Children (Toronto)
Abstract Text:

Abstract: DESCRIPTION (provided by applicant): Sarcomas are cancers of the bone and connective tissue that affect a higher proportion of children than adults. Many childhood sarcomas are difficult to diagnose, which can lead to therapeutic delays. At relapse, childhood sarcoma patients have poor survival, with little improvement seen in 40 years. We hypothesize that childhood sarcomas’ true beginnings – their pre-malignant mutations or cells of origin – in fact occur many years prior to diagnosis. With sequencing support from the Gabriella Miller Kids First program, we will determine the temporal order and molecular processes that give rise to childhood sarcomas. To do so, we will draw on samples and clinical data from our repositories containing >6,000 samples. High quality specimens have been selected to inform each of the key temporal landmarks in the development of sarcoma – from tumor initiation, to the generation of critical oncogenic fusions and malignant potential, to possible relapse or metastasis. This project will be pursued in three parallel aims, using existing bioinformatics pipelines. First, we will find the originating mutations for childhood soft tissue and bone cancers. This is motivated by our finding that childhood Ewing sarcomas and osteosarcomas are initiated multiple years before diagnosis, sometimes starting in utero. We will reconstruct phylogenetic trees for rare sarcomas in this cohort. We will see how often early-onset tumors are associated with early oncogenesis. Second, we will use non-neoplastic tumors of bone and soft tissue as a model for sarcoma initiation, without proliferation. Complementing this, we will sequence late-emerging childhood sarcomas - from adults who developed sarcoma types typically found only in children. We will learn whether adult and childhood sarcomas of the same type are driven by the same mutagenic processes.
Third, we will use long read sequencing to find the missing structural rearrangements underpinning childhood sarcomas. We will determine the formation signatures of gene fusions, which are major drivers of early sarcomagenesis. Finally, we will use the same sequencing approach to examine sarcoma patients at relapse, to find clinically useful secondary mutations missed by conventional short read approaches. Collectively, these data will provide a thorough understanding of malignant progression in childhood sarcoma. This will lay the foundation for trials of early therapeutic intervention in childhood sarcoma, for example by predicting the evolutionary trajectory of relapse before it occurs. 

This page last reviewed on September 21, 2023