Abstract

Cleft lip or palate (CL/P) is a common facial defect present in 1 : 700 live births and results in substantial burden to patients. More than 500 CL/P syndromes have been described, the causes of which may be single-gene mutations, chromosomopathies, or exposure to teratogens. Some of the most prevalent syndromic forms of CL/P have a known etiology. Nonsyndromic CL/P, on the other hand, is a complex disorder whose etiology is still poorly understood. Recent genome-wide association studies have contributed to the elucidation of the genetic causes by identifying reproducible susceptibility variants; their etiopathogenic roles, however, are difficult to predict, as in the case of the chromosomal region 8q24, the most corroborated locus predisposing to nonsyndromic CL/P. Knowing the genetic causes of CL/P will directly impact genetic counseling, by allowing precise recurrence risk estimates, and patient management, since patient follow-up may be partially influenced by genetic background. This paper focuses on the genetic causes of important syndromic CL/P forms (van der Woude syndrome, 22q11 deletion syndrome, and Robin sequence-associated syndromes) and reviews recent findings in nonsyndromic CL/P research, addressing issues relevant to the geneticist's practice.

1. Introduction

Cleft lip or palate (CL/P) is a common human congenital defect that is readily recognized at birth. Despite the variability driven by socioeconomic status and ethnic background, the worldwide prevalence of CL/P is often cited as 1 : 700 live births; nevertheless, different methods of ascertainment may lead to fluctuations in the prevalence rates [1]. Essentially, CL/P results from failure of fusion of the maxillary processes or palatal shelves, which normally occurs between the 4th and 12th weeks of embryogenesis (as reviewed by Mossey et al. [2]). Cellular processes of proliferation, differentiation, and apoptosis, which are essential for appropriate lip and palate morphogenesis, are regulated by complex molecular signaling pathways; therefore, genetic and environmental factors that dysregulate those pathways are the subject of intensive research, as it is expected that understanding them will accelerate the development of preventive measures. Maternal alcohol intake and exposure to tobacco and to several chemicals, such as retinoic acid and folate antagonists (e.g., valproic acid), among others, have been shown to be teratogenic, thus representing risk factors to embryos during the first trimester of pregnancy (reviewed by Bender [3] and by Dixon et al. [4]). Despite the etiological importance of these environmental predisposition factors, this paper focuses on the genetic causes of CL/P.

Within CL/P, cleft lip with or without cleft palate (CL ± P) is considered a distinct entity from cleft palate only (CP), based on their different embryonic timing: closure of the palatal shelves occurs between the 8th and 12th weeks of human gestation [5], whereas lip formation is completed by the 7th week [6]. This subdivision is clearly supported by epidemiological findings [4]; however, in some syndromic forms of CL/P, both entities may segregate in the same family [7–10]. CL/P can occur as the only malformation (nonsyndromic (NS), representing 70% of CL ± P cases and 50% of CP cases) or associated with other clinical features (syndromic, 30% of CL ± P and 50% of CP cases; [11]), a classification that we will follow in the next sections.

The majority of children affected by CL/P require lengthy and costly multidisciplinary treatment for complete rehabilitation. A precise clinical diagnosis, which is not always simple to reach, is crucial for accurate genetic counseling, patient management, and definition of surgical strategies, as reviewed below.

2. Genetic Factors

2.1. Syndromic CL/P

Mutations in single genes and chromosomal abnormalities are the most common mechanisms underlying syndromic CL/P. The Online Mendelian Inheritance in Man database (OMIM) describes more than 500 syndromes with CL/P as part of the phenotype. Furthermore, several cases of trisomy of chromosomes 13, 18, and 21 associated with CL/P were described, as well as partial deletions and duplications of other chromosomes [12]. These findings suggest that there may be several genomic regions containing loci which, in excess or in insufficiency, may lead to CL/P.

In this paper, we highlight van der Woude syndrome (VWS) and Velocardiofacial syndrome (VCFS), due to their high frequency among CL/P cases, together with Robin sequence (RS), a clinical feature that may be associated with other syndromes, including VCFS.

2.1.1. Van der Woude Syndrome (VWS)

Van der Woude syndrome (VWS; OMIM 119300), the most frequent form of syndromic CL/P, accounts for 2% of all CL/P cases [13]. VWS is a single-gene disorder with an autosomal dominant pattern of inheritance. Its penetrance is high (89–99%; [14]) and it is clinically characterized mainly by CL ± P or CP, fistulae on the lower lip, and hypodontia [15]. There is a wide spectrum of clinical variability, in which patients lacking fistulae are indistinguishable from individuals affected by nonsyndromic forms. Kondo et al. [16] showed that missense and nonsense mutations in interferon regulatory factor 6 (IRF6) were responsible for the majority of VWS cases. Although the pathogenic mutations may occur in any region of the gene, about 80% of them have been found in exons 3, 4, 7, and 9 (reviewed by Durda et al. [17]). The pathogenic mutations leading to VWS are predicted to cause loss of function of the protein encoded by the gene [16].

Although we can estimate that the recurrence risk for future children of affected patients is 50%, it is still not possible to predict the severity of the disease in a fetus with a pathogenic mutation in IRF6, as there is no clear genotype-phenotype correlation. The pathogenic mutations in IRF6 seem to exert their major harmful effect during embryonic development, indicating that IRF6 plays a critical functional role in craniofacial development. However, IRF6 also seems to act after birth, as children with VWS have a higher frequency of wound complications after surgical cleft repair than children with NS CL ± P [18].
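The 50% figure is the probability that a child inherits the mutation. As a minimal sketch, assuming the usual approximation that the chance of a clinically affected child is the transmission probability multiplied by the penetrance, the penetrance range reported for VWS (89–99% [14]) can be turned into an approximate risk of an affected child. The function name and its default argument are illustrative and do not come from the cited studies.

```python
def affected_child_risk(penetrance, transmission=0.5):
    """Approximate probability that a child is clinically affected.

    Assumptions of this sketch (not stated in the paper): autosomal
    dominant inheritance, one heterozygous parent, and a second parent
    who does not carry the mutation.
    """
    if not 0.0 <= penetrance <= 1.0:
        raise ValueError("penetrance must be between 0 and 1")
    return transmission * penetrance


# Penetrance bounds reported for VWS in the text (89% and 99%).
for p in (0.89, 0.99):
    print(f"penetrance {p:.0%}: risk of affected child {affected_child_risk(p):.1%}")
```

Under these assumptions the risk of a clinically affected child lies slightly below 50%, between about 44.5% and 49.5%; the severity of the phenotype remains unpredictable, as noted above.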

The spectrum of clinical variability of VWS has recently been expanded by the demonstration that mutations in IRF6 also cause Popliteal Pterygium Syndrome (PPS; OMIM 119500), an allelic, autosomal dominant disorder that presents, besides the facial anomalies typical of VWS, bilateral popliteal webs, syndactyly, and genital anomalies [17]. Most of the pathogenic mutations causing PPS are located in exon 4 of the IRF6 gene [16]. There is a strong genotype-phenotype correlation distinguishing VWS from PPS, but how the different mutations lead to PPS or VWS is still uncertain [19].

Since most of the VWS and PPS cases can be diagnosed by clinical evaluation, the necessity of genetic testing should be evaluated in each case.

2.1.2. Velocardiofacial Syndrome or 22q11.2 Deletion Syndrome

Velocardiofacial syndrome (VCFS; OMIM 192430) is an autosomal dominant disorder mainly characterized by the presence of cardiac anomalies (conotruncal defects, predominantly tetralogy of Fallot and conoventricular septal defects), CP or submucosal CP, velopharyngeal incompetence, facial dysmorphia, thymic hypoplasia, and learning disabilities [20]. The major known mutational mechanism causative of VCFS is a submicroscopic deletion at 22q11.2, usually spanning 1.5 Mb to 3 Mb. The spectrum of clinical variability is very wide, ranging from the mildest cases, which present only two clinical signs, to the full-blown phenotype. DiGeorge syndrome (DGS; OMIM 188400), a condition with great clinical overlap with VCFS, is also caused by deletions at 22q11.2, so the two conditions represent a single entity; the term “22q11.2 deletion syndrome” is now commonly used to refer to all these cases. The clinical diagnosis for this group of patients is usually difficult, and genetic tests are often recommended in the presence of at least two clinical features of the syndrome, such as velopharyngeal insufficiency and cardiac defects [21]. Moreover, patients may develop late-onset psychosis or behavioral disturbances, such as schizophrenia or bipolar disorder [22]. The severity of the syndrome is not dependent on the size of the deletion [23, 24], and several studies have pointed to loss of one copy of TBX1 as the major etiological agent within 22q11.2 leading to the phenotypic alterations [25, 26]. However, other environmental or genomic factors may also influence phenotype manifestation. Therefore, identification of 22q11.2 deletion patients is important for genetic counseling purposes as well as for discussing prognosis and surgical intervention, as the choice of surgical procedure depends upon the presence of abnormal and misplaced internal carotid arteries, which is relatively common in these patients (reviewed by Saman and Tatum [27]). The recurrence risk is high (50%) for carriers of the 22q11.2 deletion, and it is still not possible to predict the severity of the disorder in fetuses with this alteration.

2.1.3. Robin Sequence and Associated Syndromes

Robin sequence (RS), also referred to as Pierre Robin sequence, is characterized by the presence of micrognathia or retrognathia, respiratory distress, and glossoptosis, with or without CP [28, 29]. It is also associated with high morbidity secondary to a compromised airway, feeding difficulties, and speech problems. It can occur in isolation (called NS RS), but most of the time it is associated with a genetic syndrome [30]. Therefore, RS must not be regarded as a definitive diagnosis, and defining the presence of an associated syndrome has implications for future case management and determination of recurrence risks [30]. The most common syndromes associated with RS are Stickler syndrome and VCFS, both with an autosomal dominant pattern of inheritance and with several additional clinical complications that are not present in NS RS.

The pathogenesis of NS RS is heterogeneous and not well defined. NS RS has been considered the result of intrauterine fetal constraint, in which extrinsic physical forces (e.g., oligohydramnios, breech position, or abnormal uterine anatomy) inhibit normal mandibular growth. Micrognathia in early fetal development may in turn cause the tongue to remain between the palatal shelves, thus interfering with palate closure [29, 31]. However, this mechanism has been challenged by the identification of several genetic alterations associated with RS, including chromosomal deletions such as 2q24.1-33.3, 4q32-qter, 11q21-23.1, and 17q21-24.3 [32] and microdeletions involving regulatory elements surrounding SOX9 [33]. NS RS usually occurs as the only case in the family, and the recurrence risk for future pregnancies of a couple with one affected child is low [34].

2.2. Nonsyndromic CL ± P (NS CL ± P)

NS CL ± P includes a wide spectrum of clinical variability, from a simple unilateral lip scar to bilateral cleft lip and cleft of the palate, as partly represented in Figure 1. Several lines of epidemiological evidence, such as familial recurrence, observed in 20–30% of the cases [35, 36], and twin concordance rates (40–60% for monozygotic and 3–5% for dizygotic twins; [37]), suggest an important genetic component in NS CL ± P etiology. High heritability estimates have been obtained in several studies (reaching 84% in Europe [38], 78% in China [39], and 74% in South America [40]; in Brazil, our group found estimates ranging from 45% to as high as 85%, depending on the population ascertained [36]). The most accepted genetic model for NS CL ± P is the multifactorial model, in which both genetic and environmental factors play a role in phenotype determination.

Researchers have used different approaches to search for NS CL ± P susceptibility loci. Linkage analysis and association studies of candidate genes were, initially, the most popular approaches, and the first gene suggested to be associated with NS CL ± P was transforming growth factor alpha (TGFα), by Ardinger et al. [41]. Thereafter, linkage analyses pointed to other genomic regions as possible susceptibility loci, such as 6p24-23 [42] (recently studied by Scapoli et al. [43]), 4q21 [44], 19q13 [45], and 13q33 [46]. Additional studies, however, failed to reproduce these loci, as reviewed in detail by others [4, 47], suggesting the existence of strong genetic heterogeneity underlying the predisposition to the disease (i.e., different causal loci might be acting in the different families studied).

Candidate genes analyzed through association studies emerged not only from initial findings by linkage analysis, but also from (1) the gene's role in lip or palate embryogenesis, as suggested by animal model studies (e.g., TGFα, in the pioneer study by Ardinger et al. [41], and MSX1 [48]); (2) the gene's role in the metabolism of putative environmental risk factors (e.g., MTHFR, involved in folate metabolism and first tested by Tolarova et al. [49], and RARα, which encodes a nuclear retinoic acid receptor, tested initially by Chenevix-Trench et al. [50]); (3) the identification of chromosomal anomalies in patients (e.g., SUMO1 [51]); and (4) the gene's role in syndromic CL/P, such as van der Woude syndrome (IRF6, its causal gene, was first associated with NS CL ± P by Zucchero et al. [52]), Cleft Lip/Palate Ectodermal Dysplasia Syndrome (caused by mutations in PVRL1 [53], first associated with NS CL ± P by Sözen et al. [54]), and EEC and AEC (both caused by mutations in TP63 [55], associated with NS CL ± P by Leoyklang et al. [56]), among others.

Among all loci that arose through linkage and candidate gene association studies, the IRF6 gene was the only locus to be consistently associated with NS CL ± P, as first shown by Zucchero et al. [52]. Rahimov et al. [57] identified a common nucleotide variant (namely rs642961) in an IRF6 regulatory sequence conferring risk to NS CL ± P that could potentially dysregulate IRF6 transcription levels and consequently dysregulate other signaling pathways. The association of rs642961 has been replicated in other studies in Europe [58, 59], Latin America [60, 61], and Asia [62, 63]. Nevertheless, the role of rs642961 in embryonic development and how it predisposes to NS CL ± P remain to be elucidated.

With the advent of high-throughput genotyping technologies, which allowed for a deeper investigation at the genomic level without prior hypothesis of candidate regions to be tested, the landscape changed substantially. Genome-wide association studies (GWASs) emerged from these advances and have contributed remarkably to the understanding of NS CL ± P etiology. Four large GWASs have been performed on NS CL ± P so far, and their main findings are summarized in Figure 2. Markers within a gene desert in the chromosomal region 8q24 were unequivocally implicated in NS CL ± P susceptibility, since the studies yielded similar results for this region. A second promising locus that emerged from these studies is the region 10q25. Other, smaller association studies have replicated the association for both 8q24 and 10q25 [59, 60, 66–69]. Therefore, the IRF6 gene and the chromosomal regions 8q24 and 10q25 are, to date, the most corroborated loci implicated in NS CL ± P. However, in contrast to the IRF6 association, for which a specific susceptibility variant has been identified, finding the functional causative mutations and the molecular pathogenesis underlying the associations observed for the 8q24 and 10q25 regions remains a challenge; Table 1 summarizes the main candidate genes proposed by these studies. Recently, a GWAS performed in 34 consanguineous families from a Colombian isolated population suggested that the loci 11p12, 11q25, and 8p23.2 may harbor recessive genes underlying NS CL ± P etiology [70]; these results, however, need further replication. A recent linkage analysis applying high-throughput genotyping also suggested a role for the region of FOXE1 (9q22-q33) in NS CL ± P susceptibility [71]; nevertheless, this locus has not been reproduced in other studies.

The difficulty in replicating the investigated loci may be a consequence of the genetic heterogeneity in NS CL ± P, that is, susceptibility variants differing from patient to patient; susceptibility variants may also differ across unrelated populations. Beaty et al. [64] highlighted stronger evidence for 8q24 in Europeans than in Asians. Ethnic heterogeneity was also observed by Blanton et al. [67]; we have observed differences even across populations within Brazil [69], and a study with a Kenyan population failed to find this association [74]. On the other hand, the Asians in the study reported by Beaty et al. [64] presented the most solid association for 20q12 and 1p22, compared to the European sample. It is possible that such differences are a consequence of low statistical power in the subsample of a given ethnicity, as observed by Murray et al. [75]. In any case, these findings stress the value of testing non-European populations in order to identify the risk factors of NS clefting for each population and to better understand the genetic architecture of the disease.

Despite the success of GWAS in identifying new susceptibility loci, those consistently implicated in NS CL ± P fail to explain the full genetic contribution estimated for the trait. This “failure” has been a common observation in many other traits, such as type 2 diabetes, height, and early onset myocardial infarction [76], and there is a current debate on where the remaining genetic causes could be hidden. One hypothesis is that gene-gene and gene-environment interactions may represent a substantial additional risk; however, their evaluation is still difficult with the current research tools. It is also possible that a combination of rare mutations per individual is responsible for a large proportion of cases. New technologies for exome and genome sequencing are promising approaches to bridge this gap and have the potential to reveal new susceptibility variants. The use of other approaches, such as expression analysis, can also bring new insights into the causative pathways behind this malformation. In this respect, we have recently shown that dental pulp stem cells from NS CL ± P patients exhibit dysregulation of a set of genes involved in extracellular matrix remodeling, an important biological process for lip and palate morphogenesis [77].

2.3. Nonsyndromic CPO (NS CPO)

Cleft palate only (CPO) is also a common malformation with a wide spectrum of variability, ranging from the mildest phenotypes, involving only a bifid uvula, to more severe cases, the majority of which include cleft of the soft and hard palates (Figure 1). The higher recurrence risk observed for close relatives compared to the general population [78, 79] and the higher concordance in monozygotic compared to dizygotic twins [80, 81] indicate the presence of genetic components in the etiology of NS CPO. Akin to NS CL ± P, NS CPO is believed to result from a combination of genetic and environmental factors [78]. However, in contrast to NS CL ± P, only a few studies on the genetic basis of NS CPO have been conducted, probably because of its lower prevalence and difficulty of ascertainment.

A first linkage genome scan to find NS CPO susceptibility loci was performed in 24 Finnish families by Koillinen et al. [82], which suggested 1p32, 2p24-25, and 12q21 as candidate regions; all of them, however, reached only borderline significance. Recently, Ghassibe-Sabbagh et al. [83] demonstrated the involvement of the Fas-associated factor-1 gene (FAF1) in NS CPO and provided insights into the gene’s function in facial chondrogenic development, using a combination of an association study in a large multi-ethnic sample, gene expression analysis, and an animal model. Beaty et al. [84] performed a GWAS in 550 trios (proband and parents) of mixed ancestries and, although they did not find significant results when testing the association of genetic markers with the phenotype alone, they obtained interesting results when the association tests were conditioned on environmental variables (maternal smoking, alcohol consumption, and vitamin supplementation): association of TBK1, ZNF236, MLLT3, SMC2, and BAALC was suggested. None of the loci raised in these studies were in common with those that emerged for NS CL ± P. Similarly, in search of a possible common etiology between NS CL ± P and NS CPO, many researchers tested the involvement of NS CL ± P candidate loci in NS CPO, but negative or conflicting results were reported for TGFα, TGFβ3, MSX1, SUMO1, BCL3, IRF6, and 8q24 [57, 72, 85–90].

A number of studies in mice have shown that defects in several genes lead to cleft palate, often accompanied by a set of other defects, as reviewed by Cobourne [91]. Among those genes, MSX1 was the most penetrant, that is, alterations in MSX1 led to CPO more frequently than alterations in other genes. Some authors have also reported chromosomal duplications, deletions, and rearrangements in NS CPO patients [92–94]. Nonetheless, the pathogenic role of the genes located within those chromosomal regions has not been confirmed.

3. Genetic Management of the Family with CL/P-Affected Children

The clinical evaluation of a CL/P patient, outlined in Figure 3, starts with the classification of the case as syndromic or nonsyndromic, based on the presence or absence of other dysmorphisms or malformations, together with an investigation of the occurrence of relatives with similar features.

Among the syndromic cases, it is first necessary to investigate the possibility of non-genetic causes, for example, exposure to teratogens during the first trimester of gestation. In cases of CL/P arising from the action of teratogenic agents during embryogenesis, the recurrence risk is negligible, provided that the exposure does not recur in a subsequent pregnancy. Once the possibility of a teratogenic origin for CL/P is ruled out, the geneticist should raise the diagnostic hypothesis of genetic syndromes and recommend the most adequate test (these tests might also be useful in cases of teratogenic exposure, in order to rule out chromosomal abnormalities). The most commonly performed tests are the karyotype, Multiplex Ligation-dependent Probe Amplification (MLPA), Comparative Genomic Hybridization array (CGH-array), gene target sequencing, and exome sequencing. Whilst the karyotype is a cytogenetic technique that detects large structural and numeric chromosomal anomalies at low resolution, MLPA and CGH-array are quantitative molecular tests that enable the investigation of gain or loss of genetic material at the submicroscopic level. MLPA is applied to investigate specific targets in the genome, while CGH-array can be used to screen the whole genome with very high resolution. MLPA or CGH-array is recommended for a first screening, depending on the available resources [95, 96].

Gene target sequencing is recommended when one or more genes are known to be causative of the disorder. There is a trend towards the use of next generation sequencing, particularly in diseases associated with genetic heterogeneity, as this approach permits the simultaneous testing of several genes, thus resulting in a more cost-effective test in the long run. Recurrence risk estimates for future children of the parents of one affected patient depend on the definition of the etiological mechanisms of the disease, which underscores the importance of selecting the appropriate test, combined with the clinical evaluation, for the establishment of the diagnosis.

In nonsyndromic cases, due to our limited understanding of their etiological mechanisms, the recurrence risks have been determined empirically by epidemiological studies. As expected for a multifactorial model of inheritance, these risks can be influenced by several factors, such as sex of the affected propositus, severity of the orofacial cleft, and number of affected relatives [97]. The recurrence risk among families with one affected first-degree relative has been estimated as 4% for NS CL ± P and 2% for NS CPO [98]. These estimates may vary depending on the population. In Brazil, the recurrence risk has been estimated at only 2% among families with one first-degree relative affected by NS CL ± P [36].

In NS cases, the identification of other individuals with CL/P in the family should always be interpreted with caution. Due to the genetic heterogeneity associated with NS CL/P, a family with several affected individuals can actually represent the segregation of a single-gene disorder, which would not be promptly recognized based solely on clinical evaluation. For example, among 102 families with at least two individuals affected by NS CL/P, we identified 4 families with pathogenic mutations in IRF6, which actually represented VWS cases. Due to the high prevalence of VWS, we thus recommend IRF6 genetic testing in familial cases of NS CL/P [99].

CL/P is a complex group of disorders, and adequate genetic management of the family requires evaluation by a trained group of geneticists in order to best define the diagnosis of the affected propositus, the prognosis, the surgical indications, and, finally, the recurrence risk estimates for the individuals at risk. With the progress of genomic technology, we expect that a better understanding of the genetic mechanisms leading to CL/P will be achieved in the upcoming years.

Glossary

Association Analysis: correlates the occurrence, in two groups of individuals (e.g., affected and unaffected), of one genetic variant with the phenotype. If the frequency difference of one genotyped variant is statistically significant between the two groups, the genomic region harboring the variant will be associated with the trait. This approach is better suited to identify common and low impact genetic variants of shared origin.
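As a minimal sketch of the comparison described above, the following Python code runs a Pearson chi-square test on a 2 × 2 table of allele counts in affected and unaffected individuals and reports the odds ratio. The function name and the counts are hypothetical illustrations, not data from any study cited here, and real association studies also handle genotype-level models, population stratification, and multiple-testing correction, which this sketch ignores.

```python
import math


def allelic_association(case_a, case_b, ctrl_a, ctrl_b):
    """Pearson chi-square test (1 degree of freedom) on allele counts.

    Returns (chi2, p_value, odds_ratio) for allele A versus allele B
    in cases versus controls.
    """
    table = [[case_a, case_b], [ctrl_a, ctrl_b]]
    row_totals = [sum(row) for row in table]
    col_totals = [case_a + ctrl_a, case_b + ctrl_b]
    n = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column must contain at least one allele")

    # Sum of (observed - expected)^2 / expected over the four cells.
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))

    denominator = case_b * ctrl_a
    odds_ratio = (case_a * ctrl_b) / denominator if denominator else math.inf
    return chi2, p_value, odds_ratio


# Hypothetical allele counts: 300 A / 700 B in cases, 200 A / 800 B in controls.
chi2, p_value, odds_ratio = allelic_association(300, 700, 200, 800)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2e}, OR = {odds_ratio:.2f}")
```

With these made-up counts the odds ratio is about 1.71, meaning allele A is more frequent among cases; in a genome-wide setting the p-value would additionally have to pass a much stricter significance threshold than in a single-marker test.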

Exome Sequencing: sequencing focused on the 2% of the genome that constitutes the protein-coding genes (exome). Although the exome is a small proportion of the genome, 85% of the high-impact mutations already identified lie within it [100], which makes this approach highly promising.

Genetic Marker: any polymorphic locus of known location that is suitable for gene mapping. Single nucleotide polymorphisms (SNPs), which involve the substitution of one nucleotide, are the most widely used for this purpose (e.g., in GWAS). A large number of SNPs can be analyzed simultaneously with semi-automated equipment and microchips.

GWAS: association analysis at the genomic level. Requires the genotyping of thousands or millions of genetic markers, and has been made possible after advances in the characterization of the human genome (e.g., the Human Genome Project and the HapMap Project (http://www.hapmap.org/)) and automation of genotypic analysis. This strategy is suitable for identifying common low-effect variants without prior hypothesis. Finding association of the trait with a genetic marker does not necessarily mean that the marker is directly involved with the disease; most likely, the chromosomal region harboring this marker also comprises one or more susceptibility factors. Finding the real cause behind the association signal is currently a challenge.

Heritability: fraction of phenotypic variance in a population attributable to genetic factors.
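In symbols, this definition is commonly written as a ratio of variances. The notation below is standard quantitative-genetics notation introduced here for illustration (it is not used elsewhere in this paper), and the decomposition assumes, as a simplification, no covariance or interaction between genetic and environmental effects.

```latex
H^2 = \frac{V_G}{V_P}, \qquad V_P = V_G + V_E
```

Here V_P is the total phenotypic variance in the population, V_G the part attributable to genetic factors, and V_E the part attributable to environmental factors. Under this reading, a heritability estimate of 84% corresponds to V_G / V_P = 0.84 in the population studied.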

Linkage Analysis: approach that searches for genomic regions which cosegregate among affected individuals within a family, by genotyping known genetic markers spread throughout the genome. Powerful to detect genes of high impact, but loci of small or moderate effect are usually missed. Large families with many affected individuals are required.

Polymorphism: genomic locus that admits two or more variants in the population and whose rarest variant has a population frequency greater than 1%.
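The frequency criterion in this definition can be sketched in a few lines of Python. The function name, the dictionary layout, and the allele counts are hypothetical; the 1% threshold is the one given in the definition above.

```python
def is_polymorphic(allele_counts, threshold=0.01):
    """Apply the glossary definition of a polymorphism to one locus.

    allele_counts maps each variant (allele) to the number of times it
    was observed in the sampled chromosomes.
    """
    total = sum(allele_counts.values())
    if total <= 0:
        raise ValueError("at least one allele must be observed")
    # Keep only variants that were actually seen in the sample.
    observed = [count for count in allele_counts.values() if count > 0]
    if len(observed) < 2:
        return False  # a single variant: the locus is monomorphic
    # The rarest observed variant must exceed the frequency threshold.
    return min(observed) / total > threshold


print(is_polymorphic({"A": 950, "G": 50}))   # rarest variant at 5%
print(is_polymorphic({"A": 999, "G": 1}))    # rarest variant at 0.1%
```

The first locus meets the definition and the second does not, because its rarer variant falls below the 1% threshold.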

Whole-Genome Sequencing: sequencing analysis of the whole genome, including coding and noncoding regions.

Acknowledgments

The authors are grateful to Regina de Siqueira Bueno for the assistance with the images, and all the colleagues and patients involved with cleft research. The authors are funded by CEPID/FAPESP and CNPq/CAPES.