Mismatch repair genes founder mutations and cancer susceptibility in Lynch syndrome

Founder mutations in specific populations are common in several Mendelian disorders. They are shared by apparently unrelated families that inherited them from a common ancestor who lived hundreds to thousands of years ago. They have been shown to influence molecular diagnostic strategies in specific populations, where they can be tested as the first screening step and, if positive, avoid further expensive gene scanning. In Lynch syndrome (LS), a dominantly inherited colorectal cancer syndrome, more than 50 founder pathogenic mutations have been described so far in the mismatch repair (MMR) genes (MLH1, MSH2, MSH6 and PMS2). Here we provide a comprehensive summary of the founder mutations found in the MMR genes and an overview of their main characteristics. At a time when high‐throughput strategies are being introduced in the molecular diagnostics of cancer, genetic testing for founder mutations can complement next generation sequencing (NGS) technologies to identify MMR gene mutations most efficiently in any given population. Additionally, special attention is paid to MMR founder mutations of anthropological significance.

The year 2013 marked the hundredth anniversary of the first description of the cancer predisposing syndrome initially known as cancer family syndrome, then hereditary non-polyposis colorectal cancer (HNPCC), and finally Lynch Syndrome (LS) (1,2). LS is the most common cause of hereditary colorectal cancer (CRC), showing an autosomal dominant inheritance pattern. In addition to CRC, LS is characterized by an increased risk of malignancies at certain extracolonic sites such as the endometrium, ovary, stomach and small bowel, among others (3). LS is caused by germline mutations in one of the DNA mismatch repair (MMR) genes (MLH1, MSH2, MSH6, and PMS2) (4)(5)(6)(7). Constitutional epimutations, characterized by soma-wide allele specific promoter methylation and transcriptional silencing of MLH1 and MSH2 (in the latter gene the transcriptional inactivation is secondary to deletions in the neighboring EPCAM gene), also trigger LS (8). Therefore, MMR deficiency is the characteristic signature of LS tumors, and provides us with two useful identification tools: microsatellite instability (MSI) and loss of immunohistochemical (IHC) staining of the MMR proteins (9). Systematic tumor testing by MSI or IHC for all patients with CRC (or CRC <70 years) and all patients with endometrial cancer (EC) (or EC < 70 years) has been recommended for the identification of patients with LS (10).
As a result of ∼20 years of mutation analysis worldwide, more than 1000 unique DNA variants have been reported for each of the two major LS genes, MLH1 and MSH2, plus several hundred for MSH6 and PMS2 (11). When the individuals carrying the mutations are considered, the numbers are up to six times higher (11), implying that mutations tend to recur across populations. Two situations cause this recurrence of mutations among unrelated families. On the one hand, in the so-called recurrent mutations, sequence peculiarities can predispose to an abnormal event at meiosis, causing a mutation to arise repeatedly de novo (e.g. c.942+3A>T in MSH2) (12,13). On the other hand, the so-called founder mutations arose in a single ancestor who subsequently passed them on to succeeding generations. To prove that a particular mutation is a founder, it is necessary to haplotype several markers [single nucleotide polymorphisms (SNPs) or microsatellites] surrounding the mutation in both carriers and non-carriers. If all carrier individuals from the different families share a common haplotype that is infrequent in non-carriers, we can conclude that the mutation most probably originated in a single founder individual who spread it. Because recombination erodes the ancestral haplotype in each generation, we expect that the shorter the shared haplotype, the older the mutation. By haplotyping cases and controls, we can also estimate the age of founder mutations using one of the several mathematical models described (14)(15)(16)(17)(18)(19)(20)(21). It is founder mutations that are the subject of this review.
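The relationship between haplotype decay and mutation age can be illustrated with a minimal sketch. It assumes one simple linkage-disequilibrium decay model, g = ln(δ) / ln(1 − θ), where δ is the excess of the ancestral marker allele among carrier chromosomes and θ is the recombination fraction between the marker and the mutation. This is not necessarily the model used in any of the studies cited above, and the function names, the 25-year generation time and the input values are hypothetical choices made only for illustration.

```python
import math


def founder_age_generations(p_carriers, p_controls, theta):
    """Estimate the age of a founder mutation in generations.

    p_carriers: frequency of the ancestral marker allele on carrier chromosomes
    p_controls: frequency of the same allele on non-carrier chromosomes
    theta:      recombination fraction between marker and mutation
    """
    if not 0.0 < theta < 0.5:
        raise ValueError("theta must lie strictly between 0 and 0.5")
    if not 0.0 <= p_controls < 1.0:
        raise ValueError("p_controls must lie in [0, 1)")
    # delta measures how much of the original association is still preserved
    delta = (p_carriers - p_controls) / (1.0 - p_controls)
    if not 0.0 < delta <= 1.0:
        raise ValueError("marker allele is not enriched among carriers")
    # Each generation the association decays by a factor (1 - theta)
    return math.log(delta) / math.log(1.0 - theta)


def founder_age_years(p_carriers, p_controls, theta, years_per_generation=25.0):
    """Convert the estimate to years (generation time is an assumption)."""
    generations = founder_age_generations(p_carriers, p_controls, theta)
    return generations * years_per_generation


# Hypothetical marker: allele on 80% of carrier chromosomes, 10% of control
# chromosomes, located at a recombination fraction of 0.01 from the mutation.
example_generations = founder_age_generations(0.80, 0.10, 0.01)
example_years = founder_age_years(0.80, 0.10, 0.01)
print(f"{example_generations:.1f} generations, about {example_years:.0f} years")
```

In this toy model, a marker that has lost more of its association with the mutation (smaller δ) yields an older estimate, which is the quantitative counterpart of the statement that shorter shared haplotypes indicate older mutations.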
Two alternative mechanisms of founder mutation enrichment are believed to operate. Some founder mutations could be positively selected because they confer an advantage (22). In other cases, in isolated populations without significant gene flow, a new mutation can be enriched by genetic drift. In such cases, the mutation could have appeared de novo or could have arrived with the new founder individuals (founder effect). Because genetic drift is enhanced in small populations, we find numerous examples of founder mutations in relatively isolated regions (e.g. Quebec, Newfoundland, Tenerife Island in Spain), countries (e.g. Finland, Iceland, Netherlands) or ethnic groups such as the Ashkenazi Jews (AJ). The tradition of marriage within the community during the last millennium has led to the genetic isolation of the AJ. Their unusually high prevalence of disease-associated mutations is consistent with the occurrence of several founding events, repeated bottlenecks and dramatic expansions throughout their history (23). Founder mutations have also been discovered in large, genetically heterogeneous populations (e.g. Europe, North America).
Founder mutations can cause recessive diseases such as hemochromatosis (24), beta thalassemia (25), cystic fibrosis (26) or xeroderma pigmentosum (27), among others. In dominant predisposing diseases, founder mutations usually exist when the age of onset of the disease is past the reproductive age, so they are not eliminated because of reduced reproductive fitness. This is the case for some cancer-predisposing founder mutations that have been found, among others, in the BRCA1 and BRCA2 genes causing hereditary breast and ovarian cancer syndrome (28), in the CDKN2A gene causing cutaneous malignant melanoma (29), or in the MMR genes causing LS (30). In the latter case, more than 50 founder mutations have been identified so far. In this review, we aim to summarize all the published founder mutations causing LS, paying special attention to those with clinical or anthropological relevance.

Founder mutations in specific regions
At least 55 founder mutations causing LS have been described so far (Tables 1-3). Although some mutations have been seen in only a couple of unrelated families, their single origin is supported by haplotype analysis in cases and controls. In other cases, founder mutations turn out to be very common in specific countries or areas, making them a very useful tool in regional genetic screening for LS.
The clearest example of this is probably the case of the two Finnish MLH1 founder mutations (c.454-1G>A and exon 16 deletion) that together account for ∼50% of LS families in Finland (31,32). In 1996, Moisio and colleagues performed genealogical and haplotype analyses in families carrying these two mutations; they were not only the first to describe the existence of founder mutations in LS, but were also able to estimate their age. Numerous other founder mutations present in a relatively high proportion of LS families in particular areas have been described. In Spain, the c.306+5G>A and c.1865T>A MLH1 mutations represent 17.6% of MMR mutations in a specific series of families residing in Catalonia (33). Around 25% of Danish families that fulfill the Amsterdam criteria are carriers of a founder splicing mutation (c.1667+2_1667+8del7ins4) in the MLH1 gene (34,35). Twenty percent of the LS families identified in a series from Hong Kong carry the novel c.1452_1455del MSH2 mutation (36). In Portugal, Pinheiro et al. described two founder mutations that have proved to be very prevalent: a large deletion comprising exons 17-19 of the MLH1 gene and exons 26-29 of the contiguous LRRFIP2 gene, and the c.388_389del in the MSH2 gene, which represent 17% and 16% of LS-causing mutations in their series, respectively (37,38). Interestingly, the c.388_389del in the MSH2 gene appeared independently in families from Germany, Scotland, England and Argentina, who did not share the Portuguese haplotype, which led the authors to conclude that it may be a mutational hotspot within the MSH2 gene, probably due to a short repeat motif (TCTCTCTC) located upstream of the deletion (37). Similarly, in the province of Newfoundland (Canada) the c.942+3A>T MSH2 mutation was found in 11 families. Although this mutation is the most common recurrent mutation in the MMR genes, repeatedly arising de novo because of misalignment at replication or recombination caused by the presence of 26 adenines (13,39), Newfoundland carriers share a common ancestral haplotype not present in carriers from England, Italy, Hong Kong or Japan (12,39). Therefore, this recurrent mutation also appeared with a founder effect in Newfoundland, where it is present in 27% of LS families (12).
The presence of a higher rate of founder mutations in a population can distort the generally accepted distribution of mutations among the MMR genes [32% in MLH1, 38% in MSH2, 14% in MSH6, and 15% in PMS2 (40)]. For example, the existence in a Spanish series of two MSH2 founder deletions (exon 4-8 and exon 7) doubles the rate of mutations in this gene compared to MLH1 (41). Also, a substantial enrichment of Sardinians was seen among the patients carrying MSH2 deletions (or more precisely, exon 8 deletions) in an Italian series (10/13 exon 8 deletions) (42). Further investigations proved the founder effect of two exon 8 deletions, carried by 7 and 2 of the 10 families, respectively. The carriers of each deletion shared the same breakpoints and haplotype. Given the presence of these two founder mutations, 50% of Sardinian LS families carry a mutation in the MSH2 gene (42). In the same vein, in the Netherlands more than half of all LS mutations are in MSH6. This is probably accounted for by the existence of two common founder mutations in this gene (c.467C>G and c.1614_1615delinsAG) (43).
a Estimation of the years that had passed since the most common ancestor appeared. b Apparently unrelated families described in the referred paper.
The authors of this study also discussed whether the high frequency of MSH6 mutations could be attributed to the use of less stringent clinical inclusion criteria than in other studies. They argued that the later age of onset of disease caused by mutations in MSH6 may lead to a persistence of founder mutations in this gene, as they had seen in their series (43). A similar hypothesis has been proposed for the PMS2 gene by Tomsic et al. (44), who observed only 36 distinct mutations in a sample of 61 independently ascertained Caucasian probands of mixed European background with PMS2 mutations. Six of these mutations were detected in more than two individuals and accounted for 51% of the ostensibly unrelated probands. By haplotyping, they found that two mutations (c.137G>T and exon 10 deletion) were founders and a third one (c.1A>G) was a probable founder (44). They suggested that the two MMR genes with the lowest penetrance (PMS2 and MSH6) may also share the property of having frequent recurrent or founder mutations, which are perhaps more abundant in PMS2 (44). Although this is a plausible hypothesis, considering the founder mutations in the MMR genes that have been described so far (Tables 1-3), we do not notice an increased accumulation in MSH6 and PMS2 compared with MLH1 and MSH2. This could be because the lower penetrance and later age of onset of LS associated with the MSH6 and PMS2 genes (45-47) may mask the presence of mutations in these genes. Additionally, MSH6 and PMS2 were found to be associated with LS later than MLH1 and MSH2; therefore the later establishment of their genetic testing, added to the difficulty of screening PMS2 because of the presence of many pseudogenes, may also be delaying the discovery of founder mutations. More studies will be needed to determine more precisely the distribution of founder mutations among the MMR genes.

Founder mutations in AJ
Mutation spectra and gene distribution among AJ are unique because of their particular genetic isolation. Therefore, founder mutations are commonly seen (48)(49)(50)(51)(52). In LS, three founder mutations have been shown to account for 73% of all MMR gene mutations in a cohort of AJ from Israel, thus again changing the distribution of mutations (75% in MSH2, 18% in MSH6 and only 8% in MLH1) (53). The most common one appears in MSH2 (c.1906G>C), is highly penetrant in the AJ population and dates back 200-500 years (51,54). The other two, present in the MSH6 gene (c.3959_3962del and c.3984_3987dup), were shown to be highly penetrant as well and to confer a higher risk of developing EC than CRC (52). They appeared approximately 1425 and 1325 years ago, respectively (52). In the Israeli cohort of AJ, three patients carried one of the founder mutations in both alleles and developed constitutional mismatch repair deficiency (CMMR-D) (53). CMMR-D is caused by biallelic mutations in the MMR genes and leads to haematological malignancies and tumors of the brain and bowel early in childhood (55). The existence of a founder mutation can be associated with a higher prevalence of the deleterious allele in the population, which can result in an unexpectedly frequent occurrence of biallelic mutations in some populations. Similarly, Clendenning et al. (56) underlined the significant prevalence of a founder PMS2 mutation (c.736_741delins11), which appeared in 1 of 399 controls. Despite its reported reduced penetrance, if the number of heterozygous carriers increases in the population over time, a rise in the number of homozygous carriers presenting with CMMR-D is expected (56).
[Correction added on 9 February 2015, after first online publication: The founder mutation "c.10C>T (p.Gln4X)" was previously omitted and this has now been added in Table 3.]
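The link between carrier prevalence and the expected number of biallelic individuals can be made concrete with a short, hedged calculation. The sketch below assumes Hardy-Weinberg proportions (random mating, no selection), which may not hold in populations with a history of endogamy, and it treats the single observation of 1 carrier in 399 controls as the true heterozygote frequency, although the uncertainty around a single observation is large. The function names are hypothetical and the result is an order-of-magnitude illustration, not a published estimate.

```python
import math


def allele_frequency_from_carriers(carrier_frequency):
    """Solve 2q(1 - q) = carrier_frequency for the rare-allele frequency q."""
    if not 0.0 <= carrier_frequency <= 0.5:
        raise ValueError("heterozygote frequency must lie in [0, 0.5]")
    return (1.0 - math.sqrt(1.0 - 2.0 * carrier_frequency)) / 2.0


def expected_homozygote_frequency(carrier_frequency):
    """Expected frequency of homozygous (biallelic) individuals, q squared."""
    q = allele_frequency_from_carriers(carrier_frequency)
    return q * q


# Carrier frequency taken from the text: 1 heterozygote among 399 controls.
carrier_freq = 1.0 / 399.0
homozygote_freq = expected_homozygote_frequency(carrier_freq)
print(f"expected homozygotes: about 1 in {1.0 / homozygote_freq:,.0f}")
```

Because the homozygote frequency scales with the square of the allele frequency, doubling the proportion of carriers would roughly quadruple the expected number of homozygous individuals under these assumptions.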

Founder mutations in heterogeneous populations
Not all LS founder mutations are limited to specific, relatively isolated regions or communities; they also occur, although rarely, in outbred populations. A remarkable example is the American founder mutation (AFM) in MSH2 (exon 1-6 deletion), which was initially identified in nine families from the United States (57) and subsequently in 32 more (58). It was estimated that 6.8% of LS in the United States is due to the AFM (57), and the mutation was calculated to be carried by 18,981 (95% CI, 6038-34,466) Americans (59). Nevertheless, recent estimates point to a higher prevalence than previously thought, with the mutation being particularly prominent in Ohio, Kentucky, and Texas (58). A second MSH2 exon 1-6 deletion, with different breakpoints and haplotype, has been found to be shared by three families from northern Italy (60). Other founder mutations have been identified in the United States with a lower prevalence.
In the MLH1 gene, a splicing mutation (c.589-2A>G) has been found in 10 unrelated American families on a large shared haplotype, representing 1% of LS mutation carriers diagnosed at the Mayo Clinic (61). The same mutation has also been described as a founder, but on a different haplotype, in Italy (61). Several PMS2 founder mutations have been described in the United States (44,56).

Founder epimutations
Epimutations have also been described to cause LS by two different mechanisms. Germline deletions of the EPCAM gene cause its transcription to extend into the adjacent MSH2 gene; subsequently, the MSH2 promoter, in cis with the deletion, is methylated and MSH2 is therefore inactivated (62). Two founder mutations have been described so far in the EPCAM gene, one in Denmark and the other in Spain (62,63). Transmission of MLH1 gene epimutations has also been reported to occur in both non-Mendelian (64) and autosomal dominant patterns linked to localized cis-acting genetic variants (65,66). Consistent with the latter, five families (two from Western Australia, two from the US and one from the Netherlands) have been shown to carry the same two mutations in MLH1 (c.[−27C>A; c.85G>T]), which confer cancer susceptibility through their propensity for soma-wide epigenetic silencing (67). All families identified with this haplotype are of European ethnicity and, in four of them, sharing of a common haplotype could be proven, providing strong evidence of a common ancestor of European origin (67). Interestingly, by individually comparing haplotypes from one of the Australian families with those of the two American families and the Dutch family, the authors could show that a larger haplotype was shared between the Australian and the US families, suggesting that they are more closely related to one another than to the Dutch family (67).

Founder mutations in the United States
As shown by the latter example, the identification and characterization of MMR gene founder mutations in different and sometimes distant geographical areas allows us to retrace the main migratory patterns that have involved large populations in past centuries. The origins and prevalence of the AFM in the MSH2 gene (exon 1-6 deletion) have been thoroughly studied and discussed by Clendenning et al. (58). Although the mutation was previously thought to have been carried by a putative common ancestor who migrated to the United States from Germany in the early 18th century, a time of significant European immigration (68), the subsequent characterization of haplotypes flanking the mutation in 29 additional families suggested an earlier founding event, around 500 years ago. Twenty-seven of the AFM-carrier families could be linked into seven extended families (subfounder families), each with a common ancestor born between 1700 and the early 1800s. Thus, Clendenning et al. proposed that either the subfounder families were in the United States before this period, so that the mutation was introduced by an early European immigrant or by a Native American, or the mutation originated in Europe but was carried to the US by several individuals during the larger waves of European immigration. The fact that no carriers have been found in Europe better supports the first hypothesis (58). The other AFM, seen in the MLH1 gene in the United States (c.589-2A>G), also exemplifies how studying the haplotypes of founder mutation carriers can yield anthropologically interesting theories (61). This splicing mutation appeared in 10 US and 3 Italian families. In the United States, it occurs on a large haplotype (∼4.8 Mb) that also harbors a missense variant (c.2146G>A, p.V716M), with an approximate age of 450 years. In Italy, the mutation is shared on a shorter haplotype (∼2.2 Mb) that does not carry the p.V716M missense variant.
Interestingly, the p.V716M variant was found by itself in the United States, Germany and Italy, in individuals with a common haplotype of 280 kb, which allowed its age to be estimated (5600 years). Therefore, the authors suggested that p.V716M represents a single, ancient mutational event, and that the splicing mutation arose at least twice: once in an early American immigrant (or in an ancestor of an immigrant) on a chromosome carrying the ancestral p.V716M, and once elsewhere (perhaps in Italy) longer ago (61).

Italian founder mutations
The case of Italian families is interesting because several LS founder mutations reflect their migratory history. In the early part of the 20th century, about four million people emigrated from Italy; half of these migrated to Northern Europe, whereas the great majority of the rest left for the US, Canada, and Australia (69). Specifically, the Italian colonization of the province of Quebec in Canada began in the mid-19th century. Consistent with this, two different MLH1 founder mutations were identified in unrelated families of Italian origin in Quebec (70). The c.1831delAT mutation was described in two Italian-Quebec families sharing the same haplotype (70). The c.545+3A>G splicing mutation was found on the same haplotype in an Italian family from a town close to Naples (71) and in a Quebec family that originates from the region of Italy around Naples, which is consistent with most Italian immigrants to North America coming from Southern Italy (70). This mutation has also appeared in three Brazilian families, but no haplotype or genealogical study was done to link them to the Italian founder mutation (72,73). Similarly, the MLH1 c.2269dup has shown a founder effect in the northern Italian district of Modena and Reggio Emilia by haplotype evaluation (74) and has also been found in an Argentinean family whose ancestors were natives of the Reggio Emilia area (75) (Fig. 1).

Migratory flows within different regions of the same country
Some MMR gene founder mutations exemplify the large immigration waves from Europe to America (12,44,56), whereas other studies show that the current distribution of more local founder mutations can be explained by migratory patterns in geographically localized subsets of populations. Two examples are provided by Borras et al. through the ancestral study of two MLH1 founder mutations in Spain (33). The splice variant c.306+5G>A was found in several families with ancestors coming from the Ebro river valley, in northern Spain. Its age was estimated to be ∼1879 years. Taking into account that the river valley is geographically isolated by mountain ranges, and that the Ebro river was navigable until the 19th century, they hypothesized that the mutation arose somewhere in the valley and was distributed along the river over the years (33). The other founder mutation, c.1865T>A, was carried by families with ancestors in the mountainous province of Jaén, in Southern Spain, and was shown to be of more recent origin (∼384 years). This was consistent with the identified probands being mainly from Madrid and Barcelona, frequent destinations of internal migratory movements during the period 1960-1970 (33). In Hong Kong, 10 families have been shown to share an MSH2 founder mutation, c.1452_1455del, on a common haplotype (36). Interestingly, they all originated from the Guangdong province of southern China, which is the origin of most Hong Kong inhabitants. Given that during the 19th and 20th centuries there were major emigrations from Hong Kong and Guangdong province, this mutation is of interest not only for its founder effect in China, but also for Chinese communities worldwide (36).

Diagnostic, surveillance and management implications of LS founder mutations
It is well known that the presence of founder mutations in a specific geographical area or population can be very helpful in designing cost-effective molecular diagnostic approaches. By testing for a particularly frequent founder mutation as the first step in routine screening for LS, we can avoid further expensive mutational testing in a significant number of samples. The clearest case is probably Finland, where screening for the two founder mutations that account for around half of LS families is the first step in mutation analysis (76). In our experience in Northern Italy, when tumor IHC is suggestive of MLH1 alterations, germline mutation analysis starts with the MLH1 founder mutation typically found in the area (c.2269dup) (74). Raskin et al. proposed that a panel designed to detect the known AJ founder mutations (MSH2 c.1906G>C, MSH6 c.3959_3962del and c.3984_3987dup, APC I1307K, and BLM Ash) could have value as a first-line screen in all AJ CRC and/or EC cases, irrespective of family history, IHC or MSI status (52). The benefit of founders becomes even more evident when the mutations are gross rearrangements. In most of these cases the exact breakpoints have been characterized, which permits detection by a simple, cost- and time-effective PCR test. Along these lines, a test of this kind was proposed for the AFM as a first line of molecular testing in patients whose tumors stain negatively for MSH2 (58). A similar approach is used in Sardinia for the two founder MSH2 exon 8 deletions, which have different breakpoints (42). Pérez-Carbonero et al. also suggested the need to design and implement, as a pre-screening step, a simple, fast, and cheap method to detect their two founder mutations in MSH2 (exon 4-8 and exon 7 deletions) (41).
The standardization of next generation sequencing (NGS) methods and analyses in the past few years is starting to make it possible to introduce this technology into molecular diagnostic strategies, usually through the analysis of gene panels (77). With the future global establishment of NGS in the genetic screening algorithms for LS, founder mutations may partially lose their diagnostic interest; NGS has, in fact, shown promising results in the detection of all classes of MMR mutations, including single nucleotide variants, small insertions and deletions, and large copy number variants (78). However, founder mutations with a high prevalence within specific populations can still represent a useful cost-effective tool before the more expensive NGS approaches are considered.
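The cost argument for founder-first testing can be sketched as simple arithmetic. The example below assumes a two-step strategy in which every sample receives a cheap founder test and only founder-negative samples proceed to full gene analysis. All costs and the hit rate are hypothetical placeholders, not figures from the studies cited here. Note that the hit rate is the fraction of all tested samples that carry a founder mutation, which is lower than the fraction of LS families explained by founders.

```python
def expected_cost_two_step(cost_founder_test, cost_full_test, founder_hit_rate):
    """Expected per-sample cost when a founder test precedes full analysis.

    founder_hit_rate: fraction of ALL tested samples positive for a founder
    mutation; these samples skip the full (e.g. NGS panel) analysis.
    """
    if not 0.0 <= founder_hit_rate <= 1.0:
        raise ValueError("founder_hit_rate must lie in [0, 1]")
    return cost_founder_test + (1.0 - founder_hit_rate) * cost_full_test


def two_step_is_cheaper(cost_founder_test, cost_full_test, founder_hit_rate):
    """True when pre-screening lowers the expected cost per sample.

    Equivalent to: cost_founder_test < founder_hit_rate * cost_full_test.
    """
    two_step = expected_cost_two_step(
        cost_founder_test, cost_full_test, founder_hit_rate
    )
    return two_step < cost_full_test


# Hypothetical units: founder PCR test costs 20, full analysis costs 500,
# and 10% of tested samples carry one of the local founder mutations.
print(expected_cost_two_step(20.0, 500.0, 0.10))
print(two_step_is_cheaper(20.0, 500.0, 0.10))
```

Under these assumptions the pre-screen pays off only when its cost is below the hit rate multiplied by the cost of full analysis, which is why the approach is most attractive in populations where founder mutations are highly prevalent.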
A significant proportion of the mutations found in LS are missense or putative splice variants of unknown clinical significance. Although some algorithms to clarify their putative pathogenic effect have been proposed (30), in a typical routine diagnostic setting it is not affordable to address the effects of each of the variants encountered. When these changes turn out to be founders in a certain population, more research effort has been made to elucidate the effect of the variants, thus improving genetic counseling of carrier families. Through segregation analysis, computational analyses, mRNA processing and stability assays and protein expression experiments, Borras et al. were able to confirm that two MLH1 founder changes from Spain, a splice variant (c.306+5G>A) and a missense variant (c.1865T>A), were pathogenic and the cause of the disease in carrier families (33). However, in most cases a less thorough characterization, without functional assays, has been sufficient to show the causal deleterious effect of the founder variants. In Tenerife Island, the MSH2 c.2063T>G missense variant was considered pathogenic because of the change in polarity caused by the amino acid substitution (p.Met688Arg) and the conservation between species of the affected residue, which belongs to an important functional domain of the protein (79). Apart from these examples, several other criteria and functional assays have been used to determine the pathogenicity of the variants presented in Tables 1-3.
In general, it is accepted that the penetrance of cancer is higher for MLH1 and MSH2 than for MSH6 and PMS2 (47,80) and that EC is more frequently found than CRC in women carrying MSH6 mutations (81). Some founder mutations have been described to cause differential phenotypes. In Italy, the mutation c.2252_2253del significantly increased the risk of pancreatic tumors compared with other MLH1 mutations (82). The two founder mutations described in the MLH1 gene by Borras et al. (c.306+5G>A and c.1865T>A) showed a lower penetrance than other pathogenic Spanish MLH1 mutations (33). In Denmark, the founder mutation c.1667+2_1667+8del7ins4 conferred comparable risks for CRC and lower risks for extracolonic cancer than in the other Danish families with MLH1 mutations (35). In Northern Italy, a genotype-phenotype analysis of the founder mutation c.2269dup revealed a proclivity to multiple tumors arising in the same subject and a higher tumor burden per family compared to other MLH1 or MSH2 mutations (83). However, most of the founder studies either do not thoroughly examine particular clinical characteristics in the carrier families, or report phenotypic patterns similar to those expected. Therefore, although a preliminary search for common founder mutations permits cost-effective and time-saving diagnostic strategies for LS, this cannot yet be translated into tailored cancer control strategies because of the lack of evidence of highly specific phenotypes related to the presence of founder mutations.

Conclusions
The identification of at least 55 founder mutations in LS has helped the diagnosis, surveillance and management of LS patients for almost two decades. In some populations, founder mutations represent up to 50% of LS-causing mutations, which is extremely helpful for the development of cost-effective strategies to diagnose LS. This becomes even more advantageous in the case of founder gross rearrangements, thanks to the possibility of designing simple polymerase chain reaction (PCR) assays to detect them once the breakpoints are known. With the imminent introduction of NGS into the molecular diagnostic algorithms for LS, pre-screening for the highly prevalent founder mutations will help reduce the number of samples undergoing expensive high-throughput sequencing. Although several studies have assessed the phenotypic features of LS founder mutations, no clinical management strategies specific to founder mutations seem to be necessary.