Lynch syndrome is mainly characterized by early-onset colorectal and endometrial adenocarcinomas. Over 90% of the causal mutations occur in two mismatch repair genes, MSH2 and MLH1. The aim of this study was to evaluate the age-dependent cancer risk in MSH2 or MLH1 mutation carriers using data from DNA diagnostic laboratories. To avoid overestimation, the evaluation was based on the age-dependent proportion of mutation carriers among asymptomatic first-degree relatives of identified mutation carriers. Data on 859 such eligible relatives were collected from 8 centers; 387 were found to have inherited the mutation. Age-dependent risks were calculated either with a nonparametric approach for four discrete age groups or by assuming a modified Weibull distribution for the dependence of risk on age. Cancer risk was estimated to begin at age 28 (0.68 confidence interval 25–32 years) and to reach nearly 0.70 at 70 years. The risks were very similar for MSH2 and MLH1 mutation carriers. Although the difference was not statistically significant, the risk in males appeared to precede that in females by ten years. This difference needs to be investigated in a larger dataset.
If confirmed, it would indicate that colonoscopic surveillance may need to begin at different ages in male and female mutation carriers.
1. Introduction
Many genetic disorders have been found to exhibit a simple Mendelian inheritance pattern, and advances in the knowledge about their genetic basis have led to the expansion
of DNA testing both for diagnosis and prediction of disease susceptibilities.
In the case of mutations associated with an increased risk of common cancers,
one parameter of major practical importance is the age-dependent cancer risk,
defined as the risk for a mutation carrier of developing a tumor prior to a
given age. Indeed, a precise knowledge of this parameter is instrumental in the
counseling of individuals who are identified as carriers by genetic testing and
who are faced with different options for cancer prevention or early detection.
A number of methodologies have been developed to estimate penetrance and the
lifetime risk or recurrence risk of cancer-prone individuals. Often, the first
evaluation of cancer risk is performed from symptomatic individuals identified
in large pedigrees used in linkage studies [1]. It has been shown that such a
design leads to severe overestimation of the risk. Alternative evaluation methods, which
tend to reduce such biases, include population-based studies and/or prospective
follow-up of unaffected mutation carriers. These approaches, however, may be
expensive, time consuming, and may require long follow-up in order to provide
sufficient information. Hereditary nonpolyposis colorectal cancer (HNPCC), also
known as Lynch syndrome, is an autosomal dominant condition caused by a mutation
in one of several genes involved in DNA mismatch repair (MMR) [2]. Mutation
carriers have been shown to be at high risk of developing colorectal and
endometrial adenocarcinomas. In
addition, significantly increased risks have been reported for cancers of small
bowel, upper urological tract, stomach, ovary, and biliary tract [3]. Although
at least four MMR genes (MSH2, MLH1, MSH6, and PMS2) have been
implicated in Lynch syndrome, more than 90% of the causative mutations have
been identified in two of them, MSH2 and MLH1. It has been estimated that
the prevalence of mutations in these two MMR genes in the general population of
European origin is between 1 in 500 and 1 in 1000 [4]. The prevalence in
colorectal cancer patients is 2.7% [4]. In studies
that did not correct for the ascertainment of Lynch families, the estimated lifetime
risk of colorectal cancer ranges from 0.68 to 0.82.
A
precise knowledge of the age-dependent risk of cancer for individuals with
deleterious MSH2 and MLH1 mutations is helpful in the
identification and clinical management of families at high risk of colorectal
and endometrial cancers.
However, it has been recognized that evaluation of the cancer risk of HNPCC
individuals performed from symptomatic patients referred to a cancer family
clinic leads to overestimation [1, 5, 6]. We
previously outlined an evaluation method that may be less sensitive
to recruitment bias. It is based on the age-dependent proportion of mutation
carriers observed among asymptomatic offspring of mutation carriers. Applied
to 267 individuals, it yielded age-dependent risks of
first cancer of approximately 0.43 at age 38 and 0.62 at age 51 in mutation
carriers [7]. The recent development of cancer family clinics offers the
potential to generate a large amount of data, thus providing the opportunity of
improving evaluations of cancer risk. Here, we more explicitly present the
method and provide an example of its application by studying data from a total
of 859 asymptomatic offspring of mutation carriers, distributed over an
extended range of age, that have been collected through the contribution of
hospital laboratories which perform genetic testing of MSH2 and MLH1 genes in
France and Switzerland. We also show that the number of observations has to be
substantially increased in order to provide precise estimates.
2. Patients and Methods
2.1. Patients
A
retrospective questionnaire was sent to eight genetic units which offer
germline analysis of MSH2 and MLH1 genes under a Health Ministry
agreement in France and Switzerland. For each genetic
test performed on asymptomatic offspring of mutation carriers, the questionnaire
requested the following information: the disease-causing germline mutation identified
in the proband, described using the international mutation nomenclature
(http://www.hgvs.org/mutnomen/), birth
date, sex, and age at genetic diagnosis. The questionnaire was completed by the
biologist who had validated the predictive tests. No follow-up data on the
corresponding at-risk relatives were required.
2.2. Genetic Testing
In all laboratories that provided data, the presence or absence of the
disease-causing mutation was assessed on genomic DNA extracted from two independent
blood samples, according to the French and Swiss rules for examination of
individual genetic characteristics. Depending on the
type of mutation found in the proband (point mutation or large genomic rearrangement),
either DNA sequencing or quantitative fluorescent multiplex PCR [8] was
performed.
2.3. Risk Calculation
The method is based on the determination of the age-dependent
proportion of mutation carriers observed among asymptomatic first-degree relatives
of mutation carriers at the time of the genetic test. The survival of mutation
carriers is not addressed in this work. At birth, the probabilities that a first-degree
relative of a mutation carrier is or is not a mutation
carrier are approximately equal and are assumed to be equal in the rest of the
study. In these two groups, the proportion of individuals who become symptomatic
with age differs.
Let $F_0(t)$ and $F_1(t)$ be the probabilities that a nonmutation carrier and a mutation
carrier, respectively, are affected by cancer before age $t$.
$F_1(t)$ is also called the cancer risk. The proportions of noncarriers and of carriers
who are still asymptomatic, and can therefore be observed, at age $t$ are $1 - F_0(t)$ and $1 - F_1(t)$.
Therefore, if $p(t)$ denotes the proportion of mutation carriers at age $t$ in a
group of asymptomatic first-degree relatives of mutation carriers,
$$p(t) = \frac{1 - F_1(t)}{[1 - F_0(t)] + [1 - F_1(t)]},$$
it follows that
$$F_1(t) = 1 - \frac{p(t)}{1 - p(t)}\,[1 - F_0(t)]. \tag{1}$$
It is, therefore, possible to evaluate the age-dependent
increased risk of mutation carriers (and, therefore, cancer risk) from the age-dependent
risk of nonmutation carriers and the age-dependent proportion of mutation
carriers among asymptomatic first-degree relatives of mutation carriers. In the
rest of this work, we will assume that $F_0(t)$
remains small so that we have the approximate
relationship:
$$F_1(t) \approx 1 - \frac{p(t)}{1 - p(t)}.$$
A nonparametric
estimate of the cancer risk can be obtained as $\hat{F}_1(t) = 1 - \hat{p}(t)/[1 - \hat{p}(t)]$,
where $\hat{p}(t)$ is an estimate of $p(t)$ approximated by $n_1/(n_0 + n_1)$,
where $n_1$ (and $n_0$) is the number of mutation carriers (and noncarriers) sampled in
the age group containing $t$.
Thus, $F_1(t)$ is estimated as $1 - n_1/n_0$,
with a variance computed according to the method of [9].
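The nonparametric estimate for a single age group can be illustrated with a minimal Python sketch. It assumes the relationship that follows from the definitions, namely risk = 1 − (n1/n0)(1 − F0), which reduces to 1 − n1/n0 when the noncarrier risk F0 is negligible. The function name, the optional `f0` argument, and the counts in the usage note are hypothetical. The standard error is a first-order (delta-method) approximation under a binomial model for the carrier proportion; this is an assumption made here and not necessarily the exact procedure of [9].

```python
import math

def nonparametric_risk(n_carriers, n_noncarriers, f0=0.0):
    """Estimate the cancer risk of carriers in one age group.

    n_carriers, n_noncarriers: counts of asymptomatic relatives in the group.
    f0: assumed cancer risk of noncarriers at that age (0 = small-risk approximation).
    Returns (risk, standard_error).
    """
    if n_noncarriers <= 0:
        raise ValueError("at least one noncarrier is required")
    ratio = n_carriers / n_noncarriers          # estimates p / (1 - p)
    risk = 1.0 - ratio * (1.0 - f0)
    # Delta-method variance of p/(1-p) for a binomial proportion (assumption):
    # Var = n1 * (n0 + n1) / n0**3
    n_total = n_carriers + n_noncarriers
    var_ratio = n_carriers * n_total / n_noncarriers ** 3
    return risk, (1.0 - f0) * math.sqrt(var_ratio)
```

With hypothetical counts of 40 carriers and 70 noncarriers, `nonparametric_risk(40, 70)` gives a risk of about 0.43 with a standard error of about 0.11, which shows how wide the uncertainty remains for a group of this size.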
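A parametric
estimate based on the logistic regression model can also be proposed. Based on
the definitions of $F_0(t)$ and $F_1(t)$ (the probabilities that a noncarrier and a carrier are affected before age $t$) and of $p(t)$ (the proportion of carriers among asymptomatic relatives at age $t$),
we have the following important relationship:
$$\frac{p(t)}{1 - p(t)} = \frac{1 - F_1(t)}{1 - F_0(t)} \approx 1 - F_1(t).$$
Assuming the
following modified Weibull distribution for the cancer risk, with an age threshold $t_0$ and a rate parameter $\lambda$:
$$F_1(t) = \begin{cases} 0, & t \le t_0, \\ 1 - e^{-\lambda (t - t_0)}, & t > t_0, \end{cases}$$
we observe that
$$\log \frac{p(t)}{1 - p(t)} \approx -\lambda\,(t - t_0)_+, \qquad (t - t_0)_+ = \max(t - t_0, 0).$$
This equation
suggests the familiar logistic regression model. We can consider the following
simple model:
$$p(t) = \frac{1}{1 + e^{\lambda (t - t_0)_+}}.$$
We can find the
maximum likelihood estimate of $(t_0, \lambda)$ by maximizing the likelihood function based on the above
probability model. Once we have an estimate for $(t_0, \lambda)$,
we can obtain the estimate for $F_1(t)$ for any given $t$. We call this estimate the
parametric estimate. To obtain the confidence interval of the parametric
estimate, we use the standard bootstrap method [10]. To compare the cancer risk
between two groups, such as males and females, we used the likelihood ratio
statistic that compares the likelihood assuming the same parametric model
(i.e., a common $(t_0, \lambda)$ for both samples) with the likelihood obtained
by allowing $(t_0, \lambda)$ to vary between the two samples. The
statistical significance of the test can be evaluated through a permutation
procedure by randomly shuffling the group ids (i.e., gender or gene name) among
all subjects.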
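The parametric fit and its bootstrap interval can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes a threshold model in which the probability that an asymptomatic relative of age t is a carrier is 1/(1 + exp(λ·max(t − t0, 0))), so that the carrier risk is 1 − exp(−λ·max(t − t0, 0)). The parameter names `t0` and `lam`, the search grids, the percentile bootstrap, and the `simulate` helper are all hypothetical choices, and a coarse grid search stands in for a proper optimizer.

```python
import math
import random

# Hypothetical search grids for the threshold age and the rate parameter.
T0_GRID = [float(t) for t in range(18, 51, 2)]
LAM_GRID = [0.005 * k for k in range(1, 13)]

def carrier_prob(age, t0, lam):
    """P(carrier | asymptomatic at `age`) under the assumed threshold model."""
    return 1.0 / (1.0 + math.exp(lam * max(age - t0, 0.0)))

def risk(age, t0, lam):
    """Cancer risk of a carrier before `age` under the same model."""
    return 1.0 - math.exp(-lam * max(age - t0, 0.0))

def log_likelihood(data, t0, lam):
    """Bernoulli log-likelihood; data is a list of (age, is_carrier) pairs."""
    total = 0.0
    for age, is_carrier in data:
        p = carrier_prob(age, t0, lam)
        total += math.log(p if is_carrier else 1.0 - p)
    return total

def fit(data):
    """Grid-search maximum likelihood estimate of (t0, lam)."""
    candidates = ((t0, lam) for t0 in T0_GRID for lam in LAM_GRID)
    return max(candidates, key=lambda th: log_likelihood(data, th[0], th[1]))

def bootstrap_interval(data, age, n_boot=100, level=0.68, seed=0):
    """Percentile bootstrap interval for the fitted risk at `age`."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        resample = [rng.choice(data) for _ in data]  # sample with replacement
        estimates.append(risk(age, *fit(resample)))
    estimates.sort()
    low = estimates[int((1 - level) / 2 * (n_boot - 1))]
    high = estimates[int((1 + level) / 2 * (n_boot - 1))]
    return low, high

def simulate(n, t0, lam, seed=1):
    """Simulated asymptomatic relatives: carriers who developed cancer are dropped."""
    rng = random.Random(seed)
    data = []
    while len(data) < n:
        age = rng.uniform(18, 75)
        is_carrier = rng.random() < 0.5
        if is_carrier and rng.random() < risk(age, t0, lam):
            continue  # symptomatic carrier, not part of the asymptomatic sample
        data.append((age, is_carrier))
    return data
```

A typical call would be `data = simulate(200, 28.0, 0.025)`, then `fit(data)` and `bootstrap_interval(data, 53.0)`. The grid search keeps the sketch dependency-free; on real data a numerical optimizer and more bootstrap replicates would be preferable.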
3. Results
Eight
genetic units contributed information on a total of 859 asymptomatic offspring
of mutation carriers: 581 from SO, 116 from CL and QW, 58 from SBS, 44 from
MPB, 21 from PH, 16 from ER, 14 from OC, and 9 from VB. There were 472
nonmutation carriers (233 males and 239 females, aged 18 to 89 years) and 387
mutation carriers. The mutation was located in the MSH2 gene in 183 cases (86 males and 97 females, aged 18 to 73
years) and in the MLH1 gene in
204 cases (84 males and 120 females, aged 18 to 74 years). All mutations were
predicted to result in protein truncation or were missense mutations classified
as deleterious on the basis of functional tests in yeast, cosegregation
analyses, and tumor cell studies. First-degree relatives of index cases
carrying DNA variants of unknown
significance were removed from this study. The 859 unaffected first-degree
relatives of MSH2 and MLH1 mutation carriers were classified
into five age groups (Table 1).
Table 1: Distribution of mutation carriers and noncarriers among 859 unaffected individuals
with one first-degree relative carrying a deleterious MSH2 or MLH1 mutation.
When both the gender and the mutated gene (MSH2 or MLH1) are considered, the number of observations in each group is
too small to enable a nonparametric evaluation of the cancer risk. Pooling
the four groups (MSH2, MLH1, males, and females), we attempted
a nonparametric evaluation of the age-dependent cancer risk. As the pooled age
groups become older, the proportion of mutation
carriers tends to decrease. This decrease is due to the removal from the study
of the mutation carriers who become symptomatic. In the youngest age group, the
number of mutation carriers was larger than that of nonmutation carriers,
an observation that is likely due to the small number of observations and which
suggests that cancer risk is very small in this age group. For the other
groups, the proportion of mutation carriers was smaller than 0.5, and thus
a nonzero cancer risk could be estimated (Figure 1). For instance,
for the age group between 50 and 60, the median age was 53, and the cancer risk
was evaluated at 0.43. We note that the standard deviation of this
evaluation is large.
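The pooled, age-grouped evaluation described above can be sketched in Python as follows. The function name, the record layout, and the age cut points are hypothetical, and the sketch assumes that the noncarrier risk is negligible, so that the risk in each group is 1 − n1/n0. The estimate is set to zero when carriers outnumber noncarriers, mirroring the treatment of the youngest group.

```python
from statistics import median

def risk_by_age_group(records, edges):
    """Nonparametric risk per age group.

    records: list of (age, is_carrier) pairs for asymptomatic relatives.
    edges: ascending age cut points; group i covers [edges[i], edges[i + 1]).
    Returns one dict per non-empty group that contains at least one noncarrier.
    """
    results = []
    for low, high in zip(edges[:-1], edges[1:]):
        group = [(age, carrier) for age, carrier in records if low <= age < high]
        n1 = sum(1 for _, carrier in group if carrier)
        n0 = len(group) - n1
        if n0 == 0:
            continue  # no noncarriers (or empty group): the ratio is undefined
        results.append({
            "ages": (low, high),
            "median_age": median(age for age, _ in group),
            "carriers": n1,
            "noncarriers": n0,
            # Clipped at zero when carriers outnumber noncarriers.
            "risk": max(0.0, 1.0 - n1 / n0),
        })
    return results
```

Reporting the median age of each group, as in Figure 1, gives a natural abscissa for the estimate; the choice of cut points trades age resolution against the number of observations per group.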
Figure 1: Cumulative risk of cancer for
MSH2 and
MLH1 mutation carriers. Individuals
are not differentiated with respect to the mutated gene or gender. The five
horizontal blue lines indicate the risk for each of the 5 groups shown in Table
1. For the youngest group, the risk was evaluated as zero. For the other four
groups, the vertical line is placed at the median age and indicates the 0.68
confidence interval for the evaluated risk. The red curve shows the risk
evaluated assuming that it follows a modified Weibull distribution as a
function of age. The two flanking dotted curves indicate the 0.68 confidence
interval of this evaluation. The age at onset of the risk is evaluated at 28 years
(0.68 confidence interval = 25–32 years).
In an
attempt to obtain a more precise evaluation, we performed a parametric estimate
of the age-dependent cancer risk assuming that cancer risk would be negligible
before an age threshold called
$t_0$, and that, starting from this age, cancer risk would increase according
to a Weibull distribution with a parameter $\lambda$. Weibull distributions are commonly used in survival analyses and
have been applied to parameterize age-dependent cancer risk [11]. Under this model, the maximum likelihood
estimate of $t_0$ is 28 years (68% confidence interval 25–32 years). After
this age, cancer risk rises rapidly and reaches a value of 0.48 (68% confidence
interval 0.42–0.54) at 53
years, and 0.67 (68% confidence interval ) at 70 years. After this age,
the probability that a nonmutation carrier has developed cancer becomes
substantial, so that the model may need to be corrected according to (1). The
number of observations of asymptomatic first-degree relatives older than 70 is
small in our series, and no attempt was made to evaluate cancer risk after this
age.
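As a rough consistency check on these figures, the following Python sketch assumes that the modified Weibull model reduces to an exponential rise after the threshold age, that is, risk(t) = 1 − exp(−λ·(t − t0)) for t above t0. This functional form, the symbol `lam`, and the helper names are assumptions made for illustration; they are not stated explicitly in the text. The rate is solved from the reported risk of 0.48 at 53 years with a threshold of 28 years and then used to extrapolate to 70 years.

```python
import math

def threshold_exponential_risk(age, t0, lam):
    """Risk before `age` under the assumed model: zero up to t0, exponential rise after."""
    return 1.0 - math.exp(-lam * max(age - t0, 0.0))

def rate_from_point(age, risk, t0):
    """Solve 1 - exp(-lam * (age - t0)) = risk for lam."""
    return -math.log(1.0 - risk) / (age - t0)

lam = rate_from_point(53.0, 0.48, 28.0)                  # rate implied by 0.48 at 53 years
risk_70 = threshold_exponential_risk(70.0, 28.0, lam)    # extrapolated risk at 70 years
print(f"lambda = {lam:.4f} per year, risk at 70 = {risk_70:.2f}")
```

Under this assumption the implied rate is about 0.026 per year and the extrapolated risk at 70 years is about 0.67, in line with the reported value.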
A
similar method was applied separately to asymptomatic individuals with first-degree
relatives carrying an MLH1 or an MSH2
mutation. The difference in risk between MSH2 and MLH1 mutation carriers appeared
minimal (Figure 2(a)). When males and females were analyzed separately, the age-dependent
risk for males appeared shifted by ten years as compared to females. Under the
parametric model, the age at onset of the increased risk in males is 23 years
(95% confidence interval 11–27), whereas it is 32
years (95% confidence interval 29–37) in females. However, a permutation test failed to demonstrate statistical significance.
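A permutation test of this kind can be sketched in Python as follows. It is a minimal illustration under stated assumptions, not the authors' code: the carrier probability is modeled as 1/(1 + exp(λ·max(t − t0, 0))), the likelihood is maximized over a coarse hypothetical grid, and the statistic is twice the gain in log-likelihood obtained by fitting the two groups separately. All function names and the grid are hypothetical.

```python
import math
import random

# Hypothetical grid of (threshold age, rate) pairs searched by the fit.
GRID = [(float(t0), 0.005 * k) for t0 in range(18, 51, 2) for k in range(1, 13)]

def log_likelihood(data, t0, lam):
    """Bernoulli log-likelihood of (age, is_carrier) pairs under the assumed model."""
    total = 0.0
    for age, is_carrier in data:
        p = 1.0 / (1.0 + math.exp(lam * max(age - t0, 0.0)))
        total += math.log(p if is_carrier else 1.0 - p)
    return total

def max_log_likelihood(data):
    """Maximized log-likelihood over the grid."""
    return max(log_likelihood(data, t0, lam) for t0, lam in GRID)

def lr_statistic(data, labels, common=None):
    """2 * (log-likelihood with group-specific fits - log-likelihood with a common fit)."""
    if common is None:
        common = max_log_likelihood(data)
    groups = {}
    for row, label in zip(data, labels):
        groups.setdefault(label, []).append(row)
    separate = sum(max_log_likelihood(rows) for rows in groups.values())
    return 2.0 * (separate - common)

def permutation_p_value(data, labels, n_perm=200, seed=0):
    """Share of label shufflings whose statistic reaches the observed one."""
    rng = random.Random(seed)
    common = max_log_likelihood(data)   # unchanged by shuffling the labels
    observed = lr_statistic(data, labels, common)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if lr_statistic(data, shuffled, common) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)    # add-one correction avoids a p-value of zero
```

Shuffling the group labels keeps the ages and carrier statuses fixed, so the reference distribution reflects only the arbitrariness of the grouping. Because the same grid is used for the common and the group-specific fits, the statistic is never negative.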
Figure 2: Cumulative risk of cancer for mutation carriers when individuals are
differentiated with respect to the mutated gene (part (a)) or gender (part (b)).
The risk estimation was only attempted under the assumption that the risk
follows a modified Weibull distribution as a function of age.
4. Discussion
There
is a clear need to improve our estimation of the age-dependent cancer risk for
many genetic diseases. This is especially important for those conditions that predispose to
cancer as this knowledge may influence the definition of the best surveillance
protocol. The development of presymptomatic DNA diagnostic tests offers an
opportunity to improve this knowledge. However, we need to apply methods that
are less prone to biases than those based on the age at onset in
symptomatic individuals referred to cancer family clinics [11, 12].
The
evaluation method discussed in this paper requires a set of data that are
collected in a two-stage process. In the first stage, symptomatic individuals
are referred to a clinic and the deleterious mutations are identified.
Importantly, the age at onset of the symptomatic individuals collected at this
stage is not used to evaluate cancer risk, as it is well known that such a
procedure may lead to major overestimation. In the second stage, asymptomatic first-degree
relatives of individuals with an identified mutation are recruited and tested
to determine their mutation status. Cancer risk is evaluated only
from the age-dependent
proportion of gene carriers in this group of asymptomatic first-degree
relatives. This method shares some of the potential biases that may be observed
in population-based studies. Highly penetrant mutations are likely to
contribute more than low-penetrance mutations, as the former are more
readily detected than the latter. Also, mutations that lie in chromosomal
regions investigated by routine DNA diagnostic techniques (mainly
exonic point mutations or large genomic rearrangements) have been
preferentially included in the study. Thus, the group of mutations that have been evaluated for cancer
risk may not be representative of the mutation spectrum that is present in the
population. However, for the group of mutations that have been identified, the
method appears minimally biased. This lack of bias stems from the requirement
that the individuals included in the study should be asymptomatic.
The
present method requires a large number of observations. With the present
dataset, the variance of our estimate is large. The estimate is barely informative when a
nonparametric evaluation method is used (e.g., when individuals are pooled
into 10-year age groups). Assuming a modified Weibull distribution for cancer
risk makes it possible to reduce this variance at the cost of minimal assumptions.
Simulation studies indicate that a 4-fold increase in the number
of observations would narrow the confidence interval by a factor of 2 (results
not shown). With the development of presymptomatic DNA diagnostic tests, such numbers
should be obtainable in the near future at little cost.
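The factor of 2 is what the usual large-sample behavior of an estimator would predict. As a sketch, assume that the standard error of the risk estimate scales as the inverse square root of the number of observations $n$, with a constant $c$ that depends on the model and the age considered. Both the scaling and the constant are assumptions made for illustration; the simulations themselves are not reproduced here.

```latex
\mathrm{SE}(n) \approx \frac{c}{\sqrt{n}}
\quad\Longrightarrow\quad
\frac{\mathrm{SE}(4n)}{\mathrm{SE}(n)}
  = \frac{c/\sqrt{4n}}{c/\sqrt{n}}
  = \frac{1}{2}
```

Under this scaling, halving the width of the confidence interval obtained from 859 relatives would require roughly 3,400 relatives.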
In
the present work, we have applied the method to the evaluation of the age-dependent
cancer risk of mutations in the MSH2 and MLH1 genes associated with Lynch
syndrome. The resulting evaluation of the age-dependent cancer risk is
consistent with those that have been previously published based on population-based
studies [13]. It confirms that the previous evaluations based on the age at
onset of retrospectively included symptomatic individuals were overestimates. It
also indicates that, as previously proposed, the cancer risks associated with MLH1 and MSH2 mutations
are similar [14]. An analysis distinguishing gender also suggests that the onset
of the increased risk occurs ten years earlier in males than in females. This
observation, if confirmed, suggests that colonoscopic surveillance in males
may have to be started earlier than in females, possibly leading to changes in
the standard guidelines [15, 16]. Also, if it
is confirmed that cancer risk is lower at all ages in females than in males, it
would imply that the colorectal cancer risk may be much smaller in females than in
males, since females are also at high risk of endometrial cancer. Similar
observations have been recently published in the literature [12, 14].
Besides
Lynch syndrome, it would be of interest to apply this approach to other predisposing
conditions whose first manifestations are not present at birth and which
may have irreversible deleterious consequences when the diagnosis is delayed
until symptoms appear. These include not only cancer-predisposing conditions such as those
associated with BRCA mutations [17, 18] but also,
possibly, conditions associated with other degenerative processes (neurological
or metabolic).
Acknowledgments
This
work was supported by the Institut National du Cancer and the French Ministry of
Health. SO coordinates the MMR Genes Testing National Network.