Recently published analyses for four smoking-related diseases show that the declining excess relative risk by time quit is well fitted by the negative exponential model. These analyses estimated the half-life of this excess, that is, the time after quitting when the excess relative risk reaches half that for continuing smokers. We describe extensions of the simple model. One quantifies the decline following an exposure reduction. We show that this extension satisfactorily predicts results from studies investigating the effect of reducing cigarette consumption. It may also be relevant to exposure reductions following product-switching. Another extension predicts changes in excess relative risk following multiple exposure changes over time. Suitable published epidemiological data are unavailable to test this, and we recommend that its validity be investigated using large studies with data recorded on smoking habits at multiple time points in life. The basic formulae described assume that the excess relative risk for a continuing smoker is linearly related to exposure and that the half-life is invariant of age. We describe model adaptations to allow for nonlinear dose-response and for age-dependence of the half-life. The negative exponential model, though relatively simple, appears to have many potential uses in epidemiological research for summarizing variations in risk with exposure changes.

1. Introduction

A huge literature relates smoking to the risk of various major diseases. Although most of the evidence concerns comparison of risk in never and current smokers, many studies show that quitters have intermediate risks. Until now, few researchers have attempted formal modelling of the time pattern of the decline in relative risk (RR) following quitting [1, 2] and then only based on data from individual studies. Recently, however, publications on ischaemic heart disease (IHD) [3], lung cancer [4], chronic obstructive pulmonary disease (COPD) [5], and stroke [6] have fitted the negative exponential model (NEM) to multiple data sets to describe quantitatively the pattern of decline in RR with increasing time quit.

While these publications [3–6] provide important information on the benefits of quitting smoking, they are limited by not more generally considering how RR changes with variation in exposure to tobacco products. Thus, apart from quitting, smokers may change the amount they smoke, switch to a reduced exposure product, or may restart smoking having previously quit. Indeed, their level of exposure may change a number of times during their lifetime, and it would be useful to be able to model how RR varies over time for complex smoking histories. Not only would such a model allow a more comprehensive description of the risks associated with differing lifetime patterns of smoking, it may also be useful when attempting to predict the health impacts of novel products.

This paper describes extensions to the NEM to allow modelling of complex smoking histories. Section 2 gives a formal definition of the NEM; Section 3 then briefly summarizes material from the published papers on quitting [3–6] concerning the methodology used, the goodness-of-fit of the data for the different diseases studied, and evidence on between-study heterogeneity. Section 4 describes a first extension to the NEM which applies to reducing cigarette consumption or switching to a reduced exposure product. A recent review [7] provides estimates of the reduction in lung cancer risk following a reduction in cigarette consumption, and it is shown that the NEM extension fits these data well. Section 5 describes a further extension of the NEM to allow for multiple exposure changes and illustrates it by some examples. No published epidemiological data are available to verify this, and we suggest that its validity might be tested using large data sets which have recorded changes in smoking habits at multiple time points. The methods described up to this point assume that exposure is linearly related to the excess relative risk (ER) and that the rate of its decline with time is invariant of age. Section 6 describes adaptations of the NEM which allow for nonlinearity of the dose-response and for the rate of decline in ER to vary with age.

The methods described and justified here give fuller detail than those summarized recently by Weitkunat et al. [8] when outlining a novel approach to assess the population health impact of introducing a modified risk tobacco product (MRTP). In that approach two sets of simulated individual smoking histories are compared, one scenario being derived on the assumption that the MRTP is introduced into the population and the other scenario derived assuming that it is not. The histories are derived from assumptions concerning the extent to which smokers will start smoking the two types of product (conventional cigarettes or MRTPs), will switch from one product to another, or will quit. The RR of the major smoking-related diseases (compared to never smokers) is then estimated for each individual’s smoking history using the formulae described here, and average risks over individuals for each scenario are then used to estimate the reduction in deaths attributable to the assumed level of adoption of the MRTP.

2. The Negative Exponential Model

The NEM predicts that the smoking-associated ER declines with time quit, so that if $\mathrm{ER}_C(a)$ is the ER in current smokers compared to never smokers, where $a$ is age in years, then $\mathrm{ER}_Q(a,t)$, the ER in quitters of the same age, $t$ years after quitting, is given by

$$\mathrm{ER}_Q(a,t) = \mathrm{ER}_C(a)\exp(-\lambda t), \tag{1}$$

where $\lambda = \ln 2/H$ and $H$ is the “half-life,” that is, the time since quitting when the ER declines to half that associated with continuing smoking.
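As a minimal sketch of the definition above, the following Python function evaluates the ER in a quitter from the ER in a continuing smoker of the same age, the time quit, and the half-life. The function name and the input values (an ER of 10 and a half-life of 10 years) are illustrative assumptions, not estimates from any study.

```python
import math


def excess_risk_after_quitting(er_current: float, years_quit: float,
                               half_life: float) -> float:
    """ER in a quitter under the NEM.

    er_current: ER in continuing smokers of the same age.
    years_quit: time since quitting, in years.
    half_life:  H, the time for the ER to fall to half that of
                continuing smokers.
    """
    lam = math.log(2) / half_life  # decay rate, lambda = ln 2 / H
    return er_current * math.exp(-lam * years_quit)


# Illustrative values only: ER = 10 in continuing smokers, H = 10 years.
for years in (0, 5, 10, 20, 30):
    print(years, round(excess_risk_after_quitting(10.0, years, 10.0), 3))
```

By construction, the ER equals that of continuing smokers at the time of quitting and is halved after each further $H$ years.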

3. Quitting Smoking

Studies which relate risk to time quit typically present RRs and 95% confidence intervals (CIs), relative to never smokers, for current smokers and for quitters by grouped period of time quit. Some studies present RRs and CIs for never smokers and quitters, expressed relative to current smokers. Typically, the RRs are adjusted for potential confounding variables. As described in detail elsewhere [3], fitting the NEM requires a pseudotable to be estimated (using the method of Hamling et al. [9]). This pseudotable consists of numbers of subjects subdivided by smoking habit and disease status which correspond to the observed RRs and 95% CIs. For prospective studies the numbers are of cases and at risk, and for case-control studies they are of cases and controls. Using these data and estimates of $t_i$, the midpoint time quit for smoking group $i$ (taken to be zero for current smokers and infinite for never smokers), maximum likelihood methods then allow fitting of the model. For prospective studies, the model is

$$p_i = \alpha + \beta\exp(-\lambda t_i), \tag{2}$$

where $p_i$ is the absolute risk of disease at time quit $t_i$ and $\alpha$, $\beta$, and $\lambda$ are parameters to be estimated. Here $i$ references smoking group (current smokers, quitters by time quit, or never smokers), $\alpha$ is the absolute risk in never smokers, and $\alpha + \beta$ is the absolute risk in current smokers. $H$ is equal to $\ln 2/\lambda$. For case-control studies, the model differs as follows:

$$r_i = 1 + \gamma\exp(-\lambda t_i). \tag{3}$$

Here $r_i$ is the RR (versus never smokers, estimated by the odds ratio) not the absolute risk ($p_i$), and $\gamma$ is the ER, rather than the increase in absolute risk ($\beta$), in current smokers. The interpretation of $H$ is the same as for prospective studies.

Lee et al. [3] describe methods for testing goodness-of-fit, carrying out meta-analysis of estimates of $H$, and testing heterogeneity. They also describe sensitivity analyses which attempt to account for “reverse causation” (short-term quitters having an increased risk as quitting can be prompted by symptoms from undiagnosed disease), either by omitting short-term quitters or by reclassifying them to be current smokers. The methodology has recently been applied to all available data sets for four major smoking-related diseases. The analysis for IHD [3] was based on 41 independent blocks of RRs from 23 studies [10–32], that for COPD [5] on 14 blocks from 11 studies [23, 33–42], and that for stroke [6] on 22 blocks from 13 studies [15, 19, 20, 23, 25, 37, 38, 43–48]. The analysis for lung cancer [4] was based on the most data, involving 106 blocks of RRs from 85 studies, 30 in the USA or Canada [23, 45, 49–76], 25 in Europe [10, 13, 30, 34, 77–97], and 30 from other or multiple countries [19, 37, 38, 72, 98–122]. (Note that one publication [38] provided separate data for two different Australian studies, while another [72] provided separate data for a US and a Japanese study.) Based on the combined results, various conclusions can be made, which are summarized below.

3.1. Failure to Fit

The method may fail to converge to a valid solution ($H > 0$) or may converge to a solution with a huge variance, when a decline in risk following quitting is not clearly evident. This most often occurs when the current smoker RR is not elevated or is only minimally elevated. Failure (leading to exclusion from further analysis) occurred for none of 106 lung cancer data sets, one of 41 IHD sets, where the current smoker RR was 1.0, and one of 14 COPD sets, where no risk decline following quitting was evident. Failure often occurred for stroke, where current smoker RRs were below 1.4. Here, our meta-analysis used only those 11 (of 22) data sets which showed a stronger relationship with smoking.

3.2. Goodness-of-Fit

Where convergence did occur, goodness-of-fit, as assessed by comparison of observed and fitted numbers, was adequate for every COPD and stroke data set and for nearly all IHD sets, except where the decline in risk was clearly nonmonotonic. For lung cancer, model fit was poor when reverse causation was ignored but was much improved in sensitivity analyses omitting short-term quitters or counting them as current smokers. However, some misfit remained, seemingly resulting from odd data patterns in individual studies rather than from any systematic misfit.

3.3. Estimates of Half-Life

Meta-analysis estimates of the half-life $H$ (years) were larger for COPD, 13.32 (95% CI 11.86–14.96), and lung cancer, 9.93 (9.31–10.60), than for IHD, 4.40 (3.26–5.95), or stroke, 4.78 (2.17–10.50). It has been shown more recently [123] that the half-life is somewhat greater for adenocarcinoma than for squamous cell carcinoma of the lung, with the ratio of the $H$ values for the two cell types estimated as 1.32 (1.20–1.46).
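To make these half-lives concrete, the short Python sketch below converts each meta-analysis point estimate into the fraction of the continuing-smoker ER that the NEM predicts to remain after a given time quit. It uses only the point estimates quoted above and ignores their confidence intervals; the function and variable names are illustrative.

```python
# Meta-analysis point estimates of the half-life H (years) quoted in the text.
HALF_LIFE_YEARS = {
    "COPD": 13.32,
    "lung cancer": 9.93,
    "IHD": 4.40,
    "stroke": 4.78,
}


def fraction_of_excess_remaining(years_quit: float, half_life: float) -> float:
    """Ratio of the ER in quitters to the ER in continuing smokers.

    Equivalent to exp(-ln(2) * t / H), written as a power of one half.
    """
    return 0.5 ** (years_quit / half_life)


for disease, half_life in HALF_LIFE_YEARS.items():
    row = [round(fraction_of_excess_remaining(t, half_life), 2)
           for t in (1, 5, 10, 15)]
    print(f"{disease:12s} H = {half_life:5.2f}  fraction at 1, 5, 10, 15 years: {row}")
```

For IHD, for example, a half-life of 4.40 years implies that roughly 85% of the excess remains one year after quitting, which helps in comparing the fitted decline with qualitative statements in earlier reviews.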

These estimates provide more precise information on the pattern of the decline in ER following quitting than that found in various authoritative reviews, for example, [15, 124–126]. Thus, for example, the International Agency for Research on Cancer (IARC) [126] notes for IHD that, though the data were heterogeneous, “the body of evidence points toward the risk of CHD asymptotically approaching the risk of never smokers.” While imprecise, this conclusion seems correct. The same cannot be said of an earlier report, by the US Surgeon-General [15], which stated that “the excess risk of CHD caused by smoking is reduced by about half after 1 year of smoking abstinence and then declines gradually. After 15 years of abstinence, the risk of CHD is similar to that of persons who have never smoked.” Here the indicated decline in ER following quitting is much more rapid than indicated by the overall data currently available.

For lung cancer, the US Surgeon-General [15] stated that “after 10 years of abstinence, the risk of lung cancer is about 30–50% of the risk for continuing smokers; with further abstinence, the risk continues to decline.” While not incorrect, the statement does not fully quantify the decline in ER with time.

As far as we are aware, there have been no attempts, for any of the four diseases considered, to formally quantify the shape of the decline in risk following quitting based on all the available epidemiological evidence.

3.4. Sensitivity Analyses

Although improving the fit, the sensitivity analyses had little effect on the meta-analysis estimates of $H$. This was unsurprising as, for each data set, the estimated $H$ was usually close to the time at which the ratio of ERs in quitters and continuing smokers fell below 50%. This occurred longer after quitting than the period in which the short-term increases were seen.

3.5. Heterogeneity

Heterogeneity between individual $H$ estimates for a disease was assessed by likelihood-ratio tests. Significant heterogeneity was not seen for COPD but was seen for the other diseases, markedly so for IHD (where $H$ was estimated as <2 years for 10 of 40 data sets and as >10 years for 12) and for stroke. For lung cancer, $H$ was somewhat higher in males and in older populations. For IHD and stroke, a relationship of $H$ to age was also evident.

For all four diseases, the NEM was found to provide a simple method for summarizing quitting data, generally fitting it well. Misfits seen were mainly due to unusual results, which no plausible model could be expected to fit, rather than to any systematic misfits. However, the evidence of a greater $H$ in older populations, seen for three of the four diseases studied, suggests that some adaptation of the simple NEM may be required to allow for this. This is described later.

4. Reducing Cigarette Consumption

Quitting can be regarded as reducing the effective exposure from 1 to 0 units. Assuming that the ER is linearly related to exposure, the NEM can be extended to estimate the ER, $\mathrm{ER}_R(a,t)$, at age $a$, $t$ years after a reduction in exposure from 1 to $F$ units, by

$$\mathrm{ER}_R(a,t) = \mathrm{ER}_C(a)\,[F + (1 - F)\exp(-\lambda t)]. \tag{4}$$

Here $\mathrm{ER}_C(a)$ is the ER in continuing smokers of age $a$, $\lambda = \ln 2/H$ with $H$ the half-life, and $t$ is the time since the reduction occurred. When $F = 0$, formulae (1) and (4) are identical. Formula (4) can apply, not only to reducing cigarettes/day from, say, 25 to 15 ($F = 0.6$), but also to switching to a product with a reduced uptake of relevant smoke constituents. Note that the formula implies a linear dose-response relationship. Modification of the formula to allow for nonlinearity is discussed later (see Section 6.1).
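The following Python sketch illustrates the exposure-reduction extension, assuming formula (4) takes the form ER = ER for continuing smokers × [F + (1 − F) exp(−λt)], with F the ratio of post- to prereduction exposure. The function name and the numerical inputs are illustrative assumptions only.

```python
import math


def excess_risk_after_reduction(er_current: float, exposure_ratio: float,
                                years_since: float, half_life: float) -> float:
    """ER after exposure falls from 1 unit to `exposure_ratio` units.

    Assumes a linear dose-response, so the ER decays from er_current
    towards exposure_ratio * er_current with the given half-life.
    """
    lam = math.log(2) / half_life
    decay = math.exp(-lam * years_since)
    return er_current * (exposure_ratio + (1.0 - exposure_ratio) * decay)


# Illustrative: cutting from 25 to 15 cigarettes/day gives F = 15 / 25 = 0.6.
er_c, half_life = 10.0, 10.0
for years in (0, 10, 20, 40):
    print(years, round(excess_risk_after_reduction(er_c, 0.6, years, half_life), 3))
```

Setting `exposure_ratio` to 0 recovers the quitting case, and as time since the reduction grows the ER approaches `exposure_ratio` times the ER of a continuing smoker.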

A recent review on the effect on lung cancer risk of reducing amount smoked [7] summarized evidence from three prospective and three case-control studies which involved varying reductions in amount smoked over a varying time. Each study demonstrated a risk reduction, with a meta-analysis estimating the overall RR for reducers versus nonreducers as 0.81 (95% CI 0.74–0.88).

To compare these results with the predictions of formula (4), data are required on the extent of reduction of consumption in the reducers, the length of follow-up since the reduction, and the average age of the subjects at the time of reduction. Such data were only available for the prospective studies, each of which involved two separate examinations when smoking habits were recorded and then a mortality follow-up period of 10 or more years. The study details and the observed and NEM predicted RRs are shown in Table 1.

Study 1 provided one RR estimate for sexes combined, study 2 provided one estimate per sex, and study 3 provided estimates for males for different degrees of reduction. Each RR was adjusted for age and confounders. The exposure ratio $F$, the ratio of post- to prereduction consumption, varied from 0.17 to 0.50. Values of $F$ came from the source for studies 1 and 2 but for study 3 were based on assumed midpoints of 30, 15, and 5 cigs/day for the three categories used. The time from reduction to end of follow-up was estimated as length of follow-up plus half the time between examinations.

RRs were predicted using formula (4), as $F + (1 - F)W$, where $F$ is the ratio of post- to prereduction consumption and $W$ is a weighted mean value of $\exp(-\lambda t)$ over the time from reduction to end of follow-up, the values being computed at 0.1 year intervals. Based on the Doll-Peto formula [127] the weighting factor was taken as $(a + t - 22.5)^{4.5}$, $a$ being the average age at time of reduction and $t$ being the time since the reduction occurred. This factor is necessary because the observed RR derives from cancers occurring at varying times after the reduction started, whereas formula (4) relates to the reduction in ER at a specific time. It also accounts for the absolute risk in continuing smokers rising with increasing age (or duration). $H$ was taken as 9.93 years, based on earlier work [4]. No account was taken of any survival differences between continuing smokers and reducers, which are likely to have had a relatively minor effect given the ages of the populations.
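The weighted-mean calculation described above can be sketched in Python as follows. Two points are assumptions rather than details confirmed by the text: the predicted RR for reducers versus continuing smokers is taken as the weighted mean of the ratio F + (1 − F) exp(−λt), and the weight is taken as (age − 22.5) raised to the power 4.5, the age term of the Doll-Peto lung cancer formula. The inputs in the example are hypothetical and are not the values in Table 1.

```python
import math


def predicted_rr(exposure_ratio: float, age_at_reduction: float,
                 years_followed: float, half_life: float = 9.93,
                 step: float = 0.1) -> float:
    """Weighted mean of F + (1 - F) * exp(-lambda * t) over follow-up.

    Weights (assumed): (age_at_reduction + t - 22.5) ** 4.5, so later
    times, when absolute risk is higher, count for more.
    """
    if age_at_reduction <= 22.5:
        raise ValueError("the assumed weight requires age above 22.5 years")
    lam = math.log(2) / half_life
    n_steps = int(round(years_followed / step))
    weighted_sum = 0.0
    weight_total = 0.0
    for i in range(n_steps + 1):
        t = i * step  # time since reduction, evaluated every `step` years
        weight = (age_at_reduction + t - 22.5) ** 4.5
        ratio = exposure_ratio + (1.0 - exposure_ratio) * math.exp(-lam * t)
        weighted_sum += weight * ratio
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical inputs: consumption halved at age 50, 15 years to end of follow-up.
print(round(predicted_rr(0.5, 50.0, 15.0), 3))
```

Because the ratio falls with time and the weights rise with age, a longer follow-up gives a lower predicted RR for the same exposure ratio.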

As seen in Table 1, the predicted RR is always near its observed value and lies well within its 95% CI. For the six observed RRs, meta-analysis yields an estimate of 0.73 (95% CI 0.64–0.82) with no heterogeneity. Using weights as for the observed data, the estimate based on the NEM predicted RRs, 0.71 (0.63–0.80), was similar. Although various assumptions and simplifications are involved, unavoidable without access to the full study data, and study 3 estimates are not independent (estimates 4 and 5 both involving comparisons with continuing heavy smokers), the predicted RR from formula (4) is clearly a good approximation to the observed value.

5. Multiple Changes in Exposure

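Denoting the negative exponential function by $N(t) = \exp(-\lambda t)$, formula (4) can be rewritten as

$$\mathrm{ER}_R(a,t) = \mathrm{ER}_C(a)\,[F + (1 - F)N(t)], \tag{5}$$

where $\mathrm{ER}_C(a)$ is the ER for a continuing smoker with unit exposure at age $a$ and $F$ is the exposure after the reduction.

More generally, for someone switching from exposure $F_1$ to $F_2$ units, the ER for the switcher, $\mathrm{ER}_S(a,t)$, can be expressed as

$$\mathrm{ER}_S(a,t) = \mathrm{ER}_C(a)\,[F_2 + (F_1 - F_2)N(t)], \tag{6}$$

where $t$ is the time since switch. Setting $F_1 = 1$ and $F_2 = F$, the two formulae are clearly the same.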

Figure 1 illustrates various patterns of observed ERs. The five lines in red relate to smokers of conventional cigarettes (exposure $F = 1$, ER = 10) who, at the age of 40, either continue to smoke conventional cigarettes, switch to differing MRTPs (with $F = 0.75$, 0.50, or 0.25), or quit ($F = 0$). The five lines in blue relate to smokers of one type of MRTP ($F = 0.5$, ER = 5) who, at the age of 40, either continue to smoke the same product, switch to conventional cigarettes, switch to other modified risk products ($F = 0.75$ or 0.25), or quit ($F = 0$). The half-life $H$ is set as 10 years throughout.
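A single switch of this kind can be sketched in Python, assuming formula (6) has the form ER = (ER at unit exposure) × [F_new + (F_old − F_new) exp(−λt)]. For simplicity the sketch holds the unit-exposure ER constant at 10 rather than letting it vary with age, which is an assumption made for illustration; the function and parameter names are likewise hypothetical.

```python
import math


def excess_risk_after_switch(er_unit: float, f_old: float, f_new: float,
                             years_since: float, half_life: float) -> float:
    """ER after switching from exposure f_old to f_new.

    er_unit is the ER of a continuing smoker with exposure 1. The ER
    starts at f_old * er_unit and moves towards f_new * er_unit.
    """
    lam = math.log(2) / half_life
    decay = math.exp(-lam * years_since)
    return er_unit * (f_new + (f_old - f_new) * decay)


ER_UNIT, HALF_LIFE = 10.0, 10.0  # illustrative values, as in the figure description

# A conventional-cigarette smoker (F = 1) who changes exposure at the age of 40.
for f_new in (1.0, 0.5, 0.25, 0.0):
    path = [round(excess_risk_after_switch(ER_UNIT, 1.0, f_new, age - 40, HALF_LIFE), 2)
            for age in (40, 50, 60, 70)]
    print(f"F: 1 -> {f_new}: ER at ages 40, 50, 60, 70 = {path}")

# A smoker of a reduced-exposure product (F = 0.5) who switches to conventional cigarettes.
print(round(excess_risk_after_switch(ER_UNIT, 0.5, 1.0, 10.0, HALF_LIFE), 2))
```

The same function covers reductions, increases, and quitting, since only the starting and target exposures change.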

This formulation can be extended to allow for multiple periods of exposure. We first define the following:
$n$: the number of periods;
$T_i$: the time of the $i$th exposure change;
$F_i$: the exposure in period $i$, with $F_0$ taken as zero;
$\mathrm{ER}_C(a)$: the ER for a continuing smoker with $F = 1$, at age $a$, the value being derived from epidemiological data;
$\mathrm{ER}_i(a)$: the ER in period $i$ at age $a$ for the given pattern of exposure.
Note that $\mathrm{ER}_i(a)$ is only defined for ages within period $i$, that is, for $T_i \le a < T_{i+1}$.

The method of estimation of $\mathrm{ER}_i(a)$, the ER in period $i$ at age $a$, is illustrated below for four switches ($n = 4$), working through the five periods in turn.

In period 0, the ER is zero.

In period 1, the ERs are estimated by multiplying the exposure by $\mathrm{ER}_C(a)$, the ER for a continuing smoker with unit exposure. To allow calculation for later periods, this must be estimated for each later exposure. Thus, we have

In period 2, the ERs are calculated from period 1 ERs as follows:

In period 3, the ERs are calculated from period 2 ERs as follows:

Similarly in period 4, we have

Thus the person of interest, with exposures 0, $F_1$, $F_2$, $F_3$, and $F_4$ in periods 0 to 4, has the ERs in each period given by the first equation shown for each period. The other ERs are used only to calculate these.

Figures 2, 3, and 4 show illustrative results for three smoking patterns, each starting with smoking of conventional cigarettes ($F = 1$) up to the age of 40 when the ER is taken as 10 but involving differing patterns later. The half-life $H$ is set as 10 years, as for Figure 1. Figure 2 concerns smokers who quit at the age of 40, resume at the age of 50, quit again at the age of 60, and resume at the age of 70. Figure 3 concerns smokers who reduce consumption slightly at the age of 40, further reduce consumption at the age of 50, increase consumption at the age of 60, and return to their original habits at the age of 70 ($F = 1$). Figure 4 concerns smokers who cut down markedly at the age of 40, increase consumption at the age of 50, reduce consumption at the age of 60, and then quit at the age of 70.

Note that, as far as we are aware, there are no published epidemiological data available giving changes in risk following multiple periods of exposure which would allow formal comparisons to be made of observed risks and those predicted by the formulae given above.

6. Modifications of the NEM

6.1. Allowing for Variation in the Dose-Response Relationship

The methods described in Sections 4 and 5 assume that exposure is linearly related to the ER. Although recent dose-response meta-analyses [128] show little upward curvature in the relationship between ER and the amount smoked per day, a nonlinear dose-response has been claimed, with Doll and Peto [127] suggesting that risk is proportional to (cigs/day + 6) squared. The exposure $F$ can be adjusted to take account of nonlinearity, as illustrated below for the Doll and Peto example.

Thus, suppose that $F = 1$ corresponds to 20 cigarettes/day, so the ER is proportional to $26^2 - 6^2$, where $26^2$ is the dose factor in the Doll/Peto formula for smokers and $6^2$ is that for nonsmokers. Adjusted values, $F^*$, are then calculated by $F^* = [(20F + 6)^2 - 6^2]/(26^2 - 6^2)$, so that multiplying the ER for $F = 1$ by $F^*$ gives the quadratically interpolated curves. Thus unadjusted values of 0.75, 0.5, 0.25, and 0.1 become adjusted values of 0.633, 0.344, 0.133, and 0.044.
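The adjustment can be checked with a few lines of Python. The sketch assumes, as in the example above, that unit exposure corresponds to 20 cigarettes/day and that risk is proportional to (cigs/day + 6) squared; the function and parameter names are illustrative.

```python
def adjusted_exposure(f: float, cigs_at_unit: float = 20.0,
                      offset: float = 6.0) -> float:
    """Exposure adjusted for a quadratic (Doll and Peto type) dose-response.

    The excess dose factor at exposure f, (cigs + offset)^2 - offset^2,
    is expressed relative to the excess dose factor at exposure 1.
    """
    excess_at_f = (cigs_at_unit * f + offset) ** 2 - offset ** 2
    excess_at_unit = (cigs_at_unit + offset) ** 2 - offset ** 2
    return excess_at_f / excess_at_unit


for f in (0.75, 0.5, 0.25, 0.1):
    print(f, adjusted_exposure(f))
```

Rounded to three decimal places, the printed values reproduce the adjusted exposures quoted in the text. The adjusted value can then be used in place of the unadjusted exposure in the linear formulae.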

6.2. Allowing for Variation in the Half-Life

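As shown in Table 2, there is evidence that $H$ increases with age for lung cancer [4], IHD [3], and stroke [6], although not for COPD [5]. A modification of the NEM allows for this. Thus, whereas when $H$ is invariant of age the negative exponential function $N(t)$ is defined as $\exp(-\lambda t)$ with $\lambda = \ln 2/H$, where $t$ is time since switch, for varying $H$, $N(t)$ is defined as the product of terms in $\exp(-\lambda_j t_j)$ with $\lambda_j = \ln 2/H_j$, with time since switch divided into periods of length $t_j$ ($\sum_j t_j = t$) where the half-life is $H_j$.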

Figure 5 corresponds to Figure 2, showing predicted ERs for the first smoking pattern described above. Here, instead of a constant 10 years, $H$ is set at 7 years for ages 40–55, at 10 years for ages 55–70, and then at 14 years. While the initial decline is greater in Figure 5 than in Figure 2, the predicted ER is very similar after the age of 70.
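A minimal Python sketch of the age-dependent decay factor follows, assuming it is the product of exponential terms over the age bands spanned since the switch. The bands are those quoted for Figure 5 (7 years up to age 55, 10 years for ages 55 to 70, and 14 years thereafter); the function name and data layout are illustrative assumptions.

```python
import math

# (upper age limit of band, half-life in years), in increasing age order.
HALF_LIFE_BANDS = [(55.0, 7.0), (70.0, 10.0), (math.inf, 14.0)]


def decay_factor(age_at_switch: float, age_now: float,
                 bands=HALF_LIFE_BANDS) -> float:
    """Product of exp(-ln(2) * t_j / H_j) over the bands since the switch.

    t_j is the time spent in band j between age_at_switch and age_now.
    """
    factor = 1.0
    lower = age_at_switch
    for upper, half_life in bands:
        if age_now <= lower:
            break  # no time left to allocate to later bands
        span = min(age_now, upper) - lower  # years spent in this band
        if span > 0:
            factor *= math.exp(-math.log(2) / half_life * span)
            lower = min(age_now, upper)
    return factor


# Quitting at 40: decay factor at selected ages under the banded half-lives.
for age in (40, 47, 55, 60, 70):
    print(age, round(decay_factor(40.0, age), 3))
```

With a single band the function reduces to the constant half-life case, so it can replace the simple exponential term wherever the time since switch is used.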

7. Discussion

Based on substantial data for IHD [3] and lung cancer [4] and more limited data for COPD [5] and for stroke [6], the decline in ER following quitting has been shown to fit well the simplest NEM version (formula (1)). Using more limited data, the decline in lung cancer risk following reducing cigarette consumption can be fitted well by an extended form (formula (4)).

The further extension (formula (6)) allows estimation of ER following multiple exposure changes. All it requires are estimates of $H$, the ER for continued smoking ($\mathrm{ER}_C(a)$), and the exposures ($F_i$) in each relevant period. The formula allows estimation of predicted risk patterns, not only for intermittent smoking periods and consumption changes, but also for changes in product smoked. Thus if conventional cigarette smoking has factor $F = 1$, and an MRTP [129] reduces exposure to relevant smoke constituents by 80%, $F$ for the MRTP would be set as 0.2. $F$ for dual users, using half conventional cigarettes and half MRTPs, could be set as 0.5 × (1 + 0.2) = 0.6. Given data on switching rates from conventional cigarettes to MRTP and the frequency of dual use, the NEM can then predict changes in risk for smoking-related diseases.

Validating formula (6) is problematic, since suitable published epidemiological data on risk changes following multiple exposure changes are lacking. It is likely, however, that large epidemiological data sets exist which have recorded data on the extent of smoking at three or more time points. Provided that there are sufficient numbers of subjects showing differing patterns of changing exposure, it would be valuable to use such data to test the accuracy of fit of the NEM.

Formula (6) is most simply employed with $F$ taken as proportional to the extent of exposure, thus implicitly assuming a linear dose-response relationship. In fact, recent extensive dose-response analyses for lung cancer [128] do not suggest marked nonlinearity, and the same seems true for COPD [130] and other smoking-related diseases [131, 132]. However, it is possible, as we describe, to adapt the NEM to allow for nonlinearity by using values of $F$ that reflect alternative relationships. It should be noted that observed dose-response relationships may not precisely reflect the truth because of inaccurate reporting of amount smoked. Though the assumption of linearity may not be correct, it certainly serves as a useful simple starting point for ER estimations.

Another assumption inherent in formula (6) is that $H$ is invariant of age. Available results (see Table 2) suggest that $H$ increases with age for lung cancer, IHD, and stroke, though not for COPD. There is therefore a case for using age-dependent estimates of $H$, and we describe an adaptation of the NEM to allow for this. However, it may be difficult to obtain precise estimates. Those in Table 2 come from case-control and prospective studies. For case-control studies, age is when the cancer occurred, but for prospective studies it was taken as age at baseline (or for lung cancer, midpoint age of follow-up); such studies do not usually report risk by actual age.

Another issue is uncertainty in $H$, particularly for IHD and stroke, where substantial heterogeneity between the available data sets [3, 6] renders the overall estimate less reliable. This creates uncertainty in ER estimates resulting from varying exposure histories, which might be addressed by presenting estimates using alternative values of $H$ (e.g., ±1 SE).

This paper does not attempt to further investigate the usefulness of the NEM by comparing its predictions with other models. There are two reasons for this. First, the fit to more detailed models has only been studied for some of the diseases of interest. Second, for lung cancer, where various models have been tried, such as the multistage model (see, e.g., [1, 133–143]) or the two-stage clonal expansion model [144–147], full comparisons of the predictions of the NEM and other models would be a major exercise, beyond the scope of this paper. We prefer, if possible, to test the goodness-of-fit of the NEM using available data sets. At the present time, we feel that the material presented suggests that the NEM is a valuable, simple tool meriting further investigation.

8. Conclusion

The NEM is a useful model with many applications to smoking and health data. The most general form not only allows a simple description of the time course of disease following quitting, but can predict changes in risk following single or multiple reductions or increases in exposure, whether resulting from changes in amount smoked or the type of product smoked, perhaps following switching to MRTPs.


Neither sponsor was involved in the planning, execution, or writing of the paper or the decision to submit it for publication.


The opinions and conclusions of the authors are their own and do not necessarily reflect the position of Philip Morris Products S.A.

Conflict of Interests

Peter N. Lee, founder of P. N. Lee Statistics and Computing Ltd., is an independent consultant in statistics and an advisor in the fields of epidemiology and toxicology to a number of tobacco, pharmaceutical, and chemical companies. This includes Philip Morris Products S.A., the sponsor of this study. The other three authors are employees of P. N. Lee Statistics and Computing Ltd.


The authors thank Philip Morris Products S.A. for funding the work. They also thank Pauline Wassell, Diana Morris, and Yvonne Cooper for assistance in typing the various drafts of the paper and obtaining the relevant literature. The preparation of the paper and work on quitting smoking and multiple exposure changes was supported by Philip Morris Products S.A. The work on cigarette consumption reduction was supported by Altria Client Services Inc.