Abstract

We present a stochastic methodology to study the decay phase of an epidemic. It is based on a general stochastic epidemic process with memory, suitable to model the spread in a large open population with births of any rare transmissible disease with a random incubation period and a Reed-Frost type infection. This model, which belongs to the class of multitype branching processes in discrete time, enables us to predict the incidences of cases and to derive the probability distributions of the extinction time and of the future epidemic size. We also study the epidemic evolution in the worst-case scenario of a very late extinction time, making use of the Q-process. We provide in addition an estimator of the key parameter of the epidemic model quantifying the infection and finally illustrate this methodology with the study of the Bovine Spongiform Encephalopathy epidemic in Great Britain after the 1988 feed ban law.

1. Introduction

Outbreaks of infectious diseases of animals or humans are subject, when possible, to control measures aiming at curbing their spread. Effective measures should force the epidemic to enter its decay phase and to reach extinction. The decay phase can then be simply detected by a decrease of the number of cases, when this decrease is obvious. However this is not always the case, and this rough qualitative information might not be sufficient to evaluate accurately the effectiveness of the proposed measures to reduce the final size and duration of the outbreak. The goal of this paper is to present a stochastic methodology in discrete time to study more accurately the decay phase of an epidemic. Our framework is the spread, in a large open population, of a rare transmissible disease such that the infection process may be assumed to follow a Reed-Frost type model, with a probability for a susceptible to become infected by a given dose of pathogens inversely proportional to the total population size. Moreover the latent period (during which an individual is infected but not yet infectious) may be random and long compared to the generation time. Questions about the decay phase include the following: which quantitative criteria can ensure that the disease has entered an extinction phase? What is the probability distribution of the epidemic extinction time, of the epidemic final size, and of the incidence of infected individuals? Finally, what would be the evolution of the epidemic in the event of a very late extinction of the disease?

From a practical point of view, it is generally impossible to observe all infections. Susceptible and infected but not yet infectious individuals are most often not distinguishable, being both apparently healthy. This leads to the fact that the only available observations correspond to the incidence of individuals with clinical symptoms. One way to deal with this lack of information was proposed in [1] by Panaretos, who used a model taking into account two types of infected individuals, the observed and the unobserved. In order to answer the previous questions, we choose here a different approach, considering a stochastic model depending on the sole incidences of infectives at each time. We assume that an infective can transmit the disease during one given time unit at most. Therefore the incidence of infectives corresponds to the incidence of cases. The process then describes in a recursive way how one single infective can indirectly generate new infectives (so-called “secondary cases”) time units later, where . We assume that this number of secondary cases follows a Poisson probability distribution with parameter . The recursive formula defining is then the following: where the variable is the incidence of secondary cases produced at time with a delay (latent period) by individual infectious at time . The are assumed independent given , and the are assumed i.i.d. (identically and independently distributed) given , with a common Poisson distribution with parameter . This model is therefore time homogeneous and is in this sense less general than the one introduced in [2], which describes the spread of infectious animal diseases in a varying environment. However since we focus on the extinction phase only, the assumption of a constant environment with no new control measure is well founded and enables us to describe more accurately the decay phase. This process is the generalization of the well-known single-type BGW (Bienaymé-Galton-Watson) branching process, which is the limit, as the total population size tends to infinity, of the process describing the spread in a closed population of an infectious disease with a negligible latent period and a probability to become infected following a Reed-Frost model (see, e.g., [3, 4] and citations therein).

The core of the paper lies in Section 2, where the whole methodology is presented. We first formulate the epidemic model as a multitype branching process with Poissonian transitions, the types representing the memory of the process. This formulation provides useful analytical results such as the extinction criteria, and the distributions of the extinction time and of the epidemic size (Sections 2.12.3). Then, in order to investigate the worst-case scenario of an extreme late extinction of the epidemic, we introduce in Section 2.4 the -process , obtained by conditioning on a very late extinction. Using this process, we focus the study on the early behavior of the decay phase in the worst-case scenario, rather than on its long range behavior, which would have little meaning in our setting. Motivated by practical applications to real epidemics, for which we want to predict the processes and , as well as the derived distributions above, we need to know the values of the parameters . We may write , where is the mean number of individuals infected at age by an infective by direct or indirect transmission, is the largest survival age, is the probability for the individual aged at infection to have a latent period equal to (given his survival), and is the probability to be aged at the end of the latent period. Parameters are the key quantities for the spread of the disease and can be subject to changes due to control measures during the epidemic. We assume here that , where is constant over time, while may change with control measures. A typical example is when is the mean number of individuals infected by an infective by horizontal route at age , assumed independent of , and represents the maternal transmission probability . In this case So we assume here that, except for , the other parameters of the are constant over time and are known (generally estimated) from previous experiences or from the study of the whole epidemic evolution, in particular its growth phase. We moreover assume that each depends affinely on (see Section 3.1). In Sections 2.5 and 2.6, we provide optimal WCLSE (Weighted Conditional Least Squares Estimators) of in the decay phase, in the frame of as well as in the frame of the associated conditioned process , and prove the strong consistency and the asymptotic normality of these estimators.

The final Section 3 is devoted to the application of this method to real epidemics. We first present in Section 3.1 some general conditions under which the spread of a SEIR disease (susceptible, exposed (latent), infectious, removed) can be approximated by our epidemic process defined by (1.1) and give an explicit derivation of the parameters . We then illustrate the methodology in Section 3.2 with the decay phase of the BSE (Bovine Spongiform Encephalopathy) epidemic in Great Britain. According to the available data [5], the epidemic is obviously fading out. We assume that the satisfy (1.2). Then thanks to the stochastic tools developed here, we provide in addition to this rough information, short- and long-term predictions about the future spread of the disease as well as an estimation of a potentially remaining horizontal infection route after the 1988 feed ban law.

2. Methodology for the Study of an Epidemic Decay Phase

In this section we present a general methodology to study the decay phase of a SEIR disease in a large population, modeled by the process (1.1) defined in Section 1. Our main goal is to provide analytical tools to evaluate the efficiency of the last control measures taken prior to the considered time period. Most of our results are derived from the fact that this epidemic model can be seen as a multitype branching process. Indeed, defined by (1.1) is a Markovian process of order . Consequently, the -dimensional process defined by is Markovian of order 1, and it stems directly from (1.1) that is a multitype Bienaymé-Galton-Watson (BGW) process with types (see, e.g., [6]). Note that the types in this branching process do not correspond to any attribute of the individuals in the population, which is usually the case in mathematical biology (see, e.g., [7]), but simply correspond to the memory of the process . The information provided by the -dimensional Markovian process is therefore the same as the one given by the 1-dimensional -Markovian process , but the multitype branching process setting gives us powerful mathematical tools and results stemming from the branching processes theory [6]. The first basic tool is the generating function of the offspring distribution of , , defined on by , where denotes the th basis vector of and for . For all , we have here The second basic tool is the mean matrix defined by , which is here Let us notice that, since for each , then is nonsingular, positive regular (see [6]) and satisfies the condition, where denotes the sup norm in .

2.1. Extinction of the Epidemic
2.1.1. Almost Sure Extinction

Since the single-type process has a memory of size , it becomes extinct when it is null at successive times, or equivalently as soon as the -dimensional process reaches the -dimensional null vector . According to the theory of multitype positive regular and nonsingular BGW processes [6], the extinction of the process occurs almost surely (a.s.), if and only if , where is the dominant eigenvalue (also called the Perron's root) of the mean matrix . Thus is solution of . In general for , has no explicit expression. However, leads directly to the following explicit threshold criteria.

Proposition 2.1. The epidemic becomes extinct almost surely if and only if , where is the total mean number of secondary cases generated by one infective in a disease. We call the basic reproduction number.
Moreover, when , then with equality if and only if either or , and when , then with equality if and only if .

Note that when , only provides information about the threshold level, whereas provides an additional information about the speed of extinction of the process, as shown in the next two paragraphs.

2.1.2. Speed of Extinction

Thanks to well-known results in the literature about multitype branching processes and more particularly to the Perron-Frobenius theorem (see, e.g., [6]), we can deduce the expected incidence of infectives in the population at time , for large. Denoting by and the right and left eigenvectors of associated to the Perron's root ; that is, and , with the normalization convention , where stands for the usual scalar product in and where the superscript denotes the transposition, then . The first coordinate in the latter formula becomes for the epidemic process . Computing explicitly and , we obtain that for all , which leads to the following asymptotic result: Hence if , the mean number of infectives decreases exponentially at the rate . In the following section, we provide a much finer result on the estimation of the disease extinction time in the population.

2.1.3. Extinction Time of the Epidemic

The extinction time distribution can be derived as a function of the offspring generating function. As usual in stochastic processes, this quantity is calculated conditionally on the initial value , but for the sake of simplicity we do not let it appear in the notations. Note that since we are building tools for the prediction of the spread of the disease, the time origin corresponds here to the time of the last available data (generally the current date). Let denote the extinction time of the process , and let be the th iterate of the generating function given by (2.2). We denote . Then, by the branching property of the process [6], the probability of extinction of the epidemic before time is immediately given by , that is to say, It can be immediately deduced from convergence results for as [8], that if , , while if , , for some constants . As a consequence, the closer is to unity, the longer the time to extinction will be in most realizations. More specifically, (2.7) enables the exact computation (resp., estimation) of for any by the iterative computation of , being given, when the parameters of (1.1) are known (resp., estimated). Moreover, since for the epidemic becomes extinct in an a.s. finite time and , then for any given probability there exists such that . So in practice, for any , (2.7) enables us to compute the -quantile of the extinction time,

2.2. Total Size of the Epidemic

Under the assumption and the independence of the (we previously assumed the independence of the , for each ), we derive the distribution of the future total size of the epidemic until its extinction, that is, the future total number of infectives until the extinction of the disease. It can be shown [9] that, given the initial value , the time origin being the same as in Section 2.1, the probability distribution of is where denotes the equality in distribution, an empty sum is by convention , the are independent, and the and the are i.i.d. with that is, for each , . Consequently, under the convention that an empty product is 1, the probability distribution of is, for any , which may be calculated (resp., estimated), replacing the by their values (resp., estimations). In practice, for any , (2.11) enables to compute the -quantile of the total epidemic size, We obtain moreover an explicit formula for the mean value and variance of the size of the epidemic,

2.3. Exposed Population

Depending on the disease, it might also be crucial to study and predict the evolution of the incidence of exposed individuals in the population, which is generally unobservable. We assume that this information is given by the process defined by the conditional distribution , where is the mean number of individuals infected at time by an infective of this time (see Section 3.1). This property enables on the one hand to reconstruct the whole past epidemic (i.e., the incidence of infectives as well as of exposed individuals) thanks to the observable data. On the other hand, it allows to simulate the evolution of the incidence of exposed individuals in the future, based on predictions of the evolution of the epidemic process .

2.4. Worst-Case Scenario: Very Late Extinction of the Epidemic

Even in the case when the epidemic dies out almost surely (), and although one can provide the -quantile of the extinction time with the probability as large as wanted (see (2.8)), the epidemic might become extinct after this time with a small but nonnull probability of order . This raises the following question: how would the incidences of infectious and exposed individuals evolve in the (unlikely) case of a very late extinction? In terms of risk analysis, this issue appears to be crucial to evaluate the risks associated with this worst-case scenario. The tools developed in the previous subsections allow to evaluate the probability of all possible outcomes. But since the worst ones, typically a very late extinction, have a negligible probability, these tools do not bring any information in these worst cases and in particular do not inform on the evolution at each time step of the spread of the disease (would it decrease extremely slowly, stay at a constant rate for a very long time, present several peaks in its evolution, etc.?). In order to study the propagation of the epidemic in the decay phase, assuming that extinction occurs very late, we are interested in the distribution of the process conditionally on the event that the epidemic has not become extinct at time , where is very large. We therefore consider for any and any the conditioned probability , where the subscript denotes the initial value. If is finite this distribution cannot be easily handled due to its time inhomogeneity. However, when , it is known ([10]) that this conditioned distribution converges, as , to the distribution of a -dimensional Markov process : We will further discuss in Proposition 2.5 the relevancy of approximating the conditioned probability for fixed by the limiting object (2.14). The conditioned process defined by (2.14) is known in the literature as the -process associated with , also described as the process conditioned on “not being extinct in the distant future.” It has the following transition probability [10]: for every , , , where is the normalized right eigenvector of associated to the Perron's root as introduced in Section 2.1, and computed explicitly in (2.5). In the same way as for the process , we define the 1-dimensional process . By construction we then have , for each and each .

Proposition 2.2. The stochastic process is, conditionally on its past, distributed as the sum of two independent Poisson and Bernoulli random variables: where , is the convolution product symbol, and

Proof. Applying (1.1) and (2.15), we obtain that for all , The equality implies that for all , , and that . Consequently, , and thus

Remark 2.3. Note that if one compares (2.16) with the transition probability of the unconditioned process , it appears that behaves at each time step like , according to a Poisson distribution, except that it has the possibility at each time step to add one unit or not, according to a Bernoulli random variable. Moreover, if , then according to (2.17), , which implies that at time , the probability to add one unit is equal to one, thus preventing the extinction of the process.

Proposition 2.4. The process admits a stationary probability measure with finite first- and second-order moments.

Proof. Since the multitype branching process satisfies property (2.4), it is known [10] that the -process is positive recurrent with a stationary probability measure given by where is the Yaglom distribution of the process , uniquely defined by the following property: for all , . In the literature, this stationary measure for the conditioned process is also referred to as the doubly limiting conditional probability. Moreover, by Proposition 2.2, which implies that . We consequently obtain by means of Fatou's lemma that, for every , We similarly prove that has finite second-order moments by writing

Let us discuss the relevancy of approximating the epidemic process conditioned on nonextinction at some finite time , for large, by the -process obtained by letting . When considering the case of late extinction, one works under an hypothetical assumption based on the unknown future, hence in practice one does not focus on a specific value for the survival of the disease in the population. We therefore might consider that is chosen large enough such that the approximation of the process conditioned on the event by the process is valid. Of course, the order of magnitude of such will depend on the rate of convergence of the conditioned process to .

Proposition 2.5. Let and . Then the difference decreases, as , with , where is an eigenvalue of such that , with the being the other eigenvalues of . In case we stipulate that the multiplicity of is at least as great as the multiplicity of .

Proof. Thanks to (2.15) and to the Markov property together with the fact that , we have The right term of (2.24) is known to converge to 0, as , thanks to the property that , for some , where (see [8]). This stems from two convergences, namely, , where , and . Let us write , where . Since and , it comes It thus appears that the rate of convergence of to is of the same order of magnitude as the one of to . Let us determine this rate in an accurate way. We use the following inequality produced by Joffe and Spitzer in [8]: for each , where we are going to replace and by some explicit formulae function of and . For this purpose, we use a detailed asymptotic behavior of , as , presented for instance in [11]; we have , where . For the sake of clarity the symbol will denote either a scalar or a matrix with all the entries satisfying the associated property. This implies the existence of some constant such that, for all , , where . Moreover, following [8], let us write, for all , , where , and as . Then , which implies the existence of some constant such that, for all , , with . We thus have provided an explicit formula for the sequences and introduced by Joffe and Spitzer in [8]. Finally let us apply (2.26) to and replace in this inequality and by their explicit expressions that we got. We obtain that, for all , Consequently, the right member of (2.24) will decrease with , as .

Hence the concept of the -process will have most practical relevance to approximate the very late extinction case if is near to zero and if is small compared with . Note however that the very late extinction scenario is more likely to happen if is near to unity because the time to extinction in most realizations will then be long (see Section 2.1).

2.5. Estimation of the Infection Parameter

We assume for this subsection that the parameters of the epidemic model (1.1)–(2.1) are not entirely known. More precisely, we assume that the are of the form, for all , where and are constants, and is an unknown real parameter. We will write in what follows , where and so forth. This general assumption corresponds in particular to the case where the are of the form (1.2).

We estimate by the following WCLSE in model (1.1)–(2.1). This estimator generalizes the well-known Harris estimator [12] in a BGW process. Let , , such that . The WCLSE is based on the normalized process and is defined by We easily derive the following explicit form: The normalization of the process by appears to be the most natural and suitable for the following reasons. First, this normalization generalizes the normalization in the monotype case, which is the one leading to the Harris estimator of since we have, for , . It also corresponds, in the linear case , to the maximum likelihood estimator of . In addition, defining for any vector , and , we have hence the conditional variance of the error term in the stochastic regression equation is invariant under multiplication of the whole process, and bounded respectively to , leading to the quasi-optimality of at finite and , in the sense of [13].

Let us provide asymptotic results for the estimator defined by (2.30), as the initial population size tends to infinity. We denote by the th entry in the th power of the matrix given by (2.3).

Theorem 2.6. Let us assume that, for each , there exists some such that . Then is strongly consistent, that is, , and is asymptotically normally distributed: where

Proof. Let us first prove that, for each and each , Using the branching property of the process ([6]) we write where, for all and , is the th coordinate of a -type branching process at time initialized by a single particle of type . For , , and fixed the random variables are i.i.d. with mean value . According to the strong law of large numbers and under the theorem assumption, we have, for every such that , which together with the theorem assumption leads to (2.34).
To prove the consistency of we apply (2.34) to (2.30), using the fact that and , and obtain By definition, hence (2.37) immediately leads to the strong consistency.
We are now interested in the asymptotic distribution of . We derive from (2.30) that By (1.1), where the are i.i.d. given , following a Poisson distribution with parameter , and the are independent given . Renumbering the we then obtain Applying a central limit theorem for the sum of a random number of independent random variables (see, e.g., [14]), we obtain that, for all , We have used the fact that is a real positive sequence growing to infinity, and is a sequence of integer-valued random variables such that converges in probability to a finite random variable. In our case the limit is actually deterministic, since we have shown in (2.34) that Using (2.41) in (2.39), we write Using again (2.34), which, combined to (2.42) and (2.44), implies by Slutsky's theorem that By (2.33) and the strong consistency, , from which we finally deduce (2.32).

2.6. Estimation of the Infection Parameter in the Worst-Case Scenario

In order to make predictions of the evolution of the epidemic in case of a very late extinction, that is, in order to make predictions of the behavior of the conditioned process introduced in Section 2.4, we need to estimate the parameter in the setting of this conditioned process. We point out that does not play the same role in the conditioned process and in the unconditioned process , since, as shown in Proposition 2.2, this parameter interferes not only in the Poisson random variable but also in the Bernoulli one. It would thus be irrelevant to estimate with an estimator aimed for the unconditioned process, such as . Let us notice that, according to (2.16), the process could be written as a multitype branching process with state dependent immigration. Because of this state-dependence, and since the parameter acts in a nonlinear way in the immigration, the methods developed in estimation theory for branching processes with immigration (see, e.g., [15]) cannot be directly applied here. Similarly as in Section 2.5 we consider the WCLSE based on the process , namely, where is defined in Section 2.5, and where Let be the error term between the normalized process and its conditional expectation. We obtain that which implies In what follows, we denote by the derivative of with respect to , and similarly for the other quantities depending on .

Theorem 2.7. The estimator is strongly consistent, that is, , and has the following asymptotic distribution:

Remark 2.8. Note that (2.51) involves the function and thus requires the knowledge of the derivative of the function given in (2.5), which is not an explicit function of since is not either. However, satisfies ; hence is known as soon as can be computed. Consequently, denoting by when is the parameter of the model, (2.51) can be used as soon as is known; for this purpose one can for instance numerically approximate the largest solution of .

Proof. The proof heavily relies on a strong law of large numbers for homogeneous irreducible positive recurrent Markov chains applied to the conditioned process and its stationary distribution (given by Proposition 2.4), which states [16]: for every -integrable function , Note that the Perron's root of and the associated right normalized eigenvector are -functions of .
In order to prove the strong consistency of as , since is not linear, we cannot use the general standard method based on the first-order expansion of at , and on the strong law of large numbers for martingales applied to correctly normalized. We consequently use the following conditions given in [17]:(i) is Lipschitz on ; that is, there exists a nonnegative -measurable function (where ), satisfying, for all , , a.s., (ii),(iii). Let us note that, in the frame of a general model , since is compact, (i) is satisfied as soon as has a first derivative in with , (ii) is satisfied for any optimal estimator in the sense of [13] and moreover could be weakened (see [17]), and (iii) is a necessary condition. First, for all and , , , where denotes the derivative of with respect to , which thanks to (2.17) is bounded on . Condition (i) is thus satisfied. Condition (ii) follows from (2.50). Condition (iii) comes from the fact that, for every and every such that , applying the mean value theorem to the -function , Let us show that the function is -integrable. For every , and , denoting (by continuity of on ), and , Hence, for all , , and applying (2.52) together with (2.22) we obtain that Let fixed. Since, for all , , the extreme value theorem implies that . Hence the right term in (2.55) is strictly positive, which together with (2.55) leads to (iii).
Let us now consider the asymptotic distribution of . For this purpose, we follow the steps of the proof of Proposition 6.1 in [17]. Writing the Taylor expansion of at we obtain that , for some , with . Since , we can write where . Let us first show that This is an application of (2.52) and (2.22), since, for all , , In view of (2.56), we now prove that Computing thanks to the formula , it appears that (2.59) is true, as soon as the following holds: Let us prove (2.60)–(2.62). First, (2.60) is given by a strong law of large numbers proved in [17, Proposition 5.1]. The latter can be indeed applied since (as an immediate consequence of the stronger result (2.57)), and since fulfills the required Lipschitz condition. Indeed we proved earlier that is continuous on the compact set and is thus bounded on , which implies that is bounded by a -measurable function. In view of (2.61), we consider the function and its derivative . Similarly as for (2.54), one can show that there exists a constant such that for all , and all , . This implies Consequently, which by (2.57) and the strong consistency of almost surely tends to . Writing this implies (2.61). It now remains to prove (2.62). We write which thanks to (2.57) and the strong consistency of implies (2.62). In view of (2.56), we finally want to prove that converges in distribution. For this purpose we make use of a central limit theorem for martingale difference arrays [18, 19]; if , is a sequence of square integrable martingales with associated Meyer process satisfying for some constant , and such that, for all , then . Let us define , for every . First, for any , . Second, hence is a sequence of square integrable martingales. Moreover, using inequalities (2.50) and (2.58), we obtain, by (2.22), So, by means of (2.52), Third, using Cauchy-Schwarz and Bienaymé-Chebyshev inequalities, Let us compute . We can show that the th central moment of the independent sum of a Poisson and a Bernoulli random variables equals , where and denote the th central moment of the Poisson and of the Bernoulli variable. If these variables have parameter and , respectively, then , , , and . We thus obtain Hence Since the highest power of involved in (2.73) is , and since by Proposition 2.4 the stationary distribution has finite second moments, we can apply (2.52) to (2.71) and obtain that It then ensues from the central limit theorem mentioned above that Finally, (2.56), together with (2.57), (2.59), (2.75), and Slutsky's theorem, implies that which leads to
It now remains to prove the asymptotic distribution (2.51). Thanks to what precedes, this result is immediate as soon as we prove that as well as the equivalent result for the numerator. For this purpose, we write and show that has a bounded derivative and is thus Lipschitz. We have indeed , hence for some constant . This enables us to write By the strong consistency of together with (2.52) and (2.22), (2.81) almost surely tends to zero. Combined with (2.52) and (2.69) in (2.79), this implies (2.78).

3. Application to Real Epidemics

3.1. Explicit Epidemic Model

Although the epidemic model (1.1) has a clear interpretation in terms of the disease propagation, it can be also proved [9] that it is the limit process, as the initial population size tends to infinity, of the incidence of infectives described by a thorough process in a large branching population, taking into account all the health states of the disease and describing in detail the infection and latent processes for each individual in the population. This detailed process is a multitype branching process with age and population size dependence, taking into account the variability of many individual factors such as the reproduction, the survival, the transmission of the disease, and the latent time. We assume a disease of the general SEIR type. We also assume a random latent period and, for the sake of simplicity, a duration of the state of one time unit only. Hence the incidence of infectives exactly corresponds to the incidence of cases.

The main assumptions for the construction of the limit model are, concerning the disease as follows: (i) at the initial time the disease is rare and the total population size is large; (ii) the infection via horizontal route is of Reed-Frost type, with the probability for a susceptible individual to become infected by a given dose of pathogens being inversely proportional to the total population size; (iii) the individual survival law is the same for and individuals and is independent of the population and of the time; (iv) the latent time law given the individual survival is independent of the time, the individual age, and the infectives population during the latent period. This last property is possible if we assume that overinfection during the incubation has a negligible effect on the latent time. We moreover assume that the whole healthy population size is relatively stable over time.

The limit process (1.1) is then obtained in an inductive way as the limit in distribution of as the initial total population size , where denotes the number of new infectives at time . More precisely, denoting by the incidence of exposed individuals at time and by the largest individual survival age, we obtain that , with and where for all , satisfies (1.2) in Section 1, with the latent period distribution given survival, , assumed independent of , being denoted by . Thus and similarly, Moreover, when assuming a relatively stable population size and denoting by the individual survival probability until age at least, then is equal to .

3.2. Illustration: The BSE Epidemic in Great Britain

In this section we provide an illustration of the methodology developed in Section 2 by studying the decay phase of the BSE epidemic in Great Britain, based on the observations of the yearly diagnosed cases from 1989 until 2011, with 1988 being the date of the main control measure. The disease that was first officially identified in 1986 [20] reached its peak in 1992 (36682 cases) and is obviously now in its decay phase. Since only a very few cases were recently reported [5], namely, 11 in 2010 and 5 in 2011, the spread of the disease should a priori “soon” come to an end. It has been accepted that the epidemic is fading out, with a very low level of risk for cattle and humans. However it might be interesting to have a more precise idea of the extinction phase of the disease, for instance, of its length and of its intensity.

3.2.1. Choice of the Epidemic Model

BSE is a transmissible disease through ingestion of prions (horizontal route) and through maternal route. It is commonly accepted (see, e.g., [21]) that the infectious phase (including the clinical state) lasts at most one year, while the latent state can be of several years. If one compares the epidemic data with the total population of around 9 million cattle in Great Britain, it is reasonable to assume that, even at the peak time, BSE may be considered as a rare disease in a large population. We moreover assume that the probability for an animal to become infected by a horizontal route follows a Reed-Frost type model as in (ii) of Section 3.1, and that the probability for a calf to become infected by its dam is nonnull and constant over time. Moreover, for the sake of simplicity and parsimony, we do not take into account potential heterogeneity factors such as the different regulations from 1989 (the main regulation was the feed ban of July 1988 and was shown to be quite efficient [22]), the different types of breeding and of races, the age for animals older than one year, and the evolution of the surveillance system and of diagnosis tests. So the process defined by (3.1)–(3.4) appears relevant to model the spread of this fatal disease after the 1988 ban, where the infectious state corresponds to the end of the incubation period and the clinical state, and corresponds to the death (assumed to be either by routine slaughtering for and cattle, and control slaughtering for cattle). Choosing a time step of one year, then represents the yearly incidence of notified cases, which corresponds to the available data provided by the World Organisation for Animal Health [5].

3.2.2. Choice of the Model Parameters

In order to predict the future epidemic evolution, we need to evaluate all the parameters involved in (3.3). These parameters were already studied in previous works [2225] using different models, amounts of observations, and kinds of estimators. Since it might have a crucial role in the decay phase of the disease, we focus here on the infection parameter from 1989 via a horizontal route of transmission (i.e., the mean number per infective and per year of newly infected). Until the 1988 feed ban regulation, the main routes of transmission of BSE were horizontal via protein supplements (Meat and Bone Meal, milk replacers), and maternal from a dam to its calf. Since a previous statistical study [22] concluded to the full efficiency of the 1988 ban and since most of cattle are slaughtered before the age of 10 years, the fact that cases of BSE are still observed more than 20 years later could suggest the existence of a remaining source of infection, either via a maternal transmission route, or via a horizontal one, for example, via the ingestion of excreted prions from alive infected animals, or from prions left in the environment. The prediction of the future disease spread thus strongly relies on the intensity of this infection, quantified by the parameter . We provide an estimation of together with a confidence interval, which will be used for prediction estimations. We provide in addition a sensitivity analysis to the other parameters of the model, which are chosen as follows. First, the maximal age of the individuals in the population is set to , which corresponds to the largest reported survival age with a nonnegligible probability. Next, we set , which is the largest order of magnitude commonly accepted for the maternal infection probability [23, 25]. Following a previous work, [22], we assume here a discretized Weibull distribution with parameters for the distribution of the latent period; that is to say, for each , , where is a shape parameter, and is the mode of the probability density of the corresponding continuous Weibull distribution. The parameters and have a very bad identifiability with the infection parameters on a sole monotonous phase and are consequently estimated in [22] on the whole epidemic series (growth and decay), by Bayesian maximum a posteriori (MAP) estimations. We set and . Since the death of susceptible and exposed animals is mostly due to routine slaughtering, the survival probability (probability for an apparently healthy animal to survive at least until years) corresponds to the probability of being slaughtered after the age of years. It is derived from existing literature [24].

3.2.3. Estimation of the Infection Parameter

Our study is based on the yearly number of cases of BSE reported in Great Britain from 1989 until 2011 (Figure 1) provided by the World Organisation for Animal Health [5], denoted by for each year . We set and denote by the -dimensional vector . We choose the first time of the epidemic model such that the model with its initial values covers the period starting from 1989, which we consider here as a time-homogeneous period. Hence , and corresponds to the year . We estimate by the WCLSE (2.30) in model (1.1)–(3.3), where are given by the previous values, and where . According to [5], . We are thus close to the asymptotic . The number of observations is . The estimator (2.30) provides the estimation . We point out that this estimation is of the same order of magnitude as the maximum a posteriori Bayesian estimation based on the whole epidemic until 2007, assuming a uniform prior probability [22]. Using (2.32) we obtain the following confidence interval with asymptotic probability , where , , and . Assuming , we get (observed value of ), and Although this confidence interval is an asymptotic one, as , it is a very good approximation of the true confidence interval for a finite , since is here very large. Since the estimation of relies on the values given to the other model parameters , we evaluate in addition its sensitivity to the values of these parameters. For this purpose, we compute the estimation of and the associated confidence interval with asymptotic probability , for different values of (Table 1). The first line of the table corresponds to the parameters chosen for the model. In each of the four following lines, we fix two coordinates and choose an extremal (unrealistic) value for the third one. It appears that the estimation of is almost independent of the value of the maternal infection parameter. However, the estimation seems more strongly dependent on the parameters of the latent period distribution. Nevertheless, even for very unrealistic values , all the estimations of remain in the same order of magnitude of several units. This is really small compared to estimations obtained for the infection via Meat and Bone Meal or lactoreplacers (before 1989) which are of the order of [22]. However, although these estimations are all very small, seems nonnull. This could suggest the existence of a minor but nonnull infection source which is not of maternal type.

3.2.4. Extinction of the Epidemic

We know thanks to Proposition 2.1 that becomes extinct almost surely if and only if . The estimated basic reproduction is here . Moreover, solving with a computing program the equation , we obtain the following value for the Perron's root , which provides the speed of decay of the expected yearly incidence of cases (see (2.6)); from a certain time, the expected number of new cases will decrease from around 33% every year.

3.2.5. Prediction of the Incidences of Cases and Incidences of Infected Cattle

Let us predict the spread of the disease from 2012 by means of simulations of , where is replaced by its previous estimation , and where the initial time of the model is 2011, that is, . The simulations are done recursively using the transition law (3.1). We point out that the model initialized by provides quite realistic simulations on the period 1998–2011 compared to the real observations on the same period, as illustrated in Figure 2(a). The epidemic process thus seems to provide a satisfying prediction of the overall evolution of the real epidemic. In order to predict the incidences of futures cases, we simulate 1000 trajectories of initialized by the observed values , with the estimated infection parameter . We illustrate in Figure 2(c), for each year from 2012, the maximum, minimum, median, , and quantiles associated with these 1000 realizations. It is also relevant to study and predict the evolution of the incidence of infected cattle in the population, which represents the hidden face of the epidemic. The incidence of infected cattle at time , conditionally on the number of cases at that time, is given by the Poisson distribution (3.2). For every and for each of the 1000 previously simulated values , we generate one realization of . We then illustrate in Figure 2(d), the yearly maximum, minimum, median, , and quantiles associated with the 1000 realizations.

3.2.6. Prediction of the Year of Extinction

Let denote the extinction year of the epidemic process . According to (2.7) and denoting by the offspring generating function of defined in (2.2) (from now on we let the dependence in appear in the notation), we have , for every , which by iterating can be computed explicitly. We obtain in particular, for the estimated value , the following -quantiles for the extinction time (see (2.8)): , and . Keeping in mind that corresponds to the complete extinction of the epidemic (i.e., consecutive years without any case), these results actually mean that for the infection parameter , with probability larger than (resp., and ), no case will arise in the population from year 2020 (resp., 2027 and 2031). Moreover, in order to take into account the uncertainty around the estimation of the infection parameter , we make use of the asymptotic confidence interval (3.5) of and of the fact that is a decreasing function of , which implies that for every , We collect in Table 2 the observed interval , for each (if we have, conditionally on the initial value , because of the memory which is not equal to 0). Note that these intervals are very narrow, leading to an accurate estimation of .

3.2.7. Prediction of the Epidemic Size

Let be the total size of the future epidemic from (total number of cases from 2012 until the extinction of the epidemic). We compute the distribution of using (2.11), conditionally on the event . We obtain in particular, for the estimated value , the following -quantiles for the total epidemic size (see (2.12)): , and . From (2.13) we deduce that the mean value and variance of for the parameter (resp., ) are 21 and 27 (resp., 22 and 28). Moreover, we obtain similarly as for the extinction time and thanks to (3.5) a confidence interval of , for every , with asymptotic probability 95%, collected in Table 3.

3.2.8. Study of the Very Late Extinction Case

In order to predict the behavior of the “most dangerous” evolution of the epidemic, we first need to compute the estimation (2.47) based on the data in Great Britain. The number of available observations is only , so we are far from the asymptotic setting of Theorem 2.7. However, the large value of can make us hope for a good accuracy. We point out that, by making use of the estimator on the real data, we make an unverifiable assumption on the future of the epidemic; we consider the data as if they were the beginning of a trajectory with very late extinction. This should have the following consequence: the estimation provided by should a priori be a bit smaller than the value 2.4324 provided by . Indeed we obtain . Using this value, we deduce from (2.51) the confidence interval of with asymptotic probability 95%, where , , and Therefore, , and which is of the same magnitude order as the confidence interval obtained with the unconditioned process (see (3.5)). Let us predict the most dangerous evolution thanks to the transition law of the conditioned process given by Proposition 2.2. Note that if one computes the eigenvalues and introduced in Proposition 2.5, one obtains and . It thus appears that the convergence of the epidemic process conditioned on nonextinction at time to the -process as is not very fast. As a consequence the study of the -process only provides information about the behavior of the disease spread in the case of an extremely late extinction. First, we see thanks to Figure 3(a) that the simulations provided by the conditioned process initialized by , and where is estimated by , are true to the real observations on the period 1999–2011. Figure 3(b) is an example of one simulation in the period 2012–2040 of the conditioned process, for . It appears that the values of this simulated trajectory are rapidly very small and of course are never equal to for consecutive times. For a finer prediction, we simulate 1000 realizations of this process from 2012 until 2050, with . Moreover, for every and for each of the 1000 simulated values , we make one realization of the incidence of infected cattle at time , according to the law given by (3.2). Figures 3(c) and 3(d) represent the yearly maximum, minimum, median, , and quantiles associated with the 1000 realizations of, respectively, the incidence of cases and infected cattle, in case of an extreme late extinction. It appears thanks to Figures 3(c) and 3(d) that the supposedly “most dangerous” trajectories nevertheless do not reach high values and do not present a new peak of epidemic.

4. Conclusion

The methodology developed in Section 2 thus applies well here and enables not only to confirm mathematically what is commonly accepted, namely, that BSE is fading out, but also to predict, with a very large probability, that the last BSE case will occur before 2027, and that until the complete extinction of the epidemic in the population, there will be less than 31 cases to come. We obtain moreover from Figure 2(d) the order of magnitude of the number of infected cattle in the population. In addition, the estimation of the infection parameter concludes to the possible existence of a minor but nonnull infection source which is not of maternal type, and which is very small (only around 3 newly infected animal per year and per infective) compared to the main source of horizontal infection until 1988 due to protein supplements. Finally, the study of the worst-case scenario shows that even in the case of an extreme late extinction of the disease in the population, the incidence of cases will decrease quite rapidly to 0, with afterwards only 1 or 2 yearly cases occurring regularly but sparsely, with no appearance of a new peak of epidemic. We have shown with this example that the methodology developed in Section 2 provides accurate tools to study the decay phase of an epidemic under the current sanitary measures, which would help to make new policy decisions. This evaluation is all the more relevant since it is obtained not by simply computing what should most probably happen, but also by taking into account the variability of many factors (infection, incubation, and survival), and by studying the potentially most dangerous evolution.