The basic reproductive ratio, 𝑅0, is one of the fundamental concepts in mathematical biology. It is a threshold parameter, intended to quantify the spread of disease by estimating the average number of secondary infections in a wholly susceptible population, giving an indication of the invasion strength of an epidemic: if 𝑅0<1, the disease dies out, whereas if 𝑅0>1, the disease persists. 𝑅0 has been widely used as a measure of disease strength to estimate the effectiveness of control measures and to form the backbone of disease-management policy. However, in almost every aspect that matters, 𝑅0 is flawed. Diseases can persist with 𝑅0<1, while diseases with 𝑅0>1 can die out. We show that the same model of malaria gives many different values of 𝑅0, depending on the method used, with the sole common property that they have a threshold at 1. We also survey estimated values of 𝑅0 for a variety of diseases, and examine some of the alternatives that have been proposed. If 𝑅0 is to be used, it must be accompanied by caveats about the method of calculation, underlying model assumptions and evidence that it is actually a threshold. Otherwise, the concept is meaningless.

1. Introduction

The basic reproductive ratio—also known as the basic reproductive number, the basic reproduction number, the control reproduction number, or 𝑅0—is one of the foremost concepts in epidemiology [1–3]. 𝑅0 is the most widely used epidemiological measurement of the transmission potential in a given population [4]. It is a measure of initial disease spread, such that if 𝑅0>1, then the disease can invade an otherwise susceptible population and hence persist, whereas if 𝑅0<1, the disease cannot successfully invade and will die out. The concept is defined as the number of secondary infections produced by a single infectious individual in an otherwise susceptible population [5].

Despite its place at the forefront of mathematical epidemiology, the concept of 𝑅0 is deeply flawed. Defining 𝑅0 proves to be significantly more difficult than it appears. Few epidemics are ever observed at the moment an infected individual enters a susceptible population, so calculating the value of 𝑅0 for a specific disease relies on secondary methods. There are many methods to calculate 𝑅0 from mathematical models, few of which agree with each other and few of which produce the average number of secondary infections. Methods to calculate 𝑅0 from theoretical models include the survival function, the next-generation method, the eigenvalues of the Jacobian matrix, the existence of the endemic equilibrium, and the constant term of the characteristic polynomial. 𝑅0 can also be estimated from epidemiological data via the number of susceptibles at endemic equilibrium, the average age at infection, the final size equation and calculation from the intrinsic growth rate. For an overview, see Heffernan et al. [2].

Furthermore, there are many diseases that can persist with 𝑅0<1, while diseases with 𝑅0>1 can die out, reducing the utility of the concept as a threshold. 𝑅0 is also used as a measure of eradication for a disease that is endemic, but issues such as backward bifurcations, stochastic effects, and networks of spatial spread mean that an invasion threshold does not necessarily coincide with a persistence threshold. This results in a reduction of the usefulness of 𝑅0. For example, it is possible that a disease can persist in a population when already present but would not be strong enough to invade. Finally, the threshold value that is usually calculated is rarely the average number of secondary infections, diluting the usefulness of this concept even further.

In this paper, we outline the problems with 𝑅0 and examine a number of alternatives that have been proposed. We include a worked example of malaria to demonstrate the many different results that the various methods give for the same model. Finally, we survey some of the recent uses of 𝑅0 in the literature. The number of articles that use 𝑅0 likely numbers in the tens of thousands, so an exhaustive review is not feasible. We have restricted ourselves to articles published since 2005 and which include interesting or novel explorations of 𝑅0.

2. Methods for Calculating 𝑅0

In this section, we identify some of the more popular methods (although by no means all) used to calculate 𝑅0. We also describe the limitations that each method presents and demonstrate one of the core problems with 𝑅0. Specifically, we address a key problem with 𝑅0: how do biologists make sense of it from mathematical models? (See, for example, the puzzled discussion in van den Bosch et al. [6].)

Although the “𝑅” in 𝑅0 is derived from “reproductive”, based on the original formulation of the concept as the average number of secondary infections, many thresholds have been denoted by “𝑅0”, even when they are not related to the average number of secondary infections. Thus, in keeping with the notation, we will use the notation 𝑅0,𝑋 to denote an 𝑅0-like surrogate associated with a particular method, symbolised by 𝑋.

2.1. The Survival Function

The survival function is given by 𝑅0,𝑆=∞0⎛⎜⎜⎜⎜⎜⎝theaveragenumberofsusceptibleindividualsthataninfectedindividualnewlyinfectsperunittimewheninfectedfortotaltimeğ‘ŽâŽžâŽŸâŽŸâŽŸâŽŸâŽŸâŽ Ã—âŽ›âŽœâŽœâŽœâŽœâŽœâŽtheprobabilitythatanewlyinfectedindividualremainsinfectiousforatleasttimeğ‘ŽâŽžâŽŸâŽŸâŽŸâŽŸâŽŸâŽ ğ‘‘ğ‘Ž(1)

The survival function has the advantage that it always produces the average number of secondary individuals infected by a single infected individual, in the same class. Thus, in Figure 1, where one human infects two mosquitoes, who each infect three humans, the survival function produces 𝑅0=6. This is the number of humans infected by a single infected human via mosquitoes; or, equivalently, the number of mosquitoes infected by a single mosquito via humans. The survival function is a generalised method of calculating the basic reproductive ratio that is not restricted to ODEs.

However, determining the individual probabilities can be cumbersome, especially if multiple states are involved. For a vector-borne infection such as malaria, with two infection states (human and mosquito), calculating the first probability involves determining the probability that a human infected at time 0 exists at time 𝑡, the probability that a human infected for total time 𝑡 infects a mosquito and the probability that an infected mosquito lives to be age ğ‘Žâˆ’ğ‘¡, where 0â‰¤ğ‘¡â‰¤ğ‘Ž [2]. For diseases with more states, such as Guinea Worm disease, where there is a waterborne parasite, which can attach itself to copepods, which in turn are ingested by humans and which subsequently grow into an internal nematode, the calculations of the survival probabilities become unwieldy.

Thus, although this method always produces the correct 𝑅0, in practice, it is difficult to use. This is especially true for models with sufficient complexity, which are often those encountered most frequently.

2.2. The Jacobian

The Jacobian matrix is used to linearise a nonlinear system of differential equations. Around the disease-free equilibrium, the linear system will have the same stability properties as the nonlinear system if it is hyperbolic; that is, if no eigenvalues have zero real part. In particular, if all eigenvalues have negative real part, then the equilibrium is stable, whereas if there is an eigenvalue with positive real part, the equilibrium is unstable.

It follows that a threshold is 𝜆max=0, where 𝜆max is the largest eigenvalue of the Jacobian matrix (or the largest real part if the eigenvalues are complex). However, for a system of 𝑛 differential equations, this requires solving an 𝑛th order polynomial, which may be impossible. Furthermore, rearranging the condition 𝜆max=0 to produce a threshold 𝑅0,𝐽=1 is not a unique process and does not always produce the average number of secondary infections.

2.3. Constant Term of the Characteristic Polynomial

When 𝜆max=0, the constant term of the characteristic polynomial will be zero. However, the reverse is not true, as the polynomial could have both zero and positive roots. If the characteristic polynomial is 𝜆𝑛+ğ‘Žğ‘›âˆ’1𝜆𝑛−1+⋯+ğ‘Ž1𝜆+ğ‘Ž0=0,(2) then ğ‘Ž0=0 is a threshold if ğ‘Žğ‘—â‰¥0 for all 𝑗. More generally, the Routh-Hurwitz condition allows the coefficients to take on other signs, under certain restrictions, but the nonconstant coefficients all being positive is a sufficient condition. Another sufficient condition is that ğ‘Žğ‘—â‰¥0 under the constraint ğ‘Ž0=0 (so that the largest eigenvalue at ğ‘Ž0=0 is 0).

This method is significantly easier to use than finding the largest eigenvalue, although verifying that ğ‘Ž0=0 necessarily corresponds to the largest eigenvalue can become complicated for some models. However, similar to the above, rearranging ğ‘Ž0=0 to produce 𝑅0,𝐶=1 is not a unique process and does not always produce the average number of secondary infections.

2.4. The Next-Generation Method

The next-generation method, developed by Diekmann et al. [8] and Diekmann and Heesterbeek [9], and popularised by van den Driessche and Watmough [10], is a generalisation of the Jacobian method. It is significantly easier to use than Jacobian-based methods, since it only requires the infection states (such as the exposed class, the infected class and the asymptomatically infected class) and ignores all other states (such as susceptible and recovered individuals). This keeps the size of the matrices relatively manageable.

However, the next-generation method suffers from a lack of uniqueness. In order to determine the matrices 𝐹 and 𝑉 (where 𝐹 accounts for the “new” infections and 𝑉 accounts for the transfer between infected compartments), biological insight must be used in order to decide which terms count as “new” infections and which terms are transfer terms. While this may seem intuitive for most models, it is easy to construct a counterexample ğ¼î…ž=𝛽𝑆𝐼−𝜇𝐼=𝛽𝑆𝐼+5𝐼−5𝐼−𝜇𝐼.(3) Here, the term +5𝐼 might represent, for example, new infections arising from vertical transmission, whereas the term −5𝐼 might represent a disease-specific death rate. Although this construction is clearly arbitrary, it demonstrates that identifying “new” infections is not a unique process and relies on the modeller's judgement.

We would then have 𝐹=𝛽𝑆+5,𝑉=5+𝜇,(4) and thus 𝑅0,5=𝛽𝑆+55+𝜇.(5) This has the same threshold property as 𝑅0,𝑁=𝛽𝑆/𝜇, but is clearly not the average number of secondary infections. In essence, the next-generation method is a mathematical generalisation of putting the negative values on one side and dividing (this is what 𝑉−1 is) so that the eigenvalue threshold at zero is transformed into an 𝑅0-like threshold at one. However, as we have seen, this does not produce a unique result.

van den Driessche and Watmough [10] note that other decompositions of 𝐹 and 𝑉 can be chosen, which lead to different values for 𝑅0,𝑁. They claim that only one choice of 𝐹 is epidemiologically correct. However, this is not true, as the above example shows. Furthermore, it means that the definition of 𝑅0 relies upon the judgement of the modeller as to what “epidemiologically correct” means.

Furthermore, the next-generation method does not produce the number of humans infected by a single human if there is an intermediate host, but rather the geometric mean of the number of infections per generation.

For example, consider a mosquito-borne disease where humans infect two mosquitoes, while mosquitoes infect three humans, as shown in Figure 1. For convenience, label these 𝑅𝐻=2 and 𝑅𝑀=3. Then the number of humans infected from a primary human (via mosquitoes) is 𝑅0=2×3=6. (This is also the value calculated by the survival function.)

However, the next-generation method would calculate 𝑅0,𝑁=√6, which is a weighted average (2<√6<3) of the number of infectives each individual produces in the next infection event. While mathematically sound, it is questionable whether this is biologically meaningful.

This example could be extended. Consider a three-stage disease, such as tularemia, where ticks may transmit between humans and livestock, but humans may also be infected by eating livestock. Suppose that a single human directly infects two ticks (so 𝑅𝐻=2), each tick infects four animals (so 𝑅𝑇=4) and each animal infects three humans (so 𝑅𝐴=3). Then, a single human has resulted in 𝑅0=2×4×3=24(6) infected humans. However, 𝑅0,𝑁=3√24≈2.88, since there are three infection stages. As the number of infection stages increase, 𝑅0,𝑁 becomes a progressively higher surd.

The next-generation method is likely the most frequently used method to calculate 𝑅0. It has been used extensively to calculate 𝑅0-like values from host-vector models (see Wonham et al. [11], Gaff et al. [12] or Gubbins et al. [13]). However, it does not produce the number of newly infected individuals in the same infection class and does not always produce the average number of secondary infections.

2.5. The Graph-Theoretic Method

In de Camino-Beck et al. [14], a graph-theoretic method for calculating 𝑅0 is given. Starting from the definition of 𝑅0=𝜌(𝐹𝑉−1), they derived a series of rules for reducing the digraph associated with 𝐹𝜆−1−𝑉 to a digraph with zero weight, from which 𝜆=𝑅0. The rules are as follows.

Rule. To reduce the loop âˆ’ğ‘Žğ‘–ğ‘–<0 to −1 at node 𝑖, every arc entering 𝑖 has weight divided by ğ‘Žğ‘–ğ‘–.

Rule. For a trivial node 𝑖 on a path 𝑗→𝑖→𝑘, the two arcs are replaced by 𝑗→𝑘 with weight equal to the product of weights on arc 𝑗→𝑖 and 𝑖→𝑘. Weights on multiple arcs 𝑗→𝑘 are added.

The graph-theoretic method has the advantage that it avoids calculation of 𝑉−1, which may be complicated for large systems. However, it always produces the same threshold value as the next-generation method.

They considered the simple vector-host model 𝑑𝐼𝑑𝑡=𝛽𝑠𝑆𝑊−(𝑏+𝛾)𝐼,𝑑𝑊𝑑𝑡=𝛽𝑚𝑀𝐼−𝑐𝑊,𝑑𝑆𝑑𝑡=𝑏−𝑏𝑆+𝛾𝐼−𝛽𝑠𝑆𝑊,𝑑𝑀𝑑𝑡=𝑐−𝑐𝑀−𝛽𝑚𝑀𝐼,(7) where 𝐼, 𝑊, 𝑆, and 𝑀 are proportions of infective hosts, infective vectors, susceptible hosts, and susceptible vectors, respectively, 𝑏 is the host birth and death rate, 𝛾 is the host recovery rate, 𝑐 is the vector birth and death rate, and 𝛽𝑠 and 𝛽𝑚 are disease transmission coefficients. The disease-free equilibrium is (0,0,1,1)𝑇. They found matrices 𝐹=⎛⎜⎝0𝛽𝑠𝛽𝑚0⎞⎟⎠,𝑉=âŽ›âŽœâŽğ‘+𝛾00ğ‘âŽžâŽŸâŽ .(8) From this, they produced a digraph of 𝐹𝜆−1−𝑉 and concluded that 𝑅0=î„¶î„µâŽ·ğ›½ğ‘šğ›½ğ‘ ğ‘(𝑏+𝛾).(9)

However, this method still contains the same issues as the next-generation method. The 𝑅0 value calculated is not the number of humans infected by a single human, but rather the (less biologically meaningful) geometric mean of the number of humans infected by the vector and the number of vectors infected by a human. Furthermore, the creation of 𝐹 and 𝑉 is not a unique process, as we have seen above, and does not always produce the average number of secondary infections.

For example, consider the mathematically equivalent model 𝑑𝐼𝑑𝑡=𝛽𝑠+9𝑆𝑊−9𝑆𝑊−(𝑏+𝛾)𝐼,𝑑𝑊𝑑𝑡=𝛽𝑚𝑀𝐼−𝑐𝑊,𝑑𝑆𝑑𝑡=𝑏−𝑏𝑆+𝛾𝐼−𝛽𝑠𝑆𝑊,𝑑𝑀𝑑𝑡=𝑐−𝑐𝑀−𝛽𝑚𝑀𝐼,(10) with matrices 𝐹=⎛⎜⎝0𝛽𝑠+9𝛽𝑚0⎞⎟⎠,𝑉=âŽ›âŽœâŽğ‘+𝛾90ğ‘âŽžâŽŸâŽ .(11)

The graph reduction then proceeds as in Figure 2. It follows (either by the graph-reduction method or using the conventional next-generation method) that 𝑅0=12⎡⎢⎢⎣9𝛽𝑠+9𝑐(𝑏+𝛾)+⎷81𝛽𝑠+92𝑐2(𝑏+𝛾)2+4𝛽𝑚𝛽𝑠+9𝑐(𝑏+𝛾)⎤⎥⎥⎦.(12) Thus, the graph-theoretic method inherits the existing problems from the next-generation method.

2.6. Existence of the Endemic Equilibrium

𝑅0 can also be calculated from the endemic equilibrium, in the case where there is a bifurcation at 𝑅0=1 such that the endemic equilibrium does not exist for 𝑅0<1. The existence of the endemic equilibrium is, thus, a threshold that can be rearranged to produce an 𝑅0-like surrogate.

However, for many models, calculating the endemic equilibrium can be quite difficult. Furthermore, in the case of a backward bifurcation, the endemic equilibrium still exists for 𝑅0<1 (in fact, two endemic equilibria exist in this case, one of which is stable and the other unstable). It follows that the endemic equilibrium is not a useful general method for calculating 𝑅0 and does not always produce the average number of secondary infections.

van den Bosch et al. [6] use the endemic equilibrium to determine a threshold given by ̂‌𝐼=(1/𝛼)(ğœ‡ğ‘žâˆ’ğœ‡+𝛼𝐾) so that ̂‌𝐼>0 if ğœ‡ğ‘žâˆ’ğœ‡+𝛼𝐾>0 (where ̂‌𝐼 represents infected plants at the endemic equilibrium, 𝛼 is the transmissibility, 𝜇 is the death rate, ğ‘ž is the fraction of infectious seeds, and 𝐾 is the total plant population density). They show that two different rearrangements of this inequality give 𝛼𝜇(1âˆ’ğ‘ž)𝐾>1or𝛼𝜇𝐾+ğ‘ž>1,(13) and note that either would suffice as an 𝑅0 value, but were unable to resolve the question of which was appropriate.

2.7. Summary of 𝑅0 Methods

In summary, there are many methods available for calculating 𝑅0, but few of them agree with each other and almost none reliably calculate the average number of secondary infections in a wholly susceptible population. The only method which does is the survival function, but this method is too cumbersome to be used except for the simplest of models.

3. A Worked Example

In this section, we take a sample model and apply the various methods for calculating 𝑅0 to it. We show that each method can produce a different result, all of which have the property that they have an outbreak threshold at 𝑅0=1 but otherwise bear little relation to one another.

Consider the following model for malaria:ğ»î…žğ‘†=Λ𝐻−𝛽𝑀𝐻𝑀𝐼𝐻𝑆−𝜇𝐻𝐻𝑆,ğ»î…žğ¼=𝛽𝑀𝐻𝑀𝐼𝐻𝑆−𝜇𝐻+ğœŽî€¸ğ»ğ¼,ğ‘€î…žğ‘†=Λ𝑀−𝛽𝐻𝑀𝑀𝑆𝐻𝐼−𝜇𝑀𝑀𝑆,ğ‘€î…žğ¼=𝛽𝐻𝑀𝑀𝑆𝐻𝐼−𝜇𝑀𝑀𝐼.(14) Humans may be susceptible or infected (𝐻𝑆 and 𝐻𝐼, resp.), while mosquitoes may be susceptible or infected (𝑀𝑆 and 𝑀𝐼, resp.). The birth rates are Λ𝐻 for humans and Λ𝑀 for mosquitoes. The background death rates are 𝜇𝐻 and 𝜇𝑀 for humans and mosquitoes, respectively, while the disease-specific death rate for humans is ğœŽ. The transmission rate from mosquitoes to humans is 𝛽𝑀𝐻, while the transmission rate from humans to mosquitoes is 𝛽𝐻𝑀. See Figure 3.

The Jacobian matrix at the disease-free equilibrium is 𝐽=âŽ¡âŽ¢âŽ¢âŽ¢âŽ¢âŽ¢âŽ¢âŽ£âˆ’ğœ‡ğ»00−𝛽𝑀𝐻𝐻𝑆0âˆ’ğœ‡ğ»âˆ’ğœŽ0𝛽𝑀𝐻𝐻𝑆0−𝛽𝐻𝑀𝑀𝑆−𝜇𝑀00𝛽𝐻𝑀𝑀𝑆0âˆ’ğœ‡ğ‘€âŽ¤âŽ¥âŽ¥âŽ¥âŽ¥âŽ¥âŽ¥âŽ¦,(15) where 𝐻𝑆 and 𝑀𝑆 are the equilibrium values of susceptible humans and mosquitoes, respectively. Thus, the nontrivial part of the characteristic polynomial satisfies 𝜆2+𝜇𝐻+ğœŽ+𝜇𝑀𝜆+𝜇𝑀𝜇𝐻+ğœŽî€¸âˆ’ğ›½ğ»ğ‘€ğ›½ğ‘€ğ»ğ»ğ‘†ğ‘€ğ‘†=0.(16)

Using the Jacobian method, the largest eigenvalue is 𝜆max=12î‚¸âˆ’ğœ‡ğ»âˆ’ğœŽâˆ’ğœ‡ğ‘€+î”î€·ğœ‡ğ‘€âˆ’ğœŽâˆ’ğœ‡ğ»î€¸2+4𝛽𝐻𝑀𝛽𝑀𝐻𝐻𝑆𝑀𝑆.(17) Rearranging 𝜆max=0, we have𝑅0,𝐽=î„¶î„µî„µâŽ·î€·ğœ‡ğ‘€âˆ’ğœŽâˆ’ğœ‡ğ»î€¸2+4𝛽𝐻𝑀𝛽𝑀𝐻𝐻𝑆𝑀𝑆𝜇𝑀+ğœŽ+𝜇𝑀2.(18)

Conversely, since the nonconstant coefficients of the characteristic polynomial are positive, all eigenvalues will be negative if the constant term of the characteristic polynomial is positive, whereas there will be an eigenvalue with positive real part if the constant term is negative. Rearranging, we have𝑅0,𝐶=𝛽𝑀𝐻𝛽𝐻𝑀𝐻𝑆𝑀𝑆𝜇𝐻+ğœŽî€¸ğœ‡ğ‘€.(19)

This is not the only rearrangement possible, so, like the nonuniqueness of the next-generation method, it is possible to rearrange by adding and subtracting arbitrary constants. Thus, for example,𝑅0,9=𝛽𝑀𝐻𝛽𝐻𝑀𝐻𝑆𝑀𝑆+9𝜇𝐻+ğœŽî€¸ğœ‡ğ‘€+9(20) is also a measure of disease spread with the property that the disease persists if 𝑅0,9>1 and is eradicated if 𝑅0,9<1.

Other rearrangements are possible, so long as the threshold at 0 is transformed into a threshold at 1. Thus,𝑅0,𝑒=exp𝛽𝑀𝐻𝛽𝐻𝑀𝐻𝑆𝑀𝑆−𝜇𝐻+ğœŽî€¸ğœ‡ğ‘€î‚(21) has the same threshold property. A generalised version of this formulation was used in Smith? et al. [15] called 𝑇0.

The endemic equilibrium satisfies 𝑀𝐼=𝛽𝐻𝑀𝑀𝑆𝐻𝐼𝜇𝑀,𝑀𝑆=Λ𝑀𝛽𝐻𝑀𝐻𝐼+𝜇𝑀,𝐻𝑆=𝜇𝑀+ğœŽî€¸ğœ‡ğ‘€î‚€ğ›½ğ»ğ‘€îğ»ğ¼+𝜇𝑀𝛽𝑀𝐻𝛽𝐻𝑀Λ𝑀,𝐻𝐼=Λ𝐻Λ𝑀𝛽𝑀𝐻𝛽𝐻𝑀−𝜇𝐻𝜇𝐻+ğœŽî€¸ğœ‡2𝑀Λ𝑀𝛽𝑀𝐻𝛽𝐻𝑀𝜇𝐻+ğœŽî€¸+𝛽𝐻𝑀𝜇𝐻𝜇𝑀𝜇𝐻+ğœŽî€¸.(22) It follows that there will be a biologically meaningful endemic equilibrium if 𝐻𝐼>0, or 𝑅0,𝐸=𝛽𝑀𝐻𝛽𝐻𝑀𝐻𝑆𝑀𝑆𝜇𝐻+ğœŽî€¸ğœ‡ğ‘€>1,(23) since 𝑀𝑆=Λ𝑀/𝜇𝑀 and 𝐻𝑆=Λ𝐻/𝜇𝐻. As above, this is not the only rearrangement possible.

The next-generation matrices are 𝐹=⎡⎢⎢⎣0𝛽𝑀𝐻𝐻𝑆𝛽𝐻𝑀𝑀𝑆0⎤⎥⎥⎦,𝑉=âŽ¡âŽ¢âŽ£ğœ‡ğ»+ğœŽ00ğœ‡ğ‘€âŽ¤âŽ¥âŽ¦,(24) since the next-generation method only considered the two infected classes. Then, using the next-generation method,𝑅0,𝑁=î„¶î„µî„µâŽ·ğ›½ğ‘€ğ»ğ›½ğ»ğ‘€ğ»ğ‘†ğ‘€ğ‘†î€·ğœ‡ğ»+ğœŽî€¸ğœ‡ğ‘€.(25)

Thus, for the same model, 𝑅0,𝑁, 𝑅0,𝐽, 𝑅0,𝐶, 𝑅0,9, and 𝑅0,𝑒 are all distinct. Although 𝑅0,𝐸=𝑅0,𝐶 in this case, this does not hold in general. Note that 𝑅0,𝐶 and 𝑅0,𝐸 produce the number of infected humans per infected human (or, equivalently, the number of infected mosquitoes per infected mosquito), but this is also not necessarily true in general. Figure 4 illustrates how these 𝑅0-like expressions vary with ğœŽ, when all other parameters are held constant.

Anecdotally, we have found that most biologists believe that there is one and only one 𝑅0 value for a given disease and, in order to verify an 𝑅0 is correct, all that is required is to check that it has a threshold at 1. We have demonstrated here that 𝑅0 is not a unique concept. Indeed, it is straightforward to construct simple variations that satisfy the threshold criterion at 1. Furthermore, since multiple methods calculate multiple thresholds, it follows that the vast majority of them cannot be the average number of secondary infections. However, the nonuniqueness is not the only problem associated with the concept, as we will demonstrate.

4. Backward Bifurcations

Backward bifurcations occur when multiple stable equilibria coexist for 𝑅0<1. This presents a serious complication when a disease is already endemic, since lowering the basic reproduction number below 1 may no longer be a viable control measure; hence, different prevention and control measures may have to be considered. In particular, a backward bifurcation makes the system more complicated, since the behaviour now depends on the initial conditions.

A backward bifurcation at 𝑅0=1 may result in persistence of the disease when 𝑅0<1. In this case, the disease will always persist for 𝑅0>1. However, there is a point ğ‘…ğ‘Ž<1 such that the endemic equilibrium exists for ğ‘…ğ‘Ž<𝑅0<1 and a third, unstable, equilibrium also exists between the two stable equilibria. Hence, an endemic disease is only eradicated if 0<𝑅0<ğ‘…ğ‘Ž. For ğ‘…ğ‘Ž<𝑅0<1, the outcome depends on initial conditions. If the disease is still in its early stages—that is, if the initial conditions are sufficiently small—then the system will approach the disease-free equilibrium and the disease will be eradicated. However, if the initial conditions are large, then the system will approach the endemic equilibrium and the disease will persist. Thus, a backward bifurcation prevents the system from switching to the disease-free equilibrium as soon as 𝑅0<1 is reached. See Figure 5(a).

Several mechanisms have been shown to lead to backward bifurcation in epidemic models, but backward bifurcations in compartmental models have only recently attracted serious research attention.

Gómez-Acevedo and Li [16] investigated a mathematical model for human T-cell lymphotropic virus type I (HTLV-I) infection of CD4+ T cells that incorporates both horizontal transmission through cell-to-cell contact and vertical transmission through mitotic division of infected T cells. They assumed that a fraction ğœŽ of the infected cells survive the immune system attack after the error-prone viral replication. Under the biologically sound assumptions that the fraction ğœŽ should be very low and the rate of the mitotic division should be high, their model has a bifurcation that predicts persistent infection for an extended range of the basic reproduction 𝑅0>𝑅0(ğœŽ0), where 𝑅0(ğœŽ0)<1. This model undergoes a backward bifurcation as ğœŽ increases: multiple stable equilibria exist for an open set of parameter values, when the basic reproduction number is below one.

Safan et al. [17] studied an epidemiological model under the assumption that the susceptibility after a primary infection is 𝑟 times the susceptibility before a primary infection. They present a method for determining the control effort required to eliminate an infection from a host population when subcritical persistence may occur. This effort can be interpreted as a reproduction number but is not necessarily the basic reproduction number.

For 𝑟>1+𝜇/𝛼, this model exhibits backward bifurcations, where 𝜇 is the death rate and 𝛼 is the recovery rate. For such models, the authors presented a method for determining the minimum effort required to eradicate the infection from the endemic steady state if one concentrates on control measures affecting the transmission rate constant.

Garba et al. [18] presented a deterministic model for the transmission dynamics of a single strain of dengue by realistically adopting a standard incidence formulation and allowing dengue transmission by exposed humans and vectors. The model was extended to include an imperfect vaccine for dengue. A backward bifurcation was observed in both models. This makes 𝑅0<1 no longer sufficient for effectively controlling dengue in a community. However, this phenomenon can be removed by replacing the standard incidence function in the model with a mass-action formulation.

Reluga et al. [19] proposed a series of epidemic models for waning immunity that can be applied in many different settings. With biologically realistic hypotheses, they found that immunity alone never creates a backward bifurcation. However, this does not rule out the possibility of multiple stable equilibria, which can be shown by a counterexample of the forward bifurcation at 𝑅0=1. See Figure 5(b).

5. 𝑅0 in Spatial Contexts

When 𝑅0 is considered in a spatial context, many of its properties fail to hold. In particular, diseases with 𝑅0>1 can fail to persist, depending on the nature of the spatial transmission. Since many diseases are spatially dependent, this further limits the utility of 𝑅0. As 𝑅0 increases beyond 1, the probability of disease invading the initially infected host group increases, but additional criteria are important to determining the probability of the spreading of the disease to other groups.

5.1. Networks

Green et al. [20] related deterministic mean-field models to network models, taking into account the manner in which the contact rate and infectiousness change over time. For a node with exactly 𝑚 connections, the expected number of secondary infections at infection age 𝑢 is given by 𝑅(𝑚,𝑢)=𝑚1−𝑒−𝛽𝑢/𝑘,(26) where 𝛽 is the transmissibility and 𝑘 is the mean number of connections per node. This model applies when the rate of infectious contact is independent of 𝑘.

If there is a constant rate of generation of new cases, then the expected number of secondary infections in an infectious period of length 𝑢 is given implicitly by 𝑅(𝑚,𝑢)=𝑢0(𝑚−𝑅)𝐶𝐼𝑆(𝑡)𝜓(𝑡)𝑑𝑡,(27) where 𝐶𝐼𝑆(𝑡) denotes the contact per susceptible neighbour and 𝜓(𝑡) denotes the infectiousness at time 𝑡 after infection.

Kao [21] showed that novel pathogens may evolve towards a lower 𝑅0, even if this results in pathogen extinction. This is because the presence of exploitable heterogeneities, such as high variance in the number of potentially infectious contacts, increases 𝑅0; thus, pathogens that can exploit heterogeneities in the contact structure have an advantage over those that do not. The exploitation of heterogeneities results in a more rapid depletion of the potentially susceptible neighbourhood for an infected host. While the low 𝑅0 strategy is never evolutionarily stable, invading strains with higher 𝑅0 will also converge to the low 𝑅0 strategy if not sufficiently different from the resident strain. This is in contrast to the conventional belief that the emergence of novel pathogens is driven by maximisation of 𝑅0.

In a randomly mixed epidemiological network, 𝑅0 can be approximated by𝑅0=⟨𝑙in𝑙out⟩⟨𝑙out⟩,(28) where 𝑙in and 𝑙out are, respectively, the number of inward and outward “truly infectious” links per node; the angled brackets represent the expected value of the relevant quantity [22].

Meyers [23] showed that in a contact network framework, 𝑅0=ğ‘‡âŽ›âŽœâŽî«ğ‘˜2î¬âˆ’âŸ¨ğ‘˜âŸ©âŸ¨ğ‘˜âŸ©âŽžâŽŸâŽ ,(29) where 𝑇 is the mean probability of transmission between individuals and ⟨𝑘⟩ and ⟨𝑘2⟩ are the mean degree and mean square degree of the network. Here, 𝑅0 depends explicitly on the structure of the network (i.e., on ⟨𝑘⟩ and ⟨𝑘2⟩). A single pathogen may, therefore, have very different transmission dynamics depending on the population through which it spreads. If two networks have the same mean degree, 𝑘, then the one with the larger variance in degree, ⟨𝑘2⟩−⟨𝑘⟩2, will be more vulnerable to the spread of disease.

In compartment models, infected hosts are assumed to have potentially disease-causing contacts with random individuals from the population according to a Poisson process that yields an average contact rate of 𝛽 per unit time. The mass-action assumption of compartmental models is tantamount to assuming that the underlying contact patterns form a random graph with a Poisson degree distribution. Estimates of 𝑅0 that assume a mass-action model may, therefore, be invalid for populations with non-Poisson contact patterns and, in particular, will underestimate the actual growth rate of the disease in highly heterogeneous networks.

5.2. Individual-Level Models

Rahmandad and Sterman [24] noted that if 𝑅0>1 in an agent-based model then, due to the stochastic nature of interactions, it is possible that no epidemic occurs or that it ends early if, by chance, the few initially contagious individuals recover before generating new cases.

Schimit and Monteiro [25] showed that in an individual-level model, 𝑅0 cannot be uniquely determined from some features of transient behaviour of the infective group. The value of 𝑅0 can be unambiguously determined from the asymptotical stable stationary concentrations, but this relies on waiting for the system to reach its permanent regime, which is not feasible in practice. The same value of 𝑅0 can be associated to networks with distinct values of clustering coefficients and average shortest path length. This result can affect the evaluation of the effectiveness concerning different strategies employed for controlling a disease. Because distinct values of topological properties can produce the same value of 𝑅0 in a model considering the spatial structure of the contact network, it is difficult to evaluate the effective contribution of each control measure. This is because the correspondences among 𝑅0 and the topological properties of the contact network are not one-to-one.

5.3. Metapopulation Models

Cross et al. [26] showed that when 𝑅0 is based on data collected from simulated epidemics mimicking epidemiological contact-tracing data, 𝑅0 can be substantially greater than one and yet not cause a pandemic. In populations with social or spatial structure, a chronic disease is more likely to invade than an acute disease with the same 𝑅0, because it persists longer within each group and allows for more host movement between groups.

Under the settings where the rate of host population turnover was negligible relative to the rate of disease processes of infection and recovery, they showed that 𝑅0>1 was insufficient for disease invasion when the product of the average group size and the expected number of between-group movements made by each individual while infectious was less than 1.

Smith? et al. [15] examined a metapopulation model with travel between two regions, with reproductive ratios 𝑅(0)0,1 and 𝑅(0)0,2 for each region in the absence of travel, and 𝑅0,1 and 𝑅0,2 when only susceptibles travel, but infectives do not. They showed that if both 𝑅(0)0,1<1 and 𝑅(0)0,2<1, then there are conditions on the travel of susceptible such that 𝑅0,1>1 and 𝑅0,2<1. Thus, a disease which would otherwise be eradicated in both regions could be sustained in one of the regions if there were sufficient travel of the susceptibles (not infectives). Furthermore, if 𝑅(0)0,1<1 and 𝑅(0)0,2>1, then there are conditions on the travel of susceptibles such that 𝑅0,1>1 and 𝑅0,2>1. Thus, if one region sustains the disease on its own, while the other does not, then sufficient travel of susceptibles (not infectives) could sustain the disease in both regions.

5.4. Partial Differential Equation Models

Althaus et al. [27] examined an age-dependent partial differential equation model of in-host HIV infection. They showed that 𝑅0=1∫∞0ğ‘’âˆ’ğ‘Ÿğ‘Žğ‘”(ğ‘Ž)ğ‘‘ğ‘Ž,(30) where the denominator is the Laplace transform of the generation time distribution 𝑔(ğ‘Ž) and 𝑟 is the growth rate. They found that estimates for 𝑅0 were generally smaller than those derived from the standard model when the generation time was taken into account.

6. Stochastic Effects

When stochastic effects, such as those inevitably found in nature, are included, the threshold at 𝑅0=1 may be disturbed. This includes assumptions about the distribution of transition times (assumed to be exponential in most models) as well as variations in individual parameters.

Heffernan and Wahl [28] derived improved estimates of 𝑅0 for situations in which information about the dispersal of transition times is available to the clinical or epidemiological practitioner. Rather than rederiving 𝑅0 for a number of models (SIR, SEIR, etc.), they introduce a “correction factor”, 𝜙: the ratio of 𝑅0 when the lifetimes are nonexponentially distributed to the value of 𝑅0 that would be calculated assuming exponential lifetimes. They were able to derive limiting values of 𝜙 and used this to gauge the sensitivity of 𝑅0 to dispersion in the underlying distributions.

By combining the movement of hosts, transmission with groups, recovery from infection and the recruitment of new susceptibles, Cross et al. [29] expanded the earlier analysis of Cross et al. [26] to a much more broader set of disease-host relationships, exploring settings where the duration of immunity ranges from transient to lifelong or where the demographic processes occur on comparable (or faster) timescales to disease processes. The focus of this study was to investigate how recruitment of susceptibles affects disease invasion and how population structure can affect the frequency of superspreading events (SSEs). They found that the frequency of SSEs may decrease with the reduced movement and the group sizes due to the limited number of susceptibles available.

The hierarchical nature of disease invasion in host metapopulations is illustrated by the classification tree analysis of the model results. Firstly, the pathogen must effectively transit within a group (𝑅0>1), and then, the pathogen must persist within a group long enough to allow for movement between different groups. Hence, the infectious period, group size and recruitment of new susceptibles are as important as the local transmission rates in predicting the spread of pathogens across a metapopulation. It should be noted that in 35% of simulations when 𝑅0 was greater than one, the disease failed to invade.

Tildesley and Keeling [30] examined whether 𝑅0 was a good predictor of the 2001 UK Foot and Mouth disease. They concluded that 𝑅0 explained just 29.3% of the standard deviation of the epidemic impact. They also noted that 𝑅0=1 did not act as a threshold: at the value of 𝑅0=1, only 20% of initial seedings generated epidemics; this probability increased to around 50% for the largest reasonable 𝑅0 values. When heterogeneities exist in the population, infection is most likely to become focused within the high-risk individuals who are both more susceptible and more infectious. This highlights the stochastic nature of the disease in its early stages and the dependence of the ensuing epidemic on favourable local conditions in the neighbourhood of the initial infection.

The probability of extinction, assuming exponentially distributed infectious periods, was 𝑝expext=1𝑅0,(31) when an epidemic began with a single infected individual. Thus, if 𝑅0≤1, then extinction occurs with probability 1, but if 𝑅0>1 then extinction occurs with some probability. See Chiang [31, Chapter 4] for more discussion.

7. 𝑅0 Failures

In this section, we note a variety of problems with 𝑅0 that are not covered in the previous sections. These include problems with the underlying structure of compartment models, the mismatch between an individual-based parameter and a population-level compartment model, and the failure of 𝑅0 to accurately measure an outbreak of a new disease.

Breban et al. [32] argued that in order to associate an 𝑅0 to a model of ODEs, an individual-level model (ILM) which is compatible to the ODE model must be developed; only then can the 𝑅0 of the ILM be unambiguously calculated. These ILMs are growing (not static) network models, with individuals added to a network of who infected whom based on global or local network rules. Then, 𝑅0 is computed as the limit of the average number of outgoing links of individuals in a node that no longer accepts new links, as time goes to infinity. They showed that a broad range of 𝑅0 values were compatible with a given ODE model.

For example, consider the basic model𝑑𝑆𝑑𝑡=−𝛽𝐼,𝑑𝐼𝑑𝑡=𝛽𝐼−𝜇𝐼,(32) where 𝛽 is the transmissibility and 𝜇 is the disease death rate; note that the transmission term is equivalent to the standard term 𝛽𝑆𝐼/(𝑆+𝐼) when the depletion of susceptibles is negligible so that 𝑆/(𝑆+𝐼)≈1.

The expected 𝑅0 value from this model using other methods (such as the next-generation method) is 𝑅0=𝛽𝜇.(33) The corresponding ILM consists of an infection rule, where an individual joining the infectious pool is infected by an infectious individual who is uniformly randomly selected, and a removal rule, where a uniformly randomly selected individual leaves the infectious pool. The flow of newly infected individuals is 𝛽𝐼(𝑡). Thus, the flow per already-infected individual is 𝛽. Since the removed individuals are randomly sampled from the infectious individuals, the average length of the infectious period equals the time expectation of the infectious period, which is 1/𝛽 (rather than 1/𝜇, as calculated from the next-generation method). It follows that 𝑅0=1, which is independent of the transmission and death rates from the corresponding ODE. Thus, in this case, 𝑅0 does not signal epidemic growth as anticipated from other methods.

Roberts [3] noted three fundamental properties commonly attributed to 𝑅0: (i) that an endemic infection can persist only if 𝑅0>1, (ii) 𝑅0 provides a direct measure of the control effort required to eliminate the infection, and (iii) pathogens evolve to maximise their 𝑅0 value. He demonstrated that all three statements can be false. The first, as we have noted, can fail due to the presence of backward bifurcations. The second can fail when control efforts are applied unevenly across different host types (such as a high-risk and a low-risk group), since 𝑅0 is determined by averaging over all host types and does not directly determine the control effort required to eliminate infection.

The third can fail when two pathogens coexist at a steady state that exists and is stable whenever both single-pathogen steady states exist but are unstable. In this case, the order in which the pathogens are established in the host population matters. The established parasite has a role in determining a modified carrying capacity and the pathogen with the largest basic reproductive ratio does not necessarily exclude the other.

Breban et al. [33] showed that two individual-level models having exactly the same expectations of the corresponding population-level variables may yield different 𝑅0 values. They showed that obtaining 𝑅0 from empirical contact-tracing data collected by epidemiologists and using this 𝑅0 as a threshold parameter for a population-level model could produce misleading estimates of the infectiousness of the pathogen, the severity of an outbreak, and the strength of the medical and/or behavioural interventions necessary for control. Thus, measuring 𝑅0 through contact tracing (as generally occurs during an outbreak investigation) may not be a useful measure for determining the strength of the necessary control interventions.

Many different individual-level processes can generate the same incidence and prevalence patterns. Thus, assigning a meaningful 𝑅0 value to an ODE model without knowledge of the underlying disease transmission network may be impossible. Only an epidemic threshold parameter can be used to design control strategies. Since 𝑅0 fails to possess this threshold quality, its usefulness may be vastly overstated.

Meyers [23] compared the theoretical calculation of 𝑅0 with observed SARS data from China and showed that estimates of 𝑅0 seemed incompatible. The basic reproductive rate has two critical inputs: (1)intrinsic properties of the pathogen that determine the transmission efficiency per contact and the duration of the infectious period,(2)the patterns of contacts between infected and susceptible hosts in the population.

While the first factor may be fairly uniform across outbreaks, the second may depend significantly on context, varying both within and among populations. The problem with the SARS estimates stems from the mass-action assumption of compartmental models; that is, that all susceptible individuals are equally likely to become infected. When this assumption does not hold, the models may yield inaccurate estimates or estimates that do not apply to all populations. 𝑅0 estimates for SARS in the field were based largely on outbreak data from a hospital and a crowded apartment building, with anomalously high rates of close contacts among individuals.

The author suggested that it might be inappropriate to extrapolate estimates for 𝑅0 from specific settings such as these to the population at large. Conversely, since the population at large is also unlikely to satisfy mass-action requirements, it may also be concluded that 𝑅0 is not a meaningful estimate of disease spread.

8. Alternatives to 𝑅0

A number of alternatives have been offered over the years, due to recognised problems with 𝑅0. Older examples include the critical community size [5], the group-level reproductive number, 𝑅∗ [34], the type reproduction number [35] and the basic depression ratio [36]. See Heffernan et al. [2] for a summary of these methods. In this section, we review some of the more recent alternatives to 𝑅0.

The actual reproduction number, ğ‘…ğ‘Ž, is defined as a product of the mean duration of infectiousness and the ratio of incidence to prevalence [37–39]. 𝑅0 coincides with ğ‘…ğ‘Ž when the transmission probability is constant but accounts for the more general situation when the transmission probability varies as a function of infection age (as happens in diseases such as HIV/AIDS).

The effective reproduction number 𝑅(𝑡) measures the number of secondary cases generated by an infectious case once an epidemic is underway. In the absence of control measures, 𝑅(𝑡)=𝑅0𝑆(𝑡)/𝑁, where 𝑆(𝑡)/𝑁 is the proportion of the population susceptible [40, 41]. The estimation of reproductive numbers is typically an indirect process because some of the parameters on which these numbers depend are difficult or impossible to quantify directly. The effective reproductive number satisfies 𝑅(𝑡)≤𝑅0, with equality only when the entire population is susceptible [40].

The effective reproduction number is of practical interest, since it is time dependent and can account for the degree of cross-immunity from earlier outbreaks. However, since it is based on the basic reproduction number, the effective reproduction number inherits many of the issues from 𝑅0.

Breban et al. [32] proposed 𝑄0, the average number of secondary infections over the infectious population. The average number of secondary infections of actively infectious individuals, 𝑄0(𝑡), is computed as the average number of outgoing links of a node in the infected compartment at time 𝑡. Then, 𝑅0≡limğ‘¡â†’âˆžğ‘…0(𝑡),𝑄0≡limğ‘¡â†’âˆžğ‘„0(𝑡),(34) when the limits exist. Unfortunately, 𝑅0(𝑡) is never defined in the paper, limiting the usefulness of this formulation of 𝑅0. However, by analogy with 𝑄0(𝑡), 𝑅0(𝑡) is the average number of outgoing links of a removed compartment at time 𝑡 (Romulus Breban, personal communication).

For the SI model (32), under the assumption that every infection is uniquely assigned as a secondary infection for either a removed or an infected individual, 𝑁𝑖(𝑡)=𝐼(𝑡)𝑄0(𝑡)+𝑁𝑟(𝑡)𝑅0(𝑡),(35) where 𝑁𝑖(𝑡)=∫𝑡0𝛽𝐼(𝑢)𝑑𝑢 is the cumulative number of infected individuals that occur in the time interval (0,𝑡] and 𝑁𝑟(𝑡)=∫𝑡0𝜇𝐼(𝑢)𝑑𝑢 is the cumulative number of removed individuals.

Their definition of 𝑅0 evaluates the average number of secondary cases over removed individuals as the distribution of secondary cases becomes stationary. This definition does not imply a particular individual-level model; it depends exclusively on the structure of the disease-transmission network [42]. However, 𝑅0 is not measured at the start of an epidemic, which may limit its usefulness during an initial outbreak.

Grassly and Fraser [43] demonstrated that standard epidemiological theory and concepts such as 𝑅0 do not apply when infectious diseases are affected by seasonal changes. They instead define 𝑅0=𝐷10𝛽(𝑡)𝑑𝑡,(36) where 𝛽(𝑡) is the transmission parameter at time 𝑡 and 𝐷 is the average duration of infection. Thus, 𝑅0 can be interpreted as the average number of secondary cases arising from the introduction of a single infective into a wholly susceptible population at a random time of the year.

They noted that the condition 𝑅0<1 is not sufficient to prevent an outbreak, since chains of transmission can be established during high-infectious seasons if 𝐷𝛽(𝑡)>1. However, 𝑅0<1 is both necessary and sufficient for long-term disease extinction.

Meyers [23] noted that in the case of networks, estimating the average transmissibility 𝑇 may be more valuable than 𝑅0. This means reporting not only the number of new infections per case, but also the total estimated number of contacts during the infectious period of that case. Given the primary role of contact tracing in infectious disease control, these data are often collected. Unlike 𝑅0, 𝑇 can be justifiably extrapolated from one location to another even if the contact patterns are quite disparate.

Instead of 𝑅0, the author offered a number of alternatives to determine whether an outbreak will occur, based on network modelling. These include the probability that an individual will spark an epidemic and the probability that a disease cluster will spark an epidemic.

The probability that an individual with degree 𝑘 will spark an epidemic is equal to the probability that transmission along at least one of the 𝑘 edges emanating from that vertex will lead to an epidemic. For any of its 𝑘 edges, the probability that the disease does not get transmitted along the edge is 1−𝑇. The probability that disease is transmitted to the attached vertex but does not proceed into a full-blown epidemic is 𝑇𝑢, where 𝑢 is the probability that a secondary infection does not spark an epidemic. Thus, the probability that an individual will spark an epidemic is 1−(1−𝑇+𝑇𝑢)𝑘.

The probability that an outbreak of size 𝑁 sparks an epidemic is 1âˆ’âŽ›âŽœâŽâˆ‘âˆžğ‘˜=1𝑘𝑝𝑘(1−𝑇+𝑇𝑢)𝑘−1âˆ‘âˆžğ‘˜=1ğ‘˜ğ‘ğ‘˜âŽžâŽŸâŽ ,(37) where 𝑝𝑘 is the relative frequency of vertices of degree 𝑘 in the network.

Kao et al. [22] defined an epidemiological network contact matrix 𝑀 whose elements 𝑚𝑖𝑗 are either 1 or 0, depending on whether an infectious contact between nodes 𝑖 and 𝑗 is possible. The spectral radius of 𝑀 is an alternative approximation for 𝑅0, which can be calculated via a weighted version of (28).

This explicitly accounts for the full contact structure of the network, but the evaluation of extremely large, reasonably dense matrices (some highly active nodes may have hundreds of potentially infectious links) is difficult and time consuming, particularly when this evaluation process must be repeated multiple times. However, comparisons between the two approximations for subsets of a sheep network with several thousand nodes show little difference between 𝑅0 and the spectral radius of 𝑀 (typically less than 5%).

Kamgang and Sallet [44] used the special structure of Metzler matrices (real, square matrices with nonnegative off-diagonal entries) to define 𝒯0, an analytical threshold condition. 𝒯0 is a function of the parameters of the system such that if 𝒯0<1, the disease-free equilibrium is locally asymptotically stable, and if 𝒯0>1, the disease-free equilibrium is unstable. 𝒯0 has an association with 𝑅0, although it is a stronger condition; however, it has no direct biological interpretation. The algorithm for deriving 𝒯0, although highly mathematical in nature, allows computation of a threshold for high-dimensional epidemic models.

Huang [45] defined four reproductive numbers associated with four types of transmission patterns, each depending on 𝑧, the ratio of the mean infectious period to the mean latent period. These four reproduction numbers are the following: (1)𝑅𝐼0, the minimal reproductive number associated with the slowest latency process and the fastest recovery process,(2)𝑅𝐼𝐼0, the middle reproductive number associated with mean latency and recovery processes,(3)𝑅𝐼𝐼𝐼0, the maximal reproductive number associated with the fastest latency process and the slow recovery process,(4)𝑅𝐼𝑉0, the largest reproductive number associated with the fastest latency process and the extremely slow recovery process.

All four reproduction numbers are strictly increasing functions of 𝑧 and satisfy 𝑅𝐼0<𝑅𝐼𝐼0<𝑅𝐼𝐼𝐼0<𝑅𝐼𝑉0.(38) These numbers allow a disease to be classified as mild (𝑅𝐼0<𝑅0<𝑅𝐼𝐼0) or severe (𝑅𝐼𝐼𝐼0<𝑅0<𝑅𝐼𝑉0).

Hosack et al. [46] noted that 𝑅0 does not necessarily address the dynamics of epidemics in a model that has an endemic equilibrium. They used the concept of reactivity to derive a threshold index for epidemicity, 𝐸0, which gives the maximum number of new infections produced by an infective individual at a disease-free equilibrium. They also showed that the relative influence of parameters on 𝐸0 and 𝑅0 may differ and lead to different strategies for control.

If 𝑅0 is derived from the next-generation operator so that 𝑅0=𝜌𝐹𝑉−1,(39) then the threshold for endemicity is 𝐸0=𝜌𝐹+𝐹𝑇2𝑉−1,(40) such that the system is reactive when 𝐸0>1 and nonreactive when 𝐸0<1. 𝐸0 describes the transitory behaviour of disease following a temporary perturbation in prevalence. If the threshold for epidemicity is surpassed, then disease prevalence can increase further even when the disease is not endemic. This suggests that epidemics can occur even in areas where long-term transmission cannot be maintained.

Reluga et al. [47] defined the discounted reproductive number 𝑅𝑑. The discounted reproductive number is a measure of reproductive success that is an individual's expected lifetime offspring production discounted by the background population growth rate. It is calculated as 𝑅𝑑=𝜌𝐹(𝛿𝐼−𝑉)−1,(41) where 𝛿 is the discount rate, 𝐼 is the identity matrix, and 𝐹 and 𝑉 are from the next-generation matrix decomposition. 𝑅𝑑 combines properties of both the basic reproductive number and the ultimate proliferation rate although it also inherits the nonuniqueness problems from the next-generation method.

Nishiura [4] developed a likelihood-based method for estimating 𝑅0 without assuming exponential growth of cases and offers a corrected value of the actual reproduction number. The author noted that 𝑅0 is extremely sensitive to dispersal of the progression of a disease or variations in the underlying epidemiological assumptions.

9. Discussion

Despite being a crucial concept in disease modelling, with a long history and frequent application, 𝑅0 is deeply flawed. It is not a measure of the number of secondary infections, it is never calculated consistently, and it fails to satisfy the threshold property. Rarely has an idea so erroneous enjoyed such popular appeal. Why, then, are we so attached to the concept?

The answer is that 𝑅0, despite all its flaws, is all that we have. No other concept has so effectively transcended mathematics, biology, epidemiology, and immunology. No other concept is so general that it can be understood in terms of compartment models, network models, partial differential equations, and metapopulation models. “The number of secondary infections” has an intuitive appeal that outlasts even the inaccuracy of that statement when applied to the concept.

The threshold nature of 𝑅0 is used to monitor and control severe real-time epidemics; control measures are often concluded if 𝑅0<1 [48], making the problems with 𝑅0 more than just theoretical. Due to the inconsistencies in calculation, different diseases cannot be compared unless the same method was used to calculate 𝑅0; if HIV has an 𝑅0 of 3 and swine flu has an 𝑅0 of 4, we cannot conclude that swine flu is worse than HIV if different methods were used to determine these values. All we can conclude is that both diseases have 𝑅0>1; however, as we have demonstrated, this does not necessarily guarantee disease persistence.

Of the many different methods used to calculate 𝑅0, only the survival function reliably calculates the average number of secondary infections; however, this method is too cumbersome to use in most practical settings. The next-generation method is probably the most popular, yet it suffers from uniqueness problems and does not cope well with more than one disease state. Since 𝑅0 is rarely measured in the field, it instead relies upon after-the-fact calculations to determine the strength of disease spread [2]. This limits its usefulness even further.

Policy decisions are being based upon the concept, with limited understanding of the complexity and errors that exist in the very structure of the concept. Funding decisions about where money should be best spent are based on estimates of 𝑅0, resources are directed towards one disease over another, and monitoring programmes are abandoned, their objectives only half-realised, because of 𝑅0. Lives may be saved or lost, based on this imperfect and inconsistent measure.

𝑅0 is a quantity that relates to the initial phase of an epidemic. This makes practical sense in terms of disease prevention. However, it is also used to guide eradication efforts when a disease is endemic. Some methods derive an eradication threshold from an equilibrium value that may not be attained for a very long time. This suggests that different measures are needed in different stages of an epidemic in order to characterise transmissibility and to guide intervention strategies. When a new pathogen emerges, a quantity describing the initial spread is useful. When a disease is endemic, a quantity that applies to the long-term equilibrium is more appropriate.

We note that although the definition of 𝑅0 is broad, it is not a universal quantity that applies to all settings. In different settings, one should use a quantity that satisfies the following properties: that an endemic equilibrium only persist if 𝑅0>1 and that 𝑅0 provides a direct measure of the control effort required for eradication. The contact structure of the population, the variation of risk factors and the order of establishing parasites (if applicable) should accompany the identification of a meaningful 𝑅0.

What is urgently needed is a simple, but accurate, measure of disease spread that has a consistent threshold property and which can be understood by nonmathematicians. If 𝑅0 is to be used, it must be accompanied by a declaration of which method was used, which assumptions are underlying the model (e.g. mass-action transmission) and evidence that it is actually a threshold, with no backward bifurcation. Without such caveats, the concept of 𝑅0 will continue to fail.


A. 𝑅0 for Specific Diseases

In this appendix, we illustrate some recent calculations of 𝑅0 for various diseases from the past few years, noting in particular specific values for 𝑅0 that have been given. This illustrates the wide variety of values that are presented as being “the” 𝑅0 value for a specific disease.

A.1. HIV

Bacaër et al. [49] developed a mathematical model of the HIV/AIDS epidemic in Kunming, China. They considered needle sharing between injecting drug users and commercial sex between female sex workers and clients. The basic reproductive ratio is expressed as 𝑅0≈max𝑅IDU0,𝑅sex0,(A.1) where the superscript “IDU” represents injecting drug users and “sex” represents sex workers. They estimate 𝑅IDU0=32 and 𝑅sex0=1.7.

Abu-Raddad et al. [50] examined the population-level implications of imperfect HIV vaccines. They divided the population into two groups: unvaccinated and vaccinated, both of whom could become infected (albeit at different rates, depending on the efficacy of the vaccine). By forming two independent next-generation matrices, they derived 𝑅0 and 𝑅0𝑉, the basic reproductive ratios for the unvaccinated and vaccinated populations, respectively. They note that the condition 𝑅0𝑉<1 may not be sufficient for disease eradication, due to the possibility of a backward bifurcation. The two formulas are equivalent in the absence of the vaccine, assuming that risky behaviour does not change. They estimate 𝑅0𝑉=1.4 in the absence of vaccine protection.

Gran et al. [39] argued that 𝑅0 was inappropriate for long-lasting infections and instead defined the actual reproduction number ğ‘…ğ‘Ž(𝑡). This is a time-dependent measure defined as the average number of secondary cases per case to which the infection was actually transmitted during the infectious period in a population. Thus, ğ‘…ğ‘Ž(𝑡)=𝑋(𝑡)𝑌(𝑡)𝐷,(A.2) where 𝑋(𝑡) is the incidence rate at time 𝑡, 𝑌(𝑡) is the prevalence at time 𝑡, and 𝐷 is the average length of the infectious period. They illustrated their results for a staged HIV/AIDS progression model for homosexual men in the UK and showed that ğ‘…ğ‘Ž was 13.81 and 17.01 in 1982 and 1983, respectively, but was much closer to 1 by the mid 1990s, due to decreases in risk behaviour.

Smith? et al. [15] used the nonuniqueness of 𝑅0-like thresholds to evaluate the feasibility of eradicating AIDS using available resources. They developed a linear metapopulation model that has the same eradication threshold as more realistic models. They defined the threshold as 𝑇0=𝑒𝑠(𝐾),(A.3) where 𝑠(𝐾) is the maximal real part of all eigenvalues of 𝐾, the Jacobian matrix evaluated at the disease-free equilibrium; 𝐾 represents the change of status of infected individuals in different regions. The model has the property that the disease will be eradicated if 𝑇0<1 but will persist if 𝑇0>1. The paper stresses that this quantity should be understood only as a threshold condition for eradication; the model from which it derives does not quantify the transient dynamics, the timecourse of the infection or the prevalence of the disease. However, they showed that the simpler version of the model has the same eradication threshold (𝑇0) as more realistic, nonlinear models. The mathematical analysis results, together with a simple cost analysis, used the threshold nature of 𝑅0 to show that eradication of AIDS is feasible, using the tools that we have currently to hand, but action needs to occur immediately and globally.

A.2. Influenza

Chowell et al. [51] used four methods to calculate 𝑅0 using data from the 1918 influenza pandemic in California. The four methods involved (1) an early exponential growth rate, (2) an SEIR model, (3) a more complicated SEIR model with asymptomatic and hospitalised cases, and (4) a stochastic SIR model with Bayesian estimation. These yielded average 𝑅0 values of (1) 2.98, (2) 2.38, (3) 2.2, and (4) 2.1 during the first 17 days of the epidemic and then 2.36 for the entire autumn wave. In particular, uncertainty during the early part of the epidemic leads to a wide range of estimates for 𝑅0 (0.5–3.5, 95% confidence interval), but uncertainty decreases with more observations.

Wallinga and Lipsitch [52] investigated how generation intervals shape the relationship between growth rates and reproductive numbers. For new emerging infectious diseases, the observed exponential epidemic growth rate 𝑟 can indirectly infer the value of the reproductive number. The existence of a few different equations that relate the reproductive number to the growth rate makes such inference ambiguous. It is unclear which of these equations might apply to new infection. The authors showed that these different equations differ only with respect to their assumed shape of the generation-interval distribution. They found that the shape of the generation-interval distribution determines which equation is appropriate for inferring the reproductive number from the observed growth rate. By assuming all generation intervals to be equal to the mean, they obtained an upper bound on the range of possible values that the reproductive number may attain for a given growth rate. They also showed that it is possible to obtain an empirical estimate of the reproductive number by taking the generation-interval distribution equal to the observed distribution.

The authors illustrated the impact that various assumptions about the shape of the generation interval distribution may have on the estimated value of the reproduction numbers for a given growth rate using human infections with influenza A virus. Observed generation intervals for influenza A in a Japanese household study excluding possible coprimary and tertiary cases [53] estimated a mean generation interval of 2.85 days and a standard deviation of 0.93 days. Without specific assumptions about the shape of the generation-interval distribution, they found that the reproductive number of influenza A is larger than 𝑅=1, but smaller than 𝑅=1.77.

Chowell and Nishiura [54] reviewed quantitative methods to estimate the basic reproductive ratio of pandemic influenza. They use the intrinsic growth rate to estimate 𝑅0 in the range from 1.36 to 2.07. They defined the effective reproduction rate 𝑅(𝑡)=∞0𝐴(𝑡,𝜏)𝑑𝜏,(A.4) where 𝐴(𝑡,𝜏) is the reproductive power at time 𝑡 and infection age 𝜏 at which an infected individual generates secondary cases. If contact and recovery rates do not vary with time, then this simplifies to 𝑅(𝑡)=𝑆(𝑡)𝑆(0)𝑅0,(A.5) where 𝑆(𝑡) is the population of susceptibles at time 𝑡.

They also outlined statistical methods of estimation and showed that the expected number of secondary transmissions is given by 𝑅0=âˆžî“ğ‘˜=1𝑘𝑝𝑘,(A.6) where the pattern of secondary transmission follows a discrete probability distribution 𝑝𝑘 with 𝑘 secondary transmissions. This approach accounts for demographic stochasticity, which can be crucial during the initial phase of an epidemic when the number of infected individuals is small. This has been applied to observed outbreak data for avian influenza, where extinction was observed before growing to a major epidemic [55].

Tiensin et al. [56] used flock-level mortality data and statistical back-calculation methods to estimate 𝑅0 from an SIR model for the 2004 avian influenza epidemic in Thailand. They calculated 𝑅0 to be between 2.26 and 2.64, depending on the length of the infectious period.

A.3. Malaria

Smith et al. [57] revisited the basic reproduction number of malaria and its implications for malaria control. Their estimates of 𝑅0 are based on two more commonly measured indices called the entomological inoculation rate, which is the average number of infectious bites received by a person in a year, and the parasite ratio, which is the prevalence of malaria infection in humans. They made 121 estimates of 𝑅0 for Plasmodium falciparum malaria in African populations. The estimates (which do not come from a mathematical model) range from around one to more than 3000; the median is 115 and the interquartile range is 30–815.

They then revised the formula of 𝑅0 to adopt potential sources of bias when biting is heterogeneous and when there is some transmission-blocking immunity. Under the assumption that the transmission-blocking immunity develops, the estimates of 𝑅0 ranged from below one to nearly 11,000, with a median of 86 and an interquartile range of 15–1,000. This is because transmission-blocking immunity and heterogeneous biting skew the probability that a mosquito becomes infected, per bite. They also showed that in small human populations, the classical formulas of 𝑅0 approximate transmission when counting infections from mosquito to mosquito after one parasite generation, but they overestimate it from human to human.

A.4. West Nile Virus

Wonham et al. [11] examined the conflicting outbreak predictions for West Nile virus, showing how the choice of transmission term affects 𝑅0. They concluded that some transmission terms apply biologically only at certain population densities and showed that six common North American bird species would be effective outbreak hosts. All seven models considered shared a similar compartment structure, but with varying factors, such as incubation periods, loss of immunity, age structure, and so forth. They created a core model synthesised from all seven models and used the next-generation method to calculate 𝑅0=î„¶î„µî„µî„µî„µâŽ·ğ›½ğ‘…ğ‘‘ğ‘‰î„¿î…ƒî…Œvectortoreservoir×𝛽𝑅𝑁∗𝑉𝑑𝑅𝑁∗𝑅reservoirtovector,(A.7) where 𝛽𝑅 is the transmissibility of the reservoir, 𝑑𝑅 is the death rate of the reservoir, 𝑑𝑉 is the death rate of the vector, 𝑁∗𝑉 is the total vector population at the disease-free equilibrium and 𝑁∗𝑅 is the total reservoir population at the disease-free equilibrium.

The first term under the square root sign represents the number of secondary reservoir infections caused by one infected vector. The second term under the square root sign represents the number of secondary vector infections caused by one infected reservoir. The square root gives the geometric mean of the two terms (see Section 2). Using reported minimum, mean and maximum estimates, 10,000 Monte Carlo estimates were run to generate an estimated distribution of 𝑅0. Mean numerical values for 𝑅0 for house sparrows ranged from 25–1200.

However, these values are enormous, implying that a single sparrow would infect up to 1.44 million sparrows (i.e., 1200×1200) during its infectious lifetime. Cruz-Pacheco et al. [58], for example, estimated 𝑅0=5.08 for the house sparrow, using data from Komar et al. [59].

A.5. Cholera

Hartley et al. [60] developed a compartment model of cholera incorporating both a hyperinfective state and a lower infective state. They used the average age of infection to estimate 𝑅0 from data for four cholera outbreaks in developing countries during the nineties. 𝑅0 values ranged from 3.1 in Indonesia (1993–1999) to 15.3 in Pakistan (1990–1995), with an average of 8.7. They then used the next-generation method to develop a theoretical estimate for 𝑅0 that incorporates hyperinfectivity. They show that this raises 𝑅0 from approximately 3.2 when there is no hyperinfectivity to 18.2 when the contact with hyperinfected water is similar to contact with nonhyperinfected water, but that 𝑅0 could be significantly larger with more frequent contact with hyperinfected water.

A.6. Dengue

Nishiura [61] clarified the contributions of mathematical and statistical approaches to dengue epidemiology. This highlighted the practical importance of the basic reproduction number, 𝑅0, in relation to the critical proportion of vaccination required to eradicate the disease in the future. The author illustrated three different methods to estimate 𝑅0: (i) the final size equation, (ii) the intrinsic growth rate, and (iii) age distribution, with published estimates and examples. The author pointed out that, although the estimates of 𝑅0 most likely depend on the ecological characteristics of the vector population, it would be appropriate to assume a serotype-nonspecific estimate for 𝑅0 is approximately 10, at least when planning vaccination strategies in endemic areas.

A.7. Rift Valley Fever

Gaff et al. [12] modelled Rift Valley fever, a mosquito-borne disease affecting both humans and livestock that currently exists in developing countries but has the potential to be introduced to the western world. The disease can be transmitted both horizontally (from host species to host species via mosquitoes) or vertically (via parent-to-offspring transmission in mosquitoes). They expressed the basic reproductive ratio as 𝑅0=𝑅0,𝑉+𝑅0,𝐻,(A.8) where 𝑅0,𝑉 is the basic reproductive ratio for vertical transmission and 𝑅0,𝐻 is the basic reproductive ratio for horizontal transmission. They used the next-generation method and uncertainty analysis to calculate 𝑅0=1.19, with a range from 0.037 to 3.743.

A.8. Bluetongue

Gubbins et al. [13] examined bluetongue virus, an insect-borne infectious disease of ruminants in Europe. They used the next-generation method to determine 𝑅0 and then Latin Hypercube sampling and partial rank correlation coefficients to assess the effect of parameter variation. Median values for 𝑅0 were 0.81 (cattle only), 0.73 (sheep only) and 0.55 (both cattle and sheep), implying that the disease should not persist. However, the corresponding maximum values were 18.77, 15.48, and 12.82, respectively, suggesting that variation in parameters can have an enormous impact on the outcome, including persistence of the disease. The most significant changes in the estimation of 𝑅0 were due to the mosquito biting rate, the extrinsic vector incubation period, the vector mortality rate, the probability of transmission from host to vector, and the ratio of vectors to cattle and sheep.

A.9. Plant Pathogens

van den Bosch et al. [6] described a systematic method to calculate the basic reproduction ratio from knowledge of a pathogen's life cycle and its interactions with the host plant. They developed a system of linear difference equations and rearranged the dominant eigenvalue to find 𝑅0. A fraction ğ‘ž of infected spores are deposited in location 1, while a fraction 1âˆ’ğ‘ž are deposited in location 2. This results in 𝑅0=ğ‘žî€·ğ›¾1𝛼1𝜏1𝜌𝐻+(1âˆ’ğ‘ž)𝛾2𝛼2𝜏2𝜌𝐻,(A.9) where the first term represents the number of spores that successfully germinate in location 1 and the second term represents the number of spores that successfully germinate in location 2. Here, 𝛾𝑖 is the probability that a spore deposited on location 𝑖 will germinate, 𝛼𝑖 is the number of spores produced per time unit on location 𝑖, 𝜏𝑖 is the infectious period of a spore-producing lesion on location 𝑖, 𝜌 is the probability that a spore is deposited on a susceptible site, and 𝐻 is the density of susceptible sites in a host population.

They included a separate box explaining pitfalls in calculating the basic reproductive ratio from nonlinear models and noted its nonuniqueness, calculating two distinct values for 𝑅0. However, they were unable to decide which value was correct. The authors conclude by stating that “Although nonlinear differential equations are invaluable tools for studying epidemics, they should not serve as the primary basis for the determination of 𝑅0.”

Cunniffe and Gilligan [62] considered the effects of host demography upon the outcomes for invasion, persistence and control of pathogens in a model for botanical epidemics. They used the next-generation method to calculate 𝑅0=𝑅𝑝0+𝑅𝑠0,(A.10) partitioned into primary and secondary infection. These values are unaffected by host demography. However, the endemic level of infection is highly sensitive to changes in 𝑅0 when 𝑅0>1. If 𝑅0 is increased by shortening the infectious period, then the endemic level of infection increases monotonically. However, if increases in transmission rates or decreases in the decay of free-living inoculum increase 𝑅0, then the endemic level of infection first increases (for 1<𝑅0<2) but then decreases (for 𝑅0>2). Consequently, increasing the intensity of control can result in more endemic infection.


The authors thank Bernhard Konrad, Elissa Schwartz, Jane Heffernan, Lindi Wahl, Helen Kang, and Romulus Breban for technical discussions and are grateful to an anonymous reviewer for comments that significantly improved the paper. R. J. Smith? is supported by an NSERC Discovery Grant, an Early Researcher Award, and funding from MITACS. For citation purposes, note that the question mark in Smith? is part of his name.