Abstract

Statistical challenges in monitoring modern biosurveillance data are well described in the literature. Even though assumptions of normality, independence, and stationarity are typically violated in biosurveillance data, statistical process control (SPC) charts adopted from industry have been widely used in public health for communicable disease monitoring. However, blind use of SPC charts in public health that ignores the characteristics of disease surveillance data may result in poor detection of disease outbreaks and/or excessive false-positive alarms. Thus, improved biosurveillance systems are clearly needed, and the participation of statisticians knowledgeable in SPC alongside epidemiologists in the design and evaluation of such systems can make that work more productive. We describe and study a method for monitoring reportable disease counts using a Poisson distribution whose mean is allowed to vary with the week of the year. The seasonality is modeled by a trigonometric function whose parameters can be estimated from a baseline set of data. We study the ability of such a model to detect an outbreak. Specifically, we estimate the probability of detection (POD), the average number of weeks to signal given that a signal has occurred (the conditional expected delay, or CED), and the false-positive rate (FPR), the average number of false alarms per year.

1. Introduction

Homeland Security Presidential Directive 21 of October 18, 2007, defined biosurveillance as “… the process of active data-gathering with appropriate analysis and interpretation of biosphere data that might relate to disease activity and threats to human or animal health—whether infectious, toxic, metabolic, or otherwise, and regardless of intentional or natural origin—in order to achieve early warning of health threats, early detection of health events, and overall situational awareness of disease activity….” This suggests two distinct goals: situational awareness and early event detection. While situational awareness is certainly an important characteristic of a disease surveillance system, we focus on the problem of early event detection (EED).

Many diseases have an occurrence rate that is periodic throughout the year. Often a disease will peak in the summer and have a low occurrence rate in the winter, or vice versa. For example, E. coli O157:H7 infections tend to peak in the summer and pertussis tends to peak in the winter. There are some diseases, such as tuberculosis, that have a nearly constant occurrence rate throughout the year. A surveillance system for diseases whose occurrence rate is seasonal cannot have a constant control limit, since there would be frequent signals in the peak season and practically no signals in the off-season. In this paper, we propose and study the use of a seasonal model for the mean, or expected, disease count. Data are typically collected weekly, and the Poisson distribution is a reasonable model for these disease counts. Figure 1 illustrates the seasonality in the reported cases of pertussis in the state of Missouri from 2002 through 2011. The rate is lowest in late winter and early spring and highest in late fall and throughout most of the winter. The control limits, shown in red in this figure, are discussed in the next section.

The problem of disease surveillance or biosurveillance has been addressed in a number of books in recent years. The edited books by Kass-Hout and Zhang [1], Lawson and Kleinman [2], Lombardo and Buckeridge [3], M’ikanatha et al. [4], Wagner et al. [5], Wilson et al. [6], and Zeng et al. [7] cover a wide spectrum of techniques for monitoring disease. Recently, the books by Chen et al. [8] and Fricker [9] have systematically studied the problem of disease surveillance.

Disease surveillance shares many characteristics with quality control. In quality control or quality surveillance, control charts are used to monitor the quality of a manufacturing process. When a special event occurs, causing the output of the process to abruptly change, the control chart should raise a signal for workers to investigate. Thus, a control chart is used in industry to identify when the output differs from what would be expected by chance. In a similar fashion, disease surveillance is supposed to detect when the occurrence of a disease exceeds what would be expected by chance because this aberration in data may indicate a probable disease outbreak. Thus, the primary tool for disease surveillance could be a control chart, similar to what is used in industry. This technique is called statistical process control (SPC).

There are, however, differences between biosurveillance and SPC, and SPC methods must be modified before they are used for biosurveillance analysis. While quality control has the single objective of detecting process changes, biosurveillance has the dual goals of early event detection and situational awareness. In quality control, no effort is placed on estimating the current state of the system, since in the absence of a signal the process mean is assumed unchanged. In quality control, when a chart does raise an out-of-control signal, the process is stopped, a search is made for the cause, and the control charts are reset when monitoring resumes. By contrast, in biosurveillance there is no “stopping of the process”; monitoring continues and the monitoring statistics are never reset. Even without intervention, disease outbreaks that cause out-of-control signals are transient, and disease incidence returns to its original rate when the outbreak is over. Because of this, the metric of average run length (ARL), which presumes a prolonged constant step shift, is inappropriate for comparing methods of surveillance. Montgomery [10] gave a comprehensive account of methods for statistical process control, including charts for counts, and Woodall [11] reviewed methods for discrete charts and provided a thorough review of the literature up to that point in time. The Poisson distribution, which plays a central role in disease surveillance, is monitored using the c-chart or the u-chart. Borror et al. [12] proposed an exponentially weighted moving average control chart for Poisson data, which has better run length properties for small shifts, and Jonsson [13] proposed use of the CUSUM. Farrington et al. [14] developed methods for detecting the occurrence of outbreaks using a scanning system along with a simple regression algorithm.
Parker [15] specifically applied the Poisson distribution to problems of disease surveillance. Recently, Noufaily et al. [16], Unkel et al. [17], and Enki et al. [18] developed and applied methods for Poisson counts and quasi-Poisson counts. Shmueli and Burkom [19] and Fricker [20] discuss some of the methodological issues in disease surveillance. Serfling [21] was the first to suggest modeling seasonality in disease rates when he studied the spread of influenza.

In this paper, we describe a simple method for early outbreak detection, DESTEM (Disease Electronic Surveillance with Trigonometric Models), and we apply it to the reportable communicable diseases data in Missouri, USA.

2. The Trigonometric Model

We assume that the disease count Y_t in week t follows a Poisson distribution with mean μ_t, where μ_t varies by week according to a first-order trigonometric model

μ_t = β_0 + β_1 sin(2πt/52) + β_2 cos(2πt/52) (1)

or a second-order trigonometric model

μ_t = β_0 + β_1 sin(2πt/52) + β_2 cos(2πt/52) + β_3 sin(4πt/52) + β_4 cos(4πt/52). (2)

Of course, a higher-order model, such as

μ_t = β_0 + Σ_{j=1}^{m} [β_{2j−1} sin(2πjt/52) + β_{2j} cos(2πjt/52)], (3)

could be used, but the first-order and second-order models are sufficient to model most diseases. Also, for diseases that exhibit no seasonality, a zeroth-order model,

μ_t = β_0, (4)

can be applied. A first-order trigonometric model can model simple seasonality, where the expected counts follow a sinusoidal curve. A second-order trigonometric model can model situations where a disease “lingers” for a while once the disease peaks. Figure 2 shows the added flexibility in a second-order model.
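To make the fitting step concrete, the sketch below fits a first-order trigonometric model by ordinary least squares on simulated weekly counts. The paper's calculations were done with R's lm function; this Python/NumPy version is an illustrative stand-in, and all function and variable names in it are ours, not the authors'.

```python
import numpy as np

def fit_first_order(counts):
    """Least-squares fit of mu_t = b0 + b1*sin(2*pi*t/52) + b2*cos(2*pi*t/52).

    counts: 1-D array of weekly disease counts for weeks t = 1, 2, ...
    Returns the coefficient estimates and the fitted weekly means.
    """
    t = np.arange(1, len(counts) + 1, dtype=float)
    X = np.column_stack([np.ones_like(t),
                         np.sin(2 * np.pi * t / 52),
                         np.cos(2 * np.pi * t / 52)])
    beta, *_ = np.linalg.lstsq(X, np.asarray(counts, dtype=float), rcond=None)
    return beta, X @ beta

# Simulated example: four years of weekly Poisson counts with a seasonal mean.
rng = np.random.default_rng(42)
t = np.arange(1, 209, dtype=float)
true_mu = 6.0 + 3.0 * np.sin(2 * np.pi * t / 52)
counts = rng.poisson(true_mu)
beta, mu_hat = fit_first_order(counts)
```

The second-order model is fit the same way, with two extra columns for the sin(4πt/52) and cos(4πt/52) terms.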

For the Poisson distribution, the variance equals the mean, or, equivalently, the standard deviation equals the square root of the mean. As mentioned in the previous section, the main reason for monitoring diseases is to detect when an outbreak occurs. In order to do this, either in disease surveillance or quality surveillance, it is common to put an upper limit (or both an upper and a lower limit) on the observations; when observed points fall outside the limits, it is inferred that a change in the process has occurred and an investigation should be made into its nature and cause. In this way, quality surveillance and disease surveillance are similar. Usually, however, in disease surveillance we are interested in whether there has been an increase in the underlying disease rate μ_t; this suggests using an upper control limit only. Thus, when an observed count exceeds the upper limit, an outbreak is inferred. The most common approach is to put limits a multiple k of the standard deviation above the mean (and also below the mean in the case of quality surveillance). Since the standard deviation of the Poisson is equal to the square root of the mean, we have the following result for the upper control limit:

UCL_t = μ_t + k√μ_t. (5)

The most common choice is k = 3, since for most distributions nearly all of the probability lies within three standard deviations of the mean, giving

UCL_t = μ_t + 3√μ_t. (6)

For a discrete distribution like the Poisson, the exceedance probability is quite variable since the output is constrained to be an integer. For example, if the mean is μ_t = 6.4, then the standard deviation is 2.5298 and the upper control limit is 13.99, leaving a probability of 0.00625 of exceeding the upper control limit. (Here Y_t indicates the random count for week t.) On the other hand, if the mean is just slightly higher, say μ_t = 6.5, then the upper control limit is 14.15, leaving a probability of 0.00339 of exceeding the upper control limit.
While these are both small numbers, their reciprocals, 1/0.00625 = 160.0 and 1/0.00339 = 294.7, which have an interpretation as the expected number of time periods until a signal (assuming that the mean stays constant), are quite different.

Thus, a small change in the mean, from 6.4 to 6.5, caused a significant change in the probability of exceedance, with one being almost double the other.
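The arithmetic behind such exceedance probabilities can be checked directly. This minimal sketch (helper names are ours, not the authors') computes the three-sigma upper control limit for a Poisson mean of 6.4 and the tail probability of a count exceeding it, using only the standard library:

```python
import math

def ucl(mu, k=3.0):
    """k-standard-deviation upper control limit for a Poisson mean."""
    return mu + k * math.sqrt(mu)

def poisson_tail(n, mu):
    """P(Y >= n) for Y ~ Poisson(mu), via a direct pmf sum."""
    pmf, cdf = math.exp(-mu), 0.0
    for y in range(n):            # accumulate P(Y <= n - 1)
        cdf += pmf
        pmf *= mu / (y + 1)
    return 1.0 - cdf

mu = 6.4
limit = ucl(mu)                       # 6.4 + 3 * 2.5298 = 13.99
threshold = math.floor(limit) + 1     # a count of 14 or more raises a signal
p_exceed = poisson_tail(threshold, mu)
print(round(limit, 2), round(p_exceed, 5))   # 13.99 0.00625
```

The reciprocal 1/p_exceed is then the expected number of weeks until a (false) signal when the mean stays at 6.4.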

The trigonometric models in (1) and (2) assume that the disease pattern remains constant from year to year. This assumption can be relaxed by incorporating one or more terms that depend on time t; for example, a simple linear trend can be accounted for by modifying the models to be of the form

μ_t = β_0 + β_1 sin(2πt/52) + β_2 cos(2πt/52) + β_5 t (7)

or

μ_t = β_0 + β_1 sin(2πt/52) + β_2 cos(2πt/52) + β_3 sin(4πt/52) + β_4 cos(4πt/52) + β_5 t. (8)

For some diseases, such as tick-borne ehrlichiosis, there is a strong increasing trend across time that must be taken into account, and for other diseases, such as giardiasis, there is a decreasing trend. We have found that for most such diseases, a linear trend as in (7) or (8) is adequate.

It is possible for the predicted means near the trough, the lowest point on the curve, to be negative if the models in (1) through (8) are used. This phenomenon actually occurs for some diseases. There are two approaches to dealing with this problem. One is to truncate the expected count at 0 and let a single case cause a signal. This is desirable for some diseases, such as anthrax, where a single occurrence of the disease is evidence of an outbreak. The second approach is to assume that the expected count is the exponential of the linear predictor in (1) to (8). For example, the second-order model with a linear trend would have mean function

μ_t = exp(β_0 + β_1 sin(2πt/52) + β_2 cos(2πt/52) + β_3 sin(4πt/52) + β_4 cos(4πt/52) + β_5 t). (9)

For only a few of the diseases that are monitored does the estimated mean function dip below zero, so we have found that applying one of the formulas (1) through (8) is sufficient. Even in these cases the estimated mean function will usually dip below zero for just a few weeks during the year, and in those weeks a single case is sufficient to raise a signal.

Figure 1 shows the pertussis cases in Missouri from 2002 through 2011 with the upper control limits, calculated as UCL_t = μ̂_t + 3√μ̂_t, where μ̂_t is the estimated second-order model from (2). Figure 3 shows the more recent case of E. coli O157:H7 infections for the year 2013. For E. coli O157:H7, a first-order model is adequate; here the estimated trigonometric curve is obtained by fitting (1). Calculations for the method and the resulting plots were done using the package R [22]; the regressions were fit using R's linear model function (lm). Figure 3 shows the system, developed in R, that is used to do the required calculations and display the graphs.

Typically, 50 to 60 diseases (of the 152 reportable diseases) are monitored each week. A summary is prepared for all of the monitored diseases. The report gives the current week count, the year-to-date count, the count from the same week of the previous year, a flag denoting the status, and the upper control limit for the next week. The flag is red if the point is above the upper control limit and blue otherwise. The next week's upper control limit is given so that a signal can be raised at the earliest possible moment. For example, if next week's UCL is 2.4 and three cases have been observed by the middle of the week, then a signal can be raised at that point. Table 1 shows part of the weekly summary, based on the DESTEM procedure, of the status of reportable diseases for the state of Missouri in Week 17 of 2014, the week ending April 26, 2014.
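The logic of one report row might be sketched as follows. The helper and field names here are hypothetical illustrations, not the actual DHSS reporting system:

```python
import math

def report_row(disease, week_count, ytd, last_year_same_week,
               mu_this_week, mu_next_week):
    """One line of a weekly summary; all field names here are hypothetical."""
    ucl_now = mu_this_week + 3 * math.sqrt(mu_this_week)
    flag = "red" if week_count > ucl_now else "blue"
    return {"disease": disease,
            "current week": week_count,
            "year to date": ytd,
            "same week last year": last_year_same_week,
            "flag": flag,
            "next week UCL": round(mu_next_week + 3 * math.sqrt(mu_next_week), 1)}

# A count of 9 against a UCL of 4 + 3*sqrt(4) = 10, so the flag stays blue.
row = report_row("pertussis", 9, 120, 5, mu_this_week=4.0, mu_next_week=4.4)
print(row["flag"], row["next week UCL"])   # blue 10.7
```

Publishing next week's UCL in the row is what permits the mid-week early signal described above: as soon as the running count for the new week exceeds that limit, the flag can turn red without waiting for the week to close.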

3. A Simulation Study of DESTEM Performance

In order to test the method's ability to detect aberrations indicative of a possible disease outbreak, a Monte-Carlo simulation is used to model a number of processes with an induced outbreak. The typical outbreak is one in which the incidence increases linearly over several weeks, reaching a maximum number of additional cases, and then decreases linearly until the number of additional cases is back to zero. Figure 4 illustrates this type of outbreak and how it is applied to the simulated data. Following the suggestions of Fraker et al. [23], we use simulation to estimate the following quantities.
(1) The probability of outbreak detection (POD): the probability that the outbreak is detected while it is still occurring. A small outbreak over a short period of time has a smaller probability of being detected than a large outbreak over a longer period.
(2) Conditional expected delay (CED): the expected number of time periods until a signal, conditioned on there being a signal.
(3) False-positive rate (FPR): the average number of false alarms signaled per year.
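A simulation of this kind might be sketched as follows; the function and variable names are illustrative, not the authors' code, and the seasonal mean and outbreak signature are made-up examples:

```python
import numpy as np

def simulate_metrics(mu, start, signature, n_years=2000, k=3.0, seed=0):
    """Monte-Carlo estimates of POD, CED, and FPR for a seasonal Poisson chart.

    mu: length-52 array of baseline weekly means (UCL_t = mu_t + k*sqrt(mu_t)).
    signature: extra expected cases in each outbreak week, e.g. (1, 2, 2, 1).
    Each simulated year contains one outbreak beginning at week `start`.
    """
    rng = np.random.default_rng(seed)
    ucl = mu + k * np.sqrt(mu)
    outbreak_weeks = set(range(start, start + len(signature)))
    detected, delays, false_alarms = 0, [], 0
    for _ in range(n_years):
        lam = mu.copy()
        lam[start:start + len(signature)] += signature   # impose the outbreak
        counts = rng.poisson(lam)
        signals = [int(w) for w in np.flatnonzero(counts > ucl)]
        hits = [w for w in signals if w in outbreak_weeks]
        false_alarms += len(signals) - len(hits)
        if hits:                                         # detected during outbreak
            detected += 1
            delays.append(hits[0] - start + 1)
    pod = detected / n_years
    ced = float(np.mean(delays)) if delays else float("nan")
    fpr = false_alarms / n_years        # false alarms per simulated year
    return pod, ced, fpr

# Example: seasonal baseline with a 4-week outbreak starting at the peak.
weeks = np.arange(52, dtype=float)
mu = 1.5 + 0.8 * np.sin(2 * np.pi * weeks / 52)
pod, ced, fpr = simulate_metrics(mu, start=13, signature=(2.0, 4.0, 4.0, 2.0))
```

A larger or longer signature raises the POD and shortens the CED, while the FPR depends only on the baseline model and its limits.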

The usual metric of average run length (ARL), which is ubiquitous in the process control literature and widely used in disease surveillance, is inappropriate for the application discussed here. Use of the ARL presupposes a prolonged step shift in the variable being monitored, which is unrealistic for most diseases. More reasonable is an outbreak for which the number of cases increases for a while and then, even in the absence of an intervention, decreases back to the background noise.

We used a 10-year database of Missouri's surveillance of two epidemiologically distinct diseases: E. coli O157:H7 infections and pertussis. For pertussis we selected a second-order model, and for E. coli O157:H7 infections we selected a first-order model; these choices were based on the epidemiological curve of each disease across the ten-year database. In order to obtain baseline data, the weeks in which an outbreak was obviously occurring were removed from the database. Because most outbreaks usually occur within a defined geographic region rather than in the entire state at once, we tested DESTEM on both the regional and the state-level data.

3.1. First-Order Trigonometric Model Simulation for E. coli O157:H7

The experimental factors included outbreak start time, maximum outbreak size, and outbreak duration. One outbreak per year was considered in the simulation, which could start at the trough, midrange, or peak time of the yearly disease cycle. The outbreak duration for E. coli O157:H7 was 2 (short), 3 (medium), or 4 (long) weeks. Two sets of outbreaks, small and large, were imposed in the simulation, and the magnitude of the small and large outbreaks varied by the outbreak start time. The outbreak sizes considered for E. coli O157:H7 were 1 and 2 additional cases per week during the trough time, and 2 and 4 additional cases per week during the disease midrange and peak occurrence. In total, 18 scenarios were designed to assess the performance of DESTEM in detecting disease outbreaks. Figure 5 illustrates four years of a Monte-Carlo simulation using the parameters of scenario 15 (size = 2, duration = 4, signature = (1, 2, 2, 1), total cases = 6, occurring at peak) along with real E. coli O157:H7 counts in Missouri's Eastern region during 2010–2013. The signature of an outbreak gives the number of additional expected cases for each week during the outbreak; for example, a signature of (1, 3, 1) would indicate a three-week outbreak with one additional case the first week, three the second week, and one the third week. The blue line shows the fitted first-order trigonometric curve used to model the average weekly counts. The circles are the real E. coli O157:H7 cases, and the triangles are simulated counts. The simulated outbreaks occurred at the disease peak (week 32) and are shown as red triangles. The outbreak duration was 4 weeks and the size was 2 additional cases per week.

Each scenario was simulated 10,000 times and the average POD, CED, and FPR were computed. The simulation results, shown in Table 2, indicate that the probability of detecting an outbreak at the regional level was higher and the detection was faster (CED values were smaller) than with the state-level data. The false-positive signal rate was higher for the regional data analysis; however, the overall false-positive rate was small for all scenarios. On average, one false alarm was produced by DESTEM every 1.91 years at the state level and every 1.26 years at the regional level for E. coli O157:H7 infections.

3.2. Second-Order Trigonometric Model Simulation for Pertussis

Since pertussis cases occur more frequently than E. coli O157:H7 cases and the outbreaks are prolonged, the experimental parameters were changed. One outbreak per year was considered at the trough, midrange, or peak of yearly occurrence and could last for 4 (short), 5 (medium), or 6 (long) weeks. The outbreak size during the pertussis trough time (week 10) was set to either 5 or 10 additional cases per week, and during the midrange and peak times (weeks 25 and 46, respectively) the imposed outbreak was either 10 or 20 additional cases per week. Figure 6 shows the actual pertussis data (circles) along with the simulated data (triangles) using the parameters of scenario 15 in Table 3 (size = 10, duration = 6, with a total of 42 additional cases, occurring at peak). Although we found a seasonal component to the counts of pertussis cases, others such as de Greeff et al. [24] have found an even stronger seasonal component. We then ran the simulation 10,000 times for each of the 18 scenarios (Table 3). DESTEM performed well in detecting pertussis outbreaks accurately and in a timely manner. The false-positive signal rate was small: roughly one false alarm every 5.35 years at the state level and one every 3.89 years at the regional level. The probability of detection was nearly 1 for many outbreak scenarios, and the conditional expected delay was usually about 2 weeks or less.

3.3. Performance of DESTEM When There Is Overdispersion

The Poisson distribution is the simplest model for counts, and it can be an appropriate model for counts in a number of situations. The Poisson distribution, being a one-parameter distribution, has the property that the mean and the variance are equal. In many applications, the variance of the data exceeds the mean, making the Poisson model inappropriate. This is called overdispersion. As a remedy, the negative binomial distribution is often used as an alternative to the Poisson (Sparks et al. [25]). The negative binomial distribution is often thought of as the distribution of the number of failures Y in a sequence of independent Bernoulli trials, each with success probability p, before the rth success is obtained. The probability mass function for the random variable Y with negative binomial distribution is

P(Y = y) = [(y + r − 1)! / (y! (r − 1)!)] p^r (1 − p)^y, y = 0, 1, 2, ….

While the description given above requires the parameter r to be a positive integer, the above formula is a valid probability mass function for every r > 0, so long as the gamma function is used instead of factorials. Thus, for an arbitrary positive value r, the negative binomial has probability mass function

P(Y = y) = [Γ(y + r) / (y! Γ(r))] p^r (1 − p)^y, y = 0, 1, 2, ….

The mean and variance of the negative binomial distribution are μ = r(1 − p)/p and σ² = r(1 − p)/p², so that the variance-to-mean ratio is 1/p. The negative binomial can be reparameterized so that the mean is equal to μ and the variance is μ + kμ², where k = 1/r. The parameter k is called the overdispersion parameter. A value of k = 0 corresponds to the variance being equal to the mean, and as k → 0 the negative binomial approaches the Poisson.
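The reparameterization can be checked numerically. In this sketch (the helper name is ours), a mean μ and overdispersion parameter k are converted to the classical (r, p) parameters used by NumPy's negative binomial sampler, and the simulated variance is compared with μ + kμ²:

```python
import numpy as np

def nb_params(mu, k):
    """Convert (mean mu, overdispersion k), i.e. variance mu + k*mu**2,
    to the classical (r, p) negative binomial parameters."""
    r = 1.0 / k              # need not be an integer
    p = r / (r + mu)
    return r, p

# Numerical check: draws should have mean ~ mu and variance ~ mu + k*mu**2.
rng = np.random.default_rng(0)
mu, k = 8.0, 0.2
r, p = nb_params(mu, k)
y = rng.negative_binomial(r, p, size=200_000)
print(y.mean(), y.var())     # close to 8.0 and 8.0 + 0.2 * 64 = 20.8
```

With r = 1/k = 5 and p = r/(r + μ) = 5/13, the classical formulas give mean r(1 − p)/p = 8 and variance r(1 − p)/p² = 20.8, matching μ + kμ².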

We fit this negative binomial model to the pertussis data and obtained an estimated mean function together with an estimated overdispersion parameter. There were, however, a number of pertussis outbreaks during this period, and these outbreaks seem to exaggerate the effect of overdispersion: if the points above the three-standard-deviation limits are removed and the model is refit, the estimated overdispersion parameter is noticeably smaller. The estimated curves are shown in Figure 7. The top part of Figure 7 is the original data and the bottom part is the data with the outliers removed. Although there are still some points above the UCL in the bottom figure, these seem to be due to the higher variability and not to newly discovered outbreaks.

To see how the DESTEM algorithm works when the assumption of a Poisson distribution is violated, we ran several simulations in which the counts had a negative binomial distribution. For both E. coli O157:H7, where the counts averaged approximately 1.5 per week, and pertussis, where the counts averaged approximately 8 per week, we chose the overdispersion parameter to be k = 0.1, 0.2, or 0.4. This corresponds to a variance-to-mean ratio of 1 + kμ, which is 1.15, 1.3, or 1.6 for E. coli O157:H7 and 1.8, 2.6, or 4.2 for pertussis.

The simulations were performed much as in Tables 2 and 3. The observations were simulated according to a negative binomial distribution with mean μ_t and overdispersion parameter k. The Poisson distribution was then (incorrectly) assumed and the control limits were computed using three-standard-deviation limits. The results of the simulations are shown in Tables 4 and 5. Although the results vary a bit, the general conclusion from these tables is that the POD increases when there is overdispersion (desirable), but the FPR also increases (undesirable). These results are somewhat expected, since the added variability will cause more signals; but when the process is stable, these will be false signals.
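The inflation of the FPR can be reproduced in a small simulation. In the sketch below (illustrative names, not the authors' code), negative binomial counts with a constant mean and no outbreak are monitored with the Poisson-based limit, so every exceedance is a false alarm:

```python
import numpy as np

rng = np.random.default_rng(7)
mu = 8.0                       # roughly the pertussis weekly average
ucl = mu + 3 * np.sqrt(mu)     # Poisson-based limit, about 16.49

def false_alarms_per_year(k, years=20_000):
    """Average exceedances per 52-week year when the true counts are
    negative binomial with mean mu and overdispersion k (k = 0 is Poisson)."""
    if k == 0.0:
        counts = rng.poisson(mu, size=(years, 52))
    else:
        r = 1.0 / k
        counts = rng.negative_binomial(r, r / (r + mu), size=(years, 52))
    return (counts > ucl).sum() / years

for k in (0.0, 0.1, 0.2, 0.4):
    print(k, false_alarms_per_year(k))
```

The false-alarm rate grows steadily with k, illustrating why Poisson limits applied to overdispersed counts signal more often even when nothing has changed.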

The DESTEM method could have assumed the negative binomial distribution here instead of the Poisson, and the upper control limit would have been a bit higher. This, however, requires one additional important parameter to be estimated (the overdispersion parameter k). Altogether, for the full model in (9), seven parameters would have to be estimated: β_0, β_1, β_2, β_3, β_4, β_5, and k. It has been shown (Jensen et al. [26] and Champ et al. [27]) that very large sample sizes are often needed when the number of parameters is large; misestimation of these parameters can lead to charts whose properties do not resemble those of the corresponding chart with assumed known parameters. For moderate sample sizes, the overdispersion parameter is estimated with a fairly large standard error. In addition, the existence of outbreaks makes the problem more difficult because we must remove the outbreak from the data set and reestimate the parameters. As a general rule, we have excluded observations that are above the upper control limit and refit the model based on the remaining observations. Of course, when these are removed, the upper control limit drops and inevitably there are more points outside the limits. The process of removing outliers and recomputing limits can continue, but we have found that one or two iterations are sufficient. Whatever points remain outside can be explained either by overdispersion or by still more outbreaks; in many cases, it is difficult to distinguish between these two possibilities.

4. Discussion and Conclusion

We developed a simple analytical methodology, DESTEM, that was able, in this simulation study, to detect aberrations in weekly reportable disease data accurately and in a timely manner. Aberrations in the usual communicable disease patterns may provide a warning sign of an outbreak, but detection of such aberrations remains an important challenge in public health surveillance. Statistical simulation confirmed that DESTEM performs well under different scenarios routinely encountered by the epidemiologist at the Missouri Department of Health and Senior Services (DHSS).

Current widespread usage of syndromic surveillance systems by public health departments across the USA does not preclude the necessity of traditional surveillance of reportable communicable diseases [28]. Syndromic surveillance systems, usually based on sophisticated algorithms, have major strengths, such as improved timeliness of data availability (often within hours), completeness of data, flexibility, and reasonably good situational awareness. But, unlike the diagnostic data that underlie reportable communicable disease surveillance, syndromic data are an indirect indicator of a disease outbreak because they are based on nonspecific symptom data rather than on an actual disease diagnosis. Systems such as Biosense [29], Essence [30], and others are, by contrast, systems for collecting, archiving, and presenting surveillance data. The goals of syndromic surveillance are often short-term (such as for the Olympics or a political convention) and look for omnibus signals in the data. By contrast, our reportable infections data are cyclical, and we look for a deviation from this norm. See Chen et al. [8, chapter 2] for a summary of many of the popular surveillance systems.

The reportable diseases list includes a variety of conditions with very different incidence rates, incubation periods, seasonality, and urgency. Despite this variety, DESTEM performed well when outbreaks of two epidemiologically distinct infections, E. coli O157:H7 and pertussis, were simulated. For the communicable disease epidemiologist, early detection of an outbreak is a priority, but direct application to biosurveillance of industrial SPC methods that require substantial data accumulation to detect a change in the process mean can delay detection and thereby diminish the practical usefulness of a surveillance system. Thus, the primary goal for DESTEM was to meet a specific surveillance purpose: accurate and timely detection of disease outbreaks. A distinguishing characteristic of DESTEM is that it does not directly detect a change in a disease trend; rather, it detects an aberration from the historical trend in any given week of the year. DESTEM assesses whether the number of observations in a specific week of the year is unlikely to have occurred by chance alone, based on a modeled upper control limit that changes from week to week. Once the probability of such an occurrence by chance alone is determined to be too low, DESTEM produces a signal, prompting the epidemiologist to investigate whether a true disease outbreak is occurring.

It has been postulated that public health surveillance data are often overdispersed and that the negative binomial rather than the Poisson distribution is therefore the more appropriate assumption (Sparks et al. [25]). Our simulation data showed that assuming a negative binomial distribution for analysis may not always be advantageous: while the POD did increase, the FPR also went up. The problem of using estimated parameters instead of assumed known parameters arises when we must estimate the overdispersion. It is likely that both the Poisson and the negative binomial distributions can be reasonable assumptions when applied to surveillance data.

Simulation results were also consistent with the common epidemiological experience that outbreaks start locally and should be analyzed using local data. DESTEM performed better when applied to the historical regional data than when state-level data were used as the baseline for comparison. Thus, surveillance analysis methodology needs to be accurate, but it also needs to be applied to the most relevant and most precisely defined dataset.

We chose a relatively long 10-year historical baseline period for modeling because of the need for an adequate sample size and for better estimation of the variance. While producing statistically more stable estimates, longer historical periods can reduce the accuracy of those estimates because systematic effects accumulate over a prolonged period of time. Naturally changing disease trends, new diagnostics, public health control measures, and the changing attitudes of health care providers and the population represent systematic effects that may distort the estimated baseline to varying degrees. This tradeoff seemed justified based on the limited simulation analysis of DESTEM performance, but it will certainly need more rigorous assessment with real public health surveillance data. In order to reduce possible systematic effects in the data, DESTEM renews the 10-year baseline dataset every year by discarding the oldest year from the model while adding the most recent year's data.

In conclusion, DESTEM represents a promising tool for the analysis of surveillance data for a variety of reportable infectious diseases. The analytical reports based on this methodology are easy to understand, not only for epidemiologists without advanced mathematical expertise but for the general public as well.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.