Abstract

Recently, the study of the coherent noise model has led to a simple (binary) prediction algorithm for the forthcoming earthquake magnitude in aftershock sequences. This algorithm is based on the concept of natural time and exploits the complexity exhibited by the coherent noise model. Here, using the relocated catalogue of the Southern California Seismic Network from 1981 to June 2011, we evaluate the application of this algorithm to the aftershocks of strong ($M \geq 6.0$) earthquakes. The study is also extended, by using the Global Centroid Moment Tensor Project catalogue, to the case of the six strongest earthquakes in the Earth during the last almost forty years. The predictor time series exhibits the ubiquitous $1/f$ noise behavior.

1. Introduction

The prediction of the magnitudes and occurrence times of aftershocks is of crucial importance for restricting the losses caused by strong earthquakes (EQs, hereafter) because buildings already damaged by the mainshock may collapse upon the occurrence of a strong aftershock. Recently, an algorithm has been suggested [1] on the basis of the coherent noise model [2–4] and natural time [5–7] that might be useful for the determination of both the magnitudes and the occurrence times of aftershocks. The main scope of the present paper is to investigate the applicability of such an algorithm to aftershock time series in Southern California. This area has been selected in view of the publication [8] of an accurate waveform relocated EQ catalogue for Southern California from 1981 to June 2011 [9] that exhibits tighter spatial clustering of seismicity than the routinely generated catalogue does.

The coherent noise model [2–4] is a model that shows reorganization events (avalanches) whose size distribution follows a power law over many decades and displays aftershock events. These events have been shown [3, 10, 11] to exhibit a behavior similar to that of the Omori-Utsu law [12] for real EQ aftershocks (see also [13, 14]). Moreover, it has recently been shown [15] that the model is compatible with the unified scaling law [16] of the waiting times between EQs.

In a time series comprising a number $N$ of avalanches (or EQs), natural time $\chi_k = k/N$ serves as an index for the occurrence of the $k$th event [5–7]. Natural time focuses on the sequential order of the events, and the analysis is usually made by using the pair $(\chi_k, Q_k)$, where $Q_k$ is a quantity proportional to the size (and hence to the energy) of the $k$th event. Natural time analysis has found useful applications in a variety of fields: Statistical Physics (e.g., [17–23]), Cardiology (e.g., [24–27]), Geophysics (e.g., [6, 28–32]), Atmospheric Sciences (e.g., [33, 34]), Seismology (e.g., [35–40]), Physics of EQs (e.g., [41–45]), EQ prediction (e.g., [5, 46–52]), and so on (for a recent review see [7]). For the case of EQs [36], $Q_k \propto 10^{1.5 M_k}$, where $M_k$ is the magnitude of the $k$th EQ. The two quantities $\chi_k$ and $Q_k$ form the basis of the aftershock magnitude prediction algorithm, which is discussed in the next section.
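As a toy illustration with our own numbers (not data from the paper), consider $N = 3$ EQs with magnitudes 4.0, 5.0, and 4.5 read in natural time:

$$(\chi_1, Q_1) = \left(\tfrac{1}{3}, 10^{6}\right), \quad (\chi_2, Q_2) = \left(\tfrac{2}{3}, 10^{7.5}\right), \quad (\chi_3, Q_3) = \left(1, 10^{6.75}\right),$$

where the $Q_k$ are given in units of the proportionality constant of $Q_k \propto 10^{1.5 M_k}$.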

2. Materials and Methods

2.1. Data Analyzed

As mentioned in the introduction, we consider the waveform relocated catalogue [8] for Southern California from 1981 to June 2011 [9]. In order to filter out distant, poorly constrained events, we use the geographic polygon of Hauksson et al. [8] that covers the Southern California Seismic Network (SCSN) reporting area for local events. The strong events that occurred within this area are shown in Table 1. As seen in this table, there are 4 stronger EQs, namely, the 1992 Landers, the 2010 El Mayor-Cucapah, the 1999 Hector Mine, and the 1994 Northridge EQs (see Figure 1), and 4 smaller EQs, namely, the two Superstition Hills large events [53] in 1987 and the 1992 Joshua Tree and Big Bear events (see Figure 2). The latter two EQs are related [54, 55] to the Landers EQ, which is the strongest one in California for the last sixty years (e.g., see the corresponding table of [56]).

Following Shcherbakov et al. [13], we consider as aftershocks all reported events that occurred during a given period within a region centered at the epicenter of each strong EQ (see the last column of Table 1). The linear dimension of each region scales with the mainshock magnitude $M$ as $10^{M/2}$, starting from the region of 1.1° × 1.1° selected by Shcherbakov et al. [13] for the Landers EQ (cf. the scaling of the aftershock zone with mainshock magnitude was first introduced by Utsu and Seki [57]). This scaling has been employed since it allows the determination of the “aftershock” spatial window upon the occurrence of the mainshock and does not make use of any a posteriori information concerning the aftershocks. Hence, the spatial range of the space-time window usually called the Omori regime—which is identified [58] by examining the best fits of the Omori-Utsu law to the aftershock data—may differ from this window. The period examined was one year for the four stronger mainshocks, while it varies for the four smaller EQs (see Results and Discussion).
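A minimal FORTRAN helper illustrating this scaling is given below; the explicit form $d(M) = 1.1° \times 10^{(M-7.3)/2}$ for the linear dimension, normalized so that the Landers ($M$7.3) mainshock yields the 1.1° × 1.1° window, is our own assumption consistent with the Utsu-Seki area scaling mentioned above.

      double precision function windeg(rmag)
c Hypothetical sketch: linear dimension (in degrees) of the
c aftershock selection window for a mainshock of magnitude rmag,
c assuming d(M) = 1.1*10**((M-7.3)/2) (area ~ 10**M), normalized
c to the 1.1 x 1.1 deg Landers window of Shcherbakov et al. [13].
      double precision rmag
      windeg = 1.1d0*(10.0d0**((rmag-7.3d0)/2.0d0))
      end

For instance, windeg(6.7d0) returns about 0.55, that is, roughly half the Landers window edge.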

2.2. Coherent Noise Model and the Algorithm

The events, to be called avalanches, in the coherent noise model result from the following procedure [2, 3]. Consider a system of $N$ agents, for example, the points of contact in a subterranean fault; with each agent we associate a threshold $x_i$ that represents the amount of stress that the agent can withstand before it breaks. Without loss of generality [3], the $x_i$ may come from the uniform distribution in the interval $[0,1)$. The dynamics of the model consists of two steps, a “stress” step, which is more important and sufficient to produce large avalanches, and an “aging” step. During the “stress” step, we select a random number (or “stress” level) $\eta$ from some probability distribution function and replace all $x_i$ that are smaller than $\eta$ with new values resulting from the uniform distribution in the interval $[0,1)$. The number of agents whose thresholds are renewed is the size of the avalanche. During the “aging” step, a fixed fraction $f$ of agents is selected at random and their thresholds are also replaced with new thresholds obtained again from the uniform distribution in the interval $[0,1)$. If we assume that $N \to \infty$, the thresholds are represented by a threshold distribution function $p(x)$, which initially is considered uniform in the interval $[0,1)$; that is, $p_0(x) = 1$ (see Figure 3). The size of the first avalanche is just the probability $\int_0^{\eta_1} p_0(x)\,\mathrm{d}x$, which represents the “mass” of the agents that had thresholds smaller than $\eta_1$. After the subsequent “aging” step the threshold distribution becomes $p_1(x)$. Upon repeating the two steps for the second time using $\eta_2$, we can obtain the size of the second avalanche and the new threshold distribution $p_2(x)$, and so on.
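To make the above dynamics concrete, the following self-contained FORTRAN sketch of ours simulates the two steps and prints the size of each avalanche; the agent number, the exponential “stress” distribution with mean sigma, the aging fraction f, and the seed are illustrative assumptions and not values taken from the main text.

      program cnmsim
c Minimal simulation sketch of the coherent noise model [2, 3];
c all parameter values below are illustrative assumptions.
      integer nag, nstep
      parameter (nag=10000, nstep=2000)
      double precision x(nag), eta, sigma, f, ran0
      integer idum, i, k, naval, nage
      idum = 12345
      sigma = 0.05d0
      f = 0.01d0
c Initial thresholds: uniform in [0,1)
      do i = 1, nag
         x(i) = ran0(idum)
      end do
      do k = 1, nstep
c "Stress" step: draw eta from an exponential distribution with
c mean sigma; all agents with thresholds smaller than eta break
c and get new uniform thresholds; their number is the avalanche
c size.
         eta = -sigma*dlog(1.0d0-ran0(idum))
         naval = 0
         do i = 1, nag
            if (x(i).lt.eta) then
               x(i) = ran0(idum)
               naval = naval + 1
            end if
         end do
c "Aging" step: a fraction f of the agents (here drawn with
c replacement, for simplicity) also get new uniform thresholds.
         nage = int(f*dble(nag))
         do i = 1, nage
            x(1+int(ran0(idum)*dble(nag))) = ran0(idum)
         end do
         write(*,*) k, naval
      end do
      end

      double precision function ran0(idum)
c Park-Miller uniform random generator in (0,1) using Schrage's
c algorithm; idum is updated on each call.
      integer idum, ia, im, iq, ir, k
      parameter (ia=16807, im=2147483647, iq=127773, ir=2836)
      k = idum/iq
      idum = ia*(idum-k*iq) - ir*k
      if (idum.lt.0) idum = idum + im
      ran0 = dble(idum)/dble(im)
      end

The avalanche sizes thus produced can then be read in natural time exactly as described above.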

In the case of the coherent noise model for $N \to \infty$, the threshold distribution after the $k$th avalanche is a piecewise constant function having [1] the following general form:
$$p_k(x) = \sum_{i=0}^{n_k} h_i \left[\Theta(x - x_i) - \Theta(x - x_{i+1})\right], \quad (1)$$
where $n_k$ is the number of steps present in the threshold distribution function after the $k$th avalanche (e.g., see Figure 3), $h_i$, $x_i$, and $x_{i+1}$ are positive real parameters, and $\Theta(x)$ is the Heaviside (unit) step function; that is, $\Theta(x) = 1$ if $x \geq 0$ and $\Theta(x) = 0$ if $x < 0$. The analysis of numerical results has shown [1] that the integer number $n_k$ may serve as a predictor for the size of the next avalanche. It has also been found [1] that $n_k$ as a function of $k$ (and hence versus natural time) exhibits the ubiquitous $1/f$ noise [59–63] (together with a much slower logarithmic trend; see [1]).

The number of steps changes when a “stress” level is applied that either (a) eliminates at least two smaller “stress” levels previously applied or (b) is smaller than the smallest nonzero $x_i$ (cf. the corresponding cases in Figure 3). When $f = 0$, $n_k$ coincides with the cardinality of the sets of successive extrema formed when studying the time series of the random stresses $\eta_k$. The sets of successive extrema are defined [25] as follows: $\mathcal{S}_0$ equals the empty set $\emptyset$. Each subsequent $\mathcal{S}_k$ is obtained by applying the procedure described below $k$ times: select a random number $\eta_k$ from a given probability density function and compare it with all the members of $\mathcal{S}_{k-1}$. In order to construct the set $\mathcal{S}_k$, we discard from the set $\mathcal{S}_{k-1}$ all its members that are smaller than $\eta_k$ and furthermore include $\eta_k$. Thus, $\mathcal{S}_k \neq \emptyset$ for all $k \geq 1$, and $\mathcal{S}_k$ is a finite set of real numbers whose members are always larger than or equal to $\eta_k$. Moreover, $|\mathcal{S}_k| \leq k$. The increase of the cardinality of these sets is at most 1 per step, but its decrease may be as large as $|\mathcal{S}_{k-1}| - 1$, thus exhibiting a characteristic asymmetry as $k$ increases.
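As a brief illustration with our own toy numbers, let the random stresses be $\eta_1 = 0.3$, $\eta_2 = 0.7$, $\eta_3 = 0.5$, $\eta_4 = 0.9$, and $\eta_5 = 0.2$. Then

$$\mathcal{S}_1 = \{0.3\}, \quad \mathcal{S}_2 = \{0.7\}, \quad \mathcal{S}_3 = \{0.7, 0.5\}, \quad \mathcal{S}_4 = \{0.9\}, \quad \mathcal{S}_5 = \{0.9, 0.2\},$$

so the cardinalities are 1, 1, 2, 1, and 2: each new stress increases the cardinality by at most one, whereas a single large stress (here $\eta_4$) can collapse the set, which is the asymmetry mentioned above.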

The aforementioned predictability of the coherent noise model on the basis of $n_k$ could also be seen as follows: when a relatively large random stress is applied, the coherent noise model almost “forgets” the effect of previously applied smaller stresses, acquiring a relatively high threshold density over a relatively wide threshold range starting from the applied stress level (cf. Figure 3). Since the archetypal physical system of the coherent noise model assumed (e.g., see the last few lines of the first page of [2]) that the agents correspond to the points of contact in a subterranean fault, it has been suggested [1] that the coherent noise model predictability can be exploited in the case of EQ aftershocks. In the latter case, one studies the successive extrema formed in the time series of the aftershock magnitudes $M_k$ ($M_k \geq M_{\mathrm{thres}}$, where $M_{\mathrm{thres}}$ is a threshold magnitude) as a function of their order of occurrence $k$, that is, natural time, by assuming $\eta_k = M_k$. Setting $k = 0$ at the occurrence time of the mainshock, as shown in the upper left rectangular box of Figure 4, the corresponding sets $\mathcal{S}_k$ of successive extrema (as defined above) in the aftershock magnitude time series are constructed, and the quantity $e_k = |\mathcal{S}_k| - 1$ (e.g., see Figure 4) plays a role similar to that of $n_k$ in the case of the coherent noise model. A simple computer program that calculates $e_k$ is presented in the Appendix.

Figure 5(a) depicts the aftershock magnitude time series of the Landers EQ (see Table 1) versus conventional time together with the time series of $e_k$. The argument of the latter time series has been intentionally shifted by unity in order to compare the value of $e_k$, that is, the number of successive extrema before the occurrence of the $(k+1)$th aftershock, with the magnitude $M_{k+1}$ of the $(k+1)$th aftershock. We observe a characteristic pattern which reflects the already mentioned asymmetry when $e_k$ is considered as a time series with respect to the natural number $k$, but no clear correlation between $e_k$ and $M_{k+1}$ can be seen. For this reason, we plot in Figure 5(b) the excerpt corresponding to the first three and a half hours after the mainshock occurrence. During this period, the strongest aftershock of the Landers EQ (see the sixth line of Table 1), the Big Bear aftershock [54], took place. We may observe that most of the time strong aftershocks occur when $e_k$ is below 5. In order to further clarify what happens, it is most appropriate to depict the results in natural time. This is done in Figure 5(c), where we observe that the Big Bear aftershock occurred when $e_k$ was equal to 4. Thus, for the Landers EQ, the suggestion that $e_k$ might be a useful predictor could be considered plausible. The subject of the next section is to examine statistically whether $e_k$ can be used for the prediction of the magnitude of the forthcoming aftershock, in the sense that this magnitude exceeds a threshold. Here, we note that the coherent noise model method might be able to predict the magnitude of the next aftershock; however, it cannot say when it will occur. This makes impossible the use of other verification methods such as the Molchan diagram [64–66].

3. Results and Discussion

We first examine whether $e_k$, as it results from the aftershock time series, exhibits $1/f$ noise. To this end, we use the Detrended Fluctuation Analysis (DFA) [67–69], which is a standard robust method suitable for detecting long-range power-law correlations and has been applied to diverse fields where scale-invariant behavior emerges, for example, from DNA [70–72] and heart dynamics [68, 73] to meteorology [74, 75], atmospheric physics [76, 77], geophysics [28, 40, 78, 79], and economics [80–85]. The major advantage of DFA is the systematic elimination of polynomial trends of different order [86–89] (and hence of the much slower aforementioned logarithmic trend in $e_k$). In DFA, a fluctuation measure $F(s)$ is constructed, based on the residuals of a piecewise polynomial fitting to the profile of the time series, and studied versus the scale $s$ used for the polynomial detrending (for more details, see [7]). If the time series exhibits power-law correlations, $F(s) \sim s^{\alpha}$, and the power spectrum exponent $\beta$ (for a power spectrum $S(f) \sim 1/f^{\beta}$) is related [70] to the DFA exponent $\alpha$ through $\beta = 2\alpha - 1$. Figure 6 depicts the results of the DFA for the $e_k$ time series deduced from the aftershock magnitude time series of the four stronger EQs of Table 1. We observe that the resulting DFA exponents are very close to unity, thus showing that $e_k$ indeed exhibits $1/f$ noise.
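For reference, in the standard DFA formulation (e.g., see [67, 70]) one constructs the profile $y(j) = \sum_{i=1}^{j} (e_i - \bar{e})$ of the time series of length $L$, divides it into windows of length $s$, subtracts within each window the polynomial fit $y_s(j)$, and computes

$$F(s) = \sqrt{\frac{1}{L} \sum_{j=1}^{L} \left[y(j) - y_s(j)\right]^2} \sim s^{\alpha},$$

so that, in view of $\beta = 2\alpha - 1$, a DFA exponent $\alpha \approx 1$ corresponds to $1/f$ noise ($\beta \approx 1$).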

We now turn to the prediction algorithm based on $e_k$ discussed in the previous section. This is applied in natural time. Thus, the time increased probability (TIP) [90, 91] is turned on after the occurrence of the $k$th aftershock when $e_k$ is equal to or below some threshold $e_c$, and it lasts until the occurrence of the (next) $(k+1)$th aftershock of magnitude $M_{k+1}$. Following Shcherbakov et al. [13], we set the threshold magnitude $M_{\mathrm{thres}}$ of the aftershock time series. Under these assumptions, the aftershock magnitude prediction reduces to a binary prediction. We can therefore use the Receiver Operating Characteristics (ROC) graph [92] to depict the prediction quality. This is a plot of the hit rate (or True Positive rate, TPr) versus the false alarm rate (or False Positive rate, FPr), as a function of the total rate of alarms, which is tuned by the threshold $e_c$. The hit rate is the ratio of the cases for which TIP was on and $M_{k+1} \geq M_{\mathrm{target}}$ over the total number of cases for which $M_{k+1} \geq M_{\mathrm{target}}$, where $M_{\mathrm{target}}$ is the target magnitude of the prediction. The false alarm rate is the ratio of the cases for which TIP was on and $M_{k+1} < M_{\mathrm{target}}$ over the total number of cases for which $M_{k+1} < M_{\mathrm{target}}$. Random predictions generate equal hit and false alarm rates, and hence they lead to the diagonal in the ROC plot. Thus, a predictor is useful only when its points lie above this diagonal.
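A minimal FORTRAN sketch of ours for this counting, which computes a single ROC operating point from the out.dat file produced by the Findek.f code of the Appendix (each row containing the aftershock magnitude and the corresponding $e_k$), reads as follows; the values $e_c = 4$ and $M_{\mathrm{target}} = 5.0$ are illustrative assumptions.

      program rocpt
c Sketch (ours): hit rate and false alarm rate of the e_k-based
c TIP for an assumed threshold ec and target magnitude rmtarg.
      double precision rm, rmtarg
      integer ek, ec, ntp, nfp, np, nn
      rmtarg = 5.0d0
      ec = 4
      ntp = 0
      nfp = 0
      np = 0
      nn = 0
      open(1,file='out.dat')
20    read(1,*,end=30) rm, ek
      if (rm.ge.rmtarg) then
c Positive case: TIP on (ek<=ec) counts as a hit
         np = np + 1
         if (ek.le.ec) ntp = ntp + 1
      else
c Negative case: TIP on counts as a false alarm
         nn = nn + 1
         if (ek.le.ec) nfp = nfp + 1
      end if
      goto 20
30    close(1)
      if (np.gt.0) write(*,*) 'Hit rate (TPr):', dble(ntp)/dble(np)
      if (nn.gt.0) write(*,*) 'False alarm rate (FPr):',
     &                        dble(nfp)/dble(nn)
      end

Varying ec from 0 up to the maximum observed $e_k$ traces out the whole ROC curve.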

Of course, random predictions lead to ROC curves that exhibit fluctuations which depend on the number $P$ of positive cases (when $M_{k+1} \geq M_{\mathrm{target}}$) and the number $N$ of negative cases (when $M_{k+1} < M_{\mathrm{target}}$) to be predicted. The statistical significance of an ROC curve depends [93] on the area under the curve in the ROC plane (cf. this area is also usually termed AUC). Mason and Graham [93] have shown that
$$\mathrm{AUC} = \frac{U}{PN}, \quad (2)$$
where $U$ follows the Mann–Whitney $U$-statistics [94]. Recently, a visualization scheme for the statistical significance of ROC curves has been proposed [95]. It is based on $k$-ellipses, which are the envelopes of the confidence ellipses (cf. a point lies outside a given confidence ellipse with a prescribed probability) obtained when using a random predictor and varying the prediction threshold. These $k$-ellipses cover the whole ROC plane, and using them we can have a measure [95] of the probability to obtain by chance (i.e., using a random predictor) an ROC curve passing through each point of the ROC plane. In the present case, the $p$ values that will be reported below correspond to the whole ROC curve and are calculated on the basis of its AUC using Eq. (2) by means of the computer program VISROC.f made available by Sarlis and Christopoulos [95]. This $p$ value gives the probability of obtaining by chance a prediction scheme that leads to an AUC equal to the observed one for given values of $P$ and $N$. It is worthwhile to mention that in the standard method, by using the Gutenberg-Richter law, which is a skewed distribution for which ROC graphs are especially useful [92], the magnitudes are assumed to be independent, thus unpredictable, and this remains so even if we consider the so-called [96] short-term aftershock incompleteness (STAI), which mainly arises because directly after a large EQ the overlapping arrivals of waves from different events can mask the smaller ones [97–99], thus temporarily increasing the completeness threshold of the EQ catalogue (see also the discussion below). On the other hand, the present method tries to predict large aftershocks from the correlations between magnitudes. Below in this paper, we focus on evaluating the statistical significance of this method; other methods may also be needed to evaluate its practical utility.
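The AUC itself, needed in Eq. (2), can be estimated directly from its Mann–Whitney interpretation, namely, as the fraction of positive-negative pairs in which the positive case ($M_{k+1} \geq M_{\mathrm{target}}$) exhibits the smaller $e_k$, ties being counted with weight 1/2. The following sketch of ours reads the out.dat file of Findek.f; the value rmtarg = 5.0 is again an illustrative assumption.

      program calcauc
c Sketch (ours): AUC of the e_k predictor via Mann-Whitney pair
c counting over all positive-negative pairs.
      double precision rm(100000), rmtarg, s
      integer ek(100000), n, i, j, np, nn
      rmtarg = 5.0d0
      n = 0
      open(1,file='out.dat')
20    read(1,*,end=30) rm(n+1), ek(n+1)
      n = n + 1
      goto 20
30    close(1)
      np = 0
      nn = 0
      s = 0.0d0
      do i = 1, n
         if (rm(i).ge.rmtarg) then
            np = np + 1
            do j = 1, n
               if (rm(j).lt.rmtarg) then
c A smaller e_k means a stronger alarm, hence a concordant pair
                  if (ek(i).lt.ek(j)) s = s + 1.0d0
                  if (ek(i).eq.ek(j)) s = s + 0.5d0
               end if
            end do
         else
            nn = nn + 1
         end if
      end do
      if (np.gt.0.and.nn.gt.0) write(*,*) 'AUC =',
     &                         s/(dble(np)*dble(nn))
      end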

3.1. The Case of the Four Stronger EQs in Southern California

These are the cases of the 1992 Landers, the 2010 El Mayor-Cucapah, the 1999 Hector Mine, and the 1994 Northridge EQs. As mentioned, the analysis is made by considering the areas shown in Table 1, centered at the epicenter of each mainshock, for a period equal to one year. The aftershocks thus obtained (see Figure 1) have been sorted according to their origin time, and the time series $e_k$ versus the order of occurrence $k$ has been constructed. Figure 5 shows an example of the $e_k$ time series obtained from the successive extrema of the aftershock magnitude time series of the 1992 Landers EQ. We then apply the prediction algorithm as described in the previous section for various target magnitudes $M_{\mathrm{target}}$: when $e_k$ is smaller than or equal to $e_c$, the TIP turns on; if the magnitude $M_{k+1}$ of the (next) aftershock satisfies $M_{k+1} \geq M_{\mathrm{target}}$, we have a true positive (successful) prediction, and if $M_{k+1} < M_{\mathrm{target}}$, we have a false positive (unsuccessful) prediction. When TIP is off, that is, $e_k > e_c$, and $M_{k+1} < M_{\mathrm{target}}$, we have a true negative (successful) prediction; if $M_{k+1} \geq M_{\mathrm{target}}$, we have a false negative (unsuccessful) prediction, a miss. This way we can obtain ROC curves for various $M_{\mathrm{target}}$.

We first focus on a high $M_{\mathrm{target}}$, selected to the first decimal digit, that is determined from the mainshock magnitude $M$. The results for these $M_{\mathrm{target}}$'s for the four strong EQs under discussion are shown in Figure 7. The corresponding $p$ values to obtain such ROC curves by chance are 0.012%, 0.036%, 0.81%, and 0.24% for the Landers, El Mayor-Cucapah, Hector Mine, and Northridge EQs, respectively. Figure 8 depicts the ROC curves obtained when selecting a lower target magnitude. We observe that in this case also we obtain ROC curves of high statistical significance. Finally, in Figures 9(a) and 9(b), we present the results obtained for the AUC and the corresponding $p$ values when considering a wide range of $M_{\mathrm{target}}$'s from 4.0 to 5.5.

3.2. The Case of the Four Smaller EQs

As already mentioned, this is the case of the two Superstition Hills large events [53] in 1987 and the 1992 Joshua Tree and Big Bear events, which are related [54, 55] to the Landers EQ. The areas used for the selection of the aftershock magnitude time series are shown in the last column of Table 1, but now the period for the study of aftershocks may differ from one year since these EQs cannot be considered as clearly independent mainshocks. Specifically, in the case of the first Superstition Hills 6.2 EQ, we considered as study period the interval until the occurrence of the second Superstition Hills 6.6 EQ, that is, roughly 11 hours and 23 minutes. Moreover, since the Joshua Tree EQ is clearly related [54, 55] to the Landers EQ, we considered as study period the interval until the occurrence time of the Landers EQ, that is, roughly 66 days and 7 hours. In the remaining two cases treated here, a period of one year has been used. The aftershocks considered in each case are shown in Figure 2.

Following the procedure used in the previous subsection, we sorted the aftershocks according to their occurrence time, constructed the magnitude time series, and from the latter we obtained the predictor time series $e_k$. This time series has been used for the prediction of the aftershock magnitude time series, and the resulting ROC curves are shown in Figures 10 and 11 for the higher and the lower $M_{\mathrm{target}}$, respectively, for each EQ. In the latter figure, we clearly observe that the predictions made on the basis of the proposed algorithm are far beyond chance (this holds even for the ROC related to the Joshua Tree EQ in Figure 11, which exhibits the largest $p$ value). If we focus our attention on the high $M_{\mathrm{target}}$'s (see Figure 10), larger $p$ values are obtained, which are 5%, 0.1%, 0.9%, and 11% for the second Superstition Hills, Big Bear, first Superstition Hills, and Joshua Tree EQs, respectively. These values are also larger than those obtained in the previous subsection when studying the stronger EQs in Southern California. In Figures 9(c) and 9(d), we present the results obtained for the AUC and the corresponding $p$ values when considering a wide range of $M_{\mathrm{target}}$'s from 3.5 to 5.0. Figure 9(d) shows that in the vast majority (45 out of 59) of the examined cases the obtained $p$ values are below 5%, pointing to the statistical significance of the method.

At this point we have to comment on the values of $e_k$ obtained after the last aftershock considered in the cases of the first Superstition Hills and the 1992 Joshua Tree EQs. As mentioned, our study above was terminated just before the occurrence of the second Superstition Hills and the Landers EQ, respectively. Hence, these values coincide with the $e_k$ values just before the occurrence of the latter two EQs and are 11 and 7, respectively. The operating points on the ROC diagram corresponding to $e_c$ equal to these values are shown with the red arrows in Figure 10. An inspection of this figure shows that, depending on the selection of the operating point of the proposed algorithm, an alarm could probably have been on almost 23 hours before the occurrence of the Landers EQ (actually after 12:38:20 UTC on June 27, 1992, when an EQ occurred at 34.046°N 116.287°W, i.e., 22 km away from the Landers EQ epicenter), whereas this is rather improbable for the case of the second Superstition Hills EQ since the corresponding ROC point leads to an extremely high false alarm rate. Hence, the selection of $e_c$ is also very important. The latter point, together with its implications for a practical application of the proposed algorithm, will be discussed in the next subsection.

3.3. Further Analysis of the Results

We first focus on the fact that the prediction algorithm presented here is statistically significant. This statistical significance is very high when considering a relatively small $M_{\mathrm{target}}$ (e.g., see Figures 8 and 11), and it also pertains when focusing on high $M_{\mathrm{target}}$'s (e.g., see Figures 7 and 10). A comparison of Figures 9(b) and 9(d) shows that this statistical significance may decrease when we consider the smallest in magnitude EQ, that is, the Joshua Tree EQ. Even in this case, however, we obtain $p$ values which exclude the possibility of a random result.

In order to evaluate the statistical significance of the proposed algorithm out of sample, and for EQs in regions different from California, we proceeded, by employing the same procedure as that described in Materials and Methods, to the analysis of the six strongest EQs in the Earth since 1976 (see Table 2). For this study the data analyzed come from the Global Centroid Moment Tensor (CMT) Project [100, 101], which covers global seismicity since 1 January 1976, and we considered the same magnitude threshold $M_{\mathrm{thres}}$ as in Sarlis et al. [51]. For EQs that took place before 2011, the 1976 to 2010 CMT catalogue was used, whereas for EQs from 1 January 2011 to 1 July 2015 the monthly CMT catalogues have been employed (all these catalogues are publicly available from http://www.globalcmt.org/CMTfiles.html). For reasons of brevity, we will refer to these data as the GCMT catalogue. The AUC and $p$ values obtained for the wide range of $M_{\mathrm{target}}$ from 6.0 to 7.5 are depicted in Figure 12 in a fashion similar to that of Figure 9. From the six cases studied, only the results obtained for the 2005 Sumatra-Nias EQ do not appear statistically significant. This fact has also been verified by producing $10^2$ randomly shuffled copies of the original time series and examining whether the AUCs obtained for a shuffled copy were higher than those depicted in Figure 12(a) in the majority of the cases presented there for each EQ: for the 2005 Sumatra-Nias EQ, in 54% of the cases the AUCs for the shuffled copies were higher than those depicted in Figure 12(a), while this percentage falls to 9% and 8% for the Sumatra-Andaman and Chile cases, respectively. For the other three cases studied, the obtained percentages were smaller than 2%. We note, however, that the 2005 Sumatra-Nias EQ occurred almost three months after the Sumatra-Andaman EQ and is included within the rectangular region of the latter EQ (see Table 2). The corresponding value of $e_k$ almost three and a half days before the 2005 Sumatra-Nias EQ was 4, which is the same as that before the Big Bear aftershock; see Figure 5(c).

An inspection of Figure 12(b) also shows that the $p$ values are much larger than those of Figure 9. This is related to the fact that the $M_{\mathrm{thres}}$ used for the GCMT catalogue is much larger than that used in Southern California, where an accurate waveform relocated regional catalogue [8, 9] has been employed. Using a higher $M_{\mathrm{thres}}$ does not allow $e_k$ to vary significantly since there are no small EQs to increase it. For example, the maximum value of $e_k$ is 10, 9, 10, 11, 17, and 6 for the six EQs of the GCMT catalogue presented in the first column of Table 2, respectively, compared to 22, 22, 22, 24, 15, 19, 14, and 15 for the EQs presented in the first column of Table 1 for Southern California. Hence, the threshold $e_c$ has to be modified accordingly. Figure 13 depicts two selections of operating points, corresponding to a fixed $e_c$ for each catalogue, for the totality of the 14 EQs studied when focusing on the prediction of strong aftershocks; it can be seen as a summary evaluation for various regions and two different catalogues (we will return to this point later).

Let us now return to the cases in Southern California and discuss the effect of STAI on the statistical significance of the method. In order to correct for this effect, we followed Helmstetter et al. [102, 103] and considered for each EQ of Table 1 a time-varying $M_{\mathrm{thres}}(t)$ given by
$$M_{\mathrm{thres}}(t) = M - 4.5 - 0.75 \log_{10}(t), \quad (3)$$
where $M$ is the mainshock magnitude and $t$ is the elapsed time measured in days from the mainshock, that is, Eq. (15) of Helmstetter et al. [103], and repeated the calculations focusing only on the aftershocks during the first 24 hours after each mainshock, where STAI is stronger. The results, shown in Figure 14, reveal that out of the 115 obtained $p$ values only 6 exceed 7%, whereas the vast majority, that is, 107 cases, are below 5%.
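A one-line FORTRAN helper of ours mirroring this time-varying threshold, under the form of Eq. (3) quoted above, is:

      double precision function rmthr(rmain, tdays)
c Time-varying completeness threshold after a mainshock of
c magnitude rmain, with tdays the elapsed time in days, following
c Eq. (3), that is, Eq. (15) of Helmstetter et al. [103].
      double precision rmain, tdays
      rmthr = rmain - 4.5d0 - 0.75d0*dlog10(tdays)
      end

For example, one day after the Landers ($M = 7.3$) mainshock, rmthr(7.3d0,1.0d0) gives 2.8.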

As an additional check, we examined whether the validity of the proposed method originates from STAI by the following test: each time the magnitude $M$ of a reported aftershock satisfied $M \geq M_{\mathrm{thres}}(t)$, it was considered as the new element of a time series labeled “original”. At the same time, a magnitude randomly selected from the distribution of the original time series was considered as the new element of a time series labeled “synthetic”. The two time series have been analyzed by ROC using the prediction algorithm, and the corresponding AUCs have been compared $10^4$ times. The results showed that the percentage of cases for which the AUC of a synthetic time series was larger than that of the original one, in the majority of the cases presented in Figure 9, is 4.2%, 8.2%, 0.6%, 9.9%, 59%, 0.005%, 7.2%, and 9.5% for the 8 EQs presented in the first column of Table 1, respectively. These results (apart from those concerning the second Superstition Hills 6.6 EQ, which will be discussed below) indicate that the observed predictability cannot be fully accounted for by STAI (cf. if that was the case, only up to two of the above 8 numbers should have been below 10% with probability 96.2% $\left(= \sum_{i=0}^{2} \binom{8}{i} 0.1^{i} 0.9^{8-i}\right)$ due to the binomial distribution).

One might also argue that the present results could alternatively be reproduced by synthetic catalogues generated using the Epidemic Type Aftershock Sequence (ETAS) model [105], which couples the main statistical laws describing the occurrence of earthquakes in time, space, and magnitude, that is, the Gutenberg-Richter magnitude distribution and the Omori-Utsu law, as well as the increase of the number of aftershocks with the mainshock magnitude (cf. the Båth law). Following Zhuang and Touati [106], we used the code etasim.f of SASeis2006 [104] (see also [105, 107, 108]) for ETAS. Simulating magnitudes by the Gutenberg-Richter law for $b = 1$ with cutoff magnitude 2.0, we compared the results thus obtained (blue asterisks in Figure 15) with those of the test data provided in SASeis2006 (http://www.ism.ac.jp/~ogata/Ssg/softwares.html). These test data come [104] from the Japan Meteorological Agency (JMA) EQ catalogue and concern the aftershock data for almost 18 days after the 26 July 2003 EQ at northern Miyagi-Ken, northern Japan, and the aftershock data for almost 6 days after the 16 August 2005 EQ offshore of western Fukuoka-Ken, Kyushu, Japan. Both the real test data and the simulated time series have been considered with the same magnitude threshold, and the algorithm has been applied by using the code presented in the Appendix. Figure 15 depicts the AUCs and the corresponding $p$ values. An inspection of this figure shows that the results obtained for the real data clearly outperform the simulated ones since in only 3 cases (out of 48) did the AUC of the ETAS simulated data exceed that of the real data. Thus, the experimentally observed predictability cannot be reproduced by ETAS. These results also show that the present algorithm conveys additional information, and for this reason it may be used together with standard and well-accepted methods [109, 110] in which the forecasting of aftershocks in the time and magnitude domains is carried out by employing the modified Omori law and the Gutenberg-Richter law.

The models discussed in the previous paragraphs do not impose any direct correlations between successive magnitudes apart from those inherited from the combination of the Gutenberg-Richter law and the Omori law, the fact that the rate of triggered events increases with the magnitude of the triggering event (that is, the ETAS model), and STAI. It is a fact, however, that magnitude correlations exist in both regional [17, 36, 40, 111] and global [38, 39] seismic catalogues. Thus, one should also investigate whether such correlations are responsible for the predictability observed in Figure 9. Particularly for California, these correlations have been extensively studied by Davidsen and Green [96] and Lippiello et al. [112]. Davidsen and Green [96] showed that they cannot be fully accounted for by STAI, as was also found by Lippiello et al. [112]. In both of these studies, based on the investigation of the magnitude correlations first introduced by Lippiello et al. [113], the common method of investigating such correlations was the construction of the difference between the cumulative distribution of the observed magnitude differences between successive EQs and that obtained from randomly shuffled copies of the original catalogue. Since such correlations are expected [112] to be stronger for EQs which are closer in time and to increase with the magnitude threshold of the catalogue used, we focused our attention on the first 24 hours after the mainshock, applied the time-dependent threshold $M_{\mathrm{thres}}(t)$, and investigated the presence of such correlations. The two cumulative distributions, depicted by the red (solid) and the green (dashed) curves in Figure 16, actually differ, as is also shown by the (red solid) curve in Figure 17. Hence, it is of crucial importance to confirm that the predictability of large aftershocks presented in Figure 14 (for the same time period and the same $M_{\mathrm{thres}}(t)$) is not caused by the phenomenon observed in Figures 16 and 17. For this reason, we employed the method of surrogates proposed by Schreiber and Schmitz [114], as implemented in the TISEAN package [115], and applied it to the magnitude time series. Such a method creates surrogate data with the same Fourier amplitudes and the same distribution of values. This procedure reproduces the behavior found in Figures 16 and 17 (see the blue dashed curve in Figure 16 and the green, blue, and thick black curves in Figure 17). We generated $10^3$ such surrogate magnitude time series, analyzed them by the prediction algorithm, and compared their AUCs with the AUCs of the observed magnitude time series that led to Figure 14. The percentages of cases in which the AUC of a surrogate time series was greater than the AUC of the observed magnitude time series, in the majority of the cases shown in Figure 14, are 9.0%, 0.2%, 0.6%, 43%, 0.1%, 0.1%, 0.7%, and 54% for the Landers, El Mayor-Cucapah, Hector Mine, Northridge, 2nd Superstition Hills, Big Bear, 1st Superstition Hills, and Joshua Tree EQs, respectively. Such percentages (cf. if we repeat the procedure for the whole study period, the percentages for the Northridge and Joshua Tree EQs become 14% and 32%, resp.) show that the behavior found in Figures 16 and 17, which is, as mentioned, related to the magnitude correlations found by Davidsen and Green [96] and Lippiello et al. [112], cannot fully account for the observed predictability.
Moreover, these results show that the predictability of the aftershock time series is a nonlinear property [116] of the magnitude time series of aftershocks. The latter fact is compatible with the high nonlinearity exhibited by the coherent noise model.

Moreover, as a more direct analysis avoiding the effect of data incompleteness, we excluded the first one-day period after the mainshock from the target period of the forecasting, employed the time-varying threshold $M_{\mathrm{thres}}(t)$, and examined the predictability in the [1 day, 1 month] time interval. The corresponding AUCs and $p$ values are shown in Figure 18. These values (apart from those concerning the second Superstition Hills 6.6 EQ) certainly point to the statistical significance of the present method (86 cases out of the 129 lead to $p$ values smaller than 10%, with 79 of them below 5%), although they are higher than the $p$ values corresponding to Figure 9. This effect is not incompatible with the behavior exhibited by the coherent noise model. It has been recently shown [21, 22] that the expected avalanche size of the $k$th avalanche of the coherent noise model in a statistical ensemble of initially identical systems relaxes to the steady state value by following a power-law decay, described by the Tsallis $q$-exponential [117], almost inversely proportional to $k$. When we start our study on the second day after the mainshock, we exclude almost 10% of the first-month aftershocks. These events, however, express the most important part of the off-equilibrium behavior, which can be best described by the present algorithm, as shown in Figure 14. As the order $k$ of the avalanche increases, the system deviates less from the steady state, and hence it is plausible to assume that its predictability decreases. The fact that the predictability decreases when we exclude the first 10% of the avalanches produced and focus on the latter 90% has also been verified by simulations of the coherent noise model.
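For reference, the Tsallis $q$-exponential [117] entering the above description is defined by

$$e_q(x) \equiv \left[1 + (1-q)x\right]^{1/(1-q)},$$

which reduces to the ordinary exponential as $q \to 1$, while for $q > 1$ the function $e_q(-x)$ decays for large $x$ as the power law $x^{-1/(q-1)}$, consistent with the almost inverse proportionality to $k$ mentioned above.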

Thus, in summary, we have shown that the observed predictability of the present algorithm cannot be attributed solely to STAI, nor is it a result included in up-to-date ETAS modelling, nor is it a direct consequence of the magnitude correlations studied by Davidsen and Green [96] and Lippiello et al. [112] in Southern California. The present method tries to predict large aftershocks from correlations between magnitudes which appear to mimic the behavior of the coherent noise model. According to our view, this is the reason behind the predictability observed in Figure 9, which corresponds to a large study period, as well as in Figure 14, which corresponds to the first day after the mainshock.

If we focus on the strongest aftershock (which may be the most destructive one), the value of the predictor $e_k$ before its occurrence for all the 14 EQs studied is 4, 3, 4, 0, 2, 6, 1, and 7 for the Landers, El Mayor-Cucapah, Hector Mine, Northridge, second Superstition Hills, Big Bear, first Superstition Hills, and Joshua Tree EQs in California, whereas it is 4, 3, 0, 0, 0, and 0 for the Sumatra-Andaman, Sumatra-Nias, Sumatra Indonesia, Chile, Tohoku Japan, and Indian Ocean EQs in the GCMT catalogue. Thus, when selecting the threshold $e_c = 7$ for Southern California and $e_c = 4$ for GCMT, as in Figure 13(b), one could have successfully predicted the strongest aftershock for all the 14 EQs studied, with the corresponding operating points above the diagonal in the ROC diagram. Moreover, in 12 out of the 14 cases studied, that is, if we discard the cases of the Sumatra-Nias EQ (which, as mentioned, could have been predicted based on the analysis of Sumatra-Andaman) and the Indian Ocean EQ, these operating points lie in the left uppermost quadrant of the ROC diagram, with an average hit rate of 76% and an average false alarm rate of 28% (cf. the corresponding average values for all 14 EQs are 77% and 34%, resp.). Similarly, when selecting smaller thresholds $e_c$ for the two catalogues, 12 out of the 14 operating points lie in the region of high hit rate and low false alarm rate; see Figure 13(a). The 14 operating points in this case lead to an average hit rate of 61% and an average false alarm rate of 16%.

Returning now to the results for Southern California, we have to note that the probability to observe $e_k \leq 4$ in the examined predictor time series is 0.8%, 0.4%, 1.2%, and 2.0% for the four stronger EQs, respectively, thus leading to a number of alarms smaller than 2% of the observed aftershocks (with $M \geq M_{\mathrm{thres}}$) in each case (cf. the corresponding total alarm duration in conventional time, which is depicted in Figure 19(a), varies from four hours and forty minutes for the Landers EQ to one hour for the El Mayor-Cucapah EQ). Moreover, when considering the five stronger aftershocks in each case (see Table 3), we observe that by setting $e_c = 4$ we can successfully predict (more than) 80% of these strong aftershocks (cf. only the cases labeled L4, E4, H3b, and N4 exhibit $e_k > 4$). As an example, we also plot in Figures 19(b) and 19(c) the total alarm duration in conventional time as a function of the time elapsed from the mainshock for the predictor thresholds $e_c = 4$ and 8, respectively. We observe that, during the first month from the mainshock, which is the period most prone to strong aftershocks, this quantity varies from 9–30 hours ($e_c = 4$) to 3-4 days ($e_c = 8$).

4. Conclusions

In conclusion, in the present study we presented a simple algorithm for predicting aftershock magnitudes based on the analysis of the coherent noise model and examined its performance in Southern California, where a waveform relocated catalogue [8] from 1981 to June 2011 [9] is available. This study has been extended, for comparison purposes, to the six strongest EQs in the Earth during almost the last forty years according to the GCMT catalogue [100, 101]. The main conclusions are the following:
(1) The predictor time series $e_k$ exhibits the ubiquitous $1/f$ noise.
(2) The algorithm leads to statistically significant results for 13 out of the 14 EQs studied. Only the case of the Sumatra-Nias EQ could be attributed to chance, but this EQ might have been anticipated (three and a half days before its occurrence) by the similar analysis for the Sumatra-Andaman EQ.
(3) When focusing on predicting strong aftershocks for all the 14 cases studied, an average behavior with a hit rate of 61% and a false alarm rate of 16% can be achieved, although the corresponding percentages may fluctuate for each particular case; see Figure 13(a).
(4) The strongest aftershock of each EQ in Southern California could have been successfully predicted with a number of alarms amounting to only 2% of the total number of aftershocks (with $M \geq M_{\mathrm{thres}}$) and a total alarm duration of less than five hours in each case.

The 3 cases in which a stronger EQ occurs within or close to the rectangular area studied, during the study period of another EQ, have revealed the following: the second Superstition Hills EQ would have been missed, whereas the Landers and Sumatra-Nias EQs might have been anticipated. The corresponding selection of operating points on the ROC diagram can be seen in Figure 13(b) and leads to an average behavior with a hit rate of 77% and a false alarm rate of 34%.

Appendix

A FORTRAN Code Implementing the Algorithm

The following simple FORTRAN code (Findek.f) calculates the values of $e_k$ based on the sequence of EQ magnitudes given in the file input.cat. The user has to provide the number (nmax) of EQs in the catalogue file. The output is then exported to the ASCII file out.dat, which contains in each row the aftershock magnitude $M_{k+1}$ and the value of $e_k$, that is, the value of the predictor just before its occurrence.

      program Findek
      double precision rmw(100000),b(1000),nb(1000)
      integer nmax,i,ii,m,kact,nkact,nll,ll
      write(*,*) 'How many EQs exist in input.cat?'
      read(*,*) nmax
      open(1,file='input.cat')
      do i=1,nmax
      read(1,*) rmw(i)
      end do
      close(1)
      open(1,file='out.dat')
c Initialize b vector and its dimension (kact), kact=n_k+1,
c where n_k is that of Eq. (1) of the main text
      kact=1
      b(1)=rmw(1)
c Study each new EQ in the magnitude timeseries
c and construct the new b vector (nb) and its
c new dimension (nkact)
      do i=2,nmax
      ll=0
      nll=0
10    ll=ll+1
      if (rmw(i).gt.b(ll)) nll=ll
      if (ll.lt.kact) goto 10
      nkact=0
      nkact=nkact+1
      nb(nkact)=rmw(i)
      do m=nll+1,kact
      nkact=nkact+1
      nb(nkact)=b(m)
      end do
c Export the current EQ magnitude and the previous e_k,
c that is the previous dimension (kact) of the b vector
c minus unity
      write(1,15) rmw(i),kact-1
c Replace the b vector by its new value (nb) and
c dimension (nkact)
      do ii=1,nkact
      b(ii)=nb(ii)
      end do
      kact=nkact
      end do
      close(1)
15    format(f6.2,i8)
      end

As an example input.cat we provide below the data concerning the Landers EQ that led to the results presented in Figure 5(c), consisting of (nmax=) 67 EQs:

7.30  34.202 116.435 19920628115733
5.77  34.126 116.402 19920628120044
5.70  34.123 116.253 19920628120115
5.00  34.048 116.456 19920628120216
3.50  34.523 116.612 19920628121748
3.00  34.175 116.782 19920628121850
4.00  34.581 116.633 19920628122044
2.20  34.181 116.442 19920628122641
4.20  34.477 116.623 19920628122736
5.49  34.166 116.420 19920628123639
5.41  34.355 116.524 19920628124052
4.44  34.120 116.420 19920628124358
4.29  34.483 116.527 19920628125609
4.00  34.319 116.499 19920628130640
4.30  33.910 116.327 19920628130647
3.50  34.326 116.474 19920628130815
4.87  34.438 116.456 19920628131049
3.80  34.321 116.369 19920628131336
4.11  34.155 116.423 19920628131747
4.50  34.068 116.434 19920628131815
3.22  34.128 116.402 19920628131924
3.36  34.125 116.417 19920628132309
3.68  34.222 116.436 19920628132344
4.88  34.175 116.411 19920628132604
3.00  34.001 116.543 19920628133537
3.00  34.430 116.492 19920628133559
4.17  34.190 116.424 19920628134054
3.00  34.524 116.582 19920628134649
4.22  34.100 116.377 19920628135016
4.95  34.143 116.403 19920628135044
4.00  34.157 116.469 19920628135142
3.50  34.141 116.432 19920628135244
4.14  34.118 116.640 19920628140928
3.01  34.361 116.471 19920628141026
2.50  34.615 116.629 19920628141106
4.03  34.609 116.632 19920628142901
2.80  34.161 116.468 19920628142945
3.00  34.430 116.456 19920628143027
2.70  34.155 116.418 19920628143045
3.54  34.636 116.514 19920628143207
3.40  34.596 116.652 19920628143230
3.32  34.279 116.440 19920628143533
4.35  34.099 116.429 19920628143906
3.00  34.610 116.631 19920628144049
3.00  34.164 116.435 19920628144309
5.53  34.169 116.853 19920628144321
3.37  34.198 116.861 19920628145913
2.80  34.089 116.436 19920628150000
4.54  34.167 116.821 19920628150450
6.30  34.202 116.828 19920628150530
3.00  34.199 116.798 19920628150803
2.50  34.203 116.801 19920628150841
2.50  34.221 116.847 19920628150956
3.00  34.198 116.776 19920628151025
2.80  34.138 116.856 19920628151102
3.00  34.167 116.839 19920628151137
2.50  34.252 116.723 19920628151216
2.50  34.288 116.716 19920628151337
3.00  34.157 116.870 19920628151532
2.80  34.144 116.843 19920628151659
4.00  34.164 116.851 19920628151709
4.66  34.134 116.851 19920628151712
4.58  34.214 116.749 19920628151832
4.00  34.051 116.399 19920628152016
4.76  34.219 116.762 19920628152428
4.27  34.223 116.803 19920628152519
3.50  34.217 116.751 19920628152653

Only the first column is used by Findek.f, whereas the other three columns correspond to the latitude (°N), longitude (°W), and time of occurrence UTC (with format YYYYMMDDhhmmss, where YYYY, MM, DD, hh, mm, and ss stand for the year, month, day, hour, minute, and seconds, resp.) of each EQ within the area studied (see the first row of Table 1 and Figure 1) according to the waveform relocated catalogue [8] for Southern California from 1981 to June 2011 [9]; they have been included here as test data.

Competing Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The authors would like to thank Professor Egill Hauksson for his suggestions on the use of the waveform relocated catalogue of Southern California (1981 to June 2011).