#### Abstract

Constructing a statistical model to characterize the physical conditions associated with large earthquake occurrence is crucial for disaster mitigation. With the aim to formulate such a model, we previously developed a statistical evaluation system which assesses the correlation of the spatiotemporal relationship between different kinds of physical quantities with the occurrence times of large inland earthquakes. In this study, we focused on assessing the relationship between geodetic and seismic quantities and attempted to find the pair of related quantities that most likely indicates preparatory processes of large earthquakes in Japan. We assessed the quantities prior to M ≥ 6.0 inland earthquakes for the period of 2001–2007 in terms of probability gains and error diagram. Our system revealed that the pair of absolute value of dilatation rate and seismic energy showed the highest statistical performance. Further validation of this result is required by updating the database of physical quantities.

#### 1. Introduction

It is an urgent issue for disaster mitigation to develop a statistical model well reflecting crustal activities, which are expected to include the preparatory processes of large inland earthquakes. Resolution of this issue requires a comprehensive understanding and monitoring of crustal activities through more than one kind of physical observation. Therefore, keeping this in mind, in this study, we focused on examining the spatiotemporal relationships between different physical quantities, at least one of which is time-variable. By determining informative combinations of physical quantities, that is, those having some correlation, we aim to develop a monitoring index/indices in crustal activities which feature the preparatory processes of large inland earthquakes for Japan. Imoto [1, 2] insisted that their seismicity model shows higher performance by taking into consideration the correlation between different kinds of geophysical parameters. This implies that there are some correlations between different kinds of observations that may represent different aspects of crustal phenomena or show causality between apparently different crustal phenomena.

As the first step toward formulating a model to capture statistical and physical conditions for the occurrence of large inland earthquakes, we created a database of physical observations with spatially and temporally gridded formats (Figure 1). Furthermore, we constructed a statistical evaluation system to assess the relationship between the occurrence times of target events and interquantity relationship in the physical observation data, at least one of which is time variable [4]. This system comprises some calculation modules corresponding to respective steps in Section 2 and their subordinate functions (not shown in Section 2) to output several parameters which are required for next steps or to display the final result of statistical evaluation at (Table 1, Figure 2).

**(a)**

**(b)**

Here, we actually operated the system to examine the relationship between two physical quantities and the occurrence times of mainshocks in Japan. We adopted the spatiotemporal relationship between geodetic quantities (dilatation rate and maximum shear strain rate) and seismic quantities (seismic energy and number of earthquakes) for representative physical observation. We look for the relationship for the earthquake from 2001 through 2007 for inland Japan. In evaluating this relationship, the system computes the statistical performance of a proposed monitoring index which would reflect an aspect of the preparatory process of large earthquake occurrence. We searched for the most informative combination between geodetic and seismic quantities based on evaluating the statistical performance of the proposed monitoring index for inland mainshocks. Evaluation of the statistical performance revealed that the time variation of the monitoring index created based on the relationship between the absolute value of dilatation rate and seismic energy was most closely associated with the occurrence times of inland mainshocks. Our result revealed the relationship between the absolute value of dilatation rate and seismic energy would best reflect the crustal activities involving preparatory processes of the occurrence of M ≥ 6.0 inland mainshocks. An important future issue is to further validate the highest performance of the proposed monitoring index by updating the database to increase the number of samples, or large inland earthquakes.

#### 2. Statistical Evaluation of the Relationship between Geodetic and Seismic Quantities

The statistical evaluation system developed by Kawamura et al. [4] statistically assesses the relationship between the spatiotemporal connection in any two kinds of physical quantities and the occurrence times of target physical events of interest We actually operated the system as follows to find a monitoring index, which is created based on a pair of items (a) to (d) in Figure 1, or a pair of geodetic and seismic quantities, such that the index can feature crustal activities which are expected to reflect the preparatory processes of large inland earthquakes.

##### 2.1. Step 1: Making a Database with Spatially and Temporally Gridded Formats

We created a database comprising items (a) to (d) in Figure 1. It shows spatially and temporally gridded maps of four physical quantities: (a) dilatation rate, (b) maximum shear strain rate, (c) seismic energy [3] within 30 km search radius, and (d) the number of earthquakes within 30 km search radius, respectively, over inland Japan. Items (a) and (b) were computed from GEONET (GPS Earth Observation NETwork) data (Figure 2(a)). Items (c) and (d) were calculated from unified hypocenter data operated by Japan Meteorological Agency (Figure 2(a)).

We here explain the method of making physical quantities (a) to (d) (Figure 2(a)). To obtain dilatation rate (a) and maximum shear strain rate (b) with grid formats, we first removed abnormal values from the GEONET coordinate data, where the coordinate data with a standard deviation >3 over half a month before and after each time was regarded as abnormal value. We next calculated the displacement rate at each observation point. We assumed that the displacement rate on an arbitrary coordinate grid can be represented by using a linear function of the displacement rate at its nearby observation point. Then, the displacement rate at observation point is related to the position of grid point as follows:
where and are east-west and north-south components of displacement rate at each observation point; and are east-west and north-south components of the position of observation point relative to the position of grid point. and are the error term of respective components. Then, dilatation rate and maximum shear strain rate *∑* can be obtained as follows:
Furthermore, the error term , was weighted depending on the distance between observation and grid points as follows:
where is each component of the estimation error of the displacement rate at observation point. *R* is the distance between observation and grid points. *D* is weight-adjusting parameter. In the following sections, we show the system operation result obtained by setting and .

To obtain seismic energy (c) and the number of earthquakes (d) for each grid point, the earthquakes within a radius of 60 km of grid point with depths shallower than 30 km and magnitudes ≥1.0 were collected (Figure 3(b)). Then, the total seismic energy and the total number of earthquakes were calculated, where the seismic energy and the number of each earthquake were weighted depending on the distance between observation and grid points as follows:
where is physical quantity related to the th earthquake, to which 1 is assigned in the case physical quantity is defined as the number of earthquakes, *n* is the total collected number of earthquakes for grid point, and R is the distance between observation and grid points. is weight-adjusting parameter. In this way, by taking the distance-depending weight into consideration, the essential event collection radius becomes 30 km. In the following sections, we show the system operation result obtained by setting = 10 km and 0 ≤ *R* ≤ 60 km.

**(a)**

**(b)**

**(c)**

**(d)**

##### 2.2. : Comparison between Geodetic and Seismic Quantities

After the selection of one of the four possible pairs (here, the absolute value of dilatation rate and seismic energy (per two years), maximum shear strain rate and seismic energy, absolute value of dilatation rate and the number of earthquakes (per two years), and maximum shear strain rate and the number of earthquakes), its relationship was examined using scatter diagrams for circular regions, each having 60-km radius centered in 0.05°-by-0.05° spatial grids, over inland Japan; each pair was compared for two-year time windows centered in a three-month-interval time grid.

##### 2.3. : Quantification of the Relationship Between Geodetic and Seismic Quantities

Quantification of the relationship between geodetic and seismic quantities requires us to define an index well characterizing the relationship. We here define the index as a statistical index *R*(*t*), where represents time. One of the simplest indices is correlation coefficient. Scatter diagrams for respective time windows, however, implied that the relationship between geodetic and seismic quantities showed relatively complicated patterns, which forced us to define another index as follows.(1)For a particular circular region centered in a spatial grid and for a two-year time window centered in a time grid, the mean of the absolute value of dilatation rate or the maximum shear strain rate was calculated from the grids ranking in the top 20% in terms of seismic energy or number of earthquakes. We defined this parameter as statistical index *R*(*t*), where represents the midpoint of each two-year time window.(2)Process (1) was carried out for each time window which is moved by three months. The linear trend of *R*(*t*) was then calculated for each time period from 1 January 2001 through 1 October 2007 with reference to the fixed time grid of 1 April 1998. The linear trend of *R*(*t*) thus obtained was classified into three categories on the basis of trend significance with a 95% confidence level: significant decrease, significant increase, and insignificant change.(3)Processes (1) and (2) are carried out for every spatial grid with an interval of 0.05°.

##### 2.4. : Statistical Evaluation of the Relationship between Geodetic-Seismic Quantity Connection and Large Event Occurrence Times

The relationship between the occurrence times of large inland mainshocks and different types of statistical index *R*(*t*) was evaluated by calculating the probability gains of alarm rate (AR) and success rate (SR) for M ≥ 6.0 inland mainshocks and drawing an error diagram based on a contingency table. To calculate AR and SR for each pair of physical quantities, we need to tabulate the relationship between the proposed potential condition, which is proposed for the possible occurrence of physical events of interest, and the actual occurrence of the event (Table 2). The items in the row and column of contingency table correspond to the potential condition for event occurrence and the actual occurrence, respectively. Defining the time period for positive potential condition in which the events will be predicted to occur as the period of *A* and the period of negative potential condition as the period of *B*, each element of (*a*, *b*, *c*, *d*) in Table 2 can be explained as follows:(*a*)the number of cases in which physical events of interest occur during the period of *A*,(*b*)the number of cases in which physical events did not occur during the period of *A*,(*c*)the number of cases in which physical events of interest occur during the period of *B*,(*d*)the number of cases in which physical events did not occur during the period of *B*.

By this definition, *a*+*b* and *a*+*c* can be interpreted as follows.(*a+b*):Total number of the periods of *A*.(*a+c*):Total number of periods in which physical events of interest occurred.

If an event occurs during the time period of *A*, one count is added to element . Using elements of (*a*, *b*, *c*, *d*), alarm rate (AR), and success rate (SR) can be calculated as follows [5, 6]:
The role of this step is to find a useful monitoring index/indices which feature the preparatory crustal activity of a large inland earthquake through a statistical evaluation process. Such a monitoring index is defined based on the information on grid counts showing a specific (ex. negative significant) trend of *R*(*t*) for a time window which are accompanied by high seismic activities. It should be noted that the performance of the monitoring index must be verified and validated in prospective way by updating the database.

We here focused on time variations of spatial grid counts which are accompanied by one-year release of total seismic energy larger than or equal to M4.0 earthquake (6.0 × 10^{10} Nm); the seismically active grid counts are specifically defined as no. PTN1 for those of significant decrease in *R*(*t*), no. PTN2 for those of significant increase in *R*(*t*), and no. PTN3 for those of insignificant change in *R*(*t*). In addition, the three kinds of grid counts were normalized by their respective total grid counts (pure-color plus dark-color grids in Figure 3) for each time period of calculation. Step tabulates thus calculated spatiotemporal distributions of no. PTNs1–3 and outputs the following figures Figure 4(a) spatial distributions of no. PTNs1–3 for each time period (Figure 3), Figure 4(b) temporal changes in no. PTN1, no. PTN2, and no. PTN3 (Figure 4), and Figure 4(c) temporal changes in no. PTN1/no. PTN2 (Figure 5); this figure represents superiority or inferiority in the normalized area of PTN1 and PTN2. Assuming the temporal changes in no. PTN1/no. PTN2 reflect the preparatory processes of large inland earthquakes, we attempted to formulate a positive potential condition for the occurrence of large inland earthquakes in terms of a way of temporal changes in no. PTN1/no. PTN2 as shown in the first row of Table 3.

**(a)**

**(b)**

**(c)**

**(d)**

**(a)**

**(b)**

**(c)**

**(d)**

Statistical evaluation process at requires the probability gains [7, 8] of alarm rate (AR) and success rate (SR). To calculate the probability gains of AR and SR, we prepared reference AR and SR for each pair of geodetic and seismic quantities in the following way.(1)We applied the bootstrap method to shuffle the spatial distributions of no. PTNs1–3 grids and their corresponding seismically inactive grids for a particular time window.(2)Process (1) was carried out by moving the time window with its beginning time grid fixed. By repeating this, we obtained the temporal variation of the shuffled spatial distributions.(3)We applied the bootstrap method to shuffle the temporal variation obtained in process (2).(4)Processes (1) to (3) were repeated 100 times.(5)The 100 ARs and SRs obtained were averaged.

Thus, calculated references AR and SR were considered as being due to a random phenomenon.

#### 3. Most Informative Pair of Geodetic and Seismic Quantities

Among the four pairs of geodetic and seismic quantities examined, the pair comprising the absolute value of dilatation rate and seismic energy was found to be the most closely related in terms of probability gains of both alarm rate (AR) and success rate (SR) for M ≥ 6.0 inland mainshocks (3.56 and 3.58, resp.) (Table 3). This pair showed the best performance even from the viewpoint of the error diagram (Figure 6). This is because the actual plot was located closest to position of the origin in Figure 6, which corresponds to the best performance of a proposed potential condition for event occurrence. The condition is represented in the first row of Table 3 or as the horizontal axis of Figure 6. This result was the same regardless of the approval methods and confidence levels. Figures 4(a) and 5(a) and the leftmost figures of respective elements in Table 3, which correspond to the pair comprising the absolute value of dilatation rate and seismic energy, indicate that M ≥ 6.0 inland mainshocks are most likely to occur during the time period of the decrease in no. PTN1 relative to no. PTN2 or the increase in no. PTN2 relative to no. PTN1.

**(a)**

**(b)**

**(c)**

**(d)**

On the assumption that this tendency holds for even a small region influenced by a change in Coulomb failure stress (CFS), a possible physical occurrence mechanism of a M ≥ 6.0 mainshock would be related to its preparatory process over a fault system. A mainshock would be triggered at the stage after microseismicity becomes active in the fault system, where stresses are considered to be accumulated on its asperities. Such a region may be accompanied by smaller strain rates. Because only a few samples of M ≥ 6.0 inland mainshocks were available to us, we need to continue to collect observations and further evaluate and validate the condition for event occurrence and the closest relation between geodetic and seismic quantities in the same manner as the one described in this study. We emphasize that this manner must not be changed in validating the condition for event occurrence. Furthermore, it is important to operate the statistical evaluation system to search for other informative pairs of physical quantities to gain a physical understanding of crustal activities as preparatory processes of large inland earthquakes.

#### 4. Summary

Based on the notion that there are some correlations between different kinds of observations that may represent different aspects of crustal phenomena, we began with creating a database of various physical quantities and further constructed a basic system for statistically evaluating and validating a monitoring index which would reflect crustal activities associated with the occurrence of physical events such as large inland earthquakes. We have indicated that the most informative pair of geodetic and seismic quantities comprises the absolute value of dilatation rate and seismic energy in terms of probability gains of alarm rate (AR) and success rate (SR) and error diagram for M ≥ 6.0 inland mainshocks. The probability gains were calculated on the basis of contingency tables which show the relationship between the temporal changes in linear trends of statistical index *R*(*t*) in seismically active areas and the occurrence times of M ≥ 6.0 inland mainshocks. Further evaluation and validation of the best pairs and the search for other informative combinations of physical quantities are necessary to gain a physical understanding of crustal activities, particularly, the preparatory processes of large inland earthquakes.

#### Acknowledgments

The authors deeply acknowledge valuable comments provided by two anonymous reviewers, which highly contributed to the improvement of the manuscript. They are also grateful to Geospatial Information Authority of Japan, and Japan Meteorological Agency for letting them utilize GEONET (GPS Earth Observation NETwork) data and unified hypocenter catalog, respectively.