#### Abstract

Radon is one of the precursory phenomena that exist in connection to the occurrence of earthquakes and may have potential in forecasting these hazards. The data set in this study contains the observations of radon from August 01, 2014, to January 31, 2015, collected in Sobra city, northern Pakistan. Weibull, gamma, log-normal, log-logistic, and Pareto probability distribution were fitted over the radon on its original scale and a log scale. Log-logistic best fits the radon on both scales. The Tukey control charts reveal several anomalies that were compared with earthquake occurrence. There are five earthquakes that occurred during the same period as the radon monitoring program, having magnitude ranges between 4.9 and 5.5, with the ratio between strain radius and distance to the epicenter greater than or equal to 1. The results of this study demonstrate that, for earthquakes, seismic events show a correlation with increasing concentrations of radon gas before their occurrence.

#### 1. Introduction

Earthquakes are natural disasters that strike so suddenly and cause damage on a larger scale. Forecasting these events has tremendous promise in terms of protecting human life and wealth. A variety of recent studies have found that certain preliminary signs, both geophysical and seismological, can be detected before the earthquakes [1–4].

Radon gas emitted from the Earth is being used as an earthquake precursor; many recent studies have shown that changes in radon concentration prior to major earthquakes may occur [5–8]. Radon is produced by decay product of uranium that is present in rock and soil minerals, and some amount of radon may accumulate in the cracks present in the Earth’s crust. It is found that there are certain shifts in the surface that may trigger the release of radon gas before the incidence of the earthquake. This increased emission of radon which is considered as a precursor for the earthquake. Vaupotic et al. [9] have studied the radon concentration in soil and rate of exhalation of radon at the Ravne Fault in NW Slovenia by calculating the concentration of radon in soil gas, rate of radon emission from the ground, permeability of soil, and gamma dose rate in the above-said region. The acquisition was done with the help of an Alpha Guard radon monitor and Gamma Tracer for radon in soil gas and gamma dose. The study concluded that the values of radon concentration and radon exhalation were lower as compared to other areas of Slovenia and there was not an increasing trend in the data because the investigation sites were a little bit far from the fault and were perpendicular in direction. Sac et al. [10] have monitored the radon concentration for earthquake precursory in Tuzla fault line in western Turkey. Active as well as passive techniques were used for measuring the concentration of radon in soil gas and thermal waters. Solid-state nuclear track detectors (LR-115) were used to measure the radon concentration in soil and thermal water fields.

Recent work for radon as an earthquake precursor [11] carried out the analysis of radon and its daughter products in relation to the great Japan earthquake that occurred in March 2011 (Richter magnitude 9.0). Two geochemical precursors were studied i.e., radon gas (Rn-222) and thoron gas (Rn-220) in the underground controlled environment. Based on annual data, a rise in thoron concentration was observed February 2011 only, whereas rise in radon concentration was detected twice in a year. Researchers found that, for earthquake forecasting purposes, both radon and thoron were reliable geochemical precursors, and the magnitude 9.0 Japan earthquake (March 2011) was successfully correlated with anomalous trends. Similarly, radon and its daughter product were studied for short term earthquake forecasting [5–8].

For continuous monitoring of radon gas, the study used an online radon measurement system named Alpha-Meter 611 [10], where a few positive anomalies were detected. Later, a prominent increase in radon concentration was detected in the period March–June 2007, which was correlated with seismic activities in the area. The variation in radon concentration reached its highest values 10 days before the Seferihisar earthquake of magnitude 3.0 occurred on June 4, 2007, while the epicenter was 35 km away from the radon detector. Another earthquake of magnitude 3.1 occurred on March 28, 2007, in the same region. After this event, abrupt decreases in radon values were observed in LR-115 data. Two other earthquakes of magnitude 3.0 and 3.5 occurred 35 km away from the Cumali radon monitoring station, and again the sharp decrease in radon concentration was observed after each event; moreover, the concentration maintained its low level afterwards. This research concluded that continuous monitoring of radon concentration in soil and thermal waters required understanding the relation of variations in radon gas emission with the occurrence of seismic events. Namvaran and Negarestani [12] measured soluble radon in the above-said study area and compared it with the earthquake forecasting process. They monitored the concentration of radon gas in groundwater over the period December 2011 to March 2012 with the help of RAD7. Continuous monitoring with a cycle of 10 minutes was selected and compared with the earthquake catalogues of the Iranian Seismological Centre (IRSC) and International Institute of Earthquake Engineering and Seismology (IIEES). Pulinets and Boyarchuk [1] worked on the emanation of radon and metallic aerosols before some strong earthquakes and compared their effects with changes occurring in the atmosphere and ionosphere. The study concludes the electric field values present in the atmosphere and ionosphere are not affected by radon emission but can be used for earthquake prediction studies. In Taiwan, along the two most famous geological faults Hsincheng and Hsinhua, radon gas variations in a soil were monitored using RTM 2100 where soil gas survey was performed followed by continuous monitoring of radon on few sensitive locations [13].

Das et al. [5] have worked on continuous monitoring of radon and its progeny at a remote station for seismic hazard surveillance in thermal springs at Bakreswar, West Bengal, India. German-made electronic radon monitor SARAD DOSEman was used to determine the concentration of radon and its progeny from Dec 2004 to Feb 2005, where radon values exceding 2 were considered as an anomaly. Anomalous values of radon and its daughter products correlated with seismic events that occurred around Indonesia, Nicobar, and the Andaman Islands. The use of a 2 limit for radon anomaly comes from statistical process control, which found extensive application for monitoring changes in environmental and geological properties e.g., temperature, humidity, air pressure, clouds movement, and rainfall [8, 14]. Instrumental errors may occur in the radon measurement system [15] which may misguide the radon anomalies. Traditional parametric 2 limits may become unreliable if the data are not independent and are deviated from the normal distribution, which is the case for radon data [5]. The Tukey control chart (TCC) has been used as a nonparametric alternative to the 2 control limit and is one of the potential candidate in determining radon anomalies [16–20]. TCC is also considered as potential candidates for nonnormally distributed data; for instance, it has been used to model gamma-based data [17, 21]. The TCC is tested in the current study for monitoring the radon from August 01, 2014, to January 31, 2015, at the measuring station located at Sobra city in northern Pakistan, which is the area affected by earthquakes.

#### 2. Material and Methods

##### 2.1. Data Description

The radon ( mega ) dataset in this study was obtained from the earthquake station located in Sobra city, Pakistan, using Sarad RTM 2200 analyser [8]. The time span covers the six months i.e., from 1st August 2014 to 31st January 2015. The continuous monitoring [22] method with one sample every 15 minutes was used. This dataset is held by the Centre for Earthquake Studies, National Centre for Physics (NCP), Islamabad https://www.ncp.edu.pk/ces.php.

##### 2.2. Instrumental Error

The data contains the sudden presence of an extremely high value that occurred for a very short time. For example, high radon emissions were recorded just one or two times and then values drop back to the average trend. According to theory and previous trends, extreme radon emission peaks cannot appear and disappear suddenly [5, 14, 23]. Hence, such extreme peaks are considered as instrumental error. The instrumental error was replaced with missing values for the purposes of data processing. Moreover, whilst the instrument RTM 2200 fulfills all the technical requirements, some missing values are observed. The methods we have chosen to monitor the radon data can work with missing observations so missing observations were not inputted to keep the best originality of the data.

##### 2.3. Monitoring Radon Concentration

For monitoring the radon concentration, appropriate limits can be determined by first fitting a probability distribution over the radon data. Let *X* represent the radon data whose probability distribution can be presented by where and radon distribution is positively skewed [5]. Hence, the potential candidates for fitting the radon data include Weibull, gamma, log-normal, log-logistic, and Pareto probability distribution. For instance, the Weibull probability distribution is defined aswhere is the parameter that defines the shape and defines the scale of the distribution.

The log normal probability distribution is defined aswhere *X* is log normally distributed (i.e., ). The probability density of log-logistic function is defined aswhere, , and .

The Pareto (Type I) distribution is defined aswhere is the (necessarily positive) minimum possible value of *X* and is a positive parameter.

The goodness of fit of probability distribution includes both graphical techniques and hypothesis testing. The graphical hypothesis testing includes the empirical and theoretical densities plot, *Q*-*Q* plot, empirical and theoretical CDFs plot, and *P*-*P* Plot, while the hypothesis testing includes the Kolmogorov–Smirnov test, Anderson–Darling test, and Cramer–von Mises test for goodness of fit. Assuming that the radon data are not normally distributed, the Tukey control chart is one of the potential candidates [18–20] to assess the radon data. A Tukey control chart is based on following steps.(i) Find the quartiles i.e., *Q*1, *Q*2 (median), and *Q*3 and Interquartile Range (IQR) from the best-fitted probability distribution(ii) Setup the control limits of the Tukey chart as

Since, radon concentration is expected to follow a skewed distribution with some really large measurements, radon concentration is assessed on the original scale and on the log scale as well.

##### 2.4. Earthquake Preparation Zone

The earthquake is supposed to accrue within the earthquake preparation zone (EPZ) called *D*. The schematic diagram for measuring the earthquake activity is presented in the upper panel of Figure 1, and the lower panel describes the schematic diagram for *D* and observed distance. The radius *R* is computed by following empirical distance formula [12]:

An event with *D*/*R* ratio that is with the ratio between strain radius and distance to the epicenter greater than or equal to 1 indicates the occurrence of earthquake in that particular radius.

##### 2.5. Computations

For computations and statistical radon assessments, *R* computing environment http://www.r-project.org/ software is used. For fitting the probability distribution and computing, the Tukey limits *R*-package ‘fitdistrplus’ [24] and ‘actuar’ [25] are used. For plotting results, *R*-package ‘ggplot2’ [26] and ‘gridExtra’ [27] are used.

#### 3. Results and Discussions

The studied dataset contains 4026 radon valid observations having minimum concentration 370 , maximum concentration 249702 , mean concentration 10922 , *Q*1 concentration 938 , and median i.e., *Q*2 concentration 2419 and *Q*3 concentration 5529 . The radon data contains 361 (8.23%) missing values of total observation. Among 4026 observations, 23 were marked as instrumental error, so the missing values increase to 8.76%. Since the proposed method did not require to input the missing observations, radon data having missing observations were used to construct the probability distribution-based Tukey control chart. The graphical comparison of the fitted and theoretical probability distribution over the original and log scale data is presented in Figure 2. Radon is a positively skewed distribution and is clearly nonnormal distribution. Further, we have used KolmogorovSmirnov test indicating that the radon data is not normally distributed ( value = 0.54). All four probability distributions being considered fit the radon data with some minor difference. With log transformation, radon tends to be normal but remains positively skewed. Log-logistic and log-normal both fit the log radon data. To determine the one best-fitted probability distribution, we have used several goodnesses of fit measures. The comparison of fitted probability distribution through Kolmogorov–Smirnov (KS) test, Cramer–von Mises (CM) test, and Anderson–Darling (AD) test together with Akaike’s information criterion (AIC) and Bayesian information criterion (BIC) is presented in Table 1. Log-logistic has the least AIC (68822.020) and BIC (68834.371) on the original scale and also has the least AIC (12109.650) and BIC (12122.001) on a log scale. A value of () is used to test the KS, AD, and CM statistics. Log-logistic best fits radon on the original and on a log scale; hence, it is used to construct the Tukey chart. The estimated log-logistic parameters (shape and scale) and Tukey chart parameters estimated from radon data are presented in Table 2.

Radon is present everywhere in the environment. It tends to move upwards the atmosphere from rocks and soil and using different pathways e.g., faults, rocks, caves, and carriers such as air and water. This means radon is a nonnegative real quantity, that is, its concentration is always positive and rarely zero, but never negative. Hence, on the original scale, Tukey chart, UCL, and CL are considered; however, on the log scale, the UCL, CL, and LCL are shown in Figure 3. Log-logistic based Tukey control chart results in a central limit of 2465.459 and upper control limit of 12704.080 on the original radon data, while it results in a central limit of 7.759 (2343), upper control limit of 11.080 (64861 ), and lower control limit of 4.524 (92 ) on the log scale. Moreover, the average run length (ARL) is computed [16]. Log-logistic best radon Tukey control chart has ARL0 = 50 and ARL1 = 161. On both scales, the data show the variations and indicate the anomalies if the measurement falls out of the limit. For example, multiple collections of increased or decreased values reported by the measurement device were collected during August 2014. Another significant trend occurred between 10–26 September 2014, and a couple of peaks occurred near 10 October 2014. It is obvious that, during September, the upper control limit is very successful in classifying the continuous increase in results. In comparison, further increases in a brief period can be readily detected from November to December 2014, even though the dataset has the lowest radon concentration at that time. The majority level of elevated radon levels are found after 15 January 2015 in Figure 3, and the maximum value is readily detectable on the Tukey chart. In short, based on the Tukey chart of the radon database, we detect three general anomalies. The first and last anomalies have persisted for two to three weeks between September 2014 and January 2015. Based on this analysis, it is concluded that the Tukey chart performs well for data with outliers or extreme values by detecting both gradual increases in radon concentration and sharp increase in concentration since both original and log-transformed results identify significant radon anomalies based on the upper limit only.

It is obvious from Figure 3 that the log-transformed radon data allows to better define a general radon background. This background varies along time, with lower values in the summer (in August, it goes up to ca 7.7, that is, 2208 ), increasing through time, and reaching a maximum background value in winter (in January, it goes up to ca 10.25, that is, 28283 ), thus ca 10 times higher than in August. On the contrary, UCL from original data ‘cuts’ through the winter radon background, and, thus, is not the best option for a recommendation. Hence, the manuscript finally recommends the use of the log-transformed radon for monitoring radon.

The time distribution of earthquakes in our research region is shown in Figure 4. Many earthquakes of varying magnitude were recorded in August–September 2014 and January 2015, the time scale of our radon study. The earthquakes with a ratio of *D*/*R* 1 (0.9) or above are shown in Table 3. Five earthquakes meet the prediscussed ratio in the earthquake database. The minimum and maximum magnitude of these earthquakes are 4.9 and 5.5. The first earthquake happened on 7 September, with a magnitude of 5.5 and the maximum ratio (*D*/*R*) of 2.6. The second earthquake happened on 18 September 2014 in the mountainous area of the Hindu Kush with a magnitude of 5.4 on the Richter scale. On 20 October 2014, an earthquake of magnitude 5.5 occurred in the Hindu Kush region, also falling within the EPZ with a ratio of *D*/*R* 1.1. The fourth earthquake occurred on 14 November 2014 (*M* = 5.5) in the northern part of Pakistan (Hindu Kush). The last occurrence happened on 15 January 2015, with a magnitude of 4.9 and a ratio (*D*/*R*) close to 1 (0.9) indicating significant earthquake. All of these earthquakes are associated with rise in radon concentration.

#### 4. Conclusion

A dataset of radon gas measurement was assessed for possible use as the precursory indicator of earthquakes that occurred in the vicinity of Sobra city, northern Pakistan from Aug 01, 2014, to Jan 31, 2015. The log-logistic probability distribution fitted the radon data on the original scale and log scale; hence, the investigation was carried out on both the original and log scale of radon. A Tukey control chart was applied to the database with the goal of statistical process control monitoring. From the earthquakes catalog, five earthquakes fulfill the condition of a *D*/*R* ratio criteria selected to include in the analysis to determine the temporal relationships between radon concentrations and earthquake occurrence. Tukey control charts of the radon highlighted different anomalies from the whole radon dataset. Among several anomalies, five are associated with earthquakes, indicating that they may be used as earthquake precursors. For earthquakes, seismic events show a correlation with increasing concentration of radon gas before their occurrence. Moreover, the use of the log-transformed radon for monitoring radon is recommended.

#### Data Availability

The data used to support the findings of the study are available from the corresponding author upon request.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.