Spatiotemporal Pattern Networks of Heavy Rain among Automatic Weather Stations and Very-Short-Term Heavy-Rain Prediction
Spatiotemporal pattern networks of heavy rain among automatic weather stations, which reflect the mobility of heavy rain, were constructed and analyzed based on the hourly precipitation data over the last ten years (from 2003 to 2012) in South Korea. Moreover, a new algorithm applying the constructed heavy-rain pattern networks to very-short-term heavy-rain prediction was developed, and significant prediction results could be obtained.
1. Introduction
South Korea is located in the temperate zone, between 33°06′N and 38°27′N and 125°04′E and 131°52′E, on the east coast of the Eurasian Continent, adjacent to the Western Pacific. In South Korea, there are four distinct seasons, and spring and autumn are short relative to summer and winter. It has complex climate characteristics, which show both oceanic and continental features, a wide interseasonal temperature difference, and much more precipitation than continental Asia. In addition, it has a pronounced seasonal monsoon wind, a rainy period from the East Asian monsoon, locally called Changma, typhoons, and frequent heavy snow in winter. South Korea also belongs to a wet region, receiving more precipitation than the global average.
The annual mean precipitation in South Korea is about 1,500 mm, and 1,300 mm in the central region. The southeastern region has the largest amount of precipitation, about 2,000 mm, and the northwestern region has the lowest amount of precipitation, about 800 mm. When a stationary front lingers across South Korea for about a month in summer, more than half of the annual precipitation falls during the Changma season. Precipitation during winter is less than 10 percent of the total. Changma is a part of the summer Asian monsoon; it brings frequent heavy rain and flash floods for about a month, causing severe damage.
Heavy rain is one of the major severe weather phenomena in South Korea. It can lead to serious damage, so predicting it is of real importance. However, prediction is considered a notoriously hard task because heavy rain takes place over a very short time interval. It is necessary to predict torrential downpours to prevent loss of life and damage to infrastructure [1, 3]. Heavy-rain prediction is of paramount importance in avoiding or minimizing natural disasters before they occur.
The heavy-rain condition has a spatiotemporal pattern of moving from one point to another according to cloud movement. The previous spatiotemporal patterns of heavy rain may provide information on where the heavy rain will go from its current location. Based on this idea, the heavy-rain pattern network of points at which automatic weather stations (AWSs) are installed is constructed and analyzed in this paper. We use real hourly precipitation data collected from 692 AWSs in South Korea, from 2003 to 2012. Furthermore, a new method of using the constructed heavy-rain pattern networks for prediction is described. We study the prediction of whether or not heavy rain will occur within one to three hours at each AWS in South Korea.
The rest of this paper consists of the following parts. In Section 2, results from previous work regarding heavy-rain prediction are presented. In Section 3, weather information on heavy rain, and its mobility, is described. In Section 4, our method for constructing heavy-rain pattern networks among AWSs and the construction results are presented. In Section 5, a new heavy-rain prediction method using the heavy-rain pattern networks among AWSs is proposed, and the prediction results are discussed. Finally, in Section 6, we draw conclusions.
2. Previous Work on Heavy-Rain or Rain Prediction
Many studies have addressed the accurate prediction of heavy rain using various machine learning techniques. In particular, several studies focused on weather forecasting using an artificial neural network (ANN) [6–12]. In the studies of Ingsrisawang et al., Hong, and Seo et al. [14, 15], a support vector machine (SVM) was applied to develop classification and prediction models for rainfall prediction. Machine learning techniques for forecasting could be combined with metaheuristics for parameter selection.
Kishtawal et al. predicted summer precipitation using a genetic algorithm (GA). In their study, the GA found the equations that best describe the temporal variations of seasonal precipitation over India. The major merit of using a GA over other nonlinear prediction techniques, such as ANN, is that an explicit analytical expression for the dynamic evolution of the precipitation time series is obtained. However, they used a simple GA with typical parameters. Had they conducted experiments tuning various parameters of their GA, they might have achieved better results. Liu et al. presented a filter method for feature selection. They proposed an Improved Naïve Bayes Classifier (INBC) technique and used GAs to select a subset of input features in rain/no-rain classification problems. Their method produced high accuracy on precipitation prediction, but it did not focus on heavy-rain events.
Nandargi and Mulye analyzed the relationship between the rain and rainy days, mean daily intensity, and seasonal precipitation in India, on monthly and seasonal scales. They compared a linear relationship with a logarithmic relationship, in the case of seasonal precipitation versus mean daily intensity. El Afandi et al. examined heavy-rain events that occurred over the Sinai Peninsula, causing flash floods, using the WRF (Weather Research and Forecasting) model. The test results showed that the WRF model could capture the heavy-rain events over different regions of Sinai and predict precipitation with significant consistency using real measurements.
Routray et al. compared simulations of a heavy-rain event carried out using a nudging technique and a 3DVAR (three-dimensional variational) data assimilation system. After observations were assimilated using the 3DVAR technique, the model could better simulate the structure of the convective organization, as well as the prominent synoptic features related to the midtropospheric cyclones, compared to the nudging experiment. Hou et al. studied the impact of 3DVAR on the prediction of two heavy-rain events: one that affected several provinces in southern China with heavy rain and severe flooding and the other that was characterized by nonuniformity and extremely high precipitation rates in localized areas. They suggested that the assimilation of all radar, surface, and radiosonde data had a more positive impact on the forecasts than that of any single type of data.
Lee et al. studied feature selection using a GA for heavy-rain prediction in South Korea. They used a wrapper-based method with a simple GA and an SVM as the fitness function. They did not provide any explanation about errors and inaccuracies in their weather data. Seo et al. [14, 15] proposed a very-short-term heavy-rain prediction method within one to six hours using hourly AWS data from the past four years (from 2007 to 2010) in South Korea. Significant weather elements were selected through a GA and differential evolution, and an SVM was applied to the selected weather elements for prediction. Their study has the drawback that the results were unstable and that it was performed on only a subset of the AWSs in South Korea, due to missing observations.
To date, few studies have made heavy-rain predictions based on the results of analyzing previous spatiotemporal patterns of heavy rain. As a work similar to ours, Shukla and Pal introduced an automatic neighborhood search for tracking and nowcasting convective weather using satellite image sequences.
3. Weather Information Related to Spatiotemporal Pattern of Heavy Rain between AWSs
In this section, information on frequencies of heavy rain in South Korea is provided, as well as meteorological information for construction of spatiotemporal pattern networks of heavy rain among AWSs.
Hourly frequencies of heavy rain at each AWS from 2007 to 2012 were examined. The number of cases that satisfied the standard of heavy rain established by the Korea Meteorological Administration (more than 70 mm of precipitation over 6 hours, or more than 110 mm of precipitation over 12 hours) was counted. Subsequently, the overall frequency was divided by 100 for normalization, and the normalized value was marked on the map. The value was marked on a scale of zero to one, and values greater than one were marked as one. These results are presented in Figure 1. It was found that heavy rain mainly occurs in summer, although it sometimes occurs in autumn, and that there is a regional difference. The results also indicate that heavy rain occurs intensively in the Seoul/Gyeonggi (northwest) and Namhae (south) regions. Based on these results, it is expected that the heavy-rain pattern among AWSs would also be classified according to region.
[Figure 1, panel (a): Whole seasons]
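The counting and normalization steps described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the use of trailing 6-hour and 12-hour windows ending at each hour is an assumption, since the text does not state how the accumulation windows are aligned.

```python
from typing import List


def heavy_rain_flags(hourly_mm: List[float]) -> List[bool]:
    """Flag each hour whose trailing 6 h total exceeds 70 mm or whose
    trailing 12 h total exceeds 110 mm (window alignment is an assumption)."""
    flags = []
    for t in range(len(hourly_mm)):
        last_6h = sum(hourly_mm[max(0, t - 5): t + 1])
        last_12h = sum(hourly_mm[max(0, t - 11): t + 1])
        flags.append(last_6h > 70.0 or last_12h > 110.0)
    return flags


def normalized_frequency(count: int, divisor: float = 100.0) -> float:
    """Divide the heavy-rain count by 100 and cap the value at one."""
    return min(count / divisor, 1.0)


# Hypothetical station record: six consecutive hours of 15 mm each.
flags = heavy_rain_flags([15.0] * 6)
```

With this input, the trailing 6-hour total first exceeds 70 mm at the fifth hour (75 mm), so only the last two hours are flagged.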
Although AWSs are located nationwide, the distance between AWSs varies because their density differs from region to region. To appropriately set the connections between AWSs, it is necessary to identify the distribution of different AWSs within a given radius. Moreover, wind speed can significantly contribute to identifying whether or not a cloud can move to another AWS in a short time. Based on these two types of statistical data, the heavy-rain pattern network among AWSs is constructed in the next section.
To construct an AWS network, only the AWSs located within a suitable radius should be set as the neighboring AWSs of a given AWS. We set the radius to 30 km, which yielded an average of 20 neighboring AWSs per AWS. This restricts the possible candidates for movement. The number of candidate neighboring AWSs may affect the speed of network construction: construction is fast if the number is small and slow if the number is large. In addition, if the number of candidate neighboring AWSs is too small or too large, it would be difficult to detect proper spatiotemporal patterns of heavy rain.
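A minimal sketch of the neighbor selection described above is given below. The station identifiers and coordinates are hypothetical, and the use of great-circle (haversine) distance is an assumption; the text only specifies a radius of 30 km.

```python
import math
from typing import Dict, List, Tuple

EARTH_RADIUS_KM = 6371.0


def haversine_km(a: Tuple[float, float], b: Tuple[float, float]) -> float:
    """Great-circle distance in km between two (latitude, longitude) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def neighbors_within(stations: Dict[str, Tuple[float, float]],
                     radius_km: float = 30.0) -> Dict[str, List[str]]:
    """Map each station to the other stations located within radius_km of it."""
    result: Dict[str, List[str]] = {sid: [] for sid in stations}
    ids = sorted(stations)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if haversine_km(stations[a], stations[b]) <= radius_km:
                result[a].append(b)
                result[b].append(a)
    return result


# Hypothetical stations: A and B are roughly 14 km apart, C is far from both.
stations = {"A": (37.50, 127.00), "B": (37.60, 127.10), "C": (35.10, 129.00)}
neighbors = neighbors_within(stations)
```

The pairwise loop is quadratic in the number of stations, which is acceptable for a few hundred AWSs; a spatial index would only matter for much larger station sets.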
Wind speed statistics can significantly affect the determination of spatiotemporal patterns of heavy rain, because they indicate whether or not heavy rain can practically move from one AWS to another. We averaged hourly wind speeds observed at 692 AWSs from 2003 to 2012. When heavy rain occurred, the average wind speed was 10.6 km/h, which was higher than the wind speed of 8.0 km/h when there was no heavy rain. Considering these statistics and the radius of 30 km for setting the neighboring AWSs, which is mentioned above, it is expected that the heavy-rain condition may move to a neighboring AWS within an average of three hours.
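The three-hour figure follows from simple arithmetic on the values above, as the short sketch below shows. It assumes, as a simplification, that the heavy-rain condition travels at the mean wind speed; the variable names are illustrative only.

```python
import math

radius_km = 30.0        # neighbor radius from the text
wind_heavy_kmh = 10.6   # mean wind speed when heavy rain occurred
wind_other_kmh = 8.0    # mean wind speed otherwise

# Time needed to cross the neighbor radius at each mean wind speed.
travel_heavy_h = radius_km / wind_heavy_kmh   # about 2.83 hours
travel_other_h = radius_km / wind_other_kmh   # 3.75 hours

# Rounded up to whole hours, matching the hourly resolution of the data.
moving_time_h = math.ceil(travel_heavy_h)
```

At the heavy-rain mean wind speed, crossing 30 km takes a little under three hours, which rounds up to the three-hour window used in the rest of the paper.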
4. Heavy-Rain Pattern Networks among AWSs
4.1. Construction of Heavy-Rain Pattern Networks
A heavy-rain pattern network is constructed in which each AWS is a node and each directed edge represents the movement of heavy rain from one AWS to another. Figure 2 displays AWSs across South Korea, and some of these AWSs form the node set of the network to be constructed. The AWS network is stored in the Graph Exchange XML Format (GEXF), and the Open Graph Viz Platform (Gephi) is used as a visualization tool.
Once the start and end times of a nationwide heavy-rain event are determined, a network is constructed. A node refers to an AWS satisfying the heavy-rain standard and includes information on the start and end times during which the standard is satisfied. The shade of the node reflects the start time at which the standard was satisfied, becoming nearly black as the time approaches midnight. The size of the node represents the duration of the heavy-rain condition. A directed edge represents movement information and therefore has a direction; it was added when it was determined that heavy rain moved within “moving time” hours within the movable distance. As described in the previous section, the value of “moving time” was set to three.
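The edge rule described above could be sketched as follows. The exact criterion used by the authors is not spelled out in the text, so this example assumes that an edge from one AWS to a neighboring AWS is added when the neighbor's heavy-rain start time is later than the source's start time by at most the moving time. The function name and the dictionary layout are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def build_pattern_network(start_hour: Dict[str, int],
                          neighbors: Dict[str, List[str]],
                          moving_time: int = 3) -> Set[Tuple[str, str]]:
    """Return directed edges (src, dst) between neighboring stations.

    start_hour maps each station that satisfied the heavy-rain standard to
    the hour (counted from the event start) at which it first did so.
    Assumed rule: add src -> dst if dst started 1..moving_time hours after src.
    """
    edges: Set[Tuple[str, str]] = set()
    for src, t_src in start_hour.items():
        for dst in neighbors.get(src, []):
            if dst not in start_hour:
                continue  # dst never satisfied the standard in this event
            lag = start_hour[dst] - t_src
            if 0 < lag <= moving_time:
                edges.add((src, dst))
    return edges


# Hypothetical event: heavy rain starts at A (hour 10), B (hour 12), C (hour 15).
start_hour = {"A": 10, "B": 12, "C": 15}
neighbors = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}
edges = build_pattern_network(start_hour, neighbors)
```

Here A to B (lag of two hours) and B to C (lag of three hours) become edges, while A to C is skipped because the five-hour lag exceeds the moving time.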
4.2. Statistics of Heavy-Rain Events and Heavy-Rain Pattern Networks
The heavy-rain pattern networks were constructed using the hourly precipitation data of AWSs from 2003 to 2012. Each heavy-rain event corresponds to one heavy-rain pattern network, and a total of 493 events occurred from 2003 to 2012. Considering only events whose corresponding networks have more than 10 nodes and more than 10 directed edges, the heavy-rain events lasted a maximum of 99 hours, a minimum of 7 hours, and an average of 29 hours (standard deviation of 16 hours).
Among the 493 heavy-rain events from 2003 to 2012, the number of constructed networks larger than a certain size (when both numbers of nodes and directed edges were more than 10) was 140. The maximum number of directed edges of these 140 networks was 2,098, and the average number of directed edges was 239 (standard deviation of 438).
4.3. Examples of Constructed Heavy-Rain Pattern Network among AWSs
The ten example networks shown in the figures are listed below by ID, with duration, date (DD/MM/YYYY), and number of directed edges:
ID 37: 38 hours on 02/08/2003, 417 edges
ID 127: 27 hours on 27/06/2005, 1,034 edges
ID 190: 99 hours on 14/07/2006, 1,852 edges
ID 191: 77 hours on 17/07/2006, 1,943 edges
ID 321: 21 hours on 13/07/2009, 1,711 edges
ID 323: 68 hours on 17/07/2009, 1,159 edges
ID 390: 52 hours on 30/08/2010, 1,356 edges
ID 400: 25 hours on 22/09/2010, 1,132 edges
ID 434: 67 hours on 29/07/2011, 2,098 edges
ID 466: 22 hours on 06/07/2012, 1,558 edges
4.4. Similarity between Heavy-Rain Pattern Networks, and Their Similarity Digraph
Equations (1) and (2) give the node-based similarity and the directed edge-based similarity between two networks, respectively. The equations measure the degree of overlap of the nodes or directed edges, and the results vary according to which network is taken as the base network. That is, the equations are asymmetric measures.
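An asymmetric overlap measure of this kind could be sketched as below. Since Equations (1) and (2) are not reproduced in this text, the normalization used here (the size of the intersection divided by the size of the base network's set) is an assumption chosen to match the description, not a transcription of the authors' formulas. The function name is hypothetical.

```python
from typing import Hashable, Set


def overlap_similarity(base: Set[Hashable], other: Set[Hashable]) -> float:
    """Fraction of the base network's items that also appear in the other network.

    Pass node sets for a node-based similarity or sets of (src, dst) pairs
    for a directed edge-based similarity. The result depends on which
    network is the base, so the measure is asymmetric.
    """
    if not base:
        return 0.0
    return len(base & other) / len(base)


# Hypothetical edge sets of two networks.
edges_a = {("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")}
edges_b = {("A", "B"), ("B", "C")}

sim_a_to_b = overlap_similarity(edges_a, edges_b)  # base: network a
sim_b_to_a = overlap_similarity(edges_b, edges_a)  # base: network b
```

In this example half of network a's edges appear in network b, while all of network b's edges appear in network a, which illustrates why the choice of base network matters.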
Tables 1 and 2 show the similarity among the aforementioned ten example networks (of large size). Over all 140 networks, among pairs of networks whose edge-based similarity was larger than 0.1, the average node-based similarity was 0.52 (standard deviation of 0.19), and the average edge-based similarity was 0.21 (standard deviation of 0.10). The node-based similarity is thus larger than the edge-based similarity. Because the edge-based similarity better reflects the similarity of the actual networks, we infer that the similarity among large-scale networks is not very high.
Figures 8 and 9 show a similarity digraph among the constructed networks in which the directed edge is added when the edge-based similarity between two networks is over the threshold, while setting each heavy-rain pattern network as one node and the similarity threshold as 0.1. This digraph consists of 140 nodes and 1,896 directed edges. The node size is proportional to of the corresponding heavy-rain pattern network. Spring, summer, autumn, and winter are marked in yellow, red, green, and blue, respectively. The average degree of nodes was 13.5, the modularity was 0.39, and the number of clusters was 3. Figure 8 is visualized according to the Fruchterman-Reingold method, and Figure 9 is visualized according to Yifan Hu’s multilevel method. The Gephi tool was used for visualization.
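The construction of such a similarity digraph could be sketched as follows. The edge-based similarity is assumed here to be the fraction of the base network's edges shared with the other network, and the arc direction (from base network to compared network) is also an assumption; neither detail is fixed by the text. The function name and toy data are hypothetical.

```python
from typing import Dict, Set, Tuple

Edge = Tuple[str, str]


def similarity_digraph(edge_sets: Dict[int, Set[Edge]],
                       threshold: float = 0.1) -> Set[Tuple[int, int]]:
    """Add an arc (a, b) when the edge-based similarity with base a exceeds threshold."""
    arcs: Set[Tuple[int, int]] = set()
    for a, edges_a in edge_sets.items():
        for b, edges_b in edge_sets.items():
            if a == b or not edges_a:
                continue
            if len(edges_a & edges_b) / len(edges_a) > threshold:
                arcs.add((a, b))
    return arcs


# Hypothetical networks keyed by event ID.
edge_sets = {
    1: {("A", "B"), ("B", "C")},
    2: {("A", "B")},
    3: {("X", "Y")},
}
arcs = similarity_digraph(edge_sets)

# Arcs per node for the digraph reported in the text.
average_degree = 1896 / 140
```

The last line gives about 13.5, which agrees with the average degree reported for the digraph of 140 nodes and 1,896 directed edges.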
5. Heavy-Rain Prediction Based on Heavy-Rain Pattern Networks
Figure 10 depicts an example, at 15:00 on July 3, 2011, of the situation to be predicted after three hours based on the network. The red AWS appears to be predictable, whereas the blue AWS appears to be unpredictable. Although the prediction method based on AWS networks has this fundamental limitation, it can be regarded as a new method in that it considers only the previous spatiotemporal pattern of heavy rain.
Our prediction method uses weighted voting among similar networks. Figure 11 and Algorithm 1 show the flow chart and pseudocode of the suggested prediction algorithm, respectively. First, for a given test date in a heavy-rain event, the prediction algorithm constructs a test network from the start date of the heavy-rain event to the test date and collects the stations most recently added to the test network. Next, it computes the edge-based similarity between the constructed test network and each train network that has been constructed in advance, and it collects the stations adjacent to the recently added stations in each train network. Then, it obtains station scores by summing up the similarities over the train networks and the adjacent stations found in them. Finally, it outputs the possible stations whose scores are larger than a given threshold. This process is repeated for all test dates.
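The voting steps above could be sketched as follows. This is an illustrative reading of the description rather than Algorithm 1 itself: the similarity is assumed to be the fraction of the test network's edges shared with a train network, each adjacent station is assumed to receive a train network's similarity once per network, and all names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

Edge = Tuple[str, str]


def edge_similarity(base: Set[Edge], other: Set[Edge]) -> float:
    """Assumed edge-based similarity: shared edges divided by the base network's edges."""
    return len(base & other) / len(base) if base else 0.0


def predict_stations(test_edges: Set[Edge],
                     recent: Iterable[str],
                     train_networks: List[Set[Edge]],
                     threshold: float = 0.0) -> Dict[str, float]:
    """Score stations by similarity-weighted votes from the train networks."""
    recent_set = set(recent)
    scores: Dict[str, float] = defaultdict(float)
    for train_edges in train_networks:
        sim = edge_similarity(test_edges, train_edges)
        # Stations that follow a recently added station in this train network.
        adjacent = {dst for src, dst in train_edges if src in recent_set}
        for station in adjacent:
            scores[station] += sim
    # Keep only stations whose score is larger than the threshold.
    return {s: v for s, v in scores.items() if v > threshold}


# Hypothetical test network in which station B was added most recently.
test_edges = {("A", "B")}
train_networks = [
    {("A", "B"), ("B", "C")},   # shares the test edge, so its vote counts
    {("B", "D"), ("X", "Y")},   # shares no edge, so its vote has weight zero
]
predicted = predict_stations(test_edges, ["B"], train_networks)
```

In this toy case station C is predicted because it follows B in a train network that resembles the test network, while station D is dropped because its only supporting network has zero similarity.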
Each heavy-rain event after one, two, and three hours, respectively, from 2003 to 2012 was predicted, and the results of the prediction at 692 AWSs were marked. The results were obtained using leave-one-out cross validation (LOOCV). In LOOCV, one withholds a time, builds a predictor based on the remaining times, and predicts heavy rain of the withheld time. This process is repeated for each time, and the cumulative error is calculated. Two types of experiments, which predict the start time and duration of each heavy-rain event for continuous time intervals, were conducted. In obtaining the results, the times that have already satisfied the heavy-rain standard were excluded.
The constructed heavy-rain pattern networks reflect only the movement of heavy-rain conditions within three hours. Thus, they can be used for prediction within the scope of three hours, and the heavy-rain conditions after one, two, and three hours were predicted. Table 3 and Figure 12 show the results of predicting the start time of each heavy-rain event using our network-based prediction algorithm. The possibility threshold was set to 0.0, which gave the best performance in our preliminary experiment. Although the equitable threat score (ETS) value decreased as the prediction time was extended, the decrease was insignificant.
[Figure 12 panels: (a) After 1 hour, (b) After 2 hours, (c) After 3 hours]
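For readers unfamiliar with the equitable threat score, the sketch below computes it from a 2×2 contingency table using the standard definition (also known as the Gilbert skill score). The counts in the example are made up for illustration and are not results from this paper.

```python
def equitable_threat_score(hits: int, misses: int,
                           false_alarms: int, correct_negatives: int) -> float:
    """ETS = (hits - hits_random) / (hits + misses + false_alarms - hits_random),
    where hits_random is the number of hits expected by chance."""
    total = hits + misses + false_alarms + correct_negatives
    if total == 0:
        return 0.0
    hits_random = (hits + misses) * (hits + false_alarms) / total
    denominator = hits + misses + false_alarms - hits_random
    if denominator == 0:
        return 0.0
    return (hits - hits_random) / denominator


# Hypothetical counts over 1,000 station-hours.
ets = equitable_threat_score(hits=20, misses=10, false_alarms=30, correct_negatives=940)
```

Because the chance-expected hits are subtracted, ETS does not reward a forecast for hits it would obtain at random. Rare events such as heavy rain have few such chance hits, so the correction is small for them.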
Table 4 and Figure 13 show the results of predicting the duration of each heavy-rain event using our network-based prediction algorithm. The possibility threshold was set to 7.5, which gave the best performance in our preliminary experiment. As with the start time prediction of the heavy-rain event, the ETS value decreased as the prediction time was extended, but the decrease was insignificant.
[Figure 13 panels: (a) After 1 hour, (b) After 2 hours, (c) After 3 hours]
Considering that the prediction of heavy rain is a notoriously hard problem, these results show the potential of network-based algorithms, in that the prediction method considers only the movement of the heavy-rain condition, without considering the accumulated precipitation at each AWS or any other weather factor.
6. Concluding Remarks
To the authors’ best knowledge, this study is the first attempt to analyze spatiotemporal patterns of heavy rain. In this paper, the previous spatiotemporal pattern of heavy rain over the last ten years in South Korea was examined through the construction of heavy-rain pattern networks. Similar spatiotemporal patterns of heavy rain among the large-scale heavy-rain events were not noticeable. As an example of using the constructed heavy-rain pattern networks, very-short-term heavy-rain prediction at each AWS within three hours was conducted. Although it is difficult to predict very-short-term heavy rain, satisfactory results could be obtained. This paper thus reveals the potential of network-based algorithms, in that the prediction method considers only the movement of the heavy-rain condition, without considering the accumulated precipitation at each AWS or any other weather factor.
It is necessary to compare the proposed approach with an approach that considers cloud movement predicted from satellite image sequences, and the proposed approach may be improved by incorporating cloud movement information from satellite image sequences. It is also expected that combining the heavy-rain prediction method proposed in this paper, which considers spatiotemporal patterns of heavy rain, with other techniques such as machine learning, as suggested in Section 2, would result in better prediction performance. Since there are abnormalities in observed meteorological data, the use of corrected meteorological data is expected to detect spatiotemporal patterns of heavy rain more correctly and hence enhance the prediction quality. These issues will be investigated in future studies.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
The authors would like to thank Mr. Yong Hee Lee and Mr. Seung-Hyun Moon for their valuable suggestions in improving this paper. The present research has been conducted by the Research Grant of Kwangwoon University in 2014. This research was supported by the Gachon University research fund of 2015 (GCU-2015-0030). This work was also supported by the Advanced Research on Meteorological Sciences, through the National Institute of Meteorological Research of Korea, in 2013 (NIMR-2012-B-1).
References
J. Bushey, The Changma, http://www.theweatherprediction.com/weatherpapers/007/.
J.-H. Seo and Y.-H. Kim, “A survey on rainfall forecast algorithms based on machine learning technique,” Proceedings of the KIIS Fall Conference, vol. 21, no. 2, pp. 218–221, 2011 (Korean).
Korea Meteorological Administration, http://www.kma.go.kr/.
F. X. Diebold and R. S. Mariano, “Comparing predictive accuracy,” Journal of Business & Economic Statistics, vol. 13, no. 3, pp. 134–144, 1995.
L. Ingsrisawang, S. Ingsriswang, S. Somchit, P. Aungsuratana, and W. Khantiyanan, “Machine learning techniques for short-term rain forecasting system in the northeastern part of Thailand,” Proceedings of the World Academy of Science, Engineering and Technology, vol. 31, pp. 248–253, 2008.
J.-H. Seo and Y.-H. Kim, “Genetic feature selection for very short-term heavy rainfall prediction,” in Convergence and Hybrid Information Technology: 6th International Conference, ICHIT 2012, Daejeon, Korea, August 23–25, 2012. Proceedings, vol. 7425 of Lecture Notes in Computer Science, pp. 312–322, Springer, Berlin, Germany, 2012.
W.-M. Hung and W.-C. Hong, “Application of SVR with improved ant colony optimization algorithms in exchange rate forecasting,” Control and Cybernetics, vol. 38, no. 3, pp. 863–891, 2009.
C. M. Kishtawal, S. Basu, F. Patadia, and P. K. Thapliyal, “Forecasting summer rainfall over India using genetic algorithm,” Geophysical Research Letters, vol. 30, no. 23, pp. 1–9, 2003.
H. D. Lee, S. W. Lee, J. K. Kim, and J. H. Lee, “Feature selection for heavy rain prediction using genetic algorithms,” in Proceedings of the 6th Joint International Conference on Soft Computing and Intelligent Systems and 13th International Symposium on Advanced Intelligent Systems (SCIS-ISIS '12), pp. 830–833, IEEE, Kobe, Japan, November 2012.
B. P. Shukla and P. K. Pal, “A source apportionment approach to study the evolution of convective cells: an application to the nowcasting of convective weather systems,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 5, no. 1, pp. 242–247, 2012.
M. Bastian, S. Heymann, and M. Jacomy, “Gephi: an open source software for exploring and manipulating networks,” in Proceedings of the 3rd AAAI International Conference on Weblogs and Social Media, San Jose, Calif, USA, May 2009.
M.-K. Lee, S.-H. Moon, Y.-H. Kim, and B.-R. Moon, “Correcting abnormalities in meteorological data by machine learning,” in Proceeding of the IEEE International Conference on Systems, Man and Cybernetics (SMC '14), pp. 888–893, San Diego, Calif, USA, October 2014.