Abstract

Forecasting rapid intensification (hereafter referred to as RI) of tropical cyclones in the Atlantic Basin is still a challenge due to a limited understanding of the meteorological processes that are necessary for predicting RI. To address this challenge, this study considered large-scale processes as RI indicators within tropical cyclone environments. The large-scale processes were identified by formulating composite map types of RI and non-RI storms using NASA MERRA data from 1979 to 2009. The composite fields were formulated by a blended RPCA and cluster analysis approach, yielding multiple map types of RI’s and non-RI’s. Additionally, statistical differences in the large-scale processes were identified by formulating permutation tests, based on the composite output, revealing variables that were statistically significantly distinct between RI and non-RI storms. These variables were used as input in two prediction schemes: logistic regression and support vector machine classification. Ultimately, the approach identified midlevel vorticity, pressure vertical velocity, 200–850 hPa vertical shear, low-level potential temperature, and specific humidity as the most significant in diagnosing RI, yielding modest skill in identifying RI storms.

1. Introduction

Based on the National Hurricane Center (NHC) and Statistical Hurricane Intensity Prediction Scheme (SHIPS) databases, 31% of all tropical cyclones, 60% of all hurricanes, 83% of all major hurricanes, and all category 4 and 5 hurricanes undergo rapid intensification (hereafter referred to as RI) [13]. This includes such famous systems as Hurricanes Camille (1969), Andrew (1992), Opal (1995), Mitch (1998), Charley (2004), Katrina (2005), and Wilma (2005). While not all rapidly intensifying hurricanes have made landfall, the threat against lives and property is apparent. Forecasting tropical-cyclone track has improved over the last two decades, but forecasting RI remains a challenge [2, 4, 5]. The forecasts of RI are limited by two key issues. First, researchers lack a clear definition for RI. The NHC defines RI as a sustained wind-speed increase of 30 kt over a 24-hour period; however, thresholds of 25, 35, and 40 kt over a 24-hour period are also used for developing RI forecast models and aids [1, 35]. Alternatively, other studies, and the National Weather Service, define RI as a decrease in the minimum sea-level pressure of 42 hPa over a 24-hour period [6, 7]. These definitions are related, since changes in the mass field (demonstrated by pressure changes) will result in changes to the wind field. This relationship is characterized by the eyewall containing the fastest winds and the eye containing the lowest central pressure. Second, there is limited understanding of which meteorological covariates are the best at predicting RI. Statistical techniques, such as multiple regression analyses [8], have been used to try to distinguish the best method to forecast RI, based on localized environmental influences and dynamic and thermodynamic parameters. However, while research in forecasting RI has become a top priority for the NHC [4], limited improvement has been made over the past two decades.

Further examination of long-term trends in RI forecasts for active tropical-cyclone basins has revealed little improvement over the last two decades. Elsberry et al. [2] documented intensity guidance errors for Atlantic tropical cyclones during the 2003-2004 seasons. The systems were evaluated during three intensity phases: formation, early intensification/reintensification, and decay, with 12 RI episodes identified for the 2003-2004 seasons [2]. For the intensification phase, forecasts by the 5-day statistical hurricane intensity forecast (SHF5), Statistical Hurricane Intensity Prediction Scheme (SHIPS), and decay SHIPS (DSHIPS) did not predict RI cases 48 hours in advance, where the SHF5 had 0 hits and 12 misses and the SHIPS and DSHIPS each had 1 hit and 11 misses [2]. For the dynamical models, the Geophysical Fluid Dynamics Interpolated model (GFDI) did predict some RI events, including 1 hit, 1 forecast that was too early, and 2 late forecasts (which were outside the ±12-hour interval), resulting in 8 misses, while the Geophysical Fluid Dynamics Model-Navy Interpolated (GFNI) had 2 hits, 1 late forecast 24 hours after the episode, and 9 misses [2]. All of these models also under-forecast the observed peak intensity [2].

In order to better understand RI processes, Kaplan and DeMaria [9] first developed a simple five-predictor RI index (RII), for the Atlantic Basin, based on the SHIPS and linear discriminant analysis techniques. The SHIPS-RII has since been modified and now implements 12-, 24-, 36-, and 48-hour lead times, using 25, 30, 35, and 40 kt thresholds. Predictors include, but are not limited to, vertical shear, lower-tropospheric relative humidity, maximum potential intensity, total precipitable water, the second principal component from GOES-IR imagery, and boundary-layer predictors [3, 4, 10]. Rozoff and Kossin [5], however, suggested that models should try to capture when a storm begins to get sufficient inner-core organization, resulting in two new RI prediction models: a logistic regression model and an empirical probability model based on Bayesian principles. When evaluated, the models had forecast skill similar to the SHIPS-RII model [5]. Further evaluation showed that when all three models were combined into a 3-member ensemble, forecast skill was superior to examining each model individually [4, 5]. Therefore, in an effort to improve the SHIPS-RII, new consensus-based versions of the RII were implemented to employ the SHIPS-discriminant RII, Bayesian model, and the logistic regression model [4, 5].

There are numerous statistical forecast aids that have also been used for improving prediction of RI that are derived using models such as the SHF5, DSHIPS, and GFDI [11]. Sampson et al. [11] introduced a new deterministic RI aid based on the RI index probability thresholds (RAPID) and a five-member consensus aid (IVCN) consisting of the following models: GFDI, DSHIPS, Logistic Growth Equation Model (LGEM), Hurricane Weather Research and Forecast (HWRF) Interpolated model (HWFI), and GFNI. The research showed that although the sample size was limited, RAPID did outperform other consensus aids and determined a 40% probability threshold for RI that resulted in a 4% reduction in mean forecast errors when IVCN and RAPID were combined [11]. A new version of RAPID that developed as a result (utilized in SHIPS) used RI guidance and the consensus members of IVRI (DSHIPS, LGEM, Geophysical Fluid Dynamics Laboratory Model Interpolated (GHMI), and HWFI) [4]. Inclusion of this new RAPID aid in the consensus model reduced mean forecast errors by approximately 5%, simultaneously reducing the biases by several knots [4].

The skill of the most recent SHIPS-RII remains inadequate. Brier skill scores relative to climatology are near 20% when the consensus model developed by Rozoff and Kossin is implemented for a threshold of 25 kt/24-hour lead time [4, 5]. The skill for the logistic regression model alone (using this same threshold) was near 18% and the skill for both the Bayesian and SHIPS models was below 15% [4]. When microwave imagery was added to the logistic regression model developed by Rozoff and Kossin, there were slight improvements to forecasting RI when used with the developmental dataset. However, upon evaluation of the microwave-imagery-based model, the Brier skill scores using real-time Global Forecast System (GFS) forecast fields from 2004 to 2013 showed skill for the Atlantic below 20% [4].

Previous literature has suggested the need for more advanced statistics, including those which utilize learning methods [9, 12, 13]. This study sought to address these current RI forecast issues by utilizing spatial statistical techniques and learning methods (in particular, support vector machines [14]) to improve the skill of a RI discrimination system. The primary research goals were the identification of relevant synoptic-scale controls in RI and non-RI tropical cyclones and the improvement of the discrimination between these two classes of cyclones, in particular an improvement in identifying the onset of RI.

2. Data and Methods

2.1. Data

Large-scale composites require fully three-dimensional, gridded datasets of base-state meteorological fields. For this project, the NHC’s North Atlantic Hurricane Database (HURDAT) [15] and NASA Modern-Era Retrospective Analysis for Research and Applications (MERRA) [16] data were obtained for all tropical and subtropical cyclones, regardless of landfall, from 1979 to 2009. MERRA surface data are provided on a (latitude/longitude) global grid and the pressure-level data are provided on a (latitude/longitude) global grid, which also includes 25 vertical pressure levels up to 100 hPa. For this study, all three-dimensional base-state meteorological variables (temperature, u and v wind components, specific humidity, geopotential height, and pressure vertical velocity) as well as mean sea-level pressure (MSLP), surface pressure, skin temperature (a proxy for sea surface temperature), 10-m and 2-m u and v wind components, 2-m temperature, 2-m specific humidity, and tropopause pressure were collected.

Since a primary goal was diagnosis of RI using the large-scale cyclone structure, storm centric domains of MERRA data for each cyclone were obtained. The center of the study domain for each storm used the MERRA grid point nearest to the NHC-defined storm center 24 hours prior to greatest intensification (hence the data were storm centric), with variables extracted on a 15 × 11 latitude/longitude grid centered on this domain. Since inconsistencies in the definition of RI existed within the literature, an independent definition of RI was developed. The definition used a 95% bias-corrected and accelerated (BCa) bootstrap confidence interval analysis [17, 18] of 24-hour pressure change within all storms (Figure 1). The BCa method has been shown to be second-order accurate, meaning it will converge on the population confidence intervals twice as quickly as traditional confidence interval methods [19]. The resulting analysis led to the RI definition of a decrease in pressure of 25 hPa in 24 hours.

After the data were split into RI and non-RI groups, MERRA data were collected 24 hours prior to the largest 24-hour pressure decrease in both the RI and non-RI groups. In the instance that there was a tie for the greatest pressure change, the first was chosen, so that the approach only dealt with the first instance of RI in any storm that underwent RI multiple times. For example, if the largest pressure decrease occurred on June 24, 1987, at 12Z, data were collected for June 23, 1987, at 12Z. This approach yielded 76 RI events and 265 non-RI events, with the distribution of maximum intensification identified for each storm type (RI and non-RI, Figure 2).
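
The selection rule above can be illustrated with a minimal Python sketch. It assumes 6-hourly central-pressure records, treats the timestamp of a pressure decrease as the end of its 24-hour window, and labels a storm RI when the drop is at least 25 hPa; the function names and the example track are hypothetical and are not taken from HURDAT or from the study's own software.

```python
from datetime import datetime, timedelta

RI_THRESHOLD_HPA = 25.0  # 24-hour pressure drop used in the text to define RI


def largest_24h_deepening(track):
    """Return (drop_hpa, end_time) for the largest 24-hour pressure decrease.

    `track` is a time-sorted list of (datetime, central_pressure_hpa) tuples.
    Ties go to the earliest occurrence, as described in the text.
    """
    pressure_at = dict(track)
    best_drop, best_end = None, None
    for end_time, p_end in track:
        start_time = end_time - timedelta(hours=24)
        if start_time not in pressure_at:
            continue  # no observation exactly 24 h earlier
        drop = pressure_at[start_time] - p_end  # positive means deepening
        if best_drop is None or drop > best_drop:  # strict '>' keeps the first tie
            best_drop, best_end = drop, end_time
    return best_drop, best_end


def classify_storm(track):
    """Return (label, analysis_time, drop), or None if no 24-hour pair exists."""
    drop, end_time = largest_24h_deepening(track)
    if drop is None:
        return None
    label = "RI" if drop >= RI_THRESHOLD_HPA else "non-RI"
    # Reanalysis fields are taken 24 h before the time of the largest decrease.
    return label, end_time - timedelta(hours=24), drop


# Hypothetical 6-hourly central pressures (hPa), starting 22 June 1987 at 12Z.
start = datetime(1987, 6, 22, 12)
pressures = [1002.0, 1000.0, 998.0, 995.0, 990.0, 982.0, 975.0, 968.0, 962.0, 960.0]
example_track = [(start + timedelta(hours=6 * i), p) for i, p in enumerate(pressures)]

label, analysis_time, drop = classify_storm(example_track)
```

In this made-up track the largest decrease (28 hPa) is assigned to 24 June at 12Z, so the sketch returns an analysis time of 23 June at 12Z, mirroring the example in the text.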

2.2. Methods

This methodology was divided into two phases. First, synoptic-scale variables that yielded the largest statistically significant differences between RI and non-RI tropical cyclones were identified through spatial analysis. The second phase involved the implementation of a logistic regression and a support vector machine (hereafter SVM) classification algorithm to diagnose the onset of RI at a 24-hour lead time.

2.2.1. Spatial Analysis

Since the MERRA surface and upper-level data were on different grids, and the spatial analysis methods utilized herein required consistency among gridpoints, a Barnes objective analysis (similar to an inverse distance weighted scheme) was utilized to upscale the surface data to the grid of the upper-level data [35]. Interpolation errors associated with the objective analysis were approximately 1% of the magnitude of each variable.

Once proper domain structure was established, a rotated principal component analysis (RPCA) [36] and hierarchical cluster analysis were utilized to construct composite fields of the environments associated with each tropical cyclone type (RI or non-RI). RPCA is a linear analysis technique that reduces the dimensionality by transforming the original dataset into a new dataset of linear combinations in order of decreasing variability explained. The rotated principal components (RPCs) are uncorrelated and effectively represent a redistribution of variance within the full dataset to the leading modes of variability. However, depending on the problem being considered, a different underlying correlation structure can be utilized. In this study, the spatial behavior of multiple cases that do not maintain a consistent time-series structure was of interest, so a T-mode RPCA [36] was employed.

The RPCA methodology included the following:
(i) transformation of the original data to standard anomalies, since the variables have considerable variability among each other and among vertical levels (e.g., comparing 300 hPa geopotential height against 700 hPa specific humidity): this scaling was done by variable and by vertical level;
(ii) a formulation of the correlation matrix, correlating the individual cases (T-mode) instead of the gridpoints (S-mode) [36];
(iii) an eigenanalysis of the correlation matrix, which included reducing the dimensions of the eigenvector matrix: this reduction was done using the congruence test [37]; after data reduction, 4 RPCs remained for RI storms and 3 RPCs for non-RI storms (Table 1 provides the variance explained of each RPC for each intensification type);
(iv) calculation of the RPC loadings and scores: RPC loadings (formulated using varimax rotation [36]) represented the relative contribution of each case to a given RPC score, and as such, they provide a relative match of the RPC patterns to the original cases and can be used to cluster events (see [37] for details).
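
Steps (i) and (ii) can be illustrated with a short Python sketch that uses only the standard library. It is a simplified reading of the procedure, not the study's code: the mean and standard deviation of each field (one variable at one level) are pooled over all cases and gridpoints, which is only one possible interpretation of scaling "by variable and by vertical level", and the eigenanalysis and varimax rotation of steps (iii) and (iv) are omitted. The function names and the tiny dataset are hypothetical.

```python
import math


def standard_anomalies(data):
    """Scale each field (one variable at one level) to zero mean, unit variance.

    `data[c][f]` is the list of gridpoint values for case c and field f.
    Statistics are pooled over all cases and gridpoints of a field (an assumption).
    """
    n_fields = len(data[0])
    scaled = [[None] * n_fields for _ in data]
    for f in range(n_fields):
        pooled = [v for case in data for v in case[f]]
        mean = sum(pooled) / len(pooled)
        std = math.sqrt(sum((v - mean) ** 2 for v in pooled) / len(pooled))
        for c, case in enumerate(data):
            scaled[c][f] = [(v - mean) / std for v in case[f]]
    return scaled


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def case_correlation_matrix(scaled):
    """Correlate cases with one another, giving an n_cases x n_cases matrix."""
    vectors = [[v for field in case for v in field] for case in scaled]
    return [[pearson(a, b) for b in vectors] for a in vectors]


# Hypothetical data: 3 cases, 2 fields (height in m, humidity in g/kg), 4 gridpoints.
example = [
    [[5800.0, 5810.0, 5820.0, 5830.0], [4.0, 4.5, 5.0, 5.5]],
    [[5805.0, 5815.0, 5822.0, 5833.0], [4.1, 4.4, 5.2, 5.6]],
    [[5830.0, 5820.0, 5812.0, 5799.0], [5.4, 5.1, 4.6, 4.0]],
]
corr = case_correlation_matrix(standard_anomalies(example))
```

The resulting matrix has one row and one column per case rather than per gridpoint. Here the first two cases, which share the same spatial pattern, are positively correlated, while the third, with a reversed pattern, is negatively correlated with both.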

Since the resulting RPC loadings provide a relative match of each case to the given RPC score field, a hierarchical clustering of the RPC loadings can be used to group events based on their dominant physical structures. Hierarchical cluster analysis has a primary goal of minimizing the intracluster event distances while maximizing the intercluster group Euclidean distances. Clustering is completed through various hierarchical linkage methods. In this study, several linkage methods were tested, but the linkage that provided the most robust results was Ward’s minimum variance method [38]. The resulting dendrograms (see the RI dendrogram in Figure 3) revealed 4 RI clusters and 8 non-RI clusters (Table 2 provides amount of events per cluster). Once each event’s membership in the cluster structure was established, events within the same cluster were averaged together, yielding composite maps of different types of RI and non-RI scenarios. This process was done for both the RI and non-RI datasets, yielding fully three-dimensional atmospheric composites of the synoptic-scale structure for each type of intensification scenario. These fields were subjectively analyzed for evident differences between the environments associated with each type of intensification.
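
The clustering and compositing steps can be sketched in plain Python as follows. This is a naive illustration of Ward's criterion (merge the pair of clusters that gives the smallest increase in within-cluster sum of squares) followed by a simple average of the member fields; it is not the software used in the study, it scales poorly with the number of events, and the loadings and fields shown are invented for the example.

```python
def ward_clusters(points, n_clusters):
    """Naive agglomerative clustering with Ward's minimum variance criterion.

    At each step the two clusters whose merger yields the smallest increase in
    within-cluster sum of squares are joined:
        increase = n_a * n_b / (n_a + n_b) * ||centroid_a - centroid_b||^2
    Returns a sorted list of clusters, each a sorted list of case indices.
    """
    clusters = [([i], list(p)) for i, p in enumerate(points)]  # (members, centroid)
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                (ma, ca), (mb, cb) = clusters[a], clusters[b]
                dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
                cost = len(ma) * len(mb) / (len(ma) + len(mb)) * dist2
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        (ma, ca), (mb, cb) = clusters[a], clusters[b]
        na, nb = len(ma), len(mb)
        centroid = [(na * x + nb * y) / (na + nb) for x, y in zip(ca, cb)]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append((ma + mb, centroid))
    return sorted(sorted(members) for members, _ in clusters)


def composite(fields, members):
    """Average the gridded fields of the cases belonging to one cluster."""
    n = len(members)
    return [sum(fields[i][g] for i in members) / n for g in range(len(fields[0]))]


# Hypothetical RPC loadings for 5 cases on 2 retained RPCs.
loadings = [[0.90, 0.10], [0.85, 0.15], [0.10, 0.90], [0.20, 0.80], [0.88, 0.05]]
# Hypothetical gridded field (3 gridpoints) for the same 5 cases.
fields = [[1.0, 2.0, 3.0], [1.2, 2.2, 3.2], [5.0, 6.0, 7.0], [5.4, 6.4, 7.4], [0.8, 1.8, 2.8]]

groups = ward_clusters(loadings, n_clusters=2)
composites = [composite(fields, members) for members in groups]
```

In practice the number of clusters would be read from the dendrogram rather than fixed in advance as it is in this sketch.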

To test the significance of the differences identified in the cluster analysis, a permutation test [38] was performed on the raw MERRA variables to determine the 95% level of statistical significance of the observed differences. Statistically, variables that exhibit the largest statistically significant differences between RI and non-RI events should be the best discriminators of intensification type. This approach yielded a series of “best” predictors that were considered as input into the SVM in phase 2 of the methodology.
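
A permutation test of this kind can be sketched for a single gridpoint as follows. The test statistic (difference in group means), the two-sided form, the number of shuffles, and the sample values are all assumptions made for illustration; the text does not specify these details.

```python
import random


def permutation_test(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Returns the observed difference and the proportion of label shuffles whose
    absolute difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = sum(group_a) / len(group_a) - sum(group_b) / len(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign the RI / non-RI labels at random
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add one to numerator and denominator so the p-value is never exactly zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical values of one variable at one gridpoint (arbitrary units).
ri_values = [8.1, 7.6, 9.0, 8.4, 7.9, 8.8, 9.3, 8.0]
non_ri_values = [5.2, 6.1, 4.8, 5.9, 5.5, 6.3, 5.0, 5.7, 6.0, 5.4]

difference, p_value = permutation_test(ri_values, non_ri_values)
significant = p_value < 0.05  # 95% level, as in the text
```

Repeating the test at every gridpoint of a variable would give a map of significant regions of the kind discussed in Section 3.1.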

2.2.2. Statistical Modeling

For phase 2 of the methodology, a SVM classification model that utilized the “best” predictors from phase 1 was formulated on the original MERRA data fields to predict RI 24 hours prior to occurrence. SVMs are learning systems that project nonlinear datasets into a higher-dimensional mathematical hyperspace that represents the data linearly [14]. This mapping is accomplished through the use of a kernel map function, so that the dot product of these map functions yields a similarity matrix known as a kernel matrix. Such a representation allows for the formulation of a separating hyperplane between the two data classes (RI’s and non-RI’s). Another primary goal of the SVM is to maximize the distance between the two separating hyperplanes (the margin of separation) via a quadratic programming optimization approach. Each time the SVM is trained with new data, the margins and optimization routines are modified, so that the system learns from new information. Additionally, the quadratic programming optimization routine can be modified by penalizing points far from the margin of separation in the formulation of the decision hyperplane, known as the cost function penalty. Such a high-cost optimization example would only utilize points lying on the decision hyperplane, which are known as support vectors. Through modification of the kernel matrix function and the cost function, it is possible to obtain a theoretically infinite number of SVM configurations, and as such a cross-validation routine was implemented to identify the best kernel and cost combination. To diagnose the added value of utilizing a SVM approach, a traditional logistic regression [38] was included in the analysis.
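
As a concrete instance of a kernel matrix, the sketch below evaluates the radial basis function kernel, exp(-gamma * ||x - z||^2), for every pair of samples. The kernel form is the standard one; the function name and the three predictor vectors are hypothetical, and the gamma values are the ones tested later in this section.

```python
import math


def rbf_kernel_matrix(samples, gamma):
    """K[i][j] = exp(-gamma * ||x_i - x_j||^2) for every pair of samples."""
    def squared_distance(x, z):
        return sum((a - b) ** 2 for a, b in zip(x, z))

    return [[math.exp(-gamma * squared_distance(x, z)) for z in samples]
            for x in samples]


# Hypothetical predictor vectors (for example, RPC scores) for three events.
events = [[0.5, -1.2], [0.4, -1.0], [-2.0, 1.5]]

# One kernel matrix per gamma value tested in the study.
kernels = {gamma: rbf_kernel_matrix(events, gamma) for gamma in (0.1, 0.05, 0.01)}
```

Each entry lies between 0 and 1 and measures how similar two events are. Smaller gamma values give a smoother similarity measure in which even distant events remain weakly related.
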
Limited by the number of cases, and to reduce covariability and minimize overfitting, a second RPCA was performed [39, 40] on the 14 “best” predictors as determined by the results of composite and permutation test analysis, as well as previous literature: potential temperature at 1000, 925, 850, and 700 hPa, pressure vertical velocity at 850, 700, 500, and 200 hPa, vertical shear (200–850 hPa), specific humidity at 1000, 925, 850, and 700 hPa, and vorticity at 500 hPa. The resulting RPC scores for all RI and non-RI events were used as predictors in the SVM, which in turn provided a binary RI/non-RI response. The RPCA of the combined RI and non-RI event matrix yielded 7 RPCs (using a traditional scree test for dimension reduction [38] (Figure 4)). The scree plot, while subjective, revealed a leveling off at the seventh RPC, suggesting that additional RPCs would not further explain the variability in the dataset. Additionally, the RPC scores are by definition uncorrelated when using a varimax rotation approach, so that any issues with multicollinearity within the 14 “best” predictors are eliminated by using this approach.

To establish measures of uncertainty in the statistical modeling results and allow for independent testing [38, 41] and thereby reduce the risk of overfitting, a bootstrap cross-validation method (as outlined in [38]) was used for both the logistic regression and SVM equations. For this research, the training (85% of the data) and testing (15% of the data) sets were determined via pairwise (i.e., the same random training and independent testing samples were formulated for the logistic regression and all SVM experiments) bootstrap resampling with 1000 iterations, providing a performance distribution of the derived contingency statistics [38]. Confidence intervals formulated on these pairwise resampled distributions were used to select the best SVM configurations of those tested. To determine the best SVM configuration, a series of experiments was needed, using different kernel and cost functions. Both polynomial and radial basis functions were tested; however, the polynomial kernel never predicted any RI events. As such, a radial basis kernel function (hereafter RBF) was used primarily to determine which provided the largest discrimination ability [39]. Several gamma values (0.1, 0.05, and 0.01) and cost functions (1 to 1000 on a log10 scale) were tested to identify the best SVM performance. As the goal was to establish which SVM or logistic regression worked best at classifying RI and non-RI events 24 hours prior to onset, the bootstrap cross-validation method described previously allowed for performance evaluation on each given model [42]. The output of both the logistic regression and the optimal SVM configuration was evaluated using traditional contingency statistics [38], including probability of detection (POD), false alarm ratio (FAR), and bias (where a bias of less than 1 means the model underforecasts RI events and a bias greater than 1 means the model overforecasts RI events). 
Additionally, formal model skill was diagnosed using the Heidke skill score (HSS), which is a measure of skill relative to baseline climatology. Values of 1 for HSS suggest a perfect forecast, while values of 0 show that the model performs as well as climatology. Negative HSS values suggest that climatology is better than the model, which is undesirable.
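
The contingency statistics named above can be computed from a 2 × 2 table of forecasts and observations, as in the following sketch. It uses the standard textbook definitions of POD, FAR, bias, and the two-category HSS; the function name and the counts are hypothetical and do not come from the study's results.

```python
def contingency_scores(hits, false_alarms, misses, correct_negatives):
    """POD, FAR, bias, and HSS from a 2 x 2 contingency table.

    hits: RI forecast and observed; false_alarms: RI forecast, not observed;
    misses: RI observed, not forecast; correct_negatives: neither.
    Scores whose denominator is zero are returned as NaN.
    """
    a, b, c, d = hits, false_alarms, misses, correct_negatives
    nan = float("nan")
    pod = a / (a + c) if (a + c) else nan
    far = b / (a + b) if (a + b) else nan  # undefined when RI is never forecast
    bias = (a + b) / (a + c) if (a + c) else nan
    denominator = (a + c) * (c + d) + (a + b) * (b + d)
    hss = 2 * (a * d - b * c) / denominator if denominator else nan
    return {"POD": pod, "FAR": far, "bias": bias, "HSS": hss}


# Hypothetical counts for one independent test sample.
scores = contingency_scores(hits=6, false_alarms=5, misses=6, correct_negatives=35)
perfect = contingency_scores(hits=12, false_alarms=0, misses=0, correct_negatives=40)
```

With these invented counts the sketch gives a POD of 0.5, a FAR near 0.45, a bias just below 1 (slight underforecasting), and an HSS near 0.39, while the perfect table returns an HSS of 1. Applying the function to each bootstrap test sample would yield the performance distributions from which confidence intervals are drawn.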

3. Analysis

3.1. RI and Non-RI Composites

Utilizing support from previous literature, a starting set of important diagnostic variables (Table 3) was developed for conducting subjective analysis of the composites from phase 1. These selected variables, for each cluster and for each intensification type, yielded numerous (over 200) possible maps for discussion. For the sake of brevity, images from cluster 3 for both RI and non-RI composites are provided, as this cluster represented distinct RI and strong non-RI cases. The remaining cluster results are summarized in the text below.

For pressure vertical velocity, the most notable features were at the 700, 500, and 200 hPa levels. At 500 and 200 hPa (Figure 5), the RI composites showed tighter gradients in the midlevels, while at 700 hPa, non-RI composites showed signs of shifting from storm center towards the right. This was more evident in cross sections (Figure 6). Values of upward vertical velocity (UVV) were comparable in the majority of the levels, where at 1000, 700, and 500 hPa, the percent differences between RI and non-RI storms were minimal (less than 10%). The greatest differences in UVV were present at 200 hPa, where RI cases were approximately 80% stronger than non-RI cases. The cross sections revealed maxima in pressure vertical velocity remained vertically stacked for RI systems (Figure 6). Additionally, UVV values were weaker for the majority of non-RI clusters and all of the structures had a vertical tilt in the revealed UVV core. This would result in inhibition of the moisture source through the inflow region for the non-RI tropical cyclones. Statistical significance, revealed through the permutation tests, was shown for pressure vertical velocity in the center of the storm. This was expected given the cluster composites revealing that the non-RI cases had vertically tilted structures, compared to the vertically stacked RI cases, while exhibiting stronger UVV (Figure 6(c)). Analysis of vertical shear from 200 to 850 hPa (Figure 7) revealed that little shear is present around the storm centers for RI clusters, while for the non-RI composites, there are indications of shear affecting the storm center for a majority of the clusters. This is expected since pressure vertical velocity revealed that non-RI cases exhibited tilting in the vertical structure. Permutation test results revealed that nearly the entire map (Figure 7(c)) is statistically significant at distinguishing between RI and non-RI storm types.

For vorticity (Figure 8), both RI and non-RI cases showed strong organization in the cluster composites around the storm center. Vorticity at the 700 and 500 hPa levels had higher values of positive vorticity for the RI cases. At the 700 hPa level, the RI cases revealed tighter gradients around the storm center with approximately 16% higher values of positive vorticity for RI compared to non-RI, while at 500 hPa, the RI cases were approximately 57% higher than the non-RI cases, suggesting RI tropical cyclones were strengthening 24 hours earlier. Relative vorticity at 200 hPa showed RI cases approximately 29% stronger than non-RI cases, which was also reflected in divergence at 200 hPa (Figure 9). For divergence at 200 hPa, the majority of the RI clusters had the outflow positioned over the storm center and were approximately 50% stronger (for the majority) than the non-RI composites, where the outflow was positioned over the top-right section of the composites. In cases where non-RI values were comparable in divergence values, it was important to keep in mind that the non-RI cases were not necessarily weak tropical cyclones but were just not undergoing RI. In general, vorticity at 500 hPa showed the greatest distinction between the two types of systems as the permutation results (Figure 8(c)) showed statistically significant regions in the center of the map, matching composite results. Regions of statistical significance at 200 hPa also appeared over the center for relative vorticity, matching results found through composite analysis, suggesting stronger values associated with RI events; however, divergence at this same level showed a shift in the area of statistical significance to the left of the storm center. While this could be a sign that divergence is increasing 24 hours before RI, the shift could also be representative of the tilting of the vertical structure seen through pressure vertical velocity and vertical shear for the non-RI cases (Figures 6 and 7).

Examining specific humidity at 1000 hPa (Figure 10) revealed a dry slot in the center of 3 of the 4 RI clusters, while only cluster 6 had the dry slot for the non-RI clusters. This was similar to the results for potential temperature at 1000 hPa, where cooler centers for the same 3 of the 4 RI clusters were observed, with only cluster 6 for the non-RI showing this feature. The RI storms were found to have approximately 18% more moisture at 1000 hPa than the non-RI storms, whereas maximum values for potential temperature were approximately the same for both RI and non-RI cases. The cooler center for the RI events, combined with the increased moisture, suggests increased values of latent heat release and enhanced static instability, both aiding in strengthening of the storm. At 700 hPa, both the RI and non-RI clusters had about the same moisture content, as well as potential temperature values being similar, and showed organization around the storm center, where the highest values are over the storm center, and a steady inflow of moisture was present. At 500 hPa, the biggest difference in the moisture content was noticed, where RI cases had approximately 29% more moisture than non-RI cases and potential temperature at this level was revealing notable shifting of the storm center for at least two of the non-RI composites, while all RI composite maps remained storm centric. In general for specific humidity, RI composites did exhibit tighter gradients around the storm center and strong moisture inflow, while the moisture maximum in the non-RI’s shifted from the storm center towards the right of the composite. 
In the latitudinal cross sections, the dry slots, as well as the cooler potential temperatures, were observed near the storm centers at the surface for RI systems (Figures 11 and 12), further indicating additional latent heat release and instability, helping to strengthen the RI storms, while vertical tilting (seen through pressure vertical velocity (Figure 6)) was inhibiting the moisture source from the center of the non-RI systems. Permutation tests also revealed the majority of the differences in the low-level moisture (Figure 10(c)) and potential temperature structures to be statistically significant; however, the dry slots, and associated cooler potential temperature values, were not statistically significant between RI and non-RI events. This is possibly a result of land influences, as these individual RI or non-RI storms were approaching coastal regions, at 1000 hPa, and the size of the domain would have collected land data (including any higher terrain features, e.g., mountains). Permutation test results in the specific humidity latitudinal cross section (Figure 11(c)) revealed that a small area to the right could represent the inflow location which would be common among all tropical cyclones (as it is necessary to sustain any tropical cyclone), but higher values were associated with RI systems. This inflow region was closely mimicked in the results for potential temperature (Figure 12(c)).

Examination of skin temperature (SST) (Figure 13) revealed similarities in magnitude (around 300 K) between the RI and non-RI cluster composites. These similarities, while not surprising as they are in an approximately barotropic environment, suggest that although there is a threshold for tropical cyclones to thrive, it may not be a good distinguisher between RI and non-RI events. Permutation test results, however, revealed the northern regions of the storms are statistically significant. This could be a result of the slightly cooler temperatures in this portion of the storm for non-RI cases, which could inhibit intensity. While the magnitude varied by only 1 K, it was enough to be significant. The permutation test results showed a portion of the region to be statistically significant in distinguishing RI from non-RI events (north of the storm center); however, this could be land (or ocean) influences not able to be resolved at the surface with the current dataset. Lastly, for MSLP, values and structure were similar for the majority of the RI and non-RI cluster composites, where a pressure gradient was around the storm center for RI cluster 3 (not shown); however, there were three non-RI composites of similar synoptic structure to the RI composites. Similarly, five non-RI cluster composites had values of the same magnitude as the RI composites, suggesting these similarities could be a result of the NHC-defined storm center not lining up exactly with the MERRA gridpoint, smoothing from the averaging of each event’s cluster structure, and/or the reanalysis not maintaining a strong storm center. Based on these results, permutation tests were not done for this variable.

Upon analysis of the results of the composites and permutation tests, and considering previous literature, variables used as predictors in the SVM algorithm and logistic regression model were revealed. The variables believed to have had the best discrimination ability were vertical shear, potential temperature at 1000, 925, 850, and 700 hPa, pressure vertical velocity at 850, 700, 500, and 200 hPa, specific humidity at 1000, 925, 850, and 700 hPa, and vorticity at 500 hPa. Table 4 shows the percent significance for each, compared to the percent significance (where applicable) for each variable regardless of height.

3.2. Support Vector Machines

The results of phase 2 SVM cross-validation experiments identified a kernel-cost combination that yielded the greatest overall discrimination ability. The confidence intervals on the HSS and bias statistics suggested that the RBF using gamma value 0.01 and costs of 10 and 100 yielded the best discrimination capability (Figure 14). In particular, median HSS values for these SVM configurations were roughly 29% higher (0.3) than that of the logistic regression (0.22). Further analysis of the lower HSS confidence limits revealed that while the two best SVM configurations never produced negative skill (HSS), the logistic regression lower HSS confidence limit fell below zero (−0.03), suggesting that this configuration could produce negative skill (or that climatology could provide a better forecast than logistic regression in some circumstances). Examination of the bias, also important in determining whether the models were predicting RI effectively, revealed the median for the logistic regression at 0.25, suggesting significant underprediction of RI events. The SVM configurations, however, had median biases of 1 and 0.8. This statistic allowed for isolating the better of the two configurations, so the “best” model combination was determined to be the RBF (gamma = 0.01) with a cost of 10 (hereafter referred to as the SVM model). This result was also reflected in the POD (Figure 14(c)), where the SVM performed 73% better than the logistic regression in discriminating between RI and non-RI tropical cyclones (though some of this improvement was at the expense of an increased FAR (Figure 14(d))). Despite the improved median FAR with the logistic regression, the confidence interval spans the full range of values, suggesting this median is of little use in establishing logistic regression’s model performance.

While the skill of the SVM model was not a vast improvement over the current state of the science, the primary research goal of this study was to identify relevant synoptic-scale controls in RI and non-RI tropical cyclones to improve the skill of an RI discrimination system. The results above highlighted large-scale tropical-cyclone structure variations between RI and non-RI events. Specifically, positive vorticity at 500 hPa indicated that strengthening was occurring for RI events, with values 57% higher than for non-RI storms. Vertical shear was present for non-RI cases, and cross sections of pressure vertical velocity revealed tilting in the vertical structure and weaker values within the UVV core. This tilting would inhibit moisture transport through the inflow region, which was observed through analysis of specific humidity. Results for specific humidity and potential temperature also revealed higher moisture values and greater instability with RI events, leading to increases in latent heat and more energy available to strengthen the storm.

4. Conclusions

Forecasting RI is challenging due to the complexity of tropical cyclones. This research sought to find common synoptic patterns that distinguish RI and non-RI events through spatial analysis and to improve the diagnosis of the onset of RI using an SVM and logistic regression approach. In the composites, a few variables at different pressure levels stood out. At the surface for an RI storm, dry air over the storm center and lower values of potential temperature were common features. This result was in contrast to previous research suggesting that RI systems favor small values of the inner-core dry air predictor, that is, less dry air mixing down to the surface [9, 10].

Potential temperature values were comparable between the RI and non-RI cases. However, at the surface, RI cases had cooler values over the storm center compared to the non-RI cases, indicating more static instability for the RI cases and thus more energy available. At the midlevels for the RI cases, the composites for positive vorticity showed higher values with tighter gradients over the storm center, indicating a faster spin. The midlevels also revealed that maximum values of specific humidity shifted towards the right of the storm center for the non-RI cases, while RI cases remained storm-centric. For divergence at the upper levels, composites showed some of the RI cases to have stronger outflows than the non-RI cases. However, there were some instances in which the non-RI cases also exhibited strong outflows. Further examination of non-RI cases revealed a shift of divergence to the right of the storm center, while the RI cases remained storm-centric. Vertical shear was present for a majority of the non-RI cases, and in the pressure vertical velocity fields the shifting that was observed in the midlevels became apparent. All RI cases remained vertically stacked, while the majority of the non-RI cases were tilted. This tilt is believed to inhibit the inflow of moisture and heat needed to sustain the storm. RI cases, being vertically stacked, allow the systems to have full access to moisture, heat, and, thus, potential energy.

Permutation tests were also performed on SST, specific humidity, potential temperature, vertical shear, pressure vertical velocity, vorticity at 200 and 500 hPa, and divergence at 200 hPa. The tests provided the statistical significance used to help identify variables for the statistical modeling of RI and non-RI storms. The permutation tests revealed that potential temperature and specific humidity were statistically significant in the storm centers from 700 to 200 hPa. At the surface, however, the dry slots for specific humidity, as well as the smaller values of potential temperature, were not statistically significant when viewed in a cross section over the storm center. Instead, the region just to the right of the storm center showed statistical significance for both variables. This was not surprising, as RI cases receive more moisture and heat through the inflow of the tropical cyclone. Vorticity showed greater regions of statistical significance in the storm center at 500 hPa than in the upper levels at 200 hPa. This was expected since vorticity values were higher for RI cases than for non-RI cases. In general, differences between RI and non-RI cases would be most reflected in the midlevels, as vorticity is directly related to the radius of maximum winds, which increases with height for all tropical cyclones [27]. In the cross section of pressure vertical velocity, the permutation test confirmed that vertical stacking was statistically significant in distinguishing the two types of cases. That is, over the storm centers, the RI and non-RI cases did not exhibit the same UVV magnitude. This makes sense considering that the cluster composites revealed vertical shear for non-RI cases and that the permutation test results confirmed it was statistically significant in distinguishing between RI and non-RI storm types. For divergence at 200 hPa, the areas of statistical significance were to the left of the storm center. This region is where the outflow is thought to be beginning to strengthen for the RI tropical cyclones without having yet reached its maximum potential, due to eyewall replacement processes [27]. Lastly, SST composite analysis showed similar values between RI and non-RI events, with cooler SSTs in the northern portions for non-RI cases, which could inhibit intensity growth. The permutation test results showed areas north of the storm center to be statistically significant in distinguishing RI from non-RI events. However, this could have been a result of land influences or of the ocean state under a tropical cyclone being heavily distorted by precipitation [27].
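The idea behind a permutation test at a single gridpoint can be sketched as follows. This is only an illustration under stated assumptions: the test statistic (absolute difference of group means), the number of permutations, and the function name are hypothetical choices, since this section does not specify the exact configuration used in the study.

```python
import random


def permutation_p_value(ri_values, non_ri_values, n_permutations=2000, seed=0):
    """Two-sided permutation test on the difference of group means.

    Labels (RI vs. non-RI) are repeatedly shuffled; the p-value is the
    fraction of shuffles whose mean difference is at least as large as
    the observed one. The statistic and defaults are assumptions.
    """
    rng = random.Random(seed)
    n_ri = len(ri_values)
    pooled = list(ri_values) + list(non_ri_values)

    def mean(values):
        return sum(values) / len(values)

    observed = abs(mean(ri_values) - mean(non_ri_values))

    count = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign RI / non-RI labels at random
        diff = abs(mean(pooled[:n_ri]) - mean(pooled[n_ri:]))
        if diff >= observed:
            count += 1

    # Add-one correction so the estimated p-value is never exactly zero.
    return (count + 1) / (n_permutations + 1)


# Hypothetical 500 hPa vorticity values at one gridpoint (arbitrary units).
ri = [5.1, 4.8, 5.5, 5.0, 4.9, 5.3, 5.2, 4.7, 5.4, 5.0]
non_ri = [3.2, 3.0, 3.5, 2.9, 3.1, 3.4, 3.3, 2.8, 3.6, 3.0]
print(permutation_p_value(ri, non_ri))
```

Repeating such a test at every gridpoint of a composite field yields a map of significant regions, which is how statements such as "significant just to the right of the storm center" can be read.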

Ultimately, potential temperature and specific humidity at the surface, vorticity at the midlevels, vertical shear, and pressure vertical velocity were used as prediction variables in two prediction algorithms (logistic regression and SVM classification).
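As a reminder of what the gamma and cost settings of the selected SVM control, the sketch below writes out the standard RBF kernel and the form of an SVM decision function. It is a generic illustration, not the fitted model from this study: the support vectors, coefficients, and intercept are hypothetical placeholders.

```python
import math


def rbf_kernel(x, y, gamma=0.01):
    """Standard RBF kernel: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-gamma * squared_distance)


def decision_value(x, support_vectors, dual_coefs, intercept, gamma=0.01):
    """Kernel SVM decision function: sum_i coef_i * K(sv_i, x) + intercept.

    A positive value is read here as the RI class and a negative value as
    non-RI. In a C-SVM, each coefficient's magnitude is bounded by the
    cost parameter (10 for the configuration selected in the text).
    """
    total = sum(
        coef * rbf_kernel(sv, x, gamma)
        for sv, coef in zip(support_vectors, dual_coefs)
    )
    return total + intercept


# Hypothetical two-predictor example with one support vector per class.
support_vectors = [[0.0, 0.0], [10.0, 0.0]]
dual_coefs = [1.0, -1.0]
print(decision_value([1.0, 0.0], support_vectors, dual_coefs, intercept=0.0))
```

A small gamma such as 0.01 makes the kernel decay slowly with distance in predictor space, giving a smoother decision boundary, while a larger cost penalizes training misclassifications more heavily.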

Based on the cluster analysis and permutation test results, there is a distinct synoptic difference between RI and non-RI tropical cyclones. While the classification models did not fully resolve these differences, the SVM model outperformed the logistic regression model, yielding a 29% improvement in skill. The SVM model also has the added advantage of being a learning method that can improve with additional training.

More research is needed in this area to further improve forecasting of RI. Within this study alone, additional variables at different levels could still be utilized as predictors within the SVM model. For example, divergence at 200 hPa was stronger in some cases and showed statistical significance (Table 4) between RI and non-RI events but was not used in the model. Another shortcoming of this study is the use of all tropical-cyclone data regardless of landfall. Future work should consider not only how this may influence the permutation test results at the surface, but also whether the remainder of the environment was affected. Other types of variables should also be examined. While this study specifically focused on synoptic-scale variables, tropical cyclones are complex in that they are driven by both dynamic and thermodynamic processes and are also influenced by the surrounding environment. Analyses of other processes should be considered in future work. Finally, this work was done in a diagnostic mode and would need to be reformulated for forecast applications.

While there were shortcomings in this research, overall the SVM model, using synoptic-scale variables as predictors, did outperform a logistic regression model. The performance of an SVM, or of any artificial intelligence technique, would be expected to improve with more data input into the system. Therefore, these techniques are thought to be an improvement over the currently implemented regression models, and future research should continue to focus on their improvement for forecasting. The next step would be to reformulate the composite fields of relevant diagnostic variables for RI processes from numerical model guidance output at varying lead times. This would again allow for identification of which variables are important for the prediction of RI. Additionally, focusing on regions where permutation tests were significant (i.e., retaining only significant gridpoints from the composite analysis as predictors) may improve the forecast skill further. Implementing other artificial intelligence techniques, such as random forests and neural networks, to improve AI-based modeling for RI would also be a logical next step. The results of such work could contribute to a solution for the prediction of RI in the Atlantic Basin and help to solve the daunting problem of forecasting these potentially devastating systems.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The authors would like to thank Dr. Jamie Dyer for his input in tropical meteorology and Dr. P. Grady Dixon for his input in meteorological processes as a whole. This material is based upon work supported by the National Science Foundation under Grant no. DGE-0947419 at Mississippi State University. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.