Estimating Climate Trends: Application to United States Plant Hardiness Zones
The United States Department of Agriculture classifies plant hardiness zones based on mean annual minimum temperatures over some past period (currently 1976–2005). Since temperatures are changing, these values may benefit from updating. I outline a multistep methodology involving imputation of missing station values, geostatistical interpolation, and time series smoothing to update a climate variable’s expected value compared to a climatology period and apply it to estimating annual minimum temperature change over the coterminous United States. I show using hindcast experiments that trend estimation gives more accurate predictions of minimum temperatures 1-2 years in advance compared to the previous 30 years’ mean alone. I find that annual minimum temperature increased roughly 2.5 times faster than mean temperature (~2.0 K versus ~0.8 K since 1970), and is already an average of 1.2 0.5 K (regionally up to ~2 K) above the 1976–2005 mean, so that much of the country belongs to warmer hardiness zones compared to the current map. The methods developed may also be applied to estimate changes in other climate variables and geographic regions.
Expected values of various climate variables at particular locations are used for decision making in many sectors. Traditionally, these have been estimated based on the average over some past period. If the variable is statistically stationary, averaging over a long period should result in an accurate estimate of its expected value; however, in the presence of trends, such as those associated with global warming, such averages will not be optimal estimates for the expected value going forward [1, 2].
In this paper, the climate variable considered is the annual minimum temperature, an important determinant of the range over which particular varieties of perennial plants and overwintering insects may thrive. The United States Department of Agriculture (USDA) first released maps of plant hardiness zones for the coterminous United States (USA) and southern Canada in 1960, where each zone corresponded to a particular range of mean annual minimum temperature; similar maps, with different numbering of hardiness zones, were published as early as 1938 by Harvard University's Arnold Arboretum . The most recent USDA hardiness zone map revision for the United States and Puerto Rico, published in January 2012, is based on mean annual minimum temperatures over 1976–2005 at some 8,000 available weather stations, interpolated to a resolution finer than 1 km taking into account factors such as elevation and proximity to shorelines . The annual minimum temperature mapped over much of the USA is of order 1.4 K warmer than that in the previous USDA map, which was released in 1990 and based on 1974–1986 averages, although differences between the 1990 and 2012 releases are not necessarily the result of climate change because the interpolation methodology has also changed. That there has been warming is, however, confirmed by comparing the maps prepared for the two halves of the new climatology period, 1976–1990 versus 1991–2005 . A global hardiness zone map has also been prepared, based on 1973–2002 station averages and using gridded monthly mean temperatures to interpolate areas with sparse station coverage . These hardiness zone maps are widely used in analyses of the ranges of plant and pest species and in the formulation of planting recommendations [6–9].
There have been few systematic studies of observed changes in annual minimum temperature, compared to many studies of trends in mean temperatures. Karl et al.  found increasing trends in annual minimum temperatures, but not annual maximum temperatures, over the USA and former USSR. Zhai et al.  found that winter minimum temperatures in Chinese stations showed an increasing linear trend amounting to 2.5 K over the period 1951–1990, while summer temperatures did not increase. Knappenberger et al.  found that the linear warming trend in daily minimum temperature for USA stations over the period 1970–1997 was strongest for the coldest part of the annual range. Mckenney et al.  quantify linear trends in annual minimum temperature over Canada during the period 1961–2000, finding a mean increase of 2.2 K over the period (larger over the western provinces), which they note exceeds reported increases in annual mean temperature. Shen et al.  studied trends in the first four moments of daily temperature anomalies over the coterminous USA during 1901–2000, finding that winter temperatures have increased more than summer temperatures, daily minimum temperatures have increased more than daily maximums, and the variance of the temperature anomaly time series has decreased.
In this study, I sought to quantify trends in annual minimum temperature since the early 20th century and evaluate how currently mapped hardiness zones might be adjusted to account for these trends. My interest here was in trends (i.e., systematic shifts between time periods, including nonlinear change patterns), rather than in actual annual minimum temperature at a given location. While temperatures show pronounced differences over small spatial scales, as captured by the latest USDA map, trends in mean temperature tend to have spatial scales of hundreds to thousands of km , and here I will show that this is true of trends in annual minimum temperatures as well. I show results for the coterminous USA for comparison with earlier work and because of the relatively high density of available weather data there, although my methods could be applied with little modification to any region with adequate station data. I expressed changes in annual minimum temperature as relative to the 1976–2005 average, to complement the recent USDA release, and also examined the total change since the beginning of the current warming period around 1970.
2.1. Station Data
The source of temperature observations was a current version of the Daily Global Historical Climatology Network (GHCN-Daily, ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/daily/). GHCN-Daily is a compilation of ground weather observations from a variety of sources, going as far back as the 18th Century, that is freely available and updated in near real time by the USA National Climatic Data Center (NCDC). Data flagged as failing any of the automatic quality control tests [16, 17] were rejected. The provided station locations were mapped by country code, detecting 10 cases of obviously erroneous longitudes or latitudes; these were corrected for this analysis and reported to NCDC for the benefit of future releases.
Annual minimum temperature over a year was calculated as the lowest daily minimum temperature for that year in a given station record. The annual minimum temperature was considered missing if minimum temperature for any day during the year was missing from a given station record. Years were defined to begin in March because in the northern midlatitudes, the annual minimum temperature almost always occurs between December and February. (Thus, the official record minimum temperatures in each of the 48 contiguous states (http://www.ncdc.noaa.gov/extremes/scec/) were set on dates ranging between 22 December and 17 February.) Ending the year in February allows computing a new data point in annual minimum temperature time series as early as March, taking advantage of the near real-time updating of GHCN-Daily. Defining years instead as starting in July, as done in computing the USDA hardiness zones , led to almost identical results for the study region of the coterminous USA.
Only stations with at least 30 years of annual minimum temperature data were used. This gave 2733 available stations, of which 1065 were in the coterminous USA (Figure 1; the interpolation utilized all the stations shown, not just those in the coterminous USA). All years with at least 5 available stations were included in the analysis steps, although because of the restricted number of stations at the beginning of the analysis period (Figure 2), only results since 1900 will be shown.
While GHCN has developed an homogeneity-adjusted monthly temperature product , the daily temperature series used here are not adjusted for inhomogeneities that may arise, for example, from changing station siting or observation practices. As a partial check on the sensitivity of my estimated trends to such inhomogeneities, an additional analysis was conducted where the stations used were restricted to those in the United States Historical Climatology Network (USHCN), a set of stations specifically chosen for monitoring climate change over the USA due to their long high-quality records (http://cdiac.ornl.gov/epubs/ndp/ushcn/ushcn.html) . Imposing this restriction gave a subset of 421 available stations.
2.2. Gap Filling
For the selected stations, missing annual mean temperature values were imputed, producing complete time series of annual minimum temperature; this was done to minimize the impact of changing station coverage over time on the estimated regional trends. I chose an imputation method based on singular value decomposition (SVD), variants of which have been tested in genomics and data mining applications [20, 21]. The method consisted of the following.(1)Construct an matrix of station annual minimum temperatures, where is the number of stations and the number of years, and .(2)Center by subtracting the mean from each row. Record the maximum and minimum values of the centered matrix.(3)Fill in missing values in with the mean value for the corresponding year. (This initializes with any long-term trend that is suggested by the available stations.)(4)Find the thin SVD [22, page 72] of .(5)For each year index , recalculate the corresponding row of by regression on the known station values for that year; that is, if the set of indices of available stations in the year is given by , solve the linear problem for the unknown row of .(6)Compute . Limit the elements of to range between .(7)Compute the iteration increment as , where the norm is the vector 2-norm (root mean square) over the missing elements only and is the matrix of mean values subtracted from in step .(8)Set . Set known elements in to their observed values. Recenter .(9)Iterate steps (4)–(8) until is under a specified tolerance level . (Here , and convergence occurred after about 30 iterations.)
The limiting in step (6) prevents occasional divergence of the iterations. Presumably, it could be replaced by a suitable regularization constraint in the regression step (5). It may also be possible to refine the procedure by using only a geographically nearby station subset to fill in each value, analogous to the implementation of locality in data assimilation for numerical weather prediction .
The empirical correlation of annual minimum temperatures between pairs of stations (using only years for which both stations have data) was plotted as a function of interstation distance (Figure 3). As noted for mean temperatures and for hydroclimate variables [15, 24], the correlation length scale was typically hundreds of km. A sum of three Gaussians was fitted to model the interstation covariances: where is the distance between two stations. Station variances were taken to be Here, the second term represents the “nugget” component of station variability due to effects not shared by nearby stations (such as microclimate or measurement technique) and includes an adjustment for the station observation coverage (as a fraction of the analysis period) that downweighs stations with more values that are imputed rather than observed. Values for the parameters , were determined to minimize the weighted sum of squares of the difference between the modeled and the empirical covariances for available pairs of stations (Figure 3), where the empirical covariances were averaged over (28 km) distance ranges and weighted based on the standard error of these averages.
Ordinary Kriging  was used to interpolate each year's annual minimum temperatures on a grid over the coterminous USA based on the gap-filled station values and the modeled covariance matrix. A USA-wide annual minimum temperature time series was also computed, using an area-weighted average of the grid points.
2.4. Trend Estimation
For the annual minimum temperature time series obtained for each grid point or for the USA, a decomposition was sought in the form where is a smooth trend component and is weather variability that is basically uncorrelated from year to year. was estimated using a smoothing cubic spline with natural boundary conditions , where the regularization parameter that determines the amount of smoothing was chosen by the corrected Akaike information criterion (AICC) . Approximate pointwise standard errors for the estimated trend were obtained following .
In order to check whether the results were sensitive to the trend estimation method, an alternative smoothing approach tried was a local linear regression estimator , with a Gaussian weighting kernel whose scale parameter was estimated via AICC. Additionally, generalized cross-validation  was tried instead of AICC for choosing the smoothing spline regularization parameter.
2.5. Hindcast Experiments: End-to-End Testing
Do the trend estimates obtained on a grid via the gap filling, interpolating, and smoothing process improve predictions of actual station-measured annual minimum temperatures, compared with simply using (as in the current USDA procedure) a climatology based on the mean of previous years’ observations? To find out, I compared two methods of hindcasting station’s annual minimum temperatures using only observations from preceding years. The first method (climatology hindcast) used the mean of the previous 30 years of available values as the predictor. The second method (trend hindcast) added to the climatology predictor a correction equal to the difference between the last year’s trend value for the grid cell in which the station is located and its value over the climatology period. Only data from years before the hindcast target were used in the gap-filling, interpolation, and smoothing steps for obtaining the trend.
Hindcast values were obtained for all coterminous USA stations with available observations for comparison and for all years since 1980, for a total of 11,465 hindcast opportunities at 1-year lag and 10,854 at 2-year lag. For both hindcast methods, the root mean square, mean absolute value, and mean (bias) of the error (hindcast minus actual value) were computed.
3.1. USA Temperature Trends
Figure 4 shows the annual minimum temperature by year, averaged over the coterminous USA, relative to the 1976–2005 base period. At this spatial scale, trends over recent decades are of the same magnitude as year-to-year variability, which is 1.5 K. The multidecade variability is qualitatively similar to that known for mean temperatures, but with generally larger amplitude (as will be shown below). Annual minimum temperatures rose 0.7 K between 1900 and 1940 and then declined 0.1 K through 1970. Annual minimum temperatures have increased ~2.0 K since 1970 and as of 2011 are K above the 1976–2005 mean (1-σ uncertainty of trend) (Figure 4). The higher trend annual minimum temperature by 2011 compared to the 1976–2005 mean is consistent across the coterminous USA, but the warming magnitude ranges from 0.5 K in the southwest and upper midwest to 2 K around the southern appalachians (Figure 5(a)). Considering the period since 1970, the warming of annual minimum temperatures is in the range 1.5–2.5 K over most of the coterminous USA (Figure 5(b)).
It is also possible to look at annual minimum temperature anomalies for particular years, in addition to the trend. Averaged across the coterminous USA, 2011-2012 seems to have been the warmest since 1900 (in terms of annual minimum temperature), at 3.8 K above the 1976–2005 mean; the next warmest years were 1999-2000 (+3.2 K), 1930-1931 (+3.1 K), and 1991-1992 (+3.0 K) (Figure 4). The coldest year since 1900 was 1962-1963, at 4.4 K below the 1976–2005 mean (though, interestingly, no current state minimum temperature records date to that winter), followed by 1984-1985 (−3.7 K), 1961-1962 (−3.4 K), and 1911-1912 (−3.3 K). The gridded anomalies can be mapped to show the regional distribution of anomalies by year, as shown in Figure 6 for the extreme years 1962-1963 and 2011-2012. Uncertainties (1-σ) estimated from the Kriging correlation matrix for the individual years’ values are about 1 K for individual grid points and 0.1 K for the USA-wide mean.
The obtained increases in annual mean temperature since 1970 and since the 1976–2005 base period are very similar if only the USHCN stations are used, at 2.1 K and 1.4 K, respectively, compared to 2.0 K and 1.2 K when all HCDN-Daily stations are used (Figure 7). Similarly, the increases are similar if local linear regression instead of spline smoothing is used for trend estimation, at 2.0 K and 1.4 K, respectively, or if generalized cross-validation is used in the smoothing spline parameter estimation instead of AICC (2.0 and 1.3 K; not shown). This increase in annual minimum temperature since 1970 has been greater than the increase in annual mean temperature (computed with HCDN-Daily stations using the same method employed for annual minimum temperature), which has been only 0.8 K (Figure 7). In general, annual minimum temperatures have increased roughly 2.5 times as fast as annual mean temperatures since 1900 (Figure 8). Note that the linear relationship between minimum and mean temperature shown in Figure 8 is entirely due to the trend components of the two time series (shown in Figure 7); if these trend components are subtracted, then annual anomalies in annual minimum temperatures have no significant correlation with anomalies in annual mean temperature (, not plotted).
3.2. Hindcast Experiments
Averaged over the years since 1980, the climatology hindcasts of station minimum temperatures as the average of the past 30 years have been biased low by 1 K (Table 1), in line with the reconstructed warming trend. The trend hindcast, which adds the trend reconstructed based only on data 1 or 2 years prior to the hindcast year, reduces the magnitude of the bias by 90% and also reduces the root mean square error by 2% and the mean absolute error by 4% (Table 1). Inspection of climatology and trend hindcasts for individual stations reveals that after the 1980s trend forecasts tended to be above climatology, and usually better tracked observations: two examples, one in the south and one in the north of the study region, are shown in Figure 9. Notice that for individual stations, year-to-year variability in minimum temperature is higher than that for the nationwide average and dominates the hindcast error, so that although trend adjustment improves hindcast quality by reducing bias, root mean square and mean absolute errors are reduced by relatively little.
The approach outlined here to estimate trends from relatively sparse, temporally incomplete station data should be of general applicability to other climate variables such as maximum temperature or precipitation, though some details such as the form of the falloff in interstation covariance with distance may need adjusting when different climate variables are considered. My approach may be useful in updating climate-based quantities such as growing season length and heating degree days which find a variety of applications and which are also changing [31–34], as well as standards based on expected climate extremes for infrastructure design and adaptation . This approach could similarly be extended to tracking the migration of alternate measures of plant hardiness, such as the Canadian one, or of climate envelopes empirically determined for particular plant species; these measures may include functions of climatic variables such as precipitation and wind speed in addition to annual minimum temperature [36–39].
This approach compares favorably in generality to others proposed for reducing the bias of climatology estimates of expected values in a changing climate. Compared to the “optimal climate normals” approach for finding expected values, where the climatology averaging period is shortened to reduce the trend-induced bias , my trend estimation approach is able to mitigate bias without losing the information contained in the longer-term record where this can improve accuracy. Compared to the “hinge fit” extrapolation for climate normals proposed by Livezey et al. , which assumes a stationary expected value for the climate variable before 1975 and a linear change thereafter, my approach of trend estimation with a smoothing spline has the advantage of allowing nonlinear trends in recent decades and of allowing estimation of earlier fluctuations in expected values (such as the warming seen between 1900 and 1940).
In order to gauge how much confidence can be placed on climate model simulations of future changes in derived climate quantities of interest, such as hardiness zones and species-specific climate envelopes , it may be useful to explicitly compare the trends seen in these quantities over recent decades with model simulations and determine whether climate models can reproduce, for example, the some 2.5-fold faster increase in annual minimum temperature compared to mean temperature over the coterminous USA. Consistent with earlier studies [42, 43], a recent analysis shows that the amplitude of the seasonal cycle of monthly mean temperatures has decreased over the 1954–2007 period in the extratropical continents—implying that winters are warming up faster than summers—and that this decrease is faster than that shown by most of the IPCC AR4 ensemble of climate models . More broadly, one might consider comparing shifts in the entire probability distribution of temperature across models and observation sets as a means of achieving deeper understanding of greenhouse warming feedbacks across different conditions.
The physical explanation for the fast rise in annual minimum temperature requires detailed investigation. Certainly snow albedo feedback is a likely contributing factor to winter amplification of warming in parts of the USA that have had significant winter snow cover . Also, already Charney et al.  mentioned that greenhouse warming at the surface would be expected to be greater under cold inversions, which are likely to be found during the coldest days of the year; more recently, cold inversions have been argued to be a factor in the disproportionate winter warming of the Arctic .
In terms of the specific application of delineating USA plant hardiness zones, I found that annual minimum temperatures are already on average some 1.2 K higher than at the 1976–2005 base period used in the most recent release. Given that hardiness half-zones are defined at 2.8 K intervals , this would suggest that over one-third of the country has already shifted half-zones compared to the current release and over one-fifth has shifted full zones. Since much of the temporal variability in annual minimum temperature is coherent over scales of several hundred km (Figure 3), the resolution of the current analysis may be adequate to enable regular adjustment of the fine-resolution annual minimum temperature USDA map with estimated temporal trends, providing up-to-date guidance for horticulturists.
In order to enable adoption and modification of the procedures proposed here, the computer programs used for downloading data, imputation of missing data, interpolation, and smoothing, written in the computer language Octave , are available under an open source license from http://www-ce.ccny.cuny.edu/nir/sw/climfit/.
I have shown that the expected values of annual minimum temperature over the coterminous USA have changed substantially over the past century, that interannual variability and trends in annual minimum temperature can be mapped from available daily station data, and that accounting for estimated trends improves hindcasts of observed annual minimum temperatures. This trend estimation procedure may be used to provide yearly updates to hardiness zone maps based on expected annual minimum temperature. Similar or identical methods could be used to improve forecasts of many other climatically derived quantities, compared to the alternative of using statistics from some past baseline period without adjustment for trends.
This work benefited from discussions with Alan Betts, Pierre Gentine, and Alessandra Giannini. Soni Pradhanang, Boris Shmagin, Alexander Stine, and Brian Tkatch reviewed and commented on drafts of the paper. This study was supported by NOAA under Grant NA11SEC4810004. Statements made are the views of the author and are not the opinions of the funding agency or the US Government.
D. Wyman and H. L. Flint, “Plant hardiness zone maps,” Arnoldia, vol. 27, no. 6, pp. 53–56, 1967.View at: Google Scholar
D. J. Nowak and R. A. Rowntree, “History and range of Norway maple,” Journal of Arboriculture, vol. 16, no. 11, pp. 291–296, 1990.View at: Google Scholar
G. R. Johnson, C. Frank Sorensen, J. Bradley St Clair, and C. Richard Cronn, “Pacific Northwest forest tree seed zones,” Native Plants Journal, vol. 5, no. 2, pp. 131–140, 2004.View at: Google Scholar
P. C. Knappenberger, P. J. Michaels, and R. E. Davis, “Nature of observed temperature changes across the United States during the 20th century,” Climate Research, vol. 17, no. 1, pp. 45–53, 2001.View at: Google Scholar
D. W. McKenney, M. Hutchinson, P. Papadopol, K. Campbell, and K. Lawrence, “The generation of USDA-equivalent extreme minimum temperature models and a comparison with Canada's plant hardiness zones,” Canadian Journal of Plant Science, vol. 86, no. 2, pp. 511–523, 2006.View at: Google Scholar
D. R. Easterling, T. R. Karl, and J. H. Lawrimore, “United States Historical Climatology Network daily temperature, precipitation, and snow data for 1871–1997,” Tech. Rep. ORNL/CDIAC-118, NDP-070, Carbon Dioxide Information Analysis Centerr, Oak Ridge National Laboratory, U.S. Department of Energy, Oak Ridge, Tenn, USA, 1999.View at: Google Scholar
M. Kurucz, A. A. Andrs Benczr, and K. Csalogny, “Methods for large scale SVD with missing values,” in Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 7–14, San Jose, Calif, USA, 2007.View at: Google Scholar
G. H. Golub and C. F. van Loan, Matrix Computation, Johns Hopkins University Press, 3rd edition, 1996.
N. Y. Krakauer and I. Fung, “Mapping and attribution of change in streamflow in the coterminous United States,” Hydrology and Earth System Sciences, vol. 12, no. 4, pp. 1111–1120, 2008.View at: Google Scholar
N. A. Cressie, Statistics for Spatial Data, Wiley, 1993.
C. M. Hurvich, J. S. Simonoff, and C. L. Tsai, “Smoothing parameter selection in nonparametric regression using an improved Akaike information criterion,” Journal of the Royal Statistical Society. Series B, vol. 60, no. 2, pp. 271–293, 1998.View at: Google Scholar
G. Wahba, “Bayesian “confidence intervals” for the cross-validated smoothing spline,” Journal of the Royal Statistical Society. Series B, vol. 45, no. 1, pp. 133–150, 1983.View at: Google Scholar
D. R. Easterling, J. L. Evans, P. Y. Groisman, T. R. Karl, K. E. Kunkel, and P. Ambenje, “Observed variability and trends in extreme climate events: a brief review,” Bulletin of the American Meteorological Society, vol. 81, no. 3, pp. 417–425, 2000.View at: Google Scholar
T. G. Huntington, A. D. Richardson, K. J. McGuire, and K. Hayhoe, “Climate and hydrological changes in the northeastern United States: recent trends and implications for forested and aquatic ecosystems,” Canadian Journal of Forest Research, vol. 39, no. 2, pp. 199–212, 2009.View at: Publisher Site | Google Scholar
J. G. Charney, A. Arakawa, D. James Baker et al., “Carbon dioxide and climate: a scientific assessment,” Tech. Rep., Climate Research Board, National Research Council, National Academy of Sciences, Washington, DC, USA, 1979.View at: Google Scholar
J. W. Eaton, D. Bateman, and S. Hauberg, GNU Octave Manual Version 3., Network Theory Limited, 2008, http://www.network-theory.co.uk/octave/manual/.