The well-known urban heat island (UHI) effect recognizes prevailing patterns of warmer urban temperatures relative to surrounding rural landscapes. Although UHIs are often visualized as single features, internal variations within urban landscapes create distinctive microclimates. Evaluating intraurban microclimate variability presents an opportunity to assess spatial dimensions of urban environments and identify locations that heat or cool faster than other locales. Our study employs mobile weather units and fixed weather stations to collect air temperatures across Roanoke, Virginia, USA, on selected dates over a two-year interval. Using this temperature data, together with six landscape variables, we interpolated (using Kriging and Random Forest) air temperatures across the city for each collection period. Our results estimated temperatures with small mean square errors (ranging from 0.03 to 0.14); landscape metrics explained between 60 and 91% of temperature variations (higher when the previous day’s average temperatures were included as a variable). For all days, similar spatial patterns appeared for cooler and warmer areas in mornings, with distinctive patterns as landscapes warmed during the day and over successive days. Our results revealed that the most potent landscape variables vary according to season and time of day. Our analysis contributes new dimensions and new levels of spatial and temporal detail to urban microclimate research.

1. Introduction

As early as the 1800s, observers recognized that urban regions are warmer than their rural surroundings, an effect known as the urban heat island (UHI). The UHI is well understood through a multitude of studies documenting its relationship to landscape differences between rural and urban areas [1, 2]. Land cover and land use changes—removal of vegetative cover and increasing amounts of impervious surface cover—will intensify and expand UHIs [3]. Urban areal extent expands as urban populations grow, and with over 50% of the world’s population now living in urban areas, the UHI is expected to intensify [4].

Furthermore, urban areas are internally heterogeneous, from spatial variation of urban surfaces and construction, along with their economic and historic development. Such differences within urban landscapes create microclimates generated by unique combinations of surface materials’ thermal properties, vegetative cover, landscape design, and remnants of the natural environment [5, 6]. Understanding the spatial patterns of urban microclimates requires assessment of local relationships between air temperatures and urban landscape features, at fine spatial scales [7]. Practical implications for improved knowledge surrounding spatial detail of urban temperatures include impacts for human health and safety, forecasting variations in energy use, and municipal administration.

This paper assesses those distinctive combinations of urban landscape features that create microclimates within a specific urban area, the city of Roanoke, Virginia, USA. We use these distinctive combinations to estimate temperatures across the entire city and address gaps in research investigating fine-scale microclimate variations within a UHI.

2. Urban Microclimate Literature

While UHI literature evaluating differences between rural and urban temperatures dates back over a century (as referenced by [1, 9]), research evaluating intraurban microclimate differences dates back only a few decades. Much of this research evaluates temperature differences related to land use, that is, low-density residential, high-density residential, industrial, commercial (e.g., [1, 10–17]). Some recent research has investigated land cover and related thermal differences (e.g., [18–22]). However, evaluative techniques for microclimate assessments vary widely.

Extensive research has been accomplished using only satellite imagery to evaluate urban thermal patterns (e.g., [11, 15, 16, 23, 24]). Some satellite studies attempt to extrapolate specific land surface temperatures from remotely sensed images of varying resolutions (e.g., Landsat or MODIS) [25]. These studies report specific temperature values derived from radiative values (from the satellite images) and, in some cases, validate these estimated temperatures with data from a limited number of field sites [2]. But satellite imagery does not accurately represent all ground surface characteristics because in some regions it records thermal conditions of building roofs [5, 6].

Because satellite imagery cannot capture all ground conditions, some researchers utilize mobile mesonet units to record air temperatures along specified transects in conjunction with remotely sensed imagery (e.g., [12–14]). Although mobile units are effective in collecting data across the range of urban landscapes, they are restricted to roadways and parking areas and have a limited temporal scope. Further, because temperatures are changing while mobile collection is underway, data should be segmented into shorter temporal intervals to minimize mixing of temporal and spatial variation.

In contrast, fixed weather stations deliver a constant stream of data from established locations. As such, some researchers acquire temperature data from fixed weather stations or temporarily place stationary weather stations in designated areas (e.g., [1, 14, 26–28]). But, in most urban areas, the number of fixed stations is limited, with sparse distributions relative to place-to-place variation of local temperatures. Combining both approaches provides advantages in capturing local spatial and temporal variations.

Some urban microclimate studies only use a network of fixed stations and site units in specific locations across an urban area (e.g., [29–31]). But, again, the number and placement of fixed stations are not able to capture the detail created by unique combinations of urban landscape characteristics.

The aims of urban microclimatic research are variable. For example, Gaffin et al. [29] assessed temporal patterns over the past 100 years, seasonally and diurnally. Bourbia and Boucheriba [30] specifically examined urban street canyon variables. Holmer et al. [19], Bencheikh and Rchid [22], and Asgarian et al. [32] examined effects of vegetation on cooling in an urban environment. Yahia and Johansson [31] examined differences in heating and cooling between parks, urban canyons, and open residential areas.

A few researchers have examined the unique combinations of landscape characteristics, although the number of temperature sites is often limited or temporary, the time period of collection is limited, and distances (within which the landscape characteristics are captured) differ.
(i) de Andrade et al. [33] used five sites, collected data over six days, and used descriptive variables to categorize the landscape (bare soil, green areas, arboreal areas, built-up areas, and water bodies) within a 450-meter radius of each station;
(ii) Shahrestani et al. [34] used six sites (but the radius around each site is unclear), including street orientation, built surfaces: walls, roads, and other pavements, vegetation, and elevation, collecting data for one month during the summer;
(iii) Sodoudi et al. [35] used three permanent stations and 31 sites using hand-held equipment, for one day over two different collection times; variables were soil type and texture, building dimensions, vegetation type, and impervious surface type, but again it is unclear what radius around the equipment was used to identify the specific variables;
(iv) Bourbia and Boucheriba [30] used seven stations, collecting data every two hours for two weeks during summer; variables included skyview factor, street orientation, street widths, and building heights, but again buffer distance is not clear.

Rarely do researchers use these variables to extrapolate temperature patterns across entire urban areas. Only one such study was found: Hart and Sailor [28] collected afternoon temperatures for six days, no more than one hour at a time, over a two-month summer period. They chose dates when temperatures were predicted to be higher than normal. They used mobile transects (employing one to four vehicles), collecting data every 2 seconds and eliminating duplicate data collected when stopped. They normalized temperatures for elevation using data from the permanent weather station located at the nearby airport. Landscape variables included land use, tree canopy cover, impervious surface cover, ground vegetation, loose surface cover, total length of roads, and total floor space as captured within 300 meters for each of their temperature readings. They used a regression tree model to extrapolate temperatures across the entire city of Portland, Oregon, for two categories: weekdays and weekends.

Our research resembles that of Hart and Sailor [28] but refines and expands upon their (and other researchers’) efforts by employing finer spatial and temporal detail. We use mobile collection units, with a network of fixed stations, and include landscape variables collected at a finer spatial scale (30 × 30 m), for multiple data collection periods across two years, and use Random Forest to estimate temperatures across our entire study site. We refer readers to our Methods section for complete details.

3. Study Area

The City of Roanoke, Virginia, USA, is located in Virginia’s Central Valley, between the Blue Ridge Mountains and the Alleghany Highlands (Figure 1). It is southwest Virginia’s largest metropolitan region, and although small in area (110 km²), it is intensely urbanized, with respect to both population density (880 persons per square kilometer) and area. It has a variety of urban land uses, with an historical focus as a hub for rail and road traffic with industries supporting the rail system, as well as supporting services: finance, distribution, trade, manufacturing, and health care.

Roanoke’s elevation ranges from 269 to 531 meters. As shown in the inset of Figure 1, Roanoke’s elevation is relatively uniform throughout the city, with the exception of Mill Mountain (southeast). Although Mill Mountain’s relief is documented well in the elevation map, Figure 2 clearly illustrates its notable elevation, rising out of the valley as a considerable landscape feature.

Roanoke’s land cover includes 31.9% impervious surfaces (as calculated by first author) and 47.9% tree canopy [36], although the distribution of both is not uniform across the entire city (Figure 3). The streams and river within the city’s boundaries are on the US EPA’s Total Maximum Daily Load list for PCBs, high water temperatures, and Escherichia coli [37].

4. Methods

4.1. Temperature Collection

Air temperatures were collected using both mobile mesonet units and fixed weather stations. The data collection period extended from April through November of 2013 and June through August of 2014.

4.1.1. Fixed Stations

At the beginning of our collection campaign, the fixed stations existing within Roanoke included four K-12 schools, the local airport (National Weather Service), Virginia Western Community College, a few private residences, and a local TV station. Each station reported their data via the internet either through WeatherBug or Weather Underground.

Both internet sites provide access to current and historical data, but in remarkably different ways. For WeatherBug, users must download their app and for historical data a “Plus” app. Even with the “Plus” app, temperature data is only displayed as hourly reports for the preceding 30 days and only the high and low temperatures for another 90 days. Their data is not downloadable so it must be manually recorded.

Weather Underground, however, provides complete historical data (for as long as the weather station has been reporting to the site), downloadable as a .csv file. Data is recorded at the interval set by the weather station owner. In addition, data can be downloaded as daily, weekly, monthly, or yearly reports. For both WeatherBug and Weather Underground the data is only accessible if the weather station is actually online.

The equipment used at WeatherBug sites is unknown, although WeatherBug does have specific equipment requirements to qualify as a WeatherBug location. Weather Underground provides equipment information (as entered by the weather station’s owner) and does monitor the equipment for consistency with other reporting stations. Station type, maintenance, and siting are issues across most weather station networks, as they can introduce bias.

As part of our research, eleven new fixed weather stations were purchased, and, with the assistance of Roanoke City Public Schools (RCPS), sited at middle and elementary schools. We chose the public schools for new stations because schools are distributed across the city. Careful selection was made for the new instruments after consulting with meteorology experts: faculty and local meteorologists. All new stations are Davis Vantage Vue Wireless Weather Stations (Model number 6250) with Davis WeatherLink data loggers and software. Table 1 provides instrument specifications.

These stations’ outdoor instruments were installed by RCPS facilities personnel, on school roofs in locations not easily accessible (Figure 4) to unauthorized personnel. Our university’s meteorology students linked indoor consoles with data loggers to outdoor equipment, set data reporting (timed for every five minutes), installed computer software, and connected stations to Weather Underground (Figure 5). We were then able to monitor the fixed weather stations remotely.

4.1.2. Mobile Temperature Collection

Our university’s mobile mesonet units (Figure 6) are Campbell Scientific mobile meteorological units. Table 2 provides the instruments’ specifications, which were configured by the fourth author for vehicular use.

The units were mounted on roofs of Chevrolet Cobalts (Figure 7) and driven into and around Roanoke for transect data collection. Data collection campaigns were organized using volunteer drivers and navigators to follow routes covering a range of landscapes and land uses (Figure 8 shows routes driven on 1, 2, 3, and 4 July 2014). As previously noted, our collection campaign included multiple dates in 2013 and 2014 and, in some cases, covered consecutive dates (Table 3).

We chose these dates to capture a range in seasonal temperature regimes: green-up during spring, summer heating, and the onset of autumn frosts. We coordinated our plans with local meteorologists to select days when we could expect calm, clear weather as cold air/warm air advection would override local conditions. In most cases, we successfully targeted days when the synoptic pattern was quiet. The exceptions were a few days in the summer, when synoptic patterns are sometimes unstable, a normal summer condition within Virginia. Although we collected some daytime data during the autumn season, our principal goal was to capture onset of early season frosts—routes driven during nighttime hours usually beginning just before or just after midnight and ending just after sunrise.

At the conclusion of each campaign, the data file was downloaded as a .dat file, which can be imported into Excel. Latitude and longitude are recorded as degrees and, separately, decimal minutes, so we combined these into one reading as decimal degrees. The time was recorded as Mountain Time (stamped as hours:minutes:seconds on a 24-hour clock), so we converted to Eastern Standard Time (added two hours). The spreadsheet file was then loaded into GIS software and a point shapefile created for all collection points, maintaining all recorded data within the corresponding attribute table.
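The coordinate and clock conversions described above can be sketched as follows. This is a minimal illustration rather than the spreadsheet workflow actually used; the function names, the convention that a negative degree value marks a western longitude, and the sample reading are all assumptions.

```python
from datetime import datetime, timedelta


def to_decimal_degrees(degrees: int, decimal_minutes: float) -> float:
    """Combine whole degrees and decimal minutes into decimal degrees.

    Assumes the sign of `degrees` carries the hemisphere (negative = west or
    south) and that `decimal_minutes` is non-negative. Coordinates between
    0 and -1 degrees are not handled, which does not matter for Virginia.
    """
    sign = -1.0 if degrees < 0 else 1.0
    return sign * (abs(degrees) + decimal_minutes / 60.0)


def mountain_to_eastern(timestamp: str) -> str:
    """Shift an HH:MM:SS logger time by +2 hours (Mountain to Eastern).

    Only the clock time is returned; a rollover past midnight is not
    tracked as a change of date in this sketch.
    """
    parsed = datetime.strptime(timestamp, "%H:%M:%S")
    return (parsed + timedelta(hours=2)).strftime("%H:%M:%S")


# Hypothetical reading: 37 deg 16.2 min N, 79 deg 56.4 min W, 04:30:15 logger time
lat = to_decimal_degrees(37, 16.2)
lon = to_decimal_degrees(-79, 56.4)
local_time = mountain_to_eastern("04:30:15")
```

Readings logged shortly before midnight Mountain Time would fall on the next calendar day in Eastern time, so a full workflow would carry the date along with the clock time.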

Throughout the mobile collection campaigns, the number of fixed stations reporting at a given time varied—some went offline completely, some were only reporting on certain days, and some private residences installed new weather stations. A spreadsheet file was created for all fixed weather stations’ data (temperature only) for the same dates and times as the mobile weather stations’ collection. A point shapefile was also created in GIS for each fixed station, again, maintaining each recorded parameter within the attribute table.

We hypothesized that residual heat from the previous day might influence morning temperatures. We therefore created a spreadsheet of the daily high, low, and average temperatures for all fixed stations reporting on the day before each of the mobile unit collection dates.

Because siting of individual stations can introduce bias into our analysis (we did not control actual installation of the fixed station equipment, either for new stations or preexisting stations), we compared data from the fixed stations to the Campbell Scientific equipment mounted on the vehicles. To do so, we completed a spreadsheet covering the dates of all mesonet data collection campaigns, noted the exact times we were driving near a fixed weather station, and entered the temperatures for both the mobile unit and the fixed station at that specific time, along with the straight-line distance between the mobile unit and the fixed station. We then conducted a correlation analysis between the fixed stations’ and the mobile units’ temperatures.
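A paired comparison of this kind can be illustrated with a Pearson correlation coefficient. The text does not state which correlation statistic was used, so the choice of Pearson's r is an assumption, and the paired temperatures below are invented solely to exercise the function.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / sqrt(var_x * var_y)


# Hypothetical paired readings (deg C) taken when a vehicle passed a station
mobile_temps = [10.1, 12.4, 15.0, 18.3, 21.7]
fixed_temps = [10.0, 12.6, 15.1, 18.0, 21.9]
r = pearson_r(mobile_temps, fixed_temps)
```

A coefficient close to 1 indicates that the two systems rise and fall together; it does not by itself rule out a constant offset between them, which would need a separate check of the mean difference.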

4.2. Landscape Metrics

Our final step was documenting landscape characteristics within a 30 × 30 meter grid cell around each fixed weather station and all mobile units’ data points. A ten-meter digital elevation model (DEM) was downloaded from the USGS Seamless Server. The National Land Cover Database 2006 Percent Developed Imperviousness (NLCD IS, 30-meter resolution raster file) was downloaded from the Multi-Resolution Land Characteristics Consortium website. During the course of our analysis, when a new NLCD (2011) was produced, we compared the 2011 dataset to the 2006 version and noted very little change in reported impervious surfaces, so we continued our use of the 2006 data. A one-meter resolution tree canopy cover (TCC) raster dataset was provided by Virginia Tech’s Geospatial Extension specialist. Using ArcMap, aspect and percent slope were derived from the 10-meter DEM. The one-meter TCC was aggregated to 30 meters and percent TCC for each 30-meter grid cell was calculated.
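The aggregation of a fine-resolution canopy mask into percent cover per coarser cell amounts to a block average. The sketch below shows only that arithmetic on a toy grid; it is not the ArcMap procedure, and the function name, the 0/1 encoding of canopy, and the 4 × 4 example are assumptions.

```python
def percent_cover(grid, block):
    """Aggregate a 0/1 raster into percent cover per `block` x `block` cell.

    `grid` is a list of equal-length rows holding 1 (canopy) or 0 (no canopy).
    Grid dimensions are assumed to be exact multiples of `block`.
    """
    rows, cols = len(grid), len(grid[0])
    if rows % block or cols % block:
        raise ValueError("grid dimensions must be multiples of the block size")
    aggregated = []
    for r0 in range(0, rows, block):
        out_row = []
        for c0 in range(0, cols, block):
            covered = sum(
                grid[r][c]
                for r in range(r0, r0 + block)
                for c in range(c0, c0 + block)
            )
            out_row.append(100.0 * covered / (block * block))
        aggregated.append(out_row)
    return aggregated


# Toy 4 x 4 canopy mask aggregated with block=2; one-meter cells aggregated
# to 30 meters would correspond to block=30.
canopy = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
tcc_percent = percent_cover(canopy, 2)
```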

For each mobile data collection point and each fixed weather station, we added values for elevation, TCC, NLCD IS, aspect, and percent slope to the attribute table using the Extract Multi Values to Points tool in ArcMap (Figure 9).

Finally, a fishnet of 30 × 30 meter grid cells was created for the entire city (n = 123,461). The fishnet tool within ArcMap also creates a point shapefile, placing a point at the center of each grid cell. Using this point shapefile, we extracted all the same landscape metrics, thus ensuring we knew these metrics for every 30 × 30 meter area over the entire city (Figure 10).
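The geometry behind a fishnet's center points is simple enough to sketch. The function below generates cell centers for a rectangular extent in projected coordinates (meters); the function name and the toy extent are assumptions, and unlike the city-wide grid it performs no clipping to a city boundary.

```python
from math import ceil


def fishnet_centers(xmin, ymin, xmax, ymax, cell=30.0):
    """Return (x, y) centers of square cells covering a rectangular extent.

    Coordinates are assumed to be in a projected system with meter units.
    Cells are generated row by row, starting at the lower-left corner.
    """
    n_cols = ceil((xmax - xmin) / cell)
    n_rows = ceil((ymax - ymin) / cell)
    return [
        (xmin + (col + 0.5) * cell, ymin + (row + 0.5) * cell)
        for row in range(n_rows)
        for col in range(n_cols)
    ]


# Toy extent of 90 m x 60 m, giving 3 columns x 2 rows of 30 m cells
centers = fishnet_centers(0.0, 0.0, 90.0, 60.0)
```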

4.3. Data Analysis Using Random Forest

We incorporated data collected from fixed stations and mobile mesonet units to build a model that can estimate temperatures at 30 m resolution for the entire city. Our data has a complex structure characterized by high correlations among covariates, spatial correlations among observations taken at different locations, and nonlinear relationships between covariates and the microlevel temperatures that we want to estimate. To overcome these challenges, a predictive model based on Random Forest [38] was used.

Random Forest is a machine learning method widely used for classification and regression analysis. In our study, we use Random Forest to build a large number of regression trees (e.g., 500) derived from bootstrap samples (random sampling with replacement) of the full dataset; then the final prediction/estimation is made based on the average of the outputs of all the regression trees. This practice is also known as bagging in machine learning literature, which improves stability and accuracy of the regression tree. Another feature of Random Forest is that it builds each regression tree slightly differently from a standard regression tree. In Random Forest, each tree node is split using the best split among a random subset of predictors. For a standard regression tree, each node is split using the best among all predictors. This additional layer of randomness makes it robust against overfitting [38]. Hence the results from Random Forest are generally preferred to those from regression trees.
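The two ingredients described above, bootstrap aggregation and a random subset of predictors at each split, can be illustrated with a deliberately small pure-Python sketch. It is a teaching toy and not the software used in this study; all function names, the tree-depth and leaf-size settings, and the synthetic canopy/impervious data are assumptions.

```python
import random
from statistics import mean


def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def build_tree(X, y, n_try, rng, min_leaf=2, depth=0, max_depth=6):
    """Grow one regression tree; each node tests only `n_try` random predictors."""
    if depth >= max_depth or len(y) < 2 * min_leaf or len(set(y)) == 1:
        return mean(y)  # leaf: predict the mean response
    best = None
    for j in rng.sample(range(len(X[0])), n_try):
        for threshold in sorted(set(row[j] for row in X))[:-1]:
            left = [i for i, row in enumerate(X) if row[j] <= threshold]
            right = [i for i, row in enumerate(X) if row[j] > threshold]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            score = sse([y[i] for i in left]) + sse([y[i] for i in right])
            if best is None or score < best[0]:
                best = (score, j, threshold, left, right)
    if best is None:
        return mean(y)
    _, j, threshold, left, right = best

    def grow(idx):
        return build_tree([X[i] for i in idx], [y[i] for i in idx],
                          n_try, rng, min_leaf, depth + 1, max_depth)

    return (j, threshold, grow(left), grow(right))


def predict_tree(tree, row):
    """Walk from the root to a leaf and return the leaf's mean response."""
    while isinstance(tree, tuple):
        j, threshold, left, right = tree
        tree = left if row[j] <= threshold else right
    return tree


def fit_forest(X, y, n_trees=50, n_try=1, seed=0):
    """Fit `n_trees` trees, each on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(y)) for _ in range(len(y))]
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx], n_try, rng))
    return forest


def predict_forest(forest, row):
    """Average the predictions of all trees (the bagging step)."""
    return mean([predict_tree(tree, row) for tree in forest])


# Synthetic, hypothetical data: temperature rises with impervious cover
# and falls with tree canopy cover (both in percent).
levels = [0, 20, 40, 60, 80, 100]
X = [[tcc, imp] for tcc in levels for imp in levels]
y = [20.0 + 0.03 * imp - 0.02 * tcc for tcc, imp in X]
forest = fit_forest(X, y)
paved_site = predict_forest(forest, [0, 100])
wooded_site = predict_forest(forest, [100, 0])
```

Because every leaf predicts an average of observed responses, forest predictions cannot fall outside the range of the training temperatures, which is one reason the method behaves conservatively when applied to unsampled grid cells.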

Our predictive model was built for a selection of collection dates listed in Table 3. We used 90% of the mobile readings as our training set while retaining the remaining 10% as our validation test set. The variables elevation, TCC, NLCD IS, aspect, and percent slope were used in the model to account for spatial heterogeneity of the microlevel temperature distribution. Another variable included in the model was the temperature predicted at 30 × 30 m detail from fixed station readings, which we treat as the “basis” temperature at each location and refer to as the predicted basis temperature. The predicted basis temperature is obtained using Kriging [39], where a variogram model is determined based on temperature readings and spatial locations of the training set and those of the fixed stations.

The last variable included in our model is called predicted lag-1 average temperature (hereinafter referred to as lag-1), which is based on average temperature readings at fixed stations and mobile unit readings of the previous day. The predicted lag-1 at each spatial location is obtained in a similar fashion as the predicted basis temperature, that is, using Kriging as the interpolation strategy.

We ran our model twice for some time periods—once with only the data collected for that date and a second time including the lag-1. We experimented with including predicted lag-1 because we wanted to evaluate effectiveness of including a measure of heat absorbed by the ground during the previous day to see if it had a residual effect the next day.

Once the model was built for each time period, we evaluated its performance using the validation test set. We then used the model to predict temperature at each location (30 × 30 m resolution) for the entire city for the same time period. At each location, the covariate values (elevation, TCC, NLCD IS, aspect, and percent slope) were obtained from the landscape metrics as noted above, while the predicted basis temperature and predicted lag-1 were obtained using the Kriging method as mentioned above.
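The hold-out evaluation can be sketched as a random 90/10 split followed by two summary statistics. The definition of percent variation explained used here, 100 × (1 − MSE / variance of the observed values), is an assumption about how that figure is computed; the function names and sample values are likewise hypothetical.

```python
import random


def train_validation_split(records, validation_fraction=0.10, seed=0):
    """Shuffle records and hold out a fraction as the validation set."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, round(len(shuffled) * validation_fraction))
    return shuffled[n_val:], shuffled[:n_val]


def mean_square_error(observed, predicted):
    """Average squared difference between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def percent_variation_explained(observed, predicted):
    """100 * (1 - MSE / variance of observed); an assumed definition."""
    mean_obs = sum(observed) / len(observed)
    variance = sum((o - mean_obs) ** 2 for o in observed) / len(observed)
    return 100.0 * (1.0 - mean_square_error(observed, predicted) / variance)


# Hypothetical reading identifiers split 90/10
train_ids, validation_ids = train_validation_split(list(range(100)))

# Hypothetical observed and predicted validation temperatures (deg C)
observed = [10.0, 11.0, 12.0, 13.0]
predicted = [10.0, 11.0, 12.0, 14.0]
mse_value = mean_square_error(observed, predicted)
explained = percent_variation_explained(observed, predicted)
```

Note that a small MSE does not guarantee a high percent of variation explained: when the observed temperatures within an hour span a narrow range, the variance in the denominator is small as well.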

In order to assess which variables contribute most to temperature prediction, we used the importance matrix [40] from Random Forest. For each variable, the importance matrix depicts the percentage increase in prediction mean square error (MSE), based upon random permutations of that variable’s values using out-of-bag cases in Random Forest. The variable that results in largest increase in MSE will be identified as the most important variable for the Random Forest model.
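The permutation idea behind this importance measure can be sketched generically: shuffle one predictor, re-score the model, and record the percent increase in MSE. The version below scores on a supplied evaluation set rather than on out-of-bag cases, which is a simplification, and the stand-in model, function names, and synthetic data are all assumptions.

```python
import random


def mse(predict, X, y):
    """Mean square error of `predict` over rows X with targets y."""
    return sum((predict(row) - target) ** 2 for row, target in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean percent increase in MSE after shuffling each predictor column."""
    rng = random.Random(seed)
    baseline = mse(predict, X, y)
    if baseline == 0:
        raise ValueError("baseline MSE is zero; percent increase is undefined")
    importance = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between predictor j and y
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(100.0 * (mse(predict, shuffled, y) - baseline) / baseline)
        importance.append(sum(increases) / n_repeats)
    return importance


# Synthetic evaluation data: column 0 is impervious cover (percent),
# column 1 is aspect (degrees); only column 0 drives temperature here.
noise = random.Random(1)
X_val = [[float(imp), float(asp)] for imp in range(0, 101, 10) for asp in (0, 90, 180, 270)]
y_val = [20.0 + 0.03 * row[0] + noise.gauss(0.0, 0.2) for row in X_val]


def toy_model(row):
    """Hypothetical stand-in for a fitted model; it ignores aspect entirely."""
    return 20.0 + 0.03 * row[0]


importance = permutation_importance(toy_model, X_val, y_val)
```

In this toy case, shuffling aspect leaves the predictions unchanged, so its importance is zero, while shuffling impervious cover raises the MSE sharply; ranking predictors by this increase yields an ordering of the kind reported for each collection period.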

5. Results and Discussion

In 2013, we drove within 300 meters of a fixed weather station 94 times and twice within 500 meters, when the fixed stations were active. For those 96 times, we found a 99% correlation between the fixed weather stations’ temperatures and those collected by the mobile mesonet units (Figure 11). (The two temperature readings that fall outside of the 99% ellipse were readings from stations that were between 300 and 500 meters from our mobile unit.) These results provide confidence in the reliability of the two temperature measurement systems (fixed and mobile) for recording air temperatures (at least for weather conditions selected for our project) and indicate that differences in siting weather stations, motion of the mobile units, and other operational details are not likely creating significant variability within our dataset.

A total of 107,065 mobile mesonet readings were obtained in 2013 and 53,741 readings in 2014. A total of 4,325 fixed station temperatures were collected for 2013 and 2,352 for 2014. We collected 268 daily readings for high, low, and average temperatures from the fixed stations for the day before the mobile collection date.

Here, we are reporting results of our Random Forest predictive model for one collection campaign for each season (Table 4). As a reminder, each predictive model used just one hour of data since we wanted to minimize any temporal effects in temperatures created by normal daily heating. Seasonal campaigns include morning and afternoon comparisons, comparisons between predictive models before and after including the lag-1 variable, four separate early morning campaigns on one date, and three different time periods over the course of one day.

As Table 4 shows, we have very small mean square errors (MSE) for all dates. The model performs worst on 22 April 2013 (using the lag-1 variable), but the MSE is only 0.14 and percent of variation explained is 68.92%. The model performs best on 4 July 2014 (with the lag-1 variable), with an MSE of 0.03 and 85.92% of variation explained. For all dates except 22 April, adding the lag-1 variable increased the percent of variation explained. Table 4 also lists the variables in order of their importance in explaining temperature variation (e.g., in the first row of the table, 22 April 2013 morning, basis temperature (fixed station) had the most influence and tree canopy cover had the least influence).

5.1. Discussion—Spring Campaign

The spatial distribution of our spring collection campaign predictive results is shown in Figure 12 (22 April 2013) and Figure 13 (23 April 2013) for both morning and afternoon campaigns. The morning images are before (left image) and after including the lag-1 variable (middle image) and the afternoon predictions (right image). Patterns within the city are similar in both morning images—the southeastern mountain (Figure 2) is warmer during the mornings, as are densely built-up areas of the city (consistent with the patterns of impervious surfaces in Figure 3). For 22 April 2013, the three most important variables do not change after including the lag-1 variable, and our MSE and percent explanation changed minutely, but the mountain area is not quite as warm as the built-up areas.

The estimated afternoon temperature images (right) show that as the city warmed throughout the day, the patterns changed; warmer areas now coincide with greater built-up areas of the city (again, consistent with patterns of impervious surfaces as presented in Figure 3). The afternoon patterns also demonstrate that the mountain (and area of greatest tree canopy cover) is cooler. Temperatures for the afternoon on 23 April are slightly higher than on 22 April, which highlights the more densely built-up central business district as much warmer than the rest of the city.

5.2. Discussion—Summer Campaign

Morning summer temperature patterns (Figures 14 and 15) are similar to April; the mountain is warmer, as are major roadway areas. 2 July and 3 July 2014 were much warmer than 4 July (by at least 6°C), so major roadways and intersections are more prominent in the 3 July image than the 4 July image. Including the lag-1 variable increased our percent explanation to almost 80% for 3 July and over 85% for 4 July. While there appears to be an anomaly in the southwestern part of the city on the 4 July image, this area is actually a built-up area, including a substantial railroad yard, which retained heat from two prior hot days (2 and 3 July).

5.3. Discussion—Autumn Campaign

Figure 16 shows estimated temperatures (°C) for three different time periods on 25 November 2013: early morning, late morning, and afternoon. 25 November is an autumn day but temperatures on this date were closer to Roanoke’s normal winter-time temperatures. The temperature patterns demonstrate that the southern area, along with those areas of greatest impervious surfaces, is warmer in the morning (left); the patterns change as the day warms (middle); and, by midafternoon, the warmest areas are the most built-up locales: the central business district, major roadways and associated businesses, the airport (northeastern area), and a large mall.

6. Conclusions

Combining data collected from both mobile mesonet units and fixed weather stations with landscape metrics into our model, we were able to estimate air temperatures across the entire city for each period of our data collection. Furthermore, our estimated values demonstrated distinct temperature patterns (warmer versus cooler areas) related to landscape metrics.

Our patterns show warmer morning temperatures in the southeastern area of the city (the mountain area) and along larger roadways. These patterns changed as the city warmed throughout the day or warmed over the course of several days. The patterns also changed when a cold front passed through (on the afternoon of 3 July) when built-up areas retained heat from the previous days. In each of our estimated temperature maps, Mill Mountain is a distinctive feature because of the substantial elevation change, substantial forest cover, and the extensive roadway around the mountain’s base (Figure 2).

Future climate research for Roanoke will examine additional temperature collection for multiple contiguous days over each of the four seasons, including data for both morning and afternoon. Research to track the thermal properties for different surfaces can employ additional data from hand-held infrared thermometers to record both morning and afternoon pavement temperatures. Additional metrics to include in our Random Forest model include physical geometry (building heights and street widths), land cover albedo (especially for roofs and other impervious surfaces), thermal conductivity of different surfaces, shadowing (skyview factor), and sources of anthropogenic heat. In addition, some of our collection dates were timed to coincide with Landsat overpasses, so we plan to investigate relationships between estimates based upon our mesonet data and those derived from analysis of satellite observations.

This research offers a glimpse of how detailed representations of temperatures vary within the urban landscape and how they can inform our understanding of the urban thermal landscape. Although additional work will be required to understand the full merits and shortcomings of the strategies that we have applied here, our results provide estimates that reveal the spatial patterns and their temporal variations. Landscape data are available or can be easily constructed from other data, for most urban areas, although in some regions dates of data layers may not match well (e.g., for our study, dates of the impervious surface and canopy cover differed by 4 years). Larger cities may require larger efforts: more fixed stations, more mobile units, and longer routes to collect the required samples. Although we have not experimented to assess the influence of sample size upon accuracy, it may be feasible to design routes that collect smaller samples that still provide accurate results.


The views expressed in this publication are solely those of Tammy Erlene Parece and EPA does not endorse any products or commercial services mentioned in this publication.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.


This research was financially supported by Virginia Tech’s Graduate Student Assembly’s Graduate Research Development Program Grant (2012), Sidman P. Poole Endowment for Research in Geography at Virginia Tech (2013), Sigma Xi Graduate Award for Doctoral Degree Students (2013-2014), U.S. Geospatial Intelligence Foundation Doctoral Scholarship (2013), Cabell Brand Center First Freedom Scholarship (2013), and Virginia Space Grant Consortium 2014/2015 Graduate Research Fellowship. This publication was developed under STAR Fellowship Assistance Agreement no. FP-91769301-0 awarded by the U.S. Environmental Protection Agency (EPA). It has not been formally reviewed by EPA. The authors thank the following people for their assistance with this research: Dr. Andrew Ellis, Dr. Mike Hyer, Brent Watts, Robin Reed, Sam Freeman, Mario Garza, Chris White, Paul Miller, Hans VanBenschoten, Eric Guenther, Anthony Phillips, Sarah Phillips, Dr. Emily Smith-McKenna, Adam Oliphant, Thomas Tutchings, Ash Elmelick, Erika Cropp, Bonnie Long, Michael Marston, R.J. McNally, Lynn Wormeli, Olivia Jancse, and the employees of the National Weather Service, Blacksburg Office.