Abstract

We present a geographic information system (GIS) framework to classify stream habitats and provide fish distribution predictions comprehensively at the landscape scale. Stream segments were classified into one of eighteen habitat types using three landscape attributes: stream size (three categories), stream quality (three categories), and water quality (two categories). An extensive literature search was undertaken to classify fish species into the same eighteen habitat types based on preferences for the three landscape attributes. We tested our framework at 39 sites throughout the upper Allegheny River basin in western New York. No difference was detected between observed and predicted numbers of fish species among stream habitats. Further, field-collected bankfull width measurements, stream quality ratings, and water quality sampling results were largely consistent with predicted values. The habitat type expected to have the greatest fish species richness was large streams or small rivers with intact stream quality and suitable water quality. Our framework is rapidly applied, comprehensive, inexpensive, and built on widely available data, thereby offering an efficient alternative to traditional field-based efforts for regional habitat classification and fish distribution prediction.

1. Introduction

Declines in biodiversity [1, 2] driven by climate change [3], overexploitation, water pollution, flow modification, habitat degradation, and invasion by exotic species [1] have prompted efforts aimed at development of laws and policies for sound ecosystem planning and management [4, 5]. Conservation of areas high in species richness is often in conflict with resource extraction and land development [6]. It is critical that species-rich areas be protected because such areas tend to support a high number of rare species [7], and proper management of these regions optimizes resources for conservation [8]. In addition, ecosystems high in diversity have been found to promote increased productivity [9], resource utilization [10], and resistance to disturbance [11].

Much effort has thus been dedicated to developing methods to manage regions for conservation of biodiversity [12–14]. It is clear that basic habitat information is needed to make informed conservation decisions; however, comprehensive field sampling over large study areas can be too costly in time and labor [15]. Thus, geographic information system (GIS) models that can synthesize multiple landscape dimensions have become particularly valuable in regions where biological surveys have not been completed or are difficult to perform (e.g., [5, 16]).

GIS models have become quite common for habitat prediction in landscape scale conservation planning [17–20], biodiversity conservation planning [21], and spatial pattern evaluation over large regions [15, 16, 22]. However, advancements in the use of GIS models to remotely predict biotic communities have primarily been confined to terrestrial environments (e.g., [8, 23]). Far less attention has been paid to the development of landscape models to predict aquatic communities [24]. Most existing aquatic habitat classifications are hierarchical (e.g., [25–29]), have extensive data requirements (e.g., [30, 31]), or are based on only a single landscape attribute (e.g., [32–34]). Thus, although many landscape attributes can be combined in a GIS for conservation planning, few complete frameworks are available to classify aquatic habitats and predict fish species presence.

We test the hypothesis that a framework can be built from fundamental principles to classify stream habitats and provide fish distribution predictions comprehensively at the basin scale for regional application. Our framework is rapidly applied, regional in scale, inexpensive, and built on widely available data, thereby offering an efficient alternative to traditional field-based efforts for regional habitat classification and fish distribution prediction.

2. Methods

GIS AML (Arc Macro Language) scripts were used to assess landscape attributes from digital maps and classify stream segments into habitat types. The automated nature of this GIS classification system allowed efficient assessment of stream habitats across large regions. The first step in our habitat classification procedure was to define stream segments as the section of stream or river from tributary confluence to tributary confluence on United States Environmental Protection Agency (US EPA) River Reach File Version 3.0 maps at the 1:100,000 scale. Next, we classified each stream segment as one of eighteen habitat types using three landscape attributes: stream size, stream quality, and water quality (Figure 1). These attributes were chosen for their influence on fish species composition, availability of data, and ability to link to species biology.

Our methods for classifying streams by size were not meant to be an analysis of biological associations with stream size, but a quantification of the judgments used by ichthyologists when they qualitatively describe habitat for fish species (e.g., [35]). Stream segments with drainage areas less than 100 km2 were defined as “small streams” (Table 1). This class of largely wadable waters includes all first and second order streams and the lower 68% (<1 standard deviation above mean drainage area) of third order streams. Small streams were predicted to have channels that average no more than 20 m wide and 1 m deep in moderate flow periods. We judged that streams of this size would be classified by ichthyologists as small streams in species biology descriptions. Stream segments with drainage areas from 100 km2 to less than 3,000 km2 were defined as “large streams or small rivers” or mid-sized flowing waters. These waters are commonly shallow enough for light to reach most of the substrate. Large streams or small rivers were predicted to have channels that average 20–60 m wide with water depths averaging 1–3 m in moderate flow periods. We judged that streams of this nature would be classified by field biologists as large streams, small rivers, or mid-sized flowing waters in species biology descriptions. The final stream size category was “large rivers,” which includes stream segments with drainage areas greater than 3,000 km2 and describes waters that are mostly navigable by motor boats. This size criterion was selected after reviewing drainage areas at USGS gauge sites on streams regarded as large rivers in New York, and the drainage areas for a wide range of stream and river sites described by Barnes [36]. Equations in Dunne and Leopold [37] were used to relate channel size to drainage area.
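The size rule above reduces to two drainage-area cutoffs, which can be sketched as follows. This is a minimal Python illustration rather than the AML scripts used in the study; the function name is hypothetical, and assigning a drainage area of exactly 3,000 km2 to “large rivers” is an assumption, since the text does not state where that boundary value falls.

```python
SMALL_STREAM = "small stream"
MID_SIZED = "large stream or small river"
LARGE_RIVER = "large river"


def classify_stream_size(drainage_area_km2: float) -> str:
    """Assign a stream size category from cumulative drainage area (km2)."""
    if drainage_area_km2 < 0:
        raise ValueError("drainage area cannot be negative")
    if drainage_area_km2 < 100:
        return SMALL_STREAM
    if drainage_area_km2 < 3000:
        return MID_SIZED
    # Assumption: exactly 3,000 km2 is treated as a large river.
    return LARGE_RIVER


# Illustrative inputs only (not study data).
for area in (12.5, 100, 2999, 4870):
    print(area, "->", classify_stream_size(area))
```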

Fish species vary greatly in their need for microhabitat conditions in streams and rivers. The level of habitat specificity is described for most fish species in ichthyological reference books (e.g., [35]) and microhabitat studies. However, it was not possible to determine microhabitats available for all stream segments in a watershed. As a surrogate, we assumed that the natural range of microhabitat diversity would occur in segments that experience natural channel erosion and deposition processes. Where human land uses impinged on the stream, we assumed that channel control measures and modifications were likely and that these hydro-morphological pressures disrupted normal fluvial geomorphology processes. Thus, quality for each stream segment was classified using a GIS script in AML to generate areas and percentages of landcover (EROS Data Center 1991–1993), and total length of roads (New York Department of Environmental Conservation 1993) and railroads (New York Department of Environmental Conservation 1993) in a 30 m buffer [38, 39] surrounding each stream segment. Stream segments were then classified into stream quality categories of “intact,” “modified,” or “highly altered” based on the composition of landcover and total length of roads and railroads in the 30 m buffer. Stream segments with no urban or agricultural lands, no railroad tracks, and no roads were classified as “intact.” Stream segments with 0–5% urban or 0–40% agricultural lands [40, 41] or summed road or railroad lengths less than half of the stream length [42] were classified as “modified.” Stream segments with greater than 5% urban lands, greater than 40% agricultural lands, or summed road or railroad lengths greater than or equal to half of the stream length were classified as “highly altered” (Table 1; [40–43]).
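The three stream quality rules can be read as a short decision sequence: test for “highly altered” first, then “intact,” and treat everything else as “modified.” The sketch below is one possible Python rendering under that reading, not the authors' AML script; the function and argument names are hypothetical, and the inputs are assumed to have already been summarized for the 30 m buffer.

```python
def classify_stream_quality(urban_pct: float, agri_pct: float,
                            road_rail_km: float, stream_km: float) -> str:
    """Classify a segment from landcover percentages and summed road plus
    railroad length inside its 30 m buffer."""
    if stream_km <= 0:
        raise ValueError("stream length must be positive")
    # Any single exceedance is enough for the most degraded class.
    if urban_pct > 5 or agri_pct > 40 or road_rail_km >= 0.5 * stream_km:
        return "highly altered"
    # Intact requires no urban or agricultural land and no roads or railroads.
    if urban_pct == 0 and agri_pct == 0 and road_rail_km == 0:
        return "intact"
    return "modified"


# Illustrative inputs only (not study data).
print(classify_stream_quality(0, 0, 0.0, 2.0))
print(classify_stream_quality(3, 20, 0.4, 2.0))
print(classify_stream_quality(1, 10, 1.0, 2.0))
```

Checking the “highly altered” conditions first means a segment with, say, no agriculture but long road frontage is not mistakenly passed to a less degraded class.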

Nonpoint source (NPS) pollution, caused by agricultural and urban land use, is the primary source of stream impairment in the United States, and elevated sedimentation is the principal pollutant causing stream degradation [44]. Each of these effects, in turn, can threaten fish populations in aquatic systems [45–47]. In our model, water quality was classified using an adaptation of a GIS nonpoint source runoff model originally developed by Adamus and Bergman [48]. We used inputs of landcover (EROS Data Center 1991–1993), soils (STATSGO 1994), average annual rainfall (Northeast Regional Climate Center 1961–1990), runoff coefficients, and pollutant concentrations to determine annual pollutant loading of total phosphorus (TP), total nitrogen (TN), and suspended sediment (SS) to each stream segment from its drainage basin. We adapted the model [49] to compare predicted concentrations to the allowable US EPA pollutant criteria thresholds for the study area (ecoregion 11) for TP and TN, which are 0.01 and 0.31 mg/L, respectively [50], and the strictest SS 30-day average in warm water streams (90 mg/L; [51]). A stream segment was classified as having acceptable water quality for each pollutant if its estimate was below the pollution criteria threshold; otherwise, the stream segment was considered substandard. If all three pollutants were considered within acceptable levels for a single stream segment, the reach was classified as having “suitable water quality.” Otherwise, the stream segment was classified as having “degraded water quality” (Table 1).
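The final classification step, comparing each predicted concentration with its threshold, can be sketched as below. Only the threshold comparison is shown; the runoff loading model itself is not reproduced. The function name and dictionary layout are hypothetical, and the thresholds are the values quoted above.

```python
# Criteria quoted in the text, in mg/L.
THRESHOLDS_MG_L = {"TP": 0.01, "TN": 0.31, "SS": 90.0}


def classify_water_quality(predicted_mg_l: dict) -> tuple:
    """Return the water quality class and the list of substandard pollutants.

    A pollutant is acceptable only if its predicted concentration is below
    the threshold; a segment is suitable only if all three are acceptable.
    """
    substandard = [name for name, limit in THRESHOLDS_MG_L.items()
                   if predicted_mg_l[name] >= limit]
    label = "degraded water quality" if substandard else "suitable water quality"
    return label, substandard


# Illustrative inputs only (not study data).
print(classify_water_quality({"TP": 0.005, "TN": 0.20, "SS": 30.0}))
print(classify_water_quality({"TP": 0.020, "TN": 0.20, "SS": 30.0}))
```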

The eighteen habitat types were determined based on combinations of stream size (three categories), stream quality (three categories), and water quality (two categories). Fish species were then classified into one of these eighteen habitat types based on their associations with the same three landscape attributes (Figure 1). An extensive literature search was undertaken to classify each of the 114 fish species in the upper Allegheny River basin into the eighteen habitat types (Table 2). Classifications were achieved by researching preferences for stream size and tolerances for stream quality degradation and water quality degradation for each species [35, 51–56]. A single fish species can be classified into several habitat types. Once species were classified, totals for each habitat type were tallied and the number of fish species was predicted for each stream segment in the study area (Figure 2).
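The combination and tallying steps can be illustrated with a short Python sketch. It assumes, as one plausible reading of the text, that each species is described by the stream sizes it uses plus the most degraded stream quality and water quality it tolerates, and that it occupies every habitat type at or better than those tolerances. The species names and traits are hypothetical placeholders, not entries from Table 2.

```python
from collections import Counter
from itertools import product

SIZES = ("small stream", "large stream or small river", "large river")
STREAM_QUALITY = ("intact", "modified", "highly altered")  # best to worst
WATER_QUALITY = ("suitable", "degraded")                    # best to worst

# 3 x 3 x 2 = 18 habitat types.
HABITAT_TYPES = list(product(SIZES, STREAM_QUALITY, WATER_QUALITY))


def habitat_types_for_species(sizes, worst_stream_quality, worst_water_quality):
    """Habitat types a species is assigned to (assumed tolerance model)."""
    sq_ok = STREAM_QUALITY[:STREAM_QUALITY.index(worst_stream_quality) + 1]
    wq_ok = WATER_QUALITY[:WATER_QUALITY.index(worst_water_quality) + 1]
    return set(product(sizes, sq_ok, wq_ok))


def species_per_habitat(species_traits):
    """Tally the predicted number of species for each habitat type."""
    counts = Counter({habitat: 0 for habitat in HABITAT_TYPES})
    for traits in species_traits.values():
        for habitat in habitat_types_for_species(*traits):
            counts[habitat] += 1
    return counts


# Hypothetical species, for illustration only.
example = {
    "species_a": (("small stream",), "intact", "suitable"),
    "species_b": (("small stream", "large stream or small river"),
                  "highly altered", "degraded"),
}
counts = species_per_habitat(example)
print(len(HABITAT_TYPES), counts[("small stream", "intact", "suitable")])
```

The predicted richness for a stream segment is then a lookup of that segment's habitat type in the tally.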

Our model was developed and tested in the upper Allegheny River basin in western New York State, USA. The upper Allegheny River basin comprises approximately 4,870 km2 (488 stream segments) north of the Pennsylvania-New York state line (Figure 3) in Chautauqua, Cattaraugus, and Allegany counties. A variety of land uses are present in the region including agricultural farming (crop and dairy 28%) and residential/urban development (1.5%). Primary and secondary growth forest (67%) and wetlands/lakes (3.5%) comprise the remainder of the land in the region.

A survey of 39 sites in the upper Allegheny River basin was completed between late May and mid-August 1998. Sites were originally chosen using stratified random sampling to maintain an equal number of sites in each habitat type. However, several randomly chosen sites were inaccessible, located in dense wetlands, or dry. Such sites were replaced with suitable locations elsewhere of the same habitat type.

In sites accessible with heavy equipment, a distance ten times the average wetted width was electro-fished once using a Honda EX1000 generator and 15 Amp Coffelt VVP-2C transformer at approximately 300 volts. Fish were identified and enumerated, and a representative proportion was measured before release. Streams measuring in excess of 10 m in width were electro-fished in intervals of ten minutes until no new species of fish were collected over a period of three intervals. Fish collections from streams inaccessible from the road were dropped from the analysis because differences in gear proved too great to allow aggregation with the rest of the data.
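The stopping rule for wide streams (continue ten-minute intervals until three in a row yield no new species) can be expressed as a small routine. This is an illustrative sketch only; the function name and the single-letter species codes are hypothetical, and it assumes the three quiet intervals must be consecutive.

```python
def intervals_until_stop(species_by_interval, quiet_intervals=3):
    """Return (interval at which sampling stops, species seen so far).

    Sampling stops after `quiet_intervals` consecutive intervals with no
    new species. Returns (None, seen) if the rule is never satisfied.
    """
    seen = set()
    quiet = 0
    for number, catch in enumerate(species_by_interval, start=1):
        new_species = set(catch) - seen
        seen |= new_species
        quiet = 0 if new_species else quiet + 1
        if quiet == quiet_intervals:
            return number, seen
    return None, seen


# Hypothetical catches per ten-minute interval.
catches = [{"a", "b"}, {"b", "c"}, {"a"}, {"c"}, {"b"}, {"d"}]
stop_at, species = intervals_until_stop(catches)
print(stop_at, sorted(species))
```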

Bankfull width measurements were taken in riffle, pool, and run sections, where possible, at each of the sites using a measuring tape. To compare with stream size predictions from the model, we converted observed bankfull width measurements to drainage area estimates using relations in Dunne and Leopold [37] for the eastern United States.
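To make the width-to-area conversion concrete, the sketch below assumes the regional relation has the power-law form W = a·A^b (bankfull width W in metres, drainage area A in km2) and inverts it. The form is an assumption for illustration, and the coefficients a and b are deliberately left as caller-supplied arguments: they must be taken from the Dunne and Leopold [37] relation for the eastern United States and are not reproduced here.

```python
import math


def drainage_area_from_width(width_m: float, a: float, b: float) -> float:
    """Invert an assumed power law W = a * A**b to estimate drainage area.

    `a` and `b` are regional coefficients supplied by the caller; no
    published values are hard-coded in this sketch.
    """
    if width_m <= 0 or a <= 0 or b <= 0:
        raise ValueError("width and coefficients must be positive")
    return (width_m / a) ** (1.0 / b)


# Placeholder coefficients chosen only to make the arithmetic easy to check.
area = drainage_area_from_width(20.0, a=2.0, b=0.5)
print(area)
```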

Physical quality of the stream channel was assessed using a rapid bioassessment protocol [57] adapted for the upper Allegheny River basin. Questions characterizing stream quality addressed existence of retention devices, channel structure, channel sediments, stream-bank structure, bank undercutting, stony substrate, stream bottom, and riffle/pool spacing. Each indicator was given a numerical score and scores were summed to provide an overall rating of stream quality for each site. Numerical ratings were then converted to the categories “intact,” “modified,” and “highly altered” using cutoffs provided by Petersen [57] to match the categories used by our model. All questions were answered by the same observer throughout the study to maintain uniformity of responses.

TP, TN, and SS measurements were taken at the downstream end of each site in riffle, pool, and run areas, where applicable, from 27 to 30 July 1998, except for one site which was dry. This time period was chosen to take advantage of conditions when the nitrogen content was at its lowest point and water was the clearest. Three water samples of 250 mL were obtained for suspended sediment measurements at each of the sites and stored in a cooler with ice. After the field day was completed, samples were pumped through preweighed filters (cellulose nitrate filter membranes; 45 microns), dried in an oven at 103–105°C for one hour and then in a desiccator for 24 hours, after which the filters were weighed again. Three additional water samples of 100 mL were taken from each of the sites for total dissolved nitrogen and total dissolved phosphorus measurements. These were stored in a freezer until processing could be completed at a lab.
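The gravimetric suspended sediment calculation implied by this procedure is the gain in filter mass divided by the volume filtered. The sketch below shows that arithmetic for the three 250 mL replicates and averages them; averaging the replicates is an assumption, the function names are hypothetical, and the filter masses are invented for illustration.

```python
import math


def suspended_sediment_mg_l(filter_before_mg: float, filter_after_mg: float,
                            volume_ml: float = 250.0) -> float:
    """Suspended sediment concentration (mg/L) from dried filter masses."""
    if volume_ml <= 0:
        raise ValueError("sample volume must be positive")
    return (filter_after_mg - filter_before_mg) / (volume_ml / 1000.0)


def site_mean_ss(replicates) -> float:
    """Mean concentration over (before, after) filter mass pairs in mg."""
    values = [suspended_sediment_mg_l(before, after) for before, after in replicates]
    return sum(values) / len(values)


# Invented filter masses (mg) for three replicates at one site.
example_site = [(100.0, 105.0), (101.0, 105.5), (99.0, 104.5)]
print(site_mean_ss(example_site))
```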

Observed field data were tested against model predictions for number of fish species and all landscape attributes. Observed and predicted numbers of fish species were assessed using correlation and the paired t-test. Drainage areas, predicted based on spatial data and calculated based on measured bankfull widths, were compared using correlation. A sign test was used to compute the probability of obtaining stream quality and water quality results by chance, and these results were used to judge confidence in our findings. Data were analyzed using Minitab statistical software. Statistical significance was tested at .
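The sign test on classification matches amounts to a binomial tail probability. The sketch below assumes a one-sided test in which each comparison has a 0.5 chance of matching under the null hypothesis; this is an interpretation rather than a documented Minitab setting, but under it, 26 matches out of 39 gives a probability of about 0.027, in line with the value reported for TP in the Results.

```python
from math import comb


def sign_test_p(matches: int, n: int) -> float:
    """One-sided probability of at least `matches` successes in `n` trials
    when each trial succeeds with probability 0.5."""
    if not 0 <= matches <= n:
        raise ValueError("matches must lie between 0 and n")
    return sum(comb(n, k) for k in range(matches, n + 1)) / 2 ** n


for matched in (26, 30, 39):
    print(matched, sign_test_p(matched, 39))
```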

3. Results

No difference was detected between observed and predicted number of fish species in a paired t-test (t = −1.085, ), and the two ranked datasets were weakly but significantly correlated ( , ). The habitat type expected to have the greatest fish species richness was large streams or small rivers with intact stream quality and water quality suitable for life support. These rare (<1%) stream segments averaged 0.6 km long, considerably shorter than stream segments experiencing some form of degradation (2.7 km).

Based on the number of fish species predicted in each habitat type (Figure 2), we would expect the most significant decline in number of fish species to follow degradation in water and physical stream quality. On average, 56% of species were expected to disappear with degradations in water quality whereas an average loss of 37% of the species was related to radical changes in stream quality from intact to highly altered channels. No loss in species was expected with moderate physical stream degradation (intact to modified) once water quality has been degraded. An increase in species numbers by 28% was expected from small streams to large streams or small rivers, whereas further increases in stream size to large rivers resulted in a predicted reduction of 40% of the species.

Observed drainage areas, converted from field-collected bankfull width measurements using Dunne and Leopold [37], were compared to predicted drainage areas from the model and were found to be strongly correlated ( , ). We also evaluated observed drainage areas to determine if the classification criterion used to differentiate small streams from large streams and small rivers was appropriately placed. The average cumulative drainage areas for observed small streams (47 km2) and for large streams and small rivers (440 km2) were both well within the appropriate ranges (0–100 km2 and 100–3,000 km2, resp.) for their categories.

Approximately 40% of the stream segments in the study area were predicted to have highly altered stream quality with most of the remaining stream segments classified as modified (57%). Intact stream segments were predicted to be present primarily in small streams (70%) and large streams and small rivers (26%). Modified and highly altered stream segments were widely dispersed throughout the watershed. Predicted stream quality classifications matched observed for all but nine stream segments (77%). The probability of obtaining 30 matching classifications out of 39 comparisons by chance was <0.001, indicating that the high rate of matches is statistically significant.

Approximately 85% of the stream segments in the study area were classified as having degraded water quality. Most stream segments with degraded water quality were located in the western side of the watershed where agricultural and urban land uses were concentrated and the far eastern side of the watershed. The high-quality stream segments were largely located south of the Allegheny River in an area protected by the New York State Park system. Thus, water quality degradation appeared more clustered, regional, and prevalent than stream quality degradation.

Predicted TN classifications matched observed for all but five stream segments (90%). Predicted and observed TP classifications matched for 26 of the stream segments (67%), and SS classifications matched for all stream segments (100%). The probability of obtaining 35 and 39 matching classifications out of 39 comparisons by chance was <0.001, indicating that the high rates of TN and SS matches were statistically significant. The probability of obtaining 26 matching classifications out of 39 stream segment comparisons by chance was 0.027, indicating that the rate of TP matches was lower but still significant.

4. Discussion

This study proposed a framework that used standard GIS methods and data to classify fish habitats at the river basin scale. The framework is composed of the landscape attributes stream size, stream quality, and water quality and was tested with a survey of fishes, stream size quantification, stream quality ratings, and water quality sampling. Ranked predicted and observed fish species by habitat were correlated but there was considerable variability. Large streams and small rivers with intact stream quality and good water quality were predicted to have the greatest number of fish species. We found no significant differences between predicted and observed number of fish species for this class. While this test was one point in time, results suggest that fish species were associated with the correct habitat class.

Sources of variability in our observed fish species data stem from misidentification of uncommon fish, gear and field technician inefficiency in fish capture, and escaped fish. Sources of variability in our predicted fish species data stem from inexact model parameterization or model structure. In six sites, observed and predicted fish species values were equal, and in seventeen sites predicted values were greater than observed values. Many of the sources of variability in our observed data would result in lower species richness estimates (i.e., misidentification of uncommon fish and inefficiency of fish capture). Thus, with improvements in observed data sampling techniques we would expect our observed values to increase and our observed and predicted fish species correlation to become stronger.

Prediction accuracy from our model was promising across all three landscape attributes. We found a strong, significant correlation between predicted and observed drainage areas, showing that predicted cumulative drainage area characterized bankfull width well. This lends further support to the view that drainage area at any point is closely correlated with many size characteristics and can be used as a general measure of stream size. Examination of the cumulative drainage area criterion used to differentiate small streams from large streams and small rivers indicated appropriate placement for the upper Allegheny River basin.

Predicted stream quality classifications matched field observations significantly more often than chance alone would indicate. Scarcity of intact stream channels was likely a consequence of agricultural activity, presence of primary and secondary roads, or human-related alteration to the stream channel or banks. This scarcity, combined with a shortage of large streams and small rivers (10% of the stream segments) and high water quality streams (15%), caused few stream segments to have the necessary qualifications for greatest predicted number of fish species.

Our adapted nonpoint source pollution load screening model was designed to predict if TP, TN, and SS concentrations were greater than US EPA criteria for acceptable water quality. The coarse, simple GIS approach of the model was intended for annual prediction of parameters and could not accurately reflect subtle changes in pollutant concentrations. Field samples, collected over one brief time interval at baseflow conditions, were not a thorough test of annual water quality averages. Additional error may have been caused by misclassification errors in creation of the digital data and inappropriate runoff coefficients and pollutant concentration values for western New York State. Despite all these sources of possible error, the model matched observed water quality classifications significantly more often than expected by chance for all three parameters. We reported a much more thorough test of the water quality model and details about GIS operations in Meixler and Bain [49].

Our findings indicate that our GIS framework can provide coarse, landscape scale fish habitat classifications that can be related to expected fish distributions. Effective conservation of biodiversity in aquatic communities requires the identification and protection of key locations within river basins and regionally. Our fish habitat GIS framework meets this need by providing rapid, comprehensive, inexpensive, landscape scale habitat patterns from widely available data given reasonable time and resources. Further, GIS modeling can be readily updated to reflect changes in land use patterns [15]. Note that use of our model in other ecoregions may require adjustment and verification of model parameters. Thus, some field verification may be required for early adopters of our model. However, the GIS framework presented here has considerable potential to classify fish habitats and predict fish distributions at the landscape scale to better inform management decisions.

Acknowledgments

This paper was supported by the United States Geological Survey under cooperative agreements no. 1434-HQ-97-RU-01553 and RWO no. 40 and benefited from research performed in the study area under a grant from the Nature Conservancy. Special thanks go to Greg Galbreath for compiling the fish data in Table 2. The authors also wish to thank Jordan Gass and Andrew Koo for fieldwork assistance and Magdeline Laba and Steve Smith for GIS aid in support of this project.