Research Article | Open Access
The Impact of Urban Transit Systems on Property Values: A Model and Some Evidences from the City of Naples
A hedonic model for estimating the effects of transit systems on real estate values is specified and calibrated for the city of Naples. The model is used to estimate the external benefits concerning property values which may be attributed to the Naples metro at the present time and in two future scenarios. The results show that only high-frequency metro lines have appreciable effects on real estate values, while low-frequency metro lines and bus lines produce no significant impacts. Our results show that the impacts on real estate values of the metro system in Naples are significant, with corresponding external benefits estimated at about 7.2 billion euros or about 8.5% of the total value of real estate assets.
Urban transit systems play a fundamental role in the social and economic development of large urban areas and significantly affect the quality of life in such areas. Mobility and accessibility are two important factors that influence everyday life, social inclusion, and the competitiveness of firms and commercial operators. The quality of urban transit systems also affects real estate values: the higher the quality and quantity of transit system services in an urban area, the higher the active and passive accessibility of the area, and the higher the average real estate values.
When a public authority invests in transportation infrastructures and/or in transit services, it generates some external (or social) costs and benefits; a cost is considered “external” if it is produced by subject A and borne by subject B, without any compensation between the subjects [1–3]. Usually, the external costs of transportation are associated with air pollution, greenhouse gas emissions, noise, accidents, and congestion. While there is extensive literature on such external costs [4–8], which are nowadays taken into account in some transportation planning processes, less attention is devoted to external benefits, which are hardly ever explicitly considered in economic and social analyses of transport investments. Besides the high direct benefits due to improved accessibility, reduced generalised trip costs, and the lower environmental impacts produced by less use of private cars, investments in transit systems, especially in railways and metros, may generate an appreciable increase in property values in the zones served; this benefit should be explicitly considered in cost-benefit analyses. Note that even if the increases in property values mainly concern users who live near rail stations, they have to be considered external benefits since they are enjoyed by all property owners, regardless of whether they use the transit system or live near the stations. In terms of social equity, especially if there are major differences between the transit supply in different urban areas, an evident disequilibrium is created; indeed, some inhabitants have access to a high level-of-service transit system and, if they are property owners, also enjoy an increase in the value of their properties. On the other hand, people living in areas not served by good quality transit systems suffer a double disadvantage: in transport provision and in property values.
Since transit systems are heavily subsidised with public money and, hence, by society as a whole, a more equitable transit supply should be an important objective of transport policy. This imbalance can in part be compensated by introducing local taxes on property. Another aspect of the problem is related to optimal taxation in the presence of (positive or negative) externalities; this topic is outside the aims of this paper, but the reader may refer to Cremer et al., Auerbach and Hines, and Christiansen and Smith.
In order to consider explicitly in transportation planning the impacts of transit system investments on real estate values as external benefits, we need to estimate the hedonic price of the presence of transit systems. According to hedonic theory, it is assumed that the price of a complex good, such as a house, can be expressed as a function of its extrinsic and intrinsic attributes: the coefficient of each attribute represents its implicit (hedonic) price. Calibration of hedonic pricing models, taking the availability of transit systems into explicit consideration, allows us to estimate the hedonic price of the presence of such systems in an area or in proximity to an area.
In the literature, much evidence can be found of the impact of transit system availability on real estate values. Obviously, most papers refer to specific case studies since it is very difficult to generalise models and analyses. Almost all research findings have underlined a positive correlation between the availability of transit systems (nearby) and property values, especially with respect to high-quality systems, whether railways [14–26] or bus rapid transit lines [19, 27–30]. Fewer have also studied the negative effects of transit system proximity [31, 32], due to externalities (mainly noise). Brandt and Maenning and Szczepańska et al., instead, studied the influence of road noise on real estate values. Interesting meta-analyses were proposed by Mohammad et al., examining 102 estimates from 23 studies for the impact of rail projects on land and property values, and by Debrezion et al.
From the modelling point of view, almost all papers propose hedonic price models (HPMs) for estimating property values and their correlation with transit system availability. Several functional forms have been proposed for these models: multiple linear regression (MLR) models [13, 14, 30, 31, 36, 37], also with a semilogarithmic form [16–18, 22, 23, 33, 38]; a multilevel model by Cervero and Kang; a quantile regression model by Bohman and Nilsson; a Box-Cox linear transformation regression by Chen and Haynes; a cross-sectional model by Sun et al.; spatial econometric models by Ibeas et al., Efthymiou and Antoniou, Chen and Haynes, and Mulley et al.; a DID (Difference-in-Differences) estimator by Dubé et al.; random utility models by Jun; and a spatial Durbin model by Zhong and Li.
This paper proposes a (nonspatial) hedonic model for evaluating the impacts of urban transit systems on property values. In the literature, numerous spatial hedonic models can be found [39–43] that may be able to improve the results, but their application requires very small zones or, at the limit, single buildings; as described below, the available zoning is not suitable for this approach. Other approaches proposed in the literature require more detailed data that are not available for the case study at hand.
The main objectives of this paper are (i) to estimate the impacts of transit systems on real estate values in the city areas; (ii) to evaluate the corresponding external benefits; (iii) to evaluate the effects of some interventions on the Naples metro system on real estate values and on the corresponding external benefits for the areas concerned.
The remainder of the paper is organised as follows: in Section 2 the case study and the data used are described; in Section 3 the data are analysed and the hedonic model is formulated and calibrated; the impacts on real estate values and on the corresponding external benefits are estimated in Section 4; some analyses of two future scenarios are reported in Section 5; Section 6 concludes the paper.
2. Case Study and Data
The main objective of this paper is to estimate the impact of transit systems on real estate values in the city of Naples (Italy) and to evaluate the corresponding external benefits. The proposed approach can be applied, with the necessary modifications, to other large towns and cities. Naples is the largest city in southern Italy with about one million inhabitants (962,003 at the last census). It is the capital of the region of Campania and is close to some of the world’s best known tourist destinations. Naples and its metro system were recently studied by Pagliara and Papa, who evaluated the impact of metro investments on property values and residents’ location, analysing the changes produced by the opening of new stations with a pre- and postanalysis in the zones affected. In this paper, instead, we propose a model applicable to the whole city for estimating the impact of transit systems on property values and we explicitly estimate the external benefits produced by the presence of a metro system.
For the aims of this paper, three kinds of data were required:
(i) census data on population, density, quality of buildings, number and area (in m²) of buildings, and retail businesses;
(ii) real estate values;
(iii) transit system data.
All census data were obtained from the Italian National Institute of Statistics (ISTAT) and refer to the last national survey. The available data are aggregated into 4,307 census subzones (see Figure 1). The data that we used in our model were resident population, density, number of residential buildings classified by state of conservation (very good, good, medium, and poor), total area of residential buildings, and retail employees (as a proxy of retail businesses in the city zones).
Real estate values were obtained from the Real Estate Market Observatory (OMI), which is a database provided by the Italian Revenue Agency about real estate values in all Italian cities, subdivided by homogeneous zones. In this database, Naples is partitioned into 65 zones, as reported in Figure 2; the dimensions of the zones are not compatible with the use of spatial hedonic models. Of the data provided for each OMI zone, our study uses the following: (a) minimum and maximum real estate values for residential buildings in a normal state of conservation; (b) kind of zone (central, semicentral, peripheral, suburban, and rural). The median real estate values for OMI zones are summarised in Figure 3; the classes of zones are reported in Figure 4. From the database of OMI zones, we eliminated five zones (R1, R2, D28, D31, and E41) for which no real estate values were reported, since real estate sale contracts were either absent or too few to be statistically significant. Therefore we refer to a database of 60 OMI zones.
Data on the transit supply system were collected directly from the operators and involved the following services:
(i) 4 funiculars (Centrale, Chiaia, Montesanto, and Mergellina);
(ii) 7 metro lines (Line 1, Line 2, Cumana, Circumflegrea, and 3 Circumvesuviana lines);
(iii) 76 bus lines.
Table 1 lists the railway lines with the corresponding average daily frequencies. Table 2 summarises the average daily frequencies of bus lines. It can be noted that average bus line frequencies are very low for urban services, except for a few lines; this is due to the severe financial crisis suffered in the last decade by the municipality that owns the urban bus firm and by the region that finances the largest part of transit services. The metro and bus lines are illustrated in Figures 5 and 6, respectively. Figure 5 also shows the metro lines currently under construction (black lines).
This case study is very appropriate for the proposed analysis since the Naples transit system covers the various areas of the city with very different levels of service: (a) some zones are served by medium/high-frequency metro lines or funiculars, besides some bus services; (b) other zones are served by low-frequency metro lines, besides some bus services; (c) some zones are served only by low-frequency bus lines; (d) some zones are only partially served by low-frequency bus lines; (e) outlying zones are only marginally served by the transit system. These differences may significantly affect the real estate values of the various zones; indeed, in cities where the transit service is more equally distributed among zones the impact on property values may be very low or very difficult to quantify.
Once all transportation system data have been gathered, we can construct a database that associates the corresponding data to each OMI zone, which we adopt as zoning since they correspond to real estate values. As for census data, all the data regarding the particular census subzones are associated with each OMI zone. Transportation data are associated with the OMI zones in the following ways: a metro station is associated with a zone if it is contained in the zone or if the zone boundary is no farther than 250 m from the station; a bus line is associated with a zone if it crosses the zone or runs along a road forming the zone boundary. In Table 3, all data associated with each OMI zone that are used to draw up the hedonic model are described, some of which are obtained with simple elaboration from starting data.
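The station-to-zone rule described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure actually used to build Table 3: zones are taken to be simple polygons in a planar coordinate system measured in metres, and the function names (`point_in_polygon`, `distance_to_segment`, `station_serves_zone`) and the square toy zone are hypothetical.

```python
import math

def point_in_polygon(pt, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices in metres."""
    x, y = pt
    inside = False
    n = len(polygon)
    for k in range(n):
        x1, y1 = polygon[k]
        x2, y2 = polygon[(k + 1) % n]
        # Does a horizontal ray from pt cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def distance_to_segment(pt, a, b):
    """Euclidean distance from pt to the segment a-b."""
    px, py = pt
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0:
        return math.hypot(px - ax, py - ay)
    # Projection parameter clamped to the segment
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len2))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def station_serves_zone(station, polygon, buffer_m=250.0):
    """True if the station lies in the zone or within buffer_m of its boundary."""
    if point_in_polygon(station, polygon):
        return True
    n = len(polygon)
    nearest = min(
        distance_to_segment(station, polygon[k], polygon[(k + 1) % n])
        for k in range(n)
    )
    return nearest <= buffer_m

# Hypothetical 1 km x 1 km square zone (coordinates in metres)
toy_zone = [(0, 0), (1000, 0), (1000, 1000), (0, 1000)]
```

With this toy zone, a station at (1200, 500) lies 200 m outside the boundary and is still associated with the zone, while one at (1300, 500) is not.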
Moreover, the zones are also classified according to three variables referring to some important features of the area: Sea, if the zone is on the seafront; Hill, if the zone belongs to one of the hill districts of the city (Vomero, Posillipo, Camaldoli, and Colli Aminei); Prestige, if the zone is commonly considered prestigious (Posillipo, Chiaia). Figures 7–9 report the classification of OMI zones regarding these variables.
3. Data Analysis and Model Calibration
Before specifying and calibrating the model, we examined the dataset in order to explore the correlation between the real estate values and the variables associated with the OMI zones. First of all, we calculated the correlation coefficient, $\rho_j$, between the real estate values, $REV_i$, and each variable, $x_{ij}$, where $i$ indicates the OMI zones and $j$ the independent variable. The correlation coefficient can be positive, between 0 and 1, or negative, between $-1$ and 0, and the closer its value is to 1 [$-1$], the higher the positive [negative] correlation between the variables. In Table 4, the values of the correlation coefficient for each independent variable are reported. The variables Pop, Surface, BD_1, BD_2, BD_3, BD_4, and Retail_emp were not considered since they were used only for calculating other variables; the variable rural was not considered since in these areas there are no real estate values available.
Upon analysing the results, it can be noted that density is not at all correlated with real estate values (the value is very close to 0); this is probably due to the almost equal (high) density in all zones. As regards the other variables, only some presented a significant correlation (a coefficient above 0.5 or below −0.5); these variables are HF_rail_st, Central, Suburban, Hill, and Prestige. It is worth noting that low-frequency railway stations are not correlated with real estate values: the coefficient is very low in absolute value, and even negative. We may assume that all variables with an absolute value of the correlation coefficient lower than 0.25 are not correlated with property values and, hence, are not useful as explanatory variables. This first analysis highlights that the HF_rail_st variable is highly correlated with real estate values (it presents the maximum absolute value of the correlation coefficient). The correlation between possible explanatory variables is examined by reporting the correlation coefficient matrix in Table 5. Examining the matrix, there is, as expected, a strong (negative) correlation between variables %BD_3 and %BD_1_2, since one is nearly complementary to the other; in the specification of the model, these two variables will not be considered jointly. The other highly correlated variables are HF_rail_st versus Central (0.69), since in central areas there are more metro stations, and Suburban versus CF_bus, since buses cover the other areas almost equally except for the suburban ones, where the transit service is very marginal. However, for these other correlations, the calibration of the model and, in particular, the $t$-stat tests will be able to verify whether the variables are significant even if considered jointly in the model.
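The screening step above (compute the correlation of each candidate variable with the real estate values, then discard those whose absolute coefficient is below 0.25) can be sketched as follows. The function names and the four-zone toy data are illustrative assumptions only; they are not the values of Table 4.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def screen_variables(rev, variables, threshold=0.25):
    """Keep variables with |rho| >= threshold, ranked by |rho| (highest first)."""
    rhos = {name: pearson(values, rev) for name, values in variables.items()}
    kept = [(name, rho) for name, rho in rhos.items() if abs(rho) >= threshold]
    return sorted(kept, key=lambda item: abs(item[1]), reverse=True)

# Toy data for four hypothetical zones (invented numbers, for illustration only)
toy_rev = [1000.0, 2000.0, 3000.0, 4000.0]
toy_variables = {
    "HF_rail_st": [0, 1, 2, 3],   # perfectly correlated with toy_rev
    "Density": [1, -1, -1, 1],    # uncorrelated with toy_rev
    "Suburban": [1, 1, 0, 0],     # negatively correlated with toy_rev
}
ranked = screen_variables(toy_rev, toy_variables)
```

The ranked list produced this way gives the order in which variables would be offered to the model during calibration.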
Starting from this preliminary analysis, we formulated and calibrated a hedonic linear regression model (LRM) for estimating real estate values. The general formulation of the model is the following:
$$E[Y \mid X] = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p,$$
where $E[Y \mid X]$ is the expected value of the dependent variable $Y$, in our case the real estate value (REV) of a zone, conditional upon the terms on the right side of the equation, $X$, which are the independent variables; $\beta_0$ is the intercept, which is a parameter invariant with the values assumed by the independent variables, $X$; $\beta_j$ are the parameters (or coefficients) of the model, which have to be calibrated; $X_j$ are the independent variables.
This model has to be specified (the independent variables to be considered in the model have to be identified) and calibrated (the values of the parameters $\beta$ that are best able to reproduce the observed values of $Y$ have to be estimated).
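We assume that the 60 real estate values, $y_i$, corresponding to the 60 OMI zones for which the data are available, are the observed data, which we collect in a vector $\mathbf{y}$. The values of the independent variables corresponding to each observation, which are usually called predictors, $x_{ij}$, are collected in a matrix $\mathbf{X}$; there are always 60 rows in this matrix, while the number of columns depends on the number of model parameters. We also introduce the vector of the regression coefficients, $\boldsymbol{\beta}$, and the vector of statistical errors, $\boldsymbol{\varepsilon}$. The vectors and the matrix are reported below, where subscript $p$ indicates the number of parameters:
$$\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_{60} \end{bmatrix}, \quad \mathbf{X} = \begin{bmatrix} 1 & x_{1,1} & \cdots & x_{1,p} \\ 1 & x_{2,1} & \cdots & x_{2,p} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{60,1} & \cdots & x_{60,p} \end{bmatrix}, \quad \boldsymbol{\beta} = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_p \end{bmatrix}, \quad \boldsymbol{\varepsilon} = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_{60} \end{bmatrix}.$$
Adopting this matrix notation, the multiple linear regression model can be written as
$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon}.$$
Indicating with $\mathbf{x}_i^{T}$ the $i$th row of the matrix $\mathbf{X}$, we can write
$$y_i = \mathbf{x}_i^{T}\boldsymbol{\beta} + \varepsilon_i$$
and hence
$$\varepsilon_i = y_i - \mathbf{x}_i^{T}\boldsymbol{\beta}.$$
The values of the coefficients can be estimated with the well-known ordinary least squares (OLS) method, which minimises the sum of the squared residuals:
$$\hat{\boldsymbol{\beta}} = \arg\min_{\boldsymbol{\beta}} \sum_{i=1}^{60} \left( y_i - \mathbf{x}_i^{T}\boldsymbol{\beta} \right)^2,$$
where $\mathbf{x}_i^{T}\hat{\boldsymbol{\beta}}$ is usually indicated with $\hat{y}_i$. Indicating with $\bar{y}$ the term $\frac{1}{60}\sum_{i=1}^{60} y_i$, a very important indicator of the goodness of the model is the coefficient of determination, $R^2$, that is given by
$$R^2 = 1 - \frac{\sum_{i=1}^{60} \left( y_i - \hat{y}_i \right)^2}{\sum_{i=1}^{60} \left( y_i - \bar{y} \right)^2}.$$
This indicator measures the proportion of variability in the dependent variable which is explained by the linear regression model; the closer $R^2$ is to 1 (statistical errors equal to 0 and perfect reproducibility of the phenomenon), the higher the goodness of the calibrated model.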
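As a concrete illustration of the OLS estimate and of the coefficient of determination, the following minimal sketch solves the normal equations with plain Gaussian elimination. It is not the software used for the calibration reported in this paper; the helper names and the five-observation toy data (generated from an exact linear rule) are assumptions made for the example.

```python
def solve_linear_system(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - s) / m[r][r]
    return z

def ols_fit(x_rows, y):
    """Fit y = beta_0 + sum_j beta_j x_j by OLS; return (beta, r_squared).

    x_rows holds one row of predictors per observation, without the
    intercept column, which is added here as a leading column of ones.
    """
    design = [[1.0] + list(row) for row in x_rows]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)  # normal equations X'X beta = X'y
    y_hat = [sum(b * v for b, v in zip(beta, r)) for r in design]
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot

# Toy data generated from y = 1000 + 200*x1 - 300*x2 (invented, no noise)
toy_x = [(0, 0), (1, 0), (2, 1), (3, 1), (4, 0)]
toy_y = [1000.0, 1200.0, 1100.0, 1300.0, 1800.0]
toy_beta, toy_r2 = ols_fit(toy_x, toy_y)
```

Because the toy data contain no noise, the fit recovers the generating coefficients and $R^2$ equals 1 up to rounding; with real data the residuals are nonzero and $R^2$ falls below 1.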
The coefficient of determination cannot be the only indicator for evaluating the goodness of a model. Indeed, it never decreases (and usually increases) with the number of parameters $p$, even if some are not actually useful for explaining the phenomenon. The other indicators that have to be used to evaluate the model are the tests of hypotheses that measure whether the parameters adopted in the model are actually significant for reproducing the phenomenon. Below, we use as tests of hypotheses the $F$-test, obtained by the analysis of variance, and the $t$-test, regarding the significance of each independent variable. We will assume that a model is acceptable if the significance $F$ is close to 0 (at least <0.05) and if the $t$-stat of each coefficient is higher [lower] than $t_{95\%}$ [$-t_{95\%}$] for positive [negative] $\beta_j$, where $t_{95\%}$ is the value of the $t$-distribution corresponding to the degrees of freedom (df) of the model with a confidence of 95%.
We calibrated several models by adopting an additive (forward) method for identifying the parameters to consider in the model. Limiting the analysis to variables whose correlation coefficient is greater than 0.25 in absolute value, we started with a model with only one df (one parameter), using the variable with the highest absolute correlation coefficient (see Table 4). We calculated the corresponding values of $R^2$, significance $F$, and $t$-test.
If the model obtained is significant in terms of significance $F$ and $t$-test, we add another variable to the model, the next in terms of correlation coefficient. The newly added variable is kept in the model if all significance tests are passed; otherwise it is eliminated and the procedure continues with another variable. When all variables have been examined, we try to add to the model some variables previously discarded (the significance of a variable can change with different df, since the $t_{95\%}$ value changes). Since $R^2$ always increases (or, at least, does not change) when we add a variable, the final model obtained with this procedure will pass all significance tests and will have the highest value of $R^2$ of all models examined. In Table 6, the results of the procedure are summarised: overall, we tested 12 models.
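The adding procedure just described can be summarised in a short sketch. To keep it self-contained, the model fitting and the significance checks are hidden behind a caller-supplied function, here called `passes_tests`, which is a hypothetical placeholder for "fit the model with these variables and verify the $F$-test and all $t$-tests". The variable names and the toy significance rule are invented and do not reproduce Table 6.

```python
def forward_select(ranked_candidates, passes_tests):
    """Forward (adding) selection of explanatory variables.

    ranked_candidates: variable names sorted by decreasing absolute
        correlation with the dependent variable.
    passes_tests: callable taking a list of variable names and returning
        True if the model fitted with them passes all significance tests.
    """
    selected = []
    discarded = []
    # First pass: offer each variable in order of correlation
    for var in ranked_candidates:
        trial = selected + [var]
        if passes_tests(trial):
            selected = trial
        else:
            discarded.append(var)
    # Second pass: retry variables discarded earlier, since their
    # significance can change once other variables are in the model
    for var in discarded:
        trial = selected + [var]
        if passes_tests(trial):
            selected = trial
    return selected

def toy_passes_tests(variables):
    """Invented rule: 'x2' is never significant; 'x3' only alongside 'x4'."""
    if "x2" in variables:
        return False
    if "x3" in variables and "x4" not in variables:
        return False
    return True

toy_model = forward_select(["x1", "x2", "x3", "x4"], toy_passes_tests)
```

In this toy run, `x3` is rejected in the first pass and accepted in the second, which mirrors the remark that a previously discarded variable may become significant once the model contains other variables.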
The results reported in Table 6 identify number 11 as the best model, which is as follows:
Table 7 summarises all statistical tests and the analysis of variance (ANOVA) for the calibrated model. The residuals are reported in Table 8, while in Figure 10 the comparison is shown between the real estate values (REVs) obtained from the OMI database and the corresponding values estimated with the proposed model.
The analysis of the results shows that the model presents a good coefficient of determination (over 0.88) which indicates that over 88% of the variability in REVs is explained by the considered variables.
An important result to underline is that low-frequency rail stations and (low-frequency) bus lines have no significant influence on REVs. By contrast, the high-frequency rail stations have a significant and nonnegligible impact on REVs, as will be seen in the next section. The lack of significance of low-frequency services (bus and metro) is probably due to the fact that these services cover more or less all areas of the city and, hence, their presence has no practical impact on property values. With the calibrated model, the lower REVs of the few unserved areas are explained by the variable Suburban, which in practice concerns only zones where the transit service is very scarce or absent.
4. Impact of the Naples Metro System on Real Estate Values and Estimation of External Benefits
With the model calibrated in the previous section it is possible to estimate the impact of the Naples metro system on REVs and the corresponding external benefits. The predicted contribution of high-frequency rail stations to REVs is reported in Table 9 for each OMI zone; the same table also reports the percentages of this contribution. The percentage contributions for areas containing at least one high-frequency rail station are summarised in Figure 11, while Figure 12 reports the absolute values of the contributions. It can be noted that the contribution to REVs is significant for many zones: for two zones it is higher than 20%, for nine other zones it is higher than 10%, and for nine others it is higher than 5%.
For estimating the corresponding external benefits, some hypotheses have to be assumed. From the ISTAT census data on total areas of residential buildings and on the number of buildings belonging to different classes of “conservation state,” we may calculate the area (m²) of residential buildings attributable to each class. Regarding the OMI REVs, the minimum and maximum values are available; we assume that the ISTAT “very good” classification corresponds to an expected REV equal to the maximum OMI REV, while the “poor” classification corresponds to the minimum OMI REV. The other two classes correspond to a linear interpolation between the two extreme values, assuming uniform distances between classes. Estimation of external benefits is obtained by applying the percentages estimated in Table 9 to the current REVs. In Table 10, the results of this analysis are reported for all OMI zones. Thus the external benefits, in terms of REVs, can be estimated at €7.2 billion (8.53% of the total value of real estate assets), with an average benefit per inhabitant amounting to €7,490 or an average benefit for each residential unit (100 m²) of €23,034. These benefits express the increase in real estate values enjoyed by property owners; such estimates can be useful to policy-makers, to study equalisation actions, and to transportation planners, to evaluate the total benefits of interventions, including the external ones.
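The calculation for a single zone can be sketched as follows. Two points are assumptions of this example rather than statements from the paper: the residential area is split among conservation classes in proportion to the number of buildings in each class, and all names and figures are invented for illustration (they are not taken from Tables 9 or 10).

```python
def class_revs(rev_min, rev_max):
    """REV per conservation class, with evenly spaced linear interpolation."""
    step = (rev_max - rev_min) / 3.0
    return {
        "very_good": rev_max,
        "good": rev_min + 2.0 * step,
        "medium": rev_min + step,
        "poor": rev_min,
    }

def zone_external_benefit(total_area_m2, buildings_by_class,
                          rev_min, rev_max, metro_share):
    """Return (total real estate value, external benefit) for one zone.

    metro_share is the fraction of the REV attributed to high-frequency
    rail stations in the zone (e.g. 0.10 for 10%).
    Assumption: area is allocated to classes proportionally to the
    number of buildings in each class.
    """
    n_buildings = sum(buildings_by_class.values())
    revs = class_revs(rev_min, rev_max)
    total_value = sum(
        total_area_m2 * buildings_by_class[c] / n_buildings * revs[c]
        for c in revs
    )
    return total_value, metro_share * total_value

# Hypothetical zone: 100,000 m2, OMI range 1,000 to 4,000 euro/m2, 10% share
toy_buildings = {"very_good": 10, "good": 40, "medium": 40, "poor": 10}
toy_value, toy_benefit = zone_external_benefit(
    100_000.0, toy_buildings, 1000.0, 4000.0, 0.10
)
```

For this hypothetical zone the interpolated class values are 4,000, 3,000, 2,000, and 1,000 €/m², so the total value is €250 million and the benefit attributed to the metro is €25 million. Summing the second quantity over all zones would give a citywide figure.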