#### Abstract

In contrast to private cars, rail transit systems are a more effective way to address the challenges facing cities with high population densities, such as congestion, air pollution, and traffic emissions. Rail transit systems, however, are costly because of the substantial investment required for construction and maintenance. Candidate rail transit systems must therefore be designed carefully to ensure public transport accessibility and sustainability, with consideration of the space-time correlation of population densities. In this paper, the space-time correlations of population densities are incorporated into the design of a candidate rail transit line over years. A closed-form mathematical programming model is proposed, with the objective of maximising the social welfare budget. The social welfare budget is defined as the sum of the expected social welfare and the social welfare margin. The decision variables of the model are the rail line length, the rail station number, and the project start time of the candidate rail transit line. Analytical solutions of the proposed rail design model are given explicitly for different scenarios with various constraints.

#### 1. Introduction

##### 1.1. Motivation and Literature Review

In areas with low population density, such as the United States, the preferred travel mode is the private car. In areas with high population density, such as Hong Kong, the preferred travel mode is rail transit. One possible reason is that heavy congestion and long delays may occur on highways during morning peak hours on working days in highly populated areas, so that the generalised travel cost of a private car may be higher than that of rail transit in such areas.

Travel mode choices have commonly been examined under the assumption of given and fixed population densities (see, e.g., [1–3]). This assumption was reasonable in those models because travel mode choices were analysed for short-term operation optimisation. Under this assumption, however, the effects of population densities on travel mode choices were not explicitly investigated. These effects are important for the design optimisation of a candidate rail transit line over years.

In the transportation corridor of a candidate rail transit line, the population density in each residential location varies year by year. If an increase in population density in the first year leads to an increase in the second year, a positive temporal correlation exists between the population densities in the first year and the second year, and vice versa. Similarly, if an increase in population density in one residential location leads to an increase in population density in another residential location, a positive spatial correlation exists between the population densities in these two residential locations.

The space-time correlation of population densities can be considered, with the nested logit model (e.g., [4–6]) and the C-logit model (e.g., [6–9]). In these nested logit models and C-logit models, spatial-temporal correlation of population densities was considered in the residential locations and/or travel choice behaviours of households. In contrast with the nested logit model, the C-logit model had a simple closed-form probability expression and was simpler for calibration [7,8].

In these nested logit and C-logit models, the space-time correlations between alternatives were investigated to calculate the choice probabilities for the residential locations and/or travel choice behaviours of households. In other words, the space-time correlations of population densities cannot be explored explicitly with the nested logit and C-logit models. As a result, the possible effects of the space-time correlation of population densities on the design of a candidate rail transit line cannot be examined explicitly with these models.

The space-time correlations of population densities can be taken into account explicitly through the space-time correlation coefficient of population densities [2, 3, 7, 8, 10–12]. Within this literature, a bilevel optimisation model was proposed to estimate the space-time correlation of origin-destination (OD) demands during the same peak hour periods due to day-to-day fluctuations over the whole year. Liu [2] investigated the effects of spatial and temporal correlation of population densities on system disutility in a railway transportation corridor. The study concluded that the spatial and temporal correlation of population densities had a significant influence on the resulting population densities and on system performance, measured in terms of the system disutility, consumer surplus, and social welfare of the railway system.

Zhang et al. [3] explored the implementation flexibility of multiperiod rail line design with consideration of uncertainties in population distribution. The space-time correlation of population densities was taken into account, but the proposed model was not analytical. Yang et al. [8] proposed an estimation framework based on the generalised method of moments to infer the probability function of origin-destination (OD) demand variables using sets of traffic counts over a network. Sun et al. [7] estimated stochastic OD traffic demand with a bi-objective optimisation model for the traffic count location problem.

It was noted that the space-time correlations of population densities were mainly considered for road traffic origin-destination estimation in these previous studies. In this paper, we will incorporate the space-time correlations of population densities by the space-time correlation coefficient of population densities for the design optimisation of a candidate rail transit line.

Based on the space-time correlation coefficient of population densities, a closed-form programming model is introduced in this paper to examine the effects of the space-time correlation of population densities on the design of a rail transit line. The optimisation objective of the proposed model is social welfare budget maximisation. The social welfare budget is defined as the sum of the expected social welfare and the social welfare margin. The model decision variables include the rail line length, the rail station number, and the project start time.

##### 1.2. Problem Statement and Contributions

As shown in Figure 1, a linear transportation corridor is divided into a number of residential locations from the Central Business District (CBD) to the city boundary, and the planning time horizon is divided into a number of equal time periods; both numbers are positive integers. The lengths of the candidate rail transit line associated with project start times in different years are determined endogenously by the proposed model. Each residential location has a population density in each year. Spatial correlation exists between the population densities of different residential locations in the same year, while temporal correlation exists between the population densities of the same residential location in different years.

Two major extensions to the related literature are made in this paper: (i) the effects of space-time correlation of population densities on the design of a candidate rail transit line over years are investigated by a closed-form mathematical programming model; (ii) the analytical optimal solutions of design variables of the candidate rail transit line over years are obtained with the proposed model.

The remainder of this paper is organised as follows: in the next section, some basic considerations are given. A rail design model is proposed in Section 3, taking account of the space-time correlation of population densities along a linear transportation corridor. Section 4 gives illustrative numerical examples to show the application and contributions of the proposed model. A summary of this paper is given in Section 5.

#### 2. Basic Considerations

##### 2.1. Assumptions

To facilitate the presentation of the essential ideas, some basic assumptions are made, listed as follows:

A1. The candidate rail transit line is assumed to be linear, to start from the CBD, and to be built along a linear transportation corridor [1, 13]. The candidate rail transit line project in each period is assumed to finish on time, and the rail service is expected to be supplied at the end of each design period [14].

A2. The standard deviation (SD) of the population density is assumed to be a nondecreasing function of its mean value [15]. This function is referred to as the stochastic population density function.

A3. Households’ responses to the quality of the rail service provided are measured by a generalised travel cost that is a weighted combination of in-vehicle time, access time, waiting time, and the fare [16]. Households are assumed to be homogeneous and to have the same preferred arrival time at the workplace located in the CBD. This study focuses mainly on households’ home-based work trips, which are compulsory activities. The number of trips is thus not affected by other factors, such as income level [17].

A4. The study period is assumed to be a peak hour, for instance, the morning peak hour, which is usually the most critical period in the day [19].

A5. The rail station number depends on the rail line length and the rail station spacing. To obtain analytical solutions, an even rail station spacing is assumed. In other words, with a constant rail station spacing, once the rail line length is determined, the rail station number is also determined. This assumption is also used in the works of Li et al. [17] and Liu [2].

##### 2.2. Space-Time Correlation of Population Densities

To take into account the space-time correlation of population densities, it is assumed that there exists a perturbation in the population density. Following [12], the yearly perturbed population density at each residential location in each year is expressed in terms of its expected value and a random term. It is noted that the expected population density is a deterministic value. In terms of A2, the SD of the population density is expressed as a function of the expected population density [15]; this function is defined as the stochastic population density function and represents the functional relationship between the mean value and the standard deviation of the stochastic population density. Specifically, the coefficient of variation of the population density, defined as the ratio of the SD to the mean value, is a standardised measure of the dispersion of the probability distribution or frequency distribution of the population density.

To take spatial and temporal correlations between population densities into account, the spatial and temporal covariance between two population densities is defined in terms of their correlation coefficient [10], which is an important measure of the statistical correlation between them. The correlation coefficient may be negative, positive, or zero, representing negative statistical dependence, positive statistical dependence, or statistical independence of population densities, respectively. Specifically, when the two residential locations and the two years coincide, the covariance reduces to the variance of the population density.
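The relationship between the correlation coefficient, the SDs, and the covariance can be illustrated with a minimal sketch. The original symbols are not recoverable here, so the function names, the constant coefficient of variation used as the stochastic population density function, and the numerical values are all assumptions made only for illustration.

```python
def sd_from_mean(mean_density, cv):
    """Hypothetical stochastic population density function: SD = cv * mean.

    A constant coefficient of variation is an assumption; A2 only requires
    the SD to be a nondecreasing function of the mean.
    """
    return cv * mean_density


def covariance(sd_a, sd_b, rho):
    """Covariance implied by a correlation coefficient rho in [-1, 1]."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("correlation coefficient must lie in [-1, 1]")
    return rho * sd_a * sd_b


# Illustrative mean densities for two location-year pairs (made-up numbers).
sd_1 = sd_from_mean(20000.0, 0.1)
sd_2 = sd_from_mean(15000.0, 0.1)

cov_positive = covariance(sd_1, sd_2, 0.8)   # positive dependence
cov_zero = covariance(sd_1, sd_2, 0.0)       # uncorrelated densities
cov_self = covariance(sd_1, sd_1, 1.0)       # same location and year: the variance
```

With identical locations and years the correlation coefficient is 1 and the covariance equals the squared SD, which is the special case noted in the text.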

##### 2.3. Households’ Residential Locations Choice Behaviours

Households are assumed to choose their residential locations to maximise their own utility subject to a budget constraint. Following [17], a Cobb–Douglas utility function is adopted: the daily utility of a household at a given residential location in a given year depends on its daily consumption of nonhousing goods, whose price is normalised to 1, and on its consumption of housing, measured in square metres of floor space. The two exponents of the utility function are positive constants.

The budget constraint for households relates the daily expenditure on nonhousing goods and on housing, charged at the daily housing rent per unit of housing in the residential location and year concerned, to the average daily household income and the daily generalised travel cost from that residential location to the CBD in that year.
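The household problem described above can be sketched numerically. The sketch assumes a Cobb–Douglas utility with exponents `alpha` and `beta` and a budget constraint in which spending on nonhousing goods plus rent times housing equals income net of the generalised travel cost; both the symbols and this exact budget form are assumptions, since the original equations are not recoverable here. Under these assumptions the well-known Cobb–Douglas result applies: the household spends a fixed share `beta / (alpha + beta)` of its disposable income on housing.

```python
def optimal_bundle(alpha, beta, income, travel_cost, rent):
    """Utility-maximising (nonhousing, housing) consumption.

    Assumed problem: maximise z**alpha * g**beta
    subject to z + rent * g = income - travel_cost.
    """
    disposable = income - travel_cost
    if disposable <= 0 or rent <= 0:
        raise ValueError("disposable income and rent must be positive")
    housing_share = beta / (alpha + beta)
    housing = housing_share * disposable / rent
    nonhousing = disposable - rent * housing
    return nonhousing, housing


def utility(alpha, beta, nonhousing, housing):
    """Cobb-Douglas utility of a consumption bundle."""
    return nonhousing ** alpha * housing ** beta


# Illustrative values only (not taken from the paper).
z_opt, g_opt = optimal_bundle(0.7, 0.3, income=1000.0, travel_cost=100.0, rent=50.0)
u_opt = utility(0.7, 0.3, z_opt, g_opt)
```

A higher generalised travel cost lowers disposable income and therefore the housing consumed at a given rent, which is the channel through which the rail service level affects residential location choice in this kind of model.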

Under the user equilibrium condition, no household can increase its utility by unilaterally changing its location choice. Mathematically, the utility maximisation for households can be expressed as

A similar mathematical formulation has been formulated in Li et al. [17]. According to the equilibrium condition proposed in their study, the equilibrium household utility is shown as follows:withwhere is the equilibrium household utility in year and is the housing rent in the CBD in year . in equation (9) is the daily housing rent function per unit of housing in residential location in year , and in equation (10) is the daily consumption function of housing for households in residential location in year . It can be seen that both and are functions of daily generalised travel cost from residential location to the CBD in year .

To keep the balance of the supply and demand of housing, it requires that

Substituting equations (10) in (11), we havewhere represents the consumption of housing in residential location in year , measured in square meters of floor space, and is the expected population density of households in residential location in year at equilibrium.

The population conservation equation requires that the population densities along the rail transportation corridor add up to the total population along the candidate rail transit line in each year. To describe the year-by-year variation of the total population, a yearly growth factor is assumed [14]: the total population in a given year equals the total population in the base year scaled by a multiplier, which is determined by a compound-account factor measuring the growth of the total population relative to the base year. If this factor is positive, the total population along the candidate rail transit line increases, and vice versa.

##### 2.4. Social Welfare Budget

The government or the rail operator will build a rail transit line to meet the increasing travel demand of households and to ease highway traffic congestion. Social welfare is commonly used to assess the performance of a candidate rail transit line. Because of the yearly uncertainty associated with rail travel demand, the social welfare of the candidate rail transit line is not a deterministic value either. An extra safety margin is therefore assigned to ensure a higher probability of gaining a certain level of social welfare. In view of this, the concept of the social welfare budget is proposed. It is expressed in terms of the expected social welfare and the social welfare margin, where the margin is determined by a negative parameter and the standard deviation of social welfare.

relates to the requirement on ensuring a certain social welfare gain. A high value of implies a relatively high and a higher probability of social welfare gain. Formally, can be related mathematically to the probability that there is a gain in the budget social welfare, namely,where is the probability of a gain in the social welfare budget. Rearranging terms in equation (16), then

From equation (16), we can obtain

Let be the standard cumulative distribution function. Equation (18) can be rewritten as follows:

As equation (19) can be transformed asand, with equation (20), can be obtained. Thus, the social welfare budget defined in equation (15) can be rewritten as

The value of the parameter represents the government’s or rail operator’s attitude toward social welfare gain. A larger value implies a larger negative safety margin and a higher probability of a gain in the social welfare budget.
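The link between the margin parameter and the probability of a gain can be sketched as follows. The sketch assumes that social welfare is normally distributed and that the budget is the level exceeded with the target probability; the function names and numbers are hypothetical, and the exact equations (15)–(21) are not recoverable here, so this is an illustration of the idea, not the authors' derivation.

```python
from statistics import NormalDist


def margin_parameter(prob_gain):
    """Parameter rho such that P(SW >= mean + rho * sd) = prob_gain.

    Assumes normally distributed social welfare; rho is negative
    whenever prob_gain exceeds 0.5.
    """
    if not 0.0 < prob_gain < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    return -NormalDist().inv_cdf(prob_gain)


def social_welfare_budget(expected_sw, sd_sw, prob_gain):
    """Expected social welfare plus the (negative) social welfare margin."""
    return expected_sw + margin_parameter(prob_gain) * sd_sw


# Illustrative values only.
budget_95 = social_welfare_budget(100.0, 10.0, 0.95)
budget_50 = social_welfare_budget(100.0, 10.0, 0.50)
```

Raising the target probability makes the parameter more negative and lowers the budget, reflecting a more conservative attitude toward social welfare gain.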

The social welfare of the candidate rail transit line consists of the consumer surplus of households and the profit of the rail operator. Mathematically, the expected social welfare is the sum of the expected consumer surplus of households and the expected profit of the rail operator.

The expected consumer surplus of households is given bywhere 365 is a parameter converting daily consumer surplus into yearly consumer surplus, is the expected travel demand of rail service from residential location in year , is the expected generalised travel cost from residential location to the CBD in year by rail, is planning time horizon in years, and is the length of the rail transportation corridor.

The expected profit of rail operator is given bywhere 365 is a parameter converting daily profit into yearly profit, is rail fare, is a variable cost to supply rail service for each passenger, is rail length in year , is yearly unit fixed maintenance cost of rail line, is rail station number in year , and is yearly fixed operation cost of each rail station.

In terms of A3, the travel demand function of rail service from residential location in year , is assumed to be given by an exponential function shown as follows [18]:where is a positive constant, which responds to the households’ sensitivity to the rail service level, and is a random term, with . The inverse function of travel demand can be obtained as follows:

Substituting it into equation (23), the following equation is obtained:
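As an illustration of the exponential demand function and its inverse, the following sketch uses the deterministic part of a generic exponential demand curve. The functional form `potential * exp(-sensitivity * cost)`, the parameter names, and the omission of the random term are assumptions made for illustration; they are not claimed to reproduce equations (25)–(27).

```python
import math


def rail_demand(potential_demand, sensitivity, cost):
    """Deterministic exponential demand: falls as generalised cost rises."""
    return potential_demand * math.exp(-sensitivity * cost)


def inverse_demand(potential_demand, sensitivity, demand):
    """Generalised cost at which the given demand level is realised."""
    if not 0.0 < demand <= potential_demand:
        raise ValueError("demand must be positive and at most the potential demand")
    return math.log(potential_demand / demand) / sensitivity


def consumer_surplus(potential_demand, sensitivity, cost):
    """Area under the demand curve above the current cost.

    For exponential demand the integral of demand from `cost` to infinity
    equals demand(cost) / sensitivity.
    """
    return rail_demand(potential_demand, sensitivity, cost) / sensitivity


# Illustrative values only.
q = rail_demand(10000.0, 0.05, 20.0)
c_back = inverse_demand(10000.0, 0.05, q)
cs = consumer_surplus(10000.0, 0.05, 20.0)
```

The closed-form surplus is one reason exponential demand is convenient in analytical models of this kind.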

The expected generalised travel cost consists of the fare, the access cost from residential locations to rail stations, the waiting cost for rail service at stations, and the in-vehicle cost from rail stations to the CBD [20]. The average access time from residential locations to the rail station, the average waiting time for rail service at stations, and the average in-vehicle time from rail stations to the CBD are each weighted by the corresponding value of time, and a compound-account factor converts future values to present values. The distance-based fare for rail service consists of a fixed fare component and a variable fare component per kilometre. Waiting time is closely related to the travel demand and the supply of the rail service. For long-term planning, it can be estimated as 0.5 times the average headway in the year concerned, where 0.5 is a reasonable parameter for short train headways and passenger arrival times [21].

The average headway in each year is closely related to the cycle time of train operation and the fleet size of trains. The cycle time of train operation is calculated from the rail line length in that year, the average train speed in that year, and the average constant terminal time [22]. The average in-vehicle travel time from a rail station to the CBD is given by the distance between the rail station and the CBD divided by the average train speed in that year; this distance can be calculated directly if a constant station spacing is assumed.
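The service-time components described above can be sketched as follows. Two relations come from the text: waiting time is 0.5 times the headway, and in-vehicle time is distance divided by average speed. The remaining forms are assumptions made for illustration: headway as cycle time divided by fleet size, cycle time as a round trip plus terminal time, and station distance as station index times an even spacing. All names and numbers are hypothetical.

```python
def cycle_time(line_length_km, speed_kmh, terminal_time_h):
    """Assumed round-trip cycle time: out and back, plus terminal time."""
    return 2.0 * line_length_km / speed_kmh + terminal_time_h


def headway(cycle_time_h, fleet_size):
    """Assumed average headway: cycle time shared among the trains in service."""
    if fleet_size <= 0:
        raise ValueError("fleet size must be positive")
    return cycle_time_h / fleet_size


def waiting_time(headway_h):
    """Average waiting time as half the headway (from the text)."""
    return 0.5 * headway_h


def in_vehicle_time(station_index, line_length_km, station_number, speed_kmh):
    """Distance from station to CBD divided by average speed.

    Assumes stations are evenly spaced and indexed 1..station_number
    outward from the CBD.
    """
    spacing = line_length_km / station_number
    return station_index * spacing / speed_kmh


# Illustrative values only.
theta = cycle_time(20.0, 40.0, 0.1)        # hours
h = headway(theta, 11)                     # hours
w = waiting_time(h)                        # hours
t5 = in_vehicle_time(5, 20.0, 10, 40.0)    # hours from station 5 to the CBD
```

Under these assumptions a longer line increases the cycle time and hence the headway and waiting time unless the fleet size grows with it.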

In terms of equations (1), (2), (24)–(32), we obtain the expected budget social welfare shown as follows:

The standard deviation of travel demand for rail service isand the standard deviation of budget social welfare can be calculated as follows:

#### 3. Model Formulation and Properties

As stated above, the government or rail operator aims to maximise the social welfare budget of the candidate rail transit line by determining the optimum rail line length, rail station number, and project start time of the candidate rail transit line.

##### 3.1. Model Formulation

In terms of equations (21), (22), and (37), the social welfare budget maximisation model is formulated with the social welfare budget as the objective and with the rail line length, the rail station number, and the project start time of the candidate rail transit line as the decision variables.

##### 3.2. Model Properties

Proposition 1. *For the social welfare budget maximisation problem (38), the social welfare budget is a decreasing function of the spatial and temporal correlation coefficient.*

*Proof. *In terms of equation (38), it can be found that the variation of budget social welfare with respect to spatial and temporal correlation coefficient depends on the variation of with respect to . It is easy to find that is a decreasing function of .
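The intuition behind Proposition 1 can be illustrated with a toy example, which is not the authors' proof. It assumes only two correlated random quantities whose sum drives social welfare, the standard formula for the SD of a sum, and a budget equal to the expected value plus a negative parameter times the SD. The names, the value −1.645, and the other numbers are hypothetical.

```python
import math


def sd_of_sum(sd_a, sd_b, rho):
    """SD of the sum of two random variables with correlation rho."""
    return math.sqrt(sd_a ** 2 + sd_b ** 2 + 2.0 * rho * sd_a * sd_b)


def budget(expected, sd, negative_parameter):
    """Expected value plus a negative safety margin."""
    return expected + negative_parameter * sd


correlations = [-1.0, -0.5, 0.0, 0.5, 1.0]
sds = [sd_of_sum(30.0, 40.0, rho) for rho in correlations]
budgets = [budget(1000.0, sd, -1.645) for sd in sds]
```

As the correlation coefficient rises from −1 to 1, the SD of the sum grows from 10 to 70 in this example, so the negative margin becomes larger in magnitude and the budget falls, in line with the direction stated in the proposition.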

Proposition 2. *For the social welfare budget maximisation problem (38), at the equilibrium of equations (8)–(10), the optimal rail length , rail stations number , and project start time can be obtained by the following equations:with*

*Proof. *To obtain the optimal solution of the rail line length, the partial derivative of objective function equation (38) with respect to was set to zero. Then,whereand, then, the optimal rail line length can be obtained as follows:Similarly, to obtain the optimal solution of the rail station number, the partial derivative of objective function equation (38) with respect to was set to zero, namely,and, then, the optimal rail station number can be obtained as follows:To obtain the optimal solution of the project start time of the candidate rail transit line, the partial derivative of objective function equation (38) with respect to was set to zero, namely,whereand, then, the optimal project start time of the candidate rail transit line can be obtained as follows:with

#### 4. Numerical Examples

To facilitate the presentation of the essential ideas and contributions of this study, two illustrative examples are given below.

##### 4.1. Example 1

The input parameters are summarised in Table 1.

Figure 2 plots the contour of optimal social welfare budget in the space of spatial and temporal correlation coefficient (cc) with the objective of social welfare budget maximisation. It can be seen that, for a particular spatial cc, as temporal cc increases, the optimal social welfare budget of the candidate rail transit line decreases. For instance, for spatial cc 0, as temporal cc increases from −1 to 1, the optimal social welfare budget decreases from level of HK$ to HK$.

For a given total population, positive temporal cc means that the increase of population density in the first year leads to the increase of population density in the next year. As a result, households are distributed to limited residential locations and the total population has a centralised distribution.

In summary, as temporal cc increases, the total population becomes more centralised, and the social welfare budget of the candidate rail transit line decreases. A more centralised population distribution leads to a lower social welfare budget, whereas a decentralised population distribution yields a higher social welfare budget for the candidate rail transit line.

Similarly, for given temporal cc, as spatial cc increases, the optimal social welfare budget decreases. For instance, for temporal cc 0.8, as spatial cc increases from −1 to 1, the optimal social welfare budget decreases from HK$ to HK$.

A positive spatial cc implies that an increase in population density in one residential location leads to an increase in population density in another residential location. A cooperative relationship may then exist between these two adjacent residential locations. For instance, population growth in a new town can lead to an increase in population density in the residential locations of the adjacent suburban city.

In summary, as spatial cc increases, the residential locations become more correlated with each other, and the optimal social welfare budget decreases. More strongly correlated residential locations lead to a lower social welfare budget for the candidate rail transit line. Conversely, a competitive relationship between residential locations yields a higher social welfare budget for the candidate rail transit line.

It is also noted that the effects of temporal cc on the optimal social welfare budget are more significant than spatial. For instance, as temporal cc increases from −1 to 1, the optimal social welfare budget decreases from level of HK$ to . As spatial cc increases from −1 to 1 and temporal cc of 0.8, the optimal social welfare budget decreases from HK$ to HK$.

Compared with traditional studies assuming a spatial and temporal cc of 0, the optimal social welfare is overestimated in parts of (a) and (b) in Figure 2 and underestimated in parts of (c) and (d). For instance, in part of (b), the results in traditional studies are overestimated from HK$ to HK$ with spatial and temporal cc of 1. In part (c), the results in traditional studies are underestimated from HK$ to HK$ with spatial and temporal cc of −1.

Table 2 shows the numerical results for the optimal rail line length, the optimal rail station number, and the optimal project start time of the candidate rail transit line relative to the base year, for temporal correlation coefficients (cc) of −1 and 1 and spatial cc from −1 to 1. With a temporal cc of −1, the optimal rail line length in each year is longer than with a temporal cc of 1. For instance, with temporal and spatial cc of −1, the optimal rail line length is 30.98 km in year 1, 17.49 km in year 2, and 9.59 km in year 3, whereas with a temporal cc of 1 and a spatial cc of −1 it is 22.09 km in year 1, 14.36 km in year 2, and 9.21 km in year 3. This implies that the optimal rail line length is longer under a decentralised population distribution than under a centralised population distribution.

It can also be seen that the optimal rail line length in each year decreases as spatial cc increases from −1 to 1. For instance, with a temporal cc of −1 and spatial cc increasing from −1 to 1, the optimal rail line length in year 1 decreases from 30.98 km to 20.74 km. This implies that, as cooperation between residential locations becomes stronger and competition between them becomes weaker, the optimal rail line length in year 1 decreases.

From Table 2, it can be found that the optimal project start time of the candidate rail transit line is fast-tracked when temporal cc increases from −1 to 1. For instance, with temporal and spatial cc of −1, the optimal project start time of the candidate rail transit line is year 11.19 in terms of the base year, while the optimal project start time of the candidate rail transit line, with temporal cc of 1 and spatial cc of −1 is 8.60. It implies that the optimal project start time of the candidate rail transit line is earlier under centralised population distribution than that under decentralised population distribution.

##### 4.2. Example 2

Figure 3 gives the housing unit price map around MTR stations in Hong Kong for the first half of 2015. The housing unit prices, taken within two hundred metres of each MTR station, are the transaction prices of representative housing estates. The representative housing estates are those with the largest number of transactions in their residential locations over the past six months. The housing unit prices are measured in HK$ per square foot (sqft). The data come from Centra data (hk.centranet.com/eng/ehome.htm).

Table 3 lists the housing rents of the representative housing estates at each rail station of the West Island Line in the first half of 2015. The housing rent price ratio was around 3% in Hong Kong in 2015. These data come from Chiefgroup of Hong Kong (www.chiefgroup.com.hk). The average flat size is 36.5 sqft, according to the Housing Authority Annual Report 2014–2015 (www.housing.wa.gov.au/housingDocuments). The daily housing rents around the rail stations of the West Island Line are calculated from the housing prices, the housing rent price ratio, the average flat size, and a constant parameter. For instance, in this example, daily housing rent = (36.57housing price3%)/(1230), and 676.34 = (318643%36.57)/(1230).
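The rent calculation described above can be sketched as a small helper. The sketch assumes that the expression multiplies the flat size, the housing unit price, and the rent price ratio, and that the constant parameter in the denominator stands for 12 months times 30 days; this reading of the formula is an assumption, and the function name and sample inputs are hypothetical rather than values from Table 3.

```python
def daily_housing_rent(price_per_unit_area, flat_size, rent_price_ratio,
                       months_per_year=12, days_per_month=30):
    """Daily rent implied by a yearly rent-to-price ratio.

    Yearly rent = flat_size * price_per_unit_area * rent_price_ratio,
    then spread over months_per_year * days_per_month days (assumed).
    """
    yearly_rent = flat_size * price_per_unit_area * rent_price_ratio
    return yearly_rent / (months_per_year * days_per_month)


# Illustrative inputs only: price 36,000 per unit area, flat size 10, ratio 3%.
example_rent = daily_housing_rent(36000.0, 10.0, 0.03)
```

The flat size and the unit price must be expressed in the same area unit for the result to be meaningful.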

Figure 4 shows the effects of the space-time correlation of population densities on the social welfare budget for the West Island Line. With a given spatial correlation coefficient (cc) of population densities, as temporal cc increases from −1 to 1, the social welfare budget for the West Island Line decreases. The effect of the temporal cc of population densities on the social welfare budget is also more significant than that of the spatial cc. These results are consistent with those of numerical example 1.

#### 5. Conclusions

This paper proposes a closed-form model to investigate the effects of the space-time correlation of population densities on the design of a candidate rail transit line over years. Traditional studies that assume independent population densities, in the spirit of the independence of irrelevant alternatives (IIA), namely a space-time correlation of population densities of 0, are special cases of the proposed model.

The proposed model offers several insights. For example, a decentralised population distribution yields a higher social welfare budget for the candidate rail transit line, as does competition between residential locations. The effects of the temporal correlation coefficient (cc) on the optimal social welfare budget are more significant than those of the spatial correlation coefficient. The optimal rail line length in each year is longer with a temporal cc of −1 than with a temporal cc of 1. The optimal project start time of the candidate rail transit line is brought forward as temporal cc increases from −1 to 1.

The proposed model also offers some managerial implications. For instance, Proposition 1 shows that the social welfare budget is a decreasing function of the spatial and temporal correlation coefficient. Because a rail transit line can strengthen the spatial and temporal correlation of population densities, the construction of a candidate rail transit line can itself reduce the social welfare budget. The optimal design values of a candidate rail transit line, namely, the optimal rail line length, rail station number, and project start time, can be determined explicitly by Proposition 2.

This paper provides a new avenue for modelling and analysing the effects of the space-time correlation of population densities on the design of a candidate rail transit line over years. In this paper, households are assumed to be homogeneous, with trips made only from residences to the CBD. The proposed model can be extended to incorporate the effects of households’ risk preferences regarding early and late arrival at the CBD on the design of a candidate rail transit line over years [22]. The decision to extend a rail line involves technological, social, and economic factors. The prime reason could be social, in other words, a desire to make travel more convenient for a specific set of people, namely, those living in the vicinity of the line and of the new stations to be constructed. However, only the pressing economic factors are considered in this paper. More detailed social factors, for instance, the appreciation of land value along the rail line, can be taken into account in further studies [3].

#### Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

#### Conflicts of Interest

The authors declare that they have no conflicts of interest.

#### Acknowledgments

The work described in this paper was jointly supported by the National Natural Science Foundation of China (Grant no. 71473060), the Science and Technology Development Center, Ministry of Education of China (Grant no. 2018A01025), Humanities and Social Sciences Fund of the Ministry of Education (Grant no. 20YJCZH225), Shanghai “Science and Technology Innovation Action Plan” Soft Science Key Project (Grant no. 20692190900), and Shenzhen Philosophy and Social Sciences Planning Project of China (Grant no. SZ2019C004). The authors would like to thank Prof. W.H.K. Lam for his comments and suggestions and Mrs. Elaine Anson for her proofreading of this manuscript.