Abstract

To identify the main factors affecting city residents’ choice of public bicycles and to examine the personal and travel characteristics of public bicycle users, a travel mode choice model based on a Bayesian network was established. Taking residents of Xi’an as the research object, a K2 algorithm combined with mutual information and expert knowledge was proposed for Bayesian network structure learning. The Bayesian estimation method was used to estimate the parameters of the network, and a Bayesian network model was established to reflect the interactions between public bicycle choice behavior and other major factors. K-fold cross-validation was used to validate the model performance, and the hit rate of each travel mode was more than 80%, indicating the precision of the proposed model. Comparative experiments also show that the proposed model achieves higher classification accuracy than the alternative models tested. Therefore, it may be concluded that resident travel mode choice can be predicted accurately by the Bayesian network model proposed in this study. Additionally, the model may be employed to analyze how residents’ public bicycle choice changes with different traveler characteristics and trip characteristics.

1. Introduction

Under the requirements of sustainable development in China that prescribe low carbon development and environmental harmony, more and more cities have begun to build bike-sharing systems that integrate traditional bicycle usage with public transit, combining the flexibility of bicycles with the open sharing of public transit [1, 2], as shown in Figure 1. The construction of special facilities for these systems is relatively simple and easy to implement. The public use of bicycles is highly adaptable to the existing urban road network, and it can cooperate with public transportation networks to connect to the bus or subway, making door-to-door public transit service a reality in the true sense [3]. The bike-sharing system first appeared in Amsterdam, Holland, in 1968, where it was known as the “white bicycle plan.” After four technical innovations, public usage of bicycles has made great progress around the world in terms of scale, security, comfort, and other aspects. In the 21st century, we have seen public bicycle usage expand throughout the world; in Europe, Asia, and the other five continents, at least 199 bicycle sharing projects have been implemented, and Europe has accounted for approximately 78.1% of that total number. By the end of 2010, 192 cities in 27 countries had built a bike-sharing system [4–6].

Research on bicycle travel behavior and decisions can reflect the operational effects and mechanisms of a bike-sharing system. To date, western countries have conducted in-depth research in this area [7, 8]. Research on public bicycle travel behavior and decisions can be divided into five areas: (1) study of the operational effects of bike-sharing system usage and the constructive effects of its related facilities [9–13]; (2) usage times and average distance of public bicycle travel, which can reflect the basic usage characteristics of public bicycles [14, 15]; (3) the choice behavior of travel modes, including studies that discuss the factors that influence the choice to use public bicycles and other studies examining the combined effects of multiple factors on the choice of bicycle travel patterns and established theoretical models of the choice mechanism, among which the rational choice theory and discrete choice models were widely used [10, 16–18]; (4) public bicycle route choice behavior, for which qualitative methods were used to analyze the relationship between the road environment and route choice of riders in many studies, some studies exploring the most important factors affecting bicycle route selection and some relying on selection models that were used to quantitatively investigate route choice behavior of bicyclers under the influence of multiple factors [19–23]; (5) the difference in travel behavior of public bicyclers under conditions of different travel times and travel destinations [24, 25].

In the research of travel mode choice theory and method, the discrete model has been widely used to study the mode choice behavior of travelers, and the binary Logit, MNL, logistic regression, and nested Logit models were established [26–29]. Yang and Zhu [27] established a structural equation model of residents’ mode choice in the process of urbanization based on an analysis of the trip chain, and this study came to the conclusion that family and personal attributes, land use, and trip chain influenced the choice of travel mode. Parkin et al. [28] explained the bicycle travel rate at the constituency level using the logistic model, which indicated that the car ownership rate is a significant factor hindering bike travel and noting that the pulling effect of the trip rate is weak if it relies only on the construction of bicycle facilities. Zhang et al. [29] analyzed the passenger flow composition of the Beijing-Shanghai railway and passenger travel mode choice behavior characteristics; this study utilized disaggregation theory and the binary Logit model of passengers’ mode choice. With further research on travel mode choice theory and method, many scholars have begun to study the mode choice behavior models that consider psychological and environmental variables as well [30–32].

Overall, most existing studies construct in-depth and accurate models of mode choice behavior by employing the discrete model. However, the discrete model cannot directly reflect changes in individual factors and other factors that influence the mode choice. Even though public cycling is an increasingly popular way to travel, few studies have looked at the survey data about urban public bicycles, and most mode choice behavior models have not considered the public bicycle. Bayesian network theory expresses the association between random variables using graph theory and uses conditional probability to express the uncertainty relation among factors, taking advantage of a priori probability and sample information to predict a posterior probability. Bayesian network theory is one of the most effective methods currently applied in the uncertainty prediction field. In the transportation field, the Bayesian method has been applied in many aspects, such as predicting traffic accident situations, using rapidly collected data to calculate the probability of traffic accidents, analyzing the causes of accidents, and predicting the safety of subway operations [33–39]. Therefore, the Bayesian network can express the coupling mechanism between the decision behavior and the influencing factors visually. Accordingly, this paper constructs a travel mode choice model based on a Bayesian network, in order to reflect the interaction among factors and mode choice behavior and to enrich the current understanding of the theories behind public bicycle travel behavior and decisions.

2. Data Collection

Reasonable and accurate basic data are the foundation of a sound and logical Bayesian network.

2.1. Questionnaire Design

Our survey was conducted in Xi’an, China, and the questionnaires were designed for revealed preference (RP) and stated preference (SP) surveys. The RP questionnaire was mainly used to investigate the age, sex, income, and other attributes of travelers, as well as their travel distances, departure times, travel mode choices, and other travel characteristics. The RP questionnaires were randomly distributed at public bicycle rental points near bus and subway stations and outside the entrances and exits of large residential areas, shopping centers, business office buildings, and so on. In order to comprehensively investigate residents’ willingness to travel under assumed conditions, the SP questionnaire was designed to compensate for the shortcomings of the RP survey. The SP survey was conducted on WeChat and questionnaire Apps, which reduced the cost of the investigation and yielded a large number of responses. Traveler characteristics were investigated through the SP survey. The SP survey also investigated the mode choice of residents according to four demand assumptions (rigid demand during peak hours, elastic demand during peak hours, rigid demand during off-peak hours, and elastic demand during off-peak hours) for travel distances of <1 km, 1–3 km, 3–6 km, and >6 km.

2.2. Questionnaire Data Statistics

A total of 500 RP questionnaires were distributed, and 459 effective questionnaires were ultimately collected. A further 2000 valid questionnaires were collected from the SP survey. The statistical data about travelers’ attributes, as well as the travel attributes for residents who chose public bicycle, are shown in Table 1.

3. Establishment of Public Bicycle Choice Model

A Bayesian network is a directed acyclic graph (DAG). Each node in the network structure represents a random variable, the directed arcs between the nodes represent the causal relationships between the variables, and two variables with no arc between them are considered conditionally independent. The process of constructing Bayesian networks (prior Bayesian networks) is generally divided into three steps: (1) determine variables and variable range, (2) determine network structure, and (3) determine the local probability distribution and complete network parameter learning.

3.1. Model Variables’ Determination and Discretization

Many factors influence the behavior of resident travel mode choices. In this paper, the 12 node variables were initially selected based on a survey of public bicycle choice behavior and the factors that were identified in previous studies as influencing factors. Discretization of the 12 node variables and the corresponding variable set is summarized in Table 2.

3.2. Bayesian Network Structure Learning

The K2 algorithm was established by Cooper and Herskovits [40] based on a scoring function and the hill-climbing search strategy, and it is a common method for learning Bayesian network structure. The algorithm takes a node order as prior information and uses the Bayesian probability as the standard to evaluate the degree of coincidence between the model and the data. It then applies a greedy search: arcs are added to the network one at a time as long as they improve the evaluation index, until the best-scoring structure under that search is found. The node order constrains the search as follows: if node $X_i$ is sorted before $X_j$, then node $X_j$ cannot be the parent node of node $X_i$.

There are many factors that affect residents’ travel mode choice behavior, and the relationship among them is fuzzy. However, the dependencies among data variables can be shown via a DAG. To construct a DAG, it is necessary to determine which nodes are correlated (i.e., the arcs) and how strong the relevance is. Previous studies have relied on expert judgment to determine the nodes and the order of the nodes, which introduces subjectivity. Based on this, we propose a Bayesian network structure learning method based on the combination of mutual information, expert knowledge, and the K2 algorithm.

Mutual information measures the degree of dependence between random variables. It is calculated for every pair of nodes in the network, and an arc is considered to exist between two nodes if their mutual information is greater than a given threshold value [41]. Let the variable set be $X = \{X_1, X_2, \ldots, X_n\}$ (where $n$ is the total number of variables in the sample data set). Then, the strength of the relevance between any two variables may be represented by their mutual information (MI), and the greater the value of the mutual information, the stronger the correlation between $X_i$ and $X_j$. The mutual information between $X_i$ and $X_j$ is defined as follows:

$$I(X_i, X_j) = \sum_{k=1}^{r_i} \sum_{l=1}^{r_j} P(x_i^k, x_j^l) \log_2 \frac{P(x_i^k, x_j^l)}{P(x_i^k)\,P(x_j^l)}, \tag{1}$$

where $r_i$ is the number of possible values of the variable $X_i$, $x_i^k$ means that variable $X_i$ takes its $k$th value, $r_j$ is the number of possible values of the variable $X_j$, and $x_j^l$ means that variable $X_j$ takes its $l$th value.

It can be seen from the definition of mutual information that the mutual information between two variables is symmetrical; namely, $I(X_i, X_j) = I(X_j, X_i)$. At the same time, when $P(x_i, x_j) = P(x_i)P(x_j)$, then $X_i$ and $X_j$ are independent of each other and $I(X_i, X_j) = 0$. Therefore, the smaller the value of the mutual information, the greater the possibility that the variables $X_i$ and $X_j$ are independent. For variable selection, we chose 0.002 bits as the threshold, having found that when the mutual information value was lower than 0.002 bits, the correlation between variables was low [41].
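As an illustration of the definition above, the following Python sketch estimates mutual information in bits from two columns of discrete data using empirical frequencies. The function name `mutual_information` and the toy columns are hypothetical and are not taken from the survey; only the 0.002-bit threshold comes from the text.

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi

# Hypothetical toy data: car ownership (0/1) against the chosen mode.
own = [0, 0, 0, 0, 1, 1, 1, 1]
mode_dependent = ["bike", "bike", "bus", "bus", "car", "car", "car", "car"]
mode_independent = ["bus", "car", "bus", "car", "bus", "car", "bus", "car"]

THRESHOLD = 0.002  # bits, the threshold quoted in the text

mi_dep = mutual_information(own, mode_dependent)
mi_ind = mutual_information(own, mode_independent)
print("dependent pair:  ", mi_dep, "keep arc:", mi_dep > THRESHOLD)
print("independent pair:", mi_ind, "keep arc:", mi_ind > THRESHOLD)
```

In the first pair the mode fully determines car ownership, so the mutual information equals the one-bit entropy of the ownership variable; in the second pair the joint frequencies factorize exactly and the value is zero, so no arc would be drawn.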

The steps of Bayesian network structural learning based on the combination of mutual information, expert knowledge, and the K2 algorithm are as follows.

(1) Undirected graph establishment: compute the mutual information $I(X_i, X_j)$ for each pair of variables $X_i$, $X_j$; when it is greater than the threshold value, connect the arc $X_i$ - $X_j$. This yields the undirected network.

(2) Prune undirected graph: for each node $X_i$, a list of its connected nodes in descending order of mutual information value is obtained. A conditional independence test is then applied; if conditional independence is established, the corresponding arcs are deleted, giving the optimized undirected network.

(3) According to expert knowledge, the direction of the arc in the undirected network is determined, and the directed network is obtained.

(4) After using the directed network to confirm the node order that serves as the input of the K2 algorithm, employ the K2 score function to search for the highest scoring network.
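Step (4) can be sketched in Python as follows. This is a minimal illustration of a K2-style greedy search, not the BNT implementation used in the paper: it evaluates the logarithm of the Cooper-Herskovits score with `math.lgamma` and only considers parents that precede a node in the given order. The function names, the `max_parents` limit, and the toy data (in which `Wa` depends on `Ow` but not on `Pe`) are assumptions made for the example.

```python
from collections import Counter
from itertools import product
from math import lgamma

def k2_log_score(data, child, parents, arity):
    """Log of the Cooper-Herskovits (K2) score of `child` given `parents`.

    data  : list of dicts mapping variable name -> integer state in 0..r-1
    arity : dict mapping variable name -> number of states r
    """
    r = arity[child]
    counts = Counter()  # N_ijk keyed by (parent configuration, child state)
    for row in data:
        counts[(tuple(row[p] for p in parents), row[child])] += 1
    score = 0.0
    for cfg in product(*(range(arity[p]) for p in parents)):
        n_ijk = [counts[(cfg, k)] for k in range(r)]
        n_ij = sum(n_ijk)
        # log[(r-1)! / (N_ij + r - 1)!] + sum_k log(N_ijk!)
        score += lgamma(r) - lgamma(n_ij + r) + sum(lgamma(c + 1) for c in n_ijk)
    return score

def k2_search(data, order, arity, max_parents=2):
    """Greedy K2: each node may only take parents that precede it in `order`."""
    parents = {v: [] for v in order}
    for i, child in enumerate(order):
        best = k2_log_score(data, child, parents[child], arity)
        candidates = set(order[:i])
        while len(parents[child]) < max_parents and candidates:
            scored = [(k2_log_score(data, child, parents[child] + [c], arity), c)
                      for c in sorted(candidates)]
            new_best, pick = max(scored)
            if new_best <= best:  # no candidate improves the score: stop
                break
            best = new_best
            parents[child].append(pick)
            candidates.remove(pick)
    return parents

# Hypothetical toy data: Wa (mode, 2 states) depends on Ow (car ownership);
# Pe (period) is independent of both.
rows = []
for ow, wa, pe, count in [(0, 1, 0, 9), (0, 1, 1, 9), (0, 0, 0, 1), (0, 0, 1, 1),
                          (1, 0, 0, 9), (1, 0, 1, 9), (1, 1, 0, 1), (1, 1, 1, 1)]:
    rows += [{"Ow": ow, "Wa": wa, "Pe": pe}] * count
arity = {"Ow": 2, "Pe": 2, "Wa": 2}

learned = k2_search(rows, ["Ow", "Pe", "Wa"], arity)
print(learned)
```

On this toy data the search attaches `Ow` as the only parent of `Wa` and leaves `Pe` unconnected, because adding `Pe` lowers the score. The result depends on the node order passed in, which is why the procedure above fixes the order from the expert-directed network before running K2.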

3.3. Bayesian Network Parameter Learning

Parameter learning determines the conditional probability distribution at each node for a given Bayesian network structure. The principle of using the Bayesian method for parameter learning is as follows: given a distribution $P(x \mid \theta)$ with unknown parameter $\theta$ and a complete data set $D$, the parameter $\theta$ is a random variable with a prior distribution $P(\theta)$. Observing $D$ changes the information about $\theta$, and the result is expressed as the posterior probability $P(\theta \mid D)$; the task of Bayesian network parameter learning is to calculate $P(\theta \mid D)$.
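To make the idea concrete, the sketch below estimates one conditional probability table by the posterior mean under a symmetric Dirichlet prior. The paper does not state which prior was used, so the Dirichlet prior, the pseudo-count `alpha`, the function name `bayes_cpt`, and the toy counts are all assumptions made for illustration.

```python
from collections import Counter

def bayes_cpt(data, child, parent, child_states, parent_states, alpha=1.0):
    """Posterior-mean CPT P(child | parent) under a symmetric Dirichlet(alpha) prior."""
    counts = Counter((row[parent], row[child]) for row in data)
    r = len(child_states)
    cpt = {}
    for p in parent_states:
        n_p = sum(counts[(p, c)] for c in child_states)
        # Posterior mean: (observed count + prior pseudo-count) / (total + r * alpha)
        cpt[p] = {c: (counts[(p, c)] + alpha) / (n_p + alpha * r) for c in child_states}
    return cpt

# Hypothetical toy sample: car owners (Ow = 1) never chose "bike" in the data.
sample = ([{"Ow": 1, "Wa": "car"}] * 6 + [{"Ow": 1, "Wa": "bus"}] * 2 +
          [{"Ow": 0, "Wa": "bus"}] * 5 + [{"Ow": 0, "Wa": "bike"}] * 3)

cpt = bayes_cpt(sample, "Wa", "Ow", ["car", "bus", "bike"], [0, 1])
print(cpt[1])
```

With `alpha = 1`, the unobserved combination (car owner, bike) receives probability 1/11 rather than zero, which is the practical benefit of combining a prior with the sample information instead of using raw frequencies.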

3.4. Bayesian Network Model Validation

The effectiveness of the established Bayesian network model is validated by calculating the hit rate of the predicted mode choices. Assume that the total number of alternative travel modes is $J$ and that sample $n$ chooses the $i$th travel mode, with corresponding prediction probability $P_{ni}$; the hit rate of mode $i$ is then computed over the samples of that mode, where $N_i$ is the total number of samples whose travel mode is $i$. Generally, when the hit rate reaches 80%, the model is considered to have achieved good results.

Additionally, to validate the method proposed in this paper, the K-fold cross-validation method was employed in the experimental procedure. In machine learning, the data set is usually divided into a training set and a test set. When the sample size is insufficient, the data set is instead randomly divided into $K$ packets so that the algorithm can be tested while making full use of the data. Each time, one of the packets is used as the test set, and the remaining $K-1$ packets are used as the training set. The advantage of this method is that every sample is used for both training and testing, and each sample is tested exactly once. 10-fold cross-validation is a common choice: it divides the data set into 10 parts, taking nine of them as training data and one of them as test data in turn. A hit rate (or error rate) is obtained in each test, and the average over the 10 tests is used as an estimate of the algorithm accuracy. Generally, 10-fold cross-validation is repeated several times, and the average value is taken as the estimate of the algorithm accuracy. Therefore, 10-fold cross-validation will be conducted 10 times to validate the model.
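The validation procedure can be sketched as below. The sketch takes the hit rate of a mode to be the share of samples of that mode that are classified correctly, which is an assumption about the exact formula. The classifier `fit_lookup` is a deliberately trivial stand-in for the Bayesian network, and the toy data and function names are hypothetical.

```python
import random
from collections import Counter, defaultdict

def k_fold_splits(n, k, seed=0):
    """Yield (train_indices, test_indices) for k folds over n shuffled samples."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]

def fit_lookup(rows):
    """Toy classifier: predict the most frequent mode for each feature value."""
    table = defaultdict(Counter)
    for x, y in rows:
        table[x][y] += 1
    overall = Counter(y for _, y in rows).most_common(1)[0][0]
    return lambda x: table[x].most_common(1)[0][0] if x in table else overall

def cross_validated_hit_rate(rows, k=10, seed=0):
    """Per-mode hit rate: correctly classified test samples / samples of that mode."""
    hits, totals = Counter(), Counter()
    for train, test in k_fold_splits(len(rows), k, seed):
        predict = fit_lookup([rows[j] for j in train])
        for j in test:
            x, y = rows[j]
            totals[y] += 1
            hits[y] += predict(x) == y
    return {mode: hits[mode] / totals[mode] for mode in totals}

# Hypothetical toy data: (trip distance class, chosen mode).
rows = ([("short", "bike")] * 45 + [("short", "bus")] * 5 +
        [("long", "bus")] * 45 + [("long", "bike")] * 5)

rates = cross_validated_hit_rate(rows, k=10)
print(rates)
```

Repeating the call with different `seed` values and averaging the results corresponds to the repeated 10-fold cross-validation described above.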

In summary, the process of the travel mode selection model based on the Bayesian network combining mutual information and expert knowledge is presented in Figure 2 [42–46].

4. Experimental and Comparative Analysis

In this section, 2459 questionnaires (2000 SP questionnaires and 459 RP questionnaires) were used as the experimental data set, which is complete. All the variables have been discretized, and the whole data set was used for Bayesian network structure learning and parameter learning.

4.1. Structure Learning

In order to determine the main factors that influence residents’ travel mode choice behavior, we use formula (1) to calculate the mutual information between the travel mode (Wa) and the other 11 variables; the results are shown in Table 3.

Thus, eight main factors influencing the travel mode Wa were screened out: Ag, Oc, Ai, In, Ow, Di, Pe, and Ti. Then, for each node $X_i$, the mutual information with all the other nodes was calculated and the conditional independence test was performed, yielding the undirected network. Based on the undirected network and expert knowledge, the direction of the arc between nodes was determined, and the directed network and node order were obtained.

Adopting the BNT toolbox of the MATLAB software, we used the K2 algorithm for structural learning. The process of structure learning allows for the node order to be adjusted to obtain the optimal network structure. Finally, a network structure with nine nodes (eight factors and one decision variable) and a number of directed arcs was obtained. The results of this structural learning are shown in Figure 3.

4.2. Parameter Learning

Using the BNT toolbox in the MATLAB software for parameter learning, we obtain the parameters of each node (see Tables 4–12).

4.3. Model Validation

To validate the method proposed in this paper, 10-fold cross-validation is employed in the experimental procedure and implemented in MATLAB. All attributes are discrete and there are no missing values. The approach described in Section 3.4 is used to calculate the hit ratio, and the performance of the classifier is quantified by the hit ratio of each traffic mode. The hit ratio, one of the most commonly used measures of classifier performance, is the percentage of correctly classified observations. 10-fold cross-validation of the whole data set is repeated 10 times, and the hit rates of the 10 experiments are shown in Table 13.

As detailed in Table 13, the average hit rate of the prediction model is high, and the hit rate of each travel mode is more than 80%, indicating good model precision. The standard deviation of the hit rate of each travel mode over the 10 experiments is small, indicating that the model performs stably.

4.4. Comparative Analysis

To test the performance of the algorithm in this paper, the following experiment was conducted using the complete data set described above. The test results were compared with those obtained by using two commonly used Bayesian network structure learning algorithms (random K2, hill-climbing), two of the most commonly used Bayesian classifiers (naive Bayes, tree-augmented naive Bayes), and one of the most commonly used travel behavior prediction methods in traffic engineering (multinomial Logit model). The programs for these five classification models were implemented in MATLAB 7.04. In the experiment, the 10-fold cross-validation method was employed, and all the data sets were discretized without missing values. The experimental running environment is as follows: operating system of Windows 7, Intel®Core™i5 CPU, RAM: 4.0 GB.

4.4.1. Experimental Models

(1) Random K2 Algorithm. The random K2 algorithm uses random node order for structural learning.

(2) Hill-Climbing Algorithm. The hill-climbing algorithm simulates the process of climbing a mountain. It starts from a randomly selected location and repeatedly moves in the direction of increasing height until it reaches the top. At each step the best solution in the neighborhood is chosen as the current solution, and the search runs until a local optimum is obtained. The algorithm easily becomes trapped in a local optimum, and whether the global optimum can be obtained depends on the location of the initial point.

(3) Naïve-Bayes. A Naïve-Bayes BN, as discussed in the literature [47–49], is a simple structure that has the classification node as the parent node of all other nodes. No other connections are allowed in a Naïve-Bayes structure. Its classification time is short, and it is widely used in practice, for example in text classification. Its classification accuracy is comparable to that of classifiers constructed by neural networks. It has two advantages over many other classifiers. First, it is easy to construct, as the structure is given a priori (no structure learning procedure is required). Second, the classification process is very efficient. Both advantages are due to its assumption that all the features are independent of each other given the class. Although this independence assumption is obviously problematic, Naïve-Bayes has surprisingly outperformed many sophisticated classifiers over a large number of data sets, especially where the features are not strongly correlated. A simple Naïve-Bayes structure is shown in Figure 4.

(4) Tree-Augmented Naïve-Bayes (TAN). TAN is an extension of, and an improvement on, the simple Naïve-Bayes classifier. Friedman et al. [50] studied TAN, which allows tree-like structures to be used to represent dependencies among features. It relaxes the conditional independence hypothesis of the attribute nodes in the simple Naïve-Bayes classifier and allows attribute nodes to depend on each other. However, each feature node may have at most one other feature node as a parent in addition to the classification node. The TAN classifier model structure is presented in Figure 5.

(5) Multinomial Logit Model (MNL). Multinomial Logit is one of the most classic models for travel behavior prediction [51, 52]. It establishes the individual utility model based on the random utility theory. The selection principle is that each individual selects the option with the highest utility; the formula is as follows:

$$U_{in} = V_{in} + \varepsilon_{in}, \quad i \in A_n,$$

where $U_{in}$ is the utility value of option $i$ for individual $n$, $V_{in}$ is the part of the utility value that can be determined, $\varepsilon_{in}$ is a random term of the utility value (it represents the uncertainty of the utility value), and $A_n$ is the option set of individual $n$.

Assuming that $\varepsilon_{in}$ obeys an independent Gumbel distribution, the Logit model can be obtained, as shown in

$$P_{in} = \frac{\exp(\lambda V_{in})}{\sum_{j \in A_n} \exp(\lambda V_{jn})},$$

where $P_{in}$ is the probability that individual $n$ selects option $i$ and $\lambda$ is the parameter of the Gumbel distribution.

This method has its inherent shortcomings, the most obvious being the independence of irrelevant alternatives (IIA), which is highly likely to lead to inaccurate forecasts.
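The IIA property can be seen in a few lines of Python. The sketch below computes Logit choice probabilities from deterministic utilities and shows that the ratio between two options does not change when a third option is added. The utility values and the name `mnl_probabilities` are hypothetical, and the `scale` argument stands for the Gumbel distribution parameter.

```python
from math import exp

def mnl_probabilities(utilities, scale=1.0):
    """Logit choice probabilities from deterministic utilities (dict: option -> V)."""
    m = max(utilities.values())  # subtract the maximum for numerical stability
    weights = {a: exp(scale * (v - m)) for a, v in utilities.items()}
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}

# Hypothetical utilities for two modes, then the same set plus a public bicycle.
v_two = {"car": 1.0, "bus": 0.5}
v_three = dict(v_two, bike=0.5)

p_two = mnl_probabilities(v_two)
p_three = mnl_probabilities(v_three)

ratio_before = p_two["car"] / p_two["bus"]
ratio_after = p_three["car"] / p_three["bus"]
print(ratio_before, ratio_after)  # both equal exp(0.5) up to rounding
```

Because the car-to-bus odds stay fixed, the new option draws share from car and bus in proportion to their existing shares, regardless of how similar it is to either of them. This is the source of the forecasting errors mentioned above.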

4.4.2. Comparison Results

To evaluate the classification performance of the established Bayesian network model more comprehensively, the proposed algorithm is compared with the five classification models described above. The hill-climbing and K2 baselines use a random node ordering. The data used in the experiment and the operating conditions are the same for all models. Specifically, 500 samples, 1000 samples, and the whole data set are selected for testing, and the data needed for the MNL model are obtained from the same data set. All experiments are repeated 10 times for each sample size, and the reported hit rates are the averages of the 10-fold cross-validation results over the 10 experiments. The comparison results are detailed in Table 14.

As can be seen from Table 14 and Figure 6, both the presented method and the other five algorithms have high classification accuracy, but the classification accuracy of our method is higher than that of the other five algorithms as a whole. Naïve-Bayes and TAN perform better on small data sets and achieve higher accuracy there. The random K2 and hill-climbing algorithms have better overall performance than Naïve-Bayes and TAN in these experiments. Our model outperforms TAN: the average hit rate for our model is 86.57%, which is higher than that for TAN (84.66%). As the amount of training data increases, the accuracy of our model increases gradually, but our method does not perform well on small data sets. Among all experimental results, the standard deviation of our method is relatively small, indicating higher stability. The experimental results show that the method proposed in this paper has good classification performance on larger sample data sets.

5. Analysis of the Resident Travel Mode Choice

The logic sampling method is a stochastic simulation method commonly used in Bayesian network inference. The logic sampling algorithm in the GENIE software (developed at the Decision Systems Laboratory, University of Pittsburgh) was used to implement the inference of the established Bayesian network. In this study, the influence of four single factors (owning a private car or not, resident age, trip time, and trip purpose) on residents’ travel mode choice is analyzed. Specifically, the selection probability of each travel mode can be obtained by inputting each state of the attribute as evidence. The reasoning results are shown in Figures 7–10.
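The principle of logic sampling can be sketched in Python on a two-node network: samples are drawn forward from the root to the leaf, samples that contradict the evidence are discarded, and the remaining ones are counted. The probabilities below are invented for illustration and are not the learned parameters of the paper; the function names are also hypothetical, and the sketch is not the GENIE implementation.

```python
import random

# Hypothetical two-node network Ow -> Wa (car ownership -> travel mode).
P_OW = {0: 0.7, 1: 0.3}
P_WA_GIVEN_OW = {
    0: {"bus": 0.6, "bike": 0.3, "car": 0.1},
    1: {"bus": 0.3, "bike": 0.1, "car": 0.6},
}

def draw(dist, rng):
    """Draw one state from a discrete distribution given as {state: probability}."""
    u, acc = rng.random(), 0.0
    for state, p in dist.items():
        acc += p
        if u < acc:
            return state
    return state  # guard against floating-point rounding in the last bucket

def logic_sampling(evidence_mode, n_samples=100_000, seed=1):
    """Estimate P(Ow | Wa = evidence_mode) by forward sampling with rejection."""
    rng = random.Random(seed)
    kept = {0: 0, 1: 0}
    for _ in range(n_samples):
        ow = draw(P_OW, rng)                # sample the parent first
        wa = draw(P_WA_GIVEN_OW[ow], rng)   # then the child given the parent
        if wa == evidence_mode:             # keep only samples matching the evidence
            kept[ow] += 1
    total = sum(kept.values())
    return {ow: c / total for ow, c in kept.items()}

posterior = logic_sampling("bike")
print(posterior)  # Bayes' rule gives P(Ow = 0 | bike) = 0.21 / 0.24 = 0.875
```

The estimate approaches the exact value as the number of samples grows. When the evidence is rare, most samples are rejected, which is the main cost of this method.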

Figure 7 shows that people who have private cars mainly choose to travel by car, and residents who have no private cars mainly choose public transport. Regarding the choice to use the bike-sharing system, we found that the public bicycle selection probability of residents without a private car was higher than that of residents who own a private car. Figure 8 shows that the proportion choosing public transport is very high at every age, so public transportation appears to be the main method of travel for residents of Xi’an. At the same time, it can be seen that residents in the age group of 18–35 make up the largest group of people who choose the bike-sharing system, followed by those in the 36–55 age group. Older residents (>55 years) rarely chose public bicycles for travel, likely because older people may not have the physical strength to cycle.

Figure 9 shows that, in both peak hours (morning and evening) and off-peak hours, the main methods of travel for Xi’an residents are bus and subway. During the morning and evening peak hours, the probability of public bicycle travel is 5.6% higher than during other periods; in other words, residents use the bike-sharing system mostly during the morning and evening peak hours. Figure 10 shows that the selection of the bike-sharing system was affected more by the rigid travel needs of Xi’an residents than by elastic purposes.

Next, states of resident age and travel time are jointly selected as the evidence input, and the combined influence of the two factors on the travel mode is analyzed. Taking [Ag = 2] [Pe = 1] and [Ag = 2] [Pe = 2] as the evidence input, for example, the probabilities of residents’ mode choice under the combined effect of the two factors are shown in Table 15.

The most likely choices for young adults in Xi’an, as shown in Table 15, are the public transport and the subway, whether in the morning and evening peaks or in other periods. For young adults, in the morning and evening peak, the probability of choosing a public bicycle is 4.5% higher than that of other periods, while, in other periods, the probability of choosing a car is 5.1% higher than that in the morning and evening peak. Accordingly, the trip time will affect travel mode choice for young adults, and, in the morning and evening peak, young adults have a greater chance of choosing a public bicycle that can be used flexibly.

When the evidence variable is set as [Wa5 = 1, “choosing the public bicycle”], the distribution of trip purposes and of traveler groups under this evidence can be studied. Using inference on the established Bayesian network, the trip purpose distribution is obtained as shown in Table 16.

From the results in Table 16, when a traveler chooses a public bicycle, the trip purpose is more likely to be a rigid one. The probabilities of the various traveler groups are given in Table 17, which shows that employees are most likely to choose public bicycles. Therefore, employees should be given priority in the management and operation of public bicycles.

Figure 11 shows the node probabilities of the Bayesian network when the public bicycle is chosen; it is the inference result in the case of [Wa5 = 1, “choosing the public bicycle”] and can be used to analyze the uncertainty relationship among nodes. In conclusion, the resident age, trip time, trip purposes, and other factors have a significant impact on the choice of the bike-sharing system. The public bicycle selection probability of residents in the morning and evening peak is higher than that of other periods, which is in good agreement with the fact that the main purpose of travel is dictated by rigid travel demands.

6. Conclusions

Using Xi’an city in China as a case study, this paper analyzes the behavior of the public bicycle choices of urban residents.

(1) A Bayesian network structure learning method based on the combination of mutual information, expert knowledge, and the K2 algorithm was proposed. Bayesian parameter estimation, which has a high accuracy of posterior probability, was used for parameter learning. The general K2 algorithm is a “greedy” algorithm, meaning that its results may not yield an optimal network. The K2 algorithm combined with mutual information and expert knowledge may be used in Bayesian network structure learning. A Bayesian network allows for the main factors affecting the traffic mode choice of residents to be accurately selected and for the interdependence between variables to be obtained quickly and accurately.

(2) According to the model validation, the model based on a Bayesian network has a high hit rate and high accuracy. Compared with the other five classification models, the method proposed in this paper has good classification performance.

(3) Through the establishment of the analysis model for travel mode choice based on a Bayesian network, the resident mode choice under the conditions of the traveler attributes could be accurately determined by the Bayesian network inference. Changes in the public bicycle choice under the influence of various factors (such as private car, resident age, travel time, and travel purpose) can be presented, which directly reflects the relationship between the resident travel mode choice and the influencing variables. It will be helpful in judging the behavior mechanism of urban resident choices regarding public bicycles. This study enriches the research on the travel mode choices of urban residents, as well as on the research of bicycle travel behavior and decisions. The research results may be used to analyze the personal and travel characteristics of public bicycle users, to determine the main function that the bike-sharing system offers to residents, to determine the service times and distances of this system, and to guide the future development of public bicycle usage.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this article.

Acknowledgments

This work is financially supported by the National Natural Science Foundation of China (Grant no. 51278396).