As metropolitan populations surge, government agencies often promote the construction of public transport. Unlike in rail transportation or rapid transit systems, vehicle arrival times at each station in a bus transportation system are often difficult to estimate because of metropolitan traffic congestion. Traffic status is often monitored using wireless sensor networks (WSNs). However, WSNs are typically separated from one another spatially, and recent studies have considered connecting multiple sensor networks. This study combines peer-to-peer (P2P) overlay networks with a WSN architecture to predict bus arrival times. Each bus station acts as a peer, and the stations are connected in a P2P overlay network. Each peer receives data from the sensors installed on buses to obtain the moving speed of each bus. The peers then exchange these data to predict bus arrival times at bus stations. This method can considerably increase the accuracy with which bus arrival times are predicted and can provide traffic status with high precision. Furthermore, the gathered data can be used to plan new bus routes.

1. Introduction

Over the last decade, medical advances and rapid economic development have led to a substantial increase in population, which in turn causes increased traffic. Therefore, governments have attempted to reduce the number of residents who drive cars and the corresponding environmental pollution by developing public transportation. Such measures also bring in additional urban tourism resources. One of the factors that affect whether people are willing to take public transportation is the accuracy of the arrival time when using this mode of transportation.

Residents in many cities are not familiar with public transportation services and may not know how to use them. This lack of knowledge arises because people are used to driving their own vehicles. The most serious weakness of public transportation is that it is not systematic, which makes it difficult to shorten the time required for a trip. The lack of systematic and interpretable information, such as the positions of bus stations and bus routes, introduces considerable uncertainty when taking public transportation. Passengers depend on the relevant agencies to provide this information, particularly passengers who are not proficient at finding directions and using maps. If clear and comprehensible information is provided, individuals can easily predict their travel time and plan their schedules, which makes them more willing to take public transportation to reach their destinations.

In public transportation, the arrival time of buses is the most difficult information to predict accurately. The arrival times of rail lines can be predicted more precisely because the routes of public rail transportation are fixed and not easily disturbed during operation. In contrast, buses share the same roads as general traffic, and thus the accuracy of arrival time prediction is easily affected by traffic conditions. This issue has become a major obstacle to developing bus arrival time prediction systems.

Arrival time prediction systems for public transportation are often developed with the use of modern wireless telecommunications, including wireless sensor networks (WSNs) [1–6]. A WSN is composed of a large number of inexpensive microsensor nodes that are deployed in the monitoring area [7, 8] and communicate wirelessly [9]. A WSN is easy to deploy, programmable, and dynamically reconfigurable [10, 11].

WSNs are also suitable for applications in which the network environment must be set up quickly and easily or cannot be established in advance. WSNs can gradually replace traditional sensing applications that are monitored manually, because they retrieve data efficiently and automatically through wireless network transmission.

Existing bus arrival time prediction systems transmit the data collected to centralized servers. However, this type of system has considerable overhead in bus systems in fairly developed areas. Because many buses drive on roads simultaneously, a significant amount of data is sent to the server in certain periods. The server must analyze and calculate these data to predict the arrival time of each bus. These data must be processed and sent in real time, so they cannot be delayed by any system overhead; otherwise, the prediction accuracy will be affected.

This paper proposes a method of combining a peer-to-peer (P2P) overlay network and WSN to develop a bus arrival time prediction system. A P2P overlay network is added to traditional prediction systems to allow for real-time data. Each bus is installed with a sensor, and each bus stop can receive data sent from sensors. Due to the distance limitation of sensors, the sensors on buses and the data-receiving device of a bus station form a single WSN environment. All bus stations and station termini are connected to form a P2P overlay network, which is used to transmit real-time bus information and predict bus arrival times. Through the WSN technology, bus stations retrieve data from buses and transmit these data to subsequent bus stations to estimate the bus arrival times. This approach can be a powerful tool for monitoring and predicting traffic conditions.

The advantage of using P2P overlay networks to connect bus stations is that each bus station can act as both a client and a server to exchange information about nearby buses. With this method, the data collected from buses do not need to be transmitted from bus stations to a centralized server; instead, the data are evaluated locally so that the bus stations themselves can predict the arrival time. This paper adopts an Arrangement Graph-based Overlay network (AGO) [12, 13] as the P2P overlay network because this system performs well in transmitting messages. By combining P2P overlay networks and WSNs, this study considers a novel approach to predicting bus arrival times and presents some initial results of this endeavor.

Some experiments were performed to demonstrate the performance of our bus arrival time prediction system compared to existing prediction systems; the experimental results revealed that our bus arrival time prediction system can make more accurate predictions. Although there are several factors that can affect the accuracy of our bus arrival time prediction system, such as the time of day and number of passengers, our prediction system is more convenient and provides more information for passengers than existing prediction systems, particularly paper timetables. The number of messages transmitted to the centralized server and the system overhead of the centralized service are both reduced considerably.

The remainder of this paper is organized as follows. Section 2 presents some related work on WSNs, P2P overlay networks, and AGO systems. Section 3 describes the proposed bus arrival time prediction system, and some experimental results are presented in Section 4. Finally, the conclusions of this study and potential avenues for future work are discussed in Section 5.

2. Related Work

In this section, some studies related to our proposed system are introduced. Our bus arrival time prediction system is developed by using WSNs and P2P overlay networks, and thus some properties related to these technologies are introduced.

2.1. Wireless Sensor Networks

Due to the rapid advancements in microfabrication, communication, and embedded processing, small, sophisticated electronic devices can be embedded with sensing, computing, and communicating functions. Therefore, WSNs have become a popular research topic in computer science. WSNs mainly include sensing, communicating, and computing aspects (i.e., hardware, software, and algorithms, resp.). A WSN is a network system composed of several wireless data receivers and sensors that communicate with one another wirelessly. To achieve large-scale deployment, WSN devices should be inexpensive, small, and easy to deploy and should have low power consumption [14, 15]. These devices also need sensing, programmability, and dynamic reconfiguration capabilities. Because sensors rely on batteries to supply the energy necessary for operation, their radio transmission distance is limited. Sensor nodes transmit and receive data via wireless technology, and sensor networks are largely used for short-distance data transmission to reduce power consumption.

The development of WSNs originated in military applications such as battlefield monitoring, for example, the Smart Dust research project [16] at the University of California, Berkeley (UC Berkeley), funded by the United States Defense Advanced Research Projects Agency (DARPA). Many manufacturers have followed this research direction by combining IEEE 802.15.4 low-rate wireless personal area networks (LR-WPANs) and ZigBee [17, 18].

Many standard applications use common wireless communication technologies, including automated home devices, online shopping [19], environment safety and control, and personal health care [20, 21].

2.2. Peer-to-Peer Overlay Network

In the recent past, an information system was typically a single server that handled all requests from clients and all responses to them. Clients had to first talk to the server to establish communication channels and then sent requests to the server for processing. If there was any information that needed to be communicated between clients, all information also needed to be sent to the server. This scheme is referred to as the client-server architecture. However, the service performance of the overall system is limited by the computing power of the server and the bandwidth of external networks, which can easily reduce the performance of system services. All clients are connected to a single server, and thus if the administrator wants to improve the system performance, the only option is to upgrade the central server and increase the bandwidth of the external network.

Therefore, P2P overlay networks emerged. A P2P overlay network is an abstract network that ignores physical network connections. Each node on the network is treated as an individual peer, and peers are assumed to be able to interconnect freely. These individual peers are connected using some specific topology to build an overlay network, in which a single overlay link may span several physical network connections. Some desired effects can be achieved through an overlay network of interconnected peers. All peers in the overlay network can both act as a client to obtain data from other peers and play the role of a server to provide data to other peers. One important result is that all peers can pool their resources, including network bandwidth, storage space, and computing power, which can increase the flexibility of the network considerably [22].

A P2P overlay network can be structured or unstructured. A structured P2P overlay network, such as Chord [23, 24], Pastry [25], and Kademlia [26], often uses a distributed hash table (DHT) to determine connections. In contrast, an unstructured P2P overlay network, such as Gnutella [27], does not have this connection relation. Thus, the unstructured P2P overlay network often uses flooding and time-to-live (TTL) to make queries on the overlay network [28, 29].
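The flooding query with a TTL mentioned above can be illustrated with a minimal sketch. The function name `flood_query`, the adjacency-dictionary representation, and the message-counting convention are assumptions made for this illustration; they are not taken from Gnutella or from the cited studies.

```python
from collections import deque


def flood_query(neighbors, start, target, ttl):
    """Flood a query through an unstructured overlay.

    neighbors: dict mapping each peer to a list of its neighbor peers.
    Each peer forwards the query to all of its neighbors until the TTL
    reaches zero. Returns (found, messages_sent).
    """
    visited = {start}
    frontier = deque([(start, ttl)])
    messages = 0
    found = start == target
    while frontier:
        peer, remaining = frontier.popleft()
        if remaining == 0:
            continue  # TTL exhausted: this peer does not forward further
        for nxt in neighbors[peer]:
            messages += 1  # every forwarded copy costs one message
            if nxt in visited:
                continue  # duplicate query, dropped by the receiver
            visited.add(nxt)
            if nxt == target:
                found = True
            frontier.append((nxt, remaining - 1))
    return found, messages


# Hypothetical four-peer chain a - b - c - d.
overlay = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(flood_query(overlay, "a", "d", ttl=2))  # TTL too small to reach d
print(flood_query(overlay, "a", "d", ttl=3))  # d is reached
```

The sketch shows the trade-off that motivates structured overlays: a small TTL can miss the target, whereas a large TTL multiplies the number of forwarded messages.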

2.3. Arrangement Graph-Based Overlay

The AGO is a P2P overlay network that was developed based on the arrangement graph [30]. The arrangement graph is an undirected graph. Two parameters, $n$ and $k$, are used to define the arrangement graph $A_{n,k}$. Each peer in the arrangement graph has a unique peer ID used for identification. The parameter $k$ is the number of digits in the peer IDs, $n$ is the range of each digit, and $1 \le k \le n-1$. Some properties of the arrangement graph are as follows: there are $n!/(n-k)!$ peers, the degree of each peer is $k(n-k)$, and the diameter of an arrangement graph is $\lfloor 3k/2 \rfloor$. Moreover, the ID of each peer in the arrangement graph differs from the ID of each of its neighbors in exactly one digit.
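The properties listed above can be checked on a small instance. The sketch below uses the standard arrangement graph notation, with $n$ as the digit range and $k$ as the number of digits, and assumes that the digits within one ID are distinct, as in the standard definition. The function names are hypothetical and are not part of the AGO implementation.

```python
from itertools import permutations
from math import factorial


def num_peers(n, k):
    """Number of peers: n! / (n - k)!."""
    return factorial(n) // factorial(n - k)


def degree(n, k):
    """Number of neighbors of every peer: k(n - k)."""
    return k * (n - k)


def diameter(n, k):
    """Diameter of the arrangement graph: floor(3k / 2)."""
    return (3 * k) // 2


def all_peer_ids(n, k):
    """All IDs: k distinct digits drawn from 1..n, order significant."""
    return list(permutations(range(1, n + 1), k))


def are_neighbors(a, b):
    """Two peers are adjacent when their IDs differ in exactly one digit."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


# Small example with n = 4 and k = 2.
ids = all_peer_ids(4, 2)
print(len(ids), num_peers(4, 2))
print(sum(are_neighbors((1, 2), other) for other in ids), degree(4, 2))
print(diameter(4, 2))
```

Enumerating the IDs and counting the neighbors directly gives the same values as the closed-form expressions, which is a quick sanity check on the parameter choice before an overlay is sized.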

Because the AGO was developed based on the arrangement graph, the AGO inherited some arrangement graph properties. Peers in the AGO establish their neighbor tables according to properties of the arrangement graph and the main information of their neighbors. There are three main functions in AGO: joining, departing, and routing. Peers initiate the joining action to join the AGO through the bootstrap peer. There is a waiting peer pool in the bootstrap peer to temporarily record information of peers that already exist in the AGO. These records are provided to the new peers joining the AGO. Moreover, the peer record in the waiting peer pool is replaced when its neighbor table is full.

When a peer in the AGO departs, it sends announcement messages to its neighbors to maintain the AGO’s accuracy. Furthermore, peers can discover information on other peers using the routing process. The AGO utilizes one of the properties of the arrangement graph, the digit difference, to perform routing actions. A replica mechanism is also used to increase routing performance. Finally, the AGO can self-extend or self-shrink its scale by adjusting the values of the parameters according to the number of peers.

3. Bus Arrival Time Prediction System

This section introduces the proposed method, which is a two-layer structure. One layer is a P2P overlay network that is used to transmit data between bus stations. The other layer is a WSN used by bus stations to retrieve data from buses. This structure allows bus arrival times to be estimated more efficiently.

3.1. System Architecture

WSNs have a wide range of applications, including public transportation. For this study, devices were installed on bus stations to retrieve data, and sensors were installed on buses to supply traffic conditions. When buses drive into the wireless coverage of bus stations, bus stations collect data from buses and transmit these data to neighboring bus stations. These data can provide real-time transport information, such as time and location, and can be used to estimate the expected arrival time for neighboring bus stations. In addition, some real-time transport information, such as traffic conditions or emergencies, can also be transmitted to other places to improve traffic safety and accident-handling efficiency.

This study utilizes the functional characteristics of WSNs and P2P overlay networks for urban public transportation to establish a multifunctional intelligent information system. A WSN is used to establish a framework for a multifunctional information platform that provides electronic toll collection, traffic monitoring, traffic statistics, and traffic and emergency notification systems.

The proposed system is composed of two parts, as shown in Figure 1. The top layer is a P2P overlay network, which is formed using an AGO system. This network is responsible for delivering the data collected from sensors to other bus stations. The bottom layer consists of WSNs composed of bus stations and buses. Bus stations receive data from sensors to judge the conditions of moving buses. There are three key components concerning actual practices: the technologies for combining the P2P overlay network and WSN, the designs for efficiently transmitting and predicting messages and user-friendly interfaces, and the indicators of performance evaluation. These components make the system more modular and can allow for cost savings and increased revenue. These components were also our goals during development of the proposed system.

3.2. Method

In Figure 1, the upper layer is a P2P overlay network, which is built using an AGO system. The AGO requires only a small modification to be used in our bus arrival time prediction system. The GPS location of each bus station must be recorded so that the P2P overlay network can be established according to the station locations. Furthermore, a user interface is designed so that users can see when buses will arrive; this information is shown on information boards at bus stations. Data-retrieving devices at bus stations act as peers in the P2P overlay network and are responsible for receiving information sent from sensors on buses.

When the prediction system starts every morning, bus stations start to join the P2P overlay network. The stations link to other neighboring bus stations. Then, when a bus station detects a bus driving in its coverage, the bus station starts to collect data sent from the bus, such as the speed or location of the bus. After the bus station collects these data, it analyzes these data and sends them to its neighboring bus stations to predict bus arrival times for other bus stations. The bus station that receives data from other bus stations analyzes the received data and provides the predicted arrival time on information boards. The bus station also records the actual time in which the bus arrives. At night, when buses are out of service, bus stations upload all of the data for the day to a centralized server for storage and analysis. System operators can comprehensively analyze these data to correct the prediction system and make the system more accurate. Furthermore, the bus department can use the data to adjust the bus headway.
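The daily workflow described above can be sketched as a small station peer. The class names `BusReport` and `StationPeer`, their fields, and the simple distance-over-speed estimate are assumptions made for illustration; the paper does not specify message formats or the exact prediction formula.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class BusReport:
    bus_id: str
    speed_kmh: float
    sent_at_s: float  # seconds since the start of service


@dataclass
class StationPeer:
    name: str
    distance_from_prev_km: float
    next_station: Optional["StationPeer"] = None
    board: dict = field(default_factory=dict)  # bus_id -> predicted arrival (s)
    log: list = field(default_factory=list)    # records kept for the nightly upload

    def on_bus_detected(self, report):
        """A bus entered this station's wireless coverage."""
        self.log.append(("arrived", report.bus_id, report.sent_at_s))
        self.board.pop(report.bus_id, None)  # the bus is here, clear its prediction
        if self.next_station is not None:
            self.next_station.on_peer_message(report)  # sent over the overlay

    def on_peer_message(self, report):
        """Predict the arrival from data forwarded by the previous station."""
        travel_s = self.distance_from_prev_km / report.speed_kmh * 3600.0
        self.board[report.bus_id] = report.sent_at_s + travel_s

    def nightly_upload(self, server_store):
        """Send the day's records to the centralized server, then reset."""
        server_store.extend(self.log)
        uploaded = len(self.log)
        self.log.clear()
        return uploaded


# Hypothetical two-station route with stations 1 km apart.
a = StationPeer("A", 0.0)
b = StationPeer("B", 1.0)
a.next_station = b
a.on_bus_detected(BusReport("bus-1", speed_kmh=30.0, sent_at_s=100.0))
print(b.board)  # predicted arrival of bus-1 at station B
```

In this sketch, the centralized server is touched only by `nightly_upload`; all real-time prediction happens in `on_peer_message` at the receiving station.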

A diagram of our method is shown in Figure 2. Bus stations collect data from sensors on buses and send these data to neighboring bus stations. When a bus station joins the P2P overlay network, it is connected to nearby bus stations according to its estimated location. Data are transmitted between bus stations over connections in the AGO. When the next bus station receives data from the preceding bus stations, it can calculate the probable arrival time of the bus according to the distance, the speed of the bus, and the average speed at that location. Therefore, whenever a bus arrives at or leaves a bus station, the bus station is required to send messages to the next bus station to enable more accurate prediction.
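One way to combine the three inputs named above (distance, bus speed, and average speed at the location) is a weighted blend of the two speeds. The paper does not give the formula, so the function `predict_travel_seconds` and its `weight` parameter are purely illustrative assumptions.

```python
def predict_travel_seconds(distance_km, bus_speed_kmh, avg_speed_kmh, weight=0.5):
    """Estimate the travel time to the next station.

    weight = 1.0 trusts only the speed reported by the bus;
    weight = 0.0 trusts only the historical average speed at that location.
    The blend itself is an assumption, not the authors' formula.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    blended_kmh = weight * bus_speed_kmh + (1.0 - weight) * avg_speed_kmh
    if blended_kmh <= 0.0:
        raise ValueError("blended speed must be positive")
    return distance_km * 3600.0 / blended_kmh


# A bus reporting 40 km/h on a segment whose average speed is 20 km/h,
# with 1 km left to the next station.
print(predict_travel_seconds(1.0, 40.0, 20.0))
```

Adding the returned value to the departure time recorded by the preceding station gives the predicted arrival time shown on the information board.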

Bus stations are connected with those near them so that two physically adjacent bus stations are not far apart in the P2P overlay network. Furthermore, the main purpose of the P2P overlay network is to enable bus stations to directly exchange data with other bus stations without any mediating servers. This ability can decrease the system overhead of the centralized server considerably.
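Selecting nearby stations from recorded GPS locations can be sketched with the haversine great-circle distance. How the authors map station locations to AGO peer IDs is not described, so this sketch shows only a plain nearest-neighbor selection; the function names, station names, and coordinates are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(a, b):
    """Great-circle distance between two (latitude, longitude) pairs in degrees."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def nearest_stations(stations, name, count=2):
    """Return the names of the `count` stations closest to station `name`."""
    here = stations[name]
    ranked = sorted(
        (haversine_km(here, pos), other)
        for other, pos in stations.items()
        if other != name
    )
    return [other for _, other in ranked[:count]]


# Hypothetical stations along one street.
stations = {
    "A": (25.00, 121.50),
    "B": (25.00, 121.51),
    "C": (25.00, 121.53),
    "D": (25.00, 121.60),
}
print(nearest_stations(stations, "B"))
```

A joining station could use such a ranking to choose which existing peers to link to, so that overlay neighbors are also physical neighbors along the route.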

4. Experimental Results

In this section, simulation results are presented. The simulations were performed according to the methods described above, and the performance of our prediction system can be assessed from their results. The simulation data were generated according to the assumptions stated in this paper, and the same data were used for both the traditional system and our prediction system.

4.1. Accuracy of the Arrival Time Prediction

The accuracy of a bus arrival time prediction system is reflected in how much passengers use it. Passengers care about the accuracy of prediction systems, and accuracy affects their decision to use one. Passengers will use a prediction system if they consider it to be trustworthy, and the bus department gains revenue from their use of the system. Therefore, one of the aims of the simulations is to determine the accuracy of our bus arrival time prediction system.

The simulations cover an 18 h service day: buses are assumed to begin service at 6:00 and end service at 24:00. Because our government limits the speed to a maximum of 40 km/h, the speeds of buses are assumed to range from 20 to 40 km/h. Furthermore, the driving speeds of buses are affected by traffic conditions: during peak hours, driving speeds are lower than during nonpeak hours.

Furthermore, the duration of the red phase of traffic lights ranges from 30 to 120 s. In particular, at crossroads that have heavy traffic during peak hours, the red phase may last 120 s. At crossroads that do not have heavy traffic, or outside peak hours, the red phase can be shorter than 120 s, but it is at least 30 s.
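The speed and traffic-light assumptions stated above can be turned into a small input generator for a simulation. The set of peak hours, the uniform distributions, the split of the speed range at 30 km/h, and the simplification that a bus waits for the full red phase at every light are assumptions of this sketch; the paper gives only the ranges.

```python
import random

PEAK_HOURS = {7, 8, 9, 17, 18, 19}  # assumed; the paper names only 9:00 and 18:00


def sample_bus_speed_kmh(hour, rng):
    """Draw a bus speed in the 20-40 km/h range, slower during peak hours."""
    if hour in PEAK_HOURS:
        return rng.uniform(20.0, 30.0)
    return rng.uniform(30.0, 40.0)


def sample_red_light_s(hour, heavy_traffic, rng):
    """Draw a red-phase duration in the 30-120 s range."""
    if heavy_traffic and hour in PEAK_HOURS:
        return 120.0  # busy crossroads at peak hours use the longest red phase
    return rng.uniform(30.0, 120.0)


def simulate_segment_s(distance_km, hour, lights, rng):
    """Travel time between two stations: driving time plus red-light waits.

    lights: one boolean per traffic light, True if the crossroads is busy.
    """
    drive_s = distance_km / sample_bus_speed_kmh(hour, rng) * 3600.0
    wait_s = sum(sample_red_light_s(hour, heavy, rng) for heavy in lights)
    return drive_s + wait_s


rng = random.Random(0)  # fixed seed so that a run can be repeated
for hour in (6, 8, 14, 18):
    print(hour, round(simulate_segment_s(1.0, hour, [True, False], rng), 1))
```

Running the same generated inputs through both a centralized predictor and the station-based predictor matches the setup in which both systems use identical data.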

Another factor that can affect the accuracy of the predicted time is the number of passengers. Buses spend more time at bus stations if there are more passengers. Because passengers need time to board and alight, each bus stops at a bus station for a different period of time. During peak hours, there are more passengers, and thus the waiting time is longer than usual. Moreover, some bus stations require more time because they are in hot spots, where more passengers use buses. However, this factor does not need to be modeled separately in our bus arrival time prediction system because the system already accounts for it: bus stations record both the bus arrival and departure times. When bus stations transmit data to other bus stations, the received data already reflect this factor. Therefore, in our prediction system, the time passengers take to board and alight from buses does not need to be assumed.

Figure 3 presents the accuracy of our bus arrival time prediction system as a percentage. The x-axis presents the time of day during which buses are in service, which ranges from 6:00 to 24:00. The y-axis presents the percentage accuracy of our system. A higher percentage indicates a more accurate system.

The accuracy of our bus arrival time prediction system is over 76%. The accuracy is affected during peak hours, such as 9:00 and 18:00. From 6:00 to 9:00, people start to go to work, and traffic conditions are disturbed. Similarly, at approximately 18:00, people leave work to go home; at this time, traffic conditions are complex and difficult to predict. Therefore, predicting bus arrival times is more difficult in these two time periods. However, our prediction system can still achieve an accuracy over 76% in these periods, illustrating that our bus arrival time prediction system can consider the complex factors that affect traffic conditions.

Overall, the accuracy of our bus arrival time prediction system is very good. The accuracy reaches 76% in peak hours and 85% to 90% in nonpeak hours because our prediction system exchanges data on buses directly between bus stations. Then, bus stations can calculate and analyze the bus arrival times according to these data and traffic conditions.

4.2. Amount of Messages Transmitted to Server

In traditional prediction systems, the data on buses and traffic conditions are sent to a centralized server continuously. Transmitting these data produces a large number of messages and consumes considerable bandwidth. The centralized server must receive and analyze many messages, and hence it experiences a heavy load.

However, in our bus arrival time prediction system, the data on buses and traffic conditions are analyzed and sent at night when buses are out of service, and the arrival time is predicted by bus stations according to real-time data. The centralized server in our prediction system is not responsible for real-time prediction. Therefore, the arrival time prediction is not affected by the loading of the centralized server. Figure 4 presents the number of messages sent to the centralized server.

Figure 4 illustrates that traditional systems transmit data collected from buses whenever buses are in operation. Therefore, many messages are transmitted to the centralized server simultaneously. Because the centralized server also needs to calculate the predicted arrival time of buses, it incurs substantial system overhead. Consequently, the performance of arrival time prediction is affected by the system overhead of the centralized server.

Our system does not transmit data collected during the day, and thus bus stations need to be responsible for predicting arrival time. Collected data are transmitted to the centralized server at night because the centralized server is only responsible for storing these data and comprehensively analyzing them for the bus department’s reference. In this manner, the number of messages transmitted to the centralized server is decreased considerably, the system overhead of the centralized server is also decreased, and bus stations can calculate the arrival time of buses in real time.
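The difference in server load between the two designs can be illustrated by counting the messages that reach the centralized server in each hour. The report counts, the number of stations, and the assumption that each station uploads its daily data as a single batch at hour 24 are hypothetical and do not reproduce the data in Figure 4.

```python
def server_messages_by_hour(reports_by_hour, num_stations, centralized, upload_hour=24):
    """Count the messages received by the centralized server per hour.

    reports_by_hour: dict mapping an hour to the number of bus reports collected.
    centralized=True  -> every report is forwarded to the server immediately.
    centralized=False -> stations predict locally and upload one batch at night.
    """
    if centralized:
        counts = dict(reports_by_hour)
        counts.setdefault(upload_hour, 0)
        return counts
    counts = {hour: 0 for hour in reports_by_hour}
    counts[upload_hour] = num_stations  # one batched upload per station (assumed)
    return counts


# Hypothetical morning load for a route with 10 stations.
reports = {6: 40, 7: 120, 8: 150}
print(server_messages_by_hour(reports, 10, centralized=True))
print(server_messages_by_hour(reports, 10, centralized=False))
```

Under these assumptions, the server in the station-based design receives no messages during service hours, so its load cannot delay real-time prediction.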

5. Conclusions and Future Work

Existing arrival time prediction systems do not provide passengers with information about whether their buses will depart or arrive soon because of their low accuracy and limited deployment. For passengers, the arrival time of buses is important because passengers can decide whether they should wait for the next bus or choose other means of public transportation.

This study develops a bus arrival time prediction system that combines a P2P overlay network and WSN. The data obtained at bus stations and from sensors on buses are used to accurately predict arrival time. Data collected from each bus are logged into the system and used to optimize the routes and frequency of buses. Bus routes and schedules can be optimized to meet the actual needs of passengers according to data collected from bus stations. The proposed system can reduce costs and increase revenue for bus departments as well as improve passenger satisfaction.

The experimental results demonstrate that our bus arrival time prediction system can achieve high accuracy using real-time exchange of data and traffic conditions between bus stations. Furthermore, the experimental results demonstrate that our prediction system can decrease the system overhead of the centralized server considerably. The arrival time was predicted by bus stations according to the real-time data they received.

With the advance of microfabrication technology and the development of wireless transmission technology, WSNs have been used in a wide range of applications. WSNs can also be used in electronic toll collection, traffic, and road information applications to improve safety, convenience, and efficiency. The investment trends of advanced countries indicate that telematics will be an increasingly used wireless technology.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.