Mathematical Problems in Engineering
Volume 2012 (2012), Article ID 530561, 21 pages
Guidance Compliance Behavior on VMS Based on SOAR Cognitive Architecture
1College of Management and Economic, Tianjin University, Tianjin 300072, China
2Transportation Planning Center, Tianjin Municipal Engineering Design and Research Institute, Tianjin 300051, China
3School of Management, Hebei University of Technology, Tianjin 300130, China
Received 13 July 2012; Accepted 18 September 2012
Academic Editor: Baozhen Yao
Copyright © 2012 Shiquan Zhong et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
SOAR, a cognitive architecture named for state, operator, and result, is adopted in this paper to portray drivers' guidance compliance behavior on variable message signs (VMS). VMS represents traffic conditions to drivers by three colors: red, yellow, and green. Based on the multiagent platform, SOAR is introduced to design the agent, with a detailed description of the working memory, long-term memory, decision cycle, and learning mechanism. With the fixed decision cycle, the agent transforms state through four kinds of operators: choosing a route directly, changing the driving goal, changing the temper of the driver, and changing the predicted road condition. The agent learns from the process of state transformation by chunking and reinforcement learning. Finally, a computerized simulation program is used to study the guidance compliance behavior. Experiments are simulated many times under a given simulation network and conditions. The results, including the comparison between guidance and no guidance, the state transition times, and the average chunking times, are analyzed to further study the laws of guidance compliance and the learning mechanism.
1. Introduction
Reality indicates that the effects are very limited if one simply relies on constructing new roadways, adding new traffic facilities, or adopting traditional management methods to solve traffic problems. Intelligent traffic systems (ITSs), put forward as early as 30 years ago, are treated as one of the most important measures to solve traffic issues. Among the various means and methods involved in ITS, traffic guidance can help drivers make more efficient route choices and reduce the anxiety and stress of the trip, which makes it one of the means that can truly improve system performance and ensure traffic safety. Guidance compliance behavior is a basic problem in traffic guidance as it represents the driver’s perception of the guidance information, the degree of trust in the information, and the regular pattern by which that degree varies. The guidance compliance rate is the external manifestation of the driver’s guidance compliance behavior, and an effective guidance system depends on a high compliance rate among its users.
In the past, the planning and design of traffic guidance systems were normally based on the assumption that drivers completely obey the guidance or choose routes according to the compliance rate set by the system [4–7], or on the assumption that drivers treat expected benefit as their goal and are assigned to the network according to user equilibrium after system implementation [8–12]; however, these assumptions do not hold. First, the guidance compliance rate is not static but changes as drivers gain further experience with the guidance system after it is implemented. Second, guidance compliance behavior that treats expected benefit as the only goal cannot reflect drivers’ real decision-making process. Whether or not drivers comply with the guidance information is related not only to the expected travel time and trip cost but also to their perception of the guidance information, their familiarity with the road network, their attitude towards risk, the release modes of guidance information, and the way the information is displayed. Finally, the decision-making process of drivers in previous guidance system planning and design is viewed as absolutely rational: drivers are assumed to have full awareness of the guidance information, a clear perception of a good or a bad trip, and a complete understanding of the current road network condition. However, many studies have shown that the behavior of drivers in practice is boundedly rational. Thus, traditional approaches to guidance system planning and design cannot effectively reflect how the release modes and display approaches of guidance information and the position of VMS in a complex traffic environment affect drivers’ thinking and decision-making processes, nor the resulting changes in trip demand and in the spatial and temporal distribution of traffic flow.
Without accurate estimates of the compliance rate and its evolutionary process, drivers cannot perform effectively even if the system has provided good guidance suggestions, so the expected guidance effect cannot be achieved. In fact, many investigations have shown that the guidance compliance rate is far lower than the system expects [13–17]. Thus, Erke et al. have pointed out that it is not advisable, in system planning and design, to optimize on the basis of a preset rate close to 100% in order to maximize utility. For these reasons, many scholars have realized the significance of research into guidance compliance behavior for planning and designing guidance systems. Koutsopoulos and Xu and Ozbay and Bartin have stated that the effects of travelers on both the guidance compliance rate and the way the rate changes must be considered if a guidance system is to be useful. Adler has indicated that the successful implementation of the advanced traveler information system (ATIS) and in-depth research on traffic flow models under guidance information are both very important in studying traffic guidance compliance behavior.
2. Literature Review
The traditional theoretical guidance methods and the methods of handling the compliance rate in guidance system design can be roughly divided into two categories. The first presupposes the guidance compliance rate: Thakuriah and Sen and Wang et al. assumed that drivers completely obeyed the guidance when simulating various guidance strategies, Deflorio set the rate as a constant between 0 and 1, and Yin and Yang set the rate to match the assumption that drivers spend less travel time when they obey the guidance than when they do not. The second avoids the guidance compliance rate by converting the problem into one based on random utility theory and user equilibrium theory. Methods of this kind include the multinomial probit model (MNP) [8, 9], the theoretical model based on the generalized extreme value (GEV), and the stochastic user equilibrium model (SUE) [11, 12].
Many surveys show that the real guidance compliance rate deviates severely from the expected rate after the guidance system is implemented [13–17], which challenges the traditional methods of handling the compliance rate. Hence, many scholars have realized that studying the guidance compliance rate of groups requires studying guidance compliance behavior at the individual level [18–20]. The three main methods of studying guidance compliance behavior are surveys, experiments, and simulation models.
2.1. The Method Based on Survey
The most direct way to study compliance behavior under a specific guidance system is to survey the user groups and extract the factors affecting the actual compliance rate, which can then be analyzed to guide the planning and design of the guidance system. The two main survey methods are questionnaires and network monitoring. Bonsall and Joint investigated 100 drivers with vehicle-mounted guidance systems and found that 30% of the drivers did not comply with the guidance information when they were familiar with the road network, while 90% complied when they were unfamiliar with it. Cummings investigated 20 variable message systems in Europe and discovered that only 4–7% of the drivers obeyed the guidance in normal conditions and only 13% in special conditions; in addition, 5% of the drivers did not understand the information shown on the VMS, and over 10% misunderstood it. Tarry and Graham monitored the VMS near Birmingham through the network and found that 27–40% of the drivers obeyed the guidance when the VMS showed that an accident had happened ahead, but only 2–5% did so when the VMS displayed traffic jams without reporting the reasons. Swann et al. investigated the VMS near the Forth estuary in Scotland and found that 16% of the drivers diverted to the recommended path when the VMS showed traffic congestion ahead. Erke et al., through a field investigation at two motorway sites, found that about 20% of the drivers followed the recommendation, and almost 100% did so when the VMS showed that the road ahead was closed. Zhou and Wu analyzed 497 valid questionnaires in Beijing and found that the proportions of drivers who would change the route, might change, would not change, and did not know what to do were 16.9%, 65.4%, 11.5%, and 6.2%, respectively. Chen et al. and Mo and Yan studied the parking guidance systems in Nanjing and Shanghai and concluded that drivers’ behavior in discovering, understanding, and complying with the guidance information varied considerably with their personal attributes, travel characteristics, and parking lot selection.
The above investigation results show that the compliance rate is normally low for both vehicle-mounted guidance systems and VMS. The main reasons for the low rate have been analyzed as follows: (1) drivers do not need the guidance information because they are familiar with the roads; (2) drivers do not notice the guidance information; (3) drivers do not understand the guidance information; (4) drivers do not believe the guidance information; (5) the information is received so late that drivers have already chosen their route; (6) the readable distance of the VMS is too short, so drivers have no time to read it and thus miss the guidance information [13–17].
2.2. The Method Based on Experiment
The experiment-based method sets up different guidance environments in the road network and analyzes the drivers’ responses in order to find the laws governing the behavior and measures that can affect it. Allen et al. observed drivers in experiments with different origin-destination (OD) pairs and different road congestion conditions, and the results showed that the compliance rate was high (more than 70%) when a route suggestion strategy was used. However, they did not consider that the recommended route itself might become congested and thereby influence guidance compliance behavior, which is very common in practice. Srinivasan and Jovanis used the Designer Workbench model of Coryphaeus Corporation to create software for developing experimental environments, in which different display approaches for guidance information were tested on 10 participants; they found that the display approach with the highest compliance rate was the one using a countdown progress bar to show the distance between the current location and the next intersection. Chen and Paul had 99 participants take part in a continuous 20-day experiment in a computer-simulated road network and found that the factors affecting drivers’ guidance compliance rate were the characteristics of the guidance information (e.g., whether the information showed the recommended path, and whether the road was connected or close to a highway), the characteristics of the drivers (e.g., age, gender, and educational level), and whether the information provided the reasons for accidents or congestion. Wachinger and Boehm-Davis found that drivers preferred different display modes and seemed more willing to comply with their favorite one. Adler divided 80 participants into 4 groups of 20 and ran two-factor measurement experiments in a simulated road network, with each participant repeating the experiment 15 times. The results showed that drivers unfamiliar with the road network had higher compliance rates than familiar ones because the familiar drivers benefited only slightly from the guidance information.
2.3. The Method Based on Simulation Model
With the rapid development of computer technology, researchers have begun using simulation models to study the guidance compliance behavior of drivers and have achieved some results. Lu and Tan proposed a complexity model of the guidance compliance rate based on the Logit traffic assignment model and analyzed the changing properties of the rate through simulation. Huang et al. proposed a stochastic user equilibrium model to study the changing process of the compliance rate in ATIS. In recent years, increasing attention has been paid to multiagent simulation techniques for studying the guidance compliance behavior of drivers. Adler and Blue studied drivers’ guidance compliance behavior from the perspective of cooperative games by simulating the interaction of various agents in the traffic system. Wahle et al. proposed a two-layer agent framework to study the guidance compliance behavior of drivers; the first layer represented their perception of and reaction to guidance information, and the second layer described their decision-making process. Dia added the beliefs and capacities of drivers and various behavioral rules to agents when studying the reactions of drivers to different traffic information and proposed an agent framework with cognitive functions. Based on this architecture, the changing process of guidance compliance behavior was studied.
3. Research Motivation
Current research on guidance compliance behavior provides many methods and a basis for designing, implementing, and evaluating traffic guidance systems, and it also promotes the development of intelligent traffic systems, but certain shortcomings remain. First, it is clearly infeasible to presuppose the guidance compliance rate without scientific analysis because so many factors influence the rate. The approach that converts the rate into a random-utility or user-equilibrium formulation is difficult to solve or deviates from the actual situation because of model complexity and too many assumptions (e.g., travelers are assumed to be fully rational), and much work remains before it can be applied in practice. Second, the survey-based method can only roughly reflect drivers’ perception of guidance information; moreover, their guidance compliance behavior changes over time, so it is not easy to analyze the internal causes of guidance compliance behavior scientifically and accurately by relying solely on surveys. Third, simulation is a good way to study guidance compliance behavior, but developing a realistic guidance simulator costs too much. It is also difficult to recruit volunteers to take part in an experiment lasting several months, so the simulation can only be run over a relatively short period. It is also worth studying whether the memory, thinking, and decision-making reflected by such an intensive simulation are consistent with real conditions. Fourth, current simulation models cannot reflect the actual process of guidance compliance behavior because studies of how drivers perceive, recognize, and remember information, of decision-making and feedback, and of the impact of these activities on the behavior are all insufficient.
As one of the three cognitive architectures, SOAR is based on chunking theory. It uses rule-based memory to access search control knowledge and operators and finally achieves general problem solving. This paper adopts SOAR to build the cognitive process model of guidance compliance behavior, for the following reasons. First, as mentioned above, guidance compliance behavior is a dynamic learning process that changes over time. SOAR can learn from experience: it can remember how it solved a previous problem and then use that experience and knowledge in subsequent problem-solving tasks. It can dynamically organize the accessible knowledge to make decisions and can also set subgoals dynamically if the knowledge is incomplete or inconsistent with the decision. This is very similar to the thinking and decision-making involved in drivers’ guidance compliance behavior. For example, a driver organizes and analyzes the received knowledge according to his own judgment of the current road conditions and the past accuracy of the guidance information in order to decide, under his current goal, whether to comply with the guidance. If the driver receives guidance information, is not sure whether it is correct, and is unfamiliar with the road network, he cannot decide how to minimize travel time with the accessible knowledge, so he is likely to set a subgoal of testing the guidance information. Under this condition, drivers pay more attention to the accuracy of the guidance information and are often more sensitive to delay on the recommended path. Second, as a mature cognitive theory, SOAR has a thoroughly described content and framework, including in-depth descriptions of perception, memory, the decision-making process, and the learning mechanism; thus, SOAR portrays guidance compliance behavior with a more realistic architecture. Third, SOAR has been used successfully to simulate different aspects of human behavior, such as team behavior, decision-making in virtual games, and the decision-making behavior of pilots, and many scholars have provided empirical support [34, 35]. Thus, SOAR provides a good approach to studying traffic guidance compliance behavior.
Today, the study of traffic guidance compliance behavior is receiving more and more attention as traffic guidance systems develop. Researchers have gradually become aware that more detailed simulations of traffic behavior are needed. Studying guidance compliance behavior on a multiagent platform that integrates multiple disciplines is a hot topic in this emerging field. In this paper, based on the multiagent platform, the SOAR architecture is used to describe the driver agent, and the empirical method is combined with computer simulation to study the guidance compliance of drivers.
4. SOAR Cognitive Architecture
SOAR is a general intelligent architecture developed in 1987 by Laird et al. It is a cognitive architecture with a wide range of applications, and it mainly focuses on knowledge, thinking, intelligence, and memory. SOAR is constructed on the assumption that all goal-oriented behavior can be cast as selecting and applying operators to a state. A state is a representation of the current problem-solving situation; an operator transforms a state (makes changes to the representation) and produces a new state. A goal is a desired outcome of the problem-solving activity. As SOAR runs, it continually tries to apply the current operator and select the next operator (a state can have only one operator at a time) until the goal has been achieved.
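The select-and-apply loop described above can be illustrated with a minimal Python sketch. It is not the SOAR implementation: the names `run_until_goal`, `propose`, and `is_goal` are hypothetical, and taking the first proposed operator is an assumed placeholder for SOAR's preference-based selection.

```python
from typing import Callable, Dict, List, Optional

State = Dict[str, object]            # a state is a set of feature/value pairs
Operator = Callable[[State], State]  # an operator maps a state to a new state


def run_until_goal(initial: State,
                   propose: Callable[[State], List[Operator]],
                   is_goal: Callable[[State], bool],
                   max_steps: int = 100) -> Optional[State]:
    """Apply one operator per step until the goal test succeeds."""
    state = dict(initial)
    for _ in range(max_steps):
        if is_goal(state):
            return state
        candidates = propose(state)
        if not candidates:
            return None  # nothing to apply: the loop cannot proceed
        # Assumed placeholder policy: take the first proposed operator.
        state = candidates[0](state)
    return state if is_goal(state) else None


# Toy problem (hypothetical): reach count == 3 by incrementing.
def increment(state: State) -> State:
    return {**state, "count": state["count"] + 1}


result = run_until_goal({"count": 0},
                        propose=lambda s: [increment],
                        is_goal=lambda s: s["count"] == 3)
```

Only one operator is applied per iteration, which mirrors the constraint that a state can have only one operator at a time.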
As shown in Figure 1, the SOAR architecture has the following main parts: input and output interface, long-term memory, and working memory. There are also some underlying mechanisms, such as decision-making and learning.
SOAR interacts with the environment through the perception and action interfaces. The environment is mapped into working memory through perception, and internal representations are passed back out through the action interface, where they generate actions that act on the environment. SOAR has two kinds of memory with different forms of representation: a working memory that describes the current problem-solving situation and a long-term memory that stores long-term knowledge. In SOAR, the current situation, including data from sensors, results of intermediate inferences, active goals, and active operators, is held in working memory, represented as a hierarchical graph of states or goals. Long-term memory contains production memory, semantic memory, and episodic memory. SOAR selects and applies operators through the decision cycle, which is a fixed processing mechanism. Along with the decision cycle, SOAR has four different types of learning mechanism, namely, reinforcement learning, chunking, episodic learning, and semantic learning.
5. Agent Design of Traffic Guidance Compliance Behavior
Vehicle and driver are integrated as a whole because their behaviors are inseparable in the study of traffic guidance compliance behavior: the driver operates by observing the external environment, and the vehicle interacts with the external environment by carrying out the operation of the driver. Thus, the driver-vehicle unit is regarded as an agent in our study.
As Figure 2 shows, VMS uses three colors to represent the traffic condition of the roads ahead: red for congestion, yellow for light crowding, and green for free flow. Here, we analyze how SOAR describes the specific guidance compliance behavior under VMS. Assume that driver A is driving to his destination B. On his way, he passes a VMS which shows that he can reach B through any of the downstream roads (left road, forward road, and right road) of the intersection ahead. Driver A will finally choose one of the three roads according to a combination of factors, such as his familiarity with the road network, the current guidance information on the VMS, the traffic congestion in sight, the external environment, and his former experience of the accuracy of the information shown on the VMS. The SOAR cognitive model is adopted to describe the guidance compliance behavior of the driver when he passes the VMS repeatedly, including the processes of perception, memory, decision, and learning.
5.1. Problem Space, State, and Operator
Problem space is the internal representation of the problem. It contains three types of states: initial state, intermediate state, and goal state. A state holds all the information about the current situation in the problem-solving process, such as the results of perception, the description of the current goal, and the problem space. An operator transforms a state into a new state. Problem solving consists of finding a sequence of operators that transforms the initial state into a goal state. Figure 3 shows the problem space composed of states and operators.
In Figure 3, squares represent states containing features and values, which reflect the internal and external situations. Goal states, states in which features have values that indicate the goal has been achieved, are shaded. Arrows represent operators that change or transform states.
In the guidance compliance scene, the initial state is the information available when A drives into the visual range of the VMS, including the guidance information shown on the VMS, the perception of the external traffic environment, the judgment of the information accuracy, and the experience of the downstream road conditions. The goal state is reached when A has chosen the downstream route. The psychological process between the initial state and the goal state is described by intermediate states. Operators transform the initial state into an intermediate state and finally, through several intermediate states, into the goal state. Four kinds of operators are considered in this study, listed as follows.
(1) The first operator is choosing a route: the driver chooses to go forward, left, or right.
(2) The second is changing the road condition: according to the external environment and his travel experience, the driver infers the downstream road conditions that he thinks are most consistent with the truth.
(3) The third is changing the driving goal from saving money to saving time, or from saving time to saving money. Saving money requires the driver to choose the shortest path to the destination, and saving time requires the driver to choose the path that takes the least time to reach the destination.
(4) The fourth is changing mood. The driver may feel at ease when the match accuracy is high and impatient when it is low.
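The four operators can be pictured as functions that each return a new state. The Python sketch below is illustrative only: the class `DriverState`, the attribute values, and the mood threshold of 50 are assumptions made for the example, not values taken from the paper.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional


@dataclass(frozen=True)
class DriverState:
    predicted: Dict[str, str]     # predicted condition of each downstream road
    goal: str = "save-time"       # or "save-money"
    mood: str = "easy"            # or "impatient"
    route: Optional[str] = None   # None until a downstream road is chosen


def choose_route(state: DriverState, road: str) -> DriverState:
    """Operator 1: choose forward, left, or right."""
    return replace(state, route=road)


def change_road_condition(state: DriverState, road: str, color: str) -> DriverState:
    """Operator 2: revise the predicted condition of one downstream road."""
    return replace(state, predicted={**state.predicted, road: color})


def change_goal(state: DriverState) -> DriverState:
    """Operator 3: switch between saving time and saving money."""
    new_goal = "save-money" if state.goal == "save-time" else "save-time"
    return replace(state, goal=new_goal)


def change_mood(state: DriverState, match_accuracy: int, threshold: int = 50) -> DriverState:
    """Operator 4: at ease when match accuracy is high, impatient when low.

    The threshold is a hypothetical value chosen for this sketch.
    """
    return replace(state, mood="easy" if match_accuracy >= threshold else "impatient")


def is_goal_state(state: DriverState) -> bool:
    """The goal state is reached once a downstream route has been chosen."""
    return state.route is not None


s0 = DriverState(predicted={"left": "green", "forward": "red", "right": "yellow"})
s1 = change_road_condition(s0, "left", "yellow")   # intermediate state
s2 = change_mood(s1, match_accuracy=30)            # intermediate state
s3 = choose_route(s2, "right")                     # goal state
```

Each operator leaves its input untouched and returns a fresh state, so the sequence `s0, s1, s2, s3` is one path through the problem space from the initial state to a goal state.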
5.2. Working Memory
Dynamic information about the world and internal reasoning, including data from sensors, results of intermediate inferences, hierarchy states, active goals and active operators, is all held in the working memory.
In our guidance compliance scene, an agent is the unit of driver and vehicle, and the agent achieves the goal state by transforming among several states in one decision-making process. The attributes of any state involve the attributes of the driver, vehicle, and the external traffic environment. The attributes of driver include gender, age, character, driving years, monthly income, mood, his/her familiarity with the road network, the understanding of the guidance signal, destination, driving goal, and the current location. The attributes of the vehicle include the usage, size, and the speed, and the attributes of roads and surrounding environment include the congestion situation of the current road in sight, the predicted congestion situation of the downstream roads through perception, the true congestion of the downstream roads by feedback, the current control signal of different directions, and the guidance information shown on VMS. The state has some other attributes, such as the name and the super state of the state. Parts of the attributes, such as the mood, driving goal, and the predicted congestion situation, change with the transformation of states while the others are static. The agents with the same static attributes are classified into one category. The value range of each attribute is shown in Tables 1, 2, and 3.
5.3. Long-Term Memory
Long-term memory is the area where acquired knowledge is stored. Although all types of long-term knowledge (procedural, semantic, and episodic) are useful, procedural knowledge is primarily responsible for controlling behavior and maps directly onto operator knowledge. Semantic and episodic knowledge usually come into play only when procedural knowledge is somewhat incomplete or inadequate for the current situation. In our study of the traffic behavior, procedural memory and episodic memory are involved.
The procedural knowledge is represented by production rules, which couple conditions on the current situation with the operations to be performed. When the specific conditions of a rule are satisfied, the set of actions in the matched rule is triggered. When the associations between goals and subgoals are triggered, control passes from one production to another; that is, once a production is triggered and executed, the control of action or behavior is transferred to another production whose conditions are met. A production is represented by an if-then rule. The “if” portion of each rule contains the conditions, while the “then” portion contains the action or behavior. Some of the initial rules of an agent in the above guidance compliance behavior are shown in Table 4.
In Table 4, [I] stands for “if”, [T] stands for “then”, and C-D, V-S, C-S, P-D, Des, Loc, Moo, and M-A are short for current-density, VMS-sign, control-signal, predicted-density, destination, location, mood, and match-accuracy, respectively. Take r8 as an example: if the current road is severely crowded, the VMS shows that all the roads are congested, and the traffic light allows vehicles to turn right, then the driver becomes impatient and the match-accuracy decreases by 10%. The other rules are read in the same way. These are only some of the initial rules of the driver agent; the others are omitted for reasons of space.
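As an illustration of how an if-then production such as r8 might be encoded, the Python sketch below matches the three conditions and applies the two actions. The attribute encodings (`"severely-crowded"`, `"red"`, `"right"`) are hypothetical, and the 10% decrease is interpreted here as 10 percentage points, which is an assumption.

```python
from typing import Dict, Optional

WorkingMemory = Dict[str, object]


def r8(wm: WorkingMemory) -> Optional[WorkingMemory]:
    """Sketch of rule r8; returns the changed memory, or None if it does not match."""
    # [I] conditions: C-D severely crowded, V-S all red, C-S allows a right turn
    if (wm["C-D"] == "severely-crowded"
            and all(color == "red" for color in wm["V-S"].values())
            and wm["C-S"] == "right"):
        # [T] actions: mood becomes impatient, match-accuracy drops by 10 points
        return {**wm, "Moo": "impatient", "M-A": max(0, wm["M-A"] - 10)}
    return None


wm = {
    "C-D": "severely-crowded",
    "V-S": {"left": "red", "forward": "red", "right": "red"},
    "C-S": "right",
    "Moo": "easy",
    "M-A": 80,  # match accuracy in percent (hypothetical starting value)
}
fired = r8(wm)
```

Storing the match accuracy as an integer percentage keeps the arithmetic exact; a rule that does not match returns `None` so that a caller can try the next rule.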
The episodic knowledge is the specific experience and memories of an agent, and it is the source of episodic learning. Episodic memory records the events and history that are embedded in experience. An episode can be used to answer questions about the past, to predict the outcome of possible courses of action, or to help keep track of progress on long-term goals. In SOAR, episodes are recorded automatically when a problem is solved. An episode consists of a subset of the working memory elements that exist at the time of recording; SOAR selects those working memory elements that have been used recently. During retrieval, the episode that best matches a cue placed in working memory is found and recreated in working memory. Once the episode is retrieved, it can trigger rule firings or even serve as the basis for creating new cues for further searches of episodic memory. In the guidance compliance scene, once a decision cycle is finished, episodic memory records the chosen operator and its preference in the current state, which contains the specific guidance information, control signal, current density, predicted density, driving goal, destination, and location, in preparation for the next impasse.
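A rough Python sketch of recording and cue-based retrieval is given below. The class `EpisodicMemory` and its scoring rule (count of matching attributes, with ties going to the most recent episode) are assumptions made for illustration; SOAR's actual matching procedure is more elaborate.

```python
from typing import Dict, List, Optional


class EpisodicMemory:
    """Stores (state, operator, preference) episodes and retrieves by cue."""

    def __init__(self) -> None:
        self._episodes: List[dict] = []

    def record(self, state: Dict[str, str], operator: str, preference: float) -> None:
        """Record the chosen operator and its preference in the current state."""
        self._episodes.append(
            {"state": dict(state), "operator": operator, "preference": preference}
        )

    def retrieve(self, cue: Dict[str, str]) -> Optional[dict]:
        """Return the episode sharing the most attribute values with the cue."""
        best, best_score = None, 0
        for episode in self._episodes:
            score = sum(1 for key, value in cue.items()
                        if episode["state"].get(key) == value)
            # '>=' lets a later (more recent) episode win a tie.
            if score > 0 and score >= best_score:
                best, best_score = episode, score
        return best


memory = EpisodicMemory()
memory.record({"V-S": "red", "C-S": "right", "goal": "save-time"}, "choose-right", 0.7)
memory.record({"V-S": "green", "C-S": "forward", "goal": "save-time"}, "choose-forward", 0.9)

hit = memory.retrieve({"V-S": "red", "C-S": "right"})
miss = memory.retrieve({"V-S": "yellow"})
```

Here `hit` is the first episode, which matches the cue on both attributes, while `miss` is `None` because no stored episode shares any value with the cue.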
Long-term memory changes dynamically: more episodes are added to the episodic memory, and new production rules are added to the procedural memory through the different learning mechanisms in the process of decision-making.
5.4. Decision Cycle
The decision cycle is the most basic processing mechanism in SOAR. The core functions of SOAR are selecting the operator and then applying it, but only a single operator can be selected for a state at a given time.
The decision cycle starts with input, during which working memory elements are created to reflect changes in perception. In our simulation, the initial state of the guidance compliance problem, including the current location, the guidance information shown on the VMS, and the road conditions in sight, is created by the initial operator. State elaboration is not knowledge directed at selecting and applying an operator; rather, it is knowledge that creates a new description of the state, which in turn can cause an operator to be selected and applied. The stages of state elaboration, operator proposal, operator selection, and operator application in the decision cycle retrieve rules from long-term memory, whereas the comparison among operators is achieved through preferences.
If the preferences can be compared successfully, for example, when only a single candidate operator is proposed or one operator is clearly better than the others, the best operator for the current state is selected and added to working memory. When the operators conflict, for example, when two operators have the best preference at the same time, or when the preferences “operator A is better than B” and “B is better than A” are both put forward, an impasse is created, which requires the chunking learning mechanism to resolve.
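The selection step can be sketched in Python with numeric preferences only. The function `select_operator` and its `margin` parameter are hypothetical; contradictory symbolic preferences (A better than B and B better than A) are not modeled here, and a tie within the margin stands in for the conflict case.

```python
from typing import Dict, Optional, Tuple


def select_operator(preferences: Dict[str, float],
                    margin: float = 0.0) -> Tuple[Optional[str], bool]:
    """Return (selected operator, impasse flag) for the proposed operators.

    `preferences` maps each proposed operator to its numeric preference.
    `margin` is an assumed tolerance: the best operator must beat the
    second best by more than this amount to be selected.
    """
    if not preferences:
        return None, True                      # nothing proposed
    ranked = sorted(preferences.items(), key=lambda item: item[1], reverse=True)
    if len(ranked) == 1:
        return ranked[0][0], False             # single candidate
    (best, p_best), (_, p_second) = ranked[0], ranked[1]
    if p_best - p_second <= margin:
        return None, True                      # tie: the decision cannot be made
    return best, False


clear = select_operator({"choose-left": 0.9, "change-goal": 0.4})
tied = select_operator({"change-road-condition": 0.5, "change-goal": 0.5})
```

In this sketch `clear` selects `choose-left` without an impasse, while `tied` reports an impasse because the two operators have equal preference.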
In the simulation of the guidance scene, each rule in the long-term memory contains conditions, the operator which matches the conditions, and the numeric preference of the proposed operator. Once a new rule is added to long-term memory, the initial numeric preference of the operator in the rule is judged, and its value can be updated along the decision-making process according to the feedback from the external environment in order to make it closer to reality and provide more precise information to drivers for decision.
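The text does not give the update formula at this point, so the Python sketch below uses a generic incremental update toward a reward as a stand-in. The learning rate `alpha`, the function names, and the reward definition based on relative travel-time saving are all assumptions, not the paper's method.

```python
def travel_time_reward(expected: float, actual: float) -> float:
    """Assumed reward: relative saving against expected time, clipped to [-1, 1]."""
    saving = (expected - actual) / expected
    return max(-1.0, min(1.0, saving))


def update_preference(preference: float, reward: float, alpha: float = 0.1) -> float:
    """Move the numeric preference a fraction `alpha` toward the observed reward."""
    return preference + alpha * (reward - preference)


# Hypothetical feedback: the chosen road took 8 minutes against 10 expected.
reward = travel_time_reward(expected=10.0, actual=8.0)
new_preference = update_preference(0.5, reward)
```

With a small `alpha` the preference changes slowly, so a single unusually good or bad trip does not overturn what the agent has learned from earlier trips.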
5.5. Learning Mechanism
The SOAR agent of the guidance compliance behavior has four different learning mechanisms. Together they generate the forms in which knowledge is represented in SOAR, but each draws on a different source of knowledge. The source of knowledge, learning time, and learning outcome of each mechanism are shown in Table 5. In this paper, chunking and reinforcement learning are the two mechanisms introduced and adopted, in line with the characteristics of traffic guidance compliance behavior.
5.5.1. Chunking
Chunking takes place when an impasse is resolved: chunking rules are learned in the process of solving a substate. An impasse is how the architecture signals that working memory lacks the operators needed to move through the problem space, and a new rule is automatically created to resolve the current impasse. Establishing a chunking rule requires analyzing the production rules in long-term memory and the episodic cues related to the results.
Taking the SOAR agent of guidance compliance behavior as an example, assume that the preferences of the operator that changes the predicted road condition and the operator that changes the driving goal cannot be compared successfully (i.e., the two operators are equally good or equally bad). The decision cycle then cannot make a decision, and an impasse occurs. At the same time, a substate intended to resolve the impasse is produced; the state that produced the impasse is called the superstate of this substate. An impasse can be resolved by inputting new rules from outside, by recalling episodic knowledge in long-term memory, or by randomly selecting one of the candidate operators. The steps for resolving an impasse in this paper are as follows.
(1) Conditions that bring about an impasse: the conditions are expressed in terms of the current state, which is reached from the initial state after a number of state transitions; the set of candidate operators under the current state and its cardinality; the best and second-best operators in that set; and the domain range for selecting the operator of the state directly.
(2) The approach to resolving an impasse: when the conditions of an impasse are matched, the episodic memories containing the current state are retrieved first in order to find the best operator for resolving the impasse. If the episodic memories contain no cue for the current state, the required match accuracy is lowered by a fixed step, and all long-term memories are searched for a matching operator that moves the problem toward the goal state.
(3) The creation of the chunking rule: after the agent exits the downstream road, the true travel time is compared with the expected travel time. If the chunking condition, which is defined by the domain range of chunking rules, is satisfied, the operators used to resolve the impasse are updated to create chunking rules.
If the same rule is updated a set number of consecutive times, the operator in the rule is added to the state under which the impasse occurred in the decision cycle, and chunking is achieved.
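The bookkeeping behind this consecutive-update rule can be sketched in Python as follows. The sketch rests on two assumptions that are not stated in the text: the chunking condition is taken to be that the absolute gap between true and expected travel time stays within a tolerance, and a miss resets the count. The class `ChunkingTracker` and its fields `tolerance` and `required_streak` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class ChunkingTracker:
    """Counts consecutive successful updates of a (state, operator) rule."""

    tolerance: float      # assumed domain range of chunking rules (seconds)
    required_streak: int  # assumed number of consecutive updates needed
    streaks: Dict[Tuple[str, str], int] = field(default_factory=dict)
    chunks: Dict[str, str] = field(default_factory=dict)

    def record(
        self, state: str, operator: str, true_time: float, expected_time: float
    ) -> bool:
        """Record one trip; return True once the rule has become a chunk."""
        key = (state, operator)
        if abs(true_time - expected_time) <= self.tolerance:
            # Assumed chunking condition met: extend the streak.
            self.streaks[key] = self.streaks.get(key, 0) + 1
        else:
            # Assumed behavior: a miss breaks the run of consecutive updates.
            self.streaks[key] = 0

        if self.streaks[key] >= self.required_streak:
            # The operator is attached directly to the impasse state.
            self.chunks[state] = operator
            return True
        return False


if __name__ == "__main__":
    tracker = ChunkingTracker(tolerance=10.0, required_streak=3)
    for true_time in (95.0, 105.0, 92.0):
        print(tracker.record("red_on_vms", "choose_left", true_time, 100.0))
```

Once a chunk exists for a state, the agent can select the stored operator directly the next time that state occurs, so no substate is needed.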
5.5.2. Reinforcement Learning
The source of reinforcement learning is the feedback of the external environment, often called a “reward.” The reward can come from the “body” in which a model is embedded (to the model, this can be considered an external environment), or it can be generated internally when the goal is achieved. Based on experience, reinforcement learning adjusts predictions of future rewards, which are then used to select actions that maximize the expected future reward. In the SOAR agent of guidance compliance behavior, the total feedback of an operator is tied to driving time. The total feedback under the final state is computed from the true travel time the agent needs to reach the destination under that state and the expected travel time, with a parameter that is set to 0.5 in this paper. The formulation also uses the mean driving time on a road at a given time, which reflects the driver’s experience, together with a term representing the effect of the road condition information shown on the VMS on travel time, the mean travel time on a road when a given guidance message is displayed, and the reference travel time on that road. The reference travel time adopted in this paper is calculated under the assumption that the occupancy is 0.5.
Given that many states and operators are involved in one decision cycle of guidance compliance behavior, the total feedback is allocated among the states according to the distance between each state and the goal state. The feedback assigned to the operator corresponding to a given state on the state transition path of a cycle is obtained from the total feedback received when the goal state is reached, a parameter set to 0.5, and a weighting factor that determines the share of the total feedback allocated to that operator. The weighting factor is a function of the distance between the state and the goal state and of the transition path; in this paper, it depends on the number of states on the state transition path that contains the state.
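As an illustration of distance-based allocation, the following Python sketch distributes a total feedback value over the states of one transition path. The linear weighting used here, in which a state at distance d from the goal on a path of n states receives the weight (n - d) / n, is purely an assumption chosen for the example and is not the paper's formula; the function name `allocate_feedback` is likewise hypothetical.

```python
from typing import List


def allocate_feedback(total_feedback: float, path_length: int) -> List[float]:
    """Split the total feedback over the states of a transition path.

    States are ordered from the initial state (index 0) to the goal state
    (last index). The weight (n - d) / n is an assumed, illustrative choice:
    the closer a state is to the goal, the larger its share.
    """
    if path_length < 1:
        raise ValueError("the path must contain at least one state")

    shares = []
    for index in range(path_length):
        distance_to_goal = path_length - 1 - index
        weight = (path_length - distance_to_goal) / path_length
        shares.append(total_feedback * weight)
    return shares


if __name__ == "__main__":
    # Four states on the path, total feedback of 2.0 at the goal state.
    print(allocate_feedback(2.0, 4))
```

Any monotone function of the distance could be substituted for the linear weight without changing the structure of the sketch.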
6. Simulation Experiment and Analysis
6.1. Simulation Environment
The adopted simulation road network contains 6 nodes and 7 one-way, three-lane (left-turning, forward-going, and right-turning) roads; the structure of the network and the lengths of the roads are shown in Figure 4. The VMS is placed about 100 meters upstream of crossing 2, on the entry road between crossing 1 and crossing 2. The guidance information shown on the VMS is the real-time traffic flow situation, which is represented by different colors. The occupancy ranges corresponding to green, yellow, and red are [0, 0.5], [0.5, 0.7], and [0.7, 1], respectively. Traffic flow is generated at crossing 1 and passes the VMS on the entry road. Guided by the VMS, each vehicle selects one of the three downstream routes (left, forward, or right) to reach crossing 5 and then proceeds to crossing 6. Crossing 2 and crossing 5 are equipped with two-phase fixed-time controllers, while the others are not. A microscopic traffic flow simulation platform based on cellular automaton theory is adopted, and the free-flow speed of each road is 60 km/h. The departure frequency at crossing 1 is set so as to simulate 30 consecutive days, from 5 am to 10 am each day (i.e., 540,000 seconds in total). In order to simulate how guidance compliance behavior changes over a long period after the VMS is newly installed, each vehicle is required to pass the VMS repeatedly. Thus, after a vehicle exits from crossing 6, it re-enters the entry road whenever the departure frequency of crossing 1 allows.
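The color coding of the VMS and the length of the simulated period can be made concrete with a short Python sketch. Because the occupancy ranges given above share their end points, the sketch assumes that an occupancy of exactly 0.5 is shown as yellow and exactly 0.7 as red; this tie-breaking choice and the name `vms_color` are assumptions made for illustration.

```python
def vms_color(occupancy: float) -> str:
    """Map a road occupancy in [0, 1] to the color shown on the VMS.

    Boundary handling is assumed: 0.5 maps to yellow and 0.7 maps to red.
    """
    if not 0.0 <= occupancy <= 1.0:
        raise ValueError("occupancy must lie in [0, 1]")
    if occupancy < 0.5:
        return "green"
    if occupancy < 0.7:
        return "yellow"
    return "red"


# Simulated period: 30 days, 5 hours per day (5 am to 10 am).
SIMULATED_DAYS = 30
HOURS_PER_DAY = 5
SIMULATED_SECONDS = SIMULATED_DAYS * HOURS_PER_DAY * 3600

if __name__ == "__main__":
    for value in (0.30, 0.60, 0.85):
        print(value, vms_color(value))
    print(SIMULATED_SECONDS)  # 540000
```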
Starting from this basic network structure, we also adjusted the lengths of the entry and exit roads, the position of the VMS, and the traffic flow. We find that the simulation results are essentially identical as long as the adjustments are not too large. Hence, only the simulation results and analysis for the basic structure described above are presented.
6.2. Simulation Results and Analysis
6.2.1. Guidance Effects
Two simulation experiments are conducted with the road network and conditions described in Section 6.1, except that no VMS is installed on the entry road in one of them, so that the differences in traffic flow operation with and without guidance can be studied. The entry road is divided into 10 sections, labeled I1 to I10 from crossing 1 to crossing 2, and each section is 120 meters long. Figures 5 and 6 show the spatial and temporal distributions of vehicles on the 30th day without and with guidance, respectively.
When no guidance information is shown, the average number of vehicles on section I10 in the peak period approaches 30 and stays at that level for a long time, the road occupancy exceeds 0.8, and the average number of vehicles on I9 in the peak period is also over 20 (Figure 5). This indicates that vehicles begin to decelerate well before the downstream crossing, which reduces the traffic capacity of the entry road in the morning peak. By contrast, when guidance information is shown on the VMS, the mean numbers of vehicles on I10 and I9 decrease by 22.7% and 17.6%, respectively, between 7 and 8 am in the morning peak (Figure 6). This shows that the guidance information effectively alleviates congestion on the entry road in the morning peak. In addition, a comparison of Figures 5 and 6 shows that the numbers of vehicles on each section in off-peak periods are nearly the same, which means that the guidance information has no obvious effect when traffic flow is light.
Figure 7 shows the effect of the guidance information on traffic flow in peak and off-peak periods in terms of vehicle speed on the downstream roads. In off-peak periods, vehicles on the downstream roads beyond the VMS travel at high speed whether or not guidance information is shown, especially between 5 and 6 am, when they travel almost entirely at free-flow speed; their travel time is therefore almost the same whichever route the drivers choose. In this case, guidance makes no appreciable difference to drivers. During peak hours, the difference between the guided and unguided cases shows clearly in the speeds. When no guidance information is shown, one downstream road is severely congested at 8 am, with an average speed of just 24.87 km/h, while another remains smooth, with an average speed of 36.35 km/h, so that the speeds of the two roads differ by nearly 50%. When guidance information is shown, the speeds on the downstream roads beyond the VMS are almost the same, and the largest difference is only about 9.25%. This indicates that the VMS distributes traffic well across the downstream roads in the peak period, so that their capacity is used more fully, and it confirms the effectiveness of the VMS in alleviating traffic congestion.
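The "nearly 50%" figure can be checked with a few lines of Python. The sketch assumes that the difference is measured relative to the slower (congested) road, which is not stated explicitly in the text; measured that way, the two reported speeds differ by about 46%.

```python
def relative_difference(faster: float, slower: float) -> float:
    """Relative speed gap, measured against the slower road (an assumption)."""
    return (faster - slower) / slower


congested_kmh = 24.87  # average speed of the congested road at 8 am
smooth_kmh = 36.35     # average speed of the smooth road at 8 am

gap_without_guidance = relative_difference(smooth_kmh, congested_kmh)

if __name__ == "__main__":
    print(f"{gap_without_guidance:.1%}")
```

Against this value, the largest gap of about 9.25% reported with guidance is roughly one fifth of the unguided gap.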
6.2.2. Guidance Compliance Patterns and Learning Law
Figure 8 shows the route switching rates of drivers in peak and off-peak periods when guidance information is shown on the VMS, as well as the average switching rate over all periods. The simulation is used to investigate the guidance effects and how they evolve after the VMS is newly installed on the entry road. Given that all agents are newly added in the initial stage of the simulation, it takes some time for drivers to become acquainted with the guidance of the VMS. The initial route switching rate is close to 0.5, indicating that in this period about 50% of the drivers choose a different downstream road on two consecutive encounters with the same guidance information. As time goes by, the average route switching rate falls to 0.15 by the 10th day (Figure 8). In this process, drivers become more and more familiar with the guidance performance of the VMS. Whether or not an individual driver complies with the guidance, the guidance compliance behavior of the driver population, as represented by the guidance compliance rate, essentially reaches a stable state once drivers have gradually adapted to the guidance function of the VMS.
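One way to compute such a route switching rate from a log of decisions is sketched below in Python. The record layout (driver, guidance color, chosen route) and the function name `route_switching_rate` are assumptions made for illustration; the sketch counts a switch whenever a driver chooses a different route than on the previous encounter with the same guidance information.

```python
from typing import Dict, Iterable, Tuple

Record = Tuple[str, str, str]  # (driver_id, guidance_color, chosen_route)


def route_switching_rate(records: Iterable[Record]) -> float:
    """Share of repeated encounters in which the chosen route changed.

    `records` must be in chronological order. The first encounter of a
    driver with a given guidance color has nothing to compare against and
    is therefore not counted.
    """
    last_choice: Dict[Tuple[str, str], str] = {}
    switches = 0
    comparisons = 0
    for driver, guidance, route in records:
        key = (driver, guidance)
        if key in last_choice:
            comparisons += 1
            if last_choice[key] != route:
                switches += 1
        last_choice[key] = route
    return switches / comparisons if comparisons else 0.0


if __name__ == "__main__":
    log = [
        ("d1", "red", "left"),
        ("d2", "red", "left"),
        ("d1", "red", "forward"),  # d1 switches under the same guidance
        ("d2", "red", "left"),     # d2 keeps the same route
    ]
    print(route_switching_rate(log))  # 0.5
```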
Although the average guidance compliance rate is stable, guidance compliance behavior differs between peak and off-peak periods. The route switching rate in the peak period is very low once it has stabilized, at about 0.07 (Figure 8). Thus, once drivers have adapted to the VMS in the peak period, their guidance compliance habits rarely change. This also suggests that the cost of changing behavior in the peak period is very high. The route switching rate in the off-peak period is twice that in the peak period, indicating that drivers’ guidance compliance behavior in the off-peak period is somewhat arbitrary. The reason is that the average vehicle speeds on the downstream roads are all high, so the choice of route makes little difference. This is also one reason why the guidance system is ineffective in the off-peak period.
The number of state transitions is a significant indicator of how difficult the decision-making process is. Figure 9 shows the average number of state transitions of 800 agents over decision cycles 0–300; the agents are chosen randomly from the drivers who have completed more than 300 decision cycles. At first, the average number of state transitions is 7, indicating that the path from the initial state to the goal state is somewhat complicated at the beginning of the simulation and that several state transitions and operator selections are needed. As the number of decisions increases, and after several rounds of feedback and learning, the route-choosing operator is reached directly after, on average, slightly more than one state transition. At this point, the mapping from the current state to the goal state is completed almost directly in the decision-making process.
No matter how complete the initial rules of a SOAR agent are, they cannot contain all the rules that will be needed in the decision-making process. The initial rules and operators are difficult to set so that they match the actual situation exactly, so rules are added or revised in the learning process, and chunking is the most important learning mechanism in SOAR. Figure 10 shows the average number of chunkings per decision-making process for all the chosen agents. At the beginning of the simulation, the average number of chunkings per decision is 3, which indicates that the initial rules are insufficient to support the choice of a downstream route, so impasses occur with very high probability. Successful chunking rapidly adds rules that support the agents' decision-making. Therefore, the number of chunkings decreases quickly at first and then gradually approaches 0 as decision-making proceeds. This is consistent with the decreasing trend of the number of state transitions in Figure 9. When the goal state can be reached in one state transition, the mapping from the perception of the external traffic situation and the guidance information shown on the VMS to the operator that chooses the downstream route is completed directly, and no chunking occurs in this mapping process.
6.2.3. The Effect of Adding New Drivers
The above simulation experiments investigate how the guidance compliance behavior of drivers on the entry road changes after a VMS is added. At the beginning of the simulation, all the agents are newly added, and they are unaware of the guidance system. After passing the VMS and making decisions several times, the agents form a body of cognitive experience about the VMS and choose their downstream route on the basis of this experience. In our simulation, in order to repeat the decision-making process, all the agents queue again to enter the network after they exit from it. Thus, very few new agents are added under this condition. Starting from the simulation state at the end of the first 30 days, 4 further simulation experiments are conducted, covering the traffic guidance situation from 5 am to 10 am on each day from the 30th to the 60th day. From the 35th day on, new agents amounting to 1/6, 2/6, 3/6, and 4/6 are added in the four simulations, respectively, in order to examine the effect of adding new agents on the guidance compliance behavior of the old ones. Figure 11 shows how the route switching rates of the old agents vary after different percentages of new agents are added.
Adding 1/6 of new agents has little impact on the guidance compliance behavior of the old ones; their route switching rate remains stable. When the proportion increases to 3/6, the interference of the new agents with the decision-making of the old ones becomes obvious, and when 4/6 are added, the old agents go through almost a complete new learning process, similar to the learning of new agents (Figure 8). Their largest route switching rate approaches 0.5, and it takes 10 simulation days to return to stability. These results indicate that the guidance compliance behavior of drivers has a certain resistance to interference once the guidance system has reached stability; however, in order to maintain stability, the proportion of new drivers added should be lower than 1/2 in this simulation.
7. Conclusions
In this paper, the formation mechanism and variation law of drivers’ guidance compliance behavior are studied based on the SOAR cognitive architecture. A traffic guidance system is a very large and complex system, and the study of guidance compliance behavior is fundamental to achieving the functions and goals of such a system. This paper begins with the analysis of the individual driver, in which the perception, memory, decision-making, and learning involved in drivers’ guidance compliance behavior are described in detail based on the SOAR architecture. The guidance effect, the guidance compliance law, the learning law, and the resistance of the behavior to interference are analyzed through simulation. Many factors affect the guidance compliance behavior of drivers. Apart from the characteristics of the individual driver described in this paper, the release mode and display approach of the guidance information and the position of the VMS all have a significant effect on guidance compliance behavior, and further research on these aspects will be conducted in the future.
Acknowledgments
The research described in this paper was substantially supported by Grants (50908155, 70971094) from the National Natural Science Foundation of China, and by the Young Teachers Foundation Project (20090032120032) supported by the Doctorate in Higher Education Institutions of Ministry of Education. The authors also appreciate the help and comments from the editors and reviewers.