Research Article  Open Access
A Balanced Heuristic Mechanism for Multirobot Task Allocation of Intelligent Warehouses
Abstract
This paper presents a new mechanism for the multirobot task allocation problem in intelligent warehouses, where a team of mobile robots is expected to efficiently transport a number of given objects. We model the system with unknown task costs, and the objective is twofold: allocating the workload equally and minimizing the travel cost. A balanced heuristic mechanism (BHM) is proposed to achieve this goal. We derive two improved task allocation methods by applying this mechanism to the auction and clustering strategies, respectively. Simulation results show that the proposed approach increases both the utilization of the robots and the efficiency of the whole warehouse system (by 5–15%). In addition, the influence of the coefficient in the BHM is studied in detail. Typically, this coefficient is set between 0.7 and 0.9 to achieve good system performance.
1. Introduction
Autonomous guided vehicles (AGVs) have been used to perform tasks in warehouses for more than fifty years [1, 2]. In the past, they were mostly used to transport large or heavy objects such as rolls of uncut paper or engine blocks. Recently, as autonomous robots [3–5] have become cheaper, smaller, and more capable, they have been widely adopted in the logistics industry, for example, as a substitute for manual labor in the typical pick-pack-and-ship process in warehouses [6]. Moreover, it is pointed out in [7, 8] that multirobot coordination, if truly efficient and robust, will have a significant impact on logistical efficiency. Multirobot coordination involves the allocation and execution of individual tasks through an efficient mechanism. In this paper, we mainly focus on the task allocation process, in which a team of autonomous robots must fulfill a set of orders along optimal routes that meet certain criteria. This problem is also known as the multirobot task allocation (MRTA) problem [9].
Several methods already exist to deal with MRTA. In 1998, Yamauchi presented a greedy strategy [10] to coordinate robots, in which each robot chooses the closest task as soon as it finishes its previous task. Gerkey and Matarić introduced a dynamic task allocation method [11] for groups of autonomous robots, in which the robots bid on all unallocated targets and the bids depend on the distance between their last targets and those unallocated targets. The algorithm proposed by Sandholm [12], known as the combinatorial auction method, is considered more efficient because it divides tasks into clusters according to the correlations between tasks; robots then bid on clusters of tasks [13]. In 2004, Lagoudakis et al. used Prim Allocation [14] to generate the minimum spanning forest (MSF) of tasks for robots and then transformed the forest into paths by depth-first traversal. Later, some researchers [15, 16] improved the depth-first traversal process in the Prim Allocation method and obtained better results.
In fact, most current approaches focus on minimizing the travel cost of the robot group without considering the travel time, that is, without balancing the individual travel cost of each robot. Few research works put emphasis on improving the utilization of multiple robots, especially in time-sensitive applications such as the logistics industry, rescue, and exploration. Puig et al. [17] improved Yamauchi's greedy algorithm to solve the balanced target allocation problem in planetary exploration. With this method, after finishing its current task, every robot moves to the frontier cell until all the tasks have been explored. However, since every robot acts only for itself with no coordination with other robots, this method may lead to a rather suboptimal solution in travel distance, which is not acceptable in our logistics application. In 2011, Elango et al. [18] proposed a K-means clustering and auction based mechanism (referred to below as the mechanism of [18]) to balance the task allocation in multirobot systems. This approach considers the dual goal of minimizing the travel distance and sharing the workload efficiently. Nevertheless, the complexity of this algorithm is relatively high, and we will compare this method with our proposed approaches later.
Considering the specific characteristics of our problem, namely, (i) balancing the task allocation as well as minimizing the travel distance, (ii) unknown task costs, and (iii) a large-scale and dynamic environment, a balanced heuristic mechanism (BHM) is proposed to coordinate the autonomous robots and is applied to several traditional approaches, such as the auction allocation method and the K-means clustering method. To evaluate the performance of different strategies, three criteria are provided: the travel time (TT), the coefficient of variation (CV) between different robots, and the total travel cost (TTC). Simulation results show that the BHM has a better balancing performance than traditional methods.
The rest of this paper is organized as follows. In Section 2, we give the formal description and model of the order fulfillment process in the warehouse. In Section 3, we describe the overall control structure of the system and present the balanced heuristic mechanism (BHM). Simulated experimental results are then provided and compared with several traditional methods in Section 4.
2. Problem Formulation
2.1. Problem Statement
In this paper, we focus on the dynamic task allocation problem for multiple robots in an intelligent warehouse. To characterize the system, some assumptions are listed as follows:
(i) the system is composed of large-scale physically embedded robots;
(ii) the robots and the tasks are homogeneous;
(iii) the environment of the system is known to each robot;
(iv) each robot in the system is self-interested and fulfills its own tasks independently;
(v) the robots may conflict with each other when fulfilling their tasks, and they communicate with each other when they are close to other vehicles (for the purpose of obstacle avoidance);
(vi) the time spent at the station is the same for all robots.
Usually, multirobot task allocation is described as follows: given a sequence of tasks and a set of robots, the goal is to assign tasks to robots in an efficient manner based on what the system is trying to optimize. In our approach, we optimize a dual objective: keeping the total travel cost as low as possible and keeping the individual travel times as equal as possible. These two objectives correspond to a balance between time expenses and robot expenses, both of which deserve consideration in a concrete physical system. As shown in Figure 1, each storage shelf consists of several inventory pods and each pod holds several resources. For each order, a robot lifts and carries one pod at a time along a preplanned path, delivers it to the station specified in the order, and finally returns the pod. The main consideration is how to allocate the tasks to the robots efficiently and equally while avoiding conflicts.
Another point that must be taken into account is that traditional task allocation methods are based on the assumption that the cost of each task is already known. However, in this problem, the input to the warehouse system is a sequence of orders, from which we can only know the location of the task and the specific station to which the inventory pod is supposed to be delivered. The cost of each task as well as the travel cost from one location to another is totally unknown.
2.2. Modelling
The model formulated to solve the balanced multirobot task allocation problem mentioned above in the warehouse system is as follows (as shown in Figure 2).
Let a set of $m$ robots $R = \{r_1, r_2, \ldots, r_m\}$ complete a set of $n$ tasks $T = \{t_1, t_2, \ldots, t_n\}$. The cost of task $t_j$ is $c(t_j)$, which refers to the travel cost of carrying the inventory pod to the specified station and then bringing it back, ignoring the time cost of obstacle avoidance, and $d(a, b)$ is the travel cost between two locations $a$ and $b$ (usually from the robot to the inventory pod assigned in the order). Suppose $r_i$ has $k_i$ tasks $t_{i1}, t_{i2}, \ldots, t_{ik_i}$; then the total cost of all the tasks assigned to $r_i$ is $\sum_{l=1}^{k_i} c(t_{il})$. More details about the basic parameters and indexes used in the model are shown in Figure 2.
Rewrite $T$ as $\{T_1, T_2, \ldots, T_m\}$, a partition of the set of tasks in which task set $T_i$ is allocated to robot $r_i$. Then define the individual travel cost $\mathrm{ITC}(r_i, T_i)$ as the total cost for $r_i$ to fulfill its task set $T_i$. It is the sum of three parts: the travel cost $d(r_i, t_{i1})$ for $r_i$ to reach its first task $t_{i1}$, the total travel cost between the remaining tasks, and the total task cost of all the tasks assigned to $r_i$. Our dual goal is to minimize the total travel cost, that is, the sum of $\mathrm{ITC}(r_i, T_i)$ over all robots, while balancing the individual travel cost of each robot.
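As a minimal sketch of how the individual travel cost described above could be computed, the following Python snippet adds the travel cost to the first task, the travel cost between consecutive tasks, and the task costs. The Manhattan distance used as the travel cost, the function names, and the assumption that a robot starts each task from the pod location of the previous one are illustrative choices rather than details taken from the model.

```python
from typing import Sequence, Tuple

Point = Tuple[int, int]


def manhattan(a: Point, b: Point) -> int:
    """Illustrative travel cost between two grid locations (assumption)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def individual_travel_cost(robot: Point,
                           pods: Sequence[Point],
                           task_costs: Sequence[float]) -> float:
    """Sum of: travel to the first pod, travel between consecutive pods,
    and the cost of every assigned task (pod -> station -> pod)."""
    if not pods:
        return 0.0
    to_first = manhattan(robot, pods[0])
    # Assumption: after a task the robot is back at that task's pod location.
    between = sum(manhattan(pods[i], pods[i + 1]) for i in range(len(pods) - 1))
    return to_first + between + sum(task_costs)


# A robot at (0, 0) with two tasks whose pods sit at (2, 1) and (5, 1).
cost = individual_travel_cost((0, 0), [(2, 1), (5, 1)], [6, 4])
```

Here the three parts are 3, 3, and 10, so `cost` is 16.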
Furthermore, we provide three criteria to evaluate the performance of different strategies: travel time (TT), which equals the maximum of the individual travel costs of the robots; total travel cost (TTC), which is the sum of the travel costs of all the robots and reflects their power consumption and mechanical wear; and coefficient of variation (CV), a statistical measure commonly used for comparing the diversity in work groups [19]:
$$\mathrm{CV} = \frac{\sigma}{\mu},$$
where $\sigma$ represents the standard deviation of the individual travel costs of the $m$ robots and $\mu$ is their mean.
It is worth pointing out that we use TTC to evaluate the power consumption of the system. On the one hand, compared with the travel energy cost of the robots, other energy costs such as the communication cost and the central controller cost are relatively small, so they can be neglected without affecting the overall power consumption. On the other hand, the distance that robots travel with pods is constant for a given system, which means that the power consumption of this part is fixed. So we need to minimize the distance that robots travel without pods, which can be achieved by minimizing TTC. Thus, TTC can be used to evaluate the power consumption of the system.
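The three evaluation criteria can be illustrated with a short Python sketch. It takes a list of individual travel costs, one per robot, and returns the maximum, the sum, and the ratio of standard deviation to mean. The use of the population standard deviation and the function name are assumptions made for this example.

```python
from statistics import mean, pstdev
from typing import Sequence, Tuple


def evaluate(itc: Sequence[float]) -> Tuple[float, float, float]:
    """Return (travel time, total travel cost, coefficient of variation)."""
    travel_time = max(itc)             # the slowest robot determines the makespan
    total_travel_cost = sum(itc)       # proxy for overall power consumption
    # Assumption: population standard deviation; a sample estimate would
    # use statistics.stdev instead.
    coefficient_of_variation = pstdev(itc) / mean(itc)
    return travel_time, total_travel_cost, coefficient_of_variation


unbalanced = evaluate([10, 20, 30])
balanced = evaluate([20, 20, 20])
```

Both allocations have the same total travel cost (60), but the balanced one has a shorter travel time (20 instead of 30) and a coefficient of variation of zero.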
3. Method
In this section, we develop the BHM-based approach for the task allocation of multiple robots in the intelligent warehouse. We first provide an overview of the whole control system. Then the principles of the BHM are described in detail. By introducing this mechanism into the traditional auction allocation method and the K-means clustering method, two improved BHM-based methods are presented and analyzed for the task allocation problem.
3.1. Overall System
The work space of the warehouse can be divided into grids, with inventory storage zones in the middle surrounded by the stations and workers. Autonomous robots transport movable inventory pods from storage locations to stations, where workers pick items and pack them; the packages are then sent to the customers.
In technical terms, the task allocation process is regulated by the central controller, while path planning and motion planning (for example, obstacle avoidance) are carried out by the robots. For the task allocation process, many current allocation mechanisms use a dynamic method [20, 21]; however, this may result in suboptimal performance. Thus, an approach is proposed that combines dynamic and static methods by means of a task pool. The continuously arriving orders accumulate in the task pool. After the previous batch of tasks has been completed, the controller requests a certain number of tasks from the task pool and uses a chosen approach to allocate them to the robots. Then the controller sends the task sets to the corresponding robots by wireless communication such as ZigBee, and the robots complete their tasks sequentially, planning paths with a standard A* algorithm. Usually, we use the A* algorithm to find the shortest path between storage locations and stations in a static environment; in an uncertain environment, learning methods (e.g., reinforcement learning [22, 23]) may be necessary to find the shortest path, which is left for future work and is beyond the scope of this paper.
In addition, as for the obstacle avoidance process, robots first use infrared detectors to detect other robots within a certain distance (usually several grids in front of them). Then they communicate with each other by wireless devices to determine their relative priorities. Generally speaking, the robot with more tasks left to complete has priority to occupy the grids ahead, so the robot with fewer tasks left is likely to give way. As a result, the actual balancing performance in the real world could be even better than the balancing performance observed in the simulated experiments. This mechanism is referred to as the more-task-prior mechanism in this paper. The whole process is shown in Figure 3.
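To make the path planning step concrete, here is a small A* sketch on a 4-connected grid, where robots move one cell at a time and the Manhattan distance serves as the heuristic. The grid encoding (0 for free, 1 for blocked), the unit step cost, and the function name are assumptions for illustration; the warehouse layout and cost model of the actual system are not specified at this level of detail.

```python
import heapq
from typing import List, Optional, Tuple

Cell = Tuple[int, int]


def astar_cost(grid: List[List[int]], start: Cell, goal: Cell) -> Optional[int]:
    """Length of a shortest 4-connected path from start to goal.

    grid[r][c] == 1 marks a blocked cell. Returns None if goal is unreachable.
    """
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell: Cell) -> int:
        # Manhattan distance never overestimates on a unit-cost grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    best = {start: 0}                       # cheapest known cost to each cell
    frontier = [(heuristic(start), 0, start)]
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell == goal:
            return g
        if g > best.get(cell, float("inf")):
            continue                        # stale queue entry
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best.get(nxt, float("inf")):
                    best[nxt] = ng
                    heapq.heappush(frontier, (ng + heuristic(nxt), ng, nxt))
    return None


warehouse = [
    [0, 0, 0, 0],
    [1, 1, 1, 0],   # a shelf row blocking the direct route
    [0, 0, 0, 0],
]
steps = astar_cost(warehouse, (0, 0), (2, 0))
```

In this toy layout the robot has to go around the shelf row, so the path takes 8 steps even though the straight-line Manhattan distance is 2.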
3.2. Balanced Heuristic Mechanism (BHM)
Considering the goal of minimizing the travel cost as well as balancing the individual travel cost of each robot, a heuristic function $h(r_i, t_j)$, referred to as (4), is used to guide the task allocation process. Here $h(r_i, t_j)$ describes the cost (considering both time and distance) for robot $r_i$ to finish task $t_j$; it is built from $d(r_i, t_j)$, the travel cost from robot $r_i$ to task $t_j$, and from the current value of the individual travel cost of $r_i$ (refer to Section 2).
More specifically, as both the task cost and the travel cost from one location to another are unknown, we estimate them before the task allocation by using the standard A* algorithm, a simple but effective path planning algorithm. When a robot fulfills a task or moves from one location to another in a real environment, more complex motions, such as obstacle avoidance, have to be taken into account. The more accurate measured value is then shared with the central controller and replaces the original estimate stored there, which helps the controller make better task allocations. Subsequent allocations thus rely on the updated information and become more accurate and predictable.
The BHM assigns each task to the robot that has the lowest $h(r_i, t_j)$. By introducing a coefficient $\alpha$, tasks are more inclined to be assigned to a nearby robot with fewer allocated tasks. In practice, an effective value of $\alpha$ lies between 0.7 and 0.9, depending on the numbers of robots and tasks. The detailed relationship between $\alpha$ and the system performance is given in Section 4.
At the beginning stage, as the individual travel cost of every robot is still quite small, this mechanism has limited effect, and the performance of the system is not significantly superior to that of traditional methods that merely minimize the total travel distance. However, as the allocation proceeds, the balancing term takes effect by shifting workload to the robots with fewer assigned tasks. Overall, the BHM can obtain an approximately optimal result by considering both the total travel cost and the workload balance of the robots. A sample illustration of this heuristic mechanism is given in Figure 4.
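The balancing idea can be sketched in a few lines of Python. The exact form of the heuristic function is not reproduced here, so the sketch assumes, purely for illustration, that the score is the travel cost to the task plus a coefficient times the robot's current individual travel cost. The function names and this additive form are assumptions, not the paper's definition.

```python
from typing import Sequence


def bhm_score(travel_cost: float, current_itc: float, alpha: float = 0.8) -> float:
    """Assumed illustrative form: distance term plus weighted workload term."""
    return travel_cost + alpha * current_itc


def pick_robot(travel_costs: Sequence[float],
               current_itcs: Sequence[float],
               alpha: float = 0.8) -> int:
    """Index of the robot with the lowest score for one task."""
    scores = [bhm_score(d, itc, alpha)
              for d, itc in zip(travel_costs, current_itcs)]
    return min(range(len(scores)), key=scores.__getitem__)


# Robot 0 is closer (cost 2) but already loaded (ITC 10);
# robot 1 is farther (cost 5) but idle.
balanced_choice = pick_robot([2, 5], [10, 0], alpha=0.8)
distance_only_choice = pick_robot([2, 5], [10, 0], alpha=0.0)
```

With the coefficient set to zero the score reduces to pure distance and the loaded robot 0 wins; with a coefficient of 0.8 the idle robot 1 wins, which mirrors the behavior described above.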
3.3. Application of BHM
Recently, the auction-based task allocation method and the cluster-based task allocation method have been widely used to solve MRTA problems [18, 24]. We therefore introduce the BHM into the classic single-item auction method and the K-means clustering method, respectively, and propose two BHM-based methods to solve the balancing problem.
3.3.1. An Improved Auction Based Task Allocation Method Using BHM
A typical sealed-bid single-round single-item auction process generally consists of four steps:
(i) task announcement,
(ii) matrix evaluation,
(iii) bid submission,
(iv) close of auction (to determine the winning bid and notify the winning robot).
At the second stage, the bid matrix is usually defined as a function of the optimistic travel cost [11]: the entry in row $i$ and column $j$ is the optimistic travel cost from robot $r_i$ to task $t_j$, where the columns range over all the open tasks, that is, the tasks that have not yet been carried out by robots. In our method, the BHM is introduced into the matrix evaluation process. In order to strengthen the balancing performance of the auction, each entry is defined by the heuristic value $h(r_i, t_j)$ of (4) rather than the optimistic travel distance between $r_i$ and $t_j$.
In each round, after all of the robots have submitted their bids, the single-round single-item auction is closed. The controller selects the bidder with the smallest bid value as the winner and notifies it. Then the bid matrix is refreshed with the following steps (as shown in Figure 5).
Step 1. Delete the auctioned task $t_j$ from the open-task set.
Step 2. Recalculate row $i$ of the bid matrix, where $r_i$ is the winning robot.
As described in (4), the value of $h(r_i, t)$ for any open task $t$ depends on both the distance between $r_i$ and $t$ and the tasks already assigned to $r_i$. Firstly, the current position of $r_i$ is replaced by the location of $t_j$, since $r_i$ has won the bid and has to move from its former position to $t_j$ in order to carry out the order. Secondly, the individual travel cost of $r_i$ increases by the cost of the newly allocated task. After the matrix is refreshed, the next single-item auction proceeds, until all the tasks have been allocated.
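The auction loop with its two refresh steps can be sketched as follows. This is a simplified illustration under stated assumptions: bids take the additive form travel cost plus a coefficient times the robot's accumulated cost, travel cost is the Manhattan distance, and the accumulated cost grows by the travel to the pod plus the task cost. None of these specifics, nor the function names, come from the source.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int]
Task = Tuple[Point, float]   # (pod location, task cost)


def manhattan(a: Point, b: Point) -> int:
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def bhm_auction(robots: Sequence[Point], tasks: Sequence[Task],
                alpha: float = 0.8) -> Tuple[List[List[int]], List[float]]:
    """Repeated single-item auctions; returns (task ids per robot, ITC per robot)."""
    positions = list(robots)
    itc = [0.0] * len(robots)
    assigned: List[List[int]] = [[] for _ in robots]
    open_tasks = set(range(len(tasks)))
    while open_tasks:
        # Matrix evaluation and bid submission: the smallest entry wins.
        _, i, j = min(
            (manhattan(positions[i], tasks[j][0]) + alpha * itc[i], i, j)
            for i in range(len(robots)) for j in sorted(open_tasks)
        )
        pod, cost = tasks[j]
        open_tasks.remove(j)                       # Step 1: delete the task
        itc[i] += manhattan(positions[i], pod) + cost
        positions[i] = pod                         # Step 2: row i changes
        assigned[i].append(j)
    return assigned, itc


robots = [(0, 0), (10, 0)]
tasks = [((1, 0), 6), ((2, 0), 6), ((3, 0), 6), ((4, 0), 6)]
plain_assign, plain_itc = bhm_auction(robots, tasks, alpha=0.0)
bhm_assign, bhm_itc = bhm_auction(robots, tasks, alpha=0.8)
```

In this toy instance, the distance-only auction gives all four tasks to the nearer robot (costs 28 and 0), whereas the balanced variant splits them two and two (costs 14 and 19): the maximum drops while the total rises slightly, which is the trade-off the mechanism targets.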
3.3.2. A Balanced K-Means Clustering Method Using BHM
A clustering method can generally be divided into two stages: first finding the cluster centers and then assigning the sample data to the proper cluster. K-means clustering is a commonly used partitioning method [25], the main idea of which is to find $k$ cluster centers $c_1, \ldots, c_k$ that minimize $J$ in the following by iteration:
$$J = \sum_{j=1}^{k} \sum_{p \in S_j} \lVert p - c_j \rVert^2, \quad (6)$$
where $S_j$ is the set of points currently assigned to cluster $j$. In (6), $\lVert p - c_j \rVert$ represents the distance between the point $p$ and the center $c_j$, known as the Euclidean distance. When $J$ converges to a stable value, the corresponding $c_1, \ldots, c_k$ are the cluster centers for those subgroups.
At the second stage, each sample point is assigned to the cluster whose center $c_j$, $j = 1, \ldots, k$, is nearest. However, this method only guarantees an allocation scheme that minimizes the travel cost, while the size differences between clusters are ignored.
In order to solve the balancing problem, we modify both stages. As for the first stage, given that robots can only move along the grid lines, we replace the squared Euclidean distance in (6) with the Manhattan distance. In a rectangular coordinate system, the Manhattan distance between point $p = (x_1, y_1)$ and point $q = (x_2, y_2)$ can be expressed as
$$d_M(p, q) = |x_1 - x_2| + |y_1 - y_2|.$$
Thus, the main purpose is to find $k$ cluster centers $c_1, \ldots, c_k$ that minimize $J$ in the following by iteration:
$$J = \sum_{j=1}^{k} \sum_{p \in S_j} d_M(p, c_j).$$
Then, to reduce the value of $J$ iteratively, we use the Manhattan center rather than the ordinary (mean) center of the collected tasks. Finally, a partial optimal solution is reached in a finite number of iterations. Generally speaking, the first stage consists of three steps.
(1) Initial step: give an initial set of $k$ cluster centers.
(2) Assignment step: assign each point to the cluster center at the shortest Manhattan distance.
(3) Update step: calculate the Manhattan center of each cluster, which reduces the overall value of $J$.
The Manhattan center is defined as follows.
Suppose that there are $n$ points $P_1, \ldots, P_n$ in a rectangular coordinate system, with coordinates $(x_i, y_i)$, $i = 1, \ldots, n$. Denote the $x$-coordinates sorted in descending order by $x_{(1)} \ge x_{(2)} \ge \cdots \ge x_{(n)}$ and the $y$-coordinates sorted in descending order by $y_{(1)} \ge y_{(2)} \ge \cdots \ge y_{(n)}$. Our purpose is to find the point $(x_c, y_c)$ that minimizes (8):
$$f(x_c, y_c) = \sum_{i=1}^{n} \left( |x_i - x_c| + |y_i - y_c| \right). \quad (8)$$
Since the values of $x_c$ and $y_c$ do not affect each other, $\sum_{i} |x_i - x_c|$ and $\sum_{i} |y_i - y_c|$ can be minimized separately. It is easy to prove that when $n$ is odd, the first sum is minimal for $x_c = x_{((n+1)/2)}$; when $n$ is even, it is minimal for any $x_c$ between $x_{(n/2+1)}$ and $x_{(n/2)}$. We can get $y_c$ in the same way. Therefore, the Manhattan center can be calculated by the following equations:
$$x_c = x_{((n+1)/2)}, \quad y_c = y_{((n+1)/2)} \quad (n \text{ odd}); \qquad x_c \in [x_{(n/2+1)}, x_{(n/2)}], \quad y_c \in [y_{(n/2+1)}, y_{(n/2)}] \quad (n \text{ even}).$$
As mentioned before, the Manhattan center has the minimum overall Manhattan distance to these points, by which the value of $J$ is reduced. An example is shown in Figure 6, which compares the value of $J$ for the original center with that for the Manhattan center.
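The Manhattan center is the coordinate-wise median, which a short Python example can confirm numerically. For an even number of points any value between the two middle coordinates is optimal; the sketch picks the lower middle value via `statistics.median_low` so that the center stays on an integer grid, which is a choice made here for illustration. The sample points are made up.

```python
from statistics import mean, median_low
from typing import Sequence, Tuple

Point = Tuple[float, float]


def manhattan_center(points: Sequence[Point]) -> Point:
    """Coordinate-wise median; minimizes the summed Manhattan distance."""
    return (median_low([p[0] for p in points]),
            median_low([p[1] for p in points]))


def total_manhattan(center: Point, points: Sequence[Point]) -> float:
    return sum(abs(p[0] - center[0]) + abs(p[1] - center[1]) for p in points)


pts = [(0, 0), (1, 0), (2, 0), (9, 6)]          # one far-away outlier
mean_center = (mean(p[0] for p in pts), mean(p[1] for p in pts))
med_center = manhattan_center(pts)

cost_mean = total_manhattan(mean_center, pts)
cost_median = total_manhattan(med_center, pts)
```

For these points the mean center is (3, 1.5) with a summed Manhattan distance of 21, while the Manhattan center is (1, 0) with a summed distance of 16.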
Now, we briefly prove the convergence of the algorithm. Given that the value of $J$ generated by the algorithm is strictly decreasing and that there exists only a finite number of partitions of the tasks, the algorithm reaches a partial optimal solution in a finite number of iterations.
After obtaining the cluster centers, the balanced heuristic mechanism is introduced into the second stage of clustering. In traditional methods, once the cluster centers have been found in the first stage of the K-means algorithm, the remaining active tasks are allocated to the cluster whose center is nearest. In our approach, however, the remaining active tasks are allocated to the cluster with the smallest heuristic value defined in (4).
The main idea is that, instead of merely considering the travel cost, the balance of the cluster sizes is taken into account. A task tends to be allocated to a nearby cluster with fewer tasks. When a new task is added to a cluster, the individual travel cost associated with that cluster is refreshed. After all the active tasks have been allocated, the genetic-based TSP algorithm [26] is used to plan the route through the tasks in each cluster. One example of the task allocation is demonstrated in Figure 7, where different colors represent tasks allocated to different robots and the marked blocks are the clustering centers.
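The three steps of the first stage (initialize, assign, update) can be put together in a compact Python sketch. It stops when the centers no longer change, which is one simple way to detect that the objective has stopped decreasing. The initial centers, the iteration cap, the tie-breaking, and the helper names are all assumptions for illustration.

```python
from statistics import median_low
from typing import List, Sequence, Tuple

Point = Tuple[int, int]


def manhattan(a: Point, b: Point) -> int:
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def manhattan_center(points: Sequence[Point]) -> Point:
    return (median_low([p[0] for p in points]),
            median_low([p[1] for p in points]))


def manhattan_kmeans(points: Sequence[Point], initial_centers: Sequence[Point],
                     max_iter: int = 100) -> Tuple[List[Point], List[List[Point]]]:
    """K-means variant with Manhattan distance and median-based centers."""
    centers = list(initial_centers)
    clusters: List[List[Point]] = [[] for _ in centers]
    for _ in range(max_iter):
        # Assignment step: nearest center in Manhattan distance.
        clusters = [[] for _ in centers]
        for p in points:
            k = min(range(len(centers)), key=lambda c: manhattan(p, centers[c]))
            clusters[k].append(p)
        # Update step: move each center to the Manhattan center of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = [manhattan_center(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:
            break                              # no change: converged
        centers = new_centers
    return centers, clusters


pods = [(0, 0), (0, 1), (1, 0), (8, 8), (9, 8), (8, 9)]
centers, clusters = manhattan_kmeans(pods, [(0, 0), (9, 9)])
```

On this small example the procedure settles after two passes with centers (0, 0) and (8, 8), each holding three pods.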
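A rough sketch of the balanced second stage is given below: each task goes to the cluster with the lowest score, and the cluster's load is refreshed after every assignment. As an illustrative assumption, the score is the Manhattan distance from the cluster center to the pod plus a coefficient times the cluster's accumulated task cost; the true refresh of the individual travel cost depends on the route, which this sketch does not model. All names are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int]
Task = Tuple[Point, float]   # (pod location, task cost)


def manhattan(a: Point, b: Point) -> int:
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def balanced_assign(tasks: Sequence[Task], centers: Sequence[Point],
                    alpha: float = 0.8) -> Tuple[List[List[int]], List[float]]:
    """Assign task ids to clusters, trading off distance against cluster load."""
    load = [0.0] * len(centers)
    clusters: List[List[int]] = [[] for _ in centers]
    for j, (pod, cost) in enumerate(tasks):
        k = min(range(len(centers)),
                key=lambda c: manhattan(pod, centers[c]) + alpha * load[c])
        clusters[k].append(j)
        load[k] += cost          # refresh the load of the chosen cluster
    return clusters, load


centers = [(0, 0), (10, 0)]
tasks = [((1, 0), 6), ((2, 0), 6), ((3, 0), 6), ((4, 0), 6)]
nearest_only = balanced_assign(tasks, centers, alpha=0.0)
balanced = balanced_assign(tasks, centers, alpha=0.8)
```

With pure nearest-center assignment, all four tasks fall into the first cluster; with the balancing term, each cluster receives two tasks.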
4. Simulated Experiments
4.1. Simulation Description
Several groups of simulated experiments are implemented, and the experimental process is as follows.
Step 1. Initialization.
Step 2. Creating orders in the task pool, where the arrival of orders follows a Poisson distribution.
Step 3. Sending a specific number of orders from the task pool to the controller.
Step 4. Using different methods to allocate the orders (tasks). More specifically, four approaches are compared: K-means, BHM-based K-means, auction, and BHM-based auction.
Step 5. Sending the allocated tasks from the controller to the corresponding robots, which begin to fulfill their tasks (using the A* algorithm to calculate the path and the more-task-prior mechanism to avoid collisions). The process then repeats for the next batch of orders.
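The order-generation and batching steps above can be mimicked with a small, self-contained Python sketch. Orders arrive per time step according to a Poisson distribution (sampled with Knuth's multiplication method), accumulate in a first-in-first-out pool, and leave in fixed-size batches. The arrival rate, batch size, seed, and the simplification that each batch finishes within one time step are assumptions for illustration only.

```python
import math
import random
from collections import deque
from typing import List, Tuple


def poisson_sample(rng: random.Random, lam: float) -> int:
    """Draw one Poisson(lam) variate using Knuth's method."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def run_task_pool(steps: int, lam: float, batch_size: int,
                  seed: int = 0) -> Tuple[List[List[int]], List[int], int]:
    """Return (dispatched batches, orders still pooled, total orders created)."""
    rng = random.Random(seed)
    pool: deque = deque()
    batches: List[List[int]] = []
    next_id = 0
    for _ in range(steps):
        for _ in range(poisson_sample(rng, lam)):   # new orders this step
            pool.append(next_id)
            next_id += 1
        # Assumption: the previous batch is done, so a new one may be sent.
        if len(pool) >= batch_size:
            batches.append([pool.popleft() for _ in range(batch_size)])
    return batches, list(pool), next_id


batches, leftover, total = run_task_pool(steps=50, lam=2.0, batch_size=5, seed=1)
```

Each entry of `batches` would then be handed to one of the allocation methods; orders that have not yet filled a batch remain in `leftover`.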
4.2. Simulation Results Analysis
In this part, the three criteria introduced in Section 2 are used to evaluate the performance of the proposed balanced heuristic mechanism, that is, coefficient of variation (CV), travel time (TT), and total travel cost (TTC).
First of all, we compare the balancing behavior of the improved methods with that of the typical auction method and the K-means method. The results are given in Tables 1 and 2, where the data are averaged over fifty simulation runs. Both tables show that, with the proposed BHM, the degree of diversity in work groups is significantly reduced, which indicates a more equal workload distribution among robots.


Figures 8 and 9 show the travel time (TT) of the multirobot system for both the traditional methods and the improved methods. We can conclude that, by applying the BHM-based methods, the time spent to finish the given tasks is much less than with the traditional methods, owing to an efficient sharing of the workload.
Figure 10 compares the traditional methods and the BHM-based methods in terms of the total travel cost (TTC). Although robots in the balanced heuristic cases pay attention to balancing the workload rather than merely minimizing the travel cost as traditional methods do, the TTC increases by only about 8%. Considering the clear increase in robot utilization, this slightly higher power consumption is quite acceptable.
With the traditional auction method and K-means method, the task allocation process concentrates only on minimizing the travel cost. Once there is a robot nearest to an open task, other robots do not attempt to compete for the task even if they are idle. Thus the utilization of the system is insufficient and the time needed to fulfill the given tasks is high. With the BHM-based methods proposed in this paper, all robots can be utilized in a balanced and effective manner while the travel cost remains quite low.
Then we study the influence of the coefficient $\alpha$ on the system performance. The value of $\alpha$ describes the relative weight of the energy-saving demand and the time-saving demand of the system. Figure 11 demonstrates the relationship between $\alpha$ and the travel time (TT). It shows that, in order to keep a good balance between minimizing the travel distance and sharing the workload efficiently, $\alpha$ is usually set between 0.7 and 0.9 to achieve the minimum TT.
Finally, besides comparing the BHM-based approaches with the traditional auction method and K-means method, we use a more competitive algorithm (the mechanism in [18]) to verify the merit of the BHM scheme (as shown in Figures 12 and 13). We can conclude that, with the BHM-based methods, the time spent to finish the given tasks is 5–15% less than with the mechanism of [18], while the total travel costs are almost the same. Thus, the benefit of the balanced heuristic mechanism to warehouse systems is significant.
5. Conclusion
In this paper, a novel balanced heuristic mechanism for the multirobot task allocation problem is presented with respect to the task cost and travel cost. To evaluate the proposed approach, we give three criteria, namely, travel time (TT), total travel cost (TTC), and coefficient of variation (CV), and compare some traditional methods with BHM-based methods under those criteria. Simulation results show that the proposed approach greatly improves the balancing behavior of the multirobot system and reduces the total travel time while achieving almost the same total travel cost as the traditional methods. In addition, the influence of the coefficient $\alpha$ on the system performance is studied in detail. In our future work, the BHM will be applied to more complicated environments, which consist of heterogeneous tasks and robots. Furthermore, we will work on enhancing the robustness of the proposed approach in dynamic environments. Finally, it is worth pointing out that this new mechanism is applicable to other task allocation problems, such as the WSANs problem in [27].
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Authors' Contribution
Luowei Zhou and Yuanyuan Shi contributed equally to this work.
Acknowledgment
This work was supported by the Natural Science Foundation of China under Grants no. 61273327 and no. 61432008.
References
[1] P. R. Wurman, R. D'Andrea, and M. Mountz, “Coordinating hundreds of cooperative, autonomous vehicles in warehouses,” AI Magazine, vol. 29, no. 1, pp. 9–19, 2008.
[2] X. Zhai, J. E. Ward, and L. B. Schwarz, “Coordinating a one-warehouse N-retailer distribution system under retailer-reporting,” International Journal of Production Economics, vol. 134, no. 1, pp. 204–211, 2011.
[3] C. Chen, H.-X. Li, and D. Dong, “Hybrid control for robot navigation - a hierarchical Q-learning algorithm,” IEEE Robotics & Automation Magazine, vol. 15, no. 2, pp. 37–47, 2008.
[4] S. Chen and C. Chen, “Probabilistic fuzzy system for uncertain localization and map building of mobile robots,” IEEE Transactions on Instrumentation and Measurement, vol. 61, no. 6, pp. 1546–1560, 2012.
[5] D. Dong, C. Chen, J. Chu, and T.-J. Tarn, “Robust quantum-inspired reinforcement learning for robot navigation,” IEEE/ASME Transactions on Mechatronics, vol. 17, no. 1, pp. 86–97, 2012.
[6] T. Niemueller, D. Ewert, S. Reuter et al., “Towards benchmarking cyber-physical systems in factory automation scenarios,” in KI 2013: Advances in Artificial Intelligence, vol. 8077 of Lecture Notes in Computer Science, pp. 296–299, 2013.
[7] M. Brambilla, E. Ferrante, M. Birattari, and M. Dorigo, “Swarm robotics: a review from the swarm engineering perspective,” Swarm Intelligence, vol. 7, no. 1, pp. 1–41, 2013.
[8] C. Nieto-Granda, J. G. Rogers, and H. I. Christensen, “Coordination strategies for multirobot exploration and mapping,” Experimental Robotics, vol. 33, no. 4, pp. 519–533, 2013.
[9] B.-Y. Shih, C.-Y. Chen, and W. Chou, “An enhanced obstacle avoidance and path correction mechanism for an autonomous intelligent robot with multiple sensors,” Journal of Vibration and Control, vol. 18, no. 12, pp. 1855–1864, 2012.
[10] B. Yamauchi, “Frontier-based exploration using multiple robots,” in Proceedings of the 2nd International Conference on Autonomous Agents, pp. 47–53, ACM, May 1998.
[11] B. P. Gerkey and M. J. Matarić, “Sold!: auction methods for multirobot coordination,” IEEE Transactions on Robotics and Automation, vol. 18, no. 5, pp. 758–768, 2002.
[12] T. Sandholm, “Algorithm for optimal winner determination in combinatorial auctions,” Artificial Intelligence, vol. 135, no. 1–2, pp. 1–54, 2002.
[13] M. Berhault, H. Huang, P. Keskinocak et al., “Robot exploration with combinatorial auctions,” in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, vol. 1–4, pp. 1957–1962, October 2003.
[14] M. G. Lagoudakis, M. Berhault, S. Koenig, P. Keskinocak, and A. J. Kleywegt, “Simple auctions with performance guarantees for multirobot task allocation,” in Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS '04), pp. 698–705, IEEE, October 2004.
[15] L. Liu and D. A. Shell, “Large-scale multirobot task allocation via dynamic partitioning and distribution,” Autonomous Robots, vol. 33, no. 3, pp. 291–307, 2012.
[16] S. Öztürk and A. E. Kuzucuoğlu, “Optimal bid valuation using path finding for multirobot task allocation,” Journal of Intelligent Manufacturing, 2014.
[17] D. Puig, M. A. Garcia, and L. Wu, “A new global optimization strategy for coordinated multirobot exploration: development and comparative evaluation,” Robotics and Autonomous Systems, vol. 59, no. 9, pp. 635–653, 2011.
[18] M. Elango, S. Nachiappan, and M. K. Tiwari, “Balancing task allocation in multirobot systems using K-means clustering and auction based mechanisms,” Expert Systems with Applications, vol. 38, no. 6, pp. 6486–6491, 2011.
[19] A. G. Bedeian and K. W. Mossholder, “On the use of the coefficient of variation as a measure of diversity,” Organizational Research Methods, vol. 3, no. 3, pp. 285–297, 2000.
[20] S. S. Chiddarwar and N. R. Babu, “Conflict free coordinated path planning for multiple robots using a dynamic path modification sequence,” Robotics and Autonomous Systems, vol. 59, no. 7–8, pp. 508–518, 2011.
[21] J. Capitan, M. T. J. Spaan, L. Merino, and A. Ollero, “Decentralized multirobot cooperation with auctioned POMDPs,” The International Journal of Robotics Research, vol. 32, no. 6, pp. 650–671, 2013.
[22] D. Dong, C. Chen, H. Li, and T.-J. Tarn, “Quantum reinforcement learning,” IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 38, no. 5, pp. 1207–1220, 2008.
[23] C. Chen, D. Dong, H.-X. Li, J. Chu, and T.-J. Tarn, “Fidelity-based probabilistic Q-learning for control of quantum systems,” IEEE Transactions on Neural Networks and Learning Systems, vol. 25, no. 5, pp. 920–933, 2014.
[24] M. Elango, G. Kanagaraj, and S. G. Ponnambalam, “Sandholm algorithm with K-means clustering approach for multirobot task allocation,” in Swarm, Evolutionary, and Memetic Computing, vol. 8298, pp. 14–22, Springer International Publishing, 2013.
[25] M. E. Celebi, H. A. Kingravi, and P. A. Vela, “A comparative study of efficient initialization methods for the k-means clustering algorithm,” Expert Systems with Applications, vol. 40, no. 1, pp. 200–210, 2013.
[26] M. Shao, “Research on improved convergence cocevolution genetic algorithm for TSPs,” International Journal of Convergence Computing, vol. 1, no. 2, pp. 118–126, 2014.
[27] I. Mezei, M. Lukic, V. Malbasa, and I. Stojmenovic, “Auctions and iMesh based task assignment in wireless sensor and actuator networks,” Computer Communications, vol. 36, no. 9, pp. 979–987, 2013.
Copyright
Copyright © 2014 Luowei Zhou et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.