Abstract

Large-scale multiagent teamwork has become popular in various domains. Similar to the infrastructure of human society, agents coordinate with only some of the others, over a peer-to-peer complex network structure. Their organization has been shown to be a key factor that influences their performance. We identify three key factors for expediting team performance. First, complex network effects may promote team performance. Second, coordination interactions are always routed from their sources toward capable agents. Although they can be transferred across the network via different paths, their sources and sinks depend on the intrinsic nature of the team, which is independent of the network connections. Third, the agents involved in the same plan often form a subteam and communicate with each other more frequently. Therefore, if the interactions between agents can be statistically recorded, we are able to set up an integrated network adjustment algorithm that combines the three key factors. Based on our abstracted teamwork simulations and the coordination statistics, we implemented the adaptive reorganization algorithm. The experimental results support our design: the reorganized network is more capable of coordinating heterogeneous agents.

1. Introduction

Cooperative multiagent and multirobot teams are promising in domains of distributed artificial intelligence, such as controlling a large number of unmanned aerial vehicles [1], coordinating soldiers, agents, and robots on the battlefield [2], and search and rescue in disasters [3]. In those teams, agents work together toward their common goal and, similar to the infrastructure of human society, they closely coordinate with only a few teammates, either logically or under the constraints of physical communications, over a peer-to-peer associate network structure. For example, in a multi-UAV team, members communicate through a wireless ad hoc network. When the team scales up, the network exhibits a complex network structure in the dynamic team coordination [4].

How to improve team performance by adjusting the network topology has been extensively studied. Some complex network attributes have been found to be important to social network performance. For example, Gaston found that a tightly coupled network structure weakens team performance because of excessive consumption [5]. Scerri analyzed how the network structure affects the efficiency of the multiagent communication network [6]. Following those discoveries, structure-oriented approaches [7] take advantage of complex network effects on team performance, such as the scale-free property, the small-world property, and the community structure [8, 9]. On the other hand, actor-oriented approaches [7] focus on analyzing the characteristics of agents’ behaviors in networked coordination and make use of agents’ role features to adjust team organization based on a given network structure, such as hierarchical or grid-based networks [7]. For example, Jiang et al. [10] proposed a task assignment algorithm for heterogeneous agents with an understanding of each agent’s “actor” in the social network. Although significant progress has been made in both structure-oriented and actor-oriented approaches, no previous research combines the advantages of both to build an integrated team reorganization algorithm in real multiagent coordination domains [7].

In this paper, to combine the structure-oriented and actor-oriented approaches, we made three efforts. First, we analyze, with a Markov chain model, how network topologies with different complex attributes influence the agents’ networked coordination. The analysis indicates that scale-free and small-world effects can help the team considerably, especially when coordination interactions can be intelligently routed. Second, we found that agents that closely cooperate always try to connect to each other directly or within a very short distance. However, the cooperative relationship is an intrinsic characteristic of the team and is independent of how the team is organized; examples are the capabilities of heterogeneous agents and the subteams in which agents closely cooperate on joint activities [11]. Similarly, in human society, people closely coordinate or communicate because some kind of relationship exists between them. For example, a letter will be delivered from a mother to her daughter, or bread will be sent from the bakery to a hungry person. Human society will keep routing between those pairs of persons no matter how far apart they are. Therefore, if we can reorganize the team according to this nature and connect those closely coordinated agents with fewer links, the team will benefit most.

To make our team organization optimization approach possible, we simulated the peer-to-peer coordination of large-scale heterogeneous multiagent systems. In our simulation, the resources, tasks, and other coordination information are abstracted and encapsulated into messages, which can be passed from the source agent to the destination via the coordination network. This design makes the messages easy to track, so that we can infer which pairs of agents are closely coordinated. To find the closely cooperative relationships, we built a connection matrix that records messages’ sources and destinations. In addition, the subteam model is used to record the agents who are involved in the same joint activity.

As the last part of our design, we set up the adaptive reorganization algorithm to integrate the three key factors we have analyzed. In the network construction phase, a probability model is built for each pair of agents, and agents recursively pick their neighbors based on these probabilities. The probability of connecting a pair of agents is measured by how closely they cooperate, whether they are within the same subteam, and how much a direct connection between them promotes the scale-free and small-world effects. Therefore, in our algorithm, agents always pick neighbors who are closely cooperative, with higher degrees and shorter distances, weighted by different importance factors. Our experimental results match our expectations. By reorganizing the team as a small-world and scale-free network and connecting closely cooperative agents with shorter distances, the network can significantly expedite team performance.

2. Scalable Team Coordination

In this section, we build the scalable multiagent team network model, in which a pair of agents is defined as connected only when the two agents are able to interact with each other directly, and each agent maintains peer-to-peer connections with only a few others. The objective of their interactions is to jointly perform their tasks so that their complex team goal can be achieved.

2.1. Large Team Organization

The organizational topology is described as an undirected graph $G = (A, E)$, where the set of agents $A$ is the node set of $G$ and $E$ is the set of edges; if there is an edge $(a_i, a_j) \in E$, then coordination messages can be transmitted between $a_i$ and $a_j$ directly; that is, they are neighbors of each other. $N(a_i)$ is defined as the set of $a_i$’s neighbors.

$G$ could be organized based on the properties of complex networks. In this paper, we are mainly interested in four topologies: random network, grid network, small-world network, and scale-free network. Preliminary studies [12] found that each topology encodes the following fundamental properties.
(i) Degree: the degree of agent $a_i$ is $k_i = |N(a_i)|$.
(ii) Average degree: $\bar{k} = \frac{1}{|A|} \sum_{a_i \in A} k_i$ is the average number of neighbors of all agents, for any complex network $G$.
(iii) Degree distribution: $P(k) = n_k / |A|$ is defined as the fraction of agents with degree $k$, where $n_k$ is the number of such agents.
(iv) Distance: $d(a_i, a_j)$ defines the least number of hops needed to communicate between agents $a_i$ and $a_j$. Specifically, $d(a_i, a_j) = 1$ if $(a_i, a_j) \in E$.
(v) Average distance: $\bar{d}$ is the average distance between any pair of agents:

$$\bar{d} = \frac{1}{|A|(|A| - 1)} \sum_{a_i \neq a_j} d(a_i, a_j).$$

Different complex network topologies can be described according to these properties. In this paper, we mainly consider the scale-free and small-world effects, using the degree distribution to express the scale-free effect and the average distance to express the small-world effect.
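The properties listed above can be computed directly from an adjacency structure. The following is a minimal sketch, assuming the network is stored as a dictionary that maps each agent to the set of its neighbors; the function names and the 4-agent ring used as an example are illustrative assumptions and do not come from the simulator.

```python
from collections import Counter, deque


def average_degree(neighbors):
    """Average number of neighbors over all agents."""
    return sum(len(nbrs) for nbrs in neighbors.values()) / len(neighbors)


def degree_distribution(neighbors):
    """Map each degree k to the fraction of agents that have degree k."""
    counts = Counter(len(nbrs) for nbrs in neighbors.values())
    return {k: c / len(neighbors) for k, c in counts.items()}


def distances_from(neighbors, source):
    """Breadth-first search: least number of hops from source to each reachable agent."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def average_distance(neighbors):
    """Mean hop count over all ordered pairs of distinct, mutually reachable agents."""
    total, pairs = 0, 0
    for a in neighbors:
        for b, d in distances_from(neighbors, a).items():
            if b != a:
                total += d
                pairs += 1
    return total / pairs


# Illustrative network: a ring of four agents, each with exactly two neighbors.
ring = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
print(average_degree(ring), degree_distribution(ring), average_distance(ring))
```

A heavy-tailed degree distribution would indicate a scale-free effect, and a small average distance a small-world effect, which is how the two effects are expressed in the rest of the paper.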

2.2. Multiagent Team Coordination

Coordination in a large multiagent team can be briefly described as follows: the agents $A$ cooperate on a joint goal, which can be decomposed into discrete subgoals $g_1, \dots, g_n$. To achieve the subgoals, the corresponding tasks $T = \{\tau_1, \dots, \tau_m\}$ are typically performed by individuals. Agents must perform the individual tasks in $T$, when they are applicable, for the team to receive reward. An amount of reward is received by the team when an agent performs a task. The reward depends on the agent and the task, the capability of the agent, and the resources that the agent holds, as specified by the functions below.

The function $\mathrm{Assigned}(a, \tau) = 1$ if agent $a$ is assigned task $\tau$; otherwise it is equal to 0. A given task may be assigned to only one agent at any time; that is, $\sum_{a \in A} \mathrm{Assigned}(a, \tau) \le 1$. However, agents may expect different utilities on the same task based on their capabilities. For example, we should expect a higher reward for a fireman fighting a fire than for an inexperienced civilian. The function $\mathrm{Cap}(a, \tau)$ returns a real value that denotes the expected utility when agent $a$ performs $\tau$.

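Agents always require sharable resources to perform tasks. These resources, $R = \{r_1, \dots, r_l\}$, are discrete and nonconsumable. Agent $a$ has exclusive access to the resources it holds, $\mathrm{Hold}(a) \subseteq R$. Only one agent may hold a resource at any time; that is, $\forall a_i, a_j \in A$, $i \neq j$, $\mathrm{Hold}(a_i) \cap \mathrm{Hold}(a_j) = \emptyset$.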

The objective of the teamwork is to maximize the reward for the team while minimizing the costs of coordination. The overall reward is the sum of the rewards received by all agents.

The costs of coordination are very general and in some cases hard to define. Here we are mainly concerned with the volume of communication.

2.3. Coordination Decision Process

The objective of agents’ interactions is to coordinate jointly so that their common goal can be achieved. In large team coordination, similar to human society, agents always forward tasks they are not capable of performing and resources they do not need across the network. Once an agent accepts a task or resource according to its capability and what it is currently performing, it will execute the task or make use of the resource. The key to the coordination is to reach the most capable agents as fast as possible, so that agents’ communication and assignment delay can be minimized.

Specifically, in our abstracted coordination simulator, initiated tasks or resources are encapsulated into messages, and each agent executes Algorithm 1, which describes a general way of agents’ coordination. The agent first checks whether new tasks have become applicable. If so, the agent encapsulates each such task into a message and adds it to its message queue so that it can be processed (lines 3–7). Next, the agent merges all the messages passed from other agents into its message queue (line 8). It then processes all the messages in the queue. If a message represents a task, the agent accepts the task when its capability to perform that task is higher than the message’s threshold (lines 11–15); otherwise, the agent chooses a neighbor to pass that message to (line 17). If the message encapsulates a resource and the agent’s need for that resource to perform its waiting tasks is higher than the message’s current threshold [13], the resource is held; otherwise it is passed to a neighbor (lines 19–28). Note that when a message is sent, it is removed from that agent’s queue. Finally, the agent checks whether any pending tasks can now be executed (line 30) and releases any resources that are no longer needed once tasks are complete (lines 32–40).

Algorithm 1: Coordination decision process of an agent.
(1)  ApplicableTasks ← ∅, OwnTasks ← ∅, Holds ← ∅, Messages ← ∅;
(2)  while (true) do
(3)   for (τ ∈ agent & τ ∉ ApplicableTasks) do
(4)    if (Applicable(τ)) then
(5)     ApplicableTasks.append(τ);
(6)     Messages.append(CreateMessages(τ));
(7)    end if
(8)    Messages.append(recvMessages());
(9)   end for
(10)  for (m ∈ Messages) do
(11)   if (m is TaskMessages(τ)) then
(12)    if (GetCap(τ) > m.threshold) then
(13)     if (τ ∉ OwnTasks) then
(14)      OwnTasks.append(τ);
(15)     end if
(16)    else
(17)     SendToNeighbour(m);
(18)    end if
(19)   else if (m is ResourceMessages(r)) then
(20)    m.threshold += δ;
(21)    if (GetNeed(r) > m.threshold) then
(22)     if (r ∉ Holds) then
(23)      Holds.append(r);
(24)     end if
(25)    else
(26)     m.threshold −= δ;
(27)     SendToNeighbour(m);
(28)    end if
(29)   end if
(30)   CheckExecution(OwnTasks, Holds);
(31)  end for
(32)  for (τ ∈ OwnTasks) do
(33)   if (τ is complete) then
(34)    OwnTasks.remove(τ);
(35)    for (r ∈ ChkUnneed(OwnTasks, Holds)) do
(36)     Holds.remove(r);
(37)     SendToNeighbour(CreateMessages(r));
(38)    end for
(39)   end if
(40)  end for
(41) end while
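The message-processing part of Algorithm 1 (lines 10–29) can be sketched in a few lines of Python. This is a simplified illustration rather than the simulator’s implementation: the `Message` and `Agent` classes, the dictionary lookups standing in for GetCap and GetNeed, the uniformly random choice of neighbor, and the value of `delta` are all assumptions made for the example.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Message:
    kind: str         # "task" or "resource"
    item: str         # identifier of the encapsulated task or resource
    threshold: float  # minimum capability or need required to accept


@dataclass(eq=False)
class Agent:
    name: str
    capability: dict  # task id -> capability value (stands in for GetCap)
    need: dict        # resource id -> need value (stands in for GetNeed)
    neighbors: list = field(default_factory=list)
    inbox: list = field(default_factory=list)
    own_tasks: list = field(default_factory=list)
    holds: list = field(default_factory=list)

    def send_to_neighbor(self, msg, rng):
        # Assumption: the neighbor is picked uniformly at random.
        rng.choice(self.neighbors).inbox.append(msg)

    def step(self, rng, delta=0.1):
        """Process every queued message once: accept it or pass it on."""
        messages, self.inbox = self.inbox, []
        for msg in messages:
            if msg.kind == "task":
                if self.capability.get(msg.item, 0.0) > msg.threshold:
                    if msg.item not in self.own_tasks:
                        self.own_tasks.append(msg.item)
                else:
                    self.send_to_neighbor(msg, rng)
            elif msg.kind == "resource":
                msg.threshold += delta
                if self.need.get(msg.item, 0.0) > msg.threshold:
                    if msg.item not in self.holds:
                        self.holds.append(msg.item)
                else:
                    msg.threshold -= delta
                    self.send_to_neighbor(msg, rng)


# Two connected agents: the scout cannot perform "attack", the UAV can.
rng = random.Random(0)
scout = Agent("scout", capability={"attack": 0.2}, need={})
uav = Agent("uav", capability={"attack": 0.9}, need={"fuel": 0.8})
scout.neighbors, uav.neighbors = [uav], [scout]

scout.inbox.append(Message("task", "attack", threshold=0.5))
scout.step(rng)  # capability 0.2 is below the threshold, so the task is forwarded
uav.step(rng)    # capability 0.9 exceeds the threshold, so the task is accepted

uav.inbox.append(Message("resource", "fuel", threshold=0.5))
uav.step(rng)    # need 0.8 exceeds the raised threshold 0.6, so the resource is held
```

Task creation, execution checks, and the release of unneeded resources (lines 3–8 and 30–40) are omitted to keep the sketch short.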

3. Coordination Efficiency over Different Network Topologies

In this section, we briefly analyze how the team organization affects team coordination performance. We model the routing of team coordination messages over the network as a finite Markov chain, which is illustrated in Figure 1. For a specific message movement, we can define different states. In Figure 1, $s_i$ denotes the state in which the message is at an agent whose shortest distance to the sink agent is $i$. The transition probability $p_{i,j}$ is the probability of the message being passed from state $s_i$ to $s_j$. Because each message moves only one step at a time, $p_{i,j} = 0$ except for $|i - j| \le 1$. Therefore, from a state $s_i$, the message may move closer to the destination ($p_{i,i-1}$), stay on the same level ($p_{i,i}$), or move further away ($p_{i,i+1}$), which are the three statuses shown in Figure 2. When the message reaches state $s_0$, it is kept at the destination and $p_{0,0} = 1$. If we suppose that $\pi^{(0)} = (\pi^{(0)}_0, \dots, \pi^{(0)}_n)$ is the initial probability distribution, with $\pi^{(0)}_i$ the probability of the message being in state $s_i$, then, according to the theory of Markov chains [14], the state distribution after $k$ steps can be calculated as

$$\pi^{(k)} = \pi^{(0)} P^{k}, \qquad P = (p_{i,j}),$$

and the probability that the message has reached the sink agent after $k$ steps is its component $\pi^{(k)}_0$.

As agents transmit messages randomly to any one of their neighbors, the distance between a message and its destination varies. Figure 2 shows the relative rates of $p_{i,i-1}$ (marked as “Closer”), $p_{i,i}$ (marked as “Same”), and $p_{i,i+1}$ (marked as “Further”) for scale-free and random networks [6]. Notice that we average over all nodes at distance $i$, though the rates vary from node to node. The $x$-axis shows the distance from a node to the target node, that is, the sink agent. The $y$-axis shows the proportions of the states “Further,” “Same,” and “Closer”; the different areas represent the corresponding proportions, which sum to 100%. In general, the closer the agent is to the sink agent, the more likely a random movement is to lead the message further from it. Conversely, the further the agent is from the sink agent, the more likely a random movement is to lead the message closer to it. Figure 2 thus shows that different complex network topologies yield different message movement probabilities.

Figure 2(a) shows the messages’ state transition probabilities with a typical scale-free network organization, and Figure 2(b) shows those with a typical random network organization.

Suppose that the message has the same initial distribution in both networks. After a given number of random movement steps, the state probability distribution is as listed in Table 1. For example, after 1000 steps in a scale-free network, the message has reached the sink agent in 86% of cases. On the other hand, after 1000 steps in a random network, the message has reached the sink agent in only about 65% of cases. Under random forwarding, information transmission in a scale-free network is therefore markedly more efficient than in a random network.
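The Markov chain calculation can be reproduced with a short script. The sketch below assumes a chain in which state $i$ means that the message is $i$ hops from the sink, state 0 is absorbing, and each move changes the distance by at most one. The transition probabilities used here are made-up illustrative values, not the ones behind Figure 2 or Table 1.

```python
def step_distribution(dist, closer, same, further):
    """Advance the state distribution by one random message move.

    dist[i] is the probability that the message is i hops from the sink.
    closer[i], same[i], and further[i] are the probabilities of moving to
    state i-1, staying in state i, or moving to state i+1.
    """
    n = len(dist)
    nxt = [0.0] * n
    nxt[0] = dist[0]  # state 0 is absorbing: the message stays at the sink
    for i in range(1, n):
        nxt[i - 1] += dist[i] * closer[i]
        nxt[i] += dist[i] * same[i]
        if i + 1 < n:
            nxt[i + 1] += dist[i] * further[i]
    return nxt


def arrival_probability(initial, closer, same, further, steps):
    """Probability that the message has reached the sink after `steps` moves."""
    dist = list(initial)
    for _ in range(steps):
        dist = step_distribution(dist, closer, same, further)
    return dist[0]


# Hypothetical three-state chain (distances 0, 1, 2); each row sums to 1.
closer = [0.0, 0.5, 0.5]
same = [1.0, 0.25, 0.5]
further = [0.0, 0.25, 0.0]
initial = [0.0, 0.0, 1.0]  # the message starts two hops from the sink

for k in (1, 2, 10, 50):
    print(k, arrival_probability(initial, closer, same, further, k))
```

Replacing the three probability lists with values measured on a scale-free or a random network would give the kind of comparison reported in Table 1.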

4. Team Reorganization Approach

The theoretical analysis demonstrates that the organization structure of a large-scale multiagent team does influence its efficiency. In this section, we introduce our team adjustment approach, which constructs a better team organization so that team coordination efficiency can be improved. Taking both the intrinsic characteristics of agents’ coordination and the complex network effects into consideration, we propose an integrated adaptive algorithm and build two heuristic models to learn the agents’ closely coordinative relationships.

Our previous study found that the communication efficiency of the team, measured by the number of hops a message travels across the team, improves if messages are forwarded down the link that leads toward the destination agent with higher probability. Therefore, we can build a data structure that learns from sampled team coordination data and feed it into the reorganization algorithm.

The basic process of our reorganization approach is described in Figure 3. In the learning process, we sample the interactions between agents from the coordination simulations and build two heuristic models, the connection matrix and the subteam model, to find the closely coordinative relationships for each pair of agents. Next, by taking the complex network attributes, the connection matrix, and the subteam model into consideration, we build a probability model to measure the connection between each pair of agents. A high probability is assigned to pairs of agents who closely cooperate, are within the same subteam, or promote the scale-free and small-world effects. Finally, we transform the probability model into an adaptive reorganization algorithm, in which agents recursively pick their neighbors based on their probabilities.

4.1. Connection Matrix

The coordination between agents is somewhat similar to human society, in which groups of people closely coordinate and communicate according to their common interests. Tasks or resources therefore always travel from their sources to the agents who are capable of performing the task or using the resource encapsulated. For example, a piece of information about finding a hostile tank always starts from an unmanned scout that is good at information gathering and is useful to the UAVs that can attack tanks. This cooperative relationship is an intrinsic characteristic of the team and is independent of how the team is organized or how the coordination is delivered.

If the messages coming from a source can be delivered to their destinations over shorter paths, the coordination is improved. In addition, the pairs of source and target agents are fixed by their characteristics, such as their capabilities, which are predefined before the coordination. Therefore, if we can learn which partners always closely coordinate, we are able to reorganize the team so that the average distance between those partners is shortened.

According to this idea, we use a matrix $M$ to record messages’ movements. Each element $M_{ij}$ records the number of messages whose sender is agent $a_i$ and whose receiver is agent $a_j$. In our learning process, when a message has been accepted (as shown in lines 11–15 and 21–24 of Algorithm 1), we record its source agent $a_i$ and sink agent $a_j$, and $M_{ij}$ is incremented by one. The learned matrix can then be easily transformed into the connection matrix $C$, which is symmetric and records the coordination between any pair of agents.
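The bookkeeping described above can be illustrated as follows. The counting step follows the text; the transform into a symmetric matrix is an assumption made for this sketch (each entry is the sum of the message counts in both directions), since other symmetric transforms, such as a normalized one, would fit the description equally well. The function names and the message log are hypothetical.

```python
def record_acceptance(counts, source, sink):
    """Add one to the number of messages sent by `source` and accepted by `sink`."""
    counts[source][sink] += 1


def symmetrize(counts):
    """Assumed transform: sum the counts of both directions for each pair."""
    n = len(counts)
    return [[counts[i][j] + counts[j][i] for j in range(n)] for i in range(n)]


n_agents = 3
M = [[0] * n_agents for _ in range(n_agents)]

# Hypothetical log of accepted messages as (source agent, sink agent) pairs.
for source, sink in [(0, 2), (0, 2), (2, 0), (1, 2)]:
    record_acceptance(M, source, sink)

C = symmetrize(M)
for row in C:
    print(row)
```

Agents 0 and 2 exchanged three messages in total, so they obtain the largest entry and would be the strongest candidates for a direct link.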

4.2. Subteams

Inspired by the clusters formed by closely cooperative individuals in human society, we build the subteam model to describe similar group activities across the network. The model, which is described in Figure 4 [11], follows the mechanism of coordinative task planning. In the team coordination model we have built, the common goal is broken into subgoals $g_1, \dots, g_n$, which can be executed by individual agents. Hence, agents can follow the planning mechanism to coordinate and achieve their subgoals.

First, we predefine a number of plan templates in a library for agents to instantiate their plans. For example, when there is a fire in a building, a plan will be instantiated because the situation matches a template for disaster response. Each subgoal is addressed with a plan, $plan_i = \langle g_i, recipe_i, roles_i, d_i \rangle$, and thus the overall team plans are $Plans = \{plan_1, \dots, plan_n\}$. $g_i$ is the subgoal, and $recipe_i$ describes the way the subgoal will be achieved. $roles_i$ are the individual activities that must be performed to execute $recipe_i$, and $d_i$ is the domain-specific information pertinent to the plan.

Distributed plan creation is implemented by individual agents on behalf of the team, and we allow any member to commit the team to executing a plan when it detects that subgoal $g_i$ is relevant. The subteam formation process commences when an individual agent detects all the appropriate preconditions that match a plan template in the library and subsequently instantiates a plan, $plan_i$. The $roles_i$ in $plan_i$ are embedded in messages, which are forwarded across the network until an agent finally accepts each role. Once it has accepted, the agent becomes a member of the subteam and makes a temporary commitment to perform the role toward the subgoal with the other subteam members. With the completion of the plan, the subteam is dismissed so that a new subteam for a new plan can be formed.

This algorithm can be described as an extension of task allocation in team coordination. Algorithm 2 briefly describes the formation of subteams. Although agents dynamically form and dismiss subteams, the members of a subteam communicate frequently for their common interest while the subteam exists. This should be learned and modeled in our adaptive algorithm so that agents who were frequently in the same subteams are connected closely to improve team performance.

Algorithm 2: Formation of subteams.
(1)  for (m ∈ Messages) do
(2)   if     then
(3)    if     then
(4)     if     then
(5)      OwnTasks.append( );
(6)       ( );
(7)       ( );
(8)     end if
(9)    end if
(10)  end if
(11) end for
(12) for (τ ∈ OwnTasks) do
(13)  if (τ is complete) then
(14)   OwnTasks.remove(τ);
(15)    ;
(16)  end if
(17) end for

Because subteams are an intrinsic characteristic of the team, we consider them an important factor for team reorganization. In the coordination process, when agents inform their neighbors to join a subteam, we record the subteam ID for each agent that joins. Thus, we can learn the set of subteams that agent $a_i$ has joined, written as $S_i$, whose size is $|S_i|$. The overlapping subteams that agent $a_i$ and agent $a_j$ have joined together are written as $S_i \cap S_j$. Therefore, the close relationship between $a_i$ and $a_j$ in the subteam model can be formalized in terms of these two quantities.
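To make the subteam statistic concrete, the sketch below records subteam memberships and scores a pair of agents by the share of subteams they have in common. The exact formula is not recoverable from the text, so the ratio used here (shared subteams divided by all subteams either agent joined) is only an assumption chosen for illustration; the function names and the membership data are hypothetical as well.

```python
from collections import defaultdict


def record_membership(joined, agent, subteam_id):
    """Remember that `agent` joined the subteam with the given ID."""
    joined[agent].add(subteam_id)


def subteam_closeness(joined, a, b):
    """Assumed measure: shared subteams relative to all subteams of a or b."""
    shared = joined[a] & joined[b]
    either = joined[a] | joined[b]
    return len(shared) / len(either) if either else 0.0


joined = defaultdict(set)
# Hypothetical history of (agent, subteam ID) pairs.
history = [("a1", 1), ("a1", 2), ("a1", 3),
           ("a2", 2), ("a2", 3), ("a2", 4),
           ("a3", 5)]
for agent, subteam_id in history:
    record_membership(joined, agent, subteam_id)

print(subteam_closeness(joined, "a1", "a2"))
print(subteam_closeness(joined, "a1", "a3"))
```

Agents a1 and a2 share two of the four subteams they joined between them, while a1 and a3 share none, so only the first pair would be favored for a direct connection.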

4.3. Integrated Reorganization Algorithm

In this section, we propose our adaptive reorganization algorithm. The key is to design an integrated algorithm that connects each pair of closely cooperating agents and promotes the helpful complex network attributes. In our algorithm, the closely coordinated relationships are defined according to the connection matrix and the subteam model. To build the integrated probability model, the probability of connecting agents $a_i$ and $a_j$ directly is written as $q_{ij}$ ($i \neq j$). It is correlated with the connection matrix and the subteam model according to the closely cooperative relationship, with the degree of $a_j$ according to the scale-free effect, and with the distance between $a_i$ and $a_j$ according to the small-world effect. Please note that we put a very small positive value in the probability model as a default to guarantee that, although less likely, any agent is still able to connect directly with any other agent. Specifically, we write $Q_i$ as the probability vector with which $a_i$ connects with each of the others.

The network reorganization algorithm is expressed in Algorithm 3. In this algorithm, the network starts from an empty link set (line 1). Each agent picks a number of neighbors equal to half of the predefined average degree of the network (line 3) and connects with each of them if they have not been connected yet (lines 5–7). The selection function picks agent $a_i$’s neighbor based on its probability vector, which is the key to the algorithm. In this paper, we consider four key factors to build the probability of agent $a_i$ directly connecting to agent $a_j$:
(i) the connection factor models the preference of agent $a_i$ directly connecting to $a_j$ according to the connection matrix;
(ii) the subteam factor models the preference of agent $a_i$ directly connecting to $a_j$ according to the subteam model;
(iii) the scale-free factor models $a_j$’s degree to infer whether $a_i$ should connect to $a_j$ to promote the scale-free effect. The higher the degree of $a_j$ is, the more likely $a_i$ will connect to $a_j$;
(iv) the small-world factor models the small-world effect. The further the distance between $a_i$ and $a_j$ is, the more likely they will be directly connected.

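Algorithm 3: Network reorganization.
(1)  E ← ∅;
(2)  for (i = 1 to No_of_agents) do
(3)   for (k = 1 to Avg_degree/2) do
(4)    repeat
(5)     a_j ← neighbor picked for a_i according to its probability vector;
(6)    until (a_i, a_j) ∉ E
(7)    E ← E ∪ {(a_i, a_j)};
(8)   end for
(9)  end for
(10) return G = (A, E);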

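If agent $a_i$ starts a new connection, the probability of connecting with $a_j$ is defined as a weighted combination of the factors above, where normalization should be applied. The weights are the importance factors. Please note that if only the closely cooperative factors have a nonzero weight, we only take the characteristic of closely coordinative relationships into consideration; if only the scale-free factor has a nonzero weight, we set up a standard scale-free network; and if only the small-world factor has a nonzero weight, we set up a standard small-world network [12].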
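The construction loop of Algorithm 3 and the weighted probability model can be sketched together. Several details here are assumptions rather than the paper’s definitions: the closeness matrix stands for the combined connection-matrix and subteam factors, each factor is normalized to sum to one before weighting, the degree used for the scale-free factor is the degree in the network under construction, the distances come from the original network, and `eps` plays the role of the small default value. Instead of a repeat-until loop, the sketch samples only among agents that are not yet linked, which yields the same distribution without retries.

```python
import random


def connection_probabilities(i, n, closeness, degree, distance, weights, eps=1e-6):
    """Probability vector of agent i over all agents (its own entry is 0)."""
    w_close, w_scale_free, w_small_world = weights

    def normalized(values):
        total = sum(values)
        return [v / total if total else 0.0 for v in values]

    others = [j for j in range(n) if j != i]
    close = normalized([closeness[i][j] for j in others])
    scale_free = normalized([degree[j] for j in others])
    small_world = normalized([distance[i][j] for j in others])
    scores = [eps + w_close * c + w_scale_free * s + w_small_world * d
              for c, s, d in zip(close, scale_free, small_world)]
    total = sum(scores)
    probs = [0.0] * n
    for j, score in zip(others, scores):
        probs[j] = score / total
    return probs


def reorganize(n, avg_degree, closeness, distance, weights, rng):
    """Build an undirected link set in which each agent adds avg_degree/2 links."""
    edges = set()
    degree = [0] * n
    for i in range(n):
        for _ in range(avg_degree // 2):
            candidates = [j for j in range(n)
                          if j != i and (min(i, j), max(i, j)) not in edges]
            if not candidates:
                break  # agent i is already linked to every other agent
            probs = connection_probabilities(i, n, closeness, degree, distance, weights)
            j = rng.choices(candidates, weights=[probs[c] for c in candidates])[0]
            edges.add((min(i, j), max(i, j)))
            degree[i] += 1
            degree[j] += 1
    return edges


# Hypothetical inputs: uniform closeness and ring-like distances for 10 agents.
n = 10
closeness = [[1] * n for _ in range(n)]
distance = [[abs(a - b) for b in range(n)] for a in range(n)]
edges = reorganize(n, 4, closeness, distance, (0.5, 0.25, 0.25), random.Random(1))
print(len(edges))
```

Setting the weights to (1, 0, 0) in this sketch would reorganize the team by the learned cooperative relationships only, which corresponds to the configuration used in the first experiments.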

5. Simulations and Results

To simulate real team coordination, we use our abstract simulator, CoordSim [15]. This simulator is capable of simulating the major aspects of coordination, including task assignment and resource allocation. CoordSim abstracts the environment by simulating only its effects on the team. According to the team coordination process in Section 2.3, reward is simulated as being received by the team when a task is allocated. CoordSim allows a large number of parameters to be varied and statistics to be recorded, such as the rewards and message movements, which are important to our approach. Two interfaces of this simulator are shown in Figure 5. There are more than 20 parameters that can be varied, covering the major aspects of large heterogeneous coordination.

If not otherwise stated, the experiments are configured as follows. There are 100 agents deployed in a 500 × 500 environment to perform 100 tasks with 100 resources. Each task requires at least one resource. In the default setup, the heterogeneous team has five different types of capabilities. Task and resource messages are allowed to move until they are accepted. “Reward” is the sum of the rewards received by each agent. “Messages” is the number of times that agents communicated for coordination. The objective of the team coordination is to gain the best tradeoff between maximizing team rewards and minimizing messages. To build intelligent coordination between agents, we used the approach in [13] as the coordination schema in the simulations. Each simulation lasts for 500 time steps. All of the experimental results are based on 100 runs. To build the connection matrix and the subteam model, we sampled the team with 100 runs as well. In each run, all the agents and environmental settings remain the same, but the organization networks are different.

We evaluated the efficiency of our team reorganization algorithm in five experiments. In the first experiment, we first organized the team as random networks and set the importance factors so that the team was reorganized according to the connection matrix and the subteam model only. We varied the team size from 50 to 300 and, for a fair comparison, scaled the numbers of tasks and resources with the team size to keep the tasks per agent and resources per agent the same. The experimental results in Figure 6 show that, regardless of the team size, the reorganized network outperforms the random team organization with higher rewards and lower communication costs.

In the second experiment, shown in Figure 7, we organized the original network with three different structures: random, small world, and scale free. The team size is 100, and the team is reorganized according to the connection matrix and the subteam model only. As we expected, since we have taken the team’s intrinsic closely cooperative relationship between heterogeneous agents into consideration, the team performance is greatly improved no matter what the original network is.

In the third experiment, shown in Figure 8, we investigate whether the reorganized networks have the complex network attributes. The reorganized network is learned from a random network, and the team size is 100. The importance factors are set as in the previous experiments. The distributions of distance and degree are shown in Figure 8. We can see that the reorganized network exhibits the small-world and scale-free effects.

In the fourth experiment, we verify our design in different heterogeneous teams. The team size is 100, and the original network is a random network. The importance factors are set as before. The number of agents’ capability types is varied from 2 to 10 to make the team more and more heterogeneous. Note that when the team becomes more heterogeneous, fewer agents are capable of accepting a given task message, which makes the coordination very hard. As the experimental results in Figure 9 show, the reward keeps decreasing in the original random networks; however, this is not the case in our reorganized networks. We hypothesize that when the team becomes more heterogeneous, the closely cooperative relationship is more prominent and easier to capture, which makes our design more efficient.

In the last experiment, we set up the original network as a random network, and the team size is 100. We use six settings of the importance factors so that the team organization is set according to all three factors, but with higher portions given to the small-world and scale-free effects. The experimental results in Figure 10 show that, compared with the original network, which is not reorganized, team performance is improved by a reorganization that integrates all three factors. However, when the closely cooperative relationship is absent from the organization, the performance is worse. Our explanation is that small portions of complex network effects help the team, but discovering the closely cooperative relationship contributes the most.

6. Related Work

Many researchers have demonstrated that the organizational design of multiagent systems has a significant effect on their performance [16], and that the properties of small-world networks can enhance a network’s signal propagation speed, computational power, and synchronization [17]. Glinton et al. [18] found that the team achieves high performance and rapid convergence when a scale-free organization is formed, and that a limited number of links per agent in the complex network improves team performance.

A range of organizational strategies have been proposed to improve multiagent teams. The work in [19] presented a distributed scenario for team formation in multiagent systems and concluded that the direct interconnections among agents are determined by the agent interaction topology. Kota et al. [20] provided a self-organization method that enables agents to modify their structural relations to achieve a better allocation of tasks. Keogh and Sonenberg [21] designed a flexible, coordinated organization-based agent system in which agents can adjust their own attitudes to fit in with others in a changing situation by having access to organizational information that they can change. A composite self-organization mechanism for a multiagent network is proposed in [22]. It enables agents to dynamically adapt the team organization by using a trust model and former task allocations to help agents decide whom they should connect with. Terauchi et al. [23] proposed an agent organization management system that provides context-based organizational information for problem solving. It is concerned only with improving system scalability and flexibility and neglects coordination efficiency.

7. Conclusion and Future Works

In this paper, we have made an initial effort to find out how to adjust team organization to expedite team performance. We have shown that team organization is a key factor in team performance. Based on a scalable team coordination schema, we designed an adaptive team organization algorithm that incorporates the nature of agents’ cooperative relationships and the attributes of complex networks. Our experiments support the validity of our design.

While this work represents an important step in this regard, much work remains to be done. First of all, we have only designed an offline learning algorithm to adjust the team, while online adjustment may also be possible. Secondly, our approach is based on the organization of an associate network built on a P2P coordination infrastructure, while in many application domains the coordination is based on broadcast.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This research has been sponsored by NSFC 61370151, 60905042, and 61202211, National Science and Technology Support Program of China 2012BAI22B05, and Central University Basic Research Funds Foundation ZYGX2011X013.