#### Abstract

The recent advent of satellite swarm technologies has enabled space exploration with a massive number of picoclass, low-power, and low-weight spacecraft. However, developing swarm-based satellite systems, from conceptualization to validation, is a complex multidisciplinary activity. One of the primary challenges is achieving energy-efficient data transmission between the satellite swarm and terrestrial terminal stations. Employing Lyapunov optimization techniques, we present an online control algorithm that optimally dispatches traffic load among different satellite-ground links to minimize the overall energy consumption over time. The algorithm makes control decisions on traffic dispatching over intersatellite-links and up-down-links independently and simultaneously, and it offers provable energy and delay guarantees without requiring any statistical information about traffic arrivals or link conditions. Rigorous analysis and extensive simulations demonstrate the performance and robustness of the proposed algorithm.

#### 1. Introduction

To enable robust space exploration, astronautic scientists are exploiting principles and techniques that can help spacecraft systems become more resilient through self-organization and automatic adaptation. Inspired by the swarming behaviors of animals in nature [1], they have recently proposed building swarm-based satellite systems composed of a large number of picoclass, low-power, and low-weight satellite units that work together on space exploration tasks [2]. Typical projects include NASA ANTS (autonomous nanotechnology swarm) [3], ESA APIES (asteroid population investigation and exploration swarm) [4], and DARPA System F6 (future, fast, flexible, fractionated, free-flying spacecraft) [5].

As shown in Figure 1, a swarm typically consists of several subswarms, which are temporary groups organized to perform a particular task. Each swarm group has a group leader (i.e., “ruler”) and a large number of “workers” carrying specialized instruments. The ruler is responsible for coordinating its workers to cooperate on area monitoring and data gathering. In addition, there are some “messengers” in the swarm that coordinate communications among rulers, workers, and earth stations. Although they have different roles and responsibilities, all three types of units rely mainly on power from the sun for data gathering, processing, and communication. However, the on-board solar panels of small-size satellites cannot be very large [6]. Thus, energy efficiency must be taken into account when designing such satellite swarms.

To address the energy challenges of satellite swarms [7], we propose a novel online control algorithm based on Lyapunov optimization techniques [8] to make traffic dispatching decisions in the context of satellite-ground communications, which offers significant potential for reducing the energy consumed by data transmission over UDLs (up-down-links). Specifically, this algorithm aims to reduce energy consumption by making control decisions on (1) how to dispatch the traffic load from the workers over ISLs (intersatellite-links) to the messengers and (2) how to choose the best link from the several available UDLs of a given messenger to transmit the aggregated data from this messenger to ground. Our algorithm does not require any prior knowledge of the system statistics or any prediction of traffic arrivals and link conditions. Moreover, it is computationally efficient and easy to implement in practical systems. It obtains a time-average energy consumption within a deviation of $O(1/V)$ from optimality, while bounding the traffic queue backlog by $O(V)$, where $V$ is a nonnegative control parameter that serves as a design knob for the energy-delay tradeoff, that is, how much we emphasize energy minimization compared to system stability. We analyze the performance of this algorithm theoretically and demonstrate its optimality, stability, and robustness through extensive simulation experiments. To our knowledge, prior work has not yet studied this energy-saving issue for satellite swarms, and our use of the Lyapunov framework for solving it is also novel.

The rest of this paper is organized as follows: Section 2 describes the theoretical model and formulates the optimization problem. Section 3 presents the proposed online control algorithm and provides an analysis on its performance bound and robustness. Section 4 analyzes the simulation results. Finally, Section 5 concludes the paper.

#### 2. Problem Formulation

We consider a satellite swarm, as shown in Figure 2, which has heterogeneous workers denoted by , as well as homogeneous messengers denoted by . For ease of reference, important notations are summarized in the Important Notations section. The whole swarm system operates in discrete time with unit time slots . In every time slot , we denote by the amount of newly generated data at , where denotes the arrival vector. We assume that are independent and identically distributed (i.i.d.) over time slots with . We also assume that there exists a finite maximum such that for all and all . However, we do not assume any prior knowledge of the statistics of , and the traffic arrivals can be unpredictable and time-varying.

In a swarm, a worker continuously gathers and temporarily stores the data obtained by its specialized exploring instruments. It then decides how to distribute the queued data to the messengers for transmission to ground. To model this decision, we use to denote the amount of data traffic sent from to at time and use to denote the vector of traffic load dispatching rates at . We assume that, in every , must be drawn from some general feasible set ; that is, for all , and each set contains the constraint that for some finite maximum . This assumption is quite general; for example, , for all representing that cannot process data from for some reason.

Current satellite swarm prototypes generally use two different technologies for data communication, that is, commercial-off-the-shelf wireless technology (e.g., IEEE 802.11) for intersatellite communications within the swarm [9] and high-frequency microwave wireless technology (e.g., UHF/C/X/Ku/Ka band) for satellite-ground communications between swarm messengers and earth stations [10]. The latter consumes considerably more energy than the former [7] and has a significant impact on the energy consumption and operational lifetime of the messengers. In the same time slot, a messenger may have more than one accessible earth station at different locations, and the UDLs to these stations have distinct communication characteristics, including available duration and error rate [11]. Hence, the swarm can save energy by scheduling data transmission onto messengers whose currently available UDLs offer high quality and low cost. Let denote the decision of messenger on the UDL selection in time slot . Then, the service rates (i.e., throughput) of messenger for slot can be given as . We assume a maximum transmission rate , regardless of , so that . Besides, the energy consumption for transmitting data for in slot can be given as where is the coefficient of energy consumption for transmitting per unit data, and is the error rate of selected UDL of in slot [12]. Note that this is a simple throughput-based energy consumption model used in wireless communication, and more accurate models can be used instead if necessary [13].

Let , , , , be the vector denoting the data traffic queued at the workers and the messengers at time slot ; we can capture the following queueing dynamics over time:
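Queueing dynamics of this kind can be sketched in a few lines of code. The following is a minimal illustration, assuming the standard max[·, 0] queue update used in the Lyapunov framework of [8]; the names `Q`, `H`, `A`, `r`, and `mu` are hypothetical stand-ins for the worker backlogs, messenger backlogs, arrivals, dispatching rates, and UDL service rates, and are not taken from the paper.

```python
def step_queues(Q, H, A, r, mu):
    """Advance all queues by one time slot (assumed update form).

    Q[i]    : backlog of worker i at the start of the slot
    H[j]    : backlog of messenger j at the start of the slot
    A[i]    : data newly generated at worker i during the slot
    r[i][j] : data dispatched from worker i to messenger j
    mu[j]   : service rate of the UDL chosen by messenger j
    """
    num_workers, num_messengers = len(Q), len(H)
    # Worker queue: drain what is dispatched (never below zero), then add arrivals.
    new_Q = [max(Q[i] - sum(r[i]), 0.0) + A[i] for i in range(num_workers)]
    # Messenger queue: drain what the UDL serves, then add traffic from workers.
    new_H = [
        max(H[j] - mu[j], 0.0) + sum(r[i][j] for i in range(num_workers))
        for j in range(num_messengers)
    ]
    return new_Q, new_H
```

In this simplified form the messenger is credited with the full nominal dispatch rate even when the worker holds less data, which makes the messenger backlog an upper bound on the true one.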

In the following, we assume that satellites can estimate the unfinished traffic load in their queues accurately, and the case when such estimation has errors will be discussed in Section 3. Throughout the paper, we use the following definition of queue stability:

Our objective is to design a flexible and robust online control algorithm that automatically adapts to the time-varying systems by making decisions on and for solving the stochastic minimization problem as follows:

However, traditional techniques, for example, Markov decision theory and dynamic programming, require substantial statistics of system dynamics and suffer from high computational complexity [8]. By comparison, the recently developed Lyapunov optimization technique has proven effective and efficient for designing online control algorithms that jointly address system stability and performance optimization in stochastic networks, especially communication and queueing systems. To achieve this, network algorithms are designed to make control actions that greedily minimize a bound on the following drift-plus-penalty expression in each time slot : where (Lyapunov drift) represents the congestion state of queue backlog, denotes the objective function to be optimized, and is a nonnegative weight that is chosen as desired to affect a performance tradeoff between backlog reduction and penalty minimization. Unlike the traditional techniques, Lyapunov optimization does not require knowledge of the statistics of relevant stochastic models to make online control decisions. Lyapunov optimization algorithms commonly have lower computational complexity and are easy to implement in practical systems. This technique has been applied to many stochastic network optimization problems, including workload/resource scheduling among data centers [14], power management in smart grid [15], and energy/throughput optimization for wireless systems [16]. In the following section, we present an online control algorithm that solves the problem in (5) based on the Lyapunov optimization framework.

#### 3. Online Control Algorithm

Let be a concatenated vector of all and queues. A quadratic Lyapunov function, which serves as a scalar measure of network congestion [8], can be defined as follows:

Then, the one-slot conditional Lyapunov drift is defined as

If control decisions are made in every slot to greedily minimize , then all backlogs are consistently pushed towards a lower congestion state, which intuitively maintains system stability [8]. However, the objective energy consumption function to be minimized should also be incorporated. Thus, following the drift-plus-penalty framework in Lyapunov optimization [8], we design the control algorithm to make decisions on and that minimize an upper bound on the following drift-plus-penalty term in each time slot:
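For orientation, the three quantities used in this section can be written in the generic form of the drift-plus-penalty framework of [8]. The symbols below are assumed for illustration only: $Q_i(t)$ and $H_j(t)$ stand for the worker and messenger backlogs, $\Theta(t)$ for their concatenation, $E_j(t)$ for the transmission energy of messenger $j$, and $V \ge 0$ for the control parameter.

```latex
% Assumed notation (not taken from the paper): Q_i, H_j, \Theta, E_j, V.
\begin{align*}
L(\Theta(t)) &= \frac{1}{2}\Big[\sum_{i} Q_i(t)^2 + \sum_{j} H_j(t)^2\Big],\\
\Delta(\Theta(t)) &= \mathbb{E}\big\{L(\Theta(t+1)) - L(\Theta(t)) \,\big|\, \Theta(t)\big\},\\
\text{drift-plus-penalty:}\quad
 &\Delta(\Theta(t)) + V\,\mathbb{E}\Big\{\sum_{j} E_j(t) \,\Big|\, \Theta(t)\Big\}.
\end{align*}
```

A larger $V$ places more weight on the energy penalty relative to the drift, which is how the energy-delay tradeoff enters the control decisions.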

A key derivation step is to obtain an upper bound on this term, which is given by the following theorem.

Theorem 1 (drift-plus-penalty bound). *Under any control algorithm, the drift-plus-penalty expression has the following upper bound for all , all possible values of , and all parameters :
**
where .*

*Proof. *Squaring both sides of the queueing dynamic (2) and using the fact that, for any , , , , we have

Summing the above over and using the fact that and , we have

Repeating the above steps for the queue , and by using the fact that , we have

Combining these two bounds together, and taking the expectation with respect to on both sides, we arrive at the following one-slot conditional Lyapunov drift :
where .

Now adding to both sides the penalty expression, that is, the term , proves the theorem.

Minimizing the right-hand-side of (10) is equivalent to minimizing under the same constraints in (5). Thus, we can design the online control algorithm as in Algorithm 1.
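The per-slot decisions that result from such a minimization can be sketched as follows. This is a minimal illustration of a generic drift-plus-penalty controller and should not be read as Algorithm 1 itself. It assumes that the only dispatching constraint is a per-worker cap `r_max[i]` on the total traffic sent in one slot and that each candidate UDL is described by a hypothetical `(rate, energy)` pair; the function and parameter names are likewise illustrative.

```python
def dispatch(Q, H, r_max):
    """Choose r[i][j] to maximize sum_ij r[i][j] * (Q[i] - H[j]).

    With a cap on each worker's total dispatch, the maximizer sends the
    full cap to the least-loaded messenger, and only if that messenger's
    backlog is smaller than the worker's own backlog.
    """
    r = [[0.0] * len(H) for _ in Q]
    best_j = min(range(len(H)), key=lambda k: H[k])
    for i, q in enumerate(Q):
        if q > H[best_j]:
            r[i][best_j] = r_max[i]
    return r


def select_udl(H_j, links, V):
    """Pick the UDL index minimizing V * energy - H_j * rate.

    links : list of (rate, energy) tuples for the candidate UDLs.
    Adding a (0.0, 0.0) entry lets the messenger stay idle in a slot.
    """
    return min(range(len(links)), key=lambda k: V * links[k][1] - H_j * links[k][0])
```

With a small `V` the backlog term dominates and the messenger favors fast links; with a large `V` the energy term dominates and cheaper links are preferred, at the price of longer queues.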

The following theorem presents bounds on the time-average energy cost and queue backlogs achieved by the proposed new algorithm.

Theorem 2 (algorithm performance). *Define as the set of all rate vectors that satisfy the constraints in (5). For any rate vector , suppose there exists an such that ; then under the online control algorithm one has
**
Here and are the maximal value and the optimal value for the cost defined in (5), respectively, and denotes the vector of all 1’s.*
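In the drift-plus-penalty framework of [8], performance bounds of this kind typically take the form sketched below. The symbols are assumed for illustration: $B$ is the constant from the drift bound, $f^{*}$ and $f_{\max}$ are the optimal and maximal cost, $\epsilon$ is the slack in the rate condition, and $Q_i$, $H_j$ are the worker and messenger backlogs. The exact constants of the theorem may differ.

```latex
% Generic form of the [O(1/V), O(V)] tradeoff; symbols are assumptions.
\begin{align*}
\limsup_{T\to\infty}\frac{1}{T}\sum_{t=0}^{T-1}
  \Big[\sum_{i}\mathbb{E}\{Q_i(t)\} + \sum_{j}\mathbb{E}\{H_j(t)\}\Big]
  &\le \frac{B + V\,(f_{\max} - f^{*})}{\epsilon},\\
\limsup_{T\to\infty}\frac{1}{T}\sum_{t=0}^{T-1}\mathbb{E}\{f(t)\}
  &\le f^{*} + \frac{B}{V}.
\end{align*}
```

The first bound grows linearly in $V$ and the second approaches $f^{*}$ as $V$ increases, which is the energy-delay tradeoff described in Section 1.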

*Proof. *According to Caratheodory’s theorem [8, 14], we can easily prove that there exists a randomized stationary control policy that chooses feasible control decisions and , independent of the current queue backlogs, and achieves the following guarantees:

Because in every time slot , our implementation seeks to minimize the right-hand-side of the drift-plus-penalty expression in (10):
where , , and are the resulting decisions and attributed values under any alternative (possibly randomized) policy (denoted by ). Now since , it can be known from (17) that there exists a stationary and randomized policy that achieves the following:
Here is the minimum cost corresponding to the rate vector . Plugging (19) into the right-hand-side of (18) yields

Now we can take expectations on both sides over to get

Rearranging the terms, and using the fact that , we get that

Summing the above over , rearranging the terms, using the fact that for all , and dividing both sides by , we have
which proves (15) by taking limit as .

To prove (16), using (21), we have

Summing the above over , and dividing both sides by , we have
which proves (16) by taking limit as and letting .

As mentioned in Section 2, this algorithm needs to know the concatenated queue backlog to make control decisions. In practice, when this is not available, we use estimated values. The following theorem shows that the algorithm is robust when it makes control decisions based on the queue backlog estimates that differ from the actual queue backlogs.

Theorem 3 (algorithm robustness). *Suppose there exists an such that . Also suppose there exists a constant , such that at all time , the estimated backlog sizes , , and the actual backlog sizes , satisfy and . Then under the algorithm, one has
**
Here .*

*Proof. *It suffices to show that using , , we still minimize the right-hand-side of the drift expression in (10) to within some additive constant. Denote and . Suppose now and are used to carry out the algorithm; then we see that we try to minimize

Now denote the minimum value of to be , that is, the minimum of the right-hand-side of (10) with , and denote the minimum value of to be . Then using the fact that and , and the fact that , (since ), we know that

Using this and (27), we see that

Here , , and are the actions taken by the policy based on . This shows that (10) holds with replaced by and replaced by . The rest of the proof follows similarly to the proof of Theorem 2, which completes the proof.

#### 4. Performance Evaluation

##### 4.1. Simulation Setup

We evaluate the proposed new algorithm with extensive simulations under realistic settings. We simulate a satellite swarm consisting of 5 homogeneous messengers and 50 heterogeneous workers. In every time slot, each messenger can choose to use a specific UDL from at least 3 candidates for transmitting data to ground. According to [17, 18], we set Mb, for all and , and J. We assume the data arrivals at worker follow Poisson processes with Mb, and Mb.
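A traffic generator for a setup of this kind can be sketched as follows. This is a minimal illustration using only the Python standard library; the mean and cap passed in the usage example are placeholder values, not the settings of the experiments, and the function names are hypothetical.

```python
import math
import random


def poisson(lam, rng):
    """Draw one Poisson(lam) sample with Knuth's multiplication method.

    Suitable for small lam; rng is a random.Random instance.
    """
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def sample_arrivals(num_workers, mean_mb, max_mb, rng):
    """Per-worker arrivals for one slot, truncated at the finite maximum."""
    return [min(float(poisson(mean_mb, rng)), max_mb) for _ in range(num_workers)]


# Usage with placeholder values: 50 workers, mean 4 Mb, cap 10 Mb per slot.
arrivals = sample_arrivals(50, 4.0, 10.0, random.Random(0))
```

Truncating at `max_mb` keeps the samples consistent with the bounded-arrival assumption of Section 2.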

To fully investigate the algorithm performance, we compare it with an intuitive online algorithm called LBMin. In LBMin, each worker dispatches all its data to the messenger currently with the smallest queue backlog, while each messenger chooses the UDL with the lowest energy cost for satellite-ground communication. It represents the current task processing practices in computing and network systems [14, 19]. According to Theorem 2.4 in [8], we use a simple control policy to enforce that for each and , so as to guarantee the rate stability of queue .
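The LBMin baseline described above can be sketched as follows. This is a minimal illustration, assuming that every worker observes the same messenger backlogs within a slot and that each candidate UDL comes with a single scalar energy cost; the function names are hypothetical.

```python
def lbmin_dispatch(Q, H):
    """Each worker sends all of its queued data to the messenger
    that currently has the smallest backlog."""
    target = min(range(len(H)), key=lambda k: H[k])
    return [[q if j == target else 0.0 for j in range(len(H))] for q in Q]


def lbmin_select_udl(energy_costs):
    """Each messenger picks the candidate UDL with the lowest energy cost."""
    return min(range(len(energy_costs)), key=lambda k: energy_costs[k])
```

Unlike a drift-plus-penalty controller, neither rule looks at the link throughput, so the baseline can pick a cheap but slow UDL even when the messenger queue is long.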

##### 4.2. Experimental Results

We conduct the following analysis on critical factors to characterize their impacts on the algorithm performance.

(1) Impact of control parameter : we fix time slots and conduct experiments with different values of . From the results shown in Figures 3, 4, and 5, we can observe that the time-average energy cost is high when is small, that it drops significantly as increases, and that it finally converges to the minimum level for larger values of . This quantitatively corroborates Theorem 2 in that the proposed algorithm can approach the optimal cost with a diminishing gap (captured by (16)). However, increasing leads to an almost linear growth of the time-average queue backlog and average delay (captured by (15)). In general, in the algorithm controls the tradeoff between energy cost and service delay. This enables the algorithm to obtain an elastic performance bound by adjusting within the desired cost and delay constraints.

(2) Impact of long-term time slot : in Figure 6, we fix , respectively, and vary from 50 time slots to 600 time slots, which is a sufficient range for exploring the characteristics of different time-scales of long-term processing. The time-average energy cost only fluctuates within , , , and for , respectively. Therefore, we can conclude that changing has relatively little impact on the energy cost, and the proposed algorithm can guarantee stable performance over time.

(3) Characterizing algorithm robustness: as mentioned in Section 2, the proposed new algorithm needs to know the amount of traffic load of each worker. Now we explore the influence of estimation errors on the performance of this algorithm. For each worker, we add a random estimation error (, uniformly distributed) to the amount of traffic load in its queue. We let the algorithm make all the control decisions based on the traffic input with random errors under different values of . In Figures 7–9, we use the results on the original data without estimation errors as the baseline and show the differences in energy cost, queue backlog, and service delay due to injected estimation errors. Figure 7 indicates that for all values we experimented with, the difference (due to errors) in energy cost is between −1.36% and 0.78%. In Figure 8, the difference in queue backlog is between −9.30% and 11.55%. As shown in Figure 9, changes in service delay caused by estimation errors are small, that is, about −1.49 to 1.25 time slots. To conclude, this algorithm is robust to estimation errors of traffic load, as revealed by Theorem 3.
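The error injection used in this kind of robustness test can be sketched as follows. This is a minimal illustration, assuming a relative error drawn uniformly from a symmetric range; `max_rel_error` is a hypothetical parameter and the value in the usage example is a placeholder rather than the range used in the experiments.

```python
import random


def noisy_backlog(backlog, max_rel_error, rng):
    """Return an estimated backlog with a uniform relative error.

    The error factor is drawn from [-max_rel_error, +max_rel_error];
    the estimate is clamped at zero so it stays a valid queue length.
    """
    u = rng.uniform(-max_rel_error, max_rel_error)
    return max(0.0, backlog * (1.0 + u))


# Usage: feed estimates instead of true backlogs to the controller.
rng = random.Random(1)
estimates = [noisy_backlog(q, 0.5, rng) for q in [10.0, 20.0, 0.0]]
```

The controller then makes its dispatching and UDL decisions on `estimates`, while the true queues continue to evolve according to the actual traffic.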

(4) Algorithm comparison: we compare LBMin to the proposed algorithm when . As shown in Figure 10, the performance gap in energy cost between them is almost negligible, that is, about 0.13% to 0.26%. The reason is that the proposed algorithm has approached the optimal energy consumption when , as indicated by Figure 3. However, Figure 11 shows that the proposed algorithm outperforms LBMin on system throughput at all times. With varying from to time slots, the difference grows from about 4.64% to 11.44%. That is because LBMin takes no account of throughput. In conclusion, the proposed algorithm consumes nearly the same energy as LBMin while transmitting more satellite data to ground over time.

#### 5. Conclusions

In this paper, we propose a new algorithm based on Lyapunov optimization, to reduce the energy consumption on satellite-ground communications for satellite swarm systems. Our approach incorporates traffic scheduling actions on ISLs between workers and messengers and on UDLs between messengers and earth stations, so as to transmit data on high-quality, low-cost UDLs to ground. Through analysis and simulations, we show that this algorithm has provable performance bounds and is very effective in reducing energy costs of satellite-ground communication. We also show that this algorithm can guarantee stable performance over time and is robust to traffic load estimation errors. Moreover, it is computationally efficient and easy to implement in large practical swarming systems.

#### Important Notations

: | The number of workers in the swarm |

: | The number of messengers in the swarm |

: | Lyapunov control parameter |

: | The coefficient of energy consumption for transmitting per unit data |

: | Queue backlog of worker in slot |

: | Queue backlog of messenger in slot |

: | The amount of newly generated data at in slot |

: | The maximum value for all |

: | The expectation of |

: | The amount of data traffic sent from to in slot |

: | The maximum amount of traffic that can be distributed from in one slot |

: | The decision of on the UDL selection in slot |

: | The transmission rate of in slot |

: | The maximum transmission rate for all |

: | The error rate of selected UDL of in slot |

: | The energy consumption for transmitting data for in slot . |

#### Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

#### Acknowledgments

This work was supported in part by the State Key Lab of Astronautic Dynamics of China under Grant no. 2014ADL-DW0401, the National Science Foundation of China (NSFC) under Grant no. 61401516 and no. 61202430, the Science and Technology Foundation of Beijing Jiaotong University under Grant no. 2012RC040, and the China Scholarship Council under Grant no. 201308110150. This work has been presented in part at the 16th International Conference on Advanced Communications Technology (ICACT), Pyeongchang, Korea, February 2014.