A Deep Learning Approach to Optimal Sampling Problems
Time-triggered and event-triggered sampling methods have been widely adopted in control systems, and the optimal sampling problems of the two mechanisms have received considerable attention. However, for high-dimensional systems, analytical methods have limitations. In this study, we propose a model-free method, called soft greedy policy for neural network fitting (SGP-NNF), to calculate the optimal sampling period of time-triggered impulse control and the optimal threshold of event-triggered impulse control. A neural network is trained to approximate the objective function. This approach is more widely applicable than the analytical method. At the same time, compared with other ways of generating data, the algorithm can update in real time with greater flexibility and higher accuracy. Simulation results are provided to verify the effectiveness of the proposed algorithm.
1. Introduction

Time-triggered and event-triggered methods are two important sampling mechanisms in control systems [1, 2]. Periodic sampling is usually performed in time-triggered control systems, whereas the event-triggered sampling method takes control actions only when events happen. In the periodic sampling method, if the change of the system states is not significant, the frequent transfer of system measurements causes a great waste of resources. In the event-based sampling method, an event occurs when the system states deviate to a certain extent. This approach focuses on the states of the system and presents considerable advantages over time-triggered sampling. These two sampling methods have been applied to various systems, such as deterministic systems  and stochastic systems , and the two mechanisms have been compared in different settings [5–7]. They are compared in  for one-dimensional linear systems and in  for two-dimensional linear systems; both papers reach the same conclusion that the event-based sampling method has better performance under their respective settings. Based on the superior performance of event-triggered control, a large number of research results have emerged. Event-triggered mechanisms have been designed for different systems with various targets, such as the stability of nonlinear systems [10–12], stabilization of continuous-time stochastic systems [13, 14], and performance optimization under some special conditions [15–19].
For one-dimensional linear stochastic differential systems, Astrom and Bernhardsson  established an objective evaluation of the two sampling mechanisms, composed of the variance of the state and the average control frequency. In , the Kolmogorov backward equation was constructed and solved; then, the optimal threshold for event-based impulse control and the optimal sampling period for periodic impulse control were obtained. Meng and Chen  considered two-dimensional linear stochastic systems. They first converted the two-dimensional stochastic system in Cartesian coordinates to a one-dimensional stochastic system in polar coordinates; then, by using the Kolmogorov backward equation, the optimal threshold was obtained. However, for high-dimensional systems, the probability densities and the mean exit times are not easily computable when solving the Kolmogorov forward and backward equations. Therefore, the optimal threshold and the optimal sampling period are hard to obtain for high-dimensional systems, and comparing the performance of the two sampling methods is difficult.
Motivated by this, we want to obtain the optimal sampling period and the optimal threshold while avoiding solving partial differential equations. Deep learning is a computing approach that uses multiple processing layers to learn data representations with multiple levels of abstraction . It usually uses the back-propagation algorithm to discover the internal structure of data and indicate how the machine should change the internal parameters that compute the representation of each layer from the representation of the previous layer. With the development of artificial intelligence, deep learning has been widely applied in computer vision , natural language processing , reinforcement learning [23, 24], intelligent driving , etc. Sutton and Barto  introduced a soft greedy policy. Based on this policy, and by using neural network fitting , we can obtain the optimal threshold and the optimal period for event-triggered and time-triggered control of high-dimensional systems. This model-free method is applicable to a wide range of applications and does not depend on methods for solving partial differential equations. For high-dimensional systems, we only need to calculate the corresponding objective function values to obtain the optimal sampling period and the optimal threshold. In this way, the two sampling methods can also be compared for high-dimensional systems.
In this paper, time-triggered and event-triggered sampling methods are considered for linear stochastic systems. The contributions of this paper include the following: (a) proposing a Soft Greedy Policy for Neural Network Fitting (SGP-NNF) algorithm based on the deep learning method to get the optimal sampling period of the time-triggered impulse control and the optimal threshold of the event-triggered impulse control, respectively, and (b) comparing SGP-NNF algorithm with other methods, where SGP-NNF algorithm is model-free and can be applied to high-dimensional systems.
The rest of this study is organized as follows. Section 2 introduces the system equation and the two sampling methods. The numerical algorithm is presented in Section 3. Simulation results of the two sampling mechanisms for a fixed weight are given in Section 4. Section 5 concludes this study.
Notations: $\mathbb{R}_+$ is the set of positive real numbers. $\mathbb{R}^n$ is the $n$-dimensional real vector space. $\mathbb{R}^{n \times m}$ is the space of $n \times m$ real matrices. $\mathbb{E}[\cdot]$ is the expectation of a random variable. $\nabla f$ is the gradient vector of $f$. $\nabla^2 f$ is the Hessian matrix of $f$. Here, $A^T$ indicates the transpose of $A$. $\|\cdot\|$ is the Euclidean norm. $\{1, 2, \dots, N\}$ is a set with $N$ numbers. $\hat{x}$ is the estimate of $x$.
2. Problem Formulation
We consider a system described by the following stochastic differential equation:
$$dx(t) = Ax(t)\,dt + dw(t), \tag{1}$$
where the state vector $x(t) \in \mathbb{R}^n$, $A \in \mathbb{R}^{n \times n}$, and the random variable $w(t) \in \mathbb{R}^n$ is a standard Wiener process. At the sampling instant $t_k$, the control law is given as
$$u(t) = -x(t_k^-)\,\delta(t - t_k), \tag{2}$$
where $\delta(\cdot)$ is an impulse function, such that $x(t_k) = 0$.
In order to evaluate the control performance, an objective function is established :
$$J = V + \rho f, \tag{3}$$
where $\rho > 0$ is a weight parameter, $N(T)$ is the number of sampling times in the time interval $[0, T]$, and $J$ consists of two terms $V$ and $f$, where
$$V = \lim_{T \to \infty} \frac{1}{T}\,\mathbb{E}\left[\int_0^T \|x(t)\|^2\,dt\right]$$
is the mean square variation of the system and
$$f = \lim_{T \to \infty} \frac{\mathbb{E}[N(T)]}{T}$$
is the control frequency in $[0, T]$.
2.1. Periodic Impulse Control
For a given sampling period $h$, the system samples once every interval of length $h$. If we assume that time starts at zero, then we have the periodic sequence $t_k = kh$, $k = 0, 1, 2, \dots$.
The control frequency of system (1) is written as
$$f = \frac{1}{h}.$$
For a given weight $\rho$, we want to determine an optimal sampling period $h^*$ for periodic impulse control that minimizes $J(h)$. Thus, the problem is formulated as follows.
Problem 1. Find the optimal sampling period $h^*$ such that
$$h^* = \arg\min_{h \in \mathbb{R}_+} J(h).$$
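As a concrete illustration of Problem 1, the objective $J(h)$ can be estimated by simulation rather than analytically, in the spirit of Remark 2. The sketch below assumes a hypothetical scalar integrator $dx = dw$ with impulse resets (the simplest instance of system (1)) and an Euler–Maruyama discretization; it is an illustration of the idea, not the paper's exact simulator:

```python
import numpy as np

def J_periodic(h, rho, T=100.0, dt=1e-3, seed=0):
    """Monte Carlo estimate of J(h) = V + rho*f for periodic impulse
    control of the hypothetical scalar integrator dx = dw: the state
    is reset to zero at every sampling instant t_k = k*h."""
    rng = np.random.default_rng(seed)
    n_steps = int(T / dt)
    steps_per_period = max(int(round(h / dt)), 1)
    x, acc = 0.0, 0.0
    for k in range(n_steps):
        if k > 0 and k % steps_per_period == 0:
            x = 0.0                      # impulse control resets the state
        x += np.sqrt(dt) * rng.standard_normal()
        acc += x * x * dt                # accumulate ||x(t)||^2 dt
    V = acc / T                          # mean square variation
    f = 1.0 / h                          # control frequency of periodic sampling
    return V + rho * f
```

Evaluating this estimator on a grid of $h$ values yields exactly the kind of noisy objective data that the fitting methods of Section 3 consume.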
2.2. Event-Based Impulse Control
We choose
$$\|x(t)\| = r$$
as the restriction of the states, where $r > 0$. The system samples whenever $\|x(t)\| \geq r$.
As before, we want an optimal threshold $r^*$ for the event-triggered impulse control. So, the problem we want to solve is as follows.
Problem 2. Determine the optimal threshold $r^*$ such that
$$r^* = \arg\min_{r \in \mathbb{R}_+} J(r).$$
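Problem 2 admits the same Monte Carlo treatment. The sketch below, again for a hypothetical scalar integrator $dx = dw$, resets the state whenever it reaches the threshold and counts the resets to estimate the control frequency:

```python
import numpy as np

def J_event(r, rho, T=100.0, dt=1e-3, seed=0):
    """Monte Carlo estimate of J(r) = V + rho*f for event-based impulse
    control of the hypothetical scalar integrator dx = dw: the state is
    reset to zero whenever ||x(t)|| >= r."""
    rng = np.random.default_rng(seed)
    n_steps = int(T / dt)
    x, acc, n_events = 0.0, 0.0, 0
    for _ in range(n_steps):
        x += np.sqrt(dt) * rng.standard_normal()
        if abs(x) >= r:                  # event: the state reaches the threshold
            x = 0.0                      # impulse resets the state
            n_events += 1
        acc += x * x * dt                # accumulate ||x(t)||^2 dt
    return acc / T + rho * n_events / T  # V + rho * f
```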
3. Main Results
In this section, we propose the SGP-NNF algorithm to solve Problems 1 and 2. This algorithm is a model-free approach: we do not need to know the system equations. For a given input $h$ (or $r$), we just need to know the value of $J(h)$ (or $J(r)$). In the proposed algorithm, the difference between the two sampling methods is reflected in the calculation of the objective function value and in the choice of the parameter values. For a given $h$ and a given $r$, according to the numerical method given in , we calculate the corresponding values $J(h)$ and $J(r)$.
Here, we define the objective functions obtained by neural networks as $\hat{J}(h; \theta_1)$ and $\hat{J}(r; \theta_2)$, where $\theta_1$ and $\theta_2$ are the weight parameters. The squared-error cost functions are defined as
$$E_1(\theta_1) = \frac{1}{2} \sum_{i=1}^{N_1} \left(J(h_i) - \hat{J}(h_i; \theta_1)\right)^2$$
and
$$E_2(\theta_2) = \frac{1}{2} \sum_{i=1}^{N_2} \left(J(r_i) - \hat{J}(r_i; \theta_2)\right)^2,$$
where $J(h_i)$ and $J(r_i)$ are the target values, $\hat{J}(h_i; \theta_1)$ and $\hat{J}(r_i; \theta_2)$ are the network outputs, and $N_1$ and $N_2$ are the numbers of data.
In order to save space, we only show the process of solving the optimal sampling period for the periodic impulse control. The optimal threshold for the event-triggered impulse control can be obtained in the same way.
Here, we use the Levenberg–Marquardt (L-M) algorithm to train the network $\hat{J}(h; \theta)$. First, we define the residuals
$$e_i(\theta) = J(h_i) - \hat{J}(h_i; \theta), \quad i = 1, \dots, N.$$
Then, the loss function becomes
$$E(\theta) = \frac{1}{2} \sum_{i=1}^{N} e_i^2(\theta).$$
Next, we make a second-order approximation of the loss function:
$$E(\theta + \Delta\theta) \approx E(\theta) + g^T \Delta\theta + \frac{1}{2}\,\Delta\theta^T H\,\Delta\theta,$$
where $g$ and $H$ are the gradient vector and the Hessian matrix of the cost function, respectively. The optimal step is obtained from the first-order optimality condition and is given by
$$\Delta\theta = -H^{-1} g.$$
We rewrite the gradient vector and the Hessian matrix of the cost function as
$$g = J_e^T e, \qquad H = J_e^T J_e + S,$$
where $e = (e_1, \dots, e_N)^T$, $J_e$ is the Jacobi matrix of first derivatives of $e$, and $S = \sum_{i=1}^{N} e_i \nabla^2 e_i$ denotes the second-order derivative information in $H$. The term $S$ is negligible when $\theta$ is close to the optimal solution; then, the step $\Delta\theta = -H^{-1} g$ becomes the Gauss–Newton approach. However, $S$ is not negligible when $\theta$ is far away from the optimal solution, which leads to a poor approximation of the Hessian matrix. This results in slow convergence and other numerical difficulties due to the ill-conditioning of the Jacobi matrix; see  for details.
The L-M method is based on the assumption that such an approximation is valid only within a trust region of small radius. This leads to the following approximation of the Hessian matrix:
$$H \approx J_e^T J_e + \mu I,$$
where $I$ is the identity matrix and $\mu > 0$ is a scalar that decides the size of the trust region. Then, the update step becomes
$$\Delta\theta = -\left(J_e^T J_e + \mu I\right)^{-1} J_e^T e.$$
When using the L-M algorithm to train neural networks, $\mu$ is usually chosen as follows . If a successful step is taken (the loss decreases), then $\mu$ is decreased by a factor of 10, biasing the iteration towards the Gauss–Newton direction. On the contrary, if the step is unsuccessful (the loss increases), then $\mu$ is increased by the same factor until a successful step can be found.
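The L-M update and the factor-of-10 rule for $\mu$ can be sketched as follows; `residual_fn` and `jac_fn` are hypothetical callables returning the residual vector $e(\theta)$ and its Jacobi matrix $J_e(\theta)$, not names from the paper:

```python
import numpy as np

def lm_train(theta, residual_fn, jac_fn, mu=1e-2, iters=50):
    """Levenberg-Marquardt iteration: each step d solves
    (J^T J + mu*I) d = -J^T e.  On a successful step mu shrinks by 10
    (towards Gauss-Newton); on an unsuccessful one it grows by 10."""
    loss = lambda th: 0.5 * float(np.sum(residual_fn(th) ** 2))
    for _ in range(iters):
        e = residual_fn(theta)
        Jm = jac_fn(theta)
        d = np.linalg.solve(Jm.T @ Jm + mu * np.eye(theta.size), -Jm.T @ e)
        if loss(theta + d) < loss(theta):   # successful step
            theta, mu = theta + d, mu / 10.0
        else:                               # unsuccessful step
            mu *= 10.0
    return theta

# usage: fit the linear model y = theta*x by least squares
x = np.linspace(0.0, 1.0, 20)
y = 2.0 * x
theta = lm_train(np.array([0.0]),
                 residual_fn=lambda th: th[0] * x - y,
                 jac_fn=lambda th: x[:, None])
```

For this linear toy problem the iteration converges to the least-squares solution $\theta = 2$ in a few steps.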
Remark 1. Here, we take a batch approach to update the neural network; the batch size is a positive integer.
Remark 2. The value of $J(h)$ (or $J(r)$) should be computed by (3). However, in our algorithm, we replace $J(h)$ (or $J(r)$) by the average of the values obtained by simulating the system  100 times.
By applying Algorithm 1, for a given weight $\rho$, we can get the optimal sampling period $h^*$ and the corresponding objective function value, as well as the optimal threshold $r^*$ and its objective function value. The difference between our algorithm and traditional neural network fitting lies in the data used for fitting. By updating the network, the algorithm generates fitting points that are denser near $h^*$ (or $r^*$), which yields a better fit in the vicinity of $h^*$ (or $r^*$) and improves the accuracy of $h^*$ (or $r^*$).
In general, the algorithm can be divided into two parts: one is the training of the neural network on the data, and the other is finding the point corresponding to the minimum value of the objective function fitted by the neural network. We use the L-M algorithm to train the network; the convergence of neural network fitting of the objective function has been proved in . Here, we use a gradient descent method to calculate the point corresponding to the minimum value of the fitted objective. By updating in this way, we not only maintain exploration of different values but also achieve a better fit near the optimal point and hence higher precision. As the number of sampling points increases, the data fit becomes more accurate. When the final result reaches the given precision, we obtain the optimal value.
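The soft greedy generation of the next fitting point can be sketched as follows. The paper locates the minimizer of the fitted objective by gradient descent; for simplicity, a grid argmin stands in here, and `J_hat` is a hypothetical callable representing the current network fit:

```python
import numpy as np

def soft_greedy_point(J_hat, interval, eps, rng, grid=1000):
    """epsilon-soft greedy choice of the next fitting point: with
    probability eps explore the interval uniformly; otherwise exploit
    the current minimizer of the fitted objective J_hat."""
    lo, hi = interval
    if rng.random() < eps:
        return lo + (hi - lo) * rng.random()                   # exploration
    candidates = np.linspace(lo, hi, grid)
    return float(candidates[np.argmin(J_hat(candidates))])     # exploitation

# usage with a toy fitted objective whose minimum is at h = 2
rng = np.random.default_rng(0)
next_h = soft_greedy_point(lambda h: (h - 2.0) ** 2, (0.0, 4.0), eps=0.1, rng=rng)
```

Exploitation concentrates new data near the current estimate of the optimum, while the occasional uniform draw keeps the fit accurate over the whole interval.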
Compared with the analytical method, Algorithm 1 avoids solving differential equations and is a model-free method. At the same time, Algorithm 1 can be applied to high-dimensional systems.
4. Simulation Results
In this section, we consider two systems: a two-dimensional linear decoupled differential system and a three-dimensional linear coupled differential system. In Algorithm 1, the difference between the two systems lies in the calculation of the objective function values and in the algorithm parameters.
We define a feedforward neural network, which is shown in Figure 1. The neural network has three layers, where the hidden layer has five neurons with a sigmoid transfer function $\sigma(z) = 1/(1 + e^{-z})$. The output layer has a linear transfer function. The training method is the L-M algorithm.
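A minimal sketch of this 1-5-1 network (five sigmoid hidden neurons, one linear output neuron), with illustrative random weights in place of trained ones:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(h, params):
    """Forward pass of the 1-5-1 network of Figure 1: a sigmoid hidden
    layer of five neurons followed by a linear output layer."""
    W1, b1, W2, b2 = params            # shapes (5,1), (5,), (1,5), scalar
    a1 = sigmoid(W1 @ np.atleast_2d(h) + b1[:, None])   # hidden activations
    return (W2 @ a1 + b2).ravel()                       # linear output

# illustrative random initialization (a trained network would use L-M weights)
rng = np.random.default_rng(0)
params = (rng.normal(size=(5, 1)), rng.normal(size=5),
          rng.normal(size=(1, 5)), float(rng.normal()))
y = forward(np.array([0.1, 0.5, 1.0]), params)   # evaluate at three inputs
```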
In this section, we compare three different ways of generating data:
(1) Randomly generate points with a uniform probability distribution over a given interval for neural network fitting (UPD-NNF).
(2) For a given interval, select equidistant points for neural network fitting (E-NNF). For example, for an interval $[0, X]$, select 1000 points $h_i = iX/1000$, $i = 1, \dots, 1000$.
(3) The proposed SGP-NNF algorithm. In the soft greedy method, we choose the exploration parameter $\varepsilon$ as listed in Table 1.
We use the UPD-NNF and E-NNF methods to generate 1000 points each, compute $J(h)$ (or $J(r)$) as the average of the values obtained by simulating the system  100 times, and use the neural network to fit it. At the same time, we cap the number of SGP-NNF steps so that the amount of data used by this method does not exceed that of the other two methods. All parameters are shown in Table 1.
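For concreteness, the UPD-NNF and E-NNF point sets can be generated as follows; the endpoint `X = 2.0` is a hypothetical value, since the actual intervals are given in Table 1:

```python
import numpy as np

X = 2.0                                   # hypothetical right endpoint of the interval
rng = np.random.default_rng(1)

# UPD-NNF: 1000 points drawn uniformly at random from [0, X]
upd_points = rng.uniform(0.0, X, size=1000)

# E-NNF: 1000 equidistant points h_i = i*X/1000, i = 1, ..., 1000
e_points = np.arange(1, 1001) * X / 1000.0
```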
4.1. 2D Decoupling System
We consider the following system :
$$dx_1(t) = a x_1(t)\,dt + dw_1(t), \qquad dx_2(t) = a x_2(t)\,dt + dw_2(t),$$
where $a$ is the pole of the system, and the random terms $w_1$ and $w_2$ are mutually independent Wiener processes with unit incremental variance. For the case considered here, we choose the parameters listed in Table 1, and we have the following results.
4.1.1. Periodic Impulse Control
It was proved in  that, for a sampling period $h$ and a given $\rho$, the performance is
$$J(h) = \frac{e^{2ah} - 1 - 2ah}{2a^2 h} + \frac{\rho}{h}.$$
For the case of $a = 0$,
$$J(h) = h + \frac{\rho}{h}.$$
The corresponding optimal sampling period is
$$h^* = \sqrt{\rho}.$$
Here, we set the parameters as in Table 1 and choose the search interval accordingly. By using SGP-NNF, we take 392 steps and obtain the optimal sampling period $h^*$. E-NNF and UPD-NNF take 1000 points each. Figure 2 shows the comparison of the analytical result and the fitting results of the three methods. It can be seen from Figure 2 that the SGP-NNF method uses less data but achieves a better result; in particular, near the optimal value, the SGP-NNF method has a more accurate approximation.
4.1.2. Event-Based Impulse Control
In , in the case of $a = 0$, we have
$$J(r) = \frac{r^2}{4} + \frac{2\rho}{r^2}.$$
The corresponding optimal threshold is
$$r^* = (8\rho)^{1/4}.$$
We set the parameters as in Table 1 and choose the search interval accordingly. By using SGP-NNF, we take 891 steps and obtain the optimal threshold $r^*$. E-NNF and UPD-NNF also take 1000 points each. Figure 3 shows the comparison of the analytical result and the fitting results of the three methods. It can be seen from Figure 3 that the SGP-NNF approach uses less data but obtains a more precise result than the other two methods.
4.2. 3D Coupling System
The system is given as follows:
$$dx(t) = Ax(t)\,dt + dw(t), \quad w(t) = \left(w_1(t), w_2(t), w_3(t)\right)^T,$$
where $A \in \mathbb{R}^{3 \times 3}$ is a coupling matrix and the random terms $w_1$, $w_2$, and $w_3$ are mutually independent Wiener processes with unit incremental variance.
4.2.1. Periodic Impulse Control
For the parameters listed in Table 1, by using the SGP-NNF algorithm, we take 801 steps and obtain the optimal sampling period $h^*$. Figure 4 shows the training result, where the black points are the fitting points adopted by SGP-NNF, the red line is the final fitting result of the neural network, and $h^*$ is the minimum point of the red line.
4.2.2. Event-Based Impulse Control
Similarly, for the parameters listed in Table 1, by using the SGP-NNF algorithm, we take 291 steps and obtain the optimal threshold $r^*$. Figure 5 shows the training result, where the black points are the fitting points, the red line is the final fitting result, and $r^*$ is the minimum point of the red line.
Remark 3. Here, the 3D system is presented mainly to show that the proposed SGP-NNF algorithm is applicable to high-dimensional systems. As mentioned before, we only need to calculate the corresponding objective function values; then, we can obtain the optimal sampling values by the SGP-NNF algorithm.
5. Conclusions

In this study, a model-free numerical algorithm is proposed to obtain the optimal sampling period and the optimal threshold of the two sampling methods, respectively. Two-dimensional and three-dimensional stochastic differential systems are simulated, and the values obtained by the algorithm have good precision for a fixed weight $\rho$. In future work, we will focus on extending this algorithm to the case where the weight is not fixed but a continuous state.
Data Availability

No data were used to support this study.
Conflicts of Interest
The authors declare that they have no conflicts of interest.
Acknowledgments

This work was supported in part by the National Natural Science Foundation of China under Grants 62173142 and 62073158, in part by the Research Funds for the Central Universities under Grant SLK13223001, and in part by the Program of Introducing Talents of Discipline to Universities (the 111 Project) under Grant B1701.
References

K. J. Astrom, Event Based Control, Springer, Berlin, Germany, 2008.
C. M. Bishop, Neural Networks for Pattern Recognition, Oxford University Press, Oxford, UK, 1995.
R. S. Sutton and A. G. Barto, Reinforcement Learning: An Introduction, MIT Press, Cambridge, MA, USA, 2018.
J. E. Dennis and R. B. Schnabel, Numerical Methods for Unconstrained Optimization and Nonlinear Equations, SIAM, Philadelphia, PA, USA, 1996.
P. E. Gill, W. Murray, and M. H. Wright, Practical Optimization, SIAM, Philadelphia, PA, USA, 2019.
S. Haykin, Neural Networks and Learning Machines, Pearson, London, UK, 2009.