A Simulated Annealing Algorithm for D-Optimal Design for 2-Way and 3-Way Polynomial Regression with Correlated Observations
Much of the previous work in D-optimal design for regression models with correlated errors focused on polynomial models with a single predictor variable, in large part because of the intractability of an analytic solution. In this paper, we present a modified, improved simulated annealing algorithm, providing practical approaches to specifications of the annealing cooling parameters, thresholds, and search neighborhoods for the perturbation scheme, which finds approximate D-optimal designs for 2-way and 3-way polynomial regression for a variety of specific correlation structures with a given correlation coefficient. Results in each correlated-errors case are compared with traditional simulated annealing algorithm, that is, the SA algorithm without our improvement. Our improved simulated annealing results had generally higher D-efficiency than traditional simulated annealing algorithm, especially when the correlation parameter was well away from 0.
D-optimality is a popular criterion for optimal experimental design. The model for polynomial regression can be written as in Zhu et al. : where , is a -vector of parameters, and is a vector of polynomial functions of , and is the number of observations. Our purpose is to estimate the coefficient vector or part of the vector of primary interest.
In some experimental settings, the observations may be correlated according to various structures or patterns. The motivation of the research in optimal designs with correlated observations can be found in Dette et al. . Muller  introduced optimal design with correlated observations in detail.
The simulated annealing algorithm is a probabilistic “hill climbing” algorithm for optimization in the absence of an analytical solution. The application of simulated annealing algorithm for optimal design problem was first proposed by Haines . Lejeune  proposed a simulated annealing algorithm for D-optimal design with uncorrelated observations. The simulated annealing algorithm with a reheating process is introduced in Dimitris and Omid  and Abdullah et al. . In Zhu , Zhu solved the 1-way D-optimal design for polynomial regression with correlated observations using a simulated annealing algorithm. Cheng  produced D-optimal designs with block effects, which can be considered as a special case of the D-optimal design problem with correlated observations, since the block effects can be incorporated into the correlation structure.
Most previous work only considered the simplest case, that is, optimal design for 1-way polynomial regression. However, in real world problems, the response variable is usually influenced by multiple effects and their interactions. This kind of problem is more complicated and cannot get satisfactory result by existing algorithms or their generalizations.
In this paper, we propose a modified, improved simulated annealing algorithm to approximately solve for D-optimal design for 2-way and 3-way polynomial regression with correlated observations. This algorithm is applicable to any number of observations, not necessarily a multiple of the dimension of the parameter vector. It conquers the shortcoming of previous work, which mainly concentrated on the case that (the number of observations) is a multiple of (the number of the coefficients to be estimated or, equivalently, the dimension of the parameter ). We also provide a reinforced version of our simulated annealing algorithm with a reheating process.
2. Model and Correlation Structures
The full model for second-order 2-way and 3-way polynomial regression is presented by Boon  and Pukelsheim . The model for the second-order 2-way polynomial regression is where , and each of the and are in , where has mean 0 and variance but are not necessarily independent.
The design matrix is . The first column is all 1’s, and the other 5 columns correspond to the values of , respectively. That is, each column of corresponds to one design variable (or their square or interaction effect) in the model.
The model for the second-order 3-way polynomial regression is where .
The design matrix is , and the definition is similar with 2-way polynomial regression.
D-optimality aims to maximize of the determinant of the information matrix, where the information matrix for these models is where is the variance covariance matrix of the errors. Some common correlation structures for are introduced below.
2.2. Correlation Structures
We define commonly used correlation structures below for a single correlation parameter .(i)Circulant correlation: see Zhu et al. : (ii)Nearest neighbor correlation: see Zhu : (iii)Autoregressive correlation: see Dette et al. : where .(iv)Completely symmetric block structure: see Cadima et al. : Here is a matrix with the elements on the main diagonal =1, and all other elements =, ( is the common block size). is the correlation coefficient for the observations in the same block. is a block with all elements =. In this paper we take all of the equal to the same coefficient .
Note that one commonly used block correlation structure is proposed by Atkins and Cheng :
with . Here is the matrix with all of the elements =1. This is a special case of (iv) with .
3. Improved Simulated Annealing Algorithm for 2-way and 3-way Second-Order Polynomial Regression with Correlated Observations
3.1. The Principle of Simulated Annealing
The simulated annealing (SA) algorithm belongs to a class of heuristic probabilistic hill-climbing algorithms; see Zhu  and Lejeune . The SA algorithm attempts to globally maximize an energy function for in a specified state space (a design region for our D-optimality problem), by moving about the state space according to a transition mechanism defined by random perturbations of the current solution, , to a new candidate solution, . Letting , if , accept as the current solution. Otherwise, accept as the current solution with probability , where is the current value of a temperature control parameter, . Thus, there is positive probability that the algorithm will move to a poorer design, which is the key feature of the SA search algorithm, as it provides for the possibility that the algorithm will escape a local maximum. As the algorithm proceeds, the temperature decreases, making it less likely that designs with lower energy will be accepted. Convergence of the SA algorithm to a highly efficient design (a globally optimal solution is never guaranteed to be found) depends on the convergence to a stationary distribution of the underlying Markov chain, which typically requires a large number of iterations as well as a suitably chosen transition scheme over the state space.
3.2. Simulated Annealing Algorithm for D-Optimal Design for 2-Way and 3-Way Polynomial Regression
For 2-way polynomial regression, the design matrix is fully determined by the values of and , each in . Therefore, at each iteration of our simulated annealing algorithm, a new design matrix is obtained by perturbing the current values of and . We denote the current values of and by and and new values by and , respectively.
In many applications of simulated annealing, the values of only one current design point are perturbed (by some random mechanism) at each iteration, and typically a systematic pass is made through all design points in this manner, and the process repeated until “convergence” is achieved according to a specified stopping condition. Alternatively, all design points are perturbed simultaneously. However, both of these traditional methods were found to be inefficient for our D-optimal design with correlated errors. Thus, we used a modification that improved convergence and solution quality. Our modification was to divide the design points into three parts, of equal or nearly equal size, and perturb all points in each part in an “inner” loop, while systematically doing this for each of the three parts. This represented a middle ground for the perturbation scheme between the two traditional perturbation methods, one at each extreme, as described above.
Our modified simulated annealing algorithm was as follows.
Step 1. Initialize starting temperature , finishing temperature , temperature reduction coefficient , perturbation neighborhood control parameter , and initial design matrix . Control parameter is chosen from and is used to adjust the size of the perturbations as the algorithm proceeds. Calculate the energy function of the current design: .
Divide the design points (rows of ) into three parts. If , for some positive integer , then each part has design points. If , the first two parts have design points and the third part has . Similarly, if , the first part has points, and the other two have design points.
Step 2. Outer loop: cycle through each of the 3 parts of systematically, repeating the following inner loop.
Inner loop:(i)let and be vectors with each element of sampled at random from for those design points belonging to the current part of under consideration; all remaining elements of are set equal to 0;(ii)generate new candidate design points and ; if any element of or falls outside , set the value to the closest boundary value of the design region;(iii)determine ;(iv)if , accept the new design by setting ; otherwise, compare with a random number chosen uniformly from multiplied by a coefficient ; if is greater than this number, we set ; if not, keep the unchanged.
Step 3. If , stop. Otherwise, increase the counter to , set , , and return to Step 2.
Reduction Control Parameter . This tuning parameter is chosen by the user but is often set about 0.98-0.99 for geometric rate of reduction in the temperature.
Perturbation Control Parameter . Typically, is set close to 1, allowing large perturbations in design points at early iterations. As solution quality improves and the temperature decreases, also decreases, localizing perturbations to a smaller neighborhood of the current design which is more likely to be close to a global optimum when iteration counter is large.
Reheating. The annealing algorithm can be reinforced by using “reheating." Specifically, after the usual stopping condition based on the temperature is reached in Step 3, the process is repeated, often several times, by reheating to the original starting temperature, and continuing at Step 2. Since the reheating process is much more time consuming than nonreheating process, so we have to weigh and balance between the computing time and effect of computation. In this paper, we mainly run the improved simulated annealing algorithm without reheating process. In Table 6, we present results of the algorithm for and three correlation structures without and with reheating.
For 3-way polynomial regression, the only difference of the algorithm is in the inner circulation; we do all of the operation on 3 vectors: , , and .
3.3. Improvements from This Algorithm Compared with the Traditional Simulated Annealing Algorithm
(1) There are 2 vectors, and , to be changed (for 3-way polynomial regression, there are 3 vectors to be changed). In this case, the traditional simulated annealing algorithm, which treats the perturbation vector as a whole, does not produce satisfactory results. In our modified algorithm, we divide the (and consequently the perturbation process) into 3 parts and make perturbations part by part. This method ensures that we do not miss any corner of the design region and is more precise than the usual annealing method. Additionally, this part-by-part perturbation scheme allows the number of observations to be any number, not necessarily to be multiple of the number of coefficients. This makes our algorithm more flexible since it can be applied to experiments with any number of observations.
(2) We shrink the search neighborhood and increase the threshold for accepting a perturbation each time we lower the temperature. That is, when the temperature is high, we search in a wide neighborhood and are more likely to jump out of the local optimum. At each time we lower the temperature, we make the perturbation neighborhood smaller and make the acceptance threshold higher so it becomes harder to leave a local optimum. We implement this approach by multiplying the scale number by the reduction coefficient and multiplying the random number to be compared with by a coefficient, , at each time we decrease the temperature. Here initially is 0 and will increase by 1 each time we decrease the temperature.
This approach is in accordance with the idea of simulated annealing; that is, when the temperature becomes lower, the “molecules” are less active and tend to an equilibrium stabilization. This modification resulted in improved relative efficiency of the final design.
(3) In each part of Step 3, we repeat the iterations until the improvement is less than a small threshold value multiple times. This guarantees we go to the next step only when the improvement is negligible and none in the current step. In other words, we do not miss any valuable improvement. We take the threshold as 0.02× determinant of the current information matrix as the threshold value.
4. Results and Comparison with Traditional Simulated Annealing Algorithm
In this part, we compare the results from our improved simulated annealing algorithm with that of traditional simulated annealing algorithm, that is, the SA algorithm without our improvement. Since the most often used correlation parameters are 0.1 and 0.4, in Tables 1, 2, 3, 4, 5, 6, and 7, we mainly use these 2 parameters in the computation and comparison. The result of other correlation parameters can be gotten by the same algorithm by a simple adjustment of the parameters.
In Table 1 through Table 4, we present the comparisons of our improved simulated annealing results and the traditional simulated annealing algorithm when observations number is a multiple of 6 using each of the autoregressive, circulant, nearest neighbor, and block correlation structures. We also get the ratio of the results of the two algorithms, which will tell us how much our improved simulated annealing algorithm is better than the traditional simulated annealing algorithm.
Tables 1–3 present the comparison of the results of the two SA algorithms for the autoregressive, circulant, and nearest neighbor structure for designs of sizes 6, 12, and 18 and correlation parameters of 0.1 and 0.4, and Table 4 presents the comparison of the results of the two SA algorithms for the block structure for designs of size 12 and correlation parameters of 0.1 and 0.4.
From these tables, we see that under all of the cases, the determinants obtained by our improved simulated annealing algorithm are much higher than that of the traditional simulated annealing algorithm. When changes from 0.1 to 0.4, and when (the number of observations) gets larger, the ratio of the determinants of our improved simulated annealing algorithm and that of the traditional simulated annealing algorithm increase rapidly. So the D-efficiency of our improved simulated annealing algorithm is much better than that of the traditional simulated annealing algorithm, especially when and get even larger.
For the case that the observations number is not a multiple of the dimension of the parameter vector, the traditional simulated annealing algorithm does not work. However, our improved simulated annealing algorithm is a powerful algorithm to get the determinant for any number of . We list the results of our improved simulated annealing algorithm with various and for circulant correlation structure in Table 5.
Table 6 provides the comparison of the result of our algorithm without and with reheating process. From this table, we can see that, with the addition of the reheating process, the results are much better than the nonreheating process. However, the reheating process is much more time consuming than nonreheating process.
Table 7 provides the comparison of our improved simulated annealing results with the traditional simulated annealing algorithm results for 3-way polynomial regression with .
From Table 7, we can see that the results of our improved simulated annealing algorithm are much higher than that of the traditional simulated annealing algorithm for all of the 3 correlation structures for 3-way polynomial regression. When gets larger, the ratio of the determinants of our improved simulated annealing and that of the traditional simulated annealing algorithm increase rapidly. All of the tables point out that our improved simulated annealing is much more powerful than traditional simulated annealing algorithm.
This paper demonstrates that an improved simulated annealing algorithm can successfully determine highly efficient D-optimal designs for second-order polynomial regression on and third order polynomial regression on for a variety of correlated error structures and with the design size, , not limited to a multiple of the number of regression parameters. The combination of (i) a “part-by-part” perturbation scheme, (ii) the use of a parameter that controls the size of the neighborhood for the perturbations, and (iii) increase of the threshold for accepting a perturbation each time we lower the temperature lead to designs that—while not likely globally optimal—are better than those obtained by traditional simulated annealing algorithm. In particular, when the true correlation parameter is well away from 0, our improved simulated annealing algorithm has much greater relative efficiency than the traditional simulated annealing algorithm.
The SA algorithm needs only a well-defined energy function to maximize here the determinant of the information matrix. Thus, the same algorithm may be used for other design optimality criteria, for example, A- and E-optimality. In the absence of exact analytic optimal designs when errors are correlated, the SA algorithm is an attractive, easily implemented method to find highly efficient designs. Extensions to higher degree polynomial regression models are immediate, except for the likely need for longer run times and slower reduction of the temperature to allow for more effective searching over a larger design region.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
W. G. Muller, Collecting Spatial Data: Optimum Design of Experiments for Random Fields, Springer, 2001.
L. M. Haines, “The application of the annealing algorithm to the construction of exact optimal designs for linear-regression models,” Technometrics, vol. 29, no. 4, pp. 439–447, 1987.View at: Google Scholar
B. Dimitris and N. Omid, Robust Optimization with Simulated Annealing, Springer Science+Business Media, LLC, 2009.
S. Abdullah, L. Golafshan, and M. Z. A. Nazri, “Re-heat simulated annealing algorithm for rough set attribute reduction,” International Journal of Physical Sciences, vol. 6, no. 8, pp. 2083–2089, 2011.View at: Google Scholar
Z. Zhu, Application of simulated annealing to d-optimal design for polynomial regression with correlated observations [Ph.D. thesis], Department of Mathematics and Statistics, Utah State University, 2004.
J. E. Boon, “Generating exact d optimal design for polynomial models,” in Proceedings of the Spring Simulation Multiconference (SpringSim '07), 2007.View at: Google Scholar
Z. Zhu, Optimal experimental designs with correlated observations [Ph.D. thesis], Department of Mathematics and Statistics , Utah State University, 2004.