Mathematical Problems in Engineering

Volume 2015 (2015), Article ID 450492, 9 pages

http://dx.doi.org/10.1155/2015/450492

## Two General Extension Algorithms of Latin Hypercube Sampling

Control and Simulation Center, Harbin Institute of Technology, Harbin 150080, China

Received 5 April 2015; Revised 30 June 2015; Accepted 8 July 2015

Academic Editor: Jose J. Muñoz

Copyright © 2015 Zhi-zhao Liu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

To retain original sampling points and thereby reduce the number of simulation runs, two general extension algorithms for Latin Hypercube Sampling (LHS) are proposed. The extension algorithms start with an original LHS of a given size and construct a new LHS of a larger size that contains as many of the original points as possible. To obtain a strict LHS of the larger size, some original points may have to be deleted. The relationship among the original sampling points in the new LHS structure is represented by a simple undirected acyclic graph. The basic general extension algorithm retains the largest possible number of original points, but it is time consuming. Therefore, a general extension algorithm based on a greedy algorithm is proposed to reduce the extension time, although it cannot guarantee that the largest possible number of original points is retained. The algorithms are illustrated by an example and applied to the evaluation of sample means to demonstrate their effectiveness.

#### 1. Introduction

Latin Hypercube Sampling (LHS) is one of the most popular sampling approaches and is widely used in simulation experiment design [1], uncertainty analysis [2], adaptive metamodeling [3], reliability analysis [4], and probabilistic load flow analysis [5]. Compared with other random or stratified sampling algorithms, LHS has better space-filling properties, better robustness, and better convergence behavior. Extending an LHS means obtaining an LHS of a larger size that retains the preexisting (original) LHS. At least two situations call for such an extension, especially for time-consuming simulation systems. One is sequential sampling for sequential analysis, adaptive metamodeling, and so on. The other arises when the original LHS is later found to be too small and running a new, larger LHS that reuses none of the original sampling points would be time consuming. However, the LHS structure makes it difficult to increase the sample size from an original LHS while keeping the stratification properties of LHS.

A special case is the integral-multiple extension, in which the size of the new LHS is an integer multiple of the size of the original sample. Tong [6] proposed integral-multiple extension algorithms for stratified sampling methods including LHS. Sallaberry et al. [7] gave an extension algorithm that doubles the size of an LHS with correlated variables. Later, two related techniques appeared under the names “nested Latin hypercube design” [8] and “nested orthogonal array-based Latin hypercube design” [9]. A nested Latin hypercube design with two layers is defined as a Latin hypercube design that contains a smaller Latin hypercube design as a subset. A special integral-multiple extension method called the $k$-extended LHS method was also illustrated, in which the new LHS contains $k$ smaller LHSs [10]. Related work was published by Vorechovsky [11–13], where the new sample size is a multiple of the original sample size. Integral-multiple extension algorithms have the attractive feature that they yield a strict LHS of larger size while preserving all the original sampling points.

In this study, we consider general extension algorithms of LHS, in which the new sample size can be chosen more freely and which therefore apply more widely. Wang [14] and Blatman and Sudret [15] obtained an approximate LHS of a larger size, in which two or more original points might fall into the same variable interval. Wei [16] also proposed a general extension algorithm that yields an approximate LHS, in which some variable interval might contain no point. However, a sample is an LHS if and only if there is exactly one point in each interval of each variable. An approximate LHS does not satisfy this definition, which hampers extensions that must meet additional criteria, such as correlated variables, maximized minimum distance, or orthogonal arrays.
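The difference between a strict and an approximate LHS can be checked mechanically. The following is a minimal sketch, assuming the points are given on the probability scale (each coordinate in [0, 1)), so that the equiprobable intervals of every variable are simply equal-width cells. The function names `interval_counts` and `is_strict_lhs` are illustrative and do not come from the cited works.

```python
def interval_counts(points):
    """Count how many points fall into each equiprobable interval, per variable.

    Assumption: coordinates are cumulative probabilities in [0, 1), and the
    number of intervals equals the number of points.
    """
    size = len(points)
    dims = len(points[0])
    counts = [[0] * size for _ in range(dims)]
    for point in points:
        for k, u in enumerate(point):
            j = min(int(u * size), size - 1)  # index of the interval holding u
            counts[k][j] += 1
    return counts


def is_strict_lhs(points):
    """True if every interval of every variable holds exactly one point."""
    return all(c == 1 for row in interval_counts(points) for c in row)


strict = [(0.1, 0.7), (0.5, 0.2), (0.9, 0.4)]
approximate = [(0.1, 0.7), (0.2, 0.2), (0.9, 0.4)]  # two points share an interval

print(is_strict_lhs(strict))       # True
print(is_strict_lhs(approximate))  # False
```

In the second sample the first variable has one interval with two points and one empty interval, which illustrates both kinds of approximate LHS described above.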

In this paper, we seek a strict extension of LHS (ELHS) rather than an approximate one, such that the new LHS contains as many original sampling points as possible. The extension algorithm has two parts: the retention of original sampling points and the generation of new ones. Because the generation of new sampling points is almost the same as in integral-multiple extension, the retention of original sampling points is the main problem discussed. The relationship among the original sampling points in the new LHS is expressed by a graph, so the retention problem can be solved with graph theory.
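To give a feel for why a graph arises, the sketch below links two original points whenever they would share an interval of some variable once the range is re-divided for the larger sample size. This is only one plausible reading, stated here as an assumption: the paper's own graph construction is given in its later sections and may differ. The name `conflict_edges` is hypothetical, and points are assumed to lie on the probability scale in [0, 1).

```python
from itertools import combinations


def conflict_edges(points, new_size):
    """Return index pairs of original points that share an interval of at
    least one variable when each range is split into new_size equal cells.

    Assumption: this "shared interval" rule is an illustration only, not the
    graph definition used by the paper.
    """
    def cell(u):
        return min(int(u * new_size), new_size - 1)

    edges = []
    for (a, p), (b, q) in combinations(enumerate(points), 2):
        if any(cell(u) == cell(v) for u, v in zip(p, q)):
            edges.append((a, b))
    return edges


original = [(0.30, 0.10), (0.40, 0.50), (0.80, 0.90)]  # a strict LHS of size 3

print(conflict_edges(original, 3))  # []
print(conflict_edges(original, 4))  # [(0, 1)]
```

With four intervals per variable, points 0 and 1 land in the same interval of the first variable, so a strict LHS of size 4 can keep at most one of them.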

In Section 2, the procedure of LHS and a mathematical description of the extension problem are given, which form the basis of the extension algorithms. In Section 3, a basic general extension algorithm of LHS (BGELHS) is proposed based on graph theory, which retains the largest possible number of original sampling points. A general extension algorithm of LHS based on a greedy algorithm (GGELHS) is then proposed to balance the number of simulation runs against the generation time of the ELHS. In Section 4, the proposed extension algorithms are illustrated by an application to the evaluation of sample means. Finally, conclusions and some thoughts on future research are given.

#### 2. Latin Hypercube Sampling and Extension Problem

##### 2.1. Procedure of LHS

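Suppose that the input variables are $x_1, x_2, \ldots, x_K$ and the range of $x_k$ is $D_k$, $k = 1, 2, \ldots, K$. Then, an LHS of size $N$ can be obtained as follows, which ensures that each input variable is sampled in every portion of its range.

(a) Divide the range $D_k$ into $N$ equiprobable intervals $D_{jk}$, $j = 1, 2, \ldots, N$. So the intervals satisfy $\bigcup_{j=1}^{N} D_{jk} = D_k$, $D_{ik} \cap D_{jk} = \emptyset$, and $P(x_k \in D_{jk}) = 1/N$, where $i \neq j$.

(b) For the $j$th interval of variable $x_k$, the cumulative probability can be obtained as

$$p_{jk} = \frac{j - 1}{N} + \frac{r_{jk}}{N},$$

where $r_{jk}$ is a uniform random number ranging from 0 to 1. So all the probability values can be noted as $P = (p_{jk})_{N \times K}$.

(c) Transform the probability into the sample value by the inverse of the cumulative distribution function $F_k^{-1}$:

$$x_{jk} = F_k^{-1}(p_{jk}).$$

Then, the sample matrix is

$$X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1K} \\ x_{21} & x_{22} & \cdots & x_{2K} \\ \vdots & \vdots & \ddots & \vdots \\ x_{N1} & x_{N2} & \cdots & x_{NK} \end{pmatrix}.$$

(d) The values of each variable are paired randomly or in some prescribed order with the values of the other variables. Then the sample matrix of LHS can be written as

$$S = \begin{pmatrix} s_{11} & s_{12} & \cdots & s_{1K} \\ s_{21} & s_{22} & \cdots & s_{2K} \\ \vdots & \vdots & \ddots & \vdots \\ s_{N1} & s_{N2} & \cdots & s_{NK} \end{pmatrix},$$

where each column of $S$ is a permutation of the corresponding column of $X$ and each row is a sampling point.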

Figure 1 shows an example of an LHS of size 5, where each interval of each input variable contains exactly one sampling point.
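Steps (a) to (d) of the procedure can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `latin_hypercube` and its parameters are hypothetical, each variable is described by its inverse cumulative distribution function, and the pairing in step (d) is done by random shuffling (the "paired randomly" option).

```python
import random


def latin_hypercube(size, inverse_cdfs, rng=None):
    """Draw a Latin hypercube sample of the given size.

    inverse_cdfs: one callable per input variable, mapping a cumulative
    probability in [0, 1) to a sample value (an assumed interface).
    """
    rng = rng or random.Random()
    columns = []
    for inv_cdf in inverse_cdfs:
        # (a)+(b): one cumulative probability inside each equiprobable
        # interval; j runs from 0, so this equals (j - 1 + r) / size for
        # 1-based j.
        probs = [(j + rng.random()) / size for j in range(size)]
        # (c): map each probability to a sample value.
        values = [inv_cdf(p) for p in probs]
        # (d): pair the values of this variable randomly with the others.
        rng.shuffle(values)
        columns.append(values)
    # Each row (tuple) is one sampling point.
    return [tuple(col[j] for col in columns) for j in range(size)]


# Size 5, two variables: uniform on [0, 1) and uniform on [10, 20).
sample = latin_hypercube(
    5,
    [lambda p: p, lambda p: 10.0 + 10.0 * p],
    rng=random.Random(0),
)
for point in sample:
    print(point)
```

Passing a seeded `random.Random` instance makes the sample reproducible. Other distributions can be used by supplying their inverse CDFs, provided they accept every probability in [0, 1).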