Abstract
In many clinical trials, it is important to balance treatment allocation over covariates. Although a great many papers have been published on balancing over discrete covariates, the procedures for continuous covariates have been less well studied. Traditionally, a continuous covariate usually needs to be transformed to a discrete one by splitting its range into several categories. Such practice may lead to loss of information and is susceptible to misspecification of covariate distribution. The more recent papers seek to define an imbalance measure that preserves the nature of continuous covariates and set the allocation rule in order to minimize that measure. We propose a new design, which defines the imbalance measure by the maximum assignment difference when all possible divisions of the covariate range are considered. This measure depends only on ranks of the covariate values and is therefore free of covariate distribution. In addition, we developed an efficient algorithm to implement the new procedure. By simulation studies we show that the new procedure is able to keep good balance properties in comparison with other popular designs.
1. Introduction
Balanced allocation among treatment groups is desirable in many clinical trials. A well-balanced design strengthens a trial by increasing its credibility, the precision of subgroup or interim analyses, and robustness to model misspecification [1]. Moreover, ignoring balance at the design stage may lead to a loss of statistical efficiency, especially in small trials [2]. In the presence of covariates (or prognostic factors), well-balanced allocation means not only similar group sizes but also similar distributions of covariate values across treatment groups [3–6]. For discrete covariates, a great many procedures have been published, including stratified permuted block designs, marginal procedures (or minimization) [7, 8], and hierarchical models [9–11]. Y. Hu and F. Hu [12] are among the few who explored the theoretical properties of such procedures.
With respect to balancing over continuous covariates, a traditional approach is to discretize a continuous covariate by splitting its range into several small intervals, each of which defines a distinct category [4, 13]. However, such methods have several potential problems. First, there is no standard way of defining such intervals: usually the boundaries are set by experts' opinion from a nonstatistical perspective, or they depend heavily on the model assumed for the covariate distribution. Weir and Lees [13] studied the randomization procedure in an acute stroke clinical trial in which 12 covariates were involved. The five continuous covariates were discretized before the traditional minimization was applied. Based on historical information about the covariate distribution, the covariates age and mean arterial pressure were discretized by their quintiles; plasma glucose level was first discretized into two groups by the critical value 8 mmol/L and the group with larger values was further split by quartiles; the remaining two covariates, time of delay from stroke onset and Glasgow coma scale, were categorized using critical values set by clinical experience. Discretization methods are therefore rather covariate-specific and may depend on the distribution of the covariates. Second, there is a dilemma in selecting the number of categories. While a small number of categories may lead to loss of information, a large number can result in a severe overall imbalance, as in the case of stratified designs when the number of strata is large [7].
An alternative that avoids the discretization step is to define a certain “imbalance measure” for continuous covariates among different treatment groups and then allocate the treatments sequentially to minimize that specific measure [5, 6, 14–16]. With two treatment groups, many authors define the imbalance measure as the mean difference of covariate values between the two groups: some measures are further adjusted for the standard deviations within the two groups [5, 14], while others normalize the mean difference by calculating the two-sample t-statistic and the corresponding p-value [15], so that a larger p-value indicates more balanced allocation. An alternative to these mean-comparison methods is the model-based approach [17–20], which relates different allocations to different design matrices in a linear model, which in turn influence the variance of the maximum likelihood estimator of the treatment effect; it then sequentially favors the assignment that would lead to a smaller variance. This approach implies balance only under a homogeneous linear model.
The methods discussed above, however, usually depend on the assumption of normal distributions of the covariates. Among the methods that apply to more general distributions, Stigsby and Taves [16] considered the difference of rank sums as their imbalance measure; Su [6] used a weighted average of the difference in overall patient numbers and the difference in quartiles of the continuous covariates. Rosenberger and Sverdlov [21] found the empirical distributions of the covariates in the two groups and calculated their Kolmogorov-Smirnov distance to measure the imbalance for a series of randomization procedures.
In this paper, we propose a new minimization procedure which examines the assignment differences over all intervals and sets their maximum value as the imbalance measure. The new procedure uses only the rank information of the covariates and is therefore robust against various distributions. Moreover, we developed an efficient algorithm whose computation is even less intensive than discretization. By simulation studies, we show that our new procedure leads to well-balanced allocation in terms of overall patient numbers and covariate distributions. For simplicity, we first consider only one continuous covariate and then extend the idea to more general cases.
In Section 2, by giving a motivating example, we analyze the downside of simple discretization and then highlight the idea underlying the new procedure. In Section 3, details of the procedure are described, and the key part of the algorithm is explained. Simulation studies are carried out in Section 4, comparing our procedure with discretization methods as well as some other designs. In Section 5, we conclude the paper by discussing possible extensions of the current procedure.
2. Motivation
We first focus on two treatments and a single continuous covariate. For simplicity, assume the covariate takes values in a fixed bounded interval; a covariate with any other range can be mapped onto this interval by a linear function if its range is bounded, or by a nonlinear function otherwise.
To see the difficulty in defining “balance’’ for a continuous covariate, we use an example as shown in Figure 1. The first 8 patients have been randomized. The 9th has just arrived, who is tentatively assigned to treatment (a) and (b). The figure shows the assignment as well as the covariate values of the patients. First, we point out that discretization could cause complications since different ways of splitting may lead to different assignments. For example, if the covariate range in Figure 1 is split into intervals of equal lengths, then the choices of , , and result in different preferences of the assignment: if , that is, only overall patient numbers are considered, then the 9th patient could be assigned to treatment or treatment , since in either way the absolute difference is 1; if , then treatment would be favored, producing balanced patient numbers in the second category, that is, over the interval where the 9th patient belongs; if , then treatment would be preferred instead, since it produces allocation of 2 : 1 in the third category, that is, over the interval , rather than 3 : 0 when treatment is assigned. Second, we emphasize that balance in the mean values does not necessarily lead to balance in distributions. Taking the upper panel in Figure 1, for instance, the assignment of treatment would not cause a significant mean difference in the two treatment groups. However, in terms of covariate distribution, a severe imbalance would exist, since the 5 patients (7th, 6th, 9th, 4th, and 8th) in group have covariate values at the center of the range and the 4 patients (2nd, 3rd, 5th, and 1st) in group at the two ends.
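Figure 1: Treatment assignments and covariate values of the first 8 patients, with the 9th patient tentatively assigned to one treatment in panel (a) and to the other treatment in panel (b).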
In light of the above discussion, we propose a new imbalance measure, defined as the maximum (absolute) assignment difference when all possible ways of discretization are considered. By doing so, we want to make sure that the assignment difference over any interval does not become too extreme. Thus, if the new patient is tentatively assigned to one treatment, we examine the differences of patient numbers in the two groups in all neighborhoods of the patient's covariate value and use the maximum of them as the resulting degree of imbalance. The imbalance caused by assignment to the other treatment is calculated in the same manner. Then, with a probability greater than 1/2, the patient is assigned to the treatment that would cause the smaller maximum difference.
3. Procedure
Consider a sequential trial with two treatments and (control and test). For the first patients that have arrived, let be the allocation sequence with if the th patient is assigned to treatment and otherwise. Suppose we need to balance treatment allocation over a single continuous covariate with cumulative distribution function , whose range is an interval on the real line. The two endpoints of can be finite or infinite. Let , where is the covariate value of the th patient. Suppose , , are independent. Define as the difference of patient numbers in treatment groups and over the interval , given the covariate information and allocation sequence . Suppose the first patients have been randomized and the th patient has just arrived, that is, and are known and is to be determined. Define where or . Thus, is the potential maximum absolute difference that would result if the new patient were assigned to treatment . Note that the interval in (3.2) needs to contain the new covariate value , because the difference over any other interval is not affected by the arrival of and is therefore not of interest. Note also that is a function of .
Then, assign the th patient to treatment with the following probability: where and .
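The quantities described in words above can be written out as follows. This is only one possible formalization: the symbols $T_i$, $X_i$, $D_n$, $\mathrm{Imb}$, and $p$, the labels 1 and 2 for the two treatments, and the half-open intervals $[a,b)$ are all assumptions made for this sketch, not a restoration of the paper's displayed equations.

```latex
% Assumed notation: T_i = 1 if patient i receives treatment 1, and 0 otherwise;
% X_i is the covariate value of patient i; 1/2 < p <= 1 is the bias probability.
\begin{align*}
D_n(a,b) &= \sum_{i=1}^{n} (2T_i - 1)\,\mathbf{1}\{a \le X_i < b\}, \\
\mathrm{Imb}^{(k)}_{n+1} &= \sup_{a \le X_{n+1} < b} \bigl| D_{n+1}(a,b) \bigr|
  \quad \text{with patient } n+1 \text{ tentatively assigned to treatment } k,\ k = 1, 2, \\
P(\text{patient } n+1 \text{ receives treatment } 1) &=
\begin{cases}
p,     & \mathrm{Imb}^{(1)}_{n+1} < \mathrm{Imb}^{(2)}_{n+1}, \\
1/2,   & \mathrm{Imb}^{(1)}_{n+1} = \mathrm{Imb}^{(2)}_{n+1}, \\
1 - p, & \mathrm{Imb}^{(1)}_{n+1} > \mathrm{Imb}^{(2)}_{n+1}.
\end{cases}
\end{align*}
```

Under this reading, $D_n(a,b)$ counts treatment-1 patients minus treatment-2 patients in $[a,b)$, and the supremum runs only over intervals that contain the new covariate value.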
We now show how the 9th patient is randomized according to the new procedure. The critical part lies in the calculation of and . For the former, when the 9th patient is tentatively assigned to treatment (see the upper panel in Figure 1), the maximum absolute difference is 5, attained over intervals which contain exclusively , , , , and ; that is, 5 patients over these intervals are assigned to and 0 to , so . Similarly, . Therefore, since , with probability the 9th patient will be assigned to treatment .
We would like to point out that in order to calculate the imbalance , which is defined as a supremum, it is sufficient to examine intervals whose endpoints belong to the set . Hence the total number of such intervals has the order . Moreover, the difference of patient numbers over any of these intervals is only related to the ranks of , whose joint distribution places an equal probability on any permutation of . To support the above argument, we can reexamine the upper panel in Figure 1: for any interval , where and , the difference of patient numbers over is exactly the same as that over interval , which is ; furthermore, so long as the relative positions of remain the same, this difference of does not change. Nor does . Therefore, we come to the conclusion that the new procedure is free of the underlying distribution .
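The observation that only intervals with endpoints at observed covariate values matter suggests a direct, quadratic-time computation. The sketch below is a hedged illustration rather than the authors' implementation: the function name `max_imbalance_bruteforce`, the treatment labels 1 and 2, and the data are hypothetical (the data only loosely mimic the pattern described for Figure 1), and covariate values are assumed to be distinct.

```python
def max_imbalance_bruteforce(xs, ts, x_new, t_new):
    """Largest |n1 - n2| over all intervals that contain the new patient.

    xs, ts : covariate values and treatments (1 or 2) of earlier patients
    x_new  : covariate value of the new patient (assumed distinct from xs)
    t_new  : tentative treatment (1 or 2) for the new patient
    """
    patients = sorted(zip(list(xs) + [x_new], list(ts) + [t_new]))
    # +1 for treatment 1, -1 for treatment 2, in increasing covariate order
    signs = [1 if t == 1 else -1 for _, t in patients]
    pos = sum(x < x_new for x in xs)  # rank position of the new patient
    best = 0
    # An interval is determined by the ranks lo <= pos <= hi that it covers,
    # so only the ordering of the covariate values enters the calculation.
    for lo in range(pos + 1):
        running = sum(signs[lo:pos])
        for hi in range(pos, len(signs)):
            running += signs[hi]
            best = max(best, abs(running))
    return best


# Hypothetical data: treatment 1 in the centre, treatment 2 at both ends.
xs = [0.05, 0.15, 0.35, 0.45, 0.60, 0.70, 0.85, 0.95]
ts = [2, 2, 1, 1, 1, 1, 2, 2]
print(max_imbalance_bruteforce(xs, ts, 0.55, 1))  # 5
print(max_imbalance_bruteforce(xs, ts, 0.55, 2))  # 3
```

Because only the sorted order of the covariate values is used, any strictly increasing transformation of `xs` and `x_new` gives the same result, which illustrates the distribution-free property.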
In fact, the computation time of can be reduced by examining an even smaller number of intervals, that is, instead of . Before demonstrating this, we need some additional notation and definitions. Since our new procedure is distribution-free, simply assume that the covariate is from uniform . Suppose and have been observed. in (3.1), defined as the difference of patient numbers in groups and over interval , will simply be written as . For ease of presentation, let and . Define two sets and as That is, is any point from the set , that is, to the left of or equal to , and is a left-closed and right-open interval. For instance, in Figure 1, and from left to right. The interpretation of and is similar.
Proposition 3.1. Let , , , and . Then,
The proof of Proposition 3.1 is given in the Appendix. Proposition 3.1, together with the definitions of and , suggests the following.
(1) The calculation of and only requires the examination of intervals, instead of .
(2) For two consecutive intervals and in , where , and , we have , depending on the assignment or for the patient at (the same argument applies to intervals in ).
The above two observations form the basis of the algorithm that was developed for the new procedure in the simulation studies (Section 4). We found that the computation time of the new procedure was even less than that of discretization methods.
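A linear-scan computation in the spirit of the two observations above can be sketched as follows. It is an assumption-laden illustration, not the authors' algorithm: it splits any interval containing the new patient into a leftward and a rightward extension, tracks the running difference one patient at a time, and combines the extremes of the two sides. The name `max_imbalance_linear` and the treatment labels 1 and 2 are hypothetical, and covariate values are assumed distinct.

```python
def max_imbalance_linear(xs, ts, x_new, t_new):
    """Largest |n1 - n2| over intervals containing x_new, via one-sided scans."""

    def extremes(signs):
        # Minimum and maximum running difference; 0 is the empty extension.
        lo = hi = run = 0
        for s in signs:
            run += s  # each additional patient changes the difference by +/-1
            lo, hi = min(lo, run), max(hi, run)
        return lo, hi

    ordered = sorted(zip(xs, ts))
    left = [1 if t == 1 else -1 for x, t in ordered if x < x_new]
    right = [1 if t == 1 else -1 for x, t in ordered if x > x_new]
    s_new = 1 if t_new == 1 else -1

    l_min, l_max = extremes(reversed(left))  # extend leftward from x_new
    r_min, r_max = extremes(right)           # extend rightward from x_new

    # Difference over an interval = left part + right part + new patient.
    # Its largest absolute value is reached at the joint maximum or minimum.
    return max(l_max + r_max + s_new, -(l_min + r_min + s_new))


# Hypothetical data (same pattern as a 5-versus-0 cluster in the centre).
xs = [0.05, 0.15, 0.35, 0.45, 0.60, 0.70, 0.85, 0.95]
ts = [2, 2, 1, 1, 1, 1, 2, 2]
print(max_imbalance_linear(xs, ts, 0.55, 1))  # 5
print(max_imbalance_linear(xs, ts, 0.55, 2))  # 3
```

The sort here costs more than the scan itself; in a sequential trial one could keep the patients in sorted order between arrivals so that each new patient needs only the two linear passes.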
4. Simulation Studies
Suppose patients are enrolled. As mentioned in Section 1, for continuous covariates, it is desirable to keep similarity between treatment groups in two aspects: the group sizes and the distributions of the covariates. Therefore, the new procedure was first compared with several other procedures in terms of the following two criteria: the mean absolute difference of all patient numbers in the two groups, shown as ; the mean Kolmogorov-Smirnov distance (K-S) between the empirical distributions of covariate in groups and , shown as , which measures the similarity between two distributions. In addition, we used a new criterion: the “maximum imbalance” defined by us as: shown as , which is the maximum absolute difference over all possible intervals after all patients have been assigned to a treatment. We will show that criterion acts as a compromise between criterion and criterion .
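For a single completed trial, the three criteria can be computed as in the sketch below. The function names and the treatment labels 1 and 2 are hypothetical, the data are made up, and both groups are assumed to be non-empty with distinct covariate values. The maximum over all intervals is obtained from the range of the running difference taken in covariate order, which equals the largest absolute difference over any run of consecutive patients.

```python
def group_size_difference(ts):
    """Absolute difference of the two group sizes."""
    n1 = list(ts).count(1)
    return abs(n1 - (len(ts) - n1))


def ks_distance(xs, ts):
    """Kolmogorov-Smirnov distance between the two empirical distributions."""
    n1 = list(ts).count(1)
    n2 = len(ts) - n1
    c1 = c2 = 0
    dist = 0.0
    for _, t in sorted(zip(xs, ts)):
        if t == 1:
            c1 += 1
        else:
            c2 += 1
        dist = max(dist, abs(c1 / n1 - c2 / n2))
    return dist


def max_interval_imbalance(xs, ts):
    """Largest |n1 - n2| over all intervals of the covariate range."""
    run = lo = hi = 0
    for _, t in sorted(zip(xs, ts)):
        run += 1 if t == 1 else -1
        lo, hi = min(lo, run), max(hi, run)
    # Any interval's difference is a difference of two running totals.
    return hi - lo


# Hypothetical completed trial with 9 patients.
xs = [0.05, 0.15, 0.35, 0.45, 0.55, 0.60, 0.70, 0.85, 0.95]
ts = [2, 2, 1, 1, 1, 1, 1, 2, 2]
print(group_size_difference(ts))       # 1
print(ks_distance(xs, ts))             # 0.5
print(max_interval_imbalance(xs, ts))  # 5
```

In this made-up example the group sizes differ by only 1, yet the interval-based measure is 5, which shows how an imbalance confined to part of the covariate range can go unnoticed by the overall count.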
Since the procedures we compared are all distribution-free, the independent covariate values were simply generated from Unif(0,1). All procedures use the strategy of minimization, but each has a different imbalance measure. More specifically, under a certain imbalance measure , we calculate or , defined as the imbalance that would occur if the new patient was assigned to treatment or . Depending on whether is positive, negative, or zero, the allocation probability toward the treatment is , , or , where and . In the simulation, we used and , with the latter corresponding to deterministic allocation unless there is a tie.
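The generic minimization scheme described above can be sketched as a small simulation harness. Everything named here is an assumption for illustration: `allocate`, `run_trial`, and `efron_measure` are hypothetical names, treatments are labelled 1 and 2, and `p` stands for the bias probability, with `p = 1.0` giving deterministic allocation except in a tie. Any imbalance measure with the signature `measure(xs, ts, x_new, t_new)` can be plugged in.

```python
import random


def allocate(imb1, imb2, p, rng):
    """Biased-coin choice between treatments 1 and 2.

    imb1, imb2 : imbalance that would result from assigning treatment 1 or 2
    p          : probability of choosing the less imbalanced treatment
    """
    if imb1 < imb2:
        prob1 = p
    elif imb1 > imb2:
        prob1 = 1.0 - p
    else:
        prob1 = 0.5
    return 1 if rng.random() < prob1 else 2


def efron_measure(xs, ts, x_new, t_new):
    """Imbalance of overall group sizes only (ignores the covariate)."""
    n1 = list(ts).count(1) + (1 if t_new == 1 else 0)
    return abs(n1 - (len(ts) + 1 - n1))


def run_trial(measure, n, p, seed=0):
    """Sequentially allocate n patients with Unif(0,1) covariate values."""
    rng = random.Random(seed)
    xs, ts = [], []
    for _ in range(n):
        x = rng.random()
        t = allocate(measure(xs, ts, x, 1), measure(xs, ts, x, 2), p, rng)
        xs.append(x)
        ts.append(t)
    return xs, ts


xs, ts = run_trial(efron_measure, n=60, p=1.0, seed=1)
print(ts.count(1), ts.count(2))  # 30 30
```

With `p = 1.0` and the group-size measure, every second patient is forced into the smaller group, so an even number of patients always ends perfectly balanced in size regardless of the seed.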
The following procedures were studied.
(1) Efron’s design (EFRON) [22]. Let and be the patient numbers in the two groups. Define the imbalance measure as . This method solely focuses on the balance of patient numbers.
(2) Kolmogorov-Smirnov measure (K-S). Let be the empirical distribution function of the covariate values in group , , . Define the imbalance measure as the Kolmogorov-Smirnov distance between and . This method solely focuses on the balance of distributions. The above two methods are rarely used as a way of balancing over a continuous covariate, since each of them is designed to meet only one criterion. In our simulations, they served as two controls to evaluate other procedures.
(3) Discretization (DSCRT). In practice, in order to discretize a continuous covariate with cumulative distribution function , the range is often split by the quantiles of at probabilities . This is equivalent to splitting into intervals of equal length for unif. In our simulations, we tried . Within each category Efron’s design was applied.
(4) The new procedure (MAX-IMB).
(5) Stigsby and Taves’ rank sum (RANK-SUM) [16]. Let be the ranks of . Suppose patients are in group and patients are in group . The imbalance measure is defined by .
(6) Su’s weighted average (WGT-AVE) [6]. Let be the quartiles of the covariate values in group , , . The “qualitative” imbalance measure is defined by where and are two weights placed on the two items, and are two upper limits, and is the indicator function; the “quantitative” imbalance measure is defined by Therefore, let be the qualitative imbalance resulting from the tentative assignment of treatment to the new patient, . If is positive (negative), the allocation probability toward the treatment will be ; if , use measure to determine the probability , , or .
, , , and can be changed freely from a subjective point of view. In the simulations, we fixed and , but tried .
The results for and under 5000 repetitions are shown in Tables 1 and 2.
We first focus on Table 1 (). The 1st and 2nd columns suggest that the “best” and the “best” that can be achieved are 1.28 by EFRON and 0.137 by K-S, respectively, at the expense of large imbalance under the other criterion. For DSCRT when increases from 2 to 4 and 8, increases from 2.17 to 2.94 and 3.76, whereas decreases from 0.178 to 0.161 and 0.159. Therefore, we see that there is a trade-off between the balance of group sizes and the balance of covariate distributions. A similar trend can be observed for WGT-AVE when increases from 2 to 4 and 6. In terms of these two criteria, the new procedure (MAX-IMB), with 2.36 and 0.159, has better performance than DSCRT with and and WGT-AVE with and ; it also has lower and than RANK-SUM.
In terms of (the 3rd column), MAX-IMB has the minimum value 7.38 since it sequentially minimizes this criterion. On the contrary, DSCRT minimizes the imbalance of patient numbers over the selected intervals, but ignores the imbalance over others. As a result, on average, the maximum imbalance under DSCRT is higher than that under MAX-IMB. In a sense, serves as a tool which detects any allocation imbalance that is ignored by DSCRT. Since the new procedure examines both “global’’ imbalance, that is, over the whole range, and “local’’ imbalance, that is, over any small interval, it can be regarded as a compromise between achieving balance in overall group sizes and achieving balance in covariate distributions.
A similar conclusion can be drawn for (see Table 2), that is, when the allocation is deterministic except in the case of a tie. From to , the decrease in is most significant under DSCRT, from to . This is because when only covariate values are random, and so long as the numbers of patients over the selected intervals are even (e.g., 34 over and 26 over with ), DSCRT can always achieve perfect overall balance. Moreover, even if the patient numbers are odd (e.g., 35 over and 25 over ), there are still chances that the allocation differences are and or the reverse, again resulting in perfect overall balance. Other procedures are more complex and the decrease in is less significant. As a result, when , MAX-IMB is only uniformly better than DSCRT with , not .
We also compared the above procedures by other commonly used measures including the mean absolute difference of sample means () and the mean absolute difference of sample standard deviations () of the covariate values in the two treatment groups. Furthermore, Lin and Su [23] introduced another criterion, the area between the empirical cumulative distribution functions of the covariate values in the two treatment groups (normalized by the difference of the maximum and the minimum values), denoted as , and pointed out that this criterion has better performance than Kolmogorov-Smirnov distance in capturing the difference in two distributions. We thus included this criterion in the simulation. Since the measurements of mean, standard deviation, and area under a distribution function depend on the underlying distribution of the covariate, we did simulation studies under a uniform distribution and under a normal distribution and show the results in Tables 3 and 4, respectively.
From Table 3 under the uniform distribution, it is seen that K-S has the best performance, since its , , and (2.29, 1.77, and 5.02) are the lowest among all procedures. This is expected, since K-S solely minimizes the distance between the two distributions. Once the distributions are closest, so are the summary statistics of means and standard deviations as well as the area between the distributions. However, K-S is likely to produce severe imbalance of group sizes, as shown in Tables 1 and 2. EFRON has the worst performance under the three criteria since it completely ignores the covariate distributions.
Among the three choices of under DSCRT, roughly speaking performs best: its and (3.22 and 5.56) are the lowest and its (1.92) slightly higher than that under . Moreover, under these three criteria, DSCRT with is uniformly better than MAX-IMB, RANK-SUM, and WGT-AVE. However, the good performance of DSCRT with is based on the correct identification of quartiles of the true covariate distribution, which may not be feasible before the collection of data. In contrast, other methods do not require such information.
Comparing MAX-IMB, RANK-SUM, and WGT-AVE, RANK-SUM has the highest values under the three criteria; MAX-IMB has comparable performance to WGT-AVE with , with the former having slightly higher and the latter slightly higher and . A similar conclusion can be reached for the normal distribution (see Table 4). The result for under the three criteria , , and resembles that for , the only difference being that the best choice of under DSCRT is instead of . In fact, we also did simulations under different sample sizes ( and 150) and the results are quite consistent.
5. Discussion and Conclusions
In this paper, we propose a new minimization procedure that balances treatment allocation over continuous covariates. For any new patient, it examines the imbalances in the neighborhoods of his or her covariate value and biases the allocation probability towards the treatment that would result in a smaller value of the maximum imbalance. The new method depends only on the ranks of the covariates and is therefore distribution-free. Our simulation studies have shown that it is able to maintain relatively good balance in terms of group sizes and covariate distributions across treatment groups.
In addition, the new procedure does not require the specification of any critical values, which are usually needed for discretization methods in order to define categories. For the latter methods, if quantiles of the covariate distribution are used for the critical values, then lack of knowledge about may lead to wrong guesses of the quantiles. The new procedure avoids this step by considering all possible divisions of the range. Nevertheless, only the assignment differences over intervals have to be examined to calculate the new imbalance measure, and the corresponding algorithm is computationally efficient.
Borrowing the idea of Pocock and Simon’s design [7], our method can easily be generalized to two or more continuous covariates or a mix of discrete and continuous covariates. Suppose that for a total of covariates , the first are continuous and the rest are discrete. When the th patient is enrolled, for any continuous covariate , , we define , by (3.2), which is the maximum imbalance measure with respect to the th covariate; for any discrete covariate , , observe the category the new patient belongs to, tentatively assign him or her to treatment , and define as the absolute difference of patient numbers in the two treatment groups with respect to that specific category. For example, if the th covariate is gender and the new patient is male, then is calculated among all males. Define where ’s and ’s are the weights placed on the covariates and can be assigned according to the importance of the different covariates. Depending on whether is greater than, less than, or equal to , assign the th patient to treatment with probability , or . From (5.1), it is seen that is similar to Pocock and Simon’s weighted average of marginal imbalances. The only difference is that for the continuous covariates we redefine the marginal imbalances by the new measure proposed in the current paper, so that the negative effect caused by discretization can be mitigated. Since the marginal imbalances for discrete covariates in remain the same as in Pocock and Simon’s, it is expected that the good balance properties for discrete covariates in their design can be preserved when is used in the minimization. We did simulations for the case of two continuous covariates and the new procedure again showed improvement over other procedures.
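The weighted combination of marginal imbalances can be illustrated with the sketch below. It assumes a plain weighted sum, which may differ in detail from the paper's displayed formula; the function names, the treatment labels 1 and 2, the weights, and the data are all hypothetical, and continuous covariate values are assumed distinct.

```python
def max_imbalance(xs, ts, x_new, t_new):
    """Largest |n1 - n2| over intervals containing x_new (continuous covariate)."""
    def extremes(signs):
        lo = hi = run = 0
        for s in signs:
            run += s
            lo, hi = min(lo, run), max(hi, run)
        return lo, hi

    ordered = sorted(zip(xs, ts))
    left = [1 if t == 1 else -1 for x, t in ordered if x < x_new]
    right = [1 if t == 1 else -1 for x, t in ordered if x > x_new]
    s_new = 1 if t_new == 1 else -1
    l_min, l_max = extremes(reversed(left))
    r_min, r_max = extremes(right)
    return max(l_max + r_max + s_new, -(l_min + r_min + s_new))


def category_imbalance(cats, ts, c_new, t_new):
    """|n1 - n2| among patients sharing the new patient's category."""
    same = [t for c, t in zip(cats, ts) if c == c_new] + [t_new]
    n1 = same.count(1)
    return abs(n1 - (len(same) - n1))


def weighted_imbalance(cont, disc, ts, new_cont, new_disc, t_new, w_cont, w_disc):
    """Weighted sum of marginal imbalances (assumed form of the combination)."""
    total = 0.0
    for xs, x_new, w in zip(cont, new_cont, w_cont):
        total += w * max_imbalance(xs, ts, x_new, t_new)
    for cats, c_new, w in zip(disc, new_disc, w_disc):
        total += w * category_imbalance(cats, ts, c_new, t_new)
    return total


# Hypothetical trial: one continuous covariate and gender.
x = [0.05, 0.15, 0.35, 0.45, 0.60, 0.70, 0.85, 0.95]
gender = ["M", "M", "F", "F", "M", "F", "M", "F"]
ts = [2, 2, 1, 1, 1, 1, 2, 2]
for t_new in (1, 2):
    print(t_new, weighted_imbalance([x], [gender], ts, [0.55], ["M"],
                                    t_new, w_cont=[1.0], w_disc=[2.0]))
```

The new patient would then be allocated with a biased coin favoring the treatment with the smaller weighted total, exactly as in the single-covariate case.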
In the case that an unequal allocation ratio such as is desired, one can easily generalize the proposed metric by redefining in (3.1) as By doing so, it can be ensured that the allocation ratio over any interval is close to . In practice, one can also modify the maximum imbalance measure by adding weights to different intervals. The weight of each interval can be assigned as a function of the number of patients within the interval, so that the procedure remains distribution-free. But the algorithm to implement such a procedure will be more complicated. We will leave these as future research topics.
Appendix
Proof of Proposition 3.1
We will use the basic fact that for a sequence and a constant , Following the notations and argument in Section 3, But and if . Thus, Note that by definition of the notation , , which further equals since the interval does not contain . Now for any fixed , apply (A.1) to the constant and the sequence , , and we have Applying (A.1) again to with constant and sequence , we have Similarly, Therefore, The derivation of is similar.
Acknowledgments
This work was supported by Grants DMS-0907297 and DMS-0906661 from the National Science Foundation (USA).