Abstract

We propose the degenerate-generalized likelihood ratio test (DGLRT) for one-sided composite hypotheses with independent or dependent observations. Theoretical results show that, under some regularity conditions, the DGLRT has controlled error probabilities and stops sampling with probability 1. Moreover, its stopping boundaries are constants and can be easily determined using the provided search algorithm. In simulation studies, the DGLRT has smaller overall expected sample sizes and smaller relative mean index (RMI) values than the sequential probability ratio test (SPRT) and the double sequential probability ratio test (2-SPRT). To illustrate its application, a real manufacturing data set is analyzed.

1. Introduction

Consider the following hypothesis testing problem:
$$H_0: \theta \le \theta_0 \quad \text{versus} \quad H_1: \theta \ge \theta_1 \quad (\theta_0 < \theta_1), \tag{1.1}$$
with the error constraints
$$P_\theta\{\text{accept } H_1\} \le \alpha \ \text{ for } \theta \le \theta_0, \qquad P_\theta\{\text{accept } H_0\} \le \beta \ \text{ for } \theta \ge \theta_1. \tag{1.2}$$
Here, $\theta_0, \theta_1 \in \Theta$, and $\Theta$ is the parameter space. Sequential tests for the problem (1.1) with independent and identically distributed (i.i.d.) observations have been widely studied. For the one-parameter exponential family with monotone likelihood ratio, the sequential probability ratio test (SPRT) proposed by Wald [1] provides an optimal solution to the problem (1.1), in the sense of minimizing the expected sample sizes (ESSs) at $\theta = \theta_0$ and $\theta = \theta_1$ among all tests satisfying the constraints (1.2).

However, the ESSs of the SPRT at other parameter points can be even larger than the sample sizes of fixed-sample-size tests. This led Weiss [2], Lai [3], and Lorden [4] to consider the problem (1.1) from the minimax perspective. Subsequently, Huffman [5] extended Lorden's [4] results to show that the 2-SPRT provides an asymptotically optimal solution to the minimax sequential test problem (1.1). Instead of the minimax approach, Wang et al. [6] proposed a test that minimizes a weighted ESS based on the mixture likelihood ratio (MLR). Since the ESSs over $[\theta_0, \theta_1]$ are hard to control and are usually the main concern in applications, Wang et al. [6] paid particular attention to the performance of the ESS over $[\theta_0, \theta_1]$. Many other tests for the problem (1.1) under independent observations have been developed from other perspectives, including [7–11] and so forth.

It is true that in many practical cases independence is justified, and hence these tests have been widely used. However, such tests may not be effective when the observations are dependent, for example, the Cauchy-class process for sea level (cf. [12]), fractional Gaussian noise with long-range dependence (cf. [13, 14]), and power-law-type data in cyber-physical networking systems [15]. For power-law data in particular, sequential tests for dependent observations are especially desirable, and the need is not limited to these cases.

So far, many researchers have studied sequential tests for various dependent scenarios. Phatarfod [16] extended the SPRT to test two simple hypotheses $H_0: \theta = \theta_0$ versus $H_1: \theta = \theta_1$ when the observations constitute a Markov chain. Tartakovsky [17] showed that certain combinations of one-sided SPRTs retain asymptotic optimality in the ESS under fairly general conditions for finitely many simple hypotheses. Novikov [18] proposed an optimal sequential test for a general problem of testing two simple hypotheses about the distribution of a discrete-time stochastic process. Niu and Varshney [19] proposed the optimal parametric SPRT with correlated data from a system design point of view. To the best of our knowledge, however, few references consider the problem (1.1) with dependent observations from the perspective of minimizing the ESS over $[\theta_0, \theta_1]$. Similar to Wang et al. [6], one can extend the MLR to the dependent case. However, unlike the i.i.d. case, the MLR in the dependent case may not be available because of the complexity of its computation. Besides, the MLR-based test needs to divide $[\theta_0, \theta_1]$ into two disjoint parts by inserting a point. In i.i.d. cases, this point can be selected following Huffman's [5] suggestion, but in the dependent case this suggestion may not be effective. One can also use the generalized likelihood ratio (GLR) instead of the MLR. Unfortunately, in contrast to the MLR, the GLR does not preserve the martingale properties that allow one to choose two constant stopping boundaries so as to control the two types of error. Moreover, the GLR is hard to compute when the maximum likelihood estimator must be found by numerical search, which is usually the situation in the dependent case.

In this paper, we propose a test method for both dependent and independent observations. It has the following features: (1) it performs well over $[\theta_0, \theta_1]$ in the sense of smaller overall expected sample sizes; (2) its computation is reasonably simple; (3) its stopping boundaries can be determined conveniently. The rest of the paper is organized as follows. In Section 2, we describe the construction of the proposed test in detail and present its basic theoretical properties. Based on these theoretical results, we provide a search algorithm to compute the stopping boundaries of the proposed test. In Section 3, we conduct simulation studies to show the performance of the proposed test. Some concluding remarks are given in Section 4. Technical details are provided in the appendix.

2. The Proposed Test

Let $\mathbf{x}_i := (x_1, x_2, \ldots, x_i)$, $i = 1, 2, \ldots$, and suppose that the conditional probability density of each $x_i \mid \mathbf{x}_{i-1}$, denoted $f(x_i \mid \mathbf{x}_{i-1}, \theta)$, has an explicit form. Here, $x_1 \mid \mathbf{x}_0 := x_1$ and $f(x_1 \mid \mathbf{x}_0, \theta) := f(x_1, \theta)$. Thus, the likelihood ratio can be defined as
$$R_n(\theta, \theta') = \prod_{i=1}^{n} \frac{f(x_i \mid \mathbf{x}_{i-1}, \theta)}{f(x_i \mid \mathbf{x}_{i-1}, \theta')}, \qquad \theta, \theta' \in \Theta. \tag{2.1}$$
Lai [20] introduced this model to construct a sequential test for many simple hypotheses when the observations are dependent. It is very general and includes the i.i.d. case.

Example 2.1. Consider, for instance, a simple nonlinear time series model:
$$x_i = \theta x_{i-1}^2 + \varepsilon_i, \qquad \varepsilon_i \sim N(0, 1). \tag{2.2}$$
In this case, $R_n(\theta, \theta') = \prod_{i=1}^{n} \phi(x_i - \theta x_{i-1}^2) / \phi(x_i - \theta' x_{i-1}^2)$, where $x_0 = 0$ and $\phi(\cdot)$ is the probability density function of the standard normal distribution.
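
As a minimal sketch of Example 2.1, the log of the likelihood ratio for model (2.2) can be accumulated term by term. The function name `log_lr_nonlinear_ar` is hypothetical, and working on the log scale is a numerical convenience assumed here rather than something the text prescribes.

```python
def log_lr_nonlinear_ar(xs, theta, theta_prime):
    """Return log R_n(theta, theta') for x_i = theta * x_{i-1}^2 + eps_i,
    eps_i ~ N(0, 1), with x_0 = 0 (model (2.2))."""
    total = 0.0
    prev = 0.0  # x_0 = 0
    for x in xs:
        m = prev * prev  # x_{i-1}^2
        # log phi(x - theta*m) - log phi(x - theta'*m);
        # the normalizing constants of the two normal densities cancel.
        total += -0.5 * (x - theta * m) ** 2 + 0.5 * (x - theta_prime * m) ** 2
        prev = x
    return total
```

Exponentiating the returned value gives the likelihood ratio itself; for long sequences the log scale avoids overflow.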
To overcome the difficulty stated in Section 1, we propose a test statistic that maximizes the likelihood ratio over a finite set of parameter points in $[\theta_0, \theta_1]$. First, we insert $k$ ($\ge 3$) points into $[\theta_0, \theta_1]$ uniformly, denoted $\tilde{\theta}_i$, with $\tilde{\theta}_i = \theta_0 + (i - 1)(\theta_1 - \theta_0)/(k - 1)$, $i = 1, \ldots, k$. Next, we define the test statistic as $\max_{1 \le i \le k} R_n(\tilde{\theta}_i, \theta')$. It can be checked that this test statistic not only preserves the martingale properties, but also inherits the merit of the GLR. As long as $k$ is not very large (e.g., $k < 100$), its computation is very simple. Thus, it has all three features stated in Section 1. Since the maximization is restricted to finitely many points, we refer to this statistic as the degenerate-generalized likelihood ratio (DGLR).
Based on the DGLR, we define a stopping rule $T$ for the problem (1.1) by
$$T = \inf\Bigl\{ n \ge 1 : \max_{1 \le i \le k} R_n(\tilde{\theta}_i, \theta_0) \ge A \ \text{ or } \ \max_{1 \le i \le k} R_n(\tilde{\theta}_i, \theta_1) \ge B \Bigr\}, \tag{2.3}$$
with the terminal decision rule
$$\Delta = \begin{cases} \text{accept } H_1, & \max_{1 \le i \le k} R_T(\tilde{\theta}_i, \theta_0) \ge A, \\ \text{accept } H_0, & \max_{1 \le i \le k} R_T(\tilde{\theta}_i, \theta_1) \ge B, \\ \text{continue sampling}, & \text{else}, \end{cases} \tag{2.4}$$
where $0 < A, B < \infty$ are two stopping boundaries. Hereafter, the sequential test defined by (2.3) and (2.4) is called the degenerate-generalized likelihood ratio test (DGLRT). Its theoretical properties are stated below. These properties, whose proofs are provided in the appendix, guide the design of the DGLRT.
Let
$$\alpha'(\theta, A, B) = P_\theta\Bigl\{ \max_{1 \le i \le k} R_T(\tilde{\theta}_i, \theta_0) \ge A \Bigr\}, \ \theta \in \Theta_0, \qquad \beta'(\theta, A, B) = P_\theta\Bigl\{ \max_{1 \le i \le k} R_T(\tilde{\theta}_i, \theta_1) \ge B \Bigr\}, \ \theta \in \Theta_1 \tag{2.5}$$
be the real error probabilities, where $\Theta_0$ and $\Theta_1$ represent the parameter subsets under $H_0$ and $H_1$, respectively.
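
The stopping rule (2.3) and decision rule (2.4) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the name `dglrt` and the `log_density(x, history, theta)` signature are hypothetical, the comparison is done on the log scale, and when both boundaries are crossed at the same step the sketch arbitrarily checks the $H_1$ condition first, a case the text does not address.

```python
import math

def dglrt(xs, log_density, theta0, theta1, A, B, k=10):
    """Run the DGLRT on the observations xs.

    log_density(x, history, theta) must return log f(x | history, theta);
    this signature is an assumption made for the sketch.
    Returns (decision, n), where n is the number of observations used.
    """
    # k equally spaced grid points on [theta0, theta1]
    grid = [theta0 + i * (theta1 - theta0) / (k - 1) for i in range(k)]
    log_num = [0.0] * k  # cumulative log-likelihood at each grid point
    log_den0 = 0.0       # cumulative log-likelihood at theta0
    log_den1 = 0.0       # cumulative log-likelihood at theta1
    history = []
    log_A, log_B = math.log(A), math.log(B)
    for n, x in enumerate(xs, start=1):
        for i, th in enumerate(grid):
            log_num[i] += log_density(x, history, th)
        log_den0 += log_density(x, history, theta0)
        log_den1 += log_density(x, history, theta1)
        history.append(x)
        best = max(log_num)
        if best - log_den0 >= log_A:  # max_i R_n(grid_i, theta0) >= A
            return "accept H1", n
        if best - log_den1 >= log_B:  # max_i R_n(grid_i, theta1) >= B
            return "accept H0", n
    return "continue sampling", len(xs)
```

For i.i.d. $N(\theta, 1)$ data one could pass `lambda x, history, th: -0.5 * (x - th) ** 2`, since additive constants of the log-density cancel in every ratio.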

Proposition 2.2. Suppose
$$\int \frac{f(x_i \mid \mathbf{x}_{i-1}, \theta'')}{f(x_i \mid \mathbf{x}_{i-1}, \theta')} \, f(x_i \mid \mathbf{x}_{i-1}, \theta) \, dx_i \le 1, \tag{2.6}$$
for every positive integer $i$ and every triple $\theta \le \theta' \le \theta''$. For the DGLRT defined by (2.3) and (2.4), one has $\alpha'(\theta, A, B) \le k/A$ for all $\theta \in \Theta_0$ and $\beta'(\theta, A, B) \le k/B$ for all $\theta \in \Theta_1$.

Remark 2.3. The assumption (2.6) in Proposition 2.2 is not restrictive. It holds for the general one-parameter exponential family and many other families (cf. Robbins and Siegmund [21]).
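
As an illustrative check that is not part of the original text, condition (2.6) can be verified directly for i.i.d. $N(\theta, 1)$ observations, where the conditional density reduces to $\phi(x - \theta)$. The calculation uses the normal moment generating function $E_\theta[e^{cX}] = e^{c\theta + c^2/2}$ with $c = \theta'' - \theta'$:

```latex
\begin{aligned}
\int \frac{\phi(x-\theta'')}{\phi(x-\theta')}\,\phi(x-\theta)\,dx
 &= E_\theta\!\left[\exp\!\left\{(\theta''-\theta')X
      - \tfrac{1}{2}\bigl((\theta'')^2-(\theta')^2\bigr)\right\}\right] \\
 &= \exp\!\left\{(\theta''-\theta')\theta + \tfrac{1}{2}(\theta''-\theta')^2
      - \tfrac{1}{2}\bigl((\theta'')^2-(\theta')^2\bigr)\right\} \\
 &= \exp\bigl\{(\theta''-\theta')(\theta-\theta')\bigr\} \;\le\; 1 .
\end{aligned}
```

The bound holds because $\theta'' - \theta' \ge 0$ and $\theta - \theta' \le 0$ for every triple $\theta \le \theta' \le \theta''$.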

Proposition 2.4. Suppose that there exists a constant $\varepsilon > 0$ such that $E_{\theta''}[\log\{f(x_i \mid \mathbf{x}_{i-1}; \theta')\} - \log\{f(x_i \mid \mathbf{x}_{i-1}; \theta)\}] \ge \varepsilon$ for all $i$ and every triple $\theta \le \theta' \le \theta''$. Under the assumptions stated in Proposition 2.2, one has $P_\theta\{T < \infty\} = 1$ for all $\theta \in \Theta$.

Remark 2.5. For $\theta'' \ge \theta'$, we have
$$\begin{aligned} E_{\theta''}\bigl[\log\{f(x_i \mid \mathbf{x}_{i-1}; \theta')\} - \log\{f(x_i \mid \mathbf{x}_{i-1}; \theta)\}\bigr] &= -E_{\theta''}\bigl[\log\{f(x_i \mid \mathbf{x}_{i-1}; \theta)\} - \log\{f(x_i \mid \mathbf{x}_{i-1}; \theta')\}\bigr] \\ &\ge -\log\left\{ E_{\theta''}\left[ \frac{f(x_i \mid \mathbf{x}_{i-1}; \theta)}{f(x_i \mid \mathbf{x}_{i-1}; \theta')} \right] \right\} \ge 0. \end{aligned} \tag{2.7}$$
The first inequality is Jensen's inequality, and the last inequality follows from (2.6). The quantity $E_{\theta''}[\log\{f(x_i \mid \mathbf{x}_{i-1}; \theta')\} - \log\{f(x_i \mid \mathbf{x}_{i-1}; \theta)\}]$ is positive with probability 1 if $\theta \ne \theta'$. Heuristically, the requirement that this difference be at least the constant $\varepsilon > 0$ for all $i$ amounts to assuming that the sequence of data cumulatively adds information about all $\theta'' \ge \theta'$, which is generally true in sequential studies.
From Proposition 2.2, we conclude that the DGLRT satisfies the error constraints (1.2) if $A = k/\alpha$ and $B = k/\beta$. From Proposition 2.4, sampling stops after finitely many observations with probability 1. These results imply that the DGLRT can be useful in a sequential study for the testing problem (1.1).
In the DGLRT (2.3) and (2.4), the value of the parameter $k$ should be large but finite. In practice, we suggest $k = 10$ (cf. Section 3). The boundaries $A$ and $B$ can be computed by simulation. Proposition 2.2 implies that the required boundaries satisfy $A \le k/\alpha$ and $B \le k/\beta$. Thus, we can search for $(A, B)$ over $[1, k/\alpha] \times [1, k/\beta]$, with the real error probabilities computed by simulation. One may consider a dense grid search on $[1, k/\alpha] \times [1, k/\beta]$, but this is time consuming. To reduce the computation, we introduce an efficient approach as follows. In the first step, we use a bisection search to find $A_1 \in [1, k/\alpha]$ such that $\alpha'(\theta_0, A_1, k/\beta) = \alpha$. Then, with $A_1$ fixed, we find $B_1 \in [1, k/\beta]$ such that $\beta'(\theta_1, A_1, B_1) = \beta$. Since $\alpha'(\theta_0, x, y)$ and $1 - \beta'(\theta_1, x, y)$ decrease in $x$ and increase in $y$, we conclude that $(A, B) \in [1, A_1] \times [1, B_1]$. Hence, we repeat the above step over $[1, A_1] \times [1, B_1]$. In this way, we generate a sequence of pairs $(A_1, B_1), (A_2, B_2), \ldots$, which satisfy
$$A_1 \ge A_2 \ge \cdots \ge 1, \qquad B_1 \ge B_2 \ge \cdots \ge 1. \tag{2.8}$$
It can be checked that these pairs converge to the exact stopping boundaries. In practice, we repeat the above process and stop at step $l$ if $|\alpha'(\theta_0, A_l, B_l) - \alpha| \le \mathrm{tol}_1$ and $|\beta'(\theta_1, A_l, B_l) - \beta| \le \mathrm{tol}_2$. Here, $\mathrm{tol}_1 = 2\%\,\alpha$ and $\mathrm{tol}_2 = 2\%\,\beta$. With modern computing power, finding $A$ and $B$ is not difficult. For example, in the nonlinear time series model (2.2) with $-\theta_0 = \theta_1 = 0.25$, $\alpha = 0.01$, $\beta = 0.05$, and $k = 10$, it takes 15 minutes to obtain the stopping boundaries $A$ and $B$ for the DGLRT based on 100,000 simulations on an Intel Core i7 2.80 GHz CPU. Since this is a one-time computation before testing, it is convenient to carry out.
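
The alternating search described above can be sketched as follows. All names (`bisect_decreasing`, `search_boundaries`, `alpha_fn`, `beta_fn`) are hypothetical. The sketch assumes the caller supplies estimators of $\alpha'(\theta_0, A, B)$ and $\beta'(\theta_1, A, B)$, for instance Monte Carlo estimates computed with a fixed set of simulated paths, and that each error probability is non-increasing in its own boundary.

```python
def bisect_decreasing(g, target, lo, hi, iters=60):
    """Approximate x in [lo, hi] with g(x) = target, assuming g is non-increasing."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) > target:
            lo = mid  # error probability still too large: raise the boundary
        else:
            hi = mid
    return 0.5 * (lo + hi)

def search_boundaries(alpha_fn, beta_fn, alpha, beta, k, tol=0.02, max_rounds=50):
    """Alternate bisection in A and B, shrinking the search box each round.

    alpha_fn(A, B) and beta_fn(A, B) are user-supplied (assumed) estimators
    of the real error probabilities at theta0 and theta1.
    """
    A_hi, B_hi = k / alpha, k / beta  # upper bounds from Proposition 2.2
    A, B = A_hi, B_hi
    for _ in range(max_rounds):
        # Solve for A with B held at its current value, then for B with A fixed.
        A = bisect_decreasing(lambda a: alpha_fn(a, B), alpha, 1.0, A_hi)
        B = bisect_decreasing(lambda b: beta_fn(A, b), beta, 1.0, B_hi)
        if (abs(alpha_fn(A, B) - alpha) <= tol * alpha
                and abs(beta_fn(A, B) - beta) <= tol * beta):
            break
        A_hi, B_hi = A, B  # next round searches [1, A_l] x [1, B_l]
    return A, B
```

With noisy Monte Carlo estimators, far fewer bisection iterations than the default are meaningful; the default is chosen only so that the sketch converges cleanly for deterministic inputs.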

3. Numerical Studies

In this section, we present simulation results on the numerical performance of the proposed DGLRT. In the DGLRT, the parameter $k$ needs to be chosen. We first investigate the effect of $k$ on the performance of the DGLRT using i.i.d. observations from the normal distribution $N(\theta, 1)$. Setting $-\theta_0 = \theta_1 = 0.5$ and $\alpha = \beta = 0.01$, we compare the DGLRTs with $k = 3, 5, 10, 50$. The corresponding stopping boundaries $(A, B)$ are (69.3, 69.3), (74.3, 74.3), (75.7, 75.7), and (76.7, 76.7), respectively. The ESSs at $\theta = -0.8\ (0.1)\ 0.8$ (i.e., $\theta$ takes values from $-0.8$ to $0.8$ in steps of 0.1) are computed from 100,000 simulations and are provided in Table 1.

Because of the symmetry, we only include results for $\theta \in [-0.8, 0]$. Table 1 shows that the ESSs under a larger $k$ are smaller than those under a smaller $k$ if $\theta \in (\theta_0, \theta_1)$. Meanwhile, a smaller $k$ performs better outside $(\theta_0, \theta_1)$. In order to assess the overall performance of the tests, we compute their relative mean index (RMI) values. The RMI was introduced by Han and Tsung [22] for comparing the performance of several control charts. It is defined as
$$\mathrm{RMI} = \frac{1}{N} \sum_{l=1}^{N} \frac{\mathrm{ESS}(\theta_l) - \mathrm{MESS}(\theta_l)}{\mathrm{MESS}(\theta_l)}, \tag{3.1}$$
where $N$ is the total number of parameter points $\theta_l$ considered, $\mathrm{ESS}(\theta_l)$ denotes the ESS of the given test at $\theta_l$, and $\mathrm{MESS}(\theta_l)$ is the smallest $\mathrm{ESS}(\theta_l)$ among all the tests being compared. So, $(\mathrm{ESS}(\theta_l) - \mathrm{MESS}(\theta_l))/\mathrm{MESS}(\theta_l)$ can be regarded as the relative difference between the given test and the best test at $\theta_l$, and the RMI is the average of all such relative differences. By this index, a test with a smaller RMI value is considered better in overall performance. Since we focus on the performance over the parameter interval $[\theta_0, \theta_1]$, we take $\theta_l = -0.5 + 0.1(l - 1)$, $l = 1, \ldots, 10$, in this illustration. The resulting RMIs for the DGLRT under $k = 3, 5, 10, 50$ are 0.0116, 0.0042, 0.0017, and 0.0011, respectively, which shows that the DGLRT under a larger $k$ is more efficient than under a smaller $k$, although the improvement is minor once $k$ is large enough. Considering the complexity of computation, we select $k = 10$ for practical purposes. From now on, the DGLRT always refers to the DGLRT with $k = 10$ unless otherwise stated.
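
The index (3.1) is straightforward to compute once the ESS values are available. The following is a minimal sketch; the function name `rmi` and the dictionary layout of the input are assumptions made for illustration.

```python
def rmi(ess_by_test):
    """Compute the RMI of each test.

    ess_by_test maps a test name to its list of ESS values at
    theta_1, ..., theta_N (all lists must have the same length).
    """
    names = list(ess_by_test)
    n_points = len(ess_by_test[names[0]])
    # MESS(theta_l): smallest ESS at theta_l among the compared tests
    mess = [min(ess_by_test[t][l] for t in names) for l in range(n_points)]
    return {
        t: sum((ess_by_test[t][l] - mess[l]) / mess[l]
               for l in range(n_points)) / n_points
        for t in names
    }
```

A test that attains the smallest ESS at every parameter point gets an RMI of exactly 0, so the index measures the average relative excess over the pointwise best test.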

Next, we investigate the performance of the DGLRT in controlling the ESSs over $[\theta_0, \theta_1]$. In the i.i.d. case, the 2-SPRT is known to perform better in controlling the maximum ESS, while the SPRT is close to optimal for the ESSs in the neighborhoods of $\theta_0$ and $\theta_1$. Based on extensive simulations, we conclude that these features are preserved in the dependent case. Therefore, the SPRT and the 2-SPRT are compared with the DGLRT in this paper. The following three cases are considered.

Case 1. Observations are collected from normal distributions with mean $\theta$ and variance 1. Set $-\theta_0 = \theta_1 = 0.5$ and $\alpha = \beta = 0.01$ for the test problem (1.1).

Case 2. Observations are collected from exponential distributions with mean $1/\theta$. The problem (1.1) is set with $\theta_0 = 0.5$, $\theta_1 = 2$, and $\alpha = \beta = 0.01$.

Case 3. Consider the test problem (1.1) for the simple nonlinear time series model (2.2) with $\theta_0 = 0$, $\theta_1 = 1$, and $\alpha = \beta = 0.01$.

In each case, the inserted point for the 2-SPRT is searched over $[\theta_0, \theta_1]$. The stopping boundaries are also computed following the search algorithm stated in Section 2. These stopping boundaries $(A, B)$ are listed in the order SPRT, 2-SPRT, and DGLRT: Case 1: (56.4, 56.4), (37.4, 37.4), and (75.7, 75.7); Case 2: (63.8, 25.5), (42.5, 23.5), and (79.5, 39.5); and Case 3: (14.5, 25.5), (8.2, 26.8), and (22.5, 36.5). Figures 1–3 display the ESS curves over $[\theta_0 - 0.5, \theta_1 + 0.5]$ under the three tests for Cases 1–3, with the dashed line for the SPRT, the dotted line for the 2-SPRT, and the solid line for the DGLRT. Figure 1 shows that the DGLRT is comparable to the 2-SPRT in the middle of the parameter range and performs as well as the SPRT in the two tails. This implies that the DGLRT controls both the maximum ESS and the ESSs under $H_0$ and $H_1$ very well. The same conclusions can be drawn from Figures 2 and 3. The RMIs for the SPRT, 2-SPRT, and DGLRT under the three cases are also computed and listed in Table 2. The RMI for the DGLRT is the smallest among the three tests in all three cases. Thus, the DGLRT performs best over $[\theta_0, \theta_1]$ compared with the SPRT and the 2-SPRT.

To illustrate the DGLRT, we apply it to a real manufacturing data set (cf. Chou et al. [23]). A customer specifies the average breaking strength of a strapping tape as 200 psi, with a standard deviation of 12 psi. The data are the breaking strengths of different strapping tapes, so the random errors mainly stem from measurement errors. Thus, the observations can be assumed to be independent. The Shapiro and Wilk [24] test indicates that the data are consistent with a normal distribution. Consider the test problem (1.1) with $\theta_0 = 200$ and $\theta_1 = 212$, and standardize the observations by the transformation $X_i \to (X_i - 206)/12$, $i = 1, 2, \ldots$. The resulting test problem is then equivalent to $H_0: \theta \le -0.5$ versus $H_1: \theta \ge 0.5$. Under $\alpha = \beta = 0.01$, the corresponding stopping boundaries for the DGLRT are (75.7, 75.7). Based on the first 20 real observations, we compute the test statistics of the DGLRT, which are displayed in Table 3. In Table 3, standardized $X_i$ denotes $(X_i - 206)/12$. Table 3 shows that $\max_{1 \le j \le k} R_i(\tilde{\theta}_j, \theta_1)$ increases rapidly in $i$, while $\max_{1 \le j \le k} R_i(\tilde{\theta}_j, \theta_0)$ remains constant for $i = 1, 2, \ldots, 20$ under the real data. Since $\max_{1 \le j \le k} R_i(\tilde{\theta}_j, \theta_1)$ crosses its stopping boundary at the 11th observation, we accept the null hypothesis according to the terminal decision rule (2.4).

4. Concluding Remarks

In this paper, we have proposed the DGLRT for cases where the conditional density function has an explicit form. We have shown that the DGLRT guarantees bounds on the two error probabilities. To make the method more applicable, we have further discussed the selection of the parameter $k$ and the search algorithm for the stopping boundaries. From our numerical results, we conclude that the DGLRT has several merits: (1) in contrast to the SPRT, the DGLRT has a much smaller ESS for $\theta$ in the middle of the parameter range and nearly the same performance for $\theta$ outside the interval $(\theta_0, \theta_1)$. It is not surprising that the 2-SPRT performs best in minimizing the maximum ESS, because it is designed to be optimal in the minimax sense. However, the relative difference in the maximum ESS between the DGLRT and the 2-SPRT is minor. Moreover, for $\theta$ outside $(\theta_0, \theta_1)$, the ESSs of the DGLRT are much smaller than those of the 2-SPRT. That is to say, the DGLRT controls both the maximum ESS and the ESSs under the two hypotheses; (2) under the RMI criterion, the DGLRT performs more efficiently than the SPRT and the 2-SPRT over $[\theta_0, \theta_1]$; (3) its implementation is very simple.

While our focus in this paper is on methodological development, some related questions remain unanswered. For instance, at this moment, we do not know how to determine the stopping boundaries of the DGLRT analytically rather than by the Monte Carlo method. In addition, since our method controls the ESS pointwise, it may be used to construct control charts for detecting small shifts. These questions will be addressed in our future research.

Appendix

Proof of Proposition 2.2. Let
$$T_1 = \inf\Bigl\{ n \ge 1 : \max_{1 \le i \le k} R_n(\tilde{\theta}_i, \theta_0) \ge A \Bigr\}, \qquad T_2 = \inf\Bigl\{ n \ge 1 : \max_{1 \le i \le k} R_n(\tilde{\theta}_i, \theta_1) \ge B \Bigr\}. \tag{A.1}$$
So,
$$\begin{aligned} \alpha'(\theta, A, B) &= P_\theta\{\text{accept } H_1\} = P_\theta\Bigl\{ T < \infty, \ \max_{1 \le i \le k} R_T(\tilde{\theta}_i, \theta_0) \ge A \Bigr\} \\ &= P_\theta\Bigl\{ T_1 \le T_2, \ T < \infty, \ \max_{1 \le i \le k} R_T(\tilde{\theta}_i, \theta_0) \ge A \Bigr\} \\ &\le P_\theta\{T_1 < \infty\} \le \int_{\{T_1 < \infty\}} \frac{1}{A} \max_{1 \le i \le k} R_{T_1}(\tilde{\theta}_i, \theta_0) \, dP_\theta \\ &\le \sum_{i=1}^{k} \frac{1}{A} \int_{\{T_1 < \infty\}} R_{T_1}(\tilde{\theta}_i, \theta_0) \, dP_\theta \le \frac{k}{A}. \end{aligned} \tag{A.2}$$
The last inequality follows from (2.6). This proves that $\alpha'(\theta, A, B) \le k/A$ for all $\theta \in \Theta_0$. The other result can be proven in a similar way.

Proof of Proposition 2.4. Since we insert $k$ ($\ge 3$) points in $[\theta_0, \theta_1]$, we can find an inserted point $\theta_2$ that belongs to $(\theta_0, \theta_1)$. Thus, there exists an $\varepsilon > 0$ such that $E_\theta[\log\{f(x_i \mid \mathbf{x}_{i-1}; \theta_2)\} - \log\{f(x_i \mid \mathbf{x}_{i-1}; \theta_0)\}] \ge \varepsilon$. This implies that $E_\theta[R_n(\theta_2, \theta_0)] \to \infty$ for $\theta \ge \theta_2$. So,
$$\lim_{n \to \infty} P_\theta\Bigl\{ \max_{1 \le i \le k} R_n(\tilde{\theta}_i, \theta_0) \ge A \Bigr\} \ge \lim_{n \to \infty} P_\theta\bigl\{ R_n(\theta_2, \theta_0) \ge A \bigr\} = 1. \tag{A.3}$$
Thus, $P_\theta\{T < \infty\} = 1$ for all $\theta \ge \theta_2$. In a similar way, we obtain $P_\theta\{T < \infty\} = 1$ for all $\theta \le \theta_2$. Combining the two results completes the proof.

Acknowledgments

The authors cordially thank the editor and the anonymous referees for their valuable comments, which led to the improvement of this paper. This research was supported by grants from the National Natural Science Foundation of China (11101156 and 11001083).