Mathematical Problems in Engineering
Volume 2010 (2010), Article ID 393095, 15 pages
http://dx.doi.org/10.1155/2010/393095
Research Article

Hypothesis Designs for Three-Hypothesis Test Problems

School of Finance and Statistics, East China Normal University, No. 500 Dongchuan Road, Shanghai 200241, China

Received 25 January 2010; Accepted 18 March 2010

Academic Editor: Ming Li

Copyright © 2010 Yan Li and Xiaolong Pu. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

As a helpful guide for applications, this paper designs the alternative hypotheses of three-hypothesis test problems under required error probabilities and average sample number. Asymptotic formulas and the proposed numerical quadrature formulas are adopted, respectively, to obtain the hypothesis designs and the corresponding sequential test schemes under the Koopman-Darmois distributions. The example of the normal mean test shows that our methods are quite efficient and satisfactory for practical use.

1. Introduction

In practice, multihypothesis test problems are of considerable interest in engineering, agriculture, clinical medicine, psychology, and other areas. For instance, multihypothesis tests are involved in pattern recognition [1–4], multiple-resolution radar detection [5–7], product comparisons [8, 9], and information detection [10]. Before the inspections, the hypotheses must be determined according to practical needs such as the balance of risks and costs. As Wetherill and Glazebrook [11] pointed out, combinations of hypotheses, risks, and costs may need to be tried iteratively until an acceptable design is attained. This trial-and-error process burdens practitioners.

To avoid too many troublesome trials and to produce the hypotheses directly, we discuss the hypothesis designs under the controlled risks and expected costs in this paper. As an initial exploration, only the three-hypothesis test problems are considered here. Indeed, our methods may extend to the multihypothesis cases.

In practice, test costs are mainly determined by sample sizes. Therefore, the sample size becomes an issue relating to the statistical analysis of problems in many aspects; see for example, Chen et al. [12], Oliveira et al. [13], Li and Zhao [14], Li et al. [15], Bakhoum and Toma [16], Cattani [17], as well as Cattani and Kudreyko [18]. Accordingly, we consider the Average Sample Number (ASN), which is one of the most important values in evaluating the expected costs of sequential test schemes.

In the three-hypothesis test problem, the null hypothesis is always set as a standard and medium status. For example, Anderson [8] discussed the three-hypothesis test problem to decide whether the difference of two yarns' strength is zero (the null hypothesis), positive or negative. Realistically, the standard and medium status (denoted as ) is definite, while the two alternatives beside it need to be designed to balance the risks and costs. Thus, in this paper, we try to design the alternatives and under the required error probabilities and ASN for testing the parameter of the Koopman-Darmois distribution

To simplify the discussion, we only consider the designs of the two alternative hypotheses symmetric with the null hypothesis, that is, . Actually, the asymmetric designs may be obtained by extending our methods slightly.

Then, the test problem here is

For the multihypothesis test problems, Armitage [19] provided a classical test scheme by simultaneously applying the method of Sequential Probability Ratio Test (SPRT) on each pair of the hypotheses. This test scheme pattern is simple and easy to implement. When testing the three hypotheses for the Koopman-Darmois distribution (1.1), Armitage's scheme may be illustrated as in Figure 1, where AL//CM are boundaries for “ versus " and CP//DQ are for “ versus " when the boundaries for “ versus " are encircled by AL and DQ and thus are neglected. According to Figure 1, the decision rule should be where and are independent sequential observations from a Koopman-Darmois distribution.

Figure 1
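As a rough illustration of the idea of applying an SPRT to each pair of hypotheses simultaneously, the following sketch runs a three-hypothesis sequential test for a normal mean with unit variance. It is a simplified sketch and not the decision rule (1.3): the function name, the symmetric hypothesis values -delta, 0, and delta, and the single common threshold `log_threshold` on the log-likelihood-ratio scale are all assumptions made for illustration.

```python
import math


def armitage_three_hypothesis(observations, delta, log_threshold):
    """Sequentially choose among theta = -delta, 0, +delta (unit variance).

    Hypothesis i is accepted at the first n where its log-likelihood ratio
    against every other hypothesis reaches log_threshold (an assumed common
    critical value). Returns (decision, n); decision is -1, 0, 1, or None if
    the observations run out before a decision.
    """
    thetas = {-1: -delta, 0: 0.0, 1: delta}
    s = 0.0  # running sum of the observations
    n = 0
    for n, x in enumerate(observations, start=1):
        s += x
        for i, th_i in thetas.items():
            # Log-likelihood ratio of H_i against H_j after n normal observations.
            if all(
                (th_i - th_j) * s - n * (th_i ** 2 - th_j ** 2) / 2.0 >= log_threshold
                for j, th_j in thetas.items()
                if j != i
            ):
                return i, n
    return None, n


if __name__ == "__main__":
    # Constant observations make the stopping time easy to check by hand.
    print(armitage_three_hypothesis([1.0] * 20, delta=1.0, log_threshold=math.log(19.0)))
```

With constant observations equal to 1 and delta = 1, the log-likelihood ratio of the upper alternative against the null grows by 1/2 per observation, so the threshold log 19 (about 2.94) is first reached at the sixth observation.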

For the given , and , the test scheme in Figure 1 is decided by 6 parameters . and may be determined according to Armitage [19], that is, , under the Koopman-Darmois distribution (1.1); the remaining 4 parameters then form the scheme. Together with the hypothesis design value in the test problem (1.2), the 5 undetermined values are .

In the three-hypothesis test problems, the error probabilities and should be assigned to the error probabilities in correspondence with the requirements Commonly, we set , , as Payton and Young [20, 21] indicated.

And the request on the ASN should be where is a provided integer and is the point at which the ASN needs to be controlled. may take values of , , , and so on according to practical needs.

Then, under the constraints (1.4) and (1.5), we may find the proper by virtue of their relationships with the error probabilities and ASN.

Unfortunately, to the best of the authors' knowledge, exact formulas for the performance of the three-hypothesis test scheme are still unavailable, possibly because of its sequential nature and its irregular continuing sampling area. In the following, the hypothesis designs and the test scheme parameters are determined under the required error probabilities and ASN in terms of approximate expressions, that is, the asymptotic formulas and the proposed numerical quadrature formulas.

2. Designs under Asymptotic Formulas

In this section, we try to find the hypothesis designs and test schemes under the required error probabilities and ASN by virtue of the asymptotic formulas of the multihypothesis sequential test scheme by Dragalin et al. in [22, 23].

Firstly, we discuss how to control the error probabilities. Let be the critical value of the logarithmic likelihood ratio function for accepting , and let be the probability limit of incorrectly accepting , . According to Dragalin et al. [22], under the condition of equal prior probabilities for the three hypotheses, the probability of wrongly accepting for the Armitage [19] scheme may be controlled by if the critical value is set as Thus, the error probabilities are in control if we follow the critical values in (2.1), where , , and . Setting the critical values equal to the corresponding logarithmic likelihood ratio functions, we have the following expressions for the test scheme parameters under the Koopman-Darmois distribution (1.1):

Note that the expressions in (2.2) define the relations between the hypothesis design parameter and the test scheme parameters , while has not been determined so far.

In the following, the hypothesis design parameter is found with the help of Dragalin et al.'s asymptotic ASN formulas [23].

Based on the nonlinear renewal theory, Dragalin et al. [23] summarized and developed the asymptotic ASN formulas under . Specifically, when , the asymptotic ASN formulas under the two alternatives and are where and is the expected limiting overshoot under , .

And for the null hypothesis under , the asymptotic ASN formula is where ( here), and is the value related to the covariance of the logarithmic likelihood ratio functions.

Notice that the approximate ASN formulas (2.3) and (2.4) only depend on the hypothesis design parameter when is given. Therefore, to find the proper hypothesis design under the desired number , we set up an equation about to meet the ASN requirement on one of the three hypothesis values, that is, where may be , or .

Then, the hypothesis design parameter is the solution to (2.5) and the test scheme with may be obtained correspondingly according to (2.2). Illustrations are provided in Example 1 for testing the normal mean with the variance known.
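The step of solving a single equation for the design parameter can be sketched with a simple bisection. The sketch below does not use the asymptotic formulas (2.3) and (2.4); as a stand-in, it assumes the first-order approximation in which the ASN under an alternative is the critical value divided by the Kullback-Leibler distance to the nearest hypothesis, which equals delta squared over 2 for a normal mean with unit variance. The function names and the bracketing interval are hypothetical.

```python
import math


def first_order_asn(delta, log_threshold):
    """First-order ASN approximation under the alternative theta = delta.

    Assumption: ASN ~ log_threshold / KL, where KL = delta**2 / 2 is the
    Kullback-Leibler distance between N(delta, 1) and N(0, 1). This stands in
    for the more accurate asymptotic formulas and ignores overshoot terms.
    """
    return log_threshold / (delta ** 2 / 2.0)


def solve_design(n0, log_threshold, lo=1e-6, hi=10.0, tol=1e-10):
    """Find delta such that the approximate ASN equals the required n0.

    The approximate ASN decreases in delta, so bisection on [lo, hi] applies
    as long as the ASN exceeds n0 at lo and falls below n0 at hi.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if first_order_asn(mid, log_threshold) > n0:
            lo = mid  # ASN still too large: the alternative must move farther out
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    delta = solve_design(n0=5, log_threshold=math.log(19.0))
    print(round(delta, 4))
```

Under this simplified approximation the root is available in closed form as the square root of 2 * log_threshold / n0, which gives a direct check on the bisection; with the full asymptotic formulas a numerical solver of this kind would generally be needed.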

Example 1. Suppose that the sequential observations are independent and identically distributed (i.i.d.) with . Let , , and . Small values () are set on as practical sequential inspections always require.
Accordingly, we have , . In this example, the test scheme parameters should be And for the normal distribution , there are where and are the probability density function (p.d.f.) and cumulative distribution function (c.d.f.) of the standard normal distribution, respectively.

Consider the following 4 cases, respectively: Then, solving (2.5), we obtain the hypothesis designs as shown in Column 2 of Tables 1, 2, 3, and 4. The corresponding test scheme parameters from (2.6) are listed in Columns 3–5 of Tables 1–4. To evaluate the method's efficiency, we record the results of a Monte Carlo simulation study with 1,000,000 replicates in Tables 5, 6, 7, and 8, where is the simulated value of and is the relative difference between and . Note that the simulated probabilities under are omitted here since, by the symmetry of the schemes, they are nearly equal to their counterparts under .

Table 1: Hypothesis designs and test schemes for , .
Table 2: Hypothesis designs and test schemes for , .
Table 3: Hypothesis designs and test schemes for , .
Table 4: Hypothesis designs and test schemes for , .
Table 5: Simulated performances for the schemes under asymptotic formulas in Table 1.
Table 6: Simulated performances for the schemes under asymptotic formulas in Table 2.
Table 7: Simulated performances for the schemes under asymptotic formulas in Table 3.
Table 8: Simulated performances for the schemes under asymptotic formulas in Table 4.
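A Monte Carlo check of this kind can be sketched as follows. The sketch estimates the acceptance probabilities and the ASN of a simplified three-hypothesis sequential test for a normal mean with unit variance; it is not the scheme of Tables 1–4. The common threshold `log_threshold`, the cap `max_n`, the seed, and the small number of replicates are assumptions chosen to keep the example quick.

```python
import math
import random


def run_once(rng, theta, delta, log_threshold, max_n=10_000):
    """Run one sequential test on N(theta, 1) data; return (decision, n)."""
    thetas = (-delta, 0.0, delta)
    s = 0.0  # running sum of the observations
    for n in range(1, max_n + 1):
        s += rng.gauss(theta, 1.0)
        for i, th_i in enumerate(thetas):
            # Accept H_i once its log-likelihood ratio against both other
            # hypotheses reaches the assumed common threshold.
            if all(
                (th_i - th_j) * s - n * (th_i ** 2 - th_j ** 2) / 2.0 >= log_threshold
                for th_j in thetas
                if th_j != th_i
            ):
                return i - 1, n  # map index 0, 1, 2 to decision -1, 0, 1
    return None, max_n  # no decision within max_n observations


def simulate(theta, delta, log_threshold, replicates=2000, seed=0):
    """Estimate acceptance probabilities and the ASN at the true mean theta."""
    rng = random.Random(seed)
    counts = {-1: 0, 0: 0, 1: 0, None: 0}
    total_n = 0
    for _ in range(replicates):
        decision, n = run_once(rng, theta, delta, log_threshold)
        counts[decision] += 1
        total_n += n
    probs = {k: v / replicates for k, v in counts.items()}
    return probs, total_n / replicates


if __name__ == "__main__":
    probs, asn = simulate(theta=1.0, delta=1.0, log_threshold=math.log(19.0))
    print(probs, asn)
```

Replacing `theta` by 0 or by -delta gives the estimates at the other two hypothesis values, and a larger `replicates` reduces the simulation error at the cost of running time.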

Obviously, the accuracy of the solution to (2.5) is decided by the efficiency of the ASN formulas (2.3) and (2.4). On the one hand, from Dragalin et al. [23] and the ’s in Tables 5–8, we conclude that the formulas in (2.3) for and are more efficient than the one in (2.4) for when testing the normal mean. On the other hand, the asymptotic ASN formulas perform better under smaller error probabilities since the asymptotic limit is taken as . For applications, given such simple computation, the efficiency of the design is quite satisfactory when the error probabilities are small.

However, this method can only control the ASN at the three hypothesis values, since asymptotic ASN formulas away from these points are not yet available. In addition, the quantities , and must be derived for each specific distribution (see [23]). Besides, the discrepancies between the real performances and the required ones show that the method is conservative. In the next section, an improved method is proposed and more efficient formulas are developed through numerical quadrature.

3. Designs under Numerical Quadrature Formulas

This section proposes a method to obtain more efficient hypothesis designs and test schemes through a system of equations based on the numerical quadrature formulas of the error probabilities and ASN.

In the studies of Payton and Young [20, 21], for the provided hypotheses, the error probabilities are approximately attained by solving a system of equations in the 4 scheme parameters . This approach is intended to make full use of the required error probabilities and to yield efficient designs. Inspired by Payton and Young, we propose to find the hypothesis design and test scheme by solving the following system of equations:

Obviously, the key is to find formulas for the error probabilities and ASN on the left side of the equations in (3.1). Unfortunately, the available approximate formulas do not meet practical needs well. For example, Payton and Young [20, 21] adopted formulas based on a continuous-time process and on the required minimum sample size before decisions, and obtained some inefficient results. Also, as mentioned in Section 2, Dragalin et al.'s results are restricted to the conditions of small error probabilities and [22, 23].

To find efficient and applicable designs, we develop the approximate formulas through the numerical quadrature for the three-hypothesis test scheme's performances on the error probabilities and ASN.

To deduce the formulas for the realistic discrete-time situation, we denote as the minimum integer that is not less than . Let and be the values on the two boundaries DQ and AL in Figure 1 at , that is, , , . Denote , , , and . With the decision rule (1.3), we rewrite the system of (3.1) as where = through ; , , ; , , ; ; = ; ; is the average sample number from a point in at to the point of accepting or when . is the average sample number from a point in at to the point of making a decision when . And is the average sample number from a point in at to the point of making a decision when .

The following theorem provides approximate formulas, obtained through numerical quadrature, for the quantities in (3.2). These formulas are developed by a stepwise treatment of the continuing sampling area before and by the results of Li and Pu [24] for the parallel-line areas inside AL//CM and inside CP//DQ, respectively. With this idea, the proof of Theorem 3.1 is straightforward and is omitted here.

Theorem 3.1. Assume that are i.i.d. observations. Let and be the p.d.f. and c.d.f. of , respectively. Assume that . Denote , and = , where is the th numerical quadrature root for , , , and is the corresponding weight for the numerical quadrature root . Let and be the th numerical quadrature root for and for , respectively, .
Then, the approximate values , , , , , , , , , and for the quantities in (3.2) are the following.()()()()() Denote , where , ; , where , ; , where , . Let be the identity matrix. Then, there is where .() Denote , where , ; , where , ; , where , . Then, we have where .()()() where with being the vector of 1's.() where

Notice that the values on the left side of the equations in (3.2) must be obtained through a computer program with much iterative work, which reveals the method's computational complexity and slows the solution of the system (3.2). Nevertheless, the computing time is tolerable when the accuracy demanded of the solution is not too high.
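To give a feel for how quadrature roots, weights, an identity matrix, and a vector of 1's enter such formulas, the following sketch treats a much simpler setting than Theorem 3.1: a two-boundary SPRT with constant limits a and b on the log-likelihood-ratio scale and normal increments. There the ASN from a starting point u satisfies N(u) = 1 + ∫ N(s) f(s − u) ds over (a, b), and replacing the integral by a Gauss-Legendre sum turns it into a linear system. The function name and all parameter names are assumptions, and NumPy is used for the nodes and the linear solve.

```python
import numpy as np


def sprt_asn_quadrature(a, b, mu, sigma, start=0.0, n_nodes=64):
    """Approximate the ASN of a two-boundary SPRT by Gauss-Legendre quadrature.

    The statistic starts at `start`, adds i.i.d. N(mu, sigma^2) increments,
    and sampling continues while it stays inside (a, b).
    """
    # Gauss-Legendre roots and weights on [-1, 1], rescaled to [a, b].
    x, w = np.polynomial.legendre.leggauss(n_nodes)
    half_width = 0.5 * (b - a)
    nodes = half_width * x + 0.5 * (a + b)
    weights = half_width * w

    def pdf(z):
        # Density of one increment of the statistic.
        return np.exp(-0.5 * ((z - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

    # K[i, j] = w_j * f(x_j - x_i): one-step transition from node i to node j.
    kernel = weights[None, :] * pdf(nodes[None, :] - nodes[:, None])

    # Discretized integral equation: (I - K) N = 1, with 1 the vector of 1's.
    asn_at_nodes = np.linalg.solve(np.eye(n_nodes) - kernel, np.ones(n_nodes))

    # One more quadrature step gives the ASN from the actual starting point.
    return 1.0 + float(np.sum(weights * pdf(nodes - start) * asn_at_nodes))


if __name__ == "__main__":
    print(sprt_asn_quadrature(a=-3.0, b=3.0, mu=0.5, sigma=1.0))
```

The cost is dominated by one dense linear solve per evaluation, so each function evaluation inside an equation solver stays cheap at 64 nodes; the repeated evaluations over many solver iterations are what add up.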

Example 1 (Continued). Consider the same problems as those in Example 1 in Section 2. By applying the formulas (3.3)–(3.12) with 64 Gaussian quadrature roots, we solve the system (3.2) in a computer program. The hypothesis designs and the corresponding test schemes are listed in Columns 6–9 of Tables 1–4. As a comparison with the method under the asymptotic formulas in Section 2, in Column 10 of Tables 1–4 records the relative difference between the hypothesis designs of the two methods. The Monte Carlo simulation study with 1,000,000 replicates in Tables 9, 10, 11, and 12 reveals the schemes' real performances.

Table 9: Simulated performances for the schemes under Gaussian quadrature formulas in Table 1.
Table 10: Simulated performances for the schemes under Gaussian quadrature formulas in Table 2.
Table 11: Simulated performances for the schemes under Gaussian quadrature formulas in Table 3.
Table 12: Simulated performances for the schemes under Gaussian quadrature formulas in Table 4.

The real performances in Tables 9–12 show that this method makes full use of the requirements on the error probabilities and ASN, and that the numerical quadrature formulas are almost exact. Therefore, the hypothesis designs and test schemes are highly efficient, as shown, for instance, by the smaller in Tables 1–4 under this method.

To further explain the methods, an example of the airbag quality inspection is provided in the appendix.

4. Conclusions and Remarks

For three-hypothesis test problems, this paper proposes methods for designing the hypotheses and obtaining the corresponding test schemes by adopting asymptotic formulas or numerical quadrature formulas. As a helpful guide for practitioners, they help to find proper hypotheses directly under controlled risks and costs, avoiding many iterative trials on combinations of hypotheses to meet practical needs.

The asymptotic formulas and the numerical quadrature formulas are alternative tools for the hypothesis designs. Several aspects should be considered when choosing between them in applications.
(1) The method with numerical quadrature formulas outperforms the one under the asymptotic formulas, especially when the error probabilities are not very small, as the example shows. In reality, the required error probabilities typically range from 0.05 to 0.30 in sequential inspections, which suggests choosing the numerical quadrature formulas to obtain efficient designs.
(2) In computation, the asymptotic formulas provide great convenience for applications, while the numerical quadrature formulas demand much iterative computational work, especially when the number of numerical quadrature roots is large. However, judging from the computation with the 64 Gaussian quadrature roots in the example, the time required on a common computer is tolerable if the starting values for the system of equations are proper. We recommend finding the designs under the asymptotic formulas first and then applying them as starting values to obtain more efficient hypotheses from the numerical quadrature formulas when needed.
(3) When adopting the asymptotic formulas, the expressions for the quantities , and should be developed for a specific distribution (see [23]). For the use of numerical quadrature formulas, the quadrature roots may be particularly arranged to fit the support points in the discrete distributions (e.g., see Reynolds and Stoumbos [25]). And for the out of , only the method with numerical quadrature formulas is applicable.

Actually, the two methods may apply to any distribution outside the Koopman-Darmois family. However, the test schemes under these distributions may differ from that in Figure 1, and the numerical quadrature formulas should be changed according to the test scheme patterns.

For hypothesis designs asymmetric about the null hypothesis, or for multihypothesis test problems, the methods proposed in this paper remain applicable with extensions that add more constraints on the designs. Hypothesis design problems under other requirements, for example, the desire to stop sampling before a limit with a guaranteed probability, remain open to scholars and practitioners.

Appendix

Illustration of Airbag Quality Inspection

According to Li et al. [26], the airbag deployment pressure rate per unit of time, which is always assumed to conform with a standard normal distribution after some standardized transformation, is a key index of the airbag quality. The concerned problem here is whether the quality index is zero, positive, or negative. This quality index is measured in a 100 cubic feet testing air tank with sensors and the inspection is destructive. Since the airbag is expensive, the three-hypothesis sequential test scheme is needed to reduce the average inspection costs.

Suppose that the two error probabilities are and the required is no more than 5. Then, the hypothesis designs and test schemes of “" in Table 2 should be taken, that is, = , under the method with asymptotic formulas and = under the method with Gaussian quadrature formulas.

Under the method with asymptotic formulas, the hypothesis test problem should be Taking the simulated observations from by Li et al. in [26], we may reach a decision of accepting when falls in according to the test process in Table 13.

Table 13: Test process under the test scheme from asymptotic formulas.

Under the method with Gaussian quadrature formulas, the hypothesis test problem should be Also taking the simulated observations from by Li et al. in [26], we may accept after inspecting the third airbag according to the test process in Table 14.

Table 14: Test process under the test scheme from Gaussian quadrature formulas.

Acknowledgments

This work was partly supported by the National Natural Science Foundation of China (NSFC) under project Grant nos. 60573125 and 60873264. The authors cordially thank the editor and anonymous referees for their helpful reviews.

References

  1. K. S. Fu, Sequential Methods in Pattern Recognition and Learning, Academic Press, New York, NY, USA, 1968.
  2. W. E. Waters, “Sequential sampling in forest insect surveys,” Forest Science, vol. 1, pp. 68–79, 1955.
  3. B. Lye and R. N. Story, “Spatial dispersion and sequential sampling plan of the southern green stink bug on fresh market tomatoes,” Environmental Entomology, vol. 18, no. 1, pp. 139–144, 1989.
  4. T. McMillen and P. Holmes, “The dynamics of choice among multiple alternatives,” Journal of Mathematical Psychology, vol. 50, no. 1, pp. 30–57, 2006.
  5. J. J. Bussgang, “Sequential methods in radar detection,” Proceedings of the IEEE, vol. 58, no. 5, pp. 731–743, 1970.
  6. E. Grossi and M. Lops, “Sequential along-track integration for early detection of moving targets,” IEEE Transactions on Signal Processing, vol. 56, no. 8, pp. 3969–3982, 2008.
  7. N. A. Goodman, P. R. Venkata, and M. A. Neifeld, “Adaptive waveform design and sequential hypothesis testing for target recognition with active sensors,” IEEE Journal on Selected Topics in Signal Processing, vol. 1, no. 1, pp. 105–113, 2007.
  8. S. L. Anderson, “A simple method of comparing the breaking loads of two yarns,” Textile Institute, vol. 45, pp. 472–479, 1954.
  9. C. Liteanu and I. Rica, Statistical Theory and Methodology of Trace Analysis, Halsted, New York, NY, USA, 1980.
  10. A. G. Tartakovsky, B. L. Rozovskii, R. B. Blažek, and H. Kim, “Detection of intrusions in information systems by sequential change-point methods,” Statistical Methodology, vol. 3, no. 3, pp. 252–293, 2006.
  11. G. B. Wetherill and K. D. Glazebrook, Sequential Methods in Statistics, Monographs on Statistics and Applied Probability, Chapman & Hall, London, UK, 3rd edition, 1986.
  12. T.-H. Chen, C.-Y. Chen, H.-C. P. Yang, and C.-W. Chen, “A mathematical tool for inference in logistic regression with small-sized data sets: a practical application on ISW-ridge relationships,” Mathematical Problems in Engineering, vol. 2008, Article ID 186372, 12 pages, 2008.
  13. T. F. Oliveira, R. B. Miserda, and F. R. Cunha, “Dynamical simulation and statistical analysis of velocity fluctuations of a turbulent flow behind a cube,” Mathematical Problems in Engineering, vol. 2007, Article ID 24627, 28 pages, 2007.
  14. M. Li and W. Zhao, “Variance bound of ACF estimation of one block of fGn with LRD,” Mathematical Problems in Engineering, vol. 2010, Article ID 560429, 14 pages, 2010.
  15. M. Li, W.-S. Chen, and L. Han, “Correlation matching method for the weak stationarity test of LRD traffic,” Telecommunication Systems, vol. 43, no. 3-4, pp. 181–195, 2010.
  16. E. G. Bakhoum and C. Toma, “Relativistic short range phenomena and space-time aspects of pulse measurements,” Mathematical Problems in Engineering, vol. 2008, Article ID 410156, 20 pages, 2008.
  17. C. Cattani, “Harmonic wavelet approximation of random, fractal and high frequency signals,” Telecommunication Systems, vol. 43, no. 3-4, pp. 207–217, 2010.
  18. C. Cattani and A. Kudreyko, “Application of periodized harmonic wavelets towards solution of eigenvalue problems for integral equations,” Mathematical Problems in Engineering, vol. 2010, Article ID 570136, 8 pages, 2010.
  19. P. Armitage, “Sequential analysis with more than two alternative hypotheses, and its relation to discriminant function analysis,” Journal of the Royal Statistical Society. Series B, vol. 12, pp. 137–144, 1950.
  20. M. E. Payton and L. J. Young, “A sequential procedure for deciding among three hypotheses,” Sequential Analysis, vol. 13, no. 4, pp. 277–300, 1994.
  21. M. E. Payton and L. J. Young, “A sequential procedure to test three values of a binomial parameter,” Metrika, vol. 49, no. 1, pp. 41–52, 1999.
  22. V. P. Dragalin, A. G. Tartakovsky, and V. V. Veeravalli, “Multihypothesis sequential probability ratio tests. I. Asymptotic optimality,” IEEE Transactions on Information Theory, vol. 45, no. 7, pp. 2448–2461, 1999.
  23. V. P. Dragalin, A. G. Tartakovsky, and V. V. Veeravalli, “Multihypothesis sequential probability ratio tests. II. Accurate asymptotic expansions for the expected sample size,” IEEE Transactions on Information Theory, vol. 46, no. 4, pp. 1366–1383, 2000.
  24. Y. Li and X. L. Pu, “A method on designing three-hypothesis test problems,” to appear in Communications in Statistics—Simulation and Computation.
  25. M. R. Reynolds Jr. and Z. G. Stoumbos, “The SPRT chart for monitoring a proportion,” IIE Transactions, vol. 30, no. 6, pp. 545–561, 1998.
  26. Y. Li, X. L. Pu, and F. Tsung, “Adaptive charting schemes based on double sequential probability ratio tests,” Quality and Reliability Engineering International, vol. 25, no. 1, pp. 21–39, 2009.