`ISRN Computational MathematicsVolume 2012 (2012), Article ID 396831, 8 pageshttp://dx.doi.org/10.5402/2012/396831`
Research Article

A Computational Study Assessing Maximum Likelihood and Noniterative Methods for Estimating the Linear-by-Linear Parameter for Ordinal Log-Linear Models

1School of Mathematical and Physical Sciences, University of Newcastle, Newcastle, NSW 2308, Australia
2School of Veterinary Medicine, University of California, Davis, CA 95616, USA

Received 5 October 2011; Accepted 16 November 2011

Academic Editors: K. T. Miura, P. B. Vasconcelos, and Q.-W. Wang

Copyright © 2012 Eric J. Beh and Thomas B. Farver. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

For ordinal log-linear models, the estimation of the parameter reflecting the linear-by-linear measure of association has long been a topic for the analysis of dependence for contingency tables. Typically, iterative procedures (including Newton’s method) are used to determine the maximum likelihood estimate of the parameter. Recently Beh and Farver (2009, ANZJS, 51, 335–352) show by way of example three reliable and accurate noniterative techniques that can be used to estimate the parameter and extended this study by examining their reliability computationally. This paper further investigates the reliability of the non-iterative procedures when compared with Newton’s method for estimating this association parameter and considers the impact of the sample size on the estimate.

1. The Ordinal Log-Linear Model

Consider an two-way contingency table, , where the th cell entry is given by for and . Let the grand total of be and the matrix of joint relative frequencies be so that the th cell entry is and . Define the th row marginal proportion by , and define the th column marginal proportion as . When the row and column variables both consist of ordered categories, the th cell may be modeled by considering the following log-linear model: The values and are the weighted average of the row and column scores assigned a priori to the categories of the row and column variables, respectively. The parameters and are constrained so that and are treated as nuisance parameters. Thompson [1, pages 164–197] provides a description how one may use S-PLUS, and R, to obtain the ML estimates the parameters from this model.

An alternative model that may be considered is so that the th cell frequency can be estimated by The parameter of interest in (3) is and is a measure of linear-by-linear association between the variables of the table. Thus, in this paper, , and (and hence and ) are treated as nuisance parameters, and can be estimated by respectively; see, for example, Beh and Davy [2]. The importance of the interpretation of the parameter can be highlighted by considering that Therefore, if the row scores (and column scores) are chosen to be consecutive integers, so that and , then and is the constant log-odds ratio of the contingency table.

There are various algorithmic strategies that can be considered that provide an accurate means of determining the maximum likelihood estimate of . These include, but are not limited to, the contributions of Goodman [3], Haberman [4], Agresti [5, Appendix B], Nelder and Wedderburn [6], Galindo-Garre and Vermunt [7], and Aït-Sidi-Allal et al. [8].

An alternative approach to estimating involves a more direct estimation procedure and was proposed by Beh and Davy [2]. They considered a moderately reliable and relatively mathematically simple means of estimating noniteratively. Beh and Farver [9] greatly improved upon this approach. By applying these estimation techniques to fourteen commonly analysed two-way contingency tables considered in the statistics literature, Beh and Farver [9] showed that noniterative procedures provide estimates of that are comparable, and in most cases statistically indistinguishable, to ML estimates using Newton’s unidimensional method. Here we further elaborate on this empirical study by examining the relative stability of estimating noniteratively and by using Newton’s unidimensional method. Also discussed is the impact the sample size has when using iterative and noniterative estimation procedures.

2. On Estimating 𝜙

2.1. MLE (Newton’s Unidimensional Method)

There are a number of iterative algorithms that may be used for obtaining the maximum likelihood estimate (MLE) of from model (3). Some procedures include iterative proportional fitting, as described by Agresti [5, pages 64–66], and Nelder and Wedderburn’s [6] quasilikelihood approach. Another popular iterative approach, and one adopted in this paper, is Newton’s unidimensional method. Refer to Goodman [3], Haberman [4, 10], and Agresti [5] for further details on the method. Here a brief description of Newton’s method will be provided although a more detailed account on the algorithm used in this paper is provided in Beh and Farver [9].

Suppose that are independent Poisson random variables. At the th iteration of the algorithm, the MLE of from the th iteration is updated such that until , for some reasonable value of , say . Here, is just (3) with in place of . When this stopping criterion is satisfied, the MLE of using Newton’s method, , is the final iterative estimate . In the study undertaken by Beh and Farver [9], an initial value of was chosen and, as (2) indicates, coincides with independence between the two variables. The standard error of the MLE of is Beh and Farver [11] showed that the maximum standard error of the estimate of for a contingency table coincides with the hypothesis of independence and may be calculated by .

There are various adaptations of this algorithm that can be considered to accelerate the estimation of or to ensure the algorithm converges. Due to the substantial level of technical expertise required to implement these algorithms, we will consider the traditional unidimensional version of Newton’s method.

2.2. Noniterative Methods

Originally, Beh and Davy [2] considered various issues concerned with the noniterative estimation of for model (2). As part of their study, they derived a simple means of estimating the parameter by considering where and . Beh and Farver [9] continued to examine these issues and proposed the following two noniterative estimates: where (10) is referred to as the estimate. Here, (11) involves using a first-order approximation of the logarithm function in (10) and is termed the estimate. By considering a second-order approximation of the logarithm in (10), Beh and Farver [11] proposed another estimate to improve upon that of (11) which is referred to as the estimate. The comprehensive computation study of Beh and Farver [11] demonstrated the reliability and stability of the estimate (10) in estimating the parameter directly. They also showed that for small values, or for small contingency tables, the estimate (10), estimate (11), and estimate (12) provide very good estimates when compared with the MLE calculated using Newton’s method. The computational study given in the following section calculates the standard error of BDNI estimate (9), estimate (10), estimate (11), and estimate (12) using (8).

3. A Computational Study

Two key issues concerning the estimation of in (2) are considered in this section:(1)the stability of using the estimate (10) and Newton’s method to estimate ,(2)the impact of the sample size in estimating the parameter using these two methods.

The study of these two issues was made by adopting the simple procedure for randomly generating a contingency table of given dimension, sample size, , and value described in Beh and Farver [11]. The results reported in the following sections are based on randomly generating 1000 contingency tables using the procedure outlined in the appendix, with a fixed and predefined dimension and . For the first issue, samples sizes of were considered, and varied from 10 to 10000 for investigations of the second issue.

3.1. Stability of the Estimates

Beh and Farver [11] demonstrated computationally the issues concerning the accuracy of Newton’s method and four noniterative methods (the BDNI estimate (9), the estimate (10), the estimate (11), and the estimate (12)) for estimating . Understanding how stable and reliable these methods are in calculating is a very important aspect of such a study. In the present study, three different types of contingency tables were considered to represent the large variation of dimension and commonly encountered (see, e.g., Beh and Farver [9] who considered the estimation of for fourteen commonly analysed contingency tables found in the statistical literature). One thousand (1000) randomly generated contingency tables with , tables with and tables with were considered. In practice, may often be smaller (say ), and, as shown in Beh and Farver [11], the noniterative estimation methods compare very well with Newton’s method of calculating the MLE of . Therefore, the stability of the noniterative methods for larger values was studied. To avoid any problems involved in using the logarithm function for zero-cell values, cells from the randomly generated contingency tables with a zero frequency are replaced with 0.05, a value which is negligibly small in comparison with the sample size of approximately 1000. Agresti [12, page 397] considers how one may treat zero-cell counts in a contingency table.

3.1.1. 1000 4 × 5 Tables—𝜙=0.5

One thousand (1000) contingency tables were generated with . Figure 1 shows how stable and accurate each of the five methods were for estimating from (2).

Figure 1: Plots showing the stability of Newton’s method for obtaining the MLE of and the four noniterative procedures in 1000 simulations of a contingency table where .

The five plots on the first line of Figure 1 summarise the estimate of calculated (to four decimal places) using Newton’s method, the BDNI estimate (9), the estimate (10), the estimate (11), and the estimate (12), respectively. These graphs clearly show that Newton’s method produces extremely reliable and consistent estimates of the parameter. The iterative technique produced a mean estimate of 0.500182 with a mean standard error (calculated using expression (8)) of 0.01671. The method took, on average, 44 iterations to converge to four decimal places ranging from 20 to 82 iterations (median is 43). Of the four noniterative methods, method (10) provided accurate and reliable estimates of . This method produced stable estimates with mean of 0.50298 (mean SE of 0.0122) which is favorably comparable with the MLE. However, while the BDNI estimate (9), estimate (11), and estimate (12) produced fairly consistent estimates, they underestimated the value of . This is not surprising since these three estimation procedures require a first or second order approximation of the logarithm function present in the estimate (10). However, not surprisingly, the second-order logarithm approximation (whose graphical summary appears as the fifth plot of the first line of Figure 1) produced better estimates than either of the two first-order logarithm approximation methods. The noniterative methods of the BDNI estimate (9), the estimate (11), and the estimate (12) produced mean estimates (and standard errors) of 0.35573 (0.01424), 0.40950 (0.01352), and 0.46108 (0.01279), respectively. Of the noniterative estimates, the mean standard error of the estimates using method (9) were lower than those estimates using an approximation of the logarithm function. One may also note that, as part of the analysis, the mean maximum standard error of the tables, calculated using , was 0.01729.

The second row of Figure 1 summarises the (Wald test) -value obtained by comparing each of the estimated values of with 0.5. They clearly show that the values for the MLE of calculated using Newton’s method were all very close to 1, indicating that there is very strong evidence that the MLE is statistically consistent with the true parameter value. Of the noniterative techniques, the estimates calculated using the estimate (10) had values consistently above 0.5, with many of them being well above 0.8 indicating that, of all the 1000 tables generated, all the estimates were statistically indistinguishable with . Of the three remaining noniterative methods, the original estimation procedure of Beh and Davy [2] (the BDNI estimate (9)) performed the worst and calculated values less than 0.05.

3.1.2. 1000 5 × 5 Tables—𝜙=0.3

One thousand (1000) contingency tables were generated with . Figure 2 shows how stable and accurate each of the five methods were for estimating from (2).

Figure 2: Plots showing the stability of Newton’s method for obtaining the MLE of and the four noniterative procedures in 1000 simulations of a contingency table where .

Again, Newton’s method of calculating the MLE of was more stable than any of the noniterative techniques. The mean MLE from the 1000 tables randomly generated was an impressive 0.30001 (0.01455) with the algorithm requiring between 14 and 45 iterations to converge to four decimal places (the median number of iterations observed was 23). However, the estimate (10) also proved to be an extremely reliable and consistent noniterative estimation procedure and achieved estimates whose values were consistently well above 0.8. Here, the mean estimate using the estimate (10) is 0.30010 (0.01198) compared with 0.29083 (0.01214) for method (12), followed by a mean of estimate of 0.26797 (0.01249) for the estimate (11) and a relatively poor mean estimate of 0.24209 (0.01286) for the BDNI estimate (9). For the tables that were randomly generated, the mean maximum standard error of was 0.01498.

The values of the noniterative methods, as summarised in Figure 2, attest to the accurate and consistent estimation of using the estimate (10), while the estimate (12) produced estimates that were nearly as reliable, although consistently underestimating (due to the approximation of the logarithm function used in the estimate (10)) . The BDNI estimate (9) and estimate (11) proved to be the least reliable due to both involving first order approximations of the natural logarithm function in the estimate (10), however their Wald -values were all well above the nominal 0.05 level of significance, with all being at least 0.3.

3.1.3. 1000 3 × 3 Tables—𝜙=0.8

One thousand (1000) contingency tables were generated with . Figure 3 shows how stable and accurate each of the five methods were for estimating from (2).

Figure 3: Plots showing the stability of Newton’s method for obtaining the MLE of and the four noniterative procedures in 1000 simulations of a contingency table where .

Figure 3 shows that for the 1000 randomly generated contingency tables method (10) estimated with great accuracy and reliability, just as Newton’s method did. The general level of accuracy of the estimate (10) is evident by noting that the mean of the estimates was 0.80164 (0.04765) with a mean maximum standard error of 0.05531, while the mean MLE was 0.80057 (0.05421). To calculate the MLE with this level of precision required a median of 34 iterations to converge to four decimal places, with a range of 20 to 155 iterations. The estimate (12) again produced relatively stable and accurate results, with a mean estimate of 0.78785 (0.04789) while the BDNI estimate (9) provided the worst estimates with a mean estimate of 0.68781 and a standard error of 0.04930.

3.2. Impact of Sample Size on the Estimate of 𝜙

The results in Section 3.1 clearly show that using method (10) provides as accurate and reliable means of estimating as those obtained using the more commonly adopted MLE procedures (Newton’s unidimensional method here). However, for all of the randomly generated contingency tables, the sample size was approximately 1000. In this section, we examine the accuracy and reliability of the estimates obtained using Newton’s method compared with the approach, method (10), for 1000 randomly generated contingency tables where the sample size varied from 10 to 100000.

When calculating the MLE, the number of iterations needed for convergence stabilised. For example, for a contingency table where , one would expect Newton’s method to estimate the MLE of the parameter (to an accuracy of four decimal places) at approximately 34 iterations, while for a table 43 iterations would be expected. As expected the number of iterations required to ensure convergence depends on the magnitude of   and the size of the contingency table being analysed.

As part of this study, three different sized contingency tables were considered: contingency tables where  , contingency tables where  , and contingency tables where  . Table 1 summarises the mean estimated value of using Newton’s method and the estimate (10) from 1000 randomly generated contingency tables. Table 1 also includes, when using Newton’s method to find the MLE of , the median number and the range (minimum and maximum) of iterations needed for convergence to occur to four decimal places. It shows that, for the three different types of contingency tables considered, as the sample size increased, so too did the accuracy of both estimates.

Table 1: Mean ML and estimates of for three different sized contingency tables with varying various sample sizes [iterations] give the median iterations taken for Newton’s method to converge to four decimal places and the range of iterations observed.

Suppose that we now consider the impact of the sample size when adopting Newton’s method and the estimate (10) for estimating in the ordinal log-linear model of (2). To monitor the relative accuracy of the estimates using these methods compared to the specified value of used to randomly generate the contingency tables, the quantity in Table 1 describes the percentage difference in the estimate using each of the procedures compared with the specified value of such that . It can be seen that, for small sample sizes, both estimation procedures produce estimates that are equally poor; although, for the 1000 random tables with  , Newton’s method did substantially better at estimating than the estimate (10). However, as the sample size increases, both estimation procedures become comparable, getting to within 1% of the specified value of for sample sizes exceeding 500. Table 1 shows that the estimate produces very consistent estimates of the underlying parameter and, in some cases involving larger sample sizes, performs better than its iterative counterpart. In such cases, this provides some compelling evidence for considering the computationally simple estimation procedure of noniterative estimation, especially the estimate (10).

4. Discussion

This paper has examined, through simulation studies, the accuracy and reliability of estimation techniques for of the ordinal log-linear model (2). Traditionally, the MLE of this parameter is determined iteratively, either through variations of iterative proportional fitting, Newton’s method, or by some other way. However Beh and Davy [2] and Beh and Farver [9, 11] developed, between them, four noniterative estimation techniques, and this paper has computationally studied their precision and consistency. While Beh and Farver [9, 11] established that the estimate (10) provides a very accurate estimate of the parameter when compared with the estimate obtained using Newton’s method, the simulation study performed in Section 3 has demonstrated that the estimate (10) also produces an estimate that is consistent with the true parameter value. We have also shown that, while they may not produce estimates that are as accurate as the estimate, the BDNI estimate (9), estimate (11), and the estimate (12) are very reliable. A computational study of the impact of sample size on the estimates was also examined, and it was shown that the estimate provided very consistent results when compared with the estimate using Newton’s unidimensional method.

With the advent of faster computing power and processing, performing an algorithm that cycles through hundreds or even thousands of iterations is no longer a major issue. So the question that one may ask is why should noniterative estimation be considered? Certainly for those well versed in undertaking computationally intensive studies, or familiar with the breadth of computing facilities that are now available to perform statistical tasks, estimating the parameters from models like (2) using iterative algorithms is a relatively simple task. However, not all analysts are as well versed in with such tools. Therefore, simple, and direct estimation procedures such as the four noniterative estimates studied in this paper provide one with a means of estimation that is computationally simple, accurate, and stable. There are other factors that can make noniterative approaches appealing, and these stem from problems inherent with using computationally intensive algorithms. While it may occur infrequently for models like (2) that problems with convergence will fail to produce an estimate, or that poor starting values will lead to poor estimates, they can still occur. Direct estimation procedures, therefore, provide the analyst with an alternative strategy for parameter estimation.

The aim of this paper is not to demonstrate the superiority of noniterative estimation methods over the more traditional MLE approaches that adopt algorithms to perform estimation. Instead the aim of this paper, and its predecessors, is to highlight that there are alternative strategies that can be considered. There are, however, still many issues yet to be resolved for the noniterative estimation procedures studied here and elsewhere. For example, this paper has not proposed a strategy for calculating noniterative estimates of multiple parameters in an ordinal log-linear model that have a nonzero covariance structure. This issue, and others like it, will be left for future consideration.

Randomly Generating a Contingency Table

In order to study the accuracy and reliability of Newton’s unidimensional method and the noniterative methods described above, we adopted a simple procedure for generating a two-way contingency table of dimension given a value of and sample size, . The procedure involves selecting an independent and random sample of   pseudomarginal row proportions, denoted by , from a uniform distribution from the interval . A sample of   pseudomarginal column proportions, , is generated in a similar manner. The random marginal proportions of the th row and th column can, therefore, be obtained such that respectively, ensuring that and . Given these randomly generated marginal proportions and the predefined scores used to reflect the ordered structure of the categories, the mean row and column scores are respectively. For this computational study, natural row scores and natural column scores are chosen to reflect the ordinal structure of the categories. The th joint marginal cell frequency is then calculated based on the ordinal log-linear model (7) such that where is the frequency observed in the th cell, given the randomly generated marginal proportions, the pre-determined value, and It is apparent that the values obtained will not be guaranteed to be an integer. Therefore, each cell value is rounded to the nearest integer to obtain the randomly generated contingency table. Thus, the sample size of n specified in the algorithm is only an approximate sample size value.

References

1. L. Thompson, S-PLUS (and R) Manual to Accompany Agresti’s Categorical Data Analysis, 2nd edition, 2009.
2. E. J. Beh and P. J. Davy, “A non-iterative alternative to ordinal log-linear models,” Journal of Applied Mathematics and Decision Sciences, vol. 8, no. 2, pp. 67–86, 2004.
3. L. A. Goodman, “Simple models of the analysis of association in cross-classifications having ordered categories,” Journal of the American Statistical Association, vol. 74, pp. 537–552, 1979.
4. S. J. Haberman, “Log linear models for frequency tables with ordered classifications,” Biometrics, vol. 30, no. 4, pp. 589–600, 1974.
5. A. Agresti, Analysis of Ordinal Categorical Data, John Wiley & Sons, New York, NY, USA, 1984.
6. J. A. Nelder and R. W. M. Wedderburn, “Generalized linear models,” Journal of the Royal Statistical Society A, vol. 135, pp. 370–384, 1972.
7. F. Galindo-Garre and J. K. Vermunt, “The order-restricted association model: two estimation algorithms and issues in testing,” Psychometrika, vol. 69, no. 4, pp. 641–654, 2004.
8. M. L. Aït-Sidi-Allal, A. Baccini, and A. M. Mondot, “A new algorithm for estimating the parameters and their asymptotic covariance in correlation and association models,” Computational Statistics and Data Analysis, vol. 45, no. 3, pp. 389–421, 2004.
9. E. J. Beh and T. B. Farver, “An evaluation of non-iterative methods for estimating the linear-by-linear parameter of ordinal log-linear models,” The Australian and New Zealand Journal of Statistics, vol. 51, no. 3, pp. 335–352, 2009, Erratum in The Australian and New Zealand Journal of Statistics,vol. 52, pp. 241-242.
10. S. J. Haberman, “Computation of maximum likelihood estimates in association models,” Journal of the American Statistical Association, vol. 90, pp. 1438–1446, 1995.
11. E. J. Beh and T. B. Farver, “On estimating the linear-by-linear parameter for ordinal log-linear models,” A Computational Study. In press.
12. A. Agresti, Categorical Data Analysis, John Wiley & Sons, New Jersey, NJ, USA, 2nd edition, 2002.