Abstract

It is shown that, for coefficient matrices based on Russell-Rao coefficients and on two asymmetric Dice coefficients, ordinal information on a latent variable model can be obtained from the eigenvector corresponding to the largest eigenvalue.

1. Introduction

Similarity coefficients play an important role in statistics and data analysis. A similarity coefficient is a measure of the resemblance or association of two data vectors, such as score patterns, variables, and items. For example, in ecological biology similarity coefficients are used for measuring the degree of coexistence of two species over different locations. In many research studies the data consist of binary vectors: presence or absence of a disease; presence or absence of species characteristics; yes or no answers in questionnaires; pass or fail in high-stakes testing. For expressing the degree of resemblance of two binary vectors in a single number, a variety of similarity coefficients has been proposed [1-3]. Examples are the Jaccard coefficient [4], the Russell-Rao coefficient [5], the Dice coefficient [6], and the simple matching coefficient [7, 8]. In choosing a coefficient, a measure has to be considered in the context of the data-analytic study of which it is a part [9]. Because there are so many similarity coefficients for binary data to choose from, it is important that the different coefficients and their properties are well understood.

Instead of studying properties of individual coefficients [10-13] one may also study properties of coefficient matrices [14]. Coefficient matrices are used as input in various techniques of multivariate data analysis, including factor or component analysis [15, 16], hierarchical cluster analysis, and techniques in classification and dissimilarity analysis [17]. Moreover, exploratory data-analytic methods such as principal coordinates analysis and (multiple) correspondence analysis can be defined as eigendecompositions of certain coefficient matrices [15, 16, 18]. It would therefore be interesting to know what information, if any, is reflected in the eigenvectors of a coefficient matrix that is based on a similarity coefficient for binary vectors.

In this paper we show for several coefficient matrices that ordinal information on latent variable models can be obtained from the eigenvector corresponding to the largest eigenvalue. It is thus possible to uncover meaningful orderings under various models by using eigenvectors. The results are first of all of theoretical interest: they show that some coefficient matrices have more interesting eigenvectors than others. Coefficient matrices based on some coefficients may thus lead to more interesting data-analytic solutions than matrices based on other coefficients. Furthermore, the results can potentially enhance the interpretation of a data analysis that uses these coefficient matrices as input.

The paper is organized as follows. Notation and two latent variable models are introduced in the next section. In Section 3 several ordering properties of the eigenvector corresponding to the largest eigenvalue are presented. An illustration of the results is presented in Section 4. Section 5 contains a conclusion.

2. Latent Variable Models

Suppose the data consist of $m$ binary vectors of length $n$. It may be assumed that the scores in the binary vectors are realizations of a latent variable model. In this section we introduce two such models in the context of nonparametric item response theory [19, 20]. In item response theory the vectors are often viewed as items that, for instance, contain the responses (pass, fail) of $n$ subjects on a high-stakes test. The items will be indexed by $i$ and $j$.

Let $\theta$ denote a one-dimensional latent variable and let $f(\theta)$ be its probability density function. Let $P_i(\theta)$ denote the response function corresponding to the response 1 on item $i$. The unconditional probability of a response 1 on item $i$ is then given by
$$p_i = \int P_i(\theta) f(\theta)\, d\theta. \tag{1}$$
Next, assume local independence; that is, conditionally on $\theta$ the responses of a subject on the $m$ items are stochastically independent. The joint probability of a response 1 on both items $i$ and $j$ for a value of $\theta$ is then given by $P_i(\theta)P_j(\theta)$. The corresponding unconditional probability is
$$p_{ij} = \int P_i(\theta) P_j(\theta) f(\theta)\, d\theta. \tag{2}$$
Throughout the paper we assume that $p_{ij} > 0$ for all $i$ and $j$.
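
For readers who want to experiment, the probabilities in (1) and (2) are easy to approximate numerically. The following R sketch is ours, not part of the original analysis: it assumes logistic (Rasch-type) response functions and a standard normal density $f(\theta)$, and the difficulty values in `b` are hypothetical.

```r
# Minimal sketch: approximating (1) and (2) by numerical integration,
# assuming logistic response functions and a standard normal f(theta).
P <- function(theta, b) plogis(theta - b)   # response function, cf. (9) below
b <- c(-1, 0, 1)                            # hypothetical location parameters

p_i <- function(i)
  integrate(function(t) P(t, b[i]) * dnorm(t), -Inf, Inf)$value
p_ij <- function(i, j)
  integrate(function(t) P(t, b[i]) * P(t, b[j]) * dnorm(t), -Inf, Inf)$value

p_i(1)       # unconditional probability of a 1 on item 1, as in (1)
p_ij(1, 2)   # joint probability for items 1 and 2 under local independence, (2)
```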

Next, we define the latent variable models. Both models have monotone response functions and are frequently applied in the context of measuring ability. The first model is characterized by requirements (3) and (4). The first requirement is that the response functions are monotonically increasing in $\theta$; that is,
$$P_i(\theta_1) \le P_i(\theta_2) \tag{3}$$
for $\theta_1 < \theta_2$. The second requirement is that the items can be ordered such that the response functions are nonintersecting; that is,
$$P_1(\theta) \ge P_2(\theta) \ge \cdots \ge P_m(\theta) \tag{4}$$
for all $\theta$. The case that assumes (3) and (4), together with the assumptions of local independence and a single latent variable, is called the double monotonicity model in nonparametric item response theory [19, 20]. A well-known result is that if the double monotonicity model holds, then the items can be ordered such that we have
$$p_i \ge p_j \tag{5}$$
for $i < j$, and
$$p_{ij} \ge p_{kj} \tag{6}$$
for $i < k$ and $j \ne i, k$ [19, 20]. Both properties follow directly from (1) and (2); for example, $p_{ij} - p_{kj} = \int (P_i(\theta) - P_k(\theta)) P_j(\theta) f(\theta)\, d\theta \ge 0$ by (4). The second model is characterized by requirements (3) and (7). The response functions may satisfy various orders of total positivity [21]. If the functions are totally positive of order 2, the items can be ordered such that
$$P_i(\theta_1) P_j(\theta_2) \ge P_i(\theta_2) P_j(\theta_1) \tag{7}$$
holds for $i < j$ and $\theta_1 < \theta_2$. Schriever [22] proved the following result for a set of response functions that are both monotonically increasing and totally positive of order 2. If the vectors are ordered such that (3) and (7) hold, then
$$\frac{p_{ij}}{p_i} \le \frac{p_{kj}}{p_k} \tag{8}$$
holds for $i < k$ and $j \ne i, k$.
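
Conditions (5), (6), and (8) involve only the probabilities $p_i$ and $p_{ij}$ and can therefore be checked numerically for any concrete model. The sketch below is ours: the Rasch difficulties in `b` are hypothetical, and the helper `mono_cols`, which tests whether the off-diagonal elements of each column are monotone, is our own construction.

```r
# Numerical check of (5), (6), and (8) under an assumed Rasch model.
b <- c(-2, -1, 0, 1)                        # hypothetical difficulties
pij <- Vectorize(function(i, j) integrate(function(t)
  plogis(t - b[i]) * plogis(t - b[j]) * dnorm(t), -Inf, Inf)$value)
p <- sapply(seq_along(b), function(i) integrate(function(t)
  plogis(t - b[i]) * dnorm(t), -Inf, Inf)$value)
P <- outer(seq_along(b), seq_along(b), pij) # joint probabilities p_ij
diag(P) <- p                                # main diagonal: p_i

mono_cols <- function(M, incr) all(sapply(1:ncol(M), function(j)
  all(if (incr) diff(M[-j, j]) > 0 else diff(M[-j, j]) < 0)))

all(diff(p) < 0)                            # condition (5): p_1 >= ... >= p_m
mono_cols(P, incr = FALSE)                  # condition (6): columns decrease
mono_cols(diag(1 / p) %*% P, incr = TRUE)   # condition (8): columns increase
```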

We conclude this section with a parametric example that satisfies requirements (3), (4), and (7). A well-known model from the field of item response theory is the Rasch [23] model. A response function of this one-parameter logistic model is given by
$$P_i(\theta) = \frac{\exp(\theta - \delta_i)}{1 + \exp(\theta - \delta_i)}, \tag{9}$$
where $\delta_i$ is a location parameter. In the context of item response theory the parameter $\delta_i$ is usually called a difficulty parameter [19, 20]. The functions $P_i(\theta)$ form a location family.
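
A quick way to see that the Rasch family satisfies (3), (4), and (7) is to check two items numerically. In the sketch below (ours) the two difficulty values are hypothetical; the increasing ratio illustrates total positivity of order 2.

```r
# Two Rasch response functions (9): delta_i = -1 (easy), delta_j = 1 (hard).
theta <- seq(-4, 4, by = 0.1)
Pi <- plogis(theta + 1)                 # easier item
Pj <- plogis(theta - 1)                 # harder item
all(diff(Pi) > 0) && all(diff(Pj) > 0)  # requirement (3): both increasing
all(Pi > Pj)                            # requirement (4): nonintersecting
all(diff(Pj / Pi) > 0)                  # requirement (7): TP2 ratio increases
```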

3. Ordering Properties

In this section we present ordering properties for three coefficient matrices. The coefficient matrices of size $m \times m$ are
$$\mathbf{R} = \left(p_{ij}\right), \qquad \mathbf{D}_1 = \left(\frac{p_{ij}}{p_j}\right), \qquad \mathbf{D}_2 = \left(\frac{p_{ij}}{p_i}\right),$$
where the elements on the main diagonal are $p_i$ for $\mathbf{R}$ and unity for $\mathbf{D}_1$ and $\mathbf{D}_2$. An element $p_{ij}$ of the matrix $\mathbf{R}$ is a Russell-Rao coefficient for two binary vectors $i$ and $j$ [5, 10]. Some data-analytic properties of the matrix $\mathbf{R}$ are discussed in Warrens [14]. The elements of the matrices $\mathbf{D}_1$ and $\mathbf{D}_2$ are conditional probabilities discussed and applied in Dice [6]. The harmonic mean of the two conditional probabilities $p_{ij}/p_i$ and $p_{ij}/p_j$ is equal to the Dice coefficient $2p_{ij}/(p_i + p_j)$ [6]. Matrix $\mathbf{D}_1$ is also called the conditional adjacency matrix in Post and Snijders [24].
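
For readers who wish to compute these matrices, the following sketch (our notation, with simulated data; the difficulties in `b` are hypothetical) builds $\mathbf{R}$, $\mathbf{D}_1$, and $\mathbf{D}_2$ from an $n \times m$ binary data matrix.

```r
# The three coefficient matrices from an n x m binary data matrix X.
set.seed(1)
theta <- rnorm(500)                         # simulated latent values
b <- c(-1, 0, 1)                            # hypothetical difficulties
X <- sapply(b, function(bi) rbinom(length(theta), 1, plogis(theta - bi)))

R  <- crossprod(X) / nrow(X)  # element (i, j) is p_ij; for binary data the
p  <- diag(R)                 # main diagonal automatically equals p_i
D1 <- R %*% diag(1 / p)       # element (i, j) is p_ij / p_j
D2 <- diag(1 / p) %*% R       # element (i, j) is p_ij / p_i; D2 = t(D1)
```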

A specific result that will be used in the proofs of Theorems 2, 3, and 4 below is the Perron-Frobenius theorem [25, 26]. More precisely, only the following weaker version of the Perron-Frobenius theorem will be used.

Lemma 1. If a square matrix $\mathbf{A}$ has strictly positive elements, then the eigenvector corresponding to the largest eigenvalue of $\mathbf{A}$ has strictly positive elements.

In the proofs of Theorems 2, 3, and 4 we use certain special matrices. Let $\mathbf{E}_r$ denote the upper triangular matrix of size $r \times r$ with unit elements on and above the diagonal and all other elements zero. Its inverse $\mathbf{E}_r^{-1}$ is the matrix with unit elements on the diagonal and elements $-1$ directly above the diagonal. Examples of $\mathbf{E}_r$ and $\mathbf{E}_r^{-1}$ of size $3 \times 3$ are
$$\mathbf{E}_3 = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}, \qquad \mathbf{E}_3^{-1} = \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{pmatrix}.$$
Furthermore, let $\mathbf{I}_{m-r}$ be the identity matrix of size $m-r$, and let $\mathbf{F}$ denote the block diagonal matrix of size $m \times m$ with diagonal blocks $\mathbf{E}_r$ and $\mathbf{I}_{m-r}$. Examples of $\mathbf{F}$ and $\mathbf{F}^{-1}$ of size $4 \times 4$ (with $r = 3$) are
$$\mathbf{F} = \begin{pmatrix} 1 & 1 & 1 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad \mathbf{F}^{-1} = \begin{pmatrix} 1 & -1 & 0 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$
We first consider the matrix $\mathbf{D}_1$. Let $\mathbf{u} = (u_1, u_2, \ldots, u_m)$ be the eigenvector corresponding to the largest eigenvalue of the matrix $\mathbf{D}_1$. Theorem 2 shows that if the binary vectors can be ordered such that (3) and (4) hold, then this ordering is reflected in the corresponding elements of $\mathbf{u}$.
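
In R the matrices $\mathbf{E}_r$ and $\mathbf{F}$ are easy to construct, which makes the mechanics of the proofs below transparent. The following sketch is ours; the helper `make_E` and the object names are not from the paper.

```r
# The special matrices used in the proofs, for r = 3 and m = 4.
make_E <- function(r) {                     # ones on and above the diagonal
  E <- matrix(0, r, r)
  E[upper.tri(E, diag = TRUE)] <- 1
  E
}
E3 <- make_E(3)
solve(E3)                                   # ones on diagonal, -1 directly above
F4 <- rbind(cbind(make_E(3), 0), c(0, 0, 0, 1))  # block diagonal: E_3 and I_1
# If u = F4 %*% v and v has strictly positive elements, then
# u[i] - u[i + 1] = v[i] > 0 for i < 3: positivity of v translates
# into a decreasing ordering of the first three elements of u.
```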

Theorem 2. Suppose that $r$ of the $m$ vectors, which without loss of generality can be taken as the first $r$, can be ordered such that (3) and (4) hold. Then the elements of the eigenvector $\mathbf{u}$ of $\mathbf{D}_1$ corresponding to these vectors satisfy $u_1 > u_2 > \cdots > u_r$.

Proof. Since $\mathbf{F}$ is nonsingular, $\mathbf{u}$ is an eigenvector of $\mathbf{D}_1$ corresponding to the eigenvalue $\lambda$ if and only if $\mathbf{v} = \mathbf{F}^{-1}\mathbf{u}$ is an eigenvector of $\mathbf{F}^{-1}\mathbf{D}_1\mathbf{F}$ corresponding to $\lambda$. Under the conditions of the theorem, the elements of $\mathbf{F}^{-1}\mathbf{D}_1\mathbf{F}$ are nonnegative and the elements of $(\mathbf{F}^{-1}\mathbf{D}_1\mathbf{F})^2$ are strictly positive. Application of Lemma 1 then yields that the eigenvector $\mathbf{v}$ of $\mathbf{F}^{-1}\mathbf{D}_1\mathbf{F}$ (or of $(\mathbf{F}^{-1}\mathbf{D}_1\mathbf{F})^2$) corresponding to the largest eigenvalue has strictly positive elements. The assertion then follows from the identity $\mathbf{u} = \mathbf{F}\mathbf{v}$, which gives $u_i - u_{i+1} = v_i > 0$ for $i < r$.
In the remainder of the proof we show that $\mathbf{F}^{-1}\mathbf{D}_1\mathbf{F}$ has nonnegative elements and that $(\mathbf{F}^{-1}\mathbf{D}_1\mathbf{F})^2$ has strictly positive elements. Write $d_{ij} = p_{ij}/p_j$ for the elements of $\mathbf{D}_1$. The matrix $\mathbf{F}^{-1}\mathbf{D}_1\mathbf{F}$ has elements $\sum_{k=1}^{j} (d_{ik} - d_{i+1,k})$ for $i < r$ and $j \le r$, and elements $d_{ij} - d_{i+1,j}$ for $i < r$ and $j > r$; the elements in the rows with $i \ge r$ are sums of elements of $\mathbf{D}_1$ and are therefore strictly positive. Under the conditions of the theorem properties (5) and (6) hold for the first $r$ items. By (6), we have $d_{ik} \ge d_{i+1,k}$ for $k \ne i, i+1$, and the matrix has nonnegative elements except possibly for the terms with $k = i+1$, for which $d_{i,i+1} - d_{i+1,i+1} = p_{i,i+1}/p_{i+1} - 1 \le 0$. However, by (5) we have $p_{i+1} \le p_i$, and it follows that the terms with $k = i$ and $k = i+1$ together satisfy $(d_{ii} - d_{i+1,i}) + (d_{i,i+1} - d_{i+1,i+1}) = p_{i,i+1}(1/p_{i+1} - 1/p_i) \ge 0$. Hence, the matrix $\mathbf{F}^{-1}\mathbf{D}_1\mathbf{F}$ has nonnegative elements. Moreover, because the elements in the last row and last column of $\mathbf{F}^{-1}\mathbf{D}_1\mathbf{F}$ are strictly positive, it follows that the elements of $(\mathbf{F}^{-1}\mathbf{D}_1\mathbf{F})^2$ are strictly positive.
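
The argument can be checked numerically. The sketch below is ours; it uses population probabilities under a hypothetical Rasch model with all $m$ items ordered by difficulty (so $r = m$). The conjugated matrix is nonnegative and the Perron eigenvector of $\mathbf{D}_1$ is strictly decreasing, as Theorem 2 asserts.

```r
# Numerical sanity check of Theorem 2 under a Rasch model (r = m = 4).
m <- 4; b <- seq(-1.5, 1.5, length.out = m)   # hypothetical difficulties
pij <- Vectorize(function(i, j) integrate(function(t)
  plogis(t - b[i]) * plogis(t - b[j]) * dnorm(t), -Inf, Inf)$value)
P <- outer(1:m, 1:m, pij)
diag(P) <- sapply(1:m, function(i) integrate(function(t)
  plogis(t - b[i]) * dnorm(t), -Inf, Inf)$value)
D1 <- P %*% diag(1 / diag(P))               # element (i, j) is p_ij / p_j
E <- matrix(0, m, m); E[upper.tri(E, diag = TRUE)] <- 1
all(solve(E) %*% D1 %*% E >= -1e-12)        # conjugated matrix is nonnegative
u <- Re(eigen(D1)$vectors[, 1])             # Perron eigenvector (real root)
if (u[1] < 0) u <- -u                       # fix the arbitrary sign
all(diff(u) < 0)                            # TRUE: u_1 > u_2 > ... > u_m
```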

An analogous result holds for the matrix $\mathbf{R}$. Let $\mathbf{w} = (w_1, w_2, \ldots, w_m)$ be the eigenvector corresponding to the largest eigenvalue of the matrix $\mathbf{R}$. Theorem 3 shows that if the binary vectors can be ordered such that (3) and (4) hold, then this ordering is reflected in the corresponding elements of $\mathbf{w}$.

Theorem 3. Suppose that $r$ of the $m$ vectors, which without loss of generality can be taken as the first $r$, can be ordered such that (3) and (4) hold. Then the elements of the eigenvector $\mathbf{w}$ of $\mathbf{R}$ corresponding to these vectors satisfy $w_1 > w_2 > \cdots > w_r$.

Proof. The proof is similar to the proof of Theorem 2. The matrix $\mathbf{F}^{-1}\mathbf{R}\mathbf{F}$ has elements $\sum_{k=1}^{j} (p_{ik} - p_{i+1,k})$ for $i < r$ and $j \le r$, and elements $p_{ij} - p_{i+1,j}$ for $i < r$ and $j > r$. Under the conditions of the theorem properties (5) and (6) hold for the first $r$ items. By (6), we have $p_{ik} \ge p_{i+1,k}$ for $k \ne i, i+1$, and the matrix has nonnegative elements except possibly for the terms with $k = i+1$, for which $p_{i,i+1} - p_{i+1} \le 0$. But by (5), we have $p_i \ge p_{i+1}$, and it follows that the terms with $k = i$ and $k = i+1$ together satisfy $(p_i - p_{i,i+1}) + (p_{i,i+1} - p_{i+1}) = p_i - p_{i+1} \ge 0$. The remainder of the proof is identical to the proof of Theorem 2.

Finally, Theorem 4 below presents an ordering property of the matrix $\mathbf{D}_2$. The ordering holds for a slightly stronger model than the one considered in Theorems 2 and 3. Theorem 4 shows that if the binary vectors can be ordered such that (3), (4), and (7) hold, then this ordering is reflected in the corresponding elements of the eigenvector $\mathbf{z}$ of $\mathbf{D}_2$, this time in reversed form.

Theorem 4. Suppose that $r$ of the $m$ vectors, which without loss of generality can be taken as the first $r$, can be ordered such that (3), (4), and (7) hold. Then the elements of the eigenvector $\mathbf{z}$ of $\mathbf{D}_2$ corresponding to these vectors satisfy $z_1 < z_2 < \cdots < z_r$.

Proof. The proof is similar to the proofs of Theorems 2 and 3. Let $\mathbf{F}^T$ denote the transpose of $\mathbf{F}$, and write $d_{ij} = p_{ij}/p_i$ for the elements of $\mathbf{D}_2$. The matrix $(\mathbf{F}^T)^{-1}\mathbf{D}_2\mathbf{F}^T$ has elements $\sum_{k=j}^{r} (d_{ik} - d_{i-1,k})$ for $2 \le i \le r$ and $j \le r$, and elements $d_{ij} - d_{i-1,j}$ for $2 \le i \le r$ and $j > r$; the elements in the first row are sums of elements of $\mathbf{D}_2$ and are strictly positive. Under the conditions of the theorem properties (5) and (8) hold. By (8), we have $d_{ik} \ge d_{i-1,k}$ for $k \ne i-1, i$, and the matrix has nonnegative elements except possibly for the terms with $k = i-1$, for which $d_{i,i-1} - d_{i-1,i-1} = p_{i-1,i}/p_i - 1 \le 0$. However, by (5), we have $p_i \le p_{i-1}$, and it follows that the terms with $k = i-1$ and $k = i$ together satisfy $(d_{i,i-1} - d_{i-1,i-1}) + (d_{ii} - d_{i-1,i}) = p_{i-1,i}(1/p_i - 1/p_{i-1}) \ge 0$. Application of Lemma 1, as in the proof of Theorem 2, yields an eigenvector $\mathbf{v}$ with strictly positive elements, and the assertion follows from the identity $\mathbf{z} = \mathbf{F}^T\mathbf{v}$, which gives $z_i - z_{i-1} = v_i > 0$ for $2 \le i \le r$.
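
A numerical check is again instructive, because it makes the reversed ordering visible. The sketch below is ours, using the same hypothetical Rasch setup as before; the conjugation now uses the transpose of $\mathbf{E}$, and the Perron eigenvector of $\mathbf{D}_2$ is strictly increasing.

```r
# Numerical sanity check of Theorem 4 under a Rasch model (r = m = 4).
m <- 4; b <- seq(-1.5, 1.5, length.out = m)   # hypothetical difficulties
pij <- Vectorize(function(i, j) integrate(function(t)
  plogis(t - b[i]) * plogis(t - b[j]) * dnorm(t), -Inf, Inf)$value)
P <- outer(1:m, 1:m, pij)
diag(P) <- sapply(1:m, function(i) integrate(function(t)
  plogis(t - b[i]) * dnorm(t), -Inf, Inf)$value)
D2 <- diag(1 / diag(P)) %*% P               # element (i, j) is p_ij / p_i
E <- matrix(0, m, m); E[upper.tri(E, diag = TRUE)] <- 1
G <- t(E)                                   # ones on and below the diagonal
all(solve(G) %*% D2 %*% G >= -1e-12)        # conjugated matrix is nonnegative
z <- Re(eigen(D2)$vectors[, 1])
if (z[1] < 0) z <- -z
all(diff(z) > 0)                            # TRUE: z_1 < z_2 < ... < z_m
```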

4. An Illustration

In this section we consider an example from educational testing to illustrate some of the results from Section 3. The data consist of responses of 1000 individuals to five items of the LSAT (Law School Admission Test). The test was designed to measure a one-dimensional latent variable. The example is part of a data set given by Bock and Lieberman [27]. The data set is distributed with the R package “ltm” written by Rizopoulos [28].

Requirements (3), (4), and (7) cannot be checked directly for real-life data. However, it can be shown that the Rasch model in (9) fits these data quite well. Using subroutines from the "ltm" package we fitted the Rasch model and the so-called two-parameter logistic model [19, 20]. In the Rasch model the items are allowed to differ in location only. In the more general two-parameter model the items are also allowed to differ in slope; for these data the two-parameter model therefore has four additional parameters. The log likelihoods of the two models are virtually identical, and the corresponding likelihood ratio test is not statistically significant. Thus, the extra slope parameters are not warranted.
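
The model comparison can be reproduced with a few lines of R; the sketch below uses functions from the "ltm" package [28].

```r
# Fitting the Rasch and two-parameter logistic models to the LSAT data.
library(ltm)
fit_rasch <- rasch(LSAT)      # items differ in location only (common slope)
fit_2pl   <- ltm(LSAT ~ z1)   # items also differ in slope
anova(fit_rasch, fit_2pl)     # likelihood ratio test on 4 degrees of freedom
```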

Requirements (3), (4), and (7) can also be studied by verifying whether conditions (5), (6), and (8) hold. The proportions of correct responses are 0.924, 0.709, 0.553, 0.763, and 0.870 for items 1 to 5, respectively. For verifying conditions (6) and (8), we order the items on the proportions of correct responses, from the easiest to the hardest item (1, 5, 4, 2, and 3). In other words, in the following we assume that the items are ordered such that condition (5) holds.
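
These quantities follow directly from the data; a small sketch (ours):

```r
# Proportions of correct responses and the easiest-to-hardest order.
library(ltm)
p <- colMeans(LSAT)
round(p, 3)                   # proportions correct for items 1 to 5
order(p, decreasing = TRUE)   # 1 5 4 2 3: easiest to hardest
```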

To study condition (6) we may inspect the matrix $\mathbf{R}$ of Russell-Rao coefficients, with the items in the order 1, 5, 4, 2, and 3. The elements on the main diagonal of $\mathbf{R}$ are the proportions of correct responses. If we ignore the elements on the main diagonal, it can be verified that the other four elements in each column of $\mathbf{R}$ are strictly decreasing. Hence, condition (6) holds.
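
A sketch (ours) of this check:

```r
# Checking condition (6) for the LSAT data.
library(ltm)
X <- as.matrix(LSAT)[, c(1, 5, 4, 2, 3)]   # items ordered easiest to hardest
R <- crossprod(X) / nrow(X)                # Russell-Rao matrix; diagonal is p_i
sapply(1:5, function(j) all(diff(R[-j, j]) < 0))   # all TRUE: (6) holds
```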

Since conditions (5) and (6) hold for all five LSAT items, it follows from Theorem 3 that the ordering of the items is reflected in the eigenvector corresponding to the largest eigenvalue of $\mathbf{R}$. Indeed, the elements of this eigenvector are strictly decreasing when the items are ordered from easiest to hardest. The item ordering is thus reflected in the elements of the eigenvector.
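
The eigenvector can be inspected as follows (a sketch, ours):

```r
# The eigenvector corresponding to the largest eigenvalue of R.
library(ltm)
X <- as.matrix(LSAT)[, c(1, 5, 4, 2, 3)]
R <- crossprod(X) / nrow(X)
w <- eigen(R, symmetric = TRUE)$vectors[, 1]
if (w[1] < 0) w <- -w          # the Perron eigenvector can be taken positive
all(diff(w) < 0)               # TRUE: elements decrease with item difficulty
```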

To verify whether condition (8) holds we may inspect the matrix $\mathbf{D}_2$ of conditional probabilities $p_{ij}/p_i$, again with the items in the order 1, 5, 4, 2, and 3. If we ignore the elements on the main diagonal, it can be verified that the remaining four elements in the first, third, and fourth columns of $\mathbf{D}_2$ are strictly increasing. Furthermore, the elements in the second and fifth columns are roughly increasing; in both columns there is one anomaly. We may conclude that condition (8) holds approximately.

If the five LSAT items (approximately) satisfy conditions (5) and (8), it follows from Theorem 4 that the ordering of the items is reflected in the eigenvector corresponding to the largest eigenvalue of $\mathbf{D}_2$. Indeed, the elements of this eigenvector are strictly increasing when the items are ordered from easiest to hardest. The item ordering is thus reflected in the elements of the eigenvector.
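
A sketch (ours) covering both the column check for (8) and the eigenvector of $\mathbf{D}_2$:

```r
# Checking condition (8) and the eigenvector of D2 for the LSAT data.
library(ltm)
X <- as.matrix(LSAT)[, c(1, 5, 4, 2, 3)]
R <- crossprod(X) / nrow(X)
D2 <- diag(1 / diag(R)) %*% R              # element (i, j) is p_ij / p_i
sapply(1:5, function(j) all(diff(D2[-j, j]) > 0))  # two columns show an anomaly
z <- Re(eigen(D2)$vectors[, 1])            # D2 is not symmetric
if (z[1] < 0) z <- -z
all(diff(z) > 0)                           # TRUE for these data (Section 4)
```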

5. Conclusion

Similarity coefficients for binary vectors are frequently used in statistics for analyzing the structure between objects. Commonly used examples are the Russell-Rao coefficient [5] and the Dice coefficient [6]. Since the choice of a coefficient depends on the context of the data-analytic study, it is important that the different coefficients and their properties are well understood.

In this paper we showed that ordinal information on latent variable models is reflected in the eigenvector corresponding to the largest eigenvalue of the coefficient matrix of Russell-Rao coefficients (Theorem 3) and of two asymmetric coefficients used in Dice [6] (Theorems 2 and 4). For other well-known coefficients, like the Jaccard coefficient [4] and the simple matching coefficient [7, 8], similar ordering properties could not be found. The results indicate that the Russell-Rao coefficient and the Dice coefficients may lead to more clearly interpretable output when used as input in clustering methods or principal coordinates analysis. However, more research on this topic is needed.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.