Abstract

The present study aimed to investigate the overall and relative contribution of four subcomponents of vocabulary knowledge to reading comprehension. The four vocabulary subcomponents were vocabulary size, word association knowledge, collocation knowledge, and morphological knowledge. The participants were 124 college students from a university in Taipei, Taiwan. Six instruments were employed: (1) a reading comprehension test, (2) a vocabulary size test, (3) a test of word association knowledge and collocation knowledge, (4) a test of morphological knowledge, (5) a motivation attitude scale, and (6) a self-efficacy scale. The results can be summarized as follows. First, after the effects of motivation and self-efficacy had been controlled, the four vocabulary subcomponents together contributed significantly (20%) to reading comprehension performance. Moreover, depth of vocabulary knowledge (including word association knowledge, collocation knowledge, and morphological knowledge) explained an additional 6% of the variance in reading comprehension performance over and above vocabulary size. Finally, among the three subcomponents of depth of vocabulary knowledge, collocation knowledge explained the largest proportion of variance (5.6%) in reading comprehension performance. Based on these findings, implications and suggestions for future research are provided.

1. Motivation

As words are an integral part of a language, vocabulary knowledge has been widely considered one of the fundamental contributors to the comprehension of a text. Indeed, it has long been held that vocabulary knowledge is one of the most significant predictors of text difficulty. As Chall [1] once put it, “Once a vocabulary measure is included in a prediction formula, sentence structure does not add very much to the prediction” (p. 157). The crucial role of vocabulary knowledge in reading comprehension has also been empirically evidenced in many studies (e.g., [2–4]). Take Wu and Hu’s [4] study, for example. Among the many variables investigated in their study, vocabulary knowledge was found to have a significant and positive correlation with reading achievement and to play a key role in reading comprehension. As such, adequate vocabulary knowledge appears to be one of the prerequisites for successful reading comprehension.

Likewise, in Taiwan, where English is a foreign language (EFL), the importance of adequate English vocabulary knowledge to reading comprehension has been recognized over the years. University students, after receiving six years of formal English training in high school, are expected to be able to read English textbooks related to their field of study without much difficulty. Unfortunately, many of them, as reported by several researchers (e.g., [5]), still have great difficulty reading English textbooks. Specifically, Huang [5] found that a lack of adequate vocabulary knowledge is one of the major causes of Taiwanese college students’ difficulties in comprehending English textbooks. Their deficiencies in this regard were also evidenced in many other studies (e.g., [6, 7]). Given the importance of vocabulary knowledge to reading comprehension and the evidenced inadequacy of Taiwanese college students’ vocabulary knowledge, it is not surprising that in recent years Taiwanese learners’ performance in reading comprehension, as measured by the Test of English for International Communication (TOEIC), has fallen far behind that of learners in other EFL countries in Asia, such as China and South Korea, and in ESL (English as a Second Language) countries, such as the Philippines [8].

In response to Taiwanese learners’ declining reading performance in worldwide large-scale proficiency tests and the empirically evidenced inadequacy of their vocabulary knowledge, the present study took a close look at the relationship between reading comprehension and vocabulary knowledge among college students in Taiwan. As vocabulary knowledge has been perceived as a multidimensional construct [9, 10], the current study specifically aimed to find out the overall and relative contribution of the various subcomponents of vocabulary knowledge to explaining the variance in Taiwanese learners’ reading comprehension performance. It was hoped that the results of the present study could guide English language instructors and teaching material designers toward pedagogically sound practices with respect to vocabulary learning and reading comprehension.

2. Literature Review

Vocabulary knowledge has received a lot of attention in the field of reading research (e.g., [2, 9–13]). As Alderson [11] noted, “reading research has consistently found a word knowledge factor on which vocabulary knowledge loads highly” (p. 99). For instance, in a study on text simplification, Strother and Ulijn [14] compared reading comprehension scores between original texts and texts that had been simplified syntactically rather than lexically. They found no differences, so they concluded that simplifying syntax does not necessarily lead to more readable texts. Instead of a syntactic strategy, they suggested using a conceptual strategy, which involves processing content words and utilizing lexical and content knowledge. Similarly, Horwitz [15] found that a substantial number of language learners agreed that learning vocabulary is the most important part of learning a foreign or second language. As such, the important role that vocabulary knowledge plays in both language learning and reading comprehension can hardly be overemphasized.

In light of the importance of vocabulary knowledge, over the past few decades, numerous second language (hereafter L2) vocabulary researchers [2, 9, 10, 12, 16, 17] have proposed various, but complementary vocabulary knowledge frameworks. For instance, Meara [17] contended that vocabulary knowledge could be viewed as possessing two primary dimensions: breadth and depth. Breadth of vocabulary knowledge refers to the number of words that a learner has at least some superficial knowledge about, whereas depth of vocabulary knowledge refers to how well a learner knows a word [10]. Subsequently, greater effort has further been made by Chapelle [16] and Qian [10]. For example, based on the collective strength of previous frameworks, Qian [10] proposed that vocabulary knowledge consists of four interrelated dimensions: (a) vocabulary size, (b) depth of vocabulary knowledge, which contains all lexical subcomponents, such as phonemic, graphemic, morphemic, syntactic, semantic, collocational, associative, and phraseological properties, as well as frequency and register, (c) lexical organization, and (d) automaticity of receptive-productive knowledge. Taken together, it appears that there is a growing tendency to view vocabulary knowledge as a multidimensional construct instead of a single dimension.

As far as the first dimension, vocabulary size or breadth of vocabulary knowledge, is concerned, numerous studies have been conducted on the vocabulary size of particular groups of ESL (English as a second language) or EFL students. Some studies (e.g., [6, 7, 18]) focused on measuring the vocabulary size of university students. Other studies (e.g., [9, 10, 19]) investigated the role of vocabulary size in reading comprehension. For instance, focusing on the relationship between vocabulary size and the comprehension of academic texts, Laufer [19] reported that vocabulary size was moderately to highly correlated with reading comprehension performance, with correlation coefficients ranging from .50 to .75. She then concluded that, in accordance with the results of previous research, vocabulary size is a good predictor of reading comprehension level in a foreign language.

Among the various subcomponents of the second dimension (i.e., depth of vocabulary knowledge) that have been researched, word association knowledge seems to have gained a lot of attention. For example, the importance of word association knowledge has been identified in the field of language learning (e.g., [12, 20]). As Nation [12] put it, “understanding word association is useful for creating limited vocabularies to define words and for the simplification of text” (p. 52). Empirically, research on word association has shown a great deal of agreement among groups of native English respondents. For instance, in an attempt to find out the association responses of native English respondents, Lambert and Moore (as cited in [21]) found that the primary response, that is, the most popular response, accounted for about one-third of the total responses, and that the primary, secondary (i.e., the second most popular), and tertiary (i.e., the third most popular) responses together accounted for between 50% and 60% of the total responses. To illustrate, Schmitt [21] in his review further reported the responses of Lambert and Moore’s 100 British university student participants to the word “abandon.” As explained by Schmitt, if native speakers’ association responses were random and dissimilar, then almost 100 different responses would have been found in their study. However, only 38 different association responses were obtained. Specifically, the primary or most popular response to “abandon” was “leave,” the secondary or second most popular response was “ship,” and the tertiary or third most popular response was “give up.” The top three responses were given by 53 participants. In other words, the top three responses accounted for 53% of the total responses. Furthermore, responses that were made by two or more participants accounted for 71%.
As such, the native English speakers appeared to exhibit a great deal of similarity in organization of mental lexicons. Similarly, Johnston (as cited in [21]) also obtained a figure of 57% when she examined the three most popular responses provided by fourth and fifth graders. Therefore, Schmitt concluded that the large degree of systematicity or agreement found in native English speakers’ responses suggested that the mental lexicons of native English speakers are organized in a similar pattern, and he further speculated that nonnative speakers would benefit from organizing their lexicons similarly [21].

When it comes to the assessment of word association knowledge, there has been a great body of literature on measuring word association knowledge in L1 (e.g., [22]) and on investigating the use of word association knowledge in L2 (e.g., [23]). For example, Clark [22] employed a standard word association task, in which native speakers were asked to produce their own responses to stimulus words. He found that they had stable patterns of word association. On the other hand, when the same task was applied to L2 learners, Meara [24] reported that they created much more varied and unstable associations. Based on the instability of the L2 learners’ responses, he suggested that the standard word association task was an unsatisfactory test for assessing L2 learners’ word association knowledge.

In view of Meara’s [24] suggestion, Read [23] designed and developed an alternative word association test to measure college students’ word association knowledge. For each item of the test, a stimulus word was presented to test takers together with a group of other words, some of which were related in meaning to the stimulus word and others were not. The test would require the learners to select related words (or associates) rather than produce their own ones. According to Read, the stimulus words and corresponding associates were chosen based on three types of relationship including paradigmatic, syntagmatic, and analytic. By involving particular associates on the basis of these relationships, Read’s word association test is perceived to provide insight into the type of knowledge that learners have about a word and into the development of that knowledge [25].

While there has been a wealth of research (e.g., [22, 23]) on the assessment of word association knowledge, there is a scarcity of research on the relationship between word association knowledge and reading comprehension. One study with such an aim was done by Qian [9]. In his attempt to explore relationships among vocabulary size, depth of vocabulary knowledge, and reading comprehension in ESL students, Qian employed the Depth of Vocabulary Knowledge test (hereafter DVK), which was a modified version of the original word associates test [23] and was composed of word association (including synonymy and polysemy) and collocation. He reported that scores on the DVK and scores on vocabulary size were positively associated. Moreover, the results from his multiple regression analyses showed that scores on the DVK made a significant and unique contribution (11%) to the prediction of scores on reading comprehension beyond the prediction provided by vocabulary size. Therefore, he concluded that both vocabulary size and depth of vocabulary knowledge appear to be variables significantly related to reading comprehension performance.

However, one limitation of Qian’s [9] study was that he did not control for affective variables, such as motivation and self-efficacy, whose effects on L2 learning achievement have been confirmed to be considerable (e.g., [26]). In addition, as pointed out by Qian [10] himself, since the DVK represents only two subcomponents (i.e., word association knowledge and collocation knowledge) of depth of vocabulary knowledge, measures that include other subcomponents, for example, morphosyntactic ones, should also be developed and included in future studies for a more complete understanding of the depth of vocabulary knowledge.

As for collocation knowledge, another frequently researched subcomponent of depth of vocabulary knowledge, much attention has been paid to exploring its importance in language teaching and learning (e.g., [27]). One of the positions taken is that collocation knowledge is fundamental because stored sequences of words are the basis of language learning. For example, Ellis [28] claimed that a lot of language learning can be explained by the storage of chunks of language in long-term memory without having to refer to underlying rules. By retrieving chunks of language from long-term memory, language reception and language production are made more effective. Similarly, Pawley and Syder [29] argued that in addition to knowing the rules of the language, language users can produce native-like sentences (native-like selection) and can produce language fluently (native-like fluency) by extracting units of language of clause length from memory. Empirical evidence for this position comes from a longitudinal study conducted by Towell et al. [30], which compared learners of French as an L2 before and after their residence in a French-speaking country. They concluded that the observed increase in the learners’ fluency was the result of learners storing memorized sequences and suggested that having a good command of collocation is crucial for acquiring native-like fluency and selection.

In addition to the importance of collocation knowledge in language learning in general, numerous studies (e.g., [31]) have shown its crucial role specifically in learners’ reading comprehension. For instance, Keshavarz and Salimi [31] reported that there was a significant relationship between Iranian EFL learners’ collocation competence and their scores on cloze tests, suggesting that it is important to improve ESL/EFL learners’ collocation knowledge in order to enhance their reading skills.

Besides collocation knowledge, the importance of morphological knowledge (the ability to obtain information about the meaning and parts of speech of new words from their prefixes, roots, and suffixes) in language learning has also been recognized in the past few decades (e.g., [12, 20, 32]). As Nation [12] noted, “knowledge of affixes can be used to help the learning of unfamiliar words by relating these words to known prefixes and suffixes” (p. 264). Likewise, Schmitt and Meara [20] pointed out that affix knowledge is important in the process of forming word families (e.g., appoint, appointer, appointee, and appointment) and thus in the expansion of vocabulary size. In addition, empirical research has shown that word parts are a very important aspect of vocabulary knowledge. For example, White et al. [33] found that about 60% of words with the four prefixes they studied, that is, un-, re-, in-, and dis-, could be understood from knowing the commonest meaning of the base word. They further reported that nearly 80% of prefixed words could be understood with the aid of knowledge about the less common meanings of the prefixes along with the aid of context.

As a crucial subcomponent of depth of vocabulary knowledge, knowledge of morphology has been believed by numerous researchers to be helpful for reading comprehension. For example, Nagy et al. [34] asserted that, given that a considerable portion of English words have meanings that are predictable from the meanings of their parts, knowledge of morphology plays an important role in determining how learners read and learn new, long words, which in turn impacts their reading comprehension. Specifically, taking words like “uncourteously” or “queenlike” as examples, Nagy et al. [35] pointed out that in spite of the length and low frequency of such words, recognizing the familiar parts and understanding how these parts contribute to the meaning of the word give English language learners access to the meaning of novel words encountered in texts while they read. Similarly, Kieffer and Lesaux [36] stated that “the word-general ability to decompose morphologically complex words may lead to more successful word learning over time and thereby equip readers better to succeed with reading comprehension” (p. 785).

As such, another morphological research strand has concerned the role of morphological knowledge in reading comprehension (e.g., [34, 37, 38]). For instance, in her attempt to investigate the relationship between high school and college students’ sensitivity to three (syntactic, phonological, and relational) properties of suffixes and their reading achievement, Mahony [38] generally reported weak to moderate links between good reading and good word structure sensitivity, with correlation coefficients ranging from .34 to .68. A longitudinal study by Deacon and Kirby [37] also found that second graders’ performance on morphological knowledge predicted their performance on reading comprehension in fifth grade even after controlling for phonological awareness and second-grade reading comprehension. A more recent study by Nagy et al. [34] reported that for fourth to ninth graders, morphological knowledge accounted for significant and unique variation (ranging from 38% to 86%) in reading comprehension independent of breadth of vocabulary knowledge, word reading, and phonological awareness. These studies may shed some light on the relationship between knowledge of derivational morphology and reading comprehension. However, as these studies investigated native English-speaking students, they did not address whether this relationship also holds among English language learners.

In response to the paucity of previous studies on this issue in EFL or ESL settings, Kieffer and Lesaux [36] conducted a study to examine the relationship between morphological knowledge and reading comprehension in English among Spanish-speaking English language learners (ELLs) from fourth to fifth grade. They reported that there was a statistically significant relationship between morphological knowledge and reading comprehension among Spanish-speaking fifth-grade ELLs even when the influence of the learners’ word reading skills, vocabulary breadth, and phonological awareness was controlled. However, even though they found that morphological knowledge has explanatory power independent of breadth of vocabulary and word reading skill, their results should be interpreted with some caution. The test of morphological knowledge employed in their study was a productive test. Considering that reading is a receptive skill, there is a need to investigate whether morphological knowledge, when assessed by a receptive test, would still be a significant predictor of reading comprehension.

Taken together, a large body of research has attempted to measure vocabulary size (e.g., [6, 7]), word association knowledge (e.g., [22, 24]), collocation knowledge (e.g., [30]), and morphological knowledge (e.g., [33]) separately, or to assess two or three of the subcomponents in a single study, such as word association and morphological knowledge (e.g., [20]) or vocabulary size, word association knowledge, and collocation knowledge (e.g., [9, 10]). In addition, a number of studies have been conducted on the relationship between reading comprehension and one or two of the subcomponents of vocabulary knowledge, such as vocabulary size (e.g., [19]), word association knowledge (e.g., [9, 10]), collocation knowledge (e.g., [9, 10]), and morphological knowledge (e.g., [20, 36]). However, no investigation has incorporated all four subcomponents in a single study, nor has a study examined the relationships of reading comprehension to all four subcomponents. That is, no single study has been done to find out which of the four subcomponents best predicts reading comprehension performance. Thus, there seemed to be a need to conduct a study along this line. Additionally, it must be pointed out that although all vocabulary dimensions are conceptually relevant in assessing the role of vocabulary knowledge in reading comprehension, only the breadth (i.e., vocabulary size) and depth dimensions of vocabulary knowledge were evaluated in this study because (a) these two dimensions appeared to be central in Qian’s [10] framework, as well as in all the other frameworks reviewed, and (b) the present study opted for a modest goal of assessing the depth of vocabulary knowledge due to the constraint of its scope.
Furthermore, while examining the relative and unique contribution of the subcomponents to reading comprehension, most of the studies reviewed did not exert proper control over a range of affective variables, such as motivation and self-efficacy, whose effects on L2 learning achievement (e.g., [26]) and on L2 reading comprehension [39–41] have been confirmed. As such, motivation and self-efficacy were chosen as control variables in the present study.

2.1. Research Questions

The purpose of the present study was twofold: (a) to explore the overall contribution of the four vocabulary subcomponents and (b) to examine the relative contribution of each of the four subcomponents. Specifically, the study aimed to address the following research questions.
(1) What is the overall contribution of vocabulary size, word association knowledge, collocation knowledge, and morphological knowledge to the performance on reading comprehension after motivation and self-efficacy have been accounted for?
(2) To what extent does depth of vocabulary knowledge (word association knowledge, collocation knowledge, and morphological knowledge) add to the prediction of reading comprehension scores, over and above the prediction provided by vocabulary size, after motivation and self-efficacy have been accounted for?
(3) For each of the three subcomponents of depth of vocabulary knowledge, what is its relative variance contribution to the prediction of scores on reading comprehension, over and above the prediction provided by vocabulary size, after motivation and self-efficacy have been accounted for?
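Questions of the form "how much does a set of predictors add over and above another set" are commonly answered with hierarchical (blockwise) regression, in which predictors are entered in blocks and the change in R² is recorded at each step. The following is a minimal sketch of that idea, not the analysis actually run in this study: the entry order is an assumption, the scores are synthetic (seeded random numbers for 124 hypothetical participants), and the helper name `r_squared` is introduced here purely for illustration.

```python
import random


def r_squared(predictors, outcome):
    """Return R^2 for an OLS fit of outcome on the predictors plus an intercept."""
    n = len(outcome)
    # Design-matrix rows: [1, x1, x2, ...].
    rows = [[1.0] + [col[i] for col in predictors] for i in range(n)]
    k = len(rows[0])
    # Augmented normal equations [X'X | X'y].
    aug = [
        [sum(row[a] * row[b] for row in rows) for b in range(k)]
        + [sum(row[a] * y for row, y in zip(rows, outcome))]
        for a in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda i: abs(aug[i][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for i in range(k):
            if i != c:
                factor = aug[i][c] / aug[c][c]
                aug[i] = [v - factor * p for v, p in zip(aug[i], aug[c])]
    beta = [aug[i][k] / aug[i][i] for i in range(k)]
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in rows]
    mean_y = sum(outcome) / n
    ss_res = sum((y - f) ** 2 for y, f in zip(outcome, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in outcome)
    return 1.0 - ss_res / ss_tot


rng = random.Random(0)  # fixed seed; every score below is synthetic
n = 124


def noisy(base, weight):
    """Hypothetical variable correlated with `base` (illustration only)."""
    return [weight * b + rng.gauss(0, 1) for b in base]


motivation = [rng.gauss(0, 1) for _ in range(n)]
self_efficacy = [rng.gauss(0, 1) for _ in range(n)]
vocab_size = [rng.gauss(0, 1) for _ in range(n)]
association = noisy(vocab_size, 0.5)
collocation = noisy(vocab_size, 0.5)
morphology = noisy(vocab_size, 0.5)
reading = [
    0.3 * m + 0.4 * v + 0.3 * c + rng.gauss(0, 1)
    for m, v, c in zip(motivation, vocab_size, collocation)
]

# Assumed entry order: controls first, then breadth, then depth.
blocks = [
    ("controls", [motivation, self_efficacy]),
    ("vocabulary size", [vocab_size]),
    ("depth of vocabulary knowledge", [association, collocation, morphology]),
]

entered, r2_steps = [], []
for name, columns in blocks:
    entered = entered + columns
    r2_steps.append(r_squared(entered, reading))

# Variance added by each block over and above the blocks entered before it.
delta_r2 = [r2_steps[0]] + [b - a for a, b in zip(r2_steps, r2_steps[1:])]
for (name, _), r2, delta in zip(blocks, r2_steps, delta_r2):
    print(f"{name}: R^2 = {r2:.3f}, delta R^2 = {delta:.3f}")
```

Under the same assumptions, the third research question could be approached by replacing the final block with one depth subcomponent at a time and comparing the resulting changes in R².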

3. Method

3.1. Subjects

A total of 124 college students taking the course of Freshmen English at one university in Taipei were the participants of the study. They were freshmen from different departments, including Department of Education, Department of Early Childhood Education, Department of Music, and Department of English Instruction. Their length of exposure to formal EFL instruction ranged from nine to twelve years.

3.2. Instruments

A total of seven variables were involved in the present study: reading comprehension, vocabulary size, word association knowledge, collocation knowledge, morphological knowledge, motivation, and self-efficacy. The instruments used in the present study to measure the seven variables included (1) a reading comprehension test, (2) a vocabulary size test, (3) a word association and collocation test, (4) a derivative word form test, (5) a motivation questionnaire, and (6) a self-efficacy questionnaire. The instrument corresponding to each of the seven variables is described in the following sections.

3.2.1. Reading Comprehension

In the present study, the participants’ performance on reading comprehension was assessed by a published intermediate-level sample test from the Cambridge Preliminary English Test 4 (hereafter CPET), which was at Level B1 of the Common European Framework of Reference for Languages (CEFR), an internationally recognized benchmark of language ability. When it comes to assessing the reading performance of university students in Taiwan, a large number of studies have adopted either the Test of English as a Foreign Language (TOEFL) or the General English Proficiency Test (GEPT), which is a well-constructed local English proficiency test. Therefore, to avoid a possible practice effect, the CPET was used in the present study. The CPET was composed of different target language situations that addressed a range of skills involved in reading comprehension at the intermediate level (e.g., reading for gist and detailed comprehension, scanning for specific information, understanding attitude, opinion, and writer’s purpose, and making inferences). The sample reading test contained five parts. In Part 1 (Questions 1–5), test takers were presented with a list of signs or texts and were required to choose the best description of each text. In Part 2 (Questions 6–10), there was a list of people on the left and short descriptions on the right. The test takers were required to choose the description that best matched each person. Part 3 (Questions 11–20) contained a list of statements based on a reading passage. The participants had to judge whether each statement was true or not. In Part 4 (Questions 21–25), there were five multiple-choice items with one reading passage. The test takers were asked to choose the most appropriate answer from four written options. In Part 5 (Questions 26–35), there was a cloze test with ten multiple-choice items to assess test takers’ vocabulary and structural knowledge.
Three items (items 5, 19, and 21) were discarded from the test because some of their distractors were considered ambiguous. For the scoring of the CPET, each item was worth one point. Thus, the maximum possible total score was 32 for the 32 test items. The split-half reliability estimate for scores on the CPET was .50.
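A split-half reliability estimate of the kind reported here can be sketched as follows. The sketch rests on assumptions the text does not confirm: an odd/even split of the items and a Spearman-Brown correction to full test length. The item responses are synthetic (124 hypothetical test takers, 32 dichotomously scored items), and the function names are illustrative.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def split_half(item_scores):
    """Return (half-test correlation, Spearman-Brown corrected estimate)."""
    # Assumed split: items 1, 3, 5, ... versus items 2, 4, 6, ...
    odd_half = [sum(person[0::2]) for person in item_scores]
    even_half = [sum(person[1::2]) for person in item_scores]
    r_half = pearson(odd_half, even_half)
    # Spearman-Brown step-up to the full test length.
    return r_half, 2 * r_half / (1 + r_half)


rng = random.Random(1)  # fixed seed; responses are synthetic
abilities = [rng.uniform(0.2, 0.9) for _ in range(124)]
item_scores = [
    [1 if rng.random() < p else 0 for _ in range(32)] for p in abilities
]

r_half, r_full = split_half(item_scores)
print(f"half-test r = {r_half:.2f}, corrected estimate = {r_full:.2f}")
```

The corrected estimate is always higher than the half-test correlation when that correlation is positive, which is why it matters which of the two figures a reported reliability refers to.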

3.2.2. Vocabulary Size

The Receptive Vocabulary Levels Test (hereafter VLT [2]) was used in the present study to measure the participants’ vocabulary size or breadth of vocabulary knowledge. It was a paper-and-pencil test consisting of five levels of word frequency: the 2,000 Word Level, the 3,000 Word Level, the 5,000 Word Level, the University Word List Level, and the 10,000 Word Level. Due to time constraints, the present study did not test all vocabulary levels in the test. Based on the results of many previous studies (e.g., [6, 7]) in Taiwan, most Taiwanese freshmen’s average receptive vocabulary size is about 2,000–3,000 words. Furthermore, a pilot study was done prior to the main study. A few students whose educational background was similar to that of the participants of the current study were invited to take the 2,000, 3,000, and 5,000 Word Levels Tests. Their performance on the 5,000 Word Level Test turned out to be quite poor, and they all complained about its difficulty. Therefore, the present study adopted only the Receptive 2,000 and 3,000 Word Levels Tests. However, instead of the original VLT (24 items in eight clusters, with each cluster containing three items) constructed by Nation [2], an equivalent version developed by Schmitt et al. [42] was employed in the present study. At each level, it contained 30 items in ten clusters, with each cluster containing three items. Nation [12] pointed out that the new version of Schmitt et al. [42] was a major improvement on the original test. Therefore, this version was adopted in this study to assess the participants’ breadth of vocabulary knowledge. It was in a matching format, and the two levels administered comprised 20 blocks in total. Each block contained six words and three definitions. The test takers were required to select the word in the left column that matched each definition in the right column. With three definitions for each of the 20 blocks, the test included 60 items for test takers to answer. Each item of the test was scored one point.
Thus, the maximum possible total score for the Receptive 2,000 and 3,000 Word Level Test was 60. The split-half reliability estimate for scores on the whole test was .76. The following is a sample item:

(1) business
(2) clock
(3) horse
(4) pencil
(5) shoe
(6) wall

___ part of a house
___ animal with four legs
___ something used for writing

3.2.3. Association Knowledge and Collocation Knowledge

Originally called the Word Associates Test (WAT), the depth of vocabulary knowledge (DVK) measure was developed by Read [23]. Described as the best-known test format based on the dimensions approach (see Schmitt et al. [43]), the test consisted of 40 items. It was intended to gauge the test taker’s depth of receptive English vocabulary knowledge. Most word association tasks are productive in nature, requiring test takers to produce a number of related words that come to mind when they are presented with a set of stimulus words. However, considering that the present study attempted to investigate the relationship of word association knowledge to reading comprehension, which is a receptive skill, it seemed reasonable to adopt a receptive test of word association such as the DVK.

A modified version of the DVK measure by Qian [9] was used in the present study to assess the participants’ word association knowledge and collocation knowledge. According to Qian [9], the key answers to eight items in Read’s [23] WAT were considered ambiguous and were thus replaced. Each DVK item was composed of one stimulus word and two boxes. The stimulus word was an adjective, and each of the two boxes contained four words. The words in the top boxes were used to assess the receptive aspect of the participants’ word association knowledge, while the words in the bottom boxes were used to measure the receptive aspect of their collocation knowledge. Among the four words in the top box, the test takers were required to choose one to three words that were synonymous with one aspect of, or the whole meaning of, the stimulus word, whereas in the bottom box, they were asked to select one to three of the four words that collocated with the stimulus word. The instruction sheet for the test taker specified that there were four correct answers in each item. However, these answers were not evenly spread. Three situations were possible: (a) the top and bottom boxes both included two correct answers; (b) the top box included one correct answer, and the bottom box included three correct answers; and (c) the top box included three correct answers, and the bottom box included one correct answer. According to Read [13], this arrangement was made in an attempt to reduce possible guessing effects. For the DVK measure, each word correctly chosen was awarded one point. Thus, the maximum possible total score was 160 for the 40 items. The split-half reliability estimates for scores on the DVK association and DVK collocation were .67 and .66, respectively. An example item in the test uses the stimulus word “sound.”
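Separately from the sample item, the scoring rule described in this section (one point per correctly chosen word, four correct answers per item split 2/2, 1/3, or 3/1 across the two boxes) can be sketched in code. This is an illustrative sketch only: the class and function names are hypothetical, the option labels (`t1`, `b2`, and so on) are placeholders rather than words from the DVK, and the treatment of the top and bottom boxes as separate association and collocation subscores is an assumption based on the separate reliability estimates reported above.

```python
from dataclasses import dataclass

# Permitted (top box, bottom box) counts of correct answers per item.
ALLOWED_SPLITS = {(2, 2), (1, 3), (3, 1)}


@dataclass(frozen=True)
class DvkItem:
    stimulus: str
    top_key: frozenset     # correct associates (top box)
    bottom_key: frozenset  # correct collocates (bottom box)

    def __post_init__(self):
        split = (len(self.top_key), len(self.bottom_key))
        if split not in ALLOWED_SPLITS:
            raise ValueError(f"unexpected answer split {split}")


def score_item(item, chosen_top, chosen_bottom):
    """Return (association points, collocation points) for one item."""
    association = len(item.top_key & set(chosen_top))
    collocation = len(item.bottom_key & set(chosen_bottom))
    return association, collocation


# Placeholder item and response; not taken from the actual test.
item = DvkItem("stimulus_1", frozenset({"t1", "t3"}), frozenset({"b2", "b4"}))
association, collocation = score_item(item, ["t1", "t2"], ["b2", "b4"])
print(association, collocation)  # one associate and two collocates correct

# Four correct answers per item across 40 items.
max_total = 40 * 4
print(max_total)
```

How selections beyond the correct words were handled (for example, whether over-selection was penalized) is not stated in the text, so the sketch simply counts correct selections.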

3.2.4. Morphological Knowledge

Originally constructed by Tyler and Nagy [44], the Morpheme Sensitivity Test (hereafter MST) was later adapted by Mahony [38] to assess university students’ performance on derivational morphology. Because the present study also investigated university students’ morphological knowledge, the MST was used as the test of morphological knowledge to ease comparison with her results. It consisted of four parts: three paper-and-pencil tests and one oral reading test. Parts 1 and 2 were designed to measure receptive knowledge of the syntactic category of common Latin and Greek suffixes. The former tested this knowledge using real words, whereas the latter did so by means of nonsense words. The third part of the test assessed the derivational relationships of word pairs. The last part was an oral test examining test takers’ knowledge of how the pronunciation of a letter that is silent in the base form is affected by different types of word-internal boundaries. Of the four parts, only Part 1 and Part 3 were employed in the present study, as they were more directly related to its research focus, namely assessing the participants’ receptive knowledge of English morphology.

Part One of the MST, the Syntactic Categories of Suffixes Using Real Words (hereafter Syncat-real Test), comprised 27 sentences, each of which contained a blank and was followed by four real words which were different derivations of the same stem; that is, the answer choices differed from each other only in their suffixes. The test takers were asked to choose the appropriate derivational suffix from the four real words. The following is an example of the items in the test:

The cost of ______ keeps going up.
(A) electric (B) electrify (C) electricity (D) electrical

The 27 sentences included three noun types (-ion/-ation, -ity, and -ist), three verb types (-ate, -ize, and -ify), and three adjective types (-ous/-ious, -al, and -ive). All of the sentences were unambiguous, and the blanks were highly constrained syntactically, limiting the choice of possible correct answers to one. Each item of the test was scored one point. Thus, the maximum possible total score was 27 for the 27 items. The split-half reliability estimate for scores on the MST Syncat-real Test was .74.

The Relational Test, Part Three of the MST, was made up of 42 pairs of words, 25 of which were related and 17 of which were not. Each pair of words was followed by the words “YES” and “NO.” The test takers were required to put a tick under the word “YES” if the pair of words was related and “NO” if the pair of words was not related. The following is an example of the items in the test:

happy – happiness    YES    NO
cat – category    YES    NO

For the scoring of the Relational Test, each item was worth one point. Thus, the maximum possible total score was 42 for the 42 items. The split-half reliability estimate for scores on the MST Relational Test was .44. Given the great potential for guessing that results from the “yes-no” item format, a reliability estimate as low as .44 was understandable. Furthermore, the items of the test appeared, on the surface, to have a certain degree of face validity, suggesting that the low to moderate reliability estimate may not be a major impediment to its use in the current study.
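For readers unfamiliar with split-half reliability, the following minimal sketch shows one common way of computing it. The procedure used to split the tests in the present study is not described above, so the odd-even split and the Spearman-Brown correction used here are assumptions, and the response matrix is toy data rather than participant data.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def split_half_reliability(responses):
    """Odd-even split-half reliability with the Spearman-Brown correction.

    responses: one list of 0/1 item scores per test taker.
    """
    first_half = [sum(person[0::2]) for person in responses]   # items 1, 3, 5, ...
    second_half = [sum(person[1::2]) for person in responses]  # items 2, 4, 6, ...
    r_half = pearson(first_half, second_half)
    # Spearman-Brown step-up from half-test length to full-test length.
    return 2 * r_half / (1 + r_half)


# Toy data: five test takers, four dichotomously scored items.
toy_responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 0, 1],
]
reliability = split_half_reliability(toy_responses)
print(round(reliability, 3))
```

Because two-option items allow a considerable amount of successful guessing, the two half scores contain more random error, which lowers their correlation and hence the resulting estimate.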

3.2.5. Motivation

The Language Learning Orientation Scale (hereafter LLOS), developed by Noels et al. [45], was adopted in the present study. It was made up of 20 randomly ordered statements forming seven subscales that assess amotivation, the three types of intrinsic motivation (IM-Knowledge, IM-Accomplishment, and IM-Stimulation), and the three types of extrinsic motivation (external regulation, introjected regulation, and identified regulation). The participants were asked to rate the degree to which each statement applied to themselves on a five-point Likert scale. The maximum possible total score of the motivation questionnaire was 100. Cronbach’s α reliability estimate for scores on the scale measuring motivation was .88. Table 1 shows two example items of the questionnaire.

3.2.6. Self-Efficacy

The self-efficacy scale was adapted from the Motivated Strategies for Learning Questionnaire (hereafter MSLQ [46]) to specifically fit in the field of English learning. The original MSLQ scale included items such as “I am certain I can understand the ideas taught in this course,” while the adapted scale consisted of items such as “I am certain I can understand the ideas taught in English.” The self-efficacy scale contained nine items to measure students’ self-efficacy when they learn English. The test takers were required to rate the extent to which the proposed statement applied to themselves. With a five-point Likert format, the maximum possible total score for the self-efficacy questionnaire was 45. Cronbach’s α reliability estimate for the scores on the scale measuring self-efficacy was .91. Table 2 shows two example items of the questionnaire.
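The maximum scores and Cronbach’s α reported for the two questionnaires follow from standard formulas, which can be sketched as below. The function names and the toy ratings are hypothetical and serve only to illustrate the computation; they are not the participants’ data.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a respondents-by-items table of Likert ratings.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(ratings[0])  # number of items
    item_variances = [variance([person[i] for person in ratings]) for i in range(k)]
    total_variance = variance([sum(person) for person in ratings])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def max_total_score(n_items, scale_points=5):
    """Highest possible total when every item receives the top rating."""
    return n_items * scale_points


# Toy data: four respondents rating three items on a five-point scale.
toy_ratings = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 3, 2],
]
alpha = cronbach_alpha(toy_ratings)
print(round(alpha, 3), max_total_score(20), max_total_score(9))
```

With 20 and 9 five-point items, the maxima are 100 and 45, matching the totals stated for the motivation and self-efficacy questionnaires.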

3.3. Procedures

All participants were required to take three vocabulary tests, a reading comprehension test, and two questionnaires in two separate class sessions. The two sessions were held one week apart to reduce participant fatigue. In the first session, the participants took the reading comprehension test. In the second session, the remaining instruments were administered.

4. Results

Table 3 presents the means, standard deviations, and obtained score ranges for the participants’ performance on the reading comprehension test and the four vocabulary knowledge tests. The mean percentage correct score (88%) of the vocabulary size test was the highest among the four tests. The test of morphological knowledge obtained the second highest mean percentage correct score (80%). Comparatively, the mean percentage correct scores of word association (66%) and collocation (58%) appeared to be moderate.

In terms of the relationship of reading comprehension performance to the four subcomponents of vocabulary knowledge, Pearson correlation analyses were conducted, and the correlation coefficients are shown in Table 4. The participants’ scores on the four subcomponents of vocabulary knowledge were all correlated significantly with their reading comprehension scores. Among the four subcomponents, vocabulary size had the highest correlation (, ) with reading comprehension, despite the fact that the strength of the correlation was just moderate. What came next were word association and collocation, both of which shared the same strength of relationship (, ) with reading comprehension score. By comparison, morphological knowledge displayed a significant but slightly lower correlation (, ) with reading comprehension. A much lower correlation was found in the relationship of reading comprehension to self-efficacy (, ) and to motivation (, ).

As to the intercorrelations among the four vocabulary subcomponents, the results indicate that the four subcomponents were moderately to highly correlated with one another. Specifically, word association was shown to have the strongest correlation (, ) with collocation, followed by its correlation with vocabulary size (, ) and with morphological knowledge (, ). As described earlier, the word association and collocation sections of each item shared the same stimulus word. A correlation coefficient as large as .87 between word association knowledge and collocation knowledge was therefore hardly surprising. In addition, vocabulary size was moderately related to morphological knowledge (, ) and to collocation (, ). Likewise, collocation was found to have a medium-sized correlation (, ) with morphological knowledge.

As multiple regression analyses were to be conducted in the present study, the variance inflation factor (VIF) for each of the independent variables was calculated to check whether the assumption of no multicollinearity had been met. The VIF indicates whether a predictor has a strong linear relationship with the other predictors. According to Bowerman and O’Connell [47], Field [48], and Myers [49], VIF values greater than 10 are cause for concern about multicollinearity. The VIFs calculated for the independent variables in the present study ranged from 1.42 to 4.31 and were all below 10, suggesting no multicollinearity within the data.
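As a rough illustration of what the reported VIF range implies, the sketch below uses the standard definition VIF_j = 1 / (1 − R_j²), where R_j² is the proportion of variance in predictor j explained by the other predictors. The function names are hypothetical, and the calculation is not taken from the original analysis; it merely converts the reported bounds (1.42 and 4.31) and the cutoff of 10 into the corresponding R_j² values.

```python
def vif_from_r2(r2_j):
    """VIF of a predictor whose regression on the other predictors yields r2_j."""
    if not 0.0 <= r2_j < 1.0:
        raise ValueError("r2_j must lie in [0, 1)")
    return 1.0 / (1.0 - r2_j)


def r2_from_vif(vif):
    """Invert the definition: variance shared with the other predictors."""
    if vif < 1.0:
        raise ValueError("a VIF cannot be smaller than 1")
    return 1.0 - 1.0 / vif


lowest_reported = r2_from_vif(1.42)   # smallest VIF reported in the study
highest_reported = r2_from_vif(4.31)  # largest VIF reported in the study
cutoff = r2_from_vif(10.0)            # conventional threshold cited in the text

print(f"{lowest_reported:.3f} {highest_reported:.3f} {cutoff:.3f}")
```

Under this definition, the largest reported VIF corresponds to a predictor sharing roughly 77% of its variance with the other predictors, which is still short of the 90% implied by the cutoff of 10.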

To address the first research question, concerning the overall contribution of the four vocabulary subcomponents to reading comprehension after controlling for motivation and self-efficacy, a multiple regression procedure was carried out with reading comprehension as the dependent variable. Motivation and self-efficacy were entered as a predictor block in the first step to control for their variance, and the four subcomponents were entered as independent variables in the second step. The results are displayed in Table 5. In this analysis, motivation and self-efficacy, entered into the equation in Model 1, accounted for 5% of the variance in reading comprehension. In Model 2, with the four subcomponents added to the equation, R² changed to .25, that is, 25% of the variance in reading comprehension (, ). Thus, the addition of these vocabulary variables resulted in a statistically significant 20% increment in the explained variance in reading comprehension.
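The significance of such an increment is conventionally evaluated with an F test on the change in R². The sketch below applies the standard formula to the rounded values given above; it assumes that all 124 participants entered the analysis, and the function name is hypothetical. The resulting value is therefore illustrative only and may differ from the statistic reported in Table 5.

```python
def f_change(r2_reduced, r2_full, n_added, n_predictors_full, n_obs):
    """F statistic for the R-squared increment between two nested models.

    F = ((R2_full - R2_reduced) / df1) / ((1 - R2_full) / df2),
    with df1 = number of added predictors and
    df2 = n_obs - n_predictors_full - 1.
    """
    df1 = n_added
    df2 = n_obs - n_predictors_full - 1
    if df1 <= 0 or df2 <= 0:
        raise ValueError("degrees of freedom must be positive")
    f_value = ((r2_full - r2_reduced) / df1) / ((1.0 - r2_full) / df2)
    return f_value, df1, df2


# Model 1: motivation + self-efficacy (R2 = .05).
# Model 2: adds the four vocabulary subcomponents (R2 = .25, six predictors).
f_value, df1, df2 = f_change(
    r2_reduced=0.05,
    r2_full=0.25,
    n_added=4,
    n_predictors_full=6,
    n_obs=124,  # assumption: no participants were excluded
)
print(f"F({df1}, {df2}) = {f_value:.2f}")
```

With these rounded inputs the statistic comes to about 7.8 on 4 and 117 degrees of freedom; the same function applies to any later step in which a further block of predictors is added.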

Table 6 presents the results for the second research question, which explored the predictive power of depth of vocabulary knowledge (i.e., word association, collocation, and morphological knowledge) for reading comprehension over and above the prediction afforded by vocabulary size, after differences among students in motivation and self-efficacy had been controlled. As shown in Table 6, when vocabulary size was entered into the equation in Model 2, it accounted for a significant proportion of the variance in reading comprehension with , , . The addition of depth of vocabulary knowledge in Model 3 changed the size of R² to .25, a statistically significant increase of .06, or 6% of the explained variance in reading comprehension (, ). Thus, depth of vocabulary knowledge added a unique proportion (6%) of explained variance in reading comprehension on top of the 14% of variance provided by vocabulary size.

Finally, three more regression analyses were conducted to determine the relative contribution of each of the three depth subcomponents to performance on reading comprehension on top of the variance afforded by motivation, self-efficacy, and vocabulary size. In the first analysis, motivation and self-efficacy were entered as a predictor block in the first step, vocabulary size was entered in the second step, and word association alone was entered in the final step. In the second analysis, the first two steps remained the same, with collocation replacing association in the final step. Likewise, the third analysis retained the first two steps, with morphological knowledge alone entered in the final step. The focus of the three analyses was on examining and comparing the magnitude of the R² changes. The R² change was .048 for the first analysis and .056 for the second analysis, and both changes were significant at the .05 and .01 levels. In contrast, the R² change for the third analysis was .016 and statistically nonsignificant. In other words, the results indicated that, among the three depth subcomponents, collocation accounted for the greatest amount of variance in reading comprehension performance over and above motivation, self-efficacy, and vocabulary size, while morphological knowledge made the smallest, nonsignificant contribution.

5. Discussions and Conclusions

In terms of the overall contribution of the four vocabulary subcomponents to reading comprehension, when the effects of motivation and self-efficacy had been controlled, there was still a statistically significant relationship (20% of the explained variance) between the four subcomponents and reading comprehension. This finding seemed to suggest that the four subcomponents together play an important role in the reading comprehension of the university students. In terms of the predictive power of depth of vocabulary knowledge, the results indicated that 6% of the explained variance in reading comprehension was afforded by depth of vocabulary knowledge over and above vocabulary size after the effects of motivation and self-efficacy had been controlled. In other words, word association, collocation, and morphological knowledge together made a unique contribution to the prediction of scores on reading comprehension beyond the prediction provided by scores on vocabulary size. The figure appeared to be lower than those obtained in Qian’s [9] and Qian’s [10] research. In his two studies, he found that DVK, which represented scores on word association and collocation, added a unique proportion of explained variance (11% and 13%, respectively) in reading comprehension beyond the prediction provided by vocabulary size. A plausible reason for the higher contribution in both of Qian’s studies is that he did not control for the effects of any affective variables, which might have led to an overestimate of the predictive power of depth of vocabulary knowledge. In fact, the results of the current study made a further contribution by revealing that, after controlling for the two affective variables (i.e., motivation and self-efficacy), depth of vocabulary knowledge does not add as large an amount of explained variance over vocabulary size in reading comprehension as reported in Qian’s [9] and Qian’s [10] studies. Compared with the results in Qian’s two studies, where only vocabulary size and depth were investigated, the results of the present study were based on a more comprehensive research design that included the two affective variables.

Further analyses showed that, among the three subcomponents of depth of vocabulary knowledge, collocation accounted for the largest proportion of variance (5.6%) in reading comprehension performance and thus emerged as a stronger predictor than the other two depth components. That is, the contribution of collocation appeared to outweigh that of word association and morphological knowledge. By comparison, morphological knowledge accounted for the smallest and negligible proportion (1.6%) of variance in reading comprehension performance. This finding was congruent with that of Qian [9], where morphological knowledge was found to account for the smallest proportion of variance in predicting reading comprehension.

Concerning the relationship of reading comprehension to the four vocabulary subcomponents, the strongest correlation (, ) was found between reading comprehension and vocabulary size. This finding appeared to be consistent with that of Qian’s [10] study, in which scores on the Vocabulary Levels Test were also related to reading comprehension (, ), although the correlation coefficient obtained in the present study was slightly weaker than that found in Qian’s [10] study. Taken together, it appeared that learners’ vocabulary size is at least moderately associated with their performance on reading comprehension, despite the slight difference in the strength of the relationship between the two studies.

Similarly, a moderate relation was also obtained between word association (, ) and reading comprehension, as well as between collocation (, ) and reading comprehension. If the two subcomponents were put together, they were also moderately related to reading comprehension (, ). The results seemed to be in agreement with those of Qian’s [10] study, where the same DVK test was also used. However, the magnitude of the correlation he reported was higher than that obtained in the present study. Nevertheless, the significant correlation found in both studies lent support to the claim that students’ scores on word association and collocation are somewhat correlated with their reading comprehension levels.

A significant but slightly lower correlation coefficient (, ) was obtained between morphological knowledge and reading comprehension in the present study. The test of morphological knowledge, which was adopted from Mahony’s [38] study, consisted of two parts, the Syncat-real section and the Relational section. A separate analysis showed that both sections were weakly but significantly related to reading comprehension ( for Syncat-real section and for Relational section). This finding appeared to be inconsistent with that of Mahony’s [38] study, in which scores on her derivative word form test were not significantly related to performance on reading comprehension ( for Syncat-real section and for Relational section). A plausible explanation for this contradictory finding is that the two studies adopted different reading tests to gauge reading comprehension. In the present study, the participants’ reading comprehension was assessed by a published intermediate-level sample test from the Cambridge Preliminary English Test 4 (CPET), which is designed to measure the construct of reading comprehension. In contrast, as pointed out by Mahony herself, the Verbal Scores of the Scholastic Aptitude Test (SAT) employed in her study were only an indirect indicator of her participants’ reading success. Despite the slight inconsistency of the results between the two studies, the low but significant correlation found in the present study seemed to suggest that students’ morphological knowledge was weakly related to their performance on reading comprehension.

On the other hand, a positive and higher correlation between morphological knowledge and reading comprehension was found in Kieffer and Lesaux’s [36] study. The morphological knowledge task used in their study, which was a decomposition task adapted from Carlisle [50], required students to extract the base word from a derived word to complete a sentence (e.g., students were given “popularity” and asked to complete “The girl wanted to be very ______.”). Their results revealed that there was a significant relationship between Spanish-speaking English language learners’ performance on morphological knowledge and reading comprehension. In addition to the different measures of morphological knowledge used, the different L1s involved might be another reason for the difference in the magnitude of the correlation coefficients obtained in the two studies. In the present study, the participants’ L1 was Chinese, which differs greatly from English in terms of derivational features. By contrast, the participants’ L1 in Kieffer and Lesaux’s study was Spanish, which shares similar word derivation with English.

With respect to the intercorrelations among the four vocabulary subcomponents, word association was found to have the highest correlation (, ) with collocation. One possible reason for this high correlation might be that, in the DVK test, the same stimulus word in each test item was used to measure the two subcomponents. Another point worth mentioning is that the correlation coefficient (, ) obtained in the present study between word association and vocabulary size seemed to be comparable to that (, ) found in Schmitt and Meara’s [20] study. The findings of both studies together lent support to the previously claimed relationship between the two vocabulary subcomponents. One more point deserving discussion is the correlation coefficient (, ) found between vocabulary size and morphological knowledge. Its medium size appeared to be consistent with Kieffer and Lesaux’s [36] finding (, ). Thus, the moderate correlation found in the present study provided additional evidence for the presumed relationship between vocabulary size and morphological knowledge. As noted by Schmitt and Meara [20], students’ affix knowledge is crucial in the process of forming word families and thus in the expansion of vocabulary size.

Based on the results of the current study, the conclusions pertinent to each of the three research questions can be drawn as follows. Firstly, after controlling for the effects of motivation and self-efficacy, the four vocabulary subcomponents overall contributed significantly to reading comprehension. Secondly, depth of vocabulary knowledge provided additional explained variance in reading comprehension over and above vocabulary size after the effects of motivation and self-efficacy had been controlled. Finally, among the three subcomponents of depth of vocabulary knowledge, collocation was found to explain the largest proportion of variance in performance on reading comprehension.

5.1. Theoretical Implications

Based on the results of the present study, some theoretical implications can be drawn in the following paragraphs.

First of all, the finding of the present study that, among the four variables of vocabulary knowledge, breadth of vocabulary accounted for the largest proportion (i.e., 14%) of variance in predicting reading performance appears to provide additional evidence to corroborate the longstanding claim that receptive vocabulary size is “the determinant factor for reading success” in L2 or FL ([51] p. 144). The larger a learner’s vocabulary size, the higher the percentage of lexical items in any given text s/he tends to know. As pointed out by Schmitt [52], a vocabulary of as many as 8,000–9,000 word families is what L2 or FL learners need to strive for in order to succeed in comprehending a range of authentic texts without being seriously hindered by unknown vocabulary.

Among the three subcomponents of depth of vocabulary knowledge, collocation knowledge was found to contribute the highest proportion of variance in predicting reading comprehension performance. One plausible explanation for its highest contribution may be its importance in every language. As stated by Hill [53], the importance of collocation can be evidenced by its amount in native speakers’ mental lexicon. He further argued that as much as 70% of everything native speakers say, hear, read, or write is to be found in some form of collocation. Similarly, Farrokh and Mahmoodzadeh [54], Jafarigohar and Nazari [55], and Lewis [56] also asserted that, in order to comprehend and produce language, much of native speakers’ mental lexicon consists of many types of prefabricated chunks, and collocations are the largest part of these chunks. From a psycholinguistic perspective, the human brain is much better equipped for memorizing than for processing, and the availability of large numbers of collocation chunks reduces the processing effort and thus makes reading efficient (e.g., [29, 57]). In other words, the use of collocation facilitates reading comprehension [58].

Furthermore, given that the four vocabulary subcomponents accounted for only 20% of the variance in performance on reading comprehension, it is possible that more than 50% of the variance is explained by other factors. Numerous factors in addition to the four vocabulary subcomponents might be connected with reading performance. Components such as discourse structure knowledge, content knowledge, and synthesis and evaluation skills, as identified by Grabe [59], could also play an important role in the reading comprehension process. Accordingly, these findings lent additional support to the common claim that reading comprehension is a complicated and multifaceted process [60].

Finally, the results of the current study also shed some light on the interrelationships among vocabulary size, word association, collocation, and morphological knowledge. As pointed out by Schmitt and Meara [20], a better understanding of these interrelationships could help get a clearer picture of the process of vocabulary acquisition. Specifically, the moderate correlations found between the vocabulary size and the three subcomponents of depth of vocabulary knowledge appear to suggest that breadth and depth of vocabulary knowledge are associated with each other to a certain extent. Collectively, the moderate correlation coefficients (ranging from .52 to .56) obtained in the present study among the four vocabulary subcomponents further supported the previous claim that the development of breadth and depth of vocabulary knowledge “is probably indeed interconnected and interdependent” ([9], p. 299).

5.2. Practical Implications

Some pedagogical implications can also be derived from the results of the present study for EFL vocabulary instruction. As the current study shows, among the three subcomponents of depth of vocabulary knowledge, collocation turned out to contribute most to performance on reading comprehension. As such, teachers should make students aware of the value of collocation in language learning. Typical awareness-raising activities such as getting students to underline the collocations encountered in a text and asking students to think up as many collocations as they can with a common word (e.g., make: make a mistake/a meal/trouble/friends/a complaint) are recommended [53]. When teaching a new word, teachers are advised to demonstrate some of its most common collocations at the same time. Furthermore, teachers should get students exposed to abundant word usage in the authentic context. Specifically, teachers can present authentic materials, such as movies and English songs, as well as English newspapers to students, and then lead students to do some follow-up collocation practice (e.g., matching activities and collocation grid exercises). This can help students learn the actual usage of collocations in authentic contexts.

Besides pedagogical implications, some implications can also be made for the assessment of vocabulary knowledge. For instance, in the present study, only one instrument was utilized to assess each of the four vocabulary knowledge subcomponents. It is suggested that test constructors could design and develop more valid instruments to gauge the subcomponents. As asserted by Bachman and Palmer [61], different characteristics of language tasks, such as item formats and scoring methods, could affect how the scores of the test represent the targeted construct. Similarly, Bernhardt (as cited in [62]) also suggests the need of developing various assessment tasks for the purpose of obtaining a full understanding about the targeted construct and generalizing research findings.

5.3. Limitations of the Study

Although the results of the present study provided some insight into the relationships among the four vocabulary knowledge subcomponents investigated, the generalizability of the results may be limited by the nature of the participants recruited. In fact, previous studies have shown that the relationship between breadth and depth of vocabulary knowledge depends on proficiency level. For instance, operationalizing depth of vocabulary knowledge through an adapted version of the DVK test, Nurweni and Read [3] found a strong relationship between the two dimensions of vocabulary knowledge for the advanced first-year university students, a moderate relationship for the middle-level students, and a considerably weak relationship for the low-level learners. Likewise, in the case of derivative word forms, Mahony [38] reported that the proficient 11th graders had a high degree of sensitivity to word structure, whereas the poor learners had greater difficulty in generalizing knowledge about suffixes to novel words. Given that the participants of the present study came from one university in the middle rank in Taiwan, it remains unknown whether the results obtained from learners with higher or lower levels of English proficiency would be different. Therefore, whether the results are also applicable to high-level or low-level English learners needs to be investigated in the future.

In addition, there are some limitations concerning the instrument used to assess reading comprehension. For one thing, only one reading comprehension test (i.e., the CPET reading comprehension test) was used to assess the participants’ reading comprehension ability, which means that only a limited range of reading skills was assessed and the scores may not fully represent the participants’ reading comprehension abilities. For another, the CPET reading comprehension test did not seem satisfactory in terms of reliability. The reliability estimate obtained for the test was only .50. Considering that it is a standardized test administered according to the guidelines provided by Cambridge ESOL, a reliability estimate of this size may seem rather surprising. A possible reason might be that the participants were quite homogeneous and did not produce much variance in the reading scores, which could lead to a deflation of the reliability estimate (Davies et al., as cited in [63]). As noted by Fletcher (as cited in [64]), approaches to the evaluation of reading comprehension need to entail multiple observations to fully and reliably capture the underlying construct. As such, future studies could incorporate at least two well-established reading comprehension measures.

Several other measurement-related limitations pertain to the instruments used to assess vocabulary knowledge. To start with, both the DVK test and the MST tend to overestimate learners’ vocabulary knowledge because they are susceptible to guessing. Take the DVK for example. In their validation of Read’s [23] version of the DVK test, Schmitt et al. [43] found a high probability (87%) that participants could select the associates partially or fully correctly even when they had no knowledge of the target word. This raises the concern that learners can guess successfully when they have only partial knowledge of a word or even no knowledge of it at all. Moreover, since each of the four vocabulary knowledge subcomponents was measured with only one instrument, the construct of vocabulary knowledge might not have been comprehensively and reliably gauged. Future studies, hence, should employ multiple measures to assess each of the four subcomponents of vocabulary knowledge.

Moreover, due to the constraint of the scope, the current study investigated only three subcomponents of depth of vocabulary knowledge. Therefore, the interpretation of the results should be restricted to the three subcomponents when it comes to assessing depth of vocabulary knowledge. It remains unanswered as to whether or not other subcomponents of depth of vocabulary knowledge may also affect learners’ performance on reading comprehension. As recommended by Qian [9], it is hoped that future research could include other subcomponents (e.g., spelling, syntactical properties, and register) to better understand what contribution various subcomponents of depth of vocabulary knowledge would make to performance on reading comprehension.

A similar recommendation is that, given that the four subcomponents together accounted for only 20% of the variance in reading comprehension, a further step could be to examine other variables that might explain the remaining variance. A range of other linguistic (e.g., word recognition, syntactic complexity, and contextual information) and nonlinguistic (e.g., learner’s anxiety) factors could also contribute to success in reading comprehension. With more variables explored, the nature of learners’ reading comprehension could be better understood [64]. The results of this kind of study may provide more constructive ways to help English learners improve their reading skills.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.