Research Article

The Effects of Feature Optimization on High-Dimensional Essay Data

Table 4

Effective feature list.

Number of times selected Feature name Meaning of feature

130 posNumINVoca The number of vocabularies with IN POS tag
130 posNumIN The number of words with IN POS tag
130 lmPosTrigramVoca The number of different POS trigrams
130 lmPosTrigramOccMore3 The ratio of POS trigrams that occurred more than 3 times
130 lmPosTrigramOccMore2Less5 The ratio of POS trigrams that occurred more than 2 but fewer than 5 times
130 lmPosTrigramOccMore2Less10 The ratio of POS trigrams that occurred more than 2 but fewer than 10 times
130 lmPosTrigramOccMore2 The ratio of POS trigrams that occurred more than 2 times
130 lmNumVoca4Root The fourth root of the number of vocabularies
130 lmNumVoca The number of vocabularies
130 lmLexWordOccMore5 The number of different words that occurred more than 5 times
130 lmLexWordOccMore4 The number of different words that occurred more than 4 times
130 lmLexWordOccMore3 The number of different words that occurred more than 3 times
130 lmLexWordOccMore2Less5 The number of different words that occurred more than 2 but fewer than 5 times
130 lmLexWordOccMore2Less10 The number of different words that occurred more than 2 but fewer than 10 times
130 lmLexWordOccMore2 The number of different words that occurred more than 2 times
130 lmLexWordOccMore1Less5 The number of different words that occurred more than 1 but fewer than 5 times
130 lmLexWordOccMore1Less10 The number of different words that occurred more than 1 but fewer than 10 times
130 lmLexWordOccMore1 The number of different words that occurred more than once
130 lmLexBigramVoca The number of different lexical bigrams
130 lmLexBigramOccMore2Less5 The ratio of lexical bigrams that occurred more than 2 but fewer than 5 times
130 lmAvgLexWordDistance The average distance of same words
130 lmAvgLemmaWordDistance The average distance of same lemmas
130 cNumWordLen8 The number of words whose length is more than 8 characters
130 cNumWordLen7 The number of words whose length is more than 7 characters
130 cNumWordLen6 The number of words whose length is more than 6 characters
130 cNumWordLen5 The number of words whose length is more than 5 characters
130 cNumWord The number of all words
130 cNumNotStopWord The number of all words except stop words
130 cNumNotStopVoca The number of all vocabularies except stop words
130 cNumMidd The number of words in the intermediate dictionary
130 cNumElem The number of words in the elementary dictionary
130 cNumChar The number of all characters
130 cCharNotStopWord The number of all characters except stop words
129 posNumNN The number of words with NN POS tag
129 lmLexBigramOccMore2Less10 The ratio of lexical bigrams that occurred more than 2 but fewer than 10 times
129 lmLexBigramOccMore2 The ratio of lexical bigrams that occurred more than 2 times
128 posNumJJVoca The number of vocabularies with JJ POS tag
128 posNumJJ The number of words with JJ POS tag
126 cNumWordLen10 The number of words whose length is more than 10 characters
125 posNumNNSVoca The number of vocabularies with NNS POS tag
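
To make the feature definitions above concrete, the following is a minimal sketch of how a few of the lexical features might be computed. It is not the authors' implementation. It assumes lowercase whitespace tokenization, reads "vocabularies" as distinct word types, and reads lmNumVoca4Root as the fourth root of the vocabulary size. The function name lexical_features is hypothetical.

```python
from collections import Counter


def lexical_features(text):
    """Compute a handful of the Table 4 lexical features for one essay.

    Assumptions (not stated in the source): tokens are lowercase
    whitespace-separated strings, and a "vocabulary" is a distinct token.
    """
    words = text.lower().split()
    counts = Counter(words)  # occurrences of each distinct word
    num_voca = len(counts)

    return {
        # Total number of word tokens.
        "cNumWord": len(words),
        # Word tokens longer than 5 characters.
        "cNumWordLen5": sum(1 for w in words if len(w) > 5),
        # Number of distinct words.
        "lmNumVoca": num_voca,
        # Fourth root of the number of distinct words (assumed reading).
        "lmNumVoca4Root": num_voca ** 0.25,
        # Distinct words occurring more than 2 times.
        "lmLexWordOccMore2": sum(1 for c in counts.values() if c > 2),
        # Distinct words occurring more than 2 but fewer than 5 times.
        "lmLexWordOccMore2Less5": sum(1 for c in counts.values() if 2 < c < 5),
    }


if __name__ == "__main__":
    essay = "The school is large and the school library is large as well"
    for name, value in lexical_features(essay).items():
        print(name, value)
```

The thresholds are treated as strict inequalities, matching the "more than" and "fewer than" wording of the table. The POS-based features would additionally require a part-of-speech tagger, which is left out of this sketch.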