1: procedure FeatureSelection |
2: BOW ⟵ Bag-of-Words representation of the corpus |
3: L ⟵ Set of Leaders from Algorithm 1 |
4: G ⟵ The graph model from Algorithm 1 |
5: begin: |
6: W={} |
7: foreach word w in BOW do |
8: if w ∈ HEF (such as link text, headings, meta information, image descriptions, etc.) then |
9: W[w] = BOW[w] × 1.5 |
10: else |
11: W[w] = BOW[w] |
12: DocClass={} - represents the Document-Class vector; every document is initialized to 1 for all classes. |
13: foreach leader l in L do |
14: class,prob=get class and probability of l from the classifier |
15: update DocClass, value=l, class and prob, weight =1.5 |
16: foreach neighbor n of l in G do |
17: update DocClass, value=n, class and prob, weight =1.5 |
18: foreach document d in the corpus do |
19: class,prob=get class and probability of d from the classifier |
20: update DocClass, value=d, class and prob, weight =1 |
21: Assign to d the class with the highest prob × weight value. |
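
The procedure above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `weight_words`, `assign_classes`, `hef_words` and `classify` are hypothetical, the HEF boost is read as a multiplication by 1.5, "update DocClass" is read as adding prob × weight to the score of the predicted class, and the loop over all documents is taken to run after the loop over leaders. The classifier is a stand-in callable returning a `(class, prob)` pair.

```python
def weight_words(bow, hef_words, hef_weight=1.5):
    """Boost the count of every word that occurs in an HEF (assumed multiplicative)."""
    return {w: count * hef_weight if w in hef_words else count
            for w, count in bow.items()}


def assign_classes(docs, leaders, graph, classify, classes,
                   leader_weight=1.5, doc_weight=1.0):
    """Return {document: class} using leader- and neighbor-weighted votes."""
    # Every document starts with a score of 1 for every class.
    doc_class = {d: {c: 1.0 for c in classes} for d in docs}

    def update(target, cls, prob, weight):
        # Assumption: an update adds prob * weight to the class score.
        doc_class[target][cls] += prob * weight

    # Each leader votes for itself and for its neighbors in the graph.
    # Assumption: every neighbor listed in `graph` is also in `docs`.
    for leader in leaders:
        cls, prob = classify(leader)
        update(leader, cls, prob, leader_weight)
        for neighbor in graph.get(leader, ()):
            update(neighbor, cls, prob, leader_weight)

    # Each document also votes for itself with the default weight.
    for d in docs:
        cls, prob = classify(d)
        update(d, cls, prob, doc_weight)

    # Pick the class with the highest accumulated score per document.
    return {d: max(scores, key=scores.get) for d, scores in doc_class.items()}


# Toy data (hypothetical): document "a" is a leader with neighbor "b".
predictions = {"a": ("x", 0.9), "b": ("y", 0.6), "c": ("y", 0.8)}
weighted = weight_words({"news": 2, "the": 5}, hef_words={"news"})
result = assign_classes(
    docs=["a", "b", "c"],
    leaders=["a"],
    graph={"a": ["b"]},
    classify=lambda d: predictions[d],
    classes=["x", "y"],
)
print(weighted)  # {'news': 3.0, 'the': 5}
print(result)    # {'a': 'x', 'b': 'x', 'c': 'y'}
```

In the toy run, document "b" is classified as "y" on its own (score 1 + 0.6 = 1.6), but the vote propagated from its leader "a" gives class "x" a higher score (1 + 0.9 × 1.5 = 2.35), so "b" ends up in class "x".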