Research Article
WSF2: A Novel Framework for Filtering Web Spam
Algorithm 2
Dummy filter definition for the WSF2 platform.
web_features SVM check_svm();
describe SVM Classifies a web page as spam using Support Vector Machine classifier
score SVM 3

web_features TREE_95 check_tree(0.95, 0.99);
describe TREE_95 C5.0 between 0.95 and 0.99
score TREE_95 1.5

web_features TREE_99 check_tree(0.99, 1.00);
describe TREE_99 C5.0 between 0.99 and 1.00
score TREE_99 3

web_body HAS_VIAGRA_ON_WEB_BODY eval("[vV][iI?1!][aA][gG][rR][aA]")
describe HAS_VIAGRA_ON_WEB_BODY Check if the web page contains references to viagra on body
score HAS_VIAGRA_ON_WEB_BODY 2

meta HAS_HIGH_SPAM_RATE (SVM & (TREE_95 || TREE_99))
describe HAS_HIGH_SPAM_RATE Has high probability of being spam
score HAS_HIGH_SPAM_RATE +

required_score 5
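To make the scoring logic of this dummy filter concrete, the following Python sketch shows one plausible way such a definition could be evaluated. It is an illustration only, not the WSF2 implementation: the names `SCORES`, `body_rule_fires`, and `classify` are hypothetical, and the sketch assumes that TREE_95 scores 1.5 and TREE_99 scores 3, that a page is labelled spam when the summed score of the fired rules reaches `required_score`, and that the `+` score marks a definitive spam verdict that bypasses the sum.

```python
import re

# Rule scores as read from the filter definition (assumed mapping).
# "+" is kept as a marker; its meaning here is an assumption.
SCORES = {
    "SVM": 3.0,
    "TREE_95": 1.5,
    "TREE_99": 3.0,
    "HAS_VIAGRA_ON_WEB_BODY": 2.0,
    "HAS_HIGH_SPAM_RATE": "+",
}
REQUIRED_SCORE = 5.0

# Pattern from the HAS_VIAGRA_ON_WEB_BODY rule; inside a character
# class, "?" and "!" are matched literally.
VIAGRA_PATTERN = re.compile(r"[vV][iI?1!][aA][gG][rR][aA]")


def body_rule_fires(body):
    """Return True when the page body matches the body rule's pattern."""
    return VIAGRA_PATTERN.search(body) is not None


def classify(fired):
    """Return (is_spam, total) for a collection of fired base-rule names."""
    fired = set(fired)
    # Meta rule from the listing: SVM & (TREE_95 || TREE_99).
    if "SVM" in fired and ("TREE_95" in fired or "TREE_99" in fired):
        fired.add("HAS_HIGH_SPAM_RATE")
    # Assumption: a fired rule scored "+" decides the verdict on its own.
    if any(SCORES[name] == "+" for name in fired):
        return True, float("inf")
    total = sum(SCORES[name] for name in fired)
    # Assumption: the threshold comparison is inclusive.
    return total >= REQUIRED_SCORE, total
```

Under these assumptions, a page flagged only by the SVM rule scores 3 and stays below the threshold, whereas SVM together with the body rule reaches exactly 5, and SVM together with either tree rule triggers the meta rule directly.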