Computational Intelligence and Neuroscience


Research Article | Open Access

Volume 2016 | Article ID 3057481 | 13 pages | https://doi.org/10.1155/2016/3057481

Self-Trained LMT for Semisupervised Learning

Academic Editor: Manuel Graña
Received: 10 Aug 2015
Accepted: 08 Nov 2015
Published: 29 Dec 2015

Abstract

The most important asset of semisupervised classification methods is the use of available unlabeled data combined with a much smaller set of labeled examples, so as to increase classification accuracy compared with the default procedure of supervised methods, which use only the labeled data during the training phase. Both the absence of automated mechanisms that produce labeled data and the high cost of the human effort needed to complete the labeling procedure in several scientific domains raise the need for semisupervised methods that counterbalance this phenomenon. In this work, a self-trained Logistic Model Trees (LMT) algorithm is presented, which combines the characteristics of logistic model trees with the scenario of scarce labeled data. We performed an in-depth comparison with other well-known semisupervised classification methods on standard benchmark datasets, and the presented technique achieved better accuracy in most cases.

1. Introduction

The classification task is an integral part of machine learning, aiming to separate and thereafter match each tested pattern or object to a distinct category or class. The classes vary according to the application domain of each problem. For example, the classes could represent the different origins of the tested speakers in a Speech Identification problem or different objects in pictures with various backgrounds in a Pattern Recognition problem.

The default scenario of classification is the supervised one, in which all the available labeled data are used to build a classification model. Using the information of the labeled data, the trained supervised model assigns a class label to each new instance. Unsupervised techniques can also be used for the same problems. The main characteristic of unsupervised techniques is that they do not need labeled data [1]. However, the absence of class information correspondingly downgrades their performance. The most recently proposed family of methods is commonly called semisupervised learning (SSL) and arises from a direct combination of the two previous strategies [2].

Friedhelm and Edmondo [3] proposed in 2014 a categorization of semisupervised learning algorithms, using the title Partially Supervised Learning (PSL) to refer to them. They also described the training phase of semisupervised algorithms as weak supervision, since only a part of the whole information is provided. Trying to explain all the new matters that have arisen from the PSL task, Friedhelm and Edmondo [3] review the most prominent directions of research related to this domain.

Sun [4] reviews theories that describe the characteristics of multiview learning. Under this concept, any set of features, or, more generally, any information gathered that is related to the dataset, can potentially improve classification accuracy. Moreover, Triguero et al. [5] made an in-depth study of self-labeled techniques, mainly focused on classification. Based on some specific properties, which seem to be quite representative and objective for the majority of real applications, they proposed a taxonomy of semisupervised classification (SSC) methods. One of the findings of this work is the shortage of multilearning approaches combined with the self-training method.

In many application domains, labeling the training instances incurs a high cost in labor and/or time [6]. The major asset of semisupervised algorithms is that they overcome the need for collecting and labeling large amounts of data in fields like text mining, speech recognition, and object detection in images [7], allowing the application of such methods in a variety of contexts. Moreover, the increased accuracy provided by these methods, along with the automated learning of as many patterns as possible from the data, renders semisupervised techniques a valuable tool for the machine learning community [8]. Using SSC methods, the effort required from human experts to label instances tends to be reduced dramatically, especially in real-life scenarios [9].

In particular, SSC methods demand only a small proportion of the whole amount of data to be labeled in order to accomplish their task. This proportion is widely known as the labeled ratio and is usually provided as a percentage value. Having chosen the labeled ratio R, the available data are split into two different subsets: the labeled set (L) and the unlabeled set (U). The instances included in each of these subsets can be expressed as follows:

L = \{(x_i, y_i) : i = 1, \ldots, |L|\}, \quad U = \{x_j : j = 1, \ldots, |U|\}, \quad R = \frac{|L|}{|L| + |U|},

where x denotes a feature vector and y its corresponding class label.
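As a concrete illustration, the split induced by a labeled ratio can be sketched as follows. This is a minimal Python sketch; the function name and toy data are ours, not part of the paper's KEEL/Weka implementation:

```python
import random

def split_labeled_unlabeled(X, y, labeled_ratio, seed=0):
    """Split a training set into a labeled set L and an unlabeled set U.
    Only the examples in L keep their class labels; the labels of the
    examples in U are hidden from the learner."""
    rng = random.Random(seed)
    indices = list(range(len(X)))
    rng.shuffle(indices)
    n_labeled = max(1, round(labeled_ratio * len(X)))
    L = [(X[i], y[i]) for i in indices[:n_labeled]]
    U = [X[i] for i in indices[n_labeled:]]
    return L, U

# A 10% labeled ratio on 100 instances yields |L| = 10 and |U| = 90.
X = [[float(i)] for i in range(100)]
y = [i % 2 for i in range(100)]
L, U = split_labeled_unlabeled(X, y, 0.10)
```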

Tanha et al. [10] suggested that using decision tree classifiers as base classifiers in the self-training algorithm is not particularly effective as far as semisupervised learning is concerned, mainly due to the low quality of the probability estimates that decision trees compute for their predictions. However, decision trees are not demanding in training time and produce easily comprehensible models. A series of modifications have been proposed so as to refrain from using the simplistic proportion distribution at the leaves of a pruned decision tree [11]; the Laplace correction and grafted decision trees are some of them [10]. Torgo [12] also made a thorough study of tree-based regression models, focusing on the generation of tree models and on pruning by tree selection.
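The proportion-based leaf estimate and its Laplace correction mentioned above can be illustrated with a short Python sketch (our own naming, not the authors' code):

```python
def leaf_probability(class_counts, num_classes, laplace=True):
    """Class-membership estimate at a decision-tree leaf.

    Raw estimate: n_c / n (the simplistic proportion distribution).
    Laplace correction: (n_c + 1) / (n + k), which avoids the
    overconfident 0 or 1 estimates produced by small leaves.
    """
    n = sum(class_counts.values())
    if laplace:
        return {c: (class_counts.get(c, 0) + 1) / (n + num_classes)
                for c in range(num_classes)}
    return {c: class_counts.get(c, 0) / n for c in range(num_classes)}

# A pure leaf with 3 instances of class 0 and none of class 1:
raw = leaf_probability({0: 3}, 2, laplace=False)   # {0: 1.0, 1: 0.0}
smoothed = leaf_probability({0: 3}, 2)             # {0: 0.8, 1: 0.2}
```

The smoothed estimate never reaches 0 or 1, which matters when a self-training scheme compares leaf probabilities against a confidence threshold.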

The aim of our work is to present a self-trained Logistic Model Tree (LMT) algorithm and compare it with other well-known semisupervised classification methods on standard benchmark datasets. To achieve this, we performed statistical comparisons of the proposed method with the other algorithms and presented an illustrative visualization recording the behavior of each algorithm against the others. Our technique achieved higher accuracy in most cases and a better overall performance across different scenarios, rendering it a robust tool.

In Section 2, a brief description of semisupervised classification techniques is provided. In Section 3, the proposed algorithm is presented. In Section 4, the results of comparing the proposed algorithm with other well-known semisupervised classification methods on standard benchmark datasets are reported. Finally, some concluding remarks and future research directions are given in Section 5.

2. Semisupervised Techniques

Self-training is a wrapper method that constitutes a great tool for semisupervised learning tasks. It is a simple scheme based on four stages [7]. In the first stage, a base classifier of our choice is trained on a small amount of labeled data, chosen randomly from the initial dataset. During the second stage, the unlabeled instances are classified, and afterwards a procedure of assessment follows. More specifically, each instance whose prediction has achieved a probability value over a defined threshold is considered reliable enough to be added to the training set for the following training phases. Finally, these instances are added to the initial training set, increasing its robustness. All these phases constitute one complete step of the algorithm. The classifier is retrained on the newly enlarged training set until the stopping criteria are satisfied. Self-training has been proven to perform with great success in many real-life scenarios, even though misclassified instances can occur due to the lack of specific assumptions. An important reason why the performance of PSL techniques may fluctuate compared with that of supervised algorithms is that, during the training phase of the former, some of the unlabeled examples never get labeled, since the algorithm terminates before reaching them [3]. This means that a part of the total information provided by the dataset is not exploited under this scheme.
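The four-stage self-training loop described above can be sketched as follows. This is a minimal Python illustration with a toy nearest-prototype base learner standing in for a real classifier; all names are ours and not part of the paper's implementation:

```python
class PrototypeClassifier:
    """Toy base learner: predicts by distance to each class mean.
    It merely stands in for a real classifier to keep the sketch
    self-contained."""
    def fit(self, X, y):
        sums, counts = {}, {}
        for x, label in zip(X, y):
            sums[label] = sums.get(label, 0.0) + x[0]
            counts[label] = counts.get(label, 0) + 1
        self.means = {c: sums[c] / counts[c] for c in sums}

    def predict_proba(self, x):
        # inverse-distance weights, normalised so they sum to 1
        w = {c: 1.0 / (1e-9 + abs(x[0] - m)) for c, m in self.means.items()}
        total = sum(w.values())
        return {c: v / total for c, v in w.items()}

def self_train(model, L, U, threshold=0.9, max_iter=40):
    """Generic self-training wrapper: retrain, label the most confident
    unlabeled instances, move them from U to L, and repeat."""
    L, U = list(L), list(U)
    for _ in range(max_iter):
        if not U:
            break
        model.fit([x for x, _ in L], [y for _, y in L])
        confident, remaining = [], []
        for x in U:
            probs = model.predict_proba(x)
            best = max(probs, key=probs.get)
            if probs[best] >= threshold:
                confident.append((x, best))   # accept the predicted label
            else:
                remaining.append(x)           # stays unlabeled for now
        if not confident:
            break
        L.extend(confident)
        U = remaining
    model.fit([x for x, _ in L], [y for _, y in L])
    return model, L, U

model, L, U = self_train(PrototypeClassifier(),
                         [([0.0], 0), ([10.0], 1)],
                         [[0.5], [9.5], [5.0]])
```

Note that the ambiguous point midway between the two class prototypes never passes the confidence threshold, so it remains in U when the loop stops; this is exactly the unexploited information mentioned above.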

The Self-Training with Editing (SETRED) method is a modified approach to self-training proposed by Li and Zhou [13]. Its principal improvement over the basic self-training scheme is the different handling of misclassified examples that come from the unlabeled set and may incorrectly be merged with the original training set, degrading the performance of the algorithm. In order to reduce such occasions, they build a neighborhood graph in the d-dimensional feature space, where d is the dimension of the feature vector. By evaluating a hypothesis test, they discard any example for which the outcome of the test is negative.

Cotraining is an equally important scheme that can be considered a variant of the self-training technique [14]. Its main premise is that the feature space can be exploited in a different way than by combining all its elements. Under this assumption, which is in line with multiview learning, the cotraining algorithm divides the feature space into two separate categories, making the prediction of the unlabeled instances more effective each time [15]. This assumption seems more realistic when the newly formed categories represent different views of the dataset. Since the cotraining algorithm belongs to the family of self-training schemes, its algorithmic phases are similar to those described previously, under the restriction that two independent feature vectors exist for each instance. In the work of Didaci et al. [16], the relation between the performance of cotraining and the size of the labeled training set was examined, and their results showed that high performance was achieved even in cases where the algorithm was provided with very few instances per class. However, Du et al. [17], based on an adequate number of experiments, came to the conclusion that relying on small labeled training sets cannot ensure the validity of the multiview assumptions. In order to exclude the insertion of misclassified instances into the training set at the end of each iteration, several approaches have been proposed. Sun and Jin [18] filtered the predictions of cotraining classifiers with Canonical Correlation Analysis (CCA) [4]. By applying CCA to paired datasets, the similarities between unlabeled examples of the test set and the initial training set were calculated in an effective way, and only those instances that satisfied CCA's restrictions were inserted into the initial training set.
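The core cotraining assumption, splitting the feature space into two views, can be illustrated with a small sketch (a hypothetical helper of ours, not from any cited implementation):

```python
def split_views(X, view1_idx, view2_idx):
    """Project each feature vector onto two disjoint index sets ('views').
    Cotraining trains one learner per view; each learner then labels its
    most confident unlabeled instances for the other learner."""
    view1 = [[x[i] for i in view1_idx] for x in X]
    view2 = [[x[i] for i in view2_idx] for x in X]
    return view1, view2

# Four features per instance, split into two two-feature views.
X = [[1, 2, 3, 4], [5, 6, 7, 8]]
v1, v2 = split_views(X, [0, 1], [2, 3])
```

The scheme works best when each view alone is sufficient to learn the target concept, which is exactly the condition questioned by Du et al. [17] for small labeled sets.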

Wang et al. [19] proposed using a distance metric that examines the probabilities of belonging to a class for labeled and unlabeled examples. If two examples have the same class probability value, the metric favors the example with the smaller distance, so that it is selected with higher probability. Another technique for separating the predictions of a semisupervised scheme with higher accuracy is the combination of more than one classifier. Jiang et al. [20] introduced a hybrid method which combines the predictions of two different types of classifiers so as to exploit their different characteristics: Naive Bayes (NB), a generative classifier, and Support Vector Machine (SVM), a discriminative classifier. The final prediction is controlled by a parameter which weights the two classifiers. A review of other similar hybrid methods is also presented in [20]. Moreover, Li and Zhou [6] suggested the Co-Forest algorithm, in which a number of Random Trees are trained on bootstrap data from the dataset. As an ensemble method, its behavior is robust even when the number of available labeled examples is small. The principal idea of this algorithm is the assignment of a few unlabeled examples to each Random Tree during the training period. Eventually, the final decision is produced by majority voting. An extension of this algorithm is ADE-Co-Forest, which is based on a data editing technique that finds and rejects possibly problematic instances at the end of each iteration [21]. In a similar framework, cotraining by committee has been proposed by Hady and Schwenker [22]. A starting committee was built from the completely known instances of the dataset. The ensemble methods used under this semisupervised scheme were named CoBag (Bagging), CoAdaBoost (AdaBoost), and CoRSM (random subspace).

RASCO [23] does not consider any specific criterion for splitting the feature vectors but implements a random split, so as to train different learners. Following this strategy, the unlabeled data are labeled and added to the training set based on the combined decisions of the learners trained on the different attribute splits. The Rel-RASCO algorithm [24] generates relevant random subspaces instead of purely random feature subspaces, using relevance scores of features obtained from the mutual information between features and the class.

The tri-training scheme trains three classifiers on different bootstrap samples of the same dataset to label each unlabeled instance. If two of the three classifiers agree on the categorization of an instance, then it is considered labeled and is added to the training set [25]. An improved approach is the improved tri-training algorithm (im-tri-training) [26], which eliminates some drawbacks of the original model, such as unsuitable error estimation, excessively confined restrictions, and the lack of weights for labeled and unlabeled examples. The idea of ensemble methods and majority voting has also been endorsed by Zhou and Goldman [27], who proposed democratic colearning. One really interesting asset of this algorithm is the enlargement of the training set of the classifier whose prediction differed from the final one after the voting phase. Sun and Zhang [28] suggested training an ensemble of classifiers from multiple views. Subsequently, only the instances whose classification stemmed from the consensus prediction of multiple classifiers are selected as the most confident, in order to teach the other ensemble from the new view.
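The two-out-of-three agreement rule of tri-training can be sketched as follows (an illustrative helper with our own naming):

```python
from collections import Counter

def tri_training_label(p1, p2, p3):
    """Tri-training labeling rule: an unlabeled instance receives a label
    only if at least two of the three classifiers agree on it;
    otherwise it stays unlabeled (None)."""
    label, votes = Counter([p1, p2, p3]).most_common(1)[0]
    return label if votes >= 2 else None
```

A three-way disagreement leaves the instance unlabeled, which is how tri-training avoids injecting low-confidence labels into the training set.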

Huang et al. [29] proposed a classification method based on Local Cluster Centers (CLCC). This algorithm tries to resolve problems that occur when the provided datasets contain few labeled training data, and it handles situations in which the labeling process may lead to misclassified instances. Another algorithm which uses the self-training scheme is the aggregation pheromone density based semisupervised classification (APSSC) algorithm [30]. As its name indicates, this algorithm exploits the aggregation pheromone property found in the natural behavior of real ants. It performed well and offered promising results for solving real-world problems related to the classification task. A combination of classifiers under the self-training scheme has been proposed by Wang et al. [31]. Their learning approach is named Self-Training Nearest Neighbor Rule using Cut Edges (SNNRCE), and its main advantage is that it uses graph-based methods to prevent problematic examples from being added to the initial labeled set in each iteration.

3. Proposed Algorithm

Our proposed algorithm combines the self-training scheme with the Logistic Model Tree (LMT) algorithm. An LMT is a decision tree with logistic regression models at its leaves, providing a piecewise logistic regression model [34]. As in ordinary decision trees, a test on one of the features is associated with every inner node. For a nominal feature with k values, the node has k child nodes, and examples are sorted down one of the branches depending on their feature value. For numerical features, the node has two child nodes and the test consists of comparing the feature value with a threshold. The LogitBoost algorithm is used to produce a logistic regression model at every node in the tree [35]. Because the subsets encountered at lower levels of the tree become smaller and smaller, it can be preferable at some point to build a linear logistic model instead of calling the tree-growing procedure recursively. There is strong evidence that building trees for very small datasets is usually not a good idea; it is better to use simpler models (like logistic regression) [36]. As for simple decision trees, pruning is an essential part of the LMT algorithm. For LMT, sometimes a single leaf (a tree pruned back to the root) leads to the best generalization performance, which is seldom the case for simple decision trees [11].
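The logistic regression functions that LMT fits at its nodes can be illustrated with a deliberately simplified stand-in: plain gradient-descent logistic regression on one feature. LMT itself fits these functions via LogitBoost; this sketch and its names are ours:

```python
import math

def train_logistic(X, y, lr=0.1, epochs=500):
    """Fit w, b of p(y=1|x) = sigmoid(w.x + b) by stochastic gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for x, t in zip(X, y):
            z = b + sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - t                      # gradient of the log-loss w.r.t. z
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def predict_proba(w, b, x):
    z = b + sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

# One-dimensional toy data, separable around x = 2.5.
w, b = train_logistic([[0.0], [1.0], [4.0], [5.0]], [0, 0, 1, 1])
```

Unlike a leaf's raw class proportions, such a model yields smooth probabilities that vary within the leaf's region, which is what makes LMT's confidence estimates useful for self-training.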

Decision trees can generate estimates of the class membership probabilities: the probability for a particular class is just the fraction of the instances in the corresponding region that are labeled with that class. In terms of probability estimates, LMT outperformed all the other simple decision trees and related algorithms included in the experiments of [34]. In this work, we propose a self-training method that uses the power of LMT for semisupervised tasks. The proposed algorithm (self-trained LMT) is presented in Algorithm 1. The self-training process produces good results by using the more accurate class probabilities of the LMT model for the unlabeled instances. When fitting the logistic regression functions at a node, LMT has to determine the number of LogitBoost iterations to run. Originally, this number was cross-validated at every node in the tree [34]. To save time, our implementation uses a heuristic that cross-validates the number only once and then uses it at every node in the tree. A similar process was used in [37].

Algorithm 1: Self-trained LMT.

Input:
LMT – Logistic Model Tree, used as base classifier
D – initial training dataset
R – ratio of labeled instances along D
L – initial labeled instances, L ⊂ D
U – initial unlabeled instances, U = D \ L
MCP – instances with Most Confident Predictions
MaxIter – number of maximum iterations performed
 (1) Initialization:
    Train LMT as base model on L
 (2) Loop for a number of iterations (MaxIter is equal to 40 for our implementation)
    (a) Use the LMT classifier to select the instances with the Most Confident Predictions per iteration (MCP ⊆ U)
    (b) Remove the MCP instances from U and add them to L
    (c) In each iteration a few instances per class are moved from U to L in this way
    (d) Re-train LMT as base model on the new enlarged L
Output:
Use the LMT trained on L to predict the class labels of the test cases.

The removal of data points from U and their addition to L is based on the estimated class probabilities. If the probability of the most probable class exceeds a predefined threshold, the instance is assigned that label. Our experiments showed that a good option for the threshold parameter is the value 0.9, which gave decent results irrespective of the dataset. It was noticed that only a small number of instances per class meets this restriction in each iteration.

Algorithm 2 briefly describes the main characteristics of the LMT classifier, focusing on the points that distinguish it from common decision tree algorithms.

Algorithm 2: Outline of the LMT classifier.

Definitions:
N – root of the decision tree
minNumInst – minimum number of instances at which a node is considered for splitting
numBoostIter – fixed number of iterations of LogitBoost [32]
CART – pruning algorithm [33]
Steps:
  (1) Build a logistic model at N
  (2) Split the data at N according to the splitting criterion
  (3) Terminate the splitting phase when any of the stopping criteria (minNumInst or numBoostIter) is met
  (4) Prune the tree using the CART-based algorithm

For the implementation, we used the open-source environments of Weka [38] and KEEL [5]. In our implementation, minNumInst was set to 15 and numBoostIter was set to 10.

4. Experiments

The experiments are based on standard classification datasets taken from the KEEL-dataset repository [39], covering a wide range of scientific fields. These datasets have been partitioned using the 10-fold cross-validation procedure. For each generated fold, a given algorithm is trained with the examples contained in the rest of the folds (training partition) and then tested with the current fold. Each training partition is divided into two parts: labeled and unlabeled examples. In order to study the influence of the amount of labeled data, we examined four different ratios for dividing the training set: 10%, 20%, 30%, and 40%.
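The 10-fold partitioning described above can be sketched as follows (a simplified round-robin illustration of ours; the actual partitions are distributed with the KEEL repository):

```python
def k_fold_partitions(n_examples, k=10):
    """Assign indices 0..n-1 to k folds round-robin; each fold serves once
    as the test partition while the remaining folds form the training one."""
    folds = [[] for _ in range(k)]
    for i in range(n_examples):
        folds[i % k].append(i)
    splits = []
    for t in range(k):
        train = [i for f, fold in enumerate(folds) if f != t for i in fold]
        splits.append((train, folds[t]))
    return splits

# 100 examples: each of the 10 splits has 90 training and 10 test indices.
splits = k_fold_partitions(100)
```

Each training partition would then be further divided into labeled and unlabeled parts according to the chosen labeled ratio.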

Subsequently, we compared the proposed method with other state-of-the-art algorithms included in the KEEL tool [39], such as self-training (C45) [7], self-training (SMO) [40], self-training (NN) [32], SETRED [13], cotraining (C45) [14], cotraining (SMO) [41], democratic-co [27], tri-training (C45) [41], tri-training (SMO) [25], tri-training (NN) [41], DE-tri-training (C45), DE-tri-training (SMO) [42], Co-Forest [6], Rasco (C45) [23], CLCC [29], APSSC [30], SNNRCE [31], Rel-Rasco (NB) [24], ADE-Co-Forest [43], cobagging (C45) [22], and cobagging (SMO) [44]. For all tested algorithms, the default parameters of KEEL were used.

The classification accuracy of each tested algorithm using 10%, 20%, 30%, and 40% as labeled ratio is presented in Tables 1, 2, 3, and 4, respectively. The best accuracy value among the different algorithms tested in each experiment is shown in bold. For our experiments, we used 52 datasets and all 22 of the above algorithms, including Self-LMT. The full tables of comparisons can be found at http://www.math.upatras.gr/~sotos/Self_LMT_Results.xlsx.


Table 1: Classification accuracy per dataset (10% labeled ratio).

Dataset  Self(LMT)  Self(C45)  Self(NN)  CoBag(C45)  Cotrain(C45)  Cotrain(SMO)  Democ-Co  Tri-train(C45)  Tri-train(SMO)  SETRED

appendicitis  0.8318  0.8318  0.7573  0.8045  0.8318  0.7927  0.8218  0.8045  0.7527  0.7373
australian  0.8507  0.8275  0.8043  0.8275  0.8348  0.7942  0.8449  0.8449  0.8014  0.8043
automobile  0.3365  0.4052  0.3970  0.3366  0.3730  0.3298  0.3601  0.3889  0.3209  0.4388
banana  0.8547  0.8479  0.8638  0.8553  0.8481  0.8913  0.8417  0.8481  0.8951  0.8638
breast  0.6809  0.7216  0.6618  0.7252  0.6774  0.6696  0.7287  0.7216  0.6973  0.6835
bupa  0.5398  0.5392  0.5314  0.6119  0.5739  0.6123  0.5104  0.5742  0.6291  0.5339
chess  0.9540  0.9543  0.8098  0.9543  0.9515  0.9033  0.9199  0.9578  0.9061  0.8104
cleveland  0.5091  0.5248  0.4825  0.5335  0.5736  0.5032  0.5223  0.4761  0.5565  0.5294
coil2000  0.9377  0.9373  0.8904  0.9348  0.9403  0.6175  0.9322  0.9357  0.5775  0.8926
contraceptive  0.4717  0.4886  0.4107  0.4827  0.4460  0.4590  0.4358  0.4813  0.4542  0.4148
crx  0.8555  0.8660  0.8111  0.8496  0.8159  0.8177  0.8495  0.8555  0.8393  0.8111
dermatology  0.8627  0.8562  0.9053  0.8760  0.8429  0.7737  0.8760  0.8816  0.6729  0.9182
ecoli  0.6883  0.6467  0.6847  0.6558  0.5772  0.6550  0.6370  0.6585  0.6489  0.6937
flare  0.7093  0.7214  0.6511  0.7140  0.5742  0.5658  0.7214  0.7158  0.6587  0.6445
german  0.7140  0.7060  0.6540  0.7110  0.6900  0.6260  0.7160  0.7170  0.6610  0.6660
glass  0.5126  0.4847  0.5631  0.4898  0.4504  0.5554  0.4868  0.4921  0.5188  0.5402
haberman  0.7119  0.7054  0.6144  0.7122  0.7190  0.5942  0.7156  0.7088  0.6299  0.6211
heart  0.7333  0.6778  0.7444  0.7037  0.7000  0.7370  0.8000  0.7148  0.6296  0.7444
hepatitis  0.8365  0.8343  0.7883  0.8343  0.8343  0.8243  0.8343  0.8343  0.8343  0.8008
housevotes  0.9229  0.9414  0.9129  0.9195  0.8225  0.8260  0.8899  0.9158  0.6449  0.9129
iris  0.8400  0.8400  0.9000  0.8000  0.8467  0.9467  0.9133  0.7267  0.8933  0.9133
led7digit  0.6620  0.6140  0.6180  0.5640  0.5140  0.5020  0.6160  0.6040  0.5160  0.6180
lymphography  0.6574  0.6312  0.6354  0.5948  0.5726  0.5490  0.4901  0.6118  0.5556  0.6833
magic  0.8363  0.8217  0.7840  0.8319  0.8204  0.8390  0.7842  0.8245  0.8392  0.7840
mammographic  0.8074  0.8025  0.7591  0.8085  0.8074  0.7881  0.7963  0.8183  0.7710  0.7580
monk-2  0.9458  0.9727  0.6459  0.9657  0.9727  0.8146  0.9075  0.9657  0.7571  0.6459
movement_libras  0.3611  0.2556  0.4583  0.2417  0.2361  0.1806  0.1972  0.2750  0.2528  0.4444
mushroom  0.9968  0.9966  0.9945  0.9954  0.9968  0.9932  0.9927  0.9955  0.9905  0.9945
nursery  0.9296  0.9064  0.7143  0.9006  0.9034  0.8141  0.8951  0.9039  0.7714  0.8101
page-blocks  0.9503  0.9523  0.9256  0.9567  0.9492  0.9448  0.9077  0.9561  0.9454  0.9359
penbased  0.9571  0.8916  0.9778  0.9049  0.8957  0.9843  0.9474  0.9027  0.9855  0.9778
phoneme  0.7809  0.7770  0.8044  0.7889  0.7652  0.8320  0.7874  0.7770  0.8288  0.8046
pima  0.7136  0.6643  0.6565  0.6342  0.6704  0.6395  0.6967  0.6564  0.5990  0.6565
ring  0.8328  0.8396  0.6691  0.8582  0.8366  0.9701  0.8741  0.8542  0.9699  0.6691
saheart  0.6906  0.6516  0.6408  0.6497  0.6364  0.5822  0.6819  0.6776  0.6342  0.6300
satimage  0.8340  0.8045  0.8466  0.8205  0.8056  0.8570  0.8462  0.8224  0.8518  0.8570
segment  0.9242  0.8900  0.9061  0.9160  0.9017  0.9281  0.9026  0.9000  0.9190  0.9065
sonar  0.7010  0.6433  0.6633  0.7014  0.5819  0.6005  0.6005  0.7019  0.6005  0.6633
spambase  0.8980  0.8669  0.8281  0.8951  0.8884  0.8601  0.8777  0.8810  0.8549  0.8281
spectfheart  0.7085  0.6819  0.6865  0.7571  0.7235  0.7795  0.7379  0.7574  0.7870  0.7201
splice  0.9345  0.8266  0.6793  0.8248  0.8310  0.7862  0.8978  0.8254  0.7843  0.6997
texture  0.9805  0.8305  0.9513  0.8500  0.8289  0.9753  0.8944  0.8524  0.9684  0.9513
thyroid  0.9869  0.9922  0.8963  0.9906  0.9924  0.9342  0.9393  0.9918  0.9299  0.9090
tic-tac-toe  0.7557  0.7108  0.7150  0.7035  0.6931  0.6733  0.6900  0.7088  0.6420  0.7255
titanic  0.7688  0.7751  0.6402  0.7837  0.7783  0.7756  0.7756  0.7765  0.7760  0.6402
twonorm  0.9720  0.8136  0.9358  0.8597  0.8085  0.9735  0.9645  0.8616  0.9743  0.9358
vehicle  0.6526  0.5792  0.5698  0.6030  0.5747  0.6135  0.5023  0.6194  0.5910  0.5828
vowel  0.4808  0.4242  0.4879  0.4556  0.4384  0.4586  0.4162  0.4525  0.4283  0.4808
wine  0.9157  0.7405  0.9438  0.7866  0.8075  0.9435  0.9493  0.8203  0.8931  0.9438
wisconsin  0.9373  0.9093  0.9478  0.9284  0.9064  0.9592  0.9650  0.9312  0.9563  0.9478
yeast  0.5304  0.4616  0.4771  0.4765  0.4886  0.4785  0.4886  0.4907  0.4839  0.4886
zoo  0.8097  0.6794  0.9236  0.7456  0.6356  0.5625  0.9314  0.7192  0.5775  0.9347


Table 2: Classification accuracy per dataset (20% labeled ratio).

Dataset  Self(LMT)  Self(C45)  Self(NN)  CoBag(C45)  Cotrain(C45)  Cotrain(SMO)  Democ-Co  Tri-train(C45)  Tri-train(SMO)  SETRED

appendicitis  0.8518  0.8509  0.7891  0.8318  0.8500  0.8327  0.8600  0.8318  0.8027  0.8100
australian  0.8667  0.8580  0.8304  0.8362  0.8377  0.8304  0.8580  0.8522  0.8435  0.8304
automobile  0.5523  0.5158  0.5234  0.5355  0.4803  0.4047  0.4744  0.4938  0.3986  0.4951
banana  0.8685  0.8804  0.8691  0.8804  0.8743  0.8975  0.8794  0.8736  0.8985  0.8691
breast  0.7358  0.7156  0.6780  0.7224  0.7227  0.6076  0.7184  0.7225  0.6493  0.6788
bupa  0.6238  0.6055  0.5654  0.6026  0.6055  0.6298  0.5504  0.6199  0.6459  0.5655
chess  0.9778  0.9778  0.8489  0.9778  0.9759  0.9509  0.9393  0.9781  0.9459  0.8489
cleveland  0.5462  0.4929  0.4692  0.5535  0.5270  0.4866  0.5530  0.5399  0.5129  0.5257
coil2000  0.9383  0.9400  0.8965  0.9338  0.9402  0.8206  0.9330  0.9371  0.6900  0.8994
contraceptive  0.5248  0.4779  0.4324  0.4806  0.5044  0.4793  0.4841  0.4881  0.4705  0.4379
crx  0.8661  0.8571  0.8142  0.8510  0.8498  0.8500  0.8587  0.8540  0.8700  0.8142
dermatology  0.9190  0.9100  0.9355  0.9186  0.8790  0.8763  0.9300  0.9214  0.8905  0.9467
ecoli  0.7295  0.7204  0.7351  0.7324  0.7329  0.6996  0.7235  0.7208  0.7171  0.7351
flare  0.7251  0.7280  0.6613  0.7176  0.6897  0.6023  0.7402  0.7270  0.6332  0.6660
german  0.7110  0.6970  0.6510  0.7120  0.6990  0.6410  0.7290  0.6840  0.6580  0.6620
glass  0.5270  0.5085  0.6193  0.5631  0.5348  0.5896  0.4799  0.6104  0.5538  0.5935
haberman  0.7219  0.7089  0.6402  0.7348  0.7353  0.6626  0.7222  0.7089  0.6302  0.6600
heart  0.7963  0.7519  0.7741  0.7481  0.7556  0.7778  0.8222  0.7926  0.7852  0.7741
hepatitis  0.7848  0.8434  0.7859  0.8200  0.8434  0.8343  0.8343  0.8434  0.8343  0.7992
housevotes  0.9667  0.9586  0.8995  0.9528  0.9363  0.8816  0.9023  0.9586  0.8965  0.8957
iris  0.8933  0.8933  0.9067  0.8667  0.8933  0.9400  0.9533  0.8800  0.9200  0.9200
led7digit  0.6840  0.6780  0.6280  0.6580  0.6440  0.6220  0.6620  0.6760  0.6100  0.6280
lymphography  0.7685  0.7055  0.7446  0.7474  0.7750  0.6979  0.4612  0.7545  0.6303  0.7392
magic  0.8513  0.8304  0.7934  0.8409  0.8320  0.8528  0.8017  0.8335  0.8451  0.7934
mammographic  0.8232  0.8229  0.7473  0.8248  0.8228  0.7968  0.8084  0.8157  0.7907  0.7473
monk-2  0.9818  0.9795  0.7346  0.9727  0.9795  0.8960  0.9441  0.9727  0.8800  0.7322
movement_libras  0.4972  0.3472  0.5750  0.3556  0.3917  0.2194  0.3667  0.3833  0.3611  0.6000
mushroom  0.9995  0.9991  0.9984  0.9989  0.9991  0.9968  0.9980  0.9989  0.9975  0.9984
nursery  0.9505  0.9235  0.7471  0.9235  0.9260  0.8390  0.9108  0.9258  0.7714  0.8327
page-blocks  0.9618  0.9602  0.9413  0.9613  0.9609  0.9571  0.9115  0.9611  0.9567  0.9448
penbased  0.9711  0.9241  0.9871  0.9370  0.9247  0.9916  0.9632  0.9287  0.9913  0.9871
phoneme  0.8059  0.7839  0.8344  0.8244  0.8024  0.8497  0.8055  0.7977  0.8470  0.8344
pima  0.7553  0.6810  0.6382  0.7084  0.6874  0.6562  0.7319  0.6939  0.6341  0.6369
ring  0.8520  0.8658  0.7007  0.8918  0.8630  0.9705  0.8969  0.8795  0.9722  0.7007
saheart  0.6907  0.6515  0.6623  0.6862  0.7038  0.6041  0.6972  0.6777  0.6256  0.6601
satimage  0.8396  0.8238  0.8702  0.8410  0.8261  0.8693  0.8612  0.8353  0.8707  0.8740
segment  0.9407  0.9264  0.9333  0.9277  0.9294  0.9489  0.9307  0.9247  0.9364  0.9338
sonar  0.7736  0.6636  0.7012  0.6586  0.6336  0.7164  0.6474  0.6631  0.6960  0.7012
spambase  0.9154  0.8912  0.8529  0.8960  0.8908  0.9012  0.8941  0.8941  0.9010  0.8529
spectfheart  0.7719  0.7192  0.6860  0.7833  0.7611  0.6744  0.7305  0.7491  0.7231  0.7085
splice  0.9417  0.8834  0.6969  0.8875  0.8793  0.8897  0.9119  0.8843  0.9169  0.6978
texture  0.9875  0.8669  0.9713  0.8840  0.8631  0.9865  0.9176  0.8929  0.9842  0.9713
thyroid  0.9939  0.9935  0.9049  0.9942  0.9932  0.9403  0.9419  0.9939  0.9392  0.9149
tic-tac-toe  0.8768  0.7568  0.7662  0.7286  0.7213  0.7223  0.7329  0.7505  0.7024  0.7621
titanic  0.7833  0.7824  0.6407  0.7828  0.7824  0.7833  0.7797  0.7815  0.7806  0.6407
twonorm  0.9755  0.8165  0.9411  0.8741  0.8276  0.9728  0.9707  0.8674  0.9747  0.9411
vehicle  0.7141  0.6489  0.6253  0.6549  0.6489  0.6926  0.4833  0.6620  0.6668  0.6253
vowel  0.6111  0.5303  0.6848  0.5566  0.5303  0.6990  0.5030  0.5545  0.6707  0.6697
wine  0.9265  0.8369  0.9382  0.8255  0.7863  0.9556  0.9542  0.8422  0.9265  0.9154
wisconsin  0.9490  0.9343  0.9478  0.9360  0.9343  0.9490  0.9637  0.9253  0.9503  0.9463
yeast  0.5728  0.5263  0.4899  0.5202  0.5492  0.5122  0.5492  0.5486  0.5169  0.5061
zoo  0.8975  0.8311  0.9247  0.8231  0.7344  0.6283  0.8900  0.8264  0.5375  0.9372


Table 3: Classification accuracy per dataset (30% labeled ratio).

Dataset  Self(LMT)  Self(C45)  Self(NN)  CoBag(C45)  Cotrain(C45)  Cotrain(SMO)  Democ-Co  Tri-train(C45)  Tri-train(SMO)  SETRED

appendicitis  0.8491  0.8409  0.8300  0.8318  0.8409  0.8127  0.8691  0.8327  0.7227  0.8109
australian  0.8652  0.8464  0.8101  0.8507  0.8478  0.8232  0.8536  0.8420  0.8116  0.8101
automobile  0.6631  0.5484  0.5630  0.5737  0.5710  0.4725  0.5432  0.5707  0.4429  0.5515
banana  0.8796  0.8819  0.8700  0.8811  0.8785  0.8992  0.8728  0.8806  0.9004  0.8700
breast  0.7287  0.7179  0.6419  0.7111  0.7079  0.5594  0.7219  0.7000  0.5773  0.6495
bupa  0.6293  0.5842  0.5791  0.6159  0.6137  0.6192  0.5988  0.5850  0.6307  0.5848
chess  0.9890  0.9856  0.8651  0.9819  0.9840  0.9628  0.9596  0.9843  0.9512  0.8648
cleveland  0.5837  0.5364  0.5159  0.5599  0.5467  0.4733  0.5700  0.5362  0.4619  0.5566
coil2000  0.9390  0.9384  0.8977  0.9317  0.9400  0.8741  0.9320  0.9387  0.6480  0.8992
contraceptive  0.5132  0.4922  0.4365  0.5092  0.4976  0.4929  0.4847  0.5105  0.4929  0.4426
crx  0.8568  0.8526  0.8067  0.8615  0.8570  0.8512  0.8482  0.8571  0.8608  0.8067
dermatology  0.9352  0.9209  0.9463  0.9184  0.9098  0.9267  0.9180  0.9240  0.9296  0.9380
ecoli  0.7592  0.7534  0.7593  0.7121  0.7505  0.7116  0.7595  0.7503  0.6910  0.7652
flare  0.7392  0.7299  0.6651  0.7308  0.7224  0.6473  0.7485  0.7298  0.6623  0.6594
german  0.7370  0.7100  0.6900  0.6920  0.6840  0.6460  0.7390  0.7030  0.6610  0.6960
glass  0.5122  0.5437  0.6096  0.5635  0.5746  0.6220  0.5079  0.5687  0.5983  0.5933
haberman  0.7219  0.7153  0.6630  0.7282  0.7288  0.6695  0.7447  0.7089  0.6499  0.6761
heart  0.8000  0.7556  0.7852  0.7704  0.7704  0.7963  0.8296  0.7556  0.7778  0.7852
hepatitis  0.7757  0.8192  0.8168  0.8334  0.7934  0.8343  0.8168  0.8334  0.8343  0.8275
housevotes  0.9702  0.9475  0.9078  0.9531  0.9531  0.9127  0.9037  0.9531  0.9164  0.9078
iris  0.8933  0.9200  0.9133  0.9067  0.9133  0.9467  0.9533  0.9133  0.9467  0.9267
led7digit  0.7240  0.6880  0.6360  0.6820  0.6520  0.6760  0.6820  0.6760  0.6780  0.6360
lymphography  0.7430  0.7587  0.7446  0.7445  0.7444  0.7925  0.7442  0.7315  0.7858  0.7439
magic  0.8556  0.8380  0.7950  0.8456  0.8385  0.8562  0.8016  0.8411  0.8562  0.7950
mammographic  0.8282  0.8438  0.7620  0.8352  0.8437  0.8098  0.8300  0.8425  0.7990  0.7620
monk-2  0.9739  0.9909  0.7581  0.9886  0.9909  0.9030  0.9452  0.9909  0.9161  0.7513
movement_libras  0.6333  0.4083  0.7111  0.4250  0.4639  0.3833  0.4917  0.4889  0.4583  0.6917
mushroom  1.0000  0.9996  0.9991  0.9991  0.9996  0.9984  0.9995  0.9995  0.9988  0.9991
nursery  0.9578  0.9377  0.7687  0.9382  0.9363  0.8551  0.9212  0.9362  0.7714  0.8357
page-blocks  0.9645  0.9635  0.9428  0.9622  0.9618  0.9636  0.9289  0.9635  0.9633  0.9461
penbased  0.9757  0.9409  0.9901  0.9485  0.9391  0.9937  0.9729  0.9431  0.9908  0.9901
phoneme  0.8229  0.8137  0.8470  0.8275  0.8218  0.8557  0.8029  0.8249  0.8534  0.8470
pima  0.7592  0.7252  0.6733  0.7085  0.7123  0.6680  0.7305  0.7045  0.6589  0.6694
ring  0.8677  0.8754  0.7104  0.8943  0.8784  0.9708  0.9089  0.8881  0.9722  0.7104
saheart  0.6928  0.6753  0.6644  0.6797  0.6778  0.6039  0.7080  0.6797  0.5930  0.6644
satimage  0.8552  0.8270  0.8822  0.8528  0.8407  0.8786  0.8693  0.8438  0.8777  0.8862
segment  0.9455  0.9303  0.9411  0.9329  0.9385  0.9567  0.9416  0.9381  0.9558  0.9411
sonar  0.7593  0.6762  0.7645  0.7155  0.6333  0.8264  0.7310  0.6912  0.7543  0.7645
spambase  0.9195  0.8999  0.8695  0.9117  0.9052  0.9117  0.9052  0.9073  0.9123  0.8695
spectfheart  0.7647  0.7496  0.7010  0.7608  0.7538  0.6859  0.7121  0.7644  0.7046  0.7127
splice  0.9527  0.9166  0.7166  0.9201  0.9176  0.9022  0.9188  0.9157  0.9298  0.7154
texture  0.9905  0.8891  0.9800  0.9115  0.8971  0.9895  0.9331  0.9055  0.9885  0.9805
thyroid  0.9956  0.9946  0.9100  0.9939  0.9947  0.9483  0.9521  0.9944  0.9489  0.9183
tic-tac-toe  0.9467  0.7610  0.7975  0.7745  0.7495  0.7755  0.7630  0.7641  0.7233  0.7923
titanic  0.7860  0.7787  0.6407  0.7797  0.7783  0.7860  0.7792  0.7783  0.7860  0.6407
twonorm  0.9746  0.8255  0.9439  0.8778  0.8323  0.9731  0.9701  0.8743  0.9734  0.9439
vehicle  0.7874  0.6584  0.6630  0.6561  0.6656  0.7175  0.5652  0.6702  0.7245  0.6583
vowel  0.7263  0.6232  0.7889  0.6172  0.5859  0.8030  0.5960  0.6333  0.7828  0.7737
wine  0.9438  0.8415  0.9382  0.8817  0.8255  0.9663  0.9660  0.9039  0.9552  0.9275
wisconsin  0.9534  0.9443  0.9535  0.9431  0.9474  0.9621  0.9666  0.9502  0.9459  0.9535
yeast  0.5843  0.5209  0.5000  0.5236  0.5358  0.5270  0.5371  0.5405  0.5244  0.5128
zoo  0.8808  0.8231  0.9331  0.7731  0.8256  0.6597  0.9133  0.7781  0.6197  0.9331


Datasets  Self(LMT)  Self(C45)  Self(NN)  CoBag(C45)  Cotrain(C45)  Cotrain(SMO)  Democ-Co  Tri-train(C45)  Tri-train(SMO)  SETRED

appendicitis  0.8509  0.8018  0.7545  0.8382  0.8000  0.7727  0.8591  0.7718  0.7827  0.7564
australian  0.8739  0.8507  0.8246  0.8449  0.8551  0.8246  0.8420  0.8435  0.8319  0.8246
automobile  0.6640  0.5760  0.6038  0.6009  0.6168  0.5844  0.5537  0.6151  0.5515  0.6089
banana  0.8858  0.8828  0.8649  0.8836  0.8785  0.9015  0.8811  0.8825  0.8996  0.8649
breast  0.7294  0.7046  0.6529  0.7289  0.7082  0.5497  0.7257  0.7009  0.5870  0.6632
bupa  0.6724  0.6546  0.5760  0.6180  0.6002  0.6462  0.6035  0.6570  0.6347  0.5874
chess  0.9912  0.9853  0.8795  0.9872  0.9856  0.9728  0.9712  0.9872  0.9668  0.8795
cleveland  0.5597  0.5093  0.5083  0.5627  0.5364  0.4452  0.5290  0.5289  0.4479  0.5586
coil2000  0.9384  0.9401  0.8967  0.9354  0.9395  0.9109  0.9327  0.9399  0.8881  0.8982
contraceptive  0.5234  0.5085  0.4413  0.5011  0.4963  0.4908  0.4977  0.5173  0.4868  0.4379
crx  0.8558  0.8465  0.8161  0.8587  0.8556  0.8120  0.8424  0.8527  0.7882  0.8161
dermatology  0.9467  0.9385  0.9493  0.9267  0.9159  0.9439  0.9439  0.9353  0.9324  0.9466
ecoli  0.7711  0.7739  0.7474  0.7476  0.7768  0.7268  0.7947  0.7947  0.7148  0.7534
flare  0.7449  0.7280  0.6688  0.7223  0.7072  0.6511  0.7420  0.7270  0.6501  0.6679
german  0.7380  0.7100  0.6730  0.6970  0.7170  0.6580  0.7420  0.7110  0.6580  0.6790
glass  0.5503  0.5722  0.6555  0.6272  0.6038  0.6699  0.5535  0.6718  0.6103  0.6487
haberman  0.7347  0.7186  0.6791  0.7449  0.7385  0.7026  0.7449  0.7218  0.6692  0.6924
heart  0.8333  0.7667  0.7815  0.7926  0.7407  0.8000  0.8259  0.7741  0.8000  0.7741
hepatitis  0.8517  0.8651  0.7934  0.8619  0.8785  0.8443  0.8543  0.8675  0.8343  0.8392
housevotes  0.9702  0.9586  0.9117  0.9702  0.9586  0.9247  0.9164  0.9591  0.9164  0.9117
iris  0.9200  0.9000  0.9267  0.9067  0.9000  0.9400  0.9733  0.9067  0.9467  0.9333
led7digit  0.7260  0.6840  0.6380  0.6880  0.6740  0.6880  0.6820  0.6740  0.6920  0.6380
lymphography  0.8166  0.7725  0.7247  0.7630  0.7653  0.8050  0.6496  0.7382  0.7841  0.7639
magic  0.8565  0.8416  0.7996  0.8532  0.8405  0.8626  0.8111  0.8409  0.8628  0.7996
mammographic  0.8390  0.8452  0.7486  0.8352  0.8440  0.8001  0.8267  0.8367  0.7929  0.7486
monk-2  1.0000  1.0000  0.7761  1.0000  1.0000  0.9142  0.9543  1.0000  0.9103  0.7761
movement_libras  0.6944  0.4889  0.7639  0.4889  0.4611  0.5167  0.6222  0.5194  0.5528  0.7500
mushroom  1.0000  1.0000  0.9998  0.9993  1.0000  0.9991  0.9998  0.9995  0.9993  0.9998
nursery  0.9652  0.9442  0.7883  0.9433  0.9435  0.8608  0.9307  0.9439  0.8014  0.8341
page-blocks  0.9655  0.9677  0.9483  0.9667  0.9688  0.9627  0.9371  0.9673  0.9629  0.9455
penbased  0.9794  0.9456  0.9904  0.9567  0.9451  0.9942  0.9761  0.9478  0.9943  0.9904
phoneme  0.8247  0.8233  0.8562  0.8444  0.8261  0.8568  0.8142  0.8211  0.8555  0.8560
pima  0.7620  0.7292  0.6993  0.7330  0.7030  0.6717  0.7527  0.7318  0.6679  0.6980
ring  0.8746  0.8727  0.7174  0.9095  0.8846  0.9712  0.9177  0.8953  0.9691  0.7174
saheart  0.7189  0.6666  0.6685  0.6732  0.6906  0.5930  0.7296  0.6666  0.5864  0.6664
satimage  0.8642  0.8448  0.8907  0.8252  0.8483  0.8906  0.8738  0.8545  0.8883  0.8940
segment  0.9545  0.9385  0.9446  0.9160  0.9398  0.9597  0.9424  0.9437  0.9541  0.9446
sonar  0.7207  0.6688  0.7888  0.7014  0.6869  0.8269  0.7586  0.6914  0.7933  0.7888
spambase  0.9254  0.9056  0.8780  0.8951  0.9119  0.9250  0.9154  0.9089  0.9243  0.8780
spectfheart  0.8023  0.7752  0.7348  0.7370  0.7489  0.6860  0.6972  0.7604  0.7010  0.7500
splice  0.9555  0.9266  0.7210  0.8248  0.9276  0.9351  0.9273  0.9301  0.9379  0.7201
texture  0.9925  0.9015  0.9829  0.8500  0.9084  0.9936  0.9420  0.9124  0.9931  0.9831
thyroid  0.9957  0.9940  0.9161  0.9901  0.9949  0.9557  0.9526  0.9944  0.9543  0.9231
tic-tac-toe  0.9666  0.7829  0.8038  0.7035  0.7797  0.7734  0.7901  0.8006  0.7338  0.7975
titanic  0.7851  0.7874  0.6407  0.7837  0.7856  0.7883  0.7787  0.7842  0.7883  0.6407
twonorm  0.9764  0.8269  0.9454  0.8597  0.8358  0.9732  0.9722  0.8788  0.9745  0.9454
vehicle  0.7874  0.6926  0.6667  0.6030  0.6773  0.7576  0.6419  0.6809  0.7446  0.6619
vowel  0.7545  0.6566  0.8707  0.4556  0.6586  0.8828  0.6475  0.6586  0.8677  0.8535
wine  0.9549  0.8866  0.9330  0.7866  0.8987  0.9608  0.9605  0.8755  0.9719  0.9441
wisconsin  0.9564  0.9606  0.9623  0.9284  0.9417  0.9592  0.9695  0.9474  0.9547  0.9623
yeast  0.5836  0.5196  0.4967  0.4765  0.5472  0.5203  0.5512  0.5364  0.5176  0.5074
zoo  0.8917  0.8767  0.9331  0.7456  0.8875  0.7728  0.9117  0.9075  0.6822  0.9397

Here, we present only the best 10 of these algorithms, ranked by classification accuracy. A short comment on the general behavior of the proposed algorithm, compared with the most effective of the remaining methods, follows each experiment. We also provide a more representative visualization of the average accuracy of the proposed algorithm against the remaining 21 algorithms in Figure 1, where each different ratio of labeled instances is mapped to a different color on a radar plot.

In this experiment, self-trained LMT and Co-Forest each achieved 8 wins over the 52 datasets, followed by self-training (C45), cotraining (C45), and APSSC with 5 victories each. Despite the low ratio of labeled instances, self-trained LMT achieved the best average accuracy, confirming its robust behavior.

In the experiment with a 20% labeled ratio, the self-trained LMT algorithm achieved 15 victories, followed by Co-Forest with 5 and cotraining (SMO) with 4.

Similarly to the previous experiment, self-trained LMT achieved 17 wins out of 52 datasets, while cotraining (SMO) and Rel-Rasco (NB) attained the best accuracy on 7 and 6 datasets, respectively.

Finally, the self-trained LMT algorithm outperformed the rest of the algorithms, scoring the best accuracy on 19 different datasets, while democratic-co achieved 5 victories.
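The win counts above follow a simple rule: for every dataset, each algorithm whose accuracy ties the best value on that dataset receives one win. The sketch below illustrates the rule on hypothetical accuracy values (the dataset names and figures are examples, not the numbers reported in the tables above).

```python
# Illustrative sketch: counting per-dataset "wins" from an accuracy table.
# The accuracy values below are hypothetical, not the reported results.

accuracies = {
    # dataset: {algorithm: accuracy}
    "iris": {"Self(LMT)": 0.92, "Co-Forest": 0.90, "SETRED": 0.93},
    "wine": {"Self(LMT)": 0.95, "Co-Forest": 0.89, "SETRED": 0.94},
    "zoo":  {"Self(LMT)": 0.89, "Co-Forest": 0.94, "SETRED": 0.93},
}

wins = {}
for dataset, scores in accuracies.items():
    best = max(scores.values())
    # every algorithm tied for the best accuracy on this dataset gets a win
    for algorithm, acc in scores.items():
        if acc == best:
            wins[algorithm] = wins.get(algorithm, 0) + 1

print(wins)
```

With these toy values each algorithm wins exactly one dataset; on the real tables the counts concentrate on self-trained LMT as described above.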

An interesting point that emerges from Figure 1 is that increasing the labeled ratio does not necessarily mean that the average accuracy of every algorithm will also improve. Cobagging (C45) illustrates this phenomenon: its accuracy decreased at a 40% labeled ratio compared with the 30% scenario. Furthermore, several other algorithms, such as Rel-Rasco (NB), APSSC, and de-tri-training (SMO), did not achieve a noteworthy improvement between the 30% and 40% labeled ratios. Consequently, radar plots of average accuracy like the one in Figure 1 allow us to compare any subset of these algorithms not only in terms of accuracy but also in terms of their response to increases in the labeled ratio, revealing any saturation phenomena. In order to compare all the algorithms considered in the study with the proposed algorithm across the different labeled ratios, the results of the Friedman test together with a post hoc statistical test described in [45] are presented in Tables 5, 6, 7, and 8.
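The statistical procedure behind the tables can be sketched as follows: a Friedman test on the per-dataset results gives a global significance decision and the average rankings, and a step-down Holm correction then adjusts the significance threshold for each pairwise comparison against the control. This is only a minimal sketch on synthetic accuracy columns for three hypothetical algorithms; the paper itself applies the more advanced procedures of [45], and here pairwise Wilcoxon signed-rank tests stand in for the post hoc comparisons.

```python
# Sketch of Friedman test + Holm step-down post hoc on synthetic data.
from scipy.stats import friedmanchisquare, wilcoxon

# Synthetic per-dataset accuracies of three hypothetical algorithms.
algo_a = [0.85, 0.90, 0.72, 0.88, 0.95, 0.80, 0.77, 0.91]
algo_b = [0.80, 0.87, 0.70, 0.85, 0.93, 0.78, 0.74, 0.89]
algo_c = [0.78, 0.85, 0.69, 0.83, 0.90, 0.75, 0.73, 0.86]

# Global test: do the algorithms' per-dataset ranks differ significantly?
stat, p = friedmanchisquare(algo_a, algo_b, algo_c)
print(f"Friedman statistic={stat:.4f}, p={p:.4f}")

# Holm step-down on pairwise comparisons against the control (algo_a):
# sort p-values ascending and compare the i-th to alpha / (k - i).
pairs = {"a vs b": wilcoxon(algo_a, algo_b).pvalue,
         "a vs c": wilcoxon(algo_a, algo_c).pvalue}
for i, (name, pval) in enumerate(sorted(pairs.items(), key=lambda kv: kv[1])):
    threshold = 0.05 / (len(pairs) - i)  # adjusted threshold, as in the tables
    print(f"{name}: p={pval:.4f}, Holm threshold={threshold:.4f}")
```

The rightmost column of Tables 5-8 lists exactly these adjusted Holm/Hochberg thresholds, against which each algorithm's p value is compared.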


Algorithm  Friedman ranking  p value  Holm/Hochberg test

Self-LMT  6.8750  —  —
Tri-training (C45)  8.7019  1.5141E−01  0.0500
Cobagging (C45)  9.2019  6.7671E−02  0.0250
Co-Forest  9.6346  3.0238E−02  0.0167
Democratic-Co  9.8558  1.9252E−02  0.0125
Self-training (C45)  10.2308  8.4117E−03  0.0100
Cotraining (SMO)  11.2115  6.6111E−04  0.0083
SETRED  11.3846  3.9842E−04  0.0071
Tri-training (SMO)  11.5000  2.8153E−04  0.0063
SNNRCE  11.5000  2.8153E−04  0.0056
Cotraining (C45)  11.5096  2.7340E−04  0.0050
Cobagging (SMO)  11.5865  2.1587E−04  0.0045
DE-tri-training (C45)  11.7788  1.1778E−04  0.0042
Tri-training (NN)  11.9135  7.6088E−05  0.0038
ADE-Co-Forest  12.0481  4.8633E−05  0.0036
DE-tri-training (SMO)  12.1827  3.0754E−05  0.0033
Self-training (SMO)  12.3173  1.9242E−05  0.0031
Self-training (NN)  12.6827  5.1049E−06  0.0029
APSSC  13.5769  1.4202E−07  0.0028
CLCC  13.7500  6.7193E−08  0.0026
Rel-Rasco (NB)  14.4231  3.0843E−09  0.0025
Rasco (C45)  15.1346  8.8277E−11  0.0024


Algorithm  Friedman ranking  p value  Holm/Hochberg test

Self-LMT  5.1923  —  —
Cobagging (C45)  8.7019  5.8533E−03  0.0500
Tri-training (C45)  8.7692  4.9736E−03  0.0250
Democratic-co  8.7788  4.8582E−03  0.0167
Cotraining (C45)  9.5288  6.6111E−04  0.0125
Co-Forest  9.7019  3.9842E−04  0.0100
Self-training (C45)  10.0288  1.4596E−04  0.0083
Cotraining (SMO)  10.2596  6.9191E−05  0.0071
Cobagging (SMO)  10.4519  3.6267E−05  0.0063
Tri-training (SMO)  10.8654  8.4002E−06  0.0056
DE-tri-training (SMO)  11.1538  2.8515E−06  0.0050
DE-tri-training (C45)  11.7404  2.7211E−07  0.0045
Self-training (SMO)  12.0192  8.2869E−08  0.0042
SETRED  12.0288  7.9474E−08  0.0038
Self-training (NN)  12.7019  3.7052E−09  0.0036
ADE-Co-Forest  12.8558  1.7697E−09  0.0033
SNNRCE  12.9135  1.3364E−09  0.0031
Tri-training (NN)  13.1538  4.0597E−10  0.0029
APSSC  13.8462  1.0806E−11  0.0028
CLCC  15.6058  2.9085E−16  0.0026
Rel-Rasco (NB)  15.7019  1.5503E−16  0.0025
Rasco (C45)  17.0000  1.8292E−20  0.0024


Algorithm  Friedman ranking  p value  Holm/Hochberg test

Self-LMT  5.6923  —  —
Democratic-Co  8.8558  1.2989E−02  0.0500
Cotraining (SMO)  9.4231  3.3946E−03  0.0250
Cobagging (C45)  9.6923  1.6840E−03  0.0167
Tri-training (C45)  9.7212  1.5583E−03  0.0125
Tri-training (SMO)  10.2981  2.9846E−04  0.0100
Co-Forest  10.3077  2.8988E−04  0.0083
Cotraining (C45)  10.4423  1.9157E−04  0.0071
Self-training (C45)  10.5096  1.5511E−04  0.0063
Self-training (SMO)  10.6827  8.9048E−05  0.0056
Cobagging (SMO)  10.7019  8.3632E−05  0.0050
DE-tri-training (SMO)  10.8173  5.7133E−05  0.0045
SETRED  12.3942  1.4202E−07  0.0042
Self-training (NN)  12.7500  2.9907E−08  0.0038
DE-tri-training (C45)  12.9135  1.4252E−08  0.0036
ADE-Co-Forest  13.0577  7.3123E−09  0.0033
Rasco (C45)  13.0962  6.1074E−09  0.0031
SNNRCE  13.5288  7.5764E−10  0.0029
Rel-Rasco (NB)  13.9135  1.0781E−10  0.0028
Tri-training (NN)  14.0385  5.6118E−11  0.0026
APSSC  14.2885  1.4781E−11  0.0025
CLCC  15.8750  1.2868E−15  0.0024


Algorithm  Friedman ranking  p value  Holm/Hochberg test

Self-LMT  5.0865  —  —
Democratic-Co  9.1923  1.2641E−03  0.0500
Tri-training (C45)  9.2115  1.1990E−03  0.0250
Cotraining (SMO)  9.4135  6.7962E−04  0.0167
Cotraining (C45)  9.8942  1.5989E−04  0.0125
Self-training (C45)  9.9519  1.3319E−04  0.0100
Co-Forest  10.0577  9.4794E−05  0.0083
Cobagging (SMO)  10.7212  9.6656E−06  0.0071
Self-training (SMO)  10.7308  9.3332E−06  0.0063
Tri-training (SMO)  10.8942  5.1049E−06  0.0056
Cobagging (C45)  11.0096  3.3028E−06  0.0050
DE-tri-training (SMO)  11.6923  2.1358E−07  0.0045
DE-tri-training (C45)  12.1827  2.5157E−08  0.0042
ADE-Co-Forest  12.2596  1.7753E−08  0.0038
Rasco (C45)  12.2692  1.6992E−08  0.0036
SETRED  12.3269  1.3048E−08  0.0033
Self-training (NN)  13.0769  3.5107E−10  0.0031
SNNRCE  13.5385  3.2060E−11  0.0029
Tri-training (NN)  14.1827  9.1543E−13  0.0028
APSSC  14.2404  6.5767E−13  0.0026
Rel-Rasco (NB)  14.4808  1.6224E−13  0.0025
CLCC  16.5865  1.7128E−19  0.0024

As a result, the proposed algorithm gives statistically better results than all the tested algorithms. This is due to its better probability-based ranking and higher classification accuracy, which allow the selection of high-confidence predictions in the selection step of self-training.
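The confidence-based selection step mentioned above can be sketched as a generic self-training loop: fit the base learner on the labeled pool, rank the unlabeled instances by predicted class probability, and move only the high-confidence predictions into the labeled pool. Since no LMT implementation ships with scikit-learn, a DecisionTreeClassifier stands in for the base learner here, and the 20% labeled ratio and 0.95 confidence threshold are illustrative choices, not settings from the paper.

```python
# Minimal self-training sketch with confidence-based selection.
# DecisionTreeClassifier is a stand-in for LMT; threshold/ratio are illustrative.
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
rng = np.random.RandomState(0)
labeled = rng.rand(len(y)) < 0.2           # pretend only ~20% are labeled
X_l, y_l = X[labeled], y[labeled]
X_u = X[~labeled]                          # unlabeled pool (labels hidden)

clf = DecisionTreeClassifier(random_state=0)
for _ in range(5):                         # a few self-training iterations
    clf.fit(X_l, y_l)
    if len(X_u) == 0:
        break
    proba = clf.predict_proba(X_u)
    confidence = proba.max(axis=1)
    pick = confidence >= 0.95              # keep only high-confidence predictions
    if not pick.any():
        break
    X_l = np.vstack([X_l, X_u[pick]])      # pseudo-label and absorb them
    y_l = np.concatenate([y_l, proba[pick].argmax(axis=1)])
    X_u = X_u[~pick]

print("final labeled set size:", len(y_l))
```

The quality of the `predict_proba` ranking is exactly what the paragraph above credits for LMT's advantage: a base learner with poorly calibrated probabilities absorbs noisy pseudo-labels at this step.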

5. Conclusions

Techniques that exploit both labeled and unlabeled instances in classification tasks are promising. When labeled instances are scarce, the learning process becomes difficult, as supervised learning methods cannot produce a classifier of adequate accuracy.

LMT produces a single tree containing binary splits on numeric features, multiway splits on categorical ones, and logistic regression models at the leaves, and the algorithm ensures that only relevant features are included in the latter. The resulting classifier is not as easy to interpret as a standard decision tree, but it is much more legible than an ensemble of classifiers or kernel-based estimators.
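The structure just described can be illustrated with a deliberately simplified sketch: a single binary split on a numeric feature with a logistic regression model fitted in each leaf. The real LMT algorithm [34] grows the tree with LogitBoost and prunes it; the root feature, threshold, and dataset below are illustrative choices only.

```python
# Conceptual simplification of a logistic model tree: one root split,
# one logistic regression per leaf. Not the actual LMT algorithm.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)
feature, threshold = 0, np.median(X[:, 0])   # hypothetical root split

left = X[:, feature] <= threshold
leaf_models = {
    "left":  LogisticRegression(max_iter=10000).fit(X[left], y[left]),
    "right": LogisticRegression(max_iter=10000).fit(X[~left], y[~left]),
}

def predict(x):
    # route the instance down the (one-level) tree, then apply the leaf model
    leaf = "left" if x[feature] <= threshold else "right"
    return leaf_models[leaf].predict(x.reshape(1, -1))[0]

preds = np.array([predict(x) for x in X])
print("training accuracy:", (preds == y).mean())
```

Reading such a model means inspecting one split and two small coefficient vectors, which is the legibility advantage the paragraph above refers to.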

In this work, a self-trained LMT algorithm has been proposed. We compared it with other well-known semisupervised learning methods on standard benchmark datasets, and the presented technique achieved better accuracy on most of the tested datasets. Given these encouraging results, one can expect the proposed technique to be applicable to real classification tasks, offering slightly better accuracy than the traditional semisupervised approaches.

In spite of these results, no general method will always work. The main drawback of semisupervised schemes is the time required for the training phase. Feature selection algorithms, which search for a subset of relevant features by removing the less informative ones [46], could mitigate this cost, saving both valuable operation time and computational resources. Building Logistic Model Trees with the LMT algorithm is orders of magnitude slower than simple tree induction or using model trees for classification. Improving the computational efficiency of the method using feature selection could be an interesting field for further research.
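The speed-up direction suggested above amounts to a filter step before training, so that the (slow) tree induction only sees a reduced feature set. A minimal sketch with scikit-learn, where `SelectKBest` with an ANOVA F-score and `k=10` are illustrative choices rather than settings from the paper:

```python
# Sketch: filter-style feature selection before training the base learner,
# to reduce the cost of tree induction. SelectKBest/k=10 are illustrative.
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_breast_cancer(return_X_y=True)
selector = SelectKBest(score_func=f_classif, k=10).fit(X, y)
X_reduced = selector.transform(X)          # keep the 10 highest-scoring features

print(X.shape, "->", X_reduced.shape)      # (569, 30) -> (569, 10)
```

In a semisupervised setting, such a filter would be fitted on the labeled subset only (or with a semisupervised criterion as in [46]) and then applied to the unlabeled pool at every self-training iteration.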

Appendix

A Java software tool implementing the proposed algorithm, together with basic run instructions, can be found at http://www.math.upatras.gr/~sotos/SelfLMT-Experiment.zip.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

  1. A. K. Jain, “Data clustering: 50 years beyond K-means,” Pattern Recognition Letters, vol. 31, no. 8, pp. 651–666, 2010.
  2. Q. Ye, H. Pan, and C. Liu, “Enhancement of ELM by clustering discrimination manifold regularization and multiobjective FOA for semisupervised classification,” Computational Intelligence and Neuroscience, vol. 2015, Article ID 731494, 9 pages, 2015.
  3. F. Schwenker and E. Trentin, “Pattern classification and clustering: a review of partially supervised learning approaches,” Pattern Recognition Letters, vol. 37, pp. 4–14, 2014.
  4. S. Sun, “A survey of multi-view machine learning,” Neural Computing and Applications, vol. 23, no. 7-8, pp. 2031–2038, 2013.
  5. I. Triguero, S. García, and F. Herrera, “Self-labeled techniques for semi-supervised learning: taxonomy, software and empirical study,” Knowledge and Information Systems, vol. 42, no. 2, pp. 245–284, 2015.
  6. M. Li and Z.-H. Zhou, “Improve computer-aided diagnosis with machine learning techniques using undiagnosed samples,” IEEE Transactions on Systems, Man, and Cybernetics Part A: Systems and Humans, vol. 37, no. 6, pp. 1088–1098, 2007.
  7. C. Rosenberg, M. Hebert, and H. Schneiderman, “Semi-supervised self-training of object detection models,” in Proceedings of the 7th IEEE Workshop on Applications of Computer Vision (WACV '05), pp. 29–36, IEEE, January 2005.
  8. M.-L. Zhang and Z.-H. Zhou, “CoTrade: confident co-training with data editing,” IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, vol. 41, no. 6, pp. 1612–1626, 2011.
  9. C. Liu and P. C. Yuen, “A boosted co-training algorithm for human action recognition,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 21, no. 9, pp. 1203–1213, 2011.
  10. J. Tanha, M. van Someren, and H. Afsarmanesh, “Semi-supervised self-training for decision tree classifiers,” International Journal of Machine Learning and Cybernetics, 2015.
  11. F. Provost and P. Domingos, “Tree induction for probability based ranking,” Machine Learning, vol. 52, no. 3, pp. 199–215, 2003.
  12. L. Torgo, “Inductive learning of tree-based regression models,” AI Communications, vol. 13, no. 2, pp. 137–138, 2000.
  13. M. Li and Z.-H. Zhou, “SETRED: self-training with editing,” in Advances in Knowledge Discovery and Data Mining: 9th Pacific-Asia Conference, PAKDD 2005, Hanoi, Vietnam, May 18–20, 2005, vol. 3518 of Lecture Notes in Computer Science, pp. 611–621, Springer, Berlin, Germany, 2005.
  14. O. Chapelle, B. Schölkopf, and A. Zien, Semi-Supervised Learning, MIT Press, Cambridge, Mass, USA, 2006.
  15. X. Zhu and A. Goldberg, Introduction to Semi-Supervised Learning, Synthesis Lectures on Artificial Intelligence and Machine Learning, Morgan & Claypool Publishers, 2009.
  16. L. Didaci, G. Fumera, and F. Roli, “Analysis of co-training algorithm with very small training sets,” in Structural, Syntactic, and Statistical Pattern Recognition, vol. 7626 of Lecture Notes in Computer Science, pp. 719–726, Springer, Berlin, Germany, 2012.
  17. J. Du, C. X. Ling, and Z.-H. Zhou, “When does cotraining work in real data?” IEEE Transactions on Knowledge and Data Engineering, vol. 23, no. 5, pp. 788–799, 2011.
  18. S. Sun and F. Jin, “Robust co-training,” International Journal of Pattern Recognition and Artificial Intelligence, vol. 25, no. 7, pp. 1113–1126, 2011.
  19. S. Wang, L. Wu, L. Jiao, and H. Liu, “Improve the performance of co-training by committee with refinement of class probability estimations,” Neurocomputing, vol. 136, pp. 30–40, 2014.
  20. Z. Jiang, S. Zhang, and J. Zeng, “A hybrid generative/discriminative method for semi-supervised classification,” Knowledge-Based Systems, vol. 37, pp. 137–145, 2013.
  21. C. Deng and M. Z. Guo, “A new co-training-style random forest for computer aided diagnosis,” Journal of Intelligent Information Systems, vol. 36, no. 3, pp. 253–281, 2011.
  22. M. F. A. Hady and F. Schwenker, “Co-training by committee: a new semi-supervised learning framework,” in Proceedings of the IEEE International Conference on Data Mining Workshops (ICDMW '08), pp. 563–572, IEEE, Pisa, Italy, December 2008.
  23. J. Wang, S.-W. Luo, and X.-H. Zeng, “A random subspace method for co-training,” in Proceedings of the International Joint Conference on Neural Networks (IJCNN '08), pp. 195–200, IEEE, Hong Kong, June 2008.
  24. Y. Yaslan and Z. Cataltepe, “Co-training with relevant random subspaces,” Neurocomputing, vol. 73, no. 10–12, pp. 1652–1661, 2010.
  25. Z.-H. Zhou and M. Li, “Tri-training: exploiting unlabeled data using three classifiers,” IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 11, pp. 1529–1541, 2005.
  26. T. Guo, G. Li, and T. Guo, “Improved tri-training with unlabeled data,” in Software Engineering and Knowledge Engineering: Theory and Practice, vol. 115 of Advances in Intelligent and Soft Computing, pp. 139–147, 2012.
  27. Y. Zhou and S. Goldman, “Democratic co-learning,” in Proceedings of the 16th IEEE International Conference on Tools with Artificial Intelligence (ICTAI '04), pp. 594–602, IEEE, November 2004.
  28. S. Sun and Q. Zhang, “Multiple-view multiple-learner semi-supervised learning,” Neural Processing Letters, vol. 34, no. 3, pp. 229–240, 2011.
  29. T. Huang, Y. Yu, G. Guo, and K. Li, “A classification algorithm based on local cluster centers with a few labeled training examples,” Knowledge-Based Systems, vol. 23, no. 6, pp. 563–571, 2010.
  30. A. Halder, S. Ghosh, and A. Ghosh, “Ant based semisupervised classification,” in Swarm Intelligence, vol. 6234 of Lecture Notes in Computer Science, pp. 376–383, Springer, Berlin, Germany, 2010.
  31. Y. Wang, X. Xu, H. Zhao, and Z. Hua, “Semi-supervised learning based on nearest neighbor rule and cut edges,” Knowledge-Based Systems, vol. 23, no. 6, pp. 547–554, 2010.
  32. M. Iggane, A. Ennaji, D. Mammass, and M. Yassa, “Self-training using a k-nearest neighbor as a base classifier reinforced by support vector machines,” International Journal of Computer Applications, vol. 56, no. 6, pp. 43–46, 2012.
  33. L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone, Classification and Regression Trees, Wadsworth Statistics/Probability, Chapman and Hall/CRC, 1984.
  34. N. Landwehr, M. Hall, and E. Frank, “Logistic model trees,” Machine Learning, vol. 59, no. 1-2, pp. 161–205, 2005.
  35. M. Sumner, E. Frank, and M. Hall, “Speeding up logistic model tree induction,” in Proceedings of the 9th European Conference on Principles and Practice of Knowledge Discovery in Databases, pp. 675–683, Porto, Portugal, October 2005.
  36. C. Perlich, F. Provost, and J. Simonoff, “Tree induction vs. logistic regression: a learning-curve analysis,” Journal of Machine Learning Research, vol. 4, pp. 211–255, 2003.
  37. M. Sumner, E. Frank, and M. Hall, “Speeding up logistic model tree induction,” in Knowledge Discovery in Databases: PKDD 2005, vol. 3721 of Lecture Notes in Computer Science, pp. 675–683, Springer, Berlin, Germany, 2005.
  38. M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I. H. Witten, “The WEKA data mining software: an update,” ACM SIGKDD Explorations Newsletter, vol. 11, no. 1, pp. 10–18, 2009.
  39. J. Alcalá-Fdez, A. Fernández, J. Luengo et al., “KEEL data-mining software tool: data set repository, integration of algorithms and experimental analysis framework,” Journal of Multiple-Valued Logic and Soft Computing, vol. 17, no. 2-3, pp. 255–287, 2011.
  40. S. S. Keerthi, S. K. Shevade, C. Bhattacharyya, and K. R. K. Murthy, “Improvements to Platt's SMO algorithm for SVM classifier design,” Neural Computation, vol. 13, no. 3, pp. 637–649, 2001.
  41. A. Blum and T. Mitchell, “Combining labeled and unlabeled data with co-training,” in Proceedings of the 11th Annual Conference on Computational Learning Theory (COLT '98), pp. 92–100, Morgan Kaufmann Publishers, Madison, Wis, USA, July 1998.
  42. C. Deng and M. Guo, “Tri-training and data editing based semi-supervised clustering algorithm,” in MICAI 2006: Advances in Artificial Intelligence, vol. 4293 of Lecture Notes in Computer Science, pp. 641–651, Springer, Berlin, Germany, 2006.
  43. C. Deng and M. Z. Guo, “A new co-training-style random forest for computer aided diagnosis,” Journal of Intelligent Information Systems, vol. 36, no. 3, pp. 253–281, 2011.
  44. Y. Li, H. Li, C. Guan, and Z. Chin, “A self-training semi-supervised support vector machine algorithm and its applications in brain computer interface,” in Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '07), vol. 1, pp. I-385–I-388, Honolulu, Hawaii, USA, April 2007.
  45. S. García, A. Fernández, J. Luengo, and F. Herrera, “Advanced nonparametric tests for multiple comparisons in the design of experiments in computational intelligence and data mining: experimental analysis of power,” Information Sciences, vol. 180, no. 10, pp. 2044–2064, 2010.
  46. Z. Xu, I. King, M. R.-T. Lyu, and R. Jin, “Discriminative semi-supervised feature selection via manifold regularization,” IEEE Transactions on Neural Networks, vol. 21, no. 7, pp. 1033–1047, 2010.

Copyright © 2016 Nikos Fazakis et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

