Computational Intelligence and Neuroscience
Volume 2016, Article ID 7386517, 13 pages
Research Article

Stratification-Based Outlier Detection over the Deep Web

1 Department of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215002, China
2 School of Computer Engineering, Suzhou Vocational University, Suzhou, Jiangsu 215104, China
3 Computer Science Department, University of Central Arkansas, Conway, AR 72035, USA

Received 3 November 2015; Revised 15 March 2016; Accepted 6 April 2016

Academic Editor: Leonardo Franco

Copyright © 2016 Xuefeng Xian et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


For many applications, finding rare instances or outliers can be more interesting than finding common patterns. Existing work on outlier detection has not considered the context of the deep web. In this paper, we argue that, in many scenarios, it is more meaningful to detect outliers over the deep web. In the deep web, users must submit queries through a query interface to retrieve the corresponding data. Traditional data mining methods, which assume direct access to the full dataset, therefore cannot be applied directly. The primary contribution of this paper is a new data mining method for outlier detection over the deep web. In our approach, the query space of a deep web data source is stratified based on a pilot sample. Building on this stratification, we develop neighborhood sampling and uncertainty sampling with the goal of improving recall and precision. Finally, a careful performance evaluation confirms that our approach can effectively detect outliers in the deep web.
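
The stratification idea summarized above can be illustrated with a minimal sketch. It is not the algorithm of this paper: the toy data source, the strata names, the threshold-based `is_outlier` rule, and the proportional allocation in `allocate_budget` are all assumptions made for illustration, and neighborhood sampling and uncertainty sampling are not implemented. The sketch only shows how a pilot sample drawn through a query interface could steer a limited query budget toward strata where outliers appear more often.

```python
import random


def build_hidden_source(seed=0):
    """Toy stand-in for a deep web source: records are reachable only via query()."""
    rng = random.Random(seed)
    # Hypothetical strata (one per query-attribute value) with different outlier rates.
    rates = {"A": 0.01, "B": 0.05, "C": 0.30}
    records = {
        s: [rng.gauss(100, 5) if rng.random() < r else rng.gauss(10, 2)
            for _ in range(1000)]
        for s, r in rates.items()
    }

    def query(stratum, k, sampler):
        # Simulates submitting a query and receiving k matching records.
        return sampler.sample(records[stratum], k)

    return query, list(rates)


def is_outlier(value, threshold=50.0):
    # Assumed outlier rule for the toy data; a real detector would differ.
    return value > threshold


def allocate_budget(pilot_hits, budget, smoothing=1):
    """Split the query budget in proportion to smoothed pilot outlier counts.

    Assumes every stratum received the same pilot sample size, so counts
    can be compared directly. Smoothing keeps zero-hit strata in play.
    """
    weights = {s: hits + smoothing for s, hits in pilot_hits.items()}
    total = sum(weights.values())
    alloc = {s: budget * w // total for s, w in weights.items()}
    # Records lost to integer rounding go to the most promising stratum.
    heaviest = max(weights, key=weights.get)
    alloc[heaviest] += budget - sum(alloc.values())
    return alloc


def detect_outliers(pilot_size=20, budget=100, seed=1):
    query, strata = build_hidden_source()
    sampler = random.Random(seed)
    # Phase 1: pilot sample of equal size from every stratum.
    pilot = {s: query(s, pilot_size, sampler) for s in strata}
    pilot_hits = {s: sum(is_outlier(v) for v in vals) for s, vals in pilot.items()}
    # Phase 2: spend the remaining budget where the pilot saw more outliers.
    alloc = allocate_budget(pilot_hits, budget)
    found = []
    for s, k in alloc.items():
        found.extend(v for v in query(s, k, sampler) if is_outlier(v))
    return alloc, found


if __name__ == "__main__":
    allocation, outliers = detect_outliers()
    print("second-phase allocation:", allocation)
    print("outliers found:", len(outliers))
```

Integer arithmetic is used in `allocate_budget` so that the allocation always sums exactly to the budget; for example, pilot counts of 0, 1, and 6 outliers with a budget of 100 yield 10, 20, and 70 second-phase records, respectively.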