The Scientific World Journal
Volume 2014 (2014), Article ID 870406, 7 pages
http://dx.doi.org/10.1155/2014/870406
Research Article

Bit-Table Based Biclustering and Frequent Closed Itemset Mining in High-Dimensional Binary Data

1Department of Process Engineering, University of Pannonia, Veszprém 8200, Hungary
2Bioinformatics & Scientific Computing Core, Campus Science Support Facilities, Vienna Biocenter, 1030 Vienna, Austria

Received 15 August 2013; Accepted 4 December 2013; Published 30 January 2014

Academic Editors: Y. Blanco Fernandez and Y.-B. Yuan

Copyright © 2014 András Király et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Abstract

During the last decade, various algorithms have been developed for discovering overlapping clusters in high-dimensional data. The two most prominent application fields in this research, proposed independently, are frequent itemset mining (developed for market basket data) and biclustering (applied to gene expression data analysis). A common limitation of both methodologies is their limited applicability to very large binary data sets. In this paper we propose a novel and efficient method to find both frequent closed itemsets and biclusters in high-dimensional binary data. The method is based on simple but very powerful matrix and vector multiplication approaches that ensure that all patterns can be discovered quickly. The proposed algorithm has been implemented in the commonly used MATLAB environment and is freely available to researchers.
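
To make the connection between vector products and closed itemsets concrete, the following is a minimal Python sketch. It is not the authors' MATLAB implementation. The toy matrix `DATA` and the helper names `support_vector`, `closure`, and `frequent_closed_itemsets` are hypothetical, and the brute-force enumeration of candidate itemsets is chosen only for readability; it would not scale to the high-dimensional data the paper targets. The sketch assumes the usual definition of a closed itemset: an itemset is closed if no proper superset has the same support.

```python
from itertools import combinations

# Hypothetical toy data: rows are transactions (or genes),
# columns are items (or conditions). Not taken from the paper.
DATA = [
    [1, 1, 0, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [0, 1, 1, 0],
]


def support_vector(data, items):
    """Elementwise product of the chosen columns.

    The result has a 1 exactly in the rows that contain every item.
    """
    vec = [1] * len(data)
    for j in items:
        vec = [v * row[j] for v, row in zip(vec, data)]
    return vec


def closure(data, rows_vec):
    """Items shared by all flagged rows, found with a vector-matrix product.

    counts[j] is the dot product of rows_vec with column j; an item belongs
    to the closure when that count equals the number of flagged rows.
    """
    n_rows = sum(rows_vec)
    n_items = len(data[0])
    counts = [
        sum(r * row[j] for r, row in zip(rows_vec, data))
        for j in range(n_items)
    ]
    return tuple(j for j, c in enumerate(counts) if c == n_rows)


def frequent_closed_itemsets(data, min_support):
    """Brute-force enumeration (illustrative only): keep itemsets that are
    frequent and equal to their own closure."""
    n_items = len(data[0])
    found = {}
    for size in range(1, n_items + 1):
        for items in combinations(range(n_items), size):
            vec = support_vector(data, items)
            supp = sum(vec)
            if supp >= min_support and closure(data, vec) == items:
                found[items] = supp
    return found


if __name__ == "__main__":
    for items, supp in frequent_closed_itemsets(DATA, 2).items():
        print(items, supp)
```

Each closed itemset returned here, together with the rows flagged by its support vector, is a submatrix consisting entirely of ones, which is how a closed itemset can be read as a bicluster in binary data.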