Comparative and Functional Genomics
Volume 5, Issue 2, Pages 156-162
Conference paper

Classification of Chemical Compounds to Support Complex Queries in a Pathway Database

EML Research GmbH, Schloss-Wolfsbrunnenweg 33, Heidelberg 69118, Germany

Received 7 November 2003; Revised 11 December 2003; Accepted 23 December 2003

Copyright © 2004 Hindawi Publishing Corporation. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Data quality in biological databases has become a topic of great discussion. To provide high-quality data and to deal with the vast amount of biochemical data, annotators and curators need to be supported by software that carries out part of their work in a (semi-)automatic manner. The detection of errors and inconsistencies requires the knowledge of domain experts; in most cases it is therefore done manually, which makes it very expensive and time-consuming. This paper presents two tools to partially support the curation of data on biochemical pathways. The first tool enables the automatic classification of chemical compounds based on their respective SMILES strings. Such classification allows the querying and visualization of biochemical reactions at different levels of abstraction, according to the level of detail at which the reaction participants are described. Chemical compounds can be classified in a flexible manner based on different criteria. The second tool supports the process of data curation by facilitating the detection of compounds that are identified as different but that are actually the same. This is also used to identify similar reactions and, in turn, pathways.
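The two ideas in the abstract, classifying compounds from their SMILES strings and detecting compounds that are stored under different names but are actually the same, can be illustrated with a minimal sketch. It is not the tools described in this paper. The rule table `CLASS_RULES`, the functions `classify` and `find_duplicates`, and the sample compounds are all hypothetical. Plain substring matching stands in for real substructure search, and the duplicate check assumes the SMILES strings have already been canonicalized by a cheminformatics toolkit.

```python
from collections import defaultdict

# Hypothetical class rules: class name -> SMILES fragment that must occur.
# Substring matching is a crude stand-in for real substructure search;
# for example, "C(=O)O" also matches esters, not only carboxylic acids.
CLASS_RULES = {
    "carboxylic acid": "C(=O)O",
    "phosphate": "P(=O)(O)O",
}


def classify(smiles):
    """Return the sorted class names whose fragment occurs in the SMILES string."""
    return sorted(name for name, frag in CLASS_RULES.items() if frag in smiles)


def find_duplicates(compounds):
    """Group compound names that share the same SMILES string.

    Assumes the SMILES strings are already canonical, so that identical
    structures are guaranteed to have identical strings.
    """
    by_smiles = defaultdict(list)
    for name, smiles in compounds.items():
        by_smiles[smiles].append(name)
    return [sorted(names) for names in by_smiles.values() if len(names) > 1]


# Illustrative data only: compound name -> SMILES string.
compounds = {
    "glycine": "NCC(=O)O",
    "aminoacetic acid": "NCC(=O)O",
    "acetic acid": "CC(=O)O",
    "ethanol": "CCO",
}

for name, smiles in compounds.items():
    print(name, classify(smiles))
print(find_duplicates(compounds))
```

In this sketch, a reaction could be viewed at a more abstract level by replacing each participant with its class labels, and the groups returned by `find_duplicates` are candidates for a curator to review rather than confirmed duplicates.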