Diagnostic and Therapeutic Endoscopy
Volume 2012 (2012), Article ID 418037, 9 pages
Review Article

A Review of Machine-Vision-Based Analysis of Wireless Capsule Endoscopy Video

Department of Computer Science and Engineering, School of Engineering, University of Bridgeport, Bridgeport, CT 06604, USA

Received 25 July 2012; Revised 20 September 2012; Accepted 18 October 2012

Academic Editor: Klaus Mönkemüller

Copyright © 2012 Yingju Chen and Jeongkyu Lee. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Wireless capsule endoscopy (WCE) enables a physician to examine a patient's digestive system without surgical procedures. However, it takes a gastroenterologist 1-2 hours to review the resulting video. To speed up the review process, computer science researchers have proposed a number of analysis techniques based on machine vision. To train a machine to understand the semantics of an image, the image contents must first be translated into numerical form, known as image abstraction. The selection of relevant image features is often determined by the modality of the medical images and the nature of the diagnosis. For example, there are radiographic projection-based images (e.g., X-rays and PET scans), tomography-based images (e.g., MRI and CT scans), and photography-based images (e.g., endoscopy, dermatology, and microscopic histology). Each modality imposes unique image-dependent restrictions on automatic and medically meaningful image abstraction. In this paper, we review the current development of machine-vision-based analysis of WCE video, focusing on research that identifies specific gastrointestinal (GI) pathology and on methods of shot boundary detection.
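
To make the two central terms concrete, the following minimal sketch shows one possible image abstraction (a coarse, normalized RGB color histogram) and a naive shot boundary detector that flags a boundary wherever consecutive frame histograms differ sharply. It is an illustration under stated assumptions, not a method taken from the reviewed literature: the frame representation (a flat sequence of RGB tuples), the function names, the bin count, and the threshold value are all hypothetical choices.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each in 0..255 (assumed representation)
Frame = Sequence[Pixel]        # a frame flattened to a sequence of pixels


def color_histogram(frame: Frame, bins_per_channel: int = 4) -> List[float]:
    """Abstract a frame into a normalized joint RGB histogram."""
    if not frame:
        raise ValueError("frame must contain at least one pixel")
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in frame:
        # Map each 0..255 channel value to a bin index in 0..bins_per_channel-1.
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1.0
    total = float(len(frame))
    return [count / total for count in hist]


def histogram_distance(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Half the L1 distance between two normalized histograms, in [0, 1]."""
    return 0.5 * sum(abs(a - b) for a, b in zip(h1, h2))


def shot_boundaries(frames: Sequence[Frame], threshold: float = 0.5) -> List[int]:
    """Return each index i where frame i differs sharply from frame i - 1.

    The threshold is an illustrative assumption, not a recommended setting.
    """
    hists = [color_histogram(f) for f in frames]
    return [
        i for i in range(1, len(hists))
        if histogram_distance(hists[i - 1], hists[i]) > threshold
    ]
```

A color histogram discards spatial layout, so this sketch responds only to global color shifts between frames; texture and shape features would need a different abstraction.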