Abstract

A common problem in vision systems is coping with the variety of scenes in which the object of interest must be detected. In general, varying backgrounds significantly degrade the performance of vision systems. In this work, we design a sliding window machine learning system for the recognition and detection of left ventricles in cardiac MR images. We leverage the capability of artificial neural networks to cope with some of the inevitable scene constraints encountered in medical object detection tasks. We reformulate left ventricle detection as a machine learning problem and treat it as a binary classification task: collected left ventricle samples form one class, and random (nonleft ventricle) objects form the other. A backpropagation neural network is trained on these samples, and its generalization is validated on a held-out test set. Recognition rates of 100% and 88% are achieved on the training and test sets, respectively. The trained network is then used to determine whether a sampled region of a target image contains a left ventricle. Lastly, we show the effectiveness of the proposed system by comparing its automatic detections with left ventricles manually delineated by medical experts.

1. Introduction

Machine learning (ML) is a form of artificial intelligence (AI) that gives computers the ability to learn without being explicitly programmed. It focuses on building computer programs that change when exposed to new data. Machine learning can be classified as either supervised or unsupervised. Supervised algorithms apply what was learned from past examples to new data, whereas unsupervised algorithms draw conclusions from datasets [15].

The field of medical imaging has been slower than other fields to embrace novel ML techniques. Although machine learning is relatively new, its concepts have been applied to medical imaging for years, particularly in computer-aided diagnosis (CAD) and functional brain mapping [6]. Components of medical imaging such as image analysis and reconstruction stand to benefit from the integration of machine learning. From this perspective, new methods for image reconstruction and exceptional performance in both clinical and preclinical applications are expected [6]. A study [7] sees machine learning as a major tool for current CAD. Previous knowledge acquired from examples provided by medical experts has helped in areas such as image registration, image fusion, segmentation, and other analysis steps towards accurately describing the initial data and CAD goals. Other applications of machine learning in medical imaging include, but are not limited to, tumour classification, tumour diagnosis, image segmentation, image reconstruction, and prediction [3, 6, 7].

In this research, we focus on detection tasks performed by artificial systems (machines). Such systems are required to “look” at an image, determine whether a particular object of interest is contained anywhere in it, and locate that object. Medical object detection traditionally belongs to the class of computer vision problems. Humans are very effective and efficient at detecting complex objects irrespective of scene constraints such as varying background, object scale, positional translation, orientation, and illumination, whereas machines still struggle to achieve near-human performance. Furthermore, object detection is considerably more challenging for machines than object recognition. In object detection, the object of interest can be positioned in any region of an image, while in object recognition the object is usually already segmented, which makes recognition less challenging. To succeed at object detection, vision systems should be capable of coping with the aforementioned scene constraints. More importantly, in robotic systems, medical object detection tasks are very delicate and require the utmost accuracy. Since robotic systems usually interact with their environments in near real time, the consequences of wrongly detecting objects of interest can be very grave.

In this paper, we design a sliding window based machine learning system for the detection of left ventricles in MRI slices. It is important to note that while any other object could have been used to demonstrate the effectiveness of the designed system, we found the detection of left ventricles in highly varying and unconstrained images sufficient. Furthermore, it will be seen later on in this work that the approach implemented for the detection of left ventricles in images can be easily extended and modified to realize the detection of other objects in images.

2. Sliding Window Machine Learning Approach

In the sliding window approach, a window of suitable size is chosen to perform a search over the target image [8, 9]. First, a classifier is trained on a collection of training samples in which the object of interest forms one class and random objects form the other. Formally, samples of the object of interest are referred to as positive examples, while random object samples of no interest are referred to as negative examples. For a single-object detection task, the idea is to train a binary classifier that determines whether the presented object is “positive” or “negative.” The trained classifier can then be used to “inspect” a target image by sampling it, starting from the top-left corner. Note that the input dimension of the trained classifier is generally a fraction of the dimension of the target image, which is what makes sampling of the target image possible.
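The scan described above can be sketched as a generator of window positions. This is a minimal illustration, not the authors' implementation: the function name `window_origins` and the `stride` parameter are assumptions introduced here. With a stride equal to the window size the windows do not overlap; along each axis the number of positions is floor((image size - window size) / stride) + 1.

```python
from typing import Iterator, Tuple


def window_origins(img_h: int, img_w: int, win_h: int, win_w: int,
                   stride: int) -> Iterator[Tuple[int, int]]:
    """Yield the (row, col) of each window's top-left corner, row by row.

    The scan starts at the top-left corner of the image and never lets
    the window extend past the image edges.
    """
    if stride <= 0:
        raise ValueError("stride must be positive")
    for row in range(0, img_h - win_h + 1, stride):
        for col in range(0, img_w - win_w + 1, stride):
            yield row, col


# Toy example: a 6 x 6 image scanned with a 2 x 2 window.
non_overlapping = list(window_origins(6, 6, 2, 2, stride=2))
overlapping = list(window_origins(6, 6, 2, 2, stride=1))
print(non_overlapping[0], len(non_overlapping), len(overlapping))
```

Each yielded position would then be cropped from the image and passed to the trained binary classifier.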

Classifiers that have found applications in object detection include deep neural networks (DNNs), convolutional neural networks (CNNs), and decision trees (DTs) [2, 10, 11]. In selecting a suitable classifier, the support vector machine (SVM), a maximum margin classifier, would be an obvious choice were it not for its long training time compared with the backpropagation neural network (BPNN) and decision trees. Decision trees usually have the shortest training time of the three; however, they tend to quickly overfit, or “memorize,” the training data, so their performance on the test set (unseen examples) is not competitive. The BPNN offers a reasonable trade-off between training time and generalization power: its training time lies between those of the SVM and the decision tree, and its generalization performance is better than that of decision trees and competitive with that of the SVM [12, 13]. Hence, in this work, the backpropagation neural network is used as the classifier for the object detection task.

3. The Proposed Automatic Left Ventricle Detection System

The aim of this work is to develop an artificial vision system that can detect the left ventricle in images. Challenges such as object illumination, scale, translation, and rotation make this open detection problem complex, so we implement an intelligent system that can cope gracefully with these constraints. A neural network, namely, the backpropagation neural network (BPNN), is used in this work as the “brain” behind the detection.

This work is carried out in two phases. The first is the left ventricle recognition phase, in which a backpropagation neural network (BPNN) is trained. The second is the detection of left ventricle objects in MRI slices using the trained BPNN. The flowchart for the system is shown in Figure 1, and both phases are briefly described below.

3.1. Phase 1: Left Ventricle Object Recognition

In this phase, a backpropagation neural network is trained to distinguish left ventricle objects from nonleft ventricle objects. For this binary classification task, training data are collected to span both left ventricle and nonleft ventricle images. The training and testing data are obtained from the Sunnybrook Cardiac Data (SCD) [14]. The dataset contains 45 cine-MRI slices collected from a mix of patients and different pathologies such as healthy, hypertrophy, heart failure with infarction, and heart failure without infarction. A subset of 100 images was used for training and testing the proposed system in both phases. Since the aim is to develop a system that recognizes left ventricle objects, MRI slices were cropped to contain only the left ventricle; these crops are referred to as positive examples or samples. Conversely, images containing random nonleft ventricle content are referred to as negative examples or samples. In earlier training runs, there was no constraint on the contents of the negative examples except that they contain no left ventricle. However, it was found that negative images can be collected by cropping other parts of the whole cardiac MRI slice, excluding the left ventricle. This appears to improve the robustness of the system in distinguishing left ventricle from nonleft ventricle images. Cropping of ventricle and nonventricle images from a cardiac MR image is shown in Figure 2.

3.1.1. Image Processing

Since the positive and negative examples are cropped manually, they are of different sizes. Thus, to make the images consistent, they are all resized to 40 × 40 pixels (1600 pixels). Samples of positive and negative examples are shown in Figure 3.
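The resizing step can be illustrated with a short sketch. The text does not state which interpolation method was used, so nearest-neighbour sampling is assumed here purely for simplicity; the function name `resize_nearest`, the 55 × 70 crop size, and the 40 × 40 target (a square window of 1600 pixels) are illustrative assumptions.

```python
from typing import List

Image = List[List[float]]  # grayscale image stored as rows of pixel values


def resize_nearest(image: Image, out_h: int, out_w: int) -> Image:
    """Resize a 2-D image with nearest-neighbour sampling (assumed method)."""
    in_h, in_w = len(image), len(image[0])
    return [
        # Map each output pixel back to the source pixel it falls on.
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


# Hypothetical manual crop of 55 x 70 pixels, resized to 40 x 40.
crop = [[float(r * 70 + c) for c in range(70)] for r in range(55)]
resized = resize_nearest(crop, 40, 40)
print(len(resized), len(resized[0]), sum(len(row) for row in resized))
```

Whatever the interpolation method, the point is that every crop ends up with the same 1600 pixel values, so each one can feed a fixed-size network input.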

3.1.2. Backpropagation Neural Network (BPNN) Design, Training, and Testing

A backpropagation neural network is trained on the collected samples spanning both positive and negative examples. For the positive examples (left ventricles), 100 samples cropped from different cardiac MRI slices are used, while 200 samples are used for the negative examples (nonleft ventricles). There are more negative images because a single MR image can provide many crops that exclude the left ventricle. Together, the positive and negative samples form the training and testing data for the designed BPNN. All images are first rescaled to 40 × 40 pixels (1600 pixels). The whole dataset is then divided into training and testing data. The testing data allow the performance of the trained BPNN to be observed on unseen data; good performance on unseen data, that is, generalization, is highly desirable. For training, 75 left ventricles and 175 nonleft ventricles are used, while 25 left ventricles and 25 nonleft ventricles are used for testing. Hence, there are a total of 250 training images and 50 testing images.
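The split can be checked with a few lines of code. Holding out 25 samples per class from 100 positives and 200 negatives leaves 75 and 175 samples for training, which matches the stated totals of 250 training and 50 testing images. The sketch below is illustrative only: the helper `split_class`, the random shuffling, and the fixed seed are assumptions, since the text does not say how the held-out samples were chosen.

```python
import random
from typing import List, Tuple

Sample = Tuple[str, int]  # (class label, sample index), placeholders for images


def split_class(samples: List[Sample], n_test: int,
                rng: random.Random) -> Tuple[List[Sample], List[Sample]]:
    """Shuffle one class and hold out n_test samples for testing."""
    shuffled = samples[:]
    rng.shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]


rng = random.Random(0)  # fixed seed so the split is repeatable
positives = [("lv", i) for i in range(100)]      # left ventricle crops
negatives = [("non_lv", i) for i in range(200)]  # nonleft ventricle crops

pos_train, pos_test = split_class(positives, 25, rng)
neg_train, neg_test = split_class(negatives, 25, rng)
train_set = pos_train + neg_train
test_set = pos_test + neg_test
print(len(pos_train), len(neg_train), len(train_set), len(test_set))
```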

(a) Input Data and Neurons. Since the training images are now 40 × 40 pixels, the designed BPNN has 1600 input neurons, and each input attribute (pixel) is fed into one of them. Note that the input neurons are nonprocessing; that is, they simply receive input pixels and pass them to the hidden layer neurons, which are processing neurons.

(b) Hidden Layer Neurons. The hidden layer extracts the features of the input data that allow the inputs to be mapped to their corresponding target classes. Unlike the input layer neurons, the hidden layer neurons are processing neurons, and each receives inputs from all the input layer neurons. In this work, several experiments were carried out to determine a suitable number of hidden layer neurons; the number finally obtained was 80.

(c) Output Layer Coding. Since we aim to classify every image as either a left ventricle object or a nonleft ventricle object, the BPNN has two output neurons. The output of the BPNN is coded such that the output neuron activations are as follows:
(i) [1 0] → a left ventricle object
(ii) [0 1] → a nonleft ventricle object.

Figure 4 shows the designed BPNN. The BPNN is trained on the processed images described in Figure 3. The final training parameters are shown in Table 1.
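A forward pass through a network of this shape (1600 inputs, 80 hidden neurons, 2 outputs) can be sketched as follows. This is an untrained illustration under stated assumptions: the weights are random, the log-sigmoid activation named in the text is assumed for both the hidden and output layers, and the helper names (`logsig`, `init_layer`, `layer_forward`, `forward`) are introduced here for illustration.

```python
import math
import random
from typing import List

Weights = List[List[float]]  # one row per neuron; the last entry is the bias


def logsig(x: float) -> float:
    """Log-sigmoid activation, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def init_layer(n_out: int, n_in: int, rng: random.Random) -> Weights:
    """Small random weights (an assumption; the initialization is not stated)."""
    return [[rng.uniform(-0.05, 0.05) for _ in range(n_in + 1)]
            for _ in range(n_out)]


def layer_forward(weights: Weights, inputs: List[float]) -> List[float]:
    """Weighted sum plus bias, followed by log-sigmoid, for every neuron."""
    return [logsig(sum(w * x for w, x in zip(row, inputs)) + row[-1])
            for row in weights]


def forward(hidden_w: Weights, output_w: Weights,
            pixels: List[float]) -> List[float]:
    """Input neurons only pass pixels on; hidden and output neurons process."""
    return layer_forward(output_w, layer_forward(hidden_w, pixels))


rng = random.Random(0)
hidden_w = init_layer(80, 1600, rng)  # 1600 inputs -> 80 hidden neurons
output_w = init_layer(2, 80, rng)     # 80 hidden -> 2 output neurons
patch = [rng.random() for _ in range(1600)]  # stand-in for a 1600-pixel image
out = forward(hidden_w, output_w, patch)
print(len(out), all(0.0 < o < 1.0 for o in out))
```

After training, the first output would be expected to be high for left ventricle inputs and the second for nonleft ventricle inputs.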

The log-sigmoid activation function constrains each neuron’s output to the range 0 to 1. From Table 1, it is seen that the BPNN reaches the required error of 0.01 in 40 s, after 1,215 epochs. The learning curve for the BPNN is shown in Figure 5.
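How backpropagation training runs until an error goal is met can be illustrated on a toy problem. The sketch below is not the authors' MATLAB code: it assumes plain per-sample gradient descent on the squared error, a tiny 2-3-2 network, made-up data, and a learning rate of 0.5. Only the log-sigmoid activation and the 0.01 error goal are taken from the text.

```python
import math
import random


def logsig(x):
    """Log-sigmoid activation with output in (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def train_bpnn(samples, n_hidden, lr, goal, max_epochs, seed=0):
    """Train a one-hidden-layer network; return weights and the MSE per epoch."""
    rng = random.Random(seed)
    n_in, n_out = len(samples[0][0]), len(samples[0][1])
    # Each weight row stores the neuron's bias as its last element.
    w_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
           for _ in range(n_hidden)]
    w_o = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
           for _ in range(n_out)]
    history = []
    for _ in range(max_epochs):
        sq_err = 0.0
        for x, target in samples:
            # Forward pass through hidden and output layers.
            h = [logsig(sum(w * v for w, v in zip(row, x)) + row[-1])
                 for row in w_h]
            y = [logsig(sum(w * v for w, v in zip(row, h)) + row[-1])
                 for row in w_o]
            # Output deltas: error times the log-sigmoid derivative y * (1 - y).
            d_o = [(t - o) * o * (1.0 - o) for t, o in zip(target, y)]
            # Hidden deltas: output deltas propagated back through w_o.
            d_h = [hj * (1.0 - hj) * sum(d_o[k] * w_o[k][j] for k in range(n_out))
                   for j, hj in enumerate(h)]
            # Gradient-descent updates for output and hidden weights.
            for k in range(n_out):
                for j in range(n_hidden):
                    w_o[k][j] += lr * d_o[k] * h[j]
                w_o[k][-1] += lr * d_o[k]
            for j in range(n_hidden):
                for i in range(n_in):
                    w_h[j][i] += lr * d_h[j] * x[i]
                w_h[j][-1] += lr * d_h[j]
            sq_err += sum((t - o) ** 2 for t, o in zip(target, y))
        history.append(sq_err / (len(samples) * n_out))
        if history[-1] <= goal:  # stop once the error goal is reached
            break
    return w_h, w_o, history


# Made-up, linearly separable toy data with [1 0] / [0 1] style targets.
toy = [([0.1, 0.9], [1.0, 0.0]), ([0.9, 0.1], [0.0, 1.0]),
       ([0.2, 0.8], [1.0, 0.0]), ([0.8, 0.3], [0.0, 1.0])]
_, _, history = train_bpnn(toy, n_hidden=3, lr=0.5, goal=0.01, max_epochs=5000)
print(len(history), history[-1] < history[0])
```

The list `history` plays the role of a learning curve: one mean squared error value per epoch, ending either at the goal or at the epoch limit.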

The trained BPNN is then tested using the training and testing data. Table 2 shows the recognition rates of the BPNN on the training and testing data.

It is seen in Table 2 that the BPNN achieved a recognition rate of 100% and 88% on the training and testing data, respectively. Note that a testing recognition rate of 88% is enough to show that the BPNN can generalize well on unseen data (images), that is, classifying new images as left ventricle or nonleft ventricle.
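The recognition rate is the percentage of images whose predicted class matches the target class. As a hedged illustration, the sketch below assumes the predicted class is the output neuron with the larger activation and uses made-up network outputs; with 50 test images, a rate of 88% corresponds to 44 correct classifications. The function name `recognition_rate` is hypothetical.

```python
from typing import List


def recognition_rate(outputs: List[List[float]],
                     targets: List[List[float]]) -> float:
    """Percentage of samples whose most active output matches the target."""
    def argmax(values: List[float]) -> int:
        return max(range(len(values)), key=values.__getitem__)

    correct = sum(argmax(o) == argmax(t) for o, t in zip(outputs, targets))
    return 100.0 * correct / len(targets)


# 25 left ventricle targets and 25 nonleft ventricle targets.
targets = [[1.0, 0.0]] * 25 + [[0.0, 1.0]] * 25
# Made-up outputs: the first 44 are classified correctly, the last 6 are not.
outputs = [[0.9, 0.1] if t[0] == 1.0 else [0.1, 0.9] for t in targets[:44]]
outputs += [[0.8, 0.2]] * 6  # nonleft ventricle images mistaken for left ventricles
print(recognition_rate(outputs, targets))
```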

3.2. Phase 2: Left Ventricle Detection from Images

In this phase, the trained BPNN is used to detect left ventricles in images that vary in content, background, illumination, scale, and so on. To detect left ventricles in new images, the images are sampled in a nonoverlapping fashion using a sliding window or mask. First, all images in which left ventricles are to be detected are rescaled to 120 × 120 pixels; this significantly reduces the number of samplings and therefore the computation required. Note that the new image size is chosen such that the input field (40 × 40 pixels) of the trained BPNN fits without falling off the image edges.

It follows that if an image is rescaled to 120 × 120 pixels and a sliding window of 40 × 40 pixels is used for nonoverlapping sampling, 3 samplings are obtained along the x-axis and 3 along the y-axis, for a total of 9 samplings per image. Figure 6 illustrates the sampling technique.

Each sample obtained with the 40 × 40 pixel sliding window (1600 pixels) is supplied as input to the trained BPNN, as shown in Figure 6. For windows containing a left ventricle, the BPNN is expected to give an output of [1 0], as coded during training; for windows not containing a left ventricle, it is expected to give an output of [0 1]. With the sampling approach described above, 9 samplings (patches), and therefore 9 predictions, are made for any target image. The patch whose BPNN output most closely matches the desired left ventricle output, [1 0], that is, the patch with the maximum activation value for neuron 1 in Figure 4, is selected as containing the left ventricle. Complete detection of left ventricles in images is thus achieved by combining phases 1 and 2 into one module.
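The sampling arithmetic can be made concrete with a short sketch. It assumes a 40 × 40 window (1600 pixels, matching the BPNN input) and a 120 × 120 target image, which gives 120 / 40 = 3 positions per axis and 3 × 3 = 9 patches. The function name `extract_patches` and the synthetic image are illustrative assumptions.

```python
from typing import List, Tuple

Image = List[List[float]]
Patch = Tuple[Tuple[int, int], List[float]]  # ((row, col), flattened pixels)


def extract_patches(image: Image, win: int) -> List[Patch]:
    """Cut an image into nonoverlapping win x win patches, row by row.

    Each patch is flattened into a single list so that it can be fed
    directly to the input neurons of the classifier.
    """
    height, width = len(image), len(image[0])
    patches = []
    for row in range(0, height - win + 1, win):
        for col in range(0, width - win + 1, win):
            flat = [image[r][c]
                    for r in range(row, row + win)
                    for c in range(col, col + win)]
            patches.append(((row, col), flat))
    return patches


# Synthetic 120 x 120 image with pixel values scaled to [0, 1].
image = [[(r * 120 + c) / 14399.0 for c in range(120)] for r in range(120)]
patches = extract_patches(image, 40)
print(len(patches), len(patches[0][1]))
print([origin for origin, _ in patches])
```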
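The selection rule can be sketched in a few lines: among the 9 patch predictions, the patch with the highest activation on the first output neuron is taken as the detection, and its position gives the bounding box. The network outputs below are made up for illustration, the 40-pixel window size is an assumption consistent with a 1600-pixel input, and the function name `detect_left_ventricle` is hypothetical.

```python
from typing import List, Tuple


def detect_left_ventricle(outputs: List[List[float]],
                          origins: List[Tuple[int, int]],
                          win: int) -> Tuple[int, Tuple[int, int, int, int]]:
    """Pick the patch with the highest neuron-1 activation.

    Returns the patch index and its bounding box as (x, y, width, height).
    """
    best = max(range(len(outputs)), key=lambda i: outputs[i][0])
    row, col = origins[best]
    return best, (col, row, win, win)


# Top-left corners of the 9 nonoverlapping windows, in scan order.
origins = [(r, c) for r in (0, 40, 80) for c in (0, 40, 80)]
# Made-up BPNN outputs [neuron 1, neuron 2] for each patch.
outputs = [[0.05, 0.96], [0.10, 0.91], [0.02, 0.97],
           [0.21, 0.80], [0.35, 0.66], [0.93, 0.08],
           [0.07, 0.95], [0.12, 0.90], [0.04, 0.97]]
index, box = detect_left_ventricle(outputs, origins, 40)
print(index, box)  # 5 (80, 40, 40, 40)
```

Because exactly one patch is always selected, this rule presumes that the target image contains a left ventricle.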

4. Performance Evaluation

An example of left ventricle detection using the developed system, for the image shown in Figure 6, is shown in Figure 7. The detected left ventricle is highlighted by a rectangular bounding box. More examples of left ventricle localization in different types of MR images are shown in Figures 8 and 9.

Some instances where the developed system failed to correctly detect the left ventricle are shown in Figure 10.

Generally, most state-of-the-art approaches to left ventricle detection can be considered variations of active contour and segmentation models [15–19]. These models are meant to segment the endocardium and epicardium of the left ventricle. In contrast, our proposed model performs a general square detection, or localization, of the left ventricle in an MRI slice. It is mainly a machine learning approach that aims to evaluate the effectiveness of a simple backpropagation neural network in sampling an MRI slice to find a left ventricle object using a sliding window. The goal is not to accurately segment the edges of the left ventricle but to find and localize the left ventricle as an object in the image. Therefore, in some images, a small part of the left ventricle may fall outside the detected window; this is still considered a correct detection, since the application is to localize the left ventricle. Hence, our results cannot be directly compared with other research findings, since the approach and the techniques used are entirely different.

To validate the capability of the developed system, left ventricles in some MRI slices were manually detected by medical experts. The idea is to compare the network detection with the expert manual detection and check whether the left ventricle fits into the detected square in the target image. In other words, we check how accurately the system detects the left ventricle by comparing the detected area with the left ventricle highlighted by the medical expert. Figure 11 shows some images with both the manual and the network detections of the left ventricle.
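One way to quantify whether the expert-marked left ventricle fits into the detected square is the fraction of the expert region that lies inside it. The text describes this comparison visually and does not give a numeric metric, so the sketch below is an assumption throughout: the expert contour is approximated by its bounding box, the boxes are hypothetical, and `coverage` is an illustrative name.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


def coverage(expert: Box, detected: Box) -> float:
    """Fraction of the expert box area that falls inside the detected box."""
    ex, ey, ew, eh = expert
    dx, dy, dw, dh = detected
    # Width and height of the overlap, clamped at zero when boxes are disjoint.
    overlap_w = max(0, min(ex + ew, dx + dw) - max(ex, dx))
    overlap_h = max(0, min(ey + eh, dy + dh) - max(ey, dy))
    return (overlap_w * overlap_h) / (ew * eh)


detected_box = (40, 40, 40, 40)        # hypothetical 40 x 40 detection window
fully_inside = (45, 50, 30, 25)        # expert box entirely within the window
partly_inside = (70, 50, 30, 25)       # expert box straddling the window edge
print(coverage(fully_inside, detected_box))   # 1.0
print(round(coverage(partly_inside, detected_box), 3))  # 0.333
```

A threshold on this fraction could express the idea that a detection missing only a small part of the left ventricle still counts as correct; the text does not specify such a threshold.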

5. Results Discussion

Since artificial neural network weights are usually randomly initialized at the start of training, a trained BPNN is not guaranteed to converge to the global minimum or to a good local minimum. Consequently, the learning of left ventricles and nonleft ventricles can be negatively affected; this in turn affects the detection phase, where the trained BPNN may wrongly predict that a sampling window or patch contains a left ventricle. To address this problem, the MATLAB program contains instructions to retrain the BPNN until a testing recognition rate (which reflects the generalization capability of the BPNN) of greater than 80% is obtained. This greatly reduces the probability that the BPNN wrongly predicts a sampling window (patch) as containing a left ventricle. In this work, we allow a maximum of 30 retraining schedules of the BPNN. Therefore, when the MATLAB script for the whole detection system is run, the BPNN may be automatically retrained a few times before the detection task is executed.
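The retraining logic can be sketched as a simple loop. This is a Python illustration rather than the MATLAB script itself: `train_and_test` is a hypothetical callback that trains a freshly initialized network and returns it with its testing recognition rate, counting the limit of 30 as total training runs is an assumption, and returning the best network when no run exceeds the threshold is also an assumption, since the text does not say what happens in that case.

```python
from typing import Any, Callable, Tuple


def train_until_acceptable(train_and_test: Callable[[int], Tuple[Any, float]],
                           threshold: float = 80.0,
                           max_runs: int = 30) -> Tuple[Any, float, int]:
    """Retrain until the testing recognition rate exceeds the threshold.

    Returns (model, rate, run number). If no run exceeds the threshold,
    the best model seen within max_runs is returned instead.
    """
    best = None
    for run in range(1, max_runs + 1):
        model, rate = train_and_test(run)
        if best is None or rate > best[1]:
            best = (model, rate, run)
        if rate > threshold:
            return model, rate, run
    return best


# Stub standing in for real training: made-up recognition rates per run.
fake_rates = [72.0, 78.0, 86.0, 90.0]


def fake_train_and_test(run: int) -> Tuple[str, float]:
    return "network_%d" % run, fake_rates[run - 1]


print(train_until_acceptable(fake_train_and_test))  # ('network_3', 86.0, 3)
```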

Another challenge is that even after the BPNN achieves a testing recognition rate of greater than 88%, sampling windows can still be wrongly classified, though the probability of this happening is quite small. In this work, it is found that when the BPNN achieves a testing recognition rate of greater than 88%, a maximum of 3 retraining schedules is required to correctly detect a left ventricle in the target image.

This work addresses a highly challenging task in computer vision: medical object detection. We show that a backpropagation neural network (BPNN) can be employed to learn the robust recognition of left ventricles and nonleft ventricles as positive and negative training examples, respectively. The trained BPNN is then used in a nonoverlapping sampling fashion to “inspect” target images containing left ventricles. The developed system is tested and found to be very effective in detecting left ventricles in images containing other objects. Importantly, scene constraints such as translation and scale only slightly affect the overall efficiency of the system.

6. Conclusion

In this research, an artificial vision system for left ventricle detection has been developed. The work is broader than the detection of left ventricles alone, since the same approach can be used to detect other objects. Considering the breadth of scenes and environments in which the developed system may be deployed, we chose to reformulate the detection task as a machine learning problem. This provides some robustness to scene constraints that might otherwise make the system error-prone on the detection task. A backpropagation neural network (BPNN) has been used as the learning system in this research. The BPNN is trained on samples of left ventricles and random nonleft ventricles cropped from the same MRI slices. For the detection of left ventricles in target images, a window (40 × 40 pixels) matching the input size of the BPNN is used to sample the target image in a nonoverlapping fashion. The developed system was tested on randomly collected target images containing left ventricles, with no constraints on the scale, translation, illumination, or orientation of the left ventricles. It is seen to perform quite well in detecting left ventricles, including in scenarios where they are partially occluded. Furthermore, to show the effectiveness and accuracy of the proposed system, its detections were compared with left ventricles contoured by medical experts.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

The authors would like to thank Dr. Hadi Sasani for comments on earlier versions of this paper.