Abstract

Fish species identification and automated fish freshness assessment play important roles in fishery industry applications. This paper describes a method based on support vector machines (SVMs) to improve the performance of fish identification systems. The identification result is then used in the assessment of fish freshness with an artificial neural network (ANN). Identification of the fish species involves processing images of the fish. The most efficient features were extracted and combined with a down-sampled version of the images to create a 1D input vector. Applying the Max-Win algorithm to the SVM-based classifiers raised the reliability of sorting to 96.46%. A Cyranose 320 electronic nose (E-nose) was used to evaluate fish freshness in real time. Intelligent processing of the sensor patterns involves a dedicated ANN for each species under study. The best estimation of freshness was provided by the most sensitive sensors. Data were collected from four selected species of fish over a period of ten days. It was concluded that performance can be increased by using an individually trained ANN for each species. The proposed system identified the number of days after catching the fish with an accuracy of up to 91%.

1. Introduction

Automatic fish sorting by species is an important process for many fishery applications such as freshness assessment, marine ecology studies, and automated logging of the catch on commercial and research fishing vessels. In this work, fish identification is done with the aim of improving the accuracy of the freshness assessment system. Traditionally, the patterns of fish skin as well as the whole shape have been used by fishery researchers to identify the fish. However, the lighting conditions, the freshness of the fish, and changes in skin colour all influence fish type identification. This makes it difficult to classify the fish species manually and to interpret the findings correctly.

Using whole fish images as inputs to the training system was not effective: memory usage is far from optimal and the fish type is not easily recognizable. Fish skin images could be incorporated in the proposed fish identification system as a supplement, but not as the necessary identification input. The methodology adopted in the designed system is fish type identification based on fish eye images and useful parameters extracted from these images.

Support vector machine (SVM) has proven to be an effective tool for pattern recognition [1]. The typical method for applying SVM to multiclass problems (N classes) is to construct N number of binary-SVM classifiers, each of which identifies one class among N different classes. These types of SVMs are referred to as 1-v-r (short for one-versus-rest). Another typical method is to combine all possible binary (two-class) classifiers. In this case, for N-class problems, N(N-1)/2 classifiers should be trained. These types of SVMs are referred to as 1-v-1 (short for one-versus-one) [2]. We adopted 1-v-1 technique for this project to identify the fish species due to its superior accuracy.
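The difference in the number of binary classifiers between the two schemes can be illustrated with a short sketch. The helper names below are hypothetical and only enumerate the binary tasks; no classifier is trained.

```python
from itertools import combinations

def one_vs_rest_tasks(classes):
    """1-v-r: one binary task per class, that class against all the others."""
    return [(c, tuple(o for o in classes if o != c)) for c in classes]

def one_vs_one_tasks(classes):
    """1-v-1: one binary task per unordered pair of classes."""
    return list(combinations(classes, 2))

# The four species used in this study.
species = ["Red Snapper", "Gurnard", "Tarakihi", "Trevally"]

n_rest = len(one_vs_rest_tasks(species))  # N tasks
n_pair = len(one_vs_one_tasks(species))   # N(N-1)/2 tasks
print(n_rest, n_pair)
```

For the four species considered here, the 1-v-1 scheme needs six pairwise classifiers, against four for 1-v-r.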

Once the species of the fish has been identified, its freshness is assessed. Fish quality is a complex concept involving a whole range of factors, freshness being one of the most important. It indicates the degree of various physical, chemical, biochemical and microbiological changes in the fish. Of the various sensory and instrumental methods adopted to ascertain the freshness of fish, E-nose-based methods continue to be the primary choice [3]. An E-nose system comprises a sensor array, interfacing electronic circuitry, a sniffing mechanism and a pattern-recognition unit that acts as an odour classifier.

The odour that emanates from the fish when detected by the E-nose generates a characteristic pattern or sensor print. Collecting the response from each sensor, as they convert the chemical reaction into an electrical signal, results in smell prints. The type of odour and the degree of selectivity of odour depend on the choice and number of sensors in the sensor array. The data processing and pattern recognition techniques can be applied to the sensor signals to differentiate substances or train a system to provide identification based on the collection of known responses.

E-nose has been used for fish freshness monitoring and odour evaluation [4]. Smell detection techniques and other approaches to sensor electronics such as feature extraction and data processing techniques have been employed in freshness assessment of fishery products [5]. Various methodologies used to ascertain the freshness of the fish are briefly outlined below.

A European project that used two types of E-noses based on different sampling procedures and sensor technologies investigated the possibility of developing a multisensor device to measure and estimate fish freshness, combining E-noses, spectroscopic instruments, texture meters, image analysers, colour meters, and measurements of electrical properties [6]. Metallic potentiometric electrodes were employed as sensors to ascertain the freshness of fish. These electrodes were found to be sensitive to various unspecific elements, and hence multivariate analysis techniques were used to draw conclusions from the experiment [7]. E-nose systems have been used to determine the aroma and flavour of tea and spices [8] and to identify typical wine aromas [9] and cigarette brands [10]. In this paper, we extend the use of the E-nose to a more complex medium, namely the determination of fish freshness.

2. Methodology

The block diagram shown in Figure 1 explicitly depicts two processing components of the system. The SVM-based method processes the images of fish eye for identification purpose and the ANN-based method processes the smell prints for fish freshness assessment. Moreover, the information of the SVM classifier is used to enhance the performance of the ANN-based technique as explained below in detail.

2.1. Species Identification

SVM is a supervised learning technique that can be applied to classification or regression [11]. SVM algorithms are based on the statistical learning theory and the Vapnik-Chervonenkis (VC) dimension [12].

These algorithms try to find a unique hyperplane with maximum margin in the SVM models to separate different classes. SVM can also be used in classifying complicated models by employing nonlinear kernels. The role of kernel functions is to perform computations in the original input space rather than in the high-dimensional (even infinite) feature space [11]. Because only the inner product is involved in SVM, learning and predicting are much faster than with a multilayer neural network [5]. Figure 2 shows the structure of an SVM with a test pattern $x$ and the support vectors $x_i$, which are mapped into a feature space with the nonlinear function $\Phi$, and the computed dot products [11].

SVM trains faster than other statistical learning methods and uses less memory. However, in image classification applications, using raw images as inputs for training an SVM can require a large amount of memory and consequently a long training time. Hence, in this study, we used features of the fish eye and fish skin images as part of the SVM training input instead of the raw images.

The optimum separation hyperplane (OSH) is the hyperplane with the maximum margin for a given finite set of learning patterns [11]. A linear SVM performs binary pattern classification by finding the OSH function $f(x) = w^{T}x + b$. The parameters $w$ and $b$ are determined by learning from examples; $w^{T}$ is the transpose of $w$. Suppose that linearly separable data $x_i$ and their class labels $y_i$ ($y_i \in \{-1, +1\}$) are given. Among the possible linear discriminant functions that classify the given data without errors, the classifier function aims to find an OSH as the surface that separates the two classes of patterns with maximal margin (the maximal distance between the surface and the nearest data point of each class) [5]. The points nearest to the hyperplane are called support vectors, and the resultant hyperplane is called the OSH. The optimal weight vector, $w$, is defined by

$$w = \sum_{i} \alpha_i y_i x_i .$$

Therefore, the equation of the hyperplane separating the two classes is given by

$$f(x) = \sum_{i} \alpha_i y_i x_i^{T} x + b = 0 .$$

The coefficients $\alpha_i$ are determined by solving a quadratic programming problem; $\alpha_i$ is nonzero only for support vectors. This means that the parameter $w$ of a linear SVM is given by a linear combination of support vectors.
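As a minimal numerical sketch of the statement that the weight vector is a linear combination of support vectors, $w = \sum_i \alpha_i y_i x_i$, consider a toy problem with one support vector per class. The data, multipliers and bias below are hypothetical values chosen by hand so that the margin conditions hold; they are not taken from the experiments.

```python
import numpy as np

# Toy problem (hypothetical): one support vector per class.
X_sv = np.array([[1.0, 1.0], [-1.0, -1.0]])  # support vectors x_i
y_sv = np.array([1.0, -1.0])                 # class labels y_i
alpha = np.array([0.25, 0.25])               # multipliers solving this toy case
b = 0.0                                      # bias

# w = sum_i alpha_i * y_i * x_i
w = (alpha * y_sv) @ X_sv

def f(x):
    """Linear discriminant f(x) = w^T x + b."""
    return float(w @ np.asarray(x, dtype=float) + b)

# Support vectors lie exactly on the margin, where |f(x)| = 1.
print(w, f([1.0, 1.0]), f([-1.0, -1.0]))
```

The sign of `f(x)` gives the predicted class, and only the two support vectors contribute to `w`.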

The above learning method can be extended to the case of nonlinear classifiers.

Let $\Phi$ be a nonlinear map from the input space $X$ to some feature space $F$. We can construct a linear classifier in the feature space so that the margin is maximized, as we did in the input space. The discriminant function is given by

$$f(x) = \langle w, \Phi(x) \rangle + b ,$$

where $\langle \cdot , \cdot \rangle$ indicates the inner product in the feature space $F$.

To learn a nonlinear SVM, we do not need to specify the nonlinear mapping $\Phi$ explicitly. What we need is a “kernel function” $k(x, y)$ which gives the inner product in the feature space:

$$k(x, y) = \langle \Phi(x), \Phi(y) \rangle .$$

By introducing a kernel function, a feature space is implicitly specified.
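The idea that a kernel returns an inner product in a feature space without constructing that space can be checked numerically. The sketch below uses a degree-2 polynomial kernel on 2D inputs purely as an illustration; it is an assumption for the example and is not one of the kernels used in this work.

```python
import numpy as np

def phi(x):
    """Explicit feature map for the 2D homogeneous polynomial kernel of degree 2."""
    x1, x2 = x
    return np.array([x1 * x1, np.sqrt(2.0) * x1 * x2, x2 * x2])

def k(x, y):
    """The same quantity computed directly in the input space."""
    return float(np.dot(x, y)) ** 2

x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

lhs = k(x, y)                          # kernel evaluated in the input space
rhs = float(np.dot(phi(x), phi(y)))    # inner product in the feature space
print(np.isclose(lhs, rhs))
```

Both routes give the same number, but the kernel route never forms the three-dimensional feature vectors.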

The decision function for an input vector $x$ is defined as

$$f(x) = \sum_{i \in SV} \alpha_i y_i \, k(x_i, x) + b ,$$

where $x_i$ denotes a learning vector, $SV$ denotes the set of the support vectors, and $k(a, b)$ is the kernel function. The classification result is determined by the sign of the function $f(x)$.
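A minimal sketch of this kernel decision rule, a weighted sum of kernel values over the support vectors followed by a sign test, is given below. The support vectors, multipliers, bias and the RBF width `gamma` are hypothetical values for illustration only.

```python
import numpy as np

def linear_kernel(a, b):
    return float(np.dot(a, b))

def rbf_kernel(a, b, gamma=0.5):
    # gamma is an assumed width parameter, not a value from the paper
    d = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.exp(-gamma * np.dot(d, d)))

def decision(x, support_vectors, labels, alphas, bias, kernel):
    """f(x) = sum over support vectors of alpha_i * y_i * k(x_i, x), plus b."""
    total = sum(a * y * kernel(sv, x)
                for sv, y, a in zip(support_vectors, labels, alphas))
    return total + bias

def classify(x, support_vectors, labels, alphas, bias, kernel):
    """The predicted class is the sign of f(x)."""
    return 1 if decision(x, support_vectors, labels, alphas, bias, kernel) >= 0 else -1

# Hypothetical toy model with one support vector per class.
svs = [np.array([1.0, 1.0]), np.array([-1.0, -1.0])]
ys = [1.0, -1.0]
als = [0.25, 0.25]

print(classify(np.array([2.0, 0.0]), svs, ys, als, 0.0, linear_kernel))
print(classify(np.array([-2.0, 0.0]), svs, ys, als, 0.0, rbf_kernel))
```

Swapping the `kernel` argument changes the implied feature space while the decision rule itself stays the same.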

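The parameters $\alpha_i$ are obtained by solving the following quadratic programming problem:

$$\max_{\alpha} \; \sum_{i} \alpha_i - \frac{1}{2} \sum_{i} \sum_{j} \alpha_i \alpha_j y_i y_j \, k(x_i, x_j) \quad \text{subject to} \quad 0 \le \alpha_i \le C , \quad \sum_{i} \alpha_i y_i = 0 .$$

$C$ is a parameter which penalizes the learning errors.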

SVM’s learning method is strongly related to structural risk minimization [12], and thus SVM can be expected to have good generalization power.

The Max-Win algorithm [13] is probably the most typical 1-v-1 method for multiclass recognition, and it is the recognition method used here. The total number of classifiers is $N(N-1)/2$ for an N-class problem [14]. In the Max-Win algorithm, each SVM casts one vote and the recognition result is the class with the maximum number of votes.

An input pattern is given to all pairwise classifiers. On receiving the input, each SVM determines the class to which the input pattern belongs; this decision is counted as a “vote”. The number of votes for each class can be used as a measure of confidence. The final answer is the class that acquires the maximum number of votes. The recognition time is proportional to $N(N-1)/2$, the number of SVMs. The accuracy of the Max-Win algorithm is usually very good. However, its evaluation, which requires $N(N-1)/2$ classifier evaluations, is slow.
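The voting scheme described above can be sketched as follows. The pairwise classifiers here are hypothetical stand-ins (nearest 1D prototype) used in place of trained SVMs, and the tie-breaking rule (first class listed wins) is an assumption, since the text does not specify one.

```python
from collections import Counter
from itertools import combinations

def max_win(x, classes, pairwise):
    """Each pairwise classifier casts one vote; the class with most votes wins."""
    votes = Counter({c: 0 for c in classes})
    for i, j in combinations(classes, 2):
        winner = pairwise[(i, j)](x)   # each classifier returns either i or j
        votes[winner] += 1
    # Assumption: ties are resolved in favour of the class listed first.
    best = max(classes, key=lambda c: votes[c])
    return best, dict(votes)

# Hypothetical stand-ins for trained SVMs: pick the class with the nearer prototype.
classes = ["A", "B", "C", "D"]
prototypes = {"A": 0.0, "B": 1.0, "C": 2.0, "D": 3.0}

def make_pair(i, j):
    def clf(x):
        return i if abs(x - prototypes[i]) <= abs(x - prototypes[j]) else j
    return clf

pairwise = {(i, j): make_pair(i, j) for i, j in combinations(classes, 2)}

best, votes = max_win(1.9, classes, pairwise)
print(best, votes)  # C {'A': 0, 'B': 2, 'C': 3, 'D': 1}
```

With four classes, six classifiers are evaluated per input, and the winning class can collect at most three votes.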

A new strategy is used in the design of a two-level classifier. The classifiers of the first level, called level-0, cast votes but do not make the final prediction. The relation between the votes and the true classes is induced by the classification algorithm of the second level, named level-1. Figure 3 shows the block diagram of this classifier.

In Figure 3, there are $N(N-1)/2$ SVM classifiers in level-0, which indicates that the input pattern is evaluated by each 1-v-1 SVM ($i$ versus $j$) situated in this level. When an SVM classifies the input as class $i$, the input is considered as “not-$j$”. The output of each classifier of this level is then used as an input for level-1, which labels the class of the input.

The procedure can be summarized as follows:

(i) apply the testing data as input to all SVM classifiers in level-0 in parallel,
(ii) use the predictions of all SVM classifiers in level-0 as input data for the level-1 classifier, and
(iii) use the Max-Win algorithm to determine the output of level-1 as the identification result.

The experiments were carried out using a database of 232 images of fish eyes from four different fish types. Samples of the fish eye images used in this research are shown in Figure 4.

All images were taken under uniform lighting conditions. Other photographic parameters, such as the position of the fish relative to the camera, were also fixed during image acquisition in order to reduce environment-related errors.

Using the most efficient features extracted from the fish eyes and fish skin, in addition to the downsampled version of the selected images, reduced training time and improved performance in fish species identification. Selecting an optimum image size reduces computation cost and gives a faster response while training and testing the SVM. However, downsizing the image too far discards information and increases identification errors. Hence, an image size that balances these two effects was chosen for the proposed system, and each image was later downsized to 400 pixel samples during downsampling. The 400 samples of each fish eye image were concatenated with the 4 features extracted from the same image in order to generate a data set for the SVM.

The compressed part of each image was used as one part of the input to the SVM. The second part of the input vector is 4 feature parameters extracted from the fish eye image. They are statistical features including the mean, max, std, var of the image.

Therefore, instead of using the raw fish image, we used 404 samples per image (400 from the compressed image plus 4 extracted parameters) to minimize the structural complexity of the SVM and reduce the training time.
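The construction of the 404-element input vector can be sketched as below. Several details are assumptions made only for illustration: the 400 compressed samples are arranged as a 20 × 20 grid, the source image is 100 × 100 pixels, downsampling is done by block averaging, and the four statistics are taken over the full-size image. The random array is a placeholder, not real image data.

```python
import numpy as np

def block_downsample(image, out_shape):
    """Average non-overlapping blocks so the result has shape out_shape."""
    h, w = image.shape
    oh, ow = out_shape
    if h % oh or w % ow:
        raise ValueError("image size must be a multiple of the output size")
    return image.reshape(oh, h // oh, ow, w // ow).mean(axis=(1, 3))

def eye_feature_vector(image, out_shape=(20, 20)):
    """Concatenate the compressed image (400 values) with 4 statistics."""
    image = np.asarray(image, dtype=float)
    compressed = block_downsample(image, out_shape).ravel()
    stats = np.array([image.mean(), image.max(), image.std(), image.var()])
    return np.concatenate([compressed, stats])

rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(100, 100))  # placeholder grey-level image
vec = eye_feature_vector(image)
print(vec.shape)  # (404,)
```

The resulting vector is far smaller than the raw image, which is what keeps the SVM input dimension and training time manageable.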

An SVM implementation, the LS-SVMlab toolbox [15], was used in this work. Least squares support vector machines (LS-SVMs) [16, 17] are reformulations of the standard SVM. We trained six SVM classifiers, each of which distinguishes between two classes.

The quality of an SVM for classification depends on the combination of several parameters: the capacity parameter C, the kernel type K, and the corresponding kernel parameters. There are no clear theoretical guidelines for selecting the optimum set of parameters. Therefore, the only practical way of finding an optimally predictive SVM model is through extensive experiments. It is well known that the results of the SVM approach depend largely on the choice of kernel, which determines the sample distribution in the mapping space. Therefore, the kernel function should be decided first. In this work, we used the MLP (multilayer perceptron) kernel as well as the linear kernel. We empirically chose the penalty coefficient C to be 10 and the kernel parameter to be 0.2.

2.2. Freshness Assessment

A portable Cyranose 320 E-nose comprising an array of thirty-two polymer carbon black composite sensors was used in our experiments. The conductivity of these sensors changes, resulting in an increase in resistance, when they are exposed to vapours or a stimulus aroma. The variations in resistance are recorded as sensor outputs across the array, in the form of a digital pattern. These patterns are distinct and unique for different vapours [18]. Its built-in neural network, which can be trained to identify specific stimuli, and its generation of stimulus-specific smell prints make this E-nose the preferred choice over other commercially available products.

Four species were selected: Red Snapper, Gurnard, Tarakihi, and Trevally. The minced flesh of the selected fish was stored in a refrigerator. Prior to each experiment, a fixed amount of meat of each fish was taken from the refrigerator and left to reach room temperature. These samples were placed into four sterile glasses (one glass for each fish) in a temperature-controlled environment. Readings were taken from the headspace of the samples by manually introducing the E-nose system. For the purpose of evaluation, the position of the E-nose sensor relative to the sample was fixed. The baseline purge time, sampling time, and purge time of the E-nose were accurately set for each experiment to avoid error. The sampling time allowed for the E-nose to record a complete smell print was 30 seconds. Sufficient time was allowed in each data collection for a complete sample smell print and for cleaning the chamber. Different samples of the same fish were measured every day (except days 3 and 4) over a ten-day period. The smell prints generated from these measurements were used to ascertain the fish freshness. The proposed method may be applied in other countries by training the system on data sets selected for the local fish species.

2.2.1. Experimental Data

The process of converting the odour into sensor patterns generates a time series of thousands of resistance values for each sensor. We converted the odour of the four selected types of fish to smell prints on days 1, 2, 5, 6, 7, 8, 9, and 10 after catching the fish (no data were collected on days 3 and 4). With a sampling interval of 30 seconds, approximately 2000 samples were collected by each sensor during each process. Therefore, about 2,048,000 data samples [4 (fish) × 8 (days) × 32 (sensors) × 2000 (samples) = 2,048,000] were obtained. Processing this amount of raw data for classification purposes requires a complex classifier. Figure 5 shows a plot of 666 samples of the smell print response of sensor 1 after exposure to the Red Snapper fish.

Figure 6 shows the relative responses of all 32 sensors for the selected fish (Tarakihi).

The relative value was calculated from the difference between the smell prints of two consecutive days (e.g., between days 7 and 8). It can be observed that some sensors show better responses than others. This is because E-nose sensors are manufactured to respond differently to different odours; in the present case, sensors 3, 4, and 6 responded favourably to the fish smells. However, some sensors with good responses (e.g., sensor 23) were eliminated because their responses were unstable across different fish. Therefore, for training and testing the neural network classifiers, we selected the smell prints of sensors 3, 4, and 6 only.

In order to reduce the size of the data and to further simplify the classifier, we decided to evaluate the assessment test for only 4 days (days 2, 5, 7, and 8). Thus, the size of the initial data was reduced to 96,000 samples: [4 (fish) × 4 (days) × 3 (sensors) × 2000 (samples) = 96,000].

Moreover, it was found that each smell print of 2000 samples consisted of approximately 8 similar pulses with a pulse width of less than 250 samples. To achieve a better data reduction factor, we applied a window with a width of 200 samples to each smell print. The window selected a pulse starting from its rising edge in the selected smell print. This method enabled the time series of each smell print to be fixed to a 200-sample interval and the size of the initial data to be reduced to 9,600 samples: [4 (fish) × 4 (days) × 3 (sensors) × 200 (samples) = 9,600].
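The three data-set sizes quoted in this subsection follow from simple products, which can be checked with a few lines. The dictionary keys are hypothetical labels for the three stages.

```python
from math import prod

# fish x days x sensors x samples per smell print, at each reduction stage
sizes = {
    "raw": prod([4, 8, 32, 2000]),               # all days, all 32 sensors
    "selected": prod([4, 4, 3, 2000]),           # 4 days, sensors 3, 4 and 6
    "windowed": prod([4, 4, 3, 200]),            # one 200-sample pulse per print
}
print(sizes)  # {'raw': 2048000, 'selected': 96000, 'windowed': 9600}
```

Selecting days and sensors and then windowing each smell print reduces the raw data by a factor of more than two hundred.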

2.2.2. Parameter Extraction

Extracting a few numerical features that robustly describe the smell print from the raw data reduced the size of the data set used. In this context, the number of samples needed to describe a single sensor response was reduced from 200 to 50, as discussed in [19].

The most efficient features were extracted from the selected pulse as well as its down-sampled version. Hence the input vector of each classifier comprised two components. A fixed 200-sample interval starting at the rising edge of a pulse in the smell print was selected and down-sampled by a factor of 5 to generate the 40-sample compressed part, which is the first component of the input to the neural network. The second part of the input vector included 10 feature parameters extracted from the same 200-sample interval. These statistical features are the mean, minimum (min), maximum (max), standard deviation (std), variance (var), median, mean/median absolute deviation (mad), and geometric mean (geomean) of the data, together with the standard deviation (std_m) and variance (var_m) of the mean-removed data.

We used 50 components (10 features plus 40 compressed components) instead of using 200 samples of raw smell prints. The reduction in the size of data enabled the reduction of network complexity and improved the performance of data classification.
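A sketch of how a 50-component vector of this kind could be assembled from one smell print is given below. The threshold-based rising-edge detector, the plain decimation used for down-sampling, the choice of the mean (rather than median) absolute deviation, and the synthetic test signal are all assumptions made for illustration.

```python
import numpy as np

def first_rising_edge(signal, threshold):
    """Index of the first sample at which the signal rises to the threshold."""
    above = np.asarray(signal) >= threshold
    crossings = np.flatnonzero(~above[:-1] & above[1:])
    if crossings.size == 0:
        raise ValueError("no rising edge found")
    return int(crossings[0]) + 1

def smell_print_vector(signal, threshold, width=200, factor=5):
    """40 down-sampled values plus 10 statistics of one 200-sample pulse window."""
    signal = np.asarray(signal, dtype=float)
    start = first_rising_edge(signal, threshold)
    window = signal[start:start + width]
    if window.size < width:
        raise ValueError("pulse window runs past the end of the signal")
    compressed = window[::factor]            # plain decimation: 200 / 5 = 40
    centred = window - window.mean()         # mean-removed data
    features = np.array([
        window.mean(), window.min(), window.max(),
        window.std(), window.var(), np.median(window),
        np.mean(np.abs(centred)),            # mean absolute deviation (assumed)
        np.exp(np.mean(np.log(window))),     # geometric mean; needs values > 0
        centred.std(), centred.var(),        # std_m and var_m
    ])
    return np.concatenate([compressed, features])

# Synthetic placeholder signal: baseline 1.0 with one pulse of height 2.0.
signal = np.ones(500)
signal[100:330] = 2.0
vec = smell_print_vector(signal, threshold=1.5)
print(vec.shape)  # (50,)
```

Each sensor response is thus described by 50 numbers instead of 200, which is the reduction the classifier input relies on.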

3. Neural Network Classifiers

The inherent massive parallel distributed architectures in artificial neural networks are employed for information processing and signal classification of different systems. They are characterized as computational models and consist of a collection of processing elements called neurodes (neural nodes), and weighted connections (synapses) [10]. An ANN can be trained by the input data to learn to extract the statistical properties of the input data for identification purposes. During the learning process, a meaningful relationship between input and output variables can be established. Furthermore, ANN identifies the relationships between different input data and output variables (classes) in the training phase by adapting the weight factors assigned to the interconnections between the layers. In the testing phase, the new data (such as a new smell print) can be interpreted by the network, and the output is calculated based on the fixed weight factors [20].

Smell print classification was performed by feed-forward ANNs consisting of three layers of neurons. Preprocessed data was presented at the input layer of the ANNs for further processing in the hidden and output layers. ANNs are computational modeling tools that have been extensively used in many disciplines to model complex problems [21]. They have been applied to E-nose data for the purpose of classification [22-29]. The trained ANN can be employed for classification of fish freshness and identification of the day after catching.

Extracted features from the E-nose sensors and ANNs were used to assess the freshness of the fish by classifying the smell print data according to the day of data collection.

Of the different neural network architectures that can be employed for the intelligent analysis of fish freshness, four networks based on a multilayer feed-forward model with three layers (input, hidden, and output) were selected.

The input vector included a total of 50 elements, so the number of neurons in the input layer was set to 50. The optimum number of neurons in the hidden layer was chosen empirically to be 15, and the “tansig” transfer function was selected for this layer. The output (target) vector can be defined by a combination of 1s and 0s to represent each class. To classify the number of days after catching into four categories, the number of neurons in the output layer can be either 2 or 4. We chose 2 neurons for the output layer, which gives 4 combinations (00, 01, 10, and 11), one assigned to each class. The “tansig” transfer function was also selected for this layer.
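The 50-15-2 architecture and the two-bit output coding can be sketched with a plain NumPy forward pass. MATLAB's "tansig" is mathematically the hyperbolic tangent, so `np.tanh` is used. The assignment of codes to days, the decoding threshold of 0.5, and the random initial weights are assumptions for illustration; the text does not state which code belongs to which day.

```python
import numpy as np

# Assumed assignment of the four 2-bit codes to the four assessment days.
DAY_CODES = {2: (0, 0), 5: (0, 1), 7: (1, 0), 8: (1, 1)}

def init_network(rng, n_in=50, n_hidden=15, n_out=2, scale=0.1):
    """Random initial weights for a three-layer feed-forward network."""
    return {
        "W1": rng.normal(0.0, scale, (n_hidden, n_in)), "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, scale, (n_out, n_hidden)), "b2": np.zeros(n_out),
    }

def forward(net, x):
    """Forward pass with tanh (tansig) in both the hidden and output layers."""
    hidden = np.tanh(net["W1"] @ x + net["b1"])
    return np.tanh(net["W2"] @ hidden + net["b2"])

def decode_day(output, threshold=0.5):
    """Threshold each output neuron to a bit and look up the matching day."""
    bits = tuple(int(v >= threshold) for v in output)
    inverse = {code: day for day, code in DAY_CODES.items()}
    return inverse[bits]

rng = np.random.default_rng(0)
net = init_network(rng)
x = rng.normal(size=50)          # placeholder for one 50-element input vector
out = forward(net, x)
print(out.shape)  # (2,)
```

Two output neurons are enough because two bits distinguish four classes; a four-neuron one-hot output would be the alternative mentioned in the text.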

Each ANN is initialized with random weights on the connections among the neurons. A training data set, with a known outcome, is entered into the input neurons. The ANN compares its own output values with the known outcome and calculates the mean square error (MSE) value. This error value changes as the connection weights are updated. The ANN attempts to minimize the error by adjusting the weights according to the learning algorithm. This process is repeated for a predefined number of epochs. Finally, new data with unknown outcome values can be tested by the ANN in the recall phase. It should be noted that small classification errors remain because the MSE of each network is nonzero.
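The training cycle described above (random initialization, forward pass, MSE computation, weight update, repetition over epochs) can be sketched as follows. For brevity the sketch substitutes plain batch gradient descent for the learning algorithm, and the learning rate, weight scale and random placeholder data are assumptions; none of these values come from the experiments.

```python
import numpy as np

def train(X, T, n_hidden=15, epochs=100, lr=0.05, seed=0):
    """Train a tanh-tanh network by batch gradient descent on the MSE."""
    rng = np.random.default_rng(seed)
    n_in, n_out = X.shape[1], T.shape[1]
    W1 = rng.normal(0.0, 0.1, (n_in, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, n_out)); b2 = np.zeros(n_out)
    history = []
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)              # hidden activations
        Y = np.tanh(H @ W2 + b2)              # network outputs
        err = Y - T
        history.append(float(np.mean(err ** 2)))   # MSE before this update
        # Gradient of the MSE, propagated back through both tanh layers.
        dY = (2.0 / err.size) * err * (1.0 - Y ** 2)
        dH = (dY @ W2.T) * (1.0 - H ** 2)
        W2 -= lr * (H.T @ dY)
        b2 -= lr * dY.sum(axis=0)
        W1 -= lr * (X.T @ dH)
        b1 -= lr * dH.sum(axis=0)
    return (W1, b1, W2, b2), history

# Placeholder data: 16 random 50-element inputs with random 2-bit targets.
rng = np.random.default_rng(1)
X = rng.normal(size=(16, 50))
T = rng.integers(0, 2, size=(16, 2)).astype(float)
weights, history = train(X, T)
print(history[0], history[-1])
```

The recorded MSE history is what one would inspect to compare repeated training runs that start from different random weights.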

4. Results

4.1. Species Identification

As mentioned in the previous section, the experiments were carried out using a database of 232 images collected from fish eyes of four different fish types. The database was divided into two groups. The first group (training data set) includes a total of 176 images, 44 eye images from each fish type.

The second group (testing data set) includes the remaining 56 images, 14 eye images of each fish type. The results of applying the proposed SVM classifier to the testing data for the purpose of species identification are tabulated in Table 1.
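The split into 176 training and 56 testing images can be sketched as a per-class partition. The image identifiers below are placeholders, and the random shuffling is an assumption, since the text does not state how images were assigned to each group.

```python
import random

def split_per_class(items_by_class, n_train, seed=0):
    """Put n_train items of each class in the training set, the rest in the test set."""
    rng = random.Random(seed)
    train, test = [], []
    for label, items in items_by_class.items():
        shuffled = list(items)
        rng.shuffle(shuffled)
        train += [(item, label) for item in shuffled[:n_train]]
        test += [(item, label) for item in shuffled[n_train:]]
    return train, test

# Placeholder identifiers: 58 eye images per fish type, 232 in total.
images = {
    f"type_{k}": [f"type_{k}_img_{i:02d}" for i in range(58)]
    for k in range(1, 5)
}
train_set, test_set = split_per_class(images, n_train=44)
print(len(train_set), len(test_set))  # 176 56
```

Splitting within each class keeps the four fish types equally represented in both groups.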

4.2. Freshness Assessment

The same training parameters were chosen for all networks. In order to reduce any error caused by weight initialization, each network was trained five times by randomly selecting inputs from the training sets.

In Table 2, the second index assigned to each Net indicates the number of attempts for training the network. For example, Net1-3 means: Net1 at the third training attempt.

The number of training epochs was set to 100, and the TRAINLM training function was used for all four networks. TRAINLM is a very fast network-training function in MATLAB that updates weight and bias values according to Levenberg-Marquardt optimization [17].

4.3. Test Results

From the results presented in Table 2, we selected the best trained networks, which are Net1-5 (for Snapper), Net2-5 (for Gurnard), Net3-1 (for Tarakihi), and Net4-2 (for Trevally).

The selected networks were used to test the fish freshness using 32 test data sets. As discussed in previous sections, the data sets included the extracted features of the smell print of four fish selected for the days of data collection (days 2, 5, 7, and 8).

The correct identification rate for each fish type and its average over the number of days after catching the fish are reported in Table 3.

5. Discussion and Conclusion

The study undertaken has dealt with fish species identification from fish eye samples using statistical learning. New sets of features have been extracted from the original fish eye image followed by the development of a two-level classifier to improve the generalization properties of the trained classifiers. The statistical features, such as mean, max, std, var, were extracted from the fish eye images. Part of the data set was used for training and the other part for testing of the proposed SVM-based classifier.

Evaluation carried out on 232 fish eye images shows that the performance of the two-level classifiers using linear and MLP kernel was good with an overall classification rate of 96.46%. For the identification of some special cases of fishes with no visible eyes, such as flounders and soles, fish skin images may be used instead of fish eye images.

The use of commercial E-noses in the food industry and in quality control, such as the assessment of fish freshness, is a subject of great interest. After a fish is caught, it starts degrading. Environmental conditions such as temperature and water content play a significant role in maintaining the freshness of the fish. An E-nose can be employed not only to estimate freshness but also to predict how fast the fish degrades under given environmental conditions. It can also be employed to control the environmental conditions via a feedback mechanism to prolong fish freshness.

In this paper, we presented a fish freshness assessment method based on a portable E-nose and ANN classifiers. The array of chemical sensors, forming the basis of the E-nose, responded to the signal pattern specific to the different days of each fish type. The ANN-based classifiers processed these signals and assigned the freshness level to the individual fish.

The proposed method was tested on samples of four fish over different days. The results were obtained using the same architecture for all four networks (50, 15, and 2 neurons in the input, hidden, and output layers, respectively). The networks were trained with different training data sets, collected from the individual fish; therefore, the weight values among their layers were different. After training the networks five times, we selected the best trained network for each fish. By randomly selecting the learning and testing sets, the average learning error equalled 2.50E-02 (for Net1), 1.25E-02 (for Net2), 5.44E-02 (for Net3), and 4.17E-03 (for Net4).

While all four fish could be classified using a single network, the purpose of using four different networks was to increase the accuracy of the system. The assessment of freshness for each fish was performed by a specific network trained for that species.

A fairly accurate assessment of fish freshness has been obtained for the selected samples. The average rates of correct classification of the number of days after catching the fish were 95.83%, 93.75%, 91.67%, and 95.83% for the selected fish types. The results confirm that the proposed method can be considered a suitable technique for freshness assessment. However, it should be noted that the samples were kept in a refrigerator, which is the practical way to store them, so the freshness results relate to refrigerated storage conditions.

The common disadvantages with respect to quality control within an industrial environment are that the process is slow and that the analysis requires laboratory facilities and qualified staff. Moreover, it is difficult to establish standards for use in various locations. The key obstacle to this system being considered as a commercial solution is that quality controllers and inspectors need to evaluate the freshness of a batch of fish rather than an individual sample. Use of this system to process batches of fish is still being explored.

In conclusion, an E-nose can be employed not only to estimate freshness but also to predict how fast the fish degrades under given environmental conditions. It can also be employed to control the environmental conditions via a feedback mechanism to prolong fish freshness. Using four different networks can increase the accuracy of the system. Further research is being undertaken to make the system commercially viable.