Abstract

Remote robotic exploration holds vast potential for gaining knowledge about extreme environments that are difficult for humans to access. In the last two decades, various underwater devices have been developed for detecting mines and mine-like objects in the deep-sea environment. However, recent equipment suffers from several problems, such as poor accuracy in mineral object detection, lack of real-time processing, and low resolution of underwater video frames. Consequently, underwater object recognition is a difficult task, because the physical properties of the medium seriously distort the captured video frames. In this paper, we consider the use of modern image processing methods to determine the mineral location and to recognize mineral objects with low computational complexity. We first analyze recent underwater imaging models and propose a novel underwater optical imaging model, which is much closer to the light propagation model in the underwater environment. In our imaging system, we remove electrical noise by the dual-tree complex wavelet transform, correct the nonuniform illumination of artificial lights by a fast guided trigonometric bilateral filter, and recover the image color through automatic color equalization. Finally, a shape-based mineral recognition algorithm is proposed for underwater object detection. These methods are designed for real-time execution on limited-memory platforms. In our experience, this pipeline is suitable for detecting underwater objects in practice. Initial results are presented, and experiments demonstrate the effectiveness of the proposed real-time visualization system.

1. Introduction

Underwater exploration of the seafloor is carried out for various scientific reasons, such as assessing the biological environment, exploring minerals, and taking population censuses [1]. However, the problem of mitigating sea mines is challenging and multifaceted. Ocean mines may be located on the seafloor, in the water column, or on the sea surface. Recently, marine engineering researchers have pointed out two deficiencies in mine countermeasures. First, mine hunting is one of the most important and difficult problems. Second, the key technologies for recognizing mines and mine-like objects do not yet exist. To this end, in recent decades, many unmanned systems have been developed to support underwater mine operations. The goal in developing automation for underwater mine detection is to automatically determine the mine location and recognize mine-like objects in place of a human operator. There are two stages of mine hunting operations: one is search-classify-map (SCM), which is intended to locate all sufficiently mine-like objects in a given operational area; the other is reacquire-and-identify (RI), which distinguishes mines from nonmines and prosecutes them accordingly.

Under recent concepts of operations, autonomous underwater vehicle (AUV) based systems hosting low-frequency sonars are first used to search-classify-map relatively large areas at a high search rate. Numerous sonar systems have been developed over the last two decades, such as side-scan sonar [2], volumetric sonar [3], and EO sensors [4]. SCM is fast but does not determine the mines exactly, so the reacquire-and-identify stage is used to confirm the final classification by reacquiring the target, at close range, with magnetic, acoustic, or electro-optic sensors. This paper concentrates solely on the optical imaging sensors onboard the underwater mining system.

Although underwater object detection technology has made great progress, the recognition of underwater objects remains a major issue. Challenges associated with obtaining clear views of objects have been difficult to overcome because of the physical properties of the medium. Unlike natural photographs, underwater images suffer from poor visibility due to medium scattering and light distortion. First, capturing images underwater is difficult, mostly because light reflected from a surface is deflected and scattered by particles, and absorption substantially reduces the light energy. Second, the random attenuation of the light is the main cause of the hazy appearance, while the fraction of light scattered back from the water along the line of sight considerably degrades the scene contrast. In particular, objects at a distance of more than 10 meters are almost indistinguishable, and the colors are faded because the characteristic wavelengths are cut off according to the water depth [5]. Furthermore, when artificial light is employed, it usually leaves a distinctive footprint of the light beam on the seafloor.

There have been many techniques to restore and enhance underwater images. Schechner and Averbuch [6] exploited a polarization dehazing method to compensate for visibility degradation. Ancuti et al. [7] used an image fusion method in a turbid medium to reconstruct a clear image, and Hou et al. [8] combined a point spread function and a modulation transfer function to reduce the blurring effect. Ouyang et al. [9] proposed a bilateral filtering based image deconvolution method. Although the aforementioned approaches can enhance image contrast, they have several drawbacks that reduce their practical applicability. First, the required imaging equipment is difficult to use in practice (e.g., range-gated laser imaging systems are rarely applied in the field). Second, multiple input images are required, which are difficult to capture with the hardware. Third, they cannot resolve color distortion very well.

In this paper, we introduce a novel approach that is able to enhance underwater images from a single image, overcoming the drawbacks of the above methods. We propose a new guided trigonometric filter instead of the matting Laplacian to solve the alpha mattes more efficiently. In summary, our technical contributions are threefold. First, the proposed guided trigonometric bilateral filter performs as an edge-preserving smoothing operator, like the popular bilateral filter, but behaves better near edges. Second, the novel guided filter has a fast and nonapproximate constant-time algorithm whose computational complexity is independent of the filtering kernel size. Third, the proposed αACE is effective for underwater image enhancement.

The organization of this paper is as follows. Section 2 discusses the underwater imaging model and presents the image enhancement system. Section 3 applies the enhancement system to underwater optical images and reports the experiments. Finally, conclusions are presented in Section 4.

2. Vision System Architecture

In the optical model [13], the acquired image can be modelled as being composed of two components. One is the direct transmission of light from the object, and the other is the transmission due to scattering by the particles of the medium (e.g., airlight). Mathematically, it can be written as

$$I(x) = J(x)\,t(x) + B\big(1 - t(x)\big), \qquad t(x) = e^{-\beta d(x)},$$

where $I(x)$ is the achieved image, $J(x)$ is the scene radiance or haze-free image, $t(x)$ is the transmission along the cone of vision, $\beta$ is the attenuation coefficient of the medium, $d(x)$ is the distance between the camera and the object, $B$ is the veiling color constant, and $x$ is a pixel. The optical model assumes a linear correlation between the reflected light and the distance between the object and the observer.

The light propagation model is slightly different from that of the airlight environment. In the underwater optical imaging model, absorption plays an important role in image degradation. Furthermore, unlike scattering, the absorption coefficient is different for each color channel, being highest for red and lowest for blue in seawater. This leads to the following simplified underwater image formation model:

$$I_\lambda(x) = J_\lambda(x)\,t_\lambda(x) + B_\lambda\big(1 - t_\lambda(x)\big), \qquad t_\lambda(x) = e^{-(\alpha_\lambda + \beta_\lambda) d(x)}, \quad \lambda \in \{r, g, b\},$$

where $\beta_\lambda$ is the scattering coefficient and $\alpha_\lambda$ is the absorption coefficient of light in channel $\lambda$. The effects of haze are highly correlated with the range of the underwater scene. In this paper, we simplify the situation at a certain water depth; the transmission is defined only by the distance between the camera and the scene (see Figure 1).
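For illustration, the following sketch synthesizes a frame from the simplified model above; the per-channel absorption and scattering coefficients, the background color, and the single camera-to-scene distance are illustrative assumptions, not values measured in this work.

```python
import numpy as np

def underwater_image(J, d, B, alpha=(0.45, 0.07, 0.04), beta=(0.40, 0.12, 0.08)):
    """Forward model I = J*t + B*(1 - t), t = exp(-(alpha + beta)*d), per channel.
    alpha (absorption) and beta (scattering) per channel are assumed values."""
    J = np.asarray(J, dtype=np.float64)                       # scene radiance, (H, W, 3), in [0, 1]
    t = np.exp(-(np.asarray(alpha) + np.asarray(beta)) * d)   # per-channel transmission
    return J * t + np.asarray(B) * (1.0 - t)
```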

This section details the algorithms used and justifies our choices of methods and parameters. The proposed system can effectively eliminate the effects of scattering, the artificial light footprint, organic particulate noise, and color distortion. An illustration of the mine detection system is shown in Figure 2.

First, the potential Moiré effect is removed from the captured video or image, and homomorphic filtering is used to correct the nonuniform illumination. Then, we apply the DTC-wavelet transform and the proposed MAP estimation-based method for denoising. Next, we decide whether artificial light must be removed. If artificial light is present, it is removed by a single-frame vignetting correction algorithm; otherwise, the proposed guided trigonometric bilateral filter is used to remove the haze. In the next step, the proposed fast automatic color equalization algorithm is applied to enhance the contrast. Finally, the shape-based GrabCut method is used to segment and recognize the underwater object precisely. The details of the proposed algorithms are discussed in the following sections.

2.1. Homomorphic Filtering

The homomorphic filter is used to correct nonuniform illumination and to enhance image contrast. Assume the captured image is the product of the illumination and the reflectance:

$$f(x, y) = i(x, y)\, r(x, y),$$

where $f(x, y)$ is the captured image, $i(x, y)$ is the illumination multiplicative factor, and $r(x, y)$ is the reflectance function. Taking the logarithm gives

$$\ln f(x, y) = \ln i(x, y) + \ln r(x, y).$$

Computing the FFT of (4) yields

$$F\{\ln f(x, y)\} = F\{\ln i(x, y)\} + F\{\ln r(x, y)\}.$$

Then, a high-pass filter is applied to the FFT coefficients, and after the inverse FFT (and exponentiation) we obtain the filtered image. The processed images still contain some noise; in the next subsection, we use a DTC-wavelet transform based denoising method [14].
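As a concrete illustration, the sketch below applies homomorphic filtering to a grayscale frame using a Gaussian high-emphasis filter in the frequency domain; the cutoff and gain parameters are assumed values for illustration only, not those used in the paper.

```python
import numpy as np

def homomorphic_filter(img, cutoff=0.1, gain_low=0.5, gain_high=1.5):
    """Homomorphic filtering sketch: log -> FFT -> high-emphasis filter ->
    inverse FFT -> exp. Parameter values are illustrative assumptions."""
    img = np.asarray(img, dtype=np.float64)
    log_img = np.log1p(img)                                  # log of illumination * reflectance
    F = np.fft.fftshift(np.fft.fft2(log_img))
    h, w = img.shape
    yy, xx = np.mgrid[-h // 2:h - h // 2, -w // 2:w - w // 2]
    radius = np.sqrt((yy / h) ** 2 + (xx / w) ** 2)
    # Attenuate low frequencies (illumination) and boost high frequencies (reflectance).
    H = gain_low + (gain_high - gain_low) * (1.0 - np.exp(-(radius ** 2) / (2 * cutoff ** 2)))
    filtered = np.fft.ifft2(np.fft.ifftshift(F * H)).real
    return np.expm1(filtered)
```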

2.2. Dual-Tree Complex Wavelet Denoising

We describe a method for removing normally distributed image noise based on a statistical model of the decomposed DTC-wavelet coefficients. The method uses an anisotropic multivariate shrinkage (AMS) probability density function (PDF) to model neighborhoods of DTC-wavelet coefficients. Then, according to the proposed PDF model, we design a maximum a posteriori (MAP) estimator, which relies on a Bayesian representation of the DTC-wavelet coefficients of images.

Let $x = \{x_i\}$ be equally spaced samples of a real-valued image and let $n = \{n_i\}$ be i.i.d. normal random variables. The image with noise can be expressed as

$$y_i = x_i + n_i.$$

In the DT-wavelet domain, the problem can be formulated as

$$w = v + \varepsilon,$$

where $w$ is the noisy DT-wavelet coefficient, $v$ is the true coefficient, and $\varepsilon$ is the independent noise. The standard MAP estimator for (7) is

$$\hat{v}(w) = \arg\max_{v}\, p_{v \mid w}(v \mid w).$$

Using the Bayes rule, (8) is equivalent to

$$\hat{v}(w) = \arg\max_{v}\, p_{\varepsilon}(w - v)\, p_{v}(v).$$

Equation (9) is equivalent to

$$\hat{v}(w) = \arg\max_{v}\, \big[\ln p_{\varepsilon}(w - v) + \ln p_{v}(v)\big].$$

The spherically contoured zero-mean $d$-dimensional BKF density is expressed in terms of the modified Bessel function of the second kind, with a scale parameter and a shape parameter. In this paper, we propose a simple non-Gaussian multivariate PDF to model the noise-free coefficients, considering the relationship between a coefficient, its neighbors, its cousins, and its parent, with a common factor appearing in both the numerator and denominator of the fraction, where $\sigma_n$ denotes the standard deviation of the noise coefficients. The coefficient, its neighbors, and its cousins are mutually dependent, but the neighbors and cousins are independent of the parent. Applying the MAP estimator to this model and maximizing (10) for each component yields the shrinkage rule in (13). Using the derivative property of the modified Bessel function of the second kind, the second term of (13) can be computed in closed form, giving (15).

Combining the above equations, the MAP estimator can be formulated from (13) and (15).

Then, the noise standard deviation is approximated from the HH subband, and the scale and shape parameters are estimated from the variance and kurtosis of the coefficients; the resulting multivariate shrinkage function is applied to each DTC-wavelet coefficient.
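Because the exact AMS-MAP rule depends on (13) and (15), which are not reproduced here, the sketch below shows a simplified stand-in in the same spirit: a bivariate shrinkage rule (in the style of Sendur and Selesnick) that jointly attenuates a complex wavelet coefficient and its parent given the noise standard deviation. It is illustrative only and is not the estimator derived above.

```python
import numpy as np

def bivariate_shrink(w, w_parent, sigma_n):
    """Jointly shrink a wavelet coefficient array and its (upsampled) parent.
    This is a generic bivariate shrinkage rule, used as a stand-in for the
    paper's anisotropic multivariate MAP shrinkage."""
    mag = np.sqrt(np.abs(w) ** 2 + np.abs(w_parent) ** 2)     # joint magnitude
    # Marginal signal standard deviation, estimated globally here for simplicity.
    sigma = np.sqrt(np.maximum(np.mean(np.abs(w) ** 2) - sigma_n ** 2, 1e-12))
    # Soft-threshold the joint magnitude and rescale the coefficient.
    gain = np.maximum(mag - np.sqrt(3.0) * sigma_n ** 2 / sigma, 0.0) / np.maximum(mag, 1e-12)
    return w * gain
```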

2.3. Descattering
2.3.1. Transmission Estimation

After denoising, we use a descattering method to remove the haze caused by turbidity particles. According to recent research, the red color channel is attenuated at a much higher rate than the green or blue channels [15]. We further assume that the transmission within a local patch is constant and denote the patch's transmission by $\tilde{t}(x)$. We take the maximum intensity of the red color channel and compare it with the maximum intensities of the green and blue color channels. We define the dark channel for the underwater image as

$$J^{\mathrm{dark}}(x) = \min_{y \in \Omega(x)} \Big( \min_{c} J^{c}(y) \Big),$$

where $J^{c}(y)$ refers to a pixel $y$ in color channel $c$ of the observed image and $\Omega(x)$ refers to a patch in the image. The dark channel is mainly caused by three factors: shadows, colorful objects or surfaces, and dark objects or surfaces.

Taking the min operation over the local patch on the haze imaging model, we obtain

$$\min_{y \in \Omega(x)} I^{c}(y) = \tilde{t}(x) \min_{y \in \Omega(x)} J^{c}(y) + \big(1 - \tilde{t}(x)\big) B^{c}.$$

Since $B^{c}$ is the homogeneous background light, we take the min operation among all three color channels on both sides of the above equation:

$$\min_{c}\Big(\min_{y \in \Omega(x)} \frac{I^{c}(y)}{B^{c}}\Big) = \tilde{t}(x)\, \min_{c}\Big(\min_{y \in \Omega(x)} \frac{J^{c}(y)}{B^{c}}\Big) + 1 - \tilde{t}(x).$$

Following [16], let $V(x)$ be the transmission veil and let $W(x) = \min_{c} I^{c}(x)$ be the minimum color component of $I(x)$; we have $0 \le V(x) \le W(x)$. For a grayscale image, $W(x) = I(x)$. We apply the guided trigonometric bilateral filter (GTBF), which will be discussed in the next subsection, to compute a smoothed estimate of $W(x)$, and then acquire the transmission veil as $V(x) = \omega\, \mathrm{GTBF}(W)(x)$, where $\omega$ is a parameter in $(0, 1]$. Finally, the transmission of each patch can be written as

$$\tilde{t}(x) = 1 - \omega \min_{c}\Big(\min_{y \in \Omega(x)} \frac{I^{c}(y)}{B^{c}}\Big).$$

The background light is usually assumed to be the pixel with the highest brightness value in an image. In practice, however, this simple assumption often produces erroneous results due to the presence of self-luminous organisms. So, in this paper, we compute the background light as the brightest value among all the local minima:

$$B^{c} = \max_{x}\, \min_{y \in \Omega(x)} I^{c}(y),$$

where $I^{c}(y)$ is the local color component of $I$ in each patch $\Omega(x)$.
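The sketch below implements a dark-channel-style transmission and background light estimate consistent with the equations above; the patch size and the parameter ω are assumed illustrative values.

```python
import numpy as np
from scipy.ndimage import minimum_filter

def estimate_transmission(img, patch=15, omega=0.9):
    """img: float RGB image in [0, 1], shape (H, W, 3). Returns (t, B)."""
    # Local minimum over a patch, per color channel.
    local_min = np.stack([minimum_filter(img[..., c], size=patch) for c in range(3)], axis=-1)
    dark = local_min.min(axis=-1)                        # dark channel
    # Background light: brightest value among all local minima.
    idx = np.unravel_index(np.argmax(dark), dark.shape)
    B = np.maximum(local_min[idx], 1e-3)
    # Transmission from the background-normalized dark channel.
    t = 1.0 - omega * (local_min / B).min(axis=-1)
    return t, B
```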

2.3.2. Guided Trigonometric Bilateral Filter

In this subsection, we propose the guided trigonometric bilateral filter (GTBF) to overcome gradient reversal artifacts. The filtering is performed under the guidance of an image that can be either another image or the input image itself. Let $p_i$ and $I_i$ be the intensity values at pixel $i$ of the minimum channel image and the guidance image, respectively, and let $\omega_k$ be the kernel window centred at pixel $k$. To be consistent with the bilateral filter, the GTBF is formulated as

$$q_i = \sum_{j} W_{ij}(I)\, p_j,$$

where the kernel weight function can be written as

$$W_{ij}(I) = \frac{1}{|\omega|^{2}} \sum_{k\,:\,(i,j) \in \omega_k} \Big(1 + \frac{(I_i - \mu_k)(I_j - \mu_k)}{\sigma_k^{2} + \varepsilon}\Big),$$

where $\mu_k$ and $\sigma_k^{2}$ are the mean and variance of the guidance image in the local window $\omega_k$ and $|\omega|$ is the number of pixels in this window. When $I_i$ and $I_j$ are concurrently on the same side of an edge, the weight assigned to pixel $j$ is large; when they are on different sides, a small weight is assigned to pixel $j$. In this paper, we use a trigonometric range kernel for the bilateral filter to reduce the computational complexity.
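Because the full GTBF couples a guided filter with a trigonometric range kernel, the sketch below shows only the guided-filter part (a box-filter implementation in the style of He et al.) as a stand-in for smoothing the veil under a guidance image; the radius and regularization eps are assumed values.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def guided_filter(I, p, radius=8, eps=1e-3):
    """Smooth p under the guidance of I (both 2D float arrays)."""
    size = 2 * radius + 1
    mean_I, mean_p = uniform_filter(I, size), uniform_filter(p, size)
    corr_Ip, corr_II = uniform_filter(I * p, size), uniform_filter(I * I, size)
    var_I = corr_II - mean_I ** 2
    cov_Ip = corr_Ip - mean_I * mean_p
    a = cov_Ip / (var_I + eps)                 # per-window linear coefficient
    b = mean_p - a * mean_I
    return uniform_filter(a, size) * I + uniform_filter(b, size)   # q = a*I + b
```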

2.3.3. Recovering the Scene

With the transmission depth map, we can recover the scene radiance according to the underwater imaging model. We restrict the transmission to a lower bound $t_0$, which means that a small amount of haze is preserved in very dense haze regions. The final scene radiance is written as

$$J(x) = \frac{I(x) - B}{\max\big(\tilde{t}(x), t_0\big)} + B.$$

Typically, a small value of $t_0$ is chosen. The recovered image may still be too dark, so we apply ACE for contrast enhancement, as described in Section 2.5.
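A minimal sketch of this recovery step, assuming RGB images in [0, 1] and an illustrative lower bound t0 = 0.1 (not necessarily the value used in our experiments):

```python
import numpy as np

def recover_scene(I, t, B, t0=0.1):
    """Recover scene radiance via J = (I - B) / max(t, t0) + B."""
    t = np.maximum(t, t0)[..., None]           # (H, W, 1), broadcast over channels
    return np.clip((I - B) / t + B, 0.0, 1.0)
```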

2.4. Artificial Light Correction

In the deep sea, artificial light must be used for imaging; however, it causes a vignetting effect. In [17], Sooknanan et al. proposed a multiframe vignetting correction model that removes the vignetting phenomenon by estimating the light source footprint on the seafloor. This artificial light correction works well; however, it requires a large amount of computing time. So, in this paper, we introduce a single-frame vignette removal method [18]. Given that we are interested in the overall effect of light attenuation through the system and not in all of the image formation details, we derive an effective degradation model as follows:

$$Z(x) = I_0(x)\, V(x),$$

where $Z$ is the image with vignetting, $I_0$ is the vignetting-free image, and $V$ is the vignetting function. Our goal is to find the optimal vignetting function that minimizes the asymmetry of the radial gradient distribution. Taking the log of (26), we get

$$\ln Z(x) = \ln I_0(x) + \ln V(x).$$

Let $Z' = \ln Z$, $I' = \ln I_0$, and $V' = \ln V$. We denote the radial gradients of $Z'$, $I'$, and $V'$ at each pixel by $\partial_r Z'$, $\partial_r I'$, and $\partial_r V'$. Then,

$$\partial_r Z'(x) = \partial_r I'(x) + \partial_r V'(x).$$

Given an image $Z$ with vignetting, we seek a maximum a posteriori (MAP) solution for $V$. Using Bayes' rule, we get

$$P(V \mid Z) \propto P(Z \mid V)\, P(V).$$

We consider the vignetting function at discrete, evenly sampled radii, $V = (v_1, \ldots, v_m)$. Each pixel is associated with the sector in which it resides, and the sector width is determined by the radial sampling. The vignetting function is in general smooth; therefore, a smoothness prior is imposed on $V$, with a weight chosen to compensate for the noise level in the image.

A sparsity prior is imposed on the radial gradients of the vignetting-free image $I_0$.

Substituting (32) into (28) and taking the negative log posterior, the overall energy function $E(V)$ can be written as the sum of a radial-gradient data term and a smoothness term.

By minimizing $E(V)$, we can estimate the vignetting function $V$; the iteratively reweighted least squares (IRLS) technique is used for this estimation.
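For intuition only, the sketch below divides a smooth radial vignetting function out of a frame; the polynomial coefficients are assumed for illustration and are not the result of the MAP/IRLS estimation described above.

```python
import numpy as np

def correct_vignetting(img, v_coeffs=(1.0, -0.2, -0.1)):
    """img: float RGB image in [0, 1]. V(r) is a polynomial in the normalized
    radius r with assumed coefficients (1 - 0.2 r - 0.1 r^2 by default)."""
    h, w = img.shape[:2]
    yy, xx = np.mgrid[0:h, 0:w]
    r = np.sqrt((yy - (h - 1) / 2.0) ** 2 + (xx - (w - 1) / 2.0) ** 2)
    r /= r.max()                                           # normalized radius in [0, 1]
    V = sum(c * r ** i for i, c in enumerate(v_coeffs))    # smooth radial falloff
    V = np.clip(V, 0.1, None)                              # avoid division blow-up
    return np.clip(img / V[..., None], 0.0, 1.0)
```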

2.5. Color Correction

As mentioned before, the preprocessed images may be too dark, so we introduce a novel fast colour correction method for contrast enhancement. It has been shown that, by replacing the slope function $s_\alpha(\cdot)$ of automatic color equalization (ACE) with a polynomial, the pixelwise summation can be decomposed into convolutions, reducing the computational complexity from $O(N^2)$ to $O(N \log N)$. We replace $s_\alpha$ with an odd polynomial approximation:

$$s_\alpha(t) \approx \sum_{m=1}^{M} c_m\, t^{2m-1}.$$

As mentioned before, the input image is assumed to lie in $[0, 1]$, so the argument $t = I(x) - I(y)$ is guaranteed to be between $-1$ and $1$. By the Stone-Weierstrass theorem, the continuous function $s_\alpha$ can be uniformly approximated on $[-1, 1]$ by a polynomial with any desired precision. To reduce the computational cost, we select the coefficients $c_m$ to minimize the maximum absolute error over $[-1, 1]$:

$$\min_{c_1, \ldots, c_M}\; \max_{t \in [-1, 1]} \Big| s_\alpha(t) - \sum_{m=1}^{M} c_m t^{2m-1} \Big|.$$

The optimal coefficients can be found using the Remez algorithm. It is then possible to decompose the ACE summation into a sum of convolutions of the form $\omega * I^{j}$, where $*$ denotes cyclic convolution over the whole torus. Because the image extension is symmetric, the convolutions can be computed efficiently with DCT transforms instead of FFT transforms. We apply the resulting αACE to correct the underwater distorted images; in this research, the slope parameter $\alpha$ is set to a fixed value.
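The sketch below fits such an odd polynomial to the clipped slope function; a least-squares fit is used as a simple stand-in for the minimax (Remez) fit, and the slope value α = 5 is an assumed example.

```python
import numpy as np

def fit_odd_polynomial(alpha=5.0, degree=9, samples=2001):
    """Approximate s_alpha(t) = clip(alpha * t, -1, 1) on [-1, 1] by an odd
    polynomial c_1 t + c_3 t^3 + ... (least-squares stand-in for Remez)."""
    t = np.linspace(-1.0, 1.0, samples)
    s = np.clip(alpha * t, -1.0, 1.0)
    powers = np.arange(1, degree + 1, 2)                   # odd monomials
    A = np.stack([t ** m for m in powers], axis=1)
    coeffs, *_ = np.linalg.lstsq(A, s, rcond=None)
    return dict(zip(powers.tolist(), coeffs))

# Example: inspect the fitted coefficients.
# print(fit_odd_polynomial())
```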

3. Experiments and Discussions

In the experiment, we simulate the mine detection system in the darkroom of our laboratory (see Figure 3). We use an underwater camera to capture images. The distance between the light-camera rig and the objects is 3 meters, and the size of the images is 640 × 480 pixels. We use artificial light as an auxiliary light source; as a fixed light source, it causes an uneven distribution of light. Because light is absorbed in water, the imaging array of the camera captures a distorted video frame; see Figure 3(a). Figure 3(b) shows the denoised image, in which electrical noise and other additive noise are removed. After estimation, we use the single-frame vignetting method to remove the artificial light, and the proposed dehazing method eliminates the haze in the image. After that, the contrast of Figure 3(d) is clearly better than that of Figure 3(c). The obtained image is still too dark, so αACE is used to enhance the image. Finally, the shape-based recognition method is used to distinguish the objects.

In the second experiment, we compare our method with recent methods (Fattal's, He's, and Xiao's work). Figure 4 shows the comparison results of the different methods. The drawback of Fattal's method [10] is elaborated in [11]; in short, it cannot remove haze or suspended solids well in practical applications. Compared with He's method, our approach performs better. In He's approach, visible mosaic effects are observed because of the soft matting, some regions are too dark (e.g., the right corner of the coral reefs), and some haze is not removed (e.g., the center of the image). There are also some halos around the coral reefs in Xiao's model [12]. Our approach not only works well in haze removal but also has low computational cost. The refined transmission depth maps in Figure 5 allow these methods to be compared clearly.

The visual assessment demonstrates that our proposed method performs well. In addition to the visual analysis of these figures, we conducted a quantitative analysis of the images in Figure 4, mainly from the perspective of mathematical statistics and statistical image parameters. The metrics include the high-dynamic-range visual difference predictor 2 (HDR-VDP-2) [19], SSIM [20], and CPU time. The results in Table 1 indicate that our approach not only works well for haze removal but also requires lower computation time.

Let $x_i$ and $y_i$ be the $i$th pixel in the original image $x$ and the distorted image $y$, respectively. The MSE and PSNR between the two images are given by

$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} (x_i - y_i)^2, \qquad \mathrm{PSNR} = 10 \log_{10}\!\Big(\frac{L^2}{\mathrm{MSE}}\Big),$$

where $N$ is the number of pixels, $L$ is the dynamic range of the pixel values, and PSNR is the peak signal-to-noise ratio (values are above 0; the higher, the better).
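For reference, a direct implementation of these two measures (a peak value of 255 is assumed for 8-bit images):

```python
import numpy as np

def mse_psnr(x, y, peak=255.0):
    """Return (MSE, PSNR in dB) between two images of the same shape."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    mse = np.mean((x - y) ** 2)
    return mse, 10.0 * np.log10(peak ** 2 / mse)
```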

In [20], a multiscale SSIM method for image quality assessment is proposed. For an original image $x$ and a distorted image $y$, let $\mu_x$, $\sigma_x^2$, and $\sigma_{xy}$, respectively, be the mean of $x$, the variance of $x$, and the covariance of $x$ and $y$. The parameters of relative importance $\alpha$, $\beta$, and $\gamma$ are set equal to 1. The SSIM is then given by

$$\mathrm{SSIM}(x, y) = \frac{(2\mu_x \mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)},$$

where $C_1$ and $C_2$ are small constants. SSIM stands for structural similarity (values lie between 0 (worst) and 1 (best)).

The objective quality predictions do not map directly to subjective mean opinion scores (MOS); there is a nonlinear mapping between subjective and objective predictions. In [19], the authors proposed a logistic function to account for such a mapping, based on the pooling function that produced the strongest correlation with the quality databases. The Q-MOS value lies between 0 (worst) and 100 (best). Table 1 displays the numerical results of Q-MOS, PSNR, and SSIM measured on several images. The results indicate that our approach works well for haze removal.

From Figure 6, we can also see that the result of the independent component analysis (ICA) based method (Fattal's model) is largely distorted, because the transmission and shading functions are difficult to determine manually. For the dark channel prior (DCP) method, also known as He's method, the result is readable, but the color is largely distorted. Another key issue is that the soft-matting-based depth map refinement costs a large amount of computing time. Compared with the abovementioned methods, our proposed method clearly performs best, both in visual quality and in computational complexity.

4. Conclusions

This paper has shown that it is possible to enhance degraded images or video sequences from seabed surveys using modern image processing technologies. The proposed algorithm is automatic and requires little parameter adjustment. The total computing time of our system is about 0.5 minutes, and the algorithm can be accelerated further through translation into C and parallel implementation. In this paper, we proposed a simple prior based on the difference in attenuation among the different color channels, which inspires the estimation of the transmission map. Another contribution is the compensation of the transmission by guided trigonometric bilateral filtering, which not only preserves edges and removes noise but also reduces the computational cost. Furthermore, the proposed αACE-based underwater image colorization method can recover the underwater distorted image well. The artificial light correction method eliminates the nonuniform illumination very well and is faster than multiframe vignetting correction methods.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported in part by the Japan Government Research Fellowship for Young Scientists from the Japan Society for the Promotion of Science (no. 25J10713), a Grant-in-Aid for Non-Japanese Researchers from the NEC C&C Foundation, and the State Key Laboratory of Marine Geology, Tongji University, China. The second author also wishes to thank the anonymous reviewers and researchers for their useful comments and resources, which helped improve the quality of this paper.