Abstract

It is difficult for the conventional image compression method to achieve good compression effect in the underwater acoustic image (UWAI), because the UWAI has large amount of noise and low correlation between pixel points. In this paper, fractal coding is introduced into UWAI compression, and a fractal coding algorithm based on interest region is proposed according to the importance of different regions in the image. The application problems of traditional quadtree segmentation in UWAIs was solved by the range block segmentation method in the coding process which segmented the interest region into small size and the noninterest region into large size and balanced the compression ratio and the decoded image quality. This paper applies the classification, reduction codebook, and correlation coefficient matching strategy to narrow the search range of the range block in order to solve the problem of the long encoding time and the calculation amount of encoding process is greatly reduced. The experimental results show that the proposed algorithm improves the compression ratio and encoding speed while ensuring the image quality of important regions in the UWAI.

1. Introduction

The UWAI refers to an image generated by imaging sonar according to the characteristics of the underwater target echo signal by means of acoustic wave detection. Due to the limitation of the underwater acoustic (UWA) channel and the storage capacity of the device [1, 2], the UWAI must be compressed to meet the actual application requirements. Compared to ordinary optical images, the UWAI is affected by uncertain factors such as underwater environment, various types of noise, ocean reverberation, ocean internal heterogeneity, and seabed terrain irregularity, which is seriously polluted by kinds of noises. Generally, the signal to noise ratio is low, and there are many outliers that differ greatly from the gray values of surrounding pixels. The image target area has less gray level; however the background area has rich gray level. The whole image is rough, and the complex and fine edges are missing [3, 4].

Due to the particularity of UWAIs, many methods used for optical image compression cannot always achieve good results. The traditional image compression method is based on removing the correlation redundancy in the image. Due to the existence of noise, the correlation between the pixels of the UWAI is very poor. At the same time, limited by the entropy, the compression ratio of traditional image compression method is generally not high in the UWAI compression. Therefore, this paper applies the fractal coding based on the fractal theory and iterative function system to UWAI compression to improve the image compression performance. Fractal coding algorithm breaks the limitations of the information theory which is based on the widely existing self-similarity in images. The basic idea is to achieve high compression ratio by removing the structural redundancy through a set of compression transformations. Since Barnsley first applied fractal theory to image compression in 1988 [5], fractal coding has attracted wide attention by its large compression potential, fast decoding speed, and resolution of decoded image resolution.

In 1992, inspirited by Barnsley’s research, Jcaquin proposed fractal coding based on the local iterative function system [6], which made the fractal coding into practice and promoted the rapid development of fractal coding research. However, there are still some problems in the fractal coding. (1) The coding process is complicated and the amount of calculation is large, which costs long time for the coding procedure. (2) The compression ratio and the decoded image quality cannot be balanced. In order to solve the long fractal coding time, scholars have proposed several methods such as classification matching [710], optimized search range, and nonsearch coding [1115]; scholars propose a variety of different range block partitioning schemes, such as quadtree partitioning [1618], HV segmentation [19], triangulation, and irregular segmentation [20, 21] to balance the problem of compression ratio and decoded image quality. The latter three methods require more information to describe the position and shape of each subblock compared with quadtree partitioning, and the search matching processes are more complicated. In image processing, the K-Means algorithm which belongs to the machine learning algorithm is a clustering algorithm based on distance similarity [22, 23]. By comparing the similarity between samples, the samples of the form are divided into the same category. However, the usage of AI algorithms requires a large number of data sets such as different signal-to-noise ratios, different textures, different types, different sources of UWAIs, and then the algorithm can find the most suitable codebook based on various feature information of the images, which means a large number of underwater acoustic experiments is required and which is unrealizable currently because the acquisition of UWAIs is difficult. Therefore, this paper utilizes the widely used quadtree partitioning algorithm in the segmentation of UWAIs. However, because of the widespread noise in UWAI, most of the segmented images have little or no large size image blocks which causes difficult in the application of the quadtree partitioning algorithm. Focusing on this, we propose the improved quadtree partitioning algorithm which is based on the region of interest. In addition, by classifying the range blocks, the search range of the range blocks is optimized which can significantly improve the encoding speed. At the same time, the codebook capacity is reduced and the correlation coefficient is matched which can not only balance the compression ratio and the decoded image quality but also gain a higher encoding speed.

The structure of this paper is as follows: the second section briefly introduces the principle of basic fractal coding; the third section focuses on the fractal coding algorithm based on region of interest and the correlation coefficient fractal coding algorithm is proposed; the simulation and experiment results are given in the fourth section; the fifth section comes the conclusion.

2. Basic Fractal Coding

The basis fractal coding (BFC) is to find a set of compression transformations whose fixed points are similar with the original image. These sets of compression transformation are called the Iterated Function System (IFS). Jacquin proposed the Partial Iterative Function System (PIFS) and the basic fractal coding, inspired by IFS. The implementation of BFC algorithm is as follows.

When encoding, the image is first divided into a Range Block (R block) and a Domain Block (D Block). The R blocks do not overlap with each other, and the entire R blocks can form the original image. The D blocks can be generated by sliding a certain size window in the original image according to a preset step size which are allowed to have overlapping areas. To ensure the convergence of PIFS, the block size of D should be larger than that of R (usually the D block size is twice larger than the R block).

Supposing the size of the original image is M × M, the block size of R should be B × B and the block size of D block will be 2B × 2B. The sliding step is represents by , and then the whole image is divided into R blocks and D blocks.

In order to match the D and R blocks, the D block needs to do the spatial contraction, and its size is scaled from 2B × 2B to B × B. Usually, the four-neighbor pixel averaging method is used for spatial contraction (as shown in Figure 1), and S represents the spatial contraction operation. is the contracted D block, and then the value of each pixel in can be written aswhere and are the gray values of and at the point , respectively.

For each R blocks, it is necessary to find the closest block in the contracted D blocks to obtain the fractal code. According to PIFS, between the R block and the D block there is an affine similarity, which means that the R block is similar to the D block after the equidistant transformation (rotation, flipping, etc.). Jacquin simplified and summarized the isometric transformations, which is shown in Table 1.

In the table, represents the pixel value of point and represents the isometric transformation.

After the entire D block is contracted and isometric transformed, the codebook (also called the definition domain pool) required for block matching is formed. For convenience, the block in the codebook is represented by D.

After the above process, for any , the D block in the codebook is selected for match searching, that is, to find a best matching block , so that is the closest one to after proper affine transformation, which is

The matching error between D block and R block is , which is calculated by the following [24, 25]:

When searching in different D blocks, the one with the smallest matching error is selected as the best matching block. The procedure can be expressed asin which represents the inner product; R and D represent and , respectively; and represent the constant block formed by the mean value of R and D; represents the vector 2 norm. Such tetrad is the fractal code of block R.

The procedure of the BFC is shown in Figure 2.

The fractal decoding process is relatively simple. By compression transformation, the decoded image can be applied to any image of the same size as the original image. corresponds to the fractal code after 5~8 iterations (8 iterations are selected in this paper). The R block of the image generated by the kth iteration is determined by

3. The Proposed Algorithm

3.1. Quadtree Partitioning Based on Region of Interest

As can be seen from Section 2, the fractal coding method only encodes each R block after image segmentation, not the entire image. Therefore, the block size of R directly affects the compression ratio and the image quality. The larger the block size of R is, the less the block number of R will be. Accordingly, the less “entry” in fractal code, the higher compression ratio which causes the increasing of the matching error. On the contrary, the smaller the block size of R is, the better the decoded image quality will be, but the compression ratio becomes lower. In order to balance the compression ratio and the decoded image quality, this paper improved the fractal coding based on quadtree partitioning.

Quadtree partitioning is similar to the tree structure in data structure. The original image can be regarded as the root of the tree, and the four fork points (corresponding to the four quadrant regions of the image respectively) are separated from the root node. Each of the fork points can select whether to further divide the four new subnodes according to certain rules (corresponding to four sub-regions of the image area corresponding to the intersection, respectively). The new subnode continues to split according to this rule, and when the segmented node reaches the maximum segmentation depth, the segmentation will stop. Then the entire image is divided into several subblocks with different sizes.

In practice, the gray-scale uniform criterion is generally used for the quadtree partitioning. When the block size of one image is larger than the minimum segment size, which means the difference between the maximum gray value and the minimum gray value is greater than a preset threshold, the block is further divided. Otherwise the block does not need to be split.

For different size image blocks generated after segmentation, the fractal code can be obtained by match searching in the corresponding codebooks using the basic fractal coding method. Therefore, the quality of the decoded image can be increased by dividing the portion of the image with a smaller difference in the pixel value by a larger size, thereby increasing the compression ratio and dividing the portion with a larger difference in the pixel value by a smaller size, thereby ensuring the decoding image quality.

Figures 3 and 4 show the effect after the quadtree segmentation of the ordinary grayscale image Lena and the UWAI Ship (the maximum segmentation size is 16 × 16, and the minimum segmentation size is 4 × 4) respectively. As can be seen from Figure 3, the Lena image has a good segmentation effect, all blocks of , , and account for a certain percentage. Due to the noise in UWAI, the effect of quadtree partitioning is not obvious. All blocks in the image are not smaller than the gray threshold and, therefore, further subdivision is needed. For the blocks subdivid, only a few blocks satisfy the requirement with no need for further segmentation, most of blocks are divided into blocks. It is conceivable that when image noise pollution is more serious, the effect of quadtree segmentation will be exactly the same as segmentation. Therefore, new quadtree segmentation methods that suitable for the UWAIs are needed.

Considering that the main information of the UWAI is concentrated in the area where the target is located, the noisy background area does not need much attention. For this reason, different size segmentation is used for different important areas in the UWAIs in this paper. The important area in the image (hereinafter referred to as the region of interest) is divided by a small size, and a large size segmentation is adopted for the noninterest region. The specific segmentation process is as follows.

(1) Image Denoising. UWAI usually has noise interference, so the image needs to be denoised first to reduce the impact of these interferences on the region of interest in the image. Since this step is to prepare for the extraction of the region of interest in the next step, in order to reduce the nonedge line interference, the image is smoothed while maintaining the contour of the region of interest while denoising. In this section, the average filtering, median filtering, Butterworth low-pass filtering, Gaussian low-pass filtering, Wiener filtering, and other methods are used for denoising. Figure 5 shows the denoising effect of the UWAI Ship by different methods. It can be seen that the five methods produce different degrees of smoothing effects on the image. The average filtering, median filtering and Butterworth low-pass filtering blur the boundary between the target and the background while denoising, which is not conducive to the extraction of the region of interest. The Gaussian low-pass filtering does not have the problem of boundary blurring, but the smoothing effect is not good, while wiener filtering has good denoising effect and maintains the contour of the target area which is the reason for selecting it in this paper.

(2) Extraction of Interest Region. In this section, the interest region in the UWAI is extracted by means of edge detection and morphological operations. Both the target area and the shadow area formed by UWAI contain important information. The principle of extracting the area of interest is to include some background areas, which will result in the final area of interest being larger than the range extracted by traditional target segmentation. The specific steps for extracting the interest region are as follows (for the sake of convenience, the target and the shadow are collectively referred to as the target):

① Use the canny operator for edge extraction on the denoised image.

② Morphological expansion is used to connect discontinuous target edge lines into closed lines.

③ The nontarget contour lines caused by interference are removed by morphological operation.

④ Fill the target contour area.

⑤ Morphological expansion is applied to expand the target area, and the expanded area is the required interest region for coding.

Figure 6 shows the results of the extraction of the regions of interest for the two UWAIs

(3) Image Segmentation. On the basis of Section 2, small-size segmentation is performed for the region of interest, and large-scale segmentation is adopted for the non-interest region. Figure 7 shows the segmentation effect of UWAIs by the small size 4 × 4 and the large size 8 × 8.

3.2. Reducing the Search Area of R Blocks

After the UWAI is segmented, R blocks of different sizes are generated. For each R block, it is necessary to search for the best matching block in its corresponding codebook to obtain a fractal code. According to the basic fractal coding proposed by Jacquin, the R block searches in all codebooks, which will cost a long time on encoding. Therefore, the search area of R block is reasonably reduced, and the matching process between R block and D block is optimized to reduce the calculation amount of encoding process and improve the encoding speed.

From formula (3):in which, is the total number of R block pixels, is the mean value of , represents the vector inner product, and is the variance of .

As can be seen from (6), the following holds for any :

When the variance of the R block is small, the matching error is accordingly small. Therefore, the R block is divided into two categories according to the variance: the R block whose variance is less than the preset threshold is regarded as a smooth block; otherwise it is regarded as a nonsmooth block. For a smooth block, the pixel value of each point is replaced by its pixel mean value, so there is no need to search again. For nonsmooth blocks, in the next section the following methods are adopted.

Equation (5) also shows that the D block with smaller variance is unlikely to be the best matching block of the R block with larger variance (if these two are the best matching pairs, it is likely that is greater than 1 and then the PIFS will no longer converges). Therefore, D blocks with smaller variance can be eliminated from the codebook in advance. The reduction codebook is defined asin which, is the preset threshold. For nonsmooth R blocks whose variance is greater than the threshold , it is only necessary to find the best matching D block in the reduced codebook.

3.3. Match Searching Based on the Correlation Coefficient

When searching for the best matching block of the R block, it is necessary to calculate the parameters according to (5) and then calculate the matching error between the R block and each D blocks. Although Section 3.2 has narrowed down the searching area, the computation of this process is still large. If the D blocks that may not be the best matching block of the R block can be removed in advance, a large number of complicated calculations can be avoided and the encoding speed can be improved. In this paper, according to the correlation coefficient between R block and D block, D blocks with little correlation is excluded; thus the encoding process can be speed up.

For R block and D block, the correlation coefficient is represented byin which represents the mean of , represents the product of the vector, and represents the norm of vector 2.

This is available in formula (3)

Equation (10) shows that, with the increase of absolute value of the correlation coefficient between the R block and the D block, the possibility of the two blocks becoming matching pairs increases. For the R block of the variance (in which is a preset threshold), it is known from (10)

This means that if the R block matches the D block ( is small), is larger which means that enough large is the necessary and sufficient condition for the R block matching for the D block.

According to the above analysis, in the encoding procedure, for each R blocks, the correlation coefficient and the matching error of the first D block in the reduced codebook are calculated, respectively as and . And then the correlation coefficient between the other D blocks in the reduced codebook and this R block are calculated. If , then the matching error between block D and block R is further calculated; otherwise the D block is ignored and turned to the next D block. This can prevent the calculation of the parameter and the matching error for all D blocks, which greatly reduces the computation complexity of the entire encoding process and saves the encoding time.

3.4. The Encoding Procedure of the Proposed Algorithm

For a UWAI to be compressed, firstly extract the interest region. Then divides the interest region and the non-interest region into different sizes, and generates the codebooks with different size simultaneously. A reduced codebook is generated according to the threshold . According to the method proposed in Section 3.3 of the reduction codebook, the fractal code is obtained by searching and matching the divided R blocks of different sizes. The detailed encoding process is as follows:

(1) Preset the threshold of R block, codebook threshold , etc.

(2) According to the method proposed in Section 3.1, the UWAI is segmented by quadtree partitioning based on region of interest.

(3) According to the R block partition size, two codebooks of different sizes are respectively produced; and the reduced codebook is constructed according to the codebook threshold .

(4) For smooth blocks (R blocks with variance less than), take the constant block as approximated, and the fractal code is directly output without searching.

(5) For nonsmooth blocks (R blocks with variance greater than or equal to ), perform match searching in the reduced codebook. The search strategy is as follows: select the first D block in the reduced codebook and calculate its correlation coefficient and matching error with R block. Then, the correlation coefficient of other D blocks in the reduced codebook with the selected R block is sequentially calculated. If , the matching error of the two blocks is further calculated. If , let ; otherwise ignore this D block and move to the next. After traversing all the D blocks in the reduced codebook, select the D block corresponding to as the best matching block and the fractal code is output.

(6) Complete the encoding of all R blocks to generate an encoded file.

The specific steps of the encoding are shown in Figure 8.

3.5. The Decoding Process of the Proposed Algorithm

(1) Read information such as the information of the quadtree partition, the position of the best matching block of the R block in different size codebooks, the equidistant transformation mode, the contrast factor, and the grayscale offset.

(2) Define three images Y1, Y2, and X having the same size with the original image, in which X is used to store the image generated by the iterative process and Y1 and Y2 are used to generate two codebooks with different sizes.

(3) Initialize Y1 and Y2.

(4) For each R block in X, according to its fractal code , locate its corresponding best matching block in the corresponding codebook Y and restore the R block according to the following formula:

in which is a four-neighbor averaging operation. After all the R blocks are obtained by the above affine transformation, the iterative image is formed by tiling the above R blocks.

(5) If the number of iterations reaches 8, go to step (6); otherwise, copy X to Y1, Y2 and go to step (4).

(6) Output image X.

4. Experimental Results and Analysis

In order to verify the effectiveness of the algorithm, two UWAIs: ship and plane were used for the simulation experiment. The test image is shown in Figure 9. The PC configuration used in the experiment is as follows: CPU: Intel Core i5-3470, Memory: 8G, Operating System: Windows10, Simulation Software: MATLAB2015B. The parameters that test the performance of the algorithm are as follows: coding time, PSNR, and compression ratio. PSNR is the objective evaluation standard for quality of the decoded image, which is obtained byIn which is the pixel value of the original image with size at the point and is the pixel value of the decoded image with size at the point .

(1) The Influence of Parameters on Encoding Performance. The two parameters and in the algorithm directly affect the encoding time and the quality of the decoded image. The optimal values for the two parameters are difficult to determine theoretically and therefore they can only be selected experimentally. In order to observe the influence of parameters on the decoded image clearly, the Lena image is selected as test image. When analyzing the influence of the parameter on the coding performance, it is set as to eliminate the interference of other factors. Similarly, when analyzing the parameter accordingly set .

① Experimental study of the parameter . The larger is, the more number of R blocks are considered as smooth blocks. Since the smoothing block does not need to perform match searching, the encoding speed is faster. However, since more R blocks are smoothed, the quality of decoded images is also degraded (such as Figure 10). It can be seen from Figure 10 that when , blocking effect occurs in some parts of the decoded image, and the larger is, the more obvious the blocking effect will be. When , the blocking effect is almost negligible. Figure 11 is the comparison of decode image with different parameters. Therefore, the best value of this parameter is 4.

② Experimental study of the parameter . The value of determines the size of the reduced codebook. As can be seen from Figure 12, the larger is, the shorter the encoding time will be. When increases from 0 to 20, the PSNR decreases slowly, and when increases from 20 to 45, the PSNR drops rapidly. This means when the threshold is small, the encoding time can be significantly reduced at the cost of the decoded image quality’s slightly lower. While the threshold is large, the best matching block is likely to be excluded from the codebook, resulting in degrade in the quality of the decoded image quality. Considering the encoding time and the decoded image quality, this paper chooses 20 as the best value for .

(2) Comparison of the Proposed Algorithm with Other Algorithms. The maximum segment size of the algorithm in this paper is 8 × 8, and the minimum split size is 4 × 4. The R block size of other algorithms used for comparison is also chosen as the above two sizes. Table 2 lists the experimental results of the proposed algorithm and the basic fractal algorithm (BFC) [6], the variance based fast search algorithm (VBFC) [26] and the particle swarm optimization algorithm (PSO) [27].

It can be seen from Table 2 that as the partition size increases from 4 × 4 to 8 × 8, the coding time and the PSNR of the BFC algorithm, the VBFC algorithm, and the PSO algorithm are gradually reduced and the compression ratio is gradually increased. In the comparison of the encoding time, the proposed algorithm of is only 1/53 of the 4 × 4 BFC algorithm and 1/13 of the 8 × 8 BFC algorithm, which is mainly own to the optimization of the encoding process. According to the strategy of Section 3.2, by reducing the codebook capacity, searching for matching based on correlation coefficient, setting the ending condition of matching, etc. The calculation amount of the encoding process is effectively reduced, and the encoding speed is greatly improved.

Since most of the images are noninterest regions, this paper adopts a large size 8 × 8 segmentation for these parts. And for interest region, the small size 4 × 4 segmentation is adopted. Therefore, the compression of the algorithm in this paper is significantly improved compared with the basic 4 × 4 algorithm. In the quality of the decoded image, it can be seen from Figure 13 that the decoding quality of the interest region in the image is close to the BFC 4 × 4 algorithm, and the recovery of the noninterest region approaches the BFC 8 × 8 algorithm. Although the PSNR is not as good as the BFC 4 × 4 algorithm, the important information in the image is well recovered. The proposed algorithm balances the compression ratio and the decoded image quality well.

The VBFC algorithm and the PSO algorithm use the method of variance approximation matching and particle swarm optimization to improve the encoding speed. It can be seen from the experimental results that the coding time of the VBFC and PSO algorithms decreases with the increase of the R block partition size, but even the PSO (8 × 8) algorithm with the smallest coding time is also longer than the encoding time of the proposed algorithm. This is mainly because the algorithm in this paper comprehensively adopts multiple strategies such as reduced codebook and fast search based on correlation coefficient, while VBFC algorithm and PSO algorithm only improve the match searching mode, and the calculation efficiency is not good than the proposed algorithm.

It can be seen from Figure 13 that the quality of the decoding image in the proposed algorithm is higher than the VBFC and PSO algorithms when the R block partition size is 8 × 8. For the interest region in the image, in order to reduce the matching error and improve the image quality, this paper adopts a small size 4 × 4 segmentation to preserve the important details in the image. Results show that the recovery quality is better than the other two algorithms with the same segmentation size. In addition, the compression ratio of the algorithm proposed in this paper is also significantly higher than that of the VBFC and PSO algorithms using 4 × 4 segmentation (as shown in Table 2).

The proposed algorithm is also compared with the quadtree fractal algorithm. The quadtree fractal algorithm was first proposed by Fisher, but Fisher’s method needs to calculate the matching error of four subblocks in each block, and the matching process adopts the full search strategy which is the same as the BFC algorithm, resulting in a large amount of computation and long encoding time. Therefore, this paper chooses an improved quadtree fractal coding algorithm for comparison. The improved algorithm firstly performs quadtree decomposition on the image according to the gray uniformity criterion and then performs matching search on the decomposed R blocks of different sizes in the reduced codebook. The encoding efficiency is greatly improved compared with Fisher’s method.

The maximum segmentation size of the improved quadtree algorithm and the proposed algorithm are 8 × 8, and the minimum segmentation size is 4 × 4. As can be seen from Table 3, when the quadtree algorithm is used to decompose the image, the small size image block occupies a large proportion, which leads a significantly lower compression ratio of the quadtree improvement algorithm compared to the proposed algorithm. At the same time, due to the large number of small sized blocks, the quadratic tree improved algorithm has a larger computational complexity and because of adopting the full search strategy, which results in a much longer coding time than the proposed algorithm.

5. Conclusion

In view of the particularity of UWAIs, this paper uses fractal coding based on partial similarity to compress the UAWI. In order to improve the encoding speed and compression ratio, this paper proposes a fractal coding algorithm based on interest region and correlation coefficient. The algorithm divides the interest region in the image into small size and divides the noninterest region into large size, which effectively increases the compression ratio and recovers the information of important regions well. At the encoding stage, the R block searching range is reduced by reducing the codebook and classification, and the inappropriate D block is pre-excluded according to the correlation coefficient between the R block and the D block, which greatly reduces the calculation amount of the encoding process. The simulation results verify that the proposed algorithm not only improves the UWAI compression ratio but also significantly reduces the encoding time, while at the same time ensuring the restoration quality of the interest region in the image.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors acknowledge the project of the National Natural Science Foundation of China (Grant no. 61701487), the Innovation Foundation of Chinese Academy of Sciences (Grant no. CXJJ-17-M126), the Natural Science Foundation of Hainan Province (Grant no. 417211), the Young Talents’ Science and Technology Innovation Project of Hainan Association for Science and Technology (Grant no. QCXM201812), the National Key Research and Development Program of China (Grant no. 2016YFC1400100), the Strategic Priority Research Program of Chinese Academy of Sciences (Grant no. XDA13030000), and the Fundamental Research Funds in Heilongjiang Provincial Department of Education (no. 135209239). The authors also thank the Technical Bureau of Qiqihar GYGG-201622.