Abstract

Wavelet analysis of integral images can be used to extract the depth of photographed/synthesized 3D objects. The result of the analysis, however, depends on the colour/texture of the object and can therefore be ambiguous. In this paper, we propose to normalize the image before processing in order to avoid this ambiguity and to extract the depth without regard to the colour/texture. The proposed technique is verified on multiple integral/plenoptic images and can be applied to multiview and light-field displays as well.

1. Introduction

There are many technologies in 3D imaging [1–3]. In particular, wavelets are used in 3D imaging, for instance, in multiview video compression [4], in image coding [5], in image fusion [6], as a quality metric [7], etc. The disparity of stereoscopic images can be estimated [8], the shape of photographed/synthesized objects can be analysed, and the depth can be extracted using the wavelet analysis of integral images [9–11]. In this paper, we propose a technique to eliminate the texture effect. Owing to the structural similarity between 3D images in which the 3D content is represented in a single image plane consisting of logical image cells [12], our results can be applied to integral [13–15], multiview [16–18], and plenoptic images, as well as to light-field displays [19–21].

The result of the wavelet analysis of 3D images, however, depends on the colour/texture of the surface of the object. This effect was noticed before: most images in [9–11] were binary black-and-white (BW) images, and most results were presented in a qualitative, visual form. This undesirable effect must be reduced, but until now no solution has been known.

The intensity at any point of the image in the image plane is proportional to the brightness of the corresponding point of the object. The intensity of all separated parts of a voxel pattern [22] is equal to the brightness of the corresponding point of the object. The voxel patterns preceded the wavelets; thus, this important property is preserved in the wavelets. Therefore, the wavelet coefficients depend on the colour, and the result of the wavelet analysis of a multiview image is proportional to the brightness of the voxels (or of the pieces of surface in a texture model).

Define the central view (CV) of an integral image as an image in which the central pixels of all image cells of the integral image are assembled in accordance with the location of the cells; i.e., the centre pixel of the top-left cell goes to the top-left corner of the CV, the centre pixel of the top-right cell goes to the top-right corner of the CV, etc. Such an image would be seen by a hypothetical (nonexistent) camera located at the centre of the lens array.

The CV can be calculated by applying the known interlacing technique (see, e.g., [23]) along both dimensions of the integral image. The CV can be calculated for binary, greyscale, or colour images; see Figure 3. In this paper, the CV is used solely as a simple graphical picture of all objects of a 3D image, because in the original integral image the objects may look unclear (nonsharp, out-of-focus, blurred, etc.); compare Figures 1(a) and 1(b).
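
For concreteness, the assembly of the CV can be sketched in a few lines of code. The Python fragment below only illustrates the definition above (the actual interlacing implementation of [23] is not reproduced here); the cell pitch `cell_size` is a parameter of the particular image.

```python
import numpy as np

def central_view(integral_image, cell_size):
    """Assemble the central view (CV) of an integral image with a
    square grid of cells: pick the centre pixel of every image cell
    and place it according to the position of the cell."""
    h, w = integral_image.shape[:2]
    ny, nx = h // cell_size, w // cell_size  # number of cells per dimension
    c = cell_size // 2                       # offset of the centre pixel
    rows = np.arange(ny) * cell_size + c
    cols = np.arange(nx) * cell_size + c
    return integral_image[np.ix_(rows, cols)]

# e.g., the 800 x 800 image of digits with 20 x 20 cells yields a 40 x 40 CV
```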

What is important is that the CV does not need to be calculated for the wavelet transform; no wavelet operation requires the CV. In this context, the central view is nothing but a descriptive illustration for the paper.

Later in this paper, we will refer to an integral/plenoptic image by its CV (e.g., "in Figure 1(b)"), but such a reference actually means the integral image itself (i.e., "Figure 1(a)") corresponding to the named CV.

Previously, the colour dependence remained behind the scenes, and the depth was mixed with the colours. In this paper, knowing an answer, we try to open that door slightly. To restore the depth more correctly, we propose to use the normalized image. First, we illustrate the dependence of the wavelet analysis of the integral images on colours.

2. Dependence of Wavelet Coefficients on Colours

Consider the colours of the three digits in the "123" image. If all colours are identical (as in the BW image in Figure 1), then the wavelet coefficients are the same for every digit. Note that the BW image is a repainted version of a colour image originally provided by Prof. B. Lee.

The wavelet analysis of the binary BW image in Figure 1 shows an almost uniform shape (a flat-top curve) for the depth planes between -6 and +6, where the digits of this 3D image are presumably located in space; see Figure 2(a). The wavelet coefficients for each digit, processed separately, are shown in Figure 2(b); all three maxima are close to each other.

Based on Figure 2(b), one may conclude that digit 1 is located between two positive planes, digit 2 near the zero plane, and digit 3 between two negative planes (the particular planes can be read from Figure 2(b)). Outside of this region (beyond plane ±3), the wavelet coefficients decay monotonically. Therefore, in the related Figures 4 and 5 below, we will show the wavelet coefficients within the depth region [-2, +2] only, where the expected result is a flat-top horizontal line.

However, if the shades of grey of the digits or their colours are not the same, as in Figure 3, the results of the wavelet analysis differ for every digit.

The results of the wavelet analysis (wavelet coefficients) of greyscale images, Figures 3(a) and 3(b), are shown in Figure 4(a). The wavelet coefficients of the colour digits, Figures 3(c) and 3(d), are shown in Figure 4(b).

There is an essential difference between the wavelet coefficients of digits of various colours. In all cases, instead of a flat top with decay (as in Figure 2(a) for the binary image), the wavelet coefficients of the grey and colour images may rise, fall, or have a maximum or minimum in the middle; see Figure 4. These graphs can be interpreted ambiguously: either the digits in the same plane have different colours, or the digits of the same colour are located at different distances. A variety of intermediate interpretations is also possible.

This confirms a strong dependence of the results on the colours of voxels (texture); it is an undesirable side effect in the depth (shape) extraction. Below, we describe how to eliminate it.

3. Materials and Methods

In our examples, the source images are integral/multiview/plenoptic images with a square grid of image cells (a cell is the area under one lens of the array). The images were taken from different independent sources and are either photographs or synthesized (computer-generated) images.

The multiview wavelets and the algorithm of the continuous multiview wavelet transform are exactly the same as presented in our previous papers on the multiview wavelets [10, 11].
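
Since the transform itself is described in [10, 11], we only sketch how it is applied plane by plane. In the Python fragment below, `transform(image, plane)` is a hypothetical stand-in for the continuous multiview wavelet transform (its signature is an assumption for illustration only); the per-plane mean modulus corresponds to the curves shown in Figure 2.

```python
import numpy as np

def depth_profile(integral_image, planes, transform):
    """Aggregate wavelet response per depth plane.

    For each candidate plane, `transform` returns an array of wavelet
    coefficients; the mean modulus serves as the per-plane response,
    as in the flat-top curves of Figure 2.
    """
    return np.array([np.abs(transform(integral_image, p)).mean()
                     for p in planes])

# e.g., profile = depth_profile(img, planes=range(-6, 7), transform=mv_cwt)
```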

In order to avoid the undesirable effect of the texture and to restore the spatial structure without regard to the colours, we propose to process the so-called normalized image [24]. A normalized image is typically used to reduce nonuniform illumination in a local neighbourhood. The normalized image can be built by the algorithm of [25]. The normalized image corresponding to the original image of digits (Figure 1) is shown in Figure 5(a); the result of the wavelet analysis of this normalized image is shown in Figure 5(b).
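
The exact algorithm is given in [25]; for illustration, below is a minimal sketch of a typical local-normalization scheme, assuming the common two-Gaussian form (subtract a local mean estimated at scale σ1, then divide by a local standard deviation estimated at scale σ2). The default σ values are those used for most images in this paper (see Section 5).

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def local_normalize(image, sigma1=18.0, sigma2=24.0, eps=1e-8):
    """Local normalization of a greyscale image (sketch, assuming the
    two-Gaussian scheme): subtract the local mean (Gaussian, sigma1)
    and divide by the local standard deviation (Gaussian, sigma2)."""
    img = image.astype(np.float64)
    centred = img - gaussian_filter(img, sigma1)      # remove local mean
    local_var = gaussian_filter(centred ** 2, sigma2)  # local variance
    return centred / (np.sqrt(local_var) + eps)        # eps avoids division by zero
```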

Despite minor imperfections (such as a not perfectly horizontal flat top), the graphs in Figures 2(a) and 5(b) are more similar to each other than to any graph in Figure 4. This means that the normalized image predominantly contains the information about the 3D structure (depth) rather than about the colours (texture). For instance, the RMS difference of the wavelet coefficients between the processed RGB and BW images is 1.2 A.U. (normalized RMS error 60%), and between the grey and BW images 0.7 A.U. (48%); between the BW and normalized images, however, it is 2.5–4.2 times smaller (0.3 A.U., 19%). This is not a complete elimination of the undesirable effect, but an essential reduction of it.
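
For reference, the comparison metric can be computed as follows. The exact convention behind the quoted percentages is not specified above, so the normalization chosen here (relative to the RMS of the reference coefficients) is an assumption.

```python
import numpy as np

def rms_difference(coeffs_ref, coeffs_test):
    """RMS difference between two arrays of wavelet coefficients and
    the normalized RMS error, taken relative to the RMS of the
    reference array (an assumed convention)."""
    a = np.asarray(coeffs_ref, dtype=float)
    b = np.asarray(coeffs_test, dtype=float)
    rms = np.sqrt(np.mean((a - b) ** 2))
    nrmse = rms / np.sqrt(np.mean(a ** 2))
    return rms, nrmse
```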

Before processing, the colour images were converted to greyscale, but this is not necessary: full-colour images can be processed as well, by processing the R, G, B colour components separately.
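
A minimal sketch of both options follows; the greyscale weights (ITU-R BT.601) are one common choice and are an assumption, not necessarily the conversion used for the figures in this paper.

```python
import numpy as np

def to_grey(image):
    """Greyscale conversion with BT.601 luma weights (assumed choice)."""
    return image[..., :3] @ np.array([0.299, 0.587, 0.114])

def process_channels(image, process):
    """Apply a single-channel pipeline (e.g., normalization followed by
    the wavelet transform) to the R, G, B components separately;
    `process` is any function mapping a 2D array to a 2D array."""
    return np.stack([process(image[..., c]) for c in range(3)], axis=-1)
```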

The dimensions of the images are as follows: the image of digits is 800 x 800 pixels (CV and normalized image are 40 x 40 pixels), the rabbit 1350 x 1350 (CV and normalized 45 x 45), the books 4031 x 4031 pixels (CV and normalized 139 x 139), and the house 4575 x 4575 pixels (CV and normalized 75 x 90).

4. Results

To illustrate the proposed technique, we applied it to plenoptic/integral images from various sources. The normalization is followed by the wavelet transform. In the examples, we compare the results (i.e., the wavelet coefficients of the source and the normalized images) by depth planes and along rows (horizontal lines) of the array of coefficients.

The original and normalized images of the house are shown in Figure 6 (recall, this is a CV).

Consider, for example, the depth plane 0 and the row 70 of this image. The boundary line between the path and the lawn coincides with a change of the colour of the texture. Because of that, the depth of this line is concealed (hidden) in the original image; see the wavelet coefficients and the graph along the horizontal row in Figures 7(a) and 7(b). Importantly, in the normalized image this line can be clearly seen as a separate pulse in Figures 7(c) and 7(d).

N.B. The first image of each pair in Figure 7 displays the modulus of the wavelet coefficients (the black colour means maximum, and the white colour means zero), while the second graph is the full profile along the selected row. The same layout will be used later in Figures 9, 11, and 12.
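
For completeness, this layout can be reproduced as in the following sketch (matplotlib); drawing the dashed line at the mean of the whole row is a simplification of the average levels discussed below.

```python
import numpy as np
import matplotlib.pyplot as plt

def show_plane(coeffs, row):
    """Two-panel layout as in Figure 7: the modulus of the wavelet
    coefficients (black = maximum, white = zero) and the full profile
    along the selected row with its mean level as a dashed line."""
    fig, (ax_img, ax_row) = plt.subplots(1, 2, figsize=(8, 3))
    ax_img.imshow(np.abs(coeffs), cmap='gray_r')  # inverted greys
    ax_img.axhline(row, linewidth=0.5)            # mark the selected row
    ax_row.plot(coeffs[row])                      # full profile along the row
    ax_row.axhline(coeffs[row].mean(), linestyle='--')  # average level
    plt.show()
```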

Also, note the difference in the average (mean) level on the body of the car between the columns 45 and 65 as indicated by the dashed line in Figures 7(b) and 7(d). The influence of the texture colour of the body of the car is clearly reduced in Figure 7(d) down to the average level of that row.

The source and normalized images of books are shown in Figure 8.

In this example, we consider the depth plane -1 and the row 95. Figures 9(c) and 9(d) clearly show the recognized 3D edge (the spine of the book) as a separate pulse, rather than a texture-induced effect (a step pulse) as in Figures 9(a) and 9(b).

The influence of the texture is reduced in this image too. The average level at the cover of the book is almost the same along the row in the processed normalized image (shown in Figures 9(b) and 9(d) by the dashed line).

N.B. The letters “EG” on the cover of the book in Figure 9(a) are not a restored 3D structure, but rather the texture of the surface.

The source and normalized images of the rabbit are shown in Figure 10. In the 3D analysis, two planes will be considered.

(1) Plane 2, row 13. The eye of the rabbit becomes clearly recognized in the normalized image, while it is completely invisible in the unnormalized one; see Figure 11.

(2) Plane 5, row 17. The same is valid for the nose. The nose is clearly recognized in the processed normalized image, while it is hidden in the source (unnormalized) image; see Figure 12.

These two examples also demonstrate that the influence of the texture is reduced in the normalized images, so that some previously unrecognized features appear.

5. Discussion

We processed greyscale images: before processing, the colour images (if any) were converted to greyscale, but this is not necessary. Full-colour images can be processed as well, by processing the RGB colour components separately. Also, we used the BW normalized image; in general, however, each colour component can be normalized individually, and a colour normalized image can be obtained.

The numbering of the rows and columns in the array of wavelet coefficients differs slightly from that of the rows and columns of the CV because of the different sizes. (The size of the CV is fixed, but the array size at each depth plane varies, because we made no assumptions about the behaviour of the image beyond the borders of the original image.) The difference is less than the number of the current depth plane; this small difference is beyond the scope of the current paper.

The 3D images used in this paper were obtained by independent authors either from a plenoptic camera (Figures 6 and 8) or by means of computer simulation of integral imaging (Figures 1 and 10). In either case, a lens array with a square grid of microlenses, or its computational equivalent, was used. High-quality lens arrays are known for their very uniform structure (a small deviation of the lens pitch across the array). The proposed image-processing procedure can also be applied to 3D images built on a hexagonal grid of lenses (with properly redesigned wavelets).

Instead of a lens array, 3D images can also be obtained from a camera array, as described in [27] and in the references of [3]; on the other hand, the layout of the cameras in a camera array may be less uniform than that of a lens array. Moreover, 3D imaging with sensors at random locations has been demonstrated [28]. Therefore, we expect that wavelet processing can in principle be applied to 3D images from camera arrays; however, the wavelets would have to be radically modified in this case.

In the normalization of most images (only a few of the many processed images are presented in this paper), we used the default values σ1 = 18, σ2 = 24; in a few cases, however, these values were reduced to σ1 = 4.5, σ2 = 6.

6. Conclusions

The influence of colours (texture) is essentially reduced by processing the normalized image instead of the source full-colour or greyscale image. The technique is confirmed by processing integral/plenoptic images from independent sources. A numerical comparison became possible between the planes and within each plane (because the wavelet coefficients are normalized by definition, and because the integral image itself is used in the normalized form). Integral, multiview, and plenoptic (light-field) images with a square grid of cells can be processed; colour images can be processed as well. The proposed technique can be efficiently used in 3D imaging for depth extraction and shape reconstruction without regard to colour/texture.

Data Availability

The plenoptic/integral image data used to support the findings of this study were obtained from two sources. Namely, two of the four images used in the examples are freely available on the website “http://www.tgeorgiev.net/” by Dr. T. Georgiev [26]; the other two images were kindly supplied by Prof. B. Lee upon our personal request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

This work was partially supported (for one author, Prof. I. Palchikova) by the Russian Foundation for Basic Research and by the Ministry of Education, Science, and Innovative Policy of the Novosibirsk Region within the framework of research project No. 17-47-540269. We are grateful to Prof. B. Lee for the images provided to us personally (the digits and the rabbit). We also thank Dr. T. Georgiev for his wonderful and very useful website (research until 2014), particularly for the images freely available at that site (the input and the house crop).