Abstract

High-quality 3D scenes often render poorly and perform inadequately on low-end devices. Making better use of 360-degree panoramic technology to replace traditional purely model-based scenes through visual deception is therefore of great significance for improving picture quality and optimizing performance on low-end equipment. In this paper, a three-degree-of-freedom (3-DOF) application scenario for indoor simulation is used as an example. On the basis of 360-degree panoramic technology, depth information is obtained through laser ranging, converted into a two-dimensional depth image with the help of spherical coordinates, combined with a transparency information image, compressed using the DEFLATE compression algorithm, and finally stored. The data are optimized through bilinear sampling, and with the help of geometric knowledge, the occluded parts are removed or made translucent according to the actual situation when the graphics are rendered; the occlusion relationship is thereby handled correctly in order to achieve a real integration of the virtual environment with 3D objects. Experimental results show that the method yields a significant performance improvement while ensuring geometric realism.

1. Introduction

With the development of digital media technology, 3D rendering has improved continuously and been applied in various fields [1–10]. However, generating 3D images requires heavy computation, which significantly raises the performance requirements of the operating equipment. In order to make high-quality 3D images available on low-end devices, 360-degree panoramic technology is often used instead of the pure modeling scheme (PM) [11–16]. Nevertheless, because the virtual background is noninteractive, it is difficult to integrate it realistically with interior 3D objects; as a result, it is generally used only for distant scenes or on its own. Therefore, to further replace the 3D scene and increase the sense of reality, the occlusion relationship between the two must be resolved.

Ever since augmented reality was first proposed, relevant studies on occlusion consistency have been carried out [17–21]. Among them, the technique that can be applied directly to 3-DOF application scenarios is the occlusion scheme based on environmental modeling with elimination by rendering order (OEM) [22]. Because the virtual scene is invariant, this technique reconstructs the environment with a low-poly mesh; with depth testing and depth writing enabled, the rendering order is adjusted so that the environment model, whose opacity is set to 0, is rendered first, and the parts of the subsequently rendered 3D objects that lie behind the environment are culled. The main problem of this solution at present is that the low-poly model is not accurate and masks small objects poorly, while increasing the model accuracy raises the performance requirements of the equipment significantly, so it cannot play an optimal role. Another common solution is a technique borrowed from 2D games and implemented in 3D by layering the scene content into multiple 360-degree panoramas according to distance, creating a sense of different visual levels. This approach is simpler, but its problem is also obvious: there is no transition effect when 3D objects cross between layers.

In this paper, after comparing and testing various solutions, we propose an occlusion scheme based on a 360-degree panoramic depth image (OPDI) for 3-DOF application scenarios of indoor simulation. On the premise of optimizing performance and preserving spatial realism, the scheme uses the depth information of the 360-degree panoramic environment and eliminates the occluded parts of objects through depth calculations in the graphics pipeline, so that 3D objects form a realistic visual sense of occlusion with the environment even when only 360-degree panoramic image technology is used, preserving spatial realism while optimizing performance.

2. Method

This section introduces the optimization scheme for a 3-DOF application scenario of interior design simulation that uses 360-degree panoramic technology together with depth information. The 360-degree panoramic image is acquired by the camera. The 360-degree panoramic depth information is obtained by laser ranging, converted by spherical coordinates, compressed with the DEFLATE compression algorithm, and saved as a 16-bit single-channel 360-degree panoramic grayscale image; on the basis of this depth image and the translucent objects in the scene, a 360-degree panoramic transparency information image is then drawn. Finally, when a 3D object is rendered, the depth and transparency information of the environment relative to the camera is recovered for each spatial voxel and compared with the corresponding voxel of the 3D object, so as to determine how it should be rendered and displayed. Figure 1 shows the specific process, where parallelograms represent data and rectangles represent steps.

First of all, a 360-degree panoramic view of the required indoor environment is captured with a 360-degree panoramic camera, as shown in Figure 2.

At the same position as the camera, the laser ranging sensor rotates at high speed over 360 degrees horizontally and 180 degrees vertically, and the depth information at each angle is detected and collected in turn. The next step is to generate the associated 360-degree panoramic depth image. Unlike traditional 2D image methods, the depth information is built in a spherical coordinate system, which better suits the properties of the depth data, and a lossless depth image is then generated. Taking a left-handed coordinate system as an example, the corresponding spherical coordinates are shown in Figure 3. The information of each point in spherical coordinates is then mapped to the depth image, where the yaw angle ω determines the horizontal pixel coordinate of the depth image and the pitch angle φ determines the vertical pixel coordinate. It is worth noting that, for the convenience of the subsequent calculations in this spherical coordinate system, the yaw angle and the pitch angle are each measured from a fixed reference axis, while the distance value d is converted to a pixel value p. The mapping relationship is as follows:
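One consistent way to write this mapping, taking k as the quantization coefficient, j and i as the column and row indices of the depth-image pixel, and assuming that the distance d is scaled by a maximum measurable distance d_max into the 16-bit range, is

\[ j = \operatorname{round}\!\left(\frac{\omega}{k}\right), \qquad i = \operatorname{round}\!\left(\frac{\varphi}{k}\right), \qquad p = \operatorname{round}\!\left(\frac{d}{d_{\max}}\,(2^{16}-1)\right). \]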

In these formulas, the round(·) function rounds its argument to the nearest integer, and k represents the quantization coefficient of the angle, which determines the resolution of the 360-degree panoramic depth image. Since k corresponds exactly to the angular step size of the sensor, the density of the acquired depth information can be controlled effectively by adjusting this coefficient.

The generated depth images are saved as 16-bit single-channel grayscale images in PNG format, which uses DEFLATE compression derived from the Lempel–Ziv (LZ77) [23] algorithm; this guarantees lossless compression while reducing spatial redundancy. The final result is shown in Figure 4.
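A minimal sketch of this storage step is given below, assuming the depth samples have already been assembled into a 2D array; OpenCV is used here only as one possible 16-bit PNG writer, not as the tool named by the paper.

```python
import numpy as np
import cv2  # any library that writes 16-bit single-channel PNG would do

def save_depth_png(depth_m, d_max, path="panorama_depth.png"):
    """Quantize metric depth to 16-bit gray values and store as a lossless PNG."""
    # Map distances in [0, d_max] metres onto the full 16-bit range.
    p = np.clip(depth_m / d_max, 0.0, 1.0) * 65535.0
    gray16 = np.round(p).astype(np.uint16)  # single channel, 16 bit
    # PNG compresses with DEFLATE (LZ77 + Huffman), so the stored image stays lossless.
    cv2.imwrite(path, gray16, [cv2.IMWRITE_PNG_COMPRESSION, 9])

# Example: a full panorama sampled with a 0.1-degree step (1800 x 3600 pixels).
depth = np.random.uniform(0.5, 10.0, size=(1800, 3600)).astype(np.float32)
save_depth_png(depth, d_max=10.0)
```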

After acquiring the 360-degree panoramic depth image, the next step is the most important work of this paper: occlusion removal, which is carried out during graphics rendering. In the vertex shader, the model-space coordinates P_model of the 3D object's vertices are obtained and converted to world-space coordinates P_world = M · P_model and then further to camera-space coordinates P_camera = V · P_world, where M is the transformation matrix from model space to world space and V is the transformation matrix from world space to camera space. After the transformation, the results are passed to the fragment shader, in which the vertex coordinate information is automatically linearly interpolated, ensuring that the current camera-space coordinates of each voxel of the object are obtained. Then, the distance value of each voxel is calculated from its camera-space coordinates. The formula is as follows:
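Since the camera sits at the origin of camera space, writing (x_c, y_c, z_c) for the interpolated camera-space coordinates of a voxel, its distance from the camera is

\[ d_{\mathrm{obj}} = \sqrt{x_c^{2} + y_c^{2} + z_c^{2}}. \]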

The next step is to obtain the direction of each voxel so that the corresponding position in the 360-degree panoramic depth image can be looked up. According to the camera-space coordinates of each voxel, the yaw angle ω and the pitch angle φ relative to the center position are calculated according to the following formula:
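A representative form of this calculation, assuming that y is the vertical axis and z the forward axis of the left-handed camera space (these axis conventions are assumptions made here for concreteness), is

\[ \omega = \arctan\!\left(\frac{x_c}{z_c}\right), \qquad \varphi = \arcsin\!\left(\frac{y_c}{d_{\mathrm{obj}}}\right). \]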

It is worth noting that the value range of the arctan function is (−90°, 90°); therefore, an interval judgment of the angle is also required to ensure that ω lies in the [0°, 360°) interval. Likewise, since φ was previously defined to lie in the [0°, 180°] interval for convenient data storage, 90° must be added to it here in order to comply with the earlier convention for measuring the pitch angle.

Based on the yaw angle ω and the pitch angle φ obtained for each voxel of the 3D model, the two angles are mapped to the [0, 1] interval to serve as the UV coordinates of the corresponding pixel in the depth image, and the corresponding distance is then read out from the texture with the texture sampling function.
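With ω measured in [0°, 360°) and φ in [0°, 180°], a direct normalization yields the UV coordinates

\[ u = \frac{\omega}{360^{\circ}}, \qquad v = \frac{\varphi}{180^{\circ}}. \]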

Because of the finite interval between sampling steps, the final effect exhibits a certain amount of aliasing. Therefore, depending on the step size used when collecting the 360-degree panoramic depth map and on the application requirements, bilinear sampling can optionally be applied to the 360-degree panoramic depth image so as to smooth the data. Finally, the distance value stored in the depth image is read at the UV coordinates of the corresponding voxel and compared with the distance of that voxel to determine whether the voxel needs to be rejected, where the stored pixel value is rescaled by the maximum value that the depth image can hold.
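The following is a minimal CPU-side sketch of the per-voxel test performed in the shader; the helper names, the d_max rescaling, and the axis conventions follow the assumptions made above rather than quoting the shader code itself.

```python
import numpy as np

def bilinear_sample(img, u, v):
    """Bilinearly sample a 2D array at normalized texture coordinates (u, v) in [0, 1]."""
    h, w = img.shape
    x, y = u * (w - 1), v * (h - 1)
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = img[y0, x0] * (1 - fx) + img[y0, x1] * fx
    bottom = img[y1, x0] * (1 - fx) + img[y1, x1] * fx
    return top * (1 - fy) + bottom * fy

def is_occluded(p_cam, depth_img, d_max):
    """Return True if a voxel at camera-space position p_cam lies behind the panoramic environment."""
    x, y, z = p_cam
    d_obj = np.sqrt(x * x + y * y + z * z)           # distance of the voxel from the camera
    omega = np.degrees(np.arctan2(x, z)) % 360.0     # yaw folded into [0, 360)
    phi = np.degrees(np.arcsin(y / d_obj)) + 90.0    # pitch shifted into [0, 180]
    u, v = omega / 360.0, phi / 180.0                # UV coordinates into the depth image
    p = bilinear_sample(depth_img.astype(np.float32), u, v)
    d_env = p / 65535.0 * d_max                      # recover the environment distance
    return d_obj > d_env                             # reject the voxel if it is farther away

# Example: a synthetic panorama whose every pixel encodes a 5 m environment distance.
depth_img = np.full((1800, 3600), 32768, dtype=np.uint16)
print(is_occluded((0.0, 0.0, 6.0), depth_img, d_max=10.0))   # True: 6 m lies behind the 5 m wall
print(is_occluded((0.0, 0.0, 3.0), depth_img, d_max=10.0))   # False: 3 m lies in front of it
```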

The procedure above can already handle most scene occlusion effects, but in practice there are cases where full occlusion is not desired; for example, glass and other transparent environment objects require translucent processing rather than simple elimination, so a 360-degree panoramic image of transparency information is also needed. It is worth mentioning that, using drawing software, a copy of the depth image is made and the parts that need to be transparent are filled with different grayscale values according to the intensity of transparency, with white meaning completely opaque; the resolution can be reduced appropriately according to the actual situation. In this way, a 360-degree panoramic environmental transparency information image is obtained, as shown in Figure 5.

We can use this image to calculate the opacity of 3D objects and thus achieve a superposition effect with the 360-degree panoramic environment. The data acquisition method for the 360-degree panoramic transparency information image is the same as that for the depth information, where a is the opacity value of the corresponding point on the 360-degree panoramic transparency information image (a = 1 means completely opaque), t dynamically controls the adjustment of the transparency intensity, and the clamp function limits the value range to the [0, 1] interval; at the same time, the elimination condition must be redefined on this basis.
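One plausible way to combine these quantities for a voxel that fails the depth test, under the notation just introduced, is

\[ \alpha_{\mathrm{obj}} = \operatorname{clamp}\big((1 - a)\,t,\ 0,\ 1\big), \]

so that a fully opaque environment sample (a = 1) hides the occluded voxel completely, which is equivalent to eliminating it, while smaller values of a leave it partially visible behind glass-like surfaces.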

3. Experiment

In order to verify the superiority of the proposed scheme, we conducted three groups of control experiments on a virtual scene in Unity Engine 2020.3.11f1c1: OPDI, OEM, and PM. In the experiments, the Game panel was rendered at the same resolution for all schemes, and an indoor environment (model from the Unity Asset Store) was selected as the scene; the occlusion effect of the three schemes was tested with a rectangular cuboid as the 3D object.

For clearer observation, the display type of the Scene panel in Figures 6–8 is Sprite Mask. Through the Hierarchy panel and the Scene panel, we can clearly see the differences between these schemes in terms of model requirements: OPDI does not use any environment model, while the compression rate of the OEM model relative to the PM model is about 75%. The Game panel on the right side shows the occlusion effect under the different schemes, and there is almost no visual difference. OPDI also reproduces the translucent effect almost realistically, as shown in Figure 9.

With the help of the Unity Profiler panel, the performance consumed by the three schemes is analyzed statistically. Figures 10 and 11 and Table 1 show the differences in CPU usage (Rendering), GPU usage (Camera.Render), and memory usage among the schemes over a period of running. The runtime configuration is shown in Table 2.

By comparison, it can be found that OPDI takes the shortest processing time in both CPU usage (Rendering) and GPU usage (Camera.Render) and also uses the least memory, followed by OEM, with PM last.

Analyzing the data in Table 3, triangles, vertices, and draw calls are all major factors affecting performance. Triangles and vertices represent the numbers of triangles and vertices to be drawn, and excessive per-fragment and per-vertex computation seriously increases GPU usage. Draw calls represent the number of times the CPU submits data to the GPU and then issues rendering commands, so they directly determine CPU usage. Therefore, reducing all three has a significant effect on improving performance. The depth-image-based occlusion scheme proposed in this paper exploits exactly this property to greatly improve performance while ensuring a sense of spatial realism. Moreover, with the proposed scheme, the number of draw calls, triangles, and vertices and the memory required by the scene remain essentially unchanged regardless of the environment; consequently, the more complex the original pure-model scene, the more substantially the performance can be optimized.

4. Conclusion

In this paper, taking the 3-DOF application scenario of indoor simulation as an example, a relatively novel performance-optimization scheme is proposed that preserves spatial realism. A 360-degree panoramic image is used instead of a real 3D environment; the distance of the 360-degree panoramic environment relative to the camera is mapped to a 2D image by spherical coordinate conversion, and bilinear sampling of the aligned image data is used to smooth it; the distance of each voxel of a 3D object relative to the camera is then compared with the corresponding value stored in the 360-degree panoramic depth image, and occluded voxels are eliminated to form the corresponding occlusion effect. Similarly, with the help of a 360-degree panoramic transparency information image, occlusion by glass and other such materials yields a translucent effect. With the proposed method, both picture quality and spatial realism are ensured, and performance is further optimized. According to the experimental results, the proposed scheme is clearly superior to the current mainstream 3D scene optimization schemes in terms of both visual effect and performance. At the same time, the scheme is applicable not only to 360-degree panoramic images but also to 360-degree panoramic video, and it therefore has broader application scenarios; for example, to address the insufficient performance and poor display quality of 360-degree panoramic games with a fixed viewpoint that use high-quality rendering on traditional mobile platforms, 360-degree panoramic video can also be used to achieve some visual effects of viewpoint movement, which is of certain significance for the expansion of virtual reality technology.

Data Availability

The data used in this study are available, and the performance-optimization scheme proposed in this paper can be used in 3-DOF application scenarios. Part of the data is available from the corresponding author upon request ([email protected]).

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the study of this work and publication of this paper.

Acknowledgments

This work was supported by the Digital Media Art Key Laboratory of Sichuan Province, Sichuan Conservatory of Music (Project No. 21DMAKL01); by the first batch of Industry-University Cooperation Collaborative Education Projects funded by the Ministry of Education of the People's Republic of China, 2021 (Project No. 202101071001); by Minjiang University school-level scientific research projects (Project Nos. MYK17021, MYK18033, and MYK21011); and by the Minjiang University Introduced Talents Scientific Research Start-up Fund (Project No. MJY21030).