In recent years, the virtual stage background has become increasingly popular in the stage design of large-scale art evenings as well as in dance performances. It deftly blends the form of dance beauty with the content of dance performance, creating an atmosphere that not only sets the scene but also expands the stage space and enhances the artistic expression of dance. This paper focuses on the design of stage performance art using 3D modeling and MEC (mobile edge computing) technology. A data fusion method for multi-Kinect joints is researched and realized, and 3DMAX is used to complete the modeling and virtual design of a 3D virtual stage scene. The performance scheme is created by analyzing music features with a computer, designing lighting actions, and matching music to lighting actions. Finally, it is shown how the DDLS (distributed deep learning shunting) configuration affects convergence performance and how the model setting affects the diversion scheme.

1. Introduction

With the advancement of computer technology, computer graphics has gradually matured, progressing from static plane images to three-dimensional dynamic models and then to realistic visual, auditory, and tactile virtual worlds [1]. The key to the success of a performance lies in its appeal, and the stage scenery is an important part of it. VR (virtual reality) technology has gained popularity in recent years and is now being used in stage productions. Because traditional stage scenes are built as real sets, which are time-consuming and labor-intensive to change, simulated stage scenes are a more feasible and efficient alternative [2]. Improving the quality of stage programs is unquestionably necessary. It is difficult to break new ground in the creation and performance of traditional programs, but the advent of lighting art has given them considerable leeway: lighting art can inject content and rhythm into the performance, enriching not only traditional music, dance, and other artistic projects but also the artistic conception of the program as a whole. Simulation technology provides a theoretical foundation and technical support for cross-domain collaboration in digital performance.

Immersive VR technology requires additional equipment, such as 3D stereoscopic displays, sensor gloves, and stereo headphones, to deliver multiple stereoscopic sensations of vision, hearing, and touch. This experience can fully inspire the user, and the user can modify the virtual model through operation to achieve an interactive effect [3, 4]. In a mobile communication network, the MEC (mobile edge computing) system is located between the wireless access network and the mobile core network. It distributes mobile traffic from third-party applications using IT equipment such as general-purpose servers or virtualization facilities, provides localized processing services, connects to the Internet, and can also connect to other network cloud services, such as enterprise private networks, via MEC [5]. The application of augmented reality (AR) technology to stage performance is a new field with few studies at home or abroad, and many issues remain to be resolved. As a result, combining joint tracking technology, image processing technology [6, 7], particle systems, and other special-effects technologies to create a stage AR system with a good user experience and strong practicability is a worthwhile research direction.

A stage performance art design mode based on 3D modeling and MEC technology is proposed in this paper, which combines the advantages of the VRP virtual platform to realize the realism of stage scene modeling and provide a vivid, realistic, and visual stage virtual effect for 3D stage scene design. The optimization constraint framework is used to realize the multi-Kinect data acquisition module, and traditional image processing and motion capture technology are combined to power the AR system at this stage.

2. Related Work

AR technology architecture integrates virtual content into the surrounding real world. It does not completely immerse users in a virtual world and does not cut off the connection between users and the real world: users can see the real world while interacting with the virtual content. Gao [8] developed TryLive, an AR fitting-mirror product that allows users to watch the effects of various clothes on their bodies in real time and can be used by manufacturers for marketing. Yang et al. [9] proposed an efficient and robust pose estimation framework that combines local optimization with global retrieval and introduces a variant of the Dijkstra algorithm to extract features from a single depth image stream. Hu et al. [10] proposed a method based on machine learning and object recognition that maps the pose estimation problem onto a simpler pixel-by-pixel classification problem. The studies in [11, 12] simulate stage lighting effects through the OGRE rendering engine to achieve visual editing. Sutherland et al. [13] simulated the stage scenery and designed the stage layout, lighting system arrangement, and other parts with mouse and keyboard, saving manpower, material resources, and financial resources. This simulation system greatly simplifies the stage design process and aids designers, but it does not apply 3D rendering technology to the real performance stage. Raul et al. [14] built multiple Kinect sensors to track the human body and used a Kalman filter and an optimization framework to fuse multiple sets of human skeleton data so that the generated skeleton is highly stable, which alleviates the original problems and promotes the application of Kinect in stage performances.

Junhui et al. [15] proposed a semisupervised DL (deep learning) model to classify received frame/packet patterns and infer the original attributes of traffic in a WiFi network; their model outperformed traditional machine learning techniques. Feiyan et al. [16] proposed a DL-based architecture with long short-term memory to model the spatiotemporal correlation of mobile traffic distribution. Kim and Um [17] designed an edge computing system for health monitoring and treatment and extracted features from mobile sensor data with a CNN (convolutional neural network), which played an important role in their epileptiform localization application. Zaman et al. [18] proposed a deep edge learning framework and demonstrated its superiority in reducing network traffic and running time. Jeong et al. [19] introduced DL into the edge computing environment and designed a new shunting strategy to optimize the performance of Internet of Things DL applications with edge computing. Ansari and Sun [20] designed a deep food identification system based on the service computing paradigm of edge computing, which overcame the high system delay and low battery life of mobile devices inherent in the traditional mobile cloud paradigm.

3. Research Method

3.1. Three-Dimensional Modeling of Performing Arts
3.1.1. Modeling and Virtual Design of Three-Dimensional Virtual Stage Scene

Graphics transformation, realistic graphics generation, and human-computer interaction technology are the main technologies used in 3D computer models. Window viewing area transformation, graphic geometry transformation, and projection transformation are the three main components of graphic transformation technology. Color, shadow, texture, layering, and other elements are necessary for realistic graphics. As a result, blanking technology is used in the creation of realistic graphics to eliminate the layering of the invisible part. Calculating the light and shade effect through the formula can better reflect the texture of the model. Color texture with patterns and geometric texture with concave-convex feeling can better reflect the texture of the model.
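As an illustration of the light-and-shade calculation mentioned above, a common choice is the Lambertian diffuse model; the sketch below assumes a single directional light and illustrative ambient/diffuse coefficients (the actual shading formula is not specified in this paper):

```python
import math

def lambert_shade(normal, light_dir, ambient=0.1, diffuse=0.9):
    """Diffuse shading: intensity = ambient + diffuse * max(0, N.L)."""
    nx, ny, nz = normal
    lx, ly, lz = light_dir
    # Normalize both vectors before taking the dot product.
    nlen = math.sqrt(nx * nx + ny * ny + nz * nz)
    llen = math.sqrt(lx * lx + ly * ly + lz * lz)
    dot = (nx * lx + ny * ly + nz * lz) / (nlen * llen)
    return ambient + diffuse * max(0.0, dot)

# A surface facing the light is brightest; one edge-on gets only ambient light.
print(lambert_shade([0, 0, 1], [0, 0, 1]))  # 1.0
print(lambert_shade([0, 0, 1], [1, 0, 0]))  # 0.1
```

Evaluating such a formula per vertex or per pixel, together with color and bump textures, yields the light-and-shade effect described above.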

Three-dimensional modeling and virtual design are carried out for the stage scene. Figure 1 shows the steps of modeling and designing the stage virtual scene with the 3DMAX software.

The main stage’s scene modeling: The built main stage model and accessory stage model are combined at their true proportions, and necessary adjustments are made to the combined stage model. Lights are added to create the stage effect. The composite model is exported to the VRP editor, and the stage virtual design is completed on the computer. To obtain the complete stage virtual design drawing, a sky box is used to render the stage background, and roles and actions are then added to the drawing via the role and action modules. Forming calculation methods already exist for 3DMAX spline curves and surfaces. However, because these existing methods cannot accurately obtain all of the required data in practice [21], owing to the interference of many uncertain factors, special analysis is performed for special problems.

After the corresponding materials are added in the stage modeling process, the corresponding maps are made so that the whole stage model has a more stereoscopic effect and a more vivid look and feel. Alternatively, patterns can be added by pasting pictures processed in Photoshop onto the surface of the 3D virtual model as texture maps, with the computer as the medium, so as to improve the lifelike effect of the 3D model.

According to the analysis of music theory and example data, the balance of the left and right channels, main volume, average duration, average sound intensity, average pitch, pronunciation area, and interval play a decisive role in determining whether a track is the main melody track, so this paper takes these characteristic parameters as the basis for distinguishing the main and auxiliary tracks.

Let d be the absolute value of the interval between two adjacent notes; the intervals are calculated and counted as follows.

An array C is set to store the interval statistics. The intervals of 0 to 6 degrees in the audio track, that is, the intervals whose pitch difference is 0 to 6, are accumulated to obtain N1; all intervals of the audio track are accumulated to obtain N2; and the proportion of 0-to-6-degree intervals in the track is taken as the interval characteristic quantity, that is, P = N1/N2, where N1 is the number of intervals from 0 to 6 degrees in the track, N2 is the total number of intervals in the track, and C holds the per-degree interval counts of the track.
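This interval characteristic quantity can be sketched as follows; the function name and pitch values are illustrative, with pitches given as MIDI-style note numbers so that adjacent differences are intervals in semitones:

```python
def interval_feature(pitches):
    """Proportion of small intervals (0-6 degrees) among all adjacent-note
    intervals in a track."""
    intervals = [abs(b - a) for a, b in zip(pitches, pitches[1:])]
    small = sum(1 for d in intervals if d <= 6)  # intervals of 0-6 degrees
    total = len(intervals)                       # all intervals in the track
    return small / total if total else 0.0

# A mostly stepwise melody with one large leap: 4 of 5 intervals are small.
print(interval_feature([60, 62, 64, 65, 67, 79]))  # 0.8
```

A main melody track typically moves stepwise, so a high value of this feature supports classifying the track as the main melody.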

A reasonable organizational structure and description of lighting actions is also key to this system. On the basis of sufficient argumentation, this paper describes the light action as a tree structure. As shown in Figure 2, light actions can be divided into dynamic and static light actions according to whether the light projection direction changes: actions caused by changes of the rotation and pitch channels are called dynamic light actions, and actions caused by changes of other channels are called static actions.

It can be divided into single scene action and multiscene action based on the process of light movement. A single scene action is defined as the direct application of light to a single scene; a multiscene action is defined as the application of light to multiple scenes. Single scene action and multiscene action are divided into overall change action and sequential change action based on the lamp group’s change mode. The overall change action refers to the simultaneous replacement of all lamps in a lamp group, whereas the sequential change action refers to the replacement of lamps in a lamp group in a specific order. The quantity and quality of lighting actions in a music lighting performance will have a direct impact on the effect of the music lighting performance. To meet the requirements of music lighting performance, it is necessary to design as many lighting scenes as possible based on the characteristics of lamps and lanterns and their installation layout onsite and then use these lighting scenes to derive more lighting actions.
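The tree of lighting actions described above can be represented by a small data structure; the field names and values below are illustrative, not the system's actual schema:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class LightAction:
    name: str
    kind: str                # "dynamic" (rotation/pitch channels) or "static"
    scope: str               # "single" scene or "multi" scene
    change_mode: str         # "overall" or "sequential" lamp-group change
    scenes: List[str] = field(default_factory=list)

# A sequential multiscene action sweeping the lamp group through two scenes.
chase = LightAction("chase", kind="dynamic", scope="multi",
                    change_mode="sequential", scenes=["opening", "chorus"])
print(chase.kind, chase.change_mode, len(chase.scenes))
```

Deriving many such actions from a library of lighting scenes is what gives a music lighting performance its variety.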

The stage simulation data module stores the extracted stage scene data in the cue format file and outputs it, including the moving target position, moving speed, and moving delay time of the stage module. The output data file is used as the basis for controlling the movement of stage equipment, which provides feasible and accurate design guarantee for realizing holographic stage space-time architecture.
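A cue-format record for one stage module might look like the following; the field names are hypothetical, since the actual cue file layout is not given:

```python
import json

# One cue record: where the module should move, how fast, and after what delay.
cue = {
    "module_id": 3,
    "target_position": [1.5, 0.0, 0.8],  # metres, stage coordinates
    "speed": 0.25,                       # metres per second
    "delay": 2.0,                        # seconds before the move starts
}
line = json.dumps(cue)  # one serialized record of the output data file
print(line)
```

The stage equipment controller would then read such records back and drive each module accordingly.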

The source video image data are determined by the coordinates of the repartitioned area in the source video image, the column value where the stage module is located, the width of the video image data subdivision area, the maximum lift of the stage module, the current motion height of the stage module, and the specifications of the projection screen mounted on the stage module.

3.1.2. Multi-Kinect Joint Data Fusion

The artistic conception of dance is created by bringing the audience into a space of imagination through artistic close-ups that shape the dance's artistic image and blend the visual scene, from which the audience gains spiritual insight. A virtual stage background can make a dance performance more expressive: it not only enhances the atmosphere of the performance but also expands the space of the stage, thus creating a unique artistic conception.

The imagination of dancers and the contrast of the stage background are inextricably linked to the expression of artistic emotion. When the virtual stage background and the dance performance are skillfully combined in a dynamic, interactive way, more artistic languages can be displayed, a more vivid artistic image can be created for the dance, and the audience's emotions can be aroused into resonance. The temporal contrast of the pictures is expertly displayed, allowing the audience to appreciate the dance and gain richer aesthetic feelings. In such dance works, the clever use of the virtual stage background successfully highlights the artistic image in the dance and expresses the dance theme clearly.

Kinect’s joint tracking technology classifies the human body regions by analyzing the depth data of the target human body. Kinect projects infrared rays through an infrared projector and receives them by an infrared camera, and its projection and receiving areas overlap with each other so as to calculate the depth data of any unobstructed object point in this view and generate the depth image stream in real time.

To solve the problem of joint tracking from a single angle of view, the data capture module of this stage AR system is built with multiple Kinects. The effective capture depth of a Kinect (0.5 m∼4.5 m) already meets the needs of most stage plays, so this paper uses two Kinects to expand the width of the effective capture range; the specific number can be increased according to actual conditions.

The Kinect facing the dancer is designated the main camera and the other the secondary camera, as shown in Figure 3.

After the two human skeletons captured by the Kinects are converted into the same coordinate system and averaged, the problem of joint jitter is weakened under certain circumstances and the overall skeleton data become more stable. However, averaging cannot effectively solve the problems of self-occlusion and bone length change: it can only neutralize some of the erroneous data, not eliminate them.

Aiming at the problem of joint mismatch, this system solves it through a constrained optimization framework. Two Kinects use Microsoft Kinect SDK to track skeleton joints and generate two independent sets of human skeleton data for real-time processing. The bone length of the human body is taken as a hard constraint, and the weight between inconsistent joint positions is taken as a soft constraint, so that the optimal position of the relevant nodes can be solved.

Without loss of generality, it is assumed that the two Kinects have been calibrated and that the rotation matrix R and translation vector T between their coordinate systems have been obtained by the calibration method.

For any point P_s in the secondary camera's observation coordinate system, its corresponding point in the main camera's observation coordinate system is P_m = R·P_s + T.
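This rigid transformation can be sketched directly; the rotation and translation values below are illustrative stand-ins for a real calibration result:

```python
# Map a joint position from the secondary Kinect's coordinate system into the
# main camera's coordinate system: p_main = R * p_sec + T.
def transform_point(p, R, T):
    return [sum(R[i][j] * p[j] for j in range(3)) + T[i] for i in range(3)]

R = [[0, -1, 0],
     [1,  0, 0],
     [0,  0, 1]]        # example: 90-degree rotation about the z axis
T = [0.5, 0.0, 0.0]     # example: half-metre offset along x

print(transform_point([1.0, 0.0, 0.0], R, T))  # [0.5, 1.0, 0.0]
```

Applying this to every joint puts both skeletons in a common frame before fusion.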

S1 and S2 denote the human skeletons independently acquired by the two Kinects. From them, the data acquisition module calculates the positions of the related joints and finally obtains the optimal position of the target.

The tracking reliability of each joint is then incorporated into the weight formula: considering the tracking state of the joint points, joints in the "tracked" state are assigned a larger weight w_t and joints in the "inferred" state a smaller weight w_i, and the specific values of the weights are tuned according to the actual effect.
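A minimal sketch of this weighted fusion follows; the weight values and function name are illustrative and would be tuned against the actual effect:

```python
W_TRACKED, W_INFERRED = 1.0, 0.3  # assumed weights for the two tracking states

def fuse_joint(p1, state1, p2, state2):
    """Weighted average of one joint reported by two Kinects, where confidently
    'tracked' observations outweigh occlusion-'inferred' ones."""
    w1 = W_TRACKED if state1 == "tracked" else W_INFERRED
    w2 = W_TRACKED if state2 == "tracked" else W_INFERRED
    return [(w1 * a + w2 * b) / (w1 + w2) for a, b in zip(p1, p2)]

# The tracked observation dominates the inferred (likely occluded) one.
print(fuse_joint([0.0, 1.0, 2.0], "tracked", [0.0, 1.3, 2.0], "inferred"))
```

In the full system this soft weighting is combined with the hard bone-length constraint when solving for the optimal joint positions.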

Dance beauty is the outer shell of the dance art form's beauty, the face of the dance image, and an essential tool for developing the theme, portraying characters, expressing content, shaping the image, enhancing style, and creating artistic conception. The virtual stage background is becoming increasingly modern, scientific, and technological; with the continuous addition of technical means, its forms of expression keep expanding, adding infinite charm to the performance of dance art.

3.2. Multiserver MEC Network Diversion Decision for Stage Performance

Virtual modeling technology has brought a surreal, multidimensional stage space and has also enabled virtual actors and real actors to share the same stage, producing unexpected effects and quietly changing audiences' aesthetic attitude toward traditional stage art. The virtual actor has unique performance strengths that bring art and technology into striking fusion: it can switch between a real body and an animated body at will, complete difficult movements that a real actor cannot, change the size and number of its bodies freely, flexibly control the orientation, distance, and timing of its appearance in the stage space, and vary the virtual and real effects of the body under different brightness.

Virtual modeling technology can also simulate complex, realistic background environments on the stage and combine 3D and 2D scenes as needed, with richer content and faster background changes. The appearance of the virtual actor enables the director to use more artistic expression techniques to deeply interpret and reveal the inner world of the characters in artistic works.

In this paper, only the energy consumption of wireless devices is considered. To jointly evaluate the task completion delay and the energy consumption of wireless devices [22], the reward is expressed as the weighted sum Q = β_t·T + β_e·E, where β_t and β_e are two scalar weights for delay and energy consumption, respectively.
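The weighted-sum reward can be sketched as a one-line function; the parameter names and values are illustrative:

```python
def reward(delay, energy, beta_t=0.5, beta_e=0.5):
    """Joint delay/energy cost of one diversion decision (lower is better)."""
    return beta_t * delay + beta_e * energy

# Equal weights: a task taking 0.8 s and consuming 0.4 J costs 0.6.
print(reward(delay=0.8, energy=0.4))
```

Shifting the two weights trades task completion delay against device energy consumption, which is the sweep studied in the results section.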

This process continues as the time index t increases. Our goal is to design a strategy π that generates a diversion action x_t for each system state s_t so as to minimize the expected reward E[Q(s_t, x_t)].

Generally speaking, this problem resembles a multiarmed bandit whose arms are the candidate binary diversion actions. In the field of reinforcement learning [23], because the reward function Q can be evaluated directly, the problem is sometimes regarded as an ordinary one: given the system state s_t, we can always choose the action with the lowest reward value. However, exhaustively searching an action space whose size grows exponentially with the number of tasks is very time-consuming.

A DDLS (distributed deep learning shunting) algorithm is adopted to approximately minimize the proposed reward expectation. The DDLS algorithm uses a batch of PDNNs (parallel deep neural networks), generates a binary shunting action from each PDNN in parallel, and selects the action with the lowest reward as the output action.

DDLS learns from past experience to generate better shunting actions: gradient descent is used to optimize the parameters of each PDNN so as to minimize the cross-entropy loss between the network's output and the selected best action.
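The selection step and the cross-entropy loss can be sketched as follows; the candidate generator here is a random stand-in for the PDNN outputs, and all names and the toy reward are illustrative:

```python
import math
import random

def binary_cross_entropy(pred, target, eps=1e-12):
    """Loss between a network's sigmoid outputs and the chosen binary action."""
    return -sum(t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
                for p, t in zip(pred, target)) / len(pred)

def select_action(candidates, reward_fn):
    """Keep the candidate binary shunting action with the lowest reward."""
    return min(candidates, key=reward_fn)

random.seed(0)
n_tasks, n_nets = 4, 3
candidates = [[random.randint(0, 1) for _ in range(n_tasks)]
              for _ in range(n_nets)]           # one action per parallel network
best = select_action(candidates, reward_fn=sum)  # toy reward: fewer offloads
print(best)
print(binary_cross_entropy([0.9, 0.1], [1, 0]))
```

In the real algorithm the reward is the delay/energy weighted sum, and the selected action is stored in a replay memory from which each PDNN is trained by gradient descent.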

In this paper, heterogeneous DDLS is further considered, in which the hidden layers of the PDNNs differ from one another.

The most distinguishing feature of a virtual stage background is interactivity, which allows users to better control and use information while also improving comprehension. A stage multimedia communication system can be formed using virtual stage background technology, and the receiver can ask the sender to transmit some required information at any time, and the information can then be reproduced on the stage or integrated into the performance in real time, providing the audience with great satisfaction and integration.

4. Results Analysis and Discussion

4.1. Convergence of Heterogeneous DDLS

In the past, dancers performed on a designed stage, dancing back and forth between the background and the props. With a virtual stage background, dancers can perform freely with virtual scenes and props and even perform with virtual characters. These virtual images combine seamlessly with the real actors on stage, which reflects the interactivity of the virtual stage background.

The global optimal strategy is obtained by enumerating all combinations of binary diversion strategies, and the ratio between the global optimal result and the heterogeneous DDLS prediction result is plotted.

In Figure 4, the convergence performance of the heterogeneous DDLS algorithm and DDLS algorithm is compared. Generally speaking, heterogeneous DDLS converges faster than DDLS, and the generated diversion strategy is better. Intuitively, because of different PDNN structures, heterogeneous DDLS has a higher degree of exploration.

The unique artistry of the virtual stage background is transmitted through the characteristics of the video images. The essence of the art symbol of film and television images determines its basic characteristics. The artistic features of film and television images and the lens language features are also applicable to the virtual stage background.

When using these virtual images to create, it is also inevitable to use these artistic rules. Only in this way can our dance works show the mixed aesthetic feeling of various artistic features and bring unparalleled visual experience to the audience. This feature enables dance creators to throw off the shackles and create more freely.

4.2. Performance of Different Diversion Strategies

In Figures 5 and 6, the reward performance of different strategies under different weights is studied. With regard to the weighted sum of energy consumption and delay, four other representative benchmarks were also evaluated: all-edge processing, all-cloud processing, all-local processing, and random allocation (diversion decisions generated at random).

As the delay scalar and energy scalar increase, the reward values of all strategies increase. The local processing strategy produces the largest (worst) reward, while DDLS and heterogeneous DDLS outperform the other diversion strategies.

4.3. Influence of Different MEC Network Structures

In Figure 7, the performance of different strategies under different numbers of wireless devices is studied. With the increase of the number of wireless devices, the total reward of edge processing strategy grows faster than other diversion strategies.

In Figure 8, the performance of different strategies under different numbers of tasks is studied. As the number of tasks increases, the total reward of the edge processing strategy grows faster and faster, because when an edge server processes multiple tasks at the same time, its processing unit is shared among all of them. DDLS and heterogeneous DDLS remain superior to the other shunting strategies.

In Figure 9, the performance of different strategies under different numbers of edge servers is studied. The local processing strategy does not change with the number of edge servers. With more processing resources and closer proximity to wireless devices, the rewards of the other strategies gradually decrease as the number of edge servers increases.

In Figure 10, the CPU calculation time of the heterogeneous DDLS algorithm and the LR-based algorithm under different numbers of wireless devices is compared.

The realization of the virtual stage background is completely based on the development of modern science and technology. It is the rapid development of modern science and technology that makes the existence of virtual stage background possible, and it also has artistic and technical means that other stage arts do not have.

4.4. Scene Design Fidelity Contrast

To verify the effect of designing 3D virtual stage scene with this method, a stage landscape is designed with this method. In order to highlight the quality of the virtual stage scene designed by this method, the virtual stage design method based on Vega and the virtual stage design method based on Web3D are compared with this method. In the experiment, three methods were used to simulate the five submodels of the stage model, respectively. Figure 11 shows the contrast of the fidelity of the virtual stage scene designed by different methods.

Figure 11 shows that the fidelity of each submodel designed by this method is not less than 90%, which is stable. Relatively speaking, the 3D virtual stage scene designed by this method is more realistic.

The virtual stage background created by modern science and technology is presented on stage scene by scene like a movie, making the stage space broad and flexible, the picture natural and smooth, and the work's content appropriate and coherent. This demonstrates the film-and-television character of the virtual stage background and breaks the space limitation of the traditional stage background. If virtual stage technology is used to create a virtual dancer who dances with real performers, this arrangement not only lets the space change arbitrarily and the scenery come alive but also puts the creator and audience in the same physical space, allowing the creator to focus on the creation and display of dance art while the audience gains a more unique visual experience.

5. Conclusion

The stage performance art design mode and its concrete realization based on 3D modeling and MEC technology are discussed in this paper. The system can present the stage design effect at each stage dynamically and visually, giving the designer the most intuitive feeling. The stage model built in 3DMAX is imported into the VRP editor, and the virtual design of the stage scene is completed. The system's modules are designed according to the requirements, and a stage augmented reality system based on multiple Kinects is realized, with the functions of each module explained in detail. A semantic network is used to express the light actions, reflecting their structure and hierarchy, and the lamp information database, scene information database, and action model database are established. Numerical simulations show that both the DDLS and heterogeneous DDLS algorithms are effective and outperform the benchmark diversion strategies. The simulations also demonstrate the impact of the DDLS configuration on convergence performance and the impact of the model setting on the diversion scheme.

Although the fusion of data from multiple Kinects is straightforward in theory, it faces several challenges in practical development and application, such as reduced algorithm efficiency as the number of Kinects grows, data redundancy across Kinects, Kinect arrangement, the strategy for choosing an effective number of devices, and so on.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.