In the era of e-commerce, online clothing sales have grown rapidly, but differences in clothing size and style details cause serious problems for consumers. The most prominent problem in online clothing sales is that consumers cannot try on clothes as they do in a physical store, let alone touch and feel the texture of the fabric, which has seriously restricted the development of online clothing sales. After-sales service also lags behind, mainly in that returns and exchanges are inconvenient and purchased clothing is not delivered in time. 3D garment customization is currently a hot research area, and there is wide demand for software that can customize garments for individual users. 3D somatosensory technology, as a depth sensing technology, strengthens the ability of video sensors to recognize targets and brings new advances in sensing and processing depth vision tasks. To address these problems, this paper proposes a modern clothing design scheme based on human 3D somatosensory technology. The scheme focuses on the acquisition of human size data and the personalized combination of clothing components during clothing design. In scenes where the human body changes greatly or moves rapidly, the scheme achieves higher real-time limb coverage of the clothing fabric and a higher degree of posture matching. It supports a richer range of clothing types and human poses, with a stronger sense of reality. The clothing perception model, which fuses profile and girth features, can match the dimensional changes of key parts of the human body. Through personalized clothing design, the combination of clothing parts provides more choices of clothing styles, colors, and sizes. Using this solution, the time needed for clothing design can be greatly shortened, and the user’s satisfaction with the clothing design can be improved.

1. Introduction

In recent years, with the rapid development of Internet technology and the transformation of public consumption concepts and shopping methods, the e-commerce industry has grown rapidly and gradually stabilized, and apparel is one of the most popular product categories in Internet retail [1]. The “2016 Global Apparel B2C E-commerce Development Status” report released by the Hamburg-based market research company yStats pointed out that, in the huge e-commerce market, apparel and consumer electronics accounted for the largest share of sales. The “2015-2016 China Apparel E-commerce Industry Report” released by the China E-Commerce Research Center shows that the total transaction volume of textile and apparel e-commerce in 2015 reached 3.71 trillion yuan, an increase of more than 25% compared with 2014, and that the total transaction volume in 2016 approached 5 trillion yuan, maintaining rapid growth.

With the advent of the global economic era, the garment industry has gradually developed and expanded. The function of clothing has evolved from protection against cold, warmth, and shelter to decoration, identity, and aesthetics, showing the beauty of clothing and meeting people’s need for spiritual enjoyment. When choosing clothing, consumers are no longer satisfied with the same garment offered in different colors, fabrics, textile structures, and supporting accessories; they also hope to match clothes to their own body shape and to customize the clothing style according to their own wishes [2]. When users are not satisfied with the goods they buy, they frequently return and exchange them, which directly degrades the shopping experience, has a cumulative and long-term impact, and can even trigger a word-of-mouth effect that harms the development of merchants, platforms, and related industries [3]. According to Vipshop statistics, the return rate of clothing is as high as 25 to 30 percent. The main reasons are wrong size, poor wearing experience, and dislike of garment details.

Based on this background, this paper proposes a modern clothing design method based on human 3D motion sensing technology. From the user’s point of view, this can achieve the “tailored” function of different human body types and design the clothing styles according to the user’s expectations. Specifically, in order to realize the parameterization of the mannequin, the corresponding three-dimensional mannequin is generated in real time according to the body shape characteristics of users, so as to recommend the appropriate size and facilitate users to buy the desired clothes.

2. State of the Art

The modeling of 3D data and virtual fitting technology based on 3D somatosensory data are inseparable. First, the 3D point cloud data of the human body is obtained through a camera capable of point cloud collection. A depth image is an image whose pixel values are the distances (depths) from the image grabber to each point in the scene, so it directly reflects the geometry of the visible surfaces of the scene. A depth image can be converted into point cloud data by coordinate transformation, and regular point cloud data with the necessary information can likewise be converted back into a depth image. Next, 3D virtual modeling of the human body is performed. Then, the different parts of the clothing design are combined to design the clothing as a whole, and the designed clothing is tried on the user’s virtual model [4]. Finally, according to the try-on effect on the user’s virtual model, the texture, parts, color, and other attributes of the designed clothing are adjusted, and virtual fitting continues until a satisfactory effect is achieved. The overall clothing design process based on human 3D somatosensory data is shown in Figure 1.
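The conversion from a depth image to point cloud data can be illustrated with a minimal sketch. It assumes a pinhole camera model; the function name `depth_to_point_cloud`, the intrinsic parameters `fx`, `fy`, `cx`, `cy`, and the tiny test image are hypothetical and are not taken from the system described in this paper.

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth image into an N x 3 point cloud.

    Assumed pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy.
    Pixels with depth <= 0 are treated as invalid and dropped.
    """
    depth = np.asarray(depth, dtype=float)
    rows, cols = depth.shape
    # u is the column index and v the row index of every pixel
    u, v = np.meshgrid(np.arange(cols), np.arange(rows))
    valid = depth > 0
    z = depth[valid]
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    return np.column_stack((x, y, z))

# Hypothetical 2 x 2 depth image (one invalid pixel) and intrinsics
depth = np.array([[1.0, 0.0],
                  [2.0, 2.0]])
cloud = depth_to_point_cloud(depth, fx=1.0, fy=1.0, cx=0.0, cy=0.0)
```

With real sensor data, the intrinsics would come from the camera calibration rather than from the placeholder values used here.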

2.1. 3D Human Body Modeling Technology

The main methods of 3D human modeling technology are 3D modeling software modeling, image recognition method modeling, and 3D scanning equipment modeling [5].

3D modeling software refers to the construction of a model with 3D data in a three-dimensional virtual space; 3D modeling is the core of 3D printing technology and a core topic in 3D education. Software modeling has the advantages of good modeling effect, strong sense of reality, and accurate data, but the production cycle is long, the implementation is difficult, and professional technical personnel are required. Image recognition modeling is divided into two methods: image rerendering and image-based modeling. The rerendering method only shows the rendering of the human body from different angles as illumination and position change and does not realize a three-dimensional reconstruction of the human body. In image-based modeling, the coordinates of the final 3D object are calculated from the correspondence between extracted feature points [6]. This method requires small differences between pictures and is therefore highly restrictive. Image-based 3D modeling is convenient and fast, but it is often accompanied by problems in camera positioning and tracking, calibration, image segmentation, and matching, so its efficiency and robustness cannot meet existing requirements. The core idea of 3D scanning equipment modeling is to obtain the surface of the scanned object from scanned point cloud data or RGB-D data through registration, surface reconstruction, and other techniques. A compact and portable depth camera can capture the color and depth information of the scanned object surface at high frequency. By actively emitting and receiving infrared light, it obtains depth data directly and provides a better interactive interface and strong practicability [7].

Kinect Fusion is a three-dimensional object scanning and modeling method provided by Microsoft to Kinect developers. Its implementation consists of four main steps: depth data extraction, data calibration, data fusion, and real-time rendering. It captures data directly through the camera, requires no user interaction, runs in real time (GPU implementation), and makes 3D model reconstruction easy. However, because the algorithm depends on depth changes in the scene, it places high demands on user movement, and it is difficult to return exactly to the starting position, so the reconstructed 3D model is often not closed; in addition, it considers only rigid calibration. Many researchers have therefore worked on improving the Kinect Fusion algorithm in recent years. To address the problem of unclosed 3D models, Tong et al. proposed a 3D human reconstruction method based on multiple Kinects and used a turntable to calibrate the data scanned by three Kinects, obtaining complete human body point cloud data and realizing an automatic scanning process. Weiss and Hirshberg used a single Kinect to build a parametric human body model based on the SCAPE method, but the modeling process takes about an hour, which is far from rapid modeling. Zollhofer and Martinek established the correspondence between depth images and color images to achieve high-quality face models. Weise and Bouaziz used Kinect as an interactive device to build a system for capturing and tracking facial expressions in real time. Because the distance between the subject and the Kinect varies, the acquired depth data contains varying amounts of noise, which reduces the accuracy of 3D reconstruction. Common noise reduction algorithms include Gaussian filtering, Laplace smoothing, Wiener noise reduction, and bilateral filtering.

With the rapid development of computer information technology and mechanical manufacturing technology, 3D scanning technology is widely used in different industries. It can acquire a large amount of spatial point information of the target in a short period of time, establish a 3D model of the target, and extract line, surface, and body data. Based on the principle of light reflection, it obtains massive 3D point cloud data on the surface of static objects and applies high-precision reverse 3D modeling and reconstruction. By synchronously acquiring 3D coordinate data and digital photos of the target range, the 3D information of an object or real scene is obtained quickly, and the computer reconstructs a 3D data model that reproduces the real, changing morphological characteristics of objective things. This enables noncontact measurement and provides a new tool for rapid object modeling and spatial change analysis.

2.2. Virtual Clothing Fitting

In the early days, virtual “fitting mirror” systems based on two-dimensional texture maps attracted the attention of consumers; clothing and accessories were represented by two-dimensional maps [8]. The technical principle is simple and easy to implement, and the interaction is intuitive and engaging, but there are several shortcomings: (1) The standard model used in the fitting mirror is far from the user’s body shape, so it is impossible to judge whether the clothes fit. (2) The clothing is usually displayed in a fixed posture, and real-time dynamic effects cannot be presented according to the user’s movement. (3) Clothes appear nearly flat, and special decorations (such as ribbons and pleats) are difficult to display. (4) The processing of clothing photos requires significant human resources and operating costs.

In recent years, with the improvement of computer system performance, many companies have launched 3D fitting software [9]. Such software displays the user’s 3D image data directly on the computer, uses fully automatic and interactive 3D image segmentation to obtain the target range quickly, and creates a three-dimensional model in just a few simple steps. The MIRACloth system developed by MIRALab at the University of Geneva, Switzerland, and the C-Me and V-Stitcher systems of the American company Browzwear are relatively mature. Domestic three-dimensional fitting software includes Hexuan C2pop, the software of Shanghai Quda Company (the Quda website deforms the model to the corresponding body shape according to the input body size), and that of the Shanghai fitting company Haomaiyi (the Haomaiyi website shows the same human body wearing different sizes of clothes), as shown in Figure 2.

3. Methodology

3.1. 3D Somatosensory Technology

3D motion sensing is a depth sensing technology that allows people to interact with surrounding devices or environments directly through body movements, without any complex control devices. It enhances the ability of video sensors to recognize objects and brings new advances in perceiving and processing depth vision tasks [10]. The current mainstream 3D vision solutions include the binocular stereo vision method and the time-of-flight method. These technologies simulate the human vision system and promote technological development in the fields of AI and computing.

3.1.1. Binocular Stereo Vision

Binocular stereo vision is an important form of machine vision. Based on the principle of disparity, it uses imaging equipment to acquire two images of the measured object from different positions and obtains the 3D geometric information of the object by calculating the positional deviation between corresponding points in the two images. Because the cameras have different viewpoints, the same point in three-dimensional space is imaged at different coordinates in the two cameras; this coordinate difference is the so-called parallax. The closer the object is to the imaging plane, the larger the parallax, and vice versa [11]. Three-dimensional images can be obtained through triangulation, which is similar to the imaging principle of human eyes. The grayscale information of the image can also be used to encode distance: the closer the spatial point is to the imaging plane, the brighter the grayscale value. The schematic diagram of triangulation is shown in Figure 3, and the schematic diagram of binocular stereo vision is shown in Figure 4.

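According to the law of similar triangles, with $f$ the camera focal length, $B$ the camera baseline, $x_l$ and $x_r$ the horizontal image coordinates of the same spatial point in the left and right images, and $Z$ the depth of the point, we have

$$\frac{B - (x_l - x_r)}{B} = \frac{Z - f}{Z}$$

Solving the equation above, we get

$$Z = \frac{fB}{x_l - x_r}$$

Further simplification yields

$$Z = \frac{fB}{d}, \qquad d = x_l - x_r$$

According to the above formula, to solve the depth value $Z$, it is necessary to determine the camera focal length $f$ and the camera baseline $B$ and also to know the parallax $d$ between the corresponding points in the two camera images.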
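The dependence of depth on focal length, baseline, and parallax can be checked numerically. The sketch below assumes the standard rectified-stereo relation, depth = focal length × baseline / parallax; the function name `depth_from_disparity` and all numeric values are hypothetical and serve only as an illustration.

```python
def depth_from_disparity(focal_px, baseline_m, x_left, x_right):
    """Depth of a point from its horizontal coordinates in two rectified images.

    focal_px: focal length in pixels; baseline_m: camera baseline in metres.
    The parallax d = x_left - x_right must be positive for a point in front
    of the cameras.
    """
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("parallax must be positive")
    return focal_px * baseline_m / disparity

# Hypothetical camera: f = 600 px, B = 0.25 m
z_near = depth_from_disparity(600.0, 0.25, x_left=320.0, x_right=290.0)  # d = 30 px
z_far = depth_from_disparity(600.0, 0.25, x_left=320.0, x_right=310.0)   # d = 10 px
```

The larger parallax yields the smaller depth, which matches the statement that objects closer to the imaging plane show a larger parallax.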

3.1.2. Principle of 3D Registration Based on SVD Decomposition

Three-dimensional registration solves the problem of mapping coordinates from one three-dimensional coordinate system to another. Usually, the singular value decomposition (SVD) method is used for the matching calculation of three-dimensional point sets. Assume that there are two three-dimensional point sets {Pi} and {Qj}, each with $n$ points; if their data points correspond one-to-one, so that $Q_i$ is the point matched to $P_i$, the following conversion relationship holds:

$$Q_i = R P_i + T + N_i$$

where $R$ is the rotation matrix, $T$ is the translation vector, and $N_i$ is the noise vector, indicating that the data points in the point set {Pi} can reach the positions of the corresponding data points in the point set {Qj} through rotation and translation. The rotation matrix $R$ and the translation vector $T$ are unknown and need to be solved by the SVD method. First, find the centroids of the two 3D point sets:

$$\bar{P} = \frac{1}{n}\sum_{i=1}^{n} P_i, \qquad \bar{Q} = \frac{1}{n}\sum_{i=1}^{n} Q_i$$

Calculate the correlation matrix of the two point sets using the centroid displacement vectors:

$$H = \sum_{i=1}^{n} (P_i - \bar{P})(Q_i - \bar{Q})^{T}$$

$H$ (the correlation matrix) represents the correlation of the coordinates of the two point sets. Perform SVD on the matrix $H$:

$$H = U \Lambda V^{T}$$

In the decomposition, $U$ and $V$ are orthogonal matrices and $\Lambda$ is the diagonal matrix of singular values. For the matrix $R$, the optimal solution is

$$R = V U^{T}$$

Verify that the optimal solution is valid: if $\det(R) = +1$, the matrix $R$ is calculated correctly; otherwise, if $\det(R) = -1$, the calculation is invalid. When $R$ is calculated correctly, the translation vector $T$ can be calculated from $R$:

$$T = \bar{Q} - R \bar{P}$$
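The SVD-based registration procedure can be sketched in a few lines of NumPy. This is a minimal illustration of the standard centroid, correlation matrix, SVD, and determinant-check sequence, not the implementation used in this paper; the function name `rigid_register` and the synthetic test points are hypothetical.

```python
import numpy as np

def rigid_register(P, Q):
    """Estimate rotation R and translation T with Q_i ≈ R @ P_i + T.

    P and Q are n x 3 arrays of one-to-one corresponding points.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    p_bar = P.mean(axis=0)                 # centroid of {P_i}
    q_bar = Q.mean(axis=0)                 # centroid of {Q_i}
    H = (P - p_bar).T @ (Q - q_bar)        # 3 x 3 correlation matrix
    U, _, Vt = np.linalg.svd(H)            # H = U diag(S) Vt
    R = Vt.T @ U.T                         # candidate rotation V U^T
    if np.linalg.det(R) < 0:
        # det(R) = -1 means a reflection: the solution is treated as invalid
        raise ValueError("degenerate solution: det(R) = -1")
    T = q_bar - R @ p_bar
    return R, T

# Synthetic check: rotate hypothetical points by 90 degrees about z, then shift
angle = np.pi / 2
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0,            0.0,           1.0]])
T_true = np.array([0.5, -1.0, 2.0])
P = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]], dtype=float)
Q = P @ R_true.T + T_true
R_est, T_est = rigid_register(P, Q)
```

On noise-free, non-coplanar points such as these, the estimate recovers the transformation up to floating-point error; with noisy sensor data it gives the least-squares fit.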

The conversion diagram of two three-dimensional point sets {Pi} and {Qj} is shown in Figure 4:

3.2. Human 3D Reconstruction

Scanning reconstruction of the 3D human body is one of the hot topics of research. The quality of human body model parameterization is directly related to whether the clothing can be properly tried on the human body and is also a key factor when consumers consider whether to buy the clothing [12].

The 3D human body reconstruction method used here is fast and requires no user interaction. The corresponding human body template is generated by the MakeHuman body modeling software, and a closed and reliable 3D human body model is formed by template matching and an interpolation algorithm [13]. The main steps of the algorithm are as follows:

3.2.1. Extract Human Depth Data

Use the cost-effective Kinect somatosensory device to extract the user’s depth data, convert it into point cloud data, perform noise reduction processing on the point cloud, and calculate the normal direction of all point clouds. In addition, the depth data and infrared data are analyzed to obtain skeleton information.

Kinect is a 3D somatosensory camera developed by Microsoft, including infrared transmitters, color sensors, infrared depth sensors, and other parts, with functions such as motion capture, image recognition, microphone input, and voice recognition. Infrared data can be obtained by the infrared transmitter and the camera, and the infrared data can be calculated and processed to obtain the depth-of-field image data.

Kinect’s joint tracking system allows users to track the joint positions of a person’s skeleton, which is often used for gesture detection, user interface operation, and many other functions. Its main principle is to analyze the depth data and infrared data obtained by Kinect’s depth camera and infrared camera to obtain a human body model and then to extract the human skeleton using a maximum a posteriori probability analysis [14].

As shown in Figure 5, the Kinect skeleton consists of 23 joints and a total of 22 bones. The left side shows the details of the joints in the skeleton, and the right side shows the effect of real-time capture, all omitting the joint points showing the fingertips. Table 1 shows the numbering comparison of Kinect bone points.

3.2.2. Data Preprocessing

Due to the inconsistency of coordinate systems, occlusion by surrounding objects, and interference from noise points, the captured depth data and skeleton data need to be preprocessed to ensure their accuracy and precision and to facilitate subsequent use. Noise in the depth data is removed with a bilateral filtering algorithm; the pose (position and orientation) in the world coordinate system is calculated and calibrated with the iterative closest point method; and a smoothing filter is applied to the Kinect skeleton [15].
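As an illustration of the depth denoising step, the following is a minimal, unoptimized sketch of a bilateral filter applied to a depth image. The function name `bilateral_filter_depth`, the parameter values, and the small test image are hypothetical; a production system would more likely rely on an optimized library routine.

```python
import numpy as np

def bilateral_filter_depth(depth, radius=1, sigma_space=1.0, sigma_range=0.05):
    """Edge-preserving smoothing of a depth image.

    Each output pixel is a weighted mean of its neighbours; the weight falls
    off with spatial distance (sigma_space) and with depth difference
    (sigma_range), so pixels across a depth edge contribute almost nothing.
    """
    depth = np.asarray(depth, dtype=float)
    rows, cols = depth.shape
    out = np.empty_like(depth)
    for r in range(rows):
        for c in range(cols):
            r0, r1 = max(r - radius, 0), min(r + radius + 1, rows)
            c0, c1 = max(c - radius, 0), min(c + radius + 1, cols)
            window = depth[r0:r1, c0:c1]
            rr, cc = np.mgrid[r0:r1, c0:c1]
            w_space = np.exp(-((rr - r) ** 2 + (cc - c) ** 2) / (2 * sigma_space ** 2))
            w_range = np.exp(-((window - depth[r, c]) ** 2) / (2 * sigma_range ** 2))
            weights = w_space * w_range
            out[r, c] = np.sum(weights * window) / np.sum(weights)
    return out

# Hypothetical noisy depth image with a step edge between 1 m and 2 m
depth = np.array([[1.00, 1.02, 2.00, 2.00],
                  [1.00, 0.98, 2.00, 2.02],
                  [1.00, 1.00, 2.00, 1.98]])
smoothed = bilateral_filter_depth(depth)
```

In this example the small fluctuations on each side of the step are reduced while the step itself is kept, which is the property that makes the filter suitable for body silhouettes in depth data.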

3.2.3. Feature Extraction

To improve the accuracy and speed of the system, the human body is first divided into blocks; then the corresponding feature points, feature lines, and contour lines are extracted from the human body data. The main feature points are the top-of-head point, laryngeal prominence point, acromial point, axillary point, elbow point, umbilical point, hip point, and knee point. Each feature contour line is associated with a feature point and is obtained as the cross section through that point parallel to one of the coordinate planes. Finally, the circumferences of the human body (waist, hip, bust, etc.) and the height are calculated.
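One simple way to estimate a circumference from point cloud data is to take a thin horizontal slice at the height of a feature point and measure the perimeter of the resulting ring of points. The sketch below makes several assumptions that are not stated in this paper: the vertical axis is y, the slice belongs to a single body part, and the cross section is roughly star-shaped around its centroid. The function name `girth_at_height` and the synthetic data are hypothetical.

```python
import math

def girth_at_height(points, height, tolerance=0.01):
    """Approximate the circumference of the slice of a point cloud at `height`.

    points: iterable of (x, y, z) tuples with y as the vertical axis (assumed).
    The slice points are sorted by polar angle around their centroid and the
    lengths of the closed polygon they form are summed.
    """
    ring = [(x, z) for x, y, z in points if abs(y - height) <= tolerance]
    if len(ring) < 3:
        raise ValueError("not enough points in the slice")
    cx = sum(p[0] for p in ring) / len(ring)
    cz = sum(p[1] for p in ring) / len(ring)
    ring.sort(key=lambda p: math.atan2(p[1] - cz, p[0] - cx))
    return sum(math.dist(ring[i], ring[(i + 1) % len(ring)])
               for i in range(len(ring)))

# Synthetic waist: a circle of radius 0.15 m at height 1.0 m
radius = 0.15
points = [(radius * math.cos(2 * math.pi * k / 360), 1.0,
           radius * math.sin(2 * math.pi * k / 360)) for k in range(360)]
points += [(0.3, 1.2, 0.0), (0.0, 0.8, 0.3)]  # points at other heights are ignored
waist = girth_at_height(points, height=1.0)
```

For the synthetic circle the result is close to the analytic circumference 2πr. A Kinect sees only one side of the body, so a measured slice is partial; this is presumably why a calibration model is applied to the measured sizes.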

After the human body model is obtained, it is divided into blocks according to the linear relationship between body proportions and height given in Table 2 and simplified according to its degrees of freedom, yielding five parts: upper body, left leg, right leg, left hand, and right hand [16]. Figure 6 shows the block rendering and also shows that the block results do not depend on the pose of the human body. The segmentation of the human body model is based on the position of the skeleton and the biological characteristics of the human body [17]. First, the human depth data and the skeleton are automatically aligned using the position of Kinect’s field of view; then the corresponding body parts are divided according to the positions of the joints in the skeleton. The hip center point and a horizontal scan line separate the upper body from the lower body. Arms and legs are separated with the help of the axillary and crotch feature points, which are themselves identified by horizontal scan lines. The left and right hands are identified with the help of the shoulder and armpit feature points; the left and right legs are identified with the help of the crotch and hip feature points.

3.2.4. Human Model Reconstruction

The 3D human body model is reconstructed from the matched feature data: the 3D human body model corresponding to the measured body size is retrieved from the human body model database [18].

3.3. Personalized Clothing Design Based on 3D Somatosensory Technology

In real life, user-designed products are becoming more and more mainstream, gradually replacing mass-produced ones. When buying clothing, consumers consider not only a comfortable and attractive wearing experience but also the possibility of “colliding shirts,” that is, wearing the same garment as someone else. At this stage, more and more people no longer pursue “star-like” products but care more about the personalization of clothing and their own feelings about clothing design [19]. Unlike other fashions, users prefer to change the style details of clothing according to their own preferences.

In order to facilitate modern clothing design for users, this paper proposes a method for the personalized design and combination of clothing. The specific implementation process is as follows:
(1) Design stage: the designer designs the clothing style (hand-drawn drawing), and the pattern maker turns the hand-drawn clothing style into a clothing model; at the same time, the artist adjusts the position of the two-dimensional pattern and sets the stitching lines on the border of the pattern.
(2) Virtual fitting realization stage: the two-dimensional pattern is automatically triangulated to form a uniform and unique triangular mesh; the physical simulation of cloth is used to realize the virtual fitting effect of the three-dimensional clothing, and the three-dimensional clothing is rendered with high fidelity.
(3) Parts assembly stage: all pattern data of the parts are read quickly, and the initial position of each pattern is set automatically; rapid triangulation during part assembly reduces the total assembly time; the stitching information and simulation results between each part and the main pattern are saved, and a component library is established so that users can quickly view the simulation results.

In order to turn a 2D pattern into 3D clothing, physical simulation must be performed on the triangulated pattern so that the fabric conforms to the laws of kinematics and displays a dynamic virtual fitting effect. When an explicit integration framework is employed, the force-based approach suffers from overstretching, whereas position-based dynamics (PBD), the position constraint-based approach to the physical simulation of cloth, does not [20]. This is because the PBD method, unlike earlier explicit or implicit methods, directly manipulates the positions of the particles. When collisions are handled, the constraints are satisfied directly and no overstretching occurs. In addition, the method is fast, stable, easy to implement, and highly controllable.

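In the PBD method, the key point in moving particles is to ensure the conservation of linear and angular momentum. The gradient of a constraint is perpendicular to the rigid body modes of the object, since it is the direction of maximum change of the constraint. Linear and angular momentum are automatically conserved if the correction $\Delta p$ of the vertex positions $p$ is chosen to be along the gradient $\nabla_p C(p)$ of the constraint function $C$:

$$C(p + \Delta p) \approx C(p) + \nabla_p C(p) \cdot \Delta p = 0$$

$$\Delta p = \lambda \nabla_p C(p)$$

Combining the above two formulas, the following formula is obtained:

$$\Delta p = -\frac{C(p)}{\left|\nabla_p C(p)\right|^{2}} \nabla_p C(p)$$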

PBD supports many types of constraints, such as distance, bending, triangle collision, and volume constraints. This system mainly uses distance and bending constraints.
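The distance constraint can be illustrated with a minimal sketch of a single projection step in the standard PBD formulation. The function name `project_distance_constraint` and the example values are hypothetical, and the code is not the implementation used in this system; inverse masses are used as weights so that a particle with weight 0 stays pinned.

```python
import math

def project_distance_constraint(p1, p2, w1, w2, rest_length):
    """One PBD projection of C(p1, p2) = |p1 - p2| - rest_length = 0.

    p1, p2: particle positions as sequences of floats.
    w1, w2: inverse masses (0 pins the particle in place).
    Returns the corrected positions as two lists.
    """
    delta = [a - b for a, b in zip(p1, p2)]
    length = math.sqrt(sum(d * d for d in delta))
    if length == 0.0 or w1 + w2 == 0.0:
        return list(p1), list(p2)          # nothing sensible to correct
    # The corrections act along the constraint gradient (the line joining the
    # two particles), weighted by inverse mass.
    scale = (length - rest_length) / (length * (w1 + w2))
    new_p1 = [a - w1 * scale * d for a, d in zip(p1, delta)]
    new_p2 = [b + w2 * scale * d for b, d in zip(p2, delta)]
    return new_p1, new_p2

# Two equal-mass particles stretched to twice the rest length
q1, q2 = project_distance_constraint([0.0, 0.0, 0.0], [2.0, 0.0, 0.0],
                                     w1=1.0, w2=1.0, rest_length=1.0)

# Same edge with the first particle pinned (inverse mass 0)
r1, r2 = project_distance_constraint([0.0, 0.0, 0.0], [2.0, 0.0, 0.0],
                                     w1=0.0, w2=1.0, rest_length=1.0)
```

In the equal-mass case both particles move by the same amount toward each other, so their centre of mass is unchanged, which reflects the momentum conservation property discussed above. In a full cloth solver, this projection would be repeated over all mesh edges for several iterations per time step.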

4. Result Analysis and Discussion

In order to test the performance of the clothing design method based on human 3D somatosensory technology, the following test procedure is designed. First, the Kinect somatosensory device is used as the acquisition sensor to record video in real time, obtain the human somatosensory data, and carry out three-dimensional human reconstruction. The reconstructed three-dimensional human body model is displayed on the screen; the specific process is shown in Figure 6. Second, according to the design scheme of personalized clothing, different three-dimensional clothing effects are generated. Finally, the designed clothing is fitted to the three-dimensional human body model, and the effect of the user wearing the designed clothing is displayed in real time.

During the test, the fitter faces the Kinect device and can try a variety of dynamic fitting scenarios for different clothing types, such as striking arbitrary poses, turning the body, changing pose, or occluding parts of the body. Finally, image frames are randomly extracted from the recorded video for try-on effect display and test evaluation.

According to the test method designed in the previous section, the specific content of the system test is as follows:
(1) The clothing models tested include short-sleeved tops, sleeveless tops, long dresses, tube top skirts, trousers, cropped trousers, five-point trousers, and other styles and types of clothing models.
(2) Test postures include standing, squatting, front and rear leg raising, lateral leg raising, T-shaped posture, arm bending posture, akimbo posture, lunge posture, and other postures.
(3) Accuracy comparison of circumference fitting methods from the literature: the chest, waist, and hip circumferences were fitted by circumference fitting methods from the literature and by the method in this paper, and the fitting errors were recorded.
(4) Parameter optimization of the collaborative tracking method: the tracking error threshold M0 is set to different values to optimize the accuracy and real-time performance of the collaborative tracking method.
(5) Accuracy comparison of try-on methods from the literature: the accuracy of try-on methods from the literature is compared with that of the collaborative tracking method in this paper, and the relative tracking error M is recorded.
(6) Comparison of try-on effects for models of different body types: models of different body types try on clothing models of the same style, and the dynamic try-on effect is recorded in real time.

This paper integrates all functional modules through Unity3D scripting. The Kinect data acquisition module acquires two-dimensional image data, including RGB data of human body areas, depth data, and skeletal features. The somatosensory measurement module calculates the width and circumference of the human body: using the acquired feature data, the human body silhouette is extracted by a pixel clustering method based on joint features, the width of each body section is calculated, and the GBDT circumference calibration model is used to combine and calibrate the human body circumference sizes. The 3D clothing perception model building module marks the dynamic feature points of clothing in the 3D clothing simulation model and integrates the circumference features of the body sections to build a 3D clothing perception model with the personalized characteristics of human bones, silhouettes, and circumferences.

After the designer provides garment samples that have been assembled, the process of forming a new set of garments (including reading the stitching method of the parts from the parts library, retriangulating the parts, and fabric simulation) takes about 10 s in total. For a garment with 4 parts and four styles of each part, the number of patterns is about 50, and reading them takes about 3 s in total.

Taking the circumference size features of the human torso (chest circumference, waist circumference, and hip circumference) as an example, the accuracy of the GBDT circumference calibration model method proposed in this paper is compared with the linear regression algorithm and the ellipse model fitting method. By fitting the torso circumference of 10 fitters, each experimenter used the above three methods to calculate 3 times, respectively, and the fitting results of each method were recorded as the mean of the 3 calculations. The accuracy of the results can be calculated by combining the actual girth measurements of the 10 fitters. Figure 7 is a comparison of the relative errors of the three methods for girth size fitting (taking bust as an example).

From the statistics of the experimental results in Table 3, it can be seen that, compared with the ellipse model fitting method and the linear regression method, the method proposed in this paper has a mean absolute error (MAE) of 4.3 and a root mean square error (RMSE) of 4.6 for girth fitting. The overall error is smaller, so the method ensures that the clothing model matches the body structure and profile characteristics of the human body more accurately, improves the fit and personalized experience of virtual try-on, and is more conducive to clothing design and user experience feedback.
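For reference, the two error measures can be computed as in the following minimal sketch. The function names and the girth values are hypothetical and purely illustrative; they are not the measurements reported in Table 3.

```python
import math

def mae(measured, fitted):
    """Mean absolute error between measured and fitted values."""
    return sum(abs(m - f) for m, f in zip(measured, fitted)) / len(measured)

def rmse(measured, fitted):
    """Root mean square error between measured and fitted values."""
    return math.sqrt(sum((m - f) ** 2 for m, f in zip(measured, fitted)) / len(measured))

# Hypothetical bust girths in centimetres (not data from this study)
measured = [88.0, 92.0, 95.0, 101.0]
fitted = [90.0, 91.0, 99.0, 100.0]

bust_mae = mae(measured, fitted)
bust_rmse = rmse(measured, fitted)
```

Because RMSE squares the errors before averaging, it is never smaller than MAE and is more sensitive to occasional large fitting errors, so reporting both gives a fuller picture of the fitting accuracy.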

Based on the methods designed in this paper, the 3D model of the human body is generated using human 3D somatosensory technology, and a variety of clothing can be generated using the component combination technology of clothing design. Finally, through the alignment and 3D registration of the 3D model and the clothing, users can view the effect of the designed clothing, which helps to improve the efficiency of modern clothing design. Figure 7 shows the real-time viewing effect of some garments designed by this method; in this way, the various needs of users are met with personalized solutions.

5. Conclusion

By studying the architectural design of the 3D virtual try-on experience test system, the script development and integration of the module functions were completed, and a complete pipeline for data collection and the personalized combination of clothing components was constructed. The test software includes data collection and processing, the 3D clothing perception model, dynamic clothing try-on, real-time interactive control, virtual try-on display, and other main functional modules. It is low-cost and supports real-time interaction, and it realizes dynamic 3D virtual clothing try-on for real people. The key points of its implementation are as follows. On the one hand, the 3D human model is reconstructed based on the Kinect depth somatosensory camera. The human body model database was generated by MakeHuman, the depth point cloud data was captured by the Kinect depth camera, and the human body feature information (including height, circumference, feature points, and feature lines) was automatically extracted. Matching, translation, and rotation deformation algorithms are applied to the template, and mesh smoothing is performed on the deformed human model. The algorithm requires no user interaction, captures human body data in a timely manner, and implements fast human body deformation, resulting in a uniform 3D human body mesh. The algorithm can also deform the human body from parameters entered by the user. On the other hand, a method of personalized clothing design and combination based on 2D pattern to 3D clothing is proposed. The PBD cloth physics simulation method is adopted, whose main principle is to solve the problem through constraints. This method is fast, stable, easy to implement, and highly controllable. Finally, the storage structure and usage method of the clothing parts library are designed, and the free combination design method of clothing is realized.

This method has a higher degree of real-time limb coverage of clothing fabrics and a higher degree of posture matching in scenes where the human body changes greatly or the human body moves rapidly. Applicable clothing types and human posture types are more abundant (such as sideways, lunges, and akimbo), with a higher sense of reality. The clothing perception model fused with profile and girth features can match the dimensional changes of key parts of the human body. Through personalized clothing design, the combination of clothing parts is used to provide more choices of clothing styles, colors, and sizes.

Data Availability

The labeled datasets used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that there are no conflicts of interest.