Abstract

This article constructs and studies a graphic design software model based on data-driven image interaction. The semantic information of image interaction is integrated into the framework of an augmented reality label placement algorithm. A new feature map, the Guidance Map, is proposed to combine image saliency information with semantic information in the task scenario, accurately describing the importance of different regions in the user's field of view and thereby deriving a more reasonable label placement strategy. This distinguishes the approach from label placement algorithms that rely on saliency detection alone. The paper proposes a data-driven augmented reality labeling method: it first presents the design requirements for label placement in augmented reality scenes, then designs an energy function tailored to the characteristics of such scenes, transforms the label placement problem into an optimization problem, and solves it. The paper also proposes a feature representation spanning pixels, elements, interelement relationships, planes, and applications, and quantifies their geometric, perceptual, and stylistic features. The model can fit the probability density distribution of these features to predict the optimal element placement and element color under target conditions.

1. Introduction

The integration of mobile Internet and terminals in modern society has reshaped the social landscape. A diverse, open community full of infinite possibilities brings opportunities and challenges for everyone and every industry. Graphic design is a visual communication tool that creates and combines graphics, text, and other elements in various forms to convey messages to users. According to their learning objectives, deep learning models can be classified into discriminative and generative models [1]. Discriminative models learn to map high-dimensional, rich sensory input to category labels; generative models learn the underlying distribution of the data and generate similar new data from it. Thanks to the application of backpropagation and dropout algorithms, discriminative models have achieved many research results. Compared with discriminative models, generative models have accumulated fewer results, and their research progress is still slow, mainly because of the large amount of prior knowledge and computation required for modeling, which limits model training and data generation [2]. Professional graphic designers can design creative and visually appealing layouts, but there is no lack of simple and repetitive work in their practice. The feasibility of sketch solutions cannot be assessed visually and effectively in the design evaluation and decision phases. Design evaluation, on the other hand, is more maturely applied in the engineering field, but it lacks the flexibility and freedom of the conceptual design phase, and morphological solutions often need to be readjusted to fit engineering constraints during implementation. If some commonalities in layout design can be extracted to generate structures automatically, designers' workload will be significantly reduced, and they can build on these results and give full play to their creativity.

The Guidance Map is generated from the original images. The images' semantic information and saliency information are combined with the statistical features of task-driven manual annotation placement tendencies to establish an energy function, transforming the annotation placement problem into an optimization problem [3]. For the training process, the MLP database is built first, and the semantic tendency of user annotation placement is learned from the training-set images and manual annotation information to obtain a task-driven importance prior. For the testing process, the annotation placement algorithm starts when new objects are detected in the AR system and annotations need to be generated [4]. In the first stage, image data and annotation information are preprocessed: for the image data, its edge, saliency, and semantic segmentation maps are computed; for each predefined annotation, its POI location and annotation size are obtained. In the second stage, the Guidance image is generated. Based on the saliency map, the semantic segmentation map, and the task-driven importance prior (reflecting the semantic tendency of manual annotation) obtained from training, the Guidance image of the original image is computed, and the result is consistent with the user's understanding of the critical regions for a specific task [5]. In the third stage, the energy function is defined and optimized: the Guidance image, edge image, POI locations, annotation sizes, and other information are substituted into the energy function, which is solved with a greedy algorithm to derive the optimal annotation positions, as sketched below.
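
To make the pipeline concrete, the following minimal Python sketch combines a saliency map and a semantic segmentation map into a Guidance map and greedily scores candidate label positions. The linear blending weight `alpha`, the leader-line penalty, and the function names are illustrative assumptions, not the exact formulation of this article's energy function.

```python
import numpy as np

def guidance_map(saliency, semantic_labels, semantic_prior, alpha=0.5):
    """Combine a saliency map with a task-driven semantic prior.

    saliency:        H x W array in [0, 1] from any saliency detector.
    semantic_labels: H x W integer array from semantic segmentation.
    semantic_prior:  dict mapping class id -> importance weight learned
                     from manual annotation statistics (illustrative).
    """
    importance = np.vectorize(lambda c: semantic_prior.get(c, 0.0))(semantic_labels)
    return alpha * saliency + (1 - alpha) * importance

def place_label_greedy(guidance, poi, label_wh, candidates):
    """Pick the candidate whose label footprint covers the least
    important region of the guidance map (one greedy step)."""
    h, w = label_wh
    best, best_cost = None, np.inf
    for (x, y) in candidates:
        patch = guidance[y:y + h, x:x + w]
        if patch.shape != (h, w):
            continue  # label would fall outside the field of view
        # cost = occluded importance + a penalty on leader-line length
        cost = patch.sum() + 0.01 * np.hypot(x - poi[0], y - poi[1])
        if cost < best_cost:
            best, best_cost = (x, y), cost
    return best
```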

The development of computational design, user research, human-computer interaction, and related fields has injected new energy and brought a new vision to the traditional design industry. If the conventional design process is compared to creation from 0 to 1, the intelligent design process is more like evolution and development from 1 to 10. The computer can analyze the typical features in the data and apply them to an automated design process, producing batch output that meets business needs [6]. This new design paradigm places new demands on designers and poses challenges for design research and computer technology. There is as yet no mature and complete design theory describing the design model in the context of artificial intelligence. Taking graphic image design as an example, from posters seen everywhere in daily life to web advertisements of all sizes on the Internet, the demand for large volumes of low-value, consumable graphic design has always been a challenge [7]. These visual design tasks consume the human and financial resources of enterprises and society and waste the inherent learning value of design resources. Graphic design communicates a specific message visually by creating or combining elements such as symbols, images, and text.

Along with the rapid development of Internet technology in recent years, art and other disciplines in the context of big data have shown a trend of integrated development, and the scope of application of various data visualization presentations has expanded [8]. The content exhibited is no longer superficial; it reaches into social phenomena and ideology, and the design concepts show strong contemporary characteristics. The forms of expression are rich, diverse, and unconventional, with considerable value at the artistic level. Data visualization has brought vast development space to graphic design [9]. It also indicates that the diversified development of graphic design is an unstoppable trend, and the application of function images to graphic design is a relatively new research field within it; both the fundamental theoretical research and the practical application bring more development possibilities. In design information and knowledge integration, the traditional creative design stage is completed by designers with their own knowledge and experience [10]. The innovation and rationality of design solutions are therefore limited by the quantity, diversity, and professionalism of the design resources the designers themselves possess. This design method is slow, and it requires constant design communication and modification, making the design cycle longer. If we can take advantage of the booming Internet and other platforms to assist designers in designing and integrating resources, helping them understand products and accurately locate user preferences [11], we will vastly improve the efficiency with which preliminary design knowledge and resources are used, and thus improve the competitiveness of products.

In the conceptual creative design stage, traditional design is dominated by sketch solutions. As an expression tool in the creative concept generation stage, sketching facilitates designers' creation to a certain extent and improves the convenience and flexibility of design [12]. In innovative design research, many scholars have sought to break designers' creative thinking out of a fixed space and avoid the solidification of design thinking. However, most of this work still stays at the stage of two-dimensional sketching, and the limited information carried by a sketch makes communication between people of different disciplines more difficult and makes the decision evaluation of solutions challenging [13]. How to conduct multidisciplinary cross-innovation design with the help of computers and effectively reduce the number of design iterations and shorten the design cycle is a problem that design researchers have explored for many years [14]. More and more researchers have devoted themselves to design information integration, intelligent sketching, and related topics. However, the feasibility of sketch solutions cannot be evaluated intuitively and effectively in the design evaluation and design decision stages [15].

On the other hand, design evaluation is more maturely applied in the engineering field, but it lacks the flexibility and freedom of the conceptual design phase, and morphological solutions often need to be readjusted to fit engineering constraints during implementation. Dewi et al. proposed a 3D product design method that provides designers and users with a free, interactive way to explore product form, which improves creative design freedom to a certain extent [16]. Moreno and Ramirez proposed a 3D modeling generation method for aesthetic products, which combines aesthetic product choices through product-syntax transformation rules and parametric modelers to generate various morphological solutions [17]. Compared to interactive form exploration methods, traditional 2D sketches are more ambiguous and unstructured in their treatment of form and often require designers to spend a great deal of time exploring solutions. Pollice et al. argue that the focus of design research has evolved from the computer as a tool for representation to the computer as a tool for information transformation and idea stimulation [18]. Digital design is more closely associated with data and with computers; in contrast to the sketching process, digital exploration allows the computer to go beyond what the designer draws or imagines as a solution. In terms of abstract form, 3D forms are more expressive and can assist the designer in thinking visually and exploring the spatial structure of the product.

3. Construction of the Graphic Design Software Base Model Based on Data-Driven Image Interaction

3.1. Data-Driven Image Interaction Model Construction

Along with the development of modern technology, every field seeks ways to integrate with cutting-edge technology to stand firm in this era of rapid growth. In graphic design, innovation never stops, and the application of function images is a manifestation of the development of the times. A function image is obtained by processing the original data through computer operations, so the design elements it yields are "evidence-based": the process from nothing to something has clear calculation steps. This form of visual expression is one of the trends in the diversification of graphic design. Starting from the current situation of function images in graphic design, we explore its causes and study the aesthetic characteristics of function images in graphic design and their advantages for the development of the field. Many image fusion algorithms have been proposed in recent years, generally classified into seven categories according to the theory adopted: multiscale transforms, sparse representations, neural networks, subspaces, saliency-based methods, hybrid models, and other methods. The data-driven image generation design flow is shown in Figure 1.

Typically, multiscale transform-based fusion schemes for infrared (IR) and visible images consist of three steps: first, each source image is decomposed into a series of multiscale representations; then, the multiscale representations of the source images are fused according to a given fusion rule [19]; finally, the fused image is obtained by applying the corresponding inverse multiscale transform to the fused representations. The multiscale transform tools used for decomposition and reconstruction include wavelet transforms, pyramid transforms, curvelet transforms, and others. Over the past decades, multiscale transforms have played an essential role in infrared and visible image fusion. A multiscale transform decomposes the original image into components of different scales, each component representing a subimage at one scale, and real-world objects usually contain structures at different scales. Several studies have shown that multiscale transforms are consistent with human visual characteristics, a property that can make the fused image visually appealing. The key to multiscale transform fusion is the design of the fusion rules.
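
To make the three steps concrete, the sketch below fuses two grayscale source images with a Laplacian pyramid and a max-absolute rule using OpenCV. The rule and the number of levels are common illustrative choices, not those of any particular method cited here.

```python
import cv2
import numpy as np

def laplacian_pyramid(img, levels=4):
    """Step 1: decompose an image into a Laplacian pyramid."""
    gp = [img.astype(np.float32)]
    for _ in range(levels):
        gp.append(cv2.pyrDown(gp[-1]))
    lp = []
    for i in range(levels):
        up = cv2.pyrUp(gp[i + 1], dstsize=(gp[i].shape[1], gp[i].shape[0]))
        lp.append(gp[i] - up)
    lp.append(gp[-1])  # coarsest residual
    return lp

def fuse(ir, vis, levels=4):
    """Steps 2 and 3: max-absolute rule per level, then reconstruct."""
    lp_ir, lp_vis = laplacian_pyramid(ir, levels), laplacian_pyramid(vis, levels)
    fused = [np.where(np.abs(a) >= np.abs(b), a, b) for a, b in zip(lp_ir, lp_vis)]
    out = fused[-1]
    for lvl in reversed(fused[:-1]):  # inverse transform, coarse to fine
        out = cv2.pyrUp(out, dstsize=(lvl.shape[1], lvl.shape[0])) + lvl
    return np.clip(out, 0, 255).astype(np.uint8)
```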

When the source image has been decomposed into a series of multiscale representations, the multiscale representations of the infrared and visible images must be fused according to the given fusion rules. Coefficient combination is the most commonly used IR and visible image fusion rule; the maximum-value scheme, for example, selects the coefficient with the largest magnitude as the fused multiscale coefficient. Chai, for instance, obtains low-frequency and high-frequency coefficients by performing a quaternion wavelet transform on each source image and fuses the low-frequency sub-bands with a weighted-average rule based on the phase and magnitude of the low-frequency sub-bands and the spatial variance; the high-frequency sub-bands are fused by a rule that selects for maximum contrast and coefficient energy. Finally, the inverse quaternion wavelet transform constructs the final fused image. Design rules are the knowledge of designers based on design experience, which can guide the design process or constrain the design results.

Among rule-constrained feature models, the design template is one of the simplest forms. In a template, the design constraints are written into functions, and these functions allow the algorithm to output a design result in each style while satisfying the input conditions. One variable data printing algorithm for document layout, for example, automatically flows dynamic textual content into the document layout using layout constraints and linear functions. Another common way of constructing the model is an energy function that quantifies design features into several evaluation terms. The score of the energy function is then continuously improved by optimizing its parameters until a design result satisfying the threshold is obtained. For this type of evaluation function, the standard form is a weighted sum of criteria,

$$E(x) = \sum_{i} w_i E_i(x),$$

where each term $E_i$ scores one design criterion and $w_i$ is its weight.
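
As an illustration of this standard form, the sketch below evaluates a weighted sum of energy terms over a candidate layout. The overlap term and the weights are illustrative stand-ins for the criteria a real system would define.

```python
def total_energy(layout, terms, weights):
    """Weighted-sum energy E(x) = sum_i w_i * E_i(x); each term scores
    one design criterion (overlap, alignment, margin, ...)."""
    return sum(w * term(layout) for term, w in zip(terms, weights))

def overlap_term(layout):
    """Illustrative E_i: total pairwise overlap area between boxes."""
    boxes = layout["boxes"]  # list of (x, y, w, h)
    cost = 0.0
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            xi, yi, wi, hi = boxes[i]
            xj, yj, wj, hj = boxes[j]
            ox = max(0, min(xi + wi, xj + wj) - max(xi, xj))
            oy = max(0, min(yi + hi, yj + hj) - max(yi, yj))
            cost += ox * oy
    return cost

layout = {"boxes": [(10, 10, 100, 40), (50, 30, 80, 60)]}
print(total_energy(layout, [overlap_term], [1.0]))
```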

Constructing a probabilistic estimation model based on basic geometric features is the key to achieving print-advertisement design. The essence is to use kernel density estimation to fit the sampled geometric features of elements into a continuous probability distribution, so as to predict the distribution of the elements under specific conditions. Kernel density estimation is a nonparametric method for estimating probability densities: it fits a smooth kernel function to the observed data points and thereby provides a probabilistic fit to the distribution curve. Since the method does not rely on prior knowledge of the distribution of interest and attaches no assumptions to the data distribution, it is often used to characterize the distribution of the data sample itself. Suppose there are $n$ mutually independent sample points $x_1, x_2, \ldots, x_n$, that their distribution function is $F(x)$, and that their probability density function is $f(x)$; then, for a window of width $h$ around a point $x$, the probability that a sample falls in it is

$$P(x - h < X \le x + h) = F(x + h) - F(x - h).$$

This quantity denotes the probability of a sample occurring near the point $x$ when $h$ is sufficiently small. Since the probability density function is the first-order derivative of the distribution function, it can be defined through the definition of the derivative as follows:

$$f(x) = F'(x) = \lim_{h \to 0} \frac{F(x + h) - F(x - h)}{2h}.$$

Since the actual distribution $F$ is unknown, the empirical distribution function $F_n(x) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}(x_i \le x)$ can be used to approximate it, which yields the kernel density estimator

$$\hat{f}_h(x) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right),$$

where $K$ is a non-negative kernel function and $h$ is the smoothing parameter, also known as the bandwidth. To ensure that the density estimate integrates to 1, the kernel can be taken from the density of another distribution; common choices include the uniform, triangular, gamma, and Gaussian kernels [20]. These kernel functions satisfy the properties of integrating to 1, having mean 0, and so on. The larger the bandwidth, the smaller the contribution of each observed data point to the final curve and the flatter the overall KDE curve; the smaller the bandwidth, the larger each point's contribution to the final curve shape and the steeper the overall KDE curve. There are many ways to choose the bandwidth, and cross-validation is a commonly used method: in constructing the probability density function, the data are divided into a fitting part and a validation part, and by comparing the validation results over multiple bandwidth values, the bandwidth with the smallest error is adopted as the smoothing parameter. Based on one-dimensional variables, the kernel density estimator can be further extended to multidimensional variables,

$$\hat{f}_H(x) = \frac{1}{n} \sum_{i=1}^{n} |H|^{-1/2} K\!\left(H^{-1/2}(x - x_i)\right),$$

where $H$ is a bandwidth matrix.
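
A minimal sketch of the estimator and of bandwidth selection by cross-validation follows, assuming the samples are normalized geometric features. The data and the bandwidth grid are placeholders, and scikit-learn's KernelDensity is one common implementation.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KernelDensity

def kde_gaussian(samples, x, h):
    """Direct evaluation of f_hat(x) = (1/(n*h)) * sum_i K((x - x_i)/h)
    with a Gaussian kernel K."""
    samples = np.asarray(samples, dtype=float)
    u = (np.asarray(x)[None, :] - samples[:, None]) / h
    k = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    return k.sum(axis=0) / (len(samples) * h)

# Bandwidth selection by cross-validation: the score is the held-out
# log-likelihood, and the best bandwidth is the smoothing parameter
# with the smallest validation error.
samples = np.random.rand(500, 2)  # placeholder (n, d) geometric features
grid = GridSearchCV(KernelDensity(kernel="gaussian"),
                    {"bandwidth": np.linspace(0.01, 0.3, 30)}, cv=5)
grid.fit(samples)
print("selected bandwidth:", grid.best_params_["bandwidth"])
```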

The color characteristics of the background of text elements can be used as conditions to construct corresponding conditional probability density functions for estimating the probability distribution of text color characteristics. Figure 2 shows the distribution of text color brightness under different background brightness conditions. From the results, we can see that against a darker background (e.g., background brightness 0.2), the text color tends toward high brightness; against a brighter background (e.g., background brightness 0.9), the text color brightness is mostly below 0.5; and when the background brightness is in the middle region, high-brightness text colors are used with high probability, but some low-brightness colors can also be used appropriately. Compared with unconditional kernel density estimation, some features of graphic design images are more convergent under conditional probability estimation, and the more convergent results give the computer more design guidance.
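
One simple way to realize such a conditional estimate is to weight the density of the target feature by a kernel on the conditioning variable, a Nadaraya-Watson-style construction. The sketch below assumes this construction; it is illustrative, not necessarily the estimator behind Figure 2.

```python
import numpy as np

def conditional_density(bg, text, bg_value, h=0.05, grid=None):
    """Estimate p(text brightness | background brightness ~ bg_value)
    by kernel-weighting samples on the conditioning variable.

    bg, text: paired 1-D arrays of background and text brightness samples.
    """
    bg, text = np.asarray(bg, float), np.asarray(text, float)
    if grid is None:
        grid = np.linspace(0, 1, 100)
    w = np.exp(-0.5 * ((bg - bg_value) / h) ** 2)        # weights on condition
    u = (grid[None, :] - text[:, None]) / h
    k = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)       # Gaussian kernel
    dens = (w[:, None] * k).sum(axis=0) / (w.sum() * h)  # weighted KDE
    return grid, dens
```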

3.2. Graphic Design Software Base Model Design

Design features are a form of data representation for graphic design images, and they directly affect the visual presentation of a design. The graphic image design feature model is shown in Figure 3; it supports the construction of the feature model and the learning of graphic design. The features comprise, from bottom to top, the pixel, element, relationship, plane, and application layers, whose corresponding features are display, geometric, perceptual, style, and business features. The display features at the lowest layer are the computer's raw data representation. The higher the layer, the higher the level of abstraction of the represented features, and the closer the features are to subjective human perception, the more difficult they are to quantify objectively with formulas [21]. Pixels are the primary data representation of an image and are usually stored in the computer as RGB values; the model refers to such pixel data as display features. Many current deep-learning-based image generation methods use pixels as training data to perform image generation, style rendering, and color reconstruction end-to-end. Display features are mainly suited to learning from natural images, because such images are acquired by shooting and use the pixel as the basic representation unit. Elements are the fundamental representation of graphic design images, because visual design images are usually composed of different design elements (including logos, pictures, and text) combined according to specific rules. The inherent characteristics of elements are called geometric features; standard geometric features include an element's position, size, and color. Beyond these inherent characteristics, the geometric features of elements also include the roles elements play according to their function in a graphic image. In print advertising images, the common element types are image elements, text elements, and graphic elements. Image elements often appear as the main body of the advertisement, such as products and characters, and can serve as commodity, background, trademark, and other roles; text elements are the carriers of information and are often divided into primary copy, subcopy, content copy, and action copy according to their function; graphic elements usually serve decoration and framing purposes, with presentation forms including borders, lines, fragments, tiling, and areas.

Generative adversarial networks are widely used for unstructured data such as images because of their outstanding performance. In the problem of automatic generation of planar layouts, if the usual treatment is followed, that is, the input to the generative adversarial network is the layout image itself, the network cannot capture the dependencies between elements well, and it is also challenging to capture the layout patterns accurately, resulting in poorly generated layouts. In response, LayoutGAN turns to structured data: both the input and the output of the generator correspond to the category probabilities and geometric parameters of each element in the layout image, which makes good use of the precision of the geometric parameters and thereby improves the layout generation results. In addition, LayoutGAN provides two discriminators of different structure to distinguish the layouts synthesized by the generator from real layouts: a relation-based discriminator and a wireframe rendering discriminator. The generator of the LayoutGAN model, the relation-based discriminator, and the wireframe rendering discriminator are described in detail below. The structure of the generator is shown in Figure 4.

Suppose a layout contains $n$ elements, each represented by its category probability $p_i$ and geometric parameters $\theta_i$. The generator takes as input category probabilities $p_i$ randomly drawn from a uniform distribution and geometric parameters $\theta_i$ randomly drawn from a normal distribution, so the input data is $\{(p_i, \theta_i)\}_{i=1}^{n}$. After processing, the final output is the adjusted category probabilities and geometric parameters, $\{(p_i', \theta_i')\}_{i=1}^{n}$. The encoder is a multilayer perceptron that first processes the category probability and geometric parameters of each element; the relational module integrates the spatial and semantic relationships between elements and further refines the features output by the encoder; and the decoder, also a multilayer perceptron, reconstructs the refined features formed by the relational module and finally outputs new category probabilities and geometric parameters, at which point the individual elements form a good layout.

Regarding the specific operation of the relational module, suppose the embedding feature of element $i$ is $f(p_i, \theta_i)$ and the embedding feature of element $j$ is $f(p_j, \theta_j)$; the relational module processes them and outputs a refined feature $f'(p_i, \theta_i)$ as a residual update over pairwise relations,

$$f'(p_i, \theta_i) = f(p_i, \theta_i) + \frac{W_r}{N} \sum_{j \neq i} H\big(f(p_i, \theta_i), f(p_j, \theta_j)\big),$$

where $H$ models the relation between two elements, $W_r$ is a learned linear projection, and $N$ is the number of elements considered.
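
A compact PyTorch sketch of this encoder-relational module-decoder structure follows. The hidden size and the MLP form of the pairwise term $H$ are simplifying assumptions, so this is a sketch of the structure rather than the published LayoutGAN implementation.

```python
import torch
import torch.nn as nn

class LayoutGenerator(nn.Module):
    """Encoder -> relational module -> decoder over n layout elements,
    each given as class probabilities p plus geometric parameters theta."""

    def __init__(self, n_cls=12, n_geo=4, d=128):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(n_cls + n_geo, d), nn.ReLU(),
                                 nn.Linear(d, d), nn.ReLU())
        self.rel_h = nn.Linear(2 * d, d)  # pairwise term H(f_i, f_j)
        self.rel_w = nn.Linear(d, d)      # residual projection W_r
        self.dec = nn.Sequential(nn.Linear(d, d), nn.ReLU(),
                                 nn.Linear(d, n_cls + n_geo))

    def forward(self, x):                 # x: (batch, n, n_cls + n_geo)
        f = self.enc(x)                   # per-element embeddings (batch, n, d)
        n = f.size(1)
        fi = f.unsqueeze(2).expand(-1, -1, n, -1)
        fj = f.unsqueeze(1).expand(-1, n, -1, -1)
        pair = torch.relu(self.rel_h(torch.cat([fi, fj], dim=-1)))
        mask = 1.0 - torch.eye(n, device=x.device).view(1, n, n, 1)
        ctx = (pair * mask).sum(dim=2) / max(n - 1, 1)  # mean over j != i
        f = f + self.rel_w(ctx)           # residual relational refinement
        return self.dec(f)                # adjusted (p', theta')
```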

The structure of the relation-based discriminator is somewhat like that of the generator, in that it is composed of an encoder, a relational module, and a classifier. The relation-based discriminator first uses an encoder consisting of a multilayer perceptron to embed each element's category probabilities and geometric parameters; a simple relational module then extracts the spatial relationships between the elements and performs a global pooling operation; finally, a classifier, also structured as a multilayer perceptron, outputs a probability value for distinguishing real layouts from synthetic ones. The wireframe rendering discriminator instead uses a convolutional neural network, which is good at capturing spatial relationships. Since convolutional neural networks usually only take images as input, while the original information here is data consisting of category probabilities and geometric parameters, the wireframe rendering discriminator adds a wireframe rendering layer before the convolutional network to transform the input data into an image representation. The wireframe rendering layer operates as follows: assume each input element can be rendered as a grayscale image $I_1, \ldots, I_n$ of size $H \times W$, and the output image is of size $H \times W \times C$, where $C$ is the number of channels of the image and equals the number of categories of the input elements, the two being in one-to-one correspondence. The output image is then related to the grayscale rendering of each element by max pooling over the elements, weighted by the category probabilities:

$$I(x, y, c) = \max_{i} \; p_{i,c} \, I_i(x, y).$$
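
This max-pooling fusion can be written in a few lines of PyTorch; the tensor layout below (batch, elements, height, width for the renderings) is an assumption made for illustration.

```python
import torch

def wireframe_render(gray, p):
    """Rendering-layer fusion I(x, y, c) = max_i p[i, c] * I_i(x, y).

    gray: (batch, n, H, W) grayscale wireframe rendering of each element
    p:    (batch, n, C)    category probabilities of each element
    returns: (batch, C, H, W), one channel per element category
    """
    # weight each element's rendering by its probability for every class,
    # then take the maximum response over elements at each pixel
    weighted = p.permute(0, 2, 1)[..., None, None] * gray[:, None]  # (b, C, n, H, W)
    return weighted.max(dim=2).values
```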

Compared with traditional sketching and computer-aided design, computational design relies on logical rules, data flow, and interface management during data processing. Its operation is presented as a data flow and places stringent requirements on the structure and organization of data; different interfaces may lead to different results or invalid properties. Logical rules and data flow direction are therefore essential for computational design. This chapter studies the construction of a directed morphological graph (DMG) for these problems [22] and gives a general flow for constructing one, to assist designers in establishing a logical pathway for morphological computational program design. A DMG assists designers in the computational design of product morphology, providing guidance on design content and an analytical thought process for implementing morphological algorithms. The designer first comes to understand the design task; then collects design information and sketches ideas; integrates and abstracts design elements and design resources by clarifying the computational design ideas; then designs and formulates rules for the classification logic algorithms; determines parameter intervals by converting and debugging the abstracted resources into parameters; and finally carries out the algorithmic implementation of the morphological computation to obtain sets of morphological computational solutions.

4. Analysis of Results

4.1. Data-Driven Image Interaction Model Analysis

At present, graphic art has developed into an essential global "language" because of its uniqueness. The continuous development of science, technology, and the economy has contributed to the diversification of graphic art. Graphics need artistry, but they also need a certain degree of scientific rigor to guide the audience toward the aesthetics of science and technology, so designers in this field need both scientific and artistic thinking skills. To give the picture a rich modern beauty, graphics can be designed through rational analysis and scientific induction, starting from the three main aspects of point, line, and surface. As an indispensable element of morphological composition, the point has rich artistic expressiveness. When function images are applied in graphic design, the essential step is determining the proper position of points in space. A point can represent a specific spatial location and express the size and form of an object; it can be used as a visual image by itself, and points can also be arranged and combined in different forms to produce a variety of graphic formats. In the application of function images, the point is the smallest unit and can effectively indicate a specific direction and shape. Points also exist in different colors, dimensions, and sizes. Their greatest advantage is that they can show a high degree of concentration, attracting the audience's attention and forming a visual center, thus producing a strong visual impact. When applying function images in graphic design, special attention should be paid to the use of point elements, which enhance the visual effect when used correctly, and combining multiple points according to appropriate rules creates a distinctive sense of dynamism and rhythm. Differences in the size, position, and direction of the points bring the audience different visual experiences. In graphic design, points can take many forms depending on the characteristics of the information to be conveyed: text, graphics, and other elements used as points can become a visual center, and the focal point formed by lines or surfaces through intersection and superposition can also serve as a visual center, yielding a better visual experience through the representation of the visual center. The comparison of image generation results of different color feature models is shown in Figure 5.

Lines are the trajectory of a moving point and the starting point for forming surfaces, and the movement of lines can form a variety of surfaces; lines therefore carry a sense of motion and direction. With MATLAB software, the rhythmic and dynamic characteristics of lines can be expressed in different ways. Compared with points and surfaces, lines have stronger variability and individuality and are a very active element in visual expression. An orderly arrangement of dots or the outer contour of characters can be regarded as lines, which often guide the visual direction and reading order in graphic works. Lines of different morphology produce different visual perceptions [23]: thick straight lines give a heavy feeling, thin straight lines give a sharp feeling, and curves give a soft, natural feeling. The reasonable use of curves in graphic visual works can attract the audience's attention quickly because of instinctive human reactions; brain neurologists in America and Japan have found characteristic brain-wave responses when people observe curves, and curves have qualities of flow, softness, and change. Curves therefore bring visual enjoyment and can satisfy people's psychological and physiological needs. Thus, the concept of a modern architectural form skeleton with a changeable shape, combined with function images, can be fully displayed in design practice. The function image of the architectural skeleton generated with MATLAB contains rich curves that express the theme and convey information; at the same time, it attracts the audience and prompts rich associations, showing the artistic charm and expressive characteristics of function images under the concept of modern architectural form. The results of image chromaticity feature adjustment are shown in Table 1.

There are many ways to express the surface in graphic design. Its size can be changed as needed, and it can suggest both two-dimensional and three-dimensional space, giving the audience a richer visual experience. The artistic effects of surfaces applied according to different needs are rich and diverse and bring the audience different feelings [24]. For example, faces arranged parallel or perpendicular to one another give the audience a straight, dull, and stable feeling, while faces set along free curves by varying the input variables can become beautiful, smooth, and dynamic, giving a visual feeling of natural form, free expression, and rich change. Compared with points and lines, surfaces have a more powerful visual presence and higher consistency of color, so for graphic visual works the message can be conveyed better. In design practice, therefore, function images applied appropriately in the form of surfaces give the work visual impact, more uniform colors, and more efficient communication of information.

4.2. Graphic Design Software Base Build Implementation

For natural images, it is challenging to capture pairs of pictures with different styles but the same content, so it is not easy to compare the effect of image migration pixel by pixel. The dataset chosen for the experiments in this section is the SUNCG dataset, a large 3D model dataset of indoor scenes containing over 45,000 different interior scenes with manually created room layouts. Since this article targets floor plan layouts, only the room layout drawings in the SUNCG dataset are selected; after filtering and rotation, 7500 room layout drawings are finally obtained. Since rooms differ in size and shape, the images are uniformly resized to 512 × 512 while keeping the room shape and size proportions unchanged, as shown in Figure 6. There are 25 object categories in the rooms; this experiment examines only the 12 categories with high frequency and one-hot encodes the categories. For convenience of display, a different color is also set for each category. The coordinate information of the objects belonging to the 12 selected categories is extracted from each room layout map and normalized; assuming at most 16 objects per layout map, combining the category probabilities and geometric parameters yields an input of size 16 × 16.
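
A minimal sketch of this encoding step is shown below, assuming each object is given as a class id plus a pixel-space bounding box; the helper name `encode_layout` and the exact field order are illustrative.

```python
import numpy as np

N_CLASSES, MAX_ELEMS = 12, 16  # 12 frequent categories, up to 16 objects

def encode_layout(objs, img_size=512):
    """objs: list of (class_id, x, y, w, h) in pixels for one room layout.
    Returns a (16, 16) array: one-hot class (12) + normalized geometry (4)."""
    out = np.zeros((MAX_ELEMS, N_CLASSES + 4), dtype=np.float32)
    for i, (cls, x, y, w, h) in enumerate(objs[:MAX_ELEMS]):
        out[i, cls] = 1.0                                       # one-hot category
        out[i, N_CLASSES:] = np.array([x, y, w, h]) / img_size  # normalize coords
    return out
```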

Precision and recall are two crucial evaluation metrics in information retrieval and can be used to evaluate the quality of data annotation. Precision is the ratio of correctly annotated positive samples to all annotated samples; it measures the proportion of correct annotations among all annotations, and a higher precision indicates more accurate annotation. Recall is the ratio of correctly annotated positive samples to all actual positive samples; it measures the completeness of image annotation, and a higher recall indicates that the image is more adequately annotated. To calculate the precision and recall of the annotation results on sampled images, the ground truth of these images must first be obtained. However, design element labeling is a subjective judgment process, and different labeling results can be obtained from different design perception perspectives [25]. Five experts in graphic design were therefore invited to annotate the sampled images, producing uniform ground-truth values for evaluating annotation quality: based on the five experts' individual annotations, the ground truth of each print advertisement image was decided by voting. The results show that about 37% of the images in the dataset have more than half of their area labeled, while print advertisement images with less than 20% of their area labeled account for about 10% of the dataset. The most frequent number of elements in the statistics was between 6 and 8, and each image contained 7.7 graphic design elements on average. A comparison of the visual design software image results is shown in Figure 7.
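
For concreteness, a minimal sketch of these two metrics over per-image annotation sets follows; representing annotations as hashable identifiers is an assumption made for illustration.

```python
def precision_recall(predicted, ground_truth):
    """predicted / ground_truth: sets of annotation identifiers
    (e.g., element ids or discretized boxes) for one image."""
    tp = len(predicted & ground_truth)                         # true positives
    precision = tp / len(predicted) if predicted else 0.0      # correct / annotated
    recall = tp / len(ground_truth) if ground_truth else 0.0   # correct / actual
    return precision, recall
```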

The optimization algorithm used in this section is still Adam, with the learning rate set to 0.00002, $\beta_1$ set to 0.5, and $\beta_2$ set to 0.999. The category probabilities and geometric parameters input to the generator are drawn from a uniform distribution on [0, 1] and from a normal distribution with mean 0.5 and variance 0.15, respectively. The category probabilities and geometric parameters corresponding to 4160 layout maps are fed in per training pass, the batch size is set to 64, and training the model takes about 50 hours. To verify the algorithm's effectiveness, four evaluation metrics are defined on the MLP database: the average distance from manual annotation, the overlap area of annotations, the number of guideline crossings, and the average guideline length. Qualitative analysis then shows that the method in this article avoids important areas in the field of view, such as pedestrians, vehicles, and traffic signs, and has shorter guideline lengths than previous augmented reality labeling algorithms. Quantitative analysis shows that the method effectively implements both the general criteria and the specific criteria of augmented reality annotation placement. The ablation study shows that the algorithm's two innovations, fusing semantic information to generate Guidance images and replacing the saliency detection algorithm with a deep-learning-based one, improve the evaluation metrics of annotation placement. The final user study shows that this article's algorithm outperforms previous annotation placement methods in real-world readability, virtual annotation readability, and subjective user perception.
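
A sketch of this optimizer configuration in PyTorch follows, reusing the LayoutGenerator sketched earlier; the stated 0.15 is treated as the standard deviation of the normal distribution, and the discriminator setup is assumed to mirror the generator's.

```python
import torch

# generator as sketched above; a discriminator would be configured analogously
generator = LayoutGenerator(n_cls=12, n_geo=4)

# hyperparameters from the text: lr = 2e-5, beta1 = 0.5, beta2 = 0.999
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-5, betas=(0.5, 0.999))

# inputs sampled as described: p ~ U(0, 1), theta ~ N(0.5, 0.15)
batch = 64
p0 = torch.rand(batch, 16, 12)                         # category probabilities
theta0 = torch.normal(0.5, 0.15, size=(batch, 16, 4))  # geometric parameters
fake_layout = generator(torch.cat([p0, theta0], dim=-1))
```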

5. Conclusion

Acquiring and integrating open design resources such as the Internet and the IoT to assist creativity has become a new strategy for enhancing the quality, innovation, and competitiveness of design solutions. In this article, based on the construction of a graphic design software model for data-driven image interaction, the semantic information of images is integrated into the framework of the augmented reality annotation placement algorithm. A new feature map, the Guidance Map, is proposed to combine image saliency information with semantic information in the task scenario; it describes more accurately the importance of different regions in the user's field of view and thus leads to a more reasonable annotation placement strategy. An augmented reality MLP database is constructed, which on the one hand is used to learn the semantic tendency of manual annotation and on the other hand serves as a benchmark for the augmented reality annotation placement problem. An image data-driven augmented reality annotation method is proposed. Based on the synthesized dataset, edge and detail loss functions are designed within a conditional GAN, and the network is trained end-to-end with the synthesized multimodal image dataset to finally obtain a fusion network. This network enables the fused images to better preserve the details of visible images and the target features of IR images and to sharpen the boundaries of the thermal radiation targets in IR images. The method's effectiveness is verified by subjective and objective image quality evaluation, in comparison with mainstream methods such as IFCNN, DenseFuse, and FusionGAN on the TNO public dataset. Good results have also been achieved in data-driven image interaction for graphic design software.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by the Jimei University Fund Project "Traditional Creative Product Design Research Based on Design Thinking" (No. Q202005) and the Fujian Provincial Social Science Fund Youth Project "Design Strategy of Twenty-Four Solar Terms Cultural and Creative Products Based on Intangible Cultural Heritage Innovation" (No. FJ2020C058).