Special Issue: Data-Driven Fuzzy Multiple Criteria Decision Making and its Potential Applications 2022
Graphic Language Representation in Visual Communication Design Based on Two-Way Long- and Short-Memory Model
With the growing popularity of neural network research, applications based on neural network models are gradually reaching all aspects of people’s lives. A neural network model can not only solve the algebraic problems that traditional machine learning can solve but also recognize and analyze graphics through self-learning. Face recognition, web page recognition, and product packaging design and application, for example, are all inseparable from the dissemination of graphic language. When these processes are implemented on a computer, the graphic language must be identified accurately. However, traditional machine learning performs poorly on graphic language, so applications built on it fail to achieve the purpose of the original visual communication design. Therefore, building on neural network algorithms, this paper proposes an improved neural network model, the two-way long- and short-memory model, to make computer recognition of graphic language more accurate, and further explores graphic language representation in visual communication design based on this model.
1. Introduction
With the rapid development of the Internet, people’s communication has gradually shifted from face-to-face or written communication, such as letters, to online communication, such as social software and instant channels like e-mail. Traditional communication methods feel formal and serious, while Internet communication feels fast and convenient. In addition, Internet communication is immediate and is not limited by time or space, which makes communication more efficient. At the same time, it greatly reduces the cost of communication and increases its convenience. Different kinds of communication can take place in various Internet applications and across platforms, which increases the integration of communication. Therefore, Internet communication not only carries emotional communication between people but has also become one of the main working methods of many Internet companies. In this context, traditional written communication cannot meet current requirements for conveying mood, feeling, and aesthetics. In the study of nonverbal symbols, Alberton Meige found that, among the many ways of expression, facial and body expression are the main ones, accounting for 55% of the overall expression process, while language, the carrier of the content itself, accounts for only 45%. Similarly, when people interpret communication content, replacing language with expression often helps them understand it better.
In a study initiated by the University of Bangor in 2015, more than half of the respondents who regularly communicated over the Internet believed that emoticons expressed their meaning more directly and accurately than text and also helped them better understand each other’s intentions. Emoticons are essentially graphics, so this survey suggests that when people express or receive communication content, they prefer to convey the intention, especially emotion, in visual form. In addition, visual communication designs such as posters, leaflets, and advertisements often use large-area graphics instead of cumbersome words to quickly attract the attention of the target group and convey the intended message. This phenomenon also reflects that, in today’s fast-paced society, people increasingly pursue fast and intuitive visual communication design.
Visual communication refers to the communication of information through the visual sense. In the new era, as the way people obtain information changes, traditional design concepts are gradually being phased out. The new visual art of obtaining information offers a more reasonable way to express, disseminate, and understand information. This visual language and visual space is called visual communication design. Shaped by the times, people’s expectations of visual communication lead visual communication design to rely mainly on images to convey information. Images are composed mainly of graphics, so graphics have become the main element of visual communication. In daily life, people often use graphics to convey information quickly and directly; ubiquitous public signs, for example, usually guide people’s behavior through a simple gesture or symbol. This process turns a basic visual form into a generally accepted visual language. Once the pattern is generally recognized, the purpose of guiding behavior is conveyed through the graphic as a visual effect, and the capture and understanding of that effect by the target audience forms a complete visual experience. Therefore, graphics in visual communication design must be selected with extra care. The selected graphics should accurately express the meaning described in language and communicate the intention accurately to the target group. At present, however, “graphic language” is commonly ignored in visual communication design: graphics are chosen for overall composition and layout, while the language representation they contain is overlooked.
This isolates graphics from meaning in visual communication design, so the target group cannot accurately understand the meaning and information that the graphics are meant to express, and the design loses its communicative significance. To let graphic language play a real role in visual communication and integrate the “line” and “meaning” in the design, the different meanings represented by different graphics must be accurately classified, and graphics must then be selected according to the intention of each visual communication design, thereby realizing graphic language representation in visual communication design.
Building on neural network algorithms, this paper proposes an improved neural network model, the bidirectional long- and short-memory model, which makes computer recognition of graphic language more accurate, and further explores graphic language expression in visual communication design based on this model. The long- and short-memory model is introduced into graphic language classification to accurately classify the language contained in graphics. The results show that the two-way long- and short-memory model performs better in recognition accuracy for both graphic language expression and visual communication design intention. It can not only solve the algebraic problems that traditional machine learning can solve but also recognize and analyze graphics through self-learning.
2. Related Work
The study of graphic language representation began earlier in the West than in China. It originated from the study of graphic topology and was first taken up by Gestalt psychological aesthetics. Rudolf Arnheim, one of its founders and representatives, put forward the theory of how perception is organized in design in 1912. This theory is also the core of Gestalt psychological aesthetics. Arnheim holds that designers should fully understand the psychological and physiological characteristics of human aesthetic perception and express specific feelings through graphics with different characteristics. The theory also discusses the perceptual and emotional features carried by different graphics, such as the meanings represented by the principles of similarity, proximity, and continuity, and the emotional meanings they carry in a specific environment. In addition, so that visual communication design better follows the laws of human visual cognition, the theory studies the principle of integrity and closure tendency and the principle of selective features. These studies laid the foundation for the later application of graphic language representation in visual communication design. On this basis, the Austrian philosopher Otto Neurath put forward his well-known graphic communication system theory in the 1920s. It aims to create a global graphic communication system so that even people who share no common language can, through this set of teaching standards, build a graphic world independent of the real world. In this graphic world, communication is carried out through the visual symbols of graphics, across cultures and regions.
In China, the teaching of graphic language has attracted extensive attention, and domestic art colleges and universities have made it a key part of their teaching content. However, under the influence of existing graphic language theory, too much emphasis is placed on the role of creativity in applying graphic language, while the receiver’s feelings about and relationship to graphic language in the process of visual communication are ignored. Therefore, the gap between visual communication theory and practice needs further exploration.
The practical application of graphic language is inseparable from the development of computer graphics technology. In computer graphics, texture and graphics classification are indispensable research topics, and neural-network-based computer graphics has become a research hotspot in recent years. Early research on computer graphics was mainly based on statistical methods and multiresolution filter algorithms, proposed by Simoncelli and Portilla. The pyramid algorithm was later integrated on this basis, which greatly improved the quality and speed of graphic classification, and subsequent graphic classification methods largely follow this approach. Zhu et al. first introduced machine learning into computer graphics, bringing the Markov model into graphics classification and making the classification algorithm far more general and controllable than earlier algorithms. Subsequently, as machine learning and neural network research became widespread, more and more researchers applied these algorithms to computer graphics. For example, the fast Fourier transform domain is used for graphics classification and matching to reduce computational complexity. In 2009, Wei et al. comprehensively summarized the definitions and algorithmic development of computer graphics. In 2012, the team led by Hinton took part in the ImageNet image recognition competition, and its AlexNet, based on a convolutional neural network, won the competition with very high accuracy. This set off an upsurge of deep learning research based on convolutional neural networks in computer graphics, and from 2012 to 2017 deep-learning-based computer graphics produced many research results.
Image classification based on convolutional neural networks surpassed human-eye classification for the first time. The results are used not only in visual communication design that requires image classification but also widely in other fields that require image recognition. For example, AlphaGo, based on a deep learning algorithm, beat human professional Go players for the first time, a result that also involves the visual communication role played by graphic language.
The difficulty of graphic language representation and classification lies mainly in the computer’s understanding of visual problems. When the human brain recognizes graphics, both the visual system and the conscious neural layers of the brain take part. The brain analyzes and organizes the geometric features captured by the visual system and draws on the associative function of the prefrontal lobe to achieve the visual transmission of graphic language representation. For computers, however, dealing with visual problems has proved much more difficult than people imagined. Computer representation and recognition of graphic language is divided into three links. The first is recognizing the object entity. Here the computer must accurately identify the physical characteristics of the object and still recognize its original appearance under different environmental influences. The difficulties in this link include perspective differences when the object is viewed from different angles, changes in the object’s color under different lighting conditions, and recognizing the whole object from its visible parts when it is occluded by other objects. The second link, after the object has been identified, is to determine its category. The difficulty here lies in accurately classifying objects with similar physical characteristics. According to the natural properties of the classified objects, we divide the difficulties into category differences and background interference.
Category differences can be large or small. A large category difference means that the classified object differs greatly from the classification sample; for example, when the sample category is cars, a car is easily distinguished from a large truck. A small category difference, such as that between dogs and wolves, can be found only by careful comparison. Background interference means that the object shares material, color, or other characteristics with the background, which interferes with discrimination. For example, when a polar bear is to be identified in a picture with a snowy background, both are mainly white, which complicates recognition. The third link is conceptual-level recognition, which relies on the associative function unique to the human brain. Over its long history, human society has formed its own humanistic culture, so when people see a figure they naturally associate it with other characteristics. This is the important role of graphic language representation in visual communication, and it is a function that computers performing only calculation and comparison do not have. In addition, all three links require a certain logical reasoning ability that traditional computers cannot provide. With the development of neural network technology, computers have gradually acquired some ability for logical reasoning and learning. Picture recognition is essentially the recognition of graphics, so neural network algorithms have gradually been introduced into the research of graphic language representation.
In this paper, we use the two-way long- and short-memory model, one of the commonly used neural network models, to study graphic language representation in visual communication design. This model is the bidirectional form of the long short-term memory (LSTM) model, and its cycle structure is shown in Figure 1.
The most classic neuron model in neural networks is the “M-P neuron model”, which consists of input signals, linear weighting, summation, and a nonlinear activation function. Through this process, each neuron transforms its input signals into an output. Common activation functions include the sigmoid, tanh, ReLU, maxout, and ELU functions. The activation function used in the two-way long- and short-memory model is the tanh function, whose expression is given in formula (1).
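The expression itself is not reproduced in this text. For reference, the standard definition of the hyperbolic tangent is given here; treating it as formula (1) is an assumption based on the sentence above.

```latex
\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}, \qquad \tanh(x) \in (-1, 1)
```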
A neural network usually has three kinds of layers: an input layer, a hidden layer, and an output layer. Every two adjacent layers are connected to form a perceptron, in which the logical operations required for problem solving are carried out. The logical operations that a perceptron can perform include Boolean operations such as AND and OR. The learning rule of the perceptron is very simple: it adjusts the weights to obtain the best result. The weight update is given in formulas (2) and (3).
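As an illustration of the perceptron learning rule, the following minimal sketch trains a single perceptron on the AND and OR truth tables. It assumes the textbook update, in which each weight is changed by the learning rate times the prediction error times the input; this may differ in detail from formulas (2) and (3), which are not reproduced here. The function names `train_perceptron` and `predict` and the learning rate `eta` are hypothetical.

```python
import numpy as np

def train_perceptron(X, y, eta=0.5, epochs=20):
    """Train one perceptron with a step activation.

    Assumed textbook rule (not necessarily the paper's formulas):
        delta_w = eta * (y - y_hat) * x,   w <- w + delta_w
    """
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for x_i, y_i in zip(X, y):
            y_hat = 1 if np.dot(w, x_i) + b > 0 else 0  # step activation
            delta = eta * (y_i - y_hat)                 # zero when the prediction is correct
            w += delta * x_i
            b += delta
    return w, b

def predict(X, w, b):
    """Apply the trained perceptron to every row of X."""
    return (X @ w + b > 0).astype(int)

# Truth-table inputs shared by both logical operations.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_and = np.array([0, 0, 0, 1])
y_or = np.array([0, 1, 1, 1])

w_and, b_and = train_perceptron(X, y_and)
w_or, b_or = train_perceptron(X, y_or)
print(predict(X, w_and, b_and), predict(X, w_or, b_or))
```

Both problems are linearly separable, so the rule settles on a separating line within a few passes over the four samples.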
As the parameters are adjusted, the training results in the neurons also change. Figure 2 shows the training process for initial parameter values of 10 and 50. The training results change nonmonotonically as the number of iterations increases. When the initial parameter value is 10, the training value varies within a small range; when it is 50, the range is large. Note that this fluctuation does not represent the final training result: it is only the result between two perceptrons, whereas the final result comes from multiple perceptrons. In addition, the training value is not correlated with accuracy; a large training value does not mean high accuracy, and vice versa. The training value only reflects the training result for the problem at hand.
The fluctuation may be caused by excessive noise in the data or by problems in the data set, such as labeling errors. It may also arise when the learning rate is too high or when the model has too many parameters for too little data. Possible remedies are to reduce the complexity of the model, add an L2 regularization term, and add a dropout layer to the fully connected layer. With dropout, the network does not place a high weight on any single feature, so dropout ultimately has the effect of shrinking the squared norm of the weights.
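The two remedies named above can be sketched in a few lines of NumPy. This is an illustrative sketch only: the function names `l2_penalty` and `dropout`, the coefficient `lam`, and the keep probability `keep_prob` are hypothetical, and the paper does not state which values or implementation it used.

```python
import numpy as np

def l2_penalty(weights, lam):
    """L2 regularization term: lam times the sum of squared weights.

    It is added to the training loss so that large weights are discouraged.
    """
    return lam * sum(np.sum(w ** 2) for w in weights)

def dropout(activations, keep_prob, rng, training=True):
    """Inverted dropout for a fully connected layer.

    During training each unit is kept with probability keep_prob and the
    survivors are rescaled by 1 / keep_prob, so the expected activation is
    unchanged. At inference time the activations pass through untouched.
    """
    if not training:
        return activations
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
hidden = np.ones(1000)                      # stand-in for a layer's activations
dropped = dropout(hidden, keep_prob=0.8, rng=rng)
penalty = l2_penalty([np.array([1.0, 2.0])], lam=0.5)
print(dropped.mean(), penalty)
```

Because a random subset of units is silenced at every step, no single feature can be relied on exclusively, which is the behavior the paragraph above describes.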
From the description above, traditional neural networks can address the key problems of graphic recognition and classification in the research of graphic language representation. In practice, however, we find limitations even when the circular (recurrent) neural network, which has the best perceptron linkage among traditional neural networks, is used to train graphic language features. Its hidden layer state update is given in formula (4).
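The original formula (4) is not reproduced in this text. As a point of reference, a standard recurrent hidden state update can be written as follows. The symbols (input x_t, hidden state h_t, weight matrices W_xh and W_hh, bias b_h) are assumed notation, and the paper's own formula may use a different activation or parameterization.

```latex
h_t = \tanh\left(W_{xh}\, x_t + W_{hh}\, h_{t-1} + b_h\right)
```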
The circular neural network passes information between perceptrons in a loop, but during transmission the receiving neurons accept only the final training result of the input neurons. As a result, every neuron discards the earlier training results, starts training again, and outputs a new result. When solving graphic language representation, however, the computer must make an overall logical judgment that combines earlier and later results. Because the traditional neural network does not retain this preceding and following information, it performs poorly on graphic language expression. Suppressing gradient attenuation during optimization makes it feasible to train deep neural networks directly, and as data and computation have grown, researchers have gradually adopted deeper and larger supervised learning networks. The bidirectional long-short memory model ties the recurrent structure closely to sequences and lists, which makes it a natural neural network framework for such data. It is a special kind of cyclic neural network that can additionally learn long-distance dependencies, so it remembers long-distance information during training and uses the forgetting gate to decide what to forget, thereby storing information more effectively. The update of the forgotten content and the update of the stored information are given in formulas (5)–(7).
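Formulas (5)–(7) are not reproduced in this text. In the standard LSTM formulation, the forget gate, the input gate, and the candidate memory are usually written as below; mapping these three expressions to formulas (5)–(7) is an assumption, and the notation (sigmoid function σ, weights W, biases b, concatenated vector [h_{t-1}, x_t]) is illustrative rather than the paper's own.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) \\
i_t &= \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right) \\
\tilde{C}_t &= \tanh\left(W_C\,[h_{t-1}, x_t] + b_C\right)
\end{aligned}
```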
Therefore, the bidirectional long-short memory model has significant advantages in speech recognition, language modeling, translation, and other fields that require logical inference and classification. We use the two-way long- and short-memory model and the traditional neural network model in turn as the classification algorithm for graphic language representation. Figure 3 shows that the graphic language representation value obtained with the two-way long- and short-memory model is higher than that obtained with the cyclic neural network. Although there are slight differences between images, the algorithm based on the two-way long- and short-memory model is more stable overall.
Having chosen the algorithm, we further optimize the long- and short-memory model into a model suited to visual representation. First, we determine what the forgetting gate should discard from the previous content during graphic language representation and recognition, as well as the data to be retained in the neurons of the optimized model. The expressions are given in formulas (8) and (9).
From the content discarded by the forgetting gate and the content saved by the neurons, we can determine the state value of a memory unit. As shown in formula (10), the state of the memory unit is determined by what the neuron forgets and what it retains.
After determining the current memory unit state value, we calculate the output value of the current neuron. The output is the product of the current neuron’s output gate value and its activation function value. The expression is given in formula (11).
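The gate, memory state, and output steps described above can be put together in a short NumPy sketch. It implements the standard LSTM cell and runs it in both directions over a sequence; it is not the optimized model of this paper, whose formulas (8)–(11) are not reproduced here. All names (`init_params`, `lstm_step`, `run_lstm`, `run_bilstm`), the dimensions, and the random initialization are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(input_dim, hidden_dim, rng):
    """One weight matrix and bias per gate: f (forget), i (input), c (candidate), o (output)."""
    params = {}
    for gate in "fico":
        params["W_" + gate] = rng.normal(0.0, 0.1, (hidden_dim, hidden_dim + input_dim))
        params["b_" + gate] = np.zeros(hidden_dim)
    return params

def lstm_step(x_t, h_prev, c_prev, p):
    """One standard LSTM step (assumed form, not the paper's optimized variant)."""
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(p["W_f"] @ z + p["b_f"])    # forget gate: what to discard
    i_t = sigmoid(p["W_i"] @ z + p["b_i"])    # input gate: what to store
    c_hat = np.tanh(p["W_c"] @ z + p["b_c"])  # candidate memory content
    c_t = f_t * c_prev + i_t * c_hat          # memory unit state
    o_t = sigmoid(p["W_o"] @ z + p["b_o"])    # output gate
    h_t = o_t * np.tanh(c_t)                  # output: gate value times activation
    return h_t, c_t

def run_lstm(xs, p, hidden_dim):
    """Run the cell over a sequence and return the hidden state at every step."""
    h = np.zeros(hidden_dim)
    c = np.zeros(hidden_dim)
    outputs = []
    for x_t in xs:
        h, c = lstm_step(x_t, h, c, p)
        outputs.append(h)
    return np.stack(outputs)

def run_bilstm(xs, p_fwd, p_bwd, hidden_dim):
    """Two-way pass: one LSTM reads left to right, another right to left."""
    fwd = run_lstm(xs, p_fwd, hidden_dim)
    bwd = run_lstm(xs[::-1], p_bwd, hidden_dim)[::-1]  # re-align to original order
    return np.concatenate([fwd, bwd], axis=1)

rng = np.random.default_rng(0)
xs = rng.normal(size=(5, 3))  # toy sequence: 5 steps, 3 features per step
out = run_bilstm(xs, init_params(3, 4, rng), init_params(3, 4, rng), hidden_dim=4)
print(out.shape)
```

Concatenating the two passes gives each time step access to both preceding and following context, which is the property the two-way model relies on.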
Applying the expressions above to equalize the target graphic image yields the equalized histogram shown in Figure 4. The graphic distribution is concentrated and the overall distribution is approximately normal. It also shows that recognition is not uniform in the first part of the graph, so we optimize this further.
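For readers unfamiliar with the equalization step, the following sketch shows classic histogram equalization of an 8-bit grayscale image with NumPy. It is the generic cumulative-distribution method and is offered only as an assumption about what the equalization operation involves; the paper does not give its exact procedure. The function name `equalize_histogram` and the toy image are hypothetical.

```python
import numpy as np

def equalize_histogram(image):
    """Classic histogram equalization for a 2-D uint8 image.

    Gray levels are remapped through the normalized cumulative histogram
    so that the output levels spread over the full 0..255 range.
    """
    hist = np.bincount(image.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]          # cumulative count at the darkest level present
    n = image.size
    if n == cdf_min:                   # constant image: nothing to equalize
        return image.copy()
    lut = np.round((cdf - cdf_min) / (n - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[image]                  # apply the lookup table pixel by pixel

# Toy low-contrast image with gray levels between 50 and 150.
img = np.array([[50, 50], [100, 150]], dtype=np.uint8)
eq = equalize_histogram(img)
print(eq)
```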
To fully identify the graphic features when collecting graphic language features, we introduce a weight for the similarity function and minimize the error. When the weight changes, the value of the similarity function changes with the position of the gradient region, so an optimal solution exists. The weight expression of the similarity function is given in formula (12).
Once the existence of the optimal solution is established, we solve for it. Its expression is given in formula (13).
We test with 30 graphs and obtain the PSNR curves shown in Figure 5. The optimized two-way long- and short-memory model achieves the best PSNR compared with the model before optimization and with the exemplar-based and NN inpainting algorithms. Therefore, it is feasible to use this algorithm to study graphic language representation.
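PSNR (peak signal-to-noise ratio) is computed from the mean squared error between a reference image and a processed image. The sketch below uses the standard definition for 8-bit images; the function name `psnr` and the toy images are hypothetical, and the paper does not state its exact evaluation settings.

```python
import numpy as np

def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in decibels.

    PSNR = 10 * log10(max_value^2 / MSE); higher values mean the test
    image is closer to the reference. Identical images give infinity.
    """
    reference = np.asarray(reference, dtype=np.float64)
    test = np.asarray(test, dtype=np.float64)
    mse = np.mean((reference - test) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_value ** 2 / mse)

ref = np.zeros((4, 4))
slightly_off = np.ones((4, 4))       # every pixel differs by one gray level
print(psnr(ref, slightly_off))
```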
Finally, we examine the recognition rate of the algorithm. We select graphic language features with different sequence lengths and recognize them with different algorithms. The results show that the algorithm based on the two-way long- and short-memory model has the highest recognition rate. The recognition rate also varies with sequence length and is highest when the sequence length is 100. We speculate that when the sequence is too short, the model retains too little memory, and when it is too long, the retained memory becomes overloaded. Too much or too little memory affects the model’s final logical judgment, so the overall recognition rate first increases and then decreases (see Figure 6).
4. Result Analysis and Discussion
The earliest graphics in China can be traced back to totem patterns in ancient times. Later, people gradually began to draw graphics in caves and on stone walls, and graphics became a medium for communicating and expressing consciousness. With the further improvement of productivity, Cangjie gradually created characters used specifically for recording and communication. Early Chinese characters were pictographs, that is, pictures abstracted from the appearance of the real objects they denote. They slowly evolved into the Chinese characters of horizontal and vertical strokes that we use today. Graphics thus developed gradually with the needs of human life and production. Research on graphics, however, has become a hot spot only since the 1990s. In this era, with the introduction of the concept of visual communication in the field of design, more and more fashion designers have realized that a good design work needs, in addition to strong creativity, a distinctive way of expression to display it accurately. Graphic visual language plays an important role in this display. The visual presentation of creativity through graphic language is the outline and skeleton by which the work conveys its intention. At the same time, the designer’s grasp of and creativity in graphic visualization is the key to improving both the level of visual communication design and the designer’s own design depth. Take the poster shown in Figure 7 as an example. The whole poster is composed of different graphics. The human silhouette presents the theme of the dance poster and echoes the background of rectangles, circles, and irregular shapes, making the poster fuller and more dynamic. The theme of the dance is conveyed to the audience vividly and intuitively.
Next, we use the bidirectional long- and short-memory model optimized in this paper to extract the graphic language representation of the poster and verify the accuracy of extraction and recognition. The results are shown in Figure 8. The accuracy of the optimized two-way long- and short-memory model reaches 84.3%, much higher than that of other neural network algorithms such as the cyclic neural network and the convolutional neural network. In addition, the two-way long- and short-memory model takes the shortest training time in the recognition process, so its classification efficiency is the highest. On the whole, the two-way long- and short-memory model performs well in extracting graphic language representation.
Finally, we examine the performance of the algorithm in visual communication. Again taking Figure 7 as an example, we apply the optimized two-way long- and short-memory model to the poster graphics and test whether the model conveys the visual communication intention of the poster design correctly. The results are shown in Figure 9. The main intentions identified by the model are dance and freedom, accounting for 40% and 30%, respectively, which is consistent with the theme of the dance party poster. In addition, five intentions such as music, pleasure, and youth are recognized. These intentions also arise when the human brain processes the graphic language of the poster. Therefore, visual communication design based on the long- and short-memory model can extract graphic language representation accurately. Although the two-way long- and short-memory model does not recognize a specific intention for 5% of the graphic representations, on the whole the graphic language representation of the poster design is conveyed completely and with high accuracy.
To represent graphic language in visual communication with computer technology, the first task is to identify and classify graphic language accurately. The computer must solve two core problems, “where” and “what”, in two steps. The first step is graphic detection: the computer finds the area where the graphic language exists. The second is pattern recognition, which identifies and names the detected area. This completes the process of computer graphics recognition, which is applied throughout the Internet. Classifying computer graphic language representation is a core problem in the field of computer vision. This paper introduces the long- and short-memory model into graphic language classification to accurately classify the language contained in graphics and then explores graphic language representation in visual communication design based on the two-way long- and short-memory model. The results show that the two-way long- and short-memory model performs better in recognition accuracy for graphic language representation and for the intention of visual communication design. Although this paper offers some innovation, problems remain that need further refinement. For example, the network accuracy is not yet high and the representation ability of the network is insufficient; these problems need further analysis in future research.
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The author declares no conflicts of interest or personal relationships that could have appeared to influence the work reported in this paper.