Special Issue: Metric Locating Parameters of Networks
Research on Folk Handicraft Image Recognition Based on Neural Networks and Visual Saliency
Quickly identifying images of folk arts and crafts works has become a difficult problem in mining the value of cultural heritage, and image recognition technology can improve the accuracy of such identification. This paper improves the ITTI saliency model, which is based on the linear addition of saliency maps in equal proportion. First, a Bayesian model and a Gaussian model are used to extract the probability distribution of the image feature vectors; second, the k-means algorithm is used to cluster the extracted features; finally, the ALOI database is used to test the image recognition accuracy. The experimental results show that the improved technique does help to improve the recognition accuracy of folk arts and handicraft images.
Scholars in many fields at home and abroad are committed to studying the mechanism of visual attention: psychologists mainly study the behavioral correlates of visual attention, neurophysiologists mainly study how nerve cells respond when facing a target of interest, and computational neuroscientists build neural network models to simulate human visual attention behavior. ITTI proposed a model that combines bottom-up and top-down optimization. This model combines ITTI's visual attention model with a Bayesian model, so it is also called the Bayes model. Based on the ITTI saliency map, it extracts the feature vector of each feature map and models the different feature values with a Gaussian (normal) distribution, so as to build a detector for each category. The object to be identified is then passed to the detectors of all learned categories, and the class with the highest probability is taken as the identification result. In 2011, Haiku Duo et al. improved the Bayes model by describing the probability distribution of the feature vectors extracted from the image with a Gaussian mixture instead of a single Gaussian distribution.
An improved algorithm that introduces hierarchical K-means clustering into the Gaussian mixture model has further advantages; the main approach adopted here is to improve the model with an adaptive recognition algorithm. The ITTI model, as a saliency model, takes an image as input, extracts image features, generates the focus of attention, and realizes the transfer of that focus. It is a simple and fast computational model of visual saliency. Its structure attends to many of the focus characteristics of the visual system, and it can be computed in parallel on large, complex scenes to form the focus of attention. However, the disadvantage of the ITTI model is that it cannot perform task-driven recognition, and it has difficulty separating the target from the background. This produces spurious information and makes the feature vector inaccurate. The ITTI saliency model can therefore be improved by integrating the adaptive recognition algorithm.
Focusing on folk arts and crafts, this paper conducts an in-depth analysis and discussion, explores the formal beauty, style characteristics, artistic conception, and symbolic significance of folk arts and handicraft works, and uncovers their deeper spiritual connotation and artistic value. The innovation of this paper, highlighted by comparison with the advantages and disadvantages of other research methods, is that image recognition technology is applied specifically to the study of folk crafts. It provides researchers in the art field with a scientific means of identification that improves identification accuracy and saves time when identifying images.
2. Folk Arts and Crafts Image Recognition under Visual Saliency
2.1. Visual Saliency Analysis
Visual saliency refers to the ability of the human visual system to observe its surroundings accurately and efficiently; that is, certain things in a scene readily attract the attention of the visual system. Visual saliency thus reflects the visual attention mechanism. Faced with a large amount of information, the visual system actively selects specific parts of the scene and ignores other, irrelevant parts. Human visual attention is a cognitive ability: viewers respond subjectively to targets in the image, ignore things outside their interest, and attend only to objects that intuitively match their interest. Applied to image recognition, visual saliency filters out unimportant information and concentrates processing on the salient part, so that the target is quickly located, identified, and understood while less effort is spent describing secondary parts. This improves the accuracy of image recognition and speeds up the recognition of search targets.
Visual saliency, together with the visual attention mechanism, has become an interdisciplinary research field that has attracted attention from computer science, neuroscience, and psychology; target recognition in particular has been widely used in national defense, space technology, and the national economy. A visual saliency model can be constructed to strengthen target recognition, so the study of visual saliency is of practical value for the recognition of folk arts and crafts images. In this paper, the visual attention computational model is a Bayesian model as follows:
In the model, c (high-resolution) is the central fine scale and s (low-resolution) is the surrounding coarse scale. The improved model weights the probability of color (C), of brightness (I), and of orientation (O). The improved model is as follows:
2.2. Image Recognition Preprocessing of Folk Arts and Crafts
In the preprocessing for image recognition of folk arts and crafts, the psychological and physiological characteristics of human vision are integrated to form a saliency map. For targets of greater interest, the response of the nerve cells is strengthened. In this preprocessing, a neural network model is constructed, following computational neuroscience, to simulate human visual attention. The preprocessing is achieved through two pathways: one in which input features drive bottom-up attention, and another with top-down attention controlled by system selectivity. The image processing flow is shown in the flow chart of the ITTI saliency model in Figure 1.
At present, there are many good simulation models for the preprocessing and recognition of visual images that integrate theories from biology and psychology. Task-driven models are mainly top-down and detect the target consciously, while most models are bottom-up. In this paper, the preprocessing combines the bottom-up and top-down approaches to extract the features used for image recognition. Features of the same type form a feature map; through nonuniform sampling, the features are fused into a saliency map that strengthens attention at the focus.
2.3. Folk Arts and Crafts Image Recognition Model
With the improved ITTI saliency model, all of the biologically grounded bottom-up steps are in place and the simulation algorithm can be implemented. In the ITTI saliency model, the input image is linearly filtered at several spatial scales, center-surround operations are applied, and 42 feature maps are formed, merged, and finally combined into a saliency map. Locations whose features stand out attract visual attention. For feature extraction, input video sequences must first be converted to static images. The computer combines the physiological characteristics of human vision and imitates the processing mechanism of the visual cortex: it extracts the red, green, and blue color channels and measures orientation at four angles, 0 degrees, 45 degrees, 90 degrees, and 135 degrees. R, G, and B denote the red, green, and blue color channels, respectively, and the brightness is denoted by I. The brightness is calculated as the average value, as follows.
By normalizing the color channels by the brightness, so that hue is decoupled from intensity, four broadly tuned color channels are generated, denoted by the capital letters R (red), G (green), B (blue), and Y (yellow). Equations (4)–(7) are obtained as follows:
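The brightness and the four broadly tuned color channels can be sketched as follows. This is a minimal illustration that assumes the standard definitions of the Itti-Koch-Niebur saliency model; the paper's own equations (4)–(7) are not reproduced in the text, so the exact form, the function name, and the omission of the normalization by I are assumptions.

```python
import numpy as np

def intensity_and_color_channels(rgb):
    """Return (I, R, G, B, Y) for an H x W x 3 array of r, g, b values.

    Assumed definitions (standard Itti-Koch-Niebur form, not taken from the paper):
      I = (r + g + b) / 3
      R = r - (g + b) / 2,  G = g - (r + b) / 2,  B = b - (r + g) / 2
      Y = (r + g) / 2 - |r - g| / 2 - b
    Negative responses are set to zero.
    """
    rgb = np.asarray(rgb, dtype=float)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    I = (r + g + b) / 3.0
    R = np.clip(r - (g + b) / 2.0, 0.0, None)
    G = np.clip(g - (r + b) / 2.0, 0.0, None)
    B = np.clip(b - (r + g) / 2.0, 0.0, None)
    Y = np.clip((r + g) / 2.0 - np.abs(r - g) / 2.0 - b, 0.0, None)
    return I, R, G, B, Y
```

With these definitions a pure red pixel responds only in the R channel, while a pure yellow pixel (r = g, b = 0) gives its strongest response in Y.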
Gaussian pyramids are established for the four color channels and the brightness, respectively. Each channel is filtered and subsampled layer by layer to obtain nine scales, numbered 0–8. Orientation information is obtained from the pyramid with filters formed by multiplying a two-dimensional Gaussian by a cosine grating, which simulate the response of neurons in the primary visual cortex to a stimulus. Color and brightness thus each have a nine-layer pyramid. Taking the fine center scale c as the high-resolution layer and the coarse surround scale s as the low-resolution layer, the center-surround difference gives the following equation:
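The nine-level pyramid and the center-surround difference can be sketched as below. The 5-tap binomial kernel, the nearest-neighbor upsampling of the coarse layer, and the function names are illustrative assumptions; the paper does not state which filter or interpolation it uses.

```python
import numpy as np

# 5-tap binomial low-pass kernel (an assumed stand-in for the Gaussian filter).
KERNEL = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0

def blur(img):
    """Separable low-pass filter with edge replication."""
    h, w = img.shape
    padded = np.pad(img, 2, mode="edge")
    rows = sum(wt * padded[:, i:i + w] for i, wt in enumerate(KERNEL))
    return sum(wt * rows[i:i + h, :] for i, wt in enumerate(KERNEL))

def gaussian_pyramid(img, levels=9):
    """Scales 0..levels-1; each level is the blurred previous level, subsampled by 2."""
    pyramid = [np.asarray(img, dtype=float)]
    for _ in range(levels - 1):
        pyramid.append(blur(pyramid[-1])[::2, ::2])
    return pyramid

def center_surround(pyramid, c, s):
    """Absolute across-scale difference between center scale c and surround scale s (s > c)."""
    fine, coarse = pyramid[c], pyramid[s]
    factor = 2 ** (s - c)
    # Bring the coarse layer up to the fine resolution, then subtract point by point.
    upsampled = np.kron(coarse, np.ones((factor, factor)))
    upsampled = upsampled[:fine.shape[0], :fine.shape[1]]
    return np.abs(fine - upsampled)
```

For a 256 x 256 channel, `gaussian_pyramid` returns layers from 256 x 256 down to 1 x 1, and `center_surround(pyramid, 2, 5)` yields a 64 x 64 feature map.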
The second group of feature maps is generated from the color channels. Within the receptive field of visual cortex neurons, stimulation by one color inhibits the response to its opponent color, which forms a competitive contrast (color-opponent) relationship in space. The opponent pairs are red/green, green/red, blue/yellow, and yellow/blue, which produce the feature maps RG(c, s) and BY(c, s) and finally form the double-opponent relationship shown in (9) and (10):
Then, orientation feature maps are formed for six scale pairs and four directions using the same center-surround mechanism. The final orientation feature map is
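For reference, the center-surround feature maps described in this subsection have the following standard form in the Itti-Koch-Niebur saliency model. The notation, including the across-scale difference operator written as \ominus and the scale ranges, is taken from that model and is assumed, not confirmed, to match the paper's own equations.

```latex
\begin{align}
I(c,s) &= \left| I(c) \ominus I(s) \right| \\
RG(c,s) &= \left| \bigl(R(c) - G(c)\bigr) \ominus \bigl(G(s) - R(s)\bigr) \right| \\
BY(c,s) &= \left| \bigl(B(c) - Y(c)\bigr) \ominus \bigl(Y(s) - B(s)\bigr) \right| \\
O(c,s,\theta) &= \left| O(c,\theta) \ominus O(s,\theta) \right|,
\qquad \theta \in \{0^\circ, 45^\circ, 90^\circ, 135^\circ\}
\end{align}
% Assumed scale ranges: c \in \{2,3,4\}, \; s = c + \delta, \; \delta \in \{3,4\}
```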
Finally, 42 feature maps are obtained: 6 brightness maps, 12 color maps, and 24 orientation maps.
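The count of 42 can be checked with a few lines of arithmetic. The center scales and scale differences below are the values used in the original Itti-Koch-Niebur model and are assumed here, since the text only states that six scale pairs and four directions are used.

```python
from itertools import product

CENTERS = (2, 3, 4)          # assumed center scales c
DELTAS = (3, 4)              # assumed differences, surround scale s = c + delta
ANGLES = (0, 45, 90, 135)    # orientation angles in degrees

# Six (center, surround) scale pairs.
scale_pairs = [(c, c + d) for c, d in product(CENTERS, DELTAS)]

n_brightness = len(scale_pairs)                  # one map per scale pair
n_color = 2 * len(scale_pairs)                   # RG and BY per scale pair
n_orientation = len(ANGLES) * len(scale_pairs)   # one map per angle and scale pair
n_total = n_brightness + n_color + n_orientation
```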
The saliency map is generated with a normalization operator that simulates lateral physiological connections: similar features inhibit each other, so that the most conspicuous peaks stand out. After normalization, the brightness features contribute more clearly to the combined saliency map and differences in the orientation features are amplified, which makes it possible to detect positions of high saliency. The values of each map are first scaled to a fixed range [0, M] to eliminate amplitude differences between modalities. Then the global maximum M and the average m of the other local maxima are computed, and the whole map is multiplied by (M − m)^2. The normalized feature maps are accumulated with equal weights, and finally the saliency map S is obtained. In equation (12), O represents the orientation (azimuth) feature, C represents the color, I represents the brightness, and s represents the low-resolution layer.
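A minimal sketch of the normalization operator and the equal-weight combination follows. The strict 8-neighbor test for local maxima, the default M = 1, and the function names are assumptions made for illustration; the improved model in this paper replaces the equal weights with different weights per feature.

```python
import numpy as np

def normalize_map(feature_map, M=1.0):
    """Scale to [0, M], then multiply by (M - m)^2, m = mean of the other local maxima."""
    fmap = np.asarray(feature_map, dtype=float)
    lo, hi = fmap.min(), fmap.max()
    if hi == lo:
        return np.zeros_like(fmap)  # a flat map carries no salient peak
    scaled = (fmap - lo) / (hi - lo) * M
    h, w = scaled.shape
    center = scaled[1:-1, 1:-1]
    # An interior pixel is a local maximum if it exceeds all 8 neighbors (assumed rule).
    is_peak = np.ones(center.shape, dtype=bool)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            neighbor = scaled[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
            is_peak &= center > neighbor
    peaks = center[is_peak]
    others = peaks[peaks < M]  # every local maximum except the global one
    m = others.mean() if others.size else 0.0
    return scaled * (M - m) ** 2

def saliency(I_bar, C_bar, O_bar):
    """Equal-weight sum of the normalized brightness, color, and orientation maps."""
    return (normalize_map(I_bar) + normalize_map(C_bar) + normalize_map(O_bar)) / 3.0
```

A map with a single strong peak keeps its full amplitude, whereas a map with several peaks of similar height is strongly suppressed, which is the mutual inhibition described above.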
The Bayesian model built on the ITTI model applies Bayesian theory to object perception. In computer vision, the Bayesian model gives the probability that a target is correctly recognized as belonging to a certain category, given a training set. The Bayesian method rests on Bayes' rule, and the equation is
The formula makes clear the relationship between the a posteriori probability and the a priori probability. Given a set of candidate hypotheses for the scene, the most probable hypothesis in the set is sought, that is, the maximum a posteriori (MAP) hypothesis. The a posteriori probability is calculated for each candidate hypothesis. The equation is as follows.
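In the notation used in the rest of this section, with θ a candidate class and F the observed feature vector, Bayes' rule and the MAP hypothesis take the following textbook form. The subscript MAP is introduced here for illustration, and the numbering of the paper's own equations is not reproduced.

```latex
\begin{align}
P(\theta \mid F) &= \frac{P(F \mid \theta)\, P(\theta)}{P(F)} \\
\theta_{\mathrm{MAP}} &= \arg\max_{\theta} P(\theta \mid F)
  = \arg\max_{\theta} \frac{P(F \mid \theta)\, P(\theta)}{P(F)}
  = \arg\max_{\theta} P(F \mid \theta)\, P(\theta)
\end{align}
```

The last step holds because P(F) does not depend on θ and can be dropped from the maximization.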
Given θ, we then look for the hypothesis that maximizes the likelihood of F, and take this maximum-likelihood hypothesis as the estimate.
The normal distribution is a physically reasonable probability model and is mathematically convenient. With the normal distribution, the probability density function follows directly from the defining properties of the distribution; it is defined in (15) as
The Gaussian model is the model of the normal distribution curve. Integrating visual saliency with the Bayesian model, the object is described with the saliency model, and the feature vector, denoted f, is used to find the region of interest. The image Q contains n pixels. In the spatial competition, the location with the largest value is selected as the center, as follows:
After the calculation, for a class of objects θ, the parameter set of θ is established. The similarity value p is calculated by means of the Gaussian distribution.
When a class has only one training image, the variance is set to 0.001.
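The per-class Gaussian detector can be sketched as follows for a single scalar feature. The function names are hypothetical, and applying 0.001 as a lower bound when several identical samples give zero variance is an added assumption; the paper only states the value for the single-image case.

```python
import math

def fit_gaussian(values, single_sample_variance=0.001):
    """Estimate (mean, variance) of one feature for one class."""
    n = len(values)
    mean = sum(values) / n
    if n == 1:
        # One training image: the variance cannot be estimated, use the fixed value.
        return mean, single_sample_variance
    variance = sum((v - mean) ** 2 for v in values) / n
    # Assumption: also floor the variance so the density stays finite.
    return mean, max(variance, single_sample_variance)

def gaussian_pdf(x, mean, variance):
    """Normal probability density, used as the similarity value p."""
    return math.exp(-((x - mean) ** 2) / (2.0 * variance)) / math.sqrt(2.0 * math.pi * variance)
```

For example, `fit_gaussian([0.5])` returns `(0.5, 0.001)`, and `gaussian_pdf` then gives the similarity of a test feature value to that class.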
3. Research on Realizing Folk Handicrafts Image Recognition
This paper proposes targeted feature extraction for the specific features of folk arts and crafts images, studies the template matching method, which is the key link in image recognition technology, and thereby realizes folk arts and crafts image recognition.
To recognize objects using visual saliency, mutually independent features are extracted at different scales and spatial locations of the image, which requires a large amount of computation. With a Bayesian network and the maximum a posteriori decision rule, the prediction is as follows.
Assume that the total number of training categories is C and that i indexes the object categories. The likelihoods are sorted into a list from largest to smallest, the prior probability is taken as given, and P(F) is a constant.
The probabilities for the target are ranked from largest to smallest, and the category with the maximum probability is the best identification of the image under test. Treating each feature map as an independent individual, the joint distribution gives the following formula.
Multiplying many probabilities easily leads to numerical underflow, so the probability values are processed with the logarithm to avoid this problem. The simplified equation is as follows.
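The decision rule in log form can be sketched as below: the log densities of the independent feature maps are summed for each class and the classes are ranked by the total. The data layout, the function names, and the class labels in the usage note are hypothetical.

```python
import math

def log_gaussian(x, mean, variance):
    """Logarithm of the normal density; stays finite where the density itself underflows."""
    return -0.5 * (math.log(2.0 * math.pi * variance) + (x - mean) ** 2 / variance)

def classify(features, class_params, log_priors=None):
    """Rank classes by summed log-likelihood over independent feature maps.

    features:     one feature value per feature map
    class_params: {label: [(mean, variance), ...]} with one pair per feature map
    log_priors:   optional {label: log prior}; equal priors are assumed if omitted
    """
    scores = {}
    for label, params in class_params.items():
        score = 0.0 if log_priors is None else log_priors[label]
        for x, (mean, variance) in zip(features, params):
            score += log_gaussian(x, mean, variance)
        scores[label] = score
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[0], scores
```

With `class_params = {"vase": [(0.2, 0.01), (0.8, 0.01)], "mask": [(0.7, 0.01), (0.3, 0.01)]}`, the call `classify([0.25, 0.75], class_params)` selects `"vase"`. The need for logarithms is easy to see: the product of 100 probabilities of 1e-5 underflows to zero in double precision, while the sum of their logarithms is an ordinary finite number.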
The saliency model and the Bayesian recognition algorithm are trained together. The Bayesian probability distribution is modeled with a Gaussian distribution, which is simple and reasonable and gives similar results. However, the more feature values there are, the more complex the distribution becomes; it takes on multiple modes, which a single Gaussian has difficulty capturing accurately, so errors are likely. For this reason, the combined Bayesian algorithm is modified to use a Gaussian mixture model.
Suppose a sample set obeys a Gaussian mixture model, for example a combination of a red curve and a green curve, each a single Gaussian. Assuming the mixture is composed of K Gaussian distributions, each component has its own prior probability (weight), mean, and variance, and the density is the weighted sum of the K components. The probability density function is as follows:
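A one-dimensional version of this mixture density can be sketched as below. The function and parameter names are hypothetical, and the check that the weights sum to one reflects the usual definition of a mixture rather than anything stated in the text.

```python
import math

def gaussian_pdf(x, mean, variance):
    """Density of a single normal component."""
    return math.exp(-((x - mean) ** 2) / (2.0 * variance)) / math.sqrt(2.0 * math.pi * variance)

def mixture_pdf(x, weights, means, variances):
    """Weighted sum of K Gaussian components; the weights act as prior probabilities."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
```

With K = 1 the mixture reduces to the single Gaussian of the original Bayes model; with two equally weighted components centered at -1 and 1, the density has two modes that a single Gaussian could not represent.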
3.1. Fuzzy Cluster Analysis of Digital Display of Folk Arts and Crafts
To recognize each folk arts and crafts image after fitting the Gaussian mixture model, this paper uses a clustering method. Similar features are gathered together, and the Gaussian distribution coefficients gathered into one class form a fuzzy cluster. The fuzzy clustering combines the hierarchical and K-means algorithms; its parameters are adjustable, which makes it easy to control the number of iterations. The K-means method partitions the training samples into K clusters and then assigns the remaining samples to the most similar cluster. After repeated iterations, the criterion function converges. For N samples in total, the sum of squared deviations between the sample points and the centers of the clusters to which they are assigned is E.
The goal of the algorithm is therefore to minimize E by iterating two steps. First, with the centers fixed, the optimal assignment r is found by assigning each data point to its closest center. Then, with r fixed, the derivative of E with respect to each center is set to 0 and r is substituted, which gives the center that minimizes E.
The minimizing center of each of the K clusters is the average of the samples assigned to it. E decreases at every iteration, which guarantees convergence to a minimum, and the fuzzy-clustered visual saliency image is thus obtained.
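The two alternating steps can be sketched as follows. The initial centers are passed in explicitly, for example from a hierarchical pre-clustering, and the stopping threshold `tol` and the function name are illustrative assumptions.

```python
import numpy as np

def kmeans(X, init_centers, tol=1e-6, max_iter=100):
    """Minimize E, the sum of squared distances from each sample to its assigned center."""
    X = np.asarray(X, dtype=float)
    centers = np.asarray(init_centers, dtype=float).copy()
    prev_E = np.inf
    for _ in range(max_iter):
        # Step 1: with the centers fixed, assign every sample to its closest center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Step 2: with the assignment fixed, each center becomes the mean of its members.
        for k in range(len(centers)):
            members = X[labels == k]
            if len(members):
                centers[k] = members.mean(axis=0)
        # E for the current assignment and the updated centers.
        E = ((X - centers[labels]) ** 2).sum()
        if prev_E - E < tol:
            break  # the decrease in E fell below the threshold
        prev_E = E
    return centers, labels, E
```

For the four points (0, 0), (0, 1), (10, 10), (10, 11) with initial centers (0, 0) and (10, 10), the centers converge to (0, 0.5) and (10, 10.5) with E = 1.0. A larger `tol` stops the iteration earlier at the cost of a slightly larger E.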
3.2. Key Technologies of Digital Display of Folk Arts and Crafts
The experiments use a MATLAB image recognition system with reference to the open-source ITTI toolbox. Test images are classified at random, and the database records color images of objects captured while the observation angle, illumination angle, and illumination color are varied, giving folk arts and crafts images under 24 illumination angles, 72 observation angles, and 12 illumination colors. To show that the Gaussian mixture model is also applicable to the Bayes model, each class of images contains three feature sets (orientation, color, and illumination); some pictures serve as the training set and the others as the test set. In the test stage, the feature values extracted from each feature map of the test image are substituted into the trained parameters and combined with the Bayes formula to find the probability, and finally the weight of each image is calculated.
By controlling the illumination, color, and orientation conditions, the color, angle, and rotation angle characteristics of each type of image are collected to form the training and test image sets. The first row of the table shows the result of fitting a single Gaussian to the feature values of the different sets together; the second row shows a single Gaussian fitted separately to the images collected under each condition, followed by weighted mixing in a fixed proportion. The probability calculation and test are completed according to (24). The results obtained with 25% of the images used for training and 75% for testing are compared in Table 1.
Based on the saliency model, the feature values of each class of objects in the feature space are calculated, that is, the local maximum feature values. Comparing the three methods with recognition by the adaptive Gaussian mixture model shows that the improved adaptive method raises the efficiency of image recognition and, compared with the SalBayes model and the adaptive model, markedly improves the recognition results, as shown in Table 2. The result for angle change increased by 8.35, that for rotation angle by 21.05, and the comprehensive recognition rate by 16.13.
4. Simulation Experiment Analysis
Simulation experiments are used to verify the effectiveness of the improved folk arts and crafts image recognition. Complex folk art images, selected at random, serve as test objects. For better experimental results, the folk arts and crafts images are separated from the background and studied separately. The saliency of the crafts against their background is analyzed to determine the proportion of each subfeature map, and linear addition generates the saliency map. The subfeature maps of the folk arts and crafts image are fused with their weight coefficients, and the two resulting images are compared.
Figure 2 is the original picture, and Figure 3 is the result of the Bayes model, in which the object stands out from the background and the weighting coefficients guide the focus of attention to the folk arts and crafts image.
The second experiment is tested with the ALOI database. The mean and variance parameters are obtained during training. All test pictures are then matched against the class targets using the known parameters: the probability of each feature map is calculated through the Bayesian network (as shown in Figures 4 and 5), the logarithm of the combined probability is taken, and finally the log probabilities are added with their weights.
Figures 6 and 7 show the identification method with weighting coefficients, which highlights the features of the image to improve the identification efficiency of the model. The ALOI database was selected for the experimental data, and the final comparison of the image sets is shown in Table 3. The identification method with weighting coefficients highlights the characteristics of each category, thus further improving the identification efficiency of the improved model.
In the identification of folk arts and crafts images, a large number of image sets of different categories are sorted out, multiple training pictures are collected for each object, weights are calculated from the features, and the accuracy of image recognition is raised by the improved model.
A Bayesian model that uses ordinary k-means clustering deviates from the actual image feature distribution. To improve the accuracy and obtain a better clustering effect, this paper proposes a recognition algorithm combining saliency and the Bayesian model. Based on the combined saliency features, the orientation, color, and brightness features in the improved ITTI saliency model are compared and given different weights to improve the identification efficiency. The estimation method in the saliency-based recognition algorithm is improved on the basis of the Bayesian likelihood function. Although some scholars have added a Gaussian mixture model to the Bayes model, the k-means clustering method leads to large errors between the feature distribution given by the probability density function and the actual object image, because the initial centroids of this clustering method are selected at random. To improve the accuracy of clustering, the simplified hierarchical k-means (H-k) clustering method is chosen, which analyzes the data set to obtain the initial centroids. Then, for the k-means algorithm, an optimal threshold is set to reduce the number of iterations and control the time overhead. Experiments demonstrate that a better clustering effect is obtained with this method.
ALOI data used to support the findings of this study are cited at relevant places within the article as references.
This work was supported by a 2017 quality engineering teaching research project of the Anhui Provincial Department of Education (project name: animation and art virtual simulation experiment teaching center; project no.: 2016xnzx016).
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.
All authors equally contributed to this work.
G. Wang and G. Wang, “Visual significance detection of rail surface defects based on PCA mode and color features,” Automation Instrument, vol. 38, no. 01, pp. 73–76, 2017.
F. Zhuang, “Research on visual significance detection,” Modern Computer (Professional Edition), vol. 12, no. 14, pp. 77–81, 2017.
Wu Shaoce, Trademark Recognition Based on Visual Saliency [D], Hebei University, Hebei, 2018.
Zhang Duzhen, Research on Theory and Method of Visual Saliency Target Detection Model of Image [D], Nanjing University of Technology, Nanjing, 2020.
Kang Wen, Research on Human Behavior Recognition Method Based on Image Super-resolution Reconstruction and Visual Saliency [D], Harbin Institute of Technology, 2019.
Xuan Dongdong, Research on Visual Saliency Target Detection Technology Based on High-Level Semantics [D], Anhui University of Engineering, Anhui, 2019.
Ke Yang, Research on Infrared Image Small Target Recognition Algorithm Based on Visual Saliency, Shanghai Jiaotong University, Shanghai, 2020.
Yu Qianhui, Research on Neural Network Image Recognition Algorithm Based on Visual Attention in Natural Scene [D], Harbin Engineering University, 2021.
Wang Zhipeng, Research on Depth Visual Attention Method for Multi Class Target Fine-grained Recognition, Jiangxi Normal University of Science and Technology, 2021.
L. Nan, “Regional architectural texture feature recognition based on visual saliency model,” Journal of Luoyang Institute of Technology (Natural Science Edition), vol. 31, no. 04, pp. 27–33, 2021.
Yu Gao, C. Tian, and Jiang Mingxin, “Design of intelligent parking space management system based on image recognition,” Image and Signal Processing, vol. 11, no. 01, pp. 19–26, 2022.
H. Li, A. Manickam, R. Samuel, and J. Dinesh, “Automatic detection technology for sports players based on image recognition technology: the significance of big data technology in China’s sports field,” Annals of Operations Research, vol. 2, p. 24, 2022.
H. Mohammed, R. Ibrahim, S. Khan, and T. Murat, “Learning discriminative representations for multi-label image recognition,” Journal of Visual Communication and Image Representation, vol. 83, pp. 27–37, 2022.
Z. Vadim and T. Maxim, “Noise immunity and robustness study of image recognition using a convolutional neural network,” Sensors, vol. 22, no. 3, pp. 67–69, 2022.
F. Sun, H. Choon Ngo, and Y. Wee Sek, “Combining multi-feature regions for fine-grained image recognition,” International Journal of Image, Graphics and Signal Processing, vol. 14, no. 1, pp. 12–23, 2022.
H. Wang, J. Shi, X. Luo, and H. Lv, “Swimmer’s posture recognition and correction method based on embedded depth image skeleton tracking,” Wireless Communications and Mobile Computing, vol. 2022, Article ID 8775352, 12 pages, 2022.