Abstract

As exchanges between countries increase rapidly, multicultural integration affects local cultures and dilutes them. This paper takes the plastic arts of the nomadic civilization in the northern grasslands as an example. Their features are very rich, including color, texture, shape, and local characteristics. Traditional extraction methods yield poor features, struggle to obtain high-level information, and cause problems in image recognition. To address these problems, the advantages of deep learning are applied to the features of the plastic arts: shallow and deep networks are constructed for input and feature recognition, which solves the problem of image feature recognition. The convolution idea is also introduced to enlarge the features, which is more conducive to feature recognition, extraction, and analysis. In the deep learning neural network model, the traditional optimization algorithm is replaced by the Adam optimization algorithm, which counters the decline in accuracy, improves prediction accuracy, and makes the model more stable. The experimental results show that the improved feature algorithm greatly increases the accuracy under different noise levels and reduces the running time of the algorithm. In future work, a nonsaturating function can be used as the activation function, or the feature algorithm can be further optimized, to make the model easier to build and to train more effectively.

1. Introduction

In the context of the Internet, deep learning and neural networks have developed rapidly [1–5] and have penetrated into various fields [6–10]. Deep learning has many strengths: it is supported by huge databases, it learns layer by layer through feature abstraction, and it can imitate the distributed representation of human knowledge. These advantages are exploited in the recognition, extraction, and analysis of image features. The plastic arts of the northern steppe nomadic civilization [11–15] are very rich in color, texture, shape, and local features. Deep learning can replace traditional methods here: shallow and deep networks are built for input and feature recognition, which solves the problem of image feature recognition. The convolution idea is also introduced to enlarge the features, which is more conducive to feature recognition, extraction, and analysis. In this paper, the feature algorithm is improved and the traditional optimization algorithm is replaced by the Adam optimization algorithm, which improves the accuracy of the algorithm and reduces its computation time. This also counters the decline in model accuracy during feature extraction and improves the prediction accuracy, making the model more stable.

2. An Overview of the Plastic Art Features of Deep Learning

As the cultures of various ethnic groups are studied in more depth, special attention is paid to ethnic minorities, including the northern grassland nomads, and the characteristics of their plastic arts are analyzed. With the rapid development of deep learning, its techniques are integrated into feature extraction and analysis. To solve the problem of image recognition, shallow and deep network structures for input and feature recognition are constructed. The convolution idea of deep learning is also introduced to amplify the color, texture, shape, and local characteristics of the plastic arts of the northern nomadic civilization, which makes them easier for a machine to identify, extract, and analyze.

2.1. Convolutional Neural Networks for Deep Learning

As research on deep learning has advanced, its advantages have become more pronounced, and it has penetrated into more and more fields. For feature extraction, deep learning methods use convolution to extract plastic art features that are then used for retrieval. There are three kinds of methods. The first is the hybrid method, in which an image region is fed into the input layer of a CNN and the CNN then performs feature extraction. The second is the generative method, whose structure can describe the high-level correlations of the network input data. The third is the discriminative method, which can perform classification and discrimination and strengthen the discriminative ability of the features.

Deep networks rely on big data to achieve their strong learning and abstraction ability. Through layer-by-layer learning and feature abstraction, the initial input is transformed into a representation of abstract features [16]. This enables better recognition and classification of image features, because the network imitates the distributed representation of human knowledge and abstracts high-level features from primary features. Moreover, the feature outputs of each layer of the deep network are processed by functions to achieve nonlinear feature dimension reduction. The sigmoid function is used as the activation function of the model in this paper; its advantage is that it is continuous and smooth. It maps the real-valued output $x$ of a hidden-layer neuron into the interval (0, 1). Its function definition is

$$f(x) = \frac{1}{1 + e^{-x}}$$

Its range is (0, 1), and the derivative of the function is

$$f'(x) = f(x)\,\bigl(1 - f(x)\bigr)$$

Function characteristics are as follows: when $x = 10$ or $x = -10$, $f'(x) \approx 0$; when $x = 0$, $f'(x) = 0.25$.
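As an illustration of the sigmoid activation and its derivative, the following minimal Python sketch evaluates both at a few points. The function names are illustrative and are not taken from the paper; the split on the sign of the input is only a numerical-stability convenience.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid: maps any real x into the open interval (0, 1)."""
    # Branch on the sign so that math.exp never receives a large positive value.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def sigmoid_derivative(x: float) -> float:
    """Derivative of the sigmoid, written in terms of the sigmoid itself."""
    s = sigmoid(x)
    return s * (1.0 - s)


if __name__ == "__main__":
    for x in (-10.0, 0.0, 10.0):
        print(x, sigmoid(x), sigmoid_derivative(x))
```

The derivative peaks at the origin and is almost zero far from it, which is the saturation behavior that causes gradients to shrink in deep networks.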

To solve the problem of image feature recognition, shallow and deep network structures for input and feature recognition have been proposed in succession [17], as shown in Figures 1 and 2.

Comparing the two structure diagrams shows that the input and processing mechanisms of the shallow and deep networks differ significantly. The shallow network requires manual feature extraction to operate normally, whereas the deep network gradually reconstructs the image features without any manually extracted features.

The idea of transposed convolution is then introduced. An ordinary convolution can be written as the multiplication of a matrix with the input vector; a transposed convolution instead multiplies the feature vector by the transpose of that matrix. Each input feature value is multiplied by the convolution kernel, and the overlapping results are added together. In this way, the feature map is enlarged. Figure 3 shows the process of upsampling by transposed convolution.
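A one-dimensional sketch can make the enlargement step concrete. It is an illustration under simplifying assumptions (a single channel, no padding, hypothetical function names), not the network used in the paper. The first function scatters a scaled copy of the kernel for each input value; the other two build the matrix of the corresponding ordinary strided convolution and multiply by its transpose, which gives the same result.

```python
def transposed_conv1d(x, kernel, stride=2):
    """Upsample x: every input value adds a scaled copy of the kernel to the output."""
    out_len = (len(x) - 1) * stride + len(kernel)
    out = [0.0] * out_len
    for i, value in enumerate(x):
        for k, weight in enumerate(kernel):
            out[i * stride + k] += value * weight
    return out


def conv_matrix(in_len, kernel, stride=2):
    """Matrix C of an ordinary strided convolution (no padding), so that y = C x."""
    out_len = (in_len - len(kernel)) // stride + 1
    rows = []
    for r in range(out_len):
        row = [0.0] * in_len
        for k, weight in enumerate(kernel):
            row[r * stride + k] = weight
        rows.append(row)
    return rows


def multiply_by_transpose(matrix, vec):
    """Compute C^T v without materializing the transpose."""
    n_cols = len(matrix[0])
    return [sum(matrix[r][c] * vec[r] for r in range(len(matrix)))
            for c in range(n_cols)]


if __name__ == "__main__":
    features = [1.0, 1.0]
    kernel = [1.0, 2.0, 1.0]
    print(transposed_conv1d(features, kernel))                # length 5 from length 2
    print(multiply_by_transpose(conv_matrix(5, kernel), features))
```

With stride 2, an input of length n grows to (n - 1) * 2 + kernel length, which is why the operation is used for upsampling.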

2.2. Characteristics of Plastic Arts of Nomadic Civilization in Northern Grasslands

With the introduction of deep learning, the traditional way of identifying and extracting the features of nomadic plastic arts in the northern grasslands has changed. Layer-by-layer pretraining is used to build a convolutional neural network, and multiple hidden layers are added to improve performance. Computers and other types of terminal equipment are then used to identify and extract the features and obtain more information. Finally, the obtained feature information is filtered through step-by-step abstraction, thereby reducing various environmental and human interferences.

The nomadic plastic arts of the northern grasslands include ethnic buildings, ethnic costumes, murals, decorative designs, unique ethnic scripts and calligraphy, carvings, and so on. Because these distinctive artistic features differ from those of the Central Plains culture, more information is needed during extraction. The different features of each region should be classified and extracted separately to ensure that the feature information is extracted completely. The characteristics of plastic arts are divided into the following:

(1) Color characteristics: the clothing, utensils, and plastic artworks of the northern grassland nomadic peoples all have extremely rich colors, with many kinds of color mixing and matching. When extracting them, the color moment, color histogram, and color coherence vector are used, and these features require further analysis.

The color moment describes the color distribution of an image through its statistical moments and is a simple and efficient way to express color features. Its advantage is that the color space inside the image area does not need to be quantized and the obtained feature vector has a low dimension. Its disadvantage is equally clear: retrieval based on this feature alone is not very effective, so in practical applications it is usually used to prefilter images and reduce the scope of retrieval. The first three moments of each color component are computed as follows:

$$\mu_i = \frac{1}{N}\sum_{j=1}^{N} p_{i,j}, \qquad \sigma_i = \left(\frac{1}{N}\sum_{j=1}^{N}\left(p_{i,j}-\mu_i\right)^{2}\right)^{1/2}, \qquad s_i = \left(\frac{1}{N}\sum_{j=1}^{N}\left(p_{i,j}-\mu_i\right)^{3}\right)^{1/3}$$

Here, $p_{i,j}$ represents the $i$-th color component of the $j$-th pixel of the image, and $N$ represents the number of all pixels in the image.

(2) Texture features: texture features can distinguish the differences between ethnic clothing and craft decorations, including the color of clothing, the roughness of fabrics, shape, stripes, color matching, and the surface smoothness of decorations and crafts. A texture feature is obtained by calculating a value over the image that quantifies the grayscale change in an image area. Generally, there are two ways to express texture features: one is the cooccurrence matrix, which describes texture through the spatial correlation of image gray levels, and the other is the local binary pattern (LBP) [18]. The local binary pattern obtains image texture features by comparing the gray value of each pixel with the gray values of its neighborhood.

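For the spatial gray-level cooccurrence matrix in texture features, start from a pixel whose gray level is $i$ and count how often a pixel at distance $d$ from it, in direction $\theta$, has gray level $j$. The resulting frequency of occurrence is $P(i, j, d, \theta)$, and its mathematical expression is as follows:

$$P(i, j, d, \theta) = \#\bigl\{\,\bigl((x, y), (x + \Delta x, y + \Delta y)\bigr) \;\big|\; f(x, y) = i,\ f(x + \Delta x, y + \Delta y) = j \,\bigr\}$$

where $f(x, y)$ is the gray level at pixel $(x, y)$, $(\Delta x, \Delta y)$ is the displacement determined by $d$ and $\theta$, and $\#$ denotes the number of pixel pairs in the set. Dividing by the total number of pairs gives the probability of occurrence.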

The local binary pattern is an operator that describes the local texture of an image, and it has a wide range of applications. It is expressed as follows:

$$\mathrm{LBP}(x_c, y_c) = \sum_{p=0}^{P-1} 2^{p}\, s\left(i_p - i_c\right)$$

Here, $(x_c, y_c)$ represents the coordinates of the center pixel; $p$ indexes the $p$-th of the $P$ pixels in the neighborhood; $i_p$ represents the gray value of the neighborhood pixel; $i_c$ represents the gray value of the center pixel; and $s$ represents the sign function, as shown below:

$$s(x) = \begin{cases} 1, & x \ge 0 \\ 0, & x < 0 \end{cases}$$

(3) Shape features: there are two main extraction methods for the shape features of an image. One is the contour feature [19], which extracts the edge information of the overall image contour; the other is the regional feature, for which the shape features are extracted from the image region.

(4) Local features: global features are generally used for retrieval, but their disadvantage is a high false detection rate. Local features are therefore used in addition for retrieval to improve the accuracy.
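Two of the descriptors listed above, the color moments of item (1) and the local binary pattern of item (2), are simple enough to sketch in plain Python. This is an illustration only: the function names, the flat-list channel representation, and the clockwise ordering of the eight neighbors are assumptions, not details given in the paper.

```python
import math


def color_moments(channel):
    """First three color moments of one color channel (a flat list of pixel values)."""
    n = len(channel)
    mean = sum(channel) / n
    second = sum((p - mean) ** 2 for p in channel) / n
    third = sum((p - mean) ** 3 for p in channel) / n
    std = math.sqrt(second)
    # Signed cube root, because the third central moment can be negative.
    skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
    return mean, std, skew


# Assumed neighbor order: clockwise, starting at the top-left pixel.
_OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image, row, col):
    """8-neighbor LBP code of an interior pixel of a grayscale image (list of rows)."""
    center = image[row][col]
    code = 0
    for p, (dr, dc) in enumerate(_OFFSETS):
        if image[row + dr][col + dc] >= center:  # sign function s(i_p - i_c)
            code |= 1 << p                       # weight 2^p
    return code


if __name__ == "__main__":
    print(color_moments([0, 0, 3]))
    patch = [[9, 1, 9],
             [1, 5, 1],
             [9, 1, 9]]
    print(lbp_code(patch, 1, 1))
```

A histogram of the LBP codes over an image region is the usual texture descriptor; the three moments per channel give a nine-dimensional color descriptor for an RGB image.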

Before the features of the plastic arts are extracted, however, factors such as deformation of ethnic clothing, partial occlusion, wrinkles in clothes, lighting, and noise can leave the images blurred, unevenly lit, or distorted. The feature information is then damaged, and the extracted features are incomplete. To solve this problem, the image must be preprocessed so that it is restored as closely as possible to the original shooting conditions and as much information as possible is retained. The processing methods are as follows.

2.2.1. Image Bilateral Filtering Processing

This method mainly removes redundant noise from images (including video frames) while ensuring that the image is not overly distorted. Depending on the actual situation, it reduces the white noise in the target image to a reasonable level and improves the recognition accuracy.

A filter that weights pixels only by their spatial distance blurs edges, so a bilateral filter is introduced that combines spatial information with the similarity between pixel values. Because the similarity in pixel value is considered in addition to the distance in the image plane, pixels on the far side of an edge have much less influence on the pixels near the edge, and the pixel values at the edges of the image are better preserved.

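The mathematical formula is as follows:

$$g(i, j) = \frac{\sum_{(k, l)} f(k, l)\, w(i, j, k, l)}{\sum_{(k, l)} w(i, j, k, l)}$$

where $f$ is the input image, $g$ is the filtered image, and $(k, l)$ ranges over the neighborhood of pixel $(i, j)$.

The mathematical expression of the domain kernel is

$$d(i, j, k, l) = \exp\left(-\frac{(i - k)^2 + (j - l)^2}{2\sigma_d^2}\right)$$

In the formula, $\sigma_d^2$ is the Gaussian variance.

The mathematical expression of the range kernel is

$$r(i, j, k, l) = \exp\left(-\frac{\lVert f(i, j) - f(k, l) \rVert^2}{2\sigma_r^2}\right)$$

The mathematical expression of the bilateral filtering weight function is

$$w(i, j, k, l) = d(i, j, k, l)\, r(i, j, k, l) = \exp\left(-\frac{(i - k)^2 + (j - l)^2}{2\sigma_d^2} - \frac{\lVert f(i, j) - f(k, l) \rVert^2}{2\sigma_r^2}\right)$$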
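A direct, unoptimized Python sketch of bilateral filtering on a small grayscale image follows. The parameter names `radius`, `sigma_d`, and `sigma_r` and their default values are illustrative assumptions; the paper does not state the settings it used.

```python
import math


def bilateral_filter(image, radius=1, sigma_d=1.0, sigma_r=25.0):
    """Bilateral filter for a grayscale image given as a list of rows."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            num = 0.0
            den = 0.0
            # The neighborhood is clipped at the image border.
            for k in range(max(0, i - radius), min(rows, i + radius + 1)):
                for l in range(max(0, j - radius), min(cols, j + radius + 1)):
                    # Domain term: spatial distance between the two pixels.
                    domain = ((i - k) ** 2 + (j - l) ** 2) / (2.0 * sigma_d ** 2)
                    # Range term: difference between the two pixel values.
                    rng = (image[i][j] - image[k][l]) ** 2 / (2.0 * sigma_r ** 2)
                    weight = math.exp(-domain - rng)
                    num += image[k][l] * weight
                    den += weight
            out[i][j] = num / den  # den >= 1 because the center pixel has weight 1
    return out


if __name__ == "__main__":
    step_edge = [[0, 0, 100, 100] for _ in range(4)]
    for row in bilateral_filter(step_edge, sigma_r=10.0):
        print([round(v, 3) for v in row])
```

On the step edge in the usage example, pixels on either side keep their values almost exactly, because the range term gives a negligible weight to neighbors across the edge.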

2.2.2. Image Equalization Processing

The target image is distorted to a certain extent after bilateral filtering. Histogram equalization is therefore applied to it, which nonlinearly stretches those parts of the target image where the gray values are concentrated. Note that the transformation must not change the ordering of the gray values in the original image; that is, the gray-level mapping must be monotonic.

The steps of histogram equalization are as follows:

Step 1. Assume that the original image has a total of $L$ gray levels, which are represented by $r_k$, $k = 0, 1, \ldots, L - 1$.

Step 2. Count the number of pixels $n_k$ whose gray level is $r_k$. The frequency of this gray level is

$$p(r_k) = \frac{n_k}{n}$$

In the formula, $n$ is the total number of pixels.

Step 3. Calculate the cumulative probability distribution function:

$$c(r_k) = \sum_{j=0}^{k} p(r_j)$$

Step 4. For the output image, let the gray levels be $s_k$, $k = 0, 1, \ldots, L - 1$. They are given by

$$s_k = \mathrm{INT}\left[\, g_{\max}\, c(r_k) + 0.5 \,\right]$$

Note: $g_{\max}$ indicates the maximum gray value.
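The four steps can be sketched in a few lines of Python. The sketch assumes integer gray values in the range 0 to `levels - 1` and takes the maximum gray value to be `levels - 1`; both are assumptions made for the illustration.

```python
def equalize_histogram(image, levels=256):
    """Histogram equalization of a grayscale image given as a list of rows of ints."""
    pixels = [p for row in image for p in row]
    n = len(pixels)

    # Step 2: count the pixels at each gray level.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Step 3: cumulative distribution of the gray levels.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running / n)

    # Step 4: map each gray level to its output level, rounding to the nearest integer.
    g_max = levels - 1  # assumed maximum gray value
    mapping = [int(g_max * c + 0.5) for c in cdf]

    return [[mapping[p] for p in row] for row in image]


if __name__ == "__main__":
    dark = [[0, 0], [1, 1]]
    print(equalize_histogram(dark, levels=4))
```

Because the cumulative distribution never decreases, the mapping is monotonic, so the ordering of gray values in the original image is preserved.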

2.2.3. Image Illumination Interference Suppression Processing Method

This method mainly addresses recognition under backlight, when strong light interferes with light capture by the photosensitive element of the camera. Before the target image is processed, its contrast is improved by a nonlinear transformation of the illumination. This is not a perfect processing method: a common drawback is that changes in lighting conditions leave areas of the target image that are too dark or too bright to be recognized.

Quotient image theory builds on the quotient image algorithm and Retinex theory. The target image is related to the source image as a quotient of a numerator and a denominator. Let the source image be $I$ and the target image (quotient image) be $Q$. Its mathematical expression is

$$Q = \frac{I}{\hat{I}}$$

The denominator $\hat{I}$ is a smoothed version of the original image. To make the filtering effect more pronounced, a weighted Gaussian filter is used for anisotropic smoothing:

$$\hat{I} = F * I$$

In the formula, $*$ represents the convolution operation, and $F$ represents the filter. The filter kernel should meet the following requirement:

$$\frac{1}{N}\sum_{\Omega} W G = 1, \qquad F = \frac{1}{N}\, W G$$

In the formula, $\Omega$ is the region covered by the convolution kernel, $N$ is the normalization factor, $W$ is the weight, and $G$ is the Gaussian function.

The convolution region is divided into two areas, $M_1$ and $M_2$, which lie on either side of a threshold $\tau$. A pixel of the target image whose value is greater than the threshold is assigned the weight 1; otherwise, it is assigned 0. The mathematical expression is as follows:

$$W(i, j) = \begin{cases} 1, & I(i, j) > \tau \\ 0, & \text{otherwise} \end{cases}$$

In this way, in areas with large grayscale changes, the convolution is performed only over pixels on one side of the threshold, which reduces the halo effect under poor lighting conditions.
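The quotient-image computation can be sketched as follows. Several details are assumptions made for the illustration and are not stated in the paper: the threshold is taken as the mean of the local window, a window with no pixel above the threshold falls back to that mean, and a small constant `eps` guards against division by zero.

```python
import math


def self_quotient(image, radius=1, sigma=1.0, eps=1e-6):
    """Quotient of a grayscale image and its weighted-Gaussian-smoothed version."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            window = [(k, l)
                      for k in range(max(0, i - radius), min(rows, i + radius + 1))
                      for l in range(max(0, j - radius), min(cols, j + radius + 1))]
            # Assumed threshold: mean gray value of the local window.
            tau = sum(image[k][l] for k, l in window) / len(window)
            num = 0.0
            den = 0.0
            for k, l in window:
                w = 1.0 if image[k][l] > tau else 0.0  # binary weight W
                g = math.exp(-((i - k) ** 2 + (j - l) ** 2) / (2.0 * sigma ** 2))
                num += w * g * image[k][l]
                den += w * g  # dividing by den plays the role of the factor 1/N
            # Assumed fallback for flat windows, where no pixel exceeds the threshold.
            smoothed = num / den if den > 0.0 else tau
            out[i][j] = image[i][j] / (smoothed + eps)
    return out


if __name__ == "__main__":
    image = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
    brighter = [[2 * v for v in row] for row in image]
    print(self_quotient(image)[1][1], self_quotient(brighter)[1][1])
```

The usage example scales the whole image by a constant factor and obtains almost the same quotient, which is the sense in which the quotient image suppresses changes in illumination.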

2.3. Recognition and Extraction of Plastic Art Features

For the plastic arts of the northern nomadic civilization, the partitioning of images and video frames is one of the three major steps of digital processing. It divides an image into many subregions that have distinct characteristics, differ strongly from one another, and do not overlap. One option after processing is to upload the result to a database and use cloud computing to perform fast feature point matching, image understanding, and other operations.

2.3.1. Image Feature Extraction Overview

The process of image recognition consists mainly of two steps: the first extracts and selects features from the preprocessed images, and the second classifies the images. That is, it comprises feature extraction and selection, followed by classifier design and classification decision [20].

After the image has been preprocessed, the computer identifies and analyzes it. Feature extraction and feature selection pick features out of the numerous feature sets to form subsets, and a classifier then classifies the obtained subsets.

Step 1. Feature extraction and selection.

The success rate of image recognition depends above all on the selection of image feature points at this early stage. The process of feature extraction or selection is shown in Figure 4. The first step is to reduce the dimensionality of the image data to obtain a subset that reflects the essential structure of the data and yields a higher recognition rate. Digitizing an image generates a large amount of data, so the image pixel values are encoded in the spatial domain and then compressed to reduce the amount of data. (The algorithm presentation is provided by these sections a–e.) The results are critical for subsequent classification.

Step 2. Classifier design and classification decision.

The success or failure of the classifier depends on various factors in the imaging environment, such as pixel mapping, viewing angle, and lighting, all of which affect the design of the classifier. Besides these image factors, the classification criteria themselves vary; for example, images may be classified by time, by user interests, or by user group. The preprocessed image described above is fed into the classifier, its feature points are selected, and a classification decision is made, as shown in Figure 5. The result is then analyzed, evaluated, and estimated in order to improve the accuracy as much as possible.

The concept of classification is to learn a classification function or construct a classification model on the basis of existing data. Image classification is generally divided into the following 5 steps: (1) use torchvision to load and normalize the training and test datasets of CIFAR10; (2) define a convolutional neural network; (3) define a loss function; (4) on the training sample data, train the network; (5) test the network on the test sample data.

2.3.2. Support Vector Machine (SVM) Model

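SVM is a high-performance and powerful classifier in machine learning, and it is common in image recognition. The task of SVM is to maximize the margin between the feature points of different classes in the target image, and the calculation is based on the following mathematical expression:

$$\max_{w, b}\ \frac{2}{\lVert w \rVert} \quad \text{s.t.}\quad y_i\left(w^{T} x_i + b\right) \ge 1,\quad i = 1, \ldots, n$$

where $x_i$ is the feature vector of the $i$-th sample, $y_i \in \{-1, +1\}$ is its class label, and $w$ and $b$ are the normal vector and bias of the separating hyperplane.

Secondly, the maximization is converted into an equivalent minimization, and the mathematical expression is

$$\min_{w, b}\ \frac{1}{2}\lVert w \rVert^{2} \quad \text{s.t.}\quad y_i\left(w^{T} x_i + b\right) \ge 1,\quad i = 1, \ldots, n$$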

In the bilateral filtering algorithm of Section 2.2.1, the domain kernel selects the weight according to the distance between pixels: the closer the distance, the greater the weight. This is the same as in box filtering and Gaussian filtering. The range kernel assigns weights according to the difference in pixel values: two pixels with similar values receive a larger weight, even if they are far apart, than two pixels that are close together but differ greatly in value. It is the range kernel that preserves edges, where pixels are close in distance but different in value.

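Introducing a Lagrange multiplier $\alpha_i \ge 0$ for each constraint gives the Lagrangian function

$$L(w, b, \alpha) = \frac{1}{2}\lVert w \rVert^{2} - \sum_{i=1}^{n} \alpha_i \left[\, y_i\left(w^{T} x_i + b\right) - 1 \,\right]$$

where $n$ is the number of samples, $x_i$ and $y_i \in \{-1, +1\}$ are the feature vector and class label of the $i$-th sample, and $w$ and $b$ are the normal vector and bias of the separating hyperplane. The original problem is then transformed into

$$\min_{w, b}\ \max_{\alpha_i \ge 0}\ L(w, b, \alpha)$$

Setting the partial derivatives of $L$ with respect to $w$ and $b$ to zero and taking the dual, the mathematical expression is as follows:

$$\max_{\alpha}\ \sum_{i=1}^{n} \alpha_i - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i \alpha_j\, y_i y_j\, x_i^{T} x_j \quad \text{s.t.}\quad \sum_{i=1}^{n} \alpha_i y_i = 0,\quad \alpha_i \ge 0$$

This is a quadratic programming problem, which is solved to obtain the optimal values of $\alpha_i$.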

One of the tasks of SVM is to map the data samples collected from the target image from a low-dimensional space into a high-dimensional space, so that the samples can be separated there. The function that makes this mapping tractable is called the kernel function: it equals the inner product of two mapped samples in the high-dimensional space but is computed directly from the original samples. Using the properties of kernel functions, more complex kernel functions can be constructed and the dimension can be increased. The effect is shown in Figure 6.

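Support vector machines map feature vectors from a low-dimensional space to a high-dimensional space. The kernel functions commonly used for this are the linear kernel, the polynomial kernel, and the Gaussian kernel. The calculation formula of each kernel function is listed below:

$$K_{\text{linear}}(x, y) = x^{T} y$$

$$K_{\text{poly}}(x, y) = \left(\gamma\, x^{T} y + c\right)^{d}$$

$$K_{\text{Gauss}}(x, y) = \exp\left(-\frac{\lVert x - y \rVert^{2}}{2\sigma^{2}}\right)$$

where $\gamma$, $c$, and $d$ are the scale, offset, and degree of the polynomial kernel and $\sigma$ is the width of the Gaussian kernel. The mapping process can be achieved through the above kernel functions.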
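The three kernels are easy to write down in Python. The sketch below also includes an explicit feature map for the degree-2 polynomial kernel in two dimensions, to show that the kernel value equals an inner product in a higher-dimensional space. All function and parameter names are illustrative, and the default parameter values are arbitrary.

```python
import math


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y):
    return dot(x, y)


def polynomial_kernel(x, y, gamma=1.0, coef0=1.0, degree=2):
    return (gamma * dot(x, y) + coef0) ** degree


def gaussian_kernel(x, y, sigma=1.0):
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def degree2_feature_map(x):
    """Explicit map for the kernel (x . y)^2 on 2-D inputs: R^2 -> R^3."""
    x1, x2 = x
    return [x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2]


if __name__ == "__main__":
    a, b = [1.0, 2.0], [3.0, 4.0]
    via_kernel = polynomial_kernel(a, b, gamma=1.0, coef0=0.0, degree=2)
    via_mapping = dot(degree2_feature_map(a), degree2_feature_map(b))
    print(via_kernel, via_mapping)
    print(gaussian_kernel(a, b))
```

The kernel evaluates the high-dimensional inner product without ever building the mapped vectors, which is what keeps the dual problem tractable when the mapped space is very large.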

3. Improvement of Feature Algorithm and Neural Network Optimization Function of Deep Learning

3.1. SIFT Feature Algorithm Improvement

The SIFT feature algorithm is divided into four stages. First, a Gaussian scale-space function is introduced to find the extreme points at specific positions in the target image. Second, treating the image as a two-dimensional rectangular coordinate system, the exact coordinates of the detected points in the target image are determined, and key points with low contrast or lying on low-curvature edges are removed. Third, principal directions are assigned to the feature points. Fourth, the Euclidean distance is used to calculate the similarity of the feature vectors.

3.1.1. Build an Image Pyramid

The image pyramid is divided into several groups (octaves), and each group contains several layers. The groups are related progressively: each upper group is obtained by downsampling the group below it. A pixel in the DOG (difference of Gaussians) is taken as a feature point if it is the extreme value within its fixed neighborhood in its own layer and in the corresponding neighborhoods of the layers above and below. The feature points of the DOG in the image are shown in Figures 7 and 8 [21].
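The construction can be sketched in plain Python as follows. This is a simplified illustration rather than a full SIFT implementation: the number of octaves and layers, the base scale `sigma0`, the clamped border handling, and all function names are assumptions, and the extremum search itself is omitted.

```python
import math


def gaussian_kernel_1d(sigma):
    """Normalized 1-D Gaussian kernel truncated at about three standard deviations."""
    radius = max(1, int(3 * sigma + 0.5))
    weights = [math.exp(-(x * x) / (2.0 * sigma * sigma))
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur(image, sigma):
    """Separable Gaussian blur; border pixels are repeated (clamped indexing)."""
    kernel = gaussian_kernel_1d(sigma)
    radius = len(kernel) // 2
    rows, cols = len(image), len(image[0])
    horizontal = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            horizontal[r][c] = sum(
                kernel[k + radius] * image[r][min(cols - 1, max(0, c + k))]
                for k in range(-radius, radius + 1))
    result = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            result[r][c] = sum(
                kernel[k + radius] * horizontal[min(rows - 1, max(0, r + k))][c]
                for k in range(-radius, radius + 1))
    return result


def downsample(image):
    """Keep every second row and column."""
    return [row[::2] for row in image[::2]]


def dog_pyramid(image, octaves=2, layers=3, sigma0=1.0):
    """List of octaves; each octave is a list of difference-of-Gaussians images."""
    step = 2.0 ** (1.0 / layers)  # scale ratio between neighboring layers
    pyramid = []
    base = image
    for _ in range(octaves):
        blurred = [blur(base, sigma0 * step ** s) for s in range(layers + 1)]
        dogs = [[[hi - lo for lo, hi in zip(row_lo, row_hi)]
                 for row_lo, row_hi in zip(img_lo, img_hi)]
                for img_lo, img_hi in zip(blurred, blurred[1:])]
        pyramid.append(dogs)
        base = downsample(blurred[-1])  # the next octave starts from the coarsest layer
    return pyramid


if __name__ == "__main__":
    image = [[float((r * c) % 7) for c in range(8)] for r in range(8)]
    pyr = dog_pyramid(image)
    print(len(pyr), len(pyr[0]), len(pyr[1][0]), len(pyr[1][0][0]))
```

Candidate feature points would then be the pixels of each DoG layer that are larger or smaller than all of their neighbors in the same layer and in the layers directly above and below.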

3.1.2. Double Position Matching Criterion

To reduce the error introduced by feature extraction in the traditional algorithm, we introduce a double position matching criterion. The specific method is as follows. First, the ratio of the distance to the nearest image block to the distance to the second-nearest image block is used as the measure for rough matching, and the ratio threshold is set to 1.0. After the rough matching is completed, the matched image blocks are ranked by similarity, and the matching results determine whether each match is kept. Because highly correlated corresponding sample points serve as the reference for this method, the double matching criterion is described below in the two-dimensional plane.
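The rough-matching stage can be sketched as a nearest-to-second-nearest ratio test on descriptor distances, followed by ranking the surviving matches by similarity. The function names and the tuple layout of a match are illustrative. The default threshold of 1.0 follows the text; 0.8 is the value suggested in Lowe's original SIFT paper and is shown in the usage example for comparison.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def ratio_match(desc_a, desc_b, ratio=1.0):
    """Rough matching: keep (i, j, distance) when the nearest descriptor in desc_b
    is closer than `ratio` times the second-nearest one."""
    matches = []
    for i, da in enumerate(desc_a):
        dists = sorted((euclidean(da, db), j) for j, db in enumerate(desc_b))
        if len(dists) < 2:
            continue  # the ratio test needs at least two candidates
        (d1, j1), (d2, _) = dists[0], dists[1]
        if d1 < ratio * d2:
            matches.append((i, j1, d1))
    # Rank the surviving matches by similarity (smallest distance first).
    matches.sort(key=lambda m: m[2])
    return matches


if __name__ == "__main__":
    source = [[0.0, 0.0], [10.0, 10.0]]
    target = [[0.0, 1.0], [10.0, 9.0], [5.0, 5.0]]
    print(ratio_match(source, target, ratio=0.8))
```

A stricter (smaller) ratio discards more ambiguous matches before the position check is applied.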

In Figure 9, the points that are both are the reference matching points in the target feature point set and the original feature point set, respectively, and they have a pairwise correspondence. For and , the following mathematical relationship is satisfied between the straight line and the straight line :

Order:

It can be seen from the calculation that the coordinates of the intersection of the two straight lines are

Since the positions of the points are relative, the following equation holds:

According to the position of point relative to the triangle , the coordinates of are calculated, and then, the coordinate error between , , and can be obtained [22].

If both coordinate errors of the point are less than the given threshold, the match is considered reasonable. The double position matching criterion avoids the large error that arises in the traditional algorithm when the points in the image are nearly collinear.

3.2. Adam Optimization Algorithm

In deep learning neural networks, Adam is a first-order optimization algorithm that can replace the traditional stochastic gradient descent method [23]; it iteratively updates the neural network weights based on the training data. The algorithm estimates the first and second moments of the gradient of the objective function with respect to each parameter [24] and computes them as exponential moving averages. Its parameter updates are invariant to rescaling of the gradient, and it copes with the noisy and sparse gradients that arise during iteration. The formulas are expressed as follows:

$$m_t = \beta_1 m_{t-1} + (1 - \beta_1)\, g_t, \qquad v_t = \beta_2 v_{t-1} + (1 - \beta_2)\, g_t^{2}$$

$$\hat{m}_t = \frac{m_t}{1 - \beta_1^{t}}, \qquad \hat{v}_t = \frac{v_t}{1 - \beta_2^{t}}$$

$$\theta_{t+1, i} = \theta_{t, i} - \frac{\eta}{\sqrt{\hat{v}_{t, i}} + \epsilon}\, \hat{m}_{t, i}$$

In the above formulas, $t$ represents the number of iterations, $\theta_i$ represents the $i$-th parameter, $\eta$ represents the step size along the gradient descent direction, $g_t$ is the gradient at iteration $t$, $\beta_1$ and $\beta_2$ are the decay rates, $\epsilon$ is a small constant that prevents division by zero, and $\hat{m}_t$ and $\hat{v}_t$ are the bias-corrected exponentially decaying means of the first-power and squared historical gradients, respectively [25].
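To make the update rule concrete, the following pure-Python sketch applies Adam to a small quadratic objective. The objective, the learning rate of 0.1, and the function name are illustrative choices for the demonstration; the decay rates 0.9 and 0.999 and the constant 1e-8 are the commonly used defaults, not values reported in the paper.

```python
import math


def adam_minimize(grad, theta, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=1000):
    """Minimize a function with Adam, given a callable that returns its gradient."""
    theta = list(theta)
    m = [0.0] * len(theta)  # first-moment estimate (moving average of gradients)
    v = [0.0] * len(theta)  # second-moment estimate (moving average of squared gradients)
    for t in range(1, steps + 1):
        g = grad(theta)
        for i in range(len(theta)):
            m[i] = beta1 * m[i] + (1.0 - beta1) * g[i]
            v[i] = beta2 * v[i] + (1.0 - beta2) * g[i] ** 2
            # Bias correction compensates for the zero initialization of m and v.
            m_hat = m[i] / (1.0 - beta1 ** t)
            v_hat = v[i] / (1.0 - beta2 ** t)
            theta[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta


def quadratic_gradient(theta):
    """Gradient of f(x, y) = (x - 3)^2 + (y + 1)^2, whose minimum is at (3, -1)."""
    return [2.0 * (theta[0] - 3.0), 2.0 * (theta[1] + 1.0)]


if __name__ == "__main__":
    print(adam_minimize(quadratic_gradient, [0.0, 0.0]))
```

Each parameter gets its own effective step size, because the moving average of the gradient is divided by the square root of the moving average of the squared gradient.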

4. Experimental Simulation

To verify the algorithm, a large amount of data is used in the experiments. The evaluation metrics are the number of successfully matched points, the matching accuracy, and the matching time. The advantages of the algorithm before and after the improvement are judged from these angles. Based on these considerations, the experimental comparison is made in terms of accuracy and time. Finally, the superiority of the improved method is demonstrated on a real case: the feature extraction of the plastic arts of the northern nomadic civilization.

4.1. LLE-SIFT Feature Algorithm for Gaussian White Noise Correlation Difference Experimental Simulation

This experiment compares the performance of different algorithms under different levels of Gaussian white noise. We compare the experimental results when the noise level is 10, 20, 40, and 80. The comparison of the intelligent image recognition algorithms of the terminal is shown in Table 1. Tables 1 and 2 show that the stronger the Gaussian white noise, the greater its impact on the algorithms, and the success rates of all three algorithms become relatively low. Among the three, however, LLE-SIFT maintains the best performance at every noise level: its success rate is the highest and its matching time is the lowest. The comparison of the three algorithms under changing noise conditions is shown in Figures 10 and 11. (NMK: number of matched pairs; NCMK: number of correctly matched pairs; RCM: correct matching rate; the matching time is also reported.)

4.2. Experiments on the Performance Comparison of Adam Optimization Algorithm

The Adam optimization algorithm introduced in this paper is evaluated with the neural network GUR as an example. The mean absolute error and root mean square error of the two models SGD-GUR and Adam-GUR are compared on multiple datasets. The experimental data are shown in Tables 3 and 4, and the experimental comparison diagrams are shown in Figures 12 and 13.

MAE is the mean absolute error, and RMSE is the root mean square error. According to Table 3, the mean absolute errors of the SGD-GUR model on the three datasets are 0.0089, 0.0081, and 0.0401, with an average of 0.0190, while the mean absolute errors of the Adam-GUR model on the three datasets are 0.0033, 0.0081, and 0.0817, with an average of 0.031. According to Table 4, the root mean square errors of the SGD-GUR model on the three datasets are 0.0657, 0.0619, and 0.1274, with an average of 0.085, while the root mean square errors of the Adam-GUR model on the three datasets are 0.0132, 0.0126, and 0.0817, with an average of 0.0358. Figures 12 and 13 show clearly that different datasets affect the training of the model to different degrees. The GUR model using the Adam optimization algorithm counters the decline in accuracy and improves the prediction accuracy, making the model more stable.
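For reference, the two error measures can be computed as in the following sketch. The function names are illustrative, and the lists in the usage example are the per-dataset values quoted in the text, used here only to recompute their averages.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def mean(values):
    return sum(values) / len(values)


if __name__ == "__main__":
    print(mae([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
    print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
    # Averages of the per-dataset values quoted in the text.
    print(mean([0.0089, 0.0081, 0.0401]))  # SGD-GUR, MAE
    print(mean([0.0657, 0.0619, 0.1274]))  # SGD-GUR, RMSE
    print(mean([0.0132, 0.0126, 0.0817]))  # Adam-GUR, RMSE
```

RMSE penalizes large individual errors more strongly than MAE, so the two measures can rank models differently when the errors are unevenly distributed.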

4.3. Examples of Feature Extraction of Plastic Arts of Nomadic Civilization in Northern Grasslands

Select the experimental image and rotate and scale it by a certain degree, as shown in Figures 14 and 15. The feature points are extracted by the LLE-SIFT algorithm, the experimental data table is shown in Table 5, and the experimental comparison diagrams are shown in Figures 16 and 17.

The improved algorithm in this paper is compared with two other algorithms, SIFT and PCA-SIFT. The experimental data in Table 5 show that the LLE-SIFT feature algorithm performs better than the other two algorithms; the comparison of the three algorithms under different changes of image orientation is shown in Figures 16 and 17. Even after the image has undergone operations such as rotation and scaling, and with the influence of low edge contrast excluded, the LLE-SIFT feature algorithm still maintains a high matching success rate.

The purpose of the three comparative experiments in this article is to demonstrate the advantages of the LLE-SIFT algorithm in processing images under the above conditions. Among the three algorithms, LLE-SIFT is better than the other two in both matching accuracy and time consumption. The number of matched pairs was not fixed in the experiments; it depends on the computing power of the computer and is unrelated to the merits of the algorithm.

5. Conclusion

In summary, this paper studies the plastic arts of the nomadic civilization in the northern grasslands using deep learning algorithms, and the simulation experiments show excellent performance. Because the advantages of deep learning are applied to the plastic arts, shallow and deep networks can be constructed for input and feature recognition, which solves the problem of image feature recognition. The convolution idea is also introduced to enlarge the features, which is more conducive to feature recognition, extraction, and analysis. The experimental results show that the improved feature algorithm and the optimization algorithm for the deep learning neural network perform well. However, because the model uses the hyperbolic tangent or sigmoid function as the activation function, the gradient decays across the network layers, which makes the model very difficult to build and train. These are issues that need attention. In future work, provided that the performance of the model is maintained, nonsaturating activation functions such as the ReLU function can replace the original ones to prevent gradient decay in the network, further shorten the training time of the model, and make the model easier to build and train.

Data Availability

The experimental data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declared that they have no conflicts of interest regarding this work.