Abstract

With the rapid growth of music audio and video, the management of music audio-visual resources has lagged behind. This study uses the logistic regression algorithm, based on the sigmoid function, to analyze and process the music audio-visual library of a music software and establishes a logistic regression dynamic model, which provides a scientific research method for this complex system. The LPC characteristic coefficient is extracted, and the logistic regression model of music audio-visual archive resources is then established. A series of experiments with the completed model shows that it effectively optimizes the management of music audio-visual archives. The specific experimental conclusions are as follows: through genre classification and emotion classification, users can search music audio-visual archive resources more effectively. The comparison shows that the recommendation effect after filtering by the logistic regression model is better than that of the nonstandard collaborative filtering recommendation system. Finally, practical application indicates that the model based on the logistic regression algorithm optimizes the resource management of music audio-visual archives well.

1. Introduction

Folk songs are diverse, and each has a unique style shaped by its national characteristics. Many nationalities express happiness, anger, sadness, and joy through folk songs. In Oroqen culture, folk songs have a wide variety and unique tunes. Zandaren of the Oroqen nationality, handed down from generation to generation, makes up for the lack of a written history of the Oroqen people. It presents the social life of the Oroqen in the primitive hunting era in the form of art and provides precious data for research on ethnology, folklore, anthropology, ethnic religion, folk art, and so on. Music, as a carrier for expressing human emotions and transmitting information, has developed along with human history. It is an indispensable spiritual food for many people, and with the advent of the era of big data, its storage and communication media have undergone revolutionary changes. From vinyl records to magnetic tape, then to CDs, and now to digital formats, music audio-visual has become easier and more convenient to obtain with the development of science and technology [1]. At the same time, the threshold of music and audio-visual production has been lowered, and a large number of online singers have emerged since 2000, so the number of music audio-visual works has grown explosively compared with the past. A management method relying solely on manual classification and sorting can no longer meet people’s needs. Music computing science came into being against this background. It is a new discipline based on algorithms, which carries out theoretical research, information storage, intelligent analysis, and fuzzy recognition of music audio and video [2].

The two main functions of music computing science are classification and retrieval. These are two important aspects of music audio-visual archive resource management that need to be upgraded and optimized, so they have aroused great interest in academia and the music industry. Music archiving is an important part of music semantic management. Traditional music audio-visual classification is based on the general information of songs, such as song name, singer or band name, lyricist, composer, and file format. This classification method is accurate and efficient, but it can no longer meet the needs of the current music market. Users of today’s online music listening software prefer retrieval and recommendation based on personal preferences, which requires a more user-oriented intelligent classification of music audio and video in terms of style, emotion, genre, and song theme [3]. Before the era of big data, classification based on style, emotion, genre, and song theme was carried out manually. Although its accuracy can meet the requirements, it is inefficient and costly in manpower and time, and as data volumes have exploded, manual labor has become increasingly unable to keep up with the growth of data. Automatic classification of music audio-visual by computer algorithms has therefore entered the public’s field of vision and begun to play an efficient and irreplaceable role. At present, music genres are generally divided into the following ten categories: blues, classical, country, disco, hip hop, jazz, metal, pop, reggae, and rock [4]. A few scholars have proposed different genre classifications, but these have not been widely adopted, so this study does not use them. The technical principle of emotion-based classification is to extract the acoustic features of music audio-visual files and then use algorithmic analysis to classify music into emotions such as enthusiasm, cheerfulness, calmness, sadness, and anger [5].

In this experiment, the logistic regression (LR) algorithm is selected as the tool to establish the model, and modern classification and retrieval are used to manage and optimize music audio-visual archive resources. The data features that can be extracted from music audio and video reach thousands of dimensions. This huge amount of data contains not only effective features that are strongly related to classification but also redundant features that are weakly related or even irrelevant. If these redundant features are added to the discrete dynamic model constructed by the algorithm, they are bound to reduce the accuracy of the results [6]. Therefore, feature selection is needed: the initial features extracted from music audio-visual files are screened, and the optimal subset is evaluated for the next analysis. The logistic regression algorithm uses the classification information of the existing discrete dynamic system and massive music samples to obtain the correlation between the characteristics and categories of music audio-visual files. In the process of classification, the logistic regression algorithm selects the smallest feature subset with the greatest correlation with the categories [7]. Then, we compare this minimum feature subset with the user’s real demand features and calculate the matching rate until the target value is reached. On the one hand, real user classification evaluation results based on the classification standards of the existing discrete dynamic system can be obtained, so that users can gain a qualitative understanding of the customized music classification from the perspective of the system classification standards. On the other hand, the accuracy of classification can be improved through multiple iterations [8].
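The selection of a small, highly relevant feature subset described above can be illustrated with a simple filter-style ranking. The sketch below is only an illustration under stated assumptions, not the procedure used in this study: relevance is assumed to be the absolute Pearson correlation between each feature column and a numeric class label, and the names `pearson`, `rank_features`, and `select_top_k` are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_features(rows, labels):
    """Return feature indices sorted by decreasing |correlation| with the labels."""
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, labels)), j))
    # Ties are broken by the original feature index to keep the result deterministic.
    return [j for _, j in sorted(scores, key=lambda s: (-s[0], s[1]))]


def select_top_k(rows, labels, k):
    """Keep only the k features most correlated with the labels (assumed criterion)."""
    return rank_features(rows, labels)[:k]
```

In this toy setting, a constant feature column scores zero and is ranked last, which mirrors the idea of discarding redundant or irrelevant features before classification.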

The analysis of music audio-visual by the logistic regression algorithm mainly includes two parts: the extraction of music audio-visual features and the selection of features. Many scholars in the field of computer-based music audio-visual classification have studied these two problems [9]. Some researchers use the Apriori algorithm to extract features describing the texture, pitch, and melody of music audio-visual files, study and compare these feature sets, and finally obtain an accuracy of 73%. In recent literature, scholars have proposed new feature methods that improve the accuracy of music audio-visual archive classification to 78%. In addition, one author associates the three-dimensional data of the wavelet histogram with the classification of the music audio-visual system, which greatly improves management efficiency. Some researchers use the wavelet coefficients of music audio and video, mainly for rock music, to subdivide it more scientifically and reliably. Montreal Institute of Canada extracted the characteristics of loudness, rhythm, and tone from music audio-visual samples and calculated the zero crossing rate and pitch frequency. Some scholars take the pitch frequency, spectral centroid, parent band energy, and other perceptual characteristics of music audio-visual in the 12th-order Mel cepstrum coefficients and use these extracted features as the feature vector of music audio-visual. On the basis of this earlier research, some scholars use the wavelet transform to efficiently extract the energy characteristics of the mother band, making the results more accurate [10–14].

Audio-visual archives require strict archiving management. Modern archives management applies the following archiving measures to audio-visual archives: in combination with the requirements of the audio-visual archives management regulations, the archives management process is strictly followed. Generally, the archives of each department are handed over to the archives room, and the archives room hands them over to the archives. Key archives are strictly collected: the collection should focus on key audio-visual archives in light of the actual situation and follow the relevant management system. The management of audio-visual archives needs to be based on the requirements for their safe storage and on the specific conditions they require, such as constant temperature, constant humidity, and antimagnetism, and the relevant units need to provide the supporting work. Some key audio-visual archives must be stored separately in an audio-visual cabinet or a closed iron cabinet, with the temperature and humidity of the warehouse strictly controlled, and kept away from light sources, magnetic fields, and so on. The service life of audio-visual archives can be extended by regular bars, so that the value of audio-visual archives can be fully utilized.

Most existing approaches to music audio-visual archive resource management are based on fixed classification standards and do not consider all the classification needs of archive managers and users. After introducing the current situation of music audio-visual archive resource management, this paper uses the logistic regression algorithm to classify existing music audio-visual databases, so as to meet the new needs of managers and users. The feature selection parameters of several music audio-visual types are analyzed by the logistic regression algorithm, and the archives resource management scheme of neural network is realized.

2. Principle and Selection of the Logistic Regression Algorithm

2.1. Logistic Regression Algorithm

The logistic regression algorithm, also known as the logarithmic probability model, is a classical machine learning algorithm that is often used to solve classification and regression problems. The essential difference between classification and regression is the continuity of the data processed. The data values to be processed in a regression problem are continuous, such as the water level of a reservoir measured over a certain period of time [15]. The data to be processed in a classification problem are discrete, for example, judging whether a user of music streaming software likes a song. The logistic regression algorithm can be simplified as follows:

In the formula, the independent variable represents the input, which covers all factors affecting the dependent variable, and the dependent variable represents the output, which is the expected result. The logistic regression function describes the degree to which the independent variable influences the dependent variable and predicts the value of the dependent variable. If the function has only one independent variable, it belongs to univariate regression analysis. If there are multiple or even infinitely many independent variables, it belongs to multivariate regression analysis. The logistic regression algorithm can also customize the range of output values. The dependent variable in the above example can be defined as 0 or 1, so that no matter how many inputs there are, the final output can only be 0 or 1. In this example, the output is 0 if the user does not like a song and 1 if the user likes it, so two states are used to judge the user’s preference. After the sample set of music audio-visual archives is selected, the logistic regression algorithm collects and stores the feature information and label information of the sample set and computes a dynamic classifier through functions more complex than those in the above example [16]. The classifier can automatically recognize and distinguish music audio-visual files and then classify and store them.

Linear regression is the simplest regression model, as shown in the left panel of Figure 1. The sensitivity of linear regression remains unchanged over the whole value range, so its robustness is very poor. This leads to low accuracy when processing biased data. Logistic regression introduces a sigmoid function on the basis of linear regression. The formula of the function is as follows:

$$g(z) = \frac{1}{1 + e^{-z}}$$

where $z$ is an independent variable with a value range from negative infinity to positive infinity. The logistic regression curve is drawn on the coordinate axes, as shown in the right panel of Figure 1.

The right panel of Figure 1 shows the sigmoid function curve. The output stays within the range from 0 to 1, and the sensitivity changes with the independent variable $z$, indicating good robustness [17]. The accuracy can therefore be effectively improved when processing samples that contain deviating data.
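The behavior of the sigmoid curve can be checked numerically. The following is a minimal sketch, assuming the standard logistic function and a linear score of the form bias plus weighted sum of features; the helper names `sigmoid`, `sigmoid_slope`, and `predict_label` and the 0.5 decision threshold are illustrative assumptions, not taken from this study.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function 1 / (1 + exp(-z)), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def sigmoid_slope(z: float) -> float:
    """Derivative g'(z) = g(z) * (1 - g(z)), i.e. the local sensitivity to z."""
    g = sigmoid(z)
    return g * (1.0 - g)


def predict_label(features, weights, bias, threshold=0.5):
    """Map a linear score to 0 or 1 (e.g. 'dislike' / 'like'); threshold is an assumption."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if sigmoid(z) >= threshold else 0


if __name__ == "__main__":
    # The slope is largest near z = 0 and shrinks toward the tails.
    for z in (-6.0, -2.0, 0.0, 2.0, 6.0):
        print(z, sigmoid(z), sigmoid_slope(z))
```

The slope peaks at $z = 0$ and decays toward both tails, so extreme inputs move the output very little, which is one way to read the robustness claim above.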

2.2. Comparison between the Logistic Regression Algorithm and Different Algorithms

First, this study experimentally compares the performance of the logistic regression algorithm with that of the BP neural network algorithm, the support vector machine algorithm, and the decision tree algorithm to verify that the choice of algorithm is scientific and reasonable [18]. Two feature selection methods, the encapsulated (wrapper) type and the filter type, are used to test the four classification algorithms, and the selected performance index is accuracy. A sample set of music audio-visual files is selected, and each sample is given a weight. The accuracy can be calculated by the following formula. In formula (3), the accuracy is computed from two quantities: the sum of the weights of the samples with classification labels in the test sample set, and the sum of the weights of the samples classified by the algorithm. The first sum runs over the samples whose category can be predicted, and the second runs over the samples with correct prediction results. In this experiment, the default weight of each sample is assumed to be 1, and the running time is measured from the beginning to the end of the classifier test. The filter type adopts the principle of feature first search based on optimal relevance, and the encapsulated type uses a wrapper to search for the optimal feature subset based on Bayesian classification [19]. The experimental results are shown in Figure 2:
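The weighted accuracy described above can be sketched in a few lines. This sketch assumes the usual reading of the definition, namely the weight of the correctly classified samples divided by the total weight of the labelled test samples; the function name `weighted_accuracy` is hypothetical, and the default weight of 1 follows the text.

```python
def weighted_accuracy(true_labels, predicted_labels, weights=None):
    """Weight of correctly predicted samples divided by the total sample weight."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    if weights is None:
        weights = [1.0] * len(true_labels)  # default weight of 1, as in the text
    if len(weights) != len(true_labels):
        raise ValueError("one weight per sample is required")
    total = sum(weights)
    if total == 0:
        raise ValueError("total weight must be positive")
    correct = sum(
        w for t, p, w in zip(true_labels, predicted_labels, weights) if t == p
    )
    return correct / total
```

With unit weights this reduces to the plain fraction of correct predictions, for example 3 correct out of 4 samples gives 0.75.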

Figure 2 shows the time taken by each classification algorithm to classify a fixed number of music audio-visual file samples under the filter and encapsulated feature selection methods, together with the accuracy of the final results. As can be seen from the figure, the logistic regression algorithm needs 6.9 seconds to process the samples, which is shorter than the 7.2 seconds of the decision tree algorithm and the 8.8 seconds of the support vector machine algorithm but longer than the 4.8 seconds of the BP neural network algorithm. The impact of accuracy on the results must also be considered. The ordinate shows that the highest accuracy of the logistic regression algorithm is 93.2%, which is higher than the 88.6% of the BP neural network algorithm, the 75% of the support vector machine algorithm, and the 63.9% of the decision tree algorithm. Considering time and accuracy together, the logistic regression algorithm offers the best balance of accuracy and efficiency in dealing with music audio-visual file samples, so it is reasonable and scientific to use this algorithm in this experiment.

3. Management Strategy of Music Audio-Visual Archives Resources

3.1. Logistic Regression Classification Strategy

Traditional music audio-visual classification is based on the song name, singer or band name, lyricist, composer, and file format. It has something in common with automatic classification of music audio-visual based on genre and emotion: both need three steps, namely algorithmic extraction, selection of the best features, and classification training. However, there are great differences between the two. The definition of music audio-visual by genre and emotion is abstract, which differs from the qualitative characteristics used in traditional music classification. Genre and emotion are based on subjective human feelings and constitute a higher-level description of music audio-visual. Genres are established according to the common points and starting points of different artists when they create music, audio, and video works, and can serve as a simplified summary of some artists’ works [20]. As the theme expressed by music audio-visual, emotion plays a decisive role in arousing the feelings of the audience. Determined by the setting of melody and section, it can also serve as a simplified summary of the common points of some music audio-visual works. With the development of the music industry, the number of music audio-visual archives keeps increasing. Whether managers and users can quickly associate and list archives by genre and emotion when looking for the required music audio-visual determines the quality of a music audio-visual database, which is of great significance to the development of the music industry. The ten most widely accepted music genres mentioned above are blues, classical, country, disco, hip hop, jazz, metal, pop, reggae, and rock. Assigning a song to one or several of these ten genres raises many problems. For example, there are regional, cultural, and lifestyle differences among music audio-visual file managers and users. Moreover, artists do not necessarily write lyrics and music according to a fixed musical paradigm, and the works created by the same artist in different periods may have different styles and belong to different genres. To address the fuzziness of music audio-visual classification, this study extracts the acoustic features, temporal and spatial features, and semantic features of music audio-visual works to represent the individual’s understanding, memory, and feeling of these works. The subset of features extracted by the logistic regression algorithm can form a hierarchical relationship graph of music audio-visual cognition based on genre and emotion.

As can be seen from Figure 3, the acoustic characteristics of music audio-visual are mainly composed of pitch, time value, timbre, and chord. These physical characteristics represent the individual’s understanding. The space-time characteristics are composed of rhythm, speed, and strength, which together constitute the individual’s memory points of music audio-visual. The semantic features are composed of musical form, mode, change, and expression, which establish the individual’s emotional tone toward music, audio, and video.

3.2. Extraction of LPC Characteristic Coefficients of Music Audio and Video

Classifying music audio-visual by high-level semantic information such as genre and emotion requires analyzing and processing its characteristics with the logistic regression algorithm. After the hierarchy of music audio-visual cognition is confirmed, obtaining the characteristics is the next step toward classifying genre and emotion. The spectral centroid is also called the dynamic center of energy. As a very important perceptual parameter of music, video, and sound, it can be calculated from the dynamic energy distribution. The required formula is as follows:

$$C_t = \frac{\sum_{n=1}^{N} n \, M_t[n]}{\sum_{n=1}^{N} M_t[n]}$$

where $N$ represents the number of audio acquisition points of the spectrum dynamic energy distribution, $M_t[n]$ represents the Fourier transform amplitude of audio dynamic energy frame $t$ at point $n$, and the frame index $t$ ranges over all integers from 1 to positive infinity.
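The spectral centroid described above can be computed directly from the magnitude spectrum of one frame. The sketch below assumes the common definition, the magnitude-weighted mean of the bin index with bins numbered from 1; the function name `spectral_centroid` and the convention of returning 0.0 for a silent frame are assumptions made for illustration.

```python
def spectral_centroid(magnitudes):
    """Magnitude-weighted mean bin index of one frame (bins are numbered from 1)."""
    total = sum(magnitudes)
    if total == 0:
        return 0.0  # silent frame: no energy, so the centroid is undefined; 0.0 by convention
    weighted = sum(n * m for n, m in enumerate(magnitudes, start=1))
    return weighted / total
```

A frame whose energy sits entirely in bin 3 has a centroid of 3, while a flat four-bin spectrum has a centroid of 2.5, the middle of the band.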

The music audio-visual energy signal is attenuated in transmission, so the attenuation cut-off frequency can reflect how the spectrum dynamic energy is distributed in the low-frequency band. It is an important parameter of the spectrum waveform. The definition formula of the attenuation cut-off frequency is as follows, where the symbols have the same meaning as in formula (6).
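One common way to compute such a cut-off is the spectral roll-off: the smallest bin below which a fixed fraction of the total spectral magnitude is concentrated. The sketch below illustrates that idea only. The fraction of 0.85 is an assumption borrowed from common practice and is not stated in this study, and the function name `spectral_rolloff` is hypothetical.

```python
def spectral_rolloff(magnitudes, fraction=0.85):
    """Smallest 1-based bin index whose cumulative magnitude reaches fraction * total.

    The default fraction (0.85) is an assumed, commonly used value.
    """
    total = sum(magnitudes)
    if total == 0:
        return 0  # silent frame: no cut-off bin exists
    target = fraction * total
    running = 0.0
    for n, m in enumerate(magnitudes, start=1):
        running += m
        if running >= target:
            return n
    return len(magnitudes)
```

A low roll-off bin indicates that most of the energy lies in the low-frequency band, which matches the role of the cut-off frequency described above.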

The amount of change in the spectrum dynamic energy distribution between two adjacent frames is called spectrum flow, which is a dynamic response to the characteristics of music audio-visual signals. Its calculation formula is as follows:

$$F_t = \sum_{n} \left( \hat{M}_t[n] - \hat{M}_{t-1}[n] \right)^2$$

where $\hat{M}_t[n]$ is the result of normalizing frame $t$ of the spectrum dynamic energy at point $n$.
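The spectrum flow between two adjacent frames can be sketched as the squared difference of their normalized spectra. The sketch below assumes that each frame is normalized so that its magnitudes sum to 1 (other normalizations are possible and the study does not specify one); the names `normalize` and `spectral_flux` are hypothetical.

```python
def normalize(magnitudes):
    """Scale a frame so its magnitudes sum to 1 (assumed normalization)."""
    total = sum(magnitudes)
    if total == 0:
        return [0.0] * len(magnitudes)
    return [m / total for m in magnitudes]


def spectral_flux(previous_frame, current_frame):
    """Sum of squared differences between two normalized adjacent frames."""
    if len(previous_frame) != len(current_frame):
        raise ValueError("frames must have the same number of bins")
    prev = normalize(previous_frame)
    cur = normalize(current_frame)
    return sum((c - p) ** 2 for p, c in zip(prev, cur))
```

Because of the normalization, a frame that is simply louder than its predecessor produces zero flow; only a change in the shape of the spectrum is measured.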

The zero crossing rate describes how often the music audio-visual signal changes sign within a short time. It is the total number of times that the sampled signal changes from positive to negative or from negative to positive in a short time. Its calculation formula is as follows:

$$Z_t = \frac{1}{2} \sum_{n=1}^{N} \left| \operatorname{sign}(x[n]) - \operatorname{sign}(x[n-1]) \right|$$

where $x[n]$ refers to the discrete music audio-visual signal and $\operatorname{sign}(x[n])$ is the sign value of $x[n]$. When $x[n]$ is positive, the sign value is 1, and when $x[n]$ is negative, the sign value is −1. Generally, the zero crossing rate of music audio-visual is low.
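Counting zero crossings takes only a few lines. The sketch below follows the sign convention given above (1 for positive, −1 for negative samples); treating a sample that is exactly zero as positive is an additional assumption, and the names `sign` and `zero_crossings` are hypothetical.

```python
def sign(value):
    """Return 1 for non-negative samples and -1 for negative ones (zero counted as positive)."""
    return 1 if value >= 0 else -1


def zero_crossings(samples):
    """Number of sign changes between consecutive samples in one frame."""
    pairs = zip(samples, samples[1:])
    # Each sign change contributes |1 - (-1)| = 2, hence the division by 2.
    return sum(abs(sign(b) - sign(a)) for a, b in pairs) // 2
```

Dividing the count by the frame length or frame duration turns it into a rate, if a rate rather than a raw count is needed.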

The predicted spectral coefficient LPCC used with the logistic regression algorithm is calculated through its relationship with the linear prediction coefficient LPC. Starting from index 1, it is obtained recursively according to the following three formulas, in which one symbol denotes the spectral coefficient of LPCC at a given index and another denotes the order of LPC. LPC represents a music audio-visual spectrum sequence as a linear combination of several parameters, in which the weight coefficients describe the essential characteristics of the sequence and are called LPC prediction coefficients. Each subsequent music audio-visual sample can be expressed as a weighted sum of the previous samples using the LPC prediction coefficients, as shown in the following formula.

The coefficients that minimize the prediction error are the LPC prediction characteristic coefficients. After obtaining these coefficients, we can use the logistic regression algorithm to establish a model for the automatic management of music audio-visual archives.
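The two steps described above, estimating the LPC coefficients that minimize the prediction error and converting them to LPCC, can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in this study: the coefficients are estimated with the autocorrelation method and the Levinson-Durbin recursion, the predictor convention is that each sample is approximated by a weighted sum of the previous samples, and the LPCC values follow the standard three-case recursion for that convention. The function names are hypothetical.

```python
def autocorrelation(signal, max_lag):
    """Return r[0..max_lag], where r[lag] = sum_i signal[i] * signal[i + lag]."""
    n = len(signal)
    return [
        sum(signal[i] * signal[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def lpc(signal, order):
    """Estimate a[1..order] with x[n] ~ sum_k a[k] * x[n-k] (Levinson-Durbin).

    Returns (coefficients, residual_error_energy).
    """
    r = autocorrelation(signal, order)
    a = [0.0] * (order + 1)  # a[0] is unused so indices match the formulas
    error = r[0]
    for i in range(1, order + 1):
        if error <= 0.0:
            break  # degenerate (e.g. all-zero) signal: stop early
        reflection = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / error
        updated = a[:]
        updated[i] = reflection
        for j in range(1, i):
            updated[j] = a[j] - reflection * a[i - j]
        a = updated
        error *= 1.0 - reflection * reflection
    return a[1:], error


def lpc_to_lpcc(coefficients, n_cepstra):
    """Convert LPC coefficients a[1..p] to cepstral coefficients c[1..n_cepstra]."""
    p = len(coefficients)
    c = [0.0] * (n_cepstra + 1)  # c[0] is unused
    for n in range(1, n_cepstra + 1):
        value = coefficients[n - 1] if n <= p else 0.0
        for k in range(max(1, n - p), n):
            value += (k / n) * c[k] * coefficients[n - k - 1]
        c[n] = value
    return c[1:]
```

For a first-order predictor with coefficient a, the recursion gives c_n = a^n / n, which is a convenient sanity check on the sign convention.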

4. Establishment of Logistic Regression Model of Music Audio-Visual Archive Resources

The LPC prediction characteristic coefficient is substituted into the logistic regression algorithm to establish the music audio-visual file resource model. The steps of data training for a music audio-visual library are shown in Figure 4.

As can be seen from Figure 4, the original music audio-visual data are first randomly divided into two parts, and data set 1 is trained by the logistic regression algorithm, which yields a strong classifier composed of several data subsets. The main decision in using metadata mapping is how to represent the information in the metadata in terms of the running code. Because dynamic code generation is not available, modifying the mapping requires the software to be recompiled and redeployed. The metadata are obtained and transformed into a programming language structure, which then drives the code to generate output or reflection mapping. The LPC prediction feature coefficients are then used, and the strong classifier is re-encoded to form a new strong association set. Each subset is reconstructed using the output vector to form new music audio-visual features for use in the logistic regression model. Data set 2 is reconstructed directly using the output vector and then merged with data set 1. Finally, the combined data set is classified into two categories to predict the classification and search behavior of the managers and users of the music audio-visual data set.
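The first steps of this training procedure, a random split followed by training a binary logistic regression classifier on one part, can be sketched as follows. This is a simplified illustration: it uses plain batch gradient descent on the log-loss, omits the re-encoding and reconstruction steps shown in Figure 4, and all function names, the learning rate, the epoch count, and the split fraction are assumptions rather than values from this study.

```python
import math
import random


def sigmoid(z):
    """Overflow-safe logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def split_dataset(samples, fraction=0.5, seed=0):
    """Randomly divide samples into two parts (fraction and seed are assumed values)."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * fraction)
    return shuffled[:cut], shuffled[cut:]


def train_logistic_regression(rows, labels, learning_rate=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            z = bias + sum(w * v for w, v in zip(weights, x))
            error = sigmoid(z) - y  # gradient of the log-loss w.r.t. z
            for j in range(n_features):
                grad_w[j] += error * x[j]
            grad_b += error
        weights = [w - learning_rate * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(rows)
    return weights, bias


def predict(weights, bias, x):
    """Return 1 if the predicted probability is at least 0.5, else 0."""
    z = bias + sum(w * v for w, v in zip(weights, x))
    return 1 if sigmoid(z) >= 0.5 else 0
```

In use, the feature rows of data set 1 would be passed to `train_logistic_regression`, and `predict` would then be applied to data set 2; for ten genres, one such binary classifier per genre (one-vs-rest) is a common extension.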

5. Management and Optimization Results of Music Audio-Visual Archive Resources Based on the Logistic Regression Model

After establishing the model of logistic regression algorithm, this study selects the database of a music software for analysis. First, the results shown in Figure 5 can be obtained through genre and emotion classification.

The classification by the logistic regression model shows that, in the music audio-visual database used in the experiment, rock accounts for the largest share of the genre classification at 44%, pop accounts for 18% of the total data, country music accounts for 17%, hip hop accounts for 14% of all samples, and the remaining 7% is the collection of the other five types. Classified by emotion, 51% of the music audio-visual samples are sad, 26% are angry, 14% are calm, and 9% are happy. Unlike traditional song information classification methods, these two kinds of advanced classification allow users to search music audio-visual file resources more effectively and greatly optimize those resources.

The logistic regression algorithm also needs to analyze the user attribute characteristics of the music audio-visual resource database and automatically filter the many attribute characteristics, so the experiment analyzes the attributes of music audio-visual. Since many music audio-visual attributes are consistent across files, the results obtained after screening and analyzing the database according to the established logistic regression model are shown in Figure 6.

As can be seen from Figure 6, the logistic regression model yields 8 samples with common characteristic attributes of music audio and video. These eight music audio-visual files share four common dynamic attribute characteristics: attr1, attr2, attr3, and attr4. This shows that although music audio-visual may have many attributes, managers and users really pay attention to only some of them.

To analyze the experimental results of the logistic regression algorithm more intuitively, this study conducts a comparative experiment between music audio-visual recommendation using the logistic regression algorithm and recommendation using the nonstandard collaborative filtering algorithm. The two methods analyze the music audio-visual data of the same sample, and the results are shown in Figure 7.

The left side of Figure 7 compares accuracy. The figure shows that the accuracy of the recommendation by the logistic regression model is significantly higher than that of the nonstandard collaborative filtering recommendation. The right side of Figure 7 compares the recall rates obtained by the two algorithms. As the music audio-visual recommendation list grows, the recall rate of both algorithms gradually increases. When the recommendation list has 45 items, an acceptable recall rate is reached. At this point, the recall rate of music audio-visual recommendation using the logistic regression model is 4.5%, while that of nonstandard collaborative filtering is about 6.3%. The comparison shows that the recommendation effect after filtering by the logistic regression model is better than that of the nonstandard collaborative filtering recommendation system.
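Accuracy and recall for a recommendation list of a given length can be computed as in the sketch below. It assumes the usual top-K definitions: precision is the share of the first K recommended items that the user actually likes, and recall is the share of all liked items that appear among the first K. The function name `precision_recall_at_k` and the example item identifiers are hypothetical.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Return (precision, recall) for the first k items of a recommendation list."""
    relevant_set = set(relevant)
    top_k = list(recommended)[:k]
    hits = sum(1 for item in top_k if item in relevant_set)
    precision = hits / len(top_k) if top_k else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


if __name__ == "__main__":
    # Recall can only stay equal or grow as the list gets longer.
    ranked = ["song_a", "song_b", "song_c", "song_d"]
    liked = {"song_b", "song_d", "song_e"}
    for k in (1, 2, 3, 4):
        print(k, precision_recall_at_k(ranked, liked, k))
```

Since lengthening the list can only add hits, recall is non-decreasing in K, which is consistent with the rising recall curves described for Figure 7.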

Music audio-visual archive data have three-dimensional characteristics. In order to optimize management, the dimension of the data is reduced so that the database can be represented by one-dimensional data. This experiment uses the top two genres with the highest proportion, rock and pop, as examples, and the results are shown in Figure 8.

Figure 8 shows the distribution of the rock and pop genres in terms of the timbre and listening characteristics of the music audio-visual samples after dimension reduction using the logistic regression algorithm. The abscissa is the timbre feature of the music after dimension reduction, and the ordinate is the listening feature of the music after dimension reduction. The large sample points in the figure represent the centroids of the sample categories. The figure shows that, for most music audio-visual data samples, rock and pop music can be characterized by the sigmoid function through the timbre and listening characteristics after dimension reduction.

The experiment uses the logistic regression algorithm architecture to build the music audio-visual resource management system, which is then installed in a real music software system. After the system had been in use for a period of time, feedback was collected from file managers and users, and the results are shown in Figure 9.

It can be seen from the data in Figure 9 that 67.5% of managers and users actually feel the improvement of the new system in the management of music audio-visual archives, and 62.9% of users think the optimization effect is obvious. At the same time, 33.6% of the respondents said that the system needs to be further improved. This shows that the model based on the logistic regression algorithm optimizes the resource management of music audio-visual archives.

6. Conclusion

Through a series of experiments, this study obtains a logistic regression model that effectively optimizes the management of music audio-visual archives. The specific experimental conclusions are as follows. Unlike traditional song information classification methods, the advanced classification by genre and emotion allows users to search music audio-visual archive resources more effectively and greatly optimizes those resources. Although music audio and video may have many attributes, managers and users really pay attention to only some of them. The comparison shows that the recommendation effect of the logistic regression model is better than that of the nonstandard collaborative filtering recommendation system. For most music audio-visual data samples, rock and pop music can be characterized by an S-shaped function after reducing the dimension of the music timbre and auditory features. Finally, the questionnaire survey of users indicates that the model based on the logistic regression algorithm optimizes the resource management of music audio-visual archives well. However, the logistic regression algorithm also has shortcomings in mining efficiency, which need to be further improved. There is also a risk of data leakage, and music audio-visual managers and users should make rational use of the analysis results.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The author declares that there are no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported by the Innovative Talent Training Project of Hei Longjiang Province from the Education Department of the Hei Longjiang Government: The study of musical theater creation depends on the traditional folk songs of E Lunchun Minority Group of Huma Hei Longjiang (No. UNPYSCT-2018111), and by the Fundamental Project of the Education Department of the Hei Longjiang Government: The study of promotion of pop song genre about the traditional folk songs of E Lunchun Minority Group (No. 135109542).