Abstract

Since the outbreak of the COVID-19 pandemic, cloud computing and voice recognition services have played an increasingly critical role in enterprise supply chain management. Speech emotion recognition technology identifies what a speaker says and the emotion behind it from the content and tone of speech. Inferring the attitude and tone behind a statement from language and high-level information is very difficult. Cloud computing technology is becoming more widely used each day and is the cornerstone of enterprise information development. The anti-globalization trend caused by COVID-19 has made enterprises pay increasing attention to supply chain management, and determining how to optimize and integrate the supply chain has become an urgent problem. Building on prior work in this field, this paper first analyzes the algorithms and models used in speech recognition and then tests the system. To improve the utilization of enterprise resources and industrial benefits, this paper then proposes an optimization scheme for the enterprise supply chain that spans infrastructure construction, product development, backstage management, and procurement.

1. Introduction

During COVID-19, national or regional containment measures are likely to disrupt domestic and international supply chains. Parts processing is scattered across different regions, and the relevant emergency policies vary from region to region, which can easily lead to plant disruptions. A problem in one part of the supply chain directly affects the operation of the upstream and downstream processes. The global spread of COVID-19 has made it more difficult for companies to survive. Competition among enterprises has shifted from product competition to supply chain competition. Customers can be attracted only by providing suitable products and satisfactory services. The rise of cloud computing technology offers a way to address enterprise supply chain optimization: service systems built on cloud computing can improve supply chain efficiency.

The essence of language is communication between people. Speech recognition extracts required elements from speech information. Over the past decades, speech recognition technology has been studied in many countries and has made progress, and many speech recognition products now appear in people’s daily lives. Products on the market can analyze speech signals, but there is little research on the speaker’s attitude. This gap matters especially in Chinese, where the same words spoken in different tones convey different emotions. Traditional speech recognition methods share this defect: because they cannot distinguish between emotions, part of the information being expressed is lost.

To add to the current research, this paper studies how to establish a speech emotion recognition system based on acoustic properties. Building on previous work on speech recognition, it explains the algorithms involved in detail, analyzes the establishment of a new model, and tests the system to verify its performance and reliability.

Scholars began studying vocal emotion recognition in the 1980s. An early study was based on an acoustic system for emotion recognition [1]. This field aims for computers to recognize the emotion behind the text. Sung-Woo and Seok-Pil suggest that human emotions can be expressed through physiological signals, body parts, verbal signals, and facial expressions [2]. Gwanggil et al. suggest that speech emotion recognition can also be applied to e-commerce [3]. After 2000, with the development of computer application technology, intelligent devices were welcomed by the public [4], and research on speech emotion recognition became more meaningful. The development of artificial intelligence technology and computer Internet technology provides the basis for its study [5]. Presently, research mainly focuses on the identification of, and differences between, emotions in speech. The literature proposes that emotions can be divided into five types and that the characteristic frequencies of different emotions can be extracted as parameters [6].

In some cases, a BS classifier is used for classification, followed by regression analysis for identification; identification accuracy can reach 3/5 (60%) [7]. Speech emotion recognition has also been applied to call centers, with energy characteristics used as parameters for comparison [8]. Experiments have also shown that different machine learning methods perform differently when identifying emotions, and that neural networks can be used to obtain the best results [9].

The supply chain initially consisted of purchasing and manufacturing departments and later included sales and statistics departments. The supply chain is an essential part of an enterprise, and supply chain management is integral to an enterprise’s strategic positioning and efficiency improvement [10]. Therefore, a company that attaches importance to supply chain management will maintain an advantageous position in the competition. Some studies regard the supply chain as a business process model composed of various value chains, each of which can be matched to user needs [11]. Enterprises should choose suppliers carefully, for example by taking note of the bullwhip effect in the clothing supply chain [12]. Managing the brand supply chain of garment enterprises is critical [13]. This is especially true for inventory management in the supply chain [14] and the supply chain management model [15]. Early studies on the combination of computer information technology and supply chains have found that an intelligent supply chain has good business value [16]. The continuous progress of logistics big data operation models has an important impact on the supply chain [17].

3. Voice Identification and Cloud Computing Services in the Intelligent Application of Enterprise Supply Chain Management

Voice identification and cloud computing services globalize markets and providers through technology, from which companies derive maximum benefit.

3.1. Speech Emotion Recognition
3.1.1. Detailed Design of Feature Parameter Extraction Module

Figure 1 shows the architectural diagram of parameter extraction.

As Figure 1 shows, the features are extracted once all modules have executed successfully.

3.1.2. Module Process

The system will extract feature parameters in a particular order, as shown in Figure 2.

3.1.3. Key Technologies

The first step of speech emotion recognition is to analyze the speech signal. The emotional feature parameters carried by the speech are analyzed, and a suitable model is then used to assign them to emotion categories and so identify the type of emotion. Therefore, the characteristic parameters of the speech signal must be extracted before speech emotion recognition can be performed.

Recognizing and processing speech signals is a central and very complex task. The processing is carried out through either time-domain analysis or frequency-domain analysis. In signal processing, models can be developed using model-based or non-model-based approaches; in general, a combination of both is needed.

3.1.4. Preprocessing Module

Because computers cannot analyze continuous speech signals directly, the speech signal is segmented as soon as it is received. The speech signal is digitally encoded, and header information is then added to divide the continuous speech signal into multiple frames [18]. Previous studies have found that the average power of speech signals can be adversely affected by various factors. In particular, the high-frequency components of the speech signal are weakened by glottal excitation and can be lost. To solve this problem, pre-emphasis is applied to boost the high frequencies while preserving the low-frequency part. In this way, the spectrum of the whole signal becomes flatter, and the attenuation is no longer apparent. A filter with the following formula is used:
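
Pre-emphasis is commonly implemented as a first-order high-pass filter. The form below is the standard textbook one, given here as an assumption; the coefficient \mu is not stated in the text and is typically chosen close to 1 (values around 0.95 are common).

```latex
H(z) = 1 - \mu z^{-1}, \qquad y(n) = x(n) - \mu\, x(n-1), \qquad 0.9 \le \mu \le 1.0
```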

Speech signals are usually stable only over a short period, so they need to be split into short frames for subsequent processing. Generally, frames within 100 are classified. Adjacent frames overlap to some extent to ensure a smooth transition between the separated speech segments. Each frame is weighted with a rectangular window or an artificial window.
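
The framing step can be sketched as follows. This is a minimal illustration, not the system's actual implementation: the function name `frame_signal` and the parameters `frame_len` and `hop_len` (both in samples) are hypothetical, and dropping the incomplete final frame is a simplifying assumption.

```python
def frame_signal(samples, frame_len, hop_len):
    """Split a sequence of samples into overlapping frames.

    frame_len and hop_len are given in samples; choosing
    hop_len < frame_len makes neighbouring frames overlap.
    Trailing samples that do not fill a whole frame are dropped.
    """
    if frame_len <= 0 or hop_len <= 0:
        raise ValueError("frame_len and hop_len must be positive")
    frames = []
    start = 0
    while start + frame_len <= len(samples):
        frames.append(list(samples[start:start + frame_len]))
        start += hop_len
    return frames


# Example: 10 samples, frames of 4 samples, hop of 2 samples (50% overlap).
frames = frame_signal(list(range(10)), frame_len=4, hop_len=2)
```

With a hop of half the frame length, every sample except those at the edges appears in two frames, which is what keeps the frame-level features from changing abruptly.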

The method of the rectangular window is as follows:
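
The rectangular window of length N has the usual definition below; the symbols w(n) and N are notation assumed here.

```latex
w(n) =
\begin{cases}
1, & 0 \le n \le N-1 \\
0, & \text{otherwise}
\end{cases}
```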

The method of the Hamming window is
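
Reading the window named here as the Hamming window (an assumption; the closely related Hann window differs only in its coefficients, 0.5 and 0.5), the usual definition for a frame of length N is:

```latex
w(n) =
\begin{cases}
0.54 - 0.46 \cos\!\left( \dfrac{2\pi n}{N-1} \right), & 0 \le n \le N-1 \\
0, & \text{otherwise}
\end{cases}
```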

The continuous speech signal becomes a one-dimensional array after various operations are completed. In the time-domain analysis process, the speech signal parameters are more intuitive, smaller in volume, and easier to extract and analyze. Therefore, the time domain parameters of the speech signal are extracted for the time-domain analysis.

The short-time energy can be expressed as
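
In the standard formulation, assumed here, the short-time energy of the frame starting at sample n is the sum of the squared windowed samples, where x_n(m) = w(m) x(n+m) and N is the frame length:

```latex
E_n = \sum_{m=0}^{N-1} x_n^{2}(m)
```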

The short-time average amplitude function can also be used to characterize short-time energy; its formula is as follows:
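
Using the same assumed notation (x_n(m) for the windowed samples of a frame of length N), the short-time average amplitude replaces the square with an absolute value, which makes it less sensitive to large samples:

```latex
M_n = \sum_{m=0}^{N-1} \left| x_n(m) \right|
```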

Usually, when people are excited or angry, their speech is louder than when they speak in calm or neutral states. When people are unhappy, they are very quiet. This means that people’s emotions can be characterized in terms of short-time energy and short-time average amplitude.
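
A minimal sketch of how these two features separate loud from quiet frames is given below. The function names and the sample values are hypothetical and chosen only for illustration; the frames are assumed to be already windowed.

```python
def short_time_energy(frame):
    """Sum of squared samples in one (already windowed) frame."""
    return sum(x * x for x in frame)


def short_time_amplitude(frame):
    """Sum of absolute sample values in one (already windowed) frame."""
    return sum(abs(x) for x in frame)


# Illustrative frames: one with large swings, one close to silence.
loud = [0.8, -0.9, 0.7, -0.8]
quiet = [0.1, -0.1, 0.05, -0.1]

loud_is_stronger = (
    short_time_energy(loud) > short_time_energy(quiet)
    and short_time_amplitude(loud) > short_time_amplitude(quiet)
)
```

Because both features scale with recording level, they are usually compared within one recording or after normalization rather than across files.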

In addition, the quality of the recording file may cause differences in the overall amplitude of the sound, resulting in a variety of average energies in a short period. The average energy formula needs to be changed to eliminate the differences.

The short-term rate of the speech signal is defined as follows:
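
If this quantity is read as the short-time zero-crossing rate, which is an interpretation and not something the text states, it can be sketched as the fraction of neighbouring sample pairs whose signs differ. The function name and the convention of treating zero as positive are assumptions.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs in a frame whose signs differ.

    Zero is treated as positive, which is one common convention.
    """
    if len(frame) < 2:
        return 0.0

    def sign(x):
        return 1 if x >= 0 else -1

    crossings = sum(
        1 for prev, cur in zip(frame, frame[1:]) if sign(prev) != sign(cur)
    )
    return crossings / (len(frame) - 1)


# A rapidly alternating frame crosses zero between every pair of samples.
alternating_rate = zero_crossing_rate([1, -1, 1, -1])
```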

To minimize the impact of the environment on the recording file, some judgments can be added, and the waveform can be analyzed during correlation analysis. The speech signal is defined as follows:

The short-time autocorrelation function is an even function, is periodic when the signal is periodic, and reaches its maximum when K is 0.
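
For reference, the short-time autocorrelation of a windowed frame is commonly written as below. The notation is assumed here: x_n(m) is the windowed frame of length N and the lag is written as a lowercase k. The second line restates the evenness and maximum-at-zero properties.

```latex
R_n(k) = \sum_{m=0}^{N-1-k} x_n(m)\, x_n(m+k), \qquad 0 \le k \le N-1

R_n(k) = R_n(-k), \qquad R_n(0) \ge \left| R_n(k) \right|
```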

Through the short-term autocorrelation function, the peak position can be determined, the period of the signal can be established, and the frequency of the voiced signal can be calculated. However, the number of terms in the sum gradually decreases as the lag grows. As a result, the position of the peak of the function may no longer correspond closely to the period. To solve this problem, a modified short-term autocorrelation function is introduced:
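
The textbook form of the modified short-time autocorrelation, given here as an assumption about the intended definition, keeps the number of terms fixed at N by taking the shifted samples from a longer window of length N + K, where K is the largest lag of interest:

```latex
\hat{R}_n(k) = \sum_{m=0}^{N-1} x_n(m)\, x'_n(m+k), \qquad 0 \le k \le K

x_n(m) = x(n+m), \; 0 \le m \le N-1, \qquad x'_n(m) = x(n+m), \; 0 \le m \le N-1+K
```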

The authors use an endpoint detection algorithm to separate the voiced part from the silent part of each speech signal and so increase the proportion of the voiced part.
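
The text does not say which endpoint algorithm is used. As a minimal sketch of the idea, assuming a simple energy threshold, frames can be marked as voiced when their energy exceeds a fraction of the loudest frame, and leading and trailing silence can then be trimmed. The function names and the `threshold_ratio` value are hypothetical.

```python
def detect_voiced_frames(frames, threshold_ratio=0.1):
    """Flag frames whose energy exceeds a fraction of the peak frame energy.

    threshold_ratio is an illustrative value, not taken from the paper.
    """
    energies = [sum(x * x for x in frame) for frame in frames]
    peak = max(energies) if energies else 0.0
    if peak == 0.0:
        return [False] * len(frames)
    return [energy > threshold_ratio * peak for energy in energies]


def trim_silence(frames, flags):
    """Keep only the frames between the first and last voiced frame."""
    voiced = [i for i, is_voiced in enumerate(flags) if is_voiced]
    if not voiced:
        return []
    return frames[voiced[0]:voiced[-1] + 1]


# Four two-sample frames: near-silence, speech, speech, silence.
example_frames = [[0.0, 0.01], [0.5, -0.6], [0.7, 0.4], [0.0, 0.0]]
flags = detect_voiced_frames(example_frames)
speech_only = trim_silence(example_frames, flags)
```

Practical detectors often combine energy with a second cue such as the zero-crossing rate, because low-energy unvoiced consonants can fall below an energy-only threshold.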

The vibration frequency of the vocal cords is the fundamental frequency, and its extraction has recently become a research focus. Because human pitch periods span a wide range and other noise is present in the channel, the human voice signal is not completely periodic, which makes the fundamental frequency difficult to extract. The authors implemented the autocorrelation or descent method and the average amplitude function to calculate the fundamental frequency.
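
A minimal sketch of autocorrelation-based fundamental frequency estimation is shown below. It uses the plain short-time autocorrelation, picks the lag with the largest value inside a search range, and converts that lag to a frequency. The function names, the 60 to 500 Hz search range, and the synthetic 200 Hz test tone are all assumptions made for illustration.

```python
import math


def autocorrelation(frame, lag):
    """Plain short-time autocorrelation of one frame at the given lag."""
    return sum(frame[m] * frame[m + lag] for m in range(len(frame) - lag))


def estimate_pitch(frame, sample_rate, f_min=60.0, f_max=500.0):
    """Estimate the fundamental frequency in Hz, or return None.

    The search range f_min..f_max is illustrative, not from the paper.
    """
    min_lag = max(1, int(sample_rate / f_max))
    max_lag = min(len(frame) - 1, int(sample_rate / f_min))
    if max_lag < min_lag:
        return None
    best_lag = max(
        range(min_lag, max_lag + 1),
        key=lambda lag: autocorrelation(frame, lag),
    )
    if autocorrelation(frame, best_lag) <= 0:
        return None  # no positive peak, so treat the frame as unvoiced
    return sample_rate / best_lag


# Synthetic voiced frame: a 200 Hz sine sampled at 8 kHz (period 40 samples).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(400)]
pitch = estimate_pitch(tone, sample_rate)
```

On real speech the peak is less clean than for a pure tone, which is why center clipping or the modified autocorrelation is often applied first.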

Based on previous studies on the autocorrelation function, a modified autocorrelation method is used to extract the pitch frequency and to ensure that the number of terms is not reduced in the extraction process. The current defect is that the product in every term has to be calculated, which costs redundant computation time, so the autocorrelation method needs to be improved in the future.

The average amplitude function is shown as follows:
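
In pitch extraction, this quantity is usually the average magnitude difference function (AMDF). Assuming that is what is meant, it replaces the products of the autocorrelation with absolute differences, and it shows dips rather than peaks at multiples of the pitch period. The notation (x_n(m) for a frame of length N, k for the lag) is assumed here.

```latex
F_n(k) = \sum_{m=0}^{N-1-k} \left| x_n(m+k) - x_n(m) \right|
```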

The method eliminates the influence of other factors by separating the excitation source from the channel response, which are combined in the acoustic signal by convolution.

The response of the filter is

Its refrigeration is

which is subject to

Fisher’s classification method was used to distinguish unvoiced from voiced sounds. Here, the authors took as many sample points as possible because more points benefit the rest of the study:

After obtaining the extreme value, its projection and limit can be determined:

Speech rate is the number of words a person speaks per unit of time. The speed of speech is often affected by emotions. Generally speaking, people tend to speak faster when angry or excited and slower when they are sad. Therefore, speech rate can be used as a parameter input into the deterministic model to help identify human emotions, which further improves recognition accuracy. The wavelet transform is used to segment speech, and the specific process is as follows.

The wavelet transform applied to the signal can be either discrete or continuous. Because wavelet analysis closely resembles the way the human auditory system analyzes sound, it reflects human perception of sound more realistically. The expanded discrete wavelet transform is shown as follows.

The short-term rate of the speech signal is defined as

which includes the following:

When a signal passes through a channel, resonance occurs, which strengthens the energy in certain frequency regions. Each such region is called a formant. In practical applications, only three formants are needed for analysis.

Predictive coding in a linear manner can be understood as follows:

Its error function is
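
In standard notation, which is assumed here, linear prediction estimates the current sample s(n) as a weighted sum of the previous p samples with coefficients a_i, and the error is the difference between the actual and predicted sample:

```latex
\hat{s}(n) = \sum_{i=1}^{p} a_i\, s(n-i), \qquad
e(n) = s(n) - \hat{s}(n) = s(n) - \sum_{i=1}^{p} a_i\, s(n-i)
```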

The prediction parameters are determined by minimizing the error function, and the short-time speech signal is treated as the response of a linear time-invariant system to a periodic pulse signal. The system function is as follows:

In general, in speech signal processing, the all-pole model is used, which is
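
The model referred to here is usually written in the all-pole form below; the gain G, the order p, and the coefficients a_i are notation assumed for illustration.

```latex
H(z) = \frac{G}{1 - \sum_{i=1}^{p} a_i z^{-i}}
```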

Linear predictive analysis is needed in modeling because the predicted sequence deviates from the actual speech samples. The mean square error is defined as

Setting the derivative with respect to each prediction coefficient to 0 shows that the prediction error and the speech signal samples are orthogonal.

The derived prediction factor is

Its autocorrelation function is defined as

The autocorrelation function satisfies the following:

Then,

It is then expanded to become the following:
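
The expanded equations form a linear system in the prediction coefficients. As a minimal sketch, assuming the autocorrelation formulation, the Levinson-Durbin recursion is one standard way to solve it. The function names are hypothetical, and the recursion is a textbook technique rather than a method the paper confirms using. The convention is that the returned coefficient at index i-1 multiplies s(n-i) in the prediction.

```python
def autocorr(signal, max_lag):
    """Autocorrelation values R(0)..R(max_lag) of a finite signal."""
    n = len(signal)
    return [
        sum(signal[m] * signal[m + k] for m in range(n - k))
        for k in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve sum_j a_j R(|i-j|) = R(i), i = 1..order, for the a_j.

    Returns (coefficients, error), where error is the remaining
    prediction error energy after the final recursion step.
    """
    a = [0.0] * (order + 1)  # a[0] is unused so indices match the math
    err = r[0]
    for i in range(1, order + 1):
        if err == 0:
            break  # signal is perfectly predictable (or all zeros)
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient for this order
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


# Autocorrelation values chosen by hand for a small order-2 example.
coeffs, residual = levinson_durbin([1.0, 0.5, 0.1], order=2)
```

The recursion needs on the order of p squared operations, compared with p cubed for general elimination, which is why it is the usual choice for frame-by-frame analysis.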

3.2. Supply Chain Management
3.2.1. Supply Chain Operation Reference Model

The Supply Chain Council developed the standard supply chain operations reference (SCOR) model. The model is divided into three levels, each addressing a different part of the enterprise supply chain’s functions. The model consists of five basic processes. The first is planning, which balances supply and demand and develops a good solution so that the following processes can proceed normally. The second is procurement, which obtains the raw materials needed to carry out the solution. The third is production, which turns the inventory into products. The fourth is distribution, in which the product is delivered to the customer according to the order. The fifth is returns, which allows customers to return goods. The process is shown in Figure 3.

At the configuration level, the supply chain is configured according to the different processes and adjusted according to specific policies. At the process element level, the company’s competencies are defined, along with a description of each process.

SCOR analysis starts from the current state of the processes and describes the desired future state. The enterprise’s supply chain is benchmarked against those of its peers to set the next target, and the gap between the two is quantified so that it can be closed.

3.2.2. Supplier Management Inventory

In inventory management, it is necessary to consider suppliers and manage and control the needs of customers. The cost in the management process should be reduced as much as possible. Establishing an agreement with the partner and implementing the inventory management policy in strict accordance with the agreement is necessary. It is even more necessary to break the traditional inventory management mode, adopt the idea of integration, and dynamically change according to market changes. The inventory management of suppliers is characterized by collaboration, so a positive attitude should be taken when dealing with problems.

4. Speech Emotion Identification and Cloud Computing Services in the Intelligent Application of Enterprise Supply Chain Management

4.1. Cloud Computing Service Model

As can be observed in Figure 4, the implementation of the cloud computing service model has gradually increased in recent years, with the proportion of offline sales steadily decreasing.

4.2. Cloud Computing Service Model

From Figure 5, it can be understood that clothing enterprises are the core of the supply chain. When clothing enterprises manage the supply chain, each node needs to be active and become a partner with other processes, where the final cooperation of all links guarantees the success of the whole supply chain.

In the supply chain of garment enterprises, some enterprises outsource noncore business to other companies and keep only the core part in-house. This lets them concentrate their energy on design and sales, which helps them avoid risks and reduce investment, thus protecting their core competitiveness.

4.3. Enterprise Supply Chain Management System

A supply chain management system is established with an information processing system as the core. The main links involved in the supply chain are shown in Figure 6.

Product lifecycle management (PLM) has been applied in many fields and, in the supply chain of garment enterprises, mainly supports product R&D. Table 1 compares the advantages of PLM with those of ERP.

The system can be customized on demand for each customer or given a more flexible configuration, and staff can access it at any time from anywhere.

4.4. Enterprise Supply Chain Management Optimization

The problems of apparel company supply chains are shown in Table 2.

A collaborative forecast with suppliers can be made from three aspects. First, popular trends can be used for prediction: future fashion trends can be evaluated according to past experience, past demand, and current market demand. A fashion trend is the style prevailing in a certain period, which reflects not only people’s preferences but also the style of the times. Sources of forecast information are shown in Table 3 and Figure 7.

An inventory management performance system is established to improve distribution efficiency, reduce inventory accumulation, and improve the efficiency of the whole logistics, as shown in Table 4.

In the supply chain inventory management, the core suppliers are selected so that some upstream suppliers can pay for the orders through the network platform and improve the operability of warehouse management, as shown in Table 5.

5. Conclusion

The innovation of this research is that, when the speech emotion recognition method is used for signal processing, the characteristic parameters of the speech file are extracted first and then used both as the input of the model and as the basis of speech segmentation. With the continuous development of speech recognition technology, pattern recognition based on cloud computing and the accuracy of emotion recognition are also constantly improving. This study combines different signal processing methods, weighs their accuracy and complexity, and selects the most appropriate parameters and algorithms. A support vector machine is used as the classifier, and the complex emotional features behind the speech are finally recognized. A new intelligent clothing system can also be developed from the service mode of cloud computing and supply chain management theory. At the same time, the current situation of the supply chain of garment enterprises was analyzed, and its future development was discussed. Clothing enterprises should analyze and solve their existing problems in areas such as development, inventory, and purchasing.

The supply chain of clothing enterprises should be accelerated in the following aspects: extending the industrial chain, constructing a whole-industry-chain service system to efficiently meet the various procurement needs of customers involved in the supply of garments, and realizing the efficient integration of all production processes within the same industrial zone. Clothing enterprises should upgrade from labor-intensive to capital-intensive production, replace labor with market-leading production equipment, and improve production efficiency. They should enhance their design capabilities, create their own intellectual property, and shorten product lead times and new product launch cycles to reduce customers’ costs and improve their sales efficiency [19].

Data Availability

The datasets used and analyzed within the present study are available from the corresponding author upon reasonable request.

Conflicts of Interest

The authors declare that they have no potential conflicts of interest concerning this article’s research, authorship, or publication.

Acknowledgments

This project was supported by the National Social Science Foundation of China (Grant no. 20VYJ026).