Abstract

In this paper, we conduct in-depth research on enterprises, focusing on the problems they face in processing and using their data. Drawing on an extensive collection and comprehensive survey of domestic and foreign research on all aspects of enterprise data processing and use, we establish a deep learning model and determine the research directions. First, in view of the increasing complexity and dimensionality of enterprise data and the difficulty of putting such data to use, this paper studies the relevant data preprocessing methods. Second, aiming at the problems of enterprise cost control and customer relationship management, this paper studies prediction based on enterprise data through the analysis of practical problems and the processing of the corresponding data. Finally, in order to improve the efficiency and scientific soundness of enterprise management, this paper studies evaluation based on enterprise data. The model is verified through simulations and compared with several alternatives, namely cross hybrid and sequential hybrid models. Under certain assumptions, the results confirm that the accuracy of the single-model deep learning structure is higher than that of the cross hybrid model but lower than that of the sequential hybrid model.

1. Introduction

With the continuous improvement of social technology, especially the rapid growth and extensive application of information technology (IT), big data, deep learning, Internet technology, communication technology, computer technology, and other technologies, the data generated in various fields shows an explosive, exponential growth trend. At the same time, the form of data is becoming increasingly complex, people have more and more ways and means to obtain data, the cost of data acquisition has dropped accordingly, and the speed of data acquisition has greatly accelerated. According to a survey report by the International Data Corporation (IDC), the amount of data generated worldwide was approximately 0.49 ZB in 2008 and roughly 0.8 ZB in 2009; the figure reached approximately 1.2 ZB in 2010 and 1.82 ZB in 2011. The amount of data generated in 2011 was therefore about 3.7 times that of 2008. The company predicted that by 2020 the global amount of data would reach 35 ZB. In 2013, IT168, together with ITPUB and CHINAUNIX, surveyed the monthly volume of new data at a number of enterprises and found that 18.11% of enterprises added more than 500 GB of new data per month, a year-on-year increase of 8.64% compared with the 12.67% recorded in 2012.

Although there is still a large gap between these figures and the predicted growth rate, on the whole the growth of data is enormous, especially for Internet enterprises. For example, Baidu, the world's largest Chinese search engine, generates up to 10 TB of new data every day and needs to process as much as 100 PB of data daily; Tencent had 800 million users in 2013, the data stored in its data warehouse had reached 4,400 sets of single-cluster data, and its daily new data amounted to roughly 200 TB to 300 TB. Beyond the Internet, the data volumes of the telecommunications, manufacturing, financial, power, and other industries have also reached the PB level. According to a GSMA forecast, global mobile data traffic would grow at a compound annual rate of 50% from 2012 to 2018. With the continuous growth of data, the problem of data processing and application has quickly attracted great attention. In recent years, global academia, industry, and governments have paid close attention to data processing technology, setting off a research upsurge similar to that around the information highway in the 1990s. Some developed European and American countries have even put forward a series of data research plans at the level of national science and technology strategy and national security. In 2007, Turing Award winner Jim Gray described "data-intensive scientific discovery" in a speech, expanding scientific research from three paradigms to four.

Data science has become a new science after experimental science, theoretical science, and computational science. It can be seen that data plays a very important part in the current growth of science and technology. In the face of huge volumes of data, how to use and process data has undoubtedly become one of the primary problems enterprises must solve. Using the data of the corresponding industry for prediction is an effective application of data processing. In 2009, the influenza A (H1N1) virus appeared and spread rapidly within just a few weeks. However, by drawing on its huge data resources together with appropriate data processing methods and mathematical models, Google was able to determine very promptly where and how the flu was spreading, winning valuable time for public health institutions in epidemic control. The rational and effective use of data not only serves the public but also brings great commercial value to enterprises and industries, thereby improving their economic benefits. Data is therefore also known as "the new oil of the future," and future competition will focus on the possession, control, and application of data.

Although data has been put to extensive use in many enterprises, due to the unbalanced development of information technology, many domestic enterprises still cannot enjoy the high added value that data can bring, and their use of data remains at an initial stage. At present, the situation in many enterprises is that they have spent heavily on building information systems and have accumulated a large amount of historical data, but these data have not been put to good use; some data are still effectively "sleeping." One of the purposes of this paper is to make these data play their due role. In surveys of several enterprises, a common problem was found: faced with the large amount of data they have accumulated, enterprises do not know how to use it. Data preservation and maintenance then increase operating costs and turn the data into a burden on the enterprise.

At present, the ways in which data is generated are ever-changing and the forms in which it exists are diverse. Data often needs to be processed before use; otherwise, correlation between data dimensions or incompleteness within the data will affect the accuracy of the final result. Moreover, data structure may be linear or nonlinear, and different methods should be used to process data with different structures; doing so improves the efficiency of data use and has an important impact on later data applications. The ever-increasing amount of data brings opportunities as well as challenges to enterprises. How to effectively use these data to support enterprise decision-making, and even create new value for the enterprise, has become one of the goals enterprises pursue. However, because of the surge in data volume, the growth in dimensionality, and the complexity of the data, enterprises bear ever-increasing costs to store data yet are unable to use it effectively. It should be noted that, with the increasing socialization of information exchange, both enterprise customers and enterprises themselves have more choices.

For enterprises, all kinds of supplier information can be collected through multiple channels. Using this information, enterprises can effectively evaluate their suppliers and select those that meet their requirements. Enterprise customers have the same right to choose, which makes it more difficult for enterprises to retain existing customers but also provides opportunities to attract new ones. By analyzing and processing operational data, enterprises can conduct in-depth research on their consumer groups so as to continuously improve the product and service experience, increase customer loyalty, and reduce customer churn. Similarly, the maintenance service industry holds a large amount of historical maintenance data, and the effective use of these data can reduce inventory costs and improve service levels. In view of the increasing volume and dimensionality of enterprise data and the fact that enterprises currently cannot make effective use of their historical data to optimize management, this paper aims to use appropriate dimensionality reduction methods to process enterprise data reasonably and effectively and to apply these data to prediction, evaluation, and other management activities, so that enterprise data can play its due role and serve the production and operation management of the enterprise. The main contributions of our research can be summarized as follows: (1) we establish a deep learning model based on an extensive collection and comprehensive survey of domestic and foreign research on all aspects of enterprise data processing and use, and determine the research directions; (2) in view of the increasing complexity and dimensionality of enterprise data and the difficulty of applying such data, we study the related data preprocessing methods; (3) aiming at the problems of enterprise cost control and customer relationship management, this paper studies prediction based on enterprise data through the analysis of practical problems and the processing of the corresponding data.

The rest of this paper is organized as follows. Existing state-of-the-art approaches and deep learning techniques are discussed in Section 2. The proposed enterprise prediction method based on deep learning is described in Section 3. Simulations, numerical experiments, and tests are presented in Section 4. Finally, Section 5 summarizes the paper and puts forward some directions for further research.

2. Related Work

In recent years, due to the sharp growth in the amount of data and in the scale of data attributes, data processing and application have become more difficult [1]. Methods that reduce data dimensionality while losing as little information as possible play an important part in data analysis and application. In order to create a compact low-dimensional representation of the original dataset, the primary approach to dimensionality reduction is to map the high-dimensional dataset into a low-dimensional space [2]. Dimensionality reduction can not only address the "curse of dimensionality" but also alleviate the situation of "rich information, poor knowledge," which helps reduce data complexity and improves the ability to recognize and understand data [3]. Depending on the perspective from which the problem is understood, there are many kinds of dimensionality reduction approaches. They can be divided into linear and nonlinear techniques according to the structural properties of the data. Moreover, depending on whether the relationship between data points and their nearest neighbors is considered, dimensionality reduction techniques fall into global and local categories.

Depending on whether the category (label) information of samples is introduced, dimensionality reduction methods can be classified into (i) supervised and (ii) unsupervised methods. After years of research [4, 5], many branches of dimensionality reduction algorithms have emerged, among which the most traditional are (1) principal component analysis (PCA) and (2) linear discriminant analysis (LDA). Both methods treat the data from a global perspective and are reasonably effective when the samples come from a Gaussian distribution. In order to broaden the applicability of these methods, the work in [6] applied the kernel trick to PCA and extended it to KPCA, so that the principal component method can also be applied in high-dimensional nonlinear spaces.
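To make the distinction concrete, the following is a minimal sketch, assuming scikit-learn is available and using synthetic data, of how linear PCA and its kernelized extension (KPCA) might be applied to reduce dimensionality; the dataset, component counts, and kernel choice are illustrative only, not those used in this paper.

```python
import numpy as np
from sklearn.decomposition import PCA, KernelPCA

# Illustrative high-dimensional data: 500 samples, 50 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 50))

# Linear PCA: project onto the directions of largest variance.
pca = PCA(n_components=10)
X_pca = pca.fit_transform(X)
print("PCA output shape:", X_pca.shape)
print("explained variance kept:", pca.explained_variance_ratio_.sum())

# Kernel PCA: the kernel trick lets principal components capture
# nonlinear structure in the original feature space.
kpca = KernelPCA(n_components=10, kernel="rbf", gamma=0.05)
X_kpca = kpca.fit_transform(X)
print("KPCA output shape:", X_kpca.shape)
```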

Following a similar idea, Mika et al. proposed the KLDA procedure to reduce the dimensionality of data nonlinearly. In order to break through the constraint of the global linearity assumption, the authors in [7] proposed the self-organizing map (SOM), which uses a self-organizing neural network to map high-dimensional data into a low-dimensional space while preserving the topological properties of the data space; other related methods include the principal curves proposed by Hastie et al. [8]. As research on dimensionality reduction deepened, manifold learning algorithms also came to be used for nonlinear dimensionality reduction. The LLE algorithm builds on the idea of preserving local information: the reconstruction weights computed for each point from its neighbors in the high-dimensional space are reused in the low-dimensional embedding.

The ISOMAP algorithm [9] uses the geodesic distance between points and assumes that linearity holds only locally while the global structure is nonlinear; the LE (Laplacian Eigenmaps) algorithm achieves dimensionality reduction by retaining the local nearest-neighbor information of the data with the help of the heat kernel; other manifold learning algorithms used for dimensionality reduction include methods based on the Hessian matrix and methods using local tangent information. In order to better apply manifold learning to real data, some scholars have studied linearized extensions of dimensionality reduction algorithms based on spectral analysis; typical methods of this kind include LPP, NPE, and ONPP [10].
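As a complement to the linear methods above, the sketch below, again assuming scikit-learn and using a synthetic Swiss-roll dataset purely for illustration, shows how ISOMAP and LLE could be used to obtain low-dimensional embeddings that respect local neighborhood structure.

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap, LocallyLinearEmbedding

# Synthetic nonlinear manifold: points lying on a 2-D surface embedded in 3-D space.
X, _ = make_swiss_roll(n_samples=1000, random_state=0)

# ISOMAP: preserves geodesic distances estimated over a neighborhood graph.
iso = Isomap(n_neighbors=10, n_components=2)
X_iso = iso.fit_transform(X)

# LLE: preserves the local linear reconstruction weights of each point.
lle = LocallyLinearEmbedding(n_neighbors=10, n_components=2, random_state=0)
X_lle = lle.fit_transform(X)

print("ISOMAP embedding:", X_iso.shape)   # (1000, 2)
print("LLE embedding:", X_lle.shape)      # (1000, 2)
```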

The scholars in [11] provide a brief introduction to the topic before using empirical analysis to confirm the validity of their model and drawing conclusions. The authors of [12] discuss how private institutions make use of delegated evaluation authority to build applied teaching teams. The research released in [13] examined the ownership structures of Russian businesses and organizations. Similarly, [14] takes Company A as an example and designates the workers in the company's workplace as the research subjects; the factors involved interact and are difficult to separate accurately. To assess the overall economic benefits of coastal firms, AHP and the fuzzy comprehensive assessment approach are integrated in [15]. The authors of [16] aim to offer a simple way to assess the system health of a private cloud. The work in [17] summarizes the factors that affect the development of the business environment and puts forward business environment evaluation indexes from the perspective of private manufacturing enterprises. Based on the above findings, the work in [18] presents countermeasures and suggestions for high-quality development. Other influential work includes [19, 20].

3. Enterprise Prediction Method with Deep Learning

With China's economy continuing to grow and science and technology continuing to advance, market competition in the telecommunications industry is intensifying. Enterprises have gradually realized that fully exploiting the potential value of customers is the key to improving their core competitiveness. In recent years, the telecommunications industry has developed rapidly [21]. At the same time, its enterprises are facing great opportunities and challenges. An investigation of a telecom enterprise found that, with the continuous expansion of its business, the enterprise has accumulated a large amount of data that has not been put to good use. This section carries out prediction research on customer loyalty by extracting these data and combining them with an appropriate model.

The data used in this section mainly comes from the Oracle database of a telecommunications enterprise and involves a total of 84 data tables [22], denoted T-1 to T-84 as shown in Table 1. These tables mainly cover customer information, business codes, detailed business records, user status, historical arrears, consumption details, and consumption accounts. The amount of data contained in each table is shown in Table 1.

These original data tables have not been processed and are full of noisy information. Additionally, different application goals require different data to support them. As a result, this section first processes and screens the data as necessary [23]. First, starting from the basic customer data, users with inaccurate or missing identification information are filtered out so that the users retained have accurate information. Second, the users' gender and age attributes are derived from their identity information. Third, each customer's cumulative online duration is calculated from the customer's recorded online time. Fourth, the relevant data are screened according to customer status, product in use, business area, and related attributes. Fifth, based on the customers' consumption records and business codes, consumption is separated and summarized by business type, and indicators such as consumption trend and average consumption level are computed. Through the above processing and further statistics, a dataset is finally obtained that contains attributes such as basic customer information and various consumption details. From this dataset, approximately 8,000 records are randomly selected as the training set of the model and 2,000 records as the test set.
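The following is a minimal pandas sketch, under the assumption that the merged records are available as a CSV file with hypothetical column names (id_number, status, monthly_fee_*, join_date, snapshot_date); it only illustrates the kind of filtering, feature derivation, and train/test split described above and is not the enterprise's actual pipeline.

```python
import pandas as pd

# Hypothetical merged export of the relevant tables; column names are illustrative.
df = pd.read_csv("telecom_customers.csv", parse_dates=["join_date", "snapshot_date"])

# 1. Drop users with missing or malformed identification information.
df = df.dropna(subset=["id_number"])
df = df[df["id_number"].str.len() == 18]

# 2. Derive gender and age from the identity number (assuming Chinese 18-digit IDs:
#    digit 17 encodes gender parity, digits 7-10 encode the birth year).
df["gender"] = df["id_number"].str[16].astype(int) % 2            # 1 = male, 0 = female
df["age"] = df["snapshot_date"].dt.year - df["id_number"].str[6:10].astype(int)

# 3. Cumulative online duration in months, from the recorded join date.
df["tenure_months"] = (df["snapshot_date"] - df["join_date"]).dt.days // 30

# 4. Screen by status / product / business area.
df = df[df["status"] == "active"]

# 5. Average consumption level over the observed monthly fees.
fee_cols = [c for c in df.columns if c.startswith("monthly_fee_")]
df["avg_consumption"] = df[fee_cols].mean(axis=1)

# Random split: ~8,000 training records and 2,000 test records.
train = df.sample(n=8000, random_state=42)
test = df.drop(train.index).sample(n=2000, random_state=42)
```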

3.1. Relevant Principles and Algorithms of Deep Learning

After determining the dataset, we introduce the relevant deep learning models. Since the concept of artificial intelligence was put forward in 1956, the new discipline has made great progress, yet it remains far from the ideal of the Turing test, which once left people frustrated with artificial intelligence. With the rapid growth of science and technology, the popularity and wide use of mobile devices, and the unremitting efforts of scholars, a breakthrough was finally made in machine learning in 2006, when Hinton et al. proposed the idea of deep learning. With this algorithmic idea, a feasible method was found to simulate, in a certain sense, the conceptual abstraction of the human brain.

In medicine, the working mechanism of the human brain puzzled scientists and scholars until David Hubel and Torsten Wiesel found that the visual system processes information hierarchically. External stimuli pass from the retina through the LGN to V1 (forming simple visual features), then through V2 to V4 (forming intermediate visual forms), then through PIT to AIT (forming high-level object descriptions), and are then passed on further. This step-by-step propagation of visual signals in the brain suggests that the working process of the brain may be a continuously iterative and abstractive one: after receiving the original signal, the brain first carries out low-level abstraction and then iterates to higher levels of abstraction layer by layer, so that what we finally perceive is a highly abstract concept. Deep learning algorithms cleverly simulate this process. In essence, a deep learning algorithm is a network model, but one difference from a traditional artificial neural network is that deep learning adopts layer-by-layer initialization, that is, only one layer of the network is trained at a time [24].

As mentioned earlier, a deep learning model is trained layer by layer. This training method avoids the gradient diffusion that occurs when a traditional network with many layers uses BP to propagate the residual. The training process can be summarized in two steps: pretraining and fine-tuning [25]. (1) Bottom-up unsupervised learning. The unlabeled data X are used to train the first layer, with the goal of making the output X′ as consistent with the input X as possible (if an autoencoder is used, the decoder output must match the input as closely as possible); this yields the parameters of the first layer. The output of the first layer is then taken as the input of the second layer, and so on down to the nth layer, so that the parameters of each layer are obtained. (2) Top-down supervised learning. The parameters of the multilayer model obtained in step (1) come from unsupervised learning only; to further optimize the model, this step uses labeled data to fine-tune it.

From the above introduction, we know that the pretraining stage of a deep learning algorithm uses the available unlabeled data. With the traditional neural network method, labeled data can be used to train the parameters of every layer of the network according to the error between the output and the label; how to determine the training error for unlabeled data is therefore the key to this stage [26].

The autoencoder method consists of two parts, an encoder and a decoder. The original data is passed through the encoder to produce an output; this step is called encoding. The encoded result is then passed as input to the decoder, which yields a reconstruction of the original input. By comparing the error between the original data and its reconstruction, the first layer can be trained, as shown in Figure 1.
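As a minimal sketch of this idea, assuming TensorFlow/Keras is available, the following defines a single autoencoder layer and trains it to reconstruct its input; the layer sizes, activation, and training settings are illustrative only.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model

input_dim, hidden_dim = 50, 20                           # illustrative sizes
X = np.random.rand(1000, input_dim).astype("float32")    # stand-in for unlabeled data

inputs = layers.Input(shape=(input_dim,))
encoded = layers.Dense(hidden_dim, activation="sigmoid")(inputs)   # encoder
decoded = layers.Dense(input_dim, activation="sigmoid")(encoded)   # decoder

autoencoder = Model(inputs, decoded)
encoder = Model(inputs, encoded)     # kept so its output can feed the next layer

# Train by minimizing the reconstruction error between input and output.
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(X, X, epochs=10, batch_size=64, verbose=0)
```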

Through the training of the first layer, we obtain the final output of its encoder. Taking this output as input, the same method is used to train the second layer, and so on, until the training results of all layers of the stacked autoencoder are obtained, as shown in Figure 2.
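Continuing the previous sketch (and reusing its imports and the data X), a greedy layer-wise stacking loop might look as follows; the layer widths are hypothetical and simply shrink toward the code layer.

```python
def train_layer(data, hidden_dim, epochs=10):
    """Train one autoencoder layer on `data`; return its encoder and encoded output."""
    dim = data.shape[1]
    inp = layers.Input(shape=(dim,))
    enc = layers.Dense(hidden_dim, activation="sigmoid")(inp)
    dec = layers.Dense(dim, activation="sigmoid")(enc)
    ae = Model(inp, dec)
    ae.compile(optimizer="adam", loss="mse")
    ae.fit(data, data, epochs=epochs, batch_size=64, verbose=0)
    enc_model = Model(inp, enc)
    return enc_model, enc_model.predict(data, verbose=0)

# Greedy layer-wise pretraining: each layer is trained on the previous layer's codes.
layer_dims = [40, 30, 20]            # illustrative widths
current, encoders = X, []
for dim in layer_dims:
    enc_model, current = train_layer(current, dim)
    encoders.append(enc_model)
```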

Next comes the network fine-tuning stage. A classifier module is added on top of the established network, and the labeled data are then used to fine-tune the system. Fine-tuning can be done in two ways: fine-tuning only the final classifier, or fine-tuning the whole system. There are many variants of the autoencoder; for example, introducing a sparsity constraint yields the sparse autoencoder, and corrupting the input yields the denoising autoencoder.
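To complete the picture, the sketch below, reusing the pretrained encoders from the previous block and assuming hypothetical binary labels y, stacks the encoder layers, adds a classifier head, and fine-tunes the whole network with labeled data.

```python
# Hypothetical binary labels for the same 1000 samples.
y = np.random.randint(0, 2, size=(1000,))

# Stack the pretrained encoder layers and add a classifier module on top.
inp = layers.Input(shape=(input_dim,))
h = inp
for enc_model in encoders:
    h = enc_model.layers[-1](h)      # reuse each pretrained Dense layer and its weights
out = layers.Dense(1, activation="sigmoid")(h)   # classifier module

classifier = Model(inp, out)
classifier.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Fine-tune the whole system with labeled data (alternatively, freeze the
# pretrained layers and train only the classifier head).
classifier.fit(X, y, epochs=10, batch_size=64, verbose=0)
```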

The Boltzmann machine originated in statistical physics. It is a method based on an energy function, and its network structure consists of a visible layer and a hidden layer, with the relationships between units expressed by weights. In a restricted Boltzmann machine (RBM), interactions occur only between the hidden layer and the visible layer, while the nodes within the hidden layer and within the visible layer are mutually independent; this is the major distinction between a restricted Boltzmann machine and a general Boltzmann machine.

The full joint probability distribution P(v, h) of the nodes in the restricted Boltzmann machine is restricted to obey the Boltzmann distribution. Assuming that v ∈ {0, 1} is a visible layer unit and h ∈ {0, 1} is a hidden layer unit, the energy function of the model can be defined as given in (1):

E(v, h; \theta) = -\sum_{i} a_i v_i - \sum_{j} b_j h_j - \sum_{i}\sum_{j} v_i w_{ij} h_j,   (1)

where \theta = \{W, a, b\} is the set of model parameters. The joint distribution of the visible layer units and hidden layer units can then be defined as illustrated in (2):

P(v, h; \theta) = \frac{1}{Z(\theta)} \exp\bigl(-E(v, h; \theta)\bigr),   (2)

where Z(\theta) = \sum_{v}\sum_{h} \exp(-E(v, h; \theta)) is the partition function.

Since the nodes within a layer of the restricted Boltzmann machine are independent of each other, the conditional probability distribution of the hidden layer given the visible layer state is P(h \mid v) = \prod_{j} P(h_j \mid v), and the probability that a single hidden unit is activated is given in (3):

P(h_j = 1 \mid v) = \sigma\Bigl(b_j + \sum_{i} v_i w_{ij}\Bigr),   (3)

where \sigma(x) = 1/(1 + e^{-x}) is the sigmoid function.

Because the structure of the restricted Boltzmann machine is symmetrical, when the hidden units are known the conditional probability of the visible layer is likewise P(v \mid h) = \prod_{i} P(v_i \mid h), and the probability of a single visible unit is modeled by (4):

P(v_i = 1 \mid h) = \sigma\Bigl(a_i + \sum_{j} w_{ij} h_j\Bigr).   (4)

Similarly to the autoencoder method, the RBM is also trained layer by layer by comparing the original data with its reconstruction, so as to build up the whole model. Using the KL divergence, it can be shown that the objective of the RBM is maximum likelihood estimation, i.e., maximizing L(\theta) = \sum_{v} \log P(v; \theta) over the training data, and the parameter values can finally be computed by Gibbs sampling and related methods.
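A compact NumPy sketch of one contrastive divergence (CD-1) update for a binary RBM is given below; it follows equations (1)-(4), uses illustrative sizes and a placeholder learning rate, and is not the exact training routine used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

n_visible, n_hidden, lr = 50, 20, 0.1                 # illustrative sizes
W = 0.01 * rng.normal(size=(n_visible, n_hidden))     # weights w_ij
a = np.zeros(n_visible)                               # visible biases a_i
b = np.zeros(n_hidden)                                # hidden biases b_j

def cd1_step(v0):
    """One CD-1 parameter update for a batch of binary visible vectors v0."""
    global W, a, b
    # Positive phase: P(h=1|v0), eq. (3), and a binary sample from it.
    ph0 = sigmoid(v0 @ W + b)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: reconstruct the visible layer, eq. (4), then the hidden probs again.
    pv1 = sigmoid(h0 @ W.T + a)
    ph1 = sigmoid(pv1 @ W + b)
    # Gradient approximation: data statistics minus reconstruction statistics.
    batch = v0.shape[0]
    W += lr * (v0.T @ ph0 - pv1.T @ ph1) / batch
    a += lr * (v0 - pv1).mean(axis=0)
    b += lr * (ph0 - ph1).mean(axis=0)

v_batch = (rng.random((64, n_visible)) < 0.5).astype(float)  # stand-in binary data
for _ in range(100):
    cd1_step(v_batch)
```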

3.2. Model Construction

After this brief overview of deep learning, we now build the model used in this section. In a deep learning model, the overall organization is like a stack of building blocks, each layer of the network being one block. The traditional deep model is formed by stacking similar blocks; this paper calls it the deep structure of a single model. However, different component models have different advantages: the autoencoder assumes that the input and output of the model should be equivalent, while the RBM is a probabilistic model based on an energy function. In order to make comprehensive use of the advantages of different models, this paper constructs deep learning structures based on hybrid models, using two different construction strategies: cross mixing and sequential mixing.

At the same time, in order to improve performance, a sparsity constraint and the dropout method are used in the constructed model. In addition, different training and optimization methods are applied to the different component models: for the autoencoder layers, parameters are adjusted with stochastic gradient descent, while for the RBM layers, parameters are adjusted with contrastive divergence. The model used in this section consists of eight layers: the first layer is the input layer, the eighth layer is the output layer, and the sizes of the intermediate layers are 1000-800-500-300-150-30. In the deep learning structure of the single model, an RBM is used in every layer; in the hybrid structures, both RBM and autoencoder layers are used.
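The following Keras sketch illustrates one plausible reading of this architecture: an eight-layer network whose hidden widths are 1000-800-500-300-150-30, with dropout after each hidden layer. Which hidden layers are pretrained as RBMs and which as autoencoders (the cross versus sequential mixing) is controlled here by the `pattern` argument and is an assumption for illustration, not the authors' exact configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

HIDDEN = [1000, 800, 500, 300, 150, 30]   # intermediate layer sizes from the text

def build_hybrid(input_dim, pattern, dropout=0.5):
    """Eight-layer network (input, six hidden, output); `pattern` tags each hidden
    layer with the unsupervised model assumed to pretrain it ('rbm' or 'ae')."""
    inp = layers.Input(shape=(input_dim,))
    h = inp
    for width, tag in zip(HIDDEN, pattern):
        h = layers.Dense(width, activation="sigmoid", name=f"{tag}_{width}")(h)
        h = layers.Dropout(dropout)(h)                 # dropout after each hidden layer
    out = layers.Dense(1, activation="sigmoid", name="output")(h)
    return models.Model(inp, out)

# Illustrative mixing patterns (assumed, not taken from the paper):
cross      = build_hybrid(1200, ["rbm", "ae"] * 3)          # alternate RBM / AE layers
sequential = build_hybrid(1200, ["rbm"] * 3 + ["ae"] * 3)   # RBM block followed by AE block
cross.summary()
```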

To test and compare the various deep learning approaches, we use three well-known metrics. Equations (5), (6), and (7) give the formulas for precision, recall, and the F1-measure, respectively:

\text{Precision} = \frac{TP}{TP + FP},   (5)

\text{Recall} = \frac{TP}{TP + FN},   (6)

F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}},   (7)

where TP denotes the number of true positive results, FP the number of false positives, and FN the number of false negatives.
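Assuming scikit-learn is available, these metrics could be computed from the test-set predictions as in the short sketch below; the label arrays are placeholders.

```python
from sklearn.metrics import precision_score, recall_score, f1_score

# Placeholder ground-truth labels and model predictions for the test set.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

print("Precision:", precision_score(y_true, y_pred))   # TP / (TP + FP)
print("Recall:   ", recall_score(y_true, y_pred))       # TP / (TP + FN)
print("F1:       ", f1_score(y_true, y_pred))           # harmonic mean of the two
```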

4. Experiment and Result Analysis

Having constructed the above models, and in order to verify the effectiveness of the proposed model, we use the enterprise data processed in Section 3 to carry out the corresponding simulation experiments. Through the training of the models, we obtain the trend of the mean square error, as shown in Figure 3. In the figure, model 1 represents the deep learning structure of the single model, model 2 represents the deep learning structure of the cross hybrid model, and model 3 represents the deep learning structure of the sequential hybrid model.

At the same time, we input the test set data into each trained model to obtain the prediction accuracy of each model, as shown in Table 2. It should be noted that the other metrics, namely precision, recall, and the F1-measure and their associated scores, are discussed below.

From Table 2 and Figure 4, we can see that the prediction accuracy of the three models designed in this paper is better than that of the traditional neural network model, which indicates that the models used in this section are effective. In addition, the accuracy of the deep learning structure of the single model is higher than that of the cross hybrid model but lower than that of the sequential hybrid model. For the cross hybrid deep learning structure, the experiments in this paper also found that the model is unstable, with error jitter in the hidden layers during training. Although the deep learning model has been applied successfully in many fields, whether it can play to its strengths on enterprise management data needs further simulation experiments on more enterprise management datasets. Figure 5 shows the comparison of test-set performance in terms of precision, recall, and F1-measure for the various techniques. Under certain assumptions, the obtained results confirm that the accuracy and precision of the single-model deep learning structure are higher than those of the cross hybrid model but lower than those of the sequential hybrid model. Moreover, the traditional model has the lowest recall rate among the methods compared in the experiments.

5. Conclusions and Future Work

The use of data is becoming increasingly comprehensive and in-depth in today's society as a result of the advancement of science and technology. Many domestic and foreign enterprises are constantly mining new business from their own data. However, due to imbalances in industrial structure and technological development, many domestic enterprises are still at the initial stage of data utilization, and a large amount of idle data within enterprises has not been developed and utilized. How to use these data to serve enterprise development is an urgent problem for these enterprises and is also a necessary path for enterprises to connect with the wider world. The effective and rational use of enterprise data must be problem-oriented, and data serves the solution of problems: different problems need different data, and different data solve different problems, which requires reasonable processing after data collection.

Through on-the-spot investigation of enterprises, this paper identified the following problems: (1) enterprises face the problem of how to effectively use their ever-expanding data; (2) customers are of great significance to enterprises; (3) enterprises hold customers' consumption data, basic information, and so on, and the question is whether customers' behavior patterns and consumption habits can be found from these data so as to predict customer loyalty and churn intention; and (4) how to use enterprise data to optimize the management process, reduce enterprise costs, and improve enterprise efficiency. These problems prompted the research in this paper, so the main focus is on the processing of enterprise data and its application in prediction and evaluation. Under certain assumptions, the obtained results confirm that the accuracy of the proposed single-model deep learning structure is higher than that of the cross hybrid model but lower than that of the sequential hybrid model. The network model constructed and proposed in this paper has so far been validated only through simulation and has not been extensively used in practical applications; further practice and application are required. Addressing these limitations will be the focus of our upcoming work. We will also consider how the training and prediction time can be shortened so that the algorithm converges as quickly as possible.

Data Availability

The corresponding author can provide the datasets used and analyzed during the current study upon reasonable request.

Conflicts of Interest

The authors of this paper declare that they have no conflicts of interest.

Acknowledgments

This paper was supported by the Scientific Research Fund Project of Xijing University (No. XJ190104).