Special Issue: Recent Advances in Random Matrices for Mathematical Modeling
Innovation Model of College Education Based on Nonlinear Random Matrix Organization and Management
Scientific research and innovation teams in colleges and universities have become an important carrier for scientific and technological development and personnel training in China, but their role has not yet been fully brought into play. Most colleges and universities have such teams in organizational form only: too many research tasks still rely on a small number of backbone members, team cohesion is insufficient, and the combined force of the team does not materialize. In production practice, more and more high-dimensional data are collected and stored whose dimension p approaches or even exceeds the number of samples n. Unlike traditional data analysis, high-dimensional data analysis is more complex and difficult. The hidden layer of scientific research and innovation teams in colleges and universities is often ignored by managers, which shows up as neglect of the team's cultural construction. In the eyes of team managers, time spent on building a team culture is worth far less than time spent on a few more scientific experiments. A university scientific research and innovation team is a multi-level network system. The asymptotic and nonasymptotic theories of random matrices break the framework of classical multivariate statistical analysis and are well suited to studying the statistical characteristics of high-dimensional data, which can help machine learning algorithms analyze high-dimensional data and expand their range of application. This paper analyzes research and innovation teams in colleges and universities from the perspective of nonlinearity. It argues that such a team is a complex nonlinear system, both in terms of its research objects (interdisciplinary research) and in terms of its personnel composition. 
Correspondingly, team construction should be approached from a nonlinear perspective, following the systematic, integral, and multi-dimensional characteristics of nonlinear systems. Applied to team construction, three macro-level principles should be observed from the moment a team is created, and their specific application is then explained in detail. It is hoped that this offers a new perspective on the construction of scientific research and innovation teams in colleges and universities and better promotes their development.
A scientific research and innovation team is a group organization engaged in scientific and technological research and development. It relies on key laboratories or engineering projects, works under the leadership of outstanding subject leaders, and consists of a small number of people with complementary skills who are willing to work toward common research goals and who hold one another accountable through shared working methods. From the perspective of system theory, the scientific research and innovation team in colleges and universities is a system composed of various elements. These elements include scientific research personnel, knowledge of various disciplines, scientific research funds, scientific research tools, team culture, etc. The inherent development trend of contemporary science requires disciplines to intersect, integrate, and permeate one another continuously [3–5]. Under this disciplinary synthesis, the above elements interact, and their interaction leads to the continuous emergence of new disciplines and new fields. This promotes the development of the team and, at the same time, the innovation of scientific research and technology. In terms of internal composition, the environment of scientific research and innovation teams in colleges and universities consists mainly of two elements, namely, the hard environment and the soft environment of the team. The hard environment refers to objective factors such as working conditions, equipment, and funds. The soft environment refers to factors such as values, rules and regulations, the academic team, and management culture [8–10].
The hard environment is formed in the early stage of a team's establishment and is an objective given. Although the hardware support of the team has a strong bearing on the creativity of the entire team, the hard environment is essentially a fixed factor, often determined by the level of local economic development and by the school. It is therefore not meaningful to discuss the construction of the hard environment in this paper. Discussing the construction of scientific research and innovation teams from a nonlinear perspective here mainly means studying the construction of the team's soft environment, namely, team system and culture construction. The impact of increasing data dimensionality on data analysis is multi-faceted. For example, in nonparametric estimation, high dimensionality affects the convergence speed of the estimator; in model selection, too many variables degrade the performance of the model; in regression analysis, the sparsity of high-dimensional data is one of the hard problems in data forecasting. Classical multivariate statistical analysis usually assumes that the dimension of the data is fixed and finite. When the dimension is close to or even exceeds the number of samples, the classical theory shows its limitations, especially in estimating the mean and the covariance matrix of high-dimensional data. Estimating the mean of the data is a basic problem in multivariate statistical analysis. Many data analysis methods, such as diagonal discriminant analysis and Markowitz mean-variance analysis, need an estimate of the mean. 
When the data dimension is very large, data drawn from a given distribution are unlikely to lie near the population mean in the high-dimensional space, and with the sample size held fixed, the data move progressively farther from the population mean as the dimension increases. Estimates of the covariance matrix and of its inverse, the precision matrix, are equally fundamental. For example, in principal component analysis, computing the covariance matrix and determining the number of principal components are the key steps in reducing the dimensionality of the original data; in Bayesian multivariate statistical inference, computing conditional probabilities under a multivariate normal approximation requires a consistent estimate of the precision matrix; similarly, large-scale reinforcement learning methods based on Gaussian process classifiers also need estimates of the covariance matrix and the precision matrix [17–19].
As the amount and dimensionality of data increase, some data will inevitably go missing during collection and storage. There are many reasons for missing data. For example, in social surveys, people may refuse to answer sensitive questions; during data collection, the collected data may be incomplete because of the experimental environment or equipment failure. To address the problems that traditional machine learning methods face in high-dimensional data analysis, this paper draws on results from random matrix theory to propose a regularized discriminant analysis algorithm, a regularized discriminant analysis algorithm with improved mean estimation, and a high-dimensional missing data analysis algorithm. This paper also analyzes research and innovation teams in colleges and universities from the perspective of nonlinearity. It argues that such a team is a complex nonlinear system, both in terms of its research objects (interdisciplinary research) and in terms of its personnel composition. Correspondingly, team construction should be approached from a nonlinear perspective, following the systematic, integral, and multi-dimensional characteristics of nonlinear systems. Applied to team construction, three macro-level principles should be observed from the moment a team is created, and their specific application is then explained in detail. It is hoped that this offers a new perspective on the construction of scientific research and innovation teams in colleges and universities and better promotes their development.
2. Nonlinear Random Matrix for Organizational Management
2.1. The Three-Dimensional Structure of College Teams
In contrast to the planar, one-dimensional view of a linear system, nonlinear science regards the scientific research and innovation team in colleges and universities as a network system in which various levels, factors, and structures are intertwined according to certain procedures. Managers of such teams must faithfully and comprehensively reflect the structure implicit at every level of the team and analyze the team along intersecting dimensions in order to better promote its development. From this point of view, two hierarchical structures can be distinguished within a university scientific research and innovation team: the dominant (explicit) layer and the recessive (hidden) layer.
The explicit layer is a network space composed of elements that, from the actor's perspective, explicitly belong to the group. The purpose of this layer is relatively clear: to complete the current task of the organization and ensure its survival and development. The hidden layer is composed of so-called hidden rules. For example, the phenomenon of "inbreeding" in teams is caused by hidden rules that are never publicly announced, and a positive team culture is likewise built from various hidden rules. The blurred boundaries of the hidden layer are generally related to the habits and hobbies of team leaders, as well as to the personality characteristics and academic loyalty of team members. The hidden layer and the explicit layer together form the three-dimensional network space of the team. In the hidden layer, the actors act on the explicit layer of the team through information flow, energy flow, behavior flow, emotion flow, friendship flow, trust flow, and other relationship flows. When a specific stimulus arrives from the outside world, the hidden layer processes it through these relationship flows and may produce a variety of responses, and the response is not always proportional to the stimulus. In other words, the consequences of a very small hidden rule can be fatal for the team.
The hidden layer of scientific research and innovation teams in colleges and universities is often ignored by managers, which shows up as neglect of the team's cultural construction. In the eyes of team managers, time spent on building a team culture is worth far less than time spent on a few more scientific experiments. A university scientific research and innovation team is a multi-level network system. If we do not look at it three-dimensionally, focus on only one plane of team building, and let implicit rules develop unchecked in the hidden layer, the eventual result is academic corruption and the spread of a bad atmosphere within the team, contrary to the purpose of the team's creation. Of course, the three-dimensional network within a university scientific research and innovation team is not reflected in this structure alone. Take the Key Laboratory of Scientific Instruments and Dynamic Testing of the Ministry of Education of North Central University as an example. This team also forms a three-dimensional network space: its explicit layer includes 48 researchers, large-scale experimental equipment and instruments, and a construction area of 6,000 square meters, while its hidden layer includes the latent team culture, communication mechanism, incentive mechanism for business units, etc. In terms of personnel composition, there are senior professors, middle-aged researchers who are the main force of scientific research, and energetic young researchers. Each age group has different thinking patterns and personality characteristics. In terms of academic background, there are professors, postdoctoral fellows, doctorates, masters, and management staff. In the process of communication and cooperation, the team must face up to this complex structure in order to better exert its creativity.
In statistical analysis, many statistics can be expressed as functionals of the empirical spectral distribution (ESD), so the ESD of a random matrix plays a crucial role in studying the properties of that matrix.
Assume that R is an m × m symmetric matrix with eigenvalues $\lambda_i$ (i = 1, …, m). The ESD of R is defined as

$$F_m(t) = \frac{1}{m}\sum_{i=1}^{m} \mathbf{1}(\lambda_i \le t), \quad t \in \mathbb{R},$$

where $\mathbb{R}$ is the set of real numbers and $\mathbf{1}(\cdot)$ is the indicator function, equal to 1 when $\lambda_i \le t$ and 0 otherwise. When m tends to infinity, the empirical spectral distribution $F_m(t)$ of the matrix R converges in probability to the limiting spectral distribution (LSD) F(t). The Stieltjes transform is an important tool for studying the spectral properties of random matrices: the probability density function of a spectral distribution can be obtained from its Stieltjes transform. For a function of bounded variation G(x), the Stieltjes transform is given by

$$s_G(z) = \int \frac{1}{x - z}\, dG(x), \quad z \in \mathbb{C}^{+},$$

where $\mathbb{C}^{+}$ denotes the complex half-plane with positive imaginary part and z is a complex number in $\mathbb{C}^{+}$. If G(x) is continuous at the points a and b with a < b, the inverse transform is

$$G(b) - G(a) = \lim_{\varepsilon \to 0^{+}} \frac{1}{\pi} \int_{a}^{b} \operatorname{Im}\, s_G(x + i\varepsilon)\, dx,$$

where Im denotes the imaginary part of a complex number and i is the imaginary unit.
The Stieltjes transform of the empirical spectral distribution $F_m$ takes the discrete form

$$s_{F_m}(z) = \frac{1}{m}\operatorname{Tr}\,(R - z I_m)^{-1},$$

where $I_m$ is the m-dimensional identity matrix and Tr denotes the trace of a matrix.
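The two definitions above can be checked numerically. The following is a minimal NumPy sketch, not taken from the paper: the function names, the small 6 × 6 test matrix, and the evaluation point z are all illustrative assumptions. It evaluates the ESD as the fraction of eigenvalues not exceeding t, and computes the discrete transform both as a normalized trace of the resolvent and as an average over eigenvalues, which should agree.

```python
import numpy as np

def esd(eigvals, t):
    """Empirical spectral distribution F_m(t): fraction of eigenvalues <= t."""
    return float(np.mean(np.asarray(eigvals) <= t))

def stieltjes_trace(R, z):
    """Discrete transform computed as (1/m) Tr (R - z I_m)^{-1}."""
    m = R.shape[0]
    return np.trace(np.linalg.inv(R - z * np.eye(m))) / m

def stieltjes_eigs(eigvals, z):
    """Same quantity from the eigenvalues: the mean of 1 / (lambda_i - z)."""
    return np.mean(1.0 / (np.asarray(eigvals) - z))

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
R = (A + A.T) / 2.0              # small symmetric test matrix (illustrative only)
eigvals = np.linalg.eigvalsh(R)  # real eigenvalues in ascending order
z = 0.3 + 0.5j                   # an arbitrary point in the upper half-plane

s_trace = stieltjes_trace(R, z)
s_eigs = stieltjes_eigs(eigvals, z)
print(abs(s_trace - s_eigs))     # difference between the two evaluations
```

The eigenvalue form is usually the cheaper one when the transform is needed at many points z, since the eigenvalues are computed once.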
Suppose X is an m × m random matrix whose entries are independent Gaussian variables with mean 0 and variance 1. With the standard normalization, the Hermitian (here real symmetric) matrix M can be expressed as

$$M = \frac{1}{\sqrt{2m}}\left(X + X^{T}\right),$$

where $X^{T}$ denotes the transpose of the matrix X. It is known that the empirical spectral distribution $F_m(t)$ of the matrix M converges in probability to a limiting spectral distribution F(t). Under this normalization, the probability density function of F(t) is

$$f(t) = \frac{1}{2\pi}\sqrt{4 - t^{2}}, \quad |t| \le 2,$$

and f(t) = 0 otherwise.
This probability density function is also known as the semicircle law.
The relationship between the semicircle law and the distribution of the eigenvalues of the matrix M is further illustrated by a simulation experiment in which the matrix X has dimension m = 5000. In Figure 1, the red line represents the theoretical probability density function f(t), and the blue part represents the distribution of the eigenvalues of the matrix M. The figure shows that the semicircle law describes the distribution of the eigenvalues of the Hermitian matrix M well.
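A simulation of this kind can be sketched in a few lines of NumPy. The normalization M = (X + Xᵀ)/√(2m), the density supported on [−2, 2], the reduced dimension m = 400 (chosen only to keep the run short), the bin count, and all function names are assumptions made for illustration; they are not taken from the paper's experiment.

```python
import numpy as np

def wigner_matrix(m, rng):
    """Symmetric matrix built from an m x m standard Gaussian matrix X,
    scaled so that off-diagonal entries have variance 1/m (assumed normalization)."""
    X = rng.standard_normal((m, m))
    return (X + X.T) / np.sqrt(2.0 * m)

def semicircle_density(t):
    """Semicircle density on [-2, 2], zero outside the interval."""
    t = np.asarray(t, dtype=float)
    return np.sqrt(np.clip(4.0 - t**2, 0.0, None)) / (2.0 * np.pi)

rng = np.random.default_rng(42)
m = 400                                   # much smaller than m = 5000 in the text
eigvals = np.linalg.eigvalsh(wigner_matrix(m, rng))

# Compare a normalized eigenvalue histogram with the theoretical density
hist, edges = np.histogram(eigvals, bins=20, range=(-2.0, 2.0), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
max_gap = float(np.max(np.abs(hist - semicircle_density(centers))))
print(f"largest gap between histogram and density: {max_gap:.3f}")
```

Increasing m tightens the agreement, at the cost of a cubic growth in the time needed for the eigenvalue computation.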
2.2. Principles of Overall Team Collaboration
Forming and managing scientific research innovation teams according to the holistic synergy theory of nonlinear science requires, on the one hand, collaboration between the scientific research innovation teams of colleges and universities and, on the other hand, cooperation among the technical, cultural, and institutional dimensions within each team. Once team managers realize this, they should first keep track of the research progress of other university teams, learn the new methods those teams adopt for interdisciplinary research, and study the management measures worth borrowing. They can then reflect on their own team, play to its strengths, and avoid its weaknesses. Knowing both themselves and others in this way creates a collaborative trend in which the scientific research and innovation teams of colleges and universities chase and catch up with one another. Secondly, as the requirements of scientific research and innovation become more and more complex, the systems and cultures within the team must innovate in step with the development of science and technology. This requires managers to assess the situation, follow the trend of scientific research development, and keep the team's management system and operating mechanism up to date, so as to better promote scientific research and technological innovation. 
To address the estimation of the high-dimensional covariance matrix in linear discriminant analysis (LDA), this chapter draws on random matrix theory and applies two covariance matrix estimation methods, nonlinear shrinkage and eigenvalue interception (also called eigenvalue clipping), to the LDA algorithm, which yields a method for high-dimensional covariance matrix estimation.
Under this general principle, managers of scientific research innovation teams in colleges and universities must continuously absorb advanced research management experience, let team management and the scientific and technological level develop together, and build an appropriate team management system to act as a catalyst. In reality, most university scientific research and innovation teams that survive long-term competition or develop well are teams that value collaborative innovation at all levels, whereas teams that focus on only a single dimension, the scientific research and technology dimension, are often eliminated early.
2.3. Innovative Pattern Algorithm Based on Nonlinear Random Matrix
Combining discriminant analysis with the high-dimensional covariance matrix estimates described above yields a regularized discriminant analysis method based on random matrix theory. Implementing the linear discriminant analysis algorithm requires using the training dataset to estimate quantities such as the prior of each class k and the covariance matrix H. In high-dimensional situations, however, the estimated covariance matrix H is usually ill-conditioned or even singular. The resulting discriminant algorithm, RMRDA, is used for data classification. The algorithm design is shown in Figure 2.
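The general idea of pairing LDA with a cleaned covariance estimate can be sketched as follows. This is not the paper's RMRDA algorithm, whose details are given only in Figure 2: it is a minimal illustration that assumes a simple eigenvalue-clipping rule (sample eigenvalues below the Marchenko–Pastur upper edge are replaced by their average), a crude noise-level estimate, and hypothetical function names and synthetic data.

```python
import numpy as np

def clipped_covariance(X):
    """Eigenvalue-clipping covariance estimate for the rows of X (n x p)."""
    n, p = X.shape
    S = np.cov(X, rowvar=False)                   # sample covariance, p x p
    vals, vecs = np.linalg.eigh(S)
    sigma2 = np.trace(S) / p                      # crude noise level (assumption)
    edge = sigma2 * (1.0 + np.sqrt(p / n)) ** 2   # Marchenko-Pastur upper edge
    bulk = vals <= edge                           # eigenvalues treated as noise
    cleaned = vals.copy()
    cleaned[bulk] = vals[bulk].mean()             # flatten the bulk, keep the trace
    return (vecs * cleaned) @ vecs.T

def fit_rda(X, y):
    """Estimate class priors, class means, and the inverse cleaned covariance."""
    classes = np.unique(y)
    means = np.array([X[y == k].mean(axis=0) for k in classes])
    priors = np.array([np.mean(y == k) for k in classes])
    centered = X - means[np.searchsorted(classes, y)]  # subtract each class mean
    H_inv = np.linalg.inv(clipped_covariance(centered))
    return classes, means, priors, H_inv

def predict_rda(model, X):
    """Assign each row of X to the class with the largest linear discriminant score."""
    classes, means, priors, H_inv = model
    W = means @ H_inv
    b = -0.5 * np.sum(W * means, axis=1) + np.log(priors)
    return classes[np.argmax(X @ W.T + b, axis=1)]

# Synthetic two-class data with more variables (p = 100) than training samples (60)
rng = np.random.default_rng(0)
p, n_per_class = 100, 30
shift = np.zeros(p)
shift[:10] = 1.5                                  # class means differ in 10 coordinates

def sample(n):
    X0 = rng.standard_normal((n, p))
    X1 = rng.standard_normal((n, p)) + shift
    return np.vstack([X0, X1]), np.repeat([0, 1], n)

X_train, y_train = sample(n_per_class)
X_test, y_test = sample(500)
model = fit_rda(X_train, y_train)
accuracy = float(np.mean(predict_rda(model, X_test) == y_test))
print(f"test accuracy: {accuracy:.3f}")
```

Because the clipped estimate has strictly positive eigenvalues here, it can be inverted even though the sample covariance of 60 observations in 100 dimensions is singular, which is the situation the paragraph above describes.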
2.4. The Openness Principle of Innovative Education Team
Nonlinear science holds that only when a system is in an open state does its interior continuously receive external matter, energy, and information. When external factors enter the system, its original stable state is broken, and the system moves into a state far from equilibrium. For a team, openness on the one hand overcomes or alleviates the limitations and constraints of its own technical capabilities, resources, funds, etc., and enables resource sharing, complementary advantages, and risk sharing; on the other hand, through open cooperation and exchange, teams exchange material and energy with one another, which keeps each team in a nonequilibrium state, as shown in Figure 3. Furthermore, competition between scientific research and innovation teams in colleges and universities must be viewed correctly. When encountering strong competitors, research managers should not be blindly hostile but should fully absorb the other party's superior information into the team, so as to break the team's existing equilibrium. When the research object is regarded as a system, attention should be paid to the combined effect of its internal structure, levels, and functions, and to the relations between whole and part and between system and environment, in order to grasp the object comprehensively.
Humans are the most complex species in nature. The composition of the human brain and the entire human thinking process are nonlinear, which means that a team must follow this law in its operation. A human being is an organism that does not respond passively to stimuli but is an essentially autonomous system. Social groups composed of people exhibit great randomness, ambiguity, uncertainty, and instability. To ignore the spontaneity of the living body is to ignore the creative potential of people. The scientific research innovation team in colleges and universities should consciously train the nonlinear thinking of its members, use it to guide the team's practical activities, release the potential of the members, make more reasonable use of each member's role, and help the team achieve the goal of sustainable development. In the mean substitution (MS) method, each missing value is replaced by the mean of the observed data for that variable, whereas the MICE method imputes each missing variable with a separate model and then makes a combined inference.
2.5. Continuation of the Innovative Education Model
The scientific research environment of a scientific research innovation team in colleges and universities is the sum of the various factors that affect the growth and development of the team. Theoretically speaking, the operation of a scientific research and innovation team is a complex dynamic process that is affected and restricted by many factors. The environment is divided into the material environment and the spiritual environment. The material environment refers to the hardware facilities for team development, including objective factors such as working conditions, equipment, and funds; the spiritual environment refers to the cultural construction of the team, the application of scientific research systems, and the human environment in which team members carry out their research work. When building the scientific research environment of the team, we should not focus only on cultivating individual environmental elements but should also examine the environment comprehensively from the perspective of nonlinear systems. Nonlinear theory emphasizes grasping the object comprehensively.
Among the environmental factors of the scientific research and innovation team, the impact of the material environment is usually valued by team builders, who actively apply for project funds, introduce the most advanced scientific research equipment, attract authoritative researchers to join the team, and so on. However, once the team develops to a certain level, the influence of the material environment on the team as a whole declines. The research personnel and equipment remain fixed for a certain period, and the factor that then mainly affects the overall development of the team is the spiritual environment. Because people are themselves nonlinear systems, decisions about team size should take into account the spontaneity of team members and maximize individual creative thinking, while also ensuring the overall effect of the team: formulating a reasonable human resource management system, regulating behavior, and guiding the team to truly achieve 1 + 1 > 2. A reasonable team size should therefore be based on the research direction and research goals and should follow the principle of a relatively stable scale with flexible changes.
Common values in a staged form: practices from enterprises can serve as a reference. For example, in Amway direct sales, nearly all the marketers in the Amway team work with enthusiasm and pursue Amway products persistently and even fanatically, and this state comes from within, which can seem puzzling. A look at the member management model of Amway Direct Selling shows that, in formulating an overall Amway value, the company sets achievement rewards at multiple stages, for example, from ordinary marketers to marketing assistants to marketing directors, senior marketing directors, marketing managers, senior marketing managers, marketing directors, etc. The treatment, influence, and status of personnel make a qualitative leap at each stage. This allows each marketer to recognize the company's overall goals, to firmly believe that they can be achieved, and to see that they are achieved together with personal goals. Marketers therefore come to trust the company completely and imperceptibly form a common value: working hard toward a common goal. A similar approach can be used for reference in scientific research and innovation teams, where common values are otherwise very abstract in the actual development of the team.
3.1. Experimental Analysis of Simulated Data
The first m eigenvectors also directly affect the dimensionality reduction of the data. To verify the effectiveness of the proposed dimensionality reduction algorithm, the eigenvectors obtained by RPCA and by traditional PCA are compared with the actual eigenvectors, and the similarity between them is calculated for quantitative evaluation. The traditional PCA method requires the missing data to be filled in beforehand. Here, two different methods, mean substitution (MS) and multivariate imputation by chained equations (MICE), are used to fill in the data.
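Mean substitution, the simpler of the two filling methods, can be sketched directly in NumPy. The function name and the toy matrix below are illustrative assumptions; missing entries are assumed to be encoded as NaN.

```python
import numpy as np

def mean_substitute(X):
    """Replace each NaN by the mean of the observed entries in its column."""
    X = np.array(X, dtype=float)           # copy, so the caller's array is untouched
    col_means = np.nanmean(X, axis=0)      # per-variable mean over observed values
    rows, cols = np.where(np.isnan(X))     # positions of the missing entries
    X[rows, cols] = col_means[cols]
    return X

# Toy data: rows are samples, columns are variables, NaN marks a missing value
X = np.array([[1.0, 2.0, np.nan],
              [3.0, np.nan, 6.0],
              [5.0, 4.0, 9.0]])
X_filled = mean_substitute(X)
print(X_filled)
```

Because every filled value sits exactly at the column mean, this method shrinks the sample variance of each variable, which is one reason a filled dataset can distort the covariance matrix that PCA relies on.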
The missing ratio of the simulated data is (1 − x). Keeping the data dimension fixed at p = 50, experiments are carried out on data with different numbers of samples n, and each reported result is the average of 50 experiments. The convex optimization problem in the formula is solved with the CVX software package for convex programs. As shown in Figure 4, the FN and CS values obtained by the RPCA algorithm are the smallest among the dimensionality reduction algorithms compared, and the eigenvectors obtained by the RPCA algorithm are also the closest to the real eigenvectors.
For a missing-data proportion of 15%, this chapter also studies the effect of the cumulative contribution rate (ACR) on FN and CS. As shown in Figure 5, as the cumulative contribution rate increases, the FN and CS values of the dimensionality reduction method proposed in this paper remain smaller than those of the methods that reduce dimensionality after data filling. When the number of samples is large, the FN and CS values of all algorithms are relatively small. In addition, to reduce the impact of the cumulative contribution rate on classification after dimensionality reduction, the cumulative contribution rate in this algorithm is generally set to 85%. To further verify the performance of the RPCA algorithm on simulated data, a classification experiment was carried out on high-dimensional missing data in combination with the LDA algorithm. Three types of simulated data obeying the multivariate Gaussian distributions $N_1$, $N_2$, and $N_3$ are generated, where the mean of the first type of data is 0, the mean of the second type is 1, and the mean of the third type is 1. The three types have the same number of samples. Elements of the data are removed at random according to the ratio (1 − x), giving a total of n samples with missing data, and 1200 data are generated as test samples. n samples are randomly selected as training data, data points in these samples are randomly removed according to the different values, and the remaining samples are used as test data. The average classification accuracy of each algorithm is obtained by averaging the results of 3 experiments.
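The 85% cumulative contribution rate rule mentioned above amounts to keeping the smallest number of leading principal components whose eigenvalues account for at least 85% of the total variance. A minimal sketch, with a hypothetical function name and made-up eigenvalues:

```python
import numpy as np

def n_components_for_acr(eigvals, threshold=0.85):
    """Return the smallest number of leading components whose cumulative
    contribution rate reaches the threshold, together with the ACR curve."""
    vals = np.sort(np.asarray(eigvals, dtype=float))[::-1]   # descending order
    acr = np.cumsum(vals) / np.sum(vals)                     # cumulative share of variance
    k = int(np.searchsorted(acr, threshold)) + 1             # first index reaching threshold
    return min(k, len(vals)), acr

# Made-up covariance eigenvalues, for illustration only
k, acr = n_components_for_acr([5.0, 3.0, 1.0, 0.5, 0.5])
print(k, acr)
```

Here the first two components explain 80% of the variance and the first three explain 90%, so three components are kept at the 85% threshold.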
3.2. Data Filling and Dimensionality Reduction Management
The experiment selects the university dataset and education stage dataset collected by the UCI data center for testing. The breast cancer dataset is a complete dataset, containing 569 samples, divided into two types of data: “benign” tumors and “malignant” tumors, each of which is described by 10 features. As shown in Figure 6, compared with PCA combined with different missing data filling algorithms, the classification performance of the RPCA algorithm combined with the LDA algorithm is relatively high. When the number of samples n decreases or the missing rate increases, the dimensionality reduction performance of the RPCA algorithm may decrease, resulting in an increase in the misclassification rate of the LDA algorithm.
3.3. System Security Detection
As x increases, the correlation between variables gradually increases. In the simulation experiment, x takes the values 0.1, 0.3, 0.6, and 0.8, the sample dimension p is 300, and the number of data samples n varies between 100 and 500. Figure 7 shows the relationship between the number of samples n and the minimum variance loss (MVLF) of the covariance matrix estimates obtained by the sample covariance, nonlinear shrinkage, and eigenvalue clipping methods. Each reported result is the average of 10 experiments, and the logarithm of all experimental results is taken.
As the number of samples n increases, the MVLF values obtained by all covariance matrix estimation methods also gradually become smaller, reflecting that when the samples are sufficient, the covariance matrix estimation methods in the figure can make a good estimate of the overall covariance matrix. In addition, it can be found that when the sample number n is close to the sample dimension p, the MVLF value of the nonlinear shrinkage estimation method will be improved, but it will not affect the estimation of the overall covariance matrix. From the perspective of the specific thinking process, researchers seeking innovation in the complex scientific research process need to have complete and mature creative thinking. To analyze this thinking process concretely, it goes through four stages of “experience, reason, intuition, and inspiration.”
4.1. Random Matrix Management Efficiency
The results of the RMRDA algorithm proposed in this paper and of the comparison algorithms are shown in Figure 8, where RMRDA1 and RMRDA2 denote the versions of LDA improved by nonlinear shrinkage and by eigenvalue interception estimation, respectively. The average classification accuracy of each algorithm in the table is the average over 50 experiments, and the standard deviation (SD) of the 50 classification accuracies is also reported. As can be seen from the table, the performance of the LDA classifier is generally poor, especially when n is smaller than p. The RMRDA algorithm proposed in this paper is better than most of the comparison algorithms, and the classification performance of the RMRDA1 algorithm is better than that of the RMRDA2 algorithm in most cases. However, when the correlation between variables is relatively small (i.e., when the x value is small), the classification effect of RMRDA is better than that of LDA and DLDA. The classification effect of the algorithm is poor. As the correlation between variables increases, the classification accuracy of the RMRDA algorithm becomes higher than that of the other comparison algorithms. The standard deviation of the classification accuracy of the RMRDA algorithm also stays at a low level. The MVLF value of the covariance matrix estimated by the nonlinear shrinkage and eigenvalue interception methods is always smaller than that of the sample covariance matrix. The MVLF value obtained by nonlinear shrinkage is in turn smaller than that obtained by eigenvalue interception, which shows that the nonlinear shrinkage method estimates the overall covariance matrix better in most cases.
The concept of teamwork is the guarantee for the success of innovative teams. Especially after the rapid development of science and technology in China, the resources available to society are gradually decreasing, and the content of scientific research is becoming more and more complex. Many researchers pursue complex interdisciplinary research with limited scientific research resources, which inevitably leads to conflicts, and conflicts evolve into competition. In such a severe social development environment, vicious competition among researchers will only hinder the pace of scientific research development. Only by cooperation can a team be called a team. The work of an innovation team is highly challenging and has great uncertainty. There are often great differences in the values, personalities, experiences, and work skills of team members, which will affect innovation team building.
4.2. Analysis of Management Effectiveness
To further verify the effectiveness of the proposed algorithm, simulated data were generated with x values of 0.2 and 0.7, chosen according to the correlation between the data variables, and the sample size was fixed at n = 240. We then observe how the classification accuracy of the above algorithms changes with the data dimension p. As shown in Figure 9, as the data dimension p increases, the average classification accuracy of almost all algorithms in the figure increases, the exception being the HDRDA algorithm. When the correlation between the sample variables is relatively low, the average classification accuracy of the smDLDA algorithm is relatively high, and when the correlation between the sample variables is relatively high, the classification performance of the RMRDA algorithm is better than that of the other comparison algorithms.
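A simulation of this kind can be sketched as follows. This section does not state the covariance structure, so the code assumes that x enters through entries x^|i-j|; the two-class setup, the mean shift, and all function names are likewise hypothetical choices for illustration.

```python
import numpy as np


def ar1_covariance(p, x):
    """Covariance with entries x**|i-j| (assumed structure; x controls correlation)."""
    idx = np.arange(p)
    return x ** np.abs(idx[:, None] - idx[None, :])


def simulate_two_classes(n, p, x, shift=0.5, seed=0):
    """Draw n samples, split evenly over two Gaussian classes with a shared covariance."""
    rng = np.random.default_rng(seed)
    chol = np.linalg.cholesky(ar1_covariance(p, x))
    labels = np.repeat([0, 1], n // 2)
    noise = rng.standard_normal((labels.size, p)) @ chol.T
    means = np.where(labels[:, None] == 1, shift, 0.0)  # class 1 is shifted in every coordinate
    return noise + means, labels


# Fixed sample size n = 240, two correlation levels, several dimensions p.
datasets = {
    (x, p): simulate_two_classes(n=240, p=p, x=x)
    for x in (0.2, 0.7)
    for p in (60, 240, 480)
}
```

Each entry of `datasets` holds a 240-by-p data matrix and its labels, so a classifier can be evaluated at every (x, p) combination while n stays fixed.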
The innovation, pioneering, and breakthrough of creative thinking are mainly realized through intuition and inspiration. These two ways of thinking, intuition and inspiration, have extraordinary properties. When the subject is in a state of intuition or inspiration, the combination of thinking information presents a state of chaos that moves from macroscopic disorder to microscopic order. When a third “image” or “thought” that seems irrelevant and external enters, the equilibrium state of the entire thinking system is broken, and thinking responds drastically to the new information system and reorganizes itself. The result is often a new theory, which in turn produces innovative scientific research results. The inner working mechanism of intuition and inspiration shows that the neural network of the human brain is a nonlinearly arranged material system, which has the characteristics of divergence and creativity and can produce an overall amplification effect of 1 + 1 > 2 in thinking activities, so it is also called “nonlinear thinking.” From this point of view, the creation of an innovative knowledge system is not a matter of simple simulation, research, and experiment among researchers, but actually a violent nonlinear reaction in the field of thinking.
5. Future Research
To address the problem that the linear discriminant analysis (LDA) algorithm cannot classify high-dimensional data, or classifies it poorly, a regularized discriminant analysis algorithm based on random matrix theory is proposed. First, when the LDA model classifies high-dimensional data, the sample covariance matrix in the model is ill-conditioned or singular, so the classification effect of LDA is poor or classification is impossible. Then, two high-dimensional covariance matrix estimation methods based on random matrix theory, nonlinear shrinkage and eigenvalue interception, are used to re-estimate the covariance matrix in the LDA model and improve the linear discriminant classification model. Finally, the proposed classification model is applied to simulated datasets and to real datasets such as handwritten characters and microarrays. Compared with other high-dimensional data classification methods, the improved discriminant classification algorithm improves the classification accuracy of high-dimensional data.
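The idea of replacing the singular sample covariance matrix inside LDA can be sketched as follows. This is a minimal illustration, not the paper's RMRDA implementation: it assumes that eigenvalue interception resembles the common clipping rule in which eigenvalues below the Marchenko-Pastur upper edge are replaced by their average, it does not implement nonlinear shrinkage, and all function names and simulation settings are hypothetical.

```python
import numpy as np


def clip_eigenvalues(sample_cov, n):
    """Replace eigenvalues below the Marchenko-Pastur upper edge by their mean.

    Assumption: the noise variance is approximated by the average eigenvalue.
    """
    p = sample_cov.shape[0]
    values, vectors = np.linalg.eigh(sample_cov)
    edge = values.mean() * (1.0 + np.sqrt(p / n)) ** 2
    bulk = values <= edge
    cleaned = values.copy()
    cleaned[bulk] = values[bulk].mean()  # keeps the trace unchanged
    return (vectors * cleaned) @ vectors.T


def fit_lda(x, y, cov_estimator):
    """Two-class LDA with equal priors; returns the weight vector and offset."""
    x0, x1 = x[y == 0], x[y == 1]
    mu0, mu1 = x0.mean(axis=0), x1.mean(axis=0)
    centered = np.vstack([x0 - mu0, x1 - mu1])
    dof = len(x) - 2
    pooled = centered.T @ centered / dof  # singular whenever p > dof
    sigma = cov_estimator(pooled, dof)
    w = np.linalg.solve(sigma, mu1 - mu0)
    b = -0.5 * w @ (mu0 + mu1)
    return w, b


def predict(x, w, b):
    """Assign class 1 when the linear discriminant score is positive."""
    return (x @ w + b > 0).astype(int)


rng = np.random.default_rng(0)
p, n_per_class = 100, 20  # p exceeds the number of training samples


def draw(n_each):
    """Hypothetical data: identity covariance, class 1 shifted by 1 per coordinate."""
    y = np.repeat([0, 1], n_each)
    x = rng.standard_normal((2 * n_each, p)) + y[:, None] * 1.0
    return x, y


x_train, y_train = draw(n_per_class)
x_test, y_test = draw(500)
w, b = fit_lda(x_train, y_train, clip_eigenvalues)
accuracy = (predict(x_test, w, b) == y_test).mean()
```

Because the cleaned matrix is positive definite, the linear system for the discriminant direction can be solved even though the pooled sample covariance matrix itself has rank below p.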
The asymptotic and nonasymptotic theories of random matrices break the framework of classical multivariate statistical analysis and are very suitable for the study of statistical characteristics of high-dimensional data, which can help machine learning algorithms to complete the analysis of high-dimensional data and expand their application range. To address the problems of traditional machine learning methods in high-dimensional data analysis, this paper uses the relevant research results of random matrix theory to propose a regularized discriminant analysis algorithm, a regularized discriminant analysis algorithm for good mean estimation, and a high-dimensional missing data analysis algorithm. Correspondingly, team construction should be considered from a nonlinear perspective, following the systematic, integral, and multi-dimensional characteristics of nonlinear systems. Applied specifically to the construction of the team, three principles should first be observed from a macro perspective when the team is created, and the specific application should then be explained in detail. It is hoped that this offers a new perspective on the construction of scientific research innovation teams in colleges and universities and better promotes the development of teams. The cost of data production and storage is getting lower and lower, which has led to the rapid growth of data. The use of data analysis methods to mine the value of data has become a key part of data research. As an important tool for data analysis, machine learning methods have also developed rapidly. With the huge increase in data, more and more high-dimensional data have been collected, and various problems have appeared in the application of multivariate statistical analysis methods and machine learning algorithms to high-dimensional data.
In the future, the innovation, pioneering, and breakthrough of creative thinking are mainly realized through intuition and inspiration.
Data Availability
The data used to support the findings of this study are included within the article.
Conflicts of Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Authors’ Contributions
Zhikai Luo and Mao Zhang contributed equally to this work.
Acknowledgments
This work was supported by the project of Sichuan Vocational College of Health and Rehabilitation: “Research on Improving Ideological and Political Course Teaching Methods in Higher Vocational Colleges in Southern Sichuan under the Background of Changing Student Source Structure” (CWKY-2019Z-09).