Abstract

Big data has become a new driving force for national innovation and entrepreneurship. Colleges and universities, which are responsible for cultivating high-quality innovation and entrepreneurship talent, have achieved certain results, but many problems remain in practice that big data can help to address. Big data and the practice of innovation and entrepreneurship education in colleges and universities share certain inherent commonalities. Integrating big data into this practice requires improvement in top-level design, the data environment, and educational concepts. Educational big data has become a hot topic and a trend in educational research. College students’ entrepreneurship is an entrepreneurial process in which college students and graduate students, as a special group, are the main actors. With the recent transformation of our country and the increasing pressure on employment, entrepreneurship has gradually become a career choice for college students and graduates. Through data mining, the statistical analysis, model construction, and comprehensive reasoning of educational big data are realized, and constructive suggestions and countermeasures are provided for education and teaching. Based on an analysis of the process pattern of educational data mining, the concrete application of educational data mining is analyzed, and the correlations and rules between educational phenomena are identified, so as to provide educational prediction and decision support for the further optimization of teaching.

1. Introduction

With the development of science and technology and the intensification of competition, innovation plays an increasingly prominent, even decisive, role in national competitiveness. As the backbone of innovation, college students are a new force in building an innovative country and bear the historical mission of national and social development. Entrepreneurial activities are at the core of social development, and they are essential in the process of transforming China from a manufacturing power to a creative power. When participating in activities related to entrepreneurship, Premier Li Keqiang has repeatedly stressed that the purpose of the state’s encouragement of entrepreneurship is to let market entities participate independently and to fill society with vitality. The number of college graduates has reached a record high every year. Data show that the number of college graduates in 2020 again reached a historical peak, which undoubtedly brings unprecedented pressure to employment and entrepreneurship. College students are a new force in society. How can they be led to participate consciously in entrepreneurial activities? This has been a hot topic of discussion among experts and scholars at home and abroad in recent years. College students engaged in entrepreneurial activities are a new force on the road of entrepreneurship in China and play an increasingly important role, so college students’ entrepreneurship has attracted attention from all walks of life. “Big data,” in turn, refers to massive, fast-growing, and diversified information assets that require new processing modes to deliver stronger decision-making power, insight and discovery, and process optimization.
Against the background of strong government encouragement for college students to start their own businesses, how can more college students be led to take the initiative to understand entrepreneurship and devote themselves to it? What problems do college students face in entrepreneurship? What problems do the government and universities urgently need to solve? These questions deserve the attention of colleges and universities, the government, and society. With the continuous development of the economy and society and the further implementation of the college enrollment expansion policy in recent years, the number of college graduates nationwide has grown explosively. However, existing positions are far from meeting the increasing demand for employment, so more and more college students face the risk of unemployment after graduation. In the face of such a severe employment situation, the state has issued policies that actively encourage college students to participate in entrepreneurship in order to ease the employment pressure on them. At the first China “Internet +” College Students Entrepreneurship and Innovation Competition, Premier Li Keqiang of the State Council also called on college students to be a new force in implementing the innovation-driven development strategy and promoting mass entrepreneurship and innovation. The relationship between big data and cloud computing is as inseparable as the two sides of a coin. Big data cannot be processed by a single computer, so a distributed architecture must be adopted. Its defining characteristic is the distributed mining of massive data, which must rely on the distributed processing, distributed databases, cloud storage, and virtualization technology of cloud computing.
In addition, in the second quarter of 2015, the State Council twice placed support for college students’ entrepreneurship on the agenda of its executive meetings. With this strong support, “mass entrepreneurship and innovation is continuing to advance to a larger scale, a higher level and a deeper level,” and colleges bear an unshirkable responsibility [1, 2]. Although many achievements have been made in practice, there are still, to a certain extent, tendencies to emphasize theory over practice, form over content, simulation over real-world operation, and professional education over entrepreneurship education [3]. According to the “Tracking Evaluation of the Cultivation Quality of China’s 2017 College Graduates” released by Max, 44% of college students received the education provided by their alma mater, mainly in the form of entrepreneurship teaching courses, and 56% of college students believed that the aspect of education most in need of improvement is “Insufficient practice activities” [4]. At present, the shift in our country’s economy has put forward new and higher requirements for the cultivation of talent in colleges [5]. In today’s economy and society in particular, data has become a foundational strategic resource, and big data is having an increasing impact on various industries [6]. Starting a business requires a clear purpose. At each stage of entrepreneurship, clear goals need to be formulated and broken down in detail. If a team wants to achieve long-term development, it must have long-term development goals. These can be broken down by stage into smaller goals, which can in turn be assigned to everyone involved. In this process, the leader of the entrepreneurial team needs to plan and manage the different goals.
However, due to the lack of policy support, coordinated design, and technical talent, the practice of education in colleges is relatively limited, resulting in uneven data literacy among participants, clear information barriers, narrow and single channels, and low utilization efficiency [7, 8]. How to use big data to build a model that serves the foothold and ultimate goal of education in colleges is becoming an urgent problem [9]. The popularization of data is related to the increase in the amount of network data [10]. We have entered the era of big data [11]. With the rapid progress of the mobile Internet, the teaching quality of our country’s education courses is gradually improving, which opens up great possibilities for education. Big data contains huge social and scientific research value, which has attracted the attention of many researchers [10]. Today, it not only covers all aspects of students’ learning but also provides educators with objective and comprehensive feedback on teaching effects and gives managers an overall basis for assessing education and teaching conditions, so that educational resources can be better distributed and organized [12] and measures for educational reform and development can be better formulated. Therefore, the integration of the education field and big data technology has brought more and better development opportunities for teaching methods, teaching quality, and the cultivation of students [13]. Big data is usually used to describe the large amount of unstructured and semistructured data created by a company. Loading these data into relational databases for analysis costs too much time and money. Big data analysis is often associated with cloud computing, because the real-time analysis of large datasets requires a framework such as MapReduce to distribute the work across dozens, hundreds, or even thousands of computers. Big data is not only a concept and an idea but also a means and a tool for change [14].
Big data and the practice of education in colleges have inherent commonalities in many aspects, can provide theoretical and practical support for each other, and can jointly build a reasonable and effective model for the practice of education in colleges [15].

2.1. The Connotation of Micro-Teaching and the Role of Data Fusion Technology in Education
2.1.1. The Concept and Characteristics of Micro-Teaching

In essence, micro-teaching is a flexible application of project-based teaching that transforms project tasks from actual work into micro-theme learning tasks [16]. Carrying it out as a series of teaching practice activities organized by class hours can better promote interaction between teachers and students and among students [17].

Micro-teaching has four notable features. The first is fragmented teaching time: micro-teaching divides traditional fixed-time teaching into fragments so that students’ problems can be solved in a timely manner [18]. The second is the diversification of teaching media; that is, teaching is delivered through a variety of media. The third is the miniaturization of the teaching group. Generally, a micro-teaching group has no more than 10 people, which allows flexible, fuller, and more in-depth discussion and evaluation and makes it easier to grasp the learners’ needs and their level of knowledge [19]. The fourth is targeted content: students can strengthen their grasp of knowledge points through continuous, repeated, and in-depth discussion [20].

2.1.2. The Role of Data Fusion Technology in Education

Network information big data has two notable characteristics: richness and effectiveness. As a key technology in education, it helps teachers and students obtain rich and effective innovation and entrepreneurship information in a timely manner and quickly trace the dissemination path of important information, which is conducive to analyzing future development trends [21].

Data fusion technology has indirectly affected all social strata and can also help secure financial and policy support for colleges [22, 23].

The big data innovation education platform is used to predict entrepreneurial hotspots and create more entrepreneurial opportunities. Effective innovation and entrepreneurship information is extracted from big data in order to understand new trends in entrepreneurship information, focus on hotspot prediction, and create an excellent environment for innovation practice [24]. College students have learned a great deal of theory in school and have an advantage in advanced technology. At present, the most promising path is to set up high-tech enterprises, and the importance of technology is self-evident. College students’ entrepreneurship is bound to move towards high-technology fields from the beginning. “Exchanging intelligence for capital” is the characteristic and inevitable path of college students’ entrepreneurship.

The big data concept of innovation and entrepreneurship has many advantages over traditional education. In terms of daily course teaching, the latest innovation and entrepreneurship information resources are used to innovate teaching models, change teaching methods, effectively broaden college students’ entrepreneurial ideas, and further improve their sensitivity to entrepreneurial information resources [25]. In terms of practical training, the analysis of examples is used to effectively improve students’ innovative decision-making ability, so as to further improve the innovative education system [26].

2.2. Educational Data Mining and Its Process Mode
2.2.1. Data Mining

The main purpose of data mining is to analyze and synthesize data, discover association rules and knowledge, and provide decision support and services [27]. It extracts knowledge with potential application value that is hidden in the data by means of certain technical methods. In this sense, data mining is also known as Knowledge Discovery in Databases [28]. It is a discipline that crosses, penetrates, and integrates multiple fields: sampling surveys, estimation analysis, and hypothesis testing in mathematical statistics; machine learning, pattern recognition, and intelligent search in artificial intelligence; and data cleaning, knowledge modeling, and visual analysis in databases all provide technical support for data mining [29]. Big data is just a manifestation or feature of the development of the Internet at its current stage. There is no need to mythologize it or to hold it in awe. Against the backdrop of technological innovation represented by cloud computing, data that originally seemed difficult to collect and use have become easy to use. Through continuous innovation in all walks of life, big data will gradually create more value for mankind (Figure 1).

2.2.2. Educational Data Mining and Its Technology

(1) Forecast. For example, learners’ mastery of course knowledge and existing learning risks can be predicted through learners’ online discussions and interactive communication [8, 30].

(2) Clustering. A large dataset is divided into groups according to the intrinsic features of the data, for example, according to learners’ personality characteristics and cognitive styles.

(3) Judgment Process. The data are presented intuitively through graph visualization, and the machine learning model is improved to make the knowledge easier to understand, so that data can be judged and differentiated quickly.

3. Technology

First, the deficiencies of the existing algorithm are identified: the dataset of students’ course selections is large, and the mining results lack pertinence. Based on an in-depth analysis of existing data mining algorithms and of the course selection dataset, the algorithm is improved and an improved integrated association rule algorithm is proposed. This algorithm differs from the traditional support-confidence-based association rule mining algorithm in that it integrates the ball-tree-based K-means algorithm with the correlation-coefficient-based Top-K algorithm. It draws on the analysis and improvement ideas of the previous chapters and is evaluated experimentally on an online course dataset. The experimental process is introduced in detail, and the performance of the improved integrated algorithm is verified by experiments.

3.1. The Principle of K-Means Algorithm

Determine the size of it, and the set is . Record as:

Then, we set the in a dataset.

For each point in the dataset, calculate as follows:
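As a rough illustration of the clustering procedure outlined above, the following minimal sketch implements plain K-means in Python. It is not the ball-tree-based variant proposed in this paper: the random choice of initial centers, the Euclidean distance, the function name `kmeans`, and the toy data are all assumptions made for illustration.

```python
import math
import random


def kmeans(points, k, iters=100, seed=0):
    """Plain K-means on a list of equal-length tuples (illustrative only)."""
    rng = random.Random(seed)
    # Assumption: the initial centers are k distinct points drawn at random.
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if not members:
                # An empty cluster keeps its previous center.
                new_centers.append(centers[j])
                continue
            dim = len(members[0])
            new_centers.append(
                tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
            )
        if new_centers == centers:
            break  # Converged: the centers no longer move.
        centers = new_centers
    return centers, labels


# Hypothetical toy data: two well-separated groups of 2-D points.
data = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (5.0, 5.0), (5.2, 4.9), (4.9, 5.1)]
centers, labels = kmeans(data, k=2)
print(centers)
print(labels)
```

The assignment step is the part that a ball-tree index would speed up, since it replaces the linear scan over all centers with a pruned nearest-neighbor query.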

Therefore, in general, the layer nodes: where is between [1, 11].

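The choice of activation function has a decisive influence on the entire neural network. Without an activation function, each layer is only a linear transformation of its input, and the stacked layers collapse into a single linear map; in this case, such a neural network is meaningless. Therefore, we add activation functions to introduce nonlinearity so that the network makes sense. The commonly used activation functions are as follows. The definition of the Sigmoid function is as follows:

$$\sigma(x) = \frac{1}{1 + e^{-x}}$$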

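When the input is far from zero, the Sigmoid function saturates and its gradient approaches zero. At this time, the neural network is difficult to train effectively because of the vanishing gradient. The Tanh function is also a very common activation function, and its definition is as follows:

$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$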
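The vanishing-gradient behavior mentioned above can be checked numerically. The short sketch below evaluates the derivatives of the standard Sigmoid and Tanh functions at a few inputs; the helper names and the sample inputs are chosen here for illustration only.

```python
import math


def sigmoid(x):
    """Standard logistic function 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative of the Sigmoid: s(x) * (1 - s(x)); its maximum is 0.25 at x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)


def tanh_grad(x):
    """Derivative of Tanh: 1 - tanh(x)^2; its maximum is 1 at x = 0."""
    return 1.0 - math.tanh(x) ** 2


# Both gradients shrink towards zero as |x| grows (saturation).
for x in (0.0, 2.0, 5.0, 10.0):
    print(f"x={x:5.1f}  sigmoid'={sigmoid_grad(x):.6f}  tanh'={tanh_grad(x):.6f}")
```

Because the Sigmoid derivative never exceeds 0.25, multiplying many such factors during backpropagation through deep layers shrinks the gradient quickly. Tanh has a larger peak derivative but saturates in the same way for large inputs.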

3.2. K Value Determination

The value of K in this experiment is determined as follows:

In the KNN algorithm, the ball-tree search query needs to be performed for each point. The following example illustrates the ball-tree search rule, as shown in Figure 2.

According to the structure in the figure, to search for the nearest neighbor of a query point, we take that point as the center and a given distance as the radius, which defines a spherical search region.

We traverse from the root node and recursively search, from top to bottom, each subspace that may contain the nearest neighbor.

Any subspace that intersects the search region is the subspace to search next; we then recursively search it for the set of all points that satisfy the condition.
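The top-down search with pruning described above can be sketched as follows. This is a simplified ball tree written for illustration, not the authors' implementation: the median split along the widest dimension, the `leaf_size` parameter, and the names `Ball` and `nearest` are assumptions.

```python
import math
import random


class Ball:
    """A ball-tree node: a center, a covering radius, and children or points."""

    def __init__(self, points, leaf_size=4):
        dim = len(points[0])
        self.center = tuple(sum(p[d] for p in points) / len(points) for d in range(dim))
        self.radius = max(math.dist(self.center, p) for p in points)
        self.points = None
        self.children = []
        if len(points) <= leaf_size:
            self.points = points  # Leaf node: store the points directly.
            return
        # Assumption: split at the median along the dimension with the largest spread.
        spread = [max(p[d] for p in points) - min(p[d] for p in points) for d in range(dim)]
        axis = spread.index(max(spread))
        ordered = sorted(points, key=lambda p: p[axis])
        mid = len(ordered) // 2
        self.children = [Ball(ordered[:mid], leaf_size), Ball(ordered[mid:], leaf_size)]


def nearest(node, query, best=(math.inf, None)):
    """Return (distance, point) for the nearest neighbor of `query`."""
    # Prune: by the triangle inequality, no point inside this ball can be
    # closer to the query than dist(query, center) - radius.
    if math.dist(query, node.center) - node.radius >= best[0]:
        return best
    if node.points is not None:
        for p in node.points:
            d = math.dist(query, p)
            if d < best[0]:
                best = (d, p)
        return best
    # Visit the closer child first so the bound tightens early.
    for child in sorted(node.children, key=lambda c: math.dist(query, c.center)):
        best = nearest(child, query, best)
    return best


# Hypothetical data: 200 random points in the unit square.
rng = random.Random(0)
pts = [(rng.random(), rng.random()) for _ in range(200)]
tree = Ball(pts)
q = (0.5, 0.5)
dist, point = nearest(tree, q)
print(dist, point)
```

The radius of the search region here is the distance to the best candidate found so far, so the region shrinks as the search proceeds and more subspaces can be skipped.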

3.3. Analysis and Improvement of Top-K Mining Algorithm

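For the rule $A \Rightarrow B$ or $B \Rightarrow A$, the lift is defined as:

$$\mathrm{lift}(A \Rightarrow B) = \frac{\mathrm{supp}(A \cup B)}{\mathrm{supp}(A)\,\mathrm{supp}(B)}$$

The lift coefficient measures how strongly the occurrence of item set A affects the occurrence of item set B. It has several issues. A lift greater than 1 indicates promotion and a lift less than 1 indicates inhibition, but the case where the lift equals 1 is ignored.

In addition, the two rules $A \Rightarrow B$ and $B \Rightarrow A$ have the same lift:

$$\mathrm{lift}(A \Rightarrow B) = \mathrm{lift}(B \Rightarrow A)$$

After the dataset has been processed with K-means, association rule mining can be performed on each cluster. Support is defined as:

$$\mathrm{supp}(A \Rightarrow B) = \mathrm{supp}(A \cup B) = \frac{|\{T \in D : A \cup B \subseteq T\}|}{|D|}$$

where $A$ is the antecedent item set, $B$ is the consequent item set, and $D$ is the entire dataset. Confidence is defined as:

$$\mathrm{conf}(A \Rightarrow B) = \frac{\mathrm{supp}(A \cup B)}{\mathrm{supp}(A)}$$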
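To make the three measures concrete, the following sketch computes support, confidence, and lift under their standard textbook definitions. The function names and the course-selection records are hypothetical and are not taken from the experimental dataset.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(a, b, transactions):
    """Share of the transactions containing `a` that also contain `b`."""
    return support(set(a) | set(b), transactions) / support(a, transactions)


def lift(a, b, transactions):
    """Confidence of a -> b divided by the support of b."""
    return confidence(a, b, transactions) / support(b, transactions)


# Hypothetical course-selection records: one set of courses per student.
records = [
    {"math", "stats"},
    {"math", "stats", "python"},
    {"math", "python"},
    {"stats"},
    {"python"},
]

a, b = {"math"}, {"stats"}
print("support   :", support(a | b, records))
print("confidence:", confidence(a, b, records))
print("lift a->b :", lift(a, b, records))
print("lift b->a :", lift(b, a, records))
```

The last two lines print the same value, which illustrates the symmetry of lift: it cannot distinguish the direction of a rule, so it has to be combined with a directional measure such as confidence.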

The algorithm requires the value of K (the number of rules to generate), the minimum support, the minimum confidence, and the correlation coefficient to be set. Mining is then run on each subset of the dataset separately. Following the Top-K calculation process analyzed in the previous chapter, the algorithm first reads in the data, then judges for the first time whether a 1-1 rule can be generated according to the “support, confidence” metric, and puts the generated 1-1 rules into the L and R sets through the SAVE procedure.
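The first pass described above, generating 1-1 rules and keeping the best K, can be sketched as follows. This is a deliberately simplified illustration and not the algorithm of this paper: it omits rule expansion, the L and R sets, the SAVE procedure, and the correlation coefficient, and it ranks rules by support and then confidence, which is an assumption. The function name and the sample records are hypothetical.

```python
from itertools import permutations
import heapq


def top_k_one_to_one_rules(transactions, k, min_support, min_confidence):
    """Return up to k rules ((a, b), support, confidence), strongest first."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    # Number of transactions containing each single item.
    count = {i: sum(i in t for t in transactions) for i in items}
    rules = []
    for a, b in permutations(items, 2):
        both = sum(a in t and b in t for t in transactions)
        supp = both / n
        if supp < min_support:
            continue  # The rule a -> b is too rare.
        conf = both / count[a]
        if conf >= min_confidence:
            rules.append(((a, b), supp, conf))
    # Keep only the k rules with the highest (support, confidence).
    return heapq.nlargest(k, rules, key=lambda r: (r[1], r[2]))


# Hypothetical course-selection records: one set of courses per student.
records = [
    {"math", "stats"},
    {"math", "stats", "python"},
    {"math", "python"},
    {"stats"},
    {"python"},
]

rules = top_k_one_to_one_rules(records, k=2, min_support=0.3, min_confidence=0.6)
for (a, b), supp, conf in rules:
    print(f"{a} -> {b}: support={supp:.2f}, confidence={conf:.2f}")
```

Running the same function on each K-means cluster separately, rather than on the whole dataset, corresponds to the per-subset mining step described in the text.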

4. Experimental Results and Analysis

The log function of the Moodle platform is relatively complete, which can completely record the activities on the network platform.

After systematic clustering analysis, a clustering tree diagram of various member lists is obtained, which presents the interrelationship of various learning behaviors when learners use the Moodle platform in a tree structure, and realizes a more intuitive and effective visualization of clustering results, specifically shown in Figure 3.

In order to achieve a more targeted analysis of students’ learning behavior, SPSS software is further used to draw radar charts, and the three types of student learning behavior classification radar charts shown in Figure 4 are obtained.

From the above radar chart of learning behavior, it can easily be seen that the average frequency of each type of behavior differs across the groups. The access frequency of each module is used for group classification, which shows that the clustering has a certain scientific validity.

We can judge whether the selected learning rate is appropriate from the prediction accuracy of the corresponding model at that learning rate. We set the learning rate to 0.1, 0.5, and 0.8, respectively, and built the model multiple times to conduct comparative experiments. The results are shown in Figure 5.

The model is built 10 times, and the accuracy of each run is taken for comparison. The accuracy of each model is not high and fluctuates greatly. To a large extent, the reason is that the learning rate is too high: the minimum reached in each run is unstable, and the final value is very likely to overshoot the minimum point.
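The overshooting effect can be reproduced on a toy problem. The sketch below runs gradient descent on the hypothetical objective f(x) = x^2, not on the model used in the experiments, and counts how often an update jumps across the minimum at x = 0 for each of the three learning rates.

```python
def gradient_descent(lr, steps=10, x0=1.0):
    """Minimize the toy objective f(x) = x**2 and return all visited points."""
    xs = [x0]
    for _ in range(steps):
        grad = 2.0 * xs[-1]          # f'(x) = 2x
        xs.append(xs[-1] - lr * grad)
    return xs


def crossings(xs):
    """Count the updates that jump over the minimum at x = 0."""
    return sum(1 for a, b in zip(xs, xs[1:]) if a * b < 0)


for lr in (0.1, 0.5, 0.8):
    path = gradient_descent(lr)
    print(f"lr={lr}: crossings={crossings(path)}, final |x|={abs(path[-1]):.6f}")
```

On this objective each update multiplies x by (1 - 2 * lr), so a learning rate above 0.5 makes every step land on the opposite side of the minimum. On a real, non-convex loss surface the same oscillation can make the final accuracy vary noticeably from run to run.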

From Figure 6, the old and new algorithms are trending. The new and old algorithms both show significantly less in the original algorithm.

The value of K is gradually increased from 50 to 500 in steps of 50, and the performance of the algorithm is recorded at each step (Figure 7).

Across the test datasets, as the proportion of samples used increases from 50% to 75% to 100%, the running time and memory usage of the algorithm also increase. This is in line with our expectations: as the number of samples increases, the number of 1-1 rules that need to be created and the number of rules that need to be expanded both increase, so the running time and memory footprint inevitably increase as well.

We also use the chess dataset from the UCI repository for experimental comparison. We run the two algorithms 10 times under the same parameter conditions. Because the Top-K algorithm requires the number of rules to generate to be set, we first run the NegFIN algorithm with a support of 0.4 and then specify the K value of the Top-K algorithm according to the number of item sets mined by NegFIN. The results are shown in Figure 8.

5. Conclusion

Modern college students have a spirit of innovation, confidence, and a desire to challenge traditional concepts and traditional industries. This spirit often becomes the driving force of college students’ entrepreneurship and the spiritual foundation of successful entrepreneurship. College students hold entrepreneurial dreams of working hard and creating wealth. Combined with big data technology, an innovative education resource database, that is, a big data platform for education, is constructed. To address the problem that the mining algorithm mines the entire dataset as a whole, the ball-tree structure is used to improve the K-means algorithm, and the dataset is eventually separated into a suitable number of clusters. Then, the difference between the traditional algorithm and the Top-K algorithm is analyzed, the correlation coefficient is combined with the Top-K algorithm after the pruning process of the rules is analyzed, and each cluster is mined with the correlation-coefficient Top-K algorithm; the results and efficiency are analyzed experimentally. The approach breaks with the traditional teaching mode of practice and innovation and creates rich data resources, such as industry data, technological innovation data, college education mode data, policy data, and market demand data, to provide a driving force for education reform. Building a micro-teaching system for education and introducing fragmented, diversified, and miniaturized micro-teaching methods into the whole cycle of practical innovation education can effectively promote teaching reform and continuously improve the teaching effect of practical innovation in universities. What investors value is how high the real technological content of an entrepreneurial plan is, how difficult it is to replicate, and its potential for market profit.
For these, a detailed feasibility study and implementation plan are required; a few words about an idea are not enough to persuade people to invest. In the process of entrepreneurship, things should be planned in advance. When making a plan, we must integrate various factors into a practical breakdown of actions, taking every possible detail into account. During implementation, we should make timely adjustments according to the specific situation at the time. Running a business requires strong plan management ability, and only with this ability can we get closer to successful entrepreneurship.

Data Availability

The figures used to support the findings of this study are included in the article.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors would like to express sincere thanks to those who have contributed to this research.