Computational thinking (CT) is an approach that applies the fundamental concepts of computer science to solve problems, design systems, and understand human behavior, and it can help students develop lifelong learning skills and generate new topics. It has become one of the competencies expected of the next generation of talent. However, research on the evaluation of computational thinking is still underdeveloped, and existing evaluation studies are largely limited to traditional curriculum evaluation methods. As a result, the effect of computational thinking training cannot be well quantified, and the characteristics of students cannot be further explored. In this work, we propose a three-way decision model for improving computational thinking. We first develop a system of evaluation metrics comprising five primary indicators and several secondary indicators. Next, the weight of each indicator is determined by applying an expert similarity measure, which yields the optimal metric sequence. We then employ grey correlation analysis to calculate the distance of each test result from this optimal sequence. Inspired by the three-way decision, we trisect the set of test takers based on this distance into three regions of high-score, medium-score, and low-score sequences. We can then apply the resulting rules to target students in the lower regions to improve their computational thinking. An example analysis illustrates the effectiveness and applicability of the method. This article provides a solid theoretical basis for improving students’ computational thinking ability. Teaching administrators can conveniently formulate computational thinking teaching strategies, and timely warning and intervention for students with poor computational thinking ability can effectively improve that ability.
The corresponding training measures are given to students of different ability levels to achieve differentiated and personalized training.

1. Introduction

With the rapid development of artificial intelligence and information technology, human thinking is changing, and computational thinking has become essential in the information age. As a new method of the intelligent information age, computational thinking is a kind of thinking activity that flexibly uses computational tools and strategies to solve problems. Cultivating computational thinking can promote people's comprehensive development and benefit them for a lifetime.

Computational thinking has attracted widespread attention in international primary education since it was proposed in 2006 [1], and curriculum standards related to computational thinking have been developed. The U.S. K-12 Computer Science Standards (CSTA), published in 2011, included computational thinking as a critical element of the computer science curriculum. The British Ministry of Education issued the Computational Learning Plans I–IV in 2013 to guide the development of computational thinking skills for students in primary education in the UK. In 2015, the Australian Ministry of Education released the Digital Technology Curriculum Standards, emphasizing that people need computational thinking literacy in a digital information society. China has also gradually paid more attention to the development of computational thinking education. In 2010, the C9 University Consortium emphasized that developing computational thinking skills would be a significant, long-term, and complex core task of primary computer teaching. In 2012, the Ministry of Education made the cultivation of computational thinking a priority and pushed the reform of the computer curriculum to improve the practical application of computers and realize computer empowerment education. In 2017, it included computational thinking in the General High School Information Technology Curriculum Standards as one of the four core elements of the subject.

Thus, computational thinking education is gradually reaching younger learners, and more and more teachers and parents are paying attention to cultivating computational thinking skills from a young age. Cultivation models and teaching strategies for computational thinking have been steadily put into practice, but how effective is the cultivation? Are students’ computational thinking skills improved, and how are they being evaluated? Without reliable assessment tools or methods, it is not easy to make the best use of computational thinking when it is integrated into educational curricula. Evaluation is crucial for developing computational thinking and is a prerequisite for developing students’ computational thinking skills. Pedagogical evaluation is a guide for developing computational thinking and a guarantee of its sustainability.

Only by fully understanding the shortcomings in the development of computational thinking can we design a scientific, reasonable, and well-targeted assessment system and thus develop students’ computational thinking skills effectively. Research on computational thinking evaluation is still in its infancy. There is still a lack of professional evaluation systems and evaluation methods that can quantify the effect of developing computational thinking. Combining teaching practice with quantitative evaluation of students’ computational thinking ability is the next question researchers must consider. Research on subsequent cultivation strategies is only possible once students’ computational thinking ability is fully understood. Therefore, a scientific and reasonable teaching evaluation will have a decisive influence on the cultivation of computational thinking. By evaluating students’ computational thinking skills, it is possible to grasp their abilities and give different training strategies to students with different characteristics, thus meeting society’s demand for individualized talents. The evaluation results can reveal the features of students at varying levels of ability, so that corresponding training can be given to each level, thus achieving differentiated and individualized training. Therefore, a reasonable evaluation model and research on mining ability features are significant for the personalized development of students’ computational thinking.

We organized the rest of this study as follows. In Section 2, we review the strategies for developing computational thinking and measuring computational thinking. Section 3 proposes a computational thinking evaluation metric framework. We employ the grey correlation between the comparative sequence of the test taker and the optimal reference sequence to construct three regions of high level, medium level, and low level, according to two thresholds by sorting them according to the correlation value. In Section 4, we perform a three-way classification and determine the final category. Then, the hidden association rule properties behind the student evaluation results are mined based on the Apriori algorithm. Section 5 gives a summary and planning for future work.

This section will review computational thinking and its related evaluation methods and then review methods such as grey correlation analysis and three-way decision.

2.1. Computational Thinking Development Strategies

Robotics and programming are crucial vehicles and avenues for the development of computational thinking. Angeli and Valanides [2] studied the effect of educational robots on the computational thinking of students of different genders. The results showed that boys benefited more from spatial orientation and manipulative activities, while girls benefited more from collaborative writing activities. This research contributes to the body of knowledge about teaching computational thinking, and the results can inform the design of lessons and classroom activities that focus on a broader range of computational thinking skills. Chalmers [3] studied how Australian elementary school teachers integrated robotics and coding in their classrooms and its impact on students’ computational thinking skills. The results showed that using robotic tools and activities for exploration can help teachers build confidence and a body of knowledge. Relkin et al. [4] studied changes in computational thinking skills in first- and second-grade students. The results show that teaching young children to code can accelerate the development of their computational thinking skills. Özmutlu et al. [5] studied the impact of short-term, intensive coding and robotic training on the self-efficacy of middle school students’ computational thinking skills.

There are many possibilities to explore the impact of these experiences on elementary and students in the areas of coding, robotics, mobile devices, Arduino-based applications, and game-based learning. Gadzikowski [6] designed a coding, robotics, and engineering course in which young students learn knowledge, such as coding, robotics, and engineering concepts, and practice skills, such as creative problem-solving, computational thinking, and critical thinking. Qu and Fok [7] focused on student-robot interactions in robotic education and attempted to cultivate students’ computational thinking skills. Chevalier et al. [8] discussed how educational robotics fostered computational thinking skill development and confirmed that robotic education is necessary for specific teaching interventions.

Xiao and Yu [9] explored teaching computational thinking in four successive stages, namely problem identification and decomposition, system abstraction and solution design optimization, solution implementation, and problem migration, from an engineering design perspective on problem-solving. Vesikivi et al. [10] focused on teaching computational thinking and on how different types of teaching methods and research designs affect the development of computational thinking. Cui and Ng [11] studied evidence-based directions towards enriching mathematics education with computational thinking. Grover et al. [12] examined the relationship between cognitive level and computational thinking through students’ programming behaviors, thus showing the superiority of programming instruction as a means of computational thinking development. Based on computational review and App Inventor characteristics, Ku [13] proposed developing students’ computational thinking skills with the teacher as the designer, organizer, and guide and with App Inventor as the learning tool. The method motivates students to actively use computational thinking to analyze and solve problems, with teacher-student cooperation and student-student cooperation as the forms of learning.

2.2. Computational Thinking Evaluation Methods

Existing computational thinking evaluation methods include programming task-based assessments [14, 15] and scale assessments [16–19]. Automatic scoring systems based on programming tasks score the test taker’s computational thinking skills according to the learner’s program code. Another approach to programming-based assessment is the design of a computational thinking assessment framework, which evaluates programming items based on the computational thinking concepts, practices, and perspectives involved in the programming project.

The scale assessment methods include the test-based evaluation scale CTt, which assesses computational thinking through students’ answers to actual project items. There are also evaluation scales based on the five factors of computational thinking, designed to evaluate students’ computational thinking based on their behavioral data, and an evaluation scale based on self-efficacy, which evaluates the learners’ level of computational skills. Román-González et al. developed a multi-competency test-based evaluation scale, CTt, to assess the computational thinking ability of the subjects. Korkmaz et al. used the theoretical framework of computational thinking proposed by ISTE as a basis to design the computational thinking scale (CTS). In 2015, Korkmaz et al. designed the scale to measure college students’ level of computational thinking skills; it comprised 5 factors and 29 measures, and its reliability and validity were validated. It was later revised by Korkmaz et al. and oriented to students at the K-12 level. The revised CTS still contains the original five factors, now with 22 measures, retains the same validity and reliability, and focuses on measuring different age groups. Kukul et al. developed the computational thinking self-efficacy scale (TSES), through which learners self-assess their level of computational competence. Brennan et al. proposed a three-dimensional evaluation framework and argued that assessment can be carried out in terms of the concepts (e.g., sequence, loop, and parallelism), practices (e.g., incremental and iterative, testing and debugging, and reuse and recreation), and perspectives (e.g., expression, communication, and questioning) of computational thinking.

The above assessment methods collect student data and scores based on items or scales. Scoring learners on each task according to teachers’ experience is, first, subjective and, second, ignores the levels and weights of the evaluation indices. Simple statistical measures of the effect of computational thinking training cannot reveal the deeper relationships among students, which is not conducive to proposing targeted training strategies. Analyzing students’ data and exploring the hidden relationships between students’ computational thinking levels is an urgent problem to be solved.

2.3. Three-Way Decision and Three-Way Classification

With the rapid development of massive data and artificial intelligence, decision-making has become increasingly prominent [20, 21]. Instead of the traditional binary classification, Yao first outlined a three-way decision theory [22], which is applied to classification problems by “thinking in three.” A third alternative, the boundary region, is introduced, which is associated with deferred or indeterminate classification decisions. The three-way classification has been extensively investigated and applied in various situations [23].

The trisecting-acting-outcome (TAO) model of a three-way decision encompasses three components. Trisecting divides a whole into three pairwise disjoint or weakly joint regions. Acting devises action strategies for the three regions. Outcome evaluation measures the effect of the trisection and the action strategies [24–26]. The TAO model has emerged as a new three-way decision model that promises to make the three-way decision smarter. The major concern regarding the TAO model is the outcome, that is, the effectiveness of trisecting and acting.

Using three-way classification in developing teaching strategies is a significant experiment. The framework for measuring and improving the level of computational thinking using three-way decision, especially the TAO model in this study, is illustrated in Figure 1.

All students are assessed as a whole, as shown in “A whole,” based on specific multilevel metrics, which may have two levels. The whole is then divided into three segments: high-level, medium-level, and low-level regions. Students in the high-level region have better marks on particular measures, while students in the low-level region have worse marks on specific criteria. Analyzing these specific characteristics allows teachers to design customized instructional strategies that further develop students’ specific competencies and improve their computational thinking. These instructional strategies form the “strategies” node. We can assess the benefits of these two processes through the “outcome evaluation.”

3. Evaluation of Computational Thinking Using Three-Way Decision

This section first proposes a computational thinking evaluation index system. It assigns weights to each index through expert clustering, which fully reflects the contribution of different experts to the index weights and avoids the drawback of relying on a single subjective opinion. Then, using the weighted grey correlation analysis method, the grey correlation degree between the comparison sequence of each test taker and the optimal reference sequence is calculated and analyzed. The test takers are sorted according to the value of the correlation degree and given an initial classification based on preset thresholds: two thresholds divide the sorted students into three regions of high level, medium level, and low level.
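The trisection step described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `trisect`, the threshold names `alpha` and `beta`, and the sample correlation degrees are hypothetical and do not come from the study.

```python
from typing import Dict, List, Tuple


def trisect(scores: Dict[str, float], alpha: float, beta: float
            ) -> Tuple[List[str], List[str], List[str]]:
    """Split students into high, medium, and low regions using two thresholds.

    scores maps a student id to a grey correlation degree.
    alpha is the (assumed) lower bound of the high region,
    beta is the (assumed) upper bound of the low region.
    """
    if not alpha > beta:
        raise ValueError("expected alpha > beta")
    # Sort students from the largest to the smallest correlation degree.
    ranked = sorted(scores, key=scores.get, reverse=True)
    high = [s for s in ranked if scores[s] >= alpha]
    low = [s for s in ranked if beta >= scores[s]]
    medium = [s for s in ranked if alpha > scores[s] > beta]
    return high, medium, low


# Hypothetical correlation degrees for five students.
degrees = {"s1": 0.91, "s2": 0.74, "s3": 0.55, "s4": 0.62, "s5": 0.83}
high, medium, low = trisect(degrees, alpha=0.8, beta=0.6)
print(high, medium, low)
```

The three lists are pairwise disjoint and cover every student, which matches the trisecting step of the TAO model; how the two thresholds should be chosen is left open here.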

3.1. Computational Thinking Evaluation Metric Framework

Computational thinking has different components, according to various scholars and research institutions. MIT’s NEET program considers computational thinking to be the application of fundamental computational procedures, data structures, and algorithms to other social systems, such as production and life. Özgen considers computational thinking as a body of knowledge, skills, and attitudes that enables computers to be used to solve real-life problems. The British School Computing Curriculum Working Group [27] considers that the elements of computational thinking include logical, algorithmic, recursive, and abstraction skills. Brennan and Resnick [28] hold that computational thinking comprises three major components: computational concepts, computational practices, and computational perspectives, containing 16 areas of skills. Settle and Perkovic proposed a conceptual framework of computational thinking from the perspective of computer principles. ISTE believes that computational thinking comprises five components: creativity, algorithmic thinking, critical thinking, problem-solving, and collaboration. Selby and Woollard [29] argue that computational thinking comprises decomposition, abstraction, generalization, algorithm, and evaluation. Korkmaz et al. [17] argued that computational thinking includes cognitive and application-based knowledge structures related to computer science, e.g., problem representation and solving, and abstraction. Many researchers have continuously explored and refined computational thinking.

Since the concept of computational thinking was put forward, there has been a lot of research on the interpretation of its connotation, teaching methods, models, etc. However, there are relatively few studies on the evaluation of computational thinking. The existing evaluation methods of computational thinking include the table evaluation method, work analysis evaluation method, interview evaluation method, question evaluation method, and evaluation of related computational thinking. However, most of these evaluation methods focus on simple score evaluation, and the evaluation indicators are not weighted. The computational thinking characteristics of students behind these achievements are not deeply explored. Thus, the weighting of evaluation indicators and the classification and mining of student characteristics are the focus of this article. At the same time, the traditional student classification method classifies students as good or poor according to their rank or total proportion. It considers neither the relationships among the multiple constituent indicators nor their hierarchy. In particular, for classifying middle school students, there is a problem of inaccurate classification, and the resulting inaccurate implementation of teaching strategies brings additional teaching costs.

We integrate the five significant elements to design a computational thinking level evaluation metric based on the principles, including scientificity, feasibility, comprehensiveness, and independence. It incorporates the evaluation characteristics of programming education from the connotation and components of computational thinking and takes programming as a fundamental approach to cultivate computational thinking.

This study’s evaluation metric framework of computational thinking contains five first-level indicators: problem decomposition, abstraction, pattern generalization, algorithm, and evaluation. On this basis, more fine-grained second-level metrics are established to build a hierarchy of computational thinking evaluation metrics, as shown in Figure 2.

3.1.1. Second-Level Evaluation Metrics

The core of computational thinking is the logical decomposition of large problems into smaller modules that are easier to solve. The indicator “decomposition” includes two secondary measures: the ability to analyze the material studied and the ability to decompose the problem. The former refers to understanding the material, organizing and analyzing it logically, and clarifying the problem’s core. The latter refers to decomposing complex problems into smaller problems, clarifying the relationships between the smaller problems, and establishing a logical sequence of the different parts.

“Abstraction” refers to extracting core things or critical data from many transactions and ignoring irrelevant details. The final formal representation is the transformation of data or problems into a data structure or formal mathematical model suitable for computer processing. It comprises three secondary metrics: conceptual analysis, inductive extraction, and formal representation. Conceptual analysis refers to the ability to identify the various concepts contained in a transaction and to clarify each concept and the relationships between concepts by means of comparison, judgment, and reasoning. Inductive extraction refers to the ability to extract the common essential properties, methods, and rules of different things and then to exclude the nonessential parts or irrelevant details that are particular to specific things. Formal representation refers to representing a problem so that a computer can solve it, thus forming an abstract representation and a visual representation of the object.

“Pattern” (also called generalization) is a general pattern for solving a class of problems. It summarizes specific problem-solving patterns by continuously comparing, abstracting, and generalizing problems and extends them to the solution of similar problems. It includes three secondary metrics: model construction, structural specification, and stable operability. Model construction is the ability to summarize a pattern from the current problem, clarify the type of pattern, and be familiar with the things to be solved so that the pattern can be applied to things of the same type. Structural specification means that the structure of the pattern is hierarchical and logical, and the elements represented by the pattern are simple and reflect the core and essence. Stable operability indicates that the pattern can be applied to similar problems with simple modifications and has high applicability.

“Algorithm” is a series of computer instructions for solving a problem, a collection of finite rules. Algorithmic thinking and computer systems can form a series of automated solutions to problems. It consists of four secondary indicators: data representation, functional refinement, clear process flow, and programming and debugging. Data representation means that variables can be extracted, their types can be determined, the relationships between the data can be analyzed, and, finally, the appropriate data structure can be chosen according to the needs of the problem. Functional refinement means clarifying the program’s specific functions, sorting out the logical relationships between functions, and defining different functions. Clear process flow means that a suitable structure can be built with flowcharts. Programming and debugging mean choosing the proper statements, translating the problem into a program, and debugging the errors to build a readable program.

“Evaluation” is the process of using practical steps and resources to arrive at the most appropriate solution, procedure, or algorithm by weighing the pros and cons and finding the ideal solution that is most applicable. It consists of three secondary metrics, namely completion, process optimization, and usability. Completion indicates whether the basic functionality can be accomplished as required and allows the correctness of the solution to be assessed. Process optimization refers to optimizing the program to make it functionally richer. Usability means that the program has a certain level of usability or better performance.

3.2. Establishing Three Partitions Based on Weighted Grey Correlation Analysis

Since the importance of metrics is different, it is necessary to distinguish the role of each metric in the overall evaluation, and determining the weight of metrics is one of the core issues of evaluation. In this section, the cosine similarity among experts completes the expert clustering. The intra-class weights and interclass weights of experts are based on the information entropy and the ratio of clustering numbers, respectively. Finally, the proportion of each expert is calculated comprehensively. Then, the final metric weights are obtained by combining the multiplicative sum of the initial value of each expert-rated metric and the proportion of each expert. This method can reflect the individual expert weights and comprehensively consider each expert’s contribution to the index weights.

3.2.1. Weighting Analysis Based on Expert Clustering

Evaluation of computational thinking evaluates the combined effect of multiple factors rather than a single evaluation. On the basis of the evaluation metric framework for computational thinking in Figure 2, we derived a final evaluation for each student and divided this evaluation into three subdivisions, i.e., high-level area, medium-level area, and low-level area. The collection of the metrics includes the following:

Because the importance of each metric is different, we need to determine the weight of each metric in the final evaluation outcome. In this study, we use the method of expert scoring. To avoid the cumulative effect on the metric weights of experts of the same type or with similar background knowledge, we first cluster the experts through the cosine similarity between them. Second, we calculate the proportion of each expert from the intra-class and interclass weights of the experts, which are based on information entropy and the proportion of cluster sizes, respectively. Finally, the metric weights are obtained as the sum of the products of the initial metric values given by each expert and the proportion of that expert. This method reflects the weight of individual experts and considers each expert’s contribution to the metric weights.

Assume that several experts each score the importance of every metric, so that the scores form an importance matrix with one row per expert and one column per metric. The similarity between each pair of experts is then calculated as the cosine similarity of their score vectors.
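For reference, the cosine similarity between the score vectors of two experts can be written as below. The symbols are illustrative assumptions introduced here: $a_{ik}$ is the score that expert $i$ gives to metric $k$, $m$ is the number of metrics, and $s_{ij}$ is the similarity between experts $i$ and $j$.

$$
s_{ij} = \frac{\sum_{k=1}^{m} a_{ik}\, a_{jk}}{\sqrt{\sum_{k=1}^{m} a_{ik}^{2}}\;\sqrt{\sum_{k=1}^{m} a_{jk}^{2}}}
$$

For non-negative scores, $s_{ij}$ lies in $[0, 1]$, and values close to 1 indicate experts whose ratings point in nearly the same direction.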

The information entropy of the metric evaluation vector of each expert is then computed from the proportion that the expert’s score for each metric contributes to the sum of all experts’ scores for that metric.
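One plausible reading of this definition, in the plain Shannon form, is sketched below. The notation is assumed for illustration: $a_{ij}$ is the score of expert $i$ for metric $j$, $n$ is the number of experts, $m$ is the number of metrics, $p_{ij}$ is the proportion described above, and $H_i$ is the entropy of expert $i$. Whether an additional normalizing constant such as $1/\ln m$ is applied is not known from the text.

$$
p_{ij} = \frac{a_{ij}}{\sum_{l=1}^{n} a_{lj}}, \qquad
H_i = -\sum_{j=1}^{m} p_{ij} \ln p_{ij}
$$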

Interclass expert weights: the experts are divided into classes, and the weight of each class is determined by the number of experts it contains.

Intra-class expert weights: within each class, the weight of an expert is an entropy weight derived from the information entropy of that expert’s evaluation vector.

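Expert aggregate weights: the aggregate weight of each expert combines the weight of the expert’s class with the expert’s intra-class weight.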
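As a hedged illustration of how the three kinds of expert weights could fit together, the block below uses forms that are assumptions rather than the exact definitions of this study. Here $n$ experts are divided into classes $C_1, \ldots, C_K$, $\varphi_k$ is the number of experts in class $C_k$, $H_i$ is the information entropy of expert $i$, $\lambda_k$ is the class weight, $\mu_i$ is the intra-class weight, and $\omega_i$ is the aggregate weight. The inverse-entropy form of $\mu_i$ and the product form of $\omega_i$ are illustrative choices.

$$
\lambda_k = \frac{\varphi_k}{n}, \qquad
\mu_i = \frac{1/H_i}{\sum_{l \in C_k} 1/H_l} \quad (i \in C_k), \qquad
\omega_i = \lambda_k\, \mu_i
$$

Under these assumptions, the intra-class weights sum to 1 within each class and the class weights sum to 1, so the aggregate weights satisfy $\sum_{i=1}^{n} \omega_i = 1$.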

Metric weights: the weight vector of the experts is multiplied by the standardized importance matrix and the columns are summed, i.e., the terms with the same metric subscript are added. The resulting vector is the metric weight vector.
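In symbols, this column-sum step could be written as follows. The notation is assumed for illustration: $\omega_i$ is the aggregate weight of expert $i$, $\tilde{a}_{ij}$ is the standardized score of expert $i$ for metric $j$, and $w_j$ is the weight of metric $j$.

$$
w_j = \sum_{i=1}^{n} \omega_i\, \tilde{a}_{ij}, \qquad j = 1, \ldots, m
$$

As a small hypothetical example, take two experts with aggregate weights $0.6$ and $0.4$ and standardized score rows $(0.5, 0.5)$ and $(0.25, 0.75)$. Then $w_1 = 0.6 \cdot 0.5 + 0.4 \cdot 0.25 = 0.4$ and $w_2 = 0.6 \cdot 0.5 + 0.4 \cdot 0.75 = 0.6$, so the metric weights again sum to 1.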

Finally, we cluster the experts based on the expert similarity matrix. The procedure is given in Algorithm 1.

Input: initial scoring of the metrics by n experts
Output: weights of the metrics
calculate the expert similarity matrix
for i = 1 to n do
  for j = 2 to n do
    cluster similar experts
    // initialize the largest collection and class among experts
    // initialize the maximum number of similar classes among experts
    if a pair of similar experts is found then
      cluster the corresponding two experts
repeat the above steps for the remaining experts
// initialize the expert collection
for i = 1 to k-1 do
  for j = 2 to k do
    combine the collections containing the same experts in the pairwise clusters
  if unclustered experts remain then
    separate each of them into a class of its own
calculate the weight of each class
calculate the information entropy of the metric evaluation vector of each expert
calculate the entropy weight of each expert within its class
calculate the expert aggregate weights
calculate the metric weights

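
The weighting procedure can be sketched in Python as below. This is a minimal illustration, not the exact algorithm of the study. The following are assumptions made here: the function names, the similarity threshold of 0.99, merging similar pairs with a union-find structure, class weights proportional to class size, intra-class weights proportional to inverse entropy, and row normalization as the standardization step.

```python
import math
from typing import Dict, List


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity between two score vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cluster_experts(scores: List[List[float]], threshold: float) -> List[List[int]]:
    """Group experts whose pairwise similarity reaches the threshold (assumed rule).

    Pairs that share an expert are merged; experts similar to nobody
    end up in a class of their own.
    """
    n = len(scores)
    parent = list(range(n))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if cosine(scores[i], scores[j]) >= threshold:
                parent[find(i)] = find(j)

    groups: Dict[int, List[int]] = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def metric_weights(scores: List[List[float]], threshold: float = 0.99) -> List[float]:
    """Combine expert scores into metric weights under the stated assumptions."""
    n, m = len(scores), len(scores[0])
    # Entropy of each expert from column-wise proportions (one reading of the text).
    col_sums = [sum(row[j] for row in scores) for j in range(m)]
    entropy = []
    for row in scores:
        p = [row[j] / col_sums[j] for j in range(m)]
        entropy.append(-sum(x * math.log(x) for x in p if x > 0))

    expert_w = [0.0] * n
    for group in cluster_experts(scores, threshold):
        class_w = len(group) / n  # assumed: proportional to class size
        inv = [1.0 / max(entropy[i], 1e-12) for i in group]  # assumed: inverse entropy
        for i, v in zip(group, inv):
            expert_w[i] = class_w * v / sum(inv)

    # Assumed standardization: each expert's scores are scaled to sum to 1.
    normalized = [[x / sum(row) for x in row] for row in scores]
    return [sum(expert_w[i] * normalized[i][j] for i in range(n)) for j in range(m)]


# Hypothetical scores of three experts for three metrics.
ratings = [[9.0, 7.0, 8.0], [9.0, 7.0, 8.0], [3.0, 9.0, 5.0]]
clusters = cluster_experts(ratings, threshold=0.99)
weights = metric_weights(ratings)
print(clusters, weights)
```

In this hypothetical example the first two experts give identical scores and are grouped together, so their shared opinion is counted through one class weight instead of dominating the result twice.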
3.2.2. Constructing a Tripartition

The core idea of the grey correlation analysis method is to measure the degree of association between the sequences in a system according to how similar they are. The model requires only a small number of samples and is simple to operate, which makes it easy to mine regularities in the data. By analyzing a small amount of sample data extracted from a system, the overall system can be analyzed, and its development and change trends can be given a quantitative measure. It is essentially a method of analysis and comparison that quantitatively describes the dynamic development of objects. The method calculates the degree of relevance between the comparison sequences and a reference sequence that reflects the behavior characteristics of the object, uses this degree to sort and analyze the objects, and finally ranks the objects from best to worst.

The main steps of the traditional grey relational analysis model are as follows:

(1) Determining the reference series and the comparison series.

(2) Dimensionless processing of the sequences.

(3) Finding the grey correlation coefficient between the reference series and the comparison series. Suppose the reference series is denoted as $X_0 = (x_0(1), x_0(2), \ldots, x_0(m))$ and there are $n$ comparison series, denoted as $X_i = (x_i(1), x_i(2), \ldots, x_i(m))$, $i = 1, 2, \ldots, n$. Each comparison series is associated with the reference series at various moments or under different behavior characteristics $k$. The coefficient can be calculated by the following formula:

$$\xi_i(k) = \frac{\min_i \min_k \Delta_i(k) + \rho \max_i \max_k \Delta_i(k)}{\Delta_i(k) + \rho \max_i \max_k \Delta_i(k)},$$

where $\min_i \min_k \Delta_i(k)$ is the minimum difference in the second level, and $\max_i \max_k \Delta_i(k)$ is the maximum difference in the second level. The absolute difference between each feature point on the comparison sequence $X_i$ and the corresponding feature point on the reference sequence $X_0$ is recorded as $\Delta_i(k) = |x_0(k) - x_i(k)|$. The resolution coefficient $\rho$ in the formula is generally taken as 0.5.

(4) Finding the degree of grey relation $r_i$. Each comparison sequence and the selected reference sequence are composed of values at different moments or for different characteristics. The correlation coefficient is the correlation degree value between the comparison sequence and the reference sequence at a particular time or feature, so there is one coefficient for each time or feature. Because this information is too scattered, it is not conducive to an overall comparison of objects. Therefore, it is necessary to gather the correlation coefficients into one value, which is used as a quantitative representation of the degree of correlation between a comparison series and the reference series. Generally, the average of the correlation coefficients over all times or features is taken as the degree of grey correlation, and the calculation formula is as follows:

$$r_i = \frac{1}{m} \sum_{k=1}^{m} \xi_i(k),$$

where the value range of $r_i$ is $[0, 1]$. The closer $r_i$ is to 1, the better the correlation between the two sequences and the higher their similarity. The closer it is to 0, the weaker the correlation.

(5) Relevance ranking. The degree of association between different sequences is compared by calculating the grey correlation value of the $n$ comparison sequences to the same reference sequence and sorting the values from largest to smallest, forming an association order. The order reflects the pros and cons of each comparison sequence. If $r_a > r_b$, it is said that $X_a$ is better than $X_b$ for the same reference sequence $X_0$, which is recorded as $X_a \succ X_b$; $r_i$ represents the characteristic value of the $i$th comparison sequence with respect to the reference sequence $X_0$.
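The five steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the series are already dimensionless, and the function name `grey_relational_degrees` and the sample numbers are hypothetical.

```python
from typing import List, Sequence


def grey_relational_degrees(reference: Sequence[float],
                            comparisons: Sequence[Sequence[float]],
                            rho: float = 0.5) -> List[float]:
    """Return the grey relational degree of each comparison series."""
    # Absolute differences between every comparison series and the reference.
    deltas = [[abs(r - c) for r, c in zip(reference, series)]
              for series in comparisons]
    # Two-level minimum and maximum over all series and all feature points.
    d_min = min(min(row) for row in deltas)
    d_max = max(max(row) for row in deltas)
    if d_max == 0:
        # Every series coincides with the reference.
        return [1.0 for _ in deltas]
    degrees = []
    for row in deltas:
        # Grey correlation coefficient at each feature point.
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        # The degree of grey relation is the mean of the coefficients.
        degrees.append(sum(coeffs) / len(coeffs))
    return degrees


# Hypothetical, already dimensionless data.
reference = [1.0, 1.0, 1.0]
comparisons = [[0.9, 0.8, 1.0], [0.6, 1.0, 0.7]]
degrees = grey_relational_degrees(reference, comparisons)
# Association order: indices sorted from the largest degree to the smallest.
ranking = sorted(range(len(degrees)), key=lambda i: degrees[i], reverse=True)
print([round(d, 4) for d in degrees], ranking)
```

With these numbers the first series obtains a degree of roughly 0.72 and the second roughly 0.58, so the first series is ranked ahead of the second.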

Grey correlation analysis has been used to perform comparative analyses of student sequences. The authors of [30] quantified qualitative indicators with an improved grey statistics-based approach and combined the approximating ideal solution method with the grey correlation method to find weaknesses in teaching training and to improve the assessment of training levels. The work in [31] solved the problem of weighting evaluation indicators by using the correlation degree between sequences to weigh their relative importance. The work in [32] also used this method to determine the weight of each evaluation index and combined it with cloud model theory to complete a comprehensive evaluation of teaching quality. The authors of [14] combined grey correlation analysis with the hierarchical analysis method to determine the weights of several factors; using the grey correlation degree among the factors, they established a hierarchical grey combination evaluation model and then judged the grade of the internship teaching effect. This section uses grey correlation analysis to construct the tripartition.

Assume there are $n$ test samples, and the test result is $X = (x_i(k))_{n \times m}$, which represents the scores of these $n$ test samples on $m$ metrics.

Normalization of $X$ yields $Y = (y_i(k))_{n \times m}$, where $y_i(k)$ is the ratio of the component $x_i(k)$ to the mean of the components in the sequence $x_i$. That is,

$$y_i(k) = \frac{x_i(k)}{\frac{1}{m} \sum_{j=1}^{m} x_i(j)}.$$

The optimal sequence is denoted as $x_0$. In this study, the optimal value of each metric in the test data series, i.e., the maximum value, is selected as the corresponding component of $x_0$; after standardization, the sequence is noted as $y_0$.

The absolute value of the difference between $y_i$ and $y_0$ at the $k$th component is noted as $\Delta_i(k) = |y_0(k) - y_i(k)|$. The minimum value of the difference between the comparison sequence and the reference sequence over the $m$ components is $\min_k \Delta_i(k)$, and the maximum value is $\max_k \Delta_i(k)$.

The absolute value of the difference between each of the $n$ samples and the reference sequence is calculated separately. The minimum value of all the differences is $\min_i \min_k \Delta_i(k)$, abbreviated as $\Delta_{\min}$, and the maximum value of all the differences is $\max_i \max_k \Delta_i(k)$, abbreviated as $\Delta_{\max}$. The formula to calculate the correlation coefficient between the sample sequence and the reference sequence is as follows:

$$\xi_i(k) = \frac{\Delta_{\min} + \rho \Delta_{\max}}{\Delta_i(k) + \rho \Delta_{\max}}.$$

From (11), it can be seen that the product of the discriminant coefficient $\rho$ and $\Delta_{\max}$ has a significant influence on the final result of the whole equation. The value of $\rho$ determines the overall contribution of $\Delta_{\max}$ to the correlation degree. In general, $\rho$ is taken as 0.5.

Based on equations (4) to (9), the metric weights can be calculated and denoted as $w = (w_1, w_2, \ldots, w_m)$. The weighted grey correlation between the comparison sequence $y_i$ and the reference sequence $y_0$ is denoted by $r_i$ and is calculated as follows:

$$r_i = \sum_{k=1}^{m} w_k \xi_i(k).$$

The grey correlation values lie in [0, 1]. A larger value means that a student's computational thinking skills are closer to the optimal reference sequence, i.e., better. To determine the share of students in each category, we define two variables, a and b, and sort the students in descending order of grey correlation. The top a% of students are classified into the excellent category, the bottom b% into the passing category, and the rest into the medium category.
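The procedure of this subsection (mean normalization, optimal reference sequence, weighted grey correlation, and the percentage-based tripartition) could be sketched as follows. The function names, the score matrix, and the weights are hypothetical, and the sketch assumes that the optimal sequence is built from the per-metric maxima before it is normalized by its own mean.

```python
from typing import Dict, List, Sequence


def mean_normalize(seq: Sequence[float]) -> List[float]:
    """Divide every component by the mean of its own sequence."""
    mean = sum(seq) / len(seq)
    return [value / mean for value in seq]


def weighted_grey_correlation(scores: Sequence[Sequence[float]],
                              weights: Sequence[float],
                              rho: float = 0.5) -> List[float]:
    """Weighted grey correlation of every student with the optimal sequence."""
    # Optimal sequence: the maximum observed score on each metric (assumption).
    optimal = [max(column) for column in zip(*scores)]
    y0 = mean_normalize(optimal)
    ys = [mean_normalize(row) for row in scores]
    deltas = [[abs(a - b) for a, b in zip(y0, y)] for y in ys]
    d_min = min(min(row) for row in deltas)
    d_max = max(max(row) for row in deltas)
    if d_max == 0:
        # All coefficients equal 1, so each degree is the sum of the weights.
        return [sum(weights)] * len(scores)
    return [
        sum(w * (d_min + rho * d_max) / (d + rho * d_max)
            for w, d in zip(weights, row))
        for row in deltas
    ]


def trisect(degrees: Sequence[float], top_pct: float,
            bottom_pct: float) -> Dict[str, List[int]]:
    """Split student indices into top a%, middle, and bottom b% groups."""
    order = sorted(range(len(degrees)), key=lambda i: degrees[i], reverse=True)
    n_top = round(len(order) * top_pct / 100)
    n_bottom = round(len(order) * bottom_pct / 100)
    cut = len(order) - n_bottom
    return {"A": order[:n_top], "B": order[n_top:cut], "C": order[cut:]}


# Hypothetical scores of five students on three metrics, with example weights.
scores = [[10, 10, 10], [5, 10, 8], [9, 6, 7], [4, 5, 3], [8, 8, 9]]
weights = [0.5, 0.3, 0.2]
degrees = weighted_grey_correlation(scores, weights)
groups = trisect(degrees, top_pct=20, bottom_pct=20)
print(groups)
```

Because each sequence is divided by its own mean, the coefficients compare the shape of a student's score profile with that of the optimal sequence. Here `top_pct` and `bottom_pct` play the roles of a and b.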

3.3. An Illustrative Example

In this section, an example is given to verify the validity and reasonableness of the evaluation model. The experimental data were obtained from the online testing platform of a university and consist of two parts. The first part is the importance ratings of the computational thinking indicators given by six experts on a scale from 1 to 5, with higher values indicating greater importance. The second part is the test results of students in a school. Each student's test result on each metric was scored on a scale of 1–10. In addition, the students completed a questionnaire in which they self-evaluated their performance on each metric on a scale of 1–10. The final score matrix of the students was obtained as the mean of the two scores, teacher evaluation and self-evaluation.

First, the importance score matrix of the first-level metrics was given by six experts as follows:

The expert similarity matrix is calculated according to Equation 3.

The experts were clustered, and the clustering results were [1, 5, 6] for the first class, [2, 4] for the second class, and [3] for the third class. Next, the intra-class weights of the experts were calculated as follows: first class: 0.34081814, 0.32816024, and 0.33102162; second class: 0.49323828 and 0.50676172; and third class: 1. The weights between classes are as follows: 0.64285714, 0.28571429, and 0.07142857; the weights of 6 experts are as follows: 0.21909738, 0.14092522, 0.07142857, 0.14478906, 0.21096015, and 0.21279961.
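The reported expert weights are consistent with multiplying each expert's intra-class weight by the weight of the class the expert belongs to. The short check below reproduces the six values from the numbers quoted above; the product rule is inferred from those numbers rather than quoted from the weighting equations, and the variable names are hypothetical.

```python
# Clustering result and weights as reported in the text.
clusters = [
    {"experts": [1, 5, 6], "between": 0.64285714,
     "within": [0.34081814, 0.32816024, 0.33102162]},
    {"experts": [2, 4], "between": 0.28571429,
     "within": [0.49323828, 0.50676172]},
    {"experts": [3], "between": 0.07142857, "within": [1.0]},
]

expert_weight = {}
for cluster in clusters:
    for expert, within in zip(cluster["experts"], cluster["within"]):
        # Assumed rule: final weight = between-class weight * intra-class weight.
        expert_weight[expert] = cluster["between"] * within

for expert in sorted(expert_weight):
    print(expert, round(expert_weight[expert], 8))
```

The six products agree with the reported weights to within rounding and sum to 1.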

According to equations (8) and (9), each expert weight is multiplied with the standardized metric importance vector, and the metric values with the same subscript are summed up. The final weight of the primary metric is obtained as [0.18465826, 0.23438986, 0.21356242, 0.2440862, 0.12330328].

Similarly, the secondary indicator weights can be calculated, and then, the primary and secondary indicator weights are combined to obtain the final secondary metric weights. The results are shown in Table 1. The meanings of some abbreviations in the table are as follows. FLM represents the first-level metrics, FLW represents the first-level weight, SLM represents the second-level metric, the ISLW represents the initial second-level weight, and FW represents the final weights.

A student’s scores on the $m$ metrics form a sequence that contains $m$ scores, and a sample of $n$ students forms an $n \times m$ initial score matrix. Some of the student data are listed in Table 2.

In this study, the top 30% of students were selected as the excellent category, i.e., category A, the bottom 20% as the average category, i.e., category C, and the rest as the medium category, i.e., category B. Thus, the initial category classification of the evaluated subjects was completed. The initial classification of the three categories of students, that is, the tripartition, is A = [10, 7, 24, 23, 20, 5, 6, 22, 18], B = [9, 19, 4, 25, 13, 8, 21, 12, 28, 11, 15, 3, 17, 14, 29], and C = [1, 26, 27, 16, 2, 0], respectively.

4. Association Rule Mining Based on Three-Way Classification

Evaluation systems are gradually being reformed, and rating systems are steadily being promoted. Compared with a fine-grained scoring system, a rating is more conducive to promoting the progress and development of the evaluated objects. A two-way classification incurs a larger loss from misjudging evaluation objects. A multiway classification divides the evaluation objects into excellent, good, medium, average, and poor, or into even finer categories. This increases the classification cost, and the teaching benefit it brings is open to question; the characteristics between categories are also weakened, which is not conducive to mining the features that distinguish the categories. According to the characteristics of student ability classification, this study introduces three-way decision theory into student ability classification and, considering the relevance of the evaluation objects, divides students into three categories: good, medium, and general. Classifying students correctly can effectively reduce teaching costs and, at the same time, achieve more practical teaching effects.

4.1. Three-Way Classification Based on Computational Thinking

The correlation between the metrics and the assessment results in the high-level areas, or in the middle-level areas, can be explored, which allows us to develop specific courses for students in the low-level areas and thus improve their computational thinking. This section will give the definition of three-way decision based on this example, the definition of three-way rule mining, and the specific example analysis process.

A three-way decision model with an ordered relationship is defined as follows.

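Definition 1. Assume that $U$ is the set of students to be tested and $v$ is an evaluation function on $U$. For $x \in U$, $v(x)$ is the evaluation function value of $x$. Given a pair of thresholds $(\alpha, \beta)$ with $\beta < \alpha$, we divide $U$ into three pairwise disjoint regions:

$$H = \{x \in U \mid v(x) \ge \alpha\}, \quad M = \{x \in U \mid \beta < v(x) < \alpha\}, \quad L = \{x \in U \mid v(x) \le \beta\}.$$

The three regions satisfy the following two conditions:

(1) $H \cup M \cup L = U$

(2) $H \cap M = \emptyset$, $H \cap L = \emptyset$, $M \cap L = \emptyset$

According to the evaluation function $v$, objects whose value is greater than or equal to $\alpha$ are assigned to the high-level region $H$, objects whose value is less than or equal to $\beta$ are assigned to the low-level region $L$, and the objects in between are assigned to the middle-level region $M$.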
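As an illustration of Definition 1, the following sketch assigns objects to the three regions from their evaluation values and a pair of thresholds. The function name, the student identifiers, and the threshold values are hypothetical.

```python
from typing import Dict, List


def three_way_regions(values: Dict[str, float], alpha: float,
                      beta: float) -> Dict[str, List[str]]:
    """Trisect objects by evaluation value using thresholds beta < alpha."""
    if not beta < alpha:
        raise ValueError("thresholds must satisfy beta < alpha")
    regions: Dict[str, List[str]] = {"high": [], "middle": [], "low": []}
    for obj, value in values.items():
        if value >= alpha:
            regions["high"].append(obj)
        elif value <= beta:
            regions["low"].append(obj)
        else:
            regions["middle"].append(obj)
    return regions


# Hypothetical evaluation values, e.g., weighted grey correlation degrees.
values = {"s1": 0.91, "s2": 0.62, "s3": 0.75, "s4": 0.80, "s5": 0.60}
regions = three_way_regions(values, alpha=0.80, beta=0.62)
print(regions)
```

Every object falls into exactly one region, so the two conditions of the definition hold by construction.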
From the perspective of the trisecting-acting-outcome (TAO) model, dividing the set into three regions allows us to focus on each region and analyze its characteristics. We can identify the metrics that can be improved and develop targeted strategies, and thus improve the students’ computational thinking. The direct outcome of the process is to move students from relatively low-level regions to middle-level or high-level regions; this is the movement-based three-way decision model proposed in [22]. The movement-based three-way decision introduces actionable rules into the three-way decision, which means that a user can mine actionable rules and then move objects to generate benefits. The model aims to mine action strategies in the three regions and to move objects from unfavorable regions to favorable regions.

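Definition 2. A decision table is a tuple as follows:

$$S = \left(U,\ At = A_s \cup A_i \cup A_f \cup \{d\},\ \{V_a \mid a \in At\},\ \{I_a \mid a \in At\}\right),$$

where $U$ is a nonempty finite set of objects; $At$ is a finite nonempty set of attributes composed of three subsets and a decision attribute, in which $A_s$ is the set of stable attributes, $A_i$ is the set of inert attributes that do not change easily but do change, $A_f$ is the set of flexible attributes, and $d$ is the decision attribute; $V_a$ is a nonempty set of values for every attribute $a \in At$; and $I_a: U \to V_a$ is a mapping. For every $x \in U$, attribute $a \in At$, and value $v \in V_a$, $I_a(x) = v$ means that the object $x$ has the value $v$ for attribute $a$.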

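Definition 3. Assume that $[x]$ and $[y]$ are equivalence classes in different regions. We can get two decision rules:

$$r_{[x]}: \bigwedge_{s \in A_s} s = I_s(x) \wedge \bigwedge_{i \in A_i} i = I_i(x) \wedge \bigwedge_{f \in A_f} f = I_f(x) \Rightarrow d = I_d(x),$$

$$r_{[y]}: \bigwedge_{s \in A_s} s = I_s(y) \wedge \bigwedge_{i \in A_i} i = I_i(y) \wedge \bigwedge_{f \in A_f} f = I_f(y) \Rightarrow d = I_d(y),$$

where $r_{[x]}$ is the decision rule of $[x]$, $A_s$ is the set of stable attributes, $I_s(x)$ is the value of attribute $s$, $A_f$ is the set of flexible attributes, $A_i$ is the set of inert attributes, $I_f(x)$ is the value of attribute $f$, and $I_d(x)$ is the value of the decision attribute $d$.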

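Definition 4. Assume that $[x]$ and $[y]$ are equivalence classes in different regions, where $[x]$ is the equivalence class located in a relatively low-level region, such as the low-level region or the middle-level region, and $[y]$ is the target equivalence class, located in a relatively high-level region. An ideal strategy is to convert the equivalence class $[x]$ into, or close to, the equivalence class $[y]$; that is,

$$r_{[x]} \rightsquigarrow r_{[y]}: \bigwedge_{s \in A_s} \left(I_s(x) = I_s(y)\right) \wedge \bigwedge_{i \in A_i} \left(I_i(x) = I_i(y)\right) \wedge \bigwedge_{f \in A_f} \left(I_f(x) \rightsquigarrow I_f(y)\right) \Rightarrow I_d(x) \rightsquigarrow I_d(y),$$

where $r_{[x]} \rightsquigarrow r_{[y]}$ is the actionable rule from $[x]$ to $[y]$, $I_s(x) = I_s(y)$ means that $[x]$ and $[y]$ have the same value of the stable attributes, $I_i(x) = I_i(y)$ means that $[x]$ and $[y]$ have the same value of the inert attributes, and $I_f(x) \rightsquigarrow I_f(y)$ means that the value of the flexible attributes is changed from $I_f(x)$ to $I_f(y)$.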
The reason for introducing inert attributes is to strip away those attributes that do not change easily, even with much training under the teacher’s strategic instruction, such as the student’s IQ. These attributes may only change a little, even after prolonged training. They may be genetic in origin. Stripping these attributes may help teachers discover which characteristics are susceptible to instructional strategies.
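One way to read Definition 4 operationally is sketched below: an actionable rule from a low-level class to a target class exists only if the two classes agree on all stable and inert attributes, and the rule then lists the flexible attributes whose values must change. The function name and all attribute names are hypothetical, and the sketch covers only the exact-match case, not the "close to convertible" case.

```python
from typing import Dict, Optional, Set, Tuple


def action_rule(source: Dict[str, str], target: Dict[str, str],
                stable: Set[str], inert: Set[str],
                flexible: Set[str]) -> Optional[Dict[str, Tuple[str, str]]]:
    """Return the required flexible-attribute changes, or None if infeasible."""
    # Stable and inert attributes must already agree between the two classes.
    if any(source[a] != target[a] for a in stable | inert):
        return None
    # Each differing flexible attribute yields one action (old value, new value).
    return {a: (source[a], target[a])
            for a in sorted(flexible) if source[a] != target[a]}


# Hypothetical descriptions of a low-level class and a high-level target class.
low = {"major": "CS", "iq_band": "mid", "abstraction": "B", "algorithm": "B"}
high = {"major": "CS", "iq_band": "mid", "abstraction": "A", "algorithm": "B"}
rule = action_rule(low, high, stable={"major"}, inert={"iq_band"},
                   flexible={"abstraction", "algorithm"})
print(rule)
```

In this example only the abstraction grade has to move from B to A; a target class with a different inert value would yield no rule.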
In the following work, we analyze the association rules for the objects in the three regions. In particular, we use the Apriori algorithm to analyze students’ computational thinking test data. In doing so, we can discover some strong association rule relationships among metrics and between metrics and assessment results in a large number of students’ data. To reduce the cost of instructional strategy design, we divided these rules into three regions based on their frequency of occurrence: high-frequency rules, medium-frequency rules, and low-frequency rules. In other words, each area has three regions of rules. Then, teachers can choose specific teaching strategies to teach according to specific constraints, such as cost, so that students’ computational thinking level can be improved and developed. This process is shown in Figure 3.

4.2. An Illustrative Example

By mining the association rules that hold for each category of students, including the rules between indicators and test results, we can analyze the characteristics of students at different ability levels. Knowing the characteristics of the more capable students lets us propose targeted improvement strategies, which helps in instructing students whose computational thinking is weaker.

4.2.1. Test Data and Analysis

According to the test data, let all student ratings form a collection $D$ of transactions over a set of items $I$, which consists of all indicator grades and the final evaluation results. Each transaction $T$ in $D$ is a set of items in $I$, i.e., $T \subseteq I$; $T_j$ is the set of scores of student $j$ on each indicator together with the final evaluation result.

The continuous scores on each metric are discretized into grades: a score of 1–3 is a C, 4–6 is a B, and 7–10 is an A, so that all data take one of the discrete grades A, B, and C. We refer to all test data for one student as a transaction. Let us take the students of category A as an example and perform student feature mining.
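The discretization and the construction of a transaction could look like the sketch below. Two points are assumptions rather than statements from the text: fractional scores (which can arise from averaging the two evaluations) are cut at 4 and 7, and items are coded as the metric index followed by the grade, with index 6 standing for the final evaluation result, in line with item labels such as “4A” and “6A” used later. The function names are hypothetical.

```python
from typing import FrozenSet, Sequence


def grade(score: float) -> str:
    """Map a 1-10 score to a grade: 1-3 -> C, 4-6 -> B, 7-10 -> A."""
    if not 1 <= score <= 10:
        raise ValueError("score must lie between 1 and 10")
    if score < 4:   # assumed cut point for fractional scores
        return "C"
    if score < 7:   # assumed cut point for fractional scores
        return "B"
    return "A"


def to_transaction(scores: Sequence[float], outcome: str) -> FrozenSet[str]:
    """Encode one student's metric grades and final category as items."""
    items = {f"{k}{grade(s)}" for k, s in enumerate(scores, start=1)}
    # The final evaluation result takes the next index after the metrics.
    items.add(f"{len(scores) + 1}{outcome}")
    return frozenset(items)


# Hypothetical scores of one category A student on the five metrics.
transaction = to_transaction([8, 5, 6, 9, 4], outcome="A")
print(sorted(transaction))
```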

The actual transaction data are listed in Table 3. “TID” represents the test questions, and “Test Items” represents the grade on the five metrics and the final evaluation outcome.

The dataset is scanned, and the candidate set C1 is generated as shown in Table 4. The items in C1 with support less than the minimum support are removed, which generates L1. The corresponding support levels are as follows: “5B” is 0.889, “4A” is 0.889, “6A” is 0.778, “2B” is 0.778, and “3B” is 1. The items of the frequent item sets in L1 are joined to form the candidate set C2. Item sets in C2 with support less than the minimum support are removed, thus generating L2 as shown in Table 5. In the same way, we obtain C3 and L3, and C4 and L4, as shown in Tables 6–8. In this case, K = 5 is selected, and a total of 4 item sets are generated, as shown in Table 9.
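The level-wise generation of candidate sets (C1, C2, ...) and frequent sets (L1, L2, ...) described above can be sketched with a compact Apriori implementation. The four transactions and the minimum support of 0.5 are hypothetical and are not the data of Table 3; the function name `apriori` is likewise only illustrative.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List


def apriori(transactions: List[FrozenSet[str]],
            min_support: float) -> Dict[FrozenSet[str], float]:
    """Return every frequent item set together with its support."""
    n = len(transactions)

    def support(itemset: FrozenSet[str]) -> float:
        return sum(itemset <= t for t in transactions) / n

    # C1: every single item that occurs in the data.
    current = {frozenset([i]) for t in transactions for i in t}
    frequent: Dict[FrozenSet[str], float] = {}
    k = 1
    while current:
        # Lk: candidates whose support reaches the minimum support.
        level = {c: s for c in current if (s := support(c)) >= min_support}
        frequent.update(level)
        k += 1
        # Join step: unions of two frequent sets that contain exactly k items.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        current = {c for c in candidates
                   if all(frozenset(s) in level
                          for s in combinations(c, k - 1))}
    return frequent


# Hypothetical transactions (metric index + grade, 6 = final result).
transactions = [
    frozenset({"3B", "4A", "6A"}),
    frozenset({"3B", "4A", "6A"}),
    frozenset({"3B", "4A", "6B"}),
    frozenset({"3B", "4B", "6A"}),
]
frequent = apriori(transactions, min_support=0.5)
for itemset, supp in sorted(frequent.items(),
                            key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), supp)
```

On this toy data the procedure keeps three single items, three pairs, and one triple; the items “4B” and “6B” are removed at the first level.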

4.2.2. Rule Mining

Through the analysis, we mined 63 strong association rules for the first category of students, that is, category A students. For example,

(1) “4A” ⇒ “6A,” confidence: 0.75

(2) “3B,” “4A” ⇒ “6A,” confidence: 0.75

(3) “2B,” “3B,” “4A” ⇒ “5B,” confidence: 1.0

The association rule shows that when the algorithmic ability is A, the level of computational thinking is A, and the pattern level has little effect on the computational thinking outcome. When the abstract level and pattern level are both B, the computational thinking outcome is only B even if the algorithmic test result is A. Therefore, when teachers instruct students, they should not only focus on students’ algorithmic ability, but also focus on abstract understanding and pattern skill.
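The confidence of a rule X ⇒ Y is the support of X ∪ Y divided by the support of X. The sketch below derives strong rules from a table of supports. The single-item supports 8/9 and 7/9 match the reported 0.889 and 0.778 for nine transactions; the joint support 6/9 is an assumed value, chosen because it is the one that reproduces the reported confidence of 0.75 for “4A” ⇒ “6A.” The function name and the minimum confidence of 0.7 are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple

Rule = Tuple[FrozenSet[str], FrozenSet[str], float]


def strong_rules(supports: Dict[FrozenSet[str], float],
                 min_confidence: float) -> List[Rule]:
    """Generate rules X => Y with confidence supp(X u Y) / supp(X)."""
    rules: List[Rule] = []
    for itemset, supp in supports.items():
        # Every non-empty proper subset can serve as an antecedent.
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                x = frozenset(antecedent)
                confidence = supp / supports[x]
                if confidence >= min_confidence:
                    rules.append((x, itemset - x, confidence))
    return rules


supports = {
    frozenset({"4A"}): 8 / 9,
    frozenset({"6A"}): 7 / 9,
    frozenset({"4A", "6A"}): 6 / 9,  # assumed joint support
}
rules = strong_rules(supports, min_confidence=0.7)
for x, y, conf in rules:
    print(sorted(x), "=>", sorted(y), round(conf, 3))
```

The supports table must contain every subset of each listed item set, which holds for the output of Apriori because all subsets of a frequent item set are themselves frequent.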

Similarly, we can analyze 16 strong association rules for intermediate students and a total of 54 strong association rules for average students. For example, the strong association rules for middle-level students include the following:

(1) “2B” ⇒ “6B,” confidence: 0.769

(2) “5B” ⇒ “6B,” confidence: 0.769

(3) “2B,” “3B” ⇒ “5B,” confidence: 0.9

The correlation rule shows that when the abstraction level is B, the corresponding computational thinking level is also B; when the evaluation level is B, the computational thinking level is also B, which indicates that both abstraction ability and evaluation ability influence computational thinking. In addition, the abstract level and pattern level have a grade of B, and the evaluation result is also B. Therefore, if we want to improve the evaluation ability, we should also improve the abstract and pattern ability accordingly.

For the average student, we found that:

(1) “2B,” “4B,” “3B” ⇒ “6C,” confidence: 0.799

(2) “3B” ⇒ “4B,” confidence: 1.0

From the above rules, we find that when the levels of abstraction, pattern, and algorithm are all rated as B, the corresponding level of computational thinking is C. When the pattern is B, the level of the algorithm is also B. Therefore, there is a correlation between the level of pattern and the level of algorithm. Moreover, the level of the algorithm cannot be improved without the level of pattern, and the level of computational thinking cannot be improved without the level of abstraction, pattern, and algorithm.

By analyzing the association rules of students in different level areas, teachers can change the level of some metrics. That is, specific teaching methods and strategies are adopted to change certain thinking skills, thus allowing students to transform from a lower level of computational thinking to a better level of computational thinking.

5. Conclusion

In this study, we explore research related to computational thinking, including strategies for developing it and means to measure it. We propose an evaluation model that combines grey correlation analysis, association rule mining, and three-way decision theory. The first step is to develop computational thinking evaluation indicators and then use a weighted grey correlation analysis-based approach to evaluate students’ computational thinking skills. The weighted grey correlation between the student samples and the optimal reference sequence was considered, classifying the tested students into three levels. Based on the initial classification results, the neighborhood of students was calculated based on the grey correlation between the evaluation objects, and each category of students was divided into positive, negative, and boundary domains, respectively.

We envision the future work to include, first, enriching and improving the evaluation indexes. Second, for the feature mining part after student classification, this study only applies the Apriori association rule mining algorithm. How to improve the rule mining also needs further research. Finally, the efficiency of this evaluation model and the system’s performance also need to be improved. To analyze a large amount of student data, the classification method and the efficiency of feature mining need further research and exploration.

Data Availability

All data used during the study are available in a repository or online in accordance with funder data retention policies (https://archive.ics.uci.edu/ml/datasets.php, http://cs.uef.fi/sipu/datasets/).

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this study.


This work was supported in part by the Natural Science Foundation of Heilongjiang Province (LH2020F031) and Key Projects of Higher Education Reform in Heilongjiang Province (SJGZ20200084).