Abstract
Education is mandatory, and much research has been invested in this sector. An important aspect of education is how to evaluate the learners’ progress. Multiple-choice tests are widely used for this purpose. The tests for learners in the same exam should be of equal difficulty for fair judgment. This requirement leads to the problem of generating tests with equal difficulties, which is a specific case of generating tests with a single objective. However, in practice, multiple requirements (objectives) are enforced while making tests. For example, teachers may require the generated tests to have the same difficulty and the same test duration. In this paper, we propose the use of Multiswarm Multiobjective Particle Swarm Optimization (MMPSO) for generating k tests with multiple objectives in a single run. Additionally, we incorporate Simulated Annealing (SA) to improve the diversity of tests and the accuracy of solutions. The experimental results with various criteria show that our approaches are effective and efficient for the problem of generating multiple tests.
1. Introduction
In the education sector, evaluation of students’ study progress is important and mandatory. There are many methods, such as oral tests or written tests, to evaluate their knowledge and understanding of subjects. Because they scale well and demand fewer human resources, written tests are used more widely for the final checkpoints of assessment (e.g., final term tests), where a large number of students must be considered. Written tests can be either descriptive tests, in which students have to fully write their answers, or multiple-choice tests, in which students pick one or more choices for each question. Even though descriptive tests are easier to create at first, they consume a great deal of time and effort during the grading stage. Multiple-choice tests, on the other hand, are harder to create at first as they require a large number of questions for security reasons, as in Ting et al. [1]. However, the grading process can be extremely fast, automated by computers, and free from the bias of human graders. Recently, many researchers have invested their efforts to make computers automate the process of creating multiple-choice tests using available question banks, as in the work of Cheng et al. [2]. The results were shown to be promising and, thus, make multiple-choice tests more feasible for examinations.
One of the challenges in generating multiple-choice tests is the difficulty of the candidate tests. The tests for all students should have the same difficulty for fairness. However, generating tests that all have the same level of difficulty is an extremely hard task, even when questions are chosen manually from a question bank: the success rate of generating multiple-choice tests that satisfy a given difficulty is low, and the process is time-consuming. Therefore, to speed up the process, some authors chose to generate tests automatically with computers and to approximate the required difficulties. This is also known as generating tests with a single objective, where the level of difficulty is the objective. For example, Bui et al. [3] proposed the use of particle swarm optimization to generate tests whose difficulties approximate the levels required by users. The tests are generated from question banks that consist of various questions with different difficulties. The difficulty value of each question is judged and adapted based on users via previous real-life exams. The work evaluates three randomness-oriented approaches, which are a random method, Genetic Algorithms (GAs) by Yildirim [4, 5], and Particle Swarm Optimization (PSO). The experimental results show that PSO gives the best performance on most of the criteria in Bui et al. [3]. Previous works only focused on a single objective, extracting a test based on the difficulty level required by the user. In practice, exam tests can depend on multiple factors such as questions’ duration and total testing time. Thus, designing a method that can generate tests with multiple objectives is challenging. Furthermore, the existing approaches can only extract a single test at each run. To extract multiple tests, the authors have to execute their application multiple times. This method is time-consuming, and duplicate tests can occur because each run is executed separately and the runs do not share any information with each other to avoid duplication.
In this paper, we propose a new approach that uses Multiswarm Multiobjective Particle Swarm Optimization (MMPSO) to extract k tests in a single run with multiple objectives. The multiple swarms correspond to the multiple tests to be extracted: the search is carried out by multiple subswarms in one run, instead of one standard swarm that is executed multiple times to extract multiple tests. The use of diverse subswarms to increase optimization performance is studied in Antonio and Chen [6]. Additionally, we use Simulated Annealing (SA) to initialize the first population for PSO to increase the diversity of generated tests. We also aim to improve the results on various criteria such as diversity of solutions and accuracy.
The main contributions of this paper are as follows:
(1) We propose a multiswarm multiobjective approach to deal with the problem of extracting k tests simultaneously.
(2) We propose the use of SA in combination with PSO for extracting tests. SA was selected as it is capable of escaping local optimum solutions.
(3) We propose a parallel version of our serial algorithms. Using parallelism, we can control the overlap of extracted tests to save time.
The rest of this paper is organized as follows. Section 2 describes the related research. The problem of extracting k tests from question banks is explained in Section 3. The normal multiswarm multiobjective PSO and the multiswarm multiobjective PSO with SA for the problem of extracting k tests from question banks are presented in Section 4. Section 5 analyzes and discusses the experimental results of this study. Finally, the conclusions of the paper and the future research trends are provided in Section 6.
2. Related Work
Recently, evolutionary algorithms have been applied to optimization problems in many fields. Some of the most well-known algorithms are Genetic Algorithms (GAs) and Particle Swarm Optimization (PSO). GAs were invented based on the concept of Darwin’s theory of evolution, and they seek solutions via progressions of generations. Heuristic information is used for navigating the search space for potential individuals, and this can reach globally optimal solutions. Since then, there have been many works that used GAs in practice [7–11].
Particle swarm optimization is a swarm-based optimization technique developed by Eberhart and Kennedy [12]. It imitates the behavior of a school of fish or a flock of birds. PSO optimizes the solutions via the movements of individuals. The foundation of PSO’s method of finding optima is based on the following principles proposed by Eberhart and Kennedy: (1) all individuals (particles) in swarms tend to find and move towards possible attractors; (2) each individual remembers the position of the best attractor that it found. In particular, each solution is a particle in a swarm and is denoted by two variables. One is the current location, denoted by present[], and the other is the particle’s velocity, denoted by v[]. They are two vectors on the vector space R^{n}, in which n changes based on the problems. Additionally, each particle has a fitness value that is given by a chosen fitness function. At the beginning of the algorithm, the initial generation (population) is created either randomly or by some other method. The movement of each particle is affected by two information sources. The first is P_{best}, which is the best-known position that the particle has visited in its past movements. The second is G_{best}, which is the best-known position of the whole swarm. In the original work proposed by Eberhart and Kennedy, particles traverse the search space by going after the particles with strong fitness values. Particularly, after each discrete time step, the velocity and position of each individual are updated with the following formulas:
$$v[] = v[] + c_{1} \cdot rand() \cdot (P_{best}[] - present[]) + c_{2} \cdot rand() \cdot (G_{best}[] - present[]),$$
$$present[] = present[] + v[],$$
where rand() is a function that returns a random number in the range (0,1) and c_{1}, c_{2} are constant weights.
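The update rule described above can be illustrated with a minimal sketch of the canonical continuous PSO step. The function name `pso_step`, the default weights `c1 = c2 = 2.0`, and the list-based representation are assumptions made for illustration only; they are not taken from the cited work.

```python
import random


def pso_step(present, velocity, pbest, gbest, c1=2.0, c2=2.0, rng=random):
    """Apply one canonical PSO update to a single particle.

    All arguments are equal-length lists of floats. c1 and c2 are the
    constant weights; their default values are illustrative assumptions.
    """
    new_velocity = []
    new_present = []
    for x, v, p, g in zip(present, velocity, pbest, gbest):
        # Pull towards the particle's own best (p) and the swarm's best (g),
        # each scaled by an independent random factor.
        v_next = v + c1 * rng.random() * (p - x) + c2 * rng.random() * (g - x)
        new_velocity.append(v_next)
        new_present.append(x + v_next)
    return new_present, new_velocity
```

A particle that already sits on both P_{best} and G_{best} with zero velocity does not move, which is a quick sanity check of the rule.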
While PSO is mostly used for the continuous value domain, some recent works have shown that PSO can also be useful for discrete optimization. For example, Sen and Krishnamoorthy [13, 14] transformed the original PSO into discrete PSO for solving the problem of transmitting information on networks. Their results show that the proposed discrete PSO outperforms Simulated Annealing (SA).
To further improve the performance for real-life applications, some variants of PSO have been proposed and exploited, such as multiswarm PSO. Peng et al. [15] proposed an approach for multiswarm PSO that pairs the velocity update of some swarms with different methods such as the periodically stochastic learning strategy or random mutation learning strategy. The experiments have been run on a set of specific benchmarks. The results showed that the proposed method gives a better quality of solutions and has a higher chance of giving correct solutions than normal PSO. Vafashoar and Meibodi [16] proposed an approach that uses Cellular Learning Automata (CLA) for multiswarm PSO. Each swarm is placed on a cell of the CLA, and each particle’s velocity is affected by some other particles. The connected particles are adjusted over time via periods of learning. The results indicate that the proposed method is quite effective for optimization problems on various datasets. In order to balance the search capabilities between swarms, Xia et al. [17] used multiswarm PSO in combination with various strategies such as the dynamic subswarm number, subswarm regrouping, and purposeful detecting. Nakisa et al. [18] proposed a strategy to improve the speed of convergence of multiswarm PSO for robots’ movements in a complex environment with obstacles. Additionally, the authors combine the local search strategy with multiswarm PSO to prevent the robots from converging at the same locations when they try to get to their targets.
In practice, there exist a lot of optimization problems with multiple objectives instead of a single objective. Thus, a lot of work for multiobjective optimization has been proposed. For example, Li and Babak [19] proposed multiobjective PSO combined with an enhanced local search ability and parameterless sharing. Kapse and Krishnapillai [20] also proposed an adaptive local search method for multiobjective PSO using the time variance search space index to improve the diversity of solutions and convergence. Based on crowded distance sorting, Cheng et al. [21] proposed an improved version, circular crowded sorting, and combined it with multiobjective PSO. The approach scatters the individuals of initial populations across the search space in order to be better at gathering the Pareto frontier. The method was shown to improve the search capabilities, the speed of convergence, and the diversity of solutions. Similarly, Adel et al. [22] used multiobjective PSO with uniform design instead of traditional random methods to generate the initial population. Based on R2 measurement, Alan et al. [23] proposed an approach that used R2 as an indicator to navigate swarms through the search space in multiobjective PSO. By combining utopia point-guided search with multiobjective PSO, Kapse and Krishnapillai [24] proposed a strategy that selects the best individuals that are located near the utopia points. The authors also compared their method with other algorithms such as NSGA-II (Nondominated Sorting Genetic Algorithm II) by Deb et al. [25] or CMPSO (Coevolutionary multiswarm PSO) by Zhan et al. [26] on several benchmarks and demonstrated the proposed method’s effectiveness. Saxena and Mishra [27] designed a multiobjective PSO algorithm named MOPSO tridist. The algorithm used triangular distance to select leader individuals which cover different regions in the Pareto frontier. The authors also included an update strategy for P_{best} with respect to their connected leaders. MOPSO tridist was shown to outperform other multiobjective PSOs, and the authors illustrated the algorithm’s application with the digital watermarking problem for RGB images. Based on chaotic particle swarm optimization, Liansong and Dazhi [28] designed a multiobjective variant of chaotic particle swarm optimization. Based on comprehensive learning particle swarm optimization, Xiang and Xueqing [29] proposed an extension, the MSCLPSO algorithm, and incorporated various techniques from other evolutionary algorithms. In order to increase the flexibility of multiobjective PSO, Mokarram and Banan [30] proposed the FCMOPSO algorithm that can work on a mix of constrained, unconstrained, continuous, and/or discrete optimization problems. Recently, Mohamad et al. [31] reviewed and summarized the disadvantages of multiobjective PSO. Based on that, they proposed an algorithm, MMOPSO. The authors also proposed a strategy based on dynamic search boundaries to help escape the local optima. MMOPSO was shown to be more efficient than several state-of-the-art algorithms such as Multiobjective Grey Wolf Optimizer (MOGWO), Multiobjective Evolutionary Algorithm based on Decompositions (MOEA/D), and Multiobjective Differential Evolution (MODE).
An extension of multiple objective optimization problems is the dynamic multiple objective optimization problem, in which each objective changes differently depending on the time or environment. To deal with this problem, Liu et al. [32] proposed CMPSODMO, which is based on the multiswarm coevolution strategy. The authors also combined it with special boundary constraint processing and a velocity update strategy to help with the diversity and convergence speed.
To make it easier for readers, Table 1 summarizes different application domains in which PSO algorithms have been applied for different purposes.
The above-mentioned works can be effective and efficient for the optimization problems in Table 1; however, applying them to the problem of generating k tests in a single run with multiple objectives is not feasible according to the work of Nguyen et al. [33]. Therefore, in this work, we propose an approach that uses Multiswarm Multiobjective Particle Swarm Optimization (MMPSO) combined with Simulated Annealing (SA) for generating k tests with multiple objectives. Each swarm, in this case, is a test candidate, and it runs on a separate thread. The migration happens randomly by chance. We also aim to improve the accuracy and diversity of solutions.
3. Problem Statement
3.1. Problem of Generating Multiple Tests
In our previous works [3, 33], we proposed a PSO-based method for multiple-choice test generation; however, it was a single-objective approach. In this paper, we introduce a multiobjective approach to multiple-choice test generation by combining PSO and SA algorithms.
Let Q = {q_{1}, q_{2}, q_{3}, …, q_{n}} be a question bank with n questions. Each question q_{i} ∈ Q contains four attributes {QC, SC, QD, OD}. QC is a question identifier code and is used to avoid duplication of any question in the solution. SC denotes a section code of the question and is used to indicate which section the question belonged to. QD denotes a time limit of the question, and OD denotes a real value in the range [0.1, 0.9] that represents an objective difficulty (level) of the question. QC, SC, and QD are discrete positive integer values as in the work of Bui et al. [3] and Nguyen et al. [33].
The problem of generating multiple k tests (or just multiple tests) is to generate k tests simultaneously in a single run, i.e., our objective is to generate a set of tests, in which each test E_{i} = {q_{i1}, q_{i2}, q_{i3}, …, q_{im}} (q_{ij} ∈ Q, 1 ≤ j ≤ m, 1 ≤ i ≤ k, k ≤ n) consists of m (m ≤ n) questions. Additionally, those tests must satisfy both the requirements of objective difficulty ODR and testing time duration TR that were given by users. For example, ODR = 0.8 and TR = 45 minutes mean that all the generated tests must have a level of difficulty approximately equal to 0.8 and a test time approximately equal to 45 minutes.
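The objective difficulty of a test E_{i} is defined as the average difficulty of its questions,
$$OD_{E_{i}} = \frac{1}{m} \sum_{j=1}^{m} q_{ij}.OD,$$
and the duration of the test is determined by the total time limit of its questions,
$$TD_{E_{i}} = \sum_{j=1}^{m} q_{ij}.QD.$$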
Besides the aforementioned requirements, each generated test must satisfy the following additional constraints:
C1: each question in a generated test must be unique (i.e., a question cannot appear more than once in a test).
C2: in order to make the test more diverse, there exists no case in which all questions in a test have the same difficulty value as the required objective difficulty ODR. For example, if ODR = 0.6, then ∃ q_{ki} ∈ E_{k}: q_{ki}.OD ≠ 0.6.
C3: some questions in a question bank must stay in the same groups because their contents are related to each other. The generated tests must ensure that all the questions in one group appear together. This means that if a question of a specific group appears in a test, the remaining questions of the group must also be present in the same test [3, 33].
C4: as users may require generated tests to have several sections, a generated test must ensure that the required numbers of questions are drawn from the question bank for each section.
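As an illustration of how constraints C1 to C4 could be checked for one candidate test, consider the following minimal sketch. The `Question` record mirrors the four attributes {QC, SC, QD, OD}; the `group` field, the `per_section` mapping, and the function name are hypothetical additions introduced only for this example, since the representation of question groups is not specified here.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Question:
    qc: int                      # question identifier code (QC)
    sc: int                      # section code (SC)
    qd: int                      # time limit of the question (QD)
    od: float                    # objective difficulty (OD), in [0.1, 0.9]
    group: Optional[int] = None  # hypothetical group label used for C3


def satisfies_constraints(test, bank, odr, per_section):
    """Return True if `test` (a list of Question) meets C1 to C4.

    `per_section` maps a section code to the required number of questions.
    """
    ids = [q.qc for q in test]
    # C1: no question may appear more than once.
    if len(ids) != len(set(ids)):
        return False
    # C2: not every question may have exactly the required difficulty ODR.
    if all(abs(q.od - odr) < 1e-9 for q in test):
        return False
    # C3: if one question of a group is selected, the whole group must be.
    chosen = set(ids)
    groups_in_test = {q.group for q in test if q.group is not None}
    for q in bank:
        if q.group in groups_in_test and q.qc not in chosen:
            return False
    # C4: each section contributes exactly the required number of questions.
    counts = Counter(q.sc for q in test)
    if sum(per_section.values()) != len(test):
        return False
    return all(counts.get(sc, 0) == n for sc, n in per_section.items())
```

Checking C3 against the whole bank keeps the sketch short; a real implementation would more likely index questions by group once.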
3.2. Modeling MMPSO for the Problem of Generating Multiple Tests
In the MMPSO model for the problem of generating multiple tests, the swarms represent the multiple tests to be generated, and m is the number of questions in each test.
Assume that F is the objective function for the multiobjective problem; it can be formulated as follows:
$$F = \sum_{i} \alpha_{i} F_{i},$$
where α_{i} is a weight constraint with ∑_{i} α_{i} = 1 and F_{i} is a single-objective function. In this paper, we evaluate two such functions, which are the average level of difficulty of the tests (F_{1}) and the total test duration (F_{2}).
$$F_{1} = \left| \frac{1}{m} \sum_{i=1}^{m} q_{i}.OD - ODR \right|,$$
where F_{1} satisfies the conditions {C1, C2, C3, C4}, m is the total number of questions in the test, q_{i}.OD is the difficulty value of each question, and ODR is the required difficulty level.
F_{2} measures the deviation of the total test duration, i.e., the sum of the durations q_{i}.QD of the m questions in the test, from the required total time TR of tests. F_{2} also satisfies the conditions {C1, C2, C3, C4}.
The objective function is used as the fitness function in the MMPSO, and the results of the objective function are considered the fitness of the resulting test.
In this case, the better the fitness, the smaller the F becomes. To improve the quality of the test, we also take into account the constraints C1, C2, C3, and C4.
For example, provided that we have a question bank as in Table 2, the test extraction requirements are four questions, a difficulty level of 0.6 (ODR = 0.6), a total duration of the test of 300 seconds (TR = 300), and a weight constraint (α = 0.4). Table 3 illustrates a candidate solution with its fitness = 0.1 computed by using formula (3).
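A weighted-sum fitness of this kind can be sketched as follows. Two points are assumptions of the sketch rather than statements about the method: the duration error is normalized by TR so that both terms are dimensionless, and the two weights are passed explicitly instead of being derived from a single α. The question values used below are hypothetical and are not the entries of Table 2.

```python
def difficulty_error(ods, odr):
    """Absolute gap between the mean question difficulty and ODR."""
    return abs(sum(ods) / len(ods) - odr)


def duration_error(qds, tr):
    """Absolute gap between the total duration and TR.

    Dividing by TR is an assumption of this sketch, made so that the
    value is comparable in scale with the difficulty error.
    """
    return abs(sum(qds) - tr) / tr


def fitness(ods, qds, odr, tr, w_difficulty, w_duration):
    """Weighted sum of the two single-objective errors (smaller is better)."""
    if abs(w_difficulty + w_duration - 1.0) > 1e-9:
        raise ValueError("weights are expected to sum to 1")
    return (w_difficulty * difficulty_error(ods, odr)
            + w_duration * duration_error(qds, tr))


# Hypothetical four-question test: mean difficulty 0.65, total time 330 s.
example = fitness([0.5, 0.6, 0.7, 0.8], [60, 90, 60, 120],
                  odr=0.6, tr=300, w_difficulty=0.4, w_duration=0.6)
```

With these hypothetical values, the difficulty error is 0.05 and the normalized duration error is 0.1, so the weighted sum is 0.4 × 0.05 + 0.6 × 0.1 = 0.08.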
4. MMPSO in Extracting Multiple Tests
4.1. Process of MMPSO for Extracting Tests
This paper proposes a parallel multiswarm multiobjective PSO (MMPSO) for extracting multiple tests based on the idea in Bui et al. [3]. It can be described as follows. Creating an initial swarm population is the first step in PSO, in which each particle in a swarm is considered a candidate test; this first population also affects the speed of convergence to optimal solutions. This step randomly picks questions from a question bank. The questions, either standalone or in groups (constraint C3), are drawn for one section (constraint C4) until the required number of questions of the section is reached, and the drawing process is repeated for the next sections. When the required number of questions of the candidate test is reached and all the constraints are met, the fitness value of the generated test is computed according to formula (3).
The position information of G_{best} and P_{best} is the set of questions they contain. All P_{best} slowly move towards G_{best} by using the location information of G_{best}. The movement is the replacement of some questions in the candidate test according to the velocity of P_{best}. If the fitness value of a newly found position of a particle is smaller than that of the particle’s currently best-known P_{best} (i.e., the new position is better than the old one), then we assign the newly found position value to P_{best}.
G_{best} moves towards the final optimal solution in random directions. The movement is achieved by replacing its content with some random questions from the question bank. In a similar way to P_{best}, if the new position is no better than the old one, the G_{best} value will not be updated.
The algorithm ends when the fitness value is lower than the fitness threshold ε or the number of movements (iteration loops) surpasses the loop threshold λ. Both of the thresholds are given by users.
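The movement and stopping rules described above can be sketched for a single swarm as follows. This is a simplified illustration, not the proposed algorithm: it uses only the difficulty objective, ignores constraints C2 to C4, keeps a fixed velocity, and represents the bank as a hypothetical mapping from question id to difficulty. The names `move_towards`, `random_move`, and `run_swarm` are introduced here for illustration.

```python
import random


def move_towards(test, target, velocity, rng):
    """Replace up to `velocity` questions of `test` with questions of `target`."""
    test = list(test)
    incoming = [q for q in target if q not in test]
    outgoing = [i for i, q in enumerate(test) if q not in target]
    rng.shuffle(incoming)
    rng.shuffle(outgoing)
    for i, q in list(zip(outgoing, incoming))[:velocity]:
        test[i] = q
    return test


def random_move(test, bank_ids, velocity, rng):
    """Replace up to `velocity` questions of `test` with unused bank questions."""
    test = list(test)
    unused = [q for q in bank_ids if q not in test]
    n = min(velocity, len(test), len(unused))
    positions = rng.sample(range(len(test)), n)
    for i, q in zip(positions, rng.sample(unused, n)):
        test[i] = q
    return test


def run_swarm(bank, m, odr, n_particles=10, velocity=1,
              eps=1e-3, max_loops=500, seed=0):
    """Search for one test of m questions whose mean difficulty is close to odr."""
    rng = random.Random(seed)
    ids = list(bank)

    def fit(test):
        return abs(sum(bank[q] for q in test) / len(test) - odr)

    pbest = [rng.sample(ids, m) for _ in range(n_particles)]
    gbest = min(pbest, key=fit)
    loops = 0
    # Stop when the fitness threshold is met or the loop threshold is reached.
    while fit(gbest) > eps and loops < max_loops:
        for i, p in enumerate(pbest):
            cand = move_towards(p, gbest, velocity, rng)
            if fit(cand) < fit(p):          # keep only improving positions
                pbest[i] = cand
        cand = random_move(gbest, ids, velocity, rng)
        best = min(pbest + [cand], key=fit)
        if fit(best) < fit(gbest):          # G_best is never replaced by a worse test
            gbest = best
        loops += 1
    return gbest, fit(gbest), loops
```

Because a position is accepted only when its fitness is strictly smaller, the fitness of G_{best} never increases from one loop to the next.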
4.2. Migration Parallel MMPSO for the Extracting Test (Parallel MMPSO)
Based on the idea in Nguyen et al. [33], we present the migration parallel approach of MMPSO for increasing performance. Each swarm now corresponds to a thread, and the migration happens by chance between swarms. The migration method starts with locking the current thread (swarm) to avoid interference from other threads.
In the dual-sector model [34], Lewis describes a relationship between two regions, the subsistence sector and the capitalist sector. We can view the two types of economic sectors here as the strong (capitalist) sectors and the weak (subsistence) sectors (while ignoring other related aspects of the economy). Whether a sector is strong or weak depends on the fitness value of the G_{best} position of its swarm. However, when applying those theories, some adjustments are made so that the parallel MMPSO can yield better solutions.
The direction of migration changes when individuals with strong P_{best} (strong individuals) in strong sectors move to weak sectors. The weak sectors’ G_{best} may be replaced by the incoming P_{best}, and the fitness value of the weak swarms should improve substantially, as in the work of Nguyen et al. [33].
Backward migration from the weak swarms to the strong swarms also happens alongside forward migration. For every individual that moves from a strong swarm to a weak swarm, there is always one that moves from the weak swarm back to the strong swarm. This is to ensure that the number of particles and the searching capabilities of the swarms do not significantly decrease.
The foremost condition for migration to happen is that there are changes in the fitness values of the current G_{best} compared to the previous G_{best}.
The probability for migration is denoted as γ, and the unit is a percentage (%).
The number of migrating particles is equal to δ × the size of the swarm (i.e., the number of existing particles in the swarm), where δ denotes the percentage of migration.
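The migration rules above can be illustrated with a minimal sketch of one exchange between a strong and a weak swarm. Several details are assumptions made for illustration: γ is passed as a fraction rather than a percentage, the particles returned by the weak swarm are its worst ones (which particles move back is not specified here), and thread locking and the check that G_{best} has changed are left to the caller.

```python
import random


def migrate(strong, weak, fitness, gamma, delta, rng):
    """Exchange particles in place between two swarms; return how many moved.

    strong, weak : lists of particles (any objects accepted by `fitness`)
    gamma        : migration probability as a fraction in [0, 1]
    delta        : fraction of the strong swarm that migrates
    """
    if rng.random() >= gamma:
        return 0                            # no migration this time
    n = min(int(delta * len(strong)), len(weak))
    strong.sort(key=fitness)                # best (smallest fitness) first
    weak.sort(key=fitness, reverse=True)    # worst first (assumed choice)
    for i in range(n):
        # One particle moves forward and one moves back, so swarm sizes
        # stay unchanged.
        strong[i], weak[i] = weak[i], strong[i]
    return n
```

Because the exchange is one-for-one, both swarms keep their sizes, which matches the requirement that the searching capabilities of the swarms do not significantly decrease.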
The migration parallel MMPSObased approach to extract multiple tests is described in a form of a pseudocode in Algorithm 1.

The particle updates its velocity (V) and position (P) using the velocity of P_{best} and the velocity of G_{best}, together with random values and m, the number of questions in the test solutions.
The process of generating multiple tests at the same time in a single run using migration parallel MMPSO includes two stages. The first stage is generating tests using multiobjective PSO. In this stage, the algorithm proceeds to find tests that satisfy all requirements and constraints using multiple threads. Each thread corresponds to each swarm that runs separately. The second stage is improving and diversifying tests. This stage happens when there is a change in the value of G_{best} of each swarm (for each thread) in the first stage. In this second stage, migration happens between swarms to exchange information between running threads to improve the convergence and diversity of solutions based on the work of Nguyen et al. [33]. The complete flowchart that applies the parallelized migration method to the MMPSO algorithm is shown in Figure 1.
4.3. Migration Parallel MMPSO in Combination with Simulated Annealing
As mentioned above, the initial population affects the convergence speed and diversity of test solutions. The creation of a set of initial solutions (population) is generally performed randomly in PSO. This is one of its drawbacks: since the search space is very wide, the probability of getting stuck in a local optimum is high. In order to improve the initial population, we apply SA in the initial population creation step of migration parallel MMPSO instead of the random method. SA was selected since it is capable of escaping local optima, as shown by Kharrat and Neji [35]. In this study, the process of finding new solutions using SA is improved by moving towards G_{best} using the received information about the location of G_{best} (which is commonly used in PSO). The MMPSO with SA is described by pseudocode in Algorithm 2.
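To illustrate the role of SA in creating the initial population, the following minimal sketch anneals one candidate test per particle. It is a plain SA with a random one-question neighbor and the standard Metropolis acceptance rule; it does not include the G_{best}-guided move described above. The starting temperature, the geometric cooling factor, the difficulty-only fitness, and the bank representation (a mapping from question id to difficulty, with more than m questions) are all assumptions of the sketch.

```python
import math
import random


def anneal_test(bank, m, odr, t0=1.0, cooling=0.95, t_min=1e-3, seed=0):
    """Build one candidate test of m questions with simulated annealing."""
    rng = random.Random(seed)
    ids = list(bank)

    def fit(test):
        return abs(sum(bank[q] for q in test) / len(test) - odr)

    current = rng.sample(ids, m)
    best = list(current)
    t = t0
    while t > t_min:
        # Neighbor: swap one question for a question not yet in the test.
        cand = list(current)
        unused = [q for q in ids if q not in cand]
        cand[rng.randrange(m)] = rng.choice(unused)
        delta = fit(cand) - fit(current)
        # Metropolis rule: always accept improvements; accept a worse
        # neighbor with probability exp(-delta / t) to escape local optima.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current = cand
            if fit(current) < fit(best):
                best = list(current)
        t *= cooling                        # geometric cooling (assumed)
    return best


def initial_population(bank, m, odr, n_particles):
    """One annealed test per particle, each from a different random seed."""
    return [anneal_test(bank, m, odr, seed=s) for s in range(n_particles)]
```

Using a different seed per particle is one simple way to keep the annealed starting tests from collapsing onto the same solution.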

5. Experimental Studies
5.1. Experimental Environment and Data
Bui et al. [3] evaluated different algorithms such as the random method, genetic algorithms, and a PSO-based algorithm for extracting tests from question banks of varying sizes. The results of the experiment showed that the PSO-based algorithms are better than the others. Hence, the experiment in this paper only evaluated and compared the improved SA parallel MMPSO algorithm with the normal parallel MMPSO algorithm in terms of the diversity of tests and the accuracy of solutions.
All proposed algorithms are implemented in C# and run on 2 computers, which are a 2.5 GHz Desktop PC (4 CPUs, 4 GB RAM, Windows 10) and a 2.9 GHz VPS (16 CPUs, 16 GB RAM, Windows Server 2012). The experimental data include 2 question banks. One has 998 different questions (the small question bank), and the other has 12,000 different questions (the large question bank). The link to the data is https://drive.google.com/file/d/1_EdCUNyqC9IGziFUIf4mqs0G1qHtQyGI/view. The small question bank consists of multiple sections, and each section has more than 150 questions with different difficulty levels (Figure 2). The large question bank includes 12,000 different questions in which each part has 1,000 questions with different difficulty levels (Figure 3). The experimental parameters of MMPSO are presented in Table 4. The results are shown in Tables 5 and 6 and Figures 4 and 5.
Our experiments focus on implementing formula (3) and on evaluating its two functions, which are the average level of difficulty of the tests (F_{1}) and the total test duration (F_{2}).
5.2. Evaluation Method
In this part, we present the formula used to evaluate how stably each algorithm produces the required tests under various weight constraints (α). The main measure is the standard deviation, which is defined as follows:
$$A = \sqrt{\frac{\sum_{i=1}^{z} (y_{i} - \bar{y})^{2}}{z - 1}},$$
where z is the number of experimental runs, \bar{y} is the average fitness of all runs, and y_{i} is the fitness value of the i-th run.
The standard deviation is used to assess the stability of the algorithms. If its value is low, then the generated tests of each run do not have much difference in the fitness value. The weight constraint α is also being examined as it balances the objective functions. In our cases, a change in α can shift the importance towards the test duration constraint, the test difficulty constraint, or the balance between those two. We can select α to suit what we require, emphasizing more on either the test duration or the test difficulty.
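As a small illustration of this stability measure, the sketch below computes the average fitness and the standard deviation over a list of per-run fitness values using Python’s standard library. It assumes the sample form of the standard deviation (`statistics.stdev`); if the population form is intended, `statistics.pstdev` would be used instead. The fitness values shown are hypothetical.

```python
import statistics


def stability(fitness_per_run):
    """Return (average fitness, sample standard deviation) over all runs."""
    mean = statistics.fmean(fitness_per_run)
    spread = statistics.stdev(fitness_per_run)  # sample form, assumed
    return mean, spread


# Hypothetical fitness values of three runs.
mean, spread = stability([0.02, 0.04, 0.06])
```

For these three hypothetical runs, the average is 0.04 and the sample standard deviation is 0.02; a smaller spread indicates that repeated runs give more similar fitness values.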
5.3. Experimental Results
The experiments are executed with the parameters in Table 4, which follow Ridge and Kudenko [36], and the results are presented in Table 5 (run on the computer with 4 CPUs) and Table 6 (run on the computer with 16 CPUs). The comparisons of runtime and fitness for the small and large question banks are presented in Figures 4 and 5. Regarding Tables 5 and 6, each run extracts 100 tests simultaneously, and each test has a fitness value. Each run also requires several iteration loops to successfully extract 100 candidate tests. The average runtime for extracting tests is the average of the runtimes of all 50 experimental runs. The average number of iteration loops is the average of the required loops of all 50 runs. The average fitness is the average of the fitness values of all 5000 generated tests. The average duplicate indicates the average number of duplicate questions among the 100 generated tests over all 50 runs. The average duplicate is also used to indicate the diversity of tests. The lower the value, the more diverse the tests.
When α is in the lower range [0.1, 0.5], the correctness of the difficulty value of each generated test is emphasized more than that of the total test time. Based on the average fitness value, all algorithms appear to have a harder time generating tests in the lower range [0.1, 0.5] than in the higher range [0.5, 0.9]. Additionally, when α increases, the runtime starts to decrease, the fitness gets better (i.e., the fitness values get smaller), and the number of loops required for generating tests decreases. Apparently, satisfying the test difficulty requirement is harder than satisfying the requirement for total test time. The experimental results also show that integrating SA gives better fitness values than running without SA. However, the runtimes of the algorithms with SA are longer as a tradeoff for better fitness values.
All algorithms can generate tests with acceptable percentages of duplicate questions among generated tests. The proportion of duplicate questions between generated tests depends on the size of the question bank. For example, if the question bank’s size is 100 and we need to generate 50 tests in a single run with 30 questions in each test, then some generated tests must contain questions that also appear in other generated tests.
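The example above can be made explicit with a short counting argument (a worked illustration only, using the numbers given in the example):

```latex
\begin{align*}
\text{question slots to fill} &= 50 \times 30 = 1500, \\
\text{distinct questions available} &= 100, \\
\text{average number of tests per question} &= \frac{1500}{100} = 15, \\
\text{maximum number of pairwise disjoint tests} &= \left\lfloor \frac{100}{30} \right\rfloor = 3 .
\end{align*}
```

Since at most 3 of the 50 tests can be pairwise disjoint, duplicate questions between tests are unavoidable for a bank of this size, whichever algorithm is used.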
Based on the standard deviation in Tables 5 and 6, all MMPSO algorithms with SA are more stable than those without SA since the standard deviation values of those with SA are smaller. In other words, the differences in fitness values between runs are smaller with SA than without SA. The smaller standard deviation values and smaller average fitness values also mean that we are less likely to need to rerun the MMPSO with SA algorithms many times to get generated tests that fit the test requirements well. The reason is that the generated tests we obtain at the first run are likely to be close to the requirements (due to the low fitness value), and the chance of obtaining generated tests that fit the requirements less well is low (due to the low standard deviation value).
6. Conclusions and Future Studies
Generation of question papers from a question bank is an important activity in extracting multiple-choice tests. A question bank is of good quality when its questions are diverse in difficulty level and large in number. In this paper, we propose the use of MMPSO to solve the problem of generating multiobjective multiple k tests in a single run. The objectives of the tests are the required level of difficulty and the required total test time. The experiments evaluate two algorithms, MMPSO with SA and normal MMPSO. The results indicate that MMPSO with SA gives better solutions than normal MMPSO based on various criteria such as diversity of solutions and numbers of successful attempts.
Future studies may focus on investigating the use of the proposed hybrid approach [37, 38] to solve other NP-hard and combinatorial optimization problems, and on fine-tuning the PSO parameters by using some type of adaptive strategy. Additionally, we will extend our problem to provide feedback to instructors from multiple-choice data, such as using fuzzy theory [39] and PSO with SA for mining association rules to compute the difficulty levels of questions.
Data Availability
The data used in this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.