Abstract

Internet-enabled technologies have given people a way to communicate and collaborate with each other, and this collaboration and communication have made crowdsourcing an efficient and effective activity. Crowdsourcing is a modern paradigm that employs low-cost labor (the crowd) to accomplish different types of tasks. A task is usually posted online as an open call, and members of the crowd self-select a task to carry out. Crowdsourcing involves the initiators or crowdsourcers (an entity, usually a person or an organization, that initiates the crowdsourcing process and seeks out the ability of the crowd for a task), the crowd (online participants who have a particular background, qualification, and experience for accomplishing a task in a crowdsourcing activity), the crowdsourcing task (the activity to which the crowd contributes), the process (how the activity is carried out), and the crowdsourcing platform (the software or marketplace where requesters offer various tasks and crowd workers complete them). Because crowdsourcing is carried out in an online environment, it gives rise to certain challenges. The major problem is crowd selection, which is becoming more challenging as crowdsourcing grows in popularity. Crowd selection has been investigated extensively in crowdsourcing processes. Nonetheless, it has been observed that selection is usually based on only a single feature of the crowd worker, which is not sufficient for appropriate crowd selection. To address this problem, this paper presents a novel “ant colony optimization-based crowd selection method” (ACO-CS) that selects crowd workers based on multicriteria features. The proposed model is expected to increase the efficiency and effectiveness of crowdsourcing activity.

1. Introduction

Nowadays, individuals voluntarily offer their own time, talent, and money to engage in activities that include helping the poor and making the planet a better place [1]. The Internet has made it easier for people to connect and to take part in collaborative work, and this collaboration of people over the Internet is conceptualized in the term “crowdsourcing” [2]. Crowdsourcing is a modern paradigm that employs low-cost labor (the crowd) to accomplish different types of tasks. A task is usually posted online as an open call, and members of the crowd self-select a task [3]. Crowdsourcing is an online participative activity in which organizations announce an open call and draw on a heterogeneous group of people who have the knowledge and skills to complete the task [4, 5]. Improvements in social networking have made it possible for organizations to pool collective knowledge from people around the world, i.e., the “wisdom of crowds,” to find the best solutions to various problems [6]. Widely used social networking services act as a massive pool of workers. These workers vary in demography and in their population. The information in their profiles is used to infer their abilities and preferences. Crowdsourcers enlist specific crowds through built-in functionality, such as private messages on Facebook and “@” mentions on Twitter. There is, therefore, an evolving trend of “managed crowdsourcing,” in which workers are actively selected [7, 8].
The process of crowdsourcing involves the initiators or crowdsourcers (an entity, usually a person or an organization, that initiates the crowdsourcing process and seeks out the ability of the crowd for a task), the crowd (online participants who have a particular background, qualification, and experience for accomplishing a task in a crowdsourcing activity), the crowdsourcing task (the activity to which the crowd contributes), the process (how the activity is carried out), and the crowdsourcing platform (the software or marketplace where requesters offer various tasks and crowd workers complete them) [9–12]. Crowdsourcing is widely used in domains such as image tagging [13], schema matching [14], and entity resolution [15].

The crowdsourcing task is delivered to the group of people (the crowd) who complete it. Task allocation is an important characteristic of the crowdsourcing context, and it requires suitable techniques. If tasks are allocated correctly, the best outcomes are obtained [11]. In crowdsourcing, a task may be outsourced to a dispersed crowd of workers who might be inexperienced in such tasks [16]. Crowd selection is becoming more challenging as crowdsourcing grows in popularity. Some workers may be untrustworthy and may make errors in solving various types of tasks. Crowdsourcing is effective only if an appropriate crowd is selected [17–19]. Crowd workers differ along several dimensions, so tasks must be delegated on the basis of different features [20]. A participant may be identified by the basic information available in the worker's profile, such as gender, nationality, education level, major, personality test score [21], skills, and willingness to perform tasks [22].

The diversity, size, and profile differences of crowd workers across many dimensions (e.g., skills, motives, and socioeconomic backgrounds) are the foundation for the success of a given crowdsourced task. In addition to the involvement of dishonest workers, the differences among workers may also contribute to the varying quality of the responses received [23–25].

Identifying high-quality workers is a significant, complex, and realistic problem because, in a crowdsourcing activity, tasks are accomplished by a global crowd whose size and nature are unknown. Crowd attitudes, behaviors, and skills must be identified before tasks are assigned. High-quality work can be accomplished by workers who possess particular features, such as level of education, major, and age [21]. Crowd selection is a complex problem in which the skills and knowledge of a huge number of crowd workers are matched with the requirements of a job [26]. Crowd selection or formation is an optimization problem that involves all types of approaches to building a crowd group to whom various tasks will be offered [27]. Various features of crowd workers were identified from existing studies with the aim of using them for crowd selection in crowdsourcing. To the best of our knowledge, the proposed “ACO-CS” is a novel approach, as no prior study addresses the problem of crowd selection in crowdsourcing using the ant colony approach. This approach is expected to increase the effectiveness and efficiency of crowdsourcing activity considerably, as it selects appropriate crowd workers according to multicriteria features for accomplishing various crowdsourced tasks.

2. Literature Review

Crowd selection is an important step in the crowdsourcing process, as the selection can either make the activity effective or undermine it. The techniques used for crowd selection in the literature are discussed in the following subsections.

2.1. Crowd Selection Based on Trustworthiness

Trust is the key element in choosing employees for a task [19, 27–29]. Trust plays a significant role in practices such as truth discovery and worker selection [30]. An employer can employ a trustworthy crowd worker [28]. SSC (strong social component) and C-AWSA (context-aware worker selection approach) are used to classify trustworthy workers and are reported to be accurate and useful for choosing trustworthy employees. The approach uses trust quality for optimization purposes, and the forward search algorithm is used to measure the trust of a worker [19]. Crowd trust efficiently chooses trustworthy workers according to an approval rate. The model effectively distinguishes dishonest workers from trustworthy workers [31].

2.2. Crowd Selection on the Basis of Expertise Filtering

The expertise level [11, 26–28, 32–37] of the crowd is estimated by expertise filtering. The right crowd for a task is selected according to its expertise level [33, 34]. Workers are assigned tasks if they have expertise in the appropriate field [27]. An individual who has expertise in a task will attempt to complete it with full attention [26]. Crowds of diverse expertise levels are selected for tasks [35].

2.3. Crowd Selection on the Basis of Individual Profiles

Workers interested in participating in various tasks are required to build profiles that contain worker information, such as age, gender, skill, interest, and accomplished task history, and that are stored on the platform in the worker profile database [38–40]. There are three forms of profile: a declarative profile is created by the workers themselves, a derived profile is determined from the user's interaction with the system, and a hybrid profile contains both declarative and derived information [41]. Profiling evaluates the ability of individuals to work [32]. Profiles contain three types of information: first, the voluntary information that workers provide about themselves; second, the information and criteria that the platforms collect about their job performance; third, the assessments of their customers. Unlike crowd discourses, these platforms allow employees to modify their individual profiles with varying degrees of independence. This is not a concession of autonomy and self-determination to online workers; rather, it results from the competitive structure of the global labor market. Platforms assist consumers in evaluating workers by providing individually tailored profiles (in the absence of traditional recruitment documents and job interviews) [42]. Relevant workers for tasks are filtered using their profile information [34, 43]. These profiles are updated based on the level of task quality delivered [38].

2.4. Crowd Selection on the Basis of Task-Related Qualification

Requesters may define several qualification criteria for workers. Such criteria often include test forms that are necessary to assess their qualifications [34]. Skill tests are also carried out to confirm worker qualification [44]. Workers are selected on the basis of the relevant qualifications they possess. To evaluate the ability of workers, qualification tests are conducted, and a worker must pass the required examination to work on a project. The test determines the capacity of the worker for the activity and ensures that each worker has job-related knowledge. Workers are selected on the basis of their scores in qualification tests [45, 46].

2.5. Crowd Selection on the Basis of Experience

Workers are selected according to their experience in a task [47]. By utilizing the experience strategy, experienced workers are selected [48]. The selection of the experienced worker can make significant differences in the results of a task, i.e., it can produce high-quality results [35].

2.6. Crowd Selection on the Basis of Skills

Skill is a major personal attribute considered in selecting appropriate participants [49]. Organizations select a skilled workforce for various tasks [42]. Because quality depends highly on workers’ skills [43], an initial screening of the crowd workers is carried out [50]. This screening is also referred to as skill assessment; it evaluates the crowd according to the skills it possesses and helps in matching skilled labor to a task [26, 51]. Activity-based positions on the platform are allocated to workers with the essential training and skills. Crowd workers are automatically assigned to tasks if they possess the required skills [27].

2.7. Need for the Proposed Study

Existing studies address the problem of crowd selection using only a single feature or a few features of the crowd, which is not sufficient for selecting an appropriate crowd to carry out a crowdsourcing activity. Our proposed model “ACO-CS” is intended to increase the efficiency and effectiveness of crowdsourcing activity because it selects the crowd based on multicriteria features; an appropriate multifeatured crowd is therefore selected to accomplish the various tasks in a crowdsourcing activity.

3. Methodology

The selection of quality crowd workers is a challenging issue. Workers are identified by their unique features [21]. These features were identified in the literature analysis, and the selection of the crowd depends heavily on them. The feature set contains a large amount of redundant data, which affects the appropriate selection of crowd workers. To remove redundancy and complexity, the features are filtered.

3.1. Feature Selection

A feature is an individual measurable property of the process being investigated [52]. Feature selection (FS) is used in machine learning, particularly for problems with complex feature sets. Feeding a wide range of features into a recognition model not only raises the computational burden but also creates the issue widely known as the curse of dimensionality. With feature selection, a large and complex dataset is reduced because only the appropriate features are retained. Feature selection is applied in many areas, including text categorization, data mining, pattern recognition, and signal processing [53]. Using the feature selection approach, a feature subset is selected from the original set while the accuracy of the original set is preserved. Efficacy and scalability are increased by eliminating unnecessary and redundant features [54]. When dealing with larger feature datasets, feature selection is obligatory. FS is a requirement in real-world problems because of the proliferation of noisy, meaningless, or misleading features.

Because data with hundreds of variables lead to high-dimensional datasets, researchers have used many feature selection strategies to select the best features. These techniques provide a way to reduce computing time, boost prediction efficiency, and better understand the machine learning or pattern recognition process. The crowd consists of diverse and multidisciplinary people who possess unique features. Crowd features were collected from existing research studies and comprise both positive and negative features. Identifying these features was necessary for distinguishing appropriate from inappropriate crowd workers. The features were then filtered to remove ambiguity and complexity. A crowd set with multiple features (Table 1) was obtained by combining the features captured during the literature analysis. According to these multicriteria features, our proposed method selects or rejects various types of crowd. The features of the crowd are presented in Table 2 (negative features), Table 3 (positive features), and Table 1 (overall, i.e., negative and positive features).
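As an illustration of how positive and negative features could be combined into a single multicriteria score, the following minimal Python sketch rates each worker by the share of required positive features matched, minus a penalty for each negative feature. The feature names, the `Worker` class, the `desirability` function, and the penalty of 0.25 are hypothetical choices made for this example; they are not taken from Tables 1 to 3.

```python
from dataclasses import dataclass, field


@dataclass
class Worker:
    """A crowd worker described by sets of positive and negative features."""
    name: str
    positive: set = field(default_factory=set)
    negative: set = field(default_factory=set)


def desirability(worker, required, penalty=0.25):
    """Share of required positive features matched, minus a penalty
    per negative feature, clipped at zero (hypothetical scoring rule)."""
    if not required:
        return 0.0
    matched = len(worker.positive & required) / len(required)
    return max(0.0, matched - penalty * len(worker.negative))


# Hypothetical requirement set and workers, for illustration only.
required = {"skills", "experience", "qualification", "trustworthiness"}
workers = [
    Worker("C1", {"skills", "experience", "qualification", "trustworthiness"}),
    Worker("C2", {"skills"}, {"dishonesty"}),
    Worker("C3", {"experience", "qualification"}),
]

scores = {w.name: desirability(w, required) for w in workers}
for name, score in scores.items():
    print(name, score)
```

A score of this kind could serve as the heuristic information that guides the ants, with workers who score zero being natural candidates for rejection.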

3.2. Background

Ant colony optimization (ACO) is a swarm intelligence technique that was introduced by M. Dorigo and his colleagues in the early 1990s [97]. They were inspired by the foraging behavior of certain species of ant. These ants deposit pheromones (chemical substances) on the ground to mark favorable paths that other members of the colony should follow. The ant colony optimization approach is used to solve optimization problems [98]. Ants use two quantities in solving problems: heuristic knowledge and the pheromone value. Quality outcomes are generated as a result of the mutual communication among artificial ants. This communication is indirect: ants sense the pheromone trail values left by other ants. Ants do not adapt themselves; instead, they adaptively change the way other ants represent and perceive the problem [99]. ACO has mostly been applied to ordering problems. One of the recent trends in ACO is to address existing problems in the industrial sector [98].

3.3. Proposed Method

In crowdsourcing operations, the ACO model can be applied to the problem of crowd selection. Implementing the ACO algorithm for crowd selection involves several steps. Figure 1 shows the overall operation of the proposed “ant colony optimization-based crowd selection method (ACO-CS).” Crowd selection starts with the generation of ants that traverse various paths (edges) and select crowds on the basis of the pheromone value present on the edges. If the ant traversal satisfies the stopping criterion, the ants stop (the traversal terminates) and the best subset of crowds is generated, which is later used for assigning different tasks. If the traversal does not satisfy the stopping criterion, the pheromone value is updated and the process is initiated once again.

3.3.1. Ant Colony Optimization-Based Crowd Selection Method

A set of 10 crowd workers is considered (Table 1). The aim of the crowd selection technique is to obtain a crowd subset that is smaller than the original crowd set while still representing the original set with high accuracy. Within a partial solution, the crowd workers may be selected in any order. Likewise, the next crowd worker to be chosen is generally not affected by the previously selected one. Nor do the solutions to the crowd selection problem need to be of equal size. The following steps are considered in mapping the crowd selection problem to the ACO algorithm:
(i) The graphical representation
(ii) Pheromone and heuristic desirability
(iii) Updating the pheromone value
(iv) Outcomes formation

(1) The Graphical Representation. The problem of crowd selection can be defined as an ant colony optimization problem. ACO generally represents the problem in graphical form, as shown in Figure 2. The nodes represent the individual crowds, and the edges represent the choice of the corresponding crowd. The nodes are linked with each other to permit the selection of any crowd. An optimal subset of the crowd is selected when an ant traverses the graph, i.e., visits various nodes. The ant traversal must satisfy the stopping criterion (the optimal, appropriate multifeatured crowds are selected). In Figure 2, the ants A, B, C, D, E, F, G, H, I, and J initially leave their nest and start traversing to different nodes, such as C1 or C2, and subsequently to C3, C4, C5, C6, C7, C8, C9, and C10. During the traversal, the ants leave pheromone, a chemical substance, on the edges (Figure 3). The ants move according to a probability that depends on the pheromone level on the edges, i.e., if the pheromone levels are high, the ant follows only the edges with high pheromone values (bold lines) and selects only the nodes they lead to (Figure 4). Ant A leaves the nest and selects node C1; then, using the transition rule, it selects crowd C4. Next, it chooses C5, C8, and C9. On reaching C9, the ant traversal satisfies the stopping criterion, so the ant stops its traversal and provides a partial solution of the original crowd set “C” that consists of crowd workers C1, C4, C5, C8, and C9. This crowd worker subset achieves high accuracy and is then used as the nominee for different tasks.
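The graph and the walk of ant A can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the pheromone levels (0.9 on the trail, 0.1 elsewhere), the `follow_strongest_trail` function, and the use of a subset size of five as the stopping criterion are hypothetical choices made so that the example reproduces the path described in the text.

```python
from itertools import combinations

# Ten crowd nodes, C1 ... C10, as in the example.
crowd = [f"C{i}" for i in range(1, 11)]

# Fully connected graph: every pair of nodes shares an edge that
# starts with a low pheromone level (hypothetical value).
pheromone = {frozenset(pair): 0.1 for pair in combinations(crowd, 2)}

# Hypothetical strong trail along the path followed by ant A in the text.
for a, b in [("C1", "C4"), ("C4", "C5"), ("C5", "C8"), ("C8", "C9")]:
    pheromone[frozenset((a, b))] = 0.9


def follow_strongest_trail(start, subset_size):
    """Greedy walk: from the current node, always move along the edge
    with the highest pheromone level to a node not yet visited."""
    path = [start]
    while len(path) < subset_size:  # assumed stopping criterion
        current = path[-1]
        candidates = [c for c in crowd if c not in path]
        nxt = max(candidates, key=lambda c: pheromone[frozenset((current, c))])
        path.append(nxt)
    return path


path = follow_strongest_trail("C1", 5)
print(path)  # ['C1', 'C4', 'C5', 'C8', 'C9']
```

The greedy rule is a simplification of "the ant follows only high-pheromone edges"; a probabilistic rule would occasionally explore weaker edges as well.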

(2) Pheromone and Heuristic Desirability. Initially, teams of crowd workers are evaluated to identify the best crowd workers. A simple multistart local search method decides the initial selection of crowd alternatives. In general, the ACO algorithm uses a heuristic function in combination with the pheromone value to make the right transition. Quality crowd workers are selected by calculating the pheromone and heuristic values. If a crowd worker is to be selected, the pheromone value is assigned the greater value “1.” If a crowd worker is not to be selected, the pheromone value is kept at the smaller value “0.” An ant at C1 determines whether or not C3 is to be selected, and the decision follows the highest pheromone probability on the edges. This probability is calculated using equation (1).

Equation (1) determines the probability of each ant selecting a node; its terms are the selection probability, the edges, and the heuristic desirability. If an edge is to be selected, its value is kept higher; otherwise, it is kept lower. The traversal and the selection of a node (i.e., a crowd) depend on the pheromone value: the ant traverses the edge that has the greater pheromone value.
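For reference, the transition rule commonly used in ACO takes the form below. The notation is the conventional one and is an assumption here, not necessarily the authors' own: $p_{ij}^{k}$ is the probability that ant $k$ moves from node $i$ to node $j$, $\tau_{ij}$ is the pheromone on edge $(i, j)$, $\eta_{ij}$ is the heuristic desirability of that move, $\alpha$ and $\beta$ weight the two terms, and $N_i^{k}$ is the set of nodes that ant $k$ has not yet visited.

```latex
p_{ij}^{k} =
\begin{cases}
\dfrac{\tau_{ij}^{\alpha}\,\eta_{ij}^{\beta}}
      {\sum_{l \in N_i^{k}} \tau_{il}^{\alpha}\,\eta_{il}^{\beta}}, & j \in N_i^{k},\\[2ex]
0, & \text{otherwise.}
\end{cases}
```

Under this rule, an edge with more pheromone or a more desirable target node receives a proportionally larger share of the probability, which matches the behavior described in the text.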

(3) Updating the Pheromone Value. If the ants' traversal does not satisfy the stopping condition, the pheromone is modified, a new collection of ants is generated, and the process iterates once again. The pheromone on each edge is modified according to an update formula whose pheromone trail decay coefficient takes a value between 0 and 1. If the stopping criterion is not satisfied (i.e., the best crowd subset is not produced), the ants modify the pheromone. The best ants lay more pheromone on the nodes of the best solution, and as a result, optimal solutions are revealed.
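A pheromone update of the kind widely used in ACO is shown below as a point of reference. The symbols are conventional and are assumptions here: $\tau_{ij}$ is the pheromone on edge $(i, j)$, $\rho$ is the pheromone trail decay coefficient, $m$ is the number of ants, and $\Delta\tau_{ij}^{k}$ is the amount deposited by ant $k$. The quality measure $q_k$ of the crowd subset built by ant $k$ is a hypothetical placeholder for whatever evaluation the method applies.

```latex
\tau_{ij} \leftarrow (1 - \rho)\,\tau_{ij} + \sum_{k=1}^{m} \Delta\tau_{ij}^{k},
\qquad 0 < \rho \le 1,
\qquad
\Delta\tau_{ij}^{k} =
\begin{cases}
q_k, & \text{if ant } k \text{ traversed edge } (i, j),\\
0, & \text{otherwise.}
\end{cases}
```

The first term models evaporation, which lets poor choices fade, and the second term reinforces the edges used by good solutions.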

(4) Outcomes Formation. The ACO-CS process begins with the generation of a number of randomly positioned ants. These ants are placed on the graph, and the number of ants is set equal to the number of crowds (i.e., both are 10). Every ant starts building a path from a specific crowd. From these starting positions, the ants cross the nodes in a probabilistic fashion until the stopping condition is fulfilled (the optimal, appropriate multifeatured crowds are selected). The resulting crowd subset is gathered and examined to determine whether it is an ideal subset. When the best crowd subset has been found, the experiment ends and the subset is recorded (Figure 1). When the conditions are not met, the pheromone is updated, a new colony of ants is created, and the process repeats.
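The four steps can be put together in a compact Python sketch. It is an illustration under stated assumptions rather than the authors' implementation: the function name `aco_cs`, the per-worker desirability scores, the parameters `alpha`, `beta`, and `rho`, the fixed subset size, the fixed iteration count used as the stopping criterion, and the average desirability used as the quality of a subset are all hypothetical.

```python
import random


def aco_cs(desirability, subset_size, n_iterations=30,
           alpha=1.0, beta=2.0, rho=0.1, seed=0):
    """Sketch of ACO-based crowd selection (assumed parameters).

    desirability maps each worker to a heuristic score in [0, 1].
    Returns the best subset found (sorted) and its average score."""
    rng = random.Random(seed)
    workers = list(desirability)
    # Uniform initial pheromone on every directed edge (i, j).
    tau = {(i, j): 1.0 for i in workers for j in workers if i != j}
    best_subset, best_score = None, float("-inf")

    for _ in range(n_iterations):
        solutions = []
        for start in workers:  # one ant per crowd node
            path = [start]
            while len(path) < subset_size:
                current = path[-1]
                candidates = [w for w in workers if w not in path]
                # Pheromone times heuristic desirability; the small
                # constant keeps every weight strictly positive.
                weights = [tau[(current, w)] ** alpha
                           * (desirability[w] + 1e-9) ** beta
                           for w in candidates]
                path.append(rng.choices(candidates, weights=weights, k=1)[0])
            score = sum(desirability[w] for w in path) / subset_size
            solutions.append((score, path))
            if score > best_score:
                best_score, best_subset = score, path

        # Evaporation on every edge, then deposit along each ant's path.
        for edge in tau:
            tau[edge] *= 1.0 - rho
        for score, path in solutions:
            for i, j in zip(path, path[1:]):
                tau[(i, j)] += score

    return sorted(best_subset), best_score


# Hypothetical scores: C1, C4, C5, C8, and C9 are the strong workers.
strong = {"C1", "C4", "C5", "C8", "C9"}
scores = {f"C{i}": (0.9 if f"C{i}" in strong else 0.1) for i in range(1, 11)}
best, quality = aco_cs(scores, subset_size=5)
print(best, round(quality, 2))
```

With these illustrative scores, the search converges on the subset of strong workers; in practice the stopping criterion and the quality measure would need to reflect the multicriteria features of the crowd.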

4. Results and Discussion

The process of crowdsourcing involves Internet crowds [5, 90]. Crowds are recruited from Internet-enabled societies. Crowd workers possess attributes such as qualification, age, gender, language, worker place, skills, past service, and experience. Workers are selected based on these characteristics [6, 11, 39, 100]. An employer should pick workers carefully to produce quality results [56]. The success of the organization depends on the allocation of tasks to members of the crowd, which requires adequate control systems, such as worker screening [11, 41]. Care must be taken when choosing the crowd, as it has been noted that quality increases with a varied choice of crowds [50]. In our crowd selection process, a crowd set of 10 crowds (“C1, C2, C3, C4, C5, C6, C7, C8, C9, and C10”) possessing multiple features (positive and negative) is arranged in a graph structure. The crowds are represented as nodes connected to each other by edges, and an equal number of ants (10) is generated to traverse the edges and select crowd nodes, i.e., partial solutions of the crowd set. The partial solutions (crowd subsets of C) are evaluated based on the pheromone probability on the paths. If a traversal meets the stopping criterion (i.e., it selects the best crowd subset), the ants stop their traversal and produce the best crowd subset having multiple features. If the ants do not satisfy the stopping criterion, the pheromones are updated and the process is initiated for a second iteration. The selection and rejection of nodes (crowds) depend on the pheromone probability on each edge: if the value is higher, the nodes are selected and the best crowd subset is produced during the ants' traversal; if the pheromone value is lower, the edge and, in turn, the crowd node are rejected. Table 4 presents the ants and their selected paths.
The probability of the edges is calculated to find the best traversal path, which in turn selects only the appropriate crowd to be assigned different tasks in various activities. In our crowd selection method, the crowd subset C1, C4, C5, C8, and C9, obtained by the traversal of ant A, is selected because the pheromone probabilities on the edges linking these nodes (crowds) were higher than those on the edges linking the other nodes (crowds).

5. Conclusion

In this study, we proposed ant colony optimization-based crowd selection for the problem of crowd selection in crowdsourcing. The key contribution of our research is to select the crowd based on multicriteria features, whereas previous approaches selected the crowd based on a single feature or a few features, which does not guarantee appropriate crowd selection and thus affects the crowdsourcing activity. With the presented approach (ACO-CS), the best crowds are selected according to multicriteria features, which will, in turn, make the crowdsourcing activity efficient and effective. The “ACO-CS” model is presented theoretically in this paper, with a small number of crowds. In the future, we will implement it in practice. Because the model selects the crowd on the basis of multicriteria features, we expect it to be more effective than the existing techniques used for crowd selection and to play an important role in the crowd selection phase of crowdsourcing activity.

Data Availability

No data were used to support the findings of this study.

Conflicts of Interest

The authors declare that they have no conflicts of interest.