Wireless Communications and Mobile Computing

Volume 2019, Article ID 3525347, 8 pages

https://doi.org/10.1155/2019/3525347

## Attribute Reduction Based on Genetic Algorithm for the Coevolution of Meteorological Data in the Industrial Internet of Things

^{1}Division of Science and Technology, Nanjing University of Information Science and Technology, Nanjing 210044, China

^{2}Department of Computer and Software, Nanjing University of Information Science and Technology, Nanjing 210044, China

^{3}Department of Electronics, Binjiang College, Nanjing University of Information Science and Technology, Nanjing 210044, China

^{4}School of Information and Safety Engineering, Zhongnan University of Economics and Law, Wuhan 430073, China

Correspondence should be addressed to Shaohua Wan; shaohua.wan@ieee.org

Received 23 August 2018; Revised 17 November 2018; Accepted 11 December 2018; Published 3 January 2019

Guest Editor: Mianxiong Dong

Copyright © 2019 Yong Cheng et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

To address the attribute redundancy in meteorological data from the Industrial Internet of Things (IIoT) and the low efficiency of existing attribute reduction algorithms, attribute reduction based on a genetic algorithm for the coevolution of meteorological data is proposed. The evolutionary population is divided into two subpopulations: one subpopulation uses elite individuals to assist crossover operations and thereby increases the convergence speed of the algorithm, and the other subpopulation balances the population diversity during evolution by introducing a random population; the two subpopulations complete the evolutionary operations together. The proposed algorithm, together with the TSDPSO-AR and ARAGA algorithms, was applied to attribute reduction for precipitation in meteorological data. The results showed that the proposed algorithm maintained the diversity of the population during evolution, improved the reduction performance, and simplified the information system.

#### 1. Introduction

With the development of Internet of Things technology, large numbers of sensors and smart terminals are being deployed in traditional industries, which leads to tremendous growth in data volume. How to manage large amounts of data effectively in the Industrial Internet of Things (IIoT) to improve industrial production efficiency has become an urgent problem. The number of meteorological elements is increasing remarkably, which brings some challenges [1–4]. To address this issue, this paper designs configurable meteorological data acquisition to meet the needs of more application scenarios. The increase in data volume benefits the mining of potential meteorological patterns, but meteorological data are collected without a clear purpose, and a change in a meteorological phenomenon is related to only some of the collected meteorological elements, so the attribute redundancy in the collected data is large. These redundant attributes reduce both the efficiency and the accuracy of meteorological data mining. Therefore, it is very important to perform attribute reduction on collected weather data. Rough set theory is a data mining method that deals effectively with fuzzy and uncertain information; one of its core tasks is deleting redundant attributes from the knowledge base while keeping the decision ability of the knowledge base unchanged [5]. Therefore, using attribute reduction to delete redundant attributes in meteorological data and improve the mining efficiency has important practical significance [6]. Many scholars at home and abroad have conducted in-depth research on this method and have made remarkable achievements. It has been proven that finding the minimum attribute reduction of an information system is an NP-hard problem.
Therefore, many scholars use heuristic algorithms to improve the reduction efficiency. You Z et al. discuss the attribute kernel and attribute reduction operation of multiple decision tables in a distributed environment and propose an information entropy reduction algorithm based on the vertical distribution in the multidecision table [7]. This method reduces the communication cost in the process of distributed reduction and improves the reduction efficiency through parallel and conditional information entropy and elements in the transmission class. Heuristic-based attribute reduction improves the reduction efficiency to some extent, but there are still some shortcomings. To further improve the reduction efficiency, many scholars combine rough set attribute reduction with other optimization algorithms. Ding Weiping et al. proposed a coevolution reduction algorithm combined with the quantum frog group [8]. By using the optimal execution experience of the frog group and elite individuals to guide the model group quickly toward the target direction, the convergence efficiency and global search ability of the attribute reduction are improved; however, this cooperative coevolution reduction algorithm is suited to high-dimensional data sets, and its reduction performance drops greatly on data with smaller dimensions. Chen J et al. proposed an efficient rough set clustering algorithm based on a genetic algorithm [9]. The global search ability of the genetic algorithm was used to improve the convergence speed of the algorithm. Zhang Rongguang et al. proposed a rough set attribute reduction algorithm based on particle swarm optimization [10]. By introducing an improved tabu search algorithm, the local search strategy of the particle swarm optimization was improved, and the diversity of the population was increased. Based on this background, attribute reduction based on the genetic algorithm for the coevolution of meteorological data (AECMD) is proposed.
The algorithm divides the evolutionary population into two subpopulations. One subpopulation quickly guides population evolution through elite-assisted crossover operations, and the other subpopulation maintains population diversity by introducing random populations in the later stages of evolution. The second subpopulation also uses the assisted crossover strategy, so that random individuals whose fitness values are too small do not adversely affect the evolutionary population. Through the coevolution of the two subpopulations, the reduction performance of the algorithm is improved over the entire evolutionary process.

#### 2. Attribute Reduction in the Rough Set Theory

##### 2.1. Information System

Formally, an information system can be described as follows [11–18]. Let $S = (U, A, V, f)$, where $U$ is a nonempty finite set of objects (i.e., the domain) and $A = C \cup D$ is an attribute set, in which $C$ and $D$ represent the conditional attribute set and the decision attribute set, respectively. $V = V_C \cup V_D$, where $V_C$ represents the conditional attribute values and $V_D$ the decision attribute values. $f: U \times A \to V$ is an information function, which assigns a value to each attribute of each object in the information system. In the information system, $R$ represents a cluster of equivalence relations on $U$. If $P \subseteq R$ and $P \neq \emptyset$, we define $\mathrm{IND}(P) = \bigcap P$, which is an indiscernibility relation. Obviously, $\mathrm{IND}(P)$ is an equivalence relation.
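The indiscernibility relation can be illustrated by grouping objects that share the same values on a chosen attribute subset. The following is a minimal sketch, not the authors' implementation; the toy weather table, its attribute names, and the function `partition` are assumptions introduced only for illustration.

```python
from collections import defaultdict

def partition(table, attrs):
    """Return the equivalence classes induced on the objects by `attrs`.

    Two objects fall into the same class when they take identical values
    on every attribute in `attrs` (they are indiscernible).
    """
    classes = defaultdict(set)
    for obj, row in table.items():
        key = tuple(row[a] for a in attrs)
        classes[key].add(obj)
    # Sort by smallest object id so the result is deterministic.
    return sorted(classes.values(), key=min)

# Hypothetical toy table: object id -> attribute values.
table = {
    1: {"temp": "high", "humidity": "high", "rain": "yes"},
    2: {"temp": "high", "humidity": "high", "rain": "yes"},
    3: {"temp": "low", "humidity": "high", "rain": "no"},
    4: {"temp": "low", "humidity": "low", "rain": "no"},
}

by_temp = partition(table, ["temp"])
by_temp_humidity = partition(table, ["temp", "humidity"])
```

Adding an attribute can only split existing classes, never merge them, which is why objects 3 and 4 separate once `humidity` is considered.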

##### 2.2. Attribute Reduction and Attribute Core

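Attribute subsets $P$ and $Q$ are equivalence relation clusters on the domain $U$. $C$ and $D$ are the conditional attribute set and the decision attribute set, respectively, and $P \subseteq C$, $Q \subseteq D$. The $Q$ positive region of $P$ is recorded as [7, 19–29]

$$\mathrm{POS}_P(Q) = \bigcup_{X \in U/Q} \underline{P}(X),$$

where $\underline{P}(X)$ is the lower approximation of $X$ with respect to $P$. If $a \in P$ and $\mathrm{POS}_P(Q) = \mathrm{POS}_{P - \{a\}}(Q)$, then $a$ is said to be dispensable in $P$ (it can be omitted); otherwise, $a$ is necessary in $P$.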

In the attribute subset $P$, the set of all necessary attributes is called the core of $P$, denoted $\mathrm{CORE}_Q(P)$. In the information system, if the attribute subset $P' \subseteq P$ is $Q$-independent and $\mathrm{POS}_{P'}(Q) = \mathrm{POS}_P(Q)$, then $P'$ is a relative reduction of $P$, and the collection of all $Q$-reductions is denoted $\mathrm{RED}_Q(P)$. Because $\mathrm{CORE}_Q(P) = \bigcap \mathrm{RED}_Q(P)$, the attribute core is the intersection of all reductions and cannot be deleted in the attribute reduction process of the information system. Thus, the core of the information system can be regarded as the core of the attribute reduction [30].
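The positive region and the attribute core described in this subsection can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the toy table, its attribute names, and the helper names `partition`, `positive_region`, and `core` are hypothetical and not taken from the paper.

```python
from collections import defaultdict

def partition(table, attrs):
    """Equivalence classes of objects that agree on every attribute in attrs."""
    classes = defaultdict(set)
    for obj, row in table.items():
        classes[tuple(row[a] for a in attrs)].add(obj)
    return list(classes.values())

def positive_region(table, cond, dec):
    """Objects whose condition class lies entirely inside one decision class."""
    dec_classes = partition(table, dec)
    pos = set()
    for block in partition(table, cond):
        if any(block <= d for d in dec_classes):
            pos |= block
    return pos

def core(table, cond, dec):
    """Attributes that are necessary: removing one shrinks the positive region."""
    full = positive_region(table, cond, dec)
    return [a for a in cond
            if positive_region(table, [x for x in cond if x != a], dec) != full]

# Hypothetical toy table in which rain depends only on humidity,
# while temp and wind carry the same (redundant) information.
table = {
    1: {"temp": "high", "humidity": "high", "wind": "weak", "rain": "yes"},
    2: {"temp": "high", "humidity": "low", "wind": "weak", "rain": "no"},
    3: {"temp": "low", "humidity": "high", "wind": "strong", "rain": "yes"},
    4: {"temp": "low", "humidity": "low", "wind": "strong", "rain": "no"},
}
cond, dec = ["temp", "humidity", "wind"], ["rain"]
necessary = core(table, cond, dec)
```

In this table `temp` and `wind` are each dispensable, whereas dropping `humidity` empties the positive region, so `humidity` is the only core attribute.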

##### 2.3. Attribute Dependence Degree

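Let the information system be $S = (U, A, V, f)$, where $R$ is an equivalence relation cluster on $U$ and $P, Q \subseteq R$; then the dependence degree of $Q$ on $P$ is defined as [31]

$$\gamma_P(Q) = \frac{|\mathrm{POS}_P(Q)|}{|U|},$$

where $|\cdot|$ denotes the cardinality of a set and $\mathrm{POS}_P(Q)$ denotes the positive region of $Q$ with respect to $P$ in the universe $U$.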
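As a worked illustration, the dependence degree is the fraction of objects whose condition values determine the decision value unambiguously. The sketch below assumes a table given as a list of dictionaries; the function name `dependency` and the sample rows are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

def dependency(rows, cond, dec):
    """Fraction of objects lying in the positive region of dec w.r.t. cond.

    An object belongs to the positive region exactly when every object that
    shares its condition values also shares its decision values.
    """
    groups = defaultdict(list)
    for row in rows:
        cond_key = tuple(row[a] for a in cond)
        groups[cond_key].append(tuple(row[d] for d in dec))
    consistent = sum(len(g) for g in groups.values() if len(set(g)) == 1)
    return Fraction(consistent, len(rows))

# Hypothetical sample rows.
rows = [
    {"temp": "high", "humidity": "high", "rain": "yes"},
    {"temp": "high", "humidity": "low", "rain": "no"},
    {"temp": "low", "humidity": "high", "rain": "yes"},
    {"temp": "low", "humidity": "high", "rain": "yes"},
]

gamma_temp = dependency(rows, ["temp"], ["rain"])
gamma_humidity = dependency(rows, ["humidity"], ["rain"])
```

Here `temp` alone classifies only the two "low" rows consistently, giving a dependence degree of one half, while `humidity` alone classifies every row, giving a dependence degree of one.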

#### 3. Coevolutionary Reduction in the Adaptive Genetic Attribute

The population size during evolution is limited. After several iterations, the population is composed mainly of individuals with higher fitness values. At this point, the diversity of the population is low, which makes the selection and crossover operators lose their primary roles. During evolution, different operators have different effects on population diversity: the selection operator embodies the “survival of the fittest” but reduces the diversity of the population, the crossover operator maintains the diversity of the population, and the mutation operator increases it [32].

##### 3.1. Adaptive Genetic Algorithm

###### 3.1.1. Fitness Function

The fitness function measures an individual’s ability to adapt to its environment. It is the key step in judging how good or poor an individual is, and it is also the link that combines the genetic algorithm with rough set attribute reduction. The purpose of attribute reduction is to remove as many redundant attributes as possible to obtain an optimal solution. Therefore, the fitness function should meet two requirements: strong classification ability and the deletion of as many redundant attributes as possible. For this reason, the attribute dependency of the individual and the number of conditional attributes it selects are introduced as parameters into the fitness function. The fitness function (4) is computed from the total number of conditional attributes, the number of genes whose value is “1” in the individual, the attribute dependency of the decision attribute on the conditional attributes whose gene value is “1” in the individual, and an adjustment parameter.
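To make the two requirements concrete, the following sketch combines a reward for short attribute subsets with a weighted dependency term. The weighted-sum form is an assumption chosen for illustration and is not necessarily the paper's equation (4); the names `fitness`, `dependency`, and `lam` are hypothetical.

```python
def fitness(chromosome, dependency, lam=0.5):
    """Assumed fitness: fewer selected attributes and higher dependency are better.

    chromosome : list of 0/1 genes, one per conditional attribute
    dependency : callable mapping the selected attribute indices to a value
                 in [0, 1] (for example, the rough set dependence degree)
    lam        : adjustment parameter weighting classification ability
    """
    n = len(chromosome)
    ones = sum(chromosome)
    selected = [i for i, gene in enumerate(chromosome) if gene == 1]
    # First term rewards deleting attributes, second term rewards dependency.
    return (n - ones) / n + lam * dependency(selected)

# Hypothetical dependency: the decision is fully determined by attribute 1.
def toy_dependency(selected):
    return 1.0 if 1 in selected else 0.0

short_and_exact = fitness([0, 1, 0], toy_dependency, lam=2.0)
long_and_exact = fitness([1, 1, 1], toy_dependency, lam=2.0)
short_but_useless = fitness([1, 0, 0], toy_dependency, lam=2.0)
```

With a sufficiently large adjustment parameter, an individual that preserves the classification ability always outranks one that does not, and among those that preserve it the shorter subset wins.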

###### 3.1.2. Selection Operator

The selection operator is a key factor in the reduction of population diversity; it reflects the evolutionary direction of the survival of the fittest and determines the search performance of the algorithm [33]. The roulette strategy determines the selection probability according to the fitness of an individual. At the beginning of the iteration, individual differences are large and the diversity of the population is abundant, so this method represents the survival of the fittest well. However, as the iteration progresses, the differences among the individual fitness values of the population shrink, and the performance of the selection operator is greatly weakened. Therefore, the selection operator is improved in this paper as follows:

$$P(x_i) = \frac{f(x_i) - f_{\min}}{\sum_{j=1}^{N} \left( f(x_j) - f_{\min} \right)},$$

where $f_{\min}$ represents the minimum fitness value of the population, $f(x_i)$ represents the fitness value of the individual $x_i$ whose selection probability $P(x_i)$ is being computed, and $N$ is the population size.

After the individual fitness values of the population are calculated, the population is sorted in descending order of fitness. In the selection operation, the minimum fitness value in the current population is subtracted from each individual fitness value, and the roulette selection operation is then performed. After this subtraction, the relative differences between individuals increase, which enriches the population diversity and balances the selection pressure.
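The min-subtracted roulette selection described above can be sketched as follows. This is a minimal illustration rather than the authors' code; the function names and the uniform fallback used when all fitness values are equal are assumptions.

```python
import random

def selection_probabilities(fits):
    """Roulette probabilities after subtracting the population minimum."""
    f_min = min(fits)
    shifted = [f - f_min for f in fits]
    total = sum(shifted)
    if total == 0:
        # Assumption: when all individuals are equally fit, select uniformly.
        return [1 / len(fits)] * len(fits)
    return [s / total for s in shifted]

def roulette_select(population, fits, k, rng=random):
    """Draw k individuals with the shifted roulette probabilities."""
    probs = selection_probabilities(fits)
    return rng.choices(population, weights=probs, k=k)

fits = [1.0, 2.0, 4.0]
plain = [f / sum(fits) for f in fits]      # ordinary roulette wheel
shifted = selection_probabilities(fits)    # min-subtracted roulette wheel
picks = roulette_select(["a", "b", "c"], fits, k=5, rng=random.Random(0))
```

For the fitness values 1, 2, and 4, the ordinary wheel gives the best individual a share of 4/7, while the shifted wheel gives it 3/4. One consequence of this sketch is that the least fit individual receives probability zero, so a real implementation may want a small floor if that is undesirable.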

###### 3.1.3. Crossover and Mutation Operators

The traditional genetic algorithm uses a fixed crossover probability and mutation probability, which may lead to slow convergence or premature convergence. Premature convergence hinders the evolution of better individuals: the population tends to become static, with limited diversity, and the crossover and mutation operators become ineffective. The standard adaptive genetic algorithm judges an individual by comparing its fitness with the average fitness value of the population. When the fitness value is greater than the average, the individual is considered a good individual and is given a small probability of crossover and mutation. Premature convergence is caused by the proliferation of such individuals in the population. To prevent individuals whose fitness is only slightly above the average from multiplying and making the population homogeneous, this paper proposes using the average fitness of the individuals whose fitness exceeds the population average as the measure of individual merit. This standard increases the crossover and mutation probability of this part of the population during evolution and avoids its overproliferation. At the same time, from the point of view of the entire iterative process, because the diversity of the population is higher at the beginning of the iteration, the population has a greater probability of crossover and mutation. As the iteration proceeds, the population gradually starts to converge, and its crossover and mutation probabilities gradually decrease.
Based on this, the crossover and mutation probabilities are improved. The improved probabilities depend on the following quantities: the maximum fitness value; the average fitness value of the individuals whose fitness is greater than the population average; the generation number; the curvatures with which the crossover probability and the mutation probability change with the generation number; the convergence limits of the crossover probability and the mutation probability; and a control factor.
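The raised merit threshold and the generation-dependent decay can be sketched as below. Only the threshold (the mean fitness of the above-average individuals) follows the text directly; the exponential decay, the linear scaling above the threshold, and the parameter names `p_start`, `p_limit`, and `curvature` are assumptions made for illustration and are not the paper's formulas.

```python
import math

def elite_average(fits):
    """Mean fitness of the individuals that exceed the population mean."""
    avg = sum(fits) / len(fits)
    above = [f for f in fits if f > avg]
    return sum(above) / len(above) if above else avg

def adaptive_probability(f, fits, generation, p_start, p_limit, curvature):
    """Assumed adaptive crossover or mutation probability for fitness f."""
    # Assumed schedule: the ceiling decays from p_start toward p_limit.
    ceiling = p_limit + (p_start - p_limit) * math.exp(-curvature * generation)
    f_max = max(fits)
    f_ref = elite_average(fits)
    if f < f_ref or f_max == f_ref:
        # Below the raised threshold: keep the full, generation-dependent rate.
        return ceiling
    # At or above the threshold: shrink linearly to p_limit at f_max.
    return p_limit + (ceiling - p_limit) * (f_max - f) / (f_max - f_ref)

fits = [1.0, 2.0, 5.0, 6.0, 6.0]   # population mean is 4.0
threshold = elite_average(fits)     # mean of 5, 6, 6
p_slightly_good = adaptive_probability(5.0, fits, 0, 0.9, 0.5, 0.05)
p_best = adaptive_probability(6.0, fits, 0, 0.9, 0.5, 0.05)
p_late = adaptive_probability(5.0, fits, 100, 0.9, 0.5, 0.05)
```

The individual with fitness 5 is above the population mean of 4 but below the raised threshold, so it keeps the full probability instead of being protected, which is the behaviour the text uses to limit overproliferation.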

###### 3.1.4. Population Diversity

Population diversity is a prerequisite for the evolution of genetic algorithms and directly affects the performance of the algorithm. For a binary-coded population with a given population size and total gene length, the population diversity measure is defined in terms of the numbers of 1 and 0 values found at each locus across all individuals in the population; it indicates how the gene values 0 and 1 are distributed in the population.
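One simple way to turn the per-locus counts of 1 and 0 into a diversity score is sketched below. The specific formula (one minus the normalized imbalance, averaged over loci) is an assumption for illustration and may differ from the measure used in the paper; the function name `diversity` is hypothetical.

```python
def diversity(population):
    """Assumed diversity measure in [0, 1] for a binary-coded population.

    For each locus, count how many individuals carry 1 and how many carry 0.
    A locus contributes 1 when the two counts are balanced and 0 when every
    individual carries the same value. The result is the mean over loci.
    """
    size = len(population)
    length = len(population[0])
    total = 0.0
    for locus in range(length):
        ones = sum(individual[locus] for individual in population)
        zeros = size - ones
        total += 1 - abs(ones - zeros) / size
    return total / length

converged = diversity([[1, 0], [1, 0]])
balanced = diversity([[1, 0], [0, 1]])
mixed = diversity([[1, 1], [1, 0]])
```

A fully converged population scores 0, a population that is evenly split at every locus scores 1, and partially converged populations fall in between.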

###### 3.1.5. Elite-Assisted Crossover Operation

Inspired by [34], this paper designs a new crossover operation assisted by elite individuals. Reference [34] crosses every crossover individual with the same elite individual. This quickly leads the population toward the elite individual, but it also greatly reduces population diversity. The crossover operation in this paper instead randomly selects an elite individual from the elite pool and completes the crossover with the crossover individual, which prevents a single elite individual from guiding the whole population and reducing its diversity.
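The idea of drawing a different elite partner for each crossover can be sketched as follows. The single-point crossover used here is an assumption, since the text does not fix the crossover form; the function name `elite_assisted_crossover` and the sample data are hypothetical.

```python
import random

def elite_assisted_crossover(individual, elite_pool, rng=random):
    """Cross an individual with an elite chosen at random from the pool.

    Assumed operator: single-point crossover that keeps the head of the
    individual and takes the tail from the selected elite.
    """
    elite = rng.choice(elite_pool)
    # The cut point is chosen so that both parents contribute at least one gene.
    point = rng.randrange(1, len(individual))
    return individual[:point] + elite[point:]

rng = random.Random(1)
individual = [0, 0, 0, 0, 0, 0]
elite_pool = [[1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1]]
child = elite_assisted_crossover(individual, elite_pool, rng=rng)
```

Because the partner is redrawn on every call, different individuals in the same generation are pulled toward different elites rather than toward a single one.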

##### 3.2. Cooperative Evolution of the Adaptive Genetic Attribute Reduction Algorithm

The attribute dependence degree measures the importance of the condition attributes in the information system to the decision attribute and thus provides a standard for evaluating the conditional attributes. As a self-organized global optimization search algorithm, the genetic algorithm improves the convergence speed and optimization efficiency of the algorithm. The fitness function is used to connect attribute reduction with the genetic algorithm. The number of attributes and the attribute dependency are introduced into the function, which embodies the concept of attribute reduction in the rough set. The algorithm for the interval value attribute reduction based on the genetic algorithm is shown in Algorithm 1.