Abstract

Complex systems with edge computing require a huge amount of multifeature data to extract appropriate insights for their decision making, so it is important to find a feasible feature selection method to improve computational efficiency and reduce resource consumption. In this paper, a quantum-based feature selection algorithm for the multiclassification problem, namely, QReliefF, is proposed, which can effectively reduce the complexity of the algorithm and improve its computational efficiency. First, all features of each sample are encoded into a quantum state by performing the CMP and Ry operations, and then amplitude estimation is applied to calculate the similarity between any two quantum states (i.e., two samples). According to the similarities, the Grover–Long method is utilized to find the nearest k neighbor samples, and then the weight vector is updated. After a certain number of iterations of the above process, the desired features can be selected according to the final weight vector and the threshold τ. Compared with the classical ReliefF algorithm, our algorithm reduces the complexity of similarity calculation from O(MN) to O(M), the complexity of finding the nearest neighbor from O(M) to O(√M), and resource consumption from O(MN) to O(MlogN). Meanwhile, compared with the quantum Relief algorithm, our algorithm is superior in finding the nearest neighbor, reducing the complexity from O(M) to O(√M). Finally, in order to verify the feasibility of our algorithm, a simulation experiment with a simple example is performed on Rigetti.

1. Introduction

Complex systems [1] are nonlinear systems composed of agents that can act with local environmental information, and they require big data to extract appropriate insights for their decision making. In cloud computing [2–5], the data transmission delay between the data sources and the cloud centers is problematic for many complex systems whose responses are usually required to be time critical or real time. Instead, a recently emerging computation paradigm, edge computing [6–9], is promising to cater for these requirements, since edge computing resources are deployed close to the data sources and thus support time-critical or real-time data processing and analysis. As we all know, the computing and storage resources of most intelligent terminals are very limited, which places higher requirements on the computing performance of algorithms, especially machine learning algorithms, in complex systems with edge computing.

Machine learning [10, 11] continuously improves its performance through “experience,” where experience generally originates from massive data. At present, many machine learning algorithms based on massive data [12–14] have been proposed. In practical scenarios, the amount of data available for training is getting larger and larger, while the features of the data are becoming more and more abundant. Data with redundant or unrelated features cause the problem of the “curse of dimensionality” [15], which greatly increases the computational complexity of the algorithm. One possible solution is dimension reduction [16], and another is feature selection [17].

The Relief algorithm [18] is a well-known feature selection algorithm for the two-classification problem. It is widely used because of its excellent classification effect. However, the limitation of this algorithm is that it can only perform binary classification, and its efficiency is greatly affected when the data size and feature size increase. To extend the applicability of the algorithm, Kononenko [19] proposed a new feature selection algorithm for the multiclassification problem, namely, the ReliefF algorithm. It has the advantages of a simple principle, convenient implementation, and good results and has been widely applied in various fields [20–22].

On the other hand, since Benioff [23] and Feynman [24] explored the theoretical possibilities of quantum computing, some excellent results have been proposed one after another. For instance, Shor’s algorithm [25] solves the problem of integer factorization in polynomial time, and Grover’s algorithm [26] provides a quadratic speedup for searching an unstructured database. These excellent results have prompted people to think about how to apply this computing power to machine learning algorithms. Thus, a new research hotspot, quantum machine learning [27–32], has gradually formed. Although quantum technology provides a certain improvement in storage and computing power, the “curse of dimensionality” problem still exists in quantum machine learning. Therefore, quantum-based dimensionality reduction methods still have important research value. In 2018, Liu et al. [33] proposed a quantum Relief algorithm (namely, the QRelief algorithm) for the two-classification problem, which reduces the complexity of similarity calculation from O(MN) to O(M).

As we know, in the application scenarios of edge computing, there are various multiclassification problems based on distributed, massive, and large-feature data. The objective of this study is to design a feasible feature selection method which can effectively get rid of redundant or unrelated features in machine learning, reduce the computation load of intelligent terminals, and thus meet the requirement of real-time data processing and analysis in edge computing. In this paper, we introduce some quantum technologies (such as the CMP operation, amplitude estimation, and the Grover–Long method) and propose a quantum-based feature selection algorithm, namely, the QReliefF algorithm, for the multiclassification problem.

The main contributions of our work are as follows:
(1) A quantum method is proposed to solve the problem of feature selection for the multiclassification problem in complex systems with edge computing. The proposed method fully demonstrates the quantum parallel processing capabilities that classical computing cannot match and significantly reduces the computational complexity of the algorithm.
(2) The problem of finding nearest neighbor samples is first transformed into the similarity calculation of two quantum states (i.e., calculating their inner product) in quantum computing, and the Grover–Long method is utilized to speed up the search for the targets.
(3) A simulation experiment based on Rigetti is performed to verify the feasibility of our algorithm.

The outline of this paper is as follows. The classic ReliefF algorithm is briefly reviewed in Section 2, and the proposed quantum ReliefF algorithm is described in detail in Section 3. Then, we illustrate the process of the algorithm with a simple example in Section 4 and perform the simulation experiment on Rigetti in Section 5. Subsequently, the efficiency of the algorithm is analyzed in Section 6, and a brief conclusion and discussion are given in the last section.

2. Review of ReliefF Algorithm

The ReliefF algorithm [19] is a feature selection algorithm for handling the multiclassification problem. Before introducing our proposed quantum algorithm, let us review its detailed process.

Without loss of generality, suppose there are M samples, each with N features, and they can be divided into P classes C1, C2, …, CP, where each sample is an N-feature vector that belongs to exactly one Class Cp. The weight vector of the N features is initialized to all zeros, the upper limit of iterations is T, and the relevance threshold (which differentiates relevant from irrelevant features) is τ (0 ≤ τ ≤ 1). The main steps of the ReliefF algorithm are as follows (its pseudocode can be seen in Algorithm 1).

(1)Init WT = (0, …, 0)T
(2)for t = 1 to T do
(3)Pick a sample u randomly
(4)Find k nearest neighbor samples Hj from the same class as sample u
(5)for each Class Cp ≠ class(u) do
(6)  Find k nearest neighbor samples Mj(Cp) from Class Cp
(7)end
(8)for i = 1 to N do
(9)  Update W[i] using the weight update formula below
(10)end
(11)end
(12)Select the most relevant features according to WT and τ

At each iteration, ReliefF randomly selects a sample u and then searches for its k nearest neighbor samples in each class by the cosine distance. The closest same-class samples are called Hj, and the closest samples from a different Class Cp are called Mj(Cp), where j = 1, 2, …, k. The weight of the i-th feature is updated as

W[i] = W[i] − Σ_{j=1}^{k} diff(i, u, Hj)/(T·k) + Σ_{Cp ≠ class(u)} [p(Cp)/(1 − p(class(u)))] · Σ_{j=1}^{k} diff(i, u, Mj(Cp))/(T·k),

where p(Cp) represents the probability of randomly extracting a sample of Class Cp, and the function diff(i, x, y) measures the difference between the i-th feature values of two samples, e.g., diff(i, x, y) = |xi − yi|/(maxi − mini) for numerical features.

After iterating T times, the final weight vector is obtained. Through the relevance threshold τ, we can retain relevant features and discard irrelevant features.
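For concreteness, the following minimal NumPy sketch mirrors the classical procedure above; it assumes numeric features already scaled to [0, 1], uses the cosine distance mentioned above for the neighbor search, and takes diff as the absolute feature difference. The function and variable names are illustrative only.

import numpy as np

def cosine_distance(a, b):
    # 1 - cosine similarity; smaller values mean more similar samples
    return 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def relieff_weights(X, y, k=1, T=10, seed=0):
    # X: (M, N) array of samples scaled to [0, 1]; y: length-M array of class labels
    rng = np.random.default_rng(seed)
    M, N = X.shape
    classes, counts = np.unique(y, return_counts=True)
    prior = dict(zip(classes, counts / M))            # p(C_p)
    W = np.zeros(N)
    for _ in range(T):
        u_idx = rng.integers(M)
        u, cu = X[u_idx], y[u_idx]
        # k nearest hits H_j from the same class (excluding u itself)
        same = [i for i in range(M) if y[i] == cu and i != u_idx]
        hits = sorted(same, key=lambda i: cosine_distance(u, X[i]))[:k]
        # k nearest misses M_j(C_p) from every other class
        misses = {}
        for c in classes:
            if c == cu:
                continue
            other = [i for i in range(M) if y[i] == c]
            misses[c] = sorted(other, key=lambda i: cosine_distance(u, X[i]))[:k]
        for i in range(N):
            W[i] -= sum(abs(u[i] - X[h, i]) for h in hits) / (T * k)
            for c, mj in misses.items():
                scale = prior[c] / (1.0 - prior[cu])
                W[i] += scale * sum(abs(u[i] - X[m, i]) for m in mj) / (T * k)
    return W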

The ReliefF algorithm is an extension of the Relief algorithm from the two-classification problem to the multiclassification scenario. However, as the number of categories, samples, and sample features increases, the ReliefF algorithm also faces the “curse of dimensionality” problem, and its speed drops sharply. Therefore, how to improve the efficiency of the ReliefF algorithm becomes an urgent problem to be solved.

3. The Proposed QReliefF Algorithm

In order to implement feature selection for the multiclassification problem in complex systems with edge computing, a feasible quantum ReliefF algorithm is introduced in this section. Suppose the sample sets (where p = 1, 2, …, P indexes the classes), the weight vector WT, the upper limit T, and the relevance threshold τ are the same as those defined for the classical ReliefF algorithm in Section 2. Different from the classical algorithm, all the features of each sample are represented as a quantum superposition state, so the problem of finding nearest neighbor samples is transformed into the similarity calculation of two quantum states (i.e., calculating their inner product), and the similarity between any two samples can be calculated in parallel in a quantum-mechanical way. Algorithm 2 describes the process of our algorithm in detail, and the specific steps are as follows.

(1)Init WT = (0, …, 0)T
(2)Normalize the sample sets
(3)Prepare quantum states for all samples by performing the CMP and Ry operations
(4)for t = 1 to T do
(5)Select a state |ϕ〉 randomly, which corresponds to the sample u
(6)Perform a swap operation on |ϕ〉 and obtain |φ〉
(7)Encode the similarity information into a quantum state through the swap test, inner product, and amplitude estimation operations
(8)Obtain the nearest k samples in each class by the Grover–Long method
(9)for i = 1 to N do
(10)  Update W[i] according to equation (24)
(11)end
(12)end
(13)Compute the average weight vector by dividing WT by T
(14)for i = 1 to N do
(15)if the i-th element of the average weight vector is greater than τ then
(16)  The i-th feature is relevant
(17)else
(18)  The i-th feature is not relevant
(19)end
(20)end
3.1. State Preparation

In order to store classical information in quantum states, we need to normalize the sample sets so that each sample has unit norm.

Obviously, each normalized feature value is a real number with magnitude at most 1. Then, we prepare the initial quantum states, where each state corresponds to the q-th sample belonging to Class Cp and encodes the i-th eigenvalue (feature value) of that sample. Assume our initial state is the all-zero state; the construction scheme of the quantum state includes the following steps.

First, we perform Hadamard and CMP operations on |0〉n and obtain a new state; its circuit diagram is shown in Figure 1, where the CMP operation is defined as follows:

The function of the CMP operation is to cut off the components of the quantum state whose index is larger than N, and its circuit diagram is shown in Figure 2. |i〉 and |N〉 each represent a single qubit, and the CMP operation is implemented by repeating this circuit n times. After measurement, if the lowest register is |1〉, it means that i > N.

Next, we perform the unitary rotation operation Ry, defined in equation (9), on the last qubit to obtain our target quantum state.
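As a concrete illustration, a small NumPy sketch of the state that this preparation is assumed to produce, namely (1/√N) Σ_i |i〉(√(1 − v_i²)|0〉 + v_i|1〉) for a normalized sample v, is given below; the helper name encode_sample is ours.

import numpy as np

def encode_sample(v):
    # v: normalized feature vector with every entry of magnitude at most 1
    # Returns the (n + 1)-qubit amplitude vector; the last qubit carries v_i
    N = len(v)
    n = int(np.ceil(np.log2(N)))
    state = np.zeros(2 ** (n + 1))
    for i, vi in enumerate(v):                           # indices i >= N stay at amplitude 0 (the CMP "cut")
        state[2 * i] = np.sqrt(1.0 - vi ** 2) / np.sqrt(N)   # component |i>|0>
        state[2 * i + 1] = vi / np.sqrt(N)                   # component |i>|1>
    return state

# e.g. encode_sample([1 / np.sqrt(2), 0, 0, 1 / np.sqrt(2), 0, 0]) gives a 16-amplitude unit vector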

3.2. Similarity Calculation

After the state preparation, the information of the samples is encoded into quantum superposition states. In this paper, we use the cosine distance to define the similarity between the random sample ū and any other sample:

Referring to equations (4) and (5), the norms of ū and the other sample are 1, and equation (11) can be simplified as follows:

First, |ϕ〉 (i.e., the sample ū) is randomly selected from the prepared states; suppose it is the l-th sample in its class, as shown in the following equation:

Then, a swap operation is performed on |ϕ〉 to get |φ〉:

Next, a swap test (its circuit is given in Figure 3) is performed on |φ〉 and the prepared state of the other sample, and we obtain

From equation (15), we know that the probability of the measurement result being |1〉 is

In addition, the inner product between |φ〉 and the prepared state of the other sample can be calculated as follows:

Combining equation (16) with equation (17), we can get the similarity between the sample ū and the other sample:

Since N is a constant and the quantity obtained from equations (16) and (17) is the cosine of the angle between the random sample ū and the other sample, a smaller measurement probability of |1〉 corresponds to a larger cosine value and thus a smaller cosine distance, which indicates that the two samples are more similar.
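To make the relation between the measured probability and the similarity explicit, the short sketch below evaluates the textbook swap-test identity P(|1〉) = (1 − |〈φ|ψ〉|²)/2 for two normalized real state vectors and inverts it to recover the overlap magnitude; in the algorithm this overlap is then related to the cosine distance through equations (16) and (17). The helper names are ours.

import numpy as np

def swap_test_p1(phi, psi):
    # Probability of measuring |1> on the swap-test ancilla
    overlap = np.dot(phi, psi)
    return 0.5 * (1.0 - overlap ** 2)

def overlap_from_p1(p1):
    # Invert P(1) = (1 - |<phi|psi>|^2) / 2 to get the overlap magnitude
    return np.sqrt(max(0.0, 1.0 - 2.0 * p1))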

Then, we can rewrite equation (15) as follows:

3.3. Finding the Nearest Neighbor Samples

First, the quantum amplitude estimation method [34] is applied to store the similarity between the sample ū and each of the other samples in the last qubit; the corresponding quantum circuit diagram is given in Figure 4.

In the Grover–Long method [35], one iteration can be divided into four operations, i.e., G = −WI0W−1O, and its quantum circuit is shown in Figure 5. O is an oracle operation which performs a phase rotation on the targets, where the rotated phase appears at the positions of the target states in the diagonal matrix. Two cases arise depending on whether the target position is odd or even: if the position is odd, the u1(ϕ) operation (defined in equation (22)) is applied to the lowest qubit; if it is even, the sequence X, u1(ϕ), X is applied to the lowest qubit.

Besides, I0 is a conditional phase shift operation which performs the same phase rotation on |0〉, where the phase ϕ is determined by the phase-matching condition and J represents the number of iterations.
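A compact numerical sketch of one such iteration on an n-qubit register is given below; with ϕ = π it reduces to the standard Grover iteration, and the Grover–Long method instead picks ϕ from the phase-matching condition so that J iterations hit the targets with certainty. The function name and the example at the end are ours.

import numpy as np

def grover_long_iteration(n, marked, phi=np.pi):
    # Build one iteration G = -W * I0 * W^{-1} * O on an n-qubit register.
    # O puts the phase e^{i*phi} on the marked basis states and I0 puts the
    # same phase on |0...0>; phi = pi recovers the standard Grover iteration.
    dim = 2 ** n
    H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
    W = H
    for _ in range(n - 1):
        W = np.kron(W, H)                      # W = H applied to every qubit
    O = np.eye(dim, dtype=complex)
    for m in marked:
        O[m, m] = np.exp(1j * phi)             # phase rotation on the targets
    I0 = np.eye(dim, dtype=complex)
    I0[0, 0] = np.exp(1j * phi)                # phase rotation on |0...0>
    return -W @ I0 @ W.conj().T @ O

# Example: amplify |5> in a 3-qubit uniform superposition with phi = pi
state = np.full(8, 1 / np.sqrt(8), dtype=complex)
for _ in range(2):
    state = grover_long_iteration(3, [5]) @ state
print(abs(state[5]) ** 2)                      # about 0.94 after two iterations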

Having obtained the state |βp〉 (see equation (20)) through the amplitude estimation, we introduce a quantum minimum search algorithm [37] to find the k nearest neighbor samples from Class Cp with a time complexity of O(√(kM)), and its quantum circuit is shown in Figure 6.

Suppose the set {K1, K2, …, Kk} represents the k nearest neighbor samples; we need to prepare auxiliary qubits to store them. As shown in Figure 6, the operator Ws represents Ws|βp〉|0〉⋯|0〉 = |βp〉|K1〉|K2〉⋯|Kk〉, and u1(ϕ) is the operator defined in equation (22). Let d0 be the current threshold similarity; we can mark a candidate Kx, x ∈ [1, k], when its similarity is better than d0, and d0 is replaced by the similarity of the marked Kx after one iteration. We repeat the above steps until all samples in Class Cp have been compared. Finally, the indices of the k nearest neighbor samples in Class Cp can be obtained according to their similarities.
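Classically, the search that this subsection speeds up amounts to repeatedly updating a threshold with the best similarity seen so far; the sketch below does this with a linear scan, whereas the quantum version replaces each scan with a Grover–Long search over the indices whose similarity beats the current threshold. The helper name is ours.

import numpy as np

def k_nearest_by_similarity(similarities, k):
    # Returns the indices of the k samples with the largest similarity values
    sims = np.asarray(similarities, dtype=float)
    selected = []
    for _ in range(min(k, len(sims))):
        best = None                       # plays the role of the threshold d0
        for idx in range(len(sims)):
            if idx in selected:
                continue
            if best is None or sims[idx] > sims[best]:
                best = idx                # threshold update: d0 <- sims[idx]
        selected.append(best)
    return selected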

3.4. Updating Weight Vector

After the above steps, we obtain the nearest neighbor samples (i.e., Hj and Mj(Cp)) of the random sample ū. Then, we update the weight vector according to the updating formula in equation (24), where i ∈ [1, N].

3.5. Feature Selection

After iterating the above steps (similarity calculation, finding the nearest neighbor samples, and updating the weight vector) T times, the algorithm’s loop terminates and we obtain the final weight vector WT. The average weight vector is then obtained by dividing WT by T.

Then, we perform the feature selection based on the final average weight vector and the threshold τ. Here, τ is chosen to retain relevant features and discard irrelevant features [38]; that is to say, those features whose weight is greater than τ will be selected, and those whose weight is less than τ will be discarded. The value of τ is determined with regard to the user’s requirements and the characteristics of the problem itself (e.g., the distribution of samples and the number of features).
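The final selection step is straightforward; a minimal sketch (assuming, as described above, that the accumulated weight vector is averaged over the T iterations before thresholding) is given below, with the function name ours.

def select_features(W_T, T, tau):
    # Average the accumulated weights and keep the features above the threshold tau
    W_avg = [w / T for w in W_T]
    return [i for i, w in enumerate(W_avg) if w > tau]

# e.g. select_features([4, 4, 4, -2, 0, -2], T=4, tau=0.5) returns [0, 1, 2]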

4. Example

Suppose that there are six samples (see Table 1): S0 = (1, 0, 0, 1, 0, 0), S1 = (1, 0, 0, 0, 1, 0), S2 = (0, 1, 0, 0, 0, 1), S3 = (0, 1, 0, 1, 0, 0), S4 = (0, 0, 1, 0, 1, 0), and S5 = (0, 0, 1, 0, 0, 1); thus n is 3, and they belong to three classes: A = {S0, S1}, B = {S2, S3}, and C = {S4, S5}, as illustrated in Figure 7.

First, the six initial quantum states are prepared as follows:

Next, we take one of these states as an example, and then the H⊗3 operation is applied on the third and fourth qubits:

Then, we perform Ry rotation (see equation (9)) on the last qubit, and we can get

The other quantum states are prepared in the same way and they are listed as follows:
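As a numerical cross-check of this preparation, the encoding assumed in Section 3.1 can be evaluated directly for S0; the snippet below assumes each sample is first divided by its Euclidean norm, so S0 becomes (1/√2, 0, 0, 1/√2, 0, 0), and only the amplitudes with index i < 6 can be nonzero.

import numpy as np

# Normalized example sample S0 and its amplitude-encoded state (Section 3.1 sketch)
s0 = np.array([1, 0, 0, 1, 0, 0], dtype=float)
v = s0 / np.linalg.norm(s0)                      # (1/sqrt(2), 0, 0, 1/sqrt(2), 0, 0)
N = len(v)
state = np.zeros(16)                             # n = 3 index qubits + 1 feature qubit
for i, vi in enumerate(v):
    state[2 * i] = np.sqrt(1.0 - vi ** 2) / np.sqrt(N)
    state[2 * i + 1] = vi / np.sqrt(N)
print(np.round(state, 3))                        # only the entries for i < 6 are nonzero
print(np.isclose(np.linalg.norm(state), 1.0))    # True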

Second, we randomly select a sample (assume S0 is ū) and perform the similarity calculation with the other samples (i.e., S1, S2, S3, S4, and S5). Next, we take S0 and S1 as an example and perform a swap operation between the last two qubits of the state of S0:

After that, the swap test operation is applied on |φ〉 and the state of S1:

In this way, we obtain a quantum state that encodes the similarity in its amplitude:

Then, through the amplitude estimation, we can obtain the quantum states:

Next, we perform an oracle operation on the quantum states obtained in the above steps to obtain the k nearest neighbor samples.

5. Simulation Experiment

Quantum Cloud Services (QCS™) is Rigetti’s quantum-first cloud computing platform. At the end of 2017, a 19-qubit processor named “Acorn” was launched, which can be used in QCS through a quantum programming toolkit named Forest [39]. The “Acorn” chip is made of 20 superconducting qubits, but for technical reasons, qubit 3 is offline and cannot interact with its neighbors, so the chip is treated as a 19-qubit device whose coupling map is shown in Figure 8.

In order to obtain the result and also verify our algorithm, we choose Rigetti to perform the quantum processing. However, since the Rigetti platform limits the length of the entire circuit and noise has a great influence on the preparation of quantum states [40], we only show one of the ideal experimental circuits for the similarity calculation in the QReliefF algorithm running on the Rigetti platform. Figure 9 gives the schematic diagram of our experimental circuit, and the corresponding Rigetti code is shown in Algorithm 3. After running Algorithm 3 eight times, the result can be seen in Figure 10: we obtain |1〉 with an average probability of 0.435125. In this way, the characteristic information of the sample is successfully stored in the amplitude of the quantum state; according to equation (32), we can then recover the similarity value and extract the amplitude information into a quantum state through the phase estimation algorithm.

# Imports needed to run this program with pyQuil (Forest SDK)
import numpy as np
from pyquil import Program, get_qc
from pyquil.gates import X, H, CCNOT, SWAP, CSWAP
from pyquil.quilatom import Parameter, quil_sqrt
from pyquil.quilbase import DefGate

# Define the new parametric gate from a matrix
theta = Parameter("theta")
cry = np.array([
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, quil_sqrt(1 - theta * theta), -theta],
    [0, 0, theta, quil_sqrt(1 - theta * theta)]
])
gate_definition = DefGate("CRY", cry, [theta])
CRY = gate_definition.get_constructor()

# Create our program and use the new parametric gate
p = Program(
    gate_definition, X(1), H(2), H(4), H(5), X(2), X(5),
    CCNOT(2, 5, 18), X(2), X(5), CRY(1)(18, 0),
    SWAP(0, 1), X(10), H(11), H(12), H(13), X(14),
    X(11), X(12), CCNOT(11, 12, 17), X(11), X(12),
    CRY(1)(17, 9), H(19), CSWAP(19, 0, 9),
    CSWAP(19, 1, 10), CSWAP(19, 2, 11),
    CSWAP(19, 4, 12), CSWAP(19, 5, 13),
    CSWAP(19, 6, 14), CSWAP(19, 7, 15),
    CSWAP(19, 8, 16), H(19)
)
# Print the circuit
print(p)
# Get a QPU; "20q-Acorn" is just a string naming the device
qc = get_qc("20q-Acorn")
# Run and measure
result = qc.run_and_measure(p, trials=1024)

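As a post-processing sketch, the ancilla counts returned by the program above can be turned into an overlap estimate, assuming the textbook swap-test relation P(|1〉) = (1 − |〈φ|ψ〉|²)/2 and the Forest 2.x behavior of run_and_measure returning a dictionary keyed by qubit index; the helper name is ours.

import numpy as np

def overlap_from_counts(result, ancilla=19):
    # result: dict {qubit: array of 0/1 outcomes} from qc.run_and_measure
    bits = np.asarray(result[ancilla])
    p1 = bits.mean()                                  # estimated P(|1>) on the ancilla
    return np.sqrt(max(0.0, 1.0 - 2.0 * p1))          # estimate of |<phi|psi>|

# e.g. overlap_from_counts(result) with the result obtained above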
After all the steps have been performed, we obtain the nearest neighbor samples of the random sample S0 (ū) in each class, namely, S1 (H), S2 (M(B)), and S5 (M(C)), as shown in Figure 11. Then, the weight vector is updated according to equation (24), and the result of WT after the first iteration is listed in the second row of Table 2. The algorithm iterates T times (in our example, T = 4) as described above and obtains all the WT results shown in Table 2. After T iterations, WT = [4, 4, 4, −2, 0, −2], and the average weight vector is [1, 1, 1, −0.5, 0, −0.5]. In this example, the value of τ is assumed to be 0.5 according to the updated results of WT in Table 2. Since the threshold τ = 0.5, the selected features are F0, F1, and F2, i.e., the first, second, and third features. The result of this quantum feature selection is consistent with that of the classical ReliefF algorithm, as verified with Python.

In the final weight comparison, considering the large amount of data and the corresponding feature values in a complex system, the computation required to compare the weights after the final result is obtained is also large. In order to meet the requirements of big data and result accuracy, we adopt an optimized quantum maximum and minimum value search algorithm [37] when comparing weights in the last step, which helps us quickly and accurately select the desired features and thus better solve the multiclassification problem in complex systems.

When the ratio between the number of solutions and the size of the searched space can be estimated exactly, this algorithm can raise the success probability to nearly 100%. Furthermore, it shows an advantage in complexity for large databases and in the operation complexity of constructing oracles.

6. Efficiency Analysis

In order to evaluate the efficiency of the QReliefF algorithm, three algorithms (i.e., the classical Relief, classical ReliefF, and quantum Relief algorithms) are selected for comparison with our algorithm in terms of three indicators: complexity of similarity calculation (CSC), complexity of finding the nearest neighbor (CFNN), and resource consumption (RC).

In the classic Relief algorithm, it takes O(N) time to calculate the distance between the randomly selected sample and any other sample, and finding the nearest neighbors requires scanning all M samples. This process is iterated T times, so CSC is O(TMN). Since T is a constant, CSC of the classic Relief algorithm is O(MN). As there are M samples in total, each with N features, CFNN is O(M), and RC of the classic Relief algorithm is O(MN) bits. The classical ReliefF algorithm is similar to the classical Relief algorithm. Since it finds k nearest neighbors in each iteration, the time complexity is O(kTMN), which can be simplified to O(MN) because k and T are constants. Besides, CFNN for finding the k nearest neighbors is O(M). In terms of resource consumption, there are M samples, each with N features, so RC of the classic ReliefF algorithm is O(MN) bits.

In the QRelief and QReliefF algorithms, quantum parallelism reduces the cost of each distance calculation from O(N) to O(1), so their CSCs are both O(TM); since T is constant, their CSCs can be simplified to O(M). CFNN of QRelief is O(kM), which can be simplified to O(M) as k is constant, while CFNN of QReliefF is O(√(kM)) because it uses the quantum Grover–Long method to find the k nearest neighbor samples, which provides a quadratic acceleration; since k is constant, CFNN of QReliefF is O(√M). The RCs of the similarity calculation, finding the nearest neighbor samples, and updating the weight vector are O(TMlogN), O(TN), and O(N), respectively. Therefore, the total resource consumption is O(TMlogN + TN + N); since T is constant, RC of QRelief and QReliefF is O(MlogN + N). For multifeature big data in complex systems with edge computing, M ≫ N, so MlogN ≫ N, and RC of QRelief and QReliefF can be simplified to O(MlogN).

For convenience, the efficiency comparison of the classic Relief algorithm, the ReliefF algorithm, the quantum Relief algorithm, and our algorithm in terms of CSC, CFNN, and RC is listed in Table 3. Obviously, our algorithm is superior to the classical algorithms (i.e., Relief and ReliefF) in terms of CSC, CFNN, and RC and better than the quantum algorithm (i.e., QRelief) in terms of CFNN.

7. Conclusion and Discussion

With the rapid development of edge computing technology and quantum machine learning algorithms, researchers have begun to pay attention to the combination and application of these two fields. In this paper, we use quantum technology to solve the feature selection problem for multiclassification in complex systems with edge computing and propose a quantum ReliefF algorithm. Compared to the classic ReliefF algorithm, our algorithm reduces the complexity of similarity calculation from O(MN) to O(M) and the complexity of finding the nearest neighbor from O(M) to O(√M). In addition, from the perspective of resource consumption, our algorithm consumes O(MlogN) qubits, while the classic ReliefF algorithm consumes O(MN) bits. Obviously, our algorithm is superior in terms of computational complexity and resource consumption.

It should be noted that our work aims to improve algorithm efficiency, while the privacy protection of sensitive data is not taken into account. At present, data security has become a focus of attention in the field of artificial intelligence, and some solutions for data privacy protection in complex systems with edge computing have been proposed [41–44]. In our future work, we will study how to improve the efficiency of quantum machine learning algorithms while ensuring the privacy protection of sensitive data, such as in [45–48].

Data Availability

The specific data items and corresponding experimental codes used to support the findings of this study are included within the article.

Disclosure

An earlier version of the manuscript has been presented in “2019 International Conference on IEEE Cyber, Physical and Social Computing (CPSCom).”

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The support of all the members of the quantum research group of NUIST and SEU is especially acknowledged; their professional discussions and advice have helped us a lot. This study was supported by the Natural Science Foundation of Jiangsu Province (grant no. BK20171458). This study was also supported in part by the Natural Science Foundation of China (grant nos. 61672290 and 61802002), the Natural Science Foundation of Jiangsu Higher Education Institutions of China (grant no. 19KJB520028), and the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD).