Abstract

In this paper, we propose a multiscale entanglement renormalization ansatz (MERA) feature extraction method based on a novel quantum convolutional neural network (QCNN) for binary scanning tunneling microscopy (STM) image classification. We design QCNN quantum circuits for state preparation, quantum convolution, and quantum pooling in the TensorFlow Quantum framework and compare the performance of a pure QCNN classifier with that of two hybrid quantum-classical QCNN models. Adversarial attacks are used as a type of interpretability method to evaluate the robustness of the QCNN models. The similarity between the pixel patterns of image bitplane slicing and the Ising phase transition opens up new ways of exploring how QCNN classifiers can enhance classification performance. The classification performance of the QCNN on different bitplanes also shows that the models can robustly resist adversarial attacks such as FGSM, CW, JSMA, and DEEPFOOL.

1. Introduction

In recent years, many researchers have focused on the interplay between machine learning and quantum physics and have investigated whether quantum technology can improve traditional machine learning algorithms such as supervised learning, principal component analysis, and other dimension reduction algorithms [1–4]. Cong et al. [5] proposed a quantum circuit-based convolutional neural network (CNN) that can accurately recognize quantum states. Kossaifi and Bulat [6] parameterized the global CNN with a single higher-order tensor. Tensor methods can parameterize network structure representations compactly: imposing a low-rank structure on the tensor regularizes the network, reduces the number of parameters, and yields higher accuracy and compression. Henderson et al. [7] proposed a quantum convolution layer with a number of random quantum circuits for feature extraction in image classification. Broughton and Verdon [8] proposed a software framework for quantum machine learning in which the quantum circuits for supervised classification are made up of a sequence of quantum gates. Quantum circuits play an important role in machine learning [9–12], and work on QCNNs is still ongoing. Henderson et al. [7] evaluated CNNs, QCNNs, and CNNs with additional nonlinearities on the MNIST dataset and showed that the QCNN model had better accuracy and faster convergence than the purely classical CNNs. Wei et al. [13] proposed a QCNN model for the recognition of handwritten numbers and simulated three types of image filtering: smoothing, sharpening, and edge detection. Their model could reduce the computational complexity compared with CNNs. Chen et al. [14] proposed a QCNN model for the classification of high energy physics events and demonstrated that it learns faster than CNNs. Li et al. [15] proposed a quantum-classical hybrid processing model inspired by variational quantum algorithms and verified its feasibility and validity against CNNs on the MNIST and GTSRB datasets.

However, a complex QCNN behaves as a black box whose internal mechanism is hard to explain, so verifying its robustness is crucial for a trustworthy QCNN. For a QCNN to be trusted, there must be a reliable explanation of why it makes certain predictions. Traditional robustness and explainability methods for classifiers [16, 17] do not reveal two intrinsic mechanisms: general image bitplane slicing has a phase transition pattern similar to that of the Ising model, and an adversarially attacked image shows only minor visual differences while its bitplanes show significant differences. Our scheme is explainable in that the accuracies of QCNN models with locally constructed features on some bitplanes reveal corresponding differences under different attacks and different levels of attack intensity. To address these issues in the noisy intermediate-scale quantum (NISQ) era, this paper proposes an explainable image classification scheme that resists adversarial attacks. First, we simulate adversarial attacks and inject the perturbations into the input data. Then, bitplane slicing and a feature extraction method are both applied to the novel QCNNs. Finally, the robustness of the classification performance is used to evaluate the model interpretation: some bitplanes give an over-approximation of robust accuracy, while other bitplanes give an under-approximation. Adversarial attacks [18–21] inject imperceptible perturbations into images and degrade the performance of deep image classifiers, which raises security concerns for image classification. Typical adversarial attacks include FGSM, JSMA, CW, and DEEPFOOL. Attackers can inject perturbations into specific objects, inject imperceptible noise into the background, or perturb the whole image; the strength of the attack depends on specific parameters. Adversarial attacks are also considered a type of interpretability method: the classification results of normal samples and adversarial samples can be analyzed and reasoned about through different features, which can help scientists design a more appropriate network structure. We apply adversarial attacks and different features to study the robustness of interpretations for our QCNNs. The contributions of this paper are as follows: (1) We propose a MERA feature extraction method on our newly designed quantum convolutional neural networks (QCNNs) for STM image classification. (2) We find that image bitplane slicing has a phase transition pattern similar to that of the Ising model and explore the correlation between this pattern and the classification performance enhancement achieved by QCNN classifiers. (3) For robustness and explainability, the classification performance of the QCNN on different bitplanes shows that the models can robustly resist adversarial attacks such as FGSM, CW, JSMA, and DEEPFOOL.

2. Local MERA Feature Construction

To improve the transparency of QCNNs, the proposed explanations combine locally constructed features with the global QCNN framework to enhance the understanding of the classifiers. Tensor networks, which have attracted recent interest in machine learning [22, 23], have long been a tool for the analysis of quantum many-body systems: they encode the coefficients of the state wave function and ensembles of microstates, and they are well suited to dimension reduction. Tensor networks can be interpreted as part of linear classifiers operating in exponentially high-dimensional spaces, which makes them useful in image analysis applications, and they can measure the scale of particles/pixels at different degrees of granulation. This granularity can be distilled and encoded into a global QCNN.

The multiscale entanglement renormalization ansatz (MERA) and discrete wavelet transformations have a similar multiscale representation. Tensor networks can be used for classifying physical states and simulating entangled, correlated systems. Such correlated states can be simulated with multilevel analysis to extract local features. Hallam and Grant [24] proposed a method for tensorizing neural networks by approximating scale-invariant quantum states. They employed MERA as a replacement for the fully connected layers in a convolutional neural network on the CIFAR datasets. Their method provides greater compression for the same level of accuracy and greater accuracy for the same level of compression. MERA is a powerful tool for studying phase transitions, critical phenomena, and strong coupling problems. In deep learning, deep neural networks have been observed to extract features layer by layer. Inspired by the fact that general image bitplane slicing has a phase transition pattern similar to that of the Ising model, we use granularity at different scales to explore the distinguishing ability of the generated features and to convert the coarse-grained Ising phase/state classification into fine-grained (pixel-level) image classification.

The scale-invariant MERA provides an efficient way to extract scaling operators. Unitary gates with reflection symmetry in MERA quantum circuits give a scale representation of the quantum many-body wave function, which is structurally similar to the mappings of convolutional networks, and MERA [25, 26] can encode correlations between different scales for data compression. Equation (1) is a unitary matrix represented as with some reflections.

Equation (2) is a unitary matrix with one parameter of reflection symmetric matrices .

It can also be denoted as follows:

The unitary circuit is formed by reflection symmetric matrices with the swap gate and a symmetric transform with dilation factor three. Figure 1 shows a scale layer unitary circuit with three rotation angles, as given below.

is a 3 × 3 matrix with parameter in a scale layer unitary circuit; its decomposition is

is direct sum of N matrices from (3),

The scale layer unitary circuit has three rotation-angle parameters. Right edge-centered, left edge-centered, and site-centered symmetric sequences are entered into this circuit. The multiscale circuit encodes the images, and its outputs are then selected to yield the ten output features. There is a connection between MERA quantum circuits and discrete wavelet transforms. We describe how MERA quantum circuits can be exploited to develop a new feature extraction method; the process is similar to extracting features from the wavelet transformation of a given image. With MERA circuits, we can distill features from an image and integrate them into the QCNN.
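The connection to wavelet transforms can be made concrete with a deliberately simplified sketch. It is not the three-angle scale layer circuit described above: it assumes a single planar rotation applied to each adjacent pair of samples (a direct sum of identical 2 × 2 blocks), which reduces to the Haar wavelet step when the angle is π/4. The function names, the choice of per-scale detail energy as the feature, and the sample signal are all illustrative assumptions.

```python
import numpy as np

def rotation(theta):
    """2x2 planar rotation; theta = pi/4 reproduces the Haar wavelet step."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, s], [-s, c]])

def scale_layer(signal, theta):
    """Apply the same rotation to every adjacent pair (direct sum of N blocks)."""
    pairs = signal.reshape(-1, 2)        # shape (N, 2)
    out = pairs @ rotation(theta).T      # rotate each pair
    return out[:, 0], out[:, 1]          # coarse part, detail part

def multiscale_features(signal, theta=np.pi / 4, levels=3):
    """Detail energy at each scale, followed by the final coarse energy."""
    coarse = np.asarray(signal, dtype=float)
    feats = []
    for _ in range(levels):
        coarse, detail = scale_layer(coarse, theta)
        feats.append(float(np.sum(detail ** 2)))
    feats.append(float(np.sum(coarse ** 2)))
    return np.array(feats)

# Hypothetical 1D signal whose length is a power of two.
x = np.array([4.0, 2.0, 5.0, 5.0, 1.0, 3.0, 0.0, 2.0])
features = multiscale_features(x)
```

Because each layer is orthogonal, the features sum to the total energy of the input, so the decomposition redistributes information across scales without losing any.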

3. Quantum Convolutional Neural Network

A quantum convolutional neural network (QCNN) [27] can recognize specific features of quantum states. It is important to study how local features combine with the global QCNN circuit structure and how each contributes to the other. Unlike previous work and recent advances, we do not parameterize the whole network with single tensors; instead, we study the ability to distill information from scale patterns as local correlation features, which are integrated into global QCNNs and spread through the entire unitary evolution system. Because many-body wave functions are structurally similar to the mappings of convolutional networks, we carry the classification of physical states/phases over to the learning of traditional image classification. It is critical to prepare the initial quantum states, since a more highly entangled state corresponds to a more widely separated weight function. With an entangled state, the QCNN would have more expressive power than its classical counterpart.

In a quantum system, an initial state where denotes a set of bases in the Hilbert space, . The QCNN applies the unitary transformation on it. A quantum circuit operates on quantum bits through quantum logic gates, which are the building blocks of quantum circuits. The gates are combined to form a global quantum circuit, and the whole quantum circuit corresponds to one large unitary matrix. It is critical to find a good set of parameters for the quantum circuit, as it is for the activation functions in a classical network. The information can be distilled through MERA-based local features according to different data distributions.

The QCNN architectures for the image classification task are illustrated in Figure 2. In this architecture, the first layer is the quantum cluster state preparation layer, which is shown in Figure 3. An H gate applied to a qubit indicates an excitation, and a CZ gate is applied to each pair of adjacent qubits to obtain a highly entangled state.
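The cluster state preparation can be illustrated with a small state-vector sketch in NumPy. It assumes an open linear chain of qubits (whether the circuit in Figure 3 also couples the last qubit back to the first is not stated here), and the function name and qubit ordering are illustrative choices.

```python
import numpy as np

def cluster_state(n_qubits):
    """State vector after H on every qubit and CZ on each adjacent pair."""
    dim = 2 ** n_qubits
    # H on every qubit of |0...0> gives the uniform superposition.
    state = np.full(dim, 1.0 / np.sqrt(dim))
    for basis in range(dim):
        bits = [(basis >> q) & 1 for q in range(n_qubits)]
        # Each CZ flips the sign when both neighbouring qubits are 1.
        n_flips = sum(bits[q] & bits[q + 1] for q in range(n_qubits - 1))
        if n_flips % 2:
            state[basis] = -state[basis]
    return state

psi = cluster_state(3)
```

All amplitudes have the same magnitude; the entanglement is carried entirely by the sign pattern that the CZ gates imprint.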

The second layer is the input layer, where the encoded MERA features are distilled as the rotation angles of single-qubit gates and the rotation angles of the two-qubit gates. The transformed feature parameters enter into the parameterized unitary circuit where is supposed to be tensor product of with with rotation angles . Equations (7)–(9) define the gates in the circuit; when the parameters enter the input layer, they determine the rotation angles around the x, y, and z axes of the Bloch sphere. The gradient of the QCNN is relatively smooth, so local MERA features are vital for adjusting the gradient and for exploring the correlation between the scale-dimension-reduced features and the gradient. The cluster state preparation layer and the input layer are added to the quantum circuit in that order.
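As an illustration of how feature values can act as rotation angles, the sketch below builds the standard single-qubit rotation matrices and combines one rotation triple per qubit with a tensor (Kronecker) product. The gate conventions, the ordering Rz·Ry·Rx, and the name `encode_features` are assumptions made for this example; the paper's own gate definitions are those of its Equations (7)–(9).

```python
import numpy as np

def rx(theta):
    """Rotation by theta around the x axis of the Bloch sphere."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -1j * s], [-1j * s, c]])

def ry(theta):
    """Rotation by theta around the y axis of the Bloch sphere."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def rz(theta):
    """Rotation by theta around the z axis of the Bloch sphere."""
    return np.array([[np.exp(-1j * theta / 2), 0],
                     [0, np.exp(1j * theta / 2)]])

def encode_features(angles):
    """Tensor product of per-qubit rotations; one (x, y, z) angle triple per qubit."""
    full = np.array([[1.0 + 0j]])
    for a, b, c in angles:
        full = np.kron(full, rz(c) @ ry(b) @ rx(a))
    return full

# Hypothetical normalized feature values for two qubits.
U = encode_features([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
```

The resulting matrix is unitary for any real angles, so arbitrary normalized features yield a valid circuit layer.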

One-qubit and two-qubit parameterized unitary matrices construct the convolution and pooling layers. The third layer is the quantum convolution layer. Figure 4 depicts and gates in quantum convolution layer that can be constructed by a cascade of two-qubit parameterized unitaries acting on pairs of adjacent qubits. The last layer is the quantum pooling layer. Figure 5 depicts and CNOT gates in quantum pooling layer. CNOT gates are used to control entanglement. Two arbitrary single-qubit unitaries form a parameterized pooling circuit that maps two qubits to one qubit. The quantum pooling layer removes half of the qubits through two-qubit pooling. The pooling layer outputs the qubits, where the label 1 is assigned to one state and −1 to the other state. The pooling layer is followed by the repeated measurement observable Z on state which is denoted as where . In the QCNN architecture, raw image pixels are not suitable features to enter into the quantum circuit for classification, so we use MERA features as the parameters of the QCNN.
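The readout step can be sketched as follows. The example computes the exact expectation value of Z on one qubit of a state vector, which is what repeated measurement estimates, and thresholds it at zero to obtain a label in {1, −1}. The threshold, the qubit ordering (qubit 0 as the least significant bit), and the function names are assumptions for illustration.

```python
import numpy as np

def expectation_z(state, qubit, n_qubits):
    """Exact <Z> on one qubit of an n-qubit state vector."""
    probs = np.abs(state) ** 2
    # Z has eigenvalue +1 when the qubit is 0 and -1 when it is 1.
    signs = np.array([1 - 2 * ((b >> qubit) & 1) for b in range(2 ** n_qubits)])
    return float(np.sum(signs * probs))

def predict_label(state, qubit, n_qubits):
    """Map the sign of <Z> to the class labels 1 and -1."""
    return 1 if expectation_z(state, qubit, n_qubits) >= 0 else -1

state_a = np.array([1, 0, 0, 0], dtype=complex)   # |00>
state_b = np.array([0, 1, 0, 0], dtype=complex)   # qubit 0 set to 1
label_a = predict_label(state_a, 0, 2)
label_b = predict_label(state_b, 0, 2)
```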

Figure 2(a) depicts a QCNN architecture constructed from the cluster state preparation layer, the input layer, and the convolution, pooling, and measurement layers. Figure 2(b) illustrates the hybrid QCNN model, which combines a classical neural network with a single quantum convolution and pooling layer. Figure 2(c) illustrates the hybrid QCNN with multiple quantum layers, which combines multiple quantum convolution and pooling layers with a classical neural network. Improving the performance of a deep network is not only a matter of depth: the opposite direction, expanding the width of the network instead of increasing its depth (the so-called broad learning system), also deserves attention. The difference between the hybrid QCNN and the hybrid QCNN with multiple quantum layers is that the latter is wider than the former. On some of the bitplanes, the difference in performance between the two models is larger than on the original image, which means that their abilities to resist adversarial attacks still differ.

4. Experimental Results

To verify the effectiveness and demonstrate the interpretation of our proposed QCNN classifier based on MERA features, we implement three sets of experiments in an environment of TensorFlow-quantum 0.3.0 and cirq 0.8.0. The first experiment uses a dataset of 7589 scanning tunneling microscopy (STM) images [28], each labeled as acquired with either a good or a bad probe. The dataset includes 1761 images acquired with a good probe and of good quality, which are labeled as class 1, and 5828 bad images with an imperfect acquisition (e.g., inadequate sample region or coarse sample; noisy image without probe sample contact; blurry image with a dull probe; replicated image with a multiple-feature probe; artifact with a contaminated probe), which are labeled as class 2. In these experiments, the MERA features and Box-counting fractal features [29] were normalized to [−, ] as parameters of rotation angles in gates and gates and then fed into the parameterized quantum circuits. This combination of encoded local scale features and the QCNN better demonstrates the multiscale nature of the data distribution.

A performance comparison of the three models, namely QCNN, hybrid QCNN, and hybrid QCNN with multiple quantum layers (a horizontal expansion of the hybrid QCNN that increases its width), suggests that the proposed features improve performance. Some of the STM images are shown in Figure 6(a); 60% of the images are randomly selected for training and the remaining 40% for validation. In Figure 6(b), QCNN, hybrid QCNN, and hybrid QCNN with multiple quantum layers achieve accuracies of about 75%–97%. The convergence rate of the pure QCNN leaves much room for improvement compared with the two quantum-classical hybrid models. Better convergence is expected for all three QCNN models, particularly the pure QCNN.

MERA and fractal features are similar as multiscale image analysis and representation methods, and it is important to study the characteristics of images at various scales. Multiscale decomposition distills the image information, which improves the QCNN performance. We compared the MERA and Box-counting fractal features in our QCNN model, and each feature shows some advantage. Table 1 shows the classification performance comparison between the MERA and Box-counting fractal features. Box-counting fractal features achieve higher accuracy than MERA features (98.44% vs. 95.31%), although on some high-order bitplanes the MERA features achieve higher accuracy than the Box-counting fractal features.

The Ising model [30] can describe the phase transition of ferromagnetic materials. When heated above a temperature threshold, the system loses its magnetism until it is cooled back below that threshold. The transition between the magnetic and non-magnetic phases is called a phase transition. The Monte Carlo method and the Ising model-based Metropolis algorithm are used to generate the images shown in Figure 7(b). Granularity distributions at different scales, together with the different spatial distributions of the pixel values, can be used in classification. Figure 7(a) shows that image bitplane decomposition has a phase transition similar to that of the Ising model. It is particularly important to study the relationship between phase transition and classification performance, because it gives an interpretation of why better performance is achieved.
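A minimal sketch of how such images can be generated with the Metropolis algorithm is given below. It assumes a 2D square lattice with periodic boundaries, coupling J = 1, no external field, and temperature in units where the Boltzmann constant is 1; the lattice size, number of sweeps, and function names are illustrative and are not the settings used to produce Figure 7(b).

```python
import numpy as np

def metropolis_ising(size=16, temperature=2.0, sweeps=50, seed=0):
    """Sample a spin configuration of the 2D Ising model (J = 1, k_B = 1)."""
    rng = np.random.default_rng(seed)
    spins = rng.choice([-1, 1], size=(size, size))
    for _ in range(sweeps * size * size):
        i, j = rng.integers(0, size, size=2)
        neighbours = (spins[(i + 1) % size, j] + spins[(i - 1) % size, j]
                      + spins[i, (j + 1) % size] + spins[i, (j - 1) % size])
        delta_e = 2 * spins[i, j] * neighbours   # energy change if this spin flips
        # Accept the flip if it lowers the energy, else with Boltzmann probability.
        if delta_e <= 0 or rng.random() < np.exp(-delta_e / temperature):
            spins[i, j] = -spins[i, j]
    return spins

def energy_per_site(spins):
    """Nearest-neighbour energy per site with periodic boundaries."""
    right = np.roll(spins, -1, axis=1)
    down = np.roll(spins, -1, axis=0)
    return -float(np.sum(spins * right + spins * down)) / spins.size

cold = metropolis_ising(temperature=1.0, seed=1)
hot = metropolis_ising(temperature=5.0, seed=1)
image = ((cold + 1) // 2 * 255).astype(np.uint8)   # map spins to black/white pixels
```

Low temperatures produce large aligned domains (coarse granularity), whereas high temperatures produce salt-and-pepper patterns (fine granularity), which is the kind of scale variation the classification experiments rely on.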

Network structure and parameter adjustment help to improve performance. The results suggest that the phase transition of the original image is universal and, in particular, that it is likely to correlate strongly with the classification performance. In the second set of experiments, image pixels are treated as physical particles. To demonstrate the effectiveness of our QCNN models, we first used the Monte Carlo method and the Ising model-based Metropolis algorithm to simulate 10000 images with pixels which indicate ten different scale levels of granularity. They are shown in Figure 7(b). For these 10000 Ising scale images, a predefined number of clusters is used to fix the number of categories at 10. The [0-0.1], (0.1-0.2], (0.2-0.3], (0.3-0.4], (0.4-0.5], (0.5-0.6], (0.6-0.7], (0.7-0.8], (0.8-0.9], and (0.9-1] regions are labeled 1 to 10, each corresponding to a category that represents one of the ten scales of granularity from top to bottom in Figure 7(b), i.e., from fine-grained to coarse-grained image granularity. A QCNN built on granules of various scales can make use of granularity and scale for classification. We divided the ten scales of Ising images into five groups, namely [0-0.1] and (0.9-1], (0.1-0.2] and (0.8-0.9], (0.2-0.3] and (0.7-0.8], (0.3-0.4] and (0.6-0.7], and (0.4-0.5] and (0.5-0.6], and performed binary classification on each group. QCNN, hybrid QCNN, and hybrid QCNN with multiple quantum layers, which have different network structures, achieved accuracies of about 60%–97% on the group [0-0.1] and (0.9-1], about 70%–97% on the group (0.1-0.2] and (0.8-0.9], about 74%–95% on the group (0.2-0.3] and (0.7-0.8], about 65%–97% on the group (0.3-0.4] and (0.6-0.7], and about 75%–97% on the group (0.4-0.5] and (0.5-0.6]. These results help to explore the relation between the quantization of the pixels of an image and quantum particles.
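The labeling and grouping described above can be written down compactly. The sketch assumes each image is summarized by a scalar in [0, 1] (here called `value`, a hypothetical name, since the quantity being binned is not named in the text) and pairs label k with label 11 − k to form the five binary tasks.

```python
from bisect import bisect_left

# Right-closed bin edges: [0, 0.1], (0.1, 0.2], ..., (0.9, 1].
EDGES = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

def granularity_label(value):
    """Return the category 1..10 of a value in [0, 1]."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("value must lie in [0, 1]")
    return bisect_left(EDGES, value) + 1

def binary_group(label):
    """Return the group 1..5 that pairs label k with label 11 - k."""
    return min(label, 11 - label)

labels = [granularity_label(v) for v in (0.0, 0.1, 0.15, 0.55, 0.95, 1.0)]
groups = [binary_group(k) for k in labels]
```

Using `bisect_left` keeps each upper edge inside its own bin, matching the right-closed intervals listed in the text.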

In the third set of experiments, we evaluate the robustness and explainability of the QCNNs by exploring the bitplanes and their classification performance under adversarial attacks. We simulate the fast gradient sign method (FGSM) attack on the STM images to generate adversarial samples. FGSM exploits the direction in which the gradient of the network changes most to inject perturbation noise and degrade the model under attack. Figure 8 depicts the original image and the adversarial sample images generated by the FGSM attack with different strengths on the STM images. The attacked image has minor visual differences, but the bitplane patterns under attack show significant differences and distortions. The first column, from top to bottom, shows the original image and its associated 8th, 7th, 6th, and 5th bitplanes; the second column shows the image attacked by FGSM with the attack strength parameter eps set to 12/255 and its associated bitplanes; the third column shows the adversarial sample image with eps set to 16/255 and its associated bitplanes; the fourth column shows the adversarial sample image with eps set to 24/255 and its associated bitplanes; the fifth column shows the adversarial sample image with eps set to 32/255 and its associated bitplanes.
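The FGSM update itself is short enough to sketch. The example below applies it to a logistic-regression classifier rather than to the QCNN, purely so that the gradient has a closed form; the model, the weights, and the input are hypothetical, and only the eps value 16/255 is taken from the settings above.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(x, y, w, b, eps):
    """One FGSM step against logistic regression; y in {0, 1}, pixels in [0, 1]."""
    p = sigmoid(x @ w + b)          # predicted probability of class 1
    grad_x = (p - y) * w            # gradient of the cross-entropy loss w.r.t. x
    # Move every pixel by eps in the direction that increases the loss.
    return np.clip(x + eps * np.sign(grad_x), 0.0, 1.0)

# Hypothetical four-pixel input and model parameters.
x = np.array([0.5, 0.5, 0.5, 0.5])
w = np.array([1.0, -2.0, 3.0, -0.5])
b = 0.0
eps = 16 / 255
x_adv = fgsm(x, y=1, w=w, b=b, eps=eps)
p_clean = sigmoid(x @ w + b)
p_adv = sigmoid(x_adv @ w + b)
```

Every pixel changes by at most eps, yet the predicted probability of the true class drops, which is the behaviour the attack strengths in Figure 8 vary.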

Figure 8 shows the original image, its adversarial samples under the FGSM attack, and the associated bitplane changes. Visualizing the bitplane changes helps to interpret the feasibility of resisting adversarial attacks. Table 2 compares the classification accuracy of different bitplanes against FGSM attacks. Different bitplanes have different abilities to resist adversarial attacks, and disturbances in some bitplanes are suppressed. The classification accuracy of the original image is compared with that of each bitplane under FGSM attacks of different intensities. Table 2 shows that when eps is set to 32/255 in the FGSM attack, the accuracy is 98.44% in the seventh bitplane; when eps is set to 24/255, the accuracy is 96.88% in the sixth bitplane; when eps is set to 16/255, the accuracy reaches 92.19% in the eighth bitplane; when eps is set to 12/255, the accuracy is 98.44% in the eighth bitplane. The experimental results show that bitplane slicing can help identify the true class of adversarial samples and yields good classification performance against attacks.
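Bitplane slicing can be sketched in a few lines for 8-bit grayscale images. The function names and the 2 × 2 sample image are illustrative; plane index 0 holds the 1st (least significant) bitplane and index 7 the 8th (most significant).

```python
import numpy as np

def bitplanes(image):
    """Split an 8-bit image into 8 binary planes, least significant first."""
    image = np.asarray(image, dtype=np.uint8)
    return np.stack([(image >> k) & 1 for k in range(8)])

def reconstruct(planes):
    """Recombine the 8 binary planes into the original 8-bit image."""
    weights = (1 << np.arange(8)).reshape(8, 1, 1)
    return np.sum(planes.astype(int) * weights, axis=0).astype(np.uint8)

image = np.array([[200, 13], [255, 0]], dtype=np.uint8)
planes = bitplanes(image)
restored = reconstruct(planes)
```

A one-level change in intensity can alter many planes at once: 127 is 01111111 in binary and 128 is 10000000, so these two nearly identical gray levels differ in all eight bitplanes. This is why a visually minor perturbation can distort the bitplane patterns significantly.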

We simulate four typical adversarial attacks: FGSM, CW, JSMA, and DEEPFOOL. The following experiments are a security evaluation of how well our QCNN resists these four attacks. Figure 9 depicts the different attacks on the background of the target image. The upper images are the original image and the images attacked by FGSM, CW, JSMA, and DEEPFOOL, and the lower images are the corresponding perturbation noise. The color bar in Figure 9 indicates the strength of the attack. An interesting observation is that some bitplanes can help the classifier improve its accuracy. Table 3 shows that under the FGSM attack, the eighth bitplane yields the best accuracy, 95.31%; under the CW attack, the sixth bitplane yields the best accuracy, 92.19%; under the JSMA attack, the fifth bitplane yields the best accuracy, 87.50%; under the DEEPFOOL attack, the sixth bitplane yields the best accuracy, 93.75%. The experimental results show that bitplane slicing with the QCNN can accurately identify the true class of adversarial samples and yields good classification performance against different attacks. Image bitplane slicing has a pattern similar to that of the Ising phase transition. It is therefore worthwhile to explore the correlation between the chaotic nature of images and classification/clustering models, given that the pixels of an image and the particles of the Ising model share similar patterns.

In the feature extraction section, the time complexity of the Box-counting algorithm is shown to be , where n is the number of considered points. The time complexity of MERA is shown to be , where is a refinement parameter for bond dimensions. We will demonstrate that for the neural computation with an input size of , the operator used in quantum implementation is , while it is on classical computers. We adopt the widely used time-space product as the cost complexity. For the quantum implementation, the time complexity (i.e., circuit depth) is , where d is the number of layers, while the space complexity (i.e., qubit numbers) is . The time-space complexity is , so the hybrid quantum-classical complexity can still be lower than that on a classical computer.

5. Conclusions

In this paper, we propose a multiscale entanglement renormalization ansatz (MERA) feature extraction method based on a novel quantum convolutional neural network (QCNN) for binary scanning tunneling microscopy (STM) image classification. We design QCNN quantum circuits for state preparation, quantum convolution, and quantum pooling in the TensorFlow Quantum framework and compare the performance of a pure QCNN classifier with that of two hybrid quantum-classical QCNN models. We also reveal two intrinsic mechanisms: general image bitplane slicing has a pattern similar to that of the Ising phase transition, and adversarially attacked images show only minor visual differences while their bitplanes show significant differences. Our scheme can robustly resist adversarial attacks, and it is explainable in that the classification performance of the QCNN on different bitplanes shows corresponding differences under FGSM, CW, JSMA, and DEEPFOOL attacks with different levels of attack intensity.

Data Availability

The data used to support the findings of this study are available at https://alex-krull.github.io/stm-data.html.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors would like to acknowledge the financial support from the National Key R&D Program of China (2019YFC0120102), the Natural Science Foundation of Guangdong Province (No. 2018A0303130169, No. 2020A151501212), the Opening Project of Guangdong Province Key Laboratory of Big Data Analysis and Processing at the Sun Yat-Sen University (No. 201902), the Key Laboratory of Industrial Equipment Quality Big Data, MIIT (No. 2021-IEQBD-03), and the Educational Science Planning Project of Guangdong Province (No. 2022GXJK073).