International Journal of Reconfigurable Computing

Volume 2016, Article ID 5718124, 18 pages

http://dx.doi.org/10.1155/2016/5718124

## An FPGA-Based Quantum Computing Emulation Framework Based on Serial-Parallel Architecture

VeCAD Research Laboratory, Faculty of Electrical Engineering, Universiti Teknologi Malaysia (UTM), 81310 Skudai, Johor Bahru, Malaysia

Received 13 October 2015; Revised 19 February 2016; Accepted 14 March 2016

Academic Editor: João Cardoso

Copyright © 2016 Y. H. Lee et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

#### Abstract

Hardware emulation of quantum systems can mimic more efficiently the parallel behaviour of quantum computations, thus allowing higher processing speed-up than software simulations. In this paper, an efficient hardware emulation method that employs a serial-parallel hardware architecture targeted for field programmable gate array (FPGA) is proposed. Quantum Fourier transform and Grover’s search are chosen as case studies in this work since they are the core of many useful quantum algorithms. Experimental work shows that, with the proposed emulation architecture, a linear reduction in resource utilization is attained against the pipeline implementations proposed in prior works. The proposed work contributes to the formulation of a proof-of-concept baseline FPGA emulation framework with optimization on datapath designs that can be extended to emulate practical large-scale quantum circuits.

#### 1. Introduction

Quantum computing is based on the properties of quantum mechanics, namely, superposition and entanglement. Superposition allows a quantum state to be in more than one basis state simultaneously, whereas entanglement is the strong correlation between multiqubit (quantum bit) basis states in a quantum system. Superposition and entanglement facilitate massive parallelism which enables exponential speed-ups to be achieved in the well-known integer factoring and discrete logarithms algorithms [1] and quadratic speed-ups in solving classically intractable brute-force searching and optimization problems [2, 3].

Similar to classical computing, quantum algorithms are being developed long before any large-scale practical quantum computer is physically available. In 1994, Shor proposed the integer factoring and discrete logarithms algorithms [1] that brought the world’s attention to the enormous potential of quantum computing. One consequence concerns the Rivest-Shamir-Adleman (RSA) security scheme [4], which is widely applied in current public key cryptosystems. It is based on the assumption that integer factoring of large numbers is intractable in classical computing. Shor’s proposal, which, in contrast, factors integers in polynomial time, would make such a security scheme no longer secure. In [5], Grover proposed a quantum search algorithm that is capable of identifying a specific element in an unordered database of $N$ elements in $O(\sqrt{N})$ attempts. This algorithm achieves a quadratic speed-up over the corresponding classical method, which requires $N/2$ queries on average to retrieve the desired data. Although the solution is only polynomially faster than the classical approach, Grover’s quantum algorithm is an important one, as it can be generalized and applied to many intractable computer science problems. Recently, quantum equivalents for random walks [6], genetic algorithms [3], and NAND tree evaluation [7] have been developed.

Shor in [8] categorized quantum algorithms known to provide substantial speed-up over the classical approach into three types: (a) algorithms that achieve notable speed-up by applying quantum Fourier transform (QFT) in periodicity finding; examples of this type of algorithm include integer factoring and discrete logarithms algorithms [1], Simon’s periodicity algorithm [9], Hallgren’s algorithms for Pell’s equation [10], and the quantum algorithms for solving hidden subgroup problems [11, 12]; (b) Grover’s search algorithm and its extensions [2, 13] which in general offer square root speed improvements over their classical counterparts; and (c) algorithms for simulating or solving problems in quantum mechanics [14].

Physical realization of a quantum computer is proving to be extremely challenging [15]. With research into viable large-scale quantum computers still ongoing, various technologies, namely, ion trap [16], nuclear magnetic resonance [17], and superconductor [18], were attempted. Nevertheless, only small-scale quantum computation implementations have been achieved [19, 20]. Instead of focusing on the realization of quantum gates, a different approach known as quantum annealing which solves optimization problems by finding the minimum point is used in the 128-qubit D-Wave One, 512-qubit D-Wave Two, and 1000-qubit D-Wave 2X systems [21, 22]. However, based on the research report presented in [23], the expected quantum speed-ups were not found in the D-Wave systems.

In parallel to efforts to develop physical quantum computers, there is also much effort in the theoretical research of quantum algorithms. Until large-scale practical quantum computers become prevalent, quantum algorithms are currently developed using the classical computing platform. However, due to their inherent sequential behaviour, classical computers that are based on Von Neumann architecture cannot simulate the inherent parallelism in quantum systems efficiently. On the other hand, the technology of field programmable gate array (FPGA) offers the potential of massive parallelism through hardware emulation. Consequently, significant improvement in speed performance over the equivalent software simulation can be achieved. However, FPGA is still a form of classical digital computing, and resource utilization on such a classical computing platform grows exponentially as the number of qubits increases. The problem is further compounded with the fact that accurate modelling of quantum circuit in FPGA technology is nonintuitive and therefore difficult, providing the research motivation for this paper.

This paper presents an efficient FPGA emulation framework for quantum computing. In the proposed emulation model, quantum computations are mapped to a serial-parallel architecture that facilitates scalability by managing the exponential growth of resource requirement against number of qubits. Quantum Fourier transform and Grover’s search are chosen as case studies in this work since they are the core of many useful quantum algorithms, and in addition, they have been used as benchmarking models in prior works on FPGA emulation. Experimental results on the efficiencies of different FPGA emulation architectures and fixed point formats are presented, which will sufficiently demonstrate the feasibility of proposed framework.

The rest of this paper is organized as follows: Section 2 discusses prior works on FPGA-based quantum computing emulation, emphasizing issues of hardware architecture and modelling of quantum system on FPGA platform. In Section 3, the theoretical background on quantum computing and related quantum algorithms is provided. Section 4 presents the design of the proposed FPGA emulation models for QFT and Grover’s search algorithms. Experimental results and analysis are given in Section 5. Finally, concluding remarks are made in Section 6.

#### 2. Related Work

Modelling of a quantum system on classical computing platform is a challenging task. Hence, it is even more difficult to map quantum algorithms for emulation on classical computing environment based on FPGA, which is highly resource-constrained. Many attempts have been made in the last decade in FPGA emulation of quantum algorithms, and these works include [24–27]. However, details of the critical design processes such as mapping of the quantum algorithms into the FPGA emulation models and the verification of the implementations are not revealed in these prior works.

For software-based simulation using classical computers, various types of quantum simulators have been proposed. An open source C library for the simulation of quantum computing is presented in [28], where pure quantum computer simulation as well as general quantum simulation is supported by the tool. In 2007, a variant of the binary decision diagram named quantum information decision diagram (QuIDD) for compact state vector storage was introduced in [29] for efficient quantum circuit simulation. García and Markov [30] proposed a compact data structure based on stabilizer formalism called stabilizer frames.

Most of the previous FPGA emulation works are based on the quantum circuit model, which is essentially an interconnection of quantum gates. A different approach was taken by Goto and Fujishima [24], where a general purpose quantum processor was developed instead of applying the quantum circuit model. However, their quantum processor assumed that the amplitudes of a quantum state can be either all zeros or evenly distributed in probability. In its emulation of Shor’s integer factoring algorithm, the implementation details are inadequate for the claimed results to be verified. For instance, it is stated in [24] that a 64-bit factorization was demonstrated using their emulator with only 40 Kbits of classical memory instead of the 320 qubits required by Shor’s algorithm on a quantum computer. This statement was not supported by design and implementation details on how factorization of such a large integer can be done with only 40 Kbits of memory, given that a 320-qubit state vector holds $2^{320}$ complex amplitudes and would typically require at least that many bytes to represent on the classical platform.

In [25], FPGA emulation of 3-qubit QFT and Grover’s search is proposed. In this work, which is based on the quantum circuit model, qubit expansion is performed prior to the application of multiqubit quantum gate transformations. This leads to an inaccurate modelling of a quantum algorithm, since, according to [31], the input quantum state to the QFT circuit should first be placed in a superposition of basis states, where signal samples are encoded as a sequence of amplitudes. In [26], hardware emulation of QFT restricts its input quantum state to the computational basis state, implying that superposition is not included in the modelling. Rivera-Miranda et al. [26] claim that 16-qubit QFT emulation is achieved. However, the emulator can only process up to 32 input signal samples in one evaluation, which is equivalent to a 5-qubit QFT emulation if the effects of superposition and entanglement are included.

From the above discussion, it should be noted that the critical quantum properties of superposition and entanglement were not considered in these previous works, resulting in inaccurate modelling of quantum algorithms. Without the superposition and entanglement effects, the power of quantum parallelism cannot be fully exploited. Previous works reported in [25–27] applied pipeline architecture in their FPGA emulation implementations so as to obtain high throughput and low critical path delay. However, a pipeline design imposes high resource utilization (due to the requirement of additional pipeline registers and associated logic), thus limiting the deployment of FPGA emulation in more practical quantum computing applications, which typically require large numbers of qubits. In the pipeline implementations proposed in prior works, resource usage grew exponentially with the number of qubits.

In this paper, the issues outlined above are addressed. The efficiencies of different hardware architectural designs for FPGA emulation purposes are evaluated based on the chosen case studies of QFT and Grover’s search. We propose an accurate modelling of quantum systems for FPGA emulation, targeting efficient resource utilization while maintaining significant speed-up over the equivalent simulation approach. Since our proposed FPGA emulation framework applies the state vector approach, simulation models based on the C library presented in [28] are selected in this work for benchmarking purposes.

#### 3. Theoretical Background

In general, quantum algorithms follow the same basic process flow. The computation begins with a system set in a specific quantum state, which is then converted into a superposition of multiple basis states. Unitary transformations are performed on the quantum state according to the required operations of the algorithm. Finally, measurement is carried out, resulting in the qubits collapsing into classical bits.

##### 3.1. Quantum Bit (Qubit)

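In classical computing, the smallest unit of information is the *bit*. A bit can be in either state $0$ or state $1$, and the state of a *bit* can be represented in matrix form as

$$0 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \qquad 1 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}. \tag{1}$$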

On the other hand, in quantum computing, the smallest unit of information is the *quantum bit* or *qubit*. To distinguish the classical bit from the qubit, Dirac *ket* notation is used. Using the *ket* notation, the quantum computational basis states are represented by $|0\rangle$ and $|1\rangle$. A qubit can be in state $|0\rangle$, or in state $|1\rangle$, or in a superposition of both basis states. The state of a qubit can be represented as

$$|\psi\rangle = \alpha|0\rangle + \beta|1\rangle, \tag{2}$$

where both $\alpha$ and $\beta$ are complex numbers and $|\alpha|^2 + |\beta|^2 = 1$. $|\alpha|^2$ is the probability that the qubit is found in state $|0\rangle$ and $|\beta|^2$ is the probability that the qubit is found in state $|1\rangle$ upon measurement. An $n$-qubit quantum state vector contains $2^n$ complex numbers, which determine the measurement probability of each basis state. However, on measurement, the superposition is destroyed and the qubits return to the classical state of bits, depending on the probabilities derived from the complex-valued state vector.
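As a small illustration of the state-vector view described above, the following sketch normalizes a pair of amplitudes and derives the measurement probabilities from them. The helper names (`normalize`, `measurement_probabilities`, `state_vector_length`) and the sample amplitudes are hypothetical choices made for this example and are not taken from the paper.

```python
import math


def normalize(amplitudes):
    """Scale complex amplitudes so that their squared magnitudes sum to 1."""
    norm = math.sqrt(sum(abs(a) ** 2 for a in amplitudes))
    if norm == 0:
        raise ValueError("a quantum state cannot be the zero vector")
    return [a / norm for a in amplitudes]


def measurement_probabilities(state):
    """Probability of observing each basis state: the squared magnitude."""
    return [abs(a) ** 2 for a in state]


def state_vector_length(num_qubits):
    """An n-qubit state vector holds 2**n complex amplitudes."""
    return 2 ** num_qubits


# Hypothetical single-qubit example: alpha and beta proportional to 3 and 4i.
qubit = normalize([3, 4j])
probs = measurement_probabilities(qubit)
```

Here `probs` holds the probabilities of measuring $|0\rangle$ and $|1\rangle$, and `state_vector_length` makes the exponential growth of the state vector with the number of qubits explicit.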

##### 3.2. Tensor/Kronecker Product

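The tensor product, or Kronecker product, is the basic operation applied in the formation of a larger quantum system as well as of multiqubit quantum transformations. A quantum state vector that can be written as the tensor product of two vectors is separable, whereas a state vector that cannot be expressed as the tensor product of two vectors is entangled [15]. The tensor operation on two arbitrary 1-qubit transformations $A$ and $B$ is shown below:

$$A \otimes B = \begin{bmatrix} a_{00} & a_{01} \\ a_{10} & a_{11} \end{bmatrix} \otimes \begin{bmatrix} b_{00} & b_{01} \\ b_{10} & b_{11} \end{bmatrix} = \begin{bmatrix} a_{00}b_{00} & a_{00}b_{01} & a_{01}b_{00} & a_{01}b_{01} \\ a_{00}b_{10} & a_{00}b_{11} & a_{01}b_{10} & a_{01}b_{11} \\ a_{10}b_{00} & a_{10}b_{01} & a_{11}b_{00} & a_{11}b_{01} \\ a_{10}b_{10} & a_{10}b_{11} & a_{11}b_{10} & a_{11}b_{11} \end{bmatrix}. \tag{3}$$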
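The Kronecker product and the separable/entangled distinction can be sketched in a few lines. The function names below are hypothetical, and the separability check uses the standard fact that a 2-qubit state with amplitudes $(a, b, c, d)$ is a product state exactly when $ad = bc$; it is only meant for the 2-qubit case.

```python
import math


def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [
        [x * y for x in row_a for y in row_b]
        for row_a in a
        for row_b in b
    ]


def kron_vec(u, v):
    """Kronecker product of two state vectors given as flat lists."""
    return [x * y for x in u for y in v]


def is_separable_2q(state, tol=1e-12):
    """A 2-qubit state (a, b, c, d) is a product state iff a*d == b*c."""
    a, b, c, d = state
    return abs(a * d - b * c) < tol


ket0, ket1 = [1, 0], [0, 1]
product_state = kron_vec(ket0, ket1)      # |0> tensor |1> = |01>

s = 1 / math.sqrt(2)
bell_state = [s, 0, 0, s]                 # (|00> + |11>) / sqrt(2)

X = [[0, 1], [1, 0]]                      # bit-flip gate
I2 = [[1, 0], [0, 1]]                     # 1-qubit identity
x_on_first_qubit = kron(X, I2)            # 4x4 transformation on 2 qubits
```

The product state passes the separability check, while the Bell state does not, since it cannot be written as the tensor product of two 1-qubit vectors.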

##### 3.3. Quantum Circuit Model

A quantum algorithm is a description of a sequence of quantum operations (or transformations) applied upon qubits to generate new quantum states. The model most widely used in describing the evolution of a quantum system is the quantum circuit model, first proposed in [32]. A quantum circuit is the interconnection of quantum gates with quantum wires, and gate operations are represented by unitary matrices.

All unitary matrices are invertible, and the products of unitary matrices as well as the inverse of a unitary matrix are unitary. An $n$-by-$n$ matrix $U$ is unitary if $U^{\dagger}U = UU^{\dagger} = I$, where $U^{\dagger}$ is the adjoint (conjugate transpose) of $U$. Since unitary transformations are reversible, quantum gate operations can always be undone. Fundamental quantum gates include the Hadamard gate, phase-shift gate, and swap gate, and these gates are described as follows.
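The unitarity condition can be checked numerically. The sketch below is only an illustration with hypothetical helper names; it tests whether the adjoint multiplied by the matrix gives the identity within a floating-point tolerance, which is sufficient for square matrices.

```python
import math


def dagger(m):
    """Adjoint (conjugate transpose) of a matrix given as a list of rows."""
    rows, cols = len(m), len(m[0])
    return [[complex(m[r][c]).conjugate() for r in range(rows)]
            for c in range(cols)]


def matmul(a, b):
    """Plain matrix product of two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b)))
         for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def is_unitary(m, tol=1e-12):
    """True when dagger(m) * m equals the identity within tol."""
    n = len(m)
    product = matmul(dagger(m), m)
    return all(
        abs(product[i][j] - (1 if i == j else 0)) < tol
        for i in range(n)
        for j in range(n)
    )


s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]          # Hadamard gate: unitary
SHEAR = [[1, 1], [0, 1]]       # invertible but not unitary

# Undoing a gate: applying the adjoint after the gate gives the identity.
undone = matmul(dagger(H), H)
```

The shear matrix shows that invertibility alone is not enough: it has an inverse, but it does not preserve the norm of a state vector and so cannot be a quantum gate.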

The Hadamard gate is one of the most useful single-qubit quantum gates. It operates by placing a computational basis state into a superposition of basis states with equal probability. The Hadamard transform can be represented by the following unitary matrix:

$$H = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}. \tag{4}$$

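The following example illustrates the application of Hadamard gates in mapping a 2-qubit basis state, here taken to be $|00\rangle$, to a superposition of basis states with equal probability:

$$(H \otimes H)|00\rangle = \frac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right) \otimes \frac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right) = \frac{1}{2}\left(|00\rangle + |01\rangle + |10\rangle + |11\rangle\right). \tag{5}$$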
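The effect of Hadamard gates on a 2-qubit basis state can also be reproduced numerically. The sketch below applies a 2-by-2 gate to one qubit of a state vector without building the full $4 \times 4$ matrix. The function name `apply_single_qubit_gate` and the convention that qubit 0 is the most significant bit are assumptions made for this example.

```python
import math


def apply_single_qubit_gate(state, gate, target, num_qubits):
    """Apply a 2x2 gate to qubit `target` of an n-qubit state vector.

    Assumed convention: qubit 0 is the most significant bit of the index.
    """
    mask = 1 << (num_qubits - 1 - target)
    out = list(state)
    for i in range(len(state)):
        if (i & mask) == 0:
            # Pair of amplitudes that differ only in the target qubit.
            a0, a1 = state[i], state[i | mask]
            out[i] = gate[0][0] * a0 + gate[0][1] * a1
            out[i | mask] = gate[1][0] * a0 + gate[1][1] * a1
    return out


s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]

state = [1, 0, 0, 0]                                  # |00>
state = apply_single_qubit_gate(state, H, 0, 2)       # H on first qubit
state = apply_single_qubit_gate(state, H, 1, 2)       # H on second qubit
probabilities = [abs(a) ** 2 for a in state]
```

After both gates, every amplitude has the same magnitude, so each of the four basis states is measured with equal probability.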

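The controlled phase-shift gate operates on 2 qubits, one of which is the control qubit and the other is the target qubit. If the control qubit is true, a phase-shift operation is performed on the target qubit; otherwise, there is no operation. For a phase shift $\theta$, the operation is represented by the following matrix:

$$\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & e^{i\theta} \end{bmatrix}. \tag{6}$$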

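The quantum swap gate is used for swapping two qubits. It exchanges the corresponding amplitudes of a quantum state vector. The operation of a 2-qubit swap gate is represented by the matrix in

$$\mathrm{SWAP} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}. \tag{7}$$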
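Both 2-qubit gates act very simply on the four amplitudes of a 2-qubit state, which can be sketched as follows. The function names and the amplitude ordering $(|00\rangle, |01\rangle, |10\rangle, |11\rangle)$ are assumptions made for this illustration.

```python
import cmath


def controlled_phase(state, theta):
    """Controlled phase shift: only the |11> amplitude picks up e^{i*theta}."""
    a00, a01, a10, a11 = state
    return [a00, a01, a10, a11 * cmath.exp(1j * theta)]


def swap(state):
    """Swap gate: exchanges the |01> and |10> amplitudes."""
    a00, a01, a10, a11 = state
    return [a00, a10, a01, a11]


uniform = [0.5, 0.5, 0.5, 0.5]
shifted = controlled_phase(uniform, cmath.pi / 2)   # phase i on |11>
swapped = swap([0, 1, 0, 0])                        # |01> becomes |10>
```

Neither gate changes the magnitude of any amplitude, so the measurement probabilities of `shifted` are the same as those of `uniform`; only the relative phase differs.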

##### 3.4. Quantum Fourier Transform (QFT)

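The Fourier transform is deployed in a wide range of engineering and physics applications such as signal processing, image processing, and quantum mechanics. It is a reversible transformation that converts signals from the time/spatial domain to the frequency domain and vice versa. Under one common sign and normalization convention, the Fourier transform is defined in (8) for continuous signals and in (9) for discrete signals:

$$X(f) = \int_{-\infty}^{\infty} x(t)\, e^{-2\pi i f t}\, dt, \tag{8}$$

$$X_k = \sum_{n=0}^{N-1} x_n\, e^{-2\pi i k n / N}, \qquad k = 0, 1, \ldots, N-1. \tag{9}$$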

The quantum Fourier transform (QFT) is a transformation on qubits and is the quantum equivalent of the discrete Fourier transform. It should be noted that a quantum computer performs QFT with exponentially fewer operations than the classical Fourier transform. However, QFT does not reduce the execution time of the algorithm when classical data is used. This is because a quantum computer does not allow parallel read-out of all quantum state amplitudes. In addition, there is no known method that can efficiently instantiate the desired input state amplitudes to be Fourier-transformed [33].

In order to harness the power of quantum computing on Fourier transform, QFT has to be deployed within other practical applications. QFT is pivotal in quantum computing since it is part of many quantum algorithms. These algorithms include integer factorization and discrete logarithms algorithms [1], Simon’s periodicity algorithm [9], and Hallgren’s algorithms [10]. They offer significant speed-up over their classical counterparts. QFT has also found applications in many real-world problems such as image watermarking [34] and template matching [35].

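To compute the Fourier transform in the quantum domain, discrete signal samples are encoded as the amplitude sequence of a quantum state vector which is in a superposition of basis states [31]. An $n$-qubit QFT operation, which transforms an arbitrary superposition of $N = 2^n$ computational basis states, is expressed in (10):

$$\sum_{j=0}^{N-1} x_j |j\rangle \;\longrightarrow\; \sum_{k=0}^{N-1} y_k |k\rangle, \qquad y_k = \frac{1}{\sqrt{N}} \sum_{j=0}^{N-1} x_j\, e^{2\pi i jk/N}. \tag{10}$$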

As required for a valid quantum state, the amplitudes $x_j$ of the input state must be normalized such that they fulfil (11). If the original signal inputs do not comply with this requirement, the amplitudes of the signal samples have to be divided by the normalization factor, $\sqrt{\sum_{j=0}^{N-1} |x_j|^2}$. In most cases, the input states formed by the normalized signal samples are entangled:

$$\sum_{j=0}^{N-1} |x_j|^2 = 1. \tag{11}$$
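The encoding and normalization step can be illustrated with a direct evaluation of the QFT sum. This is only a reference sketch: the function names and the sample values are hypothetical, and the positive sign in the exponent follows the usual textbook QFT convention, which is assumed here.

```python
import cmath
import math


def encode_samples(samples):
    """Divide signal samples by their norm so they form a valid quantum state."""
    norm = math.sqrt(sum(abs(v) ** 2 for v in samples))
    if norm == 0:
        raise ValueError("all-zero signal cannot be encoded")
    return [v / norm for v in samples]


def qft_direct(x):
    """Evaluate y_k = (1/sqrt(N)) * sum_j x_j * exp(2*pi*i*j*k/N) directly."""
    size = len(x)
    scale = 1 / math.sqrt(size)
    return [
        scale * sum(x[j] * cmath.exp(2j * cmath.pi * j * k / size)
                    for j in range(size))
        for k in range(size)
    ]


# Four hypothetical signal samples occupy a 2-qubit state vector.
samples = [1, 2, 3, 4]
state_in = encode_samples(samples)
state_out = qft_direct(state_in)
total_probability = sum(abs(a) ** 2 for a in state_out)
```

Because the transform is unitary, `total_probability` stays at 1 after the transformation, so the output is again a valid quantum state.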

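From (10), it can be observed that the term $j/2^n$ in the QFT equation is a rational number in the range $0 \le j/2^n < 1$. As qubit representation is typically used in computations, the base-10 integer $j$ is redefined in base-2 notation as individual bits $j_1 j_2 \ldots j_n$, so that $j/2^n = 0.j_1 j_2 \ldots j_n$ and the binary fraction form expressed in (12) can be conveniently adopted:

$$0.j_l j_{l+1} \ldots j_m = \frac{j_l}{2} + \frac{j_{l+1}}{4} + \cdots + \frac{j_m}{2^{m-l+1}}. \tag{12}$$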

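With some algebraic manipulations, the QFT equation can be derived from (13) to form (14) [33]:

$$|j\rangle \;\longrightarrow\; \frac{1}{\sqrt{2^n}} \sum_{k=0}^{2^n-1} e^{2\pi i jk/2^n} |k\rangle, \tag{13}$$

$$|j_1 j_2 \ldots j_n\rangle \;\longrightarrow\; \frac{\left(|0\rangle + e^{2\pi i\, 0.j_n}|1\rangle\right)\left(|0\rangle + e^{2\pi i\, 0.j_{n-1} j_n}|1\rangle\right) \cdots \left(|0\rangle + e^{2\pi i\, 0.j_1 j_2 \ldots j_n}|1\rangle\right)}{\sqrt{2^n}}. \tag{14}$$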

Since the term $e^{2\pi i\, 0.j_1}$ produces either $-1$ if $j_1 = 1$ or $+1$ otherwise, Hadamard computation on the first qubit results in $\frac{1}{\sqrt{2}}\left(|0\rangle + e^{2\pi i\, 0.j_1}|1\rangle\right)$. Computations of the consecutive bits in the binary fraction are obtained using controlled phase-shift gates according to (14). The QFT circuit consists of three types of elementary gates, which are the Hadamard gate, $H$, the controlled phase-shift gate, $R_k$ (a phase shift of $2\pi/2^k$), and the swap gate. The circuit model of an $n$-qubit QFT is depicted in Figure 1.
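A state-vector sketch of such a circuit is given below as a software illustration only; it is not the paper’s FPGA datapath. Since Figure 1 is not reproduced here, the gate ordering is assumed to follow the textbook construction in [33]: a Hadamard gate on each qubit, followed by phase shifts of $2\pi/2^k$ controlled by the lower qubits, and a final reversal of qubit order with swap gates. All function names and the convention that qubit 0 is the most significant bit are assumptions.

```python
import cmath
import math


def bit(index, qubit, n):
    """Value of `qubit` in a basis-state index (qubit 0 = most significant)."""
    return (index >> (n - 1 - qubit)) & 1


def hadamard(state, target, n):
    s = 1 / math.sqrt(2)
    mask = 1 << (n - 1 - target)
    out = list(state)
    for i in range(len(state)):
        if (i & mask) == 0:
            a0, a1 = state[i], state[i | mask]
            out[i] = s * (a0 + a1)
            out[i | mask] = s * (a0 - a1)
    return out


def controlled_phase(state, control, target, k, n):
    """Multiply by exp(2*pi*i / 2**k) where control and target are both 1."""
    phase = cmath.exp(2j * cmath.pi / (1 << k))
    return [a * phase if bit(i, control, n) and bit(i, target, n) else a
            for i, a in enumerate(state)]


def swap(state, q1, q2, n):
    out = list(state)
    flip = (1 << (n - 1 - q1)) | (1 << (n - 1 - q2))
    for i in range(len(state)):
        if bit(i, q1, n) != bit(i, q2, n):
            out[i ^ flip] = state[i]
    return out


def qft_circuit(state, n):
    """Gate-by-gate QFT: H and controlled phase shifts, then qubit reversal."""
    for t in range(n):
        state = hadamard(state, t, n)
        for c in range(t + 1, n):
            state = controlled_phase(state, c, t, c - t + 1, n)
    for q in range(n // 2):
        state = swap(state, q, n - 1 - q, n)
    return state


def qft_direct(state):
    """Reference: direct evaluation of the QFT sum with a positive exponent."""
    size = len(state)
    return [sum(state[j] * cmath.exp(2j * cmath.pi * j * k / size)
                for j in range(size)) / math.sqrt(size)
            for k in range(size)]


# Basis state |01> on 2 qubits, transformed gate by gate.
result = qft_circuit([0, 1, 0, 0], 2)
```

Comparing `qft_circuit` against `qft_direct` on the same input is a simple way to check that the gate sequence reproduces the transform defined by the QFT sum.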