Abstract

Information processors process information in a variety of ways. The human brain processes information through a highly interconnected system of neurons and synapses, while a digital computer processes information by having a binary switch toggle on and off in response to a stream of binary bits. The “switch” is the most primitive unit of the modern computer. The better it is (faster, more energy efficient, more reliable, etc.), the more advanced the computer hardware. Energy efficiency, however, is more important than any other attribute, not so much because energy is costly, but because excessive energy dissipation prevents the increase in switch density on a chip that is necessary to make the chip more powerful. Reducing dissipation entails radically new and often revolutionary approaches for implementing the switch. One such approach is to encode digital bit information in the spin polarization of a single electron (or an ensemble of electrons) and then use two mutually antiparallel polarizations to represent the binary bits 0 and 1. Switching between the bits can be accomplished by simply flipping the polarizations of the spins, which takes very little energy. Such switches are extremely energy efficient if designed properly, but they are somewhat slower than traditional transistor-based switches and can be more error prone. This paper discusses the pros and cons of spin-based switches and introduces the reader to the most recent advancements in information processing predicated on encoding information in electron spin polarization.

1. Introduction

Information processors (computers, cell phones, digital watches, personal communicators, etc.) pervade our everyday lives. This paper, for example, was typed on a desktop computer and the author used his cell phone several times while typing it. The information overload that our society deals with routinely requires ever-increasing computational prowess that can only be attained by packing more and more computing devices in a chip. Since the chip area is limited by considerations of cost, convenience, and practicality, one must increase the density of devices in a chip to keep abreast of the ever-increasing demands of computing. This was foreseen by the visionary cofounder of Intel Corporation who postulated the famous Moore’s law [1] stipulating that the density of devices in a chip must double roughly every 18 months. In the past, Moore’s law has been sustained; the density has increased roughly by a factor of 2 every 18 months, but a calamity looms on the horizon. What might stop device downscaling in accordance with Moore’s law is not so much the difficulty of fabricating smaller and smaller devices, nor is it the fact that classical laws of physics will be defunct when device dimensions approach atomic scales, but it is the unmanageable energy and heat dissipation associated with switching of a device. Present transistors dissipate about 0.2 fJ of energy (~50,000kT at room temperature; k = Boltzmann constant and T = absolute temperature) when they switch in isolation in ~100 ps. Therefore, the power dissipated per device during a switching event is about 2 μW. The Pentium IV chip of circa 2000 had a transistor density of $10^8$/cm$^2$ [2] and even if 10% of them switched at any given time, the power dissipation would have been 20 W/cm$^2$. That is roughly what the Pentium IV chip actually dissipated. Now imagine what would happen if transistor density increased in accordance with Moore’s law. By the year 2020, the density will be $8 \times 10^{11}$/cm$^2$ and the dissipation will have increased to 164 kW/cm$^2$. There is no known heat sinking technology that can remove that much heat from a chip. Surely, the chip would melt! This is the major problem threatening electronics today.
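The arithmetic behind these figures can be retraced with a short script. This is only a back-of-envelope sketch: the switching energy, switching time, activity factor, and device densities are the values quoted above, while the function names and the choice of 300 K for "room temperature" are assumptions made here for illustration.

```python
# Back-of-envelope check of the dissipation figures quoted in the text.
E_SWITCH_J = 0.2e-15         # energy per switching event (0.2 fJ, from the text)
T_SWITCH_S = 100e-12         # switching time (~100 ps, from the text)
ACTIVITY = 0.10              # fraction of devices switching at once (10%, from the text)
K_BOLTZMANN = 1.380649e-23   # Boltzmann constant, J/K
T_ROOM_K = 300.0             # assumed value for "room temperature"


def power_per_device_w(energy_j=E_SWITCH_J, time_s=T_SWITCH_S):
    """Average power drawn by one device while it switches."""
    return energy_j / time_s


def power_density_w_per_cm2(density_per_cm2, activity=ACTIVITY):
    """Chip-level dissipation for a given device density."""
    return density_per_cm2 * activity * power_per_device_w()


def energy_in_kt(energy_j=E_SWITCH_J, temperature_k=T_ROOM_K):
    """Express an energy in units of kT."""
    return energy_j / (K_BOLTZMANN * temperature_k)


if __name__ == "__main__":
    print(power_per_device_w())              # about 2e-6 W, i.e. 2 microwatts
    print(power_density_w_per_cm2(1e8))      # about 20 W/cm^2
    print(power_density_w_per_cm2(8e11))     # about 1.6e5 W/cm^2
    print(energy_in_kt())                    # a little under 50,000
```

With these inputs the projected density gives about 160 kW/cm$^2$, in line with the ~164 kW/cm$^2$ quoted above; the small difference presumably reflects rounding of the density.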

Excessive energy dissipation is virtually unavoidable in all charge-based digital switches like transistors that encode binary bit information in the amount of charge stored in the device. Charge is a scalar quantity that has magnitude but no direction. Hence, if binary bits 0 and 1 are to be encoded with charge, they must be represented by two different amounts of charge $Q_1$ and $Q_2$. Switching between the bits will then necessitate changing the amount of charge in the device by an amount $\Delta Q = |Q_1 - Q_2|$ in some time $\Delta t$, leading to the flow of current $I = \Delta Q/\Delta t$ and the associated energy dissipation $I^2 R \Delta t = (\Delta Q)^2 R/\Delta t$, where $R$ is the resistance in the path of the current. One can reduce this dissipation by increasing $\Delta t$ (switching slowly) or by decreasing $\Delta Q$, but neither is desirable since the former makes the switch slow and sluggish, while the latter makes the switch vulnerable to noise since it decreases the separation between the 0- and 1-states by bringing the two closer together.

The “spin” of an electron is a quantum-mechanical property and can be crudely thought of as the tiny magnetic moment associated with the electron spinning about its axis. It is a pseudovector that has a fixed magnitude of $\hbar/2$ ($\hbar$ = reduced Planck’s constant) and a variable direction or polarization. If the electron is placed in a magnetic field, only two polarizations are allowed, and they can be viewed as stable and metastable. The polarization parallel to the field will be stable and that antiparallel to the field will be metastable. These two polarizations can encode the binary bits 0 and 1. Switching between them will involve merely flipping the spin, without moving the electron in space and causing current flow, as shown in Figure 1. This eliminates the dissipation associated with current flow, but does not eliminate dissipation altogether since the two spin states are nondegenerate and separated in energy by the Zeeman splitting energy $g\mu_B B$ ($g$ = Landé $g$-factor, $\mu_B$ = Bohr magneton, $B$ = flux density of the magnetic field). Therefore, the minimum energy dissipation in flipping a spin would be $g\mu_B B$ per bit flip event. This energy, however, can be a lot smaller than the energy dissipated in switching a transistor.
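To get a feel for the magnitudes, the Zeeman splitting $g\mu_B B$ can be expressed in units of kT and set against the ~50,000kT transistor figure quoted in the Introduction. The sketch below is illustrative only: the $g$-factor of 2 (the free-electron value), the 1 T field, and the 300 K temperature are hypothetical inputs chosen here, not values taken from the paper.

```python
MU_B = 9.2740100783e-24    # Bohr magneton, J/T
K_BOLTZMANN = 1.380649e-23 # Boltzmann constant, J/K


def zeeman_splitting_j(g_factor, flux_density_t):
    """Energy separation g * mu_B * B between the two spin polarizations."""
    return abs(g_factor) * MU_B * flux_density_t


def in_units_of_kt(energy_j, temperature_k):
    """Express an energy as a multiple of the thermal energy kT."""
    return energy_j / (K_BOLTZMANN * temperature_k)


# Hypothetical example: g = 2, B = 1 T, T = 300 K.
splitting_kt = in_units_of_kt(zeeman_splitting_j(2.0, 1.0), 300.0)
transistor_kt = 50_000.0   # figure quoted in the Introduction
ratio = transistor_kt / splitting_kt
```

For these inputs the splitting is a small fraction of kT, many orders of magnitude below the transistor figure. How large the splitting must actually be is set by the tolerable error probability, which Section 2.5 takes up.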

2. Single Spin Logic (SSL)

The notion of using the bistable spin polarizations of a single electron placed in a magnetic field to encode the binary bits 0 and 1 is at the heart of an exotic idea known as Single Spin Logic (SSL) [3–8].

In SSL, single conduction band electrons are confined in semiconductor quantum dots that are delineated on a wafer. The entire wafer is placed in a dc magnetic field generated by a permanent magnet or an electromagnet. This global magnetic field defines the spin quantization axis and makes the spin polarization of every conduction electron bistable, that is, only polarizations parallel and antiparallel to this field are stable or metastable in each dot. This is the first step in making binary switches.

In order to ensure single electron occupancy in every dot, we have to ensure that the Fermi level (or chemical potential) in each dot is above the lowest spin split level in the conduction band but below all other levels. In that case, the Pauli Exclusion Principle and Fermi-Dirac statistics would guarantee that there will be only one conduction band electron (or quasi-free electron) in each dot at low temperatures. One way to make this happen is to make sure that the energy cost to accommodate a second electron in any dot, which is roughly $e^2/2C$ (e = electron charge and C = capacitance of the dot), is prohibitively large and exceeds the thermal energy kT many times over. This would prevent a second electron from getting into any dot. Single electron occupancy in an array of ~$10^8$ dots has been demonstrated experimentally [9].

The wavefunction of the lone conduction band electron in any dot is sufficiently delocalized that the wavefunctions of electrons in nearest neighbor dots overlap in space. Valence band electrons are tightly bound to their parent atoms and have localized wavefunctions that do not overlap with the wavefunctions of other electrons. Therefore, they play no role in what is discussed next.

Because of the overlap between the wavefunctions of nearest neighbor conduction band electrons, their spins can interact via exchange. Spin-spin interaction between second nearest neighbors is much weaker since exchange interaction strength decays exponentially with distance [10]. For our purpose, we can ignore second- or more distant neighbor interactions altogether.

It is possible to align the spins in certain chosen dots (designated as input dots) in desired directions (parallel or antiparallel to the global magnetic field) using external agents, such as local magnetic fields. (Local magnetic fields for this purpose can be generated with spin-polarized scanning tunneling microscope tips or even current lines if sufficient lithographic resolution is achievable). This is how one “writes” input data into the array. The arrival of the inputs takes the interacting array into a many-body excited state. The system is then allowed to relax to the thermodynamic ground state by coupling to the surrounding thermal bath. (The coupling of a single-isolated electron to the thermal bath is weak, but the collective coupling of many interacting electrons to the thermal bath is much stronger. Hence the entire spin system should relax to ground state much faster than an isolated spin.). When the ground state is reached by emitting phonons, magnons, and so forth, the spin orientations in certain other chosen dots (designated as output dots) will represent the result of a specific computation in response to the input bits. The quantum dots are arranged in space in such a way that the nature of the nearest neighbor interactions guarantees this occurrence. Thus, computation is carried out by engineering the spin-spin interactions by choosing appropriate layout of the quantum dots, which determines the nature of the spin-spin interactions. In many ways, this is a “collective computation” model, similar to neural networks.

Once the system has fully relaxed to the ground state, the result of the computation (spin orientations in output ports) can be read using a variety of schemes, all of which have been experimentally demonstrated [11–13]. Since this is an “all-hardware” computer with no involvement of any “software,” it is extremely fast in producing the final result. The disadvantage, however, is that a particular computer can do only one specific computation since the computer is entirely hard wired and is not easily reconfigured for a different task. The precise placements of the quantum dots on the wafer determine the nature of the exchange interactions and hence the specific computational task that can be carried out by the spin array. The layout is the key; it determines uniquely what kind of computation is performed.

The only requirement for this paradigm to succeed is precisely defined spin-spin interactions and control of single spins. These have been demonstrated repeatedly by numerous groups in the context of spintronic quantum computing [14–27]. It is currently a well-established art.

Note that SSL is an equilibrium system where the spins are not intentionally maintained out of equilibrium. In fact, computation is performed by letting the excited spins thermodynamically relax to the ground state by coupling with the thermal bath (phonons). Therefore, this paradigm has some in-built noise immunity because the ground state is always the most stable. On the flip side, it does not exploit any possible advantage of nonequilibrium dynamics in computing that has been discussed in [28–30]. Maintaining a system perennially out of equilibrium would, however, consume additional energy, although that need not be dissipated in the chip.

Equilibrium statistics mandates that the absolute minimum energy dissipated in a single irreversible logic operation should be the Landauer-Shannon limit [31, 32]. However, reaching this limit requires complicated switching dynamics (e.g., time modulated potentials) and extreme timing synchronization between various components of the switching cycle [31, 32]. If that precision is unattainable, no time-modulated potential is available, and switching is carried out in a simple abrupt step, then the minimum energy dissipation will be $kT\ln(1/p)$, where $p$ is the static bit error probability (the probability that the bit flips spontaneously). It turns out that the energy dissipated in any irreversible logic operation in an SSL NAND gate (described below) is given precisely by the above expression [5]. This is actually a remarkable result since it shows that no paradigm can better the SSL in dissipation for an irreversible logic operation carried out nonadiabatically without elaborate time-modulated potentials and ultraprecise timing mechanisms, since SSL operates at the thermodynamic limit.

2.1. The SSL NAND Gate for General Purpose Computing

There are many ways to carry out general purpose computing (GPC), but the most common one is to use Boolean logic gates. In order to build a universal computing machine employing Boolean logic, we must construct combinational and sequential logic circuits by employing universal logic gates (e.g., the NAND gate). We will then interconnect them with “spin wires” that ferry spin signals between them unidirectionally. The two ingredients—NAND gates and unidirectional spin wires—are all that are required to implement a universal computing machine.

An SSL NAND gate is implemented with a linear array of three quantum dots each containing a single conduction band electron. The array is placed in a global static magnetic field that defines the spin quantization axis, that is, the spin in any dot will be aligned either parallel or antiparallel to it. If a spin is parallel to the field, we will assume that it encodes the binary bit 1 and if it is antiparallel, it encodes the binary bit 0.

The NAND gate realization is shown in Figure 2. The two peripheral dots in the array are treated as input ports whose resident spins are aligned to conform to input bits—either 0 or 1—with external agents that can generate local magnetic fields. The central dot is the output port and its resident spin’s polarization encodes the output bit.

It was rigorously shown in [5] that the ground state spin configuration in this system is antiferromagnetic, that is, spins in nearest neighbor quantum dots will be mutually antiparallel as long as the exchange interaction strength between nearest neighbors is greater than one-fourth of the Zeeman splitting energy in any dot due to the global magnetic field, and the local magnetic field applied to the input dots is much stronger than the global magnetic field. In that case, whenever the two inputs are 1, the output must be 0 to preserve the antiferromagnetic ordering, and similarly whenever the two inputs are 0, the output must be 1. When one input is 1 and the other is 0, a tie seemingly occurs. This tie, however, is broken by the global magnetic field, which will generate a slight preference for the spin in the output dot to be aligned parallel to the field when the two inputs are dissimilar. (This is assuming that the Landé $g$-factor (gyromagnetic ratio) of the dot material is positive.) Since spin orientation parallel to the global magnetic field encodes logic bit 1, the output will be 1 whenever the two input bits are different. Thus, the input-output relation of this system obeys the truth table of the NAND gate as shown in Figure 2.

2.2. Theory of the SSL NAND Gate

To show that the 3-dot system indeed acts as described (i.e., performs the NAND logic operation), one must resort to rigorous quantum mechanics and consider the many-body Hamiltonian of the array. One can describe the system with a Hubbard Hamiltonian which will have 29 independent basis states. However, if we assume single electron occupancy in each dot (the dots are so small and have such small capacitance that the energy cost to add a second electron to any dot, which is $e^2/2C$, is prohibitively large), then the Hubbard Hamiltonian can be reduced to a much simpler Heisenberg Hamiltonian [4, 7] which has only 8 independent (orthonormal) basis states. This Hamiltonian is given by

$$H = \sum_{\langle ij \rangle} J^{\parallel}_{ij}\,\sigma_{zi}\sigma_{zj} + \sum_{\langle ij \rangle} J^{\perp}_{ij}\left(\sigma_{xi}\sigma_{xj} + \sigma_{yi}\sigma_{yj}\right) - \sum_{i \in \text{input dots}} h_i\,\sigma_{zi} - \sum_{i} Z\,\sigma_{zi}, \qquad (2)$$

where the $\sigma$-s are Pauli spin matrices. We adopt the convention that the local magnetic field needed to align spins in input dots, and the global magnetic field, are always along the $z$-axis.

The last two terms in the above Hamiltonian account for the Zeeman energies associated with the local and global fields. The first two terms account for exchange interaction between nearest neighbors (the angular brackets denote summation over nearest neighbors). We will assume the isotropic case when $J^{\parallel}_{ij} = J^{\perp}_{ij} = J$, where $J$ is the exchange energy, which is nonzero if the wavefunctions in dots $i$ and $j$ overlap in space.

The spins in the quantum dots are polarized in either the $+z$ or $-z$ direction by the global magnetic field (conforming to bits 1 or 0), and we designate the corresponding states as “upspin” (↑) and “downspin” (↓) states, respectively. Recall that the downspin state (aligned antiparallel to the global magnetic field) encodes bit 0 and the upspin state (parallel to the global field) encodes bit 1.

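Obviously, there are 8 independent 3-spin basis states representing the spin configurations in the 3-dot array, which are $|\downarrow\downarrow\downarrow\rangle$, $|\downarrow\downarrow\uparrow\rangle$, $|\downarrow\uparrow\downarrow\rangle$, $|\downarrow\uparrow\uparrow\rangle$, $|\uparrow\downarrow\downarrow\rangle$, $|\uparrow\downarrow\uparrow\rangle$, $|\uparrow\uparrow\downarrow\rangle$, $|\uparrow\uparrow\uparrow\rangle$. In this state representation, the first arrow in every ket is the spin polarization in the left dot, the second arrow is the spin polarization in the central dot, and the third in the right dot. These eight basis functions form a complete orthonormal set. The matrix elements $\langle\phi_m|H|\phi_n\rangle$ are given in the matrix below, where the $\phi$-s are the 3-electron basis states enumerated above. In spinor notation, $|\uparrow\rangle = [1\ 0]^T$ and $|\downarrow\rangle = [0\ 1]^T$.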

In the above matrix, $Z$ is one-half of the Zeeman splitting energy associated with the global magnetic field (i.e., $Z = g\mu_B B/2$), while $h_A$ and $h_B$ are one-half of the Zeeman splitting energies in the left and right input dots caused by the local magnetic fields that write input data ($h_A = g\mu_B B_A/2$; $h_B = g\mu_B B_B/2$, where $B_A$ and $B_B$ are the local flux densities at the two input dots). If the local magnetic field is in the same direction as the global field and writes bit 1, then the corresponding $h$ is positive; otherwise, it is negative. The quantity $Z$ is always positive.

Reference [5] evaluated the eigenenergies and eigenfunctions of the above Hamiltonian for the 4 possible input bit combinations to the NAND gate (0, 0), (0, 1), (1, 0), and (1, 1). It was found that the ground state wavefunctions in the four cases approach the desired states $|\downarrow\uparrow\downarrow\rangle$, $|\downarrow\uparrow\uparrow\rangle$, $|\uparrow\uparrow\downarrow\rangle$, $|\uparrow\downarrow\uparrow\rangle$, respectively, provided $|h_A| \gg J$, $|h_B| \gg J$, and $J > Z/2$. Thus, the ground state spin polarization in the output dot is always the NAND function of the spin polarizations in the input dots, provided the Zeeman splitting caused by the local magnetic fields that “write” input bits in the input dots is much larger than the strength of exchange coupling between nearest neighbors, and the latter, in turn, is larger than one-fourth of the Zeeman splitting caused by the global magnetic field. Therefore, the NAND gate is indeed realized by three spins with nearest neighbor exchange coupling if we satisfy the conditions $|h_A|, |h_B| \gg J$ and $J > Z/2$. Since the NAND gate is universal, any arbitrary combinational or sequential circuit can be implemented by interconnecting NAND gates with a “spin wire” shown in Figure 3.
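The NAND behavior can be checked numerically with a small model. The sketch below is not the calculation of [5]; it assumes an isotropic nearest neighbor Heisenberg Hamiltonian written with Pauli matrices, with sign conventions chosen here so that positive local and global Zeeman terms favor the upspin state (bit 1). The parameter values ($J = 1$, $Z = 0.5$, local field strength 20, all in arbitrary energy units) and the function names are hypothetical; they merely satisfy the stated conditions that the local Zeeman energy greatly exceeds $J$ and that $J > Z/2$. The ground state is found by simple power iteration, so that only the standard library is needed.

```python
from itertools import product

UP, DOWN = +1, -1   # sigma_z eigenvalues; UP encodes bit 1, DOWN encodes bit 0


def build_hamiltonian(h_a, h_b, j=1.0, z=0.5):
    """Return (basis, H) for the 3-dot array in the sigma_z product basis."""
    basis = list(product((UP, DOWN), repeat=3))   # (left, centre, right)
    index = {state: k for k, state in enumerate(basis)}
    dim = len(basis)
    h = [[0.0] * dim for _ in range(dim)]
    for state, k in index.items():
        left, centre, right = state
        # Diagonal part: J*sz.sz exchange plus assumed local and global Zeeman terms.
        h[k][k] = (j * (left * centre + centre * right)
                   - h_a * left - h_b * right
                   - z * (left + centre + right))
        # Off-diagonal part: J*(sx.sx + sy.sy) swaps antiparallel neighbours
        # with matrix element 2J and annihilates parallel neighbours.
        for site in (0, 1):
            if state[site] != state[site + 1]:
                swapped = list(state)
                swapped[site], swapped[site + 1] = swapped[site + 1], swapped[site]
                h[index[tuple(swapped)]][k] += 2.0 * j
    return basis, h


def ground_state(h, iterations=4000):
    """Power iteration on (shift*I - H), which converges to the lowest eigenvector."""
    dim = len(h)
    shift = max(sum(abs(x) for x in row) for row in h)   # Gershgorin bound on |E|
    vec = [1.0] * dim
    for _ in range(iterations):
        new = [shift * vec[r] - sum(h[r][c] * vec[c] for c in range(dim))
               for r in range(dim)]
        norm = sum(x * x for x in new) ** 0.5
        vec = [x / norm for x in new]
    return vec


def nand_output(bit_a, bit_b, h_local=20.0, j=1.0, z=0.5):
    """Read the output bit from the sign of <sigma_z> on the central dot."""
    h_a = h_local if bit_a else -h_local
    h_b = h_local if bit_b else -h_local
    basis, h = build_hamiltonian(h_a, h_b, j, z)
    vec = ground_state(h)
    sz_centre = sum(amp * amp * state[1] for amp, state in zip(vec, basis))
    return 1 if sz_centre > 0 else 0


truth_table = {(a, b): nand_output(a, b) for a in (0, 1) for b in (0, 1)}
```

Under these assumptions the central dot reads 1 for inputs (0, 0), (0, 1), and (1, 0), and 0 for (1, 1). Reading the output from the sign of the central-dot expectation value is a design choice of this sketch; with the local field much larger than $J$, the ground state is close to a product state and the sign is unambiguous.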

A spin wire is a linear array of quantum dots, each containing a single electron, with tunable nearest neighbor exchange interaction. Between each pair, there is a metal gate that is electrically accessed. (This is a lithographically challenging job since the spacing between dots will hardly exceed 10 nm for sufficient exchange coupling strength. However, lithography has now progressed to the point where this is no longer infeasible.). When a positive potential is applied to the gate, it lowers the potential barrier between the flanking dots and allows their resident electrons’ wavefunctions to overlap in space. This turns on the exchange coupling only when the gate pad is activated. Without the positive gate potential, the barrier between dots is so high that exchange coupling is insignificant and the two dots are decoupled. Thus, we can turn the exchange interaction on and off with the gate pad potential.

We will describe in the next subsection how a spin polarization state can be unidirectionally propagated from left to right along the spin wire using a 3-phase clock. Unidirectionality is of paramount importance since, in order to work properly, an input stage in a logic circuit must drive an output stage and not the other way around [33]. In other words, there should be no feedback from the output to the input. In a transistor-based circuit, this is automatically ensured since there is inherent “isolation” between the input and output terminals of a transistor that enforces a master-slave relation between the input and output, forcing logic signal to propagate unidirectionally at all times. Unfortunately, that is not the case with SSL since exchange interaction, which plays the role of interconnecting wire between successive spins, is intrinsically bidirectional. Therefore, we must enforce unidirectionality in some other way. Since we cannot impose unidirectionality in space, we must impose it in time, using a “clock” [6]. This is actually an old idea that has been used to steer logic bits unidirectionally in charge-coupled-device- (CCD-) based shift registers. There, a “push” clock and a “drop” clock are used to enforce unidirectional bit propagation. (This also requires a 3-phase clock.) [34, 35]. In SSL, the clock signal is a sequence of positive voltage pulses that are applied to the gates interposed between each pair of dots. The arrival of a positive voltage pulse temporarily lowers the potential barrier between two adjacent quantum dots and exchange couples their spins. By sequentially exchange-coupling three adjacent dots at a time using a 3-phase clock, the spin state of the leftmost dot can be propagated unidirectionally from left to right in a bucket-brigade fashion [8].

There are other possible clocking schemes for spin wires, one of which is due to Bennett [36]. That scheme can be adapted to SSL as follows. Let us say that we wish to propagate the state (spin polarization) of the $n$th dot in a chain to the right unidirectionally. We will then rotate the spins of the $(n+1)$th dot and $(n+2)$th dot by ~90° to the right by an external agent. When that agent is withdrawn from the $(n+1)$th dot but not the $(n+2)$th dot, the $(n+1)$th dot finds that its exchange interactions with its left and right neighbors are unequal since one neighbor’s spin is pointing down and the other’s spin is pointing to the right (see Figure 4). This breaks the tie and allows the $(n+1)$th dot’s spin to flip up because of the net exchange interaction it experiences. (The “flipping up” happens because it reduces the total energy of the system in this case.) In the next step, the $(n+3)$th dot’s spin is rotated to the right and the rotating agent is removed from the $(n+2)$th dot. The latter’s spin then flips down owing to exchange interaction and the logic bit has propagated from the $n$th dot to the $(n+2)$th dot unidirectionally.

The next important question is what agent can possibly rotate the spin of a targeted dot to the right by 90°. (Whether rotation is to the right or to the left makes no difference. Obviously, either one will work.) That agent is an electric field. The field causes Rashba spin-orbit interaction [37] in the dot [38, 39] and that can rotate the spin by ~90° [40] and implement Bennett clocking. However, it takes a very large voltage to rotate the spin by large angles with this strategy [40], which makes this approach of rotating the spin with a dc voltage extremely energy inefficient. A more energy efficient approach is to place all the dots in a microwave field and apply a much smaller dc voltage pulse to turn on a slight Rashba interaction in a target dot to increase or decrease slightly the total spin splitting energy in that dot caused by the global magnetic field [38, 39]. This can make the total spin splitting energy in the target dot resonant with the photon energy in the global microwave field (ac magnetic field) [41]. Only the target dot’s spin will couple with the microwave field since its spin splitting energy is resonant with the photon energy. This will rotate the spin in the target dot by an arbitrary angle $\theta$ due to Rabi oscillation [42–44]: $\theta = \Omega_R \tau$, where $\Omega_R$ is the Rabi frequency, which is proportional to the amplitude $B_{ac}$ of the ac magnetic field, and τ is the dc pulse duration (the duration for which the dot is resonant with the global microwave field). By adjusting τ and $B_{ac}$, one can make θ = 90°. However, this approach may also require a considerable dc pulse amplitude, albeit less than what would be needed to rotate the spin with the dc potential alone, thereby making it still energy-inefficient. Moreover, the notion of placing a computer within a microwave cavity in order to obtain a sufficiently large $B_{ac}$ is not particularly appealing from an engineering perspective and hence not entirely practical. Thus, the optimum scheme may still be the first approach where the potential barriers between three adjacent dots are lowered with voltage pulses to exchange couple the trio at a time.

2.3. SSL Spin Wire

A spin wire can not only ferry spin logic bits unidirectionally, but it can also perform the role of fan-out, where a signal is split into multiple paths in order to drive multiple stages. This is shown in Figure 4(f). The same strategy can implement fan-in as well.

Finally, one last requirement that wires must satisfy is the function of “crossover” where two wires cross each other in space without interfering with one another. Combinational logic circuits (e.g., adders and subtractors) do not always need crossover, but sequential circuits (e.g., flip-flops) will require feedback of an output state to an input state, and therefore crossover. This is the most challenging requirement and normally will be implemented with multiple layers of dots where a dot in one layer is sufficiently distant from the nearest dot in the closest layer to avoid significant exchange coupling. As a result, combinational logic is usually easier to implement in SSL than sequential logic.

2.4. The Toffoli-Fredkin Gate with SSL

The NAND gate is a universal Boolean logic gate, but it is logically irreversible, meaning that we cannot infer the input bits if we have knowledge of only the output bit. For example, if the output bit is 0, then we can state with certainty that the input bits must have been (1, 1), but if the output bit is 1, then we could not tell whether the inputs were (1, 0), (0, 1), or (0, 0). A logically reversible universal gate is the Toffoli-Fredkin gate [45] which has three inputs A, B, and C and three outputs $A'$, $B'$, and $C'$. Knowledge of the output bits of this gate allows us to infer the input bits uniquely. This is often a very desirable trait since it is believed that logically reversible gates can, in principle, be physically reversible and not dissipate any energy at all [31, 32].

The truth table of the Toffoli-Fredkin (T-F) gate is as in Table 1.

It is clear that the input-output relation can be expressed as $A' = A$, $B' = B$, $C' = C \oplus (A \cdot B)$, where $\oplus$ represents the logical exclusive OR operation and $\cdot$ represents the logical AND operation. Therefore, two of the output bits ($A'$, $B'$) replicate the corresponding input bits (A, B), called the control bits, while the third bit $C$ replicates itself unless both $A$ and $B$ are 1. In the latter case, it flips. Note that the gate is logically reversible since we can uniquely deduce the input bits A, B, and C from the output bits $A'$, $B'$, and $C'$.
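The truth table and the reversibility claim can be checked by brute force. The sketch below assumes the relation described in the prose, namely that the control bits pass through unchanged and the target bit is XORed with the AND of the control bits; the function and variable names are chosen here for illustration.

```python
from itertools import product


def toffoli_fredkin(a, b, c):
    """Control bits a, b pass through; target bit c flips only when a = b = 1."""
    return a, b, c ^ (a & b)


# Enumerate all eight input combinations and their outputs.
table = [(bits, toffoli_fredkin(*bits)) for bits in product((0, 1), repeat=3)]

# Logical reversibility: distinct inputs must map to distinct outputs.
outputs = [out for _, out in table]
is_reversible = len(set(outputs)) == len(outputs)

# Applying the gate twice recovers the original input.
is_self_inverse = all(toffoli_fredkin(*out) == bits for bits, out in table)

if __name__ == "__main__":
    for bits, out in table:
        print(bits, "->", out)
```

Only the two rows with both control bits equal to 1 differ between input and output. Because the gate is its own inverse, the inputs can always be recovered from the outputs.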

The Toffoli-Fredkin (T-F) gate can be realized with the same 3-dot array as the NAND gate. The spin orientations in the two peripheral dots will represent the control bits $A$ and $B$, while that in the central dot will represent the target bit $C$. The dots are placed in a global magnetic field pointing in the upspin direction as in Figure 2. As before, spin orientations antiparallel to the global magnetic field (downspin) will represent bit 0 and those parallel to the global field (upspin) will represent bit 1. The same Hamiltonian as in (2) will represent the system. It can be shown [46] that the T-F gate can be implemented as long as one-half of the Zeeman splitting caused by the local magnetic fields that orient the spins in the input dots greatly exceeds the exchange coupling energy, that is, $|h_A|, |h_B| \gg J$, and $J > Z/2$, where $2Z$ is the Zeeman splitting due to the global magnetic field. (Reference [46] had a different convention where the magnetic field pointed in the downspin direction and spin polarization parallel to the field represented bit 0. The convention used in this article is equally valid.)

Provided the above conditions are met, one can show [46] that when $A = B = 0$, the ground state spin configuration in the array approaches the many-body state $|\downarrow\uparrow\downarrow\rangle$ (antiferromagnetic ordering) and the first excited state is approximately $|\downarrow\downarrow\downarrow\rangle$ (ferromagnetic ordering). Therefore, when the array is in the ground state, $C = 1$ and when it is in the first excited state, $C = 0$. It can also be shown that when $A = B = 1$, the ground state is approximately $|\uparrow\downarrow\uparrow\rangle$ (antiferromagnetic) and the first excited state is approximately $|\uparrow\uparrow\uparrow\rangle$ (ferromagnetic). This time, $C = 0$ in the ground state and $C = 1$ in the first excited state. Finally, when $A$ and $B$ are logic complements of each other (dissimilar control bits), $C = 1$ in the ground state and $C = 0$ in the first excited state. Of course, we expect all of these to happen in any case since we expect the system to behave as a NAND gate. However, what we intend to focus on now is the energy differences between the first excited state and the ground state for the four different control bit combinations (1, 1), (0, 1), (1, 0), and (0, 0) since that will be the key to implementing the T-F gate with this array.

If we designate the energy difference between the first excited state and the ground state of the 3-spin system as for different control bit combinations (A, B), then we can show that [46]

The key to implementing the T-F gate is the fact that the energy difference between the first excited state and the ground state of the array depends on the control bits $A$ and $B$; in particular, its value for $A = B = 1$ differs from its values for the other control bit combinations. Note also that in every case, the difference between the excited state and the ground state of the array is only in the spin polarization of the central dot. Hence, we can view this energy difference as essentially the spin splitting energy in the central dot for different states of the control bits $A$ and $B$. The fact that the spin splitting energy in the central dot depends on the spin polarizations of the electrons in the peripheral dots is a consequence of exchange interaction.

In order to implement the truth table of the T-F gate, we will excite the 3-dot system with a $\pi$-pulse of angular frequency $\omega$ that is resonant with the spin splitting in the central dot when $A = B = 1$ (i.e., $\hbar\omega$ equals the energy difference between the first excited state and the ground state when both control bits are 1). This means that we will turn on an ac magnetic field of angular frequency ω and amplitude $B_{ac}$ for a duration $\tau$ such that the resulting Rabi rotation angle is $\pi$. This will make the spin in the central dot flip (i.e., rotate by an angle $\pi$) if the spins in the peripheral dots are both upspin. Hence, if and only if $A = B = 1$, $C$ will flip (from 0 to 1 or 1 to 0). Otherwise, it will retain its previous state. This realizes the truth table of the T-F gate. Note that no energy is dissipated in the operation of the gate since the flipping of the spin in the central dot occurs by coherently absorbing a photon from the ac magnetic field (microwave).
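The selectivity of the pulse can be illustrated with a toy model. The sketch below does not reproduce the expressions of [46]. It assumes the limit in which the peripheral spins are rigidly pinned by the local fields, so that the central spin sees only an Ising-like exchange term from its two neighbors and the global Zeeman term, with the same sign conventions and hypothetical parameter values ($J = 1$, $Z = 0.5$, arbitrary units) as before. A pulse is treated as flipping the central spin only when its photon energy matches the central-dot splitting exactly, which idealizes a resonance of finite width.

```python
def central_splitting(bit_a, bit_b, j=1.0, z=0.5):
    """Energy gap between the two central-spin orientations, inputs held fixed."""
    s_a = 1 if bit_a else -1
    s_b = 1 if bit_b else -1

    def energy(s_c):
        # Exchange with both neighbours plus the (assumed) global Zeeman term.
        return j * s_c * (s_a + s_b) - z * s_c

    return abs(energy(+1) - energy(-1))


def flips_under_pulse(bit_a, bit_b, pulse_energy, tol=1e-9, **params):
    """Idealized resonance: the central spin flips only on an exact energy match."""
    return abs(central_splitting(bit_a, bit_b, **params) - pulse_energy) < tol


def tf_gate(a, b, c, **params):
    """Apply a pi-pulse tuned to the A = B = 1 splitting and return (A', B', C')."""
    pulse_energy = central_splitting(1, 1, **params)
    flipped = flips_under_pulse(a, b, pulse_energy, **params)
    return a, b, (1 - c) if flipped else c


gaps = {(a, b): central_splitting(a, b) for a in (0, 1) for b in (0, 1)}
```

With these parameters the splitting is 3.0 for (1, 1), 5.0 for (0, 0), and 1.0 for dissimilar control bits, so a pulse tuned to the (1, 1) value leaves the target bit untouched in the other three cases.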

There have been numerous ideas for physical implementation of the T-F gate [47–49]. What we have described above is the first SSL implementation.

2.5. Energy Dissipation in SSL

We mentioned at the outset that SSL should be very energy efficient and dissipate very little energy to carry out logic operations. It therefore behooves us to provide some concrete estimates of energy dissipation.

There are two sources of energy dissipation in generic SSL: internal dissipation in the gate while it switches in response to changed input bits, and dissipation in the clock that steers bits unidirectionally in a spin wire. We examine both below.

2.5.1. Gate Dissipation

Reference [5] showed that the energy dissipated in a NAND gate operation is approximately $g\mu_B B$, which also happens to be the energy difference between the two antiparallel spin states in any isolated dot that is not subjected to any external field other than the global field. Furthermore, it was shown that if the coupled spin system is in thermal equilibrium and governed by Boltzmann statistics, then the energy $g\mu_B B$ is also equal to $kT\ln(1/p)$, where $p$ is the probability of gate error caused by spins straying from the many-body ground state (which represents the correct gate result) into many-body excited states by absorbing phonons or magnons. (This result, although obvious for an isolated spin, is not obvious for a 3-spin system forming a NAND gate. Reference [5] proved this result rigorously.) Remarkably, this energy, $kT\ln(1/p)$, is the minimum energy that any irreversible gate must dissipate in a single logic operation as long as the gate is in thermodynamic equilibrium with the environment, and the switching is carried out abruptly without any time modulated potential, by taking the system from one state to another.

The energy dissipated in a gate operation, as well as the strength of the global magnetic field, is therefore determined by how much gate error probability can be tolerated at a given temperature. If the error probability cannot exceed $10^{-15}$, then the energy dissipated in a gate operation will be 34.5kT at any temperature. Since this energy is also equal to $g\mu_B B$, one must choose the global magnetic field strength such that $g\mu_B B = 34.5kT$. With , .
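The relation between tolerable error probability, gate dissipation, and global field can be checked numerically. The following is a minimal sketch, assuming the relation $g\mu_B B = kT\ln(1/p)$ discussed above; the function names are illustrative, and the temperature of 1 K and the InSb g-factor of −51 are example inputs chosen here, not values prescribed at this point in the text.

```python
import math

K_B = 1.380649e-23       # Boltzmann constant (J/K)
MU_B = 9.2740100783e-24  # Bohr magneton (J/T)


def gate_energy_in_kT(p: float) -> float:
    """Minimum gate dissipation kT*ln(1/p), expressed in units of kT."""
    return -math.log(p)


def global_field_tesla(p: float, temperature_k: float, g_factor: float) -> float:
    """Field B that makes the Zeeman splitting |g|*mu_B*B equal kT*ln(1/p)."""
    return K_B * temperature_k * gate_energy_in_kT(p) / (abs(g_factor) * MU_B)


# Example inputs (assumed for illustration): p = 1e-15, T = 1 K, g = -51.
print(f"{gate_energy_in_kT(1e-15):.1f} kT")
print(f"{global_field_tesla(1e-15, 1.0, -51):.2f} T")
```

Because the required field scales linearly with temperature and inversely with the g-factor, the same helper can be reused for other host materials.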

2.5.2. Clock

The clock in SSL causes additional dissipation. For nonadiabatic clocking, the energy dissipated in the clock will be ~$CV^2$, where $C$ is the capacitance of the clock pad and $V$ is the amplitude of the clock pulse. This energy depends on the clocking mechanism. It will be very high for Bennett clocking and presumably much lower if we merely modulate the tunneling barrier between neighboring dots. In any case, it should be considerably larger than the thermal energy kT to protect against thermal noise [50]. Let us assume that the clock amplitude is 10 times larger than the thermal voltage fluctuation on the clock pad, which is $\sqrt{kT/C}$, resulting in a signal-to-noise ratio of 10 : 1 or 20 dB. Therefore, the clock dissipation will be ~100kT per cycle. In principle, this energy can be reduced to zero by using an RLC circuit (comprising a resistor in series with a parallel combination of an inductor and capacitor) to carry out the clocking, where the dot acts as the capacitor. The clock should be a sinusoid whose frequency is the resonant frequency of the RLC circuit. However, it is technologically challenging to string an inductor across a quantum dot of diameter ~10 nm, making this somewhat impractical.

It should be clear now that there are two sources of dissipation in an SSL circuit: the clock and the gate. The former could dissipate about 100kT per clock cycle and the latter dissipates $kT\ln(1/p)$ per bit flip, which will be 34.5kT if we operate with a bit error probability of $10^{-15}$. Therefore, the total dissipation per clock cycle per bit is potentially ~134.5kT, which is considerably less than the ~50,000kT that present CMOS transistors dissipate [51].
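The energy budget above can be reproduced with a few lines of arithmetic. This is a rough sketch, assuming the clock energy is $CV^2$ with $V$ equal to the signal-to-noise ratio times the thermal voltage $\sqrt{kT/C}$ (so that the capacitance cancels); the function names are hypothetical.

```python
import math

CMOS_ENERGY_KT = 50_000  # figure quoted in the text for present CMOS transistors


def clock_energy_in_kT(snr: float) -> float:
    """C*V^2 with V = snr*sqrt(kT/C) reduces to snr^2, in units of kT."""
    return snr ** 2


def total_energy_in_kT(snr: float, p: float) -> float:
    """Clock dissipation plus gate dissipation kT*ln(1/p), in units of kT."""
    return clock_energy_in_kT(snr) - math.log(p)


total = total_energy_in_kT(snr=10, p=1e-15)
print(f"total = {total:.1f} kT, CMOS/SSL ratio = {CMOS_ENERGY_KT / total:.0f}")
```

The clock term dominates the budget, which is why the choice of clocking mechanism matters more than the gate itself.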

2.6. The Speed of SSL

The speed of SSL (i.e., the maximum allowable clock frequency) is determined by four factors: (1) the speed with which an input bit can be written in an input port by the writing agent, (2) the speed with which an output bit can be read in an output port by the reader, (3) the gate switching speed, and (4) whether or not the architecture is pipelined. If the architecture is pipelined, then the clock speed is limited by the lowest of the other three speeds.

2.6.1. Pipelining in SSL

Fortunately, SSL is a pipelined architecture. The clock in SSL not only propagates signals unidirectionally, but it also invariably makes the architecture pipelined. To understand this concept, consider the spin wire in Figure 3. The input bit is applied to the leftmost dot by aligning its spin in the up-direction with an external agent. This is done during the first clock cycle. In the next cycle, the potentials in the first two gate pads are raised to cause nearest neighbor exchange coupling between the first three dots which then order their spins in the antiferromagnetic configuration. In the third cycle, the potential in the first gate pad is lowered, while that in the second gate pad is held, and that in the third gate pad is raised to cause nearest neighbor coupling between the second, third, and fourth dots. This ensures antiferromagnetic ordering within this latter trio which successfully orients the fourth dot’s spin antiparallel to the input spin. In the fourth cycle, the potential at the second gate pad is lowered, that in the third gate pad is held high and that in the fourth gate pad is raised, which successfully transfers the input bit applied at the first dot to the fifth dot, thereby ensuring unidirectional signal propagation along the wire.

The point to note here is that as soon as the potential in the first gate pad is lowered in the third cycle, the first dot is decoupled from the chain, and the input applied to this dot can then be changed without affecting successful replication of the original input bit in the fifth dot as described above. In other words, the input can be changed during the fourth cycle regardless of how long the chain is. During the fifth clock cycle, when the first and second gate pads’ potentials are raised again to exchange couple the first three dots, the original input bit has already propagated down the chain (to the sixth dot) and is decoupled from the input side since the third gate potential has been lowered in the fifth cycle, which decouples the input side from the output side. Thus, the traveling bit will not be affected by the new input. In other words, a new input bit can be fed to the spin wire before the earlier input makes it to the very end of the wire. Therefore, the input bits can be pipelined. The reader should be able to determine that in this case, the input bit rate will be only one-third of the clock rate.
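The cycle-by-cycle description above amounts to a three-phase schedule for the gate pads. The sketch below is one way to tabulate it, assuming that pad k sits between dots k and k+1, that pads c−1 and c are raised in cycle c for the first bit, and that a new bit is launched every third cycle (the maximum rate). The function names and the modular rule are illustrative and inferred from the text, not taken from the source.

```python
def pad_is_high(pad: int, cycle: int) -> bool:
    """Return True if gate pad `pad` (1-indexed) is raised during `cycle`.

    A pad is high when it is the trailing or leading pad of an active pair.
    Active pairs advance by one pad per cycle and are spaced three pads apart.
    """
    if cycle < 2 or pad > cycle:
        return False  # clocking starts in cycle 2; the wavefront has not arrived
    trailing = (cycle - 1 - pad) % 3 == 0
    leading = pad >= 2 and (cycle - pad) % 3 == 0
    return trailing or leading


def raised_pads(num_pads: int, cycle: int) -> list[int]:
    """List the pads that are high in a given cycle."""
    return [p for p in range(1, num_pads + 1) if pad_is_high(p, cycle)]


for cycle in range(2, 9):
    print(cycle, raised_pads(6, cycle))
```

In this schedule pads 1 and 2 are raised together only in cycles 2, 5, 8, and so on, which is the one-third input bit rate mentioned above.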

The pipelining, however, comes with a serious fabrication penalty since gate pads must now be interposed between every pair of dots in order to apply a local potential independently between any chosen pair to exchange couple them. We call this scheme of clocking “granular clocking” since every pair has its own clock pad. This increases the fabrication complexity and cost and limits the bit density on a chip. However, the alternative is a nonpipelined architecture which will be extremely slow and hence unacceptable.

One intriguing possibility to have the best of both worlds (pipelined and yet no separate clock pad for each pair) is to launch a guided electromagnetic wave in a waveguide built underneath a spin wire. When the crest of the wave arrives at a set of dots, the corresponding gate pad voltages are raised. Since we need to address two neighboring gate pads at a time, the wavelength of this wave should be roughly the distance spanned by four gate pads in order to maintain pipelining. This distance may be roughly 100 nm, requiring ultraviolet waves. This idea allows pipelining of data without requiring separate electrical connections to every gate pad and therefore appears to be very attractive. However, this is also fraught with some danger since the magnetic field in the electromagnetic wave may interfere with the spin states.

Another possibility is to launch a traveling magnetic field pulse in a waveguide buried underneath the spin wire. This field is not collinear with the global field. A quantum dot positioned at the crest of this pulse experiences a net magnetic field that is at an angle with the global field. The spin in this dot will align with the local field and hence will be slanted with respect to the global field. If the input bit propagates synchronously with this pulse, it can propagate unidirectionally in the wake of the pulse. This method too does not require individual connections to every quantum dot to implement a pipelined architecture.

2.6.2. The Clock Speed in SSL

Once it has been established that SSL is a pipelined architecture, we have to next determine the writing speed, the reading speed, and the gate switching speed in order to ascertain which is the slowest among them. The slowest speed will determine the maximum allowable clock speed.

2.6.3. Writing Speed

The speed with which an input bit can be written in an input port depends on the flux density of the local field $B_{\text{local}}$. Reference [5] showed that this field must be strong enough that the Zeeman splitting it causes in the input dot is at least 20 times larger than the exchange coupling strength between dots. The latter can be about 1 meV in semiconductor dots [52]. Therefore, $B_{\text{local}} \gtrsim 7$ Tesla in InSb quantum dot systems, where we have assumed the $g$-factor of bulk InSb, which is −51. The $g$-factor in quantum dots can be less than in bulk, which will increase $B_{\text{local}}$.

This analysis clearly shows that writing of bits calls for a Herculean feat since generating ~7 Tesla of magnetic field locally is a very tall order. There are some materials like InSb$_{1-x}$N$_x$ which reportedly have $g$-factors as large as 900 in the bulk [53]. Assuming that the same $g$-factor can be retained in quantum dots, the value of the local field $B_{\text{local}}$ needs to be only ~0.4 Tesla, if one employs InSb$_{1-x}$N$_x$ quantum dots as hosts for the spins. Generating field strengths of this magnitude locally is still quite demanding.

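The time required to complete the “writing” of input bits in isolated input dots is of the order of $\sim h/(g\mu_B B_{\text{local}})$, the inverse of the Zeeman splitting frequency in the local field $B_{\text{local}}$. The value of $B_{\text{local}}$ estimated above yields the writing time as ~0.1 ps, which is indeed very fast and clearly will not be the limiting factor for clock speed.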
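The field and time estimates in this subsection follow from two short formulas. Here is a minimal sketch, assuming a Zeeman splitting of 20 times the exchange energy and a writing time of order $h$ divided by that splitting; whether $h$ or $\hbar$ is the more appropriate prefactor is an assumption, and both give results of order 0.1 ps. Function names are hypothetical.

```python
H_PLANCK = 6.62607015e-34  # Planck constant (J s)
MU_B = 9.2740100783e-24    # Bohr magneton (J/T)
EV = 1.602176634e-19       # joules per electron-volt


def zeeman_splitting_j(exchange_mev: float, ratio: float = 20.0) -> float:
    """Required Zeeman splitting: `ratio` times the exchange energy, in joules."""
    return ratio * exchange_mev * 1e-3 * EV


def write_field_tesla(exchange_mev: float, g_factor: float) -> float:
    """Local field needed to produce the required Zeeman splitting."""
    return zeeman_splitting_j(exchange_mev) / (abs(g_factor) * MU_B)


def write_time_s(exchange_mev: float) -> float:
    """Order-of-magnitude writing time h / (Zeeman splitting)."""
    return H_PLANCK / zeeman_splitting_j(exchange_mev)


print(f"InSb (g = -51): {write_field_tesla(1.0, -51):.1f} T")
print(f"g = 900:        {write_field_tesla(1.0, 900):.2f} T")
print(f"write time:     {write_time_s(1.0) * 1e12:.2f} ps")
```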

2.6.4. Reading

There are many strategies to “read” the spin polarization of single electrons in quantum dots [11–13], among which the scheme of [13] is best suited to SSL. In [13], the reading time was of the order of a millisecond. This time is determined by the speed with which electrons can tunnel in and out of the dot and therefore one should be able to increase this speed dramatically with better engineered structures. Again, this should not be the limiting factor to determine clock speed.

2.6.5. Gate Switching Speed

The gate switching speed is determined by how long it takes for a gate to complete a logic operation. That, in turn, depends on how fast the coupled spin system can relax to the ground state when coupled with the external thermal bath. This time is much shorter than the spin relaxation time of a single isolated spin for essentially the same reasons that the ensemble averaged spin dephasing time of many interacting spins is orders of magnitude shorter than the dephasing time of a single isolated spin [54, 55]. There are no reports of any measurement of spin relaxation times in coupled (as opposed to isolated) quantum dots. However, there are numerous ways to shorten this time, for example, by implanting magnetic impurities in the barriers. It should be possible to reduce this time to ~1 ns.

It is clear now that among the three speeds, the gate switching speed and the reading speed are the slowest and therefore will determine the clock speed. Assuming reading times and gate switching times of ~1 nanosecond, the maximum clock frequency will be ~1 GHz.

2.7. The Gate Error Probability in SSL

There are two types of gate error in SSL: (1) the intrinsic error caused by the coupled spin system in a gate occupying thermally excited states instead of the ground state with probability $p$; (2) the extrinsic error caused by a spin in a dot flipping spontaneously during a clock period (due to coupling with the environment), whose probability is given by

$$p_{\text{extrinsic}} = 1 - e^{-T_c/\tau_s}, \tag{9}$$

where $T_c$ is the clock period and $\tau_s$ is the spin flip time of an isolated spin. Spin flip times of an isolated spin as long as 1 second have been demonstrated in GaAs quantum dots at very low temperatures of 120 mK [56] and in organic nanostructures at much higher temperatures of 100 K [57]. Assuming $T_c$ = 1 nsec and $\tau_s$ = 1 sec at the operating temperature, $p_{\text{extrinsic}} \approx 10^{-9}$, which is acceptable.
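The extrinsic error estimate is easy to verify. A small sketch follows, assuming the exponential form $1 - e^{-T_c/\tau_s}$ for the spontaneous flip probability; `math.expm1` is used so that the result stays accurate when the ratio of the two times is tiny. The function name is illustrative.

```python
import math


def extrinsic_error(clock_period_s: float, spin_flip_time_s: float) -> float:
    """Probability 1 - exp(-Tc/tau_s) that a spin flips within one clock period."""
    return -math.expm1(-clock_period_s / spin_flip_time_s)


# 1 ns clock period with a 1 s spin flip time, as assumed in the text.
print(extrinsic_error(1e-9, 1.0))
```

For a clock period much shorter than the spin flip time, the probability reduces to the simple ratio of the two.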

2.8. The Temperature of Operation of SSL

Reference [5] showed that if we want a fixed intrinsic error probability $p$, then the temperature of operation is determined by the condition (the condition for SSL to work is that $g\mu_B B \le 2J$):

$$kT\ln(1/p) = g\mu_B B \le 2J, \tag{10}$$

where $J$ is the energy of exchange coupling between neighboring dots. Assuming J = 1 meV, which is achievable with today’s quantum dot technology [52], the maximum operating temperature turns out to be ~1 K if we operate with an intrinsic error probability of $10^{-9}$. This is a very low temperature and requires He3 cooling, which is a serious disadvantage and essentially precludes SSL from being a serious contender for general purpose computing (although niche applications are still a possibility). Room temperature operation with such low error probability would have required exchange coupling strengths in excess of 300 meV, which is not presently achievable with semiconductor quantum dot technology.

Had we operated at room temperature with the presently achievable J = 1 meV, then the bit error probability would have been ~0.93, which is clearly unacceptable. At 4.2 K temperature (which requires He4 cooling instead of the more demanding He3 cooling), the bit error probability would have been $4 \times 10^{-3}$, which may be acceptable in some situations if significant error correction resources are available.

A recent development has altered this prognosis dramatically. It has been shown that graphene nanoflakes can implement SSL-type logic gates with much higher exchange interaction strength (2J = 180 meV), which allows room-temperature operation with a bit error probability of ~$10^{-3}$ [58]. This is a very exciting and promising route for SSL and may revive interest in SSL since it establishes a clear pathway for practical implementation.
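The temperature and error figures quoted in this subsection can be cross-checked numerically. The sketch below assumes the Boltzmann form $p = e^{-2J/kT}$, which is what the relation $kT\ln(1/p) = 2J$ implies at the maximum operating temperature; the function names are hypothetical.

```python
import math

K_B_MEV = 8.617333262e-2  # Boltzmann constant (meV/K)


def intrinsic_error(exchange_mev: float, temperature_k: float) -> float:
    """Intrinsic gate error probability exp(-2J/kT)."""
    return math.exp(-2.0 * exchange_mev / (K_B_MEV * temperature_k))


def max_temperature_k(exchange_mev: float, p: float) -> float:
    """Highest temperature at which the error probability stays at or below p."""
    return 2.0 * exchange_mev / (K_B_MEV * -math.log(p))


print(f"T_max (J = 1 meV, p = 1e-9): {max_temperature_k(1.0, 1e-9):.2f} K")
print(f"p at 4.2 K, J = 1 meV:       {intrinsic_error(1.0, 4.2):.1e}")
print(f"p at 300 K, J = 1 meV:       {intrinsic_error(1.0, 300.0):.2f}")
print(f"p at 300 K, 2J = 180 meV:    {intrinsic_error(90.0, 300.0):.1e}")
```

The exponential dependence on J/kT is why a modest increase in exchange strength changes the feasible operating temperature so sharply.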

Equation (10) also yields the value of the global dc magnetic field $B$ required for operating at 1 K with an error probability of $10^{-9}$. In an InSb quantum dot with g = −51, $B$ will be 0.7 Tesla, which is easily achieved. If the quantum dot material has a $g$-factor of 900 [53], then the required strength of $B$ is only 0.04 Tesla. These field strengths can be easily achieved with permanent magnets.

2.9. Current Experimental Status of SSL

To our knowledge, SSL has never been demonstrated experimentally, but the pathways to low temperature demonstration are clear. This architecture requires the delineation of an array of quantum dots, each containing a single electron, in specific topological patterns on a wafer. Neighboring dots must be spaced within ~10 nm to allow significant exchange coupling between nearest neighbor spins, and gate pads must be inserted between every pair of dots to allow clocking. The lithography is undoubtedly challenging, but not daunting to the point of being unrealistic.

Numerous groups have demonstrated arrays of quantum dots with single electron occupancy [9] and manipulation of single electron spins in isolated quantum dots has also been demonstrated by a number of groups recently [14–27]. These results inspire hope that SSL, which only requires single electron dots with nearest neighbor exchange coupling, is within the reach of current technology. The only major challenge is the alignment of gate pads between every pair of dots with a high degree of reliability. Recent demonstration of field effect transistors with 6 nm gate length [59] shows that lithography is advancing to the level where such challenges can be met.

3. Nanomagnetic Logic: Computing with Spin Ensembles

The major drawback of SSL is that it requires cryogenic operation because (i) exchange interaction between spins confined in semiconductor quantum dots is very weak, and yet it has to exceed the thermal energy kT manyfold in order to have small error probability p (see (10)); (ii) higher temperatures increase the spontaneous spin flip rate dramatically and hence increase the extrinsic error probability rapidly (see (9)). These two limitations make SSL a low-temperature technology.

Therefore, it behooves us to look at other systems that behave like SSL but are much more error-resilient (have much smaller $p$ at any temperature) and do not necessarily operate with exchange interaction. One such system is an array of single-domain nanomagnets each consisting of roughly $10^4$ spins, all of which rotate or flip in unison under external stimuli. Thus, all the ~$10^4$ spins act like one giant classical spin with ~$10^4$ times the magnetic moment [60, 61]. The single-domain nanomagnets interact with each other via dipole coupling which can be easily ~1000 times stronger than exchange coupling. Furthermore, the magnetization of a nanomagnet is much more stable than the spin polarization of a single electron, that is, the probability of spontaneous flipping is much smaller at any given temperature. One can replicate SSL with nanomagnets instead of single electron spins. These systems have been termed magnetic quantum cellular automata [62, 63] and are essentially nothing but nanomagnetic versions of SSL with a single-domain nanomagnet replacing a single spin, and dipole interaction replacing exchange interaction.

While a single electron’s spin is made bistable by placing it in a magnetic field, a nanomagnet’s magnetization orientation cannot be made bistable in the same fashion. Instead, one can make the shape of the magnet “anisotropic” as in an elliptical cylinder whose major axis dimension exceeds that of the minor axis. Because of the anisotropic shape, the magnetization vector of this magnet has two (mutually antiparallel) stable orientations along the major axis, which is called the “easy axis” since it is easier for the magnetization to align along this axis compared to any other direction. Only these two orientations are stable because of the so-called “shape anisotropy energy” of the magnet, which makes the minimum energy state correspond to magnetization alignment along the easy axis. Thus, just like the spin of a single electron placed in a magnetic field, a shape-anisotropic nanomagnet has two stable states: parallel and antiparallel to the easy axis. Unlike in the case of single spin, however, where the stable and metastable states were not energetically degenerate and were separated by the Zeeman splitting energy $g\mu_B B$, here the two states are energetically degenerate. We can intentionally make them nondegenerate by applying a magnetic field along the easy axis, but that is not necessary.

The minimum energy barrier separating the two stable states in a shape-anisotropic single-domain nanomagnet is related to the degree of shape anisotropy and is given by

$$E_b = \frac{\mu_0}{2} M_s^2 \Omega \left(N_{d\text{-}yy} - N_{d\text{-}zz}\right), \tag{12}$$

where $\mu_0$ is the permeability of free space, $M_s$ is the saturation magnetization of the magnet per unit volume (~$5 \times 10^5$ A/m for common materials like nickel and cobalt), $\Omega$ is the nanomagnet’s volume, and $N_{d\text{-}zz}$, $N_{d\text{-}yy}$ are the demagnetization factors along the $z$- and $y$-axes (the major and minor axes of the ellipse), respectively. The demagnetization factors for elliptical cylinders are given in [64] as functions of $a$, $b$, and $l$, where $a$ is the major axis, $b$ is the minor axis, and $l$ is the thickness of the nanomagnet. Note that . If we choose a = 105 nm, b = 95 nm, and l = 6 nm, then the minimum energy barrier in a nickel or cobalt nanomagnet shaped like an elliptical cylinder is ~34kT at room temperature.

The probability that the magnetization of the shape anisotropic magnet will spontaneously flip in a period of time τ is $p = 1 - e^{-\tau/\tau_r}$, where $\tau_r$ is the “magnetic retention time” given by $\tau_r = \tau_0 e^{E_b/kT}$, with $1/\tau_0$ being the “attempt frequency”; the attempt period $\tau_0$ is between 1 ps and 1 ns [65]. Therefore, at room temperature (kT = 26 meV), $\tau_r$ is between 588 and 588,000 seconds since $E_b = 34kT$. For τ = 1 ns (or clock frequency of 1 GHz), we have the condition $\tau \ll \tau_r$. That makes $p \approx \tau/\tau_r$, which is between ~$1.7 \times 10^{-15}$ and ~$1.7 \times 10^{-12}$ at room temperature. Thus, clearly, room temperature operation is possible with very high error-resilience. Note that . This is what makes a nanomagnet, consisting of many interacting spins, much more robust than a single isolated spin.
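The retention time and flip probability can be evaluated directly from the barrier height. The following is a minimal sketch, assuming the Arrhenius form $\tau_r = \tau_0 e^{E_b/kT}$ and an exponential flip probability over the clock period; the function names are illustrative, and the small difference from the 588 s quoted above comes only from rounding of the barrier height.

```python
import math


def retention_time_s(barrier_in_kT: float, attempt_period_s: float) -> float:
    """Magnetic retention time tau_0 * exp(E_b / kT)."""
    return attempt_period_s * math.exp(barrier_in_kT)


def flip_probability(period_s: float, retention_s: float) -> float:
    """Probability 1 - exp(-tau/tau_r) of a spontaneous flip within `period_s`."""
    return -math.expm1(-period_s / retention_s)


for tau_0 in (1e-12, 1e-9):  # attempt period between 1 ps and 1 ns
    tau_r = retention_time_s(34.0, tau_0)
    print(f"tau_0 = {tau_0:.0e} s: tau_r = {tau_r:.3g} s, "
          f"p(1 ns) = {flip_probability(1e-9, tau_r):.2g}")
```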

It will be natural to assume that if a single-domain nanomagnet contains ~$10^4$ spins, then the energy dissipated in flipping the magnetization of the magnet will be ~$10^4$ times higher than in flipping a single spin, that is, the minimum dissipation will be $NkT\ln(1/p)$, where $N$ (~$10^4$) is the number of spins in the magnet and $p$ is the probability of spontaneously flipping a single spin ().

The authors of [61] have shown this assumption to be flawed. In a single-domain magnet, all the spins collectively behave as one giant single spin [60] and rotate together in unison because the strong exchange interaction among them keeps them mutually parallel at all times. As long as the exchange interaction strength is much larger than kT, this will happen at any temperature $T$. Thus, there is a single degree of freedom for the spins and not $N$ independent degrees of freedom. As a result, the minimum energy dissipated to switch a single spin and the minimum energy dissipated to switch a single magnet consisting of many spins are roughly the same, that is, in both cases, this energy is ~$kT\ln(1/p)$ and not $NkT\ln(1/p)$! This remarkable result makes the idea of replacing a single spin with a single magnet worth pursuing.

The above discussion reveals why magnet-based switches are potentially much more energy efficient than transistor-based switches. In a nanotransistor, where there are $N$ charges (information carriers) in the channel, the minimum energy dissipation will indeed be $NkT\ln(1/p)$ because each charge represents an independent degree of freedom, but in a single-domain nanomagnet, it can be only ~$kT\ln(1/p)$. Thus, the magnet has an intrinsic advantage over the transistor, particularly when $N \gg 1$. To summarize, there are two reasons why magnets may replace transistors in digital logic systems: (1) the elimination of the $I^2R$ dissipation (in principle, no current flow should be needed to switch a magnet), (2) the collective interaction between spins which makes the minimum energy dissipation in a magnet much less than that in a transistor when both contain the same number of information carriers (electron charges or electron spins). Magnets also suffer from no “leakage” unlike transistors, which increases their energy efficiency even more.

3.1. Switching a Nanomagnet: Penny-Wise and Pound-Foolish

There are two sources of energy dissipation in switching nanomagnets: (1) the internal energy dissipated when the magnetization flips (its minimum value is $kT\ln(1/p)$ but the actual value may be somewhat higher); (2) the energy dissipated in the switching circuitry, which depends on the method of switching.

The internal energy dissipation in a magnet is typically small because of the collective interaction between spins as discussed, but unless one is judicious in the choice of the switching methodology, the energy dissipated in the external switching circuit may become overwhelming and completely erase the magnet’s advantage over the transistor. Thus, in order to avoid being penny-wise and pound-foolish, one must employ energy efficient switching strategies for flipping the magnetizations of single-domain shape-anisotropic nanomagnets.

The traditional method of switching nanomagnets is to generate a local magnetic field in the vicinity of a magnet with a current [63, 66]. The current flows in a loop circling the magnet. The magnetic field generated by this current is given by Ampere’s law:

$$I = \oint \vec{H} \cdot d\vec{l}, \tag{14}$$

where the line integral is taken around the loop in which the current flows.

The last equation relates the minimum current $I_{\min}$ needed to flip the magnetization to the minimum magnetic field $H_{\min}$ that can overcome the energy barrier $E_b$ in (12) and make the magnetization switch from one stable state to the other. We can estimate $H_{\min}$ by equating the magnetic energy of this field to the energy barrier:

$$\mu_0 M_s H_{\min} \Omega = E_b, \tag{15}$$

where $\Omega$ is the nanomagnet’s volume. We will assume that $E_b = 34kT$ at room temperature (this makes the error probability associated with spontaneous flipping of magnetization of the order of $e^{-34} \sim 10^{-15}$ at room temperature) and $M_s = 10^5$ A/m (typical for cobalt or nickel). If the nanomagnet is shaped like an elliptical cylinder, the dimensions that yield this value of $E_b$ (see (12) and (13)) are a = 105 nm, b = 95 nm, and l = 6 nm. Hence, $\Omega \approx$ 47,000 nm$^3$. Equation (15) then yields the value of $H_{\min}$ as 21,262 A/m = 267 Oe. From (14), we get $I_{\min}$ = 13 mA, assuming the loop radius to be 100 nm. Therefore, the energy dissipated to flip a bit per clock cycle (assuming a switching time τ of 1 ns) is $I_{\min}^2 R \tau$ = 1.7 pJ = $4 \times 10^8 kT$ at room temperature, assuming the resistance $R$ of the loop to be 10 ohms. This is two orders of magnitude larger than the energy dissipated to switch a transistor in a circuit with a switching delay of the same 1 ns. Therefore, this method of switching nanomagnets (generating a local magnetic field with a current) is clearly energy-inefficient and must be avoided.
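The current and dissipation figures above can be reproduced from the quoted field strength. This is a rough sketch, assuming a circular loop so that Ampere’s law reduces to $I = 2\pi r H$, and Joule heating $I^2 R \tau$ over the switching time; the function names are hypothetical, and the field value is taken from the text rather than recomputed.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant (J/K)


def loop_current_a(field_a_per_m: float, loop_radius_m: float) -> float:
    """Current from Ampere's law for a circular path: I = 2*pi*r*H."""
    return 2.0 * math.pi * loop_radius_m * field_a_per_m


def joule_energy_j(current_a: float, resistance_ohm: float, time_s: float) -> float:
    """Energy I^2 * R * t dissipated in the loop during one switching event."""
    return current_a ** 2 * resistance_ohm * time_s


current = loop_current_a(21_262.0, 100e-9)   # H_min and loop radius from the text
energy = joule_energy_j(current, 10.0, 1e-9)  # 10 ohm loop, 1 ns switching time
print(f"I = {current * 1e3:.1f} mA")
print(f"E = {energy * 1e12:.2f} pJ = {energy / (K_B * 300.0):.1e} kT at 300 K")
```

The quadratic dependence on current is what makes this switching method so costly compared with the internal dissipation of the magnet.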

A second method of switching nanomagnets is by passing a spin-polarized current through it. This delivers either a spin transfer torque [67–71] or induces domain wall motion [72], resulting in magnetization flip. The energy dissipated in this method is also of the order of $10^8 kT$ [73] although there is a report of switching a nanomagnet with domain wall motion in ~2 ns while dissipating only about $10^4 kT$ of energy [74]. Nonetheless, these methods unfortunately do not make magnetic switches so energy efficient that they would be actually poised to replace transistors and therefore merit serious attention. It is therefore imperative to find better schemes for switching magnets since the switching circuitry has turned out to be the Achilles’ heel.

3.2. Hybrid Spintronics and Straintronics

Recently, we devised an extremely energy efficient scheme for switching nanomagnets that employs multiferroics. This actually raises hopes that nanomagnets may indeed some day replace transistors as binary switches in digital logic circuits. Multiferroics [75] are sometimes multiphase materials, for example, a bilayer consisting of a single-domain magnetostrictive (magnet) layer overlying a piezoelectric layer. Consider the elliptical multiferroic in Figure 5. A voltage applied across the piezoelectric layer as shown generates uniaxial stress along the major axis of the piezoelectric through $d_{31}$ coupling, provided the entire multiferroic structure is clamped to prevent expansion and contraction along the in-plane hard axis (minor axis of the ellipse). The associated strain is transferred elastically to the magnetostrictive layer, generating stress in it and rotating its magnetization by large angles [76–84]. If the strain is withdrawn at the right juncture, rotation by ~180° is possible with >99.99% probability even in the presence of thermal noise at room temperature [85]. The switching takes less than 1 ns to complete, making this strategy one of the most energy efficient, and yet relatively fast, switching methodologies extant. Because we are rotating spins within the magnet with electrically-generated strain, we have termed this approach hybrid spintronics and straintronics [83]. We will discuss this next.

Consider the magnet in Figure 5 shaped like an elliptical cylinder whose cross-section is in the $y$-$z$ plane. The $z$-axis is along the major axis of the ellipse and is the easy axis of magnetization. The stable magnetization orientations are of course along the $\pm z$-axis. There are two hard axes: the $y$-axis is the in-plane hard axis and the $x$-axis is the out-of-plane hard axis. Because the thickness of the magnet is much smaller than the lateral dimensions, the $x$-axis will be “harder” than the $y$-axis.

We will adopt spherical coordinates for analysis and assume that the magnetization vector’s direction is the radial direction. Hence, the magnetization orientation is specified by the coordinates $(r, \theta, \phi)$, where $r$ is fixed. The polar angle $\theta$ is the angle subtended by the magnetization vector with the +$z$-axis, and the azimuthal angle $\phi$ is the angle subtended by the projection of the vector on the $x$-$y$ plane with the +$x$ axis. Thus, [$\theta = 0^\circ, 180^\circ$] corresponds to the stable orientations along the easy axis while [$\phi = \pm 90^\circ$] corresponds to the plane of the magnet. The coordinate system is shown in Figure 5.

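The total potential energy of the shape-anisotropic magnetostrictive nanomagnet is the sum of shape- and stress-anisotropy energies:

$$E(t) = \frac{\mu_0}{2} M_s^2 \Omega \left[N_{d\text{-}xx}\sin^2\theta\cos^2\phi + N_{d\text{-}yy}\sin^2\theta\sin^2\phi + N_{d\text{-}zz}\cos^2\theta\right] - \frac{3}{2}\lambda_s \sigma(t)\, \Omega \cos^2\theta,$$

where $\lambda_s$ is the magnetostrictive coefficient and $\sigma(t)$ is the time-dependent stress. We assume the magnet to be polycrystalline so that we can ignore magnetocrystalline anisotropy energy.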

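Because of the inequality $N_{d\text{-}xx} > N_{d\text{-}yy} > N_{d\text{-}zz}$ among the demagnetization factors along the out-of-plane hard axis, the in-plane hard axis, and the easy axis, it is clear that in the absence of stress, the minimum energy configurations are $\theta = 0^\circ$ and $\theta = 180^\circ$. Therefore, the stable orientations of the unstressed shape-anisotropic nanomagnet’s magnetization are along the $\pm z$-axis. However, in the presence of stress, the minimum energy orientation will shift to $\theta = 90^\circ$ and $\phi = \pm 90^\circ$ if the product $\lambda_s \sigma$ of the magnetostrictive coefficient and the stress is negative and the stress is sufficiently high to make

$$\frac{3}{2}\left|\lambda_s \sigma\right| \Omega > \frac{\mu_0}{2} M_s^2 \Omega \left(N_{d\text{-}yy} - N_{d\text{-}zz}\right).$$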

The potential energy profile as a function of the polar angle $\theta$ is shown in Figure 6 for $\phi = 90^\circ$. Note that by applying sufficient stress, one can move the potential energy minimum from $\theta = 0^\circ, 180^\circ$ to $\theta = 90^\circ$ in the magnet’s plane. In other words, sufficient amount of stress will rotate the magnetization from the easy axis to the in-plane hard axis.
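The shift of the energy minimum under stress can be illustrated with a one-line energy function. The sketch below assumes the standard forms of shape and stress anisotropy energy, for which the in-plane profile reduces, up to an additive constant, to $(E_b - E_{\text{stress}})\sin^2\theta$, where $E_b$ is the shape anisotropy barrier and $E_{\text{stress}}$ is the magnitude of the stress anisotropy energy. The dimensionless `stress_ratio` $= E_{\text{stress}}/E_b$ and the function names are introduced here for illustration only.

```python
import math


def energy_in_Eb(theta_deg: float, stress_ratio: float) -> float:
    """In-plane energy in units of E_b, up to an additive constant."""
    s = math.sin(math.radians(theta_deg))
    return (1.0 - stress_ratio) * s * s


def minimum_angle_deg(stress_ratio: float) -> int:
    """Polar angle (0..180 deg, 1 deg grid) at which the energy is lowest."""
    return min(range(0, 181), key=lambda t: energy_in_Eb(t, stress_ratio))


for ratio in (0.0, 0.5, 2.0):
    print(f"stress ratio {ratio}: minimum at {minimum_angle_deg(ratio)} deg")
```

Below a ratio of 1 the easy axis remains the minimum; above it the minimum moves to the in-plane hard axis, which is the threshold behavior described above.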

If the voltage is turned off (and stress withdrawn) as soon as θ reaches 90°, then the torque resulting from the out-of-plane motion of the magnetization vector will continue to rotate the magnetization past $\theta = 90^\circ$ and make it approach the opposite orientation along the easy axis, resulting in a “flip.”

In the above discussion, we have avoided some subtle issues. For example, if the initial orientation of the magnetization vector is exactly along the easy axis, then no amount of stress can budge it since the torque on the magnetization vector, which is proportional to the gradient of the energy in $\theta$- and $\phi$-space, vanishes. However, thermal fluctuations can dislodge the magnetization vector slightly from the easy axis, whereupon the torque resulting from stress and shape anisotropy rotates the magnetization vector away from the easy axis towards the in-plane hard axis and ultimately accomplishes switching.

3.2.1. Nanomagnetic Logic

A logic system has two components: (1) universal logic gates such as NAND or NOR and (2) a unidirectional “wire” for ferrying logic bits without feedback from the input stage to the output. These two components are sufficient to implement any combinational or sequential logic circuit.

Universal Gate
A NAND gate can be implemented in a way reminiscent of the approach adopted in SSL and a specific nanomagnetic implementation of a NAND gate with fan-in and fan-out is shown in Figure 7. The array is placed in a global magnetic field such that the magnetostatic energy due to this field (where $M_s$ is the saturation magnetization of the magnet per unit volume and Ω is the magnet volume) is smaller than the shape anisotropy energy and dipole interaction energy. Because of the specific layout employed, dipole interaction between the magnets ensures that the output bit is always the NAND function of the two input bits for any of the four input combinations (0, 0), (0, 1), (1, 0), and (1, 1) [86]. Bits will propagate unidirectionally through this gate if the four groups of magnets (classified into groups I, II, III, and IV) are clocked sequentially with a sinusoidal 4-phase clock whose phases are shifted from each other by 90° [86].
The internal energy dissipated in the four magnets constituting the basic NAND gate is ~500kT at room temperature per clock cycle and the energy dissipated in the entire 12-magnet array to perform one logic operation is ~1250kT [86]. The energy dissipated in the clocking circuit is negligible in comparison and can be made essentially zero if the clocking is performed with a parallel LC circuit with a resistance in series, where the clock frequency is the resonant frequency of the LC circuit [86].

Logic Wire
A logic wire is implemented with a linear array of nanomagnets where the line joining the centers of adjacent magnets is parallel to the in-plane hard axis of the magnets (see, e.g., the three magnets for fan-in in Figure 7). Bits are propagated unidirectionally through the wire (or chain) by stressing the magnets sequentially pairwise using a 3-phase clock just as in the case of SSL. This implements Bennett clocking for unidirectional logic bit propagation. The stress rotates the magnetization of any magnet by 90°, aligning it temporarily along the in-plane hard axis just as shown in Figure 4. Reference [81] has shown rigorously that Bennett clocking by this method is not only possible, but consumes very little energy per bit in every clock cycle. The voltage required to rotate the magnetization by ~90° is about 200 mV if the magnetostrictive material is nickel (weakly magnetostrictive) and roughly 10 mV if the magnetostrictive material is Terfenol-D (strongly magnetostrictive) [81].
In order to calculate the energy dissipation in Bennett clocking as a function of switching speed, one needs to solve the time-dependent problem of switching (or magnetization dynamics) using the Landau-Lifshitz-Gilbert (LLG) [87] equation that describes the magnetization dynamics. Stress acts like an effective magnetic field that gives rise to two kinds of motion: (1) precessional motion about the field (which will lift the magnetization vector out of the plane of the magnet) and (2) damping motion that will tend to align the magnetization along the effective field. The precessional motion is nondissipative while the damping motion is dissipative. Materials like nickel have small damping because of weak coupling to dissipative processes, while Terfenol-D has much stronger damping. However, Terfenol-D has much stronger magnetostriction and hence requires much less stress than nickel to switch. As a result, Terfenol-D is much more energy efficient than nickel when used in the magnetostrictive layer of a multiferroic switch.

3.2.2. Nanomagnetic Memory

A memory element implemented with a multiferroic nanomagnet is shown in Figure 8. The bit information is stored in the magnetization orientation of the soft magnetostrictive magnet shaped like an ellipsoidal cylinder. The two (mutually antiparallel) orientations along the major axis are the stable states and encode bits 0 and 1. The reading and writing schemes are described in the caption of Figure 8. The memory is addressed via a cross-bar architecture shown in the right panel of Figure 8.

When writing bits, the voltage applied between the cross-bars should be able to not only rotate the magnetization, but rotate it by ~180°, resulting in a bit flip. This is indeed possible if we withdraw the stress as soon as the projection of the magnetization vector on the magnet’s plane reaches close to the in-plane hard axis, that is, the magnetization vector enters the plane defined by the in-plane and out-of-plane hard axes. Not only is this possible at 0 K, but solution of the stochastic Landau-Lifshitz-Gilbert equation [88] has shown that it is possible at room temperature as well, despite thermal noise [89].

3.3. Energy Dissipation in Straintronics

Reference [89] and later work by our group have shown that the total energy dissipated per bit flip in hybrid spintronic/straintronic memory is about 400kT at room temperature if we switch in ~1 ns. We can reduce this energy by a factor of 10 or more if we switch slower, for example, in 10 ns. Thus, in a chip with 10⁸ logic switches per square centimeter, the power dissipated is about 0.17 mW/cm² at a clock rate of 100 MHz (taking ~40kT per bit flip, the 10 ns case), if 10% of the devices switch at any given time (10% activity level). This opens up unprecedented applications. Chips with such low-power requirements can run by scavenging energy from the environment without requiring a battery. There are numerous energy harvesting schemes that can harvest this level of energy from energy radiated by cable TV, 3G networks, and environmental vibrations [90–94]. Furthermore, devices of this type are ideally suited for medically implanted devices, such as processors implanted in an epileptic patient’s brain that monitor brain signals and warn of an impending seizure. These processors can run by harvesting energy from the patient’s head movements or from electromagnetic radiation in the environment, without ever requiring a battery. Another possible application of such processors is in distributed sensor networks for structural health monitoring that can run off the power harvested from mechanical vibrations in the structure (buildings, bridges) induced by wind or passing traffic.
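The arithmetic behind the quoted power density can be checked with a few lines of code. This is a sketch under stated assumptions: room temperature is taken as 300 K, and the energy per bit flip is taken as 40kT, that is, the 400kT figure reduced by the factor of 10 quoted for 10 ns switching. The function and parameter names are hypothetical.

```python
K_BOLTZMANN = 1.380649e-23  # Boltzmann constant in J/K

def power_density_w_per_cm2(energy_in_kt, temperature_k,
                            switches_per_cm2, activity, clock_hz):
    """Average dissipated power per cm^2 for a clocked array of switches.

    energy_in_kt     : energy per bit flip, in units of kT
    temperature_k    : absolute temperature in kelvin
    switches_per_cm2 : device density
    activity         : fraction of devices switching in each clock cycle
    clock_hz         : clock frequency
    """
    energy_joule = energy_in_kt * K_BOLTZMANN * temperature_k
    flips_per_second = switches_per_cm2 * activity * clock_hz
    return energy_joule * flips_per_second

# Assumed inputs: 40 kT per flip at 300 K, 1e8 switches/cm^2,
# 10% activity, 100 MHz clock.
slow = power_density_w_per_cm2(40, 300.0, 1e8, 0.10, 100e6)
print(f"{slow * 1e3:.2f} mW/cm^2")
```

With these inputs the result is about 0.17 mW/cm², consistent with the figure in the text. Using the full 400kT instead gives a value ten times larger.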

3.4. Other Spin-Based Logic and Memory Ideas

An idea closely related to hybrid spintronics and straintronics, advanced by its proponents as an energy-efficient computing paradigm, is reconfigurable array magnetic automata (RAMA), which envisions pillars of nanomagnets embedded in a piezoelectric (or ferroelectric) matrix [95]. Because of the shape anisotropy of the pillars, magnetization pointing up or down along the pillar axis defines the two stable states. Nearest-neighbor pillars interact via dipole interaction and hence two neighbors have antiferromagnetic ordering. By exploiting the dipole coupling between nearest neighbors, a NAND gate can be implemented in the usual way as shown in Figure 9.

Application of an electric field in the piezoelectric (along the pillar axis) strains the pillars and hence produces stress anisotropy energy, which rotates a pillar’s magnetization by up to 90°. Such rotations have been demonstrated in BiFeO₃-based piezoelectrics interfaced with magnetostrictive materials [96]. This can implement Bennett clocking and hence a unidirectional logic wire, thus fulfilling the requirements of a complete logic system in nanomagnetic logic. However, implementing a memory is much more difficult and could be very costly in terms of energy dissipation.

In hybrid spintronics and straintronics, it is possible to rotate the magnetization by 180° and not just 90° if we withdraw the stress at or close to the exact juncture when the magnetization vector’s projection on the magnet’s plane aligns along the in-plane hard axis. What makes it happen is the out-of-plane dynamics of the magnetization vector that generates a helpful torque to rotate the magnetization from 90° to 180° [85]. This out-of-plane dynamics, crucial for a complete bit flip (the 180° rotation), is either absent or very weak in a pillar, making bit flip via stress nearly impossible. Therefore, the only way to implement memory with RAMA is to apply a local magnetic field in the direction of the intended magnetization when a bit is to be written. This is indeed the method advanced by the proponents of RAMA [95]. Unfortunately, local magnetic fields are not only challenging to produce, but dissipate enormous energy as already discussed. Hence, RAMA-based memory is not likely to be very energy efficient, unlike hybrid spintronics and straintronics.

3.4.1. All-Spin Logic

Another interesting “spintronic” idea that has received significant attention has been termed “all-spin logic” [97–100]. A basic element in this paradigm is shown in Figure 10 where two identical magnets are placed on a spatially asymmetric conducting channel. The channel is “asymmetric” since the ground terminal is closer to the left magnet than to the right one.

The current flowing through the two magnets under the common bias voltage is and , where since the second current path is longer. As a result, , which means that there is in-built nonreciprocity. Since the current injected by (or extracted from) the left magnet designated as is larger than the current injected by (or extracted from) the right magnet designated , the left magnet’s magnetization serves as the input determining the magnetization of the right magnet which acts as the output. We will explain this shortly.

As usual, logic bits are encoded in the two stable magnetization orientations of either magnets shaped like an elliptical cylinder. Let us first consider the situation when is negative. Magnet then injects net spin-polarized current into magnet where the spin polarization of this current is that of the majority spins in magnet , meaning that the spin polarization is parallel to the magnetization of . This happens because and hence the current injected by the left magnet overshadows that by the right. As a result, there is net flow of spin-polarized electrons from into . These spin-polarized carriers exert a spin transfer torque on the electrons in magnet and turn their spin polarizations in the direction of the majority spins in . As a result, the magnetization of becomes parallel to that of and this is the COPY operation, where the bit encoded by the input magnet is “copied” into the output magnet .

When is positive, majority spins are extracted from which must be replenished by electrons with the same spin polarization flowing in from . As a result, becomes deficient in these spins and gradually the spins whose polarizations are antiparallel to the magnetization of become the majority in . Therefore, the magnetization of the output magnet becomes antiparallel to that of the input. This is logical inversion or the NOT operation. Therefore, the structure in Figure 10 can perform either the COPY operation or the NOT operation by simply reversing the polarity of the bias voltage. This lends itself to applications in ring oscillators [100].
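At the purely logical level, the behavior described above reduces to a two-line truth table. The sketch below is an idealized behavioral model only: it ignores the spin-torque dynamics, switching delay, and thermal noise, it assumes the sign convention stated in the text refers to the polarity of the bias voltage (negative for COPY, positive for NOT), and the function names are hypothetical.

```python
def asl_stage(input_bit: int, bias_voltage: float) -> int:
    """Idealized output bit of one input/output magnet pair.

    Negative bias: majority spins are injected into the output magnet,
    so the output copies the input (COPY).
    Positive bias: majority spins are extracted, so the output ends up
    antiparallel to the input (NOT).
    """
    if input_bit not in (0, 1):
        raise ValueError("input_bit must be 0 or 1")
    if bias_voltage == 0:
        raise ValueError("a nonzero bias is needed to drive the output magnet")
    return input_bit if bias_voltage < 0 else 1 - input_bit

def asl_chain(input_bit: int, biases) -> int:
    """Pass a bit through successive stages, one bias value per stage."""
    bit = input_bit
    for bias in biases:
        bit = asl_stage(bit, bias)
    return bit
```

For instance, `asl_chain(1, [0.1, 0.1, 0.1])` passes the bit through three inverting stages and returns the inverted bit. Feeding the output of an odd number of inverting stages back to the input is the basis of a ring oscillator.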

Note that placing the ground terminal closer to the input magnet has endowed this system with built-in non-reciprocity. Since this makes , we have isolation between input and output; the input commands the output and not the other way around. As a result, no Bennett clocking is needed for unidirectional logic propagation and that saves the energy in the Bennett clock. However, as we have shown, the dissipation in the Bennett clock is negligible and can be made close to zero by employing resonantly excited LCR circuits, so this energy saving is not a major advantage. What might be an advantage is the elimination of clock connections to individual devices, which is lithographically taxing.

Note that there is an isolation layer (or isolation trench) under each magnet which electrically isolates from the magnet to the right of and also from the magnet to the left of . The right side of is the “talking” side of that magnet that talks to magnet and the left side of is the “listening” side of that listens to . Similarly, the right side of will be the talking side that will talk to the magnet to the right of . Thus, there is a master-slave relation between any pair of magnets—the left magnet is the master that talks to the slave magnet on the right, who always listens to the master.

The most attractive feature of all-spin logic, in the opinion of this author, is the inherent non-reciprocity. The reason why magnetic quantum cellular automata type of architectures lack non-reciprocity (and therefore require a Bennett clock) is that they use dipole interaction to communicate between magnets and that interaction is inherently bidirectional. One could, in principle, progressively increase the distance between nanomagnets in a magnetic quantum cellular automata “wire” to achieve unidirectionality in space (and therefore avoid Bennett clocking), but ultimately the dipole interaction will become too weak to communicate bit information. All-spin logic does not use bidirectional interaction between magnets and hence can achieve non-reciprocity. Note that hybrid spintronics and straintronics do not have to lack non-reciprocity. As long as we do not use a bidirectional interaction to communicate between magnets (i.e., avoid magnetic quantum cellular automata type of architectures), we can fashion nonreciprocal circuits out of multiferroics and avoid Bennett clocking as well. An example of this will be presented in a forthcoming publication.

Reference [97] has shown how a universal logic gate can be configured in all-spin logic. It is considerably more complex than what we have discussed in the context of SSL or hybrid spintronics/straintronics or RAMA.

The energy dissipation in all-spin logic was briefly addressed in [99]. As always, there are two components to the energy dissipation: the internal energy dissipated in the magnets and the energy dissipated by the currents that switch the magnets. The latter will be roughly . Insofar as it takes a significant amount of current to flip the magnetization of a magnet via spin transfer torque [101–103], it is unlikely that this paradigm will be any more energy efficient than the usual spin transfer torque-based logic or memory.

4. Conclusion

In this paper, we have outlined recent developments in spin-based architectures for logic and memory, focusing on nanomagnetic computing where shape-anisotropic nanomagnets act as binary switches for both logic and memory. We have shown that these can be considerably more energy efficient than traditional transistor-based architectures if proper switching methodologies (hybrid spintronics/straintronics) are employed for switching magnets. This is a major advantage since excessive energy dissipation is the primary threat to continued downscaling of electronic switches envisioned in Moore’s law. There are also other advantages of replacing transistors with nanomagnets owing to the fact that magnets are nonvolatile, unlike transistors. That opens up the possibility of nonvolatile logic where the same elements act as both memory and logic, thereby obviating the need for the communication link between the processor and the memory. Finally, magnets have no leakage, unlike transistors, which makes them even more energy efficient.

On the flip side, nanomagnetic architectures also have three shortcomings that are seldom discussed, but could end up being their nemesis. The first is that the magnetization of nanomagnets is usually read with spin valves or magneto-tunneling junctions. These are trilayered structures consisting of two ferromagnets separated by a thin spacer layer. One of the ferromagnets is a permanent hard magnet and the other is the target magnet which is soft, stores the bit information, and whose magnetization is to be read. If the magnetizations of the two magnets are parallel, the spin valve’s resistance is low, whereas if they are antiparallel, the spin valve’s resistance is high. The ratio of the two resistances, however, is very small, barely 10:1 with current technology [104], which makes the on/off ratio unacceptably small whenever there is spin-to-charge conversion, as would be needed in any hybrid technology incorporating both magnets and transistors.

The second shortcoming of nanomagnetic architectures is specific to magnetic quantum cellular automata and is associated with dipole interaction that communicates bit information between nanomagnets. Not only does dipole interaction necessitate Bennett clocking since it is inherently “reciprocal,” but it also brings forth other woes. The strength of this interaction is proportional to the square of the magnet’s volume and inversely proportional to the cube of the separation between the centers of the magnets. Hence, this technology is not particularly scalable since reducing the volume indiscriminately will render the dipole interaction too weak to be useful. Of course, making magnets smaller than ~5 nm³ will make them superparamagnetic (as opposed to ferromagnetic) at room temperature, but the scaling limit is more likely to be set by dipole interaction rather than the superparamagnetic transition. It is unlikely that magnets smaller than ~50 nm in lateral dimensions will be practical, which sets the bit density limit to about 5 × 10⁹ cm⁻². We emphasize that this is not a fundamental limitation of nanomagnetic logic or memory, but is a fundamental shortcoming of magnetic quantum cellular automata type of architectures that rely on dipole interaction.
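The scaling argument can be made explicit with a short estimate. This is a sketch assuming point-dipole coupling between two identical magnets, with all linear dimensions (magnet size and center-to-center separation) shrunk by a common factor s. The symbols M_s (saturation magnetization), μ₀ (vacuum permeability), and s are introduced here for illustration only.

```latex
E_{\text{dipole}} \sim \frac{\mu_0 M_s^2 V^2}{4\pi r^3},
\qquad
V \to s^3 V,\quad r \to s\,r
\;\Longrightarrow\;
E_{\text{dipole}} \to \frac{s^6}{s^3}\,E_{\text{dipole}} = s^3\,E_{\text{dipole}} .
```

Under this assumption, halving every dimension (s = 1/2) reduces the coupling energy by about a factor of 8, which is why the interaction weakens quickly as the magnets are scaled down.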

Finally, the third and perhaps the most serious shortcoming of nanomagnetic logic at room temperature is the error rate. There are two types of errors: static faults due to such things as manufacturing defects (e.g., magnet misalignment) [105] and dynamic faults occurring due to erratic magnetization dynamics caused by thermal noise. The latter are usually more serious. Rigorous simulations by our group have shown that it may be very difficult to reduce error probability to below 0.01% at room temperature, which will then call for impractical error correction resources. This will be discussed more in future publications.
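A rough calculation indicates why a 0.01% error probability is burdensome. The sketch below assumes that switching errors are statistically independent and that each switching event fails with the same probability. The event counts used are illustrative assumptions, not results from the simulations mentioned above, and the function names are hypothetical.

```python
def error_free_probability(p_error: float, n_events: int) -> float:
    """Probability that n independent switching events all succeed."""
    return (1.0 - p_error) ** n_events

def expected_errors(p_error: float, n_events: int) -> float:
    """Mean number of failed switching events among n independent events."""
    return p_error * n_events

P_ERROR = 1e-4  # the 0.01% error probability quoted in the text

# Assumed example 1: a computation involving 10,000 switching events.
chain_ok = error_free_probability(P_ERROR, 10_000)

# Assumed example 2: 1e8 switches/cm^2 at 10% activity, i.e. 1e7
# switching events per clock cycle per cm^2.
errors_per_cycle = expected_errors(P_ERROR, 10_000_000)

print(f"P(no error in 10,000 events) = {chain_ok:.3f}")
print(f"expected errors per cycle    = {errors_per_cycle:.0f}")
```

Under these assumptions only a little over a third of such 10,000-event computations would finish without any error, and on the order of a thousand errors would occur per square centimeter in every clock cycle, all of which error correction would have to absorb.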

Acknowledgments

The author’s work on hybrid spintronics and straintronics was carried out with Professor Jayasimha Atulasimha of the Department of Mechanical and Nuclear Engineering at Virginia Commonwealth University. Students who have contributed to this work are Mr. Kuntal Roy, Mr. Mohammad Salehi-Fashami, and Mr. Noel D’Souza. This work is supported by the US National Science Foundation under Grants ECCS-1124714 and CCF-1216614 and by the Nanoelectronics Research Initiative of the Semiconductor Research Corporation under task 2203.0001.