Abstract

Side channel and fault injection attacks are major threats to cryptographic applications of embedded systems. The best performance for these attacks is achieved by focusing sensors or injectors on the sensitive parts of the application, by means of dedicated methods to localise them. Few methods have been proposed in the past, and all of them aim at pinpointing the cryptoprocessor. However, it could be interesting to exploit the activity of other parts of the application, in order to increase the attack's efficiency or to bypass its countermeasures. In this paper, we present a localisation method based on cross-correlation, which issues a list of areas of interest within the attacked device. It performs an exhaustive analysis, since it may localise any module of the device, and not only those which perform cryptographic operations. Moreover, it does not require any prior knowledge about the implementation, whereas some previous cartography methods require that the attacker be able to choose the cryptoprocessor inputs, which is not always possible. The method is experimentally validated using observations of the electromagnetic near field distribution over a Xilinx Virtex 5 FPGA. The matching between areas of interest and the application layout in the FPGA floorplan is confirmed by correlation analysis.

1. Introduction

Side channel attacks (SCA) and fault injection attacks (FIA) are very efficient techniques to retrieve secret data stored in cryptographic devices such as smartcards. The first attacks were performed globally, for instance by measuring the power consumption (power analysis: PA) [1] of a device under analysis (DUA) or by quickly changing the nominal voltage of its power supply [2]. The best results, however, have since been achieved locally, by using a small EM probe just over the cryptoprocessor (electromagnetic analysis: EMA) [3] or by shooting at it with a laser beam [4, 5]. Indeed, for SCA, such locality makes it possible to collect only the activity of the cryptoprocessor, instead of gathering the activity of the whole DUA. In the case of FIA, depending on the technology process of the integrated circuit, as little as one bit of the implementation can be affected. However, the efficiency of these attacks relies on localisation methods, which have to pinpoint the sensitive areas of the DUA as accurately as possible. Using these localisation methods is mandatory in the case of a cryptographic application embedded in a field-programmable gate array (FPGA), as its regular structure prevents the localisation of sensitive modules by optical or electron microscopy. The task is easier for most ASICs, where the functional modules clearly stand out as rectangular shapes in a visual inspection of the layout. Some methods have been proposed in the past, illustrated using as observations the near electromagnetic (EM) field radiated by the DUA: an EM probe is moved over the DUA from one position to another, and at each position the temporal variation of the EM field is measured with an oscilloscope. We use the same cartography procedure in this paper. Furthermore, we note that all previously published localisation methods, along with the one in this paper, can deal with other physical phenomena, such as photons emitted by transistors while they switch [6, 7].

Up to now, two strategies have been deployed to locate cryptographic modules within a DUA. They consist in identifying areas where the physical observations vary according to (1) the data processed during an encryption [8, 9] or (2) the operations performed by the cryptographic module [10], even if the latter is protected against SCA [11].

With the first strategy, two observations are collected for two different plaintexts, and their fluctuations are assessed either by looking for the maximum difference in the temporal domain [8] or by calculating the incoherence of the frequency spectrum [9]. The larger the difference or the lower the incoherence, the closer the EM probe is to the cryptoprocessor. To improve the accuracy of the method, a third observation, taken with one of the same plaintexts, can be acquired in order to reject the measurement noise [8]. These approaches seem to be the most suitable because the statistical tools which are then used, CPA [12] for example, exploit plaintexts or ciphertexts as well. However, they are optimal if and only if the differences in the observations are maximal, which requires that all of the transistors making up the datapath of the cryptoprocessor switch. This can happen only if the attacker knows the secret data, obviously the key but also the mask values when the DUA is protected by masking [13, 14], which is possible in the framework of an evaluation but not with a real-world application.
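The temporal variant of this first strategy can be sketched in a few lines of Python. This is only an illustration: the function name `max_difference_score`, its arguments, and the way the third observation is used (subtracting a noise floor estimated from two acquisitions of the same plaintext) are assumptions made here, not the exact procedure of [8].

```python
import numpy as np

def max_difference_score(obs_p1, obs_p2, obs_p1_repeat=None):
    """Score one probe position from observations of two plaintexts.

    obs_p1, obs_p2: observations acquired for two different plaintexts.
    obs_p1_repeat:  optional second acquisition with the first plaintext,
                    used here (as an assumption) to estimate the noise floor.
    """
    obs_p1 = np.asarray(obs_p1, dtype=float)
    obs_p2 = np.asarray(obs_p2, dtype=float)
    # Largest data-dependent gap between the two observations.
    score = np.max(np.abs(obs_p1 - obs_p2))
    if obs_p1_repeat is not None:
        # Same plaintext twice: any remaining gap is attributed to noise.
        repeat = np.asarray(obs_p1_repeat, dtype=float)
        score -= np.max(np.abs(obs_p1 - repeat))
    return float(score)
```

A larger score would suggest that the probe is closer to logic whose activity depends on the plaintext.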

Instead of focusing on the same time slot, that of the encryption, and collecting two observations, the techniques of the second strategy need only one plaintext and take advantage of two or more time slots of a single observation. Typically, if the cryptoprocessor is in the idle state, none of its transistors switch and the corresponding activity is at a low level. On the contrary, during an encryption, the activity is expected to be at a high level. Thus, the localisation of the cryptoprocessor can be achieved in the temporal domain by evaluating the difference between these activities [10]. To diminish the impact of the measurement noise, the localisation can be performed in the frequency domain: indeed, the succession of these low and high activity levels yields a special signature in the frequency spectrum [10]. As this succession still occurs for some countermeasures, the second strategy remains valid to localise protected cryptoprocessors. This is all the more true with protections using dual-rail with precharge logic (DPL [15]): as they alternate between two phases, namely, precharge and evaluation, they “oscillate” at half the master clock frequency, a frequency component of great interest for the frequency analysis [11].
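A minimal sketch of the temporal variant of this second strategy is given below. It assumes that the idle and encryption time slots are already known, and it uses the root mean square (RMS) of the signal as the activity level; both the RMS choice and the names `activity_contrast`, `idle_slot` and `busy_slot` are illustrative assumptions rather than the procedure of [10].

```python
import numpy as np

def activity_contrast(observation, idle_slot, busy_slot):
    """Difference of activity between an encryption slot and an idle slot.

    observation: 1D array of samples measured at one probe position.
    idle_slot, busy_slot: Python slice objects selecting the two time slots.
    """
    obs = np.asarray(observation, dtype=float)

    def rms(time_slot):
        segment = obs[time_slot]
        return np.sqrt(np.mean(segment ** 2))

    # High contrast: the position sees much more activity during encryption.
    return float(rms(busy_slot) - rms(idle_slot))

# Hypothetical observation: idle for 4 samples, then active for 4 samples.
trace = [0, 0, 0, 0, 2, -2, 2, -2]
print(activity_contrast(trace, slice(0, 4), slice(4, 8)))  # 2.0
```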

Both previous strategies require identifying the time slot of some sensitive operations, such as the encryption. This information can be extracted from the implementation netlist or by using simple power analysis (SPA) [1] or simple electromagnetic analysis (SEMA). Nonetheless, exploring the full implementation is complex, or at least time-consuming, which forces the exploration to be partial and focused on a few targets. In this paper, we propose a method to exhaustively locate the information sources of a DUA, without preliminary knowledge about it. This method is described in Section 2. Then, its ability to identify areas of interest, and in particular cryptographic modules, is evaluated in Section 3. Finally, Section 4 concludes this paper and presents some perspectives for future work.

2. Cross-Correlation Cartography

Cartography formally consists in monitoring one or more physical phenomena at $N_X \times N_Y$ positions over a DUA. Generally, these positions form a 2D grid composed of $N_X$ and $N_Y$ points along, respectively, the $X$ and $Y$ axes. $N_X$ and $N_Y$ may have the same value, for instance 10 as in Figure 1. For each position, identified by its coordinates $(x, y)$, a number of observations, each made of several temporal samples, are acquired. They constitute an observation set $O_{x,y}$. In the example of Figures 1 and 4 observations of samples are collected per position. To build the final 2D map, each set $O_{x,y}$ has to be reduced to a single scalar $m_{x,y}$. The common practice is to apply a function $f$ to the corresponding observations: $m_{x,y} = f(O_{x,y})$. From a “graphical” standpoint, the $m_{x,y}$ values, which are real numbers, are then mapped to colors according to a user-defined scale.

The localisation method we propose in this paper is motivated by the fact that the observation of a physical phenomenon depends on both time and space. In the SCA context, the physical phenomenon we consider is the emanation of a digital integrated circuit. As the state of this circuit changes from one clock cycle to the next, the observations are made of successive peaks. The amplitude of each of these peaks (i) varies in time according to the data manipulated by its source (see for instance Figure 7); (ii) decreases when the distance between the source and the observation point increases.

As a consequence, sources carrying distinct information generate physical phenomena whose temporal variations look completely different, that is, uncorrelated. Conversely, observations gathered at positions close to each other, and in particular at positions which are themselves close to a source, look very much alike. Thus, to locate these sources, we collect a single observation per position, then estimate the similarity level between all of these observations. While the methods presented in Section 1 consider (through the function applied to each set, see Figure 1) each observation set independently of the others, we use all of them jointly.

The first step of our technique consists in taking each observation as a reference, then looking for the maximum of the normalized cross-correlation (abbreviated NXC) between this reference and the other observations. The NXC function is defined as

$$\mathrm{NXC}_{a,b}(d) = \frac{\operatorname{cov}(a, b_d)}{\sigma_a \cdot \sigma_{b_d}} \qquad (1)$$

In (1), $\operatorname{cov}$ stands for the covariance, and $a$ and $b$ are two observations (at two different points), which, respectively, have $n_a$ and $n_b$ temporal samples, whose mean values are $\mu_a$ and $\mu_b$, and whose standard deviations are $\sigma_a$ and $\sigma_b$. $b_d$ means that a delay $d$, taken within a bounded interval, is applied to $b$. From a “graphical” standpoint, the waveform of $b$ is shifted to the right along the time axis, or in other words the origin of $b$ is moved to $d$. We simply abridge $\mathrm{NXC}_{a,b}(d)$ as $\mathrm{NXC}(d)$ and note that $\sigma_{b_d} = \sigma_b$ because the standard deviation is considered over the complete waveform (and is thus independent of the offset $d$). Figure 2 shows the variations of the NXC function according to the values of $d$, when $a$ and $b$ are two signals with an identical shape but distinct amplitudes, delayed by two samples with respect to each other. The maximum value of $\mathrm{NXC}(d)$ indicates that $a$ and $b$ are similar, while the index of this maximum, 2, provides the sample delay between $a$ and $b$. The value of $\mathrm{NXC}(2)$ does not reach 1 because of computational side-effects: the time axis is not infinite but bounded.
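The behaviour described for Figure 2 can be reproduced with a short Python sketch. The function below is one possible reading of the NXC definition, not the authors' implementation: the name `nxc`, the parameter `max_delay`, and the handling of the waveform edges (only the overlapping samples contribute, while means and standard deviations are taken over the complete waveforms) are assumptions.

```python
import numpy as np

def nxc(a, b, max_delay):
    """Normalized cross-correlation of a with b delayed by d = 0..max_delay."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a0 = a - a.mean()
    b0 = b - b.mean()
    # Standard deviations over the complete waveforms, independent of d.
    norm = np.sqrt(len(a) * len(b)) * a.std() * b.std()
    values = np.zeros(max_delay + 1)
    if norm == 0:
        return values  # a constant waveform carries no information
    for d in range(max_delay + 1):
        # b shifted right by d: sample b[t - d] faces a[t].
        overlap = min(len(a) - d, len(b))
        if overlap > 0:
            values[d] = np.dot(a0[d:d + overlap], b0[:overlap]) / norm
    return values

# Two signals with the same shape, distinct amplitudes, two samples apart.
b = np.array([0, 1, 3, 1, 0, 0, 0, 0, 0, 0], dtype=float)
a = 2.0 * np.roll(b, 2)
curve = nxc(a, b, max_delay=4)
print(int(np.argmax(curve)))  # 2, the delay between a and b
```

With these toy signals the maximum stays slightly below 1, because the shifted waveform is truncated at the end of the time axis.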

To briefly illustrate the result of such a computation, we provide two NXC maps in Figure 3. The first one, (a), has been obtained by taking the center of the map as the reference point. Maximum correlation values are at a very low level, slightly positive (in red on the map) or negative (in blue), except for the reference point, which by definition is 100% correlated with itself and appears as the yellow point in the center. This map does not identify a source of interest, because the reference point is isolated. On the second map, (b) of Figure 3, an area of high correlation (in yellow) stands out around the reference point, which is marked with a white cross. This zone is larger than the actual active logic in the FPGA and notably extends outside the silicon chip’s boundary, depicted by a white dashed line. The diffusion of the EM field accounts for this extension of a couple of millimeters around the radiating logic. Indeed, in our setup, the distance between the loop sensor and the silicon surface is roughly 2 mm. Above it, a second area emerges in blue, with a negative correlation value. As the observations correspond to EM field measurements, they may be of opposite sign, and so may the correlation values. In reality, this second blue area contains the same information as the first yellow area. Therefore, to avoid such artifacts, we henceforth consider the maximum of the absolute value of the normalized cross-correlation function.

As each observation gathered at a position of the 2D grid becomes in turn a reference observation, we finally collect one NXC map per grid position. Most of them are alike, as they are computed at neighbouring points, close to physical sources. The second step of our technique aims at grouping them. For this purpose, we need once again a correlation estimator, but this time, as we manipulate maps, it should take into account the two dimensions $X$ and $Y$. This bidimensional estimator, namely $\mathrm{NXC2D}$, is defined as:

$$\mathrm{NXC2D}_{M_A, M_B}(u, v) = \frac{\operatorname{cov}(M_A, M_{B,(u,v)})}{\sigma_{M_A} \cdot \sigma_{M_B}}$$

where

$$\operatorname{cov}(M_A, M_{B,(u,v)}) = \frac{1}{|\Omega|} \sum_{(x,y) \in \Omega} \bigl(M_A(x,y) - \mu_{M_A}\bigr)\bigl(M_B(x-u, y-v) - \mu_{M_B}\bigr).$$

In this equation, $M_A$ and $M_B$ are two maps, whose numbers of points along $X$ and along $Y$ may differ, whose mean values are $\mu_{M_A}$ and $\mu_{M_B}$, and whose standard deviations are $\sigma_{M_A}$ and $\sigma_{M_B}$; $\Omega$ is the set of grid points where the two maps overlap. $M_{B,(u,v)}$ means that a spatial offset $(u, v)$ is applied to the map $M_B$, so that its origin point is moved to $(u, v)$. This offset is useful when the objective is to find the location of a small pattern within a reference map. In this paper, as we compare maps of identical size, $u$ and $v$ are set to zero. As previously, we fix a reference map, then we look for the maximum of the absolute value of $\mathrm{NXC2D}$. If this maximum is greater than a user-defined threshold, the maps are considered identical and grouped in the same list. Every list is called an area of interest.
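The grouping step can be sketched as follows in Python, for maps of identical size and a zero spatial offset as used in this paper. The function names `nxc2d_zero_offset` and `group_maps` are hypothetical, and the greedy policy (each map not yet grouped becomes a reference and collects the remaining similar maps) is an assumption about how the lists are built.

```python
import numpy as np

def nxc2d_zero_offset(map_a, map_b):
    """2D normalized cross-correlation of two same-size maps, offset (0, 0)."""
    a = np.asarray(map_a, dtype=float)
    b = np.asarray(map_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("maps must have identical sizes")
    denom = a.size * a.std() * b.std()
    if denom == 0:
        return 0.0  # a flat map is treated as uncorrelated with everything
    return float(np.sum((a - a.mean()) * (b - b.mean())) / denom)

def group_maps(maps, threshold):
    """Group map indices whose |NXC2D| with a reference exceeds threshold."""
    groups, assigned = [], set()
    for ref in range(len(maps)):
        if ref in assigned:
            continue
        members = [ref]
        assigned.add(ref)
        for other in range(ref + 1, len(maps)):
            if other in assigned:
                continue
            if abs(nxc2d_zero_offset(maps[ref], maps[other])) > threshold:
                members.append(other)
                assigned.add(other)
        groups.append(members)
    return groups
```

Each returned list of indices plays the role of one area of interest. Taking the absolute value means that a map and its sign-inverted copy end up in the same list.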

To finish the analysis, one map per list has to be extracted. It could be chosen randomly, but we suggest selecting the map for which the number of points with a value above (respectively, under) a user-defined threshold is the greatest. The corresponding map is the one with the widest, nearest area. The full method is summarized in Algorithm 1. In this algorithm, the selection of areas of interest is represented by a function called “extract.”
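A possible form of this selection is sketched below in Python. The signature of `extract` and the use of absolute values (in line with the choice made for the NXC maps) are assumptions made for illustration.

```python
import numpy as np

def extract(maps, threshold):
    """Return the index of the map with the most points above threshold.

    maps: list of 2D arrays belonging to the same area of interest.
    threshold: user-defined correlation level, for instance 0.9.
    """
    counts = [int(np.count_nonzero(np.abs(np.asarray(m)) > threshold))
              for m in maps]
    # In case of a tie, the first map of the list is kept.
    return int(np.argmax(counts))

# Two hypothetical 2x2 maps: the second has more highly correlated points.
candidates = [np.array([[1.0, 0.95], [0.2, 0.1]]),
              np.array([[1.0, 0.95], [0.92, 0.1]])]
print(extract(candidates, 0.9))  # 1
```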

Algorithm 1
Require: One observation per point
Ensure: List of identical maps
for each 2D grid point A do            /* Looping over all fixed reference points A */
  for each 2D grid point B do          /* Looping over all mobile points B */
    map_A(B) ← max over d of |NXC_{a,b}(d)|
  end for
end for
i ← 0                                  /* Index of areas of interest */
for each 2D grid point A do            /* Looping over all fixed reference points A */
  for each 2D grid point B do          /* Looping over all mobile points B */
    if |NXC2D(map_A, map_B)| > threshold then
      list_i ← list_i ∪ {map_B}
    end if
  end for
  i ← i + 1
end for
for j = 0 to i − 1 do
  extract(list_j)
end for

3. Experimental Results

To evaluate the efficiency of our method, we have used it against an FPGA-based cryptoprocessor performing the single and triple Data Encryption Standard (3DES) [16], and protected by first-order Boolean masking [13, 14]. In practice, we have implemented the same masking scheme as in [17]. We concede that this design is obsolete for at least two reasons. First of all, DES was superseded by the Advanced Encryption Standard (AES) [18] in 2001. Second, the employed masking scheme is not robust against High-Order Side Channel Attacks (HO-SCA) [17, 19]. Nonetheless, the objective of this section is not to come up with a new attack to break a countermeasure still considered secure, but to experimentally prove that our method identifies areas of interest, and in particular two sensitive 64-bit registers, LR and MASK, carrying respectively the masked value and the mask itself.

To make the experiment easier, we have constrained their placement so that they fit in rectangular areas, themselves placed at opposite sides of the FPGA. As depicted in Figure 4, MASK is at the top left-hand corner, while LR is at the bottom right-hand corner. Splitting these registers in such a way has spread the routing of the 3DES cryptoprocessor datapath all over the FPGA. Therefore, in order to keep the other components of our implementation visible, only the logic cells of the 3DES datapath are displayed. They appear as black dots in the upper half of the floorplan. The KEY scheduling block is at the bottom right of Figure 4, in salmon, while the 3DES CONTROLler, in green, is in the middle, on the left. Close to the latter, we find a 6502 CISC CPU in olive and a UART in turquoise. All previous components share a VCI bus along with its memories, in gold in Figure 4. This real-life application has been programmed into a Xilinx [20] Virtex 5 FPGA, whose metallic lid has been removed with a cutter, as shown in Figure 5. This way, not only can we restrict the analysis area strictly to the FPGA silicon die, but the signal-to-noise ratio is also greatly improved.

Observations have been acquired using a 2 mm diameter EM probe, a 3 GHz bandwidth 30 dB gain preamplifier, and an Agilent [21] Infiniium 54854 oscilloscope, whose bandwidth and sampling rate have been set to, respectively, 3 GHz and 10 GSa/s. The EM probe has been moved over a grid of points, in steps of 480 μm along the $X$-axis and 400 μm along the $Y$-axis. The grid is rectangular rather than square, since we covered the whole silicon die of the Virtex 5 (refer to Figure 5): 12 mm wide and 10 mm high. Then, maps have been grouped together according to a user-defined threshold, that is, two maps whose 2D cross-correlation coefficient is greater than this threshold are considered identical and gathered in the same list. Finally, we have counted for each map the number of points with a correlation level above 90%. From each list, we have extracted only the map with the greatest number of such points.

Proceeding this way, we have obtained eleven areas of interest. The nine most significant ones are shown in Figure 6, arranged on the page so as to reflect their location within the FPGA. As in Figure 3, reference points are marked with a white cross. The maps (a) and (i) pinpoint two areas in the top left-hand and bottom right-hand corners. At first sight, they correspond to the two sensitive registers LR and MASK. To confirm this, we have conducted large acquisition campaigns of 1,000 observations per point, then computed for each of them the CEMA factor, that is, CPA (Correlation Power Analysis) with electromagnetic waves. We keep the notation of this CEMA factor distinct from that of the NXC coefficient defined in (1). Note that we use data normally not accessible to an attacker, such as the mask’s value: this is, however, possible in an evaluation context. The resulting maps for the MASK and LR registers are depicted in Figure 8. The CEMA clearly identifies that area of interest (a) is correlated with the mask and that area (i) is correlated with the masked data. In these two CEMA maps, the point with the maximum correlation is marked with a white cross. This location coincides almost exactly with that of the maximum in the NXC maps. This demonstrates that the methodology succeeds in isolating areas of consistent activity. Therefore, our main objective has been successfully reached. This result is very valuable for mounting a HO-SCA (second order) that takes advantage of this spatial diversity: LR leaks more about the masked data, whereas MASK discloses information more related to the mask. HO-SCA such as those based on correlation reviewed in [22] or the one based on information theory in [23] would advantageously combine observations over these points. Identifying the other areas is not trivial, as their shapes do not fit the arrangement of Figure 4. Indeed, EM radiations are generally more likely due to the power grid and the clock tree of the FPGA [10] than to its logic cells and routing paths.
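For readers unfamiliar with the CEMA factor, the sketch below shows how such a correlation could be computed at one probe position. It is a generic Pearson correlation between the observations and a leakage prediction. The function name `cema_factor`, the array layout, and the use of a per-observation prediction (for instance a Hamming distance derived from the known mask or masked data) are assumptions, not details taken from the experiments above.

```python
import numpy as np

def cema_factor(traces, predictions):
    """Pearson correlation between predictions and each time sample.

    traces: array of shape (n_observations, n_samples) for one position.
    predictions: array of shape (n_observations,), one leakage estimate
                 per observation (hypothetical leakage model).
    """
    t = np.asarray(traces, dtype=float)
    p = np.asarray(predictions, dtype=float)
    t0 = t - t.mean(axis=0)
    p0 = p - p.mean()
    denom = np.sqrt(np.sum(t0 ** 2, axis=0) * np.sum(p0 ** 2))
    with np.errstate(divide="ignore", invalid="ignore"):
        rho = (p0 @ t0) / denom
    # Samples with no variance are reported as uncorrelated.
    return np.nan_to_num(rho)
```

One value per position, for example the largest absolute correlation over all time samples, would then be enough to draw a CEMA map comparable to the NXC maps.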

To complement the analysis, Figure 7 shows the output voltage of the EM probe when the probe is just over the points of interest. Except for the map (f), the 16 rounds of the DES encryption are clearly visible in the right-hand part of the observations. The amplitude of the 16 peaks varies in time, but not in the same way from one position to another, which confirms that we observe the activity of distinct elements. We conjecture that the observations at locations (i) (e) and/or (h) may be due to the key scheduling; (ii) (d) and/or (g) to reads and writes on the VCI bus; (iii) (b) and/or (c) and/or (f) to combinatorial functions in the 3DES datapath, such as exclusive OR.

We insist that our blind cartography method does not actually distinguish cryptographic blocks from the others. Still, the method has the following benefits. (i) It highlights “equivalent areas” for EMA. Once those areas of interest are localized, the attacker can focus her measurements on them. In our example, this reduces the number of positions from the whole grid to only 11. (ii) Applied to the second-order attack of a first-order masking scheme, only the couples formed among these areas of interest have to be tested to match the mask and the masked data activity. Without NXC, the number of couples to test would grow with the square of the number of grid positions, a computational workload high enough to deter an attacker.
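The order of magnitude of this reduction can be checked with a few lines of Python. The sketch assumes that couples are unordered pairs of distinct positions. The 25 × 25 grid is a hypothetical size used only for illustration, not a figure taken from the experiment.

```python
from math import comb

def couples_to_test(n_positions):
    """Number of unordered couples of distinct positions."""
    return comb(n_positions, 2)

areas_of_interest = 11        # value reported above
hypothetical_grid = 25 * 25   # assumed grid size, for illustration only

print(couples_to_test(areas_of_interest))  # 55
print(couples_to_test(hypothetical_grid))  # 195000
```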

4. Conclusion and Future Works

Many implementation-level attacks can be enhanced if the floorplan of the application is known by the attacker. For instance, side-channel measurements can be made less noisy if focused on the most leaking zone, and fault injection attacks (by electromagnetic waves or laser shots) have a better chance of perturbing the adequate resource if positioned in the vicinity of the zone of influence. As far as ASICs are concerned, the location of each module can be guessed by an optical analysis of chip photographs. Modern ASICs (such as modern smartcards) have their logic dissolved so as to make its analysis intractable. Regarding FPGAs, the problem is the same, since the fabric is extremely regular and does not show the location of the user design. In addition, FPGA chips are larger than ASICs, thus the search for sensitive regions is a priori more complex.

In this paper, we introduce a novel localisation method based on cross-correlation of electromagnetic cartographies. It is able to reveal the position of blocks. This shows that the structure of the floorplan shall not be considered confidential in FPGAs, even if the bitstream is confidential (e.g., encrypted). We then experimentally demonstrate that the cross-correlation localisation method is efficient at pinpointing areas of interest in the context of a protected cryptographic application. This methodology illustrates a new aspect of the wealth of information carried by the electromagnetic field leaked by electronic devices. The floorplan reverse-engineering method presented in this paper is an algorithm-agnostic preliminary step that enables well-focused electromagnetic analysis attacks, aiming this time at extracting secrets. We have exemplified this method with the successful localisation of the registers that hold the mask and the masked data, which are manipulated concomitantly. Being able to record traces from both locations allows for second-order attacks by combining the two measurements [24]. Also, the same method could be used to record traces selectively from one half of separable dual-rail logic styles (such as SDDL [25, Section 3.1], DWDDL [26], divided backend duplication [27], partial DDL [28], or PA-DDL [29]), thereby defeating the complementation property of those “hiding” countermeasures.