Abstract

Test power is a major issue in current-generation VLSI testing and has become one of the biggest concerns for today's SoCs. While reducing design effort, the modular design approach in SoC (i.e., the use of IP cores) has further exacerbated the test power issue. It is not easy to select an effective low-power testing strategy from the large pool of diverse techniques available. To identify a suitable test power reduction strategy for IP core-based SoC, this paper presents the state of the art in low-power testing, starting from the terminology and models for power consumption during test. The paper contains a detailed survey of the power reduction techniques proposed for all aspects of testing, including external testing, Built-In Self-Test techniques, and advances in DFT techniques emphasizing low power. Further, the available low-power testing techniques are analyzed for their suitability to IP core-based SoC.

1. Introduction

Power consumption has been a major challenge to both design and test engineers. Efforts to reduce power consumption in normal functional mode have further exacerbated the power consumption problem during test. Generally, a circuit may consume 3–8 times as much power in test mode as in normal mode [1]. As a result, the semiconductor industry is looking for low-power testing techniques [2].

To reduce cost and time to market, the modular design approach is widely adopted for SoC. The structure of such predesigned, ready-to-use intellectual property (IP) cores is often hidden from the system integrator, which makes testing them even more daunting. Power reduction during testing of such cores therefore places many constraints on current low-power testing methodology. To develop the right testing strategy for such SoCs, it is necessary to survey the available low-power testing approaches and identify those that are suitable.

The paper is organized as follows. Section 2 gives the reasons for very high power consumption during test and its effects on the IC. It also defines various terms related to test power and explains the model for energy and power. Section 3 presents the various schemes for low-power testing. Section 4 discusses the suitability of each scheme for IP core-based SoC. Section 5 concludes the survey and explores the future scope.

2. Low-Power Test

A high-density system such as an ASIC or SoC always demands a nondestructive test that satisfies all the power constraints defined during the design phase. On the other hand, the current testing philosophy demands much more power during test than during functional mode. This section describes the reasons for and effects of such high power consumption.

2.1. Reasons of High-Power Consumption during Test

There are several reasons for this increased test power. The main ones are as follows. (i) Test efficiency has been shown to have a high correlation with the toggle rate; hence, in test mode, the switching activity of all nodes is often several times higher than during normal operation. (ii) In an SoC, parallel testing is frequently employed to reduce the test application time, which may result in excessive energy and power dissipation. (iii) The design-for-testability circuitry embedded in a circuit to reduce test complexity is often idle during normal operation but may be used intensively in test mode. (iv) Successive functional input vectors applied to a given circuit during system mode have a significant correlation, while the correlation between consecutive test patterns can be very low. This can cause significantly larger switching activity, and hence power dissipation, in the circuit during test than during normal operation [3].

2.2. Effects of High-Power Dissipations

The most adverse effect of very high power dissipation during test is the destruction of the IC itself. Beyond the risk of destruction, power dissipation during test can affect cost, reliability, autonomy, performance verification, and yield [4]. Some of the effects are as follows. (i) The growing need for at-speed testing can be constrained by high power dissipation. Stuck-at faults can still be tested without any impact, but testing of delay faults becomes difficult. (ii) During functional testing of the die just after wafer etching, the unpackaged bare die has very little provision for power or heat dissipation. This might be a problem for applications based on multichip module technology, for example, in which designers cannot realize the potential advantages in circuit density and performance without access to fully tested bare dies [5]. (iii) The circuit can fail because of erosion of conductors caused by electromigration. (iv) Online BIST in battery-operated remote and portable systems consumes very high power for testing alone. Remote system operation occurs mostly in standby mode with almost no power consumption, interrupted by periodic self-tests. Hence, power savings during test mode directly prolong battery lifetime. (v) Elevated temperature and excessive current density severely decrease circuit reliability. (vi) Because of excessive power dissipation, the simple low-cost plastic packages used in consumer electronics cannot be used, and expensive packages that can remove the excess heat are required instead.

2.3. Definitions and Models of Energy and Power

Power consumption in CMOS circuits can be classified into static and dynamic. Static power dissipation is due to leakage current or other current drawn continuously from the power supply. Dynamic dissipation is due to (i) short circuit current and (ii) charging and discharging of load capacitance during output switching. For the current CMOS technology, dynamic power is the dominant source of power consumption.

A good approximation of the energy consumed at node $i$ during one clock period is

$$E_i = \frac{1}{2} \cdot S_i \cdot F_i \cdot C_o \cdot V_{DD}^2, \qquad (1)$$

where $S_i$ is the number of switchings during the period, $F_i$ is the fanout of the node, $C_o$ is the minimum size parasitic capacitance of the circuit, and $V_{DD}$ is the supply voltage. The fanout of the nodes is defined by the circuit topology, and the switching can be estimated by a logic simulator (note that in a CMOS circuit, the number of switchings is counted from the moment the input vector is changed until the moment the internal nodes reach the new stable state, including the hazard switching). The product $S_i F_i$ is named the Weighted Switching Activity (WSA) of node $i$ and represents the only variable part of the energy consumed at node $i$ during test application. So the energy consumed in the circuit after application of a pair of successive input vectors $(V_{k-1}, V_k)$ can be expressed by

$$E_{V_k} = \frac{1}{2} \cdot C_o \cdot V_{DD}^2 \cdot \sum_i S(i,k) \cdot F_i, \qquad (2)$$

where $i$ ranges over all the nodes of the circuit and $S(i,k)$ is the number of switchings provoked by $V_k$ at node $i$. Consider now a pseudorandom test sequence of $L$ vectors, where $L$ is the test length. The total energy consumed in the circuit is

$$E_{\text{total}} = \frac{1}{2} \cdot C_o \cdot V_{DD}^2 \cdot \sum_{k=1}^{L} \sum_i S(i,k) \cdot F_i. \qquad (3)$$

It should be noted that the energy reflects the total switching activity generated during test application and has an impact on the battery lifetime during power-up or periodic self-test of battery-operated devices. The instantaneous power consumed in the circuit after application of vectors $(V_{k-1}, V_k)$ can be expressed as

$$P_{\text{inst}}(V_k) = \frac{E_{V_k}}{T}, \qquad (4)$$

where $T$ is the clock period. The peak power consumption corresponds to the maximum of the instantaneous power consumed during the test session. It therefore corresponds to the highest energy consumed during one clock period, divided by $T$. More formally, it can be expressed by

$$P_{\text{peak}} = \max_k \left[ P_{\text{inst}}(V_k) \right] = \frac{\max_k \left( E_{V_k} \right)}{T}. \qquad (5)$$

Finally, the average power consumed during the test session is the total energy divided by the test time and is given as follows:

$$P_{\text{avg}} = \frac{E_{\text{total}}}{L \cdot T}. \qquad (6)$$

Elevated average power adds to the thermal load that must be vented away from the device under test. It may cause structural damage to the silicon (hot spots), to bonding wires, or to the package.

According to the above expressions for power and energy consumption, and assuming a given CMOS technology and supply voltage for the circuit design, the number of switchings at node 𝑖 in the circuit is the only parameter that affects the energy, the peak power, and the average power consumption. Similarly, the clock frequency used during testing affects both the peak power and the average power. Finally, the test length, that is, the number of test patterns applied to the circuit under test (CUT), affects only the total energy consumption. Consequently, when deriving a solution for power and/or energy minimization during test, a designer or test engineer has to keep these relationships in mind.

From the viewpoint of scan test, test power can be divided into shift power and capture power, corresponding to shift mode and capture mode, respectively. In shift mode, many clock pulses are applied to load a test vector and unload a test response. Therefore, average shift power dominates heat dissipation during scan shift. Excessive peak shift power may cause scan chain failures, resulting in yield loss. In capture mode, where only one or two clock pulses are needed, the contribution towards test heat is negligible.

3. Low-Power Testing Schemes

During the last two decades, a number of power reduction techniques for testing have evolved. These techniques either explore the ATPG and deal with the test vectors used in external testing, or explore the internal structure of the design using BIST or DFT. Existing low-power testing schemes are therefore divided into the following two categories. (1) Low-power testing techniques for external testing using ATE, ATPG, and so forth. (2) Low-power testing techniques for internal testing using BIST, DFT, and so forth.

3.1. Low-Power Testing Techniques for External Testing

The following are the classifications of low-power techniques for external testing.

3.1.1. Low-Power ATPG Algorithms

This category contains various techniques adopted to reduce the power consumption during external testing by ATE. These methods depend on the number of transitions in the test data set. Current research in this field focuses on ATPG algorithms that achieve maximum fault coverage at the lowest possible power dissipation. Reference [6] proposed a heuristic method to generate test sequences which create worst-case power droop by accumulating the high- and low-frequency effects using a dynamically constrained version of the classical D-algorithm for test generation. A novel scan chain division algorithm [7] analyzes the signal dependencies and creates the circuit partitions such that both shift and capture power can be reduced when using existing ATPG flows. Reference [8] presents a low capture power ATPG and a power-aware test compaction method. This ATPG lowers the growth of test pattern count compared to the detection number 𝑛. The peak power becomes smaller as the detection number 𝑛 increases. The test compaction algorithm further reduces the number of test patterns as well as the average capture power.

3.1.2. Input Control

Here the idea is to identify an input control pattern such that, by applying that pattern to the primary inputs of the circuit during the scan operation, the switching activity in the combinational part can be minimized or even eliminated. The use of input control together with existing vector- or latch-ordering techniques to reduce power consumption is covered in [9]. In the same area, [10] presented a technique that gates a partial set of scan cells. The subset of scan cells is selected to give the maximum reduction in test power within a given area constraint. An alternate formulation of the problem is to treat the maximum permitted test power and area overhead as constraints and achieve a test power within these limits using the fewest gated scan cells, thereby minimizing the impact on area overhead. The area overhead is predictable and closely corresponds to the average power reduction.
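The input control idea can be sketched on a toy example. The three-gate combinational block below is entirely hypothetical (it is not taken from [9] or [10]), and the exhaustive search over primary input patterns is only practical for very small input counts; it is meant to show what "a pattern that minimizes switching during scan shifting" means, not how the cited methods find it.

```python
from itertools import product


def gate_values(pi, scan):
    """Hypothetical combinational block driven by 2 primary inputs and 2 scan cells."""
    g1 = pi[0] & scan[0]   # AND gate: pi[0] = 0 blocks scan cell 0
    g2 = pi[1] | scan[1]   # OR gate: pi[1] = 1 blocks scan cell 1
    g3 = g1 ^ g2           # XOR of the two gate outputs
    return (g1, g2, g3)


def toggles_during_shift(pi, scan_states):
    """Count gate-output toggles while the scan cells step through scan_states."""
    values = [gate_values(pi, s) for s in scan_states]
    return sum(
        a != b
        for prev, cur in zip(values, values[1:])
        for a, b in zip(prev, cur)
    )


def best_input_control(scan_states, n_inputs=2):
    """Exhaustively pick the primary input pattern with the fewest toggles."""
    return min(
        product((0, 1), repeat=n_inputs),
        key=lambda pi: toggles_during_shift(pi, scan_states),
    )


# Assumed sequence of scan-cell contents seen during shifting.
scan_states = [(0, 0), (1, 0), (0, 1), (1, 1), (0, 0)]
best = best_input_control(scan_states)
print(best, toggles_during_shift(best, scan_states))
```

For this toy block, holding the primary inputs at (0, 1) places a controlling value on both gates fed by scan cells, so the scan activity does not propagate into the combinational logic.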

3.1.3. Ordering Techniques

Researchers have widely explored test vector reordering techniques to reduce switching power. Hamming distance-based reordering is described in the survey paper [11]. Girard's approach to vector ordering is enhanced in [12]. In [13], another method based on artificial intelligence is proposed to order the test vectors in an optimal manner to minimize switching activity during testing.
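As a rough illustration of Hamming distance-based reordering, the following sketch applies a simple greedy nearest-neighbor heuristic: starting from the first vector, it repeatedly picks the remaining vector that differs from the last one in the fewest bit positions. This is an assumed, generic heuristic chosen for brevity, not the specific algorithm of [11], [12], or [13], and the test vectors are made up.

```python
def hamming(a, b):
    """Number of bit positions in which two equal-length vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def total_transitions(vectors):
    """Sum of Hamming distances between consecutive vectors in the sequence."""
    return sum(hamming(a, b) for a, b in zip(vectors, vectors[1:]))


def greedy_reorder(vectors):
    """Greedy nearest-neighbor ordering, keeping the first vector in place."""
    remaining = list(vectors)
    ordered = [remaining.pop(0)]
    while remaining:
        last = ordered[-1]
        idx = min(range(len(remaining)), key=lambda i: hamming(last, remaining[i]))
        ordered.append(remaining.pop(idx))
    return ordered


# Assumed example test set (fully specified vectors as bit strings).
vectors = ["0000", "1111", "0001", "1110", "0011"]
reordered = greedy_reorder(vectors)
print(total_transitions(vectors), "->", total_transitions(reordered))
```

Because reordering only permutes an existing test set, it needs the test data but no knowledge of the circuit, although a greedy pass like this one is not guaranteed to find the optimal order.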

3.1.4. Exploring the Do Not Care Bit

ATPG-generated uncompacted test data contains a large number of do not care (X) bits, which can be filled so as to reduce test power. Reference [14] proposed an automatic test pattern generation (ATPG) scheme for low-power launch-off-capture (LOC) transition test. The authors in [15] used a Genetic Algorithm-based heuristic to fill the do not care bits. This approach yields an average improvement in dynamic power and leakage power over the 0-fill, 1-fill, and minimum transition fill (MT-fill) algorithms for do not care filling. Reference [16] proposed segment-based X-filling to reduce test power while preserving defect coverage. The scan chain configuration tries to cluster the scan flip-flops with common successors into one scan chain, in order to distribute the specified bits per pattern over a minimum number of chains. Based on the operation of a state machine, [17] elucidates a comprehensive framework for probability-based primary-input-dominated X-filling methods to minimize the total weighted switching activity (WSA) during the scan capture operation. The authors in [18] describe how do not care filling of ATPG-generated patterns can make the patterns consume less power and present a tradeoff between dynamic and static power consumption.
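The three baseline fills named above (0-fill, 1-fill, and MT-fill) can be compared with a short sketch. It assumes a test cube written as a string over `0`, `1`, and `X`, and uses the number of adjacent-bit transitions in the filled pattern as a simple proxy for scan shift activity; the cube and the function names are illustrative assumptions, and the more advanced methods of [14]–[18] are not reproduced here.

```python
def fill_constant(cube, bit):
    """0-fill or 1-fill: replace every X with the same constant bit."""
    return cube.replace("X", bit)


def mt_fill(cube):
    """Minimum transition fill: each X copies the nearest specified bit to its left.

    Leading X bits copy the first specified bit (or 0 if the cube has none).
    """
    bits = list(cube)
    last = next((b for b in bits if b != "X"), "0")
    for i, b in enumerate(bits):
        if b == "X":
            bits[i] = last
        else:
            last = b
    return "".join(bits)


def transitions(pattern):
    """Number of adjacent bit pairs that differ in a fully specified pattern."""
    return sum(a != b for a, b in zip(pattern, pattern[1:]))


cube = "0X0X1X1"  # assumed example test cube
for name, filled in [
    ("0-fill", fill_constant(cube, "0")),
    ("1-fill", fill_constant(cube, "1")),
    ("MT-fill", mt_fill(cube)),
]:
    print(name, filled, transitions(filled))
```

Like reordering, these fills operate only on the test data, which is why they remain applicable when the internal structure of the design is unavailable.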

3.2. Low-Power Testing Techniques for Internal Testing

The following are the classifications of low-power techniques for internal testing.

3.2.1. By LFSR Architecture

The Built-In Self-Test (BIST) architecture contains two major components: a test pattern generator and a response checker [19]. Both of these components use a Linear Feedback Shift Register (LFSR). The LFSR can be designed to reduce the power consumption during test in the following ways.

3.2.2. By Reducing the Transitions

These methods reduce the transitions between successive patterns generated by the LFSR as well as between successive bits in a given pattern. A dual-speed LFSR scheme [20] uses two LFSRs running at different speeds and decreases the circuit's overall internal activity by connecting inputs that have elevated transition densities to the slow-speed LFSR. This strategy significantly reduces average power and energy consumption without decreasing fault coverage. Cellular automata-based test pattern generation to reduce power is described in [21]. In [22], the LFSR is modified by adding weight sets to tune the signal probabilities of the pseudorandom vectors and thereby decrease energy consumption and increase fault coverage. The LP-TPG [23] inserts intermediate patterns between the random patterns to reduce the transitional activity of the primary inputs, which eventually reduces the switching activity inside the circuit under test and hence the power consumption. A polynomial-time algorithm that converts the test pattern generation problem into a combinatorial problem called Minimum Set Covering Solutions is proposed in [24]. A new low-power BIST TPG scheme [25] uses a transition monitoring window (TMW) that comprises a TMW block and a MUX. The technique suppresses transitions of patterns using a k-value, a reference value obtained from the distribution of the TMW, to detect highly transitive patterns that cause high power dissipation in a scan chain. In [26], a TPG based on Read-Only Memory (ROM) is designed to store the test vectors with less area than a conventional ROM. This reduces the number of CMOS transistors significantly compared to an LFSR/counter TPG. An approach that reconfigures the CUT's partial-acting-inputs into a short ring counter (RC) and keeps the CUT's partial-freezing-inputs unchanged during testing is proposed in [27].
A low hardware overhead test pattern generator (TPG) for scan-based Built-In Self-Test (BIST) that can reduce switching activity in circuits under test (CUTs) during BIST is presented in [28]. It also achieves very high fault coverage with reasonable test sequence lengths. The proposed BIST TPG decreases the transitions that occur at scan inputs during scan shift operations and hence reduces switching activity in the CUT. In LT-LFSR [29], transitions in the LFSR are reduced in two dimensions: (1) between consecutive patterns and (2) between consecutive bits. The proposed architecture increases the correlation among the patterns generated by LT-LFSR with negligible impact on test length. An efficient algorithm to synthesize a built-in TPG from low-power deterministic test patterns without inserting any redundant test vectors is presented in [30]. The structure of the TPG is based on nonuniform cellular automata (CA). The algorithm is based on the nearest neighborhood model, which can find an optimal nonuniform CA topology to generate the given low-power test patterns. A low-power dynamic LFSR (LDLFSR) circuit [31] achieves comparable performance with less power consumption. A typical LFSR, a DFLSR[I], and an LDLFSR are compared on their randomness and inviolability properties. Multilayer perceptron neural networks are used to test the inviolability property of these LFSRs.
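To make the two kinds of transitions concrete, the sketch below models a plain 4-bit Fibonacci LFSR and counts (1) bit changes between consecutive patterns and (2) changes between adjacent bits within a pattern. It is a baseline measurement only and does not implement any of the low-transition schemes in [20]–[31]; the width, tap positions, and seed are assumptions chosen so that the register cycles through all 15 nonzero states.

```python
def lfsr_patterns(seed, taps, width, count):
    """Return `count` successive states of a Fibonacci LFSR as bit tuples (MSB first)."""
    mask = (1 << width) - 1
    state = seed & mask
    patterns = []
    for _ in range(count):
        patterns.append(tuple((state >> i) & 1 for i in reversed(range(width))))
        feedback = 0
        for t in taps:                      # XOR of the tapped bit positions
            feedback ^= (state >> t) & 1
        state = ((state << 1) | feedback) & mask
    return patterns


def inter_pattern_transitions(patterns):
    """Bit changes between each pair of consecutive patterns."""
    return [sum(a != b for a, b in zip(p, q)) for p, q in zip(patterns, patterns[1:])]


def intra_pattern_transitions(pattern):
    """Changes between adjacent bits inside a single pattern."""
    return sum(a != b for a, b in zip(pattern, pattern[1:]))


# Assumed configuration: 4-bit register, taps at bits 3 and 2, seed 0001.
patterns = lfsr_patterns(seed=0b0001, taps=(3, 2), width=4, count=16)
inter = inter_pattern_transitions(patterns)
intra = [intra_pattern_transitions(p) for p in patterns]
print(sum(inter) / len(inter), sum(intra) / len(intra))
```

A low-transition generator would be judged against such a baseline by how far it lowers these counts without lengthening the test or losing fault coverage.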

3.2.3. By Generating the Useful Vectors Only

A significant amount of energy is wasted in the LFSR and in the CUT by useless patterns that do not contribute to fault dropping. LFSR tuning modifies the state transitions of the LFSR such that only the useful vectors are generated according to a desired sequence [32]. To the same end, a mapping logic that modifies the state transitions of the LFSR in this way is designed in [33].

3.2.4. By Filtering Unnecessary Vectors

Some of the sequences generated by an LFSR are nondetecting. By inhibiting such vectors during testing, the overall switching can be reduced. A test-vector-inhibiting technique to filter out some nondetecting subsequences of a pseudorandom test set generated by an LFSR is proposed in [34]. The authors use a decoding logic to store the first and last vectors of the nondetecting subsequences. This work was extended in [35] by applying the filtering action to all the nondetecting subsequences. A pattern-filtering technique is combined with Hertwig and Wunderlich's technique to avoid scan-path activity during scan shifting in [36]. Hatami et al. [37] proposed a scan cell architecture that decreases power consumption and the total consumed energy. In this method, which is based on data compression, the test vector set is divided into repeated and unrepeated partitions. The repeated part, which is common among some of the vectors, is left unchanged in the scan path while the new test vector is loaded. As a result, the test vector is applied to the circuit under test in fewer clock cycles, leading to lower switching activity in the scan path during test mode.
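The bookkeeping behind vector inhibiting can be sketched as follows. The sketch assumes that fault simulation has already marked each pseudorandom vector as detecting or nondetecting (the flags below are made up), and it merely locates the first and last index of every nondetecting run, which is the information the decoding logic is described as storing. The actual hardware of [34] and [35] is not modeled.

```python
def nondetecting_runs(detects):
    """Return (first, last) index pairs of maximal runs of nondetecting vectors."""
    runs = []
    start = None
    for i, d in enumerate(detects):
        if not d and start is None:
            start = i                       # a nondetecting run begins here
        elif d and start is not None:
            runs.append((start, i - 1))     # the run ended at the previous vector
            start = None
    if start is not None:
        runs.append((start, len(detects) - 1))
    return runs


def applied_vectors(vectors, runs):
    """Vectors that still reach the CUT once the listed runs are inhibited."""
    inhibited = {i for first, last in runs for i in range(first, last + 1)}
    return [v for i, v in enumerate(vectors) if i not in inhibited]


# Assumed fault-simulation outcome: True means the vector detects a new fault.
detects = [True, False, False, True, False, True, False, False, False]
vectors = [f"v{i}" for i in range(len(detects))]
runs = nondetecting_runs(detects)
print(runs, applied_vectors(vectors, runs))
```

Note that obtaining the detection flags requires fault simulation on the netlist, which is relevant to the suitability discussion for IP cores later in the paper.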

3.2.5. By Partitioning Circuit

The circuit is strategically partitioned into subcircuits that are tested separately. A low-power BIST strategy based on circuit partitioning is described in [38]. This strategy partitions the original circuit into two structural subcircuits so that two different BIST sessions can successively test each subcircuit. To address the power in the scan chain, an efficient scan partitioning technique that reduces both average and peak power in the scan chain during shift and functional cycles is proposed in [39]. In [40], the authors proposed a novel low-power virtual test partitioning technique, where faults in the glue logic between subcircuits can be detected by patterns with low power dissipation that are applied at the entire circuit level, while the patterns with high power dissipation can be applied within a partitioned subcircuit without loss of fault coverage. Experimental results show that the proposed technique is very effective in reducing test power.

3.2.6. By Separate Testing Strategy for Memory

Various transition reduction techniques for memory testing by reordering read and write access are available in the literature. A row bank-based precharge technique based on the divided wordline (DWL) architecture is proposed in [41]. In low-power test mode, instead of precharging the entire memory array, only the current accessed row bank is precharged. This will result in significant power saving for the precharge circuitry. With the ever-increasing number of memories embedded in a system on chip (SoC), power dissipation due to memory test has become a serious concern. In [42], the authors proposed a novel low-power memory BIST. Its effectiveness is evaluated on memories in 130 and 90 nm technologies. A significant power reduction can be achieved with virtually zero hardware overhead.

3.2.7. Low-Power Design-for-Test Techniques

In this category, some extra hardware is added to design for reducing the power consumption during test. Clock partitioning and clock freezing [43] and use of J-scan instead of traditional MUX scan [44] are the examples of such methods. This approach adds to or modifies the on-chip design hardware to reduce the power consumption during test and hence may be called Design for Low-Power Test (DFLPT).

4. Low-Power Testing Techniques Emphasizing IP Core-Based SoC

With the emergence of core-based SoC design, BIST delivered as part of an IP core is one of the most favorable testing methods because it preserves the design's intellectual property [45]. Such BIST is well suited to testing the IP core in standalone mode, but it might not be suitable once the IP core is integrated with other blocks to form a complete system.

Now consider adding low-power schemes at the time of system integration. The structure of IP cores is often hidden from the system integrator, so neither modification of the internal scan chain nor DFT insertion is possible for IP cores. Further, testing tools such as Automatic Test Pattern Generation (ATPG) or fault simulation cannot be applied to them. Such cores come with ready-to-use test data. This test data is used to test the core both in isolation and as part of the system after integration. It is usually assumed that the core is directly accessible, and it becomes the task of the system integrator to ensure that the logic surrounding the core allows the test stimuli to be applied and the produced responses to be transported for evaluation. So the only remaining option for reducing power is to use schemes that are applicable to readymade test data.

Based on the above discussion, let us first list the characteristics of a power reduction technique suitable for IP core-based SoC and then compare each of the available techniques with this ideal model.

4.1. Characteristics of Power Reduction Scheme Suitable to IP Core-Based SoC
(i) It should not demand knowledge of the internal structure of the design. (ii) It should not make any modification to the internal design. (iii) It may add hardware as required, without modifying the available I/O pin configuration. (iv) It should deal with the readymade test sequence rather than the test architecture. (v) It should not be dependent on testing tools like ATPG or fault simulation, which deal with the netlist of the design.

The available techniques are now compared against the above characteristics.

4.2. Modification in LFSR

The implementation of this method (i) deals with test sequence rather than test architecture, (ii) requires the knowledge of internal details of design, (iii) requires the additional hardware to modify test pattern sequence, and (iv) requires modification in internal structure.

4.3. Partitioning the Circuit

The implementation of this method (i) deals with test architecture rather than test sequence, (ii) requires the well-defined internal hierarchical structure of design, (iii) requires the knowledge of internal details of design, (iv) requires the additional hardware to modify test pattern sequence, and (v) requires modification in internal structure.

4.4. Separate Testing Strategy for Memory

The implementation of this method (i) can be applied for memory when and where it is required, and (ii) is not applicable to functional blocks.

4.5. Improved ATPG Algorithms

The implementation of this method (i) deals with generation of a new test set rather than the available test sequence or test architecture, (ii) requires the netlist of the design, so it cannot be applied directly to a hard core, and (iii) requires the knowledge of internal details of design.

4.6. Input Control

The implementation of this method (i) deals with test architecture rather than test sequence, (ii) requires the knowledge of internal details of design, (iii) requires the additional hardware to modify test pattern sequence, and (iv) requires modification in internal structure.

4.7. Ordering Technique

The implementation of this method (i) deals with test sequence rather than test architecture, (ii) requires the well-defined test sequence, that is, test data set, (iii) does not require the knowledge of internal details of design, (iv) requires the additional hardware to reorder test pattern sequence, and (v) does not require modification in internal structure.

4.8. Exploring Do Not Care Bits

The implementation of this method (i) deals with test data bit sequence rather than test vector sequence or test architecture, (ii) requires the well-defined test sequence, that is, test data set, (iii) does not require the knowledge of internal details of design, (iv) does not require any additional hardware, and (v) does not require modification in internal structure.

Out of the above-mentioned categories, all methods except “ordering techniques” and “exploring do not care bits” require the internal details of the design under test. Hence, in the context of IP core-based SoC, only these two categories are suitable.
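The comparison in this section can be restated as a small table in code. The attribute values below are transcribed from Sections 4.2 to 4.8 and the summary paragraph above; `None` marks a property that the text does not state, and the dictionary layout and function name are illustrative assumptions rather than part of any tool.

```python
# Each entry: (needs internal knowledge, modifies internal structure).
# None means the property is not stated in the text for that category.
TECHNIQUES = {
    "modification in LFSR": (True, True),
    "partitioning the circuit": (True, True),
    # Internal-knowledge value taken from the section summary paragraph.
    "separate testing strategy for memory": (True, None),
    "improved ATPG algorithms": (True, None),
    "input control": (True, True),
    "ordering technique": (False, False),
    "exploring do not care bits": (False, False),
}


def suitable_for_ip_core_soc(techniques):
    """Keep categories that need no internal knowledge and no internal modification."""
    return sorted(
        name
        for name, (needs_knowledge, modifies_internal) in techniques.items()
        if needs_knowledge is False and modifies_internal is False
    )


print(suitable_for_ip_core_soc(TECHNIQUES))
```

Filtering on just these two attributes reproduces the conclusion drawn in the text; the remaining characteristics (extra hardware, reliance on readymade test data) could be added as further columns in the same way.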

5. Conclusion

This survey of low-power testing techniques suitable for IP core-based SoC starts with the reasons for and effects of high power consumption during test, including the energy and power model. Advanced techniques available for power reduction during test are then described. The issues related to test power reduction for IP core-based SoC are discussed, and the characteristics of an ideal scheme suitable for IP core-based SoC are defined. Based on that, each available category of power reduction technique is compared with this ideal model. It is concluded that the “ordering techniques” and “exploring do not care bits” methods are the best suited to IP core-based SoC. Future research can start with improving these schemes in terms of power reduction and then optimize them further with respect to other important test parameters such as test application time, on-chip area overhead, and test data compression.