Abstract

A novel quantization error (QE) compensation method is proposed for the design of high-accuracy fixed-width radix-4 Booth multipliers. It effectively reduces the QE and saves multiplier area when the multipliers are employed in cognitive radio (CR) detectors and digital signal processors (DSPs). The truncated partial-products of the proposed multipliers are divided into three sections: the reserved section, the adaptive compensation section, and the constant compensation section. The QE compensation carries are generated by applying probability estimation to a shrunken minor truncated section, which combines the constant compensation and the adaptive compensation. The proposed compensation method not only reduces the QE of fixed-width Booth multipliers but also avoids the exhaustive computing resources (time and memory) otherwise needed to obtain the compensation carries by statistical simulation. The proposed method achieves higher accuracy than existing works under the same area and power budgets. Simulation and experimental results show that the improved compensation method has the minimum power-delay product among the existing methods under the same area and can save up to 30% of the area for the realization of full-width radix-4 Booth multipliers.

1. Introduction

Fixed-width multipliers have been widely used in the design of digital signal processors (DSPs) because of their smaller area and lower power dissipation [1–3]. To reduce the chip area of channel detectors for cognitive radio, many fixed-width Booth multipliers have been used. However, they reduce the detection accuracy because the partial-products are truncated. Therefore, a quantization error compensation (QEC) technique is required in the design of fixed-width Booth multipliers.

The traditional QEC methods for fixed-width multipliers can be divided into three categories. The first category is constant compensation [4]. In this method, the QEC value is a constant; it has the advantage of simplicity but leads to a large quantization error (QE). The second category is adaptive QEC [5]. This method reduces the truncation error by using a variable compensation value. The third category is hybrid error compensation, which uses constant and adaptive QEC techniques together to reduce the truncation error. Compared with the first two categories, this category is more accurate [6]. In [7, 8], two new QEC methods have been presented, respectively. However, these two methods are usually used in the design of Baugh-Wooley multipliers and are not applicable to the design of Booth multipliers. Another QEC method using a binary threshold algorithm has been presented in [9], but its accuracy is lower than that of the method presented in [1]. In [2], statistical analysis and linear regression analysis are adopted to generate the QEC value, and the adaptive QEC method is introduced into the design of fixed-width radix-4 Booth multipliers for the first time. The adaptive QEC method has higher accuracy than [1] but is very complex. To overcome the disadvantages of [2], [10] has presented a method of dividing the truncated partial-products into the major truncated section and the minor truncated section. The major truncated section is adjacent to the reserved section, while the QEC of the minor truncated section is realized by an adaptive method. In [11], a new QEC method based on [10] is proposed. Its truncation error is more symmetric, but its quantization accuracy is not distinctly improved. Reference [12] has proposed an adaptive QEC method with a conditional-probability estimator instead of the time-consuming exhaustive simulation of the previous works, especially for large bit-width Booth multipliers. One closed-form formula was derived in [13] with the traditional method of probabilistic estimation to estimate the QEC. However, the closed-form formula is only an approximation of the mean of the minor truncated section, which decreases the compensation accuracy.

The compensation carries of the minor truncated section in [10–13] are generated by exploiting adaptive QEC methods based on all the truncated partial-products. In [10, 11], the compensation carries are generated from all the truncated partial-products, and hence the simulation statistics consume a large amount of computer resources (time and memory). The required resources grow almost exponentially with the bit-width. Therefore, for wider multipliers, an ordinary computer cannot complete the simulations. Although the resources are reduced in [12, 13], their QEC accuracy cannot be effectively improved.

Based on the trade-off between accuracy and computer resources, the minor truncated section is divided into two parts in this paper: the lower partial-products and the upper partial-products. The compensation carries are only weakly affected by the lower partial-products of the minor truncated section. Therefore, we propose that a compensation constant for the lower partial-products be generated by statistical analysis and then incorporated into the upper partial-products to form a shrunken minor truncated section. Finally, the quantitative compensation carries are created by applying probability estimation based on the shrunken minor truncated section; hereafter, this multiplier is called the shrunken partial-products compensation (SPPC) Booth multiplier. The proposed QEC method not only reduces the QE of the multipliers but also avoids the exhaustive simulation resource requirements. Simulations and experiments show that, compared with the previous works [9–13], the proposed QEC method can effectively improve the QE and performance of fixed-width Booth multipliers. To verify the proposed QEC method, we have also designed SPPC multiplier circuits of different widths and compared them with other multipliers of the same width based on the TSMC 0.18 μm process. The experiments show that the proposed SPPC Booth multipliers have smaller die area as the width of the multipliers increases.

The rest of the paper is organized as follows. Section 2 introduces the principle of the modified radix-4 Booth multiplier. The proposed QEC method is discussed in Section 3 along with its circuit realization. Simulation results, comparisons, and an application experiment are presented in Section 4. Section 5 concludes the paper.

2. Modified Radix-4 Booth Multiplier

The modified radix-4 Booth recoding method, proposed in [14], is commonly used in Booth multiplier designs.

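The N-bit multiplicand $X$ and the M-bit multiplier $Y$ of the two's-complement multiplier are expressed as follows:

$$X = -x_{N-1}2^{N-1} + \sum_{i=0}^{N-2} x_i 2^i, \qquad (1)$$

$$Y = -y_{M-1}2^{M-1} + \sum_{i=0}^{M-2} y_i 2^i. \qquad (2)$$

According to modified radix-4 Booth recoding, from the most significant bit (MSB), every three bits form a group and adjacent groups overlap by one bit. When M is odd, $y_M = y_{M-1}$ is assumed for proper recoding for sign extension. In any case, $y_{-1} = 0$. $Y$ in (2) can be rewritten as

$$Y = \sum_{i=0}^{\lceil M/2 \rceil - 1} y'_i 4^i,$$

where

$$y'_i = -2y_{2i+1} + y_{2i} + y_{2i-1}.$$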
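The recoding described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation; the function name `booth_radix4_digits` and the indexing convention (bit 0 is the LSB, with an implicit 0 appended below it) are assumptions made for the example.

```python
def booth_radix4_digits(y: int, m: int) -> list[int]:
    """Recode an m-bit two's-complement integer into radix-4 Booth digits.

    Returns the digits least significant first, each in {-2, -1, 0, 1, 2},
    such that sum(d * 4**i for i, d in enumerate(digits)) == y.
    """
    if not -(1 << (m - 1)) <= y < (1 << (m - 1)):
        raise ValueError("y does not fit in m bits")

    # When m is odd, one sign-extension bit makes the bit count even.
    width = m + (m % 2)

    def bit(i: int) -> int:
        # The bit below the LSB is defined as 0. Python's >> on negative
        # integers is arithmetic, so higher positions are sign-extended.
        return 0 if i < 0 else (y >> i) & 1

    # Each digit looks at three bits; neighbouring groups share one bit.
    return [-2 * bit(2 * i + 1) + bit(2 * i) + bit(2 * i - 1)
            for i in range(width // 2)]
```

For instance, the 4-bit value 7 (0111) recodes to the digits −1 and 2 (least significant first), since −1 + 2·4 = 7.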

The recoding values based on (1) are listed in Table 1, in which three quantities denote the binary expression, signs, and nonzero states of the recoding values, respectively. The scheme of the modified radix-4 Booth recoding is shown in Figure 1.

In a radix-4 Booth multiplier, each partial-product associates with two adjacent bits in . The partial-products of four possible combinations of and are shown in Table 2.

A radix-4 Booth multiplier will be used as a demo in the following discussion. Its partial-product array is shown in Figure 2. The can be concluded by Table 3, the sign extension bit can be described as the following expression [15]: where is the exclusive-OR operator.
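To make the role of the partial-product array concrete, the following sketch forms one signed row per recoded digit and sums the rows. It works on Python integers rather than on the bit-level array of Figure 2, so sign-extension bits are not modelled; the helper names are hypothetical and chosen for this example only.

```python
def booth_radix4_digits(y: int, m: int) -> list[int]:
    """Radix-4 Booth digits of an m-bit two's-complement y, LSB digit first."""
    width = m + (m % 2)  # sign-extend by one bit when m is odd

    def bit(i: int) -> int:
        return 0 if i < 0 else (y >> i) & 1

    return [-2 * bit(2 * i + 1) + bit(2 * i) + bit(2 * i - 1)
            for i in range(width // 2)]


def booth_partial_products(x: int, y: int, m: int) -> list[int]:
    """One row per Booth digit: a multiple of x in {-2x, -x, 0, x, 2x},
    weighted by 4**i for row i."""
    return [d * x * 4**i for i, d in enumerate(booth_radix4_digits(y, m))]


def booth_multiply(x: int, y: int, m: int) -> int:
    """Sum of all partial-product rows; equals x * y when nothing is truncated."""
    return sum(booth_partial_products(x, y, m))
```

The array has about half as many rows as a bit-by-bit multiplier, because each recoded digit covers two multiplier bits.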

3. Proposed QEC Method

3.1. Generation of QEC Carries

The partial-products of modified Booth multipliers can be divided into two sections, the reserved section and the truncated section, as shown in Figure 2. To improve the accuracy of fixed-width multipliers, an additional column following the reserved section is retained, and the compensation carries are derived from simulated statistical results [10]. In this paper, we further divide the rest of the truncated section into an adaptive compensation section and a constant compensation section, so that the product of the multiplier is the sum of the contributions of these sections.

Assuming that a decimal point is between and , which does not affect the results of discussion, then

In the proposed method, the QE of the adaptive compensation section is compensated by adaptive QEC, and the QE of the constant compensation section is compensated by constant QEC. As a rule of thumb, 3~5 columns of partial-products are needed to compose when the width of multipliers is 8~32 bits.

In Table 1, we have defined If Booth recoding is nonzero , that is, , then the probability of equals that of .

Supposing that both the probabilities of and are , then where is the expected value. Substituting (11) in (9) we can obtain Thus, the constant acts as the QEC value of . Similarly, the expected value of is . We replace with in (8), obtaining its equivalent decimal value of .
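The probability argument can be checked by direct enumeration. The sketch below assumes that the three bits of each recoding group are independent and uniformly distributed, which is an assumption of this example and a common one in probabilistic estimation. The last step further assumes that a partial-product bit is 0 when the recoded digit is zero and is equally likely to be 0 or 1 otherwise; the variable names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def booth_digit(b_hi: int, b_mid: int, b_lo: int) -> int:
    """Radix-4 Booth digit of one overlapping three-bit group."""
    return -2 * b_hi + b_mid + b_lo


# Enumerate all eight equally likely three-bit groups.
counts = Counter(booth_digit(*bits) for bits in product((0, 1), repeat=3))
p_digit = {d: Fraction(c, 8) for d, c in counts.items()}

# Probability that the recoded digit is nonzero.
p_nonzero = sum(p for d, p in p_digit.items() if d != 0)

# Assumption: given a nonzero digit, a partial-product bit is 1 with
# probability 1/2; given a zero digit, the bit is 0.
expected_bit = p_nonzero * Fraction(1, 2)
```

Under these assumptions six of the eight groups give a nonzero digit, so `p_nonzero` is 3/4 and `expected_bit` is 3/8.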

Based on the above proposed SPPC multiplier, the with carries in Figure 2 can be redisplayed in Figure 3, which is denoted by .

The maximum carries will be generated if all in are 1. In Figure 3, the maximum carries need 7 bits; therefore we register the carry output states of in the variable temporarily. In the following, we propose one method that associates the nonzero recoding label with the compensation carries.

According to the number of 1 in , we divide the shrunken partial-product section in Figure 3 into 9 categories. If there is only one 1 in , such as , then only one row in is nonzero, which is the first category (cate-1). We count the total numbers of in all the simulation cases. If there are two 1’s in , such as , then there are two rows in that are nonzero. As discussed above, once again we count the total numbers of in all the cases by simulation, which is named the second category (cate-2). The remaining seven categories are deduced as above. Compared with [10, 11], the method can greatly reduce the simulation cost. The statistical results of different categories are listed in Table 4 (note that the middle bit between and in is still regarded as a partial-product in simulation).
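The categorization step can be sketched as follows, assuming a 16-bit multiplier operand, which gives eight recoded digits and therefore nine possible categories (zero to eight nonzero digits). The function names are hypothetical, and the 4-bit code is taken here to be the plain binary count of nonzero digits, which is an assumption of this example.

```python
def booth_radix4_digits(y: int, m: int) -> list[int]:
    """Radix-4 Booth digits of an m-bit two's-complement y, LSB digit first."""
    width = m + (m % 2)  # sign-extend by one bit when m is odd

    def bit(i: int) -> int:
        return 0 if i < 0 else (y >> i) & 1

    return [-2 * bit(2 * i + 1) + bit(2 * i) + bit(2 * i - 1)
            for i in range(width // 2)]


def nonzero_label(y: int, m: int) -> list[int]:
    """One flag per recoded digit: 1 if the digit is nonzero, else 0."""
    return [1 if d != 0 else 0 for d in booth_radix4_digits(y, m)]


def category(y: int, m: int) -> int:
    """Category index = number of 1s in the nonzero label."""
    return sum(nonzero_label(y, m))


def category_code(y: int, m: int, bits: int = 4) -> str:
    """Category index written as a fixed-width binary string."""
    return format(category(y, m), f"0{bits}b")
```

Grouping simulation cases by this index means statistics are collected per category instead of per individual truncated bit pattern.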

We then encode each category with 4 bits to associate it with the compensation carries. For example, cate-0 is encoded as 0000, cate-1 is encoded as 0001, and so on. The compensation carries are derived from Table 4 according to the following rule: if the count recorded for a carry exceeds half of the number of simulations (NoS), the carry is set to 1; otherwise it is set to 0. The corresponding relations between the code and the compensation carries are listed in Table 5.
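The decision rule amounts to a majority vote per carry bit. The sketch below assumes that, for each category, the simulation has recorded how many cases produced a 1 for each carry bit; the function names and the data layout are hypothetical, and no values from Table 4 are used.

```python
def majority_bit(ones: int, nos: int) -> int:
    """Return 1 if strictly more than half of the nos cases produced a 1."""
    if not 0 <= ones <= nos:
        raise ValueError("ones must lie between 0 and nos")
    # Integer comparison avoids rounding issues when nos is odd.
    return 1 if 2 * ones > nos else 0


def carries_for_category(ones_per_bit: list[int], nos: int) -> list[int]:
    """Apply the majority rule independently to every carry bit."""
    return [majority_bit(ones, nos) for ones in ones_per_bit]
```

A tie (exactly half of the cases) resolves to 0 here, matching the "larger than a half" wording of the rule.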

According to Table 5, the carries are

3.2. Architecture Design of SPPC Booth Multipliers

The circuit implementation of category encoding is shown in Figure 4, where the 2Bs-Adder and 3Bs-Adder denote the two-bit adder and three-bit adder, respectively.

In light of (13), one implementation of carry generation circuits is shown in Figure 5.

According to the above discussions, a modified radix-4 Booth fixed-width multiplier with the proposed QEC circuits is shown in Figure 6. The traditional Booth recoding encoder, circuits of the proposed category encoding, and carry generation compose the QEC circuits to generate the QEC carries.

4. Comparisons and Discussions

4.1. Quantization Accuracy Simulations

The comparison of various errors between the proposed SPPC Booth multipliers and the ideal truncated Booth multiplier and other previous works is listed in Table 6. These errors include the average error , maximum error , and variance . In accuracy simulation, all the pair data samples are inputted to estimate the QE of the SPPC multiplier. The , , and are defined as follows: where and are the ideal product and the quantized product of Booth multipliers, respectively; and max are the absolute and maximum operators, respectively.
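As an illustration of how such error figures are obtained, the sketch below evaluates a plain truncated multiplier (upper n bits of the product kept, no compensation) over all input pairs. It is a baseline for comparison, not the SPPC multiplier. The normalization by the retained LSB and the definition of the variance as the spread of the absolute error around its mean are assumptions of this example.

```python
def truncated_product(x: int, y: int, n: int) -> int:
    """Drop the lower n bits of the 2n-bit product (floor truncation)."""
    p = x * y
    return (p >> n) << n


def error_metrics(n: int) -> tuple[float, float, float]:
    """Average, maximum, and variance of |P - Pq| in units of the retained LSB,
    evaluated exhaustively over all pairs of n-bit two's-complement inputs."""
    lo, hi = -(1 << (n - 1)), 1 << (n - 1)
    scale = 1 << n
    errors = [abs(x * y - truncated_product(x, y, n)) / scale
              for x in range(lo, hi) for y in range(lo, hi)]
    avg = sum(errors) / len(errors)
    max_err = max(errors)
    var = sum((e - avg) ** 2 for e in errors) / len(errors)
    return avg, max_err, var
```

Exhaustive evaluation is feasible only for small n, because the number of input pairs grows as 4^n.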

The adaptive estimation methods in [9–11] have been adopted to improve the truncation error. Instead of the exhaustive, resource-intensive simulation methods of previous works [9–11], the QE of SPPC multipliers is analyzed and derived by a simpler statistical method. It is seen from Table 6 that the proposed SPPC multipliers have almost the best error performance compared with previous works, except [11] that has the highest performance of . The reason is that it uses more information from the Booth encoder to alleviate the truncation errors [11]. Nevertheless, the area cost in [11] is increased by the extra information used by the compensation circuits. Even though and of the multipliers in [13] are smaller than other multipliers, their is larger compared with the proposed SPPC multiplier and multipliers in [10–12].

The distributions of QE have been calculated for the different multipliers. The sample ratios of QE value (i.e., ) are listed in the last three columns of Table 6. It is seen from the statistical results that the sample ratios of in the SPPC Booth multipliers are higher than those in [9–11] by about 13%. This shows that the quantization accuracy of the proposed SPPC multiplier is higher compared with those four methods.

4.2. Performance Simulation

A comparison of performances between the proposed SPPC and previous works is implemented by using their own compensation circuits. Multipliers with different widths are synthesized by Synopsys Design Compiler using a standard cell library of TSMC 0.18 μm CMOS process. Their area, delay, and power dissipation are listed in Table 7.

In general, there exists a trade-off between the hardware overhead and the accuracy in these compensation circuits. The multiplier proposed in [11] has the highest accuracy in , but it has a larger area, delay, and power. However, the SPPC multiplier has the same area as the multipliers in [12, 13] and lower area compared with the multipliers in [9–11]. As a result, the proposed SPPC multipliers achieve higher accuracy together with lower area.

To compare the performances of the different multipliers comprehensively, we use their power-delay products as the standard of comparison; these are listed in the last column of Table 7. The power-delay products of the multipliers proposed in [9–11] are distinctly larger than those of the other multipliers. Compared with the other two multipliers in [12, 13], the proposed SPPC multipliers have better comprehensive performance.

4.3. Verification in FIR Filter

The QEC performance of the proposed SPPC is verified by means of a 20-tap low-pass direct-form FIR filter. The filter is designed to have an 8 MHz pass-band (with a 40 MHz sampling frequency) and 70 dB attenuation in the stop-band for a CR detector. The input, output, and coefficient widths of the FIR filter are all 16 bits, and the internal adders of the FIR filter are 22 bits wide. The test input is a 5 MHz sinusoidal signal sampled at 40 MHz.
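A rough software model of this experiment is sketched below, under several assumptions that are not taken from the paper: the coefficients come from a Hamming-windowed sinc design with the cut-off placed at 8 MHz (which will not reach 70 dB attenuation), samples and coefficients use a Q1.15 format, every product is truncated to its upper 16 bits without compensation, and the 22-bit adder width is not modelled. It shows only how the error mean and variance of the filter output can be measured against full-precision products.

```python
import math

FS = 40e6      # sampling frequency (from the text)
F_IN = 5e6     # test tone frequency (from the text)
TAPS = 20      # filter length (from the text)
FC = 8e6       # cut-off frequency (assumption: placed at the pass-band edge)
FRAC = 15      # fractional bits of the Q1.15 format (assumption)
DROP = 16      # product bits discarded by the fixed-width multiplier


def design_lowpass(taps: int, fc: float, fs: float) -> list[float]:
    """Hamming-windowed sinc low-pass filter, normalised to unit DC gain."""
    mid = (taps - 1) / 2
    wc = 2 * math.pi * fc / fs
    h = []
    for n in range(taps):
        t = n - mid
        ideal = wc / math.pi if t == 0 else math.sin(wc * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (taps - 1))
        h.append(ideal * window)
    dc_gain = sum(h)
    return [c / dc_gain for c in h]


def to_fixed(v: float) -> int:
    """Round to Q1.15 and saturate to the 16-bit two's-complement range."""
    q = round(v * (1 << FRAC))
    return max(-(1 << FRAC), min((1 << FRAC) - 1, q))


def fir_error_stats(num_samples: int = 200):
    """Output error (exact minus truncated), in units of the retained LSB."""
    h_q = [to_fixed(c) for c in design_lowpass(TAPS, FC, FS)]
    x_q = [to_fixed(0.9 * math.sin(2 * math.pi * F_IN * n / FS))
           for n in range(num_samples)]
    errors = []
    for n in range(TAPS - 1, num_samples):
        exact = 0
        truncated = 0
        for k, c in enumerate(h_q):
            p = c * x_q[n - k]                   # full-precision product
            exact += p
            truncated += (p >> DROP) << DROP     # keep the upper bits only
        errors.append((exact - truncated) / (1 << DROP))
    mean = sum(errors) / len(errors)
    var = sum((e - mean) ** 2 for e in errors) / len(errors)
    return errors, mean, var
```

Because floor truncation discards a non-negative amount from every product, the output error of this uncompensated baseline is biased in one direction; a compensation scheme aims to bring that mean closer to zero.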

Four different multipliers (16 bits), including the SPPC multiplier and the three multipliers in [11–13], are instantiated in the filter. The error mean and error variance of the output samples for the different instantiated multipliers are listed in Table 8. It is seen from Table 8 that the error mean of [11] is the smallest and the error mean of SPPC is better than that of [12, 13], whereas the error variance of SPPC is the smallest. These results are consistent with the QE accuracy simulation results in the previous section.

In CR detectors, it is very important to detect the signal's spectral peak-values to determine whether the channel is idle [16]. The relative errors of the average spectral peak-values (over 100 simulation runs) with respect to the ideal spectral peak-values of the FIR filter outputs are listed in the last column of Table 8. Table 8 shows that the spectral peak-values of the proposed SPPC multiplier are closer to the ideal peak-value.

5. Conclusion

By further dividing the minor truncated section of the Booth multiplier into adaptive compensation and constant compensation sections, we rebuilt the adaptive QEC for fixed-width multipliers. Based on the number of 1s in the nonzero Booth recoding label, we propose a new QEC method to generate the compensation carries. The simulation results show that the QE of the SPPC is smaller than that of the existing methods. The proposed QEC method and SPPC are useful for DSP systems with large-width multipliers and higher precision requirements.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The authors gratefully acknowledge the support of “Specialized Research Fund for the Doctoral Program of Higher Education” (Grant no. 20120201120026) and “the Fundamental Research Funds for the Central Universities.”