Abstract
Public-key cryptosystems are broadly employed to provide security for digital information. Improving the efficiency of public-key cryptosystems by speeding up calculation and using fewer resources is among the main goals of cryptography research. In this paper, we introduce new symbols extracted from the binary representation of integers, called Bigones. We present modified versions of the classical multiplication and squaring algorithms based on the Bigones to improve the efficiency of big integer multiplication and squaring in number theory based cryptosystems. Compared with classical squaring, the proposed squaring algorithm is about 2 times faster for 32-bit numbers and 3.7 times faster for 8-Kbit numbers; compared with Karatsuba squaring, it is about 7.9 and 2.5 times faster at those sizes. The proposed multiplication algorithm is likewise 2.3 to 3.9 times faster than classical multiplication and 7 to 2.4 times faster than Karatsuba multiplication for 32-bit and 8-Kbit numbers, respectively. Number theory based cryptosystems, which operate in the range of 1-Kbit to 4-Kbit integers, benefit directly from the proposed method since multiplication and squaring are the main operations in most of these systems.
1. Introduction
The growth of digital technologies follows an exponential trend, and as a consequence the need for information security is greater than ever [1, 2]. Cryptography is an essential tool in providing a reasonable solution for this necessity. The modern field of cryptography consists of two main areas, symmetric-key cryptography and public-key cryptography. Symmetric-key cryptosystems use the same key to encrypt and decrypt a message, while public-key cryptosystems use two keys in their protocols. Most public-key cryptosystems [3] use modular exponentiation in their calculation. For example, Diffie and Hellman introduced the first key exchange scheme in 1976, which is based on modular exponentiation [4]. A few years later, in 1978, RSA [3], one of the most used public-key cryptosystems, was also built on modular exponentiation. The ElGamal scheme [5] is another example of a public-key cryptosystem that has been developed based on modular exponentiation.
Modular exponentiation, $c = g^e \bmod m$, is a one-way function because the inverse of a modular exponentiation (recovering $e$ from $c$, $g$, and $m$) is a known hard problem [6–8]. To achieve a comfortable level of security, the length of the key material for these cryptosystems must be larger than 1024 bits [9], and in the near future, it is predicted that 2048-bit and 4096-bit systems will become standard [10].
Calculating modular exponentiation for a large exponent and a large modulus is a costly operation, and improving its efficiency has therefore become an important research issue in cryptography and mathematics. Two main approaches are currently employed to improve the efficiency of modular exponentiation: improving the involved operations, exponentiation and division, separately, and improving both operations simultaneously. The residue number system (RNS) [11] and Montgomery modular multiplication [12] are examples of the first approach, while binary and m-ary exponentiation or Barrett reduction [13] are instances of the second approach. This paper focuses on the second approach by proposing a new number representation that improves the squaring and multiplication operations, two of the three main operations in calculating modular exponentiation [14].
The naive approach of calculating the exponentiation is repeated multiplication, which is not efficient for a large exponent. A better alternative is binary exponentiation; that is, if $e = \sum_{i=0}^{k} e_i 2^i$, where $e_i \in \{0, 1\}$, then $g^e = \prod_{i=0}^{k} \left(g^{2^i}\right)^{e_i}$. The term $g^{2^i}$ can be obtained by squaring the $(i-1)$th term, $g^{2^{i-1}}$. The number of operations for calculating $g^e$ by using the naive method is $e - 1$ multiplications. On the other hand, the binary method requires only $k$ squarings and $k/2$ multiplications (on average), where $k = \lfloor \log_2 e \rfloor$ (see Algorithm 1). Consequently, improving the multiplication and squaring operations (as found in algorithms such as the right-to-left algorithm and its variants [6–8]) will inherently improve the efficiency of the exponentiation calculation [7].
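The binary method described above can be sketched as follows. This is a minimal illustration of right-to-left binary exponentiation, not a reproduction of the paper's Algorithm 1; the function name `binary_pow` and the operation counters are assumptions added for illustration.

```python
def binary_pow(g, e, m):
    """Right-to-left binary exponentiation (illustrative sketch).

    Returns (g**e mod m, number of squarings, number of multiplications).
    """
    result = 1
    base = g % m
    squarings = 0
    multiplications = 0
    while e > 0:
        if e & 1:                       # current exponent bit is 1
            result = (result * base) % m
            multiplications += 1
        e >>= 1
        if e:                           # square only while bits remain
            base = (base * base) % m
            squarings += 1
    return result, squarings, multiplications


value, n_sqr, n_mul = binary_pow(3, 13, 1000)
print(value, n_sqr, n_mul)
```

For the exponent 13 (binary 1101) the sketch performs 3 squarings and 3 multiplications, against 12 multiplications for the naive method.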

2. Multiplication and Squaring Algorithms
The most well-known algorithms for multiplication of two large integers or two polynomials are the classical [15], Karatsuba-Ofman [16], Toom-Cook [17, 18], and fast Fourier transform (FFT) multiplication algorithms [19]. In spite of all the differences among these methods, which sometimes make them appear unrelated to each other, they are founded on the same idea, that is, how to represent a polynomial so that it behaves efficiently in calculations. The classical method uses coefficient representation, while the other three methods use point-value representation. This representation conversion enables us to reduce the cost of convolution from $O(n^2)$ in the classical method to a lower cost for point-to-point multiplication. The process of finding the point-value representation from the coefficient representation is called “evaluation” or “point evaluation” and the reverse process is known as “interpolation.” Table 1 summarizes the differences among the multiplication algorithms by their complexity, technique, and representation used.
Algorithms such as FFT and Toom-Cook have lower algorithm complexity. However, because of preprocessing overheads such as divide and conquer, evaluation, and interpolation, the operating cost of these algorithms is actually much higher, making them useful only when the integers are extremely large. Consequently, only the classical and Karatsuba multiplication algorithms and their combination are used in current cryptosystems. This is especially true after considering circumstances such as memory constraints and the practical finite field size.
2.1. Classical Multiplication and Squaring Algorithms
In a positional numeral system [15], the natural way of multiplying numbers, known as the classical multiplication algorithm, is to multiply each digit of the multiplicand by each digit of the multiplier and then add up all the properly shifted results. This method requires a multiplication table for single digits to be available to the algorithm. Knuth’s classical multiplication algorithm [15] can be stated as shown in Algorithm 2.

The complexity of the classical multiplication algorithm is $O(n^2)$. Therefore, a number representation that has fewer digits should theoretically run faster than a representation that has more digits. In addition, the density of nonzero digits in the numbers influences the number of additions that have to be carried out by the classical multiplication algorithm.
Algorithm 3 shows the modified version of Algorithm 2 that computes the squaring operation efficiently for binary numbers. The efficiency of the modified squaring algorithm comes from Steps 2.1 to 2.2.1. Since the products $x_i x_j$ and $x_j x_i$ are the same, this product is calculated just once in Step 2.2.1. Note that the two-part quantity produced in Step 2.2.1 of Algorithm 3 is the result of the addition: its low part is a single-precision digit, while its high part (the carry) can be a multiple-precision value. With this improvement, the number of partial products in the squaring algorithm is less than what was found in Algorithm 2.
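The saving from the symmetric products can be sketched as follows. This is a simplified illustration rather than the paper's Algorithm 3: it accumulates the doubled cross terms first and propagates carries in one final pass, relying on Python's arbitrary-precision integers. The name `classical_sqr` is hypothetical.

```python
def classical_sqr(x, base=10):
    """Square a little-endian digit list, computing each cross product once."""
    n = len(x)
    w = [0] * (2 * n)
    for i in range(n):
        w[2 * i] += x[i] * x[i]               # diagonal term x_i * x_i
        for j in range(i + 1, n):
            # x_i * x_j equals x_j * x_i, so compute it once and double it.
            w[i + j] += 2 * x[i] * x[j]
    carry = 0
    for k in range(2 * n):                    # single carry-propagation pass
        total = w[k] + carry
        w[k] = total % base
        carry = total // base
    return w


# 123 ** 2 = 15129, digits stored least significant first.
print(classical_sqr([3, 2, 1]))
```

The loops perform $n(n+1)/2$ single-digit products instead of the $n^2$ needed when squaring through general multiplication.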

2.2. Karatsuba Multiplication and Squaring Algorithms
Karatsuba’s algorithm is an efficient scheme for multiplying two large numbers or two polynomials. It was introduced by Karatsuba and Ofman in 1960 and published in 1962 [20]. This algorithm is a remarkable example of the divide and conquer paradigm [21, 22], specifically for its binary splitting [23]. This method requires three multiplications and four additions in each iteration. To apply the algorithm, both numbers are split into a lower and an upper half (for simplicity, assume $n$ is even):
$$x = x_H \cdot 2^{n/2} + x_L, \qquad y = y_H \cdot 2^{n/2} + y_L,$$
$$xy = x_H y_H \cdot 2^{n} + \left[(x_H + x_L)(y_H + y_L) - x_H y_H - x_L y_L\right] \cdot 2^{n/2} + x_L y_L.$$
The halves $x_H$, $x_L$, $y_H$, and $y_L$ are split again in half in the next iteration step. Since every step exactly halves the number of coefficients, the algorithm terminates after $\log_2 n$ steps. Algorithm 4 shows the recursive Karatsuba algorithm (assuming the lengths of $x$ and $y$ are even). The Karatsuba algorithm can be used for squaring with a small modification. Algorithm 5 shows these modifications in Steps 2–3.
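The recursive splitting can be sketched as below for non-negative Python integers. This is an illustrative version only, not the paper's Algorithm 4 or 5; the function name `karatsuba` and the `threshold` parameter (the bit length below which the built-in product is used, assumed to be at least a few bits) are assumptions.

```python
def karatsuba(x, y, threshold=32):
    """Karatsuba product of two non-negative integers (sketch)."""
    if x.bit_length() <= threshold or y.bit_length() <= threshold:
        return x * y                          # small operands: direct product
    half = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << half) - 1
    x_hi, x_lo = x >> half, x & mask          # split x into upper and lower halves
    y_hi, y_lo = y >> half, y & mask          # split y the same way
    hi = karatsuba(x_hi, y_hi, threshold)
    lo = karatsuba(x_lo, y_lo, threshold)
    # Third product recovers the middle term without two extra multiplications.
    mid = karatsuba(x_hi + x_lo, y_hi + y_lo, threshold) - hi - lo
    return (hi << (2 * half)) + (mid << half) + lo


a = (1 << 200) + 12345
b = (1 << 180) + 6789
print(karatsuba(a, b) == a * b)
```

Calling `karatsuba(a, a)` gives a squaring; a dedicated squaring variant would replace the three recursive products by three recursive squarings.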

Combining other multiplication algorithms with the Karatsuba algorithm is another technique that has been used by researchers [24]. The study on squaring and multiplying large integers by Zuras has shown the 2-way, 3-way, and 4-way approaches for calculating big integer multiplication [25]. Sadiq and Ahmed [26] have extended the work further and summarized the results after splitting the long numbers into multiple partitions (up to 10 partitions). More details on squaring algorithms can be found in the literature [6, 8, 27–29].
3. BigOnes Representation and the Proposed Algorithms
In this section, the Bigone (Bo) integer representation and the proposed multiplication and squaring algorithms, which are based on this representation, are presented. Bigone representation is created based on the binary number representation. Bigone is a compact representation with low Hamming weight (HW) compared to the binary number representation.
A Bigone is the numeric value of a sequence of consecutive binary symbols “1” with length $l$ and is denoted by $Bo_l$. Examples of Bo’s are $Bo_2 = (11)_2 = 3$ and $Bo_3 = (111)_2 = 7$. Consider
$$Bo_l = \sum_{i=0}^{l-1} 2^i = 2^l - 1.$$
The set of all Bo’s is called the Bigones’ set.
BigOnes Number System (BONS). Let $X = (x_{n-1} \dots x_1 x_0)$ be a number in radix 2, where each digit $x_i$ is either 0 or a Bigone $Bo_l$, so that a digit $Bo_l$ at position $i$ contributes $Bo_l \cdot 2^i$. This number system is called the Bigone number system and is denoted as BONS. For example, $(110)_2$ can be represented by $Bo_2 \cdot 2^1$ in BONS, but also by $Bo_1 \cdot 2^2 + Bo_1 \cdot 2^1$. This number system is therefore redundant. To transform BONS into a canonic (not redundant) representation, the maximum length of Bigones is used. The canonical version of BONS is known as CBONS. CBONS is a compressed representation of Bigones, obtained by ignoring all the zeros and modifying the notation to $Bo_l^{(p)}$, where $p$ shows the position of the specified Bigone in the binary number. Specifically, $p$ is the position of the least significant bit of the specified Bigone in the binary number. For example, we can write $(1110110)_2$ in CBONS as $Bo_3^{(4)}\, Bo_2^{(1)}$. To optimize the calculations based on CBONS, we can limit the length of Bigones to “$m$” (such that $l \le m$), which we identify in this paper as the maximum length of Bigones [30–32].
3.1. BigOnes Analysis
From the definition of CBONS, it is apparent that there will be at least one digit zero bounding from the left and at least another digit zero bounding from the right of each Bigone digit (except for the least and most significant bits). Consequently, to calculate the number of $Bo_l$s in any given binary number, we have to calculate the probability of “$0\,Bo_l\,0$” patterns appearing in the binary number. Since the probability of digit “1” and digit “0” appearing in a binary digit is $1/2$, it follows that the probability of $0\,Bo_l\,0$ appearing in a binary number is $1/2^{l+2}$. As a result, the number of $Bo_l$s in an $n$-bit binary number is $n/2^{l+2}$. To calculate the Hamming weight of Bo’s in a Bigone number system, it is enough to calculate the total number of Bigones, $HW = \sum_{l} n/2^{l+2}$, where $1 \le l \le n$. Consider
$$HW = \sum_{l=1}^{n} \frac{n}{2^{l+2}} = \frac{n}{4} \sum_{l=1}^{n} \frac{1}{2^{l}}. \quad (3)$$
Since
$$\sum_{l=1}^{n} \frac{1}{2^{l}} = 1 - \frac{1}{2^{n}}, \quad (4)$$
(3) can therefore be written as
$$HW = \frac{n}{4}\left(1 - \frac{1}{2^{n}}\right). \quad (5)$$
For large enough $n$, the number of Bigones (Hamming weight of CBONS) would be
$$HW \approx \frac{n}{4}. \quad (6)$$
Table 2 shows the result of calculating the number of Bigone digits in an 8-Kbit binary number from 10,000 randomly generated binary numbers. As the table indicates, the experimental result agrees with the value found in (6).
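The quarter-density estimate can also be cross-checked exhaustively for small bit lengths. The sketch below counts maximal runs of 1s (that is, Bigone digits) over all bit strings of a fixed length, with leading zeros allowed; the helper names are hypothetical, and this enumeration is an illustration separate from the paper's experiment on random 8-Kbit numbers.

```python
from fractions import Fraction
from itertools import product


def count_bigones(bits):
    """Number of maximal runs of consecutive 1s in a bit sequence."""
    runs, prev = 0, 0
    for b in bits:
        if b == 1 and prev == 0:        # a 0 -> 1 transition opens a new run
            runs += 1
        prev = b
    return runs


def average_bigones(n):
    """Exact average run count over all 2**n bit strings of length n."""
    total = sum(count_bigones(bits) for bits in product((0, 1), repeat=n))
    return Fraction(total, 2 ** n)


print(average_bigones(8))
```

Under this exact count the average is $(n+1)/4$, where the extra $1/4$ comes from the run that may start at the first bit; for large $n$ this approaches the $n/4$ of (6).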
Table 2 also indicates that the occurrence of Bigones decreases as the length of Bigones increases. The goal of the following experiment is therefore to find the optimized length for CBONS, to be used in LCBONS (limited length CBONS).
The maximum length, identified as $m$, is important for applications such as multiplication and squaring. This is because the size of $m$ determines the size of the lookup table (LUT) that needs to be used by the respective algorithms. Table 3 indicates that the practical value for $m$ is 5, since the Hamming weight when $m = 5$ is only slightly bigger than the optimum Hamming weight for CBONS (25.8% compared to 25%) while still producing a relatively compact LUT. Consequently, the following proposed multiplication and squaring algorithms use LCBONS with $m = 5$.
3.2. Converting Binary Representation to BigOnes Representation
Algorithm 6 shows how to convert a binary representation to CBONS representation. In Step 2.2.1, the flag NewBo is set to true if the current bit is 1, and at the same time the position of the new Bigone is saved in “pos.” In Step 2.3.1, while the flag NewBo is true, the length of the current Bigone (Length) is increased by one in each iteration of the loop until a 0 bit is found. The end of the Bigone is identified by setting the flag NewBo to false in Step 2.3.2. Then, the length and position of the newly discovered Bigone digit are saved in the length and position entries of $C[k]$, respectively, where $k$ is the index of the new Bigone in array C.
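The conversion can be sketched as a single scan over the bits. This is an illustrative reading of the description above, not the paper's Algorithm 6: the output format, a list of `(length, position)` pairs ordered from the least significant Bigone, and the names `bin_to_cbons` and `cbons_to_int` are assumptions.

```python
def bin_to_cbons(x):
    """Return [(length, position), ...] for each maximal run of 1s in x."""
    digits = []
    pos = 0
    while x:
        if x & 1:
            start = pos                 # position of the run's lowest bit
            length = 0
            while x & 1:                # extend the run until a 0 bit is found
                length += 1
                x >>= 1
                pos += 1
            digits.append((length, start))
        else:
            x >>= 1
            pos += 1
    return digits


def cbons_to_int(digits):
    """Inverse mapping: a Bigone of length l at position p is (2**l - 1) << p."""
    return sum(((1 << length) - 1) << position for length, position in digits)


print(bin_to_cbons(0b1110110))
```

For the input $(1110110)_2$ the scan finds a Bigone of length 2 at position 1 and one of length 3 at position 4.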

Algorithm 7 is the modified version of Algorithm 6 after applying the maximum length of Bigones in BONS. In Step 2.3.3 of Algorithm 7, the length of the current Bigone digit is checked. If the length of the Bigone is bigger than the maximum allowed length, then the relevant pointer backtracks one bit and sets the value to 0. Step 2.4 of Algorithm 7 acts similarly to Step 2.4 in Algorithm 6, which has been explained earlier.
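A length-limited conversion might look like the sketch below. How the paper's Algorithm 7 treats an over-long run is only outlined in the text, so this version makes an explicit assumption: a run longer than `max_len` is cut into adjacent Bigones of at most `max_len` bits each. The name `bin_to_lcbons` and the output format are hypothetical.

```python
def bin_to_lcbons(x, max_len=5):
    """Return [(length, position), ...] with every length <= max_len."""
    digits = []
    pos = 0
    while x:
        if x & 1:
            start = pos
            length = 0
            # Stop at a 0 bit or when the run reaches the length limit
            # (assumption: the rest of the run starts a new Bigone).
            while (x & 1) and length < max_len:
                length += 1
                x >>= 1
                pos += 1
            digits.append((length, start))
        else:
            x >>= 1
            pos += 1
    return digits


print(bin_to_lcbons(0b1111111, max_len=5))
```

With this assumption, seven consecutive 1s become a Bigone of length 5 at position 0 followed by one of length 2 at position 5, and the represented value is unchanged.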

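To use Algorithms 6 and 7 efficiently in squaring and multiplication, we assume that the output of these algorithms is in the form of $(l_{n-1}, \dots, l_1, l_0)$, where $l_i$ is the length of the Bigone whose least significant bit is at position $i$, and $l_i = 0$ if no Bigone starts there. To reflect this, we change the names of the algorithms to Bin2BOL and Bin2LBOL accordingly.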
3.3. Proposed Multiplication and Squaring Algorithm
Algorithm 8 is a modification of Algorithm 2, which has been designed based on LCBONS. In Step 1, the function Bin2LBOL converts the binary input to its Bigone-length form. This output is a special representation of the input in LCBONS that shows the length of the Bigones. Step 3.1 is introduced to ignore the zeros in the converted operand and consequently helps reduce the number of operations. Another difference is related to Step 3.2.1.1, which uses a lookup function. This function fetches the product of two Bigones of lengths $l_i$ and $l_j$ from a precalculated lookup table.
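The idea of multiplying Bigone digits through a lookup table can be sketched as follows. This is a simplified illustration, not the paper's Algorithm 8: it uses Python integers and shifts for accumulation, a `(length, position)` list instead of the Bigone-length array, and splits over-long runs into adjacent Bigones. `MAX_LEN`, `LUT`, `to_bigones`, and `bigone_mul` are hypothetical names.

```python
MAX_LEN = 5

# LUT[a][b] holds (2**a - 1) * (2**b - 1); row and column 0 are unused padding.
LUT = [[((1 << a) - 1) * ((1 << b) - 1) for b in range(MAX_LEN + 1)]
       for a in range(MAX_LEN + 1)]


def to_bigones(x, max_len=MAX_LEN):
    """Return [(length, position), ...] with every length <= max_len."""
    digits, pos = [], 0
    while x:
        if x & 1:
            start, length = pos, 0
            while (x & 1) and length < max_len:
                length += 1
                x >>= 1
                pos += 1
            digits.append((length, start))
        else:
            x >>= 1
            pos += 1
    return digits


def bigone_mul(x, y):
    """Multiply by pairing Bigone digits; zero bits never enter the loops."""
    xs, ys = to_bigones(x), to_bigones(y)
    w = 0
    for ly, py in ys:
        for lx, px in xs:
            # Product of two Bigones comes from the table, then is shifted.
            w += LUT[lx][ly] << (px + py)
    return w


print(bigone_mul(118, 59) == 118 * 59)
```

The number of table lookups is the product of the two operands' Bigone counts, which is where the low Hamming weight of the representation pays off.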

The proposed squaring algorithm (see Algorithm 9) is a modified version of Algorithm 3. In Step 1, the converter Bin2LBOL converts the binary input to its Bigone-length form, which is a special representation of the input in LCBONS with the maximum length being employed ($m = 5$). Other differences are related to Step 3.1, which has been proposed by Knuth [15] to ignore the zeros in the operand. Similar to Algorithm 8, in Step 3.1.2.1 a lookup function is used to fetch the product of two Bigones from a precalculated lookup table.
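A squaring counterpart can be sketched in the same spirit. This is an illustration rather than the paper's Algorithm 9: each pair of distinct Bigone digits is looked up once and doubled by one extra shift. The names `MAX_LEN`, `LUT`, `to_bigones`, and `bigone_sqr` are hypothetical, and over-long runs are assumed to be split into adjacent Bigones.

```python
MAX_LEN = 5

# LUT[a][b] holds (2**a - 1) * (2**b - 1); row and column 0 are unused padding.
LUT = [[((1 << a) - 1) * ((1 << b) - 1) for b in range(MAX_LEN + 1)]
       for a in range(MAX_LEN + 1)]


def to_bigones(x, max_len=MAX_LEN):
    """Return [(length, position), ...] with every length <= max_len."""
    digits, pos = [], 0
    while x:
        if x & 1:
            start, length = pos, 0
            while (x & 1) and length < max_len:
                length += 1
                x >>= 1
                pos += 1
            digits.append((length, start))
        else:
            x >>= 1
            pos += 1
    return digits


def bigone_sqr(x):
    """Square x using one table lookup per unordered pair of Bigone digits."""
    xs = to_bigones(x)
    w = 0
    for i, (li, pi) in enumerate(xs):
        w += LUT[li][li] << (2 * pi)              # diagonal term
        for lj, pj in xs[i + 1:]:
            w += LUT[li][lj] << (pi + pj + 1)     # cross term, doubled by +1 shift
    return w


print(bigone_sqr(118) == 118 ** 2)
```

For an operand with $k$ Bigone digits this performs $k(k+1)/2$ lookups instead of $k^2$.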

4. Results and Discussion
To compute the Bigones Hamming weight, 10,000 random numbers [33] were generated with different maximum lengths, $m$, and different number lengths ranging from 32 bits to 8 Kbits. The results are summarized in Tables 2 and 3. According to this data, the Hamming weight for numbers larger than 64 bits with $m = 5$ is about 25.8%. If we increase the value of $m$ to 10, we can achieve a slightly better Hamming weight value, that is, about 25%. However, to create a lookup table that can support $m = 10$, we have to use four times more memory than in the case of $m = 5$. The size of the LUT for the case of $m = 5$ is 50 bytes ($5 \times 5 \times 2$ bytes) for squaring and multiplication. In this paper, the results gathered are based on the case of $m = 5$.
Tables 4 and 5 indicate the execution time of the classical squaring (CLSQ) and multiplication (CM_MUL), Karatsuba squaring (KASQ) and multiplication (KA_MUL), and the proposed squaring and multiplication algorithms for randomly generated operands of different bit lengths. The tests were conducted on a machine with an AMD Phenom (TM) 9950 Quad-Core processor, 3 GB RAM, Windows XP (Service Pack 3) OS, and the Dev-C++ version 4.9.9.2 compiler.
According to Table 4, the proposed multiplication algorithm is more efficient than the CM_MUL and KA_MUL algorithms for multiplying numbers ranging from 32 bits to 8 Kbits, which is the range of numbers used by current number theory based cryptosystems. The proposed multiplication algorithm is about 2.3 times faster than CM_MUL for multiplying 32-bit numbers and about 3 times faster for multiplying 64-bit numbers. For numbers ranging from 128 bits to 8 Kbits, this ratio fluctuates between 3.3 and 3.9. Generally, the Karatsuba multiplication algorithm (KA_MUL), with algorithm complexity $O(n^{\log_2 3})$, is slower than the proposed algorithm (with algorithm complexity $O(n^2)$) for multiplying numbers ranging from 32 bits to 8 Kbits. Table 4 shows that the proposed algorithm is about 7 times to 9.6 times faster than the Karatsuba algorithm for multiplying 32-bit to 64-bit numbers. The speedup ratio then declines continuously from 9.6 to about 2.4 as the operands grow from 64 bits to 8 Kbits.
According to Table 5, the proposed squaring algorithm is more efficient than the CLSQ and KASQ algorithms for squaring numbers ranging from 32 bits to 8 Kbits. The proposed algorithm is about 2 times faster than CLSQ for squaring 32-bit numbers, and this ratio gradually increases to 3.7 times for squaring 8-Kbit numbers. In general, the Karatsuba algorithm (KASQ) is slower than the proposed algorithm for squaring numbers in the range of 32 bits to 8 Kbits. Table 5 shows that the proposed algorithm is about 7.9 times to 10.4 times faster than the Karatsuba algorithm for squaring 32-bit to 64-bit numbers. The speedup ratio then declines continuously from 10.4 to about 2.5 as the operands grow from 64 bits to 8 Kbits.
5. Conclusion
A multiplication algorithm and a squaring algorithm with a small lookup table, both based on the classical multiplication algorithm and the Bigones’ representation, are presented in this paper to speed up the squaring and multiplication calculations in public-key cryptography algorithms. The efficiency of the classical multiplication and squaring algorithms does not cover the whole range of numbers used by number theory based cryptosystems. In many instances, it has been reported that, at the threshold of 255 digits, the Karatsuba algorithm performs better than the classical algorithm. In the proposed method, binary numbers are first converted to the Bigone representation before being processed by the proposed multiplication or squaring algorithms. The compact, low Hamming weight Bigone representation decreases the number of submultiplication operations in the squaring and multiplication calculations. The experimental results indicate that the proposed squaring and multiplication algorithms are efficient enough to substitute for the classical algorithm, the Karatsuba algorithm, or the hybrid of the two. This finding should increase the performance of number theory based cryptosystems, which depend heavily on exponentiation (a process that depends on squaring and multiplication) of large integers to achieve the desired level of security.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgment
The researchers would like to thank the Universiti Sains Malaysia for supporting this research through Project Grant (1001/PKOMP/817059).