Research Article  Open Access
Parallel and Regular Algorithm of Elliptic Curve Scalar Multiplication over Binary Fields
Abstract
Accelerating scalar multiplication has always been a significant topic in elliptic curve cryptography, and many approaches have been proposed to achieve this aim. An interesting observation is that modern computers usually have multicore processors, which can be used to perform cryptographic computations in parallel. Inspired by this idea, we present a new parallel and efficient algorithm to speed up scalar multiplication. First, we introduce a new regular halve-and-add method that is very efficient thanks to projective coordinates. Then, we compare several algorithms for double-and-add and halve-and-add. Finally, we combine the best double-and-add and halve-and-add methods to obtain a new, faster parallel algorithm that costs less than the previous best. Furthermore, our algorithm is regular and contains no dummy operations, so it naturally provides protection against simple side-channel attacks.
1. Introduction
The elliptic curve was first introduced into the world of cryptography independently by Neal Koblitz and Victor Miller in 1985 [1, 2] and is now increasingly used in practice for a wide range of cryptographic primitives such as public-key encryption and digital signatures. More than 30 years after its introduction, the practical advantages of the elliptic curve cryptosystem (ECC) are clear and well known: it offers rich algebraic structure, smaller key sizes, and relatively faster implementations at the same security level compared with other deployed schemes such as RSA. For these reasons, ECC is particularly suitable for resource-constrained devices.
The efficiency of ECC is dominated by the speed of scalar multiplication: given a rational point P of order n on an elliptic curve, one must compute kP = P + P + ... + P (k times) for a given scalar k. There is an obvious analogy between scalar multiplication and exponentiation in a multiplicative finite group. Therefore, inspired by the repeated square-and-multiply algorithm, the binary method for scalar multiplication over elliptic curves, commonly called double-and-add, has become a fundamental technique.
In constrained environments, scalar multiplication is easily implemented by the double-and-add variant of Horner's rule, given the binary expansion of the scalar k. However, each bit of k implies a different algorithmic path during each iteration: if the bit is 0, only a point doubling is performed, whereas if the bit is 1, a point doubling followed by a point addition is performed. As a consequence, the different power and time consumption of these two prominent building blocks can be detected by simple power analysis (SPA) [3] and timing attacks; this naive implementation leaks information about the secret scalar k.
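As a plain illustration (not a protected implementation), the unprotected left-to-right binary method can be sketched generically; the function names and the integer stand-in group are our own choices:

```python
def double_and_add(k, P, add, dbl, identity):
    """Left-to-right binary ("double-and-add") scalar multiplication.
    Generic over any abelian group: `add`/`dbl` are the group operations.
    Note: the data-dependent branch on each bit is exactly what SPA exploits."""
    acc = identity
    for i in range(k.bit_length() - 1, -1, -1):
        acc = dbl(acc)                  # always one doubling per bit
        if (k >> i) & 1:
            acc = add(acc, P)           # addition only when the bit is 1
    return acc

# Using integer addition as a stand-in group, k "times" P is just k * P:
assert double_and_add(13, 7, lambda a, b: a + b, lambda a: 2 * a, 0) == 91
```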
Protection against simple side-channel attacks (SSCA) can be achieved by recoding scalars in a regular manner, meaning that scalar multiplication executes the same instructions in the same order for any input value. Coron introduced a countermeasure against SSCA named the double-and-add-always algorithm [4]. By inserting a dummy operation when necessary, it evaluates scalar multiplication by executing one doubling and one addition in every loop iteration. However, it was soon found to be vulnerable to safe-error fault attacks [5, 6]: by inducing a fault during the point addition of a chosen iteration, an adversary can determine whether that operation is a dummy by checking the correctness of the output.
A countermeasure against safe-error fault attacks is to perform scalar multiplication in a predictable pattern. Besides the most commonly used Montgomery-ladder algorithm [7], another efficient method is regular w-ary recoding [8]. This algorithm recodes a scalar into a sequence of windows, each consisting of zeros followed by a nonzero digit, so the proportion of nonzero digits is fixed. However, reading precomputed points from a lookup table can be dangerous if this step is not performed in constant time.
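A minimal sketch of such a regular recoding, in the spirit of the exponent-recoding technique of [8]; the function name and digit conventions here are ours, not taken from the paper:

```python
def regular_recode(k, w):
    """Regular 2^w-ary recoding of an odd integer k (sketch).
    Every digit is odd and lies in {-(2^w - 1), ..., 2^w - 1}, so each
    window triggers the same doubling/addition pattern (no zero digits)."""
    assert k & 1, "k must be odd"
    digits = []
    while k > 2**w:
        d = (k % 2**(w + 1)) - 2**w     # odd digit in [-(2^w - 1), 2^w - 1]
        digits.append(d)
        k = (k - d) >> w                # quotient stays odd
    digits.append(k)                    # final (odd) digit
    return digits

digits = regular_recode(23, 2)          # 3 + 1*4 + 1*16 = 23
assert digits == [3, 1, 1]
assert sum(d * 4**i for i, d in enumerate(digits)) == 23
assert all(d % 2 for d in digits)       # every digit is odd, hence nonzero
```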
Another area of increasing interest for regular scalar multiplication is the use of curve forms that admit a complete addition law. For any pair of rational points on the curve (or in the desired subgroup), a complete addition law computes the correct result regardless of whether the two addends are identical. As a corollary of the main results in [9], elliptic curves embedded in projective space by a symmetric line bundle admit a complete system of addition laws. Later work of Bosma and Lenstra [10] shows that, when suitably chosen, a single addition law can serve as the group operation for all pairs of rational points. One well-studied example is Edwards curves [11, 12], whose exceptional pairs for the addition law lie outside the rational points. A recent work [13] proposed an optimized algorithm that adds any pair of rational points on prime-order elliptic curves defined over fields of characteristic different from 2 and 3.
In [14], the authors introduce a new approach for scalar multiplication called the Montgomery-halving algorithm, a variant of the original Montgomery-ladder point multiplication. They also present a new strategy for parallel implementation of point multiplication over elliptic curves: the Montgomery-halving algorithm runs alongside the original Montgomery-ladder algorithm to compute the scalar multiplication concurrently. Moreover, this parallel algorithm protects against SSCA. However, in their scheme, affine coordinates have to be used for halving, because the projective form of the Montgomery-halving algorithm cannot be used to save operations.
In this paper, we provide a similar parallel implementation method using a regular recoding technique, which is highly efficient because the doubling and halving operations are processed in parallel on two different coprocessors. Our work makes two main contributions.
The first contribution is a new regular algorithm for the halving side, called zeroless signed-digit (ZSD) halve-and-add, which saves cost compared with the Montgomery-halving method of [14] for m = 233 and m = 409. Using projective coordinates saves field inversions, which is especially useful for our ZSD halve-and-add algorithm (Algorithm 1). For the halving operation, the best choice is affine coordinates; for the subsequent addition, the better choice is projective coordinates. The Montgomery-halving algorithm of [14] is forced by its structure to use affine coordinates only, while our Algorithm 1 has a different structure: one operand can always stay in affine coordinates for halving and the accumulator can always stay in projective coordinates for addition, so the mixed projective addition law can be used and no extra coordinate transformations are needed. In addition, the regular recoding technique ensures a secure implementation of scalar multiplication against SSCA.

The second contribution is the new mixed-parallel algorithm. After analyzing all the algorithms in Table 1, we combine the fastest double-and-add method, the Montgomery double-and-add method of [14], with the fastest halve-and-add method, our ZSD halve-and-add algorithm of this paper. The result is a new efficient and secure mixed-parallel algorithm that costs less than the Montgomery-parallel approach of [14] for both m = 233 and m = 409. A more thorough analysis is given in Section 4, and the related cost estimates are displayed in Tables 1 and 2.
 
MontgomeryD = Montgomery double-and-add algorithm; MontgomeryH = Montgomery halve-and-add algorithm; Algorithm 2D (projective) = Algorithm 2 using the projective coordinate system; Algorithm 2D (twisted) = Algorithm 2 using the twisted normal form coordinate system; Algorithm 1H = Algorithm 1 for halve-and-add.

The rest of this paper is organized as follows. In the next section, we introduce the relevant arithmetic of binary elliptic curves, focusing on efficient coordinate representations, the twisted normal form, and how to evaluate scalar multiplication in parallel by combining point halving and doubling. In Section 3, our new regular algorithm for halve-and-add is presented; moreover, a parallel strategy similar to the one detailed in [14] shows how to implement scalar multiplication efficiently in a regular and parallel manner. Cost comparison and expected performance analysis are presented in Section 4. Finally, we conclude the paper with the new mixed-parallel algorithm.
2. Preliminaries
We focus on elliptic curves E defined over a binary field F_2^m = F_2[x]/(f(x)), where f is an irreducible polynomial of degree m, given by the Weierstrass equation

y^2 + xy = x^3 + ax^2 + b,

where a, b are in F_2^m and b is nonzero. Isomorphic to the degree-zero divisor class group, the rational points on E together with the point at infinity form an abelian group, whose basic group operation, addition, is algebraically described by the tangent-and-chord law.
Given two points P1 = (x1, y1) and P2 = (x2, y2) on E, where P1 is not equal to P2 or -P2, write their sum as P3 = (x3, y3) = P1 + P2. The coordinates of P3 can be computed according to the following formulas:

x3 = λ^2 + λ + x1 + x2 + a,
y3 = λ(x1 + x3) + x3 + y1,

with λ = (y1 + y2)/(x1 + x2).
Similarly, given P1 = (x1, y1) with x1 nonzero, write its double as P3 = (x3, y3) = 2P1. The coordinates of P3 can be computed according to the following formulas [15]:

x3 = λ^2 + λ + a,
y3 = x1^2 + (λ + 1)x3,

with λ = x1 + y1/x1.
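To make the affine group law concrete, here is a toy check of the addition and doubling formulas above over GF(2^4). The field, curve coefficients, and base point are illustrative choices of ours, not a standardized curve:

```python
# GF(2^4) with reduction polynomial x^4 + x + 1; curve y^2 + xy = x^3 + A*x^2 + B.
M, F = 4, 0b10011
A, B = 0b0001, 0b0110

def mul(u, v):
    """Carry-less multiplication with interleaved reduction mod f."""
    r = 0
    while v:
        if v & 1:
            r ^= u
        v >>= 1
        u <<= 1
        if (u >> M) & 1:
            u ^= F
    return r

def inv(u):
    """Field inversion via u^(2^M - 2) (square-and-multiply)."""
    r, e = 1, 2**M - 2
    while e:
        if e & 1:
            r = mul(r, u)
        u = mul(u, u)
        e >>= 1
    return r

def on_curve(pt):
    x, y = pt
    return mul(y, y) ^ mul(x, y) == mul(mul(x, x), x) ^ mul(A, mul(x, x)) ^ B

def add(p, q):     # assumes p != q, p != -q
    (x1, y1), (x2, y2) = p, q
    lam = mul(y1 ^ y2, inv(x1 ^ x2))
    x3 = mul(lam, lam) ^ lam ^ x1 ^ x2 ^ A
    return x3, mul(lam, x1 ^ x3) ^ x3 ^ y1

def dbl(p):        # assumes x1 != 0
    x1, y1 = p
    lam = x1 ^ mul(y1, inv(x1))
    x3 = mul(lam, lam) ^ lam ^ A
    return x3, mul(x1, x1) ^ mul(lam ^ 1, x3)

P = (0b0001, 0b0010)          # a point on this toy curve
assert on_curve(P)
Q = dbl(P)
assert on_curve(Q)            # doubling stays on the curve
assert on_curve(add(P, Q))    # so does addition of distinct points
```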
From the above formulas it is easy to see that field inversions in the base field are unavoidable, and they are expensive. For this reason projective coordinate systems, which involve no field inversions, are usually preferred, and various coordinate systems are available in practice. The work in this paper exploits two state-of-the-art systems: λ-coordinates and the projective coordinates of the twisted normal form. They excel in different situations.
2.1. λ-Coordinates
Efficient point representation is of great importance for accelerating scalar multiplication. Inversion in the base field takes a large amount of time, yet it is indispensable if points are represented in affine coordinates. The homogeneous projective coordinate system (also called the standard projective system) eliminates this obstacle by mapping a rational affine point (x, y) to any of its projective copies (X : Y : Z) with x = X/Z and y = Y/Z, where Z is nonzero. When the projective copy (X : Y : Z) corresponds to the affine point (X/Z^2, Y/Z^3), it is the Jacobian projective coordinate system. Later, López and Dahab proposed a new and efficient projective coordinate system; compared with the above systems, the correspondence here is (X/Z, Y/Z^2) [16], denoted LD coordinates for short. Kim and Kim then presented a four-dimensional LD coordinate system for binary curves which represents a point as (X, Y, Z, T) with T = Z^2, caching the square of the Z-coordinate.
The λ-coordinate system was first noticed by Knudsen [17] when studying halving operations on binary elliptic curves, and Oliveira [18] further surveyed its arithmetic. Given a point P = (x, y) with x nonzero, the affine λ-representation of P is (x, λ), where λ = x + y/x. It is then easy to derive point addition and doubling formulas in affine λ-coordinates from the usual affine ones. Let P = (x_P, λ_P) and Q = (x_Q, λ_Q) be two points on E with P not equal to Q or -Q; then P + Q = (x3, λ3) is given by the following formulas:

x3 = x_P x_Q (λ_P + λ_Q) / (x_P + x_Q)^2,
λ3 = x_Q (x_P + x3)^2 / (x_P x3) + λ_P + 1.
For the doubling operation, 2P = (x3, λ3) is given as follows:

x3 = λ_P^2 + λ_P + a,
λ3 = x_P^2 / x3 + λ_P + x3 + 1.
In the projective setting, the translation between the affine λ-representation (x, λ) and the projective representation (X, Λ, Z) is defined by x = X/Z and λ = Λ/Z, with Z nonzero. The negative of (X, Λ, Z) is (X, Λ + Z, Z). For two points P and Q given in projective λ-representation on a binary elliptic curve, inversion-free addition and doubling formulas are obtained from the affine formulas above by substituting x = X/Z and λ = Λ/Z and clearing denominators.
The associated projective group addition and doubling operations can each be computed with a handful of field multiplications and squarings; throughout, M denotes a field multiplication and S a squaring.
Given the above formulas, a natural idea is to merge the doubling and addition formulas into a single formula evaluating 2Q + P, which is of great importance in the later parts of this paper.
Let P and Q be points of E. Then 2Q + P can be computed by composing the doubling and addition formulas while skipping the computation of the intermediate λ-coordinate; the explicit formulas are given in [18].
Using this, the operation 2Q + P can be evaluated with fewer field multiplications and squarings than a separate doubling followed by an addition [18].
2.2. Twisted Normal Form
The twisted normal form [19] can be seen as a complement and extension of the normal form [20]. The related definitions, theorems, equations, and group laws of the twisted normal form and the normal form are given in Kohel's series of papers [19–22]. There are three variants, called the (twisted) normal form, the (twisted) semisplit normal form, and the (twisted) split normal form. For practical reasons, only the twisted split normal form is used here.
Let E be an elliptic curve in twisted split normal form over a characteristic-two finite field, and let P and Q be two points on the curve. A complete system of addition laws for E is given by two maps; we refer to [19] for the explicit expressions.
The doubling map sends a point P to one of two expressions, depending on which of the two formulas is defined at P [19]. In twisted split normal form, addition of generic points and doubling of a generic point can each be evaluated with a small number of field multiplications M and squarings S [19].
Among the studied coordinate systems for binary curves, the twisted normal form and λ-projective coordinates appear to be the fastest. The difference is that the twisted normal form is better for computing double-and-add, while λ-projective coordinates can also be used with the halving operation. The costs of the point operations in the various point representation systems are shown in Table 3.

2.3. Halving Operation
The main ingredient we consider is a cyclic subgroup G of E of odd order n. The multiplication-by-2 map on G is an isomorphism, and so is its inverse, the halving operation. The use of point halving to speed up scalar multiplication was first investigated by Knudsen [17]. Given a point Q = (x_Q, y_Q) in G, halving computes the point P = (x_P, y_P) satisfying 2P = Q at the cost of a field multiplication, a square root, and the solution of a quadratic equation, which can be understood directly from the doubling relations below:

λ = x_P + y_P / x_P,    (1)
x_Q = λ^2 + λ + a,    (2)
y_Q = x_P^2 + x_Q(λ + 1).    (3)

The most commonly used method is to solve the second equation for λ, then the third one for x_P, and finally the first one for y_P.
When the λ-representation (x, λ) with λ = x + y/x is used instead of the affine coordinates (x, y), the halving relations become

λ_P^2 + λ_P = a + x_Q,    (1)
x_P^2 = x_Q(λ_P + λ_Q + x_Q + 1).    (2)

This time only two steps are needed: solve the first equation for λ_P and then the second one for x_P. Without computing y_P, the coordinates of the halved point are obtained more simply.
As shown in [23], solving a quadratic equation λ^2 + λ = c over F_2^m (with m odd and Tr(c) = 0) amounts to computing the half-trace function H(c), the sum of c^(2^(2i)) for i = 0, ..., (m-1)/2. Although extra memory is needed, Fong et al. [23] showed a technique that significantly reduces the required time and space. With a dedicated implementation, a point halving takes approximately twice the time of a field multiplication, significantly faster than the customarily used point doubling.
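A sketch of solving λ^2 + λ = c via the half-trace, over GF(2^233) with the NIST B-233 reduction polynomial x^233 + x^74 + 1. The bit-serial field multiplication is for illustration only; real implementations use far faster arithmetic and the table-based half-trace of [23]:

```python
M = 233
F = (1 << 233) | (1 << 74) | 1     # NIST B-233 field polynomial

def gf_mul(a, b):
    """Bit-serial carry-less multiply with interleaved reduction mod f."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if (a >> M) & 1:
            a ^= F
    return r

def gf_sq(a):
    return gf_mul(a, a)

def half_trace(c):
    """H(c) = sum of c^(2^(2i)), i = 0..(M-1)/2. When Tr(c) = 0,
    h = H(c) satisfies h^2 + h = c."""
    h = c
    for _ in range((M - 1) // 2):
        h = gf_sq(gf_sq(h))    # h -> h^4
        h ^= c
    return h

# Any c of the form z^2 + z automatically has trace 0, so a solution exists:
z = 0x0123456789ABCDEF0123456789ABCDEF
c = gf_sq(z) ^ z
h = half_trace(c)
assert gf_sq(h) ^ h == c       # h solves the quadratic (h is z or z + 1)
```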
From the algorithmic view, the halve-and-add method [17] expands a scalar in a radix-1/2 representation. Let t be the binary length of n; first compute k' = 2^(t-1) k mod n, so that kP can be expressed through the bits of k' with negative powers of 2. Much as in double-and-add, the point multiplication kP can then be computed efficiently by applying point halving to an accumulator. It can be further optimized by combining methods like NAF, as shown in [23].
Enlightened by the treatment in halve-and-add, if we choose an appropriate split position s smaller than the bit length of k, the scalar can be split into two parts naturally. In consequence, the halve-and-add method is easy to run concurrently with the double-and-add algorithm, making use of the growing number of cores in modern processors, which is much faster than running a single algorithm alone (some unavoidable pre- and post-computation must be taken into account). Specifically, once a proper s has been chosen, the scalar splits into two portions processed simultaneously by halve-and-add and double-and-add; the length of each part depends on the actual implementation speed of halving and doubling and can be tuned experimentally.
If we already have the binary expansion of k and the point has odd order n, then 2 is invertible modulo n and negative powers of 2 are well defined. The scalar multiplication kP then splits into two parts directly.
The first part is executed by the halve-and-add method, while the second part is performed by a double-and-add approach, in two different threads.
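A small numeric check (notation ours) that the split behind this pairing is sound: writing k' = 2^s k mod n and cutting k' into a high part k1 and a low part k2 gives k congruent to k1 + k2 * 2^(-s) modulo any odd order n:

```python
def split_scalar(k, n, s):
    """Split k as k = k1 + k2 * 2^{-s} (mod n) for odd n (sketch)."""
    kp = (pow(2, s, n) * k) % n          # k' = 2^s * k mod n
    return kp >> s, kp & ((1 << s) - 1)  # (k1: high bits, k2: s low bits)

n = (1 << 89) - 1            # an arbitrary odd modulus standing in for the order
k, s = 123456789123456789, 32
k1, k2 = split_scalar(k, n, s)
# k1 goes to double-and-add, k2 to halve-and-add; this identity is why it works.
# pow(2, -s, n) (Python 3.8+) is the modular inverse of 2^s.
assert (k1 + k2 * pow(2, -s, n)) % n == k % n
```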
As far as side-channel attacks are concerned, noticing that double-and-add can be implemented with the Montgomery-ladder point multiplication, Negre and Robert [14] presented an analogous Montgomery-halving algorithm. The two registers hold a fixed difference, and the algorithm processes one point halving and one addition in each iteration. However, as the authors note, this algorithm can only be implemented in affine coordinates, since the halving operation cannot be implemented efficiently in projective coordinates within their structure. To overcome this drawback, we present another regular recoding algorithm that can be used when implementing parallel halve-and-add/double-and-add in the projective coordinate system.
3. Regular Implementation
Protecting the implementation of scalar multiplication against SSCA can be achieved in many ways. Compared with an unprotected implementation, algorithmic countermeasures such as recoding scalars in a regular manner always sacrifice some efficiency, yet this loss can be mitigated by taking advantage of the inherent parallelism of modern processors.
3.1. Zeroless Signed-Digit Expansion
In general, point addition and doubling on elliptic curves are complicated and time-consuming operations, and many researchers have worked on speeding them up, as we do here. As is well known, negating a point is very cheap, so subtraction of points on elliptic curves is just as efficient as addition. This motivates modifying the binary method to signed-digit representations; that is, the scalar is represented with digits in {-1, 0, 1} instead of {0, 1}. Among the many signed-digit representations, we choose the zeroless signed-digit expansion to build regular algorithms improving the resistance of scalar multiplication against timing attacks and SPA.
The zeroless signed-digit (ZSD) expansion [24] is a highly regular scalar recoding that expresses an odd integer k with digits in {-1, 1}. Since the digit 0 does not occur in the recoded sequence, each iteration of the point multiplication performs exactly one doubling and one addition, providing a natural protection against timing attacks and SPA.
Let (k_{t-1}, ..., k_1, k_0)_2 be the binary expansion of a scalar k. Note that any pair of consecutive bits (0, 1) can be rewritten as the signed digits (1, -1), since 2^(i+1) - 2^i = 2^i. A similar treatment applies to the radix-1/2 expansion of k, since 2^(-(i-1)) - 2^(-i) = 2^(-i), so when applying the halve-and-add algorithm, consecutive bits can be rewritten in the same way. Thus, if k is an odd integer, its radix-2 ZSD expansion (or the corresponding radix-1/2 one) with digits in {-1, 1} can be obtained directly from its binary expansion.
From a security standpoint, every digit should be nonzero; when k is even, a special treatment is required. This can be circumvented by computing with the least significant bit of k forced to 1 and finally subtracting P (or the corresponding point in the radix-1/2 setting) from the obtained result if bit k_0 is zero. The three algorithms in this paper use this trick to handle both even and odd inputs correctly.
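A sketch of the ZSD recoding for an odd scalar; the function name and digit order are ours. Digit d_i is +1 when bit i+1 of k is 1 and -1 otherwise, with a leading +1:

```python
def zsd_recode(k):
    """Return ZSD digits of odd k, least significant first, all in {-1, +1}."""
    assert k >= 1 and k & 1, "k must be odd"
    n = k.bit_length()
    bits = [(k >> i) & 1 for i in range(n)]
    return [1 if bits[i + 1] else -1 for i in range(n - 1)] + [1]

digits = zsd_recode(29)       # 29 = 16 + 8 + 4 + 2 - 1
assert digits == [-1, 1, 1, 1, 1]
assert sum(d * 2**i for i, d in enumerate(digits)) == 29
assert 0 not in digits        # no zero digit, hence a fixed operation pattern
```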
Knowing the ZSD expansion, we obtain regular algorithms by combining it with the common binary methods for scalar multiplication. Algorithm 2 gives the regular ZSD double-and-add method based on the radix-2 expansion from left to right, while Algorithm 3 works from the opposite direction.


Algorithms 2 and 3 give regular binary methods to evaluate elliptic scalar multiplication based on the radix-2 expansion. For the halving part, a similar procedure based on the radix-1/2 expansion is needed, namely a halve-and-add method. With a slight modification of Algorithms 2 and 3, we obtain Algorithm 1 for regular halve-and-add.
3.2. Parallelized Regular Scalar Multiplication
Let P be a point of odd order n with bit length t and let k be a scalar. The parallelized double-and-add/halve-and-add algorithm for scalar multiplication consists of three phases: preprocessing, implementation, and postprocessing; Figure 1 gives an overview of the whole process. Preprocessing: select a proper split position s and compute k' = 2^s k mod n, so that k' = k_1 2^s + k_2, where k_1 consists of the most significant bits and k_2 of the s least significant bits of k'. This gives kP = k_1 P + k_2 (2^(-s) P) mod n. Implementation: the point multiplication is done concurrently in two threads. (1) Feed k_1 and P as inputs to the regular double-and-add algorithm, using Algorithm 2 or Algorithm 3, in one thread; the resulting point is stored in a register. (2) Meanwhile, feed k_2 and P as inputs to the regular halve-and-add algorithm, Algorithm 1, in another thread; the resulting point is stored in a register. Postprocessing: a single point addition combines the two partial results into the correct scalar multiplication kP.
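The three phases can be sketched end to end with integers modulo an odd n standing in for curve points (the point P becomes 1, doubling becomes multiplication by 2, halving becomes multiplication by the inverse of 2 mod n); all names here are illustrative, not the paper's:

```python
from concurrent.futures import ThreadPoolExecutor

n = (1 << 255) - 19            # any odd "group order" works for this stand-in
inv2 = pow(2, -1, n)           # point halving <-> multiplying by 2^{-1} mod n

def double_and_add(k1):
    """Computes k1 * P (left-to-right binary method on the stand-in group)."""
    acc = 0
    for bit in bin(k1)[2:]:
        acc = (2 * acc) % n            # point doubling
        if bit == "1":
            acc = (acc + 1) % n        # addition of P
    return acc

def halve_and_add(k2, s):
    """Computes k2 * 2^{-s} * P: add-then-halve over the low bits of k2."""
    acc = 0
    for i in range(s):
        if (k2 >> i) & 1:
            acc = (acc + 1) % n        # addition of P
        acc = (acc * inv2) % n         # point halving
    return acc

# Preprocessing: k' = 2^s * k mod n, split into k1 (high) and k2 (s low bits).
k, s = 0xDEADBEEFCAFE, 24
kp = (pow(2, s, n) * k) % n
k1, k2 = kp >> s, kp & ((1 << s) - 1)

# Implementation: two threads; postprocessing: one final addition.
with ThreadPoolExecutor(max_workers=2) as ex:
    f1 = ex.submit(double_and_add, k1)
    f2 = ex.submit(halve_and_add, k2, s)
    result = (f1.result() + f2.result()) % n

assert result == k % n         # with P = 1, kP is simply k mod n
```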
4. Comparison and Expected Performance
Numerous standards include the NIST-recommended curves as underlying abelian groups for cryptographic protocols. The conclusions in Tables 1 and 2 are specifically for the NIST-recommended random binary curves, of the form y^2 + xy = x^3 + x^2 + b, where b is an element of F_2^m. For easy comparison, the two curves considered in this section, with their cost estimates, are NIST B-233 and NIST B-409, defined over F_2^233 and F_2^409, respectively.
4.1. Analysis
The theoretical complexity of the four scalar multiplication approaches considered is reported in Table 1. Our goal is to improve the algorithms of [14] and give a better parallel algorithm for evaluating scalar multiplication. (Algorithms 2 and 3 have similar complexity, and only Algorithm 2 is discussed in what follows.)
For a regular implementation secure against SSCA, both the Montgomery methods and our new methods need m point doublings and m point additions for the double-and-add algorithms, and m point halvings and m point additions for the halve-and-add algorithms. In MontgomeryD, the doubling and addition operations are those of the very efficient Montgomery double-and-add algorithm of [25], which needs remarkably few field operations; here M and I denote field multiplication and inversion. In MontgomeryH, the halving and addition operations are performed in affine coordinates. A halving involves a field multiplication, a trace computation, the solution of a quadratic equation, and a square root; following the analysis and experiments of [14, 23], fixed field-operation counts can be assigned to halving in affine and in projective coordinates, and an affine addition additionally requires a field inversion. Unavoidably, the structure of the MontgomeryH algorithm forces the use of affine coordinates only, because no suitable projective formulas are known for it so far, which hurts its efficiency significantly, as the estimates below show.
In Algorithm 2, the doubling and addition operations are performed either in projective coordinates or in the twisted projective coordinates, with the corresponding field-operation counts listed in Table 1. In Algorithm 1, the halving is performed in affine coordinates. Moreover, if the mixed addition operation and the 2Q + P formula of Section 2.1 are exploited, the total cost of Algorithm 2 in projective coordinates consists of the per-bit cost plus the cost of the final mixed addition and of transforming the final result from projective to affine coordinates; the same accounting applies in the twisted projective coordinates. As for Algorithm 1, the mixed projective coordinate system can be applied to save inversions, owing to its structure being different from that of MontgomeryH, and the corresponding field-operation count is given in Table 1.
In this work, referring to [14, 23], we assume a fixed inversion-to-multiplication ratio I/M and ignore the cost of squarings. In fact, squaring is nearly the fastest of all the field operations discussed in this paper and usually costs far less than a multiplication, so it can be neglected. For I/M we take a commonly used reference value; in practice, I/M may well be larger, in which case MontgomeryH is affected the most while the other three methods are almost unaffected. This is a further benefit of using the projective coordinate system. With these cost assumptions, the two examples NIST B-233 and B-409 are evaluated in Table 1 for easier understanding.
For double-and-add, the MontgomeryD algorithm is so efficient that Algorithm 2 cannot catch up with it even when using the twisted projective coordinate system, the fastest to date. For halve-and-add, Algorithm 1 saves cost compared with MontgomeryH for m = 233 and m = 409. Our regular halve-and-add algorithm is therefore much more useful in practice thanks to projective coordinates, and with the faster halving algorithm the parallel method also becomes much more efficient.
One may ask why the mixed projective coordinate system cannot be applied to MontgomeryH, since comparing the two algorithms in different coordinate systems may seem unfair. The reason lies in the structure of MontgomeryH [14]. Suppose one register is held in affine coordinates and the other in projective coordinates after processing a bit equal to 0; when the next bit is 1, the roles swap, so a point would have to be transformed into affine coordinates for the halving and into projective coordinates for the mixed addition. Every time two consecutive bits differ, such a transformation is required. For a random m-bit scalar, the expected number of positions where a 0 is adjacent to a 1 (or a 1 to a 0) is approximately m/2, and each transformation from projective to affine coordinates costs a field inversion and several multiplications. Taking these costs into account, MontgomeryH in mixed coordinates would need more field operations than the plain affine version. The best solution to this problem is therefore a new structure such as that of Algorithm 1.
4.2. Parallel and New Discovery
Negre and Robert [14] took inspiration from [26] and used a split technique similar to the one introduced there. They also provided a Montgomery-halving algorithm analogous to the original Montgomery-ladder scalar multiplication method. With these tools, a parallel method combining the MontgomeryD and MontgomeryH algorithms was presented. Unfortunately, the MontgomeryH method of [14] can only use affine coordinates because of its special structure. To solve this, we propose a new regular parallel approach combining MontgomeryD and Algorithm 1, which we call mixed-parallel.
After the analysis of each algorithm in Section 4.1, we can take a suitable split and examine the complexity in the parallel setting; the results are shown in Table 2. In the Algorithm column, Montgomery-parallel is the parallel algorithm of [14], which executes MontgomeryD and MontgomeryH concurrently in two threads. Our mixed-parallel in the last line is the new combined algorithm, which runs MontgomeryD and Algorithm 1 simultaneously on different coprocessors.
MontgomeryD clearly has the lowest cost among the algorithms in Table 1, yet either parallel method in Table 2 costs less than MontgomeryD. Comparing Montgomery-parallel with MontgomeryD first, the Montgomery-parallel algorithm saves cost over MontgomeryD for m = 233 and m = 409, so parallelism is indeed a good idea for computing scalar multiplication. Furthermore, combining the best double-and-add algorithm, MontgomeryD, with the best halve-and-add algorithm, Algorithm 1H, yields the new efficient parallel method, mixed-parallel. The estimates show that our mixed-parallel method costs less than Montgomery-parallel for m = 233 and m = 409, respectively. This sets a new record.
5. Conclusion
In this paper, we present a new parallel algorithm that improves on the Montgomery algorithm of [14]. Both methods take advantage of the inherent parallelism of modern processors to construct parallel approaches. Instead of a Montgomery-like ladder, a regular recoding technique is applied in our approach, which is highly efficient because double-and-add and halve-and-add are processed in parallel. Like the Montgomery approach, the regular method protects the computation against SSCA.
After a careful analysis of these algorithms, we conclude that our regular halve-and-add approach, Algorithm 1, can use projective coordinates, making up for the disadvantage of MontgomeryH and saving cost compared with MontgomeryH for m = 233 and m = 409.
As a result, combining MontgomeryD and Algorithm 1 yields a new, preferable parallel approach, our mixed-parallel. It costs less than Montgomery-parallel for m = 233 and m = 409, respectively. This is a new record as well as a good improvement on, and supplement to, the previous excellent work of [14].
Data Availability
All data generated or analyzed during this study are included in this published article.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this article.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (Nos. 61872442, 61772515, 61502487, and U1936209); the National Cryptography Development Fund (No. MMJJ20180216); and the Beijing Municipal Science & Technology Commission (Project no. Z191100007119006).
References
[1] N. Koblitz, "Elliptic curve cryptosystems," Mathematics of Computation, vol. 48, no. 177, pp. 203–209, 1987.
[2] V. S. Miller, "Use of elliptic curves in cryptography," in Proceedings of the Annual International Cryptology Conference (CRYPTO '85), Springer, Santa Barbara, CA, USA, August 1985.
[3] P. Kocher, J. Jaffe, and B. Jun, "Differential power analysis," in Proceedings of the Annual International Cryptology Conference, Springer, Santa Barbara, CA, USA, August 1999.
[4] J.-S. Coron, "Resistance against differential power analysis for elliptic curve cryptosystems," in Proceedings of the International Workshop on Cryptographic Hardware and Embedded Systems, Springer, Worcester, MA, USA, August 1999.
[5] S.-M. Yen and M. Joye, "Checking before output may not be enough against fault-based cryptanalysis," IEEE Transactions on Computers, vol. 49, no. 9, pp. 967–970, 2000.
[6] S.-M. Yen, "A countermeasure against one physical cryptanalysis may benefit another attack," in Proceedings of the International Conference on Information Security and Cryptology, Springer, Seoul, Korea, December 2001.
[7] P. L. Montgomery, "Speeding the Pollard and elliptic curve methods of factorization," Mathematics of Computation, vol. 48, no. 177, pp. 243–264, 1987.
[8] M. Joye and M. Tunstall, "Exponent recoding and regular exponentiation algorithms," in Proceedings of the International Conference on Cryptology in Africa, Springer, Gammarth, Tunisia, June 2009.
[9] H. Lange and W. Ruppert, "Complete systems of addition laws on abelian varieties," Inventiones Mathematicae, vol. 79, no. 3, pp. 603–610, 1985.
[10] W. Bosma and H. W. Lenstra, "Complete systems of two addition laws for elliptic curves," Journal of Number Theory, vol. 53, no. 2, pp. 229–240, 1995.
[11] D. J. Bernstein and T. Lange, "Faster addition and doubling on elliptic curves," in Proceedings of the International Conference on the Theory and Application of Cryptology and Information Security, Springer, Kuching, Malaysia, December 2007.
[12] H. Hisil, K. K.-H. Wong, G. Carter, and E. Dawson, "Twisted Edwards curves revisited," in Proceedings of the International Conference on the Theory and Application of Cryptology and Information Security, Springer, Melbourne, Australia, December 2008.
[13] J. Renes, C. Costello, and L. Batina, "Complete addition formulas for prime order elliptic curves," in Proceedings of the Annual International Conference on the Theory and Applications of Cryptographic Techniques, Springer, Vienna, Austria, May 2016.
[14] C. Negre and J.-M. Robert, "New parallel approaches for scalar multiplication in elliptic curve over fields of small characteristic," IEEE Transactions on Computers, vol. 64, no. 10, pp. 2875–2890, 2015.
[15] D. Hankerson, A. J. Menezes, and S. Vanstone, Guide to Elliptic Curve Cryptography, Springer, New York, NY, USA, 2004.
[16] J. López and R. Dahab, "Improved algorithms for elliptic curve arithmetic in GF(2^n)," in Proceedings of the International Workshop on Selected Areas in Cryptography, Springer, Kingston, Canada, August 1998.
[17] E. W. Knudsen, "Elliptic scalar multiplication using point halving," in Proceedings of the International Conference on the Theory and Application of Cryptology and Information Security, Springer, Singapore, November 1999.
[18] T. Oliveira, J. López, D. F. Aranha, and F. Rodríguez-Henríquez, "Lambda coordinates for binary elliptic curves," in Proceedings of the International Workshop on Cryptographic Hardware and Embedded Systems, Springer, Santa Barbara, CA, USA, August 2013.
[19] D. Kohel, "Twisted μ₄-normal form for elliptic curves," in Proceedings of the Annual International Conference on the Theory and Applications of Cryptographic Techniques, Springer, Paris, France, April 2017.
[20] D. Kohel, "Efficient arithmetic on elliptic curves in characteristic 2," in Proceedings of the International Conference on Cryptology in India, Springer, Chennai, India, 2012.
[21] D. Kohel, "A normal form for elliptic curves in characteristic 2," in Arithmetic, Geometry, Cryptography and Coding Theory (AGCT 2011), Springer, Berlin, Germany, 2011.
[22] D. Kohel, "Addition law structure of elliptic curves," Journal of Number Theory, vol. 131, no. 5, pp. 894–919, 2011.
[23] K. Fong, D. Hankerson, J. López, and A. Menezes, "Field inversion and point halving revisited," IEEE Transactions on Computers, vol. 53, no. 8, pp. 1047–1059, 2004.
[24] R. R. Goundar, M. Joye, A. Miyaji, M. Rivain, and A. Venelli, "Scalar multiplication on Weierstraß elliptic curves from co-Z arithmetic," Journal of Cryptographic Engineering, vol. 1, no. 2, pp. 161–176, 2011.
[25] J. López and R. Dahab, "Fast multiplication on elliptic curves over GF(2^m) without precomputation," in Proceedings of the International Workshop on Cryptographic Hardware and Embedded Systems, Springer, Worcester, MA, USA, August 1999.
[26] J. Taverne, A. Faz-Hernández, D. F. Aranha, F. Rodríguez-Henríquez, D. Hankerson, and J. López, "Speeding scalar multiplication over binary elliptic curves using the new carry-less multiplication instruction," Journal of Cryptographic Engineering, vol. 1, no. 3, pp. 187–199, 2011.
Copyright
Copyright © 2020 Xingran Li et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.