CN110908635A - High-speed modular multiplier based on post-quantum cryptography of homologus curve and modular multiplication method thereof - Google Patents


Info

Publication number
CN110908635A
Authority
CN
China
Prior art keywords
data
modular
post
multiplication
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911073701.7A
Other languages
Chinese (zh)
Inventor
王中风
汪漂洋
田静
林军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University filed Critical Nanjing University
Priority to CN201911073701.7A priority Critical patent/CN110908635A/en
Publication of CN110908635A publication Critical patent/CN110908635A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/60Methods or arrangements for performing computations using a digital non-denominational number representation, i.e. number representation without radix; Computing devices using combinations of denominational and non-denominational quantity representations, e.g. using difunction pulse trains, STEELE computers, phase computers
    • G06F7/72Methods or arrangements for performing computations using a digital non-denominational number representation, i.e. number representation without radix; Computing devices using combinations of denominational and non-denominational quantity representations, e.g. using difunction pulse trains, STEELE computers, phase computers using residue arithmetic
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/08Key distribution or management, e.g. generation, sharing or updating, of cryptographic keys or passwords
    • H04L9/0816Key establishment, i.e. cryptographic processes or cryptographic protocols whereby a shared secret becomes available to two or more parties, for subsequent use
    • H04L9/0852Quantum cryptography

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Complex Calculations (AREA)

Abstract

The invention discloses a high-throughput modular multiplier for isogeny-based post-quantum cryptography and a corresponding modular multiplication method. The modular multiplier mainly comprises a multiplication module, a reduction module and a post-processing module. The multiplication module reduces the number of multipliers by means such as the Karatsuba method. The reduction module uses constant multipliers, which consume fewer resources, together with a parallelization strategy. The post-processing module parallelizes its adders and precomputes constant parameters. As a result, the modular multiplier of the invention achieves high throughput. In addition, the disclosed modular multiplication method is based on a prime form that admits an unconventional radix and uses an optimized Barrett reduction, which makes it faster than the traditional Montgomery representation. In summary, the invention provides an effective modular multiplier architecture and modular multiplication method for current encryption schemes based on isogeny-based post-quantum cryptography.

Description

High-speed modular multiplier based on post-quantum cryptography of homologus curve and modular multiplication method thereof
Technical Field
The invention relates to a modular multiplier and a modular multiplication method in the field of cryptography, and in particular to a high-throughput modular multiplier and its modular multiplication method for post-quantum encryption schemes.
Background
In recent years, great progress has been made in research on quantum computers. Many common public-key algorithms, such as RSA and elliptic curve cryptography (ECC), can be broken easily by a quantum computer running Shor's algorithm. This has accelerated the development of post-quantum cryptography (PQC). Since 2017, the National Institute of Standards and Technology (NIST) has held two rounds of a competition aimed at developing post-quantum standards. The Supersingular Isogeny Key Encapsulation protocol (SIKE) was one of the 26 candidates that advanced to the second round. The advantage of SIKE is that its public and private keys are very short compared with those of other candidates and that it is highly compatible with conventional ECC protocols. SIKE was developed by wrapping the Supersingular Isogeny Diffie-Hellman (SIDH) key exchange protocol in a key encapsulation mechanism. SIDH was originally proposed in 2011. It resists quantum attacks because finding an isogeny between different supersingular curves is hard. In general, the large number of serial isogeny computations in the protocol causes long latency, which is the bottleneck in practical applications. Any method that accelerates SIDH can therefore be used directly to accelerate the SIKE protocol.
Many researchers have optimized the SIDH/SIKE protocol on software and hardware platforms. In 2011, Jao implemented SIDH using the GMP big-number library, which is considered the earliest implementation of SIDH. The later versions provided by C. Costello, P. Longa, et al. are generally considered the fastest software implementations to date and continually integrate the most advanced supersingular isogeny schemes. That work also combines optimization methods proposed in the open literature and provides implementations of SIDH on FPGA and ARM platforms. Decomposing these computations shows that modular multiplication is one of the basic operations of the scheme and a major concern in its design.
In the modular arithmetic, note that the smooth isogeny prime of a supersingular curve usually has the form $p = f \cdot a^x b^y \pm 1$, where a and b are small primes, x and y are positive integers, and f is a cofactor chosen so that p is prime. Because of this special structure of p, the modular operation can be made faster with some additional work. Karmakar published EFFM, an efficient modular multiplication algorithm for primes of the form $2 \cdot 2^x b^y - 1$, where x and y are even. An element of the field can then be expressed in the unconventional radix $R = 2^{x/2} b^{y/2}$, which halves the number of multiplications at the cost of a small number of extra additions. The FFM1 algorithm, which builds on this method, uses an additional mapping function to reduce the number of coefficients in EFFM from three to two, so that the precomputed constant parameter can be discarded without changing the complexity. The FFM2 algorithm extends the prime format to $p = f \cdot 2^x b^y \pm 1$ at the expense of more computation and is the most advanced algorithm so far.
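The unconventional-radix representation described above can be illustrated with a minimal Python sketch. The toy parameters (f = 2, x = y = 2, b = 3, hence R = 6 and p = 71) and the helper names `to_radix` and `from_radix` are assumptions chosen for readability; they do not come from the patent.

```python
# Toy parameters (illustrative assumption, not from the patent):
# p = f * 2^x * 3^y - 1 with f = 2, x = 2, y = 2.
f, x, y = 2, 2, 2
R = 2 ** (x // 2) * 3 ** (y // 2)   # unconventional radix R = 2^(x/2) * 3^(y/2) = 6
p = f * R * R - 1                   # 2 * 36 - 1 = 71

def to_radix(a, R):
    """Split a into (a2, a1, a0) such that a = a2*R^2 + a1*R + a0, 0 <= a1, a0 < R."""
    hi, a0 = divmod(a, R)
    a2, a1 = divmod(hi, R)
    return a2, a1, a0

def from_radix(a2, a1, a0, R):
    """Recombine the coefficients into the integer they represent."""
    return a2 * R * R + a1 * R + a0

# Because p = f*R^2 - 1, the radix satisfies f*R^2 = 1 (mod p).
print((f * R * R) % p)      # 1
print(to_radix(70, R))      # (1, 5, 4), since 36 + 30 + 4 = 70
```

For this prime shape every field element below p needs a top coefficient a2 of only 0 or 1, which is why a mapping that removes it (as in FFM1) is attractive.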
Disclosure of Invention
To address the above problems, and building on previous research, the invention provides a modular multiplication method based on a prime form with an unconventional radix. The method adopts an optimized Barrett reduction and is faster than the earlier Montgomery representation. The invention also provides a corresponding high-throughput modular multiplier architecture based on this method. The details of the invention are as follows:
A high-throughput modular multiplier architecture for isogeny-based post-quantum encryption schemes, characterized by the following main modules:
1) a multiplication module, which computes the products of the coefficients of the quadratic polynomial obtained by splitting the large operands;
2) a reduction module, which reduces the data through modulo and remainder operations;
3) a post-processing module, which post-processes the data to obtain the final output.
The multiplication module of the modular multiplier architecture is characterized in that its inputs are the coefficients of the quadratic polynomial obtained by splitting the large operands, and in that it is optimized with the Karatsuba method, which reduces the number of multipliers and the computational complexity.
The reduction module of the modular multiplier architecture is characterized in that it processes the data with an optimized Barrett reduction algorithm to obtain the reduced data. The module uses constant multipliers, which consume fewer resources than ordinary multipliers, and uses parallelization to shorten the critical path.
The post-processing module of the modular multiplier architecture is characterized in that its critical path is shortened by parallelizing the adders used in the computation and by precomputing the constant parameters.
The modular multiplication method of the modular multiplier architecture is characterized by five steps: input data processing, first-order Karatsuba calculation, optimized Barrett reduction, output data calculation, and output data post-processing.
First, input data processing. To compute the modular product of A and B with respect to a prime p, the algorithm uses smooth isogeny primes of the form $p = f \cdot 2^x b^y \pm 1$, where f is 1 or 2 and x and y are even. $R = 2^{x/2} b^{y/2}$ can therefore be used as an unconventional radix, so that each input is rewritten as a polynomial in R, $A = a_2 R^2 + a_1 R + a_0$ (f = 2) or $A = a_1 R + a_0$ (f = 1), and the coefficients $(a_2), a_1, a_0, (b_2), b_1, b_0$ serve as the inputs. For the version that supports multi-precision operation, the coefficients cannot all enter the arithmetic module at once, so a storage or cache unit must be added to hold the data. For the case f = 2, a mapping must additionally be applied after $a_2, a_1, a_0, b_2, b_1, b_0$ are input:
Figure BSA0000194122040000021
In this way the number of input coefficients is reduced from three to two, which effectively reduces the computational complexity.
Second, first-order Karatsuba calculation. The coefficient products $a_1 b_1$, $a_0 b_0$ and $a_1 b_0 + a_0 b_1$ are computed with the Karatsuba formula:
$a_i b_j + a_j b_i = (a_i + a_j)(b_i + b_j) - a_i b_i - a_j b_j$
This reduces the number of multipliers to three. In theory, applying the Karatsuba method recursively reduces the multiplication complexity further, but hardware resource consumption then grows rapidly, so a good trade-off must be found between the two.
Third, optimized Barrett reduction. The reduced data are obtained through modulo and remainder operations. Because constant multipliers consume fewer resources than ordinary multipliers, and the algorithm requires multiplications by constant parameters, separately designed constant multipliers are used here. In addition, for the multi-precision version, the algorithm formula requires that the output of step four of the previous iteration, after some shift operations, be added in before the reduction.
Fourth, output data calculation. The quotients and remainders obtained from the reduction in the previous step are combined according to the algorithm formula to obtain the preliminary output data. For the multi-precision version, these data must undergo some shift operations and then be added to the data entering the reduction in step three. A parallelization strategy is adopted here to further raise the clock frequency.
Fifth, output data post-processing. For the case f = 2, the number of coefficients must be changed back to three through the inverse mapping. The data obtained in step four are already correct but need further processing: because they may not satisfy the range constraints of the algorithm formula, carry handling is required, and some additions and subtractions are introduced so that the output meets the algorithm's constraints. Parallelizing the adders in this computation and precomputing the constant parameters improves the throughput.
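The first-order Karatsuba calculation in step two can be sketched as follows. This is a minimal illustration of the standard identity for two-coefficient operands; the function name `karatsuba_coeffs` is an assumption and not part of the patent.

```python
def karatsuba_coeffs(a1, a0, b1, b0):
    """Return (a1*b1, a1*b0 + a0*b1, a0*b0) using three multiplications instead of four."""
    low = a0 * b0                               # multiplier 1
    high = a1 * b1                              # multiplier 2
    cross = (a0 + a1) * (b0 + b1) - low - high  # multiplier 3, then two subtractions
    return high, cross, low

# Example: A = 3*R + 4, B = 5*R + 2 for any radix R.
print(karatsuba_coeffs(3, 4, 5, 2))   # (15, 26, 8)
```

The saving of one multiplier is paid for with two extra additions on the inputs and two subtractions on the outputs, which matches the trade-off between multiplier count and hardware resources noted in the text.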
The combination of the modular multiplier architecture and the modular multiplication method has the following beneficial effects:
First, the modular multiplier computes in an unconventional radix, which is faster than the traditional Montgomery representation; at best, the interval between successive outputs in the output stream is only about one clock cycle.
Second, the invention has high throughput and supports multi-precision calculation. With equivalent or slightly higher hardware resource consumption, the throughput of the multi-precision version is about 10 times that of the prior design. For the non-multi-precision version, hardware resource consumption increases, but the increase is small relative to the throughput gain; the throughput advantage is more pronounced, at about 60 times or more that of the prior design.
Third, several modules of the invention adopt a parallelization strategy, which raises the clock frequency.
Fourth, the mapping operation that may be used in input data processing and the Karatsuba calculation reduce the computational complexity.
Fifth, the constant multipliers and the ordinary multipliers are designed separately, which reduces resource consumption.
Drawings
FIG. 1 is a block diagram of a modular multiplier according to the present invention;
Detailed Description
The following description will further describe embodiments of the present invention with reference to the accompanying drawings. Firstly, the modular multiplier architecture is introduced, and secondly, the modular multiplication method applicable to the modular multiplier architecture is introduced. The embodiments described below by referring to the drawings are exemplary and intended to be illustrative of the present invention and are not to be construed as limiting the present invention.
The architecture of the modular multiplier of the present invention is first described.
Fig. 1 is a schematic diagram of the modular multiplier of the present invention, which comprises a multiplication module, a reduction module, and a post-processing module. The multiplication module computes the products of the coefficients of the quadratic polynomial obtained by splitting the large operands and contains only three ordinary multipliers. The reduction module performs the modulo and remainder operations with an optimized Barrett reduction algorithm to reduce the data. The post-processing module processes the data so that they satisfy the output constraints of the algorithm.
The specific operation is as follows. For the f = 2 version, the number of input coefficients is first reduced to two by the mapping operation. Then, following the Karatsuba method, the three pairs $(a_0, b_0)$, $(a_1, b_1)$ and $(a_0 + a_1, b_0 + b_1)$ are fed into the three ordinary multipliers of the multiplication module to obtain three products. The data are then sent to the reduction module, where the optimized Barrett reduction algorithm produces two remainders $(c_0, c_1)$ and two quotients $(q_0, q_1)$. The quotients and remainders are accumulated according to the algorithm formula to obtain a preliminary output result. For the multi-precision version, these data must undergo some shift operations and be added back to the data entering the reduction module. Finally, the preliminary output result is fed into the post-processing module, which applies some additions and subtractions to obtain a final output result that satisfies the algorithm's constraints.
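The data flow through the three modules can be mimicked in software. The sketch below is an illustration under stated assumptions, not the patented algorithm: it covers only the f = 1 case with p = R^2 + 1 (so that R^2 = -1 mod p), it uses Python's `divmod` in place of the optimized Barrett reduction, and the function name `modmul_radix` is hypothetical.

```python
def modmul_radix(a, b, R):
    """Compute a*b mod p for p = R*R + 1 using two-coefficient radix-R arithmetic."""
    p = R * R + 1
    # Input processing: A = a1*R + a0, B = b1*R + b0.
    a1, a0 = divmod(a, R)
    b1, b0 = divmod(b, R)
    # Multiplication module: three products (first-order Karatsuba).
    low = a0 * b0
    high = a1 * b1
    cross = (a0 + a1) * (b0 + b1) - low - high
    # Fold the R^2 term using R^2 = -1 (mod p).
    c0 = low - high
    c1 = cross
    # Reduction module: quotient and remainder of each coefficient by R.
    q0, r0 = divmod(c0, R)
    q1, r1 = divmod(c1 + q0, R)
    # Output calculation: q1*R^2 contributes -q1 (mod p).
    c = r1 * R + r0 - q1
    # Post-processing: a few additions or subtractions bring c into [0, p).
    while c < 0:
        c += p
    while c >= p:
        c -= p
    return c

print(modmul_radix(20, 30, 6) == (20 * 30) % 37)   # True
```

With R = 6 the modulus is p = 37. The final loops stand in for the adders, subtractors and selectors of the post-processing module.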
The following section describes the optimized modular multiplication method of the present invention, which is based on the modular multiplier architecture. The method comprises five steps: input data processing, first-order Karatsuba calculation, optimized Barrett reduction, output data calculation, and output data post-processing.
First, input data processing. To compute the modular product of A and B with respect to a prime p, the algorithm uses smooth isogeny primes of the form $p = f \cdot 2^x b^y \pm 1$, where f is 1 or 2 and x and y are even. $R = 2^{x/2} b^{y/2}$ can therefore be used as an unconventional radix, so that each input is rewritten as a polynomial in R, $A = a_2 R^2 + a_1 R + a_0$ (f = 2) or $A = a_1 R + a_0$ (f = 1), and the coefficients $(a_2), a_1, a_0, (b_2), b_1, b_0$ serve as the inputs. For the version that supports multi-precision operation, the coefficients cannot all enter the arithmetic module at once, so a storage or cache unit must be added to hold the data. For the case f = 2, a mapping must additionally be applied after $a_2, a_1, a_0, b_2, b_1, b_0$ are input:
Figure BSA0000194122040000051
In this way the number of input coefficients is reduced from three to two, which effectively reduces the computational complexity.
Second, first-order Karatsuba calculation. The coefficient products $a_1 b_1$, $a_0 b_0$ and $a_1 b_0 + a_0 b_1$ are computed with the Karatsuba formula:
$a_i b_j + a_j b_i = (a_i + a_j)(b_i + b_j) - a_i b_i - a_j b_j$
This reduces the number of multipliers to three. In theory, applying the Karatsuba method recursively reduces the multiplication complexity further, but hardware resource consumption then grows rapidly, so a good trade-off must be found between the two.
Third, optimized Barrett reduction. The reduced data are obtained through modulo and remainder operations. Because constant multipliers consume fewer resources than ordinary multipliers, and the algorithm requires multiplications by constant parameters, separately designed constant multipliers are used here. In addition, for the multi-precision version, the algorithm formula requires that the output of step four of the previous iteration, after some shift operations, be added in before the reduction.
Fourth, output data calculation. The quotients and remainders obtained from the reduction in the previous step are combined according to the algorithm formula to obtain the preliminary output data. For the multi-precision version, according to the formula $C^{(j)} = C^{(j+1)} \cdot 2^k + A_i B \bmod p$, these data must undergo some shift operations and then be added to the data entering the reduction in step three. A parallelization strategy is adopted here to further raise the clock frequency.
Fifth, output data post-processing. For the case f = 2, the number of coefficients must be changed back to three through the inverse mapping. The data obtained in step four are already correct but need further processing: because they may not satisfy the range constraints of the algorithm formula, carry handling is required, and some additions and subtractions are introduced so that the output meets the algorithm's constraints. Parallelizing the adders in this computation and precomputing the constant parameters improves the throughput.
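The multi-precision recurrence in step four, $C^{(j)} = C^{(j+1)} \cdot 2^k + A_i B \bmod p$, has the shape of a word-serial (interleaved) modular multiplication. The sketch below is one plausible reading of it, under the assumptions that A is split into k-bit words, that the words are consumed from most significant to least significant, and that a plain `% p` stands in for the hardware reduction. The function name `interleaved_modmul` is hypothetical.

```python
def interleaved_modmul(a, b, p, k):
    """Compute a*b mod p by processing a in k-bit words, most significant word first."""
    n_words = max(1, -(-a.bit_length() // k))   # ceil(bit_length / k), at least one word
    mask = (1 << k) - 1
    c = 0
    for j in reversed(range(n_words)):
        a_j = (a >> (j * k)) & mask             # j-th k-bit word of a
        c = ((c << k) + a_j * b) % p            # shift the previous result, add the new partial product
    return c

print(interleaved_modmul(70, 65, 71, 3) == (70 * 65) % 71)   # True
```

Each iteration needs only a k-bit by full-width multiplication, which is why the multi-precision version can trade latency (several iterations per result) for a smaller multiplier array.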
Next, the modular multiplication method is explained together with the hardware architecture, with a detailed account of how the data flow through the hardware:
First, input data processing. The coefficients $(a_2), a_1, a_0, (b_2), b_1, b_0$ serve as the inputs. For the version that supports multi-precision operation, the coefficients cannot all enter the arithmetic module at once, so a storage or cache unit must be added to hold the data. For the case f = 2, a mapping operation is applied after $a_2, a_1, a_0, b_2, b_1, b_0$ are input, which changes the input coefficients to $a_1, a_0, b_1, b_0$; that is, the coefficients pass through an inverter, an adder and a selector, which reduces the number of input coefficients.
Second, first-order Karatsuba calculation. The input coefficients from the first step pass through two adders, giving six values grouped into the three pairs $(a_0, b_0)$, $(a_1, b_1)$ and $(a_0 + a_1, b_0 + b_1)$. These are fed into the three ordinary multipliers of this step, which compute the three products $a_0 b_0$, $a_1 b_1$ and $(a_0 + a_1)(b_0 + b_1)$. A series of subtractions then yields $(a_0 + a_1)(b_0 + b_1) - a_0 b_0 - a_1 b_1$ and the other required terms. For the multi-precision version, the shifted output of step four of the previous iteration is added in before step three.
Third, optimized Barrett reduction. The quotients and remainders are obtained through modulo and remainder operations; in hardware, this amounts to selecting the high-order or low-order bits of the data. The multiplications by constant parameters use two constant multipliers to reduce resource consumption. In addition, for the multi-precision version, the algorithm formula requires that the output of step four of the previous iteration, after some shift operations, be added in before step three.
Fourth, output data calculation. The quotients and remainders computed in step three pass through several adders, subtractors and selectors, according to the algorithm formula, to produce the preliminary output data. For the multi-precision version, according to the formula $C^{(j)} = C^{(j+1)} \cdot 2^k + A_i B \bmod p$, these data must undergo some shift operations and then be added to the data entering the reduction in step three.
Fifth, output data post-processing. For the case f = 2, the number of coefficients must be changed back to three through the inverse mapping. The inverse mapping is similar to the mapping operation, except that one additional XOR operation serves as the selection signal and one additional inverter and one additional selector are used to generate the extra coefficient output. The subsequent post-processing passes through a series of adders and subtractors, and a selector then produces the final data output.
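Step three describes the reduction only as an optimized Barrett reduction built from constant multipliers and high/low bit selection. The following sketch shows a textbook Barrett-style division by a fixed radix R, which uses the same ingredients. It is not the patent's optimized variant, and the names `make_barrett_divider`, `mu` and `max_bits` are assumptions introduced for illustration.

```python
def make_barrett_divider(R, max_bits):
    """Build a function computing (x // R, x % R) for 0 <= x < 2**max_bits."""
    k = max_bits
    mu = (1 << k) // R          # constant computed once, ahead of time

    def divide(x):
        q = (x * mu) >> k       # constant multiplication, then keep the high-order bits
        r = x - q * R           # second constant multiplication and a subtraction
        if r >= R:              # the estimate is low by at most one in this input range
            q += 1
            r -= R
        return q, r

    return divide

divide_by_6 = make_barrett_divider(6, 8)
print(divide_by_6(200))   # (33, 2)
```

Because `mu` and `R` are fixed, both multiplications are by constants, which is the situation in which dedicated constant multipliers are cheaper than general ones.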
Implementation example: a specific hardware implementation of the invention was carried out for the modular multiplier corresponding to the prime $p = 2 \cdot 2^{386} 3^{242} - 1$, whose security level is p771. The implementation platform is Xilinx Vivado 2016.4, targeting the Kintex-7 based xc7k325tffg900-2 development board and the Virtex-7 based xc7vx690tffg1157-3 development board.
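As a quick sanity check on these parameters, the short sketch below builds the prime and its radix with Python integers and confirms the sizes. The variable names are illustrative assumptions, and primality itself is not tested here.

```python
f, x, y = 2, 386, 242
R = 2 ** (x // 2) * 3 ** (y // 2)    # unconventional radix R = 2^193 * 3^121
p = f * 2 ** x * 3 ** y - 1          # p = 2 * 2^386 * 3^242 - 1

print(p.bit_length())       # 771, matching the name p771
print(R.bit_length())       # 385, so each coefficient is about half the width of p
print(f * R * R == p + 1)   # True: f * R^2 = 1 (mod p)
```

The coefficient multipliers therefore operate on roughly 385-bit operands rather than 771-bit ones.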
The synthesis results and resource utilization for the multi-precision version are shown in the following table:
Table 1. Synthesis results on the Kintex-7 xc7k325tffg900-2 development board
Algorithm            FFM1     FFM2     Multi-precision version
FFs                  9675     11635    12902
LUTs                 16627    33051    25743
DSPs                 122      529      210
fclk (MHz)           55       25       57
Time (ns)            1164     1120     122
Throughput (Mb/s)    663      688      6278
As the table shows, with equivalent or slightly higher hardware resource consumption, the throughput of the multi-precision version is about 10 times that of the previous designs.
The implementation results of the non-multi-precision version are shown in the following table:
Table 2. Synthesis results on the Virtex-7 xc7vx690tffg1157-3 development board
Algorithm            Multi-precision version    Non-multi-precision version
FFs                  12902                      38976
LUTs                 25743                      63173
DSPs                 210                        729
fclk (MHz)           56                         60
Time (ns)            124                        17
Throughput (Mb/s)    6168                       46260
It can be seen that, although the non-multi-precision version consumes more hardware resources, the increase is small relative to the throughput gain. Its throughput advantage over the previous FFM2 design is more pronounced, at more than about 60 times that of FFM2.
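The stated ratios can be rechecked directly from the two tables. The sketch below only redoes the arithmetic on the published figures; the reading that the non-multi-precision design delivers one 771-bit result per clock cycle is an inference from the numbers, not a statement in the patent.

```python
BITS = 771   # operand width for p771

# Throughput figures in Mb/s, copied from Table 1 and Table 2.
ffm2 = 688
multi_precision = 6278
non_multi_precision = 46260

speedup_multi = multi_precision / ffm2          # roughly 9.1
speedup_single = non_multi_precision / ffm2     # roughly 67.2

# One 771-bit result per cycle at 60 MHz gives 771 * 60 Mb/s.
one_result_per_cycle = BITS * 60

print(round(speedup_multi, 1), round(speedup_single, 1))
print(one_result_per_cycle == non_multi_precision)   # True
```

This is consistent with the claims of about 10 times and more than 60 times the FFM2 throughput, and with the earlier statement that outputs can be spaced about one clock cycle apart.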
With the modular multiplier architecture and modular multiplication algorithm of this embodiment, the throughput of the modular multiplier can be improved to the greatest possible extent. The multiplication module, the reduction module and the post-processing module of the architecture are described mainly in functional terms, and there are many methods and ways to realize the function of each part. It should be noted that those skilled in the art can make various improvements and modifications without departing from the principle of the present invention, and such improvements and modifications should be regarded as falling within the protection scope of the present invention. Components not specified in this embodiment can be implemented with the prior art.

Claims (5)

1. A high-throughput modular multiplier architecture for isogeny-based post-quantum encryption schemes, characterized by the following main modules:
1) a multiplication module, which computes the products of the coefficients of the quadratic polynomial obtained by splitting the large operands;
2) a reduction module, which reduces the data through modulo and remainder operations;
3) a post-processing module, which post-processes the data to obtain the final output.
2. The multiplication module of the modular multiplier architecture according to claim 1, wherein the inputs are the coefficients of the quadratic polynomial obtained by splitting the large operands, and the module is optimized with the Karatsuba method, which reduces the number of multipliers and the computational complexity.
3. The reduction module of the modular multiplier architecture according to claim 1, wherein the data are processed with an optimized Barrett reduction algorithm to obtain the reduced data, constant multipliers that consume fewer resources than ordinary multipliers are used in the module, and parallelization is used to shorten the critical path.
4. The post-processing module of the modular multiplier architecture according to claim 1, wherein the critical path is shortened and the throughput is improved by parallelizing the adders used in the computation and precomputing the constant parameters.
5. A modular multiplication method based on the modular multiplier architecture of claim 1, comprising five steps: input data processing, first-order Karatsuba calculation, optimized Barrett reduction, output data calculation, and output data post-processing:
First, input data processing. To compute the modular product of A and B with respect to a prime p, the algorithm uses smooth isogeny primes of the form $p = f \cdot 2^x b^y \pm 1$, where f is 1 or 2 and x and y are even. $R = 2^{x/2} b^{y/2}$ can therefore be used as an unconventional radix, so that each input is rewritten as a polynomial in R, $A = a_2 R^2 + a_1 R + a_0$ (f = 2) or $A = a_1 R + a_0$ (f = 1), and the coefficients $(a_2), a_1, a_0, (b_2), b_1, b_0$ serve as the inputs. For the version that supports multi-precision operation, the coefficients cannot all enter the arithmetic module at once, so a storage or cache unit must be added to hold the data. For the case f = 2, a mapping must additionally be applied after $a_2, a_1, a_0, b_2, b_1, b_0$ are input:
Figure FSA0000194122030000011
In this way the number of input coefficients is reduced from three to two, which effectively reduces the computational complexity.
Second, first-order Karatsuba calculation. The coefficient products $a_1 b_1$, $a_0 b_0$ and $a_1 b_0 + a_0 b_1$ are computed with the Karatsuba formula:
$a_i b_j + a_j b_i = (a_i + a_j)(b_i + b_j) - a_i b_i - a_j b_j$
This reduces the number of multipliers to three. In theory, applying the Karatsuba method recursively reduces the multiplication complexity further, but hardware resource consumption then grows rapidly, so a good trade-off must be found between the two.
Third, optimized Barrett reduction. The reduced data are obtained through modulo and remainder operations. Because constant multipliers consume fewer resources than ordinary multipliers, and the algorithm requires multiplications by constant parameters, separately designed constant multipliers are used here. In addition, for the multi-precision version, the algorithm formula requires that the output of step four of the previous iteration, after some shift operations, be added in before the reduction.
Fourth, output data calculation. The quotients and remainders obtained from the reduction in the previous step are combined according to the algorithm formula to obtain the preliminary output data. For the multi-precision version, these data must undergo some shift operations and then be added to the data entering the reduction in step three. A parallelization strategy is adopted here to further raise the clock frequency.
Fifth, output data post-processing. For the case f = 2, the number of coefficients must be changed back to three through the inverse mapping. The data obtained in step four are already correct but need further processing: because they may not satisfy the range constraints of the algorithm formula, carry handling is required, and some additions and subtractions are introduced so that the output meets the algorithm's constraints. Parallelizing the adders in this computation and precomputing the constant parameters improves the throughput.
CN201911073701.7A 2019-11-04 2019-11-04 High-speed modular multiplier based on post-quantum cryptography of homologus curve and modular multiplication method thereof Pending CN110908635A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911073701.7A CN110908635A (en) 2019-11-04 2019-11-04 High-speed modular multiplier based on post-quantum cryptography of homologus curve and modular multiplication method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911073701.7A CN110908635A (en) 2019-11-04 2019-11-04 High-speed modular multiplier based on post-quantum cryptography of homologus curve and modular multiplication method thereof

Publications (1)

Publication Number Publication Date
CN110908635A true CN110908635A (en) 2020-03-24

Family

ID=69814742

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911073701.7A Pending CN110908635A (en) 2019-11-04 2019-11-04 High-speed modular multiplier based on post-quantum cryptography of homologus curve and modular multiplication method thereof

Country Status (1)

Country Link
CN (1) CN110908635A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112286490A (en) * 2020-11-11 2021-01-29 南京大学 Hardware architecture and method for loop iteration multiply-add operation
CN112685003A (en) * 2021-01-05 2021-04-20 南京大学 Arithmetic device for obtaining modular multiplication result of homologous password
CN113179151A (en) * 2021-03-24 2021-07-27 中国科学院信息工程研究所 Universal software implementation method for middle-up rounding learning in post-quantum cryptography construction
CN113467754A (en) * 2021-07-20 2021-10-01 南京大学 Lattice encryption modular multiplication operation method and framework based on decomposition reduction
CN116540977A (en) * 2023-07-05 2023-08-04 北京瑞莱智慧科技有限公司 Modulo multiplier circuit, FPGA circuit and ASIC module
CN117134917A (en) * 2023-08-09 2023-11-28 北京融数联智科技有限公司 Rapid modular operation method and device for elliptic curve encryption
WO2023226173A1 (en) * 2022-05-24 2023-11-30 上海阵方科技有限公司 Modular multiplication operation method based on number-theoretic transform prime
WO2023246063A1 (en) * 2022-06-24 2023-12-28 上海途擎微电子有限公司 Modular multiplier, security chip, electronic device and encryption method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090100120A1 (en) * 2007-10-11 2009-04-16 Samsung Electronics Co., Ltd. Modular multiplication method, modular multiplier and cryptosystem having the same
CN102207847A (en) * 2011-05-06 2011-10-05 广州杰赛科技股份有限公司 Data encryption and decryption processing method and device based on Montgomery modular multiplication operation
CN105068784A (en) * 2015-07-16 2015-11-18 清华大学 Montgomery modular multiplication based Tate pairing algorithm and hardware structure therefor
CN110351087A (en) * 2019-09-06 2019-10-18 南京秉速科技有限公司 The montgomery modulo multiplication operation method and computing device of pipeline-type
CN113141255A (en) * 2020-01-17 2021-07-20 意法半导体股份有限公司 Method for performing cryptographic operations on data in a processing device, corresponding processing device and computer program product

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
C. PREMA et al.: "Enhanced high speed modular multiplier using karatsuba algorithm", 2013 INTERNATIONAL CONFERENCE ON COMPUTER COMMUNICATION AND INFORMATICS *
XINKAI YAN et al.: "An Implementation of Montgomery Modular Multiplication on FPGAs", 2013 INTERNATIONAL CONFERENCE ON INFORMATION SCIENCE AND CLOUD COMPUTING *

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112286490A (en) * 2020-11-11 2021-01-29 南京大学 Hardware architecture and method for loop iteration multiply-add operation
CN112286490B (en) * 2020-11-11 2024-04-02 南京大学 Hardware architecture and method for loop iteration multiply-add operation
CN112685003A (en) * 2021-01-05 2021-04-20 南京大学 Arithmetic device for obtaining modular multiplication result of homologous password
CN112685003B (en) * 2021-01-05 2024-05-28 南京大学 Arithmetic device for obtaining modular multiplication result of homologous password
CN113179151A (en) * 2021-03-24 2021-07-27 中国科学院信息工程研究所 Universal software implementation method for middle-up rounding learning in post-quantum cryptography construction
CN113179151B (en) * 2021-03-24 2022-08-16 中国科学院信息工程研究所 Universal software implementation method for middle-up rounding learning in post-quantum cryptography construction
CN113467754A (en) * 2021-07-20 2021-10-01 南京大学 Lattice encryption modular multiplication operation method and framework based on decomposition reduction
CN113467754B (en) * 2021-07-20 2023-10-13 南京大学 Lattice encryption modular multiplication operation device based on decomposition reduction
WO2023226173A1 (en) * 2022-05-24 2023-11-30 上海阵方科技有限公司 Modular multiplication operation method based on number-theoretic transform prime
WO2023246063A1 (en) * 2022-06-24 2023-12-28 上海途擎微电子有限公司 Modular multiplier, security chip, electronic device and encryption method
CN116540977B (en) * 2023-07-05 2023-09-12 北京瑞莱智慧科技有限公司 Modulo multiplier circuit, FPGA circuit and ASIC module
CN116540977A (en) * 2023-07-05 2023-08-04 北京瑞莱智慧科技有限公司 Modulo multiplier circuit, FPGA circuit and ASIC module
CN117134917A (en) * 2023-08-09 2023-11-28 北京融数联智科技有限公司 Rapid modular operation method and device for elliptic curve encryption
CN117134917B (en) * 2023-08-09 2024-04-26 北京融数联智科技有限公司 Rapid modular operation method and device for elliptic curve encryption

Similar Documents

Publication Publication Date Title
CN110908635A (en) High-speed modular multiplier based on post-quantum cryptography of homologus curve and modular multiplication method thereof
Güneysu et al. Ultra high performance ECC over NIST primes on commercial FPGAs
Schinianakis et al. An RNS implementation of an $F_p$ elliptic curve point multiplier
Ding et al. High-speed ECC processor over NIST prime fields applied with Toom–Cook multiplication
Azarderakhsh et al. Parallel and high-speed computations of elliptic curve cryptography using hybrid-double multipliers
Chow et al. A Karatsuba-based Montgomery multiplier
Asif et al. High‐throughput multi‐key elliptic curve cryptosystem based on residue number system
Shah et al. A high‐speed RSD‐based flexible ECC processor for arbitrary curves over general prime field
Kudithi An efficient hardware implementation of the elliptic curve cryptographic processor over prime field
Tian et al. Ultra-fast modular multiplication implementation for isogeny-based post-quantum cryptography
Niasar et al. Optimized architectures for elliptic curve cryptography over Curve448
Tian et al. Fast modular multipliers for supersingular isogeny-based post-quantum cryptography
Awaludin et al. A high-performance ecc processor over curve448 based on a novel variant of the karatsuba formula for asymmetric digit multiplier
Hossain et al. Efficient fpga implementation of modular arithmetic for elliptic curve cryptography
Hossain et al. FPGA-based efficient modular multiplication for Elliptic Curve Cryptography
Langhammer et al. Efficient FPGA modular multiplication implementation
McIvor et al. High-radix systolic modular multiplication on reconfigurable hardware
Kudithi et al. Radix-4 interleaved modular multiplication for cryptographic applications
Parhami On equivalences and fair comparisons among residue number systems with special moduli
Koppermann et al. Automatic generation of high-performance modular multipliers for arbitrary mersenne primes on FPGAs
Zhu et al. Low-latency architecture for the parallel extended GCD algorithm of large numbers
CN111064567B (en) Rapid modular multiplication method for SIDH special domain
Kolagatla et al. Area-time scalable high radix Montgomery modular multiplier for large modulus
Verma et al. FPGA implementation of RSA based on carry save Montgomery modular multiplication
Nguyen et al. An efficient hardware implementation of radix-16 Montgomery multiplication

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200324