CN114840174A - System and method for rapidly realizing Montgomery modular multiplication by using multiple multipliers - Google Patents
- Publication number
- CN114840174A (application CN202210565348.XA)
- Authority
- CN
- China
- Prior art keywords
- data
- module
- bit
- multipliers
- asymmetric
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
- G06F7/60—Methods or arrangements for performing computations using a digital non-denominational number representation, i.e. number representation without radix; Computing devices using combinations of denominational and non-denominational quantity representations, e.g. using difunction pulse trains, STEELE computers, phase computers
- G06F7/72—Methods or arrangements for performing computations using a digital non-denominational number representation, i.e. number representation without radix; Computing devices using combinations of denominational and non-denominational quantity representations, e.g. using difunction pulse trains, STEELE computers, phase computers using residue arithmetic
- G06F7/722—Modular multiplication
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- Computational Mathematics (AREA)
- Mathematical Analysis (AREA)
- Mathematical Optimization (AREA)
- Pure & Applied Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- General Engineering & Computer Science (AREA)
- Complex Calculations (AREA)
Abstract
The invention provides a system and a method for rapidly implementing Montgomery modular multiplication using multiple multipliers, and relates to the technical field of high-performance algorithms for security chips. The system performs combined operations based on existing point addition, point doubling, modular exponentiation, modular inversion, modular subtraction, modular addition and modular multiplication modules. It restructures the loop iteration of the original Montgomery modular multiplication formula so that several 64-bit multipliers operate in parallel, which greatly increases the speed at which an asymmetric algorithm chip performs signing, signature verification, encryption, decryption and key generation, and improves the performance of the security chip.
Description
Technical Field
The invention relates to the technical field of security algorithms, in particular to a system and a method for quickly realizing Montgomery modular multiplication by using multiple multipliers.
Background
At present, asymmetric cryptographic chips mostly use elliptic curves. Elliptic curve public-key cryptography relies on the following curve properties: 1. The points of an elliptic curve over a finite field form a finite abelian group under point addition, and the order of this group is close to the size of the base field. 2. Analogous to exponentiation in the multiplicative group of a finite field, the multiple-point (scalar multiplication) operation on an elliptic curve constitutes a one-way function.
In the multiple-point operation, the problem of recovering the multiplier from a known multiple point and base point is called the elliptic curve discrete logarithm problem. For the discrete logarithm problem on a general elliptic curve, only solving methods of exponential computational complexity are known at present. Compared with the integer factorization problem and the discrete logarithm problem in a finite field, the elliptic curve discrete logarithm problem is much harder to solve.
Elliptic curve public-key cryptography is built from the curve operations of point multiplication, point doubling, point addition and modular exponentiation, which can ultimately be decomposed into modular multiplication, modular addition and modular subtraction.
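As a small illustration of this decomposition, the following Python sketch performs affine point addition and point doubling using only modular multiplication, addition, subtraction and inversion. The toy curve y^2 = x^3 + 2x + 2 over the 17-element field, and all names in the code, are assumptions chosen for readability; they do not come from the patent, and a real chip would use a standardized curve with operands of 256 bits or more.

```python
# Toy curve parameters (assumed for illustration only):
# y^2 = x^3 + CURVE_A * x + CURVE_B over the prime field F_FIELD_P.
FIELD_P, CURVE_A, CURVE_B = 17, 2, 2

def mod_inv(x):
    """Modular inversion in F_FIELD_P (three-argument pow, Python 3.8+)."""
    return pow(x % FIELD_P, -1, FIELD_P)

def point_add(p1, p2):
    """Affine addition of two finite points; equal inputs trigger doubling.

    The point at infinity and the case p1 == -p2 are not handled here.
    """
    (x1, y1), (x2, y2) = p1, p2
    if p1 == p2:
        # Doubling slope: (3*x1^2 + a) / (2*y1)
        lam = (3 * x1 * x1 + CURVE_A) * mod_inv(2 * y1) % FIELD_P
    else:
        # Addition slope: (y2 - y1) / (x2 - x1)
        lam = (y2 - y1) * mod_inv(x2 - x1) % FIELD_P
    x3 = (lam * lam - x1 - x2) % FIELD_P
    y3 = (lam * (x1 - x3) - y1) % FIELD_P
    return x3, y3
```

Every arithmetic step in `point_add` is a modular multiplication, addition, subtraction or inversion, which is why the speed of the modular multiplier dominates the curve operations.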
In the prior art, large-number modular multiplication is mainly implemented with Montgomery modular multiplication. Because the Montgomery modular multiplication formula is evaluated by loop iteration, the achievable modular multiplication speed is limited by the formula. As a result, an asymmetric algorithm chip that uses elliptic curve computation performs signing, signature verification, encryption, decryption and key generation only tens to hundreds of times per second, and this becomes the bottleneck of asymmetric encryption chips.
Disclosure of Invention
The present invention is directed to a system and method for fast implementing Montgomery modular multiplication using multiple multipliers, so as to solve the foregoing problems in the prior art.
In order to achieve the purpose, the technical scheme adopted by the invention is as follows:
A system for rapidly implementing Montgomery modular multiplication using multiple multipliers comprises an asymmetric algorithm chip and a host computer. The asymmetric algorithm chip comprises a processor, an asymmetric hardware module and a random data module, all of which are connected to a bus. The asymmetric hardware module comprises a register, a RAM and an algorithm module, and the algorithm module comprises a point addition module, a point doubling module, a modular exponentiation module, a modular inversion module, a modular subtraction module, a modular addition module and a modular multiplication module. The processor writes the data and parameters to be operated on into the RAM of the asymmetric hardware module through the bus, and writes the operation mode into the register of the asymmetric hardware module. After the asymmetric hardware module detects the enable bit of the register, it calls the operation module corresponding to the operation mode, writes the result data into the RAM when the operation finishes, and at the same time sets an end flag in the register and generates an interrupt. The processor receives the interrupt or polls the end flag in the register of the asymmetric hardware module, reads the result data from the RAM through the bus, and outputs the result data to the host computer.
Preferably, the processor is a microprocessor.
Another object of the present invention is to provide a method for rapidly implementing Montgomery modular multiplication using multiple multipliers, based on the system described above, comprising the following steps:
the processor acquires the data and parameters to be operated on, writes them into the RAM of the asymmetric hardware module, and writes the operation mode into the register of the asymmetric hardware module;
after the asymmetric hardware module detects the enable bit of the register, it calls the operation module corresponding to the determined operation mode to operate on the data and parameters;
when the operation finishes, the result data is written into the RAM, and at the same time an end flag is set in the register and an interrupt is generated; the processor receives the interrupt or polls the end flag in the register of the asymmetric hardware module, reads the result data from the RAM through the bus, and uploads the result data to the host computer.
Preferably, calling the corresponding operation module to operate on the data and parameters according to the determined operation mode specifically comprises the following steps:
S1, determine the data length W of the modular multiplication and the modulus M of that length, where M[0], M[1], M[2], …, M[e] are the 64-bit words of M from low to high; A and B are the operands of that length, where A[0], A[1], A[2], …, A[e] are the 64-bit words of A from low to high and B[0], B[1], B[2], …, B[e] are the 64-bit words of B from low to high; define Md, T1, T2, u, W, V, W', V' as intermediate result registers;
in step S1, e = W/64; the data length W may be, but is not limited to, 256 bits, 512 bits or 1024 bits.
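The word grouping of step S1 can be sketched in Python as follows. The helper names `to_words` and `from_words` are hypothetical. Note one assumption: the text gives e = W/64 while also indexing words from 0 to e; the sketch produces W/64 words indexed from 0 to W/64 - 1, which matches the 256-bit case where the highest word is A[3].

```python
WORD_BITS = 64
MASK = (1 << WORD_BITS) - 1

def to_words(x, w_bits):
    """Split a non-negative integer of at most w_bits bits into 64-bit words, low word first."""
    if w_bits % WORD_BITS or x < 0 or x >> w_bits:
        raise ValueError("x must fit in w_bits, and w_bits must be a multiple of 64")
    return [(x >> (WORD_BITS * i)) & MASK for i in range(w_bits // WORD_BITS)]

def from_words(words):
    """Reassemble an integer from 64-bit words given low word first."""
    return sum(w << (WORD_BITS * i) for i, w in enumerate(words))
```

For example, `to_words(2**64 + 5, 256)` yields the four words `[5, 1, 0, 0]`, and `from_words` inverts the split.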
S2, first perform the first-cycle operation with one 64-bit multiplier to compute the pre-computed value $Md$, where $Md = B[0] \times Mc$, $Mc = -\{M1, M0\}^{-1} \bmod P$, and $P = 2^{64}$;
S3, perform the second-cycle operation with two 64-bit multipliers and one adder, computing in parallel the first-round parameter $u_1$;
the parameter $u_1$ is calculated as: $u_1 = T1 + T2$, $T1 = r0 \times Mc$, $T2 = A[0] \times Md$, $r0 = r \bmod W$, where W here denotes $2^{64}$ and r is the data being reduced modulo $2^{64}$;
S4, perform the third-cycle operation with eight 64-bit multipliers, computing in parallel the intermediate result $r0_1$ from the lowest 64-bit word $A[0]$ of operand A, all words $B[0], B[1], B[2], \dots, B[e]$ of B, and the first-round parameter $u_1$;
$r0_1 = \{(W'[0]+W'[1]) : (V'[0]+V'[1]) : (W[0]+W[1]) : (V[0]+V[1])\}$, where $V[0] = A[0] \times B[0]$, $V[1] = u_1 \times M[0]$, $W[0] = A[0] \times B[1]$, $W[1] = u_1 \times M[1]$, $V'[0] = A[0] \times B[2]$, $V'[1] = u_1 \times M[2]$, $W'[0] = A[0] \times B[3]$, $W'[1] = u_1 \times M[3]$.
S5, perform the fourth-cycle operation with two 64-bit multipliers and one adder, computing in parallel the second-round parameter $u_2$; $u_2 = T1 + T2$, $T1 = r0_1 \times Mc$, $T2 = A[0] \times Md$;
S6, perform the fifth-cycle operation with eight 64-bit multipliers, computing in parallel the intermediate result $r0_2$ from the second-lowest 64-bit word $A[1]$ of operand A, all words $B[0], B[1], B[2], \dots, B[e]$ of B, and the second-round parameter $u_2$;
$r0_2 = \{(W'[0]+W'[1]) : (V'[0]+V'[1]) : (W[0]+W[1]) : (V[0]+V[1])\}$, where $V[0] = A[1] \times B[0]$, $V[1] = u_2 \times M[0]$, $W[0] = A[1] \times B[1]$, $W[1] = u_2 \times M[1]$, $V'[0] = A[1] \times B[2]$, $V'[1] = u_2 \times M[2]$, $W'[0] = A[1] \times B[3]$, $W'[1] = u_2 \times M[3]$.
S7, perform the sixth-cycle operation with two 64-bit multipliers and one adder, computing in parallel the third-round parameter $u_3$; $u_3 = T1 + T2$, $T1 = r0_2 \times Mc$, $T2 = A[0] \times Md$;
S8, perform the seventh-cycle operation with eight 64-bit multipliers, computing in parallel the intermediate result $r0_3$ from the second-highest 64-bit word $A[2]$ of operand A, all words $B[0], B[1], B[2], \dots, B[e]$ of B, and the third-round parameter $u_3$;
$r0_3 = \{(W'[0]+W'[1]) : (V'[0]+V'[1]) : (W[0]+W[1]) : (V[0]+V[1])\}$, where $V[0] = A[2] \times B[0]$, $V[1] = u_3 \times M[0]$, $W[0] = A[2] \times B[1]$, $W[1] = u_3 \times M[1]$, $V'[0] = A[2] \times B[2]$, $V'[1] = u_3 \times M[2]$, $W'[0] = A[2] \times B[3]$, $W'[1] = u_3 \times M[3]$.
S9, repeat steps S7-S8 until the fourth-round parameter $u_4$ has been computed in parallel;
S10, compute in parallel the final result $r0_4$ from the highest 64-bit word $A[3]$ of operand A, all words $B[0], B[1], B[2], \dots, B[e]$ of B, and the fourth-round parameter $u_4$.
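The reason the round parameter can be formed from two independent products follows from the standard word-serial Montgomery identity $u = (r0 + A[i] \times B[0]) \times Mc \bmod 2^{64} = (r0 \times Mc + A[i] \times Md) \bmod 2^{64}$ with $Md = B[0] \times Mc$. The Python sketch below checks that identity. It rests on two assumptions that are not stated in the text: all sums and products are reduced modulo $2^{64}$, and T2 uses the operand word A[i] of the current round (the text writes A[0] in every round). Only the low word of M affects Mc, because the inverse is taken modulo $2^{64}$. Function names are hypothetical.

```python
P = 1 << 64  # word modulus, 2^64

def precompute(m0, b0):
    """Return (Mc, Md): Mc = -M^{-1} mod 2^64 (needs odd m0), Md = B[0] * Mc mod 2^64."""
    mc = (-pow(m0, -1, P)) % P
    md = (b0 * mc) % P
    return mc, md

def u_parallel(r0, a_i, mc, md):
    """Round parameter from two independent products T1 and T2 plus one addition."""
    t1 = (r0 * mc) % P    # first multiplier
    t2 = (a_i * md) % P   # second multiplier (a_i is an assumption, see lead-in)
    return (t1 + t2) % P

def u_serial(r0, a_i, b0, mc):
    """Textbook form: the product a_i * b0 must finish before the multiply by Mc."""
    return ((r0 + a_i * b0) * mc) % P
```

With u chosen this way, `r0 + a_i * b0 + u * m0` is a multiple of $2^{64}$, so the low word of the accumulated sum can be discarded exactly.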
Preferably, W in step S1 is 256 bits, and the pre-computed value in step S2 is $Md = B[0] \times Mc$, where $Mc = -\{M1, M0\}^{-1} \bmod P$ and $P = 2^{64}$;
the first-round parameter $u_1$ in step S3 is calculated as: $u_1 = T1 + T2$, $T1 = r0 \times Mc$, $T2 = A[0] \times Md$, $r0 = r \bmod W$, where W here denotes $2^{64}$ and r is the data being reduced modulo $2^{64}$;
$r0_1$ in step S4 is calculated as: $r0_1 = \{(W'[0]+W'[1]) : (V'[0]+V'[1]) : (W[0]+W[1]) : (V[0]+V[1])\}$, where $V[0] = A[0] \times B[0]$, $V[1] = u_1 \times M[0]$, $W[0] = A[0] \times B[1]$, $W[1] = u_1 \times M[1]$, $V'[0] = A[0] \times B[2]$, $V'[1] = u_1 \times M[2]$, $W'[0] = A[0] \times B[3]$, $W'[1] = u_1 \times M[3]$.
Preferably, in step S5: $u_2 = T1 + T2$, $T1 = r0_1 \times Mc$, $T2 = A[0] \times Md$;
$r0_2$ in step S6 is calculated as: $r0_2 = \{(W'[0]+W'[1]) : (V'[0]+V'[1]) : (W[0]+W[1]) : (V[0]+V[1])\}$, where $V[0] = A[1] \times B[0]$, $V[1] = u_2 \times M[0]$, $W[0] = A[1] \times B[1]$, $W[1] = u_2 \times M[1]$, $V'[0] = A[1] \times B[2]$, $V'[1] = u_2 \times M[2]$, $W'[0] = A[1] \times B[3]$, $W'[1] = u_2 \times M[3]$.
Preferably, in step S7: $u_3 = T1 + T2$, $T1 = r0_2 \times Mc$, $T2 = A[0] \times Md$;
$r0_3$ in step S8 is calculated as: $r0_3 = \{(W'[0]+W'[1]) : (V'[0]+V'[1]) : (W[0]+W[1]) : (V[0]+V[1])\}$, where $V[0] = A[2] \times B[0]$, $V[1] = u_3 \times M[0]$, $W[0] = A[2] \times B[1]$, $W[1] = u_3 \times M[1]$, $V'[0] = A[2] \times B[2]$, $V'[1] = u_3 \times M[2]$, $W'[0] = A[2] \times B[3]$, $W'[1] = u_3 \times M[3]$.
Preferably, in step S9: $u_4 = T1 + T2$, $T1 = r0_3 \times Mc$, $T2 = A[0] \times Md$;
$r0_4$ in step S10 is calculated as: $r0_4 = \{(W'[0]+W'[1]) : (V'[0]+V'[1]) : (W[0]+W[1]) : (V[0]+V[1])\}$, where $V[0] = A[3] \times B[0]$, $V[1] = u_4 \times M[0]$, $W[0] = A[3] \times B[1]$, $W[1] = u_4 \times M[1]$, $V'[0] = A[3] \times B[2]$, $V'[1] = u_4 \times M[2]$, $W'[0] = A[3] \times B[3]$, $W'[1] = u_4 \times M[3]$.
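One round of the 256-bit case can be sketched in Python as eight independent 64x64-bit products whose pairwise sums form four lanes. The text gives the lanes as a concatenation and does not spell out how carries between lanes, or the previous intermediate result, are folded in; the sketch assumes each lane is weighted by $2^{64j}$, added to the previous result, and the zero low word is then dropped. Function names are hypothetical.

```python
P = 1 << 64  # word modulus, 2^64

def lane_products(a_i, u, b_w, m_w):
    """Eight independent products for one round, summed pairwise into four lanes."""
    v0, v1 = a_i * b_w[0], u * m_w[0]      # V[0],  V[1]
    w0, w1 = a_i * b_w[1], u * m_w[1]      # W[0],  W[1]
    vp0, vp1 = a_i * b_w[2], u * m_w[2]    # V'[0], V'[1]
    wp0, wp1 = a_i * b_w[3], u * m_w[3]    # W'[0], W'[1]
    return [v0 + v1, w0 + w1, vp0 + vp1, wp0 + wp1]

def combine(r_prev, lanes):
    """Add the weighted lanes to the previous result and drop the zero low word.

    Carry handling via big-integer addition is an assumption of this sketch.
    """
    total = r_prev + sum(lane << (64 * j) for j, lane in enumerate(lanes))
    if total % P:
        raise ValueError("u does not cancel the low 64-bit word")
    return total >> 64
```

Because none of the eight products depends on another, a chip with eight 64-bit multipliers can issue them in the same cycle; only the lane additions and the shift remain sequential.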
The beneficial effects of the invention are as follows:
The invention provides a system and a method for rapidly implementing Montgomery modular multiplication using multiple multipliers. It restructures the loop iteration of the original Montgomery modular multiplication formula so that several 64-bit multipliers operate in parallel, which greatly increases the speed at which an asymmetric algorithm chip performs signing, signature verification, encryption, decryption and key generation, and improves the performance of the security chip.
Drawings
FIG. 1 shows the chip structure in the system for rapidly implementing Montgomery modular multiplication using multiple multipliers provided in Embodiment 1;
FIG. 2 is a schematic diagram of the principle of the method for rapidly implementing Montgomery modular multiplication using multiple multipliers provided in Embodiment 2.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating the invention, are intended for purposes of illustration only and are not intended to limit the scope of the invention.
Example 1
This embodiment provides a system for rapidly implementing Montgomery modular multiplication using multiple multipliers, comprising an asymmetric algorithm chip and a host computer. As shown in FIG. 1, the asymmetric algorithm chip comprises a processor, an asymmetric hardware module and a random data module, all of which are connected to a bus. The asymmetric hardware module comprises a register, a RAM and an algorithm module, and the algorithm module comprises a point addition module, a point doubling module, a modular exponentiation module, a modular inversion module, a modular subtraction module, a modular addition module and a modular multiplication module. The processor writes the data and parameters to be operated on into the RAM of the asymmetric hardware module through the bus, and writes the operation mode into the register of the asymmetric hardware module. After the asymmetric hardware module detects the enable bit of the register, it calls the operation module corresponding to the operation mode, writes the result data into the RAM when the operation finishes, and at the same time sets an end flag in the register and generates an interrupt. The processor receives the interrupt or polls the end flag in the register of the asymmetric hardware module, reads the result data from the RAM through the bus, and outputs the result data to the host computer.
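The handshake between the processor and the asymmetric hardware module can be modelled with a small Python simulation. Everything here is a hypothetical stand-in: the class, the bit values of the enable bit and end flag, the operand names and the mode strings are assumptions, since the text does not give a register map. The `mod_mul` mode below is a plain modular product used only to exercise the flow, not the multi-multiplier datapath.

```python
class AsymmetricHardwareModule:
    """Software stand-in for the hardware module: a RAM, a register and operation modules."""
    ENABLE = 0x1  # assumed position of the enable bit
    DONE = 0x2    # assumed position of the end flag

    def __init__(self):
        self.ram = {}
        self.mode = None
        self.status = 0
        self.ops = {
            "mod_add": lambda a, b, m: (a + b) % m,
            "mod_sub": lambda a, b, m: (a - b) % m,
            "mod_mul": lambda a, b, m: (a * b) % m,
        }

    def write_ram(self, **operands):
        self.ram.update(operands)

    def write_register(self, mode):
        self.mode = mode
        self.status = self.ENABLE
        self._run()  # the hardware reacts once it sees the enable bit

    def _run(self):
        if self.status & self.ENABLE:
            op = self.ops[self.mode]
            self.ram["result"] = op(self.ram["a"], self.ram["b"], self.ram["m"])
            self.status = self.DONE  # set the end flag (an interrupt would fire here)

def processor_request(hw, mode, a, b, m):
    """Processor side: write operands, select the mode, wait for the end flag, read back."""
    hw.write_ram(a=a, b=b, m=m)
    hw.write_register(mode)
    while not hw.status & hw.DONE:  # polling; the chip may use an interrupt instead
        pass
    return hw.ram["result"]
```

A call such as `processor_request(AsymmetricHardwareModule(), "mod_mul", 7, 8, 13)` follows the write, enable, compute, flag and read-back sequence described above.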
Example 2
This embodiment provides a method for rapidly implementing Montgomery modular multiplication using multiple multipliers, based on the system described in Embodiment 1, comprising the following steps:
the processor acquires the data and parameters to be operated on, writes them into the RAM of the asymmetric hardware module, and writes the operation mode into the register of the asymmetric hardware module;
after the asymmetric hardware module detects the enable bit of the register, it calls the operation module corresponding to the determined operation mode to operate on the data and parameters;
when the operation finishes, the result data is written into the RAM, and at the same time an end flag is set in the register and an interrupt is generated; the processor receives the interrupt or polls the end flag in the register of the asymmetric hardware module, reads the result data from the RAM through the bus, and uploads the result data to the host computer.
In this embodiment, the corresponding operation module is called according to the determined operation mode to operate on the data and parameters; the operating principle of the multipliers used is shown in FIG. 2, and the procedure specifically comprises the following steps:
S1, determine the data length W of the modular multiplication and the modulus M of that length, where M[0], M[1], M[2], …, M[e] are the 64-bit words of M from low to high; A and B are the operands of that length, where A[0], A[1], A[2], …, A[e] are the 64-bit words of A from low to high and B[0], B[1], B[2], …, B[e] are the 64-bit words of B from low to high; define Md, T1, T2, u, W, V, W', V' as intermediate result registers;
S2, first perform the first-cycle operation with one 64-bit multiplier to compute the pre-computed value $Md$;
S3, perform the second-cycle operation with two 64-bit multipliers and one adder, computing in parallel the first-round parameter $u_1$;
S4, perform the third-cycle operation with eight 64-bit multipliers, computing in parallel the intermediate result $r0_1$ from the lowest 64-bit word $A[0]$ of operand A, all words $B[0], B[1], B[2], \dots, B[e]$ of B, and the first-round parameter $u_1$;
S5, perform the fourth-cycle operation with two 64-bit multipliers and one adder, computing in parallel the second-round parameter $u_2$;
S6, perform the fifth-cycle operation with eight 64-bit multipliers, computing in parallel the intermediate result $r0_2$ from the second-lowest 64-bit word $A[1]$ of operand A, all words $B[0], B[1], B[2], \dots, B[e]$ of B, and the second-round parameter $u_2$;
S7, perform the sixth-cycle operation with two 64-bit multipliers and one adder, computing in parallel the third-round parameter $u_3$;
S8, perform the seventh-cycle operation with eight 64-bit multipliers, computing in parallel the intermediate result $r0_3$ from the second-highest 64-bit word $A[2]$ of operand A, all words $B[0], B[1], B[2], \dots, B[e]$ of B, and the third-round parameter $u_3$;
S9, repeat steps S7-S8 until the parameter $u_{e+1}$ of the e-th round has been computed in parallel;
S10, compute in parallel the final result $r0_{e+1}$ from the highest 64-bit word $A[e]$ of operand A, all words $B[0], B[1], B[2], \dots, B[e]$ of B, and the e-th round parameter $u_{e+1}$.
In this embodiment, e = W/64 in step S1; the data length W may be, but is not limited to, 256 bits, 512 bits or 1024 bits.
In a more preferred embodiment, W in step S1 is 256 bits, and the pre-computed value in step S2 is $Md = B[0] \times Mc$;
the first-round parameter $u_1$ in step S3 is calculated as: $u_1 = T1 + T2$, $T1 = r0 \times Mc$, $T2 = A[0] \times Md$, $r0 = r \bmod W$, where W here denotes $2^{64}$ and r is the data being reduced modulo $2^{64}$;
$r0_1$ in step S4 is calculated as: $r0_1 = \{(W'[0]+W'[1]) : (V'[0]+V'[1]) : (W[0]+W[1]) : (V[0]+V[1])\}$, where $V[0] = A[0] \times B[0]$, $V[1] = u_1 \times M[0]$, $W[0] = A[0] \times B[1]$, $W[1] = u_1 \times M[1]$, $V'[0] = A[0] \times B[2]$, $V'[1] = u_1 \times M[2]$, $W'[0] = A[0] \times B[3]$, $W'[1] = u_1 \times M[3]$.
In step S5 of this embodiment: $u_2 = T1 + T2$, $T1 = r0_1 \times Mc$, $T2 = A[0] \times Md$;
$r0_2$ in step S6 is calculated as: $r0_2 = \{(W'[0]+W'[1]) : (V'[0]+V'[1]) : (W[0]+W[1]) : (V[0]+V[1])\}$, where $V[0] = A[1] \times B[0]$, $V[1] = u_2 \times M[0]$, $W[0] = A[1] \times B[1]$, $W[1] = u_2 \times M[1]$, $V'[0] = A[1] \times B[2]$, $V'[1] = u_2 \times M[2]$, $W'[0] = A[1] \times B[3]$, $W'[1] = u_2 \times M[3]$.
In step S7 of this embodiment: $u_3 = T1 + T2$, $T1 = r0_2 \times Mc$, $T2 = A[0] \times Md$;
$r0_3$ in step S8 is calculated as: $r0_3 = \{(W'[0]+W'[1]) : (V'[0]+V'[1]) : (W[0]+W[1]) : (V[0]+V[1])\}$, where $V[0] = A[2] \times B[0]$, $V[1] = u_3 \times M[0]$, $W[0] = A[2] \times B[1]$, $W[1] = u_3 \times M[1]$, $V'[0] = A[2] \times B[2]$, $V'[1] = u_3 \times M[2]$, $W'[0] = A[2] \times B[3]$, $W'[1] = u_3 \times M[3]$.
In step S9 of this embodiment: $u_4 = T1 + T2$, $T1 = r0_3 \times Mc$, $T2 = A[0] \times Md$;
$r0_4$ in step S10 is calculated as: $r0_4 = \{(W'[0]+W'[1]) : (V'[0]+V'[1]) : (W[0]+W[1]) : (V[0]+V[1])\}$, where $V[0] = A[3] \times B[0]$, $V[1] = u_4 \times M[0]$, $W[0] = A[3] \times B[1]$, $W[1] = u_4 \times M[1]$, $V'[0] = A[3] \times B[2]$, $V'[1] = u_4 \times M[2]$, $W'[0] = A[3] \times B[3]$, $W'[1] = u_4 \times M[3]$.
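The 256-bit schedule of steps S2 to S10 can be traced end to end with the Python sketch below, which records how many multipliers each cycle would occupy. Several points are assumptions rather than statements from the text: Mc is treated as a constant prepared in advance, round i uses the operand word A[i-1] when forming $u_i$, lane carries are resolved by big-integer addition, and a final conditional subtraction of M brings the result into the range [0, M). The function name and the schedule format are hypothetical.

```python
P = 1 << 64  # word modulus, 2^64

def words(x):
    """Four 64-bit words of a 256-bit integer, low word first."""
    return [(x >> (64 * j)) % P for j in range(4)]

def mont_mul_256(a, b, m):
    """Return (a * b * 2^-256 mod m, schedule) for odd m < 2^256 and a, b < m."""
    a_w, b_w, m_w = words(a), words(b), words(m)
    mc = (-pow(m_w[0], -1, P)) % P           # assumed to be precomputed off-line
    md = (b_w[0] * mc) % P                   # S2: one multiplier
    schedule = [("Md", 1)]
    r = 0
    for i, a_i in enumerate(a_w, start=1):
        u = ((r % P) * mc + a_i * md) % P    # S3/S5/S7/S9: two multipliers, one adder
        schedule.append((f"u{i}", 2))
        lanes = [a_i * b_w[j] + u * m_w[j] for j in range(4)]  # eight multipliers
        r = (r + sum(lane << (64 * j) for j, lane in enumerate(lanes))) >> 64
        schedule.append((f"r0_{i}", 8))
    return (r - m if r >= m else r), schedule
```

Under these assumptions the trace has nine cycles (one for Md, then four pairs of a u cycle and a lane cycle) and 41 word multiplications in total, compared with a dependent chain of multiplications when a single multiplier is iterated.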
The technical scheme disclosed by the invention provides the following beneficial effects:
The method restructures the loop iteration of the original Montgomery modular multiplication formula so that, after the conversion, several 64-bit multipliers operate in parallel, which greatly increases the speed at which an asymmetric algorithm chip performs signing, signature verification, encryption, decryption and key generation, and improves the performance of the security chip.
The foregoing is only a preferred embodiment of the present invention, and it should be noted that, for those skilled in the art, various modifications and improvements can be made without departing from the principle of the present invention, and such modifications and improvements should also be considered within the scope of the present invention.
Claims (9)
1. A system for rapidly implementing Montgomery modular multiplication using multiple multipliers, characterized by comprising an asymmetric algorithm chip and a host computer, wherein the asymmetric algorithm chip comprises a processor, an asymmetric hardware module and a random data module, and the processor, the asymmetric hardware module and the random data module are all connected to a bus; the asymmetric hardware module comprises a register, a RAM and an algorithm module, wherein the algorithm module comprises a point addition module, a point doubling module, a modular exponentiation module, a modular inversion module, a modular subtraction module, a modular addition module and a modular multiplication module; the processor writes the data and parameters to be operated on into the RAM of the asymmetric hardware module through the bus, and writes the operation mode into the register of the asymmetric hardware module; after the asymmetric hardware module detects the enable bit of the register, it calls the operation module corresponding to the operation mode, writes the result data into the RAM when the operation finishes, and at the same time sets an end flag in the register and generates an interrupt; the processor receives the interrupt or polls the end flag in the register of the asymmetric hardware module, reads the result data from the RAM through the bus, and outputs the result data to the host computer.
2. The system according to claim 1, wherein said processor is a microprocessor.
3. A method for rapidly implementing Montgomery modular multiplication using multiple multipliers, based on the system for rapidly implementing Montgomery modular multiplication using multiple multipliers according to any one of claims 1-2, comprising the following steps:
the processor acquires the data and parameters to be operated on, writes them into the RAM of the asymmetric hardware module, and writes the operation mode into the register of the asymmetric hardware module;
after the asymmetric hardware module detects the enable bit of the register, it calls the operation module corresponding to the determined operation mode to operate on the data and parameters;
when the operation finishes, the result data is written into the RAM, and at the same time an end flag is set in the register and an interrupt is generated; the processor receives the interrupt or polls the end flag in the register of the asymmetric hardware module, reads the result data from the RAM through the bus, and uploads the result data to the host computer.
4. The method according to claim 3, wherein calling the corresponding operation module to operate on the data and parameters according to the determined operation mode specifically comprises the following steps:
S1, determine the data length W of the modular multiplication and the modulus M of that length, where M[0], M[1], M[2], …, M[e] are the 64-bit words of M from low to high; A and B are the operands of that length, where A[0], A[1], A[2], …, A[e] are the 64-bit words of A from low to high and B[0], B[1], B[2], …, B[e] are the 64-bit words of B from low to high; define Md, T1, T2, u, W, V, W', V' as intermediate result registers;
S2, first perform the first-cycle operation with one 64-bit multiplier to compute the pre-computed value $Md$;
S3, perform the second-cycle operation with two 64-bit multipliers and one adder, computing in parallel the first-round parameter $u_1$;
S4, perform the third-cycle operation with eight 64-bit multipliers, computing in parallel the intermediate result $r0_1$ from the lowest 64-bit word $A[0]$ of operand A, all words $B[0], B[1], B[2], \dots, B[e]$ of B, and the first-round parameter $u_1$;
S5, perform the fourth-cycle operation with two 64-bit multipliers and one adder, computing in parallel the second-round parameter $u_2$;
S6, perform the fifth-cycle operation with eight 64-bit multipliers, computing in parallel the intermediate result $r0_2$ from the second-lowest 64-bit word $A[1]$ of operand A, all words $B[0], B[1], B[2], \dots, B[e]$ of B, and the second-round parameter $u_2$;
S7, perform the sixth-cycle operation with two 64-bit multipliers and one adder, computing in parallel the third-round parameter $u_3$;
S8, perform the seventh-cycle operation with eight 64-bit multipliers, computing in parallel the intermediate result $r0_3$ from the second-highest 64-bit word $A[2]$ of operand A, all words $B[0], B[1], B[2], B[3]$ of B, and the third-round parameter $u_3$;
S9, repeat steps S7-S8 until the parameter $u_{e+1}$ of the e-th round has been computed in parallel;
S10, compute in parallel the final result $r0_{e+1}$ from the highest 64-bit word $A[e]$ of operand A, all words $B[0], B[1], B[2], \dots, B[e]$ of B, and the e-th round parameter $u_{e+1}$.
5. The method according to claim 4, wherein in step S1, e = W/64; the data length W may be, but is not limited to, 256 bits, 512 bits or 1024 bits.
6. The method of claim 5, wherein W in step S1 is 256 bits, and the pre-computed value in step S2 is $Md = B[0] \times Mc$, where $Mc = -\{M1, M0\}^{-1} \bmod P$ and $P = 2^{64}$;
the first-round parameter $u_1$ in step S3 is calculated as: $u_1 = T1 + T2$, $T1 = r0 \times Mc$, $T2 = A[0] \times Md$, $r0 = r \bmod W$, where W here denotes $2^{64}$ and r is the data being reduced modulo $2^{64}$;
$r0_1$ in step S4 is calculated as: $r0_1 = \{(W'[0]+W'[1]) : (V'[0]+V'[1]) : (W[0]+W[1]) : (V[0]+V[1])\}$, where $V[0] = A[0] \times B[0]$, $V[1] = u_1 \times M[0]$, $W[0] = A[0] \times B[1]$, $W[1] = u_1 \times M[1]$, $V'[0] = A[0] \times B[2]$, $V'[1] = u_1 \times M[2]$, $W'[0] = A[0] \times B[3]$, $W'[1] = u_1 \times M[3]$.
7. The method for rapidly implementing Montgomery modular multiplication using multiple multipliers of claim 6, wherein in step S5: $u_2 = T1 + T2$, $T1 = r0_1 \times Mc$, $T2 = A[0] \times Md$;
$r0_2$ in step S6 is calculated as: $r0_2 = \{(W'[0]+W'[1]) : (V'[0]+V'[1]) : (W[0]+W[1]) : (V[0]+V[1])\}$, where $V[0] = A[1] \times B[0]$, $V[1] = u_2 \times M[0]$, $W[0] = A[1] \times B[1]$, $W[1] = u_2 \times M[1]$, $V'[0] = A[1] \times B[2]$, $V'[1] = u_2 \times M[2]$, $W'[0] = A[1] \times B[3]$, $W'[1] = u_2 \times M[3]$.
8. The method for rapidly implementing Montgomery modular multiplication using multiple multipliers of claim 7, wherein in step S7: $u_3 = T1 + T2$, $T1 = r0_2 \times Mc$, $T2 = A[0] \times Md$;
$r0_3$ in step S8 is calculated as: $r0_3 = \{(W'[0]+W'[1]) : (V'[0]+V'[1]) : (W[0]+W[1]) : (V[0]+V[1])\}$, where $V[0] = A[2] \times B[0]$, $V[1] = u_3 \times M[0]$, $W[0] = A[2] \times B[1]$, $W[1] = u_3 \times M[1]$, $V'[0] = A[2] \times B[2]$, $V'[1] = u_3 \times M[2]$, $W'[0] = A[2] \times B[3]$, $W'[1] = u_3 \times M[3]$.
9. The method for rapidly implementing Montgomery modular multiplication using multiple multipliers of claim 8, wherein in step S9: $u_4 = T1 + T2$, $T1 = r0_3 \times Mc$, $T2 = A[0] \times Md$;
$r0_4$ in step S10 is calculated as: $r0_4 = \{(W'[0]+W'[1]) : (V'[0]+V'[1]) : (W[0]+W[1]) : (V[0]+V[1])\}$, where $V[0] = A[3] \times B[0]$, $V[1] = u_4 \times M[0]$, $W[0] = A[3] \times B[1]$, $W[1] = u_4 \times M[1]$, $V'[0] = A[3] \times B[2]$, $V'[1] = u_4 \times M[2]$, $W'[0] = A[3] \times B[3]$, $W'[1] = u_4 \times M[3]$.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210565348.XA CN114840174B (en) | 2022-05-18 | 2022-05-18 | System and method for rapidly realizing Montgomery modular multiplication by using multiple multipliers |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210565348.XA CN114840174B (en) | 2022-05-18 | 2022-05-18 | System and method for rapidly realizing Montgomery modular multiplication by using multiple multipliers |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114840174A (en) | 2022-08-02 |
CN114840174B (en) | 2023-03-03 |
Family
ID=82571290
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210565348.XA Active CN114840174B (en) | 2022-05-18 | 2022-05-18 | System and method for rapidly realizing Montgomery modular multiplication by using multiple multipliers |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114840174B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117240601A (en) * | 2023-11-09 | 2023-12-15 | 深圳大普微电子股份有限公司 | Encryption processing method, encryption processing circuit, processing terminal, and storage medium |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020172355A1 (en) * | 2001-04-04 | 2002-11-21 | Chih-Chung Lu | High-performance booth-encoded montgomery module |
EP1471420A2 (en) * | 2003-04-25 | 2004-10-27 | Samsung Electronics Co., Ltd. | Montgomery modular multiplier and method thereof using carry save addition |
CN1786900A (en) * | 2005-10-28 | 2006-06-14 | 清华大学 | Multiplier based on improved Montgomey's algorithm |
CN102609239A (en) * | 2011-09-01 | 2012-07-25 | 北京华大信安科技有限公司 | ECC (elliptic curve cryptography) coprocessor |
CN104765586A (en) * | 2015-04-15 | 2015-07-08 | 深圳国微技术有限公司 | Embedded security chip and Montgomery modular multiplication operational method thereof |
CN109145616A (en) * | 2018-08-01 | 2019-01-04 | 上海交通大学 | The realization method and system of SM2 encryption, signature and key exchange based on efficient modular multiplication |
CN110460443A (en) * | 2019-08-09 | 2019-11-15 | 南京秉速科技有限公司 | The high speed point add operation method and apparatus of elliptic curve cipher |
Non-Patent Citations (4)
Title |
---|
Jia-Hong Zhang et al.: "Hardware Implementation of Improved Montgomery Modular Multiplication Algorithm", 2009 WRI International Conference on Communications and Mobile Computing * |
Trio Adiono et al.: "Full custom design of adaptable montgomery modular multiplier for asymmetric RSA cryptosystem", 2017 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS) * |
Shu Yan et al.: "VLSI design of RSA based on a Booth-encoded modular multiplication module", Journal of Xidian University * |
Xu Wei: "Fast FPGA-based implementation of modular exponentiation and modular multiplication for the RSA cryptographic algorithm", China Master's Theses Full-text Database * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117240601A (en) * | 2023-11-09 | 2023-12-15 | 深圳大普微电子股份有限公司 | Encryption processing method, encryption processing circuit, processing terminal, and storage medium |
CN117240601B (en) * | 2023-11-09 | 2024-03-26 | 深圳大普微电子股份有限公司 | Encryption processing method, encryption processing circuit, processing terminal, and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN114840174B (en) | 2023-03-03 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US7904498B2 (en) | Modular multiplication processing apparatus | |
Öztürk et al. | Low-power elliptic curve cryptography using scaled modular arithmetic | |
WO2015164996A1 (en) | Elliptic domain curve operational method and elliptic domain curve operational unit | |
WO2020006692A1 (en) | Fully homomorphic encryption method and device and computer readable storage medium | |
US9268564B2 (en) | Vector and scalar based modular exponentiation | |
US8862651B2 (en) | Method and apparatus for modulus reduction | |
CN109145616B (en) | SM2 encryption, signature and key exchange implementation method and system based on efficient modular multiplication | |
CN106712965B (en) | Digital signature method and device and password equipment | |
EP3570488A1 (en) | Online/offline signature system and method based on multivariate cryptography | |
CN113010142B (en) | Novel pulse node type scalar dot multiplication double-domain implementation system and method | |
US20220166614A1 (en) | System and method to optimize generation of coprime numbers in cryptographic applications | |
Zhang et al. | Efficient prime-field arithmetic for elliptic curve cryptography on wireless sensor nodes | |
CN114840174B (en) | System and method for rapidly realizing Montgomery modular multiplication by using multiple multipliers | |
JP4180024B2 (en) | Multiplication remainder calculator and information processing apparatus | |
JP3542278B2 (en) | Montgomery reduction device and recording medium | |
Moon et al. | Fast VLSI arithmetic algorithms for high-security elliptic curve cryptographic applications | |
JP4170267B2 (en) | Multiplication remainder calculator and information processing apparatus | |
US7319750B1 (en) | Digital circuit apparatus and method for accelerating preliminary operations for cryptographic processing | |
CN112737778A (en) | Digital signature generation and verification method and device, electronic equipment and storage medium | |
CN113114462A (en) | Small-area scalar multiplication circuit applied to ECC (error correction code) safety hardware circuit | |
CN116527274A (en) | Elliptic curve signature verification method and system based on multi-scalar multiplication rapid calculation | |
CN114238205A (en) | High-performance ECC coprocessor system resisting power consumption attack | |
CN114594925A (en) | Efficient modular multiplication circuit suitable for SM2 encryption operation and operation method thereof | |
Arazi et al. | On calculating multiplicative inverses modulo $2^{m} $ | |
CN117992990B (en) | Efficient homomorphic encryption method for power data, processor and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||