CN109976707B - Automatic generation method of variable bit-width multiplier - Google Patents


Publication number
CN109976707B
Authority
CN
China
Prior art keywords
multiplier
parameter
level
rtl
multipliers
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910215666.1A
Other languages
Chinese (zh)
Other versions
CN109976707A (en)
Inventor
邸志雄
叶帅
葛悦
李福强
周玉欣
陆可承
冯全源
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southwest Jiaotong University
Original Assignee
Southwest Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southwest Jiaotong University
Priority to CN201910215666.1A
Publication of CN109976707A
Application granted
Publication of CN109976707B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/52 Multiplying; Dividing
    • G06F7/523 Multiplying only
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Devices For Executing Special Programs (AREA)

Abstract

The invention discloses an automatic generation method for a variable bit-width multiplier. A user creates a target folder and configures the parameters of the top-level multiplier; the multiplier is then divided downward level by level according to the multiplier nesting levels given in the parameter configuration file of the current-level multiplier, and the corresponding RTL code is generated at each level; division stops once the divided units are minimum-granularity units, which completes generation of the RTL code for the required multiplier. The invention takes the configuration of the number of pipeline stages into account and makes the multiplier configurable, so that the designed multiplier is highly flexible and broadly applicable.

Description

Automatic generation method of variable bit-width multiplier
Technical Field
The invention relates to the technical field of digital chip design, in particular to an automatic generation method of a variable bit width multiplier.
Background
The multiplier is one of the important arithmetic components in hard-core processors, DSPs, filters, high-performance microcontrollers and other devices. Besides being used directly in arithmetic units, high-performance multiplication plays a very important role in signal-processing fields such as image, voice and encryption. A multiplier has a complex structure, a large delay and a long operation period, and often lies on the critical path of a system, so designing and optimizing the multiplier structure can greatly improve performance metrics of the whole processor system such as speed, area and power consumption. With the advent of high-performance computing scenarios such as machine learning and big-data acceleration, the multiplier has become a significant factor in processing real-time video, audio and image signals. Because application scenarios vary, the required bit width, data type and performance of the multiplier differ as well; a method is therefore needed for rapidly designing highly flexible multiplier hardware structures with variable bit widths.
An existing reference is Jiao Jiye, Mu Rong, Hao Yue, "Research on a programming language for the fast design of high-performance signed multiplier circuits", Journal of Electronics, 2013, vol. 41(11): 2256-2261. The core idea of that programming language is to separate out the basic units of the multiplier, such as the encoder, the adder tree and the fast adder, and to express the functions of the basic units and their interconnections with instructions. Its disadvantages are that it does not adequately support variable bit widths and that it does not consider configuration of the number of pipeline stages in the fast design of the multiplier.
Disclosure of Invention
In view of the above problems, an object of the present invention is to provide a fast and flexible method for automatically generating a variable bit-width multiplier that takes the configuration of the number of pipeline stages into account. The technical solution is as follows:
Step 1: a user creates a target folder and configures the parameters of the top-level multiplier, including the number of pipeline stages;
Step 2: according to the multiplier nesting levels in the parameter configuration file of the current-level multiplier, the multiplier is divided downward level by level and the corresponding RTL code is generated;
Step 3: step 2 is repeated until the divided units are minimum-granularity units; division then stops, which completes generation of the RTL code for the required multiplier.
Further, the configuration file of the top-level multiplier parameters is a text file containing 9 user-defined multiplier parameters:
Parameter $1: the multiplicand bit count, i.e. the bit width of the multiplicand A;
Parameter $2: the multiplier bit count, i.e. the bit width of the multiplier B;
Parameter $3: s indicates that the multiplicand A is a signed number; u indicates that the multiplicand A is an unsigned number;
Parameter $4: s indicates that the multiplier B is a signed number; u indicates that the multiplier B is an unsigned number;
Parameter $5: rst denotes asynchronous reset; sclr denotes synchronous reset;
Parameter $6: the target file path, i.e. the storage path of the Verilog code generated by the script;
Parameter $7: the cell library file path, i.e. the file path of the minimum multiplier cell library;
Parameter $8: a character string giving the top-level module name;
Parameter $9: the number of pipeline stages configured by the user.
Further, the configurable number of pipeline stages m has the range 0 ≤ m ≤ 2n, where n is the number of RTL code division layers. For each layer of RTL code, a pipeline stage can be inserted at two selectable positions: after the adder with the largest bit width, and between the CSA and the adder.
Further, the multiplier nesting levels use the configurable bit widths {2,4,8,12,16,20,24,28,32}, and the user can configure a multiplier with any combination of these widths.
Further, step 2 specifically comprises: after the top-level parameters of the current-level multiplier are configured, determining whether the current-level multiplier is a minimum-unit multiplier; if so, the minimum-unit multiplier is called directly; otherwise, the primary multiplier is split into two groups of secondary multipliers and the following are generated: the top-level RTL code of the multiplier, the CSA RTL code, the final-adder RTL code, the register-set RTL code, and the parameter configuration file of the secondary multipliers.
The beneficial effects of the invention are as follows. The invention makes full use of the multi-level regular structure of the multiplier: the user creates a target folder and configures the required multiplier parameters, the multiplier is then divided downward level by level according to the multiplier nesting levels and the corresponding RTL code is generated, and division stops once the divided units are minimum-granularity units, which completes generation of the RTL code for the required multiplier. Variable bit widths of both the multiplier and the multiplicand are thus supported. The configuration of the number of pipeline stages is also taken into account, which makes the multiplier configurable, so that the designed multiplier is highly flexible and broadly applicable.
Drawings
FIG. 1 is a general flow chart of the automated generation method of the variable bit width multiplier of the present invention.
FIG. 2 is a code generation flow chart.
FIG. 3 is an overall framework for multiplier implementation.
FIG. 4 is an overall framework of an 8bit cell library script.
Detailed Description
The invention will now be described in further detail with reference to the drawings and to specific examples. FIG. 1 is a flowchart of the fast multiplier generation method proposed in this patent. After the user creates a target folder and configures the required multiplier parameters, the multiplier is divided downward level by level according to the multiplier nesting levels listed in Table 3-4, and the corresponding RTL code is generated; division stops once the divided units are minimum-granularity units, which completes generation of the RTL code for the required multiplier. The specific steps are as follows:
Step 1: the user creates a target folder and configures the parameters of the top-level multiplier.
The top-level configuration file is a text file in which the multiplier parameters entered by the user are separated by spaces so that the script can identify them correctly. It contains 9 parameters, whose meanings are shown in Table 1. The top-level parameters are defined by the user; the parameters of all other levels are generated by the software at each layer. The file format is identical at every level, and each configuration level automatically generates the configuration file of the next level.
Table 3-6: Configuration file requirements
[Table provided as image BDA0002001989680000031 in the original publication]
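As an illustration of the configuration format described above, the following is a minimal Python sketch of how a script might read the nine space-separated parameters. The patent does not disclose its script, so the class name `MultiplierConfig`, the function `parse_config` and all field names are hypothetical; only the order and meaning of parameters $1 to $9 come from the text.

```python
from dataclasses import dataclass


# Hypothetical container; the patent only numbers the parameters $1..$9.
@dataclass(frozen=True)
class MultiplierConfig:
    a_width: int          # $1: bit width of multiplicand A
    b_width: int          # $2: bit width of multiplier B
    a_signed: bool        # $3: 's' = signed, 'u' = unsigned
    b_signed: bool        # $4: 's' = signed, 'u' = unsigned
    reset: str            # $5: 'rst' (asynchronous) or 'sclr' (synchronous)
    out_path: str         # $6: storage path of the generated Verilog code
    lib_path: str         # $7: path of the minimum multiplier cell library
    top_name: str         # $8: top-level module name
    pipeline_stages: int  # $9: user-configured number of pipeline stages


def parse_config(line: str) -> MultiplierConfig:
    """Parse one line holding the nine space-separated parameters."""
    fields = line.split()
    if len(fields) != 9:
        raise ValueError(f"expected 9 parameters, got {len(fields)}")
    a_w, b_w, a_s, b_s, reset, out_path, lib_path, top, stages = fields
    for flag in (a_s, b_s):
        if flag not in ("s", "u"):
            raise ValueError(f"sign flag must be 's' or 'u', got {flag!r}")
    if reset not in ("rst", "sclr"):
        raise ValueError(f"reset must be 'rst' or 'sclr', got {reset!r}")
    return MultiplierConfig(
        int(a_w), int(b_w), a_s == "s", b_s == "s",
        reset, out_path, lib_path, top, int(stages),
    )
```

Because every level uses the same file format, the same parser could in principle be reused for the configuration files that each level generates for the next one. Paths containing spaces are not handled by this sketch.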
Step 2: according to the multiplier nesting levels in the parameter configuration file of the current-level multiplier, the multiplier is divided downward level by level and the corresponding RTL code is generated.
When the minimum multiplier cell library is a 2-bit multiplier, all even-width multipliers up to 32 bits, and any combination of those widths, could in principle be implemented. Because some widths are rarely used, only the configurable bit widths {2,4,8,12,16,20,24,28,32} are adopted; the user can configure a multiplier with any combination of these widths. The detailed nested-level division is shown in Table 3-4.
Table 3-4: Multiplier nested-level division
[Table provided as images BDA0002001989680000032 and BDA0002001989680000041 in the original publication]
The splitting forms based on the 4-bit and 8-bit minimum multiplier cell libraries follow the same splitting principle as the table above and are not described again here.
Step 3: step 2 is repeated until the divided units are minimum-granularity units; division then stops, which completes generation of the RTL code for the required multiplier.
The code generation method is essentially the same at every level, and the configuration files generated at each level have the same format. The levels differ only in the bit width of the multiplier, the format of the multiplicand and the multiplier, and the top-level module name of the code generated at that level.
The code generation method at each level produces five parts: the top-level RTL code of the multiplier, the CSA RTL code, the final-adder RTL code, the register-set RTL code, and the parameter configuration file of the next level.
FIG. 2 is a flow chart of the code generation method at each level. After the top-level parameters of the current-level multiplier are configured, it is determined whether the current-level multiplier is a minimum-unit multiplier; if so, the minimum-unit multiplier is called directly. Otherwise, the current-level multiplier is split into multiplier composition form 1 and multiplier composition form 2; for each form, the Verilog top-level file of the current level and the configuration of the next level are generated, and the RTL descriptions of the adders and registers are generated.
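The recursion in this flow can be modelled behaviourally. The Python sketch below is an assumption-laden illustration, not the patented script: it handles unsigned operands only, returns a product instead of emitting RTL, and uses a hypothetical split rule (`low_part_width`, which peels off the largest power of two below the operand width) because the actual division forms in Table 3-4 are not reproduced in this text. The function names `multiply` and `count_units` are likewise hypothetical.

```python
def low_part_width(width: int) -> int:
    """Hypothetical split rule: largest power of two strictly below width."""
    p = 1
    while p * 2 < width:
        p *= 2
    return p


def multiply(a: int, b: int, wa: int, wb: int, unit: int) -> int:
    """Behavioural model of the nested multiplier for unsigned operands.

    a, b are assumed to fit in wa and wb bits; unit is the width of the
    minimum multiplier in the cell library (2, 4 or 8 in the text).
    """
    if wa <= unit and wb <= unit:
        return a * b  # stands in for a call to the minimum-unit multiplier
    if wa >= wb:
        lo = low_part_width(wa)
        a_lo, a_hi = a & ((1 << lo) - 1), a >> lo
        # Two groups of next-level multipliers, recombined by shift and add.
        return (multiply(a_hi, b, wa - lo, wb, unit) << lo) + \
            multiply(a_lo, b, lo, wb, unit)
    lo = low_part_width(wb)
    b_lo, b_hi = b & ((1 << lo) - 1), b >> lo
    return (multiply(a, b_hi, wa, wb - lo, unit) << lo) + \
        multiply(a, b_lo, wa, lo, unit)


def count_units(wa: int, wb: int, unit: int) -> int:
    """Number of minimum-unit multipliers reached by the same recursion."""
    if wa <= unit and wb <= unit:
        return 1
    if wa >= wb:
        lo = low_part_width(wa)
        return count_units(wa - lo, wb, unit) + count_units(lo, wb, unit)
    lo = low_part_width(wb)
    return count_units(wa, wb - lo, unit) + count_units(wa, lo, unit)
```

Under this assumed rule, a width of 12 splits into 8 + 4 and a width of 32 into 16 + 16, so every intermediate width stays inside the set {2,4,8,12,16,20,24,28,32}. In generated hardware the shift-and-add recombination would correspond to the CSA and final adder produced at each level; signed operands would need sign handling that this sketch omits.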
Pipeline configurability: the configurable number of pipeline stages m has the range 0 ≤ m ≤ 2n, where n is the number of RTL code division layers. For each layer of RTL code, a pipeline stage can be inserted at two selectable positions: after the adder with the largest bit width, and between the CSA and the adder.
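The bound 0 ≤ m ≤ 2n follows from each of the n layers offering two insertion positions. A minimal Python sketch of that check is given below; the slot labels and the function names `candidate_slots` and `check_pipeline_stages` are hypothetical, and the text does not state which of the 2n positions are filled when m is smaller than 2n, so no selection policy is modelled.

```python
from typing import List, Tuple

# Hypothetical labels for the two insertion positions named in the text.
SLOTS_PER_LAYER = ("between_csa_and_adder", "after_adder")


def candidate_slots(n_layers: int) -> List[Tuple[int, str]]:
    """All (layer, position) pairs where a register stage could be inserted."""
    return [(layer, slot)
            for layer in range(n_layers)
            for slot in SLOTS_PER_LAYER]


def check_pipeline_stages(m: int, n_layers: int) -> int:
    """Return m if it satisfies 0 <= m <= 2n, otherwise raise ValueError."""
    if n_layers < 0:
        raise ValueError("number of division layers must be non-negative")
    if not 0 <= m <= 2 * n_layers:
        raise ValueError(
            f"pipeline stages m={m} outside the range 0..{2 * n_layers}")
    return m
```

For example, with three division layers there are six candidate positions, so any m from 0 to 6 passes the check and m = 7 is rejected.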
This embodiment makes the multiplier configurable while optimizing its performance, so that the designed multiplier is highly flexible and broadly applicable. It consists of two main parts: the implementation of the cell-library multipliers and the implementation of the script. A large-bit-width multiplier can be split into small-bit-width multipliers, which are then called level by level down to the cell-library multipliers. The work done by the script is exactly this splitting of multipliers and the calls between them, which produces the synthesizable RTL (register-transfer level) files of the multiplier.
FIG. 3 is a simplified overall framework. The user first writes the specific configuration of the top-level multiplier to a designated file; the script program then reads the contents of the file and instantiates the relevant RTL files, including the top-level multiplier, the register set, the CSA compressor and the serial carry adder, together with the specific configuration of the next level after splitting.
In this embodiment, the minimum multiplier unit has three implementations, namely 2-bit, 4-bit and 8-bit multipliers, and the user can choose among them according to specific requirements. As mentioned above, the whole method splits multipliers and generates the calls between them in order to produce synthesizable RTL files. Taking the 8-bit minimum multiplier cell library as an example, if a 32x32-bit multiplier is generated, its overall implementation framework is shown in FIG. 4: the method generates the top-level Verilog HDL file of the current level according to the division form of the multiplier, and then calls the next-level multipliers level by level down to the minimum cell library.
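For orientation, the arithmetic behind a 32x32-bit product built from 8-bit units can be written out as follows. This is a standard identity for unsigned operands, given here as a worked illustration; the symbols A_i and B_j are introduced for this example only, and the grouping into levels used in FIG. 4 may differ from the flat double sum shown.

```latex
% Split each 32-bit operand into four 8-bit digits (unsigned case).
A = \sum_{i=0}^{3} A_i \, 2^{8i}, \qquad
B = \sum_{j=0}^{3} B_j \, 2^{8j}, \qquad
0 \le A_i, B_j < 2^{8}

% The product is a shifted sum of 4 x 4 = 16 partial products,
% each of which is one 8x8-bit multiplication.
A \cdot B = \sum_{i=0}^{3} \sum_{j=0}^{3} A_i B_j \, 2^{8(i+j)}
```

Each term A_i B_j is an 8x8-bit multiplication that a minimum-unit multiplier can perform, so sixteen such units suffice for the unsigned case; the shifts and the summation correspond to the work of the CSA and adder stages.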

Claims (3)

1. An automatic generation method for a variable bit-width multiplier, characterized by comprising the following steps:
Step 1: a user creates a target folder and configures the top-level multiplier parameters, the configuration file of the top-level multiplier parameters being a text file containing 9 user-defined multiplier parameters:
Parameter $1: the multiplicand bit count, i.e. the bit width of the multiplicand A;
Parameter $2: the multiplier bit count, i.e. the bit width of the multiplier B;
Parameter $3: s indicates that the multiplicand A is a signed number; u indicates that the multiplicand A is an unsigned number;
Parameter $4: s indicates that the multiplier B is a signed number; u indicates that the multiplier B is an unsigned number;
Parameter $5: rst denotes asynchronous reset; sclr denotes synchronous reset;
Parameter $6: the target file path, i.e. the storage path of the Verilog code generated by the script;
Parameter $7: the cell library file path, i.e. the file path of the minimum multiplier cell library;
Parameter $8: a character string giving the top-level module name;
Parameter $9: the number of pipeline stages configured by the user;
Step 2: according to the multiplier nesting levels in the parameter configuration file of the current-level multiplier, the multiplier is divided downward level by level and the corresponding RTL code is generated;
after the top-level parameters of the current-level multiplier are configured, it is determined whether the current-level multiplier is a minimum-unit multiplier; if so, the minimum-unit multiplier is called directly; otherwise, the primary multiplier is split into two groups of secondary multipliers and the following are generated: the top-level RTL code of the primary multiplier, the CSA RTL code, the final-adder RTL code, the register-set RTL code, and the parameter configuration file of the secondary multipliers;
Step 3: step 2 is repeated until the divided units are minimum-granularity units; division then stops, which completes generation of the RTL code for the required multiplier.
2. The automatic generation method for a variable bit-width multiplier of claim 1, wherein the configurable number of pipeline stages m has the range 0 ≤ m ≤ 2n, where n is the number of RTL code division layers; for each layer of RTL code, a pipeline stage can be inserted at two selectable positions: after the adder with the largest bit width, and between the CSA and the adder.
3. The automatic generation method for a variable bit-width multiplier of claim 1, wherein the multiplier nesting levels use the configurable bit widths {2,4,8,12,16,20,24,28,32}, and the user can configure a multiplier with any combination of these widths.
CN201910215666.1A 2019-03-21 2019-03-21 Automatic generation method of variable bit-width multiplier Active CN109976707B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910215666.1A CN109976707B (en) 2019-03-21 2019-03-21 Automatic generation method of variable bit-width multiplier

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910215666.1A CN109976707B (en) 2019-03-21 2019-03-21 Automatic generation method of variable bit-width multiplier

Publications (2)

Publication Number Publication Date
CN109976707A CN109976707A (en) 2019-07-05
CN109976707B CN109976707B (en) 2023-05-05

Family

ID=67079797

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910215666.1A Active CN109976707B (en) 2019-03-21 2019-03-21 Automatic generation method of variable bit-width multiplier

Country Status (1)

Country Link
CN (1) CN109976707B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111125976B (en) * 2019-12-06 2022-09-06 中国电子科技集团公司第五十八研究所 Automatic generation method of RTL model

Citations (2)

Publication number Priority date Publication date Assignee Title
CN101458617A (en) * 2008-01-22 2009-06-17 西北工业大学 32 bit integer multiplier based on CISC microprocessor
CN105378651A (en) * 2013-05-24 2016-03-02 相干逻辑公司 Memory-network processor with programmable optimizations

Family Cites Families (9)

Publication number Priority date Publication date Assignee Title
CN101552547B (en) * 2009-01-14 2011-07-20 西南交通大学 Pseudo-continuous work mode switch power supply power factor correcting method and device thereof
CN201607731U (en) * 2009-09-15 2010-10-13 新思科技有限公司 Equipment used for circuit design of sequential units
CN102073473A (en) * 2009-11-20 2011-05-25 杨军 Field programmable gata array (FPGA)-based metric floating-point multiplier design
GB201111243D0 (en) * 2011-06-30 2011-08-17 Imagination Tech Ltd Method and apparatus for use in the sysnthesis of lossy integer multipliers
CN102591615A (en) * 2012-01-16 2012-07-18 中国人民解放军国防科学技术大学 Structured mixed bit-width multiplying method and structured mixed bit-width multiplying device
US9753693B2 (en) * 2013-03-15 2017-09-05 Imagination Technologies Limited Constant fraction integer multiplication
CN103699355B (en) * 2013-12-30 2017-02-08 南京大学 Variable-order pipeline serial multiply-accumulator
CN110443360B (en) * 2017-06-16 2021-08-06 上海兆芯集成电路有限公司 Method for operating a processor
CN109445365B (en) * 2018-12-27 2021-07-09 青岛中科青芯电子科技有限公司 Screening test method of FPGA embedded multiplier

Also Published As

Publication number Publication date
CN109976707A (en) 2019-07-05

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant