CN109976707B

CN109976707B - Automatic generation method of variable bit-width multiplier

Info

Publication number: CN109976707B
Application number: CN201910215666.1A
Authority: CN
Inventors: 邸志雄; 叶帅; 葛悦; 李福强; 周玉欣; 陆可承; 冯全源
Original assignee: Southwest Jiaotong University
Current assignee: Southwest Jiaotong University
Priority date: 2019-03-21
Filing date: 2019-03-21
Publication date: 2023-05-05
Anticipated expiration: 2039-03-21
Also published as: CN109976707A

Abstract

The invention discloses an automatic generation method for a variable bit-width multiplier, comprising: a user creates a target folder and configures the parameters of the top-level multiplier; the multiplier is divided downward level by level according to the multiplier nesting level in the parameter configuration file of the current-level multiplier, and the corresponding RTL code is generated; division stops once the divided units are minimum-granularity units, which completes generation of the RTL code of the required multiplier. The invention takes the configuration of the number of pipeline stages into account and makes the multiplier configurable, so that the designed multiplier is highly flexible and broadly applicable.

Description

Automatic generation method of variable bit-width multiplier

Technical Field

The invention relates to the technical field of digital chip design, in particular to an automatic generation method of a variable bit width multiplier.

Background

The multiplier is one of the important arithmetic components in hard-core processors, DSPs, filters, high-performance microcontrollers and other devices. Besides being used directly in arithmetic units, high-performance multiplication plays a very important role in signal-processing fields such as image, voice and encryption. A multiplier has a complex structure, a large delay and a long operation period, and it often lies on the critical path of the system, so designing and optimizing the multiplier structure can greatly improve performance indexes of the whole processor system such as speed, area and power consumption. With the advent of high-performance computing scenarios such as machine learning and big-data acceleration, the multiplier has become a significant factor in processing real-time video, audio and image signals. Because application scenarios vary, the required bit width, data type and performance of the multiplier also vary; a highly flexible multiplier hardware structure that supports variable bit widths and can be designed rapidly is therefore needed.

An existing reference is Jiao Jiye, Mu Rong, Hao Yue, "Research on a programming language for the fast design of high-performance signed multiplier circuits," Journal of Electronics, 2013, vol. 41 (11): 2256-2261. The core idea of that programming language is to separate out the basic units of the multiplier, such as the encoder, the adder tree and the fast adder, and to express the functions of the basic units and their interconnections with instructions. Its disadvantages are that it does not sufficiently support variable bit widths and that it does not consider the configuration of the number of pipeline stages in the fast design of the multiplier.

Disclosure of Invention

In view of the above problems, an object of the present invention is to provide a fast and flexible method for automatically generating a variable bit-width multiplier that takes the configuration of the number of pipeline stages into account. The technical solution is as follows:

Step 1: a user creates a target folder and configures the parameters of the top-level multiplier, including the number of pipeline stages;

Step 2: according to the multiplier nesting level in the parameter configuration file of the current-level multiplier, divide downward level by level and generate the corresponding RTL code;

Step 3: repeat step 2 until the divided units are minimum-granularity units, then stop dividing; generation of the RTL code of the required multiplier is then complete.

Further, the configuration file of the top-level multiplier parameters is a text file containing nine user-defined multiplier parameters:

parameter $1: the multiplicand bit number, i.e. the bit width of the multiplicand A;

parameter $2: the multiplier bit number, i.e. the bit width of the multiplier B;

parameter $3: s indicates that the multiplicand A is a signed number; u indicates that the multiplicand A is an unsigned number;

parameter $4: s indicates that the multiplier B is a signed number; u indicates that the multiplier B is an unsigned number;

parameter $5: rst denotes asynchronous reset; sclr denotes synchronous reset;

parameter $6: the target file path, i.e. the storage path of the Verilog code generated by the script;

parameter $7: the cell library file path, i.e. the file path of the minimum multiplier cell library;

parameter $8: a character string giving the top module name;

parameter $9: the number of pipeline stages configured by the user.
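As an illustration of the nine-field format above, the following minimal Python sketch parses one configuration line. The class name `MultiplierConfig`, the function name `parse_config` and the field names are hypothetical; the source only states that the file is a text file holding parameters $1 to $9. The sketch assumes the fields are separated by spaces, so paths that themselves contain spaces are not handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MultiplierConfig:
    """One configuration line; field names are illustrative, not from the source."""
    a_width: int          # $1: bit width of multiplicand A
    b_width: int          # $2: bit width of multiplier B
    a_signed: bool        # $3: 's' -> signed, 'u' -> unsigned
    b_signed: bool        # $4: 's' -> signed, 'u' -> unsigned
    reset: str            # $5: 'rst' (asynchronous) or 'sclr' (synchronous)
    target_path: str      # $6: storage path of the generated Verilog code
    cell_lib_path: str    # $7: path of the minimum multiplier cell library
    top_name: str         # $8: top module name
    pipeline_stages: int  # $9: user-configured number of pipeline stages


def parse_config(line: str) -> MultiplierConfig:
    """Split a line on whitespace and validate the enumerated fields."""
    fields = line.split()
    if len(fields) != 9:
        raise ValueError(f"expected 9 fields, got {len(fields)}")
    a_w, b_w, a_s, b_s, reset, target, lib, name, stages = fields
    for flag in (a_s, b_s):
        if flag not in ("s", "u"):
            raise ValueError(f"sign flag must be 's' or 'u', got {flag!r}")
    if reset not in ("rst", "sclr"):
        raise ValueError(f"reset must be 'rst' or 'sclr', got {reset!r}")
    return MultiplierConfig(int(a_w), int(b_w), a_s == "s", b_s == "s",
                            reset, target, lib, name, int(stages))
```

For example, `parse_config("32 16 s u rst ./out ./lib top_mul 3")` would describe a 32-bit signed multiplicand, a 16-bit unsigned multiplier, asynchronous reset and three pipeline stages; the paths and module name in that line are made up for the example.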

Further, the configurable number m of pipeline stages has the value range 0 ≤ m ≤ 2n, where n is the number of RTL code division layers; for each layer of RTL code, two pipeline insertion positions are available: after the adder with the maximum number of bits, and between the CSA and the adder.
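The bound 0 ≤ m ≤ 2n follows from counting two candidate register positions per division layer. The short Python sketch below enumerates those positions and checks a requested stage count. The function names and position labels are hypothetical, and the source does not specify which m of the 2n positions actually receive registers, so the sketch only validates the count.

```python
from __future__ import annotations

# Labels are illustrative names for the two insertion positions per layer.
POSITIONS = ("between_csa_and_adder", "after_widest_adder")


def pipeline_slots(n_layers: int) -> list[tuple[int, str]]:
    """Enumerate the 2n candidate register positions for n division layers."""
    return [(layer, pos)
            for layer in range(1, n_layers + 1)
            for pos in POSITIONS]


def check_pipeline_stages(m: int, n_layers: int) -> None:
    """Raise ValueError unless 0 <= m <= 2n."""
    limit = 2 * n_layers
    if not 0 <= m <= limit:
        raise ValueError(f"pipeline stages m={m} outside 0..{limit}")
```

With n = 3 division layers, for instance, there are six candidate positions, so any m from 0 to 6 passes the check and m = 7 is rejected.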

Further, the multiplier nesting level uses the configurable bit widths {2,4,8,12,16,20,24,28,32}, from which a user can configure any combination of multipliers.

Further, step 2 specifically includes: after the top-level parameters of the current-level multiplier are configured, judging whether the current-level multiplier is a minimum-unit multiplier, and if so, directly calling the minimum-unit multiplier; otherwise, splitting the primary multiplier into two groups of secondary multipliers and generating the RTL code, including: the top-level RTL code of the multiplier, the CSA RTL code, the final adder RTL code, the register set RTL code, and the parameter configuration file of the secondary multiplier.
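The recursive structure of step 2 can be checked with a small arithmetic model. The Python sketch below is not the patented generator: it models unsigned operands only, replaces the CSA and final adder with ordinary integer addition, and uses an assumed split rule (the wider operand is split, and the low part takes the largest power of two below the operand width), because the division table itself is not reproduced in this text. The names `split_width` and `nested_multiply` are hypothetical.

```python
from __future__ import annotations


def split_width(width: int) -> tuple[int, int]:
    """Assumed rule: return (high_bits, low_bits), low = largest power of two < width."""
    low = 1
    while low * 2 < width:
        low *= 2
    return width - low, low


def nested_multiply(a: int, b: int, a_width: int, b_width: int,
                    min_width: int = 2) -> int:
    """Unsigned model of the nested multiplier built from min_width-bit units."""
    if a_width <= min_width and b_width <= min_width:
        return a * b  # stands in for the cell-library (minimum-unit) multiplier
    if a_width >= b_width:
        # Split multiplicand A into a high part and a low part.
        hi_w, lo_w = split_width(a_width)
        a_hi, a_lo = a >> lo_w, a & ((1 << lo_w) - 1)
        p_hi = nested_multiply(a_hi, b, hi_w, b_width, min_width)
        p_lo = nested_multiply(a_lo, b, lo_w, b_width, min_width)
    else:
        # Split multiplier B into a high part and a low part.
        hi_w, lo_w = split_width(b_width)
        b_hi, b_lo = b >> lo_w, b & ((1 << lo_w) - 1)
        p_hi = nested_multiply(a, b_hi, a_width, hi_w, min_width)
        p_lo = nested_multiply(a, b_lo, a_width, lo_w, min_width)
    # In hardware the two partial products are combined by the CSA and final adder.
    return (p_hi << lo_w) + p_lo
```

Under this assumed rule every width in {4,8,12,16,20,24,28,32} splits into two widths that are again in the configurable set, for example 12 into 4 + 8 and 28 into 12 + 16, and the model returns the same product as a direct multiplication.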

The beneficial effects of the invention are as follows: the invention makes full use of the multi-level regular structure of the multiplier. The user creates a target folder and configures the required multiplier parameters; the multiplier is then divided downward level by level according to the multiplier nesting levels and the corresponding RTL code is generated; division stops once the divided units are minimum-granularity units, which completes generation of the RTL code of the required multiplier. Variable bit widths of the multiplier and the multiplicand are thereby supported. In addition, the configuration of the number of pipeline stages is taken into account and the multiplier is made configurable, so that the designed multiplier is highly flexible and broadly applicable.

Drawings

FIG. 1 is a general flowchart of the automatic generation method of the variable bit-width multiplier of the present invention.

FIG. 2 is a code generation flowchart.

FIG. 3 is the overall framework of the multiplier implementation.

FIG. 4 is the overall framework of the 8-bit cell library script.

Detailed Description

The invention is described in further detail below with reference to the drawings and specific embodiments. FIG. 1 is the flowchart of the fast multiplier generation proposed in this patent: after the user creates a target folder and configures the required multiplier parameters, the multiplier is divided downward level by level according to the multiplier nesting levels listed in Table 3-4 and the corresponding RTL code is generated; division stops once the divided units are minimum-granularity units, which completes generation of the RTL code of the required multiplier. The specific steps are as follows:

Step 1: the user creates a target folder and configures the parameters of the top-level multiplier.

The top-level configuration file is a text file in which the multiplier parameters entered by the user are separated by spaces so that the script can identify them correctly. It contains nine parameters, whose meanings are shown in Table 1. The top-level parameters are defined by the user; the parameters of all other levels are generated by the software at each level. The file format is identical at every level, and each level automatically generates the configuration file of the next level.

Table 3-6 Configuration file requirements

Step 2: divide downward level by level according to the multiplier nesting level in the parameter configuration file of the current-level multiplier, and generate the corresponding RTL code.

When the minimum multiplier cell library is a 2-bit multiplier, every even bit width up to 32 bits, and any combination of such multipliers, could in principle be implemented. Because some of these multipliers are rarely used, only the configurable bit widths {2,4,8,12,16,20,24,28,32} are adopted, and the user can configure any combination of them. The detailed nested-level division is shown in Table 3-4.

Table 3-4 Nested-level division of multipliers

The split composition based on the 4-bit and 8-bit minimum multiplier cell libraries follows the same splitting principle as the table above and is not described again here.

The code generation method is essentially the same at every level, and the configuration file generated at every level has the same format; the levels differ only in the bit width of the multiplier, the format of the multiplicand and the multiplier, and the top-level module name of the generated code.

At each level, the "code generation method" produces five parts: the top-level RTL code of the multiplier, the CSA RTL code, the final adder RTL code, the register set RTL code, and the parameter configuration file of the next level.

FIG. 2 is the implementation flowchart of the "code generation method" at each level. After the top-level parameters of the current-level multiplier are configured, the method judges whether the current-level multiplier is a minimum-unit multiplier. If so, the minimum-unit multiplier is called directly; otherwise, the current-level multiplier is split into multiplier composition form 1 and multiplier composition form 2, the Verilog top-level file of the current level and the next-level configuration are generated for each, and the RTL descriptions of the adders and registers are generated.
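To make the "next-level configuration" step concrete, the hedged Python sketch below derives two child configuration lines from a parent line in the same nine-field format. Several details are assumptions, not statements from the source: the caller supplies which operand is split and how many low bits are split off (in practice these would come from the division table), the child module names are formed by appending `_form1` and `_form2`, and the child pipeline stage count is passed in by the caller. The sign handling uses the standard two's-complement identity that the high part of a split signed operand keeps its sign while the low part is unsigned.

```python
from __future__ import annotations


def child_config_lines(parent: str, split_operand: str, low_bits: int,
                       child_stages: int) -> list[str]:
    """Return the config lines of the two sub-multipliers (high part, low part)."""
    a_w, b_w, a_s, b_s, reset, target, lib, name, _ = parent.split()
    a_w, b_w = int(a_w), int(b_w)
    if split_operand == "A":
        if not 0 < low_bits < a_w:
            raise ValueError("low_bits must lie strictly inside the width of A")
        # High part of A keeps A's sign flag; low part of A is unsigned.
        parts = [(a_w - low_bits, b_w, a_s, b_s), (low_bits, b_w, "u", b_s)]
    elif split_operand == "B":
        if not 0 < low_bits < b_w:
            raise ValueError("low_bits must lie strictly inside the width of B")
        # High part of B keeps B's sign flag; low part of B is unsigned.
        parts = [(a_w, b_w - low_bits, a_s, b_s), (a_w, low_bits, a_s, "u")]
    else:
        raise ValueError("split_operand must be 'A' or 'B'")
    lines = []
    for index, (wa, wb, sa, sb) in enumerate(parts, start=1):
        child_name = f"{name}_form{index}"  # hypothetical naming scheme
        lines.append(" ".join([str(wa), str(wb), sa, sb, reset,
                               target, lib, child_name, str(child_stages)]))
    return lines
```

Because the children use the same format as the parent, the same routine can be applied again to each child line until the widths reach the minimum-unit multiplier.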

Pipeline configurability: the configurable number m of pipeline stages has the value range 0 ≤ m ≤ 2n, where n is the number of RTL code division layers. For each layer of RTL code, two pipeline insertion positions are available: after the adder with the maximum number of bits, and between the CSA and the adder.

This embodiment makes the multiplier configurable while optimizing its performance, so that the designed multiplier is highly flexible and broadly applicable. It consists of two main aspects: the implementation of the cell library multipliers and the implementation of the script. A wide multiplier can be split into narrower multipliers, which are then called level by level down to the cell library multiplier. The work done by the script is precisely this splitting of multipliers and the calls between them, which produces the synthesizable RTL (register-transfer level) files of the multiplier.

FIG. 3 shows the simplified overall framework. The user first writes the specific configuration of the top-level multiplier to a designated file; the script program then reads the contents of the file and instantiates the relevant RTL files, including the top-level multiplier, the register set, the CSA compressor and the serial carry adder, together with the specific configuration of the next level after splitting.

In this embodiment, the minimum multiplier unit has three implementations, namely 2-bit, 4-bit and 8-bit multipliers, so that the user can choose the minimum multiplier unit according to specific requirements. As mentioned above, the whole method splits multipliers and sets up the calls between them to generate synthesizable RTL files. Taking the 8-bit minimum multiplier cell library as an example, the overall implementation framework for generating a 32×32-bit multiplier is shown in FIG. 4: the method of this embodiment generates the top-level Verilog HDL file of the current level according to the division form of the multiplier, and then calls the next-level multipliers level by level down to the minimum cell library.

Claims

1. An automatic generation method of a variable bit-width multiplier, characterized by comprising the following steps:

Step 1: the user creates a target folder and configures the top-level multiplier parameters; the configuration file of the top-level multiplier parameters is a text file containing nine user-defined multiplier parameters:

parameter $5: rst denotes asynchronous reset; sclr denotes synchronous reset;

parameter $6: the target file path represents a storage path of Verilog code generated by the script;

parameter $8: the character string represents the top module name;

parameter $9: the pipeline stage number represents a pipeline stage number configured by a user;

after the top-level parameters of the current-level multiplier are configured, judging whether the current-level multiplier is a minimum-unit multiplier, and if so, directly calling the minimum-unit multiplier; otherwise, splitting the primary multiplier into two groups of secondary multipliers and generating the RTL code, including: the top-level RTL code of the primary multiplier, the CSA RTL code, the final adder RTL code, the register set RTL code, and the parameter configuration file of the secondary multiplier;

2. The automatic generation method of a variable bit-width multiplier of claim 1, wherein the configurable number m of pipeline stages has the value range 0 ≤ m ≤ 2n, where n is the number of RTL code division layers; for each layer of RTL code, two pipeline insertion positions are available: after the adder with the maximum number of bits, and between the CSA and the adder.

3. The automatic generation method of a variable bit-width multiplier of claim 1, wherein the multiplier nesting level uses the configurable bit widths {2,4,8,12,16,20,24,28,32}, from which a user can configure any combination of multipliers.