CN116933840A

CN116933840A - Multi-precision Posit encoding and decoding operation device and method supporting variable exponent bit width

Info

Publication number: CN116933840A
Application number: CN202310971673.0A
Authority: CN
Inventors: 王中风; 李琼; 方超
Original assignee: Nanjing University
Current assignee: Nanjing University
Priority date: 2023-08-03
Filing date: 2023-08-03
Publication date: 2023-10-24

Abstract

The invention provides a multi-precision Posit encoding and decoding operation device and method supporting variable exponent bit width. The device comprises a multi-precision Posit decoder, a multi-precision Posit operation unit and a multi-precision Posit encoder. The multi-precision Posit decoder receives Posit input data, a precision mode control signal and an exponent bit width es configuration signal, and performs the decoding operation to output the effective sign, exponent and mantissa values. The multi-precision Posit operation unit performs the corresponding operation and sends the result to the multi-precision Posit encoder, which encodes the Posit output data. The invention makes the exponent bit width configurable at run time, so that the same hardware can offer both the large dynamic range and the high numerical precision of the Posit format, and it implements hardware-efficient multi-precision Posit encoding and decoding.

Description

Multi-precision Posit encoding and decoding operation device and method supporting variable exponent bit width

Technical Field

The invention relates to a multi-precision Posit encoding and decoding operation device and method supporting variable exponent bit width.

Background

Since it was proposed, the Posit format (reference: Gustafson J L, Yonemoto I T. Beating floating point at its own game: Posit arithmetic [J]. Supercomputing Frontiers and Innovations, 2017, 4(2): 71-86.) has attracted wide academic and industrial interest and has shown potential advantages in some neural network applications: replacing a high-precision conventional floating-point format with a low-precision Posit format reduces computational complexity and storage requirements while keeping model accuracy unchanged. This benefits from the balance the Posit format strikes between dynamic range and numerical precision, and a dynamically configurable exponent bit width es makes this balance more flexible (the larger es is, the more the format favors a large dynamic range; the smaller es is, the more it favors high numerical precision), which greatly improves the flexibility of Posit operations.
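The trade-off controlled by es can be illustrated numerically. The following is a minimal sketch based on the standard Posit definitions (useed = 2^(2^es), maxpos = useed^(n-2), minpos = 1/maxpos); the function names posit_range and max_fraction_bits are hypothetical and are not part of the invention.

```python
from fractions import Fraction


def posit_range(n: int, es: int):
    """Return (minpos, maxpos) of an n-bit Posit with es exponent bits."""
    useed = 2 ** (2 ** es)        # scale contributed by one regime step
    maxpos = useed ** (n - 2)     # largest representable magnitude
    return Fraction(1, maxpos), maxpos


def max_fraction_bits(n: int, es: int) -> int:
    """Fraction bits left when the regime is shortest (2 bits) after the sign bit."""
    return max(n - 3 - es, 0)


# For 8-bit Posits, a larger es widens the range and shortens the fraction.
for es in (0, 1, 2):
    minpos, maxpos = posit_range(8, es)
    print(f"es={es}: maxpos=2^{maxpos.bit_length() - 1}, "
          f"fraction bits (at most)={max_fraction_bits(8, es)}")
```

With n = 8, going from es = 0 to es = 2 raises maxpos from 2^6 to 2^24 while the longest possible fraction shrinks from 5 bits to 3 bits, which is the balance described above.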

On the other hand, different network layers and operations in a neural network application often require different computational precision. Flexible precision configuration at run time can greatly improve computational efficiency and reduce energy consumption, and low-precision operation also reduces the parameter storage requirement of the network. However, because of its unique regime field, the Posit format has a more complex encoding and decoding process than the conventional floating-point format. If separate Posit encoding and decoding units were implemented in hardware for each precision, the hardware cost, and hence the cost of Posit operations, would increase greatly. A hardware-efficient multi-precision Posit encoding and decoding unit therefore better meets the needs of neural network computation based on the Posit format.

The documents "Wang Zhongfeng, Xu Mingyang, Fang Chao, et al. Floating-point number multiplication circuit based on posit data format [P]. Jiangsu Province: CN111290732B, 2023-03-14" and "Liang Feng, Wu, Zhang Guohe, et al.: CN111538472B, 2022-11-04" propose arithmetic circuits for multiplication, addition and other operations based on the Posit format. All of them involve encoding and decoding Posit data, but they support only single-precision Posit encoding and decoding with a fixed exponent bit width. The paper "Zhang H, Ko S B. Efficient multiple-precision posit multiplier [C]//2021 IEEE International Symposium on Circuits and Systems (ISCAS). IEEE, 2021: 1-5" presents a multiplier that supports multi-precision Posit operations, but the exponent bit width is still fixed at each precision. The Posit multiply-accumulate unit presented in the paper "Neves N, Tomás P, Roma N. Dynamic fused multiply-accumulate posit unit with variable exponent size for low-precision DSP applications [C]//2020 IEEE Workshop on Signal Processing Systems (SiPS). IEEE, 2020: 1-6" supports variable exponent bit widths, but does not provide an efficient multi-precision solution.

Disclosure of Invention

Purpose of the invention: the technical problem to be solved by the invention is to provide, in view of the shortcomings of the prior art, a multi-precision Posit encoding and decoding operation device and method supporting variable exponent bit width. The device comprises a multi-precision Posit decoder, a multi-precision Posit operation unit and a multi-precision Posit encoder;

the multi-precision Posit decoder receives Posit input data, a precision mode control signal and an exponent bit width es configuration signal, and performs the decoding operation to output the effective sign, exponent and mantissa values;

the multi-precision Posit operation unit performs the corresponding operation on the effective sign, exponent and mantissa values obtained, and sends the operation result to the multi-precision Posit encoder, which encodes the Posit output data according to the precision mode control signal and the exponent bit width es configuration signal.

The multi-precision Posit decoder comprises a multi-precision two's complement module, a multi-precision leading 0/1 counting module, a multi-precision regime shift module and a multi-precision mask exponent mantissa calculating module;

the multi-precision two's complement module uses a segmentation method to implement the multi-precision two's complement operation, specifically: the input operand is divided into N sub-operands, sub-operand 0 to sub-operand N-1, according to the minimum supported precision mode (N is the bit width of the input operand divided by the bit width of the minimum supported precision mode; for example, if the input is 32 bits wide and the minimum supported precision mode is 8 bits wide, then N = 32/8 = 4); the effective sign of each sub-operand is determined according to the precision mode control signal, and each sub-operand is XORed with its effective sign, so that all bits of a negative number are inverted and all bits of a positive number are left unchanged; whether 1 must be added to the XOR result is determined from the precision mode control signal, the effective sign of the sub-operand and the carry generated by the lower-order bits; finally, the N sub-results are concatenated and, according to the precision mode control signal, the sign bit positions are ORed with the effective sign values to obtain the result of the multi-precision two's complement operation.

The multi-precision leading 0/1 counting module uses a segmented counting method, in which the counting results of a low-precision mode are processed by adders and selectors to obtain the counting results of a higher-precision mode, specifically: the operand to be processed is divided into N sub-operands, sub-operand 0 to sub-operand N-1, according to the minimum supported precision (N is the bit width of the operand divided by the bit width of the minimum supported precision mode), and N parallel leading 0/1 counting modules compute, for each sub-operand, its number of leading 0s or leading 1s, cpm[0] to cpm[N-1], and its data valid signal, vpm[0] to vpm[N-1]; that is, cpm[i] is the leading 0/1 count of sub-operand i and vpm[i] is the data valid signal of sub-operand i; when vpm[i] is 1, cpm[i] is the exact number of leading 0s or leading 1s; when vpm[i] is 0, the sub-operand is all 0s or all 1s, and cpm[i] is 0;

When the precision mode control signal selects the lowest supported precision, cpm[0] to cpm[N-1] are the required leading 0/1 counting results; otherwise, the counting results of every two adjacent sub-operands are combined stage by stage to obtain the counting results of the next higher precision mode.

The multi-precision regime shift module uses segmentation and hierarchical shifting to support multi-precision shifts, specifically: the operand is divided into N sub-operands, sub-operand 0 to sub-operand N-1, according to the lowest supported precision mode (N is the operand bit width divided by the bit width of the lowest supported precision mode), and the left-shift amount of each sub-operand is determined according to the precision mode control signal; in the lowest supported precision mode, the maximum shift amount of each sub-operand is the bit width of the sub-operand and is stored in L bits, where L is log2 of the sub-operand bit width; in the highest supported precision mode, the maximum shift amount is the total bit width of the operand and is stored in K bits, where K is log2 of the total operand bit width; segmented, hierarchical shifting is then performed: the N sub-operands are fed in parallel into their respective shifters to complete shift stages 1 to L; in the lowest supported precision mode, sub-results 0 to N-1 output by the N shifters are the final shift result, and the high-order bits that overflow on the left are discarded; if the precision mode control signal selects a higher precision mode, the left-shifted result of each higher-order shifter is bitwise ORed with the bits that overflow from the adjacent lower-order shifter, so that the shift is continuous across sub-operands in the higher-precision mode.

In the multi-precision mask exponent mantissa calculating module, the masks of the exponent segment and the mantissa segment are determined according to the input exponent bit width es configuration signal and the precision mode control signal; meanwhile, the operand output by the multi-precision regime shift module, which contains only the exponent segment and the mantissa segment, is shifted left by es bits so that the value of the exponent segment is right-aligned and the value of the mantissa segment is left-aligned; the shifted value is bitwise ANDed with the exponent segment mask and with the mantissa segment mask to obtain the exponent segment and the mantissa segment respectively;

for each bit of the operand, the precision mode control signal selects whether the corresponding mask bit is taken from the exponent segment mask or is zero, which determines an exponent segment mask of the same length as the operand; inverting every bit of the exponent segment mask gives the mantissa segment mask.
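The mask construction can be sketched in software as follows. This is only an illustration under stated assumptions: the helper names are hypothetical, each precision lane is assumed to place its exponent mask in its low es bits, and the "shift left by es bits" is modelled as a rotation within one lane so that the exponent bits land right-aligned while the mantissa stays left-aligned. The text does not state how the hardware keeps the shifted-out exponent bits, so the rotation is an assumption.

```python
def exponent_mask(total_bits: int, lane_bits: int, es: int) -> int:
    """Exponent-segment mask: the low es bits of every precision lane are 1."""
    lane_mask = (1 << es) - 1
    mask = 0
    for base in range(0, total_bits, lane_bits):
        mask |= lane_mask << base
    return mask


def mantissa_mask(total_bits: int, lane_bits: int, es: int) -> int:
    """Mantissa-segment mask: bitwise inverse of the exponent-segment mask."""
    return ~exponent_mask(total_bits, lane_bits, es) & ((1 << total_bits) - 1)


def split_exponent_mantissa(lane: int, lane_bits: int, es: int):
    """Split one lane holding [exponent | mantissa] left-aligned.

    Assumption: the left shift by es is a rotation inside the lane.
    """
    full = (1 << lane_bits) - 1
    rotated = ((lane << es) | (lane >> (lane_bits - es))) & full
    e_mask = (1 << es) - 1
    exponent = rotated & e_mask              # right-aligned exponent value
    mantissa = rotated & (~e_mask & full)    # left-aligned mantissa bits
    return exponent, mantissa


# Two 8-bit lanes in a 16-bit word, es = 2.
print(hex(exponent_mask(16, 8, 2)), hex(mantissa_mask(16, 8, 2)))
print(split_exponent_mantissa(0b10110000, 8, 2))
```

Because the masks depend only on es and the precision mode, the same AND logic serves every configuration; only the mask values change.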

The multi-precision Posit operation unit generally comprises a multi-precision addition and subtraction unit, a multi-precision multiplication and addition unit, a multi-precision division unit and a multi-precision square root unit. These units receive the output of the multi-precision Posit decoder, namely the effective sign, exponent and mantissa values, and perform the corresponding operation to obtain the effective sign, exponent and mantissa values of the operation result; the user can add custom operation units such as multi-precision addition, subtraction, multiplication and division.

The multi-precision Posit encoder comprises a regime and exponent segment separation module, a multi-precision combined exponent and mantissa segment module, a multi-precision combined regime, exponent and mantissa segment module and a multi-precision two's complement module; the two multi-precision two's complement modules in the invention implement the same function, one located at the input side of the multi-precision Posit decoder and the other at the output side of the multi-precision Posit encoder.

In the regime and exponent segment separation module, for the effective exponent input in each precision mode, the value of the low es bits is the value of the exponent segment and the value of the remaining high bits is the exponent scale factor k represented by the regime segment; an exponent segment mask of the same bit width is generated according to the precision mode control signal and the exponent bit width es configuration signal, and the value of the exponent segment is obtained by bitwise ANDing the mask with the effective exponent input; the exponent segment mask is then inverted bit by bit and bitwise ANDed with the effective exponent input again, and the result is shifted right by es bits to obtain the k value; furthermore, the initial value of the regime segment and the actual bit width of the regime segment are determined from the k value.
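The separation described above can be sketched for a single effective exponent as follows. The function name is hypothetical, and the regime pattern and width follow the standard Posit regime encoding (k >= 0: k+1 ones then a 0; k < 0: -k zeros then a 1), which is assumed to be what the initial value of the regime segment (called the region segment in this text) refers to.

```python
def separate_regime_exponent(effective_exponent: int, es: int):
    """Split a signed effective exponent into (k, e, regime_pattern, regime_width)."""
    e_mask = (1 << es) - 1
    e = effective_exponent & e_mask                 # low es bits, unsigned
    k = (effective_exponent & ~e_mask) >> es        # remaining high bits, signed
    if k >= 0:
        regime_width = k + 2                        # k+1 ones followed by a 0
        regime_pattern = ((1 << (k + 1)) - 1) << 1  # 1...10
    else:
        regime_width = 1 - k                        # -k zeros followed by a 1
        regime_pattern = 1                          # 0...01
    return k, e, regime_pattern, regime_width


print(separate_regime_exponent(5, 2))    # exponent 5 = 1 * 4 + 1
print(separate_regime_exponent(-3, 2))   # exponent -3 = -1 * 4 + 1
```

Python integers behave as sign-extended two's complement values under & and >>, so the right shift here is arithmetic, as a signed k requires.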

In the multi-precision combined exponent and mantissa segment module, the exponent segment and the mantissa segment are combined, and the redundant high-order bits of the exponent segment are shifted out by a multi-precision shift.

In the multi-precision combined regime, exponent and mantissa segment module, the initial value of the regime segment is further combined with the result output by the multi-precision combined exponent and mantissa segment module, the redundant high-order bits of the regime segment are shifted out by a multi-precision shift, and the excess bits are rounded according to the precision mode control signal to obtain the absolute value of the result;

in the multi-precision two's complement module, the two's complement operation under multiple precisions is performed by the segmentation method according to the effective sign input, giving the output of the multi-precision Posit encoder.

The invention also provides a multi-precision Posit coding and decoding operation method supporting variable exponent bit width, and the following calculation process is completed through a multi-precision Posit decoder:

step a1, the multi-precision two's complement operation on the input Posit data is performed in the multi-precision two's complement module according to the precision mode control signal and the effective sign bits;

step a2, the operand after the two's complement operation is sent to the multi-precision leading 0/1 counting module, which counts the leading 0s or leading 1s of the operand in the corresponding precision mode, thereby determining the actual bit width of the regime segment;

step a3, according to the actual bit width of the regime segment, the regime field of the multi-precision operand is shifted out to the left in the multi-precision regime shift module and the low-order bits are zero-filled, giving an operand that contains only the exponent segment and the mantissa segment;

step a4, in the multi-precision mask exponent mantissa calculating module, the masks of the exponent segment and the mantissa segment are determined according to the input exponent bit width es configuration signal and the precision mode control signal, and are bitwise ANDed with the operand obtained in step a3 to obtain the values of the exponent segment and the mantissa segment respectively;

step a5, the effective sign, effective exponent and effective mantissa values of the input operand are determined from the outputs of step a2 and step a4, wherein the effective sign is the highest bit of the corresponding operand, as determined by the precision mode control signal; the effective exponent is calculated as follows: the exponent scale factor k represented by the regime segment is first determined from the actual bit width of the regime segment found in step a2, and the k value is then shifted left by es bits and bitwise ORed with the exponent segment value determined in step a4 to obtain the effective exponent value; the effective mantissa is calculated as follows: on the basis of the mantissa segment value determined in step a4, the hidden bit is added at the corresponding bit position according to the precision mode control signal to obtain the effective mantissa;
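For a single precision lane, steps a1 to a5 can be sketched in software as follows. This is a behavioural illustration only: the function names are hypothetical, zero and NaR inputs are assumed to be handled elsewhere, the sign bit is dropped after the two's complement step rather than kept in place, and the regime (region) segment is located with a simple loop instead of the segmented hardware modules.

```python
def decode_posit(p: int, n: int, es: int):
    """Return (sign, effective_exponent, mantissa, frac_bits) for a nonzero, non-NaR Posit."""
    word_mask = (1 << n) - 1
    body_bits = n - 1
    body_mask = (1 << body_bits) - 1

    # a1: two's complement of negative inputs, then drop the sign bit.
    sign = (p >> (n - 1)) & 1
    magnitude = ((-p) & word_mask) if sign else p
    body = magnitude & body_mask

    # a2: count the leading run of identical bits (the regime).
    first = (body >> (body_bits - 1)) & 1
    run = 0
    for pos in range(body_bits - 1, -1, -1):
        if (body >> pos) & 1 != first:
            break
        run += 1
    k = run - 1 if first else -run

    # a3: shift out the regime and its terminating bit, zero-filling the low bits.
    regime_width = min(run + 1, body_bits)
    rest = (body << regime_width) & body_mask

    # a4: the top es bits are the exponent segment, the rest is the mantissa segment.
    frac_bits = body_bits - es
    exponent_field = rest >> frac_bits
    fraction = rest & ((1 << frac_bits) - 1)

    # a5: effective exponent = (k << es) | exponent segment; add the hidden bit.
    effective_exponent = (k << es) | exponent_field
    mantissa = (1 << frac_bits) | fraction
    return sign, effective_exponent, mantissa, frac_bits


def to_float(decoded) -> float:
    """Interpret a decoded tuple as a Python float, for checking only."""
    sign, exponent, mantissa, frac_bits = decoded
    return (-1.0) ** sign * mantissa * 2.0 ** (exponent - frac_bits)


print(to_float(decode_posit(0b01011000, 8, 1)))
print(to_float(decode_posit(0b00010000, 8, 1)))
```

Changing es changes only where the exponent segment ends, which is why a run-time es configuration can be supported by masks and shifts rather than by separate decoders.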

the following calculation process is completed through the multi-precision Posit encoder:

step b1, for the effective exponent input, the exponent scale factor k and the exponent segment value in each precision mode are separated in the regime and exponent segment separation module: for each effective exponent, the value of its low es bits is the encoded value of the Posit exponent segment and is an unsigned number; the value of the remaining high bits is the exponent scale factor k encoded by the Posit regime segment and is a signed number; meanwhile, the initial value and the actual bit width of the regime segment are determined from the k value;

step b2, according to the precision mode control signal, the separated exponent segment value is combined with the effective mantissa input without the hidden bit in a multi-precision combination region, exponent and mantissa section module, and the redundant 0s at the high end of the exponent segment in the combination are shifted out to the left by a multi-precision shift, so that only the required es bits are kept; the shifted whole is denoted X;

step b3, in the multi-precision combined regime, exponent and mantissa segment module, the initial value of the regime segment determined in step b1 is further combined with the whole X obtained in step b2, the redundant bits of the regime segment are shifted out to the left by a multi-precision shift according to the actual bit width of the regime segment, and the excess bits are rounded according to the precision requirement to obtain the absolute value of the output result;

step b4, in the multi-precision two's complement module, the multi-precision two's complement of the absolute value obtained in step b3 is taken according to the effective sign input to obtain the Posit output result, namely the output of the multi-precision Posit encoder.
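Steps b1 to b4 can be sketched for a single precision lane as follows. This is an illustration under assumptions, not the circuit of the invention: the function name is hypothetical, the rounding is assumed to be round-to-nearest-even on the encoded bit string with clamping to the smallest and largest magnitudes (the text only states that the excess bits are rounded), and the fields are concatenated with ordinary integer shifts instead of the multi-precision shifters.

```python
def encode_posit(sign: int, effective_exponent: int, fraction: int,
                 frac_bits: int, n: int, es: int) -> int:
    """Encode (sign, exponent, fraction without hidden bit) as an n-bit Posit."""
    body_bits = n - 1

    # b1: separate the exponent segment e and the scale factor k, then build the regime.
    e = effective_exponent & ((1 << es) - 1)
    k = effective_exponent >> es
    if k >= 0:
        regime, regime_width = ((1 << (k + 1)) - 1) << 1, k + 2
    else:
        regime, regime_width = 1, 1 - k

    # b2 and b3: concatenate regime | exponent | fraction.
    bits = (((regime << es) | e) << frac_bits) | fraction
    length = regime_width + es + frac_bits

    if length > body_bits:
        # Round the bits that do not fit (assumed round-to-nearest-even).
        drop = length - body_bits
        body = bits >> drop
        remainder = bits & ((1 << drop) - 1)
        half = 1 << (drop - 1)
        if remainder > half or (remainder == half and body & 1):
            body += 1
    else:
        body = bits << (body_bits - length)

    # Clamp so that rounding never produces zero or the NaR pattern.
    body = max(1, min(body, (1 << body_bits) - 1))

    # b4: two's complement for negative results.
    return (-body) & ((1 << n) - 1) if sign else body


print(hex(encode_posit(0, 1, 0b1000, 4, 8, 1)))   # 3.0 with n = 8, es = 1
print(hex(encode_posit(1, 1, 0b1000, 4, 8, 1)))   # -3.0
```

The first call encodes 1.5 x 2^1 and the second its negation; the two results are two's complements of each other, which matches the role of step b4.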

The beneficial effects are that: the invention not only realizes the wide dynamic configurable index digit in operation, thereby simultaneously supporting the advantages of large dynamic range and high numerical precision of the Posit format in the same hardware, but also realizes the high-efficiency multi-precision Posit encoding and decoding operation of the hardware.

Drawings

The foregoing and/or other advantages of the invention will become more apparent from the following detailed description of the invention when taken in conjunction with the accompanying drawings and detailed description.

FIG. 1 is a schematic diagram of a multi-precision Posit decoder and encoder operation method.

Fig. 2 is a schematic diagram of a multi-precision two's complement arithmetic device.

FIG. 3 is a schematic diagram of a multi-precision leading 0/1 count operation device.

Fig. 4 is a schematic diagram of a multi-precision left-shift computing device.

FIG. 5 is a schematic diagram of an apparatus for supporting variable exponent bit width operations in a Posit decoder.

Fig. 6 is a diagram of an exponent segment mask and mantissa segment mask with two sub-operands.

Detailed Description

The invention provides a multi-precision Posit encoding and decoding operation device and method supporting variable exponent bit width. As shown in FIG. 1, the device comprises a multi-precision Posit decoder, a multi-precision Posit operation unit and a multi-precision Posit encoder. The multi-precision Posit decoder receives Posit input data, a precision mode control signal and an exponent bit width es configuration signal, and completes the decoding operation in its internal sub-modules to output the effective sign, exponent and mantissa values. After decoding, the user can add custom operation units, such as Posit multi-precision addition, subtraction, multiplication and division, which perform the corresponding operation on the decoded effective sign, exponent and mantissa values. For example, for Posit multiplication the operation unit must XOR the sign bits, add the exponent values and multiply the mantissa values. Finally, the effective sign, exponent and mantissa values of the operation result are sent to the multi-precision Posit encoder, which encodes the Posit output data according to the precision mode control signal and the exponent bit width es configuration signal.
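The multiplication example can be sketched on decoded values as follows. The function name and the tuple layout (sign, exponent, mantissa with hidden bit) are assumptions made for illustration; the single normalization step is added here because the product of two mantissas in [1, 2) lies in [1, 4).

```python
def multiply_decoded(a, b, frac_bits: int):
    """Multiply two decoded Posits given as (sign, exponent, mantissa_with_hidden_bit).

    Both mantissas carry frac_bits fractional bits. Returns
    (sign, exponent, mantissa, out_frac_bits) with the mantissa value in [1, 2).
    """
    sign_a, exp_a, mant_a = a
    sign_b, exp_b, mant_b = b
    sign = sign_a ^ sign_b            # XOR of the sign bits
    exponent = exp_a + exp_b          # addition of the exponent values
    product = mant_a * mant_b         # multiplication of the mantissa values
    out_frac_bits = 2 * frac_bits
    if product >> (out_frac_bits + 1):
        # Product is in [2, 4): renormalize by moving the binary point.
        exponent += 1
        out_frac_bits += 1
    return sign, exponent, product, out_frac_bits


# 1.5 * 1.5 with 4 fractional bits: mantissa 0b11000 = 24.
print(multiply_decoded((0, 0, 24), (0, 0, 24), 4))
```

The returned tuple is the kind of effective sign, exponent and mantissa result that is then passed to the encoder, where the excess mantissa bits are rounded.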

Specifically, in the multi-precision Posit decoder, the main calculation flow is as follows:

step a2, the operand after the two's complement operation is sent to the multi-precision leading 0/1 counting module, which counts the leading 0s or leading 1s of the operand (that is, the number of consecutive 0s or consecutive 1s starting from the high-order end) in the corresponding precision mode, thereby determining the actual bit width of the regime segment;

step a5, the effective sign, effective exponent and effective mantissa values of the input operand are determined from the outputs of step a2 and step a4. The effective sign is the highest bit of the corresponding operand, as determined by the precision mode control signal. For the effective exponent: the exponent scale factor k represented by the regime segment is first determined from the actual bit width of the regime segment found in step a2; the k value is then shifted left by es bits and bitwise ORed with the exponent segment value determined in step a4 to obtain the effective exponent value. For the effective mantissa: on the basis of the mantissa segment value determined in step a4, the hidden bit is added at the corresponding bit position according to the precision mode control signal to obtain the effective mantissa.

In a multi-precision Posit encoder, the main calculation flow is as follows:

step b1, for the effective exponent input, the exponent scale factor k and the exponent segment value in each precision mode are separated in the regime and exponent segment separation module: for each effective exponent, the value of its low es bits is the encoded value of the Posit exponent segment and is an unsigned number; the value of the remaining high bits is the exponent scale factor k encoded by the Posit regime segment and is a signed number; meanwhile, the initial value (000…001 or 111…110) and the actual bit width of the regime segment are determined from the k value;

To implement multi-precision Posit decoding and encoding, one option is to implement a separate single-precision Posit decoder and encoder in hardware for each precision and to enable the corresponding unit at run time according to the precision mode control signal. However, implementing encoding and decoding units for every precision greatly increases the hardware cost. The present scheme therefore makes full use of segmentation and hardware reuse, so that multi-precision Posit encoding and decoding can be supported effectively with little additional hardware cost. Specifically, operations such as multi-precision two's complement, multi-precision leading 0/1 counting and multi-precision shifting are present in both the Posit decoder and the encoder. In addition, support for the variable exponent bit width is implemented by means such as masking and shifting.

Multi-precision two's complement:

Two's complement here means: for a negative number, the sign bit is kept unchanged and all other bits are inverted and then incremented by 1; for a positive number, all bits remain unchanged. In the multi-precision Posit decoder, the input operand must first be two's complemented so that negative numbers are decoded correctly. In the Posit encoder, the absolute-value result obtained is likewise two's complemented to obtain the Posit output.

To implement the multi-precision two's complement operation with low area overhead, as shown in Fig. 2, the scheme uses a segmentation method, which keeps the delay low while reducing area. Specifically, the input operand is divided into N sub-operands, sub-operand 0 to sub-operand N-1, according to the minimum supported precision mode (N is the bit width of the input operand divided by the bit width of the minimum supported precision mode). The effective sign of each sub-operand is determined according to the precision mode control signal, and each sub-operand is XORed with its effective sign, so that all bits of a negative number are inverted and all bits of a positive number are left unchanged. In addition, whether 1 must be added to the XOR result is determined from the precision mode control signal, the effective sign of the sub-operand and the carry generated by the lower-order bits. Finally, the N sub-results are concatenated and, according to the precision mode control signal, the sign bit positions are ORed with the effective sign values to obtain the result of the multi-precision two's complement operation.

With this approach, the two's complement operation can be supported under any combination of precisions.
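The segmented two's complement can be modelled in software as follows. This is a behavioural sketch with hypothetical names: lane_bits stands in for the precision mode control signal, seg_bits is the minimum supported precision, and the carry between segments is modelled with integer addition rather than with the actual adder structure of Fig. 2.

```python
def multi_precision_twos_complement(x: int, total_bits: int,
                                    seg_bits: int, lane_bits: int) -> int:
    """Two's-complement every negative lane of x, keeping each lane's sign bit."""
    seg_mask = (1 << seg_bits) - 1
    segs_per_lane = lane_bits // seg_bits
    result = 0
    carry = 0
    for i in range(total_bits // seg_bits):
        lane_start = (i // segs_per_lane) * segs_per_lane  # lowest segment of this lane
        sign = (x >> (lane_start * seg_bits + lane_bits - 1)) & 1
        seg = (x >> (i * seg_bits)) & seg_mask
        inverted = seg ^ (seg_mask if sign else 0)   # XOR with the effective sign
        # The lowest segment of a lane adds 1 when negative; the others add the carry.
        add_in = sign if i == lane_start else carry
        total = inverted + add_in
        carry = total >> seg_bits
        result |= (total & seg_mask) << (i * seg_bits)
        if i == lane_start + segs_per_lane - 1:
            # OR the effective sign back into the sign position of the lane.
            result |= sign << (i * seg_bits + seg_bits - 1)
    return result


x = 0xFF80017F
for lane in (8, 16, 32):
    print(lane, hex(multi_precision_twos_complement(x, 32, 8, lane)))
```

The same four 8-bit segments are reused in every mode; only the choice of effective sign and the point where the carry chain is cut depend on the precision mode.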

Multi-precision leading 0/1 count:

in the Posit decoder, the actual bit width of the regime field is determined by counting leading 0s or leading 1s, so that the exponent scale factor k represented by the regime field and the values of the exponent segment and mantissa segment that follow it can be determined.

To support leading 0/1 counting under multiple precisions, the scheme again uses a segmented counting method, which avoids the extra hardware of counting separately for each precision: the counting results of a low-precision mode are processed by adders and selectors to obtain the counting results of a higher-precision mode, as shown in FIG. 3.

Specifically, the operand to be processed is divided into N sub-operands, sub-operand 0 to sub-operand N-1, according to the minimum supported precision (N is the bit width of the operand divided by the bit width of the minimum supported precision mode). N parallel leading 0/1 counting modules compute the number of leading 0s or leading 1s (cpm[0] to cpm[N-1]) and the data valid signal (vpm[0] to vpm[N-1]) of each sub-operand. For each sub-operand, when the data valid signal vpm is 1, cpm is the exact number of leading 0s or leading 1s; when vpm is 0, the sub-operand is all 0s or all 1s, and cpm is correspondingly 0.

When the precision mode control signal selects the lowest supported precision (precision mode 1), cpm[0] to cpm[N-1] are the required counting results; otherwise, the counting results of every two adjacent sub-operands can be processed to obtain the counting result of the next precision mode, as follows. The leading 0/1 counts and data valid signals of two low-precision sub-operands are known: sub-operand 0 has leading 0/1 count cpm[0] and data valid signal vpm[0]; sub-operand 1 has leading 0/1 count cpm[1] and data valid signal vpm[1]. The leading 0/1 count cph[0] and the data valid signal vph[0] of the corresponding operand in the next higher precision mode are obtained from cpm[0], cpm[1], vpm[0] and vpm[1] by selection and addition logic: when vpm[1] is 1, the count of sub-operand 1 is valid, so cph[0] equals cpm[1] and vph[0] equals 1; when vpm[1] is 0 and the highest bits of sub-operand 1 and sub-operand 0 differ (one sub-operand counts leading 0s and the other counts leading 1s), cph[0] equals the bit width of the sub-operand and vph[0] equals 1; when vpm[1] is 0 and sub-operand 1 and sub-operand 0 both count leading 0s or both count leading 1s, cph[0] equals the bit width of sub-operand 1 plus cpm[0], and vph[0] equals vpm[0].

By combining the leading 0/1 counting results and data valid signals of a low-precision mode stage by stage in this way, the leading 0/1 counting results and data valid signals of each higher precision mode are obtained, up to precision mode M (the highest supported precision mode). With this approach, leading 0/1 counting is supported under any combination of precisions; multi-precision support is obtained by adding only a few selectors and adders, the leading 0/1 counting process is not repeated, and higher hardware efficiency is achieved.
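The stage-by-stage merge can be modelled in software as follows. This is a behavioural sketch with hypothetical names: lane_bits stands in for the precision mode control signal, and each partial result carries the most significant bit of its segment so that the comparison of highest bits described above can be made. When a merged result is not valid, its count is not meaningful and should be ignored.

```python
def leading_count(seg: int, width: int):
    """Return (count, valid, msb): length of the leading run equal to the MSB.

    valid is 0 (and count is 0) when the segment is all 0s or all 1s.
    """
    msb = (seg >> (width - 1)) & 1
    count = 0
    for pos in range(width - 1, -1, -1):
        if (seg >> pos) & 1 != msb:
            return count, 1, msb
        count += 1
    return 0, 0, msb


def merge(high, low, width: int):
    """Combine the results of two adjacent segments of the given width."""
    c_hi, v_hi, m_hi = high
    c_lo, v_lo, m_lo = low
    if v_hi:                       # the run ends inside the high segment
        return c_hi, 1, m_hi
    if m_hi != m_lo:               # the run ends exactly at the segment boundary
        return width, 1, m_hi
    return width + c_lo, v_lo, m_hi   # the run continues into the low segment


def multi_precision_leading_count(x: int, total_bits: int,
                                  seg_bits: int, lane_bits: int):
    """Return [(count, valid)] per lane, lane 0 being the least significant."""
    mask = (1 << seg_bits) - 1
    results = [leading_count((x >> (i * seg_bits)) & mask, seg_bits)
               for i in range(total_bits // seg_bits)]
    width = seg_bits
    while width < lane_bits:       # one merge stage per doubling of the precision
        results = [merge(results[i + 1], results[i], width)
                   for i in range(0, len(results), 2)]
        width *= 2
    return [(count, valid) for count, valid, _ in results]


print(multi_precision_leading_count(0x0000F0FF, 32, 8, 8))
print(multi_precision_leading_count(0x0000F0FF, 32, 8, 32))
```

The per-segment counters run once; each higher precision only adds one selection and one addition per pair of results.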

Multi-precision shifting:

in the Posit decoder, after the actual bit width of the region segment is determined by counting the leading 0/1 counter, the operand is required to be shifted left in a multi-precision region shift module to shift out the sign segment and the region segment, so that the values of the exponent segment and the mantissa segment after the sign segment and the mantissa segment are determined conveniently. In the Posit encoder, in a multi-precision combined exponent and mantissa segment module and a multi-precision combined region, exponent and mantissa segment module, redundant bits of the exponent segment and redundant displacement bits of the region segment can be overflowed respectively through multi-precision left shifting, so that correct encoding of the Posit format is realized.

Since dynamic shifting is involved, if the respective shifting is implemented for each precision mode to be supported, a large area overhead will be generated. Thus, as shown in FIG. 4, the present scheme employs segmentation and hierarchical shifting to fully multiplex hardware, enabling efficient support for multi-precision shifting with only partial selection and control logic added.

Specifically, the operand is first divided into N sub-operands, sub-operand 0 to sub-operand N-1, according to the lowest supported precision mode (where N is the operand bit width divided by the bit width of the lowest supported precision mode), and the left shift amount of each sub-operand is determined according to the precision mode control signal. In the lowest supported precision mode (precision mode 1), the maximum shift amount of each sub-operand is the bit width of the sub-operand and can be stored in L bits (where L is log2 of the sub-operand bit width); in the highest supported precision mode (precision mode M), the maximum shift amount is the total bit width of the operand and can be stored in K bits (where K is log2 of the total operand bit width). Segmented, hierarchical shifting is then performed: the N sub-operands are fed in parallel into their respective shifters (which may be implemented, for example, as barrel shifters) to complete shift stages 1 to L. In precision mode 1, the sub-results 0 to N-1 output by the N shifters are the final shift result, and the high-order bits that overflow during the left shift are simply discarded. If the precision mode control signal selects a higher precision mode, the left-shifted result of the higher-order shifter is bitwise ORed with the bits that overflow from the lower-order shifter during its left shift, so that the shift is continuous across sub-operands in the high precision mode.

Following this design, hardware-efficient shifting at any of the supported precisions requires only a small amount of additional selection and bitwise OR logic.
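A minimal software sketch of segmented shifting, assuming a 16-bit operand split into two 8-bit sub-operands, is given below. The names `seg_shl`, `shl_2x8` and `shl_1x16` are hypothetical. The final whole-segment move in the 16-bit mode is an assumption made here to cover shift amounts of 8 or more; the text only details the per-segment stages and the OR of the overflow bits.

```python
SEG = 8       # sub-operand bit width (lowest supported precision)
MASK = 0xFF   # mask for one sub-operand


def seg_shl(sub, s):
    """Shift one sub-operand left by s; return (kept bits, overflow bits)."""
    full = sub << s
    return full & MASK, (full >> SEG) & MASK


def shl_2x8(x, s_hi, s_lo):
    """2 x 8-bit mode: independent shifts, overflow bits are discarded."""
    hi, _ = seg_shl((x >> SEG) & MASK, s_hi)
    lo, _ = seg_shl(x & MASK, s_lo)
    return (hi << SEG) | lo


def shl_1x16(x, s):
    """1 x 16-bit mode: reuse the two 8-bit shifters for one 16-bit shift."""
    fine = s % SEG
    hi, _ = seg_shl((x >> SEG) & MASK, fine)
    lo, overflow = seg_shl(x & MASK, fine)
    hi |= overflow              # OR in the bits that left the low shifter
    if s >= SEG:                # assumed extra stage: move by a whole segment
        hi, lo = lo, 0
    return (hi << SEG) | lo


print(hex(shl_2x8(0x7629, 5, 3)))  # 0xc048
print(hex(shl_1x16(0x7629, 5)))    # 0xc520
```

The first call uses the sub-operand shift amounts 5 and 3 from step a3 of the worked example later in this document and gives 1100_0000_0100_1000.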

Multi-precision masking and shifting to support variable exponent bit width:

In a conventional Posit operation unit, only a fixed exponent bit width es is supported. During Posit decoding, the region shift yields an operand containing only the exponent segment and the mantissa segment, and the highest es bits of that operand are the value of the exponent segment; during Posit encoding, the value of the exponent segment is stored with the fixed bit width es and combined with the region segment. This design is simple to implement, but the fixed exponent bit width limits the dynamic range or the numerical accuracy that the Posit format can represent, so high numerical accuracy and large dynamic range cannot both be supported in the same hardware. The Posit encoding and decoding operation device of this scheme therefore supports not only hardware-efficient multi-precision operation but also variable exponent bit width, which greatly improves the flexibility of Posit operations in neural networks.

Specifically, since the exponent bit width is variable, the value of the exponent segment must be stored during decoding and encoding in a field whose width is the maximum exponent bit width. For example, when the exponent bit width es configuration signal is 3 bits wide, the exponent bit width can take values from 0 to 7, so a field at least 7 bits wide is required to hold the value of the exponent segment exactly. A bitwise AND with an exponent mask of the same width sets the irrelevant bits to 0, and during encoding a shift moves the redundant bits of the exponent segment out, so that Posit encoding and decoding support variable exponent bit width.

In the multi-precision Posit decoder, the masks of the exponent segment and the mantissa segment are determined from the input exponent bit width configuration signal and the precision mode control signal. The operand output by the multi-precision region shift module, which contains only the exponent segment and the mantissa segment, is also shifted left by es bits so that the value of the exponent segment is right-aligned (it is an unsigned number, so zero-padding the high-order bits does not change its value) and the value of the mantissa segment is left-aligned (zero-padding the low-order bits does not change its value). The shifted value is then bitwise ANDed with the exponent segment mask and the mantissa segment mask to obtain the exponent segment and the mantissa segment, respectively, as shown in FIG. 5.

For the exponent segment mask, in a specific embodiment the exponent bit width configuration signal es is assumed to be 3 bits wide, i.e., the actual bit width of the exponent segment is 0 to 7, and the exponent segment mask is generated as shown in Table 1. According to the precision mode control signal, the mask for each bit of the operand is selected to be either the exponent segment mask or zero, which yields an exponent segment mask of the same length as the operand. The mantissa segment mask is obtained by inverting every bit of the exponent segment mask.

Table 1 shows the exponent segment masks generated for a 3-bit exponent bit width configuration signal es:

TABLE 1

Exponent bit width es	Exponent segment mask
000	000_0000
001	000_0001
010	000_0011
011	000_0111
100	000_1111
101	001_1111
110	011_1111
111	111_1111

Taking FIG. 6 as an example, the operand contains two sub-operands, each with an exponent segment and a mantissa segment. The exponent and mantissa segments are shifted left, and the exponent segment mask and the mantissa segment mask are generated, according to the exponent bit width configuration signal and the precision mode control signal (gray-filled parts represent mask value 1 and blank parts represent mask value 0); the values of the exponent segment and the mantissa segment are then obtained by bitwise AND with the masks.
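The mask generation of Table 1 and the shift-then-mask extraction can be illustrated for a single 8-bit sub-operand as follows. This is a sketch under assumptions rather than the circuit itself: the names `exponent_mask` and `split_exp_mantissa` are hypothetical, and the working value is assumed to be a 7-bit exponent field placed above an 8-bit mantissa field.

```python
MAX_ES = 7  # maximum exponent bit width for a 3-bit es configuration signal


def exponent_mask(es):
    """7-bit exponent segment mask of Table 1: the low es bits are 1."""
    return (1 << es) - 1


def split_exp_mantissa(body, es, width=8):
    """Split a sub-operand holding only exponent and mantissa bits.

    body is left-aligned in `width` bits (the output of the region shift).
    """
    ext = body << es                                  # exponent now right-aligned
    exp_mask = exponent_mask(es) << width             # mask over the exponent field
    man_mask = ~exp_mask & ((1 << (MAX_ES + width)) - 1)  # inverted mask
    exp = (ext & exp_mask) >> width
    man = ext & man_mask
    return exp, man


for es in range(8):
    print(format(es, "03b"), format(exponent_mask(es), "07b"))

print(split_exp_mantissa(0b1100_0000, 1))  # (1, 128), i.e. mantissa 1000_0000
print(split_exp_mantissa(0b0100_1000, 1))  # (0, 144), i.e. mantissa 1001_0000
```

The two calls reproduce the exponent and mantissa segment values stated in step a4 of the worked example for es equal to 1.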

For the multi-precision Posit encoder, in the region and exponent segment separation module, the low es bits of the effective exponent input in each precision mode are the value of the exponent segment, and the high-order bits are the exponent scale factor k represented by the region segment. An exponent segment mask of the same bit width is generated from the precision mode control signal and the exponent bit width configuration signal, and the value of the exponent segment is obtained by a bitwise AND of this mask with the effective exponent input. The exponent segment mask is then inverted bit by bit and bitwise ANDed with the effective exponent input again, and the result is shifted right by es bits to obtain k.
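The separation of k and the exponent segment value can be sketched as below for one signed effective exponent. The helper names `to_signed` and `split_exponent` are hypothetical, and Python integers stand in for the fixed-width signed hardware value; Python's right shift on negative integers is arithmetic, which is what the sign-preserving shift here relies on.

```python
def to_signed(value, width=8):
    """Interpret a width-bit pattern as a two's complement signed integer."""
    return value - (1 << width) if value >> (width - 1) else value


def split_exponent(eff_exp, es):
    """Return (k, exponent segment value) for a signed effective exponent."""
    mask = (1 << es) - 1
    exp = eff_exp & mask             # low es bits: exponent segment value
    k = (eff_exp & ~mask) >> es      # remaining high bits: scale factor k
    return k, exp


print(split_exponent(to_signed(0b0000_0101), 1))  # (2, 1)
print(split_exponent(to_signed(0b1111_1110), 1))  # (-1, 0)
```

The two inputs are the effective exponents 5 and -2 used in the encoder example later in this document.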

In the multi-precision combination exponent and mantissa segment module, the exponent and mantissa segments are combined. Since the value of the exponent segment is stored using the maximum bit width while the actual bit width of the exponent segment may be smaller, the redundant high-order bits of the exponent segment must be shifted out. For example, when the exponent bit width configuration signal es is 3'b101, the actual bit width of the exponent segment is 5, but the hardware stores the exponent segment with the maximum bit width of 7 bits, so the upper 2 bits of the 7-bit number representing the exponent segment are invalid and should be shifted out according to the exponent bit width configuration signal. In each precision mode, the shift amount of each sub-operand is determined from the exponent bit width configuration signal, and the segmented, hierarchical shifting described above under multi-precision shifting is then used to achieve higher hardware utilization.
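As an illustration of this shift, the sketch below combines a 7-bit exponent field with a 7-bit mantissa (hidden bit already removed) and shifts left by 7 minus es bits so that the invalid high-order exponent bits fall off the top. The function name `combine_exp_mantissa` and the 14-bit working width are assumptions chosen to match the X values listed in step b2 of the worked example.

```python
MAX_ES = 7     # exponent segment is stored in 7 bits
FRAC_BITS = 7  # 8-bit effective mantissa minus the hidden bit


def combine_exp_mantissa(exp, frac, es):
    """Concatenate exponent and mantissa, then drop unused exponent bits."""
    total = MAX_ES + FRAC_BITS
    word = (exp << FRAC_BITS) | frac
    return (word << (MAX_ES - es)) & ((1 << total) - 1)


print(format(combine_exp_mantissa(1, 0b100_0000, 1), "014b"))  # 11000000000000
print(format(combine_exp_mantissa(0, 0b100_1000, 1), "014b"))  # 01001000000000
```

For es equal to 5, the same function shifts left by 2, which removes exactly the upper 2 invalid bits mentioned in the text.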

Therefore, with the masking and shifting approach described above, run-time configurable exponent bit width is supported efficiently in the multi-precision Posit decoder and encoder, improving the flexibility of Posit operations.

Examples:

Application scenario: the multi-precision Posit encoding and decoding operation device and method supporting variable exponent bit width provided by the invention are particularly suitable as the encoding and decoding device of a Posit-format operation unit in a neural network hardware accelerator. The invention makes the exponent bit width dynamically configurable at run time, so that the same hardware can exploit both the large dynamic range and the high numerical precision of the Posit format, which helps improve the calculation precision of the accelerator. The invention also supports multi-precision Posit encoding and decoding, which allows the accelerator to configure precision flexibly during operation and greatly improves calculation efficiency. In addition, for the two's complement, leading 0/1 counting and shifting steps of Posit encoding and decoding, the invention implements an efficient multi-precision two's complement device, multi-precision leading 0/1 counting device and multi-precision shifting device based on segmentation and hardware reuse, thereby achieving higher overall area efficiency and energy efficiency and reducing the hardware cost of deploying the device in an accelerator.

Specific example: a multi-precision Posit encoding and decoding device supporting variable exponent bit width is implemented with 1 x 16-bit and 2 x 8-bit modes. In the following operation, the precision mode control signal selects the 8-bit mode, the exponent bit width es configuration signal is 3 bits wide, and the value it represents is 1.

For the multi-precision Posit decoder, assume a 16-bit input operand: 1000_1010_0010_1001. In the 8-bit precision mode, the effective sign, exponent and mantissa values of the high 8-bit sub-operand 1 (i.e., 1000_1010) and the low 8-bit sub-operand 0 (i.e., 0010_1001) must be decoded separately. The decoding steps are as follows:

Step a1, performing the multi-precision two's complement operation on the input Posit data in the multi-precision two's complement module according to the precision mode control signal and the effective sign bits. The effective signs of sub-operand 1 and sub-operand 0 are 1 and 0, respectively, and in the segmented two's complement computation the carry generated by the low 8 bits need not be propagated to the high 8 bits, so the multi-precision two's complement result is: 0111_0110_0010_1001.

Step a2, sending the operand after the two's complement operation into the multi-precision leading 0/1 counting module, and calculating the number of leading 0s or leading 1s of the operand in the corresponding precision mode, thereby determining the actual bit width of the region segment. For sub-operand 1 (i.e., 0111_0110), the number of leading 1s of the region segment is 3 and the actual bit width of the region segment is 4. For sub-operand 0 (i.e., 0010_1001), the number of leading 0s of the region segment is 1 and the actual bit width of the region segment is 2.

Step a3, shifting the multi-precision operand left in the multi-precision region shift module according to the actual bit width of the region segment, and zero-padding the low-order bits, thereby obtaining an operand containing only the exponent segment and the mantissa segment. Sub-operand 1 (i.e., 0111_0110) must be shifted left by 5 bits (the sign bit plus the 4-bit region segment) and sub-operand 0 (i.e., 0010_1001) by 3 bits (the sign bit plus the 2-bit region segment). With multi-precision segmented shifting in the 8-bit mode, the left-shifted result of sub-operand 1 is not bitwise ORed with the bits that overflow from the left shift of sub-operand 0. The shifted result is therefore: 1100_0000_0100_1000.

Step a4, determining the masks of the exponent segment and the mantissa segment in the multi-precision mask exponent mantissa module according to the input exponent bit width es configuration signal and the precision mode control signal, and obtaining the values of the exponent segment and the mantissa segment by bitwise AND of the masks with the operand obtained in step a3. Specifically, since the exponent bit width es configuration signal is 3 bits wide and its actual value is 1, the exponent segment mask is 000_0001. The result of step a3 is shifted left by es bits (es is 1 in this operation) so that the value of the exponent segment is right-aligned and the value of the mantissa segment is left-aligned. After bitwise AND with the equal-width exponent mask and mantissa mask determined by the precision mode control signal, the exponent segment of sub-operand 1 has value 1 and its mantissa segment has value 1000_0000; the exponent segment of sub-operand 0 has value 0 and its mantissa segment has value 1001_0000.

Step a5, determining the effective sign, effective exponent and effective mantissa of the input operand according to the outputs of step a2 and step a4. For the effective sign, the effective signs of sub-operand 1 and sub-operand 0 are determined to be 1 and 0, respectively, according to the precision mode control signal. For the effective exponent, the exponent scale factors k represented by the region segments of sub-operand 1 and sub-operand 0 are first determined from the actual bit widths of the region segments obtained in step a2 to be 2 and -1, respectively. The k value is then shifted left by es bits and bitwise ORed with the value of the exponent segment obtained in step a4, giving effective exponents of 5 and -2 for sub-operand 1 and sub-operand 0, respectively. For the effective mantissa, a hidden bit is added above the most significant bit of the mantissa segment obtained in step a4 according to the precision mode control signal, giving effective mantissas of 1100_0000 and 1100_1000 for sub-operand 1 and sub-operand 0, respectively.
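Steps a1 to a5 can be traced for a single 8-bit sub-operand with the following behavioral sketch. It is an assumption-laden software model, not the segmented hardware: the function name `decode8` is hypothetical, each sub-operand is processed on its own as in the 8-bit mode, and the special patterns with no region terminator (such as all 0s) are not handled.

```python
def decode8(p, es):
    """Decode one 8-bit Posit sub-operand into (sign, exponent, mantissa)."""
    sign = (p >> 7) & 1
    mag = (-p) & 0xFF if sign else p          # a1: two's complement if negative

    first = (mag >> 6) & 1                    # a2: first region bit after sign
    run = 0
    for i in range(6, -1, -1):
        if (mag >> i) & 1 != first:
            break
        run += 1
    k = run - 1 if first else -run            # region width is run + 1

    body = (mag << (run + 2)) & 0xFF          # a3: drop sign and region bits

    ext = body << es                          # a4: align, then mask
    exp = (ext >> 8) & ((1 << es) - 1)
    man = ext & 0xFF

    eff_exp = (k << es) | exp                 # a5: combine k and exponent
    eff_man = 0x80 | (man >> 1)               # a5: prepend the hidden bit
    return sign, eff_exp, eff_man


print(decode8(0b1000_1010, 1))  # (1, 5, 192), mantissa 1100_0000
print(decode8(0b0010_1001, 1))  # (0, -2, 200), mantissa 1100_1000
```

Both outputs agree with the effective sign, exponent and mantissa values stated in step a5.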

For the multi-precision Posit encoder, assume an effective sign input of 10 (2 bits, the effective signs of the two 8-bit sub-results to be output), an effective exponent input of 0000_0101_1111_1110 (16 bits, each 8 bits being a signed number that is the effective exponent of one of the two 8-bit sub-results to be output), and an effective mantissa input of 1100_0000_1100_1000 (16 bits, each 8 bits being the effective mantissa of one of the two 8-bit sub-results to be output). The multi-precision Posit encoding process is as follows:

Step b1, for the effective exponent input, separating the exponent scale factor k and the exponent segment value in each precision mode in the region and exponent segment separation module, and determining the initial value and the actual bit width of the region segment according to k. The exponent scale factor k and the exponent segment value of sub-result 1 are thus 2 and 1, respectively, and those of sub-result 0 are -1 and 0, respectively. In addition, according to the k values, the initial values of the region segments of sub-result 1 and sub-result 0 are determined to be 000_0001 and 1111_1110, respectively.

Step b2, combining the separated exponent segment value with the mantissa from which the hidden bit has been removed in the multi-precision combination exponent and mantissa segment module according to the precision mode control signal, and shifting out the redundant high-order 0s of the exponent segment by a multi-precision left shift so that only the required es bits remain; the combined and shifted exponent and mantissa is denoted X. For sub-result 1, X is: 11_0000_0000_0000; for sub-result 0, X is: 01_0010_0000_0000.

Step b3, further combining the initial value of the region segment determined in step b1 with X obtained in step b2 in the multi-precision combination region, exponent and mantissa segment module, shifting out the redundant high-order bits of the region segment by a multi-precision left shift according to the actual bit width of the region segment, and rounding off the excess bits according to the precision requirement to obtain the absolute value of the output result. For sub-result 1, the absolute value obtained is 0111_0110; for sub-result 0, the absolute value obtained is 0010_1001.

Step b4, in the multi-precision two's complement module, performing the multi-precision two's complement operation on the absolute values obtained in step b3 according to the effective sign input to obtain the Posit encoding result. For sub-result 1, the effective sign input is 1, so the value after the two's complement operation is 1000_1010; for sub-result 0, the effective sign input is 0, so the value after the two's complement operation is 0010_1001. The output of the multi-precision Posit encoder is therefore: 1000_1010_0010_1001.
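The encoding steps b1 to b4 can likewise be traced for one 8-bit sub-result. The sketch below is a simplified model with a hypothetical name, `encode8`. It builds the region bits directly from k instead of shifting an initial value, and it truncates the excess low-order bits rather than rounding them, which happens to be exact for the two inputs of this example; saturation for very large or very small k is not modeled.

```python
def encode8(sign, eff_exp, eff_man, es):
    """Encode (sign, effective exponent, effective mantissa) into 8 bits."""
    mask = (1 << es) - 1
    exp = eff_exp & mask                     # b1: exponent segment value
    k = (eff_exp & ~mask) >> es              # b1: exponent scale factor
    frac = eff_man & 0x7F                    # b2: remove the hidden bit

    if k >= 0:                               # region: k+1 ones, then a 0
        region, region_width = ((1 << (k + 1)) - 1) << 1, k + 2
    else:                                    # region: -k zeros, then a 1
        region, region_width = 1, -k + 1

    word = (region << (es + 7)) | (exp << 7) | frac   # b2, b3: concatenate
    nbits = region_width + es + 7
    absval = word >> (nbits - 7)             # b3: keep 7 bits (truncation)

    return (-absval) & 0xFF if sign else absval       # b4: two's complement


hi = encode8(1, 5, 0b1100_0000, 1)
lo = encode8(0, -2, 0b1100_1000, 1)
print(format((hi << 8) | lo, "016b"))  # 1000101000101001
```

The concatenated output equals the encoder result 1000_1010_0010_1001 given in step b4, which is also the operand the decoder example started from.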

In addition, in a specific embodiment, a 32-bit multi-precision Posit decoder and Posit encoder are implemented based on the multi-precision Posit encoding and decoding operation device and method supporting variable exponent bit width provided by the invention; they support multi-precision encoding and decoding in 1 x 32-bit, 2 x 16-bit or 4 x 8-bit modes and support variable exponent bit width. By contrast, to realize the same functions with single-precision Posit codec units that also support variable exponent bit width, four 8-bit single-precision Posit decoders and encoders, two 16-bit single-precision Posit decoders and encoders, and one 32-bit single-precision Posit decoder and encoder must all be implemented in hardware, with the required decoders and encoders enabled according to the precision mode control signal.

The 32-bit multi-precision Posit decoder and Posit encoder implemented by this scheme and the single-precision Posit decoders and encoders based on 8-bit, 16-bit and 32-bit were synthesized under the TSMC 28nm process, and their delay, area and power consumption are compared in Tables 2 and 3, respectively:

TABLE 2

TABLE 3

It can be seen that, starting from a 32-bit single-precision Posit decoder and Posit encoder, the scheme adds support for parallel low-precision Posit encoding and decoding through segmentation, hardware reuse and similar means at the cost of only a small amount of additional hardware.

Specifically, for the multi-precision Posit decoder, realizing the same multi-precision support with single-precision Posit decoders requires four 8-bit decoders, two 16-bit decoders and one 32-bit decoder; compared with that combination, the scheme reduces area by 38.76% and power consumption by 52.26%, achieving higher area efficiency and energy efficiency. In addition, compared with a 32-bit single-precision Posit decoder, the multi-precision Posit decoder has almost no extra delay overhead, so the scheme can achieve higher computing performance when supporting parallel low precision. For the multi-precision Posit encoder, compared with the combination of four 8-bit encoders, two 16-bit encoders and one 32-bit encoder, the scheme likewise reduces area by 15.15% and power consumption by 30.54%, lowering the hardware overhead of supporting multi-precision Posit encoding.

The present invention provides a multi-precision Posit encoding and decoding operation device and method supporting variable exponent bit width. There are many methods and approaches for implementing this technical scheme, and the above is only a preferred embodiment of the invention. It should be noted that those skilled in the art can make several improvements and modifications without departing from the principles of the invention, and these improvements and modifications shall also fall within the protection scope of the invention. Components not explicitly described in this embodiment can be implemented using the prior art.

Claims

1. A multi-precision Posit encoding and decoding operation device supporting variable exponent bit width, characterized by comprising a multi-precision Posit decoder, a multi-precision Posit operation unit and a multi-precision Posit encoder;

2. The apparatus of claim 1, wherein the multi-precision Posit decoder comprises a multi-precision two's complement module, a multi-precision leading 0/1 count module, a multi-precision region shift module, and a multi-precision mask exponent mantissa module;

the multi-precision two's complement module uses a segmentation method to implement the multi-precision two's complement operation, specifically: the input operand is divided into N sub-operands, sub-operand 0 to sub-operand N-1, according to the lowest supported precision mode; the effective sign of each sub-operand is determined according to the precision mode control signal, and every bit of each sub-operand, whether the number is negative or positive, is XORed with the effective sign corresponding to that sub-operand; whether 1 must be added to the result of the XOR operation is determined according to the precision mode control signal, the effective sign of the sub-operand and the carry generated by the lower-order bits; finally, the N sub-results obtained are concatenated, and the sign bit positions are ORed with the effective sign values according to the precision mode control signal to obtain the value after the multi-precision two's complement operation.

3. The apparatus of claim 2, wherein the multi-precision leading 0/1 counting module uses a segmented counting method in which the counting results of the low precision mode are processed by adders and selectors to obtain the counting results of the high precision mode, specifically: the operand to be processed is divided into N sub-operands, sub-operand 0 to sub-operand N-1, according to the lowest supported precision; N parallel leading 0/1 counting modules calculate the number of leading 0s or leading 1s of each of the N sub-operands, denoted cpm[0] to cpm[N-1], and the data valid signals of the N sub-operands, denoted vpm[0] to vpm[N-1]; when the data valid signal vpm is 1, cpm is a definite number of leading 0s or leading 1s; when the data valid signal vpm is 0, the input sub-operand is all 0s or all 1s, and cpm is 0;

4. The apparatus of claim 3, wherein the multi-precision region shift module uses segmentation and hierarchical shifting to support multi-precision shifting, specifically: the operand is divided into N sub-operands, sub-operand 0 to sub-operand N-1, according to the lowest supported precision mode, and the left shift amount of each sub-operand is determined according to the precision mode control signal; in the lowest supported precision mode, the maximum shift amount of each sub-operand is the bit width of the sub-operand and is stored in L bits, where L is log2 of the sub-operand bit width; in the highest supported precision mode, the maximum shift amount is the total bit width of the operand and is stored in K bits, where K is log2 of the total operand bit width; segmented, hierarchical shifting is performed, with the N sub-operands fed in parallel into their respective shifters to complete shift stages 1 to L; in the lowest supported precision mode, the sub-results 0 to N-1 output by the N shifters are the final shift result, and the high-order bits that overflow during the left shift are discarded; if the precision mode control signal selects a higher precision mode, the left-shifted result of the higher-order shifter is bitwise ORed with the bits that overflow from the lower-order shifter, so that the shift is continuous in the high precision mode.

5. The apparatus of claim 4, wherein in the multi-precision mask exponent mantissa module, the masks of the exponent segment and the mantissa segment are determined according to the input exponent bit width es configuration signal and the precision mode control signal, and the operand output by the multi-precision region shift module, which contains only the exponent segment and the mantissa segment, is also shifted left by es bits to ensure that the value of the exponent segment is right-aligned and the value of the mantissa segment is left-aligned; the shifted value is bitwise ANDed with the exponent segment mask and the mantissa segment mask, respectively, to obtain the exponent segment and the mantissa segment;

6. The apparatus of claim 5, wherein the multi-precision Posit encoder comprises a region and exponent section separation module, a multi-precision combination exponent and mantissa section module, a multi-precision combination region, exponent and mantissa section module, and a multi-precision two's complement module.

7. The apparatus of claim 6, wherein in the region and exponent section separation module, the low es bits of the effective exponent input in each precision mode are the exponent section value and the high-order bits are the exponent scale factor k represented by the region section; an equal-width exponent section mask is generated from the precision mode control signal and the exponent bit width es configuration signal and bitwise ANDed with the effective exponent input to determine the exponent section value; the exponent section mask is then inverted bit by bit and bitwise ANDed with the effective exponent input again, and the result is shifted right by es bits to determine the k value; and the initial value and the actual bit width of the region section are determined according to the k value.

8. The apparatus of claim 7, wherein in the multi-precision combination exponent and mantissa section module, the exponent section and the mantissa section are combined, and the redundant high-order bits of the exponent section are shifted out by multi-precision shifting.

9. The apparatus of claim 8, wherein in the multi-precision combination region, exponent and mantissa section module, the initial value of the region section is further combined with the result output by the multi-precision combination exponent and mantissa section module, the redundant high-order bits of the region section are shifted out by multi-precision shifting, and the excess bits are rounded according to the precision mode control signal to obtain the absolute value of the result;

10. The multi-precision Posit encoding and decoding operation method supporting variable exponent bit width is characterized in that the following calculation process is completed through a multi-precision Posit decoder: