CN110515586A - Multiplier, data processing method, chip and electronic equipment - Google Patents

Multiplier, data processing method, chip and electronic equipment

Info

Publication number
CN110515586A
Authority
CN
China
Prior art keywords
circuit
product
data
multiplier
partial product
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910817905.0A
Other languages
Chinese (zh)
Other versions
CN110515586B (en
Inventor
Not disclosed (inventor requested non-publication)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Cambricon Information Technology Co Ltd
Original Assignee
Shanghai Cambricon Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Cambricon Information Technology Co Ltd
Priority to CN201910817905.0A
Publication of CN110515586A
Application granted
Publication of CN110515586B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/4824 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices using signed-digit representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/52 Multiplying; Dividing
    • G06F7/523 Multiplying only
    • G06F7/53 Multiplying only in parallel-parallel fashion, i.e. both operands being entered in parallel
    • G06F7/5318 Multiplying only in parallel-parallel fashion, i.e. both operands being entered in parallel with column wise addition of partial products, e.g. using Wallace tree, Dadda counters
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/52 Multiplying; Dividing
    • G06F7/523 Multiplying only
    • G06F7/533 Reduction of the number of iteration steps or stages, e.g. using the Booth algorithm, log-sum, odd-even
    • G06F7/5332 Reduction of the number of iteration steps or stages, e.g. using the Booth algorithm, log-sum, odd-even by skipping over strings of zeroes or ones, e.g. using the Booth Algorithm
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Neurology (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Complex Calculations (AREA)

Abstract

The application provides a multiplier, a data processing method, a chip and electronic equipment. The multiplier includes a canonical signed digit encoding circuit, a partial product acquisition circuit and a correction accumulation circuit. The output of the canonical signed digit encoding circuit is connected to the input of the partial product acquisition circuit, and the output of the partial product acquisition circuit is connected to the input of the correction accumulation circuit. Because the canonical signed digit encoding circuit encodes the received data in canonical signed digit form, fewer effective partial products are obtained, which reduces the complexity of the multiplication performed by the multiplier.

Description

Multiplier, data processing method, chip and electronic equipment
Technical field
This application relates to the field of computer technology, and in particular to a multiplier, a data processing method, a chip and electronic equipment.
Background art
With the continuous development of digital electronic technology, the rapid development of all kinds of artificial intelligence (AI) chips places ever higher requirements on high-performance digital multipliers. Neural network algorithms are among the algorithms most widely used in intelligent chips, and multiplication performed by a multiplier is a common operation in them.
At present, a multiplier typically treats every three bits of the multiplier operand as one code group, obtains partial products from the multiplicand according to that code, and compresses all partial products with a Wallace tree to obtain the target operation result of the multiplication. In this conventional technique, however, the code contains many non-zero digits, so many partial products are generated and the multiplication is more complex to implement.
Summary of the invention
In view of the above technical problem, it is necessary to provide a multiplier, a data processing method, a chip and electronic equipment that reduce the number of partial products obtained during multiplication and thereby reduce the complexity of the multiplication.
An embodiment of the present application provides a multiplier. The multiplier includes a canonical signed digit encoding circuit, a partial product acquisition circuit and a correction accumulation circuit, wherein the output of the canonical signed digit encoding circuit is connected to the input of the partial product acquisition circuit, and the output of the partial product acquisition circuit is connected to the input of the correction accumulation circuit;
The canonical signed digit encoding circuit is used to perform canonical signed digit encoding on the received data to obtain a target code. The partial product acquisition circuit is used to obtain original partial products according to the target code and to perform logic operations on the original partial products, obtaining partial products with sign extension eliminated. The correction accumulation circuit is used to perform accumulation with correction on the partial products with sign extension eliminated.
In one embodiment, the canonical signed digit encoding circuit includes a data input port and a target code output port. The data input port is used to receive first data to be encoded in canonical signed digit form, and the target code output port is used to output the target code obtained by encoding the received first data.
In one embodiment, the partial product acquisition circuit includes an original partial product acquisition unit and a logic gate unit. The original partial product acquisition unit is used to obtain original partial products according to the target code, and the logic gate unit is used to perform logic operations on the upper two bits of each original partial product, obtaining partial products with sign extension eliminated.
In one embodiment, the partial product acquisition circuit includes an AND gate circuit.
In one embodiment, the correction accumulation circuit includes a correction Wallace tree group sub-circuit and an accumulation sub-circuit, and the output of the correction Wallace tree group sub-circuit is connected to the input of the accumulation sub-circuit;
The correction Wallace tree group sub-circuit is used to perform accumulation with correction on the partial products with sign extension eliminated, and the accumulation sub-circuit is used to accumulate the result of that accumulation with correction.
In one embodiment, the correction Wallace tree group sub-circuit includes Wallace tree units, which are used to perform accumulation with correction on each column of values of the partial products with sign extension eliminated.
In one embodiment, the accumulation sub-circuit includes an adder, which is used to add the results of the accumulation with correction.
In one embodiment, the adder includes a carry signal input port, a sum signal input port and a result output port. The carry signal input port is used to receive a carry signal, the sum signal input port is used to receive a sum signal, and the result output port is used to output the target operation result obtained by accumulating the carry signal and the sum signal.
In the multiplier provided in this embodiment, the canonical signed digit encoding circuit encodes the received data in canonical signed digit form to obtain a target code. The partial product acquisition circuit then obtains an original partial product for each digit of the target code, and the logic gate unit performs logic operations on the original partial products to obtain the corresponding partial products with sign extension eliminated. Finally, the correction accumulation circuit performs accumulation with correction on those partial products. Because the received data is encoded in canonical signed digit form, fewer effective partial products are obtained, which reduces the complexity of the multiplication performed by the multiplier.
An embodiment of the present application provides a data processing method, the method comprising:
Receiving data to be processed;
Performing canonical signed digit encoding on the data to be processed to obtain original partial products;
Performing logic operations on the original partial products to eliminate the sign-extension bits, obtaining partial products with sign extension eliminated;
Performing accumulation with correction on the partial products with sign extension eliminated to obtain a target operation result.
In one embodiment, performing canonical signed digit encoding on the data to be processed to obtain the original partial products comprises:
Performing canonical signed digit encoding on the data to be processed to obtain a target code;
Obtaining the original partial products according to the data to be processed and the target code.
In one embodiment, performing canonical signed digit encoding on the data to be processed to obtain the target code comprises: converting each run of l consecutive bits of value 1 in the data to be processed into (l+1) digits, in which the highest digit is 1, the lowest digit is -1 and the remaining digits are 0, and thereby obtaining the target code, where l is greater than or equal to 2.
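The replacement rule in this embodiment rests on a standard identity for a run of ones. As a short worked equation (the symbol k, the position of the lowest bit of the run, is introduced here only for illustration and does not appear in the source):

```latex
\underbrace{2^{k+l-1} + 2^{k+l-2} + \cdots + 2^{k}}_{l\ \text{consecutive ones}}
  \;=\; 2^{k+l} - 2^{k}, \qquad l \ge 2
```

For l = 3 and k = 0 this reads 111 = 1000 - 0001, that is, 7 = 8 - 1. The value is unchanged, while the l non-zero bits of the run are replaced by two non-zero digits.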
In one embodiment, performing logic operations on the original partial products to eliminate the sign-extension bits and obtain the partial products with sign extension eliminated comprises: performing an AND logic operation on the highest bit value of each original partial product, thereby eliminating the sign-extension bits and obtaining the partial products with sign extension eliminated.
In the data processing method provided in this embodiment, the multiplier receives the data to be processed, encodes it in canonical signed digit form to obtain original partial products, performs logic operations on the original partial products to eliminate the sign-extension bits, and performs accumulation with correction on the resulting partial products to obtain the target operation result. Because the received data is encoded in canonical signed digit form, the number of effective partial products in the multiplication is reduced, which reduces the complexity of the multiplication.
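The flow of the method can be illustrated with a minimal software sketch. It assumes a non-negative multiplier operand and models only the arithmetic: Python integers are unbounded, so the sign-extension elimination and the correction step, which are hardware concerns, are not modeled. The function names `csd_digits` and `multiply` are hypothetical and are not taken from the patent.

```python
def csd_digits(value, n_bits):
    """Signed digits (-1, 0, 1) of a non-negative n_bits-wide value,
    least significant digit first, with one extra top digit (n_bits + 1 total)."""
    digits = [(value >> k) & 1 for k in range(n_bits)] + [0]
    low = 0
    while low < n_bits:
        high = low
        while high < n_bits and digits[high] == 1:
            high += 1                      # high = first position above the run
        if high - low >= 2:                # run of l >= 2 consecutive ones
            digits[low] = -1               # lowest digit of the run becomes -1
            for k in range(low + 1, high):
                digits[k] = 0              # the rest of the run becomes 0
            digits[high] = 1               # the digit above the run becomes 1
            low = high                     # the new 1 may join a higher run
        else:
            low += 1
    return digits


def multiply(multiplicand, multiplier, n_bits=8):
    """Sum one partial product per digit; return (product, non-zero partial products)."""
    partial_products = [
        digit * (multiplicand << position)  # -X, 0 or +X, shifted into place
        for position, digit in enumerate(csd_digits(multiplier, n_bits))
    ]
    effective = sum(1 for p in partial_products if p != 0)
    return sum(partial_products), effective


product, effective = multiply(13, 0b01101110)
print(product, effective)  # 1430 3
```

In this example the multiplier operand 01101110 has five non-zero bits, but its encoded form has three non-zero digits, so only three partial products contribute to the sum.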
An embodiment of the present application provides a machine learning computing device, which includes one or more of the multipliers. The machine learning computing device is used to obtain data to be operated on and control information from other processing devices, execute the specified machine learning operation, and pass the execution result to the other processing devices through an I/O interface;
When the machine learning computing device includes a plurality of the multipliers, the plurality of computing devices may be connected to one another through a specific structure and transmit data;
Specifically, the plurality of multipliers are interconnected and transmit data through a PCIE bus, so as to support larger-scale machine learning operations; the plurality of multipliers share one control system or have their own control systems; the plurality of multipliers share memory or have their own memories; and the plurality of multipliers may be interconnected in any interconnection topology.
An embodiment of the present application provides a combined processing device, which includes the machine learning computing device described above, a universal interconnection interface and other processing devices. The machine learning computing device interacts with the other processing devices to jointly complete the operation specified by the user. The combined processing device may further include a storage device, which is connected to the machine learning computing device and to the other processing devices respectively and is used to store the data of the machine learning computing device and the other processing devices.
An embodiment of the present application provides a neural network chip, which includes the multiplier described above, the machine learning computing device described above or the combined processing device described above.
An embodiment of the present application provides a neural network chip package structure, which includes the neural network chip described above.
An embodiment of the present application provides a board card, which includes the neural network chip package structure described above.
An embodiment of the present application provides an electronic device, which includes the neural network chip described above or the board card described above.
An embodiment of the present application provides a chip, which includes at least one multiplier as described in any of the above.
An embodiment of the present application provides electronic equipment, which includes the chip described above.
Brief description of the drawings
Fig. 1 is a structural diagram of a multiplier provided by an embodiment;
Fig. 2 is a structural diagram of another multiplier provided by another embodiment;
Fig. 3 is a specific structural diagram of a multiplier provided by an embodiment;
Fig. 4 is a specific structural diagram of another multiplier provided by another embodiment;
Fig. 5 is a schematic diagram of the distribution pattern of nine partial products with sign extension eliminated, provided by another embodiment;
Fig. 6 is another specific circuit structure diagram of the correction accumulation circuit for 8-bit data operation, provided by another embodiment;
Fig. 7 is a flow diagram of a data processing method provided by an embodiment;
Fig. 8 is a flow diagram of another data processing method provided by an embodiment;
Fig. 9 is a structural diagram of a combined processing device provided by an embodiment;
Fig. 10 is a structural diagram of another combined processing device provided by an embodiment;
Fig. 11 is a structural diagram of a board card provided by an embodiment.
Detailed description of the embodiments
To make the objects, technical solutions and advantages of the present application clearer, the application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the application and are not intended to limit it.
The multiplier provided by the present application can be applied to an AI chip, a field-programmable gate array (FPGA) chip or other hardware circuit equipment to perform multiplication. Its specific structure is shown in Fig. 1 and Fig. 2.
Fig. 1 is a structural diagram of a multiplier provided by an embodiment. As shown in Fig. 1, the multiplier includes a canonical signed digit encoding circuit 11 and a correction accumulation circuit 12, wherein the output of the canonical signed digit encoding circuit 11 is connected to the input of the correction accumulation circuit 12. The canonical signed digit encoding circuit 11 is used to perform canonical signed digit encoding on the received data to obtain partial products with sign extension eliminated, and the correction accumulation circuit 12 is used to perform accumulation with correction on the partial products with sign extension eliminated.
Specifically, the canonical signed digit encoding circuit 11 may include several data processing units with different functions, and the data it receives may serve as the multiplier in the multiplication and may also serve as the multiplicand. Optionally, the data processing units with different functions may include a data processing unit with a canonical signed digit encoding function; canonical signed digit encoding may be characterized as a data processing procedure that encodes data using the values 0, -1 and 1. Optionally, the multiplier and the multiplicand may be multi-bit fixed-point numbers. Optionally, the correction accumulation circuit 12 may apply correction while accumulating the partial products with sign extension eliminated obtained by the canonical signed digit encoding circuit 11, to obtain the target operation result of the multiplication.
It should be noted that the multiplier provided in this embodiment can handle multiplication of data of a fixed bit width. The fixed bit width may be 8 bits, 16 bits, 32 bits or 64 bits, and this embodiment places no restriction on it. Within any one multiplication, however, the multiplier and the multiplicand received by the canonical signed digit encoding circuit 11 have the same bit width. Optionally, each of the data processing units with different functions may have one input port, and these input ports may serve the same function; each unit may also have one output port, and these output ports may serve different functions; the circuit structures of the data processing units with different functions may also differ.
In the multiplier provided in this embodiment, the canonical signed digit encoding circuit performs canonical signed digit encoding on the received data to obtain partial products with sign extension eliminated, and the correction accumulation circuit performs accumulation with correction on those partial products to obtain the target operation result. Because the received data is encoded in canonical signed digit form, the number of effective partial products obtained during multiplication is reduced, which reduces the complexity of the multiplication. The multiplier can also improve the efficiency of the multiplication and effectively reduce its own power consumption.
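To give a feel for the reduction in effective partial products, the following sketch counts non-zero digits before and after encoding. It rests on two assumptions that are not stated in the source: the baseline is a plain binary multiplier that generates one partial product per 1 bit (not the three-bit grouping mentioned in the background), and the non-zero positions of the encoded form are computed with a well-known closed form for the non-adjacent signed-digit representation, `(x ^ 3x) >> 1`, which is assumed here to coincide with the run-replacement result for non-negative operands. The function names are hypothetical.

```python
def binary_nonzero_count(x):
    """Number of 1 bits in the plain binary form of a non-negative integer."""
    return bin(x).count("1")


def csd_nonzero_count(x):
    """Number of non-zero digits in the non-adjacent signed-digit form of x.

    Bit i of (x ^ 3x) >> 1 is set exactly where the signed-digit form
    has a +1 or a -1 at position i.
    """
    return bin((x ^ (3 * x)) >> 1).count("1")


# Totals over every 8-bit operand value.
total_binary = sum(binary_nonzero_count(x) for x in range(256))
total_csd = sum(csd_nonzero_count(x) for x in range(256))
print(f"non-zero digits over all 8-bit values: binary={total_binary}, encoded={total_csd}")
```

For any operand, the encoded count is never larger than the binary count, which is the property the embodiment relies on when it speaks of fewer effective partial products.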
Fig. 2 is a structural diagram of a multiplier provided by an embodiment. As shown in Fig. 2, the multiplier includes a canonical signed digit encoding circuit 21, a partial product acquisition circuit 22 and a correction accumulation circuit 23, wherein the output of the canonical signed digit encoding circuit 21 is connected to the input of the partial product acquisition circuit 22, and the output of the partial product acquisition circuit 22 is connected to the input of the correction accumulation circuit 23. The canonical signed digit encoding circuit 21 is used to perform canonical signed digit encoding on the received data to obtain a target code. The partial product acquisition circuit 22 is used to obtain original partial products according to the target code and to perform logic operations on the original partial products, obtaining partial products with sign extension eliminated. The correction accumulation circuit 23 is used to perform accumulation with correction on the partial products with sign extension eliminated.
Optionally, the canonical signed digit encoding circuit 21 includes a data input port 211 and a target code output port 212. The data input port 211 is used to receive first data to be encoded in canonical signed digit form, and the target code output port 212 is used to output the target code obtained by encoding the received first data.
Optionally, the partial product acquisition circuit 22 includes an original partial product acquisition unit 221 and a logic gate unit 222. The original partial product acquisition unit 221 is used to obtain original partial products according to the target code, and the logic gate unit 222 is used to perform logic operations on the highest bit value of each original partial product, obtaining partial products with sign extension eliminated. Optionally, the partial product acquisition circuit 22 includes an AND gate circuit.
Specifically, the canonical signed digit encoding circuit 21 may receive the first data and perform canonical signed digit encoding on it to obtain the target code; the first data may be the multiplier in the multiplication. It should be noted that the canonical signed digit encoding may be characterized as follows. An N-bit multiplier is processed from the low-order bits to the high-order bits. If there is a run of l (l >= 2) consecutive bits of value 1, the run may be converted into the data "1(0)^(l-1)(-1)", that is, a 1 followed by (l-1) zeros and then a -1, and the remaining (N-l) bits are combined with the (l+1) converted digits to form new data. The new data is then used as the initial data of the next stage of conversion, and this continues until the new data obtained after conversion contains no run of l (l >= 2) consecutive bits of value 1. The bit width of the target code obtained by encoding an N-bit multiplier may equal (N+1). Further, in canonical signed digit encoding, the data 11 can be converted into (100 - 001), so 11 is equivalent to 10(-1); the data 111 can be converted into (1000 - 0001), so 111 is equivalent to 100(-1); and runs of any other length l (l >= 2) are converted in the same way.
For example, suppose the multiplier received by the canonical signed digit encoding circuit 21 is "001010101101110". The first-stage conversion of this multiplier gives the first new data "0010101011100(-1)0". The second-stage conversion of the first new data gives the second new data "0010101100(-1)00(-1)0". The third-stage conversion of the second new data gives the third new data "0010110(-1)00(-1)00(-1)0". The fourth-stage conversion of the third new data gives the fourth new data "00110(-1)0(-1)00(-1)00(-1)0". The fifth-stage conversion of the fourth new data gives the fifth new data "010(-1)0(-1)0(-1)00(-1)00(-1)0". The fifth new data contains no run of l (l >= 2) consecutive bits of value 1, so it may be called the intermediate code; once the intermediate code has been padded, the canonical signed digit encoding is complete. The bit width of the intermediate code may equal the bit width of the multiplier. Optionally, if the highest and second-highest digits of the new data (that is, the intermediate code) obtained by encoding the multiplier are "10" or "01", the canonical signed digit encoding circuit 21 may pad a digit 0 one position above the highest digit of the intermediate code, so that the upper three digits of the corresponding target code are "010" or "001" respectively. Optionally, the bit width of the intermediate code may equal the bit width of the target code minus 1.
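The staged conversion described above can be reproduced with a minimal sketch. It assumes the input string starts with at least one 0, as in the example, so that every run has a digit above it to receive the new 1. The helper names `format_digits` and `csd_stages` are hypothetical and not taken from the patent.

```python
def format_digits(digits):
    """Render signed digits in the notation of the text, e.g. 10(-1)."""
    return "".join("(-1)" if d == -1 else str(d) for d in digits)


def csd_stages(bits):
    """Return the digit string after each conversion stage (MSB first)."""
    digits = [int(b) for b in bits]
    stages = []
    pos = len(digits) - 1                  # start at the least significant digit
    while pos > 0:
        if digits[pos] == 1 and digits[pos - 1] == 1:
            top = pos
            while top > 0 and digits[top - 1] == 1:
                top -= 1                   # top = most significant 1 of the run
            if top == 0:
                raise ValueError("run reaches the top digit; pad a leading 0 first")
            digits[pos] = -1               # lowest digit of the run becomes -1
            for k in range(top, pos):
                digits[k] = 0              # the rest of the run becomes 0
            digits[top - 1] = 1            # the digit above the run becomes 1
            stages.append(format_digits(digits))
            pos = top - 1                  # continue from the new 1
        else:
            pos -= 1
    return stages


stages = csd_stages("001010101101110")
for number, stage in enumerate(stages, start=1):
    print(number, stage)
# The last line printed is: 5 010(-1)0(-1)0(-1)00(-1)00(-1)0

target_code = "0" + stages[-1]             # pad one 0 above the intermediate code
```

The five strings produced match the first to fifth new data of the example, and padding a leading 0 gives a target code one digit wider than the multiplier.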
Optionally, the bit wide N that the bit wide of above-mentioned target code can be equal to the multiplier that multiplier receives adds 1, the target The bit wide of coding can be equal to the number of initial protion product, and partial product obtains the initial protion product acquiring unit in circuit 22 221 can obtain corresponding initial protion product according to each bit value in target code, and by logic gates 222 to every Highest bit value in one initial protion product carries out logical operation process, directly eliminates sign-extension bit and is eliminated sign bit Partial product after extension.Optionally, above-mentioned initial protion product can be the partial product for not carrying out symbol Bits Expanding.Meanwhile it is original Highest bit value in partial product determines additional one in the partial product after eliminating symbol Bits Expanding by logic gates 222 Bit value, the bit value can be indicated with Q.Optionally, above-mentioned logic gate 222 may include AND gate circuit.
It should be noted that if the highest bit value that initial protion accumulates is indicated with A, then partial product obtains circuit 22 and can incite somebody to action Highest bit value and signal 1 obtain the highest order of initial protion product by AND gate circuit progress and logical operation, corresponding in target Correspond to the numerical value A' of position in partial product after the elimination symbol Bits Expanding of coding, i.e. A' be A and signal 1 and position signal;And The additional one digit number value Q in partial product after obtaining the elimination symbol Bits Expanding of target code can be equal to the carry of A and signal 1 Signal.Wherein, the portion eliminated after symbol Bits Expanding obtained after initial protion accumulates highest bit value A and logical operation process Divide the production Methods in product between corresponding highest order A' and additional one digit number value Q, may refer to table 1.
Table 1
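The body of Table 1 is not present in this text. Reading the preceding paragraph literally, A' is the sum bit and Q is the carry bit of A plus the signal 1, and the carry bit is the same as A AND 1. The following minimal sketch tabulates that stated relation; the function name `sign_bit_logic` is hypothetical, and the printed values are derived from the relation above rather than copied from the patent's table.

```python
def sign_bit_logic(a):
    """Half-adder view of A + 1: return (A', Q) for a one-bit input A."""
    a_prime = a ^ 1   # sum bit of A + 1
    q = a & 1         # carry bit of A + 1, i.e. A AND 1
    return a_prime, q


for a in (0, 1):
    a_prime, q = sign_bit_logic(a)
    print(f"A={a}  A'={a_prime}  Q={q}")
```

Under this reading A' is the inverse of A and Q equals A, so the pair (Q, A') always encodes the two-bit value A + 1.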
In the multiplier provided in this embodiment, the canonical signed digit encoding circuit performs canonical signed digit encoding on the received first data to obtain a target code. The partial product acquisition circuit then obtains an original partial product for each digit of the target code, and the logic gate unit performs logic operations on the highest bit of each original partial product, which eliminates the sign extension and yields the partial products with sign extension eliminated. Finally, the correction accumulation circuit performs accumulation with correction on those partial products. Because the received data is encoded in canonical signed digit form, the number of effective partial products obtained during multiplication is reduced, which reduces the complexity of the multiplication. The multiplier can also improve the efficiency of the multiplication and effectively reduce its own power consumption.
Fig. 3 is a kind of concrete structure schematic diagram for multiplier that one embodiment provides, as shown in figure 3, the multiplier packet The canonical signed number coding circuit 11 is included, which includes: at canonical signed number coding Manage unit 111 and partial product acquiring unit 112;The output end of the canonical signed number coding processing unit 111 and the portion Divide the input terminal connection of product acquiring unit 112.Wherein, the canonical signed number coding processing unit 111 is used for receiving The first data carry out canonical signed number coded treatment and obtain target code, the partial product acquiring unit 112 is used for basis Target code obtains initial protion product, and carries out logical operation process according to initial protion product.
Optionally, the partial product acquiring unit 112 is specifically used to obtain initial partial products according to the target code and to perform binary addition processing on the highest bit value of each initial partial product, obtaining the sign-extension-eliminated partial products. Optionally, the partial product acquiring unit 112 includes first full adders 112a and 1122b.
Specifically, the canonical signed number coding processing unit 111 can receive the first data and perform canonical signed number encoding on it to obtain the target code; the first data can be the multiplier operand in the multiplication. It should be noted that the canonical signed number encoding can be characterized as follows. For an N-bit multiplier operand, processing proceeds from the low-order bits to the high-order bits. If a run of l (l >= 2) consecutive bits of value 1 exists, the run is converted to the (l+1)-digit data "1(0)^(l-1)(-1)", that is, a 1, followed by l-1 zeros, followed by -1, and the remaining (N-l) bits are combined with the (l+1) converted digits to form new data. The new data is then used as the initial data of the next conversion stage, and this repeats until the new data obtained after conversion contains no run of l (l >= 2) consecutive 1s. When an N-bit multiplier operand is encoded this way, the bit width of the resulting target code can be equal to (N+1). Further, in canonical signed number encoding, data 11 can be converted to (100-001), i.e. 11 is equivalent to 10(-1); data 111 can be converted to (1000-0001), i.e. 111 is equivalent to 100(-1); and so on, runs of other lengths l (l >= 2) are converted in the same manner.
For example, suppose the multiplier operand received by the canonical signed number coding processing unit 111 is "001010101101110". The first conversion stage yields the first new data "0010101011100(-1)0"; the second stage, applied to the first new data, yields the second new data "0010101100(-1)00(-1)0"; the third stage yields the third new data "0010110(-1)00(-1)00(-1)0"; the fourth stage yields the fourth new data "00110(-1)0(-1)00(-1)00(-1)0"; and the fifth stage yields the fifth new data "010(-1)0(-1)0(-1)00(-1)00(-1)0". The fifth new data contains no run of l (l >= 2) consecutive 1s, so it can be called the intermediate code; once the intermediate code has been bit-padded, the canonical signed number encoding is complete. The bit width of the intermediate code can be equal to the bit width of the multiplier operand. Optionally, if the highest and second-highest bit values of the new data (i.e. the intermediate code) obtained from the multiplier operand are "10" or "01", the canonical signed number coding processing unit 111 can pad one bit of value 0 above the highest bit of the intermediate code, so that the high three bits of the corresponding target code are "010" or "001", respectively. Optionally, the bit width of the intermediate code can be equal to the bit width of the target code minus 1.
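The staged conversion described above can be sketched in software. The following Python example is a minimal illustration, not the circuit of the embodiment: it scans from the low-order end, replaces each run of two or more consecutive 1s with "1(0)^(l-1)(-1)", and returns an (N+1)-digit target code. The function names csd_encode and csd_value are hypothetical.

```python
def csd_encode(bits):
    """Encode a binary string (MSB first) into N+1 signed digits (MSB first).

    Each run of l >= 2 consecutive 1s, found from the low-order end, is
    replaced by a 1 one position above the run, zeros inside the run,
    and -1 at the lowest position of the run.
    """
    # Work LSB first; the extra 0 is the padded high-order digit (width N+1).
    digits = [int(b) for b in reversed(bits)] + [0]
    n = len(digits)
    i = 0
    while i < n:
        if digits[i] == 1:
            j = i
            while j < n and digits[j] == 1:
                j += 1
            if j - i >= 2:
                digits[i] = -1
                for k in range(i + 1, j):
                    digits[k] = 0
                digits[j] = 1  # position j held 0 because the run ended there
                i = j          # the new 1 may start another run (next stage)
                continue
        i += 1
    return digits[::-1]


def csd_value(digits):
    """Numeric value of an MSB-first signed-digit list."""
    value = 0
    for d in digits:
        value = value * 2 + d
    return value


target = csd_encode("001010101101110")
print(target)
print(csd_value(target) == int("001010101101110", 2))
```

For the multiplier operand in the example above, the returned digits are the fifth new data with one 0 padded above its highest bit, and the encoded value equals the original binary value.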
The bit width of the target code can be equal to the bit width N of the multiplier operand received by the multiplier plus 1, and the bit width of the target code can be equal to the number of initial partial products. The partial product acquiring unit 112 can obtain the corresponding initial partial product according to each bit value in the target code and, through the two first full adders 112a and 1122b included in the partial product acquiring unit 112, perform an AND logic operation on the highest bit value of each initial partial product. Optionally, the bit width of each initial partial product can be equal to the bit width N of the multiplier operand received by the multiplier. Optionally, as the previous example shows, the target code contains three values, namely -1, 0, and 1. According to a received value -1 and the multiplicand X, the partial product acquiring unit 112 obtains the initial partial product -X; according to a received value 1 and the multiplicand X, it obtains the initial partial product X; and according to a received value 0 and the multiplicand X, it obtains the initial partial product 0.
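The mapping from target code digits to initial partial products can be illustrated numerically. The following Python sketch assumes plain signed integers rather than fixed-width hardware words, so sign-bit extension handling is not modelled; the name partial_products and the multiplicand value 37 are chosen only for illustration. The digit list is the target code from the example above.

```python
def partial_products(digits, x):
    """Return one shifted partial product per target-code digit.

    digits: MSB-first list with values in {-1, 0, 1}.
    x: the multiplicand X.
    A digit -1 selects -X, 1 selects X, and 0 selects 0; each product is
    weighted by the position of its digit.
    """
    n = len(digits)
    products = []
    for idx, d in enumerate(digits):
        shift = n - 1 - idx
        if d == 1:
            pp = x
        elif d == -1:
            pp = -x
        else:
            pp = 0
        products.append(pp << shift)
    return products


# Target code of "001010101101110" from the example above (16 digits).
TARGET = [0, 0, 1, 0, -1, 0, -1, 0, -1, 0, 0, -1, 0, 0, -1, 0]
X = 37  # illustrative multiplicand

pps = partial_products(TARGET, X)
nonzero = sum(1 for p in pps if p != 0)
print(sum(pps) == int("001010101101110", 2) * X)
print(nonzero, "non-zero partial products versus",
      "001010101101110".count("1"), "ones in the binary operand")
```

In this example the encoded operand needs fewer non-zero partial products than the number of 1 bits in the plain binary operand, which is the reduction in effective partial products that the embodiment relies on.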
It should be noted that if the highest bit value of an initial partial product is denoted A, then after a logic operation on A, the additional one-bit value in the sign-extension-eliminated partial product of the target code is obtained; this additional bit value can be denoted Q. Optionally, Q can be determined from the result of the AND logic operation on the highest bit value A of the initial partial product and signal 1: the Q bit value in the sign-extension-eliminated partial product can be equal to the carry signal of that operation on A and signal 1, and the second-highest bit value in the sign-extension-eliminated partial product can be equal to the sum bit signal of that operation.
In the multiplier provided by this embodiment, the canonical signed number coding processing unit performs canonical signed number encoding on the received first data to obtain the target code. The partial product acquiring unit then obtains initial partial products according to each bit value in the target code and performs an AND logic operation on the highest bit value of each initial partial product, which eliminates sign-bit extension and yields the sign-extension-eliminated partial products. Finally, the amendment summation circuit performs cumulative correction processing on the corresponding sign-extension-eliminated partial products. Because the canonical signed number coding circuit encodes the received data, the number of effective partial products obtained during multiplication is reduced, which reduces the complexity of the multiplication; the multiplier can also improve the operation efficiency of multiplication and effectively reduce its power consumption.
In one embodiment, the multiplier includes the canonical signed number coding processing unit 111, which includes a data input port 1111 and a target code output port 1112. The data input port 1111 is used to receive the first data on which canonical signed number encoding is to be performed, and the target code output port 1112 is used to output the target code obtained by performing canonical signed number encoding on the received first data.
Specifically, when the data input port 1111 receives the first data, the canonical signed number coding processing unit 111 can perform canonical signed number encoding on the received first data to obtain the target code and output the target code through the target code output port 1112. Optionally, the canonical signed number coding processing unit 111 can receive the first data through the data input port 1111, and the first data can be the multiplier operand in the multiplication. It should be noted that the canonical signed number coding circuit 11 shown in Fig. 3 and the canonical signed number coding processing unit 111 have the same internal circuit structure, external output ports, and functions. Optionally, the values contained in the target code obtained after the canonical signed number coding processing unit 111 encodes the multiplier operand can be -1, 0, and 1.
In the multiplier provided by this embodiment, the canonical signed number coding processing unit performs canonical signed number encoding on the received first data to obtain the target code; the partial product acquiring unit obtains the corresponding sign-extension-eliminated partial products according to each bit value in the target code; and the amendment summation circuit performs cumulative correction processing on these partial products to obtain the target operation result of the multiplication. Because the received data is encoded by the canonical signed number coding processing unit, the number of effective partial products obtained during multiplication is reduced, which reduces the complexity of the multiplication; the multiplier can also improve the operation efficiency of multiplication and effectively reduce its power consumption.
In one embodiment, the multiplier includes the partial product acquiring unit 112, which includes a target code input port 1121, a data input port 1122, and a partial product output port 1123. The target code input port 1121 is used to receive the target code, the data input port 1122 is used to receive the second data, and the partial product output port 1123 is used to output the sign-extension-eliminated partial products obtained from the received target code and second data.
Specifically, the partial product acquiring unit 112 can receive, through the target code input port 1121, the target code output by the canonical signed number coding processing unit 111. The partial product acquiring unit 112 obtains the initial partial products according to each bit value of the target code received at the target code input port 1121 and the second data received at the data input port 1122; the second data can be the multiplicand in the multiplication. It then performs AND logic operation processing on the initial partial products to obtain the corresponding sign-extension-eliminated partial products. Optionally, the bit width of a sign-extension-eliminated partial product can be equal to the bit width of the initial partial product.
In the multiplier provided by this embodiment, the partial product acquiring unit obtains the corresponding sign-extension-eliminated partial products according to each bit value in the target code, and the amendment summation circuit performs cumulative correction processing on these partial products to obtain the target operation result of the multiplication. This reduces the number of effective partial products the multiplier obtains and reduces the complexity of the multiplication; the multiplier can also improve the operation efficiency of multiplication and effectively reduce its power consumption.
Continuing with the detailed structural diagram of the multiplier shown in Fig. 3, in one embodiment the multiplier includes the amendment summation circuit 12, which includes full adders 121~12n; the multiple full adders 121~12n are used to perform cumulative correction processing on the received sign-extension-eliminated partial products.
Specifically, the full adders 121~12n can be combinational circuits, built from gate circuits, that add binary numbers and produce their sum; they can also be understood as circuits that process multi-bit input signals, adding them to obtain two output signals. Optionally, the number n of full adders included in the amendment summation circuit 12 can be equal to the result of adding 1 to the bit width N of the sign-extension-eliminated partial product, multiplied by (N+1), and then summed with N, where N can denote the number of values contained in the target code obtained by the canonical signed number coding processing unit 111 minus 1; that is, the number of values in the target code is equal to N+1. Optionally, the n full adders in the amendment summation circuit 12 can be arranged in layers, and each sign-extension-eliminated partial product obtained by the partial product acquiring unit 112 can correspond to one layer of full adders. The number of layers of full adders can be equal to the number of sign-extension-eliminated partial products; the number of full adders in the last layer can be equal to the sum of N and the bit width N of the sign-extension-eliminated partial product plus 1; and the number of full adders in each other layer can be equal to the bit width N of the sign-extension-eliminated partial product. In addition, when all sign-extension-eliminated partial products are accumulated, the position of the lowest bit of each sign-extension-eliminated partial product is offset one bit to the right relative to the position of the lowest bit of the next sign-extension-eliminated partial product. Optionally, after the full adders 121~12n complete the cumulative correction processing, an operation result is obtained; this operation result can be the sum bit signals output by the last layer of full adders. In addition, the internal circuit structure of the full adders 121~12n can be the same as that of the first full adders 112a and 1122b, and their function can also be the same.
It should be noted that each full adder in the amendment summation circuit 12 can perform addition on two or more input signals to obtain two output signals, which can include the carry signal Carry and the result bit signal Sum. Optionally, in this embodiment, each full adder in the amendment summation circuit 12 can receive three input signals, which can be any three of the following: any bit value of a sign-extension-eliminated partial product, the carry output signal Carry obtained by the adder to its right, the result bit signal Sum, and a binary signal. Optionally, while the amendment summation circuit 12 performs cumulative correction processing on the sign-extension-eliminated partial products, the full adders in the amendment summation circuit 12 can apply a correction to two of the sign-extension-eliminated partial products obtained by the partial product acquiring unit 112; the correction is equivalent to adding 1. Optionally, through the first layer of full adders in the amendment summation circuit 12, the multiplier can accumulate the corresponding bits of the first and second sign-extension-eliminated partial products obtained by the partial product acquiring unit 112; the second layer of full adders can accumulate the third sign-extension-eliminated partial product with the result of the previous layer of full adders; and so on. The last layer of full adders can accumulate the result of the previous layer, any carry signals or sum bit signals output by earlier layers that have not yet been processed, and the last sign-extension-eliminated partial product obtained by the partial product acquiring unit 112, to obtain the target operation result of the multiplication. During this processing, the input signals received by each full adder in the layers other than the first can include not only the corresponding bit value of a sign-extension-eliminated partial product but also the sum bit signal output by the full adder at the corresponding position in the previous layer and the carry signal output by the full adder one position lower in the previous layer.
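The behaviour of a single full adder, and of one layer of full adders chained through their carry signals, can be sketched as follows. This is a simplified Python model under stated assumptions: bits are held in LSB-first lists, the carry input of the lowest adder is 0, and the correction inputs described above are not modelled. The names full_adder and add_layer are hypothetical.

```python
def full_adder(a, b, cin):
    """One-bit full adder: three input bits, outputs (Sum, Carry)."""
    s = a ^ b ^ cin
    carry = (a & b) | (a & cin) | (b & cin)
    return s, carry


def add_layer(x_bits, y_bits):
    """Add two equal-length LSB-first bit lists with a chain of full adders.

    Each adder receives one bit from each operand plus the carry produced
    by the adder to its right; the lowest adder receives carry-in 0.
    Returns (sum_bits, carry_out_of_highest_adder).
    """
    carry = 0
    sum_bits = []
    for a, b in zip(x_bits, y_bits):
        s, carry = full_adder(a, b, carry)
        sum_bits.append(s)
    return sum_bits, carry


# 13 + 11 with 4-bit operands, LSB first.
bits, carry_out = add_layer([1, 0, 1, 1], [1, 1, 0, 1])
print(bits, carry_out)
```

In the layered arrangement described above, one operand of each layer would be a sign-extension-eliminated partial product and the other would be the sum bits of the previous layer.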
Optionally, the amendment summation circuit 12 can apply two corrections while accumulating the sign-extension-eliminated partial products. It can do so through two full adders, one in the first layer and one in the last layer of full adders, which correct values in the sign-extension-eliminated partial products. If each full adder has a corresponding number, the full adder that applies the correction in the first layer can be the one with the second-highest bit number, and the full adder that applies the correction in the last layer can be the one with the highest bit number. In addition, the carry input signal received by the full adder with the lowest bit number in the last layer can be equal to 0.
In the multiplier provided by this embodiment, the amendment summation circuit performs cumulative correction processing on the smaller number of sign-extension-eliminated partial products obtained by the partial product acquiring unit to obtain the target operation result of the multiplication, which reduces the complexity of the multiplication and effectively reduces the power consumption of the multiplier.
Fig. 4 is a detailed structural diagram of a multiplier provided by another embodiment, in which the multiplier includes the amendment summation circuit 23. The amendment summation circuit 23 includes an amendment Wallace tree group sub-circuit 231 and a cumulative sub-circuit 232, and the output end of the amendment Wallace tree group sub-circuit 231 is connected to the input end of the cumulative sub-circuit 232. The amendment Wallace tree group sub-circuit 231 is used to perform cumulative correction processing on the sign-extension-eliminated partial products, and the cumulative sub-circuit 232 is used to perform accumulation processing on the cumulative correction operation result.
Specifically, the amendment Wallace tree group sub-circuit 231 can perform cumulative correction processing on the values in the sign-extension-eliminated partial products obtained by the canonical signed number coding circuit 211, and the cumulative sub-circuit 232 can perform accumulation processing on the cumulative correction operation result obtained by the amendment Wallace tree group sub-circuit 13 to obtain the target operation result of the multiplication.
In the multiplier provided by this embodiment, the amendment summation circuit performs cumulative correction processing on the smaller number of sign-extension-eliminated partial products obtained by the partial product acquiring unit to obtain the target operation result of the multiplication, which reduces the complexity of the multiplication and effectively reduces the power consumption of the multiplier.
Continuing with the detailed structural diagram of the multiplier shown in Fig. 4, in one embodiment the multiplier includes the amendment Wallace tree group sub-circuit 231, which includes Wallace tree units 2311~231n; the multiple Wallace tree units 2311~231n are used to perform cumulative correction processing on the values in each column of the sign-extension-eliminated partial products.
Specifically, the circuit structure of the Wallace tree units 2311~231n can be implemented by a combination of full adders and half adders; a Wallace tree unit 2311~231n can also be understood as a circuit that processes multi-bit input signals, adding them to obtain two output signals. Optionally, the number n of Wallace tree units included in the amendment Wallace tree group sub-circuit 231 can be equal to 2 times the bit width N of the sign-extension-eliminated partial product, where N can denote the number of values contained in the target code obtained by the canonical signed number coding circuit 21 minus 1. The n Wallace tree units can process the partial products of the target code in parallel, although they can be connected in series; here the partial products of the target code can be all the sign-extension-eliminated partial products obtained by the partial product acquisition circuit 22. Optionally, each Wallace tree unit in the amendment Wallace tree group sub-circuit 23 can add all the values in one column of all the sign-extension-eliminated partial products, and each Wallace tree unit can output two signals, namely a carry signal Carry_i and a sum bit signal Sum_i, where i can denote the number of the Wallace tree unit and the first Wallace tree unit is numbered 0. Optionally, the number of input signals received by each Wallace tree unit can be equal to the number of values contained in the target code, that is, the total number of sign-extension-eliminated partial products, or can be equal to that number plus 1.
It should be noted that while the multiplier adds the values in each column of all the sign-extension-eliminated partial products, two Wallace tree units in the amendment Wallace tree group sub-circuit 231 apply a correction to two columns of data in the sign-extension-eliminated partial products. That is, each of the two Wallace tree units corresponding to these two columns receives one more input signal than the Wallace tree units corresponding to the other columns, and this additional input signal is 1.
In addition, the signals of each Wallace tree unit in the amendment Wallace tree group sub-circuit 231 can include a carry input signal Cin_i, partial product input signals, and a carry output signal Cout_i. Optionally, the partial product input signals received by each Wallace tree unit can be the values of one column of all the sign-extension-eliminated partial products, and the number of carry signals Cout_i output by each Wallace tree unit can be N_Cout = floor((N_I + N_Cin)/2) - 1, where N_I can denote the number of partial product value input signals of the Wallace tree unit, N_Cin can denote the number of carry input signals of the Wallace tree unit, N_Cout can denote the minimum number of carry output signals of the Wallace tree unit, and floor() can denote the round-down function. Optionally, the carry input signal received by each Wallace tree unit in the amendment Wallace tree group sub-circuit 231 can be the carry output signal output by the previous Wallace tree unit, and the carry input signal received by the first Wallace tree unit is 0; the number of carry signal input ports of the first Wallace tree unit can be the same as the number of carry signal input ports of the other Wallace tree units.
In this embodiment, if the n Wallace tree units connected in series in the amendment Wallace tree group sub-circuit 231 are numbered 1, 2, ..., i, ..., n, the amendment Wallace tree group sub-circuit 231 can correct the two corresponding columns of data in the sign-extension-eliminated partial products through the i-th Wallace tree unit and the n-th Wallace tree unit. In addition, if the bits of the first sign-extension-eliminated partial product obtained by the canonical signed number coding circuit 21 are numbered 1, 2, ..., m-2, m-1, m from the lowest bit to the highest bit, where m is the number of the Q bit and 1 is the number of the lowest bit of the first sign-extension-eliminated partial product, then i can be equal to N. In other words, the amendment Wallace tree group sub-circuit 231 can correct the sign-extension-eliminated partial products through the N-th Wallace tree unit and the last Wallace tree unit, where N can denote the bit width of the multiplier operand received by the multiplier.
Illustratively, suppose the multiplier is currently processing an 8-bit * 8-bit fixed-point multiplication and each sign-extension-eliminated partial product obtained by the partial product acquisition circuit 22 is "p_i8 p_i7 p_i6 p_i5 p_i4 p_i3 p_i2 p_i1 p_i0" (i = 1, ..., n = 9), where i denotes the i-th sign-extension-eliminated partial product. During cumulative correction processing, the distribution of the 9 sign-extension-eliminated partial products is shown in Fig. 5, in which each dot represents one bit value of a sign-extension-eliminated partial product. Counting from the rightmost column to the leftmost column, the figure shows 17 columns of partial product values; in actual operation the value in the last column overflows, that is, the highest bit value of the last sign-extension-eliminated partial product overflows and does not take part in the subsequent accumulation. A total of 16 Wallace tree units are therefore needed to perform cumulative correction processing on the 9 sign-extension-eliminated partial products, and the amendment Wallace tree group sub-circuit 231 can apply the correction through the 8th Wallace tree unit and the last Wallace tree unit. The connection diagram of the 16 Wallace tree units, including the two Wallace tree units that implement the correction, is shown in Fig. 6, where Wallace_i denotes a Wallace tree unit and i is the number of the Wallace tree unit starting from 1. A solid line between two Wallace tree units indicates that the higher-numbered Wallace tree unit has a carry output signal, and a dashed line indicates that the higher-numbered Wallace tree unit has no carry output signal.
In the multiplier provided by this embodiment, the amendment Wallace tree group sub-circuit performs cumulative correction processing on the smaller number of sign-extension-eliminated partial products obtained by the partial product acquiring unit to obtain the target operation result of the multiplication, which reduces the complexity of the multiplication and effectively reduces the power consumption of the multiplier.
Continuing with the detailed structural diagram of the multiplier shown in Fig. 4, in one embodiment the multiplier includes the cumulative sub-circuit 232, which includes an adder 2321; the adder 2321 is used to perform addition on the cumulative correction operation result.
Specifically, the adder 2321 can be an adder of any of several bit widths and can be a carry-lookahead adder. Optionally, the adder 2321 can receive the two signals output by the amendment Wallace tree group sub-circuit 231 and add these two output signals to obtain the target operation result of the multiplication.
In the multiplier provided by this embodiment, the cumulative sub-circuit performs accumulation processing on the two signals output by the amendment Wallace tree group sub-circuit to obtain the target operation result of the multiplication, which reduces the complexity of the multiplication and effectively reduces the power consumption of the multiplier.
In one embodiment, the multiplier includes the adder 2321, which includes a carry signal input port 2321a, a sum bit signal input port 2321b, and a result output port 2321c. The carry signal input port 2321a is used to receive the carry signal, the sum bit signal input port 2321b is used to receive the sum bit signal, and the result output port 2321c is used to output the target operation result obtained by accumulating the carry signal and the sum bit signal.
Specifically, the adder 2321 can receive, through the carry signal input port 2321a, the carry signal Carry output by the amendment Wallace tree group sub-circuit 231 and, through the sum bit signal input port 2321b, the sum bit signal Sum output by the amendment Wallace tree group sub-circuit 231; it then outputs the result of accumulating the carry signal Carry and the sum bit signal Sum through the result output port 2321c.
It should be noted that during multiplication the multiplier can use an adder 2321 of a suitable bit width to add the carry output signal Carry and the sum bit output signal Sum output by the amendment Wallace tree group sub-circuit 231; the bit width of the data the adder 2321 can process can be equal to 2 times the bit width N of the data currently processed by the multiplier. Optionally, each Wallace tree unit in the amendment Wallace tree group sub-circuit 231 can output one carry output signal Carry_i and one sum bit output signal Sum_i (i = 0, ..., 2N-1, where i is the number of each Wallace tree unit, starting from 0). Optionally, the carry signal received by the adder 2321 is Carry = {[Carry_0 : Carry_(2N-2)], 0}. That is, the bit width of the carry output signal Carry received by the adder 2321 is 2N; the first 2N-1 bit values of Carry correspond to the carry output signals of the first 2N-1 Wallace tree units in the amendment Wallace tree group sub-circuit 231, and the last bit value of Carry can be replaced with 0. Optionally, the bit width of the sum bit output signal Sum received by the adder 2321 is 2N, and the values in Sum can be equal to the sum bit output signals of the Wallace tree units in the amendment Wallace tree group sub-circuit 231.
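The way the final adder combines Sum with the realigned Carry word can be illustrated with a simplified carry-save model. The Python sketch below is an assumption-laden illustration rather than the Wallace tree units of the embodiment: it compresses only three words in a single 3:2 step, treats bit i of the carry result as Carry_i, and interprets Carry = {[Carry_0 : Carry_(2N-2)], 0} as moving each carry up one column, discarding the carry of the last column, and placing 0 in the lowest position. The names carry_save_step and final_add are hypothetical.

```python
def carry_save_step(a, b, c, width):
    """Compress three words column by column into (Sum, per-column carries).

    Bit i of the second return value plays the role of Carry_i, the carry
    produced by column i.
    """
    mask = (1 << width) - 1
    sum_word = (a ^ b ^ c) & mask
    carry_bits = ((a & b) | (a & c) | (b & c)) & mask
    return sum_word, carry_bits


def final_add(sum_word, carry_bits, width):
    """Add Sum and the realigned Carry word, keeping `width` bits.

    The shift moves Carry_0..Carry_(width-2) up one column and leaves 0 in
    the lowest position; the mask discards Carry_(width-1).
    """
    mask = (1 << width) - 1
    carry_word = (carry_bits << 1) & mask
    return (sum_word + carry_word) & mask


N = 8          # operand bit width, as in the 8-bit * 8-bit example
WIDTH = 2 * N  # bit width handled by the final adder

s, c = carry_save_step(0x00F3, 0x0A5C, 0x1234, WIDTH)
print(final_add(s, c, WIDTH) == (0x00F3 + 0x0A5C + 0x1234) % (1 << WIDTH))
```

Keeping Sum and Carry separate until this last step means only one carry-propagating addition is needed, which is why a single 2N-bit adder is sufficient at the output.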
Illustratively, if the multiplier is currently processing an 8-bit * 8-bit fixed-point multiplication, the adder 2321 can be a 16-bit carry-lookahead adder. As shown in Fig. 6, the amendment Wallace tree group sub-circuit 231 can output the sum bit output signals Sum and the carry output signals Carry of the 16 Wallace tree units. However, the sum bit output signal received by the 16-bit carry-lookahead adder can be the complete sum bit signal Sum output by the amendment Wallace tree group sub-circuit 231, whereas the carry output signal it receives can be the carry signal Carry formed by combining 0 with all the carry output signals of the amendment Wallace tree group sub-circuit 231 except the carry output signal of the last Wallace tree unit.
In the multiplier provided by this embodiment, the cumulative sub-circuit performs accumulation processing on the two signals output by the amendment Wallace tree group sub-circuit to obtain the target operation result of the multiplication, which reduces the complexity of the multiplication and effectively reduces the power consumption of the multiplier.
Fig. 7 is a flow diagram of a data processing method provided by one embodiment. The method can be carried out by the multiplier shown in Fig. 1, and this embodiment concerns the process of multiplying data. As shown in Fig. 7, the method includes:
S101, pending data is received.
Specifically, the multiplier may receive the pending data through the canonical signed number coding circuit. The pending data may be the multiplier operand and the multiplicand of the multiplication, and the bit width of the multiplier operand may be equal to that of the multiplicand.
S102, canonical signed number encoding is performed on the pending data to obtain a target code.
Specifically, the multiplier may perform canonical signed number encoding on the received multiplier operand through the canonical signed number coding circuit to obtain the target code. The bit width of the target code may be equal to the bit width N of the multiplier operand plus 1.
Optionally, the step in S102 of performing canonical signed number encoding on the pending data to obtain the target code may include: converting l consecutive bits of value 1 in the pending data into (l+1) bits in which the highest bit is 1, the lowest bit is -1 and the remaining bits are 0, thereby obtaining the target code, where l is greater than or equal to 2.
It should be noted that the canonical signed number encoding may be characterized as follows. An N-bit multiplier operand is processed from the low-order bits to the high-order bits. If there are l (l >= 2) consecutive bits of value 1, those l bits are converted into the data "1(0)^(l-1)(-1)", and the remaining (N-l) bits are combined with the (l+1) converted bits to obtain new data. The new data is then used as the initial data of the next conversion stage, until the data obtained after conversion contains no l (l >= 2) consecutive bits of value 1. The bit width of the target code obtained by encoding an N-bit multiplier operand in this way may be equal to (N+1).
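The conversion procedure above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the multiplier operand is treated as an unsigned N-bit pattern, the target code is returned as a list of digits with index 0 as the lowest bit, and the function name `csd_encode` is hypothetical rather than taken from the application.

```python
def csd_encode(multiplier, n_bits):
    """Encode an N-bit pattern into N+1 signed digits in {-1, 0, 1}.

    Runs of l >= 2 consecutive ones are replaced, from the low end upward,
    by 1 (0)^(l-1) (-1); the new high 1 may join the next run of ones.
    """
    if not 0 <= multiplier < (1 << n_bits):
        raise ValueError("multiplier does not fit in n_bits")
    # One extra digit on top so the target code has N+1 positions.
    digits = [(multiplier >> i) & 1 for i in range(n_bits)] + [0]
    i = 0
    while i < n_bits:
        if digits[i] == 1:
            j = i
            while j < n_bits and digits[j] == 1:
                j += 1                      # j is one past the run of ones
            if j - i >= 2:
                digits[i] = -1              # lowest bit of the run
                for k in range(i + 1, j):
                    digits[k] = 0           # middle of the run
                digits[j] = 1               # bit just above the run
                i = j                       # re-examine: may start a new run
                continue
        i += 1
    return digits


code = csd_encode(0b11011, 5)
print(code)
```

For the 5-bit input 11011 (decimal 27) the sketch yields the six digits -1, 0, -1, 0, 0, 1 from low to high, i.e. 32 - 4 - 1 = 27, so three non-zero digits replace the four ones of the original pattern.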
S103, partial products with sign-bit extension eliminated are obtained according to the pending data and the target code.
It should be noted that the canonical signed number coding circuit may obtain the partial products with sign-bit extension eliminated according to the multiplicand of the multiplication and the target code obtained by canonical signed number encoding, and the number of these partial products may be equal to the bit width of the target code.
S104, cumulative correction processing is performed on the partial products with sign-bit extension eliminated to obtain a target operation result.
Specifically, the multiplier may perform cumulative correction processing on the partial products with sign-bit extension eliminated through the successive layers of full adders in the amendment summation circuit, until the last layer of full adders finishes its operation, yielding the target operation result of the multiplication. Optionally, the cumulative correction processing may be characterized as applying a correction while the partial products with sign-bit extension eliminated are being accumulated; the correction may be performed by two full adders in the amendment summation circuit, one in the first layer of full adders and one in the last layer. Optionally, the target operation result may be the result obtained after sign-bit extension has been eliminated and the corrected accumulation has been performed. It should be noted that, if each full adder has a corresponding number, the full adder that performs the correction in the first layer may be the one with the second-highest number, and the full adder that performs the correction in the last layer may be the one with the highest number.
In addition, the multiplier may also accumulate the values in each column of the partial products with sign-bit extension eliminated through the amendment Wallace tree group sub-circuit in the amendment summation circuit, with two Wallace tree units in the amendment Wallace tree group sub-circuit performing the correction during accumulation. The amendment Wallace tree group sub-circuit outputs the corrected carry output signals and sum-bit output signals. Finally, the cumulative sub-circuit accumulates the sum-bit output signals together with the carry output signals of the amendment Wallace tree group sub-circuit in which the last carry signal is replaced with 0, and outputs the target operation result.
It should be noted that if the multiplier currently processes N-bit data and 2N Wallace tree units are connected in series in the amendment Wallace tree group sub-circuit, with each Wallace tree unit numbered starting from 0, then the amendment Wallace tree group sub-circuit may perform the correction through the N-th Wallace tree unit and the 2N-th Wallace tree unit.
In the data processing method provided in this embodiment, pending data is received, canonical signed number encoding is performed on the pending data to obtain a target code, partial products with sign-bit extension eliminated are obtained according to the pending data and the target code, and cumulative correction processing is performed on these partial products to obtain the target operation result. By encoding the received data with the canonical signed number coding circuit, the method reduces the number of effective partial products obtained during multiplication and thus the complexity of the multiplication; it can also improve the operation efficiency of the multiplication and effectively reduce the power consumption of the multiplier.
In the data processing method provided by another embodiment, obtaining the partial products with sign-bit extension eliminated according to the pending data and the target code in S103 above includes:
S1031, original partial products are obtained according to the pending data and the target code.
It should be noted that the number of original partial products may be equal to the bit width of the target code.
For example, if the partial product acquiring unit receives an 8-bit multiplicand x7x6x5x4x3x2x1x0 (i.e. X), it may directly obtain the corresponding original partial products from the multiplicand X and the three values -1, 0 and 1 contained in the target code: when a digit of the target code is -1, the original partial product may be -X; when the digit is 0, the original partial product may be 0; and when the digit is 1, the original partial product may be X.
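The selection rule above can be illustrated with a short Python sketch. The function names `original_partial_products` and `accumulate` are hypothetical, the target code is assumed to be a list of digits with index 0 as the lowest position, and the plain integer summation stands in for the hardware accumulation, which the application performs with an amendment summation circuit.

```python
def original_partial_products(x, target_code):
    """One original partial product per target-code digit: -X, 0 or X."""
    select = {-1: -x, 0: 0, 1: x}
    return [select[digit] for digit in target_code]


def accumulate(partial_products):
    """Weight partial product i by 2**i and add them all up."""
    return sum(p << i for i, p in enumerate(partial_products))


# Target code of the multiplier operand 27 (digits listed from low to high).
target_code = [-1, 0, -1, 0, 0, 1]
products = original_partial_products(13, target_code)
print(products)              # [-13, 0, -13, 0, 0, 13]
print(accumulate(products))  # 351, i.e. 13 * 27
```

The number of partial products equals the number of digits in the target code, but only the non-zero digits contribute an effective partial product.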
S1032, addition operation processing is performed on the original partial products to obtain the partial products with sign-bit extension eliminated.
Optionally, performing the operation processing on the original partial products in S1032 to obtain the partial products with sign-bit extension eliminated includes: performing AND logic operation processing on the highest bit of each original partial product to obtain the partial product with sign-bit extension eliminated.
Specifically, the multiplier may, through the partial product acquiring unit, pass the highest bit of each original partial product through a first full adder to perform the AND logic operation, obtaining the additional bit Q and the second-highest bit of the partial product with sign-bit extension eliminated, and thereby obtain that partial product. Optionally, the additional bit Q may be the carry signal produced when the highest bit of the original partial product and the signal 1 undergo the AND logic operation, and the second-highest bit of the partial product with sign-bit extension eliminated may be the sum-bit signal produced by the same operation.
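As background for the step above, the following Python sketch shows one common textbook way of eliminating sign-bit extension; it is not necessarily the circuit of this application, and the function name `eliminate_sign_extension` is hypothetical. It assumes each original partial product is an (N+1)-bit two's complement value and that results are taken modulo 2^(2N). The sign bit of each row is inverted instead of being replicated, and the constants that compensate for the inversion are merged into one fixed correction word, which plays a role comparable to the correction applied during accumulation.

```python
def eliminate_sign_extension(partial_products, n_bits):
    """Return (rows, correction): non-negative rows with the sign bit
    inverted, plus one constant word, whose sum modulo 2**(2*n_bits)
    equals the sum of the shifted, sign-extended partial products."""
    width = n_bits + 1                      # bits per original partial product
    mask = (1 << (2 * n_bits)) - 1
    rows = []
    correction = 0
    for i, p in enumerate(partial_products):
        sign = (p >> (width - 1)) & 1       # two's complement sign bit
        low = p & ((1 << (width - 1)) - 1)  # bits below the sign bit
        # Invert the sign bit; no replicated sign bits are generated.
        rows.append((((sign ^ 1) << (width - 1)) | low) << i)
        # Inverting the sign bit adds 2**(width-1+i); compensate for it.
        correction -= 1 << (width - 1 + i)
    return rows, correction & mask


# X = -13 times multiplier operand 27 (target code digits, low to high)
n = 8
code = [-1, 0, -1, 0, 0, 1, 0, 0, 0]
pps = [d * -13 for d in code]
rows, correction = eliminate_sign_extension(pps, n)
result = (sum(rows) + correction) & 0xFFFF
print(hex(result))
```

The printed word is the 16-bit two's complement representation of -351, i.e. -13 multiplied by 27.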
In the data processing method provided in this embodiment, original partial products are obtained according to the pending data and the target code, AND logic operation processing is performed on the highest bit of each original partial product to obtain the partial products with sign-bit extension eliminated, and cumulative correction processing is then performed on these partial products to obtain the target operation result of the multiplication. The method can reduce the number of effective partial products obtained during multiplication and thus the complexity of the multiplication; it can also improve the operation efficiency of the multiplication and effectively reduce the power consumption of the multiplier.
Fig. 8 is a flow diagram of the data processing method provided by one embodiment. The method can be carried out by the multiplier shown in Fig. 2, and this embodiment concerns the process of data multiplication. As shown in Fig. 8, the method includes:
S201, pending data is received.
Specifically, the multiplier may receive the pending data through the canonical signed number coding circuit. The pending data may be the multiplier operand and the multiplicand of the multiplication, and the bit width of the multiplier operand may be equal to that of the multiplicand.
S202, canonical signed number encoding is performed on the pending data to obtain original partial products.
Specifically, the multiplier performs canonical signed number encoding on the multiplier operand of the multiplication through the canonical signed number coding circuit, and the partial product obtaining circuit may obtain the original partial products according to the result of the canonical signed number encoding.
S203, logical operation processing is performed on the original partial products, and the sign extension bits are eliminated to obtain partial products with sign-bit extension eliminated.
Specifically, the multiplier may perform logical operation processing on the original partial products through the logic gate circuit in the partial product obtaining circuit, directly eliminating the values of the sign extension bits to obtain the partial products with sign-bit extension eliminated.
S204, cumulative correction processing is performed on the partial products with sign-bit extension eliminated to obtain a target operation result.
Specifically, the multiplier may perform cumulative correction processing on the partial products with sign-bit extension eliminated through the successive layers of full adders in the amendment summation circuit, until the last layer of full adders finishes its operation, yielding the operation result. Optionally, the cumulative correction processing may be characterized as applying a correction while the partial products with sign-bit extension eliminated are being accumulated; the correction may be performed by two full adders in the amendment summation circuit, one in the first layer of full adders and one in the last layer. Optionally, the operation result may be the result obtained after sign-bit extension has been eliminated and the corrected accumulation has been performed. It should be noted that, if each full adder has a corresponding number, the full adder that performs the correction in the first layer may be the one with the second-highest number, and the full adder that performs the correction in the last layer may be the one with the highest number.
In addition, the multiplier may also accumulate the values in each column of the partial products with sign-bit extension eliminated through the amendment Wallace tree group sub-circuit in the amendment summation circuit, with two Wallace tree units in the amendment Wallace tree group sub-circuit performing the correction during accumulation. The amendment Wallace tree group sub-circuit outputs the corrected carry output signals and sum-bit output signals. Finally, the cumulative sub-circuit accumulates all sum-bit signals together with all carry output signals Carry_i of the amendment Wallace tree group sub-circuit in which the last carry signal is replaced with 0, and outputs the operation result. It should be noted that if the multiplier currently processes N-bit data and 2N Wallace tree units are connected in series in the amendment Wallace tree group sub-circuit, with each Wallace tree unit numbered starting from 0, then the amendment Wallace tree group sub-circuit may perform the correction through the N-th Wallace tree unit and the 2N-th Wallace tree unit.
In the data processing method provided in this embodiment, pending data is received, canonical signed number encoding is performed on the pending data to obtain original partial products, logical operation processing is performed on the original partial products to obtain partial products with sign-bit extension eliminated, and cumulative correction processing is performed on these partial products to obtain the target operation result. By applying canonical signed number encoding to the received pending data, the method reduces the number of effective partial products in the multiplication and thus the complexity of the multiplication; it can also improve the operation efficiency of the multiplication and effectively reduce the power consumption of the multiplier.
In the data processing method provided by another embodiment, performing canonical signed number encoding on the pending data to obtain the original partial products in S202 above includes:
S2021, canonical signed number encoding is performed on the pending data to obtain a target code.
Specifically, the multiplier may perform canonical signed number encoding on the multiplier operand of the multiplication through the canonical signed number coding circuit to obtain the target code. Optionally, the target code obtained by canonical signed number encoding contains three kinds of values: -1, 0 and 1.
Optionally, the step in S2021 of performing canonical signed number encoding on the pending data to obtain the target code may include: converting l consecutive bits of value 1 in the pending data into (l+1) bits in which the highest bit is 1, the lowest bit is -1 and the remaining bits are 0, thereby obtaining the target code, where l is greater than or equal to 2.
It should be noted that the canonical signed number encoding may be characterized as follows. An N-bit multiplier operand is processed from the low-order bits to the high-order bits. If there are l (l >= 2) consecutive bits of value 1, those l bits are converted into the data "1(0)^(l-1)(-1)", and the remaining (N-l) bits are combined with the (l+1) converted bits to obtain new data. The new data is then used as the initial data of the next conversion stage, until the data obtained after conversion contains no l (l >= 2) consecutive bits of value 1. The bit width of the target code obtained by encoding an N-bit multiplier operand in this way may be equal to (N+1).
S2022, the original partial products are obtained according to the pending data and the target code.
It should be noted that the number of original partial products may be equal to the bit width of the target code.
For example, if the original partial product acquiring unit receives an 8-bit multiplicand x7x6x5x4x3x2x1x0 (i.e. X), it may directly obtain the corresponding original partial products from the multiplicand X and the three values -1, 0 and 1 contained in the target code: when a digit of the target code is -1, the original partial product may be -X; when the digit is 0, the original partial product may be 0; and when the digit is 1, the original partial product may be X.
In the data processing method provided in this embodiment, canonical signed number encoding is performed on the pending data to obtain a target code, the original partial products are obtained according to the pending data and the target code, sign-bit extension is then eliminated from the original partial products, and cumulative correction processing is performed on the resulting partial products to obtain the target operation result of the multiplication. By encoding the received data with the canonical signed number coding circuit, the method reduces the number of effective partial products obtained during multiplication and thus the complexity of the multiplication; it can also improve the operation efficiency of the multiplication and effectively reduce the power consumption of the multiplier.
In the data processing method provided by another embodiment, performing logical operation processing on the original partial products and eliminating the sign extension bits to obtain the partial products with sign-bit extension eliminated in S203 above includes: performing AND logic operation processing on the highest bit of each original partial product, and eliminating the sign extension bits to obtain the partial products with sign-bit extension eliminated.
Specifically, the multiplier may, through the logic gate circuit in the partial product obtaining circuit, perform an AND logic operation on the highest bit of the original partial product to obtain the second-highest bit and the highest bit of the partial product with sign-bit extension eliminated. In addition, the multiplier may also, through the logic gate circuit in the partial product obtaining circuit, perform an AND logic operation on the highest bit of the original partial product and the signal 1 to obtain the additional bit Q of the partial product with sign-bit extension eliminated and its second-highest bit (i.e. the bit one position below Q).
In the data processing method provided in this embodiment, the pending data is processed to obtain the original partial products, an AND logic operation is performed on the highest bit of each original partial product, and the sign extension bits are eliminated to obtain the partial products with sign-bit extension eliminated, so that the power consumption of the multiplier can be effectively reduced.
The embodiment of the present application also provides a machine learning arithmetic device, which includes one or more of the multipliers mentioned in this application and is used to obtain data to be operated on and control information from other processing units, execute the specified machine learning operation, and pass the execution result to peripheral equipment through an I/O interface. Peripheral equipment includes, for example, cameras, displays, mice, keyboards, network cards, wifi interfaces and servers. When more than one multiplier is included, the multipliers may be linked and transmit data through a specific structure, for example interconnected through a PCIE bus, to support larger-scale machine learning operations. In this case, the multipliers may share one control system or have independent control systems; they may share memory, or each accelerator may have its own memory. In addition, the interconnection may use any interconnection topology.
The machine learning arithmetic device has high compatibility and can be connected to various types of servers through a PCIE interface.
The embodiment of the present application also provides a combined processing device, which includes the above machine learning arithmetic device, a general interconnection interface and other processing units. The machine learning arithmetic device interacts with the other processing units to jointly complete the operation specified by the user. Fig. 9 is a schematic diagram of the combined processing device.
The other processing units include one or more processor types among general-purpose and special-purpose processors such as a central processing unit (CPU), a graphics processing unit (GPU) and a neural network processor. The number of processors included in the other processing units is not limited. The other processing units serve as the interface between the machine learning arithmetic device and external data and control, including data transfer, and perform basic control of the machine learning arithmetic device such as starting and stopping; the other processing units may also cooperate with the machine learning arithmetic device to complete computing tasks.
The general interconnection interface is used to transmit data and control instructions between the machine learning arithmetic device and the other processing units. The machine learning arithmetic device obtains the required input data from the other processing units and writes it to the on-chip storage of the machine learning arithmetic device; it may obtain control instructions from the other processing units and write them to the on-chip control cache of the machine learning arithmetic device; it may also read the data in the memory module of the machine learning arithmetic device and transmit it to the other processing units.
Optionally, as shown in Fig. 10, the structure may further include a storage device, which is connected to the machine learning arithmetic device and to the other processing units respectively. The storage device is used to store data of the machine learning arithmetic device and the other processing units, and is particularly suitable for data required for an operation that cannot be held entirely in the internal storage of the machine learning arithmetic device or the other processing units.
The combined processing device can serve as the system on chip (SoC) of equipment such as mobile phones, robots, unmanned aerial vehicles and video surveillance equipment, effectively reducing the die area of the control section, increasing processing speed and reducing overall power consumption. In this case, the general interconnection interface of the combined processing device is connected to certain components of the equipment, such as a camera, display, mouse, keyboard, network card or wifi interface.
In some embodiments, a chip is also provided, which includes the above machine learning arithmetic device or combined processing device.
In some embodiments, a chip package structure is provided, which includes the above chip.
In some embodiments, a board is provided, which includes the above chip package structure. As shown in Fig. 11, in addition to the above chip 389, the board may include other supporting components, which include but are not limited to: a memory device 390, a reception device 391 and a control device 392;
The memory device 390 is connected to the chip in the chip package structure through a bus and is used to store data. The memory device may include multiple groups of storage units 393. Each group of storage units is connected to the chip through a bus. It can be understood that each group of storage units may be DDR SDRAM (Double Data Rate SDRAM, double data rate synchronous dynamic random access memory).
DDR can double the speed of SDRAM without raising the clock frequency. DDR allows data to be read on both the rising edge and the falling edge of the clock pulse, so the speed of DDR is twice that of standard SDRAM. In one embodiment, the memory device may include 4 groups of the storage units. Each group of storage units may include multiple DDR4 particles (chips). In one embodiment, the chip may internally include four 72-bit DDR4 controllers, in which 64 bits are used for data transmission and 8 bits are used for ECC checking. It can be understood that when DDR4-3200 particles are used in each group of storage units, the theoretical data transmission bandwidth can reach 25600MB/s.
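The 25600MB/s figure can be reproduced with simple arithmetic, sketched below in Python. The sketch assumes that DDR4-3200 denotes 3200 million transfers per second and that only the 64 data bits of each 72-bit controller count toward bandwidth; the function name `ddr_peak_bandwidth_mb_s` is hypothetical.

```python
def ddr_peak_bandwidth_mb_s(mega_transfers_per_s, data_bits):
    """Theoretical peak bandwidth in MB/s: transfers per second times
    bytes per transfer (ECC bits are excluded by the caller)."""
    return mega_transfers_per_s * data_bits // 8


# DDR4-3200, 64 data bits per 72-bit controller
print(ddr_peak_bandwidth_mb_s(3200, 64))  # 25600
```

The figure applies to one group of storage units behind one controller.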
In one embodiment, each group of storage units includes multiple double data rate synchronous dynamic random access memories arranged in parallel. DDR can transmit data twice within one clock cycle. A controller for controlling the DDR is provided in the chip to control the data transmission and data storage of each storage unit.
The reception device is electrically connected to the chip in the chip package structure. The reception device is used to implement data transmission between the chip and external equipment (such as a server or computer). For example, in one embodiment, the reception device may be a standard PCIE interface: the server transfers the data to be processed to the chip through the standard PCIE interface, thereby implementing the data transfer. Preferably, when a PCIE 3.0 X16 interface is used for transmission, the theoretical bandwidth can reach 16000MB/s. In another embodiment, the reception device may also be another interface; the application does not limit the specific form of such other interfaces, as long as the interface unit can implement the transfer function. In addition, the calculation result of the chip is sent back to the external equipment (such as a server) by the reception device.
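The 16000MB/s figure corresponds to the raw signalling rate of a PCIE 3.0 X16 link, as the following Python sketch shows. It assumes 8 GT/s per lane for PCIE 3.0 and, for the second value, the 128b/130b line encoding of that generation; the function name `pcie_bandwidth_mb_s` and its parameters are hypothetical.

```python
def pcie_bandwidth_mb_s(giga_transfers_per_s, lanes,
                        payload_bits=1, encoded_bits=1):
    """Per-direction bandwidth in MB/s. The payload/encoded ratio models
    line-encoding overhead (1/1 means raw signalling rate)."""
    raw = giga_transfers_per_s * 1000 * lanes / 8
    return raw * payload_bits / encoded_bits


print(pcie_bandwidth_mb_s(8, 16))            # raw rate: 16000.0
print(pcie_bandwidth_mb_s(8, 16, 128, 130))  # after 128b/130b encoding
```

With the encoding overhead included, the usable rate is slightly lower than the raw figure, at roughly 15750MB/s.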
The control device is electrically connected to the chip and is used to monitor the state of the chip. Specifically, the chip and the control device may be electrically connected through an SPI interface. The control device may include a microcontroller unit (Micro Controller Unit, MCU). For example, the chip may include multiple processing chips, multiple processing cores or multiple processing circuits, and can drive multiple loads. The chip may therefore be in different working states such as multi-load and light-load. The control device can regulate the working states of the multiple processing chips, multiple processing cores and/or multiple processing circuits in the chip.
In some embodiments, an electronic equipment is provided, which includes the above board.
Electronic equipment can be multiplier, robot, computer, printer, scanner, tablet computer, intelligent terminal, hand Machine, automobile data recorder, navigator, sensor, camera, server, cloud server, camera, video camera, projector, wrist-watch, Earphone, mobile storage, wearable device, the vehicles, household electrical appliance, and/or Medical Devices.
The vehicles include aircraft, steamer and/or vehicle;The household electrical appliance include TV, air-conditioning, micro-wave oven, Refrigerator, electric cooker, humidifier, washing machine, electric light, gas-cooker, kitchen ventilator;The Medical Devices include Nuclear Magnetic Resonance, B ultrasound instrument And/or electrocardiograph.
It should be noted that, for simplicity of description, the foregoing embodiments are each described as a series of circuit combinations, but those skilled in the art should understand that the application is not limited by the described circuit combinations, because according to this application certain circuits may be implemented in other ways or with other structures. Furthermore, those skilled in the art should also understand that the embodiments described in this specification are optional embodiments, and the devices and modules involved are not necessarily required by this application.
In the above embodiments, the descriptions of the embodiments each have their own emphasis; for parts not described in detail in one embodiment, refer to the related descriptions of the other embodiments.
The above embodiments express only several implementations of the application, and their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the patent. It should be pointed out that those of ordinary skill in the art can make various modifications and improvements without departing from the concept of this application, and these all fall within the scope of protection of the application. Therefore, the scope of protection of this patent shall be subject to the appended claims.

Claims (19)

1. a kind of multiplier, which is characterized in that the multiplier includes: canonical signed number coding circuit, partial product acquisition electricity Road and amendment summation circuit, wherein the output end of the canonical signed number coding circuit and the partial product obtain circuit Input terminal connection, the output end that the partial product obtains circuit are connect with the input terminal of the amendment summation circuit;
The canonical signed number coding circuit is used to carry out the data received canonical signed number and encodes to obtain target volume Code, the partial product obtain circuit and are used to obtain initial protion product according to the target code, and according to initial protion product Logical operation process, the partial product after the symbol Bits Expanding that is eliminated are carried out, the amendment summation circuit is used for the elimination Partial product after symbol Bits Expanding carries out cumulative correcting process.
2. multiplier according to claim 1, which is characterized in that the canonical signed number coding circuit includes: data Input port and target code output port;The data-in port is used to receive the of progress canonical signed number coding One data, the target code output port, which is used to export, carries out canonical signed number coding to first data received The target code obtained afterwards.
3. multiplier according to claim 1 or 2, which is characterized in that it includes initial protion that the partial product, which obtains circuit, Product acquiring unit and logic gate, the initial protion product acquiring unit are used to obtain initial protion according to target code Product, the logic gate are used to carry out logical operation process to the high double figures value of initial protion product, and be eliminated symbol Partial product after number Bits Expanding.
4. multiplier according to any one of claim 1 to 3, which is characterized in that the partial product obtains circuit and includes AND gate circuit.
5. multiplier according to any one of claim 1 to 4, which is characterized in that the amendment summation circuit includes: to repair Positive Wallace tree group sub-circuit and cumulative sub-circuit, the output end of the amendment Wallace tree group sub-circuit and the cumulative son electricity The input terminal on road connects;
Wherein, the amendment Wallace tree group sub-circuit is used to carry out cumulative repair to the partial product after the elimination symbol Bits Expanding Positive processing, the cumulative sub-circuit are used to carry out accumulation process to the cumulative amendment operation result.
6. multiplier according to claim 5, which is characterized in that the amendment Wallace tree group sub-circuit includes: Hua Lai Scholar's tree unit, the Wallace tree unit are used to carry out each columns value of the partial product after the elimination symbol Bits Expanding tired Add correcting process.
7. The multiplier according to claim 5 or 6, characterized in that the cumulative sub-circuit comprises an adder, the adder being configured to perform an addition operation on the cumulative amendment operation result.
8. The multiplier according to claim 7, characterized in that the adder comprises a carry signal input port, a sum-bit signal input port, and a result output port; the carry signal input port is configured to receive a carry signal, the sum-bit signal input port is configured to receive a sum-bit signal, and the result output port is configured to output the target operation result obtained by accumulating the carry signal and the sum-bit signal.
9. A data processing method, characterized in that the method comprises:
Receiving pending data;
Performing canonical signed number coding on the pending data to obtain an initial partial product;
Performing logical operation processing according to the initial partial product to eliminate the sign-extension bits and obtain the partial product after sign-extension elimination;
Performing cumulative amendment processing on the partial product after sign-extension elimination to obtain a target operation result.
10. The method according to claim 9, characterized in that performing canonical signed number coding on the pending data to obtain the initial partial product comprises:
Performing canonical signed number coding on the pending data to obtain a target code;
Obtaining the initial partial product according to the pending data and the target code.
11. The method according to claim 10, characterized in that performing canonical signed number coding on the pending data to obtain the target code comprises: converting l consecutive bits of value 1 in the pending data into (l+1) bits in which the highest bit is 1, the lowest bit is -1, and the remaining bits are 0, thereby obtaining the target code, where l is greater than or equal to 2.
12. The method according to any one of claims 9 to 11, characterized in that performing logical operation processing according to the initial partial product to eliminate the sign-extension bits and obtain the partial product after sign-extension elimination comprises: performing AND logic operation processing on the highest-order bit value of the initial partial product, thereby eliminating the sign-extension bits and obtaining the partial product after sign-extension elimination.
13. A machine learning arithmetic unit, characterized in that the machine learning arithmetic unit comprises one or more multipliers according to any one of claims 1 to 8, and is configured to obtain input data to be operated on and control information from other processing units, execute a specified machine learning operation, and pass the execution result to the other processing units through an I/O interface;
When the machine learning arithmetic unit comprises a plurality of the multipliers, the plurality of computing devices are connected to one another through a preset specific structure and transmit data;
Wherein the plurality of multipliers are interconnected and transmit data through a PCIE bus to support larger-scale machine learning operations; the plurality of multipliers share the same control system or have their own control systems; the plurality of multipliers share memory or have their own memories; and the interconnection mode of the plurality of multipliers is any interconnection topology.
14. A combined treatment device, characterized in that the combined treatment device comprises the machine learning arithmetic unit according to claim 13, a general interconnection interface, and other processing units;
The machine learning arithmetic unit interacts with the other processing units to jointly complete the computing operation specified by the user.
15. The combined treatment device according to claim 14, characterized by further comprising a storage device, the storage device being connected to the machine learning arithmetic unit and to the other processing units respectively, and configured to store data of the machine learning arithmetic unit and the other processing units.
16. A neural network chip, characterized in that the machine learning chip comprises the machine learning arithmetic unit according to claim 13, or the combined treatment device according to claim 14, or the combined treatment device according to claim 15.
17. Electronic equipment, characterized in that the electronic equipment comprises the chip according to claim 16.
18. A board, characterized in that the board comprises a memory device, a reception device, a control device, and the neural network chip according to claim 16;
Wherein the neural network chip is connected to the memory device, the control device, and the reception device respectively;
The memory device is configured to store data;
The reception device is configured to implement data transmission between the chip and an external device;
The control device is configured to monitor the state of the chip.
19. The board according to claim 18, characterized in that:
The memory device comprises multiple groups of storage units, each group of storage units being connected to the chip by a bus, the storage units being DDR SDRAM;
The chip comprises a DDR controller for controlling data transmission to and data storage in each storage unit;
The reception device is a standard PCIE interface.
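
As a rough illustration of the method of claims 9 to 11 and the two-stage accumulation of claims 5 to 8, the following Python sketch encodes a multiplier with the run-conversion rule of claim 11, forms one signed partial product per nonzero digit, compresses the partial products with carry-save steps, and adds the last two rows. It is a sketch under stated assumptions, not the patented circuit: the names `csd_encode`, `carry_save_reduce`, and `csd_multiply` are hypothetical; the multiplier is assumed to be an unsigned `width`-bit value; the rule of claim 11 is assumed to be applied repeatedly from the lowest bit until no run of two or more 1s remains; a plain 3:2 carry-save reduction stands in for the amendment Wallace tree group sub-circuit; and the sign-extension elimination of claims 3 and 12 is not modeled, because Python integers carry their sign implicitly.

```python
def csd_encode(value, width):
    """Return signed digits (-1, 0, 1), least significant digit first."""
    if not 0 <= value < (1 << width):
        raise ValueError("value must fit in `width` unsigned bits")
    # One extra digit on top holds the 1 produced by a run that reaches the MSB.
    digits = [(value >> i) & 1 for i in range(width)] + [0]
    i = 0
    while i < width:
        if digits[i] == 1 and digits[i + 1] == 1:
            j = i
            while digits[j] == 1:      # find the end of the run of l >= 2 ones
                j += 1
            digits[i] = -1             # lowest digit of the run becomes -1
            for k in range(i + 1, j):  # digits in between become 0
                digits[k] = 0
            digits[j] = 1              # digit just above the run becomes 1
            i = j                      # the new 1 may start another run
        else:
            i += 1
    return digits


def carry_save_reduce(rows):
    """Compress rows with 3:2 carry-save steps until at most two remain."""
    rows = list(rows)
    while len(rows) > 2:
        reduced = []
        for k in range(0, len(rows) - 2, 3):
            a, b, c = rows[k:k + 3]
            reduced.append(a ^ b ^ c)                           # sum bits
            reduced.append(((a & b) | (a & c) | (b & c)) << 1)  # carry bits
        reduced.extend(rows[len(rows) - len(rows) % 3:])        # leftover rows pass through
        rows = reduced
    rows += [0] * (2 - len(rows))                               # pad to exactly two rows
    return rows[0], rows[1]


def csd_multiply(x, y, width=8):
    """Multiply signed x by unsigned `width`-bit y via signed-digit partial products."""
    digits = csd_encode(y, width)
    # Each nonzero digit selects +x or -x, shifted to the digit's position.
    partial_products = [(d * x) << i for i, d in enumerate(digits) if d != 0]
    row_a, row_b = carry_save_reduce(partial_products)
    return row_a + row_b               # final addition of the two remaining rows


print(csd_encode(7, 4))        # [-1, 0, 0, 1, 0], i.e. 8 - 1
print(csd_multiply(-5, 7, 4))  # -35
```

In this sketch the value 7 (binary 0111) needs two nonzero digits instead of three, which is the reduction in partial products that the coding is meant to provide.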
CN201910817905.0A 2019-08-30 2019-08-30 Multiplier, data processing method, chip and electronic equipment Active CN110515586B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910817905.0A CN110515586B (en) 2019-08-30 2019-08-30 Multiplier, data processing method, chip and electronic equipment

Publications (2)

Publication Number Publication Date
CN110515586A (en) 2019-11-29
CN110515586B (en) 2024-04-09

Family

ID=68628727

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910817905.0A Active CN110515586B (en) 2019-08-30 2019-08-30 Multiplier, data processing method, chip and electronic equipment

Country Status (1)

Country Link
CN (1) CN110515586B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113031911A (en) * 2019-12-24 2021-06-25 上海寒武纪信息科技有限公司 Multiplier, data processing method, device and chip

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4864528A (en) * 1986-07-18 1989-09-05 Matsushita Electric Industrial Co., Ltd. Arithmetic processor and multiplier using redundant signed digit arithmetic
US4967388A (en) * 1988-04-21 1990-10-30 Harris Semiconductor Patents Inc. Truncated product partial canonical signed digit multiplier
US20030220956A1 (en) * 2002-05-22 2003-11-27 Broadcom Corporation Low-error canonic-signed-digit fixed-width multiplier, and method for designing same
US20070083581A1 (en) * 2005-06-12 2007-04-12 Kim Jung B Multiplierless FIR digital filter and method of designing the same
US20070180015A1 (en) * 2005-12-09 2007-08-02 Sang-In Cho High speed low power fixed-point multiplier and method thereof
CN105183424A (en) * 2015-08-21 2015-12-23 电子科技大学 Fixed-bit-width multiplier with high accuracy and low energy consumption properties
CN108304341A (en) * 2018-03-13 2018-07-20 算丰科技(北京)有限公司 AI chip high speeds transmission architecture, AI operations board and server

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
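万超 et al.: "VLSI implementation of a high-speed digital FIR filter", 《合肥工业大学学报(自然科学版)》, vol. 31, no. 5, pages 736 - 739 *
万超;尹勇生;邓红辉;: "Optimized design of the multiply-accumulate unit in a high-speed FIR filter", 仪器仪表用户, no. 02, 8 April 2008 (2008-04-08) *
熊承义, 高志荣, 田金文: "Efficient VLSI design of constant-coefficient multipliers", 军民两用技术与产品, no. 09, 21 September 2003 (2003-09-21) *
王瑞光 et al.: "Design of a 16-bit parallel multiplier based on CSD coding", 《微计算机信息》, vol. 24, no. 23, pages 75 - 76 *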

Also Published As

Publication number Publication date
CN110515586B (en) 2024-04-09

Similar Documents

Publication Publication Date Title
CN109740739A (en) Neural computing device, neural computing method and Related product
CN110163357A (en) A kind of computing device and method
CN111008003B (en) Data processor, method, chip and electronic equipment
CN109740754A (en) Neural computing device, neural computing method and Related product
CN110515589A (en) Multiplier, data processing method, chip and electronic equipment
CN110362293B (en) Multiplier, data processing method, chip and electronic equipment
CN110515587A (en) Multiplier, data processing method, chip and electronic equipment
CN110515590A (en) Multiplier, data processing method, chip and electronic equipment
CN110554854B (en) Data processor, method, chip and electronic equipment
CN109670581A (en) A kind of computing device and board
CN110531954A (en) Multiplier, data processing method, chip and electronic equipment
CN109711540A (en) A kind of computing device and board
CN111258541B (en) Multiplier, data processing method, chip and electronic equipment
CN110515586A (en) Multiplier, data processing method, chip and electronic equipment
CN110515588A (en) Multiplier, data processing method, chip and electronic equipment
CN110378478A (en) Multiplier, data processing method, chip and electronic equipment
CN210109863U (en) Multiplier, device, neural network chip and electronic equipment
CN110688087B (en) Data processor, method, chip and electronic equipment
CN111258542B (en) Multiplier, data processing method, chip and electronic equipment
CN110647307B (en) Data processor, method, chip and electronic equipment
CN110378477A (en) Multiplier, data processing method, chip and electronic equipment
CN110515585A (en) Multiplier, data processing method, chip and electronic equipment
CN113031909B (en) Data processor, method, device and chip
CN210006083U (en) Multiplier, device, chip and electronic equipment
CN210006082U (en) Multiplier, device, neural network chip and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant