CN108345938A - Neural network processor comprising a bit conversion device, and method thereof - Google Patents
Neural network processor comprising a bit conversion device, and method thereof
- Publication number
- CN108345938A CN108345938A CN201810170612.3A CN201810170612A CN108345938A CN 108345938 A CN108345938 A CN 108345938A CN 201810170612 A CN201810170612 A CN 201810170612A CN 108345938 A CN108345938 A CN 108345938A
- Authority
- CN
- China
- Prior art keywords
- bit
- bit conversion
- neural network
- data
- raw data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/048—Activation functions
Abstract
The present invention provides a neural network processor and a method of using the neural network processor to perform bit conversion on the data of a neural network. The neural network processor includes a bit conversion device, which comprises: an input interface, a control unit, a data conversion unit, and an output interface. The control unit is configured to generate a control signal for the data conversion unit; the input interface is configured to receive raw data; the data conversion unit is configured to perform bit conversion on the raw data according to the control signal, converting the raw data into a bit conversion result expressed with fewer bits; and the output interface is configured to output the bit conversion result from the bit conversion device. The invention can reduce the number of bits used to represent data, lowering the hardware cost and energy consumption of computation and increasing computation speed.
Description
Technical field
The present invention relates to the field of artificial intelligence, and in particular to improvements to neural network processors.
Background art
Deep learning, a branch of artificial intelligence, has developed rapidly in recent years. It has been widely applied to high-level abstract cognitive problems such as image recognition, speech recognition, natural language understanding, weather prediction, gene expression analysis, content recommendation, and intelligent robotics, and has demonstrated outstanding performance in these fields. This has made the development and improvement of artificial intelligence technology a research focus in both academia and industry.
Deep neural networks are among the most advanced perception models in the field of artificial intelligence. Such networks simulate the neural connection structure of the human brain by building a model that describes data features through multiple layered transformation stages, bringing breakthroughs to processing tasks on large-scale data such as images, video, and audio. A deep neural network model is a computational model comprising a large number of nodes, called the neurons of the deep neural network, connected in a mesh structure. The strength of the connection between two nodes represents the weight of the signal passing between them, i.e. the weight, corresponding to memory in a biological neural network.
Special-purpose processors for neural network computation, i.e. neural network processors, have developed accordingly. In practical neural network computation, operations such as convolution, activation, and pooling must be performed repeatedly on large amounts of data, which consumes an extremely large amount of computation time and seriously degrades the user experience. Reducing the computation time of neural networks has therefore become an important direction for improving neural network processors.
Summary of the invention
It is therefore an object of the present invention to overcome the above defects of the prior art by providing a neural network processor. The neural network processor includes a bit conversion device, which comprises:
an input interface, a control unit, a data conversion unit, and an output interface;
wherein
the control unit is configured to generate a control signal for the data conversion unit;
the input interface is configured to receive raw data;
the data conversion unit is configured to perform bit conversion on the raw data according to the control signal, converting the raw data into a bit conversion result expressed with fewer bits;
the output interface is configured to output the bit conversion result from the bit conversion device.
Preferably, in the neural network processor, the control unit is configured to determine the rule for performing bit conversion according to a preset parameter or an input parameter, so as to generate the control signal;
wherein the parameter includes information about the number of bits of the raw data and the number of bits of the bit conversion result.
Preferably, in the neural network processor, the data conversion unit is configured to determine, according to the control signal, the reserved bits and the truncated bits of the raw data, and to determine the bit conversion result from the reserved bits of the raw data and the highest truncated bit of the raw data.
Preferably, in the neural network processor, the data conversion unit is configured to determine, according to the control signal, the reserved bits and the truncated bits of the raw data, and to use the reserved bits of the raw data as the bit conversion result.
Preferably, in the neural network processor, the data conversion unit is configured to perform bit conversion on the raw data according to the control signal, converting the raw data into a bit conversion result expressed with half the original number of bits.
A method of performing bit conversion on the data of a neural network using any of the neural network processors described above, comprising:
1) the control unit generates a control signal for the data conversion unit;
2) the input interface receives, from outside the bit conversion device, the raw data on which bit conversion needs to be performed;
3) the data conversion unit performs bit conversion on the raw data according to the control signal, converting the raw data into a bit conversion result expressed with fewer bits;
4) the output interface outputs the bit conversion result from the bit conversion device.
Preferably, in the method, step 1) comprises:
1-1) the control unit determines the rule for performing bit conversion according to a preset parameter or an input parameter;
1-2) the control unit generates a control signal corresponding to the rule;
wherein the parameter includes information about the number of bits of the raw data and the number of bits of the bit conversion result.
Preferably, in the method, step 3) comprises:
the data conversion unit determines the bit conversion result, according to the control signal, from the reserved bits of the raw data and the highest truncated bit of the raw data.
Preferably, in the method, step 3) comprises:
the data conversion unit uses, according to the control signal, the reserved bits of the raw data as the bit conversion result.
Preferably, in the method, when the caching of the neural network data has completed but the convolution operation has not yet been performed, the cached neural network data is input into the bit conversion device to perform steps 1)-4); or, when the convolution operation on the data has completed but the activation operation has not, the result of the convolution operation is input into the bit conversion device to perform steps 1)-4).
A computer-readable storage medium storing a computer program which, when executed, implements any of the methods described above.
Compared with the prior art, the advantages of the present invention are as follows:
The present invention provides a bit conversion device for a neural network processor that can be used to adjust the number of bits used to represent data in the various computation stages of a neural network. By reducing the number of bits used to express data, the hardware cost of computation can be reduced, computation speed increased, the neural network processor's demand for data storage space lowered, and the energy consumption of neural network computation reduced.
Description of the drawings
Embodiments of the present invention are further described below with reference to the drawings, in which:
Fig. 1 shows a block diagram of a bit conversion device according to an embodiment of the invention;
Fig. 2 shows the interconnection of the units in a bit conversion device according to an embodiment of the invention;
Fig. 3 shows the flow of a method of performing bit conversion on neural network data using the bit conversion device of Fig. 1, according to an embodiment of the invention;
Fig. 4a shows the hardware structure with which the data conversion unit of the bit conversion device performs bit conversion in "round-up mode", according to an embodiment of the invention;
Fig. 4b shows the hardware structure with which the data conversion unit of the bit conversion device performs bit conversion in "direct truncation mode", according to an embodiment of the invention.
Detailed description of embodiments
The present invention is described in detail below with reference to the drawings and specific embodiments.
As noted above, it is desirable when designing a neural network processor to reduce the computation time of the neural network. To this end, the inventors propose appropriately reducing the number of bits of the data participating in neural network computation, i.e. using fewer bits to represent data that originally required more, thereby reducing the amount of computation and hence the computation time of the neural network. The reason is that, in studying the prior art, the inventors found that the intermediate results of neural network algorithms have relatively high fault tolerance: although representing data with fewer bits changes the precision of the data participating in computation and thus affects the accuracy of the intermediate results, this does not significantly affect the final output of the neural network.
In the present invention, this way of reducing the number of bits used in computation is referred to as a "trimming operation" on the data, and the process of adjusting the number of binary bits needed to express a value is referred to as "bit conversion". For example, the decimal value 0.5 represented as Q7 fixed-point data is 01000000 (Q7 uses the leftmost of 8 bits as the sign bit and the remaining 7 bits for the fractional part, and can thus represent decimals between -1 and 1 with 7 bits of precision). Bit conversion can change this Q7 representation to a Q3 representation, giving the result 0100 (like Q7, Q3 uses the leftmost bit as the sign bit; the difference is that it uses 3 bits for the fractional part, representing decimals between -1 and 1 with 3 bits of precision).
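The Q-format arithmetic above can be sketched in a few lines of Python (an illustration, not part of the patent; `encode_q` and `decode_q` are names chosen here, and only non-negative values are handled for simplicity):

```python
def encode_q(value, frac_bits):
    # Encode a non-negative decimal in [0, 1) as a sign bit (0) followed by
    # `frac_bits` fraction bits, returned as a bit string. Q7 -> 8 bits total.
    assert 0 <= value < 1
    raw = int(value * (1 << frac_bits))  # scale and truncate to the grid
    return format(raw, "0%db" % (frac_bits + 1))

def decode_q(bits):
    # Inverse: the first bit is the sign (0 = non-negative), the rest the fraction.
    frac_bits = len(bits) - 1
    return int(bits[1:], 2) / (1 << frac_bits)

print(encode_q(0.5, 7))  # Q7: 01000000
print(encode_q(0.5, 3))  # Q3: 0100
```

This reproduces the example in the text: 0.5 is 01000000 in Q7 and 0100 in Q3.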
Based on the above analysis, the present invention proposes a bit conversion device for a neural network processor. The bit conversion device can determine the rule for performing bit conversion according to preset or user-input parameters, and perform bit conversion on data accordingly. Through such conversion, the neural network processor can operate on smaller amounts of data, increasing processing speed and reducing the energy consumption of the neural network processor. The inventors observe that in combinational logic circuits the speed of a data operation is inversely proportional to the number of bits in the numeric representation, while the energy consumption of the operation is directly proportional to that number; therefore, performing bit conversion on data achieves both faster computation and reduced power consumption.
Fig. 1 shows a bit conversion device 101 according to an embodiment of the invention, comprising: an input bus unit 102 serving as the input interface, a data conversion unit 103, an output bus unit 104 serving as the output interface, and a control unit 105.
The input bus unit 102 obtains the neural network data on which bit conversion needs to be performed and provides it to the data conversion unit 103. In some embodiments, the input bus unit 102 can receive and/or transmit multiple data items to be converted in parallel.
The data conversion unit 103 performs bit conversion on the neural network data from the input bus unit 102, according to the bit conversion rule determined, for example, from preset parameters or parameters input by the user.
The output bus unit 104 outputs the bit conversion result produced by the data conversion unit 103 from the bit conversion device 101, so that it can be provided to the device in the neural network processor that performs subsequent processing.
The control unit 105 determines the bit conversion rule, selects the corresponding bit conversion mode, and controls the data conversion unit 103 as it performs bit conversion. The control unit 105 can determine the rule for performing bit conversion by analyzing preset parameters or parameters input by the user, and select accordingly from preconfigured conversion modes. The parameters here may include the bit widths of the data before and after conversion, or the binary representation used by the data to be converted and the representation desired for the converted data, such as Q7, Q3, etc. For example, according to user-input parameters, it may be determined that neural network data represented in Q7 is to be converted to a Q3 representation. When reducing the number of bits used, a "round-up" approach may be adopted, e.g. converting 01011000 to 0110, or a "direct truncation" approach, e.g. converting 01011000 to 0101. Conversion approaches such as "round-up" and "direct truncation" can either be input by the user or be fixed in advance.
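The two truncation rules can be sketched as follows (a minimal Python illustration under the assumptions of the text; the function names are chosen here, and overflow of the kept bits during rounding is not handled):

```python
def direct_truncate(bits, keep):
    # "Direct truncation": keep only the `keep` highest bits, drop the rest.
    return bits[:keep]

def round_up(bits, keep):
    # "Round-up": add the highest dropped bit to the kept bits, i.e.
    # round to the nearest representable value at the reduced precision.
    kept = int(bits[:keep], 2)
    carry = int(bits[keep])  # highest bit of the part being dropped
    return format(kept + carry, "0%db" % keep)

print(round_up("01011000", 4))         # 0110
print(direct_truncate("01011000", 4))  # 0101
```

This matches the example above: 01011000 becomes 0110 under round-up and 0101 under direct truncation.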
In some embodiments, the input bus unit 102 and/or the output bus unit 104 can receive and/or transmit multiple data items to be converted in parallel.
Fig. 2 shows the interconnection of the units in a bit conversion device according to an embodiment of the invention, in which the input bus is 128 bits wide and the output bus is 64 bits wide. The control unit receives user-input parameters from outside the bit conversion device and uses them to generate, according to the bit conversion rule it determines, a mode selection signal for the data conversion unit, informing the data conversion unit which mode it should use to perform bit conversion in the current situation. The control unit can also generate an input control signal that makes the input bus unit start or pause receiving data, and an output control signal that makes the output bus unit start or pause outputting bit conversion results.
An embodiment of the process of performing bit conversion on neural network data using the bit conversion device of Fig. 1 is described below. Referring to Fig. 3, the method comprises:
Step 1. The control unit 105 of the bit conversion device 101 determines the bit conversion rule to be used, based on preset conversion-requirement parameters or parameters input by the user. The preset conversion-requirement parameters or the user-input parameters include information about the number of bits of the neural network data to be converted and the number of bits of the converted data. They may also include the truncation rule to be applied during bit conversion, such as "round-up" or "direct truncation".
Based on this rule, the control unit 105 can select among preconfigured bit conversion modes. According to one embodiment of the invention, the bit conversion modes include a "round-up mode" and a "direct truncation mode"; the processing in these two modes is described in the subsequent steps.
Step 2. The input bus unit 102 of the bit conversion device 101 provides the neural network data it has obtained, on which bit conversion needs to be performed, to the data conversion unit 103.
The input bus unit 102 here may include multiple interfaces capable of receiving data in parallel, so as to receive in parallel, from outside the bit conversion device 101, the neural network data on which bit conversion needs to be performed. Similarly, the input bus unit 102 may also include multiple interfaces capable of outputting data in parallel, so as to provide data in parallel to the data conversion unit 103 for bit conversion.
Step 3. The data conversion unit 103 performs bit conversion on the neural network data requiring it, according to the bit conversion rule determined by the control unit 105.
In this step, the data conversion unit 103 can receive the control signal from the control unit 105 and perform bit conversion according to the rule.
The inventors found that when the number of bits used in computation is reduced, a compromise between the hardware cost, processing speed, and accuracy of the neural network processor is reached if the bit width after reduction is greater than or equal to half the bit width of the original data. In a preferred embodiment of the invention, therefore, the bit width of the neural network data requiring bit conversion is reduced to half the original, using, for example, a fixed hardware structure that converts 32-bit data to 16 bits, 16-bit data to 8 bits, 8-bit data to 4 bits, 4-bit data to 2 bits, and 2-bit data to 1 bit.
During bit conversion, the bits of the neural network data requiring conversion can be divided, according to the rule, into reserved bits and truncated bits, where the reserved bits are the higher one or more bits of the neural network data and the truncated bits are the remaining bits. For example, for the 8-bit data 10101111, reducing the bit width to half the original gives reserved bits 1010 and truncated bits 1111.
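The split into reserved and truncated bits amounts to partitioning the bit string at half its width (illustrative Python; `split_bits` is a name used here, not in the patent):

```python
def split_bits(bits):
    # Partition into reserved (high half) and truncated (low half) parts,
    # per the preferred embodiment that halves the bit width.
    keep = len(bits) // 2
    return bits[:keep], bits[keep:]

print(split_bits("10101111"))  # ('1010', '1111')
```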
Fig. 4a shows, according to an embodiment of the invention, the hardware structure with which the data conversion unit 103 performs bit conversion in "round-up mode". Sixteen items of 8-bit neural network data requiring bit conversion are input to the data conversion unit 103 in parallel. For each 8-bit item, the bits of its 4-bit reserved part other than the sign bit (e.g. a1, a2, a3) and the corresponding highest bit of its truncated part (e.g. a4) serve as the two inputs of an adder; the output of the adder together with the sign bit of the neural network data forms the bit conversion result for that 8-bit item.
Illustrating with Fig. 4a: in "round-up mode", suppose the neural network data input to the data conversion unit 103 is 10101111 (two's complement), representing the decimal value -0.6328125; its truncated part is 1111. The highest truncated bit, 1, is added to the 3 reserved bits other than the sign bit, 010; the sign bit of the neural network data combined with the adder output gives the bit conversion result 1011 (two's complement), representing the decimal value -0.625.
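The Fig. 4a datapath can be modeled in Python as follows (an illustrative sketch, not the patent's hardware description; the function names are chosen here, `q_value` interprets bit strings as two's-complement fixed point, and overflow of the 3-bit adder is not modeled):

```python
def round_up_8_to_4(bits8):
    # The sign bit bypasses the adder; the three magnitude bits of the
    # reserved half (a1 a2 a3) are added to the highest truncated bit (a4).
    sign = bits8[0]
    kept = int(bits8[1:4], 2)
    carry = int(bits8[4])
    return sign + format(kept + carry, "03b")

def q_value(bits):
    # Two's-complement fixed-point value in [-1, 1).
    raw = int(bits, 2)
    if bits[0] == "1":
        raw -= 1 << len(bits)
    return raw / (1 << (len(bits) - 1))

print(q_value("10101111"))          # -0.6328125
print(round_up_8_to_4("10101111"))  # 1011
print(q_value("1011"))              # -0.625
```

This reproduces the worked example: -81/128 = -0.6328125 rounds to the nearest 4-bit value, -5/8 = -0.625.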
Fig. 4b shows, according to an embodiment of the invention, the hardware structure with which the data conversion unit 103 performs bit conversion in "direct truncation mode". Sixteen items of 8-bit neural network data requiring bit conversion are input to the data conversion unit 103 in parallel, and for each 8-bit item the 4 reserved bits (e.g. a0, a1, a2, a3) are used directly as the bit conversion result for that item.
Illustrating with Fig. 4b: in "direct truncation mode", suppose the neural network data input to the data conversion unit 103 is 10101111 (two's complement); the bit conversion result is then 1010.
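The Fig. 4b path keeps only the high nibble; interpreting the values in the same two's-complement fixed-point format (a computation added here for illustration, not stated in the patent) shows the larger rounding error relative to round-up mode:

```python
def direct_truncate_8_to_4(bits8):
    # The 4 reserved bits (a0..a3) are used directly as the result.
    return bits8[:4]

def q_value(bits):
    # Two's-complement fixed-point value in [-1, 1).
    raw = int(bits, 2)
    if bits[0] == "1":
        raw -= 1 << len(bits)
    return raw / (1 << (len(bits) - 1))

print(direct_truncate_8_to_4("10101111"))  # 1010
print(q_value("1010"))                     # -0.75, vs. -0.625 from round-up
```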
Step 4. The output bus unit 104 outputs the bit conversion result produced by the data conversion unit 103 from the bit conversion device 101, so that it can be provided to the device in the neural network processor that performs subsequent processing.
The bit conversion device provided by the above embodiments of the invention can be used, as part of a neural network processor, in the various computation stages of a neural network.
For example, when the caching of neural network data has completed but the convolution operation has not yet been performed, the bit conversion device can perform bit conversion on the cached neural network data. The reason is that different layers of a neural network may place different requirements on the number of bits used for data; to match the required computation speed and the desired energy consumption, the cached neural network data can be bit-converted by the bit conversion device and the result provided to the unit that performs the convolution operation.
As another example, when the convolution operation on the data has completed but the activation operation has not, the bit conversion device can perform bit conversion on the result of the convolution operation. The reason is that the accumulation operations of the convolution unit tend to increase the number of bits of the convolution result; to meet the bit-width requirements of subsequent operations (for example, activation units implemented in hardware often use a fixed number of bits), the convolution result needs to be bit-converted.
Based on the above embodiments, the present invention provides a bit conversion device for a neural network processor that can be used to adjust the number of bits used to represent data in the various computation stages of a neural network. Reducing the number of bits used to express data lowers the hardware cost of computation, increases computation speed, reduces the neural network processor's demand for data storage space, and reduces the energy consumption of neural network computation.
It should be noted that not every step introduced in the above embodiments is necessary; those skilled in the art can make appropriate selections, substitutions, modifications, and so on according to actual needs.
Finally, it should be noted that the above embodiments are intended only to illustrate, not to limit, the technical solution of the present invention. Although the invention has been described in detail with reference to embodiments, those of ordinary skill in the art will understand that modifications and equivalent substitutions of the technical solution of the invention, made without departing from its spirit and scope, are all covered by the claims of the invention.
Claims (11)
1. A neural network processor comprising a bit conversion device, the bit conversion device comprising:
an input interface, a control unit, a data conversion unit, and an output interface;
wherein
the control unit is configured to generate a control signal for the data conversion unit;
the input interface is configured to receive raw data;
the data conversion unit is configured to perform bit conversion on the raw data according to the control signal, converting the raw data into a bit conversion result expressed with fewer bits; and
the output interface is configured to output the bit conversion result from the bit conversion device.
2. The neural network processor according to claim 1, wherein the control unit is configured to determine the rule for performing bit conversion according to a preset parameter or an input parameter, so as to generate the control signal;
wherein the parameter includes information about the number of bits of the raw data and the number of bits of the bit conversion result.
3. The neural network processor according to claim 2, wherein the data conversion unit is configured to determine, according to the control signal, the reserved bits and the truncated bits of the raw data, and to determine the bit conversion result from the reserved bits of the raw data and the highest truncated bit of the raw data.
4. The neural network processor according to claim 2, wherein the data conversion unit is configured to determine, according to the control signal, the reserved bits and the truncated bits of the raw data, and to use the reserved bits of the raw data as the bit conversion result.
5. The neural network processor according to claim 1, wherein the data conversion unit is configured to perform bit conversion on the raw data according to the control signal, converting the raw data into a bit conversion result expressed with half the original number of bits.
6. a kind of neural network processor using as described in any one of claim 1-5 carries out the data of neural network
The method of bits switch, including:
1) described control unit generates the control signal for Date Conversion Unit;
2) input interface receives the initial data for needing to execute bits switch outside the bits switch device;
3) Date Conversion Unit carries out bits switch according to the control signal to the initial data, will be described original
Data are converted to the bits switch result expressed using less number of bits;
4) the bits switch result is exported the bits switch device by the output interface.
7. according to the method described in claim 6, wherein step 1) includes:
1-1) described control unit determines the rule for executing bits switch according to the parameter of setting or the parameter of input;
1-2) described control unit generates control signal corresponding with the rule;
Wherein, the parameter includes the number of bits phase with the number of bits of the initial data and the bits switch result
The information of pass.
8. according to the method described in claim 7, wherein step 3) includes:
The Date Conversion Unit is according to the control signal, the reserved bit based on the initial data and the initial data
The highest order blocked in position determine the bits switch result.
9. according to the method described in claim 7, wherein step 3) includes:
The Date Conversion Unit is according to the control signal, using the reserved bit in the initial data as the bits switch
As a result.
10. according to the method described in any one of claim 6-9, being completed to the caching of Neural Network Data and
When not yet completing convolution algorithm, the Neural Network Data of caching is inputted into the bits switch device to execute step 1) -4), or
The result of convolution algorithm is inputted the ratio by person when the convolution algorithm to data is completed and not yet completes activation operation
Special conversion equipment is to execute step 1) -4).
11. a kind of computer readable storage medium, wherein being stored with computer program, the computer program is used when executed
In method of the realization as described in any one of claim 6-10.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN201810170612.3A (CN108345938A) | 2018-03-01 | 2018-03-01 | Neural network processor comprising a bit conversion device, and method thereof
PCT/CN2018/082179 (WO2019165679A1) | 2018-03-01 | 2018-04-08 | Neural network processor comprising bit conversion device and method thereof
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title
---|---|---|---
CN201810170612.3A (CN108345938A) | 2018-03-01 | 2018-03-01 | Neural network processor comprising a bit conversion device, and method thereof
Publications (1)
Publication Number | Publication Date
---|---
CN108345938A | 2018-07-31
Family
ID=62959552
Family Applications (1)
Application Number | Title | Priority Date | Filing Date
---|---|---|---
CN201810170612.3A (CN108345938A, pending) | Neural network processor comprising a bit conversion device, and method thereof | 2018-03-01 | 2018-03-01
Country Status (2)
Country | Link |
---|---|
CN (1) | CN108345938A (en) |
WO (1) | WO2019165679A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
WO2021180201A1 * | 2020-03-13 | 2021-09-16 | Huawei Technologies Co., Ltd. | Data processing method and apparatus for terminal network model, terminal and storage medium
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
CN106796668A * | 2016-03-16 | 2017-05-31 | Hong Kong Applied Science and Technology Research Institute Co., Ltd. | Method and system for bit-depth reduction in artificial neural networks
CN107203808A * | 2017-05-08 | 2017-09-26 | Institute of Computing Technology, Chinese Academy of Sciences | Binary convolution unit and corresponding binary convolutional neural network processor
CN107292458A * | 2017-08-07 | 2017-10-24 | Beijing Vimicro Co., Ltd. | Prediction method and prediction device applied to a neural network chip
CN107340993A * | 2016-04-28 | 2017-11-10 | Beijing Zhongke Cambricon Technology Co., Ltd. | Apparatus and method for neural network operations supporting reduced-bit-width floating-point numbers
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109934331B (en) * | 2016-04-29 | 2020-06-19 | Cambricon Technologies Corporation Limited | Apparatus and method for performing artificial neural network forward operations |
CN106447034B (en) * | 2016-10-27 | 2019-07-30 | Institute of Computing Technology, Chinese Academy of Sciences | Neural network processor based on data compression, design method, and chip |
CN107423816B (en) * | 2017-03-24 | 2021-10-12 | Institute of Computing Technology, Chinese Academy of Sciences | Multi-calculation-precision neural network processing method and system |
CN107145939B (en) * | 2017-06-21 | 2020-11-24 | Beijing Tusen Zhitu Technology Co., Ltd. | Computer vision processing method and device for low-computing-capacity processing equipment |
2018
- 2018-03-01 CN CN201810170612.3A patent/CN108345938A/en active Pending
- 2018-04-08 WO PCT/CN2018/082179 patent/WO2019165679A1/en active Application Filing
Non-Patent Citations (1)
Title |
---|
Zheng Nanning: "A Concise Tutorial on Digital Signal Processing" (《数字信号处理简明教程》), 30 September 2015 *
Also Published As
Publication number | Publication date |
---|---|
WO2019165679A1 (en) | 2019-09-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106447034B (en) | Neural network processor based on data compression, design method, and chip | |
CN113361680B (en) | Neural network architecture searching method, device, equipment and medium | |
CN106610867B (en) | On-chip network task scheduling method and device | |
CN108345934A (en) | Activation device and method for a neural network processor | |
CN110334802A (en) | Construction method, device, equipment and storage medium for a neural network model | |
CN110163354A (en) | Computing device and method | |
CN109389212A (en) | Reconfigurable activation quantization pooling system for low-bit-width convolutional neural networks | |
CN108304925A (en) | Pooling computing device and method | |
CN112101517A (en) | FPGA implementation method based on a piecewise linear spiking neuron network | |
CN114358319B (en) | Machine learning framework-based classification method and related device | |
CN105389772A (en) | Data processing method and device based on a graphics processor | |
CN111831358B (en) | Weight precision configuration method, device, equipment and storage medium | |
CN116187548A (en) | Photovoltaic power generation power prediction method and device, storage medium and electronic device | |
Geng et al. | CQNN: a CGRA-based QNN framework |
CN108345938A (en) | Neural network processor comprising a bit conversion device and method thereof | |
Shu et al. | High energy efficiency FPGA-based accelerator for convolutional neural networks using weight combination |
Lee et al. | Energy-efficient control of mobile processors based on long short-term memory |
CN109002885A (en) | Convolutional neural network pooling unit and pooling calculation method | |
CN115292390B (en) | Load information generation method and device, electronic equipment and computer readable medium | |
CN111831356A (en) | Weight precision configuration method, device, equipment and storage medium | |
CN111027669A (en) | Method and device for realizing a deep neural network on a field programmable gate array | |
CN111104767B (en) | Variable-precision stochastic gradient descent structure and design method for FPGA | |
WO2020051918A1 (en) | Neuronal circuit, chip, system and method therefor, and storage medium |
CN117634577B (en) | Vector processor, neural network accelerator, chip and electronic equipment |
CN105824285B (en) | Programming method of a programmable logic control system for a microcontroller |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||