CN111656315A - Data processing method and device based on convolutional neural network architecture - Google Patents

Info

Publication number
CN111656315A
Authority
CN
China
Prior art keywords
data
layer
parameter
fixed point
parameters
Prior art date
Legal status
Pending
Application number
CN201980009296.4A
Other languages
Chinese (zh)
Inventor
王哲
仇晓颖
韩彬
Current Assignee
SZ DJI Technology Co Ltd
Original Assignee
SZ DJI Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by SZ DJI Technology Co Ltd
Publication of CN111656315A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/483 Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G06N3/08 Learning methods
    • G06F2207/3824 Accepting both fixed-point and floating-point numbers
    • G06F2207/4824 Neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Nonlinear Science (AREA)
  • Neurology (AREA)
  • Image Analysis (AREA)

Abstract

A data processing method and device based on a convolutional neural network architecture. The method comprises the following steps: if the input of the current operation layer is one set of fixed-point data, processing the set of fixed-point data according to the operation rule of the current operation layer to generate the output data of the current operation layer; if the input of the current operation layer is n sets of fixed-point data, adjusting the n sets of fixed-point data so that the quantization parameters of each set of data in the n sets are the same, and processing the adjusted n sets of data according to the operation rule of the current operation layer to generate the output data of the current operation layer, where n ≥ 2 and the quantization parameters comprise the total bit width, the integer part bit width, and the fractional part bit width of the fixed-point data; and when the operations of all the operation layers are completed, outputting the prediction result of the data to be detected. In this way, the calculation accuracy of the fixed-point convolutional neural network model can be improved.

Description

Data processing method and device based on convolutional neural network architecture
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a data processing method and apparatus based on a convolutional neural network architecture.
Background
In existing methods for performing fixed-point processing on a convolutional neural network, only the computation-intensive convolutional layers and fully-connected layers are fixed-pointed, while the intermediate structures of the network, which involve relatively little computation, are not. Consequently, when such a fixed-pointed convolutional neural network processes data, floating-point data must be converted into fixed-point data and fixed-point data back into floating-point data; this interconversion between floating-point and fixed-point data reduces computational efficiency and loses accuracy in data processing.
Disclosure of Invention
The invention provides a data processing method and device based on a convolutional neural network architecture, which are used for improving the accuracy of data processing of a fixed-point convolutional neural network model.
In a first aspect, an embodiment of the present invention provides a data processing method based on a convolutional neural network architecture, which is characterized by including:
receiving input data to be detected based on the trained convolutional neural network model; the neural network model comprises a plurality of operation layers connected in sequence, and each operation layer performs its operation on the output data of the previous operation layer and outputs the input data of the next operation layer; the parameters of the convolutional neural network model are fixed-point data, and the data to be detected are fixed-point data;
if the input of the current operation layer is one set of fixed-point data, processing the set of fixed-point data according to the operation rule of the current operation layer to generate the output data of the current operation layer;
if the input of the current operation layer is n sets of fixed-point data, adjusting the n sets of fixed-point data so that the quantization parameters of each set of data in the n sets are the same, and processing the adjusted n sets of data according to the operation rule of the current operation layer to generate the output data of the current operation layer; wherein n ≥ 2, and the quantization parameters comprise the total bit width, the integer part bit width, and the fractional part bit width of the fixed-point data;
and when the operations of all the operation layers are completed, outputting the prediction result of the data to be detected.
A second aspect of the present invention provides a data processing apparatus, characterized by at least comprising a memory and a processor; the memory is connected with the processor through a communication bus and is used for storing computer instructions executable by the processor; the processor is configured to read computer instructions from the memory to implement a data processing method based on a convolutional neural network architecture, the method comprising:
receiving input data to be detected based on the trained convolutional neural network model; the neural network model comprises a plurality of operation layers connected in sequence, and each operation layer performs its operation on the output data of the previous operation layer and outputs the input data of the next operation layer; the parameters of the convolutional neural network model are fixed-point data, and the data to be detected are fixed-point data;
if the input of the current operation layer is one set of fixed-point data, processing the set of fixed-point data according to the operation rule of the current operation layer to generate the output data of the current operation layer;
if the input of the current operation layer is n sets of fixed-point data, adjusting the n sets of fixed-point data so that the quantization parameters of each set of data in the n sets are the same, and processing the adjusted n sets of data according to the operation rule of the current operation layer to generate the output data of the current operation layer; wherein n ≥ 2, and the quantization parameters comprise the total bit width, the integer part bit width, and the fractional part bit width of the fixed-point data;
and when the operations of all the operation layers are completed, outputting the prediction result of the data to be detected.
In a third aspect, the present invention provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the method according to the first aspect.
The embodiments of the present invention provide a data processing method and device based on a convolutional neural network architecture. For an operation layer of the convolutional neural network architecture whose input is n sets of fixed-point data, the n sets of fixed-point data are adjusted so that their quantization parameters are the same, and the adjusted n sets of data are then processed according to the operation rule of the current operation layer to generate the output data of that layer. Compared with the prior art, in which operation layers with multiple sets of inputs compute on floating-point data, this eliminates the conversions between floating-point and fixed-point data during the layer's computation and improves the accuracy of data processing in the convolutional neural network architecture.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description are obviously only some embodiments of the present application, and those skilled in the art can obtain other drawings based on these drawings without inventive effort.
FIG. 1 is a schematic diagram of a convolutional neural network including a residual structure;
FIG. 2 is a schematic diagram of a convolutional neural network comprising a cascaded structure;
FIG. 3 is a schematic diagram of a convolutional neural network using nested concatenation and residual structure;
fig. 4 is a flowchart illustrating a data processing method based on a convolutional neural network architecture according to an embodiment of the present invention;
FIG. 5 is a schematic flow chart illustrating adjustment of n sets of fixed-point data according to an embodiment of the present invention;
FIG. 6 is a schematic flow chart of the Eltwise layer adjusting its input fixed-point data according to an embodiment of the present invention;
FIG. 7 is a schematic flow chart illustrating another exemplary adjustment of n sets of fixed-point data according to an embodiment of the present invention;
FIG. 8 is a schematic flow chart illustrating the adjustment of the input fixed-point data by the Concat layer according to an embodiment of the present invention;
FIG. 9 is a flowchart illustrating a process for folding the convolution layer, BatchNorm layer, and Scale layer according to one embodiment of the present invention;
FIG. 10 is a flowchart illustrating the calculation of a new convolutional layer after folding the convolutional layer, the BatchNorm layer, and the Scale layer, according to an embodiment of the present invention;
FIG. 11 is a graphical illustration of the deviation of the weight values from the overall distribution range provided by one embodiment of the present invention;
fig. 12 is a schematic structural diagram of a data processing apparatus based on a convolutional neural network architecture according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Convolutional Neural Networks (CNN) are a type of feed-forward Neural network that includes convolution calculations and has a deep structure, and are one of the representative algorithms for deep learning.
The structure of a conventional convolutional neural network mainly includes a plurality of operation layers, such as convolutional layers and fully-connected layers. To accelerate convergence, a Batch Normalization (BN) layer may be added after both the convolutional layers and the fully-connected layers. Each operation layer of the convolutional neural network processes the feature map output by the previous layer (for example, by convolution, pooling, activation, or fully-connected processing) to obtain the feature map output by the current layer.
In the prior art, to address the performance degradation caused by deepening a network, a residual structure can be added to a convolutional neural network. For example, in the network structure shown in fig. 1, the output of the residual unit is obtained by element-wise (Eltwise) addition of the output of a cascade of convolutional layers and the unit's input (which requires the output and input elements of the convolutional layers to have the same dimensions), followed by ReLU activation. Because the deep and shallow paths are added, the vanishing-gradient problem is avoided during back-propagation, and accuracy does not fall as the network deepens.
In addition, a convolutional neural network architecture can also include a cascade structure: a smaller neural network performs coarse detection first, and the feature channels output by each neural network are then concatenated before prediction, as in the network structure shown in fig. 2.
Some convolutional neural networks can also nest and use a cascade structure and a residual structure so as to improve the operation speed of the network while ensuring the robustness of the network. Such as the nested network structure shown in fig. 3.
Convolutional Neural Networks (CNNs) are applied in many fields, but as classification and detection accuracy improves, their structures grow increasingly complex, requiring ever larger amounts of computation and storage and thus ever higher computing and storage capability from hardware devices. As a result, convolutional neural networks cannot run on mobile devices with limited storage and computing power, which restricts their practical application.
Therefore, reducing the computation scale of a convolutional neural network model without affecting its calculation accuracy is of great significance for accelerating the processing speed of hardware devices, saving storage resources, and expanding the range of applications of convolutional neural networks.
In the prior art, one way to reduce the computation scale of a convolutional neural network is fixed-point processing, which converts the model from floating-point parameters to fixed-point parameters. Because the decimal point position of a floating-point parameter can change freely, floating-point parameters can express a wider range of values than fixed-point parameters, but their operations are correspondingly expensive. After the convolutional neural network is fixed-pointed, replacing the original 32-bit floating-point parameters with fixed-point parameters of smaller bit width effectively reduces the computation scale of the network.
In the existing fixed-point method, the data format of a fixed-point value may use BW (bit width) to represent the total bit width, IL (integer length) to represent the bit width of the integer part, and FL (fraction length) to represent the bit width of the fractional part. If the sign bit is counted in IL, the relationship among BW, IL, and FL is BW = IL + FL; in practice, a set of quantization parameters can be represented by any two of the three.
Assuming a data set S and a preset total bit width BW for its fixed-point values, the quantization parameters IL and FL are calculated according to the following formulas (1) and (2):

IL = ceil(log2(max(|S|)) + 1)  (1)

FL = BW - IL  (2)
Furthermore, after a set of data S and the total bit width BW required for its fixed-point values are given, the quantization parameter FL corresponding to the set of data S can be calculated according to formulas (1) and (2). Once a set of quantization parameters BW and FL is known, the mapping between an actual value r in the data set S and its fixed-point value q under this set of quantization parameters is as in formula (3):

q = r_max if r >= r_max; q = round(r * 2^FL) * 2^(-FL) if r_min < r < r_max; q = r_min if r <= r_min  (3)

where r_max and r_min respectively represent the maximum and minimum actual values that the set of quantization parameters can express; r_max and r_min can be calculated by formulas (4) and (5), respectively:

r_max = (2^(BW-1) - 1) * 2^(-FL)  (4)

r_min = -2^(BW-1) * 2^(-FL)  (5)

Furthermore, after a set of data and the total bit width BW required for its fixed-point values are given, the quantization parameters of the set of data can be calculated; once the quantization parameters are determined, each actual value r contained in the set of data can be converted into a fixed-point value according to formulas (3), (4), and (5). This is the fixed-point process.
In the embodiment of the invention, the actual value is converted into the fixed point value by applying the fixed point method.
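As an illustration of the fixed-point process described by formulas (1) through (5), the following sketch (not part of the patent disclosure; Python is used and all names are illustrative) computes the quantization parameters of a data set and converts actual values into fixed-point values:

    import math

    def quantization_params(data, bw):
        # Formulas (1) and (2): IL from the maximum absolute value, FL = BW - IL
        il = math.ceil(math.log2(max(abs(v) for v in data)) + 1)
        return il, bw - il

    def to_fixed_point(r, bw, fl):
        # Formulas (4) and (5): range expressible under [BW, FL]
        r_max = (2 ** (bw - 1) - 1) * 2 ** (-fl)
        r_min = -(2 ** (bw - 1)) * 2 ** (-fl)
        # Formula (3): round to the nearest multiple of 2^-FL, saturating at the bounds
        if r >= r_max:
            return r_max
        if r <= r_min:
            return r_min
        return round(r * 2 ** fl) * 2 ** (-fl)

    # Example: fixed-point a small data set with a preset total bit width of 8
    data = [0.5, -1.25, 3.7, 0.01]
    il, fl = quantization_params(data, bw=8)   # gives IL = 3, FL = 5
    fixed = [to_fixed_point(v, 8, fl) for v in data]   # step size is 1/32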
Consider that in the existing method for fixed-pointing a convolutional neural network, only the computation-intensive convolutional layers and fully-connected layers are fixed-pointed, while intermediate structures with relatively little computation, such as the cascade structure and the residual structure, are not. When data is processed with a convolutional neural network model fixed-pointed in this way, the input data, output data, and parameter data of the convolutional and fully-connected layers are fixed-point data, while those of the other, non-fixed-pointed structural layers are floating-point data; during processing, the fixed-point data of the convolutional and fully-connected layers must therefore be converted into floating-point data, operated on together with the non-fixed-pointed floating-point data, and the floating-point result converted back into fixed-point data. This interconversion between floating-point and fixed-point data affects computational efficiency and loses accuracy. On this basis, the embodiments of the present invention provide a data processing method and device based on a convolutional neural network architecture.
Fig. 4 is a schematic flowchart of a data processing method based on a convolutional neural network architecture according to an embodiment of the present invention. Referring to fig. 4, the method includes the following steps:
S1, receiving input data to be detected based on the trained convolutional neural network model; the neural network model comprises a plurality of operation layers connected in sequence, and each operation layer performs its operation on the output data of the previous operation layer and outputs the input data of the next operation layer; the parameters of the convolutional neural network model are fixed-point data, and the data to be detected are fixed-point data.
In this embodiment, before training is completed the convolutional neural network model is a floating-point model, that is, every parameter is floating-point data. Before the convolutional neural network model is trained, the quantization parameters of the input data and output data of each operation layer need to be determined based on the floating-point model. Specifically, a data set or samples are input into the floating-point model, and the quantization parameter corresponding to each set of data can be calculated from the input and output data of each structural layer of the floating-point model according to formulas (1) and (2). In addition, the determination of the quantization parameters of the input and output data may also take the influence of abnormal values into account; see the description below. After the quantization parameters of the input and output data are obtained, the parameters of the neural network model can be converted into fixed-point data, thus generating the trained neural network model. Before the data to be detected are input into the trained neural network model, floating-point data to be detected need to be fixed-pointed, or fixed-point data to be detected need to be shifted and adjusted as required.
And S2, if the input of the current operation layer is a set of fixed point data, processing the set of fixed point data according to the operation rule of the current operation layer to generate the output data of the current operation layer.
In this embodiment, for an operation layer whose input is one set of fixed-point data, the set of fixed-point data can be processed directly according to the operation rule corresponding to that layer to obtain its output data. Convolutional layers, pooling layers, and fully-connected layers, for example, all take one set of fixed-point data as input. Different operation layers correspond to different operation rules; for example, the operation rule of a convolutional layer is the convolution operation, and the operation rule of a pooling layer is maximum-value down-sampling, average-value down-sampling, or the like.
S3, if the input of the current operation layer is n sets of fixed-point data, adjusting the n sets of fixed-point data so that their quantization parameters are the same, and processing the adjusted n sets of fixed-point data according to the operation rule of the current operation layer to generate the output data of the current operation layer; wherein n ≥ 2, and the quantization parameters include the total bit width, the integer part bit width, and the fractional part bit width of the fixed-point data.
Similarly, the operation rule in step S3 corresponds to the operation layer. For example, when the operation layer is an Eltwise layer, the operation rule is the element-wise product, the element-wise sum, or the element-wise maximum, with the element-wise sum being the default operation rule of the Eltwise layer; when the operation layer is a Concat layer, the operation rule is to splice two or more feature maps along the channel or num dimension.
For an operation layer with multiple sets of input data, the quantization parameters determined for the sets according to the method above differ, because the data contained in the sets differ. In this embodiment, during data processing, if the current input of an operation layer comprises n sets of fixed-point data, the n sets are first adjusted to have the same quantization parameters, so that they can be processed according to the operation rule of that structural layer.
And S4, outputting the prediction result of the data to be detected when the operation of all the operation layers is completed.
The trained convolutional neural network is used to make predictions on the data to be detected. In this embodiment, the fixed-pointed convolutional neural network model performs the prediction; the data used for prediction are fixed-point data, and the final prediction result is also fixed-point data.
In the embodiment of the present invention, although operation layers with multiple sets of inputs involve relatively little computation, unifying the quantization parameters allows these multi-input operation layers of the convolutional neural network to operate directly on fixed-point data, so no conversion between floating-point and fixed-point data is needed. This avoids the defect in the prior art that lightly-loaded operation units cannot be made purely fixed-point, and achieves high data processing efficiency and high accuracy.
In an embodiment of the present invention, if the current operation layer is a residual structure, the input of the current operation layer comprises n sets of fixed-point data. Referring to fig. 5, in step S3, adjusting the n sets of fixed-point data so that their quantization parameters are the same specifically includes the following steps S31 and S32:
S31, according to the fractional part bit width of each of the n sets of fixed-point data, taking the smallest fractional part bit width among the n sets as the reference bit width.
S32, with the reference bit width as the reference, shifting, rounding, and saturating the n-1 sets of fixed-point data other than the set whose fractional part bit width equals the reference bit width, so that the fractional part bit widths of the n sets of fixed-point data are all the same.
Illustratively, the residual structure is implemented by an Eltwise layer. Assume the fixed-point data input to the Eltwise layer are X_i with quantization parameters [BW_i, FL_i], and the output fixed-point data is Y with quantization parameters [BW_y, FL_y]. Before the residual operation, each set of input fixed-point data is adjusted according to the following formula (6):

X'_i = saturate(round(X_i * 2^(FL_in_min - FL_i)))  (6)

where FL_in_min is the smallest FL among the n sets of fixed-point data, and X'_i is the fixed-point data adjusted from X_i. The number of bits shifted in the shift operation, (FL_in_min - FL_i), is determined by the smallest fractional part bit width FL_in_min among the n sets of fixed-point data and the fractional part bit width FL_i of the fixed-point data X_i. After the shift, rounding and saturation processing are applied in turn to obtain the adjusted fixed-point data.
Fig. 6 is a schematic diagram of an Eltwise layer adjusting its input fixed-point data according to an embodiment of the present invention. Referring to fig. 6, denote by Rescale(FL_in, FL_out) the operation, comprising shifting, rounding, and saturation processing, that converts a set of fixed-point data with quantization parameter FL_in to the quantization parameter FL_out of another set of fixed-point data. When the input of the Eltwise layer comprises two sets of fixed-point data X_1 and X_2, the two sets are processed by Rescale(FL_1, FL_in_min) and Rescale(FL_2, FL_in_min) respectively, the operation of the Eltwise layer is executed, and the resulting data is processed by Rescale(FL_in_min, FL_y) before being output.
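The Rescale operation and the Eltwise path of fig. 6 can be sketched as follows, assuming each fixed-point value is held as an integer together with its fractional bit width FL (a minimal illustration, not the patent's implementation; all names are assumptions):

    def rescale(q, fl_in, fl_out, bw):
        # Shift, round, and saturate an integer fixed-point value from FL_in to FL_out
        shift = fl_out - fl_in
        if shift >= 0:
            q = q << shift                              # gaining fractional bits: exact
        else:
            q = (q + (1 << (-shift - 1))) >> -shift     # losing bits: round to nearest
        q_max, q_min = 2 ** (bw - 1) - 1, -(2 ** (bw - 1))
        return max(q_min, min(q_max, q))                # saturation

    def eltwise_sum(x1, fl1, x2, fl2, bw, fl_y):
        # Align both input sets to the smallest FL, sum element-wise,
        # then rescale the result to the output quantization parameter FL_y
        fl_in_min = min(fl1, fl2)
        a = [rescale(q, fl1, fl_in_min, bw) for q in x1]
        b = [rescale(q, fl2, fl_in_min, bw) for q in x2]
        return [rescale(u + v, fl_in_min, fl_y, bw) for u, v in zip(a, b)]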
In an embodiment of the present invention, if the current operation layer is a cascade structure, the input of the current operation layer comprises n sets of fixed-point data. Referring to fig. 7, in step S3, adjusting the n sets of fixed-point data so that the quantization parameters of each set are the same includes steps S31' and S32':
S31', according to the output quantization parameter of the current operation layer, taking the fractional part bit width of the output quantization parameter as the reference bit width.
S32', with the reference bit width as the reference, shifting, rounding, and saturating the n sets of fixed-point data, so that the fractional part bit widths of the n sets of fixed-point data are all the same.
For example, in the present embodiment, assume the fixed-point data input to the Concat layer are X_i with quantization parameters [BW_i, FL_i], and the output fixed-point data is Y with quantization parameters [BW_y, FL_y]. Before the corresponding operation, each set of input fixed-point values is converted according to the following formula (7):

X'_i = saturate(round(X_i * 2^(FL_y - FL_i)))  (7)

where X'_i is the fixed-point data adjusted from X_i, and the number of bits shifted in the shift operation is determined by the difference between the quantization parameter FL_y of the output data and the quantization parameter FL_i of that set of fixed-point data.
In this embodiment, the n sets of fixed-point data are adjusted to have the same quantization parameter, equal to the quantization parameter of the fixed-point data output by the Concat layer, so the adjusted n sets of fixed-point data can be concatenated directly to obtain the output data Y.
Illustratively, referring to fig. 8, when the input of the Concat layer comprises two sets of fixed-point data X_1 and X_2, the two sets are processed by Rescale(FL_1, FL_y) and Rescale(FL_2, FL_y) respectively, the operation of the Concat layer is executed, and the resulting data is output.
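The Concat path of fig. 8 differs only in the target of the rescaling: every input set is aligned directly to the output fractional bit width FL_y, so the concatenated result needs no further adjustment. A sketch reusing the rescale helper sketched above for the Eltwise layer (again illustrative, not the patent's implementation):

    def concat(inputs, fls, bw, fl_y):
        # Formula (7): rescale each input set to the output FL, then splice
        out = []
        for x, fl in zip(inputs, fls):
            out.extend(rescale(q, fl, fl_y, bw) for q in x)
        return out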
In an embodiment of the present invention, if the current operation layer includes a convolution layer and a batch normalization layer, where the batch normalization layer includes a BatchNorm layer and a Scale layer, then before receiving the input data to be detected, the method further includes the following steps S311 and S312:
s311, folding the convolution layer, the BatchNorm layer and the Scale layer to form a new convolution layer.
In this embodiment, the convolution layer, the BatchNorm layer, and the Scale layer may be folded according to the following formulas to obtain the parameters w_fold and b_fold of the new convolution layer after folding:

w_fold = (s * w) / sqrt(v + eps)

b_fold = β - (s * μ) / sqrt(v + eps)

where v is the variance parameter of the BatchNorm layer, μ is the mean parameter of the BatchNorm layer, β is the translation parameter of the Scale layer, s is the scaling parameter of the Scale layer, eps is a default small constant, w_fold is the weight parameter of the new convolutional layer, w is the weight parameter of the convolutional layer before folding, and b_fold is the bias parameter of the new convolutional layer.
And S312, performing fixed-point processing on the parameters of the new convolutional layer.
Referring to fig. 9, the parameters of the convolutional layer include a weight parameter w and a bias parameter b; when the convolutional layer is used with a batch normalization layer, the bias parameter b of the convolutional layer is 0. The parameters of the BatchNorm layer include the variance v and the mean μ; the parameters of the Scale layer include a scaling parameter s and a translation parameter β. When the convolution layer, the BatchNorm layer, and the Scale layer are not folded, the parameter data contained in each of the three layers must be fixed-pointed separately.
Further, referring to fig. 10, after the convolution layer, the BatchNorm layer, and the Scale layer are folded, only the parameter data w_fold and b_fold obtained from the folding process need to be fixed-pointed.
In the embodiment of the present invention, for an operation layer comprising a convolution layer and a batch normalization layer, the parameter data of the new convolution layer obtained by folding the convolution layer and the batch normalization layer are fixed-pointed. Compared with fixed-pointing the parameters of the convolution layer, the BatchNorm layer, and the Scale layer separately, this reduces the complexity of fixed-point processing and improves the accuracy of the fixed-pointed model.
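The folding computation can be sketched as follows, assuming per-channel BatchNorm and Scale parameters (an illustrative sketch; the function name and the per-channel layout are assumptions, not taken from the patent):

    import math

    def fold_conv_bn_scale(w, v, mu, s, beta, eps=1e-5):
        # Fold a convolution layer (bias fixed to 0 when BN is used) with its
        # BatchNorm and Scale layers into a single convolution layer.
        # w: per-channel lists of weights; v, mu, s, beta: per-channel parameters
        w_fold, b_fold = [], []
        for wc, vc, mc, sc, bc in zip(w, v, mu, s, beta):
            k = sc / math.sqrt(vc + eps)
            w_fold.append([k * wi for wi in wc])   # w_fold = s * w / sqrt(v + eps)
            b_fold.append(bc - k * mc)             # b_fold = beta - s * mu / sqrt(v + eps)
        return w_fold, b_fold

After folding, only w_fold and b_fold need to be fixed-pointed, as fig. 10 illustrates.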
In another embodiment of the present invention, if the current operation layer includes a fully-connected layer and a batch normalization layer, where the batch normalization layer includes a BatchNorm layer and a Scale layer, then before receiving the input data to be detected, the method further includes the following steps S311' and S312':
In step S311', the fully-connected layer, the BatchNorm layer, and the Scale layer are folded to form a new fully-connected layer.
The fully-connected layer, the BatchNorm layer, and the Scale layer are folded according to the following formulas to obtain the parameters of the new fully-connected layer:

w_fold = (s * w) / sqrt(v + eps)

b_fold = β + s * (b - μ) / sqrt(v + eps)

where v is the variance parameter of the BatchNorm layer, μ is the mean parameter of the BatchNorm layer, β is the translation parameter of the Scale layer, s is the scaling parameter of the Scale layer, eps is a default small constant, w_fold is the weight parameter of the new fully-connected layer, w is the weight parameter of the fully-connected layer before folding, b_fold is the bias parameter of the new fully-connected layer, and b is the bias parameter of the fully-connected layer before folding.
In step S312', the parameters of the new fully-connected layer are fixed-pointed.
In the embodiment of the present invention, for an operation layer comprising a fully-connected layer and a batch normalization layer, the parameter data of the new layer obtained by folding the fully-connected layer and the batch normalization layer are fixed-pointed. Compared with fixed-pointing the parameters of the fully-connected layer, the BatchNorm layer, and the Scale layer separately, this reduces the complexity of fixed-point processing and improves the accuracy of the fixed-pointed model.
Optionally, in the above embodiments, since the difference between the weight data and the bias data obtained after folding becomes large, the weight data and the bias data need to be grouped separately when calculating quantization parameters for the folded data.
In the prior art, when calculating the quantization parameters of a set of data, the maximum of the absolute values of the data is used to calculate IL; however, when the set contains abnormal values that individually deviate from the overall distribution, the calculation of the quantization parameter IL is distorted and the final fixed-point result deteriorates. For example, as shown in fig. 11, where the abscissa represents the weight value and the ordinate the number of occurrences, the weight distribution of a convolutional layer in a neural network structure contains a very small proportion of weight values that deviate from the overall range.
In an embodiment of the present invention, to prevent abnormal values from affecting the calculation of the quantization parameters of the weight data, the quantization parameters are calculated before receiving the input data to be detected by the following method:
S11, obtaining the maximum of the absolute values of the first data set, and calculating the initial integer part bit width IL_0 from this maximum.
For example, the first data set may be a weight data set, such as the weight data of a convolutional layer or of a fully-connected layer. Alternatively, the first data set may be the input data or output data of an operation layer.
S12, calculating IL_{i+1} = IL_i - 1, where i ≥ 0, and calculating the maximum floating-point value r_max and the minimum floating-point value r_min representable under IL_{i+1}.
S13, obtaining the quantization parameter of the first data set according to the initial integer part bit width IL_0, the maximum floating-point value r_max, and the minimum floating-point value r_min.
In an embodiment, obtaining the quantization parameter of the first data set according to the initial integer part bit width IL_0, the maximum floating-point value r_max, and the minimum floating-point value r_min specifically includes:
S131, performing fixed-point processing and inverse fixed-point processing on each datum in the first data set according to the initial integer part bit width IL_0 to generate a floating-point value r';
S132, generating a threshold range [r_min, r_max] with r_min as the lower threshold and r_max as the upper threshold;
S133, obtaining the data in the first data set that exceed [r_min, r_max] to generate a second data set; calculating the saturation loss of each datum in the second data set from the floating-point value r' generated by fixed-point and inverse fixed-point processing based on IL_0 together with r_max and r_min; and accumulating the saturation losses of the data in the second data set to obtain a first accumulated value ST;
in step S133, when r 'is a positive number, a saturation loss of the data is obtained by calculating an absolute value of a difference between r' and r _ max; when the r 'is negative, calculating the absolute value of the difference between r' and r _ min as the saturation loss of the data. The fixing in step S12 is a process of converting floating point data into fixed point data, and the reverse fixing is a process of converting fixed point data into floating point data, and the combination of the two processes may be referred to as a sim-quant (singular-rectangle) process.
S134, acquiring the value r in the first data groupmin,rmax]Generating a third data group by a plurality of data in the third data group, calculating the gain of each data in the third data group, and accumulating the gain of each data in the third data group to obtain a second accumulated value G;
in step S134, the data is respectively acquired at IL0Quantization loss of L1 and in ILiA quantization loss L2 of lower, calculating an absolute value of a difference of L1 and L2 as a gain of the data; the quantization loss is the difference between any data subjected to fixed point processing and inverse fixed point processing and the original data. For example, assume a floating-point data original value is r, using BW and IL0The quantization parameter is fixed-point processed to generate a quantization value q0For the quantized value q0Performing anti-fixed-point processing to generate floating point value r0Then IL0Lower quantization loss of r0R, the numerical loss introduced by the quantization. Similarly, with BW and LL1The quantization parameter is fixed-point processed to generate a quantization value q1For the quantized value q1Performing anti-fixed-point processing to generate floating point value r1Then IL1Lower quantization loss of r1-r。
And S135, acquiring the quantization parameter of the first data group according to the first accumulated value ST and the second accumulated value G.
In step S135, when G > K1 × ST, IL_{i+1} = IL_i - 1 is recalculated to update the maximum floating-point value r_max and the minimum floating-point value r_min representable under IL_{i+1}, and the first accumulated value ST and the second accumulated value G continue to be updated; when G ≤ K1 × ST, IL_i is used as the quantization parameter of the first data set. For example, when G > K1 × ST under IL_2, r_max and r_min under IL_3 are calculated and the first accumulated value ST and the second accumulated value G are updated to obtain the proportional relation between them; if G ≤ K1 × ST under IL_3, IL_2 is used as the quantization parameter of the first data set. In the embodiment of the present invention, K1 is a preset value describing the relative influence of the saturation loss and the gain on the whole set of weight data. Optionally, setting K1 to a value between 100 and 1000 yields a better quantization result.
In the embodiment of the present invention, the quantization parameters of the weights may be calculated by steps S131 to S135.
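The search of steps S11 to S13, with the gain/saturation criterion of steps S131 to S135, can be sketched as follows (illustrative Python, not the patent's implementation; sim_quant is the simulated quantization round trip described above, and k1 stands for the preset threshold K1):

    import math

    def sim_quant(r, bw, il):
        # Fixed-point then inverse fixed-point processing (simulated quantization)
        fl = bw - il
        r_max = (2 ** (bw - 1) - 1) * 2 ** (-fl)
        r_min = -(2 ** (bw - 1)) * 2 ** (-fl)
        return max(r_min, min(r_max, round(r * 2 ** fl) * 2 ** (-fl)))

    def search_il(data, bw, k1=500):
        il0 = math.ceil(math.log2(max(abs(v) for v in data)) + 1)   # IL_0, step S11
        il = il0
        while True:
            cand = il - 1                        # candidate IL_{i+1}, step S12
            fl = bw - cand
            r_max = (2 ** (bw - 1) - 1) * 2 ** (-fl)
            r_min = -(2 ** (bw - 1)) * 2 ** (-fl)
            st = g = 0.0
            for r in data:
                rp = sim_quant(r, bw, il0)       # r' under IL_0, step S131
                if rp > r_max or rp < r_min:     # saturated: second data set, step S133
                    st += abs(rp - r_max) if rp > 0 else abs(rp - r_min)
                else:                            # in range: third data set, step S134
                    l1 = abs(rp - r)                        # loss under IL_0
                    l2 = abs(sim_quant(r, bw, cand) - r)    # loss under the candidate
                    g += abs(l1 - l2)
            if g > k1 * st:                      # step S135: gain still dominates
                il = cand
            else:
                return il                        # otherwise keep IL_i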
In another preferred embodiment, obtaining the quantization parameter of the first data set according to the initial integer part bit width IL_0, the maximum floating-point value r_max, and the minimum floating-point value r_min specifically includes:
S131', generating a threshold range [r_min, r_max] with r_min as the lower threshold and r_max as the upper threshold;
S132', obtaining the number C1 of data in the set that exceed [r_min, r_max];
S133', obtaining the number C2 of non-zero data in the set;
S134', determining the quantization parameter of the first data set according to C1 and C2.
In step S134', when C1 > K2 × C2, IL_i is used as the quantization parameter of the first data set, where K2 is a preset value; when C1 ≤ K2 × C2, IL_{i+1} = IL_i - 1 is recalculated to update the maximum floating-point value r_max and the minimum floating-point value r_min representable under IL_{i+1}, and C1 and C2 are updated. For example, when C1 ≤ K2 × C2 under IL_2, r_max and r_min under IL_3 are calculated and C1 and C2 are updated to obtain the proportional relation between them; if C1 > K2 × C2 under IL_3, IL_2 is used as the quantization parameter of the first data set. In the embodiment of the present invention, K2 is a preset value. Optionally, setting K2 to about one thousandth yields a better quantization result.
In the embodiment of the present invention, when quantization parameters are calculated for input and output data through steps S131' to S134', the quantization parameter is derived from the number of saturated data. Compared with the prior art, which calculates the quantization parameter directly from the maximum absolute value of the data in the set, this effectively eliminates the influence of abnormal values and yields a more accurate quantization parameter. Compared with steps S131 to S135, the algorithm of steps S131' to S134' is simpler; it suits the data volume of multiple batches in the forward pass of the model training stage, is not affected by the variation of the data from picture to picture, ensures the reasonableness of the quantization parameter IL_i, and reduces computational complexity. In addition, the method of this embodiment avoids the sorting required by other algorithms such as the Tukey test and capping algorithms, so it runs faster over multiple batches; it also takes into account the fact that, after the activation function, zero values account for a large proportion of the counted data.
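The count-based criterion of steps S131' to S134' can be sketched similarly (illustrative Python; the stopping test follows the reading given above, with K2 on the order of one thousandth, and all names are assumptions):

    import math

    def search_il_by_count(data, bw, k2=0.001):
        il = math.ceil(math.log2(max(abs(v) for v in data)) + 1)   # IL_0
        c2 = sum(1 for v in data if v != 0)                        # non-zero count C2
        while c2 > 0:
            cand = il - 1                                          # candidate IL_{i+1}
            fl = bw - cand
            r_max = (2 ** (bw - 1) - 1) * 2 ** (-fl)
            r_min = -(2 ** (bw - 1)) * 2 ** (-fl)
            c1 = sum(1 for v in data if v > r_max or v < r_min)    # saturated count C1
            if c1 > k2 * c2:
                break               # saturation no longer negligible: keep IL_i
            il = cand               # accept the candidate and continue
        return il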
In the prior art, the quantization parameter IL is calculated from the maximum of the absolute values of the set of data, which makes IL larger than necessary. The method provided by the above embodiments of the present invention can produce, as the final IL value, an IL_i that is smaller and more reasonable than the IL_0 obtained from the maximum. Using this IL_i as the quantization parameter may saturate part of the data, but it also gives the whole set of data more bit width to represent the fractional part. Therefore, calculating the quantization parameters of data sets with the method of the embodiments of the present invention fully accounts for the influence of abnormal data in different data sets on the quantization parameters and improves the accuracy of the fixed-pointed convolutional neural network model.
In one embodiment of the invention, the parameters of the convolutional neural network model include weight parameters and bias parameters. The weight parameters, the bias parameters, and the input and output data of the convolutional neural network model are each treated as a set of data whose total bit width is preset during fixed-point processing; specifically, the weight parameters correspond to one BW (for example, 16 bits), the bias parameters correspond to one BW (for example, 8 bits), and the input and output data correspond to one BW (for example, 32 bits). Optionally, the weight parameters, bias parameters, input data, and output data of each operation layer have their quantization parameters calculated separately according to their respective distribution ranges.
The setting of the total bit width is determined according to the computing capacity of hardware equipment for executing the operation of the convolutional neural network, the size of data and the required accuracy.
In an embodiment of the present invention, the trained convolutional neural network is obtained by the following method:
performing network parameter training on the convolutional neural network to obtain initial floating point parameters of the convolutional neural network; the network parameters include weight data and bias data;
and obtaining quantization parameters respectively corresponding to the initial floating point parameter, the input data and the output data according to the distribution of the initial floating point parameter, the input data and the output data.
In this embodiment, when the quantization parameters corresponding to the initial floating-point parameters are obtained from the distribution of the initial floating-point data, the method described above for calculating the quantization parameters of weight data is used, eliminating the influence of abnormal values and yielding more accurate quantization parameters.
Optionally, the quantization parameters corresponding to the input data and output data of each operation layer may likewise be calculated from their distribution ranges in the manner described above, so as to eliminate the influence of abnormal data.
Training data are then input into the convolutional neural network model on the basis of the initial floating-point parameters and the quantization losses under the quantization parameters, and the floating-point parameters of the convolutional neural network model and the quantization losses under the quantization parameters are updated according to the loss function of the model.
In this embodiment, if normalization processing is applied to both the convolution layer and the fully-connected layer in the convolutional neural network model, the initial floating point parameter is the initial floating point parameter after folding processing.
Existing fixed-point methods only achieve good results for networks with huge parameter counts and low precision requirements (such as the AlexNet, VGG, and GoogleNet neural networks); for neural networks with few parameters and high precision requirements, the network performs poorly after being fixed-pointed with existing methods.
In the embodiments of the present invention, for an operation layer whose input is n sets of fixed-point data, the n sets of fixed-point data are adjusted so that their quantization parameters are the same, which realizes fixed-point operation of the intermediate structural layers of the convolutional neural network and avoids conversion between floating-point and fixed-point data in multi-input operation layers. By folding operation layers comprising a convolution layer and a batch normalization layer, and operation layers comprising a fully-connected layer and a batch normalization layer, calculation accuracy is improved compared with processing the layers separately. In addition, in calculating the quantization parameters, the embodiments of the present invention fully consider the influence of abnormal points and design separate strategies for weight data and for input/output data, improving the accuracy of the quantization parameters. As a result, the convolutional neural network model of the embodiments of the present invention has high precision and a wide application range.
Referring to fig. 12, an embodiment of the present invention further provides a data processing apparatus 100, which at least includes a memory 102 and a processor 101; the memory 102 is connected to the processor 101 via a communication bus 103 for storing computer instructions executable by the processor; the processor 101 is configured to read computer instructions from the memory 102 to implement a data processing method based on a convolutional neural network architecture, the method comprising:
receiving input data to be detected based on the trained convolutional neural network model; the neural network model comprises a plurality of operation layers connected in sequence, and each operation layer performs its operation on the output data of the previous operation layer and outputs the input data of the next operation layer; the parameters of the convolutional neural network model are fixed-point data, and the data to be detected are fixed-point data;
if the input of the current operation layer is one set of fixed-point data, processing the set of fixed-point data according to the operation rule of the current operation layer to generate the output data of the current operation layer;
if the input of the current operation layer is n sets of fixed-point data, adjusting the n sets of fixed-point data so that the quantization parameters of each set of data in the n sets are the same, and processing the adjusted n sets of data according to the operation rule of the current operation layer to generate the output data of the current operation layer; wherein n ≥ 2, and the quantization parameters comprise the total bit width, the integer part bit width, and the fractional part bit width of the fixed-point data;
and when the operations of all the operation layers are completed, outputting the prediction result of the data to be detected.
Optionally, the processor is further configured to read a computer instruction from the memory to implement:
obtaining, according to the fractional part bit width of each of the n sets of fixed-point data, the smallest fractional part bit width among the n sets as the reference bit width;
and shifting, rounding, and saturating the n-1 sets of fixed-point data other than the set whose fractional part bit width equals the reference bit width, with the reference bit width as the reference, so that the fractional part bit widths of the n sets of fixed-point data are all the same.
Optionally, the processor is further configured to read a computer instruction from the memory to implement:
taking, according to the output quantization parameter of the current operation layer, the fractional part bit width of the output quantization parameter as the reference bit width;
and shifting, rounding, and saturating the n sets of fixed-point data with the reference bit width as the reference, so that the fractional part bit widths of the n sets of fixed-point data are all the same.
Optionally, the processor is further configured to read a computer instruction from the memory to implement:
folding the convolution layer, the BatchNorm layer and the Scale layer to form a new convolution layer;
and performing fixed point processing on the parameters of the new convolutional layer.
Optionally, the processor is further configured to read a computer instruction from the memory to implement:
folding the convolution layer, the BatchNorm layer, and the Scale layer according to the following formulas to obtain the parameters of the new convolution layer after folding:

w_fold = (s * w) / sqrt(v + eps)

b_fold = β - (s * μ) / sqrt(v + eps)

where v is the variance parameter of the BatchNorm layer, μ is the mean parameter of the BatchNorm layer, β is the translation parameter of the Scale layer, s is the scaling parameter of the Scale layer, eps is a default small constant, w_fold is the weight parameter of the new convolutional layer, w is the weight parameter of the convolutional layer before folding, and b_fold is the bias parameter of the new convolutional layer.
Optionally, the processor is further configured to read a computer instruction from the memory to implement:
folding the fully-connected layer, the BatchNorm layer and the Scale layer to form a new fully-connected layer;
and performing fixed point processing on the parameters of the new fully-connected layer.
Optionally, the processor is further configured to read a computer instruction from the memory to implement:
folding the fully-connected layer, the BatchNorm layer and the Scale layer according to the following formulas to obtain the parameters of the new fully-connected layer:

$$w_{fold} = \frac{s \cdot w}{\sqrt{v + eps}}$$

$$b_{fold} = \frac{s \cdot (b - \mu)}{\sqrt{v + eps}} + \beta$$

wherein v is the variance parameter of the BatchNorm layer, μ is the mean parameter of the BatchNorm layer, β is the translation parameter of the Scale layer, s is the scaling parameter of the Scale layer, eps is a default value, w_fold is the weight parameter of the new fully-connected layer, w is the weight parameter of the fully-connected layer before the folding process, b_fold is the bias parameter of the new fully-connected layer, and b is the bias parameter of the fully-connected layer before the folding process.
Optionally, the processor is further configured to read a computer instruction from the memory to implement:
S11, obtaining the maximum absolute value of the first data group, and calculating the initial integer part bit width IL_0 according to that maximum value.

For example, the first data group may be a weight data group, such as the weight data of a convolutional layer or of a fully-connected layer. The first data group may also be the input data or the output data of an operation layer.

S12, calculating IL_{i+1} = IL_i - 1, wherein i ≥ 0; and calculating the maximum floating point value r_max and the minimum floating point value r_min representable under IL_{i+1}.

S13, acquiring the quantization parameter of the first data group according to the initial integer part bit width IL_0, the maximum floating point value r_max and the minimum floating point value r_min.
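One plausible Python rendering of S11; the exact rounding convention for the integer width is an assumption.

```python
import math

def initial_integer_width(data):
    """S11: smallest integer part bit width whose range covers max|x|."""
    m = max(abs(float(x)) for x in data)
    return int(math.ceil(math.log2(m))) if m > 0 else 0
```

For example, weights with a maximum absolute value of 2.7 give IL_0 = 2, leaving BW - IL_0 bits for the decimal part.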
Optionally, the processor is further configured to read a computer instruction from the memory to implement:
S131, performing fixed-point processing and anti-fixed-point processing on each data in the first data group according to the initial integer part bit width IL_0 to generate a floating point value r';

S132, using r_min as the lower threshold and r_max as the upper threshold to generate a threshold range [r_min, r_max];

S133, acquiring the data in the first data group that exceed [r_min, r_max] to generate a second data group; calculating the saturation loss of each data in the second data group from the floating point value r' generated after the fixed-point and anti-fixed-point processing based on IL_0, together with r_max and r_min; and accumulating the saturation losses of the second data group to obtain a first accumulated value ST.

In step S133, when r' is a positive number, the saturation loss of the data is obtained by calculating the absolute value of the difference between r' and r_max; when r' is a negative number, the absolute value of the difference between r' and r_min is calculated as the saturation loss of the data. The fixed-point processing is the process of converting floating point data into fixed point data, and the anti-fixed-point processing is the process of converting fixed point data back into floating point data; the combination of the two processes may be referred to as a sim-quant (simulated quantization) process.
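A minimal sketch of the sim-quant round trip and the saturation-loss rule just described; placing the sign bit inside the total width BW (so that FL = BW - IL) is an assumption.

```python
def simquant(r, bw, il):
    """Fixed-point then anti-fixed-point: the float that r maps to
    under total bit width bw and integer part bit width il."""
    fl = bw - il                         # decimal (fractional) bit width
    scale = 2.0 ** fl
    q_max = (1 << (bw - 1)) - 1          # signed range of the stored code
    q_min = -(1 << (bw - 1))
    q = max(q_min, min(q_max, int(round(r * scale))))
    return q / scale

def saturation_loss(r_prime, r_max, r_min):
    """S133: |r' - r_max| for positive r', |r' - r_min| for negative r'."""
    return abs(r_prime - r_max) if r_prime > 0 else abs(r_prime - r_min)
```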
S134, acquiring the data in the first data group that fall within [r_min, r_max] to generate a third data group, calculating the gain of each data in the third data group, and accumulating the gains of the third data group to obtain a second accumulated value G.

In step S134, the quantization loss L1 of the data under IL_0 and the quantization loss L2 under IL_i are respectively acquired, and the absolute value of the difference between L1 and L2 is calculated as the gain of the data; the quantization loss is the difference between any data after fixed-point and anti-fixed-point processing and the original data. For example, assume the original value of a floating point data is r. Fixed-point processing with the quantization parameters BW and IL_0 generates a quantization value q_0, and anti-fixed-point processing of q_0 generates a floating point value r_0; the quantization loss under IL_0 is then r_0 - r, the numerical loss introduced by the quantization. Similarly, fixed-point processing with BW and IL_1 generates a quantization value q_1, anti-fixed-point processing of q_1 generates a floating point value r_1, and the quantization loss under IL_1 is r_1 - r.
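The gain of a single datum then follows directly from two sim-quant passes (a sketch using the definitions above):

```python
def quant_loss(r, bw, il):
    """Quantization loss: the sim-quant value minus the original float."""
    return simquant(r, bw, il) - r

def gain(r, bw, il0, ili):
    """S134: |L1 - L2|, the change in quantization loss when the integer
    part bit width moves from il0 to ili."""
    return abs(quant_loss(r, bw, il0) - quant_loss(r, bw, ili))
```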
S135, acquiring the quantization parameter of the first data group according to the first accumulated value ST and the second accumulated value G.

In step S135, when G > K1 × ST, IL_{i+1} = IL_i - 1 is recalculated to update the maximum floating point value r_max and the minimum floating point value r_min representable under IL_{i+1}, and the first accumulated value ST and the second accumulated value G continue to be updated; when G ≤ K1 × ST, IL_i is taken as the quantization parameter of the first data group. In the embodiment of the present invention, K1 is a set value describing the relative influence of the saturation loss and of the gain on the whole group of weight data. Optionally, setting the value of K1 between 100 and 1000 may obtain a better quantization result.
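Steps S12 through S135 can be combined into one search loop, sketched below under the conventions of the previous snippets; K1 = 500 is simply a value inside the suggested 100-to-1000 range, and the iteration bound is a safety guard for the sketch, not part of the method.

```python
def search_integer_width(data, bw, k1=500):
    """Shrink the integer width while the accumulated gain G of in-range
    data still outweighs K1 times the saturation loss ST (S12-S135)."""
    il = initial_integer_width(data)     # IL_0 from S11
    il0 = il
    for _ in range(2 * bw):              # safety bound for the sketch
        il_next = il - 1
        fl = bw - il_next
        r_max = ((1 << (bw - 1)) - 1) / 2.0 ** fl   # range under il_next
        r_min = -(1 << (bw - 1)) / 2.0 ** fl
        st = sum(saturation_loss(simquant(x, bw, il0), r_max, r_min)
                 for x in data if not r_min <= x <= r_max)
        g = sum(gain(x, bw, il0, il_next)
                for x in data if r_min <= x <= r_max)
        if g <= k1 * st:
            break                        # G no longer dominates: keep IL_i
        il = il_next
    return il
```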
Optionally, the processor is further configured to read a computer instruction from the memory to implement:
S131', using r_min as the lower threshold and r_max as the upper threshold to generate a threshold range [r_min, r_max];

S132', acquiring the number C1 of data in the group of data that exceed [r_min, r_max];

S133', acquiring the number C2 of non-zero data in the group of data;

S134', determining the quantization parameter of the first data group according to C1 and C2.
In step S134', when C2 ≤ K2 × C1, IL_i is taken as the quantization parameter of the first data group, wherein K2 is a preset value; when C2 > K2 × C1, IL_{i+1} = IL_i - 1 is recalculated to update the maximum floating point value r_max and the minimum floating point value r_min representable under IL_{i+1}. In the embodiment of the present invention, K2 is a set value. Optionally, setting the value of K2 to one thousandth may obtain a better quantization result.
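The count-based variant in the same style; the inequality is implemented exactly as the text states it, with K2 defaulting to one thousandth.

```python
def count_criterion(data, r_min, r_max, k2=1e-3):
    """S134': True -> keep IL_i as the quantization parameter,
    False -> narrow the integer width further."""
    c1 = sum(1 for x in data if not r_min <= x <= r_max)  # out-of-range count
    c2 = sum(1 for x in data if x != 0)                   # non-zero count
    return c2 <= k2 * c1                                  # C2 <= K2 * C1
```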
Optionally, the parameters of the convolutional neural network model include weight parameters and bias parameters; the weight parameters, the bias parameters, and the input and output data of the convolutional neural network model are each treated as a group of data, with the total bit width of the fixed point data preset for the fixed-point process; and for each operation layer, corresponding quantization parameters are calculated for the weight parameters, the bias parameters, the input data and the output data respectively according to their distribution ranges.
Optionally, the processor is further configured to read a computer instruction from the memory to implement:
performing network parameter training on the convolutional neural network to obtain initial floating point parameters of the convolutional neural network; the network parameters include weight data and bias data;
according to the distribution of the initial floating point parameter, the input data and the output data, obtaining quantization parameters corresponding to the initial floating point parameter, the input data and the output data respectively;
continuing to input training data into the convolutional neural network model based on the initial floating point parameters and the quantization losses under the quantization parameters, and updating the floating point parameters of the convolutional neural network model and the quantization losses under the quantization parameters according to a loss function of the convolutional neural network model;
and when the loss function converges to a preset condition, performing fixed-point processing on the floating-point parameters obtained by current updating based on the quantization parameters, thereby generating the trained convolutional neural network model.
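A compact sketch of this training loop; the straight-through update, the grad_fn callback and the NumPy types are illustrative assumptions rather than the patent's implementation.

```python
import numpy as np

def qat_update(w_float, grad_fn, lr, bw, il):
    """One quantization-aware step: the forward/backward pass sees the
    sim-quant weights, but the update lands on the float master copy."""
    w_q = np.vectorize(lambda r: simquant(r, bw, il))(w_float)
    grad = grad_fn(w_q)              # loss gradient evaluated at w_q
    return w_float - lr * grad       # floats keep accumulating small updates
```

When the loss function converges, a final simquant pass over the updated floats yields the fixed-point parameters of the trained model.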
An embodiment of the present invention further provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the data processing method based on the convolutional neural network architecture.
For the device embodiments, since they substantially correspond to the method embodiments, reference may be made to the partial description of the method embodiments for relevant points. The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. The terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The method and apparatus provided by the embodiments of the present invention are described in detail above, and the principle and the embodiments of the present invention are explained in detail herein by using specific examples, and the description of the embodiments is only used to help understanding the method and the core idea of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present invention.

Claims (39)

1. A data processing method based on a convolutional neural network architecture is characterized by comprising the following steps:
receiving input data to be detected based on the trained convolutional neural network model; the neural network model comprises a plurality of operation layers connected in sequence, where each operation layer performs its operation on the output data of the previous operation layer and its result serves as the input data of the next operation layer; the parameters of the convolutional neural network model are fixed point data, and the data to be detected is fixed point data;
if the input of the current operation layer is a set of fixed point data, the set of fixed point data is processed according to the operation rule of the current operation layer to generate output data of the current operation layer;
if the input of the current operation layer is n groups of fixed point data, adjusting the n groups of fixed point data so that the quantization parameters of each group are the same, and processing the adjusted n groups of data according to the operation rule of the current operation layer to generate the output data of the current operation layer; wherein n ≥ 2, and the quantization parameters comprise the total bit width, the integer part bit width and the decimal part bit width of the fixed point data;
and when the operation of all the operation layers is finished, outputting the prediction result of the data to be detected.
2. The method of claim 1, wherein if the current operation layer is a residual structure, the input of the current operation layer comprises n sets of fixed point data; the adjusting the n sets of fixed point data to make quantization parameters of the n sets of data the same specifically includes:
acquiring the smallest decimal part bit width in the n groups of fixed point data as a reference bit width according to the decimal part bit width of each group of fixed point data in the n groups of fixed point data;
and shifting, rounding and saturating n-1 groups of fixed point data except for one group of fixed point data with the reference bit width in the n groups of fixed point data by taking the reference bit width as a reference so as to enable the bit widths of the decimal parts of the n groups of fixed point data to be the same.
3. The method of claim 2, wherein the residual structure is implemented using an Eltwise layer.
4. The method of claim 1, wherein if the current operation layer is a cascade structure, the input of the current operation layer comprises n sets of fixed point data; the adjusting the n groups of fixed point data to make the quantization parameters of the n groups of data the same specifically comprises the following steps:
according to the output quantization parameter of the current operation layer, using the decimal part bit width of the output quantization parameter as a reference bit width;
and shifting, rounding and saturating the n groups of fixed point data by taking the reference bit width as a reference so as to ensure that the bit widths of the decimal parts of the n groups of fixed point data are the same.
5. The method according to claim 4, wherein the cascade structure is implemented by using a concat layer.
6. The method of any one of claims 1 to 5, wherein if the current operation layer comprises a convolutional layer and a batch normalization layer, the batch normalization layer comprising a BatchNorm layer and a Scale layer, then before receiving the input data to be detected, the method further comprises:
folding the convolution layer, the BatchNorm layer and the Scale layer to form a new convolution layer;
and performing fixed point processing on the parameters of the new convolutional layer.
7. The method of claim 6, wherein folding the convolutional layer, the BatchNorm layer, and the Scale layer to form a new convolutional layer comprises:
folding the convolution layer, the BatchNorm layer and the Scale layer according to the following formula to obtain the parameters of the new convolution layer after folding treatment:
$$w_{fold} = \frac{s \cdot w}{\sqrt{v + eps}}$$

$$b_{fold} = \beta - \frac{s \cdot \mu}{\sqrt{v + eps}}$$

wherein v is the variance parameter of the BatchNorm layer, μ is the mean parameter of the BatchNorm layer, β is the translation parameter of the Scale layer, s is the scaling parameter of the Scale layer, eps is a default value, w_fold is the weight parameter of the new convolutional layer, w is the weight parameter of the convolutional layer before the folding process, and b_fold is the bias parameter of the new convolutional layer.
8. The method according to any one of claims 1 to 5, wherein if the current operation layer comprises a fully-connected layer and a batch normalization layer, the batch normalization layer comprising a BatchNorm layer and a Scale layer, then before receiving the input data to be detected, the method further comprises:

folding the fully-connected layer, the BatchNorm layer and the Scale layer to form a new fully-connected layer;

and performing fixed point processing on the parameters of the new fully-connected layer.
9. The method according to claim 8, wherein the folding the fully-connected layer, the BatchNorm layer, and the Scale layer to form a new fully-connected layer comprises:
folding the fully-connected layer, the BatchNorm layer and the Scale layer according to the following formulas to obtain the parameters of the new fully-connected layer:

$$w_{fold} = \frac{s \cdot w}{\sqrt{v + eps}}$$

$$b_{fold} = \frac{s \cdot (b - \mu)}{\sqrt{v + eps}} + \beta$$

wherein v is the variance parameter of the BatchNorm layer, μ is the mean parameter of the BatchNorm layer, β is the translation parameter of the Scale layer, s is the scaling parameter of the Scale layer, eps is a default value, w_fold is the weight parameter of the new fully-connected layer, w is the weight parameter of the fully-connected layer before the folding process, b_fold is the bias parameter of the new fully-connected layer, and b is the bias parameter of the fully-connected layer before the folding process.
10. The method of any of claims 1 to 9, wherein prior to receiving input data under test, the method further comprises:
obtaining the maximum absolute value of the first data group, and calculating the initial integer part bit width IL_0 according to the maximum value;

calculating IL_{i+1} = IL_i - 1, wherein i ≥ 0; and calculating the maximum floating point value r_max and the minimum floating point value r_min representable under IL_{i+1};

and acquiring the quantization parameter of the first data group according to the initial integer part bit width IL_0, the maximum floating point value r_max and the minimum floating point value r_min.
11. The method according to claim 10, wherein the acquiring the quantization parameter of the first data group according to the initial integer part bit width IL_0, the maximum floating point value r_max and the minimum floating point value r_min specifically comprises:

performing fixed-point processing and anti-fixed-point processing on each data in the first data group according to the initial integer part bit width IL_0 to generate a floating point value r';

using r_min as the lower threshold and r_max as the upper threshold to generate a threshold range [r_min, r_max];

acquiring the data in the first data group that exceed [r_min, r_max] to generate a second data group, calculating the saturation loss of each data in the second data group according to the floating point value r' generated after the fixed-point and anti-fixed-point processing based on IL_0 together with r_max and r_min, and accumulating the saturation losses of the second data group to obtain a first accumulated value ST;

acquiring the data in the first data group that fall within [r_min, r_max] to generate a third data group, calculating the gain of each data in the third data group, and accumulating the gains of the third data group to obtain a second accumulated value G;

and acquiring the quantization parameter of the first data group according to the first accumulated value ST and the second accumulated value G.
12. The method of claim 11, wherein said deriving quantization parameters for the first data set from the first accumulation value ST and the second accumulation value G comprises:
when G ≤ K × ST, taking IL_i as the quantization parameter of the first data group; wherein K is a preset value.
13. The method of claim 12, wherein said deriving quantization parameters for said first data set from said first accumulation value ST and said second accumulation value G further comprises:
when G > K × ST, recalculating IL_{i+1} = IL_i - 1 to update the maximum floating point value r_max and the minimum floating point value r_min representable under IL_{i+1}.
14. The method according to any one of claims 11 to 13, wherein the calculating the saturation loss of each data in the second data group according to the floating point value r' generated after the fixed-point processing and the anti-fixed-point processing based on IL_0 together with r_max and r_min specifically comprises:

when r' is a positive number, obtaining the saturation loss of the data by calculating the absolute value of the difference between r' and r_max;

when r' is a negative number, calculating the absolute value of the difference between r' and r_min as the saturation loss of the data.
15. The method according to any one of claims 11 to 14, wherein the calculating the gain of each data in the third data set specifically comprises:
respectively acquiring the quantization loss L1 of the data under IL_0 and the quantization loss L2 under IL_i, and calculating the absolute value of the difference between L1 and L2 as the gain of the data; the quantization loss is the difference between any data after fixed-point processing and anti-fixed-point processing and the original data.
16. The method according to claim 10, wherein the acquiring the quantization parameter of the first data group according to the initial integer part bit width IL_0, the maximum floating point value r_max and the minimum floating point value r_min specifically comprises:

using r_min as the lower threshold and r_max as the upper threshold to generate a threshold range [r_min, r_max];

acquiring the number C1 of data in the group of data that exceed [r_min, r_max];

acquiring the number C2 of non-zero data in the group of data;

when C2 ≤ K × C1, taking IL_i as the quantization parameter of the first data group; wherein K is a preset value.

17. The method according to claim 16, wherein the acquiring the quantization parameter of the first data group according to the initial integer part bit width IL_0, the maximum floating point value r_max and the minimum floating point value r_min specifically further comprises:

when C2 > K × C1, recalculating IL_{i+1} = IL_i - 1 to update the maximum floating point value r_max and the minimum floating point value r_min representable under IL_{i+1}.
18. The method of any one of claims 1 to 17, wherein the parameters of the convolutional neural network model include weight parameters and bias parameters; the weight parameters, the bias parameters, and the input and output data of the convolutional neural network model are each treated as a group of data, with the total bit width of the fixed point data preset for the fixed-point process; and for each operation layer, corresponding quantization parameters are calculated for the weight parameters, the bias parameters, the input data and the output data respectively according to their distribution ranges.
19. The method of any one of claims 1 to 18, wherein the trained convolutional neural network is obtained by:
performing network parameter training on the convolutional neural network to obtain initial floating point parameters of the convolutional neural network; the network parameters include weight data and bias data;
according to the distribution of the initial floating point parameter, the input data and the output data, obtaining quantization parameters corresponding to the initial floating point parameter, the input data and the output data respectively;
continuing to input training data into the convolutional neural network model based on the initial floating point parameters and the quantization losses under the quantization parameters, and updating the floating point parameters of the convolutional neural network model and the quantization losses under the quantization parameters according to a loss function of the convolutional neural network model;
and when the loss function converges to a preset condition, performing fixed-point processing on the floating-point parameters obtained by current updating based on the quantization parameters, thereby generating the trained convolutional neural network model.
20. A data processing apparatus comprising at least a memory and a processor; the memory is connected with the processor through a communication bus and is used for storing computer instructions executable by the processor; the processor is configured to read computer instructions from the memory to implement a data processing method based on a convolutional neural network architecture, the method comprising:
receiving input data to be detected based on the trained convolutional neural network model; the neural network model comprises a plurality of operation layers connected in sequence, where each operation layer performs its operation on the output data of the previous operation layer and its result serves as the input data of the next operation layer; the parameters of the convolutional neural network model are fixed point data, and the data to be detected is fixed point data;
if the input of the current operation layer is a set of fixed point data, the set of fixed point data is processed according to the operation rule of the current operation layer to generate output data of the current operation layer;
if the input of the current operation layer is n groups of fixed point data, adjusting the n groups of fixed point data so that the quantization parameters of each group are the same, and generating the output data of the current operation layer after processing the adjusted n groups of data according to the operation rule of the current operation layer; wherein n ≥ 2, and the quantization parameters comprise the total bit width, the integer part bit width and the decimal part bit width of the fixed point data;
and when the operation of all the operation layers is finished, outputting the prediction result of the data to be detected.
21. The apparatus of claim 20, wherein the inputs to the current operation layer comprise n sets of fixed point data if the current operation layer is in a residual structure; the processor is further configured to read computer instructions from the memory to implement:
acquiring the smallest decimal part bit width in the n groups of fixed point data as a reference bit width according to the decimal part bit width of each group of fixed point data in the n groups of fixed point data;
and shifting, rounding and saturating n-1 groups of fixed point data except for one group of fixed point data with the reference bit width in the n groups of fixed point data by taking the reference bit width as a reference so as to enable the bit widths of the decimal parts of the n groups of fixed point data to be the same.
22. The apparatus of claim 21, wherein the residual structure is implemented using an Eltwise layer.
23. The apparatus of claim 20, wherein if the current operation layer is in a cascade structure, the input of the current operation layer comprises n sets of fixed point data; the processor is further configured to read computer instructions from the memory to implement:
according to the output quantization parameter of the current operation layer, using the decimal part bit width of the output quantization parameter as a reference bit width;
and shifting, rounding and saturating the n groups of fixed point data by taking the reference bit width as a reference so as to ensure that the bit widths of the decimal parts of the n groups of fixed point data are the same.
24. The apparatus of claim 23, wherein the cascade structure is implemented using a concat layer.
25. The apparatus of any one of claims 20 to 24, wherein if the current operation layer comprises a convolutional layer and a batch normalization layer, the batch normalization layer comprising a BatchNorm layer and a Scale layer, the processor is further configured to read computer instructions from the memory to implement:
folding the convolution layer, the BatchNorm layer and the Scale layer to form a new convolution layer;
and performing fixed point processing on the parameters of the new convolutional layer.
26. The apparatus of claim 25, wherein the processor is further configured to read computer instructions from the memory to implement:
folding the convolution layer, the BatchNorm layer and the Scale layer according to the following formula to obtain the parameters of the new convolution layer after folding treatment:
$$w_{fold} = \frac{s \cdot w}{\sqrt{v + eps}}$$

$$b_{fold} = \beta - \frac{s \cdot \mu}{\sqrt{v + eps}}$$

wherein v is the variance parameter of the BatchNorm layer, μ is the mean parameter of the BatchNorm layer, β is the translation parameter of the Scale layer, s is the scaling parameter of the Scale layer, eps is a default value, w_fold is the weight parameter of the new convolutional layer, w is the weight parameter of the convolutional layer before the folding process, and b_fold is the bias parameter of the new convolutional layer.
27. The apparatus according to any one of claims 20 to 24, wherein if the current operation layer comprises a fully-connected layer and a batch normalization layer, the batch normalization layer comprising a BatchNorm layer and a Scale layer, the processor is further configured to read computer instructions from the memory to implement:

folding the fully-connected layer, the BatchNorm layer and the Scale layer to form a new fully-connected layer;

and performing fixed point processing on the parameters of the new fully-connected layer.
28. The apparatus of claim 27, wherein the processor is further configured to read computer instructions from the memory to implement:
folding the full connection layer, the BatchNorm layer and the Scale layer according to the following formula to obtain the parameters of the new full connection layer:
$$w_{fold} = \frac{s \cdot w}{\sqrt{v + eps}}$$

$$b_{fold} = \frac{s \cdot (b - \mu)}{\sqrt{v + eps}} + \beta$$

wherein v is the variance parameter of the BatchNorm layer, μ is the mean parameter of the BatchNorm layer, β is the translation parameter of the Scale layer, s is the scaling parameter of the Scale layer, eps is a default value, w_fold is the weight parameter of the new fully-connected layer, w is the weight parameter of the fully-connected layer before the folding process, b_fold is the bias parameter of the new fully-connected layer, and b is the bias parameter of the fully-connected layer before the folding process.
29. The apparatus of any of claims 20 to 28, wherein prior to receiving input data under test, the processor is further configured to read computer instructions from the memory to implement:
obtaining the maximum absolute value of the first data group, and calculating the initial integer part bit width IL_0 according to the maximum value;

calculating IL_{i+1} = IL_i - 1, wherein i ≥ 0; and calculating the maximum floating point value r_max and the minimum floating point value r_min representable under IL_{i+1};

and acquiring the quantization parameter of the first data group according to the initial integer part bit width IL_0, the maximum floating point value r_max and the minimum floating point value r_min.
30. The apparatus of claim 29, wherein the processor is further configured to read computer instructions from the memory to implement:
performing fixed-point processing and anti-fixed-point processing on each data in the first data group according to the initial integer part bit width IL_0 to generate a floating point value r';

using r_min as the lower threshold and r_max as the upper threshold to generate a threshold range [r_min, r_max];

acquiring the data in the first data group that exceed [r_min, r_max] to generate a second data group, calculating the saturation loss of each data in the second data group according to the floating point value r' generated after the fixed-point and anti-fixed-point processing based on IL_0 together with r_max and r_min, and accumulating the saturation losses of the second data group to obtain a first accumulated value ST;

acquiring the data in the first data group that fall within [r_min, r_max] to generate a third data group, calculating the gain of each data in the third data group, and accumulating the gains of the third data group to obtain a second accumulated value G;

and acquiring the quantization parameter of the first data group according to the first accumulated value ST and the second accumulated value G.
31. The apparatus of claim 30, wherein the processor is further configured to read computer instructions from the memory to implement:
when G ≤ K × ST, taking IL_i as the quantization parameter of the first data group; wherein K is a preset value.
32. The apparatus of claim 31, wherein the processor is further configured to read computer instructions from the memory to implement:
when G > K × ST, recalculating IL_{i+1} = IL_i - 1 to update the maximum floating point value r_max and the minimum floating point value r_min representable under IL_{i+1}.
33. The apparatus according to any of claims 30 to 32, wherein the processor is further configured to read computer instructions from the memory to implement:
when r' is a positive number, obtaining the saturation loss of the data by calculating the absolute value of the difference between r' and r_max;

when r' is a negative number, calculating the absolute value of the difference between r' and r_min as the saturation loss of the data.
34. The apparatus according to any of claims 30 to 33, wherein the processor is further configured to read computer instructions from the memory to implement:
respectively acquiring the quantization loss L1 of the data under IL_0 and the quantization loss L2 under IL_i, and calculating the absolute value of the difference between L1 and L2 as the gain of the data; the quantization loss is the difference between any data after fixed-point processing and anti-fixed-point processing and the original data.
35. The apparatus of claim 29, wherein the processor is further configured to read computer instructions from the memory to implement:
using r_min as the lower threshold and r_max as the upper threshold to generate a threshold range [r_min, r_max];

acquiring the number C1 of data in the group of data that exceed [r_min, r_max];

acquiring the number C2 of non-zero data in the group of data;

when C2 ≤ K × C1, taking IL_i as the quantization parameter of the first data group; wherein K is a preset value.
36. The apparatus of claim 35, wherein the processor is further configured to read computer instructions from the memory to implement:
when C2 > K × C1, recalculating IL_{i+1} = IL_i - 1 to update the maximum floating point value r_max and the minimum floating point value r_min representable under IL_{i+1}.
37. The apparatus of any one of claims 20 to 36, wherein the parameters of the convolutional neural network model include weight parameters and bias parameters; the weight parameters, the bias parameters, and the input and output data of the convolutional neural network model are each treated as a group of data, with the total bit width of the fixed point data preset for the fixed-point process; and for each operation layer, corresponding quantization parameters are calculated for the weight parameters, the bias parameters, the input data and the output data respectively according to their distribution ranges.
38. The apparatus according to any of claims 20 to 37, wherein the processor is further configured to read computer instructions from the memory to implement:
performing network parameter training on the convolutional neural network to obtain initial floating point parameters of the convolutional neural network; the network parameters include weight data and bias data;
according to the distribution of the initial floating point parameter, the input data and the output data, obtaining quantization parameters corresponding to the initial floating point parameter, the input data and the output data respectively;
continuing to input training data into the convolutional neural network model based on the initial floating point parameters and the quantization losses under the quantization parameters, and updating the floating point parameters of the convolutional neural network model and the quantization losses under the quantization parameters according to a loss function of the convolutional neural network model;
and when the loss function converges to a preset condition, performing fixed-point processing on the floating-point parameters obtained by current updating based on the quantization parameters, thereby generating the trained convolutional neural network model.
39. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 19.
CN201980009296.4A 2019-05-05 2019-05-05 Data processing method and device based on convolutional neural network architecture Pending CN111656315A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/085539 WO2020223856A1 (en) 2019-05-05 2019-05-05 Data processing method and device based on convolutional neural network architecture

Publications (1)

Publication Number Publication Date
CN111656315A true CN111656315A (en) 2020-09-11

Family

ID=72351879

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201980009296.4A Pending CN111656315A (en) 2019-05-05 2019-05-05 Data processing method and device based on convolutional neural network architecture

Country Status (2)

Country Link
CN (1) CN111656315A (en)
WO (1) WO2020223856A1 (en)


Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111488963B (en) * 2019-01-28 2023-11-24 中科寒武纪科技股份有限公司 Neural network computing device and method
CN112733964B (en) * 2021-02-01 2024-01-19 西安交通大学 Convolutional neural network quantization method for reinforcement learning automatic perception weight distribution
CN113593538B (en) * 2021-09-02 2024-05-03 北京声智科技有限公司 Voice characteristic classification method, related equipment and readable storage medium


Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2321362A (en) * 1997-01-21 1998-07-22 Northern Telecom Ltd Generic processing capability
CN105760933A (en) * 2016-02-18 2016-07-13 清华大学 Method and apparatus for fixed-pointing layer-wise variable precision in convolutional neural network
WO2019075604A1 (en) * 2017-10-16 2019-04-25 深圳市大疆创新科技有限公司 Data fixed-point method and device

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150032449A1 (en) * 2013-07-26 2015-01-29 Nuance Communications, Inc. Method and Apparatus for Using Convolutional Neural Networks in Speech Recognition
CN108229648A (en) * 2017-08-31 2018-06-29 深圳市商汤科技有限公司 Convolutional calculation method and apparatus, electronic equipment, computer storage media

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111860472A (en) * 2020-09-24 2020-10-30 成都索贝数码科技股份有限公司 Television station caption detection method, system, computer equipment and storage medium
CN112560677A (en) * 2020-12-15 2021-03-26 深兰人工智能(深圳)有限公司 Fingerprint identification method and device
CN112561933A (en) * 2020-12-15 2021-03-26 深兰人工智能(深圳)有限公司 Image segmentation method and device
CN112580492A (en) * 2020-12-15 2021-03-30 深兰人工智能(深圳)有限公司 Vehicle detection method and device
CN112702600A (en) * 2020-12-29 2021-04-23 南京大学 Image coding and decoding neural network layered fixed-point method
CN112702600B (en) * 2020-12-29 2022-04-12 南京大学 Image coding and decoding neural network layered fixed-point method
CN112990438A (en) * 2021-03-24 2021-06-18 中国科学院自动化研究所 Full-fixed-point convolution calculation method, system and equipment based on shift quantization operation
CN113159177A (en) * 2021-04-22 2021-07-23 中国科学院自动化研究所 Target detection method, system and equipment based on batch normalization parameter fixed-point

Also Published As

Publication number Publication date
WO2020223856A1 (en) 2020-11-12

Similar Documents

Publication Publication Date Title
CN111656315A (en) Data processing method and device based on convolutional neural network architecture
CN108701250B (en) Data fixed-point method and device
Barrio Performance of the Taylor series method for ODEs/DAEs
Huang et al. Functional coefficient regression models for non‐linear time series: a polynomial spline approach
CN111695671B (en) Method and device for training neural network and electronic equipment
Zhang et al. Robustness in stable generalized finite element methods (SGFEM) applied to Poisson problems with crack singularities
Chen et al. L2–L∞ filtering for stochastic Markovian jump delay systems with nonlinear perturbations
CN116432543B (en) Method for predicting remaining life of power semiconductor module, terminal device and storage medium
CN111461302B (en) Data processing method, device and storage medium based on convolutional neural network
CN110337636A (en) Data transfer device and device
CN115860100A (en) Neural network model training method and device and computing equipment
CN111383157B (en) Image processing method and device, vehicle-mounted operation platform, electronic equipment and system
CN111091183A (en) Neural network acceleration system and method
Ernst et al. A Legendre-based computational method for solving a class of Itô stochastic delay differential equations
Brucoli et al. A global approach to the design of discrete‐time cellular neural networks for associative memories
WO2021213649A1 (en) Method and system for generating a predictive model
CN115759238B (en) Quantization model generation method and device, electronic equipment and storage medium
CN112561050A (en) Neural network model training method and device
US20200104131A1 (en) Method for Operating a Digital Computer to Reduce the Computational Complexity Associated with Dot Products between Large Vectors
CN113111998A (en) Information processing apparatus, computer-readable storage medium, and neural network computing method
CN116472538A (en) Method and system for quantifying neural networks
CN110880037A (en) Neural network operation module and method
CN114662679B (en) Data processing method based on neural network
CN112308199B (en) Data block processing method, device and storage medium
US20230153066A1 (en) Method and apparatus for measuring weight of discrete entity

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication
Application publication date: 20200911