CN110503182A - Network layer operation method and device in deep neural network - Google Patents

Network layer operation method and device in deep neural network

Info

Publication number
CN110503182A
CN110503182A (application CN201810479974.0A)
Authority
CN
China
Prior art keywords
matrix
network layer
linear combination
obtains
parameter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810479974.0A
Other languages
Chinese (zh)
Inventor
张渊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd filed Critical Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN201810479974.0A
Publication of CN110503182A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the present invention provide a network layer operation method and device in a deep neural network. The network layer operation method in the deep neural network comprises: obtaining a parameter matrix of a network layer in the deep neural network; normalizing each weight in the parameter matrix to obtain a parameter normalization matrix; expressing, according to a first preset base matrix, each element in the parameter normalization matrix as a linear combination of elements in the first preset base matrix to obtain a linear combination matrix; and performing an operation on the input quantity of the network layer and the linear combination matrix to obtain the output quantity of the network layer. With this solution, the amount of computation can be reduced while the storage requirement of the deep neural network is effectively reduced.

Description

Network layer operation method and device in deep neural network
Technical field
The present invention relates to the field of machine learning, and in particular to a network layer operation method and device in a deep neural network.
Background Art
A DNN (Deep Neural Network) is an emerging field in machine learning research. It parses data by imitating the mechanism of the human brain, and is an intelligent model that performs analysis and learning by building and simulating the human brain. At present, DNNs such as CNNs (Convolutional Neural Networks), RNNs (Recurrent Neural Networks), and LSTMs (Long Short Term Memory networks) have been applied successfully to target detection and segmentation, behavior detection and recognition, and other fields. However, because the operations of the basic network units in a DNN are generally implemented with multiplications and additions of double-precision or single-precision floating-point data, the amount of data involved in the computation is large, which leads to problems such as the DNN being computation-intensive and storage-intensive.
In current DNNs, fixed-point quantization is mostly used to reduce the computation and storage requirements of the DNN. By converting floating-point numbers into fixed-point numbers, fixed-point quantization can, to a certain extent, reduce the bit width of the weights in the parameter layers of the DNN while maintaining accuracy, and thereby reduce the storage requirement of the DNN. However, the compression ratio of this method is limited, and the compressed fixed-point numbers still require a large amount of computation.
Summary of the invention
The purpose of the embodiments of the present invention is to provide a network layer operation method and device in a deep neural network, so as to reduce the amount of computation while effectively reducing the storage requirement of the deep neural network. The specific technical solutions are as follows:
In a first aspect, an embodiment of the present invention provides a network layer operation method in a deep neural network, the method comprising:
obtaining a parameter matrix of a network layer in the deep neural network;
normalizing each weight in the parameter matrix to obtain a parameter normalization matrix;
expressing, according to a first preset base matrix, each element in the parameter normalization matrix as a linear combination of elements in the first preset base matrix, to obtain a linear combination matrix;
performing an operation on an input quantity of the network layer and the linear combination matrix, to obtain an output quantity of the network layer.
Optionally, after obtaining the parameter matrix of the network layer in the deep neural network, the method further comprises:
counting each weight in the parameter matrix to obtain a weight range of the parameter matrix;
extracting, based on the weight range, a first weight with the largest absolute value in the parameter matrix;
the normalizing each weight in the parameter matrix to obtain a parameter normalization matrix comprises:
dividing each weight in the parameter matrix by the first weight to obtain the parameter normalization matrix;
the performing an operation on the input quantity of the network layer and the linear combination matrix to obtain the output quantity of the network layer comprises:
performing an operation on the input quantity of the network layer and the linear combination matrix, and multiplying the first weight by the operation result, to obtain the output quantity of the network layer.
Optionally, the normalizing each weight in the parameter matrix to obtain a parameter normalization matrix comprises:
dividing each weight in the parameter matrix by the absolute value of the first weight to obtain the parameter normalization matrix;
the performing an operation on the input quantity of the network layer and the linear combination matrix to obtain the output quantity of the network layer comprises:
performing an operation on the input quantity of the network layer and the linear combination matrix, and multiplying the absolute value of the first weight by the operation result, to obtain the output quantity of the network layer.
Optionally, the first preset base matrix is a shift-type base matrix, and each element in the shift-type base matrix is a power of 2;
the expressing, according to the first preset base matrix, each element in the parameter normalization matrix as a linear combination of elements in the first preset base matrix to obtain a linear combination matrix comprises:
expressing, according to the first preset base matrix, each element in the parameter normalization matrix as a linear combination of powers of 2, to obtain the linear combination matrix;
the performing an operation on the input quantity of the network layer and the linear combination matrix to obtain the output quantity of the network layer comprises:
for a first element in the input quantity of the network layer, performing multiple shift operations on the first element according to the exponents of a second element in the linear combination matrix that is to be multiplied with the first element, and adding the results of the multiple shift operations, to obtain the product of the first element and the second element;
performing, according to the products of the elements in the input quantity and the elements in the linear combination matrix, the operation on the input quantity and the linear combination matrix by way of combination, to obtain the output quantity of the network layer.
Optionally, before performing the operation on the input quantity of the network layer and the linear combination matrix to obtain the output quantity of the network layer, the method further comprises:
obtaining an input activation amount of the network layer;
normalizing each element value in the input activation amount to obtain an activation amount normalization matrix;
expressing, according to a second preset base matrix, each element in the activation amount normalization matrix as a linear combination of elements in the second preset base matrix, to obtain the input quantity of the network layer.
Optionally, after obtaining the input activation amount of the network layer, the method further comprises:
counting each element value of the input activation amount to obtain an element value range of the input activation amount;
extracting, based on the element value range, a first element value with the largest absolute value in the input activation amount;
the normalizing each element value in the input activation amount to obtain an activation amount normalization matrix comprises:
dividing each element value in the input activation amount by the first element value to obtain the activation amount normalization matrix;
the performing an operation on the input quantity of the network layer and the linear combination matrix to obtain the output quantity of the network layer comprises:
performing an operation on the input quantity of the network layer and the linear combination matrix, and multiplying the first element value by the operation result, to obtain the output quantity of the network layer.
Optionally, the normalizing each element value in the input activation amount to obtain an activation amount normalization matrix comprises:
dividing each element value in the input activation amount by the absolute value of the first element value to obtain the activation amount normalization matrix;
the performing an operation on the input quantity of the network layer and the linear combination matrix to obtain the output quantity of the network layer comprises:
performing an operation on the input quantity of the network layer and the linear combination matrix, and multiplying the absolute value of the first element value by the operation result, to obtain the output quantity of the network layer.
Optionally, the first preset base matrix and the second preset base matrix are both shift-type base matrices, and each element in a shift-type base matrix is a power of 2;
the expressing, according to the first preset base matrix, each element in the parameter normalization matrix as a linear combination of elements in the first preset base matrix to obtain a linear combination matrix comprises:
expressing, according to the first preset base matrix, each element in the parameter normalization matrix as a linear combination of powers of 2, to obtain the linear combination matrix;
the expressing, according to the second preset base matrix, each element in the activation amount normalization matrix as a linear combination of elements in the second preset base matrix to obtain the input quantity of the network layer comprises:
expressing, according to the second preset base matrix, each element in the activation amount normalization matrix as a linear combination of powers of 2, to obtain the input quantity of the network layer;
the performing an operation on the input quantity of the network layer and the linear combination matrix to obtain the output quantity of the network layer comprises:
performing multiple shift operations on a first element in the input quantity of the network layer according to the sums of the exponents of the first element and the exponents of a second element in the linear combination matrix, and adding the results of the multiple shift operations, to obtain the product of the first element and the second element, the second element being the element in the linear combination matrix that is to be multiplied with the first element;
performing, according to the products of the elements in the input quantity and the elements in the linear combination matrix, the operation on the input quantity and the linear combination matrix by way of combination, to obtain the output quantity of the network layer.
In a second aspect, an embodiment of the present invention provides a network layer operation device in a deep neural network, the device comprising:
a first obtaining module, configured to obtain a parameter matrix of a network layer in the deep neural network;
a first normalization module, configured to normalize each weight in the parameter matrix to obtain a parameter normalization matrix;
a first linear combination module, configured to express, according to a first preset base matrix, each element in the parameter normalization matrix as a linear combination of elements in the first preset base matrix, to obtain a linear combination matrix;
an operation module, configured to perform an operation on an input quantity of the network layer and the linear combination matrix, to obtain an output quantity of the network layer.
Optionally, the device further comprises:
a first statistics module, configured to count each weight in the parameter matrix to obtain a weight range of the parameter matrix;
a first extraction module, configured to extract, based on the weight range, a first weight with the largest absolute value in the parameter matrix;
the first normalization module is specifically configured to:
divide each weight in the parameter matrix by the first weight to obtain the parameter normalization matrix;
the operation module is specifically configured to:
perform an operation on the input quantity of the network layer and the linear combination matrix, and multiply the first weight by the operation result, to obtain the output quantity of the network layer.
Optionally, the first normalization module is further specifically configured to:
divide each weight in the parameter matrix by the absolute value of the first weight to obtain the parameter normalization matrix;
the operation module is further specifically configured to:
perform an operation on the input quantity of the network layer and the linear combination matrix, and multiply the absolute value of the first weight by the operation result, to obtain the output quantity of the network layer.
Optionally, the first preset base matrix is a shift-type base matrix, and each element in the shift-type base matrix is a power of 2;
the first linear combination module is specifically configured to:
express, according to the first preset base matrix, each element in the parameter normalization matrix as a linear combination of powers of 2, to obtain the linear combination matrix;
the operation module is specifically configured to:
for a first element in the input quantity of the network layer, perform multiple shift operations on the first element according to the exponents of a second element in the linear combination matrix that is to be multiplied with the first element, and add the results of the multiple shift operations, to obtain the product of the first element and the second element;
perform, according to the products of the elements in the input quantity and the elements in the linear combination matrix, the operation on the input quantity and the linear combination matrix by way of combination, to obtain the output quantity of the network layer.
Optionally, the device further comprises:
a second obtaining module, configured to obtain an input activation amount of the network layer;
a second normalization module, configured to normalize each element value in the input activation amount to obtain an activation amount normalization matrix;
a second linear combination module, configured to express, according to a second preset base matrix, each element in the activation amount normalization matrix as a linear combination of elements in the second preset base matrix, to obtain the input quantity of the network layer.
Optionally, the device further comprises:
a second statistics module, configured to count each element value of the input activation amount to obtain an element value range of the input activation amount;
a second extraction module, configured to extract, based on the element value range, a first element value with the largest absolute value in the input activation amount;
the second normalization module is specifically configured to:
divide each element value in the input activation amount by the first element value to obtain the activation amount normalization matrix;
the operation module is specifically configured to:
perform an operation on the input quantity of the network layer and the linear combination matrix, and multiply the first element value by the operation result, to obtain the output quantity of the network layer.
Optionally, the second normalization module is further specifically configured to:
divide each element value in the input activation amount by the absolute value of the first element value to obtain the activation amount normalization matrix;
the operation module is further specifically configured to:
perform an operation on the input quantity of the network layer and the linear combination matrix, and multiply the absolute value of the first element value by the operation result, to obtain the output quantity of the network layer.
Optionally, the first preset base matrix and the second preset base matrix are both shift-type base matrices, and each element in a shift-type base matrix is a power of 2;
the first linear combination module is specifically configured to:
express, according to the first preset base matrix, each element in the parameter normalization matrix as a linear combination of powers of 2, to obtain the linear combination matrix;
the second linear combination module is specifically configured to:
express, according to the second preset base matrix, each element in the activation amount normalization matrix as a linear combination of powers of 2, to obtain the input quantity of the network layer;
the operation module is specifically configured to:
perform multiple shift operations on a first element in the input quantity of the network layer according to the sums of the exponents of the first element and the exponents of a second element in the linear combination matrix, and add the results of the multiple shift operations, to obtain the product of the first element and the second element, the second element being the element in the linear combination matrix that is to be multiplied with the first element;
perform, according to the products of the elements in the input quantity and the elements in the linear combination matrix, the operation on the input quantity and the linear combination matrix by way of combination, to obtain the output quantity of the network layer.
In the network layer operation method and device in a deep neural network provided by the embodiments of the present invention, a parameter matrix of a network layer in the deep neural network is obtained; each weight in the parameter matrix is normalized to obtain a parameter normalization matrix; according to a first preset base matrix, each element in the parameter normalization matrix is expressed as a linear combination of elements in the first preset base matrix to obtain a linear combination matrix; and an operation is performed on the input quantity of the network layer and the obtained linear combination matrix to obtain the output quantity of the network layer. By expressing the parameter matrix of the network layer as a linear combination of elements in the first preset base matrix, the weights in the parameter matrix, which were originally floating-point numbers, can be represented as linear combinations of fixed-point numbers; fixed-point numbers can effectively reduce the storage requirement of the deep neural network. Moreover, through the normalization of each weight in the parameter matrix, the normalized weights are consistent with one another and the operations on the weights are easier to implement, so that the amount of computation of the deep neural network is greatly reduced.
Brief description of the drawings
In order to explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the accompanying drawings in the following description show only some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings based on these drawings without creative effort.
Fig. 1 is a schematic flowchart of a network layer operation method in a deep neural network according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of making a multiplication operation equivalent to shift and add operations according to an embodiment of the present invention;
Fig. 3 is a schematic flowchart of a network layer operation method in a deep neural network according to another embodiment of the present invention;
Fig. 4 is another schematic diagram of making a multiplication operation equivalent to shift and add operations according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a network layer operation device in a deep neural network according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a network layer operation device in a deep neural network according to another embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a computer device according to an embodiment of the present invention.
Detailed description of embodiments
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention, rather than all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
In order to reduce the amount of computation while effectively reducing the storage requirement of a deep neural network, the embodiments of the present invention provide a network layer operation method, a device, and a computer device in a deep neural network.
The network layer operation method in a deep neural network provided by the embodiments of the present invention is introduced first.
The execution subject of the network layer operation method in a deep neural network provided by the embodiments of the present invention may be a computer device that executes an intelligent algorithm. The computer device may be a device for target detection and segmentation, behavior detection and recognition, or speech recognition, or a smart device, for example a remote computer, a remote server, a smart camera, or a smart audio device for speech recognition; the execution subject should at least include a processor equipped with a core processing chip. The network layer operation method in a deep neural network provided by the embodiments of the present invention may be implemented as at least one of software, a hardware circuit, and a logic circuit provided in the execution subject.
As shown in Fig. 1, a network layer operation method in a deep neural network provided by an embodiment of the present invention may include the following steps:
S101: obtaining a parameter matrix of a network layer in the deep neural network.
The network layers in a deep neural network include the activation amounts and the parameter layers used for carrying out network operations such as convolution and dot product. The parameter layers include convolution (Convolution) layers and fully connected layers, and each parameter layer contains a parameter matrix used for carrying out the network operation. An activation amount is the data stream transmitted between two network layers, and mainly includes: the input/output of a convolution (Convolution) layer, the input/output of an inner product (Inner Product) layer, the input/output of a rectified linear unit (ReLU) layer, the input/output of a batch normalization (Batch Normalization) layer, the input/output of a scale (Scale) layer, the input/output of a concatenation (Concat) layer, and so on. A DNN (Deep Neural Network) is a rather broad data processing method; the DNN may be any one of data processing methods such as a CNN (Convolutional Neural Network), an RNN (Recurrent Neural Network), and an LSTM (Long Short Term Memory network). The parameter matrix includes the specific weights of a convolution layer or the specific weights of a fully connected layer.
S102: normalizing each weight in the parameter matrix to obtain a parameter normalization matrix.
Normalizing each weight in the parameter matrix ensures the consistency of the weights. For example, the normalization may unify each weight into a fixed numerical range, so as to guarantee the accuracy of the operation; if the data differ greatly from one another, large errors will appear in the results of data operations. Illustratively, each weight may be normalized into the numerical range from -1 to 1. Correspondingly, in an optional embodiment, after the step of obtaining the parameter matrix of the network layer in the deep neural network, the method may further include the following steps:
In a first step, counting each weight in the parameter matrix to obtain the weight range of the parameter matrix;
In a second step, extracting, based on the weight range, the first weight with the largest absolute value in the parameter matrix.
After the parameter matrix is obtained, each weight in the parameter matrix can be counted, and the weight range of the parameter matrix can be obtained. For example, if the largest weight in the parameter matrix is 5 and the smallest weight is -3, the weight range of the parameter matrix is [-3, 5]. Since each weight needs to be normalized into the numerical range from -1 to 1, the weight with the largest absolute value needs to be used as the divisor in a division with each weight, so as to guarantee that each normalized weight lies between -1 and 1. Therefore, the weight with the largest absolute value in the parameter matrix can be extracted based on the weight range.
After the first weight with the largest absolute value in the parameter matrix is extracted, the parameter matrix can be normalized based on the first weight. Optionally, the step of normalizing each weight in the parameter matrix to obtain the parameter normalization matrix may be implemented in the following manner:
dividing each weight in the parameter matrix by the first weight to obtain the parameter normalization matrix;
or,
dividing each weight in the parameter matrix by the absolute value of the first weight to obtain the parameter normalization matrix.
In the process of normalizing each weight in the parameter matrix into the range from -1 to 1, each weight can be directly divided by the first weight. If the first weight is a positive number, each element in the resulting parameter normalization matrix has the same sign as the corresponding weight; if the first weight is a negative number, each element in the resulting parameter normalization matrix is the opposite of the corresponding weight. The normalization may also divide each weight by the absolute value of the first weight, in which case each element in the resulting parameter normalization matrix has the same sign as the corresponding weight. When the parameter normalization matrix is calculated, the first weight and the form of the normalization need to be recorded, so that, when subsequent operations are carried out, the operation result can be restored with the first weight or its absolute value to obtain an accurate operation result.
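A minimal sketch of this normalization step, assuming NumPy and a plain two-dimensional weight matrix (the function and variable names are illustrative and not taken from the disclosure):

```python
import numpy as np

def normalize_parameters(param_matrix: np.ndarray):
    """Normalize every weight into [-1, 1] by the weight of largest absolute value."""
    # Weight range of the parameter matrix, e.g. [-3, 5] in the example above.
    w_min, w_max = param_matrix.min(), param_matrix.max()
    # First weight: the weight with the largest absolute value.
    first_weight = w_max if abs(w_max) >= abs(w_min) else w_min
    # Dividing by the absolute value preserves the signs; dividing by
    # first_weight itself is the other variant described above.
    scale = abs(first_weight)
    param_norm = param_matrix / scale
    # The scale must be recorded so the layer output can be restored later.
    return param_norm, scale

# Usage: param_norm, scale = normalize_parameters(np.array([[5.0, -3.0], [0.5, 1.25]]))
```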
S103: expressing, according to a first preset base matrix, each element in the parameter normalization matrix as a linear combination of elements in the first preset base matrix, to obtain a linear combination matrix.
The first preset base matrix is a base matrix defined in advance. This base matrix may be a shift-type base matrix in which each element is a power of 2, or a diagonal matrix, or an identity matrix, or a matrix composed of arbitrary element values; the value of each element in the base matrix is set according to the actual application scenario. For example, with a suitably set first preset base matrix, each element in the parameter normalization matrix can be expressed as a linear combination of the elements of the first preset base matrix, so that elements that were originally floating-point values can be equivalently mapped to and stored as data of a fixed bit width, thereby achieving the purpose of reducing the storage requirement of the deep neural network.
In the embodiments of the present application, for ease of hardware implementation, the first preset base matrix may be a shift-type base matrix, in which each element is a power of 2.
In the embodiments of the present application, the step of expressing, according to the first preset base matrix, each element in the parameter normalization matrix as a linear combination of elements in the first preset base matrix to obtain the linear combination matrix may be implemented in the following manner:
expressing, according to the first preset base matrix, each element in the parameter normalization matrix as a linear combination of powers of 2, to obtain the linear combination matrix.
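As an illustrative sketch of one way such a decomposition could be carried out (a greedy residual method is assumed here for illustration; the disclosure does not fix a particular decomposition algorithm), each normalized weight is approximated by a short signed sum of powers of 2:

```python
import math

def decompose_pow2(value: float, num_terms: int = 4, min_exp: int = -8):
    """Approximate `value` (in [-1, 1]) by a signed sum of at most `num_terms` powers of 2.

    Returns a list of (sign, exponent) pairs such that
    value ~ sum(sign * 2**exponent for sign, exponent in terms).
    """
    terms, residual = [], value
    for _ in range(num_terms):
        if residual == 0.0:
            break
        sign = 1 if residual > 0 else -1
        # Largest power of 2 not exceeding |residual|, clamped to the base-matrix range.
        exp = max(min_exp, min(0, math.floor(math.log2(abs(residual)))))
        terms.append((sign, exp))
        residual -= sign * (2.0 ** exp)
    return terms

# Example: 0.8125 = 2**-1 + 2**-2 + 2**-4
# decompose_pow2(0.8125)  ->  [(1, -1), (1, -2), (1, -4)]
```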
S104: performing an operation on the input quantity of the network layer and the linear combination matrix, to obtain the output quantity of the network layer.
After the linear combination matrix is obtained, the input quantity of the network layer can be directly operated on with the linear combination matrix, for example by a convolution operation or a dot-product operation, and the output quantity of the network layer can be obtained through the operation. Since the weights in the parameter matrix were normalized in S102, after the operation on the input quantity of the network layer and the linear combination matrix, the operation result can be processed according to the normalization procedure, and the output quantity of the network layer is obtained after the processing.
According to the above embodiment, the normalization may divide each weight in the parameter matrix by the first weight with the largest absolute value. In this case, optionally, the step of performing an operation on the input quantity of the network layer and the linear combination matrix to obtain the output quantity of the network layer may be implemented in the following manner:
performing the operation on the input quantity of the network layer and the linear combination matrix, and multiplying the first weight by the operation result, to obtain the output quantity of the network layer.
Since each weight is scaled down when the parameter matrix is normalized by dividing each weight by the first weight with the largest absolute value, in order to guarantee the accuracy of the operation result, after the operation on the input quantity of the network layer and the linear combination matrix yields an operation result, the first weight needs to be multiplied by that operation result; the accurate output quantity of the network layer can then be obtained.
In addition, according to the above embodiment, the normalization may divide each weight in the parameter matrix by the absolute value of the first weight with the largest absolute value. In this case, optionally, the step of performing an operation on the input quantity of the network layer and the linear combination matrix to obtain the output quantity of the network layer may be implemented in the following manner:
performing the operation on the input quantity of the network layer and the linear combination matrix, and multiplying the absolute value of the first weight by the operation result, to obtain the output quantity of the network layer.
Since each weight is scaled down when the parameter matrix is normalized by dividing each weight by the absolute value of the first weight with the largest absolute value, in order to guarantee the accuracy of the operation result, after the operation on the input quantity of the network layer and the linear combination matrix yields an operation result, the absolute value of the first weight needs to be multiplied by that operation result; the accurate output quantity of the network layer can then be obtained.
As described in S103, if the first preset base matrix is a shift-type base matrix, each multiplication unit can be made equivalent to shift and add operations. Optionally, the step of performing an operation on the input quantity of the network layer and the linear combination matrix to obtain the output quantity of the network layer may be implemented in the following manner:
In a first step, for a first element in the input quantity of the network layer, performing multiple shift operations on the first element according to the exponents of the second element in the linear combination matrix that is to be multiplied with the first element, and adding the results of the multiple shift operations, to obtain the product of the first element and the second element;
In a second step, performing, according to the products of the elements in the input quantity and the elements in the linear combination matrix, the operation on the input quantity and the linear combination matrix by way of combination, to obtain the output quantity of the network layer.
By making the multiplication units equivalent to shift and add operations, the amount of computation of the deep neural network can be greatly reduced. Fig. 2 shows a schematic diagram in which the multiplication operation of a multiplication unit is made equivalent to shift and add operations, where x is the first element, w is the second element, y is the product of x and w, and k1, k2, ..., kN denote the exponents of the second element in the linear combination matrix, i.e. the numbers of bits by which the first element in the input quantity is shifted each time.
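A minimal sketch of the shift-and-add product of Fig. 2, assuming the power-of-2 decomposition sketched above; math.ldexp(x, k) stands in for shifting x by k bits, with a negative k corresponding to a right shift in a fixed-point implementation:

```python
import math

def shift_add_multiply(x: float, w_terms) -> float:
    """Compute y = x * w, where w is given as signed power-of-2 terms [(sign, k), ...]."""
    y = 0.0
    for sign, k in w_terms:
        # Shifting x by k bits is equivalent to multiplying it by 2**k.
        y += sign * math.ldexp(x, k)
    return y

# y = x * 0.8125 realized as (x >> 1) + (x >> 2) + (x >> 4):
# shift_add_multiply(3.2, [(1, -1), (1, -2), (1, -4)])  ->  2.6
```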
With this embodiment, the parameter matrix of the network layer in the deep neural network is obtained; each weight in the parameter matrix is normalized to obtain a parameter normalization matrix; according to the first preset base matrix, each element in the parameter normalization matrix is expressed as a linear combination of elements in the first preset base matrix to obtain a linear combination matrix; and the input quantity of the network layer is operated on with the obtained linear combination matrix, so that the output quantity of the network layer can be obtained. By expressing the parameter matrix of the network layer as a linear combination of elements in the first preset base matrix, the weights in the parameter matrix, which were originally floating-point numbers, can be represented as linear combinations of fixed-point numbers; fixed-point numbers can effectively reduce the storage requirement of the deep neural network. Moreover, through the normalization of each weight in the parameter matrix, the normalized weights are consistent with one another and the operations on the weights are easier to implement, so that the amount of computation of the deep neural network is greatly reduced.
Based on the embodiment shown in Fig. 1, an embodiment of the present invention further provides a network layer operation method in a deep neural network. As shown in Fig. 3, the network layer operation method in the deep neural network includes the following steps:
S301: obtaining a parameter matrix of a network layer in the deep neural network.
S302: normalizing each weight in the parameter matrix to obtain a parameter normalization matrix.
S303: expressing, according to a first preset base matrix, each element in the parameter normalization matrix as a linear combination of elements in the first preset base matrix, to obtain a linear combination matrix.
S304: obtaining an input activation amount of the network layer.
The input activation amount of the network layer is the activation amount input into a parameter layer of the network layer, and may include the input of a convolution (Convolution) layer, the input of an inner product (Inner Product) layer, the input of a rectified linear unit (ReLU) layer, the input of a batch normalization (Batch Normalization) layer, the input of a scale (Scale) layer, the input of a concatenation (Concat) layer, and so on.
S305: normalizing each element value in the input activation amount to obtain an activation amount normalization matrix.
Normalizing each element value in the input activation amount ensures the consistency of the element values. Specifically, the normalization may unify each element value into a fixed numerical range, so as to guarantee the accuracy of the operation; if the data differ greatly from one another, large errors will appear in the results of data operations. Illustratively, each element value in the input activation amount may be normalized into the numerical range from -1 to 1. Correspondingly, in an optional embodiment, after the step of obtaining the input activation amount of the network layer, the method may further include the following steps:
In a first step, counting each element value of the input activation amount to obtain the element value range of the input activation amount;
In a second step, extracting, based on the element value range, the first element value with the largest absolute value in the input activation amount.
After the input activation amount is obtained, each element value in the input activation amount can be counted, and the element value range of the input activation amount can be obtained. For example, if the largest element value in the input activation amount is 26 and the smallest element value is -19, the element value range of the input activation amount is [-19, 26]. Since each element value needs to be normalized into the numerical range from -1 to 1, the element value with the largest absolute value needs to be used as the divisor in a division with each element value, so as to guarantee that each normalized element value lies between -1 and 1. Therefore, the element value with the largest absolute value in the input activation amount can be extracted based on the element value range.
After the first element value with the largest absolute value in the input activation amount is extracted, the input activation amount can be normalized based on the first element value. In the embodiments of the present application, the step of normalizing each element value in the input activation amount to obtain the activation amount normalization matrix may be implemented in the following manner:
dividing each element value in the input activation amount by the first element value to obtain the activation amount normalization matrix;
or,
dividing each element value in the input activation amount by the absolute value of the first element value to obtain the activation amount normalization matrix.
In the process of normalizing each element value in the input activation amount into the range from -1 to 1, each element value can be directly divided by the first element value. If the first element value is a positive number, each element in the resulting activation amount normalization matrix has the same sign as the corresponding element value in the input activation amount; if the first element value is a negative number, each element in the resulting activation amount normalization matrix is the opposite of the corresponding element value in the input activation amount. The normalization may also divide each element value in the input activation amount by the absolute value of the first element value, in which case each element in the resulting activation amount normalization matrix has the same sign as the corresponding element value in the input activation amount. When the activation amount normalization matrix is calculated, the first element value and the form of the normalization need to be recorded, so that, when subsequent operations are carried out, the operation result can be restored with the first element value or its absolute value to obtain an accurate operation result.
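The activation normalization mirrors the parameter normalization sketched earlier; a minimal sketch (NumPy assumed, names illustrative) that records the activation scale for the later restoration:

```python
import numpy as np

def normalize_activation(activation: np.ndarray):
    """Normalize the input activation amount into [-1, 1] and record its scale."""
    # First element value: the element with the largest absolute value, e.g. 26 for [-19, 26].
    first_value = activation.flat[np.argmax(np.abs(activation))]
    act_scale = abs(first_value)   # variant that divides by the absolute value
    act_norm = activation / act_scale
    return act_norm, act_scale
```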
S306: expressing, according to a second preset base matrix, each element in the activation amount normalization matrix as a linear combination of elements in the second preset base matrix, to obtain the input quantity of the network layer.
The second preset base matrix is a base matrix defined in advance. This base matrix may be a shift-type base matrix in which each element is a power of 2, or a diagonal matrix, an identity matrix, and so on; the value of each element in the base matrix is set according to the actual application scenario, and the second preset base matrix may be the same as, or different from, the first preset base matrix. With a suitably set second preset base matrix, each element in the activation amount normalization matrix can be expressed as a linear combination of the elements of the second preset base matrix, so that elements that were originally floating-point values can be equivalently mapped to and stored as data of a fixed bit width, which further reduces the storage requirement of the deep neural network.
In the embodiments of the present application, for ease of hardware implementation, both the first preset base matrix and the second preset base matrix may be shift-type base matrices, in which each element is a power of 2.
In the embodiments of the present application, the step of expressing, according to the first preset base matrix, each element in the parameter normalization matrix as a linear combination of elements in the first preset base matrix to obtain the linear combination matrix may be implemented in the following manner:
expressing, according to the first preset base matrix, each element in the parameter normalization matrix as a linear combination of powers of 2, to obtain the linear combination matrix.
In the embodiments of the present application, the step of expressing, according to the second preset base matrix, each element in the activation amount normalization matrix as a linear combination of elements in the second preset base matrix to obtain the input quantity of the network layer may be implemented in the following manner:
expressing, according to the second preset base matrix, each element in the activation amount normalization matrix as a linear combination of powers of 2, to obtain the input quantity of the network layer.
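Under the same assumptions as the decomposition sketch above, the activation elements can be decomposed in the same way, so that both operands of every multiplication become short sums of powers of 2 (illustrative reuse of the earlier sketches):

```python
# Illustrative reuse of decompose_pow2 on the normalized operands from the earlier sketches.
act_norm_terms = [[decompose_pow2(v) for v in row] for row in act_norm.tolist()]
param_norm_terms = [[decompose_pow2(v) for v in row] for row in param_norm.tolist()]
```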
S307: performing an operation on the input quantity of the network layer and the linear combination matrix, to obtain the output quantity of the network layer.
After the input quantity of the network layer and the linear combination matrix are obtained, the input quantity of the network layer can be directly operated on with the linear combination matrix, for example by a convolution operation or a dot-product operation, and the output quantity of the network layer can be obtained through the operation. Since both the weights in the parameter matrix and the element values in the input activation amount have been normalized, after the operation on the input quantity of the network layer and the linear combination matrix, the operation result can be processed according to the normalization procedures, and the output quantity of the network layer is obtained after the processing.
According to the above embodiment, normalizing the input activation amount may divide each element value in the input activation amount by the first element value with the largest absolute value. In this case, optionally, the step of performing an operation on the input quantity of the network layer and the linear combination matrix to obtain the output quantity of the network layer may be implemented in the following manner:
performing the operation on the input quantity of the network layer and the linear combination matrix, and multiplying the first element value by the operation result, to obtain the output quantity of the network layer.
Since each element value is scaled down when the input activation amount is normalized by dividing each element value by the first element value with the largest absolute value, in order to guarantee the accuracy of the operation result, after the operation on the input quantity of the network layer and the linear combination matrix yields an operation result, the first element value needs to be multiplied by that operation result; the accurate output quantity of the network layer can then be obtained. Based on the above embodiments, if the parameter matrix has also undergone a similar normalization in which each weight is divided by the first weight with the largest absolute value, the output quantity of the network layer is obtained by multiplying the first weight, the first element value, and the operation result; or, if the normalization of the parameter matrix divides each weight by the absolute value of the first weight with the largest absolute value, the output quantity of the network layer can be obtained by multiplying the absolute value of the first weight, the first element value, and the operation result.
In addition, according to the above embodiment, normalizing the input activation amount may also divide each element value in the input activation amount by the absolute value of the first element value with the largest absolute value. In this case, optionally, the step of performing an operation on the input quantity of the network layer and the linear combination matrix to obtain the output quantity of the network layer may be implemented in the following manner:
performing the operation on the input quantity of the network layer and the linear combination matrix, and multiplying the absolute value of the first element value by the operation result, to obtain the output quantity of the network layer.
Since each element value is scaled down when the input activation amount is normalized by dividing each element value by the absolute value of the first element value with the largest absolute value, in order to guarantee the accuracy of the operation result, after the operation on the input quantity of the network layer and the linear combination matrix yields an operation result, the absolute value of the first element value needs to be multiplied by that operation result; the accurate output quantity of the network layer can then be obtained. Based on the above embodiments, if the parameter matrix has also undergone a similar normalization in which each weight is divided by the absolute value of the first weight with the largest absolute value, the output quantity of the network layer can be obtained by multiplying the absolute value of the first weight, the absolute value of the first element value, and the operation result; or, if the normalization of the parameter matrix divides each weight by the first weight with the largest absolute value, the output quantity of the network layer can be obtained by multiplying the first weight, the absolute value of the first element value, and the operation result.
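Putting the two normalizations together, a minimal end-to-end sketch of one layer (a dot product is chosen for brevity; all names are illustrative, and the power-of-2 decomposition step is left out so that only the restoration of the two recorded scales is shown):

```python
import numpy as np

def layer_forward(activation: np.ndarray, param_matrix: np.ndarray) -> np.ndarray:
    """Normalized dot-product layer: operate on normalized operands, then restore scales."""
    act_norm, act_scale = normalize_activation(activation)      # sketched above
    param_norm, w_scale = normalize_parameters(param_matrix)    # sketched above
    # In hardware the normalized operands would be expressed as power-of-2 terms
    # and multiplied by shifts and adds; a plain matmul stands in for that step here.
    result = act_norm @ param_norm
    # Restore the recorded scales (absolute-value variant of both normalizations).
    return w_scale * act_scale * result
```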
If the first preset base matrix is a shift-type base matrix and the second preset base matrix is also a shift-type base matrix, each multiplication unit can be made equivalent to shift and add operations. Optionally, the step of performing an operation on the input quantity of the network layer and the linear combination matrix to obtain the output quantity of the network layer may be implemented in the following manner:
In a first step, performing multiple shift operations on a first element in the input quantity of the network layer according to the sums of the exponents of the first element and the exponents of a second element in the linear combination matrix, and adding the results of the multiple shift operations, to obtain the product of the first element and the second element, where the second element is the element in the linear combination matrix that is to be multiplied with the first element;
In a second step, performing, according to the products of the elements in the input quantity and the elements in the linear combination matrix, the operation on the input quantity and the linear combination matrix by way of combination, to obtain the output quantity of the network layer.
By expressing each element in the input activation amount as a linear combination of powers of 2, each element in the input activation amount can be expressed in the form of shifts and additions; combined with the power-of-2 representation of the weights in the parameter matrix, each element in the input activation amount can then be expressed in the form of multiple shifts and additions. By making the multiplication of the input activation amount and the floating-point weights equivalent to shift and add operations, the amount of computation of the deep neural network can be greatly reduced. Fig. 4 shows a schematic diagram in which the multiplication operation is made equivalent to shift and add operations, where x is the first element, w is the second element, y is the product of x and w, m1, m2, ..., mN denote the exponents of the first element in the input quantity, k1, k2, ..., kN denote the exponents of the second element in the linear combination matrix, and k1+m1, ..., kN+mN are the numbers of bits by which the shifts are performed each time.
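A minimal sketch of the product of Fig. 4 when both operands are given as signed sums of powers of 2 (the interpretation assumed here is that every pairwise term contributes 2 raised to the sum of the two exponents, which in hardware is a single shift):

```python
def shift_only_multiply(x_terms, w_terms) -> float:
    """x and w are both lists of (sign, exponent); their product needs no multiplier."""
    y = 0.0
    for sx, m in x_terms:          # x = sum(sx * 2**m)
        for sw, k in w_terms:      # w = sum(sw * 2**k)
            y += sx * sw * 2.0 ** (k + m)   # one shift by k + m bits per term pair
    return y

# 0.75 * 0.625 = (2**-1 + 2**-2) * (2**-1 + 2**-3) = 0.46875
# shift_only_multiply([(1, -1), (1, -2)], [(1, -1), (1, -3)])  ->  0.46875
```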
With this embodiment, the parameter matrix of the network layer in the deep neural network is obtained; each weight in the parameter matrix is normalized to obtain a parameter normalization matrix; according to the first preset base matrix, each element in the parameter normalization matrix is expressed as a linear combination of elements in the first preset base matrix to obtain a linear combination matrix; and the input quantity of the network layer is operated on with the obtained linear combination matrix, so that the output quantity of the network layer can be obtained. By expressing the parameter matrix of the network layer as a linear combination of elements in the first preset base matrix, the weights in the parameter matrix, which were originally floating-point numbers, can be represented as linear combinations of fixed-point numbers; fixed-point numbers can effectively reduce the storage requirement of the deep neural network. Moreover, through the normalization of each weight in the parameter matrix, the normalized weights are consistent with one another and the operations on the weights are easier to implement, so that the amount of computation of the deep neural network is greatly reduced. Furthermore, by expressing the input activation amount of the network layer as a linear combination of elements in the second preset base matrix, the element values in the input activation amount, which were originally floating-point numbers, can be represented as linear combinations of fixed-point numbers, which further reduces the storage requirement of the deep neural network. By making the multiplication of the input activation amount and the parameter matrix equivalent to shift and add operations, the amount of computation is greatly reduced; and since a shift-type base matrix is easy to implement in hardware, the hardware overhead is effectively reduced.
Corresponding to the above method embodiments, an embodiment of the present invention provides a network layer operation device in a deep neural network. As shown in Fig. 5, the network layer operation device in the deep neural network may include:
a first obtaining module 510, configured to obtain a parameter matrix of a network layer in the deep neural network;
a first normalization module 520, configured to normalize each weight in the parameter matrix to obtain a parameter normalization matrix;
a first linear combination module 530, configured to express, according to a first preset base matrix, each element in the parameter normalization matrix as a linear combination of elements in the first preset base matrix, to obtain a linear combination matrix;
an operation module 540, configured to perform an operation on an input quantity of the network layer and the linear combination matrix, to obtain an output quantity of the network layer.
In an embodiment of the present application, the apparatus may further include:
a first statistics module, configured to perform statistics on each weight in the parameter matrix to obtain a weight range of the parameter matrix;
a first extraction module, configured to extract, based on the weight range, a first weight having a largest absolute value in the parameter matrix.
The first normalization module 520 may be specifically configured to:
divide each weight in the parameter matrix by the first weight to obtain the parameter normalization matrix.
The operation module 540 may be specifically configured to:
operate on the input quantity of the network layer and the linear combination matrix, and multiply the first weight by the operation result to obtain the output quantity of the network layer.
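A small numeric check of the normalization described above, under the assumption of a fully connected layer implemented as a matrix multiplication (variable names are illustrative): dividing every weight by the first weight and multiplying the first weight back into the operation result yields the same output as operating with the original parameter matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 3)).astype(np.float32)     # parameter matrix of the network layer
x = rng.normal(size=(1, 4)).astype(np.float32)     # input quantity of the network layer

s = float(W.flat[np.argmax(np.abs(W))])            # first weight: the entry with the largest absolute value
W_norm = W / s                                     # parameter normalization matrix, entries within [-1, 1]
y = (x @ W_norm) * s                               # multiply the first weight back into the operation result

assert np.allclose(y, x @ W)                       # same output as with the original parameter matrix
```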
In an embodiment of the present application, the first normalization module 520 may also be specifically configured to:
divide each weight in the parameter matrix by the absolute value of the first weight to obtain the parameter normalization matrix.
The operation module 540 may also be specifically configured to:
operate on the input quantity of the network layer and the linear combination matrix, and multiply the absolute value of the first weight by the operation result to obtain the output quantity of the network layer.
In an embodiment of the present application, the first preset base matrix is a shift-type base matrix, and each element of the shift-type base matrix is a power of 2.
The first linear combination module 530 may be specifically configured to:
express, according to the first preset base matrix, each element of the parameter normalization matrix as a linear combination of powers of 2 to obtain the linear combination matrix.
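One way such a linear combination of powers of 2 could be computed is a greedy expansion, sketched below; the number of terms, the exponent range, and the function name are assumptions chosen for illustration and are not prescribed by the patent.

```python
import math

def pow2_combination(v, num_terms=3, exp_range=(-8, 0)):
    """Greedily approximate a normalized weight v by a sum of signed powers of 2.

    Returns a list of (sign, exponent) pairs such that v ≈ sum(sign * 2**exponent).
    """
    terms, residual = [], float(v)
    for _ in range(num_terms):
        if residual == 0.0:
            break
        sign = 1.0 if residual > 0 else -1.0
        exponent = round(math.log2(abs(residual)))            # nearest exponent for the current residual
        exponent = max(exp_range[0], min(exp_range[1], exponent))
        terms.append((sign, exponent))
        residual -= sign * 2.0 ** exponent
    return terms

# 0.7 ≈ 2**-1 + 2**-2 - 2**-4 = 0.6875
print(pow2_combination(0.7))
```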
The operation module 540 may be specifically configured to:
for a first element in the input quantity of the network layer, shift the first element multiple times, once for each power exponent of a second element in the linear combination matrix that is to be multiplied by the first element, and add the results of the multiple shifts to obtain the product of the first element and the second element; and
according to the products of the elements of the input quantity and the elements of the linear combination matrix, operate on the input quantity and the linear combination matrix by combining these products, to obtain the output quantity of the network layer.
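A minimal sketch of this shift-and-add evaluation for one layer, assuming the weights have already been decomposed into (sign, exponent) pairs as above while the input elements remain ordinary numbers; shifts are written as multiplications by 2**exponent so that negative exponents are also covered, and all names and values are illustrative.

```python
def element_product(x, weight_terms):
    """Product of one input element with one decomposed weight: shift x once per exponent, then add."""
    return sum(sign * x * 2.0 ** exponent for sign, exponent in weight_terms)

def layer_output(x_vec, weight_term_matrix, scale):
    """Combine the per-element products column by column into the layer's output quantity."""
    outputs = []
    for column in zip(*weight_term_matrix):               # one column of decomposed weights per output
        acc = sum(element_product(x, terms) for x, terms in zip(x_vec, column))
        outputs.append(scale * acc)                       # multiply the first weight (scale) back in
    return outputs

# two inputs, two outputs; weights 0.5 and -0.75 in the first row, 0.25 and 1.0 in the second
terms = [[[(1, -1)], [(-1, 0), (1, -2)]],
         [[(1, -2)], [(1, 0)]]]
print(layer_output([2.0, 4.0], terms, scale=1.0))          # expected: [2*0.5 + 4*0.25, 2*-0.75 + 4*1.0]
```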
With this embodiment, the parameter matrix of a network layer in the deep neural network is obtained; each weight in the parameter matrix is normalized to obtain a parameter normalization matrix; according to a first preset base matrix, each element of the parameter normalization matrix is expressed as a linear combination of elements of the first preset base matrix to obtain a linear combination matrix; and the input quantity of the network layer is operated on with the obtained linear combination matrix to obtain the output quantity of the network layer. Because the parameter matrix of the network layer is expressed as linear combinations of elements of the first preset base matrix, weights that were originally floating-point numbers can be represented as linear combinations of fixed-point numbers, and fixed-point numbers effectively reduce the storage of the deep neural network; and because each weight in the parameter matrix is normalized, the normalized weights are consistent with one another and easier to operate on, so the computation of the deep neural network is greatly reduced.
Based on the embodiment shown in Fig. 5, an embodiment of the present invention further provides a network layer operation apparatus in a deep neural network. As shown in Fig. 6, the network layer operation apparatus in the deep neural network may include:
a first obtaining module 610, configured to obtain a parameter matrix of a network layer in the deep neural network;
a first normalization module 620, configured to normalize each weight in the parameter matrix to obtain a parameter normalization matrix;
a first linear combination module 630, configured to express, according to a first preset base matrix, each element of the parameter normalization matrix as a linear combination of elements of the first preset base matrix to obtain a linear combination matrix;
a second obtaining module 640, configured to obtain an input activation of the network layer;
a second normalization module 650, configured to normalize each element value in the input activation to obtain an activation normalization matrix;
a second linear combination module 660, configured to express, according to a second preset base matrix, each element of the activation normalization matrix as a linear combination of elements of the second preset base matrix to obtain an input quantity of the network layer;
an operation module 670, configured to operate on the input quantity of the network layer and the linear combination matrix to obtain an output quantity of the network layer.
In an embodiment of the present application, the apparatus further includes:
a second statistics module, configured to perform statistics on each element value of the input activation to obtain an element value range of the input activation;
a second extraction module, configured to extract, based on the element value range, a first element value having a largest absolute value in the input activation.
The second normalization module 650 may be specifically configured to:
divide each element value in the input activation by the first element value to obtain the activation normalization matrix.
The operation module 670 may be specifically configured to:
operate on the input quantity of the network layer and the linear combination matrix, and multiply the first element value by the operation result to obtain the output quantity of the network layer.
In an embodiment of the present application, the second normalization module 650 may also be specifically configured to:
divide each element value in the input activation by the absolute value of the first element value to obtain the activation normalization matrix.
The operation module 670 may also be specifically configured to:
operate on the input quantity of the network layer and the linear combination matrix, and multiply the absolute value of the first element value by the operation result to obtain the output quantity of the network layer.
In an embodiment of the present application, the first preset base matrix and the second preset base matrix are shift-type base matrices, and each element of the shift-type base matrices is a power of 2.
The first linear combination module 630 may be specifically configured to:
express, according to the first preset base matrix, each element of the parameter normalization matrix as a linear combination of powers of 2 to obtain the linear combination matrix.
The second linear combination module 660 may be specifically configured to:
express, according to the second preset base matrix, each element of the activation normalization matrix as a linear combination of powers of 2 to obtain the input quantity of the network layer.
The operation module 670 may be specifically configured to:
shift a first element in the input quantity of the network layer multiple times, once for each sum of a power exponent of the first element and a power exponent of a second element in the linear combination matrix, and add the results of the multiple shifts to obtain the product of the first element and the second element, where the second element is the element of the linear combination matrix that is multiplied by the first element; and
according to the products of the elements of the input quantity and the elements of the linear combination matrix, operate on the input quantity and the linear combination matrix by combining these products, to obtain the output quantity of the network layer.
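As an end-to-end sketch of this second apparatus at the level of a single weight and a single activation value (the numbers, names, and the two-term greedy expansion are illustrative assumptions): both operands are normalized by their own largest absolute value, expanded into signed powers of 2, multiplied purely through exponent sums, and both scale factors are then multiplied back into the result.

```python
import math

def to_pow2_terms(v, num_terms=2):
    """Greedy signed power-of-2 expansion, the same idea as sketched earlier for the weights."""
    terms, residual = [], float(v)
    for _ in range(num_terms):
        if residual == 0.0:
            break
        sign = 1.0 if residual > 0 else -1.0
        exponent = round(math.log2(abs(residual)))
        terms.append((sign, exponent))
        residual -= sign * 2.0 ** exponent
    return terms

w_norm, s_w = 0.75, 1.6      # normalized weight and the first weight (its scale factor)
a_norm, s_a = 0.5, 3.2       # normalized activation value and the first element value (its scale factor)

w_terms, a_terms = to_pow2_terms(w_norm), to_pow2_terms(a_norm)

# exponent-addition form of the product: every pair of terms contributes 2**(k + m)
y_norm = sum(sw * sa * 2.0 ** (k + m) for sw, k in w_terms for sa, m in a_terms)
y = s_w * s_a * y_norm       # both scale factors are multiplied back into the operation result

print(y, "should equal", (w_norm * s_w) * (a_norm * s_a))
```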
With this embodiment, the parameter matrix of a network layer in the deep neural network is obtained; each weight in the parameter matrix is normalized to obtain a parameter normalization matrix; according to a first preset base matrix, each element of the parameter normalization matrix is expressed as a linear combination of elements of the first preset base matrix to obtain a linear combination matrix; and the input quantity of the network layer is operated on with the obtained linear combination matrix to obtain the output quantity of the network layer. Because the parameter matrix of the network layer is expressed as linear combinations of elements of the first preset base matrix, weights that were originally floating-point numbers can be represented as linear combinations of fixed-point numbers, and fixed-point numbers effectively reduce the storage of the deep neural network. Because each weight in the parameter matrix is normalized, the normalized weights are consistent with one another and easier to operate on, so the computation of the deep neural network is greatly reduced. Further, because the input activation of the network layer is expressed as a linear combination of elements of the second preset base matrix, activation values that were originally floating-point numbers can also be represented as linear combinations of fixed-point numbers, which further reduces the storage of the deep neural network. Replacing the multiplication of the input activation by the parameter matrix with shift and add operations greatly reduces the computation, and because a shift-type base matrix is easy to realize in hardware, the hardware overhead is effectively reduced.
To reduce the computation while effectively reducing the storage of the deep neural network, an embodiment of the present invention further provides a computer device. As shown in Fig. 7, the computer device includes a processor 701 and a memory 702, wherein:
the memory 702 is configured to store a computer program; and
the processor 701 is configured to, when executing the program stored on the memory 702, implement all the steps of the network layer operation method in a deep neural network described above.
Data can be transmitted between the memory 702 and the processor 701 through a wired or wireless connection, and the computer device can communicate with other devices through a wired or wireless communication interface.
The memory may include a RAM (Random Access Memory) and may also include an NVM (Non-Volatile Memory), for example at least one disk memory. Optionally, the memory may also be at least one storage device located away from the processor.
The processor may be a general-purpose processor, including a CPU (Central Processing Unit), an NP (Network Processor), and the like; it may also be a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field-Programmable Gate Array) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In this embodiment, by reading the computer program stored in the memory and running the computer program, the processor of the computer device can realize the following: by expressing the parameter matrix of the network layer as linear combinations of elements of a first preset base matrix, weights that were originally floating-point numbers can be represented as linear combinations of fixed-point numbers, and fixed-point numbers effectively reduce the storage of the deep neural network; and by normalizing each weight in the parameter matrix, the normalized weights are consistent with one another and easier to operate on, so the computation of the deep neural network is greatly reduced.
In addition, corresponding to the network layer operation method in a deep neural network provided by the above embodiments, an embodiment of the present invention provides a storage medium for storing a computer program which, when executed by a processor, implements all the steps of the network layer operation method in a deep neural network described above.
In this embodiment, the storage medium stores an application program that, when run, executes the network layer operation method in a deep neural network provided by the embodiments of the present invention, and can therefore realize the following: by expressing the parameter matrix of the network layer as linear combinations of elements of a first preset base matrix, weights that were originally floating-point numbers can be represented as linear combinations of fixed-point numbers, and fixed-point numbers effectively reduce the storage of the deep neural network; and by normalizing each weight in the parameter matrix, the normalized weights are consistent with one another and easier to operate on, so the computation of the deep neural network is greatly reduced.
For the computer device and storage medium embodiments, since the method content involved is substantially similar to the foregoing method embodiments, their description is relatively simple; for relevant parts, reference may be made to the description of the method embodiments.
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another entity or operation, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article, or device that includes the element.
The embodiments in this specification are described in a related manner; for the same or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, since the device, computer device, and storage medium embodiments are substantially similar to the method embodiments, their description is relatively simple; for relevant parts, reference may be made to the description of the method embodiments.
The above are merely preferred embodiments of the present invention and are not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall be included in the protection scope of the present invention.

Claims (16)

1. A network layer operation method in a deep neural network, wherein the method comprises:
obtaining a parameter matrix of a network layer in the deep neural network;
normalizing each weight in the parameter matrix to obtain a parameter normalization matrix;
expressing, according to a first preset base matrix, each element in the parameter normalization matrix as a linear combination of elements in the first preset base matrix to obtain a linear combination matrix; and
operating on an input quantity of the network layer and the linear combination matrix to obtain an output quantity of the network layer.
2. The method according to claim 1, wherein after obtaining the parameter matrix of the network layer in the deep neural network, the method further comprises:
performing statistics on each weight in the parameter matrix to obtain a weight range of the parameter matrix; and
extracting, based on the weight range, a first weight having a largest absolute value in the parameter matrix;
wherein normalizing each weight in the parameter matrix to obtain the parameter normalization matrix comprises:
dividing each weight in the parameter matrix by the first weight to obtain the parameter normalization matrix;
and wherein operating on the input quantity of the network layer and the linear combination matrix to obtain the output quantity of the network layer comprises:
operating on the input quantity of the network layer and the linear combination matrix, and multiplying the first weight by an operation result to obtain the output quantity of the network layer.
3. The method according to claim 2, wherein normalizing each weight in the parameter matrix to obtain the parameter normalization matrix comprises:
dividing each weight in the parameter matrix by an absolute value of the first weight to obtain the parameter normalization matrix;
and wherein operating on the input quantity of the network layer and the linear combination matrix to obtain the output quantity of the network layer comprises:
operating on the input quantity of the network layer and the linear combination matrix, and multiplying the absolute value of the first weight by the operation result to obtain the output quantity of the network layer.
4. The method according to claim 1, wherein the first preset base matrix is a shift-type base matrix, and each element in the shift-type base matrix is a power of 2;
wherein expressing, according to the first preset base matrix, each element in the parameter normalization matrix as a linear combination of elements in the first preset base matrix to obtain the linear combination matrix comprises:
expressing, according to the first preset base matrix, each element in the parameter normalization matrix as a linear combination of powers of 2 to obtain the linear combination matrix;
and wherein operating on the input quantity of the network layer and the linear combination matrix to obtain the output quantity of the network layer comprises:
for a first element in the input quantity of the network layer, shifting the first element multiple times, once for each power exponent of a second element in the linear combination matrix that is to be multiplied by the first element, and adding results of the multiple shift operations to obtain a product of the first element and the second element; and
according to products of elements in the input quantity and elements in the linear combination matrix, operating on the input quantity and the linear combination matrix by combining the products, to obtain the output quantity of the network layer.
5. The method according to claim 1, wherein before operating on the input quantity of the network layer and the linear combination matrix to obtain the output quantity of the network layer, the method further comprises:
obtaining an input activation of the network layer;
normalizing each element value in the input activation to obtain an activation normalization matrix; and
expressing, according to a second preset base matrix, each element in the activation normalization matrix as a linear combination of elements in the second preset base matrix to obtain the input quantity of the network layer.
6. The method according to claim 5, wherein after obtaining the input activation of the network layer, the method further comprises:
performing statistics on each element value of the input activation to obtain an element value range of the input activation; and
extracting, based on the element value range, a first element value having a largest absolute value in the input activation;
wherein normalizing each element value in the input activation to obtain the activation normalization matrix comprises:
dividing each element value in the input activation by the first element value to obtain the activation normalization matrix;
and wherein operating on the input quantity of the network layer and the linear combination matrix to obtain the output quantity of the network layer comprises:
operating on the input quantity of the network layer and the linear combination matrix, and multiplying the first element value by an operation result to obtain the output quantity of the network layer.
7. The method according to claim 6, wherein normalizing each element value in the input activation to obtain the activation normalization matrix comprises:
dividing each element value in the input activation by an absolute value of the first element value to obtain the activation normalization matrix;
and wherein operating on the input quantity of the network layer and the linear combination matrix to obtain the output quantity of the network layer comprises:
operating on the input quantity of the network layer and the linear combination matrix, and multiplying the absolute value of the first element value by the operation result to obtain the output quantity of the network layer.
8. The method according to claim 5, wherein the first preset base matrix and the second preset base matrix are shift-type base matrices, and each element in the shift-type base matrices is a power of 2;
wherein expressing, according to the first preset base matrix, each element in the parameter normalization matrix as a linear combination of elements in the first preset base matrix to obtain the linear combination matrix comprises:
expressing, according to the first preset base matrix, each element in the parameter normalization matrix as a linear combination of powers of 2 to obtain the linear combination matrix;
wherein expressing, according to the second preset base matrix, each element in the activation normalization matrix as a linear combination of elements in the second preset base matrix to obtain the input quantity of the network layer comprises:
expressing, according to the second preset base matrix, each element in the activation normalization matrix as a linear combination of powers of 2 to obtain the input quantity of the network layer;
and wherein operating on the input quantity of the network layer and the linear combination matrix to obtain the output quantity of the network layer comprises:
shifting a first element in the input quantity of the network layer multiple times, once for each sum of a power exponent of the first element and a power exponent of a second element in the linear combination matrix, and adding results of the multiple shift operations to obtain a product of the first element and the second element, wherein the second element is an element in the linear combination matrix that is multiplied by the first element; and
according to products of elements in the input quantity and elements in the linear combination matrix, operating on the input quantity and the linear combination matrix by combining the products, to obtain the output quantity of the network layer.
9. A network layer operation apparatus in a deep neural network, wherein the apparatus comprises:
a first obtaining module, configured to obtain a parameter matrix of a network layer in the deep neural network;
a first normalization module, configured to normalize each weight in the parameter matrix to obtain a parameter normalization matrix;
a first linear combination module, configured to express, according to a first preset base matrix, each element in the parameter normalization matrix as a linear combination of elements in the first preset base matrix to obtain a linear combination matrix; and
an operation module, configured to operate on an input quantity of the network layer and the linear combination matrix to obtain an output quantity of the network layer.
10. The apparatus according to claim 9, wherein the apparatus further comprises:
a first statistics module, configured to perform statistics on each weight in the parameter matrix to obtain a weight range of the parameter matrix; and
a first extraction module, configured to extract, based on the weight range, a first weight having a largest absolute value in the parameter matrix;
wherein the first normalization module is specifically configured to:
divide each weight in the parameter matrix by the first weight to obtain the parameter normalization matrix;
and wherein the operation module is specifically configured to:
operate on the input quantity of the network layer and the linear combination matrix, and multiply the first weight by an operation result to obtain the output quantity of the network layer.
11. The apparatus according to claim 10, wherein the first normalization module is further specifically configured to:
divide each weight in the parameter matrix by an absolute value of the first weight to obtain the parameter normalization matrix;
and wherein the operation module is further specifically configured to:
operate on the input quantity of the network layer and the linear combination matrix, and multiply the absolute value of the first weight by the operation result to obtain the output quantity of the network layer.
12. The apparatus according to claim 9, wherein the first preset base matrix is a shift-type base matrix, and each element in the shift-type base matrix is a power of 2;
wherein the first linear combination module is specifically configured to:
express, according to the first preset base matrix, each element in the parameter normalization matrix as a linear combination of powers of 2 to obtain the linear combination matrix;
and wherein the operation module is specifically configured to:
for a first element in the input quantity of the network layer, shift the first element multiple times, once for each power exponent of a second element in the linear combination matrix that is to be multiplied by the first element, and add results of the multiple shift operations to obtain a product of the first element and the second element; and
according to products of elements in the input quantity and elements in the linear combination matrix, operate on the input quantity and the linear combination matrix by combining the products, to obtain the output quantity of the network layer.
13. The apparatus according to claim 9, wherein the apparatus further comprises:
a second obtaining module, configured to obtain an input activation of the network layer;
a second normalization module, configured to normalize each element value in the input activation to obtain an activation normalization matrix; and
a second linear combination module, configured to express, according to a second preset base matrix, each element in the activation normalization matrix as a linear combination of elements in the second preset base matrix to obtain the input quantity of the network layer.
14. The apparatus according to claim 13, wherein the apparatus further comprises:
a second statistics module, configured to perform statistics on each element value of the input activation to obtain an element value range of the input activation; and
a second extraction module, configured to extract, based on the element value range, a first element value having a largest absolute value in the input activation;
wherein the second normalization module is specifically configured to:
divide each element value in the input activation by the first element value to obtain the activation normalization matrix;
and wherein the operation module is specifically configured to:
operate on the input quantity of the network layer and the linear combination matrix, and multiply the first element value by an operation result to obtain the output quantity of the network layer.
15. The apparatus according to claim 14, wherein the second normalization module is further specifically configured to:
divide each element value in the input activation by an absolute value of the first element value to obtain the activation normalization matrix;
and wherein the operation module is further specifically configured to:
operate on the input quantity of the network layer and the linear combination matrix, and multiply the absolute value of the first element value by the operation result to obtain the output quantity of the network layer.
16. The apparatus according to claim 13, wherein the first preset base matrix and the second preset base matrix are shift-type base matrices, and each element in the shift-type base matrices is a power of 2;
wherein the first linear combination module is specifically configured to:
express, according to the first preset base matrix, each element in the parameter normalization matrix as a linear combination of powers of 2 to obtain the linear combination matrix;
wherein the second linear combination module is specifically configured to:
express, according to the second preset base matrix, each element in the activation normalization matrix as a linear combination of powers of 2 to obtain the input quantity of the network layer;
and wherein the operation module is specifically configured to:
shift a first element in the input quantity of the network layer multiple times, once for each sum of a power exponent of the first element and a power exponent of a second element in the linear combination matrix, and add results of the multiple shift operations to obtain a product of the first element and the second element, wherein the second element is an element in the linear combination matrix that is multiplied by the first element; and
according to products of elements in the input quantity and elements in the linear combination matrix, operate on the input quantity and the linear combination matrix by combining the products, to obtain the output quantity of the network layer.
CN201810479974.0A 2018-05-18 2018-05-18 Network layer operation method and device in deep neural network Pending CN110503182A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810479974.0A CN110503182A (en) 2018-05-18 2018-05-18 Network layer operation method and device in deep neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810479974.0A CN110503182A (en) 2018-05-18 2018-05-18 Network layer operation method and device in deep neural network

Publications (1)

Publication Number Publication Date
CN110503182A true CN110503182A (en) 2019-11-26

Family

ID=68585057

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810479974.0A Pending CN110503182A (en) 2018-05-18 2018-05-18 Network layer operation method and device in deep neural network

Country Status (1)

Country Link
CN (1) CN110503182A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111832704A (en) * 2020-06-30 2020-10-27 东南大学 Design method of convolution input type nested recurrent neural network
WO2021179281A1 (en) * 2020-03-13 2021-09-16 Intel Corporation Optimizing low precision inference models for deployment of deep neural networks
WO2022040963A1 (en) * 2020-08-26 2022-03-03 Intel Corporation Methods and apparatus to dynamically normalize data in neural networks
CN114819149A (en) * 2022-06-28 2022-07-29 深圳比特微电子科技有限公司 Data processing method, device and medium based on transforming neural network


Similar Documents

Publication Publication Date Title
CN110503182A (en) Network layer operation method and device in deep neural network
CN110413255A (en) Artificial neural network method of adjustment and device
CN110046698A (en) Heterogeneous figure neural network generation method, device, electronic equipment and storage medium
CN108833458B (en) Application recommendation method, device, medium and equipment
CN109446430A (en) Method, apparatus, computer equipment and the readable storage medium storing program for executing of Products Show
CN105279554B (en) The training method and device of deep neural network based on Hash coding layer
KR20190052893A (en) Method and apparatus for preprocessing an operation of neural network
CN109871949A (en) Convolutional neural networks accelerator and accelerated method
CN110163368A (en) Deep learning model training method, apparatus and system based on mixed-precision
CN109671020A (en) Image processing method, device, electronic equipment and computer storage medium
CN105893561A (en) Ordering method and device
CN111198938A (en) Sample data processing method, sample data processing device and electronic equipment
CN109583561A (en) A kind of the activation amount quantization method and device of deep neural network
CN109117742A (en) Gestures detection model treatment method, apparatus, equipment and storage medium
CN111368259B (en) Distribution network voltage data missing filling method and related device
CN106503853A (en) A kind of foreign exchange transaction forecast model based on multiple scale convolutional neural networks
CN108345832A (en) A kind of method, apparatus and equipment of Face datection
CN109978261A (en) Determine method, apparatus, readable medium and the electronic equipment of load forecasting model
CN110647974A (en) Network layer operation method and device in deep neural network
CN111178537A (en) Feature extraction model training method and device
CN115062779A (en) Event prediction method and device based on dynamic knowledge graph
CN110222815A (en) Configurable activation primitive device and method suitable for deep learning hardware accelerator
CN109840353A (en) Lithium ion battery dual factors inconsistency prediction technique and device
CN110866672A (en) Data processing method, device, terminal and medium
CN116757265A (en) Global pruning method and device of neural network, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination