CN108229644A - Device, apparatus, and method for compressing/decompressing a neural network model - Google Patents


Info

Publication number
CN108229644A
CN108229644A (application CN201611159629.6A / CN201611159629A)
Authority
CN
China
Prior art keywords
neural network
parameter
compressed
compression
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611159629.6A
Other languages
Chinese (zh)
Inventor
陈天石
韦洁
陈云霁
刘少礼
支天
郭崎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Cambricon Information Technology Co Ltd
Original Assignee
Shanghai Cambricon Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Cambricon Information Technology Co Ltd filed Critical Shanghai Cambricon Information Technology Co Ltd
Priority to CN201611159629.6A
Publication of CN108229644A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10: File systems; File servers
    • G06F16/17: Details of further file system functions
    • G06F16/174: Redundancy elimination performed by the file system
    • G06F16/1744: Redundancy elimination performed by the file system using compression, e.g. sparse files

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

A device, apparatus, and method for compressing/decompressing a neural network model. The method includes the steps of: obtaining the parameters to be compressed of a neural network model; compressing the parameters to be compressed and training with a neural network algorithm, obtaining low-dimensional neural network parameters; and decompressing the low-dimensional neural network parameters, restoring the parameters of the neural network model. The present invention implements the device for compressing/decompressing a neural network model with an auto-encoding neural network algorithm, which can reduce the number of parameters of a neural network model and thereby facilitate model storage and transmission.

Description

Device, apparatus, and method for compressing/decompressing a neural network model
Technical field
The present invention relates to the technical field of neural network model compression/decompression algorithms, and more particularly to a device and an apparatus for compressing/decompressing a neural network model, as well as a method for compressing/decompressing a neural network model.
Background technology
In recent years, neural network algorithms have been widely applied in many fields. As problem complexity and accuracy requirements continue to rise, the depth of neural network models keeps increasing, accompanied by explosive growth in the number of parameters, which greatly inconveniences the storage and transmission of such models. Imagine that in the future every application on a mobile phone has deep learning capability, yet each application must transmit and store gigabytes of neural network model parameters; this is clearly unreasonable.
Traditional dimensionality reduction methods are mostly linear. For example, PCA (Principal Component Analysis) selects the directions of greatest variance in high-dimensional data and, by projecting onto these directions, obtains a low-dimensional representation that retains most of the information. However, the linearity of PCA considerably restricts the types of features it can extract.
Summary of the invention
In view of this, the object of the present invention is to provide a device and a method for compressing/decompressing a neural network model with an auto-encoding neural network algorithm, so as to solve at least one of the technical problems described above.
According to one aspect of the present invention, a method for compressing/decompressing a neural network model is provided, including the steps of:
S1: obtaining the parameters to be compressed of a neural network model;
S2: compressing the parameters to be compressed and training with a neural network algorithm, obtaining low-dimensional neural network parameters;
S3: decompressing the low-dimensional neural network parameters, restoring the parameters of the neural network model.
Further, step S1 includes: traversing and selecting the parameters to be compressed of the neural network model, until the number of selected parameters to be compressed equals a set dimension.
Further, step S1 includes: traversing and selecting the parameters to be compressed of the neural network model, and sparsifying them: each selected parameter is examined, parameters to be compressed whose magnitude is below a given threshold are set to 0, and the non-zero entries after sparsification are selected and their position coordinates marked, until the number of selected parameters to be compressed equals the set dimension.
Further, the traversal obtains the parameters to be compressed of each layer in turn, in the order in which the neural network model was constructed.
Further, step S2 includes the sub-steps:
S21: building an auto-encoding neural network based on a multilayer perceptron, where the input layer and the output layer of the auto-encoding neural network have the same number of nodes, and the number of hidden-layer nodes is smaller than the number of input-layer nodes;
S22: inputting the parameters to be compressed and performing a forward pass through every layer of neurons of the auto-encoding neural network, obtaining the activation values of each layer;
S23: setting the target output equal to the input, and obtaining the residuals of the output layer and of the neurons in each layer by back-propagation;
S24: updating the weights W and biases B by gradient descent, so that the output gets ever closer to the input;
S25: after the weights and biases converge, outputting the values of the hidden layer, which are the low-dimensional neural network parameters.
Further, in step S3, decompression is performed with a sub-network of the auto-encoding neural network built in step S21, restoring the parameters at its output layer.
According to another aspect of the present invention, a device for compressing/decompressing a neural network model includes a parameter acquisition module, a model compression module, a model storage module, and a model decompression module, wherein:
the parameter acquisition module is configured to obtain the parameters to be compressed of a neural network model;
the model compression module is configured to compress the parameters to be compressed using a neural network algorithm and to train, obtaining low-dimensional neural network parameters;
the model decompression module is configured to decompress the low-dimensional neural network parameters, forming restored neural network parameters; and
the storage module is configured to store the parameters to be compressed of the neural network model, the low-dimensional neural network parameters, and the restored neural network parameters.
Further, in the model compression module, the parameters to be compressed are compressed by an auto-encoding neural network algorithm. The auto-encoding neural network is divided into a compression network, an intermediate hidden layer, and a decompression network; the compression network takes the parameters to be compressed as input and outputs to the intermediate hidden layer, and its number of input nodes is greater than its number of output nodes.
Further, the auto-encoding neural network is built based on a multilayer perceptron.
Further, in the model decompression module, the low-dimensional neural network parameters are decompressed by the decompression network, which takes the low-dimensional neural network parameters as input and restores the original number of neural network parameters.
According to a further aspect of the present invention, an apparatus for compressing/decompressing a neural network model is provided, including:
a memory for storing executable instructions; and
a processor for executing the executable instructions stored in the memory, so as to perform the following operations:
obtaining the parameters to be compressed of a neural network model;
compressing the parameters to be compressed and training with a neural network algorithm, obtaining low-dimensional neural network parameters;
decompressing the low-dimensional neural network parameters, restoring the parameters of the neural network model.
Based on the above technical solutions, the device and method of the present invention have the following beneficial effects:
(1) the device can effectively reduce the number of neural network model parameters and save the memory space needed to store the model, which facilitates model transmission and porting;
(2) compared with common linear dimensionality reduction methods, the related method can restore the network parameters to a greater degree during decompression and achieves higher accuracy;
(3) because the hidden layer has the fewest nodes, the number of parameters can be effectively reduced, saving memory and facilitating model storage and transmission; the neural network model is decompressed at the time of use while accuracy is preserved, so that neural network algorithms can be better applied in practice;
(4) compared with common compression methods, compressing a neural network model with a neural network algorithm allows the arithmetic units to be reused, saving memory.
Description of the drawings
Fig. 1 is an example block diagram of the overall structure of a device for compressing/decompressing a neural network model according to an embodiment of the present invention;
Fig. 2 is an example block diagram of a parameter acquisition module in the device for compressing/decompressing a neural network model according to an embodiment of the present invention;
Fig. 3 is an example block diagram of an auto-encoding neural network structure in the device for compressing/decompressing a neural network model according to an embodiment of the present invention;
Fig. 4 is an example block diagram of a model compression module in the device for compressing/decompressing a neural network model according to an embodiment of the present invention;
Fig. 5 is an example block diagram of a model decompression module in the device for compressing/decompressing a neural network model according to an embodiment of the present invention;
Fig. 6 is a flowchart of a method for compressing/decompressing a neural network model according to an embodiment of the present invention;
Fig. 7 is a block diagram of an apparatus for compressing/decompressing a neural network model according to an embodiment of the present invention.
Detailed description
To make the objects, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below in conjunction with specific embodiments and with reference to the accompanying drawings. From the following detailed description, other aspects, advantages, and salient features of the invention will become apparent to those skilled in the art.
In this specification, the following serves only to illustrate the principles of the invention and should in no way be construed as limiting its scope. The description given with reference to the drawings is intended to help a comprehensive understanding of the exemplary embodiments of the invention as defined by the claims and their equivalents. The description includes various details to aid understanding, but these are to be regarded as merely exemplary. Accordingly, those of ordinary skill in the art will appreciate that various changes and modifications may be made to the embodiments described herein without departing from the scope and spirit of the invention. For clarity and brevity, descriptions of well-known functions and structures are omitted. Throughout the drawings, the same reference numerals are used for identical functions and operations.
An embodiment of the present invention provides a device for compressing/decompressing a neural network model, which can compress the parameters of a trained neural network model, saving model storage space and facilitating the porting of neural networks to devices with little memory.
Fig. 1 is an example block diagram of the overall structure of the device for compressing/decompressing a neural network model provided according to an embodiment of the present invention. The device includes a parameter acquisition module, a model compression module, a model storage module, and a model decompression module.
The parameter acquisition module obtains the parameters to be compressed of the neural network model;
specifically, it can obtain the parameters to be compressed of the neural network model and pre-process them (for example, sparsify them), preparing the input for the model compression module. The acquisition may traverse and select the parameters to be compressed of the neural network model until the number of selected parameters to be compressed equals a set dimension.
The above pre-processing may be sparsification. When the parameters to be compressed need to be sparsified, the procedure can include: traversing and selecting the parameters to be compressed of the neural network model, sparsifying them, examining each selected parameter, setting parameters to be compressed below a given threshold to 0, and selecting the non-zero entries after sparsification while marking their position coordinates, until the number of selected parameters to be compressed equals the set dimension. This sparsification effectively reduces the number of neural network model parameters, saving the memory needed to store the model and facilitating its transmission and porting.
The parameters to be compressed may include network nodes, weights, training rate, activation functions, and biases.
Preferably, the parameter acquisition module outputs one group of parameters of the set dimension at a time, until all parameters of the neural network model have been traversed.
Fig. 2 is a schematic diagram of a parameter acquisition module in the device for compressing/decompressing a neural network model of an embodiment of the present invention. The module obtains a neural network model with l parameters in total. As shown in Fig. 2, consider a neural network built in the Caffe framework (Convolutional Architecture for Feature Extraction). The parameters stored in the parameter file are W[l], each input vector of the model compression module is X[l_input], and a Boolean array Label[l] marks the sparsification status. Suppose a parameter w_i is read, the threshold is threshold, and X[l_input] currently holds j-1 non-empty entries. Then: if the absolute value of w_i is greater than or equal to threshold, it is stored in the array X[l_input], Label[i] is set to 1, and the next parameter is read; if the absolute value of the read parameter is less than the threshold, Label[i] is set to 0 and the next parameter is read; this continues until the array X[l_input] is full (that is, the number of selected parameters to be compressed equals the set dimension).
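The extraction-with-sparsification procedure described above can be sketched as follows. This is a hypothetical Python/NumPy rendering for illustration only, not the patent's implementation: the names `extract_sparse`, `l_input`, and `threshold` follow the text's notation but are otherwise assumed, and zero-padding the final partial group is one possible convention the text does not specify.

```python
import numpy as np

def extract_sparse(W, l_input, threshold):
    """Traverse the flat parameter vector W, keep entries with |w| >= threshold,
    mark kept positions in a Boolean Label array, and emit fixed-size input
    vectors X of length l_input for the model compression module."""
    label = np.zeros(len(W), dtype=bool)
    batches, current = [], []
    for i, w in enumerate(W):
        if abs(w) >= threshold:
            label[i] = True       # Label[i] = 1: parameter survives sparsification
            current.append(w)
            if len(current) == l_input:
                batches.append(np.array(current))
                current = []
        # else Label[i] stays 0: parameter below threshold is dropped
    if current:                   # pad the last partial group with zeros
        current += [0.0] * (l_input - len(current))
        batches.append(np.array(current))
    return batches, label
```

Each returned batch plays the role of one X[l_input] input vector, and the Boolean array plays the role of Label[l].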
The model compression module compresses the parameters to be compressed using a neural network algorithm and trains, obtaining low-dimensional neural network parameters.
Specifically, compression is performed by an auto-encoder neural network algorithm. The auto-encoding neural network is built on a multilayer perceptron (MLP) and is divided into a compression (encoder) network, a hidden (coder) layer, and a decompression (decoder) network. The input of the encoder network and the output of the decoder network have the same number of nodes, and the hidden layer has fewer nodes than both. The encoder network takes the parameters to be compressed as input and outputs to the hidden layer; its number of input nodes is greater than its number of output nodes. The decompression network likewise uses an MLP structure; its input layer is the coder layer, and its output layer has the same number of nodes as the encoder input layer.
Fig. 3 illustrates the structure of the auto-encoding neural network in this example. The compression network (whose first layer is the input layer) and the decompression network (whose last layer is the output layer) are both three-layer MLP networks; the input layer and the output layer both have l_input nodes, the intermediate hidden (coder) layer has l_compress nodes, with l_compress < l_input, and adjacent layers are fully connected. In this way the number of nodes after compression by the neural network algorithm is reduced, thereby reducing storage space.
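The node counts and the fully connected encoder/decoder pair described above can be illustrated with the following sketch. The concrete sizes, the sigmoid activation, and the variable names here are assumptions for illustration, not values fixed by the patent.

```python
import numpy as np

rng = np.random.default_rng(0)
l_input, l_compress = 8, 3          # hypothetical sizes; l_compress < l_input

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# encoder (compression network): l_input -> l_compress, fully connected
W_enc = rng.standard_normal((l_compress, l_input)) * 0.1
b_enc = np.zeros(l_compress)
# decoder (decompression network): l_compress -> l_input, fully connected
W_dec = rng.standard_normal((l_input, l_compress)) * 0.1
b_dec = np.zeros(l_input)

x = rng.standard_normal(l_input)     # one group of parameters to be compressed
y = sigmoid(W_enc @ x + b_enc)       # coder-layer output: the low-dimensional code
x_rec = W_dec @ y + b_dec            # decoder output: reconstructed parameters
```

The stored code `y` has only l_compress values per group instead of l_input, which is where the storage saving comes from.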
Fig. 4 illustrates a model compression process, that is, the training process of the auto-encoding neural network.
After the auto-encoding neural network shown in Fig. 3 is built, the weights are initialized;
the X[l_input] obtained by parameter extraction serves as the input of the model compression module, and forward propagation computes the weights, biases, and output-layer results of each layer;
the values of the output-layer nodes are compared with X[l_input] to compute the residual;
the weights and biases are updated by gradient descent; iteration stops when the error is sufficiently small or the maximum number of training iterations is reached; the output Y[l_compress] of the intermediate coder layer is the low-dimensional representation of the input parameters X[l_input].
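The training loop just described (forward propagation, residual against the input itself, gradient-descent updates, coder-layer output as the low-dimensional code) might be sketched as below. This is an illustrative NumPy implementation under assumed choices (sigmoid coder layer, linear output layer, full-batch gradient descent, a fixed epoch count instead of an error threshold); the patent does not fix these details.

```python
import numpy as np

def train_autoencoder(X, l_compress, lr=0.1, epochs=1000, seed=0):
    """Train a one-hidden-layer auto-encoder so that its output approximates
    its input; return the coder-layer codes Y and the final mean squared
    reconstruction error."""
    rng = np.random.default_rng(seed)
    n, l_in = X.shape
    W1 = rng.standard_normal((l_in, l_compress)) * 0.1   # encoder weights
    b1 = np.zeros(l_compress)
    W2 = rng.standard_normal((l_compress, l_in)) * 0.1   # decoder weights
    b2 = np.zeros(l_in)
    for _ in range(epochs):
        H = 1.0 / (1.0 + np.exp(-(X @ W1 + b1)))   # forward pass: coder-layer activations
        out = H @ W2 + b2                           # output-layer values
        err = out - X                               # residual: target output equals input
        dW2 = H.T @ err / n                         # back-propagate the residual
        db2 = err.mean(axis=0)
        dH = (err @ W2.T) * H * (1.0 - H)
        dW1 = X.T @ dH / n
        db1 = dH.mean(axis=0)
        W1 -= lr * dW1; b1 -= lr * db1              # gradient-descent update of W and B
        W2 -= lr * dW2; b2 -= lr * db2
    H = 1.0 / (1.0 + np.exp(-(X @ W1 + b1)))
    mse = float(((H @ W2 + b2 - X) ** 2).mean())
    return H, mse
```

The returned codes `H` play the role of Y[l_compress]; training until the reconstruction error is small enough corresponds to the stopping rule in the text.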
The structure file of the compressed neural network model, the Boolean array Label[l], the decoder network structure file, the decoder network parameters W_d, and the coder-layer output Y[l_compress] are saved; these are the data necessary to decompress the neural network model.
The model decompression module decompresses the low-dimensional neural network parameters, forming restored neural network parameters and placing them back into the neural network. Decompression is likewise performed by the auto-encoding neural network described above, which includes the decompression network; the decompression network takes the low-dimensional neural network parameters as input, restores the original number of neural network parameters, and places each restored parameter back in its corresponding position in the network.
Fig. 5 illustrates a process of decompressing a neural network model. Y[l_compress] is input to the decoder network whose parameters are W_d, obtaining an output X'[l_input] of length l_input. Let the decompressed neural network model parameters be W'[l]; the values of the array Label[l] are read to map X'[l_input] back into W'[l]. If Label[i] is 0, the absolute value of W[i] was below the sparsification threshold at parameter extraction time and was omitted, so X'[l_input] has no corresponding entry and the value of W'[i] is 0; if Label[i] is 1, the value of X'[j] is assigned to W'[i]. When the array Label[l] has been fully traversed, the decompressed neural network model parameters W'[l] are obtained.
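The Label-driven restore step can be sketched as follows (hypothetical Python; `restore_parameters` and its argument layout are assumed names, and the decompressed output vectors are consumed in traversal order as the text describes):

```python
import numpy as np

def restore_parameters(blocks, label):
    """Map decompressed output vectors X' back into the full parameter array
    W': positions whose Label bit is 0 were pruned below the sparsification
    threshold and become 0; positions whose bit is 1 take the next
    decompressed value in order."""
    flat = [v for block in blocks for v in block]  # concatenate X' groups
    W = np.zeros(len(label))
    j = 0
    for i, keep in enumerate(label):
        if keep:
            W[i] = flat[j]
            j += 1
    return W
```

Together with the extraction step, this gives the round trip: positions dropped at compression time come back as exact zeros, which is the intended behavior of the sparsified scheme.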
As shown in Fig. 1, the storage module stores the parameters to be compressed of the neural network model, the low-dimensional neural network parameters (that is, the compressed parameters), and the restored neural network parameters.
Optionally, when the parameter acquisition module uses sparsification, the storage module also stores the marks made during sparsification.
A typical overall workflow of the above device is as follows:
given a complete neural network model containing parameters and structure, the parameter extraction module first extracts a fixed number of parameters; the model compression module compresses them with the auto-encoder neural network algorithm to obtain a low-dimensional representation of the parameters, and this is repeated until all parameters have been compressed; the corresponding parameters and network structure are stored. During decompression, the low-dimensional parameters serve as the input of the decoder network, the high-dimensional parameters are recovered, and this is repeated until all parameters have been decompressed; the decompressed parameters are placed back in their corresponding positions in the compressed network model, completing the process of decompressing the neural network model.
The device of the above embodiment applies to the case where the parameters to be compressed are sparsified. The other case is without sparsification; here the neural network structure and parameters and the auto-encoding network structure and parameters are set up in the same way as with sparsification, and only parameter extraction, model storage, and the decompression process differ. The neural network model parameters to be compressed are W[l], and each input vector of the auto-encoder is X[l_input]. Parameters w_i are read in turn and x_i = w_i is set, until i = l_input. After one group has been compressed, reading of the parameters in W[l] continues.
Since the sparsification status of the parameters need not be marked, the data to be saved are: the structure file of the compressed neural network model, the decompression network structure file, the decompression network parameters W_d, and the output Y[l_compress] of the hidden layer.
The decompression process first inputs Y[l_compress] to the decompression network whose parameters are W_d, obtaining an output X'[l_input] of length l_input. Let the decompressed neural network model parameters be W'[l]; x'_i is read in turn and w'_i = x'_i is set; when i = l_input, the next group of parameters is decompressed, until W'[l] has been fully assigned.
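In the non-sparsified case the extraction and restore steps reduce to plain fixed-length grouping and regrouping, which might look like this (illustrative Python; zero-padding the last group is an assumed convention for when the parameter count is not a multiple of l_input):

```python
import numpy as np

def chunk_parameters(W, l_input):
    """Without sparsification, parameters are read in order into fixed-length
    input vectors x_i = w_i; the final group is zero-padded."""
    W = np.asarray(W, dtype=float)
    pad = (-len(W)) % l_input
    padded = np.concatenate([W, np.zeros(pad)])
    return padded.reshape(-1, l_input), len(W)

def unchunk_parameters(blocks, n):
    """Reassemble W' by reading w'_i = x'_i group by group, discarding the
    padding of the final group."""
    return np.concatenate([np.asarray(b) for b in blocks])[:n]
```

Because no Label array is needed, only the model structure, the decompression network, and the hidden-layer outputs have to be stored, matching the reduced save list above.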
Based on the same inventive concept, an embodiment of the present invention also provides a method for compressing/decompressing a neural network model, shown in Fig. 6, including the steps:
S1: obtaining the parameters to be compressed of a neural network model;
S2: compressing the parameters to be compressed and training with a neural network algorithm, obtaining low-dimensional neural network parameters;
S3: decompressing the low-dimensional neural network parameters, restoring the parameters of the neural network model.
Step S1 can specifically include: traversing and selecting the parameters to be compressed of the neural network model, until the number of selected parameters to be compressed equals a set dimension.
Optionally, the parameters to be compressed may also be pre-processed; the pre-processing may be sparsification. When the parameters to be compressed need to be sparsified, the procedure can include: traversing and selecting the parameters to be compressed of the neural network model, sparsifying them, examining each selected parameter, setting parameters to be compressed below a given threshold to 0, and selecting the non-zero entries after sparsification while marking their position coordinates, until the number of selected parameters to be compressed equals the set dimension. This sparsification effectively reduces the number of neural network model parameters, saving the memory needed to store the model and facilitating its transmission and porting.
When the parameters to be compressed need to be sparsified, step S1 includes: traversing and selecting the parameters to be compressed of the neural network model, sparsifying them, examining each selected parameter, setting parameters to be compressed below a given threshold to 0, and selecting the non-zero entries after sparsification while marking their position coordinates, until the number of selected parameters to be compressed equals the set dimension. Correspondingly, if the sparsification step is used, then in step S3 the neural network parameters must be placed according to the marked positions of the non-zero entries after decompression.
Whether sparsification is used or not, the traversal obtains the parameters to be compressed of each layer in turn, in the order in which the neural network model was constructed.
Step S2 can include the sub-steps:
S21: building an auto-encoding neural network based on a multilayer perceptron, where the input layer and the output layer of the auto-encoding neural network have the same number of nodes, and the number of hidden-layer nodes is smaller than the number of input-layer nodes;
S22: inputting the parameters to be compressed and performing a forward pass through every layer of neurons of the auto-encoding neural network, obtaining the activation values of each layer;
S23: setting the target output equal to the input, and obtaining the residuals of the output layer and of each layer of neurons by back-propagation;
S24: updating the weights W and biases B by gradient descent, so that the output gets ever closer to the input;
S25: after the weights and biases converge, outputting the values of the hidden layer, which are the low-dimensional neural network parameters.
Step S3 can include: decompressing the low-dimensional neural network parameters with the auto-encoding neural network, which includes the compression network, the hidden layer, and the decompression network, restoring the original number of neural network parameters and placing the restored parameters back in their corresponding positions in the network. Preferably, decompression uses the auto-encoding neural network built in step S21, restoring the parameters at its output layer.
For details of steps S1 to S3 not specifically described here, reference may be made to the operations performed by the corresponding modules of the device described above, which are not repeated here.
According to another aspect of the embodiments of the present invention, based on the same inventive concept, an apparatus for compressing/decompressing a neural network model is provided.
Fig. 7 is a block diagram of the apparatus for compressing/decompressing a neural network model according to an embodiment of the present invention. The apparatus 700 includes:
a memory 702 for storing executable instructions; and
a processor 701 for executing the executable instructions stored in the memory, so as to perform the following operations:
obtaining the parameters to be compressed of a neural network model;
compressing the parameters to be compressed and training with a neural network algorithm, obtaining low-dimensional neural network parameters;
decompressing the low-dimensional neural network parameters, restoring the parameters of the neural network model.
The above executable instructions correspond to the steps of the above method; they are the executable instructions by which the processor performs the corresponding method steps.
The processor 701 may be a single CPU (central processing unit), but may also include two or more processing units. For example, the processor may include a general-purpose microprocessor, an instruction-set processor, and/or a related chipset and/or a special-purpose microprocessor (for example, an application-specific integrated circuit (ASIC)). The processor may also include on-board memory for caching. Preferably, a dedicated neural network processor is used, so that the neural network it already provides can be reused during instruction execution, saving storage space.
The memory 702 may be flash memory, random-access memory (RAM), read-only memory (ROM), or EEPROM. Preferably, on-chip storage mounted on the chip may be used. In addition to the above instructions, the memory 702 may also store, during instruction execution, the parameters to be compressed, the low-dimensional neural network parameters, and the restored neural network parameters.
Through the above embodiments, the auto-encoding neural network algorithm overcomes these limitations by introducing the non-linearity of the neural network, and the constraint that the output equal the input makes its results more reliable. The auto-encoder is an unsupervised learning method: it is a multilayer neural network whose input layer and output layer represent the same meaning and have the same number of nodes, and it learns an "identity function" whose input and output are identical. The point of the auto-encoding neural network is to learn the middlemost hidden layer, which usually has fewer nodes than the input and output layers and is a good representation of the input vector. This process plays a "dimensionality reduction" role, realizing a low-dimensional representation of the high-dimensional input.
In the foregoing specification, various embodiments of the present invention have been described with reference to certain exemplary embodiments thereof. Obviously, various modifications may be made to each embodiment without departing from the broader spirit and scope of the invention set forth in the appended claims. Accordingly, the description and drawings are to be regarded as illustrative rather than restrictive.

Claims (11)

1. a kind of method of compression/de-compression neural network model, including step:
S1:Obtain the parameter to be compressed of neural network model;
S2:The parameter to be compressed is compressed and trained using neural network algorithm, obtains the neural network parameter of low-dimensional;
S3:The neural network parameter of the low-dimensional is decompressed, restores the parameter of neural network model.
2. The method according to claim 1, wherein step S1 comprises:
traversing and selecting the parameters to be compressed of the neural network model, until the number of selected parameters to be compressed equals a set dimension.
3. The method according to claim 1, wherein step S1 comprises:
traversing and selecting the parameters to be compressed of the neural network model; sparsifying the parameters to be compressed by judging the selected parameters to be compressed and setting those below a given threshold to 0; and selecting the non-zero entries after sparsification and marking the position coordinates of the non-zero entries, until the number of selected parameters to be compressed equals the set dimension.
4. The method according to claim 2 or 3, wherein the traversal selection obtains the parameters to be compressed of each layer in turn, in the order in which the neural network model is constructed.
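The sparsified selection of claim 3 — setting parameters below a given threshold to 0, keeping the non-zero entries, and marking their position coordinates — can be sketched as follows (a hypothetical numpy illustration, not part of the claims; the function name, threshold value, and sample data are invented):

```python
import numpy as np

def sparsify(params, threshold):
    """Set entries whose magnitude falls below `threshold` to 0, then return
    the surviving non-zero entries together with their position coordinates."""
    sparse = np.where(np.abs(params) < threshold, 0.0, params)
    coords = np.nonzero(sparse)            # marked positions of non-zero entries
    return sparse[coords], coords

values, coords = sparsify(np.array([0.01, -0.8, 0.3, 0.02, -0.5]), threshold=0.1)
# values → [-0.8, 0.3, -0.5]; coords[0] → [1, 2, 4]
```

Only the non-zero values and their coordinates need to be passed on for compression, which is what makes the selection sparse.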
5. The method according to claim 1, wherein step S2 comprises the sub-steps of:
S21: building an auto-encoding neural network on the basis of a multilayer perceptron, wherein the input layer and the output layer of the auto-encoding neural network have the same number of nodes, and the number of hidden-layer nodes is smaller than the number of input-layer nodes;
S22: inputting the parameters to be compressed, and performing a forward-conduction calculation on each layer of neurons of the auto-encoding neural network to obtain the activation values of each layer;
S23: setting the output equal to the input, and obtaining the residuals of the output layer and of the neurons of each layer by a backward-conduction algorithm;
S24: updating the weights W and the biases B by gradient descent, so that the output becomes increasingly close to the input;
S25: after the weights and the biases converge, outputting the values of the hidden layer as the low-dimensional neural network parameters.
6. The method according to claim 5, wherein a partial network of the auto-encoding neural network of step S21 is used for the decompression, and the parameters are restored at the output layer.
7. A device for compressing/decompressing a neural network model, comprising a parameter acquisition module, a model compression module, a model storage module, and a model decompression module, wherein
the parameter acquisition module is configured to obtain the parameters to be compressed of a neural network model;
the model compression module is configured to compress and train the parameters to be compressed using a neural network algorithm, to obtain low-dimensional neural network parameters;
the model decompression module is configured to decompress the low-dimensional neural network parameters, to form restored neural network parameters; and
the storage module is configured to store the parameters to be compressed of the neural network model, the low-dimensional neural network parameters, and the restored neural network parameters.
8. The device according to claim 7, wherein in the model compression module the parameters to be compressed are compressed by an auto-encoding neural network algorithm, the auto-encoding neural network being divided into a compression network, an intermediate hidden layer, and a decompression network, wherein the compression network takes the parameters to be compressed as input and outputs to the intermediate hidden layer, and the number of input nodes is greater than the number of output nodes.
9. The device according to claim 8, wherein the auto-encoding neural network is built on the basis of a multilayer perceptron.
10. The device according to claim 8, wherein in the model decompression module the low-dimensional neural network parameters are decompressed by the decompression network, the decompression network taking the low-dimensional neural network parameters as input and restoring the quantity of the neural network parameters.
11. An apparatus for compressing/decompressing a neural network model, comprising:
a memory for storing executable instructions; and
a processor for executing the executable instructions stored in the memory, so as to perform the following operations:
obtaining the parameters to be compressed of a neural network model;
compressing and training the parameters to be compressed using a neural network algorithm, to obtain low-dimensional neural network parameters;
decompressing the low-dimensional neural network parameters, to restore the parameters of the neural network model.
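The operations recited in claim 11, together with the compression-network/decompression-network split of claims 8 and 10, amount to the round trip sketched below (a hypothetical illustration; the random weights stand in for a trained auto-encoder, and all names and sizes are invented):

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

n_in, n_hid = 16, 4                        # compression network: input nodes > output nodes

# Placeholder weights standing in for a trained auto-encoding neural network.
W_enc, b_enc = rng.normal(size=(n_in, n_hid)), np.zeros(n_hid)
W_dec, b_dec = rng.normal(size=(n_hid, n_in)), np.zeros(n_in)

def compress(params):
    """Compression network: parameters to be compressed -> intermediate hidden layer."""
    return sigmoid(params @ W_enc + b_enc)

def decompress(code):
    """Decompression network: low-dimensional parameters -> restored parameters."""
    return sigmoid(code @ W_dec + b_dec)

params = rng.random(n_in)                  # obtain the parameters to be compressed
code = compress(params)                    # low-dimensional neural network parameters
restored = decompress(code)                # restored to the original parameter count
```

The decompression network is the partial network of the auto-encoder from the hidden layer to the output layer, so `restored` has the same number of entries as `params`, as claim 10 requires.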
CN201611159629.6A 2016-12-15 2016-12-15 Device, apparatus and method for compressing/decompressing a neural network model Pending CN108229644A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611159629.6A CN108229644A (en) 2016-12-15 2016-12-15 Device, apparatus and method for compressing/decompressing a neural network model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611159629.6A CN108229644A (en) 2016-12-15 2016-12-15 Device, apparatus and method for compressing/decompressing a neural network model

Publications (1)

Publication Number Publication Date
CN108229644A true CN108229644A (en) 2018-06-29

Family

ID=62650420

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611159629.6A Pending CN108229644A (en) Device, apparatus and method for compressing/decompressing a neural network model

Country Status (1)

Country Link
CN (1) CN108229644A (en)


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0519052A (en) * 1991-05-08 1993-01-26 Nippon Telegr & Teleph Corp <Ntt> Recognition of three-dimensional object by neural network
CN104792530A (en) * 2015-04-15 2015-07-22 北京航空航天大学 Deep-learning rolling bearing fault diagnosis method based on SDA (stacked denoising autoencoder) and Softmax regression
CN105163121A (en) * 2015-08-24 2015-12-16 西安电子科技大学 Large-compression-ratio satellite remote sensing image compression method based on deep self-encoding network
CN105512175A (en) * 2015-11-23 2016-04-20 东莞市凡豆信息科技有限公司 Quick image retrieval method based on color features and texture characteristics
CN105791189A (en) * 2016-02-23 2016-07-20 重庆大学 Sparse coefficient decomposition method for improving reconstruction accuracy
CN105976408A (en) * 2016-04-28 2016-09-28 北京大学 Digital holographic compressed transmission method using a quantum back-propagation neural network


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
G. E. HINTON ET AL.: "Reducing the Dimensionality of Data with Neural Networks", Science *
杜卓明: "Research on Key Technologies of 3D Model Retrieval and Compression" ("三维模型检索与压缩关键技术研究"), China Doctoral Dissertations Full-text Database (Information Science and Technology) *

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110770756A (en) * 2018-09-28 2020-02-07 深圳市大疆创新科技有限公司 Data processing method and device and unmanned aerial vehicle
WO2020062054A1 (en) * 2018-09-28 2020-04-02 深圳市大疆创新科技有限公司 Data processing method and device, and unmanned aerial vehicle
CN111045726B (en) * 2018-10-12 2022-04-15 上海寒武纪信息科技有限公司 Deep learning processing device and method supporting coding and decoding
CN111047020B (en) * 2018-10-12 2022-11-29 上海寒武纪信息科技有限公司 Neural network operation device and method supporting compression and decompression
CN111047020A (en) * 2018-10-12 2020-04-21 上海寒武纪信息科技有限公司 Neural network operation device and method supporting compression and decompression
CN111045726A (en) * 2018-10-12 2020-04-21 上海寒武纪信息科技有限公司 Deep learning processing device and method supporting coding and decoding
CN109245773A (en) * 2018-10-30 2019-01-18 南京大学 Decoding method based on a block-circulant sparse-matrix neural network
CN109245773B (en) * 2018-10-30 2021-09-28 南京大学 Encoding and decoding method based on block-circulant sparse matrix neural network
CN113168565A (en) * 2018-12-29 2021-07-23 华为技术有限公司 Neural network compression method and device
CN109978142A (en) * 2019-03-29 2019-07-05 腾讯科技(深圳)有限公司 Neural network model compression method and device
CN109978142B (en) * 2019-03-29 2022-11-29 腾讯科技(深圳)有限公司 Neural network model compression method and device
CN112307230A (en) * 2019-07-29 2021-02-02 杭州海康威视数字技术股份有限公司 Data storage method, and data acquisition method and device thereof
CN112307230B (en) * 2019-07-29 2024-01-26 杭州海康威视数字技术股份有限公司 Data storage method, data acquisition method and device
CN110516806A (en) * 2019-08-30 2019-11-29 苏州思必驰信息科技有限公司 Sparsification method and apparatus for a neural network parameter matrix
CN112261023A (en) * 2020-10-15 2021-01-22 苏州浪潮智能科技有限公司 Data transmission method and device of convolutional neural network
CN114731406A (en) * 2020-12-04 2022-07-08 深圳市大疆创新科技有限公司 Encoding method, decoding method, encoding device, and decoding device
WO2022147745A1 (en) * 2021-01-08 2022-07-14 深圳市大疆创新科技有限公司 Encoding method, decoding method, encoding apparatus, decoding apparatus
WO2023125934A1 (en) * 2021-12-31 2023-07-06 维沃移动通信有限公司 Ai network information transmission method and apparatus, and communication device

Similar Documents

Publication Publication Date Title
CN108229644A (en) Device, apparatus and method for compressing/decompressing a neural network model
CN107767408B (en) Image processing method, processing device and processing equipment
EP3522079B1 (en) Data encoding and classification
Gregor et al. Towards conceptual compression
CN113111760B (en) Lightweight channel-attention-based graph-convolution human-skeleton action recognition method
JP6127219B2 (en) Method and system for extracting facial features from facial image data
CN107688855A (en) Hierarchical quantization method and apparatus for complex neural networks
CN104050507B (en) Hyperspectral image classification method based on multilayer neural networks
CN104918046B (en) Local descriptor compression method and device
CN108108807A (en) Learning-based image processing method, system and server
CN108171663A (en) Image completion system based on convolutional neural networks with feature-map nearest-neighbor replacement
CN107679572A (en) Image discrimination method, storage device and mobile terminal
CN106686385A (en) Video compressed-sensing reconstruction method and device
Araújo et al. Local adaptive receptive field self-organizing map for image color segmentation
CN110070583A (en) Signal compression and restoration method and system based on tensor decomposition and deep learning
CN107944488B (en) Long-time-series data processing method based on hierarchical deep networks
Gupta et al. Hybrid image compression-encryption scheme based on multilayer stacked autoencoder and logistic map
CN106169961A (en) Network parameter processing method and device for artificial-intelligence-based neural networks
CN115600686A (en) Personalized Transformer-based federated learning model training method and federated learning system
CN110598585A (en) Sit-up action recognition method based on convolutional neural networks
CN113947119A (en) Method for detecting human gait using plantar pressure signals
Zeman et al. Compressed superposition of neural networks for deep learning in edge computing
CN110782503A (en) Face image synthesis method and device based on a two-branch deep correlation network
CN115984949A (en) Low-quality face image recognition method and device with attention mechanism
CN115546325A (en) Adaptive image compressed-sensing reconstruction method and device based on texture features

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180629