CN109086819A - Caffemodel model compression method, system, equipment and medium - Google Patents
- Publication number
- CN109086819A (Application CN201810836366.0A)
- Authority
- CN
- China
- Prior art keywords
- weight matrix
- caffemodel
- model
- matrix
- layers
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
The invention discloses a caffemodel model compression method, system, equipment and medium. The caffemodel model compression method includes: importing a trained caffemodel model using the caffe framework; obtaining a first weight matrix; generating a mask matrix; training the caffemodel model with a training set, the weight matrix of the fc6 layer and/or fc7 layer after an iteration being a second weight matrix; multiplying each element of the second weight matrix by the corresponding element of the mask matrix to generate a third weight matrix, and setting the weight matrix of the fc6 layer and/or fc7 layer to the third weight matrix; when iteration ends, converting the third weight matrix into the corresponding csr sparse matrix format to generate a compressed weight matrix. The method of the invention achieves the effect of reducing the storage space of the caffemodel model.
Description
Technical field
The present invention relates to the field of algorithms, and in particular to a caffemodel model compression method, system, equipment and medium.
Background art
Pvanet-faster-rcnn (an object detection model) is an algorithm model based on convolutional neural networks for detecting objects in images. The caffemodel (deep learning framework model) file obtained by training a standard Pvanet-faster-rcnn model under the caffe (deep learning framework) framework is 369MB (megabytes, a unit of computer storage) in size. The model is composed of several layers, of which the weight parameters of the fc6 layer (the sixth fully connected layer) and the fc7 layer (the seventh fully connected layer) together account for about 352MB.
When the model is computed on a GPU (graphics processing unit), the 369MB caffemodel model resides in GPU video memory. Because the caffemodel model contains a large number of parameters, it occupies a large amount of GPU video memory; it cannot run on GPU cards whose video memory is scarce, and GPU computing performance declines.
Summary of the invention
The technical problem to be solved by the present invention is to overcome the defect in the prior art that a caffemodel model occupies GPU video memory at runtime and thereby degrades GPU computing performance, by providing a caffemodel model compression method, system, equipment and medium.
The present invention solves the above technical problem through the following technical solutions:
A caffemodel model compression method, the caffemodel model compression method comprising:
importing a trained caffemodel model using the caffe framework, the caffemodel model comprising an fc6 layer and/or an fc7 layer, the weight matrix of the fc6 layer and/or fc7 layer being a first weight matrix;
obtaining the first weight matrix;
setting the elements of the first weight matrix whose absolute value is greater than or equal to a preset threshold to 1, setting the elements of the first weight matrix whose absolute value is less than the preset threshold to 0, and thereby generating a mask matrix, the preset threshold being a positive value;
training the caffemodel model with a training set, the weight matrix of the fc6 layer and/or fc7 layer after an iteration being a second weight matrix;
multiplying each element of the second weight matrix by the corresponding element of the mask matrix to generate a third weight matrix, and setting the weight matrix of the fc6 layer and/or fc7 layer to the third weight matrix;
returning to the step of training the caffemodel model with the training set, in which the weight matrix of the fc6 layer and/or fc7 layer after an iteration is the second weight matrix;
until a preset iteration termination condition is reached, whereupon iteration ends, converting the third weight matrix into the corresponding csr (a compressed storage format for sparse matrices) sparse matrix format to generate a compressed weight matrix, and setting the weight matrix of the caffemodel model to the compressed weight matrix.
Preferably, the step of setting the weight matrix of the caffemodel model to the compressed weight matrix further comprises:
after an iteration, obtaining the training accuracy of the caffemodel model as the iterative training accuracy;
the training accuracy of the caffemodel model before iteration being the original training accuracy, calculating the drop ratio of the iterative training accuracy relative to the original training accuracy, and, if the drop ratio is higher than a preset accuracy ratio, reducing the preset threshold and returning to the step of generating the mask matrix;
the step of converting the third weight matrix into the corresponding sparse matrix format to generate the compressed weight matrix comprising:
once the drop ratio is lower than the preset accuracy ratio, converting the third weight matrix into the corresponding sparse matrix format to generate the compressed weight matrix.
Preferably, the range of the preset accuracy ratio is 0.1%-0.5%.
Preferably, after the step of setting the weight matrix of the caffemodel model to the compressed weight matrix, the method further comprises:
receiving input data with the fc6 layer and/or fc7 layer, and multiplying the input data by the compressed weight matrix to obtain output data.
A caffemodel model compression system, the caffemodel model compression system comprising an import module, a mask generation module, an iteration module, a mask module, a return module and a conversion module;
the import module is configured to import a trained caffemodel model using the caffe framework, the caffemodel model comprising an fc6 layer and/or an fc7 layer, the weight matrix of the fc6 layer and/or fc7 layer being a first weight matrix;
the mask generation module is configured to obtain the first weight matrix, set the elements of the first weight matrix whose absolute value is greater than or equal to a preset threshold to 1, set the elements of the first weight matrix whose absolute value is less than the preset threshold to 0, and thereby generate a mask matrix, the preset threshold being a positive value;
the iteration module is configured to train the caffemodel model with a training set, the weight matrix of the fc6 layer and/or fc7 layer after an iteration being a second weight matrix;
the mask module is configured to multiply each element of the second weight matrix by the corresponding element of the mask matrix to generate a third weight matrix, and to set the weight matrix of the fc6 layer and/or fc7 layer to the third weight matrix;
the return module is configured to return to the step of training the caffemodel model with the training set, in which the weight matrix of the fc6 layer and/or fc7 layer after an iteration is the second weight matrix;
the conversion module is configured, once a preset iteration termination condition is reached and iteration ends, to convert the third weight matrix into the corresponding csr sparse matrix format to generate a compressed weight matrix, and to set the weight matrix of the caffemodel model to the compressed weight matrix.
Preferably, the caffemodel model compression system further comprises an accuracy comparison module; the accuracy comparison module is configured, after an iteration, to obtain the training accuracy of the caffemodel model as the iterative training accuracy, the training accuracy of the caffemodel model before iteration being the original training accuracy;
the accuracy comparison module is further configured to calculate the drop ratio of the iterative training accuracy relative to the original training accuracy and, if the drop ratio is higher than a preset accuracy ratio, to reduce the preset threshold and call the mask generation module;
the conversion module is further configured, once the drop ratio is lower than the preset accuracy ratio, to convert the third weight matrix into the corresponding sparse matrix format to generate the compressed weight matrix.
Preferably, the range of the preset accuracy ratio is 0.1%-0.5%.
Preferably, the caffemodel model compression system further comprises a computation module; the computation module is configured to receive input data with the fc6 layer and/or fc7 layer, and to multiply the input data by the compressed weight matrix to obtain output data.
An electronic device comprising a memory, a processor, and a computer program stored on the memory and runnable on the processor, wherein the processor implements the caffemodel model compression method described above when executing the computer program.
A computer-readable storage medium on which a computer program is stored, wherein the steps of the caffemodel model compression method described above are implemented when the computer program is executed by a processor.
The positive effect of the present invention is that: by comparing each element of the weight matrix of the fc6 layer and/or fc7 layer of the caffemodel model with a preset threshold, the present invention generates a mask matrix containing only the elements 0 and 1, and then continues to train the caffemodel model, multiplying each element of the second weight matrix generated after each iteration by the element at the corresponding position of the mask matrix. After the final iteration, the weight matrix of the resulting caffemodel model contains many inactive zero elements, and this weight matrix is converted into the csr sparse matrix format, thereby achieving the effect of reducing the storage space of the caffemodel model.
Brief description of the drawings
Fig. 1 is a flow chart of the caffemodel model compression method of Embodiment 1 of the present invention.
Fig. 2 is a flow chart of the caffemodel model compression method of Embodiment 2 of the present invention.
Fig. 3 is a module schematic diagram of the caffemodel model compression system of Embodiment 3 of the present invention.
Fig. 4 is a module schematic diagram of the caffemodel model compression system of Embodiment 4 of the present invention.
Fig. 5 is a structural schematic diagram of the electronic device of Embodiment 5 of the present invention.
Specific embodiment
The present invention is further illustrated below by way of embodiments, but is not thereby limited to the scope of the described embodiments.
Caffe is an open-source deep learning framework; the present embodiment is implemented on the basis of the caffe deep learning framework and can compress the caffemodel model file obtained after Pvanet-faster-rcnn training. The weights parameters of the fc6 layer and the weights parameters of the fc7 layer in the caffemodel model exist in the form of dense matrices. The files to be prepared before implementation include: the caffemodel model to be compressed, obtained after Pvanet-faster-rcnn training, and the training set used when training that caffemodel model.
Embodiment 1
This embodiment provides a caffemodel model compression method. In the compression process, the fc6 layer and the fc7 layer of the caffemodel model may be given the same compression processing, or the fc6 layer and the fc7 layer may each be compressed separately.
As shown in Fig. 1, the caffemodel model compression method includes:
Step 101: import a trained caffemodel model using the caffe framework; the caffemodel model comprises an fc6 layer and/or an fc7 layer, and the weight matrix of the fc6 layer and/or fc7 layer is a first weight matrix.
Step 102: obtain the first weight matrix.
Step 103: set the elements of the first weight matrix whose absolute value is greater than or equal to a preset threshold to 1, set the elements of the first weight matrix whose absolute value is less than the preset threshold to 0, and generate a mask matrix; the preset threshold is a positive value.
In this embodiment, different preset thresholds may be selected for the fc6 layer and the fc7 layer according to the actual situation.
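The mask generation of step 103 is, in effect, an element-wise magnitude threshold applied to the first weight matrix. A minimal sketch in Python with NumPy (the matrix `w1` and the threshold value are illustrative stand-ins; in the actual method the matrix would be the fc6/fc7 weights read from the imported caffemodel):

```python
import numpy as np

# Stand-in for the first weight matrix of the fc6 or fc7 layer.
rng = np.random.default_rng(0)
w1 = rng.normal(0.0, 0.02, size=(8, 16))

threshold = 0.02  # the preset threshold: a positive value chosen per layer

# Elements with |w| >= threshold become 1, all others become 0.
mask = (np.abs(w1) >= threshold).astype(w1.dtype)
```

The mask is computed once from the imported weights and then held fixed while training continues.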
Step 104: train the caffemodel model with the training set; after an iteration, the weight matrix of the fc6 layer and/or fc7 layer is a second weight matrix.
Step 105: multiply each element of the second weight matrix by the corresponding element of the mask matrix to generate a third weight matrix, and set the weight matrix of the fc6 layer and/or fc7 layer to the third weight matrix.
Step 106: judge whether the preset iteration termination condition is met; if so, iteration ends and step 107 is executed; if not, return to step 104.
Iteration ends once the preset iteration termination condition is reached. The preset iteration termination condition may be identical to the iteration termination condition used when the imported trained caffemodel model was originally trained under the caffe framework.
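Steps 104-106 form a prune-and-retrain loop: after every training iteration the updated (second) weight matrix is multiplied element-wise by the fixed mask, so pruned positions stay at zero while the surviving weights continue to learn. A toy sketch, with a random gradient step standing in for a real caffe solver iteration (`n_iters` and the update rule are illustrative only):

```python
import numpy as np

rng = np.random.default_rng(1)
w = rng.normal(0.0, 0.02, size=(8, 16))        # first weight matrix
mask = (np.abs(w) >= 0.02).astype(w.dtype)      # mask matrix from step 103

n_iters = 5  # stands in for the preset iteration termination condition
for _ in range(n_iters):
    grad = rng.normal(0.0, 0.01, size=w.shape)  # stand-in for a solver step
    w2 = w - 0.1 * grad                         # second weight matrix after training
    w = w2 * mask                               # third weight matrix: re-apply the mask
```

Because the mask is re-applied after every iteration, the zero pattern chosen in step 103 is preserved through retraining.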
Step 107: convert the third weight matrix into the corresponding csr sparse matrix format to generate a compressed weight matrix, and set the weight matrix of the caffemodel model to the compressed weight matrix.
The space occupied by the caffemodel model is thereby greatly reduced, so that the caffemodel model occupies less GPU video memory in subsequent computation, which improves GPU computing performance.
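Step 107's conversion to csr format stores only the nonzero entries plus row/column index arrays, which is where the space saving comes from once most elements have been masked to zero. A sketch using SciPy (the patent does not specify an implementation library; `scipy.sparse` and the matrix contents are used here purely for illustration):

```python
import numpy as np
from scipy.sparse import csr_matrix

rng = np.random.default_rng(2)
w3 = rng.normal(0.0, 0.02, size=(64, 64))
w3[np.abs(w3) < 0.02] = 0.0        # third weight matrix: mostly zero elements

compressed = csr_matrix(w3)        # compressed weight matrix in csr format

# csr keeps three arrays: data (nonzero values), indices (their columns),
# and indptr (row offsets). Conversion back to dense is lossless.
dense_again = compressed.toarray()
```

The storage cost of the csr form grows with the number of nonzeros rather than with the full matrix size, so the more aggressively the mask zeroes the fc6/fc7 weights, the smaller the compressed model.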
Embodiment 2
This embodiment provides a caffemodel model compression method. Compared with Embodiment 1, the difference is that before step 107 the caffemodel model compression method further comprises:
Step 107-1: after an iteration, obtain the training accuracy of the caffemodel model as the iterative training accuracy; the training accuracy of the caffemodel model before iteration is the original training accuracy.
Step 107-2: calculate the drop ratio of the iterative training accuracy relative to the original training accuracy.
Step 107-3: judge whether the drop ratio is higher than a preset accuracy ratio; if so, execute step 107-4; if not, execute step 107. The range of the preset accuracy ratio may be set to 0.1%-0.5%.
Step 107-4: reduce the preset threshold and return to step 103.
Step 107 comprises:
once the drop ratio is lower than the preset accuracy ratio, converting the third weight matrix into the corresponding sparse matrix format to generate the compressed weight matrix.
The preset accuracy ratio in this embodiment is chosen as 0.5%.
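Steps 107-1 to 107-4 tune the preset threshold against the accuracy drop: if pruning at the current threshold costs more than the preset accuracy ratio (here 0.5%), the threshold is lowered and the mask regenerated. A toy sketch in which an invented accuracy function stands in for retraining and evaluating the real model (the numbers `0.90`, `0.02` and the halving schedule are all illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)
w = rng.normal(0.0, 0.02, size=(32, 32))   # stand-in fc-layer weights

original_accuracy = 0.90                    # training accuracy before iteration (invented)
max_drop_ratio = 0.005                      # preset accuracy ratio: 0.5%

def toy_accuracy(pruned_fraction):
    # Invented stand-in for retraining + evaluation: accuracy degrades
    # as a larger fraction of the weights is pruned away.
    return original_accuracy * (1.0 - 0.02 * pruned_fraction)

threshold = 0.04                            # initial preset threshold
while True:
    mask = (np.abs(w) >= threshold).astype(w.dtype)  # step 103: regenerate the mask
    pruned_fraction = 1.0 - float(mask.mean())
    iterative_accuracy = toy_accuracy(pruned_fraction)
    drop_ratio = (original_accuracy - iterative_accuracy) / original_accuracy
    if drop_ratio <= max_drop_ratio:
        break                               # acceptable drop: proceed to csr conversion
    threshold *= 0.5                        # step 107-4: reduce the preset threshold
```

The loop trades compression for accuracy: a lower threshold prunes fewer weights, so the drop ratio eventually falls below the preset accuracy ratio.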
In a practical application, this method was used to compress the caffemodel model of a Pvanet-faster-rcnn that can detect 20 kinds of objects. The initial size of the caffemodel model was 369MB; when the caffemodel model was compressed to 37MB with this method, the drop ratio of the model accuracy was only 0.36%. A new model is thereby obtained whose accuracy is comparable to the original training accuracy of the initial caffemodel model but which occupies far less space.
Preferably, after the step of setting the weight matrix of the caffemodel model to the compressed weight matrix, the method further comprises:
receiving input data with the fc6 layer and/or fc7 layer, and multiplying the input data by the compressed weight matrix to obtain output data.
When the model performs forward computation, the input data of the fc6 layer is multiplied by the weight matrix of the fc6 layer to generate that layer's output data, and the input data of the fc7 layer is multiplied by the weight matrix of the fc7 layer to generate that layer's output data. Compared with before compression, the space occupied by the compressed caffemodel model is smaller, so that in the forward computation of the model the GPU video memory occupied by the model is reduced and the running performance of the GPU is greatly improved.
Embodiment 3
This embodiment provides a caffemodel model compression system. As shown in Fig. 3, the caffemodel model compression system comprises an import module 201, a mask generation module 202, an iteration module 203, a mask module 204, a return module 205 and a conversion module 206.
The import module 201 is configured to import a trained caffemodel model using the caffe framework; the caffemodel model comprises an fc6 layer and/or an fc7 layer, and the weight matrix of the fc6 layer and/or fc7 layer is a first weight matrix.
The mask generation module 202 is configured to obtain the first weight matrix, set the elements of the first weight matrix whose absolute value is greater than or equal to a preset threshold to 1, set the elements whose absolute value is less than the preset threshold to 0, and generate a mask matrix; the preset threshold is a positive value.
In the present embodiment, fc6 layers can select different values from the preset threshold of fc7 layer choosing respectively according to the actual situation.
The iteration module 203 is configured to train the caffemodel model with the training set; after an iteration, the weight matrix of the fc6 layer and/or fc7 layer is a second weight matrix.
The mask module 204 is configured to multiply each element of the second weight matrix by the corresponding element of the mask matrix to generate a third weight matrix, and to set the weight matrix of the fc6 layer and/or fc7 layer to the third weight matrix.
The return module 205 is configured to call the iteration module 203.
The conversion module 206 is configured, once the preset iteration termination condition is reached and iteration ends, to convert the third weight matrix into the corresponding csr sparse matrix format to generate a compressed weight matrix, and to set the weight matrix of the caffemodel model to the compressed weight matrix. The preset iteration termination condition may be identical to the iteration termination condition used when the imported trained caffemodel model was originally trained under the caffe framework.
The space occupied by the caffemodel model is thereby greatly reduced, so that the caffemodel model occupies less GPU video memory in subsequent computation, which improves GPU computing performance.
Embodiment 4
This embodiment provides a caffemodel model compression system. Compared with Embodiment 3, the difference is that, as shown in Fig. 4, the caffemodel model compression system further comprises an accuracy comparison module 207. The accuracy comparison module 207 is configured, after an iteration, to obtain the training accuracy of the caffemodel model as the iterative training accuracy; the training accuracy of the caffemodel model before iteration is the original training accuracy.
The accuracy comparison module 207 is further configured to calculate the drop ratio of the iterative training accuracy relative to the original training accuracy and, if the drop ratio is higher than a preset accuracy ratio, to reduce the preset threshold and call the mask generation module. The range of the preset accuracy ratio may be set to 0.1%-0.5%.
The conversion module 206 is further configured, once the drop ratio is lower than the preset accuracy ratio, to convert the third weight matrix into the corresponding sparse matrix format to generate the compressed weight matrix.
The preset accuracy ratio in this embodiment is chosen as 0.5%.
In a practical application, this method was used to compress the caffemodel model of a Pvanet-faster-rcnn that can detect 20 kinds of objects. The initial size of the caffemodel model was 369MB; when the caffemodel model was compressed to 37MB with this method, the drop ratio of the model accuracy was only 0.36%. A new model is thereby obtained whose accuracy is comparable to the original training accuracy of the initial caffemodel model but which occupies far less space.
Preferably, the caffemodel model compression system further comprises a computation module; the computation module is configured to receive input data with the fc6 layer and/or fc7 layer and to multiply the input data by the compressed weight matrix to obtain output data.
When the model performs forward computation, the input data of the fc6 layer is multiplied by the weight matrix of the fc6 layer to generate that layer's output data, and the input data of the fc7 layer is multiplied by the weight matrix of the fc7 layer to generate that layer's output data. Compared with before compression, the space occupied by the compressed caffemodel model is smaller, so that in the forward computation of the model the GPU video memory occupied by the model is reduced and the running performance of the GPU is greatly improved.
Embodiment 5
Fig. 5 is a structural schematic diagram of the electronic device provided in this embodiment. The electronic device comprises a memory, a processor and a computer program stored on the memory and runnable on the processor; when executing the program, the processor implements the caffemodel model compression method of Embodiment 1. The electronic device 30 shown in Fig. 5 is only an example and should not impose any restriction on the functions and scope of use of the embodiments of the invention.
As shown in Fig. 5, the electronic device 30 may take the form of a general-purpose computing device; for example, it may be a server device. The components of the electronic device 30 may include, but are not limited to: the at least one processor 31 mentioned above, the at least one memory 32 mentioned above, and a bus 33 connecting different system components (including the memory 32 and the processor 31).
The bus 33 includes a data bus, an address bus and a control bus.
The memory 32 may include volatile memory, such as a random access memory (RAM) 321 and/or a cache memory 322, and may further include a read-only memory (ROM) 323.
The memory 32 may also include a program/utility 325 having a set (at least one) of program modules 324; such program modules 324 include, but are not limited to: an operating system, one or more application programs, other program modules and program data, each of which, or some combination of which, may include an implementation of a network environment.
The processor 31 executes various functional applications and data processing, for example the caffemodel model compression method provided by Embodiment 1 of the present invention, by running the computer program stored in the memory 32.
The electronic device 30 may also communicate with one or more external devices 34 (such as a keyboard, a pointing device, etc.). Such communication may be carried out through an input/output (I/O) interface 35. Moreover, the electronic device 30 may also communicate through a network adapter 36 with one or more networks (such as a local area network (LAN), a wide area network (WAN) and/or a public network, for example the internet). As shown in the figure, the network adapter 36 communicates with the other modules of the electronic device 30 through the bus 33. It should be understood that, although not shown in the drawings, other hardware and/or software modules may be used in conjunction with the electronic device 30, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID (disk array) systems, tape drives and data backup storage systems, etc.
It should be noted that, although several units/modules or sub-units/modules of the electronic device are mentioned in the detailed description above, such a division is merely exemplary and not mandatory. In fact, according to embodiments of the present invention, the features and functions of two or more units/modules described above may be embodied in one unit/module; conversely, the features and functions of one unit/module described above may be further divided and embodied by multiple units/modules.
Embodiment 6
This embodiment provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the caffemodel model compression method provided by Embodiment 1 is implemented.
More specifically, the readable storage medium may include, but is not limited to: a portable disk, a hard disk, a random access memory, a read-only memory, an erasable programmable read-only memory, an optical storage device, a magnetic storage device, or any suitable combination of the above.
In a possible embodiment, the present invention may also be implemented in the form of a program product comprising program code; when the program product is run on a terminal device, the program code causes the terminal device to execute the steps of the caffemodel model compression method described in Embodiment 1.
The program code for executing the present invention may be written in any combination of one or more programming languages; the program code may be executed entirely on a user device, partly on a user device, as a standalone software package, partly on a user device and partly on a remote device, or entirely on a remote device.
Although specific embodiments of the present invention have been described above, those skilled in the art will understand that these are only examples and that the protection scope of the present invention is defined by the appended claims. Those skilled in the art may make various changes and modifications to these embodiments without departing from the principle and substance of the present invention, but all such changes and modifications fall within the protection scope of the present invention.
Claims (10)
1. A caffemodel model compression method, characterized in that the caffemodel model compression method comprises:
importing a trained caffemodel model using the caffe framework, the caffemodel model comprising an fc6 layer and/or an fc7 layer, the weight matrix of the fc6 layer and/or fc7 layer being a first weight matrix;
obtaining the first weight matrix;
setting the elements of the first weight matrix whose absolute value is greater than or equal to a preset threshold to 1, setting the elements of the first weight matrix whose absolute value is less than the preset threshold to 0, and generating a mask matrix, the preset threshold being a positive value;
training the caffemodel model with a training set, the weight matrix of the fc6 layer and/or fc7 layer after an iteration being a second weight matrix;
multiplying each element of the second weight matrix by the corresponding element of the mask matrix to generate a third weight matrix, and setting the weight matrix of the fc6 layer and/or fc7 layer to the third weight matrix;
returning to the step of training the caffemodel model with the training set, in which the weight matrix of the fc6 layer and/or fc7 layer after an iteration is the second weight matrix;
until a preset iteration termination condition is reached, whereupon iteration ends, converting the third weight matrix into the corresponding csr sparse matrix format to generate a compressed weight matrix, and setting the weight matrix of the caffemodel model to the compressed weight matrix.
2. The caffemodel model compression method as claimed in claim 1, characterized in that the step of setting the weight matrix of the caffemodel model to the compressed weight matrix further comprises:
after an iteration, obtaining the training accuracy of the caffemodel model as the iterative training accuracy;
the training accuracy of the caffemodel model before iteration being the original training accuracy, calculating the drop ratio of the iterative training accuracy relative to the original training accuracy, and, if the drop ratio is higher than a preset accuracy ratio, reducing the preset threshold and returning to the step of generating the mask matrix;
the step of converting the third weight matrix into the corresponding sparse matrix format to generate the compressed weight matrix comprising:
once the drop ratio is lower than the preset accuracy ratio, converting the third weight matrix into the corresponding sparse matrix format to generate the compressed weight matrix.
3. The caffemodel model compression method as claimed in claim 2, characterized in that the range of the preset accuracy ratio is 0.1%-0.5%.
4. The caffemodel model compression method of claim 1, further comprising, after the step of setting the weight matrix of the caffemodel model to the compressed weight matrix:
Receiving input data with the fc6 layer and/or fc7 layer, and performing a multiplication operation on the input data and the compressed weight matrix to obtain output data.
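For the forward pass in claim 4, keeping the pruned fc6/fc7 weights in CSR format lets the multiplication skip the zero entries entirely. A small scipy illustration; the 2x2 matrix and input values are made up for the example.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Compressed weight matrix in CSR format (zeros are not stored).
compressed_weight = csr_matrix(np.array([[0.5, 0.0], [0.0, -0.8]]))
input_data = np.array([2.0, 3.0])

# Sparse matrix-vector product: only the stored entries contribute.
output_data = compressed_weight.dot(input_data)
print(output_data)  # ~[1.0, -2.4]
```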
5. A caffemodel model compression system, wherein the caffemodel model compression system comprises an import module, a mask generation module, an iteration module, a mask module, a return module and a conversion module;
The import module is configured to import a caffemodel model trained with the caffe framework, the caffemodel model comprising an fc6 layer and/or fc7 layer, the weight matrix of the fc6 layer and/or fc7 layer being a first weight matrix;
The mask generation module is configured to obtain the first weight matrix, set the elements of the first weight matrix whose absolute value is greater than or equal to a preset threshold to 1, set the elements of the first weight matrix whose absolute value is less than the preset threshold to 0, and then generate a mask matrix, the preset threshold being a positive value;
The iteration module is configured to train the caffemodel model using a training set, the weight matrix of the fc6 layer and/or fc7 layer after the iteration being a second weight matrix;
The mask module is configured to multiply each element of the second weight matrix by the corresponding element of the mask matrix to generate a third weight matrix, and to set the weight matrix of the fc6 layer and/or fc7 layer to the third weight matrix;
The return module is configured to return to the step of training the caffemodel model using the training set, wherein the weight matrix of the fc6 layer and/or fc7 layer after the iteration is the second weight matrix;
The conversion module is configured to, once a preset iteration termination condition is reached and the iteration ends, convert the third weight matrix into the corresponding CSR sparse matrix format to generate a compressed weight matrix, and to set the weight matrix of the caffemodel model to the compressed weight matrix.
6. The caffemodel model compression system of claim 5, wherein the caffemodel model compression system further comprises a precision comparison module, configured to, after the iteration, obtain the training accuracy of the caffemodel model as the iterative training accuracy, and to take the training accuracy of the caffemodel model before the iteration as the original training accuracy;
The precision comparison module is further configured to calculate the decrease ratio of the iterative training accuracy relative to the original training accuracy, and, if the decrease ratio is higher than a preset accuracy ratio, to reduce the preset threshold and invoke the mask generation module;
The conversion module is further configured to, once the decrease ratio is lower than the preset accuracy ratio, convert the third weight matrix into the corresponding sparse matrix format to generate the compressed weight matrix.
7. The caffemodel model compression system of claim 6, wherein the range of the preset accuracy ratio is 0.1%-0.5%.
8. The caffemodel model compression system of claim 5, wherein the caffemodel model compression system further comprises a computing module, configured to receive input data with the fc6 layer and/or fc7 layer, and to perform a multiplication operation on the input data and the compressed weight matrix to obtain output data.
9. An electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the caffemodel model compression method of any one of claims 1-4.
10. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the caffemodel model compression method of any one of claims 1-4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810836366.0A CN109086819B (en) | 2018-07-26 | 2018-07-26 | Method, system, equipment and medium for compressing caffemodel model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810836366.0A CN109086819B (en) | 2018-07-26 | 2018-07-26 | Method, system, equipment and medium for compressing caffemodel model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109086819A true CN109086819A (en) | 2018-12-25 |
CN109086819B CN109086819B (en) | 2023-12-05 |
Family
ID=64830821
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810836366.0A Active CN109086819B (en) | 2018-07-26 | 2018-07-26 | Method, system, equipment and medium for compressing caffemodel model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109086819B (en) |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP6898359B2 (en) * | 2016-06-14 | 2021-07-07 | Tartan AI Ltd. | Accelerator for deep neural networks |
CN106503729A (en) * | 2016-09-29 | 2017-03-15 | Tianjin University | A kind of generation method of the image convolution feature based on top layer weights |
CN108229681A (en) * | 2017-12-28 | 2018-06-29 | Zhengzhou Yunhai Information Technology Co., Ltd. | A kind of neural network model compression method, system, device and readable storage medium storing program for executing |
2018-07-26 | CN | CN201810836366.0A patent CN109086819B (en) | Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105184369A (en) * | 2015-09-08 | 2015-12-23 | Hangzhou Langhe Technology Co., Ltd. | Deep learning model matrix compression method and device |
US20180046914A1 (en) * | 2016-08-12 | 2018-02-15 | Beijing Deephi Intelligence Technology Co., Ltd. | Compression method for deep neural networks with load balance |
CN107610192A (en) * | 2017-09-30 | 2018-01-19 | 西安电子科技大学 | Adaptive observation compressed sensing image reconstructing method based on deep learning |
CN108090498A (en) * | 2017-12-28 | 2018-05-29 | 广东工业大学 | A kind of fiber recognition method and device based on deep learning |
Non-Patent Citations (2)
Title |
---|
JULIEN NYAMBAL et al.: "Automated Parking Space Detection Using Convolutional Neural Networks", 2017 Pattern Recognition Association of South Africa and Robotics and Mechatronics * |
SONG HAN et al.: "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding", arXiv * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109711367A (en) * | 2018-12-29 | 2019-05-03 | Beijing Zhongke Cambricon Technology Co., Ltd. | Operation method, device and related product |
CN109740746A (en) * | 2018-12-29 | 2019-05-10 | Beijing Zhongke Cambricon Technology Co., Ltd. | Operation method, device and related product |
CN109740746B (en) * | 2018-12-29 | 2020-01-31 | Cambricon Technologies Corporation Limited | Operation method, device and related product |
CN113360188A (en) * | 2021-05-18 | 2021-09-07 | China University of Petroleum (Beijing) | Parallel processing method and device for optimizing sparse matrix-vector multiplication |
CN113360188B (en) * | 2021-05-18 | 2023-10-31 | China University of Petroleum (Beijing) | Parallel processing method and device for optimizing sparse matrix-vector multiplication |
Also Published As
Publication number | Publication date |
---|---|
CN109086819B (en) | 2023-12-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111445418B (en) | Image defogging processing method and device and computer equipment | |
CN109086819A (en) | Caffemodel model compression method, system, equipment and medium | |
CN110969198A (en) | Distributed training method, device, equipment and storage medium for deep learning model | |
CN107256424A (en) | Three value weight convolutional network processing systems and method | |
CN109558310A (en) | Method for generating test case and device | |
CN114138231B (en) | Method, circuit and SOC for executing matrix multiplication operation | |
CN107506284A (en) | Log processing method and device | |
CN116778148A (en) | Target detection method, target detection device, electronic equipment and storage medium | |
CN110389840A (en) | Load consumption method for early warning, device, computer equipment and storage medium | |
CN111694692B (en) | Data storage erasure method, device and equipment and readable storage medium | |
CN108961268A (en) | A kind of notable figure calculation method and relevant apparatus | |
CN108629410A (en) | Based on principal component analysis dimensionality reduction and/or rise the Processing with Neural Network method tieed up | |
CN110245706B (en) | Lightweight target detection method for embedded application | |
CN110276413B (en) | Model compression method and device | |
CN115617636A (en) | Distributed performance test system | |
CN109117945B (en) | Processor and processing method thereof, chip packaging structure and electronic device | |
CN113139490B (en) | Image feature matching method and device, computer equipment and storage medium | |
CN112580772B (en) | Compression method and device for convolutional neural network | |
CN104320659A (en) | Background modeling method, device and apparatus | |
CN114297022A (en) | Cloud environment anomaly detection method and device, electronic equipment and storage medium | |
CN114819096A (en) | Model training method and device, electronic equipment and storage medium | |
CN106851283B (en) | A kind of method and device of the image adaptive compressed sensing sampling based on standard deviation | |
JPWO2021038840A5 (en) | ||
CN116798052B (en) | Training method and device of text recognition model, storage medium and electronic equipment | |
CN108629409A (en) | A kind of Processing with Neural Network system reducing IO expenses based on principal component analysis |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||