CN109657780A - Model compression method based on active learning of the pruning order - Google Patents

Model compression method based on active learning of the pruning order Download PDF

Info

Publication number
CN109657780A
Authority
CN
China
Prior art keywords
pruning
model
network
layer
lstm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811501702.2A
Other languages
Chinese (zh)
Inventor
丁贵广
钟婧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University
Publication of CN109657780A
Legal status: Pending (current)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G06N 3/082: Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention provides a model compression method based on active learning of the pruning order. An end-to-end pruning framework based on sequential active learning is proposed, which can actively learn the importance of each network layer, generate pruning priorities and make reasonable pruning decisions, solving the problem that existing simple sequential pruning methods are unreasonable. Pruning is applied first to the network layers with the least influence and proceeds gradually from easy to hard, which minimizes the model accuracy loss of the pruning process. At the same time, guided by the final model loss, the importance of convolution kernels is evaluated from multiple angles, efficiently, flexibly and rapidly, which guarantees the correctness and validity of the whole model compression process and provides technical support for subsequently porting large models to portable devices. Experimental results show that the model compression method based on active learning of the pruning order provided by the invention performs at a leading level on multiple data sets and multiple models, and can greatly compress the model size while maintaining model accuracy, so it has strong prospects for practical application.

Description

Model compression method based on active learning of the pruning order
Technical field
The invention belongs to the technical field of neural network models, and in particular relates to a model compression method based on active learning of the pruning order.
Background technique
In recent years, with the rapid development of deep neural networks, academia and industry have jointly witnessed important breakthroughs of deep learning in fields such as computer vision and natural language processing. In some visual tasks, the expressive power of convolutional neural networks (CNNs) has even surpassed the visual processing ability of humans.
Although deep networks have achieved great breakthroughs in the visual field, model size and computational cost have become bottlenecks in practical applications. In real-world scenarios, deep network applications depend on fast hardware computing power, large memory and battery capacity. Large-scale neural networks can run efficiently on servers in a machine room and be computed quickly on GPUs, but they are difficult to apply on resource-limited mobile devices with low-frequency CPUs, such as smart phones and wearable devices. Constrained in this way, many scientific achievements of deep learning are difficult to transform into practically applicable scenarios. To solve this problem, researchers have recently proposed many model compression methods, which aim to compress model size and improve model running speed while keeping model accuracy as unchanged as possible, so that the compressed model can be ported to small devices. The essence of model compression is to produce a small model that has the same expressive power as the large-scale network. This is a great challenge for researchers: a large-scale network is a structure carefully designed and verified to be effective by experts and scholars, and every learned parameter contributes to the overall performance of the network, so discarding parameters inappropriately damages the model's effectiveness. If an extremely small model is needed in a practical scenario, the trade-off between model complexity and model effectiveness must be balanced, and appropriate sacrifices must be made in some respects.
Mainstream model compression methods fall into several branches. The first class is "neural network pruning", which includes sparse pruning and structured pruning. Pruning methods focus on evaluating the network connection weights, cutting the weights that have little influence on the network, and restoring model accuracy by retraining. Sparse pruning sporadically cuts connections of low importance in the network, which can greatly compress model size and reduce memory overhead; however, it is limited by the implementation of the underlying libraries and still has difficulty accelerating the network. Structured pruning, in contrast, keeps the regular shape of the convolution kernel structure well, usually taking a whole convolution kernel as the basic pruning unit; the network model after structured pruning has a regular and complete structure and can be directly accelerated with conventional convolution methods. The second class is "neural network parameter transfer and sharing", i.e. compressing the network model by means such as parameter quantization, low-rank approximation or knowledge distillation. Parameter transfer and sharing methods are usually used as a compression step after model pruning to further compress the network model; used alone, they cannot greatly compress the network size while maintaining model accuracy. The third class is "neural network structure design", i.e. directly designing a miniature neural network, either by manually designing a new network structure or by machine-automated network architecture search. Automated network design can free up manual labor and tailor the most suitable network structure for a specific scenario and specific data, but its complexity is very high: it consumes a large amount of computing resources, and otherwise the set goal cannot be reached.
At present, structured pruning is the most effective method for compressing network size and improving model running speed. Existing structured pruning techniques mainly use either sequential pruning or global pruning. Sequential pruning has a preset pruning order and prunes layer by layer, from front to back or from back to front, removing the relatively unimportant convolution kernels in each layer according to a certain ratio. Global pruning sets an importance threshold, and in each pruning round all convolution kernels in the network whose importance is below the threshold are cut at the same time. In fact, deep model compression is a system-level task, and pruning decisions need to be made according to a global view of the model. Existing methods place their research emphasis on evaluating convolution kernel importance, but their pruning strategies are too simple, leading to unsatisfactory results. An important phenomenon is ignored: each convolutional layer has a different importance. If only a few convolution kernels are removed from an important convolutional layer, the accuracy of the overall model may drop substantially; conversely, on an unimportant convolutional layer, even cutting off a large number of convolution kernels hardly affects accuracy. Clearly, neither sequential pruning nor global pruning takes into account the influence of the importance of each convolutional layer on the pruning result.
Summary of the invention
In order to solve the above technical problem, the present invention provides a model compression method based on active learning of the pruning order, comprising:
S1. Use an LSTM to learn the temporal characteristics of the network and decide whether each network layer needs to be pruned;
S2. In the selected network layer, evaluate and cut the network layer parameters of that layer, and apply a recovery mechanism to immediately restore the model accuracy after pruning;
S3. Perform accelerated retraining of the pruned model using a teacher network;
S4. Obtain a reward R according to the performance and complexity of the retrained pruned model, and update the LSTM with a reinforcement learning method;
S5. Select the 5 models with the highest reward as the input of a new round of the LSTM, and repeat steps S1-S4 until the LSTM no longer produces a better pruning decision, at which point training ends and the optimal pruned model is obtained.
Further, step S1 includes:
(1) First, the neural network model is represented as a string, which serves as the input of the LSTM, in the following way:
(m_i, n_i) represents the i-th node ξ_i of the neural network, where m indicates the node type and takes a value in {0, 1, 2}, representing a convolution, pooling or fully-connected operation respectively; n indicates the node attribute value: when the node is a convolution, n is the number of convolution kernels in that layer; when the node is a pooling operation, n is the pooling stride; when the node is a fully-connected layer, n is the number of neurons in that layer;
(2) The LSTM pruning decision is obtained, which specifically includes:
At each time step, a main node and its next node serve as the input of a multi-layer LSTM, and this input can be represented as [m_i, n_i, m_{i+1}, n_{i+1}]. The LSTM uses a softmax function to decide whether the currently input main node should be pruned; the auxiliary node only provides auxiliary information and no pruning prediction is made for it.
Further, the network layer includes a convolutional layer and/or a fully-connected layer; the network layer parameters of a convolutional layer are its convolution kernels, and the network layer parameters of a fully-connected layer are its fully-connected parameters.
Further, in step S2, the method for evaluating and cutting convolution kernels in a convolutional layer includes:
The importance score s_j is obtained by computing the L2 norm of each channel set in the (i+1)-th convolutional layer, as in the following formula:
s_j = ||C_{i+1,j}||_2,  s.t. j ∈ [1, x_i]
where C_{i+1,j} denotes the j-th channel set of the (i+1)-th convolutional layer, s.t. is the abbreviation of "subject to", and x_i denotes the number of convolution kernels in the i-th convolutional layer;
According to the compression ratio, the channel sets with the smallest importance scores s_j in the (i+1)-th convolutional layer, together with the corresponding convolution kernels in the i-th convolutional layer, are selected and cut.
Further, the recovery mechanism includes: selecting a part of the convolution kernels of the (i+1)-th convolutional layer and amplifying the kernel parameters in a certain proportion, as in the following formula:
where F_{i+1,j} denotes the j-th convolution kernel of the (i+1)-th convolutional layer, F'_{i+1,j} denotes the j-th convolution kernel of the pruned (i+1)-th convolutional layer, and a is a hyper-parameter used to select the convolution kernels with larger deviation.
Further, in step S3, the retraining process is accelerated using a knowledge distillation method: the input model of the LSTM serves as the teacher network, the pruned model serves as the student network, and the student network learns the class probabilities z of all categories output by the teacher network; compared with the training labels, z contains rich knowledge, including the similarities and differences between categories.
Further, in step S4, the reward R is calculated by the following formula:
R = performance - λ × complexity
where the performance term is measured by the accuracy of the model on the validation set or its loss on the training set; the complexity term is measured by the model's FLOPs or total number of parameters; and λ is a hyper-parameter whose optimal value needs to be selected by cross-validation in experiments.
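By way of illustration only, the reward of step S4 can be sketched as follows in Python; the value of λ, the use of validation accuracy for the performance term and of the parameter count for the complexity term are assumptions chosen for this example rather than values fixed by the invention.

```python
def reward(accuracy, num_params, lam=1e-8):
    # R = performance - λ × complexity
    # performance: validation accuracy (the training-set loss could be used instead)
    # complexity: total parameter count (FLOPs could be used instead)
    return accuracy - lam * num_params

# toy usage: a pruned model with 5 million parameters and 93.1% validation accuracy
print(reward(accuracy=0.931, num_params=5_000_000))   # 0.881
```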
Compared with the prior art, the beneficial effects of the present invention are:
The model compression method based on active learning of the pruning order provided by the invention proposes an end-to-end pruning framework based on sequential active learning, which can actively learn the importance of each network layer, generate pruning priorities and make reasonable pruning decisions, solving the problem that existing simple sequential pruning methods are unreasonable. Pruning is applied first to the network layers with the least influence and proceeds gradually from easy to hard, which minimizes the model accuracy loss of the pruning process. At the same time, guided by the final model loss, the importance of convolution kernels is evaluated from multiple angles, efficiently, flexibly and rapidly, which guarantees the correctness and validity of the whole model compression process and provides technical support for subsequently porting large models to portable devices. Experimental results show that the method performs at a leading level on multiple data sets and multiple models and can greatly compress the model size while maintaining model accuracy, so it has strong prospects for practical application.
Brief description of the drawings
Fig. 1 is a flow chart of the model compression method based on active learning of the pruning order.
Detailed description of embodiments
In the following, the model compression method provided by the invention is explained in detail, taking the case where pruning is performed only on convolutional layers as an example.
A model compression method based on active learning of the pruning order, as shown in Fig. 1, comprising:
S1. Use an LSTM (Long Short-Term Memory) network to learn the temporal characteristics of the network and decide whether each convolutional layer needs to be pruned;
S2. In the selected convolutional layer, evaluate and cut the convolution kernels. The kernel evaluation method considers the correlation between two consecutive convolutional layers and uses a data-free method to rapidly evaluate kernel importance, and a recovery mechanism is proposed to immediately restore model accuracy after pruning;
S3. Perform accelerated retraining of the pruned model using a teacher network;
S4. Compute the performance and complexity of the retrained pruned model to obtain the reward R, and update the LSTM with a reinforcement learning method;
S5. Save the retrained pruned models locally, select the 5 models with the highest reward as the input of a new round of the LSTM, and repeat steps S1-S4 until the LSTM no longer produces a better pruning decision, at which point training ends and the optimal pruned model is obtained. Through repeated iteration, the LSTM can better analyze the network structure and propose a correct pruning order, so that the whole pruning process is precise and effective.
Step S1 includes:
(1) First, the neural network model is represented as a string, which serves as the input of the LSTM, in the following way:
(m_i, n_i) represents the i-th node ξ_i of the neural network, where m indicates the node type and takes a value in {0, 1, 2}, representing a convolution, pooling or fully-connected operation respectively; n indicates the node attribute value: when the node is a convolution, n is the number of convolution kernels in that layer; when the node is a pooling operation, n is the pooling stride; when the node is a fully-connected layer, n is the number of neurons in that layer. In this way each node is represented by two values, so a neural network can be represented as a string;
A convolution node can be pruned and is called a main node; the other nodes remain unchanged and are called auxiliary nodes, which provide auxiliary information about the neural network;
(2) The LSTM pruning decision is obtained, which specifically includes:
At each time step, a main node and its next node (which may be either a main node or an auxiliary node) serve as the input of a multi-layer LSTM, and this input can be represented as [m_i, n_i, m_{i+1}, n_{i+1}]. The LSTM uses a softmax function to decide whether the currently input main node should be pruned; auxiliary nodes only provide auxiliary information and no pruning prediction is made for them. For a network structure with N main nodes, the LSTM input comprises N time steps in total, i.e. the above step is repeated N times. Auxiliary nodes are fed to the LSTM together with main nodes at each time step, but no pruning decision is made for them; they only serve as auxiliary information to help the LSTM better understand the whole network.
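As an illustration of this node encoding and of the per-step pruning decision, the following PyTorch-style sketch is given under stated assumptions: the example network string, the hidden size, the number of LSTM layers and the two-class (keep/prune) softmax head are choices made for demonstration and are not specified by the invention.

```python
import torch
import torch.nn as nn

# Hypothetical sketch of the (m, n) node encoding and the per-step keep/prune decision.
# Node types assumed: 0 = convolution, 1 = pooling, 2 = fully connected.
nodes = [(0, 64), (1, 2), (0, 128), (1, 2), (2, 10)]    # example network string

lstm = nn.LSTM(input_size=4, hidden_size=32, num_layers=2, batch_first=True)
head = nn.Linear(32, 2)                                  # softmax over {keep, prune}

decisions, state = [], None
for i in range(len(nodes) - 1):
    m_i, n_i = nodes[i]
    m_next, n_next = nodes[i + 1]
    x = torch.tensor([[[m_i, n_i, m_next, n_next]]], dtype=torch.float)  # [m_i, n_i, m_{i+1}, n_{i+1}]
    out, state = lstm(x, state)
    if m_i == 0:                                         # only main (convolution) nodes get a decision
        probs = torch.softmax(head(out[:, -1]), dim=-1)
        decisions.append(bool(torch.argmax(probs).item()))

print(decisions)   # one keep/prune decision per convolutional layer
```

Only the convolution nodes (m = 0) receive a decision in this sketch, mirroring the distinction between main nodes and auxiliary nodes described above.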
In step S2, evaluating and cutting convolution kernels specifically includes:
A hyper-parameter R_prune is defined, and (R_prune × x_i) convolution kernels are removed from the convolutional layer i that needs pruning, where x_i denotes the number of convolution kernels of the i-th convolutional layer.
A convolution operation can be represented by a triple ⟨I_i, W_i, O_i⟩, where I_i is the input tensor of the i-th convolution, with x_{i-1} channels, height h and width w, and takes values in the set of real numbers R; W_i is the kernel tensor, in which the kernel shape is k × k and the number of input channels is x_{i-1}; and the output tensor O_i has x_i channels.
From the kernel perspective, W_i contains x_i convolution kernels F_{i,j}; from the channel perspective, W_i contains x_{i-1} channel sets C_{i,j}.
When the j-th convolution kernel of a convolutional layer is cut, the corresponding j-th channel set C_{i+1,j} of its next convolutional layer becomes invalid and must be cut at the same time; the other convolutional layers in the network are not affected by the pruning and their structure remains unchanged.
After convolution kernels are cut from the i-th convolutional layer, the output deviation of the (i+1)-th convolutional layer propagates to the final loss function and directly causes a loss of network accuracy. Removing the i-th-layer convolution kernels of low importance together with the corresponding (i+1)-th-layer channel sets minimizes the output deviation ΔO_{i+1}. Since there are usually activation functions, batch normalization or pooling operations between two convolutional layers, the channel sets C_{i+1} of the (i+1)-th layer have a more direct influence on the output value O_{i+1} than the kernels F_i of the i-th layer; therefore the importance of the i-th-layer convolution kernels can be evaluated indirectly by evaluating the importance of the (i+1)-th-layer channel sets.
The present invention evaluates the importance of a channel set with the L2 norm, because the L2 norm takes into account not only the magnitude of the values but also the differences between them. The L2 norm of a channel set reflects, to some extent, the magnitude of the output feature values: channel sets with a small L2 norm tend to produce weakly activated outputs, while those with large absolute values tend to produce strongly activated outputs.
The importance score s_j is obtained by computing the L2 norm of each channel set in the (i+1)-th convolutional layer, as in the following formula:
s_j = ||C_{i+1,j}||_2,  s.t. j ∈ [1, x_i]
where C_{i+1,j} denotes the j-th channel set of the (i+1)-th convolutional layer, s.t. is the abbreviation of "subject to", and x_i denotes the number of convolution kernels in the i-th convolutional layer;
According to the compression ratio, the channel sets with the smallest importance scores s_j in the (i+1)-th convolutional layer, together with the corresponding convolution kernels in the i-th convolutional layer, are selected and cut.
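As a concrete illustration of this scoring rule, the following PyTorch-style sketch computes s_j on two consecutive convolutional layers and removes the lowest-scoring channel sets together with the corresponding kernels; the layer sizes and the value of R_prune are assumptions chosen for demonstration.

```python
import torch

# Hypothetical sketch: score channel sets of layer i+1 by L2 norm and prune
# the corresponding kernels of layer i. Weights follow the usual (out, in, k, k) layout.
W_i  = torch.randn(64, 32, 3, 3)     # i-th layer: x_i = 64 kernels
W_i1 = torch.randn(128, 64, 3, 3)    # (i+1)-th layer: 64 channel sets C_{i+1,j}
R_prune = 0.25                        # assumed pruning ratio

# s_j = ||C_{i+1,j}||_2 for j in [1, x_i]; C_{i+1,j} = W_i1[:, j, :, :]
scores = torch.norm(W_i1.permute(1, 0, 2, 3).reshape(W_i.shape[0], -1), p=2, dim=1)

num_prune = int(R_prune * W_i.shape[0])
keep = torch.argsort(scores)[num_prune:]           # drop the lowest-scoring channel sets

W_i_pruned  = W_i[keep]                            # remove kernels of layer i
W_i1_pruned = W_i1[:, keep]                        # remove matching channel sets of layer i+1
print(W_i_pruned.shape, W_i1_pruned.shape)         # torch.Size([48, 32, 3, 3]) torch.Size([128, 48, 3, 3])
```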
In step S2, after the channel sets and convolution kernels are pruned, F_{i+1} becomes F'_{i+1} and C_{i+1} becomes C'_{i+1}; the reduction of model parameters inevitably causes a loss of accuracy. To reduce this loss and minimize the output deviation ΔO_{i+1}, the recovery mechanism is applied immediately after a convolution kernel is pruned, so as to recover part of the loss.
The recovery mechanism specifically includes:
Selecting a part of the convolution kernels of the (i+1)-th layer and amplifying the kernel parameters in a certain proportion, as in the following formula:
where F_{i+1,j} denotes the j-th convolution kernel of the (i+1)-th convolutional layer, F'_{i+1,j} denotes the j-th convolution kernel of the pruned (i+1)-th convolutional layer, and a is a hyper-parameter used to select the convolution kernels with larger deviation. The optimal a is obtained by cross-validation.
In step S3, in order to improve the efficiency of model retraining, a knowledge distillation method is used to accelerate the retraining process. The input model of the LSTM serves as the teacher network and the pruned model serves as the student network, and the student network learns the class probabilities z of all categories output by the teacher network. Compared with the training labels, z contains rich knowledge, including the similarities and differences between categories. The pruned model can therefore save training time while learning as much knowledge as possible.
The following formula presents the loss function g for training the pruned model, which uses the L2 norm to minimize the distance between the output class probabilities of the teacher and the student.
The cross-entropy loss function f(x, y, θ) based on the data labels and the loss function g(x, z, θ) based on the teacher are combined with a weight to obtain the final loss function L, as in the following formula:
L = β·f(x, y, θ) + g(x, z, θ)
where x denotes the data, y denotes the data labels, θ denotes the model parameters, and β is a weight hyper-parameter whose optimal value is obtained experimentally.
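By way of illustration, a PyTorch-style sketch of the combined loss might look as follows; since the exact form of g is not reproduced in this text, a plain squared L2 distance between the class probabilities is assumed here, and β = 0.5 and the tensor sizes are arbitrary choices.

```python
import torch
import torch.nn.functional as F

def distill_loss(student_logits, teacher_logits, labels, beta=0.5):
    # f(x, y, θ): cross-entropy against the data labels
    f = F.cross_entropy(student_logits, labels)
    # g(x, z, θ): assumed here to be the squared L2 distance between the
    # teacher's and student's class probabilities
    z_teacher = F.softmax(teacher_logits, dim=-1)
    z_student = F.softmax(student_logits, dim=-1)
    g = torch.sum((z_student - z_teacher) ** 2, dim=-1).mean()
    # L = β·f + g
    return beta * f + g

# toy usage
student_logits = torch.randn(8, 10, requires_grad=True)
teacher_logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
loss = distill_loss(student_logits, teacher_logits, labels)
loss.backward()
```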
Step S4 includes:
(1) The reward R is calculated according to the following formula:
R = performance - λ × complexity
where the performance term is measured by the accuracy of the model on the validation set or its loss on the training set; the complexity term is measured by the model's FLOPs (floating-point operations) or total number of parameters; and λ is a hyper-parameter whose optimal value needs to be selected by cross-validation in experiments.
(2) The LSTM is trained with the standard policy gradient algorithm of reinforcement learning so that it generates better pruning decisions, specifically as follows:
For the details of how the standard policy gradient algorithm of reinforcement learning is used to train the LSTM and the concrete meaning of each part of the formula, refer to Ronald J. Williams, "Simple statistical gradient-following algorithms for connectionist reinforcement learning", Machine Learning, 8(3-4):229-256, 1992.
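As an illustration of the policy gradient (REINFORCE) update referred to above, the following PyTorch-style sketch treats each per-layer keep/prune choice as an action and scales the log-probabilities of the sampled actions by the reward R; the baseline-free form, the learning rate and the tensor shapes are assumptions made for this example rather than details given by the invention.

```python
import torch

# Hypothetical REINFORCE-style update of the pruning controller.
# `action_probs` are the softmax outputs for each main node,
# `actions` are the sampled keep/prune decisions, and `R` is the reward.
def policy_gradient_step(action_probs, actions, R, optimizer):
    log_probs = torch.log(action_probs.gather(1, actions.unsqueeze(1)).squeeze(1))
    loss = -(R * log_probs).sum()        # maximize the expected reward
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# toy usage with a dummy controller parameter
logits = torch.randn(6, 2, requires_grad=True)          # 6 main nodes, {keep, prune}
optimizer = torch.optim.SGD([logits], lr=0.01)
probs = torch.softmax(logits, dim=-1)
actions = torch.multinomial(probs, 1).squeeze(1)         # sampled decisions
R = 0.72                                                  # reward from R = performance - λ × complexity
policy_gradient_step(probs, actions, R, optimizer)
```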
In the model compression method provided by the invention, according to the structural characteristics of the neural network, pruning may also be performed only on the fully-connected layers, or on the convolutional layers and the fully-connected layers at the same time. The objects cut in a fully-connected layer are the fully-connected parameters.
Effect assessment
The effectiveness of the model compression method based on active learning of the pruning order provided by the invention is demonstrated by experiments on three standard image classification data sets (CIFAR10, CIFAR100, MNIST) and three common network structures (VGG, ResNet, and a three-layer fully-connected network). Specifically, on CIFAR10 the compression ratio of the VGG19 network reaches 84.7% with essentially unchanged accuracy, and the compression ratio of ResNet reaches 34.1% with an accuracy improvement of 0.56%; on CIFAR100 the compression ratio of the VGG19 network reaches 70.1% with unchanged accuracy; on MNIST the compression ratio of the three-layer fully-connected network reaches 87.27% with essentially unchanged accuracy. The VGG19 experiment on CIFAR10 also shows that the LSTM-guided pruning captures the sensitivity of each network layer well, and experiments verify that the method provided by the invention is better than data-free methods based only on convolution kernels. These experimental results lead existing pruning methods and prove the effectiveness of the model compression method based on active learning of the pruning order provided by the invention.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solution of the present invention and are not limiting. Although the present invention has been described in detail with reference to preferred embodiments, those skilled in the art should understand that the technical solution of the present invention may be modified or equivalently replaced without departing from its spirit and scope, and all such modifications and replacements should fall within the scope of the claims of the present invention.

Claims (7)

1. A model compression method based on active learning of the pruning order, characterized in that the method comprises:
S1. using an LSTM to learn the temporal characteristics of the network and deciding whether each network layer needs to be pruned;
S2. in the selected network layer, evaluating and cutting the network layer parameters of that layer, and applying a recovery mechanism to immediately restore the model accuracy after pruning;
S3. performing accelerated retraining of the pruned model using a teacher network;
S4. obtaining a reward R according to the performance and complexity of the retrained pruned model, and updating the LSTM with a reinforcement learning method;
S5. selecting the 5 models with the highest reward as the input of a new round of the LSTM, and repeating steps S1-S4 until the LSTM no longer produces a better pruning decision, at which point training ends and the optimal pruned model is obtained.
2. The method according to claim 1, characterized in that step S1 includes:
(1) first representing the neural network model as a string, which serves as the input of the LSTM, in the following way:
(m_i, n_i) represents the i-th node ξ_i of the neural network, where m indicates the node type and takes a value in {0, 1, 2}, representing a convolution, pooling or fully-connected operation respectively; n indicates the node attribute value: when the node is a convolution, n is the number of convolution kernels in that layer; when the node is a pooling operation, n is the pooling stride; when the node is a fully-connected layer, n is the number of neurons in that layer;
(2) obtaining the LSTM pruning decision, which specifically includes:
at each time step, a main node and its next node serve as the input of a multi-layer LSTM, and this input can be represented as [m_i, n_i, m_{i+1}, n_{i+1}]; the LSTM uses a softmax function to decide whether the currently input main node should be pruned; the auxiliary node only provides auxiliary information and no pruning prediction is made for it.
3. The method according to claim 1 or 2, characterized in that the network layer includes a convolutional layer and/or a fully-connected layer; the network layer parameters of a convolutional layer are its convolution kernels, and the network layer parameters of a fully-connected layer are its fully-connected parameters.
4. The method according to claim 3, characterized in that, in step S2, the method for evaluating and cutting convolution kernels in a convolutional layer includes:
obtaining an importance score s_j by computing the L2 norm of each channel set in the (i+1)-th convolutional layer, as in the following formula:
s_j = ||C_{i+1,j}||_2,  s.t. j ∈ [1, x_i]
where C_{i+1,j} denotes the j-th channel set of the (i+1)-th convolutional layer, s.t. is the abbreviation of "subject to", and x_i denotes the number of convolution kernels in the i-th convolutional layer;
according to the compression ratio, selecting and cutting the channel sets with the smallest importance scores s_j in the (i+1)-th convolutional layer together with the corresponding convolution kernels in the i-th convolutional layer.
5. The method according to claim 4, characterized in that the recovery mechanism includes: selecting a part of the convolution kernels of the (i+1)-th convolutional layer and amplifying the kernel parameters in a certain proportion, as in the following formula:
where F_{i+1,j} denotes the j-th convolution kernel of the (i+1)-th convolutional layer, F'_{i+1,j} denotes the j-th convolution kernel of the pruned (i+1)-th convolutional layer, and a is a hyper-parameter used to select the convolution kernels with larger deviation.
6. The method according to claim 5, characterized in that, in step S3, the retraining process is accelerated using a knowledge distillation method: the input model of the LSTM serves as the teacher network, the pruned model serves as the student network, and the student network learns the class probabilities z of all categories output by the teacher network; compared with the training labels, z contains rich knowledge, including the similarities and differences between categories.
7. The method according to claim 6, characterized in that, in step S4, the reward R is calculated by the following formula:
R = performance - λ × complexity
where the performance term is measured by the accuracy of the model on the validation set or its loss on the training set; the complexity term is measured by the model's FLOPs or total number of parameters; and λ is a hyper-parameter whose optimal value needs to be selected by cross-validation in experiments.
CN201811501702.2A 2018-06-15 2018-12-10 Model compression method based on active learning of the pruning order Pending CN109657780A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2018106163994 2018-06-15
CN201810616399 2018-06-15

Publications (1)

Publication Number Publication Date
CN109657780A (en) 2019-04-19

Family

ID=66113957

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811501702.2A Pending CN109657780A (en) Model compression method based on active learning of the pruning order

Country Status (1)

Country Link
CN (1) CN109657780A (en)

Cited By (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110309847A (en) * 2019-04-26 2019-10-08 深圳前海微众银行股份有限公司 A kind of model compression method and device
CN110766131A (en) * 2019-05-14 2020-02-07 北京嘀嘀无限科技发展有限公司 Data processing device and method and electronic equipment
CN110619385A (en) * 2019-08-31 2019-12-27 电子科技大学 Structured network model compression acceleration method based on multi-stage pruning
CN110619385B (en) * 2019-08-31 2022-07-29 电子科技大学 Structured network model compression acceleration method based on multi-stage pruning
WO2021043193A1 (en) * 2019-09-04 2021-03-11 华为技术有限公司 Neural network structure search method and image processing method and device
CN110555417A (en) * 2019-09-06 2019-12-10 福建中科亚创动漫科技股份有限公司 Video image recognition system and method based on deep learning
CN110795993A (en) * 2019-09-12 2020-02-14 深圳云天励飞技术有限公司 Method and device for constructing model, terminal equipment and medium
CN110647990A (en) * 2019-09-18 2020-01-03 无锡信捷电气股份有限公司 Cutting method of deep convolutional neural network model based on grey correlation analysis
CN110796177A (en) * 2019-10-10 2020-02-14 温州大学 Method for effectively reducing neural network overfitting in image classification task
WO2021077744A1 (en) * 2019-10-25 2021-04-29 浪潮电子信息产业股份有限公司 Image classification method, apparatus and device, and computer readable storage medium
CN111062382A (en) * 2019-10-30 2020-04-24 北京交通大学 Channel pruning method for target detection network
CN110929849B (en) * 2019-11-22 2023-09-01 迪爱斯信息技术股份有限公司 Video detection method and device based on neural network model compression
CN110929849A (en) * 2019-11-22 2020-03-27 迪爱斯信息技术股份有限公司 Neural network model compression method and device
CN111210017B (en) * 2019-12-24 2023-09-26 北京迈格威科技有限公司 Method, device, equipment and storage medium for determining layout sequence and data processing
CN111210017A (en) * 2019-12-24 2020-05-29 北京迈格威科技有限公司 Method, device, equipment and storage medium for determining layout sequence and processing data
WO2021129570A1 (en) * 2019-12-25 2021-07-01 神思电子技术股份有限公司 Network pruning optimization method based on network activation and sparsification
CN111222629A (en) * 2019-12-31 2020-06-02 暗物智能科技(广州)有限公司 Neural network model pruning method and system based on adaptive batch normalization
CN113128661A (en) * 2020-01-15 2021-07-16 富士通株式会社 Information processing apparatus, information processing method, and computer program
CN111242287A (en) * 2020-01-15 2020-06-05 东南大学 Neural network compression method based on channel L1 norm pruning
CN112001483A (en) * 2020-08-14 2020-11-27 广州市百果园信息技术有限公司 Method and device for pruning neural network model
CN112734036A (en) * 2021-01-14 2021-04-30 西安电子科技大学 Target detection method based on pruning convolutional neural network
CN112766491A (en) * 2021-01-18 2021-05-07 电子科技大学 Neural network compression method based on Taylor expansion and data driving
WO2022198606A1 (en) * 2021-03-26 2022-09-29 深圳市大疆创新科技有限公司 Deep learning model acquisition method, system and apparatus, and storage medium
CN113344182A (en) * 2021-06-01 2021-09-03 电子科技大学 Network model compression method based on deep learning
CN115238893A (en) * 2022-09-23 2022-10-25 北京航空航天大学 Neural network model quantification method and device for natural language processing
CN116698410A (en) * 2023-06-29 2023-09-05 重庆邮电大学空间通信研究院 Rolling bearing multi-sensor data monitoring method based on convolutional neural network
CN116698410B (en) * 2023-06-29 2024-03-12 重庆邮电大学空间通信研究院 Rolling bearing multi-sensor data monitoring method based on convolutional neural network

Similar Documents

Publication Publication Date Title
CN109657780A (en) Model compression method based on active learning of the pruning order
Zhou et al. Rethinking bottleneck structure for efficient mobile network design
CN108846445B (en) Image processing method
CN104751842B (en) The optimization method and system of deep neural network
Li et al. Large scale recurrent neural network on GPU
CN109460817A (en) A kind of convolutional neural networks on piece learning system based on nonvolatile storage
CN107844784A (en) Face identification method, device, computer equipment and readable storage medium storing program for executing
CN110175628A (en) A kind of compression algorithm based on automatic search with the neural networks pruning of knowledge distillation
CN106355248A (en) Deep convolution neural network training method and device
CN107229757A (en) The video retrieval method encoded based on deep learning and Hash
CN107392224A (en) A kind of crop disease recognizer based on triple channel convolutional neural networks
CN114758180B (en) Knowledge distillation-based lightweight flower identification method
CN106816147A (en) Speech recognition system based on binary neural network acoustic model
CN109063719A (en) A kind of image classification method of co-ordinative construction similitude and category information
CN112163671A (en) New energy scene generation method and system
CN106897744A (en) A kind of self adaptation sets the method and system of depth confidence network parameter
CN109840595A (en) A kind of knowledge method for tracing based on group study behavior feature
WO2023134142A1 (en) Multi-scale point cloud classification method and system
CN109325513A (en) A kind of image classification network training method based on magnanimity list class single image
CN112258557A (en) Visual tracking method based on space attention feature aggregation
CN109145107A (en) Subject distillation method, apparatus, medium and equipment based on convolutional neural networks
CN110188978B (en) University student professional recommendation method based on deep learning
Peng et al. An industrial-grade solution for agricultural image classification tasks
CN110297894A (en) A kind of Intelligent dialogue generation method based on auxiliary network
Wang et al. Towards efficient convolutional neural networks through low-error filter saliency estimation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20190419)