CN112101547B - Pruning method and device for network model, electronic equipment and storage medium

Pruning method and device for network model, electronic equipment and storage medium

Info

Publication number
CN112101547B
Authority
CN
China
Prior art keywords
network model
pruning
convolution layer
parameters
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010964152.9A
Other languages
Chinese (zh)
Other versions
CN112101547A (en)
Inventor
谷宇章
邱守猛
袁泽强
张晓林
Current Assignee
Shanghai Institute of Microsystem and Information Technology of CAS
Original Assignee
Shanghai Institute of Microsystem and Information Technology of CAS
Priority date
Filing date
Publication date
Application filed by Shanghai Institute of Microsystem and Information Technology of CAS
Priority to CN202010964152.9A
Publication of CN112101547A
Application granted
Publication of CN112101547B
Legal status: Active
Anticipated expiration


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G06N3/082: Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks


Abstract

The embodiments of the application disclose a pruning method and apparatus for a network model, an electronic device, and a storage medium. The method comprises: acquiring a training image set and a current network model; inputting the training images into the current network model and determining, from the model's output, the parameters corresponding to each of a plurality of convolution layers; attenuating the parameters of each convolution layer according to that layer's preset pruning rate to obtain attenuated parameters; and, if the difference between an attenuated parameter and a preset threshold falls within a preset interval, eliminating the corresponding parameter from the convolution layer to obtain the pruned network model. By gradually attenuating the convolution-layer parameters, the knowledge learned by the parameters to be eliminated is forced to transfer to the remaining parameters, so the parameter count is reduced without increasing the training burden, and the recognition accuracy of the network model is preserved.

Description

Pruning method and device for network model, electronic equipment and storage medium
Technical Field
The present invention relates to the field of deep learning technologies, and in particular, to a pruning method and apparatus for a network model, an electronic device, and a storage medium.
Background
In recent years, as deep learning technology has developed, network models have become increasingly complex and their parameter counts increasingly large, which imposes a heavy computational burden on practical applications of deep learning models. Research has shown, however, that a trained network model contains many redundant parameters, and that after these are removed the original performance can be recovered with a certain amount of fine-tuning. For example, ResNet-50 has roughly 50 convolutional layers and the full model occupies about 95 MB of memory, yet it can still function properly after 75% of its parameters are removed, with run time reduced by as much as 50%. Removing the redundant parameters of a network model, i.e., pruning it, therefore makes the model lightweight and easy to deploy and apply in real scenarios.
In the prior art, pruning falls mainly into two categories: sparsification during training and pruning after training. Sparsification during training applies sparsity constraints to the model's parameters or structure while the model is being trained, yielding sparse parameters or structures; this reduces model size, cuts inference time, and raises inference speed. Pruning after training deletes unimportant weights from an already trained model to make it sparse and compact. With post-training pruning, however, the accuracy of the network model tends to degrade once the unimportant weights are deleted, so the pruned model generally must be fine-tuned to restore its performance.
Disclosure of Invention
The embodiments of the application provide a pruning method and apparatus for a network model, an electronic device, and a storage medium, which reduce the number of parameters without increasing the training burden while preserving the recognition accuracy of the network model.
The embodiment of the application provides a pruning method for a network model, which comprises the following steps:
acquiring a training image set and a current network model; the current network model comprises a plurality of convolution layers;
pruning is carried out on the current network model based on the training image set, and a pruned network model is obtained;
the pruning process comprises the following steps:
inputting the training image into a current network model, and determining parameters corresponding to each convolution layer in a plurality of convolution layers according to the output of the current network model;
based on the preset pruning rate corresponding to each convolution layer, carrying out attenuation treatment on the parameters corresponding to each convolution layer to obtain attenuation parameters;
if the difference value between the attenuation parameter and the preset threshold value is within the preset interval, eliminating the parameter corresponding to the attenuation parameter in the convolution layer, and obtaining the pruned network model.
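The three-step pruning pass above can be sketched in Python as follows. This is an illustrative reading of the claim language, not the patent's reference implementation: targeting the smallest-magnitude weights is a substituted heuristic (the patent selects targets per layer by preset pruning rate), and the names `prune_rate`, `decay_coeff`, `threshold`, and `interval` are assumptions.

```python
def prune_step(weights, prune_rate, decay_coeff, threshold, interval):
    """One pruning pass: attenuate the targeted fraction of weights and
    eliminate (zero out) any attenuated weight whose distance to the
    preset threshold falls inside the preset interval."""
    n = len(weights)
    k = int(n * prune_rate)                       # number of targeted weights
    # Target the smallest-magnitude weights (illustrative choice).
    order = sorted(range(n), key=lambda i: abs(weights[i]))
    decayed = list(weights)
    removed = []
    for i in order[:k]:
        decayed[i] *= decay_coeff                 # attenuation treatment
        if abs(decayed[i] - threshold) <= interval:
            decayed[i] = 0.0                      # eliminate the parameter
            removed.append(i)
    return decayed, removed
```

Applied repeatedly, weights drift toward the threshold before removal, which is the "smooth" behavior the experiments below report.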
Further, after obtaining the pruned network model, the method further includes:
and re-determining the pruned network model as a current network model, and returning to the step of executing pruning processing on the current network model based on the training image set to obtain the pruned network model.
Further, inputting the training image into the current network model, determining parameters corresponding to each of the plurality of convolution layers according to the output of the current network model, including:
inputting the training image into a current network model, and determining a feature atlas output by each convolution layer in a plurality of convolution layers according to the output of the current network model;
determining parameters corresponding to the feature atlas output by each convolution layer;
and determining the parameters corresponding to each convolution layer according to the preset mapping relation and the parameters corresponding to the feature atlas output by each convolution layer.
Further, based on the preset pruning rate corresponding to each convolution layer, performing attenuation processing on the parameters corresponding to each convolution layer to obtain attenuation parameters, including:
determining a target feature graph set from the feature graph set output by each convolution layer based on a preset pruning rate corresponding to each convolution layer; the ratio of the number of channels of the target feature atlas to the number of channels of the feature atlas is a preset pruning rate;
determining parameters corresponding to the target feature atlas;
carrying out attenuation treatment on parameters corresponding to the target feature atlas based on a preset coefficient to obtain transition parameters;
and determining the attenuation parameters corresponding to each convolution layer according to the preset mapping relation and the transition parameters.
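The selection-then-attenuation steps above can be sketched as follows; the sequential "last a%" choice mirrors Fig. 4, and the function and argument names are illustrative assumptions, not from the patent.

```python
def select_target_channels(num_channels, prune_rate):
    """Pick the trailing `prune_rate` fraction of channel indices as the
    target feature-map set (the sequential choice of Fig. 4)."""
    k = int(num_channels * prune_rate)
    return list(range(num_channels - k, num_channels))

def attenuate(params, target_idx, coeff):
    """Multiply the parameters of the targeted channels by the preset
    coefficient to obtain the transition parameters."""
    targets = set(target_idx)
    return [p * coeff if i in targets else p for i, p in enumerate(params)]
```

The transition parameters would then be mapped back to convolution-layer attenuated parameters via the preset mapping relation.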
Further, after obtaining the pruned network model, the method further includes:
re-determining the pruned network model as a current network model;
and training the current network model by using the training image set to obtain a trained network model.
Further, training the current network model by using the training image set to obtain a trained network model, including:
inputting the training image into a current network model, and determining parameter sets corresponding to a plurality of convolution layers according to the output of the current network model;
determining a parameter set to be pruned from parameter sets corresponding to the plurality of convolution layers, and determining parameters except the parameter set to be pruned in the parameter sets corresponding to the plurality of convolution layers as parameter sets to be updated;
and carrying out pause updating processing on the parameter set to be pruned, and updating the parameter set to be updated to obtain the trained network model.
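The pause-update rule above can be sketched as a masked gradient step. `masked_update` and its arguments are illustrative names, and plain SGD stands in for whatever optimizer the model actually uses.

```python
def masked_update(params, grads, to_prune, lr):
    """Apply a gradient step only to parameters outside the to-be-pruned
    set; parameters marked for pruning keep their value (update paused)."""
    return [p if frozen else p - lr * g
            for p, g, frozen in zip(params, grads, to_prune)]
```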
Correspondingly, the embodiment of the application also provides a pruning device for the network model, which comprises:
the acquisition module is used for acquiring the training image set and the current network model; the current network model comprises a plurality of convolution layers;
the pruning module is used for pruning the current network model based on the training image set to obtain a pruned network model;
the pruning process comprises the following steps:
inputting the training image into a current network model, and determining parameters corresponding to each convolution layer in a plurality of convolution layers according to the output of the current network model;
based on the preset pruning rate corresponding to each convolution layer, carrying out attenuation treatment on the parameters corresponding to each convolution layer to obtain attenuation parameters;
if the difference value between the attenuation parameter and the preset threshold value is within the preset interval, eliminating the parameter corresponding to the attenuation parameter in the convolution layer, and obtaining the pruned network model.
Further, the pruning module includes:
the determining module is used for inputting the training image into the current network model, and determining the corresponding parameter of each convolution layer in the plurality of convolution layers according to the output of the current network model;
the attenuation module is used for carrying out attenuation treatment on the parameters corresponding to each convolution layer based on the preset pruning rate corresponding to each convolution layer to obtain attenuation parameters;
and the rejecting module is used for rejecting the parameters corresponding to the attenuation parameters in the convolution layer if the difference value between the attenuation parameters and the preset threshold value is in the preset interval, so as to obtain the pruned network model.
Accordingly, an embodiment of the application also provides an electronic device comprising a processor and a memory, wherein the memory stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by the processor to implement the pruning method for a network model described above.
Accordingly, an embodiment of the application further provides a computer-readable storage medium storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the pruning method for a network model described above.
The embodiment of the application has the following beneficial effects:
the embodiment of the application discloses a pruning method, a pruning device, electronic equipment and a storage medium for a network model, wherein the pruning method specifically comprises the steps of obtaining a training image set and a current network model, wherein the current network model comprises a plurality of convolution layers, pruning the current network model based on the training image set to obtain a pruned network model; the pruning processing step comprises the steps of inputting training images into a current network model, determining parameters corresponding to each convolution layer in a plurality of convolution layers according to the output of the current network model, carrying out attenuation processing on the parameters corresponding to each convolution layer based on a preset pruning rate corresponding to each convolution layer to obtain attenuation parameters, and eliminating the parameters corresponding to the attenuation parameters in the convolution layers if the difference value of the attenuation parameters and a preset threshold value is within a preset interval to obtain a pruned network model. Based on the embodiment of the application, the attenuation processing is carried out on the corresponding parameters of the convolution layer, so that the knowledge transfer of the parameter learning corresponding to the convolution layer of the parameters to be eliminated is forced, the training burden is not increased while the parameter quantity is reduced, and the identification accuracy of the network model can be ensured.
Drawings
To illustrate the technical solutions of the embodiments of the present application or of the prior art more clearly, the drawings required by the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application; a person skilled in the art can derive other drawings from them without inventive effort.
FIG. 1 is a schematic illustration of an application environment provided by an embodiment of the present application;
fig. 2 is a schematic flow chart of a pruning method for a network model according to an embodiment of the present application;
FIG. 3 is a flowchart illustrating steps of a pruning process according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of selecting the last a% of the feature maps output by each convolution layer as the target feature atlas, according to an embodiment of the present application;
fig. 5 is a comparison of the recognition accuracy on the training image set of the AlexNet model before pruning and after pruning, where pruning with a preset pruning rate of 80% is applied at the 25th and 75th training epochs, according to an embodiment of the present application;
fig. 6 is the same comparison for the AlexNet model with a preset pruning rate of 60% applied at the 25th and 75th epochs;
fig. 7 is the same comparison for the VGG19 model with a preset pruning rate of 80% applied at the 25th and 75th epochs;
fig. 8 is the same comparison for the VGG19 model with a preset pruning rate of 60% applied at the 25th and 75th epochs;
fig. 9 is a schematic structural diagram of a pruning device for a network model according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions, and advantages of the present application clearer, the embodiments of the present application are described in further detail below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without inventive effort fall within the scope of the present application.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic may be included in at least one implementation of the present application. In the description of the embodiments, it should be understood that the terms "comprises" and "comprising" and any variations thereof are intended to cover a non-exclusive inclusion, so that a process, method, apparatus, article, or device that comprises a list of steps or modules is not limited to the steps or modules expressly listed, but may also include steps or modules not expressly listed or inherent to such a process, method, apparatus, article, or device.
Referring to fig. 1, a schematic diagram of an application environment provided in an embodiment of the present application is shown. It includes a server 101 containing a pruning device for a network model. The server 101 obtains a training image set and a current network model comprising a plurality of convolution layers, and prunes the current network model based on the training image set to obtain a pruned network model. The pruning processing comprises: inputting the training images into the current network model; determining, from the model's output, the parameters corresponding to each of the plurality of convolution layers; attenuating each layer's parameters according to its preset pruning rate to obtain attenuated parameters; and, if the difference between an attenuated parameter and a preset threshold falls within a preset interval, eliminating the corresponding parameter from the convolution layer to obtain the pruned network model.
A specific embodiment of the pruning method for a network model of the present application is described below. Fig. 2 is a schematic flow chart of a pruning method for a network model according to an embodiment of the present application. This specification presents the method steps as in the example or flowchart, but more or fewer steps may be included on the basis of conventional or non-inventive labor. The order of steps recited in the embodiments is only one of many possible execution orders and does not represent a unique order; in actual execution, the steps may run sequentially or in parallel (e.g., on a parallel processor or in a multithreaded environment) according to the method shown in the embodiments or drawings. As shown in fig. 2, the method includes:
s201: acquiring a training image set and a current network model; the current network model includes multiple convolution layers.
In this embodiment of the application, the server obtains a training image set and a current network model. The training image set may be, for example, the CIFAR100, OTB50, OTB100, or GOT-10K data set. The current network model may be an AlexNet, VGG19, or SiamFC model, or a target recognition network model already trained on the training image set.
S203: pruning is carried out on the current network model based on the training image set, and a network model after pruning is obtained.
In an optional implementation, after obtaining the pruned network model, the server re-determines the pruned network model as the current network model and returns to the step of pruning the current network model based on the training image set, obtaining a pruned network model.
In another optional implementation, after obtaining the pruned network model, the server re-determines the pruned network model as the current network model and trains it with the training image set to obtain a trained network model. Specifically, the training images are input into the current network model; parameter sets corresponding to the plurality of convolution layers are determined from the model's output; a parameter set to be pruned is determined from these parameter sets, and the remaining parameters are determined as the parameter set to be updated; updating of the parameter set to be pruned is paused, while the parameter set to be updated is updated, yielding the trained network model.
Fig. 3 is a flow chart of the steps of a pruning process provided in an embodiment of the present application; as with fig. 2, more or fewer steps may be included on the basis of conventional or non-inventive labor, the order of steps recited is only one of many possible execution orders, and in actual execution the steps may run sequentially or in parallel (e.g., on a parallel processor or in a multithreaded environment). As shown in fig. 3, the method includes:
s301: and inputting the training image into a current network model, and determining parameters corresponding to each convolution layer in the plurality of convolution layers according to the output of the current network model.
In this embodiment of the present application, the parameters corresponding to each of the plurality of convolution layers are determined as follows: the server inputs the acquired training images into the current network model and determines, from the model's output, the feature atlas output by each convolution layer. It then determines the parameters corresponding to each layer's feature atlas and, using a preset mapping relation, derives the parameters corresponding to each convolution layer. The preset mapping relation is the relation between a convolution layer's parameters and the parameters corresponding to the feature atlas that the layer outputs.
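One plausible concrete form of the preset mapping relation described above: with convolution weights indexed as weight[out_channel][in_channel][kh][kw] (a common layout, assumed here; the patent does not fix one), the parameters corresponding to a given output feature map are the filter at that output index.

```python
def params_for_feature_map(weight, channel_idx):
    """Return the filter slice that produces output feature map
    `channel_idx`, under the assumed (out, in, kh, kw) weight layout."""
    return weight[channel_idx]
```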
S303: and carrying out attenuation treatment on the parameters corresponding to each convolution layer based on the preset pruning rate corresponding to each convolution layer to obtain attenuation parameters.
In this embodiment of the present application, the server determines a target feature atlas from the feature atlas output by each convolution layer based on that layer's preset pruning rate. The preset pruning rates of the plurality of convolution layers may all be identical; they may all be different; or the rates of some convolution layers may be identical while the rates of the remaining layers differ.
In an alternative embodiment, if the preset pruning rate a% is identical for every convolution layer, the server may arbitrarily select, from the feature atlas output by each layer, a set of feature maps whose channel count is a% of that layer's total channel count as the target feature atlas. The arbitrary selection may pick feature maps at random from each layer's output, or pick them sequentially; for example, fig. 4 illustrates selecting the last a% of the feature maps output by each convolution layer as the target feature atlas.
In another alternative embodiment, the preset pruning rate may differ for every convolution layer. For example, assume the current network model contains three convolution layers that output feature atlases of 10, 10, and 20 channels respectively. If the preset pruning rates of the first, second, and third convolution layers are 20%, 40%, and 50%, the server selects 2 channels from the first layer's 10-channel feature atlas, 4 channels from the second layer's 10-channel atlas, and 10 channels from the third layer's 20-channel atlas, taking the selected 16 channels of feature maps as the target feature atlas. Similarly, when the preset pruning rates of some convolution layers are identical and those of the remaining layers differ, the target feature atlas may be determined in the same way as in the fully different case, which is not repeated here.
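The channel arithmetic in the three-layer example above can be checked directly; the variable names are illustrative.

```python
# Per-layer channel counts and preset pruning rates from the example.
layer_channels = [10, 10, 20]
preset_rates = [0.2, 0.4, 0.5]

# Channels targeted per layer, and the size of the target feature atlas.
targets_per_layer = [int(c * r) for c, r in zip(layer_channels, preset_rates)]
total_target_channels = sum(targets_per_layer)
```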
In the embodiment of the present application, after determining a target feature atlas from each convolution layer's output based on that layer's preset pruning rate, the server determines the parameters corresponding to the target feature atlas, attenuates those parameters by a preset coefficient to obtain transition parameters, and then determines the attenuated parameters corresponding to each convolution layer from the transition parameters via the preset mapping relation between a convolution layer's parameters and the parameters of its output feature atlas.
S305: if the difference value between the attenuation parameter and the preset threshold value is within the preset interval, eliminating the parameter corresponding to the attenuation parameter in the convolution layer, and obtaining the pruned network model.
The pruning of the current network model based on the training image set, and the resulting pruned network model, are described experimentally below.
In an alternative embodiment, the AlexNet model is used as the current network model and the CIFAR100 data set as the training image set; the AlexNet model is trained on CIFAR100 for 250 epochs. Pruning with a preset pruning rate of 80% is applied at the 25th and the 75th epoch, yielding a pruned AlexNet model with a final pruning rate of 0.64; the computation of the convolution layers after pruning is 41% of that before pruning. The recognition accuracy on the training image set is determined for the AlexNet model before pruning and after pruning, and the performance of the pruned model is analyzed from these two accuracies. Fig. 5 compares them: the dashed line Baseline shows the recognition accuracy of the AlexNet model before pruning, and the solid line Smooth_Pruning shows the recognition accuracy of the AlexNet model after pruning. As can be clearly seen from fig. 5, throughout the pruning process the accuracy of the pruned AlexNet model on the CIFAR100 data set is essentially consistent with that of the model before pruning; when pruning is applied at the 25th and 75th epochs, the accuracy does not drop abruptly, the whole process is very smooth, and the final accuracy of the pruned model even exceeds that of the model before pruning.
In an alternative embodiment, the AlexNet model is used as the current network model and the CIFAR100 data set is used as the training image set. The AlexNet model is trained with the CIFAR100 data set for 250 iteration batches (epochs). Pruning with a preset pruning rate of 60% is applied to the AlexNet model at the 25th and 75th epochs, yielding a pruned AlexNet model with a final pruning rate of 0.36; the computation between the pruned convolution layers is 13% of that between the convolution layers before pruning. The recognition accuracy of the AlexNet model on the training image set is determined both before and after pruning, and the performance of the pruned model is analyzed on this basis. Fig. 6 illustrates a comparison of the recognition accuracy on the training image set of the AlexNet model before and after pruning, where the preset pruning rate applied at the 25th and 75th epochs is 60%. In the figure, the dashed line Baseline represents the recognition accuracy of the AlexNet model before pruning, and the solid line Smooth_Pruning represents the recognition accuracy of the AlexNet model after pruning. As can be clearly seen from Fig. 6, the accuracy of the pruned AlexNet model on the CIFAR100 data set eventually exceeds that of the AlexNet model before pruning.
In an alternative embodiment, VGG19 is used as the current network model and the CIFAR100 data set is used as the training image set. VGG19 is trained with the CIFAR100 data set for 250 iteration batches (epochs). Pruning with a preset pruning rate of 80% is applied to VGG19 at the 25th and 75th epochs, yielding a pruned VGG19 with a final pruning rate of 0.64; the computation between the pruned convolution layers is 41% of that between the convolution layers before pruning. The recognition accuracy of VGG19 on the training image set is determined both before and after pruning, and the performance of the pruned model is analyzed on this basis. Fig. 7 illustrates a comparison of the recognition accuracy on the training image set of VGG19 before and after pruning, where the preset pruning rate applied at the 25th and 75th epochs is 80%. In the figure, the dashed line Baseline represents the recognition accuracy of the VGG19 model before pruning, and the solid line Smooth_Pruning represents the recognition accuracy of the VGG19 model after pruning. As can be clearly seen from Fig. 7, the convergence rate of the VGG19 model gradually increases during training, and the accuracy of the pruned VGG19 on the CIFAR100 data set eventually exceeds that of VGG19 before pruning.
In an alternative embodiment, VGG19 is used as the current network model and the CIFAR100 data set is used as the training image set. VGG19 is trained with the CIFAR100 data set for 250 iteration batches (epochs). Pruning with a preset pruning rate of 60% is applied to VGG19 at the 25th and 75th epochs, yielding a pruned VGG19 with a final pruning rate of 0.36; the computation between the pruned convolution layers is 13% of that between the convolution layers before pruning. The recognition accuracy of VGG19 on the training image set is determined both before and after pruning, and the performance of the pruned model is analyzed on this basis. Fig. 8 illustrates a comparison of the recognition accuracy on the training image set of VGG19 before and after pruning, where the preset pruning rate applied at the 25th and 75th epochs is 60%. In the figure, the dashed line Baseline represents the recognition accuracy of the VGG19 model before pruning, and the solid line Smooth_Pruning represents the recognition accuracy of the VGG19 model after pruning. As can be clearly seen from Fig. 8, in the initial stage of training the convergence rate of the VGG19 model gradually increases, and the accuracy of the pruned VGG19 on the CIFAR100 data set eventually exceeds that of VGG19 before pruning.
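The numbers reported in the experiments above are mutually consistent under the reading that the preset pruning rate denotes the fraction of channels retained at each pruning event (an interpretation, since the text does not define the convention explicitly): two prunings at rate r leave r² of the channels, and the computation between two adjacent convolution layers scales with the product of the retained input and output channel fractions. A quick check:

```python
def final_rate(rate_per_event, events=2):
    # retained channel fraction after repeated pruning events
    return rate_per_event ** events

def compute_ratio(retained):
    # computation between two convolution layers scales with
    # (retained input channels) x (retained output channels)
    return retained ** 2

r80 = final_rate(0.8)  # 0.64, as reported for the 80% experiments
r60 = final_rate(0.6)  # 0.36, as reported for the 60% experiments
print(round(r80, 2), round(compute_ratio(r80) * 100))  # 0.64 41
print(round(r60, 2), round(compute_ratio(r60) * 100))  # 0.36 13
```

This reproduces the stated final pruning rates (0.64 and 0.36) and the stated computation ratios (41% and 13%).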
In an alternative embodiment, the SiamFC model is used as the current network model, and the OTB50, OTB100 or GOT-10K data set is used as the training image set; the SiamFC model is trained with the chosen data set for 50 iteration batches (epochs). Pruning with a preset pruning rate of 80% is applied to the SiamFC model at the 5th and 15th epochs, yielding a pruned SiamFC model with a final pruning rate of 0.64. The precision and success rate of the SiamFC model on the training image set are determined both before and after pruning and compared, where SiamFC denotes the precision and success rate of the model before pruning, Prun_SiamFC denotes the precision and success rate of the model after pruning, and the precision is the probability that the region predicted by the model matches the actual region.
From the above experimental data, it can be clearly seen that the pruning of the network model disclosed in the application effectively makes the current model lightweight without reducing its performance.
By adopting the pruning method for a network model provided by the embodiments of the application, attenuating the parameters corresponding to the convolution layers forces the knowledge learned by the parameters to be eliminated to transfer to the remaining parameters, so that the number of parameters is reduced without increasing the training burden, and the recognition accuracy of the network model is maintained.
Fig. 9 is a schematic structural diagram of a pruning device for a network model according to an embodiment of the present application, as shown in fig. 9, where the pruning device for a network model includes:
the acquisition module 901 is used for acquiring a training image set and a current network model; the current network model comprises a plurality of convolution layers;
the pruning module 903 is configured to prune the current network model based on the training image set to obtain a pruned network model;
the pruning process comprises the following steps:
inputting the training image into a current network model, and determining parameters corresponding to each convolution layer in a plurality of convolution layers according to the output of the current network model;
based on the preset pruning rate corresponding to each convolution layer, carrying out attenuation treatment on the parameters corresponding to each convolution layer to obtain attenuation parameters;
if the difference value between the attenuation parameter and the preset threshold value is within the preset interval, eliminating the parameter corresponding to the attenuation parameter in the convolution layer, and obtaining the pruned network model.
In this embodiment, the pruning module 903 described above includes:
the determining module 9031 is configured to input the training image into the current network model, and determine a parameter corresponding to each of the plurality of convolution layers according to an output of the current network model;
the attenuation module 9033 is configured to perform attenuation processing on parameters corresponding to each convolution layer based on a preset pruning rate corresponding to each convolution layer, so as to obtain attenuation parameters;
the rejecting module 9035 is configured to reject parameters corresponding to the attenuation parameters in the convolutional layer if the difference between the attenuation parameters and the preset threshold is within a preset interval, and obtain a pruned network model.
The apparatus embodiments and the method embodiments of the present application are based on the same inventive concept.
The electronic device may be configured in a server to store at least one instruction, at least one program, a code set, or an instruction set related to the pruning method for a network model in the method embodiments, where the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by the processor to implement the pruning method for a network model.
The storage medium may be configured in a server to store at least one instruction, at least one program, a code set, or an instruction set related to a pruning method for a network model in a method embodiment, where the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by the processor to implement the pruning method for a network model.
Alternatively, in this embodiment, the storage medium may be located in at least one network server of a plurality of network servers of the computer network. Alternatively, in the present embodiment, the storage medium may include, but is not limited to, including: a U-disk, a Read-only Memory (ROM), a removable hard disk, a magnetic disk, or an optical disk, or the like, which can store program codes.
As can be seen from the above embodiments of the pruning method, apparatus, electronic device and storage medium for a network model provided in the present application, the method includes: obtaining a training image set and a current network model, where the current network model includes a plurality of convolution layers; and pruning the current network model based on the training image set to obtain a pruned network model. The pruning processing includes inputting the training images into the current network model, determining the parameters corresponding to each of the plurality of convolution layers according to the output of the current network model, attenuating the parameters corresponding to each convolution layer based on the preset pruning rate corresponding to that layer to obtain attenuation parameters, and, if the difference between an attenuation parameter and a preset threshold is within a preset interval, eliminating the corresponding parameter in the convolution layer to obtain the pruned network model. Based on the embodiments of the application, attenuating the parameters corresponding to the convolution layers forces the knowledge learned by the parameters to be eliminated to transfer to the remaining parameters, so that the number of parameters is reduced without increasing the training burden, and the recognition accuracy of the network model is maintained.
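The retraining described in claim 4 pauses updates to the parameter set to be pruned while updating the remaining parameters. This can be sketched as a masked gradient step; the plain-SGD form, the learning rate, and the binary mask representation are illustrative assumptions, not details fixed by the disclosure.

```python
import numpy as np

def masked_sgd_update(params, grads, update_mask, lr=0.01):
    """One retraining step: parameters in the set to be pruned are
    paused (mask 0) and only the remaining parameters are updated."""
    return params - lr * grads * update_mask

params = np.array([0.5, -0.3, 0.8])
grads = np.array([0.1, 0.2, -0.1])
mask = np.array([1.0, 0.0, 1.0])  # second parameter awaits pruning
new_params = masked_sgd_update(params, grads, mask)
# the paused parameter keeps its value, so it can later be eliminated
# without its update fighting the attenuation
```

Pausing updates in this way prevents the optimizer from re-growing parameters that the attenuation schedule is driving toward elimination.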
It should be noted that: the foregoing sequence of embodiments of the present application is for illustration only, and does not represent the advantages or disadvantages of the embodiments, and the present specification describes specific embodiments, other embodiments being within the scope of the appended claims. In some cases, the actions or steps recited in the claims can be performed in a different order in a different embodiment and can achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or the sequential order shown, to achieve desirable results, and in some embodiments, multitasking parallel processing may be possible or advantageous.
In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments. In particular, for the embodiments of the device, the description is relatively simple, since it is based on embodiments similar to the method, as relevant see the description of parts of the method embodiments.
While the foregoing is directed to the preferred embodiments of the present invention, it will be appreciated by those skilled in the art that changes and modifications may be made without departing from the principles of the invention, such changes and modifications are also intended to be within the scope of the invention.

Claims (7)

1. A method for pruning a network model, comprising:
acquiring a training image set and a current network model; the current network model comprises a plurality of convolution layers;
pruning is carried out on the current network model based on the training image set, and a pruned network model is obtained;
the pruning process comprises the following steps:
inputting the training image set into the current network model, and determining a feature atlas output by each convolution layer in the plurality of convolution layers according to the output of the current network model;
determining parameters corresponding to the feature atlas output by each convolution layer;
determining the parameters corresponding to each convolution layer according to the preset mapping relation and the parameters corresponding to the feature atlas output by each convolution layer;
determining a target feature image set from the feature image set output by each convolution layer based on the preset pruning rate corresponding to each convolution layer; the ratio of the number of channels of the target feature atlas to the number of channels of the feature atlas is the preset pruning rate;
determining parameters corresponding to the target feature atlas;
carrying out attenuation treatment on parameters corresponding to the target feature atlas based on a preset coefficient to obtain transition parameters;
determining attenuation parameters corresponding to each convolution layer according to the preset mapping relation and the transition parameters;
and if the difference value between the attenuation parameter and the preset threshold value is in the preset interval, eliminating the parameter corresponding to the attenuation parameter in the convolution layer to obtain a pruned network model.
2. The method of claim 1, wherein after obtaining the pruned network model, further comprising:
and re-determining the pruned network model as the current network model, and returning to the step of executing pruning processing on the current network model based on the training image set to obtain the pruned network model.
3. The method of claim 1, wherein after obtaining the pruned network model, further comprising:
re-determining the pruned network model as the current network model;
and training the current network model by using the training image set to obtain a trained network model.
4. A method according to claim 3, wherein training the current network model using the training image set to obtain a trained network model comprises:
inputting the training image into the current network model, and determining parameter sets corresponding to the plurality of convolution layers according to the output of the current network model;
determining a parameter set to be pruned from parameter sets corresponding to the plurality of convolution layers, and determining parameters except the parameter set to be pruned in the parameter sets corresponding to the plurality of convolution layers as parameter sets to be updated;
and carrying out pause updating processing on the parameter set to be pruned, and updating the parameter set to be updated to obtain a trained network model.
5. A pruning device for a network model, comprising:
the acquisition module is used for acquiring the training image set and the current network model; the current network model comprises a plurality of convolution layers;
the pruning module is used for pruning the current network model based on the training image set to obtain a pruned network model;
the determining module is used for inputting the training image set into the current network model, and determining a feature atlas output by each convolution layer in the plurality of convolution layers according to the output of the current network model; determining parameters corresponding to the feature atlas output by each convolution layer; determining the parameters corresponding to each convolution layer according to the preset mapping relation and the parameters corresponding to the feature atlas output by each convolution layer;
the attenuation module is used for determining a target feature graph set from the feature graph set output by each convolution layer based on the preset pruning rate corresponding to each convolution layer; the ratio of the number of channels of the target feature atlas to the number of channels of the feature atlas is the preset pruning rate; determining parameters corresponding to the target feature atlas; carrying out attenuation treatment on parameters corresponding to the target feature atlas based on a preset coefficient to obtain transition parameters; determining attenuation parameters corresponding to each convolution layer according to the preset mapping relation and the transition parameters;
and the rejecting module is used for rejecting the parameters corresponding to the attenuation parameters in the convolution layer if the difference value between the attenuation parameters and the preset threshold value is in the preset interval, so as to obtain a pruned network model.
6. An electronic device comprising a processor and a memory, wherein the memory stores at least one instruction, at least one program, a set of codes, or a set of instructions, the at least one instruction, the at least one program, the set of codes, or the set of instructions being loaded and executed by the processor to implement the pruning method for a network model of any one of claims 1-4.
7. A computer readable storage medium having stored therein at least one instruction, at least one program, code set, or instruction set, the at least one instruction, the at least one program, the code set, or instruction set being loaded and executed by a processor to implement a pruning method for a network model according to any one of claims 1-4.
CN202010964152.9A 2020-09-14 2020-09-14 Pruning method and device for network model, electronic equipment and storage medium Active CN112101547B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010964152.9A CN112101547B (en) 2020-09-14 2020-09-14 Pruning method and device for network model, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010964152.9A CN112101547B (en) 2020-09-14 2020-09-14 Pruning method and device for network model, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112101547A CN112101547A (en) 2020-12-18
CN112101547B true CN112101547B (en) 2024-04-16

Family

ID=73751627

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010964152.9A Active CN112101547B (en) 2020-09-14 2020-09-14 Pruning method and device for network model, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112101547B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112734029A (en) * 2020-12-30 2021-04-30 中国科学院计算技术研究所 Neural network channel pruning method, storage medium and electronic equipment
CN113111925A (en) * 2021-03-29 2021-07-13 宁夏新大众机械有限公司 Feed qualification classification method based on deep learning
CN113361697A (en) * 2021-07-14 2021-09-07 深圳思悦创新有限公司 Convolution network model compression method, system and storage medium
CN115186937B (en) * 2022-09-09 2022-11-22 闪捷信息科技有限公司 Prediction model training and data prediction method and device based on multi-party data cooperation

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104684041A (en) * 2015-02-06 2015-06-03 Shanghai Institute of Microsystem and Information Technology of CAS Real-time wireless sensor network routing method supporting large-scale node application
CN109671063A (en) * 2018-12-11 2019-04-23 西安交通大学 A kind of image quality measure method of importance between the network characterization based on depth
CN109886391A (en) * 2019-01-30 2019-06-14 东南大学 A kind of neural network compression method based on the positive and negative diagonal convolution in space
CN110619385A (en) * 2019-08-31 2019-12-27 电子科技大学 Structured network model compression acceleration method based on multi-stage pruning
CN110874631A (en) * 2020-01-20 2020-03-10 浙江大学 Convolutional neural network pruning method based on feature map sparsification
CN111461291A (en) * 2020-03-13 2020-07-28 西安科技大学 Long-distance pipeline inspection method based on YO L Ov3 pruning network and deep learning defogging model
CN111652236A (en) * 2020-04-21 2020-09-11 东南大学 Lightweight fine-grained image identification method for cross-layer feature interaction in weak supervision scene

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11537892B2 (en) * 2017-08-18 2022-12-27 Intel Corporation Slimming of neural networks in machine learning environments
GB2582352B (en) * 2019-03-20 2021-12-15 Imagination Tech Ltd Methods and systems for implementing a convolution transpose layer of a neural network

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104684041A (en) * 2015-02-06 2015-06-03 Shanghai Institute of Microsystem and Information Technology of CAS Real-time wireless sensor network routing method supporting large-scale node application
CN109671063A (en) * 2018-12-11 2019-04-23 西安交通大学 A kind of image quality measure method of importance between the network characterization based on depth
CN109886391A (en) * 2019-01-30 2019-06-14 东南大学 A kind of neural network compression method based on the positive and negative diagonal convolution in space
CN110619385A (en) * 2019-08-31 2019-12-27 电子科技大学 Structured network model compression acceleration method based on multi-stage pruning
CN110874631A (en) * 2020-01-20 2020-03-10 浙江大学 Convolutional neural network pruning method based on feature map sparsification
CN111461291A (en) * 2020-03-13 2020-07-28 西安科技大学 Long-distance pipeline inspection method based on YO L Ov3 pruning network and deep learning defogging model
CN111652236A (en) * 2020-04-21 2020-09-11 东南大学 Lightweight fine-grained image identification method for cross-layer feature interaction in weak supervision scene

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
BFRIFP: Brain Functional Reorganization Inspired Filter Pruning;Shoumeng Qiu 等;《Artificial Neural Networks and Machine Learning – ICANN 2021》;20210907;第12894卷;16-28 *

Also Published As

Publication number Publication date
CN112101547A (en) 2020-12-18

Similar Documents

Publication Publication Date Title
CN112101547B (en) Pruning method and device for network model, electronic equipment and storage medium
US11586909B2 (en) Information processing method, information processing apparatus, and computer readable storage medium
CN114037844A (en) Global rank perception neural network model compression method based on filter characteristic diagram
CN111724370B (en) Multi-task image quality evaluation method and system based on uncertainty and probability
JP6950756B2 (en) Neural network rank optimizer and optimization method
US20200364538A1 (en) Method of performing, by electronic device, convolution operation at certain layer in neural network, and electronic device therefor
CN110930996B (en) Model training method, voice recognition method, device, storage medium and equipment
CN106802888B (en) Word vector training method and device
CN115511069A (en) Neural network training method, data processing method, device and storage medium
CN109145107B (en) Theme extraction method, device, medium and equipment based on convolutional neural network
Pietron et al. Retrain or not retrain?-efficient pruning methods of deep cnn networks
CN107563357A (en) Live dress ornament based on scene cut, which is dressed up, recommends method, apparatus and computing device
CN114429208A (en) Model compression method, device, equipment and medium based on residual structure pruning
CN110147851B (en) Image screening method and device, computer equipment and storage medium
CN109978058B (en) Method, device, terminal and storage medium for determining image classification
CN112560881A (en) Object identification method and device and data processing method
CN113837378A (en) Convolutional neural network compression method based on agent model and gradient optimization
CN116416212B (en) Training method of road surface damage detection neural network and road surface damage detection neural network
CN115170902B (en) Training method of image processing model
JP7073171B2 (en) Learning equipment, learning methods and programs
CN111602145A (en) Optimization method of convolutional neural network and related product
CN114091668A (en) Neural network pruning method and system based on micro-decision maker and knowledge distillation
CN116992944B (en) Image processing method and device based on leavable importance judging standard pruning
CN111666371A (en) Theme-based matching degree determination method and device, electronic equipment and storage medium
CN117131920B (en) Model pruning method based on network structure search

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant