CN112101547B - Pruning method and device for network model, electronic equipment and storage medium

Pruning method and device for network model, electronic equipment and storage medium

Info

Publication number
CN112101547B
Authority
CN
China
Prior art keywords
network model
pruning
convolution layer
parameters
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010964152.9A
Other languages
Chinese (zh)
Other versions
CN112101547A (en)
Inventor
谷宇章
邱守猛
袁泽强
张晓林
Current Assignee
Shanghai Institute of Microsystem and Information Technology of CAS
Original Assignee
Shanghai Institute of Microsystem and Information Technology of CAS
Priority date
Filing date
Publication date
Application filed by Shanghai Institute of Microsystem and Information Technology of CAS
Priority to CN202010964152.9A
Publication of CN112101547A
Application granted
Publication of CN112101547B
Legal status: Active
Anticipated expiration


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G06N3/082: Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks


Abstract

The embodiments of the application disclose a pruning method and apparatus for a network model, an electronic device, and a storage medium. The method comprises: acquiring a training image set and a current network model; inputting the training images into the current network model and determining, from the model's output, the parameters corresponding to each of a plurality of convolution layers; attenuating the parameters of each convolution layer according to that layer's preset pruning rate to obtain attenuated parameters; and, if the difference between an attenuated parameter and a preset threshold falls within a preset interval, eliminating the corresponding parameter from the convolution layer to obtain the pruned network model. By gradually attenuating the convolution-layer parameters, the knowledge learned by the parameters to be eliminated is forced to transfer to the remaining parameters, so the parameter count is reduced without increasing the training burden, and the recognition accuracy of the network model is preserved.

Description

Pruning method and device for network model, electronic equipment and storage medium
Technical Field
The present invention relates to the field of deep learning technologies, and in particular, to a pruning method and apparatus for a network model, an electronic device, and a storage medium.
Background
In recent years, as deep learning technology has developed, network models have become increasingly complex and their parameter counts increasingly large, which imposes a heavy computational burden on practical applications of deep learning models. Research has shown, however, that a trained network model contains many redundant parameters, and that after these are removed the original performance can be recovered with a certain amount of fine-tuning. For example, ResNet-50 has roughly 50 convolutional layers and the full model occupies about 95 MB of memory, yet it can still function properly after 75% of its parameters are removed, with run time reduced by as much as 50%. Removing the redundant parameters of a network model, i.e., pruning it, therefore makes the model lightweight and easy to deploy and apply in real scenarios.
In the prior art, pruning falls mainly into two categories: sparsification during training and pruning after training. Sparsification during training applies sparsity constraints to the model's parameters or structure while the model is being trained, yielding sparse parameters or structures; this reduces model size, cuts inference time, and raises inference speed. Pruning after training deletes unimportant weights from an already trained model to make it sparse and compact. With post-training pruning, however, the accuracy of the network model tends to degrade once the unimportant weights are deleted, so the pruned model generally must be fine-tuned to restore its performance.
Disclosure of Invention
The embodiments of the application provide a pruning method and apparatus for a network model, an electronic device, and a storage medium, which reduce the number of parameters without increasing the training burden while preserving the recognition accuracy of the network model.
The embodiment of the application provides a pruning method for a network model, which comprises the following steps:
acquiring a training image set and a current network model; the current network model comprises a plurality of convolution layers;
pruning is carried out on the current network model based on the training image set, and a pruned network model is obtained;
the pruning process comprises the following steps:
inputting the training image into a current network model, and determining parameters corresponding to each convolution layer in a plurality of convolution layers according to the output of the current network model;
based on the preset pruning rate corresponding to each convolution layer, carrying out attenuation treatment on the parameters corresponding to each convolution layer to obtain attenuation parameters;
if the difference value between the attenuation parameter and the preset threshold value is within the preset interval, eliminating the parameter corresponding to the attenuation parameter in the convolution layer, and obtaining the pruned network model.
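The three-step pruning pass above can be sketched in Python as follows. This is an illustrative reading of the claim language, not the patent's reference implementation: targeting the smallest-magnitude weights is a substituted heuristic (the patent selects targets per layer by preset pruning rate), and the names `prune_rate`, `decay_coeff`, `threshold`, and `interval` are assumptions.

```python
def prune_step(weights, prune_rate, decay_coeff, threshold, interval):
    """One pruning pass: attenuate the targeted fraction of weights and
    eliminate (zero out) any attenuated weight whose distance to the
    preset threshold falls inside the preset interval."""
    n = len(weights)
    k = int(n * prune_rate)                       # number of targeted weights
    # Target the smallest-magnitude weights (illustrative choice).
    order = sorted(range(n), key=lambda i: abs(weights[i]))
    decayed = list(weights)
    removed = []
    for i in order[:k]:
        decayed[i] *= decay_coeff                 # attenuation treatment
        if abs(decayed[i] - threshold) <= interval:
            decayed[i] = 0.0                      # eliminate the parameter
            removed.append(i)
    return decayed, removed
```

Applied repeatedly, weights drift toward the threshold before removal, which is the "smooth" behavior the experiments below report.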
Further, after obtaining the pruned network model, the method further includes:
and re-determining the pruned network model as a current network model, and returning to the step of executing pruning processing on the current network model based on the training image set to obtain the pruned network model.
Further, inputting the training image into the current network model, determining parameters corresponding to each of the plurality of convolution layers according to the output of the current network model, including:
inputting the training image into a current network model, and determining a feature atlas output by each convolution layer in a plurality of convolution layers according to the output of the current network model;
determining parameters corresponding to the feature atlas output by each convolution layer;
and determining the parameters corresponding to each convolution layer according to the preset mapping relation and the parameters corresponding to the feature atlas output by each convolution layer.
Further, based on the preset pruning rate corresponding to each convolution layer, performing attenuation processing on the parameters corresponding to each convolution layer to obtain attenuation parameters, including:
determining a target feature graph set from the feature graph set output by each convolution layer based on a preset pruning rate corresponding to each convolution layer; the ratio of the number of channels of the target feature atlas to the number of channels of the feature atlas is a preset pruning rate;
determining parameters corresponding to the target feature atlas;
carrying out attenuation treatment on parameters corresponding to the target feature atlas based on a preset coefficient to obtain transition parameters;
and determining the attenuation parameters corresponding to each convolution layer according to the preset mapping relation and the transition parameters.
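The selection-then-attenuation steps above can be sketched as follows; the sequential "last a%" choice mirrors Fig. 4, and the function and argument names are illustrative assumptions, not from the patent.

```python
def select_target_channels(num_channels, prune_rate):
    """Pick the trailing `prune_rate` fraction of channel indices as the
    target feature-map set (the sequential choice of Fig. 4)."""
    k = int(num_channels * prune_rate)
    return list(range(num_channels - k, num_channels))

def attenuate(params, target_idx, coeff):
    """Multiply the parameters of the targeted channels by the preset
    coefficient to obtain the transition parameters."""
    targets = set(target_idx)
    return [p * coeff if i in targets else p for i, p in enumerate(params)]
```

The transition parameters would then be mapped back to convolution-layer attenuated parameters via the preset mapping relation.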
Further, after obtaining the pruned network model, the method further includes:
re-determining the pruned network model as a current network model;
and training the current network model by using the training image set to obtain a trained network model.
Further, training the current network model by using the training image set to obtain a trained network model, including:
inputting the training image into a current network model, and determining parameter sets corresponding to a plurality of convolution layers according to the output of the current network model;
determining a parameter set to be pruned from parameter sets corresponding to the plurality of convolution layers, and determining parameters except the parameter set to be pruned in the parameter sets corresponding to the plurality of convolution layers as parameter sets to be updated;
and carrying out pause updating processing on the parameter set to be pruned, and updating the parameter set to be updated to obtain the trained network model.
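The pause-update rule above can be sketched as a masked gradient step. `masked_update` and its arguments are illustrative names, and plain SGD stands in for whatever optimizer the model actually uses.

```python
def masked_update(params, grads, to_prune, lr):
    """Apply a gradient step only to parameters outside the to-be-pruned
    set; parameters marked for pruning keep their value (update paused)."""
    return [p if frozen else p - lr * g
            for p, g, frozen in zip(params, grads, to_prune)]
```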
Correspondingly, the embodiment of the application also provides a pruning device for the network model, which comprises:
the acquisition module is used for acquiring the training image set and the current network model; the current network model comprises a plurality of convolution layers;
the pruning module is used for pruning the current network model based on the training image set to obtain a pruned network model;
the pruning process comprises the following steps:
inputting the training image into a current network model, and determining parameters corresponding to each convolution layer in a plurality of convolution layers according to the output of the current network model;
based on the preset pruning rate corresponding to each convolution layer, carrying out attenuation treatment on the parameters corresponding to each convolution layer to obtain attenuation parameters;
if the difference value between the attenuation parameter and the preset threshold value is within the preset interval, eliminating the parameter corresponding to the attenuation parameter in the convolution layer, and obtaining the pruned network model.
Further, the pruning module includes:
the determining module is used for inputting the training image into the current network model, and determining the corresponding parameter of each convolution layer in the plurality of convolution layers according to the output of the current network model;
the attenuation module is used for carrying out attenuation treatment on the parameters corresponding to each convolution layer based on the preset pruning rate corresponding to each convolution layer to obtain attenuation parameters;
and the rejecting module is used for rejecting the parameters corresponding to the attenuation parameters in the convolution layer if the difference value between the attenuation parameters and the preset threshold value is in the preset interval, so as to obtain the pruned network model.
Accordingly, an embodiment of the application also provides an electronic device comprising a processor and a memory, wherein the memory stores at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by the processor to implement the pruning method for a network model described above.
Accordingly, an embodiment of the application further provides a computer-readable storage medium storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the pruning method for a network model described above.
The embodiment of the application has the following beneficial effects:
the embodiment of the application discloses a pruning method, a pruning device, electronic equipment and a storage medium for a network model, wherein the pruning method specifically comprises the steps of obtaining a training image set and a current network model, wherein the current network model comprises a plurality of convolution layers, pruning the current network model based on the training image set to obtain a pruned network model; the pruning processing step comprises the steps of inputting training images into a current network model, determining parameters corresponding to each convolution layer in a plurality of convolution layers according to the output of the current network model, carrying out attenuation processing on the parameters corresponding to each convolution layer based on a preset pruning rate corresponding to each convolution layer to obtain attenuation parameters, and eliminating the parameters corresponding to the attenuation parameters in the convolution layers if the difference value of the attenuation parameters and a preset threshold value is within a preset interval to obtain a pruned network model. Based on the embodiment of the application, the attenuation processing is carried out on the corresponding parameters of the convolution layer, so that the knowledge transfer of the parameter learning corresponding to the convolution layer of the parameters to be eliminated is forced, the training burden is not increased while the parameter quantity is reduced, and the identification accuracy of the network model can be ensured.
Drawings
To illustrate the technical solutions of the embodiments of the present application or of the prior art more clearly, the drawings required by the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application; a person skilled in the art can derive other drawings from them without inventive effort.
FIG. 1 is a schematic illustration of an application environment provided by an embodiment of the present application;
fig. 2 is a schematic flow chart of a pruning method for a network model according to an embodiment of the present application;
FIG. 3 is a flowchart illustrating steps of a pruning process according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of selecting the last a% of the feature maps output by each convolution layer as the target feature atlas, according to an embodiment of the present application;
fig. 5 is a comparison of the recognition accuracy on the training image set of the AlexNet model before pruning and after pruning, where pruning with a preset pruning rate of 80% is applied at the 25th and 75th training epochs, according to an embodiment of the present application;
fig. 6 is the same comparison for the AlexNet model with a preset pruning rate of 60% applied at the 25th and 75th epochs;
fig. 7 is the same comparison for the VGG19 model with a preset pruning rate of 80% applied at the 25th and 75th epochs;
fig. 8 is the same comparison for the VGG19 model with a preset pruning rate of 60% applied at the 25th and 75th epochs;
fig. 9 is a schematic structural diagram of a pruning device for a network model according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions, and advantages of the present application clearer, the embodiments of the present application are described in further detail below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without inventive effort fall within the scope of the present application.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic may be included in at least one implementation of the present application. In the description of the embodiments, it should be understood that the terms "comprises" and "comprising" and any variations thereof are intended to cover a non-exclusive inclusion, so that a process, method, apparatus, article, or device that comprises a list of steps or modules is not limited to the steps or modules expressly listed, but may also include steps or modules not expressly listed or inherent to such a process, method, apparatus, article, or device.
Referring to fig. 1, a schematic diagram of an application environment provided in an embodiment of the present application is shown. It includes a server 101 containing a pruning device for a network model. The server 101 obtains a training image set and a current network model comprising a plurality of convolution layers, and prunes the current network model based on the training image set to obtain a pruned network model. The pruning processing comprises: inputting the training images into the current network model; determining, from the model's output, the parameters corresponding to each of the plurality of convolution layers; attenuating each layer's parameters according to its preset pruning rate to obtain attenuated parameters; and, if the difference between an attenuated parameter and a preset threshold falls within a preset interval, eliminating the corresponding parameter from the convolution layer to obtain the pruned network model.
A specific embodiment of the pruning method for a network model of the present application is described below. Fig. 2 is a schematic flow chart of a pruning method for a network model according to an embodiment of the present application. This specification presents the method steps as in the example or flowchart, but more or fewer steps may be included on the basis of conventional or non-inventive labor. The order of steps recited in the embodiments is only one of many possible execution orders and does not represent a unique order; in actual execution, the steps may run sequentially or in parallel (e.g., on a parallel processor or in a multithreaded environment) according to the method shown in the embodiments or drawings. As shown in fig. 2, the method includes:
s201: acquiring a training image set and a current network model; the current network model includes multiple convolution layers.
In this embodiment of the application, the server obtains a training image set and a current network model. The training image set may be, for example, the CIFAR100, OTB50, OTB100, or GOT-10K data set. The current network model may be an AlexNet, VGG19, or SiamFC model, or a target recognition network model already trained on the training image set.
S203: pruning is carried out on the current network model based on the training image set, and a network model after pruning is obtained.
In an optional implementation, after obtaining the pruned network model, the server re-determines the pruned network model as the current network model and returns to the step of pruning the current network model based on the training image set, obtaining a pruned network model.
In another optional implementation, after obtaining the pruned network model, the server re-determines the pruned network model as the current network model and trains it with the training image set to obtain a trained network model. Specifically, the training images are input into the current network model; parameter sets corresponding to the plurality of convolution layers are determined from the model's output; a parameter set to be pruned is determined from these parameter sets, and the remaining parameters are determined as the parameter set to be updated; updating of the parameter set to be pruned is paused, while the parameter set to be updated is updated, yielding the trained network model.
Fig. 3 is a flow chart of the steps of a pruning process provided in an embodiment of the present application; as with fig. 2, more or fewer steps may be included on the basis of conventional or non-inventive labor, the order of steps recited is only one of many possible execution orders, and in actual execution the steps may run sequentially or in parallel (e.g., on a parallel processor or in a multithreaded environment). As shown in fig. 3, the method includes:
s301: and inputting the training image into a current network model, and determining parameters corresponding to each convolution layer in the plurality of convolution layers according to the output of the current network model.
In this embodiment of the present application, the parameters corresponding to each of the plurality of convolution layers are determined as follows: the server inputs the acquired training images into the current network model and determines, from the model's output, the feature atlas output by each convolution layer. It then determines the parameters corresponding to each layer's feature atlas and, using a preset mapping relation, derives the parameters corresponding to each convolution layer. The preset mapping relation is the relation between a convolution layer's parameters and the parameters corresponding to the feature atlas that the layer outputs.
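One plausible concrete form of the preset mapping relation described above: with convolution weights indexed as weight[out_channel][in_channel][kh][kw] (a common layout, assumed here; the patent does not fix one), the parameters corresponding to a given output feature map are the filter at that output index.

```python
def params_for_feature_map(weight, channel_idx):
    """Return the filter slice that produces output feature map
    `channel_idx`, under the assumed (out, in, kh, kw) weight layout."""
    return weight[channel_idx]
```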
S303: and carrying out attenuation treatment on the parameters corresponding to each convolution layer based on the preset pruning rate corresponding to each convolution layer to obtain attenuation parameters.
In this embodiment of the present application, the server determines a target feature atlas from the feature atlas output by each convolution layer based on that layer's preset pruning rate. The preset pruning rates of the plurality of convolution layers may all be identical; they may all be different; or the rates of some convolution layers may be identical while the rates of the remaining layers differ.
In an alternative embodiment, if the preset pruning rate a% is identical for every convolution layer, the server may arbitrarily select, from the feature atlas output by each layer, a set of feature maps whose channel count is a% of that layer's total channel count as the target feature atlas. The arbitrary selection may pick feature maps at random from each layer's output, or pick them sequentially; for example, fig. 4 illustrates selecting the last a% of the feature maps output by each convolution layer as the target feature atlas.
In another alternative embodiment, the preset pruning rate may differ for every convolution layer. For example, assume the current network model contains three convolution layers that output feature atlases of 10, 10, and 20 channels respectively. If the preset pruning rates of the first, second, and third convolution layers are 20%, 40%, and 50%, the server selects 2 channels from the first layer's 10-channel feature atlas, 4 channels from the second layer's 10-channel atlas, and 10 channels from the third layer's 20-channel atlas, taking the selected 16 channels of feature maps as the target feature atlas. Similarly, when the preset pruning rates of some convolution layers are identical and those of the remaining layers differ, the target feature atlas may be determined in the same way as in the fully different case, which is not repeated here.
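The channel arithmetic in the three-layer example above can be checked directly; the variable names are illustrative.

```python
# Per-layer channel counts and preset pruning rates from the example.
layer_channels = [10, 10, 20]
preset_rates = [0.2, 0.4, 0.5]

# Channels targeted per layer, and the size of the target feature atlas.
targets_per_layer = [int(c * r) for c, r in zip(layer_channels, preset_rates)]
total_target_channels = sum(targets_per_layer)
```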
In the embodiment of the present application, after determining a target feature atlas from each convolution layer's output based on that layer's preset pruning rate, the server determines the parameters corresponding to the target feature atlas, attenuates those parameters by a preset coefficient to obtain transition parameters, and then determines the attenuated parameters corresponding to each convolution layer from the transition parameters via the preset mapping relation between a convolution layer's parameters and the parameters of its output feature atlas.
S305: if the difference value between the attenuation parameter and the preset threshold value is within the preset interval, eliminating the parameter corresponding to the attenuation parameter in the convolution layer, and obtaining the pruned network model.
The pruning of the current network model based on the training image set, and the resulting pruned network model, are described experimentally below.
In an alternative embodiment, the AlexNet model is used as the current network model and the CIFAR100 data set as the training image set; the AlexNet model is trained on CIFAR100 for 250 epochs. Pruning with a preset pruning rate of 80% is applied at the 25th and the 75th epoch, yielding a pruned AlexNet model with a final pruning rate of 0.64; the computation of the convolution layers after pruning is 41% of that before pruning. The recognition accuracy on the training image set is determined for the AlexNet model before pruning and after pruning, and the performance of the pruned model is analyzed from these two accuracies. Fig. 5 compares them: the dashed line Baseline shows the recognition accuracy of the AlexNet model before pruning, and the solid line Smooth_Pruning shows the recognition accuracy of the AlexNet model after pruning. As can be clearly seen from fig. 5, throughout the pruning process the accuracy of the pruned AlexNet model on the CIFAR100 data set is essentially consistent with that of the model before pruning; when pruning is applied at the 25th and 75th epochs, the accuracy does not drop abruptly, the whole process is very smooth, and the final accuracy of the pruned model even exceeds that of the model before pruning.
In an alternative embodiment, the AlexNet model is used as the current network model and the CIFAR100 data set is used as the training image set. The AlexNet model is trained with the CIFAR100 data set for 250 iteration batches (epochs). Pruning with a preset pruning rate of 60% is applied to the AlexNet model at the 25th and 75th epochs, yielding a pruned AlexNet model with a final pruning rate of 0.36; the computation between the pruned convolution layers is 13% of that between the convolution layers before pruning. The recognition accuracy of the AlexNet model on the training image set is determined both before and after pruning, and the performance of the pruned model is analyzed on this basis. Fig. 6 illustrates a comparison of the recognition accuracy on the training image set of the AlexNet model before and after pruning, where the preset pruning rate applied at the 25th and 75th epochs is 60%. In the figure, the dashed line Baseline represents the recognition accuracy of the AlexNet model before pruning, and the solid line Smooth_Pruning represents the recognition accuracy of the AlexNet model after pruning. As can be clearly seen from Fig. 6, the accuracy of the pruned AlexNet model on the CIFAR100 data set eventually exceeds that of the AlexNet model before pruning.
In an alternative embodiment, VGG19 is used as the current network model and the CIFAR100 data set is used as the training image set. VGG19 is trained with the CIFAR100 data set for 250 iteration batches (epochs). Pruning with a preset pruning rate of 80% is applied to VGG19 at the 25th and 75th epochs, yielding a pruned VGG19 with a final pruning rate of 0.64; the computation between the pruned convolution layers is 41% of that between the convolution layers before pruning. The recognition accuracy of VGG19 on the training image set is determined both before and after pruning, and the performance of the pruned model is analyzed on this basis. Fig. 7 illustrates a comparison of the recognition accuracy on the training image set of VGG19 before and after pruning, where the preset pruning rate applied at the 25th and 75th epochs is 80%. In the figure, the dashed line Baseline represents the recognition accuracy of the VGG19 model before pruning, and the solid line Smooth_Pruning represents the recognition accuracy of the VGG19 model after pruning. As can be clearly seen from Fig. 7, the convergence rate of the VGG19 model gradually increases during training, and the accuracy of the pruned VGG19 on the CIFAR100 data set eventually exceeds that of VGG19 before pruning.
In an alternative embodiment, VGG19 is used as the current network model and the CIFAR100 data set is used as the training image set. VGG19 is trained with the CIFAR100 data set for 250 iteration batches (epochs). Pruning with a preset pruning rate of 60% is applied to VGG19 at the 25th and 75th epochs, yielding a pruned VGG19 with a final pruning rate of 0.36; the computation between the pruned convolution layers is 13% of that between the convolution layers before pruning. The recognition accuracy of VGG19 on the training image set is determined both before and after pruning, and the performance of the pruned model is analyzed on this basis. Fig. 8 illustrates a comparison of the recognition accuracy on the training image set of VGG19 before and after pruning, where the preset pruning rate applied at the 25th and 75th epochs is 60%. In the figure, the dashed line Baseline represents the recognition accuracy of the VGG19 model before pruning, and the solid line Smooth_Pruning represents the recognition accuracy of the VGG19 model after pruning. As can be clearly seen from Fig. 8, in the initial stage of training the convergence rate of the VGG19 model gradually increases, and the accuracy of the pruned VGG19 on the CIFAR100 data set eventually exceeds that of VGG19 before pruning.
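The numbers reported in the experiments above are mutually consistent under the reading that the preset pruning rate denotes the fraction of channels retained at each pruning event (an interpretation, since the text does not define the convention explicitly): two prunings at rate r leave r² of the channels, and the computation between two adjacent convolution layers scales with the product of the retained input and output channel fractions. A quick check:

```python
def final_rate(rate_per_event, events=2):
    # retained channel fraction after repeated pruning events
    return rate_per_event ** events

def compute_ratio(retained):
    # computation between two convolution layers scales with
    # (retained input channels) x (retained output channels)
    return retained ** 2

r80 = final_rate(0.8)  # 0.64, as reported for the 80% experiments
r60 = final_rate(0.6)  # 0.36, as reported for the 60% experiments
print(round(r80, 2), round(compute_ratio(r80) * 100))  # 0.64 41
print(round(r60, 2), round(compute_ratio(r60) * 100))  # 0.36 13
```

This reproduces the stated final pruning rates (0.64 and 0.36) and the stated computation ratios (41% and 13%).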
In an alternative embodiment, the SiamFC model is used as the current network model, and the OTB50, OTB100 or GOT-10K data set is used as the training image set; the SiamFC model is trained with the chosen data set for 50 iteration batches (epochs). Pruning with a preset pruning rate of 80% is applied to the SiamFC model at the 5th and 15th epochs, yielding a pruned SiamFC model with a final pruning rate of 0.64. The precision and success rate of the SiamFC model on the training image set are determined both before and after pruning and compared, where SiamFC denotes the precision and success rate of the model before pruning, Prun_SiamFC denotes the precision and success rate of the model after pruning, and the precision is the probability that the region predicted by the model matches the actual region.
From the above experimental data, it can be clearly seen that the pruning of the network model disclosed in the application effectively makes the current model lightweight without reducing its performance.
By adopting the pruning method for a network model provided by the embodiments of the application, attenuating the parameters corresponding to the convolution layers forces the knowledge learned by the parameters to be eliminated to transfer to the remaining parameters, so that the number of parameters is reduced without increasing the training burden, and the recognition accuracy of the network model is maintained.
Fig. 9 is a schematic structural diagram of a pruning device for a network model according to an embodiment of the present application, as shown in fig. 9, where the pruning device for a network model includes:
the acquisition module 901 is used for acquiring a training image set and a current network model; the current network model comprises a plurality of convolution layers;
the pruning module 903 is configured to prune the current network model based on the training image set to obtain a pruned network model;
the pruning process comprises the following steps:
inputting the training image into a current network model, and determining parameters corresponding to each convolution layer in a plurality of convolution layers according to the output of the current network model;
based on the preset pruning rate corresponding to each convolution layer, carrying out attenuation treatment on the parameters corresponding to each convolution layer to obtain attenuation parameters;
if the difference value between the attenuation parameter and the preset threshold value is within the preset interval, eliminating the parameter corresponding to the attenuation parameter in the convolution layer, and obtaining the pruned network model.
In this embodiment, the pruning module 903 described above includes:
the determining module 9031 is configured to input the training image into the current network model, and determine a parameter corresponding to each of the plurality of convolution layers according to an output of the current network model;
the attenuation module 9033 is configured to perform attenuation processing on parameters corresponding to each convolution layer based on a preset pruning rate corresponding to each convolution layer, so as to obtain attenuation parameters;
the rejecting module 9035 is configured to reject parameters corresponding to the attenuation parameters in the convolutional layer if the difference between the attenuation parameters and the preset threshold is within a preset interval, and obtain a pruned network model.
The apparatus embodiments and the method embodiments of the present application are based on the same inventive concept.
The electronic device may be configured in a server to store at least one instruction, at least one program, a code set, or an instruction set related to the pruning method for a network model in the method embodiments, where the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by the processor to implement the pruning method for a network model.
The storage medium may be configured in a server to store at least one instruction, at least one program, a code set, or an instruction set related to a pruning method for a network model in a method embodiment, where the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by the processor to implement the pruning method for a network model.
Alternatively, in this embodiment, the storage medium may be located in at least one network server of a plurality of network servers of the computer network. Alternatively, in the present embodiment, the storage medium may include, but is not limited to, including: a U-disk, a Read-only Memory (ROM), a removable hard disk, a magnetic disk, or an optical disk, or the like, which can store program codes.
As can be seen from the above embodiments of the pruning method, apparatus, electronic device and storage medium for a network model provided in the present application, the method includes: obtaining a training image set and a current network model, where the current network model includes a plurality of convolution layers; and pruning the current network model based on the training image set to obtain a pruned network model. The pruning processing includes inputting the training images into the current network model, determining the parameters corresponding to each of the plurality of convolution layers according to the output of the current network model, attenuating the parameters corresponding to each convolution layer based on the preset pruning rate corresponding to that layer to obtain attenuation parameters, and, if the difference between an attenuation parameter and a preset threshold is within a preset interval, eliminating the corresponding parameter in the convolution layer to obtain the pruned network model. Based on the embodiments of the application, attenuating the parameters corresponding to the convolution layers forces the knowledge learned by the parameters to be eliminated to transfer to the remaining parameters, so that the number of parameters is reduced without increasing the training burden, and the recognition accuracy of the network model is maintained.
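The retraining described in claim 4 pauses updates to the parameter set to be pruned while updating the remaining parameters. This can be sketched as a masked gradient step; the plain-SGD form, the learning rate, and the binary mask representation are illustrative assumptions, not details fixed by the disclosure.

```python
import numpy as np

def masked_sgd_update(params, grads, update_mask, lr=0.01):
    """One retraining step: parameters in the set to be pruned are
    paused (mask 0) and only the remaining parameters are updated."""
    return params - lr * grads * update_mask

params = np.array([0.5, -0.3, 0.8])
grads = np.array([0.1, 0.2, -0.1])
mask = np.array([1.0, 0.0, 1.0])  # second parameter awaits pruning
new_params = masked_sgd_update(params, grads, mask)
# the paused parameter keeps its value, so it can later be eliminated
# without its update fighting the attenuation
```

Pausing updates in this way prevents the optimizer from re-growing parameters that the attenuation schedule is driving toward elimination.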
It should be noted that: the foregoing sequence of embodiments of the present application is for illustration only, and does not represent the advantages or disadvantages of the embodiments, and the present specification describes specific embodiments, other embodiments being within the scope of the appended claims. In some cases, the actions or steps recited in the claims can be performed in a different order in a different embodiment and can achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or the sequential order shown, to achieve desirable results, and in some embodiments, multitasking parallel processing may be possible or advantageous.
In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments. In particular, for the embodiments of the device, the description is relatively simple, since it is based on embodiments similar to the method, as relevant see the description of parts of the method embodiments.
While the foregoing is directed to the preferred embodiments of the present invention, it will be appreciated by those skilled in the art that changes and modifications may be made without departing from the principles of the invention, such changes and modifications are also intended to be within the scope of the invention.

Claims (7)

1. A method for pruning a network model, comprising:
acquiring a training image set and a current network model; the current network model comprises a plurality of convolution layers;
pruning is carried out on the current network model based on the training image set, and a pruned network model is obtained;
the pruning process comprises the following steps:
inputting the training image set into the current network model, and determining a feature atlas output by each convolution layer in the plurality of convolution layers according to the output of the current network model;
determining parameters corresponding to the feature atlas output by each convolution layer;
determining the parameters corresponding to each convolution layer according to the preset mapping relation and the parameters corresponding to the feature atlas output by each convolution layer;
determining a target feature image set from the feature image set output by each convolution layer based on the preset pruning rate corresponding to each convolution layer; the ratio of the number of channels of the target feature atlas to the number of channels of the feature atlas is the preset pruning rate;
determining parameters corresponding to the target feature atlas;
carrying out attenuation treatment on parameters corresponding to the target feature atlas based on a preset coefficient to obtain transition parameters;
determining attenuation parameters corresponding to each convolution layer according to the preset mapping relation and the transition parameters;
and if the difference value between the attenuation parameter and the preset threshold value is in the preset interval, eliminating the parameter corresponding to the attenuation parameter in the convolution layer to obtain a pruned network model.
2. The method of claim 1, wherein after obtaining the pruned network model, further comprising:
and re-determining the pruned network model as the current network model, and returning to the step of executing pruning processing on the current network model based on the training image set to obtain the pruned network model.
3. The method of claim 1, wherein after obtaining the pruned network model, further comprising:
re-determining the pruned network model as the current network model;
and training the current network model by using the training image set to obtain a trained network model.
4. A method according to claim 3, wherein training the current network model using the training image set to obtain a trained network model comprises:
inputting the training image into the current network model, and determining parameter sets corresponding to the plurality of convolution layers according to the output of the current network model;
determining a parameter set to be pruned from parameter sets corresponding to the plurality of convolution layers, and determining parameters except the parameter set to be pruned in the parameter sets corresponding to the plurality of convolution layers as parameter sets to be updated;
and carrying out pause updating processing on the parameter set to be pruned, and updating the parameter set to be updated to obtain a trained network model.
5. A pruning device for a network model, comprising:
the acquisition module is used for acquiring the training image set and the current network model; the current network model comprises a plurality of convolution layers;
the pruning module is used for pruning the current network model based on the training image set to obtain a pruned network model;
the determining module is used for inputting the training image set into the current network model, and determining a feature atlas output by each convolution layer in the plurality of convolution layers according to the output of the current network model; determining parameters corresponding to the feature atlas output by each convolution layer; determining the parameters corresponding to each convolution layer according to the preset mapping relation and the parameters corresponding to the feature atlas output by each convolution layer;
the attenuation module is used for determining a target feature graph set from the feature graph set output by each convolution layer based on the preset pruning rate corresponding to each convolution layer; the ratio of the number of channels of the target feature atlas to the number of channels of the feature atlas is the preset pruning rate; determining parameters corresponding to the target feature atlas; carrying out attenuation treatment on parameters corresponding to the target feature atlas based on a preset coefficient to obtain transition parameters; determining attenuation parameters corresponding to each convolution layer according to the preset mapping relation and the transition parameters;
and the rejecting module is used for rejecting the parameters corresponding to the attenuation parameters in the convolution layer if the difference value between the attenuation parameters and the preset threshold value is in the preset interval, so as to obtain a pruned network model.
6. An electronic device comprising a processor and a memory, wherein the memory stores at least one instruction, at least one program, a set of codes, or a set of instructions, the at least one instruction, the at least one program, the set of codes, or the set of instructions being loaded and executed by the processor to implement the pruning method for a network model of any one of claims 1-4.
7. A computer readable storage medium having stored therein at least one instruction, at least one program, code set, or instruction set, the at least one instruction, the at least one program, the code set, or instruction set being loaded and executed by a processor to implement a pruning method for a network model according to any one of claims 1-4.
CN202010964152.9A 2020-09-14 2020-09-14 Pruning method and device for network model, electronic equipment and storage medium Active CN112101547B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010964152.9A CN112101547B (en) 2020-09-14 2020-09-14 Pruning method and device for network model, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010964152.9A CN112101547B (en) 2020-09-14 2020-09-14 Pruning method and device for network model, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112101547A CN112101547A (en) 2020-12-18
CN112101547B true CN112101547B (en) 2024-04-16

Family

ID=73751627

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010964152.9A Active CN112101547B (en) 2020-09-14 2020-09-14 Pruning method and device for network model, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112101547B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112734029A (en) * 2020-12-30 2021-04-30 中国科学院计算技术研究所 Neural network channel pruning method, storage medium and electronic equipment
CN113111925A (en) * 2021-03-29 2021-07-13 宁夏新大众机械有限公司 Feed qualification classification method based on deep learning
CN113361697A (en) * 2021-07-14 2021-09-07 深圳思悦创新有限公司 Convolution network model compression method, system and storage medium
CN115186937B (en) * 2022-09-09 2022-11-22 闪捷信息科技有限公司 Prediction model training and data prediction method and device based on multi-party data cooperation

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104684041A (en) * 2015-02-06 2015-06-03 Shanghai Institute of Microsystem and Information Technology of CAS Real-time wireless sensor network routing method supporting large-scale node application
CN109671063A (en) * 2018-12-11 2019-04-23 西安交通大学 A kind of image quality measure method of importance between the network characterization based on depth
CN109886391A (en) * 2019-01-30 2019-06-14 东南大学 A kind of neural network compression method based on the positive and negative diagonal convolution in space
CN110619385A (en) * 2019-08-31 2019-12-27 电子科技大学 Structured network model compression acceleration method based on multi-stage pruning
CN110874631A (en) * 2020-01-20 2020-03-10 浙江大学 Convolutional neural network pruning method based on feature map sparsification
CN111461291A (en) * 2020-03-13 2020-07-28 西安科技大学 Long-distance pipeline inspection method based on YO L Ov3 pruning network and deep learning defogging model
CN111652236A (en) * 2020-04-21 2020-09-11 东南大学 Lightweight fine-grained image identification method for cross-layer feature interaction in weak supervision scene

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11537892B2 (en) * 2017-08-18 2022-12-27 Intel Corporation Slimming of neural networks in machine learning environments
GB2582352B (en) * 2019-03-20 2021-12-15 Imagination Tech Ltd Methods and systems for implementing a convolution transpose layer of a neural network

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104684041A (en) * 2015-02-06 2015-06-03 Shanghai Institute of Microsystem and Information Technology of CAS Real-time wireless sensor network routing method supporting large-scale node application
CN109671063A (en) * 2018-12-11 2019-04-23 西安交通大学 A kind of image quality measure method of importance between the network characterization based on depth
CN109886391A (en) * 2019-01-30 2019-06-14 东南大学 A kind of neural network compression method based on the positive and negative diagonal convolution in space
CN110619385A (en) * 2019-08-31 2019-12-27 电子科技大学 Structured network model compression acceleration method based on multi-stage pruning
CN110874631A (en) * 2020-01-20 2020-03-10 浙江大学 Convolutional neural network pruning method based on feature map sparsification
CN111461291A (en) * 2020-03-13 2020-07-28 西安科技大学 Long-distance pipeline inspection method based on YO L Ov3 pruning network and deep learning defogging model
CN111652236A (en) * 2020-04-21 2020-09-11 东南大学 Lightweight fine-grained image identification method for cross-layer feature interaction in weak supervision scene

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
BFRIFP: Brain Functional Reorganization Inspired Filter Pruning;Shoumeng Qiu 等;《Artificial Neural Networks and Machine Learning – ICANN 2021》;20210907;第12894卷;16-28 *

Also Published As

Publication number Publication date
CN112101547A (en) 2020-12-18

Similar Documents

Publication Publication Date Title
CN112101547B (en) Pruning method and device for network model, electronic equipment and storage medium
US11586909B2 (en) Information processing method, information processing apparatus, and computer readable storage medium
CN114037844A (en) Global rank perception neural network model compression method based on filter characteristic diagram
CN111724370B (en) Multi-task image quality evaluation method and system based on uncertainty and probability
JP6950756B2 (en) Neural network rank optimizer and optimization method
US20200364538A1 (en) Method of performing, by electronic device, convolution operation at certain layer in neural network, and electronic device therefor
CN110930996B (en) Model training method, voice recognition method, device, storage medium and equipment
CN106802888B (en) Word vector training method and device
CN115511069A (en) Neural network training method, data processing method, device and storage medium
CN109145107B (en) Theme extraction method, device, medium and equipment based on convolutional neural network
Pietron et al. Retrain or not retrain?-efficient pruning methods of deep cnn networks
CN107563357A (en) Live dress ornament based on scene cut, which is dressed up, recommends method, apparatus and computing device
CN114429208A (en) Model compression method, device, equipment and medium based on residual structure pruning
CN110147851B (en) Image screening method and device, computer equipment and storage medium
CN109978058B (en) Method, device, terminal and storage medium for determining image classification
CN112560881A (en) Object identification method and device and data processing method
CN113837378A (en) Convolutional neural network compression method based on agent model and gradient optimization
CN116416212B (en) Training method of road surface damage detection neural network and road surface damage detection neural network
CN115170902B (en) Training method of image processing model
JP7073171B2 (en) Learning equipment, learning methods and programs
CN111602145A (en) Optimization method of convolutional neural network and related product
CN114091668A (en) Neural network pruning method and system based on micro-decision maker and knowledge distillation
CN116992944B (en) Image processing method and device based on leavable importance judging standard pruning
CN111666371A (en) Theme-based matching degree determination method and device, electronic equipment and storage medium
CN117131920B (en) Model pruning method based on network structure search

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant