CN117521763A - Artificial intelligence model compression method integrating group regularized pruning and importance pruning - Google Patents

Artificial intelligence model compression method integrating group regularized pruning and importance pruning Download PDF

Info

Publication number
CN117521763A
CN117521763A
Authority
CN
China
Prior art keywords
network
pruning
loss function
group
importance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410009132.4A
Other languages
Chinese (zh)
Inventor
李锋
李博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian Maritime University
Original Assignee
Dalian Maritime University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian Maritime University
Priority to CN202410009132.4A
Publication of CN117521763A
Legal status: Pending


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G06N 3/082: Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/0464: Convolutional networks [CNN, ConvNet]
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides an artificial intelligence model compression method integrating group regularized pruning and importance pruning, comprising the following steps: initializing the weights and biases in a deep convolutional neural network, and initializing the group regularization penalty factor; adding the cross entropy loss function and the group regularization loss function to obtain the total loss function for network training; performing back propagation according to the total loss function, calculating the gradients of the loss function, and updating the weights and biases of the network to minimize the loss function; continuously optimizing the weights and biases over multiple training iterations to obtain well-trained network parameters; after training, pruning the network and compressing the network model; and performing fine-tuning training on the pruned network. In this method, the penalty factors in group regularized pruning are adaptively adjusted according to the importance of each weight group; with these adaptively adjusted penalty factors the network can be pruned more reasonably, better preserving the weight groups that contribute significantly to network performance.

Description

Artificial intelligence model compression method integrating group regularized pruning and importance pruning
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to an artificial intelligence model compression method integrating group regularized pruning and importance pruning.
Background
Artificial intelligence has been successfully applied to many different practical tasks, and deep neural networks are among the most successful artificial intelligence models, achieving state-of-the-art performance in numerous application areas. Deep neural networks can use large-scale architectures to extract useful features from large amounts of data, which is a major factor in their success. However, it also means that deep neural networks typically carry large-scale parameters and complex computation, which hinders their deployment on mobile and embedded devices. To address this problem, various network compression methods have been developed.
Network pruning is an important artificial intelligence model compression method, often used to compress deep neural network models: by removing unimportant parts of the network, it yields a more compact network without affecting the original accuracy. In particular, structured pruning can remove entire redundant structures from the network, which helps realize both model compression and acceleration.
Existing structured pruning methods mainly fall into importance-based methods and regularization-based methods. Importance-based methods rank network structures by some importance criterion and prune the structures of lower importance. This kind of pruning requires substantial retraining to recover the accuracy lost to pruning, which adds significant computational cost. Regularization-based methods penalize the network weights during training, compressing the weights of redundant structures to zero so that they can be removed from the network. Since these weights are small and contribute little to the network's output, their removal does not cause a large accuracy loss, and good accuracy can be reached without much retraining; this approach therefore does not increase the computational cost.
Among structured pruning methods, group-regularization-based structured pruning can effectively remove redundant structures from a network model, achieving model compression and model acceleration at the same time, and its effectiveness for structured pruning has been verified by a large body of work. However, existing group regularization methods typically use a fixed penalty factor to compress the weights to zero, which carries the unreasonable assumption that all weights are equally important. This assumption contradicts the actual situation: the weights of a neural network tend to have different importance. Some weights contribute more to the performance of the network, while others have less impact on it. Consequently, using the same penalty factor and simply compressing all weights to zero with the same penalty strength degrades network performance. In addition, the penalty factor is a hyperparameter whose value lacks general theoretical guidance; it usually has to be tuned for each problem, consuming a great deal of time and computational cost.
Disclosure of Invention
In view of the above technical problems, the invention provides a model compression method integrating group regularized pruning and importance pruning. In this method, the penalty factors in group regularized pruning are adaptively adjusted according to the importance of each weight group. With these adaptively adjusted penalty factors, the network can be pruned more reasonably, and the weight groups that contribute significantly to network performance are better preserved.
The invention adopts the following technical means:
an artificial intelligence model compression method integrating group regularized pruning and importance pruning, comprising:
S1, initializing the weights and biases in a deep convolutional neural network, and initializing the group regularization penalty factor;
S2, adding the cross entropy loss function and the group regularization loss function to obtain the total loss function for network training;
S3, performing back propagation according to the total loss function, calculating the gradients of the loss function, and updating the weights and biases of the network to minimize the loss function;
S4, continuously optimizing the weights and biases over multiple training iterations, finally obtaining well-trained network parameters;
S5, after training, pruning the network and compressing the network model;
S6, performing fine-tuning training on the pruned network to help it adapt to the pruned structure.
Further, the step S1 specifically includes:
S11, randomly sampling the weights and biases from a normal distribution, initializing them to small values;
S12, initializing the group regularization penalty factor to 0.
Further, the step S2 specifically includes:
S21, the cross entropy loss function is expressed as

$E_D(W) = -\frac{1}{N}\sum_{n=1}^{N}\sum_{c=1}^{C} y_{n,c}\,\log\hat{y}_{n,c}$

where $N$ is the number of training samples, $C$ the number of classes, $y_{n,c}$ the label, and $\hat{y}_{n,c}$ the predicted probability;
S22, the group regularization loss function is expressed as

$\lambda \cdot R(W) + \lambda_g \cdot \sum_{l=1}^{L} R_g\big(W^{(l)}\big)$

S23, adding the cross entropy loss function and the group regularization loss function gives the total loss function for network training:

$E(W) = E_D(W) + \lambda \cdot R(W) + \lambda_g \cdot \sum_{l=1}^{L} R_g\big(W^{(l)}\big)$

wherein $W$ denotes the set of all weights in the network, $E_D(W)$ is the prediction loss function, i.e. the cross entropy loss function, $R(W)$ is the non-structured regularization applied to every weight, a norm operation serving as weight decay, $R_g\big(W^{(l)}\big)$ is the structured sparsity regularization term on the weight groups of layer $l$ of the $L$ layers, and $G^{(l)}$ is the number of weight groups in layer $l$;
S24, applying Group Lasso to $R_g\big(W^{(l)}\big)$, the Group Lasso regularization term is expressed as:

$R_g\big(W^{(l)}\big) = \sum_{g=1}^{G^{(l)}} \big\|W^{(l)}_g\big\|_2, \quad \big\|W^{(l)}_g\big\|_2 = \sqrt{\sum\nolimits_i \big(w^{(l)}_{g,i}\big)^2}$
Further, the step S3 specifically includes:
S31, differentiating the loss function with respect to the network output to obtain the gradient of the loss with respect to the output;
S32, calculating the gradient of each layer in turn according to the chain rule until the gradient of the loss function with respect to each parameter (weights and biases) is obtained;
S33, once the gradient of the loss function with respect to each parameter has been calculated, updating the parameters of the network using an optimization algorithm.
Further, in the step S33, the optimization algorithm used for updating the parameters of the network is:
new parameter = old parameter - learning rate × parameter gradient
where the learning rate is a hyperparameter that controls the step size of each parameter update.
Further, the step S5 specifically includes:
S51, performing an L2-norm operation on the overall parameters of each convolution kernel;
S52, setting a threshold of 0.0001 and removing the convolution kernels whose norm is smaller than the threshold.
Further, the step S6 specifically includes:
training the pruned network and slightly adjusting the parameters of the network model with a small learning rate.
Further, the method can impose more reasonable penalty factors on different weight groups, wherein:
for important weight groups, a smaller penalty factor is applied, so the penalty strength during updates is weaker;
for unimportant weight groups, a larger penalty factor is applied, so the penalty strength during updates is stronger.
Compared with the prior art, the invention has the following advantages:
1. The artificial intelligence model compression method integrating group regularized pruning and importance pruning can significantly improve the performance and accuracy of a network: taking VGG16 as an example, the final accuracy improves by about 0.34% under the same number of training iterations; taking ResNet18 as an example, the final accuracy improves by about 0.28% under the same number of training iterations.
2. The method can reduce the computational complexity of a network and improve its computational efficiency: taking VGG16 as an example, the number of parameters can be reduced by 87% compared with the standard VGG16 network; taking ResNet18 as an example, the number of parameters can be reduced by 81%.
3. The method can automatically identify and adjust the magnitude of the penalty factors without manual tuning, which reduces the workload of manual parameter adjustment and ensures that the network is always subject to appropriate regularization constraints during training.
4. The method can be conveniently applied to existing deep convolutional neural network training and pruning pipelines without large-scale modification or redesign, saving time and resources and making the method easier to implement and popularize.
In summary, the method of adaptively adjusting the group regularization penalty factor provided by the invention can effectively improve the performance and accuracy of deep convolutional neural networks, reduce the number of parameters, and simplify the workflow. The method is simple to operate and fits easily into existing network training and pruning pipelines. These improvements bring broader application prospects and commercial value to the fields of neural network training and artificial intelligence model compression.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings required for the description of the embodiments or the prior art are briefly introduced below. It is apparent that the drawings in the following description show only some embodiments of the present invention, and other drawings may be obtained from them by a person skilled in the art without inventive effort.
Fig. 1 is the overall flowchart of network pruning provided by the present invention.
FIG. 2 is a schematic diagram of group regularization provided by the present invention.
FIG. 3 compares the number of parameters and the runtime speedup ratio of different networks under the same conditions, as provided by an embodiment of the present invention.
Detailed Description
In order that those skilled in the art will better understand the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without inventive effort shall fall within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
As shown in FIG. 1, the invention provides an artificial intelligence model compression method integrating group regularized pruning and importance pruning, comprising the following steps:
S1, initializing the weights and biases in a deep convolutional neural network, and initializing the group regularization penalty factor;
S2, adding the cross entropy loss function and the group regularization loss function to obtain the total loss function for network training; in this embodiment, in each training iteration the group regularization loss function is added on top of the prediction loss computed from the network outputs, which ensures that every weight group of the network is subject to an appropriate regularization constraint.
S3, performing back propagation according to the total loss function, calculating the gradients of the loss function, and updating the weights and biases of the network to minimize the loss function; in this embodiment, dynamically adjusting the group regularization penalty factor allows weight groups that contribute significantly to network performance to be retained more accurately, and weight groups that have little impact on network performance to be pruned more accurately.
S4, continuously optimizing the weights and biases over multiple training iterations, finally obtaining well-trained network parameters; in this way, the number of network parameters can be further reduced and the computational efficiency of the network improved.
S5, after training, pruning the network and compressing the network model;
S6, performing fine-tuning training on the pruned network; fine-tuning helps the network adapt to the post-pruning structure and improves its generalization ability, further improving performance and accuracy.
In specific implementation, as a preferred embodiment of the present invention, the step S1 specifically includes:
S11, randomly sampling the weights and biases from a normal distribution, initializing them to small values;
S12, initializing the group regularization penalty factor to 0.
In specific implementation, as a preferred embodiment of the present invention, the step S2 specifically includes:
S21, the cross entropy loss function is expressed as

$E_D(W) = -\frac{1}{N}\sum_{n=1}^{N}\sum_{c=1}^{C} y_{n,c}\,\log\hat{y}_{n,c}$

where $N$ is the number of training samples, $C$ the number of classes, $y_{n,c}$ the label, and $\hat{y}_{n,c}$ the predicted probability;
S22, the group regularization loss function is expressed as

$\lambda \cdot R(W) + \lambda_g \cdot \sum_{l=1}^{L} R_g\big(W^{(l)}\big)$

S23, adding the cross entropy loss function and the group regularization loss function gives the total loss function for network training:

$E(W) = E_D(W) + \lambda \cdot R(W) + \lambda_g \cdot \sum_{l=1}^{L} R_g\big(W^{(l)}\big)$

wherein $W$ denotes the set of all weights in the network, $E_D(W)$ is the prediction loss function, i.e. the cross entropy loss function, $R(W)$ is the non-structured regularization applied to every weight, a norm operation serving as weight decay, $R_g\big(W^{(l)}\big)$ is the structured sparsity regularization term on the weight groups of layer $l$ of the $L$ layers, and $G^{(l)}$ is the number of weight groups in layer $l$;
S24, applying Group Lasso to $R_g\big(W^{(l)}\big)$, the Group Lasso regularization term is expressed as:

$R_g\big(W^{(l)}\big) = \sum_{g=1}^{G^{(l)}} \big\|W^{(l)}_g\big\|_2, \quad \big\|W^{(l)}_g\big\|_2 = \sqrt{\sum\nolimits_i \big(w^{(l)}_{g,i}\big)^2}$

In this embodiment, the parameters of the network model are handled in two parts: one part performs data fitting and produces the cross entropy loss function, while the other penalizes the weight groups corresponding to the network filters and compresses the weights of redundant structures to zero, as shown in Fig. 2, where $\lambda_g$ is the adaptively adjustable penalty factor that produces the group regularization loss function.
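As an illustration of steps S21 to S24, the following is a minimal PyTorch-style sketch of the total loss, assuming that each convolution filter (output channel) forms one weight group; the function names and the values of lam and lam_g are illustrative assumptions, not values fixed by the invention.

import torch
import torch.nn as nn
import torch.nn.functional as F

def group_lasso(conv_weight: torch.Tensor) -> torch.Tensor:
    # conv_weight has shape (out_channels, in_channels, kH, kW); each filter
    # (output channel) is one weight group, and the Group Lasso term is the
    # sum of the L2 norms of the groups.
    return conv_weight.flatten(1).norm(p=2, dim=1).sum()

def total_loss(model: nn.Module, logits: torch.Tensor, targets: torch.Tensor,
               lam: float = 1e-4, lam_g: float = 1e-4) -> torch.Tensor:
    # E(W) = E_D(W) + lam * R(W) + lam_g * sum over layers of R_g(W^(l))
    ce = F.cross_entropy(logits, targets)                  # E_D(W)
    l2 = sum(p.pow(2).sum() for p in model.parameters())   # R(W), weight decay
    gl = sum(group_lasso(m.weight) for m in model.modules()
             if isinstance(m, nn.Conv2d))                  # structured term
    return ce + lam * l2 + lam_g * gl

Note that this sketch uses a single fixed lam_g; the adaptive per-group penalty factor described further below replaces it with one factor per weight group.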
In specific implementation, as a preferred embodiment of the present invention, the step S3 specifically includes:
S31, differentiating the loss function with respect to the network output to obtain the gradient of the loss with respect to the output;
S32, calculating the gradient of each layer in turn according to the chain rule until the gradient of the loss function with respect to each parameter (weights and biases) is obtained;
S33, once the gradient of the loss function with respect to each parameter has been calculated, updating the parameters of the network using an optimization algorithm. The optimization algorithm used to update the parameters of the network is:
new parameter = old parameter - learning rate × parameter gradient
where the learning rate is a hyperparameter that controls the step size of each parameter update. In this example, the initial learning rate is 0.1 and is multiplied by 0.1 at 50% and 75% of the total number of training iterations, as sketched below.
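A minimal sketch of steps S31 to S33 together with this learning-rate schedule (initial rate 0.1, multiplied by 0.1 at 50% and 75% of training), assuming PyTorch; the tiny stand-in network and the total epoch count are illustrative assumptions only.

import torch
import torch.nn as nn

# Hypothetical small CNN standing in for VGG16/ResNet18.
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.BatchNorm2d(16),
                      nn.ReLU(), nn.AdaptiveAvgPool2d(1),
                      nn.Flatten(), nn.Linear(16, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # initial rate 0.1
epochs = 160  # assumed; the text does not state the total iteration count
# Multiply the learning rate by 0.1 at 50% and 75% of training;
# scheduler.step() would be called once per epoch.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[epochs // 2, 3 * epochs // 4], gamma=0.1)

# One update step: new parameter = old parameter - lr * gradient (S33).
x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()    # S31-S32: gradients of the loss via the chain rule
optimizer.step()   # S33: apply the update rule
optimizer.zero_grad()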
In this embodiment, the scaling factor of the network's batch normalization (BN) layer is used as the importance criterion to represent the importance of a weight group, and an adaptively adjustable penalty factor is constructed from it: a small penalty factor is applied to filters of high importance, reinforcing their retention, while a large penalty factor is applied to filters of low importance, compressing them to zero (a sketch of one possible construction follows the list below). Using the scaling factor of the BN layer as the importance criterion has notable advantages:
(1) The scaling factor of the BN layer is an inherent parameter of network training, so no additional parameter needs to be constructed as an importance criterion, which saves computational cost;
(2) The scaling factor of the BN layer is tied to the feature maps output by its filter, so it represents the importance of the filter more accurately;
(3) The scaling factor of the BN layer is itself updated during training, which automatically updates the penalty factor constructed from it, so no separate penalty-factor update rule needs to be designed.
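The passage above does not spell out the exact mapping from BN scaling factors to penalty factors, so the following sketch assumes one plausible form: normalize the absolute scaling factors to [0, 1] as an importance score, then scale a base factor by (1 - importance), so that important filters receive a small penalty and unimportant ones a large penalty.

import torch
import torch.nn as nn

def adaptive_penalty_factors(bn: nn.BatchNorm2d,
                             base_lam_g: float = 1e-4) -> torch.Tensor:
    # One penalty factor per filter, built from the BN scaling factors
    # (bn.weight is gamma).  Assumed mapping: filters with large |gamma|
    # (high importance) get a small factor, and vice versa.
    gamma = bn.weight.detach().abs()
    importance = gamma / (gamma.max() + 1e-12)   # normalize to [0, 1]
    return base_lam_g * (1.0 - importance)

def adaptive_group_lasso(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> torch.Tensor:
    # Group Lasso over filters, each group weighted by its own adaptively
    # adjusted penalty factor instead of a single fixed lambda_g.
    lam = adaptive_penalty_factors(bn)
    group_norms = conv.weight.flatten(1).norm(p=2, dim=1)
    return (lam * group_norms).sum()

Because bn.weight is updated by training itself, the factors returned by adaptive_penalty_factors change automatically from iteration to iteration, matching advantage (3) above.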
In specific implementation, as a preferred embodiment of the present invention, the step S5 specifically includes:
S51, performing an L2-norm operation on the overall parameters of each convolution kernel;
S52, setting a threshold of 0.0001 and removing the convolution kernels whose norm is smaller than the threshold.
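A minimal sketch of S51 and S52, under the assumption that pruning is realized by zeroing the filters whose overall L2 norm falls below the 0.0001 threshold; a full implementation would instead rebuild each layer with fewer output channels and adjust the next layer's input channels accordingly.

import torch
import torch.nn as nn

@torch.no_grad()
def prune_filters(conv: nn.Conv2d, threshold: float = 1e-4) -> int:
    # S51: L2 norm of the overall parameters of each convolution kernel.
    norms = conv.weight.flatten(1).norm(p=2, dim=1)
    # S52: remove (here, zero out) kernels whose norm is below the threshold.
    mask = norms < threshold
    conv.weight[mask] = 0.0
    if conv.bias is not None:
        conv.bias[mask] = 0.0
    return int(mask.sum())   # number of pruned filters

conv = nn.Conv2d(3, 16, 3)
print(f"pruned {prune_filters(conv)} of {conv.out_channels} filters")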
In specific implementation, as a preferred embodiment of the present invention, the step S6 specifically includes:
training the pruned network and slightly adjusting the parameters of the network model with a small learning rate (0.0005), as sketched below.
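A minimal fine-tuning sketch using the small learning rate (0.0005) given above; the stand-in pruned model and the random batch are illustrative assumptions.

import torch
import torch.nn as nn

pruned_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # stand-in
optimizer = torch.optim.SGD(pruned_model.parameters(), lr=0.0005)       # small lr
images, labels = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
for _ in range(5):   # a few gentle steps to adapt to the pruned structure
    optimizer.zero_grad()
    loss = nn.functional.cross_entropy(pruned_model(images), labels)
    loss.backward()
    optimizer.step()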
In specific implementation, as a preferred embodiment of the present invention, the method can impose more reasonable penalty factors on different weight groups, wherein:
for important weight groups, a smaller penalty factor is applied, so the penalty strength during updates is weaker;
for unimportant weight groups, a larger penalty factor is applied, so the penalty strength during updates is stronger.
Examples
VGG and ResNet in the following tables are the standard networks; SSL applies a fixed group regularization factor to the same model; DRFGR applies the adaptively adjusted penalty factor to the same model.
Table I. Results of VGG16 and VGG19 on different metrics under the same conditions with different methods
Table II. Results of ResNet18 and ResNet50 on different metrics under the same conditions with different methods
In summary, the group regularization method with adaptively adjusted penalty factors can prune the network more reasonably, effectively compressing and pruning the network model while maintaining its performance and accuracy. The improvements of the invention bring gains in performance, accuracy, efficiency, simplicity, generalization ability, and extensibility; these improvements in technical effect and key performance indicators give the method broad application prospects and commercial value in the field of artificial intelligence model compression. The method of adaptively adjusting the group regularization penalty factor can effectively improve the performance and accuracy of deep convolutional neural networks, reduce the number of parameters, and simplify the workflow, while remaining simple to operate and easy to apply in existing network training and pruning pipelines.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solution of the present invention, not to limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention.

Claims (8)

1. An artificial intelligence model compression method integrating group regularized pruning and importance pruning, characterized by comprising the following steps:
S1, initializing the weights and biases in a deep convolutional neural network, and initializing the group regularization penalty factor;
S2, adding the cross entropy loss function and the group regularization loss function to obtain the total loss function for network training;
S3, performing back propagation according to the total loss function, calculating the gradients of the loss function, and updating the weights and biases of the network to minimize the loss function;
S4, continuously optimizing the weights and biases over multiple training iterations, finally obtaining well-trained network parameters;
S5, after training, pruning the network and compressing the network model;
S6, performing fine-tuning training on the pruned network to help it adapt to the pruned structure.
2. The artificial intelligence model compression method integrating group regularized pruning and importance pruning according to claim 1, wherein the step S1 specifically comprises:
S11, randomly sampling the weights and biases from a normal distribution, initializing them to small values;
S12, initializing the group regularization penalty factor to 0.
3. The artificial intelligence model compression method integrating group regularized pruning and importance pruning according to claim 1, wherein the step S2 specifically comprises:
S21, the cross entropy loss function is expressed as

$E_D(W) = -\frac{1}{N}\sum_{n=1}^{N}\sum_{c=1}^{C} y_{n,c}\,\log\hat{y}_{n,c}$

S22, the group regularization loss function is expressed as

$\lambda \cdot R(W) + \lambda_g \cdot \sum_{l=1}^{L} R_g\big(W^{(l)}\big)$

S23, adding the cross entropy loss function and the group regularization loss function gives the total loss function for network training:

$E(W) = E_D(W) + \lambda \cdot R(W) + \lambda_g \cdot \sum_{l=1}^{L} R_g\big(W^{(l)}\big)$

wherein $W$ denotes the set of all weights in the network, $E_D(W)$ is the prediction loss function, i.e. the cross entropy loss function, $R(W)$ is the non-structured regularization applied to every weight, a norm operation serving as weight decay, $R_g\big(W^{(l)}\big)$ is the structured sparsity regularization term on the weight groups of layer $l$ of the $L$ layers, and $G^{(l)}$ is the number of weight groups in layer $l$;
S24, applying Group Lasso to $R_g\big(W^{(l)}\big)$, the Group Lasso regularization term is expressed as:

$R_g\big(W^{(l)}\big) = \sum_{g=1}^{G^{(l)}} \big\|W^{(l)}_g\big\|_2, \quad \big\|W^{(l)}_g\big\|_2 = \sqrt{\sum\nolimits_i \big(w^{(l)}_{g,i}\big)^2}$
4. The artificial intelligence model compression method integrating group regularized pruning and importance pruning according to claim 1, wherein the step S3 specifically comprises:
S31, differentiating the loss function with respect to the network output to obtain the gradient of the loss with respect to the output;
S32, calculating the gradient of each layer in turn according to the chain rule until the gradient of the loss function with respect to each parameter (weights and biases) is obtained;
S33, once the gradient of the loss function with respect to each parameter has been calculated, updating the parameters of the network using an optimization algorithm.
5. The artificial intelligence model compression method integrating group regularized pruning and importance pruning according to claim 1, wherein in the step S33, the optimization algorithm used to update the parameters of the network is:
new parameter = old parameter - learning rate × parameter gradient
where the learning rate is a hyperparameter that controls the step size of each parameter update.
6. The artificial intelligence model compression method integrating group regularized pruning and importance pruning according to claim 1, wherein the step S5 specifically comprises:
S51, performing an L2-norm operation on the overall parameters of each convolution kernel;
S52, setting a threshold of 0.0001 and removing the convolution kernels whose norm is smaller than the threshold.
7. The artificial intelligence model compression method integrating group regularized pruning and importance pruning according to claim 1, wherein the step S6 specifically comprises:
training the pruned network and slightly adjusting the parameters of the network model with a small learning rate.
8. The artificial intelligence model compression method integrating group regularized pruning and importance pruning according to claim 1, wherein the method is capable of imposing more reasonable penalty factors on different weight groups, wherein:
for important weight groups, a smaller penalty factor is applied, so the penalty strength during updates is weaker;
for unimportant weight groups, a larger penalty factor is applied, so the penalty strength during updates is stronger.
CN202410009132.4A 2024-01-04 2024-01-04 Artificial intelligence model compression method integrating group regularized pruning and importance pruning Pending CN117521763A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410009132.4A CN117521763A (en) 2024-01-04 Artificial intelligence model compression method integrating group regularized pruning and importance pruning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410009132.4A CN117521763A (en) 2024-01-04 Artificial intelligence model compression method integrating group regularized pruning and importance pruning

Publications (1)

Publication Number Publication Date
CN117521763A true CN117521763A (en) 2024-02-06

Family

ID=89751579

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410009132.4A Pending CN117521763A (en) Artificial intelligence model compression method integrating group regularized pruning and importance pruning

Country Status (1)

Country Link
CN (1) CN117521763A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117890978A (en) * 2024-03-18 2024-04-16 大连海事大学 Seismic velocity image generation method based on visual transducer
CN117890978B (en) * 2024-03-18 2024-05-10 大连海事大学 Seismic velocity image generation method based on visual transducer

Similar Documents

Publication Publication Date Title
WO2021004366A1 (en) Neural network accelerator based on structured pruning and low-bit quantization, and method
CN111079781B (en) Lightweight convolutional neural network image recognition method based on low rank and sparse decomposition
CN110175628A A neural network pruning compression algorithm based on automatic search and knowledge distillation
TWI785227B (en) Method for batch normalization layer pruning in deep neural networks
CN109102064A A high-precision neural network quantization compression method
CN117521763A (en) Artificial intelligence model compression method integrating group regularized pruning and importance pruning
CN111105035A (en) Neural network pruning method based on combination of sparse learning and genetic algorithm
CN113610227B (en) Deep convolutional neural network pruning method for image classification
CN113657421A (en) Convolutional neural network compression method and device and image classification method and device
CN113595993A (en) Vehicle-mounted sensing equipment joint learning method for model structure optimization under edge calculation
CN112488304A (en) Heuristic filter pruning method and system in convolutional neural network
CN112381218A (en) Local updating method for distributed deep learning training
CN110188877A A neural network compression method and device
CN111382581B (en) One-time pruning compression method in machine translation
CN114970853A (en) Cross-range quantization convolutional neural network compression method
CN108805844B (en) Lightweight regression network construction method based on prior filtering
CN112132062B (en) Remote sensing image classification method based on pruning compression neural network
CN114004327A (en) Adaptive quantization method of neural network accelerator suitable for running on FPGA
Liu et al. Improvement of pruning method for convolution neural network compression
CN115564043B (en) Image classification model pruning method and device, electronic equipment and storage medium
CN116187401A (en) Compression method and device for neural network, electronic equipment and storage medium
CN109389221A A neural network compression method
CN111210009A (en) Information entropy-based multi-model adaptive deep neural network filter grafting method, device and system and storage medium
CN113592085A (en) Nuclear pruning method, device, equipment and medium based on high-rank convolution graph
Sarkar et al. An incremental pruning strategy for fast training of CNN models

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination