CN117521763A - Artificial intelligent model compression method integrating regularized pruning and importance pruning - Google Patents
- Publication number
- CN117521763A (application CN202410009132.4A)
- Authority
- CN
- China
- Prior art keywords
- network
- pruning
- loss function
- group
- importance
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/0464—Convolutional networks [CNN, ConvNet]
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention provides an artificial intelligence model compression method fusing group regularized pruning and importance pruning, which comprises the following steps: initializing weights and biases in a deep convolutional neural network, and initializing the group regularization penalty factor; adding the cross entropy loss function and the group regularization loss function to obtain the total loss function for network training; performing back propagation according to the total loss function, calculating the gradient of the loss function, and updating the weights and biases of the network to minimize the loss function; continuously optimizing the weights and biases through multiple training iterations to obtain well-trained network parameters; after training, pruning the network and compressing the network model; and performing fine-tuning training on the pruned network. In this method, the penalty factors of group regularized pruning are adaptively adjusted according to the importance of each weight group, so the network can be pruned more reasonably and the weight groups that contribute importantly to network performance are better retained.
Description
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to an artificial intelligence model compression method integrating regularized pruning and importance pruning.
Background
Artificial intelligence has been successfully applied to a variety of practical tasks, and deep neural networks are among the most successful artificial intelligence models, achieving state-of-the-art performance in many application areas. Deep neural networks can use large-scale architectures to extract useful features from large amounts of data, which is a major factor in their success. However, it also leaves deep neural networks with large-scale parameters and complex computation, which hinders their deployment on mobile and embedded devices. To address this problem, various network compression methods have been developed.
Network pruning is an important artificial intelligence model compression method, often used to compress deep neural network models: by removing unimportant parts of the network, it yields a more compact network without affecting the original accuracy. In particular, structured pruning can remove entire redundant structures from the network, which helps achieve both model compression and acceleration.
Existing structured pruning methods mainly include importance-based methods and regularization-based methods. Importance-based methods rank network structures by some importance criterion and prune the structures with lower importance. This kind of pruning requires substantial retraining to recover the accuracy lost to pruning, which adds a significant computational cost. Regularization-based methods penalize the network weights during training, compressing the weights of redundant structures to zero so that they can be removed from the network. Since these weights are small, they contribute little to the network's output; their removal does not cause a large loss of accuracy, and good accuracy can be achieved without much retraining. Therefore, regularization-based methods do not increase the computational cost.
Among structured pruning methods, group-regularization-based pruning can effectively remove redundant structures from a network model while achieving both model compression and acceleration, and its effectiveness for structured pruning has been verified by much prior work. However, existing group regularization methods typically use a fixed penalty factor to compress the weights to zero, which implicitly and unreasonably assumes that all weights are equally important. This assumption contradicts the actual situation: the weights of a neural network tend to differ in importance. Some weights contribute more to network performance, while others have less impact. Consequently, compressing all weights toward zero with the same penalty strength degrades network performance. In addition, the penalty factor is a hyperparameter whose value lacks general theoretical guidance; it usually has to be tuned for each problem, which consumes considerable time and computational cost.
Disclosure of Invention
In view of the above technical problems, the invention provides a model compression method fusing group regularized pruning and importance pruning. In this method, the penalty factors of group regularized pruning are adaptively adjusted according to the importance of each weight group. Using these adaptively adjusted penalty factors, the network can be pruned more reasonably, and the weight groups that contribute importantly to network performance are better retained.
The invention adopts the following technical means:
an artificial intelligence model compression method integrating regularized pruning and importance pruning, comprising:
S1, initializing weights and biases in a deep convolutional neural network, and initializing the group regularization penalty factor;
S2, adding the cross entropy loss function and the group regularization loss function to obtain the total loss function for network training;
S3, performing back propagation according to the total loss function, calculating the gradient of the loss function, and updating the weights and biases of the network to minimize the loss function;
S4, continuously optimizing the weights and biases through multiple training iterations to finally obtain well-trained network parameters;
S5, after training, performing a pruning operation on the network and compressing the network model;
S6, performing fine-tuning training on the pruned network to help the network adapt to the pruned structure.
Further, the step S1 specifically includes:
S11, randomly sampling small initial values for the weights and biases from a normal distribution;
S12, initializing the group regularization factor to 0.
Further, the step S2 specifically includes:
S21, expressing the cross entropy loss function as $E_D(W)$;
S22, representing the group regularization loss function as $R_g(\cdot)$;
S23, adding the cross entropy loss function and the group regularization loss function to obtain the total loss function of the network training:

$$E(W) = E_D(W) + \lambda \cdot R(W) + \lambda_g \cdot \sum_{l=1}^{L} R_g\left(W^{(l)}\right)$$

wherein $W$ represents the set of all weights in the network, $E_D(W)$ is the predicted loss function, i.e. the cross entropy loss function, $R(W)$ is the non-structured regularization applied to every weight with $\|\cdot\|$ representing the norm operation (weight decay), $R_g\left(W^{(l)}\right)$ is the structured sparsity regularization term on the weight groups of the $l$-th layer, and $G^{(l)}$ is the number of weight groups in the $l$-th layer;
S24, applying Group Lasso to $R_g(\cdot)$; the Group Lasso regularization term is expressed as:

$$R_g\left(W^{(l)}\right) = \sum_{g=1}^{G^{(l)}} \left\| W_g^{(l)} \right\|_2 .$$
further, the step S3 specifically includes:
S31, differentiating the loss function with respect to the network output to obtain the gradient of the loss function with respect to the output;
S32, sequentially calculating the gradient of each layer according to the chain rule until the gradient of the loss function with respect to each parameter (weight and bias) is calculated;
S33, once the gradient of the loss function with respect to each parameter has been calculated, updating the parameters of the network with an optimization algorithm.
Further, in the step S33, the optimization algorithm used for updating the parameters of the network is:
new parameter = old parameter - learning rate × parameter gradient
wherein the learning rate is a hyperparameter that controls the step size of the parameter update.
Further, the step S5 specifically includes:
S51, computing the norm of the overall parameters of each convolution kernel;
S52, setting a threshold value of 0.0001 and removing the convolution kernels whose norm is smaller than the threshold.
Further, the step S6 specifically includes:
training the pruned network, and slightly adjusting parameters of the network model by using a small learning rate.
Further, the method can impose more reasonable penalty factors on different weight groups, wherein:
for important weight groups, a smaller penalty factor is applied, so the penalty strength during the update is smaller;
for unimportant weight groups, a larger penalty factor is applied, so the penalty strength during the update is larger.
Compared with the prior art, the invention has the following advantages:
1. The artificial intelligence model compression method fusing group regularized pruning and importance pruning can remarkably improve the performance and precision of a network. Taking VGG16 as an example, the final precision improves by about 0.34% under the same number of training iterations; taking ResNet18 as an example, it improves by about 0.28%.
2. The method can reduce the computational complexity of a network and improve its computational efficiency. Compared with the normal networks, the number of parameters can be reduced by 87% for VGG16 and by 81% for ResNet18.
3. The method can automatically identify and adjust the magnitude of the penalty factors without manual tuning. This reduces the workload of manual parameter adjustment and ensures that the network is always subject to appropriate regularization constraints during training.
4. The method can be conveniently applied to existing deep convolutional neural network training and pruning pipelines without large-scale modification or redesign, saving time and resources and making it easier to implement and popularize.
In summary, by using the method for adaptively adjusting the group regularization penalty factor provided by the invention, the performance and the precision of the deep convolutional neural network can be effectively improved, the number of parameters can be reduced, and the operation flow can be simplified. Meanwhile, the method is simple and convenient to operate, and can be conveniently applied to the existing network training and pruning flow. These improvements will bring broader application prospects and commercial value to the fields of neural network training and artificial intelligence model compression.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to the drawings without inventive effort to a person skilled in the art.
Fig. 1 is a general route diagram of network pruning provided by the present invention.
FIG. 2 is a schematic diagram of group regularization provided by the present invention.
FIG. 3 is a comparison of the number of parameters and the running acceleration ratio of different networks under the same conditions provided by the embodiment of the present invention.
Detailed Description
In order that those skilled in the art will better understand the present invention, a technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in which it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
As shown in FIG. 1, the invention provides an artificial intelligence model compression method fusing group regularized pruning and importance pruning, which comprises the following steps:
S1, initializing weights and biases in a deep convolutional neural network, and initializing the group regularization penalty factor;
S2, adding the cross entropy loss function and the group regularization loss function to obtain the total loss function for network training; in this embodiment, in each training iteration, the group regularization loss function is added on top of the loss function computed from the network parameters, which ensures that every weight group of the network is subject to appropriate regularization constraints.
S3, performing back propagation according to the total loss function, calculating the gradient of the loss function, and updating the weights and biases of the network to minimize the loss function; in this embodiment, by dynamically adjusting the group regularization penalty factor, the weight groups that contribute importantly to network performance can be retained more accurately, and the weight groups that have little impact on network performance can be pruned more accurately.
S4, continuously optimizing the weights and biases through multiple training iterations to finally obtain well-trained network parameters; this further reduces the number of network parameters and improves the computational efficiency of the network.
S5, after training, performing a pruning operation on the network and compressing the network model;
S6, performing fine-tuning training on the pruned network to help the network adapt to the pruned structure. Fine-tuning training helps the network adapt to the pruned structure and improves its generalization capability, further improving performance and precision.
In specific implementation, as a preferred embodiment of the present invention, the step S1 specifically includes:
S11, randomly sampling small initial values for the weights and biases from a normal distribution;
S12, initializing the group regularization factor to 0.
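The initialization of S11 and S12 can be sketched as follows (a minimal NumPy sketch; the layer shape, the standard deviation, the zero bias, and the function names are illustrative assumptions, not taken from the patent):

```python
import numpy as np

def init_conv_layer(out_channels, in_channels, kernel_size, std=0.01, seed=0):
    """S11: draw small initial weights and biases from a normal distribution.

    The std value and zero-initialized bias are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    weight = rng.normal(0.0, std,
                        size=(out_channels, in_channels, kernel_size, kernel_size))
    bias = np.zeros(out_channels)
    return weight, bias

# S12: the group regularization penalty factor is initialized to 0, so no
# structured penalty acts on the weight groups at the start of training.
penalty_factor = 0.0

weight, bias = init_conv_layer(16, 3, 3)
```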
In specific implementation, as a preferred embodiment of the present invention, the step S2 specifically includes:
S21, expressing the cross entropy loss function as $E_D(W)$;
S22, representing the group regularization loss function as $R_g(\cdot)$;
S23, adding the cross entropy loss function and the group regularization loss function to obtain the total loss function of the network training:

$$E(W) = E_D(W) + \lambda \cdot R(W) + \lambda_g \cdot \sum_{l=1}^{L} R_g\left(W^{(l)}\right)$$

wherein $W$ represents the set of all weights in the network, $E_D(W)$ is the predicted loss function, i.e. the cross entropy loss function, $R(W)$ is the non-structured regularization applied to every weight with $\|\cdot\|$ representing the norm operation (weight decay), $R_g\left(W^{(l)}\right)$ is the structured sparsity regularization term on the weight groups of the $l$-th layer, and $G^{(l)}$ is the number of weight groups in the $l$-th layer;
S24, applying Group Lasso to $R_g(\cdot)$; the Group Lasso regularization term is expressed as:

$$R_g\left(W^{(l)}\right) = \sum_{g=1}^{G^{(l)}} \left\| W_g^{(l)} \right\|_2 .$$
In this embodiment, the parameters of the network model are divided into two modules for processing: one module performs data fitting to generate the cross entropy loss function, while the other penalizes the weight groups corresponding to the network filters and compresses the weights of redundant structures to zero, as shown in FIG. 2, where $\lambda_g$ is an adaptively adjustable penalty factor that produces the group regularization loss function.
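The two-module loss described above can be sketched as follows, treating each convolution filter as one weight group (NumPy; the function names and the single-layer simplification are assumptions):

```python
import numpy as np

def group_lasso(weights):
    """Group Lasso term: sum of the L2 norms of the weight groups (here, filters)."""
    # weights has shape (num_filters, in_channels, k, k); each filter is one group.
    return float(sum(np.sqrt(np.sum(f ** 2)) for f in weights))

def total_loss(cross_entropy, weights, weight_decay, penalty_factor):
    """Total loss = data-fitting term + weight decay + group regularization term."""
    l2 = float(np.sum(weights ** 2))  # non-structured regularization on every weight
    return cross_entropy + weight_decay * l2 + penalty_factor * group_lasso(weights)
```

For example, with two unit-norm filters, weight decay 0.5, and penalty factor 0.1, the structured term contributes 0.2 to the total loss.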
In specific implementation, as a preferred embodiment of the present invention, the step S3 specifically includes:
S31, differentiating the loss function with respect to the network output to obtain the gradient of the loss function with respect to the output;
S32, sequentially calculating the gradient of each layer according to the chain rule until the gradient of the loss function with respect to each parameter (weight and bias) is calculated;
S33, once the gradient of the loss function with respect to each parameter has been calculated, updating the parameters of the network with an optimization algorithm. The optimization algorithm used for updating the parameters of the network is:
new parameter = old parameter - learning rate × parameter gradient
wherein the learning rate is a hyperparameter that controls the step size of the parameter update. In this example, its initial value is 0.1, and the learning rate is multiplied by 0.1 at 50% and again at 75% of the total number of network iterations.
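The update rule and the learning-rate schedule of this example can be sketched as follows (plain Python; the function names are assumptions):

```python
def learning_rate(iteration, total_iterations, base_lr=0.1):
    """Schedule from the embodiment: start at 0.1, multiply the learning rate
    by 0.1 at 50% and again at 75% of the total number of iterations."""
    lr = base_lr
    if iteration >= 0.5 * total_iterations:
        lr *= 0.1
    if iteration >= 0.75 * total_iterations:
        lr *= 0.1
    return lr

def sgd_step(param, grad, lr):
    """S33: new parameter = old parameter - learning rate * parameter gradient."""
    return param - lr * grad
```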
In this embodiment, the scale factor of the network's batch normalization (BN) layer is used as the importance criterion to represent the importance of each weight group, and an adaptively adjustable penalty factor is constructed from it: a small penalty factor is applied to filters with high importance so that they are preferentially retained, while a large penalty factor is applied to filters with low importance so that they are compressed to zero. The scaling factor of the BN layer has significant advantages as an importance criterion:
(1) The scaling factor of the BN layer is an inherent parameter of the network training process, so no additional parameter needs to be constructed as the importance criterion, which saves computational cost;
(2) The scaling factor of the BN layer is tied to the feature maps output by its filter, so it can represent the importance of the filter more accurately;
(3) The scaling factor of the BN layer is itself updated during training, which automatically updates the penalty factor constructed from it, so no additional update rule for the penalty factor needs to be designed.
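One way to build such an adaptive penalty factor from the BN scale factors can be sketched as follows. The exact mapping is not given in the text, so the normalized inverse mapping, the base penalty value, and the function name below are assumptions; only the monotonic behavior (important filters receive small penalties, unimportant filters receive large ones) reflects the described method:

```python
import numpy as np

def adaptive_penalty(bn_gammas, base_penalty=1e-4):
    """Map each filter's BN scale factor to a per-group penalty factor.

    Filters with larger |gamma| are treated as more important and receive a
    smaller penalty; filters with |gamma| near zero receive the full penalty.
    """
    importance = np.abs(np.asarray(bn_gammas, dtype=float))
    importance = importance / (importance.max() + 1e-12)  # normalize to [0, 1]
    return base_penalty * (1.0 - importance)
```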
In specific implementation, as a preferred embodiment of the present invention, the step S5 specifically includes:
S51, computing the norm of the overall parameters of each convolution kernel;
S52, setting a threshold value of 0.0001 and removing the convolution kernels whose norm is smaller than the threshold.
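The pruning step S51/S52 can be sketched as follows (NumPy; the choice of the L2 norm per kernel is an assumption consistent with the Group Lasso term, and the function name is illustrative):

```python
import numpy as np

def prune_filters(weights, threshold=1e-4):
    """S51: compute a norm over each convolution kernel's overall parameters;
    S52: remove the kernels whose norm falls below the threshold."""
    norms = np.sqrt((weights ** 2).sum(axis=(1, 2, 3)))  # one L2 norm per filter
    keep = norms >= threshold
    return weights[keep], keep
```

In a full pipeline, the BN parameters of each pruned kernel and the matching input channels of the next layer would be removed together with it.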
In specific implementation, as a preferred embodiment of the present invention, the step S6 specifically includes:
training the pruned network, and slightly adjusting the parameters of the network model by using a small learning rate (0.0005).
In specific implementation, as a preferred embodiment of the present invention, the method can apply more reasonable penalty factors to different weight groups, wherein:
for important weight groups, a smaller penalty factor is applied, so the penalty strength during the update is smaller;
for unimportant weight groups, a larger penalty factor is applied, so the penalty strength during the update is larger.
Examples
VGG and ResNet in the following tables are the standard networks; SSL denotes applying a fixed group regularization factor to the same model; DRFGR denotes applying the adaptively adjusted penalty factor to the same model.
Table I results of different indexes of Vgg16 and Vgg19 under the same condition and different methods
Table II results of different indices of ResNet18 and ResNet50 under the same conditions and different methods
In summary, the group regularization method of the self-adaptive adjustment penalty factors can more reasonably prune the network, can effectively compress and prune the network model, and simultaneously maintains the performance and accuracy of the model. The improvement part of the invention brings improvement of performance, precision, efficiency, simplicity, generalization capability and expandability. The improvement of the technical effects and main performance indexes can lead the method of the invention to have wide application prospect and commercial value in the field of artificial intelligent model compression. The method for adaptively adjusting the regularization penalty factors of the group can effectively improve the performance and the precision of the deep convolutional neural network, reduce the number of parameters and simplify the operation flow. Meanwhile, the method is simple and convenient to operate, and can be conveniently applied to the existing network training and pruning flow. These improvements will bring broader application prospects and commercial value to the fields of neural network training and artificial intelligence model compression.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention.
Claims (8)
1. An artificial intelligence model compression method integrating regularized pruning and importance pruning, which is characterized by comprising the following steps:
S1, initializing weights and biases in a deep convolutional neural network, and initializing the group regularization penalty factor;
S2, adding the cross entropy loss function and the group regularization loss function to obtain the total loss function for network training;
S3, performing back propagation according to the total loss function, calculating the gradient of the loss function, and updating the weights and biases of the network to minimize the loss function;
S4, continuously optimizing the weights and biases through multiple training iterations to finally obtain well-trained network parameters;
S5, after training, performing a pruning operation on the network and compressing the network model;
S6, performing fine-tuning training on the pruned network to help the network adapt to the pruned structure.
2. The artificial intelligence model compression method of fusion group regularized pruning and importance pruning according to claim 1, wherein the step S1 specifically comprises:
S11, randomly sampling small initial values for the weights and biases from a normal distribution;
S12, initializing the group regularization factor to 0.
3. The artificial intelligence model compression method of fusion group regularized pruning and importance pruning according to claim 1, wherein the step S2 specifically comprises:
S21, expressing the cross entropy loss function as $E_D(W)$;
S22, representing the group regularization loss function as $R_g(\cdot)$;
S23, adding the cross entropy loss function and the group regularization loss function to obtain the total loss function of the network training:

$$E(W) = E_D(W) + \lambda \cdot R(W) + \lambda_g \cdot \sum_{l=1}^{L} R_g\left(W^{(l)}\right)$$

wherein $W$ represents the set of all weights in the network, $E_D(W)$ is the predicted loss function, i.e. the cross entropy loss function, $R(W)$ is the non-structured regularization applied to every weight with $\|\cdot\|$ representing the norm operation (weight decay), $R_g\left(W^{(l)}\right)$ is the structured sparsity regularization term on the weight groups of the $l$-th layer, and $G^{(l)}$ is the number of weight groups in the $l$-th layer;
S24, applying Group Lasso to $R_g(\cdot)$; the Group Lasso regularization term is expressed as:

$$R_g\left(W^{(l)}\right) = \sum_{g=1}^{G^{(l)}} \left\| W_g^{(l)} \right\|_2 .$$
4. the artificial intelligence model compression method of fusion group regularized pruning and importance pruning according to claim 1, wherein the step S3 specifically comprises:
S31, differentiating the loss function with respect to the network output to obtain the gradient of the loss function with respect to the output;
S32, sequentially calculating the gradient of each layer according to the chain rule until the gradient of the loss function with respect to each parameter (weight and bias) is calculated;
S33, once the gradient of the loss function with respect to each parameter has been calculated, updating the parameters of the network with an optimization algorithm.
5. The artificial intelligence model compression method of fusion group regularized pruning and importance pruning according to claim 1, wherein in step S33, the optimization algorithm used for updating the parameters of the network is:
new parameter = old parameter - learning rate × parameter gradient
wherein the learning rate is a hyperparameter that controls the step size of the parameter update.
6. The artificial intelligence model compression method of fusion group regularized pruning and importance pruning according to claim 1, wherein the step S5 specifically comprises:
S51, computing the norm of the overall parameters of each convolution kernel;
S52, setting a threshold value of 0.0001 and removing the convolution kernels whose norm is smaller than the threshold.
7. The artificial intelligence model compression method of fusion group regularized pruning and importance pruning according to claim 1, wherein the step S6 specifically comprises:
training the pruned network, and slightly adjusting parameters of the network model by using a small learning rate.
8. The artificial intelligence model compression method of fusion group regularized pruning and importance pruning of claim 1, wherein the method is capable of applying more rational penalty factors to different weight groups, wherein:
for important weight groups, a smaller penalty factor is applied, so the penalty exerted during updating is weaker;
for unimportant weight groups, a larger penalty factor is applied, so the penalty exerted during updating is stronger.
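The per-group penalty scheme of claim 8 can be sketched as follows. The importance measure (group L2 norm) and the inverse scaling rule are illustrative assumptions, not the patent's exact formulas:

```python
import numpy as np

def per_group_penalties(groups, base_lambda=0.01):
    """Assign each group a penalty factor that decreases as its
    importance (here: the group's L2 norm) increases, so important
    groups are shrunk less during the regularized update."""
    importances = [np.linalg.norm(g) for g in groups]
    return [base_lambda / (imp + 1.0) for imp in importances]

groups = [np.array([3.0, 4.0]),   # important group (norm 5): small penalty
          np.array([0.1, 0.0])]   # unimportant group: larger penalty
lams = per_group_penalties(groups)
```

Applying these factors to the Group Lasso term gives unimportant groups a stronger pull toward zero, so pruning removes them first while the important groups are largely preserved.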
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410009132.4A CN117521763A (en) | 2024-01-04 | 2024-01-04 | Artificial intelligent model compression method integrating regularized pruning and importance pruning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117521763A true CN117521763A (en) | 2024-02-06 |
Family
ID=89751579
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117890978A (*) | 2024-03-18 | 2024-04-16 | Dalian Maritime University | Seismic velocity image generation method based on vision Transformer
CN117890978B (*) | 2024-03-18 | 2024-05-10 | Dalian Maritime University | Seismic velocity image generation method based on vision Transformer
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||