CN111222629B - Neural network model pruning method and system based on adaptive batch normalization


Info

Publication number: CN111222629B
Application number: CN201911423105.7A
Authority: CN (China)
Prior art keywords: pruning, model, neural network, network model, layer
Legal status: Active
Other languages: Chinese (zh)
Other versions: CN111222629A
Inventors: 李百林, 苏江
Current assignee: DMAI Guangzhou Co Ltd
Original assignee: DMAI Guangzhou Co Ltd
Application filed 2019-12-31 by DMAI Guangzhou Co Ltd; priority to CN201911423105.7A; granted as CN111222629B.

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00: Pattern recognition
    • G06F 18/20: Analysing
    • G06F 18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G06N 3/082: Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Feedback Control In General (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a neural network model pruning method and system based on adaptive batch normalization. Randomly sampled floating-point numbers are taken as the pruning rate of each layer, and a pruning-rate vector (r_1, r_2, …, r_L) is generated under a preset computing-resource constraint as a pruning strategy; the model is pruned according to each pruning strategy to form a candidate set of pruned models. For each pruned model in the candidate set, an adaptive batch normalization method is used to update the statistical parameters of the batch normalization layer. The classification accuracy of the statistics-updated models is then evaluated, and the model with the highest classification accuracy is fine-tuned on the training set until convergence to obtain the final pruned model. By recalibrating the batch normalization layer, the invention evaluates candidate sub-networks quickly and accurately, and fine-tunes only the pruning strategy that wins this fast evaluation to obtain the parameters of the final pruned network; the huge time cost of fine-tuning all pruned networks is avoided while accuracy remains advantageous.

Description

Neural network model pruning method and system based on adaptive batch normalization
Technical Field
The invention relates to the technical field of neural network model pruning, and in particular to a neural network model pruning method and system based on adaptive batch normalization.
Background
Neural network pruning aims to reduce the computational redundancy of a neural network without losing much accuracy. A pruned model generally has lower energy consumption and hardware load, and is therefore of great significance for deployment on embedded devices. However, finding the least important parts of the network, so as to minimize the loss of accuracy after pruning, is a critical issue. The pruning problem of a neural network can be viewed as a search problem whose search space is the set of all pruned sub-networks; finding the sub-network with the highest accuracy in this space is the core of the pruning problem. A sub-network evaluation step is common to existing pruning methods: it reveals the potential accuracy of each sub-network, and the sub-network with the highest potential accuracy is then fine-tuned to obtain the optimal neural network model. In prior-art pruning methods, as shown in Fig. 1, all pruned networks typically need to be fine-tuned in order to judge the final convergence accuracy achievable by the different pruning strategies; but fine-tuning is in essence training for several epochs, which is very time-consuming.
Disclosure of Invention
Therefore, the present invention provides a neural network model pruning method and system based on adaptive batch normalization, which overcome the heavy time cost of prior-art neural network model pruning methods.
In a first aspect, an embodiment of the present invention provides a neural network model pruning method based on adaptive batch normalization, including the following steps: for an L-layer neural network model, randomly sampling L floating-point numbers from [0, R] (0 < R < 1) as the pruning rate of each layer, and generating a pruning-rate vector (r_1, r_2, …, r_L) satisfying a preset computing-resource constraint as a pruning strategy; pruning the neural network model according to each pruning strategy to generate a candidate set of pruned models; updating the statistical parameters of the batch normalization layer of each pruned model in the candidate set using an adaptive batch normalization method; and evaluating the classification accuracy of the statistics-updated models, and fine-tuning the model with the highest classification accuracy on a training set until convergence to serve as the final pruned model.
In an embodiment, the preset computing-resource constraint includes at least one of a preset operation-count limit, a preset parameter-count limit, and a preset computation-latency limit.
In an embodiment, the process of pruning the neural network model based on the pruning strategies to generate the candidate set of pruned models includes: for each pruning strategy, sorting the convolution kernels of each layer by norm in descending order, and removing the last M convolution kernels in the ordering, where M = ceil(r_l * c_l), ceil denotes rounding up, c_l is the number of convolution kernels in layer l, and r_l is the pruning rate of layer l; and forming the candidate set of pruned models from the pruned models with the last M convolution kernels removed.
In an embodiment, the process of updating the statistical parameters of the batch normalization layer of each pruned model in the candidate set using an adaptive batch normalization method includes: fixing all learnable parameters of each pruned model in the candidate set, iterating over a preset number of training samples, and updating the running mean and running variance statistics of the batch normalization layer.
In one embodiment, the running mean and running variance of the parameters are updated by the following formulas:

μ_t = m μ_{t-1} + (1 - m) μ_β,
σ_t² = m σ_{t-1}² + (1 - m) σ_β²,

where μ_t denotes the running mean, σ_t² denotes the running variance, m denotes the weight of the historical value in the running average, t denotes the iteration index, and β denotes the current sample batch (with batch mean μ_β and batch variance σ_β²).
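By way of illustration only (not part of the original patent text), a single update step of these formulas with made-up numbers, assuming a momentum weight m = 0.9:

```python
import numpy as np

m = 0.9                                  # weight of the historical value
mu_prev, var_prev = 0.50, 1.20           # running stats from iteration t-1 (made-up)
batch = np.array([0.2, 0.8, 1.4, 0.6])   # activations of the current batch beta

mu_beta, var_beta = batch.mean(), batch.var()   # 0.75 and 0.1875
mu_t = m * mu_prev + (1 - m) * mu_beta          # 0.9*0.50 + 0.1*0.75   = 0.525
var_t = m * var_prev + (1 - m) * var_beta       # 0.9*1.20 + 0.1*0.1875 = 1.09875
```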
In one embodiment, the process of evaluating the classification accuracy of the neural network models with updated statistical parameters and taking the model with the highest classification accuracy as the final pruned model includes: obtaining the classification accuracy of each statistics-updated neural network model, computed on a validation set, as the potential accuracy of the model; and taking the neural network models whose potential accuracy ranks in the top N as candidate models, fine-tuning the candidate models on a training set until convergence, and taking the model achieving the highest classification accuracy on the validation set as the final pruned model.
In a second aspect, an embodiment of the present invention provides a neural network model pruning system based on adaptive batch normalization, including: a pruning strategy generation module, configured to randomly sample, for an L-layer neural network model, L floating-point numbers from [0, R] (0 < R < 1) as the pruning rate of each layer, and to generate a pruning-rate vector (r_1, r_2, …, r_L) satisfying a preset computing-resource constraint as a pruning strategy; a pruned-model candidate set generation module, configured to prune the neural network model according to each pruning strategy and generate a candidate set of pruned models; a batch normalization layer statistics updating module, configured to update the statistical parameters of the batch normalization layer of each pruned model in the candidate set using an adaptive batch normalization method; and a pruned-model output module, configured to evaluate the classification accuracy of the statistics-updated models and to take the model with the highest classification accuracy, after fine-tuning on a training set until convergence, as the final pruned model.
In a third aspect, an embodiment of the present invention provides a computer apparatus, including: the system comprises at least one processor and a memory communicatively connected with the at least one processor, wherein the memory stores instructions executable by the at least one processor to cause the at least one processor to perform the neural network model pruning method based on adaptive batch normalization according to the first aspect of the present invention.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium storing computer instructions that cause at least one processor to execute the neural network model pruning method based on adaptive batch normalization according to the first aspect of the present invention.
The technical scheme of the invention has the following advantages:
1. The embodiment of the invention provides a neural network model pruning method and system based on adaptive batch normalization. For a neural network model, randomly sampled floating-point numbers are taken as the pruning rate of each layer, and a pruning-rate vector (r_1, r_2, …, r_L) satisfying a preset computing-resource constraint is generated as a pruning strategy; the neural network model is pruned according to each pruning strategy, yielding a candidate set of pruned models; the statistical parameters of the batch normalization layer of each pruned model in the candidate set are updated with an adaptive batch normalization method; the classification accuracy of the statistics-updated models is evaluated, and the model with the highest classification accuracy is fine-tuned on the training set until convergence to obtain the final pruned model. By recalibrating the batch normalization layer, the invention evaluates the candidate sub-networks quickly and accurately, and fine-tunes only the pruning strategy that wins this fast evaluation to obtain the parameters of the final pruned network, avoiding the huge time cost of fine-tuning all pruned networks while remaining advantageous in accuracy.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are needed in the description of the embodiments or the prior art will be briefly described, and it is obvious that the drawings in the description below are some embodiments of the present invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a specific example of a pruning algorithm in the prior art according to an embodiment of the present invention;
FIG. 2 is a flowchart of an example of a neural network model pruning method based on adaptive batch normalization according to an embodiment of the present invention;
FIG. 3 is a flowchart of a specific example of a neural network model pruning method based on adaptive batch normalization according to an embodiment of the present invention;
FIG. 4 is a block diagram of a specific example of a neural network model pruning system based on adaptive batch normalization according to an embodiment of the present invention;
fig. 5 is a composition diagram of a specific example of a computer device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described below clearly and completely with reference to the accompanying drawings; obviously, the described embodiments are some, but not all, embodiments of the invention. All other embodiments obtained by a person skilled in the art based on the embodiments of the present invention without inventive effort fall within the protection scope of the present invention.
In addition, the technical features of the different embodiments of the present invention described below may be combined with each other as long as they do not collide with each other.
Example 1
The neural network model pruning method based on adaptive batch normalization provided by the embodiment of the invention, as shown in Fig. 2, comprises the following steps:
step S1: for an L-layer neural network model, L [0, R are randomly sampled](0<R<1) The floating point number in the tree is used as the pruning rate of each layer, and a pruning rate vector (r) is generated under the condition that the limit of preset computing resources is met 1 ,r 2 ,…,r L ) As a pruning strategy.
In this embodiment, all convolution layers of MobileNetV1 with kernel size 3×3 are taken as prunable layers (L = 14); 14 floating-point numbers are randomly sampled from [0, R] (0 < R < 1) as the per-layer pruning rates, forming a vector (r_1, r_2, …, r_L) as a pruning strategy. Each element r_l of the vector represents the proportion of convolution kernels to be removed from layer l, i.e., the pruning rate of that layer. Every retained pruning strategy must satisfy the preset computing-resource constraint, which includes at least one of a preset operation-count limit, a preset parameter-count limit, and a preset computation-latency limit, e.g., that the number of operations must not exceed 283M.
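As an illustrative sketch of this sampling step in Python (not the patent's reference implementation): the flops_of estimator, the per-layer base FLOPs, and the budget value below are hypothetical stand-ins for whatever resource model an implementation uses.

```python
import random

def sample_strategy(num_layers, R, flops_of, budget, max_tries=10000):
    """Sample per-layer pruning rates from [0, R] until the pruned
    network satisfies the preset computing-resource constraint."""
    for _ in range(max_tries):
        rates = [random.uniform(0.0, R) for _ in range(num_layers)]
        if flops_of(rates) <= budget:
            return rates                     # valid strategy (r_1, ..., r_L)
    raise RuntimeError("no strategy met the budget")

# Toy FLOPs estimator (an assumption for illustration): each prunable
# layer contributes its base FLOPs scaled by the fraction of kernels kept.
base_flops = [30e6] * 14   # 14 prunable 3x3 conv layers, as in the MobileNetV1 example
toy_flops = lambda rates: sum(f * (1.0 - r) for f, r in zip(base_flops, rates))

strategy = sample_strategy(num_layers=14, R=0.5, flops_of=toy_flops, budget=283e6)
```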
Step S2: prune the neural network model according to each pruning strategy to generate a candidate set of pruned models.
In this embodiment, the pruning strategies generated in step S1 are applied one by one to a MobileNetV1 model trained to convergence. Specifically, for each pruning strategy (r_1, r_2, …, r_L), the convolution kernels of each layer l are first sorted in descending order of their L1 norms, and the last M kernels in the ordering are removed from the original network, where M = ceil(r_l * c_l), ceil denotes rounding up, c_l is the number of convolution kernels in layer l, and r_l is the pruning rate of layer l; this reduces the parameter count of the neural network and the required computation. Applying the different pruning strategies in the candidate set to the same trained model yields a candidate set of pruned models (1000 models in this embodiment, which is merely an example and not a limitation). Note that sorting the kernels of each layer by the L1 norm is only illustrative; in other embodiments, the ordering may be obtained from other norm types.
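The kernel-selection rule of step S2 might look as follows in PyTorch (a minimal sketch under the assumption of a single convolution weight tensor; wiring the removal through adjacent layers and batch normalization is omitted):

```python
import math
import torch

def kernels_to_keep(weight: torch.Tensor, r_l: float) -> torch.Tensor:
    """weight: conv kernel tensor of shape (c_l, in_channels, kh, kw).
    Sorts kernels by L1 norm in descending order and drops the last
    M = ceil(r_l * c_l) of the ordering; returns indices to keep."""
    c_l = weight.shape[0]
    M = math.ceil(r_l * c_l)
    l1 = weight.abs().sum(dim=(1, 2, 3))        # L1 norm of each kernel
    order = torch.argsort(l1, descending=True)  # large -> small
    keep = order[: c_l - M]                     # remove the last M kernels
    return torch.sort(keep).values              # restore original kernel order

# Example: prune 30% of a layer with 64 kernels -> ceil(0.3 * 64) = 20 removed
w = torch.randn(64, 32, 3, 3)
pruned_w = w[kernels_to_keep(w, r_l=0.3)]       # shape (44, 32, 3, 3)
```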
Step S3: update the statistical parameters of the batch normalization layer of each pruned model in the candidate set using the adaptive batch normalization method.
In this embodiment, the adaptive batch normalization method is applied to the models one by one to recalculate the statistical parameters of the batch normalization layer (internationally, the Batch Normalization layer), namely the running mean μ_t and the running variance σ_t². Specifically, for each pruned model in the candidate set, all learnable parameters in the neural network are first fixed; the model is then iterated over a small number of training samples, and the batch normalization statistics μ_t and σ_t² are updated as follows:

μ_t = m μ_{t-1} + (1 - m) μ_β,
σ_t² = m σ_{t-1}² + (1 - m) σ_β²,

where m represents the weight of the historical value in the running average, t represents the iteration index, and β represents the current sample batch. The number of training samples in this embodiment is 5000, which is merely exemplary and not a limitation.
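A minimal PyTorch sketch of this recalibration (an assumed implementation, not code from the patent): in train mode, PyTorch's BatchNorm layers update running_mean and running_var with exactly the momentum rule above (PyTorch's momentum argument corresponds to 1 - m in the patent's notation), so it suffices to freeze the learnable parameters and forward a small number of training samples without any backward pass:

```python
import torch
from torch import nn

@torch.no_grad()   # no gradients: all learnable parameters stay fixed
def adaptive_bn_recalibrate(model: nn.Module, train_loader, num_samples: int = 5000):
    """Re-estimate the BN running mean/variance of a pruned model by
    forwarding a small number of training samples in train mode."""
    model.train()                                # BN updates its running stats
    for mod in model.modules():
        if isinstance(mod, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            mod.reset_running_stats()            # start the moving averages afresh
    seen = 0
    for images, _ in train_loader:
        model(images)
        seen += images.shape[0]
        if seen >= num_samples:
            break
    model.eval()
    return model
```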
Step S4: evaluate the classification accuracy of the neural network models with updated statistical parameters, and take the model with the highest classification accuracy as the final pruned model.
In this embodiment, the classification accuracy of each statistics-updated model, computed on the validation set, is taken as the model's potential accuracy; the models whose potential accuracy ranks in the top N are taken as candidate models, each candidate is fine-tuned on the training set until convergence, and the model achieving the highest classification accuracy on the validation set is selected as the final pruned model.
In a specific embodiment, as shown in Fig. 3, the classification accuracy of each candidate model is computed on a sub-validation set of 10,000 samples randomly drawn from the training set. In this implementation, the highest potential accuracy is 14.33%, and the pruning strategy corresponding to that candidate model is (0.40, 0.26, 0.29, 0.33, 0.39, 0.14, 0.28, 0.38, 0.39, 0.23, 0.36, 0.09, 0.02, 0.28). Finally, the candidate sub-network obtained by pruning with this strategy is fine-tuned until convergence, and the classification accuracy of the final pruned model reaches 70.9%.
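Putting steps S3 and S4 together, the evaluate-then-fine-tune selection could be orchestrated as below; evaluate_on and fine_tune are hypothetical helpers (accuracy evaluation on a dataset, and training to convergence) that an implementation would supply:

```python
def select_final_model(candidates, sub_val_set, train_set, val_set, top_n=1):
    """candidates: pruned models whose BN statistics have already been
    recalibrated with the adaptive batch normalization step.
    evaluate_on / fine_tune are hypothetical helper functions."""
    # Rank candidates by potential accuracy on the sub-validation set
    ranked = sorted(candidates,
                    key=lambda mdl: evaluate_on(mdl, sub_val_set),
                    reverse=True)
    # Fine-tune only the top-N winners to convergence, then pick the best
    finalists = [fine_tune(mdl, train_set) for mdl in ranked[:top_n]]
    return max(finalists, key=lambda mdl: evaluate_on(mdl, val_set))
```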
In order to verify the effectiveness of the invention, the embodiment selects the public ImageNet dataset for evaluation. Effectiveness is verified by the superiority of post-pruning model accuracy: picture classification accuracy at a given number of floating-point operations, the evaluation metric commonly adopted for this dataset, is used for comparison. Table 1 shows the pruning effect of the present invention and other methods on ResNet-50, and Table 2 shows the pruning effect of the present invention and other methods on MobileNetV1.
TABLE 1
(The contents of Table 1, comparing pruning results on ResNet-50, are rendered as images in the original document and are not recoverable here.)
TABLE 2
Method | Floating-point operations (M) | Test accuracy (%)
0.5×MobileNetV1 [4] | 325 | 68.4
AMC | 285 | 70.5
NetAdapt [2] | 284 | 69.1
Meta-Pruning [3] | 281 | 70.6
Present invention | 284 | 70.9
Compared with other pruning methods, the pruned model obtained by the method of the embodiment of the invention has a clear accuracy advantage at the same pruning rate.
According to the neural network model pruning method based on adaptive batch normalization provided by the embodiment of the invention, candidate sub-networks are evaluated quickly and accurately by recalibrating the batch normalization layer, and only the pruning strategy that wins this fast evaluation is fine-tuned to obtain the parameters of the final pruned network. This avoids the huge time cost of fine-tuning all pruned networks while remaining advantageous in accuracy.
Example 2
The embodiment of the invention provides a neural network model pruning system based on adaptive batch normalization, as shown in Fig. 4, comprising:
A pruning strategy generation module 1, configured to randomly sample, for an L-layer neural network model, L floating-point numbers from [0, R] (0 < R < 1) as the pruning rate of each layer, and to generate a pruning-rate vector (r_1, r_2, …, r_L) satisfying the preset computing-resource constraint as a pruning strategy. This module performs the method described in step S1 of Embodiment 1 and is not described again here.
A pruned-model candidate set generation module 2, configured to prune the neural network model based on each pruning strategy and generate a candidate set of pruned models. This module performs the method described in step S2 of Embodiment 1 and is not described again here.
A batch normalization layer statistics updating module, configured to update the statistical parameters of the batch normalization layer of each pruned model in the candidate set using an adaptive batch normalization method. This module performs the method described in step S3 of Embodiment 1 and is not described again here.
A pruned-model output module, configured to evaluate the classification accuracy of the statistics-updated models and to take the model with the highest classification accuracy, after fine-tuning on the training set until convergence, as the final pruned model. This module performs the method described in step S4 of Embodiment 1 and is not described again here.
According to the neural network model pruning system based on adaptive batch normalization provided by the embodiment of the invention, candidate sub-networks are evaluated quickly and accurately by recalibrating the batch normalization layer, and only the pruning strategy that wins this fast evaluation is fine-tuned to obtain the parameters of the final pruned network, avoiding the huge time cost of fine-tuning all pruned networks while remaining advantageous in accuracy.
Example 3
An embodiment of the present invention provides a computer device, as shown in Fig. 5, including: at least one processor 401, such as a CPU (Central Processing Unit); at least one communication interface 403; a memory 404; and at least one communication bus 402. The communication bus 402 is used to enable connected communication between these components. The communication interface 403 may include a display (Display) and a keyboard (Keyboard), and optionally may further include a standard wired interface and a wireless interface. The memory 404 may be a high-speed RAM (Random Access Memory, volatile random-access memory) or a non-volatile memory, such as at least one disk memory. Optionally, the memory 404 may also be at least one storage device located remotely from the processor 401. The processor 401 may perform the neural network model pruning method based on adaptive batch normalization of Embodiment 1. A set of program codes is stored in the memory 404, and the processor 401 calls the program codes stored in the memory 404 to execute the neural network model pruning method based on adaptive batch normalization of Embodiment 1. The communication bus 402 may be a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus, among others. The communication bus 402 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one line is shown in Fig. 5, but this does not mean there is only one bus or one type of bus.
The memory 404 may include volatile memory, such as random-access memory (RAM); it may also include non-volatile memory, such as flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); the memory 404 may also include a combination of the above types of memory.
The processor 401 may be a central processing unit (CPU), a network processor (NP), or a combination of a CPU and an NP.
The processor 401 may further include a hardware chip. The hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof. The PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), generic array logic (GAL), or any combination thereof.
Optionally, the memory 404 is also used to store program instructions, and the processor 401 may invoke these program instructions to implement the neural network model pruning method based on adaptive batch normalization of Embodiment 1.
The embodiment of the invention also provides a computer-readable storage medium storing computer-executable instructions that can execute the neural network model pruning method based on adaptive batch normalization of Embodiment 1. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random-access memory (RAM), a flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); the storage medium may also include a combination of the above types of memory.
It is apparent that the above embodiments are merely examples given for clarity of illustration and are not a limitation on the embodiments. Other variations or modifications of the above description will be apparent to those of ordinary skill in the art; it is neither necessary nor possible to exhaust all embodiments here. Obvious variations or modifications derived therefrom remain within the protection scope of the present invention.

Claims (6)

1. A neural network model pruning method based on adaptive batch normalization, characterized by comprising the following steps:
for an L-layer neural network model, randomly sampling L floating-point numbers from [0, R] (0 < R < 1) as the pruning rate of each layer, and generating a pruning-rate vector (r_1, r_2, …, r_L) satisfying a preset computing-resource constraint as a pruning strategy;
pruning the neural network model based on each pruning strategy to generate a candidate set of pruned models, which comprises:
for each pruning strategy, sorting the convolution kernels of each layer by norm in descending order, and removing the last M convolution kernels in the ordering, where M = ceil(r_l * c_l), ceil denotes rounding up, c_l is the number of convolution kernels in layer l, and r_l is the pruning rate of layer l;
forming the candidate set of pruned models from the pruned models with the last M convolution kernels removed;
updating the statistical parameters of the batch normalization layer of each pruned model in the candidate set using an adaptive batch normalization method, which comprises: for each pruned model in the candidate set, fixing all learnable parameters, iterating over a preset number of training samples, and updating the running mean and running variance statistics of the batch normalization layer, the running mean μ_t and running variance σ_t² being updated by the following formulas:
μ_t = m μ_{t-1} + (1 - m) μ_β,
σ_t² = m σ_{t-1}² + (1 - m) σ_β²,
where μ_t denotes the running mean, σ_t² denotes the running variance, m denotes the weight of the historical value in the running average, t denotes the iteration index, and β denotes the sample batch;
and evaluating the classification accuracy of the neural network models with updated statistical parameters, and fine-tuning the model with the highest classification accuracy on a training set until convergence to obtain the final pruned model.
2. The neural network model pruning method based on adaptive batch normalization according to claim 1, wherein the preset computing-resource constraint includes at least one of a preset operation-count limit, a preset parameter-count limit, and a preset computation-latency limit.
3. The neural network model pruning method based on adaptive batch normalization according to claim 1, wherein the process of evaluating the classification accuracy of the neural network models with updated statistical parameters and taking the model with the highest classification accuracy as the final pruned model comprises:
obtaining the classification accuracy of each statistics-updated neural network model, computed on a validation set, as the potential accuracy of the model;
and taking the neural network models whose potential accuracy ranks in the top N as candidate models, fine-tuning the candidate models on a training set until convergence, and taking the model achieving the highest classification accuracy on the validation set as the final pruned model.
4. A neural network model pruning system based on adaptive batch normalization, comprising:
a pruning strategy generation module, configured to randomly sample, for an L-layer neural network model, L floating-point numbers from [0, R] (0 < R < 1) as the pruning rate of each layer, and to generate a pruning-rate vector (r_1, r_2, …, r_L) satisfying a preset computing-resource constraint as a pruning strategy;
a pruned-model candidate set generation module, configured to prune the neural network model based on each pruning strategy to generate a candidate set of pruned models, which comprises:
for each pruning strategy, sorting the convolution kernels of each layer by norm in descending order, and removing the last M convolution kernels in the ordering, where M = ceil(r_l * c_l), ceil denotes rounding up, c_l is the number of convolution kernels in layer l, and r_l is the pruning rate of layer l;
forming the candidate set of pruned models from the pruned models with the last M convolution kernels removed;
a batch normalization layer statistical parameter updating module, configured to update the statistical parameters of the batch normalization layer of each pruned model in the candidate set using an adaptive batch normalization method, which comprises: for each pruned model in the candidate set, fixing all learnable parameters, iterating over a preset number of training samples, and updating the running mean and running variance statistics of the batch normalization layer, the running mean μ_t and running variance σ_t² being updated by the following formulas:
μ_t = m μ_{t-1} + (1 - m) μ_β,
σ_t² = m σ_{t-1}² + (1 - m) σ_β²,
where μ_t denotes the running mean, σ_t² denotes the running variance, m denotes the weight of the historical value in the running average, t denotes the iteration index, and β denotes the sample batch;
and a pruned-model output module, configured to evaluate the classification accuracy of the neural network models with updated statistical parameters, and to take the model with the highest classification accuracy, after fine-tuning on the training set until convergence, as the final pruned model.
5. A computer device, comprising: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor to cause the at least one processor to perform the adaptive batch normalization based neural network model pruning method of any one of claims 1-3.
6. A computer-readable storage medium having stored thereon computer instructions for causing a computer to perform the neural network model pruning method based on adaptive batch normalization of any one of claims 1-3.
CN201911423105.7A (priority date 2019-12-31, filing date 2019-12-31): Neural network model pruning method and system based on adaptive batch normalization. Status: Active. Granted as CN111222629B.

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201911423105.7A | 2019-12-31 | 2019-12-31 | Neural network model pruning method and system based on adaptive batch normalization (granted as CN111222629B)

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201911423105.7A | 2019-12-31 | 2019-12-31 | Neural network model pruning method and system based on adaptive batch normalization (granted as CN111222629B)

Publications (2)

Publication Number | Publication Date
CN111222629A | 2020-06-02
CN111222629B | 2023-05-05

Family

ID=70829217

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201911423105.7A | Neural network model pruning method and system based on adaptive batch normalization (CN111222629B, Active) | 2019-12-31 | 2019-12-31

Country Status (1)

Country | Link
CN | CN111222629B

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication Number | Priority Date | Publication Date | Assignee | Title
CN111553169B * | 2020-06-25 | 2023-08-25 | 北京百度网讯科技有限公司 | Pruning method and device of semantic understanding model, electronic equipment and storage medium
CN111967491A * | 2020-06-29 | 2020-11-20 | 北京百度网讯科技有限公司 | Model offline quantization method and device, electronic equipment and storage medium
CN112241786B * | 2020-10-23 | 2024-02-20 | 北京百度网讯科技有限公司 | Determination method and device for model super-parameters, computing device and medium
CN112149829B * | 2020-10-23 | 2024-05-14 | 北京百度网讯科技有限公司 | Method, device, equipment and storage medium for determining pruning strategy of network model
CN112396100B * | 2020-11-16 | 2024-05-24 | 中保车服科技服务股份有限公司 | Optimization method, system and related device for fine-grained classification model
CN112734036B * | 2021-01-14 | 2023-06-02 | 西安电子科技大学 | Target detection method based on pruned convolutional neural network
CN114861910B * | 2022-05-19 | 2023-07-04 | 北京百度网讯科技有限公司 | Compression method, device, equipment and medium of neural network model
CN115935263B * | 2023-02-22 | 2023-06-16 | 和普威视光电股份有限公司 | Side chip detection and classification method and system based on yolov5 pruning

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication Number | Priority Date | Publication Date | Assignee | Title
CN102981854A * | 2012-11-16 | 2013-03-20 | 天津市天祥世联网络科技有限公司 | Neural network optimization method based on a floating-point operation inline function library
CN107340993B * | 2016-04-28 | 2021-07-16 | 中科寒武纪科技股份有限公司 | Arithmetic device and method
US20180075347A1 * | 2016-09-15 | 2018-03-15 | Microsoft Technology Licensing, LLC | Efficient training of neural networks
US11631004B2 * | 2018-03-28 | 2023-04-18 | Intel Corporation | Channel pruning of a convolutional network based on gradient descent optimization
CN109657780A * | 2018-06-15 | 2019-04-19 | 清华大学 | A model compression method based on pruning-order active learning
CN109101999B * | 2018-07-16 | 2021-06-25 | 华东师范大学 | Support-vector-machine-based cooperative neural network credible decision method
CN109376859A * | 2018-09-27 | 2019-02-22 | 东南大学 | A neural network pruning method based on diamond convolution
CN109902808B * | 2019-02-28 | 2023-09-26 | 华南理工大学 | Method for optimizing a convolutional neural network based on a floating-point mutation genetic algorithm
CN110232436A * | 2019-05-08 | 2019-09-13 | 华为技术有限公司 | Pruning method, device and storage medium for convolutional neural networks
CN110222820A * | 2019-05-28 | 2019-09-10 | 东南大学 | Convolutional neural network compression method based on weight pruning and quantization
CN110210620A * | 2019-06-04 | 2019-09-06 | 北京邮电大学 | A channel pruning method for deep neural networks
CN110222841A * | 2019-06-17 | 2019-09-10 | 苏州思必驰信息科技有限公司 | Neural network training method and device based on a spacing (margin) loss function
CN110443359A * | 2019-07-03 | 2019-11-12 | 中国石油大学(华东) | Neural network compression algorithm based on adaptive combined pruning and quantization

Also Published As

Publication Number | Publication Date
CN111222629A | 2020-06-02

Similar Documents

Publication | Title
CN111222629B | Neural network model pruning method and system based on adaptive batch normalization
JP6969637B2 | Causality analysis methods and electronic devices
TWI769754B | Method and device for determining target business model based on privacy protection
WO2021208151A1 | Model compression method, image processing method and device
KR101904518B1 | Method and system for identifying rare-event failure rates
WO2021129086A1 | Traffic prediction method, device, and storage medium
JP6831347B2 | Learning equipment, learning methods and learning programs
Jiang et al. | Two step composite quantile regression for single-index models
CN111581909B | SRAM yield evaluation method based on improved adaptive importance sampling algorithm
WO2023098544A1 | Structured pruning method and apparatus based on local sparsity constraints
Oh et al. | On score vector- and residual-based CUSUM tests in ARMA-GARCH models
US8813009B1 | Computing device mismatch variation contributions
Cocucci et al. | Model error covariance estimation in particle and ensemble Kalman filters using an online expectation-maximization algorithm
JP7320705B2 | Learning data evaluation method, program, learning data generation method, trained model generation method, and learning data evaluation system
CN114186518A | Integrated circuit yield estimation method and memory
CN111612648B | Training method and device for photovoltaic power generation prediction model and computer equipment
CN115907775A | Personal credit assessment rating method based on deep learning and application thereof
EP4007173A1 | Data storage method, and data acquisition method and apparatus therefor
CN113688191B | Feature data generation method, electronic device, and storage medium
CN112396100B | Optimization method, system and related device for fine-grained classification model
CN115238641A | Defect root cause determination method, defect root cause determination device and storage medium
Ridder | Asymptotic optimality of the cross-entropy method for Markov chain problems
Yin et al. | An efficient reference-point based surrogate-assisted multi-objective differential evolution for analog/RF circuit synthesis
CN112419098A | Power grid safety and stability simulation sample screening and expanding method based on safety information entropy
Yuenyong | Fast and effective tuning of echo state network reservoir parameters using evolutionary algorithms and template matrices

Legal Events

Code | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant