CN112766397A - Classification network and implementation method and device thereof - Google Patents

Classification network and implementation method and device thereof

Info

Publication number
CN112766397A
CN112766397A (application CN202110113089.2A)
Authority
CN
China
Prior art keywords
pruning
module
network
classification network
Inception
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110113089.2A
Other languages
Chinese (zh)
Other versions
CN112766397B (en)
Inventor
冯蓬勃
张一凡
刘杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Goertek Inc
Original Assignee
Goertek Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Goertek Inc filed Critical Goertek Inc
Priority to CN202110113089.2A priority Critical patent/CN112766397B/en
Publication of CN112766397A publication Critical patent/CN112766397A/en
Priority to PCT/CN2021/129578 priority patent/WO2022160856A1/en
Application granted granted Critical
Publication of CN112766397B publication Critical patent/CN112766397B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a classification network and an implementation method and device thereof. The method includes: constructing, based on the IRv2 algorithm, a classification network comprising a Stem module, a plurality of Inception modules and a plurality of IR modules, wherein the batch normalization (BN) layers of the classification network use the gamma parameter; initializing the classification network, and performing sparsification training on the initialized classification network to obtain sparsified gamma parameters; and, based on the sparsified gamma parameters, performing network slimming pruning on the Stem module, the pruning Inception modules and the plurality of IR modules, while not performing network slimming pruning on the non-pruning Inception modules. This technical scheme can ensure the classification effect of the classification network while simplifying the network structure, thereby improving training efficiency, reducing inference time and memory occupation, saving manpower, working hours and expenses, and enabling the classification network to be deployed on hardware devices with different computing power.

Description

Classification network and implementation method and device thereof
Technical Field
The present application relates to the field of neural network technologies, and in particular, to a classification network and a method and an apparatus for implementing the classification network.
Background
IRv2 is short for Inception-ResNet-v2, a second-generation algorithm that combines Inception and ResNet (residual network) for classification. By combining the advantages of Inception and ResNet, a deeper classification network with good classification performance can be obtained. However, a classification network obtained based on the IRv2 algorithm also suffers from high complexity, large memory occupation and other problems that urgently need to be solved.
Disclosure of Invention
The embodiments of the application provide a classification network and an implementation method and device thereof, which can ensure the classification effect of the classification network while reducing its complexity, yielding a lighter classification network.
The embodiment of the application adopts the following technical scheme:
In a first aspect, an embodiment of the present application provides a method for implementing a classification network, including: constructing, based on the IRv2 algorithm, a classification network comprising a Stem module, a plurality of Inception modules and a plurality of IR modules, wherein the plurality of Inception modules include a pruning Inception module and a non-pruning Inception module, and the batch normalization (BN) layers of the classification network use the gamma parameter; initializing the classification network, and performing sparsification training on the initialized classification network to obtain sparsified gamma parameters; and pruning the sparsely trained classification network based on the sparsified gamma parameters, wherein network slimming pruning is performed on the Stem module, the pruning Inception module and the plurality of IR modules, and network slimming pruning is not performed on the non-pruning Inception module.
In some embodiments, in the method for implementing a classification network, pruning the sparsely trained classification network includes: not performing network slimming pruning on the input channels of the first convolutional layer in the Stem module.
In some embodiments, in the method for implementing a classification network, pruning the sparsely trained classification network includes: determining the number and indices of the input channels of the first convolutional layer under each branch in the pruning Inception module according to the number and indices of the output channels of the last convolutional layer in the Stem module.
In some embodiments, in the method for implementing a classification network, the Inception modules include a Mixed_5b module and a Mixed_7a module, where the Mixed_5b module is a pruning Inception module and the Mixed_7a module is a non-pruning Inception module.
In some embodiments, in the method for implementing a classification network, pruning the sparsely trained classification network includes: not performing network slimming pruning on the input channels of the first convolutional layer under each Inception branch in the IR module.
In some embodiments, in the method for implementing a classification network, pruning the sparsely trained classification network includes: determining the number and indices of the input channels of the first convolutional layer connected after the concat layer in the IR module according to the number and indices of the output channels of the last convolutional layer in each Inception branch.
In some embodiments, in the method for implementing a classification network, performing network slimming pruning on the Stem module, the pruning Inception module and the plurality of IR modules includes: performing network slimming pruning on three types of IR modules, namely the Block35 module, the Block17 module and the Block8 module.
In some embodiments, the method for implementing a classification network further comprises: fine-tuning the classification network subjected to pruning treatment; if the pruning rate of the fine-tuned classification network reaches a preset value, outputting the fine-tuned classification network; otherwise, carrying out sparse training on the fine-tuned classification network, and carrying out pruning treatment on the sparsely trained classification network again.
In a second aspect, an embodiment of the present application further provides an implementation apparatus for a classification network, which is used to implement the implementation method of the classification network described above.
In a third aspect, an embodiment of the present application further provides a classification network based on the IRv2 algorithm, including a Stem module, multiple Inception modules and multiple IR modules, where the Inception modules include a pruning Inception module and a non-pruning Inception module, the Stem module, the pruning Inception module and the IR modules are pruned through network slimming pruning, and the non-pruning Inception module is not pruned through network slimming pruning.
In a fourth aspect, an embodiment of the present application further provides an electronic device, including: a processor; and a memory arranged to store computer executable instructions that, when executed, cause the processor to perform a method of implementing a classification network as described in any one of the above.
In a fifth aspect, the present application further provides a computer-readable storage medium storing one or more programs, which when executed by an electronic device including a plurality of application programs, cause the electronic device to execute the implementation method of the classification network as described in any one of the above.
The embodiments of the application adopt at least one technical scheme that can achieve the following beneficial effects: by selectively using the network slimming pruning technique, the classification network comprising the Stem module, the plurality of Inception modules and the plurality of IR modules is reasonably pruned, which can ensure the classification effect of the classification network while simplifying the network structure, thereby improving training efficiency, reducing inference time and memory occupation, saving manpower, working hours and expenses, and enabling the classification network to be deployed on hardware devices with different computing power.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 shows a flow diagram of a method of implementing a classification network according to one embodiment of the present application;
FIG. 2 illustrates a classification network structure diagram based on the IRv2 algorithm according to one embodiment of the present application;
fig. 3 shows a schematic network structure diagram of the Mixed_5b module in fig. 2;
FIG. 4 is a schematic diagram of a network structure of the Block35 module in FIG. 2;
FIG. 5 is a schematic diagram of the network structure of the Block17 module in FIG. 2;
FIG. 6 is a schematic diagram of a network structure of the Block8 module in FIG. 2;
FIG. 7 is a block diagram of an apparatus for implementing a classification network according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of an electronic device in an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be described in detail and completely with reference to the following specific embodiments of the present application and the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In order to expand the application scenarios of classification neural networks, reduce their training and inference time and compress their model size, academia has proposed a large number of pruning, compression and quantization methods for pruning, compressing and quantizing existing complex classification neural networks, so as to adapt them to various application scenarios with different computing power.
The starting point of neural network pruning is to eliminate unimportant parameters in the model, while ensuring that the performance loss of the model is not too large, so as to reduce the complexity, memory occupation and computation of the network. Neural network pruning can be divided into unstructured pruning and structured pruning according to the granularity of pruning. Most early pruning methods are unstructured, for example, setting values with small absolute value in a kernel to zero so as to obtain a sparse kernel; such methods require support from the underlying hardware or computation library. Structured pruning can be performed at the channel-wise, filter-wise or shape-wise level, i.e., unimportant structures are deleted directly, with the advantage that no support from the underlying hardware or computation library is required.
The Network Slimming algorithm is a classical structured pruning algorithm that prunes convolutional layers by introducing a channel-level scaling factor. In a convolutional neural network, the number of output channels of a convolutional layer followed by a Batch Normalization (BN) layer equals the number of gamma parameters of that BN layer, i.e., each channel output by the convolutional layer corresponds to one gamma value. The Network Slimming algorithm therefore reuses the gamma parameters of the BN layer as scaling factors; by sparsifying the gamma parameters and deleting the channels whose corresponding gamma values are small, the complexity of the network structure can be reduced while the model performance remains essentially unchanged.
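To make this relationship concrete, the following PyTorch sketch shows a convolutional layer followed by a BN layer whose weight tensor holds one gamma value per output channel; the layer sizes are illustrative and not taken from the patent.

```python
# Minimal sketch: the BN layer's weight tensor holds one gamma value per output
# channel of the preceding conv, so it can double as a channel scaling factor.
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=3, out_channels=32, kernel_size=3, bias=False)
bn = nn.BatchNorm2d(num_features=32)   # one gamma per conv output channel

x = torch.randn(1, 3, 149, 149)
y = bn(conv(x))                        # channel i of y is scaled by bn.weight[i]

print(bn.weight.shape)                 # torch.Size([32]) -- the gamma parameters
# Channels whose gamma is driven toward zero by sparsification training
# contribute little to the output and become candidates for deletion.
```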
The technical idea of the present application is to prune a complex classification network implemented based on the IRv2 algorithm using the Network Slimming algorithm, selectively determining which modules are pruned and which are not, thereby balancing network structure complexity against classification effect.
The technical solutions provided by the embodiments of the present application are described in detail below with reference to the accompanying drawings.
Fig. 1 shows a flow diagram of a method for implementing a classification network according to an embodiment of the present application. As shown in fig. 1, the method includes:
Step S110, constructing, based on the IRv2 algorithm, a classification network comprising a Stem module, a plurality of Inception modules and a plurality of IR modules, wherein the plurality of Inception modules include a pruning Inception module and a non-pruning Inception module, and the batch normalization (BN) layers of the classification network use the gamma parameter.
The classification network structure implemented based on the IRv2 algorithm has the characteristics of both the Inception and ResNet network structures. Specifically, each layer in the Inception network structure uses convolution kernels of different sizes to generate different receptive fields, and then splices all branches to enrich the information of each layer; that is, the Inception structure improves classification performance by increasing the width of the neural network. The ResNet network structure solves the degradation problem of neural networks as depth increases by adding identity mappings on top of a shallow network, so that the neural network can be designed to be very deep; that is, the ResNet structure improves classification performance by increasing the depth of the network.
Among the above modules, the Stem module mainly performs convolution and pooling on the image multiple times to preprocess the image. The Inception module is derived from the Inception algorithm and includes, for example, the Mixed_5 series (Mixed_5b, Mixed_5c, Mixed_5d), the Mixed_6 series (Mixed_6a to Mixed_6e) and the Mixed_7 series (Mixed_7a, Mixed_7b, Mixed_7c); which modules are used can be selected according to requirements. The IR module is the Inception-ResNet module, the combination of Inception and ResNet proposed by the IRv2 algorithm.
In the embodiment of the application, the Inception modules are divided into pruning Inception modules and non-pruning Inception modules, that is, some of the Inception modules are pruned and some are not.
In practice, many classification networks based on the IRv2 algorithm do not use the gamma parameter in the BN (batch normalization) layer; in the embodiment of the present application, however, since pruning is performed based on sparsified gamma parameters, the gamma parameter must be used in the BN layers when constructing the classification network.
Step S120, initializing the classification network, and performing sparsification training on the initialized classification network to obtain sparsified gamma parameters. Any existing sparsification training method may be used, which is not limited in this application.
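One common way to realize such sparsification training, in the Network Slimming style, is to add an L1 penalty on all BN gamma parameters during ordinary training. The sketch below is a hedged example of a single step; the coefficient lam and the loop names (model, optimizer, criterion, images, labels) are assumptions, not values from the patent.

```python
# Hedged sketch of one sparsification-training step: after the normal backward
# pass, add the subgradient of an L1 penalty on every BN gamma.
import torch
import torch.nn as nn

def add_l1_to_bn_gammas(model: nn.Module, lam: float = 1e-4) -> None:
    """Add the gradient of lam * |gamma| to the existing BN weight gradients."""
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d) and m.weight.grad is not None:
            m.weight.grad.add_(lam * torch.sign(m.weight.detach()))

# Usage inside an ordinary training loop:
#   loss = criterion(model(images), labels)
#   optimizer.zero_grad()
#   loss.backward()
#   add_l1_to_bn_gammas(model, lam=1e-4)   # sparsify the gamma parameters
#   optimizer.step()
```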
Step S130, pruning the classification network after the sparsification training based on the sparsified gamma parameters, wherein network slimming pruning is performed on the Stem module, the pruning Inception module and the plurality of IR modules, and network slimming pruning is not performed on the non-pruning Inception module.
Specifically, network slimming pruning can be performed on the modules to be pruned in the following manner (a code sketch of this procedure is given after the steps):
1) First, all gamma values in the whole classification network are sorted, and a threshold T is determined according to the preset pruning rate.
2) Each convolutional layer is then analyzed one by one: for the BN layer following the convolutional layer, the indices of the channels whose gamma parameter is less than the threshold T are recorded as I_j (where j denotes the index of the current convolutional layer); these channels are deleted during pruning.
3) For a convolutional layer to be pruned, the number of its input channels is determined by the number of output channels of the preceding layer, and the number of its output channels is determined by the gamma values in the following BN layer. Assuming the index of the current convolutional layer is k, the deleted input-channel indices are I_(k-1), i.e., the same channels that were deleted in the previous layer, and the deleted output-channel indices are I_k.
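A minimal PyTorch sketch of steps 1) and 2) might look as follows; the helper names are illustrative, and the per-layer index bookkeeping of step 3) is handled module by module as described in the following paragraphs.

```python
# Sketch: gather every gamma in the network, derive the threshold T from the
# preset pruning rate, and record for each BN layer the channel indices I_j
# whose gamma falls below T (these channels are deleted when pruning).
import torch
import torch.nn as nn

def compute_threshold(model: nn.Module, prune_rate: float) -> float:
    """prune_rate is assumed to lie in (0, 1)."""
    gammas = torch.cat([m.weight.detach().abs().flatten()
                        for m in model.modules()
                        if isinstance(m, nn.BatchNorm2d)])
    k = int(prune_rate * gammas.numel())
    return torch.sort(gammas).values[k].item() if k > 0 else -1.0

def channels_to_delete(model: nn.Module, threshold: float):
    """Return {BN module: list of channel indices with gamma < T}."""
    plan = {}
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            mask = m.weight.detach().abs() < threshold
            plan[m] = mask.nonzero(as_tuple=True)[0].tolist()
    return plan
```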
It can be seen that, in the method shown in fig. 1, the network slimming pruning technique is selectively used to reasonably prune the classification network including the Stem module, the Inception modules and the IR modules, which can ensure the classification effect of the classification network while simplifying the network structure, thereby improving training efficiency, reducing inference time and memory occupation, saving manpower, working hours and expenses, and enabling the classification network to be deployed on hardware devices with different computing power.
In some embodiments, in the method for implementing a classification network, pruning the sparsely trained classification network includes: not performing network slimming pruning on the input channels of the first convolutional layer in the Stem module. These input channels cannot be pruned because the input of the first convolutional layer in the Stem module is the original image; the remaining convolutional layers of the module can be pruned normally in the manner described above.
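As a hedged illustration of this exception, a pruning routine might guard the first convolution as shown below; first_conv and kept_from_previous are assumed names rather than identifiers from the patent.

```python
# When propagating kept channel indices through the network, the very first
# convolution keeps all of its input channels (the raw image), regardless of
# the pruning plan.
import torch.nn as nn

def input_keep_indices(conv: nn.Conv2d, kept_from_previous, first_conv):
    if conv is first_conv:
        return list(range(conv.in_channels))   # raw image channels: keep all
    return kept_from_previous                  # otherwise follow the previous layer
```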
In some embodiments, in the method for implementing a classification network, pruning the sparsely trained classification network includes: determining the number and indices of the input channels of the first convolutional layer under each branch in the pruning Inception module according to the number and indices of the output channels of the last convolutional layer in the Stem module.
In this embodiment, the pruning Inception module is connected after the Stem module, so the input channels of the first convolutional layer under each branch in the pruning Inception module should match the output channels of the last convolutional layer in the Stem module in both number and index. For example, if the last convolutional layer in the Stem module keeps 32 output channels with indices 0, 2, 4, 6 … 62, then in the pruning Inception module the first convolutional layer under each branch also keeps 32 input channels with the same indices 0, 2, 4, 6 … 62, in one-to-one correspondence.
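A hedged PyTorch sketch of this channel matching is given below; prune_conv_inputs and the example indices are illustrative, not the patent's implementation.

```python
# Restrict the input channels of a branch-first conv to the output channels
# kept in the Stem module's last conv.
import torch
import torch.nn as nn

def prune_conv_inputs(conv: nn.Conv2d, kept_in: list) -> nn.Conv2d:
    """Build a new conv whose input channels are restricted to kept_in."""
    new = nn.Conv2d(len(kept_in), conv.out_channels,
                    kernel_size=conv.kernel_size, stride=conv.stride,
                    padding=conv.padding, bias=conv.bias is not None)
    new.weight.data = conv.weight.data[:, kept_in, :, :].clone()
    if conv.bias is not None:
        new.bias.data = conv.bias.data.clone()
    return new

# Example: the Stem's last conv keeps 32 of 64 output channels, say indices
# 0, 2, 4, ..., 62; each branch-first conv then keeps exactly those input indices.
kept = list(range(0, 64, 2))
branch_conv = nn.Conv2d(64, 96, kernel_size=1, bias=False)
branch_conv = prune_conv_inputs(branch_conv, kept)
print(branch_conv.weight.shape)   # torch.Size([96, 32, 1, 1])
```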
In addition, the other convolutional layers in the pruning Inception module do not need to undergo network slimming pruning.
In some embodiments, in the method for implementing a classification network, the Inception modules include a Mixed_5b module and a Mixed_7a module, where the Mixed_5b module is a pruning Inception module and the Mixed_7a module is a non-pruning Inception module.
As mentioned above, the Inception modules include several series. In some embodiments, a Mixed_5b module and a Mixed_7a module are selected, the Mixed_5b module is pruned, and the Mixed_7a module is not pruned.
In some embodiments, in the method for implementing a classification network, pruning the sparsely trained classification network includes: not performing network slimming pruning on the input channels of the first convolutional layer under each Inception branch in the IR module.
In some embodiments, in the method for implementing a classification network, pruning the sparsely trained classification network includes: determining the number and indices of the input channels of the first convolutional layer connected after the concat layer in the IR module according to the number and indices of the output channels of the last convolutional layer in each Inception branch. That is, the first convolutional layer connected after the concat layer is the convolutional layer that can be pruned in each IR module, and the other convolutional layers in the IR modules are not pruned.
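Because the concat layer stacks the branch outputs channel by channel, the kept input-channel indices of the convolutional layer after the concat are the kept output-channel indices of each branch shifted by the widths of the preceding branches. The sketch below illustrates this bookkeeping with made-up branch widths and indices.

```python
# Index bookkeeping across a concat layer: offset each branch's kept output
# indices by the total width of the branches that precede it.
def kept_after_concat(branch_widths, kept_per_branch):
    """branch_widths: original output-channel counts per branch.
    kept_per_branch: kept output-channel indices per branch."""
    kept, offset = [], 0
    for width, kept_idx in zip(branch_widths, kept_per_branch):
        kept.extend(offset + i for i in kept_idx)
        offset += width
    return kept

# Block35-style example: branches of 64, 32 and 32 channels feed a 128-channel concat.
print(kept_after_concat([64, 32, 32], [[0, 1, 5], [2, 3], [0]]))
# [0, 1, 5, 66, 67, 96] -> kept input-channel indices of the conv after the concat
```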
In some embodiments, in the method for implementing a classification network, performing network slimming pruning on the Stem module, the pruning Inception module and the plurality of IR modules includes: performing network slimming pruning on three types of IR modules, namely the Block35 module, the Block17 module and the Block8 module.
Fig. 2 shows a structural diagram of a classification network based on the IRv2 algorithm according to an embodiment of the present application. As shown in fig. 2, the classification network includes, connected in series, a Stem module, a Mixed_5b module, a combination of 10 Block35 modules, a Mixed_6a module, a combination of 20 Block17 modules, a Mixed_7a module, a combination of 10 Block8 modules, an Average Pooling layer, a Dropout layer and a Softmax layer. The Stem module receives the Input.
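The layout of Fig. 2 can be summarized by the hedged PyTorch sketch below. The Stem, Mixed_* and Block* classes are assumed to be defined elsewhere and are passed in as parameters; the dropout rate and classifier head are placeholders rather than details stated in the patent.

```python
# Illustrative module sequence for the Fig. 2 classification network.
import torch.nn as nn

def build_irv2_classifier(num_classes, Stem, Mixed_5b, Block35,
                          Mixed_6a, Block17, Mixed_7a, Block8):
    return nn.Sequential(
        Stem(),                                  # convolution + pooling preprocessing
        Mixed_5b(),                              # pruning Inception module
        *[Block35() for _ in range(10)],         # 10 x IR module (pruned)
        Mixed_6a(),
        *[Block17() for _ in range(20)],         # 20 x IR module (pruned)
        Mixed_7a(),                              # non-pruning Inception module
        *[Block8() for _ in range(10)],          # 10 x IR module (pruned)
        nn.AdaptiveAvgPool2d(1),                 # Average Pooling layer
        nn.Flatten(),
        nn.Dropout(p=0.2),                       # Dropout layer
        nn.LazyLinear(num_classes),              # followed by Softmax at inference
    )
```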
Fig. 3 shows a schematic diagram of the network structure of the Mixed_5b module in fig. 2. As shown in fig. 3, the Mixed_5b module includes four branches: the first branch includes a 64-channel convolutional layer (Conv) using a 1 x 1 convolution kernel, a 96-channel convolutional layer using a 3 x 3 convolution kernel, and a 96-channel convolutional layer using a 3 x 3 convolution kernel; the second branch includes a 48-channel convolutional layer using a 1 x 1 convolution kernel and a 64-channel convolutional layer using a 5 x 5 convolution kernel; the third branch includes a 192-channel average pooling layer and a 64-channel convolutional layer using a 1 x 1 convolution kernel; and the fourth branch includes a 96-channel convolutional layer using a 1 x 1 convolution kernel. The outputs of the branches are spliced by a 320-channel concat layer and then output (Output).
In fig. 3, the 64-channel convolutional layer using a 1 x 1 convolution kernel in the first branch, the 48-channel convolutional layer using a 1 x 1 convolution kernel in the second branch, and the 96-channel convolutional layer using a 1 x 1 convolution kernel in the fourth branch can be pruned, and the number and indices of their input channels are determined according to the number and indices of the output channels of the last convolutional layer in the Stem module.
Fig. 4 shows a schematic network structure diagram of the Block35 module in fig. 2, in which a dashed box represents a convolutional layer that can be pruned. As shown in fig. 4, the Block35 module includes three branches: the first branch includes a 32-channel convolutional layer using a 1 x 1 convolution kernel, a 48-channel convolutional layer using a 3 x 3 convolution kernel, and a 64-channel convolutional layer using a 3 x 3 convolution kernel; the second branch includes a 32-channel convolutional layer using a 1 x 1 convolution kernel and a 32-channel convolutional layer using a 3 x 3 convolution kernel; and the third branch includes a 32-channel convolutional layer using a 1 x 1 convolution kernel. The outputs of the branches are spliced by a 128-channel concat layer, then pass through a prunable 320-channel convolutional layer using a 1 x 1 convolution kernel, and after scaling (scaled up) and activation (using the relu function) the final feature output of the module is obtained.
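For illustration, a compact PyTorch sketch of such a Block35 module is given below. The 320-channel input, the residual addition and the scale value follow the publicly known Inception-ResNet-v2 design and are assumptions here rather than details stated in the patent; only the 1 x 1 conv after the concat is marked as prunable, matching the dashed box described above.

```python
# Compact sketch of a Block35-style module matching Fig. 4.
import torch
import torch.nn as nn

def conv_bn_relu(cin, cout, k, pad=0):
    return nn.Sequential(nn.Conv2d(cin, cout, k, padding=pad, bias=False),
                         nn.BatchNorm2d(cout), nn.ReLU(inplace=True))

class Block35(nn.Module):
    def __init__(self, scale: float = 0.17):
        super().__init__()
        self.scale = scale
        self.branch0 = nn.Sequential(conv_bn_relu(320, 32, 1),
                                     conv_bn_relu(32, 48, 3, pad=1),
                                     conv_bn_relu(48, 64, 3, pad=1))
        self.branch1 = nn.Sequential(conv_bn_relu(320, 32, 1),
                                     conv_bn_relu(32, 32, 3, pad=1))
        self.branch2 = conv_bn_relu(320, 32, 1)
        self.conv_up = nn.Conv2d(128, 320, 1)    # prunable conv after the 128-ch concat
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = torch.cat([self.branch0(x), self.branch1(x), self.branch2(x)], dim=1)
        out = self.conv_up(out)                  # scale up to 320 channels
        return self.relu(x + self.scale * out)   # residual add + relu
```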
Fig. 5 shows a schematic diagram of the network structure of the Block17 module in fig. 2, in which a dashed box represents a convolutional layer that can be pruned. As shown in fig. 5, the Block17 module includes two branches: the first branch includes a 128-channel convolutional layer using a 1 x 1 convolution kernel, a 160-channel convolutional layer using a 1 x 7 convolution kernel, and a 192-channel convolutional layer using a 7 x 1 convolution kernel; the second branch includes the convolution-batch normalization-activation layer CBA (26, 1, 192). The outputs of the branches are spliced by a 384-channel concat layer, then pass through a prunable 1088-channel convolutional layer using a 1 x 1 convolution kernel, and after scaling and activation the final feature output of the module is obtained.
Fig. 6 shows a schematic diagram of the network structure of the Block8 module in fig. 2, in which a dashed box represents a convolutional layer that can be pruned. As shown in fig. 6, the Block8 module includes two branches: the first branch includes a 192-channel convolutional layer using a 1 x 1 convolution kernel, a 224-channel convolutional layer using a 1 x 3 convolution kernel, and a 256-channel convolutional layer using a 3 x 1 convolution kernel; the second branch includes a 192-channel convolutional layer using a 1 x 1 convolution kernel. The outputs of the branches are spliced by a 448-channel concat layer, then pass through a prunable 2080-channel convolutional layer using a 1 x 1 convolution kernel, and after scaling and activation the final feature output of the module is obtained.
In some embodiments, the method for implementing a classification network further comprises: fine-tuning the classification network subjected to pruning treatment; if the pruning rate of the fine-tuned classification network reaches a preset value, outputting the fine-tuned classification network; otherwise, carrying out sparse training on the fine-tuned classification network, and carrying out pruning treatment on the sparsely trained classification network again.
Since the gamma parameters are mainly used to determine which channels need to be pruned, the gamma parameters in the classification network can be removed after pruning, and the classification network is then fine-tuned. If the pruning rate of the fine-tuned classification network meets the expectation and the effect also meets the expectation, the classification network can be put into use; if the pruning rate does not meet the expectation, sparsification training can be performed again, and network slimming pruning can be performed again using the newly sparsified gamma parameters.
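A hedged sketch of this iterate-until-target loop follows; sparse_train, prune, finetune and measure_prune_rate are assumed helper functions standing in for the steps detailed earlier, not functions defined by the patent.

```python
# Outer loop: sparsify, prune, fine-tune, and repeat until the pruning rate
# reaches the preset target.
def slim_until_target(model, target_rate, step_rate,
                      sparse_train, prune, finetune, measure_prune_rate):
    while True:
        sparse_train(model)                  # L1 penalty on the BN gamma parameters
        prune(model, step_rate)              # network slimming pruning pass
        finetune(model)                      # recover accuracy after pruning
        if measure_prune_rate(model) >= target_rate:
            return model                     # pruning rate has reached the preset value
```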
Experiments show that the scheme of the embodiments of the application can reduce the training time and inference time of the classification network and reduce the number of floating point operations (FLOPs) and the number of parameters.
In addition, the scheme of the embodiment of the application can also keep the classification effect of the classification network.
In a set of control experiments, the experimental scene is defect detection of industrial products, and the classification objects are 1799 images (samples) to be detected of the industrial products.
The comparison network is a network constructed based on the IRv2 algorithm (see fig. 2); the pruning network is obtained by pruning the comparison network with a pruning rate of 0.2 using the method described in this embodiment.
Experimental data show that the missed-detection rate of the pruning network (defective industrial products being classified as normal, i.e., misses) is reduced by 0.2%, while its accuracy (P) is the same as that of the comparison network, at 98.3%; the reduction in missed-detection rate has strong practical significance in this scenario.
Moreover, the training time (about 93.1% of the comparison network), inference time (about 90.2% of the comparison network), FLOPs (about 81.5% of the comparison network) and number of parameters (about 77.6% of the comparison network) of the pruning network are all significantly reduced compared with the comparison network, while the performance of the two remains essentially unchanged, which is an unexpected gain.
The embodiment of the present application further provides an implementation apparatus of a classification network, which is used for implementing the implementation method of the classification network.
Specifically, fig. 7 shows a schematic structural diagram of an implementation apparatus of a classification network according to an embodiment of the present application. As shown in fig. 7, an apparatus 700 for implementing a classification network includes:
The construction unit 710 is configured to construct, based on the IRv2 algorithm, a classification network comprising a Stem module, a plurality of Inception modules and a plurality of IR modules, wherein the plurality of Inception modules include a pruning Inception module and a non-pruning Inception module, and the gamma parameter is used in the batch normalization (BN) layers of the classification network.
The sparse unit 720 is configured to initialize the classification network, and perform sparse training on the initialized classification network to obtain sparsified gamma parameters.
The pruning unit 730 is configured to prune the classification network after the sparsification training based on the sparsified gamma parameters, wherein network slimming pruning is performed on the Stem module, the pruning Inception module and the plurality of IR modules, and network slimming pruning is not performed on the non-pruning Inception module.
In some embodiments, in the implementation apparatus of the classification network, the pruning unit 730 is configured not to perform network slimming pruning on the input channels of the first convolutional layer in the Stem module.
In some embodiments, in the implementation apparatus of the classification network, the pruning unit 730 is configured to determine, according to the number and indices of the output channels of the last convolutional layer in the Stem module, the number and indices of the input channels of the first convolutional layer under each branch in the pruning Inception module.
In some embodiments, in the implementation apparatus of the classification network, the Inception modules include a Mixed_5b module and a Mixed_7a module, where the Mixed_5b module is a pruning Inception module and the Mixed_7a module is a non-pruning Inception module.
In some embodiments, in the implementation apparatus of the classification network, the pruning unit 730 is configured not to perform network slimming pruning on the input channels of the first convolutional layer under each Inception branch in the IR module.
In some embodiments, in the implementation apparatus of the classification network, the pruning unit 730 is configured to determine, according to the number and indices of the output channels of the last convolutional layer in each Inception branch, the number and indices of the input channels of the first convolutional layer connected after the concat layer in the IR module.
In some embodiments, in the implementation apparatus of the classification network, the pruning unit 730 is configured to perform network slimming pruning on three types of IR modules, namely the Block35 module, the Block17 module and the Block8 module.
In some embodiments, the apparatus for implementing a classification network further includes a fine-tuning unit, configured to perform fine-tuning on the classification network after pruning; if the pruning rate of the fine-tuned classification network reaches a preset value, outputting the fine-tuned classification network; otherwise, the sparse unit 720 performs sparse training on the fine-tuned classification network, and the pruning unit 730 prunes the sparsely trained classification network again.
It can be understood that the implementation apparatus for a classification network can implement each step of the implementation method for a classification network provided in the foregoing embodiments, and the related explanations about the implementation method for a classification network are applicable to the implementation apparatus for a classification network, and are not described herein again.
The embodiment of the application further provides a classification network based on the IRv2 algorithm, which comprises a Stem module, a plurality of Inception modules and a plurality of IR modules, wherein the Inception modules include a pruning Inception module and a non-pruning Inception module, the Stem module, the pruning Inception module and the IR modules are subjected to network slimming pruning, and the non-pruning Inception module is not subjected to network slimming pruning. The specific structure of the classification network can refer to fig. 2 to 6.
Fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application. Referring to fig. 8, at the hardware level, the electronic device includes a processor, and optionally further includes an internal bus, a network interface and a memory. The memory may include an internal memory, such as a random-access memory (RAM), and may further include a non-volatile memory, such as at least one disk memory. Of course, the electronic device may also include hardware required for other services.
The processor, the network interface, and the memory may be connected to each other via an internal bus, which may be an ISA (Industry Standard Architecture) bus, a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one double-headed arrow is shown in FIG. 8, but that does not indicate only one bus or one type of bus.
And the memory is used for storing programs. In particular, the program may include program code including computer operating instructions. The memory may include both memory and non-volatile storage and provides instructions and data to the processor.
The processor reads the corresponding computer program from the nonvolatile memory into the memory and then runs the computer program to form the implementation device of the classification network on the logic level. The processor is used for executing the program stored in the memory and is specifically used for executing the following operations:
constructing, based on the IRv2 algorithm, a classification network comprising a Stem module, a plurality of Inception modules and a plurality of IR modules, wherein the plurality of Inception modules include a pruning Inception module and a non-pruning Inception module, and the batch normalization (BN) layers of the classification network use the gamma parameter; initializing the classification network, and performing sparsification training on the initialized classification network to obtain sparsified gamma parameters; and, based on the sparsified gamma parameters, performing network slimming pruning on the Stem module, the pruning Inception module and the plurality of IR modules, and not performing network slimming pruning on the non-pruning Inception module.
The implementation method of the classification network disclosed in the embodiment of fig. 1 of the present application may be applied to or implemented by a processor. The processor may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in a processor or instructions in the form of software. The Processor may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; but also Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other Programmable logic devices, discrete Gate or transistor logic devices, discrete hardware components. The various methods, steps, and logic blocks disclosed in the embodiments of the present application may be implemented or performed. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software module may be located in ram, flash memory, rom, prom, or eprom, registers, etc. storage media as is well known in the art. The storage medium is located in a memory, and a processor reads information in the memory and completes the steps of the method in combination with hardware of the processor.
The electronic device may further execute the method for implementing the classification network in fig. 1, and implement the function of the apparatus for implementing the classification network in the embodiment shown in fig. 7, which is not described herein again in this embodiment of the present application.
An embodiment of the present application further provides a computer-readable storage medium storing one or more programs, where the one or more programs include instructions, which, when executed by an electronic device including a plurality of application programs, enable the electronic device to perform an implementation method of a classification network in the embodiment shown in fig. 1, and are specifically configured to perform:
constructing, based on the IRv2 algorithm, a classification network comprising a Stem module, a plurality of Inception modules and a plurality of IR modules, wherein the plurality of Inception modules include a pruning Inception module and a non-pruning Inception module, and the batch normalization (BN) layers of the classification network use the gamma parameter; initializing the classification network, and performing sparsification training on the initialized classification network to obtain sparsified gamma parameters; and, based on the sparsified gamma parameters, performing network slimming pruning on the Stem module, the pruning Inception module and the plurality of IR modules, and not performing network slimming pruning on the non-pruning Inception module.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include forms of volatile memory in a computer readable medium, Random Access Memory (RAM) and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, a computer readable medium does not include a transitory computer readable medium such as a modulated data signal and a carrier wave.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (10)

1. A method for implementing a classification network includes:
constructing a classification network comprising a Stem module, a plurality of Inception modules and a plurality of IR modules based on an IRv2 algorithm, wherein the plurality of Inception modules comprise a pruning Inception module and a non-pruning Inception module, and a batch normalization BN layer of the classification network uses a gamma parameter;
initializing the classification network, and performing sparsification training on the initialized classification network to obtain sparsified gamma parameters;
based on the sparsified gamma parameters, pruning the classification network after the sparsification training, wherein network slimming pruning is performed on the Stem module, the pruning Inception module and the plurality of IR modules, and network slimming pruning is not performed on the non-pruning Inception module.
2. The method of claim 1, wherein pruning the sparsely trained classification network comprises:
and carrying out no network slim pruning on the input channel of the first convolutional layer in the Stem module.
3. The method of claim 2, wherein pruning the sparsely trained classification network comprises:
and determining the number and the serial number of the input channels of the first convolutional layer under each branch in the pruning increment module according to the number and the serial number of the output channels of the last convolutional layer in the Stem module.
4. The method of claim 1,
the Inception modules comprise a Mixed_5b module and a Mixed_7a module, wherein the Mixed_5b module is a pruning Inception module, and the Mixed_7a module is a non-pruning Inception module.
5. The method of claim 1, wherein pruning the sparsely trained classification network comprises:
and performing no network migration pruning on the input channel of the first convolution layer under each interception branch in the IR module.
6. The method of claim 5, wherein pruning the sparsely trained classification network comprises:
and determining the number and the serial number of the input channels of the first convolution layer connected behind the concat layer in the IR module according to the number and the serial number of the output channels of the last convolution layer in each acceptance branch.
7. The method of claim 1, wherein said performing network slimming pruning on the Stem module, the pruning Inception module, and the plurality of IR modules comprises:
the network slimming pruning was performed on three IR modules, Block35 module, Block17 module, and Block8 module.
8. The method of any one of claims 1 to 7, further comprising:
fine-tuning the classification network subjected to the pruning treatment;
if the pruning rate of the fine-tuned classification network reaches a preset value, outputting the fine-tuned classification network; otherwise, carrying out the sparsification training on the fine-tuned classification network, and carrying out the pruning treatment on the sparsely trained classification network again.
9. An implementation device of a classification network, which is used for implementing the method of any one of claims 1 to 8.
10. A classification network based on the IRv2 algorithm, comprising a Stem module, a plurality of Inception modules and a plurality of IR modules, characterized in that the Inception modules comprise a pruning Inception module and a non-pruning Inception module,
the Stem module, the pruning Inception module and the IR module are subjected to network slimming pruning, and the non-pruning Inception module is not subjected to network slimming pruning.
CN202110113089.2A 2021-01-27 2021-01-27 Classification network and implementation method and device thereof Active CN112766397B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110113089.2A CN112766397B (en) 2021-01-27 2021-01-27 Classification network and implementation method and device thereof
PCT/CN2021/129578 WO2022160856A1 (en) 2021-01-27 2021-11-09 Classification network, and method and apparatus for implementing same

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110113089.2A CN112766397B (en) 2021-01-27 2021-01-27 Classification network and implementation method and device thereof

Publications (2)

Publication Number Publication Date
CN112766397A true CN112766397A (en) 2021-05-07
CN112766397B CN112766397B (en) 2023-12-05

Family

ID=75706220

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110113089.2A Active CN112766397B (en) 2021-01-27 2021-01-27 Classification network and implementation method and device thereof

Country Status (2)

Country Link
CN (1) CN112766397B (en)
WO (1) WO2022160856A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022160856A1 (en) * 2021-01-27 2022-08-04 歌尔股份有限公司 Classification network, and method and apparatus for implementing same

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115049058B (en) * 2022-08-17 2023-01-20 北京智芯微电子科技有限公司 Compression method and device of topology recognition model, electronic equipment and medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109635936A (en) * 2018-12-29 2019-04-16 杭州国芯科技股份有限公司 A kind of neural networks pruning quantization method based on retraining
CN110321999A (en) * 2018-03-30 2019-10-11 北京深鉴智能科技有限公司 Neural computing figure optimization method
CN110866473A (en) * 2019-11-04 2020-03-06 浙江大华技术股份有限公司 Target object tracking detection method and device, storage medium and electronic device
CN111199282A (en) * 2019-12-31 2020-05-26 的卢技术有限公司 Pruning method and device for convolutional neural network model

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111401523A (en) * 2020-03-13 2020-07-10 大连理工大学 Deep learning network model compression method based on network layer pruning
CN111612144B (en) * 2020-05-22 2021-06-15 深圳金三立视频科技股份有限公司 Pruning method and terminal applied to target detection
CN111832705B (en) * 2020-06-30 2024-04-02 南京航空航天大学 Compression method of convolutional neural network and realization circuit thereof
CN112766397B (en) * 2021-01-27 2023-12-05 歌尔股份有限公司 Classification network and implementation method and device thereof

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110321999A (en) * 2018-03-30 2019-10-11 北京深鉴智能科技有限公司 Neural computing figure optimization method
CN109635936A (en) * 2018-12-29 2019-04-16 杭州国芯科技股份有限公司 A kind of neural networks pruning quantization method based on retraining
CN110866473A (en) * 2019-11-04 2020-03-06 浙江大华技术股份有限公司 Target object tracking detection method and device, storage medium and electronic device
CN111199282A (en) * 2019-12-31 2020-05-26 的卢技术有限公司 Pruning method and device for convolutional neural network model

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022160856A1 (en) * 2021-01-27 2022-08-04 歌尔股份有限公司 Classification network, and method and apparatus for implementing same

Also Published As

Publication number Publication date
CN112766397B (en) 2023-12-05
WO2022160856A1 (en) 2022-08-04

Similar Documents

Publication Publication Date Title
CN108701250B (en) Data fixed-point method and device
KR102631381B1 (en) Convolutional neural network processing method and apparatus
US11741339B2 (en) Deep neural network-based method and device for quantifying activation amount
US11748595B2 (en) Convolution acceleration operation method and apparatus, storage medium and terminal device
US20240160891A1 (en) Memory allocation method for ai processor, computer apparatus, and computer-readable storage medium
US11928599B2 (en) Method and device for model compression of neural network
CN112766397A (en) Classification network and implementation method and device thereof
US20220253716A1 (en) Neural network comprising matrix multiplication
CN111709415B (en) Target detection method, device, computer equipment and storage medium
CN112949692A (en) Target detection method and device
Niu et al. Reuse kernels or activations? A flexible dataflow for low-latency spectral CNN acceleration
CN114677548A (en) Neural network image classification system and method based on resistive random access memory
US11410016B2 (en) Selective performance of deterministic computations for neural networks
CN111275166B (en) Convolutional neural network-based image processing device, equipment and readable storage medium
US20240127044A1 (en) Hardware implementation of an attention-based neural network
US20230021204A1 (en) Neural network comprising matrix multiplication
US20220253709A1 (en) Compressing a Set of Coefficients for Subsequent Use in a Neural Network
CN115543945A (en) Model compression method and device, storage medium and electronic equipment
EP3933705A1 (en) Methods and systems for running dynamic recurrent neural networks in hardware
CN114742221A (en) Deep neural network model pruning method, system, equipment and medium
CN113963236A (en) Target detection method and device
US20220261652A1 (en) Training a Neural Network
US20230010180A1 (en) Parafinitary neural learning
US20240144012A1 (en) Method and apparatus for compressing neural network model by using hardware characteristics
CN116755714B (en) Method, device, equipment and storage medium for operating deep neural network model

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant