WO2022141754A1 - Automatic pruning method and platform for general compression architecture of convolutional neural network - Google Patents

Automatic pruning method and platform for general compression architecture of convolutional neural network

Info

Publication number
WO2022141754A1
WO2022141754A1 (PCT/CN2021/075807)
Authority
WO
WIPO (PCT)
Prior art keywords
pruning
network
channel
pruned
model
Prior art date
Application number
PCT/CN2021/075807
Other languages
French (fr)
Chinese (zh)
Inventor
王宏升
单海军
俞再亮
Original Assignee
之江实验室
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 之江实验室
Publication of WO2022141754A1 publication Critical patent/WO2022141754A1/en

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G06N 3/08: Learning methods
    • G06N 3/082: Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G06N 3/084: Backpropagation, e.g. using gradient descent
    • G06N 3/086: Learning methods using evolutionary algorithms, e.g. genetic algorithms or genetic programming

Definitions

  • The invention belongs to the field of neural network model compression, and in particular relates to an automatic pruning method and platform for a general compression architecture of convolutional neural networks.
  • The purpose of the present invention is to provide an automatic pruning method and platform for a general compression architecture of convolutional neural networks, in view of the deficiencies of the prior art.
  • An automatic pruning method for a general compression architecture of convolutional neural networks comprises the following steps:
  • Step 1, construct the channel pruning encoding vector: a random structure sampling method is used to sample the channel width of all convolutional network modules of the convolutional neural network model input by the user, generating a channel pruning encoding vector.
  • Step 2, train the meta-learning channel pruning network: design a pruning cell network; the channel pruning encoding vector generated in step 1 is input into the pruning cell network, whose output is used to construct the weight matrix of the pruning network model and to generate the corresponding pruning network model; the pruning cell network and the corresponding pruning network model are jointly trained on the training data, and the pruning cell network is updated at the same time.
  • Step 3, search for the optimal pruning network model with an evolutionary algorithm: multiple channel pruning encoding vectors that satisfy specific constraints are input into the pruning cell network updated in step 2, which outputs weight matrices and generates multiple corresponding pruning network models; the accuracy of each pruning network model is evaluated; an evolutionary algorithm searches for the pruning network model that satisfies the specific constraints with the highest accuracy, yielding a general compression architecture for convolutional neural networks.
  • The channel pruning encoding vector is specifically as follows: each element of the vector corresponds to the channel width of one convolutional network module; the channel width of each module is randomly sampled to generate the encoding vector; the encoding vector establishes a one-to-one mapping between the convolutional neural network model input by the user and the pruning model, and is used to generate the corresponding pruning network model.
  • During training, a channel pruning encoding vector is generated by randomly selecting the channel width of each layer in each iteration; inputting different encoding vectors generates the corresponding weight matrices and constructs different pruning network models; by randomly generating different encoding vectors, the pruning cell network learns to predict the weights of different pruning network models.
  • The pruning cell network is specifically as follows: it consists of two fully connected layers; its input is the channel pruning encoding vector, and its output is the weight matrix used to generate the pruning network model, as sketched below.
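For illustration, here is a minimal PyTorch sketch of such a two-layer pruning cell network; the class name, hidden width, and the maximum-shape/cropping convention are assumptions made for the example, not the patent's reference implementation.

```python
import torch
import torch.nn as nn

class PruningCellNet(nn.Module):
    """Sketch of the pruning cell (meta-)network: two fully connected
    layers mapping a channel-pruning encoding vector to a flat weight
    vector that is reshaped into a convolution weight matrix."""

    def __init__(self, num_layers, max_out_c, max_in_c, ksize, hidden=64):
        super().__init__()
        self.shape = (max_out_c, max_in_c, ksize, ksize)
        self.fc1 = nn.Linear(num_layers, hidden)
        self.fc2 = nn.Linear(hidden, max_out_c * max_in_c * ksize * ksize)

    def forward(self, encoding):
        # encoding: 1-D tensor of sampled channel widths, one per module
        w = self.fc2(torch.relu(self.fc1(encoding)))
        # Reshape to the largest possible conv weight; when a pruned model
        # is built, the slice matching the sampled widths is used, which
        # is the shape adjustment described in the text.
        return w.view(self.shape)
```

A pruned layer whose sampled widths are (c_out, c_in) would then use the cropped slice `w[:c_out, :c_in]` as its convolution kernel.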
  • Step 2 comprises the following sub-steps: step (2.1), input the channel pruning encoding vector into the pruning cell network and output the weight matrix; step (2.2), construct the pruning network model based on that weight matrix; step (2.3), jointly train the pruning cell network and the pruning network model.
  • In the forward propagation stage of step (2.3), the channel pruning encoding vector is input into the pruning cell network to generate the weight matrix; at the same time, this weight matrix is used to construct the pruning network model corresponding to the currently input encoding vector, and its shape is adjusted to be consistent with the input shape of that pruning network model.
  • In the back-propagation stage of step (2.3), instead of updating the weight matrix of the pruning network model, the gradients of the weights in the pruning cell network are computed; since both the reshape operation and the convolution operation between the output of the pruning cell network and the output of the pruning network model are differentiable, the chain rule is used to compute the gradients, so the pruning cell network can be trained end to end, as illustrated in the sketch below.
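A minimal sketch of one such end-to-end training step, under the simplifying assumption of a single pruned convolution layer and a toy classification head; `PruningCellNet` is the illustrative module sketched above, and only the cell network's parameters sit in the optimizer, mirroring the back-propagation scheme described here.

```python
import torch
import torch.nn.functional as F

cell_net = PruningCellNet(num_layers=1, max_out_c=32, max_in_c=16, ksize=3)
optimizer = torch.optim.SGD(cell_net.parameters(), lr=0.1)

def train_step(images, labels, encoding, c_out, c_in):
    """The cell net generates the pruned layer's weights, the pruned layer
    runs forward, and the loss gradient flows back through the
    differentiable crop/reshape and convolution into the cell network."""
    weight = cell_net(encoding)[:c_out, :c_in]        # crop to sampled widths
    logits = F.conv2d(images[:, :c_in], weight, padding=1)
    # Toy head: labels are assumed to index classes below c_out.
    loss = F.cross_entropy(logits.mean(dim=(2, 3)), labels)
    optimizer.zero_grad()
    loss.backward()           # chain rule: gradients land in cell_net only
    optimizer.step()
    return loss.item()
```

A real pruned network would of course stack many such layers and a proper classifier; the point of the sketch is that the pruned model holds no trainable parameters of its own.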
  • Step 3 comprises the following sub-steps: step (3.1), define the channel pruning encoding vector as the gene of the pruning network model and randomly select an initial population of genes satisfying the specific constraints; step (3.2), evaluate the accuracy of the pruning network model corresponding to each gene and select the top k genes; step (3.3), generate new genes from the top k genes by recombination and mutation; step (3.4), iterate steps (3.2) and (3.3) until the constraint-satisfying model with the highest accuracy is obtained.
  • Gene mutation means randomly changing the values of some elements in a gene; gene recombination means randomly recombining the elements of two parent genes; new genes that do not satisfy the specific constraints are eliminated (see the sketch below).
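A compact sketch of these gene operations, treating a gene as a list of per-layer channel widths; `satisfies_constraint` is an assumed placeholder for the hard-constraint check (for example, a FLOPs budget).

```python
import random

def mutate(gene, widths, prob=0.1):
    """Randomly change some element values of the gene (channel widths)."""
    return [random.choice(widths) if random.random() < prob else c for c in gene]

def recombine(parent_a, parent_b):
    """Randomly recombine elements of two parent genes."""
    return [random.choice(pair) for pair in zip(parent_a, parent_b)]

def make_offspring(parents, widths, n, satisfies_constraint):
    """Generate n new genes, discarding any that violate the constraint."""
    offspring = []
    while len(offspring) < n:
        a, b = random.sample(parents, 2)
        child = mutate(recombine(a, b), widths)
        if satisfies_constraint(child):   # e.g., a FLOPs budget check
            offspring.append(child)
    return offspring
```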
  • A platform based on the above automatic pruning method for the general compression architecture of convolutional neural networks comprises the following components:
  • Data loading component: used to obtain the training data of the convolutional neural network, where the training data are labeled samples that satisfy the supervised learning task.
  • Automatic compression component: used to automatically compress the convolutional neural network model, comprising a pruning vector encoding module, a pruning network generation module, a joint training module for the pruning cell network and the pruning network, a pruning network search module, and a task-specific fine-tuning module.
  • The pruning vector encoding module uses a random structure sampling method to sample the channel width of all convolutional network modules of the neural network model input by the user, generating the channel pruning encoding vector; during forward propagation, the encoding vector is input into the pruning cell network to generate the pruning network of the corresponding structure and the weight matrix of the pruning cell network.
  • The pruning network generation module constructs, based on the pruning cell network, the pruning network corresponding to the currently input channel pruning encoding vector, and adjusts the shape of the weight matrix output by the pruning cell network so that it is consistent with the number of encoder units at the input and output of the pruning structure corresponding to the encoding vector.
  • The joint training module for the pruning cell network and the pruning network trains the pruning cell network end to end; specifically, a simply randomly sampled channel pruning encoding vector and a small batch of training data are input into the pruning network, and the weights of the pruning structure and the weight matrix of the pruning cell network are updated.
  • The pruning network search module searches for the pruning network with the highest accuracy that satisfies specific constraints, using an evolutionary algorithm. The channel pruning encoding vector is input into the trained pruning cell network to generate the weights of the corresponding pruning network, which is evaluated on the validation set to obtain its accuracy. In the evolutionary search over the meta-learning pruning network, the structure of each pruning network is generated from a simply randomly sampled channel pruning encoding vector, so the encoding vector is defined as the gene of the pruning network.
  • Under the given constraints, a series of encoding vectors is first selected as the genes of the pruning networks, and the accuracy of each corresponding pruning network is obtained by evaluation on the validation set; then the top k genes with the highest accuracy are selected, and gene recombination and mutation are used to generate new genes; by iterating the top-k selection and new-gene generation, the gene that satisfies the constraints with the highest accuracy is obtained.
  • The task-specific fine-tuning module fine-tunes the pruning network generated by the automatic compression component for a specific task, using the feature layers and output layer of the pruning network for the specific task scenario, and outputs the final fine-tuned compressed model, namely the compressed convolutional neural network model required by the logged-in user.
  • Inference component: the logged-in user obtains the compressed convolutional neural network model from the platform and uses the compressed model output by the automatic compression component to perform inference on new task-specific data uploaded by the user on datasets from real scenarios; performance comparison information of the inference model before and after compression is presented on the platform's compression-model inference page.
  • The present invention studies a general compression architecture, based on meta-learning channel pruning, for generating compressed versions of multiple convolutional neural networks; an evolutionary algorithm then searches for the optimal compression architecture, resulting in an optimal task-independent general compression architecture for pre-trained convolutional neural network models.
  • The general architecture of multi-task-oriented pre-trained convolutional neural network models can be generated by compression, and the already compressed model architecture can be fully reused to improve compression efficiency on specific tasks.
  • Large-scale image processing models can thus be deployed on edge devices with small memory and limited resources, which promotes the industrial adoption of general deep convolutional neural network models.
  • Figure 1 is the overall architecture diagram of the compression method of the present invention combined with a specific task;
  • Figure 2 is the training flow chart of the meta-learning pruning network;
  • Figure 3 is a diagram of the joint training process of the pruning cell network and the pruning network;
  • Figure 4 is a diagram of the evolutionary-algorithm-based pruning network search architecture.
  • The present invention studies a general compression architecture for generating multiple pre-trained convolutional neural network models based on meta-learning channel pruning. Specifically, the invention first constructs channel pruning encoding vectors, generated by simple random sampling, that prune the network structure of a large model over different convolution channels; it designs a pruning cell meta-network and uses it to generate the pruning network model corresponding to the currently input encoding vector. In each iteration, simple random sampling produces the channel width of each layer's convolution module, forming the corresponding encoding vector.
  • In this way, a pruning cell network that can generate weights for different pruning structures is learned.
  • An evolutionary algorithm is then used to search for the optimal compression structure, thereby obtaining the optimal task-independent general compression architecture for pre-trained convolutional neural network models.
  • The invention addresses over-fitting and the low generalization ability of compressed models when compressing convolutional neural network models with few samples, and deeply explores the feasibility and efficiency of image processing with large-scale deep neural network models under few-sample conditions.
  • Through meta-learning channel pruning, large-scale multi-task pre-trained convolutional neural network models are automatically compressed into a task-independent general architecture that satisfies different hard constraints (such as the number of floating-point operations); when this general architecture is used, a task-specific network is fine-tuned on top of the meta-learning pruning network with a task-specific dataset, fine-tuning only the specific task, which saves computation and improves efficiency.
  • The automatic pruning method of the present invention for a general compression architecture of convolutional neural networks is divided into three steps: the first step constructs a channel pruning encoding vector based on simple random sampling; the second step trains the meta-learning pruning network; the third step searches for the optimal compression structure with an evolutionary algorithm. Specifically:
  • Step 1: construct the channel pruning encoding vector based on simple random sampling.
  • A simple random sampling method samples the channels of all convolutional unit modules of the convolutional neural network model, generating a channel sampling vector, namely the channel pruning encoding vector.
  • Each element of the channel pruning encoding vector corresponds to the channel width of one convolutional network module; the channels of each module are randomly sampled to generate the encoding vector.
  • Through the encoding vector, a one-to-one mapping is established between the input convolutional neural network model and the pruning model, and the corresponding pruning network structure is generated according to the encoding vector.
  • Channel pruning encoding vectors are generated by randomly selecting the channel width of each layer in each iteration.
  • Different pruning network structures are thereby constructed and the corresponding weights generated.
  • The pruning cell network thus learns to predict the weights of different pruning networks.
  • Step 2: train the meta-learning pruning network, as shown in Figure 2.
  • Define the pruning cell network: it takes the channel pruning encoding vector as input, outputs the weight matrix used to construct the pruning network, and generates the corresponding pruning network model; the generated pruning structure is trained and updated with batches of data, thereby updating the cell network; the final output is the weights produced by the pruning cell network after iterative updating.
  • The pruning cell network is a meta-network consisting of two fully connected layers; its input is the channel pruning encoding vector constructed in the first stage, and its output is the weight matrix used to generate the pruning network model.
  • Training the pruning cell network comprises the following sub-steps:
  • Sub-step 1: during forward propagation, the channel pruning encoding vector is input into the pruning cell network, which outputs the weight matrix.
  • Sub-step 2: as shown in Figure 3, the process of constructing the pruning network model based on the pruning cell network is as follows:
  • In the channel pruning encoding vector constructed in the first stage, each element c_i corresponds to the convolution channel width of the i-th convolutional unit module; channel sampling is performed on each convolution layer of the original network input by the user to generate the encoding vector. That is, for each channel sampled with value c_i, the pruning cell network generates the i-th convolutional unit module of the compressed model together with its weights; the encoding vector establishes a one-to-one mapping between the original model and the compressed model, and the corresponding pruning network structure is generated according to the encoding vector.
  • Sub-step 3: as shown in Figure 3, the process of jointly training the pruning cell network and the pruning network model is as follows:
  • A small batch of training data is input into the pruning network model generated in sub-step 2 for model training.
  • After the pruning network model updates its parameters (the weight matrix), the cell network updates its parameters accordingly; that is, during back-propagation the pruning network model and the cell network are updated together. The gradients of the weights output by the cell network can be computed with the chain rule, so the cell network can be trained end to end.
  • The simple random sampling method samples the channels of the convolutional unit modules of each layer, constructing different channel pruning encoding vectors.
  • The same training dataset is used for multiple iterations of training, each iteration based on one channel pruning encoding vector.
  • The cell network and the pruning network model are trained simultaneously; by varying the input encoding vector, a cell network that can generate weight matrices for different pruning network models is learned.
  • The shape of the weight matrix output by the cell network is adjusted according to the convolution channel width and position of the i-th convolutional unit module corresponding to element c_i of the encoding vector.
  • Figure 4 shows the process of evolutionary-algorithm-based pruning network search:
  • Step 1: each pruning network model is generated from a pruning encoding vector obtained by simple random sampling of the convolution channels of each layer's convolutional unit modules, so the channel pruning encoding vector is defined as the gene G of the pruning network model; a series of genes satisfying the constraint C is randomly selected as the initial population.
  • Step 2: evaluate, on the validation set, the inference accuracy of the pruning network model corresponding to each gene G_i in the existing population, and select the top k genes with the highest accuracy.
  • Step 3: perform gene recombination and gene mutation on the top k genes selected in step 2 to generate new genes, and add them to the existing population.
  • Gene mutation means randomly changing the values of some elements in a gene; gene recombination means randomly recombining the genes of two parents to produce offspring; the constraint C is easily enforced by eliminating unqualified genes.
  • Step 4: repeat steps 2 and 3 for N rounds, selecting the top k most accurate genes in the existing population and generating new genes, until the gene satisfying the constraint C with the highest accuracy is obtained.
  • The platform of the present invention, based on the above automatic pruning method for a general compression architecture of convolutional neural networks, comprises the following components:
  • Data loading component: used to obtain training samples of the convolutional neural network, where the training samples are labeled samples that satisfy the supervised learning task.
  • Automatic compression component: used to automatically compress the convolutional neural network model, comprising a pruning vector encoding module, a pruning network generation module, a joint training module for the pruning cell network and the pruning network, a pruning network search module, and a task-specific fine-tuning module.
  • The pruning vector encoding module uses a random structure sampling method to sample the channel width of all convolutional network modules of the neural network model input by the user, generating the channel pruning encoding vector; during forward propagation, the encoding vector is input into the pruning cell network to generate the pruning network of the corresponding structure and the weight matrix of the pruning cell network.
  • The pruning network generation module constructs, based on the pruning cell network, the pruning network corresponding to the currently input channel pruning encoding vector, and adjusts the shape of the weight matrix output by the pruning cell network so that it is consistent with the number of encoder units at the input and output of the pruning structure corresponding to the encoding vector.
  • The joint training module for the pruning cell network and the pruning network trains the pruning cell network end to end; specifically, a simply randomly sampled channel pruning encoding vector and a small batch of training data are input into the pruning network, and the weights of the pruning structure and the weight matrix of the pruning cell network are updated.
  • The pruning network search module searches for the pruning network with the highest accuracy that satisfies specific constraints, using an evolutionary algorithm. The channel pruning encoding vector is input into the trained pruning cell network to generate the weights of the corresponding pruning network, which is evaluated on the validation set to obtain its accuracy. In the evolutionary search over the meta-learning pruning network, the structure of each pruning network is generated from a simply randomly sampled channel pruning encoding vector, so the encoding vector is defined as the gene of the pruning network.
  • Under the given constraints, a series of encoding vectors is first selected as the genes of the pruning networks, and the accuracy of each corresponding pruning network is obtained by evaluation on the validation set; then the top k genes with the highest accuracy are selected, and gene recombination and mutation are used to generate new genes; by iterating the top-k selection and new-gene generation, the gene that satisfies the constraints with the highest accuracy is obtained.
  • The task-specific fine-tuning module fine-tunes the pruning network generated by the automatic compression component for a specific task, using the feature layers and output layer of the pruning network for the specific task scenario, and outputs the final fine-tuned compressed model, namely the compressed convolutional neural network model required by the logged-in user.
  • Inference component: the logged-in user obtains the compressed convolutional neural network model from the platform and uses the compressed model output by the automatic compression component to perform inference on new task-specific data uploaded by the user on datasets from real scenarios; performance comparison information of the inference model before and after compression is presented on the platform's compression-model inference page.
  • The ImageNet2012 classification dataset uploaded by the logged-in user is obtained through the platform's data loading component.
  • A sub-validation dataset is split off from the original training images: it contains 50,000 images, obtained by randomly selecting 50 training images from each of the 1,000 classes; the remaining samples constitute the sub-training dataset.
  • The invention trains the cell network on the sub-training dataset and evaluates the performance of pruning networks on the sub-validation dataset during the search stage.
  • A pre-trained compressed convolutional neural network model is thereby generated.
  • The pre-trained compressed model generated by the automatic compression component is loaded through the platform, and the model for the classification task is fine-tuned on the generated pre-trained model.
  • The compressed model is fine-tuned by the task-specific fine-tuning module of the automatic compression component, using the feature layers and output layer of the pre-trained model for the image classification scenario; finally, the platform outputs the compressed convolutional neural network model for the image classification task required by the logged-in user.
  • The compressed model is output to a designated container for the logged-in user to download, and model performance comparison information before and after compression is presented on the platform's output page.
  • Table 1 compares the top-1 accuracy of the original MobileNet V2 network model and of the meta-learning channel pruning network.
  • The original MobileNet V2 model achieves 72.0% top-1 accuracy at 313M floating-point operations, while the meta-learning channel pruning network achieves 72.7% top-1 accuracy with only 219M floating-point operations.
  • Table 1: MobileNet V2 model comparison before and after compression on the ImageNet2012 image classification task (1,000 classes)
        Top-1 accuracy: 72.0% before compression, 72.7% after compression
        Floating-point operations: 313M before compression, 291M after compression
  • Inference component of the platform: the compressed model output by the platform is used to perform inference on the ImageNet2012 test set data uploaded by the logged-in user; inference of the compressed model is run on 8 Nvidia 1080Ti GPUs, and performance information before and after compression is presented on the platform's compression-model inference page.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Physiology (AREA)
  • Image Analysis (AREA)

Abstract

Disclosed in the present invention are an automatic pruning method and platform for a general compression architecture of convolutional neural networks. The method comprises: first, randomly sampling the channel widths of the convolution modules of an input model to generate a channel pruning encoding vector; then designing a pruning cell network, inputting the channel pruning encoding vector into the cell network, outputting a weight matrix used for constructing a pruned network model, generating the corresponding pruned network model, and jointly training the pruning cell network and the generated pruned network model to update the pruning cell network; and finally using the weights generated by the trained pruning cell network to search for the pruned network with the best performance, with no fine-tuning needed during the search. By training a single pruned network of a target network, the user can search for a variety of pruned networks under different constraint conditions with little manual involvement, accelerating the search for high-performance neural network structures.

Description

An Automatic Pruning Method and Platform for a General Compression Architecture of Convolutional Neural Networks

Technical Field

The invention belongs to the field of neural network model compression, and in particular relates to an automatic pruning method and platform for a general compression architecture of convolutional neural networks.

Background Art
Large-scale deep convolutional neural network models have achieved excellent performance on tasks such as image recognition and object detection. However, deploying pre-trained models with massive numbers of parameters on devices with limited memory remains a major challenge. In the field of model compression, existing neural network quantization methods quantize floating-point weights into low-bit weights (e.g., 8-bit or 1-bit) to reduce model size, but the quantization error they introduce makes training the neural network very difficult. In contrast, channel pruning reduces model size and accelerates inference by directly removing redundant channels, so fast inference requires almost no extra work; moreover, after channel pruning the model is more compact and easier to quantize.

Existing channel pruning methods mainly rely on data-driven sparsity constraints or manually designed pruning strategies. Since a convolutional neural network typically contains a large number of convolutional unit modules, and the channel width of each module usually grows layer by layer as the network deepens, there are hundreds of millions of possible ways to prune the convolution channels. Given limited computing resources, manually enumerating all possible pruning structures and finding the optimal one is practically impossible.
Summary of the Invention

The purpose of the present invention is to provide an automatic pruning method and platform for a general compression architecture of convolutional neural networks, in view of the deficiencies of the prior art.

The purpose of the present invention is achieved through the following technical solution: an automatic pruning method for a general compression architecture of convolutional neural networks, comprising the following steps:
Step 1, construct the channel pruning encoding vector: a random structure sampling method is used to sample the channel width of all convolutional network modules of the convolutional neural network model input by the user, generating a channel pruning encoding vector.

Step 2, train the meta-learning channel pruning network: design a pruning cell network; the channel pruning encoding vector generated in step 1 is input into the pruning cell network, whose output is used to construct the weight matrix of the pruning network model and to generate the corresponding pruning network model; the pruning cell network and the corresponding pruning network model are jointly trained on the training data, and the pruning cell network is updated at the same time.

Step 3, search for the optimal pruning network model with an evolutionary algorithm: multiple channel pruning encoding vectors that satisfy specific constraints are input into the pruning cell network updated in step 2, which outputs weight matrices and generates multiple corresponding pruning network models; the accuracy of each pruning network model is evaluated; an evolutionary algorithm searches for the pruning network model that satisfies the specific constraints with the highest accuracy, yielding the general compression architecture of the convolutional neural network.
Further, the channel pruning encoding vector is specifically as follows: each element of the channel pruning encoding vector corresponds to the channel width of one convolutional network module; the channel width of each module is randomly sampled to generate the channel pruning encoding vector; the encoding vector establishes a one-to-one mapping between the convolutional neural network model input by the user and the pruning model, and is used to generate the corresponding pruning network model.

Further, in the training phase, a channel pruning encoding vector is generated by randomly selecting the channel width of each layer in each iteration; by inputting different channel pruning encoding vectors, the corresponding weight matrices are generated and different pruning network models are constructed; by randomly generating different encoding vectors, the pruning cell network learns to predict the weights of different pruning network models.

Further, the pruning cell network is specifically as follows: it consists of two fully connected layers; its input is the channel pruning encoding vector and its output is the weight matrix used to generate the pruning network model.
Further, step 2 comprises the following sub-steps:

Step (2.1): input the channel pruning encoding vector into the pruning cell network and output the weight matrix;

Step (2.2): construct the pruning network model based on the weight matrix output by the pruning cell network;

Step (2.3): jointly train the pruning cell network and the pruning network model: input the training data into the pruning network model generated in step (2.2) for model training, and update the pruning cell network at the same time.
Further, in the forward propagation stage of step (2.3), the channel pruning encoding vector is input into the pruning cell network to generate the weight matrix; at the same time, the weight matrix generated by the pruning cell network is used to construct the pruning network model corresponding to the currently input encoding vector; the shape of the weight matrix output by the pruning cell network is adjusted to be consistent with the input shape of the pruning network model corresponding to the encoding vector.

Further, in the back-propagation stage of step (2.3), instead of updating the weight matrix of the pruning network model, the gradients of the weights in the pruning cell network are computed; since both the reshape operation and the convolution operation between the output of the pruning cell network and the output of the pruning network model are differentiable, the chain rule is used to compute the gradients of the weights of the pruning network model, so that the pruning cell network can be trained end to end.
Further, step 3 comprises the following sub-steps:

Step (3.1): define the channel pruning encoding vector as the gene of the pruning network model, and randomly select a series of genes that satisfy the specific constraints as the initial population;

Step (3.2): evaluate the accuracy of the pruning network model corresponding to each gene in the existing population, and select the top k genes with the highest accuracy;

Step (3.3): perform gene recombination and gene mutation on the top k genes selected in step (3.2) to generate new genes, and add the new genes to the existing population;

Step (3.4): repeat steps (3.2) and (3.3) iteratively, selecting the top k most accurate genes in the existing population and generating new genes; after the set number of rounds, the pruning network model that satisfies the specific constraints with the highest accuracy is obtained.

Further, in step (3.3), gene mutation means randomly changing the values of some elements in a gene; gene recombination means randomly recombining the elements of two parent genes; new genes that do not satisfy the specific constraints are eliminated.
A platform based on the above automatic pruning method for the general compression architecture of convolutional neural networks comprises the following components:

Data loading component: used to obtain the training data of the convolutional neural network, where the training data are labeled samples that satisfy the supervised learning task.

Automatic compression component: used to automatically compress the convolutional neural network model, comprising a pruning vector encoding module, a pruning network generation module, a joint training module for the pruning cell network and the pruning network, a pruning network search module, and a task-specific fine-tuning module.

The pruning vector encoding module uses a random structure sampling method to sample the channel width of all convolutional network modules of the neural network model input by the user, generating the channel pruning encoding vector; during forward propagation, the encoding vector is input into the pruning cell network to generate the pruning network of the corresponding structure and the weight matrix of the pruning cell network.

The pruning network generation module constructs, based on the pruning cell network, the pruning network corresponding to the currently input channel pruning encoding vector, and adjusts the shape of the weight matrix output by the pruning cell network so that it is consistent with the number of encoder units at the input and output of the pruning structure corresponding to the encoding vector.

The joint training module for the pruning cell network and the pruning network trains the pruning cell network end to end; specifically, a simply randomly sampled channel pruning encoding vector and a small batch of training data are input into the pruning network, and the weights of the pruning structure and the weight matrix of the pruning cell network are updated.

The pruning network search module searches for the pruning network with the highest accuracy that satisfies specific constraints, using an evolutionary algorithm. The channel pruning encoding vector is input into the trained pruning cell network to generate the weights of the corresponding pruning network, which is evaluated on the validation set to obtain its accuracy. In the evolutionary search over the meta-learning pruning network, the structure of each pruning network is generated from a simply randomly sampled channel pruning encoding vector, so the encoding vector is defined as the gene of the pruning network. Under the given constraints, a series of encoding vectors is first selected as the genes of the pruning networks, and the accuracy of each corresponding pruning network is obtained by evaluation on the validation set; then the top k genes with the highest accuracy are selected, and gene recombination and mutation are used to generate new genes; by iterating the top-k selection and new-gene generation, the gene that satisfies the constraints with the highest accuracy is obtained.

The task-specific fine-tuning module fine-tunes the pruning network generated by the automatic compression component for a specific task, using the feature layers and output layer of the pruning network for the specific task scenario, and outputs the final fine-tuned compressed model, namely the compressed convolutional neural network model required by the logged-in user; the compressed model is output to a designated container for the logged-in user to download, and model performance comparison information before and after compression is presented on the platform's output page.

Inference component: the logged-in user obtains the compressed convolutional neural network model from the platform and uses the compressed model output by the automatic compression component to perform inference on new task-specific data uploaded by the user on datasets from real scenarios; performance comparison information of the inference model before and after compression is presented on the platform's compression-model inference page.
The beneficial effects of the present invention are as follows. First, the invention studies a general compression architecture, based on meta-learning channel pruning, for generating compressed versions of multiple convolutional neural networks; an evolutionary algorithm then searches for the optimal compression architecture, yielding an optimal task-independent general compression architecture for pre-trained convolutional neural network models. Second, using the multi-task-oriented automatic compression platform of the present invention, the general architecture of pre-trained convolutional neural network models can be generated by compression, the already compressed model architecture can be fully reused to improve compression efficiency on specific tasks, and large-scale image processing models can be deployed on edge devices with small memory and limited resources, promoting the industrial adoption of general deep convolutional neural network models.
Brief Description of the Drawings

Figure 1 is the overall architecture diagram of the compression method of the present invention combined with a specific task;

Figure 2 is the training flow chart of the meta-learning pruning network;

Figure 3 is a diagram of the joint training process of the pruning cell network and the pruning network;

Figure 4 is a diagram of the evolutionary-algorithm-based pruning network search architecture.

Detailed Description of Embodiments
Inspired by neural network architecture search, especially in few-sample settings, automatic machine learning can perform automatic model compression iteratively based on a feedback loop. The present invention studies a general compression architecture for generating multiple pre-trained convolutional neural network models based on meta-learning channel pruning. Specifically, the invention first constructs channel pruning encoding vectors, generated by simple random sampling, that prune the network structure of a large model over different convolution channels. A meta-network, the pruning cell network, is designed and used to generate the pruning network model corresponding to the currently input encoding vector. In each iteration, simple random sampling produces the channel width of each layer's convolution module, forming the corresponding encoding vector. By varying the encoding vector input to the pruning cell network and the mini-batch of training data, and jointly training the pruning cell network and the corresponding pruning structure, a pruning cell network that can generate weights for different pruning structures is learned. Meanwhile, on the basis of the trained meta-learning network, an evolutionary algorithm searches for the optimal compression structure, yielding the optimal task-independent general compression architecture for pre-trained convolutional neural network models. The invention addresses over-fitting and the low generalization ability of compressed models when compressing convolutional neural network models with few samples, deeply explores the feasibility and key techniques of image processing with large-scale deep neural network models under few-sample conditions, and improves the flexibility and effectiveness of compressed models across multiple specific tasks. Compared with existing pruning methods, meta-learning channel pruning completely frees engineers from tedious hyperparameter tuning while allowing multiple target metrics to be optimized directly. Compared with other automatic machine learning methods, it easily enforces conditional constraints while searching for the desired compression structure, without the manual hyperparameter tuning required by reinforcement learning. The application route of the compression method of the present invention is shown in Figure 1: based on image processing datasets, meta-learning channel pruning and evolutionary-algorithm-based automatic pruning network search are studied; through meta-learning channel pruning, large-scale multi-task pre-trained convolutional neural network models are automatically compressed into a task-independent general architecture that satisfies different hard constraints (such as the number of floating-point operations); when this general architecture is used, a task-specific network is fine-tuned on top of the meta-learning pruning network with a task-specific dataset, fine-tuning only the specific task, which saves computation and improves efficiency.
The automatic pruning method of the present invention for a general compression architecture of convolutional neural networks is divided into three steps: the first step constructs a channel pruning encoding vector based on simple random sampling; the second step trains the meta-learning pruning network; the third step searches for the optimal compression structure with an evolutionary algorithm. Specifically:
Step 1: construct the channel pruning encoding vector based on simple random sampling. A simple random sampling method samples the channels of all convolutional unit modules of the convolutional neural network model, generating a channel sampling vector, namely the channel pruning encoding vector.

Specifically, each element of the channel pruning encoding vector corresponds to the channel width of one convolutional network module; the channels of each module are randomly sampled to generate the encoding vector; the encoding vector establishes a one-to-one mapping between the convolutional neural network model input by the user and the pruning model, and the corresponding pruning network structure is generated according to the encoding vector.

In the training phase, a channel pruning encoding vector is generated in each iteration by randomly selecting the channel width of each layer, as in the sketch below. By inputting different encoding vectors, different pruning network structures are constructed and the corresponding weights are generated. By randomly generating different encoding vectors, the pruning cell network learns to predict the weights of different pruning networks.
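As a sketch of this sampling step (the per-layer width grid below is an illustrative assumption; the patent specifies only simple random sampling of each layer's channel width):

```python
import random

# Assumed maximum channel widths of the user's network, one per module.
max_widths = [32, 64, 128, 256]

def sample_encoding_vector(max_widths, ratios=(0.25, 0.5, 0.75, 1.0)):
    """Simple random sampling: independently pick one channel width per
    convolutional module, forming one channel pruning encoding vector."""
    return [int(m * random.choice(ratios)) for m in max_widths]

encoding = sample_encoding_vector(max_widths)   # e.g. [8, 64, 32, 256]
```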
Step 2: train the meta-learning pruning network, as shown in Figure 2. Define the pruning cell network, which takes the channel pruning encoding vector as input, outputs the weight matrix used to construct the pruning network, and generates the corresponding pruning network model; the generated pruning structure is trained and updated with batches of data, thereby updating the cell network; the final output is the weights produced by the pruning cell network after iterative updating.

Define the pruning cell network: the pruning cell network is a meta-network consisting of two fully connected layers; its input is the channel pruning encoding vector constructed in the first stage, and its output is the weight matrix used to generate the pruning network model.
Training the pruning cell network comprises the following sub-steps:

Sub-step 1: during forward propagation, the channel pruning encoding vector is input into the pruning cell network, which outputs the weight matrix.

Sub-step 2: as shown in Figure 3, the process of constructing the pruning network model based on the pruning cell network is as follows:

According to the channel pruning encoding vector constructed in the first stage, in which each element c_i corresponds to the convolution channel width of the i-th convolutional unit module, channel sampling is performed on each convolution layer of the original network input by the user to generate the encoding vector. That is, for each channel sampled with value c_i, the pruning cell network generates the i-th convolutional unit module of the compressed model together with its weights. Through the encoding vector, a one-to-one mapping is established between the original model and the compressed model, and the corresponding pruning network structure is generated according to the encoding vector.
Sub-step 3: as shown in Figure 3, the process of jointly training the pruning cell network and the pruning network model is as follows:

A mini-batch of training data is input into the pruning network model generated in sub-step 2 for training. After the pruning network model updates its parameters (the weight matrix), the cell network updates its parameters accordingly; that is, during back-propagation the pruning network model and the cell network are updated together. The gradients of the weights output by the cell network can be computed with the chain rule, so the cell network can be trained end to end.

Channel sampling is performed on the convolutional unit modules of each layer with the simple random sampling method to construct different channel pruning encoding vectors; the same training dataset is used for multiple iterations of training, each iteration training the cell network and the pruning network model simultaneously on one encoding vector. By varying the input encoding vector, a cell network that can generate weight matrices for different pruning network models is learned, as in the sketch below.

Moreover, the shape of the weight matrix output by the cell network must be adjusted so that it matches the number of encoder units at the input and output of the pruning network corresponding to the encoding vector; specifically, the shape is adjusted according to the convolution channel width and position of the i-th convolutional unit module corresponding to element c_i of the encoding vector.
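Putting the pieces together, the meta-training loop might look like the sketch below; `train_step` is the illustrative single-layer helper sketched earlier in this document, `train_loader` is a placeholder DataLoader, and the width grid is assumed.

```python
import random
import torch

for images, labels in train_loader:          # placeholder DataLoader
    # Resample a channel width (i.e., a fresh encoding vector) each
    # iteration, so the cell network sees many pruning structures.
    c_out = random.choice([8, 16, 24, 32])
    encoding = torch.tensor([float(c_out)])  # one-layer toy, as above
    # Toy labels are assumed to lie below the smallest sampled c_out.
    loss = train_step(images, labels, encoding, c_out=c_out, c_in=3)
```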
Step 3: Figure 4 shows the process of evolutionary-algorithm-based pruning network search.

On the basis of the meta-learning pruning network trained in the second step, multiple channel pruning encoding vectors that satisfy specific constraints are input into the pruning cell network to generate the corresponding weight matrices and obtain multiple pruning network models; each pruning network model is evaluated on the validation set to obtain its accuracy; an evolutionary algorithm searches for the pruning network model with the highest accuracy that satisfies the specific constraints (such as the number of floating-point operations), thereby obtaining the task-independent general compression architecture of pre-trained convolutional neural network models, shown as the boxed Network2 in Figure 4. The specific steps of the evolutionary search algorithm are as follows:
Step 1: Each pruned network model is generated from a pruning encoding vector obtained by simple random sampling of the convolution channels of each layer's convolutional unit module, so the channel pruning encoding vector is defined as the gene G of the pruned network model. A set of genes satisfying the constraint C is randomly selected as the initial population.
Step 2: Evaluate, on the validation set, the inference accuracy of the pruned network model corresponding to each gene G_i in the existing population, and select the top k genes with the highest accuracy.
Step 3: Use the top k genes selected in Step 2 to perform gene recombination and gene mutation, generating new genes that are added to the existing population. Gene mutation means randomly changing the values of some elements in a gene; gene recombination means randomly recombining the genes of two parents to produce offspring. The constraint C can easily be enforced by eliminating unqualified genes.
Step 4: Repeat Step 2 and Step 3 for N iterations, selecting the top k highest-accuracy genes in the existing population and generating new genes, until the gene that satisfies the constraint C with the highest accuracy is obtained.
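The loop formed by these four steps can be sketched as follows (hypothetical code reusing sample_encoding_vector and candidate_widths from the earlier sketches; flops and accuracy are assumed helpers that compute a gene's floating-point cost and evaluate its pruned network on the validation set):

```python
import random

def evolutionary_search(flops, accuracy, flops_limit,
                        pop_size=50, top_k=10, n_rounds=20, p_mut=0.1):
    """Evolutionary search over genes G (channel pruning encoding vectors)
    under the constraint C: flops(G) <= flops_limit. Numbers are assumptions."""
    def random_gene():
        while True:                              # rejection-sample the constraint C
            g = sample_encoding_vector()
            if flops(g) <= flops_limit:
                return g

    population = [random_gene() for _ in range(pop_size)]     # Step 1
    for _ in range(n_rounds):                                 # Step 4: N rounds
        population.sort(key=accuracy, reverse=True)           # Step 2: evaluate
        parents = population[:top_k]
        children = []
        while len(children) < pop_size - top_k:               # Step 3
            a, b = random.sample(parents, 2)
            child = [random.choice(pair) for pair in zip(a, b)]   # recombination
            child = [random.choice(candidate_widths[i]) if random.random() < p_mut
                     else c for i, c in enumerate(child)]         # mutation
            if flops(child) <= flops_limit:      # eliminate unqualified genes
                children.append(child)
        population = parents + children
    return max(population, key=accuracy)         # best feasible gene found
```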
A platform of the present invention for the automatic pruning method based on the above general compression architecture of convolutional neural networks comprises the following components:
Data loading component: used to obtain the training samples of the convolutional neural network, where the training samples are labeled samples satisfying the supervised learning task.
Automatic compression component: used to automatically compress the convolutional neural network model, comprising a pruning vector encoding module, a pruned network generation module, a joint training module for the pruning cell network and the pruned network, a pruned network search module, and a task-specific fine-tuning module.
The pruning vector encoding module uses the random structure sampling method to sample the channel widths of all convolutional network modules of the neural network model input by the user, generating a channel pruning encoding vector. During forward propagation, the channel pruning encoding vector is input into the pruning cell network to generate the pruned network of the corresponding structure and the weight matrix of the pruning cell network.
The pruned network generation module constructs, based on the pruning cell network, the pruned network corresponding to the currently input channel pruning encoding vector, and adjusts the shape of the weight matrix output by the pruning cell network so that it is consistent with the number of input and output encoder units of the pruned structure corresponding to the channel pruning encoding vector.
The joint training module for the pruning cell network and the pruned network trains the pruning cell network end to end. Specifically, a simple-random-sampled channel pruning encoding vector and a small batch of training data are input into the pruned network, and the weights of the pruned structure and the weight matrix of the pruning cell network are updated.
The pruned network search module searches for the highest-accuracy pruned network satisfying specific constraints; an evolutionary algorithm is proposed for this search. A channel pruning encoding vector is input into the trained pruning cell network to generate the weights of the corresponding pruned network, and the pruned network is evaluated on the validation set to obtain its accuracy. In the evolutionary search algorithm employed for the meta-learned pruning network, the structure of each pruned network is generated from a simple-random-sampled channel pruning encoding vector, so the channel pruning encoding vector is defined as the gene of the pruned network. Under the specific constraints, a set of channel pruning encoding vectors is first selected as genes of the pruned network, and the accuracy of each corresponding pruned network is obtained by evaluation on the validation set; then the top k genes with the highest accuracy are selected, and gene recombination and mutation are used to generate new genes. By further iterating the process of selecting the top k best genes and generating new genes, the gene that satisfies the constraints with the highest accuracy is obtained.
The task-specific fine-tuning module fine-tunes, on the pruned network generated by the automatic compression component, a network for the specific task, using the feature layers and the output layer of the pruned network to fine-tune for the specific task scenario, and outputs the final fine-tuned compressed model, namely the compressed model of the convolutional neural network model required by the logged-in user. The compressed model is output to a designated container for the logged-in user to download, and performance comparison information of the model before and after compression is presented on the platform's compressed-model output page.
Inference component: the logged-in user obtains the compressed model of the convolutional neural network from the platform and uses the compressed model output by the automatic compression component to perform inference, on a real-scenario dataset, over new data of the specific task uploaded by the logged-in user; performance comparison information of the inference model before and after compression is presented on the platform's compressed-model inference page.
An automatic compression experiment of a convolutional neural network model is carried out below on the ImageNet2012 classification dataset. The technical solution of the present invention is further described in detail with respect to this image classification task.
The ImageNet2012 classification dataset uploaded by the logged-in user is obtained through the data loading component of the platform. For training, the original training images are split to form a sub-validation dataset of 50,000 images, obtained by randomly selecting 50 training images from each of the 1,000 classes; all remaining samples constitute the sub-training dataset. The present invention trains the cell network on the sub-training dataset and, during the search stage, evaluates the performance of the pruned networks on the sub-validation dataset.
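A split of this kind can be reproduced with a short helper (our own sketch; it assumes samples is a list of (image, label) pairs):

```python
import collections
import random

def split_subsets(samples, per_class=50, num_classes=1000, seed=0):
    """Randomly hold out `per_class` images of each class as the sub-validation
    set; all remaining samples form the sub-training set."""
    rng = random.Random(seed)
    by_class = collections.defaultdict(list)
    for sample in samples:                # sample = (image, label)
        by_class[sample[1]].append(sample)
    sub_train, sub_val = [], []
    for label in range(num_classes):
        group = by_class[label]
        rng.shuffle(group)
        sub_val.extend(group[:per_class])
        sub_train.extend(group[per_class:])
    return sub_train, sub_val
```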
A pre-trained compressed convolutional neural network model is generated through the automatic compression component of the platform.
The pre-trained compressed model generated by the automatic compression component is loaded through the platform, and the model for the classification task is fine-tuned on the generated pre-trained model.
Fine-tuning is performed on the compressed model obtained by the task-specific fine-tuning module of the automatic compression component, using the feature layers and the output layer of the pre-trained model generated by the automatic compression component to fine-tune for the image classification task scenario. Finally, the platform outputs the compressed convolutional neural network model for the image classification task required by the logged-in user.
The compressed model is output to a designated container for the logged-in user to download, and performance comparison information of the model before and after compression is presented on the platform's compressed-model output page. Table 1 compares the top-1 accuracy of the original MobileNet V2 network model with that of the meta-learned channel pruning network. The original MobileNet V2 network model achieves 72.0% top-1 accuracy at 313M floating-point operations, whereas the meta-learned channel pruning network reaches 72.7% top-1 accuracy at only 291M floating-point operations.
Table 1: Comparison of the MobileNet V2 model before and after compression on the image classification task

ImageNet2012 (1,000 image classes)    Before compression    After compression
Top-1 accuracy                        72.0%                 72.7%
Floating-point operations             313M                  291M
Through the inference component of the platform, the compressed model output by the platform is used to perform inference on the ImageNet2012 test set data uploaded by the logged-in user. The compressed model is run on eight Nvidia 1080Ti GPUs, and performance information before and after compression is presented on the platform's compressed-model inference page.

Claims (10)

  1. An automatic pruning method for a general compression architecture of a convolutional neural network, characterized in that it comprises the following steps:
    Step 1: Construct a channel pruning encoding vector: use the random structure sampling method to sample the channel widths of all convolutional network modules of the convolutional neural network model input by the user, generating a channel pruning encoding vector;
    Step 2: Train the meta-learned channel pruning network: design a pruning cell network; input the channel pruning encoding vector generated in Step 1 into the pruning cell network, whose output is used to construct the weight matrices of a pruned network model and generate the corresponding pruned network model; use the training data to jointly train the pruning cell network and the corresponding pruned network model, updating the pruning cell network at the same time;
    Step 3: Search for the optimal pruned network model based on an evolutionary algorithm: input multiple channel pruning encoding vectors satisfying specific constraints into the pruning cell network updated in Step 2, output the weight matrices, and generate multiple corresponding pruned network models; evaluate the accuracy of each pruned network model; use an evolutionary algorithm to search for the pruned network model that satisfies the specific constraints with the highest accuracy, obtaining the general compression architecture of the convolutional neural network.
  2. The automatic pruning method for a general compression architecture of a convolutional neural network according to claim 1, characterized in that the channel pruning encoding vector is specifically: each element of the channel pruning encoding vector corresponds to the channel width of one convolutional network module; the channel width of each convolutional network module is randomly sampled to generate the channel pruning encoding vector; a one-to-one mapping is established between the convolutional neural network model input by the user and the pruned model through the channel pruning encoding vector; and the channel pruning encoding vector is used to generate the corresponding pruned network model.
  3. The automatic pruning method for a general compression architecture of a convolutional neural network according to claim 2, characterized in that, in the training stage, the channel pruning encoding vector is generated by randomly selecting the channel width of each layer in each iteration; by inputting different channel pruning encoding vectors, the corresponding weight matrices are generated and different pruned network models are constructed; and by randomly generating different encoding vectors, the pruning cell network learns to predict the weights of different pruned network models.
  4. The automatic pruning method for a general compression architecture of a convolutional neural network according to claim 3, characterized in that the pruning cell network is specifically: the pruning cell network consists of two fully connected layers, its input is the channel pruning encoding vector, and its output is the weight matrix used to generate the pruned network model.
  5. The automatic pruning method for a general compression architecture of a convolutional neural network according to claim 4, characterized in that said Step 2 comprises the following sub-steps:
    Step (2.1): Input the channel pruning encoding vector into the pruning cell network and output the weight matrix;
    Step (2.2): Construct the pruned network model based on the weight matrix output by the pruning cell network;
    Step (2.3): Jointly train the pruning cell network and the pruned network model: input the training data into the pruned network model generated in Step (2.2) for model training, updating the pruning cell network at the same time.
  6. The automatic pruning method for a general compression architecture of a convolutional neural network according to claim 5, characterized in that said Step (2.3) is specifically: in the forward propagation stage, the channel pruning encoding vector is input into the pruning cell network to generate the weight matrix; at the same time, the weight matrix generated by the pruning cell network is used to construct the pruned network model corresponding to the currently input channel pruning encoding vector; and the shape of the weight matrix output by the pruning cell network is adjusted to be consistent with the shape of the input of the pruned network model corresponding to the channel pruning encoding vector.
  7. The automatic pruning method for a general compression architecture of a convolutional neural network according to claim 6, characterized in that said Step (2.3) is specifically: in the back-propagation stage, the gradients of the weights in the pruned network model are computed, and the chain rule is used to compute the gradients of the weights in the pruning cell network from the gradients of the weights in the pruned network model, thereby training the pruning cell network end to end.
  8. The automatic pruning method for a general compression architecture of a convolutional neural network according to claim 7, characterized in that said Step 3 comprises the following sub-steps:
    Step (3.1): Define the channel pruning encoding vector as the gene of the pruned network model, and randomly select a set of genes satisfying the specific constraints as the initial population;
    Step (3.2): Evaluate the accuracy of the pruned network model corresponding to each gene in the existing population, and select the top k genes with the highest accuracy;
    Step (3.3): Use the top k genes selected in Step (3.2) to perform gene recombination and gene mutation, generating new genes that are added to the existing population;
    Step (3.4): Repeat Steps (3.2) to (3.3) iteratively, selecting the top k highest-accuracy genes in the existing population and generating new genes; after the set number of iteration rounds is reached, the pruned network model that satisfies the specific constraints with the highest accuracy is finally obtained.
  9. The automatic pruning method for a general compression architecture of a convolutional neural network according to claim 8, characterized in that, in said Step (3.3), gene mutation means generating a new channel pruning encoding vector by randomly changing the values of some elements of a channel pruning encoding vector; gene recombination means randomly rearranging and combining the elements of two channel pruning encoding vectors to generate two new channel pruning encoding vectors; and new channel pruning encoding vectors that do not satisfy the specific constraints are eliminated.
  10. A platform based on the automatic pruning method for a general compression architecture of a convolutional neural network according to any one of claims 1-9, characterized in that it comprises the following components:
    A data loading component: used to obtain the training data of the convolutional neural network, where the training data are labeled samples satisfying the supervised learning task;
    An automatic compression component: used to automatically compress the convolutional neural network model, comprising a pruning vector encoding module, a pruned network generation module, a joint training module for the pruning cell network and the pruned network, a pruned network search module, and a task-specific fine-tuning module;
    The pruning vector encoding module uses the random structure sampling method to sample the channel widths of all convolutional network modules of the neural network model input by the user, generating a channel pruning encoding vector; during forward propagation, the channel pruning encoding vector is input into the pruning cell network to generate the pruned network of the corresponding structure and the weight matrix of the pruning cell network;
    The pruned network generation module constructs, based on the pruning cell network, the pruned network corresponding to the currently input channel pruning encoding vector, and adjusts the shape of the weight matrix output by the pruning cell network so that it is consistent with the number of input and output encoder units of the pruned structure corresponding to the channel pruning encoding vector;
    The joint training module for the pruning cell network and the pruned network trains the pruning cell network end to end; specifically, a simple-random-sampled channel pruning encoding vector and a small batch of training data are input into the pruned network, and the weights of the pruned structure and the weight matrix of the pruning cell network are updated;
    The pruned network search module is used to search for the highest-accuracy pruned network satisfying specific constraints, an evolutionary algorithm being proposed for this search; a channel pruning encoding vector is input into the trained pruning cell network to generate the weights of the corresponding pruned network, and the pruned network is evaluated on the validation set to obtain its accuracy; in the evolutionary search algorithm employed for the meta-learned pruning network, the structure of each pruned network is generated from a simple-random-sampled channel pruning encoding vector, so the channel pruning encoding vector is defined as the gene of the pruned network; under the specific constraints, a set of channel pruning encoding vectors is first selected as genes of the pruned network, and the accuracy of each corresponding pruned network is obtained by evaluation on the validation set; then the top k genes with the highest accuracy are selected, and gene recombination and mutation are used to generate new genes; by further iterating the process of selecting the top k best genes and generating new genes, the gene that satisfies the constraints with the highest accuracy is obtained;
    The task-specific fine-tuning module fine-tunes, on the pruned network generated by the automatic compression component, a network for the specific task, using the feature layers and the output layer of the pruned network to fine-tune for the specific task scenario, and outputs the final fine-tuned compressed model; the compressed model is output to a designated container for the logged-in user to download, and performance comparison information of the model before and after compression is presented on the platform's compressed-model output page;
    An inference component: the logged-in user obtains the compressed model of the convolutional neural network from the platform and uses the compressed model output by the automatic compression component to perform inference, on a real-scenario dataset, over new data of the specific task uploaded by the logged-in user; performance comparison information of the inference model before and after compression is presented on the platform's compressed-model inference page.
PCT/CN2021/075807 2020-12-31 2021-02-07 Automatic pruning method and platform for general compression architecture of convolutional neural network WO2022141754A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011633174.3 2020-12-31
CN202011633174.3A CN112396181A (en) 2020-12-31 2020-12-31 Automatic pruning method and platform for general compression architecture of convolutional neural network

Publications (1)

Publication Number Publication Date
WO2022141754A1 true WO2022141754A1 (en) 2022-07-07

Family

ID=74625110

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/075807 WO2022141754A1 (en) 2020-12-31 2021-02-07 Automatic pruning method and platform for general compression architecture of convolutional neural network

Country Status (2)

Country Link
CN (1) CN112396181A (en)
WO (1) WO2022141754A1 (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112561040A (en) * 2021-02-25 2021-03-26 之江实验室 Filter distribution perception training acceleration method and platform for neural network model
CN113076544A (en) * 2021-04-02 2021-07-06 湖南大学 Vulnerability detection method and system based on deep learning model compression and mobile device
CN113037482B (en) * 2021-04-13 2022-07-15 山东新一代信息产业技术研究院有限公司 Model compression encryption method based on RNN
CN113159293B (en) * 2021-04-27 2022-05-06 清华大学 Neural network pruning device and method for storage and computation fusion architecture
CN113361707A (en) * 2021-05-25 2021-09-07 同济大学 Model compression method, system and computer readable medium
CN113642730A (en) * 2021-08-30 2021-11-12 Oppo广东移动通信有限公司 Convolutional network pruning method and device and electronic equipment
CN113743591B (en) * 2021-09-14 2023-12-26 北京邮电大学 Automatic pruning convolutional neural network method and system
CN114120154B (en) * 2021-11-23 2022-10-28 宁波大学 Automatic detection method for breakage of glass curtain wall of high-rise building
CN115273129B (en) * 2022-02-22 2023-05-05 珠海数字动力科技股份有限公司 Lightweight human body posture estimation method and device based on neural architecture search
CN117058525B (en) * 2023-10-08 2024-02-06 之江实验室 Model training method and device, storage medium and electronic equipment

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106779086A (en) * 2016-11-28 2017-05-31 北京大学 A kind of integrated learning approach and device based on Active Learning and model beta pruning
US20190122113A1 (en) * 2017-10-19 2019-04-25 International Business Machines Corporation Pruning Redundant Neurons and Kernels of Deep Convolutional Neural Networks
CN111079899A (en) * 2019-12-05 2020-04-28 中国电子科技集团公司信息科学研究院 Neural network model compression method, system, device and medium
CN111967594A (en) * 2020-08-06 2020-11-20 苏州浪潮智能科技有限公司 Neural network compression method, device, equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZECHUN LIU; HAOYUAN MU; XIANGYU ZHANG; ZICHAO GUO; XIN YANG; TIM KWANG-TING CHENG; JIAN SUN: "MetaPruning: Meta Learning for Automatic Neural Network Channel Pruning", arXiv.org, Cornell University Library, 14 August 2019 (2019-08-14), XP081461909 *

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115186937A (en) * 2022-09-09 2022-10-14 闪捷信息科技有限公司 Prediction model training and data prediction method and device based on multi-party data cooperation
CN115374935B (en) * 2022-09-15 2023-08-11 重庆大学 Pruning method of neural network
CN115374935A (en) * 2022-09-15 2022-11-22 重庆大学 Pruning method of neural network
CN115496210A (en) * 2022-11-21 2022-12-20 深圳开鸿数字产业发展有限公司 Filtering pruning method and system for network model, electronic equipment and storage medium
CN115496210B (en) * 2022-11-21 2023-12-08 深圳开鸿数字产业发展有限公司 Filtering pruning method and system of network model, electronic equipment and storage medium
CN115797477B (en) * 2023-01-30 2023-05-16 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Pruning type image compressed sensing method and system for lightweight deployment
CN115797477A (en) * 2023-01-30 2023-03-14 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Pruning type image compression sensing method and system for light weight deployment
CN116994309A (en) * 2023-05-06 2023-11-03 浙江大学 Face recognition model pruning method for fairness perception
CN116994309B (en) * 2023-05-06 2024-04-09 浙江大学 Face recognition model pruning method for fairness perception
CN116402117A (en) * 2023-06-07 2023-07-07 中诚华隆计算机技术有限公司 Image classification convolutional neural network pruning method and core particle device data distribution method
CN116402117B (en) * 2023-06-07 2023-08-18 中诚华隆计算机技术有限公司 Image classification convolutional neural network pruning method and core particle device data distribution method
CN116698410A (en) * 2023-06-29 2023-09-05 重庆邮电大学空间通信研究院 Rolling bearing multi-sensor data monitoring method based on convolutional neural network
CN116698410B (en) * 2023-06-29 2024-03-12 重庆邮电大学空间通信研究院 Rolling bearing multi-sensor data monitoring method based on convolutional neural network
CN116992945A (en) * 2023-09-27 2023-11-03 之江实验室 Image processing method and device based on greedy strategy reverse channel pruning
CN116992945B (en) * 2023-09-27 2024-02-13 之江实验室 Image processing method and device based on greedy strategy reverse channel pruning
CN117131920A (en) * 2023-10-26 2023-11-28 北京市智慧水务发展研究院 Model pruning method based on network structure search
CN117131920B (en) * 2023-10-26 2024-01-30 北京市智慧水务发展研究院 Model pruning method based on network structure search

Also Published As

Publication number Publication date
CN112396181A (en) 2021-02-23

Similar Documents

Publication Publication Date Title
WO2022141754A1 (en) Automatic pruning method and platform for general compression architecture of convolutional neural network
US10929744B2 (en) Fixed-point training method for deep neural networks based on dynamic fixed-point conversion scheme
WO2022126683A1 (en) Method and platform for automatically compressing multi-task-oriented pre-training language model
WO2022126797A1 (en) Automatic compression method and platform for multilevel knowledge distillation-based pre-trained language model
CN107679618B (en) Static strategy fixed-point training method and device
US10984308B2 (en) Compression method for deep neural networks with load balance
US20190050734A1 (en) Compression method of deep neural networks
CN110730046B (en) Cross-frequency-band spectrum prediction method based on deep migration learning
CN107729999A (en) Consider the deep neural network compression method of matrix correlation
CN107679617A (en) The deep neural network compression method of successive ignition
US20220188658A1 (en) Method for automatically compressing multitask-oriented pre-trained language model and platform thereof
US11501171B2 (en) Method and platform for pre-trained language model automatic compression based on multilevel knowledge distillation
CN113033786B (en) Fault diagnosis model construction method and device based on time convolution network
CN112215353A (en) Channel pruning method based on variational structure optimization network
CN114792126A (en) Convolutional neural network design method based on genetic algorithm
CN116822593A (en) Large-scale pre-training language model compression method based on hardware perception
Kim Quantization robust pruning with knowledge distillation
WO2023082045A1 (en) Neural network architecture search method and apparatus
EP4040342A1 (en) Deep neutral network structure learning and simplifying method
CN113627595B (en) Probability-based MobileNet V1 network channel pruning method
Xu et al. Efficient block pruning based on kernel and feature stablization
CN117131904A (en) Parameter localization method and system for neural network
CN116992926A (en) Evolutionary neural network method for cooperatively optimizing convolutional neural network structure and weight
CN118279669A (en) Edge equipment image classification method, equipment, storage medium and product
CN117669237A (en) Prototype optimization design method and system for HPM source

Legal Events

Date Code Title Description

121  EP: the EPO has been informed by WIPO that EP was designated in this application
     Ref document number: 21912530; Country of ref document: EP; Kind code of ref document: A1

NENP Non-entry into the national phase
     Ref country code: DE

122  EP: PCT application non-entry in European phase
     Ref document number: 21912530; Country of ref document: EP; Kind code of ref document: A1