CN112132278A - Model compression method and device, computer equipment and storage medium - Google Patents

Model compression method and device, computer equipment and storage medium

Info

Publication number
CN112132278A
CN112132278A (application No. CN202011007728.9A)
Authority
CN
China
Prior art keywords
model
backbone network
feature map
channel
test result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011007728.9A
Other languages
Chinese (zh)
Inventor
郑强
王晓锐
高鹏
王俊
李葛
谢国彤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202011007728.9A
Priority to PCT/CN2020/124813 (published as WO2021159748A1)
Publication of CN112132278A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/082Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/06Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Neurology (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the technical field of artificial intelligence, and in particular to a model compression method and device, computer equipment and a storage medium. The model compression method comprises: obtaining an image recognition model trained in advance, which comprises a first backbone network, and a second backbone network to be trained; inputting images to be tested into the image recognition model for testing to obtain a model test result; calculating a channel weight vector according to the model test result; inputting training images into the first backbone network and the second backbone network respectively for feature extraction to obtain a first feature map output by the first backbone network and a second feature map output by the second backbone network; calculating a model loss based on the first feature map, the second feature map and the channel weight vector; and updating and optimizing the second backbone network according to the model loss to obtain a compressed image recognition model. The method addresses the low efficiency and poor generality of conventional model compression, which performs distillation compression on the entire artificial intelligence model.

Description

Model compression method and device, computer equipment and storage medium
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a model compression method, a model compression device, computer equipment and a storage medium.
Background
In the field of artificial intelligence, the life cycle of a model can generally be divided into two stages: model training and model inference. In the training stage, redundancy in the model is inevitable when higher recognition accuracy is pursued. In the inference stage, because of the differing environments in which inference is deployed, the model is required not only to be accurate but also to exhibit high-performance characteristics such as high inference speed, low resource occupation and small file size.
At present, model compression is a common optimization means for moving a model from the training stage to the inference stage. However, current model compression performs distillation compression on the entire artificial intelligence model; because different models have complex and varied application scenarios, a customized compression scheme must be developed whenever a different model is compressed, which is inefficient and not general.
Disclosure of Invention
The embodiment of the invention provides a model compression method and device, computer equipment and a storage medium, aiming to solve the problems of low efficiency and poor generality of existing model compression.
A method of model compression, comprising:
acquiring an image recognition model trained in advance on training images and a second backbone network to be trained; wherein the image recognition model comprises a first backbone network;
inputting a plurality of images to be tested into the image recognition model for testing to obtain model test results corresponding to the plurality of images to be tested;
calculating a channel weight vector according to the model test result; the channel weight vector is used for describing the importance of each feature channel of the feature map output by the first backbone network;
inputting the training image into the first backbone network and the second backbone network respectively for feature extraction to obtain a first feature map output by the first backbone network and a second feature map output by the second backbone network;
calculating a model loss based on the first feature map, the second feature map, and the channel weight vector;
and updating and optimizing the second backbone network according to the model loss to obtain a compressed image recognition model.
A model compression apparatus, comprising:
the backbone network acquisition module is used for acquiring a pre-trained image recognition model and a second backbone network to be trained; wherein the image recognition model comprises a first backbone network;
the model testing module is used for inputting a plurality of images to be tested into the image recognition model for testing to obtain model testing results corresponding to the plurality of images to be tested;
the channel weight calculation module is used for calculating a channel weight vector according to the model test result; the channel weight vector is used for describing the importance of each feature channel of the feature map output by the first backbone network;
the model training module is used for inputting the training images into the first backbone network and the second backbone network respectively for feature extraction to obtain a first feature map output by the first backbone network and a second feature map output by the second backbone network;
a model loss calculation module for calculating a model loss based on the first feature map, the second feature map and the channel weight vector;
and the model updating module is used for updating and optimizing the second backbone network according to the model loss to obtain a compressed image recognition model.
A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the above model compression method when executing the computer program.
A computer storage medium storing a computer program which, when executed by a processor, implements the steps of the above model compression method.
With the model compression method and device, computer equipment and storage medium, the pre-trained image recognition model and the second backbone network to be trained are obtained so that knowledge distillation can be performed from the first backbone network in the image recognition model to train the second backbone network. Model compression therefore does not need to be performed on the entire artificial intelligence model; knowledge distillation is applied only to a local network of the original model, which reduces GPU memory occupation and computation, accelerates the model compression process, effectively avoids the limitations imposed on model compression by the many different application scenarios, offers better generality, lends itself to tooling, and effectively reduces duplicated effort. Then, the images to be tested are input into the image recognition model for testing to obtain a model test result, from which a channel weight vector is calculated; the importance of each feature channel is thus determined from the model test result, ensuring the accuracy and practicability of the channel weight vector. Next, the training images are input into the first backbone network and the second backbone network respectively for feature extraction to obtain a first feature map output by the first backbone network and a second feature map output by the second backbone network, so that the model loss can be calculated based on the first feature map, the second feature map and the channel weight vector.
Finally, the second backbone network is updated and optimized according to the model loss to obtain a compressed image recognition model, so that the second backbone network learns the features that strongly influence model accuracy; this improves the recognition accuracy of the target recognition model obtained after model compression and avoids losing important information during compression.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments of the present invention will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to these drawings without inventive labor.
FIG. 1 is a diagram illustrating an application environment of a model compression method according to an embodiment of the present invention;
FIG. 2 is a flow chart of a model compression method according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of model compression in an embodiment of the present invention;
FIG. 4 is a detailed flowchart of step S202 in FIG. 2;
FIG. 5 is a detailed flowchart of step S203 in FIG. 2;
FIG. 6 is a detailed flowchart of step S203 in FIG. 2;
FIG. 7 is a detailed flowchart of step S205 in FIG. 2;
FIG. 8 is a schematic view of a model compression apparatus in accordance with an embodiment of the present invention;
FIG. 9 is a schematic diagram of a computer device according to an embodiment of the invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The model compression method can be applied in an application environment as in fig. 1, where a computer device communicates with a server over a network. The computer device may be, but is not limited to, various personal computers, laptops, smartphones, tablets, and portable wearable devices. The server may be implemented as a stand-alone server.
In an embodiment, as shown in fig. 2, a model compression method is provided, which is described by taking the server in fig. 1 as an example, and includes the following steps:
s201: and acquiring a pre-trained image recognition model and a second backbone network to be trained, wherein the image recognition model comprises a first backbone network.
The first backbone network refers to the feature extraction backbone network in the image recognition model and can be understood as the Teacher network in the traditional knowledge distillation method. The image recognition model is a model trained in advance to recognize objects in an image, such as the animals or people in the image. The second backbone network is a pre-created feature extraction backbone network of smaller scale than the first backbone network (e.g., fewer neurons or fewer network layers) and can be understood as the Student network in the traditional knowledge distillation method. It should be noted that the feature channels of the feature maps output by the first backbone network and the second backbone network are kept consistent, so that the feature channel dimensions are unified and subsequent calculation is facilitated.
It can be understood that, because higher model accuracy is pursued when the image recognition model is trained, the scale of the model becomes huge; when the model is deployed, its high complexity demands costly storage space and computing resources, making it difficult to implement on many hardware platforms. It is therefore necessary to perform knowledge distillation on the original model in order to minimize its consumption of computation space and time.
A current artificial intelligence model comprises a feature extraction backbone network and other application-related parts. The feature extraction backbone network has the largest number of neurons and the highest computational complexity, so in this embodiment knowledge distillation only needs to be performed on the feature extraction backbone network rather than on the whole artificial intelligence model; this reduces the compression workload and effectively avoids the limitations imposed on model compression by the many different application scenarios.
Specifically, as shown in the schematic structural diagram of model compression in fig. 3, the image recognition model includes a mask layer connected to the first backbone network (i.e., the Teacher network) and a recognition network (i.e., the Head network) connected to the mask layer. The mask layer performs channel shielding processing on the feature channels in the first feature map so that an importance evaluation parameter (i.e., the channel weight vector) can be calculated for each feature channel; the recognition network recognizes the feature map output by the mask layer to obtain a corresponding recognition result.
As shown in fig. 3, a test image is input into the image recognition model for testing, so as to obtain a channel weight vector corresponding to a feature map output by the first backbone network, then a training image for training the image recognition model is simultaneously input into the first backbone network and the second backbone network, so as to obtain a first feature map and a second feature map, so as to calculate Loss (i.e., model Loss) based on the first feature map, the second feature map and the channel weight vector, and then the model Loss is transmitted to the second backbone network (i.e., Student network) for model optimization, so as to implement model compression.
S202: and inputting the plurality of images to be tested into the image recognition model for testing to obtain model test results corresponding to the plurality of images to be tested.
Specifically, a plurality of images to be tested can be input into the image recognition model for batch testing according to a preset batch size (representing the number of samples to be tested) parameter, so as to count the recognition accuracy of the plurality of images to be tested as a model test result.
S203: calculating a channel weight vector according to the model test result; the channel weight vector is used for describing the importance of the feature channel in the first feature map.
Specifically, the channel weight vector is calculated from the model test result, namely the model recognition accuracy, so that the importance of each feature channel in the first feature map is evaluated directly through the channel weight vector. Evaluating the importance of each feature channel in the feature map output by the first or second backbone network through the model recognition accuracy is both accurate and practical.
S204: and respectively inputting the training images into a first backbone network and a second backbone network for feature extraction to obtain a first feature map output by the first backbone network and a second feature map output by the second backbone network.
The first feature map is the feature map output when the training image undergoes feature extraction through the first backbone network, and the second feature map is the feature map output when the training image undergoes feature extraction through the second backbone network. Specifically, the training images are respectively input into the first backbone network and the second backbone network for feature extraction to obtain the first feature map output by the first backbone network and the second feature map output by the second backbone network, facilitating the subsequent model loss calculation. It should be noted that steps S203 and S204 have no required execution order and may even be executed simultaneously, which is not limited here.
S205: and calculating the model loss based on the first feature map, the second feature map and the channel weight vector.
In this embodiment, a feature map loss is calculated from the first feature map and the second feature map through a predefined loss function, and this loss is then weighted by the channel weight vector, so that the second backbone network learns the features that strongly influence model accuracy. This improves the recognition accuracy of the target recognition model obtained after model compression and avoids losing important information during compression.
S206: and updating and optimizing the second backbone network according to the model loss to obtain a compressed image identification model.
Specifically, a model updating algorithm preset in the image recognition model takes the partial derivative of the model loss with respect to the model parameters (such as the model weights) of each neuron in the second backbone network, i.e. ∂Loss/∂w, so that the model parameters of each neuron in the second backbone network can be optimized; when the prediction accuracy of the second backbone network reaches a preset value, the compressed image recognition model is obtained.
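The update in S206 can be sketched as plain gradient descent on a student parameter; the toy loss, learning rate and all variable names below are illustrative assumptions, not taken from the patent.

```python
# Minimal sketch of the S206 update: gradient descent on one student
# parameter, with the partial derivative dLoss/dw estimated numerically.
# The toy loss stands in for the weighted feature map loss.

def toy_loss(w, teacher_feat, student_input):
    # Student "feature" is w * x; loss is the squared gap to the teacher.
    return (w * student_input - teacher_feat) ** 2

def sgd_step(w, lr, teacher_feat, student_input, eps=1e-6):
    # Central-difference estimate of the partial derivative dLoss/dw.
    grad = (toy_loss(w + eps, teacher_feat, student_input)
            - toy_loss(w - eps, teacher_feat, student_input)) / (2 * eps)
    return w - lr * grad

w = 0.0
for _ in range(200):
    w = sgd_step(w, lr=0.1, teacher_feat=2.0, student_input=1.0)
print(round(w, 3))  # converges toward 2.0, matching the teacher feature
```

In a real implementation the derivative would come from backpropagation through the whole second backbone network rather than a numerical estimate.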
In this embodiment, the pre-trained image recognition model and the second backbone network to be trained are obtained so that knowledge distillation can be performed from the first backbone network in the image recognition model to train the second backbone network. Model compression therefore does not need to be performed on the entire artificial intelligence model; knowledge distillation is applied only to a local network of the original model, which reduces GPU memory occupation and computation, accelerates the model compression process, effectively avoids the limitations imposed on model compression by the many different application scenarios, offers better generality, facilitates implementation, and effectively reduces duplicated effort. Then, the images to be tested are input into the image recognition model for testing to obtain a model test result, from which a channel weight vector is calculated; the importance of each feature channel is thus determined from the model test result, ensuring the accuracy and practicability of the channel weight vector. Next, the training images are input into the first backbone network and the second backbone network respectively for feature extraction to obtain a first feature map output by the first backbone network and a second feature map output by the second backbone network, so that the model loss can be calculated based on the first feature map, the second feature map and the channel weight vector.
Finally, the second backbone network is updated and optimized according to the model loss to obtain a compressed image recognition model, so that the second backbone network learns the features that strongly influence model accuracy; this improves the recognition accuracy of the target recognition model obtained after model compression and avoids losing important information during compression.
In an embodiment, as shown in fig. 4, step S202, namely inputting a plurality of images to be tested into the image recognition model for testing to obtain model test results corresponding to the plurality of images to be tested, specifically includes the following steps:
s301: extracting the characteristics of each image to be tested by adopting a first backbone network, and outputting a test characteristic diagram corresponding to each image to be tested; wherein the image to be tested comprises a plurality of characteristic channels.
The test characteristic graph refers to a characteristic graph corresponding to an image to be tested, which is output by the image to be tested through feature extraction of the first backbone network. Specifically, each image to be tested is subjected to feature extraction through a first backbone network, namely, nonlinear transformation such as multilayer convolution, activation, pooling and the like is performed, so that a test feature graph corresponding to each image to be tested is output. The test feature map includes a plurality of feature channels, with different feature channels reflecting different image features.
S302: and carrying out channel shielding treatment on the same characteristic channel in each test characteristic diagram by adopting the mask layer to obtain a third characteristic diagram corresponding to each image to be tested.
The mask layer is a layer of mask covering the original tensor so as to shield or select some specific elements to obtain the image of the target area. The third characteristic diagram is the characteristic diagram after the same characteristic channel is shielded from each first characteristic diagram.
Specifically, a mapping matrix of 0 and 1 is constructed through a mask layer to retain the features of the target feature channel and remove the features of the non-target channel, for example, each test feature map includes a feature channel a and a feature channel b, and if the feature channel 1 in each test feature map needs to be shielded, the mapping matrix corresponding to the feature channel a is filled with 0 and the mapping matrix corresponding to the feature channel b is filled with 1, that is, the feature channel a can be shielded, and the feature channel b is retained.
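The 0/1 mapping matrix described above can be sketched as elementwise multiplication; the function name, shapes and values below are hypothetical toy data.

```python
# Sketch of the mask layer in S302: a 0/1 matrix per channel zeroes
# out ("shields") one channel while keeping the others unchanged.

def mask_channel(feature_map, shielded):
    # feature_map: dict of channel name -> 2x2 grid (nested lists)
    # shielded: name of the channel to zero out
    masked = {}
    for name, grid in feature_map.items():
        fill = 0 if name == shielded else 1   # the 0/1 mapping matrix value
        masked[name] = [[v * fill for v in row] for row in grid]
    return masked

test_map = {"a": [[1, 2], [3, 4]], "b": [[5, 6], [7, 8]]}
third_map = mask_channel(test_map, shielded="a")
print(third_map["a"])  # [[0, 0], [0, 0]] -- channel a shielded
print(third_map["b"])  # [[5, 6], [7, 8]] -- channel b retained
```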
S303: and identifying each third feature map by adopting an identification network to obtain an identification result corresponding to each third feature map.
Specifically, each third feature map is input into the identification network for identification, so that an identification result corresponding to each third feature map can be obtained.
S304: and obtaining a test result component corresponding to each characteristic channel according to the identification result and the real result corresponding to the image to be tested.
The real result refers to the pre-labeled image classification result corresponding to the image to be tested. For example, if the application scenario of the image recognition model is recognizing the type of animal in an image, the real result is the real type of the animal in the image to be tested.
Specifically, each recognition result is compared with the real result corresponding to the image to be tested to obtain the test result component for that round of testing. By comparing every recognition result with the real result, the recognition accuracy over the third feature maps in which the same feature channel was simultaneously shielded can be counted and used as the test result component corresponding to that feature channel.
Illustratively, suppose the images to be tested are input into the first backbone network for feature extraction and the outputs are feature map 1 and feature map 2, each including two feature channels a and b. By simultaneously shielding feature channel a in the two feature maps, the corresponding third feature maps a1' and a2' are obtained; a1' and a2' are input into the recognition network as its input data, giving the corresponding recognition results a1'' (denoted as cat) and a2'' (denoted as dog). Each recognition result is compared with the real result (such as dog), and the recognition accuracy of this round of testing, namely 50%, is taken as the test result component corresponding to feature channel a. By performing multiple rounds of testing, that is, shielding each feature channel in the two feature maps in turn, the test result component corresponding to each feature channel is obtained.
S305: and taking the data set containing the test result component corresponding to each characteristic channel as a model test result corresponding to a plurality of images to be tested.
Specifically, a data set of the test result component corresponding to each feature channel is used as a model test result, so that when loss is calculated in the subsequent process, data in the data set is called for calculation.
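The aggregation of steps S304-S305 can be sketched as follows, assuming toy recognition results and ground-truth labels (all names and data are illustrative, mirroring the cat/dog example above):

```python
# Sketch of S304-S305: for each shielded channel, compare the
# recognition results against the real results and record the
# accuracy as that channel's test result component.

def result_components(results_per_channel, truths):
    # results_per_channel: channel -> recognition results obtained
    # with that channel shielded; truths: ground-truth labels
    components = {}
    for channel, results in results_per_channel.items():
        hits = sum(r == t for r, t in zip(results, truths))
        components[channel] = hits / len(truths)
    return components

# Shielding channel a yields "cat" and "dog" against truth "dog" -> 50%.
model_test_result = result_components(
    {"a": ["cat", "dog"], "b": ["dog", "dog"]},
    truths=["dog", "dog"],
)
print(model_test_result)  # {'a': 0.5, 'b': 1.0}
```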
In an embodiment, as shown in fig. 5, step S203, namely calculating the channel weight vector according to the model test result, specifically includes the following steps:
s401: and taking the difference value between the maximum value of the test result component and each test result component in the model test result as a first difference value.
S402: and taking the difference value of the maximum value and the minimum value of the test result component in the model test result as a second difference value.
S403: and calculating the ratio of the first difference value to the second difference value, and adding the ratio to a predefined constant term to obtain a channel weight component corresponding to each test feature map.
Specifically, for intuitive expression, the calculation process of steps S401 to S403 is expressed by the formula Wi = 1 + (Amax - Ai)/(Amax - Amin), wherein 1 represents the predefined constant term, Amax represents the maximum value of the test result components, Amin represents the minimum value of the test result components, Ai represents the test result component corresponding to feature channel i, i is the channel identifier, and Wi represents the channel weight component corresponding to feature channel i, used to characterize the importance of feature channel i.
S404: and taking a data set containing the channel weight component corresponding to each test feature map as a channel weight vector.
Specifically, the channel weight component corresponding to each test feature map is stored in a data set, so that the data set is used as a channel weight vector, and when loss is calculated subsequently, data in the data set is called for calculation.
In an embodiment, as shown in fig. 6, step S203, namely calculating the channel weight vector according to the model test result, specifically includes the following steps:
s501: and taking the difference value between the maximum value of the test result component and each test result component in the model test result as a first difference value.
S502: and taking the difference value of the maximum value and the minimum value of the test result component in the model test result as a second difference value.
S503: calculating a ratio of the first difference to the second difference, and calculating a product of the ratio and a preset scaling factor.
S504: and adding the product and a predefined constant term to obtain a channel weight component corresponding to each test characteristic diagram.
Specifically, for intuitive expression, the calculation process of steps S501 to S504 is expressed by the formula Wi = 1 + α(Amax - Ai)/(Amax - Amin), wherein 1 represents the predefined constant term, Amax represents the maximum value of the test result components, Amin represents the minimum value of the test result components, Ai represents the test result component corresponding to feature channel i, i is the channel identifier, α is the preset scaling factor (1 by default and configurable as needed), and Wi represents the channel weight component corresponding to feature channel i, used to characterize the importance of feature channel i.
It can be understood that, in order to make the differences between the channel weight components more obvious, that is, to amplify them so that the second backbone network can better learn the features important to model accuracy and the accuracy of model compression is ensured, this embodiment amplifies the differences between the channel weight components through the preset scaling factor.
S504: and taking a data set containing the channel weight component corresponding to each test feature map as a channel weight vector.
Specifically, the channel weight component corresponding to each test feature map is stored in a data set, so that the data set is used as a channel weight vector, and when loss is calculated subsequently, data in the data set is called for calculation.
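The channel-weight formula above can be sketched as follows, assuming the test result components computed earlier; with the scaling factor left at its default of 1 it reduces to the first embodiment's formula (all names are illustrative):

```python
# Sketch of Wi = 1 + alpha * (Amax - Ai) / (Amax - Amin) from S501-S504.
# A lower test result component means shielding that channel hurt
# accuracy more, so the channel receives a larger weight.

def channel_weights(components, alpha=1.0):
    # components: channel -> test result component (accuracy with
    # that channel shielded); assumes at least two distinct values
    a_max, a_min = max(components.values()), min(components.values())
    return {ch: 1 + alpha * (a_max - a) / (a_max - a_min)
            for ch, a in components.items()}

w = channel_weights({"a": 0.5, "b": 1.0})
print(w)  # {'a': 2.0, 'b': 1.0} -- channel a is the more important one

w_scaled = channel_weights({"a": 0.5, "b": 1.0}, alpha=2.0)
print(w_scaled["a"])  # 3.0 -- the gap between the weights is amplified
```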
In an embodiment, as shown in fig. 7, step S205, namely calculating the model loss based on the first feature map, the second feature map and the channel weight vector, specifically includes the following steps:
S601: Applying a predefined loss function to the first feature map and the second feature map to obtain the feature map loss.
S602: Weighting the feature map loss based on the channel weight vector to obtain the model loss.
Specifically, for intuitive expression, the calculation process of steps S601 to S602 is expressed by the following formula:
Loss = (1/n) · Σ_{j=1}^{n} Σ_{i=1}^{c} Wi · f(Ft_i, Fs_i)
wherein Loss represents the model loss, Ft represents the first feature map, Fs represents the second feature map, W represents the channel weight vector, n represents the number of images to be tested, c represents the number of feature channels in the first feature map, Ft_i represents the i-th feature-channel map in the first feature map, Fs_i represents the i-th feature-channel map in the second feature map, and f represents a predefined loss function, such as L1 loss or MSE loss, which is not limited herein. It should be noted that the first feature map and the second feature map have the same number of feature channels.
In this embodiment, the feature map loss is weighted by the channel weight vector when computing the model loss, so that the model loss incorporates the channel-importance information; this effectively reduces the accuracy loss incurred by model compression.
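As a concrete reading of the formula above, the channel-weighted feature-map loss can be sketched as follows. This is a plain-NumPy illustration with our own function name, not the patent's implementation, and it assumes MSE as the default per-channel loss f:

```python
import numpy as np

def weighted_feature_loss(ft, fs, w, f=None):
    """Loss = (1/n) * sum over images j and channels i of W_i * f(Ft_i, Fs_i).

    ft, fs: first (teacher) and second (student) feature maps, shape (n, c, h, w);
    w: channel weight vector of shape (c,);
    f: per-channel-map loss, defaulting to MSE.
    """
    if f is None:
        f = lambda a, b: float(np.mean((a - b) ** 2))
    n, c = ft.shape[0], ft.shape[1]
    total = 0.0
    for j in range(n):                       # images
        for i in range(c):                   # feature channels
            total += w[i] * f(ft[j, i], fs[j, i])
    return total / n
```

Channels with larger weights W_i contribute proportionally more to the loss, which is how the second backbone network is steered toward the important features.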
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
In one embodiment, a model compression apparatus is provided, and the model compression apparatus corresponds to the model compression method in the above embodiments one to one. As shown in fig. 8, the model compression apparatus includes a backbone network obtaining module 10, a model testing module 20, a channel weight calculating module 30, a model training module 40, a model loss calculating module 50, and a model updating module 60. The functional modules are explained in detail as follows:
the backbone network obtaining module 10 is configured to obtain a pre-trained image recognition model and a second backbone network to be trained, where the image recognition model includes a first backbone network.
The model testing module 20 is configured to input the image to be tested into the image recognition model for testing, to obtain a model testing result and a first feature map output by the first backbone network, and input the training image into the second backbone network for feature extraction, to obtain a second feature map output by the second backbone network.
The channel weight calculation module 30 is configured to calculate a channel weight vector corresponding to the first feature map according to the model test result; the channel weight vector is used for describing the importance of the feature channel in the first feature map.
The model training module 40 is configured to input the training images into the first backbone network and the second backbone network respectively to perform feature extraction, so as to obtain a first feature map output by the first backbone network and a second feature map output by the second backbone network.
The model loss calculation module 50 is configured to calculate a model loss based on the first feature map, the second feature map, and the channel weight vector.
The model updating module 60 is configured to update and optimize the second backbone network according to the model loss to obtain a compressed image recognition model.
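To show how these modules fit together, here is a toy end-to-end update in which a "student" modeled as per-channel linear gains is driven toward the teacher's features by a channel-weighted loss. Everything here (the linear student, the gradient form, the learning rate) is an illustrative assumption, not the patent's training procedure:

```python
import numpy as np

def distill_step(student_gain, teacher_feats, x, channel_w, lr=0.05):
    """One gradient step for a toy student backbone modeled as per-channel
    gains: student_feats = student_gain * x, with x of shape (n, c).

    Loss = sum_i channel_w[i] * mean_j(err[j, i]^2), matching the
    channel-weighted feature loss; returns (updated gains, current loss).
    """
    err = student_gain[None, :] * x - teacher_feats      # (n, c) residual
    loss = float(np.sum(channel_w * np.mean(err ** 2, axis=0)))
    grad = 2.0 * channel_w * np.mean(err * x, axis=0)    # d loss / d gain
    return student_gain - lr * grad, loss
```

Iterating this step shrinks the loss, and the heavily weighted channels are matched fastest, which mirrors the role of the model loss calculation module and the model updating module.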
Specifically, the image recognition model includes a mask layer connected to the first backbone network and a recognition network connected to the mask layer.
Specifically, the model testing module comprises a feature extraction unit, a channel shielding unit, an image identification unit, a result statistics unit and a testing result acquisition unit.
The characteristic extraction unit is used for extracting the characteristics of each image to be tested by adopting a first backbone network and outputting a test characteristic diagram corresponding to each image to be tested; wherein the test feature map includes a plurality of feature channels.
The channel shielding unit is configured to perform channel shielding on the same feature channel in each test feature map using the mask layer, to obtain a third feature map corresponding to each image to be tested.
The image identification unit is configured to identify each third feature map using the identification network, to obtain an identification result corresponding to each third feature map.
The result counting unit is configured to obtain the test result component corresponding to each feature channel according to the identification results and the real results corresponding to the images to be tested.
The test result acquisition unit is configured to use a data set containing the test result component corresponding to each feature channel as the model test result corresponding to the plurality of images to be tested.
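The mask-and-test flow implemented by these units can be sketched as follows. The zeroing mask and the accuracy metric are assumptions made for illustration, and `backbone`/`recognizer` are hypothetical stand-ins for the first backbone network and the recognition network:

```python
import numpy as np

def per_channel_test_components(backbone, recognizer, images, labels):
    """For each feature channel i: zero channel i in every test feature
    map (channel shielding), run the recognition network on the shielded
    maps, and record the accuracy as the test result component A_i."""
    feats = backbone(images)                  # test feature maps, (n, c, ...)
    n_channels = feats.shape[1]
    components = []
    for i in range(n_channels):
        shielded = feats.copy()
        shielded[:, i] = 0.0                  # mask layer shields channel i
        preds = recognizer(shielded)
        components.append(float(np.mean(preds == labels)))
    return components
```

A channel whose shielding makes accuracy drop sharply yields a small A_i, which the channel weight calculation later turns into a large weight.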
Specifically, the channel weight calculation module includes a first difference calculation unit, a second difference calculation unit, a channel weight component calculation unit, and a channel weight vector acquisition unit.
The first difference calculation unit is configured to take the difference between the maximum of the test result components and each test result component in the model test result as a first difference.
The second difference calculation unit is configured to take the difference between the maximum and the minimum of the test result components in the model test result as a second difference.
The channel weight component calculation unit is configured to calculate the ratio of the first difference to the second difference, and to sum the ratio and a predefined constant term to obtain the channel weight component corresponding to each test feature map.
The channel weight vector acquisition unit is configured to use a data set containing the channel weight component corresponding to each test feature map as the channel weight vector.
Specifically, the channel weight calculation module includes a first difference calculation unit, a second difference calculation unit, a scaling unit, a channel weight component calculation unit, and a channel weight vector acquisition unit.
The first difference calculation unit is configured to take the difference between the maximum of the test result components and each test result component in the model test result as a first difference.
The second difference calculation unit is configured to take the difference between the maximum and the minimum of the test result components in the model test result as a second difference.
The scaling unit is configured to calculate the ratio of the first difference to the second difference and to calculate the product of the ratio and a preset scaling factor.
The channel weight component calculation unit is configured to add the product to a predefined constant term to obtain the channel weight component corresponding to each test feature map.
The channel weight vector acquisition unit is configured to use a data set containing the channel weight component corresponding to each test feature map as the channel weight vector.
Specifically, the model updating module comprises a feature map loss calculation unit and a model loss calculation unit.
The feature map loss calculation unit is configured to apply a predefined loss function to the first feature map and the second feature map to obtain the feature map loss.
The model loss calculation unit is configured to weight the feature map loss based on the channel weight vector to obtain the model loss.
For the specific definition of the model compression device, reference may be made to the above definition of the model compression method, which is not described herein again. The modules in the model compression apparatus can be implemented in whole or in part by software, hardware, and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 9. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a computer storage medium and an internal memory. The computer storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the computer storage media. The database of the computer device is used for storing data generated or acquired during execution of the model compression method, such as an image recognition model. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a model compression method.
In one embodiment, a computer device is provided, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and when the processor executes the computer program, the steps of the model compression method in the above embodiments are implemented, for example, steps S201 to S206 shown in fig. 2 or steps shown in fig. 3 to 7. Alternatively, the functions of each module/unit in this embodiment of the model compression apparatus, for example, the functions of each module/unit shown in fig. 8, are implemented when the processor executes the computer program, and are not described here again to avoid repetition.
In an embodiment, a computer storage medium is provided, and a computer program is stored on the computer storage medium, and when being executed by a processor, the computer program implements the steps of the model compression method in the foregoing embodiments, such as steps S201 to S206 shown in fig. 2 or steps shown in fig. 3 to fig. 7, which are not repeated herein for avoiding repetition. Alternatively, the computer program, when executed by the processor, implements the functions of each module/unit in the embodiment of the model compression apparatus, for example, the functions of each module/unit shown in fig. 8, and are not described herein again to avoid repetition.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present invention, and are intended to be included within the scope of the present invention.

Claims (10)

1. A method of model compression, comprising:
acquiring an image recognition model which is trained in advance according to a training image and a second backbone network to be trained; wherein the image recognition model comprises a first backbone network;
inputting a plurality of images to be tested into the image recognition model for testing to obtain model test results corresponding to the plurality of images to be tested;
calculating a channel weight vector according to the model test result; the channel weight vector is used for describing the importance of a characteristic channel corresponding to a characteristic diagram output by the first backbone network;
inputting the training image into the first backbone network and the second backbone network respectively for feature extraction to obtain a first feature map output by the first backbone network and a second feature map output by the second backbone network;
calculating a model loss based on the first feature map, the second feature map, and the channel weight vector;
and updating and optimizing the second backbone network according to the model loss to obtain a compressed image identification model.
2. The model compression method of claim 1, wherein the image recognition model includes a mask layer coupled to the first backbone network and a recognition network coupled to the mask layer.
3. The model compression method of claim 2, wherein the inputting a plurality of images to be tested into the image recognition model for testing to obtain model test results corresponding to the plurality of images to be tested comprises:
extracting the characteristics of each image to be tested by adopting the first backbone network, and outputting a test characteristic diagram corresponding to each image to be tested; wherein the test feature map comprises a plurality of feature channels;
carrying out channel shielding treatment on the same characteristic channel in each test characteristic diagram by adopting the mask layer to obtain a third characteristic diagram corresponding to each image to be tested;
identifying each third feature map by using the identification network to obtain an identification result corresponding to each third feature map;
obtaining a test result component corresponding to each characteristic channel according to the identification result and a real result corresponding to the image to be tested;
and taking a data set containing the test result component corresponding to each characteristic channel as a model test result corresponding to the plurality of images to be tested.
4. The model compression method of claim 3, wherein said computing a channel weight vector based on said model test results comprises:
taking the difference value between the maximum value of the test result component and each test result component in the model test result as a first difference value;
taking the difference value between the maximum value and the minimum value of the test result component in the model test result as a second difference value;
calculating a ratio of the first difference value to the second difference value, and summing the ratio and a predefined constant term to obtain a channel weight component corresponding to each test feature map;
and using a data set containing the channel weight component corresponding to each test feature map as the channel weight vector.
5. The model compression method of claim 3, wherein said computing a channel weight vector based on said model test results comprises:
taking the difference value between the maximum value of the test result component and each test result component in the model test result as a first difference value;
taking the difference value between the maximum value and the minimum value of the test result component in the model test result as a second difference value;
calculating a ratio of the first difference to the second difference and calculating a product of the ratio and a preset scaling factor;
adding the product and a predefined constant term to obtain a channel weight component corresponding to each test feature map;
and using a data set containing the channel weight component corresponding to each test feature map as the channel weight vector.
6. The model compression method of claim 1, wherein the calculating of the model loss based on the first feature map, the second feature map, and the channel weight vector comprises:
calculating the first characteristic diagram and the second characteristic diagram by adopting a predefined loss function to obtain characteristic diagram loss;
and weighting the loss of the characteristic diagram based on the channel weight vector to obtain the model loss.
7. A model compression apparatus, comprising:
the backbone network acquisition module is used for acquiring a pre-trained image recognition model and a second backbone network to be trained; wherein the image recognition model comprises a first backbone network;
the model testing module is used for inputting a plurality of images to be tested into the image recognition model for testing to obtain model testing results corresponding to the plurality of images to be tested;
the channel weight calculation module is used for calculating a channel weight vector according to the model test result; the channel weight vector is used for describing the importance of a characteristic channel corresponding to a characteristic diagram output by the first backbone network;
the model training module is used for inputting the training images into the first backbone network and the second backbone network respectively for feature extraction to obtain a first feature map output by the first backbone network and a second feature map output by the second backbone network;
a model loss calculation module for calculating a model loss based on the first feature map, the second feature map and the channel weight vector;
and the model updating module is used for updating and optimizing the second backbone network according to the model loss so as to obtain a compressed image identification model.
8. The model compression apparatus of claim 7, wherein the model testing module comprises:
the characteristic extraction unit is used for extracting the characteristics of each image to be tested by adopting the first backbone network and outputting a test characteristic diagram corresponding to each image to be tested; wherein the test feature map comprises a plurality of feature channels;
the channel shielding unit is used for carrying out channel shielding treatment on the same characteristic channel in each test characteristic diagram by adopting a mask layer to obtain a third characteristic diagram corresponding to each image to be tested;
the image identification unit is used for identifying each third feature map by adopting an identification network to obtain an identification result corresponding to each third feature map;
the result counting unit is used for obtaining a test result component corresponding to each characteristic channel according to the identification result and the real result corresponding to the image to be tested;
and the test result acquisition unit is used for taking a data set containing a test result component corresponding to each characteristic channel as a model test result corresponding to the plurality of images to be tested.
9. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the model compression method according to any one of claims 1 to 6 when executing the computer program.
10. A computer storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the model compression method according to any one of claims 1 to 6.
CN202011007728.9A 2020-09-23 2020-09-23 Model compression method and device, computer equipment and storage medium Pending CN112132278A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202011007728.9A CN112132278A (en) 2020-09-23 2020-09-23 Model compression method and device, computer equipment and storage medium
PCT/CN2020/124813 WO2021159748A1 (en) 2020-09-23 2020-10-29 Model compression method and apparatus, computer device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011007728.9A CN112132278A (en) 2020-09-23 2020-09-23 Model compression method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112132278A true CN112132278A (en) 2020-12-25

Family

ID=73842781

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011007728.9A Pending CN112132278A (en) 2020-09-23 2020-09-23 Model compression method and device, computer equipment and storage medium

Country Status (2)

Country Link
CN (1) CN112132278A (en)
WO (1) WO2021159748A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112990296A (en) * 2021-03-10 2021-06-18 中科人工智能创新技术研究院(青岛)有限公司 Image-text matching model compression and acceleration method and system based on orthogonal similarity distillation
WO2023113693A3 (en) * 2021-12-17 2023-10-05 Lemon Inc. Optimal knowledge distillation scheme

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113936295A (en) * 2021-09-18 2022-01-14 中国科学院计算技术研究所 Character detection method and system based on transfer learning
CN115757745B (en) * 2022-12-01 2023-09-15 甘肃省招标咨询集团有限责任公司 Business scene control method and system based on artificial intelligence and cloud platform
CN117218580A (en) * 2023-09-13 2023-12-12 杭州像素元科技有限公司 Expressway cross-camera multi-vehicle tracking method and system combining multiple models

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10223635B2 (en) * 2015-01-22 2019-03-05 Qualcomm Incorporated Model compression and fine-tuning
CN111695375B (en) * 2019-03-13 2021-04-20 上海云从企业发展有限公司 Face recognition model compression method based on model distillation, medium and terminal
CN110880036B (en) * 2019-11-20 2023-10-13 腾讯科技(深圳)有限公司 Neural network compression method, device, computer equipment and storage medium
CN111461212B (en) * 2020-03-31 2023-04-07 中国科学院计算技术研究所 Compression method for point cloud target detection model
CN111488985B (en) * 2020-04-08 2023-11-14 华南理工大学 Deep neural network model compression training method, device, equipment and medium


Also Published As

Publication number Publication date
WO2021159748A1 (en) 2021-08-19

Similar Documents

Publication Publication Date Title
CN112132278A (en) Model compression method and device, computer equipment and storage medium
CN109063742B (en) Butterfly identification network construction method and device, computer equipment and storage medium
JP7167306B2 (en) Neural network model training method, apparatus, computer equipment and storage medium
CN109241903B (en) Sample data cleaning method, device, computer equipment and storage medium
CN111191791B (en) Picture classification method, device and equipment based on machine learning model
WO2019228122A1 (en) Training method for model, storage medium and computer device
CN111126668B (en) Spark operation time prediction method and device based on graph convolution network
CN111667010A (en) Sample evaluation method, device and equipment based on artificial intelligence and storage medium
CN112183295A (en) Pedestrian re-identification method and device, computer equipment and storage medium
WO2022012668A1 (en) Training set processing method and apparatus
CN111898735A (en) Distillation learning method, distillation learning device, computer equipment and storage medium
CN113705685A (en) Disease feature recognition model training method, disease feature recognition device and disease feature recognition equipment
CN112308825A (en) SqueezeNet-based crop leaf disease identification method
CN114168318A (en) Training method of storage release model, storage release method and equipment
CN116403019A (en) Remote sensing image quantum identification method and device, storage medium and electronic device
CN113283388B (en) Training method, device, equipment and storage medium of living body face detection model
US20230342626A1 (en) Model processing method and related apparatus
CN117056721A (en) Model parameter adjusting method and device, model prediction method, device and medium
CN108256575A (en) Image-recognizing method, device, computer equipment and storage medium
CN113743448B (en) Model training data acquisition method, model training method and device
CN114372539B (en) Machine learning framework-based classification method and related equipment
CN114445692B (en) Image recognition model construction method and device, computer equipment and storage medium
CN113139590B (en) Dimension reduction method and device for time series data, computer equipment and storage medium
CN113139447B (en) Feature analysis method, device, computer equipment and storage medium
US20230059976A1 (en) Deep neural network (dnn) accelerator facilitating quantized inference

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40041439

Country of ref document: HK

SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20201225

RJ01 Rejection of invention patent application after publication