CN115841146A

CN115841146A - Model generation method and device, electronic equipment and storage medium

Info

Publication number: CN115841146A
Application number: CN202111092948.0A
Authority: CN
Inventors: 张锐
Original assignee: Zeku Technology Shanghai Corp Ltd
Current assignee: Zeku Technology Shanghai Corp Ltd
Priority date: 2021-09-17
Filing date: 2021-09-17
Publication date: 2023-03-24

Abstract

The embodiment of the application discloses a model generation method and device, electronic equipment and a storage medium. The method comprises the following steps: obtaining an initial mode pruning model; generating a plurality of reference mode pruning models based on the pruning modes corresponding to the initial mode pruning models; respectively inputting the initial mode pruning model and the plurality of reference mode pruning models into a cost model, and acquiring operation parameters respectively corresponding to the plurality of mode pruning models output by the cost model to obtain a plurality of operation parameters, wherein the operation parameters are parameters representing the operation performance of the electronic equipment when the corresponding mode pruning models operate on the electronic equipment; determining a mode pruning model adapted to the electronic device from the plurality of mode pruning models based on the plurality of operating parameters. By the method, the mode pruning model which is more adaptive to the electronic equipment can be found from the plurality of mode pruning models.

Description

Model generation method and device, electronic equipment and storage medium

Technical Field

The application belongs to the technical field of machine learning, and particularly relates to a model generation method and device, electronic equipment and a storage medium.

Background

Neural network models require substantial computing and storage resources, while the computing and storage resources of an electronic device are limited, which restricts the use of neural network models on electronic devices. In the related art, a neural network model can be pruned to reduce its amount of computation, and thereby its consumption of computing and storage resources. However, an electronic device supports differently pruned neural network models to different degrees at run time, so the degree of adaptation between the neural network model and the electronic device still needs to be improved.

Disclosure of Invention

In view of the foregoing, the present application provides a model generation method, apparatus, electronic device, and storage medium to achieve an improvement of the foregoing problem.

In a first aspect, an embodiment of the present application provides a model generation method, where the method includes: obtaining an initial mode pruning model; generating a plurality of reference mode pruning models based on the pruning modes corresponding to the initial mode pruning models, wherein the network weights of the reference mode pruning models and the network weights of the initial mode pruning models have the same zero occupancy ratio; respectively inputting the initial mode pruning model and the plurality of reference mode pruning models into a cost model, and acquiring operation parameters respectively corresponding to the plurality of mode pruning models output by the cost model to obtain a plurality of operation parameters, wherein the operation parameters are parameters representing the operation performance of the electronic equipment when the corresponding mode pruning models operate on the electronic equipment; determining a mode pruning model adapted to the electronic device from the plurality of mode pruning models based on the plurality of operating parameters.

In a second aspect, an embodiment of the present application provides a model generation apparatus, including: the model acquisition unit is used for acquiring an initial mode pruning model; a model generating unit, configured to generate multiple reference mode pruning models based on the pruning modes corresponding to the initial mode pruning model, where network weights of the multiple reference mode pruning models and network weights of the initial mode pruning model have the same zero ratio; a parameter obtaining unit, configured to input the initial mode pruning model and the multiple reference mode pruning models into a cost model, and obtain operation parameters corresponding to the multiple pruning models output by the cost model, so as to obtain multiple operation parameters, where the operation parameters are parameters that represent operation performance of an electronic device when the corresponding mode pruning models operate on the electronic device; a model determining unit, configured to determine, based on the plurality of operating parameters, a mode pruning model adapted to the electronic device from the plurality of mode pruning models.

In a third aspect, an embodiment of the present application provides an electronic device, including one or more processors and a memory; one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs configured to perform the methods described above.

In a fourth aspect, the present application provides a computer-readable storage medium in which program code is stored, where the program code, when run, performs the method described above.

In a fifth aspect, the present application provides a computer program product including a computer program/instructions which, when executed by a processor, implement the steps of the above method.

The embodiments of the application provide a model generation method and device, electronic equipment and a storage medium. The method first obtains an initial mode pruning model, and then generates a plurality of reference mode pruning models based on the pruning mode corresponding to the initial mode pruning model, wherein the network weights of the plurality of reference mode pruning models and the network weights of the initial mode pruning model have the same zero occupancy ratio. The initial mode pruning model and the plurality of reference mode pruning models are then respectively input into a cost model, and the operation parameters output by the cost model for each of the mode pruning models are acquired to obtain a plurality of operation parameters, wherein an operation parameter represents the operation performance of the electronic equipment when the corresponding mode pruning model runs on the electronic equipment. Finally, a mode pruning model adapted to the electronic equipment is determined from the plurality of mode pruning models based on the plurality of operation parameters. By introducing the cost model, the operation parameters of the electronic equipment when different mode pruning models run on it can be calculated, so that a mode pruning model better adapted to the electronic equipment can be found among the plurality of mode pruning models.

Drawings

To illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a schematic diagram of structured pruning according to an embodiment of the present application;

Fig. 2 is a schematic diagram of mode pruning according to an embodiment of the present application;

Fig. 3 is a schematic diagram of the convolution of an original model before pruning according to an embodiment of the present application;

Fig. 4 is a schematic diagram of the convolution of a model after unstructured pruning according to an embodiment of the present application;

Fig. 5 is a schematic diagram of the convolution of a model after pattern pruning according to an embodiment of the present application;

Fig. 6 is a flow chart of a model generation method according to an embodiment of the present application;

Fig. 7 is a schematic diagram of an initial mode pruning model generated according to a preset pruning rate according to an embodiment of the present application;

Fig. 8 is a schematic diagram of a randomly generated reference pattern pruning model according to an embodiment of the present application;

Fig. 9 is a flow chart of a model generation method according to another embodiment of the present application;

Fig. 10 is a flow chart of a model generation method according to yet another embodiment of the present application;

Fig. 11 is a flow chart of a model generation method according to yet another embodiment of the present application;

Fig. 12 is a block diagram of the structure of a model generation apparatus according to an embodiment of the present application;

Fig. 13 is a block diagram of the structure of a model generation apparatus according to an embodiment of the present application;

Fig. 14 is a block diagram of an electronic device for performing a model generation method according to an embodiment of the present application;

Fig. 15 illustrates a storage unit for storing or carrying program code implementing a model generation method according to an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

The development of deep learning makes deep neural networks increasingly applied to computer vision tasks such as image recognition, detection and tracking, and network models increasingly tend to develop in wider and deeper directions. The success of deep learning depends largely on the large number of parameters of the model and the computing device with powerful capabilities. However, the deep neural network is difficult to deploy on a low-storage and low-power-consumption hardware platform (such as a mobile device) due to the huge memory requirement and computational consumption, which greatly limits the application. Therefore, model compression of the deep neural network model is required. The model compression technology mainly comprises a model quantization technology, a model pruning technology, a model search technology and the like. The model pruning method is one of the most representative technologies in the model compression method due to the characteristics of simplicity and effectiveness, and the application scenario is mainly the model deployment stage, for example, when a trained model needs to be deployed on mobile devices such as mobile phones and smart homes, the size of the model is often reduced through pruning, so that the model has higher inference speed and occupies smaller memory.

The inventor finds that the model pruning method mainly cuts unimportant parameters by finding an effective parameter importance judging means so as to obtain the effect of compressing the model.

Related model pruning techniques mainly comprise structured pruning and unstructured pruning. Unstructured pruning, also called sparsification, forces certain parameters of the model to 0 during model training, so that when a chip runs the model it skips the computation whenever it encounters a 0. Structured pruning uses some criterion to identify parameters of the model that contribute little, and then prunes at filter/channel granularity, turning the original large model into a smaller one and thereby reducing the amount of computation and the power consumption.

Structured pruning places no special requirements on the electronic device and is typically used where there is a strong demand for reduced power consumption. Unstructured pruning requires the electronic device to support the "zeroing" operation (skipping the computation on zero-valued weights), that is, it requires special hardware support. Because structured pruning essentially changes the structure of the original model, the accuracy of the pruned model can hardly reach the accuracy before pruning.

Unstructured pruning has attracted attention in recent years. Unstructured pruning methods are mainly distinguished by whether they are based on pruning shapes. Shape-based unstructured pruning mainly includes pattern pruning (referred to herein as mode pruning), which is shown in fig. 2: each layer of the model is pruned according to the shape of the gray part in fig. 2. Illustratively, for a convolution operation in which the channel size of the input feature is 3, the channel size of the output feature is 3, and the convolution kernel size is 3 × 3, the model before pruning, after unstructured pruning and after pattern pruning is shown in fig. 3, fig. 4 and fig. 5 respectively, where fig. 3 is a convolution schematic diagram of the original model before pruning, fig. 4 is a convolution schematic diagram of the model after unstructured pruning, and fig. 5 is a convolution schematic diagram of the model after pattern pruning.

As can be seen from fig. 3, 4 and 5, the pruning of the pattern pruning is regular. Compared with irregular unstructured pruning, the mode pruning has the advantage that a compiler can know the distribution rule of the weight in advance, so that control instructions can be generated better for the structure of the electronic equipment.

However, an electronic device supports neural network models pruned with different modes to different degrees at run time, so the degree of adaptation between the neural network model and the electronic device still needs to be improved.

Therefore, the inventors propose a model generation method, a model generation device and an electronic device in the present application. The method first obtains an initial mode pruning model, and then generates a plurality of reference mode pruning models based on the pruning mode corresponding to the initial mode pruning model, wherein the network weights of the plurality of reference mode pruning models and the network weights of the initial mode pruning model have the same zero occupancy ratio. The initial mode pruning model and the plurality of reference mode pruning models are then respectively input into a cost model, and the operation parameters output by the cost model for each of the mode pruning models are acquired to obtain a plurality of operation parameters, wherein an operation parameter represents the operation performance of the electronic equipment when the corresponding mode pruning model runs on the electronic equipment. Finally, a mode pruning model adapted to the electronic equipment is determined from the plurality of mode pruning models based on the plurality of operation parameters. By introducing the cost model, the operation parameters of the electronic equipment when different mode pruning models run on it can be calculated, so that a mode pruning model better adapted to the electronic equipment can be found among the plurality of mode pruning models.

Embodiments of the present application will be described in detail below with reference to the accompanying drawings.

Referring to fig. 6, a model generation method provided in the embodiment of the present application is applied to an electronic device, and the method includes:

Step S110: Obtaining an initial mode pruning model.

In this embodiment of the application, the initial mode pruning model (pattern pruning model) is a mode pruning model obtained by pruning a neural network model according to a preset pruning rate. The preset pruning rate represents the proportion of network weights set to zero in each layer of the neural network model. For example, the preset pruning rate may be set in the range of 30% to 40%, that is, 30% to 40% of the network weights in each convolution kernel are set to zero.
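As a minimal sketch of applying a preset pruning rate to a single convolution kernel, the following Python snippet zeroes a fixed number of weights per kernel. The function names `zeros_per_kernel` and `prune_kernel` are hypothetical, and choosing the smallest-magnitude weights as the ones to zero is an assumption made for illustration; the passage above only fixes the proportion of zeroed weights.

```python
from typing import List

Kernel = List[List[float]]  # square convolution kernel, e.g. 3 x 3


def zeros_per_kernel(kernel_size: int, pruning_rate: float) -> int:
    """Number of weights to set to zero in one kernel_size x kernel_size kernel."""
    return round(kernel_size * kernel_size * pruning_rate)


def prune_kernel(kernel: Kernel, pruning_rate: float) -> Kernel:
    """Zero the smallest-magnitude weights (assumed criterion, not from the source)."""
    n_zero = zeros_per_kernel(len(kernel), pruning_rate)
    # Sort all positions by absolute weight and drop the n_zero smallest ones.
    by_magnitude = sorted(
        (abs(w), r, c) for r, row in enumerate(kernel) for c, w in enumerate(row)
    )
    drop = {(r, c) for _, r, c in by_magnitude[:n_zero]}
    return [
        [0.0 if (r, c) in drop else w for c, w in enumerate(row)]
        for r, row in enumerate(kernel)
    ]


# Illustrative weights only; a rate of 1/3 falls inside the 30% to 40% range.
kernel = [[0.9, -0.1, 0.4],
          [0.05, 0.8, -0.7],
          [0.3, -0.02, 0.6]]
pruned = prune_kernel(kernel, pruning_rate=1 / 3)
```

With a 3 × 3 kernel and a rate of one third, three of the nine weights are zeroed and the remaining six are left unchanged.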

Step S120: Generating a plurality of reference mode pruning models based on the pruning mode corresponding to the initial mode pruning model, wherein the network weights of the reference mode pruning models and the network weights of the initial mode pruning model have the same zero occupancy ratio.

In an embodiment of the application, the number of reference mode pruning models is less than or equal to the maximum number of reference mode pruning models that can be generated according to one pruning mode. In mode pruning, a fixed number of network weights are pruned in each convolution kernel. Unlike unstructured weight pruning, mode pruning yields the same sparsity rate in each filter and a finite number of mode shapes.
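To make the "finite number of mode shapes" concrete, the short Python sketch below counts the distinct zero-position layouts available for one kernel when the number of zeroed weights is fixed. The helper name `max_pattern_count` is hypothetical, and the count is per kernel; how many whole-model variants exist depends on how patterns are shared across kernels, which the text does not specify.

```python
import math


def max_pattern_count(kernel_size: int, zeros: int) -> int:
    """Ways to choose which `zeros` positions of a square kernel are set to zero."""
    return math.comb(kernel_size * kernel_size, zeros)


# A 3 x 3 kernel with 3 zeroed weights admits 84 distinct shapes,
# and with 4 zeroed weights it admits 126.
shapes_with_3_zeros = max_pattern_count(3, 3)
shapes_with_4_zeros = max_pattern_count(3, 4)
```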

Wherein a pruning pattern may be understood as the number of network weights set to zero in the pattern pruning model. In the embodiment of the present application, the pruning mode corresponding to the initial mode pruning model is a pruning mode corresponding to a preset pruning rate.

As one way, a plurality of reference mode pruning models are randomly generated according to the pruning mode corresponding to the initial mode pruning model, wherein the number of network weights set to zero in each reference mode pruning model is the same as in the initial mode pruning model, but the positions of the network weights set to zero are different. For example, fig. 7 is a schematic diagram corresponding to an initial mode pruning model generated according to a preset pruning rate, and fig. 8 is a schematic diagram corresponding to a randomly generated reference mode pruning model. In fig. 7 and fig. 8, the white areas are the network weights set to zero in the mode pruning model. As can be seen, the number of network weights set to zero is the same in fig. 7 and fig. 8, but the positions of the zeroed network weights differ.
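The random generation described above can be sketched as follows, under the assumption that a pruning mode is represented as a flattened 0/1 mask for one kernel (1 = kept weight, 0 = zeroed weight). The function name `random_reference_masks` and the mask representation are hypothetical and serve only to illustrate "same number of zeros, different positions".

```python
import math
import random
from typing import List, Set, Tuple

Mask = Tuple[int, ...]  # flattened kernel mask: 1 = kept weight, 0 = zeroed weight


def random_reference_masks(initial: Mask, count: int, seed: int = 0) -> List[Mask]:
    """Draw `count` distinct masks with as many zeros as `initial`, none equal to it."""
    n_zero = initial.count(0)
    available = math.comb(len(initial), n_zero) - 1  # exclude the initial mask
    if count > available:
        raise ValueError(f"only {available} distinct reference masks exist")
    rng = random.Random(seed)
    seen: Set[Mask] = {initial}
    references: List[Mask] = []
    while len(references) < count:
        zero_positions = set(rng.sample(range(len(initial)), n_zero))
        mask = tuple(0 if i in zero_positions else 1 for i in range(len(initial)))
        if mask not in seen:  # keep only masks whose zero positions are new
            seen.add(mask)
            references.append(mask)
    return references


initial_mask: Mask = (1, 1, 1, 1, 1, 1, 0, 0, 0)  # 3 of 9 weights zeroed
reference_masks = random_reference_masks(initial_mask, count=5)
```

The upper bound check mirrors the statement that the number of reference models cannot exceed the number that one pruning mode can produce.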

When the network weights of a mode pruning model are set to zero, a threshold may be set, and the unimportant network weights that are smaller than the threshold are set to zero. Specifically, according to the absolute value or variance of the network weights in the mode pruning model, the network weights whose absolute value or variance is smaller than the threshold may be set to zero.

Step S130: Respectively inputting the initial mode pruning model and the plurality of reference mode pruning models into a cost model, and acquiring the operation parameters output by the cost model for each of the mode pruning models to obtain a plurality of operation parameters, wherein an operation parameter represents the operation performance of the electronic equipment when the corresponding mode pruning model runs on the electronic equipment.

In this embodiment, the cost model is a pre-trained neural network model, and is configured to output, according to an input mode pruning model, a parameter that represents an operation performance of the electronic device when a corresponding mode pruning model is operated on the electronic device.

As one way, the initial mode pruning model and the plurality of reference mode pruning models are input into a pre-trained cost model, and the parameter output by the cost model for each mode pruning model, which represents the operation performance of the electronic device, is obtained, so as to obtain a plurality of operation parameters. Specifically, before being input into the pre-trained cost model, the initial mode pruning model and the plurality of reference mode pruning models may be encoded to obtain a coding sequence corresponding to each mode pruning model, and the coding sequence corresponding to each mode pruning model is then input into the pre-trained cost model.
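The source does not specify how a mode pruning model is encoded. Purely as an illustrative assumption, the Python sketch below flattens the 0/1 masks of a model's kernels into one bit sequence that a cost model could take as input; `encode_masks` and the mask layout are hypothetical.

```python
from typing import List

KernelMask = List[List[int]]  # 1 = kept weight, 0 = zeroed weight


def encode_masks(kernel_masks: List[KernelMask]) -> List[int]:
    """Concatenate the kernel masks row by row into a single 0/1 sequence."""
    return [bit for mask in kernel_masks for row in mask for bit in row]


# Two 3 x 3 kernels, each with four zeroed weights (illustrative values).
masks = [
    [[1, 0, 1], [0, 1, 0], [1, 0, 1]],
    [[1, 1, 1], [0, 0, 0], [1, 1, 0]],
]
sequence = encode_masks(masks)
```

A fixed-length sequence of this kind is convenient for a learned cost model, but other encodings (for example, an index into a table of allowed shapes) would serve the same purpose.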

Of course, in the embodiment of the present application, after the initial mode pruning model is obtained, the initial mode pruning model may be immediately input into the pre-trained cost model, so that the initial operation parameters corresponding to the initial mode pruning model may be obtained.

And then, after a plurality of reference mode pruning models are randomly generated, inputting the plurality of reference mode pruning models into a pre-trained cost model, and further obtaining the running parameters corresponding to the plurality of reference mode pruning models.

Optionally, as the plurality of reference mode pruning models are input into the pre-trained cost model and the operation parameters output by the cost model are obtained, if the operation parameter corresponding to the currently input reference mode pruning model is smaller than the operation parameter corresponding to the previously input reference mode pruning model, the currently input reference mode pruning model is stored, and its operation parameter is assigned to the initial operation parameter corresponding to the initial mode pruning model.
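The optional update rule above can be read as keeping a running best candidate. The following Python sketch takes that reading as an assumption: the comparison is made against the best operation parameter seen so far, which starts as the initial operation parameter. The `cost_model` argument is a stand-in callable, not the pre-trained network, and the numbers in `toy_costs` are made up for illustration.

```python
from typing import Callable, List, Tuple

Mask = Tuple[int, ...]  # flattened pruning mask: 1 = kept weight, 0 = zeroed weight


def search_best(initial: Mask,
                references: List[Mask],
                cost_model: Callable[[Mask], float]) -> Tuple[Mask, float]:
    """Return the mask with the smallest operation parameter and that parameter."""
    best_model = initial
    best_param = cost_model(initial)  # the initial operation parameter
    for reference in references:
        param = cost_model(reference)
        if param < best_param:
            # Store the current model and overwrite the running parameter.
            best_model, best_param = reference, param
    return best_model, best_param


# Made-up operation parameters (e.g. delays), for illustration only.
toy_costs = {
    (1, 1, 0, 0): 5.0,
    (1, 0, 1, 0): 4.0,
    (0, 1, 1, 0): 6.0,
    (0, 0, 1, 1): 3.5,
}
best_model, best_param = search_best(
    (1, 1, 0, 0),
    [(1, 0, 1, 0), (0, 1, 1, 0), (0, 0, 1, 1)],
    toy_costs.__getitem__,
)
```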

Step S140: determining a mode pruning model adapted to the electronic device from the plurality of mode pruning models based on the plurality of operating parameters.

In the embodiment of the present application, a mode pruning model adapted to the electronic device is determined from the initial mode pruning model and the plurality of reference mode pruning models according to the operation parameters corresponding to each mode pruning model output by the cost model, where the mode pruning model adapted to the electronic device may be understood as the mode pruning model most friendly to the electronic device.

The model generation method of this embodiment first obtains an initial mode pruning model, and then generates a plurality of reference mode pruning models based on the pruning mode corresponding to the initial mode pruning model, wherein the network weights of the plurality of reference mode pruning models and the network weights of the initial mode pruning model have the same zero occupancy ratio. The initial mode pruning model and the plurality of reference mode pruning models are then respectively input into a cost model, and the operation parameters output by the cost model for each of the mode pruning models are acquired to obtain a plurality of operation parameters, wherein an operation parameter represents the operation performance of the electronic equipment when the corresponding mode pruning model runs on the electronic equipment. Finally, a mode pruning model adapted to the electronic equipment is determined from the plurality of mode pruning models based on the plurality of operation parameters. By introducing the cost model, the operation parameters of the electronic equipment when different mode pruning models run on it can be calculated, so that a mode pruning model better adapted to the electronic equipment can be found among the plurality of mode pruning models.

Referring to fig. 9, a model generation method provided in the embodiment of the present application is applied to an electronic device, and the method includes:

Step S210: Generating an initial mode pruning model according to model adjustment parameters, wherein the model adjustment parameters represent the number of input channels and the number of output channels of a specified layer in the generated initial mode pruning model, and the pruning mode corresponding to the generated initial mode pruning model.

In the embodiment of the present application, according to the preset model adjustment parameters, an initial mode pruning model with the specified number of input channels, the specified number of output channels, and the specified number of network weights set to zero is randomly generated.

Step S220: Generating a plurality of reference mode pruning models based on the pruning mode corresponding to the initial mode pruning model, wherein the network weights of the reference mode pruning models and the network weights of the initial mode pruning model have the same zero occupancy ratio.

Step S230: Respectively inputting the initial mode pruning model and the plurality of reference mode pruning models into a cost model, and acquiring the operation parameters output by the cost model for each of the mode pruning models to obtain a plurality of operation parameters, wherein an operation parameter represents the operation performance of the electronic equipment when the corresponding mode pruning model runs on the electronic equipment.

Step S240: Determining the mode pruning model corresponding to the operation parameter that represents the best operation performance of the electronic equipment among the plurality of operation parameters as the mode pruning model adapted to the electronic equipment.

In the embodiment of the application, if the mode pruning model corresponding to the operation parameter which represents the best operation performance of the electronic equipment is the initial mode pruning model, the initial mode pruning model is determined as the mode pruning model which is adapted to the electronic equipment; if the mode pruning model corresponding to the operation parameter which represents the best operation performance of the electronic equipment is one of the plurality of reference mode pruning models, determining the corresponding reference mode pruning model in the plurality of reference mode pruning models as the mode pruning model which is adaptive to the electronic equipment.

As one way, the operation parameter includes an operation time delay. The step of determining the mode pruning model corresponding to the operation parameter that represents the best operation performance of the electronic device among the plurality of operation parameters as the mode pruning model adapted to the electronic device includes: determining the mode pruning model corresponding to the shortest operation time delay among the plurality of operation time delays as the mode pruning model adapted to the electronic device.

Specifically, the operation time delay is the time delay incurred when the electronic device runs a mode pruning model once. When the operation parameter is the operation time delay, the mode pruning model whose single run on the electronic device has the shortest time delay among the plurality of operation time delays is determined as the mode pruning model adapted to the electronic device.

As another way, the operation parameter includes operation power consumption. The step of determining the mode pruning model corresponding to the operation parameter that represents the best operation performance of the electronic device among the plurality of operation parameters as the mode pruning model adapted to the electronic device includes: determining the mode pruning model corresponding to the smallest operation power consumption among the plurality of operation power consumptions as the mode pruning model adapted to the electronic device.

Specifically, the operation power consumption is the power consumed when the electronic device runs a mode pruning model once. When the operation parameter is the operation power consumption, the mode pruning model whose single run on the electronic device consumes the least power among the plurality of operation power consumptions is determined as the mode pruning model adapted to the electronic device.
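Both selection rules above reduce to taking the minimum over the operation parameters. A minimal Python sketch, with a hypothetical helper `select_adapted_model` and made-up delay and power values that stand in for cost model outputs:

```python
from typing import Dict


def select_adapted_model(operation_params: Dict[str, float]) -> str:
    """Return the name of the model with the smallest operation parameter."""
    return min(operation_params, key=lambda name: operation_params[name])


# Made-up values for illustration only (units are arbitrary).
delays = {"initial": 12.0, "reference_1": 9.5, "reference_2": 10.1}
powers = {"initial": 0.8, "reference_1": 0.9, "reference_2": 0.7}

fastest_model = select_adapted_model(delays)   # shortest operation time delay
leanest_model = select_adapted_model(powers)   # smallest operation power consumption
```

In this made-up data the two criteria pick different models, which is the situation the weighting step of the later embodiment is meant to resolve.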

The model generation method includes the steps of firstly generating an initial mode pruning model according to model adjustment parameters, then generating a plurality of reference mode pruning models based on a pruning mode corresponding to the initial pruning model, respectively inputting the initial mode pruning model and the plurality of reference mode pruning models into a cost model, obtaining operation parameters which are output by the cost model and respectively correspond to the plurality of mode pruning models to obtain a plurality of operation parameters, and finally determining a mode pruning model which corresponds to the operation parameter which indicates the best operation performance of electronic equipment in the plurality of operation parameters as a mode pruning model which is adaptive to the electronic equipment. By the method, the operation parameters of the electronic equipment can be calculated when different mode pruning models operate on the electronic equipment by introducing the cost model, and then the mode pruning model which is more adaptive to the electronic equipment can be found from the plurality of mode pruning models.

Referring to fig. 10, a model generation method provided in the embodiment of the present application is applied to an electronic device, and the method includes:

Step S310: Acquiring an initial mode pruning model.

Step S320: Generating a plurality of reference mode pruning models based on the pruning mode corresponding to the initial mode pruning model, wherein the network weights of the reference mode pruning models and the network weights of the initial mode pruning model have the same zero occupancy ratio.

Step S330: Respectively inputting the initial mode pruning model and the plurality of reference mode pruning models into a cost model, and acquiring the operation parameters output by the cost model for each of the mode pruning models to obtain a plurality of operation parameters, wherein an operation parameter represents the operation performance of the electronic equipment when the corresponding mode pruning model runs on the electronic equipment.

Step S340: Acquiring weights respectively corresponding to the plurality of operation parameters to obtain a plurality of weights.

In the embodiment of the application, after the operation parameters corresponding to the multiple mode pruning models are obtained, the weight corresponding to each operation parameter can be calculated according to the specific situation of each operation parameter, so as to obtain multiple weights. Common methods of calculating weight include ranking, analytic hierarchy process, fuzzy analytic hierarchy process, and expert evaluation.

As one way, the operation parameters include a corresponding operation time delay and operation power consumption. The step of obtaining the weights corresponding to the plurality of operation parameters to obtain a plurality of weights includes: if the shortest operation time delay and the smallest operation power consumption do not correspond to each other, obtaining the weights corresponding to the plurality of operation parameters to obtain a plurality of weights.

A corresponding operation time delay and operation power consumption may be understood as the operation time delay and the operation power consumption produced by one run of the same mode pruning model on the same electronic device.

When the operation parameters include both operation time delay and operation power consumption, the shortest operation time delay and the smallest operation power consumption may belong to different mode pruning models, in which case neither value alone identifies the best model. The weights corresponding to the operation parameters are then obtained by the methods described above, so as to obtain the plurality of weights.
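The condition described above can be checked with a few lines of Python. This is a minimal sketch under stated assumptions: each operation parameter is modelled as a hypothetical `(latency, power)` pair, one pair per mode pruning model, and the function name `needs_weighting` is invented for illustration.

```python
def needs_weighting(params):
    """params: list of (latency, power) pairs, one per mode pruning model.

    Returns (False, index) when a single model attains both the shortest
    latency and the smallest power consumption, so no weighting is needed.
    Returns (True, None) when the two minima belong to different models."""
    min_latency = min(latency for latency, _ in params)
    min_power = min(power for _, power in params)
    for index, (latency, power) in enumerate(params):
        if latency == min_latency and power == min_power:
            return False, index
    return True, None
```

When the function reports that weighting is needed, the caller would go on to compute one weight per operation parameter, for example with the ranking method.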

Step S350: determining a mode pruning model adapted to the electronic device from the plurality of mode pruning models based on the plurality of weights.

In this embodiment of the present application, after the plurality of weights are obtained, the mode pruning model corresponding to the operation parameter with the minimum weight or the maximum weight among the plurality of weights may be determined as the mode pruning model adapted to the electronic device. Whether the mode pruning model corresponding to the minimum weight or the one corresponding to the maximum weight is selected depends on the method used to calculate the weights.

Illustratively, a ranking method can be used to calculate the weights. Specifically, assessment indexes for the operation parameters can be preset; the operation parameters corresponding to the different mode pruning models are then compared against the preset assessment indexes by pairwise comparison and ranked by importance, so that higher-ranked operation parameters receive larger weights and lower-ranked operation parameters receive smaller weights. Because the operation parameters are ranked by importance, in this embodiment the mode pruning model corresponding to the operation parameter with the maximum weight may be determined as the mode pruning model adapted to the electronic device.
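One possible reading of the ranking method is sketched below in Python. The application does not specify how the comparison against the preset assessment indexes is scored, so the sketch makes several assumptions: each operation parameter is a hypothetical `(latency, power)` pair, the score is the sum of each value divided by its preset index (lower is better), and the weight of the candidate at rank r among n candidates is (n - r + 1) divided by the sum 1 + 2 + ... + n.

```python
def ranking_weights(params, target_latency, target_power):
    """Return one weight per (latency, power) pair; weights sum to 1 and the
    best-ranked candidate receives the largest weight (assumed scheme)."""
    n = len(params)
    # Assumed score: values normalized by the preset indexes, lower is better.
    score = [lat / target_latency + pw / target_power for lat, pw in params]
    # Pairwise comparison: one win for every other candidate scoring worse.
    wins = [
        sum(1 for j in range(n) if j != i and score[i] < score[j])
        for i in range(n)
    ]
    # Most wins ranks first; ties keep their original order (stable sort).
    order = sorted(range(n), key=lambda i: -wins[i])
    total = n * (n + 1) // 2
    weights = [0.0] * n
    for rank, i in enumerate(order):
        weights[i] = (n - rank) / total
    return weights


def select_by_max_weight(weights):
    """Index of the mode pruning model whose operation parameter has the
    maximum weight."""
    return max(range(len(weights)), key=weights.__getitem__)
```

With this scheme the maximum weight marks the adapted model; a scheme that assigns smaller weights to better candidates would select the minimum weight instead.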

The model generation method comprises the steps of firstly obtaining an initial mode pruning model, generating a plurality of reference mode pruning models based on a pruning mode corresponding to the initial mode pruning model, respectively inputting the initial mode pruning model and the plurality of reference mode pruning models into a cost model, obtaining operating parameters respectively corresponding to the plurality of mode pruning models output by the cost model to obtain a plurality of operating parameters, then obtaining weights respectively corresponding to the plurality of operating parameters to obtain a plurality of weights, and determining a mode pruning model adaptive to electronic equipment from the plurality of mode pruning models based on the plurality of weights. By the method, the operation parameters of the electronic equipment can be calculated when different mode pruning models operate on the electronic equipment by introducing the cost model, and then the mode pruning model which is more adaptive to the electronic equipment can be found from the plurality of mode pruning models.

Referring to fig. 11, a model generation method provided in the embodiment of the present application is applied to an electronic device, and the method includes:

Step S410: acquire a training data set, wherein the training data set comprises operation parameters of the electronic device when different initial mode pruning models run on the electronic device.

In the embodiment of the present application, the different initial mode pruning models may include mode pruning models whose network weights have different zero ratios, as well as mode pruning models whose network weights have the same zero ratio but are set to zero at different positions. The training data set comprises parameters representing the operation performance of the electronic device when each initial mode pruning model runs on the same electronic device. The mode pruning models and the operation parameters are in one-to-one correspondence, that is, one mode pruning model corresponds to one operation parameter.

As one way, the training data set may be pre-stored in the cloud server, and when the initial cost model needs to be trained, the training data set may be obtained from the cloud server. Optionally, the data size of the selected training data may be determined according to the complexity of the initial cost model, and the more complex the initial cost model is, the larger the data size of the selected training data is.

Step S420: train an initial cost model based on the training data set until the initial cost model converges, and take the converged initial cost model as the cost model.

As one way, the step of training the initial cost model based on the training data set until the initial cost model converges and taking the converged initial cost model as the cost model includes: training the initial cost model based on the training data set; obtaining a target loss function; obtaining a loss value corresponding to the trained initial cost model based on the target loss function; and if the loss value meets a preset condition, determining that the initial cost model has converged, and taking the converged initial cost model as the cost model.

In the embodiment of the present application, the preset condition is a preset loss value. After the training data set is obtained as described above, it is input into the initial cost model to train the initial cost model, and the loss value corresponding to the initial cost model is calculated through the target loss function. Once the loss value meets the preset loss value, the initial cost model is determined to have converged, training is stopped, and the converged initial cost model is used as the final cost model.
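The stopping rule described above can be illustrated with a deliberately small Python sketch. It assumes a hypothetical linear cost model that predicts latency from a single feature (the fraction of non-zero weights), uses mean squared error as the target loss function, and trains by plain gradient descent; the application does not prescribe any of these choices, and the names `mse` and `train_cost_model` are invented here.

```python
def mse(w, b, samples):
    """Mean squared error of the linear model w * x + b on (x, y) samples."""
    return sum(((w * x + b) - y) ** 2 for x, y in samples) / len(samples)


def train_cost_model(samples, preset_loss=1e-4, lr=0.1, max_steps=10_000):
    """Train until the loss meets the preset loss value, or until max_steps
    updates have been applied. Returns (w, b, final_loss)."""
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(max_steps):
        # Preset condition: stop as soon as the loss is small enough.
        if mse(w, b, samples) <= preset_loss:
            break
        errors = [(w * x + b) - y for x, y in samples]
        grad_w = 2.0 * sum(e * x for e, (x, _) in zip(errors, samples)) / n
        grad_b = 2.0 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, mse(w, b, samples)
```

The `max_steps` guard is an addition of this sketch so that training terminates even if the preset loss value is never reached; the caller can compare the returned loss with `preset_loss` to tell the two cases apart.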

Step S430: acquire an initial mode pruning model.

Step S440: generate a plurality of reference mode pruning models based on the pruning mode corresponding to the initial mode pruning model, wherein the network weights of the plurality of reference mode pruning models and the network weights of the initial mode pruning model have the same zero ratio.

Step S450: input the initial mode pruning model and the plurality of reference mode pruning models respectively into the cost model, and acquire the operation parameters output by the cost model for each of the mode pruning models to obtain a plurality of operation parameters, wherein the operation parameters are parameters representing the operation performance of the electronic device when the corresponding mode pruning model runs on the electronic device.

Step S460: determining a mode pruning model adapted to the electronic device from the plurality of mode pruning models based on the plurality of operating parameters.

In the embodiment of the present application, after the mode pruning model adapted to the electronic device is determined, the mode pruning model can, in practical application, be trained on a training data set from the actual application environment so that it recovers its accuracy. The above method of selecting a mode pruning model adapted to an electronic device may also be used to compress neural networks for image recognition or speech recognition.

The model generation method comprises the steps of firstly obtaining a training data set, training an initial cost model until the initial cost model converges, using the converged initial cost model as a cost model, then obtaining an initial mode pruning model, generating a plurality of reference mode pruning models based on a pruning mode corresponding to the initial mode pruning model, respectively inputting the initial mode pruning model and the plurality of reference mode pruning models into the cost model, obtaining a plurality of operating parameters corresponding to the plurality of mode pruning models output by the cost model, and then determining a mode pruning model adaptive to electronic equipment from the plurality of mode pruning models based on the plurality of operating parameters. By the method, the operation parameters of the electronic equipment can be calculated when different mode pruning models operate on the electronic equipment by introducing the cost model, and then the mode pruning model which is more adaptive to the electronic equipment can be found from the plurality of mode pruning models.

Referring to fig. 12, a model generation apparatus 500 according to an embodiment of the present application includes:

a model obtaining unit 510, configured to obtain an initial mode pruning model.

As one way, the model obtaining unit 510 is further configured to generate an initial mode pruning model according to model adjustment parameters, where the model adjustment parameters characterize the number of input channels and the number of output channels of a specified layer in the generated initial mode pruning model, as well as the pruning mode corresponding to the generated initial mode pruning model.

A model generating unit 520, configured to generate a plurality of reference mode pruning models based on the pruning mode corresponding to the initial mode pruning model, where network weights of the plurality of reference mode pruning models and network weights of the initial mode pruning model have the same zero occupancy ratio.

A parameter obtaining unit 530, configured to respectively input the initial mode pruning model and the multiple reference mode pruning models into a cost model, and obtain operation parameters respectively corresponding to the multiple mode pruning models output by the cost model, so as to obtain multiple operation parameters, where the operation parameters are parameters representing operation performance of the electronic device when the corresponding mode pruning models operate on the electronic device.

A model determining unit 540, configured to determine, based on the plurality of operating parameters, a mode pruning model adapted to the electronic device from the plurality of mode pruning models.

As one way, the model determining unit 540 is configured to determine, as the mode pruning model adapted to the electronic device, the mode pruning model corresponding to the operation parameter that characterizes the best operation performance of the electronic device among the plurality of operation parameters.

As one way, the model determining unit 540 is configured to determine, as the mode pruning model adapted to the electronic device, the mode pruning model corresponding to the shortest operation time delay among the multiple operation time delays.

As another way, the model determining unit 540 is further configured to determine, as the mode pruning model adapted to the electronic device, the mode pruning model corresponding to the smallest operation power consumption among the multiple operation power consumptions.

As another mode, the model determining unit 540 is further configured to obtain weights corresponding to the multiple operating parameters, so as to obtain multiple weights; determining a mode pruning model adapted to the electronic device from the plurality of mode pruning models based on the plurality of weights.

Optionally, the model determining unit 540 is further configured to, if the operation time delay representing the shortest operation time delay does not correspond to the operation power consumption representing the smallest operation power consumption, obtain weights corresponding to the multiple operation parameters, so as to obtain multiple weights.

Referring to fig. 13, the apparatus 500 further includes:

a model training unit 550, configured to obtain a training data set, where the training data set includes operation parameters of the electronic device when different initial mode pruning models are run on the electronic device; and training the initial cost model based on the training data set until the initial cost model converges, and taking the converged initial cost model as the cost model.

By way of example, the model training unit 550 is configured to train the initial cost model based on the training data set; obtaining a target loss function; obtaining a loss value corresponding to the trained initial cost model based on the target loss function; and if the loss value meets a preset condition, determining that the initial cost model is converged, and taking the converged initial cost model as the cost model.

It should be noted that the device embodiment and the method embodiment in the present application correspond to each other, and specific principles in the device embodiment may refer to the contents in the method embodiment, which is not described herein again.

An electronic device provided by the present application will be described below with reference to fig. 14.

Referring to fig. 14, based on the model generating method and apparatus, another electronic device 800 capable of executing the model generating method is provided in the embodiment of the present application. The electronic device 800 includes one or more processors 802 (only one shown), a memory 804, and a network module 806 coupled to each other. The memory 804 stores programs that can execute the content of the foregoing embodiments, and the processor 802 can execute the programs stored in the memory 804.

The processor 802 may include one or more processing cores. The processor 802 connects the various components of the electronic device 800 using various interfaces and circuitry, and performs the functions of the electronic device 800 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 804 and by invoking data stored in the memory 804. Alternatively, the processor 802 may be implemented in hardware using at least one of Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA), and Programmable Logic Array (PLA). The processor 802 may integrate one or more of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like. The CPU mainly handles the operating system, the user interface, application programs and the like; the GPU is used for rendering and drawing display content; the modem is used to handle wireless communications. It is understood that the modem may also not be integrated into the processor 802 and may instead be implemented by a separate communication chip.

The memory 804 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). The memory 804 may be used to store instructions, programs, code, sets of codes, or sets of instructions. The memory 804 may include a program storage area and a data storage area, wherein the program storage area may store instructions for implementing an operating system, instructions for implementing at least one function (such as a touch function, a sound playing function, an image playing function, etc.), instructions for implementing the foregoing method embodiments, and the like. The data storage area may store data created during use of the electronic device 800 (e.g., phone books, audio-visual data, chat log data), and so forth.

The network module 806 is configured to receive and transmit electromagnetic waves, and achieve interconversion between the electromagnetic waves and the electrical signals, so as to communicate with a communication network or other devices, for example, an audio playing device. The network module 806 may include various existing circuit elements for performing these functions, such as an antenna, a radio frequency transceiver, a digital signal processor, an encryption/decryption chip, a Subscriber Identity Module (SIM) card, memory, and so forth. The network module 806 may communicate with various networks, such as the internet, an intranet, a wireless network, or with other devices via a wireless network. The wireless network may comprise a cellular telephone network, a wireless local area network, or a metropolitan area network. For example, the network module 806 can interact with the base station.

Referring to fig. 15, a block diagram of a computer-readable storage medium according to an embodiment of the present application is shown. The computer-readable storage medium 900 has stored therein program code that can be called by a processor to perform the methods described in the above method embodiments.

The computer-readable storage medium 900 may be an electronic memory such as a flash memory, an EEPROM (electrically erasable programmable read only memory), an EPROM, a hard disk, or a ROM. Alternatively, the computer-readable storage medium 900 includes a non-volatile computer-readable storage medium. The computer readable storage medium 900 has storage space for program code 910 to perform any of the method steps of the method described above. The program code can be read from or written to one or more computer program products. The program code 910 may be compressed, for example, in a suitable form.

According to the model generation method, the model generation device, the electronic device and the storage medium provided by the present application, an initial mode pruning model is obtained first, and a plurality of reference mode pruning models are then generated based on the pruning mode corresponding to the initial mode pruning model, wherein the network weights of the plurality of reference mode pruning models and the network weights of the initial mode pruning model have the same zero ratio. The initial mode pruning model and the plurality of reference mode pruning models are then respectively input into a cost model, and the operation parameters output by the cost model for each of the mode pruning models are obtained to yield a plurality of operation parameters, wherein the operation parameters are parameters representing the operation performance of the electronic device when the corresponding mode pruning model runs on the electronic device. Finally, the mode pruning model adapted to the electronic device is determined from the plurality of mode pruning models based on the plurality of operation parameters. By introducing the cost model, the operation parameters of the electronic device when different mode pruning models run on it can be calculated, so that the mode pruning model better adapted to the electronic device can be found among the plurality of mode pruning models.

While the present invention has been described with reference to the particular illustrative embodiments, it is to be understood that the invention is not limited to the disclosed embodiments, but is intended to cover various modifications, equivalent arrangements, and equivalents thereof, which may be made by those skilled in the art without departing from the spirit and scope of the invention as defined by the appended claims.

Claims

1. A method of model generation, the method comprising:

obtaining an initial mode pruning model;

generating a plurality of reference mode pruning models based on the pruning modes corresponding to the initial mode pruning models, wherein the network weights of the plurality of reference mode pruning models and the network weights of the initial mode pruning models have the same zero ratio;

respectively inputting the initial mode pruning model and the plurality of reference mode pruning models into a cost model, and acquiring operation parameters respectively corresponding to the plurality of mode pruning models output by the cost model to obtain a plurality of operation parameters, wherein the operation parameters are parameters representing the operation performance of the electronic equipment when the corresponding mode pruning models operate on the electronic equipment;

determining a mode pruning model adapted to the electronic device from the plurality of mode pruning models based on the plurality of operating parameters.

2. The method of claim 1, wherein the obtaining an initial mode pruning model comprises:

generating an initial mode pruning model according to model adjustment parameters, wherein the model adjustment parameters characterize the number of input channels and the number of output channels of a specified layer in the generated initial mode pruning model, as well as the pruning mode corresponding to the generated initial mode pruning model.

3. The method of claim 1, wherein determining a mode pruning model from the plurality of mode pruning models that is adapted to the electronic device based on the plurality of operating parameters comprises:

and determining a mode pruning model corresponding to the operation parameter which represents the best operation performance of the electronic equipment in the plurality of operation parameters as a mode pruning model which is adaptive to the electronic equipment.

4. The method of claim 3, wherein the operating parameter comprises an operating time delay; the determining, as a mode pruning model adapted to the electronic device, a mode pruning model corresponding to an operation parameter that characterizes the best operation performance of the electronic device among the plurality of operation parameters includes:

determining, as the mode pruning model adapted to the electronic device, the mode pruning model corresponding to the shortest operation time delay among the plurality of operation time delays.

5. The method of claim 3, wherein the operating parameter comprises operating power consumption; the determining, as a mode pruning model adapted to the electronic device, a mode pruning model corresponding to an operation parameter that characterizes the best operation performance of the electronic device among the plurality of operation parameters includes:

determining, as the mode pruning model adapted to the electronic device, the mode pruning model corresponding to the smallest operation power consumption among the plurality of operation power consumptions.

6. The method of claim 1, wherein determining a mode pruning model from the plurality of mode pruning models that is adapted to the electronic device based on the plurality of operating parameters comprises:

obtaining weights corresponding to the multiple operation parameters respectively to obtain multiple weights;

determining a mode pruning model adapted to the electronic device from the plurality of mode pruning models based on the plurality of weights.

7. The method of claim 6, wherein the operating parameters include a corresponding operating latency and operating power consumption; the obtaining of the respective weights corresponding to the plurality of operating parameters to obtain a plurality of weights includes:

if the shortest operation time delay and the smallest operation power consumption do not correspond to each other, acquiring the weights corresponding to the plurality of operation parameters respectively to obtain a plurality of weights.

8. The method of claim 1, wherein, before the obtaining an initial mode pruning model, the method further comprises:

acquiring a training data set, wherein the training data set comprises operation parameters of the electronic equipment when different initial mode pruning models operate on the electronic equipment;

and training the initial cost model based on the training data set until the initial cost model converges, and taking the converged initial cost model as the cost model.

9. The method of claim 8, wherein the training the initial cost model based on the training data set until the initial cost model converges, and wherein using the converged initial cost model as the cost model comprises:

training the initial cost model based on the training dataset;

obtaining a target loss function;

obtaining a loss value corresponding to the trained initial cost model based on the target loss function;

and if the loss value meets a preset condition, determining that the initial cost model is converged, and taking the converged initial cost model as the cost model.

10. An apparatus for model generation, the apparatus comprising:

the model acquisition unit is used for acquiring an initial mode pruning model;

a model generating unit, configured to generate multiple reference mode pruning models based on the pruning modes corresponding to the initial mode pruning model, where network weights of the multiple reference mode pruning models and network weights of the initial mode pruning model have the same zero ratio;

a parameter obtaining unit, configured to input the initial mode pruning model and the multiple reference mode pruning models into a cost model, and obtain operation parameters corresponding to the multiple pruning models output by the cost model, so as to obtain multiple operation parameters, where the operation parameters are parameters that represent operation performance of an electronic device when the corresponding mode pruning models operate on the electronic device;

a model determining unit, configured to determine, based on the plurality of operating parameters, a mode pruning model adapted to the electronic device from the plurality of mode pruning models.

11. An electronic device comprising one or more processors and a memory; one or more programs stored in the memory and configured to be executed by the one or more processors to perform the method of any of claims 1-9.

12. A computer-readable storage medium, having a program code stored therein, wherein the program code when executed by a processor performs the method of any of claims 1-9.

13. A computer program product comprising computer program/instructions, characterized in that the computer program/instructions, when executed by a processor, implement the steps of the method of any of claims 1-9.