CN115392448A - Compression method and compression device for convolutional neural network model - Google Patents

Compression method and compression device for convolutional neural network model

Info

Publication number
CN115392448A
Authority
CN
China
Prior art keywords
neural network
network model
convolutional neural
convolution kernel
convolution
Prior art date
Legal status
Pending
Application number
CN202110565583.2A
Other languages
Chinese (zh)
Inventor
欧歌
马小惠
梁烁斌
Current Assignee
BOE Technology Group Co Ltd
Beijing BOE Technology Development Co Ltd
Original Assignee
BOE Technology Group Co Ltd
Beijing BOE Technology Development Co Ltd
Priority date
Filing date
Publication date
Application filed by BOE Technology Group Co Ltd, Beijing BOE Technology Development Co Ltd filed Critical BOE Technology Group Co Ltd
Priority to CN202110565583.2A priority Critical patent/CN115392448A/en
Publication of CN115392448A publication Critical patent/CN115392448A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)

Abstract

Embodiments of the present disclosure provide a compression method and a compression device for a convolutional neural network model. The method includes the following steps: constructing a convolutional neural network model, wherein the convolutional neural network model comprises at least a plurality of one-dimensional convolutional layers; training the convolutional neural network model; and performing a pruning operation, in a predetermined order, on the convolution kernel filters of the different one-dimensional convolutional layers in the trained convolutional neural network model. According to the embodiments of the present disclosure, metric-based pruning compresses the convolutional neural network model, reducing its redundancy while avoiding the loss of accuracy that pruning can cause. The compressed convolutional neural network model therefore retains high recognition accuracy, is lightweight, and is easy to port to hardware devices.

Description

Compression method and compression device for convolutional neural network model
Technical Field
The present disclosure relates to the field of network model computing technologies, and in particular, to a compression method and a compression apparatus for a convolutional neural network model.
Background
Deep learning algorithms are widely applied in fields such as disease identification and target detection. In heart rate identification, the convolutional neural network model, one of the important tools of deep learning, solves the classification of heart rate data well and with high accuracy. However, such network models have large numbers of parameters, which increases both the storage cost of the parameters and the computational load of the model. These problems make deep learning network models difficult to deploy on mobile terminals or hardware devices, so model compression of convolutional neural networks becomes important. Most existing metric-based pruning methods for convolutional neural network models use the mean value, the L1 norm, the L2 norm, or similar measures as the criterion for the importance of the convolution kernel filters. When the standard deviation of the metric values is too small, or their minimum value is still large, these methods cannot judge which convolution kernel filters should be deleted, so a reasonable pruning operation cannot be realized.
Disclosure of Invention
In view of this, embodiments of the present disclosure provide a method to address the problem in the prior art that it cannot be determined which convolution kernel filters of a convolutional neural network model should be deleted, so that a reasonable pruning operation cannot be realized.
In one aspect, an embodiment of the present disclosure provides a compression method for a convolutional neural network model, including the following steps: constructing a convolutional neural network model, wherein the convolutional neural network model comprises at least a plurality of one-dimensional convolutional layers; training the convolutional neural network model; and pruning the convolution kernel filters of the different one-dimensional convolutional layers in the trained convolutional neural network model in a predetermined order.
In some embodiments, the convolutional neural network model further comprises at least a max pooling layer, a batch normalization layer, and a global average pooling layer.
In some embodiments, the training of the convolutional neural network model comprises: inputting a training set into the convolutional neural network model for training until a predetermined number of iterations is reached; and inputting a test set into the iterated convolutional neural network model for testing until a predetermined accuracy is reached.
In some embodiments, the pruning operation performed in a predetermined order on the convolution kernel filters of the different one-dimensional convolutional layers in the trained convolutional neural network model includes: acquiring, in a first predetermined order, the feature value of each first convolution kernel filter of a specified one of the one-dimensional convolutional layers; obtaining a set of second convolution kernel filters based on the feature values; and obtaining a third convolution kernel filter from the set based on similarity.
In some embodiments, further comprising: arranging the third convolution kernel filters in a second predetermined order in the designated layer.
In some embodiments, the feature value of each first convolution kernel filter of the specified one-dimensional convolutional layer, acquired in the first predetermined order, is a weight average value, and the weight average value is determined based on the following formula:

s_j = (1/n_i) Σ_{l=1}^{n_i} |K_l|
In some embodiments, the obtaining of the set of second convolution kernel filters based on the feature values comprises: sorting the feature values of the first convolution kernel filters in a third predetermined order; and determining the first convolution kernel filters whose feature values are greater than a first threshold to be second convolution kernel filters.
In some embodiments, the screening in the set based on similarity to obtain a third convolution kernel filter comprises: determining a similarity between any two of the second convolution kernel filters in the set; and deleting redundant convolution kernel filters in the set based on the similarity to obtain the third convolution kernel filter.
In some embodiments, the determining of the similarity between any two of the second convolution kernel filters in the set comprises: performing a one-dimensional convolution operation with all the second convolution kernel filters to obtain an output feature map of m x n dimensions, wherein m is the data dimension and n is the number of the second convolution kernel filters; and acquiring, based on the output feature map, the similarity between the two column vectors representing any two second convolution kernel filters.
In some embodiments, said removing redundant convolution kernel filters in said set based on said similarity comprises removing one of two column vectors having a similarity greater than a second threshold and the corresponding said second convolution kernel filter.
In another aspect, the present disclosure provides a compression apparatus for a convolutional neural network model, which includes: the building module is used for building a convolutional neural network model, and the convolutional neural network model at least comprises a plurality of layers of one-dimensional convolutional layers; a training module for training the convolutional neural network model; and the pruning module is used for carrying out pruning operation according to a preset sequence aiming at the convolution kernel filters of different layers in the one-dimensional convolution layer in the trained convolution neural network model.
In another aspect, the present disclosure provides a storage medium storing a computer program, which when executed by a processor implements the steps of the method according to any one of the above aspects.
In another aspect, the present disclosure provides an electronic device, which at least includes a memory and a processor, where the memory stores a computer program, and the processor implements the steps of the method in any one of the above technical solutions when executing the computer program on the memory.
In another aspect, the present disclosure further provides a method for identifying heart rate data, including the following steps: collecting heart rate data; and inputting the heart rate data into a convolutional neural network model to obtain the type of the heart rate data, wherein the convolutional neural network model is compressed by the method of any one of the above technical solutions.
Embodiments of the present disclosure relate to a pruning-based compression algorithm for deep learning. Metric-based pruning is adopted to compress the convolutional neural network model; specifically, the pruning operation combines the importance of each convolution kernel filter with the similarity between different convolution kernel filters, thereby reducing the redundancy of the model. By retraining the pruned convolutional neural network model, the loss of accuracy caused by pruning is avoided. The compressed convolutional neural network model not only retains good recognition accuracy but also has fewer parameters and lower complexity, so the model is lightweight and easy to port to hardware devices.
Drawings
To illustrate the embodiments of the present disclosure or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present disclosure; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic step diagram of a compression method according to a first embodiment of the present disclosure;
FIG. 2 is a schematic step diagram of the compression method according to the first embodiment of the present disclosure;
FIG. 3 is a schematic step diagram of the compression method according to the first embodiment of the present disclosure;
FIG. 4 is a schematic step diagram of the compression method according to the first embodiment of the present disclosure;
FIG. 5 is a schematic diagram of the deletion of a first convolution kernel filter in the compression method according to the first embodiment of the present disclosure;
FIG. 6 is a schematic step diagram of the compression method according to the first embodiment of the present disclosure;
FIG. 7 is a schematic step diagram of the compression method according to the first embodiment of the present disclosure.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present disclosure more clear, the technical solutions of the embodiments of the present disclosure will be described below clearly and completely with reference to the accompanying drawings of the embodiments of the present disclosure. It is to be understood that the described embodiments are only a few embodiments of the present disclosure, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the described embodiments of the disclosure without any inventive step, are within the scope of protection of the disclosure.
Unless defined otherwise, technical or scientific terms used herein shall have the ordinary meaning as understood by one of ordinary skill in the art to which this disclosure belongs. The use of "first," "second," and the like in this disclosure is not intended to indicate any order, quantity, or importance, but rather is used to distinguish one element from another. The word "comprising" or "comprises", and the like, means that the element or item listed before the word covers the element or item listed after the word and its equivalents, but does not exclude other elements or items. The terms "connected" or "coupled" and the like are not restricted to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "upper", "lower", "left", "right", and the like are used only to indicate relative positional relationships, and when the absolute position of the object being described is changed, the relative positional relationships may also be changed accordingly.
To keep the following description of the embodiments of the present disclosure clear and concise, a detailed description of known functions and known components has been omitted from the present disclosure.
A first embodiment of the present disclosure provides a compression method for a convolutional neural network model. The method can perform pruning operations on a neural network model used by a deep learning algorithm, in particular a convolutional neural network model, deleting redundant filters in the convolutional layers. As shown in FIG. 1, it includes the following steps:
s101, constructing a convolutional neural network model, wherein the convolutional neural network model at least comprises a plurality of layers of one-dimensional convolutional layers.
In this step, a convolutional neural network model is first constructed. A Convolutional Neural Network (CNN) is a feed-forward neural network that contains convolutional computation and has a deep structure. A convolutional neural network has feature learning (representation learning) capability and can perform shift-invariant classification of input information according to its hierarchical structure. Constructed by imitating the biological visual perception mechanism, a convolutional neural network can perform both supervised and unsupervised learning. Because the convolution kernels in its convolutional layers share parameters and the connections between layers are sparse, a convolutional neural network can learn grid-like features (for example, pixels and audio) with a small amount of computation; it has a stable effect and requires no additional feature engineering on the data.
Further, the convolutional neural network model here comprises at least a plurality of one-dimensional convolutional layers. Each one-dimensional convolutional layer is composed of several convolutional units, and the parameters of each convolutional unit are optimized by a back-propagation algorithm. The purpose of the convolution operation performed by the one-dimensional convolutional layers is to extract different input features: the first one-dimensional convolutional layer can extract only low-level features such as edges, lines and corners, while deeper one-dimensional convolutional layers iteratively extract more complex features from these low-level features.
Further, the constructed convolutional neural network model here also comprises at least a max pooling layer, a batch normalization layer and a global average pooling layer. A pooling layer (also called a sampling layer) is generally arranged after a convolutional layer and is likewise composed of a plurality of feature planes, each of which corresponds to one feature plane of the preceding layer, so the number of feature planes does not change. Specifically, the convolutional layer is the input layer of the pooling layer; one feature plane of the convolutional layer corresponds to one feature plane of the pooling layer, and the neurons of the pooling layer are connected to local receptive fields of that convolutional layer, with the local receptive fields of different neurons not overlapping. The role of the pooling layer is to obtain features with spatial invariance by reducing the resolution of the feature planes; the pooling layer thus acts as a secondary feature extractor in the convolutional neural network model, each of its neurons pooling a local receptive field. Max pooling takes the maximum value within the local receptive field; global average pooling averages all the values within the local receptive field.
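The layer types described above can be sketched in plain numpy. This is a minimal illustration under the usual definitions of these operations, not the patent's implementation; all function names are ours.

```python
import numpy as np

def conv1d(x, kernels):
    # x: (in_channels, length); kernels: (out_channels, in_channels, k)
    # "valid" one-dimensional convolution (no padding, stride 1)
    out_ch, in_ch, k = kernels.shape
    length = x.shape[1] - k + 1
    out = np.zeros((out_ch, length))
    for o in range(out_ch):
        for t in range(length):
            out[o, t] = np.sum(kernels[o] * x[:, t:t + k])
    return out

def max_pool1d(x, pool=2):
    # non-overlapping max pooling along the time axis
    length = (x.shape[1] // pool) * pool
    return x[:, :length].reshape(x.shape[0], -1, pool).max(axis=2)

def global_avg_pool1d(x):
    # global average pooling: one value per feature map
    return x.mean(axis=1)
```

With an all-ones input of shape (1, 8) and two all-ones kernels of width 3, `conv1d` produces a (2, 6) feature map of 3.0s, max pooling halves the time axis, and global average pooling reduces each map to a single value.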
S102, training the convolutional neural network model.
In this step, the convolutional neural network model constructed in step S101 is trained. The model is trained by feeding it a data set, and is then tested to obtain a model that satisfies a predetermined condition. The data set may be determined by the application scenario of the convolutional neural network model. For example, when analyzing and classifying heart rate data, the data set may include heart-rate-related data of different patients; specifically, the heart rate data set may be divided into a training set and a test set, used respectively for training and testing the convolutional neural network model, and the heart rate data recognition function is realized through this training.
Further, as shown in fig. 2, the training the convolutional neural network model includes:
s201, inputting a training set into the convolutional neural network model for training and reaching a preset iteration number.
In this step, the training set of the data set is input into the convolutional neural network model for training until a predetermined number of iterations is reached. Specifically, for example, a data set of heart rate data is divided into a training set and a test set; the training set is input into the convolutional neural network model to train it, the algorithm and features are iterated step by step during training, and the training terminates when the predetermined number of iterations is reached. The number of iterations here can be preset.
S202, inputting the test set into the convolution neural network model after iteration for testing so as to achieve preset precision.
After the convolutional neural network model has reached the predetermined number of iterations through step S201, the test set is input into the iterated convolutional neural network model for testing until the predetermined accuracy is reached. Specifically, for example, the test set of heart rate data is input into the iterated convolutional neural network model to obtain the model's predictions, which are compared with the true values to obtain the accuracy of the model on the test set. Further, the convolutional neural network model may be adjusted based on the recognition accuracy of each round, repeating steps S201 and S202 until the model reaches the predetermined accuracy.
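Steps S201 and S202 can be illustrated with a stand-in linear model trained on synthetic one-dimensional data. The data, model, and hyperparameters below are hypothetical placeholders; the patent itself does not specify a training algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in 1-D samples (synthetic, not real heart rate data):
# class 1 if the sample mean is clearly positive, class 0 if clearly negative.
X = rng.normal(size=(400, 16))
X = X[np.abs(X.mean(axis=1)) > 0.1][:200]      # keep samples with a margin
y = (X.mean(axis=1) > 0).astype(int)
X_train, y_train = X[:160], y[:160]            # training set (S201)
X_test, y_test = X[160:], y[160:]              # test set (S202)

w, b = np.zeros(16), 0.0
max_iters = 50                                 # predetermined iteration count
for _ in range(max_iters):
    for xi, yi in zip(X_train, y_train):
        pred = int(xi @ w + b > 0)
        w += 0.1 * (yi - pred) * xi            # perceptron update
        b += 0.1 * (yi - pred)

preds = (X_test @ w + b > 0).astype(int)
accuracy = float(np.mean(preds == y_test))     # compare against the predetermined accuracy
```

In a real setting, the loop of adjusting the model and repeating S201/S202 would wrap this training until `accuracy` meets the preset target.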
S103, pruning operations are performed in a predetermined order on the convolution kernel filters of the different one-dimensional convolutional layers in the trained convolutional neural network model.
After the convolutional neural network model has been trained in step S102, pruning operations are performed in this step, in a predetermined order, on the convolution kernel filters of the different one-dimensional convolutional layers of the trained model. Because the one-dimensional convolutional layers are arranged in multiple layers and each layer has its own convolution kernel filters, the pruning operation is a screening operation on the convolution kernel filters of the different layers. The predetermined order may follow the sequence of the layers: for example, it may proceed layer by layer from the first one-dimensional convolutional layer, or layer by layer from the last one; the specific manner is not limited here. In this way, through the screening of the convolution kernel filters, the combination of the retained filters improves the iteration effect of the one-dimensional convolutional layers in the model, and the predictions of the convolutional neural network model become more accurate.
Further, the pruning operation performed in a predetermined order on the convolution kernel filters of the different one-dimensional convolutional layers in the trained convolutional neural network model includes, as shown in FIG. 3, the following steps:
s301, obtaining the characteristic value of each first convolution kernel filter of the appointed layer in the one-dimensional convolution layer according to a first preset sequence.
In this step, the feature values of the first convolution kernel filters of a specified one of the one-dimensional convolutional layers are acquired in a first predetermined order. In the process of handling the convolution kernel filters of the different one-dimensional convolutional layers, any one of the layers may be selected according to the first predetermined order, for example the first layer, the last layer, or an intermediate specified layer i. After a specified layer i is selected, the feature value of each first convolution kernel filter in that layer is obtained; the feature value is used to evaluate the importance of the first convolution kernel filter. Preferably, the feature value may be a weight average value, which serves as the evaluation index of the importance of the convolution kernel filter. For each first convolution kernel filter, the weight average value is calculated as follows:

s_j = (1/n_i) Σ_{l=1}^{n_i} |K_l|

wherein n_i denotes the number of input channels of the specified convolutional layer and K_l denotes the weight.
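Under the reading above, the weight average value s_j can be computed per filter as follows. The interpretation of the formula as a mean of absolute weights is an assumption, since the original formula image is not reproduced in the text, and the function name is ours.

```python
import numpy as np

def weight_average(kernels):
    # kernels: (out_channels, n_i, k) weights of one 1-D conv layer.
    # s_j = mean of the absolute weights of filter j over its n_i input
    # channels (assumed reading of the patent's "weight average value").
    out_ch = kernels.shape[0]
    return np.abs(kernels).reshape(out_ch, -1).mean(axis=1)
```

For two filters with weights [1, -1, 1] and [0, 0, 2], the importances come out as 1.0 and 2/3 respectively, so the first filter would rank as more important.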
S302, acquiring a set of second convolution kernel filters based on the characteristic values.
After the feature value of each first convolution kernel filter of the specified one-dimensional convolutional layer has been acquired in the first predetermined order through step S301, all the first convolution kernel filters can undergo a first screening for importance based on the feature values, yielding the set of second convolution kernel filters.
Specifically, the obtaining of the set of second convolution kernel filters based on the feature values, as shown in fig. 4, includes the following steps:
s401, sorting the characteristic values of the first convolution kernel filter according to a third preset sequence.
In this step, the feature values of all the first convolution kernel filters of the specified layer i of the one-dimensional convolutional layers are sorted in a third predetermined order, which may be the order of numerical magnitude; for example, when the feature value is the weight average value, all the first convolution kernel filters may be sorted by the magnitude of the weight average value s_j of each filter.
S402, determining the first convolution kernel filter with the characteristic value larger than the first threshold value as a second convolution kernel filter.
After the feature values of the first convolution kernel filters have been sorted in the third predetermined order in step S401, in this step the first convolution kernel filters whose feature values are greater than the first threshold are determined to be second convolution kernel filters according to the sorting result. The objective of this step is to delete the unimportant first convolution kernel filters. For example, when the feature value is the weight average value, a first threshold ε is set, and the m (m >= 1) first convolution kernel filters whose weight average value s_j is smaller than ε are deleted together with their corresponding output feature maps; if the weight average value s_j of a first convolution kernel filter of the specified layer i is greater than or equal to ε, the calculation proceeds directly to the next step. The deletion process is shown in FIG. 5: the three parts of FIG. 5 are the input feature map of layer i of the one-dimensional convolutional layers, the 4 first convolution kernel filters of the current layer i, and the output feature map. If the s_j value of the first convolution kernel filter f3 is the smallest, f3 is the part to be deleted, and the part of the output feature map corresponding to f3 must be deleted accordingly.
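The first screening of step S402, including the deletion of the matching output-feature-map rows shown in FIG. 5, can be sketched as follows. The function name and the ε value used in the example are illustrative.

```python
import numpy as np

def prune_unimportant(kernels, out_fmap, eps):
    # First screening (S402): keep filters with s_j >= eps.
    # kernels:  (out_channels, n_i, k) layer weights
    # out_fmap: (out_channels, length), one row per filter
    # eps:      first threshold epsilon (chosen by the user)
    s = np.abs(kernels).reshape(kernels.shape[0], -1).mean(axis=1)
    keep = s >= eps
    # the deleted filters' rows of the output feature map go with them (FIG. 5)
    return kernels[keep], out_fmap[keep], keep
```

With three filters of importance 1.0, 0.1 and 0.5 and ε = 0.3, the middle filter and its output-feature-map row are removed while the other two survive.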
S303, acquiring a third convolution kernel filter in the set based on the similarity.
After the set of second convolution kernel filters has been obtained based on the feature values through step S302, a third convolution kernel filter is obtained in this step by screening the set based on similarity. Similarity here refers to the degree of resemblance between different second convolution kernel filters of each one-dimensional convolutional layer. In this step, a second screening is performed on the set of second convolution kernel filters based on the similarity to obtain the third convolution kernel filters.
Further, the obtaining a third convolution kernel filter in the set based on the similarity, as shown in fig. 6, includes the following steps:
s501, determining the similarity between any two second convolution kernel filters in the set.
Since the set contains all the second convolution kernel filters obtained by the first screening of the first convolution kernel filters, in this step the similarity between any two second convolution kernel filters in the set is determined; that is, the similarity between each pair of second convolution kernel filters is calculated to measure how alike they are. The similarity between two convolution kernel filters is usually calculated by the Euclidean distance, but because the Euclidean distance is easily affected by the scale of each dimension, the embodiment of the present disclosure preferably calculates the similarity between any two second convolution kernel filters by the cosine measure: the greater the cosine similarity between two second convolution kernel filters, the more redundancy exists between them, and one of the two can be deleted without having too great an influence on the convolutional neural network model.
Further, the determining the similarity between any two second convolution kernel filters in the set, as shown in fig. 7, includes the following steps:
S601, performing a one-dimensional convolution operation with all the second convolution kernel filters to obtain an output feature map of m × n dimensions, wherein m is the data dimension and n is the number of the second convolution kernel filters.
In this step, a one-dimensional convolution operation is performed with all the second convolution kernel filters to obtain an output feature map of m × n dimensions, where m is the data dimension and n is the number of second convolution kernel filters. Specifically, when heart rate data or similar one-dimensional signal data is the input of the convolutional neural network model, the one-dimensional convolution operation yields an output feature map of m × n dimensions, where m is the data dimension after convolution and n is the number of convolution kernel filters; the output feature map can thus be regarded as n m-dimensional vectors computed by the second convolution kernel filters.
S602, based on the output characteristic diagram, obtaining the similarity between two column vectors representing any two second convolution kernel filters.
After the one-dimensional convolution operation has been performed with all the second convolution kernel filters in step S601 to obtain the m × n output feature map, the similarity between any two second convolution kernel filters can be calculated column by column on the output feature map, that is, as the similarity between any two of its n column vectors.
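The column-wise similarity of steps S601 and S602 can be computed as a full n × n cosine similarity matrix. This is a straightforward sketch; the patent does not prescribe this exact vectorized form, and the function name is ours.

```python
import numpy as np

def cosine_similarity_matrix(fmap):
    # fmap: (m, n) output feature map, one column per second filter (S601).
    # Returns an (n, n) matrix of cosine similarities between columns (S602).
    norms = np.linalg.norm(fmap, axis=0, keepdims=True)
    unit = fmap / np.clip(norms, 1e-12, None)   # guard against zero columns
    return unit.T @ unit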
S502, deleting redundant convolution kernel filters in the set based on the similarity, and obtaining the third convolution kernel filter.
In step S501, the similarity between any two second convolution kernel filters in the set is determined, and in this step, redundant convolution kernel filters are deleted from the set based on the similarity, so as to obtain the third convolution kernel filter.
Further, the deleting of redundant convolution kernel filters in the set based on the similarity includes: deleting one of any two column vectors whose similarity is greater than the second threshold, together with the corresponding second convolution kernel filter. Specifically, by comparing the similarities, one of each pair of column vectors whose similarity exceeds the second threshold, and the second convolution kernel filter corresponding to that column vector, are deleted; this removes the redundant second convolution kernel filters and completes the second screening of the convolution kernel filters.
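The second-threshold deletion can be sketched as a greedy pairwise scan (illustrative NumPy sketch; the function name, the toy similarity matrix, and the threshold value are assumptions, not the disclosure's own code):

```python
import numpy as np

def delete_redundant(filters, sim, second_threshold):
    # Second screening: scan each pair of surviving filters; whenever a
    # pair's similarity exceeds the second threshold, drop one member of
    # the pair and keep the other.
    n = len(filters)
    keep = [True] * n
    for i in range(n):
        if not keep[i]:
            continue
        for j in range(i + 1, n):
            if keep[j] and sim[i, j] > second_threshold:
                keep[j] = False  # delete one of the redundant pair
    return [f for f, k in zip(filters, keep) if k]

filters = ["filter_0", "filter_1", "filter_2"]
sim = np.array([[1.00, 0.99, 0.10],
                [0.99, 1.00, 0.20],
                [0.10, 0.20, 1.00]])
third = delete_redundant(filters, sim, second_threshold=0.9)
print(third)  # ['filter_0', 'filter_2']
```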
Further, the method further includes:
S304, arranging the third convolution kernel filters in a second predetermined order in the specified layer.
In this step, after the second convolution kernel filters have been obtained by the first screening of the first convolution kernel filters, and the third convolution kernel filters by the second screening of the second convolution kernel filters, the remaining third convolution kernel filters and their corresponding output feature maps are arranged in the second predetermined order to generate a new combination of convolution kernel filters and corresponding output feature maps. This completes the pruning operation on the convolution kernel filters of the specified layer of the one-dimensional convolution layers; the whole convolutional neural network model can then be retrained based on the preset number of iterations and the preset precision, so as to avoid the reduction in model accuracy caused by pruning.
After the convolutional neural network model is retrained, the pruning operation can be repeated for the next layer after the specified layer in the one-dimensional convolution layers, until all layers in the one-dimensional convolution layers have been pruned, that is, until the convolution kernel filters in every layer have been screened twice, finally yielding the fully compressed convolutional neural network model.
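The complete two-stage screening of one layer described above can be sketched as follows. This is a minimal NumPy sketch; the mean-absolute-weight feature value, the threshold values, and all names are assumptions for illustration, since the disclosure's weight-average formula appears only as a figure:

```python
import numpy as np

rng = np.random.default_rng(0)

def feature_value(w):
    # First-screening criterion: one plausible "weight average value"
    # (mean absolute weight of the filter). The disclosure's exact
    # formula is not reproduced in the text, so this is an assumption.
    return float(np.mean(np.abs(w)))

def prune_layer(filters, signal, t1, t2):
    # Stage 1: keep filters whose feature value exceeds the first threshold.
    second = [w for w in filters if feature_value(w) > t1]
    # Stage 2: convolve each survivor with the 1-D input signal, then
    # compare the resulting column vectors by cosine similarity and
    # drop one filter from each redundant pair.
    cols = [np.convolve(signal, w, mode="valid") for w in second]
    keep = [True] * len(cols)
    for i in range(len(cols)):
        if not keep[i]:
            continue
        for j in range(i + 1, len(cols)):
            if keep[j]:
                c = float(np.dot(cols[i], cols[j]) /
                          (np.linalg.norm(cols[i]) * np.linalg.norm(cols[j])))
                if c > t2:
                    keep[j] = False
    return [w for w, k in zip(second, keep) if k]

filters = [rng.normal(size=5) for _ in range(8)]
signal = rng.normal(size=32)
third = prune_layer(filters, signal, t1=0.3, t2=0.95)
print(len(third))  # at most 8: the layer can only shrink
```

In a full pipeline this per-layer step would be followed by retraining, then repeated for the next one-dimensional convolution layer.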
The compression method for the convolutional neural network model according to the embodiment of the present disclosure can be used in the type identification of heart rate data, can be conveniently ported to heart-rate-related hardware devices, and can also be used with other types of hardware devices. For example, after the convolutional neural network model is pruned on the server, the model parameters or weights of the pruned convolutional neural network model can be obtained and ported to the hardware device, so that after the hardware device collects heart rate data, importing these model parameters or weights lets the hardware device run faster and accurately identify the type of the heart rate data.
The embodiment of the present disclosure relates to a deep learning compression algorithm based on a pruning operation. Metric-based pruning is adopted to compress the convolutional neural network model; specifically, the pruning operation combines the importance of each convolution kernel filter with the similarity between different convolution kernel filters, thereby reducing the redundancy of the convolutional neural network model, and retraining the pruned convolutional neural network model avoids the reduction in model accuracy caused by pruning. The compressed convolutional neural network model not only retains good recognition accuracy but also has fewer parameters and lower complexity, so the model is lightweight while maintaining high recognition accuracy and is convenient to port to hardware devices.
A second embodiment of the present disclosure provides a compression apparatus for a convolutional neural network model, capable of performing a pruning operation on a neural network model of a deep learning algorithm, in particular a convolutional neural network model, and removing redundant filters in the convolution layers; in particular, it executes the compression method of the first embodiment. The apparatus includes a building module, a training module, and a pruning module, which are coupled to one another; for the function of each module, refer to the description of the first embodiment. Specifically:
the building module is used for building a convolutional neural network model, and the convolutional neural network model at least comprises a plurality of layers of one-dimensional convolutional layers.
And the training module is used for training the convolutional neural network model.
Wherein the training module comprises:
and the training unit is used for inputting a training set into the convolutional neural network model for training and reaching a preset iteration number.
And the test unit is used for inputting the test set into the convolution neural network model after iteration for testing so as to achieve the preset precision.
And the pruning module is used for carrying out pruning operation according to a preset sequence aiming at the convolution kernel filters of different layers in the one-dimensional convolution layer in the trained convolution neural network model.
Further, the pruning module comprises a characteristic value obtaining unit, a first screening unit and a second screening unit, wherein:
an eigenvalue acquisition unit for acquiring an eigenvalue of each first convolution kernel filter of a specified layer of the one-dimensional convolution layers in a first predetermined order.
A first screening unit to obtain a set of second convolution kernel filters based on the feature values.
Wherein the first screening unit includes:
a sorting subunit configured to sort the eigenvalues of the first convolution kernel filter in a third predetermined order.
A first determining subunit, configured to determine that the first convolution kernel filter whose feature value is greater than the first threshold value is the second convolution kernel filter.
A second filtering unit to obtain a third convolution kernel filter in the set based on the similarity.
Wherein the second screening unit includes:
a second determining subunit for determining a similarity between any two of the second convolution kernel filters in the set.
Further, the second determining subunit is specifically configured to perform one-dimensional convolution operation on all the second convolution kernel filters, acquire an output feature map having m × n-dimensional vectors, where m is a data dimension, and n is the number of the second convolution kernel filters, and acquire, based on the output feature map, a similarity between two column vectors representing any two second convolution kernel filters.
An obtaining subunit, configured to delete redundant convolution kernel filters from the set based on the similarity, and obtain the third convolution kernel filter.
In addition, the pruning module further comprises an arrangement unit for arranging the third convolution kernel filters in a second predetermined order in the specified layer.
A third embodiment of the present disclosure provides a storage medium, which is a computer-readable medium storing a computer program that, when executed by a processor, implements the method provided by the first embodiment of the present disclosure, including the following steps S11 to S13:
S11, constructing a convolutional neural network model, wherein the convolutional neural network model at least comprises a plurality of one-dimensional convolution layers;
S12, training the convolutional neural network model;
S13, performing a pruning operation, in a predetermined order, on the convolution kernel filters of different layers in the one-dimensional convolution layers of the trained convolutional neural network model.
Further, when executed by the processor, the computer program also implements the other methods provided by the first embodiment of the present disclosure.
A fourth embodiment of the present disclosure provides an electronic device including at least a memory and a processor, the memory having a computer program stored thereon, the processor implementing the method provided in any embodiment of the present disclosure when executing the computer program on the memory. Illustratively, the computer program of the electronic device includes the following steps S21 to S23:
S21, constructing a convolutional neural network model, wherein the convolutional neural network model at least comprises a plurality of one-dimensional convolution layers;
S22, training the convolutional neural network model;
S23, performing a pruning operation, in a predetermined order, on the convolution kernel filters of different layers in the one-dimensional convolution layers of the trained convolutional neural network model.
Further, the processor 902 may also execute the computer program to implement the other methods described in the first embodiment above.
A fifth embodiment of the present disclosure provides a method for recognizing heart rate data, including the following steps: collecting heart rate data, where the patient's heart rate data may be collected by any existing means; and inputting the heart rate data into a convolutional neural network model and acquiring the type of the heart rate data, wherein the convolutional neural network model has been compressed by the compression method described in the first embodiment. The model parameters or weights of the compressed convolutional neural network model are obtained and ported to the hardware device, so that after the hardware device collects heart rate data, importing these model parameters or weights lets the device run faster and accurately identify the type of the heart rate data. The embodiment of the present disclosure can thus accurately identify the type of heart rate data through the compressed convolutional neural network model.
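One way the porting step might look in practice (a hypothetical sketch; the parameter names, shapes, and the NumPy-based serialization are assumptions, since the disclosure does not specify an export format, and a real deployment would export the framework's own parameter dictionary, e.g. PyTorch's `model.state_dict()`):

```python
import io
import numpy as np

# Hypothetical pruned-model parameters; names and shapes are illustrative.
pruned_params = {
    "conv1.weight": np.zeros((4, 1, 5)),  # 4 surviving filters, kernel size 5
    "conv1.bias": np.zeros(4),
}

# Serialize once on the server...
buf = io.BytesIO()
np.savez(buf, **pruned_params)

# ...then load on the hardware device before running inference
# on the collected heart rate data.
buf.seek(0)
loaded = dict(np.load(buf))
print(sorted(loaded))  # ['conv1.bias', 'conv1.weight']
```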
The storage medium may be included in the electronic device; or may exist separately without being assembled into the electronic device.
The storage medium carries one or more programs that, when executed by the electronic device, cause the electronic device to: acquiring at least two internet protocol addresses; sending a node evaluation request comprising at least two internet protocol addresses to node evaluation equipment, wherein the node evaluation equipment selects the internet protocol addresses from the at least two internet protocol addresses and returns the internet protocol addresses; receiving an internet protocol address returned by the node evaluation equipment; wherein the obtained internet protocol address indicates an edge node in the content distribution network.
Alternatively, the storage medium carries one or more programs that, when executed by the electronic device, cause the electronic device to: receiving a node evaluation request comprising at least two internet protocol addresses; selecting an internet protocol address from at least two internet protocol addresses; returning the selected internet protocol address; wherein the received internet protocol address indicates an edge node in the content distribution network.
Computer program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including but not limited to object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the latter case, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It should be noted that the storage media described above in this disclosure can be computer readable signal media or computer readable storage media or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any storage medium that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. Program code embodied on a storage medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, optical cables, RF (radio frequency), etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or hardware. Wherein the name of an element does not in some cases constitute a limitation on the element itself.
The functions described herein above may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: field Programmable Gate Arrays (FPGAs), application Specific Integrated Circuits (ASICs), application Specific Standard Products (ASSPs), systems on a chip (SOCs), complex Programmable Logic Devices (CPLDs), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The foregoing description is only of preferred embodiments of the present disclosure and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the disclosure is not limited to the particular combinations of features described above, but also encompasses other technical solutions formed by any combination of the above features or their equivalents without departing from the spirit of the disclosure, for example, solutions formed by interchanging the above features with (but not limited to) features having similar functions disclosed in the present disclosure.
Further, while operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. Under certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limitations on the scope of the disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.
While the present disclosure has been described in detail with reference to the embodiments, the present disclosure is not limited to the specific embodiments, and those skilled in the art can make various modifications and alterations based on the concept of the present disclosure, and the modifications and alterations should fall within the scope of the present disclosure as claimed.

Claims (14)

1. A compression method for a convolutional neural network model, comprising the steps of:
constructing a convolutional neural network model, wherein the convolutional neural network model at least comprises a plurality of layers of one-dimensional convolutional layers;
training the convolutional neural network model;
and pruning the convolution kernel filters of different layers in the one-dimensional convolution layer in the trained convolution neural network model according to a preset sequence.
2. The method of claim 1, wherein the convolutional neural network model further comprises at least a max pooling layer, a batch normalization layer, and a global average pooling layer.
3. The method of claim 1, wherein the training the convolutional neural network model comprises:
inputting a training set into the convolutional neural network model for training and reaching a preset iteration number;
and inputting the test set into the convolution neural network model after the iteration is finished for testing so as to achieve the preset precision.
4. The method of claim 1, wherein the performing of pruning operations, in a predetermined order, on the convolution kernel filters of different layers in the one-dimensional convolution layer of the trained convolutional neural network model comprises:
acquiring the characteristic value of each first convolution kernel filter of a specified layer in the one-dimensional convolution layer according to a first preset sequence;
obtaining a set of second convolution kernel filters based on the feature values;
a third convolution kernel filter is obtained in the set based on the similarity.
5. The method of claim 4, further comprising:
arranging the third convolution kernel filters in a second predetermined order in the given layer.
6. The method according to claim 4, wherein the characteristic value of each first convolution kernel filter of the specified layer in the one-dimensional convolution layer, acquired in the first predetermined order, is a weight average value, and the weight average value is determined based on the following formula:
Figure FDA0003080870470000011
7. The method of claim 4, wherein said obtaining a set of second convolution kernel filters based on the feature values comprises:
sorting the eigenvalues of the first convolution kernel filter in a third predetermined order;
and determining the first convolution kernel filter with the characteristic value larger than the first threshold value as a second convolution kernel filter.
8. The method of claim 4, wherein the obtaining of the third convolution kernel filter in the set based on the similarity comprises:
determining a similarity between any two of the second convolution kernel filters in the set;
and deleting redundant convolution kernel filters in the set based on the similarity to obtain the third convolution kernel filter.
9. The method of claim 8, wherein said determining a similarity between any two of said second convolution kernel filters in said set comprises:
performing one-dimensional convolution operation on all the second convolution kernel filters to obtain an output characteristic diagram with m x n-dimensional vectors, wherein m is a data dimension, and n is the number of the second convolution kernel filters;
and acquiring the similarity between two column vectors representing any two second convolution kernel filters based on the output feature map.
10. The method of claim 9, wherein said deleting redundant convolution kernel filters from said set based on said similarity comprises:
Deleting one of the two column vectors having a similarity greater than a second threshold and the corresponding second convolution kernel filter.
11. A compression apparatus for a convolutional neural network model, comprising:
the building module is used for building a convolutional neural network model, and the convolutional neural network model at least comprises a plurality of layers of one-dimensional convolutional layers;
a training module for training the convolutional neural network model;
and the pruning module is used for carrying out pruning operation according to a preset sequence aiming at the convolution kernel filters of different layers in the one-dimensional convolution layer in the trained convolution neural network model.
12. A storage medium storing a computer program, characterized in that the computer program realizes the steps of the method of any one of claims 1 to 10 when executed by a processor.
13. An electronic device comprising at least a memory, a processor, the memory having a computer program stored thereon, characterized in that the processor realizes the steps of the method of any of claims 1 to 10 when executing the computer program on the memory.
14. A method for recognizing heart rate data is characterized by comprising the following steps:
collecting heart rate data;
inputting the heart rate data into a convolutional neural network model, and acquiring the type of the heart rate data, wherein the convolutional neural network model is compressed by the compression method of any one of claims 1-10.
CN202110565583.2A 2021-05-24 2021-05-24 Compression method and compression device for convolutional neural network model Pending CN115392448A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110565583.2A CN115392448A (en) 2021-05-24 2021-05-24 Compression method and compression device for convolutional neural network model


Publications (1)

Publication Number Publication Date
CN115392448A 2022-11-25

Family

ID=84113700


Country Status (1)

Country Link
CN (1) CN115392448A (en)

Similar Documents

Publication Publication Date Title
CN110188765B (en) Image semantic segmentation model generation method, device, equipment and storage medium
CN110321910B (en) Point cloud-oriented feature extraction method, device and equipment
CN112541458B (en) Domain self-adaptive face recognition method, system and device based on meta learning
CN112529015A (en) Three-dimensional point cloud processing method, device and equipment based on geometric unwrapping
CN110738235B (en) Pulmonary tuberculosis judging method, device, computer equipment and storage medium
WO2022188327A1 (en) Method and apparatus for training positioning image acquisition model
CN113435253B (en) Multi-source image combined urban area ground surface coverage classification method
CN113191489B (en) Training method of binary neural network model, image processing method and device
Sungheetha et al. Comparative study: statistical approach and deep learning method for automatic segmentation methods for lung CT image segmentation
CN113095370A (en) Image recognition method and device, electronic equipment and storage medium
CN111680755B (en) Medical image recognition model construction and medical image recognition method, device, medium and terminal
CN111368656A (en) Video content description method and video content description device
CN109146891B (en) Hippocampus segmentation method and device applied to MRI and electronic equipment
CN113240079A (en) Model training method and device
CN115311502A (en) Remote sensing image small sample scene classification method based on multi-scale double-flow architecture
CN112288700A (en) Rail defect detection method
CN111401309B (en) CNN training and remote sensing image target identification method based on wavelet transformation
KR20220017497A (en) Methods, devices and devices for image feature extraction and training of networks
CN114881343B (en) Short-term load prediction method and device for power system based on feature selection
EP3588441B1 (en) Imagification of multivariate data sequences
CN116826734A (en) Photovoltaic power generation power prediction method and device based on multi-input model
CN114741487B (en) Image-text retrieval method and system based on image-text semantic embedding
CN115392448A (en) Compression method and compression device for convolutional neural network model
CN116343016A (en) Multi-angle sonar image target classification method based on lightweight convolution network
CN110705695A (en) Method, device, equipment and storage medium for searching model structure

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination