WO2021092796A1 - Neural network model deployment method and apparatus, and device - Google Patents

Neural network model deployment method and apparatus, and device

Info

Publication number
WO2021092796A1
Authority
WO
WIPO (PCT)
Prior art keywords
layer
summation
neural network
convolutional
convolutional layer
Prior art date
Application number
PCT/CN2019/118043
Other languages
French (fr)
Chinese (zh)
Inventor
聂谷洪
施泽浩
孙扬
Original Assignee
深圳市大疆创新科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市大疆创新科技有限公司 filed Critical 深圳市大疆创新科技有限公司
Priority to PCT/CN2019/118043 priority Critical patent/WO2021092796A1/en
Priority to CN201980039593.3A priority patent/CN112313674A/en
Publication of WO2021092796A1 publication Critical patent/WO2021092796A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F 17/10 Complex mathematical operations
    • G06F 17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Definitions

  • This application relates to the technical field of neural networks, and in particular to a neural network model deployment method, device and equipment.
  • Generally, before a trained convolutional neural network model is deployed, it can be compressed in the following two ways to reduce its size and computational cost.
  • One way is to perform model compression by reducing the number of channels of the trained convolutional neural network model.
  • Another way is to compress the model by converting the weight parameters of the trained neural network model from floating-point weight parameters to fixed-point weight parameters.
  • The embodiments of the present application provide a neural network model deployment method, device, and equipment to solve the problem that model deployment methods in the prior art are highly dependent on the original training data.
  • In a first aspect, an embodiment of the present application provides a neural network model deployment method, including: obtaining a trained convolutional neural network model; performing matrix decomposition on the weight parameters of a convolutional layer in the convolutional neural network model to obtain a matrix decomposition result of the convolutional layer; adjusting the structure of the convolutional neural network model according to the matrix decomposition result, so as to compress the convolutional neural network model and obtain a compressed model of the convolutional neural network model; and deploying the compressed model.
  • In a second aspect, an embodiment of the present application provides a neural network model deployment device, including a processor and a memory, where the memory is used to store program code, and the processor calls the program code and, when the program code is executed, performs the following operations: obtaining a trained convolutional neural network model; performing matrix decomposition on the weight parameters of a convolutional layer in the convolutional neural network model to obtain a matrix decomposition result of the convolutional layer; adjusting the structure of the convolutional neural network model according to the matrix decomposition result, so as to compress the convolutional neural network model and obtain a compressed model; and deploying the compressed model.
  • In a third aspect, an embodiment of the present application provides a computer-readable storage medium that stores a computer program, the computer program including at least one piece of code that can be executed by a computer to control the computer to execute the method described in any one of the first aspect.
  • In a fourth aspect, an embodiment of the present application provides a computer program which, when executed by a computer, implements the method described in any one of the first aspect.
  • In a fifth aspect, an embodiment of the present application provides a mobile platform, including a memory and a processor, where the memory stores a convolutional neural network model deployed according to the method of any one of the first aspect; when the convolutional neural network model is called and loaded by the processor, it is used to process the sensor data obtained by the mobile platform.
  • In a sixth aspect, an embodiment of the present application provides a pan-tilt device, including a memory and a processor, where the memory stores a convolutional neural network model deployed according to the method of any one of the first aspect; when the convolutional neural network model is called and loaded by the processor, it is used to process the sensor data obtained by the pan-tilt device.
  • In a seventh aspect, an embodiment of the present application provides a mobile terminal, including a memory and a processor, where the memory stores a convolutional neural network model deployed according to the method of any one of the first aspect; when the convolutional neural network model is called and loaded by the processor, it is used to process the sensor data obtained by the mobile terminal.
  • The embodiments of the application provide a neural network model deployment method, device, and equipment.
  • Matrix decomposition is performed on the weight parameters of the convolutional layer in the trained convolutional neural network model to obtain the matrix decomposition result of the convolutional layer.
  • The structure of the convolutional neural network model is adjusted according to the matrix decomposition result so as to compress the convolutional neural network model, and the compressed model is deployed; compression is thus realized by performing matrix decomposition on the weight parameters and adjusting the structure of the convolutional neural network model according to the matrix decomposition result.
  • Because the matrix decomposition result retains the characteristics of the matrix, that is, the characteristics of the weight parameters of the original convolutional layer, the compressed model obtained by adjusting the structure of the convolutional neural network model according to the matrix decomposition result retains the convolutional features of the original convolutional layer, and hence the input and output characteristics of the model, thereby reducing the dependence on the original training data.
  • FIG. 1 is a schematic diagram of an application scenario of a neural network model deployment method provided by an embodiment of the application;
  • FIG. 2 is a schematic flowchart of a neural network model deployment method provided by an embodiment of this application;
  • FIG. 3 is a schematic flowchart of a neural network model deployment method provided by another embodiment of this application;
  • FIG. 4 is a schematic flowchart of a neural network model deployment method provided by another embodiment of this application;
  • FIG. 5 is a schematic diagram of performing matrix decomposition on the weight parameters of a convolutional layer according to an embodiment of the application;
  • FIG. 6 is a schematic diagram of the input and output relationship between a convolutional layer and its replacement layer provided by an embodiment of the application;
  • FIG. 7 is a schematic flowchart of a neural network model deployment method provided by another embodiment of this application;
  • FIG. 8 is a schematic diagram of performing matrix decomposition on the weight parameters of a convolutional layer according to another embodiment of the application;
  • FIG. 9 is a schematic diagram of the input and output relationship between a convolutional layer and its replacement layer provided by another embodiment of the application;
  • FIG. 10 is a schematic flowchart of a neural network model deployment method provided by another embodiment of this application;
  • FIG. 11 is a schematic diagram of the input and output relationship between a convolutional layer and its replacement layer provided by another embodiment of this application;
  • FIG. 12 is a schematic structural diagram of a neural network model deployment device provided by an embodiment of this application.
  • the neural network model deployment method provided in the embodiments of the present application can be applied to any scenario where a convolutional neural network model needs to be deployed.
  • the neural network model deployment method can be specifically executed by a neural network model deployment device.
  • A schematic diagram of the application scenario of the neural network model deployment method provided by the embodiments of the present application may be as shown in FIG. 1.
  • As shown in FIG. 1, the neural network model deployment device 11 can obtain the trained convolutional neural network model from another device/equipment 12, and process the obtained convolutional neural network model using the neural network model deployment method provided in the embodiments of the present application.
  • The specific manner of communication connection between the neural network model deployment device 11 and the other device/equipment 12 is not limited in this application.
  • For example, a wireless communication connection may be realized based on a Bluetooth interface, or a wired communication connection may be realized based on an RS232 interface.
  • The equipment including the neural network model deployment device may specifically be computer equipment with strong computing capability.
  • FIG. 1 takes the case where the neural network model deployment device obtains the convolutional neural network model from another device or equipment as an example.
  • the neural network model deployment device can obtain the convolutional neural network model in other ways.
  • the neural network model deployment device can obtain the convolutional neural network model by training the initial convolutional neural network model.
  • The neural network model deployment method performs matrix decomposition on the weight parameters of the convolutional layer in the trained convolutional neural network model to obtain the matrix decomposition result of the convolutional layer, adjusts the structure of the convolutional neural network model according to the matrix decomposition result so as to compress the convolutional neural network model, and deploys the compressed model, thereby reducing the dependence on the original training data.
  • FIG. 2 is a schematic flow chart of a neural network model deployment method provided by an embodiment of this application.
  • the execution subject of this embodiment may be a neural network model deployment device, and specifically may be a processor of the neural network model deployment device.
  • the method of this embodiment may include:
  • Step 201 Obtain a trained convolutional neural network model.
  • The specific manner of obtaining the trained convolutional neural network model is not limited in this application.
  • For example, a trained convolutional neural network model sent by another device/equipment can be received.
  • Alternatively, the trained convolutional neural network model can be read from the storage device of another device/equipment.
  • Step 202 Perform matrix decomposition on the weight parameters of the convolutional layer in the convolutional neural network model to obtain a matrix decomposition result of the convolutional layer.
  • The matrix decomposition method used for matrix decomposition of the weight parameters of the convolutional layer in the convolutional neural network model may be any method that satisfies condition 1.
  • Condition 1 is that the matrix can be decomposed into a sum of multiple summation terms.
  • Optionally, the matrix decomposition method may further satisfy condition 2.
  • Condition 2 is that, among the multiple summation terms obtained by decomposing the matrix, there is a summation term A with energy greater than a first threshold and a summation term B with energy less than a second threshold.
  • The presence, among the multiple summation terms, of a summation term A with energy greater than the first threshold and a summation term B with energy less than the second threshold indicates that the energy differences between the summation terms are relatively large.
  • The first threshold may be, for example, 80%, and the second threshold may be, for example, 5%.
  • The energy of a summation term is used to characterize the importance of the summation term in the matrix decomposition result: the greater the energy, the greater the importance.
  • The matrix decomposition method may be Singular Value Decomposition (SVD); correspondingly, the energy may be understood as a singular value.
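  • By way of illustration, the following minimal NumPy sketch decomposes a matrix with SVD into a sum of rank-1 summation terms, treating the normalized singular values as the per-term energies; the 64×9 shape and the normalization are assumptions made for the example:

```python
import numpy as np

# Minimal sketch: SVD writes a matrix as a sum of rank-1 "summation terms";
# the normalized singular values play the role of the per-term energies.
W = np.random.randn(64, 9)               # e.g. one group of weights, N x k^2

U, s, Vt = np.linalg.svd(W, full_matrices=False)
terms = [s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(len(s))]
energies = s / s.sum()                   # assumed normalization, sums to 1

assert np.allclose(W, sum(terms))        # the summation terms add back to W
print(np.round(energies, 3))             # high-energy terms matter most
```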
  • the weight parameters of each convolutional layer in some or all of the convolutional layers in the convolutional neural network model may be subjected to matrix decomposition to obtain the matrix decomposition result of each convolutional layer.
  • Step 203 Adjust the structure of the convolutional neural network model according to the matrix decomposition result to compress the convolutional neural network model to obtain a compressed model of the convolutional neural network model.
  • the structure of the convolutional neural network model needs to be adjusted according to the matrix decomposition result.
  • Since the matrix decomposition result includes multiple summation terms, the structure of the convolutional neural network model can be adjusted according to all of the summation terms, or a subset of the summation terms can be selected from all of them and the structure of the convolutional neural network model adjusted according to that subset.
  • Optionally, a subset of the summation terms can be selected from all of the summation terms, and the structure of the convolutional neural network model adjusted according to that subset.
  • Step 204 Deploy the compressed model.
  • the compressed model can be directly deployed to the device that performs model calculations; or, the device that performs model calculations can read the compressed model from the storage device of the neural network model deployment device.
  • Alternatively, the compressed model can be sent to another device/equipment, such as the device/equipment 12.
  • The other device/equipment then deploys the compressed model to the device that performs model calculations, or the device that performs model calculations reads the compressed model from the storage device of the other device/equipment.
  • the device that performs model calculations may specifically be any type of device that needs to deploy a convolutional neural network model.
  • the device for performing model calculation may include a movable platform, such as a drone.
  • In this embodiment, matrix decomposition is performed on the weight parameters of the convolutional layer in the trained convolutional neural network model to obtain the matrix decomposition result of the convolutional layer, and the structure of the convolutional neural network model is adjusted according to the matrix decomposition result, so that the compressed model is obtained by decomposing the weight parameters and adjusting the model structure according to the decomposition result.
  • Because the matrix decomposition result retains the characteristics of the matrix, that is, the characteristics of the weight parameters of the original convolutional layer, the compressed model obtained by adjusting the structure of the convolutional neural network model according to the matrix decomposition result retains the convolutional features of the original convolutional layer, and hence the input and output characteristics of the model, thereby reducing the dependence on the original training data.
  • FIG. 3 is a schematic flowchart of a neural network model deployment method provided by another embodiment of this application. Based on the embodiment shown in FIG. 2, this embodiment mainly describes an optional implementation manner of adjusting the structure of the convolutional neural network model according to the matrix decomposition result. As shown in FIG. 3, the method of this embodiment may include:
  • Step 301 Obtain a trained convolutional neural network model.
  • step 301 is similar to step 201 and will not be repeated here.
  • Step 302 Perform matrix decomposition on the weight parameters of the convolutional layer in the convolutional neural network model to obtain a matrix decomposition result of the convolutional layer.
  • step 302 is similar to step 202, and will not be repeated here.
  • Step 303 Determine a replacement layer for replacing the convolutional layer according to the matrix decomposition result, where the number of weight parameters of the replacement layer is less than the number of weight parameters of the convolutional layer.
  • The replacement layer of a convolutional layer in the convolutional neural network model is used to replace that convolutional layer in the convolutional neural network model.
  • The number of weight parameters of the replacement layer is less than the number of weight parameters of the convolutional layer, so that the model can be compressed.
  • The structure of the replacement layer of a convolutional layer and the weight parameters of the replacement layer correspond to all or part of the summation terms in the matrix decomposition result of that convolutional layer.
  • Optionally, the matrix decomposition result includes a plurality of summation terms, and the determining of a replacement layer for replacing the convolutional layer according to the matrix decomposition result may specifically include: determining the replacement layer for replacing the convolutional layer according to a part of the plurality of summation terms.
  • In this case, the structure of the replacement layer and the weight parameters of the replacement layer correspond to the partial summation terms in the matrix decomposition result.
  • Optionally, the method of this embodiment may further include: determining the bias parameter of the replacement layer according to the summation terms, among the plurality of summation terms, other than the partial summation terms.
  • The bias parameter of the replacement layer is used to compensate for the error caused by omitting the other summation terms from the weight parameters of the replacement layer, which helps improve the accuracy of the compressed model.
  • Since the input of a convolutional layer usually comes from the batch normalization (BN) output of the preceding convolutional layer, which according to the BN statistics satisfies a normal distribution, the bias parameter of the replacement layer can be determined based on the normal-distribution characteristics of the input channels of the replacement layer. It is assumed that the input channels of the replacement layer are independent and identically distributed and that the value at each position in a channel equals the mean of that channel; convolving this input with the other summation terms yields the decomposition compression loss in the corresponding compression mode. The loss error calculated in this way can be incorporated into the bias term.
  • Optionally, the determining of the bias parameter of the replacement layer according to the other summation terms may specifically include: summing the other summation terms, convolving the sum with the mean of the normal distribution of each input channel of the replacement layer to obtain a convolution result for each input channel, and combining the convolution results of the input channels to obtain the bias parameter of the replacement layer.
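  • By way of illustration, the sketch below realizes this bias compensation under the assumptions just stated (input channels held constant at their BN means, border effects ignored); the [N, C, k, k] weight layout, the sizes, and the helper name are assumptions for the example:

```python
import numpy as np

def bias_compensation(residual: np.ndarray, mu: np.ndarray) -> np.ndarray:
    # residual: sum of the discarded summation terms, in the assumed conv
    # weight layout [N, C, k, k]; mu: per-input-channel means from BN stats.
    # A channel held constant at mu[c], convolved with kernel residual[n, c],
    # gives mu[c] * residual[n, c].sum() everywhere, so the compensation is
    # one constant per output channel:
    return np.einsum('nchw,c->n', residual, mu)

N, C, k = 64, 32, 3
residual = np.random.randn(N, C, k, k)         # discarded summation terms
mu = np.random.randn(C)                        # assumed BN channel means
print(bias_compensation(residual, mu).shape)   # (64,), folded into the bias
```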
  • Step 304 Replace the convolutional layer in the convolutional neural network model with the replacement layer.
  • The structure of the replacement layer is changed relative to the structure of the convolutional layer.
  • Replacing the convolutional layer in the convolutional neural network model with the replacement layer of that convolutional layer thus realizes the adjustment of the structure of the convolutional neural network model.
  • In this embodiment, the weight parameters of the convolutional layer in the trained convolutional neural network model are matrix-decomposed to obtain the matrix decomposition result of the convolutional layer, a replacement layer for replacing the convolutional layer is determined according to the matrix decomposition result, and the convolutional layer in the convolutional neural network model is replaced with the replacement layer, which realizes the adjustment of the structure of the convolutional neural network. Since the number of weight parameters of the replacement layer is less than the number of weight parameters of the convolutional layer, the convolutional neural network model is compressed by adjusting the structure of the convolutional neural network.
  • FIG. 4 is a schematic flowchart of a neural network model deployment method provided by another embodiment of this application. Based on the embodiment shown in FIG. 3, this embodiment mainly describes an optional implementation manner of performing matrix decomposition on the weight parameters of the convolutional layer in the convolutional neural network model. As shown in FIG. 4, the method of this embodiment may include:
  • Step 401 Obtain a trained convolutional neural network model.
  • step 401 is similar to step 201, and will not be repeated here.
  • Step 402 According to the input channels of the convolutional layer in the convolutional neural network model, group the weight parameters of the convolutional layer, and perform matrix decomposition on each group of weight parameters as a two-dimensional matrix to obtain the first matrix decomposition result of each group of weight parameters.
  • Specifically, for a convolutional layer with C input channels, N output channels, and a k×k convolution kernel, the weight parameters can be divided into C groups according to the input channel, each group being a two-dimensional matrix of size N×k^2. Taking SVD as the matrix decomposition method as an example, such a two-dimensional matrix W is decomposed as W = Σ_i σ_i·U_i·V_i^T, where U_i ∈ R^N and V_i ∈ R^(k^2), and the number of summation terms is at most min(N, k^2).
  • The first matrix decomposition result of each group of weight parameters includes a plurality of first summation terms, and each first summation term corresponds to an energy used to characterize its importance.
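  • As a purely illustrative sketch of the grouping and per-group SVD in step 402 (the [N, C, k, k] weight layout and all sizes are assumptions for the example):

```python
import numpy as np

N, C, k = 64, 32, 3
W = np.random.randn(N, C, k, k)           # trained convolutional weights

first_results = []
for c in range(C):                        # one group per input channel
    Wc = W[:, c, :, :].reshape(N, k * k)  # group c as an N x k^2 matrix
    U, s, Vt = np.linalg.svd(Wc, full_matrices=False)
    first_results.append((U, s, Vt))      # first matrix decomposition result

print(len(first_results))                 # C groups
print(first_results[0][1].size)           # min(N, k^2) first summation terms
```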
  • Step 403 Sort the plurality of first summation terms of each group of weight parameters in descending order of energy, and select the top-ranked partial summation terms as the first target summation terms.
  • Optionally, the selecting of the top-ranked partial summation terms as the first target summation terms may specifically include: selecting the top-ranked partial summation terms whose energy sum is greater than or equal to an energy threshold as the first target summation terms.
  • For example, assume that the energy threshold is 0.9 and the first matrix decomposition result of a group includes 5 summation terms, summation term 1 to summation term 5, where the energy of summation term 1 is 0.8, the energy of summation term 2 is 0.1, the energy of summation term 3 is 0.07, the energy of summation term 4 is 0.02, and the energy of summation term 5 is 0.01. Then summation term 1 and summation term 2 can be selected from the 5 summation terms as the first target summation terms.
  • The energy threshold can be set flexibly according to requirements.
  • Optionally, the selecting of the top-ranked partial summation terms as the first target summation terms may specifically include: selecting the top-ranked summation terms, the number of which is less than or equal to a number threshold, as the first target summation terms.
  • For example, assume that the number threshold is 3 and the first matrix decomposition result of a group includes 5 summation terms, summation term 1 to summation term 5, whose energies decrease successively; then summation term 1 to summation term 3 can be selected from the 5 summation terms as the first target summation terms.
  • The number threshold can be set flexibly according to requirements. The larger the number threshold, the closer the first target summation terms are to the group's weight parameters.
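  • As an illustrative sketch of these two selection rules (the helper name and thresholds are assumptions for the example):

```python
import numpy as np

def num_target_terms(s, energy_threshold=None, count_threshold=None):
    # s: energies of the summation terms, sorted in descending order
    energies = np.asarray(s, dtype=float)
    energies = energies / energies.sum()
    if energy_threshold is not None:     # rule 1: cumulative energy >= threshold
        return int(np.searchsorted(np.cumsum(energies), energy_threshold) + 1)
    return min(count_threshold, len(energies))   # rule 2: at most count_threshold

s = [0.8, 0.1, 0.07, 0.02, 0.01]                 # the example from the text
print(num_target_terms(s, energy_threshold=0.9)) # 2 -> summation terms 1 and 2
print(num_target_terms(s, count_threshold=3))    # 3 -> summation terms 1 to 3
```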
  • Step 404 Determine a replacement layer for replacing the convolutional layer according to the first target summation terms of each group of weight parameters.
  • Optionally, step 404 may specifically include: determining the replacement layer for replacing the convolutional layer according to a strategy in which the first target summation terms with the same rank in different groups of weight parameters correspond to the same branch, and different branches are connected in parallel.
  • For example, assume that the number of input channels of the convolutional layer is 5, that is, the weight parameters of the convolutional layer are divided into 5 groups, and the number of first target summation terms in each group is 4. In descending order of energy, the 4 first target summation terms of the first group are summation terms 1a to 1d, those of the second group are summation terms 2a to 2d, those of the third group are summation terms 3a to 3d, those of the fourth group are summation terms 4a to 4d, and those of the fifth group are summation terms 5a to 5d. The replacement layer can then include 4 parallel branches: branch 1 corresponding to summation terms 1a, 2a, 3a, 4a, and 5a; branch 2 corresponding to summation terms 1b, 2b, 3b, 4b, and 5b; branch 3 corresponding to summation terms 1c, 2c, 3c, 4c, and 5c; and branch 4 corresponding to summation terms 1d, 2d, 3d, 4d, and 5d.
  • one branch may include a first convolutional layer and a second convolutional layer connected in series.
  • the input of the first convolutional layer may be the input of the replaced convolutional layer, and the output of the first convolutional layer may be the input of the second convolutional layer.
  • the first convolutional layer is used to perform a pointwise convolution operation on the input of the replaced convolutional layer.
  • Optionally, the number of input channels of the first convolutional layer is C, where C is equal to the number of input channels of the replaced convolutional layer; the number of output channels of the first convolutional layer is N, where N is equal to the number of output channels of the replaced convolutional layer; and the convolution kernel size of the first convolutional layer is 1×1.
  • It can be seen that the parameter quantity of the first convolutional layer is N·C.
  • the second convolutional layer is used to perform a layer-by-layer convolution (depthwise convolution) operation on the output of the first convolutional layer.
  • Optionally, the number of input channels and output channels of the second convolutional layer is N, where N is equal to the number of output channels of the replaced convolutional layer, and the convolution kernel size of the second convolutional layer is k×k, equal to the convolution kernel size of the replaced convolutional layer. It can be seen that the parameter quantity of the second convolutional layer is N·k^2.
  • By comparison, the parameter quantity of the convolutional layer replaced by the replacement layer is N·C·k^2.
  • Thus the parameter quantity of one branch of the replacement layer is N·C + N·k^2; since usually k^2 ≤ C, this is much smaller than N·C·k^2.
  • Correspondingly, for an H×W input feature map, the calculation amount of the replaced convolutional layer is N·C·k^2·H·W.
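  • As a quick numeric check of these counts, with illustrative sizes (N = 64, C = 32, k = 3, and 4 branches kept):

```python
N, C, k, r = 64, 32, 3, 4              # illustrative sizes; r = branches kept

orig = N * C * k * k                   # replaced layer: N*C*k^2 = 18432
branch = N * C + N * k * k             # pointwise N*C + depthwise N*k^2 = 2624
print(orig, r * branch)                # 18432 vs 10496
print(round(r * branch / orig, 2))     # 0.57: ~43% fewer parameters
```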
  • the replacement layer may include a summation layer for accumulating the outputs of different branches.
  • The input and output relationship between the convolutional layer and its replacement layer can be as shown in FIG. 6.
  • the input of the point-wise convolution of each branch of the replacement layer is the input of the convolutional layer it replaces
  • the output of the point-wise convolution of each branch is the input of the layer-by-layer convolution in series with it.
  • The first convolutional layer of a branch corresponds to the U vectors in all the first target summation terms corresponding to that branch, and the second convolutional layer of the branch corresponds to the V vectors in all the first target summation terms corresponding to that branch.
  • For example, reading FIG. 6 from left to right, the point-wise convolution of the first branch corresponds to the U_1 vectors obtained by performing SVD on the C groups of weight parameters, and the layer-by-layer convolution of the first branch corresponds to the V_1 vectors; the point-wise convolution of the second branch corresponds to the U_2 vectors, and the layer-by-layer convolution of the second branch corresponds to the V_2 vectors; the point-wise convolution of the third branch corresponds to the U_3 vectors, and the layer-by-layer convolution of the third branch corresponds to the V_3 vectors.
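  • The algebra behind this correspondence can be checked numerically; the sketch below re-assembles, for each group, the r kept summation terms and measures the weight error introduced by discarding the rest (sizes are illustrative; the actual layer wiring follows FIG. 6):

```python
import numpy as np

N, C, k, r = 64, 32, 3, 4
W = np.random.randn(N, C, k, k)

sq_err = 0.0
for c in range(C):
    Wc = W[:, c, :, :].reshape(N, k * k)
    U, s, Vt = np.linalg.svd(Wc, full_matrices=False)
    Wc_hat = (U[:, :r] * s[:r]) @ Vt[:r, :]    # sum of the r kept terms
    sq_err += np.linalg.norm(Wc - Wc_hat) ** 2 # energy of the dropped terms
print(np.sqrt(sq_err))                         # layer-level compression error
```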
  • Step 405 Replace the convolutional layer in the convolutional neural network model with the replacement layer.
  • step 405 is similar to step 304, and will not be repeated here.
  • In this embodiment, the weight parameters of the convolutional layer are grouped according to the input channels of the convolutional layer in the convolutional neural network model, the first target summation terms of each group of weight parameters are determined, the replacement layer for replacing the convolutional layer is determined according to the first target summation terms of each group of weight parameters, and the replacement layer is used to replace the convolutional layer in the convolutional neural network model.
  • This realizes adjusting the structure of the convolutional neural network by grouping the weight parameters of the convolutional layer according to its input channels and performing matrix decomposition on each group of weight parameters.
  • FIG. 7 is a schematic flowchart of a neural network model deployment method provided by another embodiment of this application. Based on the embodiment shown in FIG. 3, this embodiment mainly describes another optional implementation manner of performing matrix decomposition on the weight parameters of the convolutional layer in the convolutional neural network model. As shown in FIG. 7, the method of this embodiment may include:
  • Step 701 Obtain a trained convolutional neural network model.
  • step 701 is similar to step 201 and will not be repeated here.
  • Step 702 According to the output channels of the convolutional layer in the convolutional neural network model, group the weight parameters of the convolutional layer, and perform matrix decomposition on each group of weight parameters as a two-dimensional matrix to obtain the second matrix decomposition result of each group of weight parameters.
  • Specifically, for a convolutional layer with C input channels, N output channels, and a k×k convolution kernel, the weight parameters can be divided into N groups according to the output channel, each group being a two-dimensional matrix of size C×k^2. Taking SVD as the matrix decomposition method as an example, such a two-dimensional matrix W is decomposed as W = Σ_i σ_i·U_i·V_i^T, where U_i ∈ R^C and V_i ∈ R^(k^2), and the number of summation terms is at most min(C, k^2).
  • The second matrix decomposition result of each group of weight parameters includes a plurality of second summation terms, and each second summation term corresponds to an energy used to characterize its importance.
  • Step 703 Sort the plurality of second summation terms of each group of weight parameters in descending order of energy, and select the top-ranked partial summation terms as the second target summation terms.
  • Optionally, the selecting of the top-ranked partial summation terms as the second target summation terms may specifically include: selecting the top-ranked partial summation terms whose energy sum is greater than or equal to an energy threshold as the second target summation terms.
  • For example, assuming again an energy threshold of 0.9, suppose the second matrix decomposition result of a group includes 6 summation terms, summation term a to summation term f, where the energy of summation term a is 0.7, the energy of summation term b is 0.1, the energy of summation term c is 0.08, the energy of summation term d is 0.06, the energy of summation term e is 0.04, and the energy of summation term f is 0.02. Then summation term a to summation term d, whose energy sum is 0.94, can be selected as the second target summation terms.
  • The energy threshold can be set flexibly according to requirements.
  • Optionally, the selecting of the top-ranked partial summation terms as the second target summation terms may specifically include: selecting the top-ranked summation terms, the number of which is less than or equal to a number threshold, as the second target summation terms.
  • For example, assume that the number threshold is 3 and the second matrix decomposition result of a group includes 6 summation terms, summation term a to summation term f, whose energies decrease successively; then summation term a to summation term c can be selected from the 6 summation terms as the second target summation terms.
  • The number threshold can be set flexibly according to requirements. The larger the number threshold, the closer the second target summation terms are to the group's weight parameters.
  • Optionally, the number threshold and the energy threshold in step 703 may be the same as the number threshold and the energy threshold in step 403, respectively.
  • Step 704 Determine a replacement layer for replacing the convolutional layer according to the second target summation terms of each group of weight parameters.
  • Optionally, step 704 may specifically include: determining the replacement layer for replacing the convolutional layer according to a strategy in which the second target summation terms with the same rank in different groups of weight parameters correspond to the same branch, and different branches are connected in parallel.
  • For example, assume that the number of output channels of the convolutional layer is 6, that is, the weight parameters of the convolutional layer are divided into 6 groups, and the number of second target summation terms in each group is 4. In descending order of energy, the 4 second target summation terms of the first group are summation terms aa to ad, those of the second group are summation terms ba to bd, those of the third group are summation terms ca to cd, those of the fourth group are summation terms da to dd, those of the fifth group are summation terms ea to ed, and those of the sixth group are summation terms fa to fd. The replacement layer can then include 4 parallel branches: branch 1 corresponding to summation terms aa, ba, ca, da, ea, and fa; branch 2 corresponding to summation terms ab, bb, cb, db, eb, and fb; branch 3 corresponding to summation terms ac, bc, cc, dc, ec, and fc; and branch 4 corresponding to summation terms ad, bd, cd, dd, ed, and fd.
  • one branch includes a third convolutional layer and a fourth convolutional layer connected in series.
  • the input of the third convolutional layer may be the input of the replaced convolutional layer, and the output of the third convolutional layer may be the input of the fourth convolutional layer.
  • the third convolutional layer is used to perform a layer-by-layer convolution (depthwise convolution) operation on the input of the replaced convolutional layer.
  • Optionally, the number of input channels and output channels of the third convolutional layer is C, where C is equal to the number of input channels of the replaced convolutional layer, and the convolution kernel size of the third convolutional layer is k×k, equal to the convolution kernel size of the replaced convolutional layer.
  • It can be seen that the parameter quantity of the third convolutional layer is C·k^2.
  • the fourth convolutional layer is used to perform a pointwise convolution operation on the output of the third convolutional layer.
  • Optionally, the number of input channels of the fourth convolutional layer is C, where C is equal to the number of input channels of the replaced convolutional layer; the number of output channels of the fourth convolutional layer is N, where N is equal to the number of output channels of the replaced convolutional layer; and the convolution kernel size of the fourth convolutional layer is 1×1.
  • It can be seen that the parameter quantity of the fourth convolutional layer is N·C.
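  • With illustrative sizes, the per-branch parameter counts of the FIG. 6 and FIG. 9 variants can be compared directly; which grouping is cheaper depends on the relative sizes of C and N:

```python
N, C, k = 64, 32, 3                  # illustrative sizes

branch_fig6 = N * C + N * k * k      # FIG. 6: pointwise N*C + depthwise N*k^2
branch_fig9 = C * k * k + N * C      # FIG. 9: depthwise C*k^2 + pointwise N*C
print(branch_fig6, branch_fig9)      # 2624 vs 2336: FIG. 9 is cheaper here
```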
  • Optionally, the replacement layer includes a summation layer for accumulating the outputs of the different branches.
  • Taking the third convolutional layer performing layer-by-layer convolution and the fourth convolutional layer performing point-by-point convolution as an example, the input and output relationship between the convolutional layer and its replacement layer can be as shown in FIG. 9.
  • The input of the layer-by-layer convolution of each branch of the replacement layer is the input of the convolutional layer it replaces, and the output of the layer-by-layer convolution of each branch is used as the input of the point-wise convolution in series with it.
  • After the outputs of the point-wise convolutions are accumulated by the summation layer, the result is equivalent to the output of the replaced convolutional layer.
  • The third convolutional layer of a branch corresponds to the U vectors in all the second target summation terms corresponding to that branch, and the fourth convolutional layer of the branch corresponds to the V vectors in all the second target summation terms corresponding to that branch.
  • For example, reading FIG. 9 from left to right, the layer-by-layer convolution of the first branch corresponds to the U_1 vectors obtained by performing SVD on the N groups of weight parameters, and the point-by-point convolution of the first branch corresponds to the V_1 vectors; the layer-by-layer convolution of the second branch corresponds to the U_2 vectors, and the point-by-point convolution of the second branch corresponds to the V_2 vectors; the layer-by-layer convolution of the third branch corresponds to the U_3 vectors, and the point-by-point convolution of the third branch corresponds to the V_3 vectors.
  • Step 705 Replace the convolutional layer in the convolutional neural network model with the replacement layer.
  • step 705 is similar to step 304, and will not be repeated here.
  • In this embodiment, the weight parameters of the convolutional layer are grouped according to the output channels of the convolutional layer in the convolutional neural network model, the second target summation terms of each group of weight parameters are determined, the replacement layer for replacing the convolutional layer is determined according to the second target summation terms of each group of weight parameters, and the replacement layer is used to replace the convolutional layer in the convolutional neural network model.
  • This realizes adjusting the structure of the convolutional neural network by grouping the weight parameters of the convolutional layer according to its output channels and performing matrix decomposition on each group of weight parameters.
  • Either of the matrix decomposition methods provided by the embodiments shown in FIG. 4 and FIG. 7 can be used alone to adjust the structure of the convolutional neural network model.
  • Alternatively, as in the embodiment shown in FIG. 10, the two matrix decomposition methods may be combined to adjust the structure of the convolutional neural network model.
  • FIG. 10 is a schematic flowchart of a neural network model deployment method provided by another embodiment of the application. On the basis of the embodiments shown in FIG. 4 and FIG. 7, this embodiment mainly describes an optional implementation manner of performing matrix decomposition on the weight parameters of the convolutional layer in the convolutional neural network model by combining the two grouping methods. As shown in FIG. 10, the method of this embodiment may include:
  • Step 1001 Obtain a trained convolutional neural network model.
  • Step 1002 According to the input channels of the convolutional layer in the convolutional neural network model, group the weight parameters of the convolutional layer, and perform matrix decomposition on each group of weight parameters as a two-dimensional matrix to obtain the first matrix decomposition result of each group of weight parameters.
  • step 1002 is similar to step 402, and will not be repeated here.
  • Step 1003 Sort the plurality of first summation terms of each group of weight parameters in descending order of energy, and select the top-ranked partial summation terms as the first target summation terms.
  • step 1003 is similar to step 403, and will not be repeated here.
  • Step 1004 According to the output channels of the convolutional layer in the convolutional neural network model, group the weight parameters of the convolutional layer, and perform matrix decomposition on each group of weight parameters as a two-dimensional matrix to obtain the second matrix decomposition result of each group of weight parameters.
  • step 1004 is similar to step 702, and will not be repeated here.
  • Step 1005 Sort the plurality of second summation terms of each group of weight parameters in descending order of energy, and select the top-ranked partial summation terms as the second target summation terms.
  • step 1005 is similar to step 703, and will not be repeated here.
  • It should be noted that there is no restriction on the execution order between steps 1004-1005 and steps 1002-1003.
  • Step 1006 Based on a target strategy, select the first target summation terms or the second target summation terms as the specific target summation terms, and determine a replacement layer for replacing the convolutional layer according to the specific target summation terms.
  • The target strategy may be any strategy that can be used to select the better of the first target summation terms and the second target summation terms, and can be implemented flexibly according to requirements.
  • Optionally, the target strategy includes a minimum-number-of-summation-terms strategy or a maximum-energy strategy.
  • Taking the minimum-number-of-summation-terms strategy as an example, assume that the number of first target summation terms in each group of the convolutional layer is 2 and the number of second target summation terms in each group is 4. Since the number of first target summation terms is smaller, the first target summation terms are selected as the specific target summation terms according to this strategy, and the structure of the convolutional neural network model is adjusted in the manner shown in FIG. 4. Because fewer summation terms mean fewer parameters, using the minimum-number-of-summation-terms strategy reduces the size of the compressed model as much as possible.
  • When the target strategy is the maximum-energy strategy, the error introduced into the model by compression can be minimized.
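  • As a minimal sketch of this selection between the two decompositions (the helper and its inputs, term counts and kept energies, are assumptions for the example):

```python
def pick_decomposition(first, second, strategy="min_terms"):
    # first / second: (terms kept per group, energy kept per group) for the
    # input-channel and output-channel decompositions respectively
    if strategy == "min_terms":          # smallest compressed model
        return "first" if first[0] <= second[0] else "second"
    return "first" if first[1] >= second[1] else "second"   # max energy

print(pick_decomposition((2, 0.90), (4, 0.95), "min_terms"))   # -> first
print(pick_decomposition((2, 0.90), (4, 0.95), "max_energy"))  # -> second
```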
  • The specific target summation terms are the first target summation terms or the second target summation terms.
  • When the specific target summation terms are the first target summation terms, for the specific manner of determining the replacement layer in step 1006, reference may be made to the relevant description of step 404, which will not be repeated here.
  • When the specific target summation terms are the second target summation terms, for the specific manner of determining the replacement layer in step 1006, reference may be made to the relevant description of step 704, which will not be repeated here.
  • Step 1007 Replace the convolutional layer in the convolutional neural network model with the replacement layer.
  • step 1007 is similar to step 304, and will not be repeated here.
  • In this embodiment, based on the target strategy, the first target summation terms determined by grouping the weight parameters of the convolutional layer according to its input channels, or the second target summation terms determined by grouping the weight parameters according to its output channels, are selected as the specific target summation terms, and the replacement layer for replacing the convolutional layer is determined according to the specific target summation terms.
  • This realizes selecting the better of the first target summation terms and the second target summation terms according to requirements to determine the replacement layer, so that the model compression result can meet the requirements to the greatest extent.
  • It should be noted that, in the foregoing method embodiments, the number of the partial summation terms is greater than or equal to 1.
  • When the number of the partial summation terms is 1, the input and output relationship between the convolutional layer and its replacement layer can be as shown in FIG. 11.
  • In this case the number of branches of the replacement layer is 1: the input of the layer-by-layer convolution of the replacement layer is the input of the convolutional layer it replaces, the output of the layer-by-layer convolution is the input of the point-by-point convolution connected in series with it, and the output of the point-by-point convolution is equivalent to the output of the replaced convolutional layer.
  • It should be noted that the original training data refers to the training data used to obtain the trained convolutional neural network model by training the initial convolutional neural network model.
  • If the original training data of the trained convolutional neural network model (hereinafter referred to as the original convolutional neural network model) is available, the compressed model can be retrained with it, so that the compressed model can learn input and output characteristics that the original convolutional neural network model did not learn. The expression ability of the retrained compressed model can then surpass that of the original convolutional neural network model, which is conducive to improving model performance.
  • FIG. 12 is a schematic structural diagram of a neural network model deployment apparatus provided by an embodiment of the application. As shown in FIG. 12, the apparatus 1200 may include a processor 1201 and a memory 1202.
  • the memory 1202 is used to store program codes
  • The processor 1201 calls the program code and, when the program code is executed, is configured to perform the following operations: obtain a trained convolutional neural network model; perform matrix decomposition on the weight parameters of the convolutional layer in the convolutional neural network model to obtain the matrix decomposition result of the convolutional layer; adjust the structure of the convolutional neural network model according to the matrix decomposition result to compress the convolutional neural network model and obtain a compressed model; and deploy the compressed model.
  • the neural network model deployment apparatus provided in this embodiment can be used to implement the technical solutions of the foregoing method embodiments, and its implementation principles and technical effects are similar to those of the method embodiments, and will not be repeated here.
  • an embodiment of the present application also provides a mobile platform, including a memory and a processor, and the memory stores a convolutional neural network model deployed according to the method described in the foregoing method embodiment;
  • When the convolutional neural network model is called and loaded by the processor, it is used to process the sensor data obtained by the mobile platform.
  • the sensor data includes vision sensor data.
  • the mobile platform includes an unmanned aerial vehicle.
  • An embodiment of the present application also provides a pan-tilt device including a memory and a processor, and the memory stores a convolutional neural network model deployed according to the method described in the foregoing method embodiment;
  • When the convolutional neural network model is called and loaded by the processor, it is used to process the sensor data obtained by the pan-tilt device.
  • the sensor data includes vision sensor data.
  • the pan-tilt device is a handheld pan-tilt device.
  • An embodiment of the present application also provides a mobile terminal, including a memory and a processor, and the memory stores a convolutional neural network model deployed according to the method described in the foregoing method embodiment;
  • When the convolutional neural network model is called and loaded by the processor, it is used to process the sensor data obtained by the mobile terminal.
  • a person of ordinary skill in the art can understand that all or part of the steps in the foregoing method embodiments can be implemented by a program instructing relevant hardware.
  • The aforementioned program can be stored in a computer-readable storage medium. When the program is executed, it executes the steps of the foregoing method embodiments; the foregoing storage medium includes media that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disk.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Image Analysis (AREA)

Abstract

A neural network model deployment method and apparatus, and a device. The method comprises: obtaining a trained convolutional neural network model (201); performing matrix decomposition on a weight parameter of a convolution layer in the convolutional neural network model to obtain a matrix decomposition result of the convolution layer (202); adjusting, according to the matrix decomposition result, the structure of the convolutional neural network model so as to compress the convolutional neural network model to obtain a compressed model of the convolutional neural network model (203); and deploying the compressed model (204). Said method reduces the dependence on raw training data.

Description

Neural network model deployment method, device and equipment
Technical field
This application relates to the technical field of neural networks, and in particular, to a neural network model deployment method, device, and equipment.
Background
With the continuous development of neural network technology, convolutional neural network models are applied more and more widely.
Generally, before a trained convolutional neural network model is deployed, it can be compressed in the following two ways to reduce its size and computational cost. One way is to perform model compression by reducing the number of channels of the trained convolutional neural network model. Another way is to compress the model by converting the weight parameters of the trained neural network model from floating-point weight parameters to fixed-point weight parameters.
However, the above model deployment methods are highly dependent on the original training data.
发明内容Summary of the invention
本申请实施例提供一种神经网络模型部署方法、装置及设备,用以解决现有技术中模型部署方式存在对原始训练数据依赖性较大的问题。The embodiments of the present application provide a neural network model deployment method, device, and equipment to solve the problem that the model deployment method in the prior art is highly dependent on original training data.
第一方面,本申请实施例提供一种神经网络模型部署方法,包括:获得已训练好的卷积神经网络模型;所述卷积神经网络模型中卷积层的权重参数进行矩阵分解,获得所述卷积层的矩阵分解结果;根据所述矩阵分解结果,调整所述卷积神经网络模型的结构,以对所述卷积神经网络模型进行压缩,得到所述卷积神经网络模型的压缩后模型;对所述压缩后模型进行部署。In the first aspect, an embodiment of the present application provides a neural network model deployment method, including: obtaining a trained convolutional neural network model; performing matrix decomposition on the weight parameters of the convolutional layer in the convolutional neural network model to obtain all The matrix decomposition result of the convolutional layer; according to the matrix decomposition result, the structure of the convolutional neural network model is adjusted to compress the convolutional neural network model to obtain the compressed convolutional neural network model Model; deploy the compressed model.
第二方面,本申请实施例提供一种神经网络模型部署装置,包括:处理 器和存储器;所述存储器,用于存储程序代码;所述处理器,调用所述程序代码,当程序代码被执行时,用于执行以下操作:In the second aspect, an embodiment of the present application provides a neural network model deployment device, including: a processor and a memory; the memory is used to store program code; the processor calls the program code, and when the program code is executed When used to perform the following operations:
获得已训练好的卷积神经网络模型;对所述卷积神经网络模型中卷积层的权重参数进行矩阵分解,获得所述卷积层的矩阵分解结果;根据所述矩阵分解结果,调整所述卷积神经网络模型的结构,以对所述卷积神经网络模型进行压缩,得到所述卷积神经网络模型的压缩后模型;对所述压缩后模型进行部署。Obtain a trained convolutional neural network model; perform matrix decomposition on the weight parameters of the convolutional layer in the convolutional neural network model to obtain the matrix decomposition result of the convolutional layer; adjust all parameters according to the matrix decomposition result The structure of the convolutional neural network model is used to compress the convolutional neural network model to obtain a compressed model of the convolutional neural network model; deploy the compressed model.
第三方面,本申请实施例提供一种计算机可读存储介质,所述计算机可读存储介质存储有计算机程序,所述计算机程序包含至少一段代码,所述至少一段代码可由计算机执行,以控制所述计算机执行上述第一方面任一项所述的方法。In a third aspect, an embodiment of the present application provides a computer-readable storage medium, the computer-readable storage medium stores a computer program, the computer program includes at least one piece of code, the at least one piece of code can be executed by a computer to control all The computer executes the method described in any one of the first aspects above.
第四方面,本申请实施例提供一种计算机程序,当所述计算机程序被计算机执行时,用于实现上述第一方面任一项所述的方法。In a fourth aspect, an embodiment of the present application provides a computer program, when the computer program is executed by a computer, it is used to implement the method described in any one of the above-mentioned first aspects.
In a fifth aspect, an embodiment of the present application provides a mobile platform, including a memory and a processor. The memory stores a convolutional neural network model deployed according to the method of any one of the first aspect; when the convolutional neural network model is called and loaded by the processor, it is used to process sensor data obtained by the mobile platform.

In a sixth aspect, an embodiment of the present application provides a pan-tilt device, including a memory and a processor. The memory stores a convolutional neural network model deployed according to the method of any one of the first aspect; when the convolutional neural network model is called and loaded by the processor, it is used to process sensor data obtained by the pan-tilt device.

In a seventh aspect, an embodiment of the present application provides a mobile terminal, including a memory and a processor. The memory stores a convolutional neural network model deployed according to the method of any one of the first aspect; when the convolutional neural network model is called and loaded by the processor, it is used to process sensor data obtained by the mobile terminal.
Embodiments of the present application provide a neural network model deployment method, apparatus, and device. Matrix decomposition is performed on the weight parameters of a convolutional layer in a trained convolutional neural network model to obtain a matrix decomposition result of the convolutional layer; the structure of the convolutional neural network model is adjusted according to the matrix decomposition result so as to compress the model; and the compressed model is deployed. Because the matrix decomposition result preserves the characteristics of the decomposed matrix, that is, the characteristics of the original convolutional layer's weight parameters, the compressed model obtained by adjusting the model structure according to the matrix decomposition result retains the convolutional characteristics of the original convolutional layer and hence the input-output behavior of the model, thereby reducing the dependence on the original training data.
Description of the Drawings

To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show some embodiments of the present application, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an application scenario of the neural network model deployment method provided by an embodiment of the present application;

FIG. 2 is a schematic flowchart of a neural network model deployment method provided by an embodiment of the present application;

FIG. 3 is a schematic flowchart of a neural network model deployment method provided by another embodiment of the present application;

FIG. 4 is a schematic flowchart of a neural network model deployment method provided by yet another embodiment of the present application;

FIG. 5 is a schematic diagram of matrix decomposition of the weight parameters of a convolutional layer provided by an embodiment of the present application;

FIG. 6 is a schematic diagram of the input-output relationship between a convolutional layer and its replacement layer provided by an embodiment of the present application;

FIG. 7 is a schematic flowchart of a neural network model deployment method provided by yet another embodiment of the present application;

FIG. 8 is a schematic diagram of matrix decomposition of the weight parameters of a convolutional layer provided by another embodiment of the present application;

FIG. 9 is a schematic diagram of the input-output relationship between a convolutional layer and its replacement layer provided by another embodiment of the present application;

FIG. 10 is a schematic flowchart of a neural network model deployment method provided by yet another embodiment of the present application;

FIG. 11 is a schematic diagram of the input-output relationship between a convolutional layer and its replacement layer provided by yet another embodiment of the present application;

FIG. 12 is a schematic structural diagram of a neural network model deployment apparatus provided by an embodiment of the present application.
Detailed Description

To make the objectives, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the protection scope of the present application.
The neural network model deployment method provided by the embodiments of the present application can be applied to any scenario in which a convolutional neural network model needs to be deployed. The method may specifically be executed by a neural network model deployment apparatus. FIG. 1 shows a schematic application scenario: the neural network model deployment apparatus 11 may obtain a trained convolutional neural network model from another apparatus/device 12 and process the obtained model with the method provided by the embodiments of the present application. The specific manner in which the deployment apparatus 11 communicates with the other apparatus/device 12 is not limited in this application; for example, a wireless connection may be established via a Bluetooth interface, or a wired connection via an RS232 interface.

The device that includes the neural network model deployment apparatus may specifically be computer equipment with relatively strong computing capability.

It should be noted that FIG. 1 takes as an example the deployment apparatus obtaining the convolutional neural network model from another apparatus or device. Alternatively, the deployment apparatus may obtain the model in other ways; exemplarily, it may obtain the convolutional neural network model by training an initial convolutional neural network model itself.
The neural network model deployment method provided by the embodiments of the present application performs matrix decomposition on the weight parameters of a convolutional layer in a trained convolutional neural network model to obtain a matrix decomposition result of the convolutional layer, adjusts the structure of the model according to the matrix decomposition result so as to compress it, and deploys the compressed model, thereby reducing the dependence on the original training data.

Some embodiments of the present application are described in detail below with reference to the accompanying drawings. Where no conflict arises, the following embodiments and the features in them may be combined with one another.
FIG. 2 is a schematic flowchart of a neural network model deployment method provided by an embodiment of the present application. The execution subject of this embodiment may be a neural network model deployment apparatus, specifically the processor of such an apparatus. As shown in FIG. 2, the method of this embodiment may include:

Step 201: obtain a trained convolutional neural network model.

In this step, the specific way of obtaining the trained convolutional neural network model is not limited in this application. For example, a trained convolutional neural network model sent by another apparatus/device may be received. For another example, a trained convolutional neural network model may be read from the storage device of another apparatus/device.
Step 202: perform matrix decomposition on the weight parameters of a convolutional layer in the convolutional neural network model to obtain a matrix decomposition result of the convolutional layer.

In this step, the matrix decomposition applied to the weight parameters of the convolutional layer may be any decomposition satisfying condition 1: the matrix can be decomposed into a sum of multiple summation terms.

Optionally, the matrix decomposition may further satisfy condition 2: among the multiple summation terms obtained by decomposing the matrix, there is a summation term A whose energy is greater than a first threshold and a summation term B whose energy is less than a second threshold. The coexistence of such terms indicates that the energies of the different summation terms differ considerably. The first threshold may be, for example, 80%, and the second threshold may be, for example, 5%. The energy of a summation term characterizes its importance in the matrix decomposition result: the greater the energy, the more important the term.

By satisfying condition 2, the energy of a small number of summation terms accounts for a large proportion of the total energy of all summation terms, so the weight parameters of the convolutional layer can be approximated by a small number of summation terms, which improves the compression effect. Exemplarily, the matrix decomposition may be singular value decomposition (SVD), in which case the energy can be understood as the singular value.
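As a concrete illustration of this kind of decomposition, the following sketch (an illustrative assumption, not part of the application itself) uses NumPy's SVD to split a matrix into rank-1 summation terms whose singular values play the role of the energies described above:

```python
import numpy as np

# Hypothetical matrix standing in for one group of convolution weights,
# e.g. N = 64 output channels by k*k = 9 kernel entries.
W = np.random.randn(64, 9)

# SVD writes W as a sum of rank-1 terms: W = sum_j s[j] * outer(U[:, j], Vt[j, :]).
U, s, Vt = np.linalg.svd(W, full_matrices=False)

# The singular value s[j] is the "energy" of the j-th summation term;
# normalizing shows what fraction of the total energy each term carries.
print(s / s.sum())

# Summing all terms recovers W up to floating-point error.
W_rec = sum(s[j] * np.outer(U[:, j], Vt[j, :]) for j in range(len(s)))
assert np.allclose(W, W_rec)
```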
Exemplarily, the weight parameters of each convolutional layer in some or all of the convolutional layers of the convolutional neural network model may be matrix-decomposed separately to obtain the matrix decomposition result of each such layer.

Step 203: adjust the structure of the convolutional neural network model according to the matrix decomposition result, so as to compress the convolutional neural network model and obtain a compressed model of the convolutional neural network model.

In this step, because the matrix decomposition result changes the structure of the weight parameters, and the structure of the weight parameters corresponds to the structure of the model, the structure of the convolutional neural network model needs to be adjusted according to the matrix decomposition result.

Exemplarily, when the amount of weight parameters of the model structure corresponding to all summation terms of the matrix decomposition result is smaller than that of the trained convolutional neural network model, the structure of the model may be adjusted according to all the summation terms; alternatively, some of the summation terms may be selected from all the summation terms, and the structure of the model adjusted according to the selected partial summation terms.

When the amount of weight parameters of the model structure corresponding to all summation terms of the matrix decomposition result is greater than or equal to that of the trained convolutional neural network model, some of the summation terms may be selected from all the summation terms, and the structure of the model adjusted according to the selected partial summation terms.
Step 204: deploy the compressed model.

In this step, exemplarily, the compressed model may be deployed directly to the device that performs model computation; or the device that performs model computation may read the compressed model from the storage device of the neural network model deployment apparatus.

Exemplarily, the compressed model may also be sent to another apparatus/device, such as the other apparatus/device 12, which then deploys the compressed model to the device that performs model computation, or from whose storage device the computing device reads the compressed model.

The device that performs model computation may be any type of device on which a convolutional neural network model needs to be deployed. Exemplarily, it may include a movable platform, such as an unmanned aerial vehicle.

In this embodiment, matrix decomposition is performed on the weight parameters of a convolutional layer in a trained convolutional neural network model to obtain a matrix decomposition result of the convolutional layer; the structure of the model is adjusted according to the matrix decomposition result so as to compress it; and the compressed model is deployed. Because the matrix decomposition result preserves the characteristics of the decomposed matrix, that is, the characteristics of the original convolutional layer's weight parameters, the compressed model obtained by adjusting the model structure according to the matrix decomposition result retains the convolutional characteristics of the original convolutional layer, and hence the input-output behavior of the model, thereby reducing the dependence on the original training data.
FIG. 3 is a schematic flowchart of a neural network model deployment method provided by another embodiment of the present application. On the basis of the embodiment shown in FIG. 2, this embodiment mainly describes an optional implementation of adjusting the structure of the convolutional neural network model according to the matrix decomposition result. As shown in FIG. 3, the method of this embodiment may include:

Step 301: obtain a trained convolutional neural network model.

It should be noted that step 301 is similar to step 201 and is not repeated here.

Step 302: perform matrix decomposition on the weight parameters of a convolutional layer in the convolutional neural network model to obtain a matrix decomposition result of the convolutional layer.

It should be noted that step 302 is similar to step 202 and is not repeated here.
Step 303: determine, according to the matrix decomposition result, a replacement layer for replacing the convolutional layer, where the number of weight parameters of the replacement layer is less than the number of weight parameters of the convolutional layer.

In this step, the replacement layer of a convolutional layer is used to replace that convolutional layer in the convolutional neural network model. Since the replacement layer has fewer weight parameters than the convolutional layer, the model is compressed.

It should be noted that the structure of the replacement layer and its weight parameters correspond to all or some of the summation terms in the matrix decomposition result of the convolutional layer.

Exemplarily, the matrix decomposition result includes multiple summation terms, and determining the replacement layer according to the matrix decomposition result may specifically include: determining the replacement layer according to some of the multiple summation terms. In this case, the structure and weight parameters of the replacement layer correspond to those partial summation terms.

To compensate for the error introduced because the replacement layer's weight parameters do not take into account the summation terms other than the selected partial summation terms, optionally, the method of this embodiment may further include: determining a bias parameter of the replacement layer according to the summation terms other than the partial summation terms. By determining the bias parameter of the replacement layer from the remaining summation terms, the error caused by omitting those terms from the weight parameters is compensated through the bias parameter, which helps improve the accuracy of the compressed model.
Since the input of a convolutional layer usually comes from the batch-normalized (Batch Normalization, BN) output of the previous convolutional layer and thus follows a normal distribution matching the BN statistics, the bias parameter of the replacement layer can be determined based on the normal-distribution characteristics of the replacement layer's input channels. Assume the input channels of the replacement layer are independent and identically distributed, and that each position within a channel takes the mean value of the channel it belongs to; convolving these means with the other (discarded) summation terms yields the decomposition compression loss under the corresponding compression scheme. The loss computed in this way can be folded into the bias term. Exemplarily, determining the bias parameter of the replacement layer according to the summation terms other than the selected partial summation terms may specifically include: convolving the sum of the other summation terms with the mean of the normal distribution of each input channel of the replacement layer to obtain a convolution result for each input channel; and folding the convolution result of each input channel into its bias parameter to obtain the bias parameter of the replacement layer.
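A minimal sketch of this compensation, under the stated i.i.d.-channel assumption (NumPy; the per-channel means `mu` are assumed to come from the preceding BN layer's statistics, and border effects of padding are ignored):

```python
import numpy as np

def compensate_bias(discarded_sum, mu, bias):
    """Fold the loss from the discarded summation terms into the bias.

    discarded_sum: (N, C, k, k) sum of the discarded summation terms,
                   reshaped back to convolution-kernel layout.
    mu:            (C,) mean of each input channel, taken from BN statistics.
    bias:          (N,) original bias parameters of the layer.
    """
    # If every position of input channel c is assumed to equal mu[c],
    # convolving the discarded kernels with the input reduces to a constant:
    # for output channel n the offset is sum_c mu[c] * sum(discarded_sum[n, c]).
    offset = np.einsum('nckl,c->n', discarded_sum, mu)
    return bias + offset
```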
Step 304: replace the convolutional layer in the convolutional neural network model with the replacement layer.

In this step, because the structure of the replacement layer's weight parameters differs from that of the convolutional layer's weight parameters, the structure of the replacement layer differs from that of the convolutional layer. Replacing the convolutional layer in the convolutional neural network model with its replacement layer therefore adjusts the structure of the convolutional neural network model.

In this embodiment, matrix decomposition is performed on the weight parameters of a convolutional layer in a trained convolutional neural network model to obtain a matrix decomposition result; a replacement layer for replacing the convolutional layer is determined according to the matrix decomposition result; and the convolutional layer in the model is replaced with the replacement layer. This adjusts the structure of the convolutional neural network, and because the replacement layer has fewer weight parameters than the convolutional layer, the convolutional neural network model is compressed by adjusting its structure.
FIG. 4 is a schematic flowchart of a neural network model deployment method provided by yet another embodiment of the present application. On the basis of the embodiment shown in FIG. 3, this embodiment mainly describes an optional implementation of the matrix decomposition of the weight parameters of a convolutional layer. As shown in FIG. 4, the method of this embodiment may include:

Step 401: obtain a trained convolutional neural network model.

It should be noted that step 401 is similar to step 201 and is not repeated here.

Step 402: group the weight parameters of a convolutional layer in the convolutional neural network model according to the input channels of the convolutional layer, and perform matrix decomposition on each group of weight parameters as a two-dimensional matrix to obtain a first matrix decomposition result of each group of weight parameters.
In this step, assume the convolutional layer has C input channels, N output channels, and a convolution kernel of size k×k, so the layer has N×C×k² weight parameters. As shown in FIG. 5, the weight parameters can be divided into C groups according to the input channels, each group being a two-dimensional matrix of size N×k². Further, taking SVD as the matrix decomposition method, the two-dimensional matrix W_i ∈ R^(N×k²) of the i-th group is decomposed by SVD as

W_i = Σ_j σ_j u_j v_jᵀ, j = 1, …, k²,

where u_j ∈ R^N, v_j ∈ R^(k²), σ_j is the j-th singular value, and k² << N, so the decomposition has at most k² summation terms.

The first matrix decomposition result of each group of weight parameters includes multiple first summation terms, each corresponding to an energy that characterizes its importance.
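A sketch of this grouping and per-group decomposition (illustrative only; the array shapes follow the N, C, k notation above):

```python
import numpy as np

def svd_by_input_channel(weights):
    """weights: (N, C, k, k) convolutional weights.

    Returns, for each of the C input channels, the SVD factors of the
    N x k^2 matrix formed by that channel's weights; there are at most
    k^2 summation terms per group since k^2 << N.
    """
    N, C, k, _ = weights.shape
    results = []
    for i in range(C):
        Wi = weights[:, i].reshape(N, k * k)               # one group: N x k^2
        U, s, Vt = np.linalg.svd(Wi, full_matrices=False)  # s sorted descending
        results.append((U, s, Vt))
    return results
```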
Step 403: sort the multiple first summation terms of each group of weight parameters in descending order of energy, and select the top-ranked partial summation terms as first target summation terms.

In this step, exemplarily, selecting the top-ranked partial summation terms as the first target summation terms may specifically include: selecting the top-ranked partial summation terms whose energy sum is greater than or equal to an energy threshold. For example, suppose the energy threshold is 0.9 and the first matrix decomposition result of a group includes five summation terms, summation term 1 to summation term 5, with energies 0.8, 0.1, 0.07, 0.02, and 0.01, respectively; then summation terms 1 and 2 may be selected as the first target summation terms.

It should be noted that the energy threshold can be set flexibly as required. The larger the energy threshold, the closer the first target summation terms approximate the group's weight parameters.

Exemplarily, selecting the top-ranked partial summation terms as the first target summation terms may alternatively include: selecting the top-ranked summation terms whose count is less than or equal to a count threshold. For example, suppose the count threshold is 3 and the first matrix decomposition result of a group includes five summation terms, summation term 1 to summation term 5, with decreasing energies; then summation terms 1 to 3 may be selected as the first target summation terms.

It should be noted that the count threshold can be set flexibly as required. The larger the count threshold, the closer the first target summation terms approximate the group's weight parameters.
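The two selection rules can be sketched as follows (a hypothetical helper; `s` is the descending array of singular values of one group):

```python
import numpy as np

def select_terms(s, energy_threshold=None, count_threshold=None):
    """Return how many leading summation terms to keep.

    Either keep the smallest prefix whose share of the total energy reaches
    energy_threshold, or simply keep the first count_threshold terms.
    """
    if energy_threshold is not None:
        share = np.cumsum(s) / np.sum(s)
        return int(np.argmax(share >= energy_threshold)) + 1
    return min(count_threshold, len(s))

# E.g. with energies [0.8, 0.1, 0.07, 0.02, 0.01] and threshold 0.9,
# the first two terms are kept, matching the example above.
```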
Step 404: determine, according to the first target summation terms of each group of weight parameters, a replacement layer for replacing the convolutional layer.

In this step, exemplarily, step 404 may specifically include: determining the replacement layer according to the strategy that the first target summation terms with the same rank across different groups of weight parameters correspond to the same branch, and different branches are connected in parallel.

For example, suppose the number of input channels of the convolutional layer equals 5, i.e., its weight parameters are divided into 5 groups, and each group has 4 first target summation terms. In descending order of energy, the 4 first target summation terms of the first group are summation terms 1a to 1d, those of the second group are summation terms 2a to 2d, those of the third group are summation terms 3a to 3d, those of the fourth group are summation terms 4a to 4d, and those of the fifth group are summation terms 5a to 5d. The replacement layer may then include 4 parallel branches: branch 1 corresponding to summation terms 1a, 2a, 3a, 4a, and 5a; branch 2 corresponding to summation terms 1b, 2b, 3b, 4b, and 5b; branch 3 corresponding to summation terms 1c, 2c, 3c, 4c, and 5c; and branch 4 corresponding to summation terms 1d, 2d, 3d, 4d, and 5d.
Exemplarily, one branch may include a first convolutional layer and a second convolutional layer connected in series, where the input of the first convolutional layer is the input of the replaced convolutional layer, and the output of the first convolutional layer is the input of the second convolutional layer.

Exemplarily, the first convolutional layer performs a pointwise convolution on the input of the replaced convolutional layer.

Exemplarily, the number of input channels of the first convolutional layer is C, equal to the number of input channels of the replaced convolutional layer; the number of output channels of the first convolutional layer is N, equal to the number of output channels of the replaced convolutional layer; and the kernel size of the first convolutional layer is 1×1. It can be seen that the first convolutional layer has NC parameters.

Exemplarily, the second convolutional layer performs a depthwise convolution on the output of the first convolutional layer.

Exemplarily, the numbers of input channels and output channels of the second convolutional layer are both N, equal to the number of output channels of the replaced convolutional layer, and the kernel size of the second convolutional layer is k×k, equal to the kernel size of the replaced convolutional layer. It can be seen that the second convolutional layer has Nk² parameters.

The replaced convolutional layer has NCk² parameters. When the first convolutional layer performs pointwise convolution and the second convolutional layer performs depthwise convolution, one branch of the replacement layer has NC+Nk² parameters; generally, C≤N and k²<<C. Assuming the input feature map has height H and width W, the computation of the convolutional layer is NCk²HW, while that of one branch of the replacement layer is (NC+Nk²)HW. Therefore, for a k=3 convolution, the parameters and computation of one branch of the replacement layer are about 1/9 of those of the convolutional layer.
When the number of selected partial summation terms is greater than 1, in order to avoid updating the layer following the replacement layer, optionally, the replacement layer may include a summation layer for accumulating the outputs of the different branches.

Taking as an example a replacement layer with 3 branches, where the first convolutional layer performs pointwise convolution and the second convolutional layer performs depthwise convolution, the input-output relationship between a convolutional layer and its replacement layer may be as shown in FIG. 6. Referring to FIG. 6, the input of each branch's pointwise convolution is the input of the replaced convolutional layer; the output of each branch's pointwise convolution serves as the input of the depthwise convolution in series with it; and after the outputs of the branches' depthwise convolutions are accumulated by the summation layer, the result corresponds to the output of the replaced convolutional layer.
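A structural sketch of such a replacement layer, written with PyTorch as an illustrative assumption (the application does not prescribe an implementation; in the compression scheme the branch weights would be filled in from the decomposition factors, as discussed next):

```python
import torch
import torch.nn as nn

class ReplacementLayer(nn.Module):
    """Parallel branches of 1x1 pointwise + k x k depthwise convolution,
    accumulated by a summation layer (shapes follow the C, N, k notation)."""

    def __init__(self, C, N, k, num_branches):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(C, N, kernel_size=1, bias=False),   # pointwise: NC params
                nn.Conv2d(N, N, kernel_size=k, padding=k // 2,
                          groups=N, bias=False),              # depthwise: N*k^2 params
            )
            for _ in range(num_branches)
        )

    def forward(self, x):
        out = self.branches[0](x)
        for branch in self.branches[1:]:
            out = out + branch(x)  # summation layer accumulating branch outputs
        return out

# For C=64, N=128, k=3 and 3 branches: 3*(128*64 + 128*9) = 28032 parameters,
# versus 128*64*9 = 73728 for the original convolutional layer.
```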
It should be noted that, for the SVD decomposition of step 402, the first convolutional layer of a branch corresponds to the u vectors (the U matrix columns) of all the first target summation terms associated with that branch, and the second convolutional layer of the branch corresponds to the v vectors (the V matrix columns) of those terms.

Specifically, in FIG. 6, the pointwise convolution of the first branch (from left to right) corresponds to the C vectors u_1 obtained by separately applying SVD to the C groups of weight parameters, and the depthwise convolution of the first branch corresponds to the C vectors v_1; the pointwise convolution of the second branch corresponds to the C vectors u_2, and its depthwise convolution to the C vectors v_2; the pointwise convolution of the third branch corresponds to the C vectors u_3, and its depthwise convolution to the C vectors v_3.
Step 405: replace the convolutional layer in the convolutional neural network model with the replacement layer.

It should be noted that step 405 is similar to step 304 and is not repeated here.

In this embodiment, the weight parameters of a convolutional layer are grouped according to the input channels of the convolutional layer, the first target summation terms of each group of weight parameters are determined, a replacement layer for replacing the convolutional layer is determined according to the first target summation terms of each group, and the convolutional layer in the convolutional neural network model is replaced with the replacement layer. The structure of the convolutional neural network is thus adjusted by grouping the weight parameters of the convolutional layer according to its input channels and performing matrix decomposition on each group of weight parameters.
FIG. 7 is a schematic flowchart of a neural network model deployment method provided by yet another embodiment of the present application. On the basis of the embodiment shown in FIG. 3, this embodiment mainly describes another optional implementation of the matrix decomposition of the weight parameters of a convolutional layer. As shown in FIG. 7, the method of this embodiment may include:

Step 701: obtain a trained convolutional neural network model.

It should be noted that step 701 is similar to step 201 and is not repeated here.
Step 702: group the weight parameters of a convolutional layer in the convolutional neural network model according to the output channels of the convolutional layer, and perform matrix decomposition on each group of weight parameters as a two-dimensional matrix to obtain a second matrix decomposition result of each group of weight parameters.

In this step, assume the convolutional layer has C input channels, N output channels, and a convolution kernel of size k×k, so the layer has N×C×k² weight parameters. As shown in FIG. 8, the weight parameters can be divided into N groups according to the output channels, each group being a two-dimensional matrix of size C×k². Further, taking SVD as the matrix decomposition method, the two-dimensional matrix W_i ∈ R^(C×k²) of the i-th group is decomposed by SVD as

W_i = Σ_j σ_j u_j v_jᵀ, j = 1, …, k²,

where u_j ∈ R^C, v_j ∈ R^(k²), σ_j is the j-th singular value, and k² << C, so the decomposition has at most k² summation terms.

The second matrix decomposition result of each group of weight parameters includes multiple second summation terms, each corresponding to an energy that characterizes its importance.
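Mirroring the earlier input-channel sketch, the per-output-channel decomposition can be written as (illustrative assumption):

```python
import numpy as np

def svd_by_output_channel(weights):
    """weights: (N, C, k, k). SVD of each output channel's C x k^2 matrix."""
    N, C, k, _ = weights.shape
    return [np.linalg.svd(weights[n].reshape(C, k * k), full_matrices=False)
            for n in range(N)]
```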
Step 703: sort the multiple second summation terms of each group of weight parameters in descending order of energy, and select the top-ranked partial summation terms as second target summation terms.

In this step, exemplarily, selecting the top-ranked partial summation terms as the second target summation terms may specifically include: selecting the top-ranked partial summation terms whose energy sum is greater than or equal to an energy threshold. For example, suppose the energy threshold is 0.9 and the second matrix decomposition result of a group includes six summation terms, summation term a to summation term f, with energies 0.7, 0.1, 0.08, 0.06, 0.04, and 0.02, respectively; then summation terms a to d may be selected as the second target summation terms.

It should be noted that the energy threshold can be set flexibly as required. The larger the energy threshold, the closer the second target summation terms approximate the group's weight parameters.

Exemplarily, selecting the top-ranked partial summation terms as the second target summation terms may alternatively include: selecting the top-ranked summation terms whose count is less than or equal to a count threshold. For example, suppose the count threshold is 3 and the second matrix decomposition result of a group includes six summation terms, summation term a to summation term f, with decreasing energies; then summation terms a to c may be selected as the second target summation terms.

It should be noted that the count threshold can be set flexibly as required. The larger the count threshold, the closer the second target summation terms approximate the group's weight parameters.

It should be noted that the count threshold and energy threshold in step 703 may be the same as the count threshold and energy threshold in step 403, respectively.
Step 704: determine, according to the second target summation terms of each group of weight parameters, a replacement layer for replacing the convolutional layer.

Exemplarily, step 704 may specifically include: determining the replacement layer according to the strategy that the second target summation terms with the same rank across different groups of weight parameters correspond to the same branch, and different branches are connected in parallel.

For example, suppose the number of output channels of the convolutional layer equals 6, i.e., its weight parameters are divided into 6 groups, and each group has 4 second target summation terms. In descending order of energy, the 4 second target summation terms of the first group are summation terms aa to ad, those of the second group are summation terms ba to bd, those of the third group are summation terms ca to cd, those of the fourth group are summation terms da to dd, those of the fifth group are summation terms ea to ed, and those of the sixth group are summation terms fa to fd. The replacement layer may then include 4 parallel branches: branch 1 corresponding to summation terms aa, ba, ca, da, ea, and fa; branch 2 corresponding to summation terms ab, bb, cb, db, eb, and fb; branch 3 corresponding to summation terms ac, bc, cc, dc, ec, and fc; and branch 4 corresponding to summation terms ad, bd, cd, dd, ed, and fd.
Exemplarily, one branch includes a third convolutional layer and a fourth convolutional layer connected in series, where the input of the third convolutional layer is the input of the replaced convolutional layer, and the output of the third convolutional layer is the input of the fourth convolutional layer.

Exemplarily, the third convolutional layer performs a depthwise convolution on the input of the replaced convolutional layer.

Exemplarily, the numbers of input channels and output channels of the third convolutional layer are both C, equal to the number of input channels of the replaced convolutional layer, and the kernel size of the third convolutional layer is k×k, equal to the kernel size of the replaced convolutional layer. It can be seen that the third convolutional layer has Ck² parameters.

Exemplarily, the fourth convolutional layer performs a pointwise convolution on the output of the third convolutional layer.

Exemplarily, the number of input channels of the fourth convolutional layer is C, equal to the number of input channels of the replaced convolutional layer; the number of output channels of the fourth convolutional layer is N, equal to the number of output channels of the replaced convolutional layer; and the kernel size of the fourth convolutional layer is 1×1. It can be seen that the fourth convolutional layer has NC parameters.

The replaced convolutional layer has NCk² parameters. When the third convolutional layer performs depthwise convolution and the fourth convolutional layer performs pointwise convolution, one branch of the replacement layer has NC+Ck² parameters; generally, C≤N and k²<<C. Assuming the input feature map has height H and width W, the computation of the convolutional layer is NCk²HW, while that of one branch of the replacement layer is (NC+Ck²)HW. Therefore, for a k=3 convolution, the parameters and computation of one branch of the replacement layer are about 1/9 of those of the convolutional layer.
When the number of selected partial summation terms is greater than 1, in order to avoid updating the layer following the replacement layer, optionally, the replacement layer includes a summation layer for accumulating the outputs of the different branches.

Taking as an example a replacement layer with 3 branches, where the third convolutional layer performs depthwise convolution and the fourth convolutional layer performs pointwise convolution, the input-output relationship between a convolutional layer and its replacement layer may be as shown in FIG. 9. Referring to FIG. 9, the input of each branch's depthwise convolution is the input of the replaced convolutional layer; the output of each branch's depthwise convolution serves as the input of the pointwise convolution in series with it; and after the outputs of the branches' pointwise convolutions are accumulated by the summation layer, the result corresponds to the output of the replaced convolutional layer.
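In the PyTorch sketch given earlier, each branch of this variant would simply reverse the order of the two convolutions (again an illustrative assumption):

```python
import torch.nn as nn

# One branch of the FIG. 9 variant: depthwise first, then pointwise.
def make_branch(C, N, k):
    return nn.Sequential(
        nn.Conv2d(C, C, kernel_size=k, padding=k // 2,
                  groups=C, bias=False),             # depthwise: C*k^2 params
        nn.Conv2d(C, N, kernel_size=1, bias=False),  # pointwise: NC params
    )
```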
It should be noted that, for the SVD decomposition of step 702, the third convolutional layer of a branch corresponds to the u vectors (the U matrix columns) of all the second target summation terms associated with that branch, and the fourth convolutional layer of the branch corresponds to the v vectors (the V matrix columns) of those terms.

Specifically, in FIG. 9, the depthwise convolution of the first branch (from left to right) corresponds to the N vectors u_1 obtained by separately applying SVD to the N groups of weight parameters, and the pointwise convolution of the first branch corresponds to the N vectors v_1; the depthwise convolution of the second branch corresponds to the N vectors u_2, and its pointwise convolution to the N vectors v_2; the depthwise convolution of the third branch corresponds to the N vectors u_3, and its pointwise convolution to the N vectors v_3.
Step 705: replace the convolutional layer in the convolutional neural network model with the replacement layer.

It should be noted that step 705 is similar to step 304 and is not repeated here.

In this embodiment, the weight parameters of a convolutional layer are grouped according to the output channels of the convolutional layer, the second target summation terms of each group of weight parameters are determined, a replacement layer for replacing the convolutional layer is determined according to the second target summation terms of each group, and the convolutional layer in the convolutional neural network model is replaced with the replacement layer. The structure of the convolutional neural network is thus adjusted by grouping the weight parameters of the convolutional layer according to its output channels and performing matrix decomposition on each group of weight parameters.

It should be noted that the structure of the convolutional neural network model may be adjusted using the matrix decomposition manner provided by the embodiment shown in FIG. 4 or FIG. 7. Alternatively, as in the embodiment shown in FIG. 10, the two matrix decomposition manners may be combined to adjust the structure of the convolutional neural network model.
图10为本申请又一实施例提供的神经网络模型部署方法的流程示意图,本实施例在图4或图7所示实施例的基础上,主要描述了对卷积神经网络模型中卷积层的权重参数进行矩阵分解的一种可选的实现方式。如图10所示,本实施例的方法可以包括:Fig. 10 is a schematic flow chart of a neural network model deployment method provided by another embodiment of the application. This embodiment mainly describes the comparison of the convolutional layer in the convolutional neural network model on the basis of the embodiment shown in Fig. 4 or Fig. 7 An optional implementation method for matrix decomposition of the weight parameters. As shown in FIG. 10, the method of this embodiment may include:
步骤1001,获得已训练好的卷积神经网络模型。Step 1001: Obtain a trained convolutional neural network model.
步骤1002,按照所述卷积神经网络模型中卷积层的输入通道,对所述卷积层的权重参数进行分组,并将每组权重参数作为二维矩阵进行矩阵分解,得到每组权重参数的第一矩阵分解结果。Step 1002: According to the input channel of the convolutional layer in the convolutional neural network model, group the weight parameters of the convolutional layer, and use each group of weight parameters as a two-dimensional matrix to perform matrix decomposition to obtain each group of weight parameters The result of the first matrix factorization.
需要说明的是,步骤1002与步骤402类似,在此不再赘述。It should be noted that step 1002 is similar to step 402, and will not be repeated here.
步骤1003,按照能量由大至小的顺序,对每组权重参数的所述多个第一求和项进行排序,并选择排序靠前的部分求和项作为第一目标求和项。Step 1003: Sort the multiple first summation items of each group of weight parameters according to the order of energy from largest to smallest, and select the partial summation item with the highest ranking as the first target summation item.
需要说明的是,步骤1003与步骤403类似,在此不再赘述。It should be noted that step 1003 is similar to step 403, and will not be repeated here.
Step 1004: group the weight parameters of the convolutional layer according to the output channels of the convolutional layer in the convolutional neural network model, and perform matrix decomposition on each group of weight parameters as a two-dimensional matrix to obtain a second matrix decomposition result of each group of weight parameters.
It should be noted that step 1004 is similar to step 702 and will not be repeated here.
Step 1005: sort the plurality of second summation terms of each group of weight parameters in descending order of energy, and select the top-ranked partial summation terms as the second target summation terms.
It should be noted that step 1005 is similar to step 703 and will not be repeated here.
It should be noted that there is no restriction on the execution order between steps 1004-1005 and steps 1002-1003.
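To make steps 1002 to 1005 concrete, the following is a minimal NumPy sketch (illustrative only, not part of this application) of both grouped decompositions. It assumes the convolutional weights are stored as an (N, C, K, K) array; the function name grouped_svd_terms, the definition of a term's energy as its normalized squared singular value, and the 0.9 energy threshold are assumptions chosen for illustration.

import numpy as np

def grouped_svd_terms(weight, group_by="input", energy_threshold=0.9):
    """Group conv-layer weights by channel and decompose each group by SVD.

    weight: ndarray of shape (N, C, K, K), i.e. output channels, input
    channels, kernel height and width. group_by="input" yields the first
    matrix decomposition result (steps 1002-1003); group_by="output" yields
    the second (steps 1004-1005). Each kept term is a rank-1 summation term
    stored as (normalized energy, singular value, left vector, right vector).
    """
    n, c, k, _ = weight.shape
    num_groups = c if group_by == "input" else n
    results = []
    for g in range(num_groups):
        # Treat each group of weight parameters as a two-dimensional matrix.
        mat = (weight[:, g] if group_by == "input" else weight[g]).reshape(-1, k * k)
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        # Singular values come back in descending order, which is exactly the
        # large-to-small energy ordering used to rank the summation terms.
        energy = s ** 2 / np.sum(s ** 2)
        # Keep the top-ranked terms whose cumulative energy reaches the threshold.
        kept = int(np.searchsorted(np.cumsum(energy), energy_threshold)) + 1
        results.append([(energy[i], s[i], u[:, i], vt[i]) for i in range(kept)])
    return results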
Step 1006: based on a target strategy, select the first target summation terms or the second target summation terms as the specific target summation terms, and determine, according to the specific target summation terms, a replacement layer for replacing the convolutional layer.
In this step, the target strategy is any type of strategy that can be used to select the better summation terms from the first target summation terms and the second target summation terms, and it can be implemented flexibly according to requirements. Exemplarily, the target strategy includes a minimum-number-of-summation-terms strategy or a maximum-energy strategy.
Taking the minimum-number-of-summation-terms strategy as an example, suppose each group of the convolutional layer has 2 first target summation terms and 4 second target summation terms. Since the number of first target summation terms is smaller, the first target summation terms are selected as the specific target summation terms under this strategy, and the structure of the convolutional neural network model is adjusted in the manner shown in FIG. 4. Because fewer summation terms mean fewer parameters, taking the minimum-number-of-summation-terms strategy as the target strategy reduces the size of the compressed model as much as possible.
Taking the maximum-energy strategy as an example, suppose the energy sum of the first target summation terms of each group of the convolutional layer is 0.9 and the energy sum of the second target summation terms of each group is 0.95. Since the energy sum of the first target summation terms is smaller, the second target summation terms are selected as the specific target summation terms under this strategy, and the structure of the convolutional neural network model is adjusted in the manner shown in FIG. 7. Because the greater the energy of the sum of the partial summation terms, the closer the weight parameters represented by those terms are to the weight parameters of the convolutional layer, taking the maximum-energy strategy as the target strategy minimizes the error that compression introduces into the model.
It should be noted that the specific target summation terms are either the first target summation terms or the second target summation terms. When the specific target summation terms are the first target summation terms, for a specific description of determining, in step 1006, the replacement layer for replacing the convolutional layer according to the first target summation terms, reference may be made to the relevant description of step 404, which will not be repeated here. When the specific target summation terms are the second target summation terms, for a specific description of determining, in step 1006, the replacement layer according to the second target summation terms, reference may be made to the relevant description of step 704, which will not be repeated here.
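As a sketch, the choice in step 1006 between the two strategies might look as follows, reusing the per-group term lists produced by the sketch above; how the per-group energies are aggregated for the comparison is an assumption, since this application fixes only the selection criteria, not their implementation.

def select_specific_terms(first_terms, second_terms, strategy="min_terms"):
    """Pick the first or second target summation terms per the target strategy."""
    if strategy == "min_terms":
        # Fewer summation terms mean fewer parameters in the replacement layer,
        # so this strategy shrinks the compressed model as much as possible.
        first_is_smaller = sum(map(len, first_terms)) <= sum(map(len, second_terms))
        return first_terms if first_is_smaller else second_terms
    if strategy == "max_energy":
        # More retained energy means the kept terms approximate the original
        # weights more closely, so this strategy minimizes compression error.
        e_first = sum(e for group in first_terms for e, *_ in group)
        e_second = sum(e for group in second_terms for e, *_ in group)
        return first_terms if e_first >= e_second else second_terms
    raise ValueError(f"unknown target strategy: {strategy}")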
Step 1007: replace the convolutional layer in the convolutional neural network model with the replacement layer.
It should be noted that step 1007 is similar to step 304 and will not be repeated here.
In this embodiment, based on the target strategy, either the first target summation terms, determined by grouping the weight parameters of the convolutional layer according to the input channels of the convolutional layer in the convolutional neural network model, or the second target summation terms, determined by grouping the weight parameters of the convolutional layer according to the output channels of the convolutional layer, are selected as the specific target summation terms, and the replacement layer for replacing the convolutional layer is determined according to the specific target summation terms. This makes it possible to select, according to requirements, the better summation terms from the first target summation terms and the second target summation terms for determining the replacement layer, so that the model compression result meets the requirements to the greatest extent.
On the basis of the foregoing embodiments, the number of the partial summation terms is greater than or equal to 1. Taking the second target summation terms as an example, when the number of partial summation terms is equal to 1, the input-output relationship between the convolutional layer and its replacement layer can be as shown in FIG. 11. Referring to FIG. 11, the replacement layer has a single branch: the input of the depthwise convolution of the replacement layer is the input of the convolutional layer it replaces, the output of the depthwise convolution serves as the input of the pointwise convolution connected in series with it, and the output of the pointwise convolution corresponds to the output of the replaced convolutional layer. When the number of the partial summation terms is equal to 1, the compression of the convolutional neural network model is maximized.
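For the single-branch case of FIG. 11, a replacement layer could be sketched as follows in PyTorch (the framework, the class name, and the stride/padding handling are illustrative assumptions; this application does not prescribe an implementation):

import torch.nn as nn

class SingleBranchReplacement(nn.Module):
    """A depthwise convolution followed in series by a pointwise convolution,
    standing in for a full convolutional layer with C input channels, N output
    channels, and a KxK kernel."""

    def __init__(self, c_in, n_out, k, stride=1, padding=0):
        super().__init__()
        # Depthwise: C input and C output channels, KxK kernel, one filter per channel.
        self.depthwise = nn.Conv2d(c_in, c_in, k, stride=stride,
                                   padding=padding, groups=c_in, bias=False)
        # Pointwise: 1x1 kernel mapping the C channels to N output channels;
        # a bias recombined from the discarded terms could be folded in here.
        self.pointwise = nn.Conv2d(c_in, n_out, kernel_size=1, bias=True)

    def forward(self, x):
        # The depthwise conv takes the replaced layer's input; the pointwise
        # conv's output corresponds to the replaced layer's output.
        return self.pointwise(self.depthwise(x))

Such a branch holds roughly C·K² + C·N weight parameters instead of the N·C·K² of the replaced layer, which is why keeping a single summation term compresses the model the most.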
On the basis of the foregoing method embodiments, optionally, the method may further include the following step: retraining the compressed model using the original training data of the convolutional neural network model. The original training data refers to the training data used to obtain the trained convolutional neural network model by training the initial convolutional neural network model. Since the structure of the compressed model differs from that of the trained convolutional neural network model, retraining the compressed model with the original data of the trained convolutional neural network model (hereinafter referred to as the original convolutional neural network model) enables the compressed model to learn input-output characteristics that the original convolutional neural network model did not learn, so that the expressive ability of the retrained compressed model can surpass that of the original convolutional neural network model, which is beneficial to improving model performance.
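A minimal retraining sketch follows, assuming a PyTorch classification model and data loader; the optimizer, loss, and hyperparameters are assumptions rather than anything prescribed by this application:

import torch
import torch.nn as nn

def retrain(compressed_model, original_train_loader, epochs=5, lr=1e-4):
    """Retrain the compressed model on the original training data so it can
    recover, and possibly exceed, the accuracy of the original model."""
    optimizer = torch.optim.SGD(compressed_model.parameters(), lr=lr, momentum=0.9)
    criterion = nn.CrossEntropyLoss()  # assumes a classification task
    compressed_model.train()
    for _ in range(epochs):
        for inputs, labels in original_train_loader:
            optimizer.zero_grad()
            loss = criterion(compressed_model(inputs), labels)
            loss.backward()
            optimizer.step()
    return compressed_model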
FIG. 12 is a schematic structural diagram of a neural network model deployment apparatus provided by an embodiment of this application. As shown in FIG. 12, the apparatus 1200 may include a processor 1201 and a memory 1202.
The memory 1202 is configured to store program code;
The processor 1201 calls the program code, and when the program code is executed, is configured to perform the following operations:
obtaining a trained convolutional neural network model;
performing matrix decomposition on the weight parameters of the convolutional layer in the convolutional neural network model to obtain a matrix decomposition result of the convolutional layer;
adjusting the structure of the convolutional neural network model according to the matrix decomposition result to compress the convolutional neural network model to obtain a compressed model of the convolutional neural network model;
deploying the compressed model.
The neural network model deployment apparatus provided in this embodiment can be used to implement the technical solutions of the foregoing method embodiments. Its implementation principles and technical effects are similar to those of the method embodiments and will not be repeated here.
In addition, an embodiment of this application further provides a mobile platform, including a memory and a processor, where the memory stores a convolutional neural network model deployed according to the method described in the foregoing method embodiments;
when called and loaded by the processor, the convolutional neural network model is used to process sensor data obtained by the mobile platform.
Exemplarily, the sensor data includes visual sensor data.
Exemplarily, the mobile platform includes an unmanned aerial vehicle.
An embodiment of this application further provides a pan-tilt device, including a memory and a processor, where the memory stores a convolutional neural network model deployed according to the method described in the foregoing method embodiments;
when called and loaded by the processor, the convolutional neural network model is used to process sensor data obtained by the pan-tilt device.
Exemplarily, the sensor data includes visual sensor data.
Exemplarily, the pan-tilt device is a handheld pan-tilt device.
An embodiment of this application further provides a mobile terminal, including a memory and a processor, where the memory stores a convolutional neural network model deployed according to the method described in the foregoing method embodiments;
when called and loaded by the processor, the convolutional neural network model is used to process sensor data obtained by the mobile terminal.
A person of ordinary skill in the art can understand that all or part of the steps of the foregoing method embodiments can be implemented by hardware related to program instructions. The foregoing program may be stored in a computer-readable storage medium. When the program is executed, the steps of the foregoing method embodiments are performed; and the foregoing storage medium includes various media that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that it is still possible to modify the technical solutions described in the foregoing embodiments or to make equivalent replacements to some or all of the technical features therein, and that these modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of this application.

Claims (41)

  1. A neural network model deployment method, comprising:
    obtaining a trained convolutional neural network model;
    performing matrix decomposition on weight parameters of a convolutional layer in the convolutional neural network model to obtain a matrix decomposition result of the convolutional layer;
    adjusting a structure of the convolutional neural network model according to the matrix decomposition result to compress the convolutional neural network model to obtain a compressed model of the convolutional neural network model;
    deploying the compressed model.
  2. The method according to claim 1, wherein the adjusting the structure of the convolutional neural network model according to the matrix decomposition result comprises:
    determining, according to the matrix decomposition result, a replacement layer for replacing the convolutional layer, where the number of weight parameters of the replacement layer is less than the number of weight parameters of the convolutional layer;
    replacing the convolutional layer in the convolutional neural network model with the replacement layer.
  3. The method according to claim 2, wherein the matrix decomposition result includes a plurality of summation terms;
    the determining, according to the matrix decomposition result, a replacement layer for replacing the convolutional layer comprises:
    determining, according to partial summation terms among the plurality of summation terms, a replacement layer for replacing the convolutional layer.
  4. The method according to claim 3, further comprising:
    determining a bias parameter of the replacement layer according to summation terms other than the partial summation terms among the plurality of summation terms.
  5. The method according to any one of claims 2-4, wherein the performing matrix decomposition on the weight parameters of the convolutional layer in the convolutional neural network model comprises:
    grouping the weight parameters of the convolutional layer according to input channels of the convolutional layer in the convolutional neural network model, and performing matrix decomposition on each group of weight parameters as a two-dimensional matrix to obtain a first matrix decomposition result of each group of weight parameters; and/or,
    grouping the weight parameters of the convolutional layer according to output channels of the convolutional layer in the convolutional neural network model, and performing matrix decomposition on each group of weight parameters as a two-dimensional matrix to obtain a second matrix decomposition result of each group of weight parameters.
  6. The method according to claim 5, wherein the first matrix decomposition result of each group of weight parameters includes a plurality of first summation terms, each first summation term corresponding to an energy used to characterize its importance;
    the determining, according to the matrix decomposition result, a replacement layer for replacing the convolutional layer comprises:
    sorting the plurality of first summation terms of each group of weight parameters in descending order of energy, and selecting top-ranked partial summation terms as first target summation terms;
    determining, according to the first target summation terms of each group of weight parameters, a replacement layer for replacing the convolutional layer.
  7. The method according to claim 5, wherein the second matrix decomposition result of each group of weight parameters includes a plurality of second summation terms, each second summation term corresponding to an energy used to characterize its importance;
    the determining, according to the matrix decomposition result, a replacement layer for replacing the convolutional layer comprises:
    sorting the plurality of second summation terms of each group of weight parameters in descending order of energy, and selecting top-ranked partial summation terms as second target summation terms;
    determining, according to the second target summation terms of each group of weight parameters, a replacement layer for replacing the convolutional layer.
  8. The method according to claim 5, wherein the first matrix decomposition result of each group of weight parameters includes a plurality of first summation terms, each first summation term corresponding to an energy used to characterize its importance; and the second matrix decomposition result of each group of weight parameters includes a plurality of second summation terms, each second summation term corresponding to an energy used to characterize its importance;
    the determining, according to the matrix decomposition result, a replacement layer for replacing the convolutional layer comprises:
    sorting the plurality of first summation terms of each group of weight parameters in descending order of energy, and selecting top-ranked partial summation terms as first target summation terms;
    sorting the plurality of second summation terms of each group of weight parameters in descending order of energy, and selecting top-ranked partial summation terms as second target summation terms;
    based on a target strategy, selecting the first target summation terms or the second target summation terms as specific target summation terms, and determining, according to the specific target summation terms, a replacement layer for replacing the convolutional layer.
  9. The method according to claim 8, wherein the target strategy includes a minimum-number-of-summation-terms strategy or a maximum-energy strategy.
  10. The method according to claim 6 or 8, wherein the selecting top-ranked partial summation terms as first target summation terms comprises:
    selecting, as the first target summation terms, top-ranked partial summation terms whose energy sum is greater than or equal to an energy threshold.
  11. The method according to claim 6 or 8, wherein the selecting top-ranked partial summation terms as first target summation terms comprises:
    selecting, as the first target summation terms, top-ranked partial summation terms whose number is less than or equal to a number threshold.
  12. The method according to claim 7 or 8, wherein the selecting top-ranked partial summation terms as second target summation terms comprises:
    selecting, as the second target summation terms, top-ranked partial summation terms whose energy sum is greater than or equal to an energy threshold.
  13. The method according to claim 7 or 8, wherein the selecting top-ranked partial summation terms as second target summation terms comprises:
    selecting, as the second target summation terms, top-ranked partial summation terms whose number is less than or equal to a number threshold.
  14. The method according to claim 6 or 8, wherein the determining, according to the first target summation terms of each group of weight parameters, a replacement layer for replacing the convolutional layer comprises:
    determining the replacement layer for replacing the convolutional layer according to a strategy in which first target summation terms having the same ranking position across different groups of weight parameters correspond to a same branch, and different branches are connected in parallel.
  15. The method according to claim 14, wherein the replacement layer includes a summation layer for accumulating outputs of the different branches.
  16. The method according to claim 14, wherein one branch includes a first convolutional layer and a second convolutional layer connected in series.
  17. The method according to claim 16, wherein the first convolutional layer is used to perform a pointwise convolution operation on the input of the replaced convolutional layer.
  18. The method according to claim 16, wherein the second convolutional layer is used to perform a depthwise convolution operation on the output of the first convolutional layer.
  19. The method according to claim 17, wherein the number of input channels of the first convolutional layer is C, C being equal to the number of input channels of the replaced convolutional layer; the number of output channels of the first convolutional layer is N, N being equal to the number of output channels of the replaced convolutional layer; and the convolution kernel size of the first convolutional layer is 1×1.
  20. The method according to claim 18, wherein the number of input channels and the number of output channels of the second convolutional layer are both N, N being equal to the number of output channels of the replaced convolutional layer; and the convolution kernel size of the second convolutional layer is K×K, K×K being equal to the convolution kernel size of the replaced convolutional layer.
  21. The method according to claim 7 or 8, wherein the determining, according to the second target summation terms of each group of weight parameters, a replacement layer for replacing the convolutional layer comprises:
    determining the replacement layer for replacing the convolutional layer according to a strategy in which second target summation terms having the same ranking position across different groups of weight parameters correspond to a same branch, and different branches are connected in parallel.
  22. The method according to claim 21, wherein the replacement layer includes a summation layer for accumulating outputs of the different branches.
  23. The method according to claim 21, wherein one branch includes a third convolutional layer and a fourth convolutional layer connected in series.
  24. The method according to claim 23, wherein the third convolutional layer is used to perform a depthwise convolution operation on the input of the replaced convolutional layer.
  25. The method according to claim 23, wherein the fourth convolutional layer is used to perform a pointwise convolution operation on the output of the third convolutional layer.
  26. The method according to claim 24, wherein the number of input channels and the number of output channels of the third convolutional layer are both C, C being equal to the number of input channels of the replaced convolutional layer; and the convolution kernel size of the third convolutional layer is K×K, K×K being equal to the convolution kernel size of the replaced convolutional layer.
  27. The method according to claim 25, wherein the number of input channels of the fourth convolutional layer is C, C being equal to the number of input channels of the replaced convolutional layer; the number of output channels of the fourth convolutional layer is N, N being equal to the number of output channels of the replaced convolutional layer; and the convolution kernel size of the fourth convolutional layer is 1×1.
  28. The method according to any one of claims 3, 4, and 6-27, wherein the number of the partial summation terms is greater than or equal to 1.
  29. The method according to claim 4, wherein the determining the bias parameter of the replacement layer according to summation terms other than the partial summation terms among the plurality of summation terms comprises:
    convolving the summation result of the other summation terms with the mean of the normal distribution of each input channel in the replacement layer, respectively, to obtain a convolution result of each input channel;
    merging the convolution result of each input channel into the bias parameter of that channel to obtain the bias parameter of the replacement layer.
  30. The method according to claim 1, wherein the performing matrix decomposition on the weight parameters of the convolutional layer in the convolutional neural network model to obtain a matrix decomposition result of the convolutional layer comprises:
    performing singular value decomposition on the weight parameters of the convolutional layer in the convolutional neural network model to obtain the matrix decomposition result of the convolutional layer.
  31. The method according to claim 1, further comprising:
    retraining the compressed model using the original training data of the convolutional neural network model.
  32. A neural network model deployment apparatus, comprising a memory and a processor;
    the memory is configured to store program code;
    the processor calls the program code, and when the program code is executed, is configured to perform the following operations:
    obtaining a trained convolutional neural network model;
    performing matrix decomposition on weight parameters of a convolutional layer in the convolutional neural network model to obtain a matrix decomposition result of the convolutional layer;
    adjusting a structure of the convolutional neural network model according to the matrix decomposition result to compress the convolutional neural network model to obtain a compressed model of the convolutional neural network model;
    deploying the compressed model.
  33. A computer-readable storage medium storing a computer program, the computer program comprising at least one piece of code executable by a computer to control the computer to perform the method according to any one of claims 1-31.
  34. A computer program which, when executed by a computer, implements the method according to any one of claims 1-31.
  35. A mobile platform, comprising a memory and a processor, wherein the memory stores a convolutional neural network model deployed according to the method of any one of claims 1-31;
    when called and loaded by the processor, the convolutional neural network model is used to process sensor data obtained by the mobile platform.
  36. The mobile platform according to claim 35, wherein the sensor data includes visual sensor data.
  37. The mobile platform according to claim 35, wherein the mobile platform includes an unmanned aerial vehicle.
  38. A pan-tilt device, comprising a memory and a processor, wherein the memory stores a convolutional neural network model deployed according to the method of any one of claims 1-31;
    when called and loaded by the processor, the convolutional neural network model is used to process sensor data obtained by the pan-tilt device.
  39. The pan-tilt device according to claim 38, wherein the sensor data includes visual sensor data.
  40. The pan-tilt device according to claim 38, wherein the pan-tilt device is a handheld pan-tilt device.
  41. A mobile terminal, comprising a memory and a processor, wherein the memory stores a convolutional neural network model deployed according to the method of any one of claims 1-31;
    when called and loaded by the processor, the convolutional neural network model is used to process sensor data obtained by the mobile terminal.
PCT/CN2019/118043 2019-11-13 2019-11-13 Neural network model deployment method and apparatus, and device WO2021092796A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2019/118043 WO2021092796A1 (en) 2019-11-13 2019-11-13 Neural network model deployment method and apparatus, and device
CN201980039593.3A CN112313674A (en) 2019-11-13 2019-11-13 Neural network model deployment method, device and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/118043 WO2021092796A1 (en) 2019-11-13 2019-11-13 Neural network model deployment method and apparatus, and device

Publications (1)

Publication Number Publication Date
WO2021092796A1 (en)

Family

ID=74336685

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/118043 WO2021092796A1 (en) 2019-11-13 2019-11-13 Neural network model deployment method and apparatus, and device

Country Status (2)

Country Link
CN (1) CN112313674A (en)
WO (1) WO2021092796A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112836804B (en) * 2021-02-08 2024-05-10 北京迈格威科技有限公司 Image processing method, device, electronic equipment and storage medium
CN114186697B (en) * 2021-12-10 2023-03-14 北京百度网讯科技有限公司 Method and device for generating and applying deep learning model based on deep learning framework

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106127297A (en) * 2016-06-02 2016-11-16 中国科学院自动化研究所 The acceleration of degree of depth convolutional neural networks based on resolution of tensor and compression method
CN106326985A (en) * 2016-08-18 2017-01-11 北京旷视科技有限公司 Neural network training method, neural network training device, data processing method and data processing device
CN107507250A (en) * 2017-06-02 2017-12-22 北京工业大学 A kind of complexion tongue color image color correction method based on convolutional neural networks
US10303979B2 (en) * 2016-11-16 2019-05-28 Phenomic Ai Inc. System and method for classifying and segmenting microscopy images with deep multiple instance learning

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113688931A (en) * 2021-09-01 2021-11-23 什维新智医疗科技(上海)有限公司 Ultrasonic image screening method and device based on deep learning
CN113688931B (en) * 2021-09-01 2024-03-29 什维新智医疗科技(上海)有限公司 Deep learning-based ultrasonic image screening method and device
CN114648671A (en) * 2022-02-15 2022-06-21 成都臻识科技发展有限公司 Detection model generation method and device based on deep learning

Also Published As

Publication number Publication date
CN112313674A (en) 2021-02-02

Similar Documents

Publication Publication Date Title
WO2021092796A1 (en) Neural network model deployment method and apparatus, and device
CN109978142B (en) Neural network model compression method and device
CN108809723B (en) Edge server joint task unloading and convolutional neural network layer scheduling method
CN110728361B (en) Deep neural network compression method based on reinforcement learning
CN108304928A (en) Compression method based on the deep neural network for improving cluster
CN109002358A (en) Mobile terminal software adaptive optimization dispatching method based on deeply study
US20220083866A1 (en) Apparatus and a method for neural network compression
CN110298446A (en) The deep neural network compression of embedded system and accelerated method and system
CN112733863B (en) Image feature extraction method, device, equipment and storage medium
CN115952832A (en) Adaptive model quantization method and apparatus, storage medium, and electronic apparatus
CN107748913A (en) A kind of general miniaturization method of deep neural network
CN114154626B (en) Filter pruning method for image classification task
CN116050540A (en) Self-adaptive federal edge learning method based on joint bi-dimensional user scheduling
CN113194031B (en) User clustering method and system combining interference suppression in fog wireless access network
US20210297665A1 (en) Division pattern determining apparatus and learning apparatus and method for controlling same and non-transitory computer-readable storage medium
CN113850365A (en) Method, device, equipment and storage medium for compressing and transplanting convolutional neural network
Zhang et al. Federated multi-task learning with non-stationary heterogeneous data
CN116992941A (en) Convolutional neural network pruning method and device based on feature similarity and feature compensation
CN114924868A (en) Self-adaptive multi-channel distributed deep learning method based on reinforcement learning
CN111225391B (en) Network parameter processing method and equipment
CN113541993A (en) Network evaluation method and device, network index processing method, equipment and medium
TW202044125A (en) Method of training sparse connected neural network
KR102464508B1 (en) Method, system and non-transitory computer-readable recording medium for lightening artificial neural network models
CN117058000B (en) Neural network architecture searching method and device for image super-resolution
CN114697974B (en) Network coverage optimization method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description

121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 19952777; Country of ref document: EP; Kind code of ref document: A1)

NENP Non-entry into the national phase (Ref country code: DE)

122 Ep: pct application non-entry in european phase (Ref document number: 19952777; Country of ref document: EP; Kind code of ref document: A1)