CN111461302A - Data processing method, device and storage medium based on convolutional neural network - Google Patents


Info

Publication number
CN111461302A
Authority
CN
China
Prior art keywords
ratio
weight
weights
sub
scaling
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010237669.8A
Other languages
Chinese (zh)
Other versions
CN111461302B (en)
Inventor
郭晖
张楠赓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Canaan Bright Sight Co Ltd
Original Assignee
Hangzhou Canaan Creative Information Technology Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Canaan Creative Information Technology Ltd filed Critical Hangzhou Canaan Creative Information Technology Ltd
Priority to CN202010237669.8A priority Critical patent/CN111461302B/en
Publication of CN111461302A publication Critical patent/CN111461302A/en
Application granted granted Critical
Publication of CN111461302B publication Critical patent/CN111461302B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00General purpose image data processing
    • G06T1/20Processor architectures; Processor configuration, e.g. pipelining

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a data processing method based on a convolutional neural network, relates to the technical field of neural networks, and can solve the technical problem that data processing precision is affected by hardware limitations. The method comprises the following steps: acquiring the number of output channels, and segmenting the initial weight according to the number of output channels to obtain a plurality of initial sub-weights corresponding to the number of output channels; respectively scaling the plurality of initial sub-weights to obtain a plurality of scaling sub-weights; combining the plurality of scaling sub-weights to obtain a scaling weight; and carrying out global quantization on the scaling weight, and applying the globally quantized scaling weight as a convolution kernel to a convolution layer of a convolutional neural network for data processing.

Description

Data processing method, device and storage medium based on convolutional neural network
Technical Field
The present application relates to the field of neural network technologies, and in particular, to a data processing method and device based on a convolutional neural network, and a storage medium.
Background
Convolutional Neural Network (CNN) refers to a feedforward neural network that includes convolution calculations and has a deep structure, and is one of the representative algorithms of deep learning. The structure of a convolutional neural network typically includes an input layer, hidden layers, and an output layer. The input layer of a convolutional neural network can typically process one-dimensional or multi-dimensional data; the hidden layers usually include convolution layers, pooling layers and fully-connected layers, and perform operations such as convolution on the data output by the input layer; the output layer produces the result output of the convolutional neural network. For example, for an image classification problem, the output layer may be designed to output the center coordinates, size, classification, and the like of an object, or may directly output the classification result corresponding to each pixel in the image.
The data processing performed by a convolutional neural network often demands extremely large input and output bandwidth. At present, to address this problem, the floating-point operations involved in the data processing of the convolutional neural network may be replaced by fixed-point operations, reducing the bandwidth demands of the data processing. In the data processing corresponding to the convolution layers, the weights can be quantized separately per output channel, and the corresponding convolution can then be carried out. However, limited by the capabilities of hardware such as the Kendryte K210 AI chip, the convolution kernels obtained by quantizing the weights separately per output channel share the same offsets, which generally degrades the data processing precision of the convolutional neural network and thereby reduces the accuracy of output results in applications such as image classification.
Disclosure of Invention
The application provides a data processing method, device and storage medium based on a convolutional neural network, which are used for solving the technical problem that data processing precision is affected by hardware limitations.
In order to solve the above problems, the technical solution provided by the present application is as follows:
In a first aspect, an embodiment of the present application provides a data processing method based on a convolutional neural network. The method comprises the following steps: acquiring the number of output channels, and segmenting the initial weight according to the number of output channels to obtain a plurality of initial sub-weights corresponding to the number of output channels; respectively scaling the plurality of initial sub-weights to obtain a plurality of scaling sub-weights; combining the plurality of scaling sub-weights to obtain a scaling weight; and carrying out global quantization on the scaling weight, and applying the globally quantized scaling weight as a convolution kernel to a convolution layer of the convolutional neural network for data processing.
In one implementation, respectively scaling the plurality of initial sub-weights to obtain a plurality of scaling sub-weights may be implemented as: obtaining a scaling factor of each initial sub-weight; and scaling each initial sub-weight according to its corresponding scaling factor to obtain the scaling sub-weight corresponding to each initial sub-weight.
In one implementation, obtaining the scaling factor of each initial sub-weight may be implemented as: acquiring a first maximum value and a first minimum value among the elements of the initial weight; and, for each initial sub-weight, performing the following steps to obtain its scaling factor: acquiring a second maximum value and a second minimum value among the elements of the initial sub-weight; determining the ratio of the first maximum value to the second maximum value as a first ratio, and the ratio of the first minimum value to the second minimum value as a second ratio; and determining the scaling factor as the first ratio or the second ratio according to the type of the first ratio and/or the second ratio.
In one implementation, determining the scaling factor as the first ratio or the second ratio according to the type of the first ratio and/or the second ratio may be implemented as: if at least one negative number exists in the first ratio and the second ratio, determining the scaling factor as the maximum value of the first ratio and the second ratio; and if no negative number exists in the first ratio and the second ratio, determining the scaling factor as the minimum value of the first ratio and the second ratio.
In one implementation, the dimension of the initial sub-weight is the same as the dimension of its corresponding scaled sub-weight, and the dimension of the initial weight is the same as the dimension of the convolution kernel.
In a second aspect, an embodiment of the present application provides a data processing device based on a convolutional neural network. The apparatus comprises:
and the acquisition module is used for acquiring the number of the output channels.
And the processing module is used for segmenting the initial weight according to the number of the output channels acquired by the acquisition module to obtain a plurality of initial sub-weights corresponding to the number of the output channels.
And the processing module is further used for respectively scaling the plurality of initial sub-weights and obtaining a plurality of scaled sub-weights.
And the processing module is further used for combining the plurality of scaling sub-weights to obtain the scaling weight.
And the processing module is also used for carrying out global quantization on the scaling weight so as to apply the scaling weight after global quantization as a convolution kernel to a convolution layer of the convolution neural network for data processing.
In one implementation, the processing module is further configured to obtain a scaling factor for each initial sub-weight; and scaling each initial sub-weight according to the corresponding scaling factor to respectively obtain the scaling sub-weight corresponding to each initial sub-weight.
In one implementation, the acquisition module is further configured to acquire a first maximum value and a first minimum value among the elements of the initial weight. For each initial sub-weight, the following steps are performed to obtain its scaling factor: the acquisition module is further configured to acquire a second maximum value and a second minimum value among the elements of the initial sub-weight; the processing module is further configured to determine the ratio of the first maximum value to the second maximum value as a first ratio, and the ratio of the first minimum value to the second minimum value as a second ratio; and the processing module is further configured to determine the scaling factor as the first ratio or the second ratio according to the type of the first ratio and/or the second ratio.
The processing module is further configured to determine that the scaling factor is the maximum value of the first ratio and the second ratio if at least one negative number exists in the first ratio and the second ratio; and if no negative number exists in the first ratio and the second ratio, determining the scaling factor as the minimum value of the first ratio and the second ratio.
In one implementation, the dimension of the initial sub-weight is the same as the dimension of its corresponding scaled sub-weight, and the dimension of the initial weight is the same as the dimension of the convolution kernel.
In a third aspect, the present application provides a data processing apparatus based on a convolutional neural network, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the method of the first aspect and any one of its various possible implementations when executing the computer program.
In a fourth aspect, the present application provides a computer-readable storage medium. The storage medium stores a computer program which, when executed by a processor, implements the method of the first aspect described above and any of its various possible implementations.
Compared with the prior-art situation in which data processing precision suffers from hardware limitations, the embodiments of the application adopt, in the data processing corresponding to the convolution layer, a process that approximates quantizing the weights separately per output channel and then performing the corresponding convolution. Specifically, the initial weight is segmented according to the output channels, each segmented initial sub-weight is processed, the processed scaling sub-weights are combined, the combined scaling weight is globally quantized, and the globally quantized result is applied as a convolution kernel in the data processing of the convolution layer. The original, limited hardware therefore does not need to be replaced, and the obtained result is close to the result of quantizing the weights separately per output channel before processing. Data processing precision can thus be improved while fixed-point operations are used in place of floating-point operations.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
fig. 1 is a first flowchart of a data processing method based on a convolutional neural network according to an embodiment of the present application;
fig. 2 is a second flowchart of a data processing method based on a convolutional neural network according to an embodiment of the present application;
fig. 3 is a third flowchart of a data processing method based on a convolutional neural network according to an embodiment of the present application;
fig. 4 is a fourth flowchart of a data processing method based on a convolutional neural network according to an embodiment of the present application;
fig. 5 is a first schematic structural diagram of a data processing device based on a convolutional neural network according to an embodiment of the present application;
fig. 6 is a second schematic structural diagram of a data processing device based on a convolutional neural network according to an embodiment of the present application.
Detailed Description
In order to more clearly explain the overall concept of the present application, the following detailed description is given by way of example in conjunction with the accompanying drawings.
The embodiment of the application provides a data processing method based on a convolutional neural network. The method can be applied to scenarios involving convolutional neural networks, such as image classification, speech recognition, document analysis, and the like. By adopting the technical scheme provided by the embodiment of the application, the precision of data processing can be effectively improved.
For example, in an image classification application scenario, with the technical solution provided in the embodiments of the present application, when floating-point operations are replaced by fixed-point operations, the convolution kernel is scaled per output channel, the scaled convolution kernel is globally quantized to obtain the convolution kernel finally used for the convolution operation, and the convolution is then performed with this final kernel, improving the precision of data processing. That is to say, when an image classification model is trained with the technical solution provided in the embodiments of the present application, more representative image features can be extracted and the features are recognized more accurately, so that the trained image classification model yields more accurate classification results in subsequent image processing.
It should be noted that, in practical application, if convolution operations are involved when performing image classification, the corresponding image classification result may likewise be obtained with an implementation similar to the technical solution provided in the embodiments of the present application; the image classification result obtained in this way also gains processing accuracy from the adjusted operation process.
Similarly, in other scenarios involving a convolutional neural network, such as speech recognition and document analysis, effects similar to those described above for the image classification scenario can be achieved with the technical solution provided by the embodiments of the present application; the details are not repeated here, and the foregoing description may be referred to.
The technical solution provided by the embodiment of the present application is further described below with reference to corresponding execution steps of the data processing method based on the convolutional neural network provided by the embodiment of the present application. As shown in fig. 1, the method may include S101 to S104.
S101, obtaining the number of output channels, and segmenting the initial weight according to the number of the output channels to obtain a plurality of initial sub-weights corresponding to the number of the output channels.
In the embodiment of the present application, how many initial sub-weights the initial weight is divided into may be determined according to the number of output channels. Since, in the technical solution provided by the embodiments of the present application, the initial weight is scaled per output channel and globally quantized after scaling, in one implementation the initial weight may be split according to the number of output channels; that is, the number of initial sub-weights after segmentation equals the number of output channels. For example, if the number of output channels is 3, segmenting the initial weight yields 3 initial sub-weights.
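For illustration only, a minimal sketch of this per-channel segmentation, assuming the initial weight is stored as a NumPy array whose first axis is the output-channel axis (the layout and the function name are assumptions, not something the application prescribes):

```python
import numpy as np

def split_initial_weight(w1: np.ndarray) -> list:
    """Split the initial weight into one initial sub-weight per output channel.

    Assumes axis 0 of ``w1`` is the output-channel axis, e.g. a shape of
    (C, C_in, kH, kW); each sub-weight keeps the dimensionality of ``w1``.
    """
    num_output_channels = w1.shape[0]  # C
    # One initial sub-weight per output channel; each piece keeps a leading
    # axis of length 1, so its dimension matches the initial weight's.
    return np.split(w1, num_output_channels, axis=0)
```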
Of course, in the implementation process, if at least some of the segmented initial sub-weights are identical, those initial sub-weights may be scaled in the same manner; that is, a single scaling factor can be determined and shared among them, reducing the resources consumed in computing scaling factors.
And S102, respectively scaling the plurality of initial sub-weights to obtain a plurality of scaled sub-weights.
In scaling the plurality of initial sub-weights, at least some of them may be scaled; that is, all of the initial sub-weights may be scaled, or only some of them, to obtain the corresponding scaling sub-weights. For the case in which only some initial sub-weights are scaled, which are scaled and which are not is not limited in the embodiments of the present application, and may be adjusted according to factors such as the precision required of the data processing.
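A hedged sketch of this step, covering both the all-scaled and the partially-scaled case (the selection predicate ``should_scale`` is an assumption; the application leaves the choice to precision requirements):

```python
def scale_sub_weights(sub_weights, factors, should_scale=None):
    """Scale all, or only some, initial sub-weights (S102).

    ``factors[i]`` is the scaling factor of sub-weight ``i``; sub-weights
    for which ``should_scale`` returns False pass through unchanged.
    """
    if should_scale is None:
        should_scale = lambda i: True  # default: scale every sub-weight
    return [f * w if should_scale(i) else w
            for i, (w, f) in enumerate(zip(sub_weights, factors))]
```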
S103, combining the plurality of scaling sub-weights to obtain the scaling weight.
For the case in which only some of the plurality of initial sub-weights are scaled, the merging in S103 includes merging the unscaled initial sub-weights with the scaling sub-weights. The scaling weight may be combined by performing the inverse of the operation that divided the initial weight into the plurality of initial sub-weights. For the specific implementation of splitting the initial weight and combining to obtain the scaling weight, reference may be made to existing matrix splitting and combining methods, which are not described here again.
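Under the same assumed layout as the splitting sketch above, the merge is simply the inverse concatenation:

```python
import numpy as np

def merge_scaled_sub_weights(sub_weights: list) -> np.ndarray:
    """Combine the scaled (and any unscaled) sub-weights into the scaling weight.

    Concatenating along the output-channel axis used for the split restores
    the dimensions of the initial weight, as S103 requires.
    """
    return np.concatenate(sub_weights, axis=0)
```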
And S104, carrying out global quantization on the scaling weights, and applying the scaling weights subjected to global quantization as convolution kernels to convolution layers of the convolutional neural network for data processing.
In addition, to ensure that a scaling weight matching the initial weight can be obtained from the initial weight, in one implementation of the embodiments of the present application, the initial weight and the scaling weight can be kept in the same form by keeping the forms of the initial sub-weights and the scaling sub-weights consistent. That is, the dimension of an initial sub-weight may be the same as the dimension of its corresponding scaling sub-weight, and the dimension of the initial weight may be the same as the dimension of the convolution kernel.
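The application does not commit to a particular quantization scheme for S104. As one possible sketch, a global quantization (a single scale and zero point for the whole tensor) to 8-bit integers might look as follows, where the bit width and the affine form are assumptions:

```python
import numpy as np

def global_quantize(w4: np.ndarray, num_bits: int = 8):
    """Globally quantize the scaling weight W4 with one scale/zero-point pair.

    Unlike per-output-channel quantization, a single pair of parameters is
    shared by the whole tensor; the uint8 affine scheme is an assumption.
    """
    qmin, qmax = 0, (1 << num_bits) - 1
    lo = min(float(w4.min()), 0.0)  # keep zero exactly representable
    hi = max(float(w4.max()), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # guard the all-zero tensor
    zero_point = int(round(qmin - lo / scale))
    q = np.clip(np.round(w4 / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point
```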
Compared with the prior-art situation in which data processing precision suffers from hardware limitations, the embodiments of the application adopt, in the data processing corresponding to the convolution layer, a process that approximates quantizing the weights separately per output channel and then performing the corresponding convolution. Specifically, the initial weight is segmented according to the output channels, each segmented initial sub-weight is processed, the processed scaling sub-weights are combined, the combined scaling weight is globally quantized, and the globally quantized result is applied as a convolution kernel in the data processing of the convolution layer. The original, limited hardware therefore does not need to be replaced, and the obtained result is close to the result of quantizing the weights separately per output channel before processing. Data processing precision can thus be improved while fixed-point operations are used in place of floating-point operations.
In scaling each initial sub-weight, a corresponding scaling factor can be obtained for each initial sub-weight, and the initial sub-weight is then scaled according to that factor, so that the obtained scaling sub-weight is better targeted. Therefore, on the basis of the implementation shown in fig. 1, the implementation shown in fig. 2 can be realized, in which S102 (respectively scaling the plurality of initial sub-weights to obtain a plurality of scaling sub-weights) is implemented as S201 and S202.
S201, obtaining a scaling factor of each initial sub-weight.
S202, scaling each initial sub-weight according to the corresponding scaling factor to respectively obtain the scaling sub-weight corresponding to each initial sub-weight.
The scaling factor is the parameter used to scale the initial sub-weights. In one implementation of the embodiments of the present application, the way the scaling factor is calculated may be determined by the magnitude relationship between the elements of the initial weight and the elements of the initial sub-weights. Therefore, on the basis of the implementation shown in fig. 2, the implementation shown in fig. 3 can be realized, in which S201 (obtaining the scaling factor of each initial sub-weight) is implemented as S301 followed by repeated executions of S302 to S304; that is, S302 to S304 are performed for each initial sub-weight to obtain its scaling factor.
S301, acquiring a first maximum value and a first minimum value in each element of the initial weight.
S302, acquiring a second maximum value and a second minimum value in each element of the initial sub-weight.
S303, determining a ratio of the first maximum value to the second maximum value as a first ratio, and determining a ratio of the first minimum value to the second minimum value as a second ratio.
S304, determining the scaling factor to be the first ratio or the second ratio according to the type of the first ratio and/or the second ratio.
On the basis of the implementation shown in fig. 3, the implementation shown in fig. 4 can be realized, in which S304 (determining the scaling factor as the first ratio or the second ratio according to the type of the first ratio and/or the second ratio) is implemented as S401 or S402.
S401, if at least one negative number exists in the first ratio and the second ratio, determining that the scaling factor is the maximum value of the first ratio and the second ratio.
S402, if no negative number exists in the first ratio and the second ratio, determining the scaling factor as the minimum value of the first ratio and the second ratio.
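Putting S301 through S402 together, a minimal sketch of the scaling-factor computation, assuming plain min/max statistics and nonzero extreme values (the function name is an assumption):

```python
import numpy as np

def scaling_factor(w1: np.ndarray, w2_i: np.ndarray) -> float:
    """Compute the scaling factor of one initial sub-weight (S301 to S402).

    first_ratio  = first maximum / second maximum = max(W1) / max(W2_i)
    second_ratio = first minimum / second minimum = min(W1) / min(W2_i)
    If either ratio is negative, take the larger one; otherwise the smaller.
    """
    first_ratio = float(w1.max()) / float(w2_i.max())
    second_ratio = float(w1.min()) / float(w2_i.min())
    if first_ratio < 0 or second_ratio < 0:
        return max(first_ratio, second_ratio)
    return min(first_ratio, second_ratio)
```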
The technical solutions provided in the embodiments of the present application are described below with reference to specific examples.
Obtain the number C of output channels of the initial weight W1 and divide the initial weight W1 into C initial sub-weights; then scale each initial sub-weight W2_i according to its corresponding scaling factor to obtain the corresponding scaling sub-weight W3_i, and combine the scaling sub-weights to obtain the scaling weight W4 for data processing. Here i is an integer with 1 ≤ i ≤ C, and C is an integer greater than 1.
Take deriving the scaling sub-weight W3_1 from the initial sub-weight W2_1 as an example. The minimum value L (i.e., the first minimum value) and the maximum value U (i.e., the first maximum value) of the initial weight are determined in advance, i.e., L = min(W1) and U = max(W1). The minimum value L_1 (i.e., the second minimum value) and the maximum value U_1 (i.e., the second maximum value) of the initial sub-weight W2_1 to be scaled are then obtained, i.e., L_1 = min(W2_1) and U_1 = max(W2_1). The weight scaling parameters S_1 (i.e., the first ratio) and S_2 (i.e., the second ratio) are computed as S_1 = U/U_1 and S_2 = L/L_1. If at least one of S_1 and S_2 is negative, the weight scaling factor S is determined as the larger of the two weight scaling parameters, i.e., S = max(S_1, S_2); otherwise, the weight scaling factor S is determined as the smaller of the two, i.e., S = min(S_1, S_2). After the weight scaling factor is obtained, the scaling sub-weight is W3_1 = S · W2_1. In the same manner, W3_1, W3_2, …, W3_C are obtained, and the scaling sub-weights are combined to obtain the scaling weight W4. In this process, the multiplication factor M_1 corresponding to the initial sub-weight W2_1 is the reciprocal of the weight scaling factor S, i.e., M_1 = 1/S.
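Tying the example together, a hedged end-to-end sketch that reuses the helper functions sketched earlier in this description (the random weight and its shape are purely illustrative):

```python
import numpy as np

# Illustrative initial weight W1 with C = 3 output channels on axis 0.
w1 = np.random.randn(3, 16, 3, 3).astype(np.float32)

sub_weights = split_initial_weight(w1)          # W2_1 ... W2_C
scaled, mult_factors = [], []
for w2_i in sub_weights:
    s = scaling_factor(w1, w2_i)                # weight scaling factor S
    scaled.append(s * w2_i)                     # W3_i = S * W2_i
    mult_factors.append(1.0 / s)                # M_i = 1/S, compensated later

w4 = merge_scaled_sub_weights(scaled)           # scaling weight W4
q, scale, zero_point = global_quantize(w4)      # globally quantized kernel
```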
It should be noted that, in actual data processing, the min and max functions here are not limited to the conventional computation of minimum and maximum values; the parameters may also be obtained with algorithms such as KLD (Kullback-Leibler divergence calibration), and the specific implementation is not limited here.
The embodiment of the application provides a data processing device based on a convolutional neural network. As shown in fig. 5, the convolutional neural network-based data processing device 50 may include:
An obtaining module 51, configured to obtain the number of output channels.
The processing module 52 is configured to segment the initial weights according to the number of the output channels acquired by the acquiring module 51, so as to obtain a plurality of initial sub-weights corresponding to the number of the output channels.
The processing module 52 is further configured to scale the plurality of initial sub-weights respectively to obtain a plurality of scaling sub-weights.
The processing module 52 is further configured to combine the multiple scaling sub-weights to obtain the scaling weight.
The processing module 52 is further configured to perform global quantization on the scaling weights, so as to apply the scaling weights after global quantization as convolution kernels to convolution layers of the convolutional neural network, so as to perform data processing.
In one implementation, the processing module 52 is further configured to obtain a scaling factor for each initial sub-weight; and scaling each initial sub-weight according to the corresponding scaling factor to respectively obtain the scaling sub-weight corresponding to each initial sub-weight.
In one implementation, the obtaining module 51 is further configured to obtain a first maximum value and a first minimum value of each element of the initial weight.
For each initial sub-weight, performing the following steps to obtain a scaling factor for each initial sub-weight:
the obtaining module 51 is further configured to obtain a second maximum value and a second minimum value in each element of the initial sub-weights.
The processing module 52 is further configured to determine a ratio of the first maximum value to the second maximum value as a first ratio, and determine a ratio of the first minimum value to the second minimum value as a second ratio. The processing module 52 is further configured to determine the scaling factor as the first ratio or the second ratio according to the type of the first ratio and/or the second ratio.
The processing module 52 is further configured to determine that the scaling factor is the maximum value of the first ratio and the second ratio if at least one negative number exists in the first ratio and the second ratio; and if no negative number exists in the first ratio and the second ratio, determining the scaling factor as the minimum value of the first ratio and the second ratio.
In one implementation, the dimension of the initial sub-weight is the same as the dimension of its corresponding scaled sub-weight, and the dimension of the initial weight is the same as the dimension of the convolution kernel.
In one implementation, the convolutional neural network-based data processing device 50 may further include at least one of a communication module 53, a storage module 54, and a display module 55.
The communication module 53 may be configured to implement data interaction among the above modules, and/or to support data interaction between the convolutional neural network-based data processing device 50 and devices such as servers and other processing devices; the storage module 54 may be used to store the content required by the above modules to implement their corresponding functions; and the display module 55 may be configured to display the progress of data processing, the operating status of the convolutional neural network-based data processing device 50, and the like. The content, format, and the like stored in the storage module are not limited in the embodiments of the present application.
In the embodiment of the present application, the obtaining module 51 and the communication module 53 may be implemented as a communication interface, the processing module 52 may be implemented as a processor and/or a controller, the storage module 54 may be implemented as a memory, and the display module 55 may be implemented as a display.
Fig. 6 is a schematic structural diagram of another data processing device based on a convolutional neural network according to an embodiment of the present application. The convolutional neural network-based data processing device 60 may include a communication interface 61, a processor 62. In one implementation, the convolutional neural network-based data processing device 60 may also include one or more of a memory 63 and a display 64. The communication interface 61, the processor 62, the memory 63, and the display 64 may communicate with each other through the bus 65. The functions implemented by the above components may refer to the description of the functions of the modules, which is not repeated herein.
It should be noted that, referring to fig. 5 and fig. 6, the data processing device based on a convolutional neural network provided in the embodiments of the present application may include more or fewer modules and components than those shown in the figures, which is not limited here.
The application provides a data processing device based on a convolutional neural network, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor executes the computer program to realize the method of any one of the above-mentioned various possible implementation modes.
The present application provides a computer-readable storage medium. The storage medium stores a computer program which, when executed by a processor, implements the method of any of the various possible implementations described above.
The embodiments in this specification are described in a progressive manner; the same or similar parts of the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, since the device embodiment is substantially similar to the method embodiment, its description is brief, and the relevant points can be found in the corresponding parts of the description of the method embodiment.
Those of skill would further appreciate that the various illustrative components and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied in hardware, a software module executed by a processor, or a combination of the two. A software module may reside in random access memory (RAM), flash memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The above are merely examples of the present application and are not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (10)

1. A method of data processing based on a convolutional neural network, the method comprising:
acquiring the number of output channels, and segmenting the initial weights according to the number of the output channels to obtain a plurality of initial sub-weights corresponding to the number of the output channels;
respectively scaling the plurality of initial sub-weights to obtain a plurality of scaling sub-weights;
combining the plurality of scaling sub-weights to obtain scaling weights;
and carrying out global quantization on the scaling weights, and applying the scaling weights subjected to global quantization as convolution kernels to convolution layers of a convolutional neural network for data processing.
2. The method of claim 1, wherein respectively scaling the plurality of initial sub-weights to obtain a plurality of scaling sub-weights comprises:
obtaining a scaling factor of each initial sub-weight;
and scaling each initial sub-weight according to the corresponding scaling factor to respectively obtain the scaling sub-weight corresponding to each initial sub-weight.
3. The method of claim 2, wherein obtaining the scaling factor for each initial sub-weight comprises:
acquiring a first maximum value and a first minimum value among the elements of the initial weight;
for each initial sub-weight, performing the following steps to obtain a scaling factor for each initial sub-weight:
acquiring a second maximum value and a second minimum value among the elements of the initial sub-weight;
determining a ratio of the first maximum value to the second maximum value as a first ratio, and determining a ratio of the first minimum value to the second minimum value as a second ratio;
determining the scaling factor as the first ratio or the second ratio according to the type of the first ratio and/or the second ratio.
4. The method of claim 3, wherein determining the scaling factor as the first ratio or the second ratio according to the type of the first ratio and/or the second ratio comprises:
if at least one negative number exists in the first ratio and the second ratio, determining the scaling factor as the maximum value of the first ratio and the second ratio;
if no negative number exists in the first ratio and the second ratio, determining that the scaling factor is the minimum value of the first ratio and the second ratio.
5. The method of claim 1, wherein the dimension of the initial sub-weight is the same as the dimension of its corresponding scaled sub-weight, and wherein the dimension of the initial weight is the same as the dimension of the convolution kernel.
6. A convolutional neural network-based data processing apparatus, the apparatus comprising:
the acquisition module is used for acquiring the number of the output channels;
the processing module is used for segmenting the initial weight according to the number of the output channels acquired by the acquisition module to obtain a plurality of initial sub-weights corresponding to the number of the output channels;
the processing module is further configured to scale the plurality of initial sub-weights respectively, and obtain a plurality of scaled sub-weights;
the processing module is further configured to combine the scaling sub-weights to obtain a scaling weight;
the processing module is further configured to perform global quantization on the scaling weights, so that the scaling weights after global quantization are applied to a convolutional layer of a convolutional neural network as a convolutional kernel to perform data processing.
7. The apparatus of claim 6, wherein the processing module is further configured to obtain a scaling factor for each initial sub-weight;
and scaling each initial sub-weight according to the corresponding scaling factor to respectively obtain the scaling sub-weight corresponding to each initial sub-weight.
8. The apparatus according to claim 7, wherein the acquisition module is further configured to acquire a first maximum value and a first minimum value among the elements of the initial weight;
for each initial sub-weight, performing the following steps to obtain a scaling factor for each initial sub-weight:
the acquisition module is further configured to acquire a second maximum value and a second minimum value among the elements of the initial sub-weight;
the processing module is further configured to determine a ratio of the first maximum value to the second maximum value as a first ratio, and determine a ratio of the first minimum value to the second minimum value as a second ratio;
the processing module is further configured to determine that the scaling factor is the first ratio or the second ratio according to a type of the first ratio and/or the second ratio;
the processing module is further configured to determine that the scaling factor is a maximum value of the first ratio and the second ratio if at least one negative number exists in the first ratio and the second ratio;
if no negative number exists in the first ratio and the second ratio, determining that the scaling factor is the minimum value of the first ratio and the second ratio.
9. The apparatus of claim 6, wherein the dimension of an initial sub-weight is the same as the dimension of its corresponding scaled sub-weight, and wherein the dimension of the initial weight is the same as the dimension of the convolution kernel.
10. A computer-readable storage medium, characterized in that the storage medium stores a computer program which, when executed by a processor, implements the method of any one of claims 1 to 5.
CN202010237669.8A 2020-03-30 2020-03-30 Data processing method, device and storage medium based on convolutional neural network Active CN111461302B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010237669.8A CN111461302B (en) 2020-03-30 2020-03-30 Data processing method, device and storage medium based on convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010237669.8A CN111461302B (en) 2020-03-30 2020-03-30 Data processing method, device and storage medium based on convolutional neural network

Publications (2)

Publication Number Publication Date
CN111461302A true CN111461302A (en) 2020-07-28
CN111461302B CN111461302B (en) 2024-04-19

Family

ID=71681615

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010237669.8A Active CN111461302B (en) 2020-03-30 2020-03-30 Data processing method, device and storage medium based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN111461302B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112381205A (en) * 2020-09-29 2021-02-19 北京清微智能科技有限公司 Neural network low bit quantization method
CN113780523A (en) * 2021-08-27 2021-12-10 深圳云天励飞技术股份有限公司 Image processing method, image processing device, terminal equipment and storage medium
WO2023125785A1 (en) * 2021-12-29 2023-07-06 杭州海康威视数字技术股份有限公司 Data processing method, network training method, electronic device, and storage medium

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150372805A1 (en) * 2014-06-23 2015-12-24 Qualcomm Incorporated Asynchronous pulse modulation for threshold-based signal coding
CN108053028A (en) * 2017-12-21 2018-05-18 深圳云天励飞技术有限公司 Data fixed point processing method, device, electronic equipment and computer storage media
CN108304919A (en) * 2018-01-29 2018-07-20 百度在线网络技术(北京)有限公司 Method and apparatus for generating convolutional neural networks
US20190012559A1 (en) * 2017-07-06 2019-01-10 Texas Instruments Incorporated Dynamic quantization for deep neural network inference system and method
US20190042935A1 (en) * 2017-12-28 2019-02-07 Intel Corporation Dynamic quantization of neural networks
CN109902803A (en) * 2019-01-31 2019-06-18 东软睿驰汽车技术(沈阳)有限公司 A kind of method and system of neural network parameter quantization
CN110222821A (en) * 2019-05-30 2019-09-10 浙江大学 Convolutional neural networks low-bit width quantization method based on weight distribution
US20190294413A1 (en) * 2018-03-23 2019-09-26 Amazon Technologies, Inc. Accelerated quantized multiply-and-add operations
CN110322008A (en) * 2019-07-10 2019-10-11 杭州嘉楠耘智信息科技有限公司 Residual convolution neural network-based quantization processing method and device
CN110363279A (en) * 2018-03-26 2019-10-22 华为技术有限公司 Image processing method and device based on convolutional neural networks model
KR20190130443A (en) * 2018-05-14 2019-11-22 삼성전자주식회사 Method and apparatus for quantization of neural network
CN110826685A (en) * 2018-08-08 2020-02-21 华为技术有限公司 Method and device for convolution calculation of neural network

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150372805A1 (en) * 2014-06-23 2015-12-24 Qualcomm Incorporated Asynchronous pulse modulation for threshold-based signal coding
US20190012559A1 (en) * 2017-07-06 2019-01-10 Texas Instruments Incorporated Dynamic quantization for deep neural network inference system and method
CN108053028A (en) * 2017-12-21 2018-05-18 深圳云天励飞技术有限公司 Data fixed point processing method, device, electronic equipment and computer storage media
US20190042935A1 (en) * 2017-12-28 2019-02-07 Intel Corporation Dynamic quantization of neural networks
CN108304919A (en) * 2018-01-29 2018-07-20 百度在线网络技术(北京)有限公司 Method and apparatus for generating convolutional neural networks
US20190294413A1 (en) * 2018-03-23 2019-09-26 Amazon Technologies, Inc. Accelerated quantized multiply-and-add operations
CN110363279A (en) * 2018-03-26 2019-10-22 华为技术有限公司 Image processing method and device based on convolutional neural networks model
KR20190130443A (en) * 2018-05-14 2019-11-22 삼성전자주식회사 Method and apparatus for quantization of neural network
CN110826685A (en) * 2018-08-08 2020-02-21 华为技术有限公司 Method and device for convolution calculation of neural network
CN109902803A (en) * 2019-01-31 2019-06-18 东软睿驰汽车技术(沈阳)有限公司 A kind of method and system of neural network parameter quantization
CN110222821A (en) * 2019-05-30 2019-09-10 浙江大学 Convolutional neural networks low-bit width quantization method based on weight distribution
CN110322008A (en) * 2019-07-10 2019-10-11 杭州嘉楠耘智信息科技有限公司 Residual convolution neural network-based quantization processing method and device

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Zhang Yueqin et al., "An Improved BP Neural Network Algorithm and Its Application", Computer Technology and Development, vol. 22, no. 08, 10 August 2012 (2012-08-10), pages 163-166 *
Mou Shuai, "Research on Acceleration and Compression of Deep Neural Networks Based on Bit Quantization", China Masters' Theses Full-text Database (Information Science and Technology), no. 06, 15 June 2018 (2018-06-15), pages 138-1290 *
Xu Dezhi et al., "Vehicle-mounted image super-resolution reconstruction based on weight quantization and information compression", Journal of Computer Applications, vol. 39, no. 12, 30 August 2019 (2019-08-30), pages 3644-3649 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112381205A (en) * 2020-09-29 2021-02-19 北京清微智能科技有限公司 Neural network low bit quantization method
CN113780523A (en) * 2021-08-27 2021-12-10 深圳云天励飞技术股份有限公司 Image processing method, image processing device, terminal equipment and storage medium
CN113780523B (en) * 2021-08-27 2024-03-29 深圳云天励飞技术股份有限公司 Image processing method, device, terminal equipment and storage medium
WO2023125785A1 (en) * 2021-12-29 2023-07-06 杭州海康威视数字技术股份有限公司 Data processing method, network training method, electronic device, and storage medium

Also Published As

Publication number Publication date
CN111461302B (en) 2024-04-19

Similar Documents

Publication Publication Date Title
CN111461302B (en) Data processing method, device and storage medium based on convolutional neural network
CN111031346B (en) Method and device for enhancing video image quality
CN109002889B (en) Adaptive iterative convolution neural network model compression method
CN108012156B (en) Video processing method and control platform
KR20180073118A (en) Convolutional neural network processing method and apparatus
CN111488985B (en) Deep neural network model compression training method, device, equipment and medium
WO2019238029A1 (en) Convolutional neural network system, and method for quantifying convolutional neural network
TW202119293A (en) Method and system of quantizing artificial neural network and arti ficial neural network apparatus
CN108335293B (en) Image quality determination method and device
US20210350205A1 (en) Convolution Processing Method and Apparatus for Convolutional Neural Network, and Storage Medium
US20200389182A1 (en) Data conversion method and apparatus
CN113449854A (en) Method and device for quantifying mixing precision of network model and computer storage medium
WO2016125500A1 (en) Feature transformation device, recognition device, feature transformation method and computer readable recording medium
Pascual et al. Adjustable compression method for still JPEG images
TW202001701A (en) Method for quantizing an image and method for training a neural network
WO2021037174A1 (en) Neural network model training method and apparatus
CN112561050B (en) Neural network model training method and device
CN114078471A (en) Network model processing method, device, equipment and computer readable storage medium
CN112418388A (en) Method and device for realizing deep convolutional neural network processing
CN110909085A (en) Data processing method, device, equipment and storage medium
CN113159297A (en) Neural network compression method and device, computer equipment and storage medium
CN113313253A (en) Neural network compression method, data processing device and computer equipment
CN111339490B (en) Matrix multiplication calculation method and device
CN114662679B (en) Data processing method based on neural network
CN112182382A (en) Data processing method, electronic device, and medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20201211

Address after: Room 206, 2 / F, building C, phase I, Zhongguancun Software Park, No. 8, Dongbei Wangxi Road, Haidian District, Beijing 100094

Applicant after: Canaan Bright Sight Co.,Ltd.

Address before: 310000 Room 1203, 12/F, Building 4, No. 9, Jiuhuan Road, Jianggan District, Hangzhou City, Zhejiang Province

Applicant before: Hangzhou Canaan Creative Information Technology Ltd.

TA01 Transfer of patent application right
GR01 Patent grant
GR01 Patent grant