CN111461302A - Data processing method, device and storage medium based on convolutional neural network - Google Patents


Info

Publication number
CN111461302A
Authority
CN
China
Prior art keywords
ratio
weight
weights
sub
scaling
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010237669.8A
Other languages
Chinese (zh)
Other versions
CN111461302B (en)
Inventor
郭晖
张楠赓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Canaan Bright Sight Co Ltd
Original Assignee
Hangzhou Canaan Creative Information Technology Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Canaan Creative Information Technology Ltd filed Critical Hangzhou Canaan Creative Information Technology Ltd
Priority to CN202010237669.8A priority Critical patent/CN111461302B/en
Publication of CN111461302A publication Critical patent/CN111461302A/en
Application granted granted Critical
Publication of CN111461302B publication Critical patent/CN111461302B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00General purpose image data processing
    • G06T1/20Processor architectures; Processor configuration, e.g. pipelining

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a data processing method based on a convolutional neural network, relates to the technical field of neural networks, and can solve the technical problem that data processing precision is affected by hardware limitations. The method comprises the following steps: acquiring the number of output channels, and segmenting the initial weight according to the number of output channels to obtain a plurality of initial sub-weights corresponding to the number of output channels; respectively scaling the plurality of initial sub-weights to obtain a plurality of scaling sub-weights; combining the plurality of scaling sub-weights to obtain a scaling weight; and carrying out global quantization on the scaling weight, and applying the globally quantized scaling weight as a convolution kernel to a convolution layer of a convolutional neural network for data processing.

Description

Data processing method, device and storage medium based on convolutional neural network
Technical Field
The present application relates to the field of neural network technologies, and in particular, to a data processing method and device based on a convolutional neural network, and a storage medium.
Background
Convolutional Neural Network (CNN) refers to a feedforward neural network that includes convolution calculations and has a deep structure, and is one of the representative algorithms of deep learning. The structure of a convolutional neural network typically includes an input layer, hidden layers, and an output layer. The input layer of a convolutional neural network can typically process one-dimensional or multi-dimensional data; the hidden layers usually include convolution layers, pooling layers and fully-connected layers, and perform operations such as convolution on the data output by the input layer; the output layer produces the result output of the convolutional neural network. For example, for an image classification problem, the output layer may be designed to output the center coordinates, size, classification, and the like of an object, or may directly output the classification result corresponding to each pixel in the image.
The data processing performed by a convolutional neural network often demands extremely large input and output bandwidth. At present, to address this problem, the floating-point operations involved in the data processing of the convolutional neural network may be replaced by fixed-point operations, reducing the bandwidth demands of the data processing. In the data processing corresponding to the convolution layers, the weights can be quantized separately per output channel, and the corresponding convolution can then be carried out. However, limited by the capabilities of hardware such as the Kendryte K210 AI chip, the convolution kernels obtained by quantizing the weights separately per output channel share the same offsets, which generally degrades the data processing precision of the convolutional neural network and thereby reduces the accuracy of output results in applications such as image classification.
Disclosure of Invention
The application provides a data processing method, device and storage medium based on a convolutional neural network, which are used for solving the technical problem that data processing precision is affected by hardware limitations.
In order to solve the above problems, the technical solution provided by the present application is as follows:
In a first aspect, an embodiment of the present application provides a data processing method based on a convolutional neural network. The method comprises the following steps: acquiring the number of output channels, and segmenting the initial weight according to the number of output channels to obtain a plurality of initial sub-weights corresponding to the number of output channels; respectively scaling the plurality of initial sub-weights to obtain a plurality of scaling sub-weights; combining the plurality of scaling sub-weights to obtain a scaling weight; and carrying out global quantization on the scaling weight, and applying the globally quantized scaling weight as a convolution kernel to a convolution layer of the convolutional neural network for data processing.
In one implementation, respectively scaling the plurality of initial sub-weights to obtain a plurality of scaling sub-weights may be implemented as: obtaining a scaling factor of each initial sub-weight; and scaling each initial sub-weight according to its corresponding scaling factor to obtain the scaling sub-weight corresponding to each initial sub-weight.
In one implementation, obtaining the scaling factor of each initial sub-weight may be implemented as: acquiring a first maximum value and a first minimum value among the elements of the initial weight; and, for each initial sub-weight, performing the following steps to obtain its scaling factor: acquiring a second maximum value and a second minimum value among the elements of the initial sub-weight; determining the ratio of the first maximum value to the second maximum value as a first ratio, and the ratio of the first minimum value to the second minimum value as a second ratio; and determining the scaling factor as the first ratio or the second ratio according to the type of the first ratio and/or the second ratio.
In one implementation, determining the scaling factor as the first ratio or the second ratio according to the type of the first ratio and/or the second ratio may be implemented as: if at least one negative number exists in the first ratio and the second ratio, determining the scaling factor as the maximum value of the first ratio and the second ratio; and if no negative number exists in the first ratio and the second ratio, determining the scaling factor as the minimum value of the first ratio and the second ratio.
In one implementation, the dimension of the initial sub-weight is the same as the dimension of its corresponding scaled sub-weight, and the dimension of the initial weight is the same as the dimension of the convolution kernel.
In a second aspect, an embodiment of the present application provides a data processing device based on a convolutional neural network. The apparatus comprises:
and the acquisition module is used for acquiring the number of the output channels.
And the processing module is used for segmenting the initial weight according to the number of the output channels acquired by the acquisition module to obtain a plurality of initial sub-weights corresponding to the number of the output channels.
And the processing module is further used for respectively scaling the plurality of initial sub-weights and obtaining a plurality of scaled sub-weights.
And the processing module is further used for combining the plurality of scaling sub-weights to obtain the scaling weight.
And the processing module is also used for carrying out global quantization on the scaling weight so as to apply the scaling weight after global quantization as a convolution kernel to a convolution layer of the convolution neural network for data processing.
In one implementation, the processing module is further configured to obtain a scaling factor for each initial sub-weight; and scaling each initial sub-weight according to the corresponding scaling factor to respectively obtain the scaling sub-weight corresponding to each initial sub-weight.
In one implementation, the acquisition module is further configured to acquire a first maximum value and a first minimum value among the elements of the initial weight. For each initial sub-weight, the following steps are performed to obtain its scaling factor: the acquisition module is further configured to acquire a second maximum value and a second minimum value among the elements of the initial sub-weight; the processing module is further configured to determine the ratio of the first maximum value to the second maximum value as a first ratio, and the ratio of the first minimum value to the second minimum value as a second ratio; and the processing module is further configured to determine the scaling factor as the first ratio or the second ratio according to the type of the first ratio and/or the second ratio.
The processing module is further configured to determine that the scaling factor is the maximum value of the first ratio and the second ratio if at least one negative number exists in the first ratio and the second ratio; and if no negative number exists in the first ratio and the second ratio, determining the scaling factor as the minimum value of the first ratio and the second ratio.
In one implementation, the dimension of the initial sub-weight is the same as the dimension of its corresponding scaled sub-weight, and the dimension of the initial weight is the same as the dimension of the convolution kernel.
In a third aspect, the present application provides a data processing apparatus based on a convolutional neural network, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the method of the first aspect and any one of its various possible implementations when executing the computer program.
In a fourth aspect, the present application provides a computer-readable storage medium. The storage medium stores a computer program which, when executed by a processor, implements the method of the first aspect described above and any of its various possible implementations.
Compared with the prior-art situation in which data processing precision suffers from hardware limitations, the embodiments of the application adopt, in the data processing corresponding to the convolution layer, a process that approximates quantizing the weights separately per output channel and then performing the corresponding convolution. Specifically, the initial weight is segmented according to the output channels, each segmented initial sub-weight is processed, the processed scaling sub-weights are combined, the combined scaling weight is globally quantized, and the globally quantized result is applied as a convolution kernel in the data processing of the convolution layer. The original, limited hardware therefore does not need to be replaced, and the obtained result is close to the result of quantizing the weights separately per output channel before processing. Data processing precision can thus be improved while fixed-point operations are used in place of floating-point operations.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
fig. 1 is a first flowchart of a data processing method based on a convolutional neural network according to an embodiment of the present application;
fig. 2 is a second flowchart of a data processing method based on a convolutional neural network according to an embodiment of the present application;
fig. 3 is a third flowchart of a data processing method based on a convolutional neural network according to an embodiment of the present application;
fig. 4 is a fourth flowchart of a data processing method based on a convolutional neural network according to an embodiment of the present application;
fig. 5 is a first schematic structural diagram of a data processing device based on a convolutional neural network according to an embodiment of the present application;
fig. 6 is a second schematic structural diagram of a data processing device based on a convolutional neural network according to an embodiment of the present application.
Detailed Description
In order to more clearly explain the overall concept of the present application, the following detailed description is given by way of example in conjunction with the accompanying drawings.
The embodiment of the application provides a data processing method based on a convolutional neural network. The method can be applied to scenarios involving convolutional neural networks, such as image classification, speech recognition, document analysis, and the like. By adopting the technical scheme provided by the embodiment of the application, the precision of data processing can be effectively improved.
For example, in an image classification application scenario, with the technical solution provided in the embodiments of the present application, when floating-point operations are replaced by fixed-point operations, the convolution kernel is scaled per output channel, the scaled convolution kernel is globally quantized to obtain the convolution kernel finally used for the convolution operation, and the convolution is then performed with this final kernel, improving the precision of data processing. That is to say, when an image classification model is trained with the technical solution provided in the embodiments of the present application, more representative image features can be extracted and the features are recognized more accurately, so that the trained image classification model yields more accurate classification results in subsequent image processing.
It should be noted that, in practical application, if convolution operations are involved when performing image classification, the corresponding image classification result may likewise be obtained with an implementation similar to the technical solution provided in the embodiments of the present application; the image classification result obtained in this way also gains processing accuracy from the adjusted operation process.
Similarly, in other scenarios involving a convolutional neural network, such as speech recognition and document analysis, effects similar to those described above for the image classification scenario can be achieved with the technical solution provided by the embodiments of the present application; the details are not repeated here, and the foregoing description may be referred to.
The technical solution provided by the embodiment of the present application is further described below with reference to corresponding execution steps of the data processing method based on the convolutional neural network provided by the embodiment of the present application. As shown in fig. 1, the method may include S101 to S104.
S101, obtaining the number of output channels, and segmenting the initial weight according to the number of the output channels to obtain a plurality of initial sub-weights corresponding to the number of the output channels.
In the embodiment of the present application, how many initial sub-weights the initial weight is divided into may be determined according to the number of output channels. Since, in the technical solution provided by the embodiments of the present application, the initial weight is scaled per output channel and globally quantized after scaling, in one implementation the initial weight may be split according to the number of output channels; that is, the number of initial sub-weights after segmentation equals the number of output channels. For example, if the number of output channels is 3, segmenting the initial weight yields 3 initial sub-weights.
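For illustration only, a minimal sketch of this per-channel segmentation, assuming the initial weight is stored as a NumPy array whose first axis is the output-channel axis (the layout and the function name are assumptions, not something the application prescribes):

```python
import numpy as np

def split_initial_weight(w1: np.ndarray) -> list:
    """Split the initial weight into one initial sub-weight per output channel.

    Assumes axis 0 of ``w1`` is the output-channel axis, e.g. a shape of
    (C, C_in, kH, kW); each sub-weight keeps the dimensionality of ``w1``.
    """
    num_output_channels = w1.shape[0]  # C
    # One initial sub-weight per output channel; each piece keeps a leading
    # axis of length 1, so its dimension matches the initial weight's.
    return np.split(w1, num_output_channels, axis=0)
```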
Of course, in the implementation process, if at least some of the segmented initial sub-weights are identical, those initial sub-weights may be scaled in the same manner; that is, a single scaling factor can be determined and shared among them, reducing the resources consumed in computing scaling factors.
And S102, respectively scaling the plurality of initial sub-weights to obtain a plurality of scaled sub-weights.
In scaling the plurality of initial sub-weights, at least some of them may be scaled; that is, all of the initial sub-weights may be scaled, or only some of them, to obtain the corresponding scaling sub-weights. For the case in which only some initial sub-weights are scaled, which are scaled and which are not is not limited in the embodiments of the present application, and may be adjusted according to factors such as the precision required of the data processing.
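A hedged sketch of this step, covering both the all-scaled and the partially-scaled case (the selection predicate ``should_scale`` is an assumption; the application leaves the choice to precision requirements):

```python
def scale_sub_weights(sub_weights, factors, should_scale=None):
    """Scale all, or only some, initial sub-weights (S102).

    ``factors[i]`` is the scaling factor of sub-weight ``i``; sub-weights
    for which ``should_scale`` returns False pass through unchanged.
    """
    if should_scale is None:
        should_scale = lambda i: True  # default: scale every sub-weight
    return [f * w if should_scale(i) else w
            for i, (w, f) in enumerate(zip(sub_weights, factors))]
```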
S103, combining the plurality of scaling sub-weights to obtain the scaling weight.
For the case in which only some of the plurality of initial sub-weights are scaled, the merging in S103 includes merging the unscaled initial sub-weights with the scaling sub-weights. The scaling weight may be combined by performing the inverse of the operation that divided the initial weight into the plurality of initial sub-weights. For the specific implementation of splitting the initial weight and combining to obtain the scaling weight, reference may be made to existing matrix splitting and combining methods, which are not described here again.
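Under the same assumed layout as the splitting sketch above, the merge is simply the inverse concatenation:

```python
import numpy as np

def merge_scaled_sub_weights(sub_weights: list) -> np.ndarray:
    """Combine the scaled (and any unscaled) sub-weights into the scaling weight.

    Concatenating along the output-channel axis used for the split restores
    the dimensions of the initial weight, as S103 requires.
    """
    return np.concatenate(sub_weights, axis=0)
```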
And S104, carrying out global quantization on the scaling weights, and applying the scaling weights subjected to global quantization as convolution kernels to convolution layers of the convolutional neural network for data processing.
In addition, to ensure that a scaling weight matching the initial weight can be obtained from the initial weight, in one implementation of the embodiments of the present application, the initial weight and the scaling weight can be kept in the same form by keeping the forms of the initial sub-weights and the scaling sub-weights consistent. That is, the dimension of an initial sub-weight may be the same as the dimension of its corresponding scaling sub-weight, and the dimension of the initial weight may be the same as the dimension of the convolution kernel.
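The application does not commit to a particular quantization scheme for S104. As one possible sketch, a global quantization (a single scale and zero point for the whole tensor) to 8-bit integers might look as follows, where the bit width and the affine form are assumptions:

```python
import numpy as np

def global_quantize(w4: np.ndarray, num_bits: int = 8):
    """Globally quantize the scaling weight W4 with one scale/zero-point pair.

    Unlike per-output-channel quantization, a single pair of parameters is
    shared by the whole tensor; the uint8 affine scheme is an assumption.
    """
    qmin, qmax = 0, (1 << num_bits) - 1
    lo = min(float(w4.min()), 0.0)  # keep zero exactly representable
    hi = max(float(w4.max()), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # guard the all-zero tensor
    zero_point = int(round(qmin - lo / scale))
    q = np.clip(np.round(w4 / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point
```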
Compared with the prior-art situation in which data processing precision suffers from hardware limitations, the embodiments of the application adopt, in the data processing corresponding to the convolution layer, a process that approximates quantizing the weights separately per output channel and then performing the corresponding convolution. Specifically, the initial weight is segmented according to the output channels, each segmented initial sub-weight is processed, the processed scaling sub-weights are combined, the combined scaling weight is globally quantized, and the globally quantized result is applied as a convolution kernel in the data processing of the convolution layer. The original, limited hardware therefore does not need to be replaced, and the obtained result is close to the result of quantizing the weights separately per output channel before processing. Data processing precision can thus be improved while fixed-point operations are used in place of floating-point operations.
In scaling each initial sub-weight, a corresponding scaling factor can be obtained for each initial sub-weight, and the initial sub-weight is then scaled according to that factor, so that the obtained scaling sub-weight is better targeted. Therefore, on the basis of the implementation shown in fig. 1, the implementation shown in fig. 2 can be realized, in which S102 (respectively scaling the plurality of initial sub-weights to obtain a plurality of scaling sub-weights) is implemented as S201 and S202.
S201, obtaining a scaling factor of each initial sub-weight.
S202, scaling each initial sub-weight according to the corresponding scaling factor to respectively obtain the scaling sub-weight corresponding to each initial sub-weight.
The scaling factor is the parameter used to scale the initial sub-weights. In one implementation of the embodiments of the present application, the way the scaling factor is calculated may be determined by the magnitude relationship between the elements of the initial weight and the elements of the initial sub-weights. Therefore, on the basis of the implementation shown in fig. 2, the implementation shown in fig. 3 can be realized, in which S201 (obtaining the scaling factor of each initial sub-weight) is implemented as S301 followed by repeated executions of S302 to S304; that is, S302 to S304 are performed for each initial sub-weight to obtain its scaling factor.
S301, acquiring a first maximum value and a first minimum value in each element of the initial weight.
S302, acquiring a second maximum value and a second minimum value in each element of the initial sub-weight.
S303, determining a ratio of the first maximum value to the second maximum value as a first ratio, and determining a ratio of the first minimum value to the second minimum value as a second ratio.
S304, determining the scaling factor to be the first ratio or the second ratio according to the type of the first ratio and/or the second ratio.
On the basis of the implementation shown in fig. 3, the implementation shown in fig. 4 can be realized, in which S304 (determining the scaling factor as the first ratio or the second ratio according to the type of the first ratio and/or the second ratio) is implemented as S401 or S402.
S401, if at least one negative number exists in the first ratio and the second ratio, determining that the scaling factor is the maximum value of the first ratio and the second ratio.
S402, if no negative number exists in the first ratio and the second ratio, determining the scaling factor as the minimum value of the first ratio and the second ratio.
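Putting S301 through S402 together, a minimal sketch of the scaling-factor computation, assuming plain min/max statistics and nonzero extreme values (the function name is an assumption):

```python
import numpy as np

def scaling_factor(w1: np.ndarray, w2_i: np.ndarray) -> float:
    """Compute the scaling factor of one initial sub-weight (S301 to S402).

    first_ratio  = first maximum / second maximum = max(W1) / max(W2_i)
    second_ratio = first minimum / second minimum = min(W1) / min(W2_i)
    If either ratio is negative, take the larger one; otherwise the smaller.
    """
    first_ratio = float(w1.max()) / float(w2_i.max())
    second_ratio = float(w1.min()) / float(w2_i.min())
    if first_ratio < 0 or second_ratio < 0:
        return max(first_ratio, second_ratio)
    return min(first_ratio, second_ratio)
```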
The technical solutions provided in the embodiments of the present application are described below with reference to specific examples.
Obtain the number C of output channels of the initial weight W1 and divide the initial weight W1 into C initial sub-weights; then scale each initial sub-weight W2_i according to its corresponding scaling factor to obtain the corresponding scaling sub-weight W3_i, and combine the scaling sub-weights to obtain the scaling weight W4 for data processing. Here i is an integer with 1 ≤ i ≤ C, and C is an integer greater than 1.
Take deriving the scaling sub-weight W3_1 from the initial sub-weight W2_1 as an example. The minimum value L (i.e., the first minimum value) and the maximum value U (i.e., the first maximum value) of the initial weight are determined in advance, i.e., L = min(W1) and U = max(W1). The minimum value L_1 (i.e., the second minimum value) and the maximum value U_1 (i.e., the second maximum value) of the initial sub-weight W2_1 to be scaled are then obtained, i.e., L_1 = min(W2_1) and U_1 = max(W2_1). The weight scaling parameters S_1 (i.e., the first ratio) and S_2 (i.e., the second ratio) are computed as S_1 = U/U_1 and S_2 = L/L_1. If at least one of S_1 and S_2 is negative, the weight scaling factor S is determined as the larger of the two weight scaling parameters, i.e., S = max(S_1, S_2); otherwise, the weight scaling factor S is determined as the smaller of the two, i.e., S = min(S_1, S_2). After the weight scaling factor is obtained, the scaling sub-weight is W3_1 = S · W2_1. In the same manner, W3_1, W3_2, …, W3_C are obtained, and the scaling sub-weights are combined to obtain the scaling weight W4. In this process, the multiplication factor M_1 corresponding to the initial sub-weight W2_1 is the reciprocal of the weight scaling factor S, i.e., M_1 = 1/S.
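Tying the example together, a hedged end-to-end sketch that reuses the helper functions sketched earlier in this description (the random weight and its shape are purely illustrative):

```python
import numpy as np

# Illustrative initial weight W1 with C = 3 output channels on axis 0.
w1 = np.random.randn(3, 16, 3, 3).astype(np.float32)

sub_weights = split_initial_weight(w1)          # W2_1 ... W2_C
scaled, mult_factors = [], []
for w2_i in sub_weights:
    s = scaling_factor(w1, w2_i)                # weight scaling factor S
    scaled.append(s * w2_i)                     # W3_i = S * W2_i
    mult_factors.append(1.0 / s)                # M_i = 1/S, compensated later

w4 = merge_scaled_sub_weights(scaled)           # scaling weight W4
q, scale, zero_point = global_quantize(w4)      # globally quantized kernel
```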
It should be noted that, in actual data processing, the min and max functions here are not limited to the conventional computation of minimum and maximum values; the parameters may also be obtained with algorithms such as KLD (Kullback-Leibler divergence calibration), and the specific implementation is not limited here.
The embodiment of the application provides a data processing device based on a convolutional neural network. As shown in fig. 5, the convolutional neural network-based data processing device 50 may include:
An obtaining module 51, configured to obtain the number of output channels.
The processing module 52 is configured to segment the initial weights according to the number of the output channels acquired by the acquiring module 51, so as to obtain a plurality of initial sub-weights corresponding to the number of the output channels.
The processing module 52 is further configured to scale the plurality of initial sub-weights respectively to obtain a plurality of scaling sub-weights.
The processing module 52 is further configured to combine the multiple scaling sub-weights to obtain the scaling weight.
The processing module 52 is further configured to perform global quantization on the scaling weights, so as to apply the scaling weights after global quantization as convolution kernels to convolution layers of the convolutional neural network, so as to perform data processing.
In one implementation, the processing module 52 is further configured to obtain a scaling factor for each initial sub-weight; and scaling each initial sub-weight according to the corresponding scaling factor to respectively obtain the scaling sub-weight corresponding to each initial sub-weight.
In one implementation, the obtaining module 51 is further configured to obtain a first maximum value and a first minimum value of each element of the initial weight.
For each initial sub-weight, performing the following steps to obtain a scaling factor for each initial sub-weight:
the obtaining module 51 is further configured to obtain a second maximum value and a second minimum value in each element of the initial sub-weights.
The processing module 52 is further configured to determine a ratio of the first maximum value to the second maximum value as a first ratio, and determine a ratio of the first minimum value to the second minimum value as a second ratio. The processing module 52 is further configured to determine the scaling factor as the first ratio or the second ratio according to the type of the first ratio and/or the second ratio.
The processing module 52 is further configured to determine that the scaling factor is the maximum value of the first ratio and the second ratio if at least one negative number exists in the first ratio and the second ratio; and if no negative number exists in the first ratio and the second ratio, determining the scaling factor as the minimum value of the first ratio and the second ratio.
In one implementation, the dimension of the initial sub-weight is the same as the dimension of its corresponding scaled sub-weight, and the dimension of the initial weight is the same as the dimension of the convolution kernel.
In one implementation, the convolutional neural network-based data processing device 50 may further include at least one of a communication module 53, a storage module 54, and a display module 55.
The communication module 53 may be configured to implement data interaction among the above modules, and/or to support data interaction between the convolutional neural network-based data processing device 50 and devices such as servers and other processing devices; the storage module 54 may be used to store the content required by the above modules to implement their corresponding functions; and the display module 55 may be configured to display the progress of data processing, the operating status of the convolutional neural network-based data processing device 50, and the like. The content, format, and the like stored in the storage module are not limited in the embodiments of the present application.
In the embodiment of the present application, the obtaining module 51 and the communication module 53 may be implemented as a communication interface, the processing module 52 may be implemented as a processor and/or a controller, the storage module 54 may be implemented as a memory, and the display module 55 may be implemented as a display.
Fig. 6 is a schematic structural diagram of another data processing device based on a convolutional neural network according to an embodiment of the present application. The convolutional neural network-based data processing device 60 may include a communication interface 61, a processor 62. In one implementation, the convolutional neural network-based data processing device 60 may also include one or more of a memory 63 and a display 64. The communication interface 61, the processor 62, the memory 63, and the display 64 may communicate with each other through the bus 65. The functions implemented by the above components may refer to the description of the functions of the modules, which is not repeated herein.
It should be noted that, referring to fig. 5 and fig. 6, the data processing device based on a convolutional neural network provided in the embodiments of the present application may include more or fewer modules and components than those shown in the figures, which is not limited here.
The application provides a data processing device based on a convolutional neural network, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor executes the computer program to realize the method of any one of the above-mentioned various possible implementation modes.
The present application provides a computer-readable storage medium. The storage medium stores a computer program which, when executed by a processor, implements the method of any of the various possible implementations described above.
The embodiments in this specification are described in a progressive manner; the same or similar parts of the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, since the device embodiment is substantially similar to the method embodiment, its description is brief, and the relevant points can be found in the corresponding parts of the description of the method embodiment.
Those of skill would further appreciate that the various illustrative components and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied in hardware, a software module executed by a processor, or a combination of the two. A software module may reside in random access memory (RAM), flash memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The above are merely examples of the present application and are not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (10)

1. A method of data processing based on a convolutional neural network, the method comprising:
acquiring the number of output channels, and segmenting the initial weights according to the number of the output channels to obtain a plurality of initial sub-weights corresponding to the number of the output channels;
respectively scaling the plurality of initial sub-weights to obtain a plurality of scaling sub-weights;
combining the plurality of scaling sub-weights to obtain scaling weights;
and carrying out global quantization on the scaling weights, and applying the scaling weights subjected to global quantization as convolution kernels to convolution layers of a convolutional neural network for data processing.
2. The method of claim 1, wherein respectively scaling the plurality of initial sub-weights to obtain a plurality of scaling sub-weights comprises:
obtaining a scaling factor of each initial sub-weight;
and scaling each initial sub-weight according to the corresponding scaling factor to respectively obtain the scaling sub-weight corresponding to each initial sub-weight.
3. The method of claim 2, wherein obtaining the scaling factor for each initial sub-weight comprises:
acquiring a first maximum value and a first minimum value among the elements of the initial weight;
for each initial sub-weight, performing the following steps to obtain a scaling factor for each initial sub-weight:
acquiring a second maximum value and a second minimum value among the elements of the initial sub-weight;
determining a ratio of the first maximum value to the second maximum value as a first ratio, and determining a ratio of the first minimum value to the second minimum value as a second ratio;
determining the scaling factor as the first ratio or the second ratio according to the type of the first ratio and/or the second ratio.
4. The method of claim 3, wherein determining the scaling factor as the first ratio or the second ratio according to the type of the first ratio and/or the second ratio comprises:
if at least one negative number exists in the first ratio and the second ratio, determining the scaling factor as the maximum value of the first ratio and the second ratio;
if no negative number exists in the first ratio and the second ratio, determining that the scaling factor is the minimum value of the first ratio and the second ratio.
5. The method of claim 1, wherein the dimension of the initial sub-weight is the same as the dimension of its corresponding scaled sub-weight, and wherein the dimension of the initial weight is the same as the dimension of the convolution kernel.
6. A convolutional neural network-based data processing apparatus, the apparatus comprising:
the acquisition module is used for acquiring the number of the output channels;
the processing module is used for segmenting the initial weight according to the number of the output channels acquired by the acquisition module to obtain a plurality of initial sub-weights corresponding to the number of the output channels;
the processing module is further configured to scale the plurality of initial sub-weights respectively, and obtain a plurality of scaled sub-weights;
the processing module is further configured to combine the scaling sub-weights to obtain a scaling weight;
the processing module is further configured to perform global quantization on the scaling weights, so that the scaling weights after global quantization are applied to a convolutional layer of a convolutional neural network as a convolutional kernel to perform data processing.
7. The apparatus of claim 6, wherein the processing module is further configured to obtain a scaling factor for each initial sub-weight;
and scaling each initial sub-weight according to the corresponding scaling factor to respectively obtain the scaling sub-weight corresponding to each initial sub-weight.
8. The apparatus according to claim 7, wherein the acquisition module is further configured to acquire a first maximum value and a first minimum value among the elements of the initial weight;
for each initial sub-weight, performing the following steps to obtain a scaling factor for each initial sub-weight:
the acquisition module is further configured to acquire a second maximum value and a second minimum value among the elements of the initial sub-weight;
the processing module is further configured to determine a ratio of the first maximum value to the second maximum value as a first ratio, and determine a ratio of the first minimum value to the second minimum value as a second ratio;
the processing module is further configured to determine that the scaling factor is the first ratio or the second ratio according to a type of the first ratio and/or the second ratio;
the processing module is further configured to determine that the scaling factor is a maximum value of the first ratio and the second ratio if at least one negative number exists in the first ratio and the second ratio;
if no negative number exists in the first ratio and the second ratio, determining that the scaling factor is the minimum value of the first ratio and the second ratio.
9. The apparatus of claim 6, wherein the dimension of an initial sub-weight is the same as the dimension of its corresponding scaled sub-weight, and wherein the dimension of the initial weight is the same as the dimension of the convolution kernel.
10. A computer-readable storage medium, characterized in that the storage medium stores a computer program which, when executed by a processor, implements the method of any one of claims 1 to 5.
CN202010237669.8A 2020-03-30 2020-03-30 Data processing method, device and storage medium based on convolutional neural network Active CN111461302B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010237669.8A CN111461302B (en) 2020-03-30 2020-03-30 Data processing method, device and storage medium based on convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010237669.8A CN111461302B (en) 2020-03-30 2020-03-30 Data processing method, device and storage medium based on convolutional neural network

Publications (2)

Publication Number Publication Date
CN111461302A true CN111461302A (en) 2020-07-28
CN111461302B CN111461302B (en) 2024-04-19

Family

ID=71681615

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010237669.8A Active CN111461302B (en) 2020-03-30 2020-03-30 Data processing method, device and storage medium based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN111461302B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112381205A (en) * 2020-09-29 2021-02-19 北京清微智能科技有限公司 Neural network low bit quantization method
CN113780523A (en) * 2021-08-27 2021-12-10 深圳云天励飞技术股份有限公司 Image processing method, image processing device, terminal equipment and storage medium
WO2023125785A1 (en) * 2021-12-29 2023-07-06 杭州海康威视数字技术股份有限公司 Data processing method, network training method, electronic device, and storage medium

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150372805A1 (en) * 2014-06-23 2015-12-24 Qualcomm Incorporated Asynchronous pulse modulation for threshold-based signal coding
CN108053028A (en) * 2017-12-21 2018-05-18 深圳云天励飞技术有限公司 Data fixed point processing method, device, electronic equipment and computer storage media
CN108304919A (en) * 2018-01-29 2018-07-20 百度在线网络技术(北京)有限公司 Method and apparatus for generating convolutional neural networks
US20190012559A1 (en) * 2017-07-06 2019-01-10 Texas Instruments Incorporated Dynamic quantization for deep neural network inference system and method
US20190042935A1 (en) * 2017-12-28 2019-02-07 Intel Corporation Dynamic quantization of neural networks
CN109902803A (en) * 2019-01-31 2019-06-18 东软睿驰汽车技术(沈阳)有限公司 A kind of method and system of neural network parameter quantization
CN110222821A (en) * 2019-05-30 2019-09-10 浙江大学 Convolutional neural networks low-bit width quantization method based on weight distribution
US20190294413A1 (en) * 2018-03-23 2019-09-26 Amazon Technologies, Inc. Accelerated quantized multiply-and-add operations
CN110322008A (en) * 2019-07-10 2019-10-11 杭州嘉楠耘智信息科技有限公司 Residual convolution neural network-based quantization processing method and device
CN110363279A (en) * 2018-03-26 2019-10-22 华为技术有限公司 Image processing method and device based on convolutional neural networks model
KR20190130443A (en) * 2018-05-14 2019-11-22 삼성전자주식회사 Method and apparatus for quantization of neural network
CN110826685A (en) * 2018-08-08 2020-02-21 华为技术有限公司 Method and device for convolution calculation of neural network

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150372805A1 (en) * 2014-06-23 2015-12-24 Qualcomm Incorporated Asynchronous pulse modulation for threshold-based signal coding
US20190012559A1 (en) * 2017-07-06 2019-01-10 Texas Instruments Incorporated Dynamic quantization for deep neural network inference system and method
CN108053028A (en) * 2017-12-21 2018-05-18 深圳云天励飞技术有限公司 Data fixed point processing method, device, electronic equipment and computer storage media
US20190042935A1 (en) * 2017-12-28 2019-02-07 Intel Corporation Dynamic quantization of neural networks
CN108304919A (en) * 2018-01-29 2018-07-20 百度在线网络技术(北京)有限公司 Method and apparatus for generating convolutional neural networks
US20190294413A1 (en) * 2018-03-23 2019-09-26 Amazon Technologies, Inc. Accelerated quantized multiply-and-add operations
CN110363279A (en) * 2018-03-26 2019-10-22 华为技术有限公司 Image processing method and device based on convolutional neural networks model
KR20190130443A (en) * 2018-05-14 2019-11-22 삼성전자주식회사 Method and apparatus for quantization of neural network
CN110826685A (en) * 2018-08-08 2020-02-21 华为技术有限公司 Method and device for convolution calculation of neural network
CN109902803A (en) * 2019-01-31 2019-06-18 东软睿驰汽车技术(沈阳)有限公司 A kind of method and system of neural network parameter quantization
CN110222821A (en) * 2019-05-30 2019-09-10 浙江大学 Convolutional neural networks low-bit width quantization method based on weight distribution
CN110322008A (en) * 2019-07-10 2019-10-11 杭州嘉楠耘智信息科技有限公司 Residual convolution neural network-based quantization processing method and device

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Zhang Yueqin et al., "An Improved BP Neural Network Algorithm and Its Application", Computer Technology and Development, vol. 22, no. 08, 10 August 2012 (2012-08-10), pages 163-166 *
Mou Shuai, "Research on Acceleration and Compression of Deep Neural Networks Based on Bit Quantization", China Masters' Theses Full-text Database (Information Science and Technology), no. 06, 15 June 2018 (2018-06-15), pages 138-1290 *
Xu Dezhi et al., "Vehicle-mounted image super-resolution reconstruction based on weight quantization and information compression", Journal of Computer Applications, vol. 39, no. 12, 30 August 2019 (2019-08-30), pages 3644-3649 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112381205A (en) * 2020-09-29 2021-02-19 北京清微智能科技有限公司 Neural network low bit quantization method
CN113780523A (en) * 2021-08-27 2021-12-10 深圳云天励飞技术股份有限公司 Image processing method, image processing device, terminal equipment and storage medium
CN113780523B (en) * 2021-08-27 2024-03-29 深圳云天励飞技术股份有限公司 Image processing method, device, terminal equipment and storage medium
WO2023125785A1 (en) * 2021-12-29 2023-07-06 杭州海康威视数字技术股份有限公司 Data processing method, network training method, electronic device, and storage medium

Also Published As

Publication number Publication date
CN111461302B (en) 2024-04-19

Similar Documents

Publication Publication Date Title
CN111461302B (en) Data processing method, device and storage medium based on convolutional neural network
CN111031346B (en) Method and device for enhancing video image quality
CN109002889B (en) Adaptive iterative convolution neural network model compression method
CN108012156B (en) Video processing method and control platform
KR20180073118A (en) Convolutional neural network processing method and apparatus
CN111488985B (en) Deep neural network model compression training method, device, equipment and medium
WO2019238029A1 (en) Convolutional neural network system, and method for quantifying convolutional neural network
TW202119293A (en) Method and system of quantizing artificial neural network and arti ficial neural network apparatus
CN108335293B (en) Image quality determination method and device
US20210350205A1 (en) Convolution Processing Method and Apparatus for Convolutional Neural Network, and Storage Medium
US20200389182A1 (en) Data conversion method and apparatus
CN113449854A (en) Method and device for quantifying mixing precision of network model and computer storage medium
WO2016125500A1 (en) Feature transformation device, recognition device, feature transformation method and computer readable recording medium
Pascual et al. Adjustable compression method for still JPEG images
TW202001701A (en) Method for quantizing an image and method for training a neural network
WO2021037174A1 (en) Neural network model training method and apparatus
CN112561050B (en) Neural network model training method and device
CN114078471A (en) Network model processing method, device, equipment and computer readable storage medium
CN112418388A (en) Method and device for realizing deep convolutional neural network processing
CN110909085A (en) Data processing method, device, equipment and storage medium
CN113159297A (en) Neural network compression method and device, computer equipment and storage medium
CN113313253A (en) Neural network compression method, data processing device and computer equipment
CN111339490B (en) Matrix multiplication calculation method and device
CN114662679B (en) Data processing method based on neural network
CN112182382A (en) Data processing method, electronic device, and medium

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20201211

Address after: Room 206, 2 / F, building C, phase I, Zhongguancun Software Park, No. 8, Dongbei Wangxi Road, Haidian District, Beijing 100094

Applicant after: Canaan Bright Sight Co.,Ltd.

Address before: 310000 Room 1203, 12/F, Building 4, No. 9, Jiuhuan Road, Jianggan District, Hangzhou City, Zhejiang Province

Applicant before: Hangzhou Canaan Creative Information Technology Ltd.

TA01 Transfer of patent application right
GR01 Patent grant
GR01 Patent grant