CN111144457B - Image processing method, device, equipment and storage medium

Image processing method, device, equipment and storage medium

Info

Publication number
CN111144457B
CN111144457B (application CN201911280488.7A)
Authority
CN
China
Prior art keywords
data
weight
image
quantization
mapped
Prior art date
Legal status
Active
Application number
CN201911280488.7A
Other languages
Chinese (zh)
Other versions
CN111144457A (en)
Inventor
曹效伦
Current Assignee
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd
Priority to CN201911280488.7A
Publication of CN111144457A
Application granted
Publication of CN111144457B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The disclosure relates to an image processing method, apparatus, device and storage medium, applied to a terminal device or a server. The method comprises the following steps: acquiring the weight data and the input image data of a target convolutional layer in a pre-trained convolutional neural network; mapping the weight data and the input image data of the target convolutional layer to a set threshold interval, and respectively obtaining a weight quantization value and a weight scaling factor corresponding to the weight data, and an image quantization value and an image scaling factor corresponding to the input image data; and obtaining the image feature data output by the target convolutional layer according to the result of the convolution operation between the weight quantization value and the image quantization value of the target convolutional layer, together with the weight scaling factor and the image scaling factor. The method and apparatus can quantize only the target convolutional layer in the neural network, which reduces the model size, improves the image processing speed and the generalization ability of the quantization, and reduces the precision loss, thereby improving the image processing quality.

Description

Image processing method, device, equipment and storage medium
Technical Field
The disclosure relates to the technical field of deep learning, and in particular to an image processing method, apparatus, device and storage medium.
Background
Today, with artificial intelligence evolving at a high rate, deep learning techniques play an irreplaceable role in an increasing number of business scenarios. As model structures become more and more complex and mobile-terminal (edge computing) application scenarios multiply, how to increase the inference speed of neural network models is receiving more and more attention.
In the related art, quantization is generally applied to all of the convolutions inside the neural network so as to improve the inference speed of the model. However, this quantization scheme may change the distribution of the quantized model's output tensors, which in turn results in a large loss of model accuracy and poor model generalization ability.
Disclosure of Invention
The present disclosure provides an image processing method, apparatus, device and storage medium, so as to at least solve the above technical problems in the related art.
The technical scheme of the present disclosure is as follows:
according to a first aspect of embodiments of the present disclosure, there is provided an image processing method for performing feature extraction on an input image using a convolutional neural network trained in advance, the convolutional neural network including a plurality of convolutional layers; the method comprises the following steps:
acquiring weight data and input image data of a target convolutional layer in the pre-trained convolutional neural network;
mapping the weight data and the input image data of the target convolutional layer to a set threshold interval, and respectively obtaining a weight quantization value and a weight scaling factor corresponding to the weight data, and an image quantization value and an image scaling factor corresponding to the input image data;
and obtaining the image feature data output by the target convolutional layer according to the result of the convolution operation between the weight quantization value and the image quantization value of the target convolutional layer, together with the weight scaling factor and the image scaling factor.
In an embodiment, the step of obtaining the image feature data output by the target convolutional layer according to the result of the convolution operation between the weight quantization value and the image quantization value of the target convolutional layer, together with the weight scaling factor and the image scaling factor, includes:
obtaining the product of the weight scaling factor, the image scaling factor and the convolution operation result;
performing batch processing on the product, and adding the offset to the batch processing result to obtain a sum;
and operating on the sum with an activation function to obtain the image feature data output by the target convolutional layer.
In an embodiment, the step of mapping the weight data and the input image data of the target convolutional layer to a set threshold interval, to respectively obtain a weight quantization value and a weight scaling factor corresponding to the weight data and an image quantization value and an image scaling factor corresponding to the input image data, includes:
determining an upper threshold and a lower threshold of the data to be mapped, wherein the data to be mapped comprises the weight data and/or the input image data;
and quantizing the data to be mapped based on the upper threshold and the lower threshold, to obtain the weight quantization value and weight scaling factor corresponding to the weight data, and the image quantization value and image scaling factor corresponding to the input image data.
In an embodiment, the step of determining the upper and lower thresholds of the data to be mapped includes:
determining the upper threshold and the lower threshold of the data to be mapped based on the numerical distribution range of the data to be mapped.
In an embodiment, the step of quantizing the data to be mapped based on the upper threshold and the lower threshold, to obtain the weight quantization value and weight scaling factor corresponding to the weight data and the image quantization value and image scaling factor corresponding to the input image data, includes:
in response to determining that the value of the data to be mapped is greater than or equal to the upper threshold, determining the quantized value of the data to be mapped as the upper threshold;
in response to determining that the value of the data to be mapped is less than or equal to the lower threshold, determining the quantized value of the data to be mapped as the lower threshold;
and in response to determining that the value of the data to be mapped is between the upper threshold and the lower threshold, quantizing the data to be mapped based on a preset quantization algorithm and taking the quantization result as the quantized value of the data to be mapped.
According to a second aspect of embodiments of the present disclosure, there is provided an image processing apparatus that performs feature extraction on an input image using a convolutional neural network trained in advance, the convolutional neural network including a plurality of convolutional layers; the device comprises:
a weight data acquisition module configured to acquire the weight data and the input image data of a target convolutional layer in the pre-trained convolutional neural network;
a weight data quantization module configured to map the weight data and the input image data of the target convolutional layer to a set threshold interval, and to obtain a weight quantization value and a weight scaling factor corresponding to the weight data, and an image quantization value and an image scaling factor corresponding to the input image data, respectively;
and an image feature acquisition module configured to obtain the image feature data output by the target convolutional layer according to the result of the convolution operation between the weight quantization value and the image quantization value of the target convolutional layer, together with the weight scaling factor and the image scaling factor.
In an embodiment, the image feature acquisition module includes:
a multiplication result acquisition unit configured to obtain the product of the weight scaling factor, the image scaling factor and the convolution operation result;
an addition result acquisition unit configured to batch process the product and add the offset to the batch processing result to obtain a sum;
and an image feature acquisition unit configured to operate on the sum with an activation function to obtain the image feature data output by the target convolutional layer.
In an embodiment, the weight data quantization module includes:
a data threshold determination unit configured to determine an upper threshold and a lower threshold of the data to be mapped, the data to be mapped including the weight data and/or the input image data;
and a weight data quantization unit configured to quantize the data to be mapped based on the upper threshold and the lower threshold, to obtain the weight quantization value and weight scaling factor corresponding to the weight data, and the image quantization value and image scaling factor corresponding to the input image data.
In an embodiment, the data threshold determination unit is further configured to determine the upper and lower thresholds of the data to be mapped based on the numerical distribution range of the data to be mapped.
In an embodiment, the weight data quantization unit is further configured to:
in response to determining that the value of the data to be mapped is greater than or equal to the upper threshold, determine the quantized value of the data to be mapped as the upper threshold;
in response to determining that the value of the data to be mapped is less than or equal to the lower threshold, determine the quantized value of the data to be mapped as the lower threshold;
and in response to determining that the value of the data to be mapped is between the upper threshold and the lower threshold, quantize the data to be mapped based on a preset quantization algorithm and take the quantization result as the quantized value of the data to be mapped.
According to a third aspect of embodiments of the present disclosure, there is provided an image processing electronic device that performs feature extraction on an input image using a pre-trained convolutional neural network, the convolutional neural network including a plurality of convolutional layers; the electronic device includes:
a processor;
A memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the image processing method described in any one of the above.
According to a fourth aspect of embodiments of the present disclosure, there is provided a storage medium storing instructions which, when executed by a processor of an image processing electronic device, enable the image processing electronic device to perform the image processing method described in any one of the above.
According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program product which, when executed by a processor of an image processing electronic device, enables the image processing electronic device to perform the image processing method described in any one of the above.
The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects:
according to the method, the weight data and the input image data of the target convolutional layer in the pre-trained convolutional neural network are obtained, the weight quantized value and the weight telescopic scale corresponding to the weight data and the image quantized value and the image telescopic scale corresponding to the input image data are respectively obtained by mapping the weight data and the input image data of the target convolutional layer to a set threshold interval, and then according to the convolution operation results of the weight quantized value and the image quantized value of the target convolutional layer and the weight telescopic scale and the image telescopic scale, the image characteristic data output by the target convolutional layer is obtained.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure and do not constitute an undue limitation on the disclosure.
Fig. 1A is a schematic diagram of a convolutional neural network according to an example.
FIG. 1B is a schematic diagram of the structure of a convolution kernel in a convolutional neural network according to an example.
Fig. 1C is a schematic diagram of a process for calibration according to an example quantization scheme.
FIG. 1D is a schematic diagram of a process for performing quantization calculations after calibration according to an example quantization scheme.
Fig. 2A is a flowchart illustrating an image processing method according to an exemplary embodiment.
FIG. 2B is a schematic diagram illustrating a convolution calculation process, according to an example embodiment.
FIG. 3 is a flowchart illustrating how image feature data output by the target convolutional layer is obtained, according to an exemplary embodiment.
Fig. 4 is a flowchart showing how the weight data and input image data of the target convolutional layer are mapped to a set threshold interval according to an exemplary embodiment.
Fig. 5 is a flowchart showing how to map the weight data and the input image data of the target convolution layer to a set threshold interval according to still another exemplary embodiment.
Fig. 6 is a block diagram of an image processing apparatus according to an exemplary embodiment.
Fig. 7 is a block diagram of an image processing apparatus according to still another exemplary embodiment.
Fig. 8 is a block diagram of an image processing electronic device, according to an exemplary embodiment.
Detailed Description
In order to enable those skilled in the art to better understand the technical solutions of the present disclosure, the technical solutions of the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the foregoing figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the disclosure described herein may be capable of operation in sequences other than those illustrated or described herein. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present disclosure as detailed in the accompanying claims.
FIG. 1A is a schematic diagram of a convolutional neural network according to an example; FIG. 1B is a schematic diagram of the structure of a convolution kernel in a convolutional neural network according to an example; FIG. 1C is a schematic diagram of a process for calibration according to an exemplary quantization scheme; FIG. 1D is a schematic diagram of a process for performing quantization calculations after calibration according to an example quantization scheme.
As shown in fig. 1A, a CNN (convolutional neural network) is composed of a plurality of convolution kernels performing convolution calculations: typically, one part of the convolution kernels convolves the input samples to extract features, and another part convolves the extracted features to obtain classification results. As shown in fig. 1B, each convolution kernel consists of three computation parts: convolution, batch processing, and an activation function. The convolution part convolves the input data with the weights; the result is fed into batch processing and added to the offset, and the batch processing output is passed through the activation function to complete the convolution calculation.
Further, as shown in fig. 1C, because the range of values expressible by INT8 (8-bit integer format) is much smaller than that of FP32 (32-bit single-precision floating-point format), current quantization techniques add a calibration step to keep the distributions consistent. During calibration, the convolution of the raw data with the quantized weights is computed to obtain the distribution of the output tensor for each datum; statistical analysis of these distributions then yields a scaling factor and offset chosen so that the distribution falls as far as possible within the range INT8 can express, i.e., -128 to 127 or 0 to 255. The statistical scaling factor and offset are finally folded into the scaling factor and offset of the weights. After calibration on a batch of data, the scaling factor of the weights and the offset of the batch are updated, as shown in fig. 1D, and INT8 convolution calculation begins.
The inventors have noted that this approach has the following disadvantage: the resulting scaling factor and offset are statistical, i.e., strongly dependent on the chosen calibration data set. Once other data is used, the appropriate statistical scaling factor and offset would differ, so the generalization ability of the quantization scheme is reduced. For example, a model calibrated on one batch of the training data set may suffer a relatively large accuracy drop on another batch of training data, because the scaling factor of the convolution was derived from the calibration data and has been frozen into the scaling factor of the weights. Likewise, when a batch of data whose distribution differs substantially from the calibration set is processed, the calibrated scaling factor and offset no longer fit, causing large errors. Enlarging the calibration set cannot solve this well: when the distributions of the data within the calibration set differ widely, no single statistical scaling factor and offset can fit every datum, i.e., the generalization ability remains poor. On the other hand, current quantization techniques operate on all convolutions within the network (i.e., global quantization) and cannot quantize only a part of the convolutions inside the neural network.
However, in a neural network one part of the convolution kernels is generally responsible for feature extraction, whose output is fed into another part of the kernels for classification; global quantization disturbs the extracted features, so the classification accuracy loss is large and the model generalization ability is poor.
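To make the calibration-dependent pipeline above concrete, the following minimal sketch (illustrative only; the function and variable names are not from the patent) shows how such a scheme derives one fixed scaling factor from a calibration set and then reuses it for all later inputs, which is exactly the data dependence criticized here:

```python
import numpy as np

def calibrate_scale(calibration_batches):
    """Derive a single fixed activation scaling factor from a calibration set.

    The factor is a statistic of the calibration data only, so inputs whose
    distribution differs from the calibration set will be mis-scaled.
    """
    max_abs = max(np.abs(batch).max() for batch in calibration_batches)
    return max_abs / 127.0  # map the observed range onto INT8's upper bound

def quantize_with_fixed_scale(x, scale):
    # Values outside the calibrated range saturate, which is the source of
    # the accuracy drop on data unlike the calibration set.
    return np.clip(np.round(x / scale), -128, 127).astype(np.int8)

# Scaling factor fitted on small-valued calibration data...
calibration = [np.random.randn(64).astype(np.float32) for _ in range(8)]
scale = calibrate_scale(calibration)
# ...saturates badly on later data drawn from a wider distribution.
wide_input = 10.0 * np.random.randn(64).astype(np.float32)
print(quantize_with_fixed_scale(wide_input, scale))  # mostly clipped to +/-127
```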
In view of this, the present application provides the following image processing method, apparatus, device and storage medium, to solve the technical problem in the related art that the quantized convolution scheme changes the distribution of the model's output tensors, causing a large model precision loss and poor model generalization ability. Specifically, the application is realized by the following technical solutions:
fig. 2A is a flowchart illustrating an image processing method according to an exemplary embodiment. FIG. 2B is a schematic diagram illustrating a convolution calculation process, according to an example embodiment. The image processing method of the present embodiment may be used for a computer device, such as a terminal or a server, to perform feature extraction on an input image using a convolutional neural network trained in advance, where the convolutional neural network includes a plurality of convolutional layers. As shown in fig. 2A, the following steps S101-S103 are included.
In step S101, weight data and input image data of a target convolutional layer in the pre-trained convolutional neural network are acquired.
For the first convolutional layer, the input image data may be the data of the input image itself; for the other convolutional layers, it may be the output image data of the previous convolutional layer.
In practical applications, such as processing image data with a convolutional neural network, the convolutional neural network is generally compressed in order to reduce the amount of computation and increase the processing speed; this embodiment mainly describes implementing that compression by quantization.
In this embodiment, after the convolutional neural network is trained, the weight of the convolutional kernel of each convolutional layer in the network can be obtained. Further, after determining a target convolution layer that needs to be quantized, weight data and input image data of the target convolution layer may be acquired.
It should be noted that, since a neural network generally includes convolution kernels for feature extraction and convolution kernels for classification, in this embodiment the convolution kernels used for classification may be selected as the target convolutional layer in order to avoid affecting the extracted features. In this way the accuracy of feature extraction is preserved while quantization is still applied to the convolutional neural network.
In step S102, the weight data and the input image data of the target convolutional layer are mapped to a set threshold interval, and a weight quantization value and a weight scaling factor corresponding to the weight data, and an image quantization value and an image scaling factor corresponding to the input image data, are respectively obtained.
In this embodiment, after the weight data and the input image data of the target convolutional layer in the pre-trained convolutional neural network are acquired, they may be mapped to a set threshold interval, yielding the weight quantization value and weight scaling factor for the weight data, and the image quantization value and image scaling factor for the input image data.
It should be noted that the computation time of a neural network is dominated by FP32 convolution. This embodiment therefore reduces the model size and increases the inference speed by quantizing the FP32 weight data and input image data into INT8 weight data and input image data. For the general manner of quantizing data from FP32 to INT8, refer to the related art; this embodiment does not limit it.
The inventors note that current quantization techniques usually quantize only the weights of the convolution kernels, which changes the distribution of the quantized model's output tensors and in turn causes a large loss of model precision and poor generalization ability. In view of this, as shown in fig. 2B, this embodiment quantizes both the weight data and the input image data of the target convolutional layer by mapping them to the corresponding set threshold interval, and records the scaling factor of each quantization, so that the weight quantization value and weight scaling factor corresponding to the weight data, and the image quantization value and image scaling factor corresponding to the input image data, are obtained.
For the manner of quantizing the weights and the input data to obtain the weight quantization value and the input quantization value, refer to the embodiments shown in fig. 4 or fig. 5, described later.
In step S103, the image feature data output by the target convolutional layer is obtained according to the result of the convolution operation between the weight quantization value and the image quantization value of the target convolutional layer, together with the weight scaling factor and the image scaling factor.
In this embodiment, after the weight quantization value and the input quantization value are obtained by quantizing the weights and the input data, a convolution operation may be performed on them to obtain the operation result.
For example, when the FP32 weight data and input image data are quantized into INT8, an INT8 convolution can be performed on the resulting INT8 weight quantization value and input quantization value.
As shown in fig. 2B, once the convolution of the weight quantization value with the input quantization value is complete, the image feature data output by the target convolutional layer can be obtained from the convolution result together with the weight scaling factor and the image scaling factor.
For the manner of obtaining the output image feature data from the convolution result and the two scaling factors, refer to the embodiment shown in fig. 3, described later.
As can be seen from the foregoing description, this embodiment acquires the weight data and the input image data of the target convolutional layer in the pre-trained convolutional neural network, maps them to a set threshold interval to obtain the weight quantization value and weight scaling factor and the image quantization value and image scaling factor, and then obtains the image feature data output by the target convolutional layer from the convolution of the weight quantization value with the image quantization value, together with the two scaling factors. Because the weight data and the input image data are quantized by mapping to the corresponding set threshold interval, the distribution of the quantized image values of the target convolutional layer stays approximately close to the distribution of the layer's input image data before quantization. The quantization of the target convolutional layer is therefore independent of the processing of the other layers of the network, so only part of the convolutional layers (such as the layers used for classification) need to be quantized. This reduces the model size, improves the image processing speed and the generalization ability of the quantization, and reduces the precision loss, thereby improving the image processing quality.
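As a concrete illustration of steps S101-S103, the following is a minimal NumPy sketch. It assumes per-tensor max-abs quantization and a naive valid 2-D convolution; all names are illustrative and not taken from the patent:

```python
import numpy as np

def quantize(x):
    """Step S102: map an FP32 tensor to INT8 plus its scaling factor."""
    scale = np.abs(x).max() / 127.0  # maximum absolute value maps to 127
    q = np.clip(np.round(x / scale), -128, 127).astype(np.int8)
    return q, scale

def conv2d_int8(img_q, w_q):
    """Step S103 (first half): valid 2-D convolution with INT32 accumulation."""
    H, W = img_q.shape
    kh, kw = w_q.shape
    out = np.zeros((H - kh + 1, W - kw + 1), dtype=np.int32)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img_q[i:i + kh, j:j + kw].astype(np.int32)
                               * w_q.astype(np.int32))
    return out

# Step S101: weight data and input image data of the target layer.
image = np.random.randn(8, 8).astype(np.float32)
weight = np.random.randn(3, 3).astype(np.float32)

# Step S102: each tensor is quantized independently; no calibration set.
img_q, img_scale = quantize(image)
w_q, w_scale = quantize(weight)

# Step S103 (second half): "amplify" the INT8 result back to the FP32 domain.
features = conv2d_int8(img_q, w_q).astype(np.float32) * img_scale * w_scale
```

Note that both scaling factors come from the data of this layer alone, which is why the quantization of one target convolutional layer does not depend on any other layer of the network.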
FIG. 3 is a flowchart illustrating how the image feature data output by the target convolutional layer is obtained, according to an exemplary embodiment. Building on the above embodiments, this embodiment takes that process as an example. As shown in fig. 3, the step in step S103 of obtaining the image feature data output by the target convolutional layer according to the result of the convolution operation between the weight quantization value and the image quantization value of the target convolutional layer, together with the weight scaling factor and the image scaling factor, may include the following steps S201-S203:
In step S201, the product of the weight scaling factor, the image scaling factor and the convolution operation result is obtained.
In this embodiment, after the convolution of the weight quantization value with the input quantization value produces the operation result, that result may be multiplied by the weight scaling factor and the image scaling factor.
For example, the product of the weight scaling factor, the input scaling factor and the operation result may be computed and used as the inverse-mapping data; see the "amplification" process shown in fig. 2B. The amplified data is thereby restored to the domain of the original FP32 convolution, i.e., its distribution remains similar to that of the original FP32 convolution result; the theoretical basis for this is the distributive and associative laws of convolution.
It should be noted that the "amplification" shown in fig. 2B can be understood by analogy with an amplifier: when the scaling factor is greater than 1, its effect is amplification; when the scaling factor is less than 1, its effect is shrinking.
In step S202, the product is batch processed, and the offset is added to the batch processing result to obtain a sum.
In this embodiment, after the product of the weight scaling factor, the image scaling factor and the convolution operation result is obtained, batch processing may be performed on it and a predetermined offset added, yielding the sum.
The manner of batch processing the data may follow any batch processing scheme in the related art, which is not limited in this embodiment.
In step S203, the sum is operated on with an activation function to obtain the image feature data output by the target convolutional layer.
In this embodiment, after the product has been batch processed and the offset added, the resulting sum is passed through an activation function, yielding the target processing result, i.e., the convolution output of the target convolutional layer.
As can be seen from the foregoing description, this embodiment obtains the product of the weight scaling factor, the image scaling factor and the convolution operation result, batch processes the product, adds the offset to obtain the sum, and operates on the sum with an activation function to obtain the image feature data output by the target convolutional layer. The convolution output of the target convolutional layer can thus be determined accurately from the two scaling factors and the operation result; quantization can be confined to the target convolutional layer alone, the model size is reduced, the image processing speed and the generalization ability of the quantization are improved, and the precision loss is reduced, further improving the image processing quality.
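A sketch of steps S201-S203, continuing the example above. Modeling the batch processing as simple batch normalization and the activation as ReLU is an assumption made for illustration; the patent does not fix either choice:

```python
import numpy as np

def postprocess(conv_int32, w_scale, img_scale, offset, eps=1e-5):
    # S201: multiply the INT32 convolution result by both scaling factors,
    # "amplifying" it back into the FP32 domain.
    amplified = conv_int32.astype(np.float32) * w_scale * img_scale
    # S202: batch process (here: normalize over the tensor), then add the offset.
    normed = (amplified - amplified.mean()) / np.sqrt(amplified.var() + eps)
    summed = normed + offset
    # S203: the activation function (here: ReLU) yields the output features.
    return np.maximum(summed, 0.0)
```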
Fig. 4 is a flowchart showing how the weight data and input image data of the target convolutional layer are mapped to a set threshold interval, according to an exemplary embodiment. Building on the above embodiments, this embodiment takes that mapping as an example. As shown in fig. 4, the step in step S102 of mapping the weight data and the input image data of the target convolutional layer to a set threshold interval, to respectively obtain the weight quantization value and weight scaling factor corresponding to the weight data and the image quantization value and image scaling factor corresponding to the input image data, may include the following steps S301-S302:
In step S301, the upper threshold and the lower threshold of the data to be mapped are determined.
In this embodiment, after the weight data and the input image data of the target convolutional layer in the pre-trained convolutional neural network are acquired, the upper and lower thresholds of the data to be mapped may be determined, where the data to be mapped may include the weight data and/or the input image data.
For example, after the data to be mapped is obtained, its numerical distribution range may be determined, and the upper and lower thresholds then derived from that range, e.g., by taking the two end points of the range as the upper and lower thresholds, respectively.
The manner of determining the numerical distribution range of the data to be mapped may follow the related art, which is not limited in this embodiment.
In step S302, the data to be mapped is quantized based on the upper and lower thresholds, yielding the weight quantization value and weight scaling factor corresponding to the weight data, and the image quantization value and image scaling factor corresponding to the input image data.
In this embodiment, after the upper and lower thresholds of the data to be mapped are determined, the data may be quantized against them to obtain the weight quantization value and the input quantization value, while the weight scaling factor and the input scaling factor are recorded.
For example, if the INT8 expressible range is chosen as -128 to 127, the maximum absolute value of the data to be mapped can be determined and mapped to the absolute-value upper bound 127, which yields the scaling factor of the quantized data.
In another embodiment, for the manner of quantizing the data to be mapped based on the upper and lower thresholds to obtain the weight quantization value and the input quantization value, reference may also be made to the embodiment shown in fig. 5, which is not described in detail here.
As can be seen from the foregoing description, this embodiment determines the upper and lower thresholds of the data to be mapped and quantizes the data against them, obtaining the weight quantization value and weight scaling factor corresponding to the weight data and the image quantization value and image scaling factor corresponding to the input image data. The subsequent convolution of the weight quantization value with the input quantization value can then be performed, and the image feature data output by the target convolutional layer obtained from the convolution result together with the two scaling factors. This improves the generalization ability of the quantization scheme; and because only the target convolutional layer in the neural network is quantized, the accuracy of subsequent feature extraction is preserved and the classification precision loss is reduced.
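A minimal sketch of steps S301-S302 under the symmetric max-abs assumption used in the example above (the end points of the value distribution serve as the thresholds); the helper names are illustrative:

```python
import numpy as np

def thresholds_from_distribution(x):
    # S301: take the two end points of the numerical distribution range
    # as the lower and upper thresholds.
    return float(x.min()), float(x.max())

def quantize_symmetric(x):
    # S302: map the maximum absolute value onto INT8's upper bound 127;
    # the divisor recorded here is the scaling factor of the quantized data.
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -128, 127).astype(np.int8)
    return q, scale

data = np.random.randn(256).astype(np.float32)
lower, upper = thresholds_from_distribution(data)
q, scale = quantize_symmetric(np.clip(data, lower, upper))
```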
Fig. 5 is a flowchart showing how the weight data and the input image data of the target convolutional layer are mapped to a set threshold interval, according to still another exemplary embodiment. Building on the above embodiments, this embodiment further illustrates that mapping. As shown in fig. 5, the step in step S102 of mapping the weight data and the input image data of the target convolutional layer to a set threshold interval, to respectively obtain the weight quantization value and weight scaling factor corresponding to the weight data and the image quantization value and image scaling factor corresponding to the input image data, may include the following steps S401-S406:
In step S401, an upper threshold and a lower threshold of the data to be mapped are determined.
For the explanation and description of step S401, refer to the above embodiments; it is not repeated here.
In step S402, it is determined whether the value of the data to be mapped is greater than or equal to the upper threshold: if yes, go to step S403; if not, step S404 is performed.
In step S403, a quantized value of the data to be mapped is determined as the upper threshold.
In step S404, it is determined whether the value of the data to be mapped is less than or equal to the lower threshold: if yes, go to step S405; if not, step S406 is performed.
In this embodiment, the data to be mapped includes the weight data and/or the input image data.
For example, after the upper and lower thresholds of the data to be mapped are determined, the value of the data to be mapped may be compared with the two thresholds, and the following cases distinguished according to the comparison result:
first case: if the value of the data to be mapped is determined to be greater than or equal to the upper threshold, the quantized value of the data to be mapped can be determined to be the upper threshold.
Second case: if the value of the data to be mapped is determined to be smaller than or equal to the lower threshold, the quantized value of the data to be mapped can be determined to be the lower threshold.
Third case: if the value of the data to be mapped is determined to be between the upper limit threshold and the lower limit threshold, the data to be mapped can be quantized based on a preset quantization algorithm, and the result of the quantization process is used as the quantized value of the data to be mapped.
In this embodiment, quantizing the data to be mapped based on the preset quantization algorithm and taking the result as its quantized value may include: mapping the data to be mapped from the single-precision floating-point format FP32 to the 8-bit integer format INT8, and taking the mapping result as the quantized value of the data to be mapped. For example, after the scaling factor of the quantized data (e.g., the weight scaling factor or the input scaling factor) is obtained, each element of the data to be mapped may be divided by the corresponding scaling factor to obtain its quantized value (e.g., the weight quantization value or the input quantization value), completing the INT8 mapping.
It can be understood that setting the upper and lower thresholds, determining the quantized value as the upper threshold when the value of the data to be mapped is greater than or equal to it, determining the quantized value as the lower threshold when the value is less than or equal to it, and quantizing only the values in between with the preset quantization algorithm, effectively avoids the long-tail effect of the data to be mapped.
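A sketch of the three cases of steps S402-S406: values beyond the thresholds are clamped before the division by the scaling factor, which is what suppresses the long tail. Choosing the thresholds by percentile is an illustrative assumption; the patent leaves the choice open:

```python
import numpy as np

def quantize_clamped(x, lower, upper):
    # Cases 1 and 2: values outside [lower, upper] take the threshold value.
    clipped = np.clip(x, lower, upper)
    # Case 3: values in between go through the preset quantization algorithm
    # (here: division by a max-abs scaling factor, then rounding to INT8).
    scale = max(abs(lower), abs(upper)) / 127.0
    q = np.clip(np.round(clipped / scale), -128, 127).astype(np.int8)
    return q, scale

# A long-tailed input: one outlier would otherwise dominate the scale.
data = np.concatenate([np.random.randn(1000), [50.0]]).astype(np.float32)
lower, upper = np.percentile(data, 0.5), np.percentile(data, 99.5)
q, scale = quantize_clamped(data, lower, upper)  # tail clamped, scale sane
```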
For example, the process of quantizing the FP32 convolution into the INT8 convolution in this embodiment can be intuitively demonstrated by the following formulas (1) to (7):
input.32 ≈ Threshold(input)    (1)
weight.32 ≈ Threshold(weight)    (2)
scale = max(abs(input.32)) / 127    (3)
input.8 = input.32 / scale    (4)
scale′ = max(abs(weight.32)) / 127    (5)
weight.8 = weight.32 / scale′    (6)
input*weight ≈ input.32*weight.32 = (input.8×scale)*(weight.8×scale′) = (input.8*weight.8)×scale×scale′    (7)
wherein Threshold denotes clipping to the threshold interval, input denotes the input data, weight denotes the weights, "*" denotes convolution, "×" denotes multiplication by a scaling factor, and the suffixes ".32" and ".8" denote the FP32 and INT8 formats, respectively.
In this way, through the mapping and inverse mapping of the input data and the weights, the information loss comes only from the difference in sampling density. The distribution of the computed result therefore remains approximately consistent, ensuring that the quantization scheme has good generalization ability. At the same time, a single convolution kernel in the neural network can be quantized without affecting the other convolution kernels, which continue to compute in the FP32 domain, so the precision loss of the whole network is low and fine-grained INT8 quantization is realized.
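Formulas (3) to (7) can be checked numerically (the threshold clipping of formulas (1) and (2) is omitted for brevity). In the sketch below, which is illustrative and uses a 1-D convolution in place of a convolutional layer, the INT8 product rescaled by scale × scale′ approximates the FP32 result, with the residual coming only from rounding, i.e., from sampling density:

```python
import numpy as np

inp = np.random.randn(32).astype(np.float32)
wgt = np.random.randn(5).astype(np.float32)

scale = np.abs(inp).max() / 127.0                  # formula (3)
scale_p = np.abs(wgt).max() / 127.0                # formula (5)
inp8 = np.round(inp / scale).astype(np.int32)      # formula (4)
wgt8 = np.round(wgt / scale_p).astype(np.int32)    # formula (6)

fp32 = np.convolve(inp, wgt, mode="valid")
int8 = np.convolve(inp8, wgt8, mode="valid") * scale * scale_p  # formula (7)
print(np.max(np.abs(fp32 - int8)))  # small residual from rounding only
```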
Fig. 6 is a block diagram of an image processing apparatus according to an exemplary embodiment. The apparatus of this embodiment may be used in a computer device, such as a terminal or a server, to perform feature extraction on an input image using a pre-trained convolutional neural network, where the convolutional neural network includes a plurality of convolutional layers. As shown in fig. 6, the apparatus includes a weight data acquisition module 110, a weight data quantization module 120 and an image feature acquisition module 130, wherein:
a weight data acquisition module 110 configured to acquire the weight data and the input image data of a target convolutional layer in the pre-trained convolutional neural network;
a weight data quantization module 120 configured to map the weight data and the input image data of the target convolutional layer to a set threshold interval, and to obtain a weight quantization value and a weight scaling factor corresponding to the weight data, and an image quantization value and an image scaling factor corresponding to the input image data, respectively;
an image feature acquisition module 130 configured to obtain the image feature data output by the target convolutional layer according to the result of the convolution operation between the weight quantization value and the image quantization value of the target convolutional layer, together with the weight scaling factor and the image scaling factor.
As can be seen from the foregoing description, this apparatus embodiment achieves the same beneficial effects as the method described above: because the weight data and the input image data of the target convolutional layer are quantized by mapping to the corresponding set threshold interval, the distribution of the quantized image values stays approximately close to the distribution of the layer's input image data before quantization, so the quantization of the target convolutional layer is independent of the processing of the other layers of the network. Only part of the convolutional layers (such as the layers used for classification) need to be quantized, which reduces the model size, improves the image processing speed and the generalization ability of the quantization, and reduces the precision loss, thereby improving the image processing quality.
Fig. 7 is a block diagram of an image processing apparatus according to still another exemplary embodiment. The apparatus of this embodiment may likewise be used in a computer device, such as a terminal or a server, to perform feature extraction on an input image using a pre-trained convolutional neural network comprising a plurality of convolutional layers. The functions of the weight data acquisition module 210, the weight data quantization module 220 and the image feature acquisition module 230 are the same as those of the weight data acquisition module 110, the weight data quantization module 120 and the image feature acquisition module 130 in the embodiment shown in fig. 6, and are not repeated here. As shown in fig. 7, the image feature acquisition module 230 may include:
a multiplication result acquisition unit 231 configured to obtain the product of the weight scaling factor, the image scaling factor and the convolution operation result;
an addition result acquisition unit 232 configured to batch process the product and add the offset to the batch processing result to obtain a sum;
and an image feature acquisition unit 233 configured to operate on the sum with an activation function to obtain the image feature data output by the target convolutional layer.
In one embodiment, the weight data quantization module 220 may include:
a data threshold determination unit 221 configured to determine the upper and lower thresholds of the data to be mapped, the data to be mapped including the weight data and/or the input image data;
a weight data quantization unit 222 configured to quantize the data to be mapped based on the upper threshold and the lower threshold, to obtain the weight quantization value and weight scaling factor corresponding to the weight data, and the image quantization value and image scaling factor corresponding to the input image data.
In an embodiment, the data threshold determination unit 221 may be further configured to determine the upper and lower thresholds of the data to be mapped based on the numerical distribution range of the data to be mapped.
In an embodiment, the weight data quantization unit 222 is further configured to:
in response to determining that the value of the data to be mapped is greater than or equal to the upper threshold, determine the quantized value of the data to be mapped as the upper threshold;
in response to determining that the value of the data to be mapped is less than or equal to the lower threshold, determine the quantized value of the data to be mapped as the lower threshold;
and in response to determining that the value of the data to be mapped is between the upper threshold and the lower threshold, quantize the data to be mapped based on a preset quantization algorithm and take the quantization result as the quantized value of the data to be mapped.
In an embodiment, the weight data quantization unit 222 may be further configured to map the data to be mapped from the single-precision floating-point format FP32 to the 8-bit integer format INT8, and take the mapping result as the quantized value of the data to be mapped.
The specific manner in which the various modules perform the operations in the apparatus of the above embodiments have been described in detail in connection with the embodiments of the method, and will not be described in detail herein.
It should be noted that, in all the above alternative solutions, any combination may be adopted to form an alternative embodiment of the disclosure, which is not described herein in detail.
The embodiments of the image processing apparatus of the present invention can be applied to a network device. The apparatus embodiments may be implemented by software, or by hardware, or by a combination of hardware and software. Taking software implementation as an example, the apparatus in the logical sense is formed by the processor of the device where it is located reading the corresponding computer program instructions from a nonvolatile memory into memory, the computer program being used to execute the image processing method provided by the embodiments shown in figs. 2A to 5. In terms of hardware, fig. 8 shows a hardware structure diagram of the image processing device of the present invention; besides the processor, network interface, memory and nonvolatile memory shown in fig. 8, the device may generally include other hardware, such as a forwarding chip responsible for processing packets. In terms of hardware architecture, the device may also be a distributed device, possibly comprising a plurality of interface cards, so that message processing can be extended at the hardware level.
In another aspect, the present application also provides a computer-readable storage medium storing instructions which, when executed by a processor of an image processing electronic device, enable the image processing electronic device to perform the image processing method provided by the embodiments shown in figs. 2A to 5.
In another aspect, the present application also provides a computer program product which, when executed by a processor of an image processing electronic device, enables the image processing electronic device to perform the image processing method provided by the embodiments shown in fig. 2A to 5.
Since the apparatus embodiments essentially correspond to the method embodiments, reference may be made to the description of the method embodiments for the relevant points. The apparatus embodiments described above are merely illustrative: the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, i.e., they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purposes of the present application. Those of ordinary skill in the art can understand and implement this without undue burden.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure that follow its general principles, including such departures from the present disclosure as come within known or customary practice in the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with the true scope and spirit of the disclosure being indicated by the following claims.
It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

1. An image processing method, characterized in that feature extraction is performed on an input image using a pre-trained convolutional neural network, the convolutional neural network comprising a plurality of convolutional layers; the method comprises the following steps:
acquiring weight data and input image data of a target convolutional layer in the pre-trained convolutional neural network;
mapping the weight data and the input image data of the target convolutional layer to a set threshold interval to respectively obtain a weight quantization value and a weight scaling factor corresponding to the weight data, and an image quantization value and an image scaling factor corresponding to the input image data;
obtaining image feature data output by the target convolutional layer according to a convolution operation result of the weight quantization value and the image quantization value of the target convolutional layer, together with the weight scaling factor and the image scaling factor;
wherein the step of obtaining the image feature data output by the target convolutional layer according to the convolution operation result of the weight quantization value and the image quantization value of the target convolutional layer, and the weight scaling factor and the image scaling factor, comprises the following steps:
obtaining a multiplication result of the weight scaling factor, the image scaling factor and the convolution operation result;
performing batch processing on the multiplication result, and adding the batch processing result to an offset to obtain an addition result;
and operating on the addition result with an activation function to obtain the image feature data output by the target convolutional layer.
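For illustration only and not as part of the claims: the following minimal NumPy sketch walks the steps of claim 1 for a single channel and kernel. Every concrete choice here — the symmetric interval [-127, 127], per-tensor scaling factors, treating the batch processing as a per-layer affine step with assumed parameters gamma and beta, and ReLU as the activation function — is an assumption standing in for details the claim leaves open.

```python
import numpy as np

def quantize(x, hi=127.0):
    """Map data to the set threshold interval [-hi, hi]; return the
    quantization values and the scaling factor that undoes the mapping."""
    scale = max(float(np.max(np.abs(x))), 1e-12) / hi   # symmetric scheme assumed
    q = np.clip(np.round(x / scale), -hi, hi).astype(np.int32)
    return q, scale

def conv2d_int(img_q, w_q):
    """Integer-domain 'valid' convolution of one channel with one kernel."""
    kh, kw = w_q.shape
    h, w = img_q.shape
    out = np.zeros((h - kh + 1, w - kw + 1), dtype=np.int64)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img_q[i:i + kh, j:j + kw] * w_q)
    return out

def target_layer(img, weights, gamma, beta, offset):
    img_q, s_img = quantize(img)          # image quantization value + image scaling factor
    w_q, s_w = quantize(weights)          # weight quantization value + weight scaling factor
    acc = conv2d_int(img_q, w_q)          # convolution operation result (integer domain)
    mul = s_w * s_img * acc               # multiplication result with both scaling factors
    added = gamma * mul + beta + offset   # batch processing (affine step assumed) plus offset
    return np.maximum(added, 0.0)         # activation function (ReLU assumed)
```

Calling target_layer on an 8x8 image and a 3x3 kernel returns a 6x6 feature map; the expensive convolution runs entirely on integers, and the floating-point scaling factors are applied only once, to the accumulated result.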
2. The image processing method according to claim 1, wherein the step of mapping the weight data and the input image data of the target convolutional layer to a set threshold interval to respectively obtain a weight quantization value and a weight scaling factor corresponding to the weight data, and an image quantization value and an image scaling factor corresponding to the input image data, comprises:
determining an upper limit threshold and a lower limit threshold of data to be mapped, wherein the data to be mapped comprises the weight data and/or the input image data;
and performing quantization processing on the data to be mapped based on the upper limit threshold and the lower limit threshold to obtain the weight quantization value and the weight scaling factor corresponding to the weight data, and the image quantization value and the image scaling factor corresponding to the input image data.
3. The image processing method according to claim 2, wherein the step of determining the upper limit threshold and the lower limit threshold of the data to be mapped comprises:
determining the upper limit threshold and the lower limit threshold of the data to be mapped based on the numerical distribution range of the data to be mapped.
4. The image processing method according to claim 2, wherein the step of performing quantization processing on the data to be mapped based on the upper limit threshold and the lower limit threshold to obtain the weight quantization value and the weight scaling factor corresponding to the weight data, and the image quantization value and the image scaling factor corresponding to the input image data, comprises:
in response to determining that the value of the data to be mapped is greater than or equal to the upper limit threshold, determining the quantized value of the data to be mapped as the upper limit threshold;
in response to determining that the value of the data to be mapped is less than or equal to the lower limit threshold, determining the quantized value of the data to be mapped as the lower limit threshold;
and in response to determining that the value of the data to be mapped is between the upper limit threshold and the lower limit threshold, performing quantization processing on the data to be mapped based on a preset quantization algorithm, and taking the quantization processing result as the quantized value of the data to be mapped.
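For illustration only: a minimal NumPy sketch of the mapping recited in claims 2 to 4, under loudly stated assumptions — the thresholds are taken as the min/max of the data's numerical distribution range, and uniform rounding over 256 levels stands in for the unspecified "preset quantization algorithm".

```python
import numpy as np

def thresholds_from_range(x):
    # Claim 3: derive the upper and lower limit thresholds from the
    # numerical distribution range of the data to be mapped (min/max
    # assumed here; a percentile choice would fit the wording equally).
    return float(np.min(x)), float(np.max(x))

def quantize_to_interval(x, lower, upper, levels=256):
    # Claim 4: values at or beyond a threshold are clamped to that
    # threshold; in-range values pass through an assumed uniform quantizer.
    scale = (upper - lower) / (levels - 1)      # the scaling factor
    clamped = np.clip(x, lower, upper)          # >= upper -> upper, <= lower -> lower
    q = np.round((clamped - lower) / scale).astype(np.int32)
    return q, scale

data = np.array([-3.0, 0.5, 2.2, 9.9])
lo, hi = thresholds_from_range(data)
q, s = quantize_to_interval(data, lo, hi)
```

Dequantizing as q * s + lower recovers an approximation of the original values, which is what allows the scaling factor to be folded back in after the integer convolution.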
5. An image processing apparatus, characterized in that feature extraction is performed on an input image using a pre-trained convolutional neural network, the convolutional neural network comprising a plurality of convolutional layers; the apparatus comprises:
a weight data acquisition module configured to acquire weight data and input image data of a target convolutional layer in the pre-trained convolutional neural network;
a weight data quantization module configured to map the weight data and the input image data of the target convolutional layer to a set threshold interval and respectively obtain a weight quantization value and a weight scaling factor corresponding to the weight data, and an image quantization value and an image scaling factor corresponding to the input image data;
an image feature acquisition module configured to obtain image feature data output by the target convolutional layer according to a convolution operation result of the weight quantization value and the image quantization value of the target convolutional layer, and the weight scaling factor and the image scaling factor;
wherein the image feature acquisition module comprises:
a multiplication result acquisition unit configured to obtain a multiplication result of the weight scaling factor, the image scaling factor and the convolution operation result;
an addition result acquisition unit configured to perform batch processing on the multiplication result and add the batch processing result to an offset to obtain an addition result;
and an image feature acquisition unit configured to operate on the addition result with an activation function to obtain the image feature data output by the target convolutional layer.
6. The image processing apparatus according to claim 5, wherein the weight data quantization module includes:
a data threshold determining unit configured to determine an upper limit threshold and a lower limit threshold of data to be mapped, the data to be mapped comprising the weight data and/or the input image data;
and a weight data quantization unit configured to perform quantization processing on the data to be mapped based on the upper limit threshold and the lower limit threshold to obtain the weight quantization value and the weight scaling factor corresponding to the weight data, and the image quantization value and the image scaling factor corresponding to the input image data.
7. The image processing apparatus according to claim 6, wherein the data threshold determining unit is further configured to determine the upper limit threshold and the lower limit threshold of the data to be mapped based on the numerical distribution range of the data to be mapped.
8. The image processing apparatus according to claim 6, wherein the weight data quantization unit is further configured to:
in response to determining that the value of the data to be mapped is greater than or equal to the upper limit threshold, determine the quantized value of the data to be mapped as the upper limit threshold;
in response to determining that the value of the data to be mapped is less than or equal to the lower limit threshold, determine the quantized value of the data to be mapped as the lower limit threshold;
and in response to determining that the value of the data to be mapped is between the upper limit threshold and the lower limit threshold, perform quantization processing on the data to be mapped based on a preset quantization algorithm, and take the quantization processing result as the quantized value of the data to be mapped.
9. An image processing electronic device, characterized in that feature extraction is performed on an input image using a pre-trained convolutional neural network, the convolutional neural network comprising a plurality of convolutional layers; the electronic device comprises:
a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to execute the instructions to implement the image processing method of any one of claims 1 to 4.
10. A storage medium storing instructions which, when executed by a processor of an image processing electronic device, cause the image processing electronic device to perform the image processing method of any one of claims 1 to 4.
CN201911280488.7A 2019-12-13 2019-12-13 Image processing method, device, equipment and storage medium Active CN111144457B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911280488.7A CN111144457B (en) 2019-12-13 2019-12-13 Image processing method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111144457A CN111144457A (en) 2020-05-12
CN111144457B (en) 2024-02-27

Family

ID=70518187

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911280488.7A Active CN111144457B (en) 2019-12-13 2019-12-13 Image processing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111144457B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114595799A (en) * 2020-11-30 2022-06-07 华为技术有限公司 Model training method and device
CN112613401A (en) * 2020-12-22 2021-04-06 贝壳技术有限公司 Face detection method and device, electronic equipment and storage medium
CN112766277A (en) * 2021-02-07 2021-05-07 普联技术有限公司 Channel adjustment method, device and equipment of convolutional neural network model
CN113382204A (en) * 2021-05-22 2021-09-10 特斯联科技集团有限公司 Intelligent processing method and device for fire-fighting hidden danger
CN114298280A (en) * 2021-12-29 2022-04-08 杭州海康威视数字技术股份有限公司 Data processing method, network training method, electronic device and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108304919A (en) * 2018-01-29 2018-07-20 百度在线网络技术(北京)有限公司 Method and apparatus for generating convolutional neural networks
CN108491927A (en) * 2018-03-16 2018-09-04 新智认知数据服务有限公司 A kind of data processing method and device based on neural network
CN109284761A (en) * 2018-09-04 2019-01-29 苏州科达科技股份有限公司 A kind of image characteristic extracting method, device, equipment and readable storage medium storing program for executing
CN109840589A (en) * 2019-01-25 2019-06-04 深兰人工智能芯片研究院(江苏)有限公司 A kind of method, apparatus and system running convolutional neural networks on FPGA
CN110414630A (en) * 2019-08-12 2019-11-05 上海商汤临港智能科技有限公司 The training method of neural network, the accelerated method of convolutional calculation, device and equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant