CN112927125A

CN112927125A - Data processing method and device, computer equipment and storage medium

Info

Publication number: CN112927125A
Application number: CN202110132573.XA
Authority: CN
Inventors: 周军; 周亮; 常亮; 王文强; 吴飞; 徐宁仪
Original assignee: University of Electronic Science and Technology of China; Chengdu Sensetime Technology Co Ltd
Current assignee: University of Electronic Science and Technology of China; Chengdu Sensetime Technology Co Ltd
Priority date: 2021-01-31
Filing date: 2021-01-31
Publication date: 2021-06-08
Anticipated expiration: 2041-01-31
Also published as: WO2022160706A1; CN112927125B

Abstract

The present disclosure provides a data processing method and apparatus, a computer device, and a storage medium. The method comprises: grouping a plurality of multiplier-adders in a multiplier-adder array based on a matrix operand operation step size to obtain at least one multiplier-adder group; and executing, in parallel, the data processing task corresponding to each multiplier-adder group by using each of the at least one multiplier-adder group. This enables the multiplier-adder array to process a plurality of data processing tasks simultaneously and improves the processing efficiency of the multiplier-adder array on the data processing tasks. In addition, because the multiplier-adder array is grouped based on the matrix operand operation step size, a multiplier-adder whose processing result would be invalid for one data processing task produces a valid processing result for another data processing task, which improves the utilization of the multiplier-adder array and reduces the waste of computing resources.

Description

Data processing method and device, computer equipment and storage medium

Technical Field

The present disclosure relates to the field of computer technologies, and in particular, to a data processing method and apparatus, a computer device, and a storage medium.

Background

At present, convolutional neural networks mainly rely on a multiplier-adder array to perform convolution processing. The image data to be processed in a data processing task is stored in a register array corresponding to the multiplier-adder array, and the image data moves within the register array in different data processing cycles. However, this data processing approach suffers from low utilization of the multiplier-adder array and wasted computing resources.

Disclosure of Invention

Embodiments of the present disclosure provide at least a data processing method and apparatus, a computer device, and a storage medium.

In a first aspect, an embodiment of the present disclosure provides a data processing method, including:

grouping a plurality of multiplier-adders in a multiplier-adder array based on a matrix operand operation step size to obtain at least one multiplier-adder group; and executing data processing tasks corresponding to each multiplier-adder group in parallel by utilizing each multiplier-adder group in the at least one multiplier-adder group.

Therefore, by grouping the multiplier-adder array, the multiplier-adder array can process a plurality of data processing tasks simultaneously, which improves the processing efficiency of the multiplier-adder array on the data processing tasks. In addition, because the multiplier-adder array is grouped based on the matrix operand operation step size, a multiplier-adder whose processing result would be invalid for one data processing task produces a valid processing result for another data processing task, which improves the utilization of the multiplier-adder array and reduces the waste of computing resources.

In one possible implementation, any two adjacent multiplier-adders of the same group in the same row of the multiplier-adder array are separated by the same non-zero number of multiplier-adders belonging to other groups, and any two adjacent multiplier-adders of the same group in the same column of the multiplier-adder array are likewise separated by the same non-zero number of multiplier-adders belonging to other groups.

Therefore, each multiplier-adder group can be guaranteed to process different data processing tasks based on the grouping condition of the multiplier-adder array, so that the multiplier-adder array can simultaneously process a plurality of data processing tasks, and the processing efficiency of the multiplier-adder array on the data processing tasks is improved.
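As an illustration of such an interleaved arrangement, the following minimal Python sketch assigns a group index to every position of a multiplier-adder array. The rule `(row % s_y) * s_x + (col % s_x)` and the names `assign_groups`, `s_x` and `s_y` are assumptions made for this example only; the disclosure does not prescribe this formula.

```python
def assign_groups(rows, cols, s_x, s_y):
    """Return a rows x cols grid of group indices.

    Illustrative rule (an assumption, not taken from the disclosure):
    positions that are s_x columns or s_y rows apart share a group.
    """
    return [[(r % s_y) * s_x + (c % s_x) for c in range(cols)]
            for r in range(rows)]


# 4 x 4 array with a step size of 2 in both directions
grid = assign_groups(4, 4, 2, 2)
for row in grid:
    print(row)
```

Under this assumed rule, a 4 x 4 array with a step size of 2 yields four groups, and two adjacent members of the same group in a row or column are always separated by exactly one member of another group.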

In one possible implementation, grouping the plurality of multiplier-adders in the multiplier-adder array based on the matrix operand operation step size includes: determining the number of multiplier-adder groups based on the matrix operand operation step size; and grouping the plurality of multiplier-adders in the multiplier-adder array based on the number of multiplier-adder groups.

Therefore, the processing result of each multiplier-adder group for its own data processing task is guaranteed to be valid, so that the multiplier-adder array can process a plurality of data processing tasks simultaneously, which improves the processing efficiency of the multiplier-adder array on the data processing tasks.

In one possible implementation, grouping the plurality of multiplier-adders in the multiplier-adder array based on the number of multiplier-adder groups includes: determining a first target multiplier-adder in each multiplier-adder group from the multiplier-adder array; and determining, from the multiplier-adder array, the other target multiplier-adders in each multiplier-adder group besides the first target multiplier-adder, based on the position of the first target multiplier-adder in the multiplier-adder array, the matrix operand operation step size, and the size of the multiplier-adder array.

Once the position of the first target multiplier-adder of each multiplier-adder group in the multiplier-adder array has been determined, the positions of the other target multiplier-adders of that group in the multiplier-adder array can be derived from it, which improves the grouping efficiency of the multiplier-adder array.

In one possible implementation, determining, from the multiplier-adder array, the other target multiplier-adders in each multiplier-adder group besides the first target multiplier-adder, based on the position of the first target multiplier-adder in the multiplier-adder array, the matrix operand operation step size, and the size of the multiplier-adder array, includes: for each multiplier-adder group, determining, based on the position of the first target multiplier-adder of the group in the multiplier-adder array and the matrix operand operation step size, a first positional relationship between each multiplier-adder in a row of the group, other than the first multiplier-adder in that row, and the adjacent previous multiplier-adder of the group in the multiplier-adder array; determining, based on the position of the first target multiplier-adder of the group in the multiplier-adder array, the matrix operand operation step size, and the number of columns of the multiplier-adder array, a second positional relationship between each multiplier-adder in a column of the group, other than the first multiplier-adder in that column, and the adjacent previous multiplier-adder of the group in the multiplier-adder array; and determining the target positions, in the multiplier-adder array, of the other target multiplier-adders of the group besides the first target multiplier-adder based on the first positional relationship and/or the second positional relationship.
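One way to picture the two positional relationships is with row-major indices into the multiplier-adder array. The sketch below assumes, for illustration only, that the next member of a group in the same row lies `s_x` indices further on, and that the next member in the same column lies `s_y * cols` indices further on (which is where the number of columns enters). The function name `group_members` and both offsets are hypothetical and are not quoted from the disclosure.

```python
def group_members(first, s_x, s_y, rows, cols):
    """List the row-major indices of one group, starting from its first member.

    Assumed relations (illustrative only):
      next member in the same row    = previous index + s_x
      next member in the same column = previous index + s_y * cols
    """
    first_row, first_col = divmod(first, cols)
    members = []
    row_start = first
    for _ in range(first_row, rows, s_y):       # walk down the column
        idx = row_start
        for _ in range(first_col, cols, s_x):   # walk along the row
            members.append(idx)
            idx += s_x
        row_start += s_y * cols
    return members


print(group_members(0, 2, 2, 4, 4))  # [0, 2, 8, 10]
print(group_members(5, 2, 2, 4, 4))  # [5, 7, 13, 15]
```

With these assumed offsets, only the position of the first member, the step size and the array size are needed to enumerate the whole group.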

In one possible embodiment, the determining the first target multiplier-adder in each multiplier-adder group from the multiplier-adder array includes: determining a target matrix based on the operation step size of the matrix operand and the size of the multiplier-adder array; and determining the position of the first target multiplier-adder in each multiplier-adder group in the multiplier-adder array according to the matrix element values of the target matrix.

In one possible implementation, executing, by each of the at least one multiplier-adder group, the data processing task corresponding to that multiplier-adder group includes: storing the image data to be processed corresponding to each multiplier-adder group into the register array corresponding to that group according to the position of each target multiplier-adder of the group in the multiplier-adder array; for each data processing cycle of a plurality of data processing cycles, reading, from the register array corresponding to each multiplier-adder group, the image data to be processed corresponding to that group in the data processing cycle, processing the read image data, and obtaining in parallel the data processing result of each multiplier-adder group in the data processing cycle; and completing the data processing tasks respectively corresponding to the multiplier-adder groups according to the data processing results respectively corresponding to the multiplier-adder groups in each data processing cycle.

Therefore, the multiplier-adder array reads corresponding operands in different data processing periods to ensure that each multiplier-adder group can process corresponding data processing tasks and ensure the validity of processing results of the multiplier-adder array on the data processing tasks.

In one possible implementation, storing the image data to be processed corresponding to each multiplier-adder group into the register array corresponding to that group according to the position of each target multiplier-adder of the group in the multiplier-adder array includes: determining the number of registers contained in the register array corresponding to each multiplier-adder according to the size of the matrix operand; for each multiplier-adder group, determining the position, in the corresponding register array, of the register that each target multiplier-adder of the group fixedly reads; and for each multiplier-adder group, storing the image data to be processed corresponding to the group into the corresponding register array according to the position of each target multiplier-adder of the group in the multiplier-adder array, the position of the register fixedly read by each target multiplier-adder of the group, and the order in which the operands contained in the image data to be processed are processed, so that in each data processing cycle the operand stored in the register fixedly read by each target multiplier-adder corresponds to the matrix element of the matrix operand used in that data processing cycle.

In one possible implementation, reading, for each data processing cycle of the plurality of data processing cycles, the image data to be processed corresponding to each multiplier-adder group in the data processing cycle from the register array corresponding to that group, processing the read image data, and obtaining in parallel the data processing result of each multiplier-adder group in the data processing cycle includes: for the first data processing cycle of processing the image data to be processed, controlling each target multiplier-adder in each multiplier-adder group to read its operand for the first data processing cycle, as a first operand, from the register that the target multiplier-adder fixedly reads; determining, as a second operand, the matrix element of the matrix operand that corresponds to each multiplier-adder group in the first data processing cycle; and determining the product of the first operand and the second operand of each target multiplier-adder in the first data processing cycle; and for each data processing cycle other than the first, controlling the image data to be processed to move by a preset step size within the register array according to a preset data movement pattern corresponding to the data processing cycle; controlling each target multiplier-adder in each multiplier-adder group to read its operand for the data processing cycle, as a first operand, from the register that the target multiplier-adder fixedly reads; determining, as a second operand, the matrix element of the matrix operand that corresponds to each multiplier-adder group in the data processing cycle; and determining the product of the first operand and the second operand of each target multiplier-adder in the data processing cycle.

Therefore, based on the preset step size and the preset data movement pattern, the operands shift in an orderly manner within the register array as the data processing cycle changes, which ensures that the corresponding multiplier-adders in the multiplier-adder array obtain valid data and that the processing result of the data processing task is valid.

In a possible implementation manner, the completing the data processing tasks corresponding to the multiplier-adder groups according to the data processing results corresponding to the multiplier-adder groups in each data processing cycle includes: for each target multiplier-adder in each multiplier-adder group, adding products obtained by the target multiplier-adder in each data processing period to obtain a sum; and finishing the data processing tasks corresponding to the multiplier-adder groups respectively based on the sum values corresponding to the target multiplier-adders respectively contained in each multiplier-adder group.
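The per-cycle read, multiply and accumulate behaviour described above can be sketched in software as follows. This is a simplified model under stated assumptions: a single group, a square kernel, a step size of 1, and a data movement that is recomputed from the original image in each cycle rather than shifted incrementally. The names `shift` and `shift_and_accumulate` are hypothetical and do not come from the disclosure.

```python
def shift(data, up, left):
    """Return a copy of `data` moved `up` rows and `left` columns, zero-filled."""
    h, w = len(data), len(data[0])
    return [[data[r + up][c + left] if r + up < h and c + left < w else 0
             for c in range(w)]
            for r in range(h)]


def shift_and_accumulate(image, kernel):
    """Accumulate one product per cycle in every multiplier-adder (square kernel assumed)."""
    h, w, k = len(image), len(image[0]), len(kernel)
    sums = [[0] * w for _ in range(h)]
    for i in range(k):
        for j in range(k):
            registers = shift(image, i, j)   # data movement for this cycle
            weight = kernel[i][j]            # second operand, shared in this cycle
            for r in range(h):
                for c in range(w):
                    # the unit at (r, c) always reads register (r, c): first operand
                    sums[r][c] += registers[r][c] * weight
    # keep only positions where the kernel window lies fully inside the image
    return [row[:w - k + 1] for row in sums[:h - k + 1]]


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernel = [[1, 0], [0, 1]]
print(shift_and_accumulate(image, kernel))  # [[6, 8], [12, 14]]
```

Each multiplier-adder keeps reading the same register while the data moves underneath it, so the sum it holds after all cycles is the convolution result for its own output position.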

In one possible implementation, the data processing task includes: a convolution processing task; and the convolution processing tasks of different multiplier-adder groups correspond to different images to be processed.

Therefore, the multiplier-adder array can process a plurality of images to be processed simultaneously, and the processing efficiency of the multiplier-adder array on the images to be processed is improved.

In a second aspect, an embodiment of the present disclosure further provides a data processing apparatus, including: a controller; the controller is configured to:

grouping a plurality of multiplier-adders in a multiplier-adder array based on a matrix operand operation step size to obtain at least one multiplier-adder group;

and executing data processing tasks corresponding to each multiplier-adder group in parallel by utilizing each multiplier-adder group in the at least one multiplier-adder group.

In one possible implementation, when grouping the plurality of multiplier-adders in the multiplier-adder array based on the matrix operand operation step size, the controller is specifically configured to: determine the number of multiplier-adder groups based on the matrix operand operation step size; and group the plurality of multiplier-adders in the multiplier-adder array based on the number of multiplier-adder groups.

In one possible implementation, when grouping the plurality of multiplier-adders in the multiplier-adder array based on the number of multiplier-adder groups, the controller is specifically configured to: determine a first target multiplier-adder in each multiplier-adder group from the multiplier-adder array; and determine, from the multiplier-adder array, the other target multiplier-adders in each multiplier-adder group besides the first target multiplier-adder, based on the position of the first target multiplier-adder in the multiplier-adder array, the matrix operand operation step size, and the size of the multiplier-adder array.

In one possible implementation, when determining, from the multiplier-adder array, the other target multiplier-adders in each multiplier-adder group besides the first target multiplier-adder, based on the position of the first target multiplier-adder in the multiplier-adder array, the matrix operand operation step size, and the size of the multiplier-adder array, the controller is specifically configured to: for each multiplier-adder group, determine, based on the position of the first target multiplier-adder of the group in the multiplier-adder array and the matrix operand operation step size, a first positional relationship between each multiplier-adder in a row of the group, other than the first multiplier-adder in that row, and the adjacent previous multiplier-adder of the group in the multiplier-adder array; determine, based on the position of the first target multiplier-adder of the group in the multiplier-adder array, the matrix operand operation step size, and the number of columns of the multiplier-adder array, a second positional relationship between each multiplier-adder in a column of the group, other than the first multiplier-adder in that column, and the adjacent previous multiplier-adder of the group in the multiplier-adder array; and determine the target positions, in the multiplier-adder array, of the other target multiplier-adders of the group besides the first target multiplier-adder based on the first positional relationship and/or the second positional relationship.

In a possible implementation, when determining a first target multiplier-adder in each multiplier-adder group from the multiplier-adder array, the controller is specifically configured to determine a target matrix based on the matrix operand operation step size, the size of the multiplier-adder array; and determining the position of the first target multiplier-adder in each multiplier-adder group in the multiplier-adder array according to the matrix element values of the target matrix.

In one possible implementation, when executing, by each of the at least one multiplier-adder group, the data processing task corresponding to that multiplier-adder group, the controller is specifically configured to: store the image data to be processed corresponding to each multiplier-adder group into the register array corresponding to that group according to the position of each target multiplier-adder of the group in the multiplier-adder array; for each data processing cycle of a plurality of data processing cycles, read, from the register array corresponding to each multiplier-adder group, the image data to be processed corresponding to that group in the data processing cycle, process the read image data, and obtain in parallel the data processing result of each multiplier-adder group in the data processing cycle; and complete the data processing tasks respectively corresponding to the multiplier-adder groups according to the data processing results respectively corresponding to the multiplier-adder groups in each data processing cycle.

In one possible implementation, when storing the image data to be processed corresponding to each multiplier-adder group into the register array corresponding to that group according to the position of each target multiplier-adder of the group in the multiplier-adder array, the controller is specifically configured to: determine the number of registers contained in the register array corresponding to each multiplier-adder according to the size of the matrix operand; for each multiplier-adder group, determine the position, in the corresponding register array, of the register that each target multiplier-adder of the group fixedly reads; and for each multiplier-adder group, store the image data to be processed corresponding to the group into the corresponding register array according to the position of each target multiplier-adder of the group in the multiplier-adder array, the position of the register fixedly read by each target multiplier-adder of the group, and the order in which the operands contained in the image data to be processed are processed, so that in each data processing cycle the operand stored in the register fixedly read by each target multiplier-adder corresponds to the matrix element of the matrix operand used in that data processing cycle.

In one possible implementation, when reading, for each data processing cycle of the plurality of data processing cycles, the image data to be processed corresponding to each multiplier-adder group in the data processing cycle from the register array corresponding to that group, processing the read image data, and obtaining in parallel the data processing result of each multiplier-adder group in the data processing cycle, the controller is specifically configured to: for the first data processing cycle of processing the image data to be processed, control each target multiplier-adder in each multiplier-adder group to read its operand for the first data processing cycle, as a first operand, from the register that the target multiplier-adder fixedly reads; determine, as a second operand, the matrix element of the matrix operand that corresponds to each multiplier-adder group in the first data processing cycle; and determine the product of the first operand and the second operand of each target multiplier-adder in the first data processing cycle; and for each data processing cycle other than the first, control the image data to be processed to move by a preset step size within the register array according to a preset data movement pattern corresponding to the data processing cycle; control each target multiplier-adder in each multiplier-adder group to read its operand for the data processing cycle, as a first operand, from the register that the target multiplier-adder fixedly reads; determine, as a second operand, the matrix element of the matrix operand that corresponds to each multiplier-adder group in the data processing cycle; and determine the product of the first operand and the second operand of each target multiplier-adder in the data processing cycle.

In a possible embodiment, when the data processing tasks corresponding to the multiplier-adder groups are completed according to the data processing results corresponding to the multiplier-adder groups in each data processing cycle, the controller is specifically configured to add, for each target multiplier-adder in each multiplier-adder group, the products obtained by the target multiplier-adder in each data processing cycle to obtain a sum; and finishing the data processing tasks corresponding to the multiplier-adder groups respectively based on the sum values corresponding to the target multiplier-adders respectively contained in each multiplier-adder group.

In a third aspect, the present disclosure further provides a computer device including a controller and a memory, where the memory stores machine-readable instructions executable by the controller, the controller is configured to execute the machine-readable instructions stored in the memory, and the machine-readable instructions, when executed by the controller, perform the steps in the first aspect or any one of the possible implementations of the first aspect.

In a fourth aspect, this disclosure also provides a computer-readable storage medium having a computer program stored thereon, where the computer program is executed to perform the steps in the first aspect or any one of the possible implementation manners of the first aspect.

For the description of the effects of the data processing apparatus, the computer device, and the computer-readable storage medium, reference is made to the description of the data processing method, which is not repeated herein.

In order to make the aforementioned objects, features and advantages of the present disclosure more comprehensible, preferred embodiments accompanied with figures are described in detail below.

Drawings

In order to illustrate the technical solutions of the embodiments of the present disclosure more clearly, the drawings required by the embodiments are briefly described below. The drawings are incorporated in and form a part of the specification; they illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the technical solutions of the present disclosure. It should be understood that the following drawings depict only certain embodiments of the present disclosure and are therefore not to be considered limiting of its scope; those skilled in the art can derive other related drawings from them without inventive effort.

Fig. 1 shows a flowchart of a data processing method provided by an embodiment of the present disclosure;

Fig. 2 shows an example diagram of a multiplier-adder array provided by an embodiment of the present disclosure;

Fig. 3 shows an example diagram of movement according to a matrix operand operation step size provided by an embodiment of the present disclosure;

Fig. 4 shows an example diagram of a multiplier-adder array divided into four multiplier-adder groups provided by an embodiment of the present disclosure;

Fig. 5 shows an example diagram of a matrix for determining the position of the first target multiplier-adder of each multiplier-adder group in the multiplier-adder array provided by an embodiment of the present disclosure;

Fig. 6 shows an example diagram of a multiplier-adder array and a corresponding register array provided by an embodiment of the present disclosure;

Fig. 7 shows an example diagram of a register array after the image data to be processed in the register array has been shifted left as a whole by one step, provided by an embodiment of the present disclosure;

Fig. 8 shows a schematic diagram of a data processing apparatus provided by an embodiment of the present disclosure;

Fig. 9 shows a schematic diagram of a computer device provided by an embodiment of the present disclosure.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present disclosure more clear, the technical solutions of the embodiments of the present disclosure will be described clearly and completely with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are only a part of the embodiments of the present disclosure, not all of the embodiments. The components of embodiments of the present disclosure, as generally described and illustrated herein, may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present disclosure is not intended to limit the scope of the disclosure, as claimed, but is merely representative of selected embodiments of the disclosure. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the disclosure without making creative efforts, shall fall within the protection scope of the disclosure.

It has been found through research that convolutional neural networks rely primarily on multiplier-adder arrays for convolution processing. During convolution processing, the image data to be processed is stored in a register array connected to the multiplier-adder array, and the stored image data can move within the register array in different data processing cycles. In each data processing cycle, the multiplier-adder array reads the operands of that cycle from the registers (belonging to the register array) connected to it and performs multiplication and/or addition operations. After a plurality of data processing cycles, the multiplier-adder array outputs a partial result of the convolution processing of the image data to be processed. When the matrix operand operation step size is greater than 1, the processing results of some of the multiplier-adders in the multiplier-adder array are not needed in the result of processing the image data to be processed; in this case the data processing approach therefore suffers from low utilization of the multiplier-adder array and wasted computing resources.
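The utilization problem can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that each multiplier-adder of a rows x cols array computes one step-size-1 output position and that only every `s_y`-th row and every `s_x`-th column of those positions is needed. The function name `useful_fraction` and this counting model are assumptions and are not taken from the disclosure.

```python
import math


def useful_fraction(rows, cols, s_x, s_y):
    """Fraction of multiplier-adders whose results are needed (assumed model)."""
    useful = math.ceil(rows / s_y) * math.ceil(cols / s_x)
    return useful / (rows * cols)


print(useful_fraction(4, 4, 1, 1))  # 1.0
print(useful_fraction(4, 4, 2, 2))  # 0.25
```

Under this model, a 4 x 4 array working with a step size of 2 in both directions would need the results of only 4 of its 16 multiplier-adders for a single image.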

Based on the above research, the present disclosure provides a data processing method and apparatus, a computer device, and a storage medium. The multiplier-adder array is grouped based on the matrix operand operation step size to obtain at least one multiplier-adder group, and different multiplier-adder groups in the multiplier-adder array process, in parallel, the data processing tasks corresponding to different image data to be processed. In other words, the same multiplier-adder array can process multiple pieces of image data to be processed simultaneously, with each multiplier-adder group processing one of them, so that multiplier-adders that would be unused while processing one piece of image data are used to process another. This improves the utilization of the multiplier-adder array, reduces the waste of computing resources, and improves the processing efficiency of the multiplier-adder array on the image data to be processed.

The drawbacks described above were identified by the inventors through practice and careful study; therefore, both the discovery of these problems and the solutions proposed below in the present disclosure should be regarded as contributions made by the inventors in the course of the present disclosure.

It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.

To facilitate understanding of the present embodiment, first, a data processing method disclosed in the embodiments of the present disclosure is described in detail, where an execution subject of the data processing method provided in the embodiments of the present disclosure is generally a computer device with certain computing capability, and the computer device includes, for example: a terminal device, which may be a User Equipment (UE), a mobile device, a User terminal, a cellular phone, a cordless phone, a Personal Digital Assistant (PDA), a handheld device, a computing device, a vehicle mounted device, a wearable device, or a server or other processing device. In some possible implementations, the data processing method may be implemented by a processor calling computer readable instructions stored in a memory.

The following describes a data processing method provided by the embodiments of the present disclosure.

Referring to fig. 1, a flowchart of a data processing method provided by the embodiment of the present disclosure is shown, where the method includes steps S101 to S102, where:

S101: grouping a plurality of multiplier-adders in a multiplier-adder array based on a matrix operand operation step size to obtain at least one multiplier-adder group;

S102: executing, in parallel, the data processing task corresponding to each multiplier-adder group by using each of the at least one multiplier-adder group.

In the embodiments of the present disclosure, the multiplier-adder array is grouped based on the matrix operand operation step size to obtain at least one multiplier-adder group, and each multiplier-adder group of the at least one multiplier-adder group executes its corresponding data processing task in parallel. Because the data processing tasks processed by the multiplier-adder groups are different, the multiplier-adder array can process a plurality of data processing tasks simultaneously, which improves the processing efficiency of the multiplier-adder array for data processing tasks.

In addition, with this processing mode, a multiplier-adder that is not used in the processing of one piece of image data to be processed is used to process another piece of image data to be processed, which improves the utilization rate of the multiplier-adder array and reduces the waste of computing resources.

The following describes the details of S101 to S102.

For S101 above, the multiplier-adder array is a matrix array composed of at least one multiplier-adder. As an example, fig. 2 shows an exemplary diagram of a multiplier-adder array provided by the present disclosure, which includes 4 rows and 4 columns, for a total of 16 multiplier-adders. The matrix operand includes, for example, a convolution kernel used when processing image data to be processed; the matrix operand operation step size is, for example, the moving step size of the convolution kernel. Illustratively, a convolution kernel moving with a step size of 2, as in fig. 3, means S_x = 2 and S_y = 2. For example, the kernel moves from the first target position shown in a to the second target position shown in b, and then from the second target position shown in b to the third target position shown in c; that is, it moves two pixels at a time in the transverse direction and two pixels at a time in the longitudinal direction. Here, S_x represents the number of pixels moved in the transverse direction, and S_y represents the number of pixels moved in the longitudinal direction.

In grouping the multiplier-adders in the multiplier-adder array, for example, the number of multiplier-adder groups may be determined based on the matrix operand operation step size, and the plurality of multiplier-adders in the multiplier-adder array may be grouped based on the number of multiplier-adder groups.

In a specific implementation, the matrix operand operation step size and the number of multiplier-adder groups are related as follows: the number of multiplier-adder groups is S_x*S_y. For example, when the matrix operand operation step size is 2, that is, S_x = 2 and S_y = 2, the number of multiplier-adder groups is S_x*S_y = 2*2 = 4.

In a specific implementation, an embodiment of the present disclosure provides a specific method for grouping a plurality of multiplier-adders in a multiplier-adder array based on a matrix operand operation step size to obtain at least one multiplier-adder group, including:

determining a first target multiplier-adder in each multiplier-adder group from the multiplier-adder array;

determining, from the multiplier-adder array, the other target multiplier-adders in each multiplier-adder group other than the first target multiplier-adder, based on the position of the first target multiplier-adder in the multiplier-adder array, the matrix operand operation step size, and the size of the multiplier-adder array.

In some cases, the size of the multiplier-adder array is fixed, whereas the size of the image data to be processed may differ according to the actual image processing situation. Therefore, even if the data processing method provided by the embodiments of the present disclosure is used to process a plurality of image data to be processed in parallel, the utilization rate of the multiplier-adder array may not reach one hundred percent in many cases. For this reason, in the embodiments of the present disclosure, the size information of the multiplier-adder array actually used is first determined based on the matrix operand operation step size and the size information of the multiplier-adder array. The size information of the multiplier-adder array includes the number of rows and the number of columns of the multiplier-adder array, and the size information of the multiplier-adder array actually used includes the number of rows and the number of columns of the multiplier-adder array actually used. The relationship among the size information of the multiplier-adder array actually used, the matrix operand operation step size, and the size information of the multiplier-adder array is: A'_x = A_x - A_x % S_x; A'_y = A_y - A_y % S_y. Here, A_x is the number of columns of the multiplier-adder array, A_y is the number of rows of the multiplier-adder array, A'_x is the number of columns of the multiplier-adder array actually used, A'_y is the number of rows of the multiplier-adder array actually used, and % is the remainder operation. Illustratively, when the matrix operand operation step size is 2, that is, S_x = 2 and S_y = 2, and the size information of the multiplier-adder array is A_x = 5, A_y = 5, the number of columns of the multiplier-adder array actually used is A'_x = A_x - A_x % S_x = 5 - 5 % 2 = 4, and the number of rows of the multiplier-adder array actually used is A'_y = A_y - A_y % S_y = 5 - 5 % 2 = 4.
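The size relationship above can be illustrated with a minimal Python sketch. The function names `used_array_size` and `group_count` are hypothetical and chosen only for this illustration; they are not part of the disclosure.

```python
from typing import Tuple


def used_array_size(a_x: int, a_y: int, s_x: int, s_y: int) -> Tuple[int, int]:
    """Return (A'_x, A'_y): columns and rows of the array actually used."""
    if s_x <= 0 or s_y <= 0:
        raise ValueError("step sizes must be positive")
    # Drop the trailing columns/rows that do not fill a whole step.
    return a_x - a_x % s_x, a_y - a_y % s_y


def group_count(s_x: int, s_y: int) -> int:
    """Number of multiplier-adder groups, S_x * S_y."""
    return s_x * s_y


print(used_array_size(5, 5, 2, 2))  # (4, 4)
print(group_count(2, 2))            # 4
```

With a 5 x 5 array and a step size of 2, one row and one column are left unused, which matches the worked example in the text.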

The first target multiplier-adder for each multiplier-adder group is then determined in the multiplier-adder array actually used.

In a specific implementation, the first target multiplier-adder of each multiplier-adder group may be determined, for example, in the following manner:

determining a target matrix based on the matrix operand operation step size and the size of the multiplier-adder array; and determining the position of the first target multiplier-adder of each multiplier-adder group in the multiplier-adder array according to the matrix element values of the target matrix, wherein each matrix element of the target matrix corresponds to a multiplier-adder.

Illustratively, the size information of the multiplier-adder array is 4 rows and 4 columns. When the matrix operand operation step size is 2, that is, S_x = 2 and S_y = 2, the number of multiplier-adder groups is S_x*S_y = 2*2 = 4, and the target matrix includes two rows and two columns, for a total of 4 multiplier-adders. The first multiplier-adder in the multiplier-adder array is used as the first multiplier-adder of the target matrix, that is, as the first target multiplier-adder of the first multiplier-adder group, and the first target multiplier-adders of the other multiplier-adder groups in the target matrix are determined based on the first multiplier-adder of the target matrix. For example, the positions in the multiplier-adder array actually used are numbered as

$$\begin{pmatrix} 0 & 1 & 2 & 3 \\ 4 & 5 & 6 & 7 \\ 8 & 9 & 10 & 11 \\ 12 & 13 & 14 & 15 \end{pmatrix}$$

The first target multiplier-adder of the first multiplier-adder group, that is, the first multiplier-adder of the target matrix, is at position 0. The position numbers of the two-row, two-column target matrix determined from that first target multiplier-adder at position 0 in the multiplier-adder array actually used are

$$\begin{pmatrix} 0 & 1 \\ 4 & 5 \end{pmatrix}$$

That is, the first target multiplier-adders of the other three multiplier-adder groups are at positions 1, 4, and 5, respectively, in the multiplier-adder array actually used, as in the target matrix shown in fig. 4.

For example, the target position of the first target multiplier-adder of each multiplier-adder group can be determined by referring to the formula at each position of the matrix shown in fig. 5; that is, each matrix element in the matrix shown in fig. 5 represents the position of the first target multiplier-adder of one multiplier-adder group. Here, A'_x is the number of columns of the multiplier-adder array actually used and A'_y is the number of rows of the multiplier-adder array actually used, with A'_x = A_x - A_x % S_x and A'_y = A_y - A_y % S_y; A_x is the number of columns of the multiplier-adder array and A_y is the number of rows of the multiplier-adder array; S_x is the transverse moving step size of the matrix operand operation step size, and S_y is the longitudinal moving step size of the matrix operand operation step size.
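The matrix of fig. 5 is not reproduced in this text, so the following Python sketch relies on an assumption: that the first target multiplier-adder of the group in row i, column j of the target matrix sits at position i * A'_x + j. This assumed formula agrees with the 4 x 4 example, where the first target multiplier-adders are at positions 0, 1, 4, and 5, but it is not confirmed by the text as the formula of fig. 5. The function name `first_target_positions` is hypothetical.

```python
from typing import List


def first_target_positions(used_cols: int, s_x: int, s_y: int) -> List[List[int]]:
    """Target matrix of first-target positions (assumed formula i * A'_x + j)."""
    return [[i * used_cols + j for j in range(s_x)] for i in range(s_y)]


print(first_target_positions(4, 2, 2))  # [[0, 1], [4, 5]]
```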

After the first target multiplier-adder of each multiplier-adder group is determined in the multiplier-adder array in actual use, the other target multiplier-adders in each multiplier-adder group except the first target multiplier-adder may be determined, for example, based on the method described in the following steps one to three:

Step one: for each multiplier-adder group, determining a first position relation between each multiplier-adder other than the first multiplier-adder of each row in the multiplier-adder group and the adjacent previous multiplier-adder in the multiplier-adder array, based on the position of the first multiplier-adder in the multiplier-adder array and the matrix operand operation step size;

wherein the first position relation between each multiplier-adder other than the first multiplier-adder of a row and the adjacent previous multiplier-adder in the multiplier-adder array is as follows: position of the adjacent previous multiplier-adder + S_x = position of the multiplier-adder other than the first multiplier-adder in the row.

Illustratively, the multiplier-adder array actually used has 4 rows and 4 columns, that is, A'_y = 4 and A'_x = 4, and its positions are numbered as

$$\begin{pmatrix} 0 & 1 & 2 & 3 \\ 4 & 5 & 6 & 7 \\ 8 & 9 & 10 & 11 \\ 12 & 13 & 14 & 15 \end{pmatrix}$$

When the matrix operand operation step size is 2, that is, S_x = 2 and S_y = 2, the four different colors shown in fig. 4 represent four multiplier-adder groups: a first multiplier-adder group in black, a second multiplier-adder group in white, a third multiplier-adder group in light gray, and a fourth multiplier-adder group in dark gray. Taking the first multiplier-adder group as an example, if the first multiplier-adder of the first multiplier-adder group is at position 0, then the position of the other multiplier-adder A of the same group in that row is 0 + S_x = 0 + 2 = 2, and the position of the next multiplier-adder B after position 2 in that row would be 2 + S_x = 2 + 2 = 4. However, since the multiplier-adder array actually used has 4 columns and the maximum position number in that row is 3, position 4 falls outside the row; therefore, in that row, the only multiplier-adder in the same group as the one at position 0 is the multiplier-adder at position 2.

Step two: determining a second position relation of each multiplier-adder except the first multiplier-adder in each column in the multiplier-adder group and an adjacent previous multiplier-adder in the multiplier-adder array based on the position of the first multiplier-adder in the multiplier-adder array, the operation step size of a matrix operand and the number of columns in the multiplier-adder array;

wherein the second position relation between each multiplier-adder other than the first multiplier-adder of a column in the multiplier-adder group and the adjacent previous multiplier-adder in the multiplier-adder array is as follows: position of the adjacent previous multiplier-adder in the multiplier-adder array + S_y*A'_x = position of the multiplier-adder other than the first multiplier-adder in the column.

When the matrix operand operation step size is 2, that is, S_x = 2 and S_y = 2, as shown in fig. 4 and taking the first multiplier-adder group as an example, if the first multiplier-adder of the first multiplier-adder group is at position 0, the position of the other multiplier-adder C of the same group in that column is 0 + S_y*A'_x = 0 + 2*4 = 8, and the position of the next multiplier-adder D after position 8 in that column would be 8 + S_y*A'_x = 8 + 2*4 = 16. However, since the multiplier-adder array actually used has 4 rows and the maximum position number in that column is 12, position 16 falls outside the column; therefore, in that column, the only multiplier-adder in the same group as the one at position 0 is the multiplier-adder at position 8.
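The two position relations of steps one and two can be sketched in Python as follows. This is only an illustration under the row-major position numbering used in the example; the function names and the parameters `used_cols` (standing for A'_x) and `used_rows` (standing for A'_y) are hypothetical.

```python
from typing import List


def same_group_in_row(start: int, s_x: int, used_cols: int) -> List[int]:
    """Positions reached from `start` by repeatedly adding S_x within its row."""
    row_max = (start // used_cols) * used_cols + used_cols - 1
    positions = []
    pos = start + s_x
    while pos <= row_max:  # stop once the position leaves the row
        positions.append(pos)
        pos += s_x
    return positions


def same_group_in_column(start: int, s_y: int, used_cols: int, used_rows: int) -> List[int]:
    """Positions reached from `start` by repeatedly adding S_y * A'_x within its column."""
    col_max = (used_rows - 1) * used_cols + start % used_cols
    positions = []
    pos = start + s_y * used_cols
    while pos <= col_max:  # stop once the position leaves the column
        positions.append(pos)
        pos += s_y * used_cols
    return positions


print(same_group_in_row(0, 2, 4))        # [2]
print(same_group_in_column(0, 2, 4, 4))  # [8]
```

For the first multiplier-adder at position 0, the sketch returns position 2 for the row and position 8 for the column, as in the worked example.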

Step three: determining the target positions, in the multiplier-adder array, of the other target multiplier-adders in the multiplier-adder group other than the first target multiplier-adder, based on the first position relation and/or the second position relation.

Illustratively, the multiplier-adder array actually used has 4 rows and 4 columns, that is, A'_y = 4 and A'_x = 4, and its positions are numbered as

$$\begin{pmatrix} 0 & 1 & 2 & 3 \\ 4 & 5 & 6 & 7 \\ 8 & 9 & 10 & 11 \\ 12 & 13 & 14 & 15 \end{pmatrix}$$

When the matrix operand operation step size is 2, that is, S_x = 2 and S_y = 2, after the position of the first multiplier-adder of each multiplier-adder group in each row or column has been calculated, the target positions in the multiplier-adder array of the other target multiplier-adders in the multiplier-adder group other than the first target multiplier-adder are calculated by referring to the above formulas for the position of the adjacent multiplier-adder of the same group in the same row or in the same column. As shown in fig. 4, taking the first multiplier-adder group as an example, the first multiplier-adder of the first multiplier-adder group is at position 0, the next multiplier-adder A of the same group in the same row is at position 2, and the next multiplier-adder E in the same column as multiplier-adder A is at position 2 + S_y*A'_x = 2 + 2*4 = 2 + 8 = 10. Alternatively, since the next multiplier-adder C in the same column as the multiplier-adder at position 0 is at position 8, the next multiplier-adder E in the same row as multiplier-adder C is at position 8 + S_x = 8 + 2 = 10.

Illustratively, fig. 4 is an exemplary diagram provided by the present disclosure of a multiplier-adder array divided into four multiplier-adder groups, in which four different colors represent the four multiplier-adder groups: a first multiplier-adder group in black, a second multiplier-adder group in white, a third multiplier-adder group in light gray, and a fourth multiplier-adder group in dark gray. In each row of the multiplier-adder array, any two adjacent multiplier-adders of the same group are separated by the same, non-zero number of multiplier-adders of other groups, and the same holds in each column of the multiplier-adder array.
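The interleaved grouping described above can be sketched in Python as follows. The sketch does not walk the position relations step by step; instead it uses an equivalent modular-arithmetic shortcut, and it assumes row-major position numbering and that the first target multiplier-adder of each group lies in the top-left S_y x S_x block of the array. The function name `group_positions` is hypothetical.

```python
from typing import Dict, List


def group_positions(a_x: int, a_y: int, s_x: int, s_y: int) -> Dict[int, List[int]]:
    """Map each group's first-target position to all positions in that group."""
    used_cols = a_x - a_x % s_x  # A'_x
    used_rows = a_y - a_y % s_y  # A'_y
    groups: Dict[int, List[int]] = {}
    for row in range(used_rows):
        for col in range(used_cols):
            # Positions whose row and column offsets agree modulo the step
            # sizes belong to the same group.
            first = (row % s_y) * used_cols + (col % s_x)
            groups.setdefault(first, []).append(row * used_cols + col)
    return groups


for first, members in group_positions(4, 4, 2, 2).items():
    print(first, members)
# 0 [0, 2, 8, 10]
# 1 [1, 3, 9, 11]
# 4 [4, 6, 12, 14]
# 5 [5, 7, 13, 15]
```

In this 4 x 4 example every group receives four multiplier-adders, and two neighboring members of a group are always separated by one multiplier-adder of another group in both directions.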

For the above S102, the to-be-processed images corresponding to the convolution processing tasks of different multiplier-adder groups are different, for example, each multiplier-adder group performs convolution on a different data matrix.

When a data processing task corresponding to each multiplier-adder group is executed in parallel by using each multiplier-adder group in at least one multiplier-adder group, the image data to be processed corresponding to each multiplier-adder group is stored into the register array corresponding to each multiplier-adder group according to the position of each target multiplier-adder in each multiplier-adder group in the multiplier-adder array.

Here, the image data to be processed includes, for example, at least one of:

an original image to be processed;

a subgraph corresponding to any color channel in the original image to be processed;

a feature map obtained by performing feature extraction on the original image;

a feature subgraph corresponding to at least one channel in a feature map obtained by performing feature extraction on the original image;

image data obtained by performing data filling processing on a subgraph corresponding to at least one color channel in the original image;

and image data obtained by performing data filling processing on a feature subgraph corresponding to at least one channel of the feature map.

Taking the feature map as the image data to be processed as an example, when the image data to be processed is stored in the register array, the feature value of a feature point in the image data to be processed, which is also called an operand required by the multiplier-adder, is stored in each register of at least a part of the registers.

For each multiplier-adder group, the position of the register that each target multiplier-adder of the group fixedly reads is determined in the corresponding register array. As shown in fig. 6, the multiplier-adder array in n includes four multiplier-adder groups, which correspond to the four register arrays shown in m: the black multiplier-adder group corresponds to the black register array A, the white multiplier-adder group corresponds to the white register array B, the light gray multiplier-adder group corresponds to the light gray register array C, and the dark gray multiplier-adder group corresponds to the dark gray register array D. Each target multiplier-adder fixedly reads one register of its corresponding register array: the target multiplier-adder PE0 reads the feature value stored in register A0, PE1 reads the feature value stored in register B0, PE2 reads the feature value stored in register A2, PE3 reads the feature value stored in register B2, PE4 reads the feature value stored in register C0, and so on; the remaining target multiplier-adders likewise each read the feature value stored in the register that they fixedly read in the register array of their group.

And for each multiplier-adder group, storing the image data to be processed corresponding to the multiplier-adder group into the register array corresponding to the multiplier-adder group according to the position of each target multiplier-adder in the multiplier-adder group in the multiplier-adder array, the position of a register fixedly read by the target multiplier-adder in the multiplier-adder group and the processing sequence of operands contained in the image data to be processed in the data processing process, so that the operands stored in the positions of the registers fixedly read by each target multiplier-adder correspond to matrix elements in the matrix operands of the corresponding processing period in each data processing period.

Here, the matrix operand includes, for example, a convolution kernel, that is, a data matrix, in a convolution calculation. Illustratively, a two-row, two-column matrix operand comprises the matrix elements W₀, W₁, W₂, W₃. The number of operands contained in the image data to be processed corresponding to each multiplier-adder group is the same.

For the image data to be processed corresponding to the first multiplier-adder group shown in fig. 6, the storage rule in the black register array corresponding to the first multiplier-adder group is shown as a in fig. 6.

For the image data to be processed corresponding to the second multiplier-adder group, the storage rule in the white register array corresponding to the second multiplier-adder group is shown as b in fig. 6.

For the image data to be processed corresponding to the third multiplier-adder group, the storage rule in the light gray register array corresponding to the third multiplier-adder group is shown as c in fig. 6.

For the image data to be processed corresponding to the fourth multiplier-adder group, the storage rule in the dark gray register array corresponding to the fourth multiplier-adder group is shown as d in fig. 6.

After the image data to be processed corresponding to each multiplier-adder group has been stored into the register array corresponding to that group, for each data processing period of a plurality of data processing periods, the image data to be processed corresponding to each multiplier-adder group in that data processing period is read from the register array corresponding to the group; the read image data to be processed is processed, and the data processing results of the multiplier-adder groups in that data processing period are obtained in parallel.

For the first data processing period for processing the image data to be processed, each target multiplier-adder in each multiplier-adder group is controlled to read its operand for the first data processing period from the register that it fixedly reads, as a first operand; the matrix element of the matrix operand corresponding to each multiplier-adder group in the first data processing period is determined as a second operand; and the product of the first operand and the second operand of each target multiplier-adder in the first data processing period is determined.

For example, the target multiplier-adder PE0 reads the operand a0 from the register A0 that it fixedly reads in the corresponding register array, the target multiplier-adder PE1 reads the operand b0 from the register B0, and so on; the operands read by the other target multiplier-adders are not described again here. Assume that the matrix operand comprises the matrix elements W₀, W₁, W₂, W₃. Taking PE0 as an example, after reading the operand a0, PE0 takes a0 as the first operand; the matrix element corresponding to this data processing cycle is W₀, which is taken as the second operand; PE0 then calculates W₀*a0 and stores the result in a register.

For a non-first data processing period for processing the image data to be processed, controlling the image data to be processed to move a preset step length in the register array according to a preset data moving mode corresponding to the data processing period; controlling each target multiplier-adder in each multiplier-adder group, and respectively reading the operand of each target multiplier-adder in the non-first data processing period from the register fixedly read by each target multiplier-adder as a first operand; determining matrix elements of each multiplier-adder group in a matrix operand corresponding to the data processing period as a second operand; the product of the first operand and the second operand of each target multiplier-adder in the data processing period is respectively determined.

For example, taking the second data processing period corresponding to the multiplier-adder groups shown in fig. 6 as an example, the data is moved left with a step size of 1. Fig. 7 is an example diagram of register array A after the image data to be processed has been shifted left by one step as a whole in the register array in the embodiment of the present disclosure: PE0 reads a1 from A0, PE2 reads a3 from A2, and so on; the operands read by the other multiplier-adders follow the same pattern and are not described again. Taking PE0 as an example with the matrix operand described above, after reading the operand a1, PE0 takes a1 as the first operand; the matrix element corresponding to this data processing cycle is W₁, which is taken as the second operand; PE0 then calculates W₁*a1 and stores the result in its own register.

Similarly, in the third data processing cycle, the data to be processed may be shifted up by one step as a whole from the position shown in fig. 7; at this time, a5 is stored in A0, and PE0 may calculate W₂*a5. In the fourth data processing cycle, the data to be processed may be shifted right by one step as a whole on the basis of the movement completed in the third data processing cycle; at this time, a4 is stored in A0, and PE0 may calculate W₃*a4. The other PEs operate in the same way, which is not described again here.
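The effect of the whole-array data movement on a fixed register can be traced with a short Python sketch. It assumes a register array that is 4 registers wide with row-major numbering, so that the data element with index k starts in the register with index k; this layout is an assumption drawn from the 4 x 4 example, and the names `WIDTH` and `read_sequence` are hypothetical.

```python
from typing import List

WIDTH = 4  # assumed register-array width, matching the 4 x 4 example

# Shifting the data left by one step means a fixed register now holds the
# element that was one column to its right; the other moves are analogous.
DELTA = {"left": (0, 1), "right": (0, -1), "up": (1, 0), "down": (-1, 0)}


def read_sequence(register_index: int, moves: List[str]) -> List[int]:
    """Indices of the data elements seen at a fixed register, cycle by cycle."""
    row, col = divmod(register_index, WIDTH)
    seen = [row * WIDTH + col]  # first cycle: data not yet moved
    for move in moves:
        d_row, d_col = DELTA[move]
        row, col = row + d_row, col + d_col
        seen.append(row * WIDTH + col)
    return seen


print(read_sequence(0, ["left", "up", "right"]))  # [0, 1, 5, 4]
print(read_sequence(2, ["left", "up", "right"]))  # [2, 3, 7, 6]
```

The first line corresponds to PE0 reading a0, a1, a5, and a4 over the four cycles; the second shows the analogous sequence for the multiplier-adder that fixedly reads the register with index 2.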

It can be seen that, in each data processing cycle, the PEs holding different image data to be processed each complete the calculation for that cycle; that is, the different multiplier-adder groups complete their calculations for each data processing cycle in parallel, and after all data processing cycles the different multiplier-adder groups finish their final calculations at the same time, which saves system resources.

Here, for different image data to be processed, the corresponding convolution kernels may be different or the same. For example, if two pieces of image data to be processed are respectively different feature subgraphs of the same feature graph, convolution kernels corresponding to the two pieces of image data to be processed are different. And if the two pieces of image data to be processed are image data at different positions of the same characteristic subgraph, the convolution kernels corresponding to the two pieces of images to be processed are the same.

And finishing the data processing tasks respectively corresponding to the multiplier-adder groups according to the data processing results respectively corresponding to the multiplier-adder groups in each data processing period.

For each target multiplier-adder in each multiplier-adder group, adding products obtained by the target multiplier-adder in each data processing period to obtain a sum; and finishing the data processing tasks corresponding to the multiplier-adder groups respectively based on the sum values corresponding to the target multiplier-adders respectively contained in each multiplier-adder group.

For example, taking PE0 shown in fig. 6 as an example, the calculations performed by PE0 in the four data processing cycles are W₀*a0, W₁*a1, W₂*a5, and W₃*a4. Adding the four calculation results gives W₀*a0 + W₁*a1 + W₂*a5 + W₃*a4, which is one result value in the processing result matrix of the data processing task for the image data to be processed corresponding to the first multiplier-adder group.
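The per-cycle products and their sum can be sketched in Python for one multiplier-adder group. The data values, the weight values, the register-array width of 4, and the choice of registers 0, 2, 8, and 10 for the group are all assumptions made for illustration; the function name `group_results` is hypothetical.

```python
from typing import Dict, List


def group_results(data: List[int], weights: List[int],
                  pe_registers: List[int], width: int = 4) -> Dict[int, int]:
    """Sum of the four per-cycle products for each multiplier-adder of one group."""
    # Index offsets of the data element seen at a fixed register in cycles
    # 1 to 4 (no move, then left, then up, then right), paired with W0..W3.
    offsets = [0, 1, width + 1, width]
    return {
        reg: sum(w * data[reg + off] for w, off in zip(weights, offsets))
        for reg in pe_registers
    }


data = list(range(16))   # stand-in values for a0..a15 (assumed)
weights = [1, 2, 3, 4]   # stand-in values for W0..W3 (assumed)
results = group_results(data, weights, [0, 2, 8, 10])
print(results[0])  # 33, i.e. 1*0 + 2*1 + 3*5 + 4*4
```

Each entry of `results` plays the role of one result value in the processing result matrix of the group.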

Here, if the image data to be processed after convolution is a feature map, the feature map includes 16 channels, and feature subgraphs corresponding to 4 channels are processed each time, that is, the feature subgraphs corresponding to 16 channels are divided into 4 groups, and a group of feature subgraphs are processed each time. If the 4 groups of characteristic subgraphs are respectively: when the group a, the group b, the group c and the group d are used, after 4 characteristic subgraphs included in the group a are processed, 4 results corresponding to the group a output by the multiplier-adder are accumulated; after the 4 characteristic subgraphs included in the group b are processed, accumulating 4 results corresponding to the group b, and accumulating the accumulated result corresponding to the group a and the accumulated result corresponding to the group b; after the 4 characteristic subgraphs included in the group c are processed, accumulating 4 results corresponding to the group c, and accumulating the accumulated results of the group a and the group b and the accumulated result corresponding to the group c; after the 4 characteristic subgraphs included in the group d are processed, the 4 results corresponding to the group d are accumulated, the accumulated results of the group a, the group b and the group c and the accumulated result corresponding to the group d are accumulated, and finally, the accumulated sum of the convolution results corresponding to the 16 channels is obtained.

After the 4 feature subgraphs included in group a are processed, the 4 output results obtained for group a are a1, a2, a3, and a4. After the 4 feature subgraphs included in group b are processed, the 4 output results obtained for group b are b1, b2, b3, and b4. At this point, a1 + b1 = O1, a2 + b2 = O2, a3 + b3 = O3, and a4 + b4 = O4 are computed. After the 4 feature subgraphs included in group c are processed, the 4 output results obtained for group c are c1, c2, c3, and c4, and O1 + c1, O2 + c2, O3 + c3, and O4 + c4 are further computed. By analogy, a1 + b1 + c1 + d1, a2 + b2 + c2 + d2, a3 + b3 + c3 + d3, and a4 + b4 + c4 + d4 are finally obtained, and these four results are then accumulated together to obtain the accumulated sum of the convolution results corresponding to the 16 channels.
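The running accumulation over the four channel groups can be sketched in Python as follows. The numeric outputs are made-up stand-in values, and the function name `accumulate_channel_groups` is hypothetical.

```python
from typing import List, Tuple


def accumulate_channel_groups(group_outputs: List[List[int]]) -> Tuple[List[int], int]:
    """Accumulate per-group outputs slot by slot, then sum the slots."""
    running = [0] * len(group_outputs[0])
    for outputs in group_outputs:
        # After group b this holds O1..O4; later groups are added the same way.
        running = [acc + out for acc, out in zip(running, outputs)]
    return running, sum(running)


group_outputs = [
    [1, 2, 3, 4],              # a1..a4 (assumed values)
    [10, 20, 30, 40],          # b1..b4
    [100, 200, 300, 400],      # c1..c4
    [1000, 2000, 3000, 4000],  # d1..d4
]
per_slot, total = accumulate_channel_groups(group_outputs)
print(per_slot)  # [1111, 2222, 3333, 4444]
print(total)     # 11110
```

Keeping only a running accumulation means that the outputs of earlier channel groups do not need to be stored separately once they have been added in.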

It will be understood by those skilled in the art that in the method of the present invention, the order of writing the steps does not imply a strict order of execution and any limitations on the implementation, and the specific order of execution of the steps should be determined by their function and possible inherent logic.

Based on the same inventive concept, a data processing apparatus corresponding to the data processing method is also provided in the embodiments of the present disclosure, and because the principle of the apparatus in the embodiments of the present disclosure for solving the problem is similar to the data processing method described above in the embodiments of the present disclosure, the implementation of the apparatus may refer to the implementation of the method, and repeated details are not described again.

Referring to fig. 8, a schematic diagram of a data processing apparatus provided in an embodiment of the present disclosure is shown, where the apparatus includes: a controller 801; the controller 801 is configured to:

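group a plurality of multiplier-adders in the multiplier-adder array based on a matrix operand operation step size to obtain at least one multiplier-adder group; and execute, in parallel, the data processing task corresponding to each multiplier-adder group by using each multiplier-adder group of the at least one multiplier-adder group.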

In one possible embodiment, when grouping the plurality of multiplier-adders in the multiplier-adder array based on the matrix operand operation step size, the controller 801 is specifically configured to: determine the number of multiplier-adder groups based on the matrix operand operation step size; and group the plurality of multiplier-adders in the multiplier-adder array based on the number of multiplier-adder groups.

In one possible implementation, when grouping the plurality of multiplier-adders in the multiplier-adder array based on the number of multiplier-adder groups, the controller 801 is specifically configured to: determine a first target multiplier-adder in each multiplier-adder group from the multiplier-adder array; and determine, from the multiplier-adder array, the other target multiplier-adders in each multiplier-adder group other than the first target multiplier-adder, based on the position of the first target multiplier-adder in the multiplier-adder array, the matrix operand operation step size, and the size of the multiplier-adder array.

In one possible embodiment, when determining, from the multiplier-adder array, the other target multiplier-adders in each multiplier-adder group other than the first target multiplier-adder based on the position of the first multiplier-adder in the multiplier-adder array, the controller 801 is specifically configured to: for each multiplier-adder group, determine a first position relation between each multiplier-adder other than the first multiplier-adder of each row in the multiplier-adder group and the adjacent previous multiplier-adder in the multiplier-adder array, based on the position of the first multiplier-adder in the multiplier-adder array and the matrix operand operation step size; determine a second position relation between each multiplier-adder other than the first multiplier-adder of each column in the multiplier-adder group and the adjacent previous multiplier-adder in the multiplier-adder array, based on the position of the first multiplier-adder of the multiplier-adder group in the multiplier-adder array, the matrix operand operation step size, and the number of columns of the multiplier-adder array; and determine the target positions, in the multiplier-adder array, of the other target multiplier-adders in the multiplier-adder group other than the first target multiplier-adder, based on the first position relation and/or the second position relation.

In one possible implementation, when determining the first target multiplier-adder in each multiplier-adder group from the multiplier-adder array, the controller 801 is specifically configured to: determine a target matrix based on the matrix operand operation step size and the size of the multiplier-adder array; and determine the position, in the multiplier-adder array, of the first target multiplier-adder in each multiplier-adder group according to the matrix element values of the target matrix.
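This passage does not state how the target matrix is constructed, so the following is only one hypothetical encoding, offered to make the idea concrete. It assumes that the target matrix has the same size as the multiplier-adder array, that each element holds a group identifier derived from the step size, and that the number of groups is the square of the step size. The names `target_matrix` and `first_targets` are illustrative.

```python
def target_matrix(stride, rows, cols):
    """Hypothetical target matrix: element (r, c) holds an assumed group id."""
    return [[(r % stride) * stride + (c % stride) for c in range(cols)]
            for r in range(rows)]


def first_targets(matrix):
    """Map each group id to the first position (row-major) where it appears."""
    firsts = {}
    for r, row in enumerate(matrix):
        for c, group_id in enumerate(row):
            firsts.setdefault(group_id, (r, c))
    return firsts


matrix = target_matrix(2, 4, 4)
print(first_targets(matrix))
```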

In a possible implementation, when executing, with each multiplier-adder group of the at least one multiplier-adder group, the data processing task corresponding to that multiplier-adder group, the controller 801 is specifically configured to: store the image data to be processed corresponding to each multiplier-adder group into the register array corresponding to that multiplier-adder group according to the position, in the multiplier-adder array, of each target multiplier-adder in the multiplier-adder group; for each data processing cycle of a plurality of data processing cycles, read, from the register array corresponding to each multiplier-adder group, the image data to be processed corresponding to that multiplier-adder group in the data processing cycle; process the read image data to be processed to obtain, in parallel, the data processing result of each multiplier-adder group in the data processing cycle; and complete the data processing tasks respectively corresponding to the multiplier-adder groups according to the data processing results respectively corresponding to the multiplier-adder groups in each data processing cycle.

In a possible implementation, when storing the image data to be processed corresponding to each multiplier-adder group into the register array corresponding to that multiplier-adder group according to the position, in the multiplier-adder array, of each target multiplier-adder in the multiplier-adder group, the controller 801 is specifically configured to: determine the number of registers included in the register array corresponding to each multiplier-adder group according to the size of the matrix operand; for each multiplier-adder group, determine the position, in the corresponding register array, of the register that each target multiplier-adder of the multiplier-adder group reads in a fixed manner; and, for each multiplier-adder group, store the image data to be processed corresponding to the multiplier-adder group into the register array corresponding to the multiplier-adder group according to the position of each target multiplier-adder of the multiplier-adder group in the multiplier-adder array, the position of the register fixedly read by each target multiplier-adder in the multiplier-adder group, and the processing order, during data processing, of the operands contained in the image data to be processed, so that, in each data processing cycle, the operand stored at the position of the register fixedly read by each target multiplier-adder corresponds to the matrix element of the matrix operand for that data processing cycle.

In one possible implementation, when, for each data processing cycle of the plurality of data processing cycles, reading the image data to be processed corresponding to each multiplier-adder group in the data processing cycle from the register array corresponding to that multiplier-adder group, and processing the read image data to be processed to obtain, in parallel, the data processing result of each multiplier-adder group in the data processing cycle, the controller 801 is specifically configured to: for the first data processing cycle in which the image data to be processed is processed, control each target multiplier-adder in each multiplier-adder group to read, from the register fixedly read by that target multiplier-adder, its operand for the first data processing cycle as a first operand; determine the matrix element of the matrix operand corresponding to each multiplier-adder group in the first data processing cycle as a second operand; and determine the product of the first operand and the second operand of each target multiplier-adder in the first data processing cycle; and, for each non-first data processing cycle in which the image data to be processed is processed, control the image data to be processed to move by a preset step length in the register array according to a preset data movement mode corresponding to the data processing cycle; control each target multiplier-adder in each multiplier-adder group to read, from the register fixedly read by that target multiplier-adder, its operand for the non-first data processing cycle as a first operand; determine the matrix element of the matrix operand corresponding to each multiplier-adder group in the data processing cycle as a second operand; and determine the product of the first operand and the second operand of each target multiplier-adder in the data processing cycle.
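The per-cycle behaviour can be illustrated with a simplified one-dimensional sketch for a single multiplier-adder group. The assumptions are: the register array is a Python list, the preset data movement is a shift by one register toward index 0, the matrix operand is a flat list of weights with one weight used per cycle, and each target multiplier-adder always reads the same register index. The names `cycle_products` and `fixed_regs` are hypothetical.

```python
def cycle_products(data, kernel, fixed_regs):
    """Return, for every cycle, the product computed by each target multiplier-adder.

    data       -- operands loaded into the register array (assumed 1-D)
    kernel     -- matrix operand elements, one used per data processing cycle
    fixed_regs -- register index that each target multiplier-adder always reads
    """
    regs = list(data)
    products = []
    for cycle, weight in enumerate(kernel):
        if cycle > 0:
            # Non-first cycle: move the data by the preset step (one register).
            regs = regs[1:] + [0]
        # Each target multiplier-adder reads its fixed register (first operand)
        # and multiplies it by this cycle's matrix element (second operand).
        products.append([regs[pos] * weight for pos in fixed_regs])
    return products


# Two target multiplier-adders reading registers 0 and 2 (a step size of 2).
print(cycle_products([1, 2, 3, 4, 5], [10, 1], [0, 2]))
```

Because the data moves while the read positions stay fixed, no multiplier-adder needs per-cycle address logic in this sketch.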

In a possible implementation, when completing the data processing tasks respectively corresponding to the multiplier-adder groups according to the data processing results respectively corresponding to the multiplier-adder groups in each data processing cycle, the controller 801 is specifically configured to: for each target multiplier-adder in each multiplier-adder group, add the products obtained by the target multiplier-adder in the respective data processing cycles to obtain a sum value; and complete the data processing tasks respectively corresponding to the multiplier-adder groups based on the sum values respectively corresponding to the target multiplier-adders contained in each multiplier-adder group.
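The accumulation step can be sketched as follows for one multiplier-adder group. The per-cycle products used here are illustrative values for an assumed one-dimensional example (data `[1, 2, 3, 4, 5]`, weights `[10, 1]`, step size 2); `accumulate` and `strided_reference` are hypothetical helper names, and the reference function is included only to check the sums against a direct strided sliding-window computation.

```python
def accumulate(products_per_cycle):
    """Sum, for each target multiplier-adder, its products over all cycles."""
    return [sum(per_mac) for per_mac in zip(*products_per_cycle)]


def strided_reference(data, kernel, stride):
    """Direct strided sliding-window sum of products, for comparison only."""
    n_out = (len(data) - len(kernel)) // stride + 1
    return [sum(data[k * stride + t] * kernel[t] for t in range(len(kernel)))
            for k in range(n_out)]


# Rows are cycles, columns are target multiplier-adders (assumed example values).
products = [[10, 30], [2, 4]]
sums = accumulate(products)
print(sums, strided_reference([1, 2, 3, 4, 5], [10, 1], 2))
```

In this example each sum value equals one output element of the strided operation, which is the sense in which the sum values complete the group's task.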

The description of the processing flow of each module in the device and the interaction flow between the modules may refer to the related description in the above method embodiments, and will not be described in detail here.

The image processing device provided by the embodiment of the disclosure may include a chip, an AI chip, and the like.

An embodiment of the present disclosure further provides a computer device. Fig. 9 is a schematic structural diagram of the computer device provided in the embodiment of the present disclosure; the computer device includes:

a controller 91 and a memory 92; the memory 92 stores machine-readable instructions executable by the controller 91, and the controller 91 is configured to execute the machine-readable instructions stored in the memory 92; when the machine-readable instructions are executed by the controller 91, the controller 91 performs the following steps:

The memory 92 includes a memory 921 and an external memory 922. The memory 921, also referred to as an internal memory, temporarily stores operation data of the controller 91 and data exchanged with the external memory 922, such as a hard disk; the controller 91 exchanges data with the external memory 922 through the memory 921.

The computer device provided by the embodiment of the present disclosure may include an intelligent terminal such as a mobile phone, or may also be other devices, servers, and the like that have a camera and can perform image processing, and is not limited herein.

For the specific execution process of the instruction, reference may be made to the steps of the data processing method described in the embodiments of the present disclosure, and details are not described here.

The embodiments of the present disclosure also provide a computer-readable storage medium on which a computer program is stored; when the computer program is executed by a processor, the steps of the data processing method described in the above method embodiments are performed. The storage medium may be a volatile or non-volatile computer-readable storage medium.

The embodiments of the present disclosure also provide a computer program product carrying program code; the instructions included in the program code may be used to execute the steps of the data processing method in the foregoing method embodiments. For details, refer to the foregoing method embodiments, which are not repeated here.

The computer program product may be implemented by hardware, software, or a combination thereof. In one alternative embodiment, the computer program product is embodied as a computer storage medium; in another alternative embodiment, the computer program product is embodied as a software product, such as a Software Development Kit (SDK).

It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. In the several embodiments provided in the present disclosure, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.

The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.

The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present disclosure. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.

Finally, it should be noted that the above-mentioned embodiments are merely specific embodiments of the present disclosure, used to illustrate the technical solutions of the present disclosure rather than to limit them, and the protection scope of the present disclosure is not limited thereto. Although the present disclosure is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that anyone skilled in the art may, within the technical scope disclosed herein, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent substitutions for some of their technical features. Such modifications, changes, or substitutions do not depart from the spirit and scope of the embodiments of the present disclosure and should be construed as being included within its protection scope. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims

1. A data processing method, comprising:

2. The data processing method of claim 1, wherein, in a same row of the multiplier-adder array, any two adjacent multiplier-adders of a same group are separated by an equal, non-zero number of multiplier-adders not belonging to that group, and, in a same column of the multiplier-adder array, any two adjacent multiplier-adders of a same group are separated by an equal, non-zero number of multiplier-adders not belonging to that group.

3. The data processing method of claim 1 or 2, wherein grouping a plurality of multiplier-adders in a multiplier-adder array based on a matrix operand operation step size comprises:

determining a number of the multiplier-adder groups based on the matrix operand operation step size;

grouping the plurality of multiplier-adders in the multiplier-adder array based on the number of multiplier-adder groups.

4. The data processing method of claim 3, wherein the grouping the plurality of multiplier-adders in the multiplier-adder array based on the number of multiplier-adder groups comprises:

determining, from the multiplier-adder array, other target multiplier-adders in each multiplier-adder group besides the first target multiplier-adder based on a position of the first target multiplier-adder in the multiplier-adder array, the matrix operand operation step size, and a size of the multiplier-adder array.

5. The data processing method of claim 4, wherein determining, from the multiplier-adder array, the other target multiplier-adders in each multiplier-adder group besides the first target multiplier-adder based on the position of the first target multiplier-adder in the multiplier-adder array, the matrix operand operation step size, and the size of the multiplier-adder array comprises:

for each multiplier-adder group, determining a first position relation between each multiplier-adder except the first target multiplier-adder in each row of the multiplier-adder group and an adjacent previous multiplier-adder of the multiplier-adder in the multiplier-adder array based on the position of the first target multiplier-adder in the multiplier-adder array and the matrix operand operation step size; and

determining a second position relation of each multiplier-adder except the first row of multiplier-adders in each row of the multiplier-adder group and an adjacent previous multiplier-adder of the multiplier-adder in the multiplier-adder array based on the position of the first target multiplier-adder of the multiplier-adder group in the multiplier-adder array, the matrix operand operation step size and the number of rows of the multiplier-adder array;

and determining the target positions, in the multiplier-adder array, of the other target multiplier-adders in the multiplier-adder group besides the first target multiplier-adder based on the first position relation and/or the second position relation.

6. The data processing method according to claim 4 or 5, wherein said determining a first target multiplier-adder in said each multiplier-adder group from said multiplier-adder array comprises:

determining a target matrix based on the operation step size of the matrix operand and the size of the multiplier-adder array;

and determining the position of the first target multiplier-adder in each multiplier-adder group in the multiplier-adder array according to the matrix element values of the target matrix.

7. The data processing method according to any of claims 1-6, wherein said performing, with each multiplier-adder group of said at least one multiplier-adder group, a data processing task corresponding to said each multiplier-adder group comprises:

storing the image data to be processed corresponding to each multiplier-adder group into a register array corresponding to each multiplier-adder group according to the position of each target multiplier-adder in each multiplier-adder group in the multiplier-adder array;

for each data processing cycle in a plurality of data processing cycles, respectively reading image data to be processed corresponding to each multiplier-adder group in the data processing cycle from the register array corresponding to each multiplier-adder group; and

processing the read image data to be processed, and obtaining, in parallel, the data processing result of each multiplier-adder group in the data processing cycle;

8. The data processing method according to claim 7, wherein storing the image data to be processed corresponding to each multiplier-adder group into the register array corresponding to the multiplier-adder group according to the position of the respective target multiplier-adder in the multiplier-adder array comprises:

determining the number of registers contained in the register array corresponding to each multiplier-adder group according to the size of the matrix operand;

for each multiplier-adder group, determining the position, in the corresponding register array, of the register that each target multiplier-adder of the multiplier-adder group reads in a fixed manner;

and for each multiplier-adder group, storing the image data to be processed corresponding to the multiplier-adder group into the register array corresponding to the multiplier-adder group according to the position of each target multiplier-adder in the multiplier-adder group in the multiplier-adder array, the position of a register fixedly read by the target multiplier-adder in the multiplier-adder group and the processing sequence of operands contained in the image data to be processed in the data processing process, so that the operands stored in the positions of the registers fixedly read by each target multiplier-adder correspond to matrix elements in the matrix operands of the corresponding data processing period in each data processing period.

9. The data processing method according to claim 7 or 8, wherein, for each data processing cycle of the plurality of data processing cycles, reading the image data to be processed corresponding to each multiplier-adder group in the data processing cycle from the register array corresponding to each multiplier-adder group, and processing the read image data to be processed to obtain, in parallel, the data processing result of each multiplier-adder group in the data processing cycle, comprises:

for the first data processing period of processing the image data to be processed, controlling each target multiplier-adder in each multiplier-adder group, and respectively reading an operand of each target multiplier-adder in the first data processing period from a register fixedly read by each target multiplier-adder as a first operand; determining matrix elements of each multiplier-adder group in a matrix operand corresponding to the first data processing period as second operands; respectively determining the product of a first operand and a second operand of each target multiplier-adder in the first data processing period;

10. The data processing method according to any one of claims 7 to 9, wherein the performing the data processing tasks corresponding to the multiplier-adder groups according to the data processing results corresponding to the multiplier-adder groups in each data processing cycle comprises:

for each target multiplier-adder in each multiplier-adder group, adding products obtained by the target multiplier-adder in each data processing period to obtain a sum;

and finishing the data processing tasks corresponding to the multiplier-adder groups respectively based on the sum values corresponding to the target multiplier-adders respectively contained in each multiplier-adder group.

11. A data processing method according to any one of claims 1 to 10, wherein the data processing task comprises: a convolution processing task;

and the convolution processing tasks of different multiplier-adder groups correspond to different images to be processed.

12. A data processing apparatus, comprising: a controller; the controller is configured to:

13. A computer device, comprising: a controller and a memory, the memory storing machine-readable instructions executable by the controller, the controller being configured to execute the machine-readable instructions stored in the memory, wherein, when the machine-readable instructions are executed by the controller, the controller performs the steps of the data processing method of any one of claims 1 to 11.

14. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, which computer program, when executed by a computer device, performs the steps of the data processing method according to any one of claims 1 to 11.