CN112766474A - Method, apparatus, medium, and electronic device for implementing convolution operation - Google Patents

Publication number
CN112766474A
Authority
CN
China
Prior art keywords
weight matrix
operated
feature
convolution
row
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911066229.4A
Other languages
Chinese (zh)
Other versions
CN112766474B (en)
Inventor
王振江
李德林
张祎男
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Horizon Robotics Technology Research and Development Co Ltd
Original Assignee
Beijing Horizon Robotics Technology Research and Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Horizon Robotics Technology Research and Development Co Ltd
Priority to CN201911066229.4A
Publication of CN112766474A
Application granted
Publication of CN112766474B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/06: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063: Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Neurology (AREA)
  • Complex Calculations (AREA)

Abstract

Disclosed are a method, an apparatus, a medium, and an electronic device for implementing a convolution operation, wherein the method includes: acquiring input features of a convolutional layer and a weight matrix of a convolution kernel corresponding to the convolutional layer, wherein the spatial resolution of the input features is n1 × n11, the weight matrix is an m1 × m1 weight matrix, and m1 is a non-zero even number; obtaining a feature to be operated and a weight matrix to be operated after row-column expansion according to the input features and the weight matrix, wherein the spatial resolution of the feature to be operated is n2 × n22, the weight matrix to be operated is an m2 × m2 weight matrix, and m2 is an odd number larger than m1; performing a convolution operation on the weight matrix to be operated and the feature to be operated through a data processor to obtain a convolution operation result; and obtaining the output features of the convolutional layer according to the convolution operation result. The present disclosure can use a data processor to implement multiple types of convolution operations, thereby enriching the ways in which convolution operations can be implemented.

Description

Method, apparatus, medium, and electronic device for implementing convolution operation
Technical Field
The present disclosure relates to computer vision technologies, and in particular, to a method for implementing convolution operation, an apparatus for implementing convolution operation, a storage medium, and an electronic device.
Background
Computer vision techniques are closely tied to convolution operations. For example, neural networks used in computer vision, such as CNNs (Convolutional Neural Networks), RPNs (Region Proposal Networks), and RNNs (Recurrent Neural Networks), usually include at least one convolutional layer. Each convolutional layer corresponds to a convolution kernel of a preset size and performs a convolution operation on its input features based on that kernel to form new features, which serve as the output features of the convolutional layer. This completes the convolution operation of the convolutional layer.
At present, some hardware units only support a single type of convolution operation, and how to make the hardware units support multiple types of convolution operation is a technical problem of great concern.
Disclosure of Invention
The present disclosure is proposed to solve the above technical problems. The embodiment of the disclosure provides a method, a device, a storage medium and an electronic device for realizing convolution operation.
According to an aspect of the embodiments of the present disclosure, there is provided a method for implementing a convolution operation, the method including: acquiring input features of a convolutional layer and a weight matrix of a convolution kernel corresponding to the convolutional layer, wherein the spatial resolution of the input features is n1 × n11, the weight matrix is an m1 × m1 weight matrix, n1 and n11 are positive integers, and m1 is a non-zero even number; obtaining a row-column expanded feature to be operated and a weight matrix to be operated according to the input features and the weight matrix, wherein the spatial resolution of the feature to be operated is n2 × n22, the weight matrix to be operated is an m2 × m2 weight matrix, n2 is an integer larger than n1, n22 is an integer larger than n11, m2 is an odd number larger than m1, and m2 × m2 is a weight matrix size supported by the data processor; performing a convolution operation on the weight matrix to be operated and the feature to be operated through the data processor to obtain a convolution operation result; and obtaining the output features of the convolutional layer according to the convolution operation result.
According to another aspect of the embodiments of the present disclosure, there is provided a method for implementing a convolution operation, including: acquiring input features of a convolutional layer and a weight matrix of a convolution kernel corresponding to the convolutional layer, wherein the weight matrix is an m3 × m3 weight matrix, and m3 is an odd number; in the case that the data processor does not support the feature filling requirement of the convolutional layer, obtaining a row-column expanded weight matrix to be operated according to the weight matrix, wherein the weight matrix to be operated is an m4 × m4 weight matrix, and m4 × m4 is a weight matrix size supported by the data processor; performing, by the data processor, a convolution operation on the weight matrix to be operated and the input features based on feature filling supported by the data processor, to obtain a convolution operation result; and obtaining the output features of the convolutional layer according to the convolution operation result.
According to still another aspect of the embodiments of the present disclosure, there is provided an apparatus for implementing a convolution operation, the apparatus including: a first obtaining module, configured to obtain input features of a convolutional layer and a weight matrix of a convolution kernel corresponding to the convolutional layer, where the spatial resolution of the input features is n1 × n11, the weight matrix is an m1 × m1 weight matrix, n1 and n11 are positive integers, and m1 is a non-zero even number; a second obtaining module, configured to obtain a row-column expanded feature to be operated and a weight matrix to be operated according to the input features and the weight matrix obtained by the first obtaining module, where the spatial resolution of the feature to be operated is n2 × n22, the weight matrix to be operated is an m2 × m2 weight matrix, n2 is an integer greater than n1, n22 is an integer greater than n11, m2 is an odd number greater than m1, and m2 × m2 is a weight matrix size supported by the data processor; a first operation result obtaining module, configured to perform, through the data processor, a convolution operation on the weight matrix to be operated and the feature to be operated obtained by the second obtaining module, to obtain a convolution operation result; and a first output feature obtaining module, configured to obtain the output features of the convolutional layer according to the convolution operation result obtained by the first operation result obtaining module.
According to still another aspect of the embodiments of the present disclosure, there is provided an apparatus for implementing a convolution operation, the apparatus including: a third obtaining module, configured to obtain input features of a convolutional layer and a weight matrix of a convolution kernel corresponding to the convolutional layer, where the weight matrix is an m3 × m3 weight matrix, and m3 is an odd number; a fourth obtaining module, configured to, when the data processor does not support the feature filling requirement of the convolutional layer, obtain a row-column expanded weight matrix to be operated according to the weight matrix obtained by the third obtaining module, where the weight matrix to be operated is an m4 × m4 weight matrix, and m4 × m4 is a weight matrix size supported by the data processor; a second operation result obtaining module, configured to perform, by the data processor, a convolution operation on the m4 × m4 weight matrix obtained by the fourth obtaining module and the input features obtained by the third obtaining module, based on feature filling supported by the data processor, to obtain a convolution operation result; and a second output feature obtaining module, configured to obtain the output features of the convolutional layer according to the convolution operation result obtained by the second operation result obtaining module.
According to still another aspect of the embodiments of the present disclosure, there is provided a computer-readable storage medium storing a computer program for implementing the above method.
According to still another aspect of an embodiment of the present disclosure, there is provided an electronic apparatus including: a processor; a memory for storing the processor-executable instructions; the processor is used for reading the executable instructions from the memory and executing the instructions to realize the method.
Based on the method and apparatus for implementing a convolution operation provided by the above embodiments of the present disclosure, for a weight matrix with an even number of rows and columns, performing row-column expansion on the weight matrix of the convolution kernel turns it into a weight matrix with an odd number of rows and columns, so that a data processor supporting odd-sized weight matrices can perform the convolution operation on the input features; and by performing row-column expansion on the input features of the convolutional layer, the output features of the convolutional layer can be obtained from the result of the convolution operation performed by the data processor. Therefore, the technical solution provided by the present disclosure can use a data processor that supports odd convolution kernels to complete a convolution operation based on an even convolution kernel, so that such a data processor can implement multiple types of convolution operations, which enriches the ways in which convolution operations can be implemented.
Based on the method and apparatus for implementing a convolution operation provided by the above embodiments of the present disclosure, in the case that the data processor does not support the feature filling requirement of the convolutional layer, the present disclosure performs row-column expansion on the weight matrix of the convolution kernel, so that the data processor can perform a convolution operation on the input features of the convolutional layer using the expanded weight matrix and the feature filling that the data processor supports for that expanded weight matrix, and the output features of the convolutional layer can then be obtained from the result of that convolution operation. Therefore, the technical solution provided by the present disclosure can use a data processor that does not support the feature filling requirement of a convolutional layer to implement the convolution operation of that layer, so that the data processor can implement multiple types of convolution operations, which enriches the ways in which convolution operations can be implemented.
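The second approach can be illustrated numerically. The sketch below is one possible reading, not the patent's own procedure: it assumes a layer that needs a 5 × 5 kernel with feature filling 1, a processor that supports 7 × 7 kernels with feature filling 3, zero-valued filling, stride 1, and a single channel. The helper `conv2d_padded` and all sizes are illustrative assumptions.

```python
import numpy as np

def conv2d_padded(x, k, pad):
    """Stride-1 multiply-add of 2-D x with 2-D k after zero-filling x by `pad` on every side."""
    xp = np.pad(x, pad)
    m = k.shape[0]
    h, w = xp.shape[0] - m + 1, xp.shape[1] - m + 1
    return np.array([[np.sum(xp[i:i + m, j:j + m] * k) for j in range(w)]
                     for i in range(h)])

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))      # input feature (channels omitted)
k5 = rng.standard_normal((5, 5))     # original odd weight matrix, m3 = 5

# What the layer requires: 5 x 5 kernel with filling 1 (assumed unsupported).
reference = conv2d_padded(x, k5, pad=1)      # 6 x 6

# Row-column expansion: surround the kernel with a ring of zeros, m4 = 7.
k7 = np.pad(k5, 1)

# Run with a filling the processor is assumed to support for 7 x 7 kernels.
result = conv2d_padded(x, k7, pad=3)         # 8 x 8

# The surplus filling (3 - 1 - 1 = 1 per side) appears as a redundant border.
output = result[1:-1, 1:-1]                  # 6 x 6
```

Under these assumptions `output` matches `reference`, because the zero ring contributes nothing to any multiply-add and the extra filling only produces additional border positions that can be discarded.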
The technical solution of the present disclosure is further described in detail by the accompanying drawings and examples.
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent by describing in more detail embodiments of the present disclosure with reference to the attached drawings. The accompanying drawings are included to provide a further understanding of the embodiments of the disclosure, and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the principles of the disclosure and not to limit the disclosure. In the drawings, like reference numbers generally represent like parts or steps.
FIG. 1 is a schematic diagram of a scenario in which the present disclosure is applicable;
FIG. 2 is a flow chart of one embodiment of a method of the present disclosure for implementing convolution operations;
FIG. 3 is a diagram of a first example of a convolution operation implementing a convolutional layer of the present disclosure;
FIG. 4 is a diagram illustrating a second example of a convolution operation for implementing convolutional layers according to the present disclosure;
FIG. 5 is a diagram of a third example of a convolution operation implementing a convolutional layer of the present disclosure;
FIG. 6 is a diagram illustrating a fourth example of a convolution operation implementing a convolutional layer according to the present disclosure;
FIG. 7 is a schematic diagram of a fifth example of a convolution operation implementing a convolutional layer of the present disclosure;
FIG. 8 is a flow chart of another embodiment of a method of the present disclosure for implementing a convolution operation;
FIG. 9 is a diagram illustrating an example of a convolution operation for implementing convolutional layers according to the present disclosure;
FIG. 10 is a schematic diagram illustrating an embodiment of an apparatus for performing convolution operations according to the present disclosure;
FIG. 11 is a schematic structural diagram illustrating another embodiment of an apparatus for performing convolution operations according to the present disclosure;
FIG. 12 is a block diagram of an electronic device provided in an exemplary embodiment of the present application.
Detailed Description
Example embodiments according to the present disclosure will be described in detail below with reference to the accompanying drawings. It is to be understood that the described embodiments are merely a subset of the embodiments of the present disclosure and not all embodiments of the present disclosure, with the understanding that the present disclosure is not limited to the example embodiments described herein.
It should be noted that: the relative arrangement of the components and steps, the numerical expressions, and numerical values set forth in these embodiments do not limit the scope of the present disclosure unless specifically stated otherwise.
It will be understood by those of skill in the art that the terms "first," "second," and the like in the embodiments of the present disclosure are used merely to distinguish one element from another, and are not intended to imply any particular technical meaning or any necessary logical order between them.
It is also understood that in embodiments of the present disclosure, "a plurality" may refer to two or more and "at least one" may refer to one, two or more.
It is also to be understood that any reference to any component, data, or structure in the embodiments of the disclosure, may be generally understood as one or more, unless explicitly defined otherwise or stated otherwise.
In addition, the term "and/or" in the present disclosure merely describes an association between associated objects and covers three possible relationships. For example, A and/or B may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" in the present disclosure generally indicates that the associated objects before and after it are in an "or" relationship.
It should also be understood that the description of the various embodiments of the present disclosure emphasizes the differences between the various embodiments, and the same or similar parts may be referred to each other, so that the descriptions thereof are omitted for brevity.
Meanwhile, it should be understood that the sizes of the respective portions shown in the drawings are not drawn in an actual proportional relationship for the convenience of description.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the disclosure, its application, or uses.
Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
Embodiments of the present disclosure may be implemented in electronic devices such as terminal devices, computer systems, and servers, which are operational with numerous other general-purpose or special-purpose computing system environments or configurations. Examples of well-known terminal devices, computing systems, environments, and/or configurations that may be suitable for use with an electronic device, such as a terminal device, computer system, or server, include, but are not limited to: personal computer systems, server computer systems, thin clients, thick clients, hand-held or laptop devices, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputer systems, mainframe computer systems, distributed cloud computing environments that include any of the above, and the like.
Electronic devices such as terminal devices, computer systems, servers, etc. may be described in the general context of computer system-executable instructions, such as program modules, being executed by a computer system. Generally, program modules may include routines, programs, objects, components, logic, data structures, etc. that perform particular tasks or implement particular abstract data types. The computer system/server may be implemented in a distributed cloud computing environment. In a distributed cloud computing environment, tasks may be performed by remote processing devices that are linked through a communications network. In a distributed cloud computing environment, program modules may be located in both local and remote computer system storage media including memory storage devices.
Summary of the disclosure
In implementing the present disclosure, the inventors found that, in consideration of factors such as keeping the size unchanged before and after convolution and the position of the convolution anchor point, the size of a convolution kernel in the width and height directions is usually odd × odd, for example, 1 × 1, 3 × 3, 5 × 5, 7 × 7, or 11 × 11. Convolution kernels whose size in the width and height directions is odd × odd may be referred to as odd convolution kernels, and those whose size is even × even as even convolution kernels. Because of considerations such as the power consumption and area of hardware units, some hardware units support convolution operations with odd convolution kernels but not with even convolution kernels. For example, when the size of the convolution kernel in the width and height directions is 4 × 4, some hardware units cannot perform the corresponding convolution operation for the convolutional layer. If a hardware unit that supports convolution operations with odd convolution kernels could also be used to implement convolution operations with even convolution kernels, the ways in which convolution operations can be implemented would be enriched. The hardware unit may include but is not limited to: a CPU (Central Processing Unit), a BPU (Brain Processing Unit), a GPU (Graphics Processing Unit), an FPGA (Field Programmable Gate Array), or the like.
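The size-invariance argument for odd kernels can be checked with the standard output-size formula, out = floor((n + 2p - m) / s) + 1. The following is a minimal sketch; the function name `conv_output_size` and the sample input size are illustrative assumptions rather than anything defined in the disclosure.

```python
def conv_output_size(n, m, pad=0, stride=1):
    """Output size along one axis for input size n, kernel size m, symmetric filling pad."""
    return (n + 2 * pad - m) // stride + 1

n = 8  # illustrative input size along one axis

# Odd kernels: symmetric filling of (m - 1) / 2 keeps the size unchanged.
odd_sizes = {m: conv_output_size(n, m, pad=(m - 1) // 2) for m in (1, 3, 5, 7, 11)}

# Even 4 x 4 kernel: no symmetric integer filling reproduces the input size.
even_sizes = [conv_output_size(n, 4, pad=p) for p in range(4)]
```

With stride 1, every odd kernel size returns 8, while the 4 × 4 kernel yields 5, 7, 9, and 11 for filling 0 through 3 and never 8, which is one reason hardware tends to target odd kernels.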
In addition, while performing a convolution operation, a hardware unit may apply filling (padding) to the input features of the convolutional layer, for example to keep the spatial resolution of the output features from shrinking after the convolution operation, or to avoid losing the edge feature information of the input features. Because of considerations such as the power consumption and area of the hardware unit, the feature filling supported by some hardware units is limited. Typically, the feature filling supported by a hardware unit is related to the size of the convolution kernel. For example, when the convolution kernel corresponding to a convolutional layer is 5 × 5, some hardware units support only two feature filling cases, 0 and 2; as another example, when the convolution kernel is 7 × 7, some hardware units support only the two cases 0 and 3. Here, 0 means no filling. If hardware units that support only two feature filling cases could be used to implement convolution operations with a greater variety of feature filling (for example, 0, 1, and 2, or 0, 1, 2, and 3), the ways in which convolution operations can be implemented would be enriched. Similarly, the hardware units may include but are not limited to: a CPU, a BPU, a GPU, or an FPGA.
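The gap between supported and requested feature filling can be written down as a small lookup. This is only a sketch: the table names and the helper `unsupported_filling` are hypothetical, and the values are just the two examples given in the text, not a description of any particular hardware unit.

```python
# Filling amounts supported per kernel size, taken from the two examples in the text.
SUPPORTED_FILLING = {5: {0, 2}, 7: {0, 3}}

# Filling amounts a network might request, from the text's "0, 1, and 2" and "0, 1, 2, and 3".
REQUESTED_FILLING = {5: {0, 1, 2}, 7: {0, 1, 2, 3}}

def unsupported_filling(kernel_size):
    """Return the requested filling amounts the hardware cannot apply directly."""
    return sorted(REQUESTED_FILLING[kernel_size] - SUPPORTED_FILLING[kernel_size])

gaps = {k: unsupported_filling(k) for k in SUPPORTED_FILLING}
```

Here `gaps` holds filling 1 for the 5 × 5 kernel and fillings 1 and 2 for the 7 × 7 kernel, which are the cases that need a workaround.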
Exemplary overview
The technical solution for implementing a convolution operation provided by the present disclosure can be applied to various scenarios. One example is shown in FIG. 1.
In FIG. 1, when the data processor 100 is used to implement a convolution operation, the data processor 100 is triggered to execute the instructions in a corresponding instruction string. For example, after receiving a predetermined interrupt signal, the data processor 100 executes the instructions in the instruction string corresponding to that interrupt signal one by one, in the order in which they appear in the string.
An example of the data processor 100 executing instructions in sequence is as follows: the data processor 100 reads the weight matrix of the convolution kernel corresponding to the current convolutional layer from a DDR SDRAM (Double Data Rate Synchronous Dynamic Random-Access Memory) 101 according to a first preset address, and reads the input features of the current convolutional layer from an SRAM (Static Random-Access Memory) 102 according to a second preset address. The data processor 100 then performs a multiply-add operation on the weight matrix and the input features it has read, and stores the result in the SRAM 102 according to a third preset address. The result of the multiply-add operation forms the output features of the current convolutional layer, which completes the convolution operation of the current convolutional layer.
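The read, multiply-add, and store sequence can be mimicked in software. In the sketch below the two memories are modeled as plain dictionaries, and the three addresses, the helper names `multiply_add` and `run_layer`, and the sample data are all hypothetical; the text gives no concrete addresses or instruction formats.

```python
import numpy as np

def multiply_add(weights, feature):
    """Stride-1 sliding-window multiply-add of a 2-D feature with a 2-D weight matrix."""
    m = weights.shape[0]
    out_h = feature.shape[0] - m + 1
    out_w = feature.shape[1] - m + 1
    out = np.zeros((out_h, out_w), dtype=feature.dtype)
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(weights * feature[i:i + m, j:j + m])
    return out

# Hypothetical preset addresses (first, second, third).
WEIGHT_ADDR, INPUT_ADDR, OUTPUT_ADDR = 0x1000, 0x2000, 0x3000

ddr = {WEIGHT_ADDR: np.ones((3, 3))}                              # stands in for DDR SDRAM 101
sram = {INPUT_ADDR: np.arange(25, dtype=float).reshape(5, 5)}     # stands in for SRAM 102

def run_layer(ddr, sram):
    weights = ddr[WEIGHT_ADDR]                           # read weights at the first address
    feature = sram[INPUT_ADDR]                           # read input features at the second address
    sram[OUTPUT_ADDR] = multiply_add(weights, feature)   # store result at the third address

run_layer(ddr, sram)
```

After `run_layer`, the 3 × 3 output sits in `sram` at the third address, where a following layer could read it as its own input feature.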
Exemplary method
FIG. 2 is a flow chart of one embodiment of a method for implementing convolution operations according to the present disclosure. The method shown in FIG. 2 comprises steps S200, S201, S202, and S203, each of which is described below.
S200, acquiring input features of the convolutional layer and a weight matrix of a convolution kernel corresponding to the convolutional layer.
The convolutional layer in the present disclosure is typically one layer in a neural network. Input features of the convolutional layer may include, but are not limited to: feature maps of images, feature vectors of audio, and the like. The convolutional layer in the present disclosure performs a convolution operation on its input features to extract further features from them, and the extracted features form its output features. The convolutional layer in this disclosure corresponds to a convolution kernel. The size of the convolution kernel corresponding to the convolutional layer is the size of the weight matrix of the convolution kernel. For example, if the size of the convolution kernel is 4 × 4 × C, the size of the weight matrix of the convolution kernel is 4 × 4 × C, where C represents the number of channels. Each element in the weight matrix of the convolution kernel corresponding to a convolutional layer may refer to the weight with which a neuron in the convolutional layer is connected to a corresponding neuron in the next layer of the neural network.
The spatial resolution of the input features in the present disclosure is, for example, n1 × n11, where n1 and n11 are positive integers, i.e., integers greater than zero. The weight matrix of the convolution kernel in this disclosure is an m1 × m1 weight matrix, i.e., the size of the weight matrix is m1 × m1, where m1 is a non-zero even number. For example, m1 can be 2, 4, or 6. It should be noted that n1 × n11 is used only as a schematic example: the size of the input features in the width and height directions is not limited by n1 × n11 and may be any size. n1 may be equal to n11, and the following examples mainly take n1 × n11 to be n1 × n1. In addition, m1 × m1 indicates the length and width of the weight matrix. Since every element of the weight matrix in the width and height directions has the same number of channels, the number of channels of the weight matrix is omitted in the description of the present disclosure. Likewise, since every element in the spatial resolution (W × H, the width and height directions) of the input features of the convolutional layer has the same number of channels, the number of channels of the input features is omitted. And since every element in the spatial resolution of the output features of the convolutional layer has the same number of channels, the number of channels of the output features is omitted as well.
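The shape conventions above can be made concrete with arrays. The sketch below assumes a height × width × channels memory layout, which the text does not specify, and the helper `describe_layer` is purely illustrative.

```python
import numpy as np

def describe_layer(feature, weights):
    """Check the shape constraints of the first method and return n1, n11, m1, and C."""
    n1, n11, c_in = feature.shape
    m_rows, m_cols, c_w = weights.shape
    if m_rows != m_cols:
        raise ValueError("weight matrix must be square (m1 x m1)")
    if m_rows == 0 or m_rows % 2 != 0:
        raise ValueError("m1 must be a non-zero even number")
    if c_in != c_w:
        raise ValueError("input features and weight matrix must share the channel count")
    return {"n1": n1, "n11": n11, "m1": m_rows, "C": c_in}

# A 6 x 8 input feature with 3 channels and a 4 x 4 x 3 weight matrix.
info = describe_layer(np.zeros((6, 8, 3)), np.zeros((4, 4, 3)))
```

Because the channel count is shared by the input features and the weight matrix, only the spatial sizes n1, n11, and m1 matter for the row-column expansion discussed next.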
S201, obtaining the feature to be operated and the weight matrix to be operated after the row and column expansion according to the input feature and the weight matrix.
The row-column expanded feature to be operated in the present disclosure is a new input feature obtained by performing row-column expansion on the input features of the convolutional layer. For example, at least one row and at least one column are added to the input features of the convolutional layer to form the feature to be operated. The number of rows added to the input features is generally the same as the number of columns added. That is, if the spatial resolution of the feature to be operated is n2 × n22, then n2 is an integer greater than n1 and n22 is an integer greater than n11. n2 may be equal to n22, and the following examples mainly take n2 × n22 to be n2 × n2. In addition, the present disclosure typically adds the rows and columns at the outermost sides of the input features, for example, to the left of the leftmost column, to the right of the rightmost column, above the uppermost row, or below the lowermost row of the input features. Each element in the rows and columns added to form the feature to be operated has the same number of channels as the elements of the input features.
The row-column expanded weight matrix to be operated in the present disclosure is a new weight matrix obtained by performing row-column expansion on the weight matrix of the convolution kernel corresponding to the convolutional layer. For example, at least one row and at least one column are added to the weight matrix of the convolution kernel to form the weight matrix to be operated. The number of rows added to the weight matrix is typically the same as the number of columns added. That is, if the size of the weight matrix to be operated is m2 × m2, then m2 is an odd integer greater than m1, and m2 × m2 is a convolution kernel size supported by the data processor. In addition, the present disclosure typically adds the rows and columns at the outermost sides of the weight matrix of the convolution kernel, for example, to the left of the leftmost column, to the right of the rightmost column, above the uppermost row, or below the lowermost row of the weight matrix. The elements in the added rows and columns generally take a predetermined value that does not change the result of the convolution operation between the input features and the weight matrix. Each element in the rows and columns added to form the weight matrix to be operated has the same number of channels as the elements of the weight matrix.
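Expanding the input features amounts to appending rows and columns at an outer edge. The sketch below is one possible choice under stated assumptions: rows are added below and columns to the right, the added elements are zero, and the layout is height × width × channels. None of these choices is fixed by the text, and `expand_feature` is a hypothetical helper.

```python
import numpy as np

def expand_feature(feature, extra, fill=0.0):
    """Append `extra` rows below and `extra` columns to the right of an H x W x C feature."""
    return np.pad(feature, ((0, extra), (0, extra), (0, 0)),
                  mode="constant", constant_values=fill)

x = np.arange(4 * 4 * 2, dtype=float).reshape(4, 4, 2)   # n1 = n11 = 4, C = 2
x_expanded = expand_feature(x, extra=1)                  # n2 = n22 = 5, C unchanged
```

The original values stay in the top-left 4 × 4 block and the channel axis is untouched, so the added elements carry the same number of channels as the rest of the feature.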
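The weight matrix can be expanded in the same way. In this sketch the added elements are zero, which leaves every multiply-add unchanged; the set of supported kernel sizes, the bottom/right placement, and the helper `expand_weights` are assumptions made for illustration.

```python
import numpy as np

def expand_weights(weights, supported_sizes=(1, 3, 5, 7)):
    """Zero-expand an m1 x m1 x C weight matrix to the smallest supported odd size above m1."""
    m1 = weights.shape[0]
    m2 = min(s for s in supported_sizes if s % 2 == 1 and s > m1)
    extra = m2 - m1
    # Added rows go below and added columns to the right; the channel axis is not expanded.
    return np.pad(weights, ((0, extra), (0, extra), (0, 0)),
                  mode="constant", constant_values=0.0)

k = np.ones((4, 4, 2))          # even weight matrix, m1 = 4
k_expanded = expand_weights(k)  # m2 = 5
```

Choosing the smallest supported odd size keeps the number of wasted multiply-adds low, although a larger supported size would give the same numerical result.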
S202, performing convolution operation on the weight matrix to be operated and the features to be operated through the data processor to obtain a convolution operation result.
A data processor in the present disclosure may refer to a hardware unit having data operation capability, for example, the data processor may include but is not limited to: CPU, BPU, GPU or FPGA.
The convolution operation in the present disclosure may refer to an operation performed to realize feature extraction of a feature to be operated, for example, the data processor performs a multiply-add operation on a weight matrix to be operated and a feature to be operated, and the like. In addition, the step size of the convolution operation in this disclosure may be 1 or other values.
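The multiply-add operation described above can be sketched as follows. This is a minimal single-channel illustration in plain Python; the function name `conv2d` and the list-of-lists representation are assumptions made for the example and are not part of the disclosure. No feature filling is applied.

```python
def conv2d(feature, weights, stride=1):
    """Multiply-add convolution without feature filling (hypothetical helper).

    feature and weights are 2-D lists. The weight matrix slides over the
    feature with the given step size, and each output element is the sum of
    the element-wise products inside the current window.
    """
    kh, kw = len(weights), len(weights[0])
    out_h = (len(feature) - kh) // stride + 1
    out_w = (len(feature[0]) - kw) // stride + 1
    return [
        [
            sum(feature[i * stride + u][j * stride + v] * weights[u][v]
                for u in range(kh) for v in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


feature = [[1, 2, 3],
           [4, 5, 6],
           [7, 8, 9]]
weights = [[1, 0],
           [0, 1]]
result = conv2d(feature, weights)  # step size 1 gives a 2 x 2 result
```

With a step size of 1, a 3 × 3 feature and a 2 × 2 weight matrix yield a 2 × 2 result; a larger step size simply moves the window further between outputs.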
S203, obtaining the output feature of the convolutional layer according to the convolution operation result.
The convolution operation result in the present disclosure may refer to the result of a multiply-add operation, or the like, performed on the weight matrix to be operated and the feature to be operated. The output feature of the convolutional layer may refer to the convolution operation result obtained by performing the convolution operation on the weight matrix of the convolution kernel corresponding to the convolutional layer and the input feature of the convolutional layer. Although the weight matrix to be operated and the feature to be operated are both results of row and column expansion, the convolution operation result based on them may contain no redundant content, or may contain redundant content (for example, when the convolutional layer has a feature filling requirement). When the convolution operation result contains no redundant content, the present disclosure may use it directly as the output feature of the convolutional layer. When it contains redundant content, the present disclosure may remove the redundant content to obtain the output feature of the convolutional layer, that is, the convolution operation result based on the weight matrix of the convolution kernel corresponding to the convolutional layer and the input feature.
For a weight matrix with an even number of rows and columns (that is, an even convolution kernel), the present disclosure performs row and column expansion on the weight matrix of the convolutional layer to convert it into a weight matrix with an odd number of rows and columns (that is, an odd convolution kernel), so that a data processor supporting odd weight matrices can perform the convolution operation on the input feature; by also performing row and column expansion on the input feature of the convolutional layer, the output feature of the convolutional layer can be obtained from the result of the convolution operation performed by the data processor. Therefore, the technical solution provided by the present disclosure can use a data processor that supports odd convolution kernels to complete a convolution operation based on an even convolution kernel, which helps the data processor implement various types of convolution operations and enriches the ways in which convolution operations can be implemented.
In an alternative example, the feature to be operated and the weight matrix to be operated after row and column expansion are generally obtained as follows: at least one expansion row and at least one expansion column are added at the same orientation of the input feature and of the weight matrix. The same orientation may refer to the same position of the input feature and the weight matrix. That is, the same orientation of the input feature and the weight matrix may be the uppermost side of the input feature and the uppermost side of the weight matrix, the leftmost side of the input feature and the leftmost side of the weight matrix, the lowermost side of the input feature and the lowermost side of the weight matrix, or the rightmost side of the input feature and the rightmost side of the weight matrix.
By adding the expansion rows and expansion columns at the same orientation of the input feature and the weight matrix, the present disclosure makes the convolution operation result of the input feature and the weight matrix completely the same as the convolution operation result of the feature to be operated and the weight matrix to be operated, so that the output feature of the convolutional layer can be obtained conveniently and quickly.
In an alternative example, the present disclosure adds the same number of expansion rows and the same number of expansion columns at the same orientation of the input feature and the weight matrix. The number of expansion rows and the number of expansion columns added to the input feature and the weight matrix may be the same. For example, the present disclosure adds a rows of expansion rows at the same orientation of the input feature and the weight matrix, and adds b columns of expansion columns at the same orientation of the input feature and the weight matrix. The values of a and b may be the same. In addition, a and b should be as small as possible so that the data processor does not perform unnecessary multiply-add operations, which reduces the amount of computation in the convolution operation.
By adding the same number of expansion rows and the same number of expansion columns at the same orientation of the input feature and the weight matrix, the present disclosure makes the convolution operation result of the input feature and the weight matrix identical to the convolution operation result of the feature to be operated and the weight matrix to be operated, so that the output feature of the convolutional layer can be obtained conveniently and quickly.
In an alternative example, when the data processor does not support the feature filling requirement of the convolutional layer, the present disclosure should obtain the feature to be operated and the weight matrix to be operated after row and column expansion by adding the same number of expansion rows (for example, one expansion row) on the upper side of the uppermost row, and the same number of expansion columns (for example, one expansion column) on the left side of the leftmost column, of both the input feature and the weight matrix. The feature filling requirement of the convolutional layer may refer to padding in a neural network. Padding in a neural network is typically applied to the input feature of the convolutional layer, for example, by filling a rings of a predetermined feature value (for example, 0) around the outermost side of the input feature, so that the spatial resolution of the input feature changes from n1 × n1 to (n1+2a) × (n1+2a).
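The change in spatial resolution caused by padding can be illustrated with a short sketch. The helper name `pad` and the concrete values n1 = 7 and a = 1 are assumptions chosen for the example.

```python
def pad(feature, a, value=0):
    """Surround a 2-D list with `a` rings of `value` (hypothetical helper)."""
    width = len(feature[0]) + 2 * a
    top = [[value] * width for _ in range(a)]
    bottom = [[value] * width for _ in range(a)]
    body = [[value] * a + list(row) + [value] * a for row in feature]
    return top + body + bottom


n1, a = 7, 1
feature = [[1] * n1 for _ in range(n1)]   # n1 x n1 input feature
padded = pad(feature, a)                  # one ring of zeros around it
size = (len(padded), len(padded[0]))      # (n1 + 2a, n1 + 2a)
```

Here a 7 × 7 input feature with one ring of zeros becomes a 9 × 9 feature, matching (n1+2a) × (n1+2a).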
Optionally, assume that the size of the convolution kernel corresponding to the convolutional layer is m1 × m1 and that the feature filling supported by the data processor is limited, that is, the data processor supports a feature filling of 0 or of one half of (m1-1). If the feature filling requirement of the convolutional layer is an integer between 0 and one half of (m1-1), the data processor does not support the feature filling requirement of the convolutional layer. Under this assumption, the present disclosure obtains the feature to be operated and the weight matrix to be operated after row and column expansion by adding the same number of expansion rows on the upper side of the uppermost row, and the same number of expansion columns on the left side of the leftmost column, of both the input feature and the weight matrix. The data processor may then perform the convolution operation on the weight matrix to be operated and the feature to be operated using a feature filling that it supports (for example, one half of (m1-1)).
When the data processor does not support the feature filling requirement of the convolutional layer, the present disclosure adds the same number of expansion rows on the upper side of the uppermost row, and the same number of expansion columns on the left side of the leftmost column, of both the input feature and the weight matrix to obtain the feature to be operated and the weight matrix to be operated. As a result, the convolution operation result of the input feature and the weight matrix under the feature filling requirement of the convolutional layer appears in full within a predetermined region of the convolution operation result of the feature to be operated and the weight matrix to be operated, so that the output feature under the feature filling requirement of the convolutional layer can be obtained conveniently.
In an alternative example, some rows and some columns of the convolution operation result of the weight matrix to be operated and the feature to be operated are produced by the feature filling that the data processor applies to the feature to be operated, so the feature-filling-based convolution operation result of the weight matrix to be operated and the feature to be operated contains redundant rows and columns. In other words, only part of the rows and columns of that result constitute the feature-filling-based convolution operation result of the weight matrix and the input feature. The present disclosure may therefore extract that part of the rows and columns from the feature-filling-based convolution operation result of the weight matrix to be operated and the feature to be operated, and use the extracted part as the output feature of the convolutional layer.
By using the rows and columns extracted from the feature-filling-based convolution operation result of the weight matrix to be operated and the feature to be operated as the output feature of the convolutional layer, the present disclosure not only allows the data processor to implement a convolution operation with an even convolution kernel, but also allows it to implement a convolution operation with a feature filling requirement that the data processor does not support, which further enriches the ways in which convolution operations can be implemented.
In an optional example, values of elements in each expanded row and each expanded column in the weight matrix to be operated after row and column expansion obtained by the present disclosure may be all zero, and values of elements in each expanded row and each expanded column in the feature to be operated after row and column expansion may be all arbitrary values. The convolution operation result of the input feature and the weight matrix can completely appear in the convolution operation result of the weight matrix to be operated and the feature to be operated by setting the value of each element in each expansion row and each expansion column in the weight matrix to be operated to be zero and setting the value of each element in each expansion row and each expansion column in the feature to be operated to be any value (including zero), so that the output feature of the convolution layer can be conveniently and rapidly obtained.
In one optional example, the number of expansion rows in the present disclosure may be the difference between the number of rows of the weight matrix of a first convolution kernel supported by the data processor and the number of rows of the weight matrix of the convolution kernel corresponding to the convolutional layer. The number of expansion columns may be the difference between the number of columns of the weight matrix of the first convolution kernel supported by the data processor and the number of columns of the weight matrix of the convolution kernel corresponding to the convolutional layer. The weight matrix of the first convolution kernel is, among the weight matrices supported by the data processor, the one whose size is larger than, and closest to, the size of the convolution kernel corresponding to the convolutional layer.
For example, assume that the sizes of the convolution kernels supported by the data processor are a1 × a1, a2 × a2 and a3 × a3, where a1, a2 and a3 are all odd numbers, and that the size of the convolution kernel corresponding to the convolutional layer is a4 × a4. The values of a and b in the present disclosure may then be the smallest positive value among a1-a4, a2-a4 and a3-a4. In general, a and b may both take the value 1.
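The selection rule above can be sketched as a small function. The name `expansion_count` and the sample sizes are hypothetical; the function simply returns the smallest positive difference between a supported kernel size and the layer's kernel size.

```python
def expansion_count(supported_sizes, layer_size):
    """Number of expansion rows (and columns) to add (hypothetical helper).

    supported_sizes: kernel sizes the data processor supports, e.g. [3, 5, 7].
    layer_size: kernel size of the convolutional layer, e.g. 4 for 4 x 4.
    """
    differences = [s - layer_size for s in supported_sizes if s > layer_size]
    if not differences:
        raise ValueError("no supported kernel is larger than the layer's kernel")
    return min(differences)


a = expansion_count([3, 5, 7], 4)  # nearest larger supported kernel is 5 x 5
b = a                              # same count for rows and columns
```

For a 4 × 4 kernel and supported sizes 3, 5 and 7, the nearest larger supported kernel is 5 × 5, so one expansion row and one expansion column are added.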
The row number of the extension rows and the column number of the extension columns are determined by utilizing the row number and the column number of the weight matrix of the first convolution kernel supported by the data processor, so that the data processor is favorable for avoiding executing unnecessary convolution operation and storing unnecessary convolution operation results, and the data processor is favorable for saving calculation resources and cache resources.
FIG. 3 shows an example in which the present disclosure uses the row-and-column-expanded feature to be operated and weight matrix to be operated to implement the convolution operation of the convolutional layer.
In fig. 3, it is assumed that the size of the convolution kernel supported by the data processor includes 5 × 5. Assume that the spatial resolution of an input feature 300 of a convolutional layer of the present disclosure is 7 × 7, and the size of a weight matrix 301 of a convolutional kernel corresponding to the convolutional layer is 4 × 4. Assume again that the convolutional layer has no feature fill requirements.
Under the above assumption, the present disclosure may add a column of extension column to the rightmost side of the input feature 300 of fig. 3, and the value of each element in the extension column may be any value; adding a row of expansion lines at the lowest side of the input feature 300, wherein the values of all elements of the expansion lines can be any values; the present disclosure thus obtains features 302 to be computed with a spatial resolution of 8 x 8. Similarly, in the present disclosure, a column of extension column may be added to the rightmost side of the weight matrix 301 in fig. 3, and values of each element in the extension column are all zero; adding a row of expansion rows at the lowest side of the weight matrix 301, wherein the values of all elements in the expansion rows are zero; the present disclosure thus obtains a weight matrix 303 to be computed of size 5 x 5. It can be seen that the convolution operation (304 in fig. 3 represents a multiply-add operation) of the input feature 300 and the weight matrix 301 in the present disclosure is changed to a convolution operation of the feature to be operated 302 and the weight matrix to be operated 303.
The convolution result 306 of the feature to be operated 302 and the weight matrix to be operated 303 (i.e. the convolution result with step size 1) is identical to the convolution result 305 of the input feature 300 and the weight matrix 301 (also the convolution result with step size 1). Thus, the present disclosure may directly take the convolution operation result 306 as the output characteristic of the convolutional layer.
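The equality between results 305 and 306 can be checked numerically. The following is a minimal single-channel sketch in plain Python; the helper names and the random integer test data are assumptions made for the illustration.

```python
import random


def conv2d(feature, weights):
    """Multiply-add convolution with step size 1 and no feature filling."""
    kh, kw = len(weights), len(weights[0])
    out_h, out_w = len(feature) - kh + 1, len(feature[0]) - kw + 1
    return [[sum(feature[i + u][j + v] * weights[u][v]
                 for u in range(kh) for v in range(kw))
             for j in range(out_w)] for i in range(out_h)]


def expand_bottom_right(matrix, fill):
    """Append one column on the right and one row at the bottom.

    fill() is called once for every new element.
    """
    rows = [row + [fill()] for row in matrix]
    rows.append([fill() for _ in range(len(rows[0]))])
    return rows


random.seed(0)
feature_300 = [[random.randint(-9, 9) for _ in range(7)] for _ in range(7)]
weights_301 = [[random.randint(-9, 9) for _ in range(4)] for _ in range(4)]

# Feature to be operated 302: 8 x 8, expansion elements take arbitrary values.
feature_302 = expand_bottom_right(feature_300, lambda: random.randint(-9, 9))
# Weight matrix to be operated 303: 5 x 5, expansion elements are zero.
weights_303 = expand_bottom_right(weights_301, lambda: 0)

result_305 = conv2d(feature_300, weights_301)  # 4 x 4
result_306 = conv2d(feature_302, weights_303)  # also 4 x 4
same = result_305 == result_306
```

The zero row and column of the expanded weight matrix cancel every product that involves the arbitrary expansion elements of the feature, which is why the two 4 × 4 results coincide.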
FIG. 4 shows another example in which the present disclosure uses the row-and-column-expanded feature to be operated and weight matrix to be operated to implement the convolution operation of the convolutional layer.
In fig. 4, it is assumed that the size of the convolution kernel supported by the data processor includes 5 × 5. Assume that the spatial resolution of an input feature 400 of a convolutional layer of the present disclosure is 7 × 7, and the size of a weight matrix 401 of a convolutional kernel corresponding to the convolutional layer is 4 × 4. Assume again that the convolutional layer has no feature fill requirements.
Under the above assumption, the present disclosure may add a column of extension column to the rightmost side of the input feature 400 of fig. 4, and the value of each element in the extension column may be any value; adding a row of expansion lines at the top of the input feature 400, wherein the values of all elements of the expansion lines can be any values; the present disclosure thus obtains features to be computed 402 with a spatial resolution of 8 x 8. Similarly, in the present disclosure, a column of extension column may be added to the rightmost side of the weight matrix 401 in fig. 4, and values of elements in the extension column are all zero; adding a row of expansion rows at the top of the weight matrix 401, wherein values of all elements in the expansion rows are zero; the present disclosure thus obtains a weight matrix 403 to be computed of size 5 x 5. As can be seen, the convolution operation (404 in fig. 4 represents a multiply-add operation) of the input feature 400 and the weight matrix 401 in the present disclosure is changed to a convolution operation of the feature to be operated 402 and the weight matrix to be operated 403.
The convolution result 406 of the feature to be operated 402 and the weight matrix to be operated 403 (i.e. the convolution result with step size 1) is identical to the convolution result 405 of the input feature 400 and the weight matrix 401 (also the convolution result with step size 1). Thus, the present disclosure may directly take the convolution operation result 406 as the output characteristic of the convolutional layer.
FIG. 5 shows a further example in which the present disclosure uses the row-and-column-expanded feature to be operated and weight matrix to be operated to implement the convolution operation of the convolutional layer.
In fig. 5, it is assumed that the size of the convolution kernel supported by the data processor includes 5 × 5. Assume that the spatial resolution of an input feature 500 of a convolutional layer of the present disclosure is 7 × 7, and the size of a weight matrix 501 of a convolutional kernel corresponding to the convolutional layer is 4 × 4. Assume again that the convolutional layer has no feature fill requirements.
Under the above assumption, the present disclosure may add one extension column at the leftmost side of the input feature 500 of fig. 5, and each element in the extension column may take any value; it may add one extension row at the top of the input feature 500, and each element in the extension row may take any value; the present disclosure thus obtains the feature to be operated 502 with a spatial resolution of 8 × 8. Similarly, the present disclosure may add one extension column at the leftmost side of the weight matrix 501 in fig. 5, with all elements in the extension column set to zero, and add one extension row at the top of the weight matrix 501, with all elements in the extension row set to zero; the present disclosure thus obtains the weight matrix to be operated 503 of size 5 × 5. As can be seen, the convolution operation (504 in fig. 5 represents a multiply-add operation) of the input feature 500 and the weight matrix 501 is changed to a convolution operation of the feature to be operated 502 and the weight matrix to be operated 503.
The convolution result 506 of the feature to be operated 502 and the weight matrix to be operated 503 (i.e. the convolution result with step size 1) is identical to the convolution result 505 of the input feature 500 and the weight matrix 501 (also the convolution result with step size 1). Thus, the present disclosure may directly take convolution operation result 506 as an output characteristic of the convolutional layer.
FIG. 6 shows yet another example in which the present disclosure uses the row-and-column-expanded feature to be operated and weight matrix to be operated to implement the convolution operation of the convolutional layer.
In fig. 6, it is assumed that the size of the convolution kernel supported by the data processor includes 5 × 5. Assume that the spatial resolution of an input feature 600 of a convolutional layer of the present disclosure is 7 × 7, and the size of a weight matrix 601 of a convolutional kernel corresponding to the convolutional layer is 4 × 4. Assume again that the convolutional layer has no feature fill requirements.
Under the above assumption, the present disclosure may add a column of extension column to the leftmost side of the input feature 600 of fig. 6, and the value of each element in the extension column may be any value; adding a row of expansion lines at the lowest side of the input feature 600, wherein the values of all elements of the expansion lines can be any values; the present disclosure thus obtains a feature to be computed 602 with a spatial resolution of 8 x 8. Similarly, in the present disclosure, a column of extension columns may be added to the leftmost side of the weight matrix 601 in fig. 6, and values of elements in the extension columns are all zero; adding a row of expansion rows at the lowest side of the weight matrix 601, wherein the values of all elements in the expansion rows are zero; the present disclosure thus obtains a weight matrix 603 to be computed of size 5 x 5. As can be seen, the convolution operation (604 in fig. 6 represents a multiply-add operation) of the input feature 600 and the weight matrix 601 in the present disclosure is changed to a convolution operation of the feature to be operated 602 and the weight matrix to be operated 603.
The convolution result 606 of the feature to be operated 602 and the weight matrix to be operated 603 (i.e. the convolution result with step size 1) is identical to the convolution result 605 of the input feature 600 and the weight matrix 601 (also the convolution result with step size 1). Thus, the present disclosure may directly take the convolution operation result 606 as the output characteristic of the convolutional layer.
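The placements used in figs. 4 to 6 differ only in which corner receives the expansion row and column. The sketch below checks all three with one parameterized helper; the helper names, the `(top, left)` flags and the random test data are assumptions made for the illustration.

```python
import random


def conv2d(feature, weights):
    """Multiply-add convolution with step size 1 and no feature filling."""
    kh, kw = len(weights), len(weights[0])
    out_h, out_w = len(feature) - kh + 1, len(feature[0]) - kw + 1
    return [[sum(feature[i + u][j + v] * weights[u][v]
                 for u in range(kh) for v in range(kw))
             for j in range(out_w)] for i in range(out_h)]


def expand(matrix, top, left, fill):
    """Add one column (left or right) and one row (top or bottom)."""
    rows = [([fill()] + row) if left else (row + [fill()]) for row in matrix]
    new_row = [fill() for _ in range(len(rows[0]))]
    return ([new_row] + rows) if top else (rows + [new_row])


def arbitrary():
    return random.randint(-9, 9)


def zero():
    return 0


random.seed(1)
feature = [[arbitrary() for _ in range(7)] for _ in range(7)]
weights = [[arbitrary() for _ in range(4)] for _ in range(4)]
reference = conv2d(feature, weights)  # result of the unexpanded operands

# (top, left) placement of the expansion row and column in each figure.
placements = {"fig. 4": (True, False),   # top row, rightmost column
              "fig. 5": (True, True),    # top row, leftmost column
              "fig. 6": (False, True)}   # bottom row, leftmost column
matches = {}
for name, (top, left) in placements.items():
    feature_ext = expand(feature, top, left, arbitrary)  # 8 x 8, any values
    weights_ext = expand(weights, top, left, zero)       # 5 x 5, zeros
    matches[name] = conv2d(feature_ext, weights_ext) == reference
```

The check relies on the feature and the weight matrix being expanded at the same orientation, so the original weights stay aligned with the original feature elements in every window.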
FIG. 7 shows a further example in which the present disclosure uses the row-and-column-expanded feature to be operated and weight matrix to be operated to implement the convolution operation of the convolutional layer.
In fig. 7, it is assumed that the size of the convolution kernel supported by the data processor includes 5 × 5. Assume that the spatial resolution of the input features 700 of a convolutional layer of the present disclosure is 7 × 7, and the size of the weight matrix 701 of the convolutional kernel corresponding to the convolutional layer is 4 × 4. Assume again that the feature fill requirement of the convolutional layer is 1 and the feature fills supported by the data processor are limited, i.e., the feature fills supported by the data processor are 0 and 2.
Under the above assumption, the present disclosure may add one extension column at the leftmost side of the input feature 700 of fig. 7, and each element in the extension column may take any value; it may add one extension row at the top of the input feature 700, and each element in the extension row may take any value; the present disclosure thus obtains the feature to be operated 702 with a spatial resolution of 8 × 8. Similarly, the present disclosure may add one extension column at the leftmost side of the weight matrix 701 in fig. 7, with all elements in the extension column set to zero, and add one extension row at the top of the weight matrix 701, with all elements in the extension row set to zero; the present disclosure thus obtains the weight matrix to be operated 703 of size 5 × 5. As can be seen, the convolution operation (704 in fig. 7 represents a multiply-add operation) of the input feature 700 and the weight matrix 701 is changed to a convolution operation of the feature to be operated 702 and the weight matrix to be operated 703.
The convolution operation result 706 of the feature to be operated 702 and the weight matrix to be operated 703 (a convolution operation result with a step size of 1) differs from the convolution operation result 705 of the input feature 700 and the weight matrix 701 (also a convolution operation result with a step size of 1; the outermost ring of convolution operation result 705 arises from the feature filling requirement of the convolutional layer). Specifically, the spatial resolution of convolution operation result 706 is 8 × 8, and the spatial resolution of convolution operation result 705 is 6 × 6. The present disclosure may extract rows 2 to 7 and columns 2 to 7 from convolution operation result 706 and use the extracted content as the output feature of the convolutional layer. That is, the present disclosure may obtain the output feature of the convolutional layer by removing the outermost ring of convolution operation result 706.
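The extraction step of the fig. 7 example can be checked with a short sketch. The helper names and random test data are assumptions. One assumption goes beyond the text: the expansion row and column of the feature are filled with 0 here, because after expansion they occupy the position of the layer's own filling ring and therefore contribute to the first extracted row and column; the equality below is only checked for that choice.

```python
import random


def conv2d(feature, weights):
    """Multiply-add convolution with step size 1 and no feature filling."""
    kh, kw = len(weights), len(weights[0])
    out_h, out_w = len(feature) - kh + 1, len(feature[0]) - kw + 1
    return [[sum(feature[i + u][j + v] * weights[u][v]
                 for u in range(kh) for v in range(kw))
             for j in range(out_w)] for i in range(out_h)]


def pad(matrix, p):
    """Feature filling: surround the matrix with p rings of zeros."""
    width = len(matrix[0]) + 2 * p
    body = [[0] * p + row + [0] * p for row in matrix]
    return ([[0] * width for _ in range(p)] + body
            + [[0] * width for _ in range(p)])


def expand_top_left(matrix):
    """Add a zero row on top and a zero column on the left."""
    return [[0] * (len(matrix[0]) + 1)] + [[0] + row for row in matrix]


random.seed(2)
feature_700 = [[random.randint(-9, 9) for _ in range(7)] for _ in range(7)]
weights_701 = [[random.randint(-9, 9) for _ in range(4)] for _ in range(4)]

# Result 705: what the layer requires (feature filling 1, 4 x 4 kernel).
result_705 = conv2d(pad(feature_700, 1), weights_701)            # 6 x 6

# Result 706: what the data processor computes (filling 2, 5 x 5 kernel).
feature_702 = expand_top_left(feature_700)                       # 8 x 8
weights_703 = expand_top_left(weights_701)                       # 5 x 5
result_706 = conv2d(pad(feature_702, 2), weights_703)            # 8 x 8

# Keep rows 2 to 7 and columns 2 to 7 (1-based): drop the outermost ring.
output = [row[1:7] for row in result_706[1:7]]
```

In this configuration the extracted 6 × 6 block equals result 705, so the data processor's supported feature filling of 2 can stand in for the layer's requirement of 1.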
FIG. 8 is a flow chart of yet another embodiment of a method of the present disclosure for implementing convolution operations. The method shown in FIG. 8 includes S800, S801, S802, and S803. Each step is described below.
S800, obtaining the input feature of the convolutional layer and the weight matrix of the convolution kernel corresponding to the convolutional layer.
The convolutional layer in the present disclosure is typically one layer in a neural network. The input feature of the convolutional layer may include, but is not limited to, a feature map of an image, a feature vector of audio, and the like. The convolutional layer performs a convolution operation on its input feature to further extract features from it, and the extracted features form its output feature. The convolutional layer in the present disclosure has a convolution kernel. The size of the convolution kernel corresponding to the convolutional layer is the size of the weight matrix of the convolution kernel. For example, if the size of the convolution kernel is 5 × 5, the size of the weight matrix of the convolution kernel is 5 × 5. Each element in the weight matrix of the convolution kernel corresponding to a convolutional layer may refer to a weight with which a neuron in the convolutional layer is connected to a corresponding neuron in the next layer of the neural network.
The weight matrix of the convolution kernel in this disclosure is a weight matrix of m3 × m3, i.e., the size of the weight matrix is m3 × m3, where m3 is an odd number greater than zero. For example, m3 can be 3, 5, 7, or the like.
S801, under the condition that the data processor does not support the feature filling requirement of the convolutional layer, obtaining a weight matrix to be operated after row and column expansion according to the weight matrix.
The feature fill requirements in this disclosure are for the input features of the convolutional layer. The feature fill requirement indicates that feature fill is performed around the input features of the convolutional layer to increase the spatial resolution of the input features.
The to-be-operated weight matrix after row and column expansion in the present disclosure is a new weight matrix obtained by performing row and column expansion on the weight matrix of the convolution kernel corresponding to the convolutional layer. For example, at least one row and at least one column are added to the weight matrix of the convolution kernel, thereby forming the weight matrix to be operated. The number of rows added to the weight matrix of the convolution kernel is typically the same as the number of columns added to it. That is, if the size of the weight matrix to be operated is m4 × m4, m4 is an integer greater than m3; m4 may be even if the data processor supports even convolution kernels, whereas m4 should be odd if it does not. m4 × m4 is a convolution kernel size supported by the data processor. In addition, the present disclosure typically adds the rows and columns at the outermost sides of the weight matrix of the convolution kernel. The values of the elements in the rows and columns added at the outermost sides of the weight matrix are generally predetermined values chosen so that, as far as possible, the convolution operation result of the input feature and the weight matrix is not changed.
S802, performing convolution operation based on feature filling supported by the data processor on the weight matrix to be operated and the input features through the data processor to obtain a convolution operation result.
A data processor in the present disclosure may refer to a hardware unit having data operation capability, for example, the data processor may include but is not limited to: CPU, BPU, GPU or FPGA.
The convolution operation in the present disclosure may refer to an operation executed to extract features of features to be operated on the basis of feature filling, for example, a data processor performs feature filling processing on input features of a convolution layer to obtain features to be operated; then, the data processor executes multiply-add operation and the like on the feature to be operated and the weight matrix to be operated, so as to obtain a convolution operation result. The step size of the convolution operation in this disclosure may be 1 or other value.
S803, obtaining the output feature of the convolutional layer according to the convolution operation result.
The output feature of the convolutional layer in the present disclosure may refer to the convolution operation result obtained by performing, on the weight matrix of the convolution kernel corresponding to the convolutional layer and the input feature, a convolution operation that satisfies the feature filling requirement.
Since the data processor in the present disclosure performs the feature filling process on the input features of the convolutional layer, and the feature filling process does not meet the feature filling requirement of the convolutional layer, the convolution operation result based on the weight matrix to be calculated and the features to be calculated obtained by the present disclosure may include redundant contents. The redundant content in the convolution operation result can be determined according to the actual situation, and the disclosure does not limit this.
When the data processor does not support the feature filling requirement of the convolutional layer, the present disclosure performs row and column expansion on the weight matrix of the convolution kernel, so that the data processor can perform the convolution operation on the input feature of the convolutional layer using the expanded weight matrix and the feature filling that it supports, and the output feature of the convolutional layer can be obtained from the result of the convolution operation performed by the data processor. Therefore, the technical solution provided by the present disclosure can use a data processor that does not support the feature filling requirement of the convolutional layer to implement the convolution operation of a convolutional layer with that feature filling requirement, which helps the data processor implement various types of convolution operations and enriches the ways in which convolution operations can be implemented.
In an optional example, the present disclosure generally obtains the row-column expanded weight matrix to be operated as follows: the same number of expansion rows and expansion columns are added above the uppermost row and to the left of the leftmost column of the weight matrix of the convolution kernel, respectively. For example, where the weight matrix of the convolution kernel is of even size and the data processor supports weight matrices of odd size, the present disclosure may add one expansion row above the uppermost row of the weight matrix and one expansion column to the left of its leftmost column. As another example, where the weight matrix of the convolution kernel is of odd size and the data processor supports weight matrices of odd size, the present disclosure may add two expansion rows above the uppermost row of the weight matrix and two expansion columns to the left of its leftmost column. The number of expansion rows and expansion columns added is generally determined by the size of the convolution kernel corresponding to the convolutional layer, the actual feature filling requirement, the convolution kernel sizes supported by the data processor, and the feature filling supported by the data processor.
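The top-and-left expansion described above can be sketched with NumPy. This is only an illustrative sketch: the helper name `expand_top_left` and the sample 2 × 2 weight matrix are assumptions made for the example and do not come from the disclosure.

```python
import numpy as np

def expand_top_left(weight, extra):
    """Return `weight` with `extra` zero rows added above the top row and
    `extra` zero columns added to the left of the leftmost column.
    (Hypothetical helper, named for this illustration only.)"""
    # pad_width is ((rows_before, rows_after), (cols_before, cols_after))
    return np.pad(weight, ((extra, 0), (extra, 0)),
                  mode="constant", constant_values=0)

# An even-sized 2x2 weight matrix expanded to an odd-sized 3x3 one.
w = np.arange(1, 5).reshape(2, 2)
w_expanded = expand_top_left(w, 1)
print(w_expanded)
```

The original weights end up in the bottom-right corner of the expanded matrix, which is what lets the later steps locate the wanted output in a fixed region of the result.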
The present disclosure may set up a table in advance that maps the size of the convolution kernel corresponding to the convolutional layer, the actual feature filling requirement, and the convolution kernel sizes and feature filling supported by the data processor to the number of expansion rows and expansion columns, so that the number of expansion rows and expansion columns can be determined by table lookup.
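One possible shape for such a table is sketched below. The disclosure does not specify the table layout, so the key and field names (`Expansion`, `EXPANSION_TABLE`, `lookup_expansion`) are hypothetical; the single entry is taken from the fig. 9 example of this disclosure (5 × 5 kernel, required filling 1, processor kernel 7 × 7 with filling 3, two expansion rows and columns).

```python
from typing import NamedTuple

class Expansion(NamedTuple):
    """One table row (hypothetical layout, for illustration only)."""
    target_size: int        # kernel size supported by the data processor
    processor_padding: int  # feature filling supported by the data processor
    extra_rows_cols: int    # number of expansion rows (and columns) to add

# Key: (kernel size of the layer, feature filling required by the layer).
EXPANSION_TABLE = {
    (5, 1): Expansion(target_size=7, processor_padding=3, extra_rows_cols=2),
}

def lookup_expansion(kernel_size, required_padding):
    """Return the precomputed expansion for a layer; raises KeyError if absent."""
    return EXPANSION_TABLE[(kernel_size, required_padding)]

entry = lookup_expansion(5, 1)
print(entry.extra_rows_cols)
```

A real table would hold one entry per combination the target data processor needs to handle; how those entries are derived is left open here, as it is in the text.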
By adding the expansion rows above the uppermost row and the expansion columns to the left of the leftmost column of the weight matrix, the present disclosure ensures that the convolution operation result of the input feature and the weight matrix under the feature filling requirement appears in full within a predetermined region of the convolution operation result of the input feature and the weight matrix to be operated under the feature filling supported by the data processor, so that the output feature of the convolutional layer can be obtained conveniently and quickly.
In an optional example, the values of all elements in each expansion row and each expansion column of the row-column expanded weight matrix to be operated may be zero. Setting these elements to zero ensures that the convolution operation result of the input feature and the weight matrix under the feature filling requirement appears in full within the convolution operation result of the input feature and the weight matrix to be operated under the feature filling supported by the data processor, so that the output feature of the convolutional layer can be obtained conveniently and quickly.
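Why zero-valued expansion rows and columns preserve the result can be seen from a short index calculation. The notation is an assumption introduced for this illustration and is not from the disclosure: W is the original m × m weight matrix, W' is the (m + e) × (m + e) matrix obtained by adding e zero rows on top and e zero columns on the left, X̃ is the input feature after whatever feature filling the data processor applies, indices are zero-based, and the stride is 1.

```latex
W'[a', b'] =
\begin{cases}
  W[a' - e,\; b' - e], & a' \ge e \ \text{and}\ b' \ge e,\\
  0, & \text{otherwise,}
\end{cases}
\qquad
\sum_{a'=0}^{m+e-1} \sum_{b'=0}^{m+e-1} W'[a', b']\, \tilde{X}[i + a',\; j + b']
  \;=\;
\sum_{a=0}^{m-1} \sum_{b=0}^{m-1} W[a, b]\, \tilde{X}[i + a + e,\; j + b + e].
```

In other words, the expanded matrix computes the original kernel's response on a window shifted by e rows and e columns. With e = 2 and a processor filling of 3, as in the fig. 9 example, the shift leaves an effective filling of 3 − 2 = 1 on the top and left, which matches the required filling of 1.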
An example in which the present disclosure uses the row-column expanded weight matrix to be operated to implement the convolution operation of the convolutional layer is shown in fig. 9.
In fig. 9, it is assumed that the sizes of convolution kernels supported by the data processor include 5 × 5 and 7 × 7. Assume that the spatial resolution of an input feature 900 of a convolutional layer of the present disclosure is 7 × 7, and the size of a weight matrix 901 of a convolutional kernel corresponding to the convolutional layer is 5 × 5. Assume again that the feature fill requirement of the convolutional layer is 1 and the feature fills supported by the data processor are limited, i.e., the feature fills supported by the data processor are 0 and 2, or 0 and 3.
Under the above assumed conditions, the present disclosure may add two expansion columns to the leftmost side of the weight matrix 901 of fig. 9, where values of elements in the two expansion columns are all zero; adding two rows of expansion rows at the top of the weight matrix 901, wherein the values of all elements in the two rows of expansion rows are zero; the present disclosure thus obtains a weight matrix 903 to be computed of size 7 x 7. It can be seen that the convolution operation (904 in fig. 9 represents a multiply-add operation) of the input feature 900 and the weight matrix 901 in the present disclosure is changed to a convolution operation of the input feature 900 and the weight matrix 903 to be calculated.
With a stride of 1, the convolution operation result 906 of the input feature 900 and the weight matrix 903 to be operated, computed with the feature filling of 3 supported by the data processor, differs from the convolution operation result 905 of the input feature 900 and the weight matrix 901, computed with the required feature filling of 1. Specifically, the spatial resolution of convolution operation result 905 is 5 × 5 (7 + 2 × 1 − 5 + 1 = 5), and the spatial resolution of convolution operation result 906 is 7 × 7 (7 + 2 × 3 − 7 + 1 = 7). The present disclosure may extract the content from row 1, column 1 through row 5, column 5 of convolution operation result 906 and take the extracted content as the output feature of the convolutional layer. That is, after removing the two rightmost columns and the two bottommost rows of convolution operation result 906, the present disclosure obtains the output feature of the convolutional layer, namely convolution operation result 905. The outermost ring of convolution operation result 905 arises from the feature filling requirement of the convolutional layer.
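The fig. 9 example can be checked numerically. The following is a minimal sketch, assuming that "convolution" means the stride-1 cross-correlation commonly used in convolutional layers and that feature filling means symmetric zero padding; the helper `conv2d` and the random test data are assumptions made for the illustration, and variable names merely echo the reference numerals of fig. 9.

```python
import numpy as np

def conv2d(x, w, pad):
    """Stride-1 cross-correlation with `pad` zeros added on every side of x."""
    xp = np.pad(x, pad, mode="constant")
    kh, kw = w.shape
    out = np.empty((xp.shape[0] - kh + 1, xp.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * w)
    return out

rng = np.random.default_rng(0)
feature_900 = rng.standard_normal((7, 7))   # 7x7 input feature
weight_901 = rng.standard_normal((5, 5))    # 5x5 weight matrix

# Two zero rows on top and two zero columns on the left give the 7x7 matrix 903.
weight_903 = np.pad(weight_901, ((2, 0), (2, 0)), mode="constant")

result_905 = conv2d(feature_900, weight_901, pad=1)  # required filling of 1
result_906 = conv2d(feature_900, weight_903, pad=3)  # supported filling of 3

# Keep rows 1-5 and columns 1-5 (drop the two bottom rows and two right columns).
output_feature = result_906[:5, :5]
print(result_905.shape, result_906.shape)
print(np.allclose(output_feature, result_905))
```

The comparison uses `np.allclose` rather than exact equality because the two computations sum their products in a different order.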
Exemplary devices
Fig. 10 is a schematic structural diagram of an embodiment of an apparatus for implementing convolution operations according to the present disclosure. The apparatus of this embodiment may be used to implement the method embodiments shown in fig. 3-7 of the present disclosure. The apparatus shown in fig. 10 includes: a first obtaining module 1000, a second obtaining module 1001, a first obtaining operation result module 1002, and a first obtaining output characteristic module 1003.
The first obtaining module 1000 is configured to obtain an input feature of a convolutional layer and a weight matrix of a convolutional kernel corresponding to the convolutional layer. The spatial resolution of the input features is n1 × n11, the weight matrix is m1 × m1, n1 and n11 are positive integers, and m1 is a non-zero even number.
The second obtaining module 1001 is configured to obtain the feature to be operated and the weight matrix to be operated after the row-column expansion according to the input feature and the weight matrix obtained by the first obtaining module 1000. The spatial resolution of the feature to be operated is n2 × n22, the weight matrix to be operated is a weight matrix of m2 × m2, n2 is an integer larger than n1, n22 is an integer larger than n11, m2 is an odd number larger than m1, and m2 × m2 is the size of the weight matrix supported by the data processor.
Optionally, the second obtaining module 1001 may obtain the input feature to be operated and the weight matrix to be operated after the row and column expansion by adding at least one expansion row and at least one expansion column respectively in the same direction of the input feature and the weight matrix.
Optionally, the second obtaining module 1001 may obtain the input feature to be operated and the weight matrix to be operated after the row and column expansion by adding the same number of expansion rows and the same number of expansion columns to the same direction of the input feature and the weight matrix, respectively.
Optionally, the number of expansion rows may be the difference between the number of rows of the weight matrix of a first convolution kernel supported by the data processor and the number of rows of the weight matrix of the convolution kernel corresponding to the convolutional layer. The number of expansion columns may be the difference between the number of columns of the weight matrix of the first convolution kernel supported by the data processor and the number of columns of the weight matrix of the convolution kernel corresponding to the convolutional layer. The weight matrix of the first convolution kernel is, among the weight matrices supported by the data processor, the one whose size is larger than and closest to the size of the weight matrix of the convolution kernel corresponding to the convolutional layer.
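The selection rule for the first convolution kernel can be sketched as follows, assuming square weight matrices described by a single side length. The function names and the list of supported sizes are hypothetical and serve only to illustrate the rule.

```python
def first_kernel_size(layer_size, supported_sizes):
    """Smallest supported kernel size that is strictly larger than layer_size."""
    larger = [s for s in supported_sizes if s > layer_size]
    if not larger:
        raise ValueError("no supported kernel size is larger than the layer's kernel")
    return min(larger)

def expansion_count(layer_size, supported_sizes):
    """Number of expansion rows (and of expansion columns) to add."""
    return first_kernel_size(layer_size, supported_sizes) - layer_size

# Hypothetical data processor that supports 3x3, 5x5 and 7x7 kernels.
supported = [3, 5, 7]
print(expansion_count(2, supported))  # 2x2 kernel -> 3x3
print(expansion_count(4, supported))  # 4x4 kernel -> 5x5
```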
Optionally, under the condition that the data processor does not support the feature filling requirement of the convolutional layer, the second obtaining module 1001 may obtain the input features to be operated and the weight matrix to be operated after row and column expansion by adding the same number of expansion rows and the same number of expansion columns to the upper side of the uppermost row and the left side of the leftmost column of the input features and the weight matrix, respectively.
Optionally, the second obtaining module 1001 may set values of each element in an extension row and an extension column in the weight matrix to be calculated to be zero. The second obtaining module 1001 may set values of each element in the extension row and the extension column in the feature to be operated to arbitrary values.
The first obtaining operation result module 1002 is configured to perform, by using the data processor, a convolution operation on the weight matrix to be operated and the feature to be operated, which are obtained by the second obtaining module 1001, to obtain a convolution operation result.
Optionally, under the condition that the data processor does not support the feature filling requirement of the convolutional layer, the first obtaining operation result module 1002 is configured to perform the convolution operation on the weight matrix to be operated and the feature to be operated using the feature filling supported by the data processor, so as to obtain a convolution operation result.
The first obtaining output characteristic module 1003 is configured to obtain an output characteristic of the convolutional layer according to the convolution operation result obtained by the first obtaining operation result module 1002.
Optionally, under the condition that the data processor does not support the feature filling requirement of the convolutional layer, the first obtaining output characteristic module 1003 may obtain the output feature of the convolutional layer according to a part of the rows and a part of the columns in the convolution operation result.
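The pipeline formed by these modules can be illustrated for an even-sized kernel. The sketch below assumes the simplest case, with no feature filling and a stride of 1, and uses a 2 × 2 weight matrix expanded to 3 × 3. The helper `conv2d_valid`, the random data, and the fill value 123.0 are assumptions chosen for the example; the fill value only demonstrates that the expansion elements of the feature may be arbitrary.

```python
import numpy as np

def conv2d_valid(x, w):
    """Stride-1 cross-correlation without any padding."""
    kh, kw = w.shape
    out = np.empty((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * w)
    return out

rng = np.random.default_rng(1)
x = rng.standard_normal((6, 8))   # input feature, n1 x n11 = 6 x 8
w = rng.standard_normal((2, 2))   # weight matrix, m1 = 2 (even)

e = 1  # m2 - m1, with m2 = 3 assumed to be supported by the data processor
# Expansion elements of the weight matrix are zero.
w_exp = np.pad(w, ((e, 0), (e, 0)), mode="constant", constant_values=0.0)
# Expansion elements of the feature are arbitrary; 123.0 is used here.
x_exp = np.pad(x, ((e, 0), (e, 0)), mode="constant", constant_values=123.0)

reference = conv2d_valid(x, w)        # what the layer is supposed to compute
result = conv2d_valid(x_exp, w_exp)   # what the data processor computes
print(reference.shape, result.shape)
print(np.allclose(result, reference))
```

Because the feature and the weight matrix are expanded in the same direction, every arbitrary feature element is only ever multiplied by a zero weight, so in this no-filling case the result needs no cropping.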
Fig. 11 is a schematic structural diagram of an embodiment of an apparatus for implementing convolution operations according to the present disclosure. The apparatus of this embodiment may be used to implement the method embodiments shown in fig. 8-9 of the present disclosure. The apparatus shown in fig. 11 includes: a third obtaining module 1100, a fourth obtaining module 1101, a second obtaining operation result module 1102 and a second obtaining output characteristic module 1103.
The third obtaining module 1100 is configured to obtain input features of the convolutional layer and a weight matrix of a convolutional kernel corresponding to the convolutional layer. The weight matrix is m3 × m3, and m3 is an odd number.
The fourth obtaining module 1101 is configured to, under the condition that the data processor does not support the feature filling requirement of the convolutional layer, obtain a weight matrix to be operated after row-column expansion according to the weight matrix obtained by the third obtaining module 1100. The weight matrix to be calculated is a weight matrix of m4 × m4, and m4 × m4 is the size of the weight matrix supported by the data processor.
Optionally, the fourth obtaining module 1101 may obtain the weight matrix to be calculated by adding the same number of extension rows and extension columns to the upper side of the uppermost row and the left side of the leftmost column of the weight matrix respectively.
Optionally, the fourth obtaining module 1101 may set values of each element in an extension row and an extension column in the weight matrix to be calculated to be zero.
The second obtaining operation result module 1102 is configured to perform, by using the data processor, a convolution operation based on feature filling supported by the data processor on the weight matrix to be operated obtained by the fourth obtaining module 1101 and the input feature obtained by the third obtaining module 1100, so as to obtain a convolution operation result.
The second obtaining output characteristic module 1103 is configured to obtain an output characteristic of the convolutional layer according to the convolution operation result obtained by the second obtaining operation result module 1102.
Optionally, the second obtaining output characteristic module 1103 may obtain the output characteristic of the convolutional layer according to a partial row and a partial column in the convolutional operation result obtained by the second obtaining operation result module 1102.
Exemplary electronic device
An electronic device according to an embodiment of the present disclosure is described below with reference to fig. 12. FIG. 12 shows a block diagram of an electronic device in accordance with an embodiment of the disclosure. As shown in fig. 12, the electronic device 121 includes one or more processors 1211 and a memory 1212.
The processor 1211 may be a Central Processing Unit (CPU) or other form of processing unit having data processing capabilities and/or instruction execution capabilities, and may control other components in the electronic device 121 to perform desired functions.
The memory 1212 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), a hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium and executed by the processor 1211 to implement the methods for implementing convolution operations of the various embodiments of the present disclosure described above and/or other desired functions. Various contents such as an input signal, a signal component, and a noise component may also be stored in the computer-readable storage medium.
In one example, the electronic device 121 may further include: input devices 1213 and output devices 1214, which may be interconnected via a bus system and/or other type of connection mechanism (not shown). The input devices 1213 may include, for example, a keyboard, a mouse, and the like. The output device 1214 can output various information to the outside. The output devices 1214 may include, for example, a display, speakers, a printer, and a communication network and remote output devices connected thereto, among others.
Of course, for simplicity, only some of the components of the electronic device 121 relevant to the present disclosure are shown in fig. 12, and components such as buses, input/output interfaces, and the like are omitted. In addition, electronic device 121 may include any other suitable components depending on the particular application.
Exemplary computer program product and computer-readable storage Medium
In addition to the above-described methods and apparatus, embodiments of the present disclosure may also be a computer program product comprising computer program instructions that, when executed by a processor, cause the processor to perform the steps in the methods for implementing convolution operations according to various embodiments of the present disclosure described in the "exemplary methods" section above of this specification.
Program code for carrying out operations of embodiments of the present disclosure may be written for the computer program product in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, embodiments of the present disclosure may also be a computer-readable storage medium having stored thereon computer program instructions that, when executed by a processor, cause the processor to perform steps in a method for implementing convolution operations according to various embodiments of the present disclosure described in the "exemplary methods" section above of this specification.
The computer-readable storage medium may take any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium may include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The foregoing describes the general principles of the present disclosure in conjunction with specific embodiments, however, it is noted that the advantages, effects, etc. mentioned in the present disclosure are merely examples and are not limiting, and they should not be considered essential to the various embodiments of the present disclosure. Furthermore, the foregoing disclosure of specific details is for the purpose of illustration and description and is not intended to be limiting, since the disclosure is not intended to be limited to the specific details so described.
In the present specification, the embodiments are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same or similar parts in the embodiments are referred to each other. For the system embodiment, since it basically corresponds to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The block diagrams of devices, apparatuses, and systems referred to in this disclosure are given only as illustrative examples and are not intended to require or imply that the connections, arrangements, or configurations must be made in the manner shown in the block diagrams. These devices, apparatuses, and systems may be connected, arranged, and configured in any manner, as will be appreciated by those skilled in the art. Words such as "including," "comprising," "having," and the like are open-ended words that mean "including, but not limited to," and are used interchangeably therewith. The words "or" and "and" as used herein mean, and are used interchangeably with, the term "and/or," unless the context clearly dictates otherwise. The phrase "such as" is used herein to mean, and is used interchangeably with, the phrase "such as but not limited to".
The methods and apparatus of the present disclosure may be implemented in a number of ways. For example, the methods and apparatus of the present disclosure may be implemented by software, hardware, firmware, or any combination of software, hardware, and firmware. The above-described order for the steps of the method is for illustration only, and the steps of the method of the present disclosure are not limited to the order specifically described above unless specifically stated otherwise. Further, in some embodiments, the present disclosure may also be embodied as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the methods according to the present disclosure. Thus, the present disclosure also covers a recording medium storing a program for executing the method according to the present disclosure.
It is also noted that in the devices, apparatuses, and methods of the present disclosure, each component or step can be decomposed and/or recombined. These decompositions and/or recombinations are to be considered equivalents of the present disclosure.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present disclosure. Various modifications to these aspects, and the like, will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the disclosure. Thus, the present disclosure is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description has been presented for purposes of illustration and description. Furthermore, the description is not intended to limit embodiments of the disclosure to the form disclosed herein. While a number of example aspects and embodiments have been discussed above, those of skill in the art will recognize certain variations, modifications, alterations, additions and sub-combinations thereof.

Claims (14)

1. A method for implementing a convolution operation, comprising:
acquiring input features of a convolutional layer and a weight matrix of a convolution kernel corresponding to the convolutional layer, wherein the spatial resolution of the input features is n1 × n11, the weight matrix is m1 × m1, n1 and n11 are positive integers, and m1 is a non-zero even number;
obtaining a row-column expanded feature to be operated and a weight matrix to be operated according to the input feature and the weight matrix, wherein the spatial resolution of the feature to be operated is n2 × n22, the weight matrix to be operated is a weight matrix of m2 × m2, n2 is an integer larger than n1, n22 is an integer larger than n11, m2 is an odd number larger than m1, and m2 × m2 is the size of the weight matrix supported by the data processor;
performing convolution operation on the weight matrix to be operated and the feature to be operated through the data processor to obtain a convolution operation result;
and obtaining the output characteristics of the convolutional layer according to the convolution operation result.
2. The method according to claim 1, wherein the obtaining the row-column expanded feature to be operated and the weight matrix to be operated according to the input feature and the weight matrix comprises:
and respectively adding at least one expansion row and at least one expansion column in the same direction of the input features and the weight matrix to obtain the input features to be operated and the weight matrix to be operated after the row and column expansion.
3. The method according to claim 2, wherein the obtaining the input features to be operated and the weight matrix to be operated after the row and column expansion by adding at least one expansion row and at least one expansion column respectively to the same orientation of the input features and the weight matrix comprises:
and respectively adding the same number of expansion rows and the same number of expansion columns in the same direction of the input features and the weight matrix to obtain the input features to be operated and the weight matrix to be operated after the rows and the columns are expanded.
4. The method according to claim 3, wherein the obtaining the input features to be operated and the weight matrix to be operated after the row and column expansion by adding the same number of expansion rows and the same number of expansion columns to the same orientation of the input features and the weight matrix respectively comprises:
under the condition that the data processor does not support the feature filling requirement of the convolutional layer, respectively adding the same number of expansion rows and the same number of expansion columns to the upper side of the uppermost row and the left side of the leftmost column of the input feature and weight matrix to obtain the input feature to be operated and the weight matrix to be operated after row-column expansion;
and the performing, by the data processor, a convolution operation on the weight matrix to be operated and the feature to be operated includes:
and performing convolution operation on the weight matrix to be operated and the feature to be operated by the data processor based on the feature filling supported by the data processor.
5. The method of claim 4, wherein said obtaining output characteristics of the convolutional layer from the result of the convolution operation comprises:
and obtaining the output characteristics of the convolutional layer according to partial rows and partial columns in the convolution operation result.
6. The method of any of claims 2 to 5, wherein:
values of all elements in the expansion row and the expansion column in the weight matrix to be operated are zero;
and the values of all elements in the expansion row and the expansion column in the feature to be operated are all arbitrary values.
7. The method of any of claims 2 to 6, wherein:
the row number of the extended row is the difference value between the row number of the weight matrix of the first convolution kernel supported by the data processor and the row number of the weight matrix of the convolution kernel corresponding to the convolution layer;
the number of columns of the extended columns is the difference value between the number of columns of the weight matrix of the first convolution kernel supported by the data processor and the number of columns of the weight matrix of the convolution kernel corresponding to the convolution layer;
wherein the weight matrix of the first convolution kernel is, among the weight matrices supported by the data processor, the weight matrix whose size is larger than and closest to the size of the weight matrix of the convolution kernel corresponding to the convolutional layer.
8. A method for implementing a convolution operation, comprising:
acquiring input features of a convolutional layer and a weight matrix of a convolutional kernel corresponding to the convolutional layer, wherein the weight matrix is a weight matrix of m3 × m3, and m3 is an odd number;
under the condition that the data processor does not support the feature filling requirement of the convolutional layer, obtaining a weight matrix to be operated after row-column expansion according to the weight matrix, wherein the weight matrix to be operated is a weight matrix of m4 × m4, and m4 × m4 is the size of the weight matrix supported by the data processor;
performing, by the data processor, convolution operation based on feature filling supported by the data processor on the weight matrix to be operated and the input features to obtain a convolution operation result;
and obtaining the output characteristics of the convolutional layer according to the convolution operation result.
9. The method according to claim 8, wherein the obtaining the row-column expanded weight matrix to be operated according to the weight matrix comprises:
and respectively adding the same number of expansion rows and expansion columns on the upper side of the uppermost row and the left side of the leftmost column of the weight matrix to obtain the weight matrix to be operated.
10. The method of claim 8 or 9, wherein:
and the values of all elements in the expansion row and the expansion column in the weight matrix to be operated are zero.
11. An apparatus for implementing a convolution operation, comprising:
a first obtaining module, configured to obtain an input feature of a convolutional layer and a weight matrix of a convolutional kernel corresponding to the convolutional layer, where a spatial resolution of the input feature is n1 × n11, the weight matrix is a weight matrix of m1 × m1, n1 and n11 are positive integers, and m1 is a non-zero even number;
a second obtaining module, configured to obtain a row-column expanded feature to be operated and a weight matrix to be operated according to the input feature and the weight matrix obtained by the first obtaining module, where a spatial resolution of the feature to be operated is n2 × n22, the weight matrix to be operated is a weight matrix of m2 × m2, n2 is an integer greater than n1, n22 is an integer greater than n11, m2 is an odd number greater than m1, and m2 × m2 is a size of the weight matrix supported by the data processor;
a first obtaining operation result module, configured to perform, by the data processor, a convolution operation on the weight matrix to be operated and the feature to be operated that are obtained by the second obtaining module, to obtain a convolution operation result;
and a first obtaining output characteristic module, configured to obtain the output feature of the convolutional layer according to the convolution operation result obtained by the first obtaining operation result module.
12. An apparatus for implementing a convolution operation, comprising:
a third obtaining module, configured to obtain an input feature of a convolutional layer and a weight matrix of a convolutional kernel corresponding to the convolutional layer, where the weight matrix is a weight matrix of m3 × m3, and m3 is an odd number;
a fourth obtaining module, configured to, when the data processor does not support the feature filling requirement of the convolutional layer, obtain, according to the weight matrix obtained by the third obtaining module, a to-be-computed weight matrix after row-column expansion, where the to-be-computed weight matrix is a weight matrix of m4 × m4, and m4 × m4 is a size of the weight matrix supported by the data processor;
a second obtaining operation result module, configured to perform, by the data processor, a convolution operation based on the feature filling supported by the data processor on the weight matrix to be operated obtained by the fourth obtaining module and the input features obtained by the third obtaining module, to obtain a convolution operation result;
and a second obtaining output characteristic module, configured to obtain the output feature of the convolutional layer according to the convolution operation result obtained by the second obtaining operation result module.
13. A computer-readable storage medium, the storage medium storing a computer program for performing the method of any of the preceding claims 1-10.
14. An electronic device, comprising:
a processor;
a memory for storing instructions executable by the processor;
wherein the processor is configured to read the executable instructions from the memory and execute them to implement the method of any one of claims 1-10.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911066229.4A CN112766474B (en) 2019-11-04 2019-11-04 Method, device, medium and electronic equipment for realizing convolution operation

Publications (2)

Publication Number Publication Date
CN112766474A (en) 2021-05-07
CN112766474B (en) 2024-03-22

Family

ID=75692286

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911066229.4A Active CN112766474B (en) 2019-11-04 2019-11-04 Method, device, medium and electronic equipment for realizing convolution operation

Country Status (1)

Country Link
CN (1) CN112766474B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109542512A (en) * 2018-11-06 2019-03-29 腾讯科技(深圳)有限公司 A kind of data processing method, device and storage medium
WO2019119301A1 (en) * 2017-12-20 2019-06-27 华为技术有限公司 Method and device for determining feature image in convolutional neural network model
CN110135556A (en) * 2019-04-04 2019-08-16 平安科技(深圳)有限公司 Neural network accelerated method, device, computer equipment and storage medium based on systolic arrays
KR102038390B1 (en) * 2018-07-02 2019-10-31 한양대학교 산학협력단 Artificial neural network module and scheduling method thereof for highly effective parallel processing

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
JEFFERY: "Why are convolution kernels in CNNs generally odd × odd rather than even × even", pages 1 - 3, Retrieved from the Internet <URL:https://www.zhihu.com/question/51603070/answer/253873295> *
SHUANG WU et al.: "Convolution with even-sized kernels and symmetric padding", COMPUTER VISION AND PATTERN RECOGNITION, pages 2 - 8 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant