CN112132274B - Feature map full-connection convolution method and device, readable storage medium and electronic equipment - Google Patents

Info

Publication number
CN112132274B
Authority
CN
China
Prior art keywords
convolution kernel
feature map
convolution
size
weight values
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011006954.5A
Other languages
Chinese (zh)
Other versions
CN112132274A (en)
Inventor
李德林
李建军
王振江
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Horizon Shanghai Artificial Intelligence Technology Co Ltd
Original Assignee
Horizon Shanghai Artificial Intelligence Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Horizon Shanghai Artificial Intelligence Technology Co Ltd filed Critical Horizon Shanghai Artificial Intelligence Technology Co Ltd
Priority to CN202011006954.5A
Publication of CN112132274A
Application granted
Publication of CN112132274B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/15 Correlation function computation including computation of convolution operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Neurology (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Complex Calculations (AREA)

Abstract

The present disclosure discloses a feature map full-connection convolution method, comprising: determining a plurality of weight values corresponding to the feature map in a first convolution kernel based on the feature map and the first convolution kernel; determining a second convolution kernel based on the plurality of weight values in the first convolution kernel and the correspondence between the plurality of weight values and the feature map; determining a plurality of weight values to be supplemented for the second convolution kernel and the positions of the weight values to be supplemented based on a predetermined size and the size of the second convolution kernel; determining a third convolution kernel of the predetermined size based on the second convolution kernel, the weight values to be supplemented, and their positions; and performing full-connection convolution calculation on the feature map using the third convolution kernel. With this technical solution, full-connection convolution of the feature map can be realized without adjusting or padding the feature map, which saves the computing resources that adjustment and padding would require; at the same time, because the feature map need not be padded, it keeps its small size, saving the memory space it occupies.

Description

Feature map full-connection convolution method and device, readable storage medium and electronic equipment
Technical Field
The disclosure relates to the technical field of neural network computing, in particular to a feature map full-connection convolution method and device, a readable storage medium and electronic equipment.
Background
In the design of neural network model structures, such as CNN convolutional neural network model structures, fully-connected convolutions typically occur in the last few layers of the network. During deployment, since the instruction set architecture of a dedicated hardware platform does not support full connection, ordinary convolution must be used to simulate the fully-connected calculation, and ordinary convolution supports only convolution kernels of odd sizes, such as 3×3, 5×5, and 7×7. When the network structure needs to fully connect feature maps of even size, the feature maps must be convolved with a convolution kernel whose size is odd and larger than that of the feature map. Because the instruction set architecture of the hardware platform restricts how feature maps may be padded, specific rows or columns of the feature map must be padded, so additional instructions are needed to adjust the data arrangement, and the memory space occupied by the feature map increases.
Disclosure of Invention
The present disclosure has been made in order to solve the above technical problems. Embodiments of the disclosure provide a feature map full-connection convolution method and apparatus, a readable storage medium, and an electronic device, in which the effective weight values in a third convolution kernel are placed at positions corresponding to the feature map, so that the third convolution kernel performs full-connection convolution on the feature map directly, reducing both the amount of calculation and the memory footprint.
According to one aspect of the present disclosure, there is provided a feature map full-connection convolution method, including:
determining a plurality of weight values corresponding to the feature map in a first convolution kernel based on the feature map and the first convolution kernel;
determining a second convolution kernel based on the plurality of weight values in the first convolution kernel and the correspondence between the plurality of weight values and the feature map;
determining a plurality of weight values to be supplemented for the second convolution kernel and the positions of the weight values to be supplemented in the second convolution kernel based on a predetermined size and the size of the second convolution kernel;
determining a third convolution kernel of the predetermined size based on the second convolution kernel, the weight values to be supplemented, and the positions of the weight values to be supplemented;
and performing full-connection convolution calculation on the feature map using the third convolution kernel.
According to a second aspect of the present disclosure, there is provided a feature map full-connection convolution apparatus, comprising:
a weight value determining module, configured to determine a plurality of weight values corresponding to the feature map in a first convolution kernel based on the feature map and the first convolution kernel;
a second convolution kernel determining module, configured to determine a second convolution kernel based on the plurality of weight values in the first convolution kernel and the correspondence between the plurality of weight values and the feature map;
a weight value to be supplemented determining module, configured to determine a plurality of weight values to be supplemented for the second convolution kernel and the corresponding positions of the weight values to be supplemented based on a predetermined size and the size of the second convolution kernel;
a third convolution kernel determining module, configured to form a third convolution kernel of the predetermined size based on the second convolution kernel, the weight values to be supplemented, and the positions of the weight values to be supplemented;
and a calculation module, configured to perform full-connection convolution calculation on the feature map using the third convolution kernel.
According to a third aspect of the present disclosure, there is provided a computer-readable storage medium storing a computer program for performing any one of the feature map full-connection convolution methods described above.
According to a fourth aspect of the present disclosure, there is provided an electronic device comprising:
A processor;
a memory for storing the processor-executable instructions;
the processor is configured to read the executable instructions from the memory and execute the instructions to implement any of the feature map full-connection convolution methods described above.
In the above four technical solutions of the present disclosure, the weight values in the first convolution kernel are taken out, arranged into a second convolution kernel according to their correspondence with the feature map, and then supplemented with additional weight values to form a third convolution kernel. Since the feature map does not need to be padded or adjusted, the amount of calculation required to process it is reduced; moreover, because the feature map is never padded, it keeps its small size, reducing the memory space it occupies.
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent by describing embodiments thereof in more detail with reference to the accompanying drawings. The accompanying drawings are included to provide a further understanding of embodiments of the disclosure, and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure, without limitation to the disclosure. In the drawings, like reference numerals generally refer to like parts or steps.
Fig. 1 is a schematic diagram of a scenario, system, or circuit configuration to which the present disclosure is applicable.
Fig. 2 is a flow chart of a feature map full-connection convolution method provided by an exemplary embodiment of the present disclosure.
Fig. 3 is a convolution kernel transformation diagram of a feature map full-connection convolution method provided by another exemplary embodiment of the present disclosure.
Fig. 4 is a schematic flow chart of performing convolution calculation with a third convolution kernel in a feature map full-connection convolution method provided by an exemplary embodiment of the present disclosure.
Fig. 5 is a schematic flow chart of determining a second convolution kernel in a feature map full-connection convolution method provided by an exemplary embodiment of the present disclosure.
Fig. 6 is a schematic diagram of determining a second convolution kernel in a feature map full-connection convolution method provided by an exemplary embodiment of the present disclosure.
Fig. 7 is a schematic diagram of a feature map full-connection convolution apparatus provided by an exemplary embodiment of the present disclosure.
Fig. 8 is a schematic diagram of a calculation module in a feature map full-connection convolution apparatus provided by an exemplary embodiment of the present disclosure.
Fig. 9 is a schematic diagram of a second convolution kernel determining module in a feature map full-connection convolution apparatus provided by an exemplary embodiment of the present disclosure.
Fig. 10 is a block diagram of an electronic device provided by an exemplary embodiment of the present disclosure.
Detailed Description
Hereinafter, example embodiments according to the present disclosure will be described in detail with reference to the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present disclosure and not all of the embodiments of the present disclosure, and that the present disclosure is not limited by the example embodiments described herein.
Summary of the application
As described above, when the network structure needs to fully connect feature maps of even size, the feature map must be convolved with a convolution kernel whose size is odd and larger than that of the feature map. Because the instruction set architecture of some dedicated hardware platforms restricts feature-map padding, specific rows or columns of the feature map must be padded, so additional instructions are needed to adjust the data arrangement, and the memory space occupied by the feature map increases. The hardware platform may be, for example, an FPGA, an ASIC, or an SoC, and may run various neural network algorithms to identify pedestrians, vehicles, and the like from images collected in security and driving scenarios.
In the present application, the weight values corresponding to the feature map are taken out of the first convolution kernel and arranged into a second convolution kernel according to their correspondence with the feature map. The second convolution kernel has the same size as the feature map, but because of the limitations of the instruction set architecture of some dedicated hardware platforms, it cannot be used directly; it is therefore expanded into a third convolution kernel of a supported size, and the feature map is convolved with the third convolution kernel. The feature map needs no adjustment or padding, and the weight values in the third convolution kernel correspond directly to the feature map during calculation. Hence the method reduces the calculation that adjusting and padding the feature map would require, and, because the feature map is never padded, reduces the space it occupies in memory.
Exemplary System
As shown in fig. 1, in the prior art, if ordinary convolution is used to simulate full connection, then since ordinary convolution supports only odd-sized convolution kernels, a convolution kernel whose size is odd and larger than the feature map is generally used when the network structure needs to fully connect an even-sized feature map. For example, to fully connect the 4×4 feature map 12, the 5×5 convolution kernel 11 is selected; one column is then added to the left of feature map 12 and one row above it, with the pixel values of the added parts set to 0. That is, the feature map is converted into the 5×5 feature map 13, whose leftmost column and top row are all 0, and the 5×5 convolution kernel 11 can then be used to perform the convolution calculation on the 5×5 feature map 13. From this calculation it can be seen that only the weights K22-K25, K32-K35, K42-K45 and K52-K55 in convolution kernel 11 are effective; the remaining weights are multiplied by the zero-valued parts of the 5×5 feature map 13 and contribute nothing to the final result.
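To make the prior-art calculation concrete, here is a minimal numpy sketch of the padding scheme of fig. 1, assuming the 4×4/5×5 shapes above; the variable names are illustrative, not from the patent:

```python
import numpy as np

rng = np.random.default_rng(0)
feature_map = rng.standard_normal((4, 4))   # feature map 12 (4x4)
kernel_11 = rng.standard_normal((5, 5))     # convolution kernel 11 (5x5)

# Add one zero column on the left and one zero row on the top,
# turning feature map 12 into the 5x5 feature map 13.
feature_map_13 = np.pad(feature_map, ((1, 0), (1, 0)))

# Full connection simulated as a single-position convolution.
full_join = float(np.sum(kernel_11 * feature_map_13))

# Only K22..K55 (the bottom-right 4x4 block) meet real pixels; the
# first row and first column of the kernel are multiplied by zeros.
effective = float(np.sum(kernel_11[1:, 1:] * feature_map))
assert np.isclose(full_join, effective)
```

The assertion confirms the observation above: the padded convolution reduces to a sum over the effective weights alone.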
Exemplary method
Fig. 2 is a flow chart of a feature map full-connection convolution method according to an exemplary embodiment of the present disclosure. The embodiment can be applied to an electronic device and, as shown in fig. 2, includes the following steps:
Step 201, determining a plurality of weight values corresponding to the feature map in a first convolution kernel based on the feature map and the first convolution kernel.
In some embodiments, the feature map refers to a feature map output by a previous convolutional layer or fully-connected layer, where the previous layer is a layer in a convolutional neural network such as a CNN. The feature map may be obtained by running an image, sound, or text acquired in real time by an acquisition device through the convolutional neural network, or by running an image, sound, or text stored in advance in the storage unit through the network. The image may be, for example, a driving-record image, a security-monitoring image, or an image acquired by an intelligent terminal such as a mobile phone or computer. The width and height of the feature map are both even, and the first convolution kernel is a convolution kernel whose width and height are odd. Taking fig. 3 as an example, when the 4×4 feature map 32 is fully connected, a 5×5 first convolution kernel is selected. According to the calculation process in fig. 1, only the weight values K22-K25, K32-K35, K42-K45 and K52-K55 in the first convolution kernel take part in effective calculation; that is, these weight values correspond to the feature map 32.
Step 202, determining a second convolution kernel based on the plurality of weight values in the first convolution kernel and the correspondence between the plurality of weight values and the feature map.
In some implementations, the size of the second convolution kernel is the same as the size of the feature map; each weight value in the second convolution kernel is a weight value in the first convolution kernel that corresponds to the feature map, and each weight value occupies the same position as its corresponding pixel in the feature map. Still taking fig. 3 as an example, the weight values K22-K25, K32-K35, K42-K45 and K52-K55 in the first convolution kernel 31 are arranged according to the positions of their corresponding pixels in the feature map 32 to form the second convolution kernel 33.
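The following sketch illustrates steps 201 and 202 under the same assumed shapes (even n×n feature map, odd k×k first kernel, k > n); the helper name is an assumption. Because the zero padding of fig. 1 sits above and to the left of the map, the effective weights form the bottom-right n×n block of the first kernel, and that block, read pixel by pixel, is the second convolution kernel:

```python
import numpy as np

def effective_weights(first_kernel: np.ndarray, map_size: int) -> np.ndarray:
    """Return the block of the first kernel that corresponds to the map."""
    k = first_kernel.shape[0]
    offset = k - map_size                  # rows/cols consumed by padding
    return first_kernel[offset:, offset:]  # K22..K55 in the 5x5/4x4 case

first_kernel = np.arange(25.0).reshape(5, 5)   # stand-in first kernel
second_kernel = effective_weights(first_kernel, 4)
print(second_kernel)                            # 4x4, same size as the map
```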
Step 203, determining a plurality of weight values to be supplemented for the second convolution kernel and the positions of the weight values to be supplemented in the second convolution kernel based on a predetermined size and the size of the second convolution kernel;
In some embodiments, the predetermined size refers to a convolution kernel size supported by the instruction set architecture of the hardware platform, such as 3×3, 5×5 or 7×7. The positions of the weight values to be supplemented are the positions at which, after the second convolution kernel is expanded, the weight values of the second convolution kernel can still be multiplied with the feature map; the weight values to be supplemented are the weight values filled into those positions. Still taking fig. 3 as an example, the 5×5 third convolution kernel 34 is selected. Because convolution proceeds from left to right and from top to bottom, the positions of the weight values to be supplemented are chosen as one row added after the last row and one column added after the last column of the second convolution kernel, so that the weight values in the second convolution kernel 33 keep corresponding to the feature map 32. Of course, when the second convolution kernel is expanded, it may also be expanded to another odd size supported by the instruction set architecture of the hardware platform; for example, the third convolution kernel 34 may be expanded to 7×7, in which case the positions of the weight values to be supplemented are three rows added after the last row and three columns added after the last column of the second convolution kernel.
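A sketch of step 203 under the same assumptions (the function name and zero-based coordinates are illustrative): enumerate the positions of the weight values to be supplemented as the appended row(s) and column(s).

```python
def positions_to_supplement(second_size: int, predetermined_size: int):
    """Zero-based (row, col) positions: appended rows, then appended columns."""
    appended_rows = [(r, c) for r in range(second_size, predetermined_size)
                     for c in range(predetermined_size)]
    appended_cols = [(r, c) for r in range(second_size)
                     for c in range(second_size, predetermined_size)]
    return appended_rows + appended_cols

# For the 4x4 -> 5x5 case of fig. 3 this yields the nine positions
# labeled P62-P66 (the added row) and P26, P36, P46, P56 (the added column).
print(positions_to_supplement(4, 5))
```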
Step 204, determining a third convolution kernel with a predetermined size based on the second convolution kernel, the weight value to be supplemented, and the position of the weight value to be supplemented;
In some embodiments, the third convolution kernel is formed after the weight values to be supplemented are filled into their positions; the third convolution kernel is a convolution kernel whose size meets the requirements of the instruction set architecture of the hardware platform. Taking fig. 3 as an example, the third convolution kernel 34 is formed by filling one column added after the last column of the second convolution kernel 33 and one row added after its last row, i.e., filling positions P26, P36, P46, P56 and P62-P66.
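A sketch of step 204 under the same assumptions: zero-padding the second kernel below and to the right reaches the predetermined size (np.pad fills with zeros by default, matching the all-zero supplemented weights discussed further below):

```python
import numpy as np

def third_kernel(second_kernel: np.ndarray, predetermined_size: int) -> np.ndarray:
    extra = predetermined_size - second_kernel.shape[0]
    # ((top, bottom), (left, right)): supplement only bottom and right.
    return np.pad(second_kernel, ((0, extra), (0, extra)))

second = np.ones((4, 4))          # stand-in second convolution kernel
print(third_kernel(second, 5))    # 5x5; last row and column are zeros
```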
Step 205, performing full-connection convolution calculation on the feature map using the third convolution kernel.
In some embodiments, when the third convolution kernel performs full-connection convolution calculation on the feature map, the convolution proceeds from left to right and from top to bottom. When the size of the third convolution kernel is larger than that of the feature map, there is exactly one valid position, at which the weight values in the upper-left part of the third convolution kernel are multiplied element-wise with the feature map, yielding the full-connection convolution result. Taking fig. 3 as an example, during the convolution calculation the weight values K22-K25, K32-K35, K42-K45 and K52-K55 are multiplied with the corresponding pixels of the feature map, giving the result of the full-connection convolution calculation.
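Putting steps 201 through 205 together, a minimal end-to-end sketch under the assumed 4×4/5×5 shapes; the final assertion checks that the result matches the prior-art padded convolution of fig. 1:

```python
import numpy as np

rng = np.random.default_rng(1)
fmap = rng.standard_normal((4, 4))
first_kernel = rng.standard_normal((5, 5))

second_kernel = first_kernel[1:, 1:]              # steps 201-202
k3 = np.pad(second_kernel, ((0, 1), (0, 1)))      # steps 203-204

out = float(np.sum(k3[:4, :4] * fmap))            # step 205, map unpadded

padded_prior_art = np.pad(fmap, ((1, 0), (1, 0)))
assert np.isclose(out, np.sum(first_kernel * padded_prior_art))
```

Note that the feature map itself is never padded here; only the kernel is rearranged and expanded.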
In the above technical solution of this embodiment, the weight values in the first convolution kernel are read from the first convolution kernel stored in the memory unit and arranged, according to their correspondence with the feature map, into a second convolution kernel, which is then supplemented with weight values to form a third convolution kernel. Because the effective weight values in the third convolution kernel sit at positions corresponding to the feature map, the third convolution kernel can be convolved with the feature map directly, reducing the calculation required to process the feature map. In addition, since the feature map needs no padding, it keeps a smaller size than in the prior-art approach of padding before convolution, reducing the memory space it occupies.
On the basis of the embodiment shown in fig. 2, the positions of the weight values to be supplemented include: at least one row after the last row of the second convolution kernel and at least one column after the last column of the second convolution kernel.
In some embodiments, an odd number of rows is added after the last row of the second convolution kernel and an odd number of columns after its last column, so that a third convolution kernel of odd size is formed. The number of rows and columns to add is chosen according to the kernel sizes supported by the instruction set architecture of the hardware platform. For example, if the second convolution kernel is 4×4 and the instruction set architecture supports 5×5, 7×7 and 9×9 convolution kernels, then 1, 3 or 5 rows may be added after the last row and 1, 3 or 5 columns after the last column. However, since none of the added weights takes part in effective calculation, a smaller number is preferred: typically 1 row is added after the last row and 1 column after the last column.
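A hypothetical helper for this choice (the supported-size list is an assumption for illustration): pick the smallest supported odd size that covers the second kernel, so that as few ineffective weights as possible are added.

```python
def smallest_supported_size(second_size: int,
                            supported=(3, 5, 7, 9)) -> int:
    """Smallest supported odd kernel size strictly larger than the map."""
    return min(s for s in supported if s % 2 == 1 and s > second_size)

print(smallest_supported_size(4))   # -> 5: add 1 row and 1 column
```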
On the basis of the embodiment shown in fig. 2, the weight values to be supplemented are all zero.
In some embodiments, because the weight values to be supplemented take part in no effective calculation during convolution, they may in principle be set to any value. In some preferred embodiments, they are set to zero by default to reduce the work of selecting values. Setting them to zero also protects the accuracy of the result against interference: for example, during a memory read the memory-alignment mechanism may cause the computer to fetch data beyond the feature map, such as random values or data left over from a previous calculation; since multiplying zero by any value gives zero, the calculation result remains correct.
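A sketch of this safety argument; the garbage array is a stand-in for whatever stale memory the hardware might read beyond the feature map, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
fmap = rng.standard_normal((4, 4))
k3 = np.pad(rng.standard_normal((4, 4)), ((0, 1), (0, 1)))  # zeros appended

garbage = rng.standard_normal((5, 5))   # stand-in for stale memory
window = garbage.copy()
window[:4, :4] = fmap                    # what the hardware actually reads

# The zero supplemented weights annihilate the garbage values.
assert np.isclose(np.sum(k3 * window), np.sum(k3[:4, :4] * fmap))
```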
As shown in fig. 4, on the basis of the embodiment shown in fig. 2, step 205 may further include the following steps:
Step 2051, associating the first row of the third convolution kernel with the first row of the feature map, and the first column of the third convolution kernel with the first column of the feature map;
In some embodiments, the relative positions of the weights in the third convolution kernel are fixed, and so is the pixel array of the feature map, so the correspondence between the weight values in the third convolution kernel and the feature map can be determined by aligning the first row of the third convolution kernel with the first row of the feature map and the first column with the first column. Taking fig. 3 as an example, since K22-K25, K32-K35, K42-K45 and K52-K55 in the third convolution kernel are the valid weight values, aligning the first row of the third convolution kernel 34 with the first row of the feature map 32 and the first column with the first column puts the valid weight values of the third convolution kernel 34 in correspondence with the feature map.
Step 2052, based on the correspondence between the third convolution kernel and the feature map, performing convolution calculation between the third convolution kernel and the feature map.
In some embodiments, when performing convolution calculation, the weight value of the third convolution kernel is multiplied by the corresponding pixel value in the feature map, and then the multiplication results are accumulated, so as to obtain the feature value of the feature map.
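A sketch of steps 2051 and 2052 under the same assumptions (the function name is illustrative): with the first rows and first columns aligned, the output is a plain multiply-accumulate over the feature map's extent.

```python
import numpy as np

def aligned_full_connection(k3: np.ndarray, fmap: np.ndarray) -> float:
    """Top-left-aligned multiply-accumulate of kernel and feature map."""
    n = fmap.shape[0]
    acc = 0.0
    for r in range(n):          # kernel row r lies over map row r
        for c in range(n):      # kernel col c lies over map col c
            acc += k3[r, c] * fmap[r, c]
    return acc

fmap = np.ones((4, 4))
k3 = np.pad(np.full((4, 4), 2.0), ((0, 1), (0, 1)))
print(aligned_full_connection(k3, fmap))   # 16 weights of 2.0 -> 32.0
```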
With this calculation mode, when the size of the third convolution kernel is larger than that of the feature map, the effective weight values of the third convolution kernel sit at positions corresponding to the feature map, and the third convolution kernel performs full-connection convolution on the feature map directly. This saves the computing resources that adjusting the feature map would require; meanwhile, the feature map keeps its original size without expansion, so compared with the prior-art approach of expanding the feature map before calculating, the memory space occupied by the feature map is reduced.
On the basis of the embodiment shown in fig. 2 described above, the size of the first convolution kernel is larger than the size of the feature map.
In some embodiments, the feature map used in the full-connection calculation is the feature map (heatmap) input to the fully-connected layer, which is generally small. When the feature map has an even size and the instruction set architecture of the hardware platform does not support even-sized convolution kernels, a first convolution kernel of odd size larger than the feature map is generally used and processed as in the embodiment shown in fig. 2. Those skilled in the art will appreciate that using a first convolution kernel larger than the feature map is merely a preferred embodiment of the present disclosure. In other preferred embodiments, the first convolution kernel may be an even-sized convolution kernel of the same size as the feature map; the first and second convolution kernels are then identical, and expanding the second convolution kernel yields a third convolution kernel whose size is odd and larger than the feature map. In that case, if the instruction set architecture of the hardware platform supports only odd-sized convolution kernels, the third convolution kernel still meets its requirements.
Using a first convolution kernel larger than the feature map allows the first convolution kernel to be selected from the existing convolution kernels suited to the instruction set architecture of the hardware platform; there is no need to search for or construct a kernel outside that range, so selecting the first convolution kernel is simple and computationally cheap.
As shown in fig. 5, step 202 further includes the following steps, based on the embodiment shown in fig. 2, described above:
Step 2021, arranging the plurality of weight values into a matrix of the same size as the feature map based on the correspondence between the plurality of weight values and the feature map, where the position of each weight value in the matrix is the same as the position in the feature map of the pixel value corresponding to that weight value;
In some embodiments, only part of the weight values in the first convolution kernel are valid weight values, and the valid weight values correspond one-to-one to the pixels in the feature map. When forming the second convolution kernel, a matrix of the same size as the feature map is established, and the position in the matrix corresponding to a given pixel of the feature map is filled with the weight value corresponding to that pixel. Still taking fig. 3 as an example, the position corresponding to F11 is filled with the weight value K22, the position corresponding to F12 with K23, and so on until the matrix is full, so that the plurality of weight values forms a matrix.
Step 2022, determining, based on the matrix, a weight value for a corresponding position in the second convolution kernel.
In some embodiments, the second convolution kernel has the same number of rows and columns as the matrix, and each weight value in the second convolution kernel equals the weight value at the same position in the matrix. Taking fig. 6 as an example, the weight value at K22 in the second convolution kernel 33 equals the weight value at a22 in the matrix 331, the weight value at K23 equals that at a23, and so on, so that each weight value in the second convolution kernel can be determined.
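A sketch of steps 2021 and 2022 under the assumptions above (names hypothetical): scatter each valid weight of the first kernel into the matrix at the position of its pixel, then read the matrix off as the second convolution kernel.

```python
import numpy as np

def second_kernel_via_matrix(first_kernel: np.ndarray, map_size: int) -> np.ndarray:
    k = first_kernel.shape[0]
    offset = k - map_size                 # padding offset as in fig. 1
    matrix = np.zeros((map_size, map_size))
    for r in range(map_size):
        for c in range(map_size):
            # The weight for pixel (r, c) sits at (r+offset, c+offset).
            matrix[r, c] = first_kernel[r + offset, c + offset]
    return matrix                         # second kernel, same size as map

first_kernel = np.arange(25.0).reshape(5, 5)
print(second_kernel_via_matrix(first_kernel, 4))
```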
In the above embodiment, each weight value of the second convolution kernel is associated with the feature map, which first secures the multiplicative correspondence between the weight values in the convolution kernel and the pixels in the feature map. In the subsequent process, expanding the second convolution kernel preserves this correspondence between the effective weight values and the feature map during convolution, so the third convolution kernel can be convolved with the feature map directly, saving the calculation of adjusting and padding the feature map. Meanwhile, since the feature map is not padded, it stays smaller than in the prior-art approach of expanding the feature map before convolution, reducing the memory space it occupies.
Fig. 7 is a schematic diagram of a feature map full-connection convolution apparatus according to an exemplary embodiment of the present disclosure, including the following modules.
A weight value determining module 601, configured to determine, based on a feature map and a first convolution kernel, a plurality of weight values in the first convolution kernel corresponding to the feature map;
In some embodiments, the feature map refers to a feature map output by a previous convolutional layer or fully-connected layer, where the previous layer is a layer in a convolutional neural network such as a CNN. The feature map may be obtained by running an image, sound, or text acquired in real time by an acquisition device through the convolutional neural network, or by running an image, sound, or text stored in advance in the storage unit through the network. The image may be, for example, a driving-record image, a security-monitoring image, or an image acquired by an intelligent terminal such as a mobile phone or computer. The width and height of the feature map are both even, and the first convolution kernel is a convolution kernel whose width and height are odd. Taking fig. 3 as an example, when the 4×4 feature map 32 is fully connected, a 5×5 first convolution kernel is selected. According to the calculation process in fig. 1, only the weight values K22-K25, K32-K35, K42-K45 and K52-K55 in the first convolution kernel take part in effective calculation; that is, these weight values correspond to the feature map 32.
A second convolution kernel determining module 602, configured to determine a second convolution kernel based on the plurality of weight values in the first convolution kernel and the correspondence between the plurality of weight values and the feature map;
In some implementations, the second convolution kernel is a convolution kernel of the same size as the feature map; each weight value in the second convolution kernel is a weight value in the first convolution kernel that corresponds to the feature map, and each weight value occupies the same position as its corresponding pixel in the feature map. Still taking fig. 3 as an example, the weight values K22-K25, K32-K35, K42-K45 and K52-K55 in the first convolution kernel 31 are arranged according to the positions of their corresponding pixels in the feature map 32 to form the second convolution kernel 33.
A weight value to be supplemented determining module 603, configured to determine a plurality of weight values to be supplemented of the second convolution kernel and positions of the weight values to be supplemented corresponding to the second convolution kernel based on a predetermined size and a size of the second convolution kernel;
In some embodiments, the predetermined size refers to a convolution kernel size supported by the instruction set architecture of the hardware platform, such as 3×3, 5×5 or 7×7. The positions of the weight values to be supplemented are the positions at which, after the second convolution kernel is expanded, the weight values of the second convolution kernel can still be multiplied with the feature map; the weight values to be supplemented are the weight values filled into those positions. Still taking fig. 3 as an example, the 5×5 third convolution kernel 34 is selected. Because convolution proceeds from left to right and from top to bottom, the positions of the weight values to be supplemented are chosen as one row added after the last row and one column added after the last column of the second convolution kernel, so that the weight values in the second convolution kernel 33 keep corresponding to the feature map 32. Of course, when the second convolution kernel is expanded, it may also be expanded to another odd size supported by the instruction set architecture of the hardware platform; for example, the third convolution kernel 34 may be expanded to 7×7, in which case the positions of the weight values to be supplemented are three rows added after the last row and three columns added after the last column of the second convolution kernel.
A third convolution kernel determining module 604, configured to determine a third convolution kernel of a predetermined size based on the second convolution kernel, the weight value to be supplemented, and a position of the weight value to be supplemented;
In some embodiments, the third convolution kernel is formed after the weight values to be supplemented are filled into their positions; the third convolution kernel is a convolution kernel whose size meets the requirements of the instruction set architecture of the hardware platform. Taking fig. 3 as an example, the third convolution kernel 34 is formed by filling one column added after the last column of the second convolution kernel 33 and one row added after its last row, i.e., filling positions P26, P36, P46, P56 and P62-P66.
And a calculation module 605, configured to perform full-connection convolution calculation on the feature map using the third convolution kernel.
In some embodiments, when the third convolution kernel performs full-connection convolution calculation on the feature map, the convolution proceeds from left to right and from top to bottom. When the size of the third convolution kernel is larger than that of the feature map, there is exactly one valid position, at which the weight values in the upper-left part of the third convolution kernel are multiplied element-wise with the feature map, yielding the full-connection convolution result. Taking fig. 3 as an example, during the convolution calculation the weight values K22-K25, K32-K35, K42-K45 and K52-K55 are multiplied with the corresponding pixels of the feature map, giving the result of the full-connection convolution calculation.
In the above technical solution of this embodiment, the weight values in the first convolution kernel are taken out and arranged, according to their correspondence with the feature map, into a second convolution kernel, which is then supplemented with weight values to form a third convolution kernel. Because the weight values in the third convolution kernel sit at positions corresponding to the feature map, the third convolution kernel can be convolved with the feature map directly, reducing the calculation required to process the feature map. In addition, since the feature map needs no padding, it keeps a smaller size than in the prior-art approach of expanding the feature map before convolution, reducing the memory space it occupies.
On the basis of the embodiment shown in fig. 7, the weight value to be supplemented determining module 603 is specifically configured to determine positions comprising at least one row after the last row of the second convolution kernel and at least one column after the last column of the second convolution kernel.
In some embodiments, an odd number of rows is added after the last row of the second convolution kernel and an odd number of columns after its last column, so that a third convolution kernel of odd size is formed. The number of rows and columns to add is chosen according to the kernel sizes supported by the instruction set architecture of the hardware platform. For example, if the second convolution kernel is 4×4 and the instruction set architecture supports 5×5, 7×7 and 9×9 convolution kernels, then 1, 3 or 5 rows may be added after the last row and 1, 3 or 5 columns after the last column. However, since none of the added weights takes part in effective calculation, a smaller number is preferred: typically 1 row is added after the last row and 1 column after the last column.
On the basis of the embodiment shown in fig. 7, the plurality of weight values to be supplemented determined by the weight value to be supplemented determining module 603 are all zero.
In some embodiments, because the weight values to be supplemented take part in no effective calculation during convolution, they may in principle be set to any value. In some preferred embodiments, they are set to zero by default to reduce the work of selecting values. Setting them to zero also protects the accuracy of the result against interference: for example, during a memory read the memory-alignment mechanism may cause the computer to fetch data beyond the feature map, such as random values or data left over from a previous calculation; since multiplying zero by any value gives zero, the calculation result remains correct.
As shown in fig. 8, on the basis of the embodiment shown in fig. 7, the calculation module 605 may further include the following units:
A correspondence unit 6051, configured to align the first row of the third convolution kernel with the first row of the feature map, and the first column of the third convolution kernel with the first column of the feature map;
In some embodiments, the relative positions of the weights in the third convolution kernel are fixed, and so is the pixel array of the feature map, so the correspondence between the weight values in the third convolution kernel and the feature map can be determined by aligning the first row of the third convolution kernel with the first row of the feature map and the first column with the first column. Taking fig. 3 as an example, since K22-K25, K32-K35, K42-K45 and K52-K55 in the third convolution kernel are the valid weight values, aligning the first row of the third convolution kernel 34 with the first row of the feature map 32 and the first column with the first column puts the valid weight values of the third convolution kernel 34 in correspondence with the feature map.
And a calculating unit 6052, configured to perform convolution calculation with the feature map using a third convolution kernel based on the correspondence between the third convolution kernel and the feature map.
In some embodiments, when performing convolution calculation, the weight value of the third convolution kernel is multiplied by the corresponding pixel value in the feature map, and then the multiplication results are accumulated, so as to obtain the feature value of the feature map.
With this calculation mode, when the size of the third convolution kernel is larger than that of the feature map, the effective weight values of the third convolution kernel sit at positions corresponding to the feature map, and the third convolution kernel performs full-connection convolution on the feature map directly. This saves the computing resources that adjusting the feature map would require; meanwhile, the feature map keeps its original size without expansion, so compared with the prior-art approach of expanding the feature map before calculating, the memory space occupied by the feature map is reduced.
On the basis of the embodiment shown in fig. 7 described above, the size of the first convolution kernel is larger than the size of the feature map.
In some embodiments, the feature map used in the full-connection calculation is the feature map (heatmap) input to the fully-connected layer, which is generally small. When the feature map has an even size and the instruction set architecture of the hardware platform does not support even-sized convolution kernels, a first convolution kernel of odd size larger than the feature map is generally used and processed as in the embodiment shown in fig. 2. Those skilled in the art will appreciate that using a first convolution kernel larger than the feature map is merely a preferred embodiment of the present disclosure. In other preferred embodiments, the first convolution kernel may be an even-sized convolution kernel of the same size as the feature map; the first and second convolution kernels are then identical, and expanding the second convolution kernel yields a third convolution kernel whose size is odd and larger than the feature map. In that case, if the instruction set architecture of the hardware platform supports only odd-sized convolution kernels, the third convolution kernel still meets its requirements.
Using a first convolution kernel larger than the feature map allows the first convolution kernel to be selected from the existing convolution kernels suited to the instruction set architecture of the hardware platform; there is no need to search for or construct a kernel outside that range, so selecting the first convolution kernel is simple and computationally cheap.
As shown in fig. 9, on the basis of the embodiment shown in fig. 7 described above, the second convolution kernel determination module 602 further includes the following units:
A matrix determining unit 6021, configured to arrange the plurality of weight values into a matrix of the same size as the feature map based on the correspondence between the plurality of weight values and the feature map, where the position of each weight value in the matrix is the same as the position in the feature map of the pixel value corresponding to that weight value;
In some embodiments, only part of the weight values in the first convolution kernel are valid weight values, and the valid weight values correspond one-to-one to the pixels in the feature map. When forming the second convolution kernel, a matrix of the same size as the feature map is established, and the position in the matrix corresponding to a given pixel of the feature map is filled with the weight value corresponding to that pixel. Still taking fig. 3 as an example, the position corresponding to F11 is filled with the weight value K22, the position corresponding to F12 with K23, and so on until the matrix is full, so that the plurality of weight values forms a matrix.
A convolution kernel determining unit 6022, configured to determine the weight value of each corresponding position in the second convolution kernel based on the matrix.
In some embodiments, the second convolution kernel has the same number of rows and columns as the matrix, and each weight value in the second convolution kernel equals the weight value at the same position in the matrix. Taking fig. 6 as an example, the weight value at K22 in the second convolution kernel 33 equals the weight value at a22 in the matrix 331, the weight value at K23 equals that at a23, and so on, so that each weight value in the second convolution kernel can be determined.
In the above embodiment, each weight value of the second convolution kernel is associated with the feature map, which first secures the multiplicative correspondence between the weight values in the convolution kernel and the pixels in the feature map. In the subsequent process, expanding the second convolution kernel preserves this correspondence between the effective weight values and the feature map during convolution, so the third convolution kernel can be convolved with the feature map directly, saving the calculation of adjusting and padding the feature map. Meanwhile, since the feature map is not padded, it stays smaller than in the prior-art approach of expanding the feature map before convolution, reducing the memory space it occupies.
Exemplary electronic device
Next, an electronic device according to an embodiment of the present disclosure is described with reference to fig. 10. Fig. 10 illustrates a block diagram of an electronic device according to an embodiment of the disclosure.
As shown in fig. 10, the electronic device 11 includes one or more processors 111 and a memory 112.
The processor 111 may be a Central Processing Unit (CPU) or other form of processing unit having data processing and/or instruction execution capabilities, and may control other components in the electronic device 11 to perform desired functions.
Memory 112 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), hard disks, and flash memory. One or more computer program instructions may be stored on the computer-readable storage medium and executed by the processor 111 to implement the feature map full-connection convolution methods of the various embodiments of the present disclosure described above and/or other desired functions. Various contents such as input signals, signal components, and noise components may also be stored in the computer-readable storage medium.
In one example, the electronic device 11 may further include: an input device 113 and an output device 114, which are interconnected by a bus system and/or other forms of connection mechanisms (not shown).
For example, the input device 113 may be a microphone or a microphone array for capturing an input signal. When the electronic device is a stand-alone device, the input means 113 may be a communication network connector for receiving the acquired input signal.
In addition, the input device 113 may also include, for example, a keyboard, a mouse, and the like.
The output device 114 may output various information to the outside, including the determined calculation results and the like. The output device 114 may include, for example, a display, speakers, a printer, and a communication network and the remote output devices connected thereto.
Of course, only some of the components of the electronic device 11 relevant to the present disclosure are shown in fig. 10 for simplicity, components such as buses, input/output interfaces, and the like being omitted. In addition, the electronic device 11 may include any other suitable components depending on the particular application.
Exemplary computer program product and computer readable storage Medium
In addition to the methods and apparatus described above, embodiments of the present disclosure may also be a computer program product comprising computer program instructions which, when executed by a processor, cause the processor to perform the steps of the feature map full-connection convolution method according to the various embodiments of the present disclosure described in the "exemplary method" section above.
The computer program product may carry program code for performing the operations of embodiments of the present disclosure, written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++ and conventional procedural programming languages such as the "C" programming language. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, embodiments of the present disclosure may also be a computer-readable storage medium having stored thereon computer program instructions which, when executed by a processor, cause the processor to perform the steps of the feature map full-connection convolution method according to the various embodiments of the present disclosure described in the "exemplary method" section above.
The computer-readable storage medium may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The basic principles of the present disclosure have been described above in connection with specific embodiments. However, it should be noted that the advantages, benefits, effects, and the like mentioned in the present disclosure are merely examples and are not limiting; they are not to be considered as necessarily possessed by every embodiment of the present disclosure. Furthermore, the specific details disclosed herein are for purposes of illustration and understanding only, and are not intended to be limiting, since the disclosure is not necessarily limited to practice with these specific details.
The block diagrams of the devices, apparatuses, and systems referred to in this disclosure are merely illustrative examples and are not intended to require or imply that the connections, arrangements, and configurations must be made in the manner shown in the block diagrams. As will be appreciated by one of skill in the art, these devices, apparatuses, and systems may be connected, arranged, or configured in any manner. Words such as "including," "comprising," and "having" are open-ended terms meaning "including but not limited to," and may be used interchangeably therewith. The term "or" as used herein refers to, and is used interchangeably with, the term "and/or," unless the context clearly indicates otherwise. The term "such as" as used herein refers to, and is used interchangeably with, the phrase "such as, but not limited to."
It is also noted that in the apparatus, devices, and methods of the present disclosure, components or steps may be decomposed and/or recombined. Such decompositions and/or recombinations should be considered equivalents of the present disclosure.
The previous description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present disclosure. Various modifications to these aspects will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other aspects without departing from the scope of the disclosure. Thus, the present disclosure is not intended to be limited to the aspects shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The foregoing description has been presented for purposes of illustration and description. Furthermore, this description is not intended to limit the embodiments of the disclosure to the form disclosed herein. Although a number of example aspects and embodiments have been discussed above, a person of ordinary skill in the art will recognize certain variations, modifications, alterations, additions, and subcombinations thereof.

Claims (10)

1. A feature map full-connection convolution method, comprising:
determining, based on a feature map and a first convolution kernel, a plurality of weight values in the first convolution kernel corresponding to the feature map; wherein the size of the first convolution kernel is greater than the size of the feature map;
determining a second convolution kernel based on the plurality of weight values in the first convolution kernel and the correspondence between the plurality of weight values and the feature map; wherein the size of the second convolution kernel is the same as the size of the feature map;
determining a plurality of weight values to be supplemented for the second convolution kernel, and the positions in the second convolution kernel corresponding to the weight values to be supplemented, based on a predetermined size and the size of the second convolution kernel; wherein the predetermined size refers to the convolution kernel size supported by the instruction set architecture of the hardware platform;
determining a third convolution kernel of the predetermined size based on the second convolution kernel, the weight values to be supplemented, and the positions of the weight values to be supplemented, so that the effective weight values in the third convolution kernel are arranged at the positions corresponding to the feature map; and
performing full-connection convolution calculation on the feature map using the third convolution kernel.
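Read procedurally, claim 1 amounts to: select the feature-map-sized block of effective weights from the oversized first kernel, supplement it with zeros up to the kernel size the hardware instruction set supports, and take a single convolution step. The following sketch illustrates this reading in NumPy; the function name, the trailing-block selection rule, and the concrete sizes are illustrative assumptions rather than anything specified by the patent.

    import numpy as np

    def full_connection_as_convolution(feature_map, first_kernel, predetermined_size):
        h, w = feature_map.shape
        # Claim 1, steps 1-2: select the weights of the first kernel that
        # correspond to feature-map positions and arrange them as the second
        # kernel (the trailing h x w block is an assumed selection rule).
        second_kernel = first_kernel[-h:, -w:]
        # Step 3: supplement zero weights after the last row and last column
        # until the kernel reaches the predetermined size (cf. claims 2-3).
        p = predetermined_size
        third_kernel = np.pad(second_kernel, ((0, p - h), (0, p - w)))
        # Step 4: pad the feature map the same way and take one convolution
        # step with the kernel's first row/column over the map's first
        # row/column; the supplemented zeros contribute nothing to the sum.
        padded_map = np.pad(feature_map, ((0, p - h), (0, p - w)))
        return float(np.sum(third_kernel * padded_map))

    fmap = np.arange(6.0).reshape(2, 3)   # 2 x 3 feature map
    k1 = np.ones((4, 5))                  # first kernel, larger than the map
    print(full_connection_as_convolution(fmap, k1, predetermined_size=8))  # 15.0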
2. The method of claim 1, wherein the positions of the weight values to be supplemented comprise: at least one row after the last row of the second convolution kernel and at least one column after the last column of the second convolution kernel.
3. The method of claim 1, wherein the plurality of weight values to be supplemented are each zero.
4. The method of claim 1, wherein the performing full-connection convolution calculation on the feature map using the third convolution kernel comprises:
corresponding the first row of the third convolution kernel to the first row of the feature map, and corresponding the first column of the third convolution kernel to the first column of the feature map; and
performing convolution calculation on the feature map using the third convolution kernel based on the correspondence between the third convolution kernel and the feature map.
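The alignment recited in claim 4 is what makes the padded convolution reproduce a fully connected layer: with the kernel's first row and first column over the feature map's first row and first column, every supplemented zero weight falls outside the feature map. A small numerical check of that equivalence, with assumed sizes and names:

    import numpy as np

    rng = np.random.default_rng(0)
    fmap = rng.standard_normal((3, 4))    # feature map
    k2 = rng.standard_normal((3, 4))      # second kernel, same size as the map
    k3 = np.pad(k2, ((0, 2), (0, 1)))     # third kernel, supplemented to 5 x 5

    # Fully connected result: flatten both and take a dot product.
    fc = fmap.ravel() @ k2.ravel()

    # Claim 4's convolution: the kernel's first row/column over the first
    # row/column of the (zero-padded) feature map -- one output position.
    conv = np.sum(np.pad(fmap, ((0, 2), (0, 1))) * k3)

    assert np.isclose(fc, conv)           # equal up to floating-point rounding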
5. The method of claim 1, wherein the plurality of weight values corresponding to the feature map comprises the weight values in the first convolution kernel other than those in at least one leading row and at least one leading column of the first convolution kernel.
6. The method of claim 1, wherein the size of the first convolution kernel is greater than the size of the feature map.
7. The method of claim 1, wherein the determining a second convolution kernel based on the plurality of weight values in the first convolution kernel and the correspondence of the plurality of weight values to the feature map comprises:
arranging the plurality of weight values into a matrix of the same size as the feature map based on the correspondence between the plurality of weight values and the feature map, wherein the position of each weight value in the matrix is the same as the position, in the feature map, of the pixel value corresponding to that weight value; and
determining the weight values at the corresponding positions in the second convolution kernel based on the matrix.
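The arrangement step of claim 7 can be pictured as scattering each selected weight to the coordinates of the pixel it multiplies. The sketch below assumes the correspondence is supplied as (row, column, weight) triples, a representation the claim itself does not prescribe:

    import numpy as np

    def arrange_second_kernel(correspondence, map_shape):
        # Place each weight at the (row, col) of its feature-map pixel,
        # yielding a kernel of the same size as the feature map.
        kernel = np.zeros(map_shape)
        for row, col, weight in correspondence:
            kernel[row, col] = weight
        return kernel

    # e.g. three weights tied to pixels (0,0), (0,1), and (1,1) of a 2 x 2 map:
    k2 = arrange_second_kernel([(0, 0, 0.5), (0, 1, -1.0), (1, 1, 2.0)], (2, 2))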
8. A feature map full-connection convolution apparatus, comprising:
a weight value determining module configured to determine, based on a feature map and a first convolution kernel, a plurality of weight values in the first convolution kernel corresponding to the feature map; wherein the size of the first convolution kernel is greater than the size of the feature map;
a second convolution kernel determining module configured to determine a second convolution kernel based on the plurality of weight values in the first convolution kernel and the correspondence between the plurality of weight values and the feature map; wherein the size of the second convolution kernel is the same as the size of the feature map;
a to-be-supplemented weight value determining module configured to determine, based on a predetermined size and the size of the second convolution kernel, a plurality of weight values to be supplemented for the second convolution kernel and their corresponding positions; wherein the predetermined size refers to the convolution kernel size supported by the instruction set architecture of the hardware platform;
a third convolution kernel determining module configured to form a third convolution kernel of the predetermined size based on the second convolution kernel, the weight values to be supplemented, and the positions of the weight values to be supplemented, so that the effective weight values in the third convolution kernel are arranged at the positions corresponding to the feature map; and
a calculation module configured to perform full-connection convolution calculation on the feature map using the third convolution kernel.
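Read as software rather than as hardware modules, the apparatus of claim 8 maps onto one object with one method per group of modules. The skeleton below (illustrative names only, with the same assumed trailing-block selection rule as above) chains them in claim order:

    import numpy as np

    class FullConnectionConvolution:
        def __init__(self, predetermined_size):
            self.p = predetermined_size  # kernel size supported by the ISA

        def determine_second_kernel(self, feature_map, first_kernel):
            # Weight value determining + second kernel determining modules.
            h, w = feature_map.shape
            return first_kernel[-h:, -w:]

        def determine_third_kernel(self, second_kernel):
            # To-be-supplemented weight + third kernel determining modules:
            # zero rows/columns after the last row/column, up to size p.
            h, w = second_kernel.shape
            return np.pad(second_kernel, ((0, self.p - h), (0, self.p - w)))

        def compute(self, feature_map, third_kernel):
            # Calculation module: the effective weights sit over the map.
            h, w = feature_map.shape
            return float(np.sum(third_kernel[:h, :w] * feature_map))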
9. A computer-readable storage medium storing a computer program for performing the feature map full-connection convolution method of any one of claims 1-7.
10. An electronic device, the electronic device comprising:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to read the executable instructions from the memory and execute the instructions to implement the feature map full-connection convolution method according to any one of claims 1-7.
CN202011006954.5A 2020-09-22 2020-09-22 Feature map full-connection convolution method and device, readable storage medium and electronic equipment Active CN112132274B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011006954.5A CN112132274B (en) 2020-09-22 2020-09-22 Feature map full-connection convolution method and device, readable storage medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN112132274A CN112132274A (en) 2020-12-25
CN112132274B (en) 2024-05-28

Family

ID=73842665

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011006954.5A Active CN112132274B (en) 2020-09-22 2020-09-22 Feature map full-connection convolution method and device, readable storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN112132274B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108133270A (en) * 2018-01-12 2018-06-08 清华大学 Convolutional neural network acceleration method and device
CN109656623A (en) * 2019-03-13 2019-04-19 北京地平线机器人技术研发有限公司 Method and device for performing convolution operations, and method and device for generating instructions
KR20190063393A (en) * 2017-11-29 2019-06-07 한국전자통신연구원 Apparatus for processing convolutional neural network using systolic array and method thereof
CN110443357A (en) * 2019-08-07 2019-11-12 上海燧原智能科技有限公司 Convolutional neural network computation optimization method and apparatus, computer device, and medium
CN110472700A (en) * 2019-10-14 2019-11-19 深兰人工智能芯片研究院(江苏)有限公司 Parameter padding method and device based on a convolutional neural network
CN111340201A (en) * 2018-12-19 2020-06-26 北京地平线机器人技术研发有限公司 Convolutional neural network accelerator and method for performing convolution operations thereon

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on a Convolutional Neural Network Accelerator Based on a Multithreaded Architecture; Chen Weiguang; China Master's Theses Full-text Database, Information Science and Technology; Vol. 2020, No. 07; p. I137-34 *

Also Published As

Publication number Publication date
CN112132274A (en) 2020-12-25

Similar Documents

Publication Publication Date Title
US11922132B2 (en) Information processing method and terminal device
CN112214726B (en) Operation accelerator
CN108876792B (en) Semantic segmentation method, device and system and storage medium
US10540574B2 (en) Image compression method and related device
JP7007488B2 (en) Hardware-based pooling system and method
CN108111714B (en) Multi-lens based capture apparatus and method
EP4276690A1 (en) Vector computation unit in a neural network processor
US11822900B2 (en) Filter processing device and method of performing convolution operation at filter processing device
CN112734827B (en) Target detection method and device, electronic equipment and storage medium
CN112116071B (en) Neural network computing method and device, readable storage medium and electronic equipment
CN111125628A (en) Method and apparatus for processing two-dimensional data matrix by artificial intelligence processor
US11636569B1 (en) Matrix transpose hardware acceleration
CN112132274B (en) Feature map full-connection convolution method and device, readable storage medium and electronic equipment
CN112116083B (en) Neural network accelerator and detection method and device thereof
CN107977923B (en) Image processing method, image processing device, electronic equipment and computer readable storage medium
CN112257859B (en) Feature data processing method and device, equipment and storage medium
CN114239803B (en) Compiling method and device of neural network model, electronic equipment and storage medium
CN112907450B (en) Three-dimensional time sequence image processing method and device, computer equipment and storage medium
CN115563443A (en) Convolution operation method and device, convolution processing method and device and storage medium
CN114625378A (en) Method and device for compiling neural network model and computer readable storage medium
CN112712461B (en) Image deconvolution processing method and device and terminal equipment
CN114549945A (en) Remote sensing image change detection method and related device
US20160148358A1 (en) Apparatus and method for gaussian filtering
CN110134813B (en) Image retrieval method, image retrieval device and terminal equipment
CN113762472A (en) Instruction sequence generation method and device of neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant