WO2019153908A1 - Image recognition method and system based on attention model - Google Patents

Image recognition method and system based on attention model

Info

Publication number
WO2019153908A1
WO2019153908A1 · PCT/CN2018/122684 · CN2018122684W
Authority
WO
WIPO (PCT)
Prior art keywords
matrix
image
spatial
feature map
attention
Prior art date
Application number
PCT/CN2018/122684
Other languages
French (fr)
Chinese (zh)
Inventor
张志伟
杨帆
Original Assignee
北京达佳互联信息技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京达佳互联信息技术有限公司
Publication of WO2019153908A1 publication Critical patent/WO2019153908A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133Distances to prototypes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V10/454Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Definitions

  • the present application relates to the field of image processing technologies, and in particular, to an image recognition method and system based on an attention model.
  • the purpose of the present application is to solve at least one of the above technical drawbacks, in particular the defect that local information in the data is easily overlooked.
  • the application provides an image recognition method based on an attention model, comprising the following steps:
  • Step S10 Acquire an input feature map whose image matrix shape is [W, H, C], where W is a width, H is a height, and C is a channel number;
  • Step S20 spatially map the input feature map using a preset spatial mapping weight matrix, obtain a spatial weight matrix after activation by the activation function, and multiply the spatial weight matrix bitwise with the image matrix of the input feature map to obtain an output feature map.
  • the preset spatial mapping weight matrix is a spatial attention matrix [C, 1] whose attention is in image width and height
  • the shape of the spatial weight matrix is [W, H, 1]
  • the preset spatial mapping weight matrix is a channel attention matrix [C, C] whose attention is in the number of image channels
  • the shape of the spatial weight matrix is [1, 1, C].
  • when the preset spatial mapping weight matrix is the spatial attention matrix [C, 1], the following formula is used in step S20:
  • o :,:,c = i :,:,c ⊙ sigmoid(i :,:,c · w s + b s)
  • where ⊙ is bitwise multiplication
  • · is matrix multiplication
  • o :,:,c is the output feature map
  • i :,:,c is the input feature map
  • sigmoid is the activation function
  • w s is the spatial mapping weight
  • b s is the deviation.
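  The spatial-attention formula above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the patent's implementation: the shapes follow the claim ([W, H, C] input, [C, 1] spatial attention matrix), while the weights, deviation, and sizes used here are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def spatial_attention(i, w_s, b_s):
    """i: input feature map [W, H, C]; w_s: spatial attention matrix [C, 1]; b_s: scalar deviation.

    i · w_s collapses each position's C channels to one score, so the
    sigmoid yields a [W, H, 1] spatial weight matrix; broadcasting then
    multiplies it bitwise into every channel of the input feature map.
    """
    weights = sigmoid(i @ w_s + b_s)   # [W, H, 1] spatial weight matrix
    return i * weights                 # bitwise multiplication, broadcast over C

# hypothetical sizes for illustration
rng = np.random.default_rng(0)
i = rng.standard_normal((8, 8, 3))     # [W, H, C]
w_s = rng.standard_normal((3, 1))      # [C, 1]
o = spatial_attention(i, w_s, 0.0)
print(o.shape)                          # (8, 8, 3)
```

  Because each sigmoid weight lies in (0, 1), the output keeps the input's shape and merely attenuates positions the attention scores consider unimportant.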
  • when the preset spatial mapping weight matrix is the channel attention matrix [C, C], the following formula is used in step S20:
  • o w,h,: = i w,h,: ⊙ sigmoid(mean(i w,h,:) · w c + b c)
  • where ⊙ is bitwise multiplication
  • · is matrix multiplication
  • o w,h,: is the output feature map
  • i w,h,: is the input feature map
  • sigmoid is the activation function
  • mean is the averaging function
  • w c is the spatial mapping weight
  • b c is the deviation.
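  The channel-attention formula can likewise be sketched in NumPy. Again this is illustrative only: the shapes follow the claim ([C, C] channel attention matrix yielding a [1, 1, C] weight vector), while the actual weights and sizes are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(i, w_c, b_c):
    """i: input feature map [W, H, C]; w_c: channel attention matrix [C, C]; b_c: deviation [C].

    mean(i) averages over width and height to a single C-vector, which
    w_c maps to a [1, 1, C] channel weight vector after the sigmoid;
    broadcasting multiplies it bitwise into every spatial position.
    """
    m = i.mean(axis=(0, 1))            # mean over W and H -> [C]
    weights = sigmoid(m @ w_c + b_c)   # [C], i.e. shape [1, 1, C]
    return i * weights                 # bitwise multiplication, broadcast over W, H

rng = np.random.default_rng(0)
i = rng.standard_normal((8, 8, 3))     # [W, H, C]
w_c = rng.standard_normal((3, 3))      # [C, C]
o = channel_attention(i, w_c, np.zeros(3))
print(o.shape)                          # (8, 8, 3)
```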
  • step S20 includes:
  • in the shallow network of the convolutional neural network, the input feature map is spatially mapped using the spatial attention matrix [C, 1] and activated by the activation function to obtain a first spatial weight matrix; the first spatial weight matrix is multiplied bitwise with the image matrix of the input feature map to obtain a first output feature map;
  • in the deep network of the convolutional neural network, the first output feature map is spatially mapped using the channel attention matrix [C, C], and a second spatial weight matrix is obtained after activation by the activation function; the second spatial weight matrix is multiplied bitwise with the image matrix of the first output feature map to obtain a second output feature map.
  • the method further includes step S30:
  • An image classification is performed by applying a classifier according to the output feature map.
  • the application also provides an image recognition system based on attention model, comprising:
  • An image acquisition module configured to acquire an input feature image of an image matrix shape of [W, H, C], where W is a width, H is a height, and C is a channel number;
  • An image processing module configured to spatially map the input feature map using a preset spatial mapping weight matrix, obtain a spatial weight matrix after activation by the activation function, and multiply the spatial weight matrix bitwise with the image matrix of the input feature map to obtain an output feature map, wherein when the preset spatial mapping weight matrix is a spatial attention matrix [C, 1] whose attention is on image width and height, the shape of the spatial weight matrix is [W, H, 1], or when the preset spatial mapping weight matrix is a channel attention matrix [C, C] whose attention is on the number of image channels, the shape of the spatial weight matrix is [1, 1, C].
  • the image processing module obtains an output feature map by using the following formula:
  • o :,:,c = i :,:,c ⊙ sigmoid(i :,:,c · w s + b s)
  • where ⊙ is bitwise multiplication
  • · is matrix multiplication
  • o :,:,c is the output feature map
  • i :,:,c is the input feature map
  • sigmoid is the activation function
  • w s is the spatial mapping weight
  • b s is the deviation.
  • the image processing module obtains an output feature map by using the following formula:
  • o w,h,: = i w,h,: ⊙ sigmoid(mean(i w,h,:) · w c + b c)
  • where ⊙ is bitwise multiplication
  • · is matrix multiplication
  • o w,h,: is the output feature map
  • i w,h,: is the input feature map
  • sigmoid is the activation function
  • mean is the averaging function
  • w c is the spatial mapping weight
  • b c is the deviation.
  • the image processing module includes a low-level semantic feature extraction module and an advanced semantic feature extraction module;
  • the low-level semantic feature extraction module is configured to: spatially map an input feature map using the spatial attention matrix [C, 1] in a shallow network of a convolutional neural network, obtain a first spatial weight matrix after activation by an activation function, and multiply the first spatial weight matrix bitwise with the image matrix of the input feature map to obtain a first output feature map;
  • the advanced semantic feature extraction module is configured to: spatially map the first output feature map using the channel attention matrix [C, C] in a deep network of the convolutional neural network, obtain a second spatial weight matrix after activation by an activation function, and multiply the second spatial weight matrix bitwise with the image matrix of the first output feature map to obtain a second output feature map.
  • a classification module is further included for applying the classifier to perform image classification according to the output feature map.
  • the embodiment of the present application further provides an electronic device, including a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory complete communication with each other through the communication bus;
  • a memory for storing a computer program;
  • a processor configured to implement the image recognition method based on the attention model provided by the embodiment of the present application when executing the program stored in the memory.
  • the embodiment of the present application further provides a storage medium storing a computer program which, when executed by a processor, implements the image recognition method based on the attention model provided by the embodiment of the present application.
  • the embodiment of the present application further provides an application program for performing the attention-based model-based image recognition method provided by the embodiment of the present application at runtime.
  • The image recognition method and system based on the attention model described above first acquire an input feature map whose image matrix shape is [W, H, C], where W is the width, H is the height, and C is the number of channels; the input feature map is then spatially mapped using the preset spatial mapping weight matrix, a spatial weight matrix is obtained after activation by the activation function, and the spatial weight matrix is multiplied bitwise with the image matrix of the input feature map to obtain an output feature map. When the preset spatial mapping weight matrix is the spatial attention matrix [C, 1] whose attention is on image width and height, the shape of the spatial weight matrix is [W, H, 1]; when the preset spatial mapping weight matrix is the channel attention matrix [C, C] whose attention is on the number of image channels, the shape of the spatial weight matrix is [1, 1, C].
  • FIG. 1 is a schematic flow chart of an image recognition method based on an attention model according to an embodiment
  • FIG. 2 is a schematic diagram of a feature extraction process based on a spatial attention model according to an embodiment
  • FIG. 3 is a schematic diagram of a feature extraction process based on a channel attention model according to an embodiment
  • FIG. 4 is a schematic flow chart of an image recognition method based on an attention model according to another embodiment
  • FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment.
  • FIG. 1 is a schematic flowchart of an image recognition method based on an attention model according to an embodiment, and an image recognition method based on an attention model, which may include the following steps:
  • Step S10 Acquire an input feature map whose image matrix shape is [W, H, C], where W is the width (the width of the image, in pixels), H is the height (the height of the image, in pixels), and C is the number of channels (the number of color channels of the image).
  • the image matrix here is a three-dimensional matrix, and the format of [W, H, C] can also be written in the format of W*H*C, that is, width*height*channel number.
  • the input feature map is a feature map as an input content, and the feature map may include: a color feature of the image, a texture feature of the image, a shape feature of the image, a spatial relationship feature of the image, and the like.
  • the features of the image included in the feature map are not specifically limited in the embodiment of the present invention.
  • Step S20 spatially mapping the input feature map by using a preset spatial mapping weight matrix, and obtaining a spatial weight matrix after activation by the activation function, and multiplying the spatial weight matrix by the image matrix of the input feature image to obtain an output feature map.
  • the preset spatial mapping weight matrix is a spatial attention matrix [C, 1] whose attention is in image width and height
  • the shape of the spatial weight matrix is [W, H, 1]
  • the shape of the spatial weight matrix is [1, 1, C].
  • the output feature map is a feature map as an output result.
  • the activation function may be set according to actual conditions; various activation functions may be used, for example a Sigmoid function, a Tanh function, or a ReLU function.
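  For reference, the three activation functions named above have simple closed forms; a minimal sketch:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))   # squashes scores into (0, 1)

def tanh(x):
    return np.tanh(x)                  # squashes scores into (-1, 1)

def relu(x):
    return np.maximum(0.0, x)          # zeroes out negative scores

print(sigmoid(0.0), tanh(0.0), relu(-2.0))   # 0.5 0.0 0.0
```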
  • the output feature map is obtained by using the following formula in step S20:
  • o :,:,c = i :,:,c ⊙ sigmoid(i :,:,c · w s + b s)
  • where ⊙ is bitwise multiplication
  • · is matrix multiplication
  • o :,:,c is the output feature map (image matrix)
  • i :,:,c is the input feature map (image matrix)
  • sigmoid is the activation function
  • w s is the spatial mapping weight
  • b s is the deviation.
  • Bitwise multiplication (⊙) means that the data at the same positions in two matrices of the same size are multiplied, generating a matrix of the same size.
  • For example, if A and B are two 2*2 two-dimensional matrices, A ⊙ B generates a 2*2 two-dimensional matrix K.
  • FIG. 2 is a schematic diagram of a feature extraction process based on a spatial attention model according to an embodiment, where i is an input feature map, w is a spatial weight matrix, and o is an output feature map.
  • the output feature map is obtained in step S20 using the following formula:
  • o w,h,: = i w,h,: ⊙ sigmoid(mean(i w,h,:) · w c + b c)
  • where ⊙ is bitwise multiplication
  • · is matrix multiplication
  • o w,h,: is the output feature map (image matrix)
  • i w,h,: is the input feature map (image matrix)
  • sigmoid is the activation function
  • mean is the averaging function
  • w c is the spatial mapping weight
  • b c is the deviation.
  • FIG. 3 is a schematic diagram of a feature extraction process based on a channel attention model according to an embodiment, where "feature map 1, feature map 2, ..., feature map m" on the left side represents the input feature maps of m channels, and "feature map 1, feature map 2, ..., feature map m" on the right side represents the output feature maps of m channels.
  • step S30 may be further included: applying the classifier according to the output feature map to perform image classification.
  • applying a classifier to the output feature map can classify the image according to the color feature, the texture feature, the shape feature, or the spatial relationship feature of the image.
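  The patent does not fix a particular classifier; as one hedged illustration, a global-average-pool followed by a softmax linear layer (all names and sizes below are hypothetical, not taken from the patent) could consume the output feature map:

```python
import numpy as np

def classify(o, w_fc, b_fc):
    """Pool the output feature map [W, H, C] to a C-vector, then apply
    a linear classifier with a numerically stable softmax."""
    pooled = o.mean(axis=(0, 1))        # global average pool -> [C]
    logits = pooled @ w_fc + b_fc       # -> [num_classes]
    e = np.exp(logits - logits.max())
    return e / e.sum()                  # class probabilities

rng = np.random.default_rng(0)
o = rng.standard_normal((8, 8, 3))      # output feature map from step S20
w_fc = rng.standard_normal((3, 4))      # hypothetical: 4 classes
probs = classify(o, w_fc, np.zeros(4))
print(probs.shape)                       # (4,)
```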
  • FIG. 4 is a schematic flow chart of an image recognition method based on an attention model according to another embodiment, and an image recognition method based on an attention model, comprising the following steps:
  • Step S21 Acquire an input feature map whose image matrix shape is [W, H, C], where W is a width (the width of the image, the unit is a pixel), H is a height (the height of the image, the unit is a pixel), and C is Number of channels (number of color channels of the image).
  • the image matrix here is a three-dimensional matrix, and the format of [W, H, C] can also be written in the format of W*H*C, that is, width*height*channel number.
  • Step S22 spatially map the input feature map using the spatial attention matrix [C, 1] in the shallow network of the convolutional neural network, obtain the first spatial weight matrix after activation by the activation function, and multiply the first spatial weight matrix bitwise with the image matrix of the input feature map to obtain a first output feature map.
  • the shallow network is used to extract the underlying features of the image, so it is spatially sensitive; it is therefore more appropriate to use the spatial attention matrix [C, 1] to extract the feature attention pattern.
  • the convolutional neural network may include an input layer, an intermediate layer, and an output layer, wherein the shallow network of the convolutional neural network may refer to the input layer of the convolutional neural network, through which the underlying features of the image may be obtained; the underlying features may include: color features of the image, texture features of the image, and shape features of the image.
  • the first output feature map can be obtained using the following formula:
  • o :,:,c = i :,:,c ⊙ sigmoid(i :,:,c · w s + b s)
  • where ⊙ is bitwise multiplication
  • · is matrix multiplication
  • o :,:,c is the output feature map (i.e., the first output feature map)
  • i :,:,c is the input feature map
  • sigmoid is the activation function
  • w s is the spatial mapping weight (i.e., the spatial attention matrix [C, 1])
  • b s is the deviation
  • sigmoid(i :,:,c · w s + b s) is the first spatial weight matrix.
  • FIG. 2 is a schematic diagram of a feature extraction process based on a spatial attention model according to an embodiment, where i is an input feature map, w is a spatial weight matrix, and o is an output feature map.
  • Step S23 spatially map the first output feature map using the channel attention matrix [C, C] in the deep network of the convolutional neural network, obtain the second spatial weight matrix after activation by the activation function, and multiply the second spatial weight matrix bitwise with the image matrix of the first output feature map to obtain a second output feature map.
  • Deep networks are used to extract features of the high-level semantic hierarchy, so they are sensitive to channel information.
  • the convolutional neural network may include an input layer, an intermediate layer, and an output layer, wherein the deep network of the convolutional neural network may refer to the output layer of the convolutional neural network, through which deep features of the image may be acquired; the deep features may be the spatial relationship features of the image.
  • the second output feature map is obtained using the following formula:
  • o w,h,: = i w,h,: ⊙ sigmoid(mean(i w,h,:) · w c + b c)
  • where ⊙ is bitwise multiplication
  • · is matrix multiplication
  • o w,h,: is the output feature map (i.e., the second output feature map)
  • i w,h,: is the input feature map (i.e., the first output feature map)
  • sigmoid is the activation function
  • mean is the averaging function
  • w c is the spatial mapping weight
  • b c is the deviation
  • sigmoid(mean(i w,h,:) · w c + b c) is the second spatial weight matrix.
  • FIG. 3 is a schematic diagram of a feature extraction process based on a channel attention model according to an embodiment, where "feature map 1, feature map 2, ..., feature map m" on the left side represents the input feature maps of m channels, and "feature map 1, feature map 2, ..., feature map m" on the right side represents the output feature maps of m channels.
  • step S24 may be further included: applying a classifier according to the second output feature map to perform image classification.
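  Steps S21 through S23 chain the two attentions one after the other; a compact end-to-end sketch follows. In the patent the two stages sit inside a CNN's shallow and deep layers, which are omitted here, and all sizes and weights are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def spatial_attention(i, w_s, b_s):      # step S22 (shallow network)
    return i * sigmoid(i @ w_s + b_s)    # [W, H, 1] first spatial weight matrix

def channel_attention(i, w_c, b_c):      # step S23 (deep network)
    return i * sigmoid(i.mean(axis=(0, 1)) @ w_c + b_c)  # [1, 1, C] weights

rng = np.random.default_rng(1)
x = rng.standard_normal((8, 8, 3))                                  # step S21: [W, H, C] input
x = spatial_attention(x, rng.standard_normal((3, 1)), 0.0)          # first output feature map
x = channel_attention(x, rng.standard_normal((3, 3)), np.zeros(3))  # second output feature map
print(x.shape)                                                       # (8, 8, 3)
```

  The second output feature map keeps the input's [W, H, C] shape, so any downstream classifier (step S24) can consume it directly.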
  • the application also provides an image recognition system based on attention model, comprising:
  • the image obtaining module is configured to obtain an input feature image whose image matrix shape is [W, H, C], wherein W is a width, H is a height, and C is a channel number.
  • the image processing module is configured to spatially map the input feature map using a preset spatial mapping weight matrix, obtain a spatial weight matrix after activation by the activation function, and multiply the spatial weight matrix bitwise with the image matrix of the input feature map to obtain an output feature map, wherein when the preset spatial mapping weight matrix is a spatial attention matrix [C, 1] whose attention is on image width and height, the shape of the spatial weight matrix is [W, H, 1], or when the preset spatial mapping weight matrix is a channel attention matrix [C, C] whose attention is on the number of image channels, the shape of the spatial weight matrix is [1, 1, C].
  • the image processing module obtains the output feature map by using the following formula:
  • o :,:,c = i :,:,c ⊙ sigmoid(i :,:,c · w s + b s)
  • where ⊙ is bitwise multiplication
  • · is matrix multiplication
  • o :,:,c is the output feature map
  • i :,:,c is the input feature map
  • sigmoid is the activation function
  • w s is the spatial mapping weight
  • b s is the deviation.
  • the image processing module obtains the output feature map by using the following formula:
  • o w,h,: = i w,h,: ⊙ sigmoid(mean(i w,h,:) · w c + b c)
  • where ⊙ is bitwise multiplication
  • · is matrix multiplication
  • o w,h,: is the output feature map
  • i w,h,: is the input feature map
  • sigmoid is the activation function
  • mean is the averaging function
  • w c is the spatial mapping weight
  • b c is the deviation.
  • the classification module may further be configured to apply the classifier to perform image classification according to the output feature map.
  • the present application also provides an image recognition system based on an attention model, comprising: an image acquisition module and an image processing module.
  • the image acquisition module is configured to obtain an input feature map whose image matrix shape is [W, H, C], where W is a width, H is a height, and C is a channel number.
  • the image processing module includes a low-level semantic feature extraction module and an advanced semantic feature extraction module.
  • the low-level semantic feature extraction module is used to: spatially map the input feature map using the spatial attention matrix [C, 1] in the shallow network of the convolutional neural network, obtain the first spatial weight matrix after activation by the activation function, and multiply the first spatial weight matrix bitwise with the image matrix of the input feature map to obtain a first output feature map.
  • the shallow network is used to extract the underlying features of the image, so it is spatially sensitive; it is therefore more appropriate to use the spatial attention matrix [C, 1] to extract the feature attention pattern.
  • the first output feature map can be obtained using the following formula:
  • o :,:,c = i :,:,c ⊙ sigmoid(i :,:,c · w s + b s)
  • where ⊙ is bitwise multiplication
  • · is matrix multiplication
  • o :,:,c is the output feature map (i.e., the first output feature map)
  • i :,:,c is the input feature map
  • sigmoid is the activation function
  • w s is the spatial mapping weight (i.e., the spatial attention matrix [C, 1])
  • b s is the deviation
  • sigmoid(i :,:,c · w s + b s) is the first spatial weight matrix.
  • the advanced semantic feature extraction module is configured to: spatially map the first output feature map using the channel attention matrix [C, C] in the deep network of the convolutional neural network, obtain the second spatial weight matrix after activation by the activation function, and multiply the second spatial weight matrix bitwise with the image matrix of the first output feature map to obtain a second output feature map.
  • Deep networks are used to extract features of the high-level semantic hierarchy, so they are sensitive to channel information.
  • the second output feature map can be obtained using the following formula:
  • o w,h,: = i w,h,: ⊙ sigmoid(mean(i w,h,:) · w c + b c)
  • where ⊙ is bitwise multiplication
  • · is matrix multiplication
  • o w,h,: is the output feature map (i.e., the second output feature map)
  • i w,h,: is the input feature map (i.e., the first output feature map)
  • sigmoid is the activation function
  • mean is the averaging function
  • w c is the spatial mapping weight
  • b c is the deviation
  • sigmoid(mean(i w,h,:) · w c + b c) is the second spatial weight matrix.
  • a classification module is further included, configured to apply the classifier according to the second output feature map to perform image classification.
  • The image recognition method and system based on the attention model described above first acquire an input feature map whose image matrix shape is [W, H, C], where W is the width, H is the height, and C is the number of channels; the input feature map is then spatially mapped using the preset spatial mapping weight matrix, a spatial weight matrix is obtained after activation by the activation function, and the spatial weight matrix is multiplied bitwise with the image matrix of the input feature map to obtain an output feature map. When the preset spatial mapping weight matrix is the spatial attention matrix [C, 1] whose attention is on image width and height, the shape of the spatial weight matrix is [W, H, 1]; when the preset spatial mapping weight matrix is the channel attention matrix [C, C] whose attention is on the number of image channels, the shape of the spatial weight matrix is [1, 1, C].
  • The embodiment of the present application further provides an electronic device, as shown in FIG. 5, including a processor 501, a communication interface 502, a memory 503, and a communication bus 504, wherein the processor 501, the communication interface 502, and the memory 503 complete communication with each other through the communication bus 504.
  • the processor 501 is configured to implement the image recognition method based on the attention model provided by the embodiment of the present application when the program stored in the memory 503 is executed.
  • the embodiment of the present application further provides a storage medium storing a computer program which, when executed by a processor, implements the image recognition method based on the attention model provided by the embodiment of the present application.
  • the embodiment of the present application further provides an application program for performing the attention-based model-based image recognition method provided by the embodiment of the present application at runtime.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Multimedia (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Image Analysis (AREA)

Abstract

An image recognition method and system based on an attention model. The method comprises: first, obtaining an input feature image with the shape of an image matrix as [W, H, C], wherein W is the width, H is the height, and C is the number of channels; and then, performing spatial mapping on the input feature image by using a preset spatial mapping weight matrix, obtaining a spatial weight matrix after the input feature image is activated by an activation function, and multiplying the spatial weight matrix with the image matrix of the input feature image by bits to obtain an output feature image. When a preset spatial mapping weight matrix is a spatial attention matrix [C, 1] whose attention lies in the width and height of an image, the shape of the spatial weight matrix is [W, H, 1], or when the preset spatial mapping weight matrix is a channel attention matrix [C, C] whose attention lies in the number of channels in the image, the shape of the spatial weight matrix is [1, 1, C]. The pertinence of feature extraction can be effectively improved, thereby improving the extraction capability of image local features.

Description

基于注意力模型的图像识别方法和***Image recognition method and system based on attention model
本申请要求于2018年2月11日提交中国专利局、申请号为201810139775.5、申请名称为“基于注意力模型的图像识别方法和***”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。The present application claims priority to Chinese Patent Application No. 201810139775.5, filed on February 11, 2018, the entire disclosure of which is incorporated herein by reference. In this application.
技术领域Technical field
本申请涉及图像处理技术领域,具体而言,本申请涉及一种基于注意力模型的图像识别方法和***。The present application relates to the field of image processing technologies, and in particular, to an image recognition method and system based on an attention model.
背景技术Background technique
近年来,深度学习模型在视频图像处理、语音识别、自然语言处理等相关领域得到了广泛应用。但是在处理具体的图像分类任务或者语音识别任务时,会由于输入数据的多样性,使得深度学习模型只能捕捉到数据的全局信息,而忽视了数据的局部信息。In recent years, deep learning models have been widely used in video image processing, speech recognition, natural language processing and other related fields. However, when dealing with specific image classification tasks or speech recognition tasks, the deep learning model can only capture the global information of the data and ignore the local information of the data due to the diversity of the input data.
为了解决此问题,相关技术中提供了一些解决办法。以图像分类为例,一些传统的解决办法是将图像人为划定成多个分割区域,采用空间金字塔的形式捕捉数据的局部信息,虽然该解决方法可以一定程度上解决上述问题,但是由于是人为预先划定分割区域,所以其对不同数据的泛化能力较差。In order to solve this problem, some solutions are provided in the related art. Taking image classification as an example, some traditional solutions are to manually divide the image into multiple segmentation regions and capture the local information of the data in the form of a spatial pyramid. Although the solution can solve the above problems to a certain extent, it is artificial. The segmentation area is delineated in advance, so its generalization ability for different data is poor.
Summary
The purpose of the present application is to solve at least one of the above technical drawbacks, in particular the tendency to overlook local information in the data.
The present application provides an image recognition method based on an attention model, comprising the following steps:
Step S10: obtaining an input feature map whose image matrix has the shape [W, H, C], where W is the width, H is the height, and C is the number of channels;
Step S20: performing spatial mapping on the input feature map using a preset spatial mapping weight matrix, obtaining a spatial weight matrix after activation by an activation function, and multiplying the spatial weight matrix element-wise with the image matrix of the input feature map to obtain an output feature map, where, when the preset spatial mapping weight matrix is a spatial attention matrix [C, 1] whose attention lies in the image width and height, the shape of the spatial weight matrix is [W, H, 1], or, when the preset spatial mapping weight matrix is a channel attention matrix [C, C] whose attention lies in the number of image channels, the shape of the spatial weight matrix is [1, 1, C].
In one embodiment, when the preset spatial mapping weight matrix is the spatial attention matrix [C, 1], the following formula is used in step S20:
o_{:,:,c} = i_{:,:,c} ⊙ sigmoid(i_{:,:,c} · w_s + b_s)
where ⊙ denotes element-wise (bitwise) multiplication, · denotes matrix multiplication, o_{:,:,c} is the output feature map, i_{:,:,c} is the input feature map, sigmoid is the activation function, w_s is the spatial mapping weight, and b_s is the bias.
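As a concrete illustration, the spatial attention formula above can be sketched in NumPy. This is a non-normative sketch: the function and variable names are our own, and in practice w_s and b_s would be learned parameters rather than random values.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def spatial_attention(i, w_s, b_s):
    """Spatial attention: i is [W, H, C], w_s is [C, 1], b_s a scalar bias.

    i @ w_s collapses the channel axis, so the sigmoid yields a
    [W, H, 1] spatial weight matrix, which is then multiplied
    element-wise with i (broadcast over the C channels).
    """
    weights = sigmoid(i @ w_s + b_s)   # spatial weight matrix, shape [W, H, 1]
    return i * weights                 # output feature map, shape [W, H, C]

# Toy example with W = H = 4, C = 3
rng = np.random.default_rng(0)
i = rng.standard_normal((4, 4, 3))
w_s = rng.standard_normal((3, 1))
o = spatial_attention(i, w_s, 0.0)
print(o.shape)  # (4, 4, 3)
```

The broadcast in `i * weights` is what makes the attention "spatial": each (w, h) position gets a single weight applied identically to all of its channels.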
In one embodiment, when the preset spatial mapping weight matrix is the channel attention matrix [C, C], the following formula is used in step S20:
o_{w,h,:} = i_{w,h,:} ⊙ sigmoid(mean(i_{w,h,:}) · w_c + b_c)
where ⊙ denotes element-wise multiplication, · denotes matrix multiplication, o_{w,h,:} is the output feature map, i_{w,h,:} is the input feature map, sigmoid is the activation function, mean is the averaging function, w_c is the spatial mapping weight, and b_c is the bias.
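The channel attention formula can likewise be sketched in NumPy. As before, this is an illustrative sketch with names of our own choosing; w_c and b_c would be learned parameters.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(i, w_c, b_c):
    """Channel attention: i is [W, H, C], w_c is [C, C], b_c a scalar bias.

    Averaging over the spatial axes gives a length-C vector; mapping it
    through w_c and the sigmoid yields a [1, 1, C] channel weight
    matrix, broadcast over every spatial position.
    """
    m = i.mean(axis=(0, 1))                 # averaging function, shape [C]
    weights = sigmoid(m @ w_c + b_c)        # channel weights, shape [C]
    return i * weights.reshape(1, 1, -1)    # output feature map, [W, H, C]

rng = np.random.default_rng(1)
i = rng.standard_normal((4, 4, 3))
w_c = rng.standard_normal((3, 3))
o = channel_attention(i, w_c, 0.0)
print(o.shape)  # (4, 4, 3)
```

Here the weight vector has one entry per channel, so whole channels are emphasized or suppressed uniformly across all spatial positions.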
In one embodiment, step S20 comprises:
in a shallow layer of a convolutional neural network, performing spatial mapping on the input feature map using the spatial attention matrix [C, 1], obtaining a first spatial weight matrix after activation by the activation function, and multiplying the first spatial weight matrix element-wise with the image matrix of the input feature map to obtain a first output feature map;
in a deep layer of the convolutional neural network, performing spatial mapping on the first output feature map using the channel attention matrix [C, C], obtaining a second spatial weight matrix after activation by the activation function, and multiplying the second spatial weight matrix element-wise with the image matrix of the first output feature map to obtain a second output feature map.
In one embodiment, the method further comprises step S30:
applying a classifier to perform image classification according to the output feature map.
The present application further provides an image recognition system based on an attention model, comprising:
an image acquisition module, configured to obtain an input feature map whose image matrix has the shape [W, H, C], where W is the width, H is the height, and C is the number of channels;
an image processing module, configured to perform spatial mapping on the input feature map using a preset spatial mapping weight matrix, obtain a spatial weight matrix after activation by an activation function, and multiply the spatial weight matrix element-wise with the image matrix of the input feature map to obtain an output feature map, where the preset spatial mapping weight matrix is a spatial attention matrix [C, 1] whose attention lies in the image width and height, in which case the shape of the spatial weight matrix is [W, H, 1], or the preset spatial mapping weight matrix is a channel attention matrix [C, C] whose attention lies in the number of image channels, in which case the shape of the spatial weight matrix is [1, 1, C].
In one embodiment, when the preset spatial mapping weight matrix is the spatial attention matrix [C, 1], the image processing module obtains the output feature map using the following formula:
o_{:,:,c} = i_{:,:,c} ⊙ sigmoid(i_{:,:,c} · w_s + b_s)
where ⊙ denotes element-wise multiplication, · denotes matrix multiplication, o_{:,:,c} is the output feature map, i_{:,:,c} is the input feature map, sigmoid is the activation function, w_s is the spatial mapping weight, and b_s is the bias.
In one embodiment, when the preset spatial mapping weight matrix is the channel attention matrix [C, C], the image processing module obtains the output feature map using the following formula:
o_{w,h,:} = i_{w,h,:} ⊙ sigmoid(mean(i_{w,h,:}) · w_c + b_c)
where ⊙ denotes element-wise multiplication, · denotes matrix multiplication, o_{w,h,:} is the output feature map, i_{w,h,:} is the input feature map, sigmoid is the activation function, mean is the averaging function, w_c is the spatial mapping weight, and b_c is the bias.
In one embodiment, the image processing module comprises a low-level semantic feature extraction module and a high-level semantic feature extraction module;
the low-level semantic feature extraction module is configured to: in a shallow layer of a convolutional neural network, perform spatial mapping on the input feature map using the spatial attention matrix [C, 1], obtain a first spatial weight matrix after activation by an activation function, and multiply the first spatial weight matrix element-wise with the image matrix of the input feature map to obtain a first output feature map;
the high-level semantic feature extraction module is configured to: in a deep layer of the convolutional neural network, perform spatial mapping on the first output feature map using the channel attention matrix [C, C], obtain a second spatial weight matrix after activation by an activation function, and multiply the second spatial weight matrix element-wise with the image matrix of the first output feature map to obtain a second output feature map.
In one embodiment, the system further comprises a classification module, configured to apply a classifier to perform image classification according to the output feature map.
An embodiment of the present application further provides an electronic device, comprising a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory communicate with one another through the communication bus;
the memory is configured to store a computer program;
the processor is configured to implement the attention-model-based image recognition method provided by the embodiments of the present application when executing the program stored in the memory.
An embodiment of the present application further provides a storage medium storing a processing program, where the processing program, when executed by a processor, implements the attention-model-based image recognition method provided by the embodiments of the present application.
An embodiment of the present application further provides an application program, configured to execute, at runtime, the attention-model-based image recognition method provided by the embodiments of the present application.
In the above image recognition method and system based on an attention model, an input feature map whose image matrix has the shape [W, H, C] is first obtained, where W is the width, H is the height, and C is the number of channels; then spatial mapping is performed on the input feature map using a preset spatial mapping weight matrix, a spatial weight matrix is obtained after activation by an activation function, and the spatial weight matrix is multiplied element-wise with the image matrix of the input feature map to obtain an output feature map, where the preset spatial mapping weight matrix is a spatial attention matrix [C, 1] whose attention lies in the image width and height, in which case the shape of the spatial weight matrix is [W, H, 1], or a channel attention matrix [C, C] whose attention lies in the number of image channels, in which case the shape of the spatial weight matrix is [1, 1, C]. Through the above spatial attention matrix [C, 1] or channel attention matrix [C, C], attention during feature extraction can be directed to space or to channels, which effectively improves the pertinence of feature extraction and thereby strengthens the extraction of local image features.
Additional aspects and advantages of the present application will be set forth in part in the following description, and will become apparent from the description or be understood through practice of the present application.
Brief Description of the Drawings
The above and/or additional aspects and advantages of the present application will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:
FIG. 1 is a schematic flowchart of an image recognition method based on an attention model according to an embodiment;
FIG. 2 is a schematic diagram of a feature extraction process based on a spatial attention model according to an embodiment;
FIG. 3 is a schematic diagram of a feature extraction process based on a channel attention model according to an embodiment;
FIG. 4 is a schematic flowchart of an image recognition method based on an attention model according to another embodiment;
FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment.
Detailed Description
Embodiments of the present application are described in detail below, examples of which are illustrated in the accompanying drawings, where the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the accompanying drawings are exemplary and are intended only to explain the present application, and are not to be construed as limiting the present application.
Those skilled in the art will understand that, unless otherwise stated, the singular forms "a", "an", "the", and "said" used herein may also include the plural forms. It should be further understood that the word "comprising" used in the specification of the present application refers to the presence of the stated features, integers, steps, operations, elements, and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
Those skilled in the art will understand that, unless otherwise defined, all terms used herein (including technical and scientific terms) have the same meaning as commonly understood by one of ordinary skill in the art to which the present application belongs. It should also be understood that terms such as those defined in general dictionaries should be understood to have meanings consistent with their meanings in the context of the prior art, and, unless specifically defined as herein, will not be interpreted with idealized or overly formal meanings.
Embodiment 1
FIG. 1 is a schematic flowchart of an image recognition method based on an attention model according to an embodiment. The image recognition method based on an attention model may include the following steps:
Step S10: obtaining an input feature map whose image matrix has the shape [W, H, C], where W is the width (the width of the image, in pixels), H is the height (the height of the image, in pixels), and C is the number of channels (the number of color channels of the image). The image matrix here is a three-dimensional matrix, and the format [W, H, C] may also be written as W*H*C, that is, width*height*number of channels. The input feature map is a feature map serving as input content, and may include color features, texture features, shape features, and spatial relationship features of the image, among others. The embodiments of the present invention do not specifically limit the features of the image included in the feature map.
Step S20: performing spatial mapping on the input feature map using a preset spatial mapping weight matrix, obtaining a spatial weight matrix after activation by an activation function, and multiplying the spatial weight matrix element-wise with the image matrix of the input feature map to obtain an output feature map, where, when the preset spatial mapping weight matrix is a spatial attention matrix [C, 1] whose attention lies in the image width and height, the shape of the spatial weight matrix is [W, H, 1]; or, when the preset spatial mapping weight matrix is a channel attention matrix [C, C] whose attention lies in the number of image channels, the shape of the spatial weight matrix is [1, 1, C].
The output feature map is a feature map serving as the output result.
It should be noted that, in practical applications, the activation function may be set according to actual conditions, and may take various forms, for example, the Sigmoid function, the Tanh function, or the ReLU function.
In one implementation of this embodiment, when the preset spatial mapping weight matrix is the spatial attention matrix [C, 1], the output feature map is obtained in step S20 using the following formula:
o_{:,:,c} = i_{:,:,c} ⊙ sigmoid(i_{:,:,c} · w_s + b_s)
where ⊙ denotes element-wise multiplication, · denotes matrix multiplication, o_{:,:,c} is the output feature map (image matrix), i_{:,:,c} is the input feature map (image matrix), sigmoid is the activation function, w_s is the spatial mapping weight, and b_s is the bias. The operator ⊙ multiplies the data at the same positions of two matrices of the same size to generate a matrix of that same size. For example, if A and B are two 2*2 two-dimensional matrices, A⊙B generates a 2*2 two-dimensional matrix K. Denote the data in A as Amn (A11, A12, A21, A22), where m is the row index and n is the column index; the data in B as Bmn (B11, B12, B21, B22); and the data in K as Kmn (K11, K12, K21, K22). Then Amn × Bmn = Kmn, that is, A11 × B11 = K11, A12 × B12 = K12, A21 × B21 = K21, A22 × B22 = K22.
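The 2*2 example above can be checked directly; in NumPy the element-wise product of two same-shaped arrays is simply `*` (the concrete values below are illustrative):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
K = A * B   # K[m, n] = A[m, n] * B[m, n], same shape as A and B
print(K)    # [[ 5 12]
            #  [21 32]]
```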
FIG. 2 is a schematic diagram of a feature extraction process based on a spatial attention model according to an embodiment, where i is the input feature map, w is the spatial weight matrix, and o is the output feature map.
In this embodiment, when the preset spatial mapping weight matrix is the channel attention matrix [C, C], the output feature map is obtained in step S20 using the following formula:
o_{w,h,:} = i_{w,h,:} ⊙ sigmoid(mean(i_{w,h,:}) · w_c + b_c)
where ⊙ denotes element-wise multiplication, · denotes matrix multiplication, o_{w,h,:} is the output feature map (image matrix), i_{w,h,:} is the input feature map (image matrix), sigmoid is the activation function, mean is the averaging function, w_c is the spatial mapping weight, and b_c is the bias.
FIG. 3 is a schematic diagram of a feature extraction process based on a channel attention model according to an embodiment, where "feature map 1, feature map 2, ..., feature map m" on the left represents the input feature maps of m channels, and "feature map 1, feature map 2, ..., feature map m" on the right represents the output feature maps of m channels.
This embodiment may further include step S30: applying a classifier to perform image classification according to the output feature map.
In this step, the classifier applied to the output feature map can classify the image according to the color features, texture features, shape features, or spatial relationship features of the image.
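The description does not fix a particular classifier. As one hedged sketch, a global-average-pooling plus softmax head could consume the output feature map; all names, shapes, and the pooling choice here are our own illustrative assumptions, not part of the claimed method.

```python
import numpy as np

def classify(o, w_fc, b_fc):
    """Toy classification head (assumed, not from the patent).

    o is an output feature map [W, H, C]; w_fc is a [C, num_classes]
    weight matrix and b_fc a [num_classes] bias.
    """
    v = o.mean(axis=(0, 1))            # global average pool -> [C]
    logits = v @ w_fc + b_fc           # linear layer -> [num_classes]
    e = np.exp(logits - logits.max())  # numerically stable softmax
    return e / e.sum()                 # class probabilities

rng = np.random.default_rng(2)
o = rng.standard_normal((8, 8, 16))    # output feature map [W, H, C]
probs = classify(o, rng.standard_normal((16, 5)), np.zeros(5))
print(probs.shape)  # (5,)
```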
Embodiment 2
FIG. 4 is a schematic flowchart of an image recognition method based on an attention model according to another embodiment. The image recognition method based on an attention model includes the following steps:
Step S21: obtaining an input feature map whose image matrix has the shape [W, H, C], where W is the width (the width of the image, in pixels), H is the height (the height of the image, in pixels), and C is the number of channels (the number of color channels of the image). The image matrix here is a three-dimensional matrix, and the format [W, H, C] may also be written as W*H*C, that is, width*height*number of channels.
Step S22: in a shallow layer of a convolutional neural network, performing spatial mapping on the input feature map using the spatial attention matrix [C, 1], obtaining a first spatial weight matrix after activation by the activation function, and multiplying the first spatial weight matrix element-wise with the image matrix of the input feature map to obtain a first output feature map. The shallow layers are used to extract the low-level features of the image and are therefore spatially sensitive, so an attention mode that extracts features using the spatial attention matrix [C, 1] is suitable here.
A convolutional neural network may include an input layer, intermediate layers, and an output layer. The shallow layers of the convolutional neural network may refer to the input layer, through which the low-level features of the image may be obtained, including color features, texture features, and shape features of the image.
In this embodiment, the first output feature map may be obtained using the following formula:
o_{:,:,c} = i_{:,:,c} ⊙ sigmoid(i_{:,:,c} · w_s + b_s)
where ⊙ denotes element-wise multiplication, · denotes matrix multiplication, o_{:,:,c} is the output feature map (that is, the first output feature map), i_{:,:,c} is the input feature map, sigmoid is the activation function, w_s is the spatial mapping weight (that is, the spatial attention matrix [C, 1]), b_s is the bias, and sigmoid(i_{:,:,c} · w_s + b_s) is the first spatial weight matrix.
FIG. 2 is a schematic diagram of a feature extraction process based on a spatial attention model according to an embodiment, where i is the input feature map, w is the spatial weight matrix, and o is the output feature map.
Step S23: in a deep layer of the convolutional neural network, performing spatial mapping on the first output feature map using the channel attention matrix [C, C], obtaining a second spatial weight matrix after activation by the activation function, and multiplying the second spatial weight matrix element-wise with the image matrix of the first output feature map to obtain a second output feature map. The deep layers are used to extract features at the high-level semantic level and are therefore sensitive to channel information.
A convolutional neural network may include an input layer, intermediate layers, and an output layer. The deep layers of the convolutional neural network may refer to the output layer, through which the deep features of the image may be obtained; the deep features may be spatial relationship features of the image.
In this embodiment, the second output feature map is obtained using the following formula:
o_{w,h,:} = i_{w,h,:} ⊙ sigmoid(mean(i_{w,h,:}) · w_c + b_c)
where ⊙ denotes element-wise multiplication, · denotes matrix multiplication, o_{w,h,:} is the output feature map (that is, the second output feature map), i_{w,h,:} is the input feature map (that is, the first output feature map), sigmoid is the activation function, mean is the averaging function, w_c is the spatial mapping weight, b_c is the bias, and sigmoid(mean(i_{w,h,:}) · w_c + b_c) is the second spatial weight matrix.
FIG. 3 is a schematic diagram of a feature extraction process based on a channel attention model according to an embodiment, where "feature map 1, feature map 2, ..., feature map m" on the left represents the input feature maps of m channels, and "feature map 1, feature map 2, ..., feature map m" on the right represents the output feature maps of m channels.
This embodiment may further include step S24: applying a classifier to perform image classification according to the second output feature map.
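Steps S22 and S23 above can be chained in a single non-normative sketch. The helper names are our own, and the weights are random stand-ins for what would in practice be parameters learned with the convolutional network.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def spatial_attention(i, w_s, b_s=0.0):
    # [W, H, C] @ [C, 1] -> [W, H, 1] first spatial weight matrix,
    # broadcast over channels in the element-wise product
    return i * sigmoid(i @ w_s + b_s)

def channel_attention(i, w_c, b_c=0.0):
    # mean over W and H -> [C]; [C] @ [C, C] -> second weight matrix,
    # reshaped to [1, 1, C] and broadcast over spatial positions
    w = sigmoid(i.mean(axis=(0, 1)) @ w_c + b_c)
    return i * w.reshape(1, 1, -1)

rng = np.random.default_rng(3)
feat = rng.standard_normal((8, 8, 16))  # input feature map [W, H, C]
# Shallow stage (step S22): spatial attention on the input feature map
first = spatial_attention(feat, rng.standard_normal((16, 1)))
# Deep stage (step S23): channel attention on the first output feature map
second = channel_attention(first, rng.standard_normal((16, 16)))
print(first.shape, second.shape)  # (8, 8, 16) (8, 8, 16)
```

Both stages preserve the [W, H, C] shape, which is what lets them be dropped between existing convolutional layers.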
Embodiment 3
The present application further provides an image recognition system based on an attention model, comprising:
an image acquisition module, configured to obtain an input feature map whose image matrix has the shape [W, H, C], where W is the width, H is the height, and C is the number of channels;
an image processing module, configured to perform spatial mapping on the input feature map using a preset spatial mapping weight matrix, obtain a spatial weight matrix after activation by an activation function, and multiply the spatial weight matrix element-wise with the image matrix of the input feature map to obtain an output feature map, where, when the preset spatial mapping weight matrix is a spatial attention matrix [C, 1] whose attention lies in the image width and height, the shape of the spatial weight matrix is [W, H, 1], or, when the preset spatial mapping weight matrix is a channel attention matrix [C, C] whose attention lies in the number of image channels, the shape of the spatial weight matrix is [1, 1, C].
In this embodiment, when the preset spatial mapping weight matrix is the spatial attention matrix [C, 1], the image processing module obtains the output feature map using the following formula:
o_{:,:,c} = i_{:,:,c} ⊙ sigmoid(i_{:,:,c} · w_s + b_s)
where ⊙ denotes element-wise multiplication, · denotes matrix multiplication, o_{:,:,c} is the output feature map, i_{:,:,c} is the input feature map, sigmoid is the activation function, w_s is the spatial mapping weight, and b_s is the bias.
In this embodiment, when the preset spatial mapping weight matrix is the channel attention matrix [C, C], the image processing module obtains the output feature map using the following formula:
o_{w,h,:} = i_{w,h,:} ⊙ sigmoid(mean(i_{w,h,:}) · w_c + b_c)
where ⊙ denotes element-wise multiplication, · denotes matrix multiplication, o_{w,h,:} is the output feature map, i_{w,h,:} is the input feature map, sigmoid is the activation function, mean is the averaging function, w_c is the spatial mapping weight, and b_c is the bias.
This embodiment may further include a classification module, configured to apply a classifier to perform image classification according to the output feature map.
Embodiment 4
The present application further provides an image recognition system based on an attention model, comprising an image acquisition module and an image processing module.
The image acquisition module is configured to obtain an input feature map whose image matrix has the shape [W, H, C], where W is the width, H is the height, and C is the number of channels.
The image processing module includes a low-level semantic feature extraction module and a high-level semantic feature extraction module.
The low-level semantic feature extraction module is configured to: in a shallow layer of a convolutional neural network, perform spatial mapping on the input feature map using the spatial attention matrix [C, 1], obtain a first spatial weight matrix after activation by the activation function, and multiply the first spatial weight matrix element-wise with the image matrix of the input feature map to obtain a first output feature map. The shallow layers are used to extract the low-level features of the image and are therefore spatially sensitive, so an attention mode that extracts features using the spatial attention matrix [C, 1] is suitable here.
In this embodiment, the first output feature map may be obtained using the following formula:
o_{:,:,c} = i_{:,:,c} ⊙ sigmoid(i_{:,:,c} · w_s + b_s)
where ⊙ denotes element-wise multiplication, · denotes matrix multiplication, o_{:,:,c} is the output feature map (that is, the first output feature map), i_{:,:,c} is the input feature map, sigmoid is the activation function, w_s is the spatial mapping weight (that is, the spatial attention matrix [C, 1]), b_s is the bias, and sigmoid(i_{:,:,c} · w_s + b_s) is the first spatial weight matrix.
The high-level semantic feature extraction module is configured to: in a deep layer of the convolutional neural network, perform spatial mapping on the first output feature map using the channel attention matrix [C, C], obtain a second spatial weight matrix after activation by the activation function, and multiply the second spatial weight matrix element-wise with the image matrix of the first output feature map to obtain a second output feature map. The deep layers are used to extract features at the high-level semantic level and are therefore sensitive to channel information.
In this embodiment, the second output feature map may be obtained using the following formula:
o_{w,h,:} = i_{w,h,:} ⊙ sigmoid(mean(i_{w,h,:}) · w_c + b_c)
where ⊙ denotes element-wise multiplication, · denotes matrix multiplication, o_{w,h,:} is the output feature map (that is, the second output feature map), i_{w,h,:} is the input feature map (that is, the first output feature map), sigmoid is the activation function, mean is the averaging function, w_c is the spatial mapping weight, b_c is the bias, and sigmoid(mean(i_{w,h,:}) · w_c + b_c) is the second spatial weight matrix.
This embodiment further includes a classification module, configured to apply a classifier to perform image classification according to the second output feature map.
In the attention-model-based image recognition method and system described above, an input feature map whose image matrix has shape [W, H, C] is first acquired, where W is the width, H is the height, and C is the number of channels. A preset spatial mapping weight matrix is then used to spatially map the input feature map, and a spatial weight matrix is obtained after activation by the activation function; the spatial weight matrix is multiplied elementwise with the image matrix of the input feature map to obtain an output feature map. When the preset spatial mapping weight matrix is the spatial attention matrix [C, 1], whose attention lies in the image width and height, the spatial weight matrix has shape [W, H, 1]; when the preset spatial mapping weight matrix is the channel attention matrix [C, C], whose attention lies in the number of image channels, the spatial weight matrix has shape [1, 1, C]. Through the spatial attention matrix [C, 1] or the channel attention matrix [C, C], attention during feature extraction can be focused on space or on channels, effectively improving the pertinence of feature extraction and thereby strengthening the ability to extract local image features.
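Combining the two stages as summarized above — spatial attention in the shallow layers, channel attention in the deep layers — a minimal end-to-end shape sketch (all weights are hypothetical random initializations, not trained parameters) could look like:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

W, H, C = 8, 8, 16
rng = np.random.default_rng(0)
i = rng.standard_normal((W, H, C))        # input feature map, shape [W, H, C]

# shallow layers: spatial attention matrix [C, 1] -> spatial weight matrix [W, H, 1]
w_s, b_s = rng.standard_normal((C, 1)), 0.0
first = i * sigmoid(i @ w_s + b_s)        # first output feature map, [W, H, C]

# deep layers: channel attention matrix [C, C] -> channel weight matrix [1, 1, C]
w_c, b_c = rng.standard_normal((C, C)), 0.0
weights = sigmoid(first.mean(axis=(0, 1)) @ w_c + b_c)
second = first * weights                  # second output feature map, [W, H, C]

print(second.shape)  # (8, 8, 16)
```

In a real network the two stages would sit inside convolutional blocks and feed a classifier; this sketch only demonstrates that both attention products preserve the [W, H, C] feature-map shape.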
An embodiment of the present application further provides an electronic device, as shown in FIG. 5, comprising a processor 501, a communication interface 502, a memory 503, and a communication bus 504, wherein the processor 501, the communication interface 502, and the memory 503 communicate with each other through the communication bus 504;
the memory 503 is configured to store a computer program;
the processor 501 is configured to, when executing the program stored in the memory 503, implement the attention-model-based image recognition method provided by the embodiments of the present application.
An embodiment of the present application further provides a storage medium storing a processing program which, when executed by a processor, implements the attention-model-based image recognition method provided by the embodiments of the present application.
An embodiment of the present application further provides an application program configured to execute, at runtime, the attention-model-based image recognition method provided by the embodiments of the present application.
It should be understood that although the steps in the flowcharts of the drawings are displayed sequentially as indicated by the arrows, these steps are not necessarily executed in the order indicated. Unless explicitly stated herein, there is no strict order restriction on the execution of these steps, and they may be executed in other orders. Moreover, at least some of the steps in the flowcharts may comprise multiple sub-steps or stages, which are not necessarily completed at the same time but may be executed at different times; their execution order is also not necessarily sequential, and they may be executed in turn or alternately with other steps or with at least a portion of the sub-steps or stages of other steps.
The above is only a partial embodiment of the present application. It should be noted that those of ordinary skill in the art may make several improvements and refinements without departing from the principles of the present application, and these improvements and refinements shall also be regarded as falling within the protection scope of the present application.

Claims (13)

  1. An image recognition method based on an attention model, characterized by comprising the following steps:
    Step S10: acquiring an input feature map whose image matrix has shape [W, H, C], where W is the width, H is the height, and C is the number of channels;
    Step S20: spatially mapping the input feature map using a preset spatial mapping weight matrix, obtaining a spatial weight matrix after activation by an activation function, and multiplying the spatial weight matrix elementwise with the image matrix of the input feature map to obtain an output feature map, wherein when the preset spatial mapping weight matrix is a spatial attention matrix [C, 1] whose attention lies in the image width and height, the spatial weight matrix has shape [W, H, 1], or when the preset spatial mapping weight matrix is a channel attention matrix [C, C] whose attention lies in the number of image channels, the spatial weight matrix has shape [1, 1, C].
  2. The image recognition method based on an attention model according to claim 1, characterized in that when the preset spatial mapping weight matrix is the spatial attention matrix [C, 1], the output feature map is obtained in step S20 using the following formula:
    o_{:,:,c} = i_{:,:,c} ⊙ sigmoid(i_{:,:,c}·w_s + b_s)
    where ⊙ denotes elementwise (bitwise) multiplication, · denotes matrix multiplication, o_{:,:,c} is the output feature map, i_{:,:,c} is the input feature map, sigmoid is the activation function, w_s is the spatial mapping weight, and b_s is the bias.
  3. The image recognition method based on an attention model according to claim 1, characterized in that when the preset spatial mapping weight matrix is the channel attention matrix [C, C], the output feature map is obtained in step S20 using the following formula:
    o_{w,h,:} = i_{w,h,:} ⊙ sigmoid(mean(i_{w,h,:})·w_c + b_c)
    where ⊙ denotes elementwise (bitwise) multiplication, · denotes matrix multiplication, o_{w,h,:} is the output feature map, i_{w,h,:} is the input feature map, sigmoid is the activation function, mean is the averaging function, w_c is the spatial mapping weight, and b_c is the bias.
  4. The image recognition method based on an attention model according to claim 1, characterized in that step S20 comprises:
    spatially mapping the input feature map using the spatial attention matrix [C, 1] in the shallow layers of a convolutional neural network, obtaining a first spatial weight matrix after activation by the activation function, and multiplying the first spatial weight matrix elementwise with the image matrix of the input feature map to obtain a first output feature map;
    spatially mapping the first output feature map using the channel attention matrix [C, C] in the deep layers of the convolutional neural network, obtaining a second spatial weight matrix after activation by the activation function, and multiplying the second spatial weight matrix elementwise with the image matrix of the first output feature map to obtain a second output feature map.
  5. The image recognition method based on an attention model according to claim 1, characterized by further comprising step S30:
    applying a classifier to the output feature map to perform image classification.
  6. An image recognition system based on an attention model, characterized by comprising:
    an image acquisition module, configured to acquire an input feature map whose image matrix has shape [W, H, C], where W is the width, H is the height, and C is the number of channels;
    an image processing module, configured to spatially map the input feature map using a preset spatial mapping weight matrix, obtain a spatial weight matrix after activation by an activation function, and multiply the spatial weight matrix elementwise with the image matrix of the input feature map to obtain an output feature map, wherein when the preset spatial mapping weight matrix is a spatial attention matrix [C, 1] whose attention lies in the image width and height, the spatial weight matrix has shape [W, H, 1], or when the preset spatial mapping weight matrix is a channel attention matrix [C, C] whose attention lies in the number of image channels, the spatial weight matrix has shape [1, 1, C].
  7. The image recognition system based on an attention model according to claim 6, characterized in that when the preset spatial mapping weight matrix is the spatial attention matrix [C, 1], the image processing module obtains the output feature map using the following formula:
    o_{:,:,c} = i_{:,:,c} ⊙ sigmoid(i_{:,:,c}·w_s + b_s)
    where ⊙ denotes elementwise (bitwise) multiplication, · denotes matrix multiplication, o_{:,:,c} is the output feature map, i_{:,:,c} is the input feature map, sigmoid is the activation function, w_s is the spatial mapping weight, and b_s is the bias.
  8. The image recognition system based on an attention model according to claim 6, characterized in that when the preset spatial mapping weight matrix is the channel attention matrix [C, C], the image processing module obtains the output feature map using the following formula:
    o_{w,h,:} = i_{w,h,:} ⊙ sigmoid(mean(i_{w,h,:})·w_c + b_c)
    where ⊙ denotes elementwise (bitwise) multiplication, · denotes matrix multiplication, o_{w,h,:} is the output feature map, i_{w,h,:} is the input feature map, sigmoid is the activation function, mean is the averaging function, w_c is the spatial mapping weight, and b_c is the bias.
  9. The image recognition system based on an attention model according to claim 6, characterized in that the image processing module comprises a low-level semantic feature extraction module and a high-level semantic feature extraction module;
    the low-level semantic feature extraction module is configured to: spatially map the input feature map using the spatial attention matrix [C, 1] in the shallow layers of a convolutional neural network, obtain a first spatial weight matrix after activation by the activation function, and multiply the first spatial weight matrix elementwise with the image matrix of the input feature map to obtain a first output feature map;
    the high-level semantic feature extraction module is configured to: spatially map the first output feature map using the channel attention matrix [C, C] in the deep layers of the convolutional neural network, obtain a second spatial weight matrix after activation by the activation function, and multiply the second spatial weight matrix elementwise with the image matrix of the first output feature map to obtain a second output feature map.
  10. The image recognition system based on an attention model according to claim 6, characterized by further comprising a classification module configured to apply a classifier to the output feature map to perform image classification.
  11. An electronic device, characterized by comprising a processor, a communication interface, a memory, and a communication bus, wherein the processor, the communication interface, and the memory communicate with each other through the communication bus;
    the memory is configured to store a computer program;
    the processor is configured to, when executing the program stored in the memory, implement the method steps of any one of claims 1-5.
  12. A storage medium, characterized in that a processing program is stored in the storage medium, and the processing program, when executed by a processor, implements the method steps of any one of claims 1-5.
  13. An application program, characterized in that the application program is configured to execute, at runtime, the method steps of any one of claims 1-5.
PCT/CN2018/122684 2018-02-11 2018-12-21 Image recognition method and system based on attention model WO2019153908A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810139775.5 2018-02-11
CN201810139775.5A CN108364023A (en) 2018-02-11 2018-02-11 Image-recognizing method based on attention model and system

Publications (1)

Publication Number Publication Date
WO2019153908A1 true WO2019153908A1 (en) 2019-08-15

Family

ID=63005720

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/122684 WO2019153908A1 (en) 2018-02-11 2018-12-21 Image recognition method and system based on attention model

Country Status (2)

Country Link
CN (1) CN108364023A (en)
WO (1) WO2019153908A1 (en)

Cited By (38)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111028253A (en) * 2019-11-25 2020-04-17 北京科技大学 Iron concentrate powder segmentation method and segmentation device
CN111126258A (en) * 2019-12-23 2020-05-08 深圳市华尊科技股份有限公司 Image recognition method and related device
CN111369433A (en) * 2019-11-12 2020-07-03 天津大学 Three-dimensional image super-resolution reconstruction method based on separable convolution and attention
CN111414962A (en) * 2020-03-19 2020-07-14 创新奇智(重庆)科技有限公司 Image classification method introducing object relationship
CN111539884A (en) * 2020-04-21 2020-08-14 温州大学 Neural network video deblurring method based on multi-attention machine mechanism fusion
CN111639654A (en) * 2020-05-12 2020-09-08 博泰车联网(南京)有限公司 Image processing method and device and computer storage medium
CN111950586A (en) * 2020-07-01 2020-11-17 银江股份有限公司 Target detection method introducing bidirectional attention
CN112035645A (en) * 2020-09-01 2020-12-04 平安科技(深圳)有限公司 Data query method and system
CN112464787A (en) * 2020-11-25 2021-03-09 北京航空航天大学 Remote sensing image ship target fine-grained classification method based on spatial fusion attention
CN112489033A (en) * 2020-12-13 2021-03-12 南通云达信息技术有限公司 Method for detecting cleaning effect of concrete curing box based on classification weight
CN112560907A (en) * 2020-12-02 2021-03-26 西安电子科技大学 Limited pixel infrared unmanned aerial vehicle target detection method based on mixed domain attention
CN112613356A (en) * 2020-12-07 2021-04-06 北京理工大学 Action detection method and device based on deep attention fusion network
CN112653899A (en) * 2020-12-18 2021-04-13 北京工业大学 Network live broadcast video feature extraction method based on joint attention ResNeSt under complex scene
CN112733578A (en) * 2019-10-28 2021-04-30 普天信息技术有限公司 Vehicle weight identification method and system
CN112801945A (en) * 2021-01-11 2021-05-14 西北大学 Depth Gaussian mixture model skull registration method based on dual attention mechanism feature extraction
CN113255821A (en) * 2021-06-15 2021-08-13 中国人民解放军国防科技大学 Attention-based image recognition method, attention-based image recognition system, electronic device and storage medium
CN113283278A (en) * 2021-01-08 2021-08-20 浙江大学 Anti-interference laser underwater target recognition instrument
CN113408577A (en) * 2021-05-12 2021-09-17 桂林电子科技大学 Image classification method based on attention mechanism
CN113450366A (en) * 2021-07-16 2021-09-28 桂林电子科技大学 AdaptGAN-based low-illumination semantic segmentation method
CN113468967A (en) * 2021-06-02 2021-10-01 北京邮电大学 Lane line detection method, device, equipment and medium based on attention mechanism
CN113539297A (en) * 2021-07-08 2021-10-22 中国海洋大学 Combined attention mechanism model and method for sound classification and application
CN113569735A (en) * 2021-07-28 2021-10-29 中国人民解放军空军预警学院 Complex coordinate attention module and complex input feature map processing method and system
CN113658114A (en) * 2021-07-29 2021-11-16 南京理工大学 Contact net opening pin defect target detection method based on multi-scale cross attention
CN113674334A (en) * 2021-07-06 2021-11-19 复旦大学 Texture recognition method based on depth self-attention network and local feature coding
CN113744844A (en) * 2021-09-17 2021-12-03 天津市肿瘤医院(天津医科大学肿瘤医院) Thyroid ultrasonic image processing method based on deep convolutional neural network
CN113744284A (en) * 2021-09-06 2021-12-03 浙大城市学院 Brain tumor image region segmentation method and device, neural network and electronic equipment
CN113744164A (en) * 2021-11-05 2021-12-03 深圳市安软慧视科技有限公司 Method, system and related equipment for enhancing low-illumination image at night quickly
CN113793345A (en) * 2021-09-07 2021-12-14 复旦大学附属华山医院 Medical image segmentation method and device based on improved attention module
CN114549962A (en) * 2022-03-07 2022-05-27 重庆锐云科技有限公司 Garden plant leaf disease classification method
CN114612979A (en) * 2022-03-09 2022-06-10 平安科技(深圳)有限公司 Living body detection method and device, electronic equipment and storage medium
CN114758206A (en) * 2022-06-13 2022-07-15 武汉珈鹰智能科技有限公司 Steel truss structure abnormity detection method and device
CN115578615A (en) * 2022-10-31 2023-01-06 成都信息工程大学 Night traffic sign image detection model establishing method based on deep learning
CN115937792A (en) * 2023-01-10 2023-04-07 浙江非线数联科技股份有限公司 Intelligent community operation management system based on block chain
CN116030014A (en) * 2023-01-06 2023-04-28 浙江伟众科技有限公司 Intelligent processing method and system for soft and hard air conditioner pipes
US11694319B2 (en) 2020-04-10 2023-07-04 Samsung Display Co., Ltd. Image-based defects identification and semi-supervised localization
CN116503398A (en) * 2023-06-26 2023-07-28 广东电网有限责任公司湛江供电局 Insulator pollution flashover detection method and device, electronic equipment and storage medium
CN117218720A (en) * 2023-08-25 2023-12-12 中南民族大学 Footprint identification method, system and related device of composite attention mechanism
CN117789153A (en) * 2024-02-26 2024-03-29 浙江驿公里智能科技有限公司 Automobile oil tank outer cover positioning system and method based on computer vision

Families Citing this family (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108364023A (en) * 2018-02-11 2018-08-03 北京达佳互联信息技术有限公司 Image-recognizing method based on attention model and system
CN109344840B (en) * 2018-08-07 2022-04-01 深圳市商汤科技有限公司 Image processing method and apparatus, electronic device, storage medium, and program product
CN109325911B (en) * 2018-08-27 2020-04-14 北京航空航天大学 Empty base rail detection method based on attention enhancement mechanism
CN109584161A (en) * 2018-11-29 2019-04-05 四川大学 The Remote sensed image super-resolution reconstruction method of convolutional neural networks based on channel attention
CN109376804B (en) * 2018-12-19 2020-10-30 中国地质大学(武汉) Hyperspectral remote sensing image classification method based on attention mechanism and convolutional neural network
CN109871532B (en) * 2019-01-04 2022-07-08 平安科技(深圳)有限公司 Text theme extraction method and device and storage medium
CN109871777B (en) * 2019-01-23 2021-10-01 广州智慧城市发展研究院 Behavior recognition system based on attention mechanism
CN109960726B (en) * 2019-02-13 2024-01-23 平安科技(深圳)有限公司 Text classification model construction method, device, terminal and storage medium
CN111598117B (en) * 2019-02-21 2023-06-30 成都通甲优博科技有限责任公司 Image recognition method and device
CN109919925A (en) * 2019-03-04 2019-06-21 联觉(深圳)科技有限公司 Printed circuit board intelligent detecting method, system, electronic device and storage medium
CN109919249B (en) * 2019-03-19 2020-07-31 北京字节跳动网络技术有限公司 Method and device for generating feature map
CN109871909B (en) 2019-04-16 2021-10-01 京东方科技集团股份有限公司 Image recognition method and device
CN110084794B (en) * 2019-04-22 2020-12-22 华南理工大学 Skin cancer image identification method based on attention convolution neural network
CN110046598B (en) * 2019-04-23 2023-01-06 中南大学 Plug-and-play multi-scale space and channel attention remote sensing image target detection method
CN110135325B (en) * 2019-05-10 2020-12-08 山东大学 Method and system for counting people of crowd based on scale adaptive network
CN110334749B (en) * 2019-06-20 2021-08-03 浙江工业大学 Anti-attack defense model based on attention mechanism, construction method and application
CN110334716B (en) * 2019-07-04 2022-01-11 北京迈格威科技有限公司 Feature map processing method, image processing method and device
CN110689093B (en) * 2019-12-10 2020-04-21 北京同方软件有限公司 Image target fine classification method under complex scene
CN111191737B (en) * 2020-01-05 2023-07-25 天津大学 Fine granularity image classification method based on multi-scale repeated attention mechanism
CN111461973A (en) * 2020-01-17 2020-07-28 华中科技大学 Super-resolution reconstruction method and system for image
CN110991568B (en) * 2020-03-02 2020-07-31 佳都新太科技股份有限公司 Target identification method, device, equipment and storage medium
CN112287989B (en) * 2020-10-20 2022-06-07 武汉大学 Aerial image ground object classification method based on self-attention mechanism
CN112329702B (en) * 2020-11-19 2021-05-07 上海点泽智能科技有限公司 Method and device for rapid face density prediction and face detection, electronic equipment and storage medium
CN114529963A (en) * 2020-11-23 2022-05-24 中兴通讯股份有限公司 Image processing method, image processing device, electronic equipment and readable storage medium
CN112633158A (en) * 2020-12-22 2021-04-09 广东电网有限责任公司电力科学研究院 Power transmission line corridor vehicle identification method, device, equipment and storage medium
CN112766597B (en) * 2021-01-29 2023-06-27 中国科学院自动化研究所 Bus passenger flow prediction method and system
CN113076878B (en) * 2021-04-02 2023-06-09 郑州大学 Constitution identification method based on attention mechanism convolution network structure
CN113139444A (en) * 2021-04-06 2021-07-20 上海工程技术大学 Space-time attention mask wearing real-time detection method based on MobileNet V2
CN113361441B (en) * 2021-06-18 2022-09-06 山东大学 Sight line area estimation method and system based on head posture and space attention
CN114005078B (en) * 2021-12-31 2022-03-29 山东交通学院 Vehicle weight identification method based on double-relation attention mechanism

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104517122A (en) * 2014-12-12 2015-04-15 浙江大学 Image target recognition method based on optimized convolution architecture
CN106127749A (en) * 2016-06-16 2016-11-16 华南理工大学 The target part recognition methods of view-based access control model attention mechanism
CN107273800A (en) * 2017-05-17 2017-10-20 大连理工大学 A kind of action identification method of the convolution recurrent neural network based on attention mechanism
CN107609638A (en) * 2017-10-12 2018-01-19 湖北工业大学 A kind of method based on line decoder and interpolation sampling optimization convolutional neural networks
CN108364023A (en) * 2018-02-11 2018-08-03 北京达佳互联信息技术有限公司 Image-recognizing method based on attention model and system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106934397B (en) * 2017-03-13 2020-09-01 北京市商汤科技开发有限公司 Image processing method and device and electronic equipment
CN107291945B (en) * 2017-07-12 2020-03-31 上海媒智科技有限公司 High-precision clothing image retrieval method and system based on visual attention model

Cited By (64)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112733578B (en) * 2019-10-28 2024-05-24 普天信息技术有限公司 Vehicle re-identification method and system
CN112733578A (en) * 2019-10-28 2021-04-30 普天信息技术有限公司 Vehicle weight identification method and system
CN111369433A (en) * 2019-11-12 2020-07-03 天津大学 Three-dimensional image super-resolution reconstruction method based on separable convolution and attention
CN111369433B (en) * 2019-11-12 2024-02-13 天津大学 Three-dimensional image super-resolution reconstruction method based on separable convolution and attention
CN111028253A (en) * 2019-11-25 2020-04-17 北京科技大学 Iron concentrate powder segmentation method and segmentation device
CN111028253B (en) * 2019-11-25 2023-05-30 北京科技大学 Method and device for dividing fine iron powder
CN111126258B (en) * 2019-12-23 2023-06-23 深圳市华尊科技股份有限公司 Image recognition method and related device
CN111126258A (en) * 2019-12-23 2020-05-08 深圳市华尊科技股份有限公司 Image recognition method and related device
CN111414962A (en) * 2020-03-19 2020-07-14 创新奇智(重庆)科技有限公司 Image classification method introducing object relationship
US11694319B2 (en) 2020-04-10 2023-07-04 Samsung Display Co., Ltd. Image-based defects identification and semi-supervised localization
CN111539884A (en) * 2020-04-21 2020-08-14 温州大学 Neural network video deblurring method based on multi-attention machine mechanism fusion
CN111539884B (en) * 2020-04-21 2023-08-15 温州大学 Neural network video deblurring method based on multi-attention mechanism fusion
CN111639654B (en) * 2020-05-12 2023-12-26 博泰车联网(南京)有限公司 Image processing method, device and computer storage medium
CN111639654A (en) * 2020-05-12 2020-09-08 博泰车联网(南京)有限公司 Image processing method and device and computer storage medium
CN111950586A (en) * 2020-07-01 2020-11-17 银江股份有限公司 Target detection method introducing bidirectional attention
CN111950586B (en) * 2020-07-01 2024-01-19 银江技术股份有限公司 Target detection method for introducing bidirectional attention
CN112035645A (en) * 2020-09-01 2020-12-04 平安科技(深圳)有限公司 Data query method and system
CN112035645B (en) * 2020-09-01 2024-06-11 平安科技(深圳)有限公司 Data query method and system
CN112464787A (en) * 2020-11-25 2021-03-09 北京航空航天大学 Remote sensing image ship target fine-grained classification method based on spatial fusion attention
CN112560907B (en) * 2020-12-02 2024-05-28 西安电子科技大学 Finite pixel infrared unmanned aerial vehicle target detection method based on mixed domain attention
CN112560907A (en) * 2020-12-02 2021-03-26 西安电子科技大学 Limited pixel infrared unmanned aerial vehicle target detection method based on mixed domain attention
CN112613356A (en) * 2020-12-07 2021-04-06 北京理工大学 Action detection method and device based on deep attention fusion network
CN112489033A (en) * 2020-12-13 2021-03-12 南通云达信息技术有限公司 Method for detecting cleaning effect of concrete curing box based on classification weight
CN112653899B (en) * 2020-12-18 2022-07-12 北京工业大学 Network live broadcast video feature extraction method based on joint attention ResNeSt under complex scene
CN112653899A (en) * 2020-12-18 2021-04-13 北京工业大学 Network live broadcast video feature extraction method based on joint attention ResNeSt under complex scene
CN113283278B (en) * 2021-01-08 2023-03-24 浙江大学 Anti-interference laser underwater target recognition instrument
CN113283278A (en) * 2021-01-08 2021-08-20 浙江大学 Anti-interference laser underwater target recognition instrument
CN112801945A (en) * 2021-01-11 2021-05-14 西北大学 Depth Gaussian mixture model skull registration method based on dual attention mechanism feature extraction
CN113408577A (en) * 2021-05-12 2021-09-17 桂林电子科技大学 Image classification method based on attention mechanism
CN113468967B (en) * 2021-06-02 2023-08-18 北京邮电大学 Attention mechanism-based lane line detection method, attention mechanism-based lane line detection device, attention mechanism-based lane line detection equipment and attention mechanism-based lane line detection medium
CN113468967A (en) * 2021-06-02 2021-10-01 北京邮电大学 Lane line detection method, device, equipment and medium based on attention mechanism
CN113255821A (en) * 2021-06-15 2021-08-13 中国人民解放军国防科技大学 Attention-based image recognition method, attention-based image recognition system, electronic device and storage medium
CN113674334A (en) * 2021-07-06 2021-11-19 复旦大学 Texture recognition method based on depth self-attention network and local feature coding
CN113674334B (en) * 2021-07-06 2023-04-18 复旦大学 Texture recognition method based on depth self-attention network and local feature coding
CN113539297A (en) * 2021-07-08 2021-10-22 中国海洋大学 Combined attention mechanism model and method for sound classification and application
CN113450366B (en) * 2021-07-16 2022-08-30 桂林电子科技大学 AdaptGAN-based low-illumination semantic segmentation method
CN113450366A (en) * 2021-07-16 2021-09-28 桂林电子科技大学 AdaptGAN-based low-illumination semantic segmentation method
CN113569735B (en) * 2021-07-28 2023-04-07 中国人民解放军空军预警学院 Complex input feature graph processing method and system based on complex coordinate attention module
CN113569735A (en) * 2021-07-28 2021-10-29 中国人民解放军空军预警学院 Complex coordinate attention module and complex input feature map processing method and system
CN113658114A (en) * 2021-07-29 2021-11-16 南京理工大学 Contact net opening pin defect target detection method based on multi-scale cross attention
CN113744284A (en) * 2021-09-06 2021-12-03 浙大城市学院 Brain tumor image region segmentation method and device, neural network and electronic equipment
CN113744284B (en) * 2021-09-06 2023-08-29 浙大城市学院 Brain tumor image region segmentation method and device, neural network and electronic equipment
CN113793345B (en) * 2021-09-07 2023-10-31 复旦大学附属华山医院 Medical image segmentation method and device based on improved attention module
CN113793345A (en) * 2021-09-07 2021-12-14 复旦大学附属华山医院 Medical image segmentation method and device based on improved attention module
CN113744844A (en) * 2021-09-17 2021-12-03 天津市肿瘤医院(天津医科大学肿瘤医院) Thyroid ultrasonic image processing method based on deep convolutional neural network
CN113744844B (en) * 2021-09-17 2024-01-26 天津市肿瘤医院(天津医科大学肿瘤医院) Thyroid ultrasonic image processing method based on deep convolutional neural network
CN113744164A (en) * 2021-11-05 2021-12-03 深圳市安软慧视科技有限公司 Fast night-time low-illumination image enhancement method, system and related equipment
CN113744164B (en) * 2021-11-05 2022-03-15 深圳市安软慧视科技有限公司 Fast night-time low-illumination image enhancement method, system and related equipment
CN114549962A (en) * 2022-03-07 2022-05-27 重庆锐云科技有限公司 Garden plant leaf disease classification method
CN114612979B (en) * 2022-03-09 2024-05-31 平安科技(深圳)有限公司 Living body detection method and device, electronic equipment and storage medium
CN114612979A (en) * 2022-03-09 2022-06-10 平安科技(深圳)有限公司 Living body detection method and device, electronic equipment and storage medium
CN114758206A (en) * 2022-06-13 2022-07-15 武汉珈鹰智能科技有限公司 Steel truss structure anomaly detection method and device
CN114758206B (en) * 2022-06-13 2022-10-28 武汉珈鹰智能科技有限公司 Steel truss structure anomaly detection method and device
CN115578615A (en) * 2022-10-31 2023-01-06 成都信息工程大学 Night traffic sign image detection model establishing method based on deep learning
CN116030014A (en) * 2023-01-06 2023-04-28 浙江伟众科技有限公司 Intelligent processing method and system for soft and hard air conditioner pipes
CN116030014B (en) * 2023-01-06 2024-04-09 浙江伟众科技有限公司 Intelligent processing method and system for soft and hard air conditioner pipes
CN115937792B (en) * 2023-01-10 2023-09-12 浙江非线数联科技股份有限公司 Intelligent community operation management system based on blockchain
CN115937792A (en) * 2023-01-10 2023-04-07 浙江非线数联科技股份有限公司 Intelligent community operation management system based on blockchain
CN116503398A (en) * 2023-06-26 2023-07-28 广东电网有限责任公司湛江供电局 Insulator pollution flashover detection method and device, electronic equipment and storage medium
CN116503398B (en) * 2023-06-26 2023-09-26 广东电网有限责任公司湛江供电局 Insulator pollution flashover detection method and device, electronic equipment and storage medium
CN117218720B (en) * 2023-08-25 2024-04-16 中南民族大学 Footprint identification method, system and related device of composite attention mechanism
CN117218720A (en) * 2023-08-25 2023-12-12 中南民族大学 Footprint identification method, system and related device of composite attention mechanism
CN117789153B (en) * 2024-02-26 2024-05-03 浙江驿公里智能科技有限公司 Automobile fuel tank cap positioning system and method based on computer vision
CN117789153A (en) * 2024-02-26 2024-03-29 浙江驿公里智能科技有限公司 Automobile fuel tank cap positioning system and method based on computer vision

Also Published As

Publication number Publication date
CN108364023A (en) 2018-08-03

Similar Documents

Publication Publication Date Title
WO2019153908A1 (en) Image recognition method and system based on attention model
CN109685819B (en) Three-dimensional medical image segmentation method based on feature enhancement
WO2017124646A1 (en) Artificial neural network calculating device and method for sparse connection
CN107704838B (en) Target object attribute identification method and device
CN109241880B (en) Image processing method, image processing apparatus, and computer-readable storage medium
WO2019100724A1 (en) Method and device for training multi-label classification model
CN112651438A (en) Multi-class image classification method and device, terminal equipment and storage medium
KR100924689B1 (en) Apparatus and method for transforming an image in a mobile device
CN106855952B (en) Neural network-based computing method and device
CN111340077B (en) Attention mechanism-based disparity map acquisition method and device
CN107590811B (en) Scene-segmentation-based landscape image processing method and device, and computing equipment
WO2020253304A1 (en) Face recognition device and image processing method, feature extraction model, and storage medium
US11481994B2 (en) Method and apparatus for extracting image data in parallel from multiple convolution windows, device, and computer-readable storage medium
CN111383232A (en) Matting method, matting device, terminal equipment and computer-readable storage medium
WO2023284608A1 (en) Character recognition model generating method and apparatus, computer device, and storage medium
CN107862680A Target tracking optimization method based on correlation filter
CN111833360B (en) Image processing method, device, equipment and computer readable storage medium
JP2015036939A (en) Feature extraction program and information processing apparatus
CN112799599A (en) Data storage method, computing core, chip and electronic equipment
CN112257727A (en) Feature image extraction method based on deep learning self-adaptive deformable convolution
WO2021042544A1 (en) Facial verification method and apparatus based on mesh removal model, and computer device and storage medium
WO2022179075A1 (en) Data processing method and apparatus, computer device and storage medium
US20210357647A1 (en) Method and System for Video Action Classification by Mixing 2D and 3D Features
WO2021081854A1 (en) Convolution operation circuit and convolution operation method
Lin et al. Coupled space learning of image style transformation

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18905182

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 26.11.2020)

122 Ep: pct application non-entry in european phase

Ref document number: 18905182

Country of ref document: EP

Kind code of ref document: A1