CN110807752A - Image attention mechanism processing method based on convolutional neural network - Google Patents

Info

Publication number
CN110807752A
Authority
CN
China
Prior art keywords
target image
image
preset
channel
attention
Legal status
Granted
Application number
CN201910896954.8A
Other languages
Chinese (zh)
Other versions
CN110807752B (en)
Inventor
陈旋
吕成云
张玉立
Current Assignee
Jiangsu Ai Jia Household Articles Co Ltd
Original Assignee
Jiangsu Ai Jia Household Articles Co Ltd
Priority date
2019-09-23
Filing date
2019-09-23
Publication date
2020-02-18
Application filed by Jiangsu Ai Jia Household Articles Co Ltd
Priority to CN201910896954.8A
Publication of CN110807752A
Application granted
Publication of CN110807752B
Status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20212 Image combination
    • G06T2207/20221 Image fusion; Image merging

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to an image attention mechanism processing method based on a convolutional neural network. The method adopts a new control logic that introduces an attention mechanism into a residual module of a convolutional neural network. To handle redundant information better, spatial position attention and channel attention are first applied to the target image, which removes the redundant information generated by image superposition. Preset convolutional layers then perform fusion processing on the target image. Finally, a skip connection joins the input and the output of the module to give the processing result for the target image, so that spatial position attention and channel attention can be added more accurately.

Description

Image attention mechanism processing method based on convolutional neural network
Technical Field
The invention relates to an image attention mechanism processing method based on a convolutional neural network, and belongs to the technical field of image processing.
Background
With the advent of the big data era, deep neural networks (deep learning) have developed rapidly and have been applied successfully in many industrial fields. As one kind of deep neural network, the convolutional neural network is widely used in image processing. Traditional machine vision methods require features to be designed by hand; such hand-designed features cannot cope with complex and variable lighting, color, texture, and similar conditions, so the processing results are poor.
The convolutional neural network approach builds the image processing model with a convolutional neural network. Unlike traditional machine vision methods, the model does not need to be designed by hand but is learned by the network, so it can cope with complex and variable lighting, color, texture, and similar conditions. Convolutional neural networks have achieved great success in many image processing tasks, including image recognition, semantic segmentation, object detection, and human pose estimation, and in some of them their performance exceeds that of humans. As with human vision, adding an attention mechanism when a convolutional neural network processes images can give better results.
Attention mechanisms in current image processing cover two aspects: spatial position information and feature map channels. Spatial attention lets the network process spatial position information better by giving a large weight to regions of interest and a small weight to regions that are not of interest. In current methods, however, the two kinds of attention are either used separately, or a convolution is performed first and spatial and channel attention are then applied in sequence, so the redundant information present in the feature map after superposition cannot be handled well.
Disclosure of Invention
The technical problem to be solved by the invention is to provide an image attention mechanism processing method based on a convolutional neural network that adopts a new control logic and, in combination with the convolutional neural network, adds spatial position attention and channel attention more accurately.
To solve this technical problem, the invention adopts the following technical scheme: an image attention mechanism processing method based on a convolutional neural network, used to perform attention processing on a target image, comprising the following steps:
Step A: copy the target image as the original target image, then go to step B;
Step B: add spatial attention information to the target image to obtain an updated target image, then go to step C;
Step C: add channel attention information to the target image to obtain an updated target image, then go to step D;
Step D: perform fusion processing on the target image using a preset number of convolutional layers of preset types, update the target image, then go to step E;
Step E: add the pixel value of each pixel of the original target image to the pixel value of the pixel at the same position in the target image to obtain the final target image, which is the result of the attention processing of the target image.
As a preferred technical scheme of the invention, step B comprises the following steps B1 to B2:
Step B1: according to the preset weight of each preset divided region of the target image, obtain a weight image of the same size as the target image, in which the pixel value of each pixel is the preset weight of the preset divided region containing the pixel at the same position in the target image, then go to step B2;
Step B2: for the channel image of each preset-type channel of the target image, multiply the attribute value of each pixel in the channel image by the pixel value of the pixel at the same position in the weight image to update that channel image; once all channel images have been updated, the target image is updated, then go to step C.
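Steps B1 and B2 can be illustrated with a minimal NumPy sketch. The source does not say how the regions are represented, so the integer `region_map`, the `region_weights` values, and the function name below are all assumptions made for illustration only.

```python
import numpy as np

def spatial_attention_from_regions(image, region_map, region_weights):
    """Sketch of steps B1-B2 (names and data layout are hypothetical).

    image:          float array of shape (C, H, W), one channel image per channel
    region_map:     int array of shape (H, W); entry = index of the preset region
    region_weights: float array of shape (R,); preset weight of each region
    """
    # B1: each pixel of the weight image takes the weight of the region it falls in.
    weight_image = region_weights[region_map]            # shape (H, W)
    # B2: multiply every channel image by the weight image, position by position.
    return image * weight_image[None, :, :], weight_image

# Hypothetical 2-channel 4x4 image: left half is region 0, right half is region 1.
image = np.ones((2, 4, 4))
region_map = np.zeros((4, 4), dtype=int)
region_map[:, 2:] = 1
region_weights = np.array([0.25, 1.0])   # assumed values that de-emphasise the left half

updated, weight_image = spatial_attention_from_regions(image, region_map, region_weights)
```

The same weight image is broadcast over all channels, so spatial attention changes where the image is emphasised without changing the relative balance between channels.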
As a preferred technical scheme of the invention, in step B1 a 1 × 1 convolutional layer outputs the weight image, which has the same size as the target image and in which the pixel value of each pixel is the preset weight of the preset divided region containing the pixel at the same position in the target image.
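As a rough illustration of how a 1 × 1 convolutional layer can produce a weight image of the same height and width as the target image, the sketch below collapses the channels to a single map. The kernel here is a random placeholder standing in for learned parameters, and the source specifies neither the kernel values nor any activation on the output, so treat those details as assumptions.

```python
import numpy as np

def conv1x1(x, kernel):
    """1x1 convolution: x has shape (C_in, H, W), kernel has shape (C_out, C_in)."""
    # Each output pixel is a weighted sum of the input channels at that same pixel.
    return np.einsum("oc,chw->ohw", kernel, x)

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 5, 5))        # hypothetical 3-channel target image
kernel = rng.normal(size=(1, 3))      # placeholder for learned 1x1 weights

weight_image = conv1x1(x, kernel)[0]  # single-channel map, shape (5, 5)
updated = x * weight_image[None]      # step B2: scale every channel image
```

Because a 1 × 1 kernel never looks at neighbouring pixels, the output keeps the spatial size of the input without any padding.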
As a preferred technical scheme of the invention, step C comprises the following steps C1 to C3:
Step C1: from the pixel values of the pixels in the channel image of each preset-type channel of the target image, obtain an N-dimensional array corresponding to the target image, where N is the number of preset-type channels of the target image and each dimension of the array corresponds one-to-one to a channel, then go to step C2;
Step C2: from the N-dimensional array, obtain the weight of each preset-type channel of the target image, then go to step C3;
Step C3: for the channel image of each preset-type channel of the target image, multiply the attribute value of each pixel in the channel image by the weight of the corresponding channel to update that channel image; once all channel images have been updated, the target image is updated, then go to step D.
As a preferred technical scheme of the invention, in step C1 the target image is processed by global pooling over the pixel values of the pixels in the channel image of each preset-type channel to obtain the N-dimensional array corresponding to the target image.
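Steps C1 to C3 might look like the following NumPy sketch. The source says only "global pooling" and does not specify how step C2 turns the N-dimensional array into weights; global average pooling and a sigmoid over a linear map (parameters `w`, `b`) are assumptions chosen here for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(x, w, b):
    """Sketch of steps C1-C3 for x of shape (N, H, W); w is (N, N), b is (N,)."""
    # C1: global average pooling gives one value per channel (the N-dimensional array).
    pooled = x.mean(axis=(1, 2))
    # C2 (assumed form): map the pooled array to one weight in (0, 1) per channel.
    weights = sigmoid(w @ pooled + b)
    # C3: multiply every pixel of each channel image by that channel's weight.
    return x * weights[:, None, None], pooled, weights

# Hypothetical 3-channel image whose channels are constant at 1, 2 and 3.
x = np.stack([np.full((2, 2), v) for v in (1.0, 2.0, 3.0)])
w = np.eye(3)          # placeholder parameters, not from the source
b = np.zeros(3)

updated, pooled, weights = channel_attention(x, w, b)
```

Each channel is scaled by a single scalar, so this step reweights channels against each other and leaves the spatial pattern inside each channel unchanged.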
As a preferred technical scheme of the invention, in step D three convolutional layers of preset types perform fusion processing on the target image and update it.
As a preferred technical scheme of the invention, the three convolutional layers of preset types in step D are, in order:
the first layer uses a 1 × 1 convolution kernel, followed by a batch normalization layer, with ReLU as the activation function; the second layer uses a 3 × 3 convolution kernel, followed by a batch normalization layer, with ReLU as the activation function; the last layer uses a 1 × 1 convolution kernel, followed by a batch normalization layer, with no activation function.
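The three-layer stack described above can be sketched in plain NumPy as follows. Several details are assumptions not stated in the source: zero padding on the 3 × 3 layer (so the output keeps the input size, which step E needs), the channel widths 8 → 4 → 4 → 8, batch normalization without learned scale and shift, and random kernels standing in for trained weights.

```python
import numpy as np

def conv2d_same(x, kernel):
    """Cross-correlation with zero padding; x: (B, C_in, H, W), kernel: (C_out, C_in, k, k)."""
    k = kernel.shape[-1]
    p = k // 2
    xp = np.pad(x, ((0, 0), (0, 0), (p, p), (p, p)))
    batch, _, height, width = x.shape
    out = np.zeros((batch, kernel.shape[0], height, width))
    for i in range(k):
        for j in range(k):
            # Accumulate the contribution of kernel tap (i, j) over all input channels.
            window = xp[:, :, i:i + height, j:j + width]
            out += np.einsum("oc,bchw->bohw", kernel[:, :, i, j], window)
    return out

def batch_norm(x, eps=1e-5):
    # Normalise each channel over the batch and spatial axes (no learned scale/shift).
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def relu(x):
    return np.maximum(x, 0.0)

def fusion_block(x, k1, k2, k3):
    y = relu(batch_norm(conv2d_same(x, k1)))   # 1x1 conv -> BN -> ReLU
    y = relu(batch_norm(conv2d_same(y, k2)))   # 3x3 conv -> BN -> ReLU
    return batch_norm(conv2d_same(y, k3))      # 1x1 conv -> BN, no activation

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 8, 6, 6))              # hypothetical batch of 2 feature maps
k1 = rng.normal(size=(4, 8, 1, 1))
k2 = rng.normal(size=(4, 4, 3, 3))
k3 = rng.normal(size=(8, 4, 1, 1))
y = fusion_block(x, k1, k2, k3)
```

Leaving the last layer without an activation lets the block output negative as well as positive values, which matters when that output is later added back to the original image.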
As a preferred technical scheme of the invention, in step E a skip connection adds the pixel value of each pixel of the original target image to the pixel value of the pixel at the same position in the target image to obtain the final target image, which is the result of the attention processing of the target image.
Advantageous effects
Compared with the prior art, the image attention mechanism processing method based on a convolutional neural network according to the above technical scheme has the following technical effects:
The method adopts a new control logic that introduces an attention mechanism into a residual module of a convolutional neural network. To handle redundant information better, spatial position attention and channel attention are first applied to the target image, which removes the redundant information generated by image superposition. Preset convolutional layers then perform fusion processing on the target image. Finally, a skip connection joins the input and the output of the module to give the processing result for the target image, so that spatial position attention and channel attention are added more accurately.
Drawings
FIG. 1 is a flow chart of an image attention mechanism processing method based on a convolutional neural network according to the present invention.
Detailed Description
Embodiments of the invention are described in further detail below with reference to the accompanying drawing.
The invention provides an image attention mechanism processing method based on a convolutional neural network for performing attention processing on a target image. As shown in FIG. 1, steps A to E below are executed in practical application.
Spatial attention processes the spatial information of an image. It imitates the attention mechanism of the human visual system: more attention is given to the points that should be focused on and less important information is blurred, so that processing concentrates on the important information. In image processing, the spatial attention mechanism assigns each position in the image a weight according to its importance.
Step A: copy the target image as the original target image, then go to step B.
Step B: add spatial attention information to the target image to obtain an updated target image, then go to step C.
In practical application, step B comprises the following steps B1 to B2.
Step B1: according to the preset weight of each preset divided region of the target image, use a 1 × 1 convolutional layer to output a weight image of the same size as the target image, in which the pixel value of each pixel is the preset weight of the preset divided region containing the pixel at the same position in the target image, then go to step B2.
Step B2: for the channel image of each preset-type channel of the target image, multiply the attribute value of each pixel in the channel image by the pixel value of the pixel at the same position in the weight image to update that channel image; once all channel images have been updated, the target image is updated, then go to step C.
Step C: add channel attention information to the target image to obtain an updated target image, then go to step D.
In practical application, step C comprises the following steps C1 to C3.
Step C1: process the target image by global pooling over the pixel values of the pixels in the channel image of each preset-type channel to obtain an N-dimensional array corresponding to the target image, where N is the number of preset-type channels of the target image and each dimension of the array corresponds one-to-one to a channel, then go to step C2.
Step C2: from the N-dimensional array, obtain the weight of each preset-type channel of the target image, then go to step C3.
Step C3: for the channel image of each preset-type channel of the target image, multiply the attribute value of each pixel in the channel image by the weight of the corresponding channel to update that channel image; once all channel images have been updated, the target image is updated, then go to step D.
Through step C, a weight is applied to each channel, which removes redundant information between the channels of the feature map.
Step D: perform fusion processing on the target image using a preset number of convolutional layers of preset types, update the target image, then go to step E.
Specifically, step D uses three convolutional layers of preset types to perform fusion processing on the target image and update it. The three layers are, in order:
the first layer uses a 1 × 1 convolution kernel, followed by a batch normalization layer, with ReLU as the activation function; the second layer uses a 3 × 3 convolution kernel, followed by a batch normalization layer, with ReLU as the activation function; the last layer uses a 1 × 1 convolution kernel, followed by a batch normalization layer, with no activation function.
Step E: using a skip connection, add the pixel value of each pixel of the original target image to the pixel value of the pixel at the same position in the target image to obtain the final target image, which is the result of the attention processing of the target image.
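The order of steps A to E can be summarised in a short sketch. The three stage functions below are deliberately trivial stand-ins (a uniform spatial weight, fixed channel weights, and an identity in place of the convolutional layers); they are hypothetical and serve only to show the data flow and the skip connection of step E.

```python
import numpy as np

def attention_module(x, spatial_fn, channel_fn, fusion_fn):
    """Apply steps A-E to x of shape (C, H, W); the stage functions are supplied by the caller."""
    original = x.copy()        # Step A: keep a copy of the input
    y = spatial_fn(x)          # Step B: spatial attention
    y = channel_fn(y)          # Step C: channel attention
    y = fusion_fn(y)           # Step D: convolutional fusion
    return original + y        # Step E: skip connection, pixel-wise addition

def uniform_spatial(t):
    return t * 0.5                                        # stand-in: weight image of all 0.5

def fixed_channel(t):
    return t * np.array([1.0, 0.0])[:, None, None]        # stand-in: keep channel 0, drop channel 1

def identity_fusion(t):
    return t                                              # stand-in for the three conv layers

x = np.arange(2 * 3 * 3, dtype=float).reshape(2, 3, 3)
out = attention_module(x, uniform_spatial, fixed_channel, identity_fusion)
```

With these stand-ins, channel 1 is fully suppressed by the attention branch yet still reaches the output unchanged through the skip connection, which is the property the residual structure provides.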
The image attention mechanism processing method based on a convolutional neural network designed in this technical scheme adopts a new control logic that introduces an attention mechanism into a residual module of a convolutional neural network. To handle redundant information better, spatial position attention and channel attention are first applied to the target image, which removes the redundant information generated by image superposition. Preset convolutional layers then perform fusion processing on the target image. Finally, a skip connection joins the input and the output of the module to give the processing result for the target image, so that spatial position attention and channel attention are added more accurately.
The embodiments of the invention have been described in detail with reference to the drawing, but the invention is not limited to these embodiments; various changes can be made within the knowledge of those skilled in the art without departing from the gist of the invention.

Claims (8)

1. An image attention mechanism processing method based on a convolutional neural network, used to perform attention processing on a target image, characterized by comprising the following steps:
Step A: copy the target image as the original target image, then go to step B;
Step B: add spatial attention information to the target image to obtain an updated target image, then go to step C;
Step C: add channel attention information to the target image to obtain an updated target image, then go to step D;
Step D: perform fusion processing on the target image using a preset number of convolutional layers of preset types, update the target image, then go to step E;
Step E: add the pixel value of each pixel of the original target image to the pixel value of the pixel at the same position in the target image to obtain the final target image, which is the result of the attention processing of the target image.
2. The image attention mechanism processing method based on a convolutional neural network as claimed in claim 1, wherein step B comprises the following steps B1 to B2:
Step B1: according to the preset weight of each preset divided region of the target image, obtain a weight image of the same size as the target image, in which the pixel value of each pixel is the preset weight of the preset divided region containing the pixel at the same position in the target image, then go to step B2;
Step B2: for the channel image of each preset-type channel of the target image, multiply the attribute value of each pixel in the channel image by the pixel value of the pixel at the same position in the weight image to update that channel image; once all channel images have been updated, the target image is updated, then go to step C.
3. The image attention mechanism processing method based on a convolutional neural network as claimed in claim 2, wherein in step B1 a 1 × 1 convolutional layer outputs the weight image, which has the same size as the target image and in which the pixel value of each pixel is the preset weight of the preset divided region containing the pixel at the same position in the target image.
4. The image attention mechanism processing method based on a convolutional neural network as claimed in any one of claims 1 to 3, wherein step C comprises the following steps C1 to C3:
Step C1: from the pixel values of the pixels in the channel image of each preset-type channel of the target image, obtain an N-dimensional array corresponding to the target image, where N is the number of preset-type channels of the target image and each dimension of the array corresponds one-to-one to a channel, then go to step C2;
Step C2: from the N-dimensional array, obtain the weight of each preset-type channel of the target image, then go to step C3;
Step C3: for the channel image of each preset-type channel of the target image, multiply the attribute value of each pixel in the channel image by the weight of the corresponding channel to update that channel image; once all channel images have been updated, the target image is updated, then go to step D.
5. The image attention mechanism processing method based on a convolutional neural network as claimed in claim 4, wherein in step C1 the target image is processed by global pooling over the pixel values of the pixels in the channel image of each preset-type channel to obtain the N-dimensional array corresponding to the target image.
6. The image attention mechanism processing method based on a convolutional neural network as claimed in claim 1, wherein in step D three convolutional layers of preset types perform fusion processing on the target image and update it.
7. The image attention mechanism processing method based on a convolutional neural network as claimed in claim 6, wherein the three convolutional layers of preset types in step D are, in order:
the first layer uses a 1 × 1 convolution kernel, followed by a batch normalization layer, with ReLU as the activation function; the second layer uses a 3 × 3 convolution kernel, followed by a batch normalization layer, with ReLU as the activation function; the last layer uses a 1 × 1 convolution kernel, followed by a batch normalization layer, with no activation function.
8. The image attention mechanism processing method based on a convolutional neural network as claimed in claim 1, wherein in step E a skip connection adds the pixel value of each pixel of the original target image to the pixel value of the pixel at the same position in the target image to obtain the final target image, which is the result of the attention processing of the target image.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910896954.8A CN110807752B (en) 2019-09-23 2019-09-23 Image attention mechanism processing method based on convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910896954.8A CN110807752B (en) 2019-09-23 2019-09-23 Image attention mechanism processing method based on convolutional neural network

Publications (2)

Publication Number Publication Date
CN110807752A true CN110807752A (en) 2020-02-18
CN110807752B CN110807752B (en) 2022-07-08

Family

ID=69487633

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910896954.8A Active CN110807752B (en) 2019-09-23 2019-09-23 Image attention mechanism processing method based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN110807752B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113139627A (en) * 2021-06-22 2021-07-20 北京小白世纪网络科技有限公司 Mediastinal lump identification method, system and device
CN113288162A (en) * 2021-06-03 2021-08-24 北京航空航天大学 Short-term electrocardiosignal atrial fibrillation automatic detection system based on self-adaptive attention mechanism

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109753903A (en) * 2019-02-27 2019-05-14 北航(四川)西部国际创新港科技有限公司 A kind of unmanned plane detection method based on deep learning
CN110046598A (en) * 2019-04-23 2019-07-23 中南大学 The multiscale space of plug and play and channel pay attention to remote sensing image object detection method
CN110188611A (en) * 2019-04-26 2019-08-30 华中科技大学 A kind of pedestrian recognition methods and system again introducing visual attention mechanism

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109753903A (en) * 2019-02-27 2019-05-14 北航(四川)西部国际创新港科技有限公司 A kind of unmanned plane detection method based on deep learning
CN110046598A (en) * 2019-04-23 2019-07-23 中南大学 The multiscale space of plug and play and channel pay attention to remote sensing image object detection method
CN110188611A (en) * 2019-04-26 2019-08-30 华中科技大学 A kind of pedestrian recognition methods and system again introducing visual attention mechanism

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113288162A (en) * 2021-06-03 2021-08-24 北京航空航天大学 Short-term electrocardiosignal atrial fibrillation automatic detection system based on self-adaptive attention mechanism
CN113288162B (en) * 2021-06-03 2022-06-28 北京航空航天大学 Short-term electrocardiosignal atrial fibrillation automatic detection system based on self-adaptive attention mechanism
CN113139627A (en) * 2021-06-22 2021-07-20 北京小白世纪网络科技有限公司 Mediastinal lump identification method, system and device
CN113139627B (en) * 2021-06-22 2021-11-05 北京小白世纪网络科技有限公司 Mediastinal lump identification method, system and device

Also Published As

Publication number Publication date
CN110807752B (en) 2022-07-08

Similar Documents

Publication Publication Date Title
CN107767384B (en) Image semantic segmentation method based on countermeasure training
CN109360171B (en) Real-time deblurring method for video image based on neural network
Liao et al. DR-GAN: Automatic radial distortion rectification using conditional GAN in real-time
CN109726627B (en) Neural network model training and universal ground wire detection method
CN111489394B (en) Object posture estimation model training method, system, device and medium
Tang et al. Single image dehazing via lightweight multi-scale networks
CN109919209A (en) A kind of domain-adaptive deep learning method and readable storage medium storing program for executing
CN110807752B (en) Image attention mechanism processing method based on convolutional neural network
CN113822284B (en) RGBD image semantic segmentation method based on boundary attention
CN112084934B (en) Behavior recognition method based on bone data double-channel depth separable convolution
CN109903323B (en) Training method and device for transparent object recognition, storage medium and terminal
CN112884648A (en) Method and system for multi-class blurred image super-resolution reconstruction
CN113658091A (en) Image evaluation method, storage medium and terminal equipment
CN107729885B (en) Face enhancement method based on multiple residual error learning
CN111126561B (en) Image processing method based on multi-path parallel convolution neural network
CN111667401B (en) Multi-level gradient image style migration method and system
CN111340088A (en) Image feature training method, model, device and computer storage medium
CN116703768A (en) Training method, device, medium and equipment for blind spot denoising network model
US11531890B2 (en) Padding method for a convolutional neural network
CN113012072A (en) Image motion deblurring method based on attention network
CN114549958A (en) Night and disguised target detection method based on context information perception mechanism
CN114022458A (en) Skeleton detection method and device, electronic equipment and computer readable storage medium
CN113743411A (en) Unsupervised video consistent part segmentation method based on deep convolutional network
JPWO2021015232A5 (en)
CN111340089A (en) Image feature learning method, model, apparatus and computer storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 211100 floor 5, block a, China Merchants high speed rail Plaza project, No. 9, Jiangnan Road, Jiangning District, Nanjing, Jiangsu (South Station area)

Applicant after: JIANGSU AIJIA HOUSEHOLD PRODUCTS Co.,Ltd.

Address before: 211100 No. 18 Zhilan Road, Science Park, Jiangning District, Nanjing City, Jiangsu Province

Applicant before: JIANGSU AIJIA HOUSEHOLD PRODUCTS Co.,Ltd.

GR01 Patent grant