CN111275694B - Attention mechanism guided progressive human body division analysis system and method - Google Patents


Info

Publication number
CN111275694B
Authority
CN
China
Prior art keywords: module, convolutional layer, human body, output, input
Prior art date
Legal status (assumed; not a legal conclusion): Expired - Fee Related
Application number
CN202010081219.4A
Other languages
Chinese (zh)
Other versions
CN111275694A (en)
Inventor
邵杰
黄茜
曹坤涛
徐行
Current Assignee (the listed assignees may be inaccurate)
Research Institute Of Yibin University Of Electronic Science And Technology
University of Electronic Science and Technology of China
Original Assignee
Research Institute Of Yibin University Of Electronic Science And Technology
University of Electronic Science and Technology of China
Priority date (assumed; not a legal conclusion)
Filing date
Publication date
Application filed by Research Institute Of Yibin University Of Electronic Science And Technology and University of Electronic Science and Technology of China
Priority to CN202010081219.4A
Publication of CN111275694A
Application granted
Publication of CN111275694B

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/0002 - Inspection of images, e.g. flaw detection
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G06N 3/084 - Backpropagation, e.g. using gradient descent
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/10 - Segmentation; Edge detection
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 - Special algorithmic details
    • G06T 2207/20081 - Training; Learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an attention-mechanism-guided progressive-partition human body parsing system and method. The proposed system mainly explores the enhancement that saliency detection brings to human body parsing and the effectiveness of the attention mechanism for this task. In the network structure, a feature extraction module is constructed to extract feature information effectively and fuse multi-scale features, enhancing the parsing result; an adaptive attention module is designed to apply position attention weighting to the features, providing an effective idea for fusing features of different levels; finally, saliency detection and human body parsing are integrated into an end-to-end network in a bottom-up manner, and the modules are applied to all branches, yielding a unified and effective structure. Its performance exceeds that of existing known methods, showing state-of-the-art human body parsing results.

Description

Attention mechanism guided progressive human body division analysis system and method
Technical Field
The invention belongs to the field of image processing, and particularly relates to an attention-mechanism-guided progressive-partition human body parsing system and method.
Background
Understanding humans in images is a crucial but challenging topic in computer vision, and human body parsing is one of the tasks toward this goal. Human body parsing is a dense prediction task that aims to accurately locate the human body and further divide it into multiple semantic regions at the pixel level. In recent years, human body parsing has been widely applied to other human-centric tasks, such as person re-identification, pose estimation, and human image generation.
In recent work, researchers have proposed various methods to improve the expressiveness of human body parsing networks. One typical approach is to utilize additional domain information provided by other related tasks. For example, some works (Fangting Xia, Peng Wang, Xianjie Chen and Alan L. Yuille. Joint multi-person pose estimation and semantic part segmentation [C]. CVPR, 2017: 6080-6089; and Xuecheng Nie, Jiashi Feng and Shuicheng Yan. Mutual learning to adapt for joint human parsing and pose estimation [C]. ECCV, 2018: 519-534) investigated how pose structure can guide human body parsing, by adding joint-structure losses or by dynamically updating model constraints learned from the pose estimation task. Other works (Ke Gong, Xiaodan Liang, Yicheng Li, Yimin Chen, Ming Yang and Liang Lin. Instance-level human parsing via part grouping network [C]. ECCV, 2018: 805-822; and Tao Ruan, Ting Liu, Zilong Huang, Yunchao Wei, Shikui Wei and Yao Zhao. Devil in the details: Towards accurate single and multiple human parsing [C]. AAAI, 2019: 4814-) exploited additional structural cues such as edge information. Although these information fusions provide a satisfactory improvement, training multiple tasks in the same network may introduce incompatibilities due to inconsistent optimization objectives, which somewhat weakens the predictive power of the overall structure.
In previous work (Ke Gong, Xiaodan Liang, Dongyu Zhang, Xiaohui Shen and Liang Lin. Look into person: Self-supervised structure-sensitive learning and a new benchmark for human parsing [C]. CVPR, 2017: 6757-6765; and Xiaodan Liang, Ke Gong, Xiaohui Shen and Liang Lin. Look into person: Joint body parsing & pose estimation network and a new benchmark [J]. TPAMI, 2019, 41(4): 871-885), approaches that applied the attention mechanism did not explore an adaptive attention module for the human parsing task, but simply attached generic attention modules following human body semantics, and therefore did not refine the parsing result well.
Disclosure of Invention
Aiming at the above defects in the prior art, the attention-mechanism-guided progressive-partition human body parsing system and method provided by the invention solve the problem that the prior art cannot accurately predict human part parsing and saliency.
In order to achieve the above purpose, the invention adopts the following technical scheme:
An attention-mechanism-guided progressive-partition human body parsing system comprises: a residual neural network ResNet-101, a saliency detection subsystem and a human body parsing subsystem;
the residual neural network ResNet-101 is the backbone network, used to process a human body image to obtain shallow low-level feature maps and deep high-level feature maps; its output blocks Block1 and Block2 are communicatively connected with the saliency detection subsystem and input the shallow low-level feature maps into it; its output blocks Block3 and Block4 are communicatively connected with the human body parsing subsystem and input the deep high-level feature maps into it;
the saliency detection subsystem performs saliency prediction on the shallow low-level feature maps to obtain a binary saliency prediction map;
the human body parsing subsystem performs parsing prediction on the deep high-level feature maps to obtain a human body parsing prediction map.
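To make the data flow concrete, the following minimal sketch traces which backbone blocks feed which subsystem. The channel counts follow the standard ResNet-101 stage widths (256/512/1024/2048); the patent does not state them explicitly, so they are an assumption here, as is the helper name `route_features`.

```python
# Data-flow sketch of the two-branch system. Channel counts are the
# conventional ResNet-101 stage widths (an assumption, not stated in
# the patent text).

RESNET101_BLOCK_CHANNELS = {"Block1": 256, "Block2": 512,
                            "Block3": 1024, "Block4": 2048}

def route_features(blocks):
    """Split backbone outputs between the two subsystems."""
    saliency_inputs = {k: blocks[k] for k in ("Block1", "Block2")}  # shallow
    parsing_inputs = {k: blocks[k] for k in ("Block3", "Block4")}   # deep
    return saliency_inputs, parsing_inputs

sal, par = route_features(RESNET101_BLOCK_CHANNELS)
```

The shallow features carry fine spatial detail suited to the binary saliency branch, while the deep features carry the semantics needed for part-level parsing.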
Further, the saliency detection subsystem comprises: convolutional layer Conv1, convolutional layer Conv2, convolutional layer Conv3, convolutional layer Conv4, adaptive attention module GAM1, up-sampling module 1 and up-sampling module 2;
the convolutional layer Conv1 is a 1 × 1 convolutional layer that reduces the dimensionality of the shallow low-level feature map from output block Block1 of the residual neural network ResNet-101; its input is communicatively connected with Block1 and its output with input A of the adaptive attention module GAM1;
the convolutional layer Conv2 is a 1 × 1 convolutional layer that reduces the dimensionality of the shallow low-level feature map from output block Block2 of the residual neural network ResNet-101; its input is communicatively connected with Block2 and its output with the input of up-sampling module 1;
the up-sampling module 1 up-samples the dimension-reduced feature map from Block2, and its output is communicatively connected with input B of the adaptive attention module GAM1;
the adaptive attention module GAM1 extracts attention features; its output is communicatively connected with convolutional layer Conv3, and is also communicatively connected with the human body parsing subsystem for feature enhancement;
the convolutional layer Conv3, the convolutional layer Conv4 and the up-sampling module 2 process the attention features extracted by GAM1 to obtain the binary saliency prediction map; Conv3 is a 3 × 3 convolutional layer whose output is communicatively connected with the input of Conv4; Conv4 is a 1 × 1 convolutional layer whose output is communicatively connected with the input of up-sampling module 2; the output of up-sampling module 2 serves as the result output of the saliency detection subsystem and outputs the binary saliency prediction map obtained by the system.
Further, the human body parsing subsystem comprises: feature extraction module FEM1, feature extraction module FEM2, adaptive attention module GAM2, up-sampling module 3, up-sampling module 4, addition module 1, convolutional layer Conv5 and convolutional layer Conv6;
the feature extraction module FEM1 performs multi-scale feature extraction on the deep high-level feature map from output block Block3 of the residual neural network ResNet-101 to obtain multi-scale context information; its input is communicatively connected with Block3 and its output with input A of the adaptive attention module GAM2;
the feature extraction module FEM2 performs multi-scale feature extraction on the deep high-level feature map from output block Block4 of the residual neural network ResNet-101 to obtain multi-scale context information; its input is communicatively connected with Block4 and its output with input B of the adaptive attention module GAM2;
the adaptive attention module GAM2 processes the multi-scale context information to obtain effective weighted features; its output is communicatively connected with the input of up-sampling module 3;
the up-sampling module 3 up-samples the effective weighted features; its output is communicatively connected with input A of addition module 1;
the addition module 1 adds, element-wise, the attention features extracted by GAM1 and the effective weighted features obtained by GAM2, thereby fusing the feature maps provided by GAM1 and GAM2, highlighting the target region and improving the compactness among classes; its input B is communicatively connected with the output of GAM1 and its output with the input of convolutional layer Conv5;
the convolutional layer Conv5, the convolutional layer Conv6 and the up-sampling module 4 process the fused attention features produced by addition module 1 to obtain the human body parsing prediction map; Conv5 is a 3 × 3 convolutional layer whose output is communicatively connected with the input of Conv6; Conv6 is a 1 × 1 convolutional layer whose output is communicatively connected with the input of up-sampling module 4; the output of up-sampling module 4 serves as the result output of the human body parsing subsystem and outputs the human body parsing prediction map obtained by the system.
Further, the feature extraction module FEM1 and the feature extraction module FEM2 each comprise: convolutional layer Conv11, convolutional layer Conv12, convolutional layer Conv13, convolutional layer Conv14, convolutional layer Conv15, convolutional layer Conv16, convolutional layer Conv17 and addition module 11;
the input of convolutional layer Conv11 is communicatively connected with the inputs of convolutional layers Conv12, Conv13 and Conv14, and together they serve as the input of feature extraction module FEM1 and of feature extraction module FEM2; the output of Conv11 is communicatively connected with the input of Conv15; the output of Conv12 with the input of Conv16; the output of Conv13 with the input of Conv17; the outputs of Conv14, Conv15, Conv16 and Conv17 are communicatively connected with inputs A, B, C and D of addition module 11 respectively; the output of addition module 11 serves as the output of feature extraction module FEM1 and of feature extraction module FEM2;
the convolutional layer Conv11 is a 3 × 3 dilated (atrous) convolutional layer with dilation rate 3;
the convolutional layer Conv12 is a 3 × 3 dilated convolutional layer with dilation rate 8;
the convolutional layer Conv13 is a 3 × 3 dilated convolutional layer with dilation rate 12;
the convolutional layers Conv14, Conv15, Conv16 and Conv17 are all 1 × 1 convolutional layers.
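The four FEM branches are summed element-wise, so their outputs must share a spatial size. The sketch below checks this with the standard convolution output-size formula; padding equal to the dilation rate is an assumption (the patent does not state padding), chosen because it leaves a 3 × 3 dilated convolution size-preserving.

```python
# Geometry sketch of the FEM's parallel branches (hypothetical helpers).
# Padding = dilation rate is assumed: for kernel 3 it preserves spatial
# size, which the element-wise sum in addition module 11 requires.

def conv2d_out_size(size, kernel=3, stride=1, padding=0, dilation=1):
    """Standard convolution output-size formula (one spatial dimension)."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1

def fem_branch_sizes(h):
    """Output heights of the four parallel FEM branches for input height h."""
    sizes = []
    for rate in (3, 8, 12):                      # Conv11 / Conv12 / Conv13
        s = conv2d_out_size(h, kernel=3, padding=rate, dilation=rate)
        s = conv2d_out_size(s, kernel=1)         # following 1x1 conv
        sizes.append(s)
    sizes.append(conv2d_out_size(h, kernel=1))   # Conv14 (1x1 shortcut)
    return sizes

print(fem_branch_sizes(64))   # [64, 64, 64, 64]
```

All four branches agree for any input size, so the dilation rates 3, 8 and 12 enlarge the receptive field without changing the feature-map resolution.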
Further, the adaptive attention module GAM1 and the adaptive attention module GAM2 each include: convolutional layer Conv21, convolutional layer Conv22, global mean pooling layer 21, global mean pooling layer 22, addition module 21, Softmax layer, and multiplication module 21;
the convolutional layer Conv21 is a 1 × 1 convolutional layer with inputs as input a of adaptive attention module GAM1 and input a of adaptive attention module GAM2, and outputs communicatively connected to inputs of the global mean pooling layer 21;
the convolutional layer Conv22 is a 1 × 1 convolutional layer with inputs as input B of adaptive attention module GAM1 and input B of adaptive attention module GAM2, and outputs communicatively connected to inputs of global mean pooling layer 22;
the output end of the global mean pooling layer 21 is communicatively connected with input A of the addition module 21, and the output end of the global mean pooling layer 22 is communicatively connected with input B of the addition module 21;
an output of the summing module 21 is communicatively coupled to an input of a Softmax layer;
an output of the Softmax layer is communicatively connected to an input of a multiplication module 21;
the output of the multiplication module 21 serves as the output of the adaptive attention module GAM1 and the output of the adaptive attention module GAM 2.
The adaptive attention module focuses on selectively extracting position information and fusing weighted attention features of different levels to achieve mutual information fusion. Let an input feature of the adaptive attention module be $X^i \in \mathbb{R}^{C \times H \times W}$, where $C$, $H$ and $W$ denote the number of feature channels, the height and the width respectively, and $i$ denotes the $i$-th operation. The inputs of the attention module are two feature maps $A$ and $B$ of different levels, denoted $X_A^i \in \mathbb{R}^{C \times H \times W}$ and $X_B^i \in \mathbb{R}^{C \times H \times W}$. After the features $X_A^i$ and $X_B^i$ pass through the convolutional layer Conv21 and the convolutional layer Conv22 respectively, the number of channels is reduced to $C/2$;
the newly acquired features $\tilde{X}_A^i \in \mathbb{R}^{C/2 \times H \times W}$ and $\tilde{X}_B^i \in \mathbb{R}^{C/2 \times H \times W}$ are further reduced along the channel dimension by the global mean pooling layer 21 and the global mean pooling layer 22; this processing flow can be expressed as
$$M_A^i = \frac{2}{C} \sum_{c=1}^{C/2} \tilde{X}_{A,c}^i \quad \text{and} \quad M_B^i = \frac{2}{C} \sum_{c=1}^{C/2} \tilde{X}_{B,c}^i .$$
After the two feature maps $A$ and $B$ of different levels are processed as described above, the fusion is completed by element-wise addition in the addition module 21, which keeps more residual attention weight information. The sum is then normalized so that the weight values lie in $(0, 1)$; this is implemented by the Softmax layer, as in
$$W^i = \mathrm{Softmax}\left(M_A^i + M_B^i\right) .$$
Finally, the original features $X_A^i$ and $X_B^i$ are concatenated as $S \in \mathbb{R}^{2C \times H \times W}$ and multiplied element-wise with the weight obtained by the previous operations to obtain the final weighted feature map, as in
$$Y^i = S \odot W^i .$$
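The processing flow described above can be sketched numerically as follows. This is a sketch under stated assumptions, not the patent's implementation: the 1 × 1 convolutions are modelled as random channel-mixing matrices, global mean pooling collapses the channel axis, and Softmax normalizes over spatial positions (consistent with the stated positional attention and weights in (0, 1)).

```python
# Numerical sketch of the adaptive attention module (GAM) forward pass.
# Random matrices stand in for the learned 1x1 convolutions Conv21/Conv22.
import numpy as np

rng = np.random.default_rng(0)

def gam_forward(x_a, x_b):
    """x_a, x_b: feature maps of shape (C, H, W) from two levels."""
    c, h, w = x_a.shape
    # Conv21 / Conv22: reduce channels C -> C/2 (random weights here)
    w21 = rng.standard_normal((c // 2, c))
    w22 = rng.standard_normal((c // 2, c))
    ra = np.einsum('oc,chw->ohw', w21, x_a)
    rb = np.einsum('oc,chw->ohw', w22, x_b)
    # global mean pooling layers 21 / 22: collapse the channel axis
    ma = ra.mean(axis=0)                    # (H, W)
    mb = rb.mean(axis=0)
    # addition module 21, then Softmax over all spatial positions
    logits = (ma + mb).reshape(-1)
    weights = np.exp(logits - logits.max())
    weights = (weights / weights.sum()).reshape(h, w)
    # concatenate originals to S in R^{2C x H x W}, weight element-wise
    s = np.concatenate([x_a, x_b], axis=0)
    return s * weights                      # broadcast over channels

out = gam_forward(rng.standard_normal((8, 4, 4)),
                  rng.standard_normal((8, 4, 4)))
print(out.shape)   # (16, 4, 4)
```

Note the doubled channel count of the output: the concatenated features keep both levels intact while the shared positional weight map re-scales every channel.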
An attention-mechanism-guided progressive-partition human body parsing method comprises the following steps:
S1, obtaining human body images with known corresponding binary saliency prediction maps and human body parsing prediction maps from a big-data platform to form a training data set and a test data set;
S2, training the attention-mechanism-guided progressive-partition human body parsing system on the training data set to obtain a trained system;
S3, verifying the trained system on the test data set to obtain a verified system;
and S4, predicting on a human body image with the verified system to obtain the binary saliency prediction map and the human body parsing prediction map corresponding to that image.
Further, the step S2 comprises the following steps:
S21, preprocessing the training data set;
S22, setting the initial parameters and training rules of the attention-mechanism-guided progressive-partition human body parsing system;
and S23, iterating the parameters of each module in the system on the preprocessed training data set by back-propagation.
Further, the step S21 comprises: randomly scaling the data in the training data set by a factor of 0.5 to 1.5, and applying random cropping and horizontal flipping to the data.
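The S21 preprocessing can be sketched as below, applied jointly to an image and its label map so that annotations stay aligned. Nearest-neighbour resizing and a fixed crop size are assumptions; the patent does not specify the interpolation method or the crop resolution.

```python
# Sketch of the S21 augmentation: random scaling by 0.5-1.5, random crop,
# and horizontal flip, applied identically to image and label map.
import numpy as np

rng = np.random.default_rng(0)

def resize_nearest(arr, new_h, new_w):
    """Nearest-neighbour resize of the first two (spatial) axes."""
    h, w = arr.shape[:2]
    ys = np.arange(new_h) * h // new_h
    xs = np.arange(new_w) * w // new_w
    return arr[ys][:, xs]

def augment(image, label, crop=64):
    """image: (H, W, 3) array; label: (H, W) array of class ids."""
    scale = rng.uniform(0.5, 1.5)
    h, w = label.shape
    nh, nw = max(crop, int(h * scale)), max(crop, int(w * scale))
    image = resize_nearest(image, nh, nw)
    label = resize_nearest(label, nh, nw)
    top = rng.integers(0, nh - crop + 1)        # random crop origin
    left = rng.integers(0, nw - crop + 1)
    image = image[top:top + crop, left:left + crop]
    label = label[top:top + crop, left:left + crop]
    if rng.random() < 0.5:                      # horizontal flip
        image, label = image[:, ::-1], label[:, ::-1]
    return image, label

img, lbl = augment(rng.random((100, 80, 3)), rng.integers(0, 20, (100, 80)))
print(img.shape, lbl.shape)   # (64, 64, 3) (64, 64)
```

Using nearest-neighbour interpolation for the label map is deliberate: any smoothing interpolation would invent invalid class ids at part boundaries.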
Further, the initial parameters and training rules in step S22 include the following expressions:
lr = base_lr × (1 − iter / max_iter)^power (1)
L_APPNet = L_parsing + α · L_saliency (2)
α = 1 (3)
power = 0.9 (4)
base_lr = 0.007 (5)
where equation (1) is the learning-rate iteration rule: lr is the current learning rate, base_lr is the initial learning rate, iter is the current iteration number, max_iter is the total number of iterations, and power is an exponent parameter. Equation (2) is the loss function of the training rule: L_parsing is the cross-entropy loss between the parsing prediction map and the parsing annotation map, L_saliency is the cross-entropy loss between the saliency prediction map and the ground-truth annotation map, and α is a proportion parameter used to balance the parsing loss and the saliency loss.
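The training rule above amounts to the common "poly" learning-rate schedule plus a weighted two-term loss; a minimal sketch of the scheduling arithmetic (the cross-entropy terms are represented abstractly):

```python
# Sketch of the training rule in equations (1)-(5).

BASE_LR = 0.007   # equation (5)
POWER = 0.9       # equation (4)
ALPHA = 1.0       # equation (3)

def poly_lr(iteration, max_iter, base_lr=BASE_LR, power=POWER):
    """Equation (1): lr = base_lr * (1 - iter / max_iter) ** power."""
    return base_lr * (1 - iteration / max_iter) ** power

def total_loss(l_parsing, l_saliency, alpha=ALPHA):
    """Equation (2): L_APPNet = L_parsing + alpha * L_saliency."""
    return l_parsing + alpha * l_saliency

print(poly_lr(0, 10000))        # 0.007 (starts at base_lr)
print(poly_lr(10000, 10000))    # 0.0 (decays to zero at max_iter)
```

With power = 0.9 the rate decays almost linearly, which keeps meaningful gradient updates throughout training while still annealing to zero.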
The beneficial effects of the invention are: the proposed system mainly explores the enhancement that saliency detection brings to human body parsing and the effectiveness of the attention mechanism for this task. In the network structure, a feature extraction module is constructed to extract feature information effectively and fuse multi-scale features, enhancing the parsing result; an adaptive attention module is designed to apply position attention weighting to the features, providing an effective idea for fusing features of different levels; finally, saliency detection and human body parsing are integrated into an end-to-end network in a bottom-up manner, and the modules are applied to all branches, yielding a unified and effective structure. Its performance exceeds that of existing known methods, showing state-of-the-art human body parsing results.
Drawings
FIG. 1 is a structural diagram of the attention-mechanism-guided progressive-partition human body parsing system;
FIG. 2 is a structural diagram of the feature extraction module;
FIG. 3 is a structural diagram of the adaptive attention module;
FIG. 4 is a flow chart of the attention-mechanism-guided progressive-partition human body parsing method;
FIG. 5 shows the experimental results.
Detailed Description
The following description of the embodiments of the present invention is provided to facilitate understanding by those skilled in the art, but it should be understood that the invention is not limited to the scope of these embodiments; to those skilled in the art, various changes are apparent as long as they remain within the spirit and scope of the invention as defined by the appended claims, and all inventions making use of the inventive concept are protected.
As shown in fig. 1, an attention-mechanism-guided progressive-partition human body parsing system comprises: a residual neural network ResNet-101, a saliency detection subsystem and a human body parsing subsystem.
The residual neural network ResNet-101 is the backbone network, used to process a human body image to obtain shallow low-level feature maps and deep high-level feature maps; its output blocks Block1 and Block2 are communicatively connected with the saliency detection subsystem and input the shallow low-level feature maps into it; its output blocks Block3 and Block4 are communicatively connected with the human body parsing subsystem and input the deep high-level feature maps into it.
The saliency detection subsystem performs saliency prediction on the shallow low-level feature maps to obtain a binary saliency prediction map.
The human body parsing subsystem performs parsing prediction on the deep high-level feature maps to obtain a human body parsing prediction map.
The saliency detection subsystem comprises: convolutional layer Conv1, convolutional layer Conv2, convolutional layer Conv3, convolutional layer Conv4, adaptive attention module GAM1, up-sampling module 1 and up-sampling module 2.
The convolutional layer Conv1 is a 1 × 1 convolutional layer that reduces the dimensionality of the shallow low-level feature map from output block Block1 of the residual neural network ResNet-101; its input is communicatively connected with Block1 and its output with input A of the adaptive attention module GAM1.
The convolutional layer Conv2 is a 1 × 1 convolutional layer that reduces the dimensionality of the shallow low-level feature map from output block Block2 of the residual neural network ResNet-101; its input is communicatively connected with Block2 and its output with the input of up-sampling module 1.
The up-sampling module 1 up-samples the dimension-reduced feature map from Block2, and its output is communicatively connected with input B of the adaptive attention module GAM1.
The adaptive attention module GAM1 extracts attention features; its output is communicatively connected with convolutional layer Conv3, and is also communicatively connected with the human body parsing subsystem for feature enhancement.
The convolutional layer Conv3, the convolutional layer Conv4 and the up-sampling module 2 process the attention features extracted by GAM1 to obtain the binary saliency prediction map; Conv3 is a 3 × 3 convolutional layer whose output is communicatively connected with the input of Conv4; Conv4 is a 1 × 1 convolutional layer whose output is communicatively connected with the input of up-sampling module 2; the output of up-sampling module 2 serves as the result output of the saliency detection subsystem and outputs the binary saliency prediction map obtained by the system.
The human body parsing subsystem comprises: feature extraction module FEM1, feature extraction module FEM2, adaptive attention module GAM2, up-sampling module 3, up-sampling module 4, addition module 1, convolutional layer Conv5 and convolutional layer Conv6.
The feature extraction module FEM1 performs multi-scale feature extraction on the deep high-level feature map from output block Block3 of the residual neural network ResNet-101 to obtain multi-scale context information; its input is communicatively connected with Block3 and its output with input A of the adaptive attention module GAM2.
The feature extraction module FEM2 performs multi-scale feature extraction on the deep high-level feature map from output block Block4 of the residual neural network ResNet-101 to obtain multi-scale context information; its input is communicatively connected with Block4 and its output with input B of the adaptive attention module GAM2.
The adaptive attention module GAM2 processes the multi-scale context information to obtain effective weighted features; its output is communicatively connected with the input of up-sampling module 3.
The up-sampling module 3 up-samples the effective weighted features; its output is communicatively connected with input A of addition module 1.
The addition module 1 adds, element-wise, the attention features extracted by GAM1 and the effective weighted features obtained by GAM2, thereby fusing the feature maps provided by GAM1 and GAM2, highlighting the target region and improving the compactness among classes; its input B is communicatively connected with the output of GAM1 and its output with the input of convolutional layer Conv5.
The convolutional layer Conv5, the convolutional layer Conv6 and the up-sampling module 4 process the fused attention features produced by addition module 1 to obtain the human body parsing prediction map; Conv5 is a 3 × 3 convolutional layer whose output is communicatively connected with the input of Conv6; Conv6 is a 1 × 1 convolutional layer whose output is communicatively connected with the input of up-sampling module 4; the output of up-sampling module 4 serves as the result output of the human body parsing subsystem and outputs the human body parsing prediction map obtained by the system.
As shown in fig. 2: the feature extraction module FEM1 and the feature extraction module FEM2 each include: convolutional layer Conv11, convolutional layer Conv12, convolutional layer Conv13, convolutional layer Conv14, convolutional layer Conv15, convolutional layer Conv16, convolutional layer Conv17, and addition module 11;
the input end of the convolutional layer Conv11 is communicatively connected to the input end of convolutional layer Conv12, the input end of convolutional layer Conv13 and the input end of convolutional layer Conv14, and serves as the input end of the feature extraction module FEM1 and the input end of the feature extraction module FEM 2; an output of the convolutional layer Conv11 is communicatively connected with an input of convolutional layer Conv 15; an output of the convolutional layer Conv12 is communicatively connected with an input of convolutional layer Conv 16; an output of the convolutional layer Conv13 is communicatively connected with an input of convolutional layer Conv 17; an output of the convolutional layer Conv14 is communicatively connected to an input A of the summing module 11, an output of the convolutional layer Conv15 is communicatively connected to an input B of the summing module 11, an output of the convolutional layer Conv16 is communicatively connected to an input C of the summing module 11, and an output of the convolutional layer Conv17 is communicatively connected to an input D of the summing module 11; the output end of the addition module 11 serves as the output end of the feature extraction module FEM1 and the output end of the feature extraction module FEM 2;
the convolutional layer Conv11 is a 3 × 3 dilated (atrous) convolutional layer with a dilation rate of 3;
the convolutional layer Conv12 is a 3 × 3 dilated convolutional layer with a dilation rate of 8;
the convolutional layer Conv13 is a 3 × 3 dilated convolutional layer with a dilation rate of 12;
the convolutional layers Conv14, Conv15, Conv16 and Conv17 are all 1 × 1 convolutional layers.
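For orientation (an illustrative aside, not part of the patent text): a k × k convolution with dilation rate r covers k + (k − 1)(r − 1) input positions per axis, so the three dilated branches above gather context at three different scales from the same feature map. This can be checked with a short Python sketch:

```python
# Effective kernel extent of a dilated (atrous) convolution:
# a k x k kernel with dilation rate r samples positions spaced r apart,
# so it covers k + (k - 1) * (r - 1) input positions per axis.
def effective_extent(kernel_size: int, rate: int) -> int:
    return kernel_size + (kernel_size - 1) * (rate - 1)

# The three dilated 3x3 branches of the FEM use rates 3, 8 and 12:
extents = {rate: effective_extent(3, rate) for rate in (3, 8, 12)}
print(extents)  # {3: 7, 8: 17, 12: 25}
```

The rates 3, 8 and 12 thus give receptive extents of 7, 17 and 25 input positions respectively, which is how the module obtains multi-dimensional context information.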
As shown in fig. 3: the adaptive attention module GAM1 and the adaptive attention module GAM2 each include: convolutional layer Conv21, convolutional layer Conv22, global mean pooling layer 21, global mean pooling layer 22, addition module 21, Softmax layer, and multiplication module 21;
the convolutional layer Conv21 is a 1 × 1 convolutional layer with inputs as input a of adaptive attention module GAM1 and input a of adaptive attention module GAM2, and outputs communicatively connected to inputs of the global mean pooling layer 21;
the convolutional layer Conv22 is a 1 × 1 convolutional layer with inputs as input B of adaptive attention module GAM1 and input B of adaptive attention module GAM2, and outputs communicatively connected to inputs of global mean pooling layer 22;
the output end of the global mean pooling layer 21 is in communication connection with the input end A of the adding module 21, and the output end of the global mean pooling layer 22 is in communication connection with the input end B of the adding module 21;
an output of the summing module 21 is communicatively coupled to an input of a Softmax layer;
an output of the Softmax layer is communicatively connected to an input of a multiplication module 21;
the output of the multiplication module 21 serves as the output of the adaptive attention module GAM1 and the output of the adaptive attention module GAM 2.
The adaptive attention module focuses on selectively extracting location information and on fusing weighted attention features from different levels to achieve mutual information fusion. The input data of the adaptive attention module is a feature map X_i ∈ R^(C×H×W), wherein C, H and W represent the number of feature channels, the height and the width, respectively, and i represents the i-th operation. The inputs to the attention module are two feature maps A and B at different levels, denoted respectively as A ∈ R^(C×H×W) and B ∈ R^(C×H×W).
The features A and B, after passing through the convolutional layer Conv21 and the convolutional layer Conv22 respectively, have their number of channels reduced to C/2. The newly acquired features A′ ∈ R^((C/2)×H×W) and B′ ∈ R^((C/2)×H×W) are further compressed by the global mean pooling layer 21 and the global mean pooling layer 22, which collapse the spatial dimensions into one value per channel; this processing flow can be expressed as:
a = (1/(H × W)) Σ_{h=1..H} Σ_{w=1..W} A′(:, h, w), b = (1/(H × W)) Σ_{h=1..H} Σ_{w=1..W} B′(:, h, w)
After the two feature maps A and B at different levels are processed as described above, the fusion is completed by the addition module 21 through element-wise addition, which is done to keep more residual attention weight information. The result is then passed through a normalization operation so that the weight values lie in (0, 1); this operation is implemented by the normalization module Softmax, as in the formula:
w = Softmax(a + b)
Finally, the original features A ∈ R^(C×H×W) and B ∈ R^(C×H×W) are concatenated as S ∈ R^(2C×H×W), and S is multiplied element by element with the weight w obtained from the previous operation to obtain the final weighted feature map, as shown by:
Y = S ⊗ w
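The pooling–addition–Softmax–weighting flow described above can be sketched in NumPy as follows. This is an illustrative sketch only, not the patent's implementation: the 1 × 1 convolutions Conv21/Conv22 are omitted (the inputs are assumed to be already channel-reduced), and the C/2-dimensional weight vector is tiled to 2C channels so it can be broadcast over the concatenated features — an assumption made purely so the shapes line up.

```python
import numpy as np

def gam_fuse(a_feat, b_feat):
    """Simplified sketch of the adaptive attention fusion.

    a_feat, b_feat: arrays of shape (C/2, H, W), standing in for the
    channel-reduced features produced by Conv21 / Conv22.
    Returns a weighted version of the concatenated inputs.
    """
    # Global mean pooling layers 21 / 22: one value per channel.
    a_vec = a_feat.mean(axis=(1, 2))   # shape (C/2,)
    b_vec = b_feat.mean(axis=(1, 2))   # shape (C/2,)

    # Addition module 21: element-wise sum keeps residual attention weight information.
    s = a_vec + b_vec

    # Softmax normalization: weight values lie in (0, 1) and sum to 1.
    e = np.exp(s - s.max())
    w = e / e.sum()                    # shape (C/2,)

    # Concatenate the inputs and weight them channel-wise.
    # (Tiling w is an assumption made so the channel counts match.)
    stacked = np.concatenate([a_feat, b_feat], axis=0)   # shape (C, H, W)
    w_full = np.tile(w, 2)[:, None, None]                # broadcast over H, W
    return stacked * w_full

rng = np.random.default_rng(0)
out = gam_fuse(rng.normal(size=(4, 8, 8)), rng.normal(size=(4, 8, 8)))
print(out.shape)  # (8, 8, 8)
```

With constant inputs the Softmax produces uniform channel weights, which illustrates why the module only re-weights, rather than replaces, the fused features.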
As shown in fig. 4: an attention mechanism guided progressive human body division analysis method comprises the following steps:
s1, obtaining human body images of the known corresponding two-classification significance prediction image and human body analysis prediction image from the big data platform to form a training data set and a test data set;
in this embodiment, three mainstream human body analysis data sets, LIP, CIHP and PPSS, are selected for the experiments.
LIP is currently the largest human body analysis data set. It comprises 50462 pictures, of which 30462 are used for training, 10000 for validation and the remaining 10000 for testing. The data set contains 20 categories in total, and most pictures contain only a single human body.
CIHP is a data set for instance-level human body analysis; each picture contains multiple instances, making it more complex and challenging than the existing mainstream data sets. The data set contains 38280 pictures, with 28280 pictures for training, 5000 in the test set and 5000 in the validation set, and it likewise contains 20 categories.
PPSS is a small human body analysis data set, composed mainly of pedestrian pictures with real-scene complexity. The data set was collected from 171 video sequences and contains 3673 pictures in total; the training set consists of the first 100 sequences and the test set of the last 71 sequences. The data set contains 8 categories in total.
These three data sets are selected to verify the adaptability and robustness of the system to different types of data. LIP and CIHP both contain 20 classes and therefore pose a complex multi-class parsing problem; CIHP additionally contains multiple instances, which further increases the parsing difficulty. PPSS, by contrast, has a small number of classes, consists mainly of pedestrian pictures and differs in picture style from the first two data sets, so it can be used to test the robustness of the system.
S2, training the attention mechanism guided progressive dividing human body analysis system through the training data set to obtain a trained attention mechanism guided progressive dividing human body analysis system;
s3, verifying the trained attention mechanism guided progressive dividing human body analysis system through the test data set to obtain a verified attention mechanism guided progressive dividing human body analysis system;
and S4, predicting and analyzing the human body image through the verified attention mechanism guided progressive dividing human body analysis system to obtain a two-classification significance prediction graph and a human body analysis prediction graph corresponding to the human body image.
The step S2 includes the steps of:
s21, preprocessing the training data set;
s22, setting the initial parameters and training rules of the human body analysis system by the progressive division guided by the attention mechanism;
and S23, performing parameter iteration on each module in the attention mechanism guided progressive division human body analysis system according to the preprocessed training data set through a back propagation method.
The step S21 includes the following: performing random scaling of 0.5-1.5 on the data in the training data set, and performing cropping and left-right flipping operations on the data in the training data set. The significance annotation graph in the training data set is obtained by unifying the non-background pixels in the parsing annotation graph, so that finally the background class is marked with '0' and the remaining region with '1'.
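The label unification described above can be sketched as follows (a NumPy illustration under our own naming, not the patent's code; the flip helper shows the left-right turning operation applied jointly to the picture and its annotation):

```python
import numpy as np

def make_saliency_label(parsing_label):
    """Unify all non-background pixels of a parsing annotation map into one
    class: background stays 0, every other category becomes 1."""
    return (parsing_label > 0).astype(np.uint8)

def random_flip_lr(image, label, rng):
    """Left-right flip applied jointly to the picture and its annotation,
    with probability 0.5."""
    if rng.random() < 0.5:
        return image[:, ::-1], label[:, ::-1]
    return image, label

# Toy parsing map with categories 3, 5 and 17 on a 0-background.
parsing = np.array([[0, 3, 5],
                    [0, 0, 17]])
print(make_saliency_label(parsing))  # background 0, unified non-background 1
```

Random scaling and cropping follow the same joint-transformation pattern: whatever geometric operation is applied to the picture must also be applied to its annotation map.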
The initial parameters and the training rules in step S22 include the following expressions:
lr = base_lr × (1 − iter / max_iter)^power (1)
L_APPNet = L_parsing + α · L_saliency (2)
α = 1 (3)
power = 0.9 (4)
base_lr = 0.007 (5)
wherein formula (1) is the learning rate iteration rule: lr is the current learning rate, base_lr is the initial learning rate, iter is the current iteration number, max_iter is the total number of iterations, and power is an exponent parameter; formula (2) is the loss function of the training rule: L_parsing is the cross-entropy loss between the segmentation prediction graph and the segmentation annotation graph, L_saliency is the cross-entropy loss between the significance prediction graph and the real annotation graph, and α is a proportion parameter used to balance the segmentation loss and the significance loss.
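As an illustrative sketch (the function names are ours, not the patent's), the poly learning-rate rule (1) and the combined loss (2) with the parameter values (3)–(5) can be written as:

```python
def poly_lr(iter_, max_iter, base_lr=0.007, power=0.9):
    """Learning rate iteration rule: lr = base_lr * (1 - iter/max_iter) ** power."""
    return base_lr * (1.0 - iter_ / max_iter) ** power

def appnet_loss(l_parsing, l_saliency, alpha=1.0):
    """Total loss: parsing cross-entropy plus weighted saliency cross-entropy."""
    return l_parsing + alpha * l_saliency

print(poly_lr(0, 10000))      # 0.007 at the start of training
print(appnet_loss(0.8, 0.2))  # 1.0 with alpha = 1
```

The poly schedule decays smoothly from base_lr to zero over max_iter iterations; power = 0.9 keeps the rate high for most of training and drops it sharply near the end.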
In the training process of this embodiment, different picture input sizes are adopted because of the differences among the three data sets LIP, CIHP and PPSS. For LIP, the input size is 473 × 473; for CIHP, the input size is 512 × 512; for PPSS, the input size is 256 × 256. The classification settings of the three data sets also differ: the number of categories K is set to 20 for LIP and CIHP, and to 8 for PPSS.
The system provided by the invention is trained and verified on the three data sets mentioned above. In the verification process, an edge label graph does not need to be generated. All experiments take the mean intersection-over-union mIoU as the evaluation standard, with the formula:
mIoU = (1/(K+1)) Σ_{i=0..K} [ p_ii / (Σ_{j=0..K} p_ij + Σ_{j=0..K} p_ji − p_ii) ]
wherein K + 1 represents the total number of categories of the data set (the number of categories K plus the background), p_ij represents the total number of pixels of class i that are classified as class j, p_ji represents the total number of pixels of class j that are classified as class i, and p_ii represents the total number of correctly classified pixels. The experimental results show that the mIoU achieved by the system on LIP, CIHP and PPSS is 54.08%, 59.88% and 60.2%, respectively. The performance on all three data sets exceeds that of the existing methods, which proves that the proposed system is effective, robust and general for human body analysis in real scenes. Fig. 5 shows a comparison of the effect of the human segmentation maps generated by the proposed human body analysis system. In the verification process, in order to prove the effectiveness of the proposed feature extraction module and attention module, a series of ablation experiments removing these modules from the original system were performed on the LIP data set; the specific experimental results are shown in the table below, wherein GAM1 denotes the attention module used in the significance detection subsystem and GAM2 denotes the attention module used in the human body analysis subsystem. A comparison with the segmentation maps generated by the original system is also shown in Fig. 5, in which CE2P is the method of the paper (Tao Ruan, Ting Liu, Zilong Huang, Yunchao Wei, Shikui Wei, Yao Zhao. Devil in the Details: Towards Accurate Single and Multiple Human Parsing [C]. AAAI, 2019: 4814-). The comparison shows that the two proposed modules have an outstanding enhancement effect and application value.
TABLE 1. Comparison of mIoU performance between the present invention and the methods described in the cited papers
(Table 1 is reproduced as an image in the original publication.)
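The mIoU evaluation standard can be computed from a confusion matrix as sketched below (a NumPy illustration with hypothetical names; conf[i, j] plays the role of p_ij in the patent's formula):

```python
import numpy as np

def mean_iou(conf):
    """mIoU from a (K+1) x (K+1) confusion matrix.

    conf[i, j] is the number of pixels of class i predicted as class j,
    so IoU_i = p_ii / (sum_j p_ij + sum_j p_ji - p_ii).
    """
    conf = np.asarray(conf, dtype=float)
    tp = np.diag(conf)                                  # p_ii: correct pixels
    denom = conf.sum(axis=1) + conf.sum(axis=0) - tp    # union of prediction and label
    return float(np.mean(tp / denom))

# Toy 2-class example: IoU_0 = 2/3, IoU_1 = 3/4, so mIoU = 17/24.
conf = [[2, 1],
        [0, 3]]
print(round(mean_iou(conf), 4))  # 0.7083
```

A perfectly diagonal confusion matrix gives mIoU = 1.0, and any off-diagonal mass lowers the per-class intersection-over-union before averaging.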

Claims (7)

1. An attention mechanism guided progressive division human body analysis system, comprising: a residual neural network ResNet-101, a significance detection subsystem and a human body analysis subsystem;
the residual error neural network ResNet-101 is a structural neural network and is used for processing a human body image to obtain a shallow layer low-level feature map and a deep layer high-level feature map; the output Block1 and the output Block2 are connected with the significance detection subsystem in a communication mode and used for inputting the shallow low-level feature map into the significance detection subsystem; the output Block3 and the output Block4 are in communication connection with the human body analysis subsystem and are used for inputting the deep high-level feature map into the human body analysis subsystem;
the significance detection subsystem is used for carrying out significance prediction on the shallow low-level feature map to obtain two classification significance prediction maps;
the significance detection subsystem includes: convolutional layer Conv1, convolutional layer Conv2, convolutional layer Conv3, convolutional layer Conv4, adaptive attention module GAM1, upsampling module 1, and upsampling module 2;
the convolutional layer Conv1 is a 1 × 1 convolutional layer, is used for performing dimension reduction processing on a shallow layer low-level feature map transmitted by an output Block1 of a residual neural network ResNet-101, and has an input end in communication connection with the output Block1 of the residual neural network ResNet-101 and an output end in communication connection with an input end A of an adaptive attention module GAM 1;
the convolutional layer Conv2 is a 1 × 1 convolutional layer, is used for performing dimension reduction processing on a shallow layer low-level feature map transmitted by an output Block2 of a residual neural network ResNet-101, and has an input end in communication connection with the output Block2 of the residual neural network ResNet-101 and an output end in communication connection with an input end of an up-sampling module 1;
the up-sampling module 1 is used for performing up-sampling processing on image data subjected to dimensionality reduction processing and transmitted by a shallow layer low-level feature map from an output Block2 of a residual neural network ResNet-101, and the output end of the up-sampling module is in communication connection with an input end B of an adaptive attention module GAM 1;
the adaptive attention module GAM1 is used for extracting attention features, and the output end of the adaptive attention module GAM1 is in communication connection with the convolutional layer Conv3 and is in communication connection with the human body analysis subsystem for feature enhancement;
the convolutional layer Conv3, the convolutional layer Conv4 and the up-sampling module 2 are used for processing the attention features extracted by the adaptive attention module GAM1 to obtain a two-classification significance prediction graph; the convolutional layer Conv3 is a 3 × 3 convolutional layer, the output of which is communicatively connected to the input of convolutional layer Conv 4; the convolutional layer Conv4 is a 1 × 1 convolutional layer, and the output end of the convolutional layer is in communication connection with the input end of the up-sampling module 2; the output end of the up-sampling module 2 is used as a processing result output end of the significance detection subsystem to output two classification significance prediction graphs obtained by the system operation; the human body analysis subsystem is used for carrying out human body analysis prediction on the deep-layer high-level characteristic diagram to obtain a human body analysis prediction diagram;
the human body analysis subsystem comprises: a feature extraction module FEM1, a feature extraction module FEM2, an adaptive attention module GAM2, an upsampling module 3, an upsampling module 4, an adding module 1, a convolutional layer Conv5, and a convolutional layer Conv 6;
the feature extraction module FEM1 is used for carrying out multi-dimensional feature extraction on a deep high-level feature map transmitted by an output Block Block3 of a residual neural network ResNet-101 to obtain multi-dimensional context information, the input end of the feature extraction module FEM1 is in communication connection with the output Block Block3 of the residual neural network ResNet-101, and the output end of the feature extraction module FEM1 is in communication connection with the input end A of the adaptive attention module GAM 2;
the feature extraction module FEM2 is used for carrying out multi-dimensional feature extraction on a deep high-level feature map transmitted by an output Block Block4 of a residual neural network ResNet-101 to obtain multi-dimensional context information, the input end of the feature extraction module FEM2 is in communication connection with the output Block Block4 of the residual neural network ResNet-101, and the output end of the feature extraction module FEM2 is in communication connection with the input end B of the adaptive attention module GAM 2;
the adaptive attention module GAM2 is used for processing multi-dimensional context information to obtain effective weighting characteristics, and the output end of the adaptive attention module GAM2 is in communication connection with the input end of the up-sampling module 3;
the up-sampling module 3 is used for performing up-sampling processing on the effective weighting characteristics, and the output end of the up-sampling module is in communication connection with the input end A of the addition module 1;
the addition module 1 is used for adding the attention characteristics extracted by the adaptive attention module GAM1 and the effective weighting characteristics obtained by the adaptive attention module GAM2 according to elements so as to fuse the characteristic diagrams provided by the adaptive attention module GAM1 and the adaptive attention module GAM2, highlight a target area and improve compactness among classes; its input B is communicatively connected to the output of the adaptive attention module GAM1, and its output is communicatively connected to the input of the convolutional layer Conv 5;
the convolutional layer Conv5, the convolutional layer Conv6 and the up-sampling module 4 are used for processing the attention characteristics obtained by adding the elements by the addition module 1 to obtain a human body analysis prediction graph; the convolutional layer Conv5 is a 3 × 3 convolutional layer, the output of which is communicatively connected to the input of convolutional layer Conv 6; the convolutional layer Conv6 is a 1 × 1 convolutional layer, and the output end of the convolutional layer is connected with the input end of the up-sampling module 4 in a communication manner; the output end of the up-sampling module 4 is used as the processing result output end of the human body analysis subsystem to output the human body analysis prediction graph obtained by the operation of the subsystem.
2. The attention mechanism-guided progressive segmentation human body analysis system as claimed in claim 1, wherein the feature extraction module FEM1 and the feature extraction module FEM2 each comprise: convolutional layer Conv11, convolutional layer Conv12, convolutional layer Conv13, convolutional layer Conv14, convolutional layer Conv15, convolutional layer Conv16, convolutional layer Conv17, and addition module 11;
the input end of the convolutional layer Conv11 is communicatively connected to the input end of convolutional layer Conv12, the input end of convolutional layer Conv13 and the input end of convolutional layer Conv14, and serves as the input end of the feature extraction module FEM1 and the input end of the feature extraction module FEM 2; an output of the convolutional layer Conv11 is communicatively connected with an input of convolutional layer Conv 15; an output of the convolutional layer Conv12 is communicatively connected with an input of convolutional layer Conv 16; an output of the convolutional layer Conv13 is communicatively connected with an input of convolutional layer Conv 17; an output of the convolutional layer Conv14 is communicatively connected to an input A of the summing module 11, an output of the convolutional layer Conv15 is communicatively connected to an input B of the summing module 11, an output of the convolutional layer Conv16 is communicatively connected to an input C of the summing module 11, and an output of the convolutional layer Conv17 is communicatively connected to an input D of the summing module 11; the output end of the addition module 11 serves as the output end of the feature extraction module FEM1 and the output end of the feature extraction module FEM 2;
the convolutional layer Conv11 is a 3 × 3 dilated (atrous) convolutional layer with a dilation rate of 3;
the convolutional layer Conv12 is a 3 × 3 dilated convolutional layer with a dilation rate of 8;
the convolutional layer Conv13 is a 3 × 3 dilated convolutional layer with a dilation rate of 12;
the convolutional layers Conv14, Conv15, Conv16 and Conv17 are all 1 × 1 convolutional layers.
3. The attention mechanism-guided progressive segmentation human body interpretation system of claim 1, wherein the adaptive attention module GAM1 and adaptive attention module GAM2 each comprise: convolutional layer Conv21, convolutional layer Conv22, global mean pooling layer 21, global mean pooling layer 22, addition module 21, Softmax layer, and multiplication module 21;
the convolutional layer Conv21 is a 1 × 1 convolutional layer with inputs as input a of adaptive attention module GAM1 and input a of adaptive attention module GAM2, and outputs communicatively connected to inputs of the global mean pooling layer 21;
the convolutional layer Conv22 is a 1 × 1 convolutional layer with inputs as input B of adaptive attention module GAM1 and input B of adaptive attention module GAM2, and outputs communicatively connected to inputs of global mean pooling layer 22;
the output end of the global mean pooling layer 21 is in communication connection with the input end A of the adding module 21, and the output end of the global mean pooling layer 22 is in communication connection with the input end B of the adding module 21;
an output of the summing module 21 is communicatively coupled to an input of a Softmax layer;
an output of the Softmax layer is communicatively connected to an input of a multiplication module 21;
the output of the multiplication module 21 serves as the output of the adaptive attention module GAM1 and the output of the adaptive attention module GAM 2.
4. A human body analysis method by progressive division guided by attention mechanism is characterized by comprising the following steps:
s1, obtaining human body images of the known corresponding two-classification significance prediction image and human body analysis prediction image from the big data platform to form a training data set and a test data set;
s2, training the attention mechanism guided progressive dividing human body analysis system through the training data set to obtain a trained attention mechanism guided progressive dividing human body analysis system;
s3, verifying the trained attention mechanism guided progressive dividing human body analysis system through the test data set to obtain a verified attention mechanism guided progressive dividing human body analysis system;
and S4, predicting and analyzing the human body image through the verified attention mechanism guided progressive dividing human body analysis system to obtain a two-classification significance prediction graph and a human body analysis prediction graph corresponding to the human body image.
5. The attention mechanism-guided progressive segmentation human body analysis method according to claim 4, wherein the step S2 includes the steps of:
s21, preprocessing the training data set;
s22, setting the initial parameters and training rules of the human body analysis system by the progressive division guided by the attention mechanism;
and S23, performing parameter iteration on each module in the attention mechanism guided progressive division human body analysis system according to the preprocessed training data set through a back propagation method.
6. The attention mechanism-guided progressive segmentation human body analysis method according to claim 5, wherein the step S21 includes the following: performing random scaling of 0.5-1.5 on the data in the training data set, and performing cropping and left-right flipping operations on the data in the training data set.
7. The attention mechanism-guided progressive segmentation human body analysis method according to claim 5, wherein the initial parameters and the training rules in the step S22 include the following expressions:
lr = base_lr × (1 − iter / max_iter)^power (1)
L_APPNet = L_parsing + α · L_saliency (2)
α = 1 (3)
power = 0.9 (4)
base_lr = 0.007 (5)
wherein the formula (1) is the learning rate iteration rule: lr is the current learning rate, base_lr is the initial learning rate, iter is the current iteration number, max_iter is the total number of iterations, and power is an exponent parameter; the formula (2) is the loss function of the training rule: L_parsing is the cross-entropy loss between the segmentation prediction graph and the segmentation annotation graph, L_saliency is the cross-entropy loss between the significance prediction graph and the real annotation graph, and α is a proportion parameter used to balance the segmentation loss and the significance loss.
CN202010081219.4A 2020-02-06 2020-02-06 Attention mechanism guided progressive human body division analysis system and method Expired - Fee Related CN111275694B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010081219.4A CN111275694B (en) 2020-02-06 2020-02-06 Attention mechanism guided progressive human body division analysis system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010081219.4A CN111275694B (en) 2020-02-06 2020-02-06 Attention mechanism guided progressive human body division analysis system and method

Publications (2)

Publication Number Publication Date
CN111275694A CN111275694A (en) 2020-06-12
CN111275694B true CN111275694B (en) 2020-10-23

Family

ID=71001989

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010081219.4A Expired - Fee Related CN111275694B (en) 2020-02-06 2020-02-06 Attention mechanism guided progressive human body division analysis system and method

Country Status (1)

Country Link
CN (1) CN111275694B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111738218B (en) * 2020-07-27 2020-11-24 成都睿沿科技有限公司 Human body abnormal behavior recognition system and method
CN114549332A (en) * 2020-11-25 2022-05-27 杭州火烧云科技有限公司 Convolutional neural network skin type processing method and device based on human body analysis prior support
CN114511573B (en) * 2021-12-29 2023-06-09 电子科技大学 Human body analysis device and method based on multi-level edge prediction

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109086869A (en) * 2018-07-16 2018-12-25 北京理工大学 A kind of human action prediction technique based on attention mechanism
CN110084108A (en) * 2019-03-19 2019-08-02 华东计算技术研究所(中国电子科技集团公司第三十二研究所) Pedestrian re-identification system and method based on GAN neural network
CN110097115A (en) * 2019-04-28 2019-08-06 南开大学 A kind of saliency object detecting method based on attention metastasis
CN110135375A (en) * 2019-05-20 2019-08-16 中国科学院宁波材料技术与工程研究所 More people's Attitude estimation methods based on global information integration
CN110648334A (en) * 2019-09-18 2020-01-03 中国人民解放***箭军工程大学 Multi-feature cyclic convolution saliency target detection method based on attention mechanism
CN110674685A (en) * 2019-08-19 2020-01-10 电子科技大学 Human body analytic segmentation model and method based on edge information enhancement

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8972445B2 (en) * 2009-04-23 2015-03-03 Deep Sky Concepts, Inc. Systems and methods for storage of declarative knowledge accessible by natural language in a computer capable of appropriately responding
US9830709B2 (en) * 2016-03-11 2017-11-28 Qualcomm Incorporated Video analysis with convolutional attention recurrent neural networks
CN108830157B (en) * 2018-05-15 2021-01-22 华北电力大学(保定) Human behavior identification method based on attention mechanism and 3D convolutional neural network
CN109284670B (en) * 2018-08-01 2020-09-25 清华大学 Pedestrian detection method and device based on multi-scale attention mechanism


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
A Method for Human Parsing Based on Deep Learning And Attention Mechanism; Rui Yang et al.; The 2019 6th International Conference on Systems and Informatics (ICSAI 2019); 20191231; pp. 1163-1167 *
A Survey of Human Parsing Research Based on Deep Learning; Shao Jie et al.; Journal of University of Electronic Science and Technology of China; 20190930; Vol. 48, No. 5, pp. 644-654 *

Also Published As

Publication number Publication date
CN111275694A (en) 2020-06-12

Similar Documents

Publication Publication Date Title
CN109949317B (en) Semi-supervised image example segmentation method based on gradual confrontation learning
CN112597941B (en) Face recognition method and device and electronic equipment
CN111275694B (en) Attention mechanism guided progressive human body division analysis system and method
CN111126258A (en) Image recognition method and related device
CN110569814B (en) Video category identification method, device, computer equipment and computer storage medium
CN112434608B (en) Human behavior identification method and system based on double-current combined network
CN110175248B (en) Face image retrieval method and device based on deep learning and Hash coding
CN112861575A (en) Pedestrian structuring method, device, equipment and storage medium
CN112580458B (en) Facial expression recognition method, device, equipment and storage medium
CN110390308B (en) Video behavior identification method based on space-time confrontation generation network
CN112966574A (en) Human body three-dimensional key point prediction method and device and electronic equipment
CN113487610B (en) Herpes image recognition method and device, computer equipment and storage medium
CN114255403A (en) Optical remote sensing image data processing method and system based on deep learning
CN111523421A (en) Multi-user behavior detection method and system based on deep learning and fusion of various interaction information
CN116311214B (en) License plate recognition method and device
CN116012653A (en) Method and system for classifying hyperspectral images of attention residual unit neural network
CN110992301A (en) Gas contour identification method
CN111612802B (en) Re-optimization training method based on existing image semantic segmentation model and application
CN111199199B (en) Action recognition method based on self-adaptive context area selection
CN117115824A (en) Visual text detection method based on stroke region segmentation strategy
CN111582057B (en) Face verification method based on local receptive field
CN113159071B (en) Cross-modal image-text association anomaly detection method
CN115527159A (en) Counting system and method based on cross-modal scale attention aggregation features
CN115424012A (en) Lightweight image semantic segmentation method based on context information
CN116503618B (en) Method and device for detecting remarkable target based on multi-mode and multi-stage feature aggregation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20201023

Termination date: 20220206
