CN111695569A - Image pixel level classification method based on multi-segmentation-map fusion - Google Patents

Image pixel level classification method based on multi-segmentation-map fusion

Info

Publication number
CN111695569A
Authority
CN
China
Prior art keywords
segmentation
pixel
mask
consensus
classification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010397565.3A
Other languages
Chinese (zh)
Other versions
CN111695569B (en)
Inventor
姚莉 (Yao Li)
乔昂 (Qiao Ang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University filed Critical Southeast University
Priority to CN202010397565.3A priority Critical patent/CN111695569B/en
Publication of CN111695569A publication Critical patent/CN111695569A/en
Application granted granted Critical
Publication of CN111695569B publication Critical patent/CN111695569B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an image pixel-level classification method based on multi-segmentation-map fusion. The method comprises three main steps. First, a guidance mechanism is introduced among multiple segmentation maps, in which a higher-precision segmentation map guides a lower-precision segmentation map to improve its precision. Second, a consensus mechanism is introduced: edge-region pixels that may produce classification conflicts across the maps reach a classification consensus through a negotiation strategy. Finally, a fusion strategy based on a fully convolutional neural network combines the two mechanisms effectively to obtain the final output. The invention effectively resolves pixel classification conflicts in edge regions and fuses multiple segmentation maps into a finer-grained pixel classification result. The method can be used in combination with a variety of supervised learning techniques, including but not limited to deep neural networks, random forests, and support vector machines. It effectively overcomes the insufficient attention that existing methods pay to conflicting pixels and yields a higher-quality fused segmentation result.

Description

Image pixel level classification method based on multi-segmentation-map fusion
Technical Field
The invention relates to an image processing and analyzing technology, and belongs to the technical field of image content understanding.
Background
Image content understanding is an important research goal in the field of computer vision. As computer vision technology develops, image content understanding is moving toward ever finer granularity. Segmentation, i.e., pixel-level classification of an image, is one of the important means of understanding image content, and how to achieve finer-grained classification on top of existing techniques is a focus of current research. This inevitably raises the problem of pixel classification conflicts between different existing techniques, which generally occur at the edges of the segmented content. Existing methods still offer no effective solution to the following conflict problems:
1) Classification conflicts between different segmentations of the same foreground object.
2) Classification conflicts between different segmentations of the same background content.
3) Classification conflicts at the junction of a foreground object and background content.
Disclosure of Invention
The invention aims to solve three problems: classification conflicts between different segmentations of the same foreground object, classification conflicts between different segmentations of the same background content, and classification conflicts at the junction of foreground objects and background content.
To this end, the invention adopts the following method: an image pixel-level classification method based on multi-segmentation-map fusion, comprising the following steps:
(1) A guidance mechanism is introduced among the multiple segmentation maps. According to the quality with which each part segments its content, an attention mechanism is used to provide high-precision content to the low-precision parts so that they attend to it. Taking the fusion of two segmentation maps as an example: if the input to be fused consists of a foreground-object segmentation map and a background-content segmentation map, the background-content segmentation places higher demands on local and global semantics than the foreground-object segmentation, so its classification precision along edges is slightly lower; the classification result of the foreground segmentation in the edge region can then be used as attention to supplement the semantic information of the background segmentation in the corresponding region. The multi-map case can be extrapolated accordingly.
(2) A consensus mechanism is introduced among the multiple segmentation maps; classification conflicts that may exist in the edge region are resolved by learning a consensus mask during the learning stage of the supervised learning method. Taking the two-map scenario as an example, the two segmentation maps are coded 0 and 1 respectively. The consensus mask is a binary mask, initialized to all zeros: if the value at position (i, j) is 0, the two parties agree that the pixel at the corresponding position of the input image is classified as in the map coded 0; if the value is 1, as in the map coded 1. During learning, guided by the corresponding term of the loss function, the two parts negotiate continuously to reach a more reasonable consensus mask. The multi-map case can be extrapolated accordingly.
(3) A fusion strategy based on a fully convolutional neural network effectively combines the two mechanisms to obtain the final output. The two-segmentation-map application scenario is taken as the example; the multi-map scenario can be deduced from it.
As an improvement of the present invention, the fusion strategy is implemented by the following sub-steps:
(3.1) completing initialization: size-registering all object segmentation blocks obtained in the previous step against the size of the original input image;
(3.2) removing repeated segmentation blocks: if different segmentation blocks of the same object do not overlap completely, the overlapping part is kept; the pixel-attribution problem of overlapping regions between different objects is solved in subsequent sub-steps;
(3.3) adjusting the contour range: the segmentation blocks of all object classes are input into a fully convolutional neural network with an encoder-decoder structure, and the contour range of each object is adjusted through learning;
(3.4) merging the masks: the object masks obtained in the previous step are merged, and the attribution of pixels in the edge overlap region is decided according to the consensus mask.
Advantageous effects:
(1) By introducing a guidance mechanism between different segmentation maps, the invention resolves pixel classification conflicts between different segmentations of the same foreground object and of the same background content.
(2) By introducing a consensus mechanism between different segmentations, the invention resolves classification conflicts for edge-region pixels between image foreground objects and background content.
(3) The method can be used together with various supervised learning methods, including but not limited to deep neural networks, random forests, and support vector machines; combined with the corresponding loss functions and the final fusion strategy, it improves pixel-level classification precision and significantly improves segmentation quality.
Drawings
FIG. 1 is a flow diagram of an overall multi-partition inter-image pixel classification conflict resolution scheme;
FIG. 2 is a schematic diagram of an attention-based multi-partition guidance mechanism;
FIG. 3 is a schematic diagram of a multi-partition consensus mechanism based on consensus masks;
FIG. 4 is a flow diagram of a full convolutional neural network based multi-partition fusion strategy;
Detailed Description
The following examples are intended to illustrate the present invention, but are not intended to limit the scope of the present invention.
The process for resolving pixel conflicts among multiple segmentation maps is described below in conjunction with the accompanying drawings.
FIG. 1 shows the overall flow of the invention's solution to pixel classification conflicts between multiple segmentation maps; the method includes the following steps:
(1) a guidance mechanism is introduced: according to the segmentation quality of each part, an attention mechanism provides high-precision content to the low-precision part so that the latter attends to it, supplementing the semantic information of the corresponding region and improving the segmentation precision of the low-precision part;
(2) a consensus mechanism is introduced: during the learning stage of the supervised learning method, guided by the corresponding term of the loss function, the edge-region pixels are continuously negotiated over to learn a consensus mask that resolves possible classification conflicts in the edge region;
(3) the results of the two mechanisms are integrated using a fusion strategy based on a fully convolutional neural network to obtain the final output; a minimal sketch of how the three steps chain together is given below.
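Purely as an orientation aid, the following hedged Python sketch shows one way the three steps could compose. The patent specifies the mechanisms (detailed in the sections below) but no concrete API, so every function here is a placeholder supplied by the caller:

```python
from typing import Callable
import numpy as np

SegMap = np.ndarray  # a 2-D per-pixel label or confidence map

def fuse_segmentations(
    in_fo: SegMap,
    in_bc: SegMap,
    guide: Callable[[SegMap, SegMap], SegMap],
    learn_consensus: Callable[[SegMap, SegMap], SegMap],
    fcn_fuse: Callable[[SegMap, SegMap, SegMap], SegMap],
) -> SegMap:
    """Illustrative composition of the three steps described above."""
    guided_bc = guide(in_fo, in_bc)            # (1) attention-based guidance
    mask = learn_consensus(in_fo, guided_bc)   # (2) consensus mask over edge conflicts
    return fcn_fuse(in_fo, guided_bc, mask)    # (3) FCN-based fusion of both mechanisms
```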
The invention will be further described below using two segmentation maps; the multiple-segmentation-map case can be derived from them.
(1) The guidance mechanism, in which a higher-precision segmentation map guides a lower-precision segmentation map to improve its precision:
the attention-based guidance mechanism is shown in fig. 2, and for two segmentation graphs, we assume that the input to be fused is a segmentation graph mentor of a foreground objectfoGraph in separated from background contentbcThe background content segmentation has higher requirements on local and global semantics compared with the foreground object segmentation, so that the classification precision of the edge part is slightly low, and the classification result of the foreground object segmentation in the edge area can be used as attention to supplement semantic information of the background content segmentation in the corresponding area. If define outbcFor the guided output of background content parts, the relationship between them can be formalized as
Figure BDA0002488233250000031
Wherein
Figure BDA0002488233250000032
And
Figure BDA0002488233250000033
pixel-by-pixel multiplication and pixel-by-pixel addition operations, respectively, rescale (·) is used for registration of the inter-segmentation map dimensions, and norm (·) is a normalization operation, which is inversely related to the number of segmentation maps.
Defining p and g as the prediction output and the annotation output respectively, the guidance loss term in this scenario is

[loss formula rendered as an image in the original patent]
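Both formulas above survive only as images in the source, so they cannot be transcribed exactly. The sketch below is therefore just one plausible NumPy reading of the prose, assuming out_bc = in_bc ⊕ (norm(rescale(in_fo)) ⊗ in_bc); the function names and the division-by-map-count form of norm(·) are assumptions:

```python
import numpy as np
from scipy.ndimage import zoom

def rescale(seg, target_shape):
    """Size registration: resample a segmentation map to the target shape."""
    factors = (target_shape[0] / seg.shape[0], target_shape[1] / seg.shape[1])
    return zoom(seg, factors, order=1)  # bilinear resampling

def norm(seg, num_maps):
    """Assumed normalization, inversely related to the number of maps."""
    return seg / num_maps

def guide(in_fo, in_bc, num_maps=2):
    """Hypothetical guided output out_bc: the registered, normalized foreground
    map acts as pixel-wise attention supplementing the background map."""
    attention = norm(rescale(in_fo, in_bc.shape), num_maps)
    return in_bc + attention * in_bc  # reading ⊕ and ⊗ as element-wise add/multiply
```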
(2) The consensus mechanism resolves possible classification conflicts in the edge region by learning a consensus mask during the learning stage of the supervised learning method; the consensus mechanism based on the consensus mask is shown in FIG. 3:
(2.1) initializing consensus masks:
the two parts are coded as 0 and 1, respectively. The common identification mask is a binary mask, which is initially a 0-value mask, and if the (i, j) position value is 0, the two parties achieve that the classification of the pixel at the corresponding position of the input image is the same as that of the 0-code division image, and if the value is 1, the opposite is true. And carrying out size registration on the two segmentation images, and adjusting the two segmentation images to be the same size.
(2.2) calculating a foreground mask and a background mask:
and carrying out size registration on the input segmentation graph, and adjusting the input segmentation graph to the same size.
For foreground objects, different partitions of the same generic object are merged into the same mask and truncated using a learned threshold. Then merging each class of mask after cutting, using class number to carry out regularization on mask pixel point values, and calculating to obtain a binary foreground mask.
For background content, the pixel values of the background content portion are assigned the same encoding values as in the initialization phase, and the pixel values of the non-background portion are assigned the opposite values, thereby generating a background content mask.
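A hedged sketch of this mask computation; the learned threshold is taken as a given scalar, and all names are illustrative:

```python
import numpy as np

def foreground_mask(segments_by_class, threshold):
    """segments_by_class: {class_id: [same-class segment maps in [0, 1]]}.
    Merge same-class segments, truncate with the learned threshold, merge the
    truncated per-class masks, normalize by the class count, and binarize."""
    num_classes = len(segments_by_class)
    h, w = next(iter(segments_by_class.values()))[0].shape
    merged = np.zeros((h, w), dtype=np.float64)
    for segments in segments_by_class.values():
        class_mask = np.clip(np.sum(segments, axis=0), 0.0, 1.0)  # merge same-class blocks
        merged += (class_mask >= threshold).astype(np.float64)    # truncate by threshold
    return ((merged / num_classes) > 0).astype(np.uint8)          # normalize, binarize

def background_mask(is_background, bg_code=1):
    """Background pixels keep the code assigned at initialization; all other
    pixels receive the opposite value."""
    return np.where(is_background, bg_code, 1 - bg_code).astype(np.uint8)
```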
(2.3) consensus learning:
During the supervised learning process, the relevant loss term between the two is continuously reduced, and consensus learning is achieved.
The loss term is defined as follows: f and b denote the two input segmentation maps, and N represents the number of segmentation maps iteratively input during learning. The consensus loss term in this scenario is

[formula rendered as an image in the original patent]
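Since the consensus loss formula is likewise only an image, the following is no more than one speculative reading consistent with the definitions of f, b, and N; the squared-error form is an assumption:

```python
import numpy as np

def consensus_loss(f, b, consensus_mask, ground_truth, n_maps=2):
    """Speculative consensus loss: mean squared disagreement between the
    consensus-fused map and the annotation, averaged over the number of maps."""
    fused = np.where(consensus_mask == 0, f, b)
    return float(np.mean((fused - ground_truth) ** 2) / n_maps)
```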
(3) The fusion strategy based on a fully convolutional neural network is shown in FIG. 4; it effectively combines the two mechanisms to obtain the final output. Taking the two-segmentation-map application scenario as the example (the multi-map scenario can be deduced accordingly), the fusion strategy is implemented by the following sub-steps, with a sketch after the list:
(3.1) Complete initialization: size-register all object segmentation blocks obtained in the previous step against the size of the original input image.
(3.2) Remove repeated segmentation blocks: if different segmentation blocks of the same object do not overlap completely, the overlapping part is kept; the pixel-attribution problem of overlapping regions between different objects is solved in the subsequent sub-steps.
(3.3) Adjust the contour range: the segmentation blocks of all object classes are input into a fully convolutional neural network with an encoder-decoder structure, and the contour range of each object is adjusted through learning.
(3.4) Merge the masks: the object masks obtained in the previous step are merged, and the attribution of pixels in edge overlap regions is decided according to the consensus mask.
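For concreteness, here is a minimal PyTorch sketch of an encoder-decoder fully convolutional network in the spirit of sub-step (3.3), together with a consensus-guided merge for sub-step (3.4). The patent does not specify depth, channel widths, or training details, so everything below is illustrative:

```python
import torch
import torch.nn as nn

class ContourFCN(nn.Module):
    """Tiny encoder-decoder fully convolutional network for refining contours
    (illustrative; the real network's architecture is not given in the text)."""
    def __init__(self, in_channels, num_classes):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(inplace=True),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 2, stride=2), nn.ReLU(inplace=True),
            nn.Conv2d(32, num_classes, 1),  # per-pixel class scores
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

def merge_masks(mask0, mask1, consensus):
    """(3.4) Union of two object masks; in their overlap, follow the consensus
    mask (0 -> keep mask0's label, 1 -> keep mask1's label)."""
    overlap = (mask0 > 0) & (mask1 > 0)
    merged = torch.where(mask0 > 0, mask0, mask1)
    return torch.where(overlap & (consensus == 1), mask1, merged)
```

In practice such a network would be trained jointly with the guidance and consensus loss terms described in the preceding sections.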

Claims (4)

1. An image pixel-level classification method based on multi-segmentation-map fusion, characterized by comprising the following steps:
(1) introducing a guidance mechanism: according to the segmentation quality of each part, an attention mechanism provides high-precision content to the low-precision part so that the latter attends to it, supplementing the semantic information of the corresponding region and improving the segmentation precision of the low-precision part;
(2) introducing a consensus mechanism: during the learning stage of the supervised learning method, guided by the corresponding term of the loss function, the edge-region pixels are continuously negotiated over to learn a consensus mask that resolves possible classification conflicts in the edge region;
(3) integrating the results of the two mechanisms using a fusion strategy based on a fully convolutional neural network to obtain the final output.
2. The image pixel-level classification method based on multi-segmentation-map fusion according to claim 1, wherein the guidance mechanism in step (1) is implemented by the following sub-steps:
(1.1) defining the input to be fused as the foreground-object segmentation map in_fo and the background-content segmentation map in_bc, and defining out_bc as the guided output of the background-content part, the relationship between them being

[formula rendered as an image in the original patent]

wherein ⊗ and ⊕ denote pixel-by-pixel multiplication and pixel-by-pixel addition respectively, rescale(·) is used for size registration between segmentation maps, and norm(·) is a normalization operation inversely related to the number of segmentation maps;

(1.2) using the following loss term with the corresponding supervised learning method to measure the guidance effect: defining p and g as the prediction output and the annotation output, the guidance loss term in this scenario is

[loss formula rendered as an image in the original patent]
3. The image pixel-level classification method based on multi-segmentation-map fusion according to claim 1, wherein the consensus mechanism in step (2) is implemented by the following sub-steps:
(2.1) initializing the consensus mask: the two parts are coded 0 and 1 respectively; the consensus mask is a binary mask, initially all zeros; if the value at position (i, j) is 0, the two parties agree that the pixel at the corresponding position of the input image is classified as in the segmentation map coded 0, and if the value is 1, the opposite holds; the two segmentation maps are size-registered and adjusted to the same size;
(2.2) calculating a foreground mask and a background mask: the input segmentation maps are size-registered and adjusted to the same size; for foreground objects, different segmentation blocks of objects of the same class are merged into the same mask and truncated using a learned threshold, the truncated per-class masks are then merged, the mask pixel values are normalized by the number of classes, and a binary foreground mask is computed; for background content, the pixels of the background-content part are assigned the same code value as in the initialization stage and the pixels of the non-background part the opposite value, generating the background-content mask;
(2.3) consensus learning: during supervised learning, the relevant loss term between the two is continuously reduced to achieve consensus learning; the loss term is defined as follows: f and b denote the two input segmentation maps, N represents the number of segmentation maps iteratively input during learning, and the consensus loss term in this scenario is

[formula rendered as an image in the original patent]
4. The image pixel-level classification method based on multi-segmentation-map fusion according to claim 1, wherein the fusion strategy in step (3) is implemented by the following sub-steps:
(3.1) completing initialization: size-registering all object segmentation blocks obtained in the previous step against the size of the original input image;
(3.2) removing repeated segmentation blocks: if different segmentation blocks of the same object do not overlap completely, the overlapping part is kept; the pixel-attribution problem of overlapping regions between different objects is solved in the subsequent sub-steps;
(3.3) adjusting the contour range: the segmentation blocks of all object classes are input into a fully convolutional neural network with an encoder-decoder structure, and the contour range of each object is fine-tuned through learning;
(3.4) merging the masks: the object masks obtained in the previous step are merged, and the attribution of pixels in the edge overlap region is decided according to the consensus mask.
CN202010397565.3A 2020-05-12 2020-05-12 Image pixel level classification method based on multi-segmentation-map fusion Active CN111695569B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010397565.3A CN111695569B (en) 2020-05-12 2020-05-12 Image pixel level classification method based on multi-segmentation-map fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010397565.3A CN111695569B (en) 2020-05-12 2020-05-12 Image pixel level classification method based on multi-segmentation-map fusion

Publications (2)

Publication Number Publication Date
CN111695569A true CN111695569A (en) 2020-09-22
CN111695569B CN111695569B (en) 2023-04-18

Family

ID=72477703

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010397565.3A Active CN111695569B (en) 2020-05-12 2020-05-12 Image pixel level classification method based on multi-segmentation-map fusion

Country Status (1)

Country Link
CN (1) CN111695569B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116453246A (en) * 2023-06-12 2023-07-18 深圳市众联视讯科技有限公司 Intelligent door lock capable of identifying objects outside door and alarming and identification alarming method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106780500A (en) * 2016-12-09 2017-05-31 深圳市唯特视科技有限公司 A kind of image partition method of use regression algorithm
CN109685067A (en) * 2018-12-26 2019-04-26 江西理工大学 A kind of image, semantic dividing method based on region and depth residual error network
CN110047077A (en) * 2019-04-17 2019-07-23 湘潭大学 A kind of image processing method for ether mill common recognition mechanism

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106780500A (en) * 2016-12-09 2017-05-31 深圳市唯特视科技有限公司 A kind of image partition method of use regression algorithm
CN109685067A (en) * 2018-12-26 2019-04-26 江西理工大学 A kind of image, semantic dividing method based on region and depth residual error network
CN110047077A (en) * 2019-04-17 2019-07-23 湘潭大学 A kind of image processing method for ether mill common recognition mechanism

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HOU Xiaogang et al.: "Fast image segmentation algorithm based on superpixel multi-feature fusion", Acta Electronica Sinica (《电子学报》) *
WANG Shupeng et al.: "Multi-exposure image fusion algorithm based on adaptive segmentation", Journal of Computer Applications (《计算机应用》) *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116453246A (en) * 2023-06-12 2023-07-18 深圳市众联视讯科技有限公司 Intelligent door lock capable of identifying objects outside door and alarming and identification alarming method
CN116453246B (en) * 2023-06-12 2024-02-02 深圳市众联视讯科技有限公司 Intelligent door lock capable of identifying objects outside door and alarming and identification alarming method

Also Published As

Publication number Publication date
CN111695569B (en) 2023-04-18

Similar Documents

Publication Publication Date Title
CN110322495B (en) Scene text segmentation method based on weak supervised deep learning
Batsos et al. CBMV: A coalesced bidirectional matching volume for disparity estimation
Liu et al. Local similarity pattern and cost self-reassembling for deep stereo matching networks
CN111583097A (en) Image processing method, image processing device, electronic equipment and computer readable storage medium
US20240029272A1 (en) Matting network training method and matting method
Nandi et al. Traffic sign detection based on color segmentation of obscure image candidates: a comprehensive study
CN111507337A (en) License plate recognition method based on hybrid neural network
KR20180067909A (en) Apparatus and method for segmenting image
Michalak et al. Fast Binarization of Unevenly Illuminated Document Images Based on Background Estimation for Optical Character Recognition Purposes.
CN115565071A (en) Hyperspectral image transform network training and classifying method
CN111695569B (en) Image pixel level classification method based on multi-segmentation-map fusion
Yang et al. Study of detection method on real-time and high precision driver seatbelt
Chen et al. Pgnet: Panoptic parsing guided deep stereo matching
Sun et al. TSINIT: a two-stage Inpainting network for incomplete text
Zhao et al. Traffic signs and markings recognition based on lightweight convolutional neural network
CN110880011B (en) Image segmentation method, device and non-transitory computer readable medium thereof
Fröhlich et al. As time goes by—anytime semantic segmentation with iterative context forests
CN111914947A (en) Image instance segmentation method, device and equipment based on feature fusion and storage medium
CN116228795A (en) Ultrahigh resolution medical image segmentation method based on weak supervised learning
Khan et al. A robust light-weight fused-feature encoder-decoder model for monocular facial depth estimation from single images trained on synthetic data
CN114627139A (en) Unsupervised image segmentation method, unsupervised image segmentation device and unsupervised image segmentation equipment based on pixel feature learning
Tsai et al. Real-time automatic multilevel color video thresholding using a novel class-variance criterion
Ke et al. Subject-aware image outpainting
Vasam et al. Instance Segmentation on Real time Object Detection using Mask R-CNN
CN114463187B (en) Image semantic segmentation method and system based on aggregation edge features

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant