CN113128519B - Multi-mode multi-spliced RGB-D saliency target detection method - Google Patents

Multi-mode multi-spliced RGB-D saliency target detection method

Info

Publication number
CN113128519B
CN113128519B CN202110461176.7A
Authority
CN
China
Prior art keywords
image
rgb
depth
detection method
target detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110461176.7A
Other languages
Chinese (zh)
Other versions
CN113128519A (en)
Inventor
陈莉 (Chen Li)
赵志华 (Zhao Zhihua)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NORTHWEST UNIVERSITY
Original Assignee
NORTHWEST UNIVERSITY
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NORTHWEST UNIVERSITY filed Critical NORTHWEST UNIVERSITY
Priority to CN202110461176.7A priority Critical patent/CN113128519B/en
Publication of CN113128519A publication Critical patent/CN113128519A/en
Application granted granted Critical
Publication of CN113128519B publication Critical patent/CN113128519B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462Salient features, e.g. scale invariant feature transforms [SIFT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/50Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • G06V10/507Summing image-intensity values; Histogram projection analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/56Extraction of image or video features relating to colour

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a multi-mode multi-spliced RGB-D saliency target detection method, which comprises the following steps: S1, dividing the image into non-overlapping sub-regions, extracting the RGB image color information, Depth image depth information and symmetric invariant LBP features of each sub-region, and forming a region histogram from the symmetric invariant LBP features; S2, measuring the correlation of the RGB image color information, the depth information and the region histogram with class-conditional mutual information entropy, and fusing the three at score level with an adaptive score fusion algorithm to obtain the final score of each image sub-region; and S3, detecting the salient target of the image based on the final score of each image sub-region. The invention detects salient targets rapidly and efficiently, and improves the precision of saliency detection.

Description

Multi-mode multi-spliced RGB-D saliency target detection method
Technical Field
The invention relates to the field of image detection, in particular to a multi-mode multi-spliced RGB-D saliency target detection method.
Background
Salient object detection is an important component of computer vision; with the continuous development of the field, detection methods with higher efficiency and better accuracy are urgently needed.
During the development of saliency detection, various methods have evolved that exploit the color features, position information, texture features, etc. of images. Some conventional methods use center priors, edge priors, semantic priors and the like. However, because the color scenes in images can be very complex, these models often fail when there is no obvious contrast between the object and the background, and such features struggle to distinguish similar objects.
Disclosure of Invention
In order to solve the above problems, the invention provides a multi-mode multi-spliced RGB-D saliency target detection method, which detects salient targets rapidly and efficiently and improves the precision of saliency detection.
At the same time, the complementarity of the visible-light camera and the near-infrared camera is exploited: features are extracted from the visible-light face picture with a deep learning algorithm, and the features extracted by the deep learning model are then fused hierarchically with a fusion algorithm, so as to achieve the effect of complementary advantages.
In order to achieve the above purpose, the technical scheme adopted by the invention is as follows:
a multi-mode multi-spliced RGB-D significance target detection method comprises the following steps:
s1, dividing an image into non-overlapping subareas, respectively extracting RGB image color information, depth image Depth information and symmetrical invariant LBP characteristics of each image subarea, and forming a region histogram based on the symmetrical invariant LBP characteristics;
s2, measuring the correlation of RGB image color information, depth image Depth information and region histogram based on class condition mutual information entropy, and utilizing a self-adaptive score fusion algorithm to realize fusion of the RGB image color information, the Depth image Depth information and the region histogram in a score level, so as to obtain the final score of each image subarea;
and S3, detecting the saliency target of the image based on the final score of each image subarea.
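Step S1 can be illustrated with a short sketch: the image is divided into a grid of non-overlapping sub-regions, and a normalised histogram of center-symmetric LBP codes is built per region. The patent does not define its "symmetric invariant LBP", so center-symmetric LBP (CS-LBP) is used here as a plausible stand-in; the grid layout, 16-bin histogram and zero threshold are illustrative assumptions, not the patented configuration.

```python
import numpy as np

def cs_lbp(gray, threshold=0.0):
    """Center-symmetric LBP: compare the 4 opposite neighbour pairs around
    each interior pixel, yielding a 4-bit code (16 possible patterns)."""
    g = gray.astype(np.float64)
    centre = g[1:-1, 1:-1]  # interior pixels only; used for the output shape
    # the 4 centre-symmetric neighbour pairs: (N,S), (NE,SW), (E,W), (SE,NW)
    pairs = [
        (g[:-2, 1:-1], g[2:, 1:-1]),
        (g[:-2, 2:],   g[2:, :-2]),
        (g[1:-1, 2:],  g[1:-1, :-2]),
        (g[2:, 2:],    g[:-2, :-2]),
    ]
    code = np.zeros(centre.shape, dtype=np.uint8)
    for bit, (a, b) in enumerate(pairs):
        code |= ((a - b) > threshold).astype(np.uint8) << bit
    return code

def region_histogram(gray_region, n_bins=16):
    """Normalised histogram of CS-LBP codes over one sub-region."""
    codes = cs_lbp(gray_region)
    hist, _ = np.histogram(codes, bins=n_bins, range=(0, n_bins))
    return hist / max(hist.sum(), 1)

def split_regions(img, rows, cols):
    """Divide an image into a rows x cols grid of non-overlapping sub-regions."""
    h, w = img.shape[:2]
    return [img[i * h // rows:(i + 1) * h // rows,
                j * w // cols:(j + 1) * w // cols]
            for i in range(rows) for j in range(cols)]
```

A regular grid is used here only for simplicity; in the patent, the sub-regions come from per-target detection rather than a fixed grid.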
Further, in the step S1, the extraction of the RGB image color information, Depth image depth information and symmetric invariant LBP features of each sub-region is implemented based on a deep convolutional network.
Further, in the step S1, the targets contained in the image are first detected based on the Dssd_Xception_coco model, and the image sub-regions are then divided according to the detection result.
Further, each detected target is assigned its own image sub-region, and the remaining background forms one additional sub-region.
Further, the Dssd_Xception_coco model adopts the DSSD target detection algorithm: the Xception neural network is pre-trained on the COCO data set, the model is then trained with the previously prepared data set, the parameters of the deep neural network are fine-tuned, and a detection model suitable for detecting each target in the image is finally obtained.
Further, in the step S3, the salient target detection of the image is implemented according to the final score of each image sub-region based on the ResNet50 model.
Further, the method further comprises the step of identifying the human-eye observation angle of the image, wherein different observation angles correspond to different image deflection-angle adjustment models, and the image deflection angle is adjusted based on the corresponding model.
The invention has the following beneficial effects:
the detection of the saliency target can be realized quickly and efficiently, and the precision of the saliency detection can be realized.
Drawings
FIG. 1 is a flowchart of a multi-mode multi-spliced RGB-D saliency target detection method according to embodiment 1 of the present invention.
Fig. 2 is a flowchart of a multi-mode multi-spliced RGB-D saliency target detection method according to embodiment 2 of the present invention.
Detailed Description
The present invention will be described in further detail with reference to examples in order to make the objects and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Example 1
A multi-mode multi-spliced RGB-D saliency target detection method comprises the following steps:
S1, dividing the image into non-overlapping sub-regions, extracting the RGB image color information, Depth image depth information and symmetric invariant LBP features of each sub-region, and forming a region histogram from the symmetric invariant LBP features;
S2, measuring the correlation of the RGB image color information, the depth information and the region histogram with class-conditional mutual information entropy, and fusing the three at score level with an adaptive score fusion algorithm to obtain the final score of each image sub-region;
and S3, implementing the salient target detection of the image according to the final score of each image sub-region based on the ResNet50 model.
In this embodiment, in the step S1, the extraction of the RGB image color information, Depth image depth information and symmetric invariant LBP features of each sub-region is implemented based on a deep convolutional network.
In this embodiment, in step S1, the targets contained in the image are first detected based on the Dssd_Xception_coco model, and the image sub-regions are then divided according to the detection result. Each detected target is assigned its own image sub-region, and the remaining background forms one additional sub-region. The Dssd_Xception_coco model adopts the DSSD target detection algorithm: the Xception neural network is pre-trained on the COCO data set, the model is then trained with the previously prepared data set, the parameters of the deep neural network are fine-tuned, and a detection model suitable for detecting each target (such as a person, a tree, furniture and the like) in the image is finally obtained.
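The adaptive score fusion of step S2 can be sketched as follows: each modality's per-region scores are quantised, the mutual information between the quantised scores and the region class labels is estimated, and the normalised mutual-information values serve as fusion weights. The patent gives no formulas for its "class condition mutual information entropy" fusion, so the quantile-based quantisation and linear weighting below are assumptions, one plausible reading only.

```python
import numpy as np

def mutual_information(x_bins, y_bins):
    """Estimate I(X;Y) in nats from two discrete (integer-coded) arrays."""
    joint, _, _ = np.histogram2d(x_bins, y_bins,
                                 bins=(int(x_bins.max()) + 1,
                                       int(y_bins.max()) + 1))
    p = joint / joint.sum()
    px = p.sum(axis=1, keepdims=True)   # marginal of X
    py = p.sum(axis=0, keepdims=True)   # marginal of Y
    nz = p > 0                          # skip zero cells to avoid log(0)
    return float((p[nz] * np.log(p[nz] / (px @ py)[nz])).sum())

def adaptive_score_fusion(scores, labels):
    """scores: dict of modality name -> per-region score array;
    labels: integer class label per region (e.g. salient / non-salient).
    Each modality is weighted by the mutual information between its
    quantised scores and the labels; the fused score is the weighted sum."""
    weights = {}
    for name, s in scores.items():
        edges = np.quantile(s, [0.25, 0.5, 0.75])
        q = np.searchsorted(edges, s)          # quantise scores to 4 bins
        weights[name] = mutual_information(q, labels)
    total = sum(weights.values()) or 1.0       # guard against all-zero MI
    weights = {n: w / total for n, w in weights.items()}
    fused = sum(w * scores[n] for n, w in weights.items())
    return fused, weights
```

An uninformative modality (scores unrelated to the labels) receives near-zero mutual information and therefore near-zero weight, which is the intuition behind letting the fusion adapt per image.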
Example 2
A multi-mode multi-spliced RGB-D saliency target detection method comprises the following steps:
S1, identifying the human-eye observation angle of the image, wherein different observation angles correspond to different image deflection-angle adjustment models, and adjusting the image deflection angle based on the corresponding model;
S2, dividing the image into non-overlapping sub-regions, extracting the RGB image color information, Depth image depth information and symmetric invariant LBP features of each sub-region, and forming a region histogram from the symmetric invariant LBP features;
S3, measuring the correlation of the RGB image color information, the depth information and the region histogram with class-conditional mutual information entropy, and fusing the three at score level with an adaptive score fusion algorithm to obtain the final score of each image sub-region;
and S4, implementing the salient target detection of the image according to the final score of each image sub-region based on the ResNet50 model.
In this embodiment, in the step S2, the extraction of the RGB image color information, Depth image depth information and symmetric invariant LBP features of each sub-region is implemented based on a deep convolutional network.
In this embodiment, in step S2, the targets contained in the image are first detected based on the Dssd_Xception_coco model, the image sub-regions are then divided according to the detection result, each detected target is assigned its own image sub-region, and the remaining background forms one additional sub-region. The Dssd_Xception_coco model adopts the DSSD target detection algorithm: the Xception neural network is pre-trained on the COCO data set, the model is then trained with the previously prepared data set, the parameters of the deep neural network are fine-tuned, and a detection model suitable for detecting each target (such as a person, a tree, furniture and the like) in the image is finally obtained.
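For the final detection step, once every sub-region has its fused score, a full-resolution saliency map can be assembled by painting each region with its score and min-max normalising; thresholding this map yields the detected salient-object mask. This grid-painting sketch is an illustration only: in the patent the final detection uses a ResNet50 model, which is not reproduced here, and the grid layout and 0.5 threshold are assumptions.

```python
import numpy as np

def saliency_map_from_scores(shape, rows, cols, scores):
    """Paint each cell of a rows x cols grid with its region's fused score,
    then min-max normalise the map to [0, 1]."""
    h, w = shape
    sal = np.zeros((h, w), dtype=np.float64)
    for i in range(rows):
        for j in range(cols):
            sal[i * h // rows:(i + 1) * h // rows,
                j * w // cols:(j + 1) * w // cols] = scores[i * cols + j]
    lo, hi = sal.min(), sal.max()
    return (sal - lo) / (hi - lo) if hi > lo else sal

# Thresholding the normalised map gives the detected salient-object mask:
# mask = saliency_map_from_scores((64, 64), 2, 2, [0.1, 0.9, 0.2, 0.3]) > 0.5
```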
The foregoing is merely a preferred embodiment of the present invention. It should be noted that those skilled in the art may make modifications and adaptations without departing from the principles of the present invention, and such modifications are also intended to fall within the scope of protection of the present invention.

Claims (6)

1. A multi-mode multi-spliced RGB-D saliency target detection method, characterized by comprising the following steps:
S1, identifying the human-eye observation angle of the image, wherein different observation angles correspond to different image deflection-angle adjustment models, and adjusting the image deflection angle based on the corresponding model;
S2, dividing the image into non-overlapping sub-regions, extracting the RGB image color information, Depth image depth information and symmetric invariant LBP features of each sub-region, and forming a region histogram from the symmetric invariant LBP features;
S3, measuring the correlation of the RGB image color information, the depth information and the region histogram with class-conditional mutual information entropy, and fusing the three at score level with an adaptive score fusion algorithm to obtain the final score of each image sub-region;
and S4, detecting the salient target of the image based on the final score of each image sub-region.
2. The multi-mode multi-spliced RGB-D saliency target detection method of claim 1, wherein in step S2 the extraction of the RGB image color information, Depth image depth information and symmetric invariant LBP features of each sub-region is implemented based on a deep convolutional network.
3. The multi-mode multi-spliced RGB-D saliency target detection method of claim 1, wherein in step S2 the targets contained in the image are first detected based on the Dssd_Xception_coco model, and the image sub-regions are then divided according to the detection result.
4. The multi-mode multi-spliced RGB-D saliency target detection method of claim 3, wherein each detected target is assigned its own image sub-region and the remaining background forms one additional sub-region.
5. The multi-mode multi-spliced RGB-D saliency target detection method of claim 3, wherein the Dssd_Xception_coco model adopts the DSSD target detection algorithm: the Xception neural network is pre-trained on the COCO data set, the model is then trained with the previously prepared data set, the parameters of the deep neural network are fine-tuned, and a detection model suitable for detecting the targets in the image is finally obtained.
6. The multi-mode multi-spliced RGB-D saliency target detection method of claim 1, wherein in step S4 the salient target detection of the image is achieved according to the final score of each image sub-region based on a ResNet50 model.
CN202110461176.7A 2021-04-27 2021-04-27 Multi-mode multi-spliced RGB-D saliency target detection method Active CN113128519B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110461176.7A CN113128519B (en) 2021-04-27 2021-04-27 Multi-mode multi-spliced RGB-D saliency target detection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110461176.7A CN113128519B (en) 2021-04-27 2021-04-27 Multi-mode multi-spliced RGB-D saliency target detection method

Publications (2)

Publication Number Publication Date
CN113128519A CN113128519A (en) 2021-07-16
CN113128519B true CN113128519B (en) 2023-08-08

Family

ID=76780202

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110461176.7A Active CN113128519B (en) Multi-mode multi-spliced RGB-D saliency target detection method

Country Status (1)

Country Link
CN (1) CN113128519B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015180527A1 (en) * 2014-05-26 2015-12-03 Graduate School at Shenzhen, Tsinghua University Image saliency detection method
CN107145892A (en) * 2017-05-24 2017-09-08 Peking University Shenzhen Graduate School Image salient object detection method based on an adaptive fusion mechanism
CN107909078A (en) * 2017-10-11 2018-04-13 Tianjin University Inter-image saliency detection method
CN108345892A (en) * 2018-01-03 2018-07-31 Shenzhen University Stereoscopic-image saliency detection method, apparatus, device and storage medium
CN108846416A (en) * 2018-05-23 2018-11-20 Beijing Institute of New Technology Applications Extraction processing method and system for specific images
CN111353508A (en) * 2019-12-19 2020-06-30 South China University of Technology Saliency detection method and device based on RGB image pseudo-depth information


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Pipit Utami. A Study on Facial Expression Recognition in Assessing Teaching Skills: Datasets and Methods. Procedia Computer Science, 2019, vol. 161 (full text). *

Also Published As

Publication number Publication date
CN113128519A (en) 2021-07-16

Similar Documents

Publication Publication Date Title
US11830230B2 (en) Living body detection method based on facial recognition, and electronic device and storage medium
Borji et al. Adaptive object tracking by learning background context
JP4755202B2 (en) Face feature detection method
US11443454B2 (en) Method for estimating the pose of a camera in the frame of reference of a three-dimensional scene, device, augmented reality system and computer program therefor
Ghimire et al. A robust face detection method based on skin color and edges
Crihalmeanu et al. Enhancement and registration schemes for matching conjunctival vasculature
CN106326832B (en) Device and method for processing image based on object region
Kumano et al. Pose-invariant facial expression recognition using variable-intensity templates
CA2369163A1 (en) Automatic colour defect correction
CN110956114A (en) Face living body detection method, device, detection system and storage medium
CN111460884A (en) Multi-face recognition method based on human body tracking
CN106295640A (en) The object identification method of a kind of intelligent terminal and device
JP2002216129A (en) Face area detector, its method and computer readable recording medium
CN112926464A (en) Face living body detection method and device
CN109993090B (en) Iris center positioning method based on cascade regression forest and image gray scale features
JP2015204030A (en) Authentication device and authentication method
Supriyanti et al. Detecting pupil and iris under uncontrolled illumination using fixed-Hough circle transform
Hu et al. Fast face detection based on skin color segmentation using single chrominance Cr
CN105913069A (en) Image identification method
CN113128519B (en) Multi-mode multi-spliced RGB-D saliency target detection method
Teng et al. Leaf segmentation, its 3d position estimation and leaf classification from a few images with very close viewpoints
Asteriadis et al. Head pose estimation with one camera, in uncalibrated environments
Rurainsky et al. Eye center localization using adaptive templates
Le et al. Pedestrian lane detection in unstructured environments for assistive navigation
Abdallah et al. Different techniques of hand segmentation in the real time

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant