CN113128519B - Multi-mode multi-spliced RGB-D saliency target detection method - Google Patents

Multi-mode multi-spliced RGB-D saliency target detection method

Info

Publication number
CN113128519B
CN113128519B CN202110461176.7A
Authority
CN
China
Prior art keywords
image
rgb
depth
detection method
target detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110461176.7A
Other languages
Chinese (zh)
Other versions
CN113128519A (en)
Inventor
陈莉 (Chen Li)
赵志华 (Zhao Zhihua)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NORTHWEST UNIVERSITY
Original Assignee
NORTHWEST UNIVERSITY
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NORTHWEST UNIVERSITY filed Critical NORTHWEST UNIVERSITY
Priority to CN202110461176.7A priority Critical patent/CN113128519B/en
Publication of CN113128519A publication Critical patent/CN113128519A/en
Application granted granted Critical
Publication of CN113128519B publication Critical patent/CN113128519B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462Salient features, e.g. scale invariant feature transforms [SIFT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/50Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • G06V10/507Summing image-intensity values; Histogram projection analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/56Extraction of image or video features relating to colour

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a multi-mode multi-spliced RGB-D saliency target detection method, which comprises the following steps: S1, dividing the image into non-overlapping sub-regions, extracting the RGB image color information, Depth image depth information and symmetric invariant LBP features of each sub-region, and forming a region histogram from the symmetric invariant LBP features; S2, measuring the correlation of the RGB image color information, the depth information and the region histogram with class-conditional mutual information entropy, and fusing the three at score level with an adaptive score fusion algorithm to obtain the final score of each image sub-region; and S3, detecting the salient target of the image based on the final score of each image sub-region. The invention detects salient targets rapidly and efficiently, and improves the precision of saliency detection.

Description

Multi-mode multi-spliced RGB-D saliency target detection method
Technical Field
The invention relates to the field of image detection, in particular to a multi-mode multi-spliced RGB-D saliency target detection method.
Background
Salient object detection is an important component of computer vision; with the continuous development of the field, detection methods with higher efficiency and better accuracy are urgently needed.
During the development of saliency detection, various methods have evolved that exploit the color features, position information, texture features, etc. of images. Some conventional methods use center priors, edge priors, semantic priors and the like. However, because the color scenes in images can be very complex, these models often fail when there is no obvious contrast between the object and the background, and such features struggle to distinguish similar objects.
Disclosure of Invention
In order to solve the above problems, the invention provides a multi-mode multi-spliced RGB-D saliency target detection method, which detects salient targets rapidly and efficiently and improves the precision of saliency detection.
At the same time, the complementarity of the visible-light camera and the near-infrared camera is exploited: features are extracted from the visible-light face picture with a deep learning algorithm, and the features extracted by the deep learning model are then fused hierarchically with a fusion algorithm, so as to achieve the effect of complementary advantages.
In order to achieve the above purpose, the technical scheme adopted by the invention is as follows:
a multi-mode multi-spliced RGB-D significance target detection method comprises the following steps:
s1, dividing an image into non-overlapping subareas, respectively extracting RGB image color information, depth image Depth information and symmetrical invariant LBP characteristics of each image subarea, and forming a region histogram based on the symmetrical invariant LBP characteristics;
s2, measuring the correlation of RGB image color information, depth image Depth information and region histogram based on class condition mutual information entropy, and utilizing a self-adaptive score fusion algorithm to realize fusion of the RGB image color information, the Depth image Depth information and the region histogram in a score level, so as to obtain the final score of each image subarea;
and S3, detecting the saliency target of the image based on the final score of each image subarea.
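Step S1 can be illustrated with a short sketch: the image is divided into a grid of non-overlapping sub-regions, and a normalised histogram of center-symmetric LBP codes is built per region. The patent does not define its "symmetric invariant LBP", so center-symmetric LBP (CS-LBP) is used here as a plausible stand-in; the grid layout, 16-bin histogram and zero threshold are illustrative assumptions, not the patented configuration.

```python
import numpy as np

def cs_lbp(gray, threshold=0.0):
    """Center-symmetric LBP: compare the 4 opposite neighbour pairs around
    each interior pixel, yielding a 4-bit code (16 possible patterns)."""
    g = gray.astype(np.float64)
    centre = g[1:-1, 1:-1]  # interior pixels only; used for the output shape
    # the 4 centre-symmetric neighbour pairs: (N,S), (NE,SW), (E,W), (SE,NW)
    pairs = [
        (g[:-2, 1:-1], g[2:, 1:-1]),
        (g[:-2, 2:],   g[2:, :-2]),
        (g[1:-1, 2:],  g[1:-1, :-2]),
        (g[2:, 2:],    g[:-2, :-2]),
    ]
    code = np.zeros(centre.shape, dtype=np.uint8)
    for bit, (a, b) in enumerate(pairs):
        code |= ((a - b) > threshold).astype(np.uint8) << bit
    return code

def region_histogram(gray_region, n_bins=16):
    """Normalised histogram of CS-LBP codes over one sub-region."""
    codes = cs_lbp(gray_region)
    hist, _ = np.histogram(codes, bins=n_bins, range=(0, n_bins))
    return hist / max(hist.sum(), 1)

def split_regions(img, rows, cols):
    """Divide an image into a rows x cols grid of non-overlapping sub-regions."""
    h, w = img.shape[:2]
    return [img[i * h // rows:(i + 1) * h // rows,
                j * w // cols:(j + 1) * w // cols]
            for i in range(rows) for j in range(cols)]
```

A regular grid is used here only for simplicity; in the patent, the sub-regions come from per-target detection rather than a fixed grid.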
Further, in the step S1, the extraction of the RGB image color information, Depth image depth information and symmetric invariant LBP features of each sub-region is implemented based on a deep convolutional network.
Further, in the step S1, the targets contained in the image are first detected based on the Dssd_Xception_coco model, and the image sub-regions are then divided according to the detection result.
Further, each detected target is assigned its own image sub-region, and the remaining background forms one additional sub-region.
Further, the Dssd_Xception_coco model adopts the DSSD target detection algorithm: the Xception neural network is pre-trained on the COCO data set, the model is then trained with the previously prepared data set, the parameters of the deep neural network are fine-tuned, and a detection model suitable for detecting each target in the image is finally obtained.
Further, in the step S3, the salient target detection of the image is implemented according to the final score of each image sub-region based on the ResNet50 model.
Further, the method further comprises the step of identifying the human-eye observation angle of the image, wherein different observation angles correspond to different image deflection-angle adjustment models, and the image deflection angle is adjusted based on the corresponding model.
The invention has the following beneficial effects:
the detection of the saliency target can be realized quickly and efficiently, and the precision of the saliency detection can be realized.
Drawings
FIG. 1 is a flowchart of a multi-mode multi-spliced RGB-D saliency target detection method according to embodiment 1 of the present invention.
Fig. 2 is a flowchart of a multi-mode multi-spliced RGB-D saliency target detection method according to embodiment 2 of the present invention.
Detailed Description
The present invention will be described in further detail with reference to examples in order to make the objects and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
Example 1
A multi-mode multi-spliced RGB-D saliency target detection method comprises the following steps:
S1, dividing the image into non-overlapping sub-regions, extracting the RGB image color information, Depth image depth information and symmetric invariant LBP features of each sub-region, and forming a region histogram from the symmetric invariant LBP features;
S2, measuring the correlation of the RGB image color information, the depth information and the region histogram with class-conditional mutual information entropy, and fusing the three at score level with an adaptive score fusion algorithm to obtain the final score of each image sub-region;
and S3, implementing the salient target detection of the image according to the final score of each image sub-region based on the ResNet50 model.
In this embodiment, in the step S1, the extraction of the RGB image color information, Depth image depth information and symmetric invariant LBP features of each sub-region is implemented based on a deep convolutional network.
In this embodiment, in step S1, the targets contained in the image are first detected based on the Dssd_Xception_coco model, and the image sub-regions are then divided according to the detection result. Each detected target is assigned its own image sub-region, and the remaining background forms one additional sub-region. The Dssd_Xception_coco model adopts the DSSD target detection algorithm: the Xception neural network is pre-trained on the COCO data set, the model is then trained with the previously prepared data set, the parameters of the deep neural network are fine-tuned, and a detection model suitable for detecting each target (such as a person, a tree, furniture and the like) in the image is finally obtained.
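The adaptive score fusion of step S2 can be sketched as follows: each modality's per-region scores are quantised, the mutual information between the quantised scores and the region class labels is estimated, and the normalised mutual-information values serve as fusion weights. The patent gives no formulas for its "class condition mutual information entropy" fusion, so the quantile-based quantisation and linear weighting below are assumptions, one plausible reading only.

```python
import numpy as np

def mutual_information(x_bins, y_bins):
    """Estimate I(X;Y) in nats from two discrete (integer-coded) arrays."""
    joint, _, _ = np.histogram2d(x_bins, y_bins,
                                 bins=(int(x_bins.max()) + 1,
                                       int(y_bins.max()) + 1))
    p = joint / joint.sum()
    px = p.sum(axis=1, keepdims=True)   # marginal of X
    py = p.sum(axis=0, keepdims=True)   # marginal of Y
    nz = p > 0                          # skip zero cells to avoid log(0)
    return float((p[nz] * np.log(p[nz] / (px @ py)[nz])).sum())

def adaptive_score_fusion(scores, labels):
    """scores: dict of modality name -> per-region score array;
    labels: integer class label per region (e.g. salient / non-salient).
    Each modality is weighted by the mutual information between its
    quantised scores and the labels; the fused score is the weighted sum."""
    weights = {}
    for name, s in scores.items():
        edges = np.quantile(s, [0.25, 0.5, 0.75])
        q = np.searchsorted(edges, s)          # quantise scores to 4 bins
        weights[name] = mutual_information(q, labels)
    total = sum(weights.values()) or 1.0       # guard against all-zero MI
    weights = {n: w / total for n, w in weights.items()}
    fused = sum(w * scores[n] for n, w in weights.items())
    return fused, weights
```

An uninformative modality (scores unrelated to the labels) receives near-zero mutual information and therefore near-zero weight, which is the intuition behind letting the fusion adapt per image.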
Example 2
A multi-mode multi-spliced RGB-D saliency target detection method comprises the following steps:
S1, identifying the human-eye observation angle of the image, wherein different observation angles correspond to different image deflection-angle adjustment models, and adjusting the image deflection angle based on the corresponding model;
S2, dividing the image into non-overlapping sub-regions, extracting the RGB image color information, Depth image depth information and symmetric invariant LBP features of each sub-region, and forming a region histogram from the symmetric invariant LBP features;
S3, measuring the correlation of the RGB image color information, the depth information and the region histogram with class-conditional mutual information entropy, and fusing the three at score level with an adaptive score fusion algorithm to obtain the final score of each image sub-region;
and S4, implementing the salient target detection of the image according to the final score of each image sub-region based on the ResNet50 model.
In this embodiment, in the step S2, the extraction of the RGB image color information, Depth image depth information and symmetric invariant LBP features of each sub-region is implemented based on a deep convolutional network.
In this embodiment, in step S2, the targets contained in the image are first detected based on the Dssd_Xception_coco model, the image sub-regions are then divided according to the detection result, each detected target is assigned its own image sub-region, and the remaining background forms one additional sub-region. The Dssd_Xception_coco model adopts the DSSD target detection algorithm: the Xception neural network is pre-trained on the COCO data set, the model is then trained with the previously prepared data set, the parameters of the deep neural network are fine-tuned, and a detection model suitable for detecting each target (such as a person, a tree, furniture and the like) in the image is finally obtained.
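For the final detection step, once every sub-region has its fused score, a full-resolution saliency map can be assembled by painting each region with its score and min-max normalising; thresholding this map yields the detected salient-object mask. This grid-painting sketch is an illustration only: in the patent the final detection uses a ResNet50 model, which is not reproduced here, and the grid layout and 0.5 threshold are assumptions.

```python
import numpy as np

def saliency_map_from_scores(shape, rows, cols, scores):
    """Paint each cell of a rows x cols grid with its region's fused score,
    then min-max normalise the map to [0, 1]."""
    h, w = shape
    sal = np.zeros((h, w), dtype=np.float64)
    for i in range(rows):
        for j in range(cols):
            sal[i * h // rows:(i + 1) * h // rows,
                j * w // cols:(j + 1) * w // cols] = scores[i * cols + j]
    lo, hi = sal.min(), sal.max()
    return (sal - lo) / (hi - lo) if hi > lo else sal

# Thresholding the normalised map gives the detected salient-object mask:
# mask = saliency_map_from_scores((64, 64), 2, 2, [0.1, 0.9, 0.2, 0.3]) > 0.5
```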
The foregoing is merely a preferred embodiment of the present invention. It should be noted that those skilled in the art may make modifications and adaptations without departing from the principles of the present invention, and such modifications are also intended to fall within the scope of protection of the present invention.

Claims (6)

1. A multi-mode multi-spliced RGB-D saliency target detection method, characterized by comprising the following steps:
S1, identifying the human-eye observation angle of the image, wherein different observation angles correspond to different image deflection-angle adjustment models, and adjusting the image deflection angle based on the corresponding model;
S2, dividing the image into non-overlapping sub-regions, extracting the RGB image color information, Depth image depth information and symmetric invariant LBP features of each sub-region, and forming a region histogram from the symmetric invariant LBP features;
S3, measuring the correlation of the RGB image color information, the depth information and the region histogram with class-conditional mutual information entropy, and fusing the three at score level with an adaptive score fusion algorithm to obtain the final score of each image sub-region;
and S4, detecting the salient target of the image based on the final score of each image sub-region.
2. The multi-mode multi-spliced RGB-D saliency target detection method of claim 1, wherein in step S2 the extraction of the RGB image color information, Depth image depth information and symmetric invariant LBP features of each sub-region is implemented based on a deep convolutional network.
3. The multi-mode multi-spliced RGB-D saliency target detection method of claim 1, wherein in step S2 the targets contained in the image are first detected based on the Dssd_Xception_coco model, and the image sub-regions are then divided according to the detection result.
4. The multi-mode multi-spliced RGB-D saliency target detection method of claim 3, wherein each detected target is assigned its own image sub-region and the remaining background forms one additional sub-region.
5. The multi-mode multi-spliced RGB-D saliency target detection method of claim 3, wherein the Dssd_Xception_coco model adopts the DSSD target detection algorithm: the Xception neural network is pre-trained on the COCO data set, the model is then trained with the previously prepared data set, the parameters of the deep neural network are fine-tuned, and a detection model suitable for detecting the targets in the image is finally obtained.
6. The multi-mode multi-spliced RGB-D saliency target detection method of claim 1, wherein in step S4 the salient target detection of the image is achieved according to the final score of each image sub-region based on a ResNet50 model.
CN202110461176.7A 2021-04-27 2021-04-27 Multi-mode multi-spliced RGB-D saliency target detection method Active CN113128519B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110461176.7A CN113128519B (en) 2021-04-27 2021-04-27 Multi-mode multi-spliced RGB-D saliency target detection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110461176.7A CN113128519B (en) 2021-04-27 2021-04-27 Multi-mode multi-spliced RGB-D saliency target detection method

Publications (2)

Publication Number Publication Date
CN113128519A CN113128519A (en) 2021-07-16
CN113128519B true CN113128519B (en) 2023-08-08

Family

ID=76780202

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110461176.7A Active CN113128519B (en) Multi-mode multi-spliced RGB-D saliency target detection method

Country Status (1)

Country Link
CN (1) CN113128519B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015180527A1 (en) * 2014-05-26 2015-12-03 Graduate School at Shenzhen, Tsinghua University Image saliency detection method
CN107145892A (en) * 2017-05-24 2017-09-08 Peking University Shenzhen Graduate School Image salient object detection method based on an adaptive fusion mechanism
CN107909078A (en) * 2017-10-11 2018-04-13 Tianjin University Inter-image saliency detection method
CN108345892A (en) * 2018-01-03 2018-07-31 Shenzhen University Stereoscopic-image saliency detection method, apparatus, device and storage medium
CN108846416A (en) * 2018-05-23 2018-11-20 Beijing Institute of New Technology Applications Extraction processing method and system for specific images
CN111353508A (en) * 2019-12-19 2020-06-30 South China University of Technology Saliency detection method and device based on RGB image pseudo-depth information


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Pipit Utami. A Study on Facial Expression Recognition in Assessing Teaching Skills: Datasets and Methods. Procedia Computer Science, 2019, vol. 161 (full text). *

Also Published As

Publication number Publication date
CN113128519A (en) 2021-07-16

Similar Documents

Publication Publication Date Title
US11830230B2 (en) Living body detection method based on facial recognition, and electronic device and storage medium
Borji et al. Adaptive object tracking by learning background context
JP4755202B2 (en) Face feature detection method
US11443454B2 (en) Method for estimating the pose of a camera in the frame of reference of a three-dimensional scene, device, augmented reality system and computer program therefor
Ghimire et al. A robust face detection method based on skin color and edges
Crihalmeanu et al. Enhancement and registration schemes for matching conjunctival vasculature
CN106326832B (en) Device and method for processing image based on object region
Kumano et al. Pose-invariant facial expression recognition using variable-intensity templates
CA2369163A1 (en) Automatic colour defect correction
CN110956114A (en) Face living body detection method, device, detection system and storage medium
CN111460884A (en) Multi-face recognition method based on human body tracking
CN106295640A (en) The object identification method of a kind of intelligent terminal and device
JP2002216129A (en) Face area detector, its method and computer readable recording medium
CN112926464A (en) Face living body detection method and device
CN109993090B (en) Iris center positioning method based on cascade regression forest and image gray scale features
JP2015204030A (en) Authentication device and authentication method
Supriyanti et al. Detecting pupil and iris under uncontrolled illumination using fixed-Hough circle transform
Hu et al. Fast face detection based on skin color segmentation using single chrominance Cr
CN105913069A (en) Image identification method
CN113128519B (en) Multi-mode multi-spliced RGB-D saliency target detection method
Teng et al. Leaf segmentation, its 3d position estimation and leaf classification from a few images with very close viewpoints
Asteriadis et al. Head pose estimation with one camera, in uncalibrated environments
Rurainsky et al. Eye center localization using adaptive templates
Le et al. Pedestrian lane detection in unstructured environments for assistive navigation
Abdallah et al. Different techniques of hand segmentation in the real time

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant