CN106920250B - Robot target identification and localization method and system based on RGB-D video - Google Patents

Robot target identification and localization method and system based on RGB-D video

Info

Publication number
CN106920250B
CN106920250B (application CN201710078328.9A)
Authority
CN
China
Prior art keywords
target
frame
video
positioning
candidate area
Prior art date
Legal status
Active
Application number
CN201710078328.9A
Other languages
Chinese (zh)
Other versions
CN106920250A (en)
Inventor
陶文兵 (Wenbing Tao)
李坤乾 (Kunqian Li)
Current Assignee
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date
Filing date
Publication date
Application filed by Huazhong University of Science and Technology filed Critical Huazhong University of Science and Technology
Priority to CN201710078328.9A priority Critical patent/CN106920250B/en
Publication of CN106920250A publication Critical patent/CN106920250A/en
Application granted granted Critical
Publication of CN106920250B publication Critical patent/CN106920250B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a robot target identification and localization method and system based on RGB-D video. Through target candidate extraction, recognition, reliability estimation and optimization based on temporal consistency, target segmentation, and location estimation, the target category is determined in the scene and an accurate spatial position is obtained. The present invention exploits scene depth information to enhance the spatial-hierarchy perception ability of the recognition and localization algorithms. By adopting key-frame-based long- and short-term spatio-temporal consistency constraints, the method improves video processing efficiency while guaranteeing the identity and association of targets in long-sequence target identification and localization tasks. During localization, the target is accurately segmented in the image plane and the positional consistency of the same target is evaluated in depth space, realizing cooperative target localization across multiple information modalities. The method has low computational cost, good real-time performance, and high recognition and localization accuracy, and can be applied to robot tasks based on online visual information parsing and understanding.

Description

Robot target identification and localization method and system based on RGB-D video
Technical field
The invention belongs to the technical field of computer vision, and more particularly relates to a robot target identification and localization method and system based on RGB-D video.
Background technique
In recent years, with the rapid development of robotics, machine vision techniques for object-manipulation tasks have also attracted extensive attention from researchers. Among them, the identification and accurate localization of targets is an important part of the robot vision problem and a precondition for executing subsequent tasks.
Existing target identification methods generally comprise two steps: extracting information about the target to be identified as a feature representation, and matching it against the scene to be identified. Traditional representations of the target generally rely on geometry, target appearance, or local feature extraction; such methods often suffer from poor generality, insufficient stability, and weak target abstraction capability. These representational defects in turn bring difficulties that are hard to overcome in the subsequent matching process.
After obtaining the representation of the target to be identified, target matching compares the obtained target representation with the features of the scene to be identified so as to recognize the target. Broadly speaking, existing methods fall into two classes: region-based matching and feature-based matching. Region-based matching compares information extracted from local sub-regions of the image, with a computational cost proportional to the number of sub-regions to be matched; feature-based matching compares characteristic features in the image, and its accuracy is closely related to the validity of the feature representation. Both classes place high demands on candidate-region acquisition and feature representation, yet, owing to the limitations of two-dimensional image information and of the designed features, they often perform poorly in the complex-environment recognition tasks of object manipulation.
Target localization is ubiquitous in industrial production and daily life, for example GPS in outdoor activities, military radar surveillance, and naval sonar; such equipment offers accurate localization over a very wide operating range, but at a high price. Vision-based localization systems have become a new research hotspot in recent years. According to the visual sensor used, they are mainly divided into methods based on monocular vision sensors, binocular and depth sensors, and panoramic vision sensors. Monocular vision sensors are cheap, structurally simple, and easy to calibrate, but their localization accuracy is often poor; panoramic vision sensors capture complete scene information with higher localization accuracy, but are computationally heavy, poor in real-time performance, and complex and expensive. Depth estimation based on binocular vision, or dedicated depth-acquisition devices, perceives scene distance well, with relatively simple systems and easily achieved real-time operation, and has attracted increasing attention in recent years. However, research in this field is still at an early stage, and efficient target localization methods that can process RGB-Depth video in real time are still lacking.
Owing to the high demand for depth-perception capability, most existing robot systems acquire RGB-Depth video as their visual information source; depth information provides rich cues for three-dimensional scene perception, hierarchical separation of complex targets, and localization. However, because robot operating scenes are complex, the computational complexity is high and the amount of computation large, and there is as yet no systematic, fast, and convenient method for RGB-Depth video target identification and accurate localization. Therefore, research on indoor-robot target identification and precise localization algorithms based on RGB-Depth video not only has strong research value but also very broad application prospects.
Summary of the invention
In view of the above defects or improvement needs of the prior art, the present invention provides a robot target identification and localization method and system based on RGB-D video. By processing the first-person-view RGB-Depth video acquired by the robot, real-time and accurate target identification and precise localization of the target in the robot working environment are realized, thereby assisting complex robot tasks such as target grasping. This solves the technical problem that efficient target localization methods capable of processing RGB-Depth video in real time are currently lacking.
To achieve the above object, according to one aspect of the present invention, a robot target identification and localization method based on RGB-D video is provided, comprising:
(1) obtaining an RGB-D video frame sequence of the scene in which the target to be identified and localized is located;
(2) extracting key video frames from the RGB-D video frame sequence, extracting target candidate areas from the key video frames, and filtering and screening the target candidate areas according to the depth information corresponding to each key video frame;
(3) identifying the filtered and screened target candidate areas with a deep network, and ranking the target identification results by confidence through long-sequence spatio-temporal correlation constraints and multi-frame identification consistency estimation;
(4) performing local fast segmentation on the filtered and screened target candidate areas, selecting main key video frames from the key video frames according to the confidence of the target identification results and the temporal interval relationship of each key video frame, and extending and cooperatively optimizing the segmented regions over preceding and succeeding adjacent frames;
(5) determining key feature points in the scene as localization reference points, thereupon estimating the camera view angle and camera motion, applying target-feature consistency constraints and target-position consistency constraints to the identification and segmentation results of the main key video frames, estimating the cooperative confidence of the target to be identified and localized, and performing accurate spatial localization.
Preferably, step (2) specifically comprises:
(2.1) determining, by interval sampling or a key-frame extraction method, the key video frames to be used for identifying the target to be identified and localized;
(2.2) obtaining target candidate areas in the key video frames with a confidence ranking method based on objectness priors to form a target candidate area set, obtaining the hierarchical attributes inside each target candidate area and in its neighborhood from the depth information corresponding to each key video frame, and optimizing, screening, and re-ranking the target candidate area set.
Preferably, step (3) specifically comprises:
(3.1) feeding the target candidate areas screened in step (2) into a trained target identification deep network to obtain the target identification prediction result of the key video frame corresponding to each screened target candidate area and the first confidence of each target identification prediction result;
(3.2) according to the long-sequence spatio-temporal correlation constraint, performing feature consistency evaluation on the target identification prediction results of the key video frames, evaluating the second confidence of each target identification prediction result, ranking the accumulated confidences obtained from the first confidence and the second confidence, and further filtering out the target candidate areas whose accumulated confidence is lower than a preset confidence threshold.
Preferably, step (4) specifically comprises:
(4.1) performing a fast target segmentation operation on the target candidate areas obtained in step (3.2) and their extended neighborhoods to obtain an initial segmentation of the target and determine the target boundary;
(4.2) taking short-term spatio-temporal consistency as a constraint and, based on the accumulated-confidence ranking results of step (3.2), screening out the main key video frames from the key video frames;
(4.3) taking long-term spatio-temporal consistency as a constraint, modeling the appearance of the target to be identified and localized based on the initial segmentation of step (4.1), constructing a three-dimensional graph over the main key video frames and their adjacent frames, designing a maximum-a-posteriori Markov random field energy function, optimizing the initial segmentation with a graph-cut algorithm, and extending and optimizing the single-frame target segmentation result over the frames adjacent to the key video frames.
Preferably, step (5) specifically comprises:
(5.1) for the main key video frames obtained in step (4.2), extracting multiple groups of corresponding point pairs as localization reference points according to the adjacency and field-of-view overlap relationships between the main key video frames;
(5.2) estimating the camera view-angle change from the main key video frames with overlapping fields of view, and thereupon, through geometric relationships, estimating the motion information of the camera from the depth information of the localization reference point pairs;
(5.3) evaluating the spatial position consistency of the target to be identified and localized in the main key video frames according to the measured depth information of the target in the main key video frames, the camera view angle, and the camera motion information;
(5.4) evaluating, according to the result of step (4.3), the feature consistency of the two-dimensional segmented regions of the target to be identified and localized;
(5.5) determining the spatial position of the target to be identified and localized by comprehensively evaluating the feature consistency of its two-dimensional segmented regions and its spatial position consistency.
According to another aspect of the present invention, a robot target identification and localization system based on RGB-D video is provided, comprising:
an obtaining module for obtaining an RGB-D video frame sequence of the scene in which the target to be identified and localized is located;
a filtering and screening module for extracting key video frames from the RGB-D video frame sequence, extracting target candidate areas from the key video frames, and filtering and screening the target candidate areas according to the depth information corresponding to each key video frame;
a confidence ranking module for identifying the filtered and screened target candidate areas with a deep network, and ranking the target identification results by confidence through long-sequence spatio-temporal correlation constraints and multi-frame identification consistency estimation;
an optimization module for performing local fast segmentation on the filtered and screened target candidate areas, selecting main key video frames from the key video frames according to the confidence of the target identification results and the temporal interval relationship of each key video frame, and extending and cooperatively optimizing the segmented regions over preceding and succeeding adjacent frames; wherein the optimization module is specifically realized by the following steps:
performing a fast target segmentation operation on the target candidate areas and their extended neighborhoods to obtain an initial segmentation of the target and determine the target boundary;
taking short-term spatio-temporal consistency as a constraint and, based on the accumulated-confidence ranking results, screening out the main key video frames from the key video frames;
taking long-term spatio-temporal consistency as a constraint, modeling the appearance of the target to be identified and localized based on the initial segmentation, constructing a three-dimensional graph over the main key video frames and their adjacent frames, designing a maximum-a-posteriori Markov random field energy function, optimizing the initial segmentation with a graph-cut algorithm, and extending and optimizing the single-frame target segmentation result over the frames adjacent to the key video frames;
a localization module for determining key feature points in the scene as localization reference points, thereupon estimating the camera view angle and camera motion, applying target-feature consistency constraints and target-position consistency constraints to the identification and segmentation results of the main key video frames, estimating the cooperative confidence of the target to be identified and localized, and performing accurate spatial localization.
In general, compared with the prior art, the above technical solutions conceived by the present invention mainly have the following technical advantages: the present invention exploits scene depth information to enhance the spatial-hierarchy perception ability of the recognition and localization algorithms; by adopting key-frame-based long- and short-term spatio-temporal consistency constraints, video processing efficiency is improved while the identity and association of targets in long-sequence identification and localization tasks are guaranteed. During localization, the target is accurately segmented in the image plane and the positional consistency of the same target is evaluated in depth space, realizing cooperative target localization across multiple information modalities. The method has low computational cost, good real-time performance, and high recognition and localization accuracy, and can be applied to robot tasks based on online visual information parsing and understanding.
Brief description of the drawings
Fig. 1 is a schematic diagram of the overall flow of the method according to an embodiment of the present invention;
Fig. 2 is a schematic flowchart of target identification in an embodiment of the present invention;
Fig. 3 is a schematic flowchart of accurate target localization in an embodiment of the present invention.
Specific embodiment
In order to make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is further elaborated below with reference to the accompanying drawings and embodiments. It should be appreciated that the specific embodiments described herein are merely illustrative of the present invention and are not intended to limit it. In addition, the technical features involved in the various embodiments of the present invention described below can be combined with each other as long as they do not conflict.
The method disclosed by the invention involves key-frame screening, deep-network-based target identification, segmentation, inter-frame label propagation, consistency-constrained location estimation, cooperative optimization, and other techniques, and can be directly used in robot systems that take RGB-D video as visual input, assisting the robot in completing target identification and accurate target localization tasks.
Fig. 1 is a schematic diagram of the overall flow of the method according to an embodiment of the present invention. As can be seen from Fig. 1, the method comprises two major stages, target identification and accurate target localization, with target identification being the precondition for accurate localization. The specific embodiments are as follows:
(1) obtaining an RGB-D video frame sequence of the scene in which the target to be identified and localized is located;
Preferably, in an embodiment of the invention, the RGB-D video sequence of the scene in which the target to be identified and localized is located can be acquired by a depth vision sensor such as a Kinect; alternatively, RGB picture pairs can be acquired by a binocular imaging device, and the scene depth information calculated by disparity estimation can serve as the depth channel, so as to synthesize RGB-D video as input.
(2) extracting key video frames from the RGB-D video frame sequence, extracting target candidate areas from the key video frames, and filtering and screening the target candidate areas according to the depth information corresponding to each key video frame;
(3) identifying the filtered and screened target candidate areas with a deep network, and ranking the target identification results by confidence through long-sequence spatio-temporal correlation constraints and multi-frame identification consistency estimation;
(4) performing local fast segmentation on the filtered and screened target candidate areas, selecting main key video frames from the key video frames according to the confidence of the target identification results and the temporal interval relationship of each key video frame, and extending and cooperatively optimizing the segmented regions over preceding and succeeding adjacent frames;
(5) determining key feature points in the scene as localization reference points, thereupon estimating the camera view angle and camera motion, applying target-feature consistency constraints and target-position consistency constraints to the identification and segmentation results of the main key video frames, estimating the cooperative confidence of the target to be identified and localized, and performing accurate spatial localization.
Preferably, in one embodiment of the invention, the above step (1) specifically comprises:
(1.1) acquiring the RGB-D video sequence of the scene in which the target to be identified and localized is located with a Kinect, filling holes in the depth image by neighborhood sampling and smoothing, correcting it according to the Kinect parameters and converting it to real depth information, and taking it together with the RGB data as input;
(1.2) when acquiring with a binocular device, sequentially performing camera calibration and stereo matching (e.g., extracting features, extracting corresponding points of the same physical structure, and computing disparity), and finally estimating depth through the projection model as the input of the depth channel of the video.
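Per pixel, the depth estimation of step (1.2) reduces to the pinhole projection model, depth = focal length × baseline / disparity. A minimal sketch, with illustrative parameter names not taken from the patent:

```python
def disparity_to_depth(disparity_px, focal_px, baseline_m):
    """Convert a stereo disparity (pixels) to metric depth via the
    pinhole projection model Z = f * B / d.  Parameter names are
    illustrative; the patent does not fix a specific stereo rig."""
    if disparity_px <= 0:
        # No valid correspondence: treat the point as infinitely far.
        return float("inf")
    return focal_px * baseline_m / disparity_px
```

For example, a 100-pixel disparity with a 500-pixel focal length and 10 cm baseline corresponds to a depth of about 0.5 m.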
Preferably, in one embodiment of the invention, the above step (2) specifically comprises:
(2.1) determining, by interval sampling or a key-frame extraction method, the key video frames to be used for identifying the target to be identified and localized;
Wherein, step (2.1) specifically comprises: obtaining the scene overlap ratio of consecutive frames by fast scale-invariant feature transform (SIFT) point matching, so as to estimate the current rate of scene change; for video frames in which the photographed scene switches faster, the sampling frequency is increased, and for video frames in which the scene switches more slowly, the sampling frequency is reduced. In addition, when the practical application is more demanding on algorithm efficiency, an interval sampling method can be directly adopted to substitute this step.
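The adaptive sampling policy of step (2.1) can be sketched as follows, assuming the SIFT-based scene overlap ratio of a consecutive frame pair has already been computed; the thresholds and intervals are illustrative choices, not values given by the patent:

```python
def next_sample_interval(overlap_ratio, base_interval=10,
                         min_interval=2, max_interval=30):
    """Pick the frame gap until the next key frame from the scene
    overlap ratio of consecutive frames: a fast-changing scene (low
    overlap) shortens the interval, a nearly static scene lengthens
    it.  All thresholds here are illustrative assumptions."""
    if overlap_ratio < 0.3:      # scene switching quickly -> sample densely
        return min_interval
    if overlap_ratio > 0.8:      # scene nearly static -> sample sparsely
        return max_interval
    return base_interval
```

When efficiency matters more than adaptivity, the function can simply be replaced by a constant interval, matching the fallback the embodiment describes.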
(2.2) obtaining target candidate areas in the key video frames with a confidence ranking method based on objectness priors to form a target candidate area set, obtaining the hierarchical attributes inside each target candidate area and in its neighborhood from the depth information corresponding to each key video frame, and optimizing, screening, and re-ranking the target candidate area set.
Wherein, the confidence ranking method based on objectness priors can be the BING algorithm or the Edge Boxes algorithm. As shown in Fig. 2, the depth information of the corresponding frame is then used to obtain the hierarchical attributes inside the target candidate areas and in their neighborhoods; following the principle that the depth inside a high-confidence candidate box should be smooth while the depth gradient across the box boundary should be large, the target candidate area set is optimized, screened, and re-ranked.
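The depth-based screening principle above — smooth depth inside a good candidate box, a large depth step across its boundary — can be sketched as a scoring function; the particular combination of the two terms is this sketch's own assumption, not the patent's formula:

```python
import numpy as np

def depth_consistency_score(depth, box, margin=2):
    """Score one candidate box on a depth map (higher is better).
    Rewards smooth interior depth and a large depth contrast between
    the box and a padded surround.  Note the surround window still
    contains the box, which only dilutes (never inverts) the contrast.
    Weighting is an illustrative assumption."""
    x0, y0, x1, y1 = box
    inner = depth[y0:y1, x0:x1]
    outer = depth[max(0, y0 - margin):y1 + margin,
                  max(0, x0 - margin):x1 + margin]
    smoothness = 1.0 / (1.0 + inner.std())      # flat interior -> near 1
    step = abs(outer.mean() - inner.mean())     # depth jump at the boundary
    return smoothness * (1.0 + step)
```

A box tightly enclosing a flat foreground object then outscores a box straddling the object boundary, which is exactly the re-ranking behavior the embodiment asks for.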
Preferably, in one embodiment of the invention, the above step (3) specifically comprises:
(3.1) as shown in Fig. 2, feeding the target candidate areas screened in step (2) into a trained target identification deep network to obtain the target identification prediction result of the key video frame corresponding to each screened target candidate area and the first confidence of each target identification prediction result;
Wherein, the trained target identification deep network can be a deep recognition network such as SPP-Net, R-CNN, or Fast R-CNN, and can also be substituted by other deep recognition networks.
(3.2) according to the long-sequence spatio-temporal correlation constraint, performing feature consistency evaluation on the target identification prediction results of the key video frames, evaluating the second confidence of each target identification prediction result, ranking the accumulated confidences obtained from the first confidence and the second confidence, and further filtering out the target candidate areas whose accumulated confidence is lower than a preset confidence threshold.
Optionally, in one embodiment of the invention, the detection and recognition results of the target to be localized can be obtained by issuing a recognition instruction to the algorithm, and algorithm efficiency can be improved by filtering out low-confidence recognition results.
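A minimal sketch of the accumulated-confidence ranking and thresholding of step (3.2): the patent does not fix how the first (per-frame network) and second (temporal-consistency) confidences are accumulated, so the product used here is an illustrative assumption:

```python
def rank_and_filter(detections, threshold=0.5):
    """Combine the per-frame network confidence (c1) with the
    long-sequence consistency confidence (c2) into an accumulated
    score, rank descending, and drop candidates below the threshold.
    The product combination is an assumption of this sketch."""
    scored = [(d["label"], d["c1"] * d["c2"]) for d in detections]
    scored.sort(key=lambda t: t[1], reverse=True)
    return [(label, s) for label, s in scored if s >= threshold]
```

A candidate that the network likes but that flickers across frames (high c1, low c2) is thereby suppressed, which is the point of the multi-frame consistency estimate.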
Optionally, in one embodiment of the invention, the above step (4) specifically comprises:
(4.1) as shown in Fig. 3, performing a fast target segmentation operation on the target candidate areas obtained in step (3.2) and their extended neighborhoods to obtain an initial segmentation of the target and determine the target boundary;
Wherein, as an alternative embodiment, the fast target segmentation operation can be performed with a GrabCut segmentation algorithm based on RGB-D information to obtain the initial segmentation of the target, thereby obtaining the two-dimensional localization result of the target in the current video frame.
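The RGB-D-seeded GrabCut of this embodiment can be sketched as building the initial trimap from the candidate box plus a depth prior; the label values follow OpenCV's GrabCut convention, but the depth test itself is an assumption of this sketch rather than the patent's exact rule:

```python
import numpy as np

# GrabCut-style trimap labels (OpenCV convention).
BG, FG, PR_BG, PR_FG = 0, 1, 2, 3

def init_trimap(depth, box, tol=0.3):
    """Seed a GrabCut-style segmentation: pixels inside the candidate
    box whose depth is close to the box's median depth become probable
    foreground, other in-box pixels probable background, and everything
    outside the box definite background.  The relative depth tolerance
    `tol` is an illustrative assumption."""
    trimap = np.full(depth.shape, BG, dtype=np.uint8)
    x0, y0, x1, y1 = box
    inner = depth[y0:y1, x0:x1]
    med = np.median(inner)
    close = np.abs(inner - med) < tol * med   # depth-consistent pixels
    trimap[y0:y1, x0:x1] = np.where(close, PR_FG, PR_BG)
    return trimap
```

The trimap would then be handed to an iterative GrabCut optimizer over the RGB channels; only the depth-driven initialization is shown here.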
(4.2) in order to further improve the efficiency of video target localization, as shown in Fig. 3, taking short-term spatio-temporal consistency as a constraint and, based on the accumulated-confidence ranking results of step (3.2), screening out the main key video frames from the key video frames with high single-frame recognition confidence and strong spatio-temporal consistency with adjacent frames as the criteria;
(4.3) taking long-term spatio-temporal consistency as a constraint, modeling the appearance of the target to be identified and localized based on the initial segmentation of step (4.1), constructing a three-dimensional graph over the main key video frames and their adjacent frames, designing a maximum-a-posteriori Markov random field energy function, optimizing the initial segmentation with a graph-cut algorithm, and extending the single-frame target segmentation result over the frames adjacent to the key video frames, thereby realizing two-dimensional target segmentation, localization, and optimization based on long- and short-term spatio-temporal consistency.
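A maximum-a-posteriori Markov random field of the kind step (4.3) designs is conventionally written as an energy to be minimized by graph cuts; the concrete potentials below are the standard contrast-sensitive form over the spatio-temporal pixel graph, shown as an illustration rather than the patent's exact design:

```latex
E(L) = \sum_{p \in \mathcal{P}} D_p(l_p)
     + \lambda \sum_{(p,q) \in \mathcal{N}} V_{pq}(l_p, l_q),
\qquad l_p \in \{0, 1\}
```

where $\mathcal{P}$ are the pixels of the main key frames and their adjacent frames, $\mathcal{N}$ the spatial and temporal neighbor pairs of the three-dimensional graph, $D_p(l_p) = -\log P(I_p \mid l_p)$ the negative log-likelihood under the foreground/background appearance model built from the initial segmentation, and $V_{pq}(l_p, l_q) = [l_p \neq l_q]\, \exp(-\beta \lVert I_p - I_q \rVert^2)$ a contrast-sensitive smoothness term; minimizing $E$ is equivalent to MAP inference and is solvable exactly for this binary labeling by a single graph cut.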
Optionally, in one embodiment of the invention, the above step (5) specifically comprises:
(5.1) as shown in Fig. 3, for the main key video frames obtained in step (4.2), extracting multiple groups of corresponding point pairs as localization reference points according to the adjacency and field-of-view overlap relationships between the main key video frames;
(5.2) estimating the camera view-angle change from the main key video frames with overlapping fields of view, and thereupon, through geometric relationships, estimating the motion information of the camera from the depth information of the localization reference point pairs;
Wherein, the motion information of the camera comprises the camera moving distance and motion trajectory.
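Steps (5.1)-(5.2) lift the matched reference points to 3-D with the depth channel and estimate camera motion from the lifted pairs. A minimal translation-only sketch under standard pinhole intrinsics; a full implementation would also recover rotation (e.g., via the Kabsch algorithm over the same point pairs), and the parameter names are illustrative:

```python
import numpy as np

def backproject(u, v, z, fx, fy, cx, cy):
    """Pinhole back-projection of pixel (u, v) with depth z to a
    3-D point in the camera frame."""
    return np.array([(u - cx) * z / fx, (v - cy) * z / fy, z])

def estimate_translation(pts_prev, pts_curr):
    """Estimate the inter-frame camera translation as the mean 3-D
    displacement of the depth-lifted reference-point pairs.  This is
    the translation-only special case; rotation estimation is omitted
    in this sketch."""
    return np.mean(np.asarray(pts_curr) - np.asarray(pts_prev), axis=0)
```

Accumulating these per-pair displacements over successive main key frames yields the moving distance and trajectory mentioned above.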
(5.3) as shown in Fig. 3, evaluating the spatial position consistency of the target to be identified and localized in the main key video frames according to the measured depth information of the target in the main key video frames, the camera view angle, and the camera motion information;
(5.4) evaluating, according to the result of step (4.3), the feature consistency of the two-dimensional segmented regions of the target to be identified and localized; generally, a region-based deep network is used to extract regional deep features for feature-distance measurement and feature consistency evaluation;
(5.5) determining the spatial position of the target to be identified and localized by comprehensively evaluating the feature consistency of its two-dimensional segmented regions and its spatial position consistency.
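The joint evaluation of step (5.5) can be sketched as a fusion of the two consistency scores into the cooperative confidence; the convex weighting below is an illustrative assumption, since the patent specifies comprehensive evaluation but no formula:

```python
def cooperative_confidence(feat_sim, pos_sim, w_feat=0.5):
    """Fuse the 2-D segmented-region feature consistency with the 3-D
    spatial position consistency into one cooperative localization
    confidence.  Both inputs are assumed normalized to [0, 1]; the
    weight w_feat is an illustrative choice."""
    return w_feat * feat_sim + (1.0 - w_feat) * pos_sim
```

The candidate position maximizing this score across the main key frames would then be reported as the spatial position of the target.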
In one embodiment of the invention, a robot target identification and localization system based on RGB-D video is disclosed, the system comprising:
an obtaining module for obtaining an RGB-D video frame sequence of the scene in which the target to be identified and localized is located;
a filtering and screening module for extracting key video frames from the RGB-D video frame sequence, extracting target candidate areas from the key video frames, and filtering and screening the target candidate areas according to the depth information corresponding to each key video frame;
a confidence ranking module for identifying the filtered and screened target candidate areas with a deep network, and ranking the target identification results by confidence through long-sequence spatio-temporal correlation constraints and multi-frame identification consistency estimation;
an optimization module for performing local fast segmentation on the filtered and screened target candidate areas, selecting main key video frames from the key video frames according to the confidence of the target identification results and the temporal interval relationship of each key video frame, and extending and cooperatively optimizing the segmented regions over preceding and succeeding adjacent frames;
a localization module for determining key feature points in the scene as localization reference points, thereupon estimating the camera view angle and camera motion, applying target-feature consistency constraints and target-position consistency constraints to the identification and segmentation results of the main key video frames, estimating the cooperative confidence of the target to be identified and localized, and performing accurate spatial localization.
Wherein, for the specific implementation of each module, reference may be made to the description of the method embodiments, which will not be repeated in this embodiment of the present invention.
As will be readily appreciated by those skilled in the art, the foregoing is merely a description of preferred embodiments of the present invention and is not intended to limit it; any modifications, equivalent substitutions, and improvements made within the spirit and principles of the present invention shall all fall within the protection scope of the present invention.

Claims (4)

1. A robot target identification and localization method based on RGB-D video, characterized by comprising:
(1) obtaining an RGB-D video frame sequence of the scene in which the target to be identified and localized is located;
(2) extracting key video frames from the RGB-D video frame sequence, extracting target candidate areas from the key video frames, and filtering and screening the target candidate areas according to the depth information corresponding to each key video frame;
(3.1) feeding the target candidate areas screened in step (2) into a trained target identification deep network to obtain the target identification prediction result of the key video frame corresponding to each screened target candidate area and the first confidence of each target identification prediction result;
(3.2) according to the long-sequence spatio-temporal correlation constraint, performing feature consistency evaluation on the target identification prediction results of the key video frames, evaluating the second confidence of each target identification prediction result, ranking the accumulated confidences obtained from the first confidence and the second confidence, and further filtering out the target candidate areas whose accumulated confidence is lower than a preset confidence threshold;
(4.1) performing a fast target segmentation operation on the target candidate areas obtained in step (3.2) and their extended neighborhoods to obtain an initial segmentation of the target and determine the target boundary;
(4.2) taking short-term spatio-temporal consistency as a constraint and, based on the accumulated-confidence ranking results of step (3.2), screening out the main key video frames from the key video frames;
(4.3) taking long-term spatio-temporal consistency as a constraint, modeling the appearance of the target to be identified and localized based on the initial segmentation of step (4.1), constructing a three-dimensional graph over the main key video frames and their adjacent frames, designing a maximum-a-posteriori Markov random field energy function, optimizing the initial segmentation with a graph-cut algorithm, and extending and optimizing the single-frame target segmentation result over the frames adjacent to the key video frames;
(5) determining key feature points in the scene as localization reference points, thereupon estimating the camera view angle and camera motion, applying target-feature consistency constraints and target-position consistency constraints to the identification and segmentation results of the main key video frames, estimating the cooperative confidence of the target to be identified and localized, and performing accurate spatial localization.
2. The method according to claim 1, characterized in that step (2) specifically comprises:
(2.1) determining, by interval sampling or a key frame extraction method, the key video frames used for identifying the target to be identified and located;
(2.2) forming a target candidate region set from the target candidate regions obtained in the key video frames by a confidence ranking method based on an objectness prior; obtaining the hierarchical attributes inside each target candidate region and in its neighborhood from the depth information corresponding to each key video frame; and on that basis screening, optimizing, and re-ranking the target candidate region set.
3. The method according to claim 1, characterized in that step (5) specifically comprises:
(5.1) for the primary key video frames obtained in step (4.2), extracting multiple groups of corresponding point pairs as positioning reference points, according to the adjacency and field-of-view overlap relations between the primary key video frames;
(5.2) estimating the change of camera perspective from the primary key video frames with overlapping fields of view, and then estimating the camera motion information by geometric relations, using the depth information of the positioning reference point pairs;
(5.3) evaluating the spatial position consistency of the target to be identified and located in the primary key video frames, according to its measured depth information in those frames, the camera perspective, and the camera motion information;
(5.4) evaluating, according to the results of step (4.3), the feature consistency of the two-dimensional segmentation regions of the target to be identified and located;
(5.5) determining the spatial position of the target to be identified and located by jointly evaluating the feature consistency and the spatial position consistency of its two-dimensional segmentation regions.
4. A robot target identification and positioning system based on RGB-D video, characterized by comprising:
an obtaining module, for obtaining an RGB-D video frame sequence of the scene containing the target to be identified and located;
a filtering and screening module, for extracting key video frames from the RGB-D video frame sequence, extracting target candidate regions from the key video frames, and filtering and screening the target candidate regions according to the depth information corresponding to each key video frame;
a confidence ranking module, for recognizing the filtered and screened target candidate regions with a deep network, and ranking the target recognition results by confidence through long-term spatio-temporal correlation constraints and multi-frame recognition consistency estimation;
an optimization module, for performing local fast segmentation on the filtered and screened target candidate regions, selecting primary key video frames from the key video frames according to the confidence of the target recognition results and the temporal interval relations of the key video frames, and extending and cooperatively optimizing the segmented regions over preceding and following adjacent frames; wherein the optimization module performs a fast target segmentation operation on the target candidate regions and their extended neighborhoods to obtain an initial segmentation of the target and determine the target boundary; with short-term spatio-temporal consistency as a constraint, and based on the cumulative confidence ranking results, selects primary key video frames from the key video frames; with long-term spatio-temporal consistency as a constraint, and based on the initial segmentation, performs appearance modeling of the target to be identified and located, constructs a three-dimensional graph over each primary key video frame and its adjacent frames, designs a maximum a posteriori probability-Markov random field energy function, optimizes the initial segmentation by a graph cut algorithm, and extends and refines the single-frame target segmentation results in the frames preceding and following the key video frames;
a locating module, for determining key feature points in the scene as positioning reference points, then estimating the camera perspective and a camera motion estimate, applying a target feature consistency constraint and a target position consistency constraint to the recognition and segmentation results of the primary key video frames, estimating the collaborative confidence of the target to be identified and located, and performing accurate spatial positioning.
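The cumulative confidence ranking of steps (3.1)-(3.2) can be sketched as follows. The claims only state that the cumulative confidence is "obtained by" the first and second confidences; the product combination, the function name, and the sample region labels below are illustrative assumptions.

```python
def rank_by_cumulative_confidence(predictions, threshold):
    """predictions: list of (region_id, first_conf, second_conf), where
    first_conf comes from the recognition deep network and second_conf
    from the long-term feature consistency evaluation.
    Combines the two confidences into a cumulative score (here their
    product, an illustrative choice), sorts descending, and discards
    regions whose cumulative confidence is below the preset threshold."""
    scored = [(region_id, c1 * c2) for region_id, c1, c2 in predictions]
    scored.sort(key=lambda item: item[1], reverse=True)
    return [(region_id, s) for region_id, s in scored if s >= threshold]
```

For example, a candidate with a high recognition score but poor cross-frame consistency (such as a shadow misdetected in a single frame) falls below the threshold and is filtered out, while consistently recognized regions survive.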
CN201710078328.9A 2017-02-14 2017-02-14 Robot target identification and localization method and system based on RGB-D video Active CN106920250B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710078328.9A CN106920250B (en) 2017-02-14 2017-02-14 Robot target identification and localization method and system based on RGB-D video


Publications (2)

Publication Number Publication Date
CN106920250A CN106920250A (en) 2017-07-04
CN106920250B true CN106920250B (en) 2019-08-13

Family

ID=59453597

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710078328.9A Active CN106920250B (en) 2017-02-14 2017-02-14 Robot target identification and localization method and system based on RGB-D video

Country Status (1)

Country Link
CN (1) CN106920250B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108214487B (en) * 2017-12-16 2021-07-20 广西电网有限责任公司电力科学研究院 Robot target positioning and grabbing method based on binocular vision and laser radar
CN109977981B (en) * 2017-12-27 2020-11-24 深圳市优必选科技有限公司 Scene analysis method based on binocular vision, robot and storage device
CN108304808B (en) * 2018-02-06 2021-08-17 广东顺德西安交通大学研究院 Monitoring video object detection method based on temporal-spatial information and deep network
CN108627816A (en) * 2018-02-28 2018-10-09 沈阳上博智像科技有限公司 Image distance measuring method, device, storage medium and electronic equipment
CN108460790A (en) * 2018-03-29 2018-08-28 西南科技大学 A kind of visual tracking method based on consistency fallout predictor model
CN108981698B (en) * 2018-05-29 2020-07-14 杭州视氪科技有限公司 Visual positioning method based on multi-mode data
CN110675421B (en) * 2019-08-30 2022-03-15 电子科技大学 Depth image collaborative segmentation method based on few labeling frames
CN115091472B (en) * 2022-08-26 2022-11-22 珠海市南特金属科技股份有限公司 Target positioning method based on artificial intelligence and clamping manipulator control system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104598890A (en) * 2015-01-30 2015-05-06 南京邮电大学 Human body behavior recognizing method based on RGB-D video
CN104867161A (en) * 2015-05-14 2015-08-26 国家电网公司 Video-processing method and device
CN105589974A (en) * 2016-02-04 2016-05-18 通号通信信息集团有限公司 Surveillance video retrieval method and system based on Hadoop platform
CN105931270A (en) * 2016-04-27 2016-09-07 石家庄铁道大学 Video keyframe extraction method based on movement trajectory analysis

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20110007806A (en) * 2009-07-17 2011-01-25 삼성전자주식회사 Apparatus and method for detecting hand motion using a camera
US20160132754A1 (en) * 2012-05-25 2016-05-12 The Johns Hopkins University Integrated real-time tracking system for normal and anomaly tracking and the methods therefor


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Battlefield Video Target Mining; Zhongwei Guo et al.; International Congress on Image & Signal Processing; 30 Nov. 2010; full text
Confidence-driven infrared target detection; Zhang Zhiguo, Liu Liman, Tao Wenbing et al.; Infrared Physics & Technology; May 2014; full text

Also Published As

Publication number Publication date
CN106920250A (en) 2017-07-04

Similar Documents

Publication Publication Date Title
CN106920250B (en) Robot target identification and localization method and system based on RGB-D video
CN107093171B (en) Image processing method, device and system
CN104715471B (en) Target locating method and its device
Čech et al. Scene flow estimation by growing correspondence seeds
CN109544456A (en) The panorama environment perception method merged based on two dimensional image and three dimensional point cloud
CN110570457B (en) Three-dimensional object detection and tracking method based on stream data
CN104517095B (en) A kind of number of people dividing method based on depth image
CN103458261B (en) Video scene variation detection method based on stereoscopic vision
KR101139389B1 (en) Video Analysing Apparatus and Method Using Stereo Cameras
CN104156937A (en) Shadow detection method and device
CN107560592A (en) A kind of precision ranging method for optronic tracker linkage target
TWI686748B (en) People-flow analysis system and people-flow analysis method
CN106295640A (en) The object identification method of a kind of intelligent terminal and device
KR20160109761A (en) Method and System for Recognition/Tracking Construction Equipment and Workers Using Construction-Site-Customized Image Processing
US20210304435A1 (en) Multi-view positioning using reflections
US20170147609A1 (en) Method for analyzing and searching 3d models
CN112017188B (en) Space non-cooperative target semantic recognition and reconstruction method
CN110415297A (en) Localization method, device and unmanned equipment
CN109409250A (en) A kind of across the video camera pedestrian of no overlap ken recognition methods again based on deep learning
CN108399630B (en) Method for quickly measuring distance of target in region of interest in complex scene
CN103679699A (en) Stereo matching method based on translation and combined measurement of salient images
CN112581495A (en) Image processing method, device, equipment and storage medium
CN114022531A (en) Image processing method, electronic device, and storage medium
CN104504162B (en) A kind of video retrieval method based on robot vision platform
Ibisch et al. Arbitrary object localization and tracking via multiple-camera surveillance system embedded in a parking garage

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant