CN111191650A - Object positioning method and system based on RGB-D image visual saliency - Google Patents
- Publication number
- CN111191650A CN111191650A CN202010003692.0A CN202010003692A CN111191650A CN 111191650 A CN111191650 A CN 111191650A CN 202010003692 A CN202010003692 A CN 202010003692A CN 111191650 A CN111191650 A CN 111191650A
- Authority
- CN
- China
- Prior art keywords
- rgb
- image
- visual saliency
- saliency
- salient
- Prior art date
- Legal status: Granted (the legal status is an assumption, not a legal conclusion)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J19/00—Accessories fitted to manipulators, e.g. for monitoring, for viewing; Safety devices combined with or specially adapted for use in connection with manipulators
- B25J19/02—Sensing devices
- B25J19/021—Optical sensing devices
- B25J19/023—Optical sensing devices including video camera means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Robotics (AREA)
- Mechanical Engineering (AREA)
- Image Analysis (AREA)
- Manipulator (AREA)
Abstract
An article positioning method and system based on RGB-D image visual saliency. The system mainly comprises a camera, a mechanical arm and an operation table: the articles to be grabbed are stacked on the operation table, the mechanical arm is a UR5, and the operation table is a horizontal panel. At system initialization, the camera calibrates the operation table, providing a reference plane for the mechanical arm to position and grab articles. First, the camera acquires an RGB-D image of the operation-table scene; then a visual saliency map, i.e. the RGB-D image saliency map, is computed from the RGB-D image; finally, articles are positioned based on the visual saliency map and mechanical-arm operation information is provided. The pixel-level visual saliency map and the position information of salient objects can be generated simultaneously, supporting multiple manipulator operations.
Description
Technical Field
The invention relates to the field of computer-vision target positioning, and in particular to an article positioning method and system based on RGB-D image visual saliency.
Background
When the scene is complex, especially when many kinds of items are scattered about, quickly positioning items with a vision-based mechanical arm is a challenging task. Whether the mechanical arm successfully grabs a scene object depends on the chosen grabbing order, i.e. the type and position of the object best suited for grabbing in the current scene must be judged.
Mechanical-arm grabbing based on visual perception suits general article-stacking scenes such as logistics warehouses: it can replace manual sorting and enable fully automatic, intelligent logistics management for unmanned factories, unmanned warehouses and the like.
Currently, mechanical-arm article-grabbing application systems generally collect scene visual information with an RGB-D camera. A feedback map (affordance map) is computed from the RGB-D image, and suitable operating points are located on it. If the feedback map contains no suitable point, a deep reinforcement learning strategy actively tries to change the spatial distribution of the scene objects, repeating until the feedback map yields a suitable point; consequently the grasping success rate is not high.
When the scene is complex, objects overlap, stack and occlude one another, and the feedback-map method cannot identify the best operating point; the placement of objects in the scene must then be actively disturbed, i.e. a method combining the feedback map with reinforcement learning is adopted. However, because the consequences of active intervention must be assessed by reinforcement learning, the risk may become uncontrolled, e.g. the system may fall into an ineffective dead loop. Object positioning based on a feedback map with a reinforcement-learning mechanism therefore suffers from complexity, uncontrollability, high computational cost and other defects.
Therefore, how a mechanical arm can quickly position and clamp articles in a complex scene where many kinds of articles are distributed in disorder, that is, how to devise a new, fast, convenient and flexible method with low computational cost, has become a technical problem to be solved urgently.
Disclosure of Invention
To develop a flexible and effective positioning method and system for rapid scene-object positioning, the invention provides a positioning method that rapidly analyses scenes based on visual saliency, simulating the human visual-attention mechanism. Visual-saliency analysis can accomplish specific visual tasks using prior knowledge, rules and the like. The human visual-attention mechanism browses a scene quickly according to degree of saliency: the first area or target noticed is often related to one's experience and purpose, and also to the relative saliency of that area or target in the scene. How to realise article positioning based on visual saliency is therefore a technical difficulty that must be overcome. To solve this problem, the invention ranks scene articles by visual saliency computed with semantic information, and drives distributed grabbing by the mechanical arm according to the visual saliency values.
To solve the above technical problem, according to an aspect of the present invention, there is provided an article positioning method based on RGB-D image visual saliency, comprising the following steps:
acquiring an RGB-D image of an operation platform scene by a camera;
secondly, calculating a visual saliency map based on the RGB-D image, namely the RGB-D image saliency map;
thirdly, positioning the articles based on the visual saliency map and providing mechanical arm article operation information;
and in step two, visual saliency detection is performed on the RGB-D image and the visual saliency map is calculated, as shown in formula (1):

p(z_S | I_RGB-D) = p(z_S | x_c, x_d) = p(z_S, x_c, x_d) / p(x_c, x_d)    (1)

where p(z_S | I_RGB-D) represents the visual saliency of the current scene, i.e. the visual saliency map, defined as the probability p(z_S | x_c, x_d) that a pixel of the RGB-D image is salient; I_RGB-D denotes the RGB-D image; x_c and x_d denote the RGB-image and Depth-image salient features respectively, each extracted with a CNN; p(z_S, x_c, x_d) denotes their joint probability distribution and p(x_c, x_d) the salient-feature probability distribution. The visualization is rendered as a heat map: the larger the saliency value, the warmer the colour, and the smaller the saliency value, the colder the colour;
based on the RGB-D image saliency map, carrying out salient object position estimation, as shown in formula (6):
wherein O represents the salient object position coordinates, zSRepresenting the visual saliency of a salient object; p (O, z)s|IRGB-D) Representing a joint distribution of salient objects and visual saliency, p (I)RGB-D|O,zs) Representing the distribution of the RGB-D image saliency map, p (O, z), given target coordinates and visual saliencys) Representing the joint distribution of objects and visual saliency, p (I)RGB-D) Representing the RGB-D image feature distribution.
Preferably, when z_S is given, O and I_RGB-D are conditionally independent, which yields formula (7):

p(I_RGB-D | O, z_s) = p(I_RGB-D | z_s)    (7)

When the posterior probability of visual saliency of the target region satisfies the salient-target constraint, and with the image feature distribution unchanged, formula (6) is approximately transformed into formula (8):

p(O, z_s | I_RGB-D) ∝ p(z_s | I_RGB-D) L(O) C(O, z_s)    (8)

where L(O) denotes the target detection region and C(O, z_s) the constraint, defined over the detected target regions b_j, each of which is required to be visually salient; L(O) is obtained by detecting target regions in the RGB image with a target detection algorithm, namely the Faster R-CNN algorithm.
Preferably, the camera is a Kinect camera.
Preferably, the adopted manipulator has two operating functions, suction and clamping; when the scene is complex, i.e. articles are stacked together and heavily occluded, the saliency map is generated as a pixel-level distribution, supporting the manipulator's 'suck' operation; when the scene provides target-detection rectangular regions, salient-target rectangles can be obtained from the saliency map, supporting the manipulator's 'clamp' operation.
Preferably, an operation stop threshold is set on the saliency value; manipulator operations are driven in descending order of visual saliency until the saliency value falls below the threshold, whereupon the visual saliency of the scene's RGB-D image is recalculated.
To solve the above technical problem, according to another aspect of the present invention, there is provided an article positioning system based on RGB-D image visual saliency using the method of claim 1, comprising: the system comprises a camera, a mechanical arm and an operating platform, wherein articles to be grabbed are stacked on the operating platform, the mechanical arm is a UR5 mechanical arm, the operating platform is a horizontal panel, and when the system is initialized, the camera corrects the operating platform and provides a reference plane for positioning the articles by the mechanical arm and grabbing the articles by the mechanical arm.
Preferably, when z_S is given, O and I_RGB-D are conditionally independent, which yields formula (7):

p(I_RGB-D | O, z_s) = p(I_RGB-D | z_s)    (7)

When the posterior probability of visual saliency of the target region satisfies the salient-target constraint, and with the image feature distribution unchanged, formula (6) is approximately transformed into formula (8):

p(O, z_s | I_RGB-D) ∝ p(z_s | I_RGB-D) L(O) C(O, z_s)    (8)

where L(O) denotes the target detection region and C(O, z_s) the constraint, defined over the detected target regions b_j, each of which is required to be visually salient; L(O) is obtained by detecting target regions in the RGB image with a target detection algorithm, namely the Faster R-CNN algorithm.
Preferably, the camera is a Kinect camera.
Preferably, the adopted manipulator has two operating functions, suction and clamping; when the scene is complex, i.e. articles are stacked together and heavily occluded, the saliency map is generated as a pixel-level distribution, supporting the manipulator's 'suck' operation; when the scene provides target-detection rectangular regions, salient-target rectangles can be obtained from the saliency map, supporting the manipulator's 'clamp' operation.
Preferably, an operation stop threshold is set on the saliency value; manipulator operations are driven in descending order of visual saliency until the saliency value falls below the threshold, whereupon the visual saliency of the scene's RGB-D image is recalculated.
The invention has the beneficial effects that:
1. The article positioning method based on RGB-D image visual saliency uses visual saliency as the mechanical arm's criterion for selecting which article to position, avoiding the complexity of learning a deep-reinforcement-learning strategy.
2. The pixel-level visual saliency map and the salient-target position information can be generated simultaneously, supporting multiple manipulator operations and overcoming the limitation of the interpretation basis.
3. The problem of learning an article-positioning-order strategy is simplified, and the article-positioning-order criterion is general and universal.
4. Article selection based on visual saliency only requires sorting the visual saliency values to determine priority, without special training for a particular scene. Visual saliency reflects the attention order of scene articles and serves as the basis for the mechanical arm to select and position articles. Only scene objects need to be detected; no scene-specific reinforcement learning is required.
5. The proposed estimation method accurately estimates salient-target positions, providing a basis for the subsequent sequential positioning of articles, improving system operating efficiency, increasing the operation success rate and effectively reducing operation time.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate exemplary embodiments of the invention and together with the description serve to explain the principles of the invention. The above and other objects, features and advantages of the present invention will become more apparent from the detailed description of the embodiments of the present invention when taken in conjunction with the accompanying drawings.
FIG. 1 is a robotic arm item positioning system;
FIG. 2 is a block diagram of a method for positioning items based on visual saliency facing the operation of a robotic arm;
FIG. 3 is a diagram of the operation of the robotic arm to provide for sequencing item prioritization based on saliency maps;
FIG. 4 is a drawing of the test experiment "suck" operation;
FIG. 5 is a diagram of the test experiment "clamp" operation.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and embodiments. It is to be understood that the specific embodiments described herein are for purposes of illustration only and are not to be construed as limitations of the invention. It should be noted that, for convenience of description, only the portions related to the present invention are shown in the drawings.
In addition, the embodiments of the present invention and the features of the embodiments may be combined with each other without conflict. The present invention will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
As shown in fig. 1, the mechanical-arm article positioning system based on RGB-D image visual saliency comprises a Kinect camera, a mechanical arm, a manipulator and an operation table. The articles to be grabbed are stacked on the operation table; the mechanical arm is a UR5 and the operation table is a horizontal panel. At system initialization, the Kinect camera calibrates the operation table, providing a reference plane for positioning articles with the mechanical arm and grabbing them with the manipulator. First, an RGB-D image of the operation-table scene is acquired by the Kinect camera. Then a visual saliency map is computed from the RGB-D image, the articles are positioned on it, and mechanical-arm operation information is provided. The adopted manipulator has two operating functions, suction and clamping, and the executed operation flow is as follows: when the scene is complex, i.e. articles are stacked together and heavily occluded, the saliency map is generated as a pixel-level distribution, supporting the manipulator's 'suck' operation; when the scene provides target-detection rectangular regions, salient-target rectangles are obtained from the saliency map, supporting the manipulator's 'clamp' operation. Finally, an operation stop threshold is set on the saliency value: manipulator operations are driven in descending order of visual saliency until the saliency value falls below the threshold, whereupon the visual saliency of the scene's RGB-D image is recalculated and the above steps repeat.
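The saliency-ordered operation loop above can be sketched in Python. This is a minimal illustration: the function name `operation_order`, the array layout and the threshold value are assumptions of this sketch, not details fixed by the patent.

```python
import numpy as np

def operation_order(saliency, stop_threshold):
    """Return (row, col) pixel targets sorted by descending saliency,
    keeping only pixels at or above stop_threshold; the manipulator is
    driven through this list, after which scene saliency is recomputed."""
    flat = saliency.ravel()
    order = np.argsort(flat)[::-1]                 # descending saliency
    order = order[flat[order] >= stop_threshold]   # stop below threshold
    rows, cols = np.unravel_index(order, saliency.shape)
    return list(zip(rows.tolist(), cols.tolist()))
```

In use, the outer loop would call this after each recomputation of the saliency map, executing one 'suck' per target until the list is empty.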
In this way, scene analysis generates visual saliency at different scales, realising multiple mechanical-arm operations, so the grabbing system suits fast article positioning and operating tasks in different scenes.
FIG. 2 is a block diagram of the salient-article positioning method oriented to mechanical-arm operation. An RGB-D image is collected with the Kinect device, visual saliency detection is performed on it based on a DMNB (mixed-membership naive Bayes) model, and the scene saliency map is calculated, i.e. the saliency map is obtained from the RGB-D image. The operation order of the scene articles is then sorted by saliency value, as shown in FIG. 3.
To calculate the visual saliency of an RGB-D image, a binary random variable z_S is defined to represent whether a pixel of the RGB-D image is salient, as shown in formula (1):

p(z_S | I_RGB-D) = p(z_S | x_c, x_d) = p(z_S, x_c, x_d) / p(x_c, x_d)    (1)

where p(z_S | I_RGB-D) represents the visual saliency of the current scene, i.e. the saliency map, defined as the probability p(z_S | x_c, x_d) that a pixel of the RGB-D image is salient; the visualization is rendered as a heat map, where a larger saliency value gives a warmer colour and a smaller value a colder colour. I_RGB-D denotes the RGB-D image; x_c and x_d denote the RGB-image and Depth-image salient features respectively, each extracted with a CNN; p(z_S, x_c, x_d) denotes their joint probability distribution and p(x_c, x_d) the salient-feature probability distribution.
Formula (1) is expanded via Bayes' theorem, as shown in formula (2):

p(z_S | x_c, x_d) = p(x_c, x_d | z_S) p(z_S) / p(x_c, x_d)    (2)

where x_c and x_d are conditionally independent given the hidden variable z_S; therefore the likelihood in formula (2) factorises as formula (3):

p(x_c, x_d | z_s) = p(x_c | z_s) p(x_d | z_s)    (3)

Combining formula (3), formula (2) is transformed into formula (4):

p(z_s | x_c, x_d) = p(z_s) p(x_c | z_s) p(x_d | z_s) / p(x_c, x_d)    (4)

where p(z_S) denotes the prior distribution, p(x_c | z_s) and p(x_d | z_s) the visual saliency distributions based on colour and depth features, and p(x_c, x_d) the salient-feature probability distribution, which is omitted for computational efficiency. Finally, the saliency value is calculated by formula (5):

p(z_s | x_c, x_d) ∝ p(z_s) p(x_c | z_s) p(x_d | z_s)    (5)
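Formula (5) amounts to a per-pixel naive Bayes combination of the colour and depth likelihoods. A minimal sketch, assuming the per-class likelihood maps for z_s ∈ {0, 1} have already been produced (e.g. by the CNN feature extractors mentioned above); the function name and input layout are illustrative, not from the patent:

```python
import numpy as np

def saliency_map(prior, p_xc_given_z, p_xd_given_z):
    """Per-pixel posterior p(z_s=1 | x_c, x_d) via formula (5):
    p(z_s | x_c, x_d) ∝ p(z_s) p(x_c | z_s) p(x_d | z_s).
    prior is p(z_s=1); p_xc_given_z / p_xd_given_z map class 0/1 to a
    per-pixel likelihood array. The proportionality is resolved by
    normalising over the two classes."""
    post1 = prior * p_xc_given_z[1] * p_xd_given_z[1]          # salient
    post0 = (1.0 - prior) * p_xc_given_z[0] * p_xd_given_z[0]  # non-salient
    return post1 / (post1 + post0)
```

The returned array is the pixel-level saliency map that the heat-map visualization renders.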
(1) Salient-object position estimation based on the RGB-D image saliency map

Aiming at the technical problem of the invention, in order to obtain effective visual saliency values whose ordering determines priority, the invention innovatively provides a salient-target estimation method, shown as formula (6):

p(O, z_s | I_RGB-D) = p(I_RGB-D | O, z_s) p(O, z_s) / p(I_RGB-D)    (6)

where O denotes the salient-object position coordinates and z_s the visual saliency of the salient object. p(O, z_s | I_RGB-D) denotes the joint distribution of salient object and visual saliency given the image, p(I_RGB-D | O, z_s) the distribution of the RGB-D image saliency map given target coordinates and visual saliency, p(O, z_s) the joint distribution of object and visual saliency, and p(I_RGB-D) the RGB-D image feature distribution.
This estimation method accurately estimates salient-target positions, provides a basis for the subsequent sequential positioning of articles, improves system operating efficiency, increases the operation success rate and effectively reduces operation time.
When z_S is given, O and I_RGB-D are conditionally independent, which yields formula (7):

p(I_RGB-D | O, z_s) = p(I_RGB-D | z_s)    (7)

When the posterior probability of visual saliency of the target region satisfies the salient-target constraint, and with the image feature distribution unchanged, for computational efficiency formula (6) is approximately transformed into formula (8):

p(O, z_s | I_RGB-D) ∝ p(z_s | I_RGB-D) L(O) C(O, z_s)    (8)

where L(O) denotes the target detection region and C(O, z_s) the constraint, defined over the detected target regions b_j, each of which is required to be visually salient. L(O) is obtained by detecting target regions in the RGB image with a target detection algorithm, namely the Faster R-CNN algorithm.
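The approximation in formula (8) can be read as: keep each detected region b_j from L(O) that satisfies the saliency constraint C, and rank the survivors by posterior saliency. The sketch below is hedged: scoring a box by its mean in-box saliency and the threshold `tau` are assumptions of this illustration, not details fixed by the patent.

```python
import numpy as np

def rank_salient_targets(boxes, saliency, tau=0.5):
    """Score each detected box (x0, y0, x1, y1), e.g. from Faster R-CNN,
    by the mean posterior saliency p(z_s | I) inside it; drop boxes that
    fail the saliency constraint and return the rest sorted high to low."""
    scored = []
    for (x0, y0, x1, y1) in boxes:
        region = saliency[y0:y1, x0:x1]
        score = float(region.mean()) if region.size else 0.0
        if score >= tau:                      # constraint C: region is salient
            scored.append(((x0, y0, x1, y1), score))
    scored.sort(key=lambda t: -t[1])
    return scored
```

The top-ranked box then gives the position coordinates O for the next 'clamp' operation.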
Combined with a non-maximum suppression algorithm to resolve repeatedly detected regions, the rectangular boxes of salient objects can be located when articles are sparsely distributed in the scene, which suits the manipulator's 'clamp' operation.
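Non-maximum suppression as referenced above is the standard greedy procedure over scored boxes; the following is a generic implementation for illustration, not code from the patent:

```python
def nms(boxes_scores, iou_thresh=0.5):
    """Greedy non-maximum suppression over (box, score) pairs, removing
    overlapping detections of the same salient object. Boxes are
    (x0, y0, x1, y1); the highest-scoring box in each overlap group wins."""
    def iou(a, b):
        ax0, ay0, ax1, ay1 = a
        bx0, by0, bx1, by1 = b
        ix0, iy0 = max(ax0, bx0), max(ay0, by0)
        ix1, iy1 = min(ax1, bx1), min(ay1, by1)
        inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
        union = ((ax1 - ax0) * (ay1 - ay0)
                 + (bx1 - bx0) * (by1 - by0) - inter)
        return inter / union if union else 0.0

    kept = []
    for box, score in sorted(boxes_scores, key=lambda t: -t[1]):
        if all(iou(box, k) < iou_thresh for k, _ in kept):
            kept.append((box, score))
    return kept
```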
In order to verify the effectiveness of the object positioning method based on the visual saliency of the RGB-D image, the following test experiments are carried out:
Forty different objects were selected to construct different scenes and grabbed with the manipulator shown in fig. 1; the grabbing experiments are shown in figs. 4 and 5, where fig. 4 shows the test 'suck' operation and fig. 5 the test 'clamp' operation.
If a conventional feedback map is used and the article corresponding to its maximum cannot be manipulated by the mechanical arm, the robot will repeat the failed operation, since neither the environment nor the feedback map changes. Accordingly, a test operation is defined as a failure if the manipulator fails three times on the same object, and a test is defined as successful if the first 10 objects in the scene are successfully operated by the robot. On this basis, three indexes are defined:
(1) the average number of objects successfully grabbed per test scenario;
(2) the 'suck' success rate, defined as the number of successfully grasped objects divided by the number of suction operations;
(3) the test success rate, defined as the number of successful tests divided by the total number of tests.
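The three indexes can be computed from per-test records as follows; the record field names (`grasped`, `suction_attempts`, `success`) are illustrative, not from the patent:

```python
def compute_metrics(results):
    """Compute (average grasped per scenario, 'suck' success rate,
    test success rate) from a list of per-test dicts with keys
    'grasped', 'suction_attempts' and 'success'."""
    n = len(results)
    total_grasped = sum(r["grasped"] for r in results)
    avg_grasped = total_grasped / n
    suction_rate = total_grasped / sum(r["suction_attempts"] for r in results)
    test_success = sum(1 for r in results if r["success"]) / n
    return avg_grasped, suction_rate, test_success
```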
Table 1 records all test results for 20 different scenarios. The experiments show that, after active optimisation with the feedback map and reinforcement learning, the system's suction success rate and test success rate improve somewhat over the plain feedback-map method, but time complexity increases greatly. The visual-saliency-based method solves article positioning from the perspective of visual saliency without relying on reinforcement learning: the success rate improves greatly while little time complexity is added.
It follows that, when operation decisions rely solely on a static feedback map, failures are likely in cluttered scenes.

When active detection optimisation on the feedback map is used to improve the success rate, the system sparsifies the positions of nearby articles by actively disturbing the article distribution.

The method introduced here estimates salient-object positions, automatically detects whether scene objects are sparse, and avoids possible failed operations, greatly improving the grasping success rate. Meanwhile, when articles are severely overlapped and occluded, the method outputs a pixel-level saliency map, giving stronger adaptability to the scene and thus more reliable decisions.
TABLE 1 System article location test results
The technical scheme of the invention solves the problems of complexity and the limited basis for strategy judgment when positioning and grabbing articles with machine vision in traditional mechanical-arm operation. The problem of learning an article-positioning-order strategy is simplified, and the ordering criterion is general. Using visual saliency as the mechanical arm's article-selection criterion avoids the complexity of learning a deep-reinforcement-learning strategy. The method obtains pixel-level and target-level saliency maps, supports multiple manipulator operations, and overcomes the limitation of the interpretation basis.
Strategy training based on deep reinforcement learning needs a large amount of labelled video data and specific hardware support, and yields an optimal strategy only for a specific scene. The visual-saliency-based article selection method only needs to sort the visual saliency values to determine priority, without special training for specific scenes.
Multi-scale positioning information perceived from the scene is output, supporting the manipulator's various positioning requirements. The original feedback-map method outputs only pixel-level information and generates no target-level information. The invention computes visual saliency with multiple output modes, suitable for both pixel-level and target-level positioning. Each pixel in the image corresponds to a visual saliency value; this visual saliency map suits the mechanical arm's suction operation, while segmenting the image on the per-pixel saliency values to obtain object contours suits the mechanical arm's clamping operation.
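The target-level output just described can be sketched by thresholding the pixel-level map and extracting connected components; this assumes SciPy's `ndimage` for labelling, and the function name and threshold are illustrative, as the patent does not fix a segmentation algorithm:

```python
import numpy as np
from scipy import ndimage

def salient_object_boxes(saliency, thresh=0.5):
    """Segment the pixel-level saliency map into salient objects by
    thresholding and connected-component labelling, returning one
    bounding box (x0, y0, x1, y1) per component, i.e. the target-level
    output used for the 'clamp' operation."""
    mask = saliency >= thresh
    labels, _ = ndimage.label(mask)              # 4-connected components
    boxes = []
    for sl in ndimage.find_objects(labels):
        rows, cols = sl
        boxes.append((cols.start, rows.start, cols.stop, rows.stop))
    return boxes
```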
Mechanical-arm object positioning and grabbing could also be based on 6D object-pose estimation, but when the scene is complex and objects are stacked and severely occluded, pose estimation is unreliable; moreover, whether an object's pose suits the manipulator's operation type must be quantified, which requires debugging in the actual scene.
The method and the system for positioning the articles based on the visual saliency of the RGB-D images can be used in the fields of unmanned sorting of warehouses, service robots and the like.
Visual saliency reflects the attention order of scene articles and serves as the basis for the mechanical arm to select and position articles. Only scene objects need to be detected; no scene-specific reinforcement learning is required. In addition, the invention generates the pixel-level visual saliency map and the salient-target position information simultaneously, supporting multiple manipulator operations.
So far, the technical solutions of the present invention have been described with reference to the preferred embodiments shown in the drawings, but it should be understood by those skilled in the art that the above embodiments are only for clearly illustrating the present invention, and not for limiting the scope of the present invention, and it is apparent that the scope of the present invention is not limited to these specific embodiments. Equivalent changes or substitutions of related technical features can be made by those skilled in the art without departing from the principle of the invention, and the technical scheme after the changes or substitutions can fall into the protection scope of the invention.
Claims (10)
1. An article positioning method based on RGB-D image visual saliency is characterized by comprising the following steps:
acquiring an RGB-D image of an operation platform scene by the camera;
secondly, calculating a visual saliency map based on the RGB-D image, namely the RGB-D image saliency map;
thirdly, positioning the articles based on the visual saliency map and providing mechanical arm article operation information;
and in step two, visual saliency detection is performed on the RGB-D image and the visual saliency map is calculated, as shown in formula (1):

p(z_S | I_RGB-D) = p(z_S | x_c, x_d) = p(z_S, x_c, x_d) / p(x_c, x_d)    (1)

where p(z_S | I_RGB-D) represents the visual saliency of the current scene, i.e. the visual saliency map, defined as the probability p(z_S | x_c, x_d) that a pixel of the RGB-D image is salient; I_RGB-D denotes the RGB-D image; x_c and x_d denote the RGB-image and Depth-image salient features respectively, each extracted with a CNN; p(z_S, x_c, x_d) denotes their joint probability distribution and p(x_c, x_d) the salient-feature probability distribution; the visualization is rendered as a heat map, where a larger saliency value gives a warmer colour and a smaller value a colder colour;
based on the RGB-D image saliency map, salient-object position estimation is carried out, as shown in formula (6):

p(O, z_s | I_RGB-D) = p(I_RGB-D | O, z_s) p(O, z_s) / p(I_RGB-D)    (6)

where O denotes the salient-object position coordinates and z_s the visual saliency of a salient object; p(O, z_s | I_RGB-D) denotes the joint distribution of salient object and visual saliency given the image, p(I_RGB-D | O, z_s) the distribution of the RGB-D image saliency map given target coordinates and visual saliency, p(O, z_s) the joint distribution of object and visual saliency, and p(I_RGB-D) the RGB-D image feature distribution.
2. The RGB-D image visual saliency-based item positioning method of claim 1,
when given zSWhen is, O and IRGB-DIs condition independent, then the following is obtained as equation (7):
p(IRGB-D|O,zs)=p(IRGB-D|zs) (7)
when the posterior probability of visual saliency of the target region is satisfied as a constraint condition of the salient target, under the condition that the image feature distribution is not changed, the formula (6) is approximately transformed to obtain a formula (8) as follows:
p(O, z_s | I_RGB-D) ∝ p(z_s | I_RGB-D) L(O) C(O, z_s)    (8)
wherein L(O) represents the target detection region and C(O, z_s) represents the constraint condition, defined in the following form:
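A minimal sketch of scoring candidate regions with formula (8). Here L(O) is modelled as the mean saliency inside a candidate box and C(O, z_s) as a 0/1 threshold constraint; the patent defines the exact form of C(O, z_s) elsewhere, so both choices are illustrative assumptions:

```python
import numpy as np

def score_regions(sal, boxes, tau=0.5):
    """Rank candidate regions by p(z_s | I) * L(O) * C(O, z_s), formula (8).
    sal:   per-pixel saliency map, values in [0, 1]
    boxes: candidate regions as (row0, col0, row1, col1), half-open
    tau:   assumed constraint threshold inside C(O, z_s)"""
    scores = []
    for r0, c0, r1, c1 in boxes:
        l = float(sal[r0:r1, c0:c1].mean())  # L(O): region saliency level
        c = 1.0 if l >= tau else 0.0         # C(O, z_s): binary constraint
        scores.append(l * c)
    return scores
```

The highest-scoring box would then serve as the salient target rectangle handed to the manipulator.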
3. The article positioning method based on RGB-D image visual saliency of claim 1, wherein
the camera is a Kinect camera.
4. The article positioning method based on RGB-D image visual saliency of claim 1, wherein
the adopted manipulator has two operation functions, suction and clamping; when the scene is complex, i.e. the articles are stacked together and heavily occluded, the generated saliency map gives a pixel-level distribution, which supports the 'suction' operation of the manipulator; when the scene provides a target detection rectangular region, a salient target rectangular region can be obtained from the saliency map, which supports the 'clamping' operation of the manipulator.
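The suction/clamping decision of this claim can be summarized as a small dispatch function. This is a hypothetical sketch; the predicate names are assumptions, not terminology from the patent:

```python
def choose_operation(has_detection_box: bool, heavily_occluded: bool) -> str:
    """Select the manipulator operation mode.
    Cluttered, heavily occluded scenes rely on the pixel-level saliency
    distribution and use 'suction'; scenes that provide a target detection
    rectangle use the salient-object box and 'clamping'."""
    if heavily_occluded or not has_detection_box:
        return "suction"   # act on the pixel-level saliency peak
    return "clamping"      # act on the salient-object rectangle
```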
5. The article positioning method based on RGB-D image visual saliency of claim 1, wherein
an operation stop threshold is set on the saliency value; the manipulator is driven to operate on targets in descending order of visual saliency value until the saliency value falls below the threshold, whereupon operation stops and the visual saliency of the RGB-D image of the scene is recalculated.
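The stop-threshold behaviour of this claim maps onto a simple greedy loop. A sketch under stated assumptions: `operate` stands in for the actual manipulator command, which the patent does not specify at this level:

```python
def operate_until_threshold(saliency_values, threshold, operate):
    """Drive the manipulator on targets in descending order of visual
    saliency; stop once the best remaining value falls below the
    threshold, after which the caller recomputes scene saliency."""
    handled = []
    for v in sorted(saliency_values, reverse=True):
        if v < threshold:
            break              # remaining targets are below the stop threshold
        operate(v)             # act on the object with saliency value v
        handled.append(v)
    return handled
```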
6. An article positioning system based on RGB-D image visual saliency, employing the method of claim 1 and comprising a camera, a mechanical arm and an operating platform; the articles to be grabbed are stacked on the operating platform, the mechanical arm is a UR5 mechanical arm, and the operating platform is a horizontal panel; when the system is initialized, the camera calibrates the operating platform, providing a reference plane for the mechanical arm to position and grab the articles.
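The calibration step, deriving a reference plane for the operating platform from the camera, can be sketched as a least-squares plane fit over depth samples. The fitting method is an assumption; the patent only states that a reference plane is provided:

```python
import numpy as np

def fit_reference_plane(points):
    """Fit z = a*x + b*y + d to (N, 3) samples of the platform surface,
    e.g. back-projected depth pixels. Returns the coefficients (a, b, d)."""
    pts = np.asarray(points, dtype=float)
    A = np.c_[pts[:, 0], pts[:, 1], np.ones(len(pts))]
    coef, *_ = np.linalg.lstsq(A, pts[:, 2], rcond=None)
    return coef

def height_above_plane(point, coef):
    """Signed height of a 3-D point over the fitted platform plane,
    useful when positioning the manipulator above an article."""
    a, b, d = coef
    x, y, z = point
    return z - (a * x + b * y + d)
```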
7. The article positioning system based on RGB-D image visual saliency as claimed in claim 6, wherein
when z_s is given, O and I_RGB-D are conditionally independent, so that formula (7) is obtained:
p(I_RGB-D | O, z_s) = p(I_RGB-D | z_s)    (7)
when the posterior probability of visual saliency of the target region satisfies the salient-target constraint condition, and the image feature distribution remains unchanged, formula (6) is approximately transformed into formula (8):
p(O, z_s | I_RGB-D) ∝ p(z_s | I_RGB-D) L(O) C(O, z_s)    (8)
wherein L(O) represents the target detection region and C(O, z_s) represents the constraint condition, defined in the following form:
8. The article positioning system based on RGB-D image visual saliency as claimed in claim 6, wherein
the camera is a Kinect camera.
9. The article positioning system based on RGB-D image visual saliency as claimed in claim 6, wherein
the adopted manipulator has two operation functions, suction and clamping; when the scene is complex, i.e. the articles are stacked together and heavily occluded, the generated saliency map gives a pixel-level distribution, which supports the 'suction' operation of the manipulator; when the scene provides a target detection rectangular region, a salient target rectangular region can be obtained from the saliency map, which supports the 'clamping' operation of the manipulator.
10. The article positioning system based on RGB-D image visual saliency as claimed in claim 6, wherein
an operation stop threshold is set on the saliency value; the manipulator is driven to operate on targets in descending order of visual saliency value until the saliency value falls below the threshold, whereupon operation stops and the visual saliency of the RGB-D image of the scene is recalculated.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911402160 | 2019-12-30 | ||
CN2019114021608 | 2019-12-30 |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111191650A (en) | 2020-05-22 |
CN111191650B (en) | 2023-07-21 |
Family
ID=70709757
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010003692.0A Active CN111191650B (en) | 2019-12-30 | 2020-01-02 | Article positioning method and system based on RGB-D image visual saliency |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111191650B (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112077842A (en) * | 2020-08-21 | 2020-12-15 | 上海明略人工智能(集团)有限公司 | Clamping method, clamping system and storage medium |
CN112223288A (en) * | 2020-10-09 | 2021-01-15 | 南开大学 | Visual fusion service robot control method |
CN113222003A (en) * | 2021-05-08 | 2021-08-06 | 北方工业大学 | RGB-D-based indoor scene pixel-by-pixel semantic classifier construction method and system |
WO2022037389A1 (en) * | 2020-08-18 | 2022-02-24 | 维数谷智能科技(嘉兴)有限公司 | Reference plane-based high-precision method and system for estimating multi-degree-of-freedom attitude of object |
Citations (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103679740A (en) * | 2013-12-30 | 2014-03-26 | 中国科学院自动化研究所 | ROI (Region of Interest) extraction method of ground target of unmanned aerial vehicle |
CN103824284A (en) * | 2014-01-26 | 2014-05-28 | 中山大学 | Key frame extraction method based on visual attention model and system |
CN104408733A (en) * | 2014-12-11 | 2015-03-11 | 武汉大学 | Object random walk-based visual saliency detection method and system for remote sensing image |
US20150117783A1 (en) * | 2013-10-24 | 2015-04-30 | Adobe Systems Incorporated | Iterative saliency map estimation |
US20150169989A1 (en) * | 2008-11-13 | 2015-06-18 | Google Inc. | Foreground object detection from multiple images |
US20150310303A1 (en) * | 2014-04-29 | 2015-10-29 | International Business Machines Corporation | Extracting salient features from video using a neurosynaptic system |
CN105389550A (en) * | 2015-10-29 | 2016-03-09 | 北京航空航天大学 | Remote sensing target detection method based on sparse guidance and significant drive |
US20160180188A1 (en) * | 2014-12-19 | 2016-06-23 | Beijing University Of Technology | Method for detecting salient region of stereoscopic image |
CN106997478A (en) * | 2017-04-13 | 2017-08-01 | 安徽大学 | RGB-D image salient target detection method based on salient center prior |
CN107992874A (en) * | 2017-12-20 | 2018-05-04 | 武汉大学 | Image well-marked target method for extracting region and system based on iteration rarefaction representation |
US20180285683A1 (en) * | 2017-03-30 | 2018-10-04 | Beihang University | Methods and apparatus for image salient object detection |
CN108846416A (en) * | 2018-05-23 | 2018-11-20 | 北京市新技术应用研究所 | The extraction process method and system of specific image |
CN109146925A (en) * | 2018-08-23 | 2019-01-04 | 郑州航空工业管理学院 | Conspicuousness object detection method under a kind of dynamic scene |
CN109598268A (en) * | 2018-11-23 | 2019-04-09 | 安徽大学 | A kind of RGB-D well-marked target detection method based on single flow depth degree network |
CN109740613A (en) * | 2018-11-08 | 2019-05-10 | 深圳市华成工业控制有限公司 | A kind of Visual servoing control method based on Feature-Shift and prediction |
Non-Patent Citations (6)
Title |
---|
GERMÁN M. GARCÍA: "Saliency-based object discovery on RGB-D data with a late-fusion approach" * |
JIANHUA ZHANG: "Objectness ranking by uniform Bayesian model with multimodal and global cues" * |
夏辰: "Research on bottom-up visual attention models based on reconstruction" *
杜杰: "Research on salient object detection based on regional feature fusion" *
王松涛: "Research on visual saliency detection methods for RGB-D images based on feature fusion" *
黄子超: "Research on salient object detection methods with prior fusion and feature guidance" *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111191650A (en) | Object positioning method and system based on RGB-D image visual saliency | |
US10124489B2 (en) | Locating, separating, and picking boxes with a sensor-guided robot | |
US11209265B2 (en) | Imager for detecting visual light and projected patterns | |
JP7352260B2 (en) | Robot system with automatic object detection mechanism and its operating method | |
Schwarz et al. | Fast object learning and dual-arm coordination for cluttered stowing, picking, and packing | |
US9802317B1 (en) | Methods and systems for remote perception assistance to facilitate robotic object manipulation | |
US9649767B2 (en) | Methods and systems for distributing remote assistance to facilitate robotic object manipulation | |
WO2020034872A1 (en) | Target acquisition method and device, and computer readable storage medium | |
EP3186777B1 (en) | Combination of stereo and structured-light processing | |
US9259844B2 (en) | Vision-guided electromagnetic robotic system | |
US9205558B1 (en) | Multiple suction cup control | |
US20230260071A1 (en) | Multicamera image processing | |
CN111571581B (en) | Computerized system and method for locating grabbing positions and tracks using image views | |
JP7377627B2 (en) | Object detection device, object grasping system, object detection method, and object detection program | |
CN113538459A (en) | Multi-mode grabbing obstacle avoidance detection optimization method based on drop point area detection | |
Xu et al. | A vision-guided robot manipulator for surgical instrument singulation in a cluttered environment | |
US20210001488A1 (en) | Silverware processing systems and methods | |
EP4249178A1 (en) | Detecting empty workspaces for robotic material handling | |
WO2023092519A1 (en) | Grabbing control method and apparatus, and electronic device and storage medium | |
Su et al. | Pose-Aware Placement of Objects with Semantic Labels-Brandname-based Affordance Prediction and Cooperative Dual-Arm Active Manipulation | |
WO2023073780A1 (en) | Device for generating learning data, method for generating learning data, and machine learning device and machine learning method using learning data | |
WO2024053150A1 (en) | Picking system | |
US20230364787A1 (en) | Automated handling systems and methods | |
Piao et al. | Robotic tidy-up tasks using point cloud-based pose estimation | |
CN117726896A (en) | Computer-implemented method for generating (training) images |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||