CN112528826B - Control method of picking device based on 3D visual perception - Google Patents
- Publication number
- CN112528826B (application CN202011414617A)
- Authority
- CN
- China
- Legal status
- Active
Classifications
- G06V20/10 — Terrestrial scenes
- G06F18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/22 — Matching criteria, e.g. proximity measures
- G06F18/24 — Classification techniques
- G06N3/045 — Combinations of networks
- G06N3/08 — Learning methods
- G06V10/25 — Determination of region of interest [ROI] or a volume of interest [VOI]
- G06V10/30 — Noise filtering
- G06V20/68 — Food, e.g. fruit or vegetables
- Y02P90/02 — Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]
Abstract
The invention provides a control method of a picking device based on 3D visual perception, which comprises the following steps: S1, a control device controls the picking device to travel beneath the support of the fruit to be picked; S2, the control device identifies the target fruit through the image acquisition device of the picking device; S3, the image acquisition device removes noise points from the background through 3D color threshold processing; S4, the control device detects and locates mature fruit using deep learning; S5, a region of interest is selected around the target fruit to determine whether immature fruit is present; S6, the picking path is calculated based on the distribution and quantity of the immature fruit around the target fruit. The invention actively separates immature fruit from the target based on visual perception and, according to the distribution of immature fruit around the target fruit, selects either a single pushing operation or a serpentine pushing operation composed of several linear pushes, so as to clear immature fruit below the target and at the same height as the target, thereby markedly improving picking performance and greatly increasing picking efficiency.
Description
Technical Field
The present application relates to the field of intelligent robots, and in particular to a control method of a picking device based on 3D visual perception.
Background
Fruit picking in China is generally done by hand; labor accounts for 50%-70% of the total cost of fruits and vegetables, making manual picking costly, inefficient, and difficult for work at height. A fruit picking robot is a device that replaces manual labor and picks fruit automatically. Domestic fruit picking robots are still at an early stage of development, and most cannot meet growers' picking requirements. At present, the end effector of a fruit picking robot has no buffering stage when clamping fruit, and tender fruit is easily bruised. In addition, because fruit on the same plant is at different stages of maturity, a picking robot must harvest selectively. Existing fruit picking robots perform poorly at identifying and picking a single mature fruit without accidentally damaging, or accidentally picking, immature fruit. Some methods explore the end effector's search space for feasible trajectories through image recognition and search algorithms, with each step of the trajectory planned by a collision detector. Most such methods are passive: they aim to avoid immature fruit or other obstacles without changing the environment. Immature fruit, however, cannot always be avoided; when the target fruit is completely surrounded by immature fruit, the end effector may be unable to find any path that avoids all of the immature fruit and still picks the target.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a control method of a picking device based on 3D visual perception.
The invention discloses a control method of a picking device based on 3D visual perception, which comprises the following steps:
S1, a control device controls the picking device to travel beneath the support of the fruit to be picked;
S2, the control device identifies the target fruit through the image acquisition device of the picking device;
S3, the image acquisition device removes noise points from the background through 3D color threshold processing;
S4, the control device detects and locates mature fruit using deep learning;
S5, a region of interest is selected around the target fruit to determine whether immature fruit is present;
S6, the picking path is calculated based on the distribution and quantity of the immature fruit around the target fruit.
Wherein, in step S3, adjacent noise points are removed using hue, saturation and intensity color threshold processing.
Step S4 includes:
s41, identifying and segmenting the pixel-level object by using a segmented convolutional neural network; creating a number of masks for the ripe fruit through the network, wherein one mask represents the detected target fruit; by matching with the depth image, projecting the mask into 3D points, and obtaining the 3D position of the target fruit in the camera frame coordinates;
s42, transforming coordinates from the camera frame to the picking device arm frame based on the camera external calibration device;
In step S5, the bounding box of each region-of-interest block is cropped from the point cloud using the point cloud library, and the identification and calculation of the immature fruit are performed on the corresponding block.
Wherein the region of interest is a region containing the 3D point cloud of the target fruit and potentially one or more immature fruits; the region of interest is divided into four layers: a top layer, an upper middle layer, a lower middle layer and a bottom layer; each layer is divided into nine cube blocks forming a 3 x 3 grid whose center lies at the horizontal midpoint of the target fruit, so that in the xy plane the center block C_C surrounds the target fruit; the length and width of the eight outer peripheral blocks equal those of the center block; in the front and left side views, the heights of the top and bottom layers equal one and two times, respectively, the combined height of the upper and lower middle layers; the gripper moves upward to clear the immature fruit around the target fruit in the upper and lower middle layers, whose distribution can vary in the height direction.
In particular, the gripper operates in three distinct stages: in the first stage, the gripper grasps from below, moving the immature fruit in the bottom layer horizontally; during the second stage, the gripper moves upward to surround the target fruit and clear the immature fruit in the upper and lower middle layers; during the third stage, if the center block C_C in the top layer is occupied, the gripper can drag the target fruit to a gripping position with less immature fruit.
Specifically, the first stage horizontally clears the immature fruit below the target fruit in the bottom layer, using the number Nh of blocks adjacent to the center block that contain no immature fruit to determine whether to use a single pushing operation or a serpentine operation;
Ignoring the center block, a solid arrow in a block indicates that the block is occupied by immature fruit, and a blank arrow indicates an unoccupied block. Here Nh is 5, greater than the predetermined threshold Th = 4, so a single pushing operation is selected to push the immature fruit aside. When the single pushing operation moves toward the immature fruit, the direction of the gripper's push is calculated from the positions of the occupied blocks according to the following formula:
Ds = r · (ΣOi) / ‖ΣOi‖ (summing over i = 1…n), where Oi is the vector of the i-th occupied block in the largest set of adjacent occupied blocks and n is the total number of blocks in that set; the parameter r scales the norm of Ds and should ensure that the gripper clears the outside of the block, r = 50 mm;
the gripper moves from the center of the unoccupied block to the center of the occupied block so that the gripper has the highest likelihood of pushing all blocks apart;
ds=0 if only the center block Cc is occupied; the direction in which the gripper has to move to push the immature fruit is determined by calculating the shortest path from the current position of the gripper to the centre of the centre section CC. If no immature fruit is detected in the block, the gripper is not pushed at this stage and moves straight up from below.
If the number Nh of blocks adjacent to the center block that contain no immature fruit is smaller than the threshold Th, the gripper adopts a horizontal serpentine push; the serpentine operation involves movement in three directions, forward, left and right, with the gripper pushing the immature fruit out in those three directions; the overall direction of the serpentine pushing operation is calculated from the locations of the unoccupied blocks according to the following formula:
Dz = (ΣUj) / ‖ΣUj‖ (summing over j = 1…m), where Uj is the vector of the j-th unoccupied block within the largest set of adjacent unoccupied blocks and m is the total number of blocks in that set. During a horizontal serpentine pushing operation the device moves in the xy plane, where the resultant vector of the serpentine motion equals Dz; the amplitude ah and the number of pushes Nhp of the serpentine motion are determined by the particular grasping scenario.
Specifically, the second stage surrounds the target fruit in the upper and lower middle layers and clears the immature fruit there; the upward serpentine pushing action used in the upper and lower middle layers combines movement of the gripper in a roughly vertical direction toward the target fruit with side-to-side movement past the immature fruit; the vertical direction passes through the center of the target fruit. The direction Du_z of the upward push in the xy plane is calculated based on the number Nu of blocks adjacent to the center block that contain no immature fruit; if Nu is greater than the threshold Th, the direction Du_z is calculated from the occupied blocks, as in the single pushing operation in the bottom layer, according to the following formula:
Du_z = au · (ΣOi) / ‖ΣOi‖, where au is a parameter scaling the norm of Du_z, au = 5 mm;
if Nu is less than threshold Th, then the calculation uses the unoccupied block, calculated by the following formula:
where M = ΣUj (summing over j = 1…m) is the intermediate vector from which Du_z is obtained. The gripper moves along Du_z and -Du_z to push the immature fruit on either side apart.
Specifically, during the third stage, if the center block C_C in the top layer is occupied, the gripper can drag the target fruit to a gripping position with less immature fruit;
only in the center area C of the top layer C The drag operation is performed when there is immature fruit. If the center block C C The gripper moves upwards directly to grasp the target fruit; to avoid collision between the gripper and the table, three blocks L close to the table are skipped R 、C R 、R R To calculate the drag direction, the drag direction D in the xy plane dr The determination may be made according to the following equation:
D_dr = l · (ΣUj) / ‖ΣUj‖ (summing over j = 1…m), where Uj is the vector of the j-th unoccupied block within the largest group of adjacent unoccupied blocks and m is the total number of blocks in that group; the blocks used for the calculation are L_C, L_F, C_F, R_F and R_C. D_dr is scaled to l, where l = 50 mm. These blocks usually contain less immature fruit, but if all of them are occupied by immature fruit, the drag direction is aligned with C_F. The drag and push operations rise by the same height in the vertical direction.
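Under the definitions above, the drag-direction rule can be sketched as follows. This is a sketch, not the patented implementation: the block coordinate assignments are assumptions, and the direction points toward the free blocks, matching the goal of dragging the fruit to where there is less immature fruit.

```python
import numpy as np

# Directions from the center block toward each candidate top-layer block
# (naming follows the text, L/C/R crossed with F(ront)/C(enter row);
# these coordinate assignments are illustrative assumptions).
BLOCK_DIRS = {
    "L_C": np.array([-1.0, 0.0]),
    "L_F": np.array([-1.0, 1.0]),
    "C_F": np.array([0.0, 1.0]),
    "R_F": np.array([1.0, 1.0]),
    "R_C": np.array([1.0, 0.0]),
}

def drag_direction(occupied, scale=0.05):
    """Drag direction D_dr in the xy plane, scaled to l = 50 mm.
    The three blocks nearest the table (L_R, C_R, R_R) are already
    excluded from BLOCK_DIRS, per the text.  The gripper drags toward
    the mean direction of the unoccupied blocks; if every candidate
    block is occupied, it drags toward C_F."""
    free = [v for name, v in BLOCK_DIRS.items() if name not in occupied]
    d = np.mean(free, axis=0) if free else BLOCK_DIRS["C_F"]
    norm = np.linalg.norm(d)
    if norm == 0:               # free blocks cancel out: fall back to C_F
        d, norm = BLOCK_DIRS["C_F"], 1.0
    return scale * d / norm
```

For example, with only C_F free the drag direction is 50 mm straight forward.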
According to the control method of the picking device based on 3D visual perception, immature fruit is actively separated from the target based on visual perception, and a single pushing operation or a serpentine pushing operation composed of several linear pushes is selected according to the distribution of immature fruit around the target fruit, so as to clear the immature fruit below the target and at the same height as the target. Because the pushes are multi-directional, denser immature fruit can be handled, and the resulting side-to-side motion breaks the static contact forces between the target fruit and the immature fruit, making it easier for the gripper to receive the target. A subsequent dragging operation both avoids immature fruit and actively pushes it away, solving the problem of accidentally grasping immature fruit above the target fruit: the gripper pulls the target fruit to a location with less immature fruit and then pushes back upward to move the immature fruit aside for further separation. This significantly improves picking performance, avoids damage to both the target fruit and the immature fruit, and greatly increases picking efficiency.
Drawings
Fig. 1 is a flowchart of a control method of a picking device based on 3D visual perception according to the present invention.
Fig. 2 is a flow chart of a control method of a picking device based on 3D visual perception according to the present invention.
Fig. 3a-3c are schematic image recognition diagrams of a control method of a picking device based on 3D visual perception according to the present invention.
Fig. 4a-4c are schematic views of a picking process of a method of controlling a picking device based on 3D visual perception according to the present invention.
Fig. 5a-5b are schematic views of a picking process of a method of controlling a picking device based on 3D visual perception according to the present invention.
Fig. 6a-6d are schematic views of a picking process of a method of controlling a picking device based on 3D visual perception according to the present invention.
Figures 7a-7b are schematic views of a picking process of a method of controlling a picking device based on 3D visual perception according to the present invention.
Figures 8a-8d are schematic views of a picking process of a method of controlling a picking device based on 3D visual perception according to the present invention.
Fig. 9a-9b are schematic views of a picking process of a method of controlling a picking device based on 3D visual perception according to the present invention.
Detailed Description
Other advantages and effects of the present invention will become readily apparent to those skilled in the art from the following disclosure, which describes embodiments of the invention with reference to specific examples. The invention may also be practiced or carried out in other embodiments, and the details of this description may be modified or varied in various respects without departing from the spirit and scope of the present invention.
As shown in figs. 1-2, a control method of a picking device based on 3D visual perception comprises the following steps:
S1, a control device controls the picking device to travel beneath the support of the fruit to be picked;
S2, the control device identifies the target fruit through the image acquisition device of the picking device;
S3, the image acquisition device removes noise points from the background through 3D color threshold processing;
S4, the control device detects and locates mature fruit using deep learning;
S5, a region of interest is selected around the target fruit to determine whether immature fruit is present;
S6, the picking path is calculated based on the distribution and quantity of the immature fruit around the target fruit.
Wherein, in step S3, adjacent noise points are removed using hue, saturation and intensity color threshold processing.
Step S4 includes:
s41, identifying and segmenting the pixel-level object by using a segmented convolutional neural network; creating a number of masks for the ripe fruit through the network, wherein one mask represents the detected target fruit; by matching with the depth image, projecting the mask into 3D points, and obtaining the 3D position of the target fruit in the camera frame coordinates;
s42, transforming coordinates from the camera frame to the picking device arm frame based on the camera external calibration device;
In step S5, the bounding box of each region-of-interest block is cropped from the point cloud using the point cloud library, and the identification and calculation of the immature fruit are performed on the corresponding block.
In the image processing, the first step is to remove adjacent noise points using hue, saturation and intensity color thresholding. Some sensed points from irrigation pipes or shelves lie around the ripe fruit, at a distance behind it; inaccurate depth sensing causes some of these points to be connected to the fruit in front of them and mistaken for immature fruit. The color thresholding removes these adjacent noise points and avoids this effect.
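A minimal sketch of this thresholding step, in pure NumPy for clarity; a real pipeline would more likely use OpenCV's `cv2.inRange` after an HSV conversion, and the threshold values below are illustrative assumptions, not values published in the patent:

```python
import numpy as np

def remove_color_noise(hsv, h_range=(0, 15), s_min=80, v_min=60):
    """Zero out pixels whose hue/saturation/intensity fall outside the
    ripe-fruit range, treating them as background noise points.
    The ranges here are placeholders; the patent does not publish them."""
    h, s, v = hsv[..., 0], hsv[..., 1], hsv[..., 2]
    keep = (h >= h_range[0]) & (h <= h_range[1]) & (s >= s_min) & (v >= v_min)
    cleaned = hsv.copy()
    cleaned[~keep] = 0          # suppress noise points
    return cleaned, keep
```

The boolean `keep` mask can then be intersected with the depth image so that points behind the fruit are not attached to it.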
The second step is the detection and positioning of ripe fruit: a segmentation convolutional neural network is used to identify and segment objects at the pixel level. Through the network, several masks are created for the ripe fruit, one of which represents the detected target fruit. By matching with the depth image, the mask is projected into 3D points, giving the 3D position of the target fruit in camera-frame coordinates. Thereafter, the coordinates are transformed from the camera frame to the picking-arm frame using the external camera calibration.
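The mask-to-3D projection and the camera-to-arm transform can be sketched with a standard pinhole model; the intrinsics (fx, fy, cx, cy) and the hand-eye calibration (R, t) are assumed inputs, and the function names are illustrative:

```python
import numpy as np

def mask_to_camera_points(mask, depth, fx, fy, cx, cy):
    """Project the pixels of a detection mask into 3D camera-frame
    points using the depth image and a pinhole camera model."""
    v, u = np.nonzero(mask)            # pixel rows/columns inside the mask
    z = depth[v, u]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)

def camera_to_arm(points, R, t):
    """Rigid transform from camera frame to picking-arm frame, using the
    rotation R and translation t from the external calibration."""
    return points @ R.T + t
```

In practice R and t would come from a one-off hand-eye calibration of the arm-mounted camera.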
The third step is the calculation of the immature fruit.
The region of interest is a region containing the 3D point cloud of the target fruit and potentially one or more immature fruits. As shown in fig. 3, the region of interest is divided into four layers: a top layer 6, an upper middle layer 7, a lower middle layer 8 and a bottom layer 9. As shown in the top view of fig. 3, each layer is further divided into nine cube blocks forming a 3 x 3 grid whose center lies at the horizontal midpoint of the target fruit, so that in the xy plane the center block C_C surrounds the target fruit. In top view, the length and width of the eight outer peripheral blocks equal those of the center block; in the front and left side views, the heights of the top layer 6 and bottom layer 9 equal one and two times the height of the middle section, respectively. The gripper moves upward to clear the immature fruit around the target fruit in the middle layers, whose distribution can vary in the height direction.
In order to obtain a higher movement resolution, the middle section is divided into an upper middle layer 7 and a lower middle layer 8, and the movement through it is split into two steps. The center block of the top layer 6 is lower than the other peripheral blocks in the same layer by 80% of their height. This is because the object segmentation method does not include the green calyx; to avoid the calyx being detected as immature fruit, the bottom of the center block of the top layer 6 is left blank.
To create a distinct path, each block is assigned a horizontal vector representing the direction from that block to the center block C_C; the vectors' directions are determined by the block positions, so every vector points from the center of its block to the center of C_C. The number of points N in a block's point-cloud region determines whether the block contains immature fruit. Using a 1280 x 720 resolution camera, the thresholds of N for the top layer 6, upper middle layer 7, lower middle layer 8 and bottom layer 9 are 200, 100, and 300, respectively.
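Counting point-cloud points per block of one layer's 3 x 3 grid might look like the sketch below. The points are assumed to be pre-sliced to a single layer, the 50 mm block size is an assumption (the patent only states that all blocks share the center block's footprint), and the default threshold uses the top layer's N = 200:

```python
import numpy as np

def block_occupancy(points, center, block=0.05, layer_threshold=200):
    """Mark each block of one layer's 3x3 grid (centred on the target
    fruit) as occupied when more than layer_threshold cloud points fall
    inside it.  points is an (N, 3) array already sliced to one layer;
    only x/y are tested here.  Block size 0.05 m is an assumption."""
    occupied = np.zeros((3, 3), dtype=bool)
    for i in range(3):          # grid rows along y
        for j in range(3):      # grid columns along x
            x0 = center[0] + (j - 1.5) * block
            y0 = center[1] + (i - 1.5) * block
            in_block = ((points[:, 0] >= x0) & (points[:, 0] < x0 + block) &
                        (points[:, 1] >= y0) & (points[:, 1] < y0 + block))
            occupied[i, j] = in_block.sum() > layer_threshold
    return occupied
```

The resulting boolean grid is what the later push-direction rules consume as "occupied" and "unoccupied" blocks.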
The gripper operates in three different stages: in the first stage, the gripper grasps from below, moving the immature fruit in the bottom layer 9 horizontally; during the second stage, the gripper moves upward to surround the target fruit and clear the immature fruit within the middle layers; during the third stage, if the center block C_C in the top layer is occupied, the gripper can drag the target fruit to a gripping position with less immature fruit.
In particular, the first stage horizontally clears the immature fruit below the target fruit in the bottom layer 9, using the number Nh of blocks adjacent to the center block that contain no immature fruit to determine whether to use a single pushing operation or a serpentine operation.
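The Nh-based choice between the two operations reduces to a one-line rule; the function name is illustrative, while the threshold Th = 4 follows the text:

```python
def choose_push_strategy(n_free_adjacent: int, threshold: int = 4) -> str:
    """Decide how to clear immature fruit below the target fruit.

    n_free_adjacent is Nh, the number of blocks adjacent to the center
    block that contain no immature fruit.  When most neighbours are
    free, a single push suffices; otherwise a serpentine push made of
    several linear pushes is used.  Th = 4 follows the text."""
    return "single_push" if n_free_adjacent > threshold else "serpentine_push"
```

For Nh = 5 this yields a single push, matching the example discussed around fig. 5a.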
As shown in fig. 5a, ignoring the center block, a solid arrow in a block indicates that the block is occupied by immature fruit and a blank arrow indicates an unoccupied block; here Nh is 5, greater than the predetermined threshold Th = 4, so a single pushing operation is selected to push the immature fruit aside.
when the single pushing operation moves towards the immature fruit, the direction of the pushing operation of the gripper is calculated based on the position of the occupied zone according to the following formula:
Ds = r · (ΣOi) / ‖ΣOi‖ (summing over i = 1…n), where Oi is the vector of the i-th occupied block in the largest set of adjacent occupied blocks and n is the total number of blocks in that set. The parameter r scales the norm of Ds and should ensure that the gripper clears the outside of the block, r = 50 mm.
The arrow in fig. 5a shows the calculated pushing direction for a single pushing operation; the gripper moves from the center of the unoccupied blocks toward the center of the occupied blocks, giving it the highest likelihood of pushing all the immature fruit aside.
Ds = 0 if only the center block C_C is occupied; the direction in which the gripper must move to push the immature fruit is then determined by calculating the shortest path from the gripper's current position to the center of the center block C_C.
If no immature fruit is detected in any block, the gripper does not push at this stage and moves straight up from below.
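A sketch of the single-push direction under the stated definitions (the Oi point from each occupied block toward the center block, r = 50 mm). Since the published formula image is not reproduced in this text, the sign convention here, moving from the unoccupied side toward the occupied blocks, is an assumption:

```python
import numpy as np

def single_push_direction(occupied_vectors, r=0.05):
    """Direction D_s of the single push, scaled so that ||D_s|| = r.
    occupied_vectors are the block-to-center vectors O_i of the largest
    set of adjacent occupied blocks.  Moving along -sum(O_i) carries the
    gripper from the free side through the center into the occupied
    blocks; the patent's exact sign is not recoverable here."""
    o = np.sum(np.asarray(occupied_vectors, dtype=float), axis=0)
    norm = np.linalg.norm(o)
    if norm == 0:               # e.g. only the center block occupied: D_s = 0
        return np.zeros(2)
    return -r * o / norm
```

With a single occupied block to the east (vector pointing west toward the center), the push is 50 mm eastward.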
If the number Nh of blocks adjacent to the center block that contain no immature fruit is less than the threshold Th, the gripper adopts a horizontal serpentine pushing operation. Fig. 5b shows an example of the path computation, where a serpentine operation is selected to push the immature fruit from side to side; the red arrow is the overall direction of the operation, and the blue arrow is the serpentine path. Since the serpentine operation involves movement in three directions, forward, left and right, the gripper can push the immature fruit out in those three directions.
The overall direction of the serpentine pushing operation is calculated based on the location of the unoccupied blocks according to the following formula:
Dz = (ΣUj) / ‖ΣUj‖ (summing over j = 1…m), where Uj is the vector of the j-th unoccupied block within the largest set of adjacent unoccupied blocks and m is the total number of blocks in that set. During a horizontal serpentine pushing operation the device moves in the xy plane, where the resultant vector of the serpentine motion equals Dz; the amplitude ah and the number of pushes Nhp of the serpentine motion are determined by the particular grasping scenario. For example, the effectiveness of these values may be affected by the stem length, fruit weight or fruit damping ratio, which are difficult to calculate; here ah = 20 mm and Nhp = 5.
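The serpentine motion itself can be sketched as a waypoint generator whose resultant equals the overall direction Dz; ah = 20 mm and Nhp = 5 follow the text, while the zig-zag construction below is an assumption:

```python
import numpy as np

def serpentine_waypoints(d_overall, amplitude=0.02, n_pushes=5):
    """Waypoints of a horizontal serpentine push in the xy plane whose
    resultant equals the overall direction D_z (assumed non-zero).
    amplitude a_h = 20 mm and n_pushes N_hp = 5 follow the text; the
    alternating lateral offsets are an illustrative construction."""
    d = np.asarray(d_overall, dtype=float)
    step = d / n_pushes
    perp = np.array([-d[1], d[0]])             # lateral direction in xy
    perp = amplitude * perp / np.linalg.norm(perp)
    pts, pos = [], np.zeros(2)
    for k in range(n_pushes):
        pos = pos + step
        pts.append(pos + (perp if k % 2 == 0 else -perp))
    pts.append(d.copy())                       # finish on the resultant
    return np.array(pts)
```

Each lateral excursion pushes immature fruit left or right while the net motion still follows Dz.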
In particular, the second stage surrounds the target fruit in the upper middle layer 7 and lower middle layer 8 and clears the immature fruit there.
As shown in fig. 6, the upward serpentine pushing action used in the upper middle layer 7 and lower middle layer 8 combines movement of the gripper in a roughly vertical direction toward the target fruit with side-to-side movement past the immature fruit. The vertical direction passes through the center of the target fruit. The direction Du_z of the upward push in the xy plane is calculated based on the number Nu of blocks adjacent to the center block that contain no immature fruit. If Nu is greater than the threshold Th, the direction Du_z is calculated from the occupied blocks, as in the single pushing operation in the bottom layer 9.
where Au is a parameter for scaling the norm of Du_z; here Au = 5 mm. If Nu is less than the threshold Th, as shown in fig. 7a, the calculation instead uses the unoccupied blocks, according to the following formula:
where M is an intermediate vector used in computing Du_z. In fig. 7a, the gripper moves along Du_z and -Du_z to push the immature fruit apart to the sides. The front view in fig. 7b shows the gripper moving upward stepwise at the left and right intermediate points to traverse the lower middle layer 8 and the upper middle layer 7. The number of pushes Nup in each layer is set to 5.
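Because the push-direction formulas themselves appear only as images in the source, the following is a plausible sketch rather than the patent's exact formula: the direction is taken as the normalised mean of the selected block-centre vectors, scaled by the stated parameter (r or Au), with the occupied blocks used when Nu exceeds Th and the unoccupied blocks otherwise. The block coordinates and helper names are hypothetical.

```python
import math

def push_direction(blocks, scale):
    """Normalised mean of the given block-centre xy vectors, scaled
    to `scale` mm. A plausible reading of the image-only formulas."""
    if not blocks:
        return (0.0, 0.0)
    mx = sum(b[0] for b in blocks) / len(blocks)
    my = sum(b[1] for b in blocks) / len(blocks)
    n = math.hypot(mx, my)
    if n == 0:
        return (0.0, 0.0)
    return (scale * mx / n, scale * my / n)

def upward_push_direction(occupied, unoccupied, th=4, au=5.0):
    """Choose Du_z from occupied or unoccupied blocks, depending on
    whether the count Nu of fruit-free adjacent blocks exceeds Th."""
    nu = len(unoccupied)
    source = occupied if nu > th else unoccupied
    return push_direction(source, au)
```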
Specifically, during the third stage, if the center block C_C in the top layer is occupied, the gripper can drag the target fruit to a gripping position with less immature fruit.
As shown in fig. 8a, when there is immature fruit above the top layer 6 of the target fruit, the gripper may wrap the immature fruit, or damage it, when moving upward to catch the target fruit. In addition, the immature fruit may prevent the wrapping sheet from closing, making it impossible to cut the stem of the target fruit.
During the third stage, a drag operation is employed that allows the gripper to grasp the target fruit without capturing unwanted immature fruit.
As shown in fig. 8, the dragging operation includes an upward dragging step that moves the target fruit to an area containing less immature fruit, and an upward push-back step, shown in fig. 8c, that pushes the upper immature fruit away before the fingers close. The push-back step is necessary because, in the pulled position shown in fig. 8b, the stem of the target fruit is inclined, so the fruit is held by static forces, is difficult to detach, and is prone to damage when the gripper moves further up toward the cutting position.
The drag operation is performed only when there is immature fruit in the center block C_C of the top layer. If the center block C_C is unoccupied, the gripper moves directly upward and grasps the target fruit. Fig. 9 illustrates the calculation of the drag operation corresponding to fig. 8. As shown in fig. 9a, to avoid collision between the gripper and the table, the three blocks near the table, L_R, C_R and R_R, are skipped when calculating the drag direction. The drag direction D_dr in the xy plane is then determined according to the following equation:
where U_j is the vector of the j-th unoccupied block within the largest group of adjacent unoccupied blocks. The blocks used for the calculation are L_C, L_F, C_F, R_F and R_C. The parameter m is the total number of blocks within the largest group of adjacent unoccupied blocks. D_dr is scaled to the length l, where l = 50 mm. These blocks typically contain less immature fruit, but if all of them are occupied by immature fruit, the drag direction is aligned with C_F. Fig. 9b shows the drag and push-back steps, in which both operations move up the same height in the vertical direction.
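A minimal sketch of the drag-direction rule described above, again under the assumption that the direction is the normalised mean of the unoccupied block-centre vectors (the formula itself is only an image in the source). The block coordinates and the `BLOCKS` table are hypothetical values chosen for illustration; the 50 mm scale and the C_F fallback come from the text.

```python
import math

# Hypothetical xy centres (arbitrary units) of the five top-layer
# blocks used for the drag calculation; L_*/C_*/R_* follows the figures.
BLOCKS = {"L_C": (-1.0, 0.0), "L_F": (-1.0, 1.0), "C_F": (0.0, 1.0),
          "R_F": (1.0, 1.0), "R_C": (1.0, 0.0)}

def drag_direction(occupied_names, l=50.0):
    """Drag toward the mean of the unoccupied blocks, scaled to l mm.

    Falls back to the C_F direction when every block is occupied by
    immature fruit, as the patent describes.
    """
    free = [v for k, v in BLOCKS.items() if k not in occupied_names]
    if not free:
        free = [BLOCKS["C_F"]]  # all occupied: align with C_F
    mx = sum(v[0] for v in free) / len(free)
    my = sum(v[1] for v in free) / len(free)
    n = math.hypot(mx, my)
    return (l * mx / n, l * my / n)
```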
The construction of the convolutional neural network comprises the following steps:
step 1, data acquisition:
a rough, generalized fruit recognition model is built, and images of fruit on branches are captured under the natural conditions of an orchard. Image capture is not constrained in any way: lighting conditions, shooting angle, distance to the fruit and other conditions are unrestricted.
Step 2, data preparation:
one major drawback of deep neural networks is that they rely heavily on large amounts of labeled data to achieve good accuracy. Such large datasets allow the training phase to learn all embedded parameters and minimize the risk of network overfitting. Preparing this many images, however, is laborious, expensive and time-consuming.
More training data can be created from existing samples through data augmentation, which effectively mitigates overfitting: transformations are applied to the original image so that each new image still has the characteristics of the original and is visually categorized into the same class. This increases the generality of the model, since the model is never exposed to exactly the same picture twice. In this study, an automatic data augmentation method, including image cropping, horizontal flipping, rotation and brightness operations, was applied to generate 16 images from each original image. After reviewing the generated images and deleting invalid ones (for example, crops taken from non-fruit areas), the total number of images in the dataset, including the original data, is obtained.
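A toy illustration of the 16-from-1 augmentation scheme, assuming grey-level images held as nested lists. The specific transform combinations (two flips, two rotations, two crops, two brightness levels) are an assumption chosen simply to yield 16 variants, since the patent does not give the exact recipe.

```python
def augment(img):
    """Generate 16 variants of one image (a 2-D list of grey values):
    2 flips x 2 rotations x 2 crops x 2 brightness levels."""
    def hflip(m):
        return [row[::-1] for row in m]

    def rot180(m):
        return [row[::-1] for row in m[::-1]]

    def crop(m):
        return [row[1:] for row in m[1:]]  # drop first row and column

    def brighten(m, f):
        return [[min(255, int(v * f)) for v in row] for row in m]

    out = []
    for f in (lambda m: m, hflip):
        for r in (lambda m: m, rot180):
            for c in (lambda m: m, crop):
                for b in (1.0, 1.2):
                    out.append(brighten(c(r(f(img))), b))
    return out
```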
The augmentation is performed before the data is loaded into the network, for two reasons: first, the augmented images can easily be inspected for any anomalous results; second, the load on the model is smaller, which reduces training time. The augmented images are then resized so that all input images have the same resolution.
The prepared dataset is split into two subsets for training and testing: most of the data is randomly selected for training, and the remainder is used for testing.
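The split described here, using the 80/20 proportions (plus a 10% validation hold-out) given later in step 4, might be sketched as follows; the fixed seed and helper name are illustrative only.

```python
import random

def split_dataset(samples, test_frac=0.2, val_frac=0.1, seed=42):
    """80/20 train/test split, then 10% of the training set held out
    as validation, matching the proportions stated in step 4."""
    rng = random.Random(seed)
    idx = list(range(len(samples)))
    rng.shuffle(idx)  # random selection of the subsets
    n_test = int(len(samples) * test_frac)
    test = [samples[i] for i in idx[:n_test]]
    train = [samples[i] for i in idx[n_test:]]
    n_val = int(len(train) * val_frac)
    val, train = train[:n_val], train[n_val:]
    return train, val, test
```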
Step 3, constructing a convolutional neural network structure:
a convolutional neural network is a type of deep network that can automatically extract features from RGB images and classify them. It is characterized by convolution operations, pooling layers and nonlinear activation functions. The general topology of a deep convolutional neural network comprises a series of convolutional and pooling layers followed by some fully connected layers. The network structure used here has three convolutional layers, three pooling layers and two fully connected layers; recognition is very fast and training requires little memory.
Convolutional layers: convolutional networks can learn translation-invariant features and spatial hierarchies, so they can recognize a learned pattern anywhere in an image and learn increasingly complex patterns through successive layers. Convolutional networks are typically composed of three types of layers: convolutional layers, pooling layers and fully connected layers.
A convolutional layer is characterized by two parameters: the size of the filters and the number of filters. All three convolutional layers use 3 x 3 filters, with 16, 32 and 64 filters, respectively.
To reduce the size of the feature maps, a max-pooling layer follows each convolutional layer. A max-pooling layer has no trainable parameters; it simply reduces the number of features by selecting the maximum value in each window and discarding the rest. The first pooling layer uses 4 x 4 windows, and the second and third pooling layers use 2 x 2 windows.
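As a rough sanity check of this topology, the feature-map sizes and trainable parameter counts implied by the stated layer settings can be worked through in plain Python. The 128 x 128 RGB input size and the 'same' convolution padding are assumptions (the patent does not state them); only the filter counts (16, 32, 64) and pooling windows (4 x 4, 2 x 2, 2 x 2) come from the text.

```python
def conv_params(in_ch, out_ch, k=3):
    """Trainable parameters of a k x k convolution (weights + biases)."""
    return (k * k * in_ch + 1) * out_ch

# Walk the feature-map side length through the three conv + max-pool
# stages; max pooling contributes no trainable parameters.
side = 128      # assumed input resolution
channels = 3    # RGB input
total = 0
for filters, pool in ((16, 4), (32, 2), (64, 2)):
    total += conv_params(channels, filters)  # 3x3 conv, 'same' padding
    side //= pool                            # pooling shrinks the map
    channels = filters

# Global average pooling then collapses each remaining map to one
# value, giving `channels` features for the classifier.
```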
The convolution operation is followed by a rectification step, which breaks the linearity of the input by outputting only non-negative values. In the network, all convolutional layers and the first fully connected layer use the rectified linear unit (ReLU) as the activation function. The rectification function is f(x) = max(0, x).
A softmax activation function is applied at the last layer of the model: softmax(z)_j = exp(z_j) / sum_{k=1..K} exp(z_k), where z is a vector of K inputs and j indexes the output unit. This activation function is necessary for multi-class, single-label classification, as it normalizes the inputs into a probability distribution.
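The final-layer activation described here is the standard softmax; a minimal implementation:

```python
import math

def softmax(z):
    """Normalise K inputs into a probability distribution.

    Subtracting max(z) first keeps exp() numerically stable without
    changing the result.
    """
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]
```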
Before the classification stage, a global average pooling layer is employed. The global average pooling layer contains no trainable parameters; it noticeably reduces the parameter count and improves model accuracy, thereby improving the robustness of the model. It outputs the average of each feature map in the previous layer and replaces an embedded flatten layer. The global average pooling layer is also used to compute class activation maps. A class activation map lets the convolutional neural network identify the regions of an image associated with a particular class. The class activation map for a class is obtained by weighting each feature map of the last convolutional layer by its assigned weight and summing. The formula for the class activation map is:
Mc(x, y) = sum_k w_k^c f_k(x, y), where Mc is the class activation map of class c, w_k^c is the k-th weight corresponding to class c, and f_k(x, y) is the k-th feature map of the last convolutional layer.
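This weighted sum over feature maps can be computed directly; a minimal sketch with the feature maps held as nested lists:

```python
def class_activation_map(feature_maps, weights):
    """Compute Mc(x, y) = sum_k w_k^c * f_k(x, y).

    `feature_maps` is a list of K equal-sized 2-D maps from the last
    convolutional layer; `weights` holds the K weights w_k^c for
    class c.
    """
    h, w = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * w for _ in range(h)]
    for fk, wk in zip(feature_maps, weights):
        for y in range(h):
            for x in range(w):
                cam[y][x] += wk * fk[y][x]
    return cam
```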
All filtered features in the convolutional part of the network are encoded as input data for the fully connected classifier. A fully connected layer connects every neuron of the previous layer to the current layer through weights. The classification stage of the current model consists of two fully connected layers. The convolutional neural network predicts the class of an input image with a certain probability, and the error of this process is measured with a loss function. A categorical cross-entropy loss is used to evaluate the accuracy of the proposed model; it minimizes the difference between the predicted probability distribution and the actual distribution of the target.
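The categorical cross-entropy loss mentioned above, for a one-hot target and a predicted probability distribution, sketched minimally:

```python
import math

def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Loss between a one-hot target distribution and the predicted
    probability distribution; minimised during training. `eps` guards
    against log(0)."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))
```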
Step 4, network optimization:
the network is configured to load input images with their associated labels. The input images are divided into training and test data: 80% are used for training and the remaining 20% for testing. 10% of the training dataset is used as the validation dataset.
Increasing the network depth can improve overall performance, which is highest when the number of training samples is proportional to the network capacity. Three convolutional layers performed best, and this structure was further optimized. The optimization process of the network was evaluated using different optimizers.
A robust model is built on a deep convolutional neural network that can identify multiple categories of fruit on branches from RGB images. The model consists of three convolutional layers and three max-pooling layers, followed by a global average pooling layer and two fully connected layers. Using a global average pooling layer eliminates the need for a flatten layer, improves accuracy on unseen data, increases the classification scores, and reduces the total number of trainable parameters, making processing faster. The network achieves a high fruit recognition rate and classification accuracy with fast response, is unaffected by natural conditions, and requires little computation. Using this deep convolutional neural network, the fruit-picking robot can rapidly and accurately recognize target fruits and regions of interest, ensuring that the fewest fruits are missed and the yield is highest.
According to the control method of the picking device based on 3D visual perception, immature fruit is actively separated from the target based on visual perception. A single pushing operation, or a serpentine pushing operation composed of several linear pushes, is selected according to the distribution of immature fruit around the target fruit, so that immature fruit below the target and at the same height as the target is pushed aside. Because the pushing is multi-directional, denser immature fruit can be handled, and the resulting side-to-side movement breaks the static contact forces between the target fruit and the immature fruit, making it easier for the gripper to receive the target fruit. A dragging operation is then performed, which both avoids the immature fruit and actively pushes it away, to address the problem of erroneously capturing immature fruit above the target fruit: the gripper pulls the target fruit to a location with less immature fruit and then pushes back to move the immature fruit aside. This significantly improves picking performance, avoids damage to the target fruit and the immature fruit, and greatly improves picking efficiency.
Finally, it is noted that the above embodiments are only intended to illustrate the technical solution of the present invention and not to limit it. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art should understand that the technical solution may be modified or substituted without departing from its spirit and scope, and such modifications are intended to be covered by the scope of the present invention.
Claims (3)
1. A method for controlling a picking device based on 3D visual perception, comprising the following steps:
s1, a control device controls a picking device to walk along the lower part of a fruit support to be picked;
s2, the control device identifies the target fruit through the image acquisition device of the picking device,
s3, the image acquisition device removes noise points from the background through 3D color threshold processing;
s4, the control device uses deep learning to detect and position mature fruits;
s5, selecting a region of interest around the target fruit to determine the existence of immature fruit;
s6, calculating picking paths based on the distribution and the quantity of the immature fruits around the target fruits;
wherein, in step S2, adjacent noise points are removed using hue, saturation and intensity color threshold processing;
in step S4, it includes:
s41, identifying and segmenting the pixel-level object by using a segmented convolutional neural network; creating a number of masks for the ripe fruit through the network, wherein one mask represents the detected target fruit; by matching with the depth image, projecting the mask into 3D points, and obtaining the 3D position of the target fruit in the camera frame coordinates;
s42, transforming coordinates from a camera frame to a device arm frame based on a camera external calibration device;
in step S5, bounding boxes of the blocks in each region of interest are used in the point cloud library, and the immature fruit in the corresponding blocks is identified and counted;
wherein the region of interest is a region comprising a 3D point cloud of the target fruit and potentially one or more immature fruits; the region of interest is divided into four layers: a top layer (6), an upper middle layer (7), a lower middle layer (8) and a bottom layer (9); each layer of the region of interest is divided into nine cube blocks forming a 3 x 3 grid whose center is positioned at the horizontal midpoint of the target fruit, so that the center block C_C in the xy plane surrounds the target fruit; the length and width of the eight outer peripheral blocks are equal to the length and width of the central block; in the front and left side views, the heights of the top layer (6) and the bottom layer (9) are equal to one and two times, respectively, the sum of the heights of the upper middle layer (7) and the lower middle layer (8); the gripper (4) moves upward to separate the immature fruit around the target fruit in the upper middle layer (7) and the lower middle layer (8), and the distribution of the immature fruit in the upper middle layer (7) and the lower middle layer (8) can change along the height direction;
the gripper (4) operates in three different stages: in the first stage, the gripper (4) grasps from below, horizontally separating the immature fruit in the bottom layer (9); during the second stage, the gripper (4) moves upward to enclose the target fruit and separate the immature fruit within the upper (7) and lower (8) middle layers; during the third stage, if the center block C_C in the top layer (6) is occupied, the gripper (4) can drag the target fruit to a gripping position with less immature fruit;
the first stage is to horizontally separate the immature fruit below the target fruit in the bottom layer (9), using the number Nh of blocks adjacent to the center block that contain no immature fruit to determine whether to use a single push operation or a serpentine operation;
ignoring the center block, a solid arrow in a block indicates that the block is occupied by immature fruit, and a blank arrow indicates that the block is unoccupied; Nh is 5, greater than the predetermined threshold Th = 4, so a single push operation is selected to push the immature fruit aside; the single push operation moves toward the immature fruit, and the direction of the gripper's pushing operation is calculated based on the positions of the occupied blocks according to the following formula:
where O_i is the vector of the i-th occupied block in the largest set of adjacent occupied blocks, and n is the total number of blocks in that set; the parameter r is used to scale the norm of Ds and should ensure that the gripper clears the outside of the blocks; r = 50 mm;
the gripper moves from the center of the unoccupied blocks toward the center of the occupied blocks, so that it has the highest likelihood of pushing all of the immature fruit aside;
ds=0 if only the center block Cc is occupied; determining the direction in which the gripper must move to push the immature fruit by calculating the shortest path from the current position of the gripper to the centre of the centre section CC; if no immature fruit is detected in the block, the gripper is not pushed in this stage and moves straight up from below;
if the number Nh of blocks adjacent to the center block that contain no immature fruit is smaller than the threshold Th, the gripper performs a horizontal serpentine push; the serpentine operation involves movement in three directions (forward, left and right), with the gripper pushing out the immature fruit in all three directions; the overall direction of the serpentine pushing operation is calculated based on the locations of the unoccupied blocks according to the following formula:
where U_j is the vector of the j-th unoccupied block within the largest set of adjacent unoccupied blocks, and m is the total number of blocks within that set; during a horizontal serpentine pushing operation, the device moves in the xy plane, where the resultant vector of the serpentine motion is equal to Dz, and the amplitude Ah and number of pushes Nhp of the serpentine motion are determined according to the particular grasping scenario.
2. The control method of a picking device based on 3D visual perception according to claim 1, characterized in that the second stage is to enclose the target fruit within the upper middle layer (7) and the lower middle layer (8) and to separate the immature fruit in the middle layers; the upward serpentine pushing action employed in the upper (7) and lower (8) middle layers includes movement of the gripper in a substantially vertical direction toward the target fruit combined with side-to-side movement to separate the immature fruit; the vertical direction passes through the center of the target fruit; the direction of the upward push Du_z in the xy plane is calculated based on the number Nu of blocks adjacent to the center block that contain no immature fruit; if Nu is greater than the threshold Th, the direction Du_z is calculated from the occupied blocks according to the following formula, as in the single push operation in the bottom layer (9):
where Au is a parameter for scaling the norm of Du_z; here Au = 5 mm;
if Nu is less than the threshold Th, the calculation instead uses the unoccupied blocks, according to the following formula:
where M is an intermediate vector used in computing Du_z; the gripper moves along Du_z and -Du_z to push the immature fruit apart to the sides.
3. The control method of a picking device based on 3D visual perception according to claim 2, characterized in that, during the third stage, if the center block C_C in the top layer is occupied, the gripper can drag the target fruit to a gripping position with less immature fruit;
the drag operation is performed only when there is immature fruit in the center block C_C of the top layer; if the center block C_C is unoccupied, the gripper moves directly upward to grasp the target fruit; to avoid collision between the gripper and the table, the three blocks near the table, L_R, C_R and R_R, are skipped when calculating the drag direction; the drag direction D_dr in the xy plane is then determined according to the following equation:
where U_j is the vector of the j-th unoccupied block within the largest group of adjacent unoccupied blocks; the blocks used for the calculation are L_C, L_F, C_F, R_F and R_C; the parameter m is the total number of blocks within the largest group of adjacent unoccupied blocks; D_dr is scaled to the length l, where l = 50 mm; these blocks usually contain less immature fruit, but if all of them are occupied by immature fruit, the drag direction is aligned with C_F; the drag and push-back operations move up the same height in the vertical direction.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011414617.XA CN112528826B (en) | 2020-12-04 | 2020-12-04 | Control method of picking device based on 3D visual perception |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112528826A CN112528826A (en) | 2021-03-19 |
CN112528826B true CN112528826B (en) | 2024-02-02 |
Family
ID=74997805
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011414617.XA Active CN112528826B (en) | 2020-12-04 | 2020-12-04 | Control method of picking device based on 3D visual perception |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112528826B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109588114A (en) * | 2018-12-20 | 2019-04-09 | 武汉科技大学 | A kind of parallelism recognition picker system and method applied to fruit picking robot |
CN110033487A (en) * | 2019-02-25 | 2019-07-19 | 上海交通大学 | Vegetables and fruits collecting method is blocked based on depth association perception algorithm |
WO2019144575A1 (en) * | 2018-01-24 | 2019-08-01 | 中山大学 | Fast pedestrian detection method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||