CN112528826B - Control method of picking device based on 3D visual perception - Google Patents

Control method of picking device based on 3D visual perception

Info

Publication number: CN112528826B (application number CN202011414617.XA)
Authority: CN (China)
Prior art keywords: fruit, immature, block, target, blocks
Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Other versions: CN112528826A (in Chinese)
Inventors: 唐玉新, 唐双凌, 徐陶
Applicant and current assignee: Jiangsu Academy of Agricultural Sciences
History: application CN202011414617.XA filed by Jiangsu Academy of Agricultural Sciences; published as CN112528826A; granted as CN112528826B
Classifications

    • G06V20/10 Image or video recognition: scenes; terrestrial scenes
    • G06F18/214 Pattern recognition: generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F18/22 Pattern recognition: matching criteria, e.g. proximity measures
    • G06F18/24 Pattern recognition: classification techniques
    • G06N3/045 Neural networks: combinations of networks
    • G06N3/08 Neural networks: learning methods
    • G06V10/25 Image preprocessing: determination of region of interest [ROI] or volume of interest [VOI]
    • G06V10/30 Image preprocessing: noise filtering
    • G06V20/68 Scene-specific elements: food, e.g. fruit or vegetables
    • Y02P90/02 Climate change mitigation in production: total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]


Abstract

The invention provides a control method of a picking device based on 3D visual perception, comprising the following steps: S1, a control device controls the picking device to travel beneath a fruit support to be picked; S2, the control device identifies target fruits through an image acquisition device of the picking device; S3, the image acquisition device removes noise points from the background through 3D color threshold processing; S4, the control device detects and locates mature fruits using deep learning; S5, a region of interest is selected around the target fruit to determine whether immature fruits are present; S6, a picking path is calculated based on the distribution and number of the immature fruits around the target fruit. Based on visual perception, the invention actively separates immature fruits from the target, and selects either a single pushing operation or a serpentine pushing operation composed of several linear pushes according to the distribution of immature fruits around the target fruit, so as to move aside the immature fruits below the target fruit and at the same height as the target, thereby remarkably improving picking performance and greatly improving picking efficiency.

Description

Control method of picking device based on 3D visual perception
Technical Field
The application relates to the field of intelligent robots, in particular to a control method of a picking device based on 3D visual perception.
Background
The prevailing fruit picking mode in China is manual picking, in which labor accounts for 50%-70% of the total cost of fruits and vegetables; it is costly, inefficient, and ill-suited to work at height. A fruit picking robot is a device that replaces manual labor and picks fruit automatically. At present, domestic fruit picking robots are still at an early stage of development, and most cannot meet fruit farmers' picking requirements. Current end effectors have no buffering stage when clamping fruit, and the tender fruit is easily bruised. In addition, because fruits on the same plant are at different maturity stages, a fruit picking robot must harvest selectively. Existing fruit picking robots have great difficulty identifying and picking a single mature fruit without accidentally damaging, or accidentally picking, immature fruit. There are methods that explore the end effector's search space for viable trajectories through image recognition and search algorithms, where each step of the trajectory is planned with a collision detector. However, most such methods are passive: they aim to avoid immature fruits or other plant parts without changing the environment. Immature fruits are not always avoidable, and when the target fruit is completely surrounded by immature fruits, the end effector may be unable to find any path that avoids them all and still picks the target.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a control method of a picking device based on 3D visual perception.
The invention discloses a control method of a picking device based on 3D visual perception, characterized by comprising the following steps:
s1, a control device controls a picking device to walk along the lower part of a fruit support to be picked;
s2, the control device identifies the target fruit through the image acquisition device of the picking device;
s3, the image acquisition device removes noise points from the background through 3D color threshold processing;
s4, the control device uses deep learning to detect and position mature fruits;
s5, selecting a region of interest around the target fruit to determine the existence of immature fruit;
s6, calculating picking paths based on the distribution and the quantity of the immature fruits around the target fruits.
Wherein, in step S3, adjacent noise points are removed by using hue, saturation and intensity color threshold processing;
step S4 includes:
s41, identifying and segmenting pixel-level objects using a segmentation convolutional neural network; through the network, several masks are created for the mature fruits, one of which represents the detected target fruit; by matching with the depth image, the mask is projected into 3D points and the 3D position of the target fruit in camera-frame coordinates is obtained;
s42, transforming coordinates from the camera frame to the picking device arm frame based on the camera external calibration device;
in step S5, the bounding box of each block in the region of interest is cut out of the point cloud using the Point Cloud Library, and immature-fruit identification and calculation are performed on the corresponding block.
Wherein the region of interest is a region comprising the 3D point cloud of the target fruit and potentially one or more immature fruits. The region of interest is divided into four layers: a top layer, an upper middle layer, a lower middle layer and a bottom layer. Each layer of the region of interest is divided into nine cube blocks; the blocks form a 3 x 3 grid whose center is located at the horizontal midpoint of the target fruit, so that in the xy plane the center block C_C surrounds the target fruit. The length and width of the eight peripheral blocks are equal to those of the center block. In the front and left side views, the heights of the top and bottom layers are equal to one and two times, respectively, the sum of the heights of the upper and lower middle layers. The gripper moves upward to separate the immature fruit around the target fruit in the upper and lower middle layers, whose distribution can vary in the height direction.
In particular, the gripper operates in three distinct stages: in the first stage, the gripper approaches from below, moving the immature fruit in the bottom layer horizontally; in the second stage, the gripper moves upward to surround the target fruit and separate the immature fruit in the upper and lower middle layers; in the third stage, if the center block C_C in the top layer is occupied, the gripper can drag the target fruit to a gripping position with fewer immature fruits.
Specifically, the first stage separates the immature fruit below the target fruit horizontally in the bottom layer, using the number N_h of blocks adjacent to the center block that contain no immature fruit to determine whether to use a single pushing operation or a serpentine operation;
ignoring the center block, a solid arrow in a block indicates that the block is occupied by immature fruit, and a blank arrow indicates that the block is unoccupied; when N_h is 5 and greater than the predetermined threshold Th = 4, a single pushing operation is selected to push the immature fruit aside; when the single pushing operation moves toward the immature fruit, the direction of the gripper's pushing operation is calculated based on the positions of the occupied blocks according to the following formula:

D_s = r · (Σ_{i=1}^{n} O_i) / ‖Σ_{i=1}^{n} O_i‖

where O_i is the vector of the i-th occupied block in the largest set of adjacent occupied blocks and n is the total number of blocks in that set; the parameter r is used to scale the norm of D_s and should ensure that the gripper clears the outside of the blocks, r = 50 mm;
the gripper moves from the center of the unoccupied block to the center of the occupied block so that the gripper has the highest likelihood of pushing all blocks apart;
if only the center block C_C is occupied, D_s = 0; the direction in which the gripper must move to push the immature fruit is then determined by calculating the shortest path from the gripper's current position to the center of the center block C_C. If no immature fruit is detected in any block, the gripper performs no push at this stage and moves straight up from below.
If the number N_h of blocks adjacent to the center block that contain no immature fruit is smaller than the threshold Th, the gripper adopts a horizontal serpentine push; the serpentine operation involves movement in three directions (forward, left and right), with the gripper pushing out the immature fruit in those three directions; the overall direction of the serpentine pushing operation is calculated based on the locations of the unoccupied blocks according to the following formula:

D_z = r · (Σ_{j=1}^{m} U_j) / ‖Σ_{j=1}^{m} U_j‖

where U_j is the vector of the j-th unoccupied block within the largest set of adjacent unoccupied blocks, and m is the total number of blocks within that set; during a horizontal serpentine pushing operation, the device moves in the xy plane, where the resultant vector of the serpentine motion is equal to D_z, and the amplitude a_h and the number of pushes N_hp of the serpentine motion are determined according to the particular grasping scenario.
Specifically, the second stage surrounds the target fruit in the upper and lower middle layers and separates the immature fruit within those layers; the upward serpentine pushing action employed in the upper and lower middle layers consists of movement of the gripper in a roughly vertical direction toward the target fruit, combined with side-to-side movement to traverse the immature fruit; the vertical direction passes through the center of the target fruit. The direction of the upward push D_u,z in the xy plane is calculated based on the number N_u of blocks adjacent to the center block that contain no immature fruit; if N_u is greater than the threshold Th, the direction D_u,z is calculated from the occupied blocks, as in the single pushing operation in the bottom layer 9:

D_u,z = a_u · (Σ_{i=1}^{n} O_i) / ‖Σ_{i=1}^{n} O_i‖

where a_u is a parameter for scaling the norm of D_u,z; here a_u = 5 mm;
if N_u is less than the threshold Th, the calculation uses the unoccupied blocks instead:

M = Σ_{j=1}^{m} U_j,   D_u,z = a_u · M / ‖M‖

where M is an intermediate vector in the calculation of D_u,z. The gripper moves along D_u,z and -D_u,z to push the immature fruit apart on either side.
Specifically, in the third stage, if the center block C_C in the top layer is occupied, the gripper can drag the target fruit to a gripping position with fewer immature fruits;
the drag operation is performed only when there is immature fruit in the center block C_C of the top layer. If the center block C_C is unoccupied, the gripper moves straight up to grasp the target fruit; to avoid collision between the gripper and the table, the three blocks close to the table, L_R, C_R and R_R, are skipped when calculating the drag direction. The drag direction D_dr in the xy plane is determined according to the following equation:

D_dr = l · (Σ_{j=1}^{m} U_j) / ‖Σ_{j=1}^{m} U_j‖

where U_j is the vector of the j-th unoccupied block within the largest group of adjacent unoccupied blocks; the blocks used for the calculation are L_C, L_F, C_F, R_F and R_C, and m is the total number of blocks within that group. D_dr is scaled to l, where l = 50 mm. There is usually less immature fruit in this direction, but if all of the blocks are occupied by immature fruit, the drag direction is aligned with C_F; the drag and push-back operations move up the same height in the vertical direction.
According to the control method of the picking device based on 3D visual perception of the present invention, immature fruits are actively separated from the target based on visual perception, and a single pushing operation or a serpentine pushing operation composed of several linear pushes is selected according to the distribution of immature fruits around the target fruit, so as to move aside the immature fruits below the target fruit and at the same height as the target. Because the pushing is multi-directional, denser immature fruit can be handled, and the resulting side-to-side movement can break the static contact forces between the target fruit and the immature fruit, so that the gripper can receive the target fruit more easily. A drag operation is then used that both avoids the immature fruit and actively pushes it away, to address the problem of erroneously capturing immature fruit above the target fruit: the gripper pulls the target fruit to a location with less immature fruit and then pushes it back upward to move the immature fruit aside for further separation. This significantly improves picking performance, avoids damage to both the target fruit and the immature fruit, and greatly improves picking efficiency.
Drawings
Fig. 1 is a flowchart of a control method of a picking device based on 3D visual perception according to the present invention.
Fig. 2 is a flow chart of a control method of a picking device based on 3D visual perception according to the present invention.
Fig. 3a-3c are schematic image recognition diagrams of a control method of a picking device based on 3D visual perception according to the present invention.
Fig. 4a-4c are schematic views of a picking process of a method of controlling a picking device based on 3D visual perception according to the present invention.
Fig. 5a-5b are schematic views of a picking process of a method of controlling a picking device based on 3D visual perception according to the present invention.
Fig. 6a-6d are schematic views of a picking process of a method of controlling a picking device based on 3D visual perception according to the present invention.
Fig. 7a-7b are schematic views of a picking process of a method of controlling a picking device based on 3D visual perception according to the present invention.
Fig. 8a-8d are schematic views of a picking process of a method of controlling a picking device based on 3D visual perception according to the present invention.
Fig. 9a-9b are schematic views of a picking process of a method of controlling a picking device based on 3D visual perception according to the present invention.
Detailed Description
Other advantages and effects of the present invention will become readily apparent to those skilled in the art from the following disclosure, which describes embodiments of the invention with reference to specific examples. The invention may also be practiced or applied in other, different embodiments, and the details of this description may be modified or varied in various respects without departing from the spirit and scope of the invention.
As shown in fig. 1-2, a method for controlling a picking device based on 3D visual perception comprises the following steps:
s1, a control device controls a picking device to walk along the lower part of a fruit support to be picked;
s2, the control device identifies the target fruit through the image acquisition device of the picking device;
s3, the image acquisition device removes noise points from the background through 3D color threshold processing;
s4, the control device uses deep learning to detect and position mature fruits;
s5, selecting a region of interest around the target fruit to determine the existence of immature fruit;
s6, calculating picking paths based on the distribution and the quantity of the immature fruits around the target fruits.
Wherein, in step S3, adjacent noise points are removed by using hue, saturation and intensity color threshold processing;
step S4 includes:
s41, identifying and segmenting pixel-level objects using a segmentation convolutional neural network; through the network, several masks are created for the mature fruits, one of which represents the detected target fruit; by matching with the depth image, the mask is projected into 3D points and the 3D position of the target fruit in camera-frame coordinates is obtained;
s42, transforming coordinates from the camera frame to the picking device arm frame based on the camera external calibration device;
in step S5, the bounding box of each block in the region of interest is cut out of the point cloud using the Point Cloud Library, and immature-fruit identification and calculation are performed on the corresponding block.
In the image processing pipeline, the first step is to remove adjacent noise points using hue, saturation and intensity (HSI) color thresholding. Some sensed points from irrigation pipes or shelves lie around the ripe fruit, at some distance behind it. Inaccurate depth sensing causes some of these points to appear connected to the fruit in front, where they can be mistaken for immature fruit; the color thresholding removes them.
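As a rough illustration of this thresholding step, the sketch below drops depth readings whose hue, saturation or intensity fall outside an assumed fruit color range. The `remove_noise_hsi` helper and all threshold values are illustrative; the patent does not give concrete numbers.

```python
import numpy as np

def remove_noise_hsi(hsi_image, depth, h_range=(0.0, 0.1), s_min=0.4, i_min=0.2):
    """Zero out depth readings for pixels outside an assumed fruit HSI range.

    hsi_image: HxWx3 array with hue, saturation, intensity in [0, 1].
    depth:     HxW depth map aligned with the color image.
    """
    h, s, i = hsi_image[..., 0], hsi_image[..., 1], hsi_image[..., 2]
    keep = (h >= h_range[0]) & (h <= h_range[1]) & (s >= s_min) & (i >= i_min)
    cleaned = depth.copy()
    cleaned[~keep] = 0.0  # drop depth points for background/noise pixels
    return cleaned
```

A pixel that passes all three thresholds keeps its depth value; everything else is treated as background noise behind the fruit.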
The second step is the detection and positioning of ripe fruits: a segmented convolutional neural network is used to identify and segment pixel-level objects. Through the network, several masks are created for the ripe fruit, with one mask representing the detected target fruit. By matching with the depth image, the mask is projected as a 3D point, obtaining the 3D position of the target fruit in the camera frame coordinates. Thereafter, the coordinates are transformed from the camera frame to the picking device arm frame based on the camera external calibration device.
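The mask-to-3D projection described above can be sketched with a standard pinhole back-projection. The `mask_to_3d` helper and its intrinsics parameters are assumptions; the patent does not specify the camera model.

```python
import numpy as np

def mask_to_3d(mask, depth, fx, fy, cx, cy):
    """Back-project masked pixels into camera-frame 3D points.

    mask:  HxW boolean segmentation mask of the target fruit.
    depth: HxW depth map (same frame); fx, fy, cx, cy: pinhole intrinsics.
    Returns the 3D point cloud and its centroid (the fruit position).
    """
    v, u = np.nonzero(mask)          # pixel rows/cols inside the fruit mask
    z = depth[v, u]
    x = (u - cx) * z / fx            # pinhole model: X = (u - cx) * Z / fx
    y = (v - cy) * z / fy
    pts = np.stack([x, y, z], axis=1)
    return pts, pts.mean(axis=0)
```

The resulting centroid would then be transformed from the camera frame to the arm frame using the external calibration, e.g. by a 4x4 homogeneous transform.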
The third step is the calculation of the immature fruit.
Wherein the region of interest is a region comprising the 3D point cloud of the target fruit and potentially one or more immature fruits. As shown in fig. 3, the region of interest is divided into four layers: a top layer 6, an upper middle layer 7, a lower middle layer 8 and a bottom layer 9. As shown in the top view of fig. 3, each layer of the region of interest is further divided into nine cube blocks. On each layer, the blocks form a 3 x 3 grid whose center is located at the horizontal midpoint of the target fruit, so that in the xy plane the center block C_C surrounds the target fruit. In the top view, the length and width of the eight peripheral blocks are equal to those of the center block. In the front and left side views, the heights of the top layer 6 and the bottom layer 9 are equal to one and two times, respectively, the height of the middle layer section. The gripper moves upward to separate the immature fruit around the target fruit in the middle layers, whose distribution can vary in the height direction.
To obtain a higher movement resolution, the middle layer is divided into an upper middle layer 7 and a lower middle layer 8, and the movement through the middle layers is divided into two steps. The center block of the top layer 6 is lower than the peripheral blocks in the same layer, at 80% of their height. This is because the object segmentation method does not include the green calyx; to avoid the calyx being detected as immature fruit, the bottom of the center block of the top layer 6 is left blank.
To create a distinct path, each block is assigned a horizontal vector representing the direction from that block to the center block C_C. The direction of each vector is determined by the location of its block, so that all vectors point from the center of the respective block to the center of the center block C_C. The number of points N in a block's region of the point cloud is used to determine whether there is immature fruit in the block. Using a 1280 x 720 resolution camera, the thresholds on N for the top layer 6, upper middle layer 7, lower middle layer 8 and bottom layer 9 are 200, 100, and 300, respectively.
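A minimal sketch of this occupancy test: points are binned into the 3 x 3 grid of one layer, and a block is marked occupied when its point count reaches the layer threshold. The `block_occupancy` helper and the 30 mm block size are assumptions; only the per-layer point thresholds come from the text.

```python
import numpy as np

def block_occupancy(points, origin, size=(30.0, 30.0), n_min=100):
    """Mark each block of a layer's 3x3 grid as occupied by immature fruit.

    points: Nx3 point cloud of the layer (x, y, z in mm).
    origin: (x, y) of the grid center, i.e. the target fruit's horizontal midpoint.
    size:   (x, y) extent of one block in mm (assumed value).
    n_min:  layer-specific point-count threshold from the text (100-300).
    """
    occ = np.zeros((3, 3), dtype=bool)
    for r in range(3):
        for c in range(3):
            # Lower corner of block (r, c); the grid is centered on `origin`.
            x0 = origin[0] + (c - 1.5) * size[0]
            y0 = origin[1] + (r - 1.5) * size[1]
            inside = ((points[:, 0] >= x0) & (points[:, 0] < x0 + size[0]) &
                      (points[:, 1] >= y0) & (points[:, 1] < y0 + size[1]))
            occ[r, c] = int(inside.sum()) >= n_min
    return occ
```

The center entry `occ[1, 1]` corresponds to C_C; the eight peripheral entries feed the N_h / N_u counts used below.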
The gripper operates in three different stages: in the first stage, the gripper approaches from below, moving the immature fruit in the bottom layer 9 horizontally; in the second stage, the gripper moves upward to surround the target fruit and separate the immature fruit within the middle layers; in the third stage, if the center block C_C in the top layer is occupied, the gripper can drag the target fruit to a gripping position with fewer immature fruits.
In particular, the first stage separates the immature fruit below the target fruit horizontally in the bottom layer 9, using the number N_h of blocks adjacent to the center block that contain no immature fruit to determine whether to use a single pushing operation or a serpentine operation.
As shown in fig. 5a, the center block is ignored; a solid arrow in a block indicates that the block is occupied by immature fruit, and a blank arrow indicates an unoccupied block. Since N_h is 5, greater than the predetermined threshold Th = 4, a single pushing operation is selected to push the immature fruit aside.
when the single pushing operation moves towards the immature fruit, the direction of the pushing operation of the gripper is calculated based on the position of the occupied zone according to the following formula:
where Oi is the vector of the i-th occupied block in the largest set of adjacent occupied blocks and n is the total number of blocks in the largest set of adjacent occupied blocks. The parameter r is used to scale the Ds norm, which should ensure that the clamp is disengaged from the outside of the block, r=50 mm.
The arrow in fig. 5a shows the calculated pushing direction for a single pushing operation; the gripper moves from the center of the unoccupied blocks toward the center of the occupied blocks, so that it has the highest likelihood of pushing all the immature fruit apart.
If only the center block C_C is occupied, D_s = 0; the direction in which the gripper must move to push the immature fruit is then determined by calculating the shortest path from the gripper's current position to the center of the center block C_C.
If no immature fruit is detected in the block, the gripper is not pushed at this stage and moves straight up from below.
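The first-stage direction computation can be sketched as below, assuming the reconstructed formula D_s = r · ΣO_i / ‖ΣO_i‖ and a convention in which each peripheral block carries a unit vector pointing toward the center block C_C. The grid indexing and helper names are illustrative, not from the patent.

```python
import numpy as np

def block_vector(row, col):
    """Unit vector from block (row, col) of the 3x3 grid toward the
    center block C_C at grid position (1, 1)."""
    v = np.array([1 - row, 1 - col], dtype=float)
    n = np.linalg.norm(v)
    return v / n if n else v

def single_push_direction(occupied, r=50.0):
    """D_s = r * sum(O_i) / ||sum(O_i)|| over occupied peripheral blocks
    (simplified here to all occupied blocks rather than the largest
    adjacent set); r = 50 mm as in the text."""
    s = sum((block_vector(rw, cl) for rw, cl in occupied), np.zeros(2))
    n = np.linalg.norm(s)
    return r * s / n if n else np.zeros(2)  # D_s = 0 when only C_C is occupied
```

With a single occupied block directly "above" the center on the grid, the push direction points straight through the center at magnitude 50 mm.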
If the number N_h of blocks adjacent to the center block that contain no immature fruit is less than the threshold Th, the gripper adopts a horizontal serpentine pushing operation. Fig. 5b shows an example of the path computation, where a serpentine operation is selected to push the immature fruit from side to side. The red arrow is the general direction of the operation, while the blue arrows show the serpentine path. Since the serpentine operation involves movement in three directions (forward, left and right), the gripper can push out the immature fruit in those three directions.
The overall direction of the serpentine pushing operation is calculated based on the locations of the unoccupied blocks according to the following formula:

D_z = r · (Σ_{j=1}^{m} U_j) / ‖Σ_{j=1}^{m} U_j‖

where U_j is the vector of the j-th unoccupied block within the largest set of adjacent unoccupied blocks, and m is the total number of blocks within that set. During a horizontal serpentine pushing operation, the device moves in the xy plane; the resultant vector of the serpentine motion is equal to D_z, and the amplitude a_h and the number of pushes N_hp of the serpentine motion are determined according to the particular grasping scenario. For example, the effectiveness of these values may be affected by stem length, fruit weight or fruit damping ratio, which are difficult to calculate. Here a_h = 20 mm and N_hp = 5.
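A sketch of how the serpentine waypoints might be generated from the overall direction D_z, with the stated amplitude a_h = 20 mm and N_hp = 5 pushes. The zig-zag construction itself is an assumption; the patent only fixes the resultant vector and the two parameters.

```python
import numpy as np

def serpentine_waypoints(d_z, a_h=20.0, n_hp=5):
    """xy waypoints for a horizontal serpentine push: net motion along d_z
    with alternating side offsets of amplitude a_h (mm)."""
    d = np.asarray(d_z, dtype=float)
    d_unit = d / np.linalg.norm(d)
    side = np.array([-d_unit[1], d_unit[0]])    # perpendicular in the xy plane
    step = d / n_hp
    pts = []
    for k in range(1, n_hp + 1):
        sway = a_h if k % 2 else -a_h           # alternate left/right
        pts.append(k * step + sway * side)
    pts[-1] = d                                 # finish on the resultant D_z
    return np.array(pts)
```

The last waypoint coincides with D_z, so the resultant of the zig-zag equals the overall push direction, as required by the text.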
In particular, the second stage surrounds the target fruit in the upper middle layer 7 and the lower middle layer 8 and separates the immature fruit within those layers.
As shown in fig. 6, the upward serpentine pushing action employed in the upper middle layer 7 and the lower middle layer 8 consists of movement of the gripper in a roughly vertical direction toward the target fruit, combined with side-to-side movement to traverse the immature fruit. The vertical direction passes through the center of the target fruit. The direction of the upward push D_u,z in the xy plane is calculated based on the number N_u of blocks adjacent to the center block that contain no immature fruit. If N_u is greater than the threshold Th, the direction D_u,z is calculated from the occupied blocks, as in the single pushing operation in the bottom layer 9:

D_u,z = a_u · (Σ_{i=1}^{n} O_i) / ‖Σ_{i=1}^{n} O_i‖

where a_u is a parameter for scaling the norm of D_u,z; here a_u = 5 mm. If N_u is less than the threshold Th, as shown in fig. 7a, the calculation uses the unoccupied blocks, according to the following formula:
M = Σ_{j=1}^{m} U_j,   D_u,z = a_u · M / ‖M‖

where M is an intermediate vector in the calculation of D_u,z. In fig. 7a, the gripper moves along D_u,z and -D_u,z to push the immature fruit apart on either side. The front view in fig. 7b shows the gripper moving step by step through the left or right intermediate points to pass the lower middle layer 8 and the upper middle layer 7. The number of pushes N_up in each layer is set to 5.
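The second-stage direction selection can be sketched as below: `upward_push_direction` scales the mean block vector to a_u = 5 mm, using the occupied blocks when N_u > Th and the unoccupied blocks otherwise. The sign convention for the unoccupied-block case is an assumption.

```python
import numpy as np

def upward_push_direction(vectors, from_occupied, a_u=5.0):
    """xy component D_u,z of the upward serpentine push, scaled to a_u mm.

    vectors:       block vectors of the chosen set (occupied if N_u > Th,
                   unoccupied otherwise), each pointing toward C_C.
    from_occupied: True when the occupied-block formula applies.
    """
    m = np.mean(np.asarray(vectors, dtype=float), axis=0)  # intermediate vector M
    n = np.linalg.norm(m)
    d = a_u * m / n if n else np.zeros(2)
    return d if from_occupied else -d  # assumed sign flip for the free-space case
```

The gripper would then alternate between D_u,z and -D_u,z while rising through the two middle layers, as the text describes.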
Specifically, in the third stage, if the center block C_C in the top layer is occupied, the gripper can drag the target fruit to a gripping position with fewer immature fruits.
As shown in fig. 8a, when there is immature fruit above the target fruit in the top layer 6, the gripper sometimes wraps the immature fruit as it moves upward to catch the target fruit, or may damage it. In addition, the immature fruit may prevent the wrapping sheet from closing, making it impossible to cut the stem of the target fruit.
During the third stage, a drag operation is employed that allows the gripper to grasp the target fruit without capturing unwanted immature fruit.
As shown in fig. 8, the dragging operation includes an upward dragging step that moves the target fruit to an area containing less immature fruit, and an upward push-back step, shown in fig. 8c, that pushes the upper immature fruit away before the fingers close. The push-back step is necessary because, in the pulled position shown in fig. 8b, the stem of the target fruit is inclined, so that the fruit is difficult to release due to static forces and is prone to damage if the gripper moves further up toward the cutting position.
The drag operation is performed only when there is immature fruit in the center block C_C of the top layer (6). If the center block C_C is unoccupied, the gripper moves upwards directly and grasps the target fruit. Fig. 9 illustrates the calculation method of the drag operation corresponding to fig. 8. As shown in fig. 9a, to avoid collision between the gripper and the table, the three blocks L_R, C_R and R_R near the table are skipped when calculating the drag direction. The drag direction D_dr in the xy plane is then determined according to the following equation:
where U_j is the vector of the j-th unoccupied block within the largest group of adjacent unoccupied blocks. The blocks used for the calculation are L_C, L_F, C_F, R_F and R_C. The parameter m is the total number of blocks within the largest group of adjacent unoccupied blocks. D_dr is scaled to l, where l = 50 mm. This region usually contains fewer immature fruits, but if all the blocks are occupied by immature fruit, the drag direction is aligned with C_F. Fig. 9b shows the drag and push-back steps, in which the drag and push-back operations move up the same height in the vertical direction.
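The drag-direction rule can be sketched as follows. The original equation is an image in the patent, so the block coordinates, the interpretation of the scaling (D_dr normalized to length l), and the simplification of summing all unoccupied blocks rather than only the largest adjacent group are assumptions made for illustration.

```python
import numpy as np

# Centers of the five top-layer blocks used for the drag calculation,
# in the xy plane (front = +y, right = +x); grid spacing is illustrative.
BLOCKS = {
    "L_C": np.array([-1.0, 0.0]), "L_F": np.array([-1.0, 1.0]),
    "C_F": np.array([0.0, 1.0]),  "R_F": np.array([1.0, 1.0]),
    "R_C": np.array([1.0, 0.0]),
}

def drag_direction(occupied, l=50.0):
    """Drag toward the unoccupied blocks, scaled to length l (mm).
    If every block is occupied, fall back to the C_F direction."""
    unoccupied = [v for k, v in BLOCKS.items() if k not in occupied]
    s = np.sum(unoccupied, axis=0) if unoccupied else BLOCKS["C_F"]
    return l * s / np.linalg.norm(s)

# With no immature fruit at all, the left/right components cancel and
# the gripper drags straight toward the front blocks.
d = drag_direction(occupied=set())
```

A production implementation would first find the largest connected group of unoccupied blocks, as the claim specifies, before summing the vectors.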
The construction of the convolutional neural network comprises the following steps:
step 1, data acquisition:
a rough generalized fruit recognition model is built, and fruit images on branches are shot under the natural conditions of an orchard; the capture of the images is not restricted in any way, i.e., lighting conditions, shooting angles, distance to the fruit and other conditions are unconstrained.
Step 2, data preparation:
One major drawback of deep neural networks is that they rely heavily on large amounts of labeled data to achieve good accuracy. These large datasets allow the training phase to learn all the embedded parameters and minimize the risk of the network overfitting. Preparing such a large number of images is very laborious, expensive and time-consuming.
More training data can be created from existing samples through data augmentation, which effectively mitigates overfitting: transformations are applied to the original image so that each new image retains the characteristics of the original and is visually classified into the same category. This increases the generality of the model, since the model is not exposed to the identical picture multiple times. In this study, an automatic data augmentation method, including image cropping, horizontal flipping, rotation and brightness operations, was applied to generate 16 images from each image. After reviewing the generated images and deleting invalid ones, e.g. crops taken from non-fruit areas, the total number of images in the dataset, including the original data, is obtained.
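The 16-fold augmentation can be sketched with numpy alone. The patent does not specify the exact transformation parameters, so the combination below (2 horizontal-flip states × 4 right-angle rotations × 2 brightness levels = 16 variants) is an illustrative assumption that happens to yield the stated count.

```python
import numpy as np

def augment_16(img):
    """Generate 16 variants of an RGB image (H, W, 3, uint8):
    2 flip states x 4 right-angle rotations x 2 brightness levels."""
    out = []
    for flip in (False, True):
        base = img[:, ::-1] if flip else img          # horizontal flip
        for k in range(4):                            # 0/90/180/270 deg rotation
            rot = np.rot90(base, k)
            for gain in (0.8, 1.2):                   # darker / brighter
                bright = np.clip(rot.astype(np.float32) * gain, 0, 255)
                out.append(bright.astype(np.uint8))
    return out

variants = augment_16(np.zeros((64, 64, 3), dtype=np.uint8))
```

Cropping, as mentioned in the text, would change the image size and is followed by the resizing step described below, so it is omitted from this minimal sketch.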
The augmentation process is performed before loading the data into the network: first, the augmented images can easily be inspected for any outliers; second, the load on the model is smaller, which reduces training time. The augmented images are then resized so that all input images have the same resolution.
The prepared data set is split into two subsets for training and testing, with most of the data being randomly selected for training and the remainder of the data being selected for testing.
Step 3, constructing a convolutional neural network structure:
Convolutional neural networks are a subset of deep networks that can automatically extract features of RGB images and classify them. They are characterized by convolution operations, pooling layers and nonlinear activation functions. The general topology of a deep convolutional neural network comprises a series of convolutional and pooling layers, followed by some fully connected layers. The present structure has three convolutional layers, three pooling layers and two fully connected layers, so recognition is very fast and little memory is required for training.
Regarding the convolutional layers: convolutional networks can learn translation invariance and spatial hierarchies, so they can recognize a learned pattern anywhere in an image and learn increasingly complex patterns through successive layers. A convolutional network is typically composed of three types of layers: convolutional layers, pooling layers and fully connected layers.
A convolutional layer is characterized by two parameters: the size of the filters and the number of filters. All three convolutional layers use 3 × 3 filters, with 16, 32 and 64 filters respectively.
To reduce the size of the feature map, a max pooling layer is placed after each convolutional layer. The max pooling layer has no trainable parameters; it reduces the number of features by selecting the maximum value in each window and discarding the other values. The first pooling layer uses a 4 × 4 window, and the second and third pooling layers use 2 × 2 windows.
The convolution operation is followed by a rectification step, which breaks the inherent linearity of the input image by outputting only non-negative values. In the network, all convolutional layers as well as the first fully connected layer use the rectification function as the activation function. The rectification function is:
The activation function at the last layer of the model is the softmax function:
where z is a vector of K inputs and j indexes the output unit. This activation function is necessary for multi-class, single-label classification, as it normalizes the input into a probability distribution.
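The formula images for the two activation functions referenced above are not reproduced in this text; they are the standard rectification (ReLU) and softmax functions, which match the surrounding descriptions (non-negative outputs; normalization of K inputs into a probability distribution):

```latex
f(x) = \max(0, x)
\qquad
\sigma(\mathbf{z})_j = \frac{e^{z_j}}{\sum_{k=1}^{K} e^{z_k}}, \quad j = 1, \dots, K
```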
Before entering the classification stage, a global average pooling layer is employed. The global average pooling layer contains no trainable parameters; it markedly reduces the number of parameters and improves model accuracy, thereby significantly improving the robustness of the model. It outputs the average of each feature map of the previous layer, replacing an embedded flatten layer. The global average pooling layer is also used to compute the class activation map, which identifies the regions in the image that the convolutional neural network associates with a particular class. The class activation map for a class is obtained by multiplying the output feature maps of the last convolutional layer by the weights assigned to that class and summing. The formula for the class activation map is:
where M_c is the class activation map of class c, w_k^c is the k-th weight corresponding to class c, and f_k(x, y) is the k-th feature map of the last convolutional layer.
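Using the symbols just defined (the weight symbol is missing from the extracted text; w_k^c is the standard class-activation-map notation), the formula reads:

```latex
M_c(x, y) = \sum_{k} w_k^{c} \, f_k(x, y)
```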
All filtered features from the convolutional part of the network are encoded as input data to the fully connected classifier layers. A fully connected layer connects all neurons of the previous layer to the current layer through weights. The classification stage of the current model consists of two fully connected layers. The convolutional neural network predicts the class of an input image with a certain probability, and the error of this process is measured by a loss function. A categorical cross-entropy loss function is used to evaluate the proposed model, minimizing the difference between the predicted probability distribution and the actual distribution of the targets.
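The layer stack described above (three 3 × 3 convolutional layers with 16/32/64 filters, pooling windows 4 × 4, 2 × 2, 2 × 2, then global average pooling ahead of two fully connected layers) can be checked arithmetically. The 128 × 128 input resolution and 'same' convolution padding are assumptions, since the text does not state them.

```python
def conv_params(k, c_in, c_out):
    """Trainable parameters of a k x k convolution with bias."""
    return (k * k * c_in + 1) * c_out

# Assumed 128x128 RGB input; 'same' padding keeps H, W through each conv.
h = w = 128
channels = 3
shapes, params = [], 0
for n_filters, pool in ((16, 4), (32, 2), (64, 2)):
    params += conv_params(3, channels, n_filters)
    channels = n_filters
    h, w = h // pool, w // pool          # max pooling shrinks the feature map
    shapes.append((h, w, channels))

# Global average pooling collapses each final feature map to one value,
# so the classifier receives a 64-dimensional vector with no flatten layer.
gap_features = channels
```

The convolutional trunk therefore carries only 23,584 trainable parameters under these assumptions, which is consistent with the text's claim of fast training with little memory.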
Step 4, network optimization:
The network is configured to load input images with their associated labels. The input images are divided into training data and test data: 80% are used for training and the remaining 20% for testing. 10% of the training dataset is used as the validation dataset.
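The 80/20 split with a 10% validation hold-out can be sketched as follows; the shuffling seed and the list-based representation of samples are illustrative assumptions.

```python
import random

def split_dataset(samples, test_frac=0.2, val_frac=0.1, seed=0):
    """Shuffle, hold out test_frac for testing, then take val_frac of the
    remaining training data as a validation set."""
    items = list(samples)
    random.Random(seed).shuffle(items)     # deterministic shuffle
    n_test = int(len(items) * test_frac)
    test, train_all = items[:n_test], items[n_test:]
    n_val = int(len(train_all) * val_frac)
    val, train = train_all[:n_val], train_all[n_val:]
    return train, val, test

train, val, test = split_dataset(range(100))
```

With 100 samples this yields 72 training, 8 validation and 20 test items.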
Increasing the network depth can improve overall performance, which is highest when the number of training samples is proportional to the network capacity. A structure with three convolutional layers performed best and was further optimized; the optimization process of the network was evaluated using different optimizers.
A robust model is built as a deep convolutional neural network capable of identifying multiple categories of fruit on branches from RGB images. The model consists of three convolutional layers and three max pooling layers, followed by a global average pooling layer and two fully connected layers. Using global average pooling eliminates the need for a flatten layer, improves accuracy on unseen data, increases the classification index score, reduces the total number of trainable parameters, and speeds up processing. The network achieves a high fruit recognition rate and classification precision, responds quickly, is unaffected by natural conditions, and requires little computation. Using this deep convolutional neural network, the fruit picking robot can rapidly and accurately recognize target fruits and regions of interest, ensuring that the fewest fruits are overlooked and that yield is highest.
According to the control method of the picking device based on 3D visual perception, immature fruits are actively separated from the target based on visual perception. A single push operation, or a serpentine push operation composed of several linear pushes, is selected according to the distribution of immature fruits around the target fruit, so that immature fruits below the target fruit and at the same height as the target are moved aside. Because the pushing is multi-directional, denser immature fruit can be handled, and the resulting side-to-side movement breaks the static contact force between the target fruit and the immature fruit, so that the gripper can receive the target fruit more easily. A dragging operation is then used that both avoids the immature fruit and actively pushes it away, addressing the problem of erroneously capturing immature fruit above the target fruit: the gripper pulls the target fruit to a location with fewer immature fruits and then pushes back to move the immature fruit aside for further separation. This significantly improves picking performance, avoids damage to both the target fruit and the immature fruit, and greatly improves picking efficiency.
Finally, it is noted that the above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that the technical solution of the present invention may be modified or substituted without departing from the spirit and scope of the technical solution, and the present invention is intended to be covered in the scope of the present invention.

Claims (3)

1. A method for controlling a picking device based on 3D visual perception, comprising: the method comprises the following steps:
s1, a control device controls a picking device to walk along the lower part of a fruit support to be picked;
s2, the control device identifies the target fruit through the image acquisition device of the picking device,
s3, the image acquisition device removes noise points from the background through 3D color threshold processing;
s4, the control device uses deep learning to detect and position mature fruits;
s5, selecting a region of interest around the target fruit to determine the existence of immature fruit;
s6, calculating picking paths based on the distribution and the quantity of the immature fruits around the target fruits;
wherein, in step S2, adjacent noise points are removed by using hue, saturation and intensity color threshold processing,
in step S4, it includes:
s41, identifying and segmenting the pixel-level object by using a segmented convolutional neural network; creating a number of masks for the ripe fruit through the network, wherein one mask represents the detected target fruit; by matching with the depth image, projecting the mask into 3D points, and obtaining the 3D position of the target fruit in the camera frame coordinates;
s42, transforming coordinates from a camera frame to a device arm frame based on a camera external calibration device;
in step S5, using a bounding box of a block in each region of interest in the point cloud library, and identifying and calculating immature fruits for the corresponding block;
wherein the region of interest is a region comprising a 3D point cloud of the target fruit and potentially one or more immature fruits; the region of interest is divided into four layers: a top layer (6), an upper middle layer (7), a lower middle layer (8) and a bottom layer (9); each layer of the region of interest is divided into nine cube blocks forming a 3 × 3 grid whose center is positioned at the horizontal midpoint of the target fruit, so that the center block C_C in the xy plane surrounds the target fruit; the length and width of the eight peripheral blocks are equal to the length and width of the central block; in the front and left side views, the heights of the top layer (6) and the bottom layer (9) are equal to one and two times, respectively, the sum of the heights of the upper middle layer (7) and the lower middle layer (8); the gripper (4) moves upwards to separate the immature fruits around the target fruit in the upper middle layer (7) and the lower middle layer (8), and the distribution of the immature fruits in the upper middle layer (7) and the lower middle layer (8) can change along the height direction;
the gripper (4) operates in three different stages: in the first stage, the gripper (4) grasps from below, moving the immature fruit in the bottom layer (9) horizontally; during the second stage, the gripper (4) moves upwards to enclose the target fruit and separate the immature fruit within the upper middle layer (7) and the lower middle layer (8); during the third stage, if the central block C_C in the top layer (6) is occupied, the gripper (4) drags the target fruit to a gripping position with fewer immature fruits;
the first stage separates the immature fruit horizontally below the target fruit in the bottom layer (9), using the number Nh of blocks adjacent to the central block that contain no immature fruit to determine whether to use a single push operation or a serpentine operation;
ignoring the center block, a solid arrow in a block indicates that the block is occupied by immature fruit, and a blank arrow indicates that the block is unoccupied; when Nh is 5, which is greater than a predetermined threshold Th = 4, a single push operation is selected to push the immature fruit aside; as the single push operation moves towards the immature fruit, the direction of the pushing operation of the gripper is calculated based on the positions of the occupied blocks according to the following formula:
where O_i is the vector of the i-th occupied block in the largest set of adjacent occupied blocks, and n is the total number of blocks in the largest set of adjacent occupied blocks; the parameter r is used to scale the norm of D_s and should ensure that the gripper exits beyond the outside of the blocks, r = 50 mm;
the gripper moves from the center of the unoccupied blocks towards the center of the occupied blocks, so that the gripper has the highest likelihood of pushing the fruit in all occupied blocks aside;
D_s = 0 if only the center block C_C is occupied; the direction in which the gripper must move to push the immature fruit is determined by calculating the shortest path from the current position of the gripper to the center of the center block C_C; if no immature fruit is detected in any block, the gripper does not push in this stage and moves straight up from below;
if the number Nh of blocks adjacent to the central block that contain no immature fruit is smaller than the threshold Th, the gripper performs a horizontal serpentine push; the serpentine operation involves movement in three directions, forward, left and right, with the gripper pushing the immature fruit away in these three directions; the overall direction of the serpentine pushing operation is calculated based on the locations of the unoccupied blocks according to the following formula:
where U_j is the vector of the j-th unoccupied block within the largest set of adjacent unoccupied blocks, and m is the total number of blocks within the largest set of adjacent unoccupied blocks; during the horizontal serpentine pushing operation, the gripper moves in the xy plane, where the resultant vector of the serpentine motion is equal to D_z, and the amplitude A_h and the number of pushes N_hp of the serpentine motion are determined according to the particular grasping scenario.
2. The control method of a picking device based on 3D visual perception according to claim 1, characterized in that the second stage encloses the target fruit within the upper middle layer (7) and the lower middle layer (8) and separates the immature fruit in these middle layers; the upward serpentine pushing action employed in the upper (7) and lower (8) middle layers comprises movement of the gripper in a substantially vertical direction towards the target fruit and from side to side to push past the immature fruit; the vertical direction passes through the center of the target fruit; the direction D_uz of the upward push in the xy plane is calculated based on the number N_u of blocks adjacent to the center block that contain immature fruit; if N_u is greater than the threshold Th, the direction D_uz is calculated from the occupied blocks, as in the single push operation in the bottom layer (9), according to the following formula:
where a_u is a parameter for scaling the norm of D_uz, a_u = 5 mm;
if N_u is less than the threshold Th, the calculation uses the unoccupied blocks, according to the following formula:
where M is the intermediate vector used to calculate D_uz; the gripper moves along D_uz and -D_uz to push the immature fruit aside on both sides.
3. The control method of a picking device based on 3D visual perception according to claim 2, characterized in that during the third stage, if the center block C_C in the top layer (6) is occupied, the gripper drags the target fruit to a gripping position with fewer immature fruits;
the drag operation is performed only when there is immature fruit in the center block C_C of the top layer (6); if the center block C_C is unoccupied, the gripper moves upwards directly to grasp the target fruit; to avoid collision between the gripper and the table, the three blocks L_R, C_R and R_R close to the table are skipped when calculating the drag direction; the drag direction D_dr in the xy plane is determined according to the following equation:
where U_j is the vector of the j-th unoccupied block within the largest group of adjacent unoccupied blocks; the blocks used for the calculation are L_C, L_F, C_F, R_F and R_C; the parameter m is the total number of blocks within the largest group of adjacent unoccupied blocks; D_dr is scaled to l, where l = 50 mm; this region usually contains fewer immature fruits, but if all the blocks are occupied by immature fruit, the drag direction is aligned with C_F; the drag and push-back operations move up the same height in the vertical direction.
CN202011414617.XA 2020-12-04 2020-12-04 Control method of picking device based on 3D visual perception Active CN112528826B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011414617.XA CN112528826B (en) 2020-12-04 2020-12-04 Control method of picking device based on 3D visual perception


Publications (2)

Publication Number Publication Date
CN112528826A CN112528826A (en) 2021-03-19
CN112528826B true CN112528826B (en) 2024-02-02

Family

ID=74997805

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011414617.XA Active CN112528826B (en) 2020-12-04 2020-12-04 Control method of picking device based on 3D visual perception

Country Status (1)

Country Link
CN (1) CN112528826B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109588114A (en) * 2018-12-20 2019-04-09 武汉科技大学 A kind of parallelism recognition picker system and method applied to fruit picking robot
CN110033487A (en) * 2019-02-25 2019-07-19 上海交通大学 Vegetables and fruits collecting method is blocked based on depth association perception algorithm
WO2019144575A1 (en) * 2018-01-24 2019-08-01 中山大学 Fast pedestrian detection method and device



Similar Documents

Publication Publication Date Title
CN112136505B (en) Fruit picking sequence planning method based on visual attention selection mechanism
Zhao et al. On-tree fruit recognition using texture properties and color data
Puttemans et al. Automated visual fruit detection for harvest estimation and robotic harvesting
Wu et al. Detection and counting of banana bunches by integrating deep learning and classic image-processing algorithms
CN111666883B (en) Grape picking robot target identification and fruit stalk clamping and cutting point positioning method
Li et al. Fast detection and location of longan fruits using UAV images
CN107451602A (en) A kind of fruits and vegetables detection method based on deep learning
EP3700835A1 (en) Systems and methods for detecting waste receptacles using convolutional neural networks
Kalampokas et al. Grape stem detection using regression convolutional neural networks
Ning et al. Recognition of sweet peppers and planning the robotic picking sequence in high-density orchards
Thakur et al. An innovative approach for fruit ripeness classification
CN111783693A (en) Intelligent identification method of fruit and vegetable picking robot
Silwal et al. Effort towards robotic apple harvesting in Washington State
Wang et al. A transformer-based mask R-CNN for tomato detection and segmentation
CN112528826B (en) Control method of picking device based on 3D visual perception
He et al. Detecting and localizing strawberry centers for robotic harvesting in field environment
Khanal et al. Machine Vision System for Early-stage Apple Flowers and Flower Clusters Detection for Precision Thinning and Pollination
CN112544235B (en) Intelligent fruit picking robot
Klaoudatos et al. Development of an Experimental Strawberry Harvesting Robotic System.
Rathore et al. A two-stage deep-learning model for detection and occlusion-based classification of kashmiri orchard apples for robotic harvesting
Aljaafreh et al. A Real-Time Olive Fruit Detection for Harvesting Robot Based on YOLO Algorithms
Dewi et al. Image Processing Application on Automatic Fruit Detection for Agriculture Industry
Qiongyan et al. Study on spike detection of cereal plants
CN114648628A (en) Apple maturity detection method
Pichhika et al. Detection of Multi-varieties of On-tree Mangoes using MangoYOLO5

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant