CN116363505A - Target picking method based on picking robot vision system - Google Patents
Info
- Publication number
- CN116363505A CN116363505A CN202310210531.2A CN202310210531A CN116363505A CN 116363505 A CN116363505 A CN 116363505A CN 202310210531 A CN202310210531 A CN 202310210531A CN 116363505 A CN116363505 A CN 116363505A
- Authority
- CN
- China
- Prior art keywords
- mask
- picking
- cnn model
- training
- fruit
- Prior art date
- Legal status: Pending
Classifications
- G06V20/10—Terrestrial scenes
- G06N3/08—Learning methods (neural networks)
- G06T7/73—Determining position or orientation of objects or cameras using feature-based methods
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V10/75—Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
- G06T2207/20016—Hierarchical, coarse-to-fine, multiscale or multiresolution image processing; Pyramid transform
- Y02T10/40—Engine management systems
Abstract
The invention relates to a target picking method based on a picking robot vision system, comprising the following steps: acquiring original fruit images with a camera; preprocessing the collected images, forming the preprocessed images into a data set, and dividing the data set into a training set, a verification set and a test set; establishing and training a Mask R-CNN model; optimizing the training process to obtain the Mask R-CNN model with the highest training accuracy; and loading that model onto the CPU of the picking robot to realize target recognition and picking based on the robot's vision system. The invention acquires images of ripe fruit with a high-resolution camera, applies deep learning to intelligent fruit picking, adjusts the network structure to the actual use scene, and trains a Mask R-CNN model that can automatically detect ripe fruit and enable intelligent picking, thereby solving the problem that a picking robot cannot reliably recognize and extract fruits and vegetables in a complex surrounding environment.
Description
Technical Field
The invention relates to the technical field of deep learning and artificial intelligence, in particular to a target picking method based on a picking robot vision system.
Background
With social development and technological progress, quality of life has steadily improved and demand for fruit has grown. Meanwhile, fruit planting areas keep expanding, the number of agricultural workers is declining, and the population is aging, so manual picking can no longer keep up with the concentrated, rapid harvest required when fruit ripens. The traditional manual picking mode also suffers from low efficiency, high labor intensity, difficulty in picking at height, and significant safety hazards. These factors severely limit the long-term development of the planting industry. To save labor and materials and increase growers' income, mechanization of the planting industry has become an inevitable trend.
Large-scale mechanical picking equipment is aimed mainly at large farms; it offers high picking efficiency at scale, but the purchase price is relatively high and the maintenance cost is not negligible, so it is unsuitable for the small, family-scale fruit planting common in China. Moreover, because such machines are manually operated, some fruit damage during picking is unavoidable. Mechanical picking is also unselective: it may harvest unhealthy, immature or damaged fruit and shake off leaves, which increases the difficulty of subsequent fruit screening.
Existing picking robots operate in complex environments with many uncertainties, which makes picking difficult. Efficient, rapid fruit and vegetable picking requires accurate recognition and three-dimensional positioning. The vision system of a picking robot operates in four stages: target detection, target recognition, three-dimensional reconstruction and three-dimensional positioning. Target detection requires an algorithm that promptly detects target objects in the image; target recognition must distinguish the objects to be picked from other interfering items; three-dimensional reconstruction acquires a two-dimensional image of the target with a camera and then recovers the fruit's three-dimensional information in space through feature extraction, stereo matching and similar algorithms; the spatial coordinates obtained from reconstruction complete the three-dimensional positioning. The accuracy of target recognition and positioning directly determines the robot's picking efficiency, whether crops are damaged, and whether the robot body suffers collision damage. In picking operations, the factors causing inaccurate target recognition and localization are numerous and can be summarized as follows: 1) changes in natural illumination; 2) complex growth environments; 3) fruit overlapping or being occluded by branches, leaves and stems; 4) vibration of the mechanical arm causing inaccurate sensor imaging; 5) radio-frequency interference among the robot controller, cameras and sensors; 6) mechanical failure of the robot. With so many interfering factors, accurate fruit and vegetable recognition and positioning is a significant challenge.
Disclosure of Invention
The invention aims to provide a target picking method based on a picking robot vision system, which can automatically detect mature fruits and realize intelligent picking, and solves the problem that a picking robot cannot well identify and extract fruits and vegetables due to a complex surrounding environment.
In order to achieve the above purpose, the present invention adopts the following technical scheme: a method of picking a target based on a vision system of a picking robot, the method comprising the sequential steps of:
(1) Acquiring an original fruit image by using a camera;
(2) Preprocessing the collected original fruit image, forming a data set by the preprocessed fruit image, and dividing the data set into a training set, a verification set and a test set;
(3) Establishing a Mask R-CNN model, inputting a training set into the Mask R-CNN model for training, and obtaining a trained Mask R-CNN model;
(4) Optimizing the training process to obtain a Mask R-CNN model with highest training precision;
(5) And loading a Mask R-CNN model with highest training precision on a CPU of the picking robot, and finally realizing target identification picking based on a vision system of the picking robot.
The step (2) specifically comprises the following steps:
(2a) Primary screening: screening 1056 original fruit images which are clear in image and contain fruit targets according to actual requirements;
(2b) Labeling: labeling the preliminarily screened fruit images by using a labelme tool, labeling the ripe fruits as 1, labeling the immature fruits as 0, and setting the areas except the fruits as the background without labeling, and establishing a label image as a target detection label;
(2c) Data set classification: the labeled fruit images form a data set, which is divided into a training set, a verification set and a test set in an 8:1:1 ratio;
(2d) Data amplification: each image in the training and verification sets undergoes 5 data amplifications, namely rotation by 90°, 180° and 270°, horizontal and vertical flipping, color dithering and Gaussian noise, so that the training set contains 4000 images, the verification set 640 images and the test set 640 images.
In step (3), the Mask R-CNN model comprises a backbone network for extracting feature maps from the input image, a region proposal network, a region-of-interest alignment layer and a regional convolutional neural network. The model uses a residual network combined with a feature pyramid as the feature-extraction backbone. The backbone outputs feature maps to the region proposal network, which generates regions of interest and provides candidate object bounding boxes. The region-of-interest alignment layer matches each region of interest with the backbone's feature maps, aggregates and pools the features to a fixed size, and passes them through a fully connected layer to the regional convolutional neural network. That network comprises three branches: the first classifies fruit with a softmax classifier; the second refines target localization with a bounding-box regressor; the third performs contour segmentation of ripe fruit with a fully convolutional network and generates a mask. Finally, the outputs of the three branches are combined to produce images with fruit classification, localization bounding boxes and precise positioning.
Step (4) specifically refers to: a mosaic augmentation method is added to the training process to improve the Mask R-CNN model's ability to detect local targets, optimize the long-tail distribution of the data set and raise training accuracy; when the training accuracy reaches 98.7%, the corresponding model is taken as the Mask R-CNN model with the highest training accuracy.
The step (5) specifically refers to: the method comprises the steps of obtaining fruit images to be picked by using a camera of a picking robot, processing the fruit images to be picked based on a Mask R-CNN model with highest training precision, outputting a target recognition extraction result by the Mask R-CNN model with highest training precision, and picking by the picking robot according to the target recognition extraction result.
According to the above technical scheme, the beneficial effects of the invention are as follows. First, images of ripe fruit are acquired with a high-resolution camera, deep learning is applied to intelligent fruit picking, the network structure is adjusted to the actual use scene, and a Mask R-CNN model is trained that can automatically detect ripe fruit and enable intelligent picking. Second, the Mask R-CNN model is applied to the picking robot's vision system to realize adaptive recognition and extraction of targets, solving the problem that the robot cannot reliably recognize and extract fruits and vegetables in a complex environment; experimental results show the method handles target recognition and extraction well in such environments. Third, the mask branch of the Mask R-CNN model performs pixel-level classification, so its target recognition rate is higher than that of Faster R-CNN, and it provides a degree of screening between ripe and unripe fruit during recognition and extraction.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a flow chart of a vision processing system of the picking robot in the present invention;
FIG. 3 is a diagram of a visual system object recognition and extraction framework.
Detailed Description
As shown in fig. 1, a method for picking targets based on a vision system of a picking robot, the method comprising the following sequential steps:
(1) Acquiring an original fruit image by using a camera;
(2) Preprocessing the collected original fruit image, forming a data set by the preprocessed fruit image, and dividing the data set into a training set, a verification set and a test set;
(3) Establishing a Mask R-CNN model, inputting a training set into the Mask R-CNN model for training, and obtaining a trained Mask R-CNN model;
(4) Optimizing the training process to obtain a Mask R-CNN model with highest training precision, namely an optimal model;
(5) And loading a Mask R-CNN model with highest training precision on a CPU of the picking robot, and finally realizing target identification picking based on a vision system of the picking robot.
The step (2) specifically comprises the following steps:
(2a) Primary screening: according to actual requirements, screen out 1056 original fruit images that are clear and contain fruit targets;
(2b) Labeling: label the preliminarily screened fruit images with the labelme tool, marking ripe fruit as 1 and unripe fruit as 0; regions other than fruit are treated as background and left unlabeled. The resulting label images serve as the target detection labels;
(2c) Data set classification: the labeled fruit images form a data set, which is divided into a training set, a verification set and a test set in an 8:1:1 ratio;
(2d) Data amplification: each image in the training and verification sets undergoes 5 data amplifications, namely rotation by 90°, 180° and 270°, horizontal and vertical flipping, color dithering and Gaussian noise, so that the training set contains 4000 images, the verification set 640 images and the test set 640 images.
In step (3), the Mask R-CNN model comprises a backbone network for extracting feature maps from the input image, a region proposal network, a region-of-interest alignment layer and a regional convolutional neural network. The model uses a residual network combined with a feature pyramid as the feature-extraction backbone. The backbone outputs feature maps to the region proposal network, which generates regions of interest and provides candidate object bounding boxes. The region-of-interest alignment layer matches each region of interest with the backbone's feature maps, aggregates and pools the features to a fixed size, and passes them through a fully connected layer to the regional convolutional neural network. That network comprises three branches: the first classifies fruit with a softmax classifier; the second refines target localization with a bounding-box regressor; the third performs contour segmentation of ripe fruit with a fully convolutional network and generates a mask. Finally, the outputs of the three branches are combined to produce images with fruit classification, localization bounding boxes and precise positioning.
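The classification branch described above relies on a softmax classifier to turn each region's raw scores into class probabilities. A minimal stdlib sketch of that step follows; the three-class layout (background, unripe fruit, ripe fruit) and the logit values are illustrative assumptions, not taken from the patent:

```python
import math

def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical per-region logits for [background, unripe fruit, ripe fruit]
logits = [0.5, 1.2, 3.1]
probs = softmax(logits)
pred = max(range(len(probs)), key=probs.__getitem__)  # index of the top class
```

In the full model this computation runs once per region of interest; off-the-shelf implementations of the residual-network-plus-FPN design described here include torchvision's `maskrcnn_resnet50_fpn`.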
Step (4) specifically refers to: a mosaic augmentation method is added to the training process to improve the Mask R-CNN model's ability to detect local targets, optimize the long-tail distribution of the data set and raise training accuracy; when the training accuracy reaches 98.7%, the corresponding model is taken as the Mask R-CNN model with the highest training accuracy. After initial training on the 4000 images, analysis of the results showed that the training accuracy had not reached the expected value. Mosaic augmentation was therefore added, improving the model's ability on local targets and optimizing the long-tail distribution of the data set; the final training accuracy reached 98.7%, and the detection speed also improved over the unimproved model.
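Mosaic augmentation stitches several training images into one composite, so each training sample exposes the model to more object instances and more scale variation. A stdlib-only sketch on 2D pixel grids follows; the two-by-two quadrant layout and equal image sizes are simplifying assumptions, and a real pipeline would also remap bounding boxes and masks into mosaic coordinates:

```python
def mosaic(tl, tr, bl, br):
    """Stitch four equally sized 2D grids into one quadrant mosaic."""
    h = len(tl)
    assert len(tr) == len(bl) == len(br) == h, "all quadrants must match in height"
    top = [row_l + row_r for row_l, row_r in zip(tl, tr)]
    bottom = [row_l + row_r for row_l, row_r in zip(bl, br)]
    return top + bottom

a = [[1, 1], [1, 1]]
b = [[2, 2], [2, 2]]
c = [[3, 3], [3, 3]]
d = [[4, 4], [4, 4]]
m = mosaic(a, b, c, d)
# m is a 4x4 grid: top rows are [1, 1, 2, 2], bottom rows are [3, 3, 4, 4]
```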
The step (5) specifically refers to: the method comprises the steps of obtaining fruit images to be picked by using a camera of a picking robot, processing the fruit images to be picked based on a Mask R-CNN model with highest training precision, outputting a target recognition extraction result by the Mask R-CNN model with highest training precision, and picking by the picking robot according to the target recognition extraction result.
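Once the highest-precision model is deployed on the robot, its detections must be filtered into picking targets. The output format below (a list of dicts with `label`, `score` and `box`) is a hypothetical convention for illustration; the ripe-equals-1 encoding follows the labeling step, while the 0.9 confidence threshold is an assumed tuning parameter:

```python
RIPE = 1  # per the labeling convention: ripe fruit labeled 1, unripe 0

def pickable_targets(detections, score_threshold=0.9):
    """Keep only confident ripe-fruit detections, highest score first."""
    keep = [d for d in detections
            if d["label"] == RIPE and d["score"] >= score_threshold]
    return sorted(keep, key=lambda d: d["score"], reverse=True)

detections = [
    {"label": 1, "score": 0.98, "box": (120, 80, 190, 150)},
    {"label": 0, "score": 0.95, "box": (40, 60, 95, 110)},   # unripe: skipped
    {"label": 1, "score": 0.62, "box": (220, 40, 260, 90)},  # low confidence
]
targets = pickable_targets(detections)
```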
The invention is further described below in connection with fig. 1 to 3.
1. Image acquisition
A Sony A7R4 camera is used; it has 61 million pixels and an image resolution of 9504 × 6336 pixels. Tomato image data were collected manually with the hand-held camera in an orchard and a greenhouse. When acquiring picture data, the following considerations apply:
1. ensure the collected pictures are shot from all angles and match the actual scene;
2. in the image data set, the detected objects should appear with varied shapes, viewing angles, relative sizes, rotation angles, etc.;
3. ensure the quality of the collected data: image sizes should be close to those of the deployment scene, and the data should cover as much in-scene diversity and as many natural-scene photos as possible;
4. the data set should be as large as possible, to avoid experimental contingency and improve target detection accuracy;
5. the data set may contain both target and non-target objects, but only the target objects are labeled;
6. if multiple classes are to be detected, each detected class should appear a similar number of times in the data;
7. the number of images in the data set can be increased by image enhancement.
2. Image processing
The photos collected by the camera are screened: photos with abnormal angles, imaging defects or blur are removed, and the remaining screened photos are made into the training and verification sets for target detection.
The training set is used for training the model and determining parameters;
the verification set is used for determining the network structure and adjusting the super parameters of the model;
a test set for checking generalization ability of the model;
parameters, which refer to variables obtained by learning a model, such as weights and biases;
hyper-parameters are set according to experience, such as the number of iterations, the number of hidden layers, the number of neurons per layer and the learning rate. Thus, a complete scientific data set was created, with 800 pictures as the training set, 128 as the verification set and 128 as the test set.
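The division of the 1056 screened originals into 800 training, 128 verification and 128 test images can be sketched with a seeded shuffle; the function name, file-name pattern and seed below are illustrative, since the patent does not specify how images are assigned:

```python
import random

def split_dataset(items, n_train=800, n_val=128, n_test=128, seed=42):
    """Shuffle a list of image paths and slice it into train/verification/test."""
    assert n_train + n_val + n_test == len(items), "split sizes must cover the data set"
    shuffled = items[:]                      # copy so the input is untouched
    random.Random(seed).shuffle(shuffled)    # seeded for reproducibility
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

paths = [f"fruit_{i:04d}.jpg" for i in range(1056)]
train, val, test = split_dataset(paths)
```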
Data amplification can improve the generalization performance of a neural network. Therefore, 5 data amplification modes are applied to each image in the training and verification sets, namely rotation by 90°, 180° and 270°, horizontal and vertical flipping, color dithering and Gaussian noise. After the data are enhanced, the corresponding labeling information is updated, which completes the data-set expansion and further improves the training accuracy of the Mask R-CNN model. Finally, the training set has 4000 images, the verification set 640 images and the test set 640 images.
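The per-image amplification (which turns the 800 training originals into 4000) can be sketched on plain 2D pixel grids. This stdlib-only version implements the three rotations and the two flips as the five variants, with Gaussian noise as a separate helper; color dithering is omitted, and a real pipeline would use an image library such as Pillow or OpenCV rather than nested lists:

```python
import random

def rotate90(img):
    """Rotate a 2D pixel grid 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def flip_h(img):
    """Mirror the grid left-to-right (horizontal flip)."""
    return [row[::-1] for row in img]

def flip_v(img):
    """Mirror the grid top-to-bottom (vertical flip)."""
    return img[::-1]

def add_gaussian_noise(img, sigma=5.0, seed=0):
    """Add Gaussian noise and clamp pixel values to [0, 255]."""
    rng = random.Random(seed)
    return [[min(255, max(0, int(p + rng.gauss(0, sigma)))) for p in row]
            for row in img]

def amplify(img):
    """Five augmented variants per original, matching the 800 -> 4000 expansion."""
    r90 = rotate90(img)
    r180 = rotate90(r90)
    r270 = rotate90(r180)
    return [r90, r180, r270, flip_h(img), flip_v(img)]

img = [[10, 20], [30, 40]]
variants = amplify(img)
noisy = add_gaussian_noise(img)
```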
The vision system of the picking robot mainly comprises the following three parts: the camera acquires the scene image, the vision system processes the scene image, saves and returns the processing result, and the whole flow is shown in figure 2.
The Mask R-CNN model is a neural network framework for object segmentation and recognition in a single image. It adopts the anchor mechanism of the R-CNN series of networks, combines a feature pyramid network (FPN) to optimize recognition of objects at different scales, and introduces a fully convolutional network (Fully Convolutional Network, FCN) to achieve accurate object segmentation. The overall implementation flow of the model is shown in FIG. 3.
The Mask R-CNN model can detect target objects in an image and mark the target regions. Building on Faster R-CNN, Mask R-CNN adds a branch mask prediction network, realizing target detection together with high-quality segmentation of target instances.
In summary, the invention acquires ripe-fruit images with a high-resolution camera, applies deep learning to intelligent fruit picking, adjusts the network structure to the actual use scene, and trains a Mask R-CNN model that can automatically detect ripe fruit and realize intelligent picking. Applying the Mask R-CNN model to the picking robot's vision system enables adaptive recognition and extraction of targets, solving the problem that the robot cannot reliably recognize and extract fruits and vegetables in a complex environment; experimental results show that target recognition and extraction in complex environments are handled well. Practical testing shows that the algorithm not only solves the problem of recognizing and extracting tomatoes in a complex surrounding environment, but also screens ripe from unripe tomatoes effectively.
Claims (5)
1. A picking method of targets based on a picking robot vision system is characterized in that: the method comprises the following steps in sequence:
(1) Acquiring an original fruit image by using a camera;
(2) Preprocessing the collected original fruit image, forming a data set by the preprocessed fruit image, and dividing the data set into a training set, a verification set and a test set;
(3) Establishing a Mask R-CNN model, inputting a training set into the Mask R-CNN model for training, and obtaining a trained Mask R-CNN model;
(4) Optimizing the training process to obtain a Mask R-CNN model with highest training precision;
(5) And loading a Mask R-CNN model with highest training precision on a CPU of the picking robot, and finally realizing target identification picking based on a vision system of the picking robot.
2. The picking robot vision system-based target picking method of claim 1, characterized by: the step (2) specifically comprises the following steps:
(2a) Primary screening: according to actual requirements, screen out 1056 original fruit images that are clear and contain fruit targets;
(2b) Labeling: label the preliminarily screened fruit images with the labelme tool, marking ripe fruit as 1 and unripe fruit as 0; regions other than fruit are treated as background and left unlabeled. The resulting label images serve as the target detection labels;
(2c) Data set classification: the labeled fruit images form a data set, which is divided into a training set, a verification set and a test set in an 8:1:1 ratio;
(2d) Data amplification: each image in the training and verification sets undergoes 5 data amplifications, namely rotation by 90°, 180° and 270°, horizontal and vertical flipping, color dithering and Gaussian noise, so that the training set contains 4000 images, the verification set 640 images and the test set 640 images.
3. The picking robot vision system-based target picking method of claim 1, characterized by: in step (3), the Mask R-CNN model comprises a backbone network for extracting feature maps from the input image, a region proposal network, a region-of-interest alignment layer and a regional convolutional neural network. The model uses a residual network combined with a feature pyramid as the feature-extraction backbone. The backbone outputs feature maps to the region proposal network, which generates regions of interest and provides candidate object bounding boxes. The region-of-interest alignment layer matches each region of interest with the backbone's feature maps, aggregates and pools the features to a fixed size, and passes them through a fully connected layer to the regional convolutional neural network. That network comprises three branches: the first classifies fruit with a softmax classifier; the second refines target localization with a bounding-box regressor; the third performs contour segmentation of ripe fruit with a fully convolutional network and generates a mask. Finally, the outputs of the three branches are combined to produce images with fruit classification, localization bounding boxes and precise positioning.
4. The picking robot vision system-based target picking method of claim 1, characterized by: the step (4) specifically refers to: a mosaic enhancement method is added in the training process to improve the local feature extraction capability of the Mask R-CNN model, optimize the long-tail distribution of the data set and improve the training precision; when the training precision reaches 98.7%, the corresponding Mask R-CNN model with the highest training precision is obtained.
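Mosaic enhancement stitches four training images into one composite so that each sample exposes the model to more object instances and scales. A minimal sketch follows, assuming four equal-size row-major images joined at the midpoint; real mosaic augmentation typically also rescales the inputs and chooses a random split point.

```python
def mosaic(imgs):
    """Stitch four equal-size images (nested row-major lists) into one
    2x2 composite: a|b over c|d."""
    a, b, c, d = imgs
    top = [ra + rb for ra, rb in zip(a, b)]
    bottom = [rc + rd for rc, rd in zip(c, d)]
    return top + bottom

# four hypothetical 1x1 "images"
composite = mosaic([[[0]], [[1]], [[2]], [[3]]])
# → [[0, 1], [2, 3]]
```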
5. The picking robot vision system-based target picking method of claim 1, characterized by: the step (5) specifically refers to: a fruit image to be picked is acquired by the camera of the picking robot and processed by the Mask R-CNN model with the highest training precision; the model outputs a target recognition and extraction result, and the picking robot performs picking according to that result.
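The use of the recognition result in step (5) can be sketched as a simple post-processing step that turns model detections into an ordered picking list. The detection-dictionary format, the "ripe" label and the 0.5 confidence threshold are assumptions for illustration, not specified in the patent.

```python
def select_picking_targets(detections, conf_threshold=0.5):
    """Filter the model output to ripe fruit above a confidence
    threshold and order targets so the most confident is picked first."""
    ripe = [d for d in detections
            if d["label"] == "ripe" and d["score"] >= conf_threshold]
    return sorted(ripe, key=lambda d: d["score"], reverse=True)

# hypothetical model output for one camera frame
dets = [
    {"label": "ripe", "score": 0.92, "box": (10, 20, 60, 80)},
    {"label": "unripe", "score": 0.88, "box": (100, 20, 150, 80)},
    {"label": "ripe", "score": 0.40, "box": (200, 30, 240, 90)},
]
targets = select_picking_targets(dets)
# → one target: the ripe fruit with score 0.92
```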
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310210531.2A CN116363505A (en) | 2023-03-07 | 2023-03-07 | Target picking method based on picking robot vision system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116363505A true CN116363505A (en) | 2023-06-30 |
Family
ID=86910986
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310210531.2A Pending CN116363505A (en) | 2023-03-07 | 2023-03-07 | Target picking method based on picking robot vision system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116363505A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116935070A (en) * | 2023-09-19 | 2023-10-24 | 北京市农林科学院智能装备技术研究中心 | Modeling method for picking target of fruit cluster picking robot |
CN116935070B (en) * | 2023-09-19 | 2023-12-26 | 北京市农林科学院智能装备技术研究中心 | Modeling method for picking target of fruit cluster picking robot |
CN117456368A (en) * | 2023-12-22 | 2024-01-26 | 安徽大学 | Fruit and vegetable identification picking method, system and device |
CN117456368B (en) * | 2023-12-22 | 2024-03-08 | 安徽大学 | Fruit and vegetable identification picking method, system and device |
CN117617002A (en) * | 2024-01-04 | 2024-03-01 | 太原理工大学 | Method for automatically identifying tomatoes and intelligently harvesting tomatoes |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Dias et al. | Multispecies fruit flower detection using a refined semantic segmentation network | |
CN116363505A (en) | Target picking method based on picking robot vision system | |
Xu et al. | Fast method of detecting tomatoes in a complex scene for picking robots | |
CN109784204B (en) | Method for identifying and extracting main fruit stalks of stacked cluster fruits for parallel robot | |
CN106525732B (en) | Rapid nondestructive detection method for internal and external quality of apple based on hyperspectral imaging technology | |
CN114387520B (en) | Method and system for accurately detecting compact Li Zijing for robot picking | |
CN111178177A (en) | Cucumber disease identification method based on convolutional neural network | |
CN113160062B (en) | Infrared image target detection method, device, equipment and storage medium | |
CN112990103B (en) | String mining secondary positioning method based on machine vision | |
Zhang et al. | A method for organs classification and fruit counting on pomegranate trees based on multi-features fusion and support vector machine by 3D point cloud | |
CN113222959B (en) | Fresh jujube wormhole detection method based on hyperspectral image convolutional neural network | |
Ünal et al. | Classification of hazelnut kernels with deep learning | |
Chen et al. | A surface defect detection system for golden diamond pineapple based on CycleGAN and YOLOv4 | |
Chen et al. | Segmentation of field grape bunches via an improved pyramid scene parsing network | |
Peng et al. | Litchi detection in the field using an improved YOLOv3 model | |
CN116110042A (en) | Tomato detection method based on CBAM attention mechanism of YOLOv7 | |
Ma et al. | Using an improved lightweight YOLOv8 model for real-time detection of multi-stage apple fruit in complex orchard environments | |
CN109596620A (en) | Product surface shape defect detection method and system based on machine vision | |
CN111353432A (en) | Rapid honeysuckle medicinal material cleaning method and system based on convolutional neural network | |
Kong et al. | Detection model based on improved faster-RCNN in apple orchard environment | |
Huang et al. | Mango surface defect detection based on HALCON | |
Melnychenko et al. | Apple detection with occlusions using modified YOLOv5-v1 | |
CN116452872A (en) | Forest scene tree classification method based on improved deep pavv3+ | |
CN116071653A (en) | Automatic extraction method for multi-stage branch structure of tree based on natural image | |
CN115100533A (en) | Training and using method of litchi maturity recognition model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||