CN115271200B - Intelligent coherent picking system for famous tea - Google Patents
- Publication number
- CN115271200B (application CN202210876003.6A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06Q10/047 — Optimisation of routes or paths, e.g. travelling salesman problem
- G06N3/084 — Backpropagation, e.g. using gradient descent
- G06Q50/02 — Agriculture; Fishing; Forestry; Mining
- G06V10/25 — Determination of region of interest [ROI] or a volume of interest [VOI]
- G06V10/762 — Pattern recognition or machine learning using clustering
- G06V10/77 — Processing image or video features in feature spaces, e.g. principal component analysis [PCA]
- G06V10/774 — Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
- G06V10/80 — Fusion, i.e. combining data from various sources at the sensor, preprocessing, feature extraction or classification level
- G06V10/82 — Image or video recognition or understanding using neural networks
- G06V2201/07 — Target detection
Abstract
The invention provides an intelligent coherent picking system for famous tea, comprising a target identification module, a path planning module, a picking execution module and an operation control module; the operation control module is in communication interaction with each of the other three modules. The target identification module obtains picking points by constructing a tea bud image data set, improving the YoloV5 network model, and fitting the minimum circumscribed cuboid of the three-dimensional point cloud of the tea buds. The system achieves accurate positioning of tea buds through the visual recognition system and realizes intelligent, mechanized tea picking through the picking robot, so that a large amount of labor is no longer required; both picking efficiency and famous tea yield are high.
Description
Technical Field
The invention relates to the technical field of tea picking, in particular to an intelligent coherent picking system for famous tea.
Background
In recent years, the planting and production of famous and excellent tea have developed considerably in China, markedly increasing the income of tea farmers. Famous tea is currently picked either by hand or with a handheld tea picking machine. Manual picking is highly selective and accurate and rarely damages the tea, but it consumes a large amount of labor and its efficiency is low. A handheld picking machine is inexpensive, efficient, and saves labor, but most such machines cut with a single reciprocating blade. Because tea growth is affected by illumination, temperature, humidity, irrigation and the like, tea buds differ in height; they are also small in volume, varied in form, and densely distributed, so single-blade reciprocating cutting very easily causes missed picking, wrong picking, and even breakage of the tea leaves. In addition, because the tea obtained by existing handheld tea picking machines is uneven, subsequent classification and screening must still be carried out manually, which increases the labor cost and time of the tea picking process.
Disclosure of Invention
In view of the problems in the prior art, the invention aims to provide an intelligent coherent picking system for famous tea that achieves accurate positioning of tea buds through a visual identification system and realizes intelligent, mechanized tea picking through a picking robot, so that a large amount of labor is not needed, picking efficiency is high, and famous tea yield is high.
The aim of the invention is achieved by the following technical scheme:
an intelligent coherent picking system for famous tea, characterized in that it comprises a target identification module, a path planning module, a picking execution module and an operation control module; the operation control module is in communication interaction with each of the target identification module, the path planning module and the picking execution module;
the target identification module is used for acquiring picking points of tea buds;
the path planning module is used for planning the travel path of the picking execution module; specifically, the operation control module inputs the picking points obtained by the target identification module and the current position of the picking execution module into the path planning module, which fits the three-dimensional coordinates of the picking execution module to the three-dimensional coordinates of the picking points to obtain the shortest, optimal picking path;
the picking execution module comprises a mobile mechanical arm and an end effector; the mechanical arm moves the end effector to each picking point along the path planned by the path planning module. The end effector comprises a shearing part and a telescopic part: closing the shearing part picks the tea bud, and retracting the telescopic part while opening the shearing part collects the tea.
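The claims leave the path-planning method unspecified beyond "shortest, optimal". As a minimal sketch of one possible realization (the function name and tuple representation are illustrative, not from the patent), the picking points can be ordered greedily by nearest-neighbour distance from the end effector's current position:

```python
import math

def plan_path(start, picking_points):
    """Greedy nearest-neighbour ordering of picking points.

    start: (x, y, z) of the end effector; picking_points: list of (x, y, z).
    Returns a visiting order. A heuristic sketch only -- the patent does
    not specify its path optimisation method.
    """
    remaining = list(picking_points)
    path = []
    current = start
    while remaining:
        nearest = min(remaining, key=lambda p: math.dist(current, p))
        remaining.remove(nearest)
        path.append(nearest)
        current = nearest
    return path

# Starting at the origin, the three collinear points are visited in order.
order = plan_path((0, 0, 0), [(3, 0, 0), (1, 0, 0), (2, 0, 0)])
```

A true shortest tour over many points is a travelling-salesman problem (cf. classification G06Q10/047); the greedy ordering above is only an approximation.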
As a further refinement, the target identification module acquires the picking points of tea buds in the following steps:
s10, constructing a tea bud image data set; s20, constructing a feature map with rich semantic information through a Bi-directional feature pyramid network (Bi-Directional Feature Pyramid Network, biFPN) and a channel attention mechanism (Efficient Channel Attention, ECA) in the data set of the step S10, so as to improve a YoloV5 network model, obtain an improved YoloV5 network model and realize detection of small-size tea buds; s30, obtaining a three-dimensional point cloud of the tea based on the training result of the improved YoloV5 network model in the step S20; screening out three-dimensional point clouds of tea buds from the three-dimensional point clouds of tea leaves; finally, fitting a minimum external cuboid of the tea buds to obtain the accurate positions and picking points of the tea buds;
the YoloV5 network model comprises a Backbone module, a Neck module and a Head module; the backstone module comprises a Focus module, an SPP module and a CBS module which are used for slicing the picture, and a CSP module which is used for enhancing the learning performance of the whole convolutional neural network; the Neck module comprises a CBS module and a CSP module; the Head module includes a Detect layer that utilizes mesh-based anchors for object detection on feature maps of different scales.
Preferably, the YoloV5 network model adopts the variant with the smallest model file size and the smallest feature map depth and width.
Further preferably, step S10 specifically comprises: firstly, acquiring image data of tea buds with an RGB-D camera to obtain color images and depth images of the tea buds; then labeling the color images with a labeling tool and performing data set enhancement operations to expand the number of samples, thereby constructing the tea bud image data set; finally, dividing the data set into a training set, a test set and a verification set;
the step S20 specifically includes:
s21, preprocessing the images in the training set in the step S10, and unifying the resolutions of all the images in the training set; inputting the preprocessed image into a Backbone module to obtain feature images with different sizes; s22, inputting the feature graphs with different sizes in the step S21 into a Neck module, and adopting a bidirectional feature pyramid network to replace an original path aggregation network (Path Aggregation Network, PANet) in the Neck module to perform multi-feature fusion; then up-sampling and down-sampling the feature images in sequence, generating feature images with various sizes through the splicing of a channel attention mechanism, and inputting the feature images into a Detect layer of a Head module; s23, carrying out back propagation by combining a plurality of loss functions, and updating and adjusting weight parameters for gradients in the model; and S24, verifying the existing model by adopting the verification set in the step S10 to obtain the improved YoloV5 network model.
Preferably, the labeling tool is a Labelimg labeling tool.
Further preferably, the step S30 specifically includes:
s31, acquiring detection frame coordinates according to the result of the improved YoloV5 network model in the step S20, and generating a color image and a region of interest (Region of Interest, ROI) of a corresponding depth image;
s32, according to the mapping relation between the pixel coordinates of the depth image and the pixel coordinates of the color image, obtaining corresponding mapped color image coordinates through the coordinate values, the pixel values and the recording distances of the depth image;
s33, obtaining a three-dimensional point cloud of the tea through coordinate fusion of the color image and the depth image, wherein the three-dimensional point cloud comprises the following specific steps of:
$$x = \frac{(u - c_x)\,d}{f_x},\qquad y = \frac{(v - c_y)\,d}{f_y},\qquad z = d$$
where $(x, y, z)$ are the coordinates of the three-dimensional point cloud; $(u, v)$ are the pixel coordinates of the color image; $d$ is the depth value, obtained from the depth image; $f_x$, $f_y$ are the camera focal lengths; and $(c_x, c_y)$ is the principal point of the camera;
s34, because the generated three-dimensional point cloud of the tea leaves comprises tea leaves buds and background point clouds thereof, an average value of the three-dimensional point cloud of the tea leaves is obtained through calculation, and the average value is used as a distance threshold value; filtering the background point cloud which is larger than the distance threshold value to obtain a primarily segmented three-dimensional point cloud; adopting a DBSCAN clustering algorithm, and setting a parameter radius Eps and the minimum sample number M required to be contained in the neighborhood p The primarily segmented three-dimensional point clouds are gathered into one type, and three-dimensional point clouds of tea buds are screened out;
s35, adopting a principal component analysis (Principal Component Analysis, PCA) to fit the minimum external cuboid of the tea buds at the position according to the growth posture of the tea buds; calculating to obtain the coordinates of each vertex of the cuboid; and obtaining the coordinates of the center point of the bottom surface of the cuboid by solving the average value of four vertexes of the bottom surface of the cuboid, and taking the point as a picking point of tea buds.
Further preferably, the step S35 specifically includes:
s351, screening three main directions of the three-dimensional point cloud of the tea buds, namely x, y and z directions by adopting a principal component analysis method, and calculating mass centers and covariance to obtain a covariance matrix; the method comprises the following steps:
wherein P is c Representing centroid coordinates of the three-dimensional point cloud; n represents the number of three-dimensional point clouds (i.e., the number of points); (x) i ,y i ,z i ) Representing three-dimensional coordinates of the i-th point;
wherein C is p Representing a covariance matrix of the three-dimensional point cloud;
s352, singular value decomposition is carried out on the covariance matrix to obtain a characteristic value and a characteristic vector, wherein the specific formula is as follows:
in U p Representing covariance matrix C p C p T Is a feature vector matrix of (a); d (D) p Indicating that a non-0 value on a diagonal is C p C p T A diagonal matrix of square roots of non-0 eigenvalues;represents a C p T C p Is a feature vector matrix of (a);
the direction of the feature vector corresponding to the maximum feature value is the main axis direction of the cuboid;
s353, projecting coordinate points onto the direction vector, calculating the position coordinates P of each vertex i Interior to coordinate point unit vectorObtaining the maximum value and the minimum value of the product in each direction, enabling a, b and c to be the average value of the maximum value and the minimum value of x, y and z respectively, obtaining the center point O and the length L of the cuboid, and generating the cuboid with the most proper and compact tea bud;
the specific formula is as follows:
O=ax+by+cz;
wherein X is a unit vector of the coordinate point in the X direction; y is a unit vector of the coordinate point in the Y direction; z is a unit vector of the coordinate point in the Z direction; l (L) x 、L y 、L z The lengths of the cuboids in the x direction, the y direction and the z direction are respectively;
s354, judging the coordinates of the smallest four points in the y direction of the cuboid to be used as four vertex coordinates of the bottom surface of the cuboid; and finally, obtaining the coordinate of the center point of the bottom surface of the cuboid, namely the picking point, through the average value of the four vertex coordinates.
The invention has the following technical effects:
according to the method, the small targets of the tea shoots under the large visual field are checked and positioned through the deep learning method, firstly, the detection model of the small-size tea shoots is constructed, and meanwhile, the semantic expression and positioning capability of the image on multiple scales are enhanced through the improved YoloV5 network model, so that the method is suitable for judging and identifying the tea shoots with small targets, different forms and dense distribution space, misjudgment and missed judgment caused by the difference or mutual overlapping of the tea shoots in the tea shoot identification process are avoided, and the accuracy of identification and judgment is improved. Simultaneously, this application is through the minimum external cuboid of fitting tealeaves tender bud, realize the accurate positioning of tealeaves picking the point (use minimum external cuboid bottom surface central point to be the picking point of tealeaves tender bud promptly), effectively avoid picking the damage that causes tealeaves tender bud, ensure simultaneously that the whole tealeaves tender bud of most effective, complete picking, ensure the quality of tealeaves. Through the cooperation of the target identification module, the path planning module, the picking execution module and the operation control module, the shortest and optimal picking feeding path can be obtained efficiently, the picking efficiency is improved, and the picking flow is simplified, so that automatic and mechanical tea picking is realized, the use of labor force is reduced, and the time and cost for tea picking are saved.
Drawings
Fig. 1 is a flow chart of the tea picking system in an embodiment of the invention.
Fig. 2 is a flow chart of the target recognition module obtaining picking points in an embodiment of the invention.
Fig. 3 is a schematic diagram of a picture marked with the labeling tool in an embodiment of the invention.
Fig. 4 is a multi-scale feature fusion structure diagram based on the bidirectional feature pyramid network structure in an embodiment of the invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Examples:
an intelligent coherent picking system for famous tea, characterized in that it comprises a target identification module, a path planning module, a picking execution module and an operation control module; the operation control module communicates and interacts with each of the target identification module, the path planning module and the picking execution module;
the target identification module is used for acquiring picking points of tea buds, and specifically comprises the following steps:
s10, constructing a tea bud image data set: firstly, acquiring image data of tea buds by using an RGB-D camera to obtain color images and depth images of the tea buds; then, marking the color image by using a marking tool, such as a Labelimg marking tool (shown in figure 1), performing dataset enhancement operation (the dataset enhancement operation can be realized by adopting the conventional technical means, such as space conversion, color conversion and the like, and expanding the number of datasets to construct a tea shoot image dataset; and finally, dividing the data set to form a training set, a testing set and a verification set.
S20, constructing a feature map with rich semantic information through a Bi-directional feature pyramid network (Bi-Directional Feature Pyramid Network, biFPN) and a channel attention mechanism (Efficient Channel Attention, ECA) in the data set of the step S10, so as to improve a YoloV5 network model, obtain an improved YoloV5 network model and realize detection of small-size tea buds;
the YoloV5 network model adopts a network model with the minimum model file size and the minimum depth and width of a feature map, and comprises a Backbone module, a Neck module and a Head module; the backstone module comprises a Focus module, an SPP module and a CBS module which are used for slicing the picture, and a CSP module which is used for enhancing the learning performance of the whole convolutional neural network; the Neck module comprises a CBS module and a CSP module; the Head module comprises a Detect layer for detecting targets on feature graphs with different scales by using a grid-based anchor;
the method specifically comprises the following steps:
s21, preprocessing the images in the training set in the step S10, and unifying the resolutions of all the images in the training set; inputting the preprocessed image into a Backbone module to obtain feature images with different sizes;
s22, inputting the feature graphs with different sizes in the step S21 into a Neck module, and adopting a bidirectional feature pyramid network to replace an original path aggregation network (Path Aggregation Network, PANet) in the Neck module to perform multi-feature fusion; then up-sampling and down-sampling the feature images in sequence, generating feature images with various sizes through the splicing of a channel attention mechanism, and inputting the feature images into a Detect layer of a Head module;
in the yolv5 network model (i.e., in the existing yolv5 network structure), the reinforcing feature is used to extract BiFPN, upsample p5_in, and then, after upsampling, bifpn_concat stacking with p4_in to obtain p4_td; then upsampling is carried out on the P4_td, and BiFPN_Concat stacking is carried out on the upsampled P3_in to obtain P3_out; then, carrying out downsampling on the P3_out, and carrying out BiFPN_Concat stacking with the P4_td after downsampling to obtain the P4_out; and then downsampling is carried out on the P4_out, and the downsampled P5_out is stacked with the P5_in to obtain the P5_out. According to the method, efficient bidirectional cross connection is used for feature fusion, nodes with small contribution to feature fusion in PANet are removed, additional connection is added between input nodes and output nodes at the same level, more features are fused without additional cost, and semantic expression and positioning capability on multiple scales are enhanced, as shown in figure 2.
ECA is then added after the 9th layer. This module converts the input feature map from an [h, w, c] matrix into a [1, c] vector by global average pooling, then calculates an adaptive one-dimensional convolution kernel size and uses it in a one-dimensional convolution to obtain the weight of each channel of the feature map; the normalized weights are multiplied channel by channel with the original input feature map to produce a weighted feature map.
This attention mechanism uses a 1×1 convolution after the global average pooling layer, removing the fully connected layer and avoiding dimensionality reduction; it effectively captures cross-channel interaction and ultimately improves both the probability of correctly judging an object and the detection precision of the model. The specific formula is:

$$k = \psi(C) = \left|\frac{\log_2 C}{\gamma} + \frac{b}{\gamma}\right|_{odd}$$

where $C$ is the channel dimension; $k$ is the convolution kernel size; $|t|_{odd}$ denotes the nearest odd number to $t$; and $\gamma$ and $b$ are set to 2 and 1 respectively;
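The adaptive kernel-size calculation can be sketched as below; the round-up-to-odd rule follows the published ECA formulation, and the exact rounding used in the patent's implementation may differ:

```python
import math

def eca_kernel_size(channels, gamma=2, b=1):
    """Adaptive 1-D convolution kernel size k = |log2(C)/gamma + b/gamma|_odd.

    Truncates to an integer, then bumps even results to the next odd number,
    matching the common ECA reference implementation.
    """
    t = abs(math.log2(channels) / gamma + b / gamma)
    k = int(t)
    return k if k % 2 == 1 else k + 1

# Kernel sizes for typical channel counts.
sizes = {c: eca_kernel_size(c) for c in (64, 128, 256, 512)}
```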
s23, carrying out back propagation by combining various loss functions (such as classification loss, positioning loss, execution loss and the like), and updating and adjusting weight parameters for gradients in the model;
and S24, verifying the existing model by adopting the verification set in the step S10 to obtain the improved YoloV5 network model.
S30, obtaining a three-dimensional point cloud of the tea based on the training result of the improved YoloV5 network model in the step S20; screening out three-dimensional point clouds of tea buds from the three-dimensional point clouds of tea leaves; finally, fitting a minimum external cuboid of the tea buds to obtain the accurate positions and picking points of the tea buds; the method comprises the following steps:
s31, firstly, according to the result of the improved YoloV5 network model in the step S20, acquiring detection frame coordinates, and generating a color image and a region of interest (Region of Interest, ROI) of a corresponding depth image;
s32, obtaining corresponding mapped color image coordinates according to the mapping relation between the pixel coordinates of the depth image and the pixel coordinates of the color image and through the coordinate values, the pixel values and the recording distances of the depth image;
s33, obtaining a three-dimensional point cloud of the tea through coordinate fusion of the color image and the depth image, wherein the three-dimensional point cloud comprises the following specific steps of:
x = (u − c_x) · d / f_x; y = (v − c_y) · d / f_y; z = d
wherein (x, y, z) represents the coordinate system of the three-dimensional point cloud; (u, v) represents the coordinate system of the color image; d represents the depth value, obtained from the depth image; f_x, f_y represent the camera focal lengths; (c_x, c_y) is the principal point of the camera;
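A minimal sketch of this pixel-to-point back-projection, assuming a standard pinhole camera model (the principal point c_x, c_y is not named in the original text but is required by the mapping):

```python
def pixel_to_point(u, v, d, fx, fy, cx, cy):
    """Back-project a colour-aligned depth pixel (u, v) with depth value d
    into a 3-D camera-space point (x, y, z) using pinhole intrinsics."""
    x = (u - cx) * d / fx
    y = (v - cy) * d / fy
    z = d
    return (x, y, z)
```

Applying this to every pixel of the region of interest yields the three-dimensional point cloud of the tea used in step S34.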
s34, because the generated three-dimensional point cloud of the tea leaves contains the tea buds together with their background point cloud, the average value of the tea point cloud is calculated and used as a distance threshold; the background points farther than this threshold are filtered out to obtain a primarily segmented three-dimensional point cloud; the DBSCAN clustering algorithm is then adopted, with the neighborhood radius Eps and the minimum number of samples M_p required within the neighborhood set as parameters, to cluster the primarily segmented point cloud and screen out the three-dimensional point cloud of the tea buds;
wherein the DBSCAN clustering algorithm randomly selects a data sample in the space and determines whether the number of samples falling within its neighborhood radius Eps is greater than or equal to the minimum sample number M_p, which decides whether it is a core object:
if so, all points in the neighborhood are divided into the same cluster; meanwhile, on the basis of this cluster, all density-reachable samples are found by breadth-first search and divided into the cluster;
if the data sample is a non-core object, it is marked as a noise point and removed;
the formula is specifically as follows:
N_Eps(p) = { q ∈ D | dist(p, q) ≤ Eps };
wherein D represents the point cloud sample set; p and q represent sample points in the sample set;
for any p ∈ D, if its Eps-neighborhood contains at least M_p samples, i.e. |N_Eps(p)| ≥ M_p, then p is a core object; if q lies within the Eps-neighborhood of p and p is a core object, then q is density-reachable from p;
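Step S34's threshold filtering and DBSCAN clustering can be sketched with a small pure-NumPy implementation (in practice a library such as scikit-learn's DBSCAN would be used; the Eps and M_p values in the test are illustrative, not the patent's):

```python
import numpy as np

def filter_background(points):
    """S34's coarse segmentation sketch: keep points whose depth (z) does
    not exceed the mean depth of the cloud, removing the far background."""
    z = points[:, 2]
    return points[z <= z.mean()]

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: label each point with a cluster id, -1 for noise."""
    n = len(points)
    labels = np.full(n, -1)
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    neighbors = [np.flatnonzero(dists[i] <= eps) for i in range(n)]
    visited = np.zeros(n, dtype=bool)
    cluster = 0
    for i in range(n):
        if visited[i] or len(neighbors[i]) < min_pts:
            continue                      # skip visited points and non-core seeds
        visited[i] = True
        labels[i] = cluster
        queue = [i]                       # breadth-first expansion, as in S34
        while queue:
            j = queue.pop(0)
            for q in neighbors[j]:
                if labels[q] == -1:
                    labels[q] = cluster   # border or core neighbour joins cluster
                if not visited[q]:
                    visited[q] = True
                    if len(neighbors[q]) >= min_pts:
                        queue.append(q)   # only core objects keep expanding
        cluster += 1
    return labels
```

The tea-bud cloud would then be selected from the resulting clusters, e.g. as the largest non-noise cluster.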
s35, principal component analysis (Principal Component Analysis, PCA) is adopted to fit the minimum external cuboid of the tea buds according to their growth posture; the coordinates of each vertex of the cuboid are calculated; the coordinate of the center point of the cuboid's bottom surface is obtained as the average of the four bottom-surface vertices, and this point is taken as the picking point of the tea buds; the method comprises the following steps:
s351, the principal component analysis method is adopted to screen out the three main directions x, y and z of the three-dimensional point cloud of the tea buds, and the centroid and covariance are calculated to obtain the covariance matrix; specifically:
P_c = (1/n) · Σ_{i=1}^{n} (x_i, y_i, z_i)
wherein P_c represents the centroid coordinates of the three-dimensional point cloud; n represents the number of points; (x_i, y_i, z_i) represents the three-dimensional coordinates of the i-th point;
C_p = (1/n) · Σ_{i=1}^{n} (p_i − P_c)(p_i − P_c)^T, with p_i = (x_i, y_i, z_i)^T
wherein C_p represents the covariance matrix of the three-dimensional point cloud;
s352, singular value decomposition is carried out on the covariance matrix to obtain a characteristic value and a characteristic vector, wherein the specific formula is as follows:
C_p = U_p D_p V_p^T
wherein U_p represents the eigenvector matrix of C_p C_p^T; D_p represents the diagonal matrix whose non-zero diagonal entries are the square roots of the non-zero eigenvalues of C_p C_p^T; V_p represents the eigenvector matrix of C_p^T C_p;
the direction of the feature vector corresponding to the maximum feature value is the main axis direction of the cuboid;
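A short sketch of S351–S352 — centroid, covariance and principal directions via singular value decomposition (the function name is hypothetical):

```python
import numpy as np

def principal_axes(points):
    """S351-S352 sketch: centroid P_c, covariance C_p, and the principal
    directions of an (n, 3) point cloud via SVD."""
    centroid = points.mean(axis=0)                  # P_c
    centered = points - centroid
    cov = centered.T @ centered / len(points)       # C_p (3 x 3)
    # For the symmetric C_p the columns of U are its eigenvectors, ordered
    # by decreasing eigenvalue (variance along each principal axis).
    u, s, _vt = np.linalg.svd(cov)
    return centroid, cov, u, s
```

The first column of the returned eigenvector matrix is the main-axis direction of the cuboid described above.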
s353, the coordinate points are projected onto the direction vectors and the position coordinates P_i of each vertex are calculated: the maximum and minimum inner products of the coordinate points with the unit vector in each direction are obtained; a, b and c are taken as the averages of the maximum and minimum values on x, y and z respectively; the center point O and the side lengths L of the cuboid are thereby obtained, generating the most fitting and compact cuboid for the tea bud;
the specific formula is as follows:
O = aX + bY + cZ; L_x = x_max − x_min; L_y = y_max − y_min; L_z = z_max − z_min
wherein X is the unit vector of the coordinate points in the x direction; Y is the unit vector in the y direction; Z is the unit vector in the z direction; L_x, L_y, L_z are the lengths of the cuboid in the x, y and z directions respectively;
s354, the four vertices with the smallest y-direction coordinates of the cuboid are taken as the four vertices of its bottom surface; finally, the coordinate of the bottom-surface center point, i.e. the picking point, is obtained as the average of the four vertex coordinates.
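The whole S351–S354 pipeline — PCA frame, box extents, and bottom-face picking point — can be sketched as one self-contained function (treating the "y direction" of S354 as the second principal axis is an assumption of this sketch):

```python
import numpy as np

def picking_point(points):
    """S353-S354 sketch: PCA-aligned minimal cuboid around a bud cloud and
    the centre of its bottom face as the picking point.

    Assumption: the 'y direction' of S354 is the second principal axis.
    """
    centroid = points.mean(axis=0)
    centered = points - centroid
    # principal directions (columns of `axes`), cf. S351-S352
    axes, _, _ = np.linalg.svd(centered.T @ centered / len(points))
    proj = centered @ axes                       # coordinates in the principal frame
    lo, hi = proj.min(axis=0), proj.max(axis=0)  # min/max inner products per axis
    extents = hi - lo                            # L_x, L_y, L_z
    # eight box corners in the principal frame, mapped back to world coordinates
    corners_local = np.array([[x, y, z] for x in (lo[0], hi[0])
                                        for y in (lo[1], hi[1])
                                        for z in (lo[2], hi[2])])
    corners = corners_local @ axes.T + centroid
    # bottom face: the four corners with the smallest second-axis coordinate
    bottom = corners[np.argsort(corners_local[:, 1])[:4]]
    pick = bottom.mean(axis=0)                   # S354: average of the 4 vertices
    return corners, extents, pick
```

For an axis-aligned cloud the extents recover the true side lengths and the picking point sits at the centre of one of the two faces along the second axis.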
The path planning module is used for planning the travelling path of the picking execution module, specifically: the operation control module inputs the picking-point positions obtained by the target identification module and the current position of the picking execution module (specifically, the shearing part) into the path planning module, and the path planning module fits the three-dimensional coordinates of the picking execution module (specifically, the shearing part) to the three-dimensional coordinates of each picking point to obtain the shortest, optimal picking path;
for example: a, a 1 Constructing picking execution module (specifically, shear part) picking sequence path objective function: firstly, modeling picking sequence path planning of a picking execution module (particularly a shear part) based on Markov decision process theory (Markov Decision Processes, MDP); then, the operation control module controls the picking execution module (in particular the shearing part) to execute a prediction action according to the received three-dimensional information of the picking points obtained by the method in the target identification module, and then observes a new state and obtains rewards, so that the target function maximizes the expected accumulated rewards obtained by the picking execution module (in particular the shearing part); a, a 2 Setting up a simulation environment containing a tea plantation and a picking execution module by using simulation equipment, and combining illumination intensity and illumination intensityPhysical quantities such as machine direction, tea bud position, color and the like are used as parameters of a simulation environment for training; in the training process, the randomness of the simulation environment is gradually increased, data with continuously increased learning difficulty is acquired through the interactive acquisition of the robot and the environment, and the data is sampled; a, a 3 And optimizing the objective function by adopting a near-end strategy optimization algorithm module (Proximal Policy Optimization, PPO) and combining a solver to obtain a picking sequence planning model of a picking execution module (particularly a shearing part).
The picking execution module comprises a motion mechanical arm and an end effector. The motion mechanical arm moves the end effector to each picking point according to the picking path planned by the path planning module; a multi-degree-of-freedom mechanical arm may be adopted, with the degrees of freedom selected according to actual picking conditions. Communication interaction between the motion mechanical arm and the end effector is realized through the operation control module. The end effector comprises a shearing part and a telescopic part: picking of the tea buds is achieved by closing the shearing part, and collection of the tea is achieved by contraction of the telescopic part together with opening of the shearing part. If necessary, a tea collecting basket is arranged on the motion mechanical arm at the end effector.
The operation control module may be built on an Ubuntu 18.04 system running on an edge computer, with a PyTorch-based deep learning environment created at the same time; the target identification module and the path planning module are deployed on the edge computer, which cooperatively manages the resource scheduling and network-signal interaction of the two neural network modules, and also hosts the motion-arm control system and the end-effector drive system.
The specific method for picking tea leaves of the picking system comprises the following steps:
s001, initializing a system in a tea garden, and then manually controlling a motion mechanical arm to drive an end effector to move to a pre-tea-picking point;
s002, the operation control module opens the target recognition module and the path planning module and simultaneously opens the camera;
s003, the camera transmits the field-of-view image to the target recognition module, and the target recognition module carries out multi-target detection on the tea buds in the image; the tea buds are detected and the minimum external cuboid of each bud is fitted to obtain its bottom-surface center point; the bottom-surface center points are then transmitted to the operation control module;
s004, the operation control module feeds the bottom-surface center points of the minimum external cuboids and the current position coordinates of the end effector into the PPO model of the path planning module to obtain the shortest, optimal picking path of the end effector; the calculated path is returned to the operation control module;
s005, the operation control module transmits the path information to the picking execution module, and the picking execution module controls the end effector to reach a picking point through the motion mechanical arm, so that the end effector can realize picking and collecting actions; until picking of tea leaves and buds in the current visual field of the camera is completed;
s006, moving the picking system to the next visual field range, and continuing picking of the tea buds.
Although embodiments of the present invention have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.
Claims (1)
1. An intelligent coherent picking system for famous tea, which is characterized in that: the picking device comprises a target identification module, a path planning module, a picking execution module and an operation control module; the operation control module is respectively in communication interaction with the target identification module, the path planning module and the picking execution module;
the target identification module is used for acquiring picking points of tea buds;
the path planning module is used for planning the travelling path of the picking execution module, and specifically comprises the following steps: the operation control module inputs the picking points obtained by the target identification module and the positions of the current picking execution module into the path planning module, and the path planning module fits the three-dimensional coordinates of the picking execution module with the three-dimensional coordinates of the picking points to obtain the shortest and optimal picking path;
the picking execution module comprises a moving mechanical arm and an end effector, and the moving mechanical arm moves the end effector to each picking point according to the picking path planned in the path planning module; the end effector comprises a shearing part and a telescopic part, picking of tea buds is realized through closing of the shearing part, and collection of tea is realized through shrinkage of the telescopic part and opening of the shearing part;
the specific steps of the target identification module for acquiring picking points of tea buds are as follows:
s10, constructing a tea bud image data set; s20, constructing a feature map with rich semantic information through a bidirectional feature pyramid network and a channel attention mechanism in the data set of the step S10, improving a YoloV5 network model, obtaining an improved YoloV5 network model and detecting small-size tea buds; s30, obtaining a three-dimensional point cloud of the tea based on the training result of the improved YoloV5 network model in the step S20; screening out three-dimensional point clouds of tea buds from the three-dimensional point clouds of tea leaves; finally, fitting a minimum external cuboid of the tea buds to obtain the accurate positions and picking points of the tea buds;
the YoloV5 network model comprises a Backbone module, a Neck module and a Head module; the Backbone module comprises a Focus module for slicing the picture, an SPP module, a CBS module, and a CSP module for enhancing the learning performance of the whole convolutional neural network; the Neck module comprises a CBS module and a CSP module; the Head module comprises a Detect layer for detecting targets on feature maps of different scales by using grid-based anchors;
the step S10 specifically includes: firstly, acquiring image data of tea buds by using an RGB-D camera to obtain color images and depth images of the tea buds; marking the color image by using a marking tool, performing data set enhancement operation, and expanding the number of the data sets to construct a tea sprout image data set; finally, dividing the data set to form a training set, a testing set and a verification set;
the step S20 specifically includes:
s21, preprocessing the images in the training set in the step S10, and unifying the resolutions of all the images in the training set; inputting the preprocessed image into a Backbone module to obtain feature images with different sizes; s22, inputting the feature graphs with different sizes in the step S21 into a Neck module, and adopting a bidirectional feature pyramid network to replace an original path aggregation network in the Neck module to perform multi-feature fusion; then up-sampling and down-sampling the feature images in sequence, generating feature images with various sizes through the splicing of a channel attention mechanism, and inputting the feature images into a Detect layer of a Head module; s23, carrying out back propagation by combining a plurality of loss functions, and updating and adjusting weight parameters for gradients in the model; s24, verifying the existing model by adopting the verification set in the step S10 to obtain an improved YoloV5 network model;
the step S30 specifically includes:
s31, acquiring detection frame coordinates according to the result of the improved YoloV5 network model in the step S20, and generating a color image and an interested region of a corresponding depth image;
s32, according to the mapping relation between the pixel coordinates of the depth image and the pixel coordinates of the color image, obtaining corresponding mapped color image coordinates through the coordinate values, the pixel values and the recording distances of the depth image;
s33, obtaining a three-dimensional point cloud of the tea through coordinate fusion of the color image and the depth image, wherein the three-dimensional point cloud comprises the following specific steps of:
x = (u − c_x) · d / f_x; y = (v − c_y) · d / f_y; z = d
wherein (x, y, z) represents the coordinate system of the three-dimensional point cloud; (u, v) represents the coordinate system of the color image; d represents the depth value, obtained from the depth image; f_x, f_y represent the camera focal lengths; (c_x, c_y) is the principal point of the camera;
s34, because the generated three-dimensional point cloud of the tea leaves contains the tea buds together with their background point cloud, the average value of the tea point cloud is calculated and used as a distance threshold; the background points farther than this threshold are filtered out to obtain a primarily segmented three-dimensional point cloud; the DBSCAN clustering algorithm is then adopted, with the neighborhood radius Eps and the minimum number of samples M_p required within the neighborhood set as parameters, to cluster the primarily segmented point cloud and screen out the three-dimensional point cloud of the tea buds;
s35, fitting the minimum external cuboid of the tea buds at the position according to the growth posture of the tea buds by adopting a principal component analysis method; calculating to obtain the coordinates of each vertex of the cuboid; obtaining the coordinates of the center point of the bottom surface of the cuboid by solving the average value of four vertexes of the bottom surface of the cuboid, and taking the point as a picking point of tea buds;
the step S35 specifically includes:
s351, screening three main directions of the three-dimensional point cloud of the tea buds, namely x, y and z directions by adopting a principal component analysis method, and calculating mass centers and covariance to obtain a covariance matrix; the method comprises the following steps:
wherein P is c Representing centroid coordinates of the three-dimensional point cloud; n represents the number of three-dimensional point clouds (i.e., the number of points); (x) i ,y i ,z i ) Representing three-dimensional coordinates of the i-th point;
wherein C is p Representing a three-dimensional point cloudA covariance matrix;
s352, singular value decomposition is carried out on the covariance matrix to obtain a characteristic value and a characteristic vector, wherein the specific formula is as follows:
C_p = U_p D_p V_p^T
wherein U_p represents the eigenvector matrix of C_p C_p^T; D_p represents the diagonal matrix whose non-zero diagonal entries are the square roots of the non-zero eigenvalues of C_p C_p^T; V_p represents the eigenvector matrix of C_p^T C_p;
the direction of the feature vector corresponding to the maximum feature value is the main axis direction of the cuboid;
s353, the coordinate points are projected onto the direction vectors and the position coordinates P_i of each vertex are calculated: the maximum and minimum inner products of the coordinate points with the unit vector in each direction are obtained; a, b and c are taken as the averages of the maximum and minimum values on x, y and z respectively; the center point O and the side lengths L of the cuboid are thereby obtained, generating the most fitting and compact cuboid for the tea bud;
the specific formula is as follows:
O = aX + bY + cZ; L_x = x_max − x_min; L_y = y_max − y_min; L_z = z_max − z_min
wherein X is the unit vector of the coordinate points in the x direction; Y is the unit vector in the y direction; Z is the unit vector in the z direction; L_x, L_y, L_z are the lengths of the cuboid in the x, y and z directions respectively;
s354, the four vertices with the smallest y-direction coordinates of the cuboid are taken as the four vertices of its bottom surface; finally, the coordinate of the bottom-surface center point, i.e. the picking point, is obtained as the average of the four vertex coordinates.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210876003.6A CN115271200B (en) | 2022-07-25 | 2022-07-25 | Intelligent coherent picking system for famous tea |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115271200A CN115271200A (en) | 2022-11-01 |
CN115271200B true CN115271200B (en) | 2023-05-30 |
Family
ID=83770408
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210876003.6A Active CN115271200B (en) | 2022-07-25 | 2022-07-25 | Intelligent coherent picking system for famous tea |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115271200B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113807215A (en) * | 2021-08-31 | 2021-12-17 | 贵州大学 | Tea tender shoot grading method combining improved attention mechanism and knowledge distillation |
CN216626695U (en) * | 2021-12-17 | 2022-05-31 | 昭平县农业农村局 | Wild tea leaf collection system |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108182374A (en) * | 2016-12-08 | 2018-06-19 | 广州映博智能科技有限公司 | A kind of picking point recognition methods for fruit string |
WO2018157286A1 (en) * | 2017-02-28 | 2018-09-07 | 深圳市大疆创新科技有限公司 | Recognition method and device, and movable platform |
CN111784764A (en) * | 2020-05-28 | 2020-10-16 | 西南石油大学 | Tea tender shoot identification and positioning algorithm |
CN111914715B (en) * | 2020-07-24 | 2021-07-16 | 廊坊和易生活网络科技股份有限公司 | Intelligent vehicle target real-time detection and positioning method based on bionic vision |
CN113674226A (en) * | 2021-07-31 | 2021-11-19 | 河海大学 | Tea leaf picking machine tea leaf bud tip detection method based on deep learning |
CN114332697A (en) * | 2021-12-19 | 2022-04-12 | 西安科技大学 | Method, system, equipment and medium for detecting faults of multiple types of targets in power transmission line |
CN114731840B (en) * | 2022-04-07 | 2022-12-27 | 仲恺农业工程学院 | Double-mechanical-arm tea picking robot based on machine vision |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP3639241B1 (en) | Voxel based ground plane estimation and object segmentation | |
EP3405845B1 (en) | Object-focused active three-dimensional reconstruction | |
CN113920474B (en) | Internet of things system and method for intelligently supervising citrus planting situation | |
CN113160062B (en) | Infrared image target detection method, device, equipment and storage medium | |
CN115272791B (en) | YoloV 5-based multi-target detection and positioning method for tea leaves | |
CN115187803B (en) | Positioning method for picking process of famous tea tender shoots | |
Magistri et al. | Contrastive 3D shape completion and reconstruction for agricultural robots using RGB-D frames | |
CN111831010A (en) | Unmanned aerial vehicle obstacle avoidance flight method based on digital space slice | |
Li et al. | Development and field evaluation of a robotic harvesting system for plucking high-quality tea | |
CN115376125A (en) | Target detection method based on multi-modal data fusion and in-vivo fruit picking method based on target detection model | |
CN114120067A (en) | Object identification method, device, equipment and medium | |
CN114689038A (en) | Fruit detection positioning and orchard map construction method based on machine vision | |
CN115271200B (en) | Intelligent coherent picking system for famous tea | |
CN116138036B (en) | Secondary positioning method for picking young buds of famous tea | |
CN116704497B (en) | Rape phenotype parameter extraction method and system based on three-dimensional point cloud | |
Mejia et al. | Strawberry localization in a ridge planting with an autonomous rover | |
CN116739739A (en) | Loan amount evaluation method and device, electronic equipment and storage medium | |
CN113723833B (en) | Method, system, terminal equipment and storage medium for evaluating quality of forestation actual results | |
CN110909887B (en) | Model optimization method and device | |
CN116935235B (en) | Fresh tea leaf identification method and related device based on unmanned tea picking machine | |
CN113932712B (en) | Melon and fruit vegetable size measurement method based on depth camera and key points | |
Kiktev et al. | Flower Detection and Counting Using CNN for Thinning Decisions in Apple Trees. | |
TWI709111B (en) | Method for rapidly positioning crops | |
Velasquez | 3D segmentation and localization using visual cues in uncontrolled environments | |
Luo et al. | 3D Semantic Segmentation for Grape Bunch Point Cloud Based on Feature Enhancement |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||