CN117036370A - Plant organ point cloud segmentation method based on attention mechanism and graph convolution - Google Patents

Plant organ point cloud segmentation method based on attention mechanism and graph convolution

Info

Publication number
CN117036370A
CN117036370A CN202310704110.5A CN202310704110A CN117036370A CN 117036370 A CN117036370 A CN 117036370A CN 202310704110 A CN202310704110 A CN 202310704110A CN 117036370 A CN117036370 A CN 117036370A
Authority
CN
China
Prior art keywords
point
feature
point cloud
layer
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310704110.5A
Other languages
Chinese (zh)
Inventor
马韫韬
蔡智博
朱晋宇
郭焱
李保国
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Agricultural University
Original Assignee
China Agricultural University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Agricultural University
Priority to CN202310704110.5A
Publication of CN117036370A


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Image Analysis (AREA)

Abstract

A plant organ point cloud segmentation method based on an attention mechanism and graph convolution, belonging to the technical field of three-dimensional point cloud instance segmentation. The method builds TRGCN, a dual-branch parallel instance segmentation network based on a point attention mechanism and spatial graph convolution. The network takes a three-dimensional point cloud directly as input; its two branches focus on local feature extraction and global feature extraction respectively, and the two kinds of features are fused by a T-G feature coupling layer. Point cloud data of five plants (tomato, corn, tobacco, sorghum and wheat) serve as the research objects. Because the dual-branch parallel architecture captures the local and global features of the point cloud at the same time, TRGCN can be used to train highly robust instance segmentation models, improves the segmentation accuracy of plant point clouds, generalizes well, and can provide good data support for rapid, efficient and accurate plant phenotype analysis.

Description

Plant organ point cloud segmentation method based on attention mechanism and graph convolution
Technical Field
The invention belongs to the technical field of three-dimensional point cloud instance segmentation, and particularly relates to a dual-branch parallel plant organ point cloud segmentation method based on an attention mechanism and spatial graph convolution.
Background
With the spread of LiDAR equipment and the advent of various consumer-grade depth sensors, point cloud data are increasingly used in fields such as robotics, autonomous driving and urban planning. In phenotypic research, the three-dimensional point cloud, as a low-resolution representation of the real world, has become the most direct and effective form of data for studying plant structure and morphology. Many studies have used the three-dimensional structure of plants for organ segmentation, growth monitoring, variety evaluation and similar tasks. A point in the three-dimensional coordinate system is the most basic unit of a point cloud; it is similar to a pixel in a two-dimensional image but can carry more high-dimensional semantic information. The morphological structure of plant organs is an intuitive and important trait that reflects how well a plant adapts to external conditions and how it is growing, for example its photosynthetic efficiency and water absorption efficiency. Plant organ point cloud segmentation is the process of semantically dividing a plant into its organs (such as stems, leaves and fruits). It is the basis for any deeper understanding of the point cloud data, is important for understanding the functional structure of plants, and remains a challenging research direction.
Traditional plant point cloud segmentation algorithms require features to be described manually in advance, and the segmentation process is complicated and tedious. With the arrival of the big data era, these traditional methods can hardly meet the need for rapid and accurate analysis, so the demand for automated segmentation methods is growing. With the rapid increase in the performance of graphics processors, deep learning, a leading technology of artificial intelligence, has been used successfully to solve a variety of two-dimensional vision problems. However, because point clouds are unordered and spatially complex, applying deep learning to point clouds still faces many challenges. The convolutional neural network (CNN), which performs well in visual segmentation tasks, extracts features through shared-kernel convolution, which improves model efficiency, and its inherent translation invariance allows local features to be captured more precisely. However, a CNN usually has a relatively small receptive field, is relatively weak at capturing global features, and cannot act directly on raw point cloud data. Another neural network structure applied to point cloud data is the graph convolutional network (GCN), which treats each point in the point cloud as a vertex of a graph and extracts local features by performing a convolution-like operation directly on the point cloud. The Transformer, which performs outstandingly in natural language processing, can also capture global features well, and its core idea, the attention mechanism, is likewise well suited to processing point cloud data.
These deep learning methods have achieved satisfactory segmentation results on many common point cloud data sets, revealing the effectiveness of the deep learning method for point cloud data segmentation.
However, the structural complexity of plant point clouds means that a relatively large amount of semantic information must be identified in the organ segmentation task. During acquisition, occlusion among leaves often causes part of the point cloud to be lost, producing holes and sparse regions. In addition, plant organs are highly similar to one another: different leaf instances often share the same color, morphological structure, texture and other characteristics, and such highly repetitive features are hard for a neural network to learn. Finally, plants of different varieties have different geometric characteristics, and even plants of the same variety can differ in phenotype, sometimes greatly, under different growing environments, which places high demands on the generalization ability of the network. In summary, the current segmentation accuracy for plant point clouds cannot meet requirements.
Disclosure of Invention
The invention aims to solve the problem of accurate organ segmentation in complex plant point clouds, and provides a dual-branch parallel plant organ point cloud segmentation method based on an attention mechanism and spatial graph convolution.
In order to achieve the above purpose, the technical scheme adopted by the invention is as follows:
a plant organ point cloud segmentation method based on an attention mechanism and graph convolution, the method comprising:
step one: the feature encoder takes the original point cloud as input, uses a multi-layer perceptron to map the features into a high-dimensional space, and uses a point cloud attention mechanism for preliminary feature extraction; the initial feature data are then input to a TRGCN block, a module that can be cascaded and stacked to deepen the understanding of high-dimensional features; the feature aggregation layer in the TRGCN block extracts neighborhood features and downsamples the point cloud, after which the data enter the dual-branch parallel part of the network, consisting of a local feature capture branch built from spatial graph convolution and a global feature learning branch built from a point attention mechanism; finally, the feature data are input to the T-G feature coupling layer to obtain the target number of points and the corresponding high-dimensional abstract features; by stacking TRGCN blocks, the encoder extracts high-dimensional feature information from the original plant point cloud for the segmentation task;
step two: the feature decoder stacks three cascaded TRGCN blocks, which receive the outputs of the three TRGCN blocks in the encoder respectively, but with the feature aggregation layer replaced by an interpolation layer; the interpolation layer restores the features of the high-dimensional point set to the low-dimensional point set and still outputs the grouping result of the K-nearest-neighbor algorithm for use by the two branches of the TRGCN block; for segmentation prediction, the decoder places an independent interpolation layer after the TRGCN blocks, uses a single point attention layer to ensure information integrity, and finally uses a multi-layer perceptron to output the segmentation result of the point cloud;
step three: network training: all experiments in this study were performed on a stand-alone server equipped with a 12-core, 20-thread CPU, 64 GB of memory and an Nvidia GeForce RTX 3090 Ti GPU; in the training stage, all plant point cloud segmentation models use the same hyperparameters, specifically: the training batch size is 32, the initial learning rate is 0.001, the network is optimized with the Adam method for 100 epochs, the learning rate is halved every 20 epochs, the weight decay is 0.0001, the momentum is 0.9, the K value of the K-nearest-neighbor algorithm is 12, and the feature dimension of the point attention layer is 256.
Further, the first step specifically comprises:
(1) Feature aggregation layer
The specific process of feature aggregation is as follows: x points with feature-dimensional features are input; the points are first sampled by random farthest point sampling and then grouped with the K-nearest-neighbor algorithm; the groups are input to a multi-layer perceptron, which aggregates the neighbor features onto each center point; finally, a max pooling operation yields y points with feature'-dimensional features;
the feature aggregation layer uses the K-nearest-neighbor algorithm to sample and group the input point set; the layer outputs the computed K-neighbor matrix and shares it with the subsequent parallel branches;
(2) Local feature capture branch
This branch is built on dynamic spatial graph convolution and extracts local features from the input plant point cloud; first, a feature graph G = (V, E) is constructed from the point set V and the neighbor information E, and edge convolution is used to extract features from the input feature space; the feature of a point $x_i$ is extracted as follows:
$f_i = \square\, h(x_i, x_j)$
where $x_j$ is one of the neighbor points of $x_i$, and $\square$ and $h$ denote an aggregation function and a relational operation, respectively; the relational operation aggregates the features of the neighbor points around a candidate point to obtain the feature information of that point, and is defined as edge convolution;
max pooling is used as the aggregation function, as follows:
$\mathrm{conv}_i = \mathrm{Max}(\mathrm{MLP}(h(x_i, x_i - x_j)))$
the relational operation $h$ is defined as a linear combination of the feature of point $x_i$ and the feature difference between $x_i$ and its neighbor point $x_j$;
(3) Global feature learning branch
Features are extracted with a vector attention mechanism in a local neighborhood, and the calculation formula is as follows:
where $x_j$ is a neighbor point of $x_i$; X is the independent point set of each single-plant point cloud; ρ is a regularization function; γ is a mapping function; β is a relational operation, defined in this study as the difference between the neighborhood point and the point of interest; φ, ψ and α are point-wise feature transformations that yield the Q, K and V (Query, Key and Value, the standard terms of the attention mechanism) values of the self-attention mechanism, respectively; δ is a position encoding function; based on this attention mechanism, a point attention layer is proposed, with the improved calculation formula as follows:
(4) T-G feature coupling layer
The above processing yields two feature matrices with identical dimensions and shapes: a matrix G with salient local features and a matrix T with complete global features; G and T are concatenated and input to the feature coupling layer to obtain the target feature matrix:
TG=Linear(ReLU(Linear(T,G)))
the T-G feature coupling layer consists of two linear layers and one ReLU activation layer, so that the network can learn the more important information in each of the two matrices and combine it into the target feature matrix.
Compared with the prior art, the invention has the following beneficial effects: the invention designs a new dual-branch parallel instance segmentation network, TRGCN, based on a point attention mechanism and spatial graph convolution. The network takes a three-dimensional point cloud directly as input; its two branches focus on local and global feature extraction respectively, and the two kinds of features are fused by a T-G feature coupling layer. The results show that TRGCN performs well on different plant point clouds, achieves higher accuracy than other mainstream point cloud segmentation networks, generalizes well, and can provide good data support for rapid, efficient and accurate plant phenotype analysis.
Drawings
FIG. 1 is a network architecture diagram of a TRGCN of the present invention;
FIG. 2 is a block diagram of the TRGCN block of the present invention;
FIG. 3 is a schematic diagram of a TRGCN block global feature learning layer of the present invention;
FIG. 4 shows the segmentation results for the five plant point clouds according to the present invention.
Detailed Description
The present invention will be described in further detail with reference to the following embodiments, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the detailed description and specific examples are intended for purposes of illustration only and are not intended to limit the scope of the invention.
Example 1:
This study is based on a point cloud self-attention mechanism and spatial graph convolution, and proposes a dual-branch parallel network, the Transformer Graph Convolution Network (TRGCN), designed with an encoder-decoder architecture (FIG. 1).
The feature encoder takes the original point cloud as input, uses a multi-layer perceptron to map the features into a high-dimensional space (32 dimensions by default), and uses a point cloud attention mechanism for preliminary feature extraction. The initial feature data are then input to a TRGCN block (FIG. 2), a module that can be cascaded and stacked repeatedly to deepen the understanding of high-dimensional features. Specifically, the feature aggregation layer in the TRGCN block extracts neighborhood features and downsamples the point cloud, after which the data enter the dual-branch parallel part of the network, consisting of a local feature capture branch built from spatial graph convolution and a global feature learning branch built from a point attention mechanism. Finally, the feature data are input to a specially designed T-G feature coupling layer to obtain the target number of points and the corresponding high-dimensional abstract features. By stacking TRGCN blocks, the encoder extracts high-dimensional feature information from the original plant point cloud for the segmentation task.
(1) Feature aggregation layer
The feature aggregation layer in the TRGCN block reduces the cardinality of the input point set and abstracts higher-dimensional feature vectors as multiple modules are stacked. For example, from the original input to the first TRGCN block, the number of points is reduced from N to N/4 and the feature dimension of the point cloud increases from F to 2F;
the specific process of feature aggregation is as follows: x points with feature-dimensional features are input; the points are first sampled by random farthest point sampling and then grouped with the K-nearest-neighbor algorithm; the groups are input to a multi-layer perceptron, which aggregates the neighbor features onto each center point; finally, a max pooling operation yields y points with feature'-dimensional features (by default y = x/4 and feature' = 2 × feature).
The feature aggregation layer uses the K-nearest-neighbor algorithm to sample and group the input point set. In addition, to save GPU memory during training, the layer outputs the computed K-neighbor matrix and shares it with the subsequent parallel branches.
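The aggregation steps described above can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the patented implementation: the function names `farthest_point_sampling`, `knn_indices` and `aggregate` are hypothetical, the `transform` argument stands in for the shared multi-layer perceptron, and neighbors are searched in coordinate space.

```python
import math
import random


def farthest_point_sampling(points, m, seed=0):
    """Pick m indices: a random start, then repeatedly the point farthest from those chosen."""
    rng = random.Random(seed)
    chosen = [rng.randrange(len(points))]
    # min_dist[i] = distance from point i to its closest already-chosen point
    min_dist = [math.dist(p, points[chosen[0]]) for p in points]
    while len(chosen) < m:
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        chosen.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return chosen


def knn_indices(points, centers, k):
    """For each center index, list the indices of its k nearest points (the K-neighbor matrix)."""
    return [
        sorted(range(len(points)), key=lambda j: math.dist(points[c], points[j]))[:k]
        for c in centers
    ]


def aggregate(points, feats, k, transform):
    """Downsample to len(points) // 4 centers and max-pool the transformed neighbor features."""
    centers = farthest_point_sampling(points, len(points) // 4)
    neighbors = knn_indices(points, centers, k)
    pooled = []
    for group in neighbors:
        mapped = [transform(feats[j]) for j in group]      # stand-in for the shared MLP
        pooled.append([max(col) for col in zip(*mapped)])  # channel-wise max pooling
    return centers, neighbors, pooled


rng = random.Random(1)
pts = [(rng.random(), rng.random(), rng.random()) for _ in range(16)]
feats = [list(p) for p in pts]
# Hypothetical transform: duplicating the vector doubles the feature dimension (F -> 2F).
centers, neighbors, pooled = aggregate(pts, feats, k=4, transform=lambda f: f + f)
print(len(centers), len(pooled[0]))
```

In this sketch the returned `neighbors` list plays the role of the K-neighbor matrix that the layer shares with the two parallel branches, so the neighbor search is done once per block.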
(2) Local feature capture branch
This branch is built on dynamic spatial graph convolution and extracts local features from the input plant point cloud. First, a feature graph G = (V, E) is constructed from the point set V and the neighbor information E, and edge convolution is used to extract features from the input feature space. The feature of a point $x_i$ is extracted as follows:
$f_i = \square\, h(x_i, x_j)$
where $x_j$ is one of the neighbor points of $x_i$, and $\square$ and $h$ denote an aggregation function and a relational operation, respectively. The relational operation aggregates the features of the neighbor points around a candidate point to obtain the feature information of that point, and is defined as edge convolution. To strengthen the understanding of local features in the point cloud, this study uses max pooling as the aggregation function, as follows:
$\mathrm{conv}_i = \mathrm{Max}(\mathrm{MLP}(h(x_i, x_i - x_j)))$
The relational operation $h$ is defined as a linear combination of the feature of point $x_i$ and the feature difference between $x_i$ and its neighbor point $x_j$. This choice not only preserves the features of local point sets that influence one another, but also partially takes the global features of the whole into account.
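As an illustration of the edge convolution with max pooling described above, the following sketch computes one output vector per point. It rests on simplifying assumptions: the function name `edge_conv` is hypothetical, the scalar weights `w_self` and `w_diff` stand in for the learned linear combination and the MLP, and the neighbor lists are supplied precomputed (the dynamic recomputation of neighbors is not shown).

```python
def edge_conv(feats, neighbors, w_self, w_diff):
    """Edge convolution with max pooling over each point's neighbors.

    h(x_i, x_i - x_j) is modelled as w_self * x_i + w_diff * (x_i - x_j),
    a scalar-weighted stand-in for the learned linear combination and MLP.
    """
    out = []
    for i, group in enumerate(neighbors):
        xi = feats[i]
        edges = []
        for j in group:
            xj = feats[j]
            edges.append([w_self * a + w_diff * (a - b) for a, b in zip(xi, xj)])
        # Max pooling across the neighbor dimension, channel by channel.
        out.append([max(col) for col in zip(*edges)])
    return out


feats = [[1.0, 2.0], [3.0, 0.0], [0.0, 5.0]]
neighbors = [[1, 2], [0, 2], [0, 1]]
print(edge_conv(feats, neighbors, w_self=1.0, w_diff=1.0))
# -> [[2.0, 4.0], [6.0, -2.0], [-1.0, 10.0]]
```

Because each edge feature keeps an `x_i` term next to the difference term, the pooled output depends both on the point itself and on its local geometry, which is the property the description attributes to this choice of h.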
(3) Global feature learning branch
As shown in FIG. 3, this branch is built on a point cloud attention mechanism. The attention mechanism is well suited to point cloud data, which can essentially be regarded as word vectors embedded in an attention space. This study extracts features with a vector attention mechanism in a local neighborhood, and the calculation formula is as follows:
where $x_j$ is one of the K neighbor points of $x_i$; φ, ψ and α are point-wise feature transformations that yield the Q, K and V values of the self-attention mechanism, respectively; δ is a position encoding function; ρ is a regularization function; γ is a mapping function; and β is a relational operation, defined in this study as the difference between the neighborhood point and the point of interest. Based on this attention mechanism, this study proposes a point attention layer, with the improved calculation formula as follows:
Unlike common attention mechanisms, position encoding is also added to the α function to strengthen the understanding of the features. On top of the point attention layer, the TRGCN encoder builds a residual structure in the global feature learning branch: a linear layer is added before and after the point attention layer, and the final output is connected to the input by a residual connection, which promotes information exchange, accelerates network convergence, and makes it possible to train deep networks.
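The two formulas referred to in this passage do not survive in the text. One form consistent with the symbol definitions given is sketched below, on the assumption that the layer follows the vector attention used in Point Transformer, which the description does not state. The symbol ψ for the key transformation, the neighborhood notation $X(i)$, the element-wise product ⊙ and the sign convention of the difference are all assumptions.

```latex
% General vector attention over the K-neighborhood X(i) of x_i (assumed form)
y_i = \sum_{x_j \in X(i)} \rho\bigl(\gamma(\beta(\varphi(x_i), \psi(x_j)) + \delta)\bigr) \odot \alpha(x_j)

% Point attention layer: \beta taken as a difference, position encoding \delta also added to \alpha
y_i = \sum_{x_j \in X(i)} \rho\bigl(\gamma(\varphi(x_i) - \psi(x_j) + \delta)\bigr) \odot \bigl(\alpha(x_j) + \delta\bigr)
```

The second line differs from the first only in the two places the text names: β is replaced by a difference of the transformed features, and δ appears a second time inside the value term.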
(4) T-G feature coupling layer
The above processing yields two feature matrices with identical dimensions and shapes: a matrix G with salient local features and a matrix T with complete global features. G and T are concatenated and input to the feature coupling layer to obtain the target feature matrix:
TG=Linear(ReLU(Linear(T,G)))
The T-G feature coupling layer consists of two linear layers and one ReLU activation layer, so that the network can learn the more important information in each of the two matrices and combine it into the target feature matrix.
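The coupling formula can be illustrated for a single point as follows. This is a sketch under assumptions: the helper names `linear`, `relu` and `tg_coupling` are hypothetical, the weights are fixed by hand rather than learned, and "Linear(T,G)" is read as a linear layer applied to the concatenation of T and G, as the surrounding text suggests.

```python
def linear(x, weight, bias):
    """y = W x + b for one feature vector; weight is a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def relu(x):
    """Element-wise max(0, v)."""
    return [max(0.0, v) for v in x]


def tg_coupling(t_row, g_row, w1, b1, w2, b2):
    """Linear(ReLU(Linear(concat(T, G)))) applied to one point's features."""
    return linear(relu(linear(t_row + g_row, w1, b1)), w2, b2)


t = [1.0, -2.0]  # global-branch features of one point
g = [0.5, 3.0]   # local-branch features of the same point
identity = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
w2 = [[1.0, 1.0, 1.0, 1.0], [1.0, 0.0, 0.0, 0.0]]
print(tg_coupling(t, g, identity, [0.0] * 4, w2, [0.0, 1.0]))
# -> [4.5, 2.0]
```

With learned weights, the first layer can weigh T and G channels against each other before the ReLU, and the second layer projects the result to the target feature dimension.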
In summary, the TRGCN network feature encoder portion may design a model that accommodates different visual tasks by varying the number of stacks of TRGCN blocks. Fewer TRGCN blocks may be used for lightweight classification networks, while more cascaded TRGCN blocks may be used for finer-grained tasks such as point cloud segmentation and target recognition.
The feature decoder likewise stacks three cascaded TRGCN blocks, which receive the outputs of the three TRGCN blocks in the encoder respectively, but with the feature aggregation layer replaced by an interpolation layer. In contrast to the feature aggregation layer, the interpolation layer in the decoder restores the features of the high-dimensional point set to the low-dimensional point set, while still outputting the grouping result of the K-nearest-neighbor algorithm for use by the two branches of the TRGCN block. For instance segmentation prediction, the decoder places an independent interpolation layer after the TRGCN blocks, uses a single point attention layer to ensure information integrity, and finally uses a multi-layer perceptron to output the segmentation result of the point cloud.
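The text does not say how the interpolation layer computes its output. One common way to carry features from a small point set back to a larger one is inverse-distance weighting over the nearest sampled points; the sketch below shows that technique purely as an assumption, with the hypothetical name `interpolate_features` and hypothetical parameters `k` and `eps`.

```python
import math


def interpolate_features(dense_pts, sparse_pts, sparse_feats, k=3, eps=1e-8):
    """Propagate features from a sparse point set to a dense one by inverse-distance weighting."""
    dim = len(sparse_feats[0])
    out = []
    for p in dense_pts:
        # Indices of the k sparse points closest to p.
        nearest = sorted(range(len(sparse_pts)), key=lambda j: math.dist(p, sparse_pts[j]))[:k]
        # Closer points get larger weights; eps avoids division by zero.
        weights = [1.0 / (math.dist(p, sparse_pts[j]) + eps) for j in nearest]
        total = sum(weights)
        out.append([
            sum(w * sparse_feats[j][c] for w, j in zip(weights, nearest)) / total
            for c in range(dim)
        ])
    return out


sparse = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
values = [[0.0], [10.0]]
dense = [(1.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
print(interpolate_features(dense, sparse, values, k=2))
```

In this example the midpoint receives the average of the two sparse features, while a dense point that coincides with a sparse point takes (almost exactly) that point's feature.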
Network training. All experiments in this study were performed on a stand-alone server equipped with a 12-core, 20-thread CPU, 64 GB of memory and an Nvidia GeForce RTX 3090 Ti GPU. In the training stage, the segmentation models for all 5 plant point clouds use the same hyperparameters, specifically: the training batch size is 32, the initial learning rate is 0.001, the network is optimized with the Adam method for 100 epochs, the learning rate is halved every 20 epochs, the weight decay is 0.0001, the momentum is 0.9, the K value of the K-nearest-neighbor algorithm is 12, and the feature dimension of the point attention layer is 256.
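The hyperparameters listed above can be gathered into a small configuration, with the learning-rate schedule written as a step decay. This is only a sketch: the dictionary keys and the function name `learning_rate` are hypothetical, epochs are assumed to be counted from 0, and how the stated momentum of 0.9 maps onto the Adam optimizer is not specified in the text.

```python
# Values taken from the description; key names are illustrative only.
HYPERPARAMS = {
    "batch_size": 32,
    "initial_lr": 0.001,
    "epochs": 100,
    "lr_halve_every": 20,
    "weight_decay": 0.0001,
    "momentum": 0.9,
    "knn_k": 12,
    "attention_dim": 256,
}


def learning_rate(epoch, hp=HYPERPARAMS):
    """Step decay: halve the initial rate once per completed block of lr_halve_every epochs."""
    return hp["initial_lr"] * 0.5 ** (epoch // hp["lr_halve_every"])


for epoch in (0, 19, 20, 40, 99):
    print(epoch, learning_rate(epoch))
```

Under these assumptions the rate is halved four times over the 100 epochs, so the final 20 epochs run at one sixteenth of the initial rate.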
Organ instance segmentation tests were performed on the point cloud data of the 5 plants, reaching a highest mean intersection over union (mIoU) of 86.38% and a mean accuracy of 88.58%. To verify the segmentation capability of TRGCN, three mainstream point cloud segmentation networks were selected for comparison. Across the 5 segmentation tasks, TRGCN leads the other three methods on 9 metrics and achieves the best accuracy in most tasks; the improvement on sorghum leaves is especially marked, indicating that TRGCN handles monocotyledonous plant point clouds better. Because the canopy of dicotyledonous crops is crowded and prone to occlusion, the segmentation of tobacco and tomato point clouds is not as good as that of monocotyledonous crops, but it is still better than that of the other three segmentation networks. Specific test results are shown in Table 1, and FIG. 4 shows the segmentation results for the five plant point clouds.
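For readers unfamiliar with the metric, mean intersection over union can be computed from per-point labels as in the sketch below. The function name `mean_iou` and the label-wise formulation are assumptions; the text does not state how its evaluation matches predicted instances to ground truth, nor how mean accuracy is defined.

```python
def mean_iou(pred, truth, num_classes):
    """Mean over classes of |pred & truth| / |pred | truth|, skipping classes absent from both."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious)


pred = [0, 0, 1, 1, 2, 2]
truth = [0, 1, 1, 1, 2, 0]
# Per-class IoU: 1/3, 2/3 and 1/2, so the mean is 0.5.
print(mean_iou(pred, truth, num_classes=3))
```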
The invention also uses the sorghum point cloud to examine the choice of pooling layer in TRGCN and the number of stacked cascaded TRGCN blocks. The results show that the network with max pooling segments best, with accuracy about 2% higher than with average pooling or summation pooling. With 3 cascaded TRGCN blocks, the network achieves the best balance of training time and segmentation quality, obtaining higher segmentation accuracy at the cost of some training time. The specific test results are shown in Tables 2 and 3.
Table 1. Comparison of the segmentation accuracy of the TRGCN network of the present invention with other mainstream networks

Table 2. Ablation experiment 1 of the present invention: segmentation results with different pooling layers
Pooling layer        Training time (s)   Mean IoU (%)   Mean accuracy (%)
Max pooling          2082                78.9292        84.9198
Average pooling      2085                75.7748        80.6104
Summation pooling    2086                76.3709        82.9647

Table 3. Ablation experiment 2 of the present invention: segmentation results with different numbers of stacked TRGCN blocks
TRGCN blocks         Training time (s)   Mean IoU (%)
2                    1728                73.7120
3                    2082                78.9292
4                    2202                75.5498

Claims (2)

1. A plant organ point cloud segmentation method based on an attention mechanism and graph convolution, characterized in that the method comprises the following steps:
step one: the feature encoder takes the original point cloud as input, uses a multi-layer perceptron to map the features into a high-dimensional space, and uses a point cloud attention mechanism for preliminary feature extraction; the initial feature data are then input to a TRGCN block, a module that can be cascaded and stacked to deepen the understanding of high-dimensional features; the feature aggregation layer in the TRGCN block extracts neighborhood features and at the same time downsamples the point cloud, after which the data enter the dual-branch parallel part of the network, consisting of a local feature capture branch built from spatial graph convolution and a global feature learning branch built from a point attention mechanism; finally, the feature data are input to the T-G feature coupling layer to obtain the target number of points and the corresponding high-dimensional abstract features;
step two: the feature decoder stacks three cascaded TRGCN blocks, which receive the outputs of the three TRGCN blocks in the encoder respectively, but with the feature aggregation layer replaced by an interpolation layer; the interpolation layer restores the features of the high-dimensional point set to the low-dimensional point set and still outputs the grouping result of the K-nearest-neighbor algorithm for use by the two branches of the TRGCN block; for segmentation prediction, the decoder places an independent interpolation layer after the TRGCN blocks, uses a single point attention layer to ensure information integrity, and finally uses a multi-layer perceptron to output the segmentation result of the point cloud;
step three: network training: an independent server is equipped with a 12-core, 20-thread CPU, 64 GB of memory and an Nvidia GeForce RTX 3090 Ti GPU; neural network training is carried out on the independent server, and in the training stage all plant point cloud segmentation models use the same hyperparameters, specifically: the training batch size is 32, the initial learning rate is 0.001, the network is optimized with the Adam method for 100 epochs, the learning rate is halved every 20 epochs, the weight decay is 0.0001, the momentum is 0.9, the K value of the K-nearest-neighbor algorithm is 12, and the feature dimension of the point attention layer is 256.
2. The method for segmenting the plant organ point cloud based on the attention mechanism and the graph convolution according to claim 1, wherein the method comprises the following steps of: the first step is specifically as follows:
(1) Feature aggregation layer
The specific process of feature aggregation is as follows: x points with feature-dimensional features are input; the points are first sampled using random farthest point sampling, the point cloud is then grouped with the K-nearest-neighbor algorithm, the groups are fed into a multi-layer perceptron that aggregates the neighbor point features to the central point, and finally a max pooling operation yields y points with feature'-dimensional features;
the feature aggregation layer uses the K-nearest-neighbor algorithm to sample and group the input point set; the feature aggregation layer outputs the computed K-neighbor matrix and shares it with the subsequent parallel branches;
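The sample, group, transform and pool sequence of the feature aggregation layer can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the patented implementation: the function names are hypothetical, the multi-layer perceptron is reduced to a single linear layer with ReLU, and the farthest point sampling is assumed to start from a randomly chosen point.

```python
import numpy as np


def farthest_point_sampling(xyz, n_samples, rng):
    """Greedily pick n_samples indices so the chosen points are mutually far apart."""
    n = xyz.shape[0]
    chosen = np.empty(n_samples, dtype=np.int64)
    chosen[0] = rng.integers(n)              # assumption: random starting point
    dist = np.full(n, np.inf)                # squared distance to the chosen set
    for i in range(1, n_samples):
        d = np.sum((xyz - xyz[chosen[i - 1]]) ** 2, axis=1)
        dist = np.minimum(dist, d)
        chosen[i] = int(np.argmax(dist))
    return chosen


def knn_indices(queries, xyz, k):
    """Indices of the k nearest points in xyz for every query point, shape (y, k)."""
    d2 = np.sum((queries[:, None, :] - xyz[None, :, :]) ** 2, axis=-1)
    return np.argsort(d2, axis=1)[:, :k]


def feature_aggregation(xyz, feats, n_out, k, weight, bias, rng):
    """Map x points with c_in features to n_out points with c_out features."""
    centers = farthest_point_sampling(xyz, n_out, rng)
    knn = knn_indices(xyz[centers], xyz, k)              # (y, k) neighbor matrix
    grouped = feats[knn]                                 # (y, k, c_in)
    hidden = np.maximum(grouped @ weight + bias, 0.0)    # single-layer MLP (assumed)
    pooled = hidden.max(axis=1)                          # max pooling over neighbors
    return xyz[centers], pooled, knn


# Toy usage: 64 points with 8 features reduced to 16 points with 16 features.
rng = np.random.default_rng(0)
xyz = rng.normal(size=(64, 3))
feats = rng.normal(size=(64, 8))
w, b = rng.normal(size=(8, 16)), np.zeros(16)
new_xyz, new_feats, knn = feature_aggregation(xyz, feats, 16, 12, w, b, rng)
```

The returned `knn` matrix corresponds to the K-neighbor matrix that the layer shares with the two parallel branches, so the neighbor search runs once per block.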
(2) Local feature capture branch
This branch is built on dynamic spatial graph convolution and is used to extract local features from the input plant point cloud; first, a feature graph $G = (V, E)$ is constructed from the point set $V$ and the neighbor information $E$, and edge convolution is applied to extract features from the input feature space; the feature of a point $x_i$ is extracted as:
$f_i = \square\, h(x_i, x_j)$
where $x_j$ is one of the neighbor points of $x_i$, and $\square$ and $h$ denote an aggregation function and a relational operation, respectively; the relational operation aggregates the features of the neighbor points around a candidate point to obtain the feature information of that candidate point, and this relational operation is defined as edge convolution;
max pooling is used as the aggregation function, and the specific process is:
$\mathrm{conv}_i = \max_{j} \mathrm{MLP}(h(x_i, x_i - x_j))$
the relational operation $h$ is defined as a linear combination of the feature of point $x_i$ and the feature difference between $x_i$ and its neighbor point $x_j$;
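The edge convolution with max pooling described for this branch can be sketched with NumPy as follows. It is a hedged illustration: the function name `edge_conv` is hypothetical, the relational operation h is realised by concatenating the center feature with the difference to each neighbor, and the multi-layer perceptron is reduced to one linear layer with ReLU, which yields a linear combination of the two terms before the activation.

```python
import numpy as np


def edge_conv(feats, knn, weight, bias):
    """Edge convolution with max aggregation.

    feats:  (n, c) point features
    knn:    (n, k) neighbor indices, e.g. the shared K-neighbor matrix
    weight: (2c, c_out), bias: (c_out,)  parameters of the assumed one-layer MLP
    """
    k = knn.shape[1]
    center = np.repeat(feats[:, None, :], k, axis=1)    # x_i repeated per neighbor
    diff = center - feats[knn]                          # x_i - x_j
    edge = np.concatenate([center, diff], axis=-1)      # h input, shape (n, k, 2c)
    hidden = np.maximum(edge @ weight + bias, 0.0)      # MLP (single layer + ReLU)
    return hidden.max(axis=1)                           # max over the k neighbors


# Toy usage: 5 points, 3 features, 2 neighbors each, 4 output features.
rng = np.random.default_rng(0)
demo_feats = rng.normal(size=(5, 3))
demo_knn = np.array([[0, 1], [1, 2], [2, 3], [3, 4], [4, 0]])
demo_w, demo_b = rng.normal(size=(6, 4)), np.zeros(4)
demo_out = edge_conv(demo_feats, demo_knn, demo_w, demo_b)
```

When a point's only neighbor is itself, the difference term vanishes and the output depends on the center feature alone, which is a quick sanity check for the indexing.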
(3) Global feature learning branch
Features are extracted with a vector attention mechanism in a local neighborhood, and the calculation formula is as follows:
where $x_j$ is a neighbor point of point $x_i$, $X$ is the independent point set in each single-plant point cloud, $\rho$ is a regularization function, $\gamma$ is a mapping function, $\beta$ is the difference between a neighborhood point and the point of interest, $\varphi$, $\psi$ and $\alpha$ are point-wise feature transformations that yield the Q, K and V values of the self-attention mechanism, respectively, and $\delta$ is a position encoding function; a point attention layer is proposed on the basis of this attention mechanism, and the improved calculation formula is as follows:
(4) T-G feature coupling layer
Through the above processing, two feature matrices of identical dimension and shape are obtained: a matrix G with salient local features and a matrix T with complete global features; G and T are concatenated and input into the feature coupling layer to obtain the target feature matrix:
$TG = \mathrm{Linear}(\mathrm{ReLU}(\mathrm{Linear}(T, G)))$
the T-G feature coupling layer is built from two linear layers and one ReLU activation layer, so that the network can learn the more important information in each of the two matrices and combine them into the target feature matrix.
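The coupling formula can be sketched with NumPy as two linear maps around a ReLU applied to the concatenated matrices. This is a minimal sketch: the function name `tg_coupling`, the concatenation along the feature axis with T first, and the layer widths are assumptions that the claim does not fix.

```python
import numpy as np


def tg_coupling(T, G, w1, b1, w2, b2):
    """TG = Linear(ReLU(Linear([T, G]))) for T and G of identical shape (n, c)."""
    spliced = np.concatenate([T, G], axis=-1)       # (n, 2c), T first (assumed)
    hidden = np.maximum(spliced @ w1 + b1, 0.0)     # first linear layer + ReLU
    return hidden @ w2 + b2                         # second linear layer


# Toy usage with hand-checkable weights: identity, then a sum over channels.
T_demo = np.array([[1.0, 2.0]])
G_demo = np.array([[3.0, -4.0]])
tg_demo = tg_coupling(T_demo, G_demo,
                      np.eye(4), np.zeros(4),
                      np.ones((4, 1)), np.zeros(1))
```

In the toy call the negative entry is zeroed by the ReLU, so the single output is 1 + 2 + 3 = 6.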
CN202310704110.5A 2023-06-14 2023-06-14 Plant organ point cloud segmentation method based on attention mechanism and graph convolution Pending CN117036370A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310704110.5A CN117036370A (en) 2023-06-14 2023-06-14 Plant organ point cloud segmentation method based on attention mechanism and graph convolution

Publications (1)

Publication Number Publication Date
CN117036370A (en) 2023-11-10

Family

ID=88625100

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310704110.5A Pending CN117036370A (en) 2023-06-14 2023-06-14 Plant organ point cloud segmentation method based on attention mechanism and graph convolution

Country Status (1)

Country Link
CN (1) CN117036370A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117455929A (en) * 2023-12-26 2024-01-26 福建理工大学 Tooth segmentation method and terminal based on dual-stream self-attention graph convolutional network
CN117455929B (en) * 2023-12-26 2024-03-15 福建理工大学 Tooth segmentation method and terminal based on dual-stream self-attention graph convolutional network
CN117745148A (en) * 2024-02-10 2024-03-22 安徽省农业科学院烟草研究所 Multi-source data-based rice stubble flue-cured tobacco planting quality evaluation method and system
CN117745148B (en) * 2024-02-10 2024-05-10 安徽省农业科学院烟草研究所 Multi-source data-based rice stubble flue-cured tobacco planting quality evaluation method and system
CN117726822A (en) * 2024-02-18 2024-03-19 安徽大学 Three-dimensional medical image classification segmentation system and method based on double-branch feature fusion
CN117726822B (en) * 2024-02-18 2024-05-03 安徽大学 Three-dimensional medical image classification segmentation system and method based on double-branch feature fusion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination