CN114708431A - Material image segmentation method based on multi-dimensional feature fusion and graph attention - Google Patents

Material image segmentation method based on multi-dimensional feature fusion and graph attention

Info

Publication number
CN114708431A
CN114708431A (application CN202210318948.6A)
Authority
CN
China
Prior art keywords
graph
attention
node
image
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210318948.6A
Other languages
Chinese (zh)
Inventor
韩越兴
魏惠姗
王冰
陈侨川
钱权
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Shanghai for Science and Technology
Original Assignee
University of Shanghai for Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Shanghai for Science and Technology filed Critical University of Shanghai for Science and Technology
Priority to CN202210318948.6A
Publication of CN114708431A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/253 Fusion techniques of extracted features
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/08 Learning methods


Abstract

The invention discloses a material image segmentation method based on multi-dimensional feature fusion and graph attention, applicable to image segmentation in the field of materials science. The method first preprocesses the material images used for training, then constructs a multi-dimensional feature fusion graph attention network, optimizes the parameters of the network with a cross-entropy loss, and uses the trained model to predict segmentations of material images; finally, the output material image processing results are saved. By integrating the multi-dimensional feature fusion graph attention module into an encoder-decoder network, the invention improves the segmentation precision of the network on material images, reduces the time and labor costs of material image processing, and promotes progress and development in the corresponding academic and industrial communities.

Description

Material image segmentation method based on multi-dimensional feature fusion and graph attention
Technical Field
The invention relates to the field of two-dimensional image analysis and processing in computer vision, and provides a material image segmentation method based on multi-dimensional feature fusion and graph attention for two-dimensional image data. The method can be applied to image segmentation in the field of materials science; it improves image segmentation precision, reduces the time and labor costs of image processing, and promotes progress and development in the corresponding academic and industrial communities.
Background
Image semantic segmentation is a problem of broad interest in image processing and related fields. Semantic segmentation separates the different objects in a picture at the pixel level, labeling each pixel of the original image and classifying it into one of several categories; segmentation precision therefore depends on understanding the information in the picture. Material images are typically taken by advanced electron microscopes and are single-channel images, such as grayscale images. A grayscale image is characterized by low contrast between regions: it displays black, white and gray according to the brightness presented by the material structure. Material structures exhibit diverse shapes, small differences in texture, and discontinuous or even blurred boundaries. Therefore, how to use artificial intelligence to perform semantic segmentation of material images quickly and accurately and to extract useful information from them is one of the challenges in the field of computer vision.
There are many methods for image semantic segmentation, among which neural-network-based semantic segmentation is currently one of the most active research topics, with many published results. The FCN (Fully Convolutional Network) is a classic framework for image semantic segmentation: it is trained end to end and adapts a trained classification network for semantic segmentation; to restore the resolution of the image, the FCN upsamples using deconvolution. Compared with the FCN, U-Net has a more symmetric encoder-decoder structure, and its skip connections from the encoder to the decoder help recover position information. However, because the basic module of the network is a simple convolution block, it suffers from a degree of gradient vanishing, which limits how deep the network can be made; in addition, U-Net does not fully consider the relations between pixels and lacks exploration of the dependencies among local features, which affects the accuracy of the final segmentation result. Constructing a deeper and more effective network structure and optimizing the network to explore more features can therefore be considered the key to improving semantic segmentation precision.
Disclosure of Invention
In order to solve the problems in the prior art and overcome its defects, the invention designs a material image segmentation method based on multi-dimensional feature fusion graph attention, which strengthens the network's exploration of local image features and realizes high-precision segmentation of material images.
In order to achieve the purpose of the invention, the invention adopts the following technical scheme:
a material image segmentation method based on multi-dimensional feature fusion and attention drawing comprises the following steps:
(1) image preprocessing:
respectively adjusting the original image and the labeled graph for training into uniform specifications, and storing the preprocessed image locally;
(2) constructing a network model based on graph attention:
inputting training set data into a network, optimizing model parameters of the network by using cross entropy loss, and storing a trained network parameter file;
(3) and (3) performing image prediction segmentation:
loading a trained model parameter file, inputting test set data into a network, obtaining a predicted segmentation result, and representing the segmentation result by a binary graph;
(4) saving and outputting an image processing result:
and (4) storing the original image of the test set sample and the segmentation result image in the same image.
Preferably, in the step (1), the image preprocessing comprises the following steps (a code sketch follows the steps):
(1-1) cutting away the part of the original image that describes material performance data;
(1-2) uniformly adjusting the images to 512×512 pixels;
(1-3) converting all labeled maps into black-and-white maps using a binarization algorithm;
(1-4) dividing and saving the preprocessed image data.
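The following is a minimal sketch of steps (1-1) to (1-4), assuming Python with the Pillow and NumPy libraries; the crop box, binarization threshold, file paths and function name are illustrative placeholders, not values fixed by the invention.

```python
import os
import numpy as np
from PIL import Image

def preprocess_pair(image_path, label_path, out_dir, crop_box=None, threshold=128):
    """Crop away the description banner, resize to 512x512, binarize the label,
    and save the pair locally. crop_box and threshold are placeholders."""
    image = Image.open(image_path).convert("L")        # material images are grayscale
    label = Image.open(label_path).convert("L")

    if crop_box is not None:                           # (1-1) cut away the data banner
        image = image.crop(crop_box)
        label = label.crop(crop_box)

    image = image.resize((512, 512), Image.BILINEAR)   # (1-2) uniform 512x512 pixels
    label = label.resize((512, 512), Image.NEAREST)    # nearest keeps labels discrete

    # (1-3) binarization: pixels at or above the threshold become foreground (255)
    label = Image.fromarray((np.asarray(label) >= threshold).astype(np.uint8) * 255)

    # (1-4) divide and save the preprocessed pair
    os.makedirs(out_dir, exist_ok=True)
    stem = os.path.splitext(os.path.basename(image_path))[0]
    image.save(os.path.join(out_dir, stem + "_img.png"))
    label.save(os.path.join(out_dir, stem + "_lbl.png"))
```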
Preferably, in the step (2), the multi-dimensional feature fusion based graph attention module comprises three sub-modules: (a) a graph encoder module, (b) a graph attention module, and (c) a graph decoder module. When constructing the graph attention based network model, a graph encoder is adopted to construct a graph structure from the feature map; the graph attention module is constructed from graph convolution and graph attention; and a graph decoder is adopted to restore the graph structure into a feature map. The design and construction of the graph encoder comprise the following steps (a minimal code sketch follows the steps):
(2-1-1) taking the feature map output by the encoder part and adjusting its dimension from C×H×W to C×HW;
(2-1-2) dividing the feature map into H×W nodes, where the feature dimension of each node is 1×C;
(2-1-3) establishing connections between nodes in a four-neighborhood manner, i.e., establishing edge connections between each central node and its four nearest neighbors above, below, left and right;
(2-1-4) establishing an adjacency matrix from the graph structure to describe the connections between nodes;
(2-1-5) saving the established graph structure as the node features and the adjacency matrix.
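As a concrete illustration of steps (2-1-1) to (2-1-5), the sketch below builds the node feature matrix and the four-neighborhood adjacency matrix from an encoder feature map; it assumes PyTorch, and the function name is a placeholder.

```python
import torch

def build_graph(feature_map):
    """feature_map: (C, H, W) encoder output.
    Returns node features (HW, C) and adjacency matrix (HW, HW)."""
    C, H, W = feature_map.shape
    # (2-1-1)/(2-1-2): C x H x W -> HW nodes, each with a 1 x C feature
    nodes = feature_map.reshape(C, H * W).t()          # (HW, C)

    # (2-1-3)/(2-1-4): four-neighborhood edges -> adjacency matrix
    adj = torch.zeros(H * W, H * W)
    for r in range(H):
        for c in range(W):
            i = r * W + c
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):  # up/down/left/right
                rr, cc = r + dr, c + dc
                if 0 <= rr < H and 0 <= cc < W:
                    adj[i, rr * W + cc] = 1.0
    return nodes, adj                                  # (2-1-5) saved graph structure
```

A dense adjacency matrix is used here to match the formulas given later; for large feature maps a sparse edge-list representation would scale better.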
Preferably, the graph attention module fuses graph convolution and graph attention, and the design and construction of the module comprise the following steps (a sketch of the complete module follows the formulas below):
(2-2-1) taking the node feature matrix and the adjacency matrix as the input of the graph attention module;
(2-2-2) applying one graph attention layer with multi-head attention to the input node features (1×C), learning attention weights and outputting updated node features;
(2-2-3) applying one graph convolution layer to the input node features (1×C) for local aggregation and feature dimension reduction, reducing the input node feature dimension to 1/2×C;
(2-2-4) applying one graph attention layer with multi-head attention to the input node features (1/2×C), learning attention weights and outputting updated node features;
(2-2-5) applying one graph convolution layer to the input node features (1/2×C) for local aggregation and feature dimension reduction, reducing the node feature dimension output by the previous graph convolution layer to 1/4×C;
(2-2-6) applying one graph attention layer with multi-head attention to the input node features (1/4×C), learning attention weights and outputting updated node features;
(2-2-7) applying one graph convolution layer to the input node features (1/4×C) for local aggregation and feature dimension raising, increasing the node feature dimension output by the previous graph convolution layer to 1/2×C;
(2-2-8) fusing, by addition, the output of the graph attention layer whose node feature dimension is 1/2×C with the output of the graph convolution dimension-raising operation whose node feature dimension is 1/2×C;
(2-2-9) applying one graph convolution layer to the input node features (1/2×C) for local aggregation and feature dimension raising, increasing the node feature dimension output by the previous graph convolution layer to 1×C;
(2-2-10) fusing, by addition, the output of the graph attention layer whose node feature dimension is 1×C with the output of the graph convolution dimension-raising operation whose node feature dimension is 1×C;
(2-2-11) adopting the hyper-parameter a to fuse the features of the different branches in a user-defined ratio.
Preferably, a Resize function is used to construct the graph decoder module, and the design of the graph decoder comprises the following steps (a short sketch follows):
(2-3-1) adjusting the output dimension of the graph attention module from C×HW to C×H×W;
(2-3-2) converting the dimension-adjusted node features into the feature map input to the decoder.
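The corresponding decoding step is a simple reshape; a sketch under the same PyTorch assumption (function name illustrative):

```python
import torch

def graph_to_feature_map(nodes, H, W):
    """(2-3-1)/(2-3-2): node features (HW, C) -> feature map (C, H, W)
    that is handed to the decoder of the segmentation network."""
    HW, C = nodes.shape
    return nodes.t().reshape(C, H, W)   # C x HW -> C x H x W
```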
Preferably, in the step (2), the graph attention module uses graph convolution and graph attention layers. The graph convolution operation is implemented with the degree matrix, the adjacency matrix and the node features, and is computed as:

$$H^{l+1} = \sigma\left(\tilde{D}^{-\frac{1}{2}}\,\tilde{A}\,\tilde{D}^{-\frac{1}{2}}\,H^{l}\,W^{l}\right)$$

where $H^{l+1}$ is the output of the graph convolution layer, $W^{l}$ is a weight matrix, $H^{l}$ is the node feature matrix, $\tilde{D}$ is the degree matrix, $\tilde{A} = A + I$ is the adjacency matrix plus an identity matrix, and $\sigma$ is the activation function.
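A minimal sketch of this propagation rule, assuming PyTorch; the class name is illustrative and the choice of ReLU for σ is an assumption.

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One graph convolution: H_out = sigma(D^-1/2 (A + I) D^-1/2 H W)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.weight = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, h, adj):
        a_tilde = adj + torch.eye(adj.size(0), device=adj.device)  # A + I
        deg = a_tilde.sum(dim=1)                                   # node degrees
        d_inv_sqrt = torch.diag(deg.pow(-0.5))                     # D^-1/2
        a_norm = d_inv_sqrt @ a_tilde @ d_inv_sqrt                 # symmetric norm
        return torch.relu(a_norm @ self.weight(h))                 # sigma(...)
```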
Preferably, the weight coefficients in the graph attention operation are computed with the following formula:

$$\alpha_{ij} = \operatorname{softmax}_{j}\left(\sigma\left(\vec{a}^{\,T}\left[W\vec{h}_{i}\,\Vert\,W\vec{h}_{j}\right]\right)\right)$$

where $\alpha_{ij}$ is the attention coefficient, $W$ is the weight matrix, $\vec{h}_{i}$ is the feature vector of node $i$, $\vec{h}_{j}$ is the feature vector of node $j$, $\vec{a}$ is the weight vector, $\sigma$ is the activation function, and softmax is a specific activation function that normalizes the coefficients over the neighbors $j$ of node $i$.
Preferably, the multi-head attention mechanism in the graph attention operation utilizes the following formula:

$$\vec{h}_{i}' = \sigma\left(\frac{1}{K}\sum_{k=1}^{K}\sum_{j\in\mathcal{N}_{i}}\alpha_{ij}^{k}\,W^{k}\,\vec{h}_{j}\right)$$

where $\vec{h}_{i}'$ is the output feature vector of node $i$, $\vec{h}_{j}$ is the feature vector of node $j$, $W^{k}$ is the weight matrix of the $k$-th attention head, $\alpha_{ij}^{k}$ is the attention coefficient between node $i$ and node $j$ in the $k$-th attention head, $K$ is the number of attention heads, $\mathcal{N}_{i}$ is the neighborhood of node $i$, and $\sigma$ is the activation function.
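The two formulas above can be sketched as a single layer as follows, again assuming PyTorch; LeakyReLU for the coefficient activation and head averaging are assumptions consistent with the formulas, not details fixed by the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GATLayer(nn.Module):
    """Multi-head graph attention with head averaging; a sketch, not the
    patent's exact implementation."""
    def __init__(self, in_dim, out_dim, num_heads=4):
        super().__init__()
        self.W = nn.ModuleList([nn.Linear(in_dim, out_dim, bias=False)
                                for _ in range(num_heads)])
        self.a = nn.ParameterList([nn.Parameter(torch.randn(2 * out_dim))
                                   for _ in range(num_heads)])

    def forward(self, h, adj):
        outs = []
        for W, a in zip(self.W, self.a):
            wh = W(h)                                          # (N, out_dim) = W h
            d = wh.size(1)
            # e_ij = sigma(a^T [W h_i || W h_j]), computed for all pairs at once
            e = F.leaky_relu(wh @ a[:d].unsqueeze(1) + (wh @ a[d:].unsqueeze(1)).t())
            # keep only real edges; every node of the 4-neighborhood grid has >= 2
            e = e.masked_fill(adj == 0, float("-inf"))
            alpha = torch.softmax(e, dim=1)                    # alpha_ij over j
            outs.append(alpha @ wh)                            # sum_j alpha_ij W h_j
        return torch.relu(torch.stack(outs).mean(dim=0))       # sigma(mean of K heads)
```

Masking with the adjacency matrix restricts attention to the four-neighborhood built by the graph encoder, which is what makes the aggregation local.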
Preferably, the hyper-parameter a in the graph attention module utilizes the following formula:

$$\hat{H}^{l+1} = (1-a)\,H^{l+1} + a\,H_{\mathrm{GAT}}$$

where $\hat{H}^{l+1}$ is the fusion of the output of the graph attention layer and the output of the graph convolution dimension-raising operation, $H^{l+1}$ is the output feature of the $(l+1)$-th graph convolution layer, $a$ is the hyper-parameter, and $H_{\mathrm{GAT}}$ is the output of the graph attention layer.
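Putting the pieces together, the following sketch composes the branch structure of steps (2-2-1) to (2-2-11) and FIG. 4, reusing the GCNLayer and GATLayer sketches above; the channel schedule C → C/2 → C/4 → C/2 → C and the a-weighted fusion follow the description, while the class name, head count and default value of a are illustrative (C is assumed divisible by 4).

```python
import torch.nn as nn

class GraphAttentionModule(nn.Module):
    """Sketch of the multi-dimensional feature fusion graph attention module;
    GCNLayer and GATLayer are the sketches defined above."""
    def __init__(self, C, a=0.5, heads=4):
        super().__init__()
        self.a = a                                     # hyper-parameter a
        self.gat1 = GATLayer(C, C, heads)              # (2-2-2) attention at 1 x C
        self.gcn1 = GCNLayer(C, C // 2)                # (2-2-3) reduce to 1/2 x C
        self.gat2 = GATLayer(C // 2, C // 2, heads)    # (2-2-4)
        self.gcn2 = GCNLayer(C // 2, C // 4)           # (2-2-5) reduce to 1/4 x C
        self.gat3 = GATLayer(C // 4, C // 4, heads)    # (2-2-6)
        self.gcn3 = GCNLayer(C // 4, C // 2)           # (2-2-7) raise to 1/2 x C
        self.gcn4 = GCNLayer(C // 2, C)                # (2-2-9) raise to 1 x C

    def forward(self, h, adj):
        g1 = self.gat1(h, adj)                         # branch kept at 1 x C
        x = self.gcn1(h, adj)
        g2 = self.gat2(x, adj)                         # branch kept at 1/2 x C
        x = self.gcn2(x, adj)
        x = self.gat3(x, adj)
        x = self.gcn3(x, adj)                          # back to 1/2 x C
        x = (1 - self.a) * x + self.a * g2             # (2-2-8) fuse at 1/2 x C
        x = self.gcn4(x, adj)                          # back to 1 x C
        return (1 - self.a) * x + self.a * g1          # (2-2-10)/(2-2-11)
```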
Preferably, in the step (2), the number of training epochs is set to 100 when training the network model; usually no more than 75 epochs are needed, i.e., the network parameters converge to near their optimal values within that budget. The network training comprises the following steps (a training-loop sketch follows):
(2-4-1) inputting the training set images into the network;
(2-4-2) optimizing the network parameters with the Adam first-order optimization algorithm, iteratively updating the neural network weights from the training data, and setting a weight decay coefficient to reduce model overfitting;
(2-4-3) to further improve network performance, setting a learning rate and adopting a scheme that dynamically reduces it to approach the optimal network parameters: when the loss does not decrease within a certain number of epochs, the learning rate lr is multiplied by a decay factor; the trained model parameter file is then saved.
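A minimal sketch of this training schedule, assuming PyTorch; the model, data loader, learning rate, weight decay, decay factor and patience values are placeholders, since the description fixes only the epoch budget, the Adam optimizer with weight decay, the cross-entropy loss, and the loss-plateau decay scheme.

```python
import torch
import torch.nn as nn

def train(model, train_loader, epochs=100, lr=1e-3, weight_decay=1e-4):
    criterion = nn.CrossEntropyLoss()                        # cross-entropy loss
    optimizer = torch.optim.Adam(model.parameters(), lr=lr,
                                 weight_decay=weight_decay)  # (2-4-2)
    # (2-4-3): multiply lr by a decay factor when the loss stops decreasing
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
        optimizer, mode="min", factor=0.5, patience=5)

    for epoch in range(epochs):                              # epoch budget of 100
        model.train()
        total = 0.0
        for images, labels in train_loader:                  # labels: LongTensor masks
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
            total += loss.item()
        scheduler.step(total / len(train_loader))            # epoch-mean loss
    torch.save(model.state_dict(), "model_params.pth")       # save parameter file
```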
Preferably, in the step (3), the prediction of the material image comprises the following steps (an inference sketch follows):
(3-1) loading the trained model parameter file;
(3-2) inputting the image data into the network to obtain the predicted segmentation result;
(3-3) locally storing the original image of the test set sample and the segmentation result image in the same image.
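A sketch of this prediction step under the same PyTorch and Pillow assumptions; the function name and parameter-file path are illustrative.

```python
import torch
from PIL import Image

def predict_and_save(model, image_tensor, original, out_path,
                     param_file="model_params.pth"):
    """Load trained parameters, predict a binary segmentation, and save the
    original and the result side by side in one image."""
    model.load_state_dict(torch.load(param_file))            # (3-1)
    model.eval()
    with torch.no_grad():
        logits = model(image_tensor.unsqueeze(0))            # (3-2) 1 x K x H x W
        # binary task: class index 0/1 scaled to a 0/255 binarized map
        mask = logits.argmax(dim=1).squeeze(0).to(torch.uint8) * 255

    result = Image.fromarray(mask.cpu().numpy())             # segmentation result
    combined = Image.new("L", (original.width + result.width, original.height))
    combined.paste(original, (0, 0))                         # (3-3) same image:
    combined.paste(result, (original.width, 0))              # original | prediction
    combined.save(out_path)
```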
Compared with the prior art, the invention has the following obvious and prominent substantive features and remarkable advantages:
1. The invention is based on a multi-dimensional feature fusion graph attention network and can be applied to image segmentation in the field of materials science to improve the segmentation effect. By combining the graph attention module, which operates on graph structures, with skip connections and migrating it to data structures in Euclidean space, the network is deepened without changing the size of the feature map, alleviating the loss of spatial information. Coarse features are learned through graph structures representing low-resolution feature maps, fine detail features through graph structures representing high-resolution feature maps, and graph structures of different resolutions are fused to learn and represent the images accurately, thereby realizing high-precision segmentation of the images.
2. Considering that graph convolution layers should not be stacked too deep, the invention proposes a hyper-parameter a to control how each graph convolution layer propagates and aggregates messages to the global node features, appropriately deepening the number of graph convolution layers, improving the fitting capability of the network, and improving the accuracy of semantic segmentation.
Drawings
FIG. 1 is a block diagram of the operational flow of the present invention.
FIG. 2 is a flow chart of the preprocessing method of the present invention, comprising the following steps: (1) cutting the image data into image blocks containing only material structures; (2) uniformly adjusting all images to 512×512 pixels; (3) judging whether each labeled image is a black-and-white image, and converting any non-black-and-white image with a binarization algorithm; (4) dividing and saving the preprocessed image data.
FIG. 3 is a flow chart of segmenting an image according to the present invention, comprising the following steps: (1) inputting image data, cutting the original images for training and testing into image blocks containing only material structures, preprocessing the image blocks, and storing the preprocessed data locally; (2) putting the training set images into the encoder of the multi-dimensional feature fusion based graph attention network to obtain a feature map representing high-level features; (3) putting the high-level feature map into the graph encoder to obtain the adjacency matrix and node features representing the graph structure; (4) putting the adjacency matrix and node features of the graph structure into the graph attention module for node feature aggregation and updating; (5) putting the updated graph structure into the graph decoder to restore it to the form of a feature map; (6) putting the feature map into the decoder of the network; (7) optimizing the model parameters of the network with cross-entropy loss and saving the trained network parameter file; (8) loading the trained model parameter file and inputting the test set data into the network; (9) computing the predicted segmentation result, represented by a binarized map; (10) outputting and saving the image result.
FIG. 4 is a structural diagram of the graph attention module of the present invention, comprising the following steps: (1) inputting the adjacency matrix and node feature matrix representing the graph structure; (2) feeding the input features into the graph attention layer (GAT1) and the graph convolution layer (GCN1), which output node features; (3) feeding the node features output by GCN1 into a graph attention layer (GAT2) and a graph convolution layer (GCN2), which output new node features; (4) feeding the node features output by GCN2 into the graph attention layer (GAT3); (5) feeding the node features output by GAT3 into the graph convolution layer (GCN3); (6) adding the features output by GCN3 multiplied by (1-a) to the node features output by GAT2 multiplied by a; (7) feeding the summed features into the graph convolution layer (GCN); (8) adding the features output by the GCN multiplied by (1-a) to the node features output by GAT1 multiplied by a, outputting the final node features.
FIG. 5 is a flow chart of the graph encoder of the present invention, comprising the following steps: (1) dividing the input feature map into nodes and node features; (2) connecting adjacent nodes in a four-neighborhood manner; (3) constructing the adjacency matrix of the graph structure; (4) saving the adjacency matrix and node features of the graph structure.
Detailed Description
In order to make the technical solution of the present invention better understood, preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. All other embodiments obtained by a person skilled in the art from the embodiments given herein without any inventive step shall fall within the scope of the present invention.
The above-described scheme is further illustrated below with reference to specific embodiments:
the first embodiment is as follows:
in this embodiment, a method for segmenting a material image based on a multi-dimensional feature fusion graph attention is provided, and an efficient multi-dimensional feature fusion graph attention network structure is constructed by the method, so that the segmentation accuracy of a network on image data is improved.
In the method of this embodiment, a material image is used to train a model, so as to obtain model parameters of such data, and further obtain high-precision prediction of similar segmentation data except for a sample, as shown in fig. 1, the method of this embodiment includes the following steps:
(1) image preprocessing: respectively adjusting the original image and the labeled graph for training into uniform specifications, and storing the preprocessed image locally;
(2) constructing a network model based on graph attention: inputting training set data into a network, optimizing model parameters of the network by using cross entropy loss, and storing a trained network parameter file;
(3) and (3) performing image prediction segmentation: loading a trained model parameter file, inputting test set data into a network, and acquiring a predicted segmentation result, wherein the segmentation result is represented by a binary graph;
(4) saving and outputting an image processing result: and (4) storing the original image of the test set sample and the segmentation result image in the same image.
The invention relates to a material image segmentation method based on multi-dimensional feature fusion and drawing attention, which comprises the steps of firstly preprocessing an image for training to obtain a clearer image, and storing the preprocessed data locally; then training a graph attention network on the training data set by using cross entropy loss; and then predicting the test data set by using the trained model, and storing the predicted binarization image result.
Example two
This embodiment is substantially the same as the first embodiment, with the following particular features:
In this embodiment, as shown in FIG. 2, the image preprocessing comprises the following steps:
(1-1) clipping the image data into image blocks of 512×512 pixels;
(1-2) judging whether each labeled image is a black-and-white image, and converting any non-black-and-white image with a binarization algorithm;
(1-3) dividing and saving the preprocessed image data.
In this embodiment, the multi-dimensional feature fusion graph attention module comprises three sub-modules: (a) a graph encoder module, (b) a graph attention module, and (c) a graph decoder module. The graph encoder is constructed by building a graph, and its construction comprises the following steps:
(2-1) taking the feature map output by the encoder part and adjusting its dimension from C×H×W to C×HW;
(2-2) dividing the feature map into H×W nodes, where the feature dimension of each node is 1×C;
(2-3) establishing connections between nodes in a four-neighborhood manner, i.e., establishing edge connections between each central node and its four nearest neighbors above, below, left and right;
(2-4) establishing an adjacency matrix from the graph structure to describe the connections between nodes;
(2-5) saving the established graph structure as the node features and the adjacency matrix;
in this embodiment, the graph attention network module performs feature fusion using a graph volume and a graph attention layer; the graph attention module construction includes the following steps:
(2-6) taking the node feature matrix and the adjacency matrix as the input of the graph attention module;
(2-7) performing multi-head attention on the input node features (1 × C) by adopting one-layer graph attention, and outputting updated node features by learning attention weights;
(2-8) performing local aggregation and feature dimension reduction on the input node features (1 × C) by adopting a layer of graph convolution, wherein the node feature dimension of the input node is reduced to 1/2 × C;
(2-9) performing multi-head attention on the input node features (1/2 × C) by adopting one-layer graph attention, and outputting updated node features by learning attention weights;
(2-10) performing local aggregation and feature dimensionality reduction on the input node features (1/2 xC) by adopting one-layer graph convolution, wherein the node feature dimensionality of the output node feature of the graph convolution layer at the previous layer is reduced to 1/4 xC;
(2-11) performing multi-head attention on the input node features (1/4 × C) by adopting one-layer graph attention, and outputting updated node features by learning attention weights;
(2-12) locally aggregating and feature-increasing the input node features (1/4 × C) by adopting one-layer graph convolution, wherein the node feature dimension of the previous-layer graph convolution layer is increased to 1/2 × C;
(2-13) fusing the output of the graph attention layer with the node characteristic dimension size of 1/2 xC and the output of the graph convolution layer dimensionality increasing operation with the node characteristic dimension size of 1/2 xC in an addition mode;
(2-14) locally aggregating and feature-increasing dimensions of the input node features (1/2 xC) by adopting one-layer graph convolution, wherein the node feature dimensions of the output node features of the convolution layer of the previous layer graph are increased to 1 xC;
(2-15) merging the output of the graph attention layer with the node characteristic dimension size of 1 xC and the output of the graph convolution layer dimensionality increasing operation with the node characteristic dimension size of 1 xC in an addition mode;
(2-16) fusing the characteristics of different branches according to a self-defined ratio by adopting a hyper-parameter a;
in this embodiment, the graph attention module uses a graph convolution and a graph attention layer, and the graph convolution is implemented as follows:
(2-17) the graph convolution operation is realized by adopting the degree matrix, the adjacency matrix and the node characteristics, and the calculation formula is as follows:
Figure BDA0003569778860000081
H1+1is the output of the graph convolution layer, W1Is a weight matrix, H1Is a node characteristic matrix, D is a degree matrix,
Figure BDA0003569778860000082
is a contiguous matrix plus an identity matrix, σ is the activation function;
in this embodiment, the attention-achieving steps are as follows:
(2-18) calculating an attention coefficient by using the node characteristics, wherein the calculation formula is as follows:
Figure BDA0003569778860000083
αijis the attention coefficient, W is the weight matrix,
Figure BDA0003569778860000084
is the feature vector of the node i,
Figure BDA0003569778860000085
is the feature vector of the node j,
Figure BDA0003569778860000086
is a weight vector, σ is an activation function, softmax is a specific activation function;
(2-19) updating the node characteristics by adopting a multi-head attention mechanism, wherein the calculation formula is as follows:
Figure BDA0003569778860000087
Figure BDA0003569778860000088
is the feature vector of the output i-node,
Figure BDA0003569778860000089
is the feature vector of the j node, WkIs the weight matrix for the kth attention head,
Figure BDA00035697788600000810
are the attention coefficients of the i and j nodes in the kth attention head, K is the number of attention heads, and σ is the activation function.
In this embodiment, the graph attention module uses the hyper-parameter a to fuse the features of different branches, implemented as follows:
(2-20) adopting the hyper-parameter a to fuse the features of the different branches in a user-defined ratio, computed as:

$$\hat{H}^{l+1} = (1-a)\,H^{l+1} + a\,H_{\mathrm{GAT}}$$

where $\hat{H}^{l+1}$ is the fusion of the output of the graph attention layer and the output of the graph convolution dimension-raising operation, $H^{l+1}$ is the output feature of the $(l+1)$-th graph convolution layer, $a$ is the hyper-parameter, and $H_{\mathrm{GAT}}$ is the output of the graph attention layer;
in this embodiment, Resize function is used to construct a graph decoder module, and the graph decoder construction includes the following steps:
(2-21) adjusting the output dimension of the image attention module from C × HW to C × H × W;
(2-22) converting the node characteristics after dimension adjustment into a characteristic diagram input by a decoder;
in this embodiment, the iteration time epoch is set to 100 when training the network model, and usually, the iteration time epoch is not greater than 75, i.e., the network parameter can converge to a value near the optimal value, where the network training includes the following steps:
(2-23) inputting the training set images into the network;
(2-24) optimizing network parameters by adopting an Adam first-order optimization algorithm, iteratively updating neural network weights based on training data, and setting weight attenuation coefficients to reduce the problem of model overfitting;
(2-25) in order to further obtain more excellent network performance, setting a learning rate, adopting a scheme of dynamically reducing the learning rate to further approach to the optimal value of the network parameter, multiplying the learning rate lr by an attenuation factor to reduce the learning rate when the loss value does not decrease within a certain epoch, and storing the trained model parameter file.
The method is based on the multi-dimensional feature fusion graph attention network and can be applied to image segmentation in the field of materials science to improve the segmentation effect. The graph attention module, which operates on graph structures, is combined with skip connections and migrated into data structures in Euclidean space, so that the loss of spatial information is alleviated while the network is deepened and the size of the feature map is unchanged. Coarse features are learned through graph structures representing low-resolution feature maps, fine detail features through graph structures representing high-resolution feature maps, and graph structures of different resolutions are fused to learn and represent the images accurately, achieving high-precision segmentation. Considering that graph convolution layers should not be stacked too deep, this embodiment proposes the hyper-parameter a to control how each graph convolution layer propagates and aggregates messages to the global node features, appropriately deepening the number of graph convolution layers to improve the fitting capability of the network and the accuracy of semantic segmentation.
Example three
This embodiment is substantially the same as example two, with the following particular features:
In this embodiment, the predictive segmentation of the two-dimensional image comprises the following steps:
(3-1) loading the trained model parameter file;
(3-2) inputting the image data into the network to obtain the predicted segmentation result;
(3-3) locally storing the original image of the test set sample and the segmentation result image in the same image.
In summary, as shown in FIG. 1, the material image segmentation method based on multi-dimensional feature fusion and graph attention comprises the following steps:
First, the original images used for training are cut into image blocks containing only material structures and preprocessed to obtain clearer image blocks, and the preprocessed data are stored locally; a multi-dimensional feature fusion based graph attention network is constructed, the training set data are input into the network, the model parameters of the network are optimized with cross-entropy loss, and the trained network parameter file is saved; the trained model parameter file is loaded, image data are input into the network, and the predicted segmentation result is obtained, represented by a binarized map; finally, the post-processed image results are output and saved. The method can be applied to image segmentation in the field of materials science and promotes progress and development across subject areas. FIG. 2 is a flow chart of the preprocessing method of this embodiment, comprising the following steps: (1) cutting the image data into image blocks containing only material structures; (2) uniformly adjusting all images to 512×512 pixels; (3) judging whether each labeled image is a black-and-white image, and converting any non-black-and-white image with a binarization algorithm; (4) dividing and saving the preprocessed image data.
FIG. 3 is a flow chart of segmenting an image according to this embodiment, comprising the following steps: (1) inputting image data, cutting the original images for training and testing into image blocks containing only material structures, preprocessing the image blocks, and storing the preprocessed data locally; (2) putting the training set images into the encoder of the multi-dimensional feature fusion based graph attention network to obtain a feature map representing high-level features; (3) putting the high-level feature map into the graph encoder to obtain the adjacency matrix and node features representing the graph structure; (4) putting the adjacency matrix and node features of the graph structure into the graph attention module for node feature aggregation and updating; (5) putting the updated graph structure into the graph decoder to restore it to the form of a feature map; (6) putting the feature map into the decoder of the network; (7) optimizing the model parameters of the network with cross-entropy loss and saving the trained network parameter file; (8) loading the trained model parameter file and inputting the test set data into the network; (9) computing the predicted segmentation result, represented by a binarized map; (10) outputting and saving the image result.
FIG. 4 is a structural diagram of the graph attention module of this embodiment, comprising the following steps: (1) inputting the adjacency matrix and node feature matrix representing the graph structure; (2) feeding the input features into the graph attention layer (GAT1) and the graph convolution layer (GCN1), which output node features; (3) feeding the node features output by GCN1 into a graph attention layer (GAT2) and a graph convolution layer (GCN2), which output new node features; (4) feeding the node features output by GCN2 into the graph attention layer (GAT3); (5) feeding the node features output by GAT3 into the graph convolution layer (GCN3); (6) adding the features output by GCN3 multiplied by (1-a) to the node features output by GAT2 multiplied by a; (7) feeding the summed features into the graph convolution layer (GCN); (8) adding the features output by the GCN multiplied by (1-a) to the node features output by GAT1 multiplied by a, outputting the final node features.
FIG. 5 is a flow chart of the graph encoder of this embodiment, comprising the following steps: (1) dividing the input feature map into nodes and node features; (2) connecting adjacent nodes in a four-neighborhood manner; (3) constructing the adjacency matrix of the graph structure; (4) saving the adjacency matrix and node features of the graph structure.
This embodiment provides a material image segmentation method based on multi-dimensional feature fusion graph attention, applicable to image segmentation in the field of materials science. The method first preprocesses the material images used for training, then constructs a multi-dimensional feature fusion graph attention network, optimizes the parameters of the network with cross-entropy loss, and uses the trained model to predict segmentations of material images; finally, the output material image processing results are saved. In this embodiment, the multi-dimensional feature fusion graph attention module is integrated into the encoder-decoder network, which improves the segmentation precision of the network on material images, reduces the time and labor costs of material image processing, and promotes progress and development in the corresponding academic and industrial communities.
The embodiments of the present invention have been described above with reference to the accompanying drawings, but the present invention is not limited to the above embodiments. Various changes and modifications can be made according to the purpose of the invention, and any changes, modifications, substitutions, combinations or simplifications made according to the spirit and principle of the technical solution of the present invention shall be equivalent substitutions, as long as they meet the purpose of the present invention and do not depart from the technical principle and inventive concept of the material image segmentation method based on multi-dimensional feature fusion graph attention of the present invention.

Claims (8)

1. A method for segmenting a material image based on multi-dimensional feature fusion and graph attention, characterized by comprising the following steps:
(1) image preprocessing:
respectively adjusting the original images and the labeled maps used for training to a uniform specification, and storing the preprocessed images locally;
(2) constructing a graph attention based network model:
inputting the training set data into the network, optimizing the model parameters of the network with cross-entropy loss, and saving the trained network parameter file;
(3) image prediction segmentation:
loading the trained model parameter file, inputting the test set data into the network, and obtaining the predicted segmentation result, represented by a binarized map;
(4) saving and outputting the image processing result:
storing the original image of the test set sample and the segmentation result image in the same image.
2. The method for segmenting a material image based on multi-dimensional feature fusion and graph attention according to claim 1, characterized in that in the step (1), the image preprocessing comprises the following steps:
(1-1) cutting away the part of the original image that describes material performance data;
(1-2) uniformly adjusting the images to 512×512 pixels;
(1-3) converting all labeled maps into black-and-white maps using a binarization algorithm;
(1-4) dividing and saving the preprocessed image data.
3. The method for segmenting a material image based on multi-dimensional feature fusion and graph attention according to claim 1, characterized in that in the step (2), a multi-dimensional feature fusion based graph attention module is constructed: a graph encoder is adopted to construct a graph structure from the feature map; the graph attention module is constructed from graph convolution and graph attention; and a graph decoder is adopted to restore the graph structure into a feature map; wherein the design and construction of the graph encoder comprise the following steps:
(2-1-1) taking the feature map output by the encoder part and adjusting its dimension from C×H×W to C×HW;
(2-1-2) dividing the feature map into H×W nodes, where the feature dimension of each node is 1×C;
(2-1-3) establishing connections between nodes in a four-neighborhood manner, i.e., establishing edge connections between each central node and its four nearest neighbors above, below, left and right;
(2-1-4) establishing an adjacency matrix from the graph structure to describe the connections between nodes;
(2-1-5) saving the established graph structure as the node features and the adjacency matrix.
4. The method for segmenting a material image based on multi-dimensional feature fusion graph attention according to claim 3, wherein the graph attention module fuses graph convolution and graph attention, and the design and construction of the module comprise the following steps:
(2-2-1) taking the node feature matrix and the adjacency matrix as the input of the graph attention module;
(2-2-2) applying one graph attention layer with multi-head attention to the input node features (1×C), learning attention weights and outputting updated node features;
(2-2-3) applying one graph convolution layer to the input node features (1×C) for local aggregation and feature dimension reduction, reducing the input node feature dimension to 1/2×C;
(2-2-4) applying one graph attention layer with multi-head attention to the input node features (1/2×C), learning attention weights and outputting updated node features;
(2-2-5) applying one graph convolution layer to the input node features (1/2×C) for local aggregation and feature dimension reduction, reducing the node feature dimension output by the previous graph convolution layer to 1/4×C;
(2-2-6) applying one graph attention layer with multi-head attention to the input node features (1/4×C), learning attention weights and outputting updated node features;
(2-2-7) applying one graph convolution layer to the input node features (1/4×C) for local aggregation and feature dimension raising, increasing the node feature dimension output by the previous graph convolution layer to 1/2×C;
(2-2-8) fusing, by addition, the output of the graph attention layer whose node feature dimension is 1/2×C with the output of the graph convolution dimension-raising operation whose node feature dimension is 1/2×C;
(2-2-9) applying one graph convolution layer to the input node features (1/2×C) for local aggregation and feature dimension raising, increasing the node feature dimension output by the previous graph convolution layer to 1×C;
(2-2-10) fusing, by addition, the output of the graph attention layer whose node feature dimension is 1×C with the output of the graph convolution dimension-raising operation whose node feature dimension is 1×C;
(2-2-11) adopting the hyper-parameter a to fuse the features of the different branches in a user-defined ratio.
5. The method for segmenting a material image based on multi-dimensional feature fusion graph attention as claimed in claim 3, wherein the design of the graph decoder comprises the following steps:
(2-3-1) adjusting the output dimension of the graph attention module from C×HW to C×H×W;
(2-3-2) converting the dimension-adjusted node features into the feature map input to the decoder.
6. The method for segmenting a material image based on multi-dimensional feature fusion graph attention according to claim 4, characterized in that the hyper-parameter a in the graph attention module utilizes the following formula:

$$\hat{H}^{l+1} = (1-a)\,H^{l+1} + a\,H_{\mathrm{GAT}}$$

where $\hat{H}^{l+1}$ is the fusion of the output of the graph attention layer and the output of the graph convolution dimension-raising operation, $H^{l+1}$ is the output feature of the $(l+1)$-th graph convolution layer, $a$ is the hyper-parameter, and $H_{\mathrm{GAT}}$ is the output of the graph attention layer.
7. The method for segmenting a material image based on multi-dimensional feature fusion and graph attention according to claim 1, wherein in the step (2), the training of the graph attention based network model comprises the following steps:
(2-4-1) inputting the training set images into the network;
(2-4-2) optimizing the network model parameters using cross-entropy loss;
(2-4-3) saving the trained network parameter file.
8. The method for segmenting a material image based on multi-dimensional feature fusion graph attention as claimed in claim 1, wherein in the step (3), the prediction of the material image comprises the following steps:
(3-1) loading the trained model parameter file;
(3-2) inputting the image data into the network to obtain the predicted segmentation result;
(3-3) locally storing the original test set sample image and the probability map in the same picture.
CN202210318948.6A 2022-03-29 2022-03-29 Material image segmentation method based on multi-dimensional feature fusion and graph attention Pending CN114708431A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210318948.6A CN114708431A (en) 2022-03-29 Material image segmentation method based on multi-dimensional feature fusion and graph attention


Publications (1)

Publication Number Publication Date
CN114708431A (en) 2022-07-05

Family

ID=82170986

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210318948.6A Pending CN114708431A (en) 2022-03-29 2022-03-29 Material image segmentation method based on multi-dimensional feature fusion and graph attention

Country Status (1)

Country Link
CN (1) CN114708431A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115984296A (en) * 2023-03-21 2023-04-18 译企科技(成都)有限公司 Medical image segmentation method and system applying multi-attention mechanism
CN115984296B (en) * 2023-03-21 2023-06-13 译企科技(成都)有限公司 Medical image segmentation method and system applying multi-attention mechanism


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination