CN114708431A - Material image segmentation method based on multi-dimensional feature fusion and graph attention - Google Patents

Material image segmentation method based on multi-dimensional feature fusion and graph attention

Info

Publication number
CN114708431A
CN114708431A (application CN202210318948.6A)
Authority
CN
China
Prior art keywords
graph
attention
node
image
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210318948.6A
Other languages
Chinese (zh)
Inventor
韩越兴
魏惠姗
王冰
陈侨川
钱权
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Shanghai for Science and Technology
Original Assignee
University of Shanghai for Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Shanghai for Science and Technology filed Critical University of Shanghai for Science and Technology
Priority to CN202210318948.6A
Publication of CN114708431A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/253 Fusion techniques of extracted features
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/08 Learning methods


Abstract

The invention discloses a material image segmentation method based on multi-dimensional feature fusion and graph attention, applicable to image segmentation in the field of materials science. The method first preprocesses the material images used for training, then constructs a multi-dimensional feature fusion graph attention network, optimizes the parameters of the network with a cross-entropy loss, and uses the trained model to predict segmentations of material images; finally, the output material image processing results are saved. By integrating the multi-dimensional feature fusion graph attention module into an encoder-decoder network, the invention improves the segmentation precision of the network on material images, reduces the time and labor costs of material image processing, and promotes progress and development in the corresponding academic and industrial communities.

Description

Material image segmentation method based on multi-dimensional feature fusion and graph attention
Technical Field
The invention relates to the field of two-dimensional image analysis and processing in computer vision, and provides a material image segmentation method based on multi-dimensional feature fusion and graph attention for two-dimensional image data. The method can be applied to image segmentation in the field of materials science; it improves image segmentation precision, reduces the time and labor costs of image processing, and promotes progress and development in the corresponding academic and industrial communities.
Background
Image semantic segmentation is a problem of broad interest in image processing and related fields. Semantic segmentation separates the different objects in a picture at the pixel level, labeling each pixel of the original image and classifying it into one of several categories; segmentation precision therefore depends on understanding the information in the picture. Material images are typically taken by advanced electron microscopes and are single-channel images, such as grayscale images. A grayscale image is characterized by low contrast between regions: it displays black, white and gray according to the brightness presented by the material structure. Material structures exhibit diverse shapes, small differences in texture, and discontinuous or even blurred boundaries. Therefore, how to use artificial intelligence to perform semantic segmentation of material images quickly and accurately and to extract useful information from them is one of the challenges in the field of computer vision.
There are many methods for image semantic segmentation, among which neural-network-based semantic segmentation is currently one of the most active research topics, with many published results. The FCN (Fully Convolutional Network) is a classic framework for image semantic segmentation: it is trained end to end and adapts a trained classification network for semantic segmentation; to restore the resolution of the image, the FCN upsamples using deconvolution. Compared with the FCN, U-Net has a more symmetric encoder-decoder structure, and its skip connections from the encoder to the decoder help recover position information. However, because the basic module of the network is a simple convolution block, it suffers from a degree of gradient vanishing, which limits how deep the network can be made; in addition, U-Net does not fully consider the relations between pixels and lacks exploration of the dependencies among local features, which affects the accuracy of the final segmentation result. Constructing a deeper and more effective network structure and optimizing the network to explore more features can therefore be considered the key to improving semantic segmentation precision.
Disclosure of Invention
In order to solve the problems in the prior art and overcome its defects, the invention designs a material image segmentation method based on multi-dimensional feature fusion graph attention, which strengthens the network's exploration of local image features and realizes high-precision segmentation of material images.
In order to achieve the purpose of the invention, the invention adopts the following technical scheme:
a material image segmentation method based on multi-dimensional feature fusion and attention drawing comprises the following steps:
(1) image preprocessing:
respectively adjusting the original image and the labeled graph for training into uniform specifications, and storing the preprocessed image locally;
(2) constructing a network model based on graph attention:
inputting training set data into a network, optimizing model parameters of the network by using cross entropy loss, and storing a trained network parameter file;
(3) and (3) performing image prediction segmentation:
loading a trained model parameter file, inputting test set data into a network, obtaining a predicted segmentation result, and representing the segmentation result by a binary graph;
(4) saving and outputting an image processing result:
and (4) storing the original image of the test set sample and the segmentation result image in the same image.
Preferably, in the step (1), the image preprocessing comprises the following steps (a code sketch follows the steps):
(1-1) cutting away the part of the original image that describes material performance data;
(1-2) uniformly adjusting the images to 512×512 pixels;
(1-3) converting all labeled maps into black-and-white maps using a binarization algorithm;
(1-4) dividing and saving the preprocessed image data.
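The following is a minimal sketch of steps (1-1) to (1-4), assuming Python with the Pillow and NumPy libraries; the crop box, binarization threshold, file paths and function name are illustrative placeholders, not values fixed by the invention.

```python
import os
import numpy as np
from PIL import Image

def preprocess_pair(image_path, label_path, out_dir, crop_box=None, threshold=128):
    """Crop away the description banner, resize to 512x512, binarize the label,
    and save the pair locally. crop_box and threshold are placeholders."""
    image = Image.open(image_path).convert("L")        # material images are grayscale
    label = Image.open(label_path).convert("L")

    if crop_box is not None:                           # (1-1) cut away the data banner
        image = image.crop(crop_box)
        label = label.crop(crop_box)

    image = image.resize((512, 512), Image.BILINEAR)   # (1-2) uniform 512x512 pixels
    label = label.resize((512, 512), Image.NEAREST)    # nearest keeps labels discrete

    # (1-3) binarization: pixels at or above the threshold become foreground (255)
    label = Image.fromarray((np.asarray(label) >= threshold).astype(np.uint8) * 255)

    # (1-4) divide and save the preprocessed pair
    os.makedirs(out_dir, exist_ok=True)
    stem = os.path.splitext(os.path.basename(image_path))[0]
    image.save(os.path.join(out_dir, stem + "_img.png"))
    label.save(os.path.join(out_dir, stem + "_lbl.png"))
```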
Preferably, in the step (2), the multi-dimensional feature fusion based graph attention module comprises three sub-modules: (a) a graph encoder module, (b) a graph attention module, and (c) a graph decoder module. When constructing the graph attention based network model, a graph encoder is adopted to construct a graph structure from the feature map; the graph attention module is constructed from graph convolution and graph attention; and a graph decoder is adopted to restore the graph structure into a feature map. The design and construction of the graph encoder comprise the following steps (a minimal code sketch follows the steps):
(2-1-1) taking the feature map output by the encoder part and adjusting its dimension from C×H×W to C×HW;
(2-1-2) dividing the feature map into H×W nodes, where the feature dimension of each node is 1×C;
(2-1-3) establishing connections between nodes in a four-neighborhood manner, i.e., establishing edge connections between each central node and its four nearest neighbors above, below, left and right;
(2-1-4) establishing an adjacency matrix from the graph structure to describe the connections between nodes;
(2-1-5) saving the established graph structure as the node features and the adjacency matrix.
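As a concrete illustration of steps (2-1-1) to (2-1-5), the sketch below builds the node feature matrix and the four-neighborhood adjacency matrix from an encoder feature map; it assumes PyTorch, and the function name is a placeholder.

```python
import torch

def build_graph(feature_map):
    """feature_map: (C, H, W) encoder output.
    Returns node features (HW, C) and adjacency matrix (HW, HW)."""
    C, H, W = feature_map.shape
    # (2-1-1)/(2-1-2): C x H x W -> HW nodes, each with a 1 x C feature
    nodes = feature_map.reshape(C, H * W).t()          # (HW, C)

    # (2-1-3)/(2-1-4): four-neighborhood edges -> adjacency matrix
    adj = torch.zeros(H * W, H * W)
    for r in range(H):
        for c in range(W):
            i = r * W + c
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):  # up/down/left/right
                rr, cc = r + dr, c + dc
                if 0 <= rr < H and 0 <= cc < W:
                    adj[i, rr * W + cc] = 1.0
    return nodes, adj                                  # (2-1-5) saved graph structure
```

A dense adjacency matrix is used here to match the formulas given later; for large feature maps a sparse edge-list representation would scale better.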
Preferably, the graph attention module fuses graph convolution and graph attention, and the design and construction of the module comprise the following steps (a sketch of the complete module follows the formulas below):
(2-2-1) taking the node feature matrix and the adjacency matrix as the input of the graph attention module;
(2-2-2) applying one graph attention layer with multi-head attention to the input node features (1×C), learning attention weights and outputting updated node features;
(2-2-3) applying one graph convolution layer to the input node features (1×C) for local aggregation and feature dimension reduction, reducing the input node feature dimension to 1/2×C;
(2-2-4) applying one graph attention layer with multi-head attention to the input node features (1/2×C), learning attention weights and outputting updated node features;
(2-2-5) applying one graph convolution layer to the input node features (1/2×C) for local aggregation and feature dimension reduction, reducing the node feature dimension output by the previous graph convolution layer to 1/4×C;
(2-2-6) applying one graph attention layer with multi-head attention to the input node features (1/4×C), learning attention weights and outputting updated node features;
(2-2-7) applying one graph convolution layer to the input node features (1/4×C) for local aggregation and feature dimension raising, increasing the node feature dimension output by the previous graph convolution layer to 1/2×C;
(2-2-8) fusing, by addition, the output of the graph attention layer whose node feature dimension is 1/2×C with the output of the graph convolution dimension-raising operation whose node feature dimension is 1/2×C;
(2-2-9) applying one graph convolution layer to the input node features (1/2×C) for local aggregation and feature dimension raising, increasing the node feature dimension output by the previous graph convolution layer to 1×C;
(2-2-10) fusing, by addition, the output of the graph attention layer whose node feature dimension is 1×C with the output of the graph convolution dimension-raising operation whose node feature dimension is 1×C;
(2-2-11) adopting the hyper-parameter a to fuse the features of the different branches in a user-defined ratio.
Preferably, a Resize function is used to construct the graph decoder module, and the design of the graph decoder comprises the following steps (a short sketch follows):
(2-3-1) adjusting the output dimension of the graph attention module from C×HW to C×H×W;
(2-3-2) converting the dimension-adjusted node features into the feature map input to the decoder.
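The corresponding decoding step is a simple reshape; a sketch under the same PyTorch assumption (function name illustrative):

```python
import torch

def graph_to_feature_map(nodes, H, W):
    """(2-3-1)/(2-3-2): node features (HW, C) -> feature map (C, H, W)
    that is handed to the decoder of the segmentation network."""
    HW, C = nodes.shape
    return nodes.t().reshape(C, H, W)   # C x HW -> C x H x W
```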
Preferably, in the step (2), the graph attention module uses graph convolution and graph attention layers. The graph convolution operation is implemented with the degree matrix, the adjacency matrix and the node features, and is computed as:

$$H^{l+1} = \sigma\left(\tilde{D}^{-\frac{1}{2}}\,\tilde{A}\,\tilde{D}^{-\frac{1}{2}}\,H^{l}\,W^{l}\right)$$

where $H^{l+1}$ is the output of the graph convolution layer, $W^{l}$ is a weight matrix, $H^{l}$ is the node feature matrix, $\tilde{D}$ is the degree matrix, $\tilde{A} = A + I$ is the adjacency matrix plus an identity matrix, and $\sigma$ is the activation function.
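A minimal sketch of this propagation rule, assuming PyTorch; the class name is illustrative and the choice of ReLU for σ is an assumption.

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One graph convolution: H_out = sigma(D^-1/2 (A + I) D^-1/2 H W)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.weight = nn.Linear(in_dim, out_dim, bias=False)

    def forward(self, h, adj):
        a_tilde = adj + torch.eye(adj.size(0), device=adj.device)  # A + I
        deg = a_tilde.sum(dim=1)                                   # node degrees
        d_inv_sqrt = torch.diag(deg.pow(-0.5))                     # D^-1/2
        a_norm = d_inv_sqrt @ a_tilde @ d_inv_sqrt                 # symmetric norm
        return torch.relu(a_norm @ self.weight(h))                 # sigma(...)
```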
Preferably, the weight coefficients in the graph attention operation are computed with the following formula:

$$\alpha_{ij} = \operatorname{softmax}_{j}\left(\sigma\left(\vec{a}^{\,T}\left[W\vec{h}_{i}\,\Vert\,W\vec{h}_{j}\right]\right)\right)$$

where $\alpha_{ij}$ is the attention coefficient, $W$ is the weight matrix, $\vec{h}_{i}$ is the feature vector of node $i$, $\vec{h}_{j}$ is the feature vector of node $j$, $\vec{a}$ is the weight vector, $\sigma$ is the activation function, and softmax is a specific activation function that normalizes the coefficients over the neighbors $j$ of node $i$.
Preferably, the multi-head attention mechanism in the graph attention operation utilizes the following formula:

$$\vec{h}_{i}' = \sigma\left(\frac{1}{K}\sum_{k=1}^{K}\sum_{j\in\mathcal{N}_{i}}\alpha_{ij}^{k}\,W^{k}\,\vec{h}_{j}\right)$$

where $\vec{h}_{i}'$ is the output feature vector of node $i$, $\vec{h}_{j}$ is the feature vector of node $j$, $W^{k}$ is the weight matrix of the $k$-th attention head, $\alpha_{ij}^{k}$ is the attention coefficient between node $i$ and node $j$ in the $k$-th attention head, $K$ is the number of attention heads, $\mathcal{N}_{i}$ is the neighborhood of node $i$, and $\sigma$ is the activation function.
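The two formulas above can be sketched as a single layer as follows, again assuming PyTorch; LeakyReLU for the coefficient activation and head averaging are assumptions consistent with the formulas, not details fixed by the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GATLayer(nn.Module):
    """Multi-head graph attention with head averaging; a sketch, not the
    patent's exact implementation."""
    def __init__(self, in_dim, out_dim, num_heads=4):
        super().__init__()
        self.W = nn.ModuleList([nn.Linear(in_dim, out_dim, bias=False)
                                for _ in range(num_heads)])
        self.a = nn.ParameterList([nn.Parameter(torch.randn(2 * out_dim))
                                   for _ in range(num_heads)])

    def forward(self, h, adj):
        outs = []
        for W, a in zip(self.W, self.a):
            wh = W(h)                                          # (N, out_dim) = W h
            d = wh.size(1)
            # e_ij = sigma(a^T [W h_i || W h_j]), computed for all pairs at once
            e = F.leaky_relu(wh @ a[:d].unsqueeze(1) + (wh @ a[d:].unsqueeze(1)).t())
            # keep only real edges; every node of the 4-neighborhood grid has >= 2
            e = e.masked_fill(adj == 0, float("-inf"))
            alpha = torch.softmax(e, dim=1)                    # alpha_ij over j
            outs.append(alpha @ wh)                            # sum_j alpha_ij W h_j
        return torch.relu(torch.stack(outs).mean(dim=0))       # sigma(mean of K heads)
```

Masking with the adjacency matrix restricts attention to the four-neighborhood built by the graph encoder, which is what makes the aggregation local.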
Preferably, the hyper-parameter a in the graph attention module utilizes the following formula:

$$\hat{H}^{l+1} = (1-a)\,H^{l+1} + a\,H_{\mathrm{GAT}}$$

where $\hat{H}^{l+1}$ is the fusion of the output of the graph attention layer and the output of the graph convolution dimension-raising operation, $H^{l+1}$ is the output feature of the $(l+1)$-th graph convolution layer, $a$ is the hyper-parameter, and $H_{\mathrm{GAT}}$ is the output of the graph attention layer.
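Putting the pieces together, the following sketch composes the branch structure of steps (2-2-1) to (2-2-11) and FIG. 4, reusing the GCNLayer and GATLayer sketches above; the channel schedule C → C/2 → C/4 → C/2 → C and the a-weighted fusion follow the description, while the class name, head count and default value of a are illustrative (C is assumed divisible by 4).

```python
import torch.nn as nn

class GraphAttentionModule(nn.Module):
    """Sketch of the multi-dimensional feature fusion graph attention module;
    GCNLayer and GATLayer are the sketches defined above."""
    def __init__(self, C, a=0.5, heads=4):
        super().__init__()
        self.a = a                                     # hyper-parameter a
        self.gat1 = GATLayer(C, C, heads)              # (2-2-2) attention at 1 x C
        self.gcn1 = GCNLayer(C, C // 2)                # (2-2-3) reduce to 1/2 x C
        self.gat2 = GATLayer(C // 2, C // 2, heads)    # (2-2-4)
        self.gcn2 = GCNLayer(C // 2, C // 4)           # (2-2-5) reduce to 1/4 x C
        self.gat3 = GATLayer(C // 4, C // 4, heads)    # (2-2-6)
        self.gcn3 = GCNLayer(C // 4, C // 2)           # (2-2-7) raise to 1/2 x C
        self.gcn4 = GCNLayer(C // 2, C)                # (2-2-9) raise to 1 x C

    def forward(self, h, adj):
        g1 = self.gat1(h, adj)                         # branch kept at 1 x C
        x = self.gcn1(h, adj)
        g2 = self.gat2(x, adj)                         # branch kept at 1/2 x C
        x = self.gcn2(x, adj)
        x = self.gat3(x, adj)
        x = self.gcn3(x, adj)                          # back to 1/2 x C
        x = (1 - self.a) * x + self.a * g2             # (2-2-8) fuse at 1/2 x C
        x = self.gcn4(x, adj)                          # back to 1 x C
        return (1 - self.a) * x + self.a * g1          # (2-2-10)/(2-2-11)
```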
Preferably, in the step (2), the number of training epochs is set to 100 when training the network model; usually no more than 75 epochs are needed, i.e., the network parameters converge to near their optimal values within that budget. The network training comprises the following steps (a training-loop sketch follows):
(2-4-1) inputting the training set images into the network;
(2-4-2) optimizing the network parameters with the Adam first-order optimization algorithm, iteratively updating the neural network weights from the training data, and setting a weight decay coefficient to reduce model overfitting;
(2-4-3) to further improve network performance, setting a learning rate and adopting a scheme that dynamically reduces it to approach the optimal network parameters: when the loss does not decrease within a certain number of epochs, the learning rate lr is multiplied by a decay factor; the trained model parameter file is then saved.
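A minimal sketch of this training schedule, assuming PyTorch; the model, data loader, learning rate, weight decay, decay factor and patience values are placeholders, since the description fixes only the epoch budget, the Adam optimizer with weight decay, the cross-entropy loss, and the loss-plateau decay scheme.

```python
import torch
import torch.nn as nn

def train(model, train_loader, epochs=100, lr=1e-3, weight_decay=1e-4):
    criterion = nn.CrossEntropyLoss()                        # cross-entropy loss
    optimizer = torch.optim.Adam(model.parameters(), lr=lr,
                                 weight_decay=weight_decay)  # (2-4-2)
    # (2-4-3): multiply lr by a decay factor when the loss stops decreasing
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
        optimizer, mode="min", factor=0.5, patience=5)

    for epoch in range(epochs):                              # epoch budget of 100
        model.train()
        total = 0.0
        for images, labels in train_loader:                  # labels: LongTensor masks
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
            total += loss.item()
        scheduler.step(total / len(train_loader))            # epoch-mean loss
    torch.save(model.state_dict(), "model_params.pth")       # save parameter file
```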
Preferably, in the step (3), the prediction of the material image comprises the following steps (an inference sketch follows):
(3-1) loading the trained model parameter file;
(3-2) inputting the image data into the network to obtain the predicted segmentation result;
(3-3) locally storing the original image of the test set sample and the segmentation result image in the same image.
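A sketch of this prediction step under the same PyTorch and Pillow assumptions; the function name and parameter-file path are illustrative.

```python
import torch
from PIL import Image

def predict_and_save(model, image_tensor, original, out_path,
                     param_file="model_params.pth"):
    """Load trained parameters, predict a binary segmentation, and save the
    original and the result side by side in one image."""
    model.load_state_dict(torch.load(param_file))            # (3-1)
    model.eval()
    with torch.no_grad():
        logits = model(image_tensor.unsqueeze(0))            # (3-2) 1 x K x H x W
        # binary task: class index 0/1 scaled to a 0/255 binarized map
        mask = logits.argmax(dim=1).squeeze(0).to(torch.uint8) * 255

    result = Image.fromarray(mask.cpu().numpy())             # segmentation result
    combined = Image.new("L", (original.width + result.width, original.height))
    combined.paste(original, (0, 0))                         # (3-3) same image:
    combined.paste(result, (original.width, 0))              # original | prediction
    combined.save(out_path)
```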
Compared with the prior art, the invention has the following obvious and prominent substantive features and remarkable advantages:
1. The invention is based on a multi-dimensional feature fusion graph attention network and can be applied to image segmentation in the field of materials science to improve the segmentation effect. By combining the graph attention module, which operates on graph structures, with skip connections and migrating it to data structures in Euclidean space, the network is deepened without changing the size of the feature map, alleviating the loss of spatial information. Coarse features are learned through graph structures representing low-resolution feature maps, fine detail features through graph structures representing high-resolution feature maps, and graph structures of different resolutions are fused to learn and represent the images accurately, thereby realizing high-precision segmentation of the images.
2. Considering that graph convolution layers should not be stacked too deep, the invention proposes a hyper-parameter a to control how each graph convolution layer propagates and aggregates messages to the global node features, appropriately deepening the number of graph convolution layers, improving the fitting capability of the network, and improving the accuracy of semantic segmentation.
Drawings
FIG. 1 is a block diagram of the operational flow of the present invention.
FIG. 2 is a flow chart of the preprocessing method of the present invention, comprising the following steps: (1) cutting the image data into image blocks containing only material structures; (2) uniformly adjusting all images to 512×512 pixels; (3) judging whether each labeled image is a black-and-white image, and converting any non-black-and-white image with a binarization algorithm; (4) dividing and saving the preprocessed image data.
FIG. 3 is a flow chart of segmenting an image according to the present invention, comprising the following steps: (1) inputting image data, cutting the original images for training and testing into image blocks containing only material structures, preprocessing the image blocks, and storing the preprocessed data locally; (2) putting the training set images into the encoder of the multi-dimensional feature fusion based graph attention network to obtain a feature map representing high-level features; (3) putting the high-level feature map into the graph encoder to obtain the adjacency matrix and node features representing the graph structure; (4) putting the adjacency matrix and node features of the graph structure into the graph attention module for node feature aggregation and updating; (5) putting the updated graph structure into the graph decoder to restore it to the form of a feature map; (6) putting the feature map into the decoder of the network; (7) optimizing the model parameters of the network with cross-entropy loss and saving the trained network parameter file; (8) loading the trained model parameter file and inputting the test set data into the network; (9) computing the predicted segmentation result, represented by a binarized map; (10) outputting and saving the image result.
FIG. 4 is a structural diagram of the graph attention module of the present invention, comprising the following steps: (1) inputting the adjacency matrix and node feature matrix representing the graph structure; (2) feeding the input features into the graph attention layer (GAT1) and the graph convolution layer (GCN1), which output node features; (3) feeding the node features output by GCN1 into a graph attention layer (GAT2) and a graph convolution layer (GCN2), which output new node features; (4) feeding the node features output by GCN2 into the graph attention layer (GAT3); (5) feeding the node features output by GAT3 into the graph convolution layer (GCN3); (6) adding the features output by GCN3 multiplied by (1-a) to the node features output by GAT2 multiplied by a; (7) feeding the summed features into the graph convolution layer (GCN); (8) adding the features output by the GCN multiplied by (1-a) to the node features output by GAT1 multiplied by a, outputting the final node features.
FIG. 5 is a flow chart of the graph encoder of the present invention, comprising the following steps: (1) dividing the input feature map into nodes and node features; (2) connecting adjacent nodes in a four-neighborhood manner; (3) constructing the adjacency matrix of the graph structure; (4) saving the adjacency matrix and node features of the graph structure.
Detailed Description
In order to make the technical solution of the present invention better understood, preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. All other embodiments obtained by a person skilled in the art from the embodiments given herein without any inventive step shall fall within the scope of the present invention.
The above-described scheme is further illustrated below with reference to specific embodiments:
the first embodiment is as follows:
in this embodiment, a method for segmenting a material image based on a multi-dimensional feature fusion graph attention is provided, and an efficient multi-dimensional feature fusion graph attention network structure is constructed by the method, so that the segmentation accuracy of a network on image data is improved.
In the method of this embodiment, a material image is used to train a model, so as to obtain model parameters of such data, and further obtain high-precision prediction of similar segmentation data except for a sample, as shown in fig. 1, the method of this embodiment includes the following steps:
(1) image preprocessing: respectively adjusting the original image and the labeled graph for training into uniform specifications, and storing the preprocessed image locally;
(2) constructing a network model based on graph attention: inputting training set data into a network, optimizing model parameters of the network by using cross entropy loss, and storing a trained network parameter file;
(3) and (3) performing image prediction segmentation: loading a trained model parameter file, inputting test set data into a network, and acquiring a predicted segmentation result, wherein the segmentation result is represented by a binary graph;
(4) saving and outputting an image processing result: and (4) storing the original image of the test set sample and the segmentation result image in the same image.
The invention relates to a material image segmentation method based on multi-dimensional feature fusion and drawing attention, which comprises the steps of firstly preprocessing an image for training to obtain a clearer image, and storing the preprocessed data locally; then training a graph attention network on the training data set by using cross entropy loss; and then predicting the test data set by using the trained model, and storing the predicted binarization image result.
Example two
This embodiment is substantially the same as the first embodiment, with the following particular features:
In this embodiment, as shown in FIG. 2, the image preprocessing comprises the following steps:
(1-1) clipping the image data into image blocks of 512×512 pixels;
(1-2) judging whether each labeled image is a black-and-white image, and converting any non-black-and-white image with a binarization algorithm;
(1-3) dividing and saving the preprocessed image data.
In this embodiment, the multi-dimensional feature fusion graph attention module comprises three sub-modules: (a) a graph encoder module, (b) a graph attention module, and (c) a graph decoder module. The graph encoder is constructed by building a graph, and its construction comprises the following steps:
(2-1) taking the feature map output by the encoder part and adjusting its dimension from C×H×W to C×HW;
(2-2) dividing the feature map into H×W nodes, where the feature dimension of each node is 1×C;
(2-3) establishing connections between nodes in a four-neighborhood manner, i.e., establishing edge connections between each central node and its four nearest neighbors above, below, left and right;
(2-4) establishing an adjacency matrix from the graph structure to describe the connections between nodes;
(2-5) saving the established graph structure as the node features and the adjacency matrix;
in this embodiment, the graph attention network module performs feature fusion using a graph volume and a graph attention layer; the graph attention module construction includes the following steps:
(2-6) taking the node feature matrix and the adjacency matrix as the input of the graph attention module;
(2-7) performing multi-head attention on the input node features (1 × C) by adopting one-layer graph attention, and outputting updated node features by learning attention weights;
(2-8) performing local aggregation and feature dimension reduction on the input node features (1 × C) by adopting a layer of graph convolution, wherein the node feature dimension of the input node is reduced to 1/2 × C;
(2-9) performing multi-head attention on the input node features (1/2 × C) by adopting one-layer graph attention, and outputting updated node features by learning attention weights;
(2-10) performing local aggregation and feature dimensionality reduction on the input node features (1/2 xC) by adopting one-layer graph convolution, wherein the node feature dimensionality of the output node feature of the graph convolution layer at the previous layer is reduced to 1/4 xC;
(2-11) performing multi-head attention on the input node features (1/4 × C) by adopting one-layer graph attention, and outputting updated node features by learning attention weights;
(2-12) locally aggregating and feature-increasing the input node features (1/4 × C) by adopting one-layer graph convolution, wherein the node feature dimension of the previous-layer graph convolution layer is increased to 1/2 × C;
(2-13) fusing the output of the graph attention layer with the node characteristic dimension size of 1/2 xC and the output of the graph convolution layer dimensionality increasing operation with the node characteristic dimension size of 1/2 xC in an addition mode;
(2-14) locally aggregating and feature-increasing dimensions of the input node features (1/2 xC) by adopting one-layer graph convolution, wherein the node feature dimensions of the output node features of the convolution layer of the previous layer graph are increased to 1 xC;
(2-15) merging the output of the graph attention layer with the node characteristic dimension size of 1 xC and the output of the graph convolution layer dimensionality increasing operation with the node characteristic dimension size of 1 xC in an addition mode;
(2-16) fusing the characteristics of different branches according to a self-defined ratio by adopting a hyper-parameter a;
in this embodiment, the graph attention module uses a graph convolution and a graph attention layer, and the graph convolution is implemented as follows:
(2-17) the graph convolution operation is realized by adopting the degree matrix, the adjacency matrix and the node characteristics, and the calculation formula is as follows:
Figure BDA0003569778860000081
H1+1is the output of the graph convolution layer, W1Is a weight matrix, H1Is a node characteristic matrix, D is a degree matrix,
Figure BDA0003569778860000082
is a contiguous matrix plus an identity matrix, σ is the activation function;
in this embodiment, the attention-achieving steps are as follows:
(2-18) calculating an attention coefficient by using the node characteristics, wherein the calculation formula is as follows:
Figure BDA0003569778860000083
αijis the attention coefficient, W is the weight matrix,
Figure BDA0003569778860000084
is the feature vector of the node i,
Figure BDA0003569778860000085
is the feature vector of the node j,
Figure BDA0003569778860000086
is a weight vector, σ is an activation function, softmax is a specific activation function;
(2-19) updating the node characteristics by adopting a multi-head attention mechanism, wherein the calculation formula is as follows:
Figure BDA0003569778860000087
Figure BDA0003569778860000088
is the feature vector of the output i-node,
Figure BDA0003569778860000089
is the feature vector of the j node, WkIs the weight matrix for the kth attention head,
Figure BDA00035697788600000810
are the attention coefficients of the i and j nodes in the kth attention head, K is the number of attention heads, and σ is the activation function.
In this embodiment, the graph attention module uses the hyper-parameter a to fuse the features of different branches, implemented as follows:
(2-20) adopting the hyper-parameter a to fuse the features of the different branches in a user-defined ratio, computed as:

$$\hat{H}^{l+1} = (1-a)\,H^{l+1} + a\,H_{\mathrm{GAT}}$$

where $\hat{H}^{l+1}$ is the fusion of the output of the graph attention layer and the output of the graph convolution dimension-raising operation, $H^{l+1}$ is the output feature of the $(l+1)$-th graph convolution layer, $a$ is the hyper-parameter, and $H_{\mathrm{GAT}}$ is the output of the graph attention layer;
in this embodiment, Resize function is used to construct a graph decoder module, and the graph decoder construction includes the following steps:
(2-21) adjusting the output dimension of the image attention module from C × HW to C × H × W;
(2-22) converting the node characteristics after dimension adjustment into a characteristic diagram input by a decoder;
in this embodiment, the iteration time epoch is set to 100 when training the network model, and usually, the iteration time epoch is not greater than 75, i.e., the network parameter can converge to a value near the optimal value, where the network training includes the following steps:
(2-23) inputting the training set images into the network;
(2-24) optimizing network parameters by adopting an Adam first-order optimization algorithm, iteratively updating neural network weights based on training data, and setting weight attenuation coefficients to reduce the problem of model overfitting;
(2-25) in order to further obtain more excellent network performance, setting a learning rate, adopting a scheme of dynamically reducing the learning rate to further approach to the optimal value of the network parameter, multiplying the learning rate lr by an attenuation factor to reduce the learning rate when the loss value does not decrease within a certain epoch, and storing the trained model parameter file.
The method is based on the multi-dimensional feature fusion graph attention network and can be applied to image segmentation in the field of materials science to improve the segmentation effect. The graph attention module, which operates on graph structures, is combined with skip connections and migrated into data structures in Euclidean space, so that the loss of spatial information is alleviated while the network is deepened and the size of the feature map is unchanged. Coarse features are learned through graph structures representing low-resolution feature maps, fine detail features through graph structures representing high-resolution feature maps, and graph structures of different resolutions are fused to learn and represent the images accurately, achieving high-precision segmentation. Considering that graph convolution layers should not be stacked too deep, this embodiment proposes the hyper-parameter a to control how each graph convolution layer propagates and aggregates messages to the global node features, appropriately deepening the number of graph convolution layers to improve the fitting capability of the network and the accuracy of semantic segmentation.
Example three
This embodiment is substantially the same as example two, with the following particular features:
In this embodiment, the predictive segmentation of the two-dimensional image comprises the following steps:
(3-1) loading the trained model parameter file;
(3-2) inputting the image data into the network to obtain the predicted segmentation result;
(3-3) locally storing the original image of the test set sample and the segmentation result image in the same image.
In summary, as shown in FIG. 1, the material image segmentation method based on multi-dimensional feature fusion and graph attention comprises the following steps:
First, the original images used for training are cut into image blocks containing only material structures and preprocessed to obtain clearer image blocks, and the preprocessed data are stored locally; a multi-dimensional feature fusion based graph attention network is constructed, the training set data are input into the network, the model parameters of the network are optimized with cross-entropy loss, and the trained network parameter file is saved; the trained model parameter file is loaded, image data are input into the network, and the predicted segmentation result is obtained, represented by a binarized map; finally, the post-processed image results are output and saved. The method can be applied to image segmentation in the field of materials science and promotes progress and development across subject areas. FIG. 2 is a flow chart of the preprocessing method of this embodiment, comprising the following steps: (1) cutting the image data into image blocks containing only material structures; (2) uniformly adjusting all images to 512×512 pixels; (3) judging whether each labeled image is a black-and-white image, and converting any non-black-and-white image with a binarization algorithm; (4) dividing and saving the preprocessed image data.
FIG. 3 is a flow chart of segmenting an image according to this embodiment, comprising the following steps: (1) inputting image data, cutting the original images for training and testing into image blocks containing only material structures, preprocessing the image blocks, and storing the preprocessed data locally; (2) putting the training set images into the encoder of the multi-dimensional feature fusion based graph attention network to obtain a feature map representing high-level features; (3) putting the high-level feature map into the graph encoder to obtain the adjacency matrix and node features representing the graph structure; (4) putting the adjacency matrix and node features of the graph structure into the graph attention module for node feature aggregation and updating; (5) putting the updated graph structure into the graph decoder to restore it to the form of a feature map; (6) putting the feature map into the decoder of the network; (7) optimizing the model parameters of the network with cross-entropy loss and saving the trained network parameter file; (8) loading the trained model parameter file and inputting the test set data into the network; (9) computing the predicted segmentation result, represented by a binarized map; (10) outputting and saving the image result.
FIG. 4 is a structural diagram of the graph attention module of this embodiment, comprising the following steps: (1) inputting the adjacency matrix and node feature matrix representing the graph structure; (2) feeding the input features into the graph attention layer (GAT1) and the graph convolution layer (GCN1), which output node features; (3) feeding the node features output by GCN1 into a graph attention layer (GAT2) and a graph convolution layer (GCN2), which output new node features; (4) feeding the node features output by GCN2 into the graph attention layer (GAT3); (5) feeding the node features output by GAT3 into the graph convolution layer (GCN3); (6) adding the features output by GCN3 multiplied by (1-a) to the node features output by GAT2 multiplied by a; (7) feeding the summed features into the graph convolution layer (GCN); (8) adding the features output by the GCN multiplied by (1-a) to the node features output by GAT1 multiplied by a, outputting the final node features.
FIG. 5 is a flow chart of the graph encoder of this embodiment, comprising the following steps: (1) dividing the input feature map into nodes and node features; (2) connecting adjacent nodes in a four-neighborhood manner; (3) constructing the adjacency matrix of the graph structure; (4) saving the adjacency matrix and node features of the graph structure.
This embodiment provides a material image segmentation method based on multi-dimensional feature fusion graph attention, applicable to image segmentation in the field of materials science. The method first preprocesses the material images used for training, then constructs a multi-dimensional feature fusion graph attention network, optimizes the parameters of the network with cross-entropy loss, and uses the trained model to predict segmentations of material images; finally, the output material image processing results are saved. In this embodiment, the multi-dimensional feature fusion graph attention module is integrated into the encoder-decoder network, which improves the segmentation precision of the network on material images, reduces the time and labor costs of material image processing, and promotes progress and development in the corresponding academic and industrial communities.
The embodiments of the present invention have been described above with reference to the accompanying drawings, but the present invention is not limited to the above embodiments. Various changes and modifications can be made according to the purpose of the invention, and any changes, modifications, substitutions, combinations or simplifications made according to the spirit and principle of the technical solution of the present invention shall be equivalent substitutions, as long as they meet the purpose of the present invention and do not depart from the technical principle and inventive concept of the material image segmentation method based on multi-dimensional feature fusion graph attention of the present invention.

Claims (8)

1. A method for segmenting a material image based on multi-dimensional feature fusion and graph attention, characterized by comprising the following steps:
(1) image preprocessing:
respectively adjusting the original images and the labeled maps used for training to a uniform specification, and storing the preprocessed images locally;
(2) constructing a graph attention based network model:
inputting the training set data into the network, optimizing the model parameters of the network with cross-entropy loss, and saving the trained network parameter file;
(3) image prediction segmentation:
loading the trained model parameter file, inputting the test set data into the network, and obtaining the predicted segmentation result, represented by a binarized map;
(4) saving and outputting the image processing result:
storing the original image of the test set sample and the segmentation result image in the same image.
2. The method for segmenting a material image based on multi-dimensional feature fusion and graph attention according to claim 1, characterized in that in the step (1), the image preprocessing comprises the following steps:
(1-1) cutting away the part of the original image that describes material performance data;
(1-2) uniformly adjusting the images to 512×512 pixels;
(1-3) converting all labeled maps into black-and-white maps using a binarization algorithm;
(1-4) dividing and saving the preprocessed image data.
3. The method for segmenting a material image based on multi-dimensional feature fusion and graph attention according to claim 1, characterized in that in the step (2), a multi-dimensional feature fusion based graph attention module is constructed: a graph encoder is adopted to construct a graph structure from the feature map; the graph attention module is constructed from graph convolution and graph attention; and a graph decoder is adopted to restore the graph structure into a feature map; wherein the design and construction of the graph encoder comprise the following steps:
(2-1-1) taking the feature map output by the encoder part and adjusting its dimension from C×H×W to C×HW;
(2-1-2) dividing the feature map into H×W nodes, where the feature dimension of each node is 1×C;
(2-1-3) establishing connections between nodes in a four-neighborhood manner, i.e., establishing edge connections between each central node and its four nearest neighbors above, below, left and right;
(2-1-4) establishing an adjacency matrix from the graph structure to describe the connections between nodes;
(2-1-5) saving the established graph structure as the node features and the adjacency matrix.
4. The method for segmenting a material image based on multi-dimensional feature fusion graph attention according to claim 3, wherein the graph attention module fuses graph convolution and graph attention, and the design and construction of the module comprise the following steps:
(2-2-1) taking the node feature matrix and the adjacency matrix as the input of the graph attention module;
(2-2-2) applying one graph attention layer with multi-head attention to the input node features (1×C), learning attention weights and outputting updated node features;
(2-2-3) applying one graph convolution layer to the input node features (1×C) for local aggregation and feature dimension reduction, reducing the input node feature dimension to 1/2×C;
(2-2-4) applying one graph attention layer with multi-head attention to the input node features (1/2×C), learning attention weights and outputting updated node features;
(2-2-5) applying one graph convolution layer to the input node features (1/2×C) for local aggregation and feature dimension reduction, reducing the node feature dimension output by the previous graph convolution layer to 1/4×C;
(2-2-6) applying one graph attention layer with multi-head attention to the input node features (1/4×C), learning attention weights and outputting updated node features;
(2-2-7) applying one graph convolution layer to the input node features (1/4×C) for local aggregation and feature dimension raising, increasing the node feature dimension output by the previous graph convolution layer to 1/2×C;
(2-2-8) fusing, by addition, the output of the graph attention layer whose node feature dimension is 1/2×C with the output of the graph convolution dimension-raising operation whose node feature dimension is 1/2×C;
(2-2-9) applying one graph convolution layer to the input node features (1/2×C) for local aggregation and feature dimension raising, increasing the node feature dimension output by the previous graph convolution layer to 1×C;
(2-2-10) fusing, by addition, the output of the graph attention layer whose node feature dimension is 1×C with the output of the graph convolution dimension-raising operation whose node feature dimension is 1×C;
(2-2-11) adopting the hyper-parameter a to fuse the features of the different branches in a user-defined ratio.
5. The method for segmenting a material image based on multi-dimensional feature fusion graph attention as claimed in claim 3, wherein the design of the graph decoder comprises the following steps:
(2-3-1) adjusting the output dimension of the graph attention module from C×HW to C×H×W;
(2-3-2) converting the dimension-adjusted node features into the feature map input to the decoder.
6. The method for segmenting a material image based on multi-dimensional feature fusion graph attention according to claim 4, characterized in that the hyper-parameter a in the graph attention module utilizes the following formula:

$$\hat{H}^{l+1} = (1-a)\,H^{l+1} + a\,H_{\mathrm{GAT}}$$

where $\hat{H}^{l+1}$ is the fusion of the output of the graph attention layer and the output of the graph convolution dimension-raising operation, $H^{l+1}$ is the output feature of the $(l+1)$-th graph convolution layer, $a$ is the hyper-parameter, and $H_{\mathrm{GAT}}$ is the output of the graph attention layer.
7. The method for segmenting a material image based on multi-dimensional feature fusion and graph attention according to claim 1, wherein in the step (2), the training of the graph attention based network model comprises the following steps:
(2-4-1) inputting the training set images into the network;
(2-4-2) optimizing the network model parameters using cross-entropy loss;
(2-4-3) saving the trained network parameter file.
8. The method for segmenting a material image based on multi-dimensional feature fusion graph attention as claimed in claim 1, wherein in the step (3), the prediction of the material image comprises the following steps:
(3-1) loading the trained model parameter file;
(3-2) inputting the image data into the network to obtain the predicted segmentation result;
(3-3) locally storing the original test set sample image and the probability map in the same picture.
CN202210318948.6A 2022-03-29 2022-03-29 Material image segmentation method based on multi-dimensional feature fusion and graph attention Pending CN114708431A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210318948.6A CN114708431A (en) 2022-03-29 Material image segmentation method based on multi-dimensional feature fusion and graph attention


Publications (1)

Publication Number Publication Date
CN114708431A (en) 2022-07-05

Family

ID=82170986

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210318948.6A Pending CN114708431A (en) 2022-03-29 2022-03-29 Material image segmentation method based on multi-dimensional feature fusion and graph attention

Country Status (1)

Country Link
CN (1) CN114708431A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115984296A (en) * 2023-03-21 2023-04-18 译企科技(成都)有限公司 Medical image segmentation method and system applying multi-attention mechanism
CN115984296B (en) * 2023-03-21 2023-06-13 译企科技(成都)有限公司 Medical image segmentation method and system applying multi-attention mechanism


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination