CN115761240A - Image semantic segmentation method and device for neural network of chaotic back propagation map - Google Patents

Info

Publication number: CN115761240A
Application number: CN202310031394.6A
Authority: CN (China)
Prior art keywords: neural network, chaotic, graph, node, regions
Legal status: Granted, active (the legal status is an assumption and is not a legal conclusion)
Other languages: Chinese (zh)
Other versions: CN115761240B (en)
Inventors: 陶鹏, 陈洛南
Current Assignee: Hangzhou Institute of Advanced Studies of UCAS
Original Assignee: Hangzhou Institute of Advanced Studies of UCAS
Priority: CN202310031394.6A
Granted publication: CN115761240B

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00: Road transport of goods or passengers
    • Y02T10/10: Internal combustion engine [ICE] based vehicles
    • Y02T10/40: Engine management systems

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses an image semantic segmentation method and device based on a chaotic back-propagation graph neural network. The method comprises the following steps: step S1: pre-dividing an input image into a plurality of regions and converting the regions into a region adjacency graph; step S2: using the region adjacency graph as the input graph of a maximum cut problem; step S3: constructing a graph convolutional neural network comprising an embedding layer and a plurality of graph convolution layers; step S4: training the graph convolutional neural network with a chaotic back-propagation algorithm; step S5: after training, projecting the output to obtain a solution of the combinatorial optimization problem. The invention models semantic segmentation as the combinatorial optimization problem of maximum cut, requires no label data for the image, and can be adjusted flexibly between the conflicting requirements of sharp contours and accurate semantic classification to suit different semantic segmentation tasks.

Description

Image semantic segmentation method and device for neural network of chaotic back propagation map
Technical Field
The invention relates to the field of computer vision (image processing), and in particular to an image semantic segmentation method and device based on a chaotic back-propagation graph neural network.
Background
Image semantic segmentation is an important part of image processing and image understanding in machine vision. It is a typical hot topic in computer vision and also a difficult one; it is the first step of image processing and aims to separate an object from the background. Semantic segmentation generally divides an image into several mutually disjoint regions according to features such as gray level, color, spatial texture and geometric shape, so that these features are consistent or similar within a region and clearly different between regions. Long-term research has produced many results and methods. Conventional methods include threshold-based, region-based and edge-detection-based segmentation. In recent years deep learning methods have become mainstream, such as VGGNet and ResNet based on feature encoding, R-CNN based on region selection, and FCN based on deconvolution. However, because they depend on the BP algorithm and its variants, deep learning methods have drawbacks such as the local minimum problem and the need for labeled data. Furthermore, an important difficulty of semantic segmentation compared with whole-image classification or object detection is that the task requires correct classification of high-level semantic features on the one hand and, on the other hand, contours that fit the real boundaries as closely as possible, aligned at the pixel level. These two requirements conflict with each other, and meeting both requires improvements to existing models.
On the one hand, a combinatorial optimization problem (COP) is the problem of finding the "best" object in a finite set of objects, where "best" means the highest evaluation score or the lowest cost; the object may be a set, an arrangement or a graph. Recently, graph neural networks (GNNs) have been used to solve large-scale combinatorial optimization problems by converting the objective function into a differentiable loss function, which greatly improves the solving speed. Image semantic segmentation can therefore be converted into a maximum cut problem and solved with a graph neural network to perform semantic classification and recognition. On the other hand, a brain-inspired chaotic back-propagation (CBP) algorithm has recently been proposed and used to train conventional multi-layer perceptrons (MLPs); compared with the conventional BP algorithm and its variants (e.g., SGDM and Adam), it significantly improves the final optimization and generalization performance of the network model.
The invention therefore converts image semantic segmentation into a maximum cut problem solved with a graph neural network, and proposes a new CGBP algorithm that extends the original CBP algorithm for MLPs to graph neural networks. This improves the performance of the network model, yields better feasible solutions of the combinatorial optimization problem, and segments the image with higher quality.
Disclosure of Invention
To remedy the defects of the prior art, the invention provides a semantic segmentation method based on a chaotic back-propagation graph neural network.
To solve the above technical problems, the semantic segmentation method based on a chaotic back-propagation graph neural network provided by the invention comprises the following steps:
step S1: pre-dividing an input image into a plurality of regions and converting the regions into a region adjacency graph, wherein the regions are the nodes of the graph, each node value is the average of all pixels in the region, two regions are connected by an edge if they contain adjacent pixels, and the weight of the edge is the Euclidean distance between the two node values;
step S2: using the region adjacency graph, including its nodes, connection matrix and edge weights, as the input graph of the maximum cut problem, and constructing the corresponding coding matrix Q, wherein $Q_{ij}$ is the weight of the edge between node i and node j and $Q_{ii}$ is the negative of the sum of the weights of the edges connected to node i, the objective function to be optimized being the Hamiltonian
$H = \sum_{i,j} x_i Q_{ij} x_j$ (1);
step S3: constructing a graph convolutional neural network comprising an embedding layer and a plurality of graph convolution layers, and converting the Hamiltonian into a differentiable loss function
$L(\theta) = \sum_{i,j} p_i(\theta)\, Q_{ij}\, p_j(\theta)$ (3);
step S4: training the graph convolutional neural network with a chaotic back-propagation algorithm, wherein the loss function for training comprises the first loss function and a chaotic loss function,
$L_{\mathrm{total}} = L(\theta) + L_{\mathrm{chaos}}$ (5);
step S5: after training, projecting the output to obtain a solution of the combinatorial optimization problem and the corresponding image segmentation result.
The invention also provides an image segmentation device based on the chaotic back-propagation graph neural network, comprising a preprocessing unit, an objective function unit, a graph convolutional neural network unit, a chaotic back-propagation algorithm unit and an output unit:
the preprocessing unit is used for pre-dividing an input image into a plurality of regions and converting the regions into a region adjacency graph, wherein the regions are the nodes of the graph, each node value is the average of all pixels in the region, two regions are connected by an edge if they contain adjacent pixels, and the weight of the edge is the Euclidean distance between the two node values;
the objective function unit is used for taking the region adjacency graph, including its nodes, connection matrix and edge weights, as the input graph of the maximum cut problem, and for constructing the corresponding coding matrix Q, wherein $Q_{ij}$ is the weight of the edge between node i and node j and $Q_{ii}$ is the negative of the sum of the weights of the edges connected to node i, the resulting objective function to be optimized being the Hamiltonian
$H = \sum_{i,j} x_i Q_{ij} x_j$ (1);
the graph convolutional neural network unit comprises an embedding layer and a plurality of graph convolution layers and converts the Hamiltonian of the objective function unit into a differentiable loss function
$L(\theta) = \sum_{i,j} p_i(\theta)\, Q_{ij}\, p_j(\theta)$ (3);
the chaotic back-propagation algorithm unit is used for training the graph convolutional neural network unit, wherein the loss function for training comprises the first loss function and a chaotic loss function,
$L_{\mathrm{total}} = L(\theta) + L_{\mathrm{chaos}}$ (5);
and the output unit is used for projecting the output after training to obtain a solution of the combinatorial optimization problem and the corresponding image segmentation result.
Compared with the prior art, the invention has the following advantages:
1. The image is pre-divided into several regions to generate a region adjacency graph; the matrix Q and the Hamiltonian objective function are constructed from this graph; the Hamiltonian is converted into a differentiable loss function; on this basis the proposed graph convolutional neural network is constructed and trained with the CGBP algorithm. Drawing on chaotic dynamics in the real brain, the gradient dynamics of the back-propagation algorithm are converted into hybrid dynamics of gradient and chaos. Owing to the sensitivity to initial values and the ergodicity of chaotic dynamics, the CGBP algorithm samples the weight space of the network more efficiently than random dynamics, which improves the global optimization capability of the neural network. The invention can therefore obtain a better feasible solution of the combinatorial optimization problem and thus segment the image with higher quality.
2. The semantic segmentation problem is modeled as the combinatorial optimization problem of maximum cut, and no label data of the image is required, so the method has a wider application range than conventional deep learning methods. Because an image semantic segmentation task requires both correct classification of high-level semantic features and contours that fit the real boundary as closely as possible, aligned at the pixel level, the method can compute, after pre-segmentation, the mean and variance of the pixels in each region, segment again any region whose variance exceeds a threshold, and form a new region adjacency graph.
3. The invention pre-divides the image with the watershed algorithm, which effectively filters out noise in the background for subsequent processing. In particular, compared with conventional threshold-based segmentation methods, the method identifies the target object in the image more reliably.
Drawings
FIG. 1 is a schematic diagram of the chaotic back-propagation algorithm;
FIG. 2 is a schematic diagram of the semantic segmentation method based on the chaotic back-propagation graph neural network;
FIG. 3 shows how the segmentation result changes while the neural network is trained with the CGBP algorithm;
FIG. 4 shows the results of the method on some example pictures.
Detailed Description
In order to facilitate understanding of the technical solutions of the present invention, the following detailed description is made with reference to the accompanying drawings and specific embodiments.
The drawbacks of the above prior-art solutions were identified by the applicant through practice and careful study. The discovery of these problems, and the solutions proposed for them in the following embodiments, should therefore be regarded as contributions made by the applicant in the course of the present application.
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The present invention will be further described with reference to the following detailed description so that the technical means, the creation features, the achievement purposes and the effects of the present invention can be easily understood.
Chaotic dynamics of neurons have been discovered in physiological experiments, and on this basis an error back-propagation method based on the chaotic characteristics of the brain (CBP) has been proposed. As shown in FIG. 1, consider an L-layer perceptron in which $x_{ij}$ is the output of the j-th neuron of the i-th layer, $w_{ijk}$ is the weight of $x_{i-1,k}$ when computing $x_{ij}$, and the activation function $\sigma$ is the sigmoid function. The forward computation of the network can be written as
$x_{ij} = \sigma\left(\sum_k w_{ijk}\, x_{i-1,k}\right)$
In the BP algorithm, the loss to be minimized is the BP loss
$L_{\mathrm{BP}}$
and the update of the weight $w_{ijk}$ can be written as
$w_{ijk} \leftarrow w_{ijk} - \eta\, \frac{\partial L_{\mathrm{BP}}}{\partial w_{ijk}}$
where $\eta$ is the learning rate. In the CBP algorithm, the loss also includes a chaotic loss
$L_{\mathrm{chaos}} = -\sum_{i,j} z_{ij}\left[ I_0 \ln x_{ij} + (1 - I_0)\ln(1 - x_{ij}) \right]$
where $I_0$ is a constant between 0 and 1 and $z_{ij}$ is the parameter that controls the chaotic intensity of $w_{ijk}$.
The update of the weight $w_{ijk}$ can then be written as
$w_{ijk} \leftarrow w_{ijk} - \eta\, \frac{\partial L_{\mathrm{BP}}}{\partial w_{ijk}} - \eta\, z_{ij}\,(x_{ij} - I_0)\, x_{i-1,k}$
Since $z_{ij}$ is a tunable parameter, $\eta$ and $x_{i-1,k}$ can both be absorbed into $z_{ij}$, which gives the simplified update formula
$w_{ijk} \leftarrow w_{ijk} - \eta\, \frac{\partial L_{\mathrm{BP}}}{\partial w_{ijk}} - z_{ij}\,(x_{ij} - I_0)$
The last two terms on the right-hand side are the gradient term and the chaotic term, respectively.
The algorithm has global optimization capability, can enable chaos to be generated inside the neural network in the weight learning process, and helps the network model to escape from local minimum in the learning process by virtue of rich and complex dynamics of the chaos.
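The chaotic term in the update can be traced back to the chaotic loss by the chain rule. The short derivation below is a sketch under two assumptions that are not spelled out in the text: the chaotic loss is taken to have a cross-entropy-like form in $x_{ij}$ and $I_0$, and only the direct dependence of $x_{ij}$ on its own pre-activation $a_{ij}$ (a symbol introduced here for illustration) is differentiated.

```latex
% Assumed form of the chaotic loss (an assumption, see lead-in)
L_{\mathrm{chaos}} = -\sum_{i,j} z_{ij}\left[ I_0 \ln x_{ij} + (1 - I_0)\ln(1 - x_{ij}) \right],
\qquad x_{ij} = \sigma(a_{ij}), \quad a_{ij} = \sum_k w_{ijk}\, x_{i-1,k}

% Sigmoid derivative: \sigma'(a) = \sigma(a)\,(1 - \sigma(a))
\frac{\partial L_{\mathrm{chaos}}}{\partial a_{ij}}
  = -z_{ij}\left[ \frac{I_0}{x_{ij}} - \frac{1 - I_0}{1 - x_{ij}} \right] x_{ij}\,(1 - x_{ij})
  = z_{ij}\,(x_{ij} - I_0)

% Chain rule through a_{ij}
\frac{\partial L_{\mathrm{chaos}}}{\partial w_{ijk}}
  = z_{ij}\,(x_{ij} - I_0)\, x_{i-1,k}
```

Under these assumptions the chaotic term pushes each output toward $I_0$ with a strength set by $z_{ij}$: it is positive when $x_{ij} > I_0$ and negative when $x_{ij} < I_0$, which is a negative feedback on the neuron output.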
A combinatorial optimization problem (COP) is the problem of finding the "best" object in a finite set of objects, where "best" means the highest evaluation score or the lowest cost; the object may be a set, an arrangement or a graph. Typical combinatorial optimization problems involve sorting, selection and similar tasks, for example the traveling salesman problem (TSP), the 0-1 knapsack problem, the maximum independent set (MIS) problem and the maximum cut (MC) problem. Most combinatorial optimization problems are NP-hard, i.e., no algorithm is known that finds the optimal solution exactly in polynomial time. Traditional solution methods fall into three categories: 1) exact algorithms such as dynamic programming and branch and bound; 2) approximation algorithms such as greedy and relaxation algorithms; 3) meta-heuristic algorithms such as genetic algorithms and particle swarm algorithms. All of these have limitations; in particular, for large-scale combinatorial optimization problems they cannot produce a feasible solution within an acceptable time. Recently, graph neural networks (GNNs) have been used to solve large-scale combinatorial optimization problems by converting the objective function into a differentiable loss function; the solving speed is greatly improved, but the quality of the final solution is not satisfactory. On the other hand, a brain-inspired chaotic back-propagation (CBP) algorithm has recently been proposed and used to train conventional multi-layer perceptrons; compared with the conventional BP algorithm and its variants (e.g., SGDM and Adam), it significantly improves the final optimization and generalization performance of the network model.
Therefore, building on the MLP-based CBP algorithm, the invention provides a CGBP algorithm suitable for graph neural networks, which improves the performance of the graph neural network model, yields a better feasible solution of the combinatorial optimization problem, and achieves higher-quality semantic segmentation.
FIG. 2 is a schematic diagram of the semantic segmentation method based on the chaotic back-propagation graph neural network. The detailed steps of the method are as follows:
Step 1: converting the image into a region adjacency graph
For each sample image in the sample image set, the image must be converted into graph-structured data before it can be processed by a graph neural network. Since an image usually contains many pixels, the pixels are first pre-divided into several regions to improve the efficiency of subsequent computation. The resulting regions are then used as the nodes of the graph, and each node value is the average of all pixels in the region. If two regions contain adjacent pixels, they are considered to be connected by an edge, and the weight of the edge is the Euclidean distance between the two node values. The original image is thus converted into a region adjacency graph consisting of nodes and edges.
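The construction described in Step 1 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `region_adjacency_graph` is hypothetical, the label map is assumed to come from any pre-segmentation, and "adjacent pixels" is assumed to mean 4-connectivity.

```python
import numpy as np

def region_adjacency_graph(image, labels):
    """Build node values and weighted edges from a pre-segmentation.

    image  : (H, W, C) float array
    labels : (H, W) integer array of region ids
    Returns (ids, node_values, edges); edges maps (i, j) with i < j to the
    Euclidean distance between the mean values of regions i and j.
    """
    ids = np.unique(labels)
    index = {rid: n for n, rid in enumerate(ids)}
    # Node value: mean of all pixels that belong to the region.
    node_values = np.stack([image[labels == rid].mean(axis=0) for rid in ids])

    edges = {}
    # Compare each pixel with its right and lower neighbour
    # (4-connectivity, an assumption made for this sketch).
    pairs = [(labels[:, :-1], labels[:, 1:]), (labels[:-1, :], labels[1:, :])]
    for a, b in pairs:
        mask = a != b
        for u, v in zip(a[mask], b[mask]):
            i, j = sorted((index[u], index[v]))
            edges[(i, j)] = float(np.linalg.norm(node_values[i] - node_values[j]))
    return ids, node_values, edges

# Toy image: left half black, right half white, two regions.
image = np.zeros((2, 4, 3))
image[:, 2:] = 1.0
labels = np.array([[0, 0, 1, 1],
                   [0, 0, 1, 1]])
ids, node_values, edges = region_adjacency_graph(image, labels)
```

In the toy example the two regions share a boundary, so the graph has a single edge whose weight is the distance between a black and a white RGB mean.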
Step 2: maximum cut problem coding
We convert the image segmentation problem into a maximum cut problem on the graph. Like most combinatorial optimization problems, the maximum cut problem can be solved by converting it into a quadratic unconstrained binary optimization (QUBO) problem, under which the objective function to be optimized is the Hamiltonian
$H = \sum_{i,j} x_i Q_{ij} x_j$
(1)
where $x_i$ is a binary decision variable whose value is 0 or 1, and Q is a constant square matrix encoding the maximum cut problem to be solved: $Q_{ij}$ is the weight of the edge between node i and node j, and $Q_{ii}$ is the negative of the sum of the weights of the edges connected to node i.
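With Q defined this way, the Hamiltonian of a binary assignment equals the negative of the total weight of the cut edges, so minimizing it maximizes the cut. The sketch below checks this on a small triangle graph; the helper names (`maxcut_qubo`, `hamiltonian`, `cut_weight`) and the example weights are illustrative assumptions, and Q is taken to be symmetric.

```python
import itertools
import numpy as np

def maxcut_qubo(n_nodes, edges):
    """Q[i, j] = edge weight, Q[i, i] = -(sum of weights of edges at node i)."""
    Q = np.zeros((n_nodes, n_nodes))
    for (i, j), w in edges.items():
        Q[i, j] += w
        Q[j, i] += w
        Q[i, i] -= w
        Q[j, j] -= w
    return Q

def hamiltonian(Q, x):
    """Quadratic form sum_ij x_i Q_ij x_j for a 0/1 assignment x."""
    x = np.asarray(x, dtype=float)
    return float(x @ Q @ x)

def cut_weight(edges, x):
    """Total weight of edges whose endpoints fall in different sets."""
    return sum(w for (i, j), w in edges.items() if x[i] != x[j])

edges = {(0, 1): 1.0, (1, 2): 2.0, (0, 2): 0.5}
Q = maxcut_qubo(3, edges)

# Brute-force the 2^3 assignments and keep the one with the lowest Hamiltonian.
best = min(itertools.product([0, 1], repeat=3), key=lambda x: hamiltonian(Q, x))
```

Brute force is only feasible for tiny graphs; it is used here solely to confirm the encoding.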
Step 3: graph convolutional neural network
The computation of a graph convolutional network (GCN) can be expressed by the following update formula
$h_i^{(l+1)} = \sigma\left( b^{(l)} + \sum_{j \in N(i)} \frac{1}{c_{ij}}\, h_j^{(l)} W^{(l)} \right)$
(2)
where $h_i^{(l)}$ is the feature vector of the i-th node in the l-th layer, $W^{(l)}$ and $b^{(l)}$ are the network weights of the l-th layer, $N(i)$ is the set of neighbors of node i, $c_{ij} = \sqrt{|N(i)|}\,\sqrt{|N(j)|}$ is the product of the square roots of the node degrees, and $\sigma$ is an activation function.
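A single graph convolution with the degree normalization described above can be sketched in NumPy as follows. The function name `gcn_layer`, the use of an unweighted 0/1 adjacency matrix, and the guard for isolated nodes are assumptions made for this illustration.

```python
import numpy as np

def gcn_layer(h, adj, W, b, activation):
    """One graph convolution: activation(b + sum_j h_j W / c_ij).

    h   : (N, F_in) node features
    adj : (N, N) symmetric 0/1 adjacency matrix without self-loops
    W   : (F_in, F_out) weight, b : (F_out,) bias
    """
    deg = adj.sum(axis=1)
    # c_ij = sqrt(deg_i) * sqrt(deg_j); degree 0 is clamped to 1 to avoid
    # division by zero for isolated nodes.
    inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1.0))
    norm_adj = adj * inv_sqrt[:, None] * inv_sqrt[None, :]
    return activation(norm_adj @ h @ W + b)

def relu(a):
    return np.maximum(a, 0.0)

# Path graph 0 - 1 - 2 with constant features.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
h = np.ones((3, 2))
out = gcn_layer(h, adj, np.eye(2), np.zeros(2), relu)
```

In the path-graph example the end nodes receive one neighbor message scaled by 1/sqrt(2), while the middle node receives two such messages.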
Step 4: combinatorial optimization problem solving
To solve the combinatorial optimization problem with the graph convolutional neural network, the Hamiltonian of the problem must be converted into a differentiable loss function. It suffices to replace the binary decision variable $x_i$ with $p_i(\theta)$, where $\theta$ denotes all parameters of the graph convolutional neural network model and $p_i(\theta)$ is the output of the model for the i-th node. Based on this substitution, the loss function of the network model can be defined as
$L(\theta) = \sum_{i,j} p_i(\theta)\, Q_{ij}\, p_j(\theta)$
(3)
It should be noted that, to make neighboring nodes belong to the same class as far as possible, Q is modified by subtracting $W_p S$, where $W_p$ is a constant parameter to be set, $S_{ij} = A_{ij}$ (A being the connection matrix of the graph), and $S_{ii}$ is the negative of the degree of node i. When an optimizer minimizes this loss function, the Hamiltonian of the corresponding combinatorial optimization problem is reduced as well, so a solution of the problem can be obtained.
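The effect of subtracting $W_p S$ can be made concrete. For a symmetric 0/1 connection matrix, the extra term works out to $W_p$ times the sum of $(p_i - p_j)^2$ over all edges, so it penalizes neighboring nodes with different outputs. The sketch below checks this numerically; the function names and the value of `w_p` are illustrative assumptions.

```python
import numpy as np

def smoothed_qubo(Q, adj, w_p):
    """Return Q - w_p * S with S_ij = A_ij and S_ii = -degree(i)."""
    S = adj.astype(float).copy()
    np.fill_diagonal(S, -adj.sum(axis=1))
    return Q - w_p * S

def relaxed_loss(Q, p):
    """Differentiable relaxation sum_ij p_i Q_ij p_j with p in [0, 1]."""
    p = np.asarray(p, dtype=float)
    return float(p @ Q @ p)

# Path graph 0 - 1 - 2; Q is set to zero to isolate the smoothing term.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]])
Q_smooth = smoothed_qubo(np.zeros((3, 3)), adj, w_p=0.5)

p = np.array([0.9, 0.2, 0.4])
penalty = relaxed_loss(Q_smooth, p)
# Same quantity written edge by edge: w_p * sum over edges of (p_i - p_j)^2
edge_form = 0.5 * ((p[0] - p[1]) ** 2 + (p[1] - p[2]) ** 2)
```

A larger `w_p` favors smoother, more connected segments, while a smaller value lets the maximum cut term dominate, which is one way to read the adjustability between contour sharpness and semantic accuracy mentioned earlier.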
Step 5: chaotic back-propagation algorithm for the graph neural network
Optimization algorithms based on back propagation (BP), such as SGD and Adam, are easily trapped in local minima. To increase the probability that the model converges to the global optimum, a chaotic loss function inspired by chaotic dynamics in the brain is added
[Formula (4): chaotic loss function; the formula image is not reproduced in the source text]
(4)
The chaotic loss is defined on the k-th output of the l-th layer for every node d, each output has a corresponding chaotic intensity z, and $I_0$ is a constant between 0 and 1, typically 0.65. The total training loss is then
$L_{\mathrm{total}} = L(\theta) + L_{\mathrm{chaos}}$
(5)
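Because the formula image for the chaotic loss is not available, the sketch below assumes a cross-entropy-like form in the Sigmoid output and $I_0$, by analogy with the MLP case; this form, the function names and the sample values are assumptions, not the patent's exact definition. The code evaluates the assumed loss and checks by finite differences that its gradient with respect to the pre-activation is z times (output minus $I_0$).

```python
import numpy as np

I0 = 0.65  # typical value quoted in the text

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def chaos_loss(a, z, i0=I0):
    """Assumed chaotic loss: -sum z * [I0 ln x + (1 - I0) ln(1 - x)], x = sigmoid(a)."""
    x = sigmoid(a)
    return float(-np.sum(z * (i0 * np.log(x) + (1.0 - i0) * np.log(1.0 - x))))

def chaos_grad(a, z, i0=I0):
    """Analytic gradient of the assumed loss with respect to the pre-activation a."""
    return z * (sigmoid(a) - i0)

a = np.array([-1.0, 0.2, 2.0])   # pre-activations of three outputs
z = np.array([2.0, 0.3, 0.1])    # chaotic intensities

# Central finite differences, one coordinate at a time.
eps = 1e-6
numeric = np.array([
    (chaos_loss(a + eps * e, z) - chaos_loss(a - eps * e, z)) / (2 * eps)
    for e in np.eye(len(a))
])
analytic = chaos_grad(a, z)
```

The sign of the gradient flips at an output of $I_0$, so gradient descent on this term alone drives each output back toward $I_0$.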
It should be noted that when the activation function $\sigma$ in formula (2) is the Sigmoid function, the output used in the chaotic loss is identical to the layer output; otherwise, it is obtained by replacing $\sigma$ in formula (2) with the Sigmoid function. In addition, when this output is computed with the Sigmoid function, a steepness factor can be introduced to change the scale of the weight change, i.e.
[Formula (6): Sigmoid output with steepness factor; the formula image is not reproduced in the source text]
(6)
Here the steepness factor is set to 10. The gradient of the chaotic loss with respect to the weights is given by
[Formula (7): gradient of the chaotic loss with respect to the weights; the formula image is not reproduced in the source text]
(7)
It can be seen that this loss introduces negative feedback into the gradient of the output, so that chaotic dynamics arise when z is large enough and help the model search the solution space. For the model to converge eventually, z is annealed with an ordinary exponential annealing strategy, i.e.
$z \leftarrow \beta z$
(8)
where $\beta$ is the annealing constant, here taken to be 0.999.
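Exponential annealing multiplies the chaotic intensity by the annealing constant once per training step. A minimal sketch, with the function name `annealed_intensity` and the initial value 2.0 chosen for illustration:

```python
def annealed_intensity(z0, beta, step):
    """Chaotic intensity after `step` updates of z <- beta * z."""
    return z0 * beta ** step

beta = 0.999
z0 = 2.0

# Step-by-step loop, equivalent to the closed form above.
z = z0
for _ in range(500):
    z *= beta

ratio_500 = annealed_intensity(z0, beta, 500) / z0
ratio_2000 = annealed_intensity(z0, beta, 2000) / z0
```

With an annealing constant of 0.999 the intensity falls to roughly 61% of its initial value after 500 steps and to roughly 14% after 2000 steps, so the chaotic term fades gradually instead of being switched off abruptly.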
Step 6: projection and segmentation
Because the output of the model is a vector whose entries lie between 0 and 1, it must be projected before the solution of the combinatorial optimization problem is evaluated. The simplest strategy is adopted here: with 0.5 as the threshold, outputs greater than the threshold are set to 1 and outputs less than the threshold are set to 0. The projected output is used as the segmentation criterion, for example 1 for the target and 0 for the background.
FIG. 3 shows the process of training the neural network with the CGBP algorithm. It shows that, under the chaotic algorithm, the network model gradually converges and yields the final segmentation result.
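The projection and the mapping from region-level decisions back to a pixel mask can be sketched as follows. The function names are hypothetical, and an output exactly equal to the threshold is assigned to 0 here, a case the text does not specify.

```python
import numpy as np

def project(p, threshold=0.5):
    """Map soft outputs in [0, 1] to binary decisions (1 if strictly above threshold)."""
    return (np.asarray(p) > threshold).astype(int)

def regions_to_mask(labels, ids, x):
    """Paint each pixel with the decision of its region.

    labels : (H, W) region id per pixel
    ids    : sorted 1-D array of region ids (e.g. from np.unique(labels))
    x      : binary decision per region, in the same order as ids
    """
    return x[np.searchsorted(ids, labels)]

p = np.array([0.9, 0.2, 0.5])          # network outputs for three regions
x = project(p)

labels = np.array([[3, 3, 7],
                   [9, 7, 7]])
ids = np.unique(labels)                # array([3, 7, 9])
mask = regions_to_mask(labels, ids, x)
```

Only region 3 exceeds the threshold in this example, so only its pixels are marked as target in the mask.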
Example 1: maximum cut
Taking the segmentation of the eagle in FIG. 2 as an example, the specific steps of the method are as follows:
Step S1: converting the image into an adjacency graph. The picture in FIG. 2 has size 180 × 250 × 3, where 180 and 250 are the height and width of the image and 3 is the number of channels, corresponding to the three RGB color channels. Because the image contains many pixels, the pixels are first pre-divided into several regions to improve the efficiency of subsequent computation. Many methods can achieve this; this patent uses the existing compact watershed method, which segments the image according to the gradient of the image gray level, implemented by the watershed function of the Python package scikit-image (skimage). Since this patent targets general images, and to balance computation speed and performance, the markers parameter of the function is set to 100 (i.e., the image is pre-divided into about 100 regions), the compactness parameter is set to 0.001 (larger values make the regions more regular), and the other parameters keep their default values. In testing, this set of parameters has generally given good results. After pre-segmentation the picture is divided into 108 regions. Each region is a node of the graph, the node value is the average of all pixels in the region, and 304 edges are obtained, the weight of each edge being the Euclidean distance between the corresponding two node values. This gives the adjacency graph of the picture.
The color difference between the eagle and the background is large. For a region on the boundary between the eagle and the background, if the node value is simply the average of all its pixels, the region cannot be assigned accurately to either the eagle or the background. Each region on the boundary can therefore be subdivided into 100 regions. For example, if there are 10 regions on the boundary between the eagle and the background, these 10 regions are subdivided into 1000 small regions, which enter step S2 together with the remaining 98 regions, improving the accuracy of separation at the boundary between the target and the background.
Step S2: encoding the maximum cut problem. We convert the image segmentation problem into a maximum cut problem on the graph. Like most combinatorial optimization problems, the maximum cut problem can be solved by converting it into a quadratic unconstrained binary optimization (QUBO) problem, under which the objective function to be optimized is the Hamiltonian
$H = \sum_{i,j} x_i Q_{ij} x_j$
(1)
where $x_i$ is a binary decision variable whose value is 0 or 1, and Q is a constant square matrix encoding the maximum cut problem to be solved: $Q_{ij}$ is the weight of the edge between node i and node j, and $Q_{ii}$ is the negative of the sum of the weights of the edges connected to node i.
Step S3: constructing the graph convolutional neural network. The initial feature of each node is 10-dimensional, and the initial features of all nodes are provided by the weights of the embedding layer. The input and output dimensions of the first graph convolution layer are 10 and 5, respectively, followed by a ReLU activation function and a dropout layer. The input and output dimensions of the second graph convolution layer are 5 and 1, respectively, followed by a Sigmoid activation function that brings the output value between 0 and 1. The weights of the entire network are initialized with the default method of PyTorch.
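The layer sizes above can be illustrated with a framework-free forward pass. This NumPy sketch only mirrors the shapes (embedding of dimension 10, then 10 to 5 with ReLU, then 5 to 1 with Sigmoid); the random initialization, the omission of dropout (as at inference time) and all names are assumptions, and it is not the PyTorch model used in the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def normalized_adjacency(adj):
    """Scale A_ij by 1 / (sqrt(deg_i) * sqrt(deg_j)); degree 0 is clamped to 1."""
    inv_sqrt = 1.0 / np.sqrt(np.maximum(adj.sum(axis=1), 1.0))
    return adj * inv_sqrt[:, None] * inv_sqrt[None, :]

def init_params(n_nodes):
    """Learnable parameters: node embeddings plus two graph convolution layers."""
    return {
        "embedding": rng.normal(size=(n_nodes, 10)),
        "W1": rng.normal(scale=0.3, size=(10, 5)), "b1": np.zeros(5),
        "W2": rng.normal(scale=0.3, size=(5, 1)), "b2": np.zeros(1),
    }

def forward(params, adj):
    """Embedding -> GCN(10, 5) + ReLU -> GCN(5, 1) + Sigmoid, one value per node."""
    a_hat = normalized_adjacency(adj)
    h = params["embedding"]                                        # (N, 10)
    h = np.maximum(a_hat @ h @ params["W1"] + params["b1"], 0.0)   # (N, 5)
    a = a_hat @ h @ params["W2"] + params["b2"]                    # (N, 1)
    return (1.0 / (1.0 + np.exp(-a))).ravel()                      # (N,)

# Four regions arranged in a cycle.
adj = np.array([[0, 1, 0, 1],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [1, 0, 1, 0]], dtype=float)
p = forward(init_params(4), adj)
```

Each entry of `p` is the soft assignment of one region. Because the embedding itself is learnable, the network needs no input features beyond the graph structure and the matrix Q in the loss.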
Step S4: training the graph convolutional neural network with the CGBP algorithm. First the loss functions of the model are defined: formula (3) is the loss function of the combinatorial optimization problem, formula (4) is the chaotic loss function, and the sum of the two is the total loss function. For the embedding layer and the first and second graph convolution layers, the initial values of z are 2, 0.3 and 0.1, respectively; note that each of the two convolution layers has two sets of learnable parameters, weight and bias, which share a common z. To ensure final convergence, the annealing strategy of formula (8) is used. The maximum number of training rounds is 2000, and the optimization algorithm is standard gradient descent. To accelerate convergence, training switches to the Adam optimization algorithm after the chaos disappears, i.e., after more than 500 rounds of training.
Step S5: projecting the output and segmenting the picture. After the network model is trained, the output of the network is projected with 0.5 as the threshold: outputs greater than the threshold are projected to 1 (the node belongs to the set that serves as the segmentation target), and outputs less than the threshold are projected to 0 (the node belongs to the set that serves as the segmentation background), which completes the semantic segmentation of the image. FIG. 2 shows the final segmentation result; the method successfully separates the eagle from the image.
To analyze the performance of the method more quantitatively, several example images were selected and the method of this patent was compared with the conventional threshold-based Otsu algorithm (also known as Otsu's method). FIG. 4 compares the segmentation results of the two methods. The Otsu algorithm tends to treat background pixels whose color is similar to the target as target, and target pixels whose color differs from the rest of the target as background; the present method largely avoids this problem. Table 1 gives the PAC (average pixel accuracy) and mIoU (mean intersection over union) of the different methods; both quantify the quality of a segmentation result, and values closer to 1 indicate a better result. Table 1 shows that the PAC and mIoU obtained with CGBP training are significantly higher than those of Otsu, and the more complex the picture background, the larger the improvement. The effect of chaos was also analyzed: when the images are segmented with a BP-type algorithm, the resulting PAC and mIoU are higher than those of Otsu but lower than those of the CGBP algorithm, which shows that introducing chaotic dynamics of the brain improves the segmentation accuracy.
Table 1 (provided as an image in the original; not reproduced here)
For the dog example, the markers parameter of the pre-segmentation function is set to 200 (because the dog in the image is similar in color to the background, a higher markers value is needed), the compactness parameter is set to 0.001 (a larger value makes the segmented regions more regular), and the other parameters keep the function defaults. After pre-segmentation, the picture is divided into 212 regions. Each region is taken as a node of the graph, and the corresponding node value is the average value of all pixels in the region. At the same time, 620 edges are obtained, and the weight of each edge is the Euclidean distance between the two corresponding node values. This yields the region adjacency graph corresponding to the picture.
For the dolphin example, the markers parameter of the pre-segmentation function is set to 200 (because the dolphin in the image is similar in color to the background, the sea below it, a higher markers value is needed), the compactness parameter is set to 0.001, and the other parameters keep the function defaults. After pre-segmentation, the image is divided into 210 regions, and the boundary between the dolphin and the sea background is further subdivided to obtain 200+2000 regions. Each region is taken as a node of the graph, and the corresponding node value is the average value of all pixels in the region. At the same time, 6500 edges are obtained, and the weight of each edge is the Euclidean distance between the two corresponding node values. This yields the region adjacency graph corresponding to the image.
For the polar bear example, the markers parameter of the pre-segmentation function is set to 100, the compactness parameter is set to 0.001, and the other parameters keep the function defaults. After pre-segmentation, the picture is divided into 106 regions. Each region is taken as a node of the graph, and the corresponding node value is the average value of all pixels in the region. At the same time, 298 edges are obtained, and the weight of each edge is the Euclidean distance between the two corresponding node values. This yields the region adjacency graph corresponding to the picture.
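The graph construction shared by the three examples can be sketched as follows, starting from a label map that a pre-segmentation step has already produced. The function name is hypothetical, and treating two pixels as adjacent when they are 4-neighbors is an assumption; the text only says that the regions contain adjacent pixels.

```python
import math


def region_adjacency_graph(image, labels):
    """Build node values and weighted edges from a pre-segmented image.

    image:  2-D list of pixel values, each a tuple of channels (e.g. RGB).
    labels: 2-D list of region ids with the same shape as `image`.
    Returns (node_values, edges) where edges maps (i, j), i < j, to a weight.
    """
    sums, counts = {}, {}
    for img_row, lab_row in zip(image, labels):
        for pixel, r in zip(img_row, lab_row):
            acc = sums.setdefault(r, [0.0] * len(pixel))
            for c, v in enumerate(pixel):
                acc[c] += v
            counts[r] = counts.get(r, 0) + 1
    # Node value = mean of all pixels in the region.
    node_values = {r: tuple(v / counts[r] for v in sums[r]) for r in sums}

    # Two regions are connected when they contain 4-adjacent pixels.
    pairs = set()
    h, w = len(labels), len(labels[0])
    for y in range(h):
        for x in range(w):
            for ny, nx in ((y + 1, x), (y, x + 1)):
                if ny < h and nx < w and labels[y][x] != labels[ny][nx]:
                    pairs.add(tuple(sorted((labels[y][x], labels[ny][nx]))))

    # Edge weight = Euclidean distance between the two node values.
    edges = {(i, j): math.dist(node_values[i], node_values[j]) for i, j in pairs}
    return node_values, edges
```

The node and edge counts quoted in the examples (for instance 212 regions and 620 edges for the dog image) correspond to `len(node_values)` and `len(edges)` in this sketch.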
The invention also includes an image segmentation device based on the chaotic back-propagation graph neural network.
The device comprises a preprocessing unit, an objective function unit, a graph convolutional neural network unit, a chaotic back-propagation algorithm unit, and an output unit:
the preprocessing unit is used for pre-segmenting an input image into a plurality of regions and converting them into a region adjacency graph, in which the regions serve as the nodes of the graph, the corresponding node value is the average value of all pixels in the region, two regions are connected by an edge if they contain adjacent pixels, and the weight of the edge is the Euclidean distance between the two corresponding node values;
the objective function unit is used for taking the region adjacency graph, including its nodes, connection matrix, and edge weights, as the input graph of the maximum cut problem and constructing the corresponding coding matrix Q, where $Q_{ij}$ is the weight of the edge between node i and node j and $Q_{ii}$ is the negative of the sum of the weights of the edges connected to node i; the objective function to be optimized is the Hamiltonian
$H_{\mathrm{QUBO}} = \sum_{i,j} x_i Q_{ij} x_j$ (1);
the graph convolutional neural network unit comprises an embedding layer and a plurality of graph convolutional layers and converts the Hamiltonian of the objective function unit into a differentiable loss function
$L_{\mathrm{QUBO}}(\theta) = \sum_{i,j} p_i(\theta) Q_{ij} p_j(\theta)$ (3);
the chaotic back-propagation algorithm unit is used for training the graph convolutional neural network unit, where the loss function used for training comprises the first loss function and the chaotic loss function,
$L = L_{\mathrm{QUBO}} + L_{\mathrm{chaos}}$ (5);
and the output unit is used for obtaining, after network training is complete, a solution of the combinatorial optimization problem by projecting the output, and for obtaining the corresponding image segmentation result.

Claims (11)

1. An image semantic segmentation method based on a chaotic back-propagation graph neural network, characterized by comprising the following steps:
Step S1: pre-segmenting an input image into a plurality of regions and converting them into a region adjacency graph, in which the regions serve as the nodes of the graph, the corresponding node value is the average value of all pixels in the region, two regions are connected by an edge if they contain adjacent pixels, and the weight of the edge is the Euclidean distance between the two corresponding node values;
Step S2: taking the region adjacency graph, including its nodes, connection matrix, and edge weights, as the input graph of the maximum cut problem and constructing the corresponding coding matrix Q, where $Q_{ij}$ is the weight of the edge between node i and node j and $Q_{ii}$ is the negative of the sum of the weights of the edges connected to node i, the objective function to be optimized being the Hamiltonian
$H_{\mathrm{QUBO}} = \sum_{i,j} x_i Q_{ij} x_j$ (1);
Step S3: constructing a graph convolutional neural network comprising an embedding layer and a plurality of graph convolutional layers, and converting the Hamiltonian into a differentiable loss function
$L_{\mathrm{QUBO}}(\theta) = \sum_{i,j} p_i(\theta) Q_{ij} p_j(\theta)$ (3);
Step S4: training the graph convolutional neural network with a chaotic back-propagation algorithm, where the loss function used for training comprises the first loss function and the chaotic loss function,
$L = L_{\mathrm{QUBO}} + L_{\mathrm{chaos}}$ (5);
Step S5: after network training is complete, projecting the output to obtain a solution of the combinatorial optimization problem and the corresponding image segmentation result.
2. The image semantic segmentation method based on the chaotic back-propagation graph neural network as claimed in claim 1, wherein the pre-segmentation in step S1 is implemented by the Compact Watershed algorithm; the variance of the pixels of each region about the region mean is calculated, and any region whose variance exceeds a threshold is re-segmented to form a new region adjacency graph.
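The variance check of this claim can be sketched as follows for a grayscale image. The function name and the threshold value used by the caller are hypothetical, and how a flagged region is re-segmented is left out because the claim does not specify it.

```python
from statistics import pvariance


def regions_to_resegment(image, labels, threshold):
    """Return the ids of regions whose pixel variance exceeds `threshold`.

    image:  2-D list of grayscale pixel values.
    labels: 2-D list of region ids with the same shape as `image`.
    """
    pixels = {}
    for img_row, lab_row in zip(image, labels):
        for value, region in zip(img_row, lab_row):
            pixels.setdefault(region, []).append(value)
    # pvariance is the mean squared deviation of the pixels from the region mean.
    return sorted(r for r, vals in pixels.items() if pvariance(vals) > threshold)
```

A region that mixes very different pixel values, such as one straddling the boundary between target and background, has a large variance and is therefore a candidate for further subdivision.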
3. The image semantic segmentation method based on the chaotic back-propagation graph neural network as claimed in claim 1, wherein in step S2 the input graph, including its nodes, connection matrix, and edge weights, is taken as the input graph of the maximum cut problem, and the coding matrix Q is constructed for the Hamiltonian under the quadratic unconstrained binary optimization (QUBO) framework
$H_{\mathrm{QUBO}} = \sum_{i,j} x_i Q_{ij} x_j$ (1);
where $x_i$ is a binary decision variable whose value is 0 or 1, Q is a constant square matrix encoding the maximum cut problem to be solved, $Q_{ij}$ is the weight of the edge between node i and node j, and $Q_{ii}$ is the negative of the sum of the weights of the edges connected to node i.
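The construction of Q described in this claim can be sketched on a toy graph as follows. The function names and the three-node example are illustrative assumptions; the brute-force search is only there to show that minimizing the Hamiltonian maximizes the cut weight.

```python
from itertools import product


def build_q(num_nodes, edges):
    """Q[i][j] = edge weight for i != j, Q[i][i] = -(sum of incident weights)."""
    q = [[0.0] * num_nodes for _ in range(num_nodes)]
    for (i, j), w in edges.items():
        q[i][j] += w
        q[j][i] += w
        q[i][i] -= w
        q[j][j] -= w
    return q


def hamiltonian(q, x):
    """H(x) = sum_ij x_i Q_ij x_j for a 0/1 assignment x."""
    n = len(x)
    return sum(x[i] * q[i][j] * x[j] for i in range(n) for j in range(n))


def cut_weight(edges, x):
    """Total weight of edges whose endpoints fall in different sets."""
    return sum(w for (i, j), w in edges.items() if x[i] != x[j])


edges = {(0, 1): 1.0, (1, 2): 2.0, (0, 2): 3.0}  # toy weighted triangle
q = build_q(3, edges)
best = min(product((0, 1), repeat=3), key=lambda x: hamiltonian(q, x))
```

With this encoding the Hamiltonian of any 0/1 assignment equals the negative of its cut weight, so the minimizer found here separates node 2 from nodes 0 and 1 for a cut of weight 5.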
4. The image semantic segmentation method based on the chaotic back-propagation graph neural network as claimed in claim 1, wherein the computation of the graph convolutional neural network in step S3 can be expressed by the following update formula
$h_i^{(l+1)} = \sigma\Big( b^{(l)} + \sum_{j \in N(i)} \frac{1}{c_{ij}} h_j^{(l)} W^{(l)} \Big)$ (2);
where $h_i^{(l)}$ denotes the feature vector of the i-th node at the l-th layer, $W^{(l)}$ and $b^{(l)}$ are the network weights of the l-th layer, $N(i)$ denotes the set of neighbors of node i, $c_{ij}$ is the product of the square roots of the node degrees, $c_{ij} = \sqrt{|N(i)|}\,\sqrt{|N(j)|}$, and $\sigma$ is an activation function.
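One graph convolution step of the kind described in this claim can be sketched in plain Python. The update formula itself appears only as an image in the source, so the form used here (neighbor aggregation scaled by the product of the square roots of the degrees, followed by a linear map, a bias, and an activation) is an assumption, as are the function and argument names.

```python
import math


def relu(v):
    return [max(0.0, a) for a in v]


def gcn_layer(features, neighbors, weight, bias, activation=relu):
    """One graph-convolution step with symmetric degree normalization.

    features:  list of per-node feature vectors h_i (lists of floats).
    neighbors: neighbors[i] is the list of nodes adjacent to node i.
    weight:    matrix W of shape (in_dim, out_dim); bias: vector of out_dim.
    """
    in_dim, out_dim = len(weight), len(weight[0])
    out = []
    for i, nbrs in enumerate(neighbors):
        # Aggregate neighbor features, each scaled by 1 / c_ij with
        # c_ij = sqrt(deg(i)) * sqrt(deg(j)).
        agg = [0.0] * in_dim
        for j in nbrs:
            c_ij = math.sqrt(len(nbrs)) * math.sqrt(len(neighbors[j]))
            for k in range(in_dim):
                agg[k] += features[j][k] / c_ij
        # Linear map plus bias, then the activation function.
        pre = [sum(agg[k] * weight[k][o] for k in range(in_dim)) + bias[o]
               for o in range(out_dim)]
        out.append(activation(pre))
    return out
```

Stacking two such layers, with the region adjacency graph supplying `neighbors`, gives the kind of network the claim describes; a practical implementation would use a tensor library rather than nested lists.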
5. The image semantic segmentation method based on the chaotic back-propagation graph neural network as claimed in claim 1 or 4, wherein in step S3 the binary decision variable $x_i$ in the Hamiltonian is replaced by $p_i(\theta)$, where $\theta$ denotes all parameters of the graph convolutional neural network model and $p_i(\theta)$ denotes the output of the network model for the i-th node, which gives the first loss function of the network model
$L_{\mathrm{QUBO}}(\theta) = \sum_{i,j} p_i(\theta) Q_{ij} p_j(\theta)$ (3);
in order to make neighboring nodes belong to the same class as far as possible, Q is modified by subtracting $W_p S$, where $W_p$ is set to 100, $S_{ij} = A_{ij}$, and $S_{ii}$ is the negative of the degree of node i.
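The relaxed loss and the modification of Q in this claim can be sketched as follows. The function names are hypothetical, and the adjacency matrix A is assumed here to hold 0/1 entries, which the text does not state explicitly.

```python
def relaxed_loss(q, p):
    """Differentiable surrogate: sum_ij p_i Q_ij p_j with p_i in [0, 1]."""
    n = len(p)
    return sum(p[i] * q[i][j] * p[j] for i in range(n) for j in range(n))


def smoothness_matrix(num_nodes, edge_list):
    """S[i][j] = A_ij (assumed 0/1 here) and S[i][i] = -(degree of node i)."""
    s = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i, j in edge_list:
        s[i][j] = s[j][i] = 1.0
        s[i][i] -= 1.0
        s[j][j] -= 1.0
    return s


def modified_q(q, s, w_p=100.0):
    """Return Q - W_p * S, the modification described in the claim."""
    n = len(q)
    return [[q[i][j] - w_p * s[i][j] for j in range(n)] for i in range(n)]


# Toy graph: a single edge of weight 2 between nodes 0 and 1.
q = [[-2.0, 2.0], [2.0, -2.0]]
q_mod = modified_q(q, smoothness_matrix(2, [(0, 1)]))
```

Under these assumptions, cutting the edge changes the loss from -2 to 98, that is, by -w + W_p for an edge of weight w, so in this sketch an edge is only worth cutting when its weight exceeds W_p.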
6. The image semantic segmentation method based on the chaotic back-propagation graph neural network as claimed in claim 4, wherein in step S3 the initial features of the nodes of the graph convolutional neural network correspond to the weights of the embedding layer, and the weights of the embedding layer are network input parameters; the graph convolutional layers comprise a first graph convolutional layer and a second graph convolutional layer, the first graph convolutional layer uses the ReLU activation function and outputs through a dropout layer, and the second graph convolutional layer uses the Sigmoid activation function.
7. The image semantic segmentation method based on the chaotic back-propagation graph neural network as claimed in claim 1, wherein the chaotic loss function in step S4 is
[equation (4), rendered as an image in the original; not reproduced here] (4);
where $o^{l}_{dk}$ is the k-th output of the l-th layer for any node d, $z^{l}_{dk}$ is the chaos intensity corresponding to $o^{l}_{dk}$, and $I_0$ is a constant between 0 and 1.
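Equation (4) is available only as an image, so the following sketch does not reproduce it. It assumes, purely for illustration, a cross-entropy-like chaotic loss of the form -z [I0 ln(o) + (1 - I0) ln(1 - o)] for a single Sigmoid output o; this form, the default I0 = 0.65, and the function names are all assumptions rather than statements from the source.

```python
import math


def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))


def chaos_loss(u, z, i0=0.65):
    """ASSUMED form: -z * [I0 * ln(o) + (1 - I0) * ln(1 - o)], o = sigmoid(u)."""
    o = sigmoid(u)
    return -z * (i0 * math.log(o) + (1.0 - i0) * math.log(1.0 - o))


def chaos_grad(u, z, i0=0.65):
    """Analytic derivative of the assumed loss with respect to u: z * (o - I0)."""
    return z * (sigmoid(u) - i0)
```

Under this assumed form the gradient with respect to the pre-activation is proportional to z and vanishes when the output equals I0, so annealing z toward zero switches the extra term off.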
8. The image semantic segmentation method based on the chaotic back-propagation graph neural network as claimed in claim 7, wherein the activation function $\sigma$ is the Sigmoid function, so that the output term in equation (4) is equivalent to [expression rendered as an image in the original; not reproduced here]; a gradient factor [symbol rendered as an image in the original] may be introduced for this output term to change the scale of the weight change, i.e.
[equation (6), rendered as an image in the original; not reproduced here] (6);
the gradient of the chaotic loss with respect to the weights is
[equation (7), rendered as an image in the original; not reproduced here] (7).
9. The image semantic segmentation method based on the chaotic back-propagation graph neural network as claimed in claim 8, wherein simulated annealing is performed on the chaos intensity $z^{l}_{dk}$ using an ordinary exponential annealing strategy, i.e.
$z^{l}_{dk} \leftarrow \beta \, z^{l}_{dk}$ (8);
where $\beta$ is an annealing constant close to but less than 1, so that $z^{l}_{dk}$ gradually becomes smaller.
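As a worked illustration of exponential annealing, assume the update multiplies the chaos intensity by β once per training round. The values β = 0.99 and the cut-off ε = 0.01 below are illustrative assumptions, not values from the source; the starting value z_0 = 2 is the initial value given for the embedding layer in the description.

```latex
z_t = \beta^{t} z_0, \qquad
z_t < \varepsilon \iff t > \frac{\ln(\varepsilon / z_0)}{\ln \beta}
  = \frac{\ln(0.01 / 2)}{\ln 0.99}
  \approx \frac{-5.298}{-0.01005} \approx 527.2
```

Under these assumed values the chaos intensity falls below 0.01 from round 528 onward.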
10. The image semantic segmentation method based on the chaotic back-propagation graph neural network as claimed in claim 3, wherein the Hamiltonian corresponding to the maximum cut is written as
[equation (10), rendered as an image in the original; not reproduced here] (10);
where A is the adjacency matrix, i.e. $A_{ij} = 0$ when there is no edge between node i and node j, and $A_{ij}$ takes the value given in the original formula image when there is an edge.
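The link between a maximum cut objective and the coding matrix Q can be written out as a short derivation. Equation (10) is not legible in this text, so the starting expression below is an assumption: the standard weighted maximum cut Hamiltonian with symmetric edge weights w_ij (zero where there is no edge) and binary variables x_i.

```latex
\begin{aligned}
H_{\mathrm{MaxCut}}
  &= \sum_{i<j} w_{ij}\,\bigl(2 x_i x_j - x_i - x_j\bigr) \\
  &= \sum_{i \neq j} w_{ij}\, x_i x_j \;-\; \sum_i \Bigl(\sum_{j \neq i} w_{ij}\Bigr) x_i \\
  &= \sum_{i \neq j} Q_{ij}\, x_i x_j \;+\; \sum_i Q_{ii}\, x_i^2
   \;=\; \sum_{i,j} x_i Q_{ij} x_j ,
\end{aligned}
\qquad
Q_{ij} = w_{ij}\ (i \neq j), \quad Q_{ii} = -\sum_{j \neq i} w_{ij}.
```

The last step uses x_i^2 = x_i for binary variables. Each cut edge contributes -w_ij and each uncut edge contributes 0, which is why Q_ii is the negative of the sum of the weights of the edges connected to node i.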
11. An image segmentation device based on a chaotic back-propagation graph neural network, characterized in that
the device comprises a preprocessing unit, an objective function unit, a graph convolutional neural network unit, a chaotic back-propagation algorithm unit, and an output unit:
the preprocessing unit is used for pre-segmenting an input image into a plurality of regions and converting them into a region adjacency graph, in which the regions serve as the nodes of the graph, the corresponding node value is the average value of all pixels in the region, two regions are connected by an edge if they contain adjacent pixels, and the weight of the edge is the Euclidean distance between the two corresponding node values;
the objective function unit is used for taking the region adjacency graph, including its nodes, connection matrix, and edge weights, as the input graph of the maximum cut problem and constructing the corresponding coding matrix Q, where $Q_{ij}$ is the weight of the edge between node i and node j and $Q_{ii}$ is the negative of the sum of the weights of the edges connected to node i; the objective function to be optimized is the Hamiltonian
$H_{\mathrm{QUBO}} = \sum_{i,j} x_i Q_{ij} x_j$ (1);
the graph convolutional neural network unit comprises an embedding layer and a plurality of graph convolutional layers and converts the Hamiltonian of the objective function unit into a differentiable loss function
$L_{\mathrm{QUBO}}(\theta) = \sum_{i,j} p_i(\theta) Q_{ij} p_j(\theta)$ (3);
the chaotic back-propagation algorithm unit is used for training the graph convolutional neural network unit, where the loss function used for training comprises the first loss function and the chaotic loss function,
$L = L_{\mathrm{QUBO}} + L_{\mathrm{chaos}}$ (5);
and the output unit is used for obtaining, after network training is complete, a solution of the combinatorial optimization problem by projecting the output, and for obtaining the corresponding image segmentation result.
CN202310031394.6A 2023-01-10 2023-01-10 Image semantic segmentation method and device for chaotic back propagation graph neural network Active CN115761240B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310031394.6A CN115761240B (en) 2023-01-10 2023-01-10 Image semantic segmentation method and device for chaotic back propagation graph neural network

Publications (2)

Publication Number Publication Date
CN115761240A true CN115761240A (en) 2023-03-07
CN115761240B CN115761240B (en) 2023-05-09

Family

ID=85348874

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310031394.6A Active CN115761240B (en) 2023-01-10 2023-01-10 Image semantic segmentation method and device for chaotic back propagation graph neural network

Country Status (1)

Country Link
CN (1) CN115761240B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4803736A (en) * 1985-11-27 1989-02-07 The Trustees Of Boston University Neural networks for machine vision
CN107330902A (en) * 2017-07-04 2017-11-07 河南师范大学 Chaos-Genetic BP neural network image partition method based on Arnold conversion
CN107423814A (en) * 2017-07-31 2017-12-01 南昌航空大学 A kind of method that dynamic network model is established using depth convolutional neural networks
CN109145939A (en) * 2018-07-02 2019-01-04 南京师范大学 A kind of binary channels convolutional neural networks semantic segmentation method of Small object sensitivity
US20220108151A1 (en) * 2020-10-01 2022-04-07 North Carolina State University Physics augmented neural networks configured for operating in environments that mix order and chaos
CN114707227A (en) * 2022-04-28 2022-07-05 水利部南京水利水文自动化研究所 Dam safety early warning and warning method and system based on digital twins
CN115331162A (en) * 2022-07-14 2022-11-11 西安科技大学 Cross-scale infrared pedestrian detection method, system, medium, equipment and terminal

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
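RISHENG WANG et al.: "Medical image segmentation using deep learning: A survey" *
青晨; 禹晶; 肖创柏; 段娟: "Research progress on image semantic segmentation based on deep convolutional neural networks" (深度卷积神经网络图像语义分割研究进展) *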

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116204846A (en) * 2023-05-06 2023-06-02 云南星晟电力技术有限公司 Method for rapidly positioning abnormal sensor data of power distribution network based on visible graph
CN116204846B (en) * 2023-05-06 2023-08-01 云南星晟电力技术有限公司 Method for rapidly positioning abnormal sensor data of power distribution network based on visible graph
CN118135339A (en) * 2024-05-06 2024-06-04 贵州万德科技有限公司 Monitoring management method and system for chilli food production and processing

Also Published As

Publication number Publication date
CN115761240B (en) 2023-05-09

Similar Documents

Publication Publication Date Title
CN109614985B (en) Target detection method based on densely connected feature pyramid network
CN108182441B (en) Parallel multichannel convolutional neural network, construction method and image feature extraction method
CN111583263B (en) Point cloud segmentation method based on joint dynamic graph convolution
Massiceti et al. Random forests versus neural networks—what's best for camera localization?
US20190228268A1 (en) Method and system for cell image segmentation using multi-stage convolutional neural networks
CN109299701B (en) Human face age estimation method based on GAN expansion multi-human species characteristic collaborative selection
CN115761240A (en) Image semantic segmentation method and device for neural network of chaotic back propagation map
CN113298024A (en) Unmanned aerial vehicle ground small target identification method based on lightweight neural network
Grigorev et al. Depth estimation from single monocular images using deep hybrid network
CN114419413A (en) Method for constructing sensing field self-adaptive transformer substation insulator defect detection neural network
CN113298129A (en) Polarized SAR image classification method based on superpixel and graph convolution network
CN115018039A (en) Neural network distillation method, target detection method and device
CN114120045B (en) Target detection method and device based on multi-gate control hybrid expert model
CN114492634B (en) Fine granularity equipment picture classification and identification method and system
CN115512226A (en) LiDAR point cloud filtering method integrated with attention machine system multi-scale CNN
JP7225731B2 (en) Imaging multivariable data sequences
CN113297964B (en) Video target recognition model and method based on deep migration learning
Zheng et al. Fruit tree disease recognition based on convolutional neural networks
CN116524255A (en) Wheat scab spore identification method based on Yolov5-ECA-ASFF
CN116977265A (en) Training method and device for defect detection model, computer equipment and storage medium
CN116563683A (en) Remote sensing image scene classification method based on convolutional neural network and multi-layer perceptron
CN116110074A (en) Dynamic small-strand pedestrian recognition method based on graph neural network
CN113205152B (en) Feature fusion method for look-around fusion
Liu Self-adaptive scale pedestrian detection algorithm based on deep residual network
CN113989671A (en) Remote sensing scene classification method and system based on semantic perception and dynamic graph convolution

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant