CN114528441A - Graph structure data node classification method and device and electronic equipment - Google Patents

Graph structure data node classification method and device and electronic equipment

Info

Publication number
CN114528441A
Authority
CN
China
Prior art keywords
node
nodes
label
labeled
structure data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111651405.8A
Other languages
Chinese (zh)
Inventor
杨一帆
余晓填
王孝宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Intellifusion Technologies Co Ltd
Original Assignee
Shenzhen Intellifusion Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Intellifusion Technologies Co Ltd filed Critical Shenzhen Intellifusion Technologies Co Ltd
Priority to CN202111651405.8A priority Critical patent/CN114528441A/en
Publication of CN114528441A publication Critical patent/CN114528441A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/901Indexing; Data structures therefor; Storage structures
    • G06F16/9024Graphs; Linked lists
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/906Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a graph structure data node classification method, a device, and electronic equipment. The method comprises: inputting the graph structure data into a preset graph neural network model to obtain the features of each node in the graph structure data; for each labeled node in the graph structure data, determining a prediction label of the node according to its features; repairing the features of the labeled nodes according to the residual between their prediction labels and truth labels to obtain the target features of each labeled node; adjusting the features of each unlabeled node according to the target features of the labeled nodes and the graph connection relationship between the labeled nodes and the unlabeled nodes; and determining a node classification result of the graph structure data according to the adjusted target features of the unlabeled nodes. This eliminates the influence on node classification of the node feature over-smoothing problem caused by the graph neural network model, and improves the accuracy of the node classification result of the graph structure data.

Description

Graph structure data node classification method and device and electronic equipment
Technical Field
The present application relates to the field of network technologies, and in particular, to a method and an apparatus for classifying graph structure data nodes, and an electronic device.
Background
Graph structure data describes various complex data objects through the features of nodes and the connection relations between nodes. For example, each face image in monitoring data is used as a node, and if two face images are similar, an edge connects the nodes corresponding to the two face images, so that the information of the whole monitoring data is integrated.
In the prior art, node features are generated by aggregating features of adjacent nodes at each layer based on a preset graph neural network, and then node classification is performed according to the node features.
However, as the number of network layers of the graph neural network increases and the number of iterations increases, the node characteristics are subject to an over-smoothing problem, and the accuracy of the node classification result of the graph structure data is reduced.
Disclosure of Invention
The application provides a method and a device for classifying nodes of graph structure data, and electronic equipment, which aim to overcome defects of the prior art such as the reduced accuracy of node classification results for graph structure data.
The first aspect of the present application provides a graph structure data node classification method, including:
acquiring graph structure data to be subjected to node classification; wherein the graph structure data comprises labeled nodes and unlabeled nodes;
inputting the graph structure data into a preset graph neural network model to obtain the characteristics of each node in the graph structure data;
for each labeled node in the graph structure data, determining a predictive label of each labeled node according to the characteristics of each labeled node;
repairing the characteristics of the labeled nodes according to the residual errors between the prediction labels and the true value labels of the labeled nodes to obtain the target characteristics of each labeled node;
adjusting the characteristics of each non-label node according to the target characteristics of each labeled node and the graph connection relation between each labeled node and the non-label node to obtain the target characteristics of each non-label node;
and determining a node classification result of the graph structure data according to the target characteristics of the non-label nodes.
Optionally, the repairing the characteristic of the labeled node according to a residual between the prediction label and the true label of the labeled node to obtain a target characteristic of each labeled node includes:
repairing the characteristics of the labeled nodes according to the prediction error represented by the residual error between the prediction label and the true value label of the labeled nodes;
predicting the prediction label of the labeled node according to the repaired characteristics, and returning to the step of repairing the characteristics of the labeled node according to the prediction error represented by the residual error between the prediction label and the true value label of the labeled node;
and when the prediction error represented by the residual error between the prediction label and the truth label of the labeled node reaches a preset standard, determining the characteristic currently used for predicting the prediction label as the target characteristic of the labeled node.
Optionally, the inputting the graph structure data into a preset graph neural network model to obtain characteristics of each node in the graph structure data includes:
and determining the characteristics of each node in the graph structure data according to the characteristics of adjacent nodes connected with the node based on the graph neural network model.
Optionally, the adjusting the characteristics of each non-labeled node according to the target characteristics of each labeled node and the graph connection relationship between each labeled node and the non-labeled node includes:
determining target non-label nodes forming an adjacent relation with the labeled nodes according to the graph connection relation between the labeled nodes and the non-label nodes;
and adjusting the characteristics of the target non-labeled nodes according to the target characteristics of the labeled nodes.
Optionally, the method further includes:
and adjusting the characteristics of other non-label nodes according to the graph connection relationship between the non-label nodes with the adjusted characteristics and other non-label nodes.
Optionally, the adjusting the characteristics of the target unlabeled node according to the target characteristics of the labeled node includes:
and re-determining the characteristics of the target non-labeled nodes according to the target characteristics of the labeled nodes based on the graph neural network model.
Optionally, the determining a node classification result of the graph structure data according to the target feature of each non-tag node includes:
determining a prediction label of each label-free node according to the target characteristic of each label-free node;
and classifying the non-label nodes according to the prediction labels of the non-label nodes to obtain the node classification result of the graph structure data.
A second aspect of the present application provides a device for classifying nodes of a graph structure, including:
the acquisition module is used for acquiring the graph structure data to be subjected to node classification; wherein the graph structure data comprises labeled nodes and unlabeled nodes;
the characteristic extraction module is used for inputting the graph structure data into a preset graph neural network model to obtain the characteristics of each node in the graph structure data;
the prediction module is used for determining a prediction tag of each tagged node according to the characteristics of each tagged node in the graph structure data;
the characteristic repairing module is used for repairing the characteristics of the labeled nodes according to the residual errors between the predicted labels and the true labels of the labeled nodes to obtain the target characteristics of the labeled nodes;
the characteristic adjusting module is used for adjusting the characteristics of each non-label node according to the target characteristics of each labeled node and the graph connection relation between each labeled node and each non-label node to obtain the target characteristics of each non-label node;
and the classification module is used for determining the node classification result of the graph structure data according to the target characteristics of the label-free nodes.
Optionally, the feature repairing module is specifically configured to:
repairing the characteristics of the labeled nodes according to the prediction error represented by the residual error between the prediction label and the true value label of the labeled nodes;
predicting the prediction label of the labeled node according to the repaired characteristics, and returning to the step of repairing the characteristics of the labeled node according to the prediction error represented by the residual error between the prediction label and the true value label of the labeled node;
and when the prediction error represented by the residual error between the prediction label and the truth label of the labeled node reaches a preset standard, determining the characteristic currently used for predicting the prediction label as the target characteristic of the labeled node.
Optionally, the feature extraction module is specifically configured to:
and determining the characteristics of each node in the graph structure data according to the characteristics of adjacent nodes connected with the node based on the graph neural network model.
Optionally, the feature adjusting module is specifically configured to:
determining target non-label nodes forming an adjacent relation with the labeled nodes according to the graph connection relation between the labeled nodes and the non-label nodes;
and adjusting the characteristics of the target non-labeled nodes according to the target characteristics of the labeled nodes.
Optionally, the feature adjusting module is further configured to:
and adjusting the characteristics of other non-label nodes according to the graph connection relationship between the non-label nodes with the adjusted characteristics and other non-label nodes.
Optionally, the feature adjusting module is specifically configured to:
and re-determining the characteristics of the target non-labeled nodes according to the target characteristics of the labeled nodes based on the graph neural network model.
Optionally, the classification module is specifically configured to:
determining a prediction label of each label-free node according to the target characteristic of each label-free node;
and classifying the non-label nodes according to the prediction labels of the non-label nodes to obtain the node classification result of the graph structure data.
A third aspect of the present application provides an electronic device, comprising: at least one processor and memory;
the memory stores computer-executable instructions;
the at least one processor executes computer-executable instructions stored by the memory to cause the at least one processor to perform the method as set forth in the first aspect above and in various possible designs of the first aspect.
A fourth aspect of the present application provides a computer-readable storage medium having stored thereon computer-executable instructions that, when executed by a processor, implement a method as set forth in the first aspect and various possible designs of the first aspect.
The technical solution of the present application has the following advantages:
the application provides a graph structure data node classification method, a device and electronic equipment, wherein the method comprises the following steps: acquiring graph structure data to be subjected to node classification; the graph structure data comprises labeled nodes and unlabeled nodes; inputting the graph structure data into a preset graph neural network model to obtain the characteristics of each node in the graph structure data; determining a prediction tag of each tagged node according to the characteristics of each tagged node aiming at each tagged node in the graph structure data; repairing the characteristics of the nodes with the labels according to the residual errors between the prediction labels and the true value labels of the nodes with the labels to obtain the target characteristics of each node with the labels; adjusting the characteristics of each non-label node according to the target characteristics of each labeled node and the graph connection relationship between each labeled node and each non-label node to obtain the target characteristics of each non-label node; and determining a node classification result of the graph structure data according to the target characteristics of each label-free node. According to the method provided by the scheme, the characteristics of the labeled nodes directly obtained by the graph neural network model are repaired, the characteristics of the label-free nodes are adaptively adjusted, the influence of the node characteristic over-smoothness problem caused by the graph neural network model on node classification is eliminated, and the accuracy of the node classification result of the graph structure data is improved.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present application, and other drawings can be obtained by those skilled in the art according to these drawings.
Fig. 1 is a schematic flowchart of a graph structure data node classification method according to an embodiment of the present application;
FIG. 2 is a schematic diagram illustrating exemplary structure data of a graph according to an embodiment of the present disclosure;
FIG. 3 is a block diagram of another exemplary graph structure data provided by an embodiment of the present application;
fig. 4 is a schematic structural diagram of a graph structure data node classification device according to an embodiment of the present application;
fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
With the above figures, there are shown specific embodiments of the present application, which will be described in more detail below. These drawings and written description are not intended to limit the scope of the disclosed concepts in any way, but rather to illustrate the concepts of the disclosure to those skilled in the art by reference to specific embodiments.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Furthermore, the terms "first", "second", etc. are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. In the description of the following examples, "plurality" means two or more unless specifically limited otherwise.
In the prior art, node features are generated by aggregating features of adjacent nodes of each layer based on a preset graph neural network, and then node classification is performed according to the node features. However, as the number of network layers of the graph neural network increases and the number of iterations increases, the node characteristics are subject to an over-smoothing problem, and the accuracy of the node classification result of the graph structure data is reduced.
In order to solve the above problems, in the method, the device and the electronic device for classifying nodes of graph structure data provided by the embodiment of the application, graph structure data to be subjected to node classification is obtained; the graph structure data comprises labeled nodes and unlabeled nodes; inputting the graph structure data into a preset graph neural network model to obtain the characteristics of each node in the graph structure data; determining a prediction tag of each tagged node according to the characteristics of each tagged node aiming at each tagged node in the graph structure data; repairing the characteristics of the nodes with the labels according to the residual errors between the prediction labels and the true value labels of the nodes with the labels to obtain the target characteristics of each node with the labels; adjusting the characteristics of each non-label node according to the target characteristics of each labeled node and the graph connection relationship between each labeled node and each non-label node to obtain the target characteristics of each non-label node; and determining a node classification result of the graph structure data according to the target characteristics of each label-free node. According to the method provided by the scheme, the characteristics of the labeled nodes directly obtained by the graph neural network model are repaired, the characteristics of the label-free nodes are adaptively adjusted, the influence of the node characteristic over-smoothness problem caused by the graph neural network model on node classification is eliminated, and the accuracy of the node classification result of the graph structure data is improved.
The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments. Embodiments of the present invention will be described below with reference to the accompanying drawings.
The embodiment of the application provides a graph structure data node classification method, which is used for classifying nodes of graph structure data so as to realize data clustering. The execution subject of the embodiment of the present application is an electronic device, such as a server, a desktop computer, a notebook computer, a tablet computer, and other electronic devices that can be used to classify nodes of graph structure data.
As shown in fig. 1, a schematic flowchart of a graph structure data node classification method provided in an embodiment of the present application is shown, where the method includes:
step 101, obtaining graph structure data to be subjected to node classification.
The graph structure data includes labeled nodes and unlabeled nodes. The graph structure data is composed of nodes and the edges that connect them, and can be written as G = (V, E), where V denotes the nodes and E denotes the edges.
For example, when applied to the field of human image filing, the graph structure data may be monitoring data in which each node corresponds to one face image. If two face images are similar, an edge connects the nodes corresponding to the two face images. A node whose identity has been determined is a labeled node, and a node whose identity has not been determined is an unlabeled node. In general, labeled nodes are nodes whose specific categories have been determined; such nodes can be used as training samples, and their categories are their labels. Unlabeled nodes are nodes whose specific categories have not yet been determined.
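The graph described above can be sketched as a small data structure. This is only an illustration: the node names, edges, and labels below are hypothetical and do not come from the application.

```python
# Hypothetical toy graph: node ids, undirected similarity edges, and known labels.
nodes = ["a", "b", "c", "d"]                   # V
edges = [("a", "b"), ("b", "c"), ("c", "d")]   # E
labels = {"a": "person_1", "d": "person_2"}    # only labeled nodes appear here


def build_adjacency(nodes, edges):
    """Map each node to the set of nodes it shares an edge with."""
    adjacency = {node: set() for node in nodes}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency


adjacency = build_adjacency(nodes, edges)
labeled = [n for n in nodes if n in labels]        # identity already determined
unlabeled = [n for n in nodes if n not in labels]  # identity still unknown
```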
Step 102, inputting the graph structure data into a preset graph neural network model to obtain the features of each node in the graph structure data.
The graph neural network model may be constructed based on a machine learning network such as a graph convolutional network (GCN) or a graph attention network (GAT). For the specific construction process of the graph neural network model, reference may be made to the prior art; it is not limited in the embodiments of the present application.
Specifically, the graph neural network model may determine the characteristics of each node in the graph structure data according to the inter-node similarity represented by the connection relationship between each node in the obtained graph structure data.
In the graph structure data, if there is a certain similarity between two nodes, there is a connection relationship between them, that is, the two nodes are adjacent. The features of a given node can therefore be determined according to the features of its adjacent nodes.
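A minimal sketch of determining a node's features from its adjacent nodes is shown below. It assumes plain mean aggregation over the node and its neighbors; an actual GCN or GAT layer would also apply learned weights, so this is an illustration rather than the model used in the application.

```python
def aggregate(features, adjacency):
    """Return new features: the mean of each node's own and its neighbors' vectors."""
    updated = {}
    for node, neighbors in adjacency.items():
        group = [features[node]] + [features[n] for n in sorted(neighbors)]
        dim = len(features[node])
        updated[node] = [sum(vec[i] for vec in group) / len(group) for i in range(dim)]
    return updated


# Hypothetical features and edges for a three-node path graph a - b - c.
features = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.0, 0.0]}
adjacency = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
layer1 = aggregate(features, adjacency)
```

Applying `aggregate` repeatedly pulls all node features toward a common value, which is the over-smoothing effect the Background section describes.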
Step 103, for each labeled node in the graph structure data, determining the prediction label of the labeled node according to its features.
Specifically, the labeled nodes can be used to train the graph neural network model, so that the graph neural network model can predict the corresponding prediction label for each labeled node based on the features of that node.
Step 104, repairing the features of the labeled nodes according to the residual between the prediction labels and the truth labels of the labeled nodes to obtain the target features of each labeled node.
It should be noted that both the prediction labels output by the graph neural network model and the truth labels are in vector-coded form. Specifically, the prediction error of the current graph neural network, that is, the residual between the prediction labels and the truth labels of the labeled nodes, may be determined from the prediction labels and truth labels of the labeled nodes based on a preset loss function, such as a cross-entropy loss function.
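The relation between vector-coded labels, the residual, and a cross-entropy loss can be illustrated as follows. The probability values are hypothetical, and the small `eps` term is an assumption added only to avoid taking the logarithm of zero.

```python
import math


def residual(predicted, truth):
    """Element-wise difference between the truth label and the predicted label vector."""
    return [t - p for p, t in zip(predicted, truth)]


def cross_entropy(predicted, truth, eps=1e-12):
    """Cross-entropy of a predicted probability vector against a one-hot truth label."""
    return -sum(t * math.log(p + eps) for p, t in zip(predicted, truth))


predicted = [0.7, 0.2, 0.1]  # hypothetical model output for one labeled node
truth = [1.0, 0.0, 0.0]      # truth label in vector-coded (one-hot) form
r = residual(predicted, truth)
loss = cross_entropy(predicted, truth)
```

A residual of all zeros corresponds to a perfect prediction, in which case the loss is also (numerically) zero.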
Specifically, it can be assumed that the prediction error of the current graph neural network model is caused by inaccurate features of the labeled nodes. Therefore, the features of the labeled nodes can be repaired, with the goal of reducing the residual between the prediction labels and the truth labels of the labeled nodes, to obtain the corresponding target features.
Step 105, adjusting the features of each unlabeled node according to the target features of each labeled node and the graph connection relation between the labeled nodes and the unlabeled nodes to obtain the target features of each unlabeled node.
Specifically, since the graph neural network model determines the features of a node from the features of its adjacent nodes, after the features of the labeled nodes in the graph structure data are repaired, the features of the unlabeled nodes adjacent to the labeled nodes can be adaptively adjusted based on the repaired features (target features) of the labeled nodes, to obtain the target features of the unlabeled nodes in the current graph structure data.
Step 106, determining a node classification result of the graph structure data according to the target features of the unlabeled nodes.
The target feature of a node is usually an M-dimensional real vector, so the features of N nodes form a matrix $X \in \mathbb{R}^{N \times M}$. The node classes are $Y \in W^{N \times C}$, where the label categories are one-hot encoded and C is the number of categories. The adjacency matrix $S \in W^{N \times N}$ identifies the connection relations in the graph structure data.
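The shapes of these three matrices can be made concrete with a toy example. The sizes and values are hypothetical, and representing an unlabeled node by an all-zero row of Y is an assumption made for illustration only.

```python
def one_hot(class_index, num_classes):
    """Encode a class index as a length-C vector containing a single 1."""
    return [1 if i == class_index else 0 for i in range(num_classes)]


def shape(matrix):
    """Return (rows, columns) of a list-of-lists matrix."""
    return (len(matrix), len(matrix[0]))


N, M, C = 3, 2, 2                             # hypothetical sizes
X = [[0.2, 0.8], [0.9, 0.1], [0.5, 0.5]]      # N x M feature matrix
Y = [one_hot(1, C), one_hot(0, C), [0] * C]   # N x C; all-zero row = unlabeled (assumption)
S = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]         # N x N adjacency matrix
```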
In particular, the node features (target features) and the adjacency matrix may be used as inputs to the graph neural network model, so that the classes of the unlabeled nodes are predicted based on the graph neural network model.
Specifically, in an embodiment, the predicted label of each non-label node may be determined according to the target feature of each non-label node; and classifying the non-label nodes according to the prediction labels of the non-label nodes to obtain the node classification result of the graph structure data.
Specifically, the target features and the prediction labels are both real-valued vectors. The correspondences between different features and prediction labels can be preset, and a large number of such correspondences are stored in a preset database. After the target feature of an unlabeled node is obtained, its prediction label is determined by looking up the correspondences preset in the database.
Further, after the prediction labels of the plurality of unlabeled nodes are obtained, the unlabeled nodes may be clustered according to the prediction label of each unlabeled node to obtain a node classification result of the graph structure data. The node classification result indicates which non-label nodes are of one type and which prediction label the non-label nodes of the type correspond to.
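Grouping unlabeled nodes by their prediction labels can be sketched as below. The node ids and label names are hypothetical.

```python
from collections import defaultdict


def group_by_label(predicted_labels):
    """Collect node ids under their prediction label."""
    groups = defaultdict(list)
    for node, label in predicted_labels.items():
        groups[label].append(node)
    return dict(groups)


# Hypothetical prediction labels for three unlabeled nodes.
predicted_labels = {"n1": "person_1", "n2": "person_2", "n3": "person_1"}
result = group_by_label(predicted_labels)
```

The resulting mapping states both which nodes belong together and which prediction label each group corresponds to.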
Specifically, in order to improve node classification efficiency, a multilayer perceptron (MLP) may be added on top of the graph neural network model. The multilayer perceptron determines the prediction label of each unlabeled node according to its target features, and the unlabeled nodes are then classified according to the prediction labels to obtain the node classification result of the graph structure data.
For the training process of the multilayer perceptron, reference may be made to the prior art; it is not limited in the embodiments of the present application.
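A forward pass through a small multilayer perceptron might look like the following sketch. The two-layer shape, the ReLU activation, and the hand-picked weights are all assumptions; the application does not specify the architecture, and a real perceptron would be trained on the labeled nodes.

```python
import math


def dense(x, weights, bias):
    """Affine layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(x):
    shift = max(x)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def mlp_predict(feature, W1, b1, W2, b2):
    """Two-layer perceptron: return the index of the most probable class."""
    probs = softmax(dense(relu(dense(feature, W1, b1)), W2, b2))
    return probs.index(max(probs))


# Hypothetical, hand-picked weights used only to make the example runnable.
W1, b1 = [[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]
W2, b2 = [[2.0, 0.0], [0.0, 2.0]], [0.0, 0.0]
class_a = mlp_predict([0.9, 0.1], W1, b1, W2, b2)
class_b = mlp_predict([0.1, 0.9], W1, b1, W2, b2)
```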
It should be noted that, in some complex application scenarios, graph structure data generally consists of a large number of nodes and the data scale is large. To avoid overloading the graph neural network model, a neighbor sampler can be used to sample subgraphs from the graph structure data, dividing the whole graph structure data into a plurality of subgraphs. Node classification is performed on each subgraph, and after the node classification of all subgraphs is completed, the classification results are summarized to obtain the node classification result of the graph structure data.
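The subgraph approach can be sketched as follows, assuming a simple one-hop neighbor sampler with a fixed fanout. The function names, the batching scheme, and the placeholder classifier are hypothetical; the application does not describe the sampler in detail.

```python
import random


def sample_subgraph(seeds, adjacency, fanout, rng):
    """Return the seed nodes plus at most `fanout` randomly sampled neighbors per seed."""
    nodes = set(seeds)
    for seed in seeds:
        neighbors = sorted(adjacency[seed])
        nodes.update(rng.sample(neighbors, min(fanout, len(neighbors))))
    return nodes


def classify_in_batches(adjacency, batch_size, fanout, classify, seed=0):
    """Classify seed nodes batch by batch on sampled subgraphs, then merge the results."""
    rng = random.Random(seed)
    order = sorted(adjacency)
    results = {}
    for start in range(0, len(order), batch_size):
        seeds = order[start:start + batch_size]
        subgraph = sample_subgraph(seeds, adjacency, fanout, rng)
        results.update(classify(seeds, subgraph))  # summarize per-subgraph results
    return results


adjacency = {0: {1, 2}, 1: {0}, 2: {0, 3}, 3: {2}}
# Placeholder classifier (assumption): tags each seed with the size of its subgraph.
summary = classify_in_batches(adjacency, 2, 1, lambda seeds, sub: {s: len(sub) for s in seeds})
```

Each node is classified exactly once as a seed, so the merged dictionary covers the whole graph even though no single call sees all of it.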
On the basis of the foregoing embodiments, as an implementable manner, in an embodiment, the repairing the characteristics of the labeled nodes according to the residual between the prediction labels and the true labels of the labeled nodes to obtain the target characteristics of each labeled node includes:
step 1041, repairing the characteristics of the labeled node according to a prediction error represented by a residual error between a prediction label of the labeled node and a true value label;
step 1042, predicting the prediction label of the labeled node according to the repaired characteristics, and returning to the step of repairing the characteristics of the labeled node according to the prediction error represented by the residual error between the prediction label of the labeled node and the true value label;
step 1043, when the prediction error represented by the residual error between the prediction label of the labeled node and the true value label reaches the preset standard, determining the feature currently used for predicting the prediction label as the target feature of the labeled node.
In particular, a label propagation algorithm may be used to repair the features of the labeled nodes with the goal of reducing prediction errors. Feature repair refers to modifying the specific numerical values of one or more elements in the real vector corresponding to the feature.
Specifically, each time the features of a labeled node are repaired, label prediction is performed again to obtain the residual corresponding to the current features. When the prediction error represented by the residual between the prediction label and the truth label of the labeled node does not reach the preset standard, steps 1041 to 1042 are executed repeatedly until the prediction error reaches the preset standard. The preset standard may be, for example, a prediction error of 0.
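The repair, predict, and check loop of steps 1041 to 1043 can be sketched as below. The repair rule here is a swapped-in technique: a gradient step on the cross-entropy loss with respect to the feature, under a fixed linear softmax classifier. That rule, the weights `W`, the step size, and the nonzero tolerance are all assumptions chosen to keep the example runnable; the application itself names a label propagation algorithm for the repair.

```python
import math


def softmax(z):
    shift = max(z)
    exps = [math.exp(v - shift) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def predict(feature, W):
    """Linear softmax classifier: W holds one row of weights per class."""
    return softmax([sum(w * x for w, x in zip(row, feature)) for row in W])


def repair_feature(feature, truth, W, step=0.5, tolerance=0.05, max_rounds=1000):
    """Repeat: predict, measure the residual, nudge the feature; stop at the preset standard."""
    feature = list(feature)
    for _ in range(max_rounds):
        residual = [t - p for t, p in zip(truth, predict(feature, W))]
        if max(abs(r) for r in residual) <= tolerance:  # preset standard reached
            break
        # Gradient step on cross-entropy w.r.t. the feature: x += step * W^T (truth - pred)
        for i in range(len(feature)):
            feature[i] += step * sum(residual[c] * W[c][i] for c in range(len(W)))
    return feature


W = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical fixed classifier weights
target = repair_feature([0.0, 1.0], [1.0, 0.0], W)
```

The returned vector plays the role of the target feature: the feature that was in use when the prediction error first met the standard.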
For example, whether the prediction error represented by the current residual reaches the preset standard may be detected based on the following formula:
Y=(1-λ)Y+λSY
where Y represents the residual between the prediction label and the true value label; S represents the normalized adjacency matrix, the adjacency matrix being used to represent the node connection relationship in the graph structure data; and λ is a hyperparameter that controls the influence of the initialization stage on the iteration result, whose specific value can be adjusted according to the practical application. Specifically, iteration can be performed based on the formula until it converges, and if Y converges, it can be determined that the prediction error represented by the current residual reaches the preset standard.
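A minimal sketch of iterating this formula to convergence is given below. It makes two assumptions that the source does not state: the first term uses the initial residual `Y0` (as in standard label spreading), and S is the symmetric normalization D^(-1/2) A D^(-1/2). The function names, `tol`, and `max_iters` are hypothetical.

```python
import numpy as np

def normalized_adjacency(A):
    """Symmetric normalization D^(-1/2) A D^(-1/2); isolated nodes keep zero rows."""
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1)
    d_inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    d_inv_sqrt[nonzero] = deg[nonzero] ** -0.5
    return d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

def propagate_residual(A, Y0, lam=0.8, tol=1e-6, max_iters=1000):
    """Iterate Y <- (1 - lam) * Y0 + lam * S @ Y until successive iterates agree.

    Assumption: the (1 - lam) term is anchored to the initial residual Y0.
    Returns the final Y and a flag telling whether the iteration converged.
    """
    S = normalized_adjacency(A)
    Y0 = np.asarray(Y0, dtype=float)
    Y = Y0.copy()
    for _ in range(max_iters):
        Y_next = (1 - lam) * Y0 + lam * (S @ Y)
        if np.abs(Y_next - Y).max() < tol:
            return Y_next, True
        Y = Y_next
    return Y, False
```

With 0 < lam < 1 the update is a contraction, so under these assumptions the iteration approaches the fixed point (1 - lam)(I - lam S)^(-1) Y0.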
On the basis of the foregoing embodiment, as an implementable manner, in an embodiment, inputting the graph structure data into a preset graph neural network model to obtain features of each node in the graph structure data includes:
step 1021, determining, based on the graph neural network model, the characteristics of each node in the graph structure data according to the characteristics of the adjacent nodes connected with that node.
It should be noted that, in the graph structure data, there is a certain similarity between two nodes in which an edge connection relationship exists, where the degree of similarity may be represented based on the weight of a connection edge.
Correspondingly, in an embodiment, a target non-tag node forming an adjacency relation with each tagged node may be determined according to a graph connection relation between each tagged node and a non-tag node; and adjusting the characteristics of the target non-labeled nodes according to the target characteristics of the labeled nodes.
Fig. 2 is a schematic structural diagram of exemplary graph structure data provided in the embodiment of the present application, in which 1.1, 1.2, and 1.3 denote labeled nodes, and 2.1, 2.2, and 2.3 denote target non-labeled nodes forming an adjacency relation with the labeled nodes.
Specifically, in one embodiment, the features of the target unlabeled nodes may be re-determined from the target features of the labeled nodes based on the graph neural network model.
The specific determination rule of the node feature may be set according to an actual requirement, for example, the feature average value of the adjacent node is determined as the feature of the current node.
For example, taking node 2.1 in fig. 2 as an example, if the labeled nodes adjacent to it are node 1.1 and node 1.2, the average of the features of node 1.1 and node 1.2 may be determined as the feature of node 2.1. Fig. 3 is a schematic structural diagram of another exemplary graph structure data provided in the embodiment of the present application, in which a weight is assigned to each edge of the graph structure data, the weight representing the degree of similarity between the two nodes connected by the edge. In this case, the weighted average of the features of node 1.1 and node 1.2 may be determined as the feature of node 2.1.
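The averaging rule for node 2.1 can be illustrated with a short sketch. The feature vectors and edge weights below are made-up values chosen only for illustration, and `neighbor_average` is a hypothetical helper name.

```python
import numpy as np

def neighbor_average(features, neighbors, weights=None):
    """Average the feature vectors of the given neighbors.

    features: dict mapping node id -> feature vector.
    weights: optional edge weights (one per neighbor); None gives a plain mean.
    """
    stacked = np.array([features[n] for n in neighbors], dtype=float)
    return np.average(stacked, axis=0, weights=weights)

# Made-up target features for labeled nodes 1.1 and 1.2.
features = {"1.1": [1.0, 0.0], "1.2": [0.0, 1.0]}

# Unweighted case (fig. 2): plain mean of the two labeled neighbors.
plain = neighbor_average(features, ["1.1", "1.2"])

# Weighted case (fig. 3): edge weights act as similarity degrees.
weighted = neighbor_average(features, ["1.1", "1.2"], weights=[0.75, 0.25])
```

Here `plain` is the midpoint of the two vectors, while `weighted` is pulled toward node 1.1 because its edge carries the larger weight.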
Further, in an embodiment, the characteristics of other non-labeled nodes may also be adjusted according to the graph connection relationship between the non-labeled node whose characteristic adjustment has been completed and other non-labeled nodes.
Specifically, based on a post-processing algorithm of label propagation, the features of other non-labeled nodes may be adjusted according to the graph connection relationship between the non-labeled node whose feature adjustment has been completed and other non-labeled nodes until there is no non-labeled node to be feature adjusted in the graph structure data.
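The outward adjustment described above can be sketched as a round-by-round propagation. This sketch assumes an unweighted graph stored as an adjacency dictionary and uses the plain mean of already-adjusted neighbors as the update rule; node "3.1" is a hypothetical node added for illustration and does not appear in the figures.

```python
import numpy as np

def propagate_features(adj, target_features):
    """Spread target features from labeled nodes to all reachable unlabeled nodes.

    adj: dict mapping node id -> list of neighbor ids.
    target_features: dict mapping labeled node id -> repaired feature vector.
    Each round adjusts the nodes adjacent to already-adjusted nodes, and the
    loop stops when a round adjusts nothing.
    """
    done = {n: np.asarray(f, dtype=float) for n, f in target_features.items()}
    while True:
        updates = {}
        for node, neighbors in adj.items():
            if node in done:
                continue
            sources = [done[m] for m in neighbors if m in done]
            if sources:
                updates[node] = np.mean(sources, axis=0)
        if not updates:          # no unlabeled node left to adjust
            return done
        done.update(updates)

adj = {
    "1.1": ["2.1"],
    "1.2": ["2.1", "2.2"],
    "2.1": ["1.1", "1.2", "3.1"],
    "2.2": ["1.2"],
    "3.1": ["2.1"],              # hypothetical second-hop unlabeled node
}
result = propagate_features(adj, {"1.1": [1.0, 0.0], "1.2": [0.0, 1.0]})
```

Updates are collected per round and applied together, so a node two hops from the labeled set is adjusted only after its one-hop neighbor has been adjusted.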
Specifically, when the graph structure data node classification method provided by the embodiment of the application is applied to face archiving, newly acquired snapshot samples can be put into a historical database. Each face snapshot is a node in the graph network structure, the node characteristics are the image characteristics of the face snapshot, and the edges between nodes represent the pairwise similarity of the face characteristics. The historically labeled data set provides the labeled nodes, and the newly acquired snapshot samples are the unlabeled data samples (unlabeled nodes). Based on the graph structure data node classification method provided by the embodiment, a graph attention model (the graph neural network model) is trained on the labeled data set, the trained model is used for prediction, the sample prediction accuracy is improved through a label propagation post-processing algorithm, and the archiving result is generated.
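One way to build such a graph from face embeddings is sketched below. The source does not specify the similarity measure or how edges are selected, so cosine similarity and the cut-off `threshold` are assumptions, and `build_similarity_graph` is a hypothetical name.

```python
import numpy as np

def build_similarity_graph(embeddings, threshold=0.5):
    """Return a weighted adjacency matrix from face feature vectors.

    Edge weight = cosine similarity between two snapshots (assumed measure).
    Similarities below `threshold` (assumed cut-off) and self-loops are dropped.
    """
    E = np.asarray(embeddings, dtype=float)
    E = E / np.linalg.norm(E, axis=1, keepdims=True)   # unit-length rows
    S = E @ E.T                                        # pairwise cosine similarity
    np.fill_diagonal(S, 0.0)                           # no self-loops
    S[S < threshold] = 0.0                             # keep only similar pairs
    return S

# Made-up embeddings: the first two snapshots are similar, the third is not.
graph = build_similarity_graph([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
```

The resulting matrix can serve as the weighted adjacency matrix, with historical snapshots as labeled nodes and new snapshots as unlabeled nodes.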
In the graph structure data node classification method provided by the embodiment of the application, graph structure data to be subjected to node classification is obtained, the graph structure data comprising labeled nodes and unlabeled nodes; the graph structure data is input into a preset graph neural network model to obtain the characteristics of each node in the graph structure data; for each labeled node in the graph structure data, a prediction label of the labeled node is determined according to its characteristics; the characteristics of the labeled nodes are repaired according to the residual errors between the prediction labels and the true value labels of the labeled nodes to obtain the target characteristics of each labeled node; the characteristics of each unlabeled node are adjusted according to the target characteristics of each labeled node and the graph connection relationship between each labeled node and each unlabeled node to obtain the target characteristics of each unlabeled node; and a node classification result of the graph structure data is determined according to the target characteristics of each unlabeled node. By repairing the characteristics of the labeled nodes obtained directly from the graph neural network model and adaptively adjusting the characteristics of the unlabeled nodes, the method eliminates the influence on node classification of the node feature over-smoothing problem caused by the graph neural network model and improves the accuracy of the node classification result of the graph structure data. The method can be applied to archiving newly added face snapshots, where it can improve the archiving accuracy.
The embodiment of the application provides a device for classifying nodes of graph structure data, which is used for executing the method for classifying the nodes of the graph structure data provided by the embodiment.
Fig. 4 is a schematic structural diagram of a graph structure data node classification device according to an embodiment of the present application. The graph structure data node classification device 40 includes: an acquisition module 401, a feature extraction module 402, a prediction module 403, a feature repair module 404, a feature adjustment module 405, and a classification module 406.
The acquisition module is used for acquiring graph structure data to be subjected to node classification; the graph structure data comprises labeled nodes and unlabeled nodes; the characteristic extraction module is used for inputting the graph structure data into a preset graph neural network model to obtain the characteristics of each node in the graph structure data; the prediction module is used for determining a prediction tag of each tagged node according to the characteristics of each tagged node in the graph structure data; the characteristic repairing module is used for repairing the characteristics of the nodes with the labels according to the residual error between the predicted labels and the true value labels of the nodes with the labels so as to obtain the target characteristics of each node with the labels; the characteristic adjusting module is used for adjusting the characteristics of each non-label node according to the target characteristics of each labeled node and the graph connection relation between each labeled node and each non-label node to obtain the target characteristics of each non-label node; and the classification module is used for determining the node classification result of the graph structure data according to the target characteristics of the non-label nodes.
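The module structure of fig. 4 can be pictured as a pipeline of six pluggable stages. The following is only a structural sketch: the class name, field names, and call signatures are hypothetical, and each stage is left as a caller-supplied function.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class GraphNodeClassificationDevice:
    """Chains the six modules in the order described for device 40."""
    acquire: Callable[[], Any]                  # acquisition module 401
    extract_features: Callable[[Any], Any]      # feature extraction module 402
    predict: Callable[[Any, Any], Any]          # prediction module 403
    repair: Callable[[Any, Any, Any], Any]      # feature repair module 404
    adjust: Callable[[Any, Any, Any], Any]      # feature adjustment module 405
    classify: Callable[[Any], Any]              # classification module 406

    def run(self) -> Any:
        graph = self.acquire()
        features = self.extract_features(graph)
        predictions = self.predict(graph, features)
        labeled_targets = self.repair(graph, features, predictions)
        unlabeled_targets = self.adjust(graph, features, labeled_targets)
        return self.classify(unlabeled_targets)
```

Keeping each module as a separate callable mirrors the logical division of modules in the description and lets any stage be replaced independently.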
Specifically, in an embodiment, the feature repairing module is specifically configured to:
repairing the characteristics of the labeled nodes according to the prediction error represented by the residual error between the prediction label of the labeled nodes and the true value label;
predicting the prediction label of the labeled node according to the repaired characteristics, and returning to the step of repairing the characteristics of the labeled node according to the prediction error represented by the residual error between the prediction label of the labeled node and the true value label;
and when the prediction error represented by the residual error between the prediction label and the truth label of the labeled node reaches a preset standard, determining the characteristic currently used for predicting the prediction label as the target characteristic of the labeled node.
Specifically, in an embodiment, the feature extraction module is specifically configured to:
and determining the characteristics of each node in the graph structure data according to the characteristics of adjacent nodes connected with the node based on the graph neural network model.
Specifically, in an embodiment, the feature adjusting module is specifically configured to:
determining target non-tag nodes forming an adjacent relation with the tagged nodes according to the graph connection relation between each tagged node and the non-tag nodes;
and adjusting the characteristics of the target non-labeled nodes according to the target characteristics of the labeled nodes.
Specifically, in an embodiment, the feature adjusting module is further configured to:
and adjusting the characteristics of other label-free nodes according to the graph connection relation between the label-free nodes with the adjusted characteristics and other label-free nodes.
Specifically, in an embodiment, the feature adjusting module is specifically configured to:
and based on the graph neural network model, re-determining the characteristics of the target non-labeled nodes according to the target characteristics of the labeled nodes.
Specifically, in an embodiment, the classification module is specifically configured to:
determining a prediction label of each label-free node according to the target characteristics of each label-free node;
and classifying the non-label nodes according to the prediction labels of the non-label nodes to obtain the node classification result of the graph structure data.
With regard to the graph structure data node classification apparatus in the present embodiment, the specific manner in which each module performs operations has been described in detail in the embodiment related to the method, and will not be elaborated here.
The graph structure data node classification device provided in the embodiment of the present application is configured to execute the graph structure data node classification method provided in the above embodiment; its implementation and principle are the same as those of the method and are not described again.
The embodiment of the application provides electronic equipment for executing the graph structure data node classification method provided by the embodiment.
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application. The electronic device 50 includes: at least one processor 51 and memory 52;
the memory stores computer-executable instructions; the at least one processor executes the computer-executable instructions stored by the memory to cause the at least one processor to perform the graph structure data node classification method provided by the above embodiments.
The electronic device provided in the embodiment of the present application is configured to execute the graph structure data node classification method provided in the foregoing embodiment; its implementation and principle are the same as those of the method and are not described again.
The embodiment of the present application provides a computer-readable storage medium, where computer-executable instructions are stored in the computer-readable storage medium, and when a processor executes the computer-executable instructions, the graph structure data node classification method provided in any embodiment above is implemented.
The storage medium containing the computer-executable instructions of the embodiment of the present application may be used to store the computer-executable instructions of the graph structure data node classification method provided in the foregoing embodiment; its implementation and principle are the same as those of the method and are not described again.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described apparatus embodiments are merely illustrative. For example, the division into units is only a logical division, and other divisions are possible in practice: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the shown or discussed mutual coupling, direct coupling, or communication connection may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be in an electrical, mechanical, or other form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus a software functional unit.
The integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions that enable a computer device (which may be a personal computer, a server, or a network device) or a processor to execute some steps of the methods according to the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
It is obvious to those skilled in the art that, for convenience and simplicity of description, the foregoing division of the functional modules is merely used as an example, and in practical applications, the above function distribution may be performed by different functional modules according to needs, that is, the internal structure of the device is divided into different functional modules to perform all or part of the above described functions. For the specific working process of the device described above, reference may be made to the corresponding process in the foregoing method embodiment, which is not described herein again.
Finally, it should be noted that: the above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present application.

Claims (10)

1. A method for classifying graph structure data nodes is characterized by comprising the following steps:
acquiring graph structure data to be subjected to node classification; wherein the graph structure data comprises labeled nodes and unlabeled nodes;
inputting the graph structure data into a preset graph neural network model to obtain the characteristics of each node in the graph structure data;
for each labeled node in the graph structure data, determining a predictive label of each labeled node according to the characteristics of each labeled node;
repairing the characteristics of the labeled nodes according to the residual errors between the prediction labels and the true value labels of the labeled nodes to obtain the target characteristics of each labeled node;
adjusting the characteristics of each non-label node according to the target characteristics of each labeled node and the graph connection relation between each labeled node and the non-label node to obtain the target characteristics of each non-label node;
and determining a node classification result of the graph structure data according to the target characteristics of the non-label nodes.
2. The method of claim 1, wherein the repairing the characteristics of the labeled nodes according to the residual errors between the prediction labels and the true value labels of the labeled nodes to obtain the target characteristics of each labeled node comprises:
repairing the characteristics of the labeled nodes according to the prediction error represented by the residual error between the prediction label and the true value label of the labeled nodes;
predicting the prediction label of the labeled node according to the repaired characteristics, and returning to the step of repairing the characteristics of the labeled node according to the prediction error represented by the residual error between the prediction label and the true value label of the labeled node;
and when the prediction error represented by the residual error between the prediction label and the truth label of the labeled node reaches a preset standard, determining the characteristic currently used for predicting the prediction label as the target characteristic of the labeled node.
3. The method according to claim 1, wherein the inputting the graph structure data into a preset graph neural network model to obtain the characteristics of each node in the graph structure data comprises:
and determining the characteristics of each node in the graph structure data according to the characteristics of adjacent nodes connected with the node based on the graph neural network model.
4. The method of claim 3, wherein the adjusting the characteristics of each of the unlabeled nodes based on the target characteristics of each of the labeled nodes and the graph connection relationship between each of the labeled nodes and the unlabeled nodes comprises:
determining target non-tag nodes forming an adjacent relation with the tagged nodes according to the graph connection relation between the tagged nodes and the non-tag nodes;
and adjusting the characteristics of the target non-labeled nodes according to the target characteristics of the labeled nodes.
5. The method of claim 4, further comprising:
and adjusting the characteristics of other non-label nodes according to the graph connection relationship between the non-label nodes with the adjusted characteristics and other non-label nodes.
6. The method of claim 4, wherein said adjusting the characteristics of the target unlabeled node based on the target characteristics of the labeled node comprises:
and re-determining the characteristics of the target non-labeled nodes according to the target characteristics of the labeled nodes based on the graph neural network model.
7. The method according to claim 1, wherein the determining the node classification result of the graph structure data according to the target feature of each non-label node comprises:
determining a prediction label of each label-free node according to the target characteristic of each label-free node;
and classifying the non-label nodes according to the prediction labels of the non-label nodes to obtain the node classification result of the graph structure data.
8. A graph structure data node classification apparatus, comprising:
the acquisition module is used for acquiring the graph structure data to be subjected to node classification; wherein the graph structure data comprises labeled nodes and unlabeled nodes;
the characteristic extraction module is used for inputting the graph structure data into a preset graph neural network model to obtain the characteristics of each node in the graph structure data;
the prediction module is used for determining a prediction tag of each tagged node according to the characteristics of each tagged node in the graph structure data;
the characteristic repairing module is used for repairing the characteristics of the labeled nodes according to the residual errors between the predicted labels and the true labels of the labeled nodes to obtain the target characteristics of the labeled nodes;
the characteristic adjusting module is used for adjusting the characteristics of each non-label node according to the target characteristics of each labeled node and the graph connection relation between each labeled node and each non-label node to obtain the target characteristics of each non-label node;
and the classification module is used for determining a node classification result of the graph structure data according to the target characteristics of the non-label nodes.
9. An electronic device, comprising: at least one processor and memory;
the memory stores computer-executable instructions;
execution of the computer-executable instructions stored by the memory by the at least one processor causes the at least one processor to perform the method of any one of claims 1 to 7.
10. A computer-readable storage medium having computer-executable instructions stored thereon which, when executed by a processor, implement the method of any one of claims 1 to 7.
CN202111651405.8A 2021-12-30 2021-12-30 Graph structure data node classification method and device and electronic equipment Pending CN114528441A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111651405.8A CN114528441A (en) 2021-12-30 2021-12-30 Graph structure data node classification method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111651405.8A CN114528441A (en) 2021-12-30 2021-12-30 Graph structure data node classification method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN114528441A true CN114528441A (en) 2022-05-24

Family

ID=81620566

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111651405.8A Pending CN114528441A (en) 2021-12-30 2021-12-30 Graph structure data node classification method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN114528441A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115293919A (en) * 2022-07-22 2022-11-04 浙江大学 Graph neural network prediction method and system oriented to social network distribution generalization
CN115293919B (en) * 2022-07-22 2023-08-04 浙江大学 Social network distribution outward generalization-oriented graph neural network prediction method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination