CN114282587A - Data processing method and device, computer equipment and storage medium

Data processing method and device, computer equipment and storage medium

Info

Publication number
CN114282587A
Authority
CN
China
Prior art keywords
node
labeled node
graph neural network
Legal status
Pending
Application number
CN202111034264.5A
Other languages
Chinese (zh)
Inventor
陈德里
林衍凯
赵光香
任宣丞
李鹏
周杰
孙栩
Current Assignee
Peking University
Tencent Technology Shenzhen Co Ltd
Original Assignee
Peking University
Tencent Technology Shenzhen Co Ltd
Application filed by Peking University and Tencent Technology (Shenzhen) Co., Ltd.
Priority to CN202111034264.5A
Publication of CN114282587A


Landscapes

  • Information Retrieval, DB Structures and FS Structures Therefor (AREA)

Abstract

The application discloses a data processing method and apparatus, a computer device, and a storage medium, belonging to the field of computer technologies. In the method, during parameter adjustment of a graph neural network, a conflict level parameter is determined for each labeled node to measure the node's topological position; a target weight is assigned to each labeled node on the basis of its conflict level parameter; and the target weights are introduced into the parameter adjustment process to modulate the influence of labeled nodes at different topological positions, for example, assigning a larger target weight to labeled nodes whose topological position is close to the class center and a smaller weight to labeled nodes whose topological position is close to the class boundary. This mitigates the class imbalance commonly present in graph neural networks and improves the recognition accuracy of the graph neural network.

Description

Data processing method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a data processing method and apparatus, a computer device, and a storage medium.
Background
With the development of computer technology, although traditional deep learning methods have been applied with great success to extracting features from Euclidean-space data, the data in many practical application scenarios (e.g., graph data) is generated from non-Euclidean spaces, and Graph Neural Networks (GNNs) have therefore become a research hotspot. A graph neural network is a neural network structure for processing graph data, where graph data is a data structure comprising nodes and edges; for example, each account in a social network may correspond to one node in the graph neural network, and when account A and account B have a friend relationship, there is a connecting edge between the nodes corresponding to account A and account B in the graph neural network.
The graph neural network can be used to process node classification tasks, that is, the category of each node can be identified based on the graph data of each node; for example, the graph neural network can process account classification tasks in a social network. In the training phase of the graph neural network, the label sets provided for different categories contain different numbers of labeled nodes, and the position distribution in the graph neural network of the labeled nodes contained in each label set is also unbalanced. As a result, the graph neural network shows significantly different prediction capabilities for different categories when processing a node classification task; for example, the recognition accuracy for category 1 is generally high, but the recognition accuracy for category 2 is generally low. Therefore, a method capable of improving the recognition accuracy of the graph neural network is needed.
Disclosure of Invention
The embodiment of the application provides a data processing method and device, computer equipment and a storage medium, which can improve the recognition accuracy of a graph neural network. The technical scheme is as follows:
in one aspect, a data processing method is provided, and the method includes:
acquiring respective conflict level parameters of a plurality of labeled nodes based on the position information of the labeled nodes in the graph neural network, wherein the conflict level parameters are used for representing the topological positions of the labeled nodes in the corresponding labeled categories;
acquiring target weights of the plurality of labeled nodes based on respective conflict level parameters of the plurality of labeled nodes, wherein the target weights are used for representing weighted influence factors introduced for the labeled nodes based on the topological positions;
and adjusting parameters of the graph neural network based on the respective target weights of the plurality of labeled nodes to obtain a target graph neural network, wherein the target graph neural network is used for identifying the category of each node in the graph neural network.
In one aspect, a data processing apparatus is provided, the apparatus comprising:
a first obtaining module, configured to obtain respective conflict level parameters of a plurality of labeled nodes based on position information of the labeled nodes in a graph neural network, where the conflict level parameters are used to characterize topological positions of the labeled nodes in corresponding labeled categories;
a second obtaining module, configured to obtain target weights of the plurality of labeled nodes based on respective conflict level parameters of the plurality of labeled nodes, where the target weights are used to characterize weighted influence factors introduced to the labeled nodes based on the topological positions;
and the parameter adjusting module is used for adjusting parameters of the graph neural network based on the respective target weights of the plurality of labeled nodes to obtain a target graph neural network, and the target graph neural network is used for identifying the category of each node in the graph neural network.
In one possible implementation, the first obtaining module includes:
a random walk unit, configured to perform random walk on any one of the plurality of labeled nodes from the labeled node to obtain a probability matrix of the labeled node, where the probability matrix is used to represent probability distribution of the labeled node stopping to any node in the graph neural network during random walk;
a first obtaining unit, configured to obtain, based on the probability matrix of the labeled node, a conflict expectation of the labeled node, where the conflict expectation is used to characterize the mathematical expectation of the possibility that any node obeying the probability distribution encounters a different category when the random walk stops, a different category being a category other than the labeled category corresponding to the labeled node;
and a determining unit, configured to determine the conflict expectation of the labeled node as the conflict level parameter of the labeled node.
In one possible implementation, the first obtaining unit is configured to:
for any target labeled node corresponding to any one of the different categories, determining the termination probability of starting a random walk from the target labeled node and stopping at any node obeying the probability distribution;
adding the termination probabilities of the target labeled nodes in any one of the different categories to obtain a first numerical value;
dividing the first numerical value by the number of target labeled nodes contained in any one of the different categories to obtain a second numerical value;
adding the second numerical values corresponding to the different categories to obtain a third numerical value;
and determining the mathematical expectation of each third numerical value corresponding to each node obeying the probability distribution as the conflict expectation of the labeled node.
In one possible implementation, in a case that the number of nodes included in the graph neural network is greater than a number threshold, the probability matrix of each labeled node is obtained based on a partial sampling of the nodes in the graph neural network.
In one possible implementation, the second obtaining module includes:
a second obtaining unit, configured to obtain a cosine annealing value of the labeled node for any one of the labeled nodes, where the cosine annealing value is used to represent a sorting condition of conflict level parameters of the labeled node in a corresponding labeled category;
and the third obtaining unit is used for obtaining the target weight of the labeled node based on the cosine annealing value, the minimum weight threshold and the maximum weight threshold of the labeled node.
In one possible implementation, the second obtaining unit is configured to:
determining the sorting order of the conflict level parameter of the labeled node in the corresponding labeled category;
and acquiring the cosine annealing value of the labeled node based on the sorting order and the number of labeled nodes contained in the labeled category.
In one possible implementation, the third obtaining unit is configured to:
adding one to the cosine annealing value to obtain a fourth numerical value;
multiplying the fourth value by the difference between the maximum weight threshold and the minimum weight threshold to obtain a fifth value;
and adding one half of the fifth numerical value to the minimum weight threshold value to obtain the target weight of the labeled node.
In one possible implementation, the parameter adjusting module is configured to:
determining a plurality of prediction probabilities that each of the plurality of labeled nodes respectively corresponds to a plurality of categories based on the graph neural network, the prediction probabilities being used to characterize the likelihood that the labeled node corresponds to each category;
obtaining a loss function value of the iteration based on a plurality of prediction probabilities of the plurality of labeled nodes, label categories of the labeled nodes and target weights of the labeled nodes;
and in response to the loss function value not meeting a stop condition, iteratively adjusting the parameters of the graph neural network until the loss function value meets the stop condition, then stopping the iteration to obtain the target graph neural network.
In one possible embodiment, each node in the graph neural network corresponds to each account in a social network; the category to which each node belongs corresponds to the account category to which each account belongs.
In one possible implementation, each node in the graph neural network corresponds to each article having a reference relationship; the category to which each node belongs corresponds to the article category to which each article belongs.
In one aspect, a computer device is provided, the computer device comprising one or more processors and one or more memories, the one or more memories storing therein at least one computer program, the at least one computer program being loaded and executed by the one or more processors to implement the data processing method of any one of the possible implementations as described above.
In one aspect, a storage medium is provided, in which at least one computer program is stored, the at least one computer program being loaded and executed by a processor to implement the data processing method according to any one of the above possible implementations.
In one aspect, a computer program product or computer program is provided that includes one or more program codes stored in a computer readable storage medium. The one or more processors of the computer device can read the one or more program codes from the computer-readable storage medium, and the one or more processors execute the one or more program codes, so that the computer device can execute the data processing method of any one of the above-mentioned possible embodiments.
The beneficial effects brought by the technical scheme provided by the embodiment of the application at least comprise:
in the parameter adjustment process of the graph neural network, a conflict level parameter is determined for each labeled node to measure the node's topological position, a target weight is assigned to each labeled node on the basis of its conflict level parameter, and the target weights are introduced into the parameter adjustment process to modulate the influence of labeled nodes at different topological positions, for example, assigning a larger target weight to labeled nodes whose topological position is close to the class center and a smaller weight to labeled nodes whose topological position is close to the class boundary, so that the class imbalance commonly present in graph neural networks can be mitigated and the recognition accuracy of the graph neural network improved.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings based on these drawings without creative effort.
Fig. 1 is a schematic diagram of an implementation environment of a data processing method according to an embodiment of the present application;
fig. 2 is a flowchart of a data processing method provided in an embodiment of the present application;
fig. 3 is a flowchart of a data processing method provided in an embodiment of the present application;
fig. 4 is a schematic diagram of a data processing method provided in an embodiment of the present application;
FIG. 5 is a graph of the comparison of model performance provided by the examples of the present application;
fig. 6 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application;
FIG. 7 is a schematic structural diagram of a computer device according to an embodiment of the present application;
fig. 8 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
The terms "first," "second," and the like in this application are used for distinguishing between similar items and items that have substantially the same function or similar functionality, and it should be understood that "first," "second," and "nth" do not have any logical or temporal dependency or limitation on the number or order of execution.
The term "at least one" in this application means one or more, and the meaning of "a plurality" means two or more, for example, a plurality of first locations means two or more first locations.
Artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human Intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, including both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, machine learning/deep learning, automatic driving, intelligent traffic, and the like.
Machine Learning (ML) is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, and other disciplines. It specializes in studying how computers simulate or implement human learning behaviors to acquire new knowledge or skills and to reorganize existing knowledge structures so as to continuously improve performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and it is applied in all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and demonstration learning.
With the research and progress of artificial intelligence technology, it has been developed and applied in many fields, such as smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, unmanned driving, automatic driving, unmanned aerial vehicles, robots, smart medical services, smart customer service, Internet of Vehicles, intelligent traffic, and the like.
The technical solution provided by the embodiments of the present application relates to machine learning and other artificial intelligence technologies. Although traditional deep learning algorithms have been applied with great success to extracting features from Euclidean-space data, the data in many practical application scenarios is generated from non-Euclidean spaces, and the performance of traditional deep learning algorithms on non-Euclidean data remains unsatisfactory. For example, in e-commerce, a learning system based on Graph data can make very accurate recommendations by exploiting the interactions between users (User) and products (Item), but the complexity of graph data poses huge challenges for traditional deep learning algorithms. This is because graph data is irregular: each graph contains a variable-sized set of unordered nodes, and each node in the graph data has a different number of neighboring nodes, so some important operations (e.g., convolution) that are easily computed on images (Image) are no longer suitable for direct use on graph data. Furthermore, a core assumption of traditional deep learning algorithms is that data samples (also referred to as samples) are independent of each other; this does not hold for graph data, where each node has edges relating it to other nodes, and this information can be used to capture the interdependence between nodes (i.e., data samples).
In recent years, extending deep learning algorithms to graph data has become a research focus, and technicians have defined and designed a neural network structure for processing graph data, namely Graph Neural Networks (GNNs), by drawing on the ideas of convolutional networks, recurrent networks, and deep autoencoders. The embodiments of the present application relate to a method capable of improving GNN recognition accuracy, which is described in detail below.
Hereinafter, terms related to the embodiments of the present application are described.
Graph data (Graph): refers to a data structure comprising nodes and edges, and the graph data represents an abstract data structure in which samples are connected by edges in the GNN training process, and is a completely different data structure from an Image (Image). For example, each account in the social network may correspond to a node in the GNN, and when account a and account B have a friend relationship, there is a connected edge between the nodes corresponding to account a and account B in the GNN.
Graph Neural Networks (GNNs): a neural network structure for processing graph data. For example, graph neural networks include, but are not limited to: Graph Convolutional Networks (GCN), Graph Attention Networks (GAT), Personalized Propagation of Neural Predictions (PPNP), Chebyshev Graph Convolutional Networks (ChebNet), Sample and Aggregate Networks (GraphSAGE), Simplified Graph Convolutional Networks (SGC), and the like.
Node classification system: a GNN used to process the node classification task may be referred to as a node classification system. The input of the node classification system is a node and its associated graph structure data, and the output is one of several preset classes, i.e., the class, identified by the computer, to which the input node belongs.
Topological structure: refers to an abstract structure formed by connection relations among nodes.
Category imbalance: a phenomenon in which a node classification system shows obviously different prediction capabilities for different categories. Taking an account classification system as an example, if a GNN always identifies top-tier influencer accounts with high accuracy but mid-tier influencer accounts with low accuracy, the GNN exhibits category imbalance.
Quantity Imbalance (Quantity-Imbalance): class imbalance caused by differences in the number of nodes in the label sets formed by the labeled nodes of different categories, where the number of nodes refers to the number of labeled nodes contained in each category's label set.
Topology Imbalance (Topology-Imbalance): class imbalance caused by differences in the topological positions of the labeled nodes contained in each category's label set, for example, whether a labeled node's topological position is close to the center of the topological structure formed by the nodes belonging to the category or close to its boundary.
Label set: the set of labeled nodes in the graph, that is, the node set formed by the labeled nodes of each category, generally used as the training set.
Unlabeled set: the set of unlabeled nodes in the graph, that is, the node set formed by all unlabeled nodes in the graph, generally used as the test set.
In the GNN node classification system, the category to which each node belongs can be identified based on graph data of each node, for example, the GNN node classification system can process account classification tasks in a social network. In the GNN training phase, the number of labeled nodes contained in the label sets provided for different categories is different, and the position distribution of the labeled nodes contained in each label set in the GNN is also unbalanced, so that when the GNN processes a node classification task, the prediction capabilities for different categories are significantly different, for example, the recognition accuracy for category 1 is generally higher, but the recognition accuracy for category 2 is generally lower, that is, the GNN node classification system generally has a phenomenon of category imbalance. Therefore, a method for improving the GNN recognition accuracy is needed to improve the class imbalance in the GNN node classification system.
Fig. 1 is a schematic diagram of an implementation environment of a data processing method according to an embodiment of the present application. Referring to fig. 1, a terminal 110 and a server 120 are involved in this implementation environment.
The terminal 110 is configured to provide graph data to the server 120. Taking a social network as an example, a social application is installed and runs on the terminal 110; if a user logs in to an account in the application on the terminal 110 and operates the account to follow a plurality of other accounts, graph data corresponding to the account will be generated. Optionally, the terminal is a smartphone, tablet computer, laptop computer, desktop computer, smart speaker, smart watch, etc., but is not limited thereto.
The terminal 110 and the server 120 can be directly or indirectly connected through wired or wireless communication, and the application is not limited thereto.
The server 120 is configured to provide data processing services. For example, the server 120 collects the account ID of each account corresponding to each terminal 110 and the account IDs of all other accounts followed by each account, and constructs an account relationship graph based on these account IDs, where each node in the account relationship graph represents one account and two nodes connected by an edge have a friend relationship (such as mutual following). The account classification task can be implemented by using the account relationship graph.
It should be noted that the above process is described only by taking the account classification scenario in a social network as an example. In other exemplary scenarios, the server 120 collects citation relationships between articles (such as references) and constructs an article relationship graph based on the citation relationships; each node in the article relationship graph represents an article, and two nodes connected by an edge have a citation relationship (e.g., article A is a reference of article B). An article classification task can be implemented by using the article relationship graph.
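As an illustration of the graph construction just described, the following Python sketch builds an account relationship graph from follow records; the tuple format, the mutual-following friendship rule, and all names here are illustrative assumptions rather than the patent's concrete data format.

```python
from collections import defaultdict

def build_account_graph(follow_pairs):
    """follow_pairs: iterable of (follower_id, followee_id) tuples."""
    follows = set(follow_pairs)
    adjacency = defaultdict(set)
    for a, b in follows:
        if (b, a) in follows:       # keep an edge only for mutual following
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency

graph = build_account_graph([("A", "B"), ("B", "A"), ("A", "C")])
# {"A": {"B"}, "B": {"A"}} -- the one-way "A" -> "C" follow produces no edge
```

An article relationship graph can be built the same way from (citing_article, cited_article) pairs, with no mutuality requirement, since a single citation already defines an edge.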
Optionally, the server 120 includes at least one of a server, a plurality of servers, a cloud computing platform, or a virtualization center. For example, server 120 undertakes primary computational tasks and terminal 110 undertakes secondary computational tasks; alternatively, the server 120 undertakes the secondary computing work and the terminal 110 undertakes the primary computing work; alternatively, the terminal 110 and the server 120 perform cooperative computing by using a distributed computing architecture.
In some embodiments, the server is an independent physical server, or a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud service, cloud database, cloud computing, cloud function, cloud storage, web service, cloud communication, middleware service, domain name service, security service, CDN (Content Delivery Network), big data and artificial intelligence platform, and the like.
Those skilled in the art will appreciate that the number of terminals 110 may be greater or fewer. For example, the number of the terminals 110 may be only one, or the number of the terminals 110 may be several tens or hundreds, or more. The number and the device type of the terminals 110 are not limited in the embodiments of the present application.
Fig. 2 is a flowchart of a data processing method according to an embodiment of the present application. Referring to fig. 2, the embodiment is applied to a computer device, and is described by taking the computer device as a server as an example, and includes the following steps:
201. the server acquires respective conflict level parameters of a plurality of labeling nodes based on the position information of the labeling nodes in the graph neural network, wherein the conflict level parameters are used for representing the topological positions of the labeling nodes in the corresponding labeling categories.
The graph neural network refers to any neural network structure for processing graph data, including but not limited to: GCN, GAT, PPNP, ChebNet, SAGE, SGC, etc.; the embodiment of the present application does not specifically limit the structure of the graph neural network.
Illustratively, in an account classification scenario, a graph neural network represents a social network between platform accounts, each node in the graph neural network corresponds to each account in the social network, a category to which each node belongs corresponds to an account category to which each account belongs, and a connection relationship between nodes in the graph neural network represents an association relationship between an account and an account in the social network, for example, the association relationship includes but is not limited to: friend relationships, one-way concern relationships, two-way concern relationships, commented upon each other within 3 days, commented upon each other within a week, and the like.
Illustratively, in an article classification scenario, a graph neural network represents a reference relationship between articles in a library, each node in the graph neural network corresponds to each article having the reference relationship, a category to which each node belongs corresponds to an article category to which each article belongs, and a connection relationship between nodes in the graph neural network represents the reference relationship between the articles, for example, the reference relationship refers to: article a is a reference to article B.
In some embodiments, the server obtains a training sample set of the graph neural network, where the training sample set includes a plurality of label sets, each label set corresponds to one label category, and each label set includes a plurality of labeled nodes belonging to the corresponding label category; the server then constructs a node relationship graph based on the graph data of all nodes (including labeled and unlabeled nodes). The graph neural network is generated on the basis of the node relationship graph in combination with a plurality of given initialized model parameters, where the initial values of the model parameters are specified by a technician or take default values preset by the server.
In some embodiments, the server obtains the conflict level parameter of each labeled node by using a random walk method. The random walk process starts from any labeled node in the node relationship graph and randomly walks to other nodes adjacent to the current node until the walk stops at a certain node; the node at which the walk stops includes, but is not limited to: the starting labeled node itself (i.e., the walk may return to its origin), another labeled node, or an unlabeled node. The termination probabilities of starting from the labeled node and stopping at each node form a probability matrix, which represents the probability distribution obeyed by the labeled node's random walk process.
In some embodiments, the server obtains, for each labeled node, its probability matrix, and based on the probability matrix obtains a conflict expectation over the probability distribution the labeled node obeys; this conflict expectation can serve as the labeled node's conflict level parameter. The conflict expectation measures the conflict between the influence of the labeled node and that of labeled nodes from different categories, and can therefore indirectly reflect the topological position of the labeled node within its labeled category: when the conflict expectation of a labeled node is larger, the node encounters stronger conflict within the subgraph it most strongly influences and is topologically closer to the category boundary; conversely, when the conflict expectation is smaller, the node encounters milder conflict within that subgraph and is topologically closer to the category center. The manner of acquiring the conflict expectation will be described in detail in the following embodiments and is not repeated here.
202. The server obtains respective target weights of the plurality of labeled nodes based on the respective conflict level parameters of the plurality of labeled nodes, wherein the target weights are used for representing weighting influence factors introduced to the labeled nodes based on the topological positions.
In some embodiments, after the conflict level parameter measuring each labeled node's topological position is obtained, in order to correct the class imbalance caused by topology imbalance, the training influence on the graph neural network of labeled nodes topologically close to the class center should be increased, and that of labeled nodes topologically close to the class boundary should be reduced. This is because a labeled node close to the class boundary usually lies at the junction of two different classes and is relatively prone to misjudgment, so it should be given a smaller target weight; conversely, a labeled node close to the class center is usually far from the boundaries between classes and is less prone to misjudgment, so it should be given a larger target weight. The manner of obtaining the target weight will be described in detail in the following embodiments and is not repeated here.
In the process, different target weights are distributed to each labeled node based on respective conflict level parameters of each labeled node, and the target weights can be used for adjusting the contribution degree of each labeled node to the loss function, so that the iterative training process of the overall graph neural network is guided, and the phenomenon of class imbalance caused by topology imbalance is improved.
203. And the server adjusts the parameters of the graph neural network based on the respective target weights of the plurality of labeled nodes to obtain a target graph neural network, wherein the target graph neural network is used for identifying the category of each node in the graph neural network.
In some embodiments, the server can weight the contribution of each labeled node to the loss function based on its target weight, thereby increasing the training influence of labeled nodes topologically close to the class center and reducing that of labeled nodes topologically close to the class boundary, so as to correct the topology imbalance of the node classification system formed by the graph neural network. After multiple rounds of iterative training, the recognition accuracy of the trained target graph neural network for each class is improved. The iterative training is described in detail in the following embodiments and is not repeated here.
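As a rough illustration of step 203, the following sketch weights each labeled node's contribution to a cross-entropy loss by its target weight; the cross-entropy form and all names are assumptions made for illustration, since this embodiment only specifies that the contribution values in the loss function are weighted.

```python
import numpy as np

def weighted_node_loss(pred_probs, label_cats, weights):
    """pred_probs: (n_labeled, k) per-category prediction probabilities;
    label_cats: (n_labeled,) ground-truth category indices;
    weights: (n_labeled,) target weights w_v of the labeled nodes."""
    label_cats = np.asarray(label_cats)
    picked = pred_probs[np.arange(len(label_cats)), label_cats]
    per_node = -np.log(picked + 1e-12)          # standard cross-entropy term
    return float(np.mean(np.asarray(weights) * per_node))
```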
All the above optional technical solutions can be combined arbitrarily to form the optional embodiments of the present disclosure, and are not described herein again.
According to the method provided by the embodiments of the present application, in the parameter adjustment process of the graph neural network, a conflict level parameter is determined for each labeled node to measure the node's topological position, a target weight is assigned to each labeled node on the basis of its conflict level parameter, and the target weights are introduced into the parameter adjustment process to modulate the influence of labeled nodes at different topological positions, for example, assigning a larger target weight to labeled nodes whose topological position is close to the class center and a smaller weight to labeled nodes whose topological position is close to the class boundary, so that the class imbalance commonly present in graph neural networks can be mitigated and the recognition accuracy of the graph neural network improved.
Fig. 3 is a flowchart of a data processing method according to an embodiment of the present application. Referring to fig. 3, the embodiment is applied to a computer device, and is described by taking the computer device as a server as an example, and includes the following steps:
301. the server obtains a plurality of labeling nodes in the graph neural network and labeling categories corresponding to the labeling nodes.
The graph neural network refers to any neural network structure for processing graph data, including but not limited to: GCN, GAT, PPNP, ChebNet, SAGE, SGC, etc.; the embodiment of the present application does not specifically limit the structure of the graph neural network.
Illustratively, in an account classification scenario, a graph neural network represents a social network between platform accounts, each node in the graph neural network corresponds to each account in the social network, a category to which each node belongs corresponds to an account category to which each account belongs, and a connection relationship between nodes in the graph neural network represents an association relationship between an account and an account in the social network, for example, the association relationship includes but is not limited to: friend relationships, one-way concern relationships, two-way concern relationships, commented upon each other within 3 days, commented upon each other within a week, and the like.
Illustratively, in an article classification scenario, a graph neural network represents a reference relationship between articles in a library, each node in the graph neural network corresponds to each article having the reference relationship, a category to which each node belongs corresponds to an article category to which each article belongs, and a connection relationship between nodes in the graph neural network represents the reference relationship between the articles, for example, the reference relationship refers to: article a is a reference to article B.
In some embodiments, the server obtains a training sample set of the graph neural network, where the training sample set includes a plurality of label sets, each label set corresponds to one label category, and each label set includes a plurality of labeled nodes belonging to the corresponding label category; the server then constructs a node relationship graph based on the graph data of all nodes (including labeled and unlabeled nodes). The graph neural network is generated on the basis of the node relationship graph in combination with a plurality of given initialized model parameters, where the initial values of the model parameters are specified by a technician or take default values preset by the server. For example, the training sample set includes, but is not limited to: academic citation networks such as Cora, CiteSeer, and PubMed, co-purchase graph networks such as Photo and Computers, or other graph-data example datasets, which is not specifically limited in the embodiments of the present application.
It should be noted that, in the embodiment of the present application, only the server-side training graph neural network is taken as an example for description, but the data processing method may also be independently executed by a terminal, that is, the terminal side independently completes the training process of the graph neural network, and the server and the terminal are both exemplary descriptions of computer devices, and the embodiment of the present application does not specifically limit on what kind of device the training process of the graph neural network is executed on.
302. And for any one of the plurality of labeled nodes, the server performs random walk from the labeled node to obtain a probability matrix of the labeled node, wherein the probability matrix is used for representing the probability distribution of the labeled node stopping to any node in the graph neural network during the random walk.
In some embodiments, when the server obtains the probability matrix of each labeled node, the server obtains, for each labeled node, the termination probability that the labeled node stops to each node in the node relationship graph during random walk based on a random walk algorithm, and based on the termination probabilities that the labeled node stops to all nodes, the probability matrix may be generated, and it is ensured that each termination probability in the probability matrix obeys the same probability distribution.
In some embodiments, node classification systems based on graph neural networks are divided into transductive node classification systems and inductive node classification systems. A transductive node classification system means that all labeled nodes of known classes and the unlabeled nodes to be predicted appear during training; in this case, class information is mainly propagated from the labeled nodes to the unlabeled nodes through the edges of the node relationship graph, and the graph neural network of a transductive node classification system usually contains a small number of nodes, e.g., the case where the number of nodes contained in the graph neural network is less than or equal to a number threshold. An inductive node classification system means that the unlabeled nodes to be predicted do not necessarily appear during training; usually, an inductive node classification system involves a large number of nodes and a large graph structure, e.g., the case where the number of nodes contained in the graph neural network is greater than the number threshold. The number threshold is an integer greater than or equal to 1.
In some embodiments, for the transductive node classification system, the probability matrix the server acquires for each labeled node records the termination probabilities of random walks from that labeled node to all nodes in the node relationship graph, which improves the training accuracy of the transductive node classification system.
In some embodiments, for the inductive node classification system, the server samples the graph neural network to obtain a subset of its nodes, and the probability matrix acquired for each labeled node records only the termination probabilities of random walks from the labeled node to this subset of nodes; in other words, the probability matrix of each labeled node is obtained based on sampling part of the nodes in the graph neural network. This accelerates training for node classification systems in large-graph scenarios, saves the computing resources of the server, and improves the computing efficiency of the server.
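The embodiments leave the concrete random walk scheme open; the sketch below assumes a random walk with restart, whose termination probabilities have the closed-form Personalized PageRank matrix, and the restart probability alpha is an illustrative hyperparameter.

```python
import numpy as np

def ppr_probability_matrix(adj, alpha=0.15):
    """adj: (n, n) adjacency matrix of the node relationship graph.
    Row v of the returned matrix is the probability distribution P_{v,:}
    over the node at which a walk started from v terminates, under a
    restart probability of alpha at every step (an assumed scheme)."""
    n = adj.shape[0]
    deg = np.maximum(adj.sum(axis=1), 1.0)        # avoid division by zero
    trans = adj / deg[:, None]                    # row-stochastic transition matrix
    # Closed form of the Personalized PageRank termination probabilities:
    # P = alpha * (I - (1 - alpha) * trans)^{-1}
    return alpha * np.linalg.inv(np.eye(n) - (1.0 - alpha) * trans)
```

For the large-graph inductive case above, the matrix inverse would typically be replaced by an iterative approximation restricted to the sampled subset of nodes.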
303. The server obtains a collision expectation of the labeled node based on the probability matrix of the labeled node, wherein the collision expectation is used for representing a mathematical expectation of the probability that any node obeying the probability distribution encounters different categories when the random walk stops.
Wherein the different category is a category other than the label category corresponding to the label node.
In some embodiments, since there are usually one or more categories distinct from a given labeled node's own labeled category, and each distinct category corresponds to a label set, the labeled nodes in the label set of each distinct category are referred to as target labeled nodes. For any target labeled node corresponding to any distinct category, the server performs the following sub-steps 3031-3035:
3031. the server determines the probability of termination starting at any of the target annotation nodes and stopping at any node subject to the probability distribution.
In some embodiments, since the server obtains the probability matrix of each labeled node in the above step 302, it obtains the respective probability matrix for all labeled nodes of all classes, and the target labeled node is essentially a labeled node of a different class, so that for any node that obeys the probability distribution, the server can obtain, from the probability matrix of the target labeled node, the termination probability that the server randomly walks away from the target labeled node and stops at any node that obeys the probability distribution.
In one example, assume that the labeled node is $v$, the probability distribution that the labeled node $v$ obeys when the random walk stops is denoted $P_{v,:}$, the labeled category to which the labeled node $v$ belongs is $y_v$ (i.e., the ground-truth label of the node $v$), $k$ is the number of categories in the overall design of the graph neural network ($k$ is an integer greater than or equal to 1), and $C_j$ denotes the label set of the $j$-th distinct category (containing all target labeled nodes belonging to the $j$-th distinct category). Then the termination probability of starting a random walk from the $i$-th target labeled node of the $j$-th distinct category and stopping at a node $x$ obeying the probability distribution $P_{v,:}$ is denoted $P_{i,x}$, where $x \sim P_{v,:}$, $i \in C_j$, $j \in [1,k]$, and $j \neq y_v$.
3032. The server adds the termination probabilities of the target marking nodes in any one of the different categories to obtain a first numerical value.
In some embodiments, since the label set corresponding to the different category includes a plurality of target label nodes, the server performs the step 3031 on each target label node to obtain the termination probability of randomly walking to the same node obeying the probability distribution, and thus a plurality of termination probabilities can be obtained for the plurality of target label nodes, and the plurality of termination probabilities are added to obtain the first numerical value.
Based on the above example, the first value can be expressed as the following expression:
$$\sum_{i \in C_j} P_{i,x}$$
3033. the server divides the first numerical value by the number of the target marking nodes contained in any one of the different categories to obtain a second numerical value.
In some embodiments, since the first value is the sum of the termination probabilities of all target annotation nodes in the different category randomly walking to the same node obeying the probability distribution, the second value is obtained by dividing the first value by the number of target annotation nodes included in the different category, so that the second value can reflect the overall influence level of the different category on the node, where the number of target annotation nodes included in the different category is also referred to as the set size of the corresponding annotated set of the different category.
Based on the above example, the second value can be expressed as the following expression:
$$\frac{1}{|C_j|} \sum_{i \in C_j} P_{i,x}$$
3034. and the server adds the second numerical values corresponding to the different categories to obtain a third numerical value.
In some embodiments, since the second value is calculated for each distinct category to reflect the overall level of influence of the distinct category on the node, a plurality of second values can be calculated for a plurality of distinct categories, and the plurality of second values corresponding to each of the plurality of distinct categories are added to obtain a third value that can reflect the overall level of influence of all distinct categories on the node.
Based on the above example, the third value can be expressed as the following expression:
$$\sum_{\substack{j \in [1,k] \\ j \neq y_v}} \frac{1}{|C_j|} \sum_{i \in C_j} P_{i,x}$$
3035. and the server determines the mathematical expectation of each third numerical value corresponding to each node obeying the probability distribution as the conflict expectation of the labeled node.
In some embodiments, since there are a plurality of nodes obeying the probability distribution, a corresponding third value can be obtained for each node, a mathematical expectation can be obtained for the third values corresponding to all the nodes obeying the probability distribution, and the mathematical expectation is determined as the conflict expectation of the labeled node, so that the topological position reflection of the labeled node relative to other nodes in the corresponding labeled category, in other words, whether the labeled node is close to the center or the boundary in the topological structure formed by all the nodes in the corresponding labeled category, can be reflected.
Based on the above example, the conflict expectation $T_v$ of the labeled node $v$ can be expressed as the following equation:

$$T_v = \mathbb{E}_{x \sim P_{v,:}} \left[ \sum_{\substack{j \in [1,k] \\ j \neq y_v}} \frac{1}{|C_j|} \sum_{i \in C_j} P_{i,x} \right]$$
304. The server determines the conflict expectation of the labeled node as the conflict level parameter of the labeled node, where the conflict level parameter is used to characterize the topological position of the labeled node in the corresponding labeled category.
In some embodiments, for each labeled node, the server obtains its conflict expectation through the above sub-steps 3031-3035 and uses it as the node's conflict level parameter. This is because a large conflict expectation means that a random walk starting from the labeled node stops, with high probability, at target labeled nodes of different categories; that is, the labeled node conflicts easily with different categories, and its topological position is closer to the boundary of the labeled category to which it belongs. Conversely, a small conflict expectation means that the walk meets target labeled nodes of different categories with low probability; the labeled node does not conflict easily with different categories, and its topological position is closer to the center of the labeled category.
In steps 302-304, the server obtains the conflict level parameter of each of the plurality of labeled nodes based on their position information in the graph neural network. Because the conflict level parameter characterizes the topological position of a labeled node within its labeled category, in other words its position relative to the other nodes of that category, it can indicate whether each labeled node's influence during training should be enlarged or reduced: to improve the category imbalance caused by topology imbalance, the training influence on the graph neural network of labeled nodes topologically close to the category center should be increased, and that of labeled nodes topologically close to the category boundary should be reduced.
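Sub-steps 3031-3035 can be summarized in the following sketch; the container choices and names are illustrative assumptions, not the patent's notation.

```python
import numpy as np

def conflict_expectations(P, labeled_sets, labels):
    """P: (n, n) termination-probability matrix, row v being P_{v,:};
    labeled_sets: dict mapping category j -> list of labeled node indices C_j;
    labels: dict mapping labeled node v -> its labeled category y_v.
    Returns a dict v -> conflict expectation T_v per sub-steps 3031-3035."""
    n = P.shape[0]
    T = {}
    for v, y_v in labels.items():
        # Third value, computed for every candidate stop node x at once:
        # sum over distinct categories j != y_v of the second value, i.e. the
        # per-class mean of the termination probabilities P_{i,x}, i in C_j.
        third = np.zeros(n)
        for j, C_j in labeled_sets.items():
            if j == y_v:
                continue
            first = P[np.asarray(C_j), :].sum(axis=0)  # first value, for all x
            third += first / len(C_j)                  # second value, for all x
        # Conflict expectation: mathematical expectation of the third value
        # under the stop distribution x ~ P_{v,:}.
        T[v] = float(P[v, :] @ third)
    return T
```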
305. The server acquires a cosine annealing value of the labeled node, where the cosine annealing value is used to represent the sorting order of the labeled node's conflict level parameter within the corresponding labeled category.
In some embodiments, the server sorts the conflict level parameters of all the labeled nodes in the labeled category from small to large, and determines the sorting order of the conflict level parameters of the current labeled node in the corresponding labeled category; and acquiring the cosine annealing value of the labeled node based on the sorting order and the number of the labeled nodes contained in the labeled category.
In some embodiments, when determining the sorting order, the server calls a RANK () sorting function to sort the conflict level parameters of all the labeled nodes in the label set corresponding to the label category from small to large.
In some embodiments, when obtaining the cosine annealing value, the server divides the sorting order by the number of labeled nodes contained in the labeled category to obtain a first coefficient, multiplies the first coefficient by π to obtain a second coefficient, and determines the cosine of the second coefficient as the cosine annealing value.
In one example, assume that the current labeled node is $v$ and the conflict level parameter of the labeled node $v$ is $T_v$; the sorting order of the conflict level parameter $T_v$ is denoted $\mathrm{Rank}(T_v)$, and the expression of the cosine annealing value of the labeled node $v$ is as follows:

$$\cos\left(\frac{\mathrm{Rank}(T_v)}{|L|}\,\pi\right)$$

where $\cos(\cdot)$ represents the cosine function, $\mathrm{Rank}(\cdot)$ represents the sorting function, and $|L|$ represents the set size of the label set $L$ of the labeled category to which the labeled node $v$ belongs, that is, the L1 norm (number of labeled nodes) of the label set $L$.
306. And the server acquires the target weight of the labeled node based on the cosine annealing value, the minimum weight threshold and the maximum weight threshold of the labeled node, wherein the target weight is used for representing a weighting influence factor introduced to the labeled node based on the topological position.
In some embodiments, the server increments the cosine anneal value by one to obtain a fourth value; multiplying the fourth value by the difference between the maximum weight threshold and the minimum weight threshold to obtain a fifth value; and adding one half of the fifth numerical value to the minimum weight threshold value to obtain the target weight of the labeled node.
Optionally, the minimum weight threshold or the maximum weight threshold is an adjustable parameter set by a technician, and optionally, the minimum weight threshold or the maximum weight threshold is a default minimum value or a default maximum value of a weight plan preset by a server.
Based on the above example, assume that the minimum weight threshold is $w_{\min}$ and the maximum weight threshold is $w_{\max}$; then the target weight $w_v$ of the labeled node $v$ is expressed as the following equation:

$$w_v = w_{\min} + \frac{1}{2}\left(w_{\max} - w_{\min}\right)\left(1 + \cos\left(\frac{\mathrm{Rank}(T_v)}{|L|}\,\pi\right)\right)$$

where $\cos\left(\frac{\mathrm{Rank}(T_v)}{|L|}\,\pi\right)$ is the cosine annealing value, $1 + \cos\left(\frac{\mathrm{Rank}(T_v)}{|L|}\,\pi\right)$ is the fourth numerical value, and $\left(w_{\max} - w_{\min}\right)\left(1 + \cos\left(\frac{\mathrm{Rank}(T_v)}{|L|}\,\pi\right)\right)$ is the fifth numerical value.
In steps 305-306, the server obtains the target weight of each of the plurality of labeled nodes based on their respective conflict level parameters. The target weight can be used to adjust each labeled node's contribution to the loss function, and the training weights of the labeled nodes are planned based on a cosine annealing mechanism to perform node-level weight adjustment over the whole training sample set, thereby guiding the iterative training of the overall graph neural network and improving the class imbalance inherently caused by topology imbalance. In addition, the cosine annealing mechanism effectively avoids the influence of extreme values on the training result, since the training weights of all labeled nodes are planned according to the sorting order of their conflict level parameters.
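A minimal sketch of steps 305-306 follows, assuming per-category ranking as described above; w_min and w_max stand in for the adjustable minimum and maximum weight thresholds, and their defaults are illustrative.

```python
import math

def cosine_annealing_weights(T, labeled_sets, w_min=0.5, w_max=1.5):
    """T: dict labeled node v -> conflict level parameter T_v;
    labeled_sets: dict category j -> list of labeled node indices C_j.
    Returns a dict v -> target weight w_v."""
    weights = {}
    for C_j in labeled_sets.values():
        ranked = sorted(C_j, key=lambda v: T[v])          # ascending conflict level
        for rank, v in enumerate(ranked, start=1):        # Rank(T_v) within the class
            cosine = math.cos(rank / len(C_j) * math.pi)  # cosine annealing value
            fourth = 1.0 + cosine                         # fourth numerical value
            fifth = (w_max - w_min) * fourth              # fifth numerical value
            weights[v] = w_min + 0.5 * fifth              # target weight w_v
    return weights
```

With this schedule, low-conflict (topologically central) labeled nodes receive weights near w_max and high-conflict (boundary) nodes receive weights near w_min, matching the intent of step 306.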
307. The server determines a plurality of prediction probabilities of each labeled node respectively corresponding to a plurality of categories, and the prediction probabilities are used for representing the possibility that the labeled node corresponds to each category.
In the above step 307, the server determines, based on the graph neural network, a plurality of prediction probabilities that each of the plurality of labeled nodes respectively corresponds to a plurality of categories, that is, the server obtains a plurality of prediction probabilities for each of the labeled nodes, and can determine the prediction category of the labeled node based on the plurality of prediction probabilities.
In some embodiments, the server selects the class to which the largest prediction probability of the plurality of prediction probabilities belongs as the prediction class for the labeled node.
In some embodiments, the server ranks the prediction probabilities in descending order, and randomly selects, from the prediction probabilities ranked in the top target positions, the class to which one of these prediction probabilities belongs as the prediction class of the labeled node; for example, the top target positions are the top 3 positions.
In some embodiments, the server determines one or more prediction probabilities greater than a probability threshold, and randomly selects a class to which the prediction probability belongs from the one or more prediction probabilities as the prediction class of the labeled node, wherein the probability threshold is any value greater than or equal to 0 and less than or equal to 1, for example, the probability threshold is 0.8.
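The three selection strategies above can be sketched as follows; the function name pick_prediction and the fallback to the largest probability when no probability exceeds the threshold are illustrative assumptions:

```python
import random

def pick_prediction(probs, strategy="argmax", top_k=3, threshold=0.8):
    """probs: dict mapping category -> prediction probability."""
    if strategy == "argmax":
        # Select the class with the largest prediction probability.
        return max(probs, key=probs.get)
    if strategy == "top_k":
        # Randomly choose among the classes ranked in the top target positions.
        ranked = sorted(probs, key=probs.get, reverse=True)
        return random.choice(ranked[:top_k])
    if strategy == "threshold":
        # Randomly choose among classes whose probability exceeds the
        # threshold; fall back to argmax if none does (an assumption).
        eligible = [c for c, p in probs.items() if p > threshold]
        return random.choice(eligible) if eligible else max(probs, key=probs.get)
    raise ValueError(f"unknown strategy: {strategy}")
```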
In some embodiments, the manner of obtaining the respective plurality of prediction probabilities for each labeled node is different for neural networks having different network structures.
Taking the transductive node classification system as an example, assume that f represents the graph neural network for transductive learning, X represents the node features input to the graph neural network f, A represents the topological features input to the graph neural network f, θ represents the learnable model parameters of the graph neural network f, and g represents the prediction probabilities corresponding to the prediction categories output by the graph neural network f; the expression of g is then as follows:

$$g = \mathrm{softmax}\big(f(X, A;\, \theta)\big)$$

wherein softmax() represents the exponential normalization function.
Taking the inductive node classification system as an example: because the graph neural network of the inductive node classification system includes a large number of nodes and the graph data structure is large in scale, directly obtaining, based on the above step 302, the termination probability of randomly walking from each labeled node to every node in the node relationship graph in order to compute the probability matrix P would bring a large calculation overhead and calculation cost. Therefore, as another implementation of step 302, for the inductive node classification system, the probability matrix \hat{P} of each labeled node is obtained by sampling part of the nodes in the graph neural network, so that training is accelerated for node classification systems in large-graph scenarios, the computing resources of the server are saved, and the computing efficiency of the server is improved.
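One way to realize such partial sampling is a Monte Carlo estimate of the random-walk termination probabilities, which touches only the nodes that sampled walks actually visit; the patent does not specify the sampling scheme, so this walk-based estimator is an illustrative assumption:

```python
import random
from collections import Counter

def sampled_termination_probs(adj, start, num_walks=200, restart=0.15):
    """Monte Carlo estimate of the termination probabilities of random
    walks with restart starting from `start`; adj maps each node to a
    list of its neighbors."""
    counts = Counter()
    for _ in range(num_walks):
        node = start
        # Each step terminates with probability `restart`.
        while random.random() > restart and adj[node]:
            node = random.choice(adj[node])
        counts[node] += 1  # this walk terminates at `node`
    return {n: c / num_walks for n, c in counts.items()}
```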
In the above inductive node classification system, assume that f' represents the graph neural network for inductive learning, X represents the node features input to the graph neural network f', θ' represents the learnable model parameters of the graph neural network f', and g' represents the prediction probabilities corresponding to the prediction categories output by the graph neural network f'; the expression of g' is then as follows:

$$g' = \mathrm{softmax}\big(\hat{P}\, f'(X;\, \theta')\big)$$

wherein softmax() represents the exponential normalization function and \hat{P} is the probability matrix obtained by sampling part of the nodes.
308. And the server acquires the loss function value of the iteration based on the plurality of prediction probabilities of the plurality of labeled nodes, the label categories of the labeled nodes and the target weights of the labeled nodes.
In some embodiments, for each labeled node in the label set corresponding to the same label category, the server determines, in the manner described in the above step 307, the prediction category to which the labeled node belongs based on the plurality of prediction probabilities. The server then obtains the logarithm of the prediction probability corresponding to the prediction category, and multiplies the logarithm, the label category indicator of the labeled node, and the target weight of the labeled node to obtain an adjusted loss value. The adjusted loss values of all the labeled nodes belonging to the label set are summed to obtain a target sum value, the target sum value is divided by the set size of the label set, and the opposite number of the resulting value is taken as the loss function value of the iteration. In the process of obtaining the loss function value, the cross-entropy loss of each labeled node is weighted by its respective target weight; in other words, an influence factor that takes the topological position into account (namely, the target weight) is added on the basis of the cross-entropy loss, so that the class imbalance phenomenon caused by topology imbalance in the graph neural network can be improved.
Illustratively, for the transductive node classification system: assume that f represents the graph neural network for transductive learning, X represents the node features input to the graph neural network f, A represents the topological features input to the graph neural network f, θ represents the learnable model parameters of the graph neural network f, g represents the prediction probabilities corresponding to the prediction categories output by the graph neural network f, and softmax() represents the exponential normalization function; the expression of the cross-entropy loss function loss adjusted by the target weights is then as follows:

$$g = \mathrm{softmax}\big(f(X, A;\, \theta)\big)$$

$$\mathrm{loss} = -\frac{1}{|L|}\sum_{v \in L} w_v\, y_v \log g_v$$

wherein y_v represents the label category (i.e., the real category) of the labeled node v, L represents the label set composed of all labeled nodes, |L| represents the set size of the label set L, v is any labeled node in the label set L, w_v represents the target weight of the labeled node v, and g_v represents the prediction probability corresponding to the prediction category output by the graph neural network f for the labeled node v.
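A minimal sketch of this reweighted cross entropy, assuming the labels are encoded as class indices and the log-probabilities have already been computed from g:

```python
import torch

def reweighted_cross_entropy(log_probs, labels, weights):
    """loss = -(1/|L|) * sum_v w_v * log g_v[y_v]: standard cross entropy
    with each labeled node's term scaled by its target weight w_v."""
    # log_probs: [|L|, C] log prediction probabilities of the labeled nodes
    # labels:    [|L|]    label category indices y_v
    # weights:   [|L|]    target weights w_v
    picked = log_probs[torch.arange(labels.numel()), labels]
    return -(weights * picked).mean()
```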
Illustratively, for the inductive node classification system: assume that f' represents the graph neural network for inductive learning, X represents the node features input to the graph neural network f', θ' represents the learnable model parameters of the graph neural network f', g' represents the prediction probabilities corresponding to the prediction categories output by the graph neural network f', and softmax() represents the exponential normalization function; the expression of the cross-entropy loss function loss adjusted by the target weights is then as follows:

$$g' = \mathrm{softmax}\big(\hat{P}\, f'(X;\, \theta')\big)$$

$$\mathrm{loss} = -\frac{1}{|L|}\sum_{v \in L} w'_v\, y_v \log g'_v$$

wherein \hat{P} represents the probability matrix obtained by sampling part of the nodes in the graph neural network f', y_v represents the label category (i.e., the real category) of the labeled node v, L represents the label set composed of all labeled nodes, |L| represents the set size of the label set L, v is any labeled node in the label set L, w'_v represents the target weight of the labeled node v, and g'_v represents the prediction probability corresponding to the prediction category output by the graph neural network f' for the labeled node v.
In the above process, for the inductive node classification system, the learning process of the node representations is decoupled from the propagation process of the node features on the node relationship graph, so that a large graph structure is prevented from being directly input to the graph neural network. The data processing method according to the embodiment of the application can therefore be rapidly deployed on ultra-large graph structures, and has even been successfully deployed on the Microsoft Academic Graph (MAG) citation graph network at the level of ten million nodes, so as to improve the class imbalance problem of the MAG citation graph network.
309. And the server responds to the fact that the loss function value does not accord with the stop condition, iteratively adjusts the parameters of the graph neural network until the loss function value accords with the stop condition, and stops iteration to obtain the target graph neural network.
The target graph neural network is used for identifying the category to which each node in the graph neural network belongs.
In some embodiments, the stop condition is that the loss function value is less than a loss threshold, wherein the loss threshold is any value greater than or equal to 0 and less than or equal to 1. If the loss function value of the current iteration is greater than or equal to the loss threshold, the stop condition is not met: the server iteratively adjusts the parameters of the graph neural network and re-executes the above steps 302-308, until the loss function value of some iteration is less than the loss threshold and the stop condition is met, whereupon the iteration is stopped to obtain the target graph neural network, which can then be put into application for node classification tasks.

In some embodiments, the stop condition is that the number of iterations is greater than an iteration threshold, wherein the iteration threshold is any integer greater than or equal to 1. If the iteration number of the current iteration is less than or equal to the iteration threshold, the stop condition is not met: the server iteratively adjusts the parameters of the graph neural network and re-executes the above steps 302-308, until the iteration number of some iteration is greater than the iteration threshold and the stop condition is met, whereupon the iteration is stopped to obtain the target graph neural network.

In some embodiments, the stop condition is that the loss function value is less than the loss threshold or that the number of iterations is greater than the iteration threshold. If the loss function value of the current iteration is greater than or equal to the loss threshold and the iteration number is less than or equal to the iteration threshold, the stop condition is not met: the server iteratively adjusts the parameters of the graph neural network and re-executes the above steps 302-308, until the loss function value of some iteration is less than the loss threshold or its iteration number is greater than the iteration threshold, whereupon the iteration is stopped to obtain the target graph neural network.
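The following training-loop sketch combines the two stop conditions; compute_loss, which stands in for one pass of steps 302-308, and the threshold values are illustrative placeholders:

```python
import torch

def train(model, optimizer, compute_loss, loss_threshold=0.05,
          max_iterations=200):
    """Iterate until a stop condition is met: loss below the loss
    threshold, or iteration count above the iteration threshold."""
    for iteration in range(1, max_iterations + 1):
        optimizer.zero_grad()
        loss = compute_loss(model)   # one pass of steps 302-308
        loss.backward()
        optimizer.step()
        if loss.item() < loss_threshold:
            break                    # loss-based stop condition met
    return model                     # the target graph neural network
```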
In the above steps 307-309, the server adjusts the parameters of the graph neural network based on the respective target weights of the plurality of labeled nodes to obtain the target graph neural network.
All the above optional technical solutions can be combined arbitrarily to form the optional embodiments of the present disclosure, and are not described herein again.
According to the method provided by the embodiment of the application, in the parameter adjusting process of the graph neural network, the conflict level parameter of each labeled node is determined to measure the topological position of each labeled node, the target weight of each labeled node is allocated on the basis of the conflict level parameter, and the target weight is put into the parameter adjusting process to adjust the influence of labeled nodes at different topological positions. For example, a larger target weight is allocated to labeled nodes whose topological positions are close to the class center, and a smaller weight is allocated to labeled nodes whose topological positions are close to the class boundary. In this way, the class imbalance phenomenon commonly existing in graph neural networks can be improved, and the identification accuracy of the graph neural network is improved.
Fig. 4 is a schematic diagram of a data processing method provided by an embodiment of the present application. As shown in the left part 401, before the improvement by the method of the embodiment of the present application, all square nodes in the left part 401 actually belong to the same category 1, and all circular nodes actually belong to another category 2. The circular node B, the square node R1 and the square node R2 are labeled nodes, and the remaining nodes are unlabeled nodes. It can be seen that the square node R1 is actually located at the intersection of the topologies of the two different categories 1 and 2; that is, the topological position of the square node R1 is close to the boundary of category 1. Therefore, the two circular nodes X close to the square node R1 are unlabeled nodes with a large influence conflict, and the two square nodes Y are unlabeled nodes with insufficient influence because they are far from the node R2 at the center of category 1. The background color of the nodes in the left part 401 represents the prediction category of the conventional graph neural network: a black background represents that the prediction category is category 1, and a white background represents that the prediction category is category 2. Obviously, the circular node X located above actually belongs to category 2 (and should have a white background), but is erroneously recognized as category 1 (shown with a black background), so that there is a misjudgment and the recognition accuracy is poor.
The right part 402 represents the situation after the improvement by the method of the embodiment of the present application: a graph node reweighting mechanism is used to allocate a larger target weight to the square node R2 near the center of category 1 and to the circular node B near the center of category 2 (represented by an increase in node area in the figure), and to allocate a smaller target weight to the square node R1 far from the center of category 1 (represented by a decrease in node area in the figure). The improved graph neural network can then accurately classify the circular node X located above into category 2, thereby reducing misjudgments and improving the recognition accuracy.
The data processing method can be applied to various graph node classification systems, where a graph node classification system refers to a system that predicts node categories by relying on the characteristics of the nodes and the topological features associated with the nodes. For example, the data processing method is applicable to advertisement recommendation scenarios based on a social network: deployed in a social network formed by the association relationships of user accounts, it improves the accuracy of user-portrait identification based on the graph structure (namely, the account classification accuracy), so that personalized interest mining can be carried out accurately and more precise advertisement recommendation is realized. For another example, the method is applicable to automatic article classification scenarios based on a citation network; that is, a similar approach can improve the accuracy of automatic article classification, and further improve the overall performance and the category balance performance of an automatic article classification system based on citation relationships.
According to the method provided by the embodiment of the application, the target weight of each labeled node can be calculated by detecting the influence conflicts among nodes, namely by acquiring the conflict level parameter of each labeled node, and the contribution of each labeled node to the loss function is weighted by the target weight. This improves the class imbalance phenomenon caused by asymmetric topological structures in graph structures such as social networks, improves the performance and comprehensive capability of the node classification system, and improves the user experience and user friendliness.
Furthermore, the graph node classification system applying the data processing method does not affect the original performance of the graph node classification system, in other words, after the topology class imbalance correction technology is introduced, the overall performance index of the graph node classification system is not damaged; the class balance performance of the graph node classification system can be further improved, and the problem of class imbalance caused by an asymmetric topological structure is greatly improved; in addition, the method can be compatible with the basic graph neural networks with different architectures and graph data with different styles, so that the migration difficulty and the re-development difficulty are reduced.
Further, the target graph neural network obtained by the improvement of the data processing method can be deployed at the server to respond to user requests. For example, a user inputs the node information (namely, node features) and graph information (namely, topological features) to be predicted into the target graph neural network; the trained target graph neural network automatically predicts the category of the input node according to the input node features and topological features, and the class imbalance problem caused by the asymmetric topological structure can be corrected in the prediction process, so that a class-balanced prediction result is returned on the premise that the overall performance of the graph node classification system is not reduced.
For the conventional graph neural network and the improved target graph neural network, the following two model performance metrics are introduced: 1) Weighted average F1 value (Weighted-F1, W-F): the average F1 value weighted according to the frequency of labeled nodes among the different classes of the graph node classification system; it is a metric concerning the overall performance of the graph node classification system. 2) Average F1 value (Macro-F1, M-F): the F1 value directly averaged among the different classes of the graph node classification system; it is a performance metric for evaluating the class balance of the graph node classification system.
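Both metrics are available off the shelf, for instance with scikit-learn; the toy labels below are for illustration only:

```python
from sklearn.metrics import f1_score

y_true = [0, 0, 1, 1, 2, 2, 2, 2]   # toy ground-truth categories
y_pred = [0, 1, 1, 1, 2, 2, 2, 0]   # toy predicted categories

w_f = f1_score(y_true, y_pred, average="weighted")  # Weighted-F1 (W-F)
m_f = f1_score(y_true, y_pred, average="macro")     # Macro-F1 (M-F)
print(f"W-F = {w_f:.3f}, M-F = {m_f:.3f}")
```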
Fig. 5 shows comparison results of model performance provided by an embodiment of the present application. As shown in fig. 5, part 501 represents the comparison of system performance under three different labeling ratios and sampling schemes, namely 20 × k, 50 × k, and 100 × k, when performing TINL (Topology-Imbalance Node Learning) on the Reddit social network graph; the white columns represent the performance metric when the graph node reweighting (RN) mechanism is not used, and the shaded columns represent the performance metric when the RN mechanism is used. It can be seen that the overall performance of the system on the Reddit social network graph improves after the RN mechanism is used, where the metric on the vertical axis is the percentage of W-F (W-F%). Similarly, part 502 represents the system performance comparison under the three schemes 20 × k, 50 × k and 100 × k when performing TINL and QINL (Quantity-Imbalance Node Learning) on the Reddit social network graph; part 503 represents the comparison under the three schemes when performing TINL on the MAG paper citation graph; and part 504 represents the comparison under the three schemes when performing TINL and QINL on the MAG paper citation graph. Obviously, applying the data processing method provided by the embodiment of the application, namely the RN mechanism, effectively improves the system performance and overall effect under various application scenarios and various labeling ratios, and can further optimize the use experience of users.
Furthermore, after the RN mechanism is introduced, the balance of the class-wise performance of the graph node classification system is improved, so that the classification system makes more reasonable predictions for nodes in classes with unfavorable topological structures. This supplements the weak point in the prediction capability of the node classification system, and improves the experience of users when using the system to make predictions for nodes of those specific classes.
Further, the data processing method according to the embodiment of the present application can be applied to a plurality of graph neural networks and graph data structures in different fields (as shown in table 1 below), so as to reduce difficulty and workload in migration between different graph data or different graph neural networks.
TABLE 1 (the table, provided as images in the original publication, lists the graph neural networks and graph data structures in different fields to which the method applies)
Fig. 6 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application, please refer to fig. 6, where the apparatus includes:
a first obtaining module 601, configured to obtain, based on location information of a plurality of labeled nodes in a graph neural network, a conflict level parameter of each of the plurality of labeled nodes, where the conflict level parameter is used to represent a topological location of the labeled node in a corresponding labeled category;
a second obtaining module 602, configured to obtain, based on the respective conflict level parameters of the plurality of labeled nodes, target weights of the plurality of labeled nodes, where the target weights are used to characterize weighted influence factors introduced to the labeled nodes based on the topology location;
a parameter adjusting module 603, configured to adjust a parameter of the graph neural network based on the target weight of each of the plurality of labeled nodes, to obtain a target graph neural network, where the target graph neural network is used to identify a category to which each node in the graph neural network belongs.
According to the device provided by the embodiment of the application, in the parameter adjusting process of the graph neural network, the conflict level parameter of each labeled node is determined to measure the topological position of each labeled node, the target weight of each labeled node is allocated on the basis of the conflict level parameter, and the target weight is put into the parameter adjusting process to adjust the influence of labeled nodes at different topological positions. For example, a larger target weight is allocated to labeled nodes whose topological positions are close to the class center, and a smaller weight is allocated to labeled nodes whose topological positions are close to the class boundary. In this way, the class imbalance phenomenon existing in graph neural networks can be improved, and the identification accuracy of the graph neural network is improved.
In a possible implementation manner, based on the apparatus composition of fig. 6, the first obtaining module 601 includes:
a random walk unit, configured to perform random walk on any one of the plurality of labeled nodes from the labeled node to obtain a probability matrix of the labeled node, where the probability matrix is used to represent probability distribution of the labeled node stopping to any node in the graph neural network during random walk;
a first obtaining unit, configured to obtain, based on the probability matrix of the labeled node, a collision expectation of the labeled node, where the collision expectation is used to characterize a mathematical expectation that any node obeying the probability distribution encounters a different category when the random walk stops, where the different category is a category other than the labeled category corresponding to the labeled node;
and the determining unit is used for determining the conflict expectation of the annotation node as the conflict level parameter of the annotation node.
In one possible implementation, the first obtaining unit is configured to:
determining the termination probability of starting random walk from the target labeling node and stopping at any node obeying the probability distribution for any target labeling node corresponding to any different category;
adding the termination probabilities of the target marking nodes in any one of the different categories to obtain a first numerical value;
dividing the first numerical value by the number of target marking nodes contained in any one of the different categories to obtain a second numerical value;
adding the second numerical values corresponding to the different categories to obtain a third numerical value;
and determining the mathematical expectation of each third numerical value corresponding to each node obeying the probability distribution as the conflict expectation of the labeled node.
In one possible implementation, in a case that the number of nodes included in the graph neural network is greater than the number threshold, the probability matrix of each labeled node is obtained based on a partial sampling of the nodes in the graph neural network.
In a possible implementation manner, based on the apparatus composition of fig. 6, the second obtaining module 602 includes:
a second obtaining unit, configured to obtain a cosine annealing value of the labeled node for any labeled node in the plurality of labeled nodes, where the cosine annealing value is used to represent a sorting condition of conflict level parameters of the labeled node in a corresponding labeled category;
and the third acquisition unit is used for acquiring the target weight of the labeled node based on the cosine annealing value, the minimum weight threshold and the maximum weight threshold of the labeled node.
In one possible implementation, the second obtaining unit is configured to:
determining the sorting order of the conflict level parameters of the marking nodes in the corresponding marking categories;
and acquiring the cosine annealing value of the labeled node based on the sorting order and the number of the labeled nodes contained in the labeled category.
In one possible implementation, the third obtaining unit is configured to:
adding one to the cosine annealing value to obtain a fourth numerical value;
multiplying the fourth value by the difference between the maximum weight threshold and the minimum weight threshold to obtain a fifth value;
and adding one half of the fifth numerical value to the minimum weight threshold value to obtain the target weight of the labeled node.
In one possible implementation, the parameter adjusting module 603 is configured to:
determining a plurality of prediction probabilities that each of the plurality of labeled nodes respectively corresponds to a plurality of categories based on the graph neural network, wherein the prediction probabilities are used for representing the possibility that the labeled node corresponds to each category;
obtaining a loss function value of the iteration based on a plurality of prediction probabilities of the plurality of labeled nodes, label categories of the labeled nodes and target weights of the labeled nodes;
and responding to the fact that the loss function value does not accord with a stopping condition, iteratively adjusting the parameters of the graph neural network until the loss function value accords with the stopping condition, and stopping iteration to obtain the target graph neural network.
In one possible embodiment, each node in the graph neural network corresponds to each account in the social network; the category to which each node belongs corresponds to the account category to which each account belongs.
In one possible implementation, each node in the graph neural network corresponds to each article with a reference relationship; the category to which each node belongs corresponds to the article category to which each article belongs.
All the above optional technical solutions can be combined arbitrarily to form the optional embodiments of the present disclosure, and are not described herein again.
It should be noted that: in the data processing apparatus provided in the above embodiment, when processing the graph data, only the division of the functional modules is illustrated, and in practical applications, the functions can be distributed by different functional modules as needed, that is, the internal structure of the computer device can be divided into different functional modules to complete all or part of the functions described above. In addition, the data processing apparatus and the data processing method provided in the above embodiments belong to the same concept, and specific implementation processes thereof are described in detail in the data processing method embodiments and are not described herein again.
Fig. 7 is a schematic structural diagram of a computer device according to an embodiment of the present application. As shown in fig. 7, the computer device is described taking a terminal 700 as an example; in this case, the terminal 700 independently completes the iterative training process of the graph neural network. Optionally, the device types of the terminal 700 include: a smart phone, a tablet computer, an MP3 player (Moving Picture Experts Group Audio Layer III), an MP4 player (Moving Picture Experts Group Audio Layer IV), a notebook computer, or a desktop computer. Terminal 700 may also be referred to by other names such as user equipment, portable terminal, laptop terminal, desktop terminal, and so on.
In general, terminal 700 includes: a processor 701 and a memory 702.
Optionally, processor 701 includes one or more processing cores, such as a 4-core processor, an 8-core processor, and so on. Alternatively, the processor 701 is implemented in at least one hardware form of a DSP (Digital Signal Processing), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). In some embodiments, processor 701 includes a main processor and a coprocessor, the main processor is a processor for Processing data in the wake state, also called a Central Processing Unit (CPU); a coprocessor is a low power processor for processing data in a standby state. In some embodiments, the processor 701 is integrated with a GPU (Graphics Processing Unit) which is responsible for rendering and drawing the content required to be displayed by the display screen. In some embodiments, processor 701 further includes an AI (Artificial Intelligence) processor for processing computational operations related to machine learning.
In some embodiments, memory 702 includes one or more computer-readable storage media, which are optionally non-transitory. Optionally, memory 702 also includes high speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in the memory 702 is used to store at least one program code for execution by the processor 701 to implement the data processing methods provided by the various embodiments herein.
In some embodiments, the terminal 700 may further optionally include: a peripheral interface 703 and at least one peripheral. The processor 701, memory 702, and peripheral interface 703 may be connected by buses or signal lines. Each peripheral can be connected to the peripheral interface 703 via a bus, signal line, or circuit board. Specifically, the peripheral device includes: at least one of a radio frequency circuit 704, a display screen 705, a camera assembly 706, an audio circuit 707, a positioning component 708, and a power source 709.
The peripheral interface 703 may be used to connect at least one peripheral related to I/O (Input/Output) to the processor 701 and the memory 702. In some embodiments, processor 701, memory 702, and peripheral interface 703 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 701, the memory 702, and the peripheral interface 703 are implemented on a separate chip or circuit board, which is not limited in this embodiment.
The Radio Frequency circuit 704 is used for receiving and transmitting RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuitry 704 communicates with communication networks and other communication devices via electromagnetic signals. The rf circuit 704 converts an electrical signal into an electromagnetic signal to transmit, or converts a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 704 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. Optionally, the radio frequency circuitry 704 communicates with other terminals via at least one wireless communication protocol. The wireless communication protocols include, but are not limited to: metropolitan area networks, various generation mobile communication networks (2G, 3G, 4G, and 5G), Wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the rf circuit 704 further includes NFC (Near Field Communication) related circuits, which are not limited in this application.
The display screen 705 is used to display a UI (User Interface). Optionally, the UI includes graphics, text, icons, video, and any combination thereof. When the display screen 705 is a touch display screen, the display screen 705 also has the ability to capture touch signals on or over the surface of the display screen 705. The touch signal can be input to the processor 701 as a control signal for processing. Optionally, the display 705 is also used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, the display 705 is one, providing the front panel of the terminal 700; in other embodiments, the display 705 is at least two, respectively disposed on different surfaces of the terminal 700 or in a folded design; in still other embodiments, the display 705 is a flexible display disposed on a curved surface or on a folded surface of the terminal 700. Even more optionally, the display 705 is arranged in a non-rectangular irregular figure, i.e. a shaped screen. Optionally, the Display 705 is made of a material such as an LCD (Liquid Crystal Display), an OLED (Organic Light-Emitting Diode), and the like.
The camera assembly 706 is used to capture images or video. Optionally, camera assembly 706 includes a front camera and a rear camera. Generally, a front camera is disposed at a front panel of the terminal, and a rear camera is disposed at a rear surface of the terminal. In some embodiments, the number of the rear cameras is at least two, and each rear camera is any one of a main camera, a depth-of-field camera, a wide-angle camera and a telephoto camera, so that the main camera and the depth-of-field camera are fused to realize a background blurring function, and the main camera and the wide-angle camera are fused to realize panoramic shooting and VR (Virtual Reality) shooting functions or other fusion shooting functions. In some embodiments, camera assembly 706 also includes a flash. Optionally, the flash is a monochrome temperature flash, or a bi-color temperature flash. The double-color-temperature flash lamp is a combination of a warm-light flash lamp and a cold-light flash lamp and is used for light compensation under different color temperatures.
In some embodiments, the audio circuitry 707 includes a microphone and a speaker. The microphone is used for collecting sound waves of a user and the environment, converting the sound waves into electric signals, and inputting the electric signals to the processor 701 for processing or inputting the electric signals to the radio frequency circuit 704 to realize voice communication. For the purpose of stereo sound collection or noise reduction, a plurality of microphones are respectively disposed at different positions of the terminal 700. Optionally, the microphone is an array microphone or an omni-directional pick-up microphone. The speaker is used to convert electrical signals from the processor 701 or the radio frequency circuit 704 into sound waves. Alternatively, the speaker is a conventional membrane speaker, or a piezoelectric ceramic speaker. When the speaker is a piezoelectric ceramic speaker, not only the electric signal can be converted into a sound wave audible to human, but also the electric signal can be converted into a sound wave inaudible to human for use in distance measurement or the like. In some embodiments, audio circuitry 707 also includes a headphone jack.
The positioning component 708 is used to locate the current geographic location of the terminal 700 for navigation or LBS (Location Based Service). Optionally, the positioning component 708 is a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia, or the Galileo system of the European Union.
Power supply 709 is provided to supply power to various components of terminal 700. Optionally, power supply 709 is alternating current, direct current, disposable battery, or rechargeable battery. When power source 709 includes a rechargeable battery, the rechargeable battery supports wired or wireless charging. The rechargeable battery is also used to support fast charge technology.
In some embodiments, terminal 700 also includes one or more sensors 710. The one or more sensors 710 include, but are not limited to: acceleration sensor 711, gyro sensor 712, pressure sensor 713, fingerprint sensor 714, optical sensor 715, and proximity sensor 716.
In some embodiments, the acceleration sensor 711 detects the magnitude of acceleration in three coordinate axes of a coordinate system established with the terminal 700. For example, the acceleration sensor 711 is used to detect components of the gravitational acceleration in three coordinate axes. Optionally, the processor 701 controls the display screen 705 to display the user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 711. The acceleration sensor 711 is also used for acquisition of motion data of a game or a user.
In some embodiments, the gyro sensor 712 detects a body direction and a rotation angle of the terminal 700, and the gyro sensor 712 and the acceleration sensor 711 cooperate to acquire a 3D motion of the terminal 700 by the user. The processor 701 implements the following functions according to the data collected by the gyro sensor 712: motion sensing (such as changing the UI according to a user's tilting operation), image stabilization at the time of photographing, game control, and inertial navigation.
Optionally, pressure sensors 713 are disposed on the side frames of terminal 700 and/or underneath display 705. When the pressure sensor 713 is disposed on a side frame of the terminal 700, a user's grip signal on the terminal 700 can be detected, and the processor 701 performs right-left hand recognition or shortcut operation according to the grip signal collected by the pressure sensor 713. When the pressure sensor 713 is disposed at a lower layer of the display screen 705, the processor 701 controls the operability control on the UI interface according to the pressure operation of the user on the display screen 705. The operability control comprises at least one of a button control, a scroll bar control, an icon control and a menu control.
The fingerprint sensor 714 is used for collecting a fingerprint of a user, and the processor 701 identifies the identity of the user according to the fingerprint collected by the fingerprint sensor 714, or the fingerprint sensor 714 identifies the identity of the user according to the collected fingerprint. When the user identity is identified as a trusted identity, the processor 701 authorizes the user to perform relevant sensitive operations, including unlocking a screen, viewing encrypted information, downloading software, paying, changing settings, and the like. Alternatively, the fingerprint sensor 714 is provided on the front, back, or side of the terminal 700. When a physical button or a vendor Logo is provided on the terminal 700, the fingerprint sensor 714 can be integrated with the physical button or the vendor Logo.
The optical sensor 715 is used to collect the ambient light intensity. In one embodiment, the processor 701 controls the display brightness of the display screen 705 based on the ambient light intensity collected by the optical sensor 715. Specifically, when the ambient light intensity is high, the display brightness of the display screen 705 is increased; when the ambient light intensity is low, the display brightness of the display screen 705 is adjusted down. In another embodiment, processor 701 also dynamically adjusts the shooting parameters of camera assembly 706 based on the ambient light intensity collected by optical sensor 715.
A proximity sensor 716, also referred to as a distance sensor, is typically disposed on the front panel of the terminal 700. The proximity sensor 716 is used to collect the distance between the user and the front surface of the terminal 700. In one embodiment, when the proximity sensor 716 detects that the distance between the user and the front surface of the terminal 700 gradually decreases, the processor 701 controls the display 705 to switch from the screen-on state to the screen-off state; when the proximity sensor 716 detects that the distance between the user and the front surface of the terminal 700 gradually increases, the processor 701 controls the display 705 to switch from the screen-off state to the screen-on state.
Those skilled in the art will appreciate that the configuration shown in fig. 7 does not constitute a limitation of terminal 700, and can include more or fewer components than shown, or combine certain components, or employ a different arrangement of components.
Fig. 8 is a schematic structural diagram of a computer device 800 according to an embodiment of the present application. The computer device 800 may vary greatly due to differences in configuration or performance. The computer device 800 includes one or more processors (CPUs) 801 and one or more memories 802, where the memory 802 stores at least one computer program, and the at least one computer program is loaded and executed by the one or more processors 801 to implement the data processing method provided by the above embodiments. Optionally, the computer device 800 further has components such as a wired or wireless network interface, a keyboard, and an input/output interface to facilitate input and output, and the computer device 800 further includes other components for implementing device functions, which are not described herein again.
In an exemplary embodiment, a computer-readable storage medium, such as a memory including at least one computer program, which is executable by a processor in a terminal to perform the data processing method in the above-described embodiments, is also provided. For example, the computer-readable storage medium includes a ROM (Read-Only Memory), a RAM (Random-Access Memory), a CD-ROM (Compact Disc Read-Only Memory), a magnetic tape, a floppy disk, an optical data storage device, and the like.
In an exemplary embodiment, a computer program product or computer program is also provided, comprising one or more program codes, the one or more program codes being stored in a computer readable storage medium. The one or more processors of the computer device can read the one or more program codes from the computer-readable storage medium, and the one or more processors execute the one or more program codes, so that the computer device can execute to complete the data processing method in the above-described embodiment.
Those skilled in the art will appreciate that all or part of the steps for implementing the above embodiments can be implemented by hardware, or can be implemented by a program instructing relevant hardware, and optionally, the program is stored in a computer readable storage medium, and optionally, the above mentioned storage medium is a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only exemplary of the present application and should not be taken as limiting, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (15)

1. A method of data processing, the method comprising:
acquiring respective conflict level parameters of a plurality of labeled nodes based on the position information of the labeled nodes in the graph neural network, wherein the conflict level parameters are used for representing the topological positions of the labeled nodes in the corresponding labeled categories;
acquiring target weights of the plurality of labeled nodes based on respective conflict level parameters of the plurality of labeled nodes, wherein the target weights are used for representing weighted influence factors introduced for the labeled nodes based on the topological positions;
and adjusting parameters of the graph neural network based on the respective target weights of the plurality of labeled nodes to obtain a target graph neural network, wherein the target graph neural network is used for identifying the category of each node in the graph neural network.
2. The method of claim 1, wherein the obtaining the conflict level parameter of each of a plurality of labeled nodes in the graph neural network based on the location information of the labeled nodes comprises:
randomly walking any one of the plurality of labeled nodes from the labeled node to obtain a probability matrix of the labeled node, wherein the probability matrix is used for representing the probability distribution of the labeled node stopping to any node in the graph neural network during random walking;
acquiring a conflict expectation of the labeled node based on the probability matrix of the labeled node, wherein the conflict expectation is used for representing a mathematical expectation of the possibility that any node obeying the probability distribution encounters different categories when the random walk stops, and the different categories are categories except the labeled category corresponding to the labeled node;
and determining the conflict expectation of the labeled node as the conflict level parameter of the labeled node.
3. The method of claim 2, wherein obtaining the collision expectation of the labeled node based on the probability matrix of the labeled node comprises:
for any target marking node corresponding to any different category, determining the termination probability of starting random walk from the target marking node and stopping at any node obeying the probability distribution;
adding the termination probabilities of the target marking nodes in any one of the different categories to obtain a first numerical value;
dividing the first numerical value by the number of target labeling nodes contained in any one of the different categories to obtain a second numerical value;
adding the second numerical values corresponding to the different categories to obtain a third numerical value;
and determining the mathematical expectation of each third numerical value corresponding to each node obeying the probability distribution as the conflict expectation of the labeled node.
4. The method according to claim 2 or 3, wherein in the case that the number of nodes included in the graph neural network is greater than a number threshold, the probability matrix of each labeled node is obtained based on partial node sampling in the graph neural network.
5. The method of claim 1, wherein obtaining the target weight for each of the plurality of labeled nodes based on the respective conflict level parameters of the plurality of labeled nodes comprises:
for any one of the plurality of labeled nodes, obtaining a cosine annealing value of the labeled node, wherein the cosine annealing value is used for representing the ordering condition of the conflict level parameters of the labeled node in the corresponding labeled category;
and acquiring the target weight of the labeled node based on the cosine annealing value, the minimum weight threshold and the maximum weight threshold of the labeled node.
6. The method of claim 5, wherein obtaining the cosine anneal value for the annotated node comprises:
determining the sorting order of the conflict level parameters of the marking nodes in the corresponding marking categories;
and acquiring the cosine annealing value of the labeled node based on the sorting order and the number of the labeled nodes contained in the labeled category.
7. The method of claim 5, wherein obtaining the target weight of the labeled node based on the cosine anneal value, the minimum weight threshold, and the maximum weight threshold of the labeled node comprises:
adding one to the cosine annealing value to obtain a fourth numerical value;
multiplying the fourth value by the difference between the maximum weight threshold and the minimum weight threshold to obtain a fifth value;
and adding one half of the fifth numerical value to the minimum weight threshold value to obtain the target weight of the labeled node.
8. The method of claim 1, wherein the adjusting parameters of the graph neural network based on the target weights of the plurality of labeled nodes to obtain a target graph neural network comprises:
determining a plurality of prediction probabilities that each of the plurality of labeled nodes respectively corresponds to a plurality of categories based on the graph neural network, the prediction probabilities being used to characterize the likelihood that the labeled node corresponds to each category;
obtaining a loss function value of the iteration based on a plurality of prediction probabilities of the plurality of labeled nodes, label categories of the labeled nodes and target weights of the labeled nodes;
and responding to the fact that the loss function value does not accord with a stopping condition, iteratively adjusting parameters of the graph neural network until the loss function value accords with the stopping condition, and stopping iteration to obtain the target graph neural network.
9. The method of claim 1, wherein each node in the graph neural network corresponds to each account in a social network; the category to which each node belongs corresponds to the account category to which each account belongs.
10. The method of claim 1, wherein each node in the graph neural network corresponds to each article having a citation relationship; the category to which each node belongs corresponds to the article category to which each article belongs.
11. A data processing apparatus, characterized in that the apparatus comprises:
the device comprises a first obtaining module, a second obtaining module and a third obtaining module, wherein the first obtaining module is used for obtaining respective conflict level parameters of a plurality of marking nodes based on position information of the marking nodes in a graph neural network, and the conflict level parameters are used for representing topological positions of the marking nodes in corresponding marking categories;
a second obtaining module, configured to obtain target weights of the plurality of labeled nodes based on respective conflict level parameters of the plurality of labeled nodes, where the target weights are used to characterize weighted influence factors introduced to the labeled nodes based on the topological positions;
and the parameter adjusting module is used for adjusting parameters of the graph neural network based on the respective target weights of the plurality of labeled nodes to obtain a target graph neural network, and the target graph neural network is used for identifying the category of each node in the graph neural network.
12. The apparatus of claim 11, wherein the first obtaining module comprises:
a random walk unit, configured to perform random walk on any one of the plurality of labeled nodes from the labeled node to obtain a probability matrix of the labeled node, where the probability matrix is used to represent probability distribution of the labeled node stopping to any node in the graph neural network during random walk;
a first obtaining unit, configured to obtain, based on a probability matrix of the labeled node, a collision expectation of the labeled node, where the collision expectation is used to characterize a mathematical expectation that any node obeying the probability distribution encounters a possibility of a different category when random walk stops, where the different category is a category other than a labeled category corresponding to the labeled node;
and the determining unit is used for determining the conflict expectation of the annotation node as the conflict level parameter of the annotation node.
13. A computer device, characterized in that the computer device comprises one or more processors and one or more memories in which at least one computer program is stored, the at least one computer program being loaded and executed by the one or more processors to implement the data processing method according to any one of claims 1 to 10.
14. A storage medium having stored therein at least one computer program which is loaded and executed by a processor to implement the data processing method of any one of claims 1 to 10.
15. A computer program product, characterized in that the computer program product comprises at least one computer program which is loaded and executed by one or more processors of a computer device to implement the data processing method of any one of claims 1 to 10.
CN202111034264.5A 2021-09-03 2021-09-03 Data processing method and device, computer equipment and storage medium Pending CN114282587A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111034264.5A CN114282587A (en) 2021-09-03 2021-09-03 Data processing method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111034264.5A CN114282587A (en) 2021-09-03 2021-09-03 Data processing method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114282587A true CN114282587A (en) 2022-04-05

Family

ID=80868488

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111034264.5A Pending CN114282587A (en) 2021-09-03 2021-09-03 Data processing method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114282587A (en)


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114817751A (en) * 2022-06-24 2022-07-29 腾讯科技(深圳)有限公司 Data processing method, data processing device, electronic equipment, storage medium and program product
CN114817751B (en) * 2022-06-24 2022-09-23 腾讯科技(深圳)有限公司 Data processing method, data processing apparatus, electronic device, storage medium, and program product
CN115545172A (en) * 2022-11-29 2022-12-30 支付宝(杭州)信息技术有限公司 Method and device for training neural network of graph with privacy protection and fairness taken into account
CN115545172B (en) * 2022-11-29 2023-02-07 支付宝(杭州)信息技术有限公司 Method and device for training neural network of graph with privacy protection and fairness taken into account

Similar Documents

Publication Publication Date Title
CN111298445B (en) Target account detection method and device, electronic equipment and storage medium
CN112069414A (en) Recommendation model training method and device, computer equipment and storage medium
CN109784351B (en) Behavior data classification method and device and classification model training method and device
CN111897996B (en) Topic label recommendation method, device, equipment and storage medium
CN111552888A (en) Content recommendation method, device, equipment and storage medium
CN111243668B (en) Method and device for detecting molecule binding site, electronic device and storage medium
CN110162604B (en) Statement generation method, device, equipment and storage medium
CN111104980B (en) Method, device, equipment and storage medium for determining classification result
CN113515942A (en) Text processing method and device, computer equipment and storage medium
CN112749728A (en) Student model training method and device, computer equipment and storage medium
CN111368525A (en) Information searching method, device, equipment and storage medium
CN113505256B (en) Feature extraction network training method, image processing method and device
JP2023508062A (en) Dialogue model training method, apparatus, computer equipment and program
CN112989229A (en) Travel route recommendation method and device, computer equipment and storage medium
CN114282587A (en) Data processing method and device, computer equipment and storage medium
CN110555102A (en) media title recognition method, device and storage medium
CN114154068A (en) Media content recommendation method and device, electronic equipment and storage medium
CN114117206B (en) Recommendation model processing method and device, electronic equipment and storage medium
CN111914180A (en) User characteristic determination method, device, equipment and medium based on graph structure
CN111984803A (en) Multimedia resource processing method and device, computer equipment and storage medium
CN111931075B (en) Content recommendation method and device, computer equipment and storage medium
CN114765062A (en) Gene data processing method, gene data processing device, computer equipment and storage medium
CN114691860A (en) Training method and device of text classification model, electronic equipment and storage medium
CN113762585B (en) Data processing method, account type identification method and device
CN113486260B (en) Method and device for generating interactive information, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination