CN113807520A - Knowledge graph alignment model training method based on graph neural network - Google Patents

Knowledge graph alignment model training method based on graph neural network

Info

Publication number
CN113807520A
Authority
CN
China
Prior art keywords
graph
entity
neural network
knowledge graph
information
Prior art date
Legal status
Pending
Application number
CN202111355413.8A
Other languages
Chinese (zh)
Inventor
刘禹汐
姜青涛
侯立旺
马荣
宋建强
Current Assignee
Beijing Daoda Tianji Technology Co ltd
Original Assignee
Beijing Daoda Tianji Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Beijing Daoda Tianji Technology Co ltd filed Critical Beijing Daoda Tianji Technology Co ltd
Priority to CN202111355413.8A
Publication of CN113807520A
Status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the disclosure provides a knowledge graph alignment model training method and an alignment method based on a graph neural network. The method includes acquiring a training sample, where the training sample includes a knowledge graph and identifiers corresponding to entities in the knowledge graph, and the knowledge graph includes entity information, edge type information and edge attribute information; obtaining feature vectors of the entities included in the knowledge graph based on a preset relational graph neural network model; and calculating differences between the feature vectors of a plurality of entities under the same identifier according to the identifiers corresponding to the entities, and training the relational graph neural network model according to the differences to obtain a knowledge graph alignment model for entity alignment. In this way, the obtained knowledge graph alignment model can take into account important information on the knowledge graph beyond the edge type information and the entity information, so that a good entity alignment effect is achieved.

Description

Knowledge graph alignment model training method based on graph neural network
Technical Field
The present disclosure relates to the field of knowledge graph fusion, and more particularly to entity alignment based on graph neural network models.
Background
Entity alignment is an important technology in knowledge graph construction and has been one of the research hotspots in recent years. Entity alignment aims to find, in knowledge graphs built from heterogeneous data sources, entities that are expressed differently but correspond to the same real-world entity; through entity alignment, data from different, mutually isolated sources can be gathered and fused into a new knowledge base with richer information.
However, existing entity alignment methods achieve a relatively poor alignment effect.
Disclosure of Invention
The present disclosure provides a training method for a knowledge graph alignment model based on a graph neural network, an alignment method, an apparatus, a device and a storage medium.
According to a first aspect of the present disclosure, there is provided a method for training a knowledge-graph alignment model based on a graph neural network, the method comprising:
acquiring a training sample, wherein the training sample comprises a knowledge graph and identifiers corresponding to entities in the knowledge graph, and the knowledge graph comprises entity information, edge type information and edge attribute information;
obtaining feature vectors of the entities included in the knowledge graph based on a preset relational graph neural network model;
calculating differences between the feature vectors of a plurality of entities under the same identifier according to the identifiers corresponding to the entities, and training the relational graph neural network model according to the differences to obtain a knowledge graph alignment model;
wherein obtaining the feature vectors of the entities included in the knowledge graph based on the preset relational graph neural network model comprises the following steps:
decomposing the knowledge graph into a plurality of subgraph sets according to the entity information, the edge type information and the edge attribute information based on the preset relational graph neural network model;
convolving the edge type information and the edge attribute information corresponding to each subgraph in the plurality of subgraph sets to obtain the feature vectors of the plurality of entities; wherein
each subgraph set belongs to the same entity, and each subgraph in a subgraph set corresponds to at least one edge type and at least one kind of edge attribute information of that entity.
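For illustration only, a minimal sketch of the training flow described in this first aspect might look as follows; every name here (the model object, the loss used, the sample layout) is hypothetical and not prescribed by the disclosure, and the full margin-based objective is given later in formula (2).

```python
import torch

def train_alignment_model(model, knowledge_graph, entity_identifiers, epochs=100, lr=1e-3):
    """Hypothetical training loop: embed entities, pull same-identifier entities together."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        embeddings = model(knowledge_graph)          # feature vectors, [num_entities, dim]
        diffs = []
        for members in entity_identifiers.values():  # identifier -> list of entity indices
            for a in members:
                for b in members:
                    if a < b:                        # pair of entities under the same identifier
                        diffs.append((embeddings[a] - embeddings[b]).norm())
        if not diffs:
            break
        loss = torch.stack(diffs).sum()              # differences under the same identifier
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return model
```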
In some implementations of the first aspect, convolving the edge type information and the edge attribute information corresponding to each subgraph in the plurality of subgraph sets to obtain the feature vectors of the plurality of entities comprises:
convolving the edge type information and the edge attribute information corresponding to each subgraph in the plurality of subgraph sets to obtain a feature vector of the entity corresponding to each subgraph in each subgraph set;
and aggregating the feature vectors of the entities corresponding to the subgraphs in each subgraph set to obtain the feature vectors of the plurality of entities.
In some implementations of the first aspect, the edge type information and the edge attribute information corresponding to each subgraph in the plurality of subgraph sets are convolved to obtain the feature vector of the entity corresponding to each subgraph in each subgraph set, and the feature vector satisfies the formula:
$$h_i^{(l+1)} = \sigma\left(\sum_{r \in R}\sum_{j \in N_i^{r}} \frac{1}{c_{i,r}}\left(W_r^{(l)} h_j^{(l)} + V_r^{(l)} e_{ij}^{(l)}\right) + W_0^{(l)} h_i^{(l)}\right)$$
wherein $h_i^{(l+1)}$ represents the feature vector of node $i$ at layer $(l+1)$ of the relational graph neural network model; $\sigma$ represents a nonlinear activation function; $c_{i,r}$ is a normalization factor; $R$ represents the edge type information and $N_i^{r}$ denotes the neighbors of node $i$ connected by edges of type $r$; $W_r^{(l)}$ represents the parameter to be learned at layer $l$ for the subgraph whose edge type is $r$; $h_j^{(l)}$ represents the feature vector of node $j$ at hidden layer $l$; $V_r^{(l)}$ represents the attribute parameter of the edge to be learned at layer $l$ for the subgraph whose edge type is $r$; $e_{ij}^{(l)}$ represents the attribute vector of the edge connecting node $j$ at hidden layer $l$; $W_0^{(l)}$ represents the weight parameter set in order to retain the information of node $i$ itself; and $h_i^{(l)}$ represents the feature vector of node $i$ at hidden layer $l$.
In some implementations of the first aspect, calculating differences between the feature vectors of a plurality of entities under the same identifier according to the identifiers corresponding to the entities, and training the relational graph neural network model according to the differences to obtain the knowledge graph alignment model, comprises:
calculating the differences between the feature vectors of the plurality of entities under the same identifier according to the identifiers corresponding to the entities;
updating the parameters in the relational graph neural network model according to the differences;
and when the differences between the feature vectors of the plurality of entities under the same identifier, as calculated by the relational graph neural network model with updated parameters, are smaller than a first preset threshold, and the differences between the feature vectors of entities corresponding to different identifiers are larger than a second preset threshold, obtaining the knowledge graph alignment model based on the relational graph neural network model with updated parameters.
In some implementations of the first aspect, when the edge type information is a high-dimensional vector larger than a preset dimension, decomposing the knowledge graph into a plurality of subgraph sets according to the entity information, the edge type information and the edge attribute information based on the preset relational graph neural network model comprises:
decomposing the knowledge graph into a plurality of subgraph sets by basis decomposition and block-diagonal decomposition according to the entity information, the edge type information and the edge attribute information, based on the preset relational graph neural network model.
According to a second aspect of the present disclosure, there is provided a method for knowledge-graph alignment based on a graph neural network, the method comprising:
acquiring a knowledge graph to be aligned, wherein the knowledge graph to be aligned comprises entity information, edge type information and edge attribute information;
inputting the knowledge graph to be aligned into a knowledge graph alignment model to obtain feature vectors of a plurality of entities, wherein the knowledge graph alignment model is obtained according to the training method of the first aspect or any of its implementations;
and aligning different entities according to the feature vectors of the plurality of entities.
In some implementations of the second aspect, aligning the different entities according to the feature vectors comprises:
calculating, according to the feature vectors corresponding to the different entities, the difference between the feature vector corresponding to each entity other than a target entity and the feature vector corresponding to the target entity;
and when the difference is smaller than a preset threshold, aligning the entity corresponding to the difference with the target entity.
According to a third aspect of the present disclosure, there is provided an apparatus for training a knowledge-graph alignment model based on a graph neural network, the apparatus comprising:
an acquisition module, configured to acquire a training sample, wherein the training sample comprises a knowledge graph and identifiers corresponding to entities in the knowledge graph, and the knowledge graph comprises entity information, edge type information and edge attribute information;
a calculation module, configured to obtain feature vectors of the entities included in the knowledge graph based on a preset relational graph neural network model;
a training module, configured to calculate differences between the feature vectors of a plurality of entities under the same identifier according to the identifiers corresponding to the entities, and train the relational graph neural network model according to the differences to obtain a knowledge graph alignment model;
wherein obtaining the feature vectors of the entities included in the knowledge graph based on the preset relational graph neural network model comprises the following steps:
decomposing the knowledge graph into a plurality of subgraph sets according to the entity information, the edge type information and the edge attribute information based on the preset relational graph neural network model;
convolving the edge type information and the edge attribute information corresponding to each subgraph in the plurality of subgraph sets to obtain the feature vectors of the plurality of entities; wherein
each subgraph set belongs to the same entity, and each subgraph in a subgraph set corresponds to at least one edge type and at least one kind of edge attribute information of that entity.
According to a fourth aspect of the present disclosure, there is provided a knowledge graph alignment apparatus based on a graph neural network, the apparatus comprising:
an acquisition module, configured to acquire a knowledge graph to be aligned, wherein the knowledge graph to be aligned comprises entity information, edge type information and edge attribute information;
a calculation module, configured to input the knowledge graph to be aligned into a knowledge graph alignment model to obtain feature vectors of a plurality of entities, wherein the knowledge graph alignment model is obtained according to the training method of the first aspect or any of its implementations;
and an alignment module, configured to align different entities according to the feature vectors of the plurality of entities.
According to a fifth aspect of the present disclosure, an electronic device is provided. The electronic device includes: a memory having stored thereon a computer program that, when executed, implements a method of training a knowledge-graph alignment model based on a graph neural network as in the first aspect, and some implementations of the first aspect, described above, or implements a method of knowledge-graph alignment based on a graph neural network as in the second aspect, and some implementations of the second aspect, described above.
According to a sixth aspect of the present disclosure, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method of training a knowledge-graph alignment model based on a graph neural network as in the first aspect, and some implementations of the first aspect, or implements a method of knowledge-graph alignment based on a graph neural network as in the second aspect, and some implementations of the second aspect.
According to the training method, the alignment method, the apparatus, the device and the storage medium for the knowledge graph alignment model based on the graph neural network, the preset relational graph neural network model performs convolution not only over entities and edge types but also over the graph formed by additionally connecting embedded vectors of edge attribute information. The resulting graph convolutional neural network can explicitly model the various kinds of rich information on the knowledge graph, so that the obtained knowledge graph alignment model can take into account important information on the knowledge graph beyond the connecting edges (edge type information) and node features (entity information), and a good entity alignment effect is achieved.
It should be understood that the statements herein reciting aspects are not intended to limit the critical or essential features of the embodiments of the present disclosure, nor are they intended to limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The above and other features, advantages and aspects of various embodiments of the present disclosure will become more apparent by referring to the following detailed description when taken in conjunction with the accompanying drawings. The accompanying drawings are included to provide a further understanding of the present disclosure, and are not intended to limit the disclosure thereto, and the same or similar reference numerals will be used to indicate the same or similar elements, where:
FIG. 1 shows a flow diagram of a method for training a knowledge-graph alignment model based on a graph neural network according to an embodiment of the present disclosure;
FIG. 2 illustrates a schematic diagram of a knowledge-graph of an embodiment of the present disclosure;
FIG. 3 shows a flow diagram of a method for knowledge-graph alignment based on graph neural networks, in accordance with an embodiment of the present disclosure;
FIG. 4 shows a block diagram of a training apparatus for a knowledge-graph alignment model based on a graph neural network according to an embodiment of the present disclosure;
FIG. 5 shows a block diagram of a knowledge-graph alignment apparatus based on a graph neural network according to an embodiment of the present disclosure;
FIG. 6 illustrates a block diagram of an exemplary electronic device capable of implementing embodiments of the present disclosure.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present disclosure more clear, the technical solutions of the embodiments of the present disclosure will be described clearly and completely with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are some, but not all embodiments of the present disclosure. All other embodiments, which can be derived by a person skilled in the art from the embodiments disclosed herein without making any creative effort, shall fall within the protection scope of the present disclosure.
In addition, the term "and/or" herein merely describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may mean that A exists alone, that both A and B exist, or that B exists alone. In addition, the character "/" herein generally indicates that the former and latter associated objects are in an "or" relationship.
Entity alignment aims to find, in knowledge graphs built from heterogeneous data sources, entities that are expressed differently but correspond to the same real-world entity; through entity alignment, data from different, mutually isolated sources can be gathered and fused into a new knowledge base with richer information.
By modeling, analyzing and deeply mining massive heterogeneous data, knowledge graph technology can greatly improve the comprehensive utilization of military data, provides a brand-new perspective for examining the complex data world in multiple dimensions, and has broad application space in intelligence reconnaissance and mining, operational command and control, battlefield situation awareness, network and electromagnetic space security, and the like.
Early entity alignment algorithms were based on symbolic representations and required either significant labor cost or rules tailored to a specific domain.
In recent years, models based on knowledge representation learning, which embed entities into a low-dimensional dense vector space and perform entity alignment by calculating distances between entity embedding vectors, have been widely used for the entity alignment task; among them are translation-based models. A translation-based model treats the relation between entities in the knowledge base as a translation operation between entity vectors, and by making all the entities in the knowledge base satisfy this relation as far as possible, the model can learn partial semantic information between entities.
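As general background added here for clarity (the disclosure itself does not name a specific translation model), translation-based models such as TransE represent a relation $r$ between a head entity $h$ and a tail entity $t$ as a translation in the embedding space:

$$\mathbf{h} + \mathbf{r} \approx \mathbf{t}$$

so the plausibility of a triple $(h, r, t)$ is scored by a distance such as $\lVert \mathbf{h} + \mathbf{r} - \mathbf{t} \rVert$; this is what lets such models capture partial semantics between entities while ignoring the overall graph structure.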
In addition to translation models, some other knowledge representation learning models have emerged; among them, convolution-based approaches outperform translation models on part of the problems. Early convolution-based models applied CNNs to matrices formed from entity vectors, but CNN models are designed for regular, grid-like data and do not handle data with an irregular graph structure very well.
In recent years, researchers have proposed entity alignment algorithms based on the GCN model. The GCN, also called a graph convolutional network, is a kind of neural network for extracting features from graph-structured data; by applying convolution operations directly on the graph, it can extract the overall structural information of the graph. On the basis of the GCN, researchers proposed the R-GCN (relational graph convolutional network) model, an extension of the GCN that adds edge information on top of the graph convolutional network and can therefore be used to learn entity embedding representations in a knowledge graph. In the relational graph convolution of the R-GCN, the update of a vertex is determined by the vertices connected to it through different types of edges; under the same edge type, edges are further divided into incoming and outgoing edges, so the R-GCN model can distinguish the importance of different neighbors according to the node features and the types of edges pointing to the vertex.
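For reference (general background, not taken from this disclosure), the propagation rule of the standard R-GCN can be written as

$$h_i^{(l+1)} = \sigma\left(\sum_{r \in R}\sum_{j \in N_i^{r}} \frac{1}{c_{i,r}} W_r^{(l)} h_j^{(l)} + W_0^{(l)} h_i^{(l)}\right)$$

which aggregates neighbor features per edge type but uses no edge attribute information; formula (1) below extends this rule with edge attribute information.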
However, the early entity alignment algorithms based on symbolic representation are time-consuming and labor-intensive, and face problems such as low computational efficiency and poor scalability.
A translation-based model can learn only partial semantic information between entities; the semantic information is not rich enough, and the representation of entities and relations does not take the structure of the whole graph into account.
Convolution-based models are designed for regular, grid-like data and do not handle data with an irregular graph structure very well.
The GCN operates on undirected, unlabeled graphs, so it ignores the relation information of the knowledge graph, resulting in poor entity alignment performance. In addition, the R-GCN model has the limitation that it ignores other important information on the network besides edge and node features, which also results in poor entity alignment performance.
In summary, conventional entity alignment methods achieve a relatively poor alignment effect.
In the present disclosure, the inventors observe that existing R-GCN-based entity alignment algorithms are constructed only from the relation structure between entities and ignore other important information on the network besides connecting edges and node features. The existing R-GCN model is therefore improved, and a training method, an alignment method, an apparatus, a device and a storage medium for a knowledge graph alignment model based on a graph neural network are provided. A training sample is acquired, the training sample comprising a knowledge graph and identifiers corresponding to entities in the knowledge graph, and the knowledge graph comprising entity information, edge type information and edge attribute information. Feature vectors of the entities included in the knowledge graph are then obtained based on a preset relational graph neural network model. Differences between the feature vectors of a plurality of entities under the same identifier are then calculated according to the identifiers corresponding to the entities, and the relational graph neural network model is trained according to the differences to obtain a knowledge graph alignment model. Obtaining the feature vectors of the entities based on the preset relational graph neural network model comprises: decomposing the knowledge graph into a plurality of subgraph sets according to the entity information, the edge type information and the edge attribute information based on the preset relational graph neural network model; and convolving the edge type information and the edge attribute information corresponding to each subgraph in the plurality of subgraph sets to obtain the feature vectors of the plurality of entities, where each subgraph in a subgraph set corresponds to at least one edge type and at least one kind of edge attribute information of one entity. Because training the knowledge graph alignment model considers not only the entity information and edge type information in the knowledge graph but also the edge attribute information, the obtained knowledge graph alignment model can take into account important information on the knowledge graph network beyond the connecting edges (edge type information) and node features (entity information), so that a good entity alignment effect is achieved.
The technical solutions provided by the embodiments of the present disclosure are described below with reference to the accompanying drawings.
Fig. 1 is a flowchart illustrating a method for training a knowledge-graph alignment model based on a graph neural network according to an embodiment of the present disclosure, where as shown in fig. 1, the method for training the knowledge-graph alignment model based on the graph neural network may include:
s101: and acquiring a training sample, wherein the training sample comprises a knowledge graph and an identifier corresponding to an entity in the knowledge graph, and the knowledge graph comprises entity information, side type information and side attribute information.
S102: and obtaining the characteristic vector of the entity included in the knowledge graph based on a preset relation graph neural network model.
S103: and calculating the difference values of the feature vectors of a plurality of entities under the same identification according to the identifications corresponding to the entities, and training the relation graph neural network model according to the difference values to obtain a knowledge graph alignment model.
The identifier is a kind of label information of the entity; it can be understood as the label of the same real-world entity. For example, if an entity A is called C in some environments and called D in other environments, then the identifiers of both C and D are A, which is used for the subsequent model training.
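A minimal sketch of what such a training sample could look like; the concrete field names, the edge type and the attribute values below are hypothetical, since the disclosure does not prescribe a data format:

```python
# Hypothetical layout: a knowledge graph given as attributed, typed edges plus an
# identifier (real-world label) for each entity, mirroring the A/C/D example above.
training_sample = {
    "entities": ["C", "D", "E"],
    "edges": [
        # (head entity, edge type, tail entity, edge attribute vector)
        ("C", "stationed_at", "E", [0.2, 0.7]),
        ("D", "stationed_at", "E", [0.2, 0.7]),
    ],
    # C and D are different surface forms of the same real-world entity A.
    "identifiers": {"C": "A", "D": "A", "E": "B"},
}
```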
In one embodiment, obtaining the feature vectors of the entities included in the knowledge graph based on the preset relational graph neural network model includes:
decomposing the knowledge graph into a plurality of subgraph sets according to the entity information, the edge type information and the edge attribute information based on the preset relational graph neural network model;
convolving the edge type information and the edge attribute information corresponding to each subgraph in the plurality of subgraph sets to obtain the feature vectors of the plurality of entities; wherein
each subgraph set belongs to the same entity, and each subgraph in a subgraph set corresponds to at least one edge type and at least one kind of edge attribute information of that entity. As a further example, a subgraph set belonging to the same entity A is simply the set of subgraphs of entity A.
In the process of training the knowledge graph alignment model in S101 to S103, not only the entity information and edge type information in the knowledge graph but also the edge attribute information in the knowledge graph are considered; the preset relational graph neural network model is then trained with this training sample, based on the correspondence between the entity information, the edge type information and the edge attribute information, to obtain the knowledge graph alignment model. The obtained knowledge graph alignment model can therefore take into account important information on the network beyond the connecting edges (edge type information) and node features (entity information), so that a good entity alignment effect is achieved.
In an embodiment, the preset relational graph neural network model is an improved R-GCN model. The model decomposes networked data containing complex additional information into a plurality of different subgraphs, and the different subgraphs corresponding to one entity form the subgraph set described above; each decomposed subgraph contains only a single type of connecting edge and is modeled with a conventional graph convolutional neural network. Finally, the edge type information and edge attribute information corresponding to each subgraph in each subgraph set are convolved to obtain the feature vector of the entity corresponding to each subgraph, and the feature vectors of the entities corresponding to the subgraphs in each subgraph set are aggregated to obtain the feature vectors of the plurality of entities.
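A sketch of this decomposition-and-aggregation idea, assuming the hypothetical edge-list format from the earlier example; using the mean as the aggregation is one possible choice, not the disclosure's prescribed one:

```python
from collections import defaultdict

def decompose_by_edge_type(edges):
    """Split an attributed edge list into one subgraph per edge type."""
    subgraphs = defaultdict(list)
    for head, edge_type, tail, attr in edges:
        subgraphs[edge_type].append((head, tail, attr))
    return subgraphs  # edge_type -> list of (head, tail, edge attribute vector)

def aggregate_entity_vectors(per_subgraph_vectors):
    """Aggregate the per-subgraph feature vectors of one entity (here: element-wise mean)."""
    n = len(per_subgraph_vectors)
    dim = len(per_subgraph_vectors[0])
    return [sum(vec[k] for vec in per_subgraph_vectors) / n for k in range(dim)]
```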
In one embodiment, the improved R-GCN model is used to convolve the edge type information and the edge attribute information corresponding to each subgraph in the plurality of subgraph sets to obtain the feature vector of the entity corresponding to each subgraph in each subgraph set, and the feature vector satisfies formula (1):
$$h_i^{(l+1)} = \sigma\left(\sum_{r \in R}\sum_{j \in N_i^{r}} \frac{1}{c_{i,r}}\left(W_r^{(l)} h_j^{(l)} + V_r^{(l)} e_{ij}^{(l)}\right) + W_0^{(l)} h_i^{(l)}\right) \qquad (1)$$
In formula (1), $h_i^{(l+1)}$ represents the feature vector of node $i$ at layer $(l+1)$ of the relational graph neural network model; $\sigma$ represents a nonlinear activation function; $c_{i,r}$ is a normalization factor; $R$ represents the edge type information and $N_i^{r}$ denotes the neighbors of node $i$ connected by edges of type $r$; $W_r^{(l)}$ represents the parameter to be learned at layer $l$ for the subgraph whose edge type is $r$; $h_j^{(l)}$ represents the feature vector of node $j$ at hidden layer $l$; $V_r^{(l)}$ represents the attribute parameter of the edge to be learned at layer $l$ for the subgraph whose edge type is $r$; $e_{ij}^{(l)}$ represents the attribute vector of the edge connecting node $j$ at hidden layer $l$; $W_0^{(l)}$ represents the weight parameter set in order to retain the information of node $i$ itself; and $h_i^{(l)}$ represents the feature vector of node $i$ at hidden layer $l$.
Therefore, the improved R-GCN model can perform convolution on a knowledge graph whose entities are connected by attributed edges, learning entity embeddings based on attribute information and combining them with entity embeddings based on the relation structure; by using the relation structure features and the attribute structure features of entities at the same time, it is a graph convolutional neural network that can explicitly model the various kinds of rich information on the knowledge graph. In a specific learning and training process, calculating differences between the feature vectors of a plurality of entities under the same identifier according to the identifiers corresponding to the entities, and training the relational graph neural network model according to the differences to obtain the knowledge graph alignment model, comprises the following steps:
calculating the differences between the feature vectors of the plurality of entities under the same identifier according to the identifiers corresponding to the entities;
updating the parameters in the relational graph neural network model according to the differences;
and when the differences between the feature vectors of the plurality of entities under the same identifier, as calculated by the relational graph neural network model with updated parameters, are smaller than a first preset threshold, and the differences between the feature vectors of entities corresponding to different identifiers are larger than a second preset threshold, obtaining the knowledge graph alignment model based on the relational graph neural network model with updated parameters.
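Purely as an illustration of the dual-threshold condition above (the pair lists, the norm used as the difference, and the threshold values are all hypothetical), assuming the entity embeddings are rows of a torch tensor:

```python
def alignment_model_converged(embeddings, same_id_pairs, diff_id_pairs,
                              first_threshold=0.1, second_threshold=1.0):
    """True when same-identifier entities are close enough and
    different-identifier entities are far enough apart."""
    close_enough = all((embeddings[a] - embeddings[b]).norm() < first_threshold
                       for a, b in same_id_pairs)
    far_enough = all((embeddings[a] - embeddings[b]).norm() > second_threshold
                     for a, b in diff_id_pairs)
    return close_enough and far_enough
```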
Specifically, in the process of training to obtain the knowledge graph alignment model, requiring the differences between the feature vectors of entities with the same identifier to be smaller than the first preset threshold can be understood as making the distance between entities expected to be aligned as small as possible, while requiring the differences between the feature vectors of entities corresponding to different identifiers to be larger than the second preset threshold can be understood as making the distance between negative-sample entity pairs as large as possible; training can therefore use a margin-based scoring function as the objective, as shown in formula (2).
$$L = \sum_{(p,\, q) \in S} \; \sum_{(p',\, q') \in S'} \max\left(0,\; \lVert \boldsymbol{h}_p - \boldsymbol{h}_q \rVert - \lVert \boldsymbol{h}_{p'} - \boldsymbol{h}_{q'} \rVert + \gamma \right) \qquad (2)$$
In formula (2), $\gamma$ represents an interval (margin) hyperparameter; $\boldsymbol{h}_p$ and $\boldsymbol{h}_q$ represent the embedding vectors of entities $p$ and $q$; $S$ and $S'$ respectively represent the set of a priori aligned entity pairs and the set of negative samples of the a priori aligned entity pairs, where $S'$ is generated by replacing one entity of an a priori aligned pair with a different entity; a pair $(p, q) \in S$ is an aligned entity pair.
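A sketch of this margin-based objective, assuming the aligned pairs in S and the negative pairs in S' are given as index pairs into the embedding matrix and that each positive pair is matched with one sampled negative (a simplification of the double sum), with the L2 norm as the distance:

```python
import torch

def margin_alignment_loss(embeddings, aligned_pairs, negative_pairs, gamma=1.0):
    """Margin-based scoring function in the spirit of formula (2)."""
    loss = torch.zeros((), dtype=embeddings.dtype, device=embeddings.device)
    for (p, q), (p_neg, q_neg) in zip(aligned_pairs, negative_pairs):
        pos = (embeddings[p] - embeddings[q]).norm()          # distance of an aligned pair
        neg = (embeddings[p_neg] - embeddings[q_neg]).norm()  # distance of a negative pair
        loss = loss + torch.relu(pos - neg + gamma)           # hinge with margin gamma
    return loss
```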
In one embodiment, when the edge type is a high-dimensional vector, the number of connected-edge types grows exponentially as the dimension grows. Directly applying the relational graph neural network then requires decomposing the graph into a very large number of subgraphs, each of which is very sparse, and it becomes difficult to learn effective node representations in this situation. In order to reduce the number of parameters, the improved R-GCN model of the present disclosure, i.e., the preset relational graph neural network model used herein, adopts two different schemes: basis decomposition and block-diagonal decomposition.
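A sketch of basis decomposition for the per-edge-type weight matrices, one of the two parameter-reduction schemes just mentioned (the block-diagonal variant instead constrains each $W_r$ to a block-diagonal structure); the class and parameter names are illustrative:

```python
import torch
import torch.nn as nn

class BasisDecomposedWeights(nn.Module):
    """Express every W_r as a learned combination of a small set of shared bases,
    so the parameter count no longer grows linearly with the number of edge types."""

    def __init__(self, num_edge_types, in_dim, out_dim, num_bases):
        super().__init__()
        self.bases = nn.Parameter(torch.randn(num_bases, in_dim, out_dim) * 0.01)  # shared bases
        self.coeffs = nn.Parameter(torch.randn(num_edge_types, num_bases) * 0.01)  # per-type coefficients

    def forward(self):
        # W_r = sum_b coeffs[r, b] * bases[b], computed for all edge types at once.
        return torch.einsum("rb,bio->rio", self.coeffs, self.bases)
```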
In a specific embodiment, a knowledge graph including edge type information, edge attribute information and entity information may be as shown in FIG. 2, to facilitate a further understanding of the above process. For example, a fighter aircraft C may be referred to by its model designation in one environment and by a nickname in another environment, but both the model designation and the nickname refer to the same type of aircraft in the real world. The purpose of the present disclosure is to identify, according to the edge type information and the edge attribute information of the knowledge graph, the same entity that has different names in different environments, for example to identify the same type of aircraft.
The core innovation of the method is the improvement of the R-GCN model: information about edge attributes is added on the basis of the conventional R-GCN model so as to enhance the analysis and identification of entities in the knowledge graph.
The training method of the knowledge graph alignment model based on the graph neural network improves on the R-GCN model, which convolves only over entities and edge types, by performing convolution over the graph formed by additionally connecting embedded vectors of edge attribute information; the resulting graph convolutional neural network can explicitly model the various kinds of rich information on the knowledge graph. The trained knowledge graph alignment model can therefore take into account important information on the knowledge graph beyond the connecting edges (edge type information) and node features (entity information), so that a good entity alignment effect is achieved.
Corresponding to the flow diagram of the training method of the knowledge graph alignment model based on the graph neural network shown in fig. 1, the present disclosure also provides a knowledge graph alignment method based on the graph neural network.
Fig. 3 is a schematic flowchart of a method for knowledge graph alignment based on a graph neural network according to an embodiment of the present disclosure, and as shown in fig. 3, the method for knowledge graph alignment based on a graph neural network may include:
s301: and acquiring a knowledge graph to be aligned, wherein the knowledge graph to be aligned comprises entity information, side type information and side attribute information.
S302: inputting the knowledge graph to be aligned into a knowledge graph alignment model to obtain the feature vectors of a plurality of entities, wherein the knowledge graph alignment model is obtained according to the training method in fig. 1.
S303: and aligning different entities according to the feature vectors of the plurality of entities.
In one embodiment, aligning the different entities according to the feature vectors includes:
calculating, according to the feature vectors corresponding to the different entities, the difference between the feature vector corresponding to each entity other than the target entity and the feature vector corresponding to the target entity;
and when the difference is smaller than a preset threshold, aligning the entity corresponding to the difference with the target entity.
The target entity is the entity to be aligned. For example, in one embodiment, if it is desired to align entities that refer to fighter aircraft C under different names in different knowledge graphs, then fighter aircraft C is the target entity.
Specifically, in the above alignment process, the specific relationship is as shown in equation (3).
$$d(x, y) = \lVert \boldsymbol{h}_x - \boldsymbol{h}_y \rVert, \quad y \in E \qquad (3)$$
wherein $x$ represents the entity to be aligned, $E$ represents the entity set of the whole knowledge graph, and $\boldsymbol{h}_x$ and $\boldsymbol{h}_y$ are the corresponding entity embedding vectors. The distance between each entity vector in the knowledge graph and the vector of the entity to be aligned is calculated, the results are arranged in ascending order, and the first $m$ entities form the candidate aligned-entity set; a distance threshold $\theta$ is set as a hyperparameter. If $d(x, y) < \theta$, the two entities are deemed alignable; otherwise, alignment between the two entities is deemed not possible.
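A sketch of this candidate-selection and thresholding step; the values of m and theta, and the use of the L2 norm as the distance, are illustrative hyperparameter choices:

```python
import torch

def align_entity(target_idx, embeddings, m=10, theta=0.5):
    """Return the indices of entities considered aligned with the target entity."""
    with torch.no_grad():
        dist = (embeddings - embeddings[target_idx]).norm(dim=1)  # distance to every entity
        dist[target_idx] = float("inf")                           # exclude the target itself
        top = torch.topk(dist, k=min(m, dist.numel() - 1), largest=False)
    # Keep only candidates whose distance is below the threshold theta.
    return [int(i) for i, d in zip(top.indices, top.values) if d < theta]
```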
In the above knowledge graph alignment method based on the graph neural network, the knowledge graph alignment model used is obtained by improving the R-GCN model, which convolves only over entities and edge types, so that convolution is performed over the graph formed by additionally connecting embedded vectors of edge attribute information; the resulting graph convolutional neural network can explicitly model the various kinds of rich information on the knowledge graph. The obtained knowledge graph alignment model can therefore take into account important information on the knowledge graph beyond the edges (edge type information) and node features (entity information), achieving a better effect when performing entity alignment.
It is noted that while for simplicity of explanation, the foregoing method embodiments have been described as a series of acts or combination of acts, it will be appreciated by those skilled in the art that the present disclosure is not limited by the order of acts, as some steps may, in accordance with the present disclosure, occur in other orders and concurrently. Further, those skilled in the art should also appreciate that the embodiments described in the specification are exemplary embodiments and that acts and modules referred to are not necessarily required by the disclosure.
The above is a description of embodiments of the method, and the embodiments of the apparatus are further described below.
Corresponding to the flow chart of the training method of the knowledge-graph alignment model based on the graph neural network shown in fig. 1, fig. 4 shows a block diagram of a training device 400 of the knowledge-graph alignment model based on the graph neural network. The training apparatus 400, as shown in fig. 4, may include:
the obtaining module 401 may be configured to obtain a training sample, where the training sample includes a knowledge graph and an identifier corresponding to an entity in the knowledge graph, where the knowledge graph includes entity information, edge type information, and edge attribute information;
a calculating module 402, configured to obtain feature vectors of entities included in the knowledge graph based on a preset relational graph neural network model;
the training module 403 may be configured to calculate difference values of feature vectors of multiple entities under the same identifier according to identifiers corresponding to the entities, and train the neural network model of the relational graph according to the difference values to obtain a knowledge graph alignment model;
obtaining a feature vector of an entity included in the knowledge graph based on a preset relational graph neural network model, wherein the obtaining of the feature vector comprises the following steps:
decomposing the knowledge graph into a plurality of sub-graph sets according to entity information, side type information and side attribute information based on a preset relation graph neural network model;
performing convolution on the edge type information and the edge attribute information corresponding to each sub-image in the plurality of sub-image sets to obtain feature vectors of a plurality of entities; wherein the content of the first and second substances,
a subgraph set belongs to the same entity, and each subgraph in the subgraph set corresponds to at least one type of edge information and at least one type of edge attribute information of one entity.
In an embodiment, the training module 403 may be further configured to convolve the edge type information and the edge attribute information corresponding to each sub-graph in the multiple sub-graph sets to obtain a feature vector of an entity corresponding to each sub-graph in each sub-graph set; and aggregating the feature vectors of the entities corresponding to each sub-graph in each sub-graph set to obtain the feature vectors of a plurality of entities.
In one embodiment, the edge type information and the edge attribute information corresponding to each subgraph in the plurality of subgraph sets are convolved to obtain the feature vector of the entity corresponding to each subgraph in each subgraph set, and the feature vector satisfies the formula:
$$h_i^{(l+1)} = \sigma\left(\sum_{r \in R}\sum_{j \in N_i^{r}} \frac{1}{c_{i,r}}\left(W_r^{(l)} h_j^{(l)} + V_r^{(l)} e_{ij}^{(l)}\right) + W_0^{(l)} h_i^{(l)}\right)$$
wherein each symbol has the same meaning as described above for formula (1).
In an embodiment, the training module 403 may be further configured to calculate, according to the identifiers corresponding to the entities, the differences between the feature vectors of multiple entities under the same identifier; update the parameters in the relational graph neural network model according to the differences; and, when the differences between the feature vectors of the multiple entities under the same identifier, as calculated by the relational graph neural network model with updated parameters, are smaller than a first preset threshold and the differences between the feature vectors of entities corresponding to different identifiers are larger than a second preset threshold, obtain the knowledge graph alignment model based on the relational graph neural network model with updated parameters.
In an embodiment, the training module 403 may be further configured to, when the edge type information is a high-dimensional vector larger than a preset dimension, decompose the knowledge graph into a plurality of subgraph sets by basis decomposition and block-diagonal decomposition according to the entity information, the edge type information and the edge attribute information, based on the preset relational graph neural network model.
The training apparatus of the knowledge graph alignment model based on the graph neural network improves on the R-GCN model, which convolves only over entities and edge types, by performing convolution over the graph formed by additionally connecting embedded vectors of edge attribute information; the resulting graph convolutional neural network can explicitly model the various kinds of rich information on the knowledge graph. The trained knowledge graph alignment model can therefore take into account important information on the knowledge graph beyond the edges (edge type information) and node features (entity information), so that a good entity alignment effect is achieved.
It can be understood that each module in the training device of the knowledge graph alignment model based on the graph neural network shown in fig. 4 has a function of implementing each step in fig. 1, and can achieve the corresponding technical effect, and for brevity, no further description is provided herein.
Corresponding to the flow chart of the method for knowledge graph alignment based on the graph neural network shown in FIG. 3, FIG. 5 shows a block diagram of a knowledge graph alignment apparatus 500 based on the graph neural network. The alignment apparatus 500, as shown in FIG. 5, may include:
the acquiring module 501, which may be configured to acquire a knowledge graph to be aligned, where the knowledge graph to be aligned includes entity information, edge type information and edge attribute information;
a calculating module 502, configured to input the knowledge graph to be aligned into a knowledge graph alignment model to obtain feature vectors of a plurality of entities, where the knowledge graph alignment model is obtained according to the training method in fig. 1;
the alignment module 503 may be configured to align different entities according to the feature vectors of the multiple entities.
In an embodiment, the alignment module 503 may be further configured to calculate, according to the feature vectors corresponding to different entities, the difference between the feature vector corresponding to each entity other than the target entity and the feature vector corresponding to the target entity, and, when the difference is smaller than a preset threshold, align the entity corresponding to the difference with the target entity.
In the above knowledge graph alignment apparatus based on the graph neural network, the knowledge graph alignment model used is obtained by improving and training the R-GCN model, which convolves only over entities and edge types, so that convolution is performed over the graph formed by additionally connecting embedded vectors of edge attribute information; the resulting graph convolutional neural network can explicitly model the various kinds of rich information on the knowledge graph. The obtained knowledge graph alignment model can therefore take into account important information on the knowledge graph beyond the edges (edge type information) and node features (entity information), achieving a better effect when performing entity alignment.
It can be understood that each module in the knowledge graph alignment apparatus based on the graph neural network shown in fig. 5 has a function of implementing each step in fig. 3, and can achieve the corresponding technical effect, and for brevity, no further description is provided herein.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process of the described module may refer to the corresponding process in the foregoing method embodiment, and is not described herein again.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
FIG. 6 illustrates a schematic block diagram of an electronic device 600 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
The device 600 includes a computing unit 601, which may perform various appropriate actions and processes in accordance with a computer program stored in a read-only memory (ROM) 602 or a computer program loaded from a storage unit 608 into a random access memory (RAM) 603. Various programs and data required for the operation of the device 600 can also be stored in the RAM 603. The computing unit 601, the ROM 602 and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
A number of components in the device 600 are connected to the I/O interface 605, including: an input unit 606 such as a keyboard, a mouse, or the like; an output unit 607 such as various types of displays, speakers, and the like; a storage unit 608, such as a magnetic disk, optical disk, or the like; and a communication unit 609 such as a network card, modem, wireless communication transceiver, etc. The communication unit 609 allows the device 600 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The computing unit 601 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of the computing unit 601 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units running machine learning model algorithms, a digital signal processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 601 performs the various methods and processes described above, such as the training method of the knowledge graph alignment model based on the graph neural network in FIG. 1, or the knowledge graph alignment method based on the graph neural network in FIG. 3. For example, in some embodiments, these methods may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 608. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 600 via the ROM 602 and/or the communication unit 609. When the computer program is loaded into the RAM 603 and executed by the computing unit 601, one or more steps of the methods described above may be performed. Alternatively, in other embodiments, the computing unit 601 may be configured by any other suitable means (e.g., by means of firmware) to perform the training method of the knowledge graph alignment model based on the graph neural network in FIG. 1, or the knowledge graph alignment method based on the graph neural network in FIG. 3.
Various implementations of the systems and techniques described here above may be implemented in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), system on a chip (SOCs), load programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server combined with a blockchain.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (11)

1. A method for training a knowledge-graph alignment model based on a graph neural network is characterized by comprising the following steps:
acquiring a training sample, wherein the training sample comprises a knowledge graph and an identifier corresponding to each entity in the knowledge graph, and the knowledge graph comprises entity information, edge type information and edge attribute information;
obtaining feature vectors of entities included in the knowledge graph based on a preset relational graph neural network model;
calculating difference values of the feature vectors of a plurality of entities under the same identifier according to the identifiers corresponding to the entities, and training the relational graph neural network model according to the difference values to obtain a knowledge graph alignment model;
wherein obtaining the feature vectors of the entities included in the knowledge graph based on the preset relational graph neural network model comprises:
decomposing the knowledge graph into a plurality of subgraph sets according to the entity information, the edge type information and the edge attribute information based on the preset relational graph neural network model;
convolving the edge type information and the edge attribute information corresponding to each subgraph in the plurality of subgraph sets to obtain the feature vectors of the plurality of entities;
wherein the subgraphs in one subgraph set belong to the same entity, and each subgraph in the subgraph set corresponds to at least one kind of edge type information and at least one kind of edge attribute information of that entity.
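As an illustrative, non-limiting sketch of the decomposition step in claim 1, the snippet below groups a knowledge graph given as (head, relation, tail, edge-attribute) tuples into per-entity subgraph sets keyed by edge type. Python is assumed, and the function and variable names are editorial assumptions rather than part of the claimed method.

```python
from collections import defaultdict

def decompose_into_subgraph_sets(triples):
    """Group a knowledge graph into per-entity subgraph sets.

    `triples` is an iterable of (head, relation, tail, attr) tuples, where
    `attr` is the attribute vector attached to that edge. For every entity
    we keep one subgraph per edge type, so each subgraph carries both the
    edge type information (its key) and the edge attribute information
    (stored per edge).
    """
    subgraph_sets = defaultdict(lambda: defaultdict(list))
    for head, relation, tail, attr in triples:
        # an edge contributes to the subgraph set of both of its endpoints
        subgraph_sets[head][relation].append((head, tail, attr))
        subgraph_sets[tail][relation].append((head, tail, attr))
    # entity -> {edge_type -> list of (head, tail, attr)}
    return {entity: dict(by_rel) for entity, by_rel in subgraph_sets.items()}
```

Grouping the edges of each entity by edge type in this way is what allows the convolution of claim 3 to apply a separate learned parameter per edge type.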
2. The method of claim 1, wherein convolving the edge type information and the edge attribute information corresponding to each subgraph in the plurality of subgraph sets to obtain the feature vectors of the plurality of entities comprises:
convolving the edge type information and the edge attribute information corresponding to each subgraph in the plurality of subgraph sets to obtain, for each subgraph in each subgraph set, a feature vector of the entity corresponding to that subgraph; and
aggregating the feature vectors obtained for the subgraphs in each subgraph set to obtain the feature vectors of the plurality of entities.
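The aggregation step of claim 2 can be sketched as simple pooling over the per-subgraph vectors of one entity. Mean pooling is an assumed choice here (the claim does not fix a particular aggregator), and PyTorch is assumed to be available.

```python
import torch

def aggregate_entity_vector(subgraph_vectors):
    """Aggregate the feature vectors obtained for the subgraphs of one entity.

    `subgraph_vectors` is a non-empty list of tensors of identical shape,
    one per subgraph in the entity's subgraph set; mean pooling is used
    purely as an illustrative aggregator.
    """
    return torch.stack(subgraph_vectors, dim=0).mean(dim=0)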
3. The method of claim 2, wherein convolving the edge type information and the edge attribute information corresponding to each subgraph in the plurality of subgraph sets to obtain the feature vector of the entity corresponding to each subgraph in each subgraph set satisfies the formula:
$h_i^{(l+1)} = \sigma\Big(\sum_{r \in R}\sum_{j \in N_i^{r}} \frac{1}{c_{i,r}}\big(W_r^{(l)} h_j^{(l)} + V_r^{(l)} a_j^{(l)}\big) + W_0^{(l)} h_i^{(l)}\Big)$
wherein $h_i^{(l+1)}$ denotes the feature vector of node $i$ at layer $l+1$ of the relational graph neural network model; $\sigma$ denotes a nonlinear activation function; $c_{i,r}$ is a normalization factor; $R$ denotes the edge type information and $N_i^{r}$ denotes the neighbors of node $i$ connected by edges of type $r$; $W_r^{(l)}$ denotes the parameter to be learned at layer $l$ of the relational graph neural network model for the subgraph whose edge type is $r$; $h_j^{(l)}$ denotes the feature vector of node $j$ at hidden layer $l$; $V_r^{(l)}$ denotes the edge attribute parameter to be learned at layer $l$ for the subgraph whose edge type is $r$; $a_j^{(l)}$ denotes the attribute vector of the edge connecting node $j$ at hidden layer $l$; $W_0^{(l)}$ denotes a weight parameter set in order to retain the information of the node itself; and $h_i^{(l)}$ denotes the feature vector of node $i$ at hidden layer $l$.
4. The method according to claim 1, wherein calculating, according to the identifiers corresponding to the entities, the difference values of the feature vectors of the plurality of entities under the same identifier, and training the relational graph neural network model according to the difference values to obtain the knowledge graph alignment model comprises:
calculating the difference values of the feature vectors of the plurality of entities under the same identifier according to the identifiers corresponding to the entities;
updating parameters in the relational graph neural network model according to the difference values; and
obtaining the knowledge graph alignment model based on the relational graph neural network model with the updated parameters when the difference values of the feature vectors of the plurality of entities under the same identifier, as calculated by that model, are smaller than a first preset threshold and the difference values of the feature vectors between entities corresponding to different identifiers are larger than a second preset threshold.
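The criterion of claim 4 (same-identifier embeddings closer than a first threshold, different-identifier embeddings farther than a second threshold) reads like a margin-based objective. The sketch below is one hedged interpretation, assuming PyTorch, that `model(graph)` returns a (num_entities, dim) embedding tensor, and that the pair tensors and margin values are illustrative.

```python
import torch
import torch.nn.functional as F

def alignment_loss(emb_a, emb_b, neg_a, neg_b, pos_margin=0.1, neg_margin=1.0):
    """Pull same-identifier entity pairs together, push different-identifier pairs apart.

    pos_margin and neg_margin play the roles of the first and second preset
    thresholds of claim 4 (the values are illustrative).
    """
    pos_dist = torch.norm(emb_a - emb_b, p=2, dim=1)
    neg_dist = torch.norm(neg_a - neg_b, p=2, dim=1)
    return (F.relu(pos_dist - pos_margin).mean()      # same identifier: distance below threshold 1
            + F.relu(neg_margin - neg_dist).mean())   # different identifiers: distance above threshold 2

def train_step(model, optimizer, graph, pos_pairs, neg_pairs):
    """One parameter update of the relational graph neural network model."""
    emb = model(graph)                                # (num_entities, dim) feature vectors
    loss = alignment_loss(emb[pos_pairs[:, 0]], emb[pos_pairs[:, 1]],
                          emb[neg_pairs[:, 0]], emb[neg_pairs[:, 1]])
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Training would repeat `train_step` until the two threshold conditions of claim 4 hold on the training pairs.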
5. The method of claim 1, wherein, in the case that the edge type information is a high-dimensional vector whose dimension is larger than a preset dimension, decomposing the knowledge graph into the plurality of subgraph sets according to the entity information, the edge type information and the edge attribute information based on the preset relational graph neural network model comprises:
decomposing, based on the preset relational graph neural network model, the knowledge graph into the plurality of subgraph sets in a basis decomposition manner and a block diagonalization decomposition manner according to the entity information, the edge type information and the edge attribute information.
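Basis decomposition and block-diagonal decomposition are the standard ways of factorising per-relation weight matrices when the number of edge types is large. The sketch below shows how the weight matrices referred to in claim 5 could be materialised under each scheme, assuming PyTorch; the class names, parameter names and initial scales are illustrative assumptions.

```python
import torch
import torch.nn as nn

class BasisDecomposition(nn.Module):
    """W_r = sum_b a_{r,b} * B_b : every relation mixes a small shared set of basis matrices."""

    def __init__(self, num_relations, in_dim, out_dim, num_bases):
        super().__init__()
        self.bases = nn.Parameter(torch.randn(num_bases, in_dim, out_dim) * 0.01)
        self.coeffs = nn.Parameter(torch.randn(num_relations, num_bases) * 0.01)

    def forward(self):
        # (R, B) x (B, in_dim, out_dim) -> (R, in_dim, out_dim)
        return torch.einsum('rb,bio->rio', self.coeffs, self.bases)

class BlockDiagonalDecomposition(nn.Module):
    """W_r = diag(Q_{r,1}, ..., Q_{r,B}) : every relation weight is block diagonal."""

    def __init__(self, num_relations, in_dim, out_dim, num_blocks):
        super().__init__()
        assert in_dim % num_blocks == 0 and out_dim % num_blocks == 0
        self.blocks = nn.Parameter(torch.randn(num_relations, num_blocks,
                                               in_dim // num_blocks,
                                               out_dim // num_blocks) * 0.01)

    def forward(self):
        # assemble one block-diagonal weight matrix per relation
        return torch.stack([torch.block_diag(*rel_blocks) for rel_blocks in self.blocks])
```

Both schemes keep the number of learned parameters manageable when the edge type information is high-dimensional, which is the situation claim 5 addresses.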
6. A knowledge graph alignment method based on a graph neural network is characterized by comprising the following steps:
acquiring a knowledge graph to be aligned, wherein the knowledge graph to be aligned comprises entity information, edge type information and edge attribute information;
inputting the knowledge graph to be aligned into a knowledge graph alignment model to obtain feature vectors of a plurality of entities, wherein the knowledge graph alignment model is obtained according to the training method of any one of claims 1 to 5;
and aligning different entities according to the feature vectors of the entities.
7. The method of claim 6, wherein aligning different entities according to the feature vectors of the entities comprises:
calculating, according to the feature vectors corresponding to the different entities, a difference value between the feature vector of each entity other than a target entity among the different entities and the feature vector of the target entity; and
aligning, when the difference value is smaller than a preset threshold, the entity corresponding to the difference value with the target entity.
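Claims 6 and 7 together describe inference: embed the knowledge graph to be aligned, then align every entity whose feature vector lies within a preset difference of the target entity's vector. A compact sketch follows, assuming the trained model returns one embedding per entity, Euclidean distance as the difference value, and an illustrative threshold.

```python
import torch

def align_to_target(model, graph, target_idx, threshold=0.5):
    """Return the indices of entities aligned with the target entity.

    `model(graph)` is assumed to yield a (num_entities, dim) tensor of feature
    vectors; the distance measure and threshold value are illustrative.
    """
    with torch.no_grad():
        emb = model(graph)
    diff = torch.norm(emb - emb[target_idx], p=2, dim=1)   # difference values
    diff[target_idx] = float('inf')                        # exclude the target itself
    return torch.nonzero(diff < threshold).flatten().tolist()
```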
8. An apparatus for training a knowledge-graph alignment model based on a graph neural network, the apparatus comprising:
an acquisition module configured to acquire a training sample, wherein the training sample comprises a knowledge graph and identifiers corresponding to entities in the knowledge graph, and the knowledge graph comprises entity information, edge type information and edge attribute information;
a calculation module configured to obtain feature vectors of entities included in the knowledge graph based on a preset relational graph neural network model; and
a training module configured to calculate difference values of the feature vectors of a plurality of entities under the same identifier according to the identifiers corresponding to the entities, and to train the relational graph neural network model according to the difference values to obtain a knowledge graph alignment model;
wherein obtaining the feature vectors of the entities included in the knowledge graph based on the preset relational graph neural network model comprises:
decomposing the knowledge graph into a plurality of subgraph sets according to the entity information, the edge type information and the edge attribute information based on the preset relational graph neural network model;
convolving the edge type information and the edge attribute information corresponding to each subgraph in the plurality of subgraph sets to obtain the feature vectors of the plurality of entities;
wherein the subgraphs in one subgraph set belong to the same entity, and each subgraph in the subgraph set corresponds to at least one kind of edge type information and at least one kind of edge attribute information of that entity.
9. An apparatus for knowledge-graph alignment based on graph neural networks, the apparatus comprising:
an acquisition module configured to acquire a knowledge graph to be aligned, wherein the knowledge graph to be aligned comprises entity information, edge type information and edge attribute information;
a calculation module configured to input the knowledge graph to be aligned into a knowledge graph alignment model to obtain feature vectors of a plurality of entities, wherein the knowledge graph alignment model is obtained according to the training method of any one of claims 1 to 5; and
an alignment module configured to align different entities according to the feature vectors of the entities.
10. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-7.
11. A non-transitory computer-readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1-7.
CN202111355413.8A 2021-11-16 2021-11-16 Knowledge graph alignment model training method based on graph neural network Pending CN113807520A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111355413.8A CN113807520A (en) 2021-11-16 2021-11-16 Knowledge graph alignment model training method based on graph neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111355413.8A CN113807520A (en) 2021-11-16 2021-11-16 Knowledge graph alignment model training method based on graph neural network

Publications (1)

Publication Number Publication Date
CN113807520A true CN113807520A (en) 2021-12-17

Family

ID=78898711

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111355413.8A Pending CN113807520A (en) 2021-11-16 2021-11-16 Knowledge graph alignment model training method based on graph neural network

Country Status (1)

Country Link
CN (1) CN113807520A (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114003735A (en) * 2021-12-24 2022-02-01 北京道达天际科技有限公司 Knowledge graph question and answer oriented entity disambiguation method based on intelligence document
CN114003735B (en) * 2021-12-24 2022-03-18 北京道达天际科技有限公司 Knowledge graph question and answer oriented entity disambiguation method based on intelligence document
CN114386694A (en) * 2022-01-11 2022-04-22 平安科技(深圳)有限公司 Drug molecule property prediction method, device and equipment based on comparative learning
CN114386694B (en) * 2022-01-11 2024-02-23 平安科技(深圳)有限公司 Drug molecular property prediction method, device and equipment based on contrast learning
CN114780756A (en) * 2022-06-07 2022-07-22 国网浙江省电力有限公司信息通信分公司 Entity alignment method and device based on noise detection and noise perception
CN115757828A (en) * 2022-11-16 2023-03-07 南京航空航天大学 Radiation source knowledge graph-based aerial target intention identification method
CN115757828B (en) * 2022-11-16 2023-11-10 南京航空航天大学 Aerial target intention recognition method based on radiation source knowledge graph
CN115982386A (en) * 2023-02-13 2023-04-18 创意信息技术股份有限公司 Automatic generation method for enterprise metadata explanation
CN116628247A (en) * 2023-07-24 2023-08-22 北京数慧时空信息技术有限公司 Image recommendation method based on reinforcement learning and knowledge graph
CN116628247B (en) * 2023-07-24 2023-10-20 北京数慧时空信息技术有限公司 Image recommendation method based on reinforcement learning and knowledge graph
CN117407689A (en) * 2023-12-14 2024-01-16 之江实验室 Entity alignment-oriented active learning method and device and electronic device
CN117407689B (en) * 2023-12-14 2024-04-19 之江实验室 Entity alignment-oriented active learning method and device and electronic device

Similar Documents

Publication Publication Date Title
CN113807520A (en) Knowledge graph alignment model training method based on graph neural network
WO2019034129A1 (en) Neural network structure generation method and device, electronic equipment and storage medium
CN111460234B (en) Graph query method, device, electronic equipment and computer readable storage medium
CN103838803A (en) Social network community discovery method based on node Jaccard similarity
CN102571431B (en) Group concept-based improved Fast-Newman clustering method applied to complex network
CN114462623B (en) Data analysis method, system and platform based on edge calculation
CN112559631A (en) Data processing method and device of distributed graph database and electronic equipment
CN113162787B (en) Method for fault location in a telecommunication network, node classification method and related devices
CN111861750A (en) Feature derivation system based on decision tree method and readable storage medium
Sun et al. Graph force learning
CN110889493A (en) Method and device for adding disturbance aiming at relational network
CN113515519A (en) Method, device and equipment for training graph structure estimation model and storage medium
CN116860996A (en) Method, device, equipment and storage medium for constructing three-dimensional knowledge graph
CN116308128A (en) Method, equipment and medium for green construction management of assembled building
CN111091475B (en) Social network feature extraction method based on non-negative matrix factorization
CN113761293A (en) Graph data strong-connectivity component mining method, device, equipment and storage medium
CN113704256A (en) Data identification method and device, electronic equipment and storage medium
CN113408808A (en) Training method, data generation method, device, electronic device and storage medium
CN106934489B (en) Time sequence link prediction method for complex network
CN114503505A (en) Learning a pattern dictionary from noisy numerical data in a distributed network
CN111723247A (en) Graph-based hypothetical computation
CN115018009B (en) Object description method, and network model training method and device
CN115730681B (en) Model training method, device, equipment and storage medium
CN112633559B (en) Social relationship prediction method and system based on dynamic graph convolutional neural network
JP7413438B2 (en) Methods, devices, electronic devices and storage media for generating account intimacy

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100085 room 703, 7 / F, block C, 8 malianwa North Road, Haidian District, Beijing

Applicant after: Beijing daoda Tianji Technology Co.,Ltd.

Address before: 100085 room 703, 7 / F, block C, 8 malianwa North Road, Haidian District, Beijing

Applicant before: Beijing daoda Tianji Technology Co.,Ltd.