CN117952198A - Time sequence knowledge graph representation learning method based on time characteristics and complex evolution - Google Patents


Info

Publication number: CN117952198A (granted as CN117952198B)
Application number: CN202311606478.4A
Authority: CN (China); original language: Chinese (zh)
Inventors: 冯思玲, 刘倩, 黄梦醒, 王冠军, 冯文龙
Applicant and assignee: Hainan University
Legal status: granted; active

Classifications

    • G06N5/022 (Knowledge engineering; Knowledge acquisition)
    • G06N3/0442 (Recurrent networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU])
    • G06N3/0455 (Auto-encoder networks; Encoder-decoder networks)
    • G06N3/0464 (Convolutional networks [CNN, ConvNet])
    • G06N3/049 (Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs)
    • G06N5/041 (Abduction)


Abstract

The invention discloses a temporal knowledge graph representation learning method based on time features and complex evolution. A TFCE model is constructed comprising a time feature module, a complex evolution module, and two time-embedded decoders. The time feature module time-encodes the entities and relations in the knowledge graph, captures the long-distance dependencies and associations of the time features, and produces a time embedding matrix. The complex evolution module introduces a perception mechanism and an attention network to learn the evolution representations of entities and relations at each timestamp and to update the entity embedding matrix and the relation embedding matrix. The two time-embedded decoders perform entity prediction and relation prediction simultaneously. The dataset is input into the TFCE model for training while a loss function is computed to update the model parameters; the optimal TFCE model is obtained when the loss function decreases until convergence. The invention improves the prediction of future events.

Description

Time sequence knowledge graph representation learning method based on time characteristics and complex evolution
Technical Field
The invention relates to the technical field of temporal knowledge graph reasoning, and in particular to a temporal knowledge graph representation learning method based on time features and complex evolution.
Background
A temporal knowledge graph (TKG) represents facts about entities and their relations, where each fact is associated with a timestamp. In a TKG, predicting new facts at future timestamps from observed historical patterns helps to understand the underlying drivers of events and to cope with emerging ones. Reasoning under the extrapolation setting is therefore very important in many practical applications and can provide beneficial assistance, for example in disaster relief and financial analysis.
Since a TKG is essentially a sequence of KGs, existing methods have three main limitations: (1) they focus mainly on the facts of a given query and ignore the time dependencies of all facts within each timestamp; (2) static attributes are fixed and do not change over time, so they cannot reflect the dynamic change of an entity or relation and cannot provide enough information to support the prediction task; (3) most current advanced algorithms consider time-related dynamic attributes, captured through the evolution of entities and relations, but do not consider the temporal attributes of the timestamps of the temporal knowledge graph themselves.
Disclosure of Invention
In order to solve the above technical problems, the invention provides a temporal knowledge graph representation learning method based on time features and complex evolution, which improves the prediction of future events.
In order to achieve the above purpose, the technical scheme of the invention is as follows:
The temporal knowledge graph representation learning method based on time features and complex evolution comprises the following steps:
constructing a TFCE model comprising a time feature module, a complex evolution module, and two time-embedded decoders, wherein the time feature module time-encodes the entities and relations in the knowledge graph, captures the long-distance dependencies of the time features, and produces time embedding matrices; the complex evolution module introduces a perception mechanism and an attention network to learn the evolution representations of entities and relations at each timestamp and to update the entity embedding matrix and the relation embedding matrix; the two time-embedded decoders, TimeConvTransE and TimeConvTransR, perform entity prediction and relation prediction simultaneously and output results through a sigmoid function;
inputting a dataset into the TFCE model for training, computing the loss function, and updating the model parameters with an Adam optimizer; the optimal TFCE model is obtained when the loss function decreases until convergence;
performing the corresponding entity prediction or relation prediction in the temporal knowledge graph with the optimal TFCE model.
Preferably, the processing procedure of the time feature module specifically comprises the following steps:
traversing the dataset, putting all non-repeated timestamps in the dataset into an embedding matrix, and obtaining two different initial time embedding matrices from it through different learnable parameters;
processing the two different initial time embedding matrices with a Transformer encoder and then further extracting the long-distance dependencies with a GRU component, obtaining two different time embedding matrices.
Preferably, the processing procedure of the complex evolution module specifically comprises the following steps:
obtaining the relation embedding matrix R′t-1 at timestamp t-1 in the temporal knowledge graph, connecting it with the two different time embedding matrices, and learning a low-dimensional representation of the relation with a multi-layer perception mechanism, according to the formula:
R̄t-1 = MLP([R′t-1; V″1; V″2])
where [;] denotes the vector concatenation operation;
to obtain a relation representation with stronger generalization capability, connecting the randomized relation embedding matrix R with the relation embedding matrix R̄t-1 and weighting the different parts of the relation path with the attention network to capture the key information in the path:
R̂t-1 = Attention(g([R; R̄t-1]))
where g(·) denotes taking the upper-triangular part of the matrix;
obtaining the entity embedding matrix H′t-1 at timestamp t-1 in the temporal knowledge graph, connecting it with the two different time embedding matrices, and learning a low-dimensional representation of the entity with a multi-layer perception mechanism:
H̄t-1 = MLP([H′t-1; V″1; V″2])
to better capture the structural dependencies between entities and relations, using a GRU component:
Rt = GRU(Rt-1, R̂t-1)
putting Rt and H̄t-1 as entries into the GCN, then capturing the associations between entities through facts and the associations between relations through shared entities, according to the formula:
h^{l+1}_{o,t} = f( Σ_{(s,r), ∃(s,r,o)∈Ft} (1/co)·W^{l}_1·(h^{l}_{s,t} + rt + vt) + W^{l}_2·h^{l}_{o,t} )
where l denotes the layer index of the KG at timestamp t, l ∈ [0, w-1], w denotes the total number of layers of the current KG, W^{l}_1 and W^{l}_2 denote the aggregation and self-loop parameters in the l-th layer, the structural dependencies between concurrent facts are captured at fine granularity in the l-th layer through the translational aggregation of entities, relations and time, co is the degree of entity o, and f(·) is the LeakyReLU activation function;
weighting the different parts of the entity with the attention network:
Ĥt = Attention(H^{w}_t)
after capturing the potential key information of the entity, adding the static attribute of the knowledge graph and updating the entity matrix with the GRU:
Ht = GRU(Ht-1, Hs) (19).
Preferably, before the relation embedding matrix R′t-1 is connected with the two different time embedding matrices, the method further comprises the following step:
performing an outlier-removal operation on the relation embedding matrix R′t-1 at timestamp t-1.
Preferably, each of TimeConvTransE and TimeConvTransR comprises a one-dimensional convolution layer and a fully-connected layer.
Preferably, the loss function is:
L = u1Le + u2Lr + u3Lst (29)
where u1, u2 and u3 are parameters controlling the loss terms, and Lst is the loss function of the static graph constraint component.
Based on the above technical scheme, the invention has the following beneficial effects: the invention provides a TFCE model and a temporal knowledge graph representation learning method based on time features and complex evolution, the model comprising a time feature module, a complex evolution module, and time-embedded decoders. Specifically, the TFCE model uses the time feature module to time-encode the entities and relations in the knowledge graph and to incorporate the time information into the representation learning process; learning how entities and relations change over time helps the model understand and discover the temporal relationships in the knowledge graph. The complex evolution module learns the evolution representations of entities and relations at each timestamp by recursively modeling the KG sequence; adding time features effectively models the time dependencies between events, so that the temporal relationships between different timestamps are captured more accurately. Finally, the time-embedded decoders process incomplete time series data to complete the reasoning of representation learning. Experimental results on three real-world datasets show that the TFCE model is superior to current classical and advanced models in prediction effect, prediction time, and other respects.
Drawings
FIG. 1 is a flow chart of the temporal knowledge graph representation learning method based on time features and complex evolution in one embodiment;
FIG. 2 is a comparison of the prediction times of different algorithms in one embodiment;
FIG. 3 is a performance analysis, under the original metrics, of varying the amount of training data when performing entity prediction tasks on ICEWS14s, ICEWS05-15, and ICEWS18 with different algorithms in one embodiment.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings.
It should be noted that, in the absence of conflict, the embodiments of the present application and the features of those embodiments may be combined with each other. The application is described in detail below with reference to the drawings and in connection with the embodiments.
Table 1 important symbols and description thereof
As shown in fig. 1, the present embodiment provides a temporal knowledge graph representation learning method based on time features and complex evolution. A TFCE model is built, mainly divided into four parts: a time feature module, a complex evolution module, time-embedded decoders, and a loss function.
Based on the time feature module, the entities and relations in the knowledge graph are time-encoded, and the time information is incorporated into the representation learning process.
By learning how entities and relations change over time, the model is helped to understand and discover the temporal relationships in the knowledge graph. The time feature module captures long-distance dependencies effectively with a Transformer and uses a GRU to model the sequence data and handle long-term dependencies. This allows the model to better capture long-range dependencies and associations in the time series. To capture these relationships and build the model, the dataset is first traversed and all non-duplicate timestamps are put into one timestamp matrix T; two different initial time embedding matrices are then obtained through different learnable parameters, as follows:
V1 = f(T ⊙ W1) (1)
V2 = f(T ⊙ W2) (2)
where ⊙ denotes the Hadamard product operation, V1 and V2 denote the initial time embedding matrices, f(·) takes the upper-triangular part of the matrix in order to clear useless information from the matrix and reduce the computational burden of the model, and W1 and W2 denote the learnable parameters.
After the initial time embedding matrices are obtained, they are passed through a Transformer encoder so as to capture the long-distance dependencies, as follows:
V′1 = Transformer(V1) (3)
V′2 = Transformer(V2) (4)
In order to better handle long-term dependencies in the time sequence, a GRU component is used to further extract the long-distance dependencies, giving two different time embedding matrices:
V″1 = σ(GRU(V′1)) (5)
V″2 = σ(GRU(V′2)) (6)
where σ(·) is the sigmoid function.
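The time feature pipeline just described (learnable timestamp embeddings, Transformer encoding per eqs. (3)-(4), GRU plus sigmoid per eqs. (5)-(6)) can be sketched in NumPy. This is an illustrative sketch rather than the patented implementation: a single-head self-attention layer stands in for the full Transformer encoder, the GRU is hand-rolled, and all shapes, seeds, and parameter scales are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

n_ts, d = 8, 16                          # distinct timestamps, embedding dimension
T = rng.normal(size=(n_ts, d))           # timestamp matrix T (one row per unique timestamp)
W1 = rng.normal(size=(d,))               # learnable parameters for V1
W2 = rng.normal(size=(d,))               # learnable parameters for V2

# Hadamard product with learnable parameters, then f(.) keeps the
# upper-triangular part to clear redundant entries, as described in the text.
f = np.triu
V1 = f(T * W1)
V2 = f(T * W2)

def self_attention(X):
    """Single-head scaled dot-product self-attention, standing in for the
    Transformer encoder of eqs. (3)-(4)."""
    scores = X @ X.T / np.sqrt(X.shape[1])
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    return (w / w.sum(axis=1, keepdims=True)) @ X

def gru(X, Wz, Wr, Wh):
    """Minimal GRU over the rows of X; returns the hidden state at every step."""
    h, out = np.zeros(X.shape[1]), []
    for x in X:
        z = sigmoid(Wz @ np.concatenate([x, h]))      # update gate
        r = sigmoid(Wr @ np.concatenate([x, h]))      # reset gate
        h = (1 - z) * h + z * np.tanh(Wh @ np.concatenate([x, r * h]))
        out.append(h)
    return np.stack(out)

Wz, Wr, Wh = (rng.normal(size=(d, 2 * d)) * 0.1 for _ in range(3))

# Eqs. (5)-(6): Transformer encoding, then a GRU, then a sigmoid.
V1_pp = sigmoid(gru(self_attention(V1), Wz, Wr, Wh))  # V''1
V2_pp = sigmoid(gru(self_attention(V2), Wz, Wr, Wh))  # V''2
```

The sigmoid keeps both final time embedding matrices bounded, matching the σ(·) applied in eqs. (5)-(6).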
Based on the complex evolution module, a multi-layer perception mechanism and an attention network are added on top of the evolution units.
At present, many temporal knowledge graph inference models have evolution units, but these units typically capture the internal structural dependencies of the KG at each timestamp with a GCN, capture all facts in parallel with a GRU, or add a static graph constraint component integrated into the evolution embedding. The complex evolution module builds on these evolution units and adds a multi-layer perception mechanism and an attention network: the multi-layer perception mechanism learns low-dimensional representations of entities to mine the structural features hidden in the TKG, while the attention network weights the different parts of the relation path to capture the key information in the path. By combining the two, a more effective representation of entity relations can be obtained.
Formally, the evolution unit computes the entity embedding matrix sequence and the relation embedding matrix sequence from the KGs of the latest m timestamps. In particular, the input at the first timestamp includes an entity embedding matrix H, a relation embedding matrix R, and a static entity embedding matrix Hs, where H and R are entity and relation embedding matrices obtained by random initialization, and Hs is the entity embedding matrix computed from the static knowledge graph.
One of the innovation points of the method is to perform a series of modeling on the relation embedding matrix before operating on it, in order to mine the structural features hidden in the temporal knowledge graph. First, an outlier-removal operation is applied to the relation embedding matrix R′t-1 at timestamp t-1, controlling the numerical range, ensuring the rationality and stability of the data, and improving the performance and robustness of the model. The specific formulas are as follows:
min_val = min(R′t-1) (7)
where min_val takes the minimum value of the relation embedding matrix R′t-1;
max_val = max(R′t-1) (8)
where max_val takes the maximum value of the relation embedding matrix R′t-1;
R′t-1 = clip(R′t-1, min_val, max_val) (9)
where clip(·) is a tensor-limiting function that restricts the tensor between the minimum and the maximum.
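The clipping of eqs. (7)-(8) and the clip step can be illustrated as follows. Note that, read literally, clipping a matrix to its own minimum and maximum changes nothing; the sketch below therefore assumes percentile bounds (a hypothetical choice) so that the outlier limiting is visible.

```python
import numpy as np

rng = np.random.default_rng(1)
R_prev = rng.normal(size=(20, 16))     # relation embedding matrix R't-1
R_prev[0, 0] = 50.0                    # inject an outlier

# Taken literally, clipping to the matrix's own min/max is a no-op, so this
# sketch uses percentile bounds (an assumption) as the "outlier removal" limits.
min_val = np.percentile(R_prev, 1)
max_val = np.percentile(R_prev, 99)
R_clipped = np.clip(R_prev, min_val, max_val)
```

`np.clip` plays the role of the clip(·) tensor-limiting function described in the text.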
To better capture the hidden relationships between timestamps and relations, the relation embedding matrix is connected with the time embedding matrices:
R̃t-1 = [R′t-1; V″1; V″2] (10)
where [;] denotes the vector concatenation operation; the relation embedding matrix R̃t-1 is updated through fusion.
Then, a multi-layer perception mechanism is used on R̃t-1 to learn a low-dimensional representation of the relation:
R̄t-1 = MLP(R̃t-1) (11)
where the relation embedding matrix R̄t-1 is obtained through the multi-layer perception mechanism.
To obtain a relation representation with stronger generalization capability, the randomized relation embedding matrix R is connected with the relation embedding matrix R̄t-1:
Ṙt-1 = g([R; R̄t-1]) (12)
where [;] denotes the vector concatenation operation and g(·) takes the upper-triangular part of the matrix; the relation embedding matrix Ṙt-1 is updated through the connection.
To capture the key information in the path, the TFCE model uses the attention network to weight the different parts of the relation path:
R̂t-1 = Attention(Ṙt-1) (13)
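The relation-side evolution just described (concatenation with the time embeddings, MLP, connection with R, attention weighting) can be sketched in NumPy. The MLP widths, the attention scoring vector, and the row-wise alignment of the time embeddings with the relations are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n_rel, d = 10, 16
R_prev = rng.normal(size=(n_rel, d))   # R't-1
V1 = rng.normal(size=(n_rel, d))       # V''1, rows aligned with relations (illustrative)
V2 = rng.normal(size=(n_rel, d))       # V''2
R_init = rng.normal(size=(n_rel, d))   # randomized relation embedding matrix R

def mlp(X, Wa, Wb):
    """Two-layer perceptron with ReLU: the 'multi-layer perception mechanism'."""
    return np.maximum(X @ Wa, 0.0) @ Wb

def attention(X, w):
    """Softmax attention over the rows of X with a learned score vector w."""
    a = np.exp(X @ w - (X @ w).max())
    return (a / a.sum())[:, None] * X

# Concatenate with the two time embeddings, then project back to d dimensions.
fused = np.concatenate([R_prev, V1, V2], axis=1)            # [R't-1; V''1; V''2]
Wa = rng.normal(size=(3 * d, 2 * d)) * 0.1
Wb = rng.normal(size=(2 * d, d)) * 0.1
R_bar = mlp(fused, Wa, Wb)

# Connect with R, keep the upper-triangular part g(.), then weight the
# parts of the relation path with the attention network.
joined = np.triu(np.concatenate([R_init, R_bar], axis=1))   # g([R; R_bar])
R_hat = attention(joined, rng.normal(size=(2 * d,)))
```

The attention weights sum to one over the rows, so each part of the relation path contributes in proportion to its learned score.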
The entity embedding matrix at timestamp t-1 is H′t-1. To better capture the hidden relationships between timestamps and entities, the entity matrix is connected with the time embedding matrices, and a multi-layer perception mechanism is then used to learn a low-dimensional representation of the entity:
H̃t-1 = [H′t-1; V″1; V″2] (14)
where [;] denotes the vector concatenation operation; the entity embedding matrix H̃t-1 is updated through fusion;
H̄t-1 = MLP(H̃t-1) (15)
where the entity embedding matrix H̄t-1 is obtained through the multi-layer perception mechanism.
To better capture the structural dependencies between entities and relations, a GRU component is used:
Rt = GRU(Rt-1, R̂t-1) (16)
where R̂t-1 is the attention-weighted relation embedding matrix; through the GRU, the relation embedding matrix Rt is obtained.
Next, Rt and the entity embedding matrix H̄t-1 obtained by the multi-layer perception mechanism are put into the GCN as entries, further capturing the associations between entities through facts and the associations between relations through shared entities. The specific formula is as follows:
h^{l+1}_{o,t} = f( Σ_{(s,r), ∃(s,r,o)∈Ft} (1/co)·W^{l}_1·(h^{l}_{s,t} + rt + vt) + W^{l}_2·h^{l}_{o,t} ) (17)
where l denotes the layer index of the KG at timestamp t, l ∈ [0, w-1], and w denotes the total number of layers of the current KG; W^{l}_1 and W^{l}_2 denote the aggregation and self-loop parameters in the l-th layer; the structural dependencies between concurrent facts are captured at fine granularity in the l-th layer through the translational aggregation of entities, relations, and time; co is the degree of entity o; f(·) is the LeakyReLU activation function. The self-loop operation can be regarded as the self-evolution of the entity.
To capture the potential key information of the entity itself, the TFCE model uses the attention network to weight the different parts of the entity:
Ĥt = Attention(H^{w}_t) (18)
where H^{w}_t is the entity matrix output by the last GCN layer.
After the potential key information of the entity is captured, the static attribute is added and the entity matrix is updated with the GRU:
Ht = GRU(Ht-1, Hs) (19)
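The GCN aggregation over facts and the gated update of equation (19) can be sketched as follows. The fact list, dimensions, and the reduction of the GRU to a single gated blend with the static embedding are illustrative assumptions, not the patented implementation.

```python
import numpy as np

rng = np.random.default_rng(3)
n_ent, n_rel, d = 6, 4, 16
H = rng.normal(size=(n_ent, d))        # entity embeddings entering layer l
R = rng.normal(size=(n_rel, d))        # relation embeddings R_t
v = rng.normal(size=(d,))              # time embedding for this timestamp
facts = [(0, 1, 2), (3, 1, 2), (4, 0, 5)]   # (subject, relation, object) triples at t
W1 = rng.normal(size=(d, d)) * 0.1     # aggregation parameters
W2 = rng.normal(size=(d, d)) * 0.1     # self-loop parameters
leaky_relu = lambda x: np.where(x > 0, x, 0.01 * x)

def gcn_layer(H, R, v, facts):
    """One GCN layer: translational aggregation of entity, relation and time
    messages, normalized by the object degree c_o, plus a self-loop term."""
    out = H @ W2.T                                    # W2 h_o for every entity
    deg = np.zeros(n_ent)
    for _, _, o in facts:
        deg[o] += 1.0                                 # c_o
    for s, r, o in facts:
        out[o] += (H[s] + R[r] + v) @ W1.T / deg[o]   # (1/c_o) W1 (h_s + r + v)
    return leaky_relu(out)

H_gcn = gcn_layer(H, R, v, facts)

# Eq. (19): H_t = GRU(H_{t-1}, H_s), shown here as a single gated (GRU-style)
# blend of the evolved entities with the static embedding H_s (illustrative).
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
H_static = rng.normal(size=(n_ent, d))                # H_s from the static KG
Wz = rng.normal(size=(d, 2 * d)) * 0.1
z = sigmoid(np.concatenate([H_gcn, H_static], axis=1) @ Wz.T)
H_t = (1 - z) * H_gcn + z * H_static
```

An entity that never appears as an object (entity 1 here) receives only the self-loop message, which is the self-evolution behavior noted in the text.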
After the feature matrices of entities, relations, and time are obtained, two time-embedded decoders, TimeConvTransE and TimeConvTransR, are designed in this embodiment so that entity prediction and relation prediction can be performed simultaneously. Specifically, the decoder convolves the concatenation of the four embeddings (entity embedding matrix Ht, relation embedding matrix Rt, and time embedding matrices V″1 and V″2) and scores the resulting representation. Formally, the convolution operator is computed as follows:
mc(n) = Σ_{k=0..K-1} [ wc(0,k)·ĥs(n+k) + wc(1,k)·r̂t(n+k) + wc(2,k)·v̂1(n+k) + wc(3,k)·v̂2(n+k) ] (20)
where c indexes the convolution kernels (C in total), K is the kernel width, n ∈ [0, d-1], and wc are the learnable kernel parameters; ĥs, r̂t, v̂1, and v̂2 (written with hats) denote the padded versions of the respective embeddings. The convolution operation integrates the feature information of entities, relations, and time while maintaining the translational property of the embeddings. Each convolution kernel thus forms an output vector mc ∈ R^d, and the C output vectors can be aligned to obtain the feature map Fconv ∈ R^(C×d).
After the nonlinear one-dimensional convolution, the final output of TimeConvTransE is defined as follows:
φ(Ht, Rt, V″1, V″2) = ReLU(vector(Fconv)·W)·Ho,t (21)
where vector(·) is the feature-map flattening operator, W ∈ R^(Cd×d) is a linear transformation matrix, and Ho,t is the embedding of the candidate tail entity o. The TimeConvTransR score is computed in the same way, except that Rt is replaced correspondingly for relation prediction.
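A sketch of this scoring path: several kernels convolve the four stacked channels, the flattened feature map passes through ReLU and a linear layer as in eq. (21), and the result is scored against every entity. Kernel sizes and all weights are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
n_ent, d, C, K = 5, 16, 3, 3           # entities, dim, number of kernels, kernel width
H_t = rng.normal(size=(n_ent, d))      # evolved entity embedding matrix
h_s = H_t[0]                           # query head entity
r = rng.normal(size=(d,))              # query relation embedding
v1, v2 = rng.normal(size=(d,)), rng.normal(size=(d,))   # time embeddings
kernels = rng.normal(size=(C, 4, K)) * 0.1              # each kernel spans the 4 channels
W = rng.normal(size=(C * d, d)) * 0.1                   # linear layer after flattening

def conv1d_same(x, k):
    """'Same'-padded 1D cross-correlation of a length-d signal with a width-K kernel."""
    xp = np.pad(x, len(k) // 2)
    return np.array([xp[i:i + len(k)] @ k for i in range(len(x))])

# Convolution step: each kernel maps the 4-row stack [h_s; r; v1; v2] to one
# length-d feature vector; stacking the C outputs gives the feature map F_conv.
stack = np.stack([h_s, r, v1, v2])                      # shape (4, d)
F_conv = np.stack([sum(conv1d_same(stack[ch], kernels[c, ch]) for ch in range(4))
                   for c in range(C)])                  # shape (C, d)

# Eq. (21): flatten, apply the linear layer with ReLU, score against all
# entities, then a sigmoid gives the conditional probability vector.
scores = np.maximum(F_conv.reshape(-1) @ W, 0.0) @ H_t.T
probs = 1.0 / (1.0 + np.exp(-scores))
```

The final sigmoid mirrors the σ(·) applied to the decoder output when forming the conditional probability vector over entities.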
Following existing work on KG reasoning, the conditional probability vectors are modeled with a scoring function (i.e., a decoder) so that the quadruple (s, r, o, t) holds; d denotes the embedding dimension.
In this embodiment, TimeConvTransE and TimeConvTransR are selected as the decoders; each comprises a one-dimensional convolution layer and a fully-connected layer, and they are denoted TimeConvTransE(·) and TimeConvTransR(·), respectively. The conditional probability vector over all entities is then:
p(o | s, r, Ht, Rt, V″1, V″2) = σ(Ht · TimeConvTransE(hs,t, rt, V″1, V″2))
Likewise, the conditional probability vector over all relations is:
p(r | s, o, Ht, Rt, V″1, V″2) = σ(Rt · TimeConvTransR(hs,t, ho,t, V″1, V″2))
where σ(·) denotes the sigmoid function, hs,t and ho,t are the embeddings of the head entity s and the tail entity o in Ht, rt is the embedding of relation r in Rt, and V″1 and V″2 denote the time embedding matrices.
The loss function is calculated as follows. For the static graph constraint, α represents the rising speed of the angle, and y ∈ [0, 1, ..., m]. The maximum angle between the two embeddings of one entity is set to 180°, and the cosine of the angle between the two embeddings of entity i is denoted cos θi.
Thus, the final loss is:
L = u1Le + u2Lr + u3Lst (29)
where u1, u2, and u3 are the parameters controlling the loss terms.
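Equation (29) is a weighted combination of the three loss terms; a minimal sketch with dummy probabilities and assumed (hypothetical) weights u1, u2, u3:

```python
import numpy as np

def cross_entropy(p, target):
    """Negative log-likelihood of the true index under a probability vector."""
    return -np.log(p[target] + 1e-12)

# Dummy decoder outputs for one query (illustrative values only).
p_entity = np.array([0.1, 0.7, 0.2])      # entity-prediction probabilities
p_relation = np.array([0.3, 0.6, 0.1])    # relation-prediction probabilities
L_e = cross_entropy(p_entity, 1)          # entity loss for true entity 1
L_r = cross_entropy(p_relation, 1)        # relation loss for true relation 1
L_st = 0.05                               # static-graph constraint loss (placeholder)

# Eq. (29): weighted combination, with assumed weights u1, u2, u3.
u1, u2, u3 = 1.0, 1.0, 0.1
L = u1 * L_e + u2 * L_r + u3 * L_st
```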
During training, the Adam optimizer is used to train the model, because an optimizer that converges quickly is needed in order to obtain an accurate representation as soon as possible. Adam considers both the mean and the variance of the gradient when updating, so it converges quickly, has low memory requirements, and is easy to tune. Table 2 shows the training procedure of the TFCE model.
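The Adam update used during training can be sketched on a stand-in quadratic loss; the learning rate and step count are illustrative, and the quadratic merely takes the place of the TFCE loss.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.05, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: exponential first/second moment estimates, bias
    correction, then a step scaled by the corrected moments."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)
    v_hat = v / (1 - b2 ** t)
    return theta - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

# Minimize f(theta) = ||theta||^2 as a tiny stand-in for the TFCE loss.
theta = np.array([1.0, -2.0])
m, v = np.zeros_like(theta), np.zeros_like(theta)
for t in range(1, 2001):
    grad = 2.0 * theta                    # gradient of the stand-in loss
    theta, m, v = adam_step(theta, grad, m, v, t)
```

The per-coordinate scaling by the second-moment estimate is what gives Adam its fast convergence and modest memory cost noted in the text.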
TABLE 2
Experimental results
Experimental results of the TFCE model proposed by the temporal knowledge graph representation learning method based on time features and complex evolution (hereinafter, the present method) and of other reference models are summarized in Tables 3 and 4. The best results are shown in bold and the second-best are underlined. Table 3 shows the results of different methods on the entity prediction task. From the results, it can be seen that:
(1) Time-aware methods generally perform better than their static counterparts. For example, TA-DistMult is better than DistMult on all indices, which shows that more accurate representations can be obtained by incorporating time information into the knowledge graph. However, not all time-aware methods outperform static methods: TTransE fails to outperform RotatE, possibly because adding a time representation inside the TransE scoring function breaks the translation between entities. This illustrates the importance of modeling time information in a reasonable manner.
(2) The present method is significantly better than the baseline methods and achieves state-of-the-art MRR on all three datasets. Compared with other methods, its MRR improves by 7.95%, 11.99%, and 10.44% on the respective datasets, indicating that the time feature module, the complex evolution module, and the time-embedded decoders model the evolution characteristics of the TKG and enable more accurate representation learning.
(3) The TFCE model is clearly superior to the static models because its time feature module adaptively weights the importance of different time steps in the sequence, better capturing the long-range dependencies and associations in the time series and thus fully mining the time series data.
(4) The TFCE model is significantly better than the time-aware models because its complex evolution module not only learns low-dimensional representations of entities with a multi-layer perception mechanism, mining the structural features hidden in the TKG, but also weights the different parts of the relation path with the attention network, capturing the key information in the path. It can therefore predict the facts of future timestamps more accurately and perform representation learning on the TKG.
(5) The TFCE model is clearly superior to the most advanced time-aware models because the design of the time features lets the model better mine the relationships among entities, relations, and timestamps, while the use of the static graph and the multi-layer attention network in the complex evolution module further benefits the learning of entity evolution embeddings.
Table 4 shows the prediction results of the present method and other existing methods on the relation prediction task. The best results are shown in bold and the second-best are underlined. Since some methods, such as RE-NET and CyGNet, cannot be directly applied to relation prediction, typical methods were chosen for comparison. According to the results, the present method is superior to the baseline methods. In terms of MRR, the TFCE model improves over the most advanced methods by 6.49%, 7.48%, and 4.74% on the respective datasets, demonstrating its superiority in relation prediction. This excellent performance shows that the combined use of the multi-layer perception mechanism and attention network in the time feature and complex evolution modules, together with the time-embedded decoders, enables more accurate temporal knowledge graph representation learning.
TABLE 3
TABLE 4
Comparison of prediction time: to study the efficiency of the TFCE model, its entity-prediction running time was compared, under the same environment, with the current classical and advanced models CyGNet, RE-NET, RE-GCN, and TiRGN. RE-GCN is currently the fastest baseline algorithm and TiRGN the most advanced one; all five models perform entity prediction given the real history. As can be seen from the results of Fig. 2, the TFCE model is 4.33, 29.10, and 7.55 times faster than TiRGN on ICEWS14s, ICEWS05-15, and ICEWS18, respectively. This is because TiRGN spends considerable time extracting historically repeated entities and relations from the global history space, while the TFCE model performs representation learning directly over the KG sequence and captures the features of entities and relations more quickly and accurately; the TFCE model is therefore more efficient than the most advanced baseline TiRGN. Its efficiency also exceeds that of the current best RE-GCN model: compared with RE-GCN, the TFCE model clears useless information from the matrices during parallel KG modeling, reducing the computational burden. The TFCE model surpasses RE-NET because its complex-evolution representation learning replaces RE-NET's processing of single queries one timestamp at a time, and this representation learning captures the structural dependencies between KGs more accurately. In the CyGNet algorithm, reasoning often involves a path-search operation, i.e., starting from the source entity and finding the target entity through relation paths; on a large-scale knowledge graph this search may traverse a large number of entities and relations, increasing the time complexity.
With the time features added, the TFCE model effectively models the time dependencies between events and captures the temporal relationships between different timestamps more accurately and quickly. In summary, the comparison of running time on the entity prediction task with the four most classical and most advanced models demonstrates the superiority of the TFCE model.
Ablation experiment: in order to eliminate bias between training and test results, an ablation study was performed to better understand how effectively the different model components capture the corresponding features. As shown in tables 5 and 6, the time feature module (tf) has the greatest effect on performance, indicating that capturing long-distance dependencies and associations in the time series is critical to prediction. The complex evolution module (ce) has a certain influence on all data sets, showing that a low-dimensional representation of the entity can be learned through the multi-layer perceptron to better mine the structural characteristics hidden in the TKG, which in turn has a positive effect on the evolutionary representation of entities and relations. Within this module, the attention network also has a positive influence on performance, showing that by weighting different parts of the relation path the model can process time-series data more effectively and thus better capture the relations among different KGs. The time-embedded decoder (td) likewise influences performance positively: by processing incomplete time-series data, it can better perform representation-learning reasoning. These results further demonstrate that all the mechanisms employed by the algorithm of the present method contribute to prediction.
TABLE 5
TABLE 6
To study the effect of the training proportion on the performance of the proposed TFCE framework, the present invention experiments by varying the proportion of training data in the total available data set. This analysis helps understand how model performance varies with the amount of training data. Figure 3 shows the effect of the training proportion on the 7 algorithms in table 3 and on the algorithm proposed by the present method. Across training proportions from 60% to 80%, TFCE outperforms all other algorithms; this can help determine the amount of training data needed to achieve satisfactory results and guide resource allocation in practical scenarios.
The present application is not limited to the above-mentioned embodiments, and any changes or substitutions that can be easily understood by those skilled in the art within the technical scope of the present application are intended to be included in the scope of the present application. Therefore, the protection scope of the present application should be subject to the protection scope of the claims.

Claims (6)

1. The time sequence knowledge graph representation learning method based on time characteristics and complex evolution is characterized by comprising the following steps of:
Constructing a TFCE model, wherein the TFCE model comprises a time feature module, a complex evolution module and two time-embedded decoders; the time feature module is used for time-coding the entities and relations in the knowledge graph and capturing the long-distance dependencies of the time features to obtain time embedding matrices; the complex evolution module introduces a perceptron mechanism and an attention network, and is used for learning the evolutionary representations of the entities and relations in the knowledge graph at each timestamp and updating the entity embedding matrix and the relation embedding matrix; the two time-embedded decoders, TimeConvTransE and TimeConvTransR, perform entity prediction and relation prediction simultaneously and output results through a sigmoid function;
inputting a data set into the TFCE model for training, calculating the loss function, and updating the model parameters with an Adam optimizer; the optimal TFCE model is obtained when the loss function keeps decreasing until convergence;
and carrying out the corresponding entity prediction or relation prediction in the time-series knowledge graph through the optimal TFCE model.
2. The time-feature-based and complex-evolution time-series knowledge graph representation learning method according to claim 1, wherein the processing procedure of the time feature module specifically comprises the following steps:
Traversing the data set, putting all non-repeated timestamps in the data set into an embedding matrix, and obtaining two different initial time embedding matrices from this embedding matrix through different learnable parameters;
and after the two different initial time embedding matrices are processed by a Transformer encoder, a GRU component is used to further extract the long-distance dependencies, thereby obtaining two different time embedding matrices.
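The pipeline of claim 2 (deduplicate timestamps → embed → Transformer encoder → GRU) can be sketched as follows. This is an illustrative, parameter-free pure-Python stand-in, not the patented implementation: a single self-attention pass stands in for the Transformer encoder, a simplified gate stands in for the learned GRU, and only one of the two time embedding matrices is produced.

```python
import math, random

random.seed(0)
D = 4  # embedding dimension (illustrative)

def embed_timestamps(timestamps, d=D):
    # Deduplicate timestamps (preserving order) and give each a vector,
    # standing in for a learnable initial time embedding matrix.
    uniq = list(dict.fromkeys(timestamps))
    return uniq, {t: [random.gauss(0, 1) for _ in range(d)] for t in uniq}

def self_attention(X):
    # Single-head scaled dot-product self-attention over the timestamp
    # sequence, a stand-in for one Transformer-encoder layer.
    n, d = len(X), len(X[0])
    out = []
    for i in range(n):
        scores = [sum(a * b for a, b in zip(X[i], X[j])) / math.sqrt(d)
                  for j in range(n)]
        mx = max(scores)
        w = [math.exp(s - mx) for s in scores]
        z = sum(w)
        w = [x / z for x in w]
        out.append([sum(w[j] * X[j][k] for j in range(n)) for k in range(d)])
    return out

def gru_step(h, x):
    # Minimal parameter-free GRU-style cell (gate from mean activations),
    # illustrating how the GRU propagates long-range time dependencies.
    z = 1.0 / (1.0 + math.exp(-(sum(x) / len(x) + sum(h) / len(h))))
    return [z * hi + (1 - z) * math.tanh(xi) for hi, xi in zip(h, x)]

timestamps = [0, 1, 1, 2, 3, 3]
uniq, table = embed_timestamps(timestamps)      # non-repeated timestamps only
seq = self_attention([table[t] for t in uniq])  # "Transformer" pass
h = [0.0] * D
for x in seq:
    h = gru_step(h, x)  # h approximates the final time-feature embedding
```

In the patented module this whole chain would be run twice with different learnable parameters to yield the two distinct time embedding matrices.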
3. The time-series knowledge graph representation learning method based on time characteristics and complex evolution according to claim 2, wherein the processing procedure of the complex evolution module specifically comprises the following steps:
Obtaining the relation embedding matrix R'_{t-1} at timestamp t-1 in the time-series knowledge graph, connecting R'_{t-1} with the two different time embedding matrices, and learning a low-dimensional representation of the relation using a multi-layer perceptron, the specific formula being as follows:
wherein [ ; ] denotes the vector concatenation operation,
To obtain a relation representation with stronger generalization capability, a randomly initialized relation embedding matrix R is connected with the relation embedding matrix obtained above, and the attention network is adopted to weight the different parts of the relation path and capture the key information in the path, as follows:
wherein g(·) represents the upper triangular part of the matrix,
Acquiring the entity embedding matrix H'_{t-1} at timestamp t-1 in the time-series knowledge graph, connecting it with the two different time embedding matrices, and then learning a low-dimensional representation of the entity using the multi-layer perceptron:
To better capture the structural dependencies between entities and relations, a GRU component is used, with the specific formula as follows:
R_t and the entity embedding obtained above are taken as inputs to the GCN, after which the associations between entities are captured through facts and the associations between relations are captured through shared entities, with the specific formula as follows:
wherein l denotes the layer index of the KG at timestamp t, l ∈ [0, w-1], and w denotes the total number of layers of the current KG; the aggregation features and self-loop parameters in the l-th layer capture, at fine granularity, the structural dependencies between concurrent facts in the l-th layer through the translational aggregation of entities, relations and time; c_o is the degree of entity o; f(·) is the LeakyReLU activation function,
The different parts of the entity are weighted by the attention network:
After capturing the potential key information of the entity, the static attributes of the knowledge graph are added, and the entity matrix is updated using the GRU:
H_t = GRU(H_{t-1}, H_s) (19).
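The chain of operations in claim 3 — concatenate the previous relation embedding with the two time embeddings, pass it through the multi-layer perceptron, weight path parts with attention, then update the entity state with a GRU-style gate — can be sketched in miniature. This is a toy, parameter-free illustration under assumed dimensions; the fixed random weights and the simplified gate are stand-ins for the learned parameters of the patented module.

```python
import math, random

random.seed(1)
D = 4  # low-dimensional representation size (illustrative)

def mlp(x):
    # One-layer perceptron stand-in: fixed random linear map + tanh,
    # mapping the concatenated vector down to dimension D.
    rng = random.Random(42)
    W = [[rng.gauss(0, 0.5) for _ in range(len(x))] for _ in range(D)]
    return [math.tanh(sum(Wi[j] * x[j] for j in range(len(x)))) for Wi in W]

def attention_pool(parts):
    # Softmax attention over the parts of a relation path:
    # higher-scoring parts contribute more to the pooled representation.
    scores = [sum(p) for p in parts]
    mx = max(scores)
    w = [math.exp(s - mx) for s in scores]
    z = sum(w)
    w = [x / z for x in w]
    return [sum(w[i] * parts[i][k] for i in range(len(parts))) for k in range(D)]

def gru_update(h_prev, h_new):
    # Gated convex combination of previous and candidate entity embeddings,
    # mimicking H_t = GRU(H_{t-1}, H_s).
    z = 1.0 / (1.0 + math.exp(-(sum(h_new) / len(h_new))))
    return [z * a + (1 - z) * b for a, b in zip(h_prev, h_new)]

r_prev = [random.gauss(0, 1) for _ in range(D)]  # R'_{t-1} (one row)
t_emb1 = [random.gauss(0, 1) for _ in range(D)]  # first time embedding
t_emb2 = [random.gauss(0, 1) for _ in range(D)]  # second time embedding

low_dim = mlp(r_prev + t_emb1 + t_emb2)          # [R'_{t-1}; T1; T2] -> MLP
path = [low_dim, mlp(t_emb1 + t_emb2 + r_prev)]  # two "parts" of a path
r_t = attention_pool(path)                       # attention-weighted relation
h_t = gru_update([0.1] * D, r_t)                 # GRU-style entity update
```

The real module operates on full embedding matrices and feeds R_t and the entity states into the GCN layers described above; this sketch only traces the data flow for a single row.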
4. The time-series knowledge graph representation learning method based on time features and complex evolution according to claim 3, further comprising, before connecting the relation embedding matrix R'_{t-1} with the two different time embedding matrices, the following step:
performing an outlier-removal operation on the relation embedding matrix R'_{t-1} at timestamp t-1.
5. The time-series knowledge graph representation learning method based on time features and complex evolution according to claim 4, wherein each of TimeConvTransE and TimeConvTransR comprises a one-dimensional convolution layer and a fully connected layer.
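The decoder structure named in claim 5 (one-dimensional convolution followed by a fully connected layer, scored through a sigmoid) can be sketched as follows. This is a minimal hand-rolled example under assumed shapes, not the patented TimeConvTransE: the kernel and the identity-plus-padding "fully connected layer" are illustrative placeholders for learned weights.

```python
import math

def conv1d(rows, kernel):
    # Valid 1-D convolution across the embedding dimension, with the
    # stacked rows (subject / relation / time) acting as input channels.
    d, k = len(rows[0]), len(kernel[0])
    return [sum(kernel[c][i] * rows[c][pos + i]
                for c in range(len(rows)) for i in range(k))
            for pos in range(d - k + 1)]

def score(subj, rel, time, cand):
    # TimeConvTransE-style scoring sketch: stack the subject, relation and
    # time embeddings, convolve, project to entity dimension (toy FC =
    # zero-padding), then dot with the candidate entity and apply sigmoid.
    feat = conv1d([subj, rel, time], kernel=[[0.5, 0.5]] * 3)
    fc = feat + [0.0] * (len(cand) - len(feat))
    return 1.0 / (1.0 + math.exp(-sum(a * b for a, b in zip(fc, cand))))

s = score(subj=[0.2, 0.1, -0.3, 0.4],
          rel=[0.0, 0.5, 0.2, -0.1],
          time=[0.1, 0.1, 0.1, 0.1],
          cand=[1.0, 0.0, 0.0, 1.0])  # score in (0, 1) via sigmoid
```

TimeConvTransR would apply the same structure with the roles of entity and relation swapped, scoring candidate relations instead of candidate entities.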
6. The time series knowledge graph representation learning method based on time characteristics and complex evolution according to claim 5, wherein the loss function is:
L = u_1 L_e + u_2 L_r + u_3 L_st (29)
where u_1, u_2 and u_3 are the parameters controlling each loss term, and L_st is the loss function of the static-graph constraint component,
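The training described in claim 1 minimizes this weighted loss with the Adam optimizer until convergence. As a self-contained sketch (with a toy quadratic stand-in for the entity, relation and static losses, and hand-rolled Adam updates in place of a framework optimizer), the loop looks like:

```python
import math

def adam_step(params, grads, m, v, t, lr=0.05, b1=0.9, b2=0.999, eps=1e-8):
    # One Adam update over flat parameter lists (stand-in for a framework
    # Adam optimizer): bias-corrected first/second moment estimates.
    for i, g in enumerate(grads):
        m[i] = b1 * m[i] + (1 - b1) * g
        v[i] = b2 * v[i] + (1 - b2) * g * g
        mhat = m[i] / (1 - b1 ** t)
        vhat = v[i] / (1 - b2 ** t)
        params[i] -= lr * mhat / (math.sqrt(vhat) + eps)

# Toy stand-ins: quadratic "entity", "relation" and "static" losses,
# combined exactly as L = u1*Le + u2*Lr + u3*Lst in formula (29).
u1, u2, u3 = 0.5, 0.3, 0.2

def loss(p):
    Le, Lr, Lst = (p[0] - 1.0) ** 2, (p[1] + 2.0) ** 2, p[2] ** 2
    return u1 * Le + u2 * Lr + u3 * Lst

def grad(p):
    return [2 * u1 * (p[0] - 1.0), 2 * u2 * (p[1] + 2.0), 2 * u3 * p[2]]

params, m, v = [0.0, 0.0, 1.0], [0.0] * 3, [0.0] * 3
history = []
for step in range(1, 2001):
    history.append(loss(params))
    adam_step(params, grad(params), m, v, step)
# The loss keeps decreasing toward convergence, as required of the
# training procedure in claim 1.
```

In the actual method, L_e and L_r would be the decoder cross-entropy-style losses over entities and relations and L_st the static-graph constraint, computed over the TKG batches rather than these toy quadratics.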
CN202311606478.4A 2023-11-29 Time sequence knowledge graph representation learning method based on time characteristics and complex evolution Active CN117952198B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311606478.4A CN117952198B (en) 2023-11-29 Time sequence knowledge graph representation learning method based on time characteristics and complex evolution


Publications (2)

Publication Number Publication Date
CN117952198A true CN117952198A (en) 2024-04-30
CN117952198B CN117952198B (en) 2024-08-30


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022140900A1 (en) * 2020-12-28 2022-07-07 华为技术有限公司 Method and apparatus for constructing personal knowledge graph, and related device
CN114780739A (en) * 2022-04-14 2022-07-22 武汉大学 Time sequence knowledge graph completion method and system based on time graph convolution network
CN115391553A (en) * 2022-08-23 2022-11-25 西北工业大学 Method for automatically searching time sequence knowledge graph complement model
CN116402138A (en) * 2023-03-21 2023-07-07 成都数之联科技股份有限公司 Time sequence knowledge graph reasoning method and system for multi-granularity historical aggregation
CN116894096A (en) * 2023-07-21 2023-10-17 浙江大学 News event prediction method based on recursive double hypergraph neural network
CN117035089A (en) * 2023-06-25 2023-11-10 同济大学 Tense knowledge graph reasoning method based on attention of multi-relation circulating event graph


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Ding Jianhui; Jia Weijia: "A Survey of Knowledge Graph Completion Algorithms", Information and Communications Technology, no. 01, 15 February 2018 (2018-02-15), pages 58 - 64 *

Similar Documents

Publication Publication Date Title
Chen et al. Shallowing deep networks: Layer-wise pruning based on feature representations
Xia et al. Complete random forest based class noise filtering learning for improving the generalizability of classifiers
CN112434169B (en) Knowledge graph construction method and system and computer equipment thereof
CN110175628A (en) A kind of compression algorithm based on automatic search with the neural networks pruning of knowledge distillation
Dai et al. Incremental learning using a grow-and-prune paradigm with efficient neural networks
CN112528035A (en) Knowledge graph reasoning method and device based on relational attention and computer equipment
CN113360670B (en) Knowledge graph completion method and system based on fact context
CN115695950B (en) Video abstract generation method based on content perception
CN115080587B (en) Electronic component replacement method, device and medium based on knowledge graph
CN110990580A (en) Knowledge graph construction method and device, computer equipment and storage medium
CN112905809B (en) Knowledge graph learning method and system
CN118136112A (en) Polypeptide hemolysis prediction method integrating multi-view feature deep ensemble learning
CN117952198B (en) Time sequence knowledge graph representation learning method based on time characteristics and complex evolution
Wang et al. Dynamic embedding graph attention networks for temporal knowledge graph completion
CN117952198A (en) Time sequence knowledge graph representation learning method based on time characteristics and complex evolution
CN115809346A (en) Small sample knowledge graph completion method based on multi-view semantic enhancement
CN114530197B (en) Matrix completion-based drug target prediction method and system
CN115860064A (en) Dynamic network embedding method based on evolutionary graph convolution
CN113569479B (en) Long-term multi-step control method, device and storage medium for rock mass crack development of stone cave temple
Gong et al. Neuro-Symbolic Embedding for Short and Effective Feature Selection via Autoregressive Generation
Zhang et al. Compressing knowledge graph embedding with relational graph auto-encoder
CN114036319A (en) Power knowledge extraction method, system, device and storage medium
Cai et al. Resolving Power Equipment Data Inconsistency via Heterogeneous Network Alignment
CN111198933A (en) Method, device, electronic device and storage medium for searching target entity
Puente et al. Predicting COVID-19 Cases using Deep LSTM and CNN Models

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant