CN117435715B - Question-answering method for improving temporal knowledge graph question answering based on auxiliary supervision signals - Google Patents


Info

Publication number: CN117435715B (granted); other versions: CN117435715A (application publication, Chinese)
Application number: CN202311755781.0A
Authority: CN (China)
Inventors: Ma Tinghuai (马廷淮), Zhu Yu (朱玉)
Assignee (current and original): Nanjing University of Information Science and Technology
Legal status: Active


Classifications

    • G06F16/3329 Natural language query formulation or dialogue systems
    • G06F16/367 Ontology
    • G06F40/30 Semantic analysis
    • G06N3/0499 Feedforward networks
    • G06N3/092 Reinforcement learning
    • G06N5/045 Explanation of inference; Explainable artificial intelligence [XAI]; Interpretable artificial intelligence
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

For the task setting of temporal knowledge graph question answering, a dynamic time-encoding module is added to the NSM reasoning module to capture the temporal information of the temporal knowledge graph, and the entity representation is modified throughout the instruction module: using a relative-time representation, each entity is divided into a static representation part and a dynamic representation part, and the two parts are concatenated to update the entity representation in real time. The invention combines the NSM with a reinforcement learning algorithm and, by simulating and tracking state changes and temporal relations in the knowledge graph, achieves accurate and efficient temporal knowledge graph question answering under multi-hop reasoning; compared with other classes of methods, it also makes the reasoning path interpretable. The invention provides intelligent, accurate, and personalized question-answering services, promotes the development and application of artificial intelligence and knowledge graph technology, and has broad application prospects.

Description

Question-answering method for improving temporal knowledge graph question answering based on auxiliary supervision signals
Technical Field
The invention relates to a question-answering method for improving temporal knowledge graph question answering based on auxiliary supervision signals, and belongs to the technical field of artificial intelligence.
Background
In the current field of artificial intelligence, temporal knowledge graph question answering is an important task that involves reasoning over, and answering questions about, the temporal information in a knowledge graph.
However, conventional reinforcement learning methods face some challenges when solving the temporal knowledge graph question-answering problem. Reinforcement learning typically relies only on delayed reward signals, which means the agent must explore without explicit supervision; this can lead to inefficiency and to reasoning paths that lack interpretability.
Therefore, the technical problems to be solved by the invention are that temporal knowledge graph question answering lacks interpretability of the reasoning path, and that when it is realized with a reinforcement learning method, the reasoning process suffers from unsupervised, blind exploration.
Disclosure of Invention
Purpose: To overcome the defects of the prior art, the invention provides a question-answering method for improving temporal knowledge graph question answering based on auxiliary supervision signals, which offers stronger interpretability of the reasoning path, stronger reasoning capability, and more accurate answers to multi-hop complex temporal questions.
Technical scheme: To solve the above technical problems, the invention adopts the following technical scheme.
A method for improving temporal knowledge graph question answering based on auxiliary supervision signals comprises the following steps:
Obtaining the semantic representation of the natural language question through a pre-trained language model.
Replacing parts of the semantic representation of the natural language question with the TKG embedded representation to obtain the replaced semantic representation of the natural language question.
Inputting the replaced semantic representation of the natural language question into an information fusion layer, which outputs the final question representation.
Inputting the final question representation into a neural state machine, which outputs a list of entity distribution probabilities after the loss function converges.
Computing a weighted average of the entity distribution probabilities to obtain additional state information, and adding this additional state information to the policy network to form a new policy network.
Inputting the final question representation into the new policy network and outputting the final answer.
Preferably, the method further comprises: computing a reward function from the final answer and the true answer, updating the new policy network according to the reward function to obtain an updated policy network, inputting the final question representation into the updated policy network, and outputting the final answer.
As a preferred solution, the semantic representation of the natural language question is computed as:

$$q = W_{LM}\,\mathrm{DistilBERT}(q_{text})$$

where $q$ denotes the semantic representation of the natural language question, $W_{LM}$ denotes a $D \times D_{LM}$ learnable matrix, $D$ is the TKG embedding dimension, and $D_{LM}$ is the embedding dimension of DistilBERT; $\mathrm{DistilBERT}(\cdot)$ is the NLP processing module; $q_{text}$ is the text of the natural language question.
As a preferred solution, replacing parts of the semantic representation of the natural language question with the TKG embedded representation to obtain the replaced semantic representation specifically comprises:
replacing the entity part in the semantic representation of the natural language question with the TKG entity embedding to obtain $q^{ent}$;
then replacing the timestamp part of $q^{ent}$ with the TKG timestamp embedding to obtain the replaced natural language question semantic representation $q^{time}$.
The $i$-th element $q^{ent}_i$ is computed as:

$q^{ent}_i = W_{ent}\,\varepsilon$ if element $i$ corresponds to an entity $\varepsilon$, and $q^{ent}_i = q_i$ otherwise,

where $\varepsilon$ denotes the entity $e_h$ or $e_t$, $\tau$ denotes a timestamp, $W_{ent}$ denotes the $D \times D$ learnable matrix of the question representation after entity replacement, $q^{ent}_i$ denotes the $i$-th element of $q^{ent}$, $e_h$ denotes the head entity, and $e_t$ denotes the tail entity.
The $i$-th element $q^{time}_i$ is computed as:

$q^{time}_i = \tau$ if element $i$ corresponds to a timestamp $\tau \in [T_2, T_1]$, and $q^{time}_i = q^{ent}_i$ otherwise,

where $T_1$ and $T_2$ respectively denote the maximum and minimum times obtained by sorting all the times retrieved when all the entities in the question query the temporal knowledge graph.
As a preferred solution, inputting the replaced semantic representation of the natural language question into the information fusion layer, which outputs the final question representation, specifically comprises:
processing and integrating the replaced natural language question semantic representation through a multi-layer self-attention mechanism and a feed-forward neural network to obtain a matrix $\tilde{q}$, and taking the element of $\tilde{q}$ corresponding to the question flag bit as the final question representation $q^{final} = \tilde{q}_{\langle CLS \rangle}$, where $\langle CLS \rangle$ denotes the flag bit (class token) of the natural language question.
Preferably, inputting the final question representation into the neural state machine, which outputs the entity distribution probability list after the loss function converges, specifically comprises:
computing, from the final question representation $q^{final}$, the instruction vector $i^{(k)}$ corresponding to the current $k$-th reasoning step;
computing, for each entity $e$ of the current $k$-th reasoning step, the dynamic feature $T_e$ of the entity, and obtaining the entity representation of the current $k$-th reasoning step $e^{(k)} = [e;\,T_e]$, where $e \in E$ and $E$ denotes the set of entities in the reasoning step;
obtaining, for each entity, the entity distribution probability $p^{(k-1)}$ of the $(k-1)$-th reasoning step;
computing, from the instruction vector $i^{(k)}$, the matching vector $m^{(k)}$;
computing, from the entity distribution probability $p^{(k-1)}$ and the matching vector $m^{(k)}$, the entity set $\tilde{e}^{(k)}$ of the current $k$-th reasoning step;
computing, from $\tilde{e}^{(k)}$, the entity distribution probability $p^{(k)}$ of the current $k$-th reasoning step;
repeating the above steps: when the loss function of each reasoning step converges, the neural state machine outputs the entity distribution probabilities, and after $n$ reasoning steps the entity distribution probability list $\{p^{(1)}, p^{(2)}, \ldots, p^{(n)}\}$ is obtained.
Preferably, the instruction vector $i^{(k)}$ is computed as:

$$i^{(k)} = \sum_{j=1}^{l} \alpha_j^{(k)}\,h_j$$

where $\alpha_j^{(k)}$ denotes the attention weight of the current $k$-th reasoning step, $h_j$ denotes the hidden state information of the encoder, $j$ denotes the position in the hidden state sequence, and $l$ denotes the total length of the hidden state sequence.
The attention weight $\alpha_j^{(k)}$ is computed as:

$$\alpha_j^{(k)} = \mathrm{softmax}_j\!\left(W_\alpha\,\bigl(q^{(k)} \odot h_j\bigr) + b_\alpha\right)$$

where $q^{(k)}$ denotes the final question representation of the current $k$-th reasoning step, $W_\alpha$ denotes a $D \times D$ learnable matrix of $q^{(k)}$, $b_\alpha$ denotes a $D \times D$ learnable bias matrix of $q^{(k)}$, and softmax denotes the activation function.
The dynamic feature $T_e$ of the entity is computed as:

$$T_e = \sigma\!\left(W_T\,\Delta t + b_T\right)$$

where $\Delta t$ denotes the difference between the timestamp of the entity in the current reasoning step and the timestamp in the question, $W_T$ denotes the $D \times D$ learnable matrix of the dynamic entity feature, $b_T$ denotes the $D \times D$ learnable bias matrix of the dynamic entity feature, and $\sigma$ denotes an activation function.
The matching vector $m_i^{(k)}$ is computed as:

$$m_i^{(k)} = \sigma\!\left(W_m\,\bigl(i^{(k)} \otimes r_i\bigr)\right)$$

where $W_m$ denotes a $D \times D$ learnable matrix of the matching vector, $\sigma$ denotes an activation function, $\otimes$ denotes the Kronecker product, and $r_i$ denotes the relation of the $i$-th entity.
The entity set $\tilde{e}^{(k)}$ is computed as:

$$\tilde{e}^{(k)} = \sum_{i=1}^{u} p_i^{(k-1)}\,m_i^{(k)}$$

where $u$ denotes the total number of entities associated with the current reasoning step.
The entity distribution probability $p^{(k)}$ is computed as:

$$p^{(k)} = \mathrm{softmax}\!\left(\tilde{E}^{(k)\top} w_p\right)$$

where $w_p$ is a parameter for deriving the entity distribution, $\tilde{E}^{(k)}$ stacks the aggregated entity representations of the current step, and softmax denotes the activation function.
Preferably, the loss function $L$ is computed as:

$$L = L_f + \lambda_1\,L_b + \lambda_2\,L_c$$

where $\lambda_1$ and $\lambda_2$ denote hyper-parameters controlling the factor weights, $L_f$ denotes the forward loss, $L_b$ denotes the backward loss, and $L_c$ denotes the constraint loss.
Preferably, taking the weighted average of the entity distribution probabilities as additional state information and adding it to the policy network to form a new policy network specifically comprises:
obtaining, from the entity distribution probability list $\{p^{(1)}, \ldots, p^{(n)}\}$, the distribution probability of each entity in each of the $n$ reasoning steps, and computing a weighted average of these $n$ probabilities to obtain the weighted average distribution probability $p_{avg}$ of each entity;
adding the weighted average distribution probability $p_{avg}$ as additional state information to the state information of the policy network to form the new policy network.
The new policy network is $\pi(a_l \mid s_l)$, where the state information is $s_l$ and the action information is $a_l$:

$s_l = (e_l,\, t_l,\, e_q,\, t_q,\, p_{avg})$, $\quad a_l = (e_a,\, r_a,\, t_a)$,

where $e_l$ denotes the current entity, $t_l$ the current timestamp, $e_q$ the entity in the question, $t_q$ the timestamp in the question, $p_{avg}$ the additional state information, $e_a$ the currently selected action entity, $r_a$ the currently selected action relation, and $t_a$ the currently selected action timestamp.
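The weighted averaging of the $n$ per-step entity distributions and their use as extra state can be sketched as follows; the uniform step weights and the toy state vector are assumptions, since the patent does not fix them:

```python
import numpy as np

rng = np.random.default_rng(5)
n_steps, n_entities = 3, 4

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Entity distribution probability list {p^(1), ..., p^(n)} from the NSM.
P = np.stack([softmax(rng.normal(size=n_entities)) for _ in range(n_steps)])

# Hypothetical step weights (uniform here); each p^(k) sums to 1, so the
# weighted average over steps is again a distribution over entities.
w = np.full(n_steps, 1.0 / n_steps)
p_avg = w @ P                     # weighted average probability per entity

# Append p_avg as additional state information to a toy RL state vector.
state = rng.normal(size=(8,))
new_state = np.concatenate([state, p_avg])

print(p_avg.shape, new_state.shape)
```

Because every step's distribution is normalized, the averaged signal stays a valid probability vector that the policy can consume directly.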
Preferably, inputting the final question representation into the new policy network and outputting the final answer specifically comprises:
inputting the final question representation $q^{final}$ into a max-pooling layer to obtain the question context vector $c_q$;
obtaining the historical search path $h_l$;
computing, from the historical search path $h_l$ and the question context vector $c_q$, the expected target node $\tilde{e}_d$ and edge $\tilde{r}_d$;
computing, from the expected target node $\tilde{e}_d$ and edge $\tilde{r}_d$, the candidate action score $\phi(a_{l+1})$ of each optional action $a_{l+1} \in A_{l+1}$ of step $l+1$;
taking the action corresponding to the maximum candidate action score as the input of the agent at the next step; when the iteration reaches step $L$, the output of the agent is the final answer.
The historical search path $h_l$ is computed from the sequence of visited nodes, where $r_l$, $e_l$, and $t_l$ respectively denote the relation, entity, and time of the current node in reasoning step $l$.
The expected target node $\tilde{e}_d$ and edge $\tilde{r}_d$ are computed from $h_l$ and $c_q$, where $W_e$ denotes the learnable matrix of the target node $\tilde{e}_d$, $W_r$ denotes the $D \times D$ learnable matrix of the target edge $\tilde{r}_d$, $W_d$ denotes the learnable matrix producing the expected target, and $\sigma$ is an activation function.
The candidate action score $\phi(a_{l+1})$ is computed from $\tilde{e}_d$, $\tilde{r}_d$, and the candidate action, where $W_1$ and $W_2$ are learnable matrices of the score, $\sigma$ is an activation function, $e_l$ denotes the current entity, $t_l$ denotes the current timestamp, $a_{l+1}$ denotes an optional action of step $l+1$, $e_{l+1}$ denotes the entity in the optional action, and $r_{l+1}$ denotes the relation in the optional action.
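One plausible instantiation of the action-selection step (an assumption, since the patent's exact scoring formula survives only through its symbol list) scores each candidate action by its similarity to the expected target node and edge:

```python
import numpy as np

rng = np.random.default_rng(6)
D, n_actions = 8, 5

e_target = rng.normal(size=(D,))  # expected target node (from history + context)
r_target = rng.normal(size=(D,))  # expected target edge

# Candidate actions: (entity embedding, relation embedding) pairs; the
# embeddings here are random stand-ins for TKG embeddings.
cand_e = rng.normal(size=(n_actions, D))
cand_r = rng.normal(size=(n_actions, D))

# Hypothetical scoring: dot-product similarity of each candidate to the
# expected target node plus its similarity to the expected target edge.
scores = cand_e @ e_target + cand_r @ r_target
best = int(np.argmax(scores))     # action handed to the agent's next step

print(scores.shape, 0 <= best < n_actions)
```

The greedy argmax mirrors the patent's rule of passing the maximum-scoring action to the next step of the agent.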
As a preferred scheme, computing the reward function from the final answer and the true answer, updating the new policy network according to the reward function to obtain an updated policy network, inputting the final question representation into the updated policy network, and outputting the final answer specifically comprises:
obtaining, from the final answer $e_{pred}$ and the true answer $e_{ans}$, the reward value $R(s_L)$, where $s_L$ denotes the final state;
computing, from the reward value, the reward function $R'(s_L)$;
updating the new policy network according to the reward function $R'(s_L)$ to obtain the updated policy network; inputting the final question representation into the updated policy network and outputting the final answer.
The reward value $R(s_L)$ is computed as:

$$R(s_L) = \mathbb{1}\!\left[e_{pred} = e_{ans}\right]$$

where $\mathbb{1}[\cdot]$ denotes the indicator function, which outputs 1 when the equation inside is true and 0 when it is false.
The reward function $R'(s_L)$ is computed from the reward value and a Dirichlet prior, where $\mathrm{Dir}(\cdot)$ denotes the Dirichlet distribution, $q$ is the question, and $\alpha_q$ is the parameter vector of the Dirichlet distribution for question $q$.
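The terminal reward as an indicator function can be sketched directly; the Dirichlet part is shown only as sampling per-question weights from $\mathrm{Dir}(\alpha_q)$, with how those weights modulate the reward left as an assumption:

```python
import numpy as np

rng = np.random.default_rng(7)

def terminal_reward(predicted: str, answer: str) -> float:
    # Indicator function: 1 when the final entity equals the true answer.
    return 1.0 if predicted == answer else 0.0

# Hypothetical Dirichlet-based shaping: per-question weights drawn from
# Dir(alpha_q); the alpha values below are illustrative only.
alpha_q = np.array([1.0, 2.0, 3.0])
weights = rng.dirichlet(alpha_q)   # non-negative, sums to 1

print(terminal_reward("entity_a", "entity_a"),
      terminal_reward("entity_b", "entity_a"),
      round(float(weights.sum()), 6))
```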
Beneficial effects: The question-answering method for improving temporal knowledge graph question answering based on auxiliary supervision signals addresses two problems. First, existing temporal knowledge graph question-answering methods generally lack interpretability of the reasoning path; moreover, when such question answering is realized with reinforcement learning, unlike supervised learning, only a reward signal is available, and this reward signal is usually delayed, so feedback on an action arrives only long after it takes effect in the environment, which leads to unsupervised, blind exploration during reasoning. Second, for the task background of the temporal knowledge graph, a dynamic time-encoding module is added to the NSM (neural state machine) reasoning module to reflect the temporal information of the temporal knowledge graph, and the entity representation is modified throughout the instruction module: using a relative-time representation, each entity is divided into a static representation and a dynamic representation, and the two parts are concatenated to update the entity representation in real time. The model combines the NSM with a reinforcement learning algorithm and, by simulating and tracking state changes and temporal relations in the knowledge graph, achieves accurate and efficient temporal knowledge graph question answering under multi-hop reasoning; compared with other classes of methods (based on information retrieval or semantic parsing), the model makes the reasoning path interpretable.
The method provided by the invention has broad application prospects: it can be used in fields such as intelligent assistants, knowledge graph management, and intelligent search, provides intelligent, accurate, and personalized question-answering services, and promotes the development and application of artificial intelligence and knowledge graph technology.
Drawings
FIG. 1 is a flow chart of the method for improving temporal knowledge graph question answering based on auxiliary supervision signals.
Fig. 2 is a diagram of the policy network operating structure in the agent reasoning phase.
Fig. 3 is an illustration of the reasoning process of the method of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the accompanying drawings, in which embodiments of the invention are shown. It is evident that the described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person skilled in the art without inventive effort fall within the scope of the present invention.
The invention will be further described with reference to specific examples.
As shown in FIG. 1, the invention introduces a method for improving temporal knowledge graph question answering based on auxiliary supervision signals, which comprises the following steps:
Step 1), obtaining the semantic representation of the natural language question through a pre-trained language model.
Considering that NLP (natural language processing) models have large parameter counts and GPU (graphics processing unit) compute is limited, the method selects the more general and lightweight DistilBERT as the pre-trained language model to acquire the initial semantic representation $q$ of the natural language question:

$$q = W_{LM}\,\mathrm{DistilBERT}(q_{text})$$

where $q$ is a $D \times N$ embedding matrix, $N$ is the number of tokens (basic units of the question text), and $D$ is the dimension of the TKG (temporal knowledge graph) embedding; $W_{LM}$ is a $D \times D_{LM}$ learnable matrix, where $D_{LM}$ is the embedding dimension of DistilBERT; $\mathrm{DistilBERT}(\cdot)$ is the NLP processing module; $q_{text}$ is the text of the natural language question.
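As an illustrative sketch (not part of the patent), the projection of a language-model encoding into the TKG embedding space can be mimicked with random stand-ins for the DistilBERT token states; the dimensions D, D_LM, N and the matrix name W_LM are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: D = TKG embedding dim, D_LM = DistilBERT hidden dim.
D, D_LM, N = 8, 16, 5            # N tokens in the question

# Stand-in for DistilBERT token states (D_LM x N); in practice these would
# come from the pre-trained language model.
H = rng.normal(size=(D_LM, N))

# Learnable projection W_LM (D x D_LM) maps the LM space into the TKG space:
# q = W_LM @ DistilBERT(q_text)
W_LM = rng.normal(size=(D, D_LM)) * 0.1
q = W_LM @ H                     # D x N question semantic representation

print(q.shape)
```

The projection leaves one column per token, so later steps can overwrite individual entity or timestamp columns in place.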
Step 2), reshaping the question with the background knowledge stored in the temporal knowledge graph, so that the question representation contains TKG information. Namely: representation learning is performed on the entities and timestamps in the TKG (Temporal Knowledge Graph) using a TKG embedding method, and the entity and timestamp representations in the initial semantic representation of the natural language question obtained in step 1) are replaced with the TKG embedded representations. The replaced semantic matrix is then input into the information fusion layer to encode the question a second time, making full use of the temporal knowledge graph for knowledge enhancement.
For temporal knowledge graph question answering, the dataset consists of two parts: time-constrained question-answer pairs and the corresponding temporal knowledge graph. A temporal knowledge graph (TKG) consists of a large number of quadruples describing facts in the real world. Formally, a temporal dataset is defined as $G = \{(e_h, r, e_t, t)\} \subseteq E \times R \times E \times T$, meaning that the head entity $e_h$ and tail entity $e_t$ of a quadruple stand in relation $r$ at timestamp $t$, where $E$ denotes the set of entities, $R$ the set of relations, and $T$ the set of timestamps.
Considering the complex spatio-temporal interplay between time and relations, the method selects the TComplEx embedding method, which maps the TKG into complex space. This adapts well to the special properties of a temporal knowledge graph, improves the spatio-temporal modeling capability of the embedded representation, and can capture how the complex interactions between entities and relations evolve over time.
Specifically, the entity part in the initial semantic representation of the natural language question is first replaced with the TKG entity embedding, giving the replaced semantic representation $q^{ent}$; the timestamp part is then replaced with the TKG timestamp embedding, giving the final replaced representation $q^{time}$.
Using the TComplEx TKG embedding method, the question is reshaped with the background knowledge stored in the temporal knowledge graph: the entity and timestamp representations in the question semantic matrix obtained in step 1) are replaced with TKG embedded representations. The entity representation is replaced first, and the question representation after entity replacement is $q^{ent}$, whose $i$-th element is:

$q^{ent}_i = W_{ent}\,\varepsilon$ if element $i$ corresponds to an entity $\varepsilon$, and $q^{ent}_i = q_i$ otherwise,

where $\varepsilon$ denotes the entity $e_h$ or $e_t$, $\tau$ denotes a timestamp, $W_{ent}$ denotes the $D \times D$ learnable matrix of the question representation after entity replacement, $q^{ent}_i$ denotes the $i$-th element of $q^{ent}$, $e_h$ denotes the head entity, and $e_t$ denotes the tail entity.
Further, the question representation after timestamp replacement is $q^{time}$, whose $i$-th element is:

$q^{time}_i = \tau$ if element $i$ corresponds to a timestamp $\tau \in [T_2, T_1]$, and $q^{time}_i = q^{ent}_i$ otherwise,

where $T_1$ and $T_2$ are the maximum and minimum times after sorting all the times obtained when all the entities in the question query the temporal knowledge graph.
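The entity/timestamp replacement step can be sketched as follows; the token positions, the identity stand-in for the learnable matrix, and the embedding values are all hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)
D, N = 8, 5

q = rng.normal(size=(D, N))      # question matrix from the LM projection
ent_emb = rng.normal(size=(D,))  # hypothetical TComplEx entity embedding
ts_emb = rng.normal(size=(D,))   # hypothetical TComplEx timestamp embedding
W_ent = np.eye(D)                # stand-in for the learnable D x D matrix

q_rep = q.copy()
q_rep[:, 1] = W_ent @ ent_emb    # token 1 assumed to mention the entity
q_rep[:, 3] = W_ent @ ts_emb     # token 3 assumed to mention the timestamp

# Only the entity and timestamp columns change; the other tokens keep
# their language-model semantics.
print(np.allclose(q_rep[:, 0], q[:, 0]), np.allclose(q_rep[:, 1], q[:, 1]))
```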
Step 3), constructing an information fusion layer to reshape the question. The information fusion layer consists of two Transformer encoder layers. The final replaced natural language question semantic representation $q^{time}$, whose parts were replaced by the TKG embedded representation in step 2), is input into the information fusion layer and re-encoded; the layer processes and integrates the information to obtain $\tilde{q}$, from which the final question representation $q^{final}$ is taken:

$$q^{final} = \tilde{q}_{\langle CLS \rangle}$$

where $\tilde{q}$ is a $D \times N$ embedding matrix, $N$ is the number of tokens, and $D$ is the dimension of the TKG embedding. $\langle CLS \rangle$ denotes the flag bit of the natural language question: a class token inserted at the beginning of each sentence, whose final output is used to aggregate the representation information of the entire sequence. $\tilde{q}_N$ denotes the column vector corresponding to token $N$.
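A minimal numpy sketch of the two-layer fusion encoder (single-head self-attention plus a feed-forward block, with residual connections assumed; the real layer shapes and parameters are not specified in the patent):

```python
import numpy as np

rng = np.random.default_rng(2)
D, N = 8, 5

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def encoder_layer(X, Wq, Wk, Wv, W1, W2):
    # Single-head self-attention over the N token rows of X (N x D).
    A = softmax((X @ Wq) @ (X @ Wk).T / np.sqrt(D), axis=-1)
    H = X + A @ (X @ Wv)                  # residual attention sub-layer
    return H + np.maximum(H @ W1, 0) @ W2 # residual feed-forward sub-layer

X = rng.normal(size=(N, D))               # replaced question, tokens as rows
params = [tuple(rng.normal(size=(D, D)) * 0.1 for _ in range(5))
          for _ in range(2)]              # two encoder layers, as described
for p in params:
    X = encoder_layer(X, *p)

q_final = X[0]                            # row of the <CLS> flag bit
print(q_final.shape)
```

Taking the first row mirrors selecting the $\langle CLS \rangle$ column of $\tilde{q}$ as the final question representation.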
Step 4). To solve the problem that the reinforcement learning policy network lacks supervision signals to guide the agent towards effective exploration, so that blind path exploration and false reasoning occur, the method introduces an NSM (neural state machine) to generate auxiliary supervision signals for the reasoning phase. The NSM is improved into a time-aware neural state machine: dynamic encoding is applied to the entities so that it suits the temporal knowledge graph question-answering task, and the auxiliary supervision signal is the probability distribution over each node of the reasoning path, output after the module converges.
The NSM acquires the auxiliary supervision signals by simulating, on the temporal knowledge graph, the process of stepwise reasoning from the entity corresponding to the final question representation to the next node, considering the probability distribution of every node that may become a waypoint, and, after the constraint function converges, recording in each reasoning step the probability distribution over all nodes that may lie on the path. In this way, each entity is associated with a probability of leading to the answer.
The time-aware neural state machine consists of an instruction module and a reasoning module. The final question representation $q^{final}$ obtained in step 3) serves as one input of the instruction module, which outputs the instruction vector $i^{(k)}$, an input of the second-stage reasoning module.
Step 401), workflow of the instruction module:
In the instruction module, an attention mechanism is used so that different instruction vectors at different steps can attend to specific parts of the question, and the question representation $q^{(k)} = W^{(k)}\,[i^{(k-1)};\,q^{final}] + b^{(k)}$ is updated dynamically so that the information of previous instruction vectors is carried forward. These updates are repeated, and after $n$ reasoning steps the output of the instruction module is obtained, namely the instruction vector list $\{i^{(1)}, \ldots, i^{(n)}\}$, where $k$ denotes the current reasoning step, $q^{(k)}$ denotes the final question representation corresponding to step $k$, $W^{(k)}$ denotes the weight and $b^{(k)}$ the bias corresponding to step $k$, and $i^{(k-1)}$ denotes the previous instruction vector, which may take a given initial value; $W^{(k-1)}$ and $b^{(k-1)}$ are the learnable matrix and bias matrix of the previous step.
The instruction vector $i^{(0)}$ is initialized to the zero vector, and the instruction vector is computed as:

$$i^{(k)} = \sum_{j=1}^{l} \alpha_j^{(k)}\,h_j, \qquad \alpha_j^{(k)} = \mathrm{softmax}_j\!\left(W_\alpha\,\bigl(q^{(k)} \odot h_j\bigr) + b_\alpha\right)$$

where $\alpha_j^{(k)}$ denotes the attention weight, $h_j$ denotes the hidden state information of the encoder (each hidden state corresponds to a word of the input sentence), $j$ denotes the position in the hidden state sequence, $l$ denotes the total length of the hidden state sequence, and $k$ denotes the current reasoning step. $W_\alpha$ denotes a $D \times D$ learnable matrix of $q^{(k)}$ and $b_\alpha$ denotes a $D \times D$ learnable bias matrix of $q^{(k)}$.
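The instruction-module loop can be sketched as follows, with randomly initialized stand-ins for the learnable matrices and a single shared attention matrix as a simplifying assumption:

```python
import numpy as np

rng = np.random.default_rng(3)
D, L_seq, n_steps = 8, 6, 3

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

H = rng.normal(size=(L_seq, D))        # encoder hidden states h_j
q_final = rng.normal(size=(D,))
W_att = rng.normal(size=(D, D)) * 0.1  # shared stand-in for W_alpha
b_att = np.zeros(D)

i_vec = np.zeros(D)                    # i^(0) initialized to the zero vector
instructions = []
for k in range(n_steps):
    W_k = rng.normal(size=(D, 2 * D)) * 0.1       # step-specific W^(k)
    q_k = W_k @ np.concatenate([i_vec, q_final])  # dynamic question update
    # One attention score per hidden state, normalized over positions j.
    scores = np.array([(W_att @ (q_k * h) + b_att).sum() for h in H])
    alpha = softmax(scores)
    i_vec = alpha @ H                  # i^(k) = sum_j alpha_j h_j
    instructions.append(i_vec)

print(len(instructions), instructions[0].shape)
```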
Step 402), workflow of the reasoning module:
First, for the task background of temporal knowledge graph question answering, in order to reflect the temporal characteristics and strengthen time-information modeling, note that entity features change over time. Therefore, for the entities $e \in E$ involved in the reasoning module, where $E$ denotes the entity set of the reasoning module, a relative-time representation method is adopted for the dynamic feature $T_e$ of the entity:

$$T_e = \sigma\!\left(W_T\,\Delta t + b_T\right)$$

where $\Delta t$ denotes the difference between the timestamp of the entity in the current reasoning step and the time in the question, $W_T$ denotes the $D \times D$ learnable matrix of the dynamic entity feature, $b_T$ denotes the $D \times D$ learnable bias matrix of the dynamic entity feature, and $\sigma$ denotes an activation function.
Finally, the potentially invariant (static) representation $e$ of the entity and its dynamic representation $T_e$ are concatenated to obtain the entity of the current $k$-th reasoning step, denoted $e^{(k)} = [e;\,T_e]$.
Following step 401), the input of the reasoning module consists of the instruction vector $i^{(k)}$ obtained by the instruction module at the current step, the entity embedding $e^{(k)}$, and the entity distribution $p^{(k-1)}$ obtained in the previous reasoning step.
The relevant relation-type information is used to encode the entities because, in the multi-hop TKGQA (Temporal Knowledge Graph Question Answering) task, a reasoning path composed of multiple relation types may reflect the important semantics that lead to the answer entity. In addition, this approach helps to reduce noise.
The matching vector m^(k) is obtained from the current instruction i^(k) and the relation vector r_i of the current reasoning step. The specific formula is as follows:

m^(k) = σ(i^(k) ⊗ W_m·r_i)

wherein W_m is the D×D-dimensional learnable matrix of the matching vector, σ(·) represents the activation function, and ⊗ represents the Kronecker product, commonly used in matrix and tensor calculations.
Matching messages from adjacent quadruples are aggregated, and weights are assigned to them according to the degree of attention they received in the last reasoning step, resulting in the entity set E^(k). The specific formula is as follows:

E^(k) = Σ_{i=1}^{u} p_i^(k-1)·m_i^(k)

wherein u represents the total number of entities associated with the current inference step, and p_i^(k-1) is the distribution probability of entity e_i output by the inference module at the (k-1)-th inference step.
According to the entity set E^(k), the entity distribution probability p^(k) of the current k-th inference step is calculated as follows:

p^(k) = softmax(E^(k)·W)

wherein W is a parameter for deriving the entity distribution, and softmax(·) represents the activation function.
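One reasoning step (weighting the matching messages by the previous step's entity probabilities and re-normalising) can be sketched as follows. The aggregation form is an assumption reconstructed from the surrounding definitions, not the verbatim patented formula.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def reasoning_step(p_prev, M, w):
    # weight each entity's matching message m_i^(k) by its probability
    # p_i^(k-1) from the previous step, then score and re-normalise
    E = p_prev[:, None] * M        # entity set E^(k), shape (u, D)
    return softmax(E @ w)          # p^(k) = softmax(E^(k) W)

rng = np.random.default_rng(2)
u, D = 5, 4
p_prev = softmax(rng.normal(size=u))
p_k = reasoning_step(p_prev, rng.normal(size=(u, D)), rng.normal(size=D))
```

Iterating this function n times yields the entity distribution probability list used later as the auxiliary supervision signal.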
The above steps are repeated; after n reasoning steps, the output of the inference module at every step is obtained, namely the entity distribution probability list {p^(1), p^(2), ..., p^(n)}.
Step 5), calculating the loss function of the time-aware NSM (neural state machine), as follows:

To generate the auxiliary supervision signals mentioned in step 4), i.e. the probability distribution of each entity on the inference path output after the NSM module converges, the NSM is built into a hybrid bidirectional reasoning structure by means of a bidirectional search algorithm (bidirectional BFS). Forward reasoning and backward reasoning form a cyclic pipeline: the two NSM models share one instruction module, the intermediate reasoning steps take the same instruction vector i^(k) as input, and the reasoning modules of the two models are connected in series to form a loop (forward and backward).
The hybrid bidirectional reasoning structure has three loss functions in total: the forward loss L_f, the backward loss L_b, and the constraint loss L_c. When this part of the training converges, the distribution of intermediate states is output.

The forward loss L_f and the backward loss L_b are defined as follows:

L_f = D_KL(p_f^(n), p_f^*), L_b = D_KL(p_b^(n), p_b^*)

wherein p_f^(n) and p_b^(n) represent the final entity distributions of the forward and backward reasoning processes respectively; p_f^* and p_b^* represent the forward and backward real entity distributions; D_KL denotes the KL divergence, which measures the difference between two entity distributions in an asymmetric way.
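The KL-based losses can be sketched as follows; the toy distributions are illustrative only, and the small epsilon guards against log(0) for one-hot ground-truth distributions.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    # D_KL(p || q): asymmetric measure of how distribution p diverges from q
    p = np.asarray(p, float) + eps
    q = np.asarray(q, float) + eps
    return float(np.sum(p * np.log(p / q)))

p_final = [0.7, 0.2, 0.1]    # toy final entity distribution of one direction
p_true = [1.0, 0.0, 0.0]     # one-hot real entity distribution
loss_f = kl_divergence(p_final, p_true)
```

Because KL divergence is asymmetric, the order of arguments matters: the predicted distribution is compared against the real one, not vice versa.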
Further, the constraint loss L_c reflects the degree of consistency between the intermediate entity distributions obtained by the two reasoning processes. It is calculated by summing the losses of each intermediate step, as follows:

L_c = Σ_{k=1}^{n-1} D_JS(p_f^(k), p_b^(n-k))

wherein D_JS is the Jensen-Shannon divergence, which measures the difference between two distributions in a symmetric fashion; p_f^(k) represents the entity distribution of the k-th forward reasoning step, and p_b^(n-k) represents the corresponding entity distribution of the backward reasoning process.
Summing the above losses, the overall loss function of the hybrid bidirectional reasoning phase is defined as:

L = L_f + λ_1·L_b + λ_2·L_c

wherein λ_1 ∈ (0, 1) and λ_2 ∈ (0, 1) are hyper-parameters controlling the factor weights. When the loss function converges, the intermediate entity distributions of the two reasoning processes should be similar or identical, i.e. p_f^(k) ≈ p_b^(n-k).
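The combined loss can be sketched as follows. The pairing of forward step k with backward step n-k and the default weights are assumptions for illustration.

```python
import numpy as np

def kl(p, q, eps=1e-12):
    p = np.asarray(p, float) + eps
    q = np.asarray(q, float) + eps
    return float(np.sum(p * np.log(p / q)))

def js(p, q):
    # symmetric Jensen-Shannon divergence used for the constraint loss L_c
    m = (np.asarray(p, float) + np.asarray(q, float)) / 2
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def total_loss(L_f, L_b, inter_fwd, inter_bwd, lam1=0.5, lam2=0.5):
    # L = L_f + lam1 * L_b + lam2 * sum_k JS(p_f^(k), p_b^(n-k));
    # inter_fwd / inter_bwd are already paired step-by-step
    L_c = sum(js(pf, pb) for pf, pb in zip(inter_fwd, inter_bwd))
    return L_f + lam1 * L_b + lam2 * L_c

# identical intermediate distributions -> the constraint term vanishes
L = total_loss(0.3, 0.4,
               [[0.6, 0.4], [0.5, 0.5]],
               [[0.6, 0.4], [0.5, 0.5]])
```

When both reasoning directions agree on every intermediate step, L_c is zero and only the forward and backward prediction losses remain.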
Step 6), from the entity distribution probability list {p^(1), ..., p^(n)}, the corresponding entity distribution probability of each entity in the n reasoning steps is acquired, and these n probabilities are weighted-averaged to obtain the weighted average distribution probability of each entity. The weighted average distribution probability is added as additional state information to the state information of the policy network to form a new policy network. The new policy network is π(A*|S*), wherein the state information is S* and the action information is A*.

Wherein S* = (e_i, t_i, e_q, t_q, h_dis) and A* = (e_a, r_a, t_a),

wherein e_i represents the current entity, t_i represents the current timestamp, e_q represents the entity in the question, t_q represents the timestamp in the question, h_dis represents the additional state information, e_a represents the currently selected action entity, r_a represents the currently selected action relationship, and t_a represents the currently selected action timestamp.
Step 6) takes the intermediate entity state distribution obtained in step 5) as an auxiliary supervision signal in the reasoning process, solving the problem that, in existing reinforcement-learning-based methods, the lack of supervision signals causes the agent to produce spurious paths and blind exploration during reasoning.
Step 7), the policy network stage mainly comprises three parts: question context encoding, history path encoding, and an action scoring mechanism, as shown in fig. 2.
In the question context encoding stage, the final question representation q obtained in the natural language question processing stage is first taken, and the question context vector h_q is obtained using a max pooling layer. The formula is as follows:

h_q = MaxPooling(q)
the use of a max-pooling layer helps to extract key features in the problem context, reduce dimensionality, increase positional invariance, and suppress noise, thereby simplifying model structure and improving performance.
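The pooling step amounts to a per-dimension maximum over the question's token representations, as the following minimal sketch shows:

```python
import numpy as np

def question_context(Q):
    # max-pool over the token dimension: (num_tokens, D) -> (D,)
    # keeps, for each feature dimension, its strongest activation in the question
    return Q.max(axis=0)

# toy question matrix: 3 tokens, 3 feature dimensions
Q = np.array([[0.1, 0.9, -0.3],
              [0.4, 0.2, 0.8],
              [0.0, 0.5, 0.1]])
h_q = question_context(Q)
```

The result is a fixed-size vector regardless of question length, which is what makes it usable as policy-network state.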
In the history path encoding part, the path length is set to L in the policy network stage, and the search path h_l = ((e_q, t_q), r_1, (e_1, t_1), ..., r_l, (e_l, t_l)) is the sequence of actions taken, wherein l ∈ L. e_q and t_q respectively represent the entity mentioned in the question and the time mentioned in the question; r_l, e_l, t_l respectively represent the relationship, entity, and time of the current node in the reasoning process.
The agent uses an LSTM to encode the history path h_l. The specific formula is as follows:

h_0 = LSTM(0, [r_0; e_q; t_q]), h_l = LSTM(h_{l-1}, [r_l; e_l; t_l])

wherein r_0 is the initial relation; when the last action is a self-loop action, the LSTM state remains unchanged.
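The history encoding, including the self-loop rule, can be sketched with a minimal hand-rolled LSTM cell. The cell size, input concatenation, and random initialisation are assumptions for illustration; a real implementation would use a framework LSTM.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class TinyLSTM:
    def __init__(self, d_in, d_h, seed=0):
        rng = np.random.default_rng(seed)
        # one stacked weight matrix for the input, forget, output, and cell gates
        self.W = rng.normal(scale=0.1, size=(4 * d_h, d_in + d_h))
        self.b = np.zeros(4 * d_h)
        self.d_h = d_h

    def step(self, x, h, c):
        z = self.W @ np.concatenate([x, h]) + self.b
        i, f, o, g = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
        return h, c

def encode_path(lstm, actions, is_self_loop):
    # fold the action history [r_l; e_l; t_l] into one hidden state;
    # a self-loop action leaves the LSTM state unchanged, per the patent
    h = np.zeros(lstm.d_h)
    c = np.zeros(lstm.d_h)
    for x, loop in zip(actions, is_self_loop):
        if loop:
            continue
        h, c = lstm.step(x, h, c)
    return h

lstm = TinyLSTM(d_in=6, d_h=4)
acts = [np.ones(6), np.ones(6)]
h_real = encode_path(lstm, acts, [False, False])
h_loop = encode_path(lstm, acts, [True, True])
```

With only self-loop actions the encoded state stays at its initial value, matching the "state remains unchanged" rule.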
In the action scoring stage, each optional action is scored to calculate the state transition probability. The optional action of the l-th step is denoted a_n ∈ A_l, where A_l represents the set of all optional actions of the current reasoning step. Assuming A_l contains n actions, the scheme sets at most 30 selectable actions in the optional action set, i.e. n ≤ 30. A weighted action scoring mechanism is used to help the agent pay more attention to the attributes or edge types of the target node. The state information is encoded by two multi-layer perceptrons (MLP) to output the expected target node ê and the expected edge representation r̂. The specific formula is as follows:

ê = ReLU(W_e·w_1·s_l), r̂ = ReLU(W_r·w_1·s_l)

wherein W_e is the D×D-dimensional learnable matrix of the target node ê, W_r is the D×D-dimensional learnable matrix of the target edge r̂, w_1 is a learnable matrix representing the output desired target, and ReLU(·) is the activation function.
Finally, the target node score and the outgoing-edge score of each candidate action are obtained by computing similarities. The agent obtains the final candidate action score φ(a_n, s_l) through a weighted sum of the two scores; the action whose candidate action score φ(a_n, s_l) is maximal is used as the input of the agent at the next step, and when the iteration reaches the L-th step, the agent outputs this action as the final answer. The specific formula is as follows:

φ(a_n, s_l) = β_n·⟨ê, e_n⟩ + (1 − β_n)·⟨r̂, r_n⟩

wherein β_n = sigmoid(W_β[h_l; e_q; r_q; e_n; r_n]), and W_β is the learnable matrix of β_n. sigmoid(·) is the activation function; e_i represents the current entity and t_i represents the current timestamp; a_n represents the optional action of the l-th step, e_n represents the entity in the optional action of the l-th step, and r_n represents the relationship in the optional action of the l-th step.
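The weighted scoring can be sketched as follows. Dot products stand in for the similarity function, and the gate input is an assumed concatenated feature vector.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def action_score(e_hat, r_hat, e_n, r_n, w_beta, state_feats):
    # beta_n gates between the target-node score and the out-edge score
    beta = sigmoid(float(w_beta @ state_feats))
    node_score = float(e_hat @ e_n)   # similarity to the expected target node
    edge_score = float(r_hat @ r_n)   # similarity to the expected edge
    return beta * node_score + (1 - beta) * edge_score

rng = np.random.default_rng(3)
D = 4
score = action_score(rng.normal(size=D), rng.normal(size=D),
                     rng.normal(size=D), rng.normal(size=D),
                     rng.normal(size=D), rng.normal(size=D))
```

Because β_n ∈ (0, 1), the final score always lies between the pure node score and the pure edge score; the agent then greedily takes the highest-scoring action.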
Step 8), modeling the reward R(s_L) in the policy network. Since the quadruples of the same entity are concentrated in a specific period, they exhibit variability and sparsity over time; that is, the answer entities show a distributional characteristic over time. This prior knowledge is introduced into the reward function R̃(s_L); a time-based reward can, by guiding the agent's learning, let the agent know in which snapshot it is easier to find the answer. I(·) denotes the indicator function, which outputs 1 when the equation inside it is true and 0 when it is false; e_ans represents the answer corresponding to the question.

wherein s_L is the final state; when the path length reaches L, the selected action is regarded as the inferred final answer, i.e. R(s_L) = I(e_L = e_ans).
A Dirichlet distribution is estimated for each relation based on the training set, and the original reward is shaped by the Dirichlet distribution. The specific formula is as follows:

R̃(s_L) = (1 + p(Δt_L | Dirichlet(α_{r_q})))·R(s_L)

wherein α_{r_q} is the parameter vector of the Dirichlet distribution of the relation r_q, and Δt_L is the time interval of the predicted answer; α_{r_q} can be estimated from the training set.
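The reward shaping can be sketched as follows. Both the count-plus-prior estimation of α and the "(1 + p)·R" shaping form are assumptions for illustration; the patent states only that a per-relation Dirichlet distribution estimated from the training set shapes the original reward.

```python
import numpy as np

def estimate_alpha(time_bucket_counts, prior=1.0):
    # Dirichlet parameters for one relation: counts of that relation's
    # facts per time bucket in the training set, plus a flat prior
    return np.asarray(time_bucket_counts, float) + prior

def shaped_reward(base_reward, alpha_r, answer_bucket):
    # scale the 0/1 terminal reward by the Dirichlet-mean probability
    # of the time bucket the predicted answer falls into
    p = alpha_r / alpha_r.sum()          # mean of Dirichlet(alpha_r)
    return (1.0 + p[answer_bucket]) * base_reward

alpha = estimate_alpha([30, 5, 5])       # bucket 0 is where answers concentrate
r_hit = shaped_reward(1.0, alpha, 0)     # correct answer in a likely period
r_miss = shaped_reward(0.0, alpha, 0)    # wrong answer gets nothing
```

A correct answer found in a time bucket where the relation's facts concentrate thus earns a larger reward, steering the agent toward plausible snapshots.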
In summary, the invention adopts deep learning technology and combines knowledge graphs with reinforcement learning, so it can effectively capture the dynamic characteristics of the temporal evolution of relationships in the knowledge graph. In the whole process, temporal KG embedding improves the modeling of time-sequence information, and the method also provides an innovative approach to solving complex questions over temporal knowledge graphs.
Example 1:
the natural language question of this embodiment is: In which year did Einstein teach at Princeton University?
As shown in fig. 3, the inference path obtained with the method of the present invention over the temporal knowledge graph is: Einstein --> (resided in the USA, 1933) --> taught at Princeton University (1933) --> became a professor at Princeton University (1933). <Einstein, residence, USA, 1933> is a quadruple in the temporal knowledge graph; each quadruple represents a fact.
The answer of this embodiment is: 1933.
the experimental results of this embodiment using the method of the present invention, together with the results obtained using other methods, are shown in the following table:
wherein EmbedKGQA represents a method of embedding KG for the multi-hop KGQA task; Embed represents embedding, KG represents Knowledge Graph, and KGQA represents Knowledge Graph Question Answering.
T-EaE-add and T-EaE-replace represent two variants of the EaE model; the EaE model is an entity-aware method in which entities act as experts. T-EaE-add assumes that all grounded entities and time spans are annotated in the question. T-EaE-replace replaces the BERT embeddings with entity/time TKG embeddings. BERT stands for Bidirectional Encoder Representations from Transformers, a pre-trained language representation model.
CronKGQA (2021) extends EmbedKGQA to the temporal QA task and answers time questions with temporal KG embeddings. Temporal QA denotes Temporal Question Answering, and temporal KG denotes Temporal Knowledge Graph.
TMA (2023) represents a Time-aware Multiway Adaptive (TMA) fusion network.

TSQA (2022) represents a Time Sensitivity for Question Answering (TSQA) framework.

Ours represents the method according to the invention.
The metrics Hits@1 and Hits@10 are commonly used for knowledge graph performance evaluation and refer to the average proportion of triples whose link-prediction rank is no greater than k. The specific formula is as follows:

Hits@k = (1/|S|)·Σ_{i=1}^{|S|} I(rank_i ≤ k)

wherein |S| is the number of triples and rank_i is the link-prediction rank of the i-th triple. The larger the Hits@k index, the better; in this performance comparison we choose k to be 1 and 10.
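The metric can be sketched in a few lines; the ranks below are toy values for illustration.

```python
def hits_at_k(ranks, k):
    # proportion of test triples whose link-prediction rank is <= k
    return sum(r <= k for r in ranks) / len(ranks)

ranks = [1, 3, 12, 2, 40]   # toy link-prediction ranks of 5 test triples
h1 = hits_at_k(ranks, 1)    # -> 0.2
h10 = hits_at_k(ranks, 10)  # -> 0.6
```

Only the rank of the correct answer among all candidates matters, so higher Hits@k directly reflects how often the model places the true answer near the top.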
The experimental results obtained with the present technical scheme (Ours) are shown above. Compared with the results obtained by the other methods, the results of this method are significantly improved; the Hits@1 index on complex-type questions is improved by 15%. This significant improvement is likely because the scheme processes the question representation multiple times for a deeper understanding of questions requiring complex reasoning, enabling it to assist the agent more effectively in path exploration during inference. In addition, the scheme introduces a time-aware NSM neural state machine to provide the entity probability distribution for reasoning, so the agent no longer explores blindly and the accuracy of finding answers is improved; these results further verify the effectiveness of the proposed method.
The foregoing is only a preferred embodiment of the invention. It should be noted that various modifications and adaptations can be made by those skilled in the art without departing from the principles of the present invention, and such modifications and adaptations are intended to fall within the scope of the invention.

Claims (8)

1. A question-answering method for improving a time sequence knowledge graph based on auxiliary supervision signals, characterized by comprising the following steps:
acquiring semantic representation of the natural language problem through a pre-training language model;
replacing the semantic representation of the natural language problem with the TKG embedded representation to obtain a replaced semantic representation of the natural language problem;
inputting the replaced semantic representation of the natural language problem into an information fusion layer, and outputting a final problem representation by the information fusion layer;
inputting the final problem representation into a neural state machine, and outputting an entity distribution probability list by the neural state machine after the loss function converges;
the entity distribution probability is weighted and averaged to be used as additional state information, and the additional state information is added into the state information in the strategy network to form a new strategy network;
inputting the final question representation into a new strategy network, and outputting a final answer; calculating a reward function according to the final answer and the real answer, and learning a new strategy network according to the reward function to obtain an updated strategy network; inputting the final question representation into the updated strategy network, and outputting a final answer;
calculating a reward function according to the final answer and the real answer, and learning a new strategy network according to the reward function to obtain an updated strategy network; inputting the final question representation into the updated strategy network, and outputting a final answer, wherein the method specifically comprises the following steps:
according to the final answer e_L and the true answer e_ans, obtaining the reward value R(s_L), s_L representing the final state;

according to the reward value R(s_L), calculating the reward function R̃(s_L);

according to the reward function R̃(s_L), learning the new strategy network to obtain an updated strategy network; inputting the final question representation into the updated strategy network, and outputting the final answer;

wherein the reward value R(s_L) is calculated as follows:

R(s_L) = I(e_L = e_ans)

wherein I(·) represents the indicator function, which outputs 1 when the equation inside the indicator function is true and 0 when it is false;

the reward function R̃(s_L) is calculated as follows:

R̃(s_L) = (1 + p(Δt_L | Dirichlet(α_{r_q})))·R(s_L)

wherein Dirichlet(·) represents the Dirichlet distribution, and α_{r_q} is the parameter vector of the Dirichlet distribution of the question relation r_q.
2. The question-answering method for improving a time sequence knowledge graph based on an auxiliary supervision signal according to claim 1, wherein the method comprises the following steps: the semantic representation calculation formula of the natural language problem is as follows:
Q_S = W_S·DistillBert(q_text);

wherein Q_S represents the semantic representation of the natural language question; W_S represents a D × D_DistillBert-dimensional learnable matrix, wherein D is the TKG embedding dimension and D_DistillBert is the embedding dimension of DistillBert; DistillBert(·) is the NLP processing module; q_text is the text information of the natural language question.
3. The question-answering method for improving the time sequence knowledge graph based on the auxiliary supervision signals according to claim 2, wherein the method comprises the following steps: the replacing the semantic representation of the natural language problem with the TKG embedded representation to obtain the replaced semantic representation of the natural language problem specifically comprises the following steps:
replacing the entity part in the semantic representation of the natural language problem with the TKG entity embedded representation to obtain Q E
replacing the timestamp part in Q_E with the TKG timestamp representation to obtain the replaced natural language question semantic representation Q_T;
wherein the i-th element q_i^E of Q_E is calculated as follows:

wherein e_i represents entity e_h or e_t; t_τ represents a timestamp; W_E represents a D×D-dimensional learnable matrix of the question representation after entity replacement; q_i^S represents the i-th element of Q_S; e_h represents the head entity and e_t represents the tail entity; the i-th element q_i^T of Q_T is calculated as follows:

wherein T_1 and T_2 respectively represent the maximum time and the minimum time obtained by sorting all the times retrieved from the time sequence knowledge graph for all the entities in the question.
4. The question-answering method for improving a time sequence knowledge graph based on an auxiliary supervision signal according to claim 1, wherein the method comprises the following steps: the method for inputting the replaced semantic representation of the natural language problem into the information fusion layer, and outputting the final problem representation by the information fusion layer specifically comprises the following steps:
the replaced natural language question semantic representation is processed and integrated through a multi-layer self-attention mechanism and a feed-forward neural network to obtain a matrix Q, and the element q_CLS in the matrix Q is taken as the final question representation q; wherein q_CLS represents the flag bit of the natural language question.
5. The question-answering method for improving the time sequence knowledge graph based on the auxiliary supervision signals according to claim 4, wherein the method comprises the following steps: the final problem is expressed and input into a neural state machine, and after the loss function converges, the neural state machine outputs an entity distribution probability list, which specifically comprises:
calculating the instruction vector i^(k) corresponding to the current k-th reasoning step according to the final question representation q;

calculating the dynamic feature Φ(Δk) of the entity to obtain the current k-th reasoning step entity e_i^(k), e_i ∈ N_e, N_e representing the entity set in the reasoning step;

according to the entity e_i^(k), acquiring the entity distribution probability p^(k-1) of entity e_i in the (k-1)-th inference step;

according to the instruction vector i^(k), calculating the matching vector m_k;

according to the entity distribution probability p^(k-1) and the matching vector m_k, calculating the entity set E^(k) of the current k-th reasoning step;

according to the entity set E^(k), calculating the entity distribution probability p^(k) of the current k-th reasoning step;

repeating the above steps; when the loss function of each reasoning step converges, the neural state machine outputs the entity distribution probabilities, and the entity distribution probability list {p^(1), ..., p^(n)} is obtained through n reasoning steps.
6. The question-answering method for improving the time sequence knowledge graph based on the auxiliary supervision signals according to claim 5, wherein the method comprises the following steps: the instruction vector i (k) The calculation formula is as follows:
wherein,represents the current kth inference step attention weight, h j Hidden state information representing the encoder; j represents the length of the hidden state sequence, l represents the total length of the hidden state sequence;
the saidThe calculation formula is as follows:
wherein q (k) Representing the final problem representation of the current kth reasoning step, W α Representation ofD x D-dimensional learnable matrix of b α Representation ofD x D-dimensional learnable bias matrices; softmax represents the activation function;
the dynamic feature Φ(Δk) of the entity is calculated as follows:

Φ(Δk) = σ(W_k·ΔT + b_k);

wherein ΔT represents the difference between the timestamp of the entity in the current reasoning step and the timestamp in the question; W_k is the D×D-dimensional learnable matrix of the entity dynamic feature; b_k is the D×D-dimensional learnable bias matrix of the entity dynamic feature; σ(·) represents the activation function;
the matching vector m_k is calculated as follows:

wherein W_m represents the D×D-dimensional learnable matrix of the matching vector, σ(·) represents the activation function, ⊗ represents the Kronecker product, and r_i represents the relationship of the i-th entity;
the E is (k) The calculation formula is as follows:
where u represents the total number of entities associated with the current inference step;
the p is (k) The calculation formula is as follows:
p (k) =softmax(E (k) W);
where W is a parameter deriving the entity distribution, softmax () represents the activation function.
7. The question-answering method for improving the time sequence knowledge graph based on the auxiliary supervision signals according to claim 5, wherein the method comprises the following steps: the weighted average of the entity distribution probability is used as additional state information, and the additional state information is added into the state information in the strategy network to form a new strategy network, which comprises the following steps:
from the entity distribution probability list {p^(1), ..., p^(n)}, acquiring the corresponding entity distribution probabilities of each entity in the n reasoning steps, and weighted-averaging the corresponding entity distribution probabilities in the n reasoning steps to obtain the weighted average distribution probability of each entity;
adding the weighted average distribution probability as additional state information into the strategy network to form a new strategy network;
the new strategy network is π(A*|S*), wherein the state information is S* and the action information is A*;

wherein S* = (e_i, t_i, e_q, t_q, h_dis) and A* = (e_a, r_a, t_a),

wherein e_i represents the current entity, t_i represents the current timestamp, e_q represents the entity in the question, t_q represents the timestamp in the question, h_dis represents the additional state information, e_a represents the currently selected action entity, r_a represents the currently selected action relationship, and t_a represents the currently selected action timestamp.
8. The question-answering method for improving the time sequence knowledge graph based on the auxiliary supervision signals according to claim 7, wherein the method comprises the following steps: inputting the final question representation into a new strategy network, and outputting a final answer, wherein the method specifically comprises the following steps:
inputting the final question representation q into the max pooling layer to obtain the question context vector h_q;

acquiring the history search path h_l;

according to the history search path h_l and the question context vector h_q, computing the desired target node ê and the desired edge r̂;

according to the desired target node ê and the desired edge r̂, calculating the candidate action score φ(a_n, s_l), l ∈ L;

the action corresponding to the maximum candidate action score φ(a_n, s_l) is used as the input of the agent at the next step; when the iteration reaches the L-th step, the agent outputs the action as the final answer;
the history searching path h l The calculation formula is as follows:
h l =((e q ,t q ),r 1 ,(e 1 ,t 1 ),...,r l ,(e l ,t l ));
wherein l ∈ L; r_l, (e_l, t_l) respectively represent the relationship, entity, and time of the current node in the l-th reasoning step;
the desired target node ê and the desired edge r̂ are calculated as follows:

wherein W_e is the D×D-dimensional learnable matrix of the target node ê, W_r is the D×D-dimensional learnable matrix of the target edge r̂, w_1 is a learnable matrix representing the output desired target, and ReLU is the activation function;
the candidate action score φ(a_n, s_l) is calculated as follows:

wherein β_n = sigmoid(W_β[h_l; e_q; r_q; e_n; r_n]), W_β being the learnable matrix of β_n; sigmoid is the activation function; e_i represents the current entity and t_i represents the current timestamp; a_n represents the optional action of the l-th step, e_n represents the entity in the optional action of the l-th step, and r_n represents the relationship in the optional action of the l-th step.
CN202311755781.0A 2023-12-20 2023-12-20 Question answering method for improving time sequence knowledge graph based on auxiliary supervision signals Active CN117435715B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311755781.0A CN117435715B (en) 2023-12-20 2023-12-20 Question answering method for improving time sequence knowledge graph based on auxiliary supervision signals

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311755781.0A CN117435715B (en) 2023-12-20 2023-12-20 Question answering method for improving time sequence knowledge graph based on auxiliary supervision signals

Publications (2)

Publication Number Publication Date
CN117435715A CN117435715A (en) 2024-01-23
CN117435715B true CN117435715B (en) 2024-03-19

Family

ID=89552039

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311755781.0A Active CN117435715B (en) 2023-12-20 2023-12-20 Question answering method for improving time sequence knowledge graph based on auxiliary supervision signals

Country Status (1)

Country Link
CN (1) CN117435715B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117829298B (en) * 2024-03-05 2024-05-14 南京信息工程大学 Multi-jump time sequence knowledge graph question answering method and system
CN118227803A (en) * 2024-05-23 2024-06-21 泉州市易达信息科技有限公司 Data mining and decision support system, method, electronic equipment and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115640410A (en) * 2022-12-06 2023-01-24 南京航空航天大学 Knowledge graph multi-hop question-answering method based on reinforcement learning path reasoning
CN116484016A (en) * 2023-03-30 2023-07-25 中国科学院计算机网络信息中心 Time sequence knowledge graph reasoning method and system based on automatic maintenance of time sequence path

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11631009B2 (en) * 2018-05-23 2023-04-18 Salesforce.Com, Inc Multi-hop knowledge graph reasoning with reward shaping

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115640410A (en) * 2022-12-06 2023-01-24 南京航空航天大学 Knowledge graph multi-hop question-answering method based on reinforcement learning path reasoning
CN116484016A (en) * 2023-03-30 2023-07-25 中国科学院计算机网络信息中心 Time sequence knowledge graph reasoning method and system based on automatic maintenance of time sequence path

Also Published As

Publication number Publication date
CN117435715A (en) 2024-01-23


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant