CN118133963A

CN118133963A - Knowledge subgraph construction method and device

Info

Publication number: CN118133963A
Application number: CN202410554025.XA
Authority: CN
Inventors: 骆丹; 叶龙; 姚安琪儿; 王德培; 赵子龙
Original assignee: Guangdong Southern Planning & Designing Institute Of Telecom Consultation Co ltd
Current assignee: Guangdong Southern Planning & Designing Institute Of Telecom Consultation Co ltd
Priority date: 2024-05-07
Filing date: 2024-05-07
Publication date: 2024-06-04
Anticipated expiration: 2044-05-07
Also published as: CN118133963B

Abstract

The invention relates to the technical field of natural language processing and discloses a knowledge subgraph construction method and device. The method comprises the following steps: generating a first knowledge subgraph according to question information and a preset knowledge base, wherein the first knowledge subgraph comprises at least two first entities, relationship edges connect the first entities to one another, each relationship edge has a corresponding matching degree value, and the matching degree value represents the degree of matching between the relationship edge and the question information; calculating a relation edge matrix of the first knowledge subgraph according to all the matching degree values; generating an entity matrix of the first knowledge subgraph according to a preset weight distribution condition and all the first entities; updating the entity matrix according to the relation edge matrix and the entity matrix; determining at least one target entity in the updated entity matrix according to a preset screening condition; and generating a second knowledge subgraph of the first knowledge subgraph according to all the target entities and the relationship edges connected to them. Implementing the invention can therefore improve the accuracy of knowledge subgraph construction.

Description

Knowledge subgraph construction method and device

Technical Field

The present invention relates to the field of natural language processing technologies, and in particular, to a knowledge subgraph construction method and apparatus.

Background

With the continuous development of information technology, the amount of information that people receive every day has grown explosively, enriching both production and daily life.

However, as this mass of information fills everyone's life, more people care about the independence and privacy of information acquisition, and people prefer to obtain the content they want by searching, so as to improve the efficiency of production and daily life. At present, the knowledge subgraphs generated by question-and-answer search over a knowledge base are often too large, particularly for relatively complex search questions. In other words, the construction accuracy of the knowledge subgraph is low, so people have to search again, which reduces the efficiency of production and daily life.

It is therefore important to provide a technical solution that improves the accuracy of knowledge subgraph construction.

Disclosure of Invention

The invention provides a knowledge subgraph construction method and device, which can improve knowledge subgraph construction accuracy.

In order to solve the technical problems, the first aspect of the invention discloses a knowledge subgraph construction method, which comprises the following steps:

generating a first knowledge subgraph according to question information and a preset knowledge base, wherein the first knowledge subgraph comprises at least two first entities, relationship edges connect the first entities to one another, the relationship edges are used for representing the relationships between the first entities, each relationship edge has a corresponding matching degree value, and the matching degree value is used for representing the degree of matching between the relationship edge and the question information;

calculating a relation edge matrix of the first knowledge subgraph according to all the matching degree values;

generating an entity matrix of the first knowledge sub-graph according to a preset weight distribution condition and all the first entities, wherein the entity matrix comprises entity weights of each first entity, and the entity weights meet the preset weight distribution condition;

Updating the entity matrix according to the relation edge matrix and the entity matrix;

determining at least one target entity in the updated entity matrix according to preset screening conditions;

And generating a second knowledge subgraph of the first knowledge subgraph according to all the target entities and the connected relation edges thereof.
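
As a non-limiting illustration, the last three steps above can be sketched as follows. The text does not fix the update rule or the screening condition, so this sketch assumes that the update is a single product of the relation edge matrix with the vector of entity weights, that an entity becomes a target entity when its updated weight reaches a threshold, and that the entity taken from the question is always kept. The function names `propagate` and `induced_subgraph`, the threshold 0.5, and all sample values are hypothetical.

```python
from typing import List, Set, Tuple

Edge = Tuple[str, str, float]  # (head entity, tail entity, matching degree value)


def propagate(edge_matrix: List[List[float]], weights: List[float]) -> List[float]:
    """Assumed update rule: new_weights[i] = sum_j edge_matrix[i][j] * weights[j]."""
    return [sum(m * w for m, w in zip(row, weights)) for row in edge_matrix]


def induced_subgraph(edges: List[Edge], targets: Set[str]) -> List[Edge]:
    """Keep only the edges whose two endpoints are both target entities."""
    return [e for e in edges if e[0] in targets and e[1] in targets]


entities = ["A", "B", "C"]
edges = [("A", "B", 0.9), ("B", "C", 0.1)]

# Symmetric relation edge matrix filled with the matching degree values.
index = {name: i for i, name in enumerate(entities)}
edge_matrix = [[0.0] * len(entities) for _ in entities]
for head, tail, score in edges:
    edge_matrix[index[head]][index[tail]] = score
    edge_matrix[index[tail]][index[head]] = score

weights = [1.0, 0.0, 0.0]  # all initial weight on the question entity "A"
updated = propagate(edge_matrix, weights)

# Assumed screening condition: updated weight >= 0.5; "A" is kept as the question entity.
targets = {"A"} | {name for name, w in zip(entities, updated) if w >= 0.5}
second_subgraph = induced_subgraph(edges, targets)
```

In this toy graph the weakly matching edge between B and C is dropped, so the second knowledge subgraph is smaller than the first.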

As an optional implementation manner, in the first aspect of the present invention, before the calculating, according to all the matching degree values, a relationship edge matrix of the first knowledge sub-graph, the method further includes:

calculating a relation edge vector of each relation edge and a question vector of the question information according to a preset text vector model;

for each relation edge, calculating a similarity value between the relation edge vector of the relation edge and the question vector according to a preset similarity calculation model, and determining the similarity value as the matching degree value corresponding to the relation edge;

and the calculating of a relation edge matrix of the first knowledge subgraph according to all the matching degree values comprises:

classifying all the relation edges into at least one relation edge set according to all the matching degree values, wherein the matching degree values of the relation edges within each relation edge set match one another;

And calculating a relationship edge matrix of the first knowledge subgraph according to the first entities of all the relationship edge sets and the matching degree values corresponding to all the relationship edge sets.

As an optional implementation manner, in a first aspect of the present invention, the calculating a relationship edge matrix of the first knowledge sub graph according to the first entities of all the relationship edge sets and the matching degree values corresponding to all the relationship edge sets includes:

For each relation edge set, calculating a relation edge sub-matrix of the relation edge set according to the first entity connected with each relation edge in the relation edge set and the matching degree value corresponding to the relation edge set;

calculating a relation edge matrix of the first knowledge subgraph according to all the relation edge sub-matrices;

and the relation edge sub-matrix is calculated by a relation edge sub-matrix calculation formula, wherein the relation edge sub-matrix calculation formula is as follows:

Wherein the symbols of the formula denote, in order: the relationship edge sub-matrix of the i-th relationship edge set; the initial sub-matrix of the i-th relationship edge set, which comprises an initial relationship weight for each relationship edge in the set, the initial relationship weight being determined from the first entities connected by each relationship edge in the set; the matching degree value corresponding to the i-th relationship edge set; and the total number of relationship edge sets.

As an optional implementation manner, in the first aspect of the present invention, the calculating a relationship edge matrix of the first knowledge sub-graph according to all the relationship edge sub-matrices includes:

calculating a preliminary relationship edge matrix of the first knowledge subgraph according to a preset operation algorithm and all the relationship edge sub-matrices;

performing a preset matrix processing operation on the preliminary relationship edge matrix according to a preset matrix processing algorithm to obtain the relation edge matrix;

and before the preset matrix processing operation is performed on the preliminary relationship edge matrix according to the preset matrix processing algorithm to obtain the relation edge matrix, the method further comprises:

extracting element features of the preliminary relationship edge matrix according to a preset feature recognition model;

judging whether the element features match preset abnormal element features, and when the element features match the preset abnormal element features, triggering the operation of performing the preset matrix processing operation on the preliminary relationship edge matrix according to the preset matrix processing algorithm to obtain the relation edge matrix;

when the element features do not match the preset abnormal element features, determining the preliminary relationship edge matrix as the relation edge matrix of the first knowledge subgraph;

and the preliminary relationship edge matrix is calculated by a preliminary relationship edge matrix calculation formula, which is as follows:

Wherein the symbol defined for the formula represents the preliminary relationship edge matrix;

and the preset matrix processing algorithm is as follows:

Wherein the symbols of the algorithm represent, respectively, the relation edge matrix and a preset normalization function.
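
As a non-limiting illustration, the check-then-normalize flow above can be sketched as follows. The formulas themselves are not reproduced in this text, so everything concrete here is an assumption: the preset operation algorithm is taken to be an element-wise sum of the sub-matrices, the abnormal element feature is taken to be a row whose sum exceeds 1, and the normalization function is taken to be row normalization. All function names are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def sum_matrices(sub_matrices: List[Matrix]) -> Matrix:
    """Assumed preset operation algorithm: element-wise sum of the sub-matrices."""
    size = len(sub_matrices[0])
    return [[sum(m[r][c] for m in sub_matrices) for c in range(size)] for r in range(size)]


def has_abnormal_rows(matrix: Matrix, limit: float = 1.0) -> bool:
    """Assumed abnormal element feature: some row sums to more than `limit`."""
    return any(sum(row) > limit for row in matrix)


def row_normalize(matrix: Matrix) -> Matrix:
    """Assumed normalization function: scale each non-zero row to sum to 1."""
    result = []
    for row in matrix:
        total = sum(row)
        result.append([v / total for v in row] if total else list(row))
    return result


def relation_edge_matrix(sub_matrices: List[Matrix]) -> Matrix:
    preliminary = sum_matrices(sub_matrices)
    # Normalize only when the abnormal feature is detected; otherwise the
    # preliminary matrix is used directly as the relation edge matrix.
    return row_normalize(preliminary) if has_abnormal_rows(preliminary) else preliminary
```

Skipping the normalization when no abnormal feature is found mirrors the branch in the text and avoids an unnecessary pass over the matrix.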

In an optional implementation manner, in a first aspect of the present invention, the generating, according to a preset weight distribution condition and all the first entities, an entity matrix of the first knowledge sub-graph includes:

determining the entity weight of each first entity according to a preset weight distribution condition;

Generating an entity matrix of the first knowledge sub-graph according to all the entity weights;

And the first entity includes a second entity in the question information and a third entity associated with the second entity in the preset knowledge base, the third entity exists in a text sentence of the preset knowledge base, and the determining the entity weight of each first entity according to the preset weight distribution condition includes:

According to a preset semantic analysis model, analyzing a first criticality value of each second entity, a second criticality value of each third entity and a relevance value of each third entity to each second entity, wherein the first criticality value is used for representing the semantic influence of the second entity on the question information, and the second criticality value is used for representing the semantic influence of the third entity on the text sentence;

and determining entity weight of each first entity according to all the first criticality values, all the second criticality values and all the association degree values.
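
As a non-limiting illustration, one way to combine the three kinds of values into entity weights is sketched below. The text does not state the combination rule or the weight distribution condition, so the sketch assumes that a second entity keeps its first criticality value as its raw weight, that a third entity's raw weight is its second criticality value multiplied by its association with the second entities, and that the weight distribution condition is that all weights sum to 1. The function name `entity_weights` and its parameters are hypothetical.

```python
from typing import Dict


def entity_weights(
    question_criticality: Dict[str, float],   # first criticality value per second entity
    kb_criticality: Dict[str, float],         # second criticality value per third entity
    association: Dict[str, Dict[str, float]], # association[third][second] = association value
) -> Dict[str, float]:
    raw = dict(question_criticality)
    for kb_entity, criticality in kb_criticality.items():
        linked = association.get(kb_entity, {})
        # Support from the question: association weighted by each question entity's criticality.
        support = sum(linked.get(q, 0.0) * c for q, c in question_criticality.items())
        raw[kb_entity] = criticality * support
    total = sum(raw.values())
    if total == 0:
        return {name: 0.0 for name in raw}
    # Assumed weight distribution condition: the entity weights sum to 1.
    return {name: value / total for name, value in raw.items()}
```

For example, a question entity with criticality 1.0 and one knowledge-base entity with criticality 0.5 and association 1.0 receive weights of two thirds and one third respectively.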

In an optional implementation manner, in a first aspect of the present invention, the determining, in the updated entity matrix, at least one target entity according to a preset screening condition includes:

Acquiring the entity weight of each first entity in the updated entity matrix;

For each first entity, judging whether the entity weight of the first entity is greater than or equal to a preset entity weight threshold, and determining the first entity as a target entity when judging that the entity weight of the first entity is greater than or equal to the preset entity weight threshold;

When judging that the entity weights of all the first entities are smaller than the preset entity weight threshold, updating the preset entity weight threshold according to all the entity weights, and re-triggering and executing the operation of judging whether the entity weight of each first entity is larger than or equal to the preset entity weight threshold.
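
As a non-limiting illustration, the screening loop above can be sketched as follows. The text only says that the threshold is updated "according to all the entity weights", so the rule used here, lowering the threshold to the mean weight (capped at the maximum weight so that floating-point rounding cannot leave every entity below it), is an assumption. The function name `select_targets` is hypothetical.

```python
from typing import Dict, List, Tuple


def select_targets(weights: Dict[str, float], threshold: float) -> Tuple[List[str], float]:
    """Return the target entities and the threshold that was finally applied."""
    if not weights:
        raise ValueError("the updated entity matrix holds no entities")
    while True:
        targets = [name for name, w in weights.items() if w >= threshold]
        if targets:
            return targets, threshold
        # No entity passed: assumed update rule, lower the threshold to the mean weight.
        mean = sum(weights.values()) / len(weights)
        threshold = min(mean, max(weights.values()))
```

Because the updated threshold never exceeds the largest weight, the second pass always selects at least one target entity, which matches the requirement of determining at least one target entity.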

As an optional implementation manner, in the first aspect of the present invention, before the generating the first knowledge subgraph according to the question information and the preset knowledge base, the method further includes:

Analyzing the question type information and the information complexity of the acquired question information according to a preset information processing model;

matching at least one knowledge sub-base in the preset knowledge base according to the question type information;

Generating retrieval configuration environment information of all knowledge sub-bases according to the information complexity, wherein the retrieval configuration environment information comprises at least one of retrieval hop count threshold value, retrieval history nodes, retrieval information quantity, network configuration information, sensitive information base, readable and writable permission, tracking prevention configuration information and context semantic recognition model information;

configuring all knowledge sub-bases according to the retrieval configuration environment information;

and the generating of a first knowledge subgraph according to the question information and the preset knowledge base comprises:

generating the first knowledge subgraph according to the question information and all the configured knowledge sub-bases.
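
As a non-limiting illustration, the preprocessing above can be sketched as follows. The sketch models only two of the listed configuration items, the retrieval hop count threshold and the retrieval information quantity, and assumes that the information complexity is a score between 0 and 1 and that more complex questions receive more hops and more results. The names `RetrievalConfig`, `match_sub_bases`, and `build_config`, and the numeric ranges, are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RetrievalConfig:
    hop_threshold: int  # retrieval hop count threshold
    max_results: int    # retrieval information quantity


def match_sub_bases(question_type: str, knowledge_base: Dict[str, List[str]]) -> List[str]:
    """Return the knowledge sub-bases registered for this question type."""
    return knowledge_base.get(question_type, [])


def build_config(complexity: float) -> RetrievalConfig:
    """Assumed mapping: clamp the complexity to [0, 1] and scale hops and results with it."""
    level = min(max(complexity, 0.0), 1.0)
    return RetrievalConfig(
        hop_threshold=1 + round(2 * level),
        max_results=10 + round(40 * level),
    )
```

A simple question is thus restricted to one hop, which keeps the first knowledge subgraph small before any screening takes place.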

The second aspect of the invention discloses a knowledge subgraph construction device, which comprises:

The first generation module is used for generating a first knowledge subgraph according to question information and a preset knowledge base, wherein the first knowledge subgraph comprises at least two first entities, relationship edges connect the first entities to one another, the relationship edges are used for representing the relationships between the first entities, each relationship edge has a corresponding matching degree value, and the matching degree value is used for representing the degree of matching between the relationship edge and the question information;

the first calculation module is used for calculating a relation edge matrix of the first knowledge subgraph according to all the matching degree values;

The first generation module is further configured to generate an entity matrix of the first knowledge sub-graph according to a preset weight distribution condition and all the first entities, where the entity matrix includes an entity weight of each first entity, and the entity weight satisfies the preset weight distribution condition;

the updating module is used for updating the entity matrix according to the relation edge matrix and the entity matrix;

The first determining module is used for determining at least one target entity in the updated entity matrix according to preset screening conditions;

The first generation module is further configured to generate a second knowledge subgraph of the first knowledge subgraph according to all the target entities and the relationship edges connected with the target entities.

As an alternative embodiment, in the second aspect of the present invention, the apparatus may further include:

The second calculation module is used for calculating a relation edge vector of each relation edge and a question vector of the questioning information according to a preset text vector model before the first calculation module calculates the relation edge matrix of the first knowledge subgraph according to all the matching degree values;

The second calculation module is further configured to calculate, for each relationship edge, a similarity value between the relationship edge vector of the relationship edge and the question vector according to a preset similarity calculation model;

The second determining module is used for determining the similarity value as the matching degree value corresponding to the relationship edge;

and the specific mode of calculating the relation edge matrix of the first knowledge subgraph by the first calculation module according to all the matching degree values comprises the following steps:

In a second aspect of the present invention, as an optional implementation manner, the specific manner of calculating, by the first calculation module, the relationship edge matrix of the first knowledge sub graph according to the first entities of all the relationship edge sets and the matching degree values corresponding to all the relationship edge sets includes:

In a second aspect of the present invention, as an optional implementation manner, the calculating, by the first calculating module, a relationship edge matrix of the first knowledge subgraph according to all the relationship edge sub-matrices includes:

And, the apparatus further comprises:

The extraction module is used for extracting element features of the preliminary relationship edge matrix according to a preset feature recognition model before the first calculation module performs the preset matrix processing operation on the preliminary relationship edge matrix according to the preset matrix processing algorithm to obtain the relation edge matrix;

The judging module is used for judging whether the element features match preset abnormal element features and, when the element features match the preset abnormal element features, triggering the first calculation module to perform the preset matrix processing operation on the preliminary relationship edge matrix according to the preset matrix processing algorithm to obtain the relation edge matrix;

The second determining module is further configured to determine the preliminary relationship edge matrix as the relation edge matrix of the first knowledge subgraph when the judging module judges that the element features do not match the preset abnormal element features;

Wherein the symbol defined for the preliminary relationship edge matrix calculation formula represents the preliminary relationship edge matrix;

and the preset matrix processing algorithm is as follows:

In a second aspect of the present invention, as an optional implementation manner, the specific manner of generating the entity matrix of the first knowledge sub-graph by the first generation module according to the preset weight distribution condition and all the first entities includes:

And the first entity comprises a second entity in the questioning information and a third entity associated with the second entity in the preset knowledge base, wherein the third entity exists in a text sentence of the preset knowledge base, and the specific mode of determining the entity weight of each first entity according to the preset weight distribution condition by the first generation module comprises the following steps:

In a second aspect of the present invention, as an optional implementation manner, the determining, by the first determining module, at least one target entity in the updated entity matrix according to a preset screening condition includes:

Acquiring the entity weight of each first entity in the updated entity matrix;

As an alternative embodiment, in the second aspect of the present invention, the apparatus further includes:

the analysis module is used for analyzing the question type information and the information complexity of the acquired question information according to a preset information processing model before the first generation module generates a first knowledge subgraph according to the question information and the preset knowledge base;

The matching module is used for matching at least one knowledge sub-base in the preset knowledge base according to the question type information;

The second generation module is used for generating retrieval configuration environment information of all the knowledge sub-bases according to the information complexity, wherein the retrieval configuration environment information comprises at least one of a retrieval hop count threshold, retrieval history nodes, a retrieval information quantity, network configuration information, a sensitive information base, read and write permissions, anti-tracking configuration information, and context semantic recognition model information;

The configuration module is used for configuring all knowledge sub-libraries according to the retrieval configuration environment information;

And the specific mode of the first generation module for generating the first knowledge subgraph according to the questioning information and the preset knowledge base comprises the following steps:

The third aspect of the invention discloses another knowledge subgraph construction device, which comprises:

A memory storing executable program code;

A processor coupled to the memory;

the processor invokes the executable program code stored in the memory to execute the knowledge subgraph construction method disclosed in the first aspect of the present invention.

A fourth aspect of the present invention discloses a computer storage medium storing computer instructions for performing the knowledge sub-graph construction method disclosed in the first aspect of the present invention when the computer instructions are invoked.

Compared with the prior art, the embodiment of the invention has the following beneficial effects:

In the embodiment of the invention, a first knowledge subgraph is generated according to questioning information and a preset knowledge base, the first knowledge subgraph comprises at least two first entities, the first entities are provided with mutually connected relation edges, the relation edges are used for representing the relation between the first entities, each relation edge is provided with a corresponding matching degree value, and the matching degree value is used for representing the matching degree between the relation edge and the questioning information; calculating a relation edge matrix of the first knowledge subgraph according to all the matching degree values; generating an entity matrix of the first knowledge sub-graph according to the preset weight distribution condition and all the first entities, wherein the entity matrix comprises entity weights of each first entity, and the entity weights meet the preset weight distribution condition; updating the entity matrix according to the relation edge matrix and the entity matrix; determining at least one target entity in the updated entity matrix according to preset screening conditions; and generating a second knowledge subgraph of the first knowledge subgraph according to all the target entities and the connected relation edges thereof. 
It can be seen that, by implementing the invention, a first knowledge subgraph can be generated according to the question information and the preset knowledge base, the first knowledge subgraph comprising at least two first entities connected by relationship edges, each of which has a corresponding matching degree value representing the degree of matching between that relationship edge and the question information. A relation edge matrix of the first knowledge subgraph is then calculated from all the matching degree values, an entity matrix of the first knowledge subgraph is generated according to the preset weight distribution condition and all the first entities, and the entity matrix is updated according to the relation edge matrix and the entity matrix, so that at least one target entity is finally determined in the updated entity matrix according to the preset screening condition.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a schematic flow chart of a knowledge subgraph construction method disclosed in an embodiment of the present invention;

FIG. 2 is a flow diagram of another knowledge subgraph construction method disclosed in an embodiment of the present invention;

FIG. 3 is a schematic diagram of a knowledge sub-graph construction apparatus according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of another knowledge sub-graph construction apparatus according to an embodiment of the invention;

FIG. 5 is a schematic structural diagram of another knowledge sub-graph construction apparatus according to an embodiment of the invention.

Detailed Description

In order that those skilled in the art may better understand the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the invention without inventive effort fall within the scope of protection of the invention.

The terms "first", "second" and the like in the description, in the claims and in the above-described figures are used for distinguishing between different objects and not necessarily for describing a sequential or chronological order. Furthermore, the terms "comprise" and "have", as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, apparatus, product, or device that comprises a list of steps or elements is not limited to the listed steps or elements, but may optionally include other steps or elements that are not listed or that are inherent to such process, method, product, or device.

Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the invention. The appearances of this phrase in various places in the specification do not necessarily all refer to the same embodiment, nor do they refer to separate or alternative embodiments that are mutually exclusive of other embodiments. Those skilled in the art will appreciate, both explicitly and implicitly, that the embodiments described herein may be combined with other embodiments.

The invention discloses a knowledge subgraph construction method and a knowledge subgraph construction device. A first knowledge subgraph can be generated according to the question information and a preset knowledge base, the first knowledge subgraph comprising at least two first entities connected by relationship edges, each of which has a corresponding matching degree value representing the degree of matching between that relationship edge and the question information. A relation edge matrix of the first knowledge subgraph is then calculated from all the matching degree values, an entity matrix of the first knowledge subgraph is generated according to the preset weight distribution condition and all the first entities, and the entity matrix is updated according to the relation edge matrix and the entity matrix, so that at least one target entity is finally determined in the updated entity matrix according to the preset screening condition. Details are described below.

Example 1

Referring to FIG. 1, FIG. 1 is a schematic flow chart of a knowledge subgraph construction method according to an embodiment of the present invention. The knowledge subgraph construction method described in FIG. 1 may be applied to a knowledge base and/or a text database, to a search engine interface, or to an intelligent device related to at least one of the knowledge base, the text database, and the search engine interface, where the intelligent device includes, but is not limited to, one or more of a battery device, a cloud device, an edge computing device, a relay device, a base station device, a city management device, and an intelligent networking device. As shown in FIG. 1, the knowledge subgraph construction method may include the following operations:

101. Generating a first knowledge subgraph according to the question information and a preset knowledge base, wherein the first knowledge subgraph comprises at least two first entities, relationship edges connect the first entities to one another, the relationship edges are used for representing the relationships between the first entities, each relationship edge has a corresponding matching degree value, and the matching degree value is used for representing the degree of matching between the relationship edge and the question information.

102. Calculating a relation edge matrix of the first knowledge subgraph according to all the matching degree values.

In this embodiment of the present invention, as an optional implementation manner, before calculating the relationship edge matrix of the first knowledge subgraph according to all the matching degree values, the method may further include the following operations:

Calculating a relation edge vector of each relation edge and a question vector of the question information according to a preset text vector model.

For each relation edge, calculating a similarity value between the relation edge vector of the relation edge and the question vector according to a preset similarity calculation model, and determining the similarity value as the matching degree value corresponding to the relation edge.

In this alternative embodiment, the preset text vector model may optionally be a basic model based on an RNN architecture, or may be a packaged, already trained model; for example, the preset text vector model may include at least one of Sentence-Transformers, LSTM, and Bi-LSTM.

Further optionally, the preset similarity calculation model may be:

Wherein the symbols of the model represent, respectively, the relation edge vector, the question vector, and the similarity value.

Therefore, by implementing the alternative embodiment, the relation edge vector of each relation edge and the question vector of the question information can be calculated according to the preset text vector model, so that the similarity value between the relation edge vector and the question vector is calculated according to the preset similarity calculation model, the matching degree value of the relation edge is further determined, the calculation accuracy of the matching degree value of the relation edge is improved, and the calculation accuracy of the relation edge matrix is improved.
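
As a non-limiting illustration, the matching degree calculation can be sketched with cosine similarity. The preset similarity calculation model is not reproduced in this text, so the choice of cosine similarity is an assumption, as is the function name `cosine_similarity`; the short vectors in the usage stand in for the output of a text vector model.

```python
import math
from typing import Sequence


def cosine_similarity(edge_vector: Sequence[float], question_vector: Sequence[float]) -> float:
    """Cosine of the angle between the two vectors; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(edge_vector, question_vector))
    norm = math.sqrt(sum(a * a for a in edge_vector)) * math.sqrt(
        sum(b * b for b in question_vector)
    )
    return dot / norm if norm else 0.0


# Hand-written stand-ins for embeddings of one relation edge and of the question.
edge_vector = [1.0, 0.0, 1.0]
question_vector = [1.0, 0.0, 0.0]
matching_degree = cosine_similarity(edge_vector, question_vector)
```

The value lies between -1 and 1 and depends only on the direction of the vectors, so long and short relation descriptions are scored on the same scale.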

Optionally, calculating the relationship edge matrix of the first knowledge subgraph according to all the matching degree values may include the following operations:

Classifying all the relationship edges into at least one relationship edge set according to all the matching degree values, wherein the matching degree values of the relationship edges within each relationship edge set match one another.

And calculating a relationship edge matrix of the first knowledge subgraph according to the first entity of all the relationship edge sets and the matching degree values corresponding to all the relationship edge sets.

In this alternative embodiment, the matching degree values of the relationship edges within one relationship edge set may differ from one another within a certain error range, the error range being adapted to the specific application scenario. In the end, however, a single matching degree value representing each relationship edge set must be determined from the matching degree values of all the relationship edges in that set, so that the subsequent operations can be performed.

Therefore, by implementing the alternative embodiment, all the relationship edges can be classified into at least one relationship edge set according to all the matching degree values, so that the relationship edge matrix of the first knowledge sub-graph is calculated according to the first entity of all the relationship edge sets and the matching degree values corresponding to all the relationship edge sets, the calculation accuracy, the scientificity and the feasibility of the relationship edge matrix can be further improved, and the updating accuracy of the entity matrix is improved.
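
As a non-limiting illustration, the classification of relationship edges into sets can be sketched as follows. The text leaves the error range and the choice of representative value to the application scenario, so the sketch assumes a fixed tolerance measured from the lowest matching degree value in each set and uses the mean as the representative value. The function name `group_edges` is hypothetical.

```python
from typing import List, Tuple

Edge = Tuple[str, str, float]  # (head entity, tail entity, matching degree value)


def group_edges(edges: List[Edge], tolerance: float) -> List[Tuple[float, List[Edge]]]:
    """Return (representative matching degree value, edges) for each relationship edge set."""
    groups: List[List[Edge]] = []
    for edge in sorted(edges, key=lambda e: e[2]):
        # Join the current set if the edge is within `tolerance` of that set's
        # lowest matching degree value; otherwise start a new set.
        if groups and edge[2] - groups[-1][0][2] <= tolerance:
            groups[-1].append(edge)
        else:
            groups.append([edge])
    # Assumed representative value: the mean matching degree value of the set.
    return [(sum(e[2] for e in g) / len(g), g) for g in groups]


edge_sets = group_edges(
    [("A", "B", 0.90), ("B", "C", 0.92), ("C", "D", 0.30)], tolerance=0.05
)
```

Here the two edges scored 0.90 and 0.92 fall into one set represented by roughly 0.91, while the edge scored 0.30 forms a set of its own.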

In this optional embodiment, as an optional implementation manner, calculating the relationship edge matrix of the first knowledge subgraph according to the first entities of all the relationship edge sets and the matching degree values corresponding to all the relationship edge sets may include the following operations:

For each relationship edge set, calculating a relationship edge sub-matrix of the relationship edge set according to the matching degree value corresponding to the relationship edge set and the first entities connected by each relationship edge in the relationship edge set.

Calculating the relationship edge matrix of the first knowledge subgraph according to all the relationship edge sub-matrices.

Therefore, by implementing this optional embodiment, each relationship edge set can be processed at a finer granularity: a relationship edge sub-matrix is calculated for each relationship edge set from its matching degree value and the first entities connected by its relationship edges, and the relationship edge matrix of the first knowledge subgraph is then calculated from all the relationship edge sub-matrices. This further improves the calculation accuracy, scientific soundness and feasibility of the relationship edge matrix and helps to further improve the update accuracy of the entity matrix.

Optionally, the above relationship edge sub-matrix is calculated by a relationship edge sub-matrix calculation formula, and the relationship edge sub-matrix calculation formula may be:

Wherein the symbols of the formula respectively represent: the relationship edge sub-matrix of the i-th relationship edge set; the initial sub-matrix of the i-th relationship edge set, the initial sub-matrix comprising an initial relationship weight for each relationship edge in the relationship edge set, the initial relationship weight being determined according to the first entities connected by each relationship edge in the relationship edge set; the matching degree value corresponding to the i-th relationship edge set; and the total number of relationship edge sets.

In this optional embodiment, optionally, the initial relationship weights may be distributed uniformly over all relationship edges of the entity to which the relationship edges belong, or may be distributed uniformly per direction: for example, the relationship edges leaving an entity form one group and the relationship edges entering the entity form another group, and the weights are distributed uniformly within each group.

Therefore, by implementing this optional embodiment, the relationship edge sub-matrix of a relationship edge set can be calculated with a specific relationship edge sub-matrix calculation formula from the initial sub-matrix of the relationship edge set and the matching degree value corresponding to the relationship edge set. This improves the calculation accuracy, scientific soundness and feasibility of the relationship edge sub-matrix, which in turn helps to improve those of the relationship edge matrix and the update accuracy of the entity matrix.
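The following sketch shows one way such a sub-matrix could be built. It is not the formula of the source, which is not reproduced in this text. The assumptions are: relationship edges are directed (source, target) pairs; the initial relationship weight of an edge is 1 divided by the number of edges leaving its source entity; the sub-matrix is the initial sub-matrix scaled by the matching degree value of the set; and entry [target][source] holds the weight of an edge.

```python
def edge_sub_matrix(edge_set, all_edges, entities, matching_degree):
    """Build the sub-matrix of one relationship edge set.

    `edge_set` must be a subset of `all_edges`; both are lists of
    (source, target) pairs over the names in `entities`.
    """
    index = {name: k for k, name in enumerate(entities)}
    # Out-degree over the whole first knowledge subgraph, not just this set.
    out_degree = {name: 0 for name in entities}
    for source, _ in all_edges:
        out_degree[source] += 1
    n = len(entities)
    sub = [[0.0] * n for _ in range(n)]
    for source, target in edge_set:
        initial_weight = 1.0 / out_degree[source]  # uniform split (assumption)
        sub[index[target]][index[source]] += matching_degree * initial_weight
    return sub
```

With entities A, B, C and edges A→B, A→C, B→C, a set containing A→B and B→C with matching degree 0.8 yields the entries 0.4 (for A→B) and 0.8 (for B→C).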

In this optional embodiment, as an optional implementation manner, calculating the relationship edge matrix of the first knowledge sub-graph according to all the relationship edge sub-matrices may include the following operations:

Calculating a preliminary relationship edge matrix of the first knowledge subgraph according to a preset operation algorithm and all the relationship edge sub-matrices.

Executing a preset matrix processing operation on the preliminary relationship edge matrix according to a preset matrix processing algorithm to obtain the relationship edge matrix.

It can be seen that, by implementing this optional embodiment, the preliminary relationship edge matrix of the first knowledge subgraph can be calculated according to the preset operation algorithm and all the relationship edge sub-matrices, and the preset matrix processing operation can then be performed on the preliminary relationship edge matrix according to the preset matrix processing algorithm to obtain the relationship edge matrix, which further improves the calculation accuracy of the relationship edge matrix.

Optionally, before performing the preset matrix processing operation on the preliminary relationship edge matrix according to the preset matrix processing algorithm to obtain the relationship edge matrix, the method may further include the following operations:

Extracting element features of the preliminary relationship edge matrix according to a preset feature recognition model.

Judging whether the element features match preset abnormal element features, and when the element features are judged to match the preset abnormal element features, triggering execution of the operation of performing the preset matrix processing operation on the preliminary relationship edge matrix according to the preset matrix processing algorithm to obtain the relationship edge matrix.

When the element features are judged not to match the preset abnormal element features, determining the preliminary relationship edge matrix as the relationship edge matrix of the first knowledge subgraph.

It can be seen that, before the preset matrix processing operation is performed on the preliminary relationship edge matrix, the element features of the preliminary relationship edge matrix can be extracted according to the preset feature recognition model and compared with the preset abnormal element features. The preset matrix processing operation is performed only when the element features match the preset abnormal element features; otherwise the preliminary relationship edge matrix is determined directly as the relationship edge matrix of the first knowledge subgraph. This further improves the calculation accuracy, scientific soundness and feasibility of the relationship edge matrix, reduces the algorithmic complexity and the amount of calculation of the relationship edge matrix, reduces system redundancy, and further improves the knowledge subgraph construction efficiency.
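The conditional processing described above can be sketched as a small dispatch function. The source does not specify the feature recognition model or the abnormal element features, so `has_unnormalized_column` below is a purely hypothetical stand-in that treats a column whose non-zero sum differs from 1 as abnormal; the processing operation itself is passed in as a callable.

```python
def finalize_edge_matrix(preliminary, is_abnormal, process):
    """Apply `process` only when `is_abnormal` flags the preliminary matrix;
    otherwise the preliminary matrix is returned unchanged."""
    if is_abnormal(preliminary):
        return process(preliminary)
    return preliminary


def has_unnormalized_column(matrix, eps=1e-9):
    """Hypothetical anomaly feature: some non-empty column does not sum to 1."""
    n = len(matrix)
    for col in range(n):
        total = sum(matrix[row][col] for row in range(n))
        if total > eps and abs(total - 1.0) > eps:
            return True
    return False
```

Skipping the processing operation for matrices that are already well formed is what yields the reduction in calculation that the paragraph above describes.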

Further optionally, the above preliminary relationship edge matrix is calculated by a preliminary relationship edge matrix calculation formula, and the preliminary relationship edge matrix calculation formula may be:

Wherein the symbol defined by the formula represents the preliminary relationship edge matrix.

Further optionally, the preset matrix processing algorithm may be:

Wherein the symbols of the formula respectively represent the relationship edge matrix and a preset normalization function.

Therefore, by implementing this optional embodiment, the preliminary relationship edge matrix and the relationship edge matrix can be calculated with a specific preliminary relationship edge matrix calculation formula and a preset matrix processing algorithm, which further improves the calculation accuracy, scientific soundness and feasibility of the relationship edge matrix.
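A minimal sketch of these two steps is given below. Both formulas are absent from this text, so the code rests on two assumptions: the preliminary relationship edge matrix is the element-wise sum of all relationship edge sub-matrices, and the preset normalization function scales every non-zero column so that it sums to 1.

```python
def preliminary_edge_matrix(sub_matrices):
    """Element-wise sum of equally sized square sub-matrices (assumed rule)."""
    n = len(sub_matrices[0])
    return [[sum(sub[r][c] for sub in sub_matrices) for c in range(n)]
            for r in range(n)]


def normalize_columns(matrix):
    """Scale every non-zero column so that it sums to 1 (assumed normalization).

    All-zero columns are left untouched; the input matrix is not modified.
    """
    n = len(matrix)
    result = [row[:] for row in matrix]
    for c in range(n):
        total = sum(matrix[r][c] for r in range(n))
        if total > 0.0:
            for r in range(n):
                result[r][c] = matrix[r][c] / total
    return result
```

Column normalization is chosen here so that multiplying the matrix by an entity weight vector redistributes the weights without changing their total; other normalization functions would fit the description equally well.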

103. Generating an entity matrix of the first knowledge sub-graph according to the preset weight distribution condition and all the first entities, wherein the entity matrix comprises entity weights of each first entity, and the entity weights meet the preset weight distribution condition.

104. And updating the entity matrix according to the relation edge matrix and the entity matrix.

In the embodiment of the invention, the entity matrix is, more precisely, a vector.

In an embodiment of the present invention, optionally, the foregoing updating of the entity matrix may be performed according to a formula whose symbols respectively represent the relationship edge matrix and the entity matrix.

Therefore, the entity matrix can be updated in this manner, which improves the update accuracy and feasibility of the entity matrix.

Further optionally, the update end condition in step 104 may be that the number of iterative updates is greater than or equal to a preset number of updates. Alternatively, the entity weights in the entity matrix may be obtained after each update and compared, position by position, with those of the immediately preceding update: if any entity weight has changed, the iterative updating continues; if none has changed, the updated entity matrix is output and the following steps 105 and 106 are executed.
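The iterative update with both end conditions can be sketched as follows. The update formula is not reproduced in this text, so the code assumes that each update multiplies the relationship edge matrix by the entity vector; the tolerance `tol` used to decide whether a weight has changed is also an assumption.

```python
def update_entity_vector(edge_matrix, entity_vector, max_updates=50, tol=1e-9):
    """Repeat v <- M v until no weight changes by more than `tol` or
    `max_updates` iterations have run. Returns (vector, iterations_run)."""
    n = len(entity_vector)
    current = list(entity_vector)
    for iteration in range(1, max_updates + 1):
        updated = [sum(edge_matrix[r][c] * current[c] for c in range(n))
                   for r in range(n)]
        # Position-by-position comparison with the previous update.
        changed = any(abs(updated[k] - current[k]) > tol for k in range(n))
        current = updated
        if not changed:
            return current, iteration
    return current, max_updates
```

For a three-entity graph in which A passes half of its weight to B and half to C, B passes everything to C, and C keeps its weight, starting from all weight on A the vector settles on C after three updates.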

105. And determining at least one target entity in the updated entity matrix according to the preset screening conditions.

106. And generating a second knowledge subgraph of the first knowledge subgraph according to all the target entities and the connected relation edges thereof.

It can be seen that, by implementing the embodiment of the invention, a first knowledge subgraph can be generated according to the questioning information and the preset knowledge base, the first knowledge subgraph comprising at least two first entities connected by relationship edges, each relationship edge having a corresponding matching degree value that represents the matching degree between the relationship edge and the questioning information. The relationship edge matrix of the first knowledge subgraph is then calculated based on all the matching degree values, the entity matrix of the first knowledge subgraph is generated according to the preset weight distribution condition and all the first entities, and the entity matrix is updated according to the relationship edge matrix and the entity matrix, so that at least one target entity is finally determined in the updated entity matrix according to the preset screening conditions.
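Step 106 can be illustrated with a short sketch. The phrase "the connected relation edges thereof" is read here as the relationship edges whose two endpoints are both target entities; this reading, like the function name, is an assumption, and keeping every edge that touches a target entity would be an equally plausible alternative.

```python
def second_subgraph(edges, target_entities):
    """Keep the target entities and the edges whose endpoints are both targets.

    `edges` is a list of (source, target) pairs; returns (entities, edges).
    """
    targets = set(target_entities)
    kept_edges = [(s, t) for s, t in edges if s in targets and t in targets]
    return sorted(targets), kept_edges
```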

Example two

Referring to fig. 2, fig. 2 is a schematic flow chart of another knowledge subgraph construction method according to an embodiment of the present invention. The knowledge subgraph construction method described in fig. 2 may be applied to a knowledge base and/or a text database, or may be applied to a search engine interface, or may be applied to an intelligent device related to at least one of the knowledge base, the text database, and the search engine interface, where the intelligent device includes, but is not limited to, one or more of a battery device, a cloud device, an edge computing device, a relay device, a base station device, a city management device, and an intelligent networking device. As shown in fig. 2, the knowledge subgraph construction method may include the following operations:

201. Generating a first knowledge subgraph according to the questioning information and a preset knowledge base, wherein the first knowledge subgraph comprises at least two first entities, interconnecting relationship edges exist between the first entities, the relationship edges are used for representing the relationships between the first entities, each relationship edge has a corresponding matching degree value, and the matching degree value is used for representing the matching degree between the relationship edge and the questioning information.

202. Calculating a relationship edge matrix of the first knowledge subgraph according to all the matching degree values.

203. Determining the entity weight of each first entity according to the preset weight distribution condition, wherein the entity weight meets the preset weight distribution condition.

In an embodiment of the present invention, as an optional implementation manner, the first entity includes a second entity in the question information and a third entity associated with the second entity in the preset knowledge base, where the third entity exists in a text sentence in the preset knowledge base, and the determining, according to a preset weight distribution condition, an entity weight of each first entity may include the following operations:

According to a preset semantic analysis model, analyzing a first criticality value of each second entity, a second criticality value of each third entity and a relevance value of each third entity to each second entity, wherein the first criticality value is used for representing the semantic influence of the second entity on question information, and the second criticality value is used for representing the semantic influence of the third entity on text sentences.

And determining the entity weight of each first entity according to all the first criticality values, all the second criticality values and all the relevance values.

In the embodiment of the present invention, optionally, the preset semantic analysis model may be a basic RNN-based model, or may be an integrated semantic analysis package, including but not limited to ERNIE, GPT, and the like.

Therefore, by implementing the alternative embodiment, the first criticality value of each second entity, the second criticality value of each third entity and the association degree value of each third entity to each second entity can be analyzed according to a preset semantic analysis model, so that the entity weight of each first entity can be further determined, comprehensive global analysis can be realized, and the accuracy of determining the entity weight can be improved.

In this optional embodiment, as another optional implementation manner, the entity weights may also be determined by identifying all the second entities, distributing the entity weights uniformly among all the second entities, and setting the entity weights of all the third entities to a preset value, where the preset value may be set according to the actual application scenario, for example to 0 or to a relatively low weight value adapted to the entity weights of the second entities.
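This simpler initialization can be sketched as follows. The assumption that the uniformly distributed weights of the second entities sum to 1 is not stated in the source, and the function and parameter names are hypothetical.

```python
def initial_entity_weights(entities, question_entities, preset_value=0.0):
    """Spread weight uniformly over the entities found in the question (second
    entities); give every other entity (third entities) `preset_value`."""
    in_question = set(question_entities)
    count = sum(1 for name in entities if name in in_question)
    if count == 0:
        raise ValueError("no question entity appears in the subgraph")
    return [1.0 / count if name in in_question else preset_value
            for name in entities]
```

The returned list follows the order of `entities`, so it can serve directly as the entity vector of step 204.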

204. And generating an entity matrix of the first knowledge sub-graph according to all the entity weights.

205. And updating the entity matrix according to the relation edge matrix and the entity matrix.

206. And determining at least one target entity in the updated entity matrix according to the preset screening conditions.

In an embodiment of the present invention, as another optional implementation manner, the determining at least one target entity in the updated entity matrix according to the preset screening condition may include the following operations:

Acquiring the entity weight of each first entity in the updated entity matrix.

Judging whether the entity weight of each first entity is greater than or equal to a preset entity weight threshold, and when the entity weight of a first entity is judged to be greater than or equal to the preset entity weight threshold, determining that first entity as a target entity.

When the entity weights of all the first entities are judged to be smaller than the preset entity weight threshold, updating the preset entity weight threshold according to all the entity weights, and re-triggering the operation of judging whether the entity weight of each first entity is greater than or equal to the preset entity weight threshold.

Further optionally, when the entity weights of all the first entities are judged to be smaller than the preset entity weight threshold, feedback may be sent to the object corresponding to the questioning information so that operation deviations in steps 201 to 205 can be traced back; specifically, the operation steps of steps 201 to 205 may be fed back to the corresponding object.

It can be seen that, by implementing this optional embodiment, whether each entity weight in the updated entity matrix is greater than or equal to the preset entity weight threshold can be judged, which improves the determination accuracy of the target entities and of the second knowledge subgraph; and, through the feedback, the object corresponding to the questioning information can accurately identify abnormal situations arising in the knowledge subgraph construction process, which further improves the construction accuracy of the knowledge subgraph.
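The screening with a fallback threshold can be sketched as follows. The source only says that the threshold is updated "according to all the entity weights"; replacing it with the median weight is an assumption made here because it guarantees that at least one entity passes the second screening.

```python
import statistics


def select_target_entities(entities, weights, threshold):
    """Return (targets, threshold_used).

    If no weight reaches `threshold`, the threshold is replaced by the median
    weight (assumed update rule) and the screening is repeated once.
    """
    if not weights:
        raise ValueError("weights must not be empty")
    targets = [name for name, w in zip(entities, weights) if w >= threshold]
    if not targets:
        threshold = statistics.median(weights)
        targets = [name for name, w in zip(entities, weights) if w >= threshold]
    return targets, threshold
```

For weights 0.1, 0.3 and 0.6, a threshold of 0.5 selects only the third entity, whereas a threshold of 0.9 selects nothing at first and falls back to the median 0.3, which selects the second and third entities.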

207. And generating a second knowledge subgraph of the first knowledge subgraph according to all the target entities and the connected relation edges thereof.

In the embodiment of the present invention, for further explanation of step 201, step 202, and steps 205 to 207, please refer to the explanation of step 101, step 102, and steps 104 to 106 in the first embodiment, which is not repeated here.

In an optional embodiment, before generating the first knowledge subgraph according to the questioning information and the preset knowledge base, the method may further include the following operations:

Analyzing the question type information and the information complexity of the acquired questioning information according to a preset information processing model.

Matching at least one knowledge sub-base in a preset knowledge base according to the question type information.

And generating retrieval configuration environment information of all knowledge sub-bases according to the information complexity, wherein the retrieval configuration environment information comprises at least one of retrieval hop threshold value, retrieval history node, retrieval information quantity, network configuration information, sensitive information base, readable and writable authority, tracking prevention configuration information and context semantic recognition model information.

And configuring all knowledge sub-bases according to the retrieval configuration environment information.

Optionally, the generating the first knowledge subgraph according to the questioning information and the preset knowledge base may include the following operations:

It can be seen that, before the first knowledge subgraph is generated, the question type information and the information complexity of the acquired questioning information can be analyzed based on the preset information processing model, so that at least one knowledge sub-base is matched in the preset knowledge base based on the question type information, which improves the matching accuracy of the knowledge sub-bases. The retrieval configuration environment information of all the knowledge sub-bases is then generated based on the information complexity and all the knowledge sub-bases are configured accordingly, which further improves the matching accuracy of the knowledge sub-bases. The first knowledge subgraph is generated in combination with the questioning information, which improves the generation accuracy of the first knowledge subgraph and of the second knowledge subgraph, and thereby the knowledge subgraph construction accuracy, the efficiency of knowledge base question-answering reasoning, and the users' efficiency in production and daily life.

Example three

Referring to fig. 3, fig. 3 is a schematic structural diagram of a knowledge sub-graph construction apparatus according to an embodiment of the invention. The knowledge subgraph construction device described in fig. 3 may be applied to a knowledge base and/or a text database, or may be applied to a search engine interface, or may be applied to an intelligent device related to at least one of the knowledge base, the text database, and the search engine interface, where the intelligent device includes, but is not limited to, one or more of a battery device, a cloud device, an edge computing device, a relay device, a base station device, a city management device, and an intelligent networking device. As shown in fig. 3, the knowledge subgraph construction device may include:

The first generating module 301 is configured to generate a first knowledge subgraph according to the question information and a preset knowledge base, where the first knowledge subgraph includes at least two first entities, there are relationship edges connected to each other between the first entities, the relationship edges are used to represent a relationship between the first entities, and each relationship edge has a corresponding matching degree value, where the matching degree value is used to represent a matching degree between the relationship edge and the question information.

The first calculation module 302 is configured to calculate a relationship edge matrix of the first knowledge subgraph according to all the matching degree values.

The first generating module 301 is further configured to generate an entity matrix of the first knowledge sub-graph according to the preset weight distribution condition and all the first entities, where the entity matrix includes an entity weight of each first entity, and the entity weight satisfies the preset weight distribution condition.

And an updating module 303, configured to update the entity matrix according to the relationship edge matrix and the entity matrix.

The first determining module 304 is configured to determine at least one target entity in the updated entity matrix according to a preset screening condition.

The first generating module 301 is further configured to generate a second knowledge subgraph of the first knowledge subgraph according to all the target entities and the connected relationship edges thereof.

In an embodiment of the present invention, as an optional implementation manner, as shown in fig. 4, the apparatus may further include:

The second calculating module 305 is configured to calculate, according to a preset text vector model, a relationship edge vector of each relationship edge and a question vector of the questioning information before the first calculating module 302 calculates the relationship edge matrix of the first knowledge subgraph according to all the matching degree values.

The second calculating module 305 is further configured to calculate, for each relationship edge, a similarity value between the relationship edge vector of the relationship edge and the question vector according to a preset similarity calculation model.

A second determining module 306, configured to determine the similarity value as the matching degree value corresponding to the relationship edge.

Optionally, the specific manner of calculating the relationship edge matrix of the first knowledge subgraph by the first calculation module 302 according to all the matching degree values includes:

In this optional embodiment, as an optional implementation manner, the specific manner of calculating, by the first calculating module 302, the relationship edge matrix of the first knowledge subgraph according to the first entities of all the relationship edge sets and the matching degree values corresponding to all the relationship edge sets includes:

Optionally, the above-mentioned relational edge sub-matrix is calculated by a relational edge sub-matrix calculation formula, where the relational edge sub-matrix calculation formula is:

In this alternative embodiment, as another alternative implementation manner, the specific manner of calculating the relationship edge matrix of the first knowledge subgraph by using the first calculation module 302 according to all the relationship edge sub-matrices includes:

Optionally, as shown in fig. 4, the apparatus may further include:

The extracting module 307 is configured to extract element features of the preliminary relationship edge matrix according to a preset feature recognition model before the first calculating module 302 performs a preset matrix processing operation on the preliminary relationship edge matrix according to a preset matrix processing algorithm to obtain the relationship edge matrix.

The judging module 308 is configured to judge whether the element features match preset abnormal element features and, when the element features are judged to match the preset abnormal element features, to trigger the first calculating module 302 to execute the operation of performing the preset matrix processing operation on the preliminary relationship edge matrix according to the preset matrix processing algorithm to obtain the relationship edge matrix.

The second determining module 306 is further configured to determine the preliminary relationship edge matrix as the relationship edge matrix of the first knowledge subgraph when the judging module 308 judges that the element features do not match the preset abnormal element features.

Optionally, the above preliminary relationship edge matrix is calculated by a preliminary relationship edge matrix calculation formula, where the preliminary relationship edge matrix calculation formula is:

Wherein the symbol defined by the formula represents the preliminary relationship edge matrix.

Further optionally, the preset matrix processing algorithm is as follows:

In this embodiment of the present invention, as another optional implementation manner, the specific manner of generating the entity matrix of the first knowledge sub-graph by the first generating module 301 according to the preset weight distribution condition and all the first entities includes:

And determining the entity weight of each first entity according to the preset weight distribution condition.

And generating an entity matrix of the first knowledge sub-graph according to all the entity weights.

Optionally, the first entity includes a second entity in the question information and a third entity associated with the second entity in the preset knowledge base, where the third entity exists in a text sentence in the preset knowledge base, and the specific manner of determining, by the first generating module 301, the entity weight of each first entity according to the preset weight distribution condition includes:

In this optional embodiment, as an optional implementation manner, the determining, by the first determining module 304, of the updated entity matrix according to the preset screening condition, a specific manner of determining at least one target entity includes:

In an alternative embodiment, as shown in fig. 4, the apparatus may further include:

The analysis module 309 is configured to analyze the question type information and the information complexity of the obtained question information according to the preset information processing model before the first generation module 301 generates the first knowledge subgraph according to the question information and the preset knowledge base.

A matching module 310, configured to match at least one knowledge sub-base in a preset knowledge base according to the question type information.

The second generating module 311 is configured to generate, according to the information complexity, search configuration environment information of all knowledge sub-bases, where the search configuration environment information includes at least one of a search hop threshold, a search history node, a search information amount, network configuration information, a sensitive information base, a readable and writable right, anti-tracking configuration information, and context semantic recognition model information.

A configuration module 312, configured to configure all the knowledge sub-bases according to the retrieval configuration environment information.

Optionally, the specific manner of generating the first knowledge subgraph by the first generating module 301 according to the questioning information and the preset knowledge base includes:

Example four

Referring to fig. 5, fig. 5 is a schematic structural diagram of another knowledge sub-graph construction apparatus according to an embodiment of the invention. The knowledge subgraph construction device described in fig. 5 may be applied to a knowledge base and/or a text database, or may be applied to a search engine interface, or may be applied to an intelligent device related to at least one of the knowledge base, the text database, and the search engine interface, where the intelligent device includes, but is not limited to, one or more of a battery device, a cloud device, an edge computing device, a relay device, a base station device, a city management device, and an intelligent networking device. As shown in fig. 5, the knowledge subgraph construction device may include:

A memory 401 storing executable program code.

A processor 402 coupled with the memory 401.

The processor 402 invokes the executable program code stored in the memory 401 to perform the steps in the knowledge subgraph construction method described in the first or second embodiment of the present invention.

Example five

The embodiment of the invention discloses a computer storage medium which stores computer instructions that, when called, execute the steps in the knowledge subgraph construction method described in the first embodiment or the second embodiment of the invention.

Example six

An embodiment of the present invention discloses a computer program product comprising a non-transitory computer storage medium storing a computer program, the computer program being operable to cause a computer to perform the steps of the knowledge subgraph construction method described in the first embodiment or the second embodiment.

The apparatus embodiments described above are merely illustrative, wherein the modules illustrated as separate components may or may not be physically separate, and the components shown as modules may or may not be physical, i.e., may be located in one place, or may be distributed over a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art will understand and implement the present invention without undue burden.

From the above detailed description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus a necessary general-purpose hardware platform, or of course by means of hardware. Based on this understanding, the foregoing technical solutions may be embodied essentially or in part in the form of a software product that may be stored in a computer-readable storage medium, including Read-Only Memory (ROM), Random Access Memory (RAM), Programmable Read-Only Memory (PROM), Erasable Programmable Read-Only Memory (EPROM), One-Time Programmable Read-Only Memory (OTPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Compact Disc Read-Only Memory (CD-ROM) or other optical disc memory, magnetic disk memory, tape memory, or any other computer-readable medium that can be used to carry or store data.

Finally, it should be noted that the knowledge subgraph construction method and device disclosed in the embodiments of the invention are intended only to illustrate the technical solution of the invention, not to limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions recorded in the various embodiments can still be modified, or some of their technical features can be replaced by equivalents, and that such modifications and substitutions do not depart from the spirit and scope of the corresponding technical solutions.

Claims

1. A knowledge subgraph construction method, the method comprising:

2. The knowledge subgraph construction method of claim 1 wherein prior to said calculating a relationship edge matrix of the first knowledge subgraph from all of the matching degree values, the method further comprises:

3. The knowledge subgraph construction method of claim 2 wherein the calculating a relationship edge matrix of the first knowledge subgraph according to the first entity of all the relationship edge sets and the matching degree values corresponding to all the relationship edge sets includes:

4. The knowledge subgraph construction method of claim 3 wherein calculating a relationship edge matrix of the first knowledge subgraph from all the relationship edge sub matrices includes:

Wherein the symbol defined by the formula represents the preliminary relationship edge matrix;

and the preset matrix processing algorithm is as follows:

5. The knowledge sub-graph construction method according to any one of claims 1-4, wherein the generating an entity matrix of the first knowledge sub-graph according to a preset weight distribution condition and all the first entities includes:

6. The knowledge sub-graph construction method according to claim 5, wherein determining at least one target entity in the updated entity matrix according to a preset screening condition comprises:

Acquiring the entity weight of each first entity in the updated entity matrix;

7. The knowledge subgraph construction method of any one of claims 1-4 and 6 wherein prior to generating a first knowledge subgraph from the questioning information and a preset knowledge base, the method further comprises:

8. A knowledge sub-graph construction apparatus, the apparatus comprising:

9. A knowledge sub-graph construction apparatus, the apparatus comprising:

A memory storing executable program code;

A processor coupled to the memory;

The processor invokes the executable program code stored in the memory to perform the knowledge sub-graph construction method of any of claims 1-7.

10. A computer storage medium storing computer instructions which, when invoked, are operable to perform the knowledge sub-graph construction method of any one of claims 1-7.