WO2022156450A1 - Method and apparatus for querying a knowledge base, computer device, and storage medium - Google Patents

Method and apparatus for querying a knowledge base, computer device, and storage medium

Info

Publication number
WO2022156450A1
WO2022156450A1 PCT/CN2021/139332 CN2021139332W WO2022156450A1 WO 2022156450 A1 WO2022156450 A1 WO 2022156450A1 CN 2021139332 W CN2021139332 W CN 2021139332W WO 2022156450 A1 WO2022156450 A1 WO 2022156450A1
Authority
WO
WIPO (PCT)
Prior art keywords
statement
entity
graph query
original
template
Prior art date
Application number
PCT/CN2021/139332
Other languages
English (en)
French (fr)
Inventor
郭又铭
顾松庠
Original Assignee
京东科技控股股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 京东科技控股股份有限公司
Publication of WO2022156450A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition

Definitions

  • The present application relates to the technical field of natural language processing, and in particular to a method and apparatus for querying a knowledge base, a computer device, and a storage medium.
  • A knowledge base is a structured, easy-to-operate, easy-to-use, comprehensive, and organized cluster of knowledge in knowledge engineering.
  • A knowledge base is usually queried based on template classification: a statement to be queried is matched against templates, and a query result is obtained based on the matched template.
  • With this approach, the query result depends entirely on the templates: if no template matches the statement to be queried, the query cannot be performed.
  • In addition, algorithm engineers must design a large number of natural-language-to-template mappings and keep expanding the templates, which results in high labor and time costs.
  • The present application provides a method and apparatus for querying a knowledge base, a computer device, and a storage medium.
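  • An embodiment of the present application provides a method for querying a knowledge base, including:
  • obtaining an original statement to be queried; parsing the original statement to obtain a graph query statement template corresponding to the original statement; generating a graph query statement according to the original statement and the graph query statement template; and querying the knowledge base by using the graph query statement to obtain the query result corresponding to the original statement.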
  • Another embodiment of the present application provides an apparatus for querying a knowledge base, including:
  • a first acquisition module, configured to obtain the original statement to be queried;
  • a second acquisition module, configured to parse the original statement to obtain a graph query statement template corresponding to the original statement;
  • a generation module, configured to generate a graph query statement according to the original statement and the graph query statement template; and
  • a query module, configured to query the knowledge base by using the graph query statement to obtain the query result corresponding to the original statement.
  • Another embodiment of the present application provides a computer device, including a processor and a memory.
  • The processor reads the executable program code stored in the memory and runs a program corresponding to that code, so as to implement the method for querying a knowledge base according to the above embodiment.
  • Another embodiment of the present application provides a non-transitory computer-readable storage medium on which a computer program is stored; when executed by a processor, the program implements the method for querying a knowledge base according to the above embodiment.
  • Another embodiment of the present application provides a computer program product including a computer program; when executed by a processor, the computer program implements the method for querying a knowledge base according to the above embodiment.
  • FIG. 1 shows a schematic flowchart of a method for querying a knowledge base provided by one or more embodiments;
  • FIG. 2 shows a schematic flowchart of another method for querying a knowledge base provided by one or more embodiments;
  • FIG. 3 shows a schematic flowchart of another method for querying a knowledge base provided by one or more embodiments;
  • FIG. 4 shows a schematic flowchart of another method for querying a knowledge base provided by one or more embodiments;
  • FIG. 5 shows a schematic diagram of a knowledge base query provided by one or more embodiments;
  • FIG. 6 shows a schematic structural diagram of an apparatus for querying a knowledge base provided by one or more embodiments.
  • FIG. 1 is a schematic flowchart of a method for querying a knowledge base according to an embodiment of the present application.
  • The method for querying a knowledge base according to the embodiment of the present application can be executed by the apparatus for querying a knowledge base provided in the embodiment of the present application. The apparatus can be configured in a computer device so that the knowledge base is queried by using a graph query statement.
  • As shown in FIG. 1, the method for querying a knowledge base includes steps 101 to 104.
  • Step 101: Obtain the original statement to be queried.
  • The user can input the statement to be queried as text, so that the computer device obtains the original statement to be queried.
  • The user can also input the statement by voice; after collecting the voice data, the computer device performs speech recognition to obtain the original statement to be queried.
  • Step 102: Parse the original statement to obtain a graph query statement template corresponding to the original statement.
  • After the original statement is obtained, it can be parsed to obtain the corresponding graph query statement template.
  • The knowledge base includes entities and the relationships between entities, namely entity relationships. Entities are abstractions of objective individuals, such as the names of persons, places, and institutions.
  • By parsing the original statement, the entities, entity relationships, and so on that it contains can be obtained, and a graph query statement template corresponding to the original statement can be obtained from them.
  • A preset neural network model may also be used to process the original statement, and the graph query statement template corresponding to the original statement is obtained according to the processing result.
  • The graph query statement template follows the prescribed language that must be used when a graph query statement is used to query the knowledge base.
  • For example, the graph query statement template may be a gremlin query statement template.
  • The gremlin language is a query language for graph databases that can be used to traverse property graphs.
  • For example, the original statement is "Where is A's hometown" (the letter A stands for a specific name, nickname, etc., such as Zhang San).
  • The original statement is processed by the preset neural network model, which determines that its statement type is a query statement, that the entity "A" in the original statement matches the entity type "name" in the knowledge base with the corresponding attribute condition (has), and that the entity relationship "hometown" in the original statement matches the edge type "hometown" in the knowledge base with the corresponding edge attribute (out).
  • Accordingly, the gremlin query statement template "gt_nlp_asm_kg.v().has('name',).out('hometown').order().by('stock_flag',decr).dedup()" is generated.
  • Step 103: Generate a graph query statement according to the original statement and the graph query statement template.
  • Since the graph query statement template does not contain the entities required for querying the knowledge base, the template can be filled according to the original statement to generate the graph query statement.
  • Step 104: Query the knowledge base by using the graph query statement to obtain the query result corresponding to the original statement.
  • After the graph query statement is generated, the knowledge base can be queried with it to obtain the query result corresponding to the original statement. For example, the entity contained in the graph query statement can be located in the knowledge base, and the query result can be found among the entities that this entity points to.
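  • The lookup described in step 104 (locate the entity, then follow its outgoing edges) can be illustrated with a small in-memory graph. This is only a sketch under stated assumptions: the `Vertex` class, the `query_out` function, and the sample data are hypothetical and stand in for a real graph database that would execute the gremlin statement.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    """A hypothetical knowledge-base entity with labeled outgoing edges."""
    name: str
    # edge label -> names of the entities this entity points to
    out_edges: dict = field(default_factory=dict)


def query_out(graph, name, edge_label):
    """Find the entity by name (like `has`), follow outgoing edges with the
    given label (like `out`), and drop duplicates in order (like `dedup`)."""
    vertex = graph.get(name)
    if vertex is None:
        return []
    seen = set()
    results = []
    for target in vertex.out_edges.get(edge_label, []):
        if target not in seen:
            seen.add(target)
            results.append(target)
    return results


# Hypothetical sample data: A's hometown edge is stored twice on purpose.
graph = {
    "A": Vertex("A", {"hometown": ["city m", "city m"]}),
    "city m": Vertex("city m"),
}
print(query_out(graph, "A", "hometown"))
```

  • The sketch omits the ordering step of the example statement; a real deployment would send the generated statement to the graph database instead of walking dictionaries.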
  • In the method for querying a knowledge base provided by some embodiments, the original statement to be queried is obtained, the original statement is parsed to obtain the corresponding graph query statement template, a graph query statement is generated according to the original statement and the template, and the knowledge base is queried by using the graph query statement to obtain the query result corresponding to the original statement. Because the graph query statement is generated from the original statement and the template and is then used to query the knowledge base, the query statements are richer and more diverse than in the template-based query method, which greatly improves the flexibility of knowledge base queries, further improves accuracy, and saves labor costs and time.
  • The graph query statement template may be obtained in the manner shown in FIG. 2.
  • FIG. 2 is a schematic flowchart of another method for querying a knowledge base according to some embodiments.
  • As shown in FIG. 2, the above process of parsing the original statement to obtain a graph query statement template includes steps 201 to 203.
  • Step 201: Perform type analysis on the original statement to obtain the statement type of the original statement.
  • The input statement may be a query statement or a judgment statement.
  • For example, the original statement input by the user is the query statement "Where is A's hometown".
  • As another example, the original statement input by the user is the judgment statement "Is the hometown of A in city m?".
  • Accordingly, the statement type may include a query statement, a judgment statement, and the like.
  • Word segmentation can be performed on the original statement to obtain the words it contains, and it is then determined whether any of these words matches a preset word of one of the statement types. If a word in the original statement matches a preset word of a certain statement type, the original statement is considered to be of that statement type.
  • For example, the preset words of the query statement type may include "what", "where", and so on.
  • If the original statement contains such a preset word, for example "where" in "Where is A's hometown", the statement type of the original statement is a query statement.
  • Alternatively, the original statement can be input into a pre-trained statement type recognition model, which determines the statement type of the original statement. Determining the statement type by a model can improve the accuracy of statement type recognition.
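  • The keyword-based variant of step 201 can be sketched as follows. The word lists for the judgment type, the whitespace-style tokenizer, and the rule that types are checked in a fixed priority order are all assumptions made for this English illustration; the source only names "what" and "where" as preset words for query statements.

```python
import re

# Statement types in priority order, each with hypothetical preset words.
PRESET_WORDS = [
    ("query", {"what", "where"}),
    ("judgment", {"is", "whether"}),
]


def tokenize(statement):
    """A stand-in for word segmentation: lowercase alphabetic tokens."""
    return re.findall(r"[a-z']+", statement.lower())


def statement_type(statement):
    """Return the first statement type whose preset words occur in the
    statement, or None if no preset word matches."""
    tokens = set(tokenize(statement))
    for type_name, preset in PRESET_WORDS:
        if tokens & preset:
            return type_name
    return None


print(statement_type("Where is A's hometown"))
print(statement_type("Is the hometown of A in city m?"))
```

  • Checking the query words first is a design choice of this sketch: an English query such as "Where is ..." also contains "is", so without a priority order it would match both types.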
  • Step 202: Perform entity detection on the original statement to determine whether the original statement contains an entity matching a specified entity type.
  • Entity detection may be performed on the original statement to obtain the entities it contains, and the matching degree between each of these entities and each specified entity type may be calculated to determine whether the original statement contains an entity that matches a specified entity type.
  • The specified entity types are the entity types in the knowledge base to be queried.
  • For example, the entity types contained in a knowledge base include "person", "school", "unit", and so on.
  • If the matching degree between any entity contained in the original statement and a specified entity type exceeds a preset threshold, the original statement is considered to contain an entity matching that specified entity type. If the matching degree between every entity in the original statement and every specified entity type is less than or equal to the preset threshold, it is determined that the original statement does not contain an entity matching any specified entity type.
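  • The threshold test in step 202 can be sketched as below. The source does not say how the matching degree is computed, so this example substitutes a simple string-similarity score against a hypothetical lexicon of known surface forms per entity type; `LEXICON`, `matching_degree`, and the threshold value 0.8 are illustrative assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical lexicon: a few known surface forms per specified entity type.
LEXICON = {
    "person": ["Zhang San"],
    "school": ["School S"],
}


def matching_degree(entity, entity_type):
    """Assumed matching degree: best string similarity (0..1) between the
    detected entity and any known surface form of the entity type."""
    return max(
        SequenceMatcher(None, entity.lower(), known.lower()).ratio()
        for known in LEXICON[entity_type]
    )


def match_entity_types(entity, threshold=0.8):
    """Return the specified entity types whose matching degree with the
    entity exceeds the threshold; an empty list means no match."""
    return [t for t in LEXICON if matching_degree(entity, t) > threshold]


print(match_entity_types("Zhang San"))
print(match_entity_types("xyz"))
```

  • A learned classifier could replace `matching_degree` without changing the surrounding threshold logic.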
  • Step 203: If the original statement contains an entity matching any specified entity type, obtain a graph query statement template corresponding to the statement type and that specified entity type.
  • When the original statement contains an entity matching a specified entity type, the operation performed on the entity in the graph query statement can be determined, and the corresponding graph query statement template can then be generated based on this operation and the statement type of the original statement.
  • The attributes of the edges between entity types in the knowledge base differ, and the corresponding graph query statements differ accordingly.
  • The edge attributes may include outgoing edges (out), incoming edges (in), in&out, and the like.
  • A graph query statement template library is preset, and each entity type may have corresponding graph query statement templates. For example, suppose the knowledge base contains A → B.
  • The entity type of A then corresponds to a graph query statement template containing one entity, a template containing one entity and one edge type, a template containing two entities and one edge type, and so on.
  • The graph query statement templates corresponding to the specified entity type can be obtained first, and the template corresponding to the statement type can then be filtered out from them.
  • The preset graph query statement template library may include a large number of graph query statement templates.
  • The templates differ in the number of entities they contain, in the number of edge types they contain, or in their edge attributes (outgoing edges (out), incoming edges (in), in&out, and the like).
  • In these embodiments, type analysis is performed on the original statement to obtain its statement type, and entity detection is performed on the original statement to determine whether it contains an entity matching a specified entity type. If it does, the graph query statement template corresponding to the statement type and that specified entity type is obtained, and the graph query statement is generated according to the obtained template and the original statement.
  • As a result, the query statements are richer and more diverse, the situation in which a query cannot be performed because no template matches the original statement is avoided, and the flexibility of knowledge base queries is greatly improved, which further improves accuracy and saves labor costs and time.
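  • The two-stage lookup in a preset template library (first by entity type, then filtered by statement type) can be sketched as follows. The `Template` record, the library contents, and the `{e0}`/`{e1}` slot markers are hypothetical; only the graph name and the `has`/`out` pattern come from the example statement in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    statement_type: str  # "query" or "judgment"
    entity_type: str     # a specified entity type of the knowledge base
    text: str            # template text with hypothetical slots {e0}, {e1}


# A tiny, hypothetical template library.
LIBRARY = [
    Template("query", "name",
             "gt_nlp_asm_kg.v().has('name',{e0}).out('hometown')"),
    Template("judgment", "name",
             "gt_nlp_asm_kg.v().has('name',{e0}).out('hometown').has('name',{e1})"),
    Template("query", "school",
             "gt_nlp_asm_kg.v().has('school',{e0})"),
]


def find_templates(statement_type, entity_type):
    """First collect the templates for the entity type, then keep only
    those that correspond to the statement type."""
    by_entity = [t for t in LIBRARY if t.entity_type == entity_type]
    return [t for t in by_entity if t.statement_type == statement_type]


for template in find_templates("query", "name"):
    print(template.text)
```

  • An empty result corresponds to the case in which no template exists for the detected combination.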
  • The original statement to be queried may also include relationships between entities, that is, entity relationships.
  • In this case, the graph query statement template may be obtained in the manner shown in FIG. 3.
  • FIG. 3 is a schematic flowchart of another method for querying a knowledge base provided by some embodiments. As shown in FIG. 3, the above process of parsing the original statement to obtain a graph query statement template includes steps 301 to 304.
  • Step 301: Perform type analysis on the original statement to obtain the statement type of the original statement.
  • Step 302: Perform entity detection on the original statement to determine whether the original statement contains an entity matching a specified entity type.
  • Steps 301 to 302 are similar to steps 201 to 202 above and are not repeated here.
  • Step 303: If the original statement contains an entity matching any specified entity type, perform entity relationship detection on the original statement to determine whether it contains an entity relationship matching a specified edge type.
  • When the original statement contains an entity matching a specified entity type, entity relationship detection can be performed on the original statement.
  • The entity relationship between two entities can be obtained from the words located between the two entities in the original statement, or the entity relationship can be determined from the words located after an entity in the original statement. For example, if the original statement is "Where is A's hometown" and the entity it contains is "A", then according to the words "de" (the possessive particle in the Chinese original) and "hometown" located after A, the entity relationship contained in the original statement can be determined to be "hometown".
  • The specified edge types are the edge types contained in the knowledge base to be queried.
  • For example, the edge types contained in a knowledge base are "hometown", "graduated", "employed in", and "published".
  • If the matching degree between the entity relationship contained in the original statement and a specified edge type is greater than the preset threshold, it can be determined that the original statement contains an entity relationship matching that specified edge type. If the matching degree between the entity relationship and each specified edge type is less than the preset threshold, it can be determined that the original statement does not contain an entity relationship matching any specified edge type.
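  • Step 303 can be sketched by scanning the words that follow the detected entity and comparing each against the specified edge types. The pre-segmented token list, the string-similarity score used as the matching degree, and the threshold 0.8 are assumptions for illustration; the edge types are the ones listed in the text.

```python
from difflib import SequenceMatcher

# Edge types of the example knowledge base named in the text.
EDGE_TYPES = ["hometown", "graduated", "employed in", "published"]


def similarity(a, b):
    """Assumed matching degree between a word and an edge type (0..1)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def detect_relationship(tokens, entity, threshold=0.8):
    """Look at the words located after the entity and return the first
    specified edge type whose matching degree exceeds the threshold."""
    if entity not in tokens:
        return None
    for token in tokens[tokens.index(entity) + 1:]:
        best = max(EDGE_TYPES, key=lambda edge: similarity(token, edge))
        if similarity(token, best) > threshold:
            return best
    return None


# Hypothetical segmentation of "Where is A's hometown" in source word order.
print(detect_relationship(["A", "de", "hometown", "where"], "A"))
```

  • Words such as the particle "de" score far below the threshold against every edge type and are skipped.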
  • Step 304: If the original statement contains an entity relationship matching any specified edge type, obtain a graph query statement template corresponding to the statement type, the matched specified entity type, and the matched specified edge type.
  • At this point, the original statement contains both an entity matching a specified entity type and an entity relationship matching a specified edge type.
  • The edge attribute of the entity relationship that matches the specified edge type can then be obtained. According to the matched entity type, the matched edge type, and the edge attribute, the query operation on the entity type and the query operation on the edge type in the graph query statement template can be determined; combined with the statement type of the original statement, the graph query statement template is generated.
  • For example, for the original statement "Where is A's hometown", the operation on the entity in the graph query statement template is "has", and the edge attribute contained in the original statement is out.
  • The operation on the edge type "hometown" is therefore "out"; combined with the fact that the original statement is a query statement, the gremlin query statement template "gt_nlp_asm_kg.v().has('name',).out('hometown').order().by('stock_flag',decr).dedup()" can be generated.
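  • The assembly of a template from the determined operations can be sketched as string construction. The function `build_template` is hypothetical; the ordering suffix is copied from the example statement in the text, the mapping of the in&out attribute to a `both` step is an assumption, and only the query statement type is handled.

```python
# Assumed mapping from edge attribute to the traversal step used in the template.
EDGE_STEPS = {"out": "out", "in": "in", "in&out": "both"}


def build_template(graph_name, statement_type, entity_type,
                   edge_type=None, edge_attribute=None):
    """Assemble a gremlin query statement template whose entity slot is left
    empty, in the style of the example template in the text."""
    if statement_type != "query":
        raise NotImplementedError("this sketch only covers query statements")
    steps = [f"{graph_name}.v()", f"has('{entity_type}',)"]
    if edge_type is not None:
        steps.append(f"{EDGE_STEPS[edge_attribute]}('{edge_type}')")
    # Ordering and de-duplication suffix taken from the example statement.
    steps.append("order().by('stock_flag',decr)")
    steps.append("dedup()")
    return ".".join(steps)


print(build_template("gt_nlp_asm_kg", "query", "name", "hometown", "out"))
```

  • With these inputs the function reproduces the example template, with the entity slot after `'name',` still unfilled.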
  • A neural network model can also be used to obtain the graph query statement template.
  • Specifically, the original statement can be input into a pre-trained neural network model to obtain the operations on entity types, the edge types, and the corresponding edge attributes; combined with the statement type of the original statement, the graph query statement template is generated. It can be understood that the edge attribute corresponds to the operation on the edge type in the graph query statement template.
  • In these embodiments, when the original statement contains an entity matching a specified entity type, entity relationship detection can also be performed on the original statement to determine whether it contains an entity relationship matching a specified edge type. If it does, the graph query statement template corresponding to the statement type, the matched entity type, and the matched edge type is obtained. Therefore, when the original statement contains entity relationships, a corresponding graph query statement template can still be obtained, so that the query statements are richer and more diverse, and the flexibility and accuracy of knowledge base queries are improved.
  • The graph query statement template contains entities to be filled; the graph query statement can be generated in the manner shown in FIG. 4, and the knowledge base can then be queried by using the graph query statement.
  • FIG. 4 is a schematic flowchart of another method for querying a knowledge base according to some embodiments.
  • As shown in FIG. 4, the method for querying a knowledge base includes steps 401 to 405.
  • Step 401: Obtain the original statement to be queried.
  • Step 402: Parse the original statement to obtain a graph query statement template corresponding to the original statement.
  • For steps 401 to 402, reference may be made to the above embodiments; details are not repeated here.
  • Step 403: Perform entity extraction on the original statement to obtain an entity set corresponding to the original statement.
  • Entity extraction can be performed using rule- and dictionary-based methods, machine learning methods, or the like, to obtain the entities contained in the original statement, which form the entity set corresponding to the original statement. That is, the entity set includes the entities contained in the original statement.
  • Step 404: Use entities in the entity set to fill the graph query statement template to generate a graph query statement.
  • Because the graph query statement template contains entities to be filled, the entities in the entity set corresponding to the original statement can be used to fill the template, thereby generating a graph query statement.
  • Specifically, an entity matching the type of the entity to be filled in the template may be determined from the entity set and filled into the template to generate the graph query statement.
  • Filling with an entity whose type matches the entity to be filled improves the accuracy of the graph query statement, and thus the accuracy of the subsequent knowledge base query.
  • For example, the entity "A" in the entity set, which matches the type "name" of the entity to be filled, can be filled into the graph query statement template to generate the graph query statement "gt_nlp_asm_kg.v().has('name','A').out('hometown').order().by('stock_flag',decr).dedup()".
  • More generally, the entities in the entity set that match the type of each entity to be filled in the template can be obtained and filled into the positions of the corresponding entities to be filled.
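  • Type-based filling can be sketched with a regular expression that finds each empty `has('<type>',)` slot and inserts the entity of the matching type. The slot pattern is inferred from the example template in the text; the function `fill_template` and the type-to-entity mapping are hypothetical, and the sketch does no escaping of quotes inside entity values.

```python
import re

# An empty slot looks like has('name',) in the example template.
SLOT = re.compile(r"has\('(?P<type>[^']+)',\)")


def fill_template(template, entities):
    """Fill every empty slot with the entity whose type matches the slot.

    `entities` maps an entity type to the entity extracted from the
    original statement; a missing type raises KeyError.
    """
    def replace(match):
        entity_type = match.group("type")
        return f"has('{entity_type}','{entities[entity_type]}')"

    return SLOT.sub(replace, template)


template = ("gt_nlp_asm_kg.v().has('name',).out('hometown')"
            ".order().by('stock_flag',decr).dedup()")
print(fill_template(template, {"name": "A"}))
```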
  • The entities extracted from the original statement may be inconsistent with how those entities are represented in the knowledge base.
  • Therefore, each entity in the entity set can be linked with the entities in the knowledge base to determine the knowledge base entity corresponding to it; for ease of distinction, these are called target entities.
  • In this way, the standard representation in the knowledge base of each entity extracted from the original statement is determined.
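  • Entity linking can be sketched as a lookup from normalized mentions to the standard representation used in the knowledge base. The alias table and the normalization rule (lowercasing and collapsing whitespace) are assumptions; the source does not describe how linking is performed.

```python
# Hypothetical alias table: normalized mention -> target entity in the knowledge base.
ALIASES = {
    "zhang san": "Zhang San",
    "old zhang": "Zhang San",
    "city m": "City M",
}


def normalize(mention):
    """Lowercase the mention and collapse runs of whitespace."""
    return " ".join(mention.lower().split())


def link_entities(entity_set):
    """Map each extracted entity to its target entity; mentions without a
    known target entity are left out of the result."""
    linked = {}
    for mention in entity_set:
        target = ALIASES.get(normalize(mention))
        if target is not None:
            linked[mention] = target
    return linked


print(link_entities(["Old  Zhang", "city m", "unknown"]))
```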
  • Step 405: Query the knowledge base by using the graph query statement to obtain the query result corresponding to the original statement.
  • In these embodiments, the graph query statement template contains entities to be filled.
  • When the graph query statement is generated according to the original statement and the template, entity extraction can be performed on the original statement to obtain the corresponding entity set, and the entities in the entity set are used to fill the template to generate the graph query statement. Because the graph query statement is generated by filling the template with entities from the original statement, the query does not depend on a template of the statement to be queried, which improves the flexibility of knowledge base queries.
  • When entities in the entity set are used to fill the graph query statement template, the graph query statement can also be generated as follows.
  • Each entity in the entity set may be used to fill the entities to be filled in the template, generating multiple candidate graph query statements.
  • The matching degree between each candidate graph query statement and the original statement can then be calculated, and a graph query statement is selected from the candidates according to these matching degrees.
  • For example, the candidate with the highest matching degree can be used as the graph query statement corresponding to the original statement, or several candidates with the highest matching degrees can be selected, and so on.
  • For example, if the entity set corresponding to the original statement contains 3 entities and the number of entities to be filled in the template is 2, the 3 entities can be used in turn to fill the 2 positions, yielding 6 candidate graph query statements (3 × 2 ordered fillings).
  • The candidate with the highest matching degree can then be selected from the 6 candidates as the graph query statement of the original statement.
  • Alternatively, the entities in the entity set can be arranged according to the number of entities to be filled in the template, the matching degree between each arrangement and the original statement calculated, the target arrangement with the highest matching degree selected, and its entities filled in sequence into the positions to be filled in the template, to obtain the graph query statement.
  • For example, if the entity set contains 3 entities and the number of entities to be filled in the template is 2, two entities can be drawn from the entity set and arranged to obtain the arrangement results, and the matching degree between each arrangement result and the original statement can be calculated.
  • The arrangement with the highest matching degree is then selected and its entities are filled into the corresponding positions: the first entity in the arrangement fills the first position to be filled in the template, and the second entity fills the second position.
  • In these embodiments, the template can be filled according to each entity in the entity set to obtain multiple candidate graph query statements, and the graph query statement is selected from the candidates according to the matching degree between each candidate and the original statement, so that the generated graph query statement matches the original statement closely, which improves the accuracy of knowledge base queries.
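  • Candidate generation by arrangement and selection by matching degree can be sketched with `itertools.permutations`. The two-slot template, the sample statement, and in particular `toy_score` are hypothetical: the source does not define the matching degree, so the sketch simply prefers arrangements that follow the order of appearance in the original statement and, among those, entities that appear earlier.

```python
from itertools import permutations


def candidate_statements(template, entities, slots):
    """Fill the template's slots with every ordered arrangement of entities."""
    return {arr: template.format(*arr) for arr in permutations(entities, slots)}


def toy_score(statement, arrangement):
    """Assumed matching degree: (number of adjacent pairs in statement
    order, negative sum of positions); larger tuples are better."""
    positions = [statement.find(entity) for entity in arrangement]
    ordered = sum(1 for left, right in zip(positions, positions[1:]) if left < right)
    return (ordered, -sum(positions))


def best_statement(template, statement, entities, slots, score=toy_score):
    """Return the candidate whose arrangement has the highest matching degree."""
    candidates = candidate_statements(template, entities, slots)
    best = max(candidates, key=lambda arr: score(statement, arr))
    return candidates[best]


statement = "Did A graduate from school s in city m"
entities = ["city m", "school s", "A"]
template = "gt_nlp_asm_kg.v().has('name','{0}').out('graduated').has('name','{1}')"

print(len(candidate_statements(template, entities, 2)))
print(best_statement(template, statement, entities, 2))
```

  • Passing the scoring function as a parameter keeps the arrangement logic independent of whichever matching degree a real system uses.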
  • FIG. 5 is a schematic diagram of a knowledge base query provided by some embodiments.
  • As shown in FIG. 5, three classification tasks are performed. The first is the classification of statement types, which determines whether the original statement is a query or a judgment.
  • The second classification task is the classification of attribute conditions, with four categories in total: has, in, out, and in&out (an "empty" category can also be added). Here has represents an attribute of the entity, while in, out, and in&out are edge attributes. This task determines which attribute conditions are used in the original statement.
  • In the third classification task, the number of classes equals the number of entity types and edge types contained in the knowledge base.
  • Through this task, the entity types and edge types in the knowledge base that correspond to the entities and entity relationships contained in the original statement are determined.
  • For example, RoBERTa is used to encode the original statement "Where is A's hometown" into an encoding vector, and the encoding vector is processed to determine that the statement type is a query statement, that the attribute conditions used in the original statement are "has" and "out", and that, among the 6 specified entity types and edge types, the entity type "name" matches the entity in the original statement and the edge type "hometown" matches the entity relationship in the original statement.
  • COND_JUD_OP processes the corresponding vector to obtain the statement type of the original statement, here a query statement.
  • ASK_CHOSE_OP fuses the results of the second and third classification tasks; specifically, it determines the entity and its attribute condition, and the edge type and its edge attribute. The gremlin query statement template is then obtained according to the entity and its attribute condition, the edge type and its edge attribute, and the statement type of the original statement (query statement).
  • After the gremlin query statement template is obtained, the template and the original statement are decoded: entity extraction is performed on the original statement, the extracted entities are arranged, and the arrangement with the highest matching degree with the original statement is determined.
  • The template is combined with that arrangement, that is, the entities in the arrangement are filled into the corresponding positions in the template, to generate the gremlin query statement "gt_nlp_asm_kg.v().has('name','A').out('hometown').order().by('stock_flag',decr).dedup()".
  • FIG. 6 is a schematic structural diagram of an apparatus for querying a knowledge base according to some embodiments.
  • the query apparatus 600 of the knowledge base includes: a first acquisition module 610 , a second acquisition module 620 , a generation module 630 and a query module 640 .
  • the first obtaining module 610 is used to obtain the original statement to be queried
  • the second obtaining module 620 is configured to perform parsing processing on the original statement to obtain a graph query statement template corresponding to the original statement;
  • a generating module 630 configured to generate a graph query statement according to the original statement and the graph query statement template
  • the query module 640 is configured to query the knowledge base by using a graph query statement to obtain query results corresponding to the original statement.
  • the second obtaining module 620 includes:
  • the parsing unit is used to perform type analysis on the original statement to obtain the statement type of the original statement;
  • the determination unit is used to perform entity detection on the original statement to determine whether the original statement contains an entity matching the specified entity type
  • the acquiring unit is used to acquire a graph query statement template corresponding to the statement type and any specified entity type when the original statement contains an entity matching any specified entity type.
  • when the original statement contains an entity relationship matching any specified edge type, the graph query statement template corresponding to the statement type, the matched entity type, and that edge type is obtained.
  • the graph query statement template contains entities to be filled
  • the generating module 630 includes:
  • the entity extraction unit is used to perform entity extraction on the original statement to obtain the entity set corresponding to the original statement;
  • the generating unit is used to perform entity filling on the graph query statement template by using the entities in the entity set to generate the graph query statement.
  • a graph query sentence is selected from multiple candidate graph query sentences.
  • the device may also include:
  • a determination module used for determining each target entity in the knowledge base corresponding to each entity in the entity set
  • the generating unit is further configured to perform entity filling on the graph query statement template according to each target entity, so as to generate a graph query statement.
  • the knowledge base query apparatus acquires the original statement to be queried, parses it to obtain the corresponding graph query statement template, generates a graph query statement according to the original statement and the template, and queries the knowledge base with the graph query statement to obtain the query result corresponding to the original statement. Compared with template-based knowledge base query methods, the generated query statements are richer and more varied, which improves the flexibility and accuracy of knowledge base queries and saves labor cost and time.
  • some embodiments further provide a computer device, including a processor and a memory;
  • the processor executes the program corresponding to the executable program code by reading the executable program code stored in the memory, so as to implement the method for querying the knowledge base according to the above embodiment.
  • some embodiments further provide a non-transitory computer-readable storage medium on which a computer program is stored, and when the program is executed by a processor, implements the method for querying a knowledge base as described in the above embodiments.
  • some embodiments further provide a computer program product, including a computer program, wherein when the computer program is executed by a processor, the method for querying the knowledge base described in the embodiments of the above aspect is implemented.
  • first and second are only used for descriptive purposes, and cannot be understood as indicating or implying relative importance or implying the number of indicated technical features. Thus, a feature delimited with “first”, “second” may expressly or implicitly include at least one of that feature.
  • plurality means at least two, such as two, three, etc., unless expressly and specifically defined otherwise.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

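A knowledge base query method, apparatus, computer device, and storage medium. The method includes: acquiring an original statement to be queried (101); parsing the original statement to obtain a graph query statement template corresponding to the original statement (102); generating a graph query statement according to the original statement and the graph query statement template (103); and querying a knowledge base with the graph query statement to obtain a query result corresponding to the original statement (104).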

Description

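Knowledge base query method, apparatus, computer device, and storage medium
Cross-Reference to Related Application
This application is based on, and claims priority to, Chinese patent application No. 202110076011.8, filed on January 20, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of natural language processing, and in particular to a knowledge base query method, apparatus, computer device, and storage medium.
Background
A knowledge base is a structured, easy-to-operate, easy-to-use, comprehensive, and organized cluster of knowledge in knowledge engineering. In the related art, a knowledge base is usually queried on the basis of template classification: the statement to be queried is matched against templates, and the query result is obtained from the matched template.
However, the result of a template-classification query depends entirely on the templates. If no template matches the statement to be queried, the query cannot be performed. In addition, algorithm engineers must design a large number of mappings from natural language to templates and keep extending the templates, which incurs high labor cost and takes a long time.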
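Summary
This application proposes a knowledge base query method, apparatus, computer device, and storage medium.
An embodiment of one aspect of this application proposes a knowledge base query method, including:
acquiring an original statement to be queried;
parsing the original statement to obtain a graph query statement template corresponding to the original statement;
generating a graph query statement according to the original statement and the graph query statement template; and
querying a knowledge base with the graph query statement to obtain a query result corresponding to the original statement.
An embodiment of another aspect of this application proposes a knowledge base query apparatus, including:
a first acquisition module, configured to acquire an original statement to be queried;
a second acquisition module, configured to parse the original statement to obtain a graph query statement template corresponding to the original statement;
a generation module, configured to generate a graph query statement according to the original statement and the graph query statement template; and
a query module, configured to query a knowledge base with the graph query statement to obtain a query result corresponding to the original statement.
An embodiment of another aspect of this application proposes a computer device, including a processor and a memory;
wherein the processor reads executable program code stored in the memory and runs the program corresponding to that code, so as to implement the knowledge base query method of the embodiment of the above aspect.
An embodiment of another aspect of this application proposes a non-transitory computer-readable storage medium storing a computer program which, when executed by a processor, implements the knowledge base query method of the embodiment of the above aspect.
An embodiment of another aspect of this application proposes a computer program product including a computer program which, when executed by a processor, implements the knowledge base query method of the embodiment of the above aspect.
Additional aspects and advantages of this application will be set forth in part in the following description, and in part will become apparent from that description or be learned through practice of this application.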
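Brief Description of the Drawings
The above and/or additional aspects and advantages of this application will become apparent and easy to understand from the following description of the embodiments with reference to the drawings, in which:
FIG. 1 is a schematic flowchart of a knowledge base query method provided by one or more embodiments;
FIG. 2 is a schematic flowchart of another knowledge base query method provided by one or more embodiments;
FIG. 3 is a schematic flowchart of another knowledge base query method provided by one or more embodiments;
FIG. 4 is a schematic flowchart of another knowledge base query method provided by one or more embodiments;
FIG. 5 is a schematic diagram of a knowledge base query provided by one or more embodiments;
FIG. 6 is a schematic structural diagram of a knowledge base query apparatus provided by one or more embodiments.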
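Detailed Description
Embodiments of this application are described in detail below. Examples of the embodiments are shown in the drawings, in which the same or similar reference numerals denote the same or similar elements, or elements with the same or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary; they are intended to explain this application and should not be construed as limiting it.
The knowledge base query method, apparatus, computer device, and storage medium of the embodiments of this application are described below with reference to the drawings.
FIG. 1 is a schematic flowchart of a knowledge base query method provided by an embodiment of this application.
The knowledge base query method of the embodiments of this application may be executed by the knowledge base query apparatus provided by the embodiments of this application. The apparatus may be configured in a computer device so that the knowledge base is queried with graph query statements.
As shown in FIG. 1, the knowledge base query method includes steps 101 to 104.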
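Step 101: acquire an original statement to be queried.
In some embodiments, the user may type the statement to be queried, so that the computer device acquires the original statement. Alternatively, the user may speak the statement; after collecting the voice data, the computer device performs speech recognition to obtain the original statement.
Step 102: parse the original statement to obtain a graph query statement template corresponding to the original statement.
After the original statement is acquired, it can be parsed to obtain the corresponding graph query statement template. A knowledge base contains entities and the relationships between entities (entity relationships), where an entity is an abstraction of an objective individual, such as a person name, a place name, or an organization name. Parsing the original statement therefore yields the entities and entity relationships it contains, and the graph query statement template is obtained from them.
Optionally, the original statement may be processed with a preset neural network model, and the graph query statement template obtained from the processing result.
The graph query statement template contains the prescribed language that a graph query statement needs in order to query the knowledge base. For example, the template may be a gremlin query statement template; gremlin is a query language for graph databases that can be used to traverse property graphs.
For example, the original statement is "A的家乡在什么地方" ("Where is A's hometown", where the letter A stands for a specific person's name or alias, such as 张三). The preset neural network model processes the original statement and determines that its statement type is a query statement, that the entity "A" matches the entity type "name" in the knowledge base with the attribute condition has, and that the entity relationship "家乡" (hometown) matches the edge type "家乡" in the knowledge base with the edge attribute out. Then, from the entity type "name" with its attribute condition (has), the edge type "家乡" with its edge attribute (out), and the query statement type, the gremlin query statement template "gt_nlp_asm_kg.v().has('name',).out('家乡').order().by('stock_flag',decr).dedup()" is generated.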
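Step 103: generate a graph query statement according to the original statement and the graph query statement template.
In some embodiments, because the graph query statement template does not contain the entities needed for the knowledge base query, the template can be filled according to the original statement to generate the graph query statement.
Step 104: query the knowledge base with the graph query statement to obtain the query result corresponding to the original statement.
After the graph query statement is obtained, it can be used to query the knowledge base and obtain the query result corresponding to the original statement. For example, the entity contained in the graph query statement is located in the knowledge base, and the query result is found among the entities that this entity points to.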
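The lookup described in step 104 (locate the entity, then follow its outgoing edges) can be sketched on a toy in-memory graph. This is only an illustration under stated assumptions: the dictionary layout, the helper name `follow_out_edges`, and the entry for B are hypothetical and do not come from the application; a real deployment would send the gremlin statement to a graph database instead.

```python
# Toy knowledge base: entity name -> {edge type -> list of target entities}.
# The layout and the sample data are illustrative assumptions.
KNOWLEDGE_BASE = {
    "A": {"家乡": ["m市"]},
    "B": {"毕业于": ["X大学"]},
}


def follow_out_edges(kb, entity_name, edge_type):
    """Find the named entity, then return the entities its out-edges of the given type point to."""
    return list(kb.get(entity_name, {}).get(edge_type, []))


result = follow_out_edges(KNOWLEDGE_BASE, "A", "家乡")
print(result)
```

An unknown entity or edge type simply yields an empty list in this sketch.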
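In the knowledge base query method provided by some embodiments, the original statement to be queried is acquired and parsed to obtain the corresponding graph query statement template, a graph query statement is generated according to the original statement and the template, and the knowledge base is queried with the graph query statement to obtain the query result corresponding to the original statement. Because the graph query statement is generated from the original statement and the template, the query statements are richer and more varied than in template-based knowledge base query methods, which improves the flexibility and accuracy of knowledge base queries and saves labor cost and time.
In some embodiments, the graph query statement template can be obtained in the manner shown in FIG. 2. FIG. 2 is a schematic flowchart of another knowledge base query method provided by some embodiments.
As shown in FIG. 2, parsing the original statement to obtain the graph query statement template includes steps 201 to 203.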
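Step 201: perform type parsing on the original statement to obtain its statement type.
In practice, a statement entered by a user querying the knowledge base may be a query statement or a judgment statement. For example, the user may enter the query statement "A的家乡在什么地方" ("Where is A's hometown"), or the judgment statement "A的家乡是m市吗" ("Is A's hometown m City?").
Because different statement types correspond to different graph query statement templates, type parsing can be performed on the original statement after it is acquired to obtain its statement type. Statement types may include query statements, judgment statements, and so on.
As one implementation, word segmentation is performed on the original statement to obtain the tokens it contains, and the tokens are checked against the preset tokens of each statement type. If the original statement contains a token that matches a statement type, the original statement is considered to be of that statement type.
For example, the original statement is "B毕业于哪所学校" ("Which school did B graduate from"), and the preset tokens of a query statement may include "什么" (what), "哪" (which), and "哪里" (where). The analysis finds that the original statement contains the preset query token "哪", so its statement type is considered to be a query statement.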
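The keyword-based implementation above might look like the following minimal sketch. Two things are assumptions rather than part of the application: substring matching stands in for real word segmentation, and the judgment token "吗" is a hypothetical choice (the text lists preset tokens only for query statements).

```python
QUERY_TOKENS = ("什么", "哪", "哪里")  # preset query tokens named in the text
JUDGMENT_TOKENS = ("吗",)              # hypothetical preset token for judgment statements


def classify_statement(statement: str) -> str:
    """Return 'judgment', 'query', or 'unknown' based on preset tokens.

    Substring checks replace word segmentation here, purely for brevity.
    """
    if any(token in statement for token in JUDGMENT_TOKENS):
        return "judgment"
    if any(token in statement for token in QUERY_TOKENS):
        return "query"
    return "unknown"


print(classify_statement("B毕业于哪所学校"))
print(classify_statement("A的家乡是m市吗"))
```

Judgment tokens are checked first in this sketch so that a sentence-final "吗" decides the type.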
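As another possible implementation, the original statement is fed into a pre-trained statement type recognition model, which determines its statement type. Determining the statement type with a model can improve the accuracy of statement type recognition.
Step 202: perform entity detection on the original statement to determine whether it contains an entity matching a specified entity type.
In some embodiments, entity detection is performed on the original statement to obtain the entities it contains, and the matching degree between each of these entities and each specified entity type is computed to determine whether the original statement contains an entity matching a specified entity type. The specified entity types are the entity types in the knowledge base to be queried. For example, a knowledge base may contain the entity types "人物" (person), "学校" (school), and "单位" (organization).
If the matching degree between any entity in the original statement and a specified entity type exceeds a preset threshold, the original statement is determined to contain an entity matching a specified entity type. If the matching degree between the entities in the original statement and every specified entity type is less than or equal to the preset threshold, the original statement is determined not to contain an entity matching a specified entity type.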
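The threshold test in step 202 can be illustrated as follows. The application does not say how the matching degree is computed, so the sketch takes the scoring function as a parameter; `toy_score`, its score table, and the default threshold of 0.5 are all hypothetical.

```python
def matched_entity_types(entities, entity_types, score, threshold=0.5):
    """Return (entity, entity_type) pairs whose matching degree exceeds the threshold."""
    return [
        (entity, entity_type)
        for entity in entities
        for entity_type in entity_types
        if score(entity, entity_type) > threshold
    ]


# Hypothetical matching degrees, invented for illustration only.
TOY_SCORES = {("A", "人物"): 0.9, ("A", "学校"): 0.1}


def toy_score(entity, entity_type):
    return TOY_SCORES.get((entity, entity_type), 0.0)


pairs = matched_entity_types(["A"], ["人物", "学校", "单位"], toy_score)
contains_match = bool(pairs)  # True when the statement has at least one matching entity
print(pairs, contains_match)
```

A score exactly equal to the threshold is treated as "no match", mirroring the "less than or equal to" wording above.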
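Step 203: when the original statement contains an entity matching any specified entity type, obtain the graph query statement template corresponding to the statement type and that specified entity type.
As one possible implementation, if the original statement contains an entity matching a specified entity type, the statement is known to contain an entity, so the operation that the graph query statement performs on the entity can be determined, and the corresponding graph query statement template can be generated from that operation and the statement type of the original statement.
Taking a gremlin graph query template as an example, if the original statement contains an entity, the operation on that entity is "has" (in other words, the attribute conditions used by the original statement include "has"), and the corresponding graph query statement template can be generated from "has" and the statement type of the original statement.
In practice, different attributes of the edges between entity types in the knowledge base lead to different graph query statements. Edge attributes may include outgoing edge (out), incoming edge (in), in&out, and so on. As another possible implementation, a graph query statement template library is preset, and each entity type may have its own graph query statement templates. For example, if the knowledge base contains A→B, the entity type of A may have a template containing one entity, a template containing one entity and one edge type, a template containing two entities and one edge type, and so on.
When the original statement contains an entity matching any specified entity type, the graph query statement templates corresponding to that entity type can be obtained first, and the template corresponding to the statement type then selected from them.
As yet another possible implementation, a graph query statement template library is preset that may contain a large number of graph query statement templates, which differ in the number of entities they contain, in the number of edge types they contain, or in their edge attributes. Edge attributes may include outgoing edge (out), incoming edge (in), in&out, and so on.
To obtain the template, the templates matching the number of entities and the number of entity relationships in the original statement are retrieved; the edge attribute of each entity relationship is determined from the entity relationships in the original statement, and the templates matching that edge attribute are selected; finally, the template corresponding to the statement type of the original statement is selected.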
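The three-stage filtering of a preset template library described above can be sketched like this. The `Template` record, the library contents, and the pattern strings are hypothetical examples; only the filtering order (counts, then edge attribute, then statement type) follows the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    statement_type: str   # "query" or "judgment"
    entity_count: int     # number of entities to be filled
    edge_count: int       # number of edge types
    edge_attribute: str   # "out", "in", "in&out", or "" when there is no edge
    pattern: str          # template text with named placeholders


# Hypothetical library entries, invented for illustration.
LIBRARY = [
    Template("query", 1, 0, "", "g.V().has('{etype}','{e0}')"),
    Template("query", 1, 1, "out", "g.V().has('{etype}','{e0}').out('{edge}')"),
    Template("query", 1, 1, "in", "g.V().has('{etype}','{e0}').in('{edge}')"),
    Template("judgment", 2, 1, "out",
             "g.V().has('{etype}','{e0}').out('{edge}').has('name','{e1}').hasNext()"),
]


def select_templates(library, statement_type, entity_count, edge_count, edge_attribute):
    """Filter by entity/edge counts, then by edge attribute, then by statement type."""
    by_counts = [t for t in library
                 if t.entity_count == entity_count and t.edge_count == edge_count]
    by_attribute = [t for t in by_counts if t.edge_attribute == edge_attribute]
    return [t for t in by_attribute if t.statement_type == statement_type]


selected = select_templates(LIBRARY, "query", 1, 1, "out")
print(selected[0].pattern)
```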
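In the knowledge base query method provided by some embodiments, the graph query statement template is obtained by performing type parsing on the original statement to obtain its statement type, performing entity detection on the original statement to determine whether it contains an entity matching a specified entity type, and, when it does, obtaining the graph query statement template corresponding to the statement type and that specified entity type. The graph query statement is then generated from the obtained template and the original statement. Querying the knowledge base with this graph query statement gives richer and more varied query statements than template-based knowledge base query methods, avoids the case where no query is possible because no template matches the original statement, improves the flexibility and accuracy of knowledge base queries, and saves labor cost and time.
In practice, the original statement to be queried may also contain relationships between entities, that is, entity relationships. On this basis, in some embodiments the graph query statement template can be obtained in the manner shown in FIG. 3. FIG. 3 is a schematic flowchart of another knowledge base query method provided by some embodiments. As shown in FIG. 3, parsing the original statement to obtain the graph query statement template includes steps 301 to 304.
Step 301: perform type parsing on the original statement to obtain its statement type.
Step 302: perform entity detection on the original statement to determine whether it contains an entity matching a specified entity type.
In some embodiments, steps 301 and 302 are similar to steps 201 and 202 above and are not described again here.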
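Step 303: when the original statement contains an entity matching any specified entity type, perform entity relationship detection on the original statement to determine whether it contains an entity relationship matching a specified edge type.
When the original statement contains an entity matching any specified entity type, entity relationship detection can be performed on it. For example, the entity relationship between two entities can be obtained from the tokens between them, or the entity relationship can be determined from the tokens that follow an entity. For example, the original statement is "A的家乡在什么地方" and the entity it contains is "A"; from the tokens "的" and "家乡" that follow A, the entity relationship contained in the original statement is determined to be "家乡" (hometown).
After the entity relationships in the original statement are obtained, the matching degree between each of them and each specified edge type is computed. The specified edge types are the edge types contained in the knowledge base to be queried. For example, a knowledge base may contain the edge types "家乡" (hometown), "毕业于" (graduated from), "就职于" (employed at), and "发表" (published).
If the matching degree between an entity relationship in the original statement and a specified edge type exceeds a preset threshold, the original statement is determined to contain an entity relationship matching a specified edge type. If the matching degree between the entity relationships in the original statement and every specified edge type is less than or equal to the preset threshold, the original statement is determined not to contain an entity relationship matching a specified edge type.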
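The "tokens that follow the entity" heuristic in step 303 can be sketched as below. It is a simplification under stated assumptions: an exact prefix comparison stands in for the thresholded matching degree, and the function name `detect_relation` is hypothetical. The edge types are the four listed in the text.

```python
EDGE_TYPES = ("家乡", "毕业于", "就职于", "发表")  # edge types named in the text


def detect_relation(statement, entity, edge_types=EDGE_TYPES):
    """Return the edge type that directly follows the entity, or None.

    The possessive particle "的" right after the entity is skipped, as in
    the example "A的家乡在什么地方".
    """
    start = statement.find(entity)
    if start < 0:
        return None
    tail = statement[start + len(entity):]
    if tail.startswith("的"):
        tail = tail[1:]
    for edge_type in edge_types:
        if tail.startswith(edge_type):
            return edge_type
    return None


print(detect_relation("A的家乡在什么地方", "A"))
print(detect_relation("B毕业于哪所学校", "B"))
```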
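Step 304: when the original statement contains an entity relationship matching any specified edge type, obtain the graph query statement template corresponding to the statement type, the matched entity type, and that specified edge type.
When the original statement contains both an entity matching a specified entity type and an entity relationship matching a specified edge type, the template has to cover both the entity and the entity relationship.
In some embodiments, the original statement is parsed to obtain the edge attribute of the entity relationship that matches a specified edge type. From the matched entity type, the matched edge type, and the edge attribute, the query operation on the entity type and the query operation on the edge type in the graph query statement template are determined, and the template is then generated in combination with the statement type of the original statement.
For example, the original statement is "A的家乡在什么地方". The entity "A" it contains matches the specified entity type "name", and the entity relationship "家乡" it contains matches the specified edge type "家乡". Because the original statement contains an entity matching a specified entity type, the template contains the operation "has" on the entity; because the edge attribute in the original statement is outgoing edge (out), the template contains the operation "out" on the edge type "家乡". Given that the original statement is a query statement, the gremlin query statement template "gt_nlp_asm_kg.v().has('name',).out('家乡').order().by('stock_flag',decr).dedup()" can be generated.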
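As a rough sketch of how the pieces in this example could be assembled into the template string, consider the helper below. The function `build_template` and its parameters are hypothetical. The application does not say how the template differs for judgment statements, so omitting the ordering suffix for them is purely a placeholder assumption; only the query-statement output is taken from the text.

```python
def build_template(statement_type, entity_type, edge_type=None,
                   edge_attribute=None, graph="gt_nlp_asm_kg"):
    """Assemble a gremlin-style template with an unfilled entity slot."""
    parts = [f"{graph}.v()", f".has('{entity_type}',)"]  # 'has' operation on the entity type
    if edge_type and edge_attribute:
        # The edge attribute (out / in) becomes the operation on the edge type.
        parts.append(f".{edge_attribute}('{edge_type}')")
    if statement_type == "query":
        # Suffix copied from the example in the text; assumed to apply to query statements only.
        parts.append(".order().by('stock_flag',decr).dedup()")
    return "".join(parts)


template = build_template("query", "name", edge_type="家乡", edge_attribute="out")
print(template)
```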
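Optionally, a neural network model can also be used to obtain the graph query statement template. Specifically, the original statement is fed into a pre-trained neural network model to obtain the operation on the entity type, and the edge type with its edge attribute; the template is then generated in combination with the statement type of the original statement. The edge attribute corresponds to the operation on the edge type in the template.
In the knowledge base query method provided by some embodiments, when obtaining the graph query statement template corresponding to the statement type and a specified entity type, entity relationship detection can additionally be performed on the original statement to determine whether it contains an entity relationship matching a specified edge type; when it does, the template corresponding to the statement type, the matched entity type, and that edge type is obtained. A template can thus also be obtained when the original statement contains an entity relationship, which makes the query statements richer and more varied and improves the flexibility and accuracy of knowledge base queries.
In some embodiments, the graph query statement template contains entities to be filled, and the graph query statement can be generated in the manner shown in FIG. 4 and used to query the knowledge base. FIG. 4 is a schematic flowchart of another knowledge base query method provided by some embodiments.
As shown in FIG. 4, the knowledge base query method includes steps 401 to 405.
Step 401: acquire an original statement to be queried.
Step 402: parse the original statement to obtain a graph query statement template corresponding to the original statement.
In some embodiments, steps 401 and 402 are as described in the embodiments above and are not described again here.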
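Step 403: perform entity extraction on the original statement to obtain the entity set corresponding to the original statement.
After the graph query statement template is obtained, entity extraction can be performed on the original statement with a rule- and dictionary-based method, a machine learning method, or the like, to obtain the entities it contains; these entities form the entity set corresponding to the original statement.
Step 404: fill the graph query statement template with entities from the entity set to generate the graph query statement.
Because the graph query statement template contains entities to be filled, the graph query statement is generated by filling the template with entities from the entity set corresponding to the original statement.
In some embodiments, the entities whose types match the types of the entities to be filled in the template are identified in the entity set and filled into the template to generate the graph query statement. Filling with type-matched entities improves the accuracy of the graph query statement and therefore of the subsequent knowledge base query.
For example, in the template corresponding to the original statement "A的家乡在什么地方", the type of the entity to be filled is "name". The entity "A" in the entity set, which matches the type "name", is filled into the template to generate the graph query statement "gt_nlp_asm_kg.v().has('name','A').out('家乡').order().by('stock_flag',decr).dedup()".
If there are multiple entities to be filled, the entity in the entity set that matches the type of each entity to be filled is obtained, and each matched entity is filled into the corresponding entity position in the template.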
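Type-matched filling can be sketched with plain string replacement on the template from the example. The helper `fill_by_type` and the idea of passing a type-to-entity mapping are assumptions made for illustration; the template and the expected result are the ones quoted in the text.

```python
TEMPLATE = "gt_nlp_asm_kg.v().has('name',).out('家乡').order().by('stock_flag',decr).dedup()"


def fill_by_type(template, typed_entities):
    """Fill each empty has('<type>',) slot with the entity of the same type.

    typed_entities maps an entity type to the entity extracted from the
    original statement, e.g. {"name": "A"}.
    """
    query = template
    for entity_type, entity in typed_entities.items():
        query = query.replace(f"has('{entity_type}',)",
                              f"has('{entity_type}','{entity}')")
    return query


print(fill_by_type(TEMPLATE, {"name": "A"}))
```

String splicing is used here only to mirror the text; the sketch does not escape quotes inside entity names.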
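In practice, an entity extracted from the original statement may be written differently from the corresponding entity in the knowledge base. To improve query accuracy, before the template is filled, each entity in the entity set can be linked to an entity in the knowledge base to determine the knowledge base entity that corresponds to it; for ease of distinction these are called target entities. This determines the standard representation in the knowledge base of each entity extracted from the original statement. The template is then filled with the target entities to generate the graph query statement. Querying the knowledge base with a graph query statement that uses the standard entity representations can greatly improve query accuracy.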
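One very simple way to picture the linking step is an alias table that maps surface forms to the standard name in the knowledge base. The application does not describe the linking technique, so the table-based approach, the aliases, and the helper `link_entities` are all hypothetical.

```python
# Hypothetical alias table: standard knowledge base name -> known surface forms.
ALIASES = {"张三": ["小张", "老张"]}


def link_entities(entities, aliases):
    """Map each extracted entity to its target entity (standard name).

    Entities without a known alias are returned unchanged.
    """
    lookup = {alias: canonical
              for canonical, names in aliases.items()
              for alias in names}
    return [lookup.get(entity, entity) for entity in entities]


print(link_entities(["小张", "m市"], ALIASES))
```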
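Step 405: query the knowledge base with the graph query statement to obtain the query result corresponding to the original statement.
In some embodiments, step 405 is as described in the embodiments above and is not described again here.
In the knowledge base query method provided by some embodiments, the graph query statement template contains entities to be filled. When the graph query statement is generated from the original statement and the template, entity extraction is performed on the original statement to obtain the corresponding entity set, and the template is filled with entities from that set. Because the graph query statement is produced by filling the template with entities from the original statement, it does not depend on a template for the statement to be queried, which improves the flexibility of knowledge base queries.
In some embodiments, when the template is filled with entities from the entity set, the graph query statement can also be generated as follows.
In some embodiments, each entity in the entity set is used in turn to fill the entities to be filled in the template, producing multiple candidate graph query statements. The matching degree between each candidate and the original statement is then computed, and the graph query statement is selected from the candidates according to these matching degrees.
When selecting, the candidate with the highest matching degree can be taken as the graph query statement for the original statement, or several candidates with the highest matching degrees can be taken.
For example, the entity set corresponding to the original statement contains 3 entities and the template has 2 entities to be filled. Filling the template with the 3 entities yields 6 candidate graph query statements, and the candidate with the highest matching degree to the original statement is selected from the 6 as the graph query statement.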
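Optionally, when the entity set contains multiple entities, the entities can also be arranged according to the number of entities to be filled in the template. The matching degree between each arrangement and the original statement is computed, the target arrangement with the highest matching degree is selected, and the entities in the target arrangement are filled, in order, into the entity positions to be filled in the template to obtain the graph query statement.
For example, the entity set corresponding to the original statement contains 3 entities and the template has 2 entities to be filled. Two entities are drawn from the entity set and arranged, giving $A_3^2 = 3 \times 2 = 6$ arrangements. The matching degree between each arrangement and the original statement is computed, the arrangement with the highest matching degree is selected, and its entities are filled into the corresponding positions in the template: the first entity of the arrangement goes into the first position to be filled, and the second entity into the second position to be filled. This yields the graph query statement.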
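The arrangement-based selection can be sketched with `itertools.permutations`. Everything specific here is an assumption made for illustration: the example statement, the `{}` slot marker, the template text, and the scoring rule (prefer arrangements that keep the order of appearance in the statement and sit closest to its start). The application does not define how the matching degree is computed.

```python
from itertools import permutations

# Hypothetical statement and entity set, invented for illustration only.
STATEMENT = "A的同学B的家乡是m市吗"
ENTITIES = ["A", "B", "m市"]


def order_score(arrangement):
    """Toy matching degree: keep statement order first, then prefer early positions."""
    positions = [STATEMENT.find(entity) for entity in arrangement]
    keeps_order = positions == sorted(positions)
    return (keeps_order, -sum(positions))


def best_arrangement(entities, slot_count, score):
    """Enumerate all ordered selections and return the best one plus their count."""
    arrangements = list(permutations(entities, slot_count))
    return max(arrangements, key=score), len(arrangements)


def fill_slots(template, arrangement):
    """Fill the i-th '{}' slot with the i-th entity of the arrangement."""
    query = template
    for entity in arrangement:
        query = query.replace("{}", f"'{entity}'", 1)
    return query


best, count = best_arrangement(ENTITIES, 2, order_score)
query = fill_slots("g.V().has('name',{}).out('同学').has('name',{})", best)
print(count, best, query)
```

With 3 entities and 2 slots the enumeration produces 6 arrangements, matching the count in the example above.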
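In the knowledge base query method provided by some embodiments, when the template is filled with entities from the entity set, the template can be filled with each entity in the set to obtain multiple candidate graph query statements, and the graph query statement for the original statement is selected from the candidates according to the matching degree between each candidate and the original statement. The generated graph query statement therefore matches the original statement closely, which improves the accuracy of the knowledge base query.
The knowledge base query method provided by some embodiments is described below with reference to FIG. 5, taking as an example the conversion of an original statement into a gremlin query statement with a neural network model. FIG. 5 is a schematic diagram of a knowledge base query provided by some embodiments.
As shown in FIG. 5, after the original statement is encoded, three classification tasks are performed. The first classifies the statement type, to determine whether the original statement is a query or a judgment. The second classifies the attribute conditions into 4 categories, has, in, out, and in&out (an "empty" category may also be added), where has denotes an attribute of the entity and in, out, and in&out are edge attributes; this task determines which attribute conditions the original statement uses. In the third task, the number of classes equals the number of entity types and edge types contained in the knowledge base; this task determines the entity types in the knowledge base that correspond to the entities in the original statement, and the edge types that correspond to its entity relationships.
In FIG. 5, Roberta encodes the original statement "A的家乡在什么地方" into an encoding vector. Processing the encoding vector determines that the statement type is a query statement, that the attribute conditions used in the original statement are "has" and "out", and that, among the 6 specified entity types and edge types, the entity type "name" matches the entity in the original statement and the edge type "家乡" matches its entity relationship. The three classification results are then fused to obtain the gremlin query statement template "gt_nlp_asm_kg.v().has('name',).out('家乡').order().by('stock_flag',decr).dedup()".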
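How the outputs of the three classification heads might be decoded into labels can be sketched as follows. This is an assumption-laden illustration: the application does not state that the heads are single-label or multi-label, how they are thresholded, or what the logits look like. The logits, the 0.5 threshold, the helper names, and the shortened label list for the third task are all hypothetical.

```python
import math

STATEMENT_TYPES = ("query", "judgment")
ATTRIBUTE_CONDITIONS = ("has", "in", "out", "in&out")
TYPE_LABELS = ("name", "家乡", "毕业于")  # hypothetical subset of entity and edge types


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def pick_one(logits, labels):
    """Single-label head: return the label with the largest logit."""
    best_index = max(range(len(logits)), key=logits.__getitem__)
    return labels[best_index]


def pick_many(logits, labels, threshold=0.5):
    """Multi-label head: keep every label whose sigmoid score exceeds the threshold."""
    return [label for logit, label in zip(logits, labels)
            if sigmoid(logit) > threshold]


# Made-up logits standing in for the outputs of the three heads.
statement_type = pick_one([1.2, -0.7], STATEMENT_TYPES)
conditions = pick_many([2.0, -3.0, 1.5, -2.5], ATTRIBUTE_CONDITIONS)
matched_types = pick_many([3.1, 2.2, -4.0], TYPE_LABELS)
print(statement_type, conditions, matched_types)
```

The three decoded results correspond to the inputs that the fusion step combines into the template.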
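In FIG. 5, "COND_JUD_OP" processes the corresponding vector and determines that the statement type of the original statement is a query statement. "ASK_CHOSE_OP" fuses the results of the second and third classification tasks: it determines the entity with its attribute condition, and the edge type with its edge attribute. The gremlin query statement template is then obtained from these together with the statement type of the original statement (here, a query statement).
After the gremlin query statement template is obtained, the template and the original statement are decoded: entities are extracted from the original statement, the entities are arranged, and the arrangement that best matches the original statement is determined. "GEN_OP" then combines the template with that arrangement, that is, the entities in the arrangement are filled into the corresponding positions in the template, generating the gremlin query statement "gt_nlp_asm_kg.v().has('name','A').out('家乡').order().by('stock_flag',decr).dedup()". The knowledge base is then queried with this statement: the node reached from the node whose name is A through an outgoing edge of type "家乡" is "m市" (m City), which is the query result for the original statement "A的家乡在什么地方".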
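To implement the above embodiments, some embodiments further propose a knowledge base query apparatus. FIG. 6 is a schematic structural diagram of a knowledge base query apparatus provided by some embodiments.
As shown in FIG. 6, the knowledge base query apparatus 600 includes a first acquisition module 610, a second acquisition module 620, a generation module 630, and a query module 640.
The first acquisition module 610 is configured to acquire an original statement to be queried;
the second acquisition module 620 is configured to parse the original statement to obtain a graph query statement template corresponding to the original statement;
the generation module 630 is configured to generate a graph query statement according to the original statement and the graph query statement template; and
the query module 640 is configured to query the knowledge base with the graph query statement to obtain the query result corresponding to the original statement.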
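The four-module structure can be pictured as a small class that chains four callables. This is a structural sketch only: the class name, field names, and the stub lambdas are hypothetical stand-ins, not the apparatus's actual implementation.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class KnowledgeBaseQueryApparatus:
    acquire: Callable[[], str]            # first acquisition module 610
    get_template: Callable[[str], str]    # second acquisition module 620
    generate: Callable[[str, str], str]   # generation module 630
    query: Callable[[str], List[str]]     # query module 640

    def run(self) -> List[str]:
        statement = self.acquire()
        template = self.get_template(statement)
        graph_query = self.generate(statement, template)
        return self.query(graph_query)


# Stub modules, invented for illustration only.
apparatus = KnowledgeBaseQueryApparatus(
    acquire=lambda: "A的家乡在什么地方",
    get_template=lambda statement: "g.V().has('name',).out('家乡')",
    generate=lambda statement, template: template.replace("('name',)", "('name','A')"),
    query=lambda graph_query: ["m市"] if "'A'" in graph_query and "家乡" in graph_query else [],
)
print(apparatus.run())
```

Keeping each module as an injected callable makes it easy to swap, say, a keyword-based parser for a neural one without touching the rest of the pipeline.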
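Optionally, the second acquisition module 620 includes:
a parsing unit, configured to perform type parsing on the original statement to obtain its statement type;
a determination unit, configured to perform entity detection on the original statement to determine whether it contains an entity matching a specified entity type; and
an acquisition unit, configured to obtain, when the original statement contains an entity matching any specified entity type, the graph query statement template corresponding to the statement type and that specified entity type.
Optionally, the acquisition unit is configured to:
perform entity relationship detection on the original statement to determine whether it contains an entity relationship matching a specified edge type; and
when the original statement contains an entity relationship matching any specified edge type, obtain the graph query statement template corresponding to the statement type, the matched entity type, and that specified edge type.
Optionally, the graph query statement template contains entities to be filled, and the generation module 630 includes:
an entity extraction unit, configured to perform entity extraction on the original statement to obtain the entity set corresponding to the original statement; and
a generation unit, configured to fill the graph query statement template with entities from the entity set to generate the graph query statement.
Optionally, the generation unit is configured to:
fill into the graph query statement template the entities in the entity set whose types match the types of the entities to be filled in the template, to generate the graph query statement.
Optionally, the generation unit is configured to:
fill the graph query statement template with each entity in the entity set to obtain multiple candidate graph query statements;
determine the matching degree between each candidate graph query statement and the original statement; and
select the graph query statement from the multiple candidate graph query statements according to the matching degree.
Optionally, the apparatus may further include:
a determination module, configured to determine the target entities in the knowledge base that correspond to the entities in the entity set;
the generation unit being further configured to fill the graph query statement template according to the target entities to generate the graph query statement.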
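It should be noted that the above explanation of the knowledge base query method embodiments also applies to the knowledge base query apparatus of this embodiment and is not repeated here.
The knowledge base query apparatus provided by some embodiments acquires the original statement to be queried, parses it to obtain the corresponding graph query statement template, generates a graph query statement according to the original statement and the template, and queries the knowledge base with the graph query statement to obtain the query result corresponding to the original statement. Because the graph query statement is generated from the original statement and the template, the query statements are richer and more varied than in template-based knowledge base query methods, which improves the flexibility and accuracy of knowledge base queries and saves labor cost and time.
To implement the above embodiments, some embodiments further propose a computer device, including a processor and a memory;
wherein the processor reads executable program code stored in the memory and runs the program corresponding to that code, so as to implement the knowledge base query method described in the above embodiments.
To implement the above embodiments, some embodiments further propose a non-transitory computer-readable storage medium storing a computer program which, when executed by a processor, implements the knowledge base query method described in the above embodiments.
To implement the above embodiments, some embodiments further propose a computer program product including a computer program which, when executed by a processor, implements the knowledge base query method described in the embodiment of the above aspect.
In the description of this specification, the terms "first" and "second" are used for descriptive purposes only and should not be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. A feature qualified by "first" or "second" may therefore explicitly or implicitly include at least one such feature. In the description of this application, "multiple" means at least two, for example two or three, unless otherwise expressly and specifically defined.
Although some embodiments of this application have been shown and described above, it should be understood that these embodiments are exemplary and should not be construed as limiting this application. Those of ordinary skill in the art may change, modify, replace, and vary the above embodiments within the scope of this application.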

Claims (17)

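  1. A knowledge base query method, comprising:
    acquiring an original statement to be queried;
    parsing the original statement to obtain a graph query statement template corresponding to the original statement;
    generating a graph query statement according to the original statement and the graph query statement template; and
    querying a knowledge base with the graph query statement to obtain a query result corresponding to the original statement.
  2. The method of claim 1, wherein parsing the original statement to obtain the graph query statement template comprises:
    performing type parsing on the original statement to obtain a statement type of the original statement;
    performing entity detection on the original statement to determine whether the original statement contains an entity matching a specified entity type; and
    when the original statement contains an entity matching any specified entity type, obtaining the graph query statement template corresponding to the statement type and said any specified entity type.
  3. The method of claim 2, wherein obtaining the graph query statement template corresponding to the statement type and said any specified entity type comprises:
    performing entity relationship detection on the original statement to determine whether the original statement contains an entity relationship matching a specified edge type; and
    when the original statement contains an entity relationship matching any specified edge type, obtaining the graph query statement template corresponding to the statement type, said any specified entity type, and said any specified edge type.
  4. The method of claim 1, wherein the graph query statement template contains entities to be filled, and generating the graph query statement according to the original statement and the graph query statement template comprises:
    performing entity extraction on the original statement to obtain an entity set corresponding to the original statement; and
    filling the graph query statement template with entities from the entity set to generate the graph query statement.
  5. The method of claim 4, wherein filling the graph query statement template with entities from the entity set to generate the graph query statement comprises:
    filling into the graph query statement template the entities in the entity set whose types match the types of the entities to be filled in the graph query statement template, to generate the graph query statement.
  6. The method of claim 4, wherein filling the graph query statement template with entities from the entity set to generate the graph query statement comprises:
    filling the graph query statement template with each entity in the entity set to obtain multiple candidate graph query statements;
    determining a matching degree between each candidate graph query statement and the original statement; and
    selecting the graph query statement from the multiple candidate graph query statements according to the matching degree.
  7. The method of claim 4, further comprising, before filling the graph query statement template with entities from the entity set to generate the graph query statement:
    determining target entities in the knowledge base that correspond to the entities in the entity set;
    wherein filling the graph query statement template with entities from the entity set to generate the graph query statement comprises:
    filling the graph query statement template according to the target entities to generate the graph query statement.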
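  8. A knowledge base query apparatus, comprising:
    a first acquisition module, configured to acquire an original statement to be queried;
    a second acquisition module, configured to parse the original statement to obtain a graph query statement template corresponding to the original statement;
    a generation module, configured to generate a graph query statement according to the original statement and the graph query statement template; and
    a query module, configured to query a knowledge base with the graph query statement to obtain a query result corresponding to the original statement.
  9. The apparatus of claim 8, wherein the second acquisition module comprises:
    a parsing unit, configured to perform type parsing on the original statement to obtain a statement type of the original statement;
    a determination unit, configured to perform entity detection on the original statement to determine whether the original statement contains an entity matching a specified entity type; and
    an acquisition unit, configured to obtain, when the original statement contains an entity matching any specified entity type, the graph query statement template corresponding to the statement type and said any specified entity type.
  10. The apparatus of claim 9, wherein the acquisition unit is further configured to:
    perform entity relationship detection on the original statement to determine whether the original statement contains an entity relationship matching a specified edge type; and
    when the original statement contains an entity relationship matching any specified edge type, obtain the graph query statement template corresponding to the statement type, said any specified entity type, and said any specified edge type.
  11. The apparatus of claim 8, wherein the graph query statement template contains entities to be filled, and the generation module comprises:
    an entity extraction unit, configured to perform entity extraction on the original statement to obtain an entity set corresponding to the original statement; and
    a generation unit, configured to fill the graph query statement template with entities from the entity set to generate the graph query statement.
  12. The apparatus of claim 11, wherein the generation unit is further configured to:
    fill into the graph query statement template the entities in the entity set whose types match the types of the entities to be filled in the graph query statement template, to generate the graph query statement.
  13. The apparatus of claim 11, wherein the generation unit is further configured to:
    fill the graph query statement template with each entity in the entity set to obtain multiple candidate graph query statements;
    determine a matching degree between each candidate graph query statement and the original statement; and
    select the graph query statement from the multiple candidate graph query statements according to the matching degree.
  14. The apparatus of claim 11, further comprising:
    a determination module, configured to determine target entities in the knowledge base that correspond to the entities in the entity set;
    wherein the generation unit is further configured to fill the graph query statement template according to the target entities to generate the graph query statement.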
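  15. A computer device, comprising:
    a processor, and
    a memory;
    wherein the processor reads executable program code stored in the memory and runs the program corresponding to the executable program code, so as to implement the following steps:
    acquiring an original statement to be queried;
    parsing the original statement to obtain a graph query statement template corresponding to the original statement;
    generating a graph query statement according to the original statement and the graph query statement template; and
    querying a knowledge base with the graph query statement to obtain a query result corresponding to the original statement.
  16. A non-transitory computer-readable storage medium storing a computer program, wherein the program, when executed by a processor, implements the following steps:
    acquiring an original statement to be queried;
    parsing the original statement to obtain a graph query statement template corresponding to the original statement;
    generating a graph query statement according to the original statement and the graph query statement template; and
    querying a knowledge base with the graph query statement to obtain a query result corresponding to the original statement.
  17. A computer program product comprising a computer program, wherein the computer program, when executed by a processor, implements the following steps:
    acquiring an original statement to be queried;
    parsing the original statement to obtain a graph query statement template corresponding to the original statement;
    generating a graph query statement according to the original statement and the graph query statement template; and
    querying a knowledge base with the graph query statement to obtain a query result corresponding to the original statement.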
PCT/CN2021/139332 2021-01-20 2021-12-17 知识库的查询方法、装置、计算机设备和存储介质 WO2022156450A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110076011.8A CN114860894A (zh) 2021-01-20 2021-01-20 知识库的查询方法、装置、计算机设备和存储介质
CN202110076011.8 2021-01-20

Publications (1)

Publication Number Publication Date
WO2022156450A1 true WO2022156450A1 (zh) 2022-07-28

Family

ID=82548445

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/139332 WO2022156450A1 (zh) 2021-01-20 2021-12-17 知识库的查询方法、装置、计算机设备和存储介质

Country Status (2)

Country Link
CN (1) CN114860894A (zh)
WO (1) WO2022156450A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117235211A (zh) * 2023-11-15 2023-12-15 暗物智能科技(广州)有限公司 一种知识问答方法及***

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104866593A (zh) * 2015-05-29 2015-08-26 中国电子科技集团公司第二十八研究所 一种基于知识图谱的数据库搜索方法
CN105868313A (zh) * 2016-03-25 2016-08-17 浙江大学 一种基于模板匹配技术的知识图谱问答***及方法
CN110765256A (zh) * 2019-12-24 2020-02-07 杭州实在智能科技有限公司 一种在线法律咨询自动回复的生成方法与设备
CN111522934A (zh) * 2020-04-23 2020-08-11 华东理工大学 一种基于化学品知识库的知识问答***和方法
CN111782763A (zh) * 2020-05-22 2020-10-16 平安科技(深圳)有限公司 基于语音语义的信息检索方法、及其相关设备
US20200356599A1 (en) * 2018-09-20 2020-11-12 Huawei Technologies Co., Ltd. Systems and methods for graph-based query analysis

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104866593A (zh) * 2015-05-29 2015-08-26 中国电子科技集团公司第二十八研究所 一种基于知识图谱的数据库搜索方法
CN105868313A (zh) * 2016-03-25 2016-08-17 浙江大学 一种基于模板匹配技术的知识图谱问答***及方法
US20200356599A1 (en) * 2018-09-20 2020-11-12 Huawei Technologies Co., Ltd. Systems and methods for graph-based query analysis
CN110765256A (zh) * 2019-12-24 2020-02-07 杭州实在智能科技有限公司 一种在线法律咨询自动回复的生成方法与设备
CN111522934A (zh) * 2020-04-23 2020-08-11 华东理工大学 一种基于化学品知识库的知识问答***和方法
CN111782763A (zh) * 2020-05-22 2020-10-16 平安科技(深圳)有限公司 基于语音语义的信息检索方法、及其相关设备

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117235211A (zh) * 2023-11-15 2023-12-15 暗物智能科技(广州)有限公司 一种知识问答方法及***
CN117235211B (zh) * 2023-11-15 2024-03-19 暗物智能科技(广州)有限公司 一种知识问答方法及***

Also Published As

Publication number Publication date
CN114860894A (zh) 2022-08-05

Similar Documents

Publication Publication Date Title
CN108804641B (zh) 一种文本相似度的计算方法、装置、设备和存储介质
CN109933785B (zh) 用于实体关联的方法、装置、设备和介质
CN108090068B (zh) 医院数据库中的表的分类方法及装置
JP6769405B2 (ja) 対話システムおよび対話方法
WO2016205286A1 (en) Automatic entity resolution with rules detection and generation system
JP2005352888A (ja) 表記揺れ対応辞書作成システム
WO2019028990A1 (zh) 代码元素的命名方法、装置、电子设备及介质
WO2022156450A1 (zh) 知识库的查询方法、装置、计算机设备和存储介质
CN111708810B (zh) 模型优化推荐方法、装置和计算机存储介质
CN113641813A (zh) 基于知识图谱的问答***、方法、电子设备及存储介质
CN107958068B (zh) 一种基于实体知识库的语言模型平滑方法
CN117171331B (zh) 基于大型语言模型的专业领域信息交互方法、装置及设备
CN114995729A (zh) 一种语音绘图方法、装置及计算机设备
WO2022022049A1 (zh) 文本长难句的压缩方法、装置、计算机设备及存储介质
CN117851575A (zh) 一种大语言模型问答优化方法、装置、电子设备及存储介质
CN116955406A (zh) Sql语句生成方法、装置、电子设备及存储介质
CN114969001B (zh) 一种数据库元数据字段匹配方法、装置、设备及介质
CN111782789A (zh) 智能问答方法与***
CN111460114A (zh) 检索方法、装置、设备及计算机可读存储介质
JP6327799B2 (ja) 自然言語推論システム、自然言語推論方法及びプログラム
CA3162733A1 (en) Extracting key value pairs using positional coordinates
CN108573025B (zh) 基于混合模板抽取句子分类特征的方法及装置
CN113468339A (zh) 基于知识图谱的标签提取方法、***、电子设备及介质
CN113408296A (zh) 一种文本信息提取方法、装置及设备
JP6712749B2 (ja) 最後のアルファベット除去アルゴリズムを利用した半導体部品検索方法

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21920824

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21920824

Country of ref document: EP

Kind code of ref document: A1