CN115374258A - Knowledge base query method and system combining semantic understanding with question template - Google Patents

Knowledge base query method and system combining semantic understanding with question template

Info

Publication number
CN115374258A
Authority
CN
China
Prior art keywords
sentence pattern
template
question
pattern template
semantic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210475088.7A
Other languages
Chinese (zh)
Inventor
葛剑飞
罗巍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Puxu Technology Co ltd
Original Assignee
Jiangsu Puxu Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Puxu Technology Co ltd
Priority to CN202210475088.7A
Publication of CN115374258A

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/3332Query translation
    • G06F16/3334Selection or weighting of terms from queries, including natural language queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • G06F40/211Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/237Lexical tools
    • G06F40/242Dictionaries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/253Grammatical analysis; Style critique
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • G06F40/35Discourse or dialogue representation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/02Knowledge representation; Symbolic representation
    • G06N5/022Knowledge engineering; Knowledge acquisition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides a knowledge base query method and system combining semantic understanding and question templates. The method comprises the following steps: constructing a sentence pattern template library based on preset sentence pattern template categories and sentence pattern template elements; constructing a domain dictionary based on a corpus; performing word segmentation on the input question and, in combination with the domain dictionary, labeling the sentence pattern template element type to which each word segmentation result belongs; performing dependency analysis on the word segmentation results, determining the dependency relationships among the semantic blocks, and generating a question parsing template according to those relationships; matching the question parsing template against the sentence pattern template library to obtain the final target sentence pattern template; filling the phrases contained in each semantic block of the question parsing template into the query statement corresponding to the target sentence pattern template to generate a complete query statement; and querying the knowledge base with the generated query statement and returning an answer. The invention enables non-technical personnel to query the knowledge base in natural language, improving the freedom and convenience of querying.

Description

Knowledge base query method and system combining semantic understanding with question template
Technical Field
The invention relates to the technical field of data processing and query, in particular to a knowledge base query method and a knowledge base query system combining semantic understanding and question templates.
Background
With the rapid development of big data related technology, various industries accumulate abundant data resources, wherein a large amount of domain knowledge is contained, and after the domain knowledge is processed and stored, a domain knowledge base is formed, so that the knowledge requirements of domain related services can be met.
A semantic understanding model obtains the semantics of a natural-language question by analyzing the question's components and the dependency relationships among them. Semantic dependency techniques divide the question into a combination of phrases through a syntax tree and represent the contextual relationships among the phrases, so that the sentence semantics can be understood. Semantic understanding models are intelligent and general-purpose and generalize well to common questions, but they are weaker at parsing specialized or complex questions within a domain. Sentence pattern templates and domain dictionaries adapt better to a domain, but constructing them consumes considerable manpower and time.
Because domain questions are often highly complex and specialized, it is difficult for a sentence pattern template library to completely cover and accurately characterize the sample space of actual questions. Moreover, knowledge bases in different industries differ in content and structure, so building a single general-purpose semantic understanding model to parse questions is very difficult. The query intention of a natural-language question is mainly embodied in its semantic content and semantic structure, and it is difficult to match that intention accurately with the sentence pattern templates in the library.
Disclosure of Invention
The invention aims to provide a knowledge base query method combining semantic understanding and question templates, in which question templates and a domain dictionary are constructed to assist parsing, so that non-technical personnel can query the knowledge base in natural language, improving the freedom and convenience of querying.
According to a first aspect of the invention, a knowledge base query method combining semantic understanding and question templates is provided, which comprises the following steps:
step 1, constructing a sentence pattern template library based on preset sentence pattern template categories and sentence pattern template elements; the sentence pattern template category represents the target attribute of the question query, the sentence pattern template elements are semantic blocks, and the semantic blocks represent the different question components and their functions in the question;
step 2, constructing a domain dictionary based on the corpus of the domain, wherein each keyword stored in the domain dictionary is labeled with its corresponding sentence pattern template element;
step 3, performing word segmentation on the input question and, in combination with the domain dictionary, labeling the sentence pattern template element type to which each word segmentation result belongs;
step 4, performing dependency analysis on the word segmentation results labeled with sentence pattern template element types to determine the dependency relationships among the semantic blocks, and recombining and ordering the semantic blocks according to the dependency relationships to generate a question parsing template;
step 5, matching the question parsing template against the sentence pattern template library constructed in step 1 to obtain the matched final target sentence pattern template;
step 6, filling the phrases contained in each semantic block of the question parsing template from step 4 into the query statement corresponding to the target sentence pattern template from step 5 to generate a complete query statement; and
step 7, querying the knowledge base with the query statement generated in step 6, and returning an answer.
In an optional embodiment, the sentence pattern templates are divided by query purpose into three categories: a first template I, for querying entities; a second template II, for querying entity attributes; and a third template III, for querying entity relationships.
In an alternative embodiment, the semantic blocks in the sentence pattern template elements include the following types:
1) A subject semantic block, corresponding to phrases in the sentence that denote entities, entity attributes and entity relationships;
2) A question semantic block, corresponding to a question word or question phrase in the sentence;
3) A restriction semantic block, corresponding to a restrictive word or phrase in the sentence;
4) An auxiliary semantic block, corresponding to a mood auxiliary word in the sentence.
In an optional embodiment, in step 1, under each sentence pattern template category, different semantic blocks of the sentence pattern template elements are combined to generate the sentence pattern templates of that category, and a sentence pattern template library covering all sentence pattern template categories is constructed;
wherein each sentence pattern template is expressed as follows:
<sentence pattern template> ::= (subject semantic block, [restriction semantic block], [question semantic block], [auxiliary semantic block]).
In an alternative embodiment, constructing the domain dictionary includes the following processing:
performing word segmentation on the domain-related corpus, extracting keywords, and labeling each keyword with its corresponding sentence pattern template element type;
wherein, for a keyword whose labeled sentence pattern template element type is the subject semantic block, the keyword is further labeled as an entity, an entity attribute or an entity relationship.
In an alternative embodiment, performing word segmentation on the input question and labeling, in combination with the domain dictionary, the sentence pattern template element type to which each word segmentation result belongs includes:
optimizing the word segmentation by using the domain dictionary together with the reverse maximum matching algorithm, re-splicing the segments through the domain dictionary to obtain a grammatical analysis result, and labeling the sentence pattern template element type to which each word or phrase belongs.
In an optional embodiment, step 5, matching the question parsing template against the sentence pattern template library constructed in step 1 to obtain the matched final target sentence pattern template, includes:
starting from the semantic blocks of the question parsing template, traversing each sentence pattern template in the sentence pattern template library and comparing it with the question parsing template to obtain a similarity value for each pair; the similarity value is the final template similarity obtained by weighted summation of the semantic block similarity, the template length similarity and the semantic block order similarity;
sorting the similarity values in descending order, and taking the sentence pattern template with the highest similarity value as the matched final target sentence pattern template.
According to a second aspect of the present invention, there is provided a knowledge base query system combining semantic understanding and question templates, comprising:
one or more processors;
a memory storing instructions that, when executed by the one or more processors, implement the aforementioned knowledge base query method combining semantic understanding and question templates.
According to a third aspect of the present invention, there is also provided a server, comprising:
one or more processors;
a memory storing instructions that, when executed by the one or more processors, implement the aforementioned knowledge base query method combining semantic understanding and question templates.
The knowledge base query method combining the semantic understanding model and the sentence pattern template, which is provided by the invention, realizes the function that non-technical personnel can query the knowledge base in a natural language form by analyzing the query intention of the natural language question, including analyzing the semantic content of the question and analyzing the semantic structure.
Compared with the prior art, the invention has the following remarkable beneficial effects:
(1) Analyzing the sentence content and the sentence structure of the question by a semantic understanding model taking dependency analysis as a core and combining a domain dictionary to obtain the query intention of the question;
(2) Summarizing question sentence patterns of natural languages in the field to construct 3 sentence pattern template types, wherein the components of the sentence pattern template are composed of 4 semantic blocks, and a sentence pattern template library constructed according to the sentence pattern template types can better cover and represent sample spaces of actual problems;
(3) The knowledge base query method provided by the invention obtains the dependency relationships among the semantic blocks through dependency analysis, and recombines and orders the semantic blocks according to those relationships to generate the question parsing template. A template matching algorithm is then applied to the question parsing template that simultaneously considers the semantic block similarity, the template length similarity and the semantic block order similarity, so that the query intention can be matched more accurately with a sentence pattern template.
It should be understood that all combinations of the foregoing concepts and additional concepts described in greater detail below can be considered as part of the inventive subject matter of this disclosure unless such concepts are mutually inconsistent. In addition, all combinations of claimed subject matter are considered a part of the presently disclosed subject matter.
The foregoing and other aspects, embodiments and features of the present teachings can be more fully understood from the following description taken in conjunction with the accompanying drawings. Additional aspects of the present invention, such as features and/or advantages of exemplary embodiments, will be apparent from the description which follows, or may be learned by practice of specific embodiments in accordance with the teachings of the present invention.
Drawings
The figures are not intended to be drawn to scale. In the drawings, each identical or nearly identical component that is illustrated in various figures may be represented by a like numeral. For purposes of clarity, not every component may be labeled in every drawing. Embodiments of various aspects of the present invention will now be described, by way of example, with reference to the accompanying drawings, in which:
FIG. 1 is a flow diagram of a knowledge base query method incorporating a semantic understanding model and sentence pattern templates in accordance with an exemplary embodiment of the present invention.
FIG. 2 is a schematic diagram of dependency analysis for an exemplary embodiment of the present invention.
Detailed Description
In order to better understand the technical content of the present invention, specific embodiments are described below with reference to the accompanying drawings.
In this disclosure, aspects of the present invention are described with reference to the accompanying drawings, in which a number of illustrative embodiments are shown. Embodiments of the present disclosure are not necessarily intended to include all aspects of the invention. It should be appreciated that the various concepts and embodiments described above, as well as those described in greater detail below, may be implemented in any of numerous ways, as the disclosed concepts and embodiments are not limited to any one implementation. Additionally, some aspects of the present disclosure may be used alone, or in any suitable combination with other aspects of the present disclosure.
With reference to the example shown in FIG. 1, the knowledge base query method combining semantic understanding and question templates includes the following steps:
step 1, constructing a sentence pattern template library based on preset sentence pattern template categories and sentence pattern template elements; the sentence pattern template category represents the target attribute of the question query, the sentence pattern template elements are semantic blocks, and the semantic blocks represent the different question components and their functions in the question;
step 2, constructing a domain dictionary based on the corpus of the domain, wherein each keyword stored in the domain dictionary is labeled with its corresponding sentence pattern template element;
step 3, performing word segmentation on the input question and, in combination with the domain dictionary, labeling the sentence pattern template element type to which each word segmentation result belongs;
step 4, performing dependency analysis on the word segmentation results labeled with sentence pattern template element types to determine the dependency relationships among the semantic blocks, and recombining and ordering the semantic blocks according to the dependency relationships to generate a question parsing template;
step 5, matching the question parsing template against the sentence pattern template library constructed in step 1 to obtain the matched final target sentence pattern template;
step 6, filling the phrases contained in each semantic block of the question parsing template from step 4 into the query statement corresponding to the target sentence pattern template from step 5 to generate a complete query statement; and
step 7, querying the knowledge base with the query statement generated in step 6, and returning an answer.
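Steps 6 and 7 can be pictured with a minimal sketch. The patent text here does not specify a query language, so the Cypher-like statement, the placeholder names `entity` and `attribute`, and the function `fill_query` are all hypothetical choices made only for illustration.

```python
from string import Template

# Hypothetical query statement attached to one sentence pattern template of
# category II (query entity attribute). The Cypher-like syntax and the
# placeholder names are illustrative assumptions, not taken from the patent.
QUERY_STATEMENT = Template(
    'MATCH (e {name: "$entity"}) RETURN e.`$attribute` AS answer'
)


def fill_query(slots: dict[str, str]) -> str:
    """Step 6: fill phrases from the question's semantic blocks into the query."""
    return QUERY_STATEMENT.substitute(slots)


# Phrases taken from the subject semantic blocks of the example question.
query = fill_query({
    "entity": "Industrial and Commercial Bank",
    "attribute": "loan interest rate",
})
print(query)
```

In step 7 the resulting string would be sent to whatever knowledge base backend is in use; that call is omitted because it depends entirely on the deployment.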
Preferably, the sentence pattern templates are divided by query purpose into three categories: a first template I, for querying entities; a second template II, for querying entity attributes; and a third template III, for querying entity relationships.
Preferably, the semantic blocks in the sentence pattern template elements include the following types:
1) A subject semantic block, corresponding to phrases in the sentence that denote entities, entity attributes and entity relationships;
2) A question semantic block, corresponding to a question word or question phrase in the sentence;
3) A restriction semantic block, corresponding to a restrictive word or phrase in the sentence;
4) An auxiliary semantic block, corresponding to a mood auxiliary word in the sentence.
Preferably, in step 1, under each sentence pattern template category, different semantic blocks of the sentence pattern template elements are combined to generate the sentence pattern templates of that category, and a sentence pattern template library covering all sentence pattern template categories is constructed;
wherein each sentence pattern template is expressed as follows:
<sentence pattern template> ::= (subject semantic block, [restriction semantic block], [question semantic block], [auxiliary semantic block]).
Preferably, in constructing the domain dictionary, the following processing is included:
performing word segmentation on the domain-related corpus, extracting keywords, and labeling each keyword with its corresponding sentence pattern template element type;
wherein, for a keyword whose labeled sentence pattern template element type is the subject semantic block, the keyword is further labeled as an entity, an entity attribute or an entity relationship.
Preferably, performing word segmentation on the input question and labeling, in combination with the domain dictionary, the sentence pattern template element type to which each word segmentation result belongs comprises:
firstly, performing word segmentation processing on an input question;
then, optimizing the words and/or phrases obtained by word segmentation by using the domain dictionary together with the reverse maximum matching algorithm, re-splicing them through the domain dictionary to obtain a grammatical analysis result, and labeling the sentence pattern template element types to which the words and/or phrases belong.
Preferably, step 5, matching the question parsing template against the sentence pattern template library constructed in step 1 to obtain the matched final target sentence pattern template, includes:
starting from the semantic blocks of the question parsing template, traversing each sentence pattern template in the sentence pattern template library and comparing it with the question parsing template to obtain a similarity value for each pair; the similarity value is the final template similarity obtained by weighted summation of the semantic block similarity, the template length similarity and the semantic block order similarity;
sorting the similarity values in descending order, and taking the sentence pattern template with the highest similarity value as the matched final target sentence pattern template.
The implementation of the foregoing embodiments is described in more detail below with reference to specific examples.
A. Constructing sentence pattern template categories
The knowledge in the knowledge base mainly comprises knowledge entities, entity attributes and entity relationships. Therefore, in the embodiment of the present invention, the sentence pattern templates are divided into three types, each corresponding to a different query purpose:
1) Querying the entity;
2) Querying entity attributes;
3) Querying entity relationships.
For example, the question "What is the latest loan interest rate of the Industrial and Commercial Bank?" is of the type "query entity attribute", where "Industrial and Commercial Bank" is the entity and "loan interest rate" is its attribute.
As an alternative classification, an entity is typically configured as a noun or a specific object.
An entity attribute refers to attribute information such as the characteristics or parameters of a certain aspect or dimension of an entity; an entity relationship refers to a relationship between entities.
B. Constructing sentence pattern template elements
In the embodiment of the invention, sentence pattern template elements are defined as different semantic blocks, which represent the different components of a question and the roles those components play in the sentence.
As an alternative, the types of sentence pattern template elements specifically include:
1) Subject semantic block: corresponds to phrases in the sentence that denote entities, entity attributes and entity relationships;
2) Question semantic block: corresponds to the question words or question phrases in the sentence, such as "how many", "how long" and the like;
3) Restriction semantic block: corresponds to a restrictive word or phrase in the sentence, such as an expression of time, place or scope;
4) Auxiliary semantic block: corresponds to a mood auxiliary word in the sentence and the like.
For example, in the question "What is the latest loan interest rate of the Industrial and Commercial Bank?", "Industrial and Commercial Bank" and "loan interest rate" are subject semantic blocks, the question word ("how much") is the question semantic block, "latest" is the restriction semantic block, and "is" is an auxiliary semantic block.
C. Construction of sentence pattern template library
In the embodiment of the invention, the sentence pattern templates are constructed by summarizing the sentence patterns of questions in the domain.
As previously mentioned, the types of sentence pattern templates include: querying entities, querying entity attributes, and querying entity relationships.
Thus, under each sentence pattern template type, different sentence pattern template elements are combined to generate different sentence pattern templates, thereby obtaining a sentence pattern template library of all sentence pattern template types.
As an alternative embodiment, each sentence pattern template in the library of sentence pattern templates has the following general form:
<sentence pattern template> ::= (subject semantic block, [restriction semantic block], [question semantic block], [auxiliary semantic block]).
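One way to hold such templates in memory is sketched below. The class names `Block`, `Category` and `SentenceTemplate`, the two sample library entries, and the decision to allow a block type to repeat within a template are assumptions made for illustration; the text above only fixes the three categories, the four block types, and the requirement of a subject semantic block.

```python
from dataclasses import dataclass
from enum import Enum


class Block(Enum):
    """The four sentence pattern template element types."""
    SUBJECT = "subject"
    RESTRICTION = "restriction"
    QUESTION = "question"
    AUXILIARY = "auxiliary"


class Category(Enum):
    """The three sentence pattern template categories."""
    ENTITY = "I"
    ATTRIBUTE = "II"
    RELATION = "III"


@dataclass(frozen=True)
class SentenceTemplate:
    category: Category
    blocks: tuple[Block, ...]

    def __post_init__(self):
        # In the general form only the subject block is mandatory;
        # the bracketed blocks are optional.
        if Block.SUBJECT not in self.blocks:
            raise ValueError("a sentence pattern template needs a subject semantic block")


# A tiny hypothetical library; a real one would be summarized from domain questions.
library = [
    SentenceTemplate(Category.ATTRIBUTE, (
        Block.SUBJECT, Block.AUXILIARY, Block.SUBJECT,
        Block.RESTRICTION, Block.AUXILIARY, Block.QUESTION,
    )),
    SentenceTemplate(Category.ENTITY, (Block.SUBJECT, Block.QUESTION)),
]
```

Keeping the blocks as an ordered tuple preserves the sequence information that the later template matching step relies on.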
D. Constructing a domain dictionary
In an embodiment of the invention, the domain dictionary is constructed in a semi-manual manner.
As an alternative, the process of constructing the domain dictionary includes:
performing word segmentation on the domain-related corpus, extracting keywords (for example, with the unsupervised TF-IDF algorithm), and labeling each keyword with the type of its corresponding sentence pattern template element.
In particular, for a keyword labeled as a subject semantic block, the type of the corresponding entity, entity attribute or entity relationship is further labeled.
For example, in corpus processing, the keyword "Industrial and Commercial Bank" is labeled with the corresponding entity type "bank" in the financial field.
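The semi-manual process can be sketched as an automatic keyword ranking followed by hand labeling. The snippet below is a minimal illustration under stated assumptions: the corpus is already segmented, TF-IDF uses the plain `tf * log(N / df)` form, and the function name `tfidf_keywords`, the toy corpus and the label fields (`block`, `kind`, `entity_type`) are hypothetical.

```python
import math
from collections import Counter


def tfidf_keywords(docs: list[list[str]], top_k: int = 5) -> list[str]:
    """Rank words by their best TF-IDF score across pre-segmented documents."""
    n = len(docs)
    # Document frequency: in how many documents each word occurs.
    df = Counter(word for doc in docs for word in set(doc))
    best: dict[str, float] = {}
    for doc in docs:
        tf = Counter(doc)
        for word, count in tf.items():
            score = (count / len(doc)) * math.log(n / df[word])
            best[word] = max(best.get(word, 0.0), score)
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(best, key=lambda w: (-best[w], w))[:top_k]


# Toy, already-segmented corpus (illustrative only).
corpus = [
    ["Industrial and Commercial Bank", "loan interest rate", "is", "latest"],
    ["Industrial and Commercial Bank", "branch", "is", "open"],
    ["loan interest rate", "is", "high"],
]
candidates = tfidf_keywords(corpus, top_k=6)

# Manual step: a person reviews the candidates and labels the ones kept.
domain_dictionary = {
    "Industrial and Commercial Bank": {"block": "subject", "kind": "entity", "entity_type": "bank"},
    "loan interest rate": {"block": "subject", "kind": "entity attribute"},
    "latest": {"block": "restriction"},
}
```

A word such as "is" that occurs in every document gets a score of zero and falls to the bottom of the ranking, which is why the automatic pass is useful as a filter before labeling.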
E. Lexical analysis
Lexical analysis is performed on the input question, mainly to complete word segmentation and part-of-speech tagging.
As an alternative embodiment, the lexical analysis process includes:
first, performing word segmentation on the input question and obtaining a word segmentation result based on forward or reverse segmentation; then, labeling each phrase in the word segmentation result with the sentence pattern template element type to which it belongs.
In another embodiment, to avoid longer words being segmented too finely, the domain dictionary and the reverse maximum matching algorithm may be used to optimize the word segmentation, with the domain dictionary used to re-splice the segments, which are then labeled with the sentence pattern template element types to which they belong.
For example, for an input question, a threshold k is preset (for example, k = 2 or 3). Starting from the last character, k characters are cut off and matched against the domain dictionary. If no matching word is found, the leftmost of the k characters is removed and the remaining k-1 characters are matched against the dictionary again. This continues until a match succeeds or only a single character remains, in which case that character is treated as an independent word. The cutting position then moves toward the start of the sentence by the length of the word just segmented, and the next k characters are cut and processed in the same way until the whole sentence has been segmented. All segmentation results are thereby obtained.
For example, "loan interest rate" may initially be split into "loan" and "interest rate", and the two parts are then merged back into "loan interest rate" by means of the constructed domain dictionary.
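The reverse maximum matching procedure described above can be sketched as follows. The function name `reverse_max_match`, the small dictionary, and the Chinese rendering of the example question are assumptions for illustration; the window size is set to k = 4 here only because the sample dictionary contains four-character words.

```python
def reverse_max_match(text: str, dictionary: set[str], k: int = 3) -> list[str]:
    """Segment text from right to left, preferring the longest dictionary word."""
    words = []
    end = len(text)
    while end > 0:
        # Try the widest window first, then drop characters from its left edge.
        for size in range(min(k, end), 0, -1):
            piece = text[end - size:end]
            # A single remaining character is always accepted as its own word.
            if size == 1 or piece in dictionary:
                words.append(piece)
                end -= size  # move toward the start by the matched length
                break
    words.reverse()  # words were collected from the end of the sentence
    return words


# Hypothetical domain dictionary and an assumed Chinese form of the example question.
dictionary = {"工商银行", "贷款利率", "最新", "多少"}
segments = reverse_max_match("工商银行的最新贷款利率是多少", dictionary, k=4)
print(segments)  # ['工商银行', '的', '最新', '贷款利率', '是', '多少']
```

Because "贷款利率" (loan interest rate) is in the dictionary, it is kept as one word rather than being split into "贷款" and "利率".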
F. Dependency analysis
In the embodiment of the invention, once lexical analysis has produced the segmented words and the sentence pattern template element types to which they belong, a dependency analysis model is used to analyze the dependency relationships among the semantic blocks in the question, and the semantic blocks are recombined and ordered according to those relationships to generate the question parsing template.
For example, dependency analysis of the question "What is the latest loan interest rate of the Industrial and Commercial Bank?" is shown in FIG. 2. The arrows between phrases indicate the direction of dependency and are marked with the dependency type. For instance, "latest" points to "loan interest rate" through an attributive (modifier-head) relation, so the semantic block order for this pair is "subject semantic block" followed by "restriction semantic block", and the final question parsing template is:
(subject semantic block, auxiliary semantic block, subject semantic block, restriction semantic block, auxiliary semantic block, question semantic block)
G. Template matching
The generated question parsing template is matched against the sentence pattern template library: the semantic block similarity, the template length similarity, and the semantic block order similarity between the question parsing template and each sentence pattern template are calculated, and the final template similarity is obtained by weighted summation.
In an alternative embodiment, the semantic block similarity is denoted $f_{1i}$, the template length similarity $f_{2i}$, and the semantic block order similarity $f_{3i}$, where $i = 1, 2, 3, \ldots, n$ and $n$ is the total number of question templates in the sentence pattern template library. Based on the semantic blocks of the question parsing template, the sentence pattern templates in the library are traversed and a similarity threshold is calculated between the question parsing template and each of them. Here the similarity threshold is the final template similarity obtained by weighted summation of the semantic block similarity, the template length similarity, and the semantic block order similarity.
The similarity thresholds are then sorted from largest to smallest, and the sentence pattern template with the highest similarity threshold is taken as the matched final target sentence pattern template.
Here, $f_{1i}$ denotes the semantic block similarity between the question parsing template and the i-th question template in the sentence pattern template library, that is, how similar the two are in the semantic blocks they contain.
$f_{2i}$ denotes the template length similarity between the question parsing template and the i-th question template in the sentence pattern template library, that is, how similar the two are in the number of semantic blocks.
$f_{3i}$ denotes the semantic block order similarity between the question parsing template and the i-th question template in the sentence pattern template library, that is, how similar the two are in the order of their semantic blocks.
The final template similarity is $f = a f_{1i} + b f_{2i} + c f_{3i}$, where $a$, $b$ and $c$ are the weights of the semantic block similarity, the template length similarity and the semantic block order similarity respectively. Each of $a$, $b$ and $c$ takes a value in the interval $(0, 1)$, endpoints excluded, and $a + b + c = 1$.
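The weighted summation and the selection of the highest-scoring template can be sketched as follows. The weights 0.4, 0.3 and 0.3, the template names, and the component scores are assumptions chosen only for illustration; the text requires only that each weight lies strictly between 0 and 1 and that the three sum to 1.

```python
def template_similarity(f1, f2, f3, a=0.4, b=0.3, c=0.3):
    """Combine the three component similarities into the final similarity f."""
    if not all(0 < w < 1 for w in (a, b, c)):
        raise ValueError("each weight must lie strictly between 0 and 1")
    if abs(a + b + c - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return a * f1 + b * f2 + c * f3


def best_template(component_scores, **weights):
    """Return (name, f) of the template with the highest final similarity.

    `component_scores` maps a template name to its (f1, f2, f3) tuple.
    """
    finals = {
        name: template_similarity(f1, f2, f3, **weights)
        for name, (f1, f2, f3) in component_scores.items()
    }
    # Sort from largest to smallest and take the first entry.
    ranked = sorted(finals.items(), key=lambda item: item[1], reverse=True)
    return ranked[0]


# Hypothetical component scores for two candidate templates.
scores = {"T1": (1.0, 1.0, 0.5), "T2": (0.5, 0.75, 0.25)}
winner, value = best_template(scores)
```

In this example T1 scores 0.4 + 0.3 + 0.15 = 0.85 and T2 scores 0.2 + 0.225 + 0.075 = 0.5, so T1 is selected.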
As an alternative, the value of $f_{1i}$ is obtained as follows:
define the number of semantic block types contained in the question parsing template as P1, and the number of semantic block types contained in the i-th question template as P2;
the semantic block similarity $f_{1i}$ is calculated as
$f_{1i} = 1 - |P1 - P2| / P1$.
The smaller the difference in the number of types between the two, the higher the semantic block similarity.
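As a small worked sketch of this formula, the function below counts distinct block types in each template and applies 1 - |P1 - P2| / P1. The function name and the candidate template are hypothetical. The parsed template is the six-block example given earlier in the description.

```python
def semantic_block_similarity(parsed, candidate):
    """f1 = 1 - |P1 - P2| / P1, where P counts distinct semantic block types."""
    p1 = len(set(parsed))      # types in the question parsing template
    p2 = len(set(candidate))   # types in the i-th question template
    if p1 == 0:
        raise ValueError("the question parsing template has no semantic blocks")
    return 1 - abs(p1 - p2) / p1


parsed = ["topic", "auxiliary", "topic", "restriction", "auxiliary", "question"]
candidate = ["topic", "question"]  # hypothetical library template
f1 = semantic_block_similarity(parsed, candidate)
```

Here P1 = 4 and P2 = 2, giving f1 = 1 - 2/4 = 0.5. Note that the formula as written can become negative when P2 exceeds twice P1; the text does not say how such a case is handled.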
As an alternative, the value of $f_{2i}$ is obtained as follows:
define the number of semantic blocks contained in the question parsing template as Q1, and the number of semantic blocks contained in the i-th question template as Q2;
the template length similarity $f_{2i}$ is calculated as
$f_{2i} = 1 - |Q1 - Q2| / Q1$.
The smaller the number difference between the two, the higher the template length similarity.
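The length similarity can be sketched the same way, this time counting all semantic blocks rather than distinct types. The function name and the four-block candidate template are assumptions for illustration.

```python
def template_length_similarity(parsed, candidate):
    """f2 = 1 - |Q1 - Q2| / Q1, where Q counts semantic blocks (with repeats)."""
    q1 = len(parsed)      # blocks in the question parsing template
    q2 = len(candidate)   # blocks in the i-th question template
    if q1 == 0:
        raise ValueError("the question parsing template has no semantic blocks")
    return 1 - abs(q1 - q2) / q1


parsed = ["topic", "auxiliary", "topic", "restriction", "auxiliary", "question"]
candidate = ["topic", "restriction", "auxiliary", "question"]  # hypothetical
f2 = template_length_similarity(parsed, candidate)
```

With Q1 = 6 and Q2 = 4 the result is 1 - 2/6, roughly 0.67.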
As an alternative, the value of $f_{3i}$ is obtained as follows:
first, determine the order of the semantic blocks contained in the question parsing template and the order of the semantic blocks contained in the i-th question template;
then, starting from the 1st semantic block of the question parsing template, compare it one by one with the semantic blocks contained in the i-th question template. If no semantic block in the i-th question template has the same type, the matching degree value of the 1st semantic block is 0. If the j-th semantic block of the i-th question template has the same type as the 1st semantic block, the matching degree value of the 1st semantic block is (1/Q1) × (j/Q2), and the semantic block is deleted from the i-th question template, that is, the semantic blocks before the j-th semantic block are discarded. Matching then continues from the 2nd semantic block of the question parsing template against the reduced i-th question template, until all semantic blocks of the question parsing template have been compared. The matching degree values of all semantic blocks of the question parsing template are added to give the total matching degree, which is the semantic block order similarity $f_{3i}$.
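The order comparison can be sketched as below, with two interpretive assumptions that the text leaves open: j is taken as the 1-based position in the original i-th question template, and after a match every block up to and including the j-th is excluded from later comparisons. The function name and example templates are hypothetical.

```python
def semantic_block_order_similarity(parsed, candidate):
    """Sum the per-block matching degrees (1/Q1) * (j/Q2) described in the text.

    Assumptions: j is the 1-based position in the original candidate template,
    and blocks up to and including position j are discarded after a match.
    """
    q1, q2 = len(parsed), len(candidate)
    if q1 == 0 or q2 == 0:
        return 0.0
    start = 0      # index of the first candidate block not yet discarded
    total = 0.0
    for block in parsed:
        try:
            idx = candidate.index(block, start)
        except ValueError:
            continue  # no remaining block of this type: matching degree 0
        j = idx + 1
        total += (1 / q1) * (j / q2)
        start = idx + 1  # discard everything up to the matched block
    return total


parsed = ["topic", "restriction", "question"]
candidate = ["topic", "question"]  # hypothetical library template
f3 = semantic_block_order_similarity(parsed, candidate)
```

In this example "topic" matches at j = 1 and contributes (1/3)(1/2), "restriction" finds no match and contributes 0, and "question" matches at j = 2 and contributes (1/3)(2/2), for a total of 0.5. A different reading of how j is counted would give different values.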
In obtaining
H. Knowledge query
The phrases contained in each semantic block of the question parsing template are filled into the query statement corresponding to the finally obtained target sentence pattern template (that is, the sentence pattern template of the question) to generate a complete query statement; the knowledge base is then queried with the generated query statement, and the answer is returned.
According to an embodiment of the present disclosure, there is also provided a knowledge base query system combining semantic understanding with question templates, comprising:
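The filling step can be sketched with Python's standard `string.Template`. The text does not specify a query language, so the Cypher-like query string, the `QUERY_TEMPLATES` table, the slot names, and the three-field block layout are all assumptions made for illustration. The sketch relies on topic phrases already being marked as entity or attribute, as the domain dictionary labeling provides.

```python
from string import Template

# Hypothetical query statement attached to template category II
# (querying an entity attribute); the query syntax is an assumption.
QUERY_TEMPLATES = {
    "II": Template("MATCH (e {name: '$entity'}) RETURN e.`$attribute`"),
}


def build_query(category, parsed_blocks):
    """Fill phrases from the parsed question into the template's query statement.

    `parsed_blocks` is a list of (block_type, subtype, phrase) tuples.
    """
    slots = {}
    for block_type, subtype, phrase in parsed_blocks:
        # Only topic blocks marked as entity or attribute feed this template.
        if block_type == "topic" and subtype in ("entity", "attribute"):
            slots.setdefault(subtype, phrase)
    return QUERY_TEMPLATES[category].substitute(slots)


blocks = [
    ("topic", "entity", "business bank"),
    ("auxiliary", None, "of"),
    ("restriction", None, "latest"),
    ("topic", "attribute", "loan interest rate"),
    ("question", None, "what"),
]
query = build_query("II", blocks)
```

A production system would pass phrases as query parameters rather than splicing them into the statement text. `substitute` raises `KeyError` when a required slot is missing, which surfaces incomplete parses early.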
one or more processors;
a memory storing instructions operable, when executed by the one or more processors, to implement the combined semantic understanding and question template knowledgebase query method of the foregoing embodiments.
According to an embodiment of the present disclosure, there is also provided a server, including:
one or more processors;
a memory storing instructions operable, when executed by the one or more processors, to implement the combined semantic understanding and question template knowledgebase query method of the foregoing embodiments.
Therefore, by combining a semantic understanding model with sentence pattern templates, the knowledge base query method of the invention enables non-technical personnel to query the knowledge base in natural language, improving the freedom and convenience of querying. The semantic understanding model, with dependency analysis at its core, generalizes well to more general questions, while the sentence pattern templates improve the adaptability of the query method in complex domains. Combining the two can significantly improve recall and accuracy when querying professional or complex questions.
Although the invention has been described with reference to preferred embodiments, it is not intended to be limited thereto. Those skilled in the art can make various changes and modifications without departing from the spirit and scope of the invention. Therefore, the protection scope of the present invention should be determined by the appended claims.

Claims (9)

1. A knowledge base query method combining semantic understanding and question templates is characterized by comprising the following steps:
step 1, constructing a sentence pattern template library based on preset sentence pattern template categories and sentence pattern template elements; the sentence pattern template category represents the target attribute of question query, the sentence pattern template elements are semantic blocks, and the semantic blocks represent different question components and the functions thereof in the question;
step 2, constructing a domain dictionary based on the corpus in the domain, wherein each keyword stored in the domain dictionary is labeled with the sentence pattern template element to which it corresponds;
step 3, performing word segmentation on the input question, and labeling the sentence pattern template element type to which each word segmentation result belongs using the domain dictionary;
step 4, performing dependency analysis on the word segmentation results labeled with sentence pattern template element types, determining the dependency relationships among a plurality of semantic blocks, and recombining and ordering the semantic blocks according to the dependency relationships to generate a question parsing template;
step 5, matching in the sentence pattern template library constructed in the step 1 according to the question analysis template to obtain a matched final target sentence pattern template;
step 6, filling the phrases contained in each semantic block in the question parsing template corresponding to the step 4 into the query sentence corresponding to the target sentence pattern template in the step 5 to generate a complete query sentence; and
step 7, querying a knowledge base according to the query statement generated in step 6, and returning an answer.
2. The method for querying a knowledge base by combining semantic understanding with question templates according to claim 1, wherein the sentence pattern template categories are divided, according to query purpose, into querying entities, querying entity attributes and querying entity relationships, denoted respectively as: a first template I for querying entities; a second template II for querying entity attributes; and a third template III for querying entity relationships.
3. The method of knowledge base query with semantic understanding and question template combined as claimed in claim 2, wherein the semantic blocks in the sentence pattern template elements comprise the following types:
1) a topic semantic block, defining the phrases for entities, entity attributes and entity relationships in a sentence;
2) a question semantic block, defining the question words or question phrases in a sentence;
3) a restriction semantic block, defining the restriction words or phrases in a sentence;
4) an auxiliary semantic block, defining the modal particles in a sentence.
4. The method for querying a knowledge base by combining semantic comprehension with question templates according to claim 3, wherein in the step 1, under each sentence pattern template category, different semantic blocks of sentence pattern template elements are combined to generate sentence pattern templates under different sentence pattern template categories, and a sentence pattern template base of all sentence pattern template categories is constructed;
wherein each sentence pattern template is expressed as follows:
<sentence pattern template> ::= (topic semantic block, [restriction semantic block], [question semantic block], [auxiliary semantic block]).
5. The method for querying a knowledge base by combining semantic understanding with a question template according to claim 3, wherein the method comprises the following steps in the process of constructing the domain dictionary:
performing word segmentation on the corpus related to the domain, extracting keywords, and labeling each keyword with its corresponding sentence pattern template element type;
wherein, for a keyword whose labeled sentence pattern template element type is the topic semantic block, the labeling further includes marking whether it is an entity, an entity attribute or an entity relationship.
6. The method for querying a knowledge base by combining semantic understanding and question templates according to claim 3, wherein performing word segmentation on the input question and labeling the sentence pattern template element type to which the segmentation result belongs using the domain dictionary comprises:
performing word segmentation optimization using the domain dictionary combined with the reverse maximum matching algorithm, re-splicing through the domain dictionary to obtain a grammatical analysis result, and labeling the sentence pattern template element types to which the result belongs.
7. The method for querying a knowledge base by combining semantic understanding with question templates according to claim 6, wherein the step 5 of matching in the sentence pattern template base constructed in the step 1 according to the question parsing template to obtain a matched final target sentence pattern template comprises:
based on the semantic blocks of the question parsing template, traversing the sentence pattern templates in the sentence pattern template library and calculating a similarity threshold between the question parsing template and each sentence pattern template in the library; the similarity threshold being the final template similarity obtained by weighted summation of the semantic block similarity, the template length similarity and the semantic block order similarity;
and sorting the similarity thresholds from largest to smallest, and taking the sentence pattern template with the highest similarity threshold as the matched final target sentence pattern template.
8. A system for querying a knowledge base by combining semantic comprehension with a question template, comprising:
one or more processors;
a memory storing instructions operable, when executed by the one or more processors, to implement the method of knowledge base querying in combination with the semantic understanding and question template of any of claims 1-7.
9. A server, comprising:
one or more processors;
a memory storing instructions operable, when executed by the one or more processors, to implement the method of knowledgebase querying in combination with semantic understanding and question templates of any one of claims 1-7.
CN202210475088.7A 2022-04-29 2022-04-29 Knowledge base query method and system combining semantic understanding with question template Pending CN115374258A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210475088.7A CN115374258A (en) 2022-04-29 2022-04-29 Knowledge base query method and system combining semantic understanding with question template

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210475088.7A CN115374258A (en) 2022-04-29 2022-04-29 Knowledge base query method and system combining semantic understanding with question template

Publications (1)

Publication Number Publication Date
CN115374258A true CN115374258A (en) 2022-11-22

Family

ID=84060837

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210475088.7A Pending CN115374258A (en) 2022-04-29 2022-04-29 Knowledge base query method and system combining semantic understanding with question template

Country Status (1)

Country Link
CN (1) CN115374258A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117194616A (en) * 2023-11-06 2023-12-08 湖南四方天箭信息科技有限公司 Knowledge query method and device for vertical domain knowledge graph, computer equipment and storage medium

Similar Documents

Publication Publication Date Title
CN112069298B (en) Man-machine interaction method, device and medium based on semantic web and intention recognition
CN108363790B (en) Method, device, equipment and storage medium for evaluating comments
CN104657439B (en) Structured query statement generation system and method for precise retrieval of natural language
CN104657440B (en) Structured query statement generation system and method
CN109947921B (en) Intelligent question-answering system based on natural language processing
CN110555205B (en) Negative semantic recognition method and device, electronic equipment and storage medium
CN112541337B (en) Document template automatic generation method and system based on recurrent neural network language model
AU2019265874B2 (en) Systems and methods for document deviation detection
CN112926337B (en) End-to-end aspect level emotion analysis method combined with reconstructed syntax information
CN113254507B (en) Intelligent construction and inventory method for data asset directory
CN116628173B (en) Intelligent customer service information generation system and method based on keyword extraction
CN112733547A (en) Chinese question semantic understanding method by utilizing semantic dependency analysis
CN114266256A (en) Method and system for extracting new words in field
CN113779062A (en) SQL statement generation method and device, storage medium and electronic equipment
CN116304748A (en) Text similarity calculation method, system, equipment and medium
CN113312922A (en) Improved chapter-level triple information extraction method
CN115374258A (en) Knowledge base query method and system combining semantic understanding with question template
CN113792542A (en) Intention understanding method fusing syntactic analysis and semantic role pruning
Sriram et al. Validation and normalization of DCS corpus and development of the Sanskrit heritage engine’s segmenter
CN117473054A (en) Knowledge graph-based general intelligent question-answering method and device
CN114417008A (en) Construction engineering field-oriented knowledge graph construction method and system
CN111897932A (en) Query processing method and system for text big data
WO2020026229A2 (en) Proposition identification in natural language and usage thereof
CN116910175B (en) Method, device and storage medium for constructing fault level tree of automatic mobile equipment
CN116595192B (en) Technological front information acquisition method and device, electronic equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination