CN111368096A - Knowledge graph-based information analysis method, device, equipment and storage medium - Google Patents
- Publication number
- CN111368096A (application CN202010156693.9A)
- Authority
- CN
- China
- Prior art keywords
- information
- entity
- objection
- candidate
- expected result
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/355—Class or cluster creation or modification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Abstract
The application relates to the technical field of data analysis, and in particular to a knowledge graph-based information analysis method, device, equipment and storage medium. The method comprises the following steps: judging whether feedback information from a client is objection information; comparing the text similarity of the objection information with preset objection types to determine the objection type of the objection information; extracting a named entity from the objection information, constructing a target entity triple with the objection type as the entity, the named entity as the attribute, and the value corresponding to the named entity as the attribute value, and traversing a preset knowledge graph with the target entity triple as the key element to obtain a plurality of candidate entity triples related to the objection information; and extracting candidate answers from the candidate entity triples, scoring the candidate answers to obtain a scoring result, and determining an expected result corresponding to the objection information according to the scoring result. This addresses the problem that customer requirements cannot be acquired accurately and quickly because of the limits of what a deep learning model has learned.
Description
Technical Field
The present application relates to the field of data analysis technologies, and in particular, to a method, an apparatus, a device, and a storage medium for information analysis based on a knowledge graph.
Background
With the development of artificial intelligence technology, and especially the rapid development of deep learning and natural language processing, intelligent assistants such as Microsoft XiaoIce, Apple Siri, and Alibaba's AliMe have come into wide use. Such intelligent assistant technology can answer questions posed by the user.
However, when the intelligent assistant technology is used to answer a question, the answer corresponding to the question cannot be given accurately and quickly due to the limitation of the learning content of deep learning.
Disclosure of Invention
Based on the above, aiming at the technical problem that the intelligent assistant cannot quickly and accurately answer the question due to the limited deep learning content at present, an information analysis method, device, equipment and storage medium based on the knowledge graph are provided.
A knowledge graph-based information analysis method comprises the following steps:
sending a question to a user side and receiving feedback information of the user side;
judging whether the feedback information is consistent with an expected result, if so, sending the next question to the user side, otherwise, marking the feedback information as objection information;
comparing the text similarity of the objection information with a preset objection type to determine the objection type of the objection information;
extracting a named entity in the objection information, constructing a target entity triple by taking the objection type as the entity, the named entity as the attribute, and the value corresponding to the named entity as the attribute value, and traversing a preset knowledge graph by taking the target entity triple as a key element to obtain a plurality of candidate entity triples related to the objection information;
and extracting candidate answers in the candidate entity triples, scoring the candidate answers to obtain a scoring result, and determining an expected result corresponding to the objection information according to the scoring result.
In one possible embodiment, the determining whether the feedback information is consistent with the expected result includes:
performing word vector conversion on the feedback information and the expected result to generate a feedback information word vector and an expected result word vector;
and calculating the similarity between the feedback information word vector and the expected result word vector, if the similarity is greater than a preset similarity threshold, determining that the feedback information is consistent with the expected result, otherwise, determining that the feedback information is inconsistent with the expected result.
In one possible embodiment, the extracting named entities in the objection information includes:
performing word vector conversion on the objection information to obtain an objection information word vector;
inputting the objection information word vector into a preset conditional random field model to generate an initial recognition result;
and inputting the initial recognition result into a preset bidirectional recurrent neural network model for re-recognition to obtain the named entity.
In one possible embodiment, the traversing a preset knowledge-graph to obtain a plurality of candidate entity triples related to the objection information includes:
traversing the preset knowledge graph by taking the named entity as a query target to obtain all first entity triples containing the query target;
extracting attributes in the first entity triples, and if the attribute in a first entity triple is consistent with the attribute in the target entity triple, marking that first entity triple as a second entity triple;
and acquiring the position of each second entity triple in the knowledge graph, extracting an upstream entity triple and a downstream entity triple of the second entity triples according to the position, and summarizing the second entity triples, the upstream entity triples and the downstream entity triples to obtain a plurality of candidate entity triples.
In one possible embodiment, the extracting candidate answers in the candidate entity triples, scoring the candidate answers to obtain a scoring result, and determining an expected result corresponding to the objection information according to the scoring result includes:
extracting attributes and attribute values in the candidate entity triples, and acquiring value intervals of all historical attribute values corresponding to the attributes from a system log according to the attributes;
if the attribute value is not in the value range, deleting the candidate entity triple corresponding to the attribute value;
and obtaining weights corresponding to attributes in the rest candidate entity triples, scoring the candidate answers according to the weights, and taking the candidate answer with the highest score as an expected result corresponding to the objection information.
In one possible embodiment, after extracting the candidate answers in each candidate entity triplet, scoring each candidate answer to obtain a scoring result, and determining an expected result corresponding to the objection information according to the scoring result, the method further includes:
determining a follow-up question according to the expected result;
receiving response information of the user side to the follow-up question, and judging whether the response information contains an interrogative word;
and if the response information contains an interrogative word, re-determining the expected result, and if not, continuing to send questions to the user side.
An information analysis device based on knowledge graph comprises the following modules:
the information transceiving module is configured to send a question to the user side and receive feedback information of the user side;
the information identification module is configured to judge whether the feedback information is consistent with an expected result, and if so, send the next question to the user side, otherwise, mark the feedback information as objection information;
the type determining module is used for comparing the text similarity of the objection information with a preset objection type to determine the objection type of the objection information;
the triple selecting module is configured to extract a named entity in the objection information, construct a target entity triple by taking the objection type as the entity, the named entity as the attribute, and the value corresponding to the named entity as the attribute value, and traverse a preset knowledge graph by taking the target entity triple as a key element to obtain a plurality of candidate entity triples related to the objection information;
and the result generation module is used for extracting the candidate answers in the candidate entity triples, scoring the candidate answers to obtain a scoring result, and determining an expected result corresponding to the objection information according to the scoring result.
In one possible embodiment, the information identification module is further configured to:
performing word vector conversion on the feedback information and the expected result to generate a feedback information word vector and an expected result word vector;
and calculating the similarity between the feedback information word vector and the expected result word vector, if the similarity is greater than a preset similarity threshold, determining that the feedback information is consistent with the expected result, otherwise, determining that the feedback information is inconsistent with the expected result.
A computer device comprising a memory and a processor, the memory having stored therein computer readable instructions which, when executed by the processor, cause the processor to perform the steps of the above-described knowledge-graph based information analysis method.
A storage medium having stored thereon computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the above-described knowledge-graph based information analysis method.
Compared with the existing mechanism, the present application sends a question to the user side and receives feedback information of the user side; judges whether the feedback information is consistent with an expected result, and if so, sends the next question to the user side, otherwise marks the feedback information as objection information; compares the text similarity of the objection information with preset objection types to determine the objection type of the objection information; extracts a named entity in the objection information, constructs a target entity triple by taking the objection type as the entity, the named entity as the attribute, and the value corresponding to the named entity as the attribute value, and traverses a preset knowledge graph by taking the target entity triple as a key element to obtain a plurality of candidate entity triples related to the objection information; and extracts candidate answers in the candidate entity triples, scores the candidate answers, and determines an expected result corresponding to the objection information according to the scoring result. This solves the technical problem that the intelligent assistant cannot answer questions quickly and accurately because of the limits of what deep learning has learned.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the application.
FIG. 1 is an overall flow diagram of a knowledge-graph based information analysis method in one embodiment of the present application;
FIG. 2 is a schematic diagram of an information recognition process in a knowledge-graph based information analysis method according to an embodiment of the present application;
FIG. 3 is a block diagram of an apparatus for knowledge-graph based information analysis in one embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
Fig. 1 is an overall flowchart of a method for analyzing knowledge-graph-based information according to an embodiment of the present application, and the method for analyzing knowledge-graph-based information includes the following steps:
S1, sending the question to the user side and receiving feedback information of the user side;
Specifically, the user is asked questions from a fixed template in order to obtain the user's needs. The server can sort the questions in the database by how often each has been asked in past sessions with users, placing the most frequent questions first and the least frequent last. The server then obtains the IP address of the user side where the user is located and sends the questions to it in sequence.
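The frequency-based ordering described in step S1 can be sketched as follows. This is only a minimal illustration, assuming the question history is available as a plain list of previously asked questions; the function name `order_questions_by_frequency` and the sample data are hypothetical and do not come from the application.

```python
from collections import Counter


def order_questions_by_frequency(history):
    """Return the distinct questions, most frequently asked first.

    `history` is a hypothetical list of questions asked in past sessions.
    Questions with equal counts keep the order in which they first appeared.
    """
    counts = Counter(history)
    return [question for question, _ in counts.most_common()]


# Hypothetical history of questions asked in earlier sessions.
history = [
    "What is your occupation?",
    "What is your monthly income?",
    "What is your monthly income?",
    "How old are you?",
    "What is your monthly income?",
    "How old are you?",
]
ordered = order_questions_by_frequency(history)
```

The server would then send the entries of `ordered` to the user side one at a time, moving on only when the feedback matches the expected result.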
S2, judging whether the feedback information is consistent with the expected result, if so, sending the next question to the user side, otherwise, marking the feedback information as objection information;
Specifically, the expected results in this step come from a script compiled by statistically sorting the questions previously put to users, and the script holds an expected result for each question. For example, for the question "What is your income?", the expected result in the script is a specific value or interval, and any answer that does not conform to the script is objection information. If the script expects a specific monthly salary amount and the user side answers "My income is low", the answer falls outside the range of expected results and is marked as objection information.
S3, comparing the objection information with a preset objection type in a text similarity manner to determine the objection type of the objection information;
Specifically, the objections can be divided into a plurality of categories by traversing the work logs stored on the user side, and the division can be set according to the scenario. For example, in an insurance sales scenario, the objection types may include premium, insurance already bought, good health, and no need to buy insurance; other categories include occupation, illness, age, income, family, insurance product, time, premium, and amount. Different types correspond to different expected results in the knowledge graph.
The text similarity calculation in this step may use a model such as an RNN or a CNN to determine which objection type the objection information belongs to.
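The idea of assigning an objection to the most similar preset type can be sketched as follows. The application names RNN or CNN models for this step; the sketch below swaps in a much simpler bag-of-words cosine similarity purely for illustration. The type names, reference phrases, and function names are all hypothetical.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercased, whitespace-tokenized term counts (a stand-in for learned features)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical preset objection types, each described by a short reference phrase.
OBJECTION_TYPES = {
    "premium": "the premium is too expensive",
    "already_insured": "i have already bought insurance",
    "no_need": "i do not need insurance",
}


def classify_objection(text):
    """Return the preset objection type whose reference phrase is most similar to `text`."""
    vec = bag_of_words(text)
    return max(
        OBJECTION_TYPES,
        key=lambda name: cosine(vec, bag_of_words(OBJECTION_TYPES[name])),
    )
```

A trained neural model would replace `bag_of_words` and `cosine` in practice; the selection of the best-matching type stays the same.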
S4, extracting a named entity in the objection information, constructing a target entity triple by taking the objection type as the entity, the named entity as the attribute, and the value corresponding to the named entity as the attribute value, and traversing a preset knowledge graph by taking the target entity triple as a key element to obtain a plurality of candidate entity triples related to the objection information;
Specifically, a named entity is a person name, organization name, place name, or any other entity identified by a name. In the objection information, the named entity is the subject that performs the action, or a noun appearing before or after an adjective. For example, in the objection "The child plays the game for too long", the named entity is "child".
There are generally two ways to construct triples in a knowledge graph: <entity, relationship, entity> and <entity, attribute, attribute value>. The former represents the relationship between two entities, and the latter represents an attribute of a single entity. Triples may be stored in open-source graph or triple stores such as Neo4j or Apache Jena, which are queried with Cypher and SPARQL respectively. When the knowledge graph is traversed, either the attribute values or the attributes of the triples can serve as the connection points between triples.
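The <entity, attribute, attribute value> form and the use of attributes or attribute values as connection points can be sketched in memory as follows. This is an illustrative sketch only: it uses a plain Python list rather than a graph database, and the `Triple` type, the helper names, and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    """An <entity, attribute, attribute value> triple."""
    entity: str
    attribute: str
    value: str


def build_target_triple(objection_type, named_entity, value):
    """Objection type becomes the entity, the named entity the attribute,
    and the value attached to the named entity the attribute value."""
    return Triple(entity=objection_type, attribute=named_entity, value=value)


def index_by(triples, field):
    """Group triples by one field so that field can act as a connection point."""
    index = defaultdict(list)
    for triple in triples:
        index[getattr(triple, field)].append(triple)
    return index


# Hypothetical data for illustration.
target = build_target_triple("income", "monthly salary", "low")
graph = [
    Triple("customer", "monthly salary", "low"),
    Triple("customer", "occupation", "teacher"),
    Triple("teacher", "monthly salary", "medium"),
]
by_attribute = index_by(graph, "attribute")  # attributes as connection points
by_value = index_by(graph, "value")          # attribute values as connection points
```

Looking up `by_attribute[target.attribute]` then gives every stored triple that shares the target's attribute, which is the starting point for the traversal.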
S5, extracting the candidate answers in the candidate entity triples, scoring the candidate answers to obtain a scoring result, and determining an expected result corresponding to the objection information according to the scoring result.
Specifically, scoring may use a machine learning regression algorithm: a GBM (gradient boosting machine) training set is established, a GBM regression model is trained on it, and the trained GBM regression model is then used to score the candidate answers. According to the scoring result, the candidate answer with the highest score is taken as the expected result corresponding to the objection information. The identified expected result can be reviewed manually to check its accuracy.
In the embodiment, the feedback information of the user is effectively analyzed by adopting a triple mode of { entity, attribute and attribute value }, so that the technical problem that the intelligent assistant cannot quickly and accurately answer the question due to the limited deep learning content is solved.
Fig. 2 is a schematic diagram of an information identification process in a knowledge-graph-based information analysis method according to an embodiment of the present application, where as shown in the figure, the determining whether the feedback information is consistent with an expected result includes:
S21, performing word vector conversion on the feedback information and the expected result to generate a feedback information word vector and an expected result word vector;
Specifically, word vector conversion may use a tool such as Word2vec or a BERT model. During conversion, a dimension reduction operation can be applied to convert the multi-dimensional word vector into a two-dimensional word vector, which makes the similarity comparison easier.
S22, calculating the similarity between the feedback information word vector and the expected result word vector, if the similarity is larger than a preset similarity threshold, determining that the feedback information is consistent with the expected result, otherwise, determining that the feedback information is inconsistent with the expected result.
Specifically, the similarity may be calculated by taking the product of the two word vectors. If the eigenvalue of the matrix obtained from the product is 0, the two word vectors are completely consistent; if the eigenvalue is less than 1, its actual value is used as the similarity; and if the eigenvalue is greater than 1, its fractional part is used as the similarity.
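The threshold comparison in step S22 can be sketched as follows. Note that this sketch does not implement the product-and-eigenvalue measure described above; it substitutes cosine similarity, a common choice, only to illustrate how a similarity score is compared against a preset threshold. The two-dimensional vectors and the threshold of 0.8 are hypothetical values.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def is_consistent(feedback_vec, expected_vec, threshold=0.8):
    """Feedback counts as consistent when its similarity exceeds the threshold."""
    return cosine_similarity(feedback_vec, expected_vec) > threshold


# Hypothetical two-dimensional word vectors after dimension reduction.
close_match = is_consistent([1.0, 0.0], [0.9, 0.1])
mismatch = is_consistent([1.0, 0.0], [0.0, 1.0])
```

Feedback for which `is_consistent` returns `False` would be marked as objection information and passed on to step S3.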
According to the method and the device, the feedback information is effectively classified, so that the objection information is quickly obtained, analysis is conveniently performed on the objection information, and the data volume needing to be analyzed is reduced.
In one embodiment, the extracting named entities in the objection information includes:
performing word vector conversion on the objection information to obtain an objection information word vector;
Specifically, the word vector conversion may use Word2vec or another word vector conversion model to perform the embedding.
Inputting the objection information word vector into a preset conditional random field model to generate an initial recognition result;
Conditional random fields (CRFs) are discriminative probabilistic models used to label or analyze sequence data, such as natural language text or biological sequences. A conditional random field is a conditional probability distribution model P(Y | X) of a set of output random variables Y given a set of input random variables X, characterized by the assumption that the output random variables form a Markov random field. Conditional random fields can be viewed as a generalization of the maximum entropy Markov model for labeling problems.
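For reference, the linear-chain form of P(Y | X) that is commonly used for sequence labeling can be written as below. The feature functions f_k, the weights \lambda_k, and the sequence length T are generic symbols from the standard formulation and are not taken from the application.

```latex
P(Y \mid X) = \frac{1}{Z(X)} \exp\!\left( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y_{t-1}, y_t, X, t) \right),
\qquad
Z(X) = \sum_{Y'} \exp\!\left( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y'_{t-1}, y'_t, X, t) \right)
```

Here Z(X) sums over all possible label sequences Y', so that the probabilities of the label sequences for a given input X sum to one.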
And inputting the initial recognition result into a preset bidirectional recurrent neural network model for re-recognition to obtain the named entity.
The bidirectional recurrent neural network has memory, parameter sharing, and Turing completeness, so it can learn the nonlinear characteristics of a sequence efficiently. Recurrent neural networks are applied in natural language processing (NLP) tasks such as speech recognition, language modeling, and machine translation.
In this embodiment, the named entity is effectively extracted from the vectorized objection information by using the conditional random field model and the bidirectional recurrent neural network model, so that the speed and the accuracy of information identification are improved.
In one embodiment, the traversing a preset knowledge-graph to obtain a plurality of candidate entity triples related to the objection information includes:
traversing the preset knowledge graph by taking the named entity as a query target to obtain all first entity triples containing the query target;
extracting attributes in the first entity triples, and if the attribute in a first entity triple is consistent with the attribute in the target entity triple, marking that first entity triple as a second entity triple;
Specifically, if the attribute of a first entity triple is "officer" and the attribute of the target entity triple is also "officer", the two attributes are consistent. Only triples whose attribute matches that of the target entity triple are needed in this embodiment, which allows the expected result corresponding to the objection information to be obtained accurately.
And acquiring the position of each second entity triple in the knowledge graph, extracting an upstream entity triple and a downstream entity triple of the second entity triples according to the position, and summarizing the second entity triples, the upstream entity triples and the downstream entity triples to obtain a plurality of candidate entity triples.
Specifically, each entity or attribute in the knowledge graph may be connected to more than one triple. By obtaining the upstream and downstream triples, all candidate answers related to the objection information can be extracted, which avoids omissions.
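The three traversal steps above can be sketched over an in-memory list of triples as follows. The application does not define "upstream" and "downstream" precisely, so the sketch assumes that an upstream triple is one whose attribute value is the entity of a second entity triple, and a downstream triple is one whose entity is the attribute value of a second entity triple. The `Triple` type, the function name, and the sample graph are hypothetical.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """An <entity, attribute, attribute value> triple."""
    entity: str
    attribute: str
    value: str


def candidate_triples(graph, target):
    """Collect second entity triples plus their upstream and downstream neighbours."""
    named_entity = target.attribute  # the named entity is the target's attribute
    # First entity triples: every triple that mentions the named entity in any field.
    first = [t for t in graph if named_entity in t]
    # Second entity triples: first triples whose attribute matches the target's.
    second = [t for t in first if t.attribute == target.attribute]
    candidates = list(second)
    for s in second:
        upstream = [t for t in graph if t.value == s.entity]    # assumed definition
        downstream = [t for t in graph if t.entity == s.value]  # assumed definition
        for t in upstream + downstream:
            if t not in candidates:  # avoid duplicates while keeping order
                candidates.append(t)
    return candidates


# Hypothetical knowledge graph and target triple.
graph = [
    Triple("customer", "occupation", "civil servant"),
    Triple("civil servant", "level", "section chief"),
    Triple("household", "member", "customer"),
    Triple("customer", "age", "35"),
]
target = Triple("income", "occupation", "civil servant")
candidates = candidate_triples(graph, target)
```

In a real deployment the list scans would be replaced by queries against the graph database, but the grouping into second, upstream, and downstream triples would follow the same pattern.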
In this embodiment, the candidate triples are obtained efficiently by using the knowledge graph, which reduces the amount of data to be analyzed and effectively improves the efficiency with which the intelligent assistant answers customer questions.
In one embodiment, the extracting the candidate answers in each candidate entity triplet, scoring each candidate answer to obtain a scoring result, and determining an expected result corresponding to the objection information according to the scoring result includes:
extracting attributes and attribute values in the candidate entity triples, and acquiring value intervals of all historical attribute values corresponding to the attributes from a system log according to the attributes;
Specifically, if the attribute is age, the value interval may be 20 to 80. That is, the attribute value in this step must fall within the normal value range of the attribute; for example, an age cannot be negative.
If the attribute value is not in the value range, deleting the candidate entity triple corresponding to the attribute value;
and obtaining weights corresponding to attributes in the rest candidate entity triples, scoring the candidate answers according to the weights, and taking the candidate answer with the highest score as an expected result corresponding to the objection information.
Different attributes correspond to different weights; for example, the weight for age may be 0.8 and the weight for occupation 0.6. The attribute weights are set according to the application scenario.
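The range filtering and weight-based selection described above can be sketched as follows. This is a minimal illustration that assumes each candidate is a (attribute, attribute value, answer) tuple, that the score of a candidate is simply the weight of its attribute, and that attributes with no known historical interval are kept. The function names and sample data are hypothetical; the interval 20 to 80 and the weights 0.8 and 0.6 follow the examples in the text.

```python
def in_range(attribute, value, value_ranges):
    """True when the value lies inside the historical interval for its attribute."""
    if attribute not in value_ranges:
        return True  # assumption: no known interval means the candidate is kept
    low, high = value_ranges[attribute]
    return low <= value <= high


def pick_expected_result(candidates, value_ranges, weights):
    """Drop out-of-range candidates, then return the answer with the highest weight."""
    kept = [c for c in candidates if in_range(c[0], c[1], value_ranges)]
    if not kept:
        return None
    best = max(kept, key=lambda c: weights.get(c[0], 0.0))
    return best[2]


# Hypothetical candidates: (attribute, attribute value, candidate answer).
candidates = [
    ("age", -5, "answer A"),
    ("age", 35, "answer B"),
    ("occupation", "teacher", "answer C"),
]
value_ranges = {"age": (20, 80)}
weights = {"age": 0.8, "occupation": 0.6}
expected = pick_expected_result(candidates, value_ranges, weights)
```

A richer scoring function, such as the regression model mentioned for step S5, could replace the plain weight lookup without changing the filtering step.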
In the embodiment, the dimension of information identification is simplified by using the parameter of the attribute value, so that the efficiency of answering the client question by the intelligent assistant is improved.
In one embodiment, after extracting the candidate answers in each candidate entity triplet, scoring each candidate answer to obtain a scoring result, and determining an expected result corresponding to the objection information according to the scoring result, the method further includes:
determining a follow-up question according to the expected result;
receiving response information of the user side to the follow-up question, and judging whether the response information contains an interrogative word;
and if the response information contains an interrogative word, re-determining the expected result, and if not, continuing to send questions to the user side.
The interrogative words are preset according to the requirements of the scenario. For example, if the expected result is that the customer is a government employee, the follow-up question asks for the customer's "level". If the user side answers "What is the level?", the response information contains an interrogative word, which indicates that the customer is not a government employee, so the expected result needs to be re-determined.
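The check on the response to the follow-up question can be sketched as follows. The list of interrogative words is a hypothetical preset for an English-language scenario, and the function names and returned action labels are illustrative only.

```python
import re

# Hypothetical preset interrogative words for the scenario.
QUESTION_WORDS = {"what", "why", "how", "which", "who", "when", "where"}


def contains_question_word(response):
    """True when any preset interrogative word appears as a token in the response."""
    tokens = re.findall(r"[a-z']+", response.lower())
    return any(token in QUESTION_WORDS for token in tokens)


def next_action(response):
    """Decide whether to re-determine the expected result or keep asking questions."""
    if contains_question_word(response):
        return "redetermine_expected_result"
    return "send_next_question"


action_for_question = next_action("What is the level?")
action_for_answer = next_action("Section chief.")
```

Tokenizing before matching avoids false hits on words that merely contain an interrogative word as a substring, such as "somewhat".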
In this embodiment, the expected result obtained by the method can be effectively verified through the user's response to the follow-up question, so that the parameters in the scheme can be corrected in time.
The technical features mentioned in any of the above embodiments or implementations also apply to the embodiment corresponding to fig. 3 of the present application, and similar details are not repeated below.
In the above description, a method for analyzing information based on a knowledge graph according to the present application is described, and an apparatus for analyzing information based on a knowledge graph is described below.
A structure of an information analysis apparatus based on a knowledge-graph as shown in fig. 3 is applicable to information analysis based on a knowledge-graph. The knowledge-graph-based information analysis apparatus in the embodiment of the present application can implement the steps corresponding to the knowledge-graph-based information analysis method performed in the embodiment corresponding to fig. 1 described above. The functions realized by the knowledge graph-based information analysis device can be realized by hardware, and can also be realized by hardware executing corresponding software. The hardware or software includes one or more modules corresponding to the above functions, which may be software and/or hardware.
In one embodiment, a knowledge-graph-based information analysis apparatus is provided, as shown in fig. 3, including the following modules:
an information transceiving module 10, configured to send a question to the user side and receive feedback information from the user side;
an information identification module 20, configured to judge whether the feedback information is consistent with an expected result, and if so, send the next question to the user side, otherwise mark the feedback information as objection information;
a type determining module 30, configured to compare the text similarity between the objection information and preset objection types to determine the objection type of the objection information;
a triple selecting module 40, configured to extract a named entity from the objection information, construct a target entity triple with the objection type as the entity, the named entity as the attribute, and the numerical value corresponding to the named entity as the value, and traverse a preset knowledge graph with the target entity triple as the key element to obtain a plurality of candidate entity triples related to the objection information;
and a result generation module 50, configured to extract the candidate answers from the candidate entity triples, score the candidate answers to obtain a scoring result, and determine the expected result corresponding to the objection information according to the scoring result.
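The work of the triple selecting module and the result generation module can be illustrated with a small sketch. It rests on several assumptions that are not taken from the present application: the knowledge graph is modeled as a flat list of triples, the traversal is reduced to a simple match on entity or attribute, the candidate answer is taken to be the triple's value, and the names `Triple`, `select_candidates`, `best_answer` and all sample data are hypothetical.

```python
from typing import Dict, List, NamedTuple, Optional


class Triple(NamedTuple):
    entity: str
    attribute: str
    value: str


def build_target_triple(objection_type: str, named_entity: str, value: str) -> Triple:
    # Objection type as the entity, named entity as the attribute, plus its value.
    return Triple(objection_type, named_entity, value)


def select_candidates(graph: List[Triple], target: Triple) -> List[Triple]:
    # Simplified traversal (an assumption): keep every triple that shares
    # the target's entity or the target's attribute.
    return [t for t in graph
            if t.entity == target.entity or t.attribute == target.attribute]


def best_answer(candidates: List[Triple], weights: Dict[str, float]) -> Optional[str]:
    # Score each candidate by the weight of its attribute (unknown
    # attributes score 0.0) and return the value of the top candidate.
    if not candidates:
        return None
    best = max(candidates, key=lambda t: weights.get(t.attribute, 0.0))
    return best.value


# Hypothetical sample data for illustration only.
graph = [
    Triple("occupation", "employer", "government agency"),
    Triple("occupation", "job title", "section chief"),
    Triple("income", "salary", "5000"),
]
target = build_target_triple("occupation", "employer", "private company")
candidates = select_candidates(graph, target)
answer = best_answer(candidates, {"employer": 0.7, "job title": 0.3})
```

Here the attribute weights stand in for whatever weighting scheme the deployed system uses; the sketch only shows how a target triple narrows the graph to candidates and how a weighted score picks one answer.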
In one embodiment, the information identification module is further configured to:
performing word vector conversion on the feedback information and the expected result to generate a feedback information word vector and an expected result word vector;
and calculating the similarity between the feedback information word vector and the expected result word vector, if the similarity is greater than a preset similarity threshold, determining that the feedback information is consistent with the expected result, otherwise, determining that the feedback information is inconsistent with the expected result.
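The consistency check can be sketched as a thresholded cosine similarity. The description does not specify the word vector model or the similarity measure, so this example substitutes simple bag-of-words count vectors and cosine similarity as stand-ins; the threshold value and the function names are likewise assumptions.

```python
import math
from collections import Counter

SIMILARITY_THRESHOLD = 0.5  # hypothetical preset similarity threshold


def to_vector(text: str) -> Counter:
    # Stand-in for word vector conversion: a bag-of-words count vector.
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    # Dot product over shared terms, divided by the product of the norms.
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_consistent(feedback: str, expected: str,
                  threshold: float = SIMILARITY_THRESHOLD) -> bool:
    # Feedback is consistent with the expected result only when the
    # similarity is strictly greater than the threshold.
    return cosine_similarity(to_vector(feedback), to_vector(expected)) > threshold
```

Feedback that fails this check would be marked as objection information. In practice a trained embedding model would replace `to_vector`, and the threshold would be tuned on real dialogue data.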
In one embodiment, a computer device is provided, the computer device includes a memory and a processor, the memory stores computer readable instructions, and the computer readable instructions, when executed by the processor, cause the processor to execute the steps of the knowledge-graph based information analysis method in the above embodiments.
In one embodiment, a storage medium storing computer-readable instructions is provided, which when executed by one or more processors, cause the one or more processors to perform the steps of the method for knowledge-graph-based information analysis in the above embodiments. The storage medium may be a nonvolatile storage medium or a volatile storage medium, and the present application is not limited in particular.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by associated hardware instructed by a program, which may be stored in a computer-readable storage medium, and the storage medium may include: a Read Only Memory (ROM), a Random Access Memory (RAM), a magnetic or optical disk, or the like.
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-described embodiments merely illustrate some embodiments of the present application. They are described in specific detail, but are not to be construed as limiting the scope of the present application. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.
Claims (10)
1. A knowledge-graph-based information analysis method is characterized by comprising the following steps:
sending a question to a user side and receiving feedback information from the user side;
judging whether the feedback information is consistent with an expected result, if so, sending the next question to the user side, otherwise, marking the feedback information as objection information;
comparing the text similarity between the objection information and preset objection types to determine the objection type of the objection information;
extracting a named entity from the objection information, constructing a target entity triple with the objection type as the entity, the named entity as the attribute, and the numerical value corresponding to the named entity as the value, and traversing a preset knowledge graph with the target entity triple as the key element to obtain a plurality of candidate entity triples related to the objection information;
and extracting candidate answers in the candidate entity triples, scoring the candidate answers to obtain a scoring result, and determining an expected result corresponding to the objection information according to the scoring result.
2. The method of knowledge-graph based information analysis of claim 1, wherein said determining whether the feedback information is consistent with an expected result comprises:
performing word vector conversion on the feedback information and the expected result to generate a feedback information word vector and an expected result word vector;
and calculating the similarity between the feedback information word vector and the expected result word vector, if the similarity is greater than a preset similarity threshold, determining that the feedback information is consistent with the expected result, otherwise, determining that the feedback information is inconsistent with the expected result.
3. The method of knowledge-graph-based information analysis according to claim 1, wherein said extracting named entities in said objection information comprises:
performing word vector conversion on the objection information to obtain an objection information word vector;
inputting the objection information word vector into a preset conditional random field model to generate an initial recognition result;
and inputting the initial recognition result into a preset bidirectional recurrent neural network model for re-recognition to obtain the named entity.
4. The method of knowledge-graph based information analysis of claim 3, wherein traversing a preset knowledge-graph to obtain a plurality of candidate entity triples related to the objection information comprises:
traversing the preset knowledge graph by taking the named entity as a query target to obtain all first entity triples containing the query target;
extracting the attributes in the first entity triples, and if the attribute in a first entity triple is consistent with the attribute in the target entity triple, marking that first entity triple as a second entity triple;
and acquiring the position of each second entity triple in the knowledge graph, extracting an upstream entity triple and a downstream entity triple of the second entity triples according to the position, and summarizing the second entity triples, the upstream entity triples and the downstream entity triples to obtain a plurality of candidate entity triples.
5. The method of any one of claims 1 to 4, wherein the extracting candidate answers in each candidate entity triplet, scoring each candidate answer to obtain a scoring result, and determining an expected result corresponding to the objection information according to the scoring result comprises:
extracting attributes and attribute values in the candidate entity triples, and acquiring value intervals of all historical attribute values corresponding to the attributes from a system log according to the attributes;
if the attribute value is not in the value interval, deleting the candidate entity triple corresponding to the attribute value;
and obtaining weights corresponding to attributes in the rest candidate entity triples, scoring the candidate answers according to the weights, and taking the candidate answer with the highest score as an expected result corresponding to the objection information.
6. The knowledge-graph-based information analysis method according to claim 5, wherein after extracting candidate answers in the candidate entity triples, scoring the candidate answers to obtain a scoring result, and determining an expected result corresponding to the objection information according to the scoring result, the method further comprises:
determining a follow-up question according to the expected result;
receiving the user side's response information to the follow-up question, and judging whether the response information contains a query word;
and if the response information contains a query word, re-determining the expected result, and if not, continuing to send questions to the user side.
7. An information analysis device based on knowledge graph is characterized by comprising the following modules:
an information transceiving module, configured to send a question to the user side and receive feedback information from the user side;
an information identification module, configured to judge whether the feedback information is consistent with an expected result, and if so, send the next question to the user side, otherwise mark the feedback information as objection information;
a type determining module, configured to compare the text similarity between the objection information and preset objection types to determine the objection type of the objection information;
a triple selecting module, configured to extract a named entity from the objection information, construct a target entity triple with the objection type as the entity, the named entity as the attribute, and the numerical value corresponding to the named entity as the value, and traverse a preset knowledge graph with the target entity triple as the key element to obtain a plurality of candidate entity triples related to the objection information;
and a result generation module, configured to extract the candidate answers from the candidate entity triples, score the candidate answers to obtain a scoring result, and determine the expected result corresponding to the objection information according to the scoring result.
8. The apparatus of claim 7, wherein the information identification module is further configured to:
performing word vector conversion on the feedback information and the expected result to generate a feedback information word vector and an expected result word vector;
and calculating the similarity between the feedback information word vector and the expected result word vector, if the similarity is greater than a preset similarity threshold, determining that the feedback information is consistent with the expected result, otherwise, determining that the feedback information is inconsistent with the expected result.
9. A computer device comprising a memory and a processor, the memory having stored therein computer-readable instructions, wherein the computer-readable instructions, when executed by the processor, cause the processor to perform the steps of the knowledge-graph based information analysis method of any one of claims 1 to 6.
10. A storage medium having stored thereon computer-readable instructions, which, when executed by one or more processors, cause the one or more processors to perform the steps of the method for knowledge-graph based information analysis of any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010156693.9A CN111368096A (en) | 2020-03-09 | 2020-03-09 | Knowledge graph-based information analysis method, device, equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111368096A (en) | 2020-07-03 |
Family
ID=71208642
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010156693.9A Pending CN111368096A (en) | 2020-03-09 | 2020-03-09 | Knowledge graph-based information analysis method, device, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111368096A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112380865A (en) * | 2020-11-10 | 2021-02-19 | 北京小米松果电子有限公司 | Method, device and storage medium for identifying entity in text |
CN112732941A (en) * | 2021-01-15 | 2021-04-30 | 医渡云(北京)技术有限公司 | Model-based medical knowledge graph construction method, device, equipment and medium |
CN113128231A (en) * | 2021-04-25 | 2021-07-16 | 深圳市慧择时代科技有限公司 | Data quality inspection method and device, storage medium and electronic equipment |
CN113609291A (en) * | 2021-07-27 | 2021-11-05 | 科大讯飞(苏州)科技有限公司 | Entity classification method and device, electronic equipment and storage medium |
CN113761167A (en) * | 2021-09-09 | 2021-12-07 | 上海明略人工智能(集团)有限公司 | Session information extraction method, system, electronic device and storage medium |
WO2022057671A1 (en) * | 2020-09-16 | 2022-03-24 | 浙江大学 | Neural network–based knowledge graph inconsistency reasoning method |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109508420A (en) * | 2018-11-26 | 2019-03-22 | 北京羽扇智信息科技有限公司 | A kind of cleaning method and device of knowledge mapping attribute |
CN109726279A (en) * | 2018-12-30 | 2019-05-07 | 联想(北京)有限公司 | A kind of data processing method and device |
CN110825862A (en) * | 2019-11-06 | 2020-02-21 | 北京诺道认知医学科技有限公司 | Intelligent question-answering method and device based on pharmacy knowledge graph |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2022057671A1 (en) * | 2020-09-16 | 2022-03-24 | 浙江大学 | Neural network–based knowledge graph inconsistency reasoning method |
CN112380865A (en) * | 2020-11-10 | 2021-02-19 | 北京小米松果电子有限公司 | Method, device and storage medium for identifying entity in text |
CN112732941A (en) * | 2021-01-15 | 2021-04-30 | 医渡云(北京)技术有限公司 | Model-based medical knowledge graph construction method, device, equipment and medium |
CN112732941B (en) * | 2021-01-15 | 2023-07-07 | 医渡云(北京)技术有限公司 | Method, device, equipment and medium for constructing medical knowledge graph based on model |
CN113128231A (en) * | 2021-04-25 | 2021-07-16 | 深圳市慧择时代科技有限公司 | Data quality inspection method and device, storage medium and electronic equipment |
CN113609291A (en) * | 2021-07-27 | 2021-11-05 | 科大讯飞(苏州)科技有限公司 | Entity classification method and device, electronic equipment and storage medium |
CN113761167A (en) * | 2021-09-09 | 2021-12-07 | 上海明略人工智能(集团)有限公司 | Session information extraction method, system, electronic device and storage medium |
CN113761167B (en) * | 2021-09-09 | 2023-10-20 | 上海明略人工智能(集团)有限公司 | Session information extraction method, system, electronic equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN117033608B (en) | Knowledge graph generation type question-answering method and system based on large language model | |
CN111353310B (en) | Named entity identification method and device based on artificial intelligence and electronic equipment | |
CN111368096A (en) | Knowledge graph-based information analysis method, device, equipment and storage medium | |
CN112016313B (en) | Spoken language element recognition method and device and warning analysis system | |
CN111078837A (en) | Intelligent question and answer information processing method, electronic equipment and computer readable storage medium | |
CN111078876A (en) | Short text classification method and system based on multi-model integration | |
CN114119058A (en) | User portrait model construction method and device and storage medium | |
CN111125295A (en) | Method and system for obtaining food safety question answers based on LSTM | |
CN117076688A (en) | Knowledge question-answering method and device based on domain knowledge graph and electronic equipment | |
CN115577080A (en) | Question reply matching method, system, server and storage medium | |
CN113342958A (en) | Question-answer matching method, text matching model training method and related equipment | |
CN117520503A (en) | Financial customer service dialogue generation method, device, equipment and medium based on LLM model | |
CN116049376B (en) | Method, device and system for retrieving and replying information and creating knowledge | |
CN117332054A (en) | Form question-answering processing method, device and equipment | |
CN112579666A (en) | Intelligent question-answering system and method and related equipment | |
CN111104422A (en) | Training method, device, equipment and storage medium of data recommendation model | |
CN115730058A (en) | Reasoning question-answering method based on knowledge fusion | |
CN113468311B (en) | Knowledge graph-based complex question and answer method, device and storage medium | |
CN111666770B (en) | Semantic matching method and device | |
CN113095073B (en) | Corpus tag generation method and device, computer equipment and storage medium | |
CN112749530B (en) | Text encoding method, apparatus, device and computer readable storage medium | |
CN111400413B (en) | Method and system for determining category of knowledge points in knowledge base | |
CN113963235A (en) | Cross-category image recognition model reusing method and system | |
Dikshit et al. | Automating Questions and Answers of Good and Services Tax system using clustering and embeddings of queries | |
CN115618968B (en) | New idea discovery method and device, electronic device and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||