CN117349449A - Text generation method, device, equipment and medium based on knowledge graph - Google Patents

Info

Publication number
CN117349449A
Authority
CN
China
Prior art keywords
triplet
attribute
contained
verification
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311410600.0A
Other languages
Chinese (zh)
Inventor
吴钟强
车皓阳
谷鹰
姚雷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Geely Holding Group Co Ltd
Zhejiang Zeekr Intelligent Technology Co Ltd
Original Assignee
Zhejiang Geely Holding Group Co Ltd
Zhejiang Zeekr Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Geely Holding Group Co Ltd, Zhejiang Zeekr Intelligent Technology Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation
    • G06N5/022 Knowledge engineering; Knowledge acquisition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation
    • G06N5/027 Frames
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a text generation method, device, equipment and medium based on a knowledge graph, and relates to the field of natural language processing. The method comprises the following steps: acquiring a text generation instruction; generating a target text corresponding to the text generation instruction based on a pre-trained text generation model; performing triplet extraction on the target text to obtain the triples corresponding to the target text; carrying out authenticity verification on the triples based on a pre-built knowledge graph; and outputting the target text if the authenticity verification is passed. Because the target text generated by the text generation model is output only after its triples have passed authenticity verification against the knowledge graph, the content of the output target text is consistent with facts that actually exist, which improves the authenticity of the generated text.

Description

Text generation method, device, equipment and medium based on knowledge graph
Technical Field
The present application relates to the field of natural language processing, and in particular, to a method, an apparatus, a device, and a medium for generating text based on a knowledge graph.
Background
Text generation is a technology in the field of natural language processing that studies how machines can generate text automatically. With the development of text generation technology, text generation is widely applied in fields such as literary creation, news reporting, abstract generation, dialogue systems, machine translation, intelligent customer service, intelligent question answering and chat robots.
In the related art, when a large language model is used to generate text, suitable training data is usually obtained first, a neural network algorithm learns rules and features from the training data so as to capture the meaning of natural language text and yield a trained neural network model, and new text is then generated based on the input text and the neural network model. However, the generated text may still be untrue.
Disclosure of Invention
The application provides a text generation method, device, equipment and medium based on a knowledge graph, which are used for solving the problem in the related art that text generated by a large language model may still be untrue.
In a first aspect, the present application provides a text generation method based on a knowledge graph, including:
acquiring a text generation instruction;
generating a target text corresponding to the text generation instruction based on a pre-trained text generation model;
performing triplet extraction on the target text to obtain a triplet corresponding to the target text, wherein the triplet comprises an entity, an attribute and an attribute value;
carrying out authenticity verification on the triples based on a pre-constructed knowledge graph, wherein the knowledge graph comprises the relationship among entities, attributes and attribute values;
and if the authenticity verification is passed, outputting the target text.
In one possible implementation, carrying out authenticity verification on the triples based on a pre-constructed knowledge graph includes: based on the pre-constructed knowledge graph, performing at least one of entity verification, attribute verification and attribute value verification on the triples; if every verification performed passes, determining that the authenticity verification of the triplet passes; if any verification performed fails, determining that the authenticity verification of the triplet fails.
In one possible implementation manner, the text generation method based on the knowledge graph further comprises at least one of the following: performing entity verification on the triplet based on the pre-constructed knowledge graph, including: checking whether the entity contained in the triplet is among the entities contained in the knowledge graph; if so, determining that the entity verification of the triplet passes; if not, determining that the entity verification of the triplet fails; performing attribute verification on the triplet based on the pre-constructed knowledge graph, including: checking whether the attribute contained in the triplet is among the attributes contained in the knowledge graph; if so, determining that the attribute verification of the triplet passes; if not, determining that the attribute verification of the triplet fails; performing attribute value verification on the triplet based on the pre-constructed knowledge graph, including: checking whether the attribute value contained in the triplet is among the attribute values contained in the knowledge graph; if so, determining that the attribute value verification of the triplet passes; if not, determining that the attribute value verification of the triplet fails.
In one possible implementation manner, performing entity verification, attribute verification and attribute value verification on the triples based on a pre-constructed knowledge graph includes: acquiring an entity set of the entities contained in the knowledge graph; checking whether the entity contained in the triplet is in the entity set; if the entity contained in the triplet is in the entity set, determining that the entity verification of the triplet passes, and acquiring an attribute set of the entity contained in the triplet based on the knowledge graph; checking whether the attribute contained in the triplet is in the attribute set; if the attribute contained in the triplet is in the attribute set, determining that the attribute verification of the triplet passes, and acquiring an attribute value of the attribute contained in the triplet based on the knowledge graph; checking whether the attribute value contained in the triplet is equal to the acquired attribute value; and if the attribute value contained in the triplet is equal to the acquired attribute value, determining that the attribute value of the triplet passes the verification.
In one possible implementation manner, the text generation method based on the knowledge graph further comprises: if the authenticity verification is not passed, determining the number of times the target text corresponding to the text generation instruction has been generated; and if the number of times is smaller than a count threshold, returning to the step of generating the target text corresponding to the text generation instruction based on the pre-trained text generation model.
In one possible implementation manner, the text generation method based on the knowledge graph further comprises: if the number of times is greater than or equal to the number of times threshold, outputting prompt information, wherein the prompt information is used for prompting the reason of unsuccessful text generation.
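The regenerate-until-threshold behaviour described in the two implementations above can be sketched as a bounded loop. This is a minimal illustration under stated assumptions, not the applicant's implementation: the function name `generate_with_retries`, the callables `generate` and `passes_check`, the default threshold of 3 and the wording of the prompt message are all hypothetical.

```python
from typing import Callable, Tuple


def generate_with_retries(
    instruction: str,
    generate: Callable[[str], str],
    passes_check: Callable[[str], bool],
    max_attempts: int = 3,  # hypothetical count threshold
) -> Tuple[bool, str]:
    """Regenerate until a candidate passes or the count threshold is reached."""
    attempts = 0
    while attempts < max_attempts:
        candidate = generate(instruction)  # generate (or regenerate) the target text
        attempts += 1
        if passes_check(candidate):        # authenticity verification passed
            return True, candidate
    # Count threshold reached: return prompt information explaining the failure.
    return False, (
        "Text generation failed: no candidate passed the authenticity "
        f"check within {max_attempts} attempts."
    )
```

The boolean in the returned pair lets a caller distinguish a verified target text from the prompt information without inspecting the string.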
In one possible implementation manner, the text generation method based on the knowledge graph further comprises: performing triplet extraction on the target text to obtain a triplet corresponding to the target text, including: inputting the target text into a pre-constructed triplet extraction model to perform triplet extraction, so as to obtain a triplet corresponding to the target text output by the triplet extraction model; and/or the text generation model is a language model or a sequence-to-sequence model.
In a second aspect, the present application provides a text generating device based on a knowledge graph, including:
the acquisition module is used for acquiring the text generation instruction;
the generation module is used for generating a target text corresponding to the text generation instruction based on a pre-trained text generation model;
the extraction module is used for extracting the triples of the target text to obtain triples corresponding to the target text, wherein the triples comprise entities, attributes and attribute values;
the verification module is used for carrying out authenticity verification on the triples based on a pre-established knowledge graph, wherein the knowledge graph comprises a relation among an entity, an attribute and an attribute value;
and the output module is used for outputting the target text when the authenticity verification is passed.
In a third aspect, the present application provides an electronic device, comprising: a processor, a memory communicatively coupled to the processor;
a memory for storing computer-executable instructions;
a processor for executing computer-executable instructions stored in a memory to implement the method of any one of the first aspects.
In a fourth aspect, the present application provides a computer-readable storage medium having stored therein computer-executable instructions for performing the method of any of the first aspects when the computer-executable instructions are executed.
In a fifth aspect, the present application provides a computer program product comprising a computer program which, when executed, implements the method of any one of the first aspects.
According to the text generation method, device, equipment and medium based on the knowledge graph provided by the application, a text generation instruction is acquired; a target text corresponding to the text generation instruction is generated based on a pre-trained text generation model; triplet extraction is performed on the target text to obtain the triples corresponding to the target text, the triples comprising entities, attributes and attribute values; authenticity verification is carried out on the triples based on a pre-built knowledge graph, the knowledge graph comprising the relationships among entities, attributes and attribute values; and if the authenticity verification passes, the target text is output. In this process, the target text generated by the text generation model is output only after its triples have passed authenticity verification against the knowledge graph, so that the content of the output target text is consistent with facts that actually exist, thereby improving the authenticity of the generated text.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
Fig. 1 is an application scenario schematic diagram of a knowledge-graph-based text generation method according to an exemplary embodiment of the present application;
fig. 2 is a flow chart of a text generation method based on a knowledge graph according to an exemplary embodiment of the present application;
fig. 3 is another flow chart of a knowledge-graph-based text generation method according to an exemplary embodiment of the present application;
fig. 4 is a schematic structural diagram of a text generating device based on a knowledge graph according to an exemplary embodiment of the present application;
fig. 5 is a schematic structural diagram of an electronic device according to an exemplary embodiment of the present application.
Specific embodiments thereof have been shown by way of example in the drawings and will herein be described in more detail. These drawings and the written description are not intended to limit the scope of the inventive concepts in any way, but to illustrate the concepts of the present application to those skilled in the art by reference to specific embodiments.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples are not representative of all implementations consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with some aspects of the present application as detailed in the accompanying claims.
The terms first, second and the like in the description and in the claims, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that embodiments of the present application described herein may be capable of operation in sequences other than those illustrated or described herein, for example. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, article, or apparatus.
It should be noted that the text generation method, device, equipment and medium based on the knowledge graph provided by the application can be used in the field of natural language processing, and can also be used in fields other than natural language processing, such as artificial intelligence, the internet of things or other related fields. The application does not limit the fields in which the method, device, equipment and medium are applied.
First, some terms related to the present application will be explained:
Knowledge graph: essentially a semantic network, that is, a graph-based data structure consisting of points and edges. In the knowledge graph, each point represents an entity existing in the real world and each edge represents a relationship between entities. An entity can be a thing in the real world, such as a person, a place name, a company, a telephone or an animal, and a relationship expresses a certain connection between different entities. A knowledge graph is the network of relationships obtained by linking all the different kinds of information together, and it therefore provides the ability to analyze problems from the perspective of "relationships".
In the related art, when a large language model is used to generate text, new text is usually generated based on the input text and a trained neural network model. However, because the neural network model lacks an overall model of the objective world, the generated text may contain things that do not exist in the objective world, so the generated text may still be untrue.
In order to solve the above problem, the embodiment of the application provides a text generation scheme based on a knowledge graph. The scheme generates a target text based on a text generation model, performs triplet extraction on the target text to obtain the triplet corresponding to the target text, introduces a knowledge graph and verifies the authenticity of the triplet of the target text against it, and outputs the target text when the authenticity verification passes, so that the content of the output target text is consistent with facts that actually exist, thereby improving the authenticity of the generated text.
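As an illustration of the structure just described, a small knowledge graph can be held as a collection of labeled edges between entities. This is only a sketch: the `Edge` type, the helper functions and the sample facts are hypothetical and are not taken from the application.

```python
from collections import defaultdict
from typing import NamedTuple


class Edge(NamedTuple):
    """One labeled edge: head entity --relation--> tail entity."""
    head: str
    relation: str
    tail: str


# Hypothetical sample facts; each entity is a point, each Edge is an edge.
EDGES = [
    Edge("Alice", "works_for", "ExampleCorp"),
    Edge("ExampleCorp", "located_in", "Hangzhou"),
    Edge("Alice", "lives_in", "Hangzhou"),
]


def build_adjacency(edges):
    """Index edges by head entity so relations can be followed quickly."""
    adjacency = defaultdict(list)
    for edge in edges:
        adjacency[edge.head].append((edge.relation, edge.tail))
    return adjacency


def neighbors(adjacency, entity):
    """Return the (relation, tail) pairs leaving the given entity."""
    return list(adjacency.get(entity, []))


graph = build_adjacency(EDGES)
```

Calling `neighbors(graph, "Alice")` follows every relationship that starts at one entity, which is the "relationship" view of the data that the definition refers to.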
Fig. 1 is an application scenario schematic diagram of a text generation method based on a knowledge graph according to an exemplary embodiment of the present application. As shown in fig. 1, the application scenario includes a client 11 and a server 12, where the number of clients 11 may be at least one. In practical application, when detecting a text generation instruction input by a user through the client 11, the server 12 executes the text generation method based on the knowledge graph provided by the application to obtain a target text corresponding to the text generation instruction. Correspondingly, the client 11 obtains the target text corresponding to the text generation instruction from the server 12.
It should be noted that the server 12 may be replaced by a server cluster or other computing device with a certain computing power. The client 11 may be a computer, a mobile phone, a notebook or a personal digital assistant (Personal Digital Assistant, PDA for short), etc.
A knowledge-graph-based text generation method according to an exemplary embodiment of the present application will be described below with reference to fig. 2 in conjunction with the application scenario of fig. 1. It should be noted that the above application scenario is only shown for the convenience of understanding the spirit and principles of the present application, and the embodiments of the present application are not limited by the application scenario shown in fig. 1.
Fig. 2 is a flow chart of a text generation method based on a knowledge graph according to an exemplary embodiment of the present application. As shown in fig. 2, the text generation method based on the knowledge graph in the embodiment of the application includes the following steps:
S201, acquiring a text generation instruction.
In this step, as shown in fig. 1, the server 12 acquires the text generation instruction upon detecting that the user has input it through the client 11. Optionally, the text generation instruction may be text or speech. For example, the text generation instruction is the text "What is the height of A?" input by the user, or the speech "What is the height of A?" input by the user; the format of the text generation instruction is not limited here.
S202, generating target text corresponding to the text generation instruction based on a pre-trained text generation model.
Correspondingly, the text generation instruction is input into the pre-trained text generation model to generate the target text corresponding to the text generation instruction. For example, for the text generation instruction "What is the height of A?" acquired in step S201, the generated target text is "The height of A is 174cm".
S203, performing triplet extraction on the target text to obtain a triplet corresponding to the target text, wherein the triplet comprises an entity, an attribute and an attribute value.
In this step, a preset rule is used to perform triplet extraction on the target text, yielding the triplet corresponding to the target text, namely the entity, attribute and attribute value corresponding to the target text. Illustratively, the entity, attribute and attribute value are denoted (s, p, o), where s denotes the entity of the target text, p denotes the attribute of the target text, and o denotes the attribute value of the target text. Correspondingly, triplet extraction on the target text "The height of A is 174cm" yields the triplet ("A", "height", "174cm"); that is, the entity is "A", the attribute is "height", and the attribute value is "174cm".
S204, carrying out authenticity verification on the triples based on a pre-established knowledge graph, wherein the knowledge graph comprises the relationship among the entity, the attribute and the attribute value.
Further, based on the pre-established knowledge graph, authenticity verification is carried out on the triplet using a preset rule to obtain an authenticity verification result for the triplet. If the relationship among the entity, the attribute and the attribute value in the triplet is consistent with the relationship among the entity, the attribute and the attribute value in the knowledge graph, the authenticity verification for the triplet passes; if the two are inconsistent, the authenticity verification for the triplet fails.
S205, outputting the target text if the authenticity verification is passed.
For example, if the knowledge graph defines the height of "A" as "174cm", the authenticity verification for the triplet ("A", "height", "174cm") passes, and the target text "The height of A is 174cm" is output.
According to the text generation method based on the knowledge graph, the target text corresponding to the text generation instruction is generated by the pre-trained text generation model; on this basis, triplet extraction is performed on the target text to obtain the triples corresponding to the target text, authenticity verification is carried out on the triples of the target text based on the knowledge graph, and the target text is output when the authenticity verification passes, so that the content of the output target text is consistent with facts that actually exist and the authenticity of the generated text is improved.
On the basis of the above embodiment, it should be noted that in the related art the knowledge graph is either fused into the training process of the model or vectorized. Both approaches usually cause knowledge loss during text generation, so the model acquires insufficient knowledge and the generated text may be untrue. To address this problem, the present application adopts an explicit triplet-based approach: authenticity verification is performed on the triples corresponding to the target text based on the pre-constructed knowledge graph, which effectively reduces the loss of knowledge from the knowledge graph and further improves the authenticity of the generated text.
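Steps S202 to S205 can be sketched end to end as follows. This is a minimal illustration under stated assumptions rather than the method as implemented by the applicant: the regular expression stands in for the unspecified "preset rule" of S203, the knowledge graph is reduced to a plain set of known triples, `stub_model` replaces the pre-trained text generation model, and returning `None` for unverifiable text is an assumption of this sketch.

```python
import re
from typing import Callable, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (entity s, attribute p, attribute value o)

# Hypothetical "preset rule" for S203: "the <attribute> of <entity> is <value>".
_RULE = re.compile(
    r"^\s*the\s+(?P<p>.+?)\s+of\s+(?P<s>.+?)\s+is\s+(?P<o>.+?)\s*\.?\s*$",
    re.IGNORECASE,
)


def extract_triple(text: str) -> Optional[Triple]:
    """S203: return (s, p, o), or None when the rule does not match."""
    match = _RULE.match(text)
    if match is None:
        return None
    return (match.group("s"), match.group("p"), match.group("o"))


def generate_verified_text(
    instruction: str,
    generate: Callable[[str], str],
    known_facts: Set[Triple],
) -> Optional[str]:
    """Run S202 to S205 once; return the text only if its triple is a known fact."""
    target_text = generate(instruction)                # S202
    triple = extract_triple(target_text)               # S203
    if triple is not None and triple in known_facts:   # S204
        return target_text                             # S205
    return None  # assumption of this sketch: unverifiable text is withheld


def stub_model(instruction: str) -> str:
    """Stub standing in for the pre-trained text generation model."""
    return "The height of A is 174cm"


facts = {("A", "height", "174cm")}
result = generate_verified_text("What is the height of A?", stub_model, facts)
```

Passing the generator in as a callable keeps the verification logic independent of whichever model produces the candidate text.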
Thus, in some embodiments, carrying out authenticity verification on the triples based on a pre-constructed knowledge graph includes: based on the pre-constructed knowledge graph, performing at least one of entity verification, attribute verification and attribute value verification on the triples; if every verification performed passes, determining that the authenticity verification of the triplet passes; if any verification performed fails, determining that the authenticity verification of the triplet fails.
In one implementation, for example, only entity verification is performed on the triplet based on the pre-constructed knowledge graph. If the entity verification passes, the authenticity verification of the triplet is determined to pass; if the entity verification fails, the authenticity verification of the triplet is determined to fail. For example, for the triplet ("A", "height", "174cm"), if the verification for "A" passes, the authenticity verification for ("A", "height", "174cm") is determined to pass; if the verification for "A" fails, the authenticity verification for ("A", "height", "174cm") is determined to fail.
In another implementation, entity verification, attribute verification and attribute value verification are all performed on the triplet based on the pre-constructed knowledge graph. If the entity verification, the attribute verification and the attribute value verification all pass, the authenticity verification of the triplet is determined to pass; if any one of them fails, the authenticity verification of the triplet is determined to fail. For example, for the triplet ("A", "height", "174cm"), if the verifications for "A", "height" and "174cm" all pass, the authenticity verification for ("A", "height", "174cm") is determined to pass; if the verification for any one of "A", "height" and "174cm" fails, the authenticity verification for ("A", "height", "174cm") is determined to fail.
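The two implementations above differ only in which verifications are performed, so they can be sketched with a single function that takes the enabled checks as a parameter and passes only when every enabled check passes. The names `KG`, `CHECKS` and `authenticity_check`, and the nested-dictionary layout of the knowledge graph, are assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable, Tuple

Triple = Tuple[str, str, str]  # (entity, attribute, attribute value)

# Hypothetical knowledge graph contents: entity -> {attribute: value}.
KG = {"A": {"height": "174cm"}}

# One predicate per kind of verification named in the text.
CHECKS: Dict[str, Callable[[Triple], bool]] = {
    "entity": lambda t: t[0] in KG,
    "attribute": lambda t: t[1] in KG.get(t[0], {}),
    "attribute_value": lambda t: KG.get(t[0], {}).get(t[1]) == t[2],
}


def authenticity_check(triple: Triple, enabled: Iterable[str]) -> bool:
    """Pass only if every enabled verification passes."""
    selected = list(enabled)
    if not selected:
        raise ValueError("at least one check must be enabled")
    return all(CHECKS[name](triple) for name in selected)
```

With only `"entity"` enabled, the triplet ("A", "height", "180cm") passes because "A" is a known entity; with all three checks enabled it fails on the attribute value, which shows how the choice of checks changes the strictness of the verification.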
Based on the above embodiments, in some embodiments, performing entity verification, attribute verification, and attribute value verification on the triples based on a pre-constructed knowledge graph includes: acquiring an entity set of the entities contained in the knowledge graph; checking whether the entity contained in the triplet is in the entity set; if the entity contained in the triplet is in the entity set, determining that the entity verification of the triplet passes, and acquiring an attribute set of the entity contained in the triplet based on the knowledge graph; checking whether the attribute contained in the triplet is in the attribute set; if the attribute contained in the triplet is in the attribute set, determining that the attribute verification of the triplet passes, and acquiring an attribute value of the attribute contained in the triplet based on the knowledge graph; checking whether the attribute value contained in the triplet is equal to the acquired attribute value; and if the attribute value contained in the triplet is equal to the acquired attribute value, determining that the attribute value of the triplet passes the verification.
For example, if the triplet to be verified is (s, p, o), the verification of the triplet is divided into three parts: entity verification, attribute verification and attribute value verification. Specifically, the entity set of the entity contained in the preset knowledge graph is S, if S is in S, the entity verification passing of the triplet to be verified is determined, all attribute sets of the entity S are obtained from the knowledge graph and marked as P, if P is in P, the attribute verification passing of the triplet to be verified is determined, the attribute value corresponding to the entity S and the attribute P is obtained from the knowledge graph and marked as o s If o and o s The values of the two are equal to each other, then it is determined that the attribute value of the triplet to be verified is verified. Correspondingly, if S is not in S, determining that the entity verification of the triplet to be verified is not passed, namely that the authenticity verification of the triplet to be verified is not passed; if P is not in P, determining that the attribute verification of the triplet to be verified is not passed, namely that the authenticity verification of the triplet to be verified is not passed; if o and o s If the values are not equal, determining that the verification of the attribute values of the triples to be verified is not passed, i.e. the authenticity check for the triplet to be checked is not passed.
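The three-part verification of (s, p, o) walked through above can be sketched as a chain of lookups that stops at the first failure. The nested-dictionary representation of the knowledge graph, the function name `verify_triple` and the failure labels are assumptions made for illustration; the application does not prescribe a storage format.

```python
from typing import Dict, Optional, Tuple

Triple = Tuple[str, str, str]            # (s, p, o)
KnowledgeGraph = Dict[str, Dict[str, str]]  # assumed layout: entity -> {attribute: value}


def verify_triple(kg: KnowledgeGraph, triple: Triple) -> Tuple[bool, Optional[str]]:
    """Return (passed, failed_stage); failed_stage is None when all checks pass."""
    s, p, o = triple
    if s not in kg:            # entity verification: is s in the entity set S?
        return False, "entity"
    attributes = kg[s]         # attribute set P of entity s
    if p not in attributes:    # attribute verification: is p in P?
        return False, "attribute"
    o_s = attributes[p]        # attribute value stored for (s, p)
    if o != o_s:               # attribute value verification: does o equal o_s?
        return False, "attribute value"
    return True, None


kg = {"A": {"height": "174cm"}}
```

For instance, `verify_triple(kg, ("A", "height", "174cm"))` passes all three stages, while `("A", "height", "180cm")` fails at the attribute value stage. Returning the failed stage is an addition of this sketch; it is one way the reason for an unsuccessful generation could be reported.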
In some embodiments, the knowledge-graph-based text generation method further comprises at least one of: based on a pre-constructed knowledge graph, performing entity verification on the triples, wherein the entity verification comprises the following steps: checking whether the entity contained in the triplet is in the entity contained in the knowledge graph or not; if the entity contained in the triplet is in the entity contained in the knowledge graph, determining that the entity verification of the triplet passes; if the entity contained in the triplet is not in the entity contained in the knowledge graph, determining that the entity verification of the triplet is not passed; based on a pre-constructed knowledge graph, performing attribute verification on the triples, wherein the method comprises the following steps: checking whether the attribute contained in the triplet is in the attribute contained in the knowledge graph or not; if the attribute contained in the triplet is in the attribute contained in the knowledge graph, determining that the attribute verification of the triplet passes; if the attribute contained in the triplet is not in the attribute contained in the knowledge graph, determining that the attribute verification of the triplet is not passed; based on a pre-constructed knowledge graph, performing attribute value verification on the triples, wherein the method comprises the following steps: checking whether the attribute value contained in the triplet is in the attribute value contained in the knowledge graph or not; if the attribute value contained in the triplet is in the attribute value contained in the knowledge graph, determining that the attribute value of the triplet passes the verification; if the attribute value contained in the triplet is not in the attribute value contained in the knowledge graph, determining that the attribute value verification of the triplet is not passed.
For example, suppose the triplet to be verified is (s1, p1, o1), and the pre-built knowledge graph contains the entity set S1, the attribute set P1, and the attribute value set O1. Correspondingly, when checking whether the entity contained in the triplet is among the entities contained in the knowledge graph, if s1 is in S1, it is determined that the entity verification of the triplet to be verified passes; if s1 is not in S1, it is determined that the entity verification of the triplet to be verified does not pass. When checking whether the attribute contained in the triplet is among the attributes contained in the knowledge graph, if p1 is in P1, it is determined that the attribute verification of the triplet to be verified passes; if p1 is not in P1, it is determined that the attribute verification of the triplet to be verified does not pass. When checking whether the attribute value contained in the triplet is among the attribute values contained in the knowledge graph, if o1 is in O1, it is determined that the attribute value verification of the triplet to be verified passes; if o1 is not in O1, it is determined that the attribute value verification of the triplet to be verified does not pass.
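The independent membership checks against S1, P1 and O1 can be sketched as below. The `GraphSets` container, the function name and the sample data are hypothetical and serve only to illustrate the checks.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Tuple


@dataclass(frozen=True)
class GraphSets:
    """Hypothetical container for the sets S1, P1 and O1 of a knowledge graph."""
    entities: FrozenSet[str]
    attributes: FrozenSet[str]
    values: FrozenSet[str]


def membership_checks(sets: GraphSets, triple: Tuple[str, str, str]) -> Dict[str, bool]:
    """Check each element of (s1, p1, o1) against its own set, independently."""
    s1, p1, o1 = triple
    return {
        "entity": s1 in sets.entities,
        "attribute": p1 in sets.attributes,
        "attribute_value": o1 in sets.values,
    }


# Made-up sample data for illustration.
sets = GraphSets(
    entities=frozenset({"Car A"}),
    attributes=frozenset({"vehicle system version"}),
    values=frozenset({"OS 5.0"}),
)
result = membership_checks(sets, ("Car A", "top speed", "OS 5.0"))
```

Because each element is tested against a graph-wide set, these checks do not confirm that the attribute value belongs to that particular entity and attribute; the sequential check on (s, p, o) is the stricter of the two variants.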
Based on the foregoing embodiments, in some embodiments, the knowledge-graph-based text generation method further includes: if the authenticity verification is not passed, determining the number of times the target text corresponding to the text generation instruction has been generated; and if the number of times is smaller than a number threshold, returning to the step of generating the target text corresponding to the text generation instruction based on the pre-trained text generation model. Fig. 3 is another flow chart of a text generation method based on a knowledge graph according to an exemplary embodiment of the present application. As shown in Fig. 3, the text generation method based on the knowledge graph in the embodiment of the application includes the following steps:
S301, acquiring a text generation instruction.
S302, generating a target text based on the text generation model.
S303, extracting triples.
And performing triplet extraction on the target text to obtain a triplet corresponding to the target text.
S304, checking the triples.
Checking the authenticity of the triples, and if the verification is passed, executing S306 to output a target text; if the verification is not passed, the number of times of generating the target text corresponding to the text generation instruction is determined, and S305 is executed.
S305, judging whether the number of times of generating the target text is smaller than a number threshold.
If the number of times of generating the target text is smaller than the number threshold, executing S302; if the number of times the target text is generated is greater than or equal to the number of times threshold, S306 is performed.
S306, outputting a text generation result.
Correspondingly, when the verification is passed, the text generation result is that the text generation is successful, and the target text is output; when the verification is not passed, the text generation result is that the text generation has failed. In some embodiments, if the number of times is greater than or equal to the number threshold, prompt information is output, where the prompt information is used to indicate the reason why text generation was unsuccessful.
For example, the number threshold is preset to 3. A text generation instruction is obtained, and a first target text is generated based on the text generation model. Triplet extraction is performed on the first target text to obtain a first triplet, and the first triplet is verified. If the verification passes, the first target text is output; if it does not pass, the number of times the target text corresponding to the text generation instruction has been generated is determined to be 1. The method then returns to the step of generating the target text based on the text generation model and generates a second target text; triplet extraction is performed on the second target text to obtain a second triplet, and the second triplet is verified. If the verification passes, the second target text is output; if it does not pass, the number of times the target text has been generated is determined to be 2. The method returns once more to the step of generating the target text based on the text generation model and generates a third target text; triplet extraction is performed on the third target text to obtain a third triplet, and the third triplet is verified. If the verification passes, the third target text is output; if it does not pass, the number of times the target text has been generated is 3, which is equal to the number threshold, so prompt information is output. For example, if the third target text is "how to maintain and care for the Sofaro first-class cabin seats of the Zeekr 007", the corresponding prompt information may be "'Zeekr 007' is an entity that does not exist in reality, because 'Zeekr 007' is not among the entities of the preset knowledge graph".
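The generate, extract, verify and retry flow of steps S301 to S306 can be sketched as a bounded loop. The function name `generate_with_retries`, its callable parameters and the stub data are assumptions for illustration; the application does not prescribe these interfaces.

```python
from typing import Callable, Optional, Tuple

Triple = Tuple[str, str, str]


def generate_with_retries(
    instruction: str,
    generate: Callable[[str], str],
    extract: Callable[[str], Triple],
    verify: Callable[[Triple], Tuple[bool, str]],
    max_attempts: int = 3,
) -> Tuple[Optional[str], str, int]:
    """Return (target_text or None, prompt_information, generations_used)."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempts = 0   # number of failed generations so far
    reason = ""
    while attempts < max_attempts:            # S305: compare with the number threshold
        text = generate(instruction)          # S302: generate a target text
        triple = extract(text)                # S303: extract the triplet
        passed, reason = verify(triple)       # S304: authenticity verification
        if passed:
            return text, "", attempts + 1     # S306: output the target text
        attempts += 1
    # S306: threshold reached, output prompt information instead of a text.
    return None, f"text generation failed: {reason}", attempts


# Stub components (hypothetical): the first two generations fail, the third passes.
canned = iter(["bad text", "bad text", "good text"])
success = generate_with_retries(
    "describe the car",
    generate=lambda _instruction: next(canned),
    extract=lambda text: (text, "attribute", "value"),
    verify=lambda t: (t[0] == "good text", "entity not in knowledge graph"),
)

failure = generate_with_retries(
    "describe the car",
    generate=lambda _instruction: "bad text",
    extract=lambda text: (text, "attribute", "value"),
    verify=lambda t: (False, "entity not in knowledge graph"),
)
```

With a threshold of 3, the loop generates at most three target texts, matching the worked example: a third failure brings the count to 3, the comparison with the threshold fails, and prompt information is returned.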
In some embodiments, performing triplet extraction on the target text to obtain the triplet corresponding to the target text includes: inputting the target text into a pre-constructed triplet extraction model to perform triplet extraction, so as to obtain the triplet corresponding to the target text output by the triplet extraction model.
For example, if the target text is "the vehicle system version of the Zeekr 007 is os 5.0", the target text is input into the pre-built triplet extraction model to perform triplet extraction, so as to obtain the triplet ("Zeekr 007", "vehicle system version", "os 5.0") output by the triplet extraction model; that is, the entity is "Zeekr 007", the attribute is "vehicle system version", and the attribute value is "os 5.0".
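The input and output contract of triplet extraction can be illustrated with a toy rule-based stand-in. This is not the pre-built triplet extraction model of the application, which is presumably a trained model; the regular expression, the function name `extract_triple` and the sample sentence are assumptions that only handle sentences of the form "The <attribute> of <entity> is <value>".

```python
import re
from typing import Optional, Tuple

Triple = Tuple[str, str, str]

# Toy pattern: "The <attribute> of <entity> is <value>" with an optional final period.
_PATTERN = re.compile(
    r"^The (?P<attribute>.+) of (?P<entity>.+?) is (?P<value>.+?)\.?$"
)


def extract_triple(text: str) -> Optional[Triple]:
    """Return (entity, attribute, attribute value), or None if the text does not match."""
    match = _PATTERN.match(text.strip())
    if match is None:
        return None
    return match.group("entity"), match.group("attribute"), match.group("value")


# Made-up sample sentence for illustration.
triple = extract_triple("The vehicle system version of Car A is OS 5.0.")
no_triple = extract_triple("Hello there")
```

Any component with the same signature, such as a trained extraction model wrapped in a function, could replace this stand-in without changing the downstream verification step.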
In some embodiments, the text generation model is a language model or a sequence-to-sequence model.
In one implementation, the text generation model is a language model. Correspondingly, selecting a proper language model structure for model training to obtain a trained language model, and further, inputting a text generation instruction into the trained language model to obtain a target text corresponding to the text generation instruction.
In another implementation, the text generation model is a sequence-to-sequence model. Correspondingly, the text generation instruction is converted into an information sequence, and the information sequence is converted into a target text based on a sequence-to-sequence model.
In summary, the present application has at least the following advantages:
1. A target text corresponding to the text generation instruction is generated through a pre-trained text generation model; triplet extraction is performed on the target text to obtain the triplet corresponding to the target text; authenticity verification is performed on the triplet based on the knowledge graph; and the target text is output only when the authenticity verification is passed. The output target text is therefore consistent with the facts recorded in the knowledge graph, which improves the authenticity of the generated text.
2. An explicit method is adopted to perform authenticity verification on the triplet corresponding to the target text based on the pre-constructed knowledge graph, which effectively reduces the loss of knowledge from the knowledge graph and further improves the authenticity of the generated text.
The following are device embodiments of the present application, which may be used to perform method embodiments of the present application. For details not disclosed in the device embodiments of the present application, please refer to the method embodiments of the present application.
Fig. 4 is a schematic structural diagram of a text generating device based on a knowledge graph according to an exemplary embodiment of the present application. As shown in fig. 4, the text generating device 40 based on a knowledge graph includes an obtaining module 41, a generating module 42, an extracting module 43, a checking module 44, and an output module 45, wherein:
An obtaining module 41, configured to obtain a text generation instruction;
a generating module 42, configured to generate a target text corresponding to the text generating instruction based on the pre-trained text generating model;
the extraction module 43 is configured to perform triplet extraction on the target text to obtain a triplet corresponding to the target text, where the triplet includes an entity, an attribute, and an attribute value;
a verification module 44, configured to perform authenticity verification on the triplet based on a pre-constructed knowledge graph, where the knowledge graph includes relationships among entities, attributes, and attribute values;
and the output module 45 is used for outputting the target text when the authenticity verification is passed.
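The cooperation of modules 41 to 45 can be sketched as a simple composition of callables. The class name `TextGenerationDevice`, the field names and the stub components are hypothetical; the application describes the modules functionally and does not fix such an interface.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]


@dataclass
class TextGenerationDevice:
    """Hypothetical composition of the five modules of the device."""
    obtain: Callable[[], str]            # obtaining module 41
    generate: Callable[[str], str]       # generating module 42
    extract: Callable[[str], Triple]     # extraction module 43
    verify: Callable[[Triple], bool]     # verification module 44
    output: Callable[[str], None]        # output module 45

    def run(self) -> bool:
        """Run one pass; output the target text only if verification passes."""
        instruction = self.obtain()
        text = self.generate(instruction)
        triple = self.extract(text)
        if self.verify(triple):
            self.output(text)
            return True
        return False


# Stub modules with made-up data for illustration.
emitted: List[str] = []
known = ("Car A", "vehicle system version", "OS 5.0")
device = TextGenerationDevice(
    obtain=lambda: "describe Car A",
    generate=lambda _instruction: "The vehicle system version of Car A is OS 5.0.",
    extract=lambda _text: known,
    verify=lambda t: t == known,
    output=emitted.append,
)
passed = device.run()
```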
In one possible implementation, the verification module 44 may be specifically configured to: based on a pre-constructed knowledge graph, perform at least one of entity verification, attribute verification and attribute value verification on the triplet; if the at least one check passes, determine that the authenticity check of the triplet passes; if any check fails, determine that the authenticity check of the triplet fails.
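One way to read this aggregation rule is that a configurable subset of the three checks is performed and the authenticity check passes only if every performed check passes. The sketch below follows that reading, which is an interpretation rather than something the application states explicitly; the function name, the check registry and the sample sets are hypothetical.

```python
from typing import Callable, Dict, Iterable, Tuple

Triple = Tuple[str, str, str]
Check = Callable[[Triple], bool]


def authenticity_check(
    triple: Triple, checks: Dict[str, Check], enabled: Iterable[str]
) -> bool:
    """Pass only if every enabled check passes; at least one check must be enabled."""
    names = list(enabled)
    if not names:
        raise ValueError("at least one check must be enabled")
    return all(checks[name](triple) for name in names)


# Made-up sets for illustration.
entities = {"Car A"}
attributes = {"vehicle system version"}
values = {"OS 5.0"}
checks: Dict[str, Check] = {
    "entity": lambda t: t[0] in entities,
    "attribute": lambda t: t[1] in attributes,
    "attribute_value": lambda t: t[2] in values,
}

triple = ("Car A", "top speed", "OS 5.0")
entity_only = authenticity_check(triple, checks, ["entity"])
all_three = authenticity_check(
    triple, checks, ["entity", "attribute", "attribute_value"]
)
```

In this sample, the triplet passes when only entity verification is enabled and fails when all three checks are enabled, because "top speed" is not a known attribute.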
In one possible implementation, the verification module 44 may also be configured to perform at least one of the following. Based on a pre-constructed knowledge graph, performing entity verification on the triplet, which includes: checking whether the entity contained in the triplet is among the entities contained in the knowledge graph; if the entity contained in the triplet is among the entities contained in the knowledge graph, determining that the entity verification of the triplet passes; if the entity contained in the triplet is not among the entities contained in the knowledge graph, determining that the entity verification of the triplet does not pass. Based on the pre-constructed knowledge graph, performing attribute verification on the triplet, which includes: checking whether the attribute contained in the triplet is among the attributes contained in the knowledge graph; if the attribute contained in the triplet is among the attributes contained in the knowledge graph, determining that the attribute verification of the triplet passes; if the attribute contained in the triplet is not among the attributes contained in the knowledge graph, determining that the attribute verification of the triplet does not pass. Based on the pre-constructed knowledge graph, performing attribute value verification on the triplet, which includes: checking whether the attribute value contained in the triplet is among the attribute values contained in the knowledge graph; if the attribute value contained in the triplet is among the attribute values contained in the knowledge graph, determining that the attribute value verification of the triplet passes; if the attribute value contained in the triplet is not among the attribute values contained in the knowledge graph, determining that the attribute value verification of the triplet does not pass.
In one possible implementation, the verification module 44 may also be configured to: acquiring an entity set of the entities contained in the knowledge graph; checking whether the entity contained in the triplet is in the entity set; if the entity contained in the triplet is in the entity set, determining that the entity verification of the triplet passes, and acquiring an attribute set of the entity contained in the triplet based on the knowledge graph; checking whether the attribute contained in the triplet is in the attribute set; if the attribute contained in the triplet is in the attribute set, determining that the attribute verification of the triplet passes, and acquiring an attribute value of the attribute contained in the triplet based on the knowledge graph; checking whether the attribute value contained in the triplet is equal to the acquired attribute value; and if the attribute value contained in the triplet is equal to the acquired attribute value, determining that the attribute value of the triplet passes the verification.
In a possible implementation, the output module 45 may also be used to: if the authenticity verification is not passed, determine the number of times the target text corresponding to the text generation instruction has been generated; and if the number of times is smaller than the number threshold, return to the step of generating the target text corresponding to the text generation instruction based on the pre-trained text generation model.
In a possible implementation, the output module 45 may also be used to: if the number of times is greater than or equal to the number of times threshold, outputting prompt information, wherein the prompt information is used for prompting the reason of unsuccessful text generation.
In a possible implementation manner, the extraction module 43 may be specifically configured to input the target text into a pre-built triplet extraction model to perform triplet extraction, so as to obtain a triplet corresponding to the target text output by the triplet extraction model.
In a possible implementation manner, the text generation model is a language model or a sequence-to-sequence model.
The text generating device based on the knowledge graph provided by the embodiment of the application can execute the technical scheme shown in the embodiment of the method, and the implementation principle and the beneficial effects are similar, and the description is omitted.
It should be understood that the division of the above apparatus into modules is merely a division of logical functions; in an actual implementation, the modules may be fully or partially integrated into one physical entity or may be physically separate. These modules may all be implemented as software called by a processing element, or all in hardware; alternatively, some modules may be implemented as software called by a processing element and the others in hardware. For example, the processing module may be a separately arranged processing element, may be integrated into a chip of the above apparatus, or may be stored in a memory of the above apparatus in the form of program code, with the functions of the processing module being called and executed by a processing element of the above apparatus. The implementation of the other modules is similar. In addition, all or part of the modules may be integrated together or implemented independently. The processing element here may be an integrated circuit with signal processing capabilities. In implementation, each step of the above method or each of the above modules may be completed by an integrated logic circuit of hardware in the processor element or by instructions in the form of software.
For example, the above modules may be one or more integrated circuits configured to implement the above methods, such as one or more application specific integrated circuits (Application Specific Integrated Circuit, ASIC for short), one or more digital signal processors (Digital Signal Processor, DSP for short), or one or more field programmable gate arrays (Field Programmable Gate Array, FPGA for short). For another example, when one of the above modules is implemented in the form of program code scheduled by a processing element, the processing element may be a general-purpose processor, such as a central processing unit (Central Processing Unit, CPU for short) or another processor that can invoke the program code. For another example, the modules may be integrated together and implemented in the form of a system-on-a-chip (System-On-a-Chip, SOC for short).
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions in accordance with the embodiments of the present application are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired (e.g., coaxial cable, optical fiber, digital subscriber line (Digital Subscriber Line, DSL for short)) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device, such as a server or data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, hard disk, or magnetic tape), an optical medium (e.g., a digital video disc (Digital Video Disc, DVD for short)), or a semiconductor medium (e.g., a solid state disk (Solid State Disk, SSD for short)).
Fig. 5 is a schematic structural diagram of an electronic device according to an exemplary embodiment of the present application. As shown in fig. 5, the electronic device 50 of the present embodiment includes:
at least one processor 51; and a memory 52 communicatively coupled to the at least one processor;
wherein the memory 52 stores instructions executable by the at least one processor 51 to cause the electronic device to perform the method as described in any of the embodiments above.
Alternatively, the memory 52 may be separate or integrated with the processor 51.
The memory 52 may include a high-speed random access memory (Random Access Memory, simply referred to as RAM), and may further include a non-volatile memory (non-volatile memory), such as at least one magnetic disk memory.
The processor 51 may be a central processing unit (Central Processing Unit, CPU for short), or an application specific integrated circuit (Application Specific Integrated Circuit, ASIC for short), or one or more integrated circuits configured to implement embodiments of the present application. Specifically, when the text generation method based on the knowledge graph described in the foregoing method embodiment is implemented, the electronic device may be, for example, an electronic device having a processing function, such as a server.
Optionally, the electronic device may also include a communication interface 53. In a specific implementation, if the communication interface 53, the memory 52 and the processor 51 are implemented independently, the communication interface 53, the memory 52 and the processor 51 may be connected to each other through a bus and communicate with each other. The bus may be an industry standard architecture (Industry Standard Architecture, ISA) bus, a peripheral component interconnect (Peripheral Component Interconnect, PCI) bus, or an extended industry standard architecture (Extended Industry Standard Architecture, EISA) bus, among others. Buses may be divided into address buses, data buses, control buses, etc., but this does not mean that there is only one bus or one type of bus.
Alternatively, in a specific implementation, if the communication interface 53, the memory 52 and the processor 51 are implemented on a single chip, the communication interface 53, the memory 52 and the processor 51 may complete communication through internal interfaces.
The implementation principle and technical effects of the electronic device provided in this embodiment may be referred to the foregoing embodiments, and will not be described herein again.
The embodiment of the present application further provides a computer-readable storage medium in which computer-executable instructions are stored. When executed, the computer-executable instructions implement the method steps in the method embodiments described above; the specific implementation manner and technical effects are similar and are not repeated herein.
The computer readable storage medium may be implemented by any type or combination of volatile or nonvolatile Memory devices, such as static random access Memory (Static Random Access Memory, SRAM for short), electrically erasable programmable Read-Only Memory (Electrically Erasable Programmable Read Only Memory, EEPROM for short), erasable programmable Read-Only Memory (Erasable Programmable Read Only Memory, EPROM for short), programmable Read-Only Memory (Programmable Read Only Memory, PROM for short), read Only Memory (ROM for short), magnetic Memory, flash Memory, magnetic disk, or optical disk. A readable storage medium can be any available medium that can be accessed by a general purpose or special purpose computer.
An exemplary readable storage medium is coupled to the processor such that the processor can read information from, and write information to, the readable storage medium. Alternatively, the readable storage medium may be integral to the processor. The processor and the readable storage medium may reside in an application specific integrated circuit. Of course, the processor and the readable storage medium may also reside as discrete components in a knowledge-graph-based text generation apparatus.
The embodiment of the present application further provides a computer program product, which includes a computer program, and when the computer program is executed, performs the method steps in the embodiment of the method, and the specific implementation manner and the technical effect are similar, and are not repeated herein.
Other embodiments of the present application will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It is to be understood that the present application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims (10)

1. The text generation method based on the knowledge graph is characterized by comprising the following steps of:
acquiring a text generation instruction;
generating a target text corresponding to the text generation instruction based on a pre-trained text generation model;
performing triplet extraction on the target text to obtain a triplet corresponding to the target text, wherein the triplet comprises an entity, an attribute and an attribute value;
carrying out authenticity verification on the triples based on a pre-constructed knowledge graph, wherein the knowledge graph comprises a relation among an entity, an attribute and an attribute value;
and if the authenticity verification is passed, outputting the target text.
2. The knowledge-graph-based text generation method according to claim 1, wherein the verifying the authenticity of the triplet based on the pre-constructed knowledge graph comprises:
based on the pre-constructed knowledge graph, performing at least one of entity verification, attribute verification and attribute value verification on the triplet;
if the at least one check passes, determining that the authenticity check of the triplet passes;
if any check fails, determining that the authenticity check of the triplet fails.
3. The knowledge-graph-based text generation method of claim 2, further comprising at least one of:
the performing entity verification on the triplet based on the pre-constructed knowledge graph comprises: checking whether the entity contained in the triplet is among the entities contained in the knowledge graph; if the entity contained in the triplet is among the entities contained in the knowledge graph, determining that the entity verification of the triplet passes; if the entity contained in the triplet is not among the entities contained in the knowledge graph, determining that the entity verification of the triplet is not passed;
the performing attribute verification on the triplet based on the pre-constructed knowledge graph comprises: checking whether the attribute contained in the triplet is among the attributes contained in the knowledge graph; if the attribute contained in the triplet is among the attributes contained in the knowledge graph, determining that the attribute verification of the triplet passes; if the attribute contained in the triplet is not among the attributes contained in the knowledge graph, determining that the attribute verification of the triplet is not passed;
the performing attribute value verification on the triplet based on the pre-constructed knowledge graph comprises: checking whether the attribute value contained in the triplet is among the attribute values contained in the knowledge graph; if the attribute value contained in the triplet is among the attribute values contained in the knowledge graph, determining that the attribute value verification of the triplet passes; and if the attribute value contained in the triplet is not among the attribute values contained in the knowledge graph, determining that the attribute value verification of the triplet is not passed.
4. The knowledge-graph-based text generation method according to claim 2, wherein performing entity verification, attribute verification and attribute value verification on the triplet based on the pre-constructed knowledge graph comprises:
acquiring an entity set of the entities contained in the knowledge graph;
checking whether an entity contained in the triplet is in the entity set;
if the entity contained in the triplet is in the entity set, determining that the entity verification of the triplet passes, and acquiring an attribute set of the entity contained in the triplet based on the knowledge graph;
checking whether the attribute contained in the triplet is in the attribute set;
if the attribute contained in the triplet is in the attribute set, determining that the attribute verification of the triplet passes, and acquiring an attribute value of the attribute contained in the triplet based on the knowledge graph;
checking whether the attribute value contained in the triplet is equal to the acquired attribute value;
and if the attribute value contained in the triplet is equal to the acquired attribute value, determining that the attribute value of the triplet passes the verification.
5. The knowledge-graph-based text generation method according to any one of claims 1 to 4, further comprising:
if the authenticity verification is not passed, determining the number of times the target text corresponding to the text generation instruction has been generated;
and if the number of times is smaller than a number threshold, returning to the step of generating the target text corresponding to the text generation instruction based on the pre-trained text generation model.
6. The knowledge-graph-based text generation method of claim 5, further comprising:
if the number of times is greater than or equal to the number threshold, outputting prompt information, wherein the prompt information is used for indicating the reason why text generation was unsuccessful.
7. The knowledge-graph-based text generation method according to any one of claims 1 to 4, further comprising:
performing triplet extraction on the target text to obtain a triplet corresponding to the target text, including: inputting the target text into a pre-constructed triplet extraction model to perform triplet extraction, so as to obtain a triplet corresponding to the target text output by the triplet extraction model;
and/or the text generation model is a language model or a sequence-to-sequence model.
8. A knowledge-graph-based text generation device, comprising:
the acquisition module is used for acquiring the text generation instruction;
the generation module is used for generating a target text corresponding to the text generation instruction based on a pre-trained text generation model;
the extraction module is used for extracting the triples of the target text to obtain triples corresponding to the target text, wherein the triples comprise entities, attributes and attribute values;
the verification module is used for carrying out authenticity verification on the triples based on a pre-established knowledge graph, wherein the knowledge graph comprises a relation among an entity, an attribute and an attribute value;
and the output module is used for outputting the target text when the authenticity verification passes.
9. An electronic device, comprising: a processor, and a memory communicatively coupled to the processor;
the memory is used for storing computer execution instructions;
the processor is configured to execute the computer-executable instructions to implement the knowledge-graph-based text generation method of any one of claims 1 to 7.
10. A computer-readable storage medium, wherein computer-executable instructions are stored in the computer-readable storage medium, which when executed are configured to implement the knowledge-graph based text generation method of any one of claims 1 to 7.
CN202311410600.0A 2023-10-25 2023-10-25 Text generation method, device, equipment and medium based on knowledge graph Pending CN117349449A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311410600.0A CN117349449A (en) 2023-10-25 2023-10-25 Text generation method, device, equipment and medium based on knowledge graph

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311410600.0A CN117349449A (en) 2023-10-25 2023-10-25 Text generation method, device, equipment and medium based on knowledge graph

Publications (1)

Publication Number Publication Date
CN117349449A true CN117349449A (en) 2024-01-05

Family

ID=89361143

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311410600.0A Pending CN117349449A (en) 2023-10-25 2023-10-25 Text generation method, device, equipment and medium based on knowledge graph

Country Status (1)

Country Link
CN (1) CN117349449A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination