CN115510205A - Question generation method, system and storage medium

Info

Publication number: CN115510205A
Application number: CN202211212765.2A
Authority: CN
Inventors: 张士杰
Original assignee: Pacific Insurance Technology Co Ltd
Current assignee: Pacific Insurance Technology Co Ltd
Priority date: 2022-09-30
Filing date: 2022-09-30
Publication date: 2022-12-23

Abstract

The application discloses a question generation method, a question generation system, and a storage medium. The method generates a question from knowledge information using a question generation model. A discrimination system evaluates the generated question to obtain a discrimination result and a question-knowledge information pair, and computes a target reward from the discrimination result. The discrimination system then returns the question-knowledge information pair and the target reward to the question generation model. Among the returned pairs, those whose target reward meets a preset requirement are used as training data; the question generation model is trained on this data by reinforcement learning and updated, and the updated model generates new questions. In the embodiments of the application, question generation is thereby realized.

Description

Question generation method, system and storage medium

Technical Field

The present application relates to the field of computer applications, and in particular, to a method, a system, and a storage medium for generating a question.

Background

Question generation has received increasing attention owing to the rapid development of Machine Reading Comprehension (MRC) and intelligent question-answering systems. Question generation is an important subtask of text generation that aims to generate fluent, natural questions relevant to the input data (text, knowledge bases, images, and the like). A question generation model can supply additional training data for a question-answering system, reducing manual labeling cost and improving system performance.

In the training stage of a question-answering system, richer and higher-quality training data yields better results, but such data is generally hard to obtain, and effective data is scarce.

Disclosure of Invention

In view of this, embodiments of the present application provide a question generation method, system, and storage medium for generating questions that serve as effective training data for training a question-answering system model.

In a first aspect, an embodiment of the present application provides a method for generating a question, where the method includes:

generating a question by using a question generation model based on the knowledge information;

obtaining, through a discrimination system, a discrimination result and a question-knowledge information pair based on the question generated by the question generation model;

computing, by the discrimination system, a target reward from the discrimination result;

returning, by the discrimination system, the question-knowledge information pair and the target reward to the question generation model;

using, as training data, those question-knowledge information pairs returned to the question generation model whose corresponding target reward meets a preset requirement;

training a question generation model in a reinforcement learning mode based on the training data, and updating the question generation model;

and generating a new question by using the updated question generation model.

Optionally, the discrimination system includes:

a question-answering system, a knowledge base information-question similarity model, and a grammar correctness judgment model;

obtaining the discrimination result through the discrimination system based on the question generated by the question generation model specifically includes:

the knowledge base information-question similarity model obtains a first reward by judging the similarity between the knowledge information input into the question generation model and the question generated by the question generation model;

the grammar correctness judgment model obtains a second reward by judging whether the grammar of the input question is correct;

the question-answering system queries the answer corresponding to the question generated by the question generation model, and obtains a third reward by judging whether the answer matches the knowledge information;

computing the target reward from the discrimination result by the discrimination system specifically includes:

obtaining the target reward according to the first reward, the second reward, and the third reward.

Optionally, obtaining the target reward according to the first reward, the second reward, and the third reward includes:

calculating a weighted average of the first reward, the second reward, and the third reward, and taking the weighted average as the target reward.

Optionally, the knowledge information is stored in a knowledge base of the question-answering system, and the knowledge information is a group of information that has a head entity and a tail entity with a definite relation between them.

Optionally, the question includes:

a question whose answer matches the knowledge information, or a question whose answer does not match the knowledge information.

Optionally, the question-knowledge information pair meeting the preset requirement includes:

a pair whose corresponding target reward is the maximum value in the value range of the target reward.

In a second aspect, an embodiment of the present application provides a question generation system, where the system includes:

a question generation model and a discrimination system, wherein the discrimination system comprises a question-answering system and a knowledge base;

the question generation model is used for generating a question based on knowledge information;

the question-answering system belongs to the discrimination system and is used for acquiring the question from the question generation model;

the discrimination system is used for obtaining a discrimination result based on the question generated by the question generation model; computing a target reward from the discrimination result and generating a question-knowledge information pair; returning the question-knowledge information pair and the target reward to the question generation model; and using, as training data, those returned question-knowledge information pairs whose corresponding target reward meets a preset requirement;

the question generation model is further used for being trained and updated by reinforcement learning based on the training data, and for generating a new question after being updated.

Optionally, the discrimination system further includes:

a knowledge base information-question similarity model and a grammar correctness judgment model;

the knowledge base information-question similarity model is used for obtaining a first reward by judging the similarity between the knowledge information input into the question generation model and the question generated by the question generation model;

the grammar correctness judgment model is used for obtaining a second reward by judging whether the grammar of the input question is correct;

the question-answering system is used for querying the answer corresponding to the question generated by the question generation model and obtaining a third reward by judging whether the answer matches the knowledge information;

the discrimination system is specifically configured to obtain the target reward according to the first reward, the second reward, and the third reward.

Optionally, the discrimination system is specifically configured to:

calculate a weighted average of the first reward, the second reward, and the third reward, and take the weighted average as the target reward.

In a third aspect, an embodiment of the present application provides a computer storage medium, where codes are stored in the computer storage medium, and when the codes are executed, an apparatus that runs the codes implements the method for generating a question that is described in any one of the foregoing implementation manners of the first aspect.

When the method is executed, complete knowledge information is first acquired from the knowledge base of a question-answering system and input, as the answer, into the question generation model to generate a question. The question is input into the discrimination system, which queries an answer and compares it with the knowledge information; the discrimination system outputs a reward according to the discrimination result and generates a question-knowledge information pair. The question-knowledge information pair and the reward are then returned to the question generation model as training data, and the question generation model is updated by reinforcement learning so that it continuously generates new questions. In this way, the question generation model is trained by reinforcement learning on the rewards and question-knowledge information pairs output during discrimination, and new questions and question-knowledge information pairs are continuously generated as training data for the question generation model. The question-answer pairs formed by the generated questions and the answers queried in the discrimination system can serve as effective data for training the question-answering system.

Drawings

In order to more clearly illustrate the technical solutions in the embodiments or the prior art, the drawings used in the embodiments or the prior art descriptions will be briefly described below, and obviously, the drawings in the following description are only some embodiments of the present application, and other drawings can be obtained by those skilled in the art without creative efforts.

FIG. 1 is a flowchart of generating a question with the updated model;

FIG. 2 is a flowchart of determining whether the target reward meets the preset requirement;

FIG. 3 is a flowchart of discrimination within the question-answering system;

FIG. 4 is a schematic diagram of a question generation system;

FIG. 5 is a schematic diagram of the discrimination system generating the target reward and the question-knowledge information pair.

Detailed Description

In the prior art, when the models of a question-answering system are trained, richer and higher-quality training data yields better results, but such data is generally hard to obtain, and effective data is scarce. Moreover, after a question-answering system is deployed, its effectiveness is tested either on manually constructed data or on data collected after deployment. The former has low reliability because of its limited diversity and volume, and the latter generally cannot expose latent problems. A suitable method is therefore needed for locating faults in the question-answering system.

Based on this, the present application provides a question generation method, system, and storage medium. Questions are generated by a question generation model trained with reinforcement learning; the questions and their corresponding question-answer pairs are finally used as effective training data for a question-answering system, and the questions also serve as data for locating faults in the question-answering system. The specific method is as follows:

First, complete knowledge information is acquired from the knowledge base of the question-answering system and input, as the answer, into the question generation model to generate a question. The question is input into the discrimination system, which queries an answer and compares it with the knowledge information; the question and the knowledge information form a question-knowledge information pair, and the discrimination system outputs a reward according to the discrimination result. The question-knowledge information pair and the reward are returned to the question generation model as training data, and the question generation model is updated by reinforcement learning so that it continuously generates new questions. In this way, the question generation model is trained by reinforcement learning on the rewards and question-knowledge information pairs output during discrimination, and new questions and question-knowledge information pairs are continuously generated as training data for the question generation model. The question-answer pairs formed by the generated questions and the answers queried in the discrimination system can serve as effective data for training the question-answering system, and can also be used to locate faulty models within the question-answering system.

Referring to FIG. 1, the implementation of the present invention comprises the following steps:

Step 101: generating a question by using a question generation model based on the knowledge information.

Knowledge information is stored in the knowledge base of the question-answering system. It is a group of information that has a head entity and a tail entity with a definite relation between them, and can be represented as an entity-relation-entity triple. If two entities are connected by a single entity relation, the relation is called a one-hop relation; if the two entities are connected through at least two entity relations, that is, through at least one intermediate entity, the relation is called a multi-hop relation. The knowledge information used in the application may have a one-hop relation or a multi-hop relation.
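The triple structure described above can be sketched as follows. This is a minimal illustration only: the names `Triple` and `hop_count`, and the multi-hop example entities, are hypothetical and do not come from the application, and the multi-hop case is read here as a chain of triples in which each tail entity is the head of the next.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Triple:
    """One piece of knowledge information: head entity, relation, tail entity."""
    head: str
    relation: str
    tail: str


def hop_count(path: List[Triple]) -> int:
    """Return the number of relations in a chain of triples.

    Raises ValueError if the chain is empty or if consecutive triples do not
    connect (the tail of one triple must be the head of the next).
    """
    if not path:
        raise ValueError("path must contain at least one triple")
    for prev, nxt in zip(path, path[1:]):
        if prev.tail != nxt.head:
            raise ValueError(f"{prev} does not connect to {nxt}")
    return len(path)


# One-hop knowledge, taken from the document's running example.
one_hop = [Triple("student", "height", "1.7m")]

# Hypothetical multi-hop knowledge (illustrative only, not from the source).
multi_hop = [
    Triple("student", "teacher", "Ms. Li"),
    Triple("Ms. Li", "subject", "mathematics"),
]
```

Under this reading, `hop_count(one_hop)` is 1 and `hop_count(multi_hop)` is 2, with "Ms. Li" acting as the intermediate entity.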

The Knowledge Base (KB) is a special database for knowledge management, used for collecting, organizing, and extracting knowledge in related fields. Knowledge in the knowledge base is derived from domain experts and is the collection of domain knowledge needed to solve problems, including basic facts, rules, and other relevant information. A knowledge base is represented as an object model, commonly referred to as an ontology, containing classes, subclasses, and entities. A knowledge-based system comprises a body of knowledge representing facts about the objective world and an inference engine, which deduces new facts by means of rules and logic. Common knowledge bases include Freebase, DBpedia, and the like. Since the trained question generation model is ultimately used to generate training samples for the question-answering system, the knowledge base relied upon here should correspond to the knowledge base used by the final question-answering system. Depending on the application scenario and subject matter, a person skilled in the art may select the knowledge base independently and design the data structure and reading mode it adopts, which are not specifically limited in this specification. The present application is mainly described with a knowledge base-based question-answering system, but it can also be applied to other types of question-answering systems.

The question generation model is a machine learning model that outputs questions corresponding to the input knowledge information. Specifically, a neural network for question generation may be adopted, such as Seq2Seq with attention, Transformer, or large-scale pre-trained models such as BERT and RoBERTa. A person skilled in the art may choose according to the requirements of the specific application, which is not limited herein.

Step 102: obtaining a discrimination result and a question-knowledge information pair through the discrimination system based on the question generated by the question generation model.

The question generation model generates a question from the knowledge information in two specific ways: structural adjustment and content addition. Structural adjustment changes the structure of a question. For example, the original question "Is the student's height 1.7m?" may be converted by structural adjustment into "1.7m is what of the student?", "Is height 1.7m?", "Whose height is 1.7m?", and so on. Content addition adds interrogative words, connecting words, and some interfering words to the knowledge information. Interfering words are words that bear some relation to the knowledge information; they may be provided by a pre-trained model within the question generation model. The pre-trained model may be BERT, RoBERTa, or the like, and the question generation model may be a deep learning-based model such as Seq2Seq or Transformer. The specific types of the question generation model and the pre-trained model may be set by a person skilled in the art according to the application scenario and are not limited herein. Questions generated by the question generation model may be, for example: "Whose height is 1.7m?", "What is 1.7m for the student?", "How much is the height of the student?", "Is the student a boy, and what is the height?", and the like.

The discrimination system mainly comprises three parts: a knowledge base information-question similarity model, a grammar correctness judgment model, and a question-answering system. These three modules are the main ones used in the application; other models, such as a semantic repetition model, may be added to the discrimination system to judge the question without negatively affecting the application, which is not limited herein.

The knowledge base information-question similarity model is mainly used for judging the similarity between the knowledge information input into the question generation model and the question generated by the question generation model. For example, if the input knowledge information is student-height-1.7m, the generated questions may be "What is the height of the student?", "What is the student's weight?", and "What is the student's height?". When these three questions are input into the knowledge base information-question similarity model and compared with the knowledge information, their similarity, and hence their reward, ranks from high to low as: "What is the height of the student?", "What is the student's height?", "What is the student's weight?".

The grammar correctness judgment model is mainly used for judging whether the grammar of the input question is correct. For example, if the input knowledge information is student-height-1.7m, the generated questions may be "Who 1.7m is height?", "Is height 1.7m?", and "Whose height is 1.7m?". When these three questions are input into the grammar correctness judgment model, "Who 1.7m is height?" and "Is height 1.7m?" are judged grammatically wrong and receive a low reward, while "Whose height is 1.7m?" is judged grammatically correct and receives a high reward.

The question-answering system is used for querying the answer corresponding to the question generated by the question generation model and judging whether the answer matches the knowledge information. For example, if the input knowledge information is student-height-1.7m, the question generated by the question generation model may be "The student's height?". The answer queried in the question-answering system may be "1.7m" or "50kg"; "1.7m" is judged to match the knowledge information, and "50kg" is judged not to match.

The discrimination process of the question-answering system is as follows: the answer corresponding to the question generated by the question generation model is queried, and it is judged whether the answer matches the knowledge information. Two cases may arise during matching. In the first case, the knowledge base of the question-answering system contains two or more answers with similar meanings that can answer the generated question. In the second case, the knowledge base contains only one answer that can answer the generated question. In the second case, the answer can be directly judged to match the knowledge information. In the first case, the answer may be matched against the knowledge information in either of two ways. The first way is to record terms with similar meanings when the knowledge graph is constructed, build a fixed vocabulary from the mapping relations, and look answers up in that vocabulary at query time. The second way is to set up a word similarity model in the question-answering system to judge the similarity between the queried answer and the knowledge information; when the similarity meets a threshold, the answer is considered to match the knowledge information. The threshold may be set by a person skilled in the art according to the application scenario and actual situation and is not limited herein.
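The two matching approaches described above might be sketched as below. The vocabulary entries, the function names, and the threshold value 0.8 are illustrative assumptions, and `difflib.SequenceMatcher` is only a character-level stand-in for whatever word similarity model an implementation would actually use.

```python
from difflib import SequenceMatcher
from typing import Dict, Set

# Hypothetical fixed vocabulary of terms with similar meanings
# (illustrative entries, not from the source).
SYNONYMS: Dict[str, Set[str]] = {
    "1.7m": {"1.7 meters", "170cm"},
}


def matches_by_vocabulary(answer: str, knowledge_value: str) -> bool:
    """First approach: exact match, or lookup in the fixed synonym vocabulary."""
    if answer == knowledge_value:
        return True
    return answer in SYNONYMS.get(knowledge_value, set())


def matches_by_similarity(answer: str, knowledge_value: str,
                          threshold: float = 0.8) -> bool:
    """Second approach: match when the similarity score reaches the threshold.

    SequenceMatcher.ratio() returns a value in [0, 1]; it is used here only
    as a placeholder for a real word similarity model.
    """
    score = SequenceMatcher(None, answer.lower(), knowledge_value.lower()).ratio()
    return score >= threshold
```

With these assumptions, "170cm" matches "1.7m" through the vocabulary, while "50kg" matches under neither approach.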

Existing question-answering systems can be broadly classified into the following three types according to the data source of the answers and the way questions are answered.

A question-answering system based on Web information retrieval (Web Question Answering, WebQA) is supported by a search engine: after understanding and analyzing the intent of the user's question, it uses the search engine to search the whole Web for relevant answers and feeds them back to the user. Typical systems are the early Ask Jeeves and AnswerBus question-answering systems.

A Knowledge Base-based Question Answering (KBQA) system combines existing knowledge base or database resources (such as Freebase, DBpedia, YAGO, Zhishi.me, and the like) with information from unstructured text such as Wikipedia and other encyclopedias, extracts valuable information by information extraction methods, constructs a knowledge graph as the background support of the question-answering system, and, combined with methods such as knowledge reasoning, provides the user with answers that reflect deeper semantic understanding.

A Community Question Answering (CQA) system, also called a social media-based question-answering system, relies on answers that are mostly provided by internet users; the system retrieves questions in social media that are semantically similar to the user's question and returns their answers to the user.

The question generation method of the application is designed around a knowledge base-based question-answering system. The generated data, such as the questions, can be applied to various question-answering systems to provide training samples for them, and the method may also be built around other types of question-answering systems, which is not limited herein.

The question-answering system in the application may comprise core modules such as question type discrimination, named entity recognition, entity linking, knowledge query, and answer ranking. For example, if the question-answering system has all five of these modules, it obtains an answer mainly as follows: the question type discrimination module extracts the type features of the question; the named entity recognition module extracts the mention in the question; the entity linking module maps the mention to an entity in the knowledge base; and the knowledge query module queries the knowledge base for that entity to obtain a knowledge base subgraph centered on the entity node, then extracts the corresponding nodes or edges from the subgraph according to rules or templates. Feature vectors representing the question and the candidate knowledge information are then obtained and ranked by the answer ranking module to produce the final answer, and the question-answering system matches this answer against the knowledge information.
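The answer-retrieval flow above can be illustrated with a deliberately simplified toy pipeline. Everything in it is an assumption made for illustration: the in-memory `KB` (including the student-weight-50kg triple), the function names, and the substring and keyword heuristics that stand in for trained recognition, linking, and ranking models. Question type discrimination is omitted for brevity.

```python
from typing import List, Optional, Tuple

KnowledgeTriple = Tuple[str, str, str]  # (head entity, relation, tail entity)

# Toy knowledge base; the second triple is a hypothetical addition.
KB: List[KnowledgeTriple] = [
    ("student", "height", "1.7m"),
    ("student", "weight", "50kg"),
]


def recognize_and_link(question: str) -> Optional[str]:
    """Stand-in for named entity recognition plus entity linking:
    return the first knowledge base entity mentioned in the question."""
    text = question.lower()
    for entity in sorted({head for head, _, _ in KB}):
        if entity in text:
            return entity
    return None


def query_subgraph(entity: str) -> List[KnowledgeTriple]:
    """Stand-in for knowledge query: all triples centered on the entity."""
    return [triple for triple in KB if triple[0] == entity]


def rank_answers(question: str, candidates: List[KnowledgeTriple]) -> Optional[str]:
    """Stand-in for answer ranking: pick the candidate whose relation
    name appears in the question."""
    text = question.lower()
    for _, relation, tail in candidates:
        if relation in text:
            return tail
    return None


def answer(question: str) -> Optional[str]:
    """Run the toy pipeline; return None when no answer is found."""
    entity = recognize_and_link(question)
    if entity is None:
        return None
    return rank_answers(question, query_subgraph(entity))
```

In this toy setting, `answer("What is the student's height?")` returns "1.7m", and a question that mentions no known entity returns `None`.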

Discrimination is carried out separately by the three parts, namely the knowledge base information-question similarity model, the grammar correctness judgment model, and the question-answering system, and each gives a reward within its value range according to its judgment; these rewards constitute the discrimination result. For example, the reward value range of the knowledge base information-question similarity model is [0,1]: the reward is 1 when the similarity between the knowledge information input into the question generation model and the generated question is highest, and 0 when the similarity is lowest. The reward value range of the grammar correctness judgment model is [0,1]: the reward is 1 if the grammar of the input question is correct and 0 if it is wrong. The question-answering system queries the answer corresponding to the generated question and judges whether the answer matches the knowledge information: the reward is 0 if the answer matches the knowledge information and 1 if it does not. These value ranges may be set by a person skilled in the art according to actual conditions and are not limited herein. The discrimination system also generates a question-knowledge information pair, in which the question is the one generated by the question generation model from knowledge information in the knowledge base of the question-answering system, and the knowledge information is the knowledge information corresponding to that question.
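As a sketch of how the three sub-rewards in this example could be assembled into a discrimination result, the following assumes the three modules are supplied as plain callables. The names `DiscriminationResult` and `discriminate` are hypothetical, exact string comparison stands in for the answer-matching step, and the reward values (including 0 for a matched answer and 1 for an unmatched one) follow the example values stated above.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

KnowledgeTriple = Tuple[str, str, str]  # (head entity, relation, tail entity)


@dataclass(frozen=True)
class DiscriminationResult:
    similarity_reward: float  # first reward, in [0, 1]
    grammar_reward: float     # second reward: 1.0 if grammatical, else 0.0
    qa_reward: float          # third reward: 0.0 if answer matches, else 1.0


def discriminate(
    question: str,
    knowledge: KnowledgeTriple,
    similarity_model: Callable[[KnowledgeTriple, str], float],
    grammar_model: Callable[[str], bool],
    qa_system: Callable[[str], str],
) -> DiscriminationResult:
    """Collect the three sub-rewards for one generated question."""
    # Clamp the similarity score into the stated value range [0, 1].
    similarity = min(1.0, max(0.0, similarity_model(knowledge, question)))
    grammar = 1.0 if grammar_model(question) else 0.0
    # Assumption: the answer matches when it equals the tail entity exactly.
    matched = qa_system(question) == knowledge[2]
    qa = 0.0 if matched else 1.0
    return DiscriminationResult(similarity, grammar, qa)
```

A call with stub models, for example `discriminate(q, ("student", "height", "1.7m"), lambda k, s: 0.9, lambda s: True, lambda s: "50kg")`, yields sub-rewards of 0.9, 1.0, and 1.0.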

Step 103: computing, by the discrimination system, the target reward from the discrimination result.

The target reward is the reward obtained by performing a series of operations on the discrimination result; the operations may be a mathematical calculation or a program algorithm. As a mathematical calculation, the target reward may be obtained as the arithmetic mean of the three sub-rewards in the discrimination result. In particular, the three sub-rewards may be given different weights according to the importance of the three modules and then summed to obtain the target reward. Any calculation that reflects the relationship among the sub-rewards and yields a target reward can be used in the application and is not limited herein. The target reward may also be obtained by a program algorithm: for example, a program implementing a certain operation rule is written and deployed in the discrimination system, and the discrimination system inputs the sub-rewards into the program to obtain the target reward. The programming language and software used for the program are not limited herein.
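A minimal sketch of the arithmetic-mean and weighted-average calculations described above is given below. The function name `target_reward` and the example weights are assumptions; the application leaves the actual weights to the implementer.

```python
from typing import Optional, Sequence


def target_reward(sub_rewards: Sequence[float],
                  weights: Optional[Sequence[float]] = None) -> float:
    """Weighted average of the sub-rewards.

    With weights=None every sub-reward has equal weight, which gives the
    arithmetic mean. Weights are normalised by their sum, so sub-rewards
    in [0, 1] always produce a target reward in [0, 1].
    """
    if not sub_rewards:
        raise ValueError("at least one sub-reward is required")
    if weights is None:
        weights = [1.0] * len(sub_rewards)
    if len(weights) != len(sub_rewards):
        raise ValueError("weights and sub_rewards must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(r * w for r, w in zip(sub_rewards, weights)) / total_weight


# Arithmetic mean of three sub-rewards (similarity, grammar, question answering).
mean_reward = target_reward([0.9, 1.0, 0.0])

# Hypothetical weighting that counts the similarity reward twice as much.
weighted_reward = target_reward([1.0, 0.0, 0.0], weights=[2.0, 1.0, 1.0])
```

Normalising by the sum of the weights is a design choice made here so that the target reward stays in the same [0,1] range as the sub-rewards, which keeps a preset requirement such as "equal to 1" meaningful.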

Step 104: the discrimination system returns the question-knowledge information pair and the target reward to the question generation model.

The question-knowledge information pair and the target reward may be combined into a question-knowledge information-reward tuple and returned to the question generation model in that form.

Step 105: using, as training data, those question-knowledge information pairs returned to the question generation model whose corresponding target reward meets the preset requirement.

The preset requirement may be a threshold on the target reward set by a person skilled in the art according to actual conditions. For example, suppose the discrimination system includes three modules, namely a knowledge base information-question similarity model, a grammar correctness judgment model, and a question-answering system; the reward value range of the similarity model is [0,1]; the reward value range of the grammar correctness judgment model is [0,1]; the question-answering system queries the answer corresponding to the generated question and judges whether it matches the knowledge information, with a reward of 0 if they match and 1 if they do not; and the target reward is computed as the arithmetic mean of the three sub-rewards. The value range of the target reward is then [0,1], and the preset requirement may be set to 1. That is, question-knowledge information pairs with a target reward of 1 meet the preset requirement and are used as training data; when the target reward is not 1, the preset requirement is not met, and the corresponding pair cannot be used as training data for updating the question generation model.

The question generation model has a reward mechanism. Question-knowledge information pairs with a large target reward are rewarded positively and serve as training data for updating the question generation model. Pairs with a small target reward are rewarded negatively, which can also be understood as a penalty; they are not used as training data for updating the model, but they still let the question generation model learn negative information about question generation, so that it avoids generating the questions in those pairs.
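The selection of training data in this example, where only pairs whose target reward equals 1 are kept, could look like the sketch below. `RewardedPair`, `select_training_data`, and the floating-point tolerance are hypothetical names and choices introduced for illustration.

```python
from typing import List, NamedTuple, Tuple


class RewardedPair(NamedTuple):
    """A question-knowledge information pair together with its target reward."""
    question: str
    knowledge: Tuple[str, str, str]  # (head entity, relation, tail entity)
    target_reward: float


def select_training_data(pairs: List[RewardedPair],
                         required_reward: float = 1.0,
                         tolerance: float = 1e-9) -> List[RewardedPair]:
    """Keep only the pairs whose target reward meets the preset requirement.

    The requirement here is equality with required_reward (1 in the example
    above); the tolerance absorbs floating-point rounding in the average.
    """
    return [p for p in pairs if abs(p.target_reward - required_reward) <= tolerance]


knowledge = ("student", "height", "1.7m")
returned_pairs = [
    RewardedPair("Whose height is 1.7m?", knowledge, 1.0),
    RewardedPair("Who 1.7m is height?", knowledge, 1.0 / 3.0),
]
training_data = select_training_data(returned_pairs)
```

Here `training_data` contains only the first pair; the reward values attached to the two questions are made up for the example.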

Step 106: training the question generation model by reinforcement learning based on the training data, and updating the question generation model.

The question generation model is trained by reinforcement learning so that it is updated and can generate new questions. In reinforcement learning, an agent interacts with an environment and is guided by the rewards it obtains, with the goal of maximizing the agent's reward. The agent learns by trial and error. The reinforcement signal provided by the environment is an evaluation (usually a scalar signal) of how good a generated action is, rather than an instruction telling the reinforcement learning system (RLS) how to generate the correct action. Since the external environment provides little information, the RLS must learn from its own experience. In this way, the RLS acquires knowledge in an action-evaluation environment and improves its action scheme to adapt to the environment. In the present application, the agent is the question generation model, and the reinforcement signal provided by the environment is the question-knowledge information pairs, among those returned to the question generation model, whose corresponding target reward meets the preset requirement.
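The application does not name a specific reinforcement learning algorithm. Purely as an illustration of the trial-and-error loop described above, the toy sketch below uses a REINFORCE-style policy-gradient update (an assumption, not the application's stated method) on a softmax policy that chooses among a fixed list of candidate questions. The function names, learning rate, step count, and reward values are all hypothetical, and a real question generation model would be a neural sequence model rather than a table of logits.

```python
import math
import random
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Convert logits into a probability distribution (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(logits: List[float], action: int, reward: float,
                   lr: float = 0.5) -> List[float]:
    """One REINFORCE update: theta_i += lr * reward * d log pi(action) / d theta_i.

    For a softmax policy the gradient is (1 if i == action else 0) - pi_i.
    """
    probs = softmax(logits)
    return [
        theta + lr * reward * ((1.0 if i == action else 0.0) - p)
        for i, (theta, p) in enumerate(zip(logits, probs))
    ]


def train(rewards: List[float], steps: int = 200, seed: int = 0) -> List[float]:
    """Sample a candidate question, observe its target reward, update the policy.

    rewards[i] is the (fixed, made-up) target reward the discrimination
    system would return for candidate question i.
    """
    rng = random.Random(seed)
    logits = [0.0] * len(rewards)
    for _ in range(steps):
        probs = softmax(logits)
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        logits = reinforce_step(logits, action, rewards[action])
    return softmax(logits)


# Candidate 0 always earns target reward 1; the others earn 0.
final_probs = train([1.0, 0.0, 0.0])
```

Because only sampled actions with a non-zero reward change the parameters in this toy setting, the probability of the rewarded candidate can only grow, which mirrors the idea that pairs meeting the preset requirement drive the update.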

Step 107: generating a new question by using the updated question generation model.

In this embodiment, complete knowledge information is obtained from the knowledge base of a question-answering system and input, as the answer, into the question generation model to generate a question. The question is input into the discrimination system, which obtains an answer and compares it with the knowledge information; the question and the knowledge information form a question knowledge information pair, and the discrimination system outputs a reward according to the discrimination result. The question knowledge information pair and the reward are returned to the question generation model as training data, and the question generation model is updated by reinforcement learning so that it continuously generates new questions.

The question-answering system may contain several models that run in sequence: a question intent model, a named entity recognition model, a query model, and a similarity model. The question intent model classifies the user's intent into categories such as relation query, attribute query, comparison, and judgment. Intent can be recognized by designing sentence templates and matching against them, or through entity linking and attribute matching; for example, if the entity and the attribute match directly, an attribute value or a relation name is returned. Intent can also be labeled by a graph computation method. In this embodiment, intent classification can be understood as being completed by representation learning on the input sentence. The named entity recognition model extracts entities from unstructured input text and can recognize additional entity classes according to business needs.

Named entities generally refer to entities in the text that have a specific meaning or strong referential value. They are commonly divided into three major classes (entity, time, and numeric) and seven minor classes (person name, place name, organization name, time, date, currency, and percentage). The query model searches the knowledge base data with a declarative query language such as SPARQL, Cypher, or PGQL to find the answers corresponding to the question. The similarity model performs similarity calculation on the information retrieved by the query model, finds the item with the highest similarity, and outputs it as the answer to the question. The question-answering system may also contain models other than the question intent model, the named entity recognition model, the query model, and the similarity model; those skilled in the art can configure the models according to the application scenario and actual conditions, which is not limited here.
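The last stage described above, in which the similarity model picks the highest-scoring item among the query results, can be sketched as follows. This is a minimal illustration under stated assumptions: the application does not specify a similarity measure, so token-level Jaccard overlap is used as a stand-in, and the in-memory list of triples replaces a real SPARQL, Cypher, or PGQL query. All names are hypothetical.

```python
import re
from typing import List, Tuple

# (head entity, relation, tail entity); a stand-in for knowledge base records.
Triple = Tuple[str, str, str]


def tokens(text: str) -> set:
    """Lowercase word tokens, ignoring punctuation."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity (an assumed, illustrative measure)."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def answer_question(question: str, candidates: List[Triple]) -> Tuple[str, float]:
    """Return the tail entity of the candidate most similar to the question."""
    scored = [
        (jaccard(question, f"{head} {relation}"), tail)
        for head, relation, tail in candidates
    ]
    best_score, best_tail = max(scored, key=lambda item: item[0])
    return best_tail, best_score


# Hypothetical query results for one question.
candidates = [
    ("student", "height", "1.7 m"),
    ("student", "age", "20"),
    ("teacher", "height", "1.8 m"),
]
answer, score = answer_question("What is the height of the student?", candidates)
```

A production system would likely replace the Jaccard measure with a learned similarity model; the sketch only shows the "score every candidate, return the best" structure.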

As an example, take a question-answering system that contains a question intent model, a named entity recognition model, a query model, and a similarity model. A question generated from knowledge information in the knowledge base of the question-answering system is input into the question-answering system, which searches the knowledge base and outputs an answer, but the answer does not match the knowledge information. In this case, the question is judged to be one that has an answer in the question-answering system but cannot be answered correctly. From the question and the answer returned by the question-answering system, a person skilled in the art can conclude that the question-answering system has a fault and can manually check the execution of each module in turn. First, check whether the question intent model judged the intent of the question correctly; if not, the question intent model is faulty, and if so, go on to check the named entity recognition model. Next, check whether the named entity recognition model extracted the entities correctly; if not, the named entity recognition model is faulty, and if so, go on to check the query model. Then check whether the information retrieved by the query model is correct; if not, the query statement used by the query model is wrong, and if so, go on to check the similarity model. Finally, check whether the similarity calculation performed by the similarity model is correct; if not, and once code-level errors such as input and output errors have been ruled out, the similarity model is faulty. This completes the troubleshooting.
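The stage-by-stage check described above amounts to returning the first stage whose output is wrong. The sketch below illustrates that order only; in the application the check is performed manually, and the dictionary of per-stage verdicts is a hypothetical input that a technician would fill in.

```python
from typing import Dict, Optional

# Stages in the order in which they are checked.
STAGES = (
    "question intent model",
    "named entity recognition model",
    "query model",
    "similarity model",
)


def locate_fault(stage_correct: Dict[str, bool]) -> Optional[str]:
    """Return the first stage judged incorrect, or None if all are correct."""
    for stage in STAGES:
        if not stage_correct[stage]:
            return stage
    return None


# Hypothetical verdicts for one question that was answered incorrectly.
observed = {
    "question intent model": True,
    "named entity recognition model": True,
    "query model": False,
    "similarity model": True,
}
faulty = locate_fault(observed)
```

Because the stages run in sequence, later stages are not inspected once an earlier one is found to be wrong; their outputs depend on the faulty input they received.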

FIG. 2 is a flowchart of the process of determining whether the target reward meets the preset requirement. As shown in FIG. 2, step 105 in the method provided in the foregoing embodiment may be implemented by the following steps:

Step 201: judging whether the target reward corresponding to a question knowledge information pair returned to the question generation model meets the preset requirement; if so, proceeding to step 202, and if not, proceeding to step 203;

If the target reward meets the preset requirement, the generated question knowledge information pair is qualified to serve as training data for the question generation model. Because the pair is qualified, questions generated by the model trained on it are of higher quality and earn a correspondingly higher reward.

Step 202: the target reward meets the preset requirement, and the corresponding question knowledge information pair is input into the question generation model;

Step 203: deleting the question knowledge information pair whose target reward does not meet the preset requirement, and the question generation model continues by generating a question for the next piece of knowledge information.
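Steps 201 to 203 can be sketched as a simple filter. The application does not define the preset requirement, so the sketch assumes it is a numeric threshold on the target reward; the class name, field names, threshold value, and sample rewards are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class QuestionKnowledgePair:
    question: str
    knowledge: Tuple[str, str, str]  # (head entity, relation, tail entity)
    target_reward: float


def meets_requirement(pair: QuestionKnowledgePair, threshold: float) -> bool:
    """Step 201: assumed form of the preset requirement (reward >= threshold)."""
    return pair.target_reward >= threshold


def select_training_data(
    pairs: List[QuestionKnowledgePair], threshold: float
) -> List[QuestionKnowledgePair]:
    """Keep qualified pairs (step 202) and drop the rest (step 203)."""
    training_data = []
    for pair in pairs:
        if meets_requirement(pair, threshold):
            training_data.append(pair)
        # Otherwise the pair is discarded and processing moves on to the next one.
    return training_data


pairs = [
    QuestionKnowledgePair(
        "What is the height of the student?", ("student", "height", "1.7 m"), 0.9
    ),
    QuestionKnowledgePair(
        "who 1.7m is height?", ("student", "height", "1.7 m"), 0.2
    ),
]
kept = select_training_data(pairs, threshold=0.5)
```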

Step 203 deletes the question knowledge information pair whose target reward does not meet the preset requirement. The reason is that, under reinforcement learning training, the question generation model generates a huge number of questions and therefore a huge number of question knowledge information pairs. If the pairs that do not meet the preset requirement were not deleted, system resources would be wasted and redundant data would accumulate, lengthening model training time. Step 203 also has the question generation model continue by generating a question for the next piece of knowledge information. The reason is that, if the knowledge information of the rejected pair were instead sent back to the question generation model so that it kept generating questions for that same knowledge information, the process could fall into an endless loop: the model might never generate a qualified question for certain knowledge information, which wastes resources. In addition, the question generation model needs diversified training so that training is effective across different knowledge information. A target reward that does not meet the preset requirement means that the question generated for the knowledge information is not qualified.

Alternatively, the knowledge information may be reused to generate questions within a set threshold number of times. That is, for a question knowledge information pair whose target reward does not meet the preset requirement, the knowledge information corresponding to the question is sent back to the question generation model, which continues to generate questions for it until the threshold number of times is reached. The threshold can be set by those skilled in the art according to the application scenario and actual conditions and is not limited here; it is set to avoid wasting computational resources.
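The bounded retry described above can be sketched as a loop with an attempt limit, which is what prevents the endless loop. This is an illustrative sketch: the `generate` and `score` callables stand in for the question generation model and the discrimination system, and the reward threshold and attempt limit are hypothetical values.

```python
from typing import Callable, Optional, Tuple

Knowledge = Tuple[str, str, str]


def generate_with_retries(
    knowledge: Knowledge,
    generate: Callable[[Knowledge, int], str],
    score: Callable[[str, Knowledge], float],
    reward_threshold: float,
    max_attempts: int,
) -> Optional[Tuple[str, float, int]]:
    """Retry question generation for one piece of knowledge information.

    Returns (question, reward, attempt) for the first qualified question,
    or None if no qualified question appears within max_attempts.
    """
    for attempt in range(1, max_attempts + 1):
        question = generate(knowledge, attempt)
        reward = score(question, knowledge)
        if reward >= reward_threshold:
            return question, reward, attempt
    return None  # give up and move on to the next knowledge information


# Hypothetical stand-ins: the third attempt is the first qualified one.
fake_rewards = {"q1": 0.2, "q2": 0.4, "q3": 0.9}


def fake_generate(knowledge: Knowledge, attempt: int) -> str:
    return f"q{attempt}"


def fake_score(question: str, knowledge: Knowledge) -> float:
    return fake_rewards.get(question, 0.0)


fact = ("student", "height", "1.7 m")
found = generate_with_retries(fact, fake_generate, fake_score, 0.8, max_attempts=5)
gave_up = generate_with_retries(fact, fake_generate, fake_score, 0.8, max_attempts=2)
```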

This embodiment further optimizes step 105 of the method provided in the foregoing embodiment. Deleting question knowledge information pairs whose target rewards do not meet the preset requirement reduces the waste of system resources and the data redundancy that would slow model training. Meanwhile, for question knowledge information pairs whose target rewards do not meet the preset requirement, the knowledge information corresponding to the question can be sent to the question generation model, which continues to generate questions according to that knowledge information, so that the question generation model is trained more thoroughly. It should also be noted that the executing entity of this embodiment may be a packaged determination module or a computer program; the specific implementation can be chosen by those skilled in the art as needed and is not limited here.

In the above solution, the discrimination system includes the grammar correctness judgment model, the knowledge base information-question similarity model, and the question-answering system. On this basis, a further optimization is provided: a grammar correctness judgment model and a knowledge base information-question similarity model can also be added inside the question-answering system to realize internal judgment within the question-answering system. FIG. 3 is a flowchart of the internal judgment of the question-answering system. As shown in FIG. 3, the judgment of a question within the question-answering system can be implemented by the following steps:

Step 301: inputting a question into the question-answering system;

Step 302: judging the question with the grammar correctness judgment model in the question-answering system;

Step 303: judging the question with the knowledge base information-question similarity model in the question-answering system;

Step 304: optimizing the question according to the judgment results.

The question is optimized based on the judgment result of the grammar correctness judgment model in the question-answering system, or based on the judgment result of the knowledge base information-question similarity model in the question-answering system. For example, suppose the knowledge information is "student-height-1.7 m" and the generated question is "who 1.7m is height?". The grammar correctness judgment model in the question-answering system judges the question and outputs a reward according to its result; here the result is that the grammar of the question is wrong, so the reward is low. The question-answering system obtains this result and uses it to decide whether the question must be input into a question correction model for correction. In this example, the question "who 1.7m is height?" must be input into the question correction model. Specific correction methods may include swapping word order, adding or removing words, and supplementing the question according to the knowledge information.

It should be noted that the grammar correctness judgment model and the knowledge base information-question similarity model added inside the question-answering system in this embodiment do not affect how the grammar correctness judgment model, the knowledge base information-question similarity model, and the question-answering system in the discrimination system judge a question. In the discrimination system, the grammar correctness judgment model judges whether the grammar of the input question is correct and outputs a corresponding reward; the knowledge base information-question similarity model judges the similarity between the knowledge information input into the question generation model and the question generated by the question generation model, and outputs a corresponding reward; and the question-answering system queries the answer corresponding to the question generated by the question generation model, judges whether the answer matches the knowledge information, and outputs a corresponding reward. Adding the two models inside the question-answering system therefore does not affect the question-answering system's judgment of questions input from the question generation model or the rewards it generates.

In this embodiment, questions that do not meet the requirements are optimized inside the question-answering system. The question-answering system can be trained on this internal judgment process so that it judges questions more accurately. As a result, judgment is more accurate when a question is input from the question generation model into the discrimination system, the reward output for the corresponding question knowledge information pair is more accurate, and training of the question generation model is more accurate and efficient. The model can then better generate questions that have a corresponding answer in the question-answering system but cannot be answered. Technicians can use such questions to locate where a fault occurs in the question-answering system, and the questions can also serve as effective training data for the sub-models of the question-answering system.

FIG. 4 is a schematic structural diagram of a question generation system. As shown in FIG. 4, the present invention further provides a question generation system, which includes:

the question generation model 100 and the discrimination system 300, wherein the discrimination system comprises a question answering system 200, and the question answering system 200 comprises a knowledge base 201;

the question generation model 100 is configured to generate a question based on knowledge information;

the question-answering system 200 is configured to obtain the question from the question generation model, and the question-answering system belongs to the discrimination system;

the discrimination system 300 is configured to obtain a discrimination result based on the question generated by the question generation model; operate on the discrimination result to obtain a target reward; generate a question knowledge information pair; and return the question knowledge information pair and the target reward to the question generation model; wherein, among the question knowledge information pairs returned to the question generation model, those whose corresponding target reward meets the preset requirement serve as training data;

the question generation model 100 is further configured to train and update in a reinforcement learning manner based on the training data; and generating a new question by using the updated question generation model.

The invention provides a question generation system in which the question generation model 100 generates a question based on knowledge information, and the question is input into the discrimination system 300 and discriminated to obtain a discrimination result. The discrimination system 300 operates on the discrimination result to obtain a target reward, generates a question knowledge information pair, and returns the question knowledge information pair and the target reward to the question generation model 100. Among the question knowledge information pairs returned to the question generation model 100, those whose target rewards meet the preset requirement are used as training data. The question generation model 100 is further configured to be trained and updated by reinforcement learning based on the training data, and a new question is generated with the updated question generation model 100.

In an alternative implementation, FIG. 5 is a schematic diagram of the discrimination system generating a target reward and a question knowledge information pair. As shown in FIG. 5, the discrimination system 300 includes a knowledge base information-question similarity model 302, a grammar correctness judgment model 301, and the question-answering system 200. Obtaining the discrimination result through the discrimination system 300 based on the question generated by the question generation model 100 specifically includes: the knowledge base information-question similarity model 302 obtains a first reward by judging the similarity between the knowledge information input into the question generation model and the question generated by the question generation model; the grammar correctness judgment model 301 obtains a second reward by judging whether the grammar of the input question is correct; and the question-answering system 200 queries the answer corresponding to the question generated by the question generation model and obtains a third reward by judging whether the answer matches the knowledge information. Obtaining the target reward by operating on the discrimination result specifically includes: computing a weighted average of the first reward, the second reward, and the third reward to obtain the target reward, after which the discrimination system generates the question knowledge information pair.
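The weighted average of the three rewards can be sketched as follows. The application does not state the weights or the reward scale, so the default equal weights and the sample reward values are hypothetical.

```python
from typing import Sequence


def target_reward(
    first: float,
    second: float,
    third: float,
    weights: Sequence[float] = (1.0, 1.0, 1.0),
) -> float:
    """Weighted average of the similarity, grammar, and answer-match rewards.

    first  : reward from the knowledge base information-question similarity model
    second : reward from the grammar correctness judgment model
    third  : reward from the question-answering system
    """
    if len(weights) != 3:
        raise ValueError("exactly three weights are required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    w1, w2, w3 = weights
    return (w1 * first + w2 * second + w3 * third) / total


# Hypothetical rewards: similar and grammatical, but answered incorrectly.
equal = target_reward(0.8, 1.0, 0.0)
answer_heavy = target_reward(0.8, 1.0, 0.0, weights=(1.0, 1.0, 2.0))
```

Giving the third reward a larger weight, as in the second call, makes the target reward more sensitive to whether the question-answering system answered correctly; which weighting suits a given deployment is a design choice the application leaves open.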

With this question generation system, questions can be generated, including potential questions that the question-answering system cannot answer. Those skilled in the art can use such questions to analyze and locate the causes of faults and to train the question-answering system.

The embodiment of the application also provides corresponding equipment and a computer readable storage medium, which are used for realizing the scheme provided by the embodiment of the application.

The device includes a memory and a processor, where the memory is used for storing instructions or codes, and the processor is used for executing the instructions or codes, so as to cause the device to execute a question generation method according to any embodiment of the present application.

In practice, the computer-readable storage medium may take any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present embodiment, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).

It is further noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional like elements in a process, method, article, or apparatus that comprises the element.

The above description is only one specific embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application should be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims

1. A method of question generation, the method comprising:

obtaining a discrimination result and a question knowledge information pair through a discrimination system based on the question generated by the question generation model;

training the question generation model in a reinforcement learning mode based on the training data, and updating the question generation model;

and generating a new question by using the updated question generation model.

2. The method of claim 1, wherein the discrimination system comprises:

a question-answering system, a knowledge base information-question similarity model, and a grammar correctness judgment model;

the question answering system inquires answers corresponding to the question generated by the question generation model, and obtains a third reward by judging whether the answers are matched with the knowledge information;

3. The method of claim 2, wherein obtaining a target reward according to the first reward, the second reward, and the third reward comprises:

4. The method of claim 1,

the knowledge information is stored in a knowledge base of the question answering system, and the knowledge information is a group of information which has head and tail entities and has definite relation among the entities.

5. The method of claim 1, wherein the question comprises:

a question whose answer matches the knowledge information, or a question whose answer does not match the knowledge information.

6. The method according to claim 1, wherein the question knowledge information pair meeting preset requirements comprises:

7. A system for question generation, comprising:

a question generation model and a discrimination system, wherein the discrimination system comprises a question-answering system, and the question-answering system comprises a knowledge base;

the question-answering system is used for acquiring the question from the question generation model and belongs to the discrimination system;

the discrimination system is used for obtaining a discrimination result based on the question generated by the question generation model, operating on the discrimination result to obtain a target reward, generating a question knowledge information pair, and returning the question knowledge information pair and the target reward to the question generation model; wherein, among the question knowledge information pairs returned to the question generation model, those whose corresponding target reward meets the preset requirement serve as training data;

8. The system of claim 7, wherein the discrimination system further comprises:

the discrimination system is specifically configured to:

9. The system of claim 8, wherein the discrimination system is specifically configured to:

10. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a question generation implementation program, which, when executed by a processor, implements the steps of the method according to any one of claims 1 to 6.