CN115687595A - Template-prompt-based contrastive explanation generation method for commonsense question answering - Google Patents

Template-prompt-based contrastive explanation generation method for commonsense question answering

Info

Publication number
CN115687595A
Authority
CN
China
Prior art keywords
question
knowledge
concept
candidate
candidate text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211430964.0A
Other languages
Chinese (zh)
Inventor
张寅
陈强龙
徐国海
严明
张佶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN202211430964.0A priority Critical patent/CN115687595A/en
Publication of CN115687595A publication Critical patent/CN115687595A/en
Pending legal-status Critical Current

Links

Images

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a template-prompt-based contrastive explanation generation method for commonsense question answering. The method comprises the following steps: 1) symbolic knowledge acquisition: identifying key concepts in the original input and retrieving relevant external knowledge from multiple knowledge sources; 2) contrastive explanation generation: feeding the retrieved symbolic knowledge and a constructed prompt template into a generative pre-trained model to produce the corresponding contrastive explanation text; 3) explanation-enhanced reasoning: using the generated contrastive explanation texts as context for the original question to strengthen commonsense question-answering inference. The invention is the first to propose a template-prompt-based contrastive explanation generation method for knowledge-enhanced commonsense question answering. By combining concept-centric external symbolic knowledge with the strengths of a generative pre-trained model and treating contrastive explanation text as a new knowledge type, it can greatly alleviate the problems of low knowledge distinguishability and small downstream-task gains in knowledge-enhanced commonsense question answering.

Description

Template-prompt-based contrastive explanation generation method for commonsense question answering
Technical Field
The invention belongs to the field of natural language processing, and in particular relates to the application of external knowledge retrieval and a contrastive explanation generator built on a generative pre-trained model to knowledge-driven commonsense question answering.
Background
In recent years, many knowledge-enhanced pre-trained language models have been proposed to improve performance on various NLP tasks. These methods inject knowledge into pre-trained language models (PLMs) and then fine-tune them for downstream tasks; commonsense question answering is a typical application scenario for pre-trained language models.
However, the implicit knowledge learned during pre-training can be outdated and is poorly suited to answering such questions. To overcome this problem, another line of research explicitly retrieves knowledge from different knowledge sources and integrates it into downstream question-answering models. Extracting external knowledge from a knowledge base, however, inevitably introduces irrelevant and even noisy knowledge, which compromises model performance. A pre-trained model can also be treated as a knowledge base from which relevant implicit knowledge is elicited, but such knowledge is typically generic and lacks specific, distinguishing information. Based on prior research, we conclude that high-quality knowledge incorporation should have the following characteristics: 1) diversity — it provides different types of knowledge; 2) relevance — the factual content should be related to the question; and 3) distinguishability — the knowledge should provide additional discriminating information that helps establish the connection between the question and the candidate answers. Knowledge acquired by the above methods makes little headway on distinguishability.
Therefore, in the field of commonsense question answering, how to obtain external knowledge with high distinguishability and improve the reasoning ability of pre-trained models on the commonsense question-answering task is a technical problem urgently awaiting a solution.
Disclosure of Invention
The invention aims to solve the problems in the prior art and provides a template-prompt-based contrastive explanation generation method for commonsense question answering.
The invention aims to provide a contrastive explanation generation method that improves the performance of existing pre-trained models on commonsense question answering. Based on relevant experience, its advantages are as follows: first, during contrastive explanation generation, the original question-candidate text is combined with relevant external symbolic knowledge, so the knowledge has higher relevance and diversity; second, the contrastive explanation focuses on the differential information between candidate concepts in the input text, giving it good distinguishability; in addition, as the final form in which knowledge is incorporated, the contrastive explanation satisfies all three characteristics of high-quality knowledge, is easier for humans to understand, and offers better interpretability.
The technical scheme adopted by the invention is as follows:
A template-prompt-based contrastive explanation generation method for commonsense question answering comprises the following steps:
S1: for a question-candidate text formed from a question and a candidate answer, a trained concept recognizer identifies the key concepts of the question-candidate text: it performs token-level sequence labeling on the question-candidate text and outputs a tag sequence indicating whether each token belongs to a concept or to the background, thereby extracting all concepts in the question-candidate text; the concept recognizer consists of an encoder and a CRF layer, where the encoder is a RoBERTa-large model and the CRF layer outputs the tag sequence after the question-candidate text has been encoded by the encoder;
S2: using all concepts extracted from the question-candidate text in step S1 as anchors, retrieving additional external symbolic knowledge from different knowledge bases, including a commonsense knowledge graph and a dictionary, and concatenating the retrieved external symbolic knowledge into concept-centric knowledge;
S3: taking the question-candidate text, all concepts extracted from it in step S1, and the concept-centric knowledge acquired in step S2 as input, and generating contrastive explanation knowledge related to the question and the candidate answers with a pre-trained generator; the generator is a template-prompt-based contrastive explanation generator obtained by fine-tuning the pre-trained generative language model BART-base;
S4: taking the contrastive explanation knowledge generated by the contrastive explanation generator in S3 together with the question-candidate text as input to a trained explanation-enhanced reasoning module, where the input is first encoded by a pre-trained encoder, namely the pre-trained model DeBERTaV3; the encoding result is then fed through a pooling layer and a multilayer perceptron, which output the answer to the question, realizing commonsense question-answering inference.
Preferably, in S1 the concept recognition task is treated as a token-level sequence labeling task, and the input to the concept recognizer is the sentence S = [CLS] Q [SEP] A [SEP], where [SEP] is the token separating the question Q and the candidate answer A. For an input sentence S, the sequence labeling task finds the set of concepts C = {c_1, c_2, …, c_n} in the sentence and marks them to form a tag sequence; in the resulting tag sequence, 1 and 0 indicate whether each token of the question-candidate text belongs to a concept or to the background, with 1 denoting a concept tag and 0 denoting a background tag.
Preferably, the training data sets of the concept recognizer include CommonGen, e-SNLI, and CSQA; each training data set contains the annotated concepts or labels of its instances. If an instance contains more than 3 identified concepts, the top 3 concepts are selected for subsequent use according to a score-ranking mechanism; otherwise all identified concepts are selected.
Preferably, the specific implementation steps of S2 are as follows:
S21: first, search the commonsense knowledge graph for a path from a question concept to a candidate concept. If exactly one path exists, select it directly and extract all triples on the path as the related knowledge from the commonsense knowledge graph; if more than one path exists, compare their lengths, select the shortest path, and extract all triples on it as the related knowledge; if no direct path exists between the question concept and the candidate concept but the graph contains triples related to the candidate concept, compute the score of each triple with a predefined scoring function and select the highest-scoring triple as the related knowledge. For any triple j, the score score_j computed by the scoring function is:
score_j = w_j · N_k / N
where w_j is the weight of triple j in the commonsense knowledge graph, N is the total number of triples related to the candidate concept in the commonsense knowledge graph, all N such triples are divided into several relation categories by clustering, and N_k is the number of triples contained in the relation category k to which triple j belongs;
S22: for each concept extracted from the question-candidate text in S1, selecting the closest-matching definition entry from the dictionary as the concept description;
S23: for each concept extracted from the question-candidate text, concatenating in series the triples obtained in step S21 and the concept description obtained in step S22 to form the concept-centric knowledge from external knowledge sources corresponding to that concept.
Preferably, in S22, when selecting the closest-matching definition entry from the dictionary as the concept description, if definition entries exist in multiple forms, the priority order for selection is: the original form of the concept itself > its spaCy lemma form > the base word.
Preferably, in S3 the template-prompt-based contrastive explanation generator uses the pre-trained generative language model BART-base fine-tuned on the ECQA, eQASC, and e-SNLI data sets. The generator considers only the original question and candidate options as input during fine-tuning, but at inference time enhances the model input with concepts and external symbolic knowledge; the inference-time model input consists of a task prefix prompt, the question from the question-candidate text, the candidate answers from the question-candidate text, all concepts extracted from the question-candidate text, the concept-centric knowledge acquired from the concepts in the question-candidate text, and a pre-constructed discrete prompt template.
Preferably, in S4, within the explanation-enhanced reasoning module, the pre-trained model DeBERTaV3 encodes the input contrastive explanation knowledge and the question-candidate text and finally outputs the hidden state of the [CLS] token as the representation; this representation passes sequentially through the pooling layer and the multilayer perceptron to output the answer to the question.
Preferably, the objective function used to train the explanation-enhanced reasoning module is defined as follows:
L = -(1/n) · Σ_{i=1}^{n} y_i · log(softmax(h_i))
where i denotes the i-th sample in the training data set, h_i denotes the hidden state output by the multilayer perceptron, y_i denotes the answer label of the i-th sample, and n denotes the total number of samples.
Preferably, the commonsense knowledge graph is ConceptNet.
Preferably, the dictionary is the Cambridge Dictionary.
Compared with the prior art, the invention is the first to propose a template-prompt-based contrastive explanation generation method for knowledge-enhanced commonsense question answering. By combining concept-centric external symbolic knowledge with the strengths of a generative pre-trained model and treating contrastive explanation text as a new knowledge type, the invention can greatly alleviate the problems of low knowledge distinguishability and small downstream-task gains in commonsense question answering.
Drawings
FIG. 1 is a schematic diagram of the steps of the template-prompt-based contrastive explanation generation method for commonsense question answering;
FIG. 2 is a block diagram of the method of the present invention.
Detailed Description
The invention will be further elucidated and described with reference to the drawings and specific embodiments.
As shown in FIG. 1, in a preferred embodiment of the present invention, a template-prompt-based contrastive explanation generation method is provided, comprising steps S1-S4:
S1: for a question-candidate text formed from a question and a candidate answer, a trained concept recognizer identifies the key concepts of the question-candidate text: it performs token-level sequence labeling on the question-candidate text and outputs a tag sequence indicating whether each token belongs to a concept or to the background, thereby extracting all concepts in the question-candidate text; the concept recognizer consists of an encoder and a CRF layer, where the encoder is a RoBERTa-large model and the CRF layer outputs the tag sequence after the question-candidate text has been encoded by the encoder.
In the embodiment of the present invention, in step S1 above, the concept recognition task is treated as a token-level sequence labeling task. The question-candidate text consists of a question Q and a candidate answer A, and the input to the concept recognizer is the sentence S = [CLS] Q [SEP] A [SEP], where [SEP] is the token separating Q and A. For an input sentence S, the sequence labeling task finds the set of concepts C = {c_1, c_2, …, c_n} in the sentence and marks them to form a tag sequence; in the resulting tag sequence, 1 and 0 indicate whether each token of the question-candidate text belongs to a concept or to the background, with 1 denoting a concept tag and 0 denoting a background tag.
In addition, in the embodiment of the present invention, limited by the size of the training corpus, several similar data sets were collected to train the concept recognizer; specifically, the training data sets include CommonGen, e-SNLI, and CSQA. Each training data set contains the annotated concepts or labels of its instances; if an instance contains more than 3 identified concepts, the top 3 concepts are selected for subsequent use according to a score-ranking mechanism, otherwise all identified concepts are selected.
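To make this component concrete, the following is a minimal sketch of such a concept recognizer assembled with the Hugging Face transformers library. It is a sketch under stated assumptions, not the exact training setup of the invention: a per-token argmax decode stands in for the CRF layer described above, and the top-3 truncation uses token order rather than the score-ranking mechanism.

```python
# Illustrative sketch of the S1 concept recognizer (argmax decoding is
# assumed here in place of the CRF layer described in the embodiment).
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("roberta-large")
# Two labels: 0 = background token, 1 = concept token.
model = AutoModelForTokenClassification.from_pretrained("roberta-large", num_labels=2)

def extract_concepts(question: str, answer: str, max_concepts: int = 3):
    # Encodes the pair as S = [CLS] Q [SEP] A [SEP] (RoBERTa's <s>/</s> variants).
    enc = tokenizer(question, answer, return_tensors="pt")
    with torch.no_grad():
        logits = model(**enc).logits                  # shape (1, seq_len, 2)
    tags = logits.argmax(dim=-1)[0].tolist()
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
    concepts, span = [], []
    for tok, tag in zip(tokens, tags):
        if tag == 1 and tok not in tokenizer.all_special_tokens:
            span.append(tok)                          # extend the current concept span
        elif span:
            concepts.append(tokenizer.convert_tokens_to_string(span).strip())
            span = []
    if span:
        concepts.append(tokenizer.convert_tokens_to_string(span).strip())
    # The embodiment keeps at most 3 concepts via score ranking; plain order here.
    return concepts[:max_concepts]
```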
S2: using all concepts extracted from the question-candidate text in step S1 as anchors, retrieving additional external symbolic knowledge from different knowledge bases, including a commonsense knowledge graph and a dictionary, and concatenating the retrieved external symbolic knowledge into concept-centric knowledge.
In an embodiment of the present invention, the step S2 is implemented as follows:
S21: first, search the commonsense knowledge graph for a path from a question concept to a candidate concept. If exactly one path exists, select it directly and extract all triples on the path as the related knowledge from the commonsense knowledge graph; if more than one path exists, compare their lengths, select the shortest path, and extract all triples on it as the related knowledge; if no direct path exists between the question concept and the candidate concept but the graph contains triples related to the candidate concept, compute the score of each triple with a predefined scoring function and select the highest-scoring triple as the related knowledge. For any triple j, the score score_j computed by the scoring function is:
score_j = w_j · N_k / N
where w_j is the weight of triple j in the commonsense knowledge graph, N is the total number of triples related to the candidate concept in the commonsense knowledge graph, all N such triples are divided into several relation categories by relation clustering, and N_k is the number of triples contained in the relation category k to which triple j belongs.
It should be noted that the triples related to the candidate concept in the commonsense knowledge graph are those triples that have the candidate concept as one of their nodes.
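A minimal sketch of this retrieval logic over a ConceptNet-style graph is given below, using networkx. The edge attributes ("rel", "weight") and the reconstructed form of the scoring function are assumptions made for illustration from the variable definitions above, not the invention's exact implementation.

```python
# Illustrative sketch of S21 retrieval over a ConceptNet-style graph.
import networkx as nx

def retrieve_triples(G: nx.Graph, q_concept: str, cand_concept: str):
    # Cases 1/2: a path exists -- take the shortest one and return its triples.
    if (G.has_node(q_concept) and G.has_node(cand_concept)
            and nx.has_path(G, q_concept, cand_concept)):
        path = nx.shortest_path(G, q_concept, cand_concept)
        return [(u, G[u][v].get("rel"), v) for u, v in zip(path, path[1:])]
    # Case 3: no direct path -- score every triple touching the candidate concept.
    if not G.has_node(cand_concept):
        return []
    triples = [(cand_concept, G[cand_concept][nbr].get("rel"), nbr,
                G[cand_concept][nbr].get("weight", 1.0))
               for nbr in G.neighbors(cand_concept)]
    if not triples:
        return []
    n_total = len(triples)
    by_rel = {}                      # cluster triples by relation category
    for t in triples:
        by_rel.setdefault(t[1], []).append(t)
    def score(t):
        # score_j = w_j * N_k / N  (reconstructed form, an assumption)
        return t[3] * len(by_rel[t[1]]) / n_total
    best = max(triples, key=score)
    return [best[:3]]
```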
S22: for each concept extracted from the question-candidate text in S1, the definition entry that most closely matches is selected from the dictionary as the concept description.
In an embodiment of the present invention, when selecting the closest-matching definition entry from the dictionary as the concept description, if definition entries exist in multiple forms, the priority order for selection as the concept description is: the original form of the concept itself > its spaCy lemma form > the base word. That is, if a definition entry exists for the concept's original form, it is used; if not, but an entry exists for the spaCy lemma form, the entry corresponding to the lemma form is selected; and if neither exists, the definition entry corresponding to the base word (the last word) is selected.
S23: for each concept extracted from the question-candidate text, concatenating in series the triples obtained in step S21 and the concept description obtained in step S22 to form the concept-centric knowledge from external knowledge sources corresponding to that concept. Specifically, the invention formats the concept-centric knowledge of the triples Triples and the concept description Definition as { Triples [SEP] Definition [SEP] }; this concept-centric knowledge is used for explanation generation and downstream reasoning.
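The following sketch illustrates the S22 lookup fallback and the S23 concatenation format. The in-memory dictionary mapping stands in for the Cambridge Dictionary, and the spaCy model name is an assumption; only the fallback order and the { Triples [SEP] Definition [SEP] } format come from the description above.

```python
# Illustrative sketch of S22 (definition lookup with fallback) and
# S23 (concept-centric knowledge concatenation).
import spacy

nlp = spacy.load("en_core_web_sm")  # assumes the small English model is installed

def describe(concept: str, dictionary: dict) -> str:
    if concept in dictionary:                  # 1) original form of the concept
        return dictionary[concept]
    lemma = " ".join(tok.lemma_ for tok in nlp(concept))
    if lemma in dictionary:                    # 2) spaCy lemma form
        return dictionary[lemma]
    base = concept.split()[-1]                 # 3) base word (the last word)
    return dictionary.get(base, "")

def concept_centric_knowledge(triples, definition: str) -> str:
    # Serial concatenation in the { Triples [SEP] Definition [SEP] } format.
    triple_text = " ; ".join(" ".join(t) for t in triples)
    return f"{triple_text} [SEP] {definition} [SEP]"
```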
It should be noted that different forms of the commonsense knowledge graph and dictionary can be selected; in this embodiment, the commonsense knowledge graph is ConceptNet and the dictionary is the Cambridge Dictionary.
S3: taking the question-candidate text, all concepts extracted from it in step S1, and the concept-centric knowledge acquired in step S2 as input, and generating contrastive explanation knowledge related to the question and the candidate answers with a pre-trained generator; the generator is a template-prompt-based contrastive explanation generator obtained by fine-tuning the pre-trained generative language model BART-base.
In step S3 above, training the template-prompt-based contrastive explanation generator first requires collecting explanation-related data sets, selected by the following principles: 1) the data set directly contains contrastive explanations; 2) failing that, the data set provides explanations for different candidate answers, i.e., positive and negative explanations; 3) failing that, the explanations in the data set contain factual knowledge that distinguishes different candidate answers or labels. After careful selection, in the embodiment of the present invention the contrastive explanation generator uses the pre-trained generative language model BART-base fine-tuned on the ECQA, eQASC, and e-SNLI data sets. Unlike the fine-tuning stage, where the generator considers only the original question and candidates as input, at inference time the model input is enhanced with concepts and external symbolic knowledge; the inference-time input consists of a task prefix prompt, the question from the question-candidate text, the candidate answers from the question-candidate text, all concepts extracted from the question-candidate text, the concept-centric knowledge acquired from those concepts, and a pre-constructed discrete prompt template. Specifically, the input is organized as: Task Prefix [SEP] Question [SEP] Candidates [SEP] Concepts [SEP] Concept-centric Knowledge [SEP] Explanation Prompt [SEP], where Task Prefix is the task prefix prompt (in this embodiment the fixed text "generate a contrastive explanation for the question"), Concept-centric Knowledge is the retrieved concept-centric knowledge (triples and concept descriptions), Explanation Prompt is a manually constructed discrete prompt template, and Question, Candidates, and Concepts are the question, candidate answers, and concepts of the question-candidate text, respectively.
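As an illustration, the inference-time input of the generator could be assembled and decoded as sketched below with BART-base. The concrete wording of the explanation_prompt argument is a hypothetical placeholder, since the embodiment only states that it is a manually constructed discrete template.

```python
# Illustrative sketch of the S3 inference-time input construction and generation.
from transformers import BartForConditionalGeneration, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
generator = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

def generate_explanation(question, candidates, concepts, knowledge,
                         explanation_prompt="The difference between the candidates is that"):
    # Hypothetical prompt wiring following the documented field order:
    # Task Prefix [SEP] Question [SEP] Candidates [SEP] Concepts [SEP]
    # Concept-centric Knowledge [SEP] Explanation Prompt [SEP]
    sep = tokenizer.sep_token
    prompt = (f"generate a contrastive explanation for the question {sep} "
              f"{question} {sep} {', '.join(candidates)} {sep} "
              f"{', '.join(concepts)} {sep} {knowledge} {sep} "
              f"{explanation_prompt} {sep}")
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=1024)
    out = generator.generate(**inputs, max_new_tokens=64, num_beams=4)
    return tokenizer.decode(out[0], skip_special_tokens=True)
```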
S4: taking the contrastive explanation knowledge generated by the contrastive explanation generator in S3 together with the question-candidate text as input to a trained explanation-enhanced reasoning module, where the input is first encoded by a pre-trained encoder, namely the pre-trained model DeBERTaV3; the encoding result is then fed through a pooling layer and a multilayer perceptron, which output the answer to the question, realizing commonsense question-answering inference.
In the embodiment of the present invention, within the explanation-enhanced reasoning module, the pre-trained model DeBERTaV3 encodes the input contrastive explanation knowledge and the question-candidate text and finally outputs the hidden state of the [CLS] token as the representation; this representation passes sequentially through the pooling layer and the multilayer perceptron to output the answer to the question.
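A minimal sketch of this reasoning module follows. The checkpoint name, the pooler design (linear + tanh over the [CLS] state), and the MLP sizes are assumptions; the embodiment specifies only DeBERTaV3, a pooling layer, and a multilayer perceptron.

```python
# Illustrative sketch of the S4 explanation-enhanced reasoning module.
import torch.nn as nn
from transformers import AutoModel

class ExplanationEnhancedReasoner(nn.Module):
    def __init__(self, name: str = "microsoft/deberta-v3-base", hidden: int = 768):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(name)
        # Assumed pooler: linear + tanh over the [CLS] hidden state.
        self.pooler = nn.Sequential(nn.Linear(hidden, hidden), nn.Tanh())
        # Assumed MLP head producing one score per candidate sequence.
        self.mlp = nn.Sequential(nn.Linear(hidden, hidden), nn.GELU(),
                                 nn.Linear(hidden, 1))

    def forward(self, input_ids, attention_mask):
        # One "explanation [SEP] question [SEP] candidate" sequence per row.
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        cls = out.last_hidden_state[:, 0]       # hidden state of the [CLS] token
        return self.mlp(self.pooler(cls)).squeeze(-1)   # (num_candidates,) scores
```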
The objective function used to train the explanation-enhanced reasoning module is defined as follows:
L = -(1/n) · Σ_{i=1}^{n} y_i · log(softmax(h_i))
where i denotes the i-th sample in the training data set, h_i denotes the hidden state output by the multilayer perceptron, y_i denotes the answer label of the i-th sample, and n denotes the total number of samples.
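Read as a standard softmax cross-entropy over the per-candidate scores h_i, one training step of the module could look like the sketch below; this reading of the objective is an assumption consistent with the variable definitions, and, as noted below, only the reasoning module's parameters are updated.

```python
# Illustrative training step: cross-entropy over the candidate scores
# (the cross-entropy reading of the objective is an assumption).
import torch
import torch.nn.functional as F

def training_step(model, optimizer, input_ids, attention_mask, gold_index: int):
    # input_ids / attention_mask: (num_candidates, seq_len), one row per candidate.
    scores = model(input_ids, attention_mask)          # per-candidate scores h_i
    loss = F.cross_entropy(scores.unsqueeze(0),        # softmax + NLL of the gold answer
                           torch.tensor([gold_index]))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```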
It should be noted that during training of the explanation-enhanced reasoning module, the concept recognizer and the contrastive explanation generator are kept fixed and require no parameter updates.
The above method was applied in the following example; the specific implementation steps are as described above, and its effect is mainly demonstrated by the example.
Examples
As shown in Table 1, this embodiment divides existing CSQA methods into four groups: Pre-trained Language Model Only, PLM + Symbolic Knowledge Retrieval, PLM + Generated Knowledge, and Human Parities. The experimental results demonstrate the effectiveness of the proposed contrastive explanation generation method for commonsense question answering (denoted CPACE). Specifically, the CPACE model achieves 87.4% in the single-model setting by generating contrastive explanations; moreover, while KEAR leverages external knowledge and retrieval over the training set for knowledge enhancement, CPACE outperforms KEAR and sets a new SOTA on the CSQA leaderboard. This suggests that, unlike retrieving triples, definitions, and training instances, generated contrastive explanations can be another effective means of knowledge enhancement.
TABLE 1 Experimental results on the commonsense question-answering test set
[Table 1 in the original publication is an image; it lists the test-set accuracy of each compared method by group, including the 87.4% single-model result of CPACE.]
To further measure the generality of the method of the present invention, this embodiment uses contrastive explanations generated by the CPACE generator on the QASC and OBQA data sets to evaluate model performance. As shown in Table 2, some representative baselines were selected for comparison, including UnifiedQA, RoBERTa + AIR, and GenMC. Although ALBERT alone achieves only 71.8% and 72.5% on QASC and OBQA, ALBERT + KB, which merely retrieves symbolic knowledge from a knowledge base, achieves 80.3% and 83.2%, respectively. Using the CPACE model, further improvements of 3.4% and 2.9%, respectively, are obtained. The experimental results show that the CPACE model can serve commonsense question answering as well as QA in other open domains.
TABLE 2 Generalization results of the model on the QASC and OBQA data sets
Model QASC OBQA
BERT 68.4 64.1
UnifiedQA 66.6 70.5
GenMC 67.6 71.6
ALBERT 71.8 72.5
ALBERT+KB 80.3 83.2
RoBERTa+AIR 81.4 81.7
CPACE 83.7 86.1
As shown in Table 3, with BART-base as the pre-trained model, this embodiment evaluates the effect of the concepts, the prefix prompt, and the retrieved concept-centric knowledge (triples and definitions of concepts) in the generator. Using the fine-tuned BART-base alone as the generator, the explanation-enhanced inference model reaches only 78.3% on the CSQA development set.
With explicit concepts, the generator benefits considerably, since concepts carry the key information of a given sentence: when concepts are added to the input, this embodiment achieves a 0.8% improvement. Using the prefix prompt as a formal constraint yields a 4.1% improvement, which underlines the necessity of the contrastive explanation prompt as a constraint. Meanwhile, enhancement with external concept-centric knowledge achieves a 5.2% improvement, indicating that concept-centric knowledge is equally important for contrastive explanation generation. Finally, combining the above three kinds of knowledge improves the inference model by 6.9%.
TABLE 3 Ablation results of the contrastive explanation generator module
Model Validation set accuracy
BART 78.3
BART+Concept 79.1
BART+Prefix Prompt 82.4
BART+Concept-centric Knowledge 83.5
BART+All 85.2
As shown in Table 4, ALBERT achieves 73.8% and DeBERTaV3 achieves 84.6% on the CSQA question-answering task, indicating that a stronger pre-trained encoder matters considerably in downstream tasks. This embodiment then enhances the inference model with the concepts, the retrieved concept-centric knowledge, and the generated contrastive explanations as different types of additional knowledge. While concepts bring only 1.5% and 0.2% improvements to ALBERT and DeBERTaV3, the triples and concept definitions achieve 10.4% and 0.5% improvements, respectively. Furthermore, using the generated contrastive explanations yields large improvements of 11.4% and 3.3%, respectively. This indicates that the contrastive explanations generated by the invention are more effective than the retrieved symbolic knowledge.
TABLE 4 Ablation results of the model inference module
Model Validation set accuracy
ALBERT 73.8
ALBERT+Concept 75.3
ALBERT+Concept-centric Knowledge 84.2
ALBERT+Contrastive Explanation 85.2
ALBERT+All 88.4
DeBERTaV3 84.6
DeBERTaV3+Concept 84.8
DeBERTaV3+Concept-centric Knowledge 85.1
DeBERTaV3+Contrastive Explanation 87.9
DeBERTaV3+All 91.7
ALBERT+Ground-truth Explanation 96.9
DeBERTaV3+Ground-truth Explanation 97.1
The above-described embodiments are merely preferred embodiments of the present invention and should not be construed as limiting it. Various changes and modifications may be made by one of ordinary skill in the pertinent art without departing from the spirit and scope of the present invention. Therefore, technical schemes obtained by equivalent replacement or equivalent transformation fall within the protection scope of the invention.

Claims (10)

1. A template-prompt-based contrastive explanation generation method for commonsense question answering, characterized by comprising the following steps:
S1: for a question-candidate text formed from a question and a candidate answer, a trained concept recognizer identifies the key concepts of the question-candidate text: it performs token-level sequence labeling on the question-candidate text and outputs a tag sequence indicating whether each token belongs to a concept or to the background, thereby extracting all concepts in the question-candidate text; the concept recognizer consists of an encoder and a CRF layer, where the encoder is a RoBERTa-large model and the CRF layer outputs the tag sequence after the question-candidate text has been encoded by the encoder;
S2: using all concepts extracted from the question-candidate text in step S1 as anchors, retrieving additional external symbolic knowledge from different knowledge bases, including a commonsense knowledge graph and a dictionary, and concatenating the retrieved external symbolic knowledge into concept-centric knowledge;
S3: taking the question-candidate text, all concepts extracted from it in step S1, and the concept-centric knowledge acquired in step S2 as input, and generating contrastive explanation knowledge related to the question and the candidate answers with a pre-trained generator; the generator is a template-prompt-based contrastive explanation generator obtained by fine-tuning the pre-trained generative language model BART-base;
S4: taking the contrastive explanation knowledge generated by the contrastive explanation generator in S3 together with the question-candidate text as input to a trained explanation-enhanced reasoning module, where the input is first encoded by a pre-trained encoder, namely the pre-trained model DeBERTaV3; and passing the encoding result of the pre-trained encoder through the pooling layer and the multilayer perceptron to output the answer to the question, realizing commonsense question-answering inference.
2. The template-prompt-based contrastive explanation generation method for commonsense question answering according to claim 1, wherein in step S1 the concept recognition task is treated as a token-level sequence labeling task, and the input to the concept recognizer is the sentence S = [CLS] Q [SEP] A [SEP], where [SEP] is the token separating the question Q and the candidate answer A; for an input sentence S, the sequence labeling task finds the set of concepts C = {c_1, c_2, …, c_n} in the sentence and marks them to form a tag sequence; in the resulting tag sequence, 1 and 0 indicate whether each token of the question-candidate text belongs to a concept or to the background, with 1 denoting a concept tag and 0 denoting a background tag.
3. The template-prompt-based contrastive explanation generation method for commonsense question answering according to claim 1, wherein the training data sets of the concept recognizer include CommonGen, e-SNLI, and CSQA; each training data set contains the annotated concepts or labels of its instances, and if an instance contains more than 3 identified concepts, the top 3 concepts are selected for subsequent use according to a score-ranking mechanism, otherwise all identified concepts are selected.
4. The template-prompt-based contrastive explanation generation method for commonsense question answering according to claim 1, wherein the specific implementation steps of S2 are as follows:
S21: first, searching the commonsense knowledge graph for a path from a question concept to a candidate concept; if exactly one path exists, selecting it directly and extracting all triples on the path as the related knowledge from the commonsense knowledge graph; if more than one path exists, comparing their lengths, selecting the shortest path, and extracting all triples on it as the related knowledge; if no direct path exists between the question concept and the candidate concept but the graph contains triples related to the candidate concept, computing the score of each triple with a predefined scoring function and selecting the highest-scoring triple as the related knowledge; for any triple j, the score score_j computed by the scoring function is:
score_j = w_j · N_k / N
where w_j is the weight of triple j in the commonsense knowledge graph, N is the total number of triples related to the candidate concept in the commonsense knowledge graph, all N such triples are divided into several relation categories by clustering, and N_k is the number of triples contained in the relation category k to which triple j belongs;
S22: for each concept extracted from the question-candidate text in S1, selecting the closest-matching definition entry from the dictionary as the concept description;
S23: for each concept extracted from the question-candidate text, concatenating in series the triples obtained in step S21 and the concept description obtained in step S22 to form the concept-centric knowledge from external knowledge sources corresponding to that concept.
5. The template-prompt-based contrastive explanation generation method for commonsense question answering according to claim 4, wherein in step S22, when selecting the closest-matching definition entry from the dictionary as the concept description, if definition entries exist in multiple forms, the priority order for selection as the concept description is: the original form of the concept itself > its spaCy lemma form > the base word.
6. The template-prompt-based contrastive explanation generation method for commonsense question answering according to claim 1, wherein in S3 the template-prompt-based contrastive explanation generator uses the pre-trained generative language model BART-base fine-tuned on the ECQA, eQASC, and e-SNLI data sets; the generator considers only the original question and candidate options as input during fine-tuning but enhances the model input with concepts and external symbolic knowledge at inference time, and the inference-time model input consists of a task prefix prompt, the question from the question-candidate text, the candidate answers from the question-candidate text, all concepts extracted from the question-candidate text, the concept-centric knowledge acquired from the concepts in the question-candidate text, and a pre-constructed discrete prompt template.
7. The template-prompt-based contrastive explanation generation method for commonsense question answering according to claim 1, wherein in the explanation-enhanced reasoning module of S4, the pre-trained model DeBERTaV3 encodes the input contrastive explanation knowledge and the question-candidate text and finally outputs the hidden state of the [CLS] token as the representation, and the representation passes sequentially through the pooling layer and the multilayer perceptron to output the answer to the question.
8. The template-prompt-based contrastive explanation generation method for commonsense question answering according to claim 1, wherein the objective function used to train the explanation-enhanced reasoning module is defined as follows:
L = -(1/n) · Σ_{i=1}^{n} y_i · log(softmax(h_i))
where i denotes the i-th sample in the training data set, h_i denotes the hidden state output by the multilayer perceptron, y_i denotes the answer label of the i-th sample, and n denotes the total number of samples.
9. The template-prompt-based contrastive explanation generation method for commonsense question answering according to claim 1, wherein the commonsense knowledge graph is ConceptNet.
10. The template-prompt-based contrastive explanation generation method for commonsense question answering according to claim 1, wherein the dictionary is the Cambridge Dictionary.
CN202211430964.0A 2022-11-15 2022-11-15 Template-prompt-based contrastive explanation generation method for commonsense question answering Pending CN115687595A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211430964.0A CN115687595A (en) 2022-11-15 2022-11-15 Template-prompt-based contrastive explanation generation method for commonsense question answering

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211430964.0A CN115687595A (en) 2022-11-15 2022-11-15 Template-prompt-based contrastive explanation generation method for commonsense question answering

Publications (1)

Publication Number Publication Date
CN115687595A true CN115687595A (en) 2023-02-03

Family

ID=85051893

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211430964.0A Pending CN115687595A (en) 2022-11-15 2022-11-15 Template-prompt-based contrastive explanation generation method for commonsense question answering

Country Status (1)

Country Link
CN (1) CN115687595A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116910646A (en) * 2023-07-04 2023-10-20 南京航空航天大学 Method for classifying internal link objectives of knowledge units in SO website
CN116910646B (en) * 2023-07-04 2024-02-09 南京航空航天大学 Method for classifying internal link objectives of knowledge units in SO website
CN117033608A (en) * 2023-09-28 2023-11-10 中国电子科技集团公司第十研究所 Knowledge graph generation type question-answering method and system based on large language model
CN117033608B (en) * 2023-09-28 2023-12-22 中国电子科技集团公司第十研究所 Knowledge graph generation type question-answering method and system based on large language model

Similar Documents

Publication Publication Date Title
US10678816B2 (en) Single-entity-single-relation question answering systems, and methods
CN112115238B (en) Question-answering method and system based on BERT and knowledge base
CN109271537B (en) Text-to-image generation method and system based on distillation learning
CN111046179B (en) Text classification method for open network question in specific field
CN115687595A (en) Comparison and interpretation generation method based on template prompt and oriented to common sense question answering
KR20140138648A (en) Methods, apparatus and products for semantic processing of text
CN111914556B (en) Emotion guiding method and system based on emotion semantic transfer pattern
CN112989033B (en) Microblog emotion classification method based on emotion category description
CN112925918B (en) Question-answer matching system based on disease field knowledge graph
CN110851584A (en) Accurate recommendation system and method for legal provision
CN117236338B (en) Named entity recognition model of dense entity text and training method thereof
CN111507093A (en) Text attack method and device based on similar dictionary and storage medium
CN113641809A (en) XLNET-BiGRU-CRF-based intelligent question answering method
Karlos et al. Combining active learning with self-train algorithm for classification of multimodal problems
CN117609436A (en) College scientific research management question-answering system combining knowledge graph and large language model
Malandrakis et al. Emotiword: Affective lexicon creation with application to interaction and multimedia data
CN111858885B (en) Keyword separation user question intention identification method
CN114077839A (en) Knowledge base question-answering method under small number of labeling scenes
CN113935329B (en) Asymmetric text matching method based on adaptive feature recognition and denoising
CN115422934B (en) Entity identification and linking method and system for space text data
CN118093834B (en) AIGC large model-based language processing question-answering system and method
CN117131169A (en) Complex question-answering method and system applied to mineral knowledge graph
CN117408258A (en) Method for constructing knowledge-driven few-sample named entity recognition adapter
CN117933255A (en) Cross-domain term extraction method based on large-scale language model and pre-training fine tuning mechanism
Hajihashemi Varnousfaderani Challenges and insights in semantic search using language models

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination