CN113127599A - Question-answering position detection method and device of hierarchical alignment structure - Google Patents
- Publication number
- CN113127599A (application number CN202110230676.XA)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06F16/3344 — Query execution using natural language analysis
- G06F16/3346 — Query execution using probabilistic model
- G06F16/35 — Clustering; Classification
- G06Q50/01 — Social networking
Abstract
The invention discloses a question-answering stance detection method and device with a hierarchical alignment structure, wherein the method comprises the following steps: respectively converting a question text and an answer text into a question sequence and an answer sequence; splicing the question sequence and the answer sequence to obtain a question-answer sequence; and inputting the question sequence, the answer sequence, and the question-answer sequence into a hierarchical alignment model to obtain a question-answering stance detection result. The hierarchical alignment model uses a pre-trained BERT model to obtain a coarse-grained stance representation, then performs concept-level target alignment and evidence-level information alignment from both the question side and the answer side of the QA pair to obtain a coarse-to-fine stance representation, and thereby achieves higher accuracy and F1 values on the question-answering stance detection task.
Description
Technical Field
The invention relates to stance detection on social media, a subfield of natural language processing, and in particular to a question-answering stance detection method and device with a hierarchical alignment structure.
Background Art
The stance detection task is a classification problem that aims to identify the stance an author expresses toward a specific target (such as an entity, claim, or event). It plays an important role in tasks such as opinion mining, political debate analysis, rumor detection, and fake-news detection. On social question-answering platforms, question-answering stance detection is a new variant of this task, which aims to identify the stance carried in an answer toward a specific question.
For the stance detection task, early research focused on online debate forums, mainly using rule-based algorithms (Walker, M., Fox Tree, J., Anand, P., Abbott, R., King, J.: A corpus for research on deliberation and debate. In: Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pp. 812-817, European Language Resources Association (ELRA), Istanbul, Turkey, May 2012), SVM classifiers (Hasan, K.S., Ng, V.: Stance classification of ideological debates: Data, models, features, and constraints. In: Proceedings of the Sixth International Joint Conference on Natural Language Processing, 2013), and related feature-based machine-learning methods. Recent work has gradually shifted to the social media domain, and research methods have shifted to deep learning, using deep neural network models to analyze the stance toward targets, e.g., (Vijayaraghavan, P., Sysoev, I., Vosoughi, S., Roy, D.: DeepStance at SemEval-2016 Task 6: Detecting stance in tweets using character and word-level CNNs. In: Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), pp. 413-419, Association for Computational Linguistics, San Diego, California, 2016), (Zarrella, G., Marsh, A.: MITRE at SemEval-2016 Task 6: Transfer learning for stance detection. In: Proceedings of SemEval-2016, Association for Computational Linguistics, 2016), and (Sun, Q., Wang, Z., Zhu, Q., Zhou, G.: Stance detection with hierarchical attention network. In: Proceedings of the 27th International Conference on Computational Linguistics, pp. 2399-2409, Association for Computational Linguistics, Santa Fe, New Mexico, USA, Aug 2018).
In addition, some studies, such as (Zhang, B., Yang, M., Li, X., Ye, Y., Xu, X., Dai, K.: Enhancing cross-target stance detection with transferable semantic-emotion knowledge. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 3188-3197, Association for Computational Linguistics, Online, Jul 2020) and related transfer-learning work, use transfer learning to migrate knowledge about target objects across domains.
Question-answering stance detection targets the question in a QA text and identifies the stance expressed in the answer text. Given a question-answer (QA) pair, the latest approach proposes a recurrent conditional attention network (Yuan, J., Zhao, Y., Xu, J., Qin, B.: Exploring answer stance detection with recurrent conditional attention. In: The Thirty-Third AAAI Conference on Artificial Intelligence, AAAI 2019, Honolulu, Hawaii, USA, January 27 - February 1, 2019, pp. 7426-7433) that models the interaction between the question and the answer to obtain the final stance of the answer toward the question. To solve the question-answering stance detection task, a model must not only understand the semantics of the question and answer texts, but also model the relationship between them.
Furthermore, compared with related stance detection subtasks such as rumor stance detection and fake-news stance detection (Gorrell, G., Kochkina, E., Liakata, M., Aker, A., Zubiaga, A., Bontcheva, K., Derczynski, L.: SemEval-2019 Task 7: RumourEval, determining rumour veracity and support for rumours. In: Proceedings of the 13th International Workshop on Semantic Evaluation, pp. 845-854, Association for Computational Linguistics, Minneapolis, Minnesota, USA, Jun 2019), question-answering stance detection focuses more on learning the mutual correlation within a QA pair and modeling the stance representation under a specified target. Target-dependent sentiment analysis is also a related task; there, a representation of the target itself is learned, whereas question-answering stance detection must find the targets and information associated with the entire question.
The prior art, when applied to the question-answering stance detection task, ignores the following two problems. First, in question-answering stance detection the stance is related to targets corresponding to concepts in the question text, but the words expressing the same concept in the question and answer texts may not coincide, so target alignment should be performed. Second, the answer text may contain more than one concept-related target, and the extra target information may interfere with stance recognition, so context alignment should be performed to find the content that supports the question text, i.e., the evidence-related context.
Disclosure of Invention
The invention provides a question-answering stance detection method and device with a hierarchical alignment structure. Through concept-related target alignment and evidence-related context alignment, it solves the problem that the concept-related targets and evidence-related contexts in a QA pair may be inconsistent, and builds a coarse-to-fine vector representation of the stance. This effectively improves performance on the question-answering stance detection task and accurately identifies the stance that the answer text carries toward the question in a QA pair.
In order to achieve the purpose, the invention provides the following technical scheme:
a question-answering position detection method of a hierarchical alignment structure comprises the following steps:
1) respectively converting the question text and the answer text into a question sequence and an answer sequence;
2) splicing the question sequence and the answer sequence to obtain a question answer sequence;
3) inputting the question sequence, the answer sequence and the question answer sequence into a hierarchical alignment model to obtain a question-answer standpoint detection result;
wherein, a question-answering place detection model is obtained through the following steps:
a) respectively converting the plurality of sample question texts and the plurality of sample answer texts into sample question sequences and sample answer sequences, and splicing the sample question sequences and the corresponding sample answer sequences to obtain a plurality of sample question answer sequences;
b) respectively coding each sample question sequence, sample answer sequence and sample question answer sequence to obtain a plurality of question sequence representations SQAnswer sequence representation SAAnd coarse grain size in the vertical representation of SQA;
c) Representing S by a question sequenceQAs a query and representing the corresponding answer sequence as SAObtaining a number of question-dependent answer representations M as keys and valuesQ→AExpressing S as a sequence of answersAAs a query and representing the corresponding answer sequence as SQObtaining, as keys and values, a number of answer-dependent question representations MA→QAnd connecting the question-dependent answer representation MQ→AWith corresponding answer-dependent question representation MA→QObtaining a plurality of fine-grained representations DQA;
d) Aligning fine-grained representations D based on a multi-head attention mechanismQARepresenting S from the corresponding coarse-grained standpointQAThe sentence meanings related to evidence between the two groups of the sentences obtain a plurality of vectors representing O from a coarse position to a fine position;
e) and classifying a plurality of vector representations O from coarse to fine to obtain a level alignment model.
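Steps a) through e) above can be sketched end to end as follows. This is an illustrative sketch only, not the claimed implementation: the BERT encoder and the multi-head matching blocks are stood in by random features and a single-head attention helper (`cross_attend`), and all dimensions are toy values.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, d, C = 4, 6, 8, 3          # question length, answer length, embedding size, #classes

# a)-b) encode question, answer, and the spliced QA sequence
# (the BERT encoder is stood in by random features here)
S_Q  = rng.standard_normal((N, d))
S_A  = rng.standard_normal((M, d))
S_QA = rng.standard_normal((N + 1 + M, d))   # +1 row for the [SEP] separator

def cross_attend(query, keyval):
    """Single-head stand-in for the matching blocks: query attends over keyval."""
    scores = query @ keyval.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ keyval

# c) target alignment in both directions, then concatenation into D_QA
M_QA = cross_attend(S_Q, S_A)                # question-dependent answer representation
M_AQ = cross_attend(S_A, S_Q)                # answer-dependent question representation
D_QA = np.concatenate([M_QA, M_AQ], axis=0)  # fine-grained representation

# d) context alignment: D_QA attends over the coarse-grained representation S_QA
O = cross_attend(D_QA, S_QA)                 # coarse-to-fine stance representation

# e) pool and classify into one of C stance classes
logits = O.mean(axis=0) @ rng.standard_normal((d, C))
probs = np.exp(logits - logits.max()); probs /= probs.sum()
assert O.shape == (N + M, d) and abs(probs.sum() - 1.0) < 1e-9
```

The sketch only shows the data flow and tensor shapes; the detailed description below gives the actual multi-head formulation of each step.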
Further, encoding the sample question sequence, the sample answer sequence, and the sample question-answer sequence comprises: using a pre-trained BERT model.
Further, the question-dependent answer representation M_Q→A is obtained by the following steps:
1) using the question sequence representation S_Q as the query and the corresponding answer sequence representation S_A as the keys and values to obtain the output of the first question-answer matching block, comprising the steps of:
a) obtaining the output of the i-th head, ATT_i(S_Q, S_A) = softmax(S_Q W_i^Q (S_A W_i^K)^T / √d_k) S_A W_i^V, where d_k = d/h is the dimension of each head, d is the embedding size used when converting the sample question text and sample answer text into the sample question sequence and sample answer sequence, h is the number of heads, W_i^Q, W_i^K, W_i^V are learnable parameters, and 1 ≤ i ≤ h;
b) concatenating the outputs of the h heads and applying a linear projection to the concatenation to obtain MATT(S_Q, S_A) = [ATT_1(S_Q, S_A), ..., ATT_h(S_Q, S_A)] W^O, where W^O is a learnable parameter;
c) applying a residual connection between the question sequence representation S_Q and MATT(S_Q, S_A) to obtain Z = LN(S_Q + MATT(S_Q, S_A)), where LN is the layer normalization operation;
d) feeding Z into a feed-forward network and another residual connection layer to obtain the output of the first matching block, TIM_1(S_Q, S_A) = LN(Z + MLP(Z)), where MLP is the feed-forward network;
2) stacking l_m question-answer matching blocks and taking the output of the last block as the question-dependent answer representation M_Q→A.
Further, the coarse-to-fine stance vector representation is O = MATT′(D_QA, S_QA) = [ATT′_1(D_QA, S_QA), ..., ATT′_h′(D_QA, S_QA)] W′^O, where h′ is the number of attention heads and W′^O is a learnable parameter.
Further, the coarse-to-fine stance vector representations O are classified by:
1) using a softmax function to calculate the probability that a coarse-to-fine stance vector representation O belongs to each stance class;
2) taking the class with the highest probability as the class of that coarse-to-fine stance vector representation O.
Further, before calculating the probability that a coarse-to-fine stance vector representation O belongs to each stance class, a linear layer is used to reduce the dimensionality of each coarse-to-fine stance vector representation O.
Further, the loss function for training the hierarchical alignment model is L = −(1/N) Σ_{i=1}^{N} Σ_{j=1}^{|C|} y_{ij} log(ŷ_{ij}), where ŷ_{ij} is the predicted probability that sample i belongs to the j-th class, y_{ij} is 1 when the j-th class is the true label of sample i and 0 otherwise, N is the number of sample question texts (equivalently, sample answer texts), and |C| is the number of stance classes.
Further, the set of stance classes C comprises: favor, against, and neutral.
A storage medium having a computer program stored therein, wherein the computer program is arranged to perform the above-mentioned method when executed.
An electronic device comprising a memory and a processor, the memory having a computer program stored therein, and the processor being arranged to run the computer program to perform the method described above.
Compared with the prior art, the invention has the following advantages:
compared with the scheme of the circulation condition attention network, the method explicitly models the target dependency information through the attention coding strategy. In contrast, the conditional attention and extraction process only simulates the interaction between the QA pair, and neither learns a feature-rich text representation nor explicitly performs target and context alignment at the encoding stage, but the present invention uses the BERT pre-training model to obtain coarse-grained vertical representation, and then performs concept-level target alignment and evidence-level information alignment from both the question and the answer in the QA pair to obtain coarse-to-fine vertical representation. Experiments prove that the technical method can obtain higher accuracy and F1 value on the question-answering position detection task.
Drawings
FIG. 1 is a Hierarchical Alignment (HAT) model architecture diagram of the present invention.
Detailed Description
In order to make the technical solutions in the embodiments of the present invention better understood and make the objects, features, and advantages of the present invention more comprehensible, the technical core of the present invention is described in further detail below with reference to the accompanying drawings and examples.
The invention provides a novel question-answering stance detection model, a Transformer-based Hierarchical Alignment (HAT) model, as shown in Fig. 1. The model aligns the concept-related targets and evidence-related contexts in a question-answer pair, learns a coarse-to-fine stance representation, and is applied to question-answering stance detection. The HAT model mainly comprises three modules: a question-answer text encoding module, a concept-related target alignment module, and an evidence-related context alignment module. First, the invention uses the pre-trained model BERT (Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: Pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 4171-4186, Association for Computational Linguistics, Minneapolis, Minnesota, Jun 2019) to compute the basic semantic features of the question and answer texts. Then, QA interaction matching blocks are introduced to align the concept-related targets from two directions, obtaining a question-dependent answer representation and an answer-dependent question representation. Finally, a multi-head attention mechanism is used to align the evidence-related contexts and learn a better stance representation for question-answering stance detection.
The method is mainly divided into the following four parts: question-answer text encoding, target alignment, context alignment, and stance classification.
1. Question-answer text encoding
For the question text, the invention converts the text sequence into a sequence representation X = {x_1, x_2, ..., x_N}, where each x_i ∈ R^d is the sum of its word embedding, segment embedding, and position embedding, N is the length of the question sequence, and d is the embedding size, which is also the hidden dimension of the pre-trained BERT model used to obtain the text representation. The encoded text is the output of the last layer of the BERT encoder, i.e., the question sequence representation S_Q ∈ R^{N×d}. The answer sequence representation S_A ∈ R^{M×d}, where M is the length of the answer sequence, is obtained in the same way. Then, the invention splices the question sequence and the answer sequence and inputs the result into the pre-trained BERT model to obtain a coarse-grained stance representation, denoted S_QA ∈ R^{(N+1+M)×d}, where the one extra position corresponds to the separator [SEP] between the Q and A sequences.
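The encoding step can be illustrated with a small shape sketch. The BERT outputs are simulated with random tensors here (the `768` hidden size and the sequence lengths are assumed toy values); only the tensor shapes mirror the description.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, d = 5, 7, 768    # question length, answer length, BERT hidden size (assumed)

# each x_i is the sum of word, segment, and position embeddings
word_emb     = rng.standard_normal((N, d))
segment_emb  = rng.standard_normal((N, d))
position_emb = rng.standard_normal((N, d))
X = word_emb + segment_emb + position_emb      # input sequence representation

# the encoded representations are the last BERT layer outputs (simulated here)
S_Q  = rng.standard_normal((N, d))             # question sequence representation
S_A  = rng.standard_normal((M, d))             # answer sequence representation
# encoding the spliced input Q [SEP] A yields the coarse-grained stance representation
S_QA = rng.standard_normal((N + 1 + M, d))
assert X.shape == (N, d) and S_QA.shape == (N + 1 + M, d)
```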
2. Target alignment
The role of the concept-related target alignment module is to align the concept-related targets from both the question side and the answer side of the QA pair, learning a question-dependent answer representation and an answer-dependent question representation. To this end, QA interaction matching blocks are constructed that use an attention mechanism to align the concept-level targets from the two sides. We propose two QA interaction matching blocks: a question-answer matching block and an answer-question matching block.
The question-answer matching block uses the question sequence representation S_Q as the query and the answer sequence representation S_A as the keys and values. Conversely, the answer-question matching block uses the answer sequence representation S_A as the query and the question sequence representation S_Q as the keys and values. In this way, the model attends to the concept-related targets from both the question side and the answer side, and thus obtains a question-dependent answer representation and an answer-dependent question representation.
Specifically, the i-th head of the question-answer matching block is computed as:
ATT_i(S_Q, S_A) = softmax(S_Q W_i^Q (S_A W_i^K)^T / √d_k) S_A W_i^V
where d_k = d/h is the dimension of each head, W_i^Q, W_i^K, W_i^V are learnable parameters, and h is the number of heads.
Then, the outputs of the h heads are concatenated and linearly projected:
MATT(S_Q, S_A) = [ATT_1(S_Q, S_A), ATT_2(S_Q, S_A), ..., ATT_h(S_Q, S_A)] W^O
where W^O is a learnable parameter.
Then, a residual connection is applied between S_Q and MATT(S_Q, S_A):
Z = LN(S_Q + MATT(S_Q, S_A))
where LN is the layer normalization operation. Z is then fed into a feed-forward network (MLP) and another residual connection layer, yielding the output of the first matching block:
TIM(S_Q, S_A) = LN(Z + MLP(Z))
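A minimal NumPy sketch of one question-answer matching block follows, assuming standard scaled dot-product attention for each head; all weights are random stand-ins for learned parameters, and the feed-forward network is assumed to be a two-layer ReLU MLP.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def layer_norm(x, eps=1e-6):
    mu = x.mean(axis=-1, keepdims=True)
    sd = x.std(axis=-1, keepdims=True)
    return (x - mu) / (sd + eps)

def matching_block(S_q, S_kv, params, h):
    """One matching block: multi-head cross attention (query = S_q,
    keys/values = S_kv), residual + layer norm, a feed-forward network,
    and a second residual + layer norm (TIM = LN(Z + MLP(Z)))."""
    d = S_q.shape[-1]
    d_k = d // h
    heads = []
    for i in range(h):
        Q = S_q  @ params["Wq"][i]           # (len_q, d_k)
        K = S_kv @ params["Wk"][i]           # (len_kv, d_k)
        V = S_kv @ params["Wv"][i]           # (len_kv, d_k)
        A = softmax(Q @ K.T / np.sqrt(d_k))  # attention weights of head i
        heads.append(A @ V)
    matt = np.concatenate(heads, axis=-1) @ params["Wo"]    # MATT(S_Q, S_A)
    Z = layer_norm(S_q + matt)                              # Z = LN(S_Q + MATT)
    mlp = np.maximum(Z @ params["W1"], 0.0) @ params["W2"]  # two-layer FFN with ReLU
    return layer_norm(Z + mlp)                              # TIM(S_Q, S_A)

rng = np.random.default_rng(0)
N, M, d, h = 4, 6, 8, 2
params = {
    "Wq": [rng.standard_normal((d, d // h)) for _ in range(h)],
    "Wk": [rng.standard_normal((d, d // h)) for _ in range(h)],
    "Wv": [rng.standard_normal((d, d // h)) for _ in range(h)],
    "Wo": rng.standard_normal((d, d)),
    "W1": rng.standard_normal((d, 2 * d)),
    "W2": rng.standard_normal((2 * d, d)),
}
S_Q, S_A = rng.standard_normal((N, d)), rng.standard_normal((M, d))
out = matching_block(S_Q, S_A, params, h)   # question-dependent direction
assert out.shape == (N, d)
```

Swapping the two arguments (`matching_block(S_A, S_Q, ...)` with appropriately shaped weights) gives the answer-question direction, since the output always keeps the query's length.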
We stack l_m question-answer matching blocks and take the output of the last block as the question-dependent answer representation, denoted M_Q→A, where l_m is a hyper-parameter giving the number of matching blocks.
Similarly, we stack l_m answer-question matching blocks to obtain the answer-dependent question representation, denoted M_A→Q.
Finally, we concatenate the two representations M_Q→A and M_A→Q to obtain the fine-grained representation D_QA as the output of the concept-related target alignment module.
3. Context alignment
The evidence-related context alignment module aims to align the evidence contexts of the QA pair and accumulate the coarse-grained and fine-grained information into a stance representation for question-answering stance classification. To this end, the invention employs a multi-head attention layer to align the evidence-related sentence meanings between the fine-grained representation D_QA and the coarse-grained stance representation S_QA.
Specifically, the multi-head attention is computed as:
MATT′(D_QA, S_QA) = [ATT′_1(D_QA, S_QA), ..., ATT′_h′(D_QA, S_QA)] W′^O
where h′ is the number of attention heads and W′^O is a learnable parameter; each head ATT′_i takes D_QA as the query and S_QA as the keys and values. The resulting coarse-to-fine stance vector representation is denoted O = MATT′(D_QA, S_QA), which completes the context alignment process.
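The context-alignment attention can be sketched the same way: the fine-grained representation D_QA is the query and the coarse-grained representation S_QA supplies the keys and values. The head count, sequence lengths, and random weights below are toy assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(1)
d, h = 8, 2                           # toy embedding size and head count h'
d_k = d // h
D_QA = rng.standard_normal((10, d))   # fine-grained representation (query)
S_QA = rng.standard_normal((11, d))   # coarse-grained stance representation (keys/values)

heads = []
for i in range(h):
    Wq, Wk, Wv = (rng.standard_normal((d, d_k)) for _ in range(3))
    A = softmax((D_QA @ Wq) @ (S_QA @ Wk).T / np.sqrt(d_k))
    heads.append(A @ (S_QA @ Wv))
O = np.concatenate(heads, axis=-1) @ rng.standard_normal((d, d))  # W'^O stand-in
assert O.shape == D_QA.shape          # coarse-to-fine stance vector representation
```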
4. Stance classification
After the coarse-to-fine stance vector representation O is obtained, the final stance classification is performed. In the stance classification part, a linear layer first reduces the dimensionality; then a softmax function calculates the probability that the representation belongs to each stance class, and the class with the highest probability is taken as the stance class of the given QA pair. This part is formulated as:
p = softmax(W O + b)
where W and b are the parameters of the linear layer and p is the vector of class probabilities; the class with the highest probability, argmax_j p_j, is taken as the predicted stance class.
the loss function during training is:
wherein the content of the first and second substances,is the result of the prediction, representing the probability of the jth class of categories; when the jth class is the true label of sample i,is 1, otherwise is 0; n is the data size of the training data; i C is the size of the number of the set of vertical classes, where the set of vertical classes C ═ Favor, Against, Neutral }.
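A minimal sketch of the classification head and the cross-entropy loss for a single sample, assuming the stance representation is pooled to one d-dimensional vector before the linear layer (the pooling step is an assumption, not stated explicitly in the text); the weights are random stand-ins.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(2)
d, C = 8, 3                          # hidden size; classes: Favor, Against, Neutral
o = rng.standard_normal(d)           # pooled stance vector (pooling is an assumption)
W, b = rng.standard_normal((d, C)), np.zeros(C)

p = softmax(o @ W + b)               # linear layer + softmax -> class probabilities
pred = int(np.argmax(p))             # class with the highest probability

y = np.zeros(C); y[0] = 1.0          # one-hot true label (Favor, for illustration)
loss = -np.sum(y * np.log(p))        # cross-entropy for this single sample
assert abs(p.sum() - 1.0) < 1e-9 and 0 <= pred < C and loss > 0
```

Averaging this per-sample loss over the N training samples gives the training objective stated above.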
(III) positive effects
To verify the effect of the method, the experiments use the open-source dataset proposed in the recurrent conditional attention network scheme mentioned above, which contains Chinese question-answer pairs. The QA pairs were collected from three websites, Baidu Zhidao, Sogou Wenwen, and a medical Q&A site, and the concept-related targets mainly involve pregnancy, food safety, diseases, and the like. The training set contains 10,598 QA pairs and the test set contains 2,993; the number of samples in each stance class of the training and test sets is shown in Table 1.
The evaluation metrics are accuracy, F1-macro, F1-favor, and F1-against, where F1-favor is the F1 value on samples whose stance label is Favor and F1-against is the F1 value on samples whose stance label is Against. The method (HAT model) was compared with several mainstream methods; the specific results are shown in Table 2.
TABLE 1 data set statistics
TABLE 2 results of the experiment
The model provided by the invention achieves the best results on every evaluation metric, exceeding the performance of several mainstream models and demonstrating the effectiveness of the proposed method.
The above examples are provided only for the purpose of describing the present invention, and are not intended to limit the scope of the present invention. The scope of the invention is defined by the appended claims. Various equivalent substitutions and modifications can be made without departing from the spirit and principles of the invention, and are intended to be within the scope of the invention.
Claims (10)
1. A question-answering stance detection method of a hierarchical alignment structure, comprising the following steps:
1) respectively converting a question text and an answer text into a question sequence and an answer sequence;
2) splicing the question sequence and the answer sequence to obtain a question-answer sequence;
3) inputting the question sequence, the answer sequence, and the question-answer sequence into a hierarchical alignment model to obtain a question-answering stance detection result;
wherein the hierarchical alignment model is obtained through the following steps:
a) respectively converting a number of sample question texts and sample answer texts into sample question sequences and sample answer sequences, and splicing each sample question sequence with the corresponding sample answer sequence to obtain a number of sample question-answer sequences;
b) respectively encoding each sample question sequence, sample answer sequence, and sample question-answer sequence to obtain a number of question sequence representations S_Q, answer sequence representations S_A, and coarse-grained stance representations S_QA;
c) using each question sequence representation S_Q as the query and the corresponding answer sequence representation S_A as the keys and values to obtain a number of question-dependent answer representations M_Q→A; using each answer sequence representation S_A as the query and the corresponding question sequence representation S_Q as the keys and values to obtain a number of answer-dependent question representations M_A→Q; and concatenating each question-dependent answer representation M_Q→A with the corresponding answer-dependent question representation M_A→Q to obtain a number of fine-grained representations D_QA;
d) based on a multi-head attention mechanism, aligning the evidence-related sentence meanings between each fine-grained representation D_QA and the corresponding coarse-grained stance representation S_QA to obtain a number of coarse-to-fine stance vector representations O;
e) classifying the coarse-to-fine stance vector representations O to obtain the hierarchical alignment model.
2. The method of claim 1, wherein encoding the sample question sequence, the sample answer sequence, and the sample question-answer sequence comprises: using a pre-trained BERT model.
3. The method of claim 1, wherein the question-dependent answer representation M_Q→A is obtained by the following steps:
1) using the question sequence representation S_Q as the query and the corresponding answer sequence representation S_A as the keys and values, obtaining the output of a first answer-question matching block through the following steps:
a) obtaining the output of the i-th head ATT_i(S_Q, S_A) = softmax((S_Q W_i^Q)(S_A W_i^K)^T / √(d/h)) (S_A W_i^V), where W_i^Q, W_i^K, W_i^V ∈ R^(d×(d/h)) are learnable parameters, d is the embedding size used when converting the sample question texts and sample answer texts into sample question sequences and sample answer sequences, h is the number of heads, and 1 ≤ i ≤ h;
b) concatenating the outputs of the h heads and applying a linear projection to the concatenation to obtain MATT(S_Q, S_A) = [ATT_1(S_Q, S_A), ATT_2(S_Q, S_A), ..., ATT_h(S_Q, S_A)] W^O, where W^O ∈ R^(d×d) is a learnable parameter;
c) applying a residual connection between the question sequence representation S_Q and MATT(S_Q, S_A) to obtain Z = LN(S_Q + MATT(S_Q, S_A)), where LN is the layer normalization operation;
d) feeding Z into a feed-forward network and another residual connection layer to obtain the output of the first Transformer encoder TIM_1(S_Q, S_A) = LN(Z + MLP(Z)), where MLP is the feed-forward network;
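Steps a)–d) of claim 3 can be sketched as follows. This is a numpy sketch assuming the standard scaled dot-product formulation of the per-head attention (the garbled original equation is reconstructed on that assumption); all weight matrices are random stand-ins for the learnable parameters, and the dimensions are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
d, h = 16, 4          # embedding size and number of heads (illustrative)
dk = d // h

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def layer_norm(x, eps=1e-6):
    mu = x.mean(axis=-1, keepdims=True)
    sd = x.std(axis=-1, keepdims=True)
    return (x - mu) / (sd + eps)

S_Q = rng.normal(size=(5, d))   # question sequence representation
S_A = rng.normal(size=(8, d))   # answer sequence representation

# a) per-head cross-attention ATT_i(S_Q, S_A)
heads = []
for _ in range(h):
    W_q, W_k, W_v = (rng.normal(size=(d, dk)) for _ in range(3))
    scores = (S_Q @ W_q) @ (S_A @ W_k).T / np.sqrt(dk)
    heads.append(softmax(scores) @ (S_A @ W_v))

# b) concatenate the h heads and project: MATT(S_Q, S_A)
W_o = rng.normal(size=(d, d))
MATT = np.concatenate(heads, axis=-1) @ W_o

# c) residual connection + layer normalization
Z = layer_norm(S_Q + MATT)

# d) feed-forward network + second residual layer: TIM_1(S_Q, S_A)
W_1, W_2 = rng.normal(size=(d, 4 * d)), rng.normal(size=(4 * d, d))
TIM_1 = layer_norm(Z + np.maximum(Z @ W_1, 0) @ W_2)
print(TIM_1.shape)  # (5, 16)
```

Note the asymmetry: S_Q supplies the queries (and the residual path), so the block's output has the question's sequence length but is conditioned on the answer.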
5. The method of claim 1, wherein the coarse-to-fine stance vector representations O are classified by:
1) computing the probability that each coarse-to-fine stance vector representation O belongs to each stance class using a softmax function;
2) taking the class with the highest probability as the class of that coarse-to-fine stance vector representation O.
6. The method of claim 5, wherein the dimensionality of each coarse-to-fine stance vector representation O is reduced with a linear layer before computing the probability that it belongs to each stance class.
8. The method of claim 7, wherein the set of stance classes C comprises: approval, disapproval, and neutrality.
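Claims 5, 6, and 8 together describe a linear projection followed by a softmax over the three stance classes. A minimal sketch, with a random stand-in for the input vector and the learned weights:

```python
import numpy as np

rng = np.random.default_rng(2)
classes = ["approval", "disapproval", "neutrality"]  # the set C of claim 8

O = rng.normal(size=(64,))               # a coarse-to-fine stance vector (claim 1)
W = rng.normal(size=(64, len(classes)))  # claim 6: linear layer reduces dimensionality

logits = O @ W
probs = np.exp(logits - logits.max())
probs /= probs.sum()                       # claim 5, step 1: softmax over stance classes
label = classes[int(np.argmax(probs))]     # claim 5, step 2: highest-probability class
print(label, probs.round(3))
```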
9. A storage medium having a computer program stored thereon, wherein the computer program is arranged to perform, when run, the method of any one of claims 1-8.
10. An electronic device comprising a memory in which a computer program is stored, and a processor arranged to run the computer program so as to perform the method of any one of claims 1-8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110230676.XA CN113127599B (en) | 2021-03-02 | 2021-03-02 | Question-answering position detection method and device of hierarchical alignment structure |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113127599A true CN113127599A (en) | 2021-07-16 |
CN113127599B CN113127599B (en) | 2022-07-12 |
Family
ID=76772366
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110230676.XA Active CN113127599B (en) | 2021-03-02 | 2021-03-02 | Question-answering position detection method and device of hierarchical alignment structure |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113127599B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109558477A (en) * | 2018-10-23 | 2019-04-02 | 深圳先进技术研究院 | A kind of community's question answering system, method and electronic equipment based on multi-task learning |
CN111581979A (en) * | 2020-05-06 | 2020-08-25 | 西安交通大学 | False news detection system and method based on evidence perception layered interactive attention network |
US20200334334A1 (en) * | 2019-04-18 | 2020-10-22 | Salesforce.Com, Inc. | Systems and methods for unifying question answering and text classification via span extraction |
CN112232058A (en) * | 2020-10-15 | 2021-01-15 | 济南大学 | False news identification method and system based on deep learning three-layer semantic extraction framework |
CN112256861A (en) * | 2020-09-07 | 2021-01-22 | 中国科学院信息工程研究所 | Rumor detection method based on search engine return result and electronic device |
Also Published As
Publication number | Publication date |
---|---|
CN113127599B (en) | 2022-07-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Tan et al. | Deep semantic role labeling with self-attention | |
JP7195365B2 (en) | A Method for Training Convolutional Neural Networks for Image Recognition Using Image Conditional Mask Language Modeling | |
CN112347268A (en) | Text-enhanced knowledge graph joint representation learning method and device | |
CN114565104A (en) | Language model pre-training method, result recommendation method and related device | |
CN111191002A (en) | Neural code searching method and device based on hierarchical embedding | |
CN111291188A (en) | Intelligent information extraction method and system | |
CN116097250A (en) | Layout aware multimodal pre-training for multimodal document understanding | |
CN109614480B (en) | Method and device for generating automatic abstract based on generation type countermeasure network | |
CN115310551A (en) | Text analysis model training method and device, electronic equipment and storage medium | |
CN113742733A (en) | Reading understanding vulnerability event trigger word extraction and vulnerability type identification method and device | |
CN114281931A (en) | Text matching method, device, equipment, medium and computer program product | |
CN116662488A (en) | Service document retrieval method, device, equipment and storage medium | |
CN113705191A (en) | Method, device and equipment for generating sample statement and storage medium | |
Kovvuri et al. | Pirc net: Using proposal indexing, relationships and context for phrase grounding | |
CN111831624A (en) | Data table creating method and device, computer equipment and storage medium | |
CN114330483A (en) | Data processing method, model training method, device, equipment and storage medium | |
Michael et al. | A First Experimental Demonstration of Massive Knowledge Infusion. | |
Peng et al. | MPSC: A multiple-perspective semantics-crossover model for matching sentences | |
CN117112743A (en) | Method, system and storage medium for evaluating answers of text automatic generation questions | |
CN116628162A (en) | Semantic question-answering method, device, equipment and storage medium | |
CN116680407A (en) | Knowledge graph construction method and device | |
CN113127599B (en) | Question-answering position detection method and device of hierarchical alignment structure | |
CN113204679B (en) | Code query model generation method and computer equipment | |
CN115203388A (en) | Machine reading understanding method and device, computer equipment and storage medium | |
CN115455144A (en) | Data enhancement method of completion type space filling type for small sample intention recognition |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||