CN113761140A - Answer ranking method and device - Google Patents

Answer ranking method and device

Info

Publication number
CN113761140A
Authority
CN
China
Prior art keywords
word
answer
neural network
question
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010812316.6A
Other languages
Chinese (zh)
Inventor
胡珅健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Wodong Tianjun Information Technology Co Ltd
Priority to CN202010812316.6A
Publication of CN113761140A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/901 Indexing; Data structures therefor; Storage structures
    • G06F16/9017 Indexing; Data structures therefor; Storage structures using directory or table look-up
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Software Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Human Computer Interaction (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses an answer ranking method and device, and relates to the field of computer technology. One embodiment of the method comprises: acquiring a plurality of groups of question-answer data as training samples, where each group of question-answer data includes a question example sentence and answer example sentences, and the answer example sentences include a positive answer example sentence and a negative answer example sentence; calculating the shallow mixed character-word vector of each character of the question-answer data to train a ranking model, where the ranking model includes a recurrent neural network and a convolutional neural network connected in series; and calculating, with the trained ranking model, the matching probability of each candidate answer corresponding to a target question, thereby ranking the candidate answers. This embodiment can solve the technical problem of inaccurate answer ranking results.

Description

Answer ranking method and device
Technical Field
The invention relates to the field of computer technology, and in particular to an answer ranking method and device.
Background
Answer selection refers to measuring the degree of semantic match between a question and candidate answers using natural language processing or deep learning techniques, and then selecting the most accurate answer from a list of candidates. Natural language processing, combined with machine learning and deep learning, is applied in intelligent medical systems, intelligent question-answering systems, and the like.
In the process of implementing the invention, the inventors found that the prior art has at least the following problems:
Specialized fields (such as Chinese health consultation) contain a large number of technical terms, so out-of-vocabulary words appear frequently, and the question-answer data also contains misspelled characters and similar noise. Existing processing approaches therefore directly degrade the precision of the computed semantic vectors, losing a certain amount of semantic accuracy and making the answer ranking results inaccurate.
Disclosure of Invention
In view of this, embodiments of the present invention provide an answer ranking method and apparatus to solve the technical problem of inaccurate answer ranking results.
To achieve the above object, according to one aspect of the embodiments of the present invention, there is provided an answer ranking method comprising:
acquiring a plurality of groups of question-answer data as training samples, where each group of question-answer data includes a question example sentence and answer example sentences, and the answer example sentences include a positive answer example sentence and a negative answer example sentence;
calculating the shallow mixed character-word vector of each character of the question-answer data to train a ranking model, where the ranking model includes a recurrent neural network and a convolutional neural network connected in series;
and calculating, with the trained ranking model, the matching probability of each candidate answer corresponding to a target question, thereby ranking the candidate answers.
Optionally, calculating the shallow mixed character-word vector of each character of the question-answer data to train a ranking model includes:
calculating the shallow mixed character-word vector of each character of the question-answer data;
and inputting the shallow mixed character-word vectors of the characters of the question-answer data into the ranking model, and training the ranking model using a pairwise method and a maximum-margin algorithm.
Optionally, calculating the shallow mixed character-word vector of each character of the question-answer data includes:
performing word segmentation and word-vector conversion on the question-answer data to obtain the word vector of each word;
performing character segmentation and character-vector conversion on the question-answer data to obtain the character vector of each character;
and, for each character in the question-answer data, calculating the shallow mixed character-word vector of the character from the character vector of the character and the word vector of the word in which the character is located.
Optionally, calculating the shallow mixed character-word vector of the character from the character vector of the character and the word vector of the word in which the character is located includes:
performing, according to preset character-vector and word-vector weights, a weighted summation of the character vector of the character and the word vector of the word in which the character is located, to obtain the shallow mixed character-word vector of the character.
Optionally, the ranking model comprises a first neural network, a second neural network, and a third neural network connected in parallel;
wherein the first neural network comprises a first recurrent neural network and a first convolutional neural network connected in series; the second neural network comprises a second recurrent neural network and a second convolutional neural network connected in series; and the third neural network comprises a third recurrent neural network and a third convolutional neural network connected in series.
Optionally, inputting the shallow mixed character-word vectors of the characters of the question-answer data into the ranking model and training the ranking model using a pairwise method and a maximum-margin algorithm includes:
inputting the shallow mixed character-word vectors of the characters of the positive answer example sentence into the first recurrent neural network and outputting the semantic vector of the positive answer example sentence through the first convolutional neural network; inputting the shallow mixed character-word vectors of the characters of the question example sentence into the second recurrent neural network and outputting the semantic vector of the question example sentence through the second convolutional neural network; and inputting the shallow mixed character-word vectors of the characters of the negative answer example sentence into the third recurrent neural network and outputting the semantic vector of the negative answer example sentence through the third convolutional neural network;
and training the ranking model using a pairwise method and a maximum-margin algorithm.
Optionally, training the ranking model using a pairwise method and a maximum-margin algorithm comprises:
calculating the similarity between the semantic vector of the positive answer example sentence and the semantic vector of the question example sentence to obtain a positive similarity;
calculating the similarity between the semantic vector of the negative answer example sentence and the semantic vector of the question example sentence to obtain a negative similarity;
and training the ranking model based on the positive and negative similarities, using the maximum-margin (hinge) loss as the loss function.
In addition, according to another aspect of the embodiments of the present invention, there is provided an answer ranking apparatus comprising:
an acquisition module for acquiring a plurality of groups of question-answer data as training samples, where each group of question-answer data includes a question example sentence and answer example sentences, and the answer example sentences include a positive answer example sentence and a negative answer example sentence;
a training module for calculating the shallow mixed character-word vector of each character of the question-answer data to train a ranking model, where the ranking model includes a recurrent neural network and a convolutional neural network connected in series;
and a ranking module for calculating, with the trained ranking model, the matching probability of each candidate answer corresponding to a target question, thereby ranking the candidate answers.
Optionally, the training module is further configured to:
calculate the shallow mixed character-word vector of each character of the question-answer data;
and input the shallow mixed character-word vectors of the characters of the question-answer data into the ranking model, and train the ranking model using a pairwise method and a maximum-margin algorithm.
Optionally, the training module is further configured to:
perform word segmentation and word-vector conversion on the question-answer data to obtain the word vector of each word;
perform character segmentation and character-vector conversion on the question-answer data to obtain the character vector of each character;
and, for each character in the question-answer data, calculate the shallow mixed character-word vector of the character from the character vector of the character and the word vector of the word in which the character is located.
Optionally, the training module is further configured to:
perform, according to preset character-vector and word-vector weights, a weighted summation of the character vector of the character and the word vector of the word in which the character is located, to obtain the shallow mixed character-word vector of the character.
Optionally, the ranking model comprises a first neural network, a second neural network, and a third neural network connected in parallel;
wherein the first neural network comprises a first recurrent neural network and a first convolutional neural network connected in series; the second neural network comprises a second recurrent neural network and a second convolutional neural network connected in series; and the third neural network comprises a third recurrent neural network and a third convolutional neural network connected in series.
Optionally, the training module is further configured to:
input the shallow mixed character-word vectors of the characters of the positive answer example sentence into the first recurrent neural network and output the semantic vector of the positive answer example sentence through the first convolutional neural network; input the shallow mixed character-word vectors of the characters of the question example sentence into the second recurrent neural network and output the semantic vector of the question example sentence through the second convolutional neural network; and input the shallow mixed character-word vectors of the characters of the negative answer example sentence into the third recurrent neural network and output the semantic vector of the negative answer example sentence through the third convolutional neural network;
and train the ranking model using a pairwise method and a maximum-margin algorithm.
Optionally, the training module is further configured to:
calculate the similarity between the semantic vector of the positive answer example sentence and the semantic vector of the question example sentence to obtain a positive similarity;
calculate the similarity between the semantic vector of the negative answer example sentence and the semantic vector of the question example sentence to obtain a negative similarity;
and train the ranking model based on the positive and negative similarities, using the maximum-margin (hinge) loss as the loss function.
According to another aspect of the embodiments of the present invention, there is also provided an electronic device, including:
one or more processors;
a storage device for storing one or more programs,
when the one or more programs are executed by the one or more processors, the one or more processors implement the method of any of the embodiments described above.
According to another aspect of the embodiments of the present invention, there is also provided a computer readable medium, on which a computer program is stored, which when executed by a processor implements the method of any of the above embodiments.
One embodiment of the above invention has the following advantage or beneficial effect: because the shallow mixed character-word vector of each character of the question-answer data is calculated to train the ranking model, and the trained ranking model is used to calculate the matching probability of each candidate answer corresponding to the target question so that the candidate answers can be ranked, the technical problem of inaccurate answer ranking results in the prior art is solved. To fully preserve the contextual semantic information of Chinese characters and overcome ambiguity and polysemy, the embodiment of the invention trains the model on the shallow mixed character-word vector of each character (a blend of the single-character vector and the vector of the word containing that character), obtaining semantic vectors in a way that better fits the Chinese natural-language-processing pipeline and thereby improving the accuracy of answer ranking.
Further effects of the above optional implementations are described below in connection with specific embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic diagram illustrating a main flow of an answer ranking method according to an embodiment of the invention;
FIG. 2 is a diagram illustrating the computation of a shallow mixed character-word vector, according to an embodiment of the invention;
FIG. 3 is a schematic structural diagram of a ranking model according to an embodiment of the invention;
FIG. 4 is a schematic diagram illustrating a main flow of an answer ranking method according to a reference embodiment of the present invention;
FIG. 5 is a diagram illustrating major blocks of an answer ranking device according to an embodiment of the present invention;
FIG. 6 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;
FIG. 7 is a schematic block diagram of a computer system suitable for implementing a terminal device or server of an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Currently, candidate answers are generally ranked with semantic matching algorithms, which mainly fall into the following categories:
1) Vector space models based on the bag-of-words theory
The candidate answers are ranked using TF-IDF (term frequency-inverse document frequency) and the segmented semantic vector space model (SVSM). This method uses only shallow semantic algorithms such as TF-IDF, so the extracted semantic information is not accurate enough, and it cannot cope with out-of-vocabulary words, misspelled characters, and the like, losing a certain amount of semantic precision.
2) Calculation methods based on grammatical and syntactic features
A syntactic parsing tool is used to obtain the syntactic structure, features are then extracted from that structure, and the matching degree between question and answer is computed. This method is heavily dependent on, and limited by, the quality of hand-crafted features, and it generalizes poorly across different languages and research domains.
3) Semantic vector calculation based on deep learning
A binary convolutional neural network is applied to the answer selection task: distributed semantic representations of the question and the answer are generated by convolutional networks, and, borrowing ideas from machine translation, a similarity matrix between the question-answer pair is built to compute their matching degree. Deep-learning-based semantic vector calculation can learn deep semantic information and generalizes well. However, there is currently little research on Chinese corpora, especially on question-answer corpus data from specific Chinese domains.
In order to solve the technical problems in the prior art, embodiments of the present invention provide an answer ranking method, which can improve semantic accuracy, thereby accurately ranking candidate answers.
Fig. 1 is a schematic diagram illustrating a main flow of an answer ranking method according to an embodiment of the invention. As an embodiment of the present invention, as shown in fig. 1, the answer ranking method may include:
step 101, a plurality of sets of question and answer data are obtained as training samples.
Optionally, several groups of question-answer data may be obtained from a consultation platform as training samples, and the question-answer data may be preprocessed, for example by stripping punctuation and removing common meaningless stop words (such as the Chinese particles "啊" and "吧").
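By way of illustration, a minimal preprocessing sketch in Python; the stop-word list and punctuation pattern are assumptions, since the patent names only particles such as "啊" and "吧" as examples:

    import re

    STOP_WORDS = {"啊", "吧", "呢", "吗"}  # hypothetical stop-word list

    def preprocess(text: str) -> str:
        # Strip common punctuation, then drop meaningless stop words.
        text = re.sub(r"[，。！？、；：,.!?;:]", "", text)
        return "".join(ch for ch in text if ch not in STOP_WORDS)

    print(preprocess("感冒了啊，吃什么药好呢？"))  # -> 感冒了吃什么药好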
Each group of question-answer data includes a question example sentence and answer example sentences, and the answer example sentences include a positive answer example sentence and a negative answer example sentence. The question-answer data is therefore labeled, marking the question example sentences, the positive answer example sentences, and the negative answer example sentences. In general, each group of question-answer data includes one question example sentence, one positive answer example sentence (the answer example sentence that best matches the question example sentence), and at least one negative answer example sentence.
Step 102, calculating the shallow mixed character-word vectors of the characters of the question-answer data to train a ranking model.
Application scenarios in the Chinese domain differ greatly from English ones, especially in semantic vector calculation. The usual approach is to segment the text with a general-purpose word segmentation tool and then train word vectors on the segmented result; however, Chinese question-answer data contains many technical terms and much noise, and directly segmenting it with a general-purpose tool causes a large performance loss. The embodiment of the invention therefore proposes the shallow mixed character-word vector, which can to some extent overcome out-of-vocabulary words and misspelled characters, compensate for semantic information that word vectors may lose in a Chinese semantic environment, and avoid the ambiguity and semantic vagueness of single characters.
Optionally, step 102 may include: calculating the shallow mixed character-word vector of each character of the question-answer data; and inputting the shallow mixed character-word vectors of the characters of the question-answer data into a ranking model, and training the ranking model using a pairwise method and a maximum-margin algorithm, where the ranking model includes a recurrent neural network and a convolutional neural network connected in series. After the shallow mixed character-word vectors of the characters of the question-answer data have been calculated, training the ranking model with a pairwise method and a maximum-margin algorithm can speed up model training and improve the model's generalization ability.
Optionally, calculating the shallow mixed character-word vector of each character of the question-answer data includes: performing word segmentation and word-vector conversion on the question-answer data to obtain the word vector of each word; performing character segmentation and character-vector conversion on the question-answer data to obtain the character vector of each character; and, for each character in the question-answer data, calculating the shallow mixed character-word vector of the character from the character vector of the character and the word vector of the word in which the character is located. For example, a general-purpose segmentation tool such as jieba may be used to split the question-answer data into words and characters, and a vector training tool (e.g., Word2Vec) may then be used to convert each word and each character into a vector.
Specifically, a word-vector lookup table can be obtained by training with the Word2Vec tool, and the required word vector is then looked up in that table; likewise, a character-vector lookup table can be trained with Word2Vec and the required character vector looked up in it. For a sentence (a question example sentence or an answer example sentence), the input is sentence = (x_1, x_2, x_3, ..., x_n), where x_i denotes the i-th character of the input sentence. Taking the character x_i of the input sentence as the atomic semantic unit, the character vector x_ic corresponding to x_i is looked up in the character-vector lookup table, and the word vector x_iw of the word that contains x_i in the sentence is looked up in the word-vector lookup table.
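As a concrete illustration, the two lookup tables can be built with jieba and gensim; this sketch assumes gensim 4.x and a toy corpus, and the vector dimensionality is illustrative only:

    import jieba
    from gensim.models import Word2Vec  # assumes gensim 4.x

    corpus = ["感冒了吃什么药好", "多喝水注意休息"]  # toy question-answer sentences

    word_sentences = [jieba.lcut(s) for s in corpus]  # word segmentation (词)
    char_sentences = [list(s) for s in corpus]        # character segmentation (字)

    # One lookup table for words and one for characters.
    word_table = Word2Vec(word_sentences, vector_size=100, min_count=1).wv
    char_table = Word2Vec(char_sentences, vector_size=100, min_count=1).wv

    x_iw = word_table["感冒"]  # word vector of the word containing a character
    x_ic = char_table["感"]    # character vector of the character itself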
Optionally, calculating the shallow mixed character-word vector of a character from the character vector of the character and the word vector of the word in which the character is located includes: performing, according to preset character-vector and word-vector weights, a weighted summation of the character vector of the character and the word vector of the word in which the character is located, to obtain the shallow mixed character-word vector of the character. The embodiment of the invention blends x_ic and x_iw and takes the result as the representation of the character x_i: the character-vector weight and the word-vector weight are set separately, and the weighted sum of the character vector and the word vector of the containing word yields the shallow mixed character-word vector of the character.
For example, if the character-vector weight and the word-vector weight are both 0.5, the shallow mixed character-word vector of the character x_i is:
v(x_i) = 0.5 * x_ic + 0.5 * x_iw
As another example, the character-vector weight and the word-vector weight may be 1/3 and 2/3, or 2/3 and 1/3, respectively, and so on; these cases are not described again.
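Putting the two lookup tables together, a minimal sketch of the blend; out-of-vocabulary handling is omitted and the helper name is hypothetical:

    import numpy as np
    import jieba

    def shallow_mixed_vectors(sentence, char_table, word_table, w_c=0.5, w_w=0.5):
        # For each character x_i, blend its character vector x_ic with the word
        # vector x_iw of the word containing it: v(x_i) = w_c*x_ic + w_w*x_iw.
        rows = []
        for word in jieba.lcut(sentence):
            for ch in word:
                rows.append(w_c * char_table[ch] + w_w * word_table[word])
        return np.stack(rows)  # shape: (number of characters, embedding dim)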
Optionally, as shown in fig. 3, the ranking model comprises a first neural network, a second neural network, and a third neural network connected in parallel, where the first neural network comprises a first recurrent neural network and a first convolutional neural network connected in series; the second neural network comprises a second recurrent neural network and a second convolutional neural network connected in series; and the third neural network comprises a third recurrent neural network and a third convolutional neural network connected in series. Optionally, inputting the shallow mixed character-word vectors of the characters of the question-answer data into the ranking model and training the ranking model using a pairwise method and a maximum-margin algorithm includes: inputting the shallow mixed character-word vectors of the characters of the positive answer example sentence into the first recurrent neural network and outputting the semantic vector of the positive answer example sentence through the first convolutional neural network; inputting the shallow mixed character-word vectors of the characters of the question example sentence into the second recurrent neural network and outputting the semantic vector of the question example sentence through the second convolutional neural network; inputting the shallow mixed character-word vectors of the characters of the negative answer example sentence into the third recurrent neural network and outputting the semantic vector of the negative answer example sentence through the third convolutional neural network; and training the ranking model using a pairwise method and a maximum-margin algorithm.
After the shallow mixed character-word vector of each character x_i is obtained, a bidirectional recurrent neural network is used to learn word-order features. To obtain finer-grained latent local feature representations within a sentence, the embodiment of the invention feeds the output of the recurrent-network layer into a convolutional neural network. In natural language processing, the convolutional layer takes a convolution window as its unit, concatenates the vectors of the consecutive words in the window, and maps the resulting semantic matrix through a mapping function into a new local feature vector, which amounts to a deeper abstraction of the semantics. Down-sampling (pooling) after the convolution operation yields fixed-size sentence feature vectors, namely the positive-example semantic representation, the question semantic representation, and the negative-example semantic representation.
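The patent does not specify layer sizes or the exact recurrent-network variant; the following PyTorch sketch of a single branch therefore assumes a bidirectional LSTM, a convolution window of 3, and max-pooling for the down-sampling step:

    import torch
    import torch.nn as nn

    class Branch(nn.Module):
        # One parallel branch of the ranking model: a bidirectional recurrent
        # network connected in series with a convolutional network.
        def __init__(self, emb_dim=100, hidden=128, filters=200, window=3):
            super().__init__()
            self.rnn = nn.LSTM(emb_dim, hidden, bidirectional=True, batch_first=True)
            self.conv = nn.Conv1d(2 * hidden, filters, kernel_size=window, padding=1)

        def forward(self, x):                  # x: (batch, seq_len, emb_dim)
            h, _ = self.rnn(x)                 # (batch, seq_len, 2*hidden)
            h = torch.relu(self.conv(h.transpose(1, 2)))  # (batch, filters, seq_len)
            return h.max(dim=2).values         # fixed-size sentence vector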
Optionally, training the ranking model using a pairwise method and a maximum-margin algorithm comprises: calculating the similarity between the semantic vector of the positive answer example sentence and the semantic vector of the question example sentence to obtain a positive similarity; calculating the similarity between the semantic vector of the negative answer example sentence and the semantic vector of the question example sentence to obtain a negative similarity; and training the ranking model based on the positive and negative similarities, using the maximum-margin (hinge) loss as the loss function. After the semantic representations of the question-answer pair are obtained, the cosine of the angle between them in vector space is used to measure their semantic similarity (i.e., their semantic matching degree). Because question-answer pairs have a pairwise character, under a semantic-level similarity calculation, deciding whether a question matches a candidate answer requires comparing that answer against other candidates through a corresponding strategy, to better determine which one matches best. Embodiments of the present invention therefore use a pairwise method and a maximum-margin algorithm (such as the hinge loss function) to train the model in a supervised manner.
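A sketch of this pairwise maximum-margin objective; the margin value is an assumption, since the patent does not state one:

    import torch
    import torch.nn.functional as F

    def pairwise_hinge_loss(q, pos, neg, margin=0.2):
        pos_sim = F.cosine_similarity(q, pos)  # positive similarity
        neg_sim = F.cosine_similarity(q, neg)  # negative similarity
        # Require the positive answer to beat the negative one by `margin`.
        return torch.clamp(margin - pos_sim + neg_sim, min=0).mean()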
Step 103, calculating, with the trained ranking model, the matching probability of each candidate answer corresponding to the target question, thereby ranking the candidate answers.
After the ranking model has been trained, the target question and its candidate answers are input into the model to obtain the matching probability of each candidate answer; the candidate answers are then sorted in descending order of matching probability, and the candidate answer with the highest matching probability is taken as the positive answer, i.e., the best answer.
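An inference-time sketch under the same assumptions, where the cosine score stands in for the matching probability; encode_question and encode_answer denote the trained question and answer branches and are hypothetical names:

    import torch.nn.functional as F

    def rank_candidates(question, candidates, encode_question, encode_answer):
        q_vec = encode_question(question)
        scored = [(F.cosine_similarity(q_vec, encode_answer(a), dim=-1).item(), a)
                  for a in candidates]
        scored.sort(key=lambda t: t[0], reverse=True)  # descending match score
        return scored  # scored[0][1] is the best (positive) answer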
From the above embodiments it can be seen that the technical means of calculating the shallow mixed character-word vector of each character of the question-answer data to train a ranking model, and of using the trained ranking model to calculate the matching probability of each candidate answer corresponding to the target question so as to rank the candidate answers, solves the technical problem of inaccurate answer ranking results in the prior art. To fully preserve the contextual semantic information of Chinese characters and overcome ambiguity and polysemy, the embodiment of the invention trains the model on the shallow mixed character-word vector of each character (a blend of the single-character vector and the vector of the word containing that character), obtaining semantic vectors in a way that better fits the Chinese natural-language-processing pipeline and thereby improving the accuracy of answer ranking.
FIG. 4 is a schematic diagram illustrating a main flow of an answer ranking method according to a reference embodiment of the present invention. As another embodiment of the present invention, as shown in fig. 4, the answer ranking method may include:
step 401, a plurality of sets of question and answer data are obtained as training samples.
Several groups of question-answer data can be obtained from a consultation platform as training samples, and the question-answer data is preprocessed, for example by stripping punctuation and removing common meaningless stop words. Each group of question-answer data includes a question example sentence and answer example sentences, and the answer example sentences include a positive answer example sentence and a negative answer example sentence.
Step 402, performing word segmentation and word-vector conversion on the question-answer data to obtain the word vector of each word.
A general-purpose segmentation tool such as jieba can be used to segment the question-answer data into words, and the Word2Vec tool can be used to train a word-vector lookup table, from which the required word vectors are then retrieved.
Step 403, performing character segmentation and character-vector conversion on the question-answer data to obtain the character vector of each character.
The question-answer data can likewise be split into individual characters, and the Word2Vec tool can be used to train a character-vector lookup table, from which the required character vectors are then retrieved.
Step 404, for each character in the question-answer data, calculating the shallow mixed character-word vector of the character from the character vector of the character and the word vector of the word in which the character is located.
For a sentence (a question example sentence or an answer example sentence), the input is sentence = (x_1, x_2, x_3, ..., x_n), where x_i denotes the i-th character of the input sentence. Taking the character x_i as the atomic semantic unit, the character vector x_ic corresponding to x_i is looked up in the character-vector lookup table, and the word vector x_iw of the word that contains x_i in the sentence is looked up in the word-vector lookup table.
The character vector of each character and the word vector of the word in which it is located can be weighted and summed according to preset character-vector and word-vector weights to obtain the shallow mixed character-word vector of the character. Assuming the character-vector weight and the word-vector weight are both 0.5, the shallow mixed character-word vector of the character x_i is:
v(x_i) = 0.5 * x_ic + 0.5 * x_iw
step 405, inputting the shallow layer mixed word vector of each word of the question-answer data into a sequencing model, and training the sequencing model by adopting a pairing method and a maximum interval algorithm.
The ranking model comprises a first neural network, a second neural network, and a third neural network connected in parallel, where the first neural network comprises a first recurrent neural network and a first convolutional neural network connected in series; the second neural network comprises a second recurrent neural network and a second convolutional neural network connected in series; and the third neural network comprises a third recurrent neural network and a third convolutional neural network connected in series. When the model is trained, the shallow mixed character-word vectors of the characters of the positive answer example sentence are input into the first recurrent neural network, and the semantic vector of the positive answer example sentence is output through the first convolutional neural network; the shallow mixed character-word vectors of the characters of the question example sentence are input into the second recurrent neural network, and the semantic vector of the question example sentence is output through the second convolutional neural network; the shallow mixed character-word vectors of the characters of the negative answer example sentence are input into the third recurrent neural network, and the semantic vector of the negative answer example sentence is output through the third convolutional neural network; and the ranking model is trained using a pairwise method and a maximum-margin algorithm.
Optionally, training the ranking model using a pairwise method and a maximum-margin algorithm comprises: calculating the similarity between the semantic vector of the positive answer example sentence and the semantic vector of the question example sentence to obtain a positive similarity; calculating the similarity between the semantic vector of the negative answer example sentence and the semantic vector of the question example sentence to obtain a negative similarity; and training the ranking model based on the positive and negative similarities, using the maximum-margin (hinge) loss as the loss function.
Step 406, calculating, with the trained ranking model, the matching probability of each candidate answer corresponding to the target question, thereby ranking the candidate answers.
After the ranking model has been trained, the target question and its candidate answers are input into the model to obtain the matching probability of each candidate answer; the candidate answers are then sorted in descending order of matching probability, and the candidate answer with the highest matching probability is taken as the positive answer, i.e., the best answer.
In addition, the detailed implementation of the answer ranking method has already been described above, so the repeated content is not described again here.
Fig. 5 is a schematic diagram of the main modules of an answer ranking apparatus according to an embodiment of the invention. As shown in fig. 5, the answer ranking apparatus 500 includes an acquisition module 501, a training module 502, and a ranking module 503. The acquisition module 501 is configured to acquire several groups of question-answer data as training samples, where each group of question-answer data includes a question example sentence and answer example sentences, and the answer example sentences include a positive answer example sentence and a negative answer example sentence. The training module 502 is configured to calculate the shallow mixed character-word vector of each character of the question-answer data to train a ranking model, where the ranking model includes a recurrent neural network and a convolutional neural network connected in series. The ranking module 503 is configured to calculate, with the trained ranking model, the matching probability of each candidate answer corresponding to a target question, so as to rank the candidate answers.
Optionally, the training module 502 is further configured to:
calculate the shallow mixed character-word vector of each character of the question-answer data;
and input the shallow mixed character-word vectors of the characters of the question-answer data into the ranking model, and train the ranking model using a pairwise method and a maximum-margin algorithm.
Optionally, the training module 502 is further configured to:
perform word segmentation and word-vector conversion on the question-answer data to obtain the word vector of each word;
perform character segmentation and character-vector conversion on the question-answer data to obtain the character vector of each character;
and, for each character in the question-answer data, calculate the shallow mixed character-word vector of the character from the character vector of the character and the word vector of the word in which the character is located.
Optionally, the training module 502 is further configured to:
perform, according to preset character-vector and word-vector weights, a weighted summation of the character vector of the character and the word vector of the word in which the character is located, to obtain the shallow mixed character-word vector of the character.
Optionally, the ranking model comprises a first neural network, a second neural network, and a third neural network connected in parallel;
wherein the first neural network comprises a first recurrent neural network and a first convolutional neural network connected in series; the second neural network comprises a second recurrent neural network and a second convolutional neural network connected in series; and the third neural network comprises a third recurrent neural network and a third convolutional neural network connected in series.
Optionally, the training module 502 is further configured to:
input the shallow mixed character-word vectors of the characters of the positive answer example sentence into the first recurrent neural network and output the semantic vector of the positive answer example sentence through the first convolutional neural network; input the shallow mixed character-word vectors of the characters of the question example sentence into the second recurrent neural network and output the semantic vector of the question example sentence through the second convolutional neural network; and input the shallow mixed character-word vectors of the characters of the negative answer example sentence into the third recurrent neural network and output the semantic vector of the negative answer example sentence through the third convolutional neural network;
and train the ranking model using a pairwise method and a maximum-margin algorithm.
Optionally, the training module 502 is further configured to:
calculate the similarity between the semantic vector of the positive answer example sentence and the semantic vector of the question example sentence to obtain a positive similarity;
calculate the similarity between the semantic vector of the negative answer example sentence and the semantic vector of the question example sentence to obtain a negative similarity;
and train the ranking model based on the positive and negative similarities, using the maximum-margin (hinge) loss as the loss function.
From the above embodiments it can be seen that the technical means of calculating the shallow mixed character-word vector of each character of the question-answer data to train a ranking model, and of using the trained ranking model to calculate the matching probability of each candidate answer corresponding to the target question so as to rank the candidate answers, solves the technical problem of inaccurate answer ranking results in the prior art. To fully preserve the contextual semantic information of Chinese characters and overcome ambiguity and polysemy, the embodiment of the invention trains the model on the shallow mixed character-word vector of each character (a blend of the single-character vector and the vector of the word containing that character), obtaining semantic vectors in a way that better fits the Chinese natural-language-processing pipeline and thereby improving the accuracy of answer ranking.
It should be noted that the answer ranking method has already been described in detail above, so the repeated content is not described again in this embodiment of the answer ranking apparatus of the present invention.
Fig. 6 illustrates an exemplary system architecture 600 to which the answer ranking method or answer ranking apparatus of embodiments of the invention may be applied.
As shown in fig. 6, the system architecture 600 may include terminal devices 601, 602, 603, a network 604, and a server 605. The network 604 serves to provide a medium for communication links between the terminal devices 601, 602, 603 and the server 605. Network 604 may include various types of connections, such as wire, wireless communication links, or fiber optic cables, to name a few.
A user may use the terminal devices 601, 602, 603 to interact with the server 605 via the network 604 to receive or send messages or the like. The terminal devices 601, 602, 603 may have installed thereon various communication client applications, such as shopping applications, web browser applications, search applications, instant messaging tools, mailbox clients, social platform software, etc. (by way of example only).
The terminal devices 601, 602, 603 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 605 may be a server providing various services, such as a background management server (for example only) providing support for shopping websites browsed by users using the terminal devices 601, 602, 603. The background management server may analyze and otherwise process the received data such as the item information query request, and feed back a processing result (for example, target push information, item information — just an example) to the terminal device.
It should be noted that the answer ranking method provided by the embodiment of the present invention is generally executed by the server 605, and accordingly, the answer ranking device is generally disposed in the server 605. The answer ranking method provided by the embodiment of the present invention may also be executed by the terminal devices 601, 602, and 603, and accordingly, the answer ranking apparatus may be disposed in the terminal devices 601, 602, and 603.
It should be understood that the number of terminal devices, networks, and servers in fig. 6 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 7, shown is a block diagram of a computer system 700 suitable for use with a terminal device implementing an embodiment of the present invention. The terminal device shown in fig. 7 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 7, the computer system 700 includes a Central Processing Unit (CPU) 701, which can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 702 or a program loaded from a storage section 708 into a Random Access Memory (RAM) 703. The RAM 703 also stores various programs and data necessary for the operation of the system 700. The CPU 701, the ROM 702, and the RAM 703 are connected to each other via a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.
The following components are connected to the I/O interface 705: an input portion 706 including a keyboard, a mouse, and the like; an output section 707 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage section 708 including a hard disk and the like; and a communication section 709 including a network interface card such as a LAN card, a modem, or the like. The communication section 709 performs communication processing via a network such as the internet. A drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 710 as necessary, so that a computer program read out therefrom is mounted into the storage section 708 as necessary.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer readable medium, the computer program containing program code for performing the method illustrated in the flowchart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 709, and/or installed from the removable medium 711. The computer program, when executed by the Central Processing Unit (CPU) 701, performs the above-described functions defined in the system of the present invention.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer programs according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor includes an acquisition module, a training module, and a ranking module, where the names of the modules do not in some cases constitute a limitation on the modules themselves.
As another aspect, the present invention also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments or may exist separately without being assembled into the apparatus. The computer-readable medium carries one or more programs which, when executed by a device, cause the device to implement the following method: acquiring a plurality of groups of question-answer data as training samples, where each group of question-answer data includes a question example sentence and answer example sentences, and the answer example sentences include a positive answer example sentence and a negative answer example sentence; calculating the shallow mixed character-word vector of each character of the question-answer data to train a ranking model, where the ranking model includes a recurrent neural network and a convolutional neural network connected in series; and calculating, with the trained ranking model, the matching probability of each candidate answer corresponding to a target question, thereby ranking the candidate answers.
According to the technical solution of the embodiments of the present invention, the technical means of calculating the shallow mixed character-word vector of each character of the question-answer data to train a ranking model, and of using the trained ranking model to calculate the matching probability of each candidate answer corresponding to the target question so as to rank the candidate answers, solves the technical problem of inaccurate answer ranking results in the prior art. To fully preserve the contextual semantic information of Chinese characters and overcome ambiguity and polysemy, the embodiment of the invention trains the model on the shallow mixed character-word vector of each character (a blend of the single-character vector and the vector of the word containing that character), obtaining semantic vectors in a way that better fits the Chinese natural-language-processing pipeline and thereby improving the accuracy of answer ranking.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. An answer ranking method, comprising:
acquiring a plurality of groups of question and answer data as training samples; each group of question-answer data comprises question example sentences and answer example sentences, and the answer example sentences comprise positive answer example sentences and negative answer example sentences;
calculating shallow layer mixed word and word vectors of all words of the question-answer data to train a sequencing model; wherein the ranking model comprises a recurrent neural network and a convolutional neural network in series;
and calculating the matching probability of each candidate answer corresponding to the target problem by adopting the trained ranking model, thereby ranking each candidate answer.
2. The method of claim 1, wherein computing shallow mixed word vectors for each word of the question-answer data to train a ranking model comprises:
calculating shallow mixed word vectors of each word of the question-answering data;
and inputting the shallow layer mixed word vector of each word of the question-answer data into a sequencing model, and training the sequencing model by adopting a pairing method and a maximum interval algorithm.
3. The method of claim 2, wherein calculating the shallow mixed character-word vector of each character of the question-answer data comprises:
performing word segmentation and word-vector conversion on the question-answer data to obtain a word vector for each word;
performing character segmentation and character-vector conversion on the question-answer data to obtain a character vector for each character; and
for each character in the question-answer data, calculating the shallow mixed character-word vector of the character from the character vector of the character and the word vector of the word in which the character is located.
4. The method of claim 3, wherein calculating the shallow mixed character-word vector of the character from the character vector of the character and the word vector of the word in which the character is located comprises:
performing, according to a preset character-vector weight and a preset word-vector weight, a weighted summation of the character vector of the character and the word vector of the word in which the character is located, to obtain the shallow mixed character-word vector of the character.
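Written as a formula (the symbols are illustrative: alpha and beta denote the preset character-vector and word-vector weights, and w(c) the word in which character c is located):

    \[
      v_{\mathrm{mix}}(c) \;=\; \alpha\, v_{\mathrm{char}}(c) \;+\; \beta\, v_{\mathrm{word}}\bigl(w(c)\bigr)
    \]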
5. The method of claim 2, wherein the ranking model comprises a first neural network, a second neural network, and a third neural network in parallel;
wherein the first neural network comprises a first recurrent neural network and a first convolutional neural network connected in series; the second neural network comprises a second recurrent neural network and a second convolutional neural network connected in series; and the third neural network comprises a third recurrent neural network and a third convolutional neural network connected in series.
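A minimal PyTorch sketch of the structure recited in claim 5 follows; the choice of a bidirectional GRU as the recurrent network, the layer dimensions, and max-pooling the convolution output into a single semantic vector are assumptions, since the claim only requires a recurrent network followed in series by a convolutional network:

    import torch
    import torch.nn as nn

    class Branch(nn.Module):
        # One branch: a recurrent neural network followed in series by a
        # convolutional neural network.
        def __init__(self, emb_dim=200, hidden=128, filters=128, kernel=3):
            super().__init__()
            self.rnn = nn.GRU(emb_dim, hidden, batch_first=True,
                              bidirectional=True)
            self.conv = nn.Conv1d(2 * hidden, filters, kernel_size=kernel)

        def forward(self, x):                 # x: (batch, seq_len, emb_dim)
            h, _ = self.rnn(x)                # (batch, seq_len, 2 * hidden)
            h = self.conv(h.transpose(1, 2))  # (batch, filters, seq_len')
            return h.max(dim=2).values        # pooled semantic vector

    class RankingModel(nn.Module):
        # First, second and third neural networks in parallel: one branch
        # for the positive answer, one for the question, one for the
        # negative answer.
        def __init__(self, **kwargs):
            super().__init__()
            self.pos_branch = Branch(**kwargs)
            self.q_branch = Branch(**kwargs)
            self.neg_branch = Branch(**kwargs)

        def forward(self, pos, q, neg):
            return self.pos_branch(pos), self.q_branch(q), self.neg_branch(neg)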
6. The method of claim 5, wherein inputting the shallow mixed character-word vectors of the characters of the question-answer data into the ranking model and training the ranking model using a pairwise method and a maximum-margin algorithm comprises:
inputting the shallow mixed character-word vectors of the characters of the positive answer example sentence into the first recurrent neural network, and outputting the semantic vector of the positive answer example sentence through the first convolutional neural network; inputting the shallow mixed character-word vectors of the characters of the question example sentence into the second recurrent neural network, and outputting the semantic vector of the question example sentence through the second convolutional neural network; and inputting the shallow mixed character-word vectors of the characters of the negative answer example sentence into the third recurrent neural network, and outputting the semantic vector of the negative answer example sentence through the third convolutional neural network; and
training the ranking model using the pairwise method and the maximum-margin algorithm.
7. The method of claim 6, wherein training the ranking model using the pairwise method and the maximum-margin algorithm comprises:
calculating the similarity between the semantic vector of the positive answer example sentence and the semantic vector of the question example sentence to obtain a positive similarity;
calculating the similarity between the semantic vector of the negative answer example sentence and the semantic vector of the question example sentence to obtain a negative similarity; and
training the ranking model based on the positive similarity and the negative similarity, using the maximum-margin algorithm as the loss function.
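For illustration, the training of claims 6 and 7 could use a hinge-style maximum-margin loss over the two similarities; cosine similarity and the margin value of 0.5 are assumptions here, as the claims fix neither:

    import torch
    import torch.nn.functional as F

    def max_margin_loss(q_vec, pos_vec, neg_vec, margin=0.5):
        # Positive similarity: question vs. positive answer example sentence.
        pos_sim = F.cosine_similarity(q_vec, pos_vec)
        # Negative similarity: question vs. negative answer example sentence.
        neg_sim = F.cosine_similarity(q_vec, neg_vec)
        # Hinge: the positive similarity should exceed the negative
        # similarity by at least `margin`; otherwise a loss is incurred.
        return torch.clamp(margin - pos_sim + neg_sim, min=0).mean()

A training step would then obtain the three semantic vectors from the three parallel branches, e.g. pos_vec, q_vec, neg_vec = model(pos_in, q_in, neg_in), and back-propagate this loss.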
8. An answer ranking apparatus, comprising:
an acquisition module for acquiring a plurality of groups of question-answer data as training samples; wherein each group of question-answer data comprises a question example sentence and answer example sentences, and the answer example sentences comprise a positive answer example sentence and a negative answer example sentence;
a training module for calculating a shallow mixed character-word vector for each character of the question-answer data to train a ranking model; wherein the ranking model comprises a recurrent neural network and a convolutional neural network connected in series; and
a ranking module for calculating, with the trained ranking model, a matching probability of each candidate answer corresponding to a target question, so as to rank the candidate answers.
9. An electronic device, comprising:
one or more processors;
a storage device storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-7.
10. A computer-readable medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the method of any one of claims 1-7.
CN202010812316.6A 2020-08-13 2020-08-13 Answer sorting method and device Pending CN113761140A (en)

Priority Applications (1)

Application Number   Priority Date  Filing Date  Title
CN202010812316.6A    2020-08-13     2020-08-13   Answer sorting method and device

Publications (1)

Publication Number  Publication Date
CN113761140A        2021-12-07

Family

ID=78785622

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104615767A (en) * 2015-02-15 2015-05-13 百度在线网络技术(北京)有限公司 Searching-ranking model training method and device and search processing method
CN106503056A (en) * 2016-09-27 2017-03-15 北京百度网讯科技有限公司 Generation method and device that Search Results based on artificial intelligence are made a summary
CN108536679A (en) * 2018-04-13 2018-09-14 腾讯科技(成都)有限公司 Name entity recognition method, device, equipment and computer readable storage medium
CN109408627A (en) * 2018-11-15 2019-03-01 众安信息技术服务有限公司 A kind of answering method and system merging convolutional neural networks and Recognition with Recurrent Neural Network
CN109635083A (en) * 2018-11-27 2019-04-16 北京科技大学 It is a kind of for search for TED speech in topic formula inquiry document retrieval method
CN109992788A (en) * 2019-04-10 2019-07-09 北京神州泰岳软件股份有限公司 Depth text matching technique and device based on unregistered word processing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination