CN113159187B - Classification model training method and device and target text determining method and device - Google Patents


Info

Publication number
CN113159187B
CN113159187B · Application CN202110442474.1A
Authority
CN
China
Prior art keywords
sample
text
target
training
question
Prior art date
Legal status
Active
Application number
CN202110442474.1A
Other languages
Chinese (zh)
Other versions
CN113159187A (en)
Inventor
戴淑敏
李长亮
李小龙
Current Assignee
Beijing Kingsoft Digital Entertainment Co Ltd
Original Assignee
Beijing Kingsoft Digital Entertainment Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Kingsoft Digital Entertainment Co Ltd filed Critical Beijing Kingsoft Digital Entertainment Co Ltd
Priority to CN202110442474.1A priority Critical patent/CN113159187B/en
Publication of CN113159187A publication Critical patent/CN113159187A/en
Application granted granted Critical
Publication of CN113159187B publication Critical patent/CN113159187B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/24 — Pattern recognition; Analysing; Classification techniques
    • G06F16/335 — Information retrieval of unstructured textual data; Querying; Filtering based on additional data, e.g. user or group profiles
    • G06F16/35 — Information retrieval of unstructured textual data; Clustering; Classification
    • G06F18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F40/30 — Handling natural language data; Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The target text determination method and apparatus provided by the application include: acquiring a target question, and inputting the target question into a search database to obtain at least one initial text corresponding to the target question; and inputting the target question and the at least one initial text into a classification model to obtain the probability that each initial text contains the target answer corresponding to the target question. Specifically, the target text determination method provides a two-stage text search strategy: first, the target question is input into the search database, which returns a plurality of initial texts corresponding to the target question, achieving the coarse text recall of the first stage; then, the initial texts recalled in the first stage are further screened by a pre-trained classification model, so that the target text most relevant to the target question is selected more accurately from the plurality of initial texts.

Description

Classification model training method and device and target text determining method and device
Technical Field
The present application relates to the field of artificial intelligence, and in particular to a classification model training method and apparatus, a target text determination method and apparatus, a computing device, and a computer-readable storage medium.
Background
In the technical field of information retrieval, common text recall methods mainly include text-matching recall, tag recall, and semantic recall. Text-matching recall matches the most relevant text (Doc) in a corpus based on keywords in the user's question statement (Query), using the term frequency-inverse document frequency (TF-IDF) statistical analysis method over those keywords. Tag recall matches the most relevant recall text according to the tags of the texts in the corpus. Semantic recall finds the text most relevant to the question statement through semantic similarity computation; the common form is representation-based semantic matching, in which the user's question statement and each text are separately encoded as semantic vectors, and matching recall is performed by computing the semantic similarity between the vector of the question statement and the vector of each text.
However, the semantic vectors learned in this semantic matching recall mode have limitations: there is no interaction between the question statement and the recalled text, and no context information is considered, so the matching accuracy is not high. How to improve the matching precision between the question statement and the recalled text has therefore become an urgent problem to be solved.
Disclosure of Invention
In view of the above, embodiments of the present application provide a method and apparatus for training a classification model, a method and apparatus for determining a target text, a computing device, and a computer-readable storage medium, so as to solve the technical defects in the prior art.
According to a first aspect of an embodiment of the present application, there is provided a classification model training method, comprising:
acquiring a training data set, wherein the training data set comprises sample questions and sample answers corresponding to the sample questions;
constructing a training sample corresponding to the sample question based on the sample question and a sample text of the sample question obtained from a search database; and
training the classification model based on the training sample and a sample label corresponding to the training sample, to obtain the trained classification model.
According to a second aspect of the embodiment of the present application, there is provided a target text determining method, including:
acquiring a target question, and inputting the target question into a search database to obtain at least one initial text corresponding to the target question;
inputting the target question and the at least one initial text into a classification model to obtain the probability that the at least one initial text contains a target answer corresponding to the target question, wherein the classification model is obtained by the classification model training method;
and determining a target text containing a target answer corresponding to the target question from the at least one initial text based on the probability.
According to a third aspect of an embodiment of the present application, there is provided a classification model training apparatus, including:
the training data acquisition module is configured to acquire a training data set, wherein the training data set comprises a sample question and a sample answer corresponding to the sample question;
a training sample construction module configured to construct a training sample corresponding to the sample question based on the sample question and a sample text of the sample question obtained from a search database;
and the model training module is configured to train the classification model based on the training sample and the sample label corresponding to the training sample to obtain the classification model.
According to a fourth aspect of an embodiment of the present application, there is provided a target text determining apparatus including:
the question acquisition module is configured to acquire a target question, and input the target question into the search database to obtain at least one initial text corresponding to the target question;
the probability obtaining module is configured to input the target question and the at least one initial text into a classification model to obtain the probability that the at least one initial text contains a target answer corresponding to the target question, wherein the classification model is obtained by the classification model training method;
and the text determining module is configured to determine, from the at least one initial text and based on the probability, a target text containing a target answer corresponding to the target question.
According to a fifth aspect of embodiments of the present application, there is provided a computing device comprising a memory, a processor, and computer instructions stored on the memory and executable on the processor, wherein the processor, when executing the computer instructions, implements the steps of the classification model training method or the steps of the target text determination method.
According to a sixth aspect of embodiments of the present application, there is provided a computer readable storage medium storing computer instructions which, when executed by a processor, implement the steps of the classification model training method or the steps of the target text determination method.
The target text determination method provided by the application includes: acquiring a target question, and inputting the target question into a search database to obtain at least one initial text corresponding to the target question; inputting the target question and the at least one initial text into a classification model to obtain the probability that the at least one initial text contains a target answer corresponding to the target question, wherein the classification model is obtained by the classification model training method; and determining, from the at least one initial text and based on the probability, a target text containing a target answer corresponding to the target question. Specifically, the target text determination method provides a two-stage text search strategy: first, the target question is input into the search database, which returns a plurality of initial texts corresponding to the target question, achieving the coarse text recall of the first stage; then, the initial texts recalled in the first stage are further screened by a pre-trained classification model, so that the target text most relevant to the target question is selected more accurately from the plurality of initial texts.
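The two-stage strategy described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the patented implementation: `coarse_recall` stands in for the search database with a simple keyword-overlap retriever, and the `score_fn` passed to `rerank` stands in for the trained classification model; all names are hypothetical.

```python
def coarse_recall(question, corpus, k=3):
    """Stage 1 (coarse recall): return up to k texts sharing the most
    keywords with the question, mimicking a keyword-based search database."""
    q_words = set(question.lower().split())
    scored = [(len(q_words & set(doc.lower().split())), doc) for doc in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]


def rerank(question, candidates, score_fn):
    """Stage 2 (screening): score each (question, candidate) pair with the
    classification-model stand-in and return the most probable target text."""
    return max(candidates, key=lambda c: score_fn(question, c))
```

With a toy corpus and a word-overlap `score_fn`, `rerank` picks the candidate the second stage judges most likely to contain the answer.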
In addition, when the initial texts recalled in the first stage are screened by the classification model, the target question is spliced with each initial text and the spliced text is input into the classification model. The classification model computes the similarity between the word vector at each position of the spliced text and every other word vector in the text, which amounts to pairwise interactive computation among all word vectors in the spliced text, so that each word vector references the features of the word vectors at all surrounding positions. The context information of the spliced text is thereby taken into account, the matching precision between the target question and the initial text is improved, and the target text corresponding to the target question can be obtained more accurately.
Drawings
FIG. 1 is a block diagram of a computing device provided by an embodiment of the present application;
FIG. 2 is a flowchart of a training method of a semantic matching model according to an embodiment of the present application;
FIG. 3 is another flowchart of a training method of a semantic matching model according to an embodiment of the present application;
FIG. 4 is another flowchart of a training method of a semantic matching model according to an embodiment of the present application;
FIG. 5 is another flowchart of a training method of a semantic matching model according to an embodiment of the present application.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present application. The present application may, however, be embodied in many forms other than those described herein, and those skilled in the art can make similar modifications without departing from its substance; the present application is therefore not limited to the specific embodiments disclosed below.
The terminology used in the one or more embodiments of the specification is for the purpose of describing particular embodiments only and is not intended to be limiting of the one or more embodiments of the specification. As used in this specification, one or more embodiments and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present specification refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that, although the terms first, second, etc. may be used in one or more embodiments of this specification to describe various information, these information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, a first may also be referred to as a second, and similarly, a second may also be referred to as a first, without departing from the scope of one or more embodiments of the present description.
First, terms involved in one or more embodiments of the present application are explained.
Embedding: embedded representation. Word embedding is an essential step when a computer processes text: an input natural-language symbol is mapped, through a numerical matrix, to a vector of fixed length, thereby converting a complex text problem into a mathematical one.
Transformer model: a neural network model, based on the attention mechanism, for solving sequence problems. It is mainly divided into two parts, an encoder and a decoder, whose basic structures are similar: each is composed of multi-head self-attention layers and fully connected layers. Compared with traditional recurrent neural network models for sequence problems, the Transformer model can capture text information over longer distances.
BERT model: BERT (Bidirectional Encoder Representations from Transformers) refers to the encoder portion of a bidirectional Transformer model. It is a self-encoding language model pre-trained with two tasks, a masked language model and next sentence prediction, which capture word-level and sentence-level representations respectively.
ALBERT model: a lightweight BERT model, aimed at solving the problem that the parameter counts of existing pre-trained models are too large. Relative to the BERT model, ALBERT mainly makes three improvements: (1) factorizing the embedding matrix; (2) cross-layer parameter sharing, i.e., multiple layers use the same parameters; (3) replacing next sentence prediction (NSP) with sentence-order prediction (SOP). Specifically, SOP's positive training samples are identical to NSP's, but its negative training samples are constructed by selecting two consecutive sentences in a document and swapping their order.
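The SOP sample construction in point (3) can be sketched directly (a hypothetical helper, with the sentence-pair index passed in explicitly so the behavior is deterministic):

```python
def make_sop_pair(sentences, i, negative):
    """Build one sentence-order-prediction sample from sentences i and i+1:
    positives keep the original order (label 1); negatives swap the two
    consecutive sentences (label 0), as in ALBERT's SOP task."""
    a, b = sentences[i], sentences[i + 1]
    return ((b, a), 0) if negative else ((a, b), 1)
```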
Recall: in search, recommendation, and similar fields, returning content that is related to, or of interest to, the user according to the user's search question, behavior, and so on. Common recall methods include text-matching recall, tag recall, and semantic recall.
Semantic matching recall: word-embedding encoding is performed on the user's question statement and on the text corpus, and the semantic similarity between the semantic vector of the question statement and the semantic vector of each corpus text is computed by a vector similarity method, thereby realizing semantic matching recall.
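A minimal sketch of this representation-based scheme, assuming sentence vectors are obtained by averaging word vectors and compared by cosine similarity (the averaging encoder and any embedding table are illustrative assumptions, not the patent's model):

```python
import math


def embed(sentence, word_vectors):
    """Encode a sentence as the average of its known word vectors."""
    vecs = [word_vectors[w] for w in sentence.split() if w in word_vectors]
    return [sum(dim) / len(vecs) for dim in zip(*vecs)]


def cosine(u, v):
    """Cosine similarity between two semantic vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)
```

Note that this comparison never lets the question and the text interact token by token, which is exactly the limitation the Background section raises.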
Elasticsearch: a non-relational database and near-real-time search platform; from the moment a document is indexed, only a slight delay passes before the document can be searched. It is scalable and highly available, and aims to let users query the data they want quickly.
In the present application, a classification model training method and apparatus, a target text determining method and apparatus, a computing device, and a computer-readable storage medium are provided, and detailed descriptions are given one by one in the following embodiments.
Fig. 1 shows a block diagram of a computing device 100 according to an embodiment of the present specification. The components of the computing device 100 include, but are not limited to, a memory 110 and a processor 120. The processor 120 is connected to the memory 110 via a bus 130, and a database 150 is used to store data.
Computing device 100 also includes access device 140, which enables computing device 100 to communicate via one or more networks 160. Examples of such networks include the Public Switched Telephone Network (PSTN), a Local Area Network (LAN), a Wide Area Network (WAN), a Personal Area Network (PAN), or a combination of communication networks such as the Internet. The access device 140 may include one or more of any type of network interface, wired or wireless (e.g., a Network Interface Card (NIC)), such as an IEEE 802.11 Wireless Local Area Network (WLAN) wireless interface, a Worldwide Interoperability for Microwave Access (WiMAX) interface, an Ethernet interface, a Universal Serial Bus (USB) interface, a cellular network interface, a Bluetooth interface, a Near Field Communication (NFC) interface, and so forth.
In one embodiment of the present description, the above-described components of computing device 100, as well as other components not shown in FIG. 1, may also be connected to each other, such as by a bus. It should be understood that the block diagram of the computing device shown in FIG. 1 is for exemplary purposes only and is not intended to limit the scope of the present description. Those skilled in the art may add or replace other components as desired.
Computing device 100 may be any type of stationary or mobile computing device including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), mobile phone (e.g., smart phone), wearable computing device (e.g., smart watch, smart glasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or PC. Computing device 100 may also be a mobile or stationary server.
The processor 120 may perform the steps of the method shown in FIG. 2. FIG. 2 is a flowchart illustrating a classification model training method according to an embodiment of the present application, which specifically includes the following steps.
Step 202: acquiring a training data set, wherein the training data set comprises sample questions and sample answers corresponding to the sample questions.
In practical application, in order to ensure the training effect of the classification model, the training data set includes a plurality of sample questions and sample answers corresponding to each sample question, where each sample question and corresponding sample answer form a query-answer pair, and then the training data set includes a plurality of query-answer pairs.
The sample questions may be of any length and any type, and the sample answer corresponding to each sample question can be understood as the standard answer to that sample question.
Step 204: and constructing a training sample corresponding to the sample problem based on the sample problem and the sample text of the sample problem obtained by searching a database.
The search database may be an Elasticsearch database, or another text database with a search function; the present application does not limit this.
Specifically, after the sample questions in the training data set and the sample answer corresponding to each sample question are obtained, each sample question is input into the Elasticsearch database, and a preset number of sample texts corresponding to each sample question are retrieved through the Elasticsearch database; the preset number can be set according to the actual training requirements of the classification model, for example, 100 or 200.
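A stage-1 recall request against Elasticsearch might use a query body like the following (the index layout and the field name `content` are assumptions for illustration; the patent does not specify them):

```python
def build_recall_query(question, size=100):
    """Build an Elasticsearch match-query body that retrieves the preset
    number (`size`) of sample texts most relevant to the sample question."""
    return {
        "size": size,  # preset number of sample texts, e.g. 100 or 200
        "query": {"match": {"content": question}},
    }
```

The body would then be passed to a client call such as `es.search(index=..., body=build_recall_query(q))`.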
In practical application, when sample texts are retrieved based on a sample question, the retrieval can be based on keywords in the sample question: the search database extracts keywords from the sample question and retrieves, say, 100 initial sample texts matching those keywords. For example, when several of the extracted keywords appear simultaneously in one initial sample text, that text is retrieved with high probability and ranks near the front of the 100 results. In other words, all 100 initial sample texts retrieved from the search database for the sample question are relevant to it, so more accurate positive training samples can later be constructed from the retrieved initial sample texts and the sample question.
In the embodiment of the present application, the classification model is trained with supervision, and the training samples include positive training samples and negative training samples. The constructing a training sample corresponding to the sample question based on the sample question and the sample text of the sample question obtained from the search database includes:
inputting the sample question into the search database to obtain initial sample texts corresponding to the sample question;
matching the sample answer corresponding to the sample question against the initial sample texts, and taking the initial sample texts whose matching similarity is greater than or equal to a preset similarity threshold as first sample texts;
constructing a positive training sample corresponding to the sample question based on the sample question and the first sample texts; and
constructing a negative training sample corresponding to the sample question based on the sample question and other sample texts, obtained from the search database, that are different from the first sample texts.
In specific implementation, each sample question is input into the search database, and the preset number of initial sample texts corresponding to each sample question are retrieved through the search database. The sample answer corresponding to each sample question is then matched against the initial sample texts corresponding to that sample question, and the initial sample texts whose matching similarity is greater than or equal to the preset similarity threshold are taken as first sample texts. Finally, a positive training sample corresponding to each sample question is constructed based on each sample question and the first sample texts corresponding to it. Meanwhile, a negative training sample corresponding to each sample question is constructed based on each sample question and other sample texts, obtained from the search database, that are different from the first sample texts corresponding to that sample question.
The preset similarity threshold may be set according to actual needs, for example, to 80% or 90%.
Specifically, when the sample answer corresponding to a sample question is matched against the initial sample texts and the similarity between them is determined, the similarity between the sample answer and each initial sample text can be computed via the edit distance between two character strings.
For example: the character string 1 is: "I now in Beijing" and string 2 are: "I am in Beijing", character string 1 has one more "now" word, and the edit distance between these two character strings is 1.
Computing the edit distance between two character strings means comparing them via a series of operations, such as additions, deletions, or replacements, until one string is converted into the other; the number of steps taken is the edit distance between the two strings. The smaller the edit distance, the higher the similarity.
Treating the sample answer and an initial sample text each as a character string, the number of steps needed to convert the initial sample text into the sample answer is the edit distance between them, and the similarity between the sample answer and the initial sample text is computed from this edit distance.
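The edit-distance computation described above can be sketched as a standard Levenshtein dynamic program; it is written here at word level so that the example sentences differ by a single edit (a hypothetical helper, not the patent's code):

```python
def edit_distance(a, b):
    """Word-level Levenshtein distance between two strings: the minimum
    number of insertions, deletions, and substitutions of words needed
    to turn one string into the other."""
    s, t = a.split(), b.split()
    prev = list(range(len(t) + 1))  # distances from the empty prefix
    for i, sw in enumerate(s, 1):
        cur = [i]
        for j, tw in enumerate(t, 1):
            cur.append(min(prev[j] + 1,                # delete sw
                           cur[j - 1] + 1,             # insert tw
                           prev[j - 1] + (sw != tw)))  # substitute
        prev = cur
    return prev[-1]
```

Deleting the single extra word "now" converts one example sentence into the other, so their distance is 1; the smaller the distance, the higher the similarity.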
In practical application, FuzzyWuzzy (a string fuzzy-matching tool) can be used to match the sample answer corresponding to a sample question against each initial sample text, realizing the similarity computation between the sample answer and each initial sample text; other similarity computation tools may also be used, and this is not limited here.
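FuzzyWuzzy's `ratio` is built on the same longest-matching-block idea as the standard library's `difflib.SequenceMatcher`, which can serve as a stand-in here; the helper names and the default 0.8 threshold are illustrative assumptions:

```python
from difflib import SequenceMatcher


def similarity(answer, text):
    """Similarity in [0, 1] between a sample answer and a candidate text,
    comparable to FuzzyWuzzy's ratio() / 100."""
    return SequenceMatcher(None, answer, text).ratio()


def first_sample_texts(answer, texts, threshold=0.8):
    """Keep the initial sample texts whose similarity to the sample answer
    reaches the preset similarity threshold (candidates for positives)."""
    return [t for t in texts if similarity(answer, t) >= threshold]
```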
For example, take the preset number as 100 and the preset similarity threshold as 80%, to describe in detail the construction of the training samples corresponding to a sample question a.
First, sample question a is input into the Elasticsearch database, and 100 initial sample texts corresponding to it are retrieved through the Elasticsearch database.
Then, the standard answer corresponding to sample question a is matched against each of the 100 initial sample texts, and the similarity between the standard answer and each initial sample text is computed; each initial sample text with similarity greater than or equal to 80% is taken as a first sample text.
Finally, a positive training sample corresponding to sample question a is constructed based on sample question a and each first sample text.
Meanwhile, a negative training sample corresponding to sample question a is constructed based on sample question a and other sample texts obtained from the Elasticsearch database (i.e., texts obtained from the Elasticsearch database other than the first sample texts).
In the embodiment of the present application, training of the classification model depends on training samples. In practice, the more training samples, the better the training effect; but training samples normally depend on manual labeling, whose cost is very high, so labeling a large number of training samples greatly increases labor cost. In the present application, a small number of manually labeled query-answer pairs are used to obtain, from the search database, a plurality of initial sample texts related to each sample question. Positive training samples are constructed from the sample question and each associated initial sample text, and negative training samples are constructed from the sample question and other sample texts different from the associated initial sample texts, which greatly expands the training data while keeping the manual labeling cost low.
Specifically, the constructing a negative training sample corresponding to the sample question based on the sample question and other sample texts obtained from the search database and different from the first sample text includes:
matching the sample answer corresponding to the sample question against the initial sample texts, and taking the initial sample texts whose matching similarity is smaller than the preset similarity threshold as second sample texts;
acquiring, from the search database and based on the sample question, third sample texts different from the initial sample texts;
determining fourth sample texts from the initial sample texts corresponding to other sample questions different from the sample question; and
constructing a negative training sample corresponding to the sample question based on the sample question and the second sample texts, the third sample texts, and/or the fourth sample texts.
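The three negative-text sources in the list above can be combined into labeled samples with a small helper (the function name and dict layout are hypothetical, for illustration only):

```python
def build_negatives(question, second_texts, third_texts, fourth_texts):
    """Pair the sample question with each negative text source: second
    sample texts (recalled but below the similarity threshold), third
    sample texts (outside the recall set), and fourth sample texts
    (recalled for a different sample question). All pairs get label 0."""
    sources = second_texts + third_texts + fourth_texts
    return [{"question": question, "text": t, "label": 0} for t in sources]
```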
In specific implementation, there are multiple ways to construct the negative training samples of the classification model: an initial sample text whose similarity is smaller than the preset similarity threshold can be combined with the sample question to form a negative training sample; other sample texts, different from those retrieved for the current sample question, can be retrieved from the search database and combined with the sample question; and sample texts corresponding to any sample question different from the current one can also be combined with the sample question. These approaches may further be combined, two or three at a time, to form negative training samples.
Following the example above, the standard answer corresponding to sample question a is matched against each of the 100 initial sample texts, and the similarity between the standard answer and each initial sample text is computed; each initial sample text with similarity smaller than 80% is taken as a second sample text.
Meanwhile, based on sample question a, a preset number of texts different from the 100 initial sample texts corresponding to sample question a are obtained from the Elasticsearch database as third sample texts; this preset number can be set according to actual needs and is not limited here.
A preset number of initial sample texts are selected as fourth sample texts from the 100 initial sample texts corresponding to a sample question b different from sample question a; the 100 initial sample texts corresponding to sample question b may likewise be obtained from the Elasticsearch database, and their preset number may also be set according to actual needs without any limitation.
After the second, third and fourth sample texts are obtained, sample question a can be combined with each second sample text to construct negative training samples corresponding to sample question a; or with each third sample text; or with each fourth sample text; or with each second, third and fourth sample text together. The specific combination mode of the negative training samples may be set according to the practical application and is not limited here.
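The selectable combination modes described above can be sketched as follows; which pools to draw from (`second`, `third`, `fourth`) is the configurable choice the paragraph describes, and the function name is an illustrative assumption.

```python
def build_negative_pairs(question, second, third, fourth,
                         use=("second", "third", "fourth")):
    """Pair the sample question with texts from the selected negative pools.
    `use` selects which of the three construction modes are active."""
    pools = {"second": second, "third": third, "fourth": fourth}
    return [(question, text) for name in use for text in pools[name]]
```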
In practical applications, three situations are considered when constructing negative training samples: sample texts that are related to the sample question but whose similarity is smaller than the preset similarity threshold; sample texts that are unrelated to the sample question but are obtained from the search database; and sample texts that are necessarily unrelated to the sample question because they contain the sample answers to other sample questions. In each case the negative training sample is formed by combining the sample text with the sample question. By covering these three situations, the classification model is exposed to varied negative training samples during training, enabling richer and more differentiated learning and greatly improving the training effect.
In addition, interactive semantic matching between a sample question and its corresponding sample text should be taken into account when training the classification model, because the same sample text is represented differently under different sample questions. By building training samples that pair sample texts with different sample questions, the classification model can grasp the semantic focus through the interaction between the sample question and the sample text (that is, the interaction among all word vectors of the text obtained by splicing the question with the corresponding sample text), enabling accurate text recall later on. Accordingly, during training, the sample question and the corresponding sample text are spliced together to serve as a training sample. The specific implementation is as follows:
the constructing a positive training sample corresponding to the sample question based on the sample question and the first sample text includes:
splicing the sample question with the first sample text;
and taking the spliced result of the sample question and the first sample text as a positive training sample corresponding to the sample question, and adding a corresponding first label to the positive training sample.
And constructing a negative training sample corresponding to the sample question based on the sample question and the second sample text, the third sample text and/or the fourth sample text, including:
Splicing the sample question with the second sample text, the third sample text and/or the fourth sample text;
and taking the spliced results of the sample question with the second sample text, the third sample text and/or the fourth sample text as negative training samples corresponding to the sample question, and adding a corresponding second label to each negative training sample.
Wherein the first label may be a label indicating that the training sample is a positive training sample, e.g. 1; the second label may be a label indicating that the training sample is a negative training sample, e.g., 0.
In practical applications, when positive training samples are constructed, each sample question is spliced with each of its corresponding first sample texts, the spliced results are used as positive training samples, and the first label of each positive training sample is 1. Each sample question is likewise spliced with its corresponding second, third and/or fourth sample texts, the spliced results are used as negative training samples, and the second label of each negative training sample is 0. When the classification model is subsequently trained, accurate training can be achieved based on the positive training samples, the negative training samples, the first labels and the second labels.
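The splicing-and-labelling step can be illustrated as below; the `[SEP]` separator mirrors BERT-style input formatting and is an assumption, since the patent does not fix a separator.

```python
def build_training_samples(question, first_texts, negative_texts, sep="[SEP]"):
    """Splice the question with each sample text: positives carry label 1
    (the first label), negatives carry label 0 (the second label)."""
    positives = [(f"{question} {sep} {t}", 1) for t in first_texts]
    negatives = [(f"{question} {sep} {t}", 0) for t in negative_texts]
    return positives + negatives
```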
In specific implementations, too many training samples burden the training process of the classification model, while too few reduce its training accuracy. Therefore, before the training samples are constructed, the number of initial sample texts retrieved from the search database for each sample question is limited, which in turn limits the number of positive and negative training samples constructed. In addition, to further improve the accuracy of the constructed positive training samples and thus the training accuracy of the classification model, after sample texts corresponding to a sample question are retrieved from the search database, the initially retrieved sample texts can be further screened using the semantics of the sample question. The specific implementation is as follows:
Inputting the sample question into a search database to obtain an initial sample text corresponding to the sample question, wherein the method comprises the following steps:
Inputting the sample questions into a search database to obtain at least one sample text to be screened corresponding to the sample questions;
and carrying out semantic analysis on the sample questions, and screening initial sample texts corresponding to the sample questions from the at least one sample text to be screened based on semantic analysis results.
In practical applications, when a sample question is input into the search database, the sample texts are retrieved solely by keyword matching. A situation can therefore arise in which the keywords of the sample question match the keywords of a certain sample text well, yet from a semantic point of view that text does not contain the sample answer to the question.
To avoid this situation and improve the accuracy of the subsequently constructed training samples, when initial sample texts are acquired, each sample question is first input into the search database to obtain a plurality of sample texts to be screened; semantic analysis is then performed on each sample question, and suitable initial sample texts are screened out of the corresponding candidates based on the semantic analysis result of each sample question.
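The two-step retrieval-then-screening can be sketched as follows. The `semantic_score` function is a deliberately crude stand-in for the semantic analysis the text describes (a real system would use an embedding model); the threshold value is likewise an assumption.

```python
def semantic_score(question: str, text: str) -> float:
    """Toy stand-in for semantic analysis: the fraction of question tokens
    that also appear in the candidate text."""
    q = set(question.lower().split())
    return len(q & set(text.lower().split())) / len(q) if q else 0.0


def screen_candidates(question: str, candidates: list, min_score: float = 0.5) -> list:
    """Keep only the retrieved texts whose semantic score clears the threshold,
    discarding keyword matches that are semantically off-topic."""
    return [t for t in candidates if semantic_score(question, t) >= min_score]
```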
Step 206: training a classification model based on the training samples and the sample labels corresponding to the training samples to obtain the classification model, wherein the classification model outputs the probability that a sample text retrieved from the search database for a sample question contains the sample answer corresponding to that sample question.
When the method is implemented, the classification model comprises an input layer, a coding layer and a classification layer;
Correspondingly, the training the classification model based on the training sample and the sample label corresponding to the training sample to obtain the classification model comprises the following steps:
Inputting the training samples into the classification model through the input layer, and obtaining coding vectors of the training samples through the coding layer;
Inputting the coding vector of the training sample into the classification layer to obtain the initial probability of the training sample;
Calculating a loss value based on the initial probability of the training sample and the sample label;
And adjusting parameters of the classification model according to the loss value, and continuing training the classification model until a training stopping condition is reached.
The classification model includes, but is not limited to, the ALBERT model; any other classification model in which the sample question and the sample text can interact during training and context information is taken into account through interaction-based semantic matching can also be used. For ease of understanding, this application is explained with ALBERT as the classification model.
Specifically, the positive and negative training samples constructed above are fed into the classification model through its input layer, and the coding vector (i.e., hidden-layer vector) of each training sample is obtained through the coding layer (such as an embedding layer). The coding vector is then passed to the downstream binary classification task layer, where a preset linear expression converts the multi-dimensional coding vector into a two-dimensional vector whose elements represent the initial probabilities corresponding to the training sample. Finally, a loss value is calculated based on the initial probability of the training sample and the sample label, the network parameters of the classification model are adjusted according to the loss value, and training continues until the classification model reaches the training stop condition.
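The downstream steps (projecting the encoding vector to two dimensions, applying softmax, and computing a loss against the 0/1 labels) can be sketched in plain Python. The encoder itself (e.g. ALBERT) is omitted, and the weight shapes and function names are illustrative assumptions rather than the patent's implementation.

```python
import math

def classify(encoding, weights, biases):
    """Linear projection of an encoding vector to 2 logits, then softmax.
    Index 1 is read as the probability that the text contains the answer."""
    logits = [sum(w * x for w, x in zip(row, encoding)) + b
              for row, b in zip(weights, biases)]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(prob_positive, label):
    """Binary cross-entropy between the predicted 'contains answer'
    probability and the 0/1 sample label."""
    eps = 1e-12
    return -(label * math.log(prob_positive + eps)
             + (1 - label) * math.log(1 - prob_positive + eps))
```

A gradient step on the classifier parameters would then be driven by this loss, repeated until the stop condition is reached.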
In the embodiments of this specification, training samples for the classification model can first be constructed quickly and accurately based on the sample questions and the search database, saving training time for subsequent model training. The classification model is then trained on samples built by splicing each sample question with its corresponding sample text, so that question and text interact during training. As a result, when the classification model is later applied, it can properly take the context of a question and its candidate texts into account, accurately grasp the semantic focus, and accurately predict the probability that a text contains the answer to the question.
Wherein the processor 120 may perform the steps of the method shown in fig. 3. Fig. 3 is a schematic flowchart showing a target text determining method according to an embodiment of the present application, which specifically includes the following steps.
Step 302: acquiring a target question, and inputting the target question into a search database to obtain at least one initial text corresponding to the target question.
The target question may be of any length and any type.
For the search database, reference may be made to the above embodiments; details are not repeated here.
Specifically, a target question is acquired, the target question is input into the search database, and a plurality of initial texts corresponding to the target question are obtained through the search database.
In a specific implementation, the inputting the target question into a search database to obtain at least one initial text corresponding to the target question includes:
inputting the target question into a search database to obtain at least one text to be screened corresponding to the target question;
and performing semantic analysis on the target question, and screening at least one initial text corresponding to the target question from the at least one text to be screened based on the semantic analysis result.
In practical applications, to reduce the workload of the classification model and improve the accuracy of the initial texts obtained from the search database, after the target question is acquired it is input into the search database to obtain a plurality of texts to be screened. Semantic analysis is then performed on the target question, and initial texts that match the target question are screened out of the candidates based on the semantic analysis result.
Step 304: and inputting the target question and the at least one initial text into a classification model to obtain the probability that the at least one initial text contains a target answer corresponding to the target question.
The classification model is obtained by the classification model training method described above.
Specifically, after a target question and a plurality of corresponding initial texts are acquired, the target question is input into the classification model together with each initial text in turn, and the probability that each initial text contains the target answer corresponding to the target question is obtained.
In a specific implementation, the inputting the target question and the at least one initial text into the classification model to obtain the probability that the at least one initial text includes the target answer corresponding to the target question includes:
and splicing the target question with each initial text in the at least one initial text, and inputting each spliced result into a classification model to obtain the probability that each initial text contains a target answer corresponding to the target question.
The spliced result of the target question and an initial text is input into the pre-trained classification model, and the probability that the initial text contains the target answer corresponding to the target question is obtained, for example 0.3.
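The splice-and-score loop at inference time can be sketched as follows; `model` stands for the trained classifier and is passed in as an arbitrary callable, since this illustration does not load a real model, and the `[SEP]` separator is again an assumption.

```python
def score_candidates(question, texts, model, sep="[SEP]"):
    """Splice the target question with each initial text and let the trained
    classification model score each spliced string."""
    return [model(f"{question} {sep} {text}") for text in texts]
```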
In practical applications, after a target question and each corresponding initial text are input into the classification model, the model outputs both the probability that each initial text contains the target answer corresponding to the target question and the probability that it does not. Because this application screens the target text based only on the positive class, only the probability that each initial text contains the target answer corresponding to the target question is introduced.
In the embodiments of this specification, a coarse recall of initial texts corresponding to the target question is first achieved through the search database; the coarsely recalled initial texts are then combined with the target question and input into the classification model, which, by computing the semantic similarity between each initial text and the target question, further screens out the texts containing the target answer, ensuring the accuracy of the subsequently obtained target text.
Step 306: and determining a target text containing a target answer corresponding to the target question from the at least one initial text based on the probability.
In a specific implementation, the determining, based on the probability, a target text including a target answer corresponding to the target question from the at least one initial text includes:
and sorting all initial texts in the at least one initial text in descending order based on the probability, and taking a preset number of the top-ranked initial texts as target texts containing the target answer corresponding to the target question.
Specifically, after the probability of each initial text corresponding to the target question is obtained, the initial texts are ranked in descending order of probability, and then a preset number of initial texts are selected, based on preset requirements, as target texts containing the target answer corresponding to the target question. For example, the first 10 or the first 15 initial texts after the descending sort are selected as target texts.
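The descending sort and top-k selection described above amounts to a few lines; the function name and default k are illustrative.

```python
def top_k_texts(texts, probs, k=10):
    """Rank the initial texts by answer probability (descending) and keep
    the top k as target texts."""
    ranked = sorted(zip(texts, probs), key=lambda pair: pair[1], reverse=True)
    return [text for text, _ in ranked[:k]]
```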
The target text determining method provided by the embodiments of this application includes acquiring a target question and inputting it into a search database to obtain at least one initial text corresponding to the target question; inputting the target question and the at least one initial text into a classification model to obtain the probability that the at least one initial text contains the target answer corresponding to the target question, wherein the classification model is obtained by the classification model training method described above; and determining, from the at least one initial text and based on the probability, a target text containing the target answer corresponding to the target question. Specifically, the method provides a two-stage text search strategy: first, the target question is input into the search database and a plurality of corresponding initial texts are obtained, achieving the coarse text recall of the first stage; then the initial texts recalled in the first stage are further screened by the pre-trained classification model, so that the more accurate target texts most relevant to the target question are screened out of the plurality of initial texts.
In addition, when the initial texts recalled in the first stage are screened by the classification model, the target question is combined with each initial text and input into the model. The classification model can then compute the features of every word of the spliced text and the pairwise interactions between words, realizing the interaction between the target question and each initial text inside the model. Context information is thus well taken into account, the matching precision between the target question and the initial texts is improved, and the target texts corresponding to the target question can be obtained more accurately.
Corresponding to the above method embodiments, the present disclosure further provides an embodiment of a classification model training apparatus, and fig. 4 shows a schematic structural diagram of a classification model training apparatus according to an embodiment of the present disclosure. As shown in fig. 4, the apparatus includes:
A training data acquisition module 402 configured to acquire a training data set, wherein the training data set includes a sample question and a sample answer corresponding to the sample question;
a training sample construction module 404 configured to construct a training sample corresponding to the sample question based on the sample question and a sample text of the sample question obtained by searching a database;
model training module 406 is configured to train a classification model based on the training samples and sample labels corresponding to the training samples, and obtain the classification model.
Optionally, the training sample construction module 404 is further configured to:
inputting the sample questions into a search database to obtain initial sample texts corresponding to the sample questions;
matching the sample answers corresponding to the sample questions with the initial sample text, and taking the initial sample text with the matching similarity larger than or equal to a preset similarity threshold as a first sample text;
Based on the sample question and the first sample text, constructing a positive training sample corresponding to the sample question;
And constructing a negative training sample corresponding to the sample question based on the sample question and other sample texts which are acquired from the search database and are different from the first sample text.
Optionally, the training sample construction module 404 is further configured to:
matching the sample answers corresponding to the sample questions with the initial sample text, and taking the initial sample text with the matching similarity smaller than a preset similarity threshold as a second sample text;
Acquiring a third sample text different from the initial sample text from the search database based on the sample question;
Determining a fourth sample text from initial sample texts corresponding to other sample questions different from the sample questions;
and constructing a negative training sample corresponding to the sample question based on the sample question and the second sample text, the third sample text and/or the fourth sample text.
Optionally, the training sample construction module 404 is further configured to:
Inputting the sample questions into a search database to obtain at least one sample text to be screened corresponding to the sample questions;
and carrying out semantic analysis on the sample questions, and screening initial sample texts corresponding to the sample questions from the at least one sample text to be screened based on semantic analysis results.
Optionally, the training sample construction module 404 is further configured to:
splicing the sample question with the first sample text;
and taking the spliced result of the sample question and the first sample text as a positive training sample corresponding to the sample question, and adding a corresponding first label to the positive training sample.
Optionally, the training sample construction module 404 is further configured to:
Splicing the sample question with the second sample text, the third sample text and/or the fourth sample text;
and taking the spliced results of the sample question with the second sample text, the third sample text and/or the fourth sample text as negative training samples corresponding to the sample question, and adding a corresponding second label to each negative training sample.
Optionally, the classification model includes an input layer, an encoding layer, and a classification layer;
accordingly, the model training module 406 is further configured to:
Inputting the training samples into the classification model through the input layer, and obtaining coding vectors of the training samples through the coding layer;
Inputting the coding vector of the training sample into the classification layer to obtain the initial probability of the training sample;
Calculating a loss value based on the initial probability of the training sample and the sample label;
And adjusting parameters of the classification model according to the loss value, and continuing training the classification model until a training stopping condition is reached.
The classification model training apparatus provided by the embodiments of this specification can construct training samples for the classification model quickly and accurately based on the sample questions and the search database, saving training time for subsequent model training. The classification model is trained on samples built by splicing each sample question with its corresponding sample text, so that question and text interact during training; when the model is later applied, it can properly take the context of a question and its candidate texts into account, accurately grasp the semantic focus, and accurately predict the probability that a text contains the answer to the question.
Corresponding to the above method embodiments, the present disclosure further provides an embodiment of a target text determining apparatus, and fig. 5 shows a schematic structural diagram of a target text determining apparatus according to an embodiment of the present disclosure. As shown in fig. 5, the apparatus includes:
A question obtaining module 502, configured to obtain a target question, and input the target question into a search database to obtain at least one initial text corresponding to the target question;
A probability obtaining module 504, configured to input the target question and the at least one initial text into a classification model to obtain the probability that the at least one initial text contains the target answer corresponding to the target question, wherein the classification model is obtained by the classification model training method described above;
A text determination module 506 is configured to determine, from the at least one initial text, a target text containing a target answer corresponding to the target question based on the probability.
Optionally, the probability obtaining module 504 is further configured to:
and splicing the target question with each initial text in the at least one initial text, and inputting each spliced result into a classification model to obtain the probability that each initial text contains a target answer corresponding to the target question.
Optionally, the text determination module 506 is further configured to:
and performing descending order arrangement on all initial texts in the at least one initial text based on the probability, and acquiring a preset number of initial texts after descending order arrangement from high to low to serve as target texts containing target answers corresponding to the target questions.
Optionally, the question obtaining module 502 is further configured to:
inputting the target question into a search database to obtain at least one text to be screened corresponding to the target question;
and performing semantic analysis on the target question, and screening at least one initial text corresponding to the target question from the at least one text to be screened based on the semantic analysis result.
The target text determining apparatus provided by this application acquires a target question and inputs it into a search database to obtain at least one initial text corresponding to the target question; inputs the target question and the at least one initial text into a classification model to obtain the probability that the at least one initial text contains the target answer corresponding to the target question, wherein the classification model is obtained by the classification model training method described above; and determines, from the at least one initial text and based on the probability, a target text containing the target answer corresponding to the target question. Specifically, a two-stage text search strategy is provided: first, the target question is input into the search database and a plurality of corresponding initial texts are obtained, achieving the coarse text recall of the first stage; then the initial texts recalled in the first stage are further screened by the pre-trained classification model, so that the more accurate target texts most relevant to the target question are screened out of the plurality of initial texts.
In addition, when the initial texts recalled in the first stage are screened by the classification model, the target question is combined with each initial text and input into the model. The classification model can then compute the features of every word of the spliced text and the pairwise interactions between words, realizing the interaction between the target question and each initial text inside the model. Context information is thus well taken into account, the matching precision between the target question and the initial texts is improved, and the target texts corresponding to the target question can be obtained more accurately.
It should be noted that the components in the apparatus claims should be understood as the functional modules necessary to implement the steps of the program flow or of the method, not as actual functional divisions or separate physical limitations. Apparatus claims defined by such a set of functional modules should be understood as a functional-module architecture that implements the solution mainly through the computer program described in the specification, rather than as a physical apparatus that implements the solution mainly through hardware.
An embodiment of the present application also provides a computing device including a memory, a processor, and computer instructions stored on the memory and executable on the processor, the processor implementing the steps of the classification model training method or of the target text determining method when executing the computer instructions.
The foregoing is a schematic illustration of a computing device of this embodiment. It should be noted that, the technical solution of the computing device and the technical solutions of the above-mentioned classification model training method and the target text determining method belong to the same concept, and details of the technical solution of the computing device, which are not described in detail, can be referred to the description of the technical solutions of the above-mentioned classification model training method and the target text determining method.
An embodiment of the present application also provides a computer-readable storage medium storing computer instructions that, when executed by a processor, implement the steps of the classification model training method or the steps of the target text determination method.
The above is an exemplary version of a computer-readable storage medium of the present embodiment. It should be noted that, the technical solutions of the computer readable storage medium and the technical solutions of the classification model training method and the target text determining method belong to the same concept, and details of the technical solutions of the computer readable storage medium which are not described in detail can be referred to the description of the technical solutions of the classification model training method and the target text determining method.
The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The computer instructions include computer program code, which may be in source code form, object code form, an executable file, some intermediate form, and so on. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so on. It should be noted that the content of the computer-readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, the computer-readable medium does not include electrical carrier signals and telecommunications signals.
It should be noted that, for simplicity of description, the foregoing method embodiments are all expressed as a series of combinations of actions, but those skilled in the art should understand that the present application is not limited by the order of actions described, as some steps may be performed in other orders or simultaneously in accordance with the present application. Further, those skilled in the art will appreciate that the embodiments described in the specification are all preferred embodiments, and that the actions and modules involved are not necessarily all required by the present application.
In the foregoing embodiments, each embodiment is described with its own emphasis; for parts of one embodiment that are not described in detail, reference may be made to the related descriptions of other embodiments.
The preferred embodiments of the application disclosed above are intended only to assist in the explanation of the application. Alternative embodiments are not intended to be exhaustive or to limit the application to the precise form disclosed. Obviously, many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described in order to best explain the principles of the application and the practical application, to thereby enable others skilled in the art to best understand and utilize the application. The application is limited only by the claims and the full scope and equivalents thereof.

Claims (14)

1. A method of training a classification model, comprising:
acquiring a training data set, wherein the training data set comprises sample questions and sample answers corresponding to the sample questions;
inputting the sample question into a search database, obtaining an initial sample text corresponding to the sample question, matching a sample answer corresponding to the sample question with the initial sample text, and taking the initial sample text with a matching similarity greater than or equal to a preset similarity threshold as a first sample text;
constructing a positive training sample corresponding to the sample question based on the sample question and the first sample text;
constructing a negative training sample corresponding to the sample question based on the sample question and other sample texts which are acquired from the search database and are different from the first sample text;
and training a classification model based on the positive training sample, the sample label corresponding to the positive training sample, the negative training sample, and the sample label corresponding to the negative training sample, to obtain a trained classification model.
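The claim language above prescribes no particular implementation. Purely as a non-limiting illustration, the sample-construction step of claim 1 might be sketched as follows, where `retrieve` stands in for the search database and the token-overlap `similarity` is a hypothetical choice of matching metric, not one mandated by the claim:

```python
# Hypothetical sketch of claim 1's sample construction: retrieve candidate
# texts for a question, then split them into positive and negative training
# samples by answer/text similarity against a preset threshold.

def similarity(answer: str, text: str) -> float:
    """Illustrative metric: fraction of answer tokens found in the text."""
    a, t = set(answer.split()), set(text.split())
    return len(a & t) / len(a) if a else 0.0

def build_samples(question, answer, retrieve, threshold=0.8):
    positives, negatives = [], []
    for text in retrieve(question):                    # initial sample texts
        if similarity(answer, text) >= threshold:
            positives.append((question, text, 1))      # first sample text -> positive
        else:
            negatives.append((question, text, 0))      # low-similarity text -> negative
    return positives, negatives
```

A usage note: the threshold of 0.8 is arbitrary here; the claims only require "a preset similarity threshold".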
2. The method according to claim 1, wherein the constructing a negative training sample corresponding to the sample question based on the sample question and other sample text obtained from the search database that is different from the first sample text includes:
matching the sample answers corresponding to the sample questions with the initial sample text, and taking the initial sample text with the matching similarity smaller than a preset similarity threshold as a second sample text;
acquiring a third sample text different from the initial sample text from the search database based on the sample question;
determining a fourth sample text from initial sample texts corresponding to other sample questions different from the sample question;
and constructing a negative training sample corresponding to the sample question based on the sample question and the second sample text, the third sample text and/or the fourth sample text.
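As a non-limiting illustration of claim 2, the three negative-text sources could be combined as below; the argument names are hypothetical labels for the second, third, and fourth sample texts, and the retrieval machinery producing them is assumed to exist elsewhere:

```python
# Hypothetical sketch of claim 2: negatives drawn from three sources.
def build_negatives(question, second_texts, third_texts, fourth_texts):
    """second_texts: low-similarity retrievals for this question;
    third_texts: texts retrieved beyond the initial sample texts;
    fourth_texts: initial texts retrieved for *other* questions."""
    negatives = []
    for text in second_texts + third_texts + fourth_texts:
        negatives.append((question, text, 0))  # 0 = negative label
    return negatives
```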
3. The method for training a classification model according to claim 1, wherein the inputting the sample question into a search database to obtain an initial sample text corresponding to the sample question comprises:
inputting the sample question into a search database to obtain at least one sample text to be screened corresponding to the sample question;
and performing semantic analysis on the sample question, and screening an initial sample text corresponding to the sample question from the at least one sample text to be screened based on a semantic analysis result.
4. The method for training a classification model according to claim 1, wherein the constructing a positive training sample corresponding to the sample question based on the sample question and the first sample text comprises:
splicing the sample question with the first sample text;
and taking the spliced result of the sample question and the first sample text as a positive training sample corresponding to the sample question, and adding a corresponding first label to the positive training sample.
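The splicing of claim 4 admits many encodings. One common realization, shown here purely for illustration, is a BERT-style separator convention; the `[CLS]`/`[SEP]` tokens are an assumption of this sketch, not a requirement of the claim:

```python
# Hypothetical splicing of claim 4: concatenate question and first sample
# text into one sequence and attach the first (positive) label.
def make_positive_sample(question: str, first_text: str):
    spliced = f"[CLS] {question} [SEP] {first_text} [SEP]"
    return spliced, 1   # 1 = first label marking a positive sample
```

Claim 5's negative splicing would differ only in the text source and in attaching the second label.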
5. The method for training a classification model according to claim 2, wherein the constructing a negative training sample corresponding to the sample question based on the sample question and the second sample text, the third sample text, and/or the fourth sample text comprises:
splicing the sample question with the second sample text, the third sample text and/or the fourth sample text;
and taking the spliced result of the sample question and the second sample text, the third sample text and/or the fourth sample text as a negative training sample corresponding to the sample question, and adding a corresponding second label to the negative training sample.
6. The classification model training method of any of claims 1-5, wherein the classification model comprises an input layer, a coding layer, and a classification layer;
correspondingly, the training of the classification model based on the training samples and the sample labels corresponding to the training samples to obtain the classification model comprises:
inputting the training samples into the classification model through the input layer, and obtaining coding vectors of the training samples through the coding layer;
inputting the coding vector of the training sample into the classification layer to obtain the initial probability of the training sample;
calculating a loss value based on the initial probability of the training sample and the sample label;
and adjusting parameters of the classification model according to the loss value, and continuing training the classification model until a training stopping condition is reached, wherein the classification model outputs the probability that the sample text of the sample question obtained through searching the database contains a sample answer corresponding to the sample question.
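Claim 6's loop (encode, classify, compute loss, adjust parameters until a stopping condition) can be illustrated with a deliberately tiny stand-in: a fixed feature "encoding layer" feeding a logistic "classification layer" trained by gradient descent on cross-entropy loss. Everything here, the features, learning rate, and epoch count, is a hypothetical simplification, not the patent's model:

```python
import math

def encode(sample: str) -> list[float]:
    """Stand-in encoding layer: two crude text features."""
    return [len(sample) / 100.0, float(sample.count("?"))]

def train(samples, labels, lr=0.5, epochs=200):
    """Toy classification layer: logistic regression trained by SGD."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):                       # stopping condition: epoch budget
        for raw, y in zip(samples, labels):
            x = encode(raw)
            z = w[0] * x[0] + w[1] * x[1] + b
            p = 1.0 / (1.0 + math.exp(-z))        # initial probability of the sample
            g = p - y                             # gradient of cross-entropy loss
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g                           # adjust parameters per the loss
    return w, b
```

After training, the model outputs, for a spliced question/text pair, the probability that the text contains the answer, which is the quantity the later claims rank on.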
7. A target text determination method, comprising:
acquiring a target question, and inputting the target question into a search database to acquire at least one initial text corresponding to the target question;
inputting the target question and the at least one initial text into a classification model to obtain a probability that the at least one initial text contains a target answer corresponding to the target question, wherein the classification model is obtained by the classification model training method according to any one of claims 1-6;
and determining a target text containing a target answer corresponding to the target question from the at least one initial text based on the probability.
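As a non-limiting end-to-end sketch of claim 7, the search database and trained classification model can be treated as callables; the probability threshold used for the final selection is an assumption of this sketch (claim 9 instead specifies top-k selection):

```python
# Hypothetical pipeline of claim 7: retrieve initial texts, score each with
# the classification model, and keep texts likely to contain the answer.
def determine_target_text(question, search_db, classify, threshold=0.5):
    initial_texts = search_db(question)                 # at least one initial text
    scored = [(t, classify(question, t)) for t in initial_texts]
    return [t for t, p in scored if p >= threshold]     # texts likely holding the answer
```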
8. The method for determining a target text according to claim 7, wherein the inputting the target question and the at least one initial text into the classification model to obtain a probability that the at least one initial text contains a target answer corresponding to the target question comprises:
and splicing the target question with each initial text in the at least one initial text, and inputting each spliced result into a classification model to obtain the probability that each initial text contains a target answer corresponding to the target question.
9. The method for determining a target text according to claim 7, wherein determining a target text containing a target answer corresponding to the target question from the at least one initial text based on the probability comprises:
and arranging all initial texts in the at least one initial text in descending order based on the probability, and taking a preset number of top-ranked initial texts as target texts containing the target answer corresponding to the target question.
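The descending-order selection of claim 9 reduces to a sort over (text, probability) pairs; the function name and the preset number of 3 below are illustrative:

```python
# Hypothetical sketch of claim 9: rank initial texts by the model's
# probability and keep a preset number of top-ranked texts.
def select_target_texts(initial_texts, probabilities, preset_number=3):
    ranked = sorted(zip(initial_texts, probabilities),
                    key=lambda pair: pair[1], reverse=True)  # descending order
    return [text for text, _ in ranked[:preset_number]]
```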
10. The method for determining a target text according to any one of claims 7 to 9, wherein the inputting the target question into a search database to obtain at least one initial text corresponding to the target question includes:
inputting the target question into a search database to obtain at least one text to be screened corresponding to the target question;
and performing semantic analysis on the target question, and screening at least one initial text corresponding to the target question from the at least one text to be screened based on a semantic analysis result.
11. A classification model training apparatus, comprising:
the training data acquisition module is configured to acquire a training data set, wherein the training data set comprises a sample question and a sample answer corresponding to the sample question;
A training sample construction module configured to input the sample question into a search database, obtain an initial sample text corresponding to the sample question, match a sample answer corresponding to the sample question with the initial sample text, take the initial sample text with a matching similarity greater than or equal to a preset similarity threshold as a first sample text,
Based on the sample question and the first sample text, constructing a positive training sample corresponding to the sample question,
Constructing a negative training sample corresponding to the sample question based on the sample question and other sample texts which are acquired from the search database and are different from the first sample text;
The model training module is configured to train the classification model based on the positive training sample, the sample label corresponding to the positive training sample, the negative training sample and the sample label corresponding to the negative training sample to obtain the classification model.
12. A target text determining apparatus, comprising:
a question acquisition module configured to acquire a target question, and input the target question into a search database to acquire at least one initial text corresponding to the target question;
A probability obtaining module configured to input the target question and the at least one initial text into a classification model, to obtain a probability that the at least one initial text contains a target answer corresponding to the target question, wherein the classification model is obtained by the classification model training method of any one of claims 1-6;
And the text determining module is configured to determine a target text containing a target answer corresponding to the target question from the at least one initial text based on the probability.
13. A computing device comprising a memory, a processor, and computer instructions stored on the memory and executable on the processor, wherein the processor, when executing the computer instructions, performs the steps of the classification model training method of any of claims 1-6 or the steps of the target text determination method of any of claims 7-10.
14. A computer readable storage medium storing computer instructions which, when executed by a processor, implement the steps of the classification model training method of any one of claims 1-6 or the steps of the target text determination method of any one of claims 7-10.
CN202110442474.1A 2021-04-23 2021-04-23 Classification model training method and device and target text determining method and device Active CN113159187B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110442474.1A CN113159187B (en) 2021-04-23 2021-04-23 Classification model training method and device and target text determining method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110442474.1A CN113159187B (en) 2021-04-23 2021-04-23 Classification model training method and device and target text determining method and device

Publications (2)

Publication Number Publication Date
CN113159187A CN113159187A (en) 2021-07-23
CN113159187B true CN113159187B (en) 2024-06-14

Family

ID=76870131

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110442474.1A Active CN113159187B (en) 2021-04-23 2021-04-23 Classification model training method and device and target text determining method and device

Country Status (1)

Country Link
CN (1) CN113159187B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113609248A (en) * 2021-08-20 2021-11-05 北京金山数字娱乐科技有限公司 Word weight generation model training method and device and word weight generation method and device
CN113961765B (en) * 2021-10-21 2023-12-19 北京百度网讯科技有限公司 Searching method, searching device, searching equipment and searching medium based on neural network model
CN117033612B (en) * 2023-08-18 2024-06-04 中航信移动科技有限公司 Text matching method, electronic equipment and storage medium

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112131366A (en) * 2020-09-23 2020-12-25 腾讯科技(深圳)有限公司 Method, device and storage medium for training text classification model and text classification

Family Cites Families (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10592540B2 (en) * 2015-07-07 2020-03-17 Google Llc Generating elements of answer-seeking queries and elements of answers
US11200510B2 (en) * 2016-07-12 2021-12-14 International Business Machines Corporation Text classifier training
CN108509463B (en) * 2017-02-28 2022-03-29 华为技术有限公司 Question response method and device
WO2018186445A1 (en) * 2017-04-06 2018-10-11 株式会社Nttドコモ Dialogue system
CN107491518B (en) * 2017-08-15 2020-08-04 北京百度网讯科技有限公司 Search recall method and device, server and storage medium
US20190163691A1 (en) * 2017-11-30 2019-05-30 CrowdCare Corporation Intent Based Dynamic Generation of Personalized Content from Dynamic Sources
US20190325069A1 (en) * 2018-04-18 2019-10-24 Microsoft Technology Licensing, Llc Impression-tailored computer search result page visual structures
CN108846138B (en) * 2018-07-10 2022-06-07 苏州大学 Question classification model construction method, device and medium fusing answer information
US20220043972A1 (en) * 2019-02-25 2022-02-10 Nippon Telegraph And Telephone Corporation Answer generating device, answer learning device, answer generating method, and answer generating program
CN110163281A (en) * 2019-05-20 2019-08-23 腾讯科技(深圳)有限公司 Statement classification model training method and device
CN110287296A (en) * 2019-05-21 2019-09-27 平安科技(深圳)有限公司 A kind of problem answers choosing method, device, computer equipment and storage medium
CN110750616B (en) * 2019-10-16 2023-02-03 网易(杭州)网络有限公司 Retrieval type chatting method and device and computer equipment
CN110795548A (en) * 2019-10-25 2020-02-14 招商局金融科技有限公司 Intelligent question answering method, device and computer readable storage medium
CN111125295B (en) * 2019-11-14 2023-11-24 中国农业大学 LSTM-based method and system for obtaining answers to food safety questions
CN111259127B (en) * 2020-01-15 2022-05-31 浙江大学 Long text answer selection method based on transfer learning sentence vector
CN111309887B (en) * 2020-02-24 2023-04-14 支付宝(杭州)信息技术有限公司 Method and system for training text key content extraction model
CN111639486A (en) * 2020-04-30 2020-09-08 深圳壹账通智能科技有限公司 Paragraph searching method and device, electronic equipment and storage medium
CN111858895B (en) * 2020-07-30 2024-04-05 阳光保险集团股份有限公司 Sequencing model determining method, sequencing device and electronic equipment
CN112464641B (en) * 2020-10-29 2023-01-03 平安科技(深圳)有限公司 BERT-based machine reading understanding method, device, equipment and storage medium
CN112417126B (en) * 2020-12-02 2024-01-23 车智互联(北京)科技有限公司 Question answering method, computing device and storage medium

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112131366A (en) * 2020-09-23 2020-12-25 腾讯科技(深圳)有限公司 Method, device and storage medium for training text classification model and text classification

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Answer Extraction Algorithm Based on Syntactic Structure Feature Analysis and Classification Technology; Hu Baoshun, Wang Daling, Yu Ge, Ma Ting; Chinese Journal of Computers, Vol. 31, No. 4 (2008), pp. 662-676 *

Also Published As

Publication number Publication date
CN113159187A (en) 2021-07-23

Similar Documents

Publication Publication Date Title
CN111753060B (en) Information retrieval method, apparatus, device and computer readable storage medium
CN113159187B (en) Classification model training method and device and target text determining method and device
CN107798624B (en) Technical label recommendation method in software question-and-answer community
CN111368042A (en) Intelligent question and answer method and device, computer equipment and computer storage medium
CN113127624B (en) Question-answer model training method and device
CN111159485B (en) Tail entity linking method, device, server and storage medium
CN111078847A (en) Power consumer intention identification method and device, computer equipment and storage medium
CN113220832B (en) Text processing method and device
CN110334186A (en) Data query method, apparatus, computer equipment and computer readable storage medium
CN113743119B (en) Chinese named entity recognition module, method and device and electronic equipment
CN113282729B (en) Knowledge graph-based question and answer method and device
CN115357719A (en) Power audit text classification method and device based on improved BERT model
CN114691864A (en) Text classification model training method and device and text classification method and device
CN116595026A (en) Information inquiry method
CN113961686A (en) Question-answer model training method and device, question-answer method and device
CN112463944A (en) Retrieval type intelligent question-answering method and device based on multi-model fusion
CN113538079A (en) Recommendation model training method and device, and recommendation method and device
CN117076598A (en) Semantic retrieval model fusion method and system based on self-adaptive weight
CN114547313A (en) Resource type identification method and device
CN114417863A (en) Word weight generation model training method and device and word weight generation method and device
CN113792121B (en) Training method and device of reading and understanding model, reading and understanding method and device
CN114942981A (en) Question-answer query method and device, electronic equipment and computer readable storage medium
CN114003706A (en) Keyword combination generation model training method and device
CN114818727A (en) Key sentence extraction method and device
CN116186529A (en) Training method and device for semantic understanding model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant