CN116825391A - Method, device, equipment and storage medium for question and answer sorting model - Google Patents

Method, device, equipment and storage medium for question and answer sorting model

Info

Publication number
CN116825391A
Authority
CN
China
Prior art keywords: positive, sample, training, text, negative
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310779295.6A
Other languages
Chinese (zh)
Inventor
赵越 (Zhao Yue)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202310779295.6A
Publication of CN116825391A
Legal status: Pending

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to artificial intelligence and digital healthcare, and discloses a training method, device, equipment and storage medium for an inquiry reply ranking model used in online consultation scenarios. The method comprises the following steps: acquiring a pre-constructed positive sample set, and transforming each positive sample according to a preset negative sample construction strategy, keeping its initial text unchanged, to obtain a negative sample set; assembling the positive sample set and the negative sample set according to a preset sample format to obtain a training set; performing forward prediction on each training sample with a pre-constructed inquiry reply ranking model, and deriving from the prediction result a comprehensive loss comprising a positive sample loss, a negative sample loss and a positive and negative contrast loss; and training the inquiry reply ranking model according to the comprehensive loss to obtain the trained inquiry reply ranking model. The invention can improve how closely intelligent replies fit the doctor's actual replies in online medical consultation scenarios, thereby improving doctors' consultation efficiency.

Description

Method, device, equipment and storage medium for question and answer sorting model
Technical Field
The invention relates to the technical fields of artificial intelligence and digital healthcare, and in particular to a training method, device, equipment and computer-readable storage medium for an inquiry reply ranking model in online consultation scenarios.
Background
With the development of digital healthcare, more and more medical consultation platforms have emerged. Users can consult doctors or other professionals about health problems through these platforms, which have become an important component of modern medical services.
Because a medical consultation platform bears responsibility for users' health, most consultations are answered online by real doctors. During a consultation, the platform itself also searches against the user's keywords and drafts candidate replies for the doctor's reference, so as to improve the doctor's consultation efficiency. However, the intelligent replies drafted by the platform often do not fit the doctor's intention well, and the doctor still has to screen them further, so consultation efficiency remains low.
Disclosure of Invention
The invention provides a training method, device, equipment and storage medium for an inquiry reply ranking model. Its main purpose is, in online medical consultation scenarios, to further rank the set of intelligent replies so that they fit the doctor's actual replies more closely, thereby improving the doctor's consultation efficiency.
In order to achieve the above object, the present invention provides a training method for an inquiry reply ranking model, comprising:
Step A: constructing positive samples from historical customer inquiry information and the corresponding real doctor replies to obtain a positive sample set, and transforming each real doctor reply according to a preset negative sample construction strategy, keeping its initial text unchanged, to obtain a negative sample set;
Step B: assembling the positive sample set and the negative sample set according to a preset sample format to obtain a training set;
Step C: extracting training samples from the training set in turn, splicing the history statements and the positive sample text in each training sample using a pre-constructed inquiry reply ranking model to obtain a positive spliced text, and performing quantization coding on the positive spliced text to obtain a first output code;
Step D: performing an effective-information extraction operation on the positive sample text, marking the positive spliced text according to the extraction result to obtain a positive context-information-marked text, and performing quantization coding on the marked text to obtain a second output code;
Step E: calculating a positive sample loss between the second output code and the first output code according to a contrastive learning algorithm;
Step F: applying the operations of step C, step D and step E to the negative sample in the training sample to obtain a negative sample loss;
Step G: performing fully connected score prediction on the positive sample set and the negative sample set according to a preset positive-negative sample score ranking rule to obtain a model prediction result, and deriving a positive and negative contrast loss from the model prediction result;
Step H: performing a weighted calculation on the positive sample loss, the negative sample loss and the positive and negative contrast loss according to a preset weight configuration rule to obtain a comprehensive loss;
Step I: training the inquiry reply ranking model according to the comprehensive loss to obtain the trained inquiry reply ranking model.
Optionally, the transforming each real doctor reply according to a preset negative sample construction strategy, keeping its initial text unchanged, to obtain a negative sample set includes:
performing word segmentation on the real doctor reply, fixing a preset number of the leading words, and thereby dividing the reply into a text prefix part and a modifiable part;
performing random word replacement on the modifiable part using a pre-constructed relaxed text enhancement (EDA) method to obtain a first modified part, and constructing a first-type negative sample from the first modified part and the text prefix part;
performing a repeated-text addition operation on the modifiable part according to a preset redundancy strategy to obtain a second modified part, and constructing a second-type negative sample from the second modified part and the text prefix part;
performing a wrong-diagnosis replacement operation on the modifiable part according to a preset accuracy strategy to obtain a third modified part, and constructing a third-type negative sample from the third modified part and the text prefix part;
and mixing the first-type, second-type and third-type negative samples in a preset proportion to obtain the negative sample set.
Optionally, training the inquiry reply ranking model according to the comprehensive loss to obtain the trained inquiry reply ranking model includes:
recording the comprehensive loss over time to obtain a comprehensive loss curve;
calculating a convergence score of the comprehensive loss curve, and judging whether the convergence score is smaller than a preset qualification threshold;
when the convergence score is greater than the qualification threshold, continuing to extract training samples from the positive sample set and the negative sample set and iteratively training the inquiry reply ranking model;
and when the convergence score is smaller than or equal to the qualification threshold, stopping the training process to obtain the trained inquiry reply ranking model.
Optionally, the performing an effective-information extraction operation on the positive sample text, marking the positive spliced text according to the extraction result to obtain a positive context-information-marked text, and performing quantization coding on the marked text to obtain a second output code includes:
extracting part-of-speech information for each word in the positive sample text, and calculating a relevance score between the part-of-speech information and the history statements;
weighting the history statements according to the relevance scores of the words to obtain a history re-expression statement;
and splicing the history re-expression statement with the positive sample to obtain the positive context-information-marked text, and quantizing it to obtain the second output code.
Optionally, the performing fully connected score prediction on the positive sample set and the negative sample set according to a preset positive-negative sample score ranking rule to obtain a model prediction result includes:
obtaining the inquiry reply ranking model, and calculating prediction scores for the positive sample set and the negative sample set respectively to obtain a positive sample score and a negative sample score:
score⁺ = sigmoid(FNN(output1))
score⁻ = sigmoid(FNN(output2))
where score⁺ is the positive sample score, score⁻ is the negative sample score, FNN() is a fully connected layer function, sigmoid() is a numerical mapping function, output1 is the first output code of the positive sample, and output2 is the first output code of the negative sample;
performing a function splicing operation on the positive sample score and the negative sample score to obtain a model prediction result, where score denotes the model prediction result.
Optionally, the deriving the positive and negative contrast loss from the model prediction result includes computing loss3, where loss3 is the positive and negative contrast loss, score is the model prediction result, alpha is a parameter weight, score⁺ is the positive sample score, and score⁻ is the negative sample score.
Optionally, the training sample has a structure as follows:
sample = {history′: history, pos: positive sample text, neg: corresponding negative sample text}
where sample is the training sample, history is the history statements, history′ is the history re-expression statement, pos denotes the positive sample, and neg denotes the negative sample.
In order to solve the above problem, the present invention further provides a training device for an inquiry reply ranking model, the device comprising:
a sample construction module, configured to construct positive samples from historical customer inquiry information and the corresponding real doctor replies to obtain a positive sample set, transform each real doctor reply according to a preset negative sample construction strategy, keeping its initial text unchanged, to obtain a negative sample set, and assemble the positive sample set and the negative sample set according to a preset sample format to obtain a training set;
a positive sample calculation module, configured to extract training samples from the training set in turn, splice the history statements and the positive sample text in each training sample using a pre-constructed inquiry reply ranking model to obtain a positive spliced text, perform quantization coding on the positive spliced text to obtain a first output code, perform an effective-information extraction operation on the positive sample text, mark the positive spliced text according to the extraction result to obtain a positive context-information-marked text, and perform quantization coding on the marked text to obtain a second output code;
a positive and negative sample loss module, configured to calculate a positive sample loss between the second output code and the first output code according to a contrastive learning algorithm, and apply the operations of step C, step D and step E to the negative sample in the training sample to obtain a negative sample loss;
a contrast loss module, configured to perform fully connected score prediction on the positive sample set and the negative sample set according to a preset positive-negative sample score ranking rule to obtain a model prediction result, and derive a positive and negative contrast loss from the model prediction result;
and a comprehensive training module, configured to perform a weighted calculation on the positive sample loss, the negative sample loss and the positive and negative contrast loss according to a preset weight configuration rule to obtain a comprehensive loss, and train the inquiry reply ranking model according to the comprehensive loss to obtain the trained inquiry reply ranking model.
In order to solve the above problems, the present invention also provides an electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor, the computer program being executed by the at least one processor to enable the at least one processor to perform the above training method of the inquiry reply ranking model.
In order to solve the above problems, the present invention further provides a computer-readable storage medium having at least one computer program stored therein, the at least one computer program being executed by a processor in an electronic device to implement the above training method of the inquiry reply ranking model.
In order to ensure that the reply results fit the doctor's intention, when constructing samples the embodiment of the invention builds positive samples from the doctor's real replies and transforms the positive samples into a negative sample set, then assembles the positive and negative samples into a training set according to a preset sample format. Because the training set contains both positive and negative samples, both positive-sample and negative-sample results appear during the forward pass of the model network, so a positive sample loss and a negative sample loss can be obtained. In addition, the invention combines the outputs of the positive and negative samples through contrastive learning, computes a positive and negative contrast loss, and from these computes a comprehensive loss used to train the model. The three losses make the ranking results more accurate, prevent irrelevant or low-quality replies from interfering with the doctor's judgment, and improve accuracy and reliability. Therefore, the training method, device, equipment and storage medium for the inquiry reply ranking model provided by the embodiments of the invention can, in online medical consultation scenarios, further rank the set of intelligent replies so that they fit the doctor's actual replies more closely, thereby improving the doctor's consultation efficiency.
Drawings
FIG. 1 is a schematic flowchart of a training method of an inquiry reply ranking model according to an embodiment of the present application;
FIG. 2 is a detailed flowchart of one step of the training method of the inquiry reply ranking model according to an embodiment of the present application;
FIG. 3 is a detailed flowchart of another step of the training method of the inquiry reply ranking model according to an embodiment of the present application;
FIG. 4 is a functional block diagram of a training device of the inquiry reply ranking model according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of an electronic device implementing the training method of the inquiry reply ranking model according to an embodiment of the present application.
The achievement of the objects, functional features and advantages of the present application will be further described with reference to the accompanying drawings and the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
The embodiment of the application provides a training method of an inquiry reply ranking model. The execution body of the training method includes, but is not limited to, at least one of a server, a terminal and other electronic devices that can be configured to execute the method provided by the embodiments of the application. In other words, the training method may be performed by software or hardware installed in a terminal device or a server device, and the software may be a blockchain platform. The server side includes, but is not limited to, a single server, a server cluster, a cloud server or a cloud server cluster, and the like. The server may be an independent server, or a cloud server providing cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (Content Delivery Network, CDN), and basic cloud computing services such as big data and artificial intelligence platforms.
Referring to fig. 1, a schematic flowchart of a training method of an inquiry reply ranking model according to an embodiment of the present invention is shown. In this embodiment, the training method of the inquiry reply ranking model includes steps S1 to S9:
S1, constructing positive samples from historical customer inquiry information and the corresponding real doctor replies to obtain a positive sample set, and transforming each real doctor reply according to a preset negative sample construction strategy, keeping its initial text unchanged, to obtain a negative sample set.
In the embodiment of the present invention, the customer inquiry information refers to the text of the user's question, for example, [May I ask how eczema arises].
The embodiment of the invention takes the doctor's real reply as the positive sample so that the recommendations of the intelligent system better match the doctor's view. For example, as soon as the doctor types out the five characters for "Eczema in traditional Chinese medicine", the recommended replies given by the consultation system are immediately re-ranked, and the doctor can select the one considered relevant, such as [Eczema in traditional Chinese medicine is considered related to damp-heat], [Eczema in traditional Chinese medicine …], and so on.
Furthermore, according to the preset negative sample construction strategy, the embodiment of the invention transforms each real doctor reply while keeping its initial text unchanged to obtain a negative sample set. The negative sample construction strategy is a method of modifying the positive sample text semantically and structurally, and is used to construct negative samples that could interfere with reply ranking.
In detail, referring to fig. 2, in the embodiment of the present invention, the transforming each real doctor reply according to the preset negative sample construction strategy, keeping its initial text unchanged, to obtain a negative sample set includes steps S11 to S15:
S11, performing word segmentation on the real doctor reply, fixing a preset number of the leading words, and thereby dividing the reply into a text prefix part and a modifiable part;
S12, performing random word replacement on the modifiable part using a pre-constructed relaxed text enhancement method to obtain a first modified part, and constructing a first-type negative sample from the first modified part and the text prefix part;
S13, performing a repeated-text addition operation on the modifiable part according to a preset redundancy strategy to obtain a second modified part, and constructing a second-type negative sample from the second modified part and the text prefix part;
S14, performing a wrong-diagnosis replacement operation on the modifiable part according to a preset accuracy strategy to obtain a third modified part, and constructing a third-type negative sample from the third modified part and the text prefix part;
S15, mixing the first-type, second-type and third-type negative samples in a preset proportion to obtain the negative sample set.
Here, the relaxed text enhancement method (easy data augmentation, EDA) is a simple data augmentation technique applied to text classification; the redundancy strategy refers to appending an extracted fragment of an earlier customer or doctor sentence to form a negative sample; and the accuracy strategy refers to obtaining a negative sample by splicing in an incorrect diagnosis result.
Specifically, in the embodiment of the invention, for the positive sample [Eczema in traditional Chinese medicine is considered related to damp-heat], the first few words are kept to obtain the prefix part [Eczema in traditional Chinese medicine]. On the modifiable part [is considered related to damp-heat], one word is then randomly replaced or deleted, or the positions of two words are randomly swapped, giving negative samples such as [Eczema in traditional Chinese medicine related is considered to damp-heat]. Half of a doctor's sentence from the same consultation can also be randomly intercepted and appended after the prefix, giving negative samples such as [Eczema in traditional Chinese medicine is related to external factors such as living environment and external contact].
According to the redundancy strategy, a duplicate-content negative sample is constructed from the positive sample [Eczema in traditional Chinese medicine is considered related to damp-heat]: a message sent by the doctor is randomly selected from the consultation and appended after the prefix, giving negative samples such as [Eczema in traditional Chinese medicine is considered related to damp-heat; what medicine have you used recently].
According to the accuracy strategy, negative samples that are not contextually relevant are constructed by selecting, from the doctor's other consultations, a message whose first half is similar to the positive sample and intercepting its second half as the completion; negative samples with diagnostic errors are constructed by randomly replacing the diagnosis in the positive sample with another diagnosis. The negative samples thus obtained may be, for example, [Facial dermatitis should be treated as early as possible; how long has it lasted] and [How long has the hair loss lasted; based on your situation this is considered seborrheic dermatitis].
Constructing negative samples with these several strategies makes them richer and further improves the training effect, as illustrated in the sketch below. In addition, in the embodiment of the invention, more comprehensive negative samples can be obtained through other negative sampling methods, including in-batch negative sampling and random negative sampling.
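By way of a hedged illustration, the sketch below implements the three strategies in Python. The jieba segmenter, the function names and the three-word prefix length are assumptions made for the example, not details fixed by the invention.

```python
# Illustrative sketch of the three negative-sample construction strategies.
import random
import jieba  # widely used Chinese word-segmentation library (assumed choice)

def split_reply(reply: str, prefix_len: int = 3):
    """Segment the doctor's real reply and fix the leading words as the prefix."""
    words = list(jieba.cut(reply))
    return words[:prefix_len], words[prefix_len:]

def eda_negative(prefix, modifiable):
    """Type 1: EDA-style edit - randomly swap two words in the modifiable part."""
    part = list(modifiable)
    if len(part) >= 2:
        i, j = random.sample(range(len(part)), 2)
        part[i], part[j] = part[j], part[i]
    return "".join(prefix) + "".join(part)

def redundancy_negative(prefix, modifiable, consult_messages):
    """Type 2: redundancy - append a random message from the same consultation."""
    return "".join(prefix) + "".join(modifiable) + random.choice(consult_messages)

def accuracy_negative(prefix, wrong_diagnoses):
    """Type 3: accuracy - complete the prefix with an incorrect diagnosis."""
    return "".join(prefix) + random.choice(wrong_diagnoses)
```

The three types would then be mixed in the preset proportion to form the negative sample set.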
S2, constructing the positive sample set and the negative sample set according to a preset sample format to obtain a training set.
In the embodiment of the invention, the preset sample format is as follows:
sample = {history′: history, pos: positive sample text, neg: corresponding negative sample text}
where sample is the training sample, history is the history statements, history′ is the history re-expression statement, pos denotes the positive sample, and neg denotes the negative sample.
In the embodiment of the invention, the consultation history is history = [a₁, a₂, …, aₙ], where each aᵢ = {text: string, sender: int} represents one message in the dialogue, stored in json format: the text key holds the text content of the message itself, and the sender key identifies the user who sent the message, with 0 representing the patient and 1 representing the doctor.
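By way of illustration, a minimal Python rendering of this sample format follows; the key names match the notation above, while the message texts are placeholders taken from the examples in this description.

```python
# Minimal rendering of one training sample in the format described above.
history = [
    {"text": "May I ask how eczema arises", "sender": 0},               # 0 = patient
    {"text": "Eczema in traditional Chinese medicine", "sender": 1},    # 1 = doctor
]

sample = {
    "history": history,  # the dialogue so far, one json message per entry
    "pos": "Eczema in traditional Chinese medicine is considered related to damp-heat",
    "neg": "Eczema in traditional Chinese medicine; what medicine have you used recently",
}
```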
S3, extracting training samples from the training set in turn, splicing the history statements and the positive sample text in each training sample using a pre-constructed inquiry reply ranking model to obtain a positive spliced text, and performing quantization coding on the positive spliced text to obtain a first output code.
In the embodiment of the invention, the inquiry reply ranking model is a downstream model that ranks the reply candidates of the intelligent consultation robot: it re-ranks the candidate replies according to the keywords the doctor types, so that the doctor can quickly find the sentence he or she wants to express, improving the doctor's online consultation efficiency.
In the embodiment of the invention, according to the sample format, every training sample carries history statements, a positive sample text and a negative sample text, so the inquiry reply ranking model can process the positive sample and the negative sample within one training pass, and the processing principle is the same for both.
The invention first splices the history dialogue and the positive sample into a positive spliced text input1, and then encodes input1 with BERT to obtain the first output code output1.
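A minimal sketch of this splice-and-encode step follows, continuing the sample above and assuming a HuggingFace bert-base-chinese checkpoint with [CLS] pooling; the invention names BERT but fixes neither the checkpoint nor the pooling scheme.

```python
# Splice history + positive sample, then encode with BERT (S3).
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
encoder = AutoModel.from_pretrained("bert-base-chinese")

def encode(text: str) -> torch.Tensor:
    """Quantization coding: map a text to its [CLS] vector (keep gradients when training)."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    return encoder(**inputs).last_hidden_state[:, 0]  # shape (1, hidden)

history_text = " ".join(m["text"] for m in sample["history"])
input1 = history_text + " [SEP] " + sample["pos"]  # positive spliced text
output1 = encode(input1)                           # first output code
```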
S4, performing an effective-information extraction operation on the positive sample text, marking the positive spliced text according to the extraction result to obtain a positive context-information-marked text, and performing quantization coding on the marked text to obtain a second output code.
In detail, referring to fig. 3, in the embodiment of the present invention, the performing an effective-information extraction operation on the positive sample text, marking the positive spliced text according to the extraction result to obtain a positive context-information-marked text, and performing quantization coding on the marked text to obtain a second output code includes steps S41 to S43:
S41, extracting part-of-speech information for each word in the positive sample text, and calculating a relevance score between the part-of-speech information and the history statements;
S42, weighting the history statements according to the relevance scores of the words to obtain a history re-expression statement;
S43, splicing the history re-expression statement with the positive sample to obtain the positive context-information-marked text, and quantizing it to obtain the second output code.
Specifically, after the first output code output1 is obtained, input1 is passed through a re-expression module, which extracts the effective information from the history dialogue and re-expresses the context; the re-expressed text input1′ is then encoded with BERT to obtain the second output code output1′.
The re-expression module operates as follows:
The method first calculates the relevance score between the part-of-speech information and the history statements according to the positive sample, and then re-expresses the dialogue history:
w=cosine(pos,history)
history′=w*history
Here history′ carries the effective information of the history dialogue, such as part-of-speech information. The re-expressed input input1′ is:
input1′ = history′ + pos
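A hedged sketch of the re-expression module follows. It applies w = cosine(pos, history) and history′ = w · history at the sentence-embedding level and reads the "+" in input1′ = history′ + pos as vector addition; both readings are assumptions, since the description leaves the exact representation level open.

```python
# Re-expression module: weight each history message by its cosine relevance
# to the positive sample (embedding-level interpretation, an assumption).
import torch
import torch.nn.functional as F

def re_express(history_msgs, pos_text):
    pos_vec = encode(pos_text)                               # (1, hidden)
    msg_vecs = torch.cat([encode(m) for m in history_msgs])  # (n, hidden)
    w = F.cosine_similarity(msg_vecs, pos_vec)               # relevance per message
    return (w.unsqueeze(1) * msg_vecs).sum(dim=0)            # history' as weighted sum

history_prime = re_express([m["text"] for m in sample["history"]], sample["pos"])
# input1' = history' + pos, read here as vector addition (assumption):
output1_prime = history_prime + encode(sample["pos"]).squeeze(0)
```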
s5, calculating positive sample loss between the second output code and the first output code according to a contrast learning algorithm.
In the embodiment of the invention, the goal of the contrastive learning is to bring the original code output1 and the code output1′ obtained after extracting the effective information sufficiently close, so the loss is a mean squared error:
loss1=MSELoss(output1,output1′)
where MSELoss denotes the mean squared error.
S6, applying the operations of steps S3 to S5 to the negative sample in the training sample to obtain a negative sample loss.
Likewise, the contrastive learning loss for the negative sample is:
loss2=MSELoss(output2,output2′)
In the embodiment of the invention, loss1 is taken as the positive sample loss and loss2 as the negative sample loss; a sketch of both follows.
S7, performing fully connected score prediction on the positive sample set and the negative sample set according to a preset positive-negative sample score ranking rule to obtain a model prediction result, and deriving a positive and negative contrast loss from the model prediction result.
In detail, in the embodiment of the present invention, the performing fully connected score prediction on the positive sample set and the negative sample set according to a preset positive-negative sample score ranking rule to obtain a model prediction result includes:
obtaining the inquiry reply ranking model, and calculating prediction scores for the positive sample set and the negative sample set respectively to obtain a positive sample score and a negative sample score:
score⁺ = sigmoid(FNN(output1))
score⁻ = sigmoid(FNN(output2))
where score⁺ is the positive sample score, score⁻ is the negative sample score, FNN() is a fully connected layer function, sigmoid() is a numerical mapping function, output1 is the first output code of the positive sample, and output2 is the first output code of the negative sample;
performing a function splicing operation on the positive sample score and the negative sample score to obtain a model prediction result, where score denotes the model prediction result.
In the embodiment of the invention, score⁺ and score⁻ are obtained by passing the original outputs output1 and output2 of the positive and negative samples through one fully connected neural network layer FNN and then a sigmoid function, so their value range is [0, 1].
The loss loss3 between the positive and negative sample pair is then computed, where loss3 is the positive and negative contrast loss, score is the model prediction result, alpha is a parameter weight, score⁺ is the positive sample score, and score⁻ is the negative sample score.
S8, performing a weighted calculation on the positive sample loss, the negative sample loss and the positive and negative contrast loss according to a preset weight configuration rule to obtain a comprehensive loss.
In the embodiment of the invention, each weight coefficient should be set according to the specific situation; if every weight coefficient is 1, the final comprehensive loss is:
loss=loss1+loss2+loss3
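A sketch of the scoring head and the comprehensive loss follows. Because the loss3 formula itself does not appear in this text, the margin form below, with alpha as the margin, is only one plausible reading of the surrounding description, not the confirmed formula of the invention.

```python
# Scoring head and comprehensive loss (loss3 form is an assumption).
import torch
import torch.nn as nn
import torch.nn.functional as F

fnn = nn.Linear(encoder.config.hidden_size, 1)   # fully connected scoring layer

score_pos = torch.sigmoid(fnn(output1))          # score+ in [0, 1]
score_neg = torch.sigmoid(fnn(output2))          # score- in [0, 1]

alpha = 0.3                                      # parameter weight (assumed value)
# Assumed margin form: penalize the positive score exceeding the negative
# score by less than alpha.
loss3 = F.relu(alpha - (score_pos - score_neg)).mean()

# Weighted combination per the preset weight configuration rule; with all
# weights set to 1 this reduces to loss = loss1 + loss2 + loss3.
w1 = w2 = w3 = 1.0
loss = w1 * loss1 + w2 * loss2 + w3 * loss3
```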
S9, training the inquiry reply ranking model according to the comprehensive loss to obtain the trained inquiry reply ranking model.
In detail, in the embodiment of the present invention, training the inquiry reply ranking model according to the comprehensive loss to obtain the trained inquiry reply ranking model includes:
recording the comprehensive loss over time to obtain a comprehensive loss curve;
calculating a convergence score of the comprehensive loss curve, and judging whether the convergence score is smaller than a preset qualification threshold;
when the convergence score is greater than the qualification threshold, continuing to extract training samples from the positive sample set and the negative sample set and iteratively training the inquiry reply ranking model;
and when the convergence score is smaller than or equal to the qualification threshold, stopping the training process to obtain the trained inquiry reply ranking model.
In the embodiment of the invention, the comprehensive loss represents the error of the model. While the comprehensive loss has not converged, training is still making clear progress and can continue. Once the comprehensive loss converges, the training effect has plateaued, and to avoid overfitting the training process is stopped in time, yielding the trained inquiry reply ranking model.
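The following training-loop sketch illustrates this stopping rule. The convergence score is not defined in this description, so the relative change between two moving-average windows of the loss curve is used as a plausible stand-in, and compute_comprehensive_loss is a hypothetical helper wrapping the loss1 + loss2 + loss3 computation sketched above.

```python
# Training loop with convergence-based early stopping (convergence score
# definition is an assumption).
from statistics import mean
import torch

optimizer = torch.optim.AdamW(
    list(encoder.parameters()) + list(fnn.parameters()), lr=2e-5
)
qualification_threshold = 1e-3
loss_curve = []

for batch in training_set:                      # training_set assumed iterable
    loss = compute_comprehensive_loss(batch)    # hypothetical helper, see above
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    loss_curve.append(loss.item())
    if len(loss_curve) >= 200:                  # compare two 100-step windows
        prev = mean(loss_curve[-200:-100])
        curr = mean(loss_curve[-100:])
        convergence_score = abs(prev - curr) / max(prev, 1e-8)
        if convergence_score <= qualification_threshold:
            break                               # stop in time to avoid overfitting
```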
In order to ensure that the reply results fit the doctor's intention, when constructing samples the embodiment of the invention builds positive samples from the doctor's real replies and transforms the positive samples into a negative sample set, then assembles the positive and negative samples into a training set according to a preset sample format. Because the training set contains both positive and negative samples, both positive-sample and negative-sample results appear during the forward pass of the model network, so a positive sample loss and a negative sample loss can be obtained. In addition, the invention combines the outputs of the positive and negative samples through contrastive learning, computes a positive and negative contrast loss, and from these computes a comprehensive loss used to train the model. The three losses make the ranking results more accurate, prevent irrelevant or low-quality replies from interfering with the doctor's judgment, and improve accuracy and reliability. Therefore, the training method of the inquiry reply ranking model provided by the embodiment of the invention can, in online medical consultation scenarios, further rank the set of intelligent replies so that they fit the doctor's actual replies more closely, thereby improving the doctor's consultation efficiency.
Fig. 4 is a functional block diagram of a training device of the inquiry reply ranking model according to an embodiment of the present invention.
The training device 100 of the inquiry reply ranking model of the present invention may be installed in an electronic device. Depending on the functions implemented, the training device 100 may include a sample construction module 101, a positive sample calculation module 102, a positive and negative sample loss module 103, a contrast loss module 104 and a comprehensive training module 105. A module of the invention, which may also be referred to as a unit, is a series of computer program segments stored in the memory of the electronic device that can be executed by the processor of the electronic device and perform a fixed function.
In this embodiment, the functions of the respective modules/units are as follows:
the sample construction module 101 is configured to construct a positive sample by using historical customer inquiry information and corresponding real answers of doctors to obtain a positive sample set, reconstruct sentences with unchanged initial text of each real answer of doctors according to a preset negative sample construction strategy to obtain a negative sample set, and construct the positive sample set and the negative sample set according to a preset sample format to obtain a training set;
The positive sample calculation module 102 is configured to sequentially extract a training sample from the training set, perform a history statement-positive sample splicing operation on the training sample according to a history statement and a positive sample text in the training sample by using a pre-constructed inquiry answer sorting model, obtain a positive spliced text, perform quantization encoding on the positive spliced text to obtain a first output code, perform an effective information extraction operation on the positive sample text, perform a marking on the positive spliced text according to an extraction result to obtain a positive information marking text, and perform quantization encoding on the positive information marking text to obtain a second output code;
the positive and negative sample loss module 103 is configured to calculate positive sample loss between the second output code and the first output code according to a contrast learning algorithm, and output negative samples in the training samples according to the operation steps of the step C, the step D, and the step E, so as to obtain negative sample loss;
the contrast loss module 104 is configured to predict full-connection scores of the positive sample set and the negative sample set according to a preset positive and negative sample score ordering rule, obtain a model prediction result, and obtain positive and negative contrast loss according to the model prediction result;
The comprehensive training module 105 is configured to perform weighted calculation on the positive sample loss, the negative sample loss, and the positive and negative contrast loss according to a preset weight configuration rule to obtain a comprehensive loss, and train the inquiry answer sorting model according to the comprehensive loss to obtain a trained inquiry answer sorting model.
In detail, each module in the training device 100 for the query reply ranking model in the embodiment of the present application adopts the same technical means as the training method for the query reply ranking model described in fig. 1 to 3, and can produce the same technical effects, which are not described herein.
Fig. 5 is a schematic structural diagram of an electronic device 1 implementing the training method of the inquiry reply ranking model according to an embodiment of the present application.
The electronic device 1 may comprise a processor 10, a memory 11, a communication bus 12 and a communication interface 13, and may further comprise a computer program stored in the memory 11 and executable on the processor 10, such as a training program of the inquiry reply ranking model.
In some embodiments, the processor 10 may be formed by a single packaged integrated circuit, or by a plurality of integrated circuits packaged with the same or different functions, including one or more central processing units (Central Processing Unit, CPU), microprocessors, digital processing chips, graphics processors, combinations of various control chips, and so on. The processor 10 is the control unit of the electronic device 1: it connects the components of the entire electronic device using various interfaces and lines, runs the programs or modules stored in the memory 11 (for example, the training program of the inquiry reply ranking model), and invokes the data stored in the memory 11 to perform the various functions of the electronic device and process data.
The memory 11 includes at least one type of readable storage medium, including flash memory, removable hard disks, multimedia cards, card-type memory (e.g., SD or DX memory), magnetic memory, magnetic disks, optical disks, and so on. In some embodiments the memory 11 may be an internal storage unit of the electronic device, such as a built-in hard disk of the electronic device. In other embodiments the memory 11 may be an external storage device of the electronic device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a Flash Card provided on the electronic device. Further, the memory 11 may include both an internal storage unit and an external storage device of the electronic device. The memory 11 can be used not only to store application software installed in the electronic device and various types of data, such as the code of the training program of the inquiry reply ranking model, but also to temporarily store data that has been or will be output.
The communication bus 12 may be a peripheral component interconnect standard (Peripheral Component Interconnect, PCI) bus, or an extended industry standard architecture (Extended Industry Standard Architecture, EISA) bus, among others. The bus may be classified as an address bus, a data bus, a control bus, etc. The bus is arranged to enable a connection communication between the memory 11 and at least one processor 10 etc.
The communication interface 13 is used for communication between the electronic device 1 and other devices, including a network interface and a user interface. Optionally, the network interface may include a wired interface and/or a wireless interface (e.g., WI-FI interface, bluetooth interface, etc.), typically used to establish a communication connection between the electronic device and other electronic devices. The user interface may be a Display (Display), an input unit such as a Keyboard (Keyboard), or alternatively a standard wired interface, a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch, or the like. The display may also be referred to as a display screen or display unit, as appropriate, for displaying information processed in the electronic device and for displaying a visual user interface.
Fig. 5 shows only an electronic device with certain components; those skilled in the art will understand that the structure shown in fig. 5 does not constitute a limitation of the electronic device 1, which may comprise fewer or more components than shown, combine certain components, or arrange the components differently.
For example, although not shown, the electronic device 1 may further include a power source (such as a battery) for supplying power to each component, and preferably, the power source may be logically connected to the at least one processor 10 through a power management device, so that functions of charge management, discharge management, power consumption management, and the like are implemented through the power management device. The power supply may also include one or more of any of a direct current or alternating current power supply, recharging device, power failure detection circuit, power converter or inverter, power status indicator, etc. The electronic device 1 may further include various sensors, bluetooth modules, wi-Fi modules, etc., which will not be described herein.
It should be understood that the embodiments described are for illustrative purposes only, and the scope of the patent application is not limited to this configuration.
The training program of the inquiry reply ranking model stored in the memory 11 of the electronic device 1 is a combination of instructions that, when executed by the processor 10, can implement:
Step A: constructing positive samples from historical customer inquiry information and the corresponding real doctor replies to obtain a positive sample set, and transforming each real doctor reply according to a preset negative sample construction strategy, keeping its initial text unchanged, to obtain a negative sample set;
Step B: assembling the positive sample set and the negative sample set according to a preset sample format to obtain a training set;
Step C: extracting training samples from the training set in turn, splicing the history statements and the positive sample text in each training sample using a pre-constructed inquiry reply ranking model to obtain a positive spliced text, and performing quantization coding on the positive spliced text to obtain a first output code;
Step D: performing an effective-information extraction operation on the positive sample text, marking the positive spliced text according to the extraction result to obtain a positive context-information-marked text, and performing quantization coding on the marked text to obtain a second output code;
Step E: calculating a positive sample loss between the second output code and the first output code according to a contrastive learning algorithm;
Step F: applying the operations of step C, step D and step E to the negative sample in the training sample to obtain a negative sample loss;
Step G: performing fully connected score prediction on the positive sample set and the negative sample set according to a preset positive-negative sample score ranking rule to obtain a model prediction result, and deriving a positive and negative contrast loss from the model prediction result;
Step H: performing a weighted calculation on the positive sample loss, the negative sample loss and the positive and negative contrast loss according to a preset weight configuration rule to obtain a comprehensive loss;
Step I: training the inquiry reply ranking model according to the comprehensive loss to obtain the trained inquiry reply ranking model.
In particular, for the specific implementation of the above instructions by the processor 10, reference may be made to the description of the relevant steps in the embodiments corresponding to the drawings, which is not repeated here.
Further, if the modules/units integrated in the electronic device 1 are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. The computer-readable storage medium may be volatile or non-volatile. For example, the computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, and a read-only memory (ROM).
The present invention also provides a computer-readable storage medium storing a computer program that, when executed by a processor of an electronic device, can implement:
Step A: constructing positive samples from historical customer inquiry information and the corresponding real doctor replies to obtain a positive sample set, and transforming each real doctor reply according to a preset negative sample construction strategy, keeping its initial text unchanged, to obtain a negative sample set;
Step B: assembling the positive sample set and the negative sample set according to a preset sample format to obtain a training set;
Step C: extracting training samples from the training set in turn, splicing the history statements and the positive sample text in each training sample using a pre-constructed inquiry reply ranking model to obtain a positive spliced text, and performing quantization coding on the positive spliced text to obtain a first output code;
Step D: performing an effective-information extraction operation on the positive sample text, marking the positive spliced text according to the extraction result to obtain a positive context-information-marked text, and performing quantization coding on the marked text to obtain a second output code;
Step E: calculating a positive sample loss between the second output code and the first output code according to a contrastive learning algorithm;
Step F: applying the operations of step C, step D and step E to the negative sample in the training sample to obtain a negative sample loss;
Step G: performing fully connected score prediction on the positive sample set and the negative sample set according to a preset positive-negative sample score ranking rule to obtain a model prediction result, and deriving a positive and negative contrast loss from the model prediction result;
Step H: performing a weighted calculation on the positive sample loss, the negative sample loss and the positive and negative contrast loss according to a preset weight configuration rule to obtain a comprehensive loss;
Step I: training the inquiry reply ranking model according to the comprehensive loss to obtain the trained inquiry reply ranking model.
In the several embodiments provided in the present invention, it should be understood that the disclosed apparatus, device and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical function division, and there may be other manners of division when actually implemented.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated units can be implemented in the form of hardware, or in the form of hardware plus software functional modules.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof.
The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
A blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms. A blockchain, essentially a decentralised database, is a chain of data blocks generated in association by cryptographic means, each data block containing a batch of network transaction information used to verify the validity of the information (anti-counterfeiting) and to generate the next block. A blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
The embodiments of the application may acquire and process the relevant data based on artificial intelligence technology. Artificial intelligence (Artificial Intelligence, AI) is the theory, method, technology and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use knowledge to obtain the best results.
Furthermore, it is evident that the word "comprising" does not exclude other elements or steps, and that the singular does not exclude the plural. A plurality of units or means recited in the system claims may also be implemented by one unit or means through software or hardware. The terms first, second, etc. are used to denote names and do not imply any particular order.
Finally, it should be noted that the above-mentioned embodiments are merely for illustrating the technical solution of the present application and not for limiting the same, and although the present application has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made to the technical solution of the present application without departing from the spirit and scope of the technical solution of the present application.

Claims (10)

1. A training method for an inquiry reply ranking model, the method comprising:
Step A: constructing positive samples from historical customer inquiry information and the corresponding real doctor replies to obtain a positive sample set, and transforming each real doctor reply according to a preset negative sample construction strategy, keeping its initial text unchanged, to obtain a negative sample set;
Step B: assembling the positive sample set and the negative sample set according to a preset sample format to obtain a training set;
Step C: extracting training samples from the training set in turn, splicing the history statements and the positive sample text in each training sample using a pre-constructed inquiry reply ranking model to obtain a positive spliced text, and performing quantization coding on the positive spliced text to obtain a first output code;
Step D: performing an effective-information extraction operation on the positive sample text, marking the positive spliced text according to the extraction result to obtain a positive context-information-marked text, and performing quantization coding on the marked text to obtain a second output code;
Step E: calculating a positive sample loss between the second output code and the first output code according to a contrastive learning algorithm;
Step F: applying the operations of step C, step D and step E to the negative sample in the training sample to obtain a negative sample loss;
Step G: performing fully connected score prediction on the positive sample set and the negative sample set according to a preset positive-negative sample score ranking rule to obtain a model prediction result, and deriving a positive and negative contrast loss from the model prediction result;
Step H: performing a weighted calculation on the positive sample loss, the negative sample loss and the positive and negative contrast loss according to a preset weight configuration rule to obtain a comprehensive loss;
Step I: training the inquiry reply ranking model according to the comprehensive loss to obtain the trained inquiry reply ranking model.
2. The method of training an inquiry answer ranking model according to claim 1, wherein performing the prefix-preserving sentence transformation on each real doctor reply according to the preset negative sample construction strategy to obtain the negative sample set comprises:
performing word segmentation on the real doctor reply, randomly fixing a preset number of leading segmented words, and thereby dividing the real doctor reply into a text prefix part and a modifiable part;
performing random word replacement on the modifiable part with a pre-constructed relaxed text enhancement method to obtain a first modified part, and constructing a first type negative sample from the first modified part and the text prefix part;
performing a repeated text adding operation on the modifiable part according to a preset redundancy strategy to obtain a second modified part, and constructing a second type negative sample from the second modified part and the text prefix part;
performing an erroneous diagnosis result replacement operation on the modifiable part according to a preset accuracy policy to obtain a third modified part, and constructing a third type negative sample from the third modified part and the text prefix part;
and mixing the first type, second type and third type negative samples according to a preset collocation proportion to obtain the negative sample set.
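The three transformation types of claim 2 can be pictured with the token-level sketch below. It assumes word segmentation has already been performed upstream, and the vocabulary and diagnosis substitution map are hypothetical inputs; the claim's "relaxed text enhancement method" is approximated here by simple random replacement.

import random

def split_reply(tokens, k=3):
    # Fix the first k tokens as the immutable text prefix; k is a
    # placeholder for the claim's "preset numerical value".
    return tokens[:k], tokens[k:]

def first_type(prefix, body, vocab, ratio=0.2):
    # Type 1: random word replacement in the modifiable part.
    out = [random.choice(vocab) if random.random() < ratio else w for w in body]
    return prefix + out

def second_type(prefix, body, span=2):
    # Type 2: redundancy - duplicate a short span of the modifiable part.
    i = random.randrange(max(1, len(body) - span + 1))
    return prefix + body[:i + span] + body[i:i + span] + body[i + span:]

def third_type(prefix, body, wrong_diagnosis):
    # Type 3: swap correct diagnosis terms for incorrect ones
    # (wrong_diagnosis is a hypothetical term-substitution map).
    return prefix + [wrong_diagnosis.get(w, w) for w in body]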
3. The method of training an inquiry answer ranking model according to claim 1, wherein training the inquiry answer ranking model according to the comprehensive loss to obtain the trained inquiry answer ranking model comprises:
recording the comprehensive loss to obtain a comprehensive loss curve;
calculating a convergence score of the comprehensive loss curve, and judging whether the convergence score is less than or equal to a preset qualification threshold;
when the convergence score is greater than the qualification threshold, continuing to extract training samples from the positive sample set and the negative sample set and iteratively training the inquiry answer ranking model;
and when the convergence score is less than or equal to the qualification threshold, stopping the training process to obtain the trained inquiry answer ranking model.
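Claim 3 leaves the convergence score itself undefined; one plausible reading, sketched below, scores convergence by how little the recorded loss curve still moves over a recent window. The window size and threshold are assumptions.

def convergence_score(loss_curve, window=10):
    # Mean absolute change of the loss over the last `window` iterations;
    # small values indicate a flat (converged) curve.
    if len(loss_curve) < window + 1:
        return float("inf")
    recent = loss_curve[-(window + 1):]
    return sum(abs(b - a) for a, b in zip(recent, recent[1:])) / window

def should_stop(loss_curve, qualification_threshold=1e-3):
    # Stop training once the score is at or below the qualification threshold.
    return convergence_score(loss_curve) <= qualification_threshold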
4. The method of training an inquiry answer ranking model according to claim 1, wherein performing the effective information extraction operation on the positive sample text, marking the positive spliced text according to the extraction result to obtain the positive information marking text, and quantization-encoding the positive information marking text to obtain the second output code comprises:
extracting part-of-speech information of each segmented word in the positive sample text, and calculating a relevance score between the part-of-speech information and the history statement;
performing a weighted calculation on the history statement according to the relevance score of each segmented word to obtain a history re-expression statement;
and splicing the history re-expression statement with the positive sample to obtain the positive information marking text, and quantization-encoding it to obtain the second output code.
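A minimal numerical sketch of claim 4's re-expression step, assuming token embeddings are already available. The claim does not say how relevance is computed; dot-product similarity followed by a softmax stands in for it here.

import numpy as np

def history_reexpression(history_vecs, positive_vecs):
    # history_vecs: (H, d) embeddings of history-statement tokens;
    # positive_vecs: (P, d) embeddings of positive-sample content words.
    rel = history_vecs @ positive_vecs.T               # (H, P) relevance scores
    weights = rel.max(axis=1)                          # best match per history token
    weights = np.exp(weights) / np.exp(weights).sum()  # normalize to sum to 1
    return weights[:, None] * history_vecs             # weighted "re-expression"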
5. The method of training an inquiry answer ranking model according to claim 1, wherein performing full-connection score prediction on the positive sample set and the negative sample set according to the preset positive and negative sample score ranking rule to obtain the model prediction result comprises:
obtaining the inquiry answer ranking model, and calculating prediction scores for the positive sample set and the negative sample set respectively to obtain a positive sample score and a negative sample score:
score+ = sigmoid(FNN(output1))
score- = sigmoid(FNN(output2))
where score+ is the positive sample score, score- is the negative sample score, FNN() is a fully connected layer function, sigmoid() is a numerical mapping function, output1 is the first output code corresponding to the positive sample, and output2 is the first output code corresponding to the negative sample;
and performing a function splicing operation on the positive sample score and the negative sample score to obtain a model prediction result score, where score is the model prediction result.
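Claim 5's two scoring formulas map directly onto a small PyTorch head, sketched below. The hidden size and the use of concatenation as the "function splicing operation" are assumptions; the patent's splicing formula appears only as an image in the source.

import torch
import torch.nn as nn

class ScoreHead(nn.Module):
    # score = sigmoid(FNN(output)) for each of the two output codes.
    def __init__(self, hidden=768):          # hidden size is a placeholder
        super().__init__()
        self.fnn = nn.Linear(hidden, 1)      # the fully connected layer FNN()

    def forward(self, output1, output2):
        score_pos = torch.sigmoid(self.fnn(output1))  # positive sample score
        score_neg = torch.sigmoid(self.fnn(output2))  # negative sample score
        # "Function splicing" of the two scores into one prediction result;
        # plain concatenation is assumed here.
        return torch.cat([score_pos, score_neg], dim=-1)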
6. The method of training an inquiry answer ranking model according to claim 1, wherein obtaining the positive and negative contrast loss from the model prediction result comprises computing:
where loss3 is the positive and negative contrast loss, score is the model prediction result, α is a parameter weight, score+ is the positive sample score, and score- is the negative sample score.
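The source gives the loss3 formula only as an image, so it cannot be reproduced here. Purely as an illustration, a common margin-style ranking loss with the parameter weight α plays the same role: it is low when the positive score exceeds the negative score by at least α. This is a stand-in, not the patent's formula.

import torch

def contrast_loss(score_pos, score_neg, alpha=0.3):
    # Illustrative stand-in for loss3, NOT the patent's formula:
    # penalize cases where score_pos does not beat score_neg by margin alpha.
    return torch.clamp(alpha - (score_pos - score_neg), min=0).mean()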
7. The method of training an inquiry answer ranking model according to claim 1, wherein the training samples are structured as follows:
sample = {history': history, pos: positive sample text, neg: corresponding negative sample text}
where sample is the training sample, history is the history statement, history' is the history re-expression statement, pos denotes the positive sample, and neg denotes the negative sample.
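Read as a Python dictionary, claim 7's sample format looks like the sketch below; the field values are illustrative placeholders, not data from the patent.

sample = {
    "history'": "history re-expression statement",  # weighted rewrite of the history statement
    "pos": "doctor's real reply",                   # positive sample text
    "neg": "transformed reply",                     # corresponding negative sample text
}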
8. A training device for an inquiry answer ranking model, the device comprising:
a sample construction module, configured to construct positive samples from historical customer inquiry information and the corresponding real doctor replies to obtain a positive sample set, perform the prefix-preserving sentence transformation on the real doctor replies according to a preset negative sample construction strategy to obtain a negative sample set, and construct a training set from the positive sample set and the negative sample set according to a preset sample format;
a positive sample calculation module, configured to sequentially extract a training sample from the training set, perform, with a pre-constructed inquiry answer ranking model, a history statement-positive sample splicing operation on the history statement and the positive sample text in the training sample to obtain a positive spliced text, quantization-encode the positive spliced text to obtain a first output code, perform an effective information extraction operation on the positive sample text, mark the positive spliced text according to the extraction result to obtain a positive information marking text, and quantization-encode the positive information marking text to obtain a second output code;
a positive and negative sample loss module, configured to calculate a positive sample loss between the second output code and the first output code according to a contrastive learning algorithm, and process the negative sample in the training sample according to the same operations as steps C, D and E of the method to obtain a negative sample loss;
a contrast loss module, configured to perform full-connection score prediction on the positive sample set and the negative sample set according to a preset positive and negative sample score ranking rule to obtain a model prediction result, and obtain a positive and negative contrast loss from the model prediction result;
a comprehensive training module, configured to perform a weighted calculation on the positive sample loss, the negative sample loss and the positive and negative contrast loss according to a preset weight configuration rule to obtain a comprehensive loss, and train the inquiry answer ranking model according to the comprehensive loss to obtain the trained inquiry answer ranking model.
9. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores a computer program executable by the at least one processor to enable the at least one processor to perform the method of training an inquiry answer ranking model according to any one of claims 1 to 7.
10. A computer readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the method of training an inquiry answer ranking model according to any one of claims 1 to 7.

Priority Applications (1)

Application Number: CN202310779295.6A (published as CN116825391A) · Priority Date: 2023-06-28 · Filing Date: 2023-06-28 · Title: Method, device, equipment and storage medium for question and answer sorting model

Publications (1)

Publication Number: CN116825391A · Publication Date: 2023-09-29

Family ID: 88123713

Family Applications (1)

CN202310779295.6A (Pending, published as CN116825391A) — Title: Method, device, equipment and storage medium for question and answer sorting model

Country Status (1)

CN — CN116825391A

Legal Events

PB01 — Publication
SE01 — Entry into force of request for substantive examination