CN117851563A - Automatic question answering method and device, electronic equipment and readable storage medium - Google Patents

Automatic question answering method and device, electronic equipment and readable storage medium

Info

Publication number
CN117851563A
CN117851563A (application number CN202311753796.3A)
Authority
CN
China
Prior art keywords
preset
target
answer
questions
question
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311753796.3A
Other languages
Chinese (zh)
Inventor
黄平
黄明星
李银锋
杨传华
沈鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Shuidi Technology Group Co ltd
Original Assignee
Beijing Shuidi Technology Group Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Shuidi Technology Group Co ltd filed Critical Beijing Shuidi Technology Group Co ltd
Priority to CN202311753796.3A priority Critical patent/CN117851563A/en
Publication of CN117851563A publication Critical patent/CN117851563A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Mathematical Physics (AREA)
  • Databases & Information Systems (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses an automatic question answering method and apparatus, an electronic device, and a readable storage medium. The method comprises the following steps: receiving an input question sent by a client; determining at least one target question corresponding to the input question according to a preset similar question matching model and a plurality of preset questions; determining at least one answer set corresponding to the at least one target question according to a preset knowledge base; generating a target answer according to a preset language model, the at least one target question and the at least one answer set; and sending the target answer to the client. By rewriting the structured answer sets in a personalized way, natural and fluent target answers are generated, the flexibility and adaptability of the answers are improved, the final output is easier to read and understand, the user experience is improved, and the practicality of automatic question answering is enhanced.

Description

Automatic question answering method and device, electronic equipment and readable storage medium
Technical Field
The present application relates to the field of artificial intelligence question answering technologies, and in particular, to an automatic question answering method, an apparatus, an electronic device, and a readable storage medium.
Background
With the development of natural language processing technology, automatic question answering has matured through accumulated research and practical application and has been successfully deployed in many business scenario systems. In the prior art, automatic question answering typically extracts keywords from the question posed by the user, retrieves an answer directly from a fixed question-answer set or knowledge document, and returns the retrieved answer to the user. However, the answers returned to the user in this way are usually static and inflexible, and cannot provide the user with a comprehensive, satisfactory answer.
Disclosure of Invention
In view of this, the present application provides an automatic question answering method, apparatus, electronic device and readable storage medium, aiming to solve the technical problem that, in the prior art, the answers returned to users by automatic question answering are usually static, lack flexibility, and cannot provide comprehensive and satisfactory answers.
According to a first aspect of the present application, there is provided an automatic question-answering method, the method comprising:
receiving an input question sent by a client;
determining at least one target question corresponding to the input question according to a preset similar question matching model and a plurality of preset questions;
determining at least one answer set corresponding to the at least one target question according to a preset knowledge base;
generating a target answer according to a preset language model, the at least one target question and the at least one answer set;
and sending the target answer to the client.
Optionally, the step of determining at least one target question corresponding to the input question according to the preset similar question matching model and the plurality of preset questions specifically includes:
inputting the input question into the preset similar question matching model, comparing the similarity between the input question and the plurality of preset questions contained in the model, and outputting at least one similar question;
and determining at least one target question based on the at least one similar question and a preset number threshold.
Optionally, the step of determining at least one target question according to the at least one similar question and the preset number threshold specifically includes:
acquiring the number of the at least one similar question;
if the number is one, determining the similar question as the target question;
if the number is more than one, comparing the number with the preset number threshold;
if the number is less than or equal to the preset number threshold, determining the plurality of similar questions as the plurality of target questions;
if the number is greater than the preset number threshold, acquiring the similarities of the plurality of similar questions;
and sorting the plurality of similar questions in descending order of similarity, and selecting the preset number threshold of similar questions from the sorted similar questions as the plurality of target questions.
Optionally, the method further comprises:
acquiring preset reply information when the plurality of preset questions do not contain a similar question similar to the input question;
and sending the preset reply information to the client.
Optionally, the step of generating the target answer according to the preset language model, the at least one target question and the at least one answer set specifically includes:
constructing a plurality of first prompt words according to the at least one target question and the at least one answer set;
inputting the plurality of first prompt words and the at least one answer set into a trained preset language model to generate the target answer;
wherein the trained preset language model is obtained by the following steps:
generating a plurality of second prompt words according to the plurality of preset questions, the preset answer of each preset question and a target scene;
training the preset language model according to the plurality of second prompt words, the plurality of preset questions and the plurality of preset answers.
Optionally, the method further comprises:
acquiring a plurality of knowledge data and a plurality of historical question-answer data of the target scene;
generating a plurality of preset questions and a preset answer for each preset question according to the knowledge data and the historical question-answer data;
and constructing the preset knowledge base according to the plurality of preset questions and the plurality of preset answers.
Optionally, the method further comprises:
generating a positive sample set and a negative sample set according to the plurality of preset questions;
and training a preset neural network model with the positive sample set and the negative sample set as training data to generate a trained preset similar question matching model.
According to a second aspect of the present application, there is provided an automatic question answering apparatus, comprising:
a receiving unit, configured to receive an input question sent by a client;
a first determining unit, configured to determine at least one target question corresponding to the input question according to a preset similar question matching model and a plurality of preset questions;
a second determining unit, configured to determine at least one answer set corresponding to the at least one target question according to a preset knowledge base;
a generating unit, configured to generate a target answer according to a preset language model, the at least one target question and the at least one answer set;
and a sending unit, configured to send the target answer to the client.
According to a third aspect of the present application there is provided an electronic device comprising a memory storing a computer program and a processor implementing the steps of the method of any of the first aspects when the processor executes the computer program.
According to a fourth aspect of the present application there is provided a readable storage medium having stored thereon a computer program which when executed by a processor implements the steps of the method of any of the first aspects.
By means of the above technical solution, the present application provides an automatic question answering method and apparatus, an electronic device and a readable storage medium. Specifically, an input question sent by a client is received, and one or more similar questions corresponding to the input question are matched among a plurality of preset questions using a preset similar question matching model. The knowledge content of each similar question is then retrieved from a preset knowledge base and summarized into an answer set for that question. The retrieved answer sets are then processed with a pre-trained language model and summarized into a target answer. By combining the preset knowledge base with the preset similar question matching model, answers are retrieved for several matched similar questions and relevant information is interpreted in depth, which improves the accuracy and coverage of the answers and provides users with better knowledge and consultation services. Further, the structured answer sets are rewritten in a personalized way by the preset language model, generating natural and fluent target answers, improving the flexibility and adaptability of the answers, making the final output easier to read and understand, improving the user experience, and enhancing the practicality of automatic question answering.
The foregoing description is only an overview of the technical solutions of the present application, and may be implemented according to the content of the specification in order to make the technical means of the present application more clearly understood, and in order to make the above-mentioned and other objects, features and advantages of the present application more clearly understood, the following detailed description of the present application will be given.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the application. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
fig. 1 shows a schematic flow chart of an automatic question-answering method provided in an embodiment of the present application;
FIG. 2 is a schematic flow chart of another automatic question-answering method according to an embodiment of the present application;
fig. 3 shows a schematic structural diagram of an automatic question answering device according to an embodiment of the present application.
Detailed Description
Exemplary embodiments of the present application will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present application are shown in the drawings, it should be understood that the present application may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
The embodiment of the application provides an automatic question-answering method, as shown in fig. 1, which comprises the following steps:
s101, receiving an input problem sent by a client.
The automatic question answering method provided in the embodiments of the present application may be applied on a server side, in particular in scenarios such as financial insurance. The server can analyze the input question sent by the client, generate a corresponding target answer, and send the target answer to the client. Specifically, the client is an application platform through which the user enters questions and is deployed on a terminal device; for example, the client may be a financial insurance service platform, and the server is communicatively connected to the client. The input question is the question the user wants to look up, determined by the actual application scenario and business purpose, and is typically a specific business-related question such as "How do I buy car insurance?".
S102, determining at least one target question corresponding to the input question according to a preset similar question matching model and a plurality of preset questions.
In the prior art, the input question from the client is usually hard-matched and the retrieved answer is returned to the client. For example, if the user's input question is "Can someone with acute gastroenteritis buy insurance?", the keywords "acute gastroenteritis" and "insurance" are matched against the questions in the insurance clauses, and an answer of "yes" or "no" is returned to the client. However, different users may have different points of interest and needs for the same question, and may describe it differently because of different wording habits or knowledge backgrounds. For the question "Can someone with acute gastroenteritis buy insurance?", the user may actually want to learn more about applying for insurance with gastroenteritis; if the answer is only "yes" or "no", the user has to ask again about the related application topics. Clearly, the prior-art approach of hard-matching the input question and returning a static answer lacks understanding capability and flexibility, cannot provide comprehensive and satisfactory answers, and results in a poor user experience.
Based on the above problems, the present application proposes that, after receiving the input question from the client, at least one target question is screened out of a plurality of preset questions using a pre-trained similar question matching model. It should be noted that a target question is a similar question corresponding to the input question, and these similar questions are stored in the preset knowledge base in advance. For example, if the input question is "Can someone with acute gastroenteritis buy insurance?", the preset similar question matching model can match related preset questions about applying for insurance with gastroenteritis, such as "If I have acute gastroenteritis, can I apply for insurance?", "Can gastroenteritis patients purchase travel insurance?", "Does acute gastroenteritis affect applying for medical insurance?", and "If I have had acute gastroenteritis, will it affect purchasing critical illness insurance?".
By means of the method, one or more similar questions are matched for the input questions by using the preset similar question matching model, diversified answers are further provided, a user can be helped to obtain more comprehensive knowledge, and personalized requirements of the user are met.
S103, determining at least one answer set corresponding to at least one target question according to a preset knowledge base.
In this step, the preset knowledge base is a structured semantic knowledge base storing a large number of questions in the business domain and their corresponding answers. Taking the insurance field as an example, the preset knowledge base contains a large number of preset questions summarized from insurance clauses, insurance articles and historical consultation data, together with the answer to each preset question. Based on the one or more matched similar questions (target questions), the preset knowledge base is searched to find the answer set corresponding to each similar question. For example, for the question "If I have acute gastroenteritis, can I apply for insurance?", the preset knowledge base contains the claim rules for acute gastroenteritis in the insurance contracts of different product types.
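The retrieval in this step can be pictured with a minimal sketch, assuming the preset knowledge base is held as an in-memory mapping from preset questions to answer entries; the structure, names and sample content below are illustrative assumptions rather than the patent's actual storage format:

```python
from typing import Dict, List

# Hypothetical layout: each preset question maps to a list of answer entries,
# e.g. claim rules drawn from the contracts of different insurance products.
KNOWLEDGE_BASE: Dict[str, List[str]] = {
    "If I have acute gastroenteritis, can I apply for insurance?": [
        "Product A: application allowed after full recovery and a 30-day waiting period.",
        "Product B: acute gastroenteritis must be declared during health notification.",
    ],
}

def retrieve_answer_sets(target_questions: List[str]) -> Dict[str, List[str]]:
    """Return one answer set per target question; unmatched questions map to an empty set."""
    return {q: KNOWLEDGE_BASE.get(q, []) for q in target_questions}
```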
S104, generating a target answer according to the pre-trained language model, the at least one target question and the at least one answer set.
In this step, the knowledge contained in the preset knowledge base is highly specialized and the retrieved answers may be numerous, so if the answer sets output from the preset knowledge base were returned directly to the client, the answer could appear rigid and unnatural and be hard for the user to understand. In addition, if there are several similar questions, sending multiple answer sets directly to the client may mean the answers do not sufficiently match the question the user actually asked, which can confuse or even mislead the user. Therefore, to make the output answer fluent and natural and easier to understand, the embodiment of the present application proposes to integrate the content of the at least one answer set using a pre-trained language model to generate the target answer.
By the method, answers to similar questions are summarized, so that more comprehensive, accurate and highly-adaptive answers can be provided, and a coherent experience is provided for users.
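A compact sketch of how this integration step might look, assuming a generic call_language_model helper stands in for whichever pre-trained language model is deployed; the helper, prompt wording and function names are assumptions, not the patent's interface:

```python
from typing import Dict, List

def call_language_model(prompt: str) -> str:
    # Placeholder for the deployed pre-trained language model (assumed interface).
    raise NotImplementedError

def generate_target_answer(input_question: str, answer_sets: Dict[str, List[str]]) -> str:
    """Merge the retrieved answer sets into one prompt and let the model rewrite them."""
    context = "\n".join(
        f"Question: {q}\nKnown answers: {'; '.join(answers)}"
        for q, answers in answer_sets.items()
    )
    prompt = (
        "Rewrite the following material into one clear, natural answer "
        f"for the user question '{input_question}':\n{context}"
    )
    return call_language_model(prompt)
```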
S105, sending the target answer to the client.
In this step, the final generated target answer is returned to the client for the user to view. The target answers can provide diversified views and explanations from different angles, help users deepen understanding of the questions, enable the users under different knowledge backgrounds and experiences to obtain comprehensive and easily understood knowledge, effectively improve the accuracy of automatic question answering, greatly improve user experience, reduce dependence on customer service staff and reduce labor cost.
In the automatic question answering method provided by the embodiments of the present application, an input question sent by a client is received, and one or more similar questions corresponding to the input question are matched among a plurality of preset questions using a preset similar question matching model. The knowledge content of each similar question is then retrieved from a preset knowledge base and summarized into an answer set for that question. The retrieved answer sets are then processed with a pre-trained language model and summarized into a target answer. By combining the preset knowledge base with the preset similar question matching model, answers are retrieved for several matched similar questions and relevant information is interpreted in depth, which improves the accuracy and coverage of the answers and provides users with better knowledge and consultation services. Further, the structured answer sets are rewritten in a personalized way by the preset language model, generating natural and fluent target answers, improving the flexibility and adaptability of the answers, making the final output easier to read and understand, improving the user experience, and enhancing the practicality of automatic question answering.
Further, as shown in fig. 2, as a refinement and extension of the specific implementation manner of the foregoing embodiment, in order to fully describe the specific implementation process of the embodiment, another automatic question-answering method is provided in the embodiment of the present application, where the method includes:
s201, receiving an input problem sent by a client.
The method of this step is the same as the method of step S101 shown in fig. 1, and will not be described here again.
S202, inputting the input question into a preset similar question matching model, comparing the similarity between the input question and a plurality of preset questions contained in the model, and outputting at least one similar question.
In this step, the input question and each preset question are fed into the preset similar question matching model, which represents them as vectors; the similarity between the input question vector and each preset question vector is then computed with a similarity measure to quantify how close the input question is to each preset question. One or more similar questions related to the input question are then found based on the computed similarities.
In a practical application scenario, during the comparison of the input question with the plurality of preset questions, a similarity threshold may be set: when the similarity between any preset question and the input question is greater than the threshold, that preset question can be confirmed as a similar question of the input question.
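A sketch of this vectorize-and-compare step, assuming a sentence-transformers style encoder; the model name and the 0.75 similarity threshold are placeholders rather than values specified by the patent:

```python
from sentence_transformers import SentenceTransformer, util

# Assumed encoder; the patent does not name a specific embedding model.
encoder = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

def match_similar_questions(input_question, preset_questions, similarity_threshold=0.75):
    """Return (preset question, similarity) pairs whose cosine similarity exceeds the threshold."""
    query_vec = encoder.encode(input_question, convert_to_tensor=True)
    preset_vecs = encoder.encode(preset_questions, convert_to_tensor=True)
    scores = util.cos_sim(query_vec, preset_vecs)[0]
    return [
        (q, float(s))
        for q, s in zip(preset_questions, scores)
        if float(s) > similarity_threshold
    ]
```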
S203, determining at least one target question according to the at least one similar question and a preset number threshold.
In this step, there may be one or more similar questions for the input question. If the similar questions are numerous and are not screened, retrieving answers for all of them takes considerable time, the answer content becomes large, the user has a great deal to read, and useful information may not be found in time. To avoid this, the embodiment of the present application proposes presetting a number threshold and keeping the number of selected similar questions within that threshold, thereby obtaining the final target questions.
In this way, the number of selected similar questions is kept within a certain range, so that the most relevant and useful questions are selected; more valuable answers can then be provided, answer retrieval becomes more efficient, the amount the user has to browse is reduced, the user finds the desired answer faster, unnecessary searching is reduced, and user satisfaction is increased.
In this embodiment of the present application, optionally, in step S203, the step of determining at least one target question according to the at least one similar question and the preset number threshold specifically includes: acquiring the number of the at least one similar question; if the number is one, determining the similar question as the target question; if there are several, comparing the number with the preset number threshold; if the number is less than or equal to the preset number threshold, determining the several similar questions as the target questions; if the number is greater than the preset number threshold, acquiring the similarities of the several similar questions; and sorting the similar questions in descending order of similarity and determining the target questions from the sorted similar questions based on the preset number threshold.
In this embodiment, the number of matched similar questions is obtained. If there is only one, the scope of the input question is narrow and there are no other related questions, and that similar question is taken as the target question whose answer is to be retrieved. If there are several, the number is compared with the preset number threshold: if it is less than or equal to the threshold, the similar questions are all taken directly as target questions; if it is greater than the threshold, there are many related questions, so the similarity of each similar question is obtained and the similar questions are sorted in descending order of similarity. Finally, the preset number threshold of similar questions is selected from the sorted questions as the target questions.
Optionally, the preset number threshold may be set by relevant personnel based on the business field. Taking the insurance field as an example, where the kinds of questions encountered are varied, the preset number threshold may be set in the range of 3 to 5. Screening similar questions through this threshold makes the finally determined target questions higher in quality, reduces the amount of extra information the user must digest, lets the user quickly find the desired answer, and improves the user's experience, thereby improving customer satisfaction and loyalty.
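The selection logic described above can be sketched as follows, with a default of 5 standing in for the 3 to 5 range suggested above; the function and parameter names are illustrative:

```python
def select_target_questions(similar_questions, max_targets=5):
    """Keep at most max_targets similar questions, preferring the highest similarity scores.

    similar_questions is a list of (question, similarity) pairs as returned by the matcher.
    """
    if len(similar_questions) <= max_targets:
        # One question, or a count within the preset number threshold: keep them all.
        return [q for q, _ in similar_questions]
    # More than the threshold: sort by similarity in descending order and take the top ones.
    ranked = sorted(similar_questions, key=lambda pair: pair[1], reverse=True)
    return [q for q, _ in ranked[:max_targets]]
```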
In this embodiment of the present application, optionally, when no similar question is matched for the input question, the method further includes: acquiring preset reply information when the plurality of preset questions do not contain a similar question similar to the input question; and returning the preset reply information to the client.
In this embodiment, during similar question matching with the preset similar question matching model, if no question similar to the input question is found among the plurality of preset questions, the question raised by the user is outside the coverage of the preset knowledge base. In this case preset reply information is obtained, for example: "I am very sorry, I cannot find the answer you need at the moment. Please provide more detailed information or try rephrasing the question, and I will try to give you an accurate answer. If you have other questions, I am also happy to help." The preset reply information is then sent to the client so that the user can ask again based on the received reply.
In this way, when no answer can be found for the user, preset reply information is sent so that the user clearly knows that no answer was found, instead of receiving a generic or irrelevant reply, avoiding misunderstanding and wasted time.
S204, determining at least one answer set corresponding to at least one target question according to a preset knowledge base.
The method of this step is the same as the method of step S103 shown in fig. 1, and will not be described here again.
S205, constructing a plurality of first prompt words according to at least one target question and at least one answer set.
In this step, the answers retrieved from the preset knowledge base are usually static, one-to-one matches between similar questions and answers. For questions that involve several similar questions or require complex logical judgment, directly feeding back the answers from the preset knowledge base may not meet the user's needs; moreover, the user may find the reply cold and impersonal, which lowers satisfaction and degrades the experience. To provide more accurate and personalized answers, the embodiment of the present application proposes rewriting the answer sets in a personalized way using a pre-trained preset language model to obtain the target answer text. This helps improve the accuracy, specificity and usefulness of the answers.
Specifically, according to the characteristics of each target question and its corresponding answers, a plurality of corresponding first prompt words are set for the target questions. The first prompt words should cover the keywords and topic words related to the questions, the role answering the questions, the style of the answer, and so on, to help the model recognize and handle the questions in the specific scenario.
S206, inputting a plurality of first prompt words and at least one answer set into the trained preset language model to generate a target answer.
In this step, the at least one answer set and the plurality of first prompt words are combined into one text and fed into the trained preset language model. The model determines the personalized needs of the user from the input prompt words and rewrites the at least one answer set, for example by substituting wording, adjusting sentence structure, and adding or deleting details, so as to keep the logic and semantics of the target answer correct, and finally generates a personalized target answer.
In a practical application scenario, again taking the insurance field as an example, suppose the user inputs the question "How is the senior cancer-prevention insurance sold?". Several similar questions are matched, such as "What type of insurance does the senior cancer-prevention insurance belong to?", "What coverage does the senior cancer-prevention insurance provide?", and "What are the claim rules of the senior cancer-prevention insurance?". After the answer to each target question is retrieved, the plurality of first prompt words may be: "senior cancer-prevention insurance, coverage liability, deductible, insurance planner (the role of the preset language model), a user with no insurance knowledge (the role of the questioner), concise, clear and well structured (the style of the answer)". The assembled condition is then: the model acts as an insurance planner, the user is a layperson without insurance knowledge, and the model must rewrite the three answer sets corresponding to the three retrieved similar questions so that the knowledge in them is organized into answer content that is logical, concise, easy to understand, clearly layered and progressive. During automatic question answering, the set conditions and the three answer sets are combined into one text and input into the trained preset language model, which rewrites the three answer sets based on the set conditions to obtain the target answer text.
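One way the prompt assembly described above might be implemented, assuming a plain-text prompt format; the role, style and key-word strings echo the example above, but the exact template is an assumption:

```python
def build_generation_prompt(target_questions, answer_sets, keywords,
                            model_role="insurance planner",
                            user_role="a user with no insurance background",
                            style="concise, clear and well structured"):
    """Assemble the first prompt words and the retrieved answer sets into one model input."""
    sections = []
    for question in target_questions:
        answers = "; ".join(answer_sets.get(question, []))
        sections.append(f"Similar question: {question}\nRetrieved answers: {answers}")
    header = (
        f"You are an {model_role}. The reader is {user_role}. "
        f"Rewrite the material below into a {style} answer. "
        f"Key points to cover: {', '.join(keywords)}."
    )
    return header + "\n\n" + "\n\n".join(sections)
```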
In an embodiment of the present application, optionally, the trained preset language model is obtained as follows: generating a plurality of second prompt words according to the plurality of preset questions, the preset answer of each preset question and the target scene; and training the preset language model according to the plurality of second prompt words, the plurality of preset questions and the plurality of preset answers.
In this embodiment, the preset language model is mainly used for natural language processing (NLP) tasks and uses a self-attention mechanism to capture the dependencies within and between sentences, so as to better understand the context and generate relevant answer text. A plurality of second prompt words are generated in advance based on the plurality of preset questions, the preset answer of each preset question and the target scene. Formatted training samples are then composed from each preset question, its preset answer and the corresponding second prompt words for training the model. The preset language model is trained on these samples, adjusting the objective function, learning rate and number of training epochs to optimize the quality of the generated personalized answers.
In a practical application scenario, the preset language model may be a GPT (Generative Pre-trained Transformer) model, or another large language model with equivalent text processing and generation capabilities, which is not limited in this embodiment. For example, the rewriting capability of a large model such as GPT-4 can be used to rewrite the retrieved answer sets in a personalized way, generating target answers better suited to the user's query needs.
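A sketch of how the formatted training samples mentioned above could be assembled, assuming a simple prompt/completion JSONL layout; the field names, scene string and file path are illustrative assumptions:

```python
import json

def build_finetuning_samples(preset_qa_pairs, scene="insurance consultation"):
    """Format (question, answer) pairs plus scene-specific prompt words into training records."""
    samples = []
    for question, answer in preset_qa_pairs:
        # Second prompt words: scene, role and style cues generated ahead of training.
        prompt_words = f"Scene: {scene}. Role: insurance planner. Style: concise and layered."
        samples.append({
            "prompt": f"{prompt_words}\nUser question: {question}",
            "completion": answer,
        })
    return samples

def save_jsonl(samples, path="finetune_samples.jsonl"):
    with open(path, "w", encoding="utf-8") as f:
        for sample in samples:
            f.write(json.dumps(sample, ensure_ascii=False) + "\n")
```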
In this way, the answer sets retrieved from the preset knowledge base are rewritten in a personalized way and the trained preset language model provides the target answer, ensuring that the generated personalized answer is standard, understandable and consistent with the professional knowledge of the target scenario, meeting the user's personalized needs and improving user satisfaction and experience.
In this embodiment of the present application, optionally, the step of constructing the preset knowledge base specifically includes: acquiring a plurality of knowledge data and a plurality of historical question-answer data of the target scene; generating a plurality of preset questions and a preset answer for each preset question according to the knowledge data and the historical question-answer data; and constructing the preset knowledge base according to the plurality of preset questions and the plurality of preset answers.
In this embodiment, a plurality of knowledge data of the target scenario is acquired, including internal resources, policies, product descriptions and so on; this large body of knowledge data is studied and the common questions and answers in it are extracted. At the same time, records of past customer consultations and the existing FAQ set of the target scenario are collected, and a large number of historical questions and their answers are gathered from these records. The collected questions and answers are organized into formatted preset question-answer pairs, each containing one preset question and its corresponding preset answer, ensuring that the preset questions are clear and unambiguous and the preset answers are accurate and complete. The preset knowledge base is then built from these structured, collated question-answer pairs.
In a practical application scenario, taking automatic question answering in the insurance field as an example, structured insurance clause data and insurance website article data on the insurance platform are collected, the policies and descriptions in the clauses are studied carefully, and the common insurance questions and their answers are extracted from the clauses; common insurance questions and answers are likewise extracted from the website articles. Records of past customer consultations and the existing FAQ set are also reviewed, and the historical questions and answers in them are collected. In addition, insurance experts and insurance service staff within the company can be consulted to obtain frequently encountered questions and their answers. All collected questions and answers are sorted and categorized by topic, type or other relevant factors; this helps organize the preset knowledge base and provides base data for the later training of the preset similar question matching model. The questions and answers are organized into formatted question-answer pairs, each containing one question and its corresponding answer, with clear, unambiguous questions and accurate, complete answers. Finally, the preset knowledge base is built from the collated question-answer pairs. Optionally, because knowledge in the insurance field changes dynamically, the preset knowledge base needs to be updated and maintained periodically: as new questions appear, insurance policies change and products go on or off the market, updated questions and answers are added to the preset knowledge base in time, ensuring its timeliness and accuracy.
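A minimal sketch of the collation step, assuming QA pairs mined from clause data and from historical consultation logs arrive as (question, answer) tuples; the merge policy shown here is an assumption:

```python
def build_knowledge_base(clause_qa_pairs, history_qa_pairs):
    """Merge QA pairs mined from policy clauses/articles with historical consultation records.

    Re-running this step with refreshed sources is also how periodic updates
    (new products, policy changes) would be folded into the knowledge base.
    """
    knowledge_base = {}
    for source in (clause_qa_pairs, history_qa_pairs):
        for question, answer in source:
            question = question.strip()
            knowledge_base.setdefault(question, [])
            if answer not in knowledge_base[question]:
                knowledge_base[question].append(answer)
    return knowledge_base
```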
By the method, comprehensive and accurate questions and answers are collected and structured, and a preset knowledge base of a target scene is built after the questions and the answers are arranged, so that quick, consistent and accurate answers are provided, and the automatic answer efficiency is improved.
In this embodiment of the present application, optionally, the step of training the preset similar question matching model specifically includes: generating a positive sample set and a negative sample set according to the plurality of preset questions; and training a preset neural network model with the positive sample set and the negative sample set as training data to generate the trained preset similar question matching model.
In this embodiment, the plurality of preset questions are used as training data, and each preset question is cleaned and preprocessed, for example by word segmentation and stop-word removal, to facilitate later model training. A positive sample set is then constructed, where each pair consists of a preset question and a synonymous or similar question and is labeled 1, and a negative sample set is constructed from completely unrelated question pairs labeled 0. The preset neural network model is trained on the constructed positive and negative sample sets, and the number of training iterations and the training speed can be adjusted according to actual needs. Training proceeds in a self-supervised, contrastive fashion on these automatically constructed pairs: word vectors expressing similarity semantics are learned by maximizing the similarity of positive pairs and minimizing that of negative pairs, and the trained preset similar question matching model is finally generated.
In a practical application scenario, the preset neural network model may be a sentence-transformer model. The sentence-transformer model is fine-tuned: a large number of preset questions in the insurance field, which contain many similar questions, are combed through with professional insurance knowledge, a batch of positive and negative sample sets of similar insurance questions is constructed, and a similar question matching model for the insurance field is trained.
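A sketch of such fine-tuning with the classic sentence-transformers fit API, where positive pairs are labeled 1 and negative pairs 0; the base model, loss choice, batch size and epoch count are placeholders, not values given in the patent:

```python
from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

def finetune_similarity_model(positive_pairs, negative_pairs,
                              base_model="paraphrase-multilingual-MiniLM-L12-v2"):
    """Fine-tune a sentence encoder on labeled question pairs (1 = similar, 0 = unrelated)."""
    model = SentenceTransformer(base_model)
    examples = (
        [InputExample(texts=[a, b], label=1.0) for a, b in positive_pairs]
        + [InputExample(texts=[a, b], label=0.0) for a, b in negative_pairs]
    )
    loader = DataLoader(examples, shuffle=True, batch_size=16)
    # Contrastive loss pulls positive pairs together and pushes negative pairs apart.
    loss = losses.ContrastiveLoss(model)
    model.fit(train_objectives=[(loader, loss)], epochs=1, warmup_steps=100)
    return model
```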
In this way, the preset similar question matching model is built, similar questions can be matched accurately, the insurance clause knowledge base can be retrieved efficiently, user queries are matched precisely, and the users' service experience and satisfaction are improved.
S207, sending the target answer to the client.
The method of this step is the same as that of step S105 shown in fig. 1, and will not be described here again.
Further, as a specific implementation of the method shown in fig. 1, an embodiment of the present application provides an automatic question and answer device 300, as shown in fig. 3, including: a receiving unit 301, a first determining unit 302, a second determining unit 303, a generating unit 304, and a transmitting unit 305.
a receiving unit 301, configured to receive an input question sent by a client;
a first determining unit 302, configured to determine at least one target question corresponding to the input question according to a preset similar question matching model and a plurality of preset questions;
a second determining unit 303, configured to determine at least one answer set corresponding to the at least one target question according to a preset knowledge base;
a generating unit 304, configured to generate a target answer according to a preset language model, the at least one target question and the at least one answer set;
and a sending unit 305, configured to send the target answer to the client.
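For orientation, a thin wrapper-class sketch showing how the five units could be tied together; the class and attribute names are illustrative only and do not reflect the patent's actual implementation:

```python
class AutoQuestionAnsweringDevice:
    """Minimal wrapper tying the five units of the apparatus together (illustrative only)."""

    def __init__(self, matcher, knowledge_base, language_model):
        self.matcher = matcher                # first determining unit
        self.knowledge_base = knowledge_base  # second determining unit
        self.language_model = language_model  # generating unit

    def handle(self, input_question: str) -> str:
        # Receiving unit: input_question arrives from the client.
        targets = self.matcher(input_question)
        answer_sets = {q: self.knowledge_base.get(q, []) for q in targets}
        target_answer = self.language_model(input_question, answer_sets)
        # Sending unit: the target answer is returned to the client.
        return target_answer
```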
In a specific application scenario, in order to determine the at least one target question corresponding to the input question, the first determining unit 302 includes: a generation module and a determination module.
The generation module is configured to input the input question into the preset similar question matching model, compare the similarity between the input question and the plurality of preset questions contained in the model, and output at least one similar question;
and the determination module is configured to determine at least one target question according to the at least one similar question and a preset number threshold.
Further, in order to determine at least one target question according to the at least one similar question and the preset number threshold (in particular when the number of similar questions exceeds the threshold), the determination module includes: a first acquisition sub-module, a first determination sub-module, a comparison sub-module, a second determination sub-module, a second acquisition sub-module and a selection sub-module.
The first acquisition sub-module is configured to acquire the number of the at least one similar question;
the first determination sub-module is configured to, if the number is one, determine the similar question as the target question;
the comparison sub-module is configured to, if there are several, compare the number with the preset number threshold;
the second determination sub-module is configured to, if the number is less than or equal to the preset number threshold, determine the several similar questions as the target questions;
the second acquisition sub-module is configured to, if the number is greater than the preset number threshold, acquire the similarities of the several similar questions;
and the selection sub-module is configured to sort the similar questions in descending order of similarity and select the preset number threshold of similar questions from the sorted similar questions as the target questions.
Optionally, as shown in fig. 3, the apparatus further includes:
the first obtaining unit 306 is configured to obtain preset reply information when a similar problem similar to the input problem is not included in the plurality of preset problems.
Optionally, the sending unit 305 is further configured to send a preset reply message to the client.
In a specific application scenario, in order to generate the target answer, the generating unit 304 includes: a construction module, a first generation module, a second generation module and a training module.
The construction module is used for constructing a plurality of first prompt words according to at least one target question and at least one answer set;
the first generation module is used for inputting a plurality of first prompt words and at least one answer set into the trained preset language model to generate a target answer;
the second generation module is used for generating a plurality of second prompt words according to a plurality of preset questions, preset answers of each preset question and a target scene;
the training module is used for training the preset language model according to the plurality of second prompt words, the plurality of preset questions and the plurality of preset answers.
Optionally, as shown in fig. 3, the apparatus further includes:
a second acquisition unit 307, configured to acquire a plurality of knowledge data and a plurality of historical question-answer data of the target scene.
In a specific application scenario, in order to generate the plurality of preset questions and the preset answer of each preset question, the generating unit 304 further includes a third generation module, configured to generate the plurality of preset questions and the preset answer of each preset question according to the knowledge data and the historical question-answer data.
Optionally, as shown in fig. 3, the apparatus further includes:
a construction unit 308, configured to construct a preset knowledge base according to the preset questions and the preset answers.
In a specific application scenario, in order to generate the trained preset similar question matching model, the generating unit 304 further includes a fourth generation module and a fifth generation module:
a fourth generation module, configured to generate a positive sample set and a negative sample set according to the plurality of preset questions;
and a fifth generation module, configured to train the preset neural network model with the positive sample set and the negative sample set as training data to generate the trained preset similar question matching model.
The automatic question answering apparatus 300 provided in this embodiment of the present application receives an input question sent by a client and matches one or more similar questions corresponding to the input question among a plurality of preset questions using a preset similar question matching model. The knowledge content of each similar question is then retrieved from a preset knowledge base and summarized into an answer set for that question. The retrieved answer sets are then processed with a pre-trained language model and summarized into a target answer. By combining the preset knowledge base with the preset similar question matching model, answers are retrieved for several matched similar questions and relevant information is interpreted in depth, which improves the accuracy and coverage of the answers and provides users with better knowledge and consultation services. Further, the structured answer sets are rewritten in a personalized way by the preset language model, generating natural and fluent target answers, improving the flexibility and adaptability of the answers, making the final output easier to read and understand, improving the user experience, and enhancing the practicality of automatic question answering.
In an exemplary embodiment, the present application also provides an electronic device including a memory and a processor. The memory stores a computer program, and the processor is configured to execute the program stored in the memory so as to carry out the automatic question answering method of the above embodiments.
In an exemplary embodiment, the present application also provides a readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the automatic question-answering method.
From the above description of the embodiments, it will be apparent to those skilled in the art that the present application may be implemented in hardware, or may be implemented by means of software plus necessary general hardware platforms. Based on such understanding, the technical solution of the present application may be embodied in the form of a software product, which may be stored in a non-volatile readable storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.), and includes several instructions for causing an electronic device (may be a personal computer, a server, or a network device, etc.) to perform the methods described in the various implementation scenarios of the present application.
Those skilled in the art will appreciate that the drawings are merely schematic illustrations of one preferred implementation scenario, and that the modules or flows in the drawings are not necessarily required to practice the present application.
Those skilled in the art will appreciate that modules in an apparatus in an implementation scenario may be distributed in an apparatus in an implementation scenario according to an implementation scenario description, or that corresponding changes may be located in one or more apparatuses different from the implementation scenario. The modules of the implementation scenario may be combined into one module, or may be further split into a plurality of sub-modules.
The foregoing application serial numbers are merely for description, and do not represent advantages or disadvantages of the implementation scenario.
The foregoing disclosure is merely a few specific implementations of the present application, but the present application is not limited thereto and any variations that can be considered by a person skilled in the art shall fall within the protection scope of the present application.

Claims (10)

1. An automatic question answering method, comprising:
receiving an input question sent by a client;
determining at least one target question corresponding to the input question according to a preset similar question matching model and a plurality of preset questions;
determining at least one answer set corresponding to the at least one target question according to a preset knowledge base;
generating a target answer according to a preset language model, the at least one target question and the at least one answer set;
and sending the target answer to the client.
2. The method according to claim 1, wherein the step of determining at least one target question corresponding to the input question according to the preset similar question matching model and the plurality of preset questions specifically includes:
inputting the input question into the preset similar question matching model, comparing the similarity between the input question and the plurality of preset questions contained in the model, and outputting at least one similar question;
and determining at least one target question according to the at least one similar question and a preset number threshold.
3. The method according to claim 2, wherein the step of determining at least one target question according to the at least one similar question and the preset number threshold specifically comprises:
acquiring the number of the at least one similar question;
if the number is one, determining the similar question as the target question;
if the number is more than one, comparing the number with the preset number threshold;
if the number is less than or equal to the preset number threshold, determining the plurality of similar questions as the plurality of target questions;
if the number is greater than the preset number threshold, acquiring the similarities of the plurality of similar questions;
and sorting the plurality of similar questions in descending order of similarity, and selecting the preset number threshold of similar questions from the sorted similar questions as the plurality of target questions.
4. The method according to claim 2, further comprising:
acquiring preset reply information when the plurality of preset questions do not contain a similar question similar to the input question;
and sending the preset reply information to the client.
5. The method according to claim 1, wherein the step of generating the target answer according to the preset language model, the at least one target question and the at least one answer set specifically comprises:
constructing a plurality of first prompt words according to the at least one target question and the at least one answer set;
inputting the plurality of first prompt words into a trained preset language model to generate the target answer;
wherein the trained preset language model is obtained by:
generating a plurality of second prompt words according to the plurality of preset questions, the preset answer of each preset question and the target scene;
training the preset language model according to the plurality of second prompt words, the plurality of preset questions and the plurality of preset answers.
6. The method according to any one of claims 1 to 5, further comprising:
acquiring a plurality of knowledge data and a plurality of historical question-answer data of a target scene;
generating a plurality of preset questions and a preset answer for each preset question according to the knowledge data and the historical question-answer data;
and constructing the preset knowledge base according to the preset questions and the preset answers.
7. The method according to any one of claims 1 to 5, further comprising:
generating a positive sample set and a negative sample set according to the plurality of preset questions;
and training a preset neural network model with the positive sample set and the negative sample set as training data to generate a trained preset similar question matching model.
8. An automatic question answering apparatus, comprising:
a receiving unit, configured to receive an input question sent by a client;
a first determining unit, configured to determine at least one target question corresponding to the input question according to a preset similar question matching model and a plurality of preset questions;
a second determining unit, configured to determine at least one answer set corresponding to the at least one target question according to a preset knowledge base;
a generating unit, configured to generate a target answer according to a preset language model, the at least one target question and the at least one answer set;
and a sending unit, configured to send the target answer to the client.
9. An electronic device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any one of claims 1 to 7 when executing the computer program.
10. A readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
CN202311753796.3A 2023-12-19 2023-12-19 Automatic question answering method and device, electronic equipment and readable storage medium Pending CN117851563A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311753796.3A CN117851563A (en) 2023-12-19 2023-12-19 Automatic question answering method and device, electronic equipment and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311753796.3A CN117851563A (en) 2023-12-19 2023-12-19 Automatic question answering method and device, electronic equipment and readable storage medium

Publications (1)

Publication Number Publication Date
CN117851563A true CN117851563A (en) 2024-04-09

Family

ID=90545719

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311753796.3A Pending CN117851563A (en) 2023-12-19 2023-12-19 Automatic question answering method and device, electronic equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN117851563A (en)

Similar Documents

Publication Publication Date Title
CN112632385B (en) Course recommendation method, course recommendation device, computer equipment and medium
CN108829822B (en) Media content recommendation method and device, storage medium and electronic device
CN109934721A (en) Finance product recommended method, device, equipment and storage medium
CN108363821A (en) A kind of information-pushing method, device, terminal device and storage medium
CN109767318A (en) Loan product recommended method, device, equipment and storage medium
CN112667794A (en) Intelligent question-answer matching method and system based on twin network BERT model
EP2336905A1 (en) A searching method and system
CN104836720A (en) Method for performing information recommendation in interactive communication, and device
CN101408886A (en) Selecting tags for a document by analyzing paragraphs of the document
CN113254711B (en) Interactive image display method and device, computer equipment and storage medium
CN108664515A (en) A kind of searching method and device, electronic equipment
CN117592489B (en) Method and system for realizing electronic commerce commodity information interaction by using large language model
CN113342958A (en) Question-answer matching method, text matching model training method and related equipment
CN117520503A (en) Financial customer service dialogue generation method, device, equipment and medium based on LLM model
CN110795613A (en) Commodity searching method, device and system and electronic equipment
CN111859138B (en) Searching method and device
CN112330387A (en) Virtual broker applied to house-watching software
CN116756290A (en) Data query method and device, storage medium and electronic equipment
CN116701752A (en) News recommendation method and device based on artificial intelligence, electronic equipment and medium
CN117851563A (en) Automatic question answering method and device, electronic equipment and readable storage medium
CN114707510A (en) Resource recommendation information pushing method and device, computer equipment and storage medium
CN116303983A (en) Keyword recommendation method and device and electronic equipment
Ali et al. Identifying and Profiling User Interest over time using Social Data
CN111078972A (en) Method and device for acquiring questioning behavior data and server
CN116431779B (en) FAQ question-answering matching method and device in legal field, storage medium and electronic device

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination