CN117453895B - Intelligent customer service response method, device, equipment and readable storage medium - Google Patents


Info

Publication number: CN117453895B
Application number: CN202311757102.3A
Authority: CN (China)
Original language: Chinese (zh)
Other versions: CN117453895A
Prior art keywords: question, answer, vector, background knowledge, sample
Legal status: Active (status listed by Google Patents; not a legal conclusion)
Inventor: 申冲
Current and original assignee: Suzhou Metabrain Intelligent Technology Co Ltd
Application filed by Suzhou Metabrain Intelligent Technology Co Ltd; priority to CN202311757102.3A

Classifications

    • G06F16/3329: Natural language query formulation or dialogue systems
    • G06F16/31: Indexing; data structures therefor; storage structures
    • G06F16/35: Clustering; classification
    • G06V30/19: Character recognition using electronic means
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The invention relates to the technical field of artificial intelligence, and discloses an intelligent customer service response method, device, equipment and readable storage medium. For question information input by a user, a second associated question-answer data stream and second associated business background knowledge are retrieved from a question-answer knowledge base to generate a second prompt, which is input into the language model. The language model can thus understand the user's question more accurately and give appropriate answer information, improving the success rate with which the intelligent customer service resolves user questions.

Description

Intelligent customer service response method, device, equipment and readable storage medium
Technical Field
The present invention relates to the field of artificial intelligence technologies, and in particular, to an intelligent customer service response method, apparatus, device, and readable storage medium.
Background
With the development of technology, users place increasingly high demands on product services. For example, in pre-sale and after-sale service, resolving a user's product questions in time is a key factor affecting user experience. In this service mechanism, customer service plays an important communication role: efficient, high-quality customer service brings users a good experience. However, as product iteration accelerates and the user market expands, manual customer service alone, limited by staff quality and workload, cannot cope with the large volume of user consultations. Intelligent customer service has been developed to address this.
Current intelligent customer service schemes mainly include intelligent customer service based on a pipeline architecture and intelligent customer service based on a large language model (Large Language Model, LLM). However, these schemes are only suited to a one-question-one-answer mode, and have limited ability to distinguish user questions and to generate answers grounded in a database. In real customer service scenarios, different users describe the same problem in different ways, so the intelligent customer service often cannot give a satisfactory answer and the conversation must still be transferred to manual customer service.
Improving the success rate with which intelligent customer service resolves user questions is therefore a technical problem to be solved by those skilled in the art.
Disclosure of Invention
The invention aims to provide an intelligent customer service response method, device, equipment and readable storage medium, which improve the success rate of intelligent customer service in resolving user questions and relieve the working pressure on manual customer service.
In order to solve the technical problems, the invention provides an intelligent customer service response method, which comprises the following steps:
constructing a question-answer knowledge base based on the historical question-answer logs and business background knowledge;
starting from an initial language model: retrieving, from the question-answer knowledge base, a first associated question-answer data stream and first associated business background knowledge matching first sample question information; generating a first prompt from the first associated question-answer data stream and the first associated business background knowledge; inputting the first prompt into an intermediate language model and outputting first predicted answer information; and iteratively training the intermediate language model with the first sample answer information corresponding to the first sample question information as the ground truth, until the iterative training ends, to obtain a language model;
and retrieving, from the question-answer knowledge base, a second associated question-answer data stream and second associated business background knowledge matching question information input by a user; generating a second prompt from the second associated question-answer data stream and the second associated business background knowledge; inputting the second prompt into the language model; and outputting answer information.
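To make the prompt-generation step concrete, a minimal Python sketch of assembling a second prompt from the retrieved material follows. The template wording and section markers are illustrative assumptions, not the patent's exact prompt format.

```python
# Hypothetical prompt assembly: retrieved Q&A streams and background knowledge
# are concatenated with the user's question into one prompt for the model.
def build_prompt(question, qa_streams, background):
    parts = ["Answer the user's question using the reference material below."]
    parts += [f"[Similar Q&A] {qa}" for qa in qa_streams]       # retrieved dialogue streams
    parts += [f"[Background] {b}" for b in background]          # retrieved knowledge items
    parts.append(f"[Question] {question}")
    return "\n".join(parts)

prompt = build_prompt(
    "The device reboots during firmware update, what should I do?",
    ["Q: update loop A: re-flash from recovery"],
    ["Firmware v2.1 requires 20% battery minimum"],
)
```

The resulting string would then be passed to the trained language model as its input.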
In some implementations, the building a question-answer knowledge base based on the historical question-answer logs and business background knowledge includes:
performing data cleaning on the historical question-answer logs to obtain a historical question-answer data stream;
performing data cleaning on the business background knowledge document to obtain a business background knowledge list;
and constructing the question-answer knowledge base by using the historical question-answer data stream and the business background knowledge list.
In some implementations, the performing of data cleaning on the historical question-answer log to obtain a historical question-answer data stream includes:
executing the remove-system-information operation, the remove-non-business-information operation, the remove-incomplete-dialogue operation and the question-answer data merging operation on the historical question-answer log to obtain the historical question-answer data stream.
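A minimal Python sketch of chaining the four cleaning operations over a raw log may help; the predicates standing in for the classification models described below are hypothetical placeholders, not the patent's implementation.

```python
# Hypothetical cleaning pipeline: each stage mirrors one of the four operations.
def clean_log(log, is_system, is_business, is_complete, merge_turns):
    msgs = [m for m in log if not is_system(m)]     # remove system information
    msgs = [m for m in msgs if is_business(m)]      # remove non-business information
    if not is_complete(msgs):                       # remove incomplete dialogues
        return []
    return merge_turns(msgs)                        # merge question-answer data

# Toy usage with trivial stand-in predicates
log = ["[sys] connected", "hi, my device won't boot", "try holding power 10s", "thanks"]
cleaned = clean_log(
    log,
    is_system=lambda m: m.startswith("[sys]"),
    is_business=lambda m: m != "thanks",
    is_complete=lambda msgs: len(msgs) >= 2,
    merge_turns=lambda msgs: [" ".join(msgs)],
)
```

In the method itself, the middle three stages are realized by the trained classification models discussed next.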
In some implementations, performing the remove-non-business-information operation includes:
classifying the data in the historical question-answer log by using a first classification model to obtain business information and non-business information;
and deleting the non-business information from the historical question-answer log.
In some implementations, the training method of the first classification model includes:
performing cyclic annotation training on an initial classification model by using a first sample dialogue stream labeled with business data and non-business data and a second sample dialogue stream not labeled with business data and non-business data, to obtain the first classification model;
in each round of the cyclic annotation training: training the initial classification model with the first sample dialogue stream to obtain an optimized classification model; inputting the second sample dialogue stream into the optimized classification model to obtain a first coarse classification result; and re-labeling the first coarse classification result before training the optimized classification model again.
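The cyclic annotation loop shared by the classification models described in this document can be sketched as a small self-training procedure; `train`, `predict` and `relabel` below are hypothetical stand-ins for model optimization, coarse classification, and the re-labeling step.

```python
# Hypothetical cyclic annotation (self-training) loop: train on the labeled
# sample, coarsely classify the unlabeled pool, correct the labels, retrain.
def cyclic_annotation_training(labeled, pool, train, predict, relabel, rounds=2):
    model = train(None, labeled)
    for _ in range(rounds):
        coarse = [(x, predict(model, x)) for x in pool]   # coarse classification
        corrected = relabel(coarse)                       # re-labeling step
        model = train(model, labeled + corrected)         # retrain on the union
    return model

# Toy stand-ins: the "model" is just the count of business-labeled samples
labeled = [("where is my order", 1), ("nice weather today", 0)]
pool = ["track my order", "hello there"]
model = cyclic_annotation_training(
    labeled, pool,
    train=lambda m, data: sum(y for _, y in data),
    predict=lambda m, x: int("order" in x),
    relabel=lambda coarse: coarse,
)
```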
In some implementations, performing the remove-incomplete-dialogue operation includes:
detecting end information of a dialogue stream in the historical question-answer log by using a second classification model;
if the end information of the dialogue stream is detected, determining that the dialogue stream is a complete dialogue stream;
if the end information of the dialogue stream is not detected, determining that the dialogue stream is an incomplete dialogue stream;
and deleting the incomplete dialogue streams from the historical question-answer log.
In some implementations, the training method of the second classification model includes:
performing cyclic annotation training on an initial classification model by using a third sample dialogue stream labeled with complete dialogues and incomplete dialogues and a fourth sample dialogue stream not labeled with complete dialogues and incomplete dialogues, to obtain the second classification model;
in each round of the cyclic annotation training: training the initial classification model with the third sample dialogue stream to obtain an optimized classification model; inputting the fourth sample dialogue stream into the optimized classification model to obtain a second coarse classification result; and re-labeling the second coarse classification result before training the optimized classification model again.
In some implementations, performing the question-answer data merge operation includes:
identifying a start message, an end message and an intermediate message from the historical question-answer log by using a third classification model;
and combining the start message, the end message and the intermediate message belonging to the same problem description.
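The merge step above can be sketched as grouping tagged messages into one description per question; the tag names and sample messages below are illustrative assumptions.

```python
# Hypothetical merge: messages tagged start/intermediate/end by a classifier
# are grouped, and each group is joined into a single question description.
def merge_messages(tagged):
    """tagged: list of (text, tag) with tag in {'start', 'mid', 'end'}."""
    merged, buf = [], []
    for text, tag in tagged:
        buf.append(text)
        if tag == "end":                    # an end message closes one description
            merged.append(" ".join(buf))
            buf = []
    if buf:                                 # tolerate a trailing unfinished group
        merged.append(" ".join(buf))
    return merged

msgs = [("my server", "start"), ("keeps rebooting", "mid"), ("what can I do?", "end"),
        ("also,", "start"), ("the fan is loud", "end")]
```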
In some implementations, the identifying of a start message, an end message and an intermediate message from the historical question-answer log using the third classification model includes:
identifying a question start message, a question end message and a question intermediate message from the historical question-answer log by using a fourth classification model;
and identifying an answer start message, an answer end message and an answer intermediate message from the historical question-answer log by using a fifth classification model.
In some implementations, the training method of the third classification model includes:
performing sequence labeling in units of sentences, with a start-message label, an end-message label and an intermediate-message label, to obtain first sample sequence data;
performing cyclic annotation training on the initial classification model by using the first sample sequence data and an unlabeled fifth sample dialogue stream, to obtain the third classification model;
in each round of the cyclic annotation training: training the initial classification model with the first sample sequence data to obtain an optimized classification model; inputting the fifth sample dialogue stream into the optimized classification model to obtain a third coarse classification result; and re-labeling the third coarse classification result before training the optimized classification model again.
In some implementations, the loss function of the third classification model is:

L_1 = -\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{k} y_{ij}\log p_{ij}

wherein L_1 is the loss function of the third classification model, n is the number of sentences in one piece of sample sequence data, k is the number of tag classes, i is the sequence number of a sentence, j is the label number, y_{ij} is the probability that the i-th sentence is labeled with the j-th tag, and p_{ij} is the probability that the model predicts the j-th tag at the i-th sentence.
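A direct Python transcription of this sentence-level cross-entropy may clarify the computation. Since the printed formula is garbled in the source, the averaging over n sentences and the one-hot labels in the toy example are assumptions based on the stated symbol definitions.

```python
import math

def sequence_cross_entropy(y_true, y_pred):
    """y_true, y_pred: n x k lists; y_true rows are one-hot (or soft) labels."""
    n = len(y_true)
    total = 0.0
    for yi, pi in zip(y_true, y_pred):
        # per-sentence cross-entropy: -sum_j y_ij * log(p_ij)
        total += -sum(y * math.log(p) for y, p in zip(yi, pi) if y > 0)
    return total / n

# Two sentences, three tags (start / intermediate / end), one-hot ground truth
y_true = [[1, 0, 0], [0, 0, 1]]
y_pred = [[0.8, 0.1, 0.1], [0.2, 0.2, 0.6]]
loss = sequence_cross_entropy(y_true, y_pred)
```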
In some implementations, the building a question-answer knowledge base based on the historical question-answer logs and business background knowledge includes:
extracting a historical question-answer data stream vector from the historical question-answer log by using a vectorization processing model;
extracting a business background knowledge vector from the business background knowledge by using the vectorization processing model;
constructing the question-answer knowledge base by using the historical question-answer data stream vector and the business background knowledge vector;
The retrieving, from the question-answer knowledge base, of a first associated question-answer data stream and first associated business background knowledge matching the first sample question information includes:
extracting a first question vector from the first sample question information by using the vectorization processing model;
and retrieving, from the question-answer knowledge base based on vector similarity, a first associated question-answer data stream vector and a first associated business background knowledge vector that match the first question vector;
the retrieving, from the question-answer knowledge base, of a second associated question-answer data stream and second associated business background knowledge matching the question information input by the user includes:
extracting a second question vector from the question information by using the vectorization processing model;
and retrieving, from the question-answer knowledge base based on vector similarity, a second associated question-answer data stream vector and a second associated business background knowledge vector that match the second question vector.
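The vector-similarity retrieval step can be sketched as ranking knowledge-base entries by cosine similarity against the question vector; the toy vectors and entry names below are illustrative assumptions.

```python
import math

def cosine(u, v):
    # cosine similarity of two vectors
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def retrieve(query_vec, kb, top_k=2):
    """kb: list of (entry, vector); returns the top_k most similar entries."""
    ranked = sorted(kb, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [entry for entry, _ in ranked[:top_k]]

kb = [("reset password", [1.0, 0.0]), ("boot failure", [0.0, 1.0]),
      ("login error", [0.9, 0.2])]
hits = retrieve([1.0, 0.1], kb, top_k=2)
```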
In some implementations, the training method of the vectorization processing model includes:
pre-training a vectorization encoder in an initial vectorization processing model by utilizing the business background knowledge to obtain the initial vectorization processing model after pre-training;
starting from the pre-trained initial vectorization processing model, performing cyclic annotation training on it by using a sixth sample dialogue stream, labeled as to which dialogue streams express similar questions and which express dissimilar questions, together with an unlabeled seventh sample dialogue stream, to obtain the vectorization processing model;
in each round of the cyclic annotation training: training the pre-trained initial vectorization processing model with the sixth sample dialogue stream to obtain an optimized classification model; inputting the seventh sample dialogue stream into the optimized classification model to obtain a fourth coarse classification result; and re-labeling the fourth coarse classification result before training the optimized classification model again.
In some implementations, the loss function of the vectorization processing model is:

L_2 = \sum_{p \in P}\sum_{n \in N}\max\left(0,\ \cos(a,n) - \cos(a,p) + m\right)

wherein L_2 is the loss function of the vectorization processing model, a is a first sample vector extracted from the sample question set, p is a second sample vector whose question expression is similar to that of the first sample vector, n is a third sample vector whose question expression is dissimilar to that of the first sample vector, m is an edge (margin) constant, P is the set of similarly expressed sample questions, N is the set of dissimilarly expressed sample questions, cos(a,p) is the cosine similarity of the first sample vector and the second sample vector, and cos(a,n) is the cosine similarity of the first sample vector and the third sample vector.
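A minimal Python sketch of this cosine-similarity margin loss (a triplet-style loss) follows; the margin value and toy vectors are illustrative assumptions, and the summation structure is reconstructed from the stated symbol definitions since the printed formula is garbled.

```python
import math

def cosine(u, v):
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v)))

def triplet_loss(a, positives, negatives, margin=0.2):
    # hinge on cos(a, n) - cos(a, p) + m over all similar/dissimilar pairs
    loss = 0.0
    for p in positives:
        for n in negatives:
            loss += max(0.0, cosine(a, n) - cosine(a, p) + margin)
    return loss

a = [1.0, 0.0]                 # anchor question vector
positives = [[0.9, 0.1]]       # similarly expressed question
negatives = [[0.0, 1.0]]       # dissimilarly expressed question
```

When the anchor is already much closer to the positive than to the negative, the hinge clamps the loss to zero, so training focuses on hard cases.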
In some implementations, extracting a vector from input data includes:
extracting a text vector from the text in the input data by using a text vectorization model;
converting an image in the input data into an image vector using an image vectorization model;
and splicing the image vector and the text vector corresponding to the context of the image.
In some implementations, the stitching the image vector with the text vector corresponding to the context of the image includes:
performing image recognition on the image to obtain an image recognition result;
comparing the image recognition result with the context of the image and removing repeated characters, updating the context of the image with the remaining image recognition result, and then updating the text vector of the updated context;
and splicing the image vector with the text vector of the updated image context.
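The de-duplicate-then-splice step can be sketched as follows; `embed_text` and the image vector are hypothetical stand-ins for the text and image vectorization models, and the word-level duplicate removal is a simplification of the character-level comparison described above.

```python
# Hypothetical stitching: recognition text that repeats the image's surrounding
# context is dropped before the image and text vectors are concatenated.
def stitch(image_vec, ocr_text, context, embed_text):
    ocr_words = [w for w in ocr_text.split() if w not in context.split()]  # drop repeats
    updated_context = context + (" " + " ".join(ocr_words) if ocr_words else "")
    return image_vec + embed_text(updated_context)   # concatenate the two vectors

# Toy embedding: one count per word position, padded to length 8 (illustration only)
def embed_text(text):
    vec = [0.0] * 8
    for i, _ in enumerate(text.split()):
        vec[i % 8] += 1.0
    return vec

joined = stitch([1.0, 2.0], "error code 42", "screenshot shows error code", embed_text)
```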
In some implementations, the building the question-answer knowledge base with the historical question-answer data stream vector and the business background knowledge vector includes:
constructing a historical question-answer data stream vector index based on the historical question-answer data stream by using an index tool;
constructing a business background knowledge vector index based on the business background knowledge vector by utilizing the index tool;
and storing the historical question-answer data stream vector index, the historical question-answer data stream vectors, the business background knowledge vector index and the business background knowledge vectors into the question-answer knowledge base.
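The index tool is not named in this section; FAISS is one widely used example. The sketch below is a hypothetical brute-force stand-in that mirrors the two stored indexes, one for the question-answer stream vectors and one for the background knowledge vectors.

```python
import math

class VectorIndex:
    """Minimal stand-in for a vector index: add vectors, search by cosine."""
    def __init__(self):
        self.vectors = {}

    def add(self, key, vec):
        self.vectors[key] = vec

    def search(self, query, top_k=1):
        def cos(u, v):
            dot = sum(a * b for a, b in zip(u, v))
            nu = math.sqrt(sum(a * a for a in u))
            nv = math.sqrt(sum(b * b for b in v))
            return dot / (nu * nv)
        ranked = sorted(self.vectors, key=lambda k: cos(query, self.vectors[k]), reverse=True)
        return ranked[:top_k]

qa_index = VectorIndex()        # historical question-answer stream index
kb_index = VectorIndex()        # business background knowledge index
qa_index.add("qa:reboot-loop", [0.1, 0.9])
kb_index.add("doc:power-spec", [0.8, 0.2])
```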
In some implementations, retrieving, from the question-answer knowledge base, a second associated question-answer data stream and second associated business background knowledge that match the user-entered question information, includes:
performing data cleaning on the question information to obtain a question to be solved corresponding to the question information and a question description corresponding to the question to be solved;
retrieving, from the question-answer knowledge base, candidate question-answer data streams and candidate business background knowledge associated with the question description;
screening the second associated question-answer data stream from the candidate question-answer data streams according to the similarity calculation result between the candidate question-answer data streams and the question description;
and screening the second associated business background knowledge from the candidate business background knowledge according to the similarity calculation result between the candidate business background knowledge and the question description.
In some implementations, the screening of the second associated question-answer data stream from the candidate question-answer data streams according to the similarity calculation result includes:
if a candidate question-answer data stream is retrieved once, using its similarity value with the corresponding question description as its similarity score; if a candidate question-answer data stream is retrieved multiple times, accumulating its similarity values with the corresponding question descriptions as its similarity score;
taking the first preset number of candidate question-answer data streams with the highest similarity scores as the second associated question-answer data stream;
and the screening of the second associated business background knowledge from the candidate business background knowledge according to the similarity calculation result includes:
if a candidate business background knowledge item is retrieved once, using its similarity value with the corresponding question description as its similarity score; if a candidate business background knowledge item is retrieved multiple times, accumulating its similarity values with the corresponding question descriptions as its similarity score;
and taking the second preset number of candidate business background knowledge items with the highest similarity scores as the second associated business background knowledge.
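The accumulated-similarity screening described above can be sketched directly; the candidate identifiers and similarity values are illustrative assumptions.

```python
# A candidate retrieved by several question descriptions has its similarity
# values summed; the highest-scoring candidates become the associated results.
def screen_candidates(retrievals, top_k=2):
    """retrievals: list of (candidate_id, similarity) pairs, one per retrieval hit."""
    scores = {}
    for cand, sim in retrievals:
        scores[cand] = scores.get(cand, 0.0) + sim   # accumulate over repeat hits
    ranked = sorted(scores, key=lambda c: scores[c], reverse=True)
    return ranked[:top_k]

# qa1 is hit twice, so its accumulated score (1.7) outranks single hits
hits = [("qa1", 0.9), ("qa2", 0.6), ("qa1", 0.8), ("qa3", 0.7)]
selected = screen_candidates(hits)
```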
In order to solve the technical problem, the invention also provides an intelligent customer service response device, which comprises:
the database building unit is used for building a question-answer knowledge base based on the historical question-answer logs and the business background knowledge;
the training unit is used for, starting from an initial language model: retrieving, from the question-answer knowledge base, a first associated question-answer data stream and first associated business background knowledge matching first sample question information; generating a first prompt from the first associated question-answer data stream and the first associated business background knowledge; inputting the first prompt into an intermediate language model and outputting first predicted answer information; and iteratively training the intermediate language model with the first sample answer information corresponding to the first sample question information as the ground truth, until the iterative training ends, to obtain a language model;
and the processing unit is used for: retrieving, from the question-answer knowledge base, a second associated question-answer data stream and second associated business background knowledge matching question information input by a user; generating a second prompt from the second associated question-answer data stream and the second associated business background knowledge; inputting the second prompt into the language model; and outputting answer information.
In order to solve the technical problem, the invention also provides intelligent customer service response equipment, which comprises:
a memory for storing a computer program;
a processor for executing the computer program, wherein the computer program, when executed by the processor, implements the steps of the intelligent customer service response method according to any one of the above.
To solve the above technical problem, the present invention further provides a readable storage medium, on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the intelligent customer service response method according to any one of the above.
According to the intelligent customer service response method provided by the invention, historical question-answer logs and business background knowledge are added to a question-answer knowledge base. When the language model is trained, the first associated question-answer data stream and first associated business background knowledge matching the first sample question information are retrieved from the question-answer knowledge base to generate the first prompt, which is input into the intermediate language model to obtain the first predicted answer information; the intermediate language model is then iteratively trained with the first sample answer information corresponding to the first sample question information as the ground truth until the iterative training ends. Compared with a conventional language model, the resulting language model has a stronger ability to distinguish questions and to generate answers, thanks to its learning from the historical question-answer logs. For question information input by a user, a second associated question-answer data stream and second associated business background knowledge matching the question information are retrieved from the question-answer knowledge base to generate the second prompt, which is input into the language model, so that the language model understands the user's question more accurately and gives appropriate answer information. This improves the success rate with which the intelligent customer service resolves user questions and relieves the working pressure on manual customer service.
According to the intelligent customer service response method provided by the invention, the historical question-answer logs undergo data cleaning to obtain the historical question-answer data stream, and the business background knowledge documents undergo data cleaning to obtain the business background knowledge list; the question-answer knowledge base is then constructed from both. This strengthens the canonical representation of the historical question-answer logs and the business background knowledge, and reduces the interference of redundant and non-standard information with the training and inference of the language model.
According to the intelligent customer service response method provided by the invention, the historical question-answer data stream is obtained by executing the remove-system-information, remove-non-business-information, remove-incomplete-dialogue and question-answer data merging operations on the historical question-answer log. In particular, the latter three operations are implemented with trained classification models, which strengthens the cleaning effect on the historical question-answer log.
According to the intelligent customer service response method provided by the invention, the historical question-answer logs, the business background knowledge, the sample question information and the user-input question information can all be vectorized by the vectorization processing model, so that similarity calculation can quickly locate associated data in the question-answer knowledge base. Pre-training the vectorization encoder in the vectorization processing model converts a general-domain encoder into a business-specialized one, and building the loss function of the vectorization model on cosine similarity allows a quick judgment of whether two expressions are similar when matching data.
According to the intelligent customer service response method provided by the invention, data cleaning of the user-input question information yields the question to be solved and its corresponding question description; candidate question-answer data streams and candidate business background knowledge associated with the question description are retrieved from the question-answer knowledge base; and the second associated question-answer data stream and the second associated business background knowledge are screened from the candidates by accumulating similarity values into similarity scores. The historical question-answer logs and business background knowledge are thereby fully used to guide the language model in generating answer information.
The invention further provides an intelligent customer service response device, equipment and a readable storage medium, which have the above beneficial effects and are not described again here.
Drawings
For a clearer description of embodiments of the invention or of the prior art, the drawings that are used in the description of the embodiments or of the prior art will be briefly described, it being apparent that the drawings in the description below are only some embodiments of the invention, and that other drawings can be obtained from them without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of an intelligent customer service response method provided by an embodiment of the invention;
FIG. 2 is a flowchart of a method for cleaning a historical question-answer log according to an embodiment of the present invention;
fig. 3 is a schematic diagram of a question-answer data merging operation according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of matching of image-text vectors according to an embodiment of the present invention;
FIG. 5 is a flowchart of a method for optimizing a text vectorization model according to an embodiment of the present invention;
FIG. 6 is a flowchart of an optimization method of an image vectorization model according to an embodiment of the present invention;
FIG. 7 is a flowchart of a training method of a language model according to an embodiment of the present invention;
FIG. 8 is a flow chart of a method for reasoning of a language model provided by an embodiment of the present invention;
fig. 9 is a schematic diagram of matching a question-answer data stream according to an embodiment of the present invention;
FIG. 10 is a schematic diagram of matching accumulated scores of a question-answer data stream according to an embodiment of the present invention;
FIG. 11 is a schematic structural diagram of an intelligent customer service answering device according to an embodiment of the present invention;
FIG. 12 is a schematic structural diagram of an intelligent customer service answering device according to an embodiment of the present invention.
Detailed Description
The core of the invention is to provide an intelligent customer service response method, device, equipment and readable storage medium, which improve the success rate with which intelligent customer service resolves user problems and relieve the working pressure on manual customer service.
The embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. The described embodiments are obviously only some, not all, embodiments of the invention. All other embodiments obtained by a person skilled in the art from these embodiments without inventive effort fall within the scope of the invention.
The following describes an embodiment of the present invention.
For ease of understanding, a system architecture to which the present invention is applicable will first be described. The intelligent customer service response method provided by the embodiment of the invention can be applied to the computing equipment with the accelerator, and can also be applied to the accelerator cluster and the heterogeneous accelerator cluster. The accelerator may employ, but is not limited to, a graphics processor (Graphics Processing Unit, GPU), a field programmable gate array (Field Programmable Gate Array, FPGA), or the like.
On the basis of the above architecture, the intelligent customer service response method provided by the embodiment of the invention is described below with reference to the accompanying drawings.
The second embodiment of the present invention will be described below.
Fig. 1 is a flowchart of an intelligent customer service response method provided in an embodiment of the present invention.
An intelligent customer service scheme serves to convert the large volume of repetitive work in manual customer service into machine execution.
The intelligent customer service schemes at the present stage mainly adopt a pipeline architecture comprising components such as Automatic Speech Recognition (ASR), Natural Language Understanding (NLU), Dialogue Management (DM), Natural Language Generation (NLG) and speech synthesis (TTS), and are essentially a combination of frequently-asked-question (FAQ) solutions and task-oriented (slot-filling) solutions. To realize an intelligent customer service scheme based on the pipeline architecture, a sufficiently rich question-answer knowledge base must be constructed, and a slot-filling question-answer tree structure must be designed based on business rules. This solution has several problems. First, constructing the question-answer knowledge base requires a large amount of manpower to annotate common questions and question-answer trees, and the system cannot answer question descriptions that fall outside the knowledge base. In addition, the answer mode of the pipeline-based scheme is single and inflexible: a preset output can be given only after the intention and entity of the question description given by the user are completely matched.
With the development of large-scale language model (Large Language Model, LLM) technology, schemes that output answers with a language model, i.e., intelligent customer service schemes based on a large-scale language model, have emerged. Such a scheme is essentially a combination of FAQ solutions, task-oriented questions and knowledge-retrieval question answering; that is, retrieval of business background knowledge in the target domain and answer generation based on the language model are added on top of the pipeline-based scheme. Knowledge-retrieval question answering can handle question descriptions (queries) that are not in the question-answer knowledge base: through knowledge segmentation, vector generation and index construction over the business background knowledge, when question information input by a user is received, the most relevant knowledge items are retrieved directly from the index and fed into the language model in a certain prompt format, so that the language model can extract or summarize answers from the retrieved business background knowledge. Although this traditional scheme based on a large-scale language model can compensate for the defects of the pipeline architecture to a certain extent, it is better suited to a one-question-one-answer consultation mode and remains inflexible, and the discrimination and summarization capability of the LLM is still lacking. For example, if no answer description exists in the retrieved business background knowledge, the language model is still induced to generate an answer, because it cannot distinguish whether an answer actually exists in the background knowledge.
Moreover, even when the retrieved business background knowledge does include an answer description, a traditional language model's summarization and answer-extraction capability still shows a considerable error rate.
In practice, a human customer service agent determines the meaning of the problem described by the user through multiple rounds of interaction. Because users have different knowledge backgrounds and conversation habits, different users describe the same problem in different ways. Therefore, even when a knowledge question-answering system based on a language model is added to an intelligent customer service scheme, a large number of user problems still cannot be solved and must be handled by manual customer service, so such schemes do little to relieve the working pressure on manual customer service.
In this regard, the intelligent customer service response method provided by the embodiment of the invention adds an auxiliary question-answering strategy based on log matching on top of the existing question-answer logs. By learning from already-processed questions, it avoids constructing a question-answer knowledge base through large amounts of manual annotation, supports multi-turn interactive question answering, and handles diverse user questions more flexibly.
As shown in fig. 1, the intelligent customer service response method provided by the embodiment of the invention includes:
S101: constructing a question-answer knowledge base based on the historical question-answer logs and the business background knowledge.
S102: starting from an initial language model, retrieving from the question-answer knowledge base a first associated question-answer data stream and first associated business background knowledge matched with first sample question information, generating a first prompt from the first associated question-answer data stream and the first associated business background knowledge, inputting the first prompt into the intermediate language model, outputting first predicted answer information, and iteratively training the intermediate language model with the first sample answer information corresponding to the first sample question information as ground truth until the iterative training ends, thereby obtaining the language model.
S103: retrieving from the question-answer knowledge base a second associated question-answer data stream and second associated business background knowledge matched with the question information input by the user, generating a second prompt from the second associated question-answer data stream and the second associated business background knowledge, inputting the second prompt into the language model, and outputting the answer information.
In a specific implementation, for S101, a question-answer knowledge base is first constructed based on a historical question-answer log and business background knowledge.
Before intelligent customer service arose, manual customer service had already accumulated massive question-answer logs, all of which are dialogues generated between users and customer service around actual problems encountered while using products. These dialogues are often not simple single questions and answers: a user may describe a problem over multiple messages or even through multiple channels, and the manual customer service may produce multiple answer descriptions, or arrive at an answer by receiving the user's question descriptions and guiding the user to supplement them. A historical question-answer log is thus generated from a question-answer stream consisting of multiple question descriptions from the user and multiple answer descriptions from the manual customer service.
In general, a service party provides customer service through communication channels such as online chat tools and telephone. When the question-answer stream contains information such as pictures and links, or the question can be described in simple words, the user may choose an online chat tool to communicate with customer service. When it contains no pictures or links, or the question description is complex, the user may choose to consult customer service by telephone, video, etc. Therefore, the historical question-answer logs obtained in the embodiment of the invention may be generated from question-answer streams produced after the user communicates with manual customer service through an online chat tool, telephone, video, etc. Text, pictures, links and the like in a question-answer stream transmitted by an online chat tool can be collated and matched with communication time, product information, etc. to generate a historical question-answer log directly. For a question-answer stream transmitted by telephone, the voice stream can first be converted into text by an automatic speech recognition (e.g., ASR) tool and then matched with communication time, product information, etc. to generate the log. For a question-answer stream transmitted by video, the speech contained in the video can be converted into text and screenshots taken from the video to obtain images; the log is then generated from the text and images matched with communication time, product information, etc.
It should be understood that a historical question-answer log should come from a question-answer stream that actually resolved the user's problem. In particular, the manual customer service or the user may annotate whether the problem was resolved, to decide whether the corresponding log is added to the question-answer knowledge base. When the intelligent customer service response method provided by the embodiment of the invention is applied, the intelligent customer service also generates new question-answer streams in operation, and can continuously collect the new streams corresponding to resolved problems to generate historical question-answer logs and update the question-answer knowledge base.
The business background knowledge may originate from business background knowledge documents in the field, such as product introduction, product usage instructions, etc. given by the service party for the provided product.
When the question-answer knowledge base is constructed, the historical question-answer logs and the business background knowledge can be stored classified by product category. A regular-expression model, a language model, or the like may be used for this classified storage. Because a single historical question-answer log or piece of business background knowledge can relate to products of several categories, after the coarse classification by product category, corresponding associated product labels are added to each historical question-answer log and each piece of business background knowledge, and an index of the question-answer knowledge base is built by extracting all product labels for fast retrieval.
For normalized storage and convenient retrieval, constructing the question-answer knowledge base in S101 may include: cleaning the historical question-answer logs to obtain historical question-answer data streams; cleaning the business background knowledge documents to obtain a business background knowledge list; and constructing the question-answer knowledge base from the historical question-answer data streams and the business background knowledge list. This cleaning strengthens the canonical representation of the historical question-answer logs and the business background knowledge, and reduces the interference of redundant and non-standard information with the training and inference of the language model.
The historical question-answer logs record the question-answer streams between customer service and users when past problems were solved, and contain a large amount of system prompt information (e.g., greetings such as "customer service x is at your service" or "hello, what can I help you with") and even chit-chat unrelated to the business. Such information interferes with the training of the language model and with question discrimination. Business-related dialogue streams can therefore be extracted from the historical question-answer logs by identifying business-related keywords, or non-business information can be deleted in reverse by setting cleaning rules, thereby cleaning the logs into historical question-answer data streams.
Business background knowledge usually comes from business background knowledge documents, which also contain non-business information such as product publicity copy, and different documents may follow different formats. The business background knowledge list can therefore be generated by extracting the business background knowledge content from the documents and performing knowledge segmentation and cleaning on it. A business background knowledge index may further be generated from the list for retrieval of the business background knowledge.
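As a rough illustration of the knowledge-segmentation step, the sketch below splits a cleaned business background document into knowledge items by paragraph, packing paragraphs up to a size limit. The size limit and packing rule are illustrative assumptions, not details fixed by the patent.

```python
def segment_knowledge(document: str, max_chars: int = 200):
    """Split a cleaned business background document into knowledge items.

    Paragraphs (blank-line separated) are packed greedily into items of
    at most max_chars characters -- a simple stand-in for the
    knowledge-segmentation step described above.
    """
    items, current = [], ""
    for para in filter(None, (p.strip() for p in document.split("\n\n"))):
        if current and len(current) + len(para) > max_chars:
            items.append(current)   # current item is full; start a new one
            current = para
        else:
            current = (current + " " + para).strip()
    if current:
        items.append(current)
    return items
```

Each resulting item would then be vectorized and indexed for retrieval.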
For S102, a language model is trained based on the constructed question-answer knowledge base and sample question-answer data. The language model can adopt the architecture of an existing large-scale language model (Large Language Model, LLM), with its parameters optimized by the method provided by the embodiment of the invention. According to the first sample question information (simulating a question description input by a user) in the sample question-answer data, a matched first associated question-answer data stream and first associated business background knowledge are retrieved from the question-answer knowledge base, a first prompt is generated from them and input into the intermediate language model to output first predicted answer information, and the intermediate language model is iteratively trained with the first sample answer information corresponding to the first sample question information as ground truth until the iterative training ends, obtaining the language model.
For S103, the process is similar to the training process of the language model, except that the input data changes from the first sample question information to the question information input by the user, which may include but is not limited to text, images, voice, video and links. According to this question information, the question-answer data streams and business background knowledge in the question-answer knowledge base are matched to obtain the second associated question-answer data stream and the second associated business background knowledge, a second prompt is generated from them and input into the language model, and the answer information is output.
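A minimal sketch of how the second prompt might be assembled from the retrieved question-answer data stream and business background knowledge. The template wording is hypothetical, since the patent does not fix an exact prompt format.

```python
def build_prompt(question, qa_streams, knowledge):
    """Assemble a prompt for the language model.

    qa_streams: retrieved (question, answer) pairs from the knowledge base.
    knowledge:  retrieved business background knowledge items.
    The section headings below are assumed, illustrative wording.
    """
    lines = ["Reference question-answer logs:"]
    for i, (q, a) in enumerate(qa_streams, 1):
        lines.append(f"{i}. Q: {q} / A: {a}")
    lines.append("Business background knowledge:")
    lines.extend(f"- {item}" for item in knowledge)
    lines.append("User question: " + question)
    lines.append("Answer:")
    return "\n".join(lines)

prompt = build_prompt(
    "How do I reset the device?",
    [("Device will not start", "Hold the power button for 10 s")],
    ["A factory reset is available via the settings menu."],
)
```

The same routine would serve for the first prompt during training, with sample question information in place of the user's question.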
According to the intelligent customer service response method provided by the embodiment of the invention, the historical question-answer logs and the business background knowledge are added to the question-answer knowledge base. During training, the first associated question-answer data stream and first associated business background knowledge matched with the first sample question information are retrieved from the knowledge base to generate the first prompt, the intermediate language model outputs the first predicted answer information, and the model is iteratively trained with the first sample answer information as ground truth until training ends. Compared with a traditional language model, the resulting model has stronger question-discrimination and answer-generation abilities thanks to learning from the historical question-answer logs. For question information input by a user, the second associated question-answer data stream and second associated business background knowledge matched with it are retrieved from the knowledge base to generate the second prompt, which is input into the language model, so that the model understands the user's question more accurately and gives appropriate answer information, improving the success rate with which intelligent customer service answers user questions and relieving the working pressure on manual customer service.
The following describes a third embodiment of the present invention.
FIG. 2 is a flowchart of a method for cleaning a history question-answer log according to an embodiment of the present invention; fig. 3 is a schematic diagram of a question-answer data merging operation according to an embodiment of the present invention.
The previous embodiment introduced cleaning the historical question-answer logs into historical question-answer data streams for normalized storage and convenient retrieval. This embodiment further provides a method for that data cleaning.
In the intelligent customer service response method provided by the embodiment of the invention, cleaning the historical question-answer logs to obtain the historical question-answer data streams may include performing, on the historical question-answer logs, a system information removal operation, a non-business information removal operation, an incomplete dialogue removal operation and a question-answer data merging operation. These operations can be executed sequentially or independently.
As shown in fig. 2, if the above data cleansing operation is sequentially executed, an embodiment of the present invention provides a cleansing method for a history question-answer log, including:
S201: inputting a historical question-answer log;
S202: system information removal: performing the system information removal operation on the historical question-answer log to obtain the log after system information removal;
S203: non-business information removal: performing the non-business information removal operation on the log after system information removal to obtain the log after non-business information removal;
S204: incomplete dialogue removal: performing the incomplete dialogue removal operation on the log after non-business information removal to obtain the log after incomplete dialogue removal;
S205: question-answer data merging: performing the question-answer data merging operation on the log after incomplete dialogue removal, and taking the result as the final historical question-answer data stream;
S206: outputting the historical question-answer data stream.
The system information removal operation removes the specified system prompt information contained in the historical question-answer log through regular-expression/rule-based matching, obtaining the log after system information removal.
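For instance, the rule-based removal might look like the following sketch; the patterns are hypothetical stand-ins modeled on the greetings quoted earlier, and a real deployment would maintain its own rule list.

```python
import re

# Hypothetical patterns for system greetings (assumed examples)
SYSTEM_PATTERNS = [
    re.compile(r"^customer service .* is at your service"),
    re.compile(r"^hello,? what (questions|can i help)"),
]

def is_system_message(msg: str) -> bool:
    # A message is system prompt information if any pattern matches
    text = msg.strip().lower()
    return any(p.search(text) for p in SYSTEM_PATTERNS)

def strip_system_messages(log):
    # Keep only messages that match no system pattern
    return [m for m in log if not is_system_message(m)]
```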
The historical question-answer log may also contain small talk between the user and customer service, which interferes with the language model's learning from the log. The embodiment of the invention therefore performs the non-business information removal operation on the log. A classification model can be used to judge whether data in the log is business information or small talk. Performing the non-business information removal operation then includes: classifying the data in the historical question-answer log with a first classification model to obtain business information and non-business information; and deleting the non-business information from the log.
The first classification model may employ an understanding model with an encoder structure, such as BERT. To improve training efficiency, the embodiment of the invention trains the first classification model with cyclic annotation. The training method may include: performing cyclic annotation training on an initial classification model with a first sample dialogue stream annotated with business and non-business data and a second sample dialogue stream without such annotation, to obtain the first classification model. In each annotation cycle, the initial classification model is trained with the first sample dialogue stream to obtain an optimized classification model; the second sample dialogue stream is input into the optimized model to obtain a first coarse classification result; the first coarse classification result is re-annotated, and the optimized model is trained again. Specifically, after each round, data whose classification probability falls between 0.3 and 0.7 in the coarse classification result is reviewed and corrected; after several such cycles, a diverse classification data set and the optimized classification model are obtained.
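The cyclic annotation loop can be sketched as follows. The `fit`/`predict_proba` model interface and the `review` callback (standing in for the manual check of the 0.3-0.7 probability band) are assumptions for illustration, not an API defined by the patent.

```python
def select_for_review(probs, low=0.3, high=0.7):
    # Indices whose classification probability is too uncertain,
    # i.e. falls inside the 0.3-0.7 band mentioned above
    return [i for i, p in enumerate(probs) if low <= p <= high]

def cyclic_annotation_training(model, labeled, unlabeled, review, rounds=3):
    """One labeling cycle per round: train on the annotated data,
    coarse-classify the unannotated stream, send uncertain items to
    a human `review` callback, and train again on the grown set."""
    for _ in range(rounds):
        model.fit(labeled)                      # train / optimize
        probs = model.predict_proba(unlabeled)  # coarse classification
        labeled += [review(unlabeled[i]) for i in select_for_review(probs)]
    return model
```

The same loop applies to the second and third classification models, with their respective sample dialogue streams.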
To ensure content integrity, the incomplete dialogue removal operation is next performed on the historical question-answer log. Performing it may include: detecting dialogue-stream end information in the historical question-answer log with a second classification model; if end information is detected, determining the dialogue stream to be complete; if not, determining it to be incomplete; and deleting incomplete dialogue streams from the log.
The second classification model can adopt the same architecture as the first classification model and judges whether a dialogue stream is complete. Because common small-scale encoder models mostly use absolute position encoding and thus limit the length of the input sequence, the embodiment of the invention classifies only the second half of the dialogue stream to judge whether the question answering has ended. The training method of the second classification model may include: performing cyclic annotation training on an initial classification model with a third sample dialogue stream annotated with complete and incomplete dialogues and a fourth sample dialogue stream without such annotation, to obtain the second classification model. In each annotation cycle, the initial classification model is trained with the third sample dialogue stream to obtain an optimized classification model; the fourth sample dialogue stream is input into the optimized model to obtain a second coarse classification result; the second coarse classification result is re-annotated, and the optimized model is trained again.
All question-answer streams in the historical question-answer log are cleaned with the second classification model; if the model considers that a dialogue stream did not end normally or that customer service did not give a final solution, the whole dialogue stream is deleted.
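The "second half" that the completeness classifier receives can be obtained by trimming the dialogue stream to the encoder's input limit from the end; the sketch below assumes a whitespace token count as a crude stand-in for a real tokenizer.

```python
def dialogue_tail(messages, max_tokens=512):
    """Return the tail of a dialogue stream that fits within the
    encoder's input limit; the completeness classifier then runs on
    this tail only (an assumption drawn from the text above)."""
    kept, total = [], 0
    for m in reversed(messages):
        total += len(m.split())   # crude whitespace token count
        if total > max_tokens:
            break
        kept.append(m)
    return list(reversed(kept))
```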
Because a user's question description and customer service's answer description may be spread over several messages, with questions and answers interleaved in the dialogue stream, a single message may not describe a whole question or a whole answer. The question-answer data in the historical question-answer log therefore needs to be merged. Performing the question-answer data merging operation may include: identifying start messages, end messages and intermediate messages from the historical question-answer log with a third classification model; and merging the start message, end message and intermediate messages belonging to the same question description. To avoid interference from irrelevant items, the log input to the merging operation should already have undergone the system information removal, non-business information removal and incomplete dialogue removal operations.
The third classification model may employ a classification model with the same architecture as the first classification model, used to detect question descriptions belonging to the same question and answer descriptions belonging to the same answer in the question-answer data stream. Its training method may include: performing sequence annotation in sentence units with start-message, end-message and intermediate-message labels to obtain first sample sequence data; and performing cyclic annotation training on an initial classification model with the first sample sequence data and an unannotated fifth sample dialogue stream to obtain the third classification model. In each annotation cycle, the initial classification model is trained with the first sample sequence data to obtain an optimized classification model; the fifth sample dialogue stream is input into the optimized model to obtain a third coarse classification result; the third coarse classification result is re-annotated, and the optimized model is trained again. As shown in FIG. 3, unlike conventional token-level sequence annotation, the embodiment of the invention performs sequence annotation in sentence units. The model input is:
[CLS] Question1 [SEP] Question2 [SEP] … [SEP]
Here, [CLS] marks the whole sequence, corresponding to one segment that may contain multiple sentences; each sentence is assumed to correspond to one question (e.g., Question1, Question2, …), and each word in a sentence is a token (e.g., Question1 corresponds to token 1 through token a, Question2 to token b through token h, …). Each [SEP] is a position to be identified. The input sequence is initialized in the third classification model to the embeddings E_[CLS], E_1 … E_a, E_[SEP], E_b … E_h, E_[SEP], …, which the model maps to the hidden states C, T_1 … T_a, T_[SEP], T_b … T_h, T_[SEP], …. The output of the third classification model classifies the output at each [SEP] position, and its tags (labels) can be designed as:
start (start): a start message representing a description of the problem;
end (end): an end message representing a description of the problem;
intermediate (mid): an intermediate message representing a description of the problem.
The loss function of the third classification model may be:

L = -(1/n) Σ_{i=1}^{n} Σ_{j=1}^{k} y_{ij} · log(p_{ij})

where L is the loss function of the third classification model, n is the number of sentences in one piece of sample sequence data, k is the number of tag classes, i is the sentence index, j is the label index, y_{ij} is the probability that the i-th sentence is annotated with the j-th tag, and p_{ij} is the model's predicted probability of the j-th tag at the i-th sentence.
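This loss reads directly as code (pure Python for clarity; a real implementation would use a deep-learning framework's cross-entropy over the [SEP] positions):

```python
import math

def third_model_loss(y_true, y_pred):
    """Mean cross-entropy over the n annotated sentence positions:
    L = -(1/n) * sum_i sum_j y_ij * log(p_ij).

    y_true: per-sentence tag distributions (typically one-hot).
    y_pred: per-sentence predicted probabilities over the k tags.
    """
    n = len(y_true)
    total = 0.0
    for y_i, p_i in zip(y_true, y_pred):
        # skip zero-weight terms so log(0) is never evaluated
        total -= sum(y * math.log(p) for y, p in zip(y_i, p_i) if y > 0)
    return total / n
```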
If the third classification model adopts an encoder with absolute position encoding, the number of input tokens is limited, so it can be split into a fourth classification model and a fifth classification model, each trained and optimized as above. Identifying the start, end and intermediate messages with the third classification model may then include: identifying question start, question end and question intermediate messages from the historical question-answer log with the fourth classification model; and identifying answer start, answer end and answer intermediate messages from the historical question-answer log with the fifth classification model.
Alternatively, the third classification model may adopt an encoder with relative position encoding, in which case the user's messages and the customer service messages can be input into the model directly, with six labels:
c_start: a start message indicating a user question;
c_end: an end message indicating a user question;
c_mid: an intermediate message representing a user question;
a_start: a start message indicating a customer service response;
a_end: an end message indicating a customer service response;
a_mid: an intermediate message representing a customer service response.
All messages are traversed directly using the optimized third classification model (or the fourth and fifth classification models): if the model considers that a message belongs to the same question description as the preceding message, it is merged with the preceding messages; otherwise it is grouped with the following messages.
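The traversal-and-merge step can be sketched as follows. This is a minimal illustration in which the classifier is mocked by a lookup table; all names (`merge_messages`, the label strings' use, the toy messages) are hypothetical:

```python
def merge_messages(messages, classify):
    """Traverse messages in order; a *_start label opens a new group,
    while *_mid and *_end labels are merged into the preceding group."""
    groups = []
    for msg in messages:
        label = classify(msg)  # e.g. 'c_start', 'c_mid', 'c_end', 'a_start', ...
        if label.endswith('_start') or not groups:
            groups.append([msg])
        else:
            groups[-1].append(msg)
    return groups

# Toy stand-in for the trained classifier, purely for illustration.
labels = {'m1': 'c_start', 'm2': 'c_mid', 'm3': 'c_end',
          'm4': 'a_start', 'm5': 'a_end'}
grouped = merge_messages(['m1', 'm2', 'm3', 'm4', 'm5'], labels.get)
```

In a real pipeline, `classify` would be the optimized classification model's per-message prediction.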
According to the intelligent customer service response method provided by the embodiment of the invention, the historical question-answer data stream is obtained by performing a system-information removal operation, a non-business-information removal operation, an incomplete-dialogue removal operation and a question-answer data merging operation on the historical question-answer log. In particular, the latter three operations are implemented with trained classification models, which enhances the cleaning of the historical question-answer log.
The fourth embodiment of the present invention will be described below.
FIG. 4 is a schematic diagram of matching of image-text vectors according to an embodiment of the present invention; FIG. 5 is a flowchart of a method for optimizing a text vectorization model according to an embodiment of the present invention; fig. 6 is a flowchart of an optimization method of an image vectorization model according to an embodiment of the present invention.
In the intelligent customer service response method provided by the embodiment of the invention, the key words can be used for searching when the question-answer knowledge base is constructed and the associated knowledge is searched from the question-answer knowledge base. To increase computational efficiency, both the historical question-answer log and business background knowledge may be converted into vector representations.
In the intelligent customer service response method provided by the embodiment of the present invention, the step of constructing a question-answer knowledge base based on the historical question-answer log and the business background knowledge in S101 may include: extracting historical question-answer data stream vectors from the historical question-answer logs by using a vectorization processing model; extracting a business background knowledge vector from business background knowledge by using a vectorization processing model; and constructing a question-answer knowledge base by using the historical question-answer data stream vector and the business background knowledge vector.
The retrieving, from the question-answer knowledge base, the first associated question-answer data stream and the first associated business background knowledge that match the first sample question information in S102 may include: extracting a first problem vector from the first sample problem information by using a vectorization processing model; a first associated question-answer data stream vector matching the first question vector and a first associated business background knowledge vector are retrieved from a question-answer knowledge base based on the vector similarity.
Retrieving a second associated question-answer data stream and second associated business background knowledge from the question-answer knowledge base that matches the question information entered by the user in S103 may include: extracting a second problem vector from the problem information by using the vectorization processing model; and retrieving a second associated question-answer data stream vector matched with the second question vector and a second associated business background knowledge vector from the question-answer knowledge base based on the vector similarity.
In particular implementations, the vectorization processing model may adopt an embedding-encoder architecture. Current vectorization processing models and language models are often trained on general-purpose corpora, and if such a model is applied directly in an intelligent customer service scheme, the extracted vectors exhibit a number of problems. Therefore, in the embodiment of the invention, the vectorization encoder in the vectorization processing model can undergo secondary pre-training on business background knowledge. To reduce the data-annotation workload, cyclic annotation training can be performed on the vectorization processing model.
The training method of the vectorization processing model may include:
pre-training a vectorization encoder in the initial vectorization processing model by using business background knowledge to obtain a pre-trained initial vectorization processing model;
from the initial vectorization processing model after pre-training, performing cyclic labeling training on the initial vectorization processing model after pre-training by using a sixth sample dialogue stream and an unlabeled seventh sample dialogue stream which label similar dialogue streams and dissimilar dialogue streams to obtain a vectorization processing model;
in each cycle labeling training, training the pre-trained initial vectorization processing model by using a sixth sample dialogue stream to obtain an optimized classification model; inputting the seventh sample dialogue flow into the optimized classification model to obtain a fourth coarse classification result; and re-labeling the fourth coarse classification result, and then training the optimized classification model again.
When the associated knowledge is retrieved from the knowledge base of questions and answers, the degree of association can be determined by calculating the cosine similarity between the vectors. Based on the method, the cosine similarity can be directly used as a loss to train the vectorization processing model, so that the cosine value of the text with similar semantics is as large as possible, and the cosine value of the text with incoherent semantics is as small as possible. The loss function of the vectorization processing model may be:
L = Σ_{(a,p)∈P} Σ_{(a,n)∈N} max(0, cos(a, n) − cos(a, p) + m)

wherein L is the loss function of the vectorization processing model, a is the first sample vector extracted from the sample question set, p is a second sample vector whose question expression is similar to that of the first sample vector, n is a third sample vector whose question expression is dissimilar to that of the first sample vector, m is the edge constant, P is the set of similarly expressed sample questions, N is the set of dissimilarly expressed sample questions, cos(a, p) is the cosine similarity of the first sample vector and the second sample vector, and cos(a, n) is the cosine similarity of the first sample vector and the third sample vector.
The edge constant m is positive and is an adjustable parameter. If m is too large, the loss is relatively large and the network may fail to converge, but similar samples are discriminated more confidently, i.e., the similar pair (a, p) is separated more sharply from the dissimilar pair (a, n); if m is too small, the loss tends toward 0 and the model trains easily, but (a, p) becomes difficult to distinguish from (a, n). The training effect of the vectorization processing model is optimized by adjusting the edge constant m.
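A minimal pure-Python sketch of one term of this cosine-based triplet loss; the function names are illustrative, not from the patent:

```python
import math

def cos(u, v):
    """Cosine similarity of two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def triplet_loss(anchor, positive, negative, margin=0.2):
    """max(0, cos(a, n) - cos(a, p) + m): pushes the similar pair's cosine
    above the dissimilar pair's cosine by at least the margin m."""
    return max(0.0, cos(anchor, negative) - cos(anchor, positive) + margin)

a = [1.0, 0.0]
p = [1.0, 0.1]   # nearly the same direction as the anchor (similar question)
n = [0.0, 1.0]   # orthogonal to the anchor (dissimilar question)
loss = triplet_loss(a, p, n, margin=0.2)
```

With the positive already far more similar than the negative, the margin is satisfied and the loss is zero; swapping the roles of p and n produces a large loss, which is what drives training.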
For convenience of description, users sometimes include images in their problem descriptions, and customer service may likewise introduce image information when answering. Therefore, in the embodiment of the present invention, extracting vectors from input data may include: extracting a text vector from the text in the input data using a text vectorization model; converting an image in the input data into an image vector using an image vectorization model; and splicing the image vector with the text vector corresponding to the image's context.
Specifically, stitching the image vector with the text vector corresponding to the context of the image may include: performing image recognition on the image to obtain an image recognition result; comparing the image recognition result with the context of the image, removing the repeated characters, updating the context of the image by using the rest image recognition result, and updating the character vector of the context of the image; and splicing the image vector with the text vector of the context of the updated image.
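One plausible reading of "removing the repeated characters" is a character-level deduplication of the image-recognition result against the image's context; a minimal sketch under that assumption (function name and sample strings are hypothetical):

```python
def update_context_with_ocr(ocr_text, context):
    """Drop from the OCR result every character already present in the
    image's textual context, then append what remains to the context."""
    context_chars = set(context)
    remainder = ''.join(ch for ch in ocr_text if ch not in context_chars)
    return context + remainder

# The digits/uppercase letter are new information; the rest repeats the context.
ctx = update_context_with_ocr("error code E17", "the panel shows error code ")
```

The updated context would then be re-vectorized with the text vectorization model before being spliced with the image vector.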
In specific implementations, the text vectorization model may be trained and used for inference in the manner described in the foregoing embodiments of the present invention. The image vectorization model can likewise adopt an embedding-encoder architecture. Image encoders at the present stage have poor characterization capability for images in professional fields and need to be optimized.
Alternatively, the images in the dialog flow may be considered to correspond to the text content of their context, and after vector extraction the image vectors may be spliced with the text vectors of their context. When training the image encoder, the idea of contrastive learning may be applied on top of an open-source model (e.g., ViT). As shown in fig. 4, text is input into the text vectorization model to obtain text vectors (T_1, T_2, T_3, … T_N), and images are input into the image vectorization model to obtain image vectors (I_1, I_2, I_3, … I_N). After the text vectors and image vectors are L2-normalized, a cosine similarity matrix is obtained by direct dot product, and training involving the image vectorization model can be performed by setting the diagonal labels to 1 and the off-diagonal labels to 0. In practical application, suppose a user types that the fault lamp on the left side of a server is on and then uploads an image of the server's fault lamp; the image-text splicing method provided by the embodiment of the invention can be used for vectorization conversion and splicing.
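The L2-normalize-then-dot-product step can be sketched in pure Python; the function names and the toy 2-D vectors are illustrative assumptions:

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def similarity_matrix(text_vecs, image_vecs):
    """Dot products of L2-normalized vectors give a cosine-similarity
    matrix; training pushes the diagonal (matched text-image pairs)
    toward 1 and the off-diagonal entries toward 0."""
    t = [l2_normalize(v) for v in text_vecs]
    im = [l2_normalize(v) for v in image_vecs]
    return [[sum(a * b for a, b in zip(ti, ij)) for ij in im] for ti in t]

# Two matched pairs pointing in the same directions, at different magnitudes.
texts = [[1.0, 0.0], [0.0, 2.0]]
images = [[3.0, 0.0], [0.0, 0.5]]
m = similarity_matrix(texts, images)
```

Because the paired vectors share a direction, the matrix comes out as the identity, i.e., exactly the target labels (diagonal 1, off-diagonal 0).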
In other scenarios, the images in the dialog flow may not overlap with the text information of their context at all, in which case text alignment and splicing cannot be performed. Referring to the pre-training of the encoder in the text vectorization model, the image encoder in the image vectorization model can be pre-trained with business background knowledge images, obtaining, on the basis of an existing image encoder, one capable of describing images in the professional field. The image vector can then be extracted after the image's content is understood by the image vectorization model.
Sometimes the images in the dialog flow contain text, which aids in understanding them. Therefore, before an image is recognized, the characters it contains can be recognized first to accelerate understanding of the image.
On the basis of the vector extraction scheme provided by the embodiment of the invention, to further facilitate retrieval, constructing the question-answer knowledge base with the historical question-answer data stream vectors and the business background knowledge vectors may include: constructing a historical question-answer data stream vector index based on the historical question-answer data stream vectors by using an index tool; constructing a business background knowledge vector index based on the business background knowledge vectors by using the index tool; and storing the historical question-answer data stream vector index, the historical question-answer data stream vectors, the business background knowledge vector index and the business background knowledge vectors in the question-answer knowledge base. The faiss tool may specifically be employed to construct the vector indexes. In an actual question-answering setting, considering both retrieval accuracy and speed, the vector index can be constructed with the IVFFlat inverted-index mode of the faiss tool.
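The inverted-file (IVF) idea behind that index mode can be illustrated with a hand-rolled toy, deliberately avoiding the real faiss API: vectors are bucketed under their nearest centroid, and a query scans only the closest bucket(s) instead of the whole collection. All class and variable names here are hypothetical:

```python
import math

def dist2(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

class ToyIVFFlat:
    """Toy inverted-file flat index: each stored vector lives in the
    bucket of its nearest centroid; search probes only nprobe buckets."""
    def __init__(self, centroids):
        self.centroids = centroids
        self.lists = {i: [] for i in range(len(centroids))}

    def add(self, vec_id, vec):
        nearest = min(range(len(self.centroids)),
                      key=lambda i: dist2(vec, self.centroids[i]))
        self.lists[nearest].append((vec_id, vec))

    def search(self, query, k=1, nprobe=1):
        probes = sorted(range(len(self.centroids)),
                        key=lambda i: dist2(query, self.centroids[i]))[:nprobe]
        candidates = [item for i in probes for item in self.lists[i]]
        candidates.sort(key=lambda item: dist2(query, item[1]))
        return [vec_id for vec_id, _ in candidates[:k]]

index = ToyIVFFlat(centroids=[[0.0, 0.0], [10.0, 10.0]])
index.add('faq-1', [0.5, 0.2])
index.add('faq-2', [9.8, 10.1])
hits = index.search([10.0, 9.9], k=1)
```

Probing fewer buckets trades a little recall for much less computation, which is the accuracy/speed balance the text refers to.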
As shown in fig. 5, an embodiment of the present invention provides a method for optimizing a text vectorization model, including: inputting the history question-answer logs and the business background knowledge into a text vectorization model, and outputting a vector index; the similarity problem is searched from the vector index, and the text vector is output through loss optimization (triplet loss).
As shown in fig. 6, an embodiment of the present invention provides a method for optimizing an image vectorization model, including: and extracting images and texts from the history question-answer logs, and performing contrast learning through a text vectorization model and an image vectorization model to obtain spliced image-text vectors.
According to the intelligent customer service response method provided by the embodiment of the invention, the historical question-answer logs, the business background knowledge, the sample question information and the question information input by the user are further subjected to vectorization representation through the vectorization model, so that similarity calculation is conveniently performed to quickly retrieve associated data in the question-answer knowledge base. The vectorization encoder in the vectorization processing model is pre-trained to convert the vectorization encoder in the general field into the vectorization encoder with service specialty, and a loss function of the vectorization model is built according to cosine similarity, so that judgment of whether the expressions are similar or not can be quickly carried out when similar data are matched.
The fifth embodiment of the present invention will be described below.
FIG. 7 is a flowchart of a training method of a language model according to an embodiment of the present invention; fig. 8 is a flowchart of a method for reasoning a language model according to an embodiment of the present invention.
Based on the above embodiment of the present invention, the construction of the question-answer knowledge base based on the historical question-answer logs and the business background knowledge in S101 may include: data cleaning is carried out on the historical question-answer logs to obtain historical question-answer data streams, and the historical question-answer data streams are converted into historical data stream vectors by using a vectorization processing model; data cleaning is carried out on the business background knowledge document to obtain a business background knowledge list, and the business background knowledge list is converted into a business background knowledge vector by using a vectorization processing model; constructing a historical question-answer data stream vector index based on the historical question-answer data stream vector by using an index tool; constructing a business background knowledge vector index based on the business background knowledge vector by using an index tool; and storing the historical question-answer data stream vector index, the historical question-answer data stream vector, the service background knowledge vector index and the service background knowledge vector into a question-answer knowledge base.
In S102, retrieving a first associated question-answer data stream and a first associated business background knowledge that are matched with the first sample question information from the question-answer knowledge base, generating a first prompt according to the first associated question-answer data stream and the first associated business background knowledge, inputting the first prompt into the intermediate language model, outputting first predicted answer information, and performing iterative training on the intermediate language model by taking the first sample answer information corresponding to the first sample question information as a true value, which may include:
The method comprises the steps that first sample question information is used as user input information, and a first associated question-answer data stream vector and a first associated service background knowledge vector are retrieved from a question-answer knowledge base based on a historical question-answer data stream vector index and a service background knowledge vector index; and generating a first prompt by using the first associated question-answer data stream vector and the first associated business background knowledge vector, and inputting the first prompt into an intermediate language model to perform model parameter optimization.
Specifically, as shown in fig. 7, an embodiment of the present invention provides a training method for a language model, which may include: carrying out knowledge segmentation cleaning on the business background knowledge document to obtain a business background knowledge list, and generating a business background knowledge index based on the business background knowledge list; data cleaning is carried out on the historical question-answer logs to obtain a historical question-answer data stream, and text questions are obtained from the historical question-answer data stream; generating a first prompt according to a first associated question-answer data stream obtained by associating and matching the business background knowledge index and the text questions and a first associated business background knowledge; and inputting the first prompt into the language model to perform model fine adjustment.
Based on this, as shown in fig. 8, an embodiment of the present invention provides a method for reasoning a language model, which may include: carrying out knowledge segmentation cleaning on the business background knowledge document to obtain a business background knowledge list, and generating a business background knowledge index based on the business background knowledge list; data cleaning is carried out on the historical question-answer logs, and a historical question-answer data stream is obtained; constructing a historical question-answer data stream index based on the historical question-answer data stream; performing association matching on a business background knowledge index and a historical question-answer data stream index according to question information (query) input by a user, and generating a second prompt based on a second associated question-answer data stream and second associated business background knowledge which are associated; and inputting the second prompt into the language model and outputting the answer information.
The sixth embodiment of the present invention will be described.
Fig. 9 is a schematic diagram of matching a question-answer data stream according to an embodiment of the present invention; fig. 10 is a schematic diagram of matching accumulated scores of a question-answer data stream according to an embodiment of the present invention.
When the intelligent customer service scheme provided by the embodiment of the invention is applied to an actual question-answer scenario, the overall flow is similar to the optimization process of the language model, the difference being that the input is the question information provided by the user.
In the intelligent customer service response method provided by the embodiment of the present invention, retrieving, from the question knowledge base, a second associated question-answer data stream and second associated business background knowledge that are matched with the question information input by the user in S103 may include:
carrying out data cleaning on the problem information to obtain a problem to be solved corresponding to the problem information and a problem description corresponding to the problem to be solved;
retrieving candidate question-answer data streams and candidate business background knowledge associated with the question descriptions from a question-answer knowledge base;
screening the candidate question-answering data streams according to the similarity calculation result of the candidate question-answering data streams and the question description to obtain a second associated question-answering data stream;
and screening the candidate business background knowledge to obtain second associated business background knowledge according to the similarity calculation result of the candidate business background knowledge and the problem description.
In a specific implementation, the data cleaning of the question information may follow the data cleaning of the historical question-answer log provided in the foregoing embodiments of the present invention, including: performing a system-information removal operation, a non-business-information removal operation, an incomplete-dialogue removal operation and a question-answer data merging operation on the question information to obtain the question descriptions. In practical application, the question descriptions raised by a user in one consultation can all be considered to belong to the same question to be solved, and all obtained question descriptions are associated to that question. If the question descriptions raised in one consultation are considered to belong to different questions to be solved, they can be classified by keyword recognition and by judging whether their text passages are similar, or classified into different questions to be solved based on the associated knowledge matched subsequently.
A receive-time threshold may be set, and if the user exceeds the threshold without further input, the input is deemed complete. The vectorization processing model introduced in the foregoing embodiments may be used to extract vectors from the question descriptions, specifically extracting text with the text vectorization model and images with the image vectorization model, which is not repeated here. Then, using the question-answer knowledge base built with the historical question-answer data stream index and the business background knowledge vector index, and after id_dialog_subject mapping, the topk historical question-answer data streams most relevant to the question description are finally obtained (the retrieval process for the second associated business background knowledge is the same).
For one or more question descriptions of each question to be processed, each of the question descriptions is matched to a plurality of associated question-answering data streams, and the second associated question-answering data stream can be determined by calculating the number of associated question-answering data streams matched to one question to be processed, namely if each question description of one question to be processed is matched to the same question-answering data stream, the question-answering data stream is taken as the second associated question-answering data stream.
Alternatively, the final second associated question-answer data stream may be determined by accumulating the scores. Screening the candidate question-answer data stream according to the similarity calculation result of the candidate question-answer data stream and the question description to obtain a second associated question-answer data stream, which may include:
if the candidate question-answer data stream is searched once, taking the similarity value of the candidate question-answer data stream and the corresponding question description as a similarity score of the candidate question-answer data stream; if the candidate question-answer data stream is searched for a plurality of times, accumulating similarity values of the candidate question-answer data stream and corresponding question descriptions as similarity scores of the candidate question-answer data stream;
and taking the first preset number of candidate question-answer data streams with the highest similarity scores as second associated question-answer data streams.
Screening the candidate business background knowledge to obtain second associated business background knowledge according to the similarity calculation result of the candidate business background knowledge and the problem description, wherein the second associated business background knowledge can comprise:
if the candidate business background knowledge is searched once, taking the similarity value of the candidate business background knowledge and the corresponding problem description as the similarity score of the candidate business background knowledge; if the candidate business background knowledge is searched for a plurality of times, accumulating similarity values of the candidate business background knowledge and corresponding problem descriptions as similarity scores of the candidate business background knowledge;
and taking the second preset number of candidate business background knowledge with the highest similarity score as second associated business background knowledge.
In practical application, as shown in fig. 9, suppose that after merging a user's question to be processed, three effective question descriptions q1, q2 and q3 are obtained; the number of candidate question-answer data streams retrieved per description is set to 3 and topk to 2. The candidate question-answer data streams matched to question description q1 are d1, d2 and d3, with similarity values 0.7, 0.3 and 0.5 respectively, so d1 and d3 are taken as the candidates matched to q1. The candidates matched to question description q2 are d2, d3 and d4, with similarity values 0.2, 0.3 and 0.6 respectively, so d3 and d4 are taken as the candidates matched to q2. The candidates matched to question description q3 are d3, d4 and d5, with similarity values 0.4, 0.3 and 0.1 respectively, so d3 and d4 are taken as the candidates matched to q3. After accumulating scores, as shown in fig. 10, the cumulative scores of candidates d1, d2, d3, d4 and d5 across question descriptions q1, q2 and q3 are 0.7, 0.5, 1.2, 0.9 and 0.1 respectively, so the second associated question-answer data streams finally matched to the question to be processed are d3 and d4.
The second associated business background knowledge is determined in the same way.
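The cumulative-score screening can be sketched as follows, reusing the similarity values from the worked example above (the function name is a hypothetical label, not from the patent):

```python
from collections import defaultdict

def cumulative_top(matches, topn=2):
    """Accumulate each candidate's similarity over every question
    description it was retrieved for, then keep the topn candidates
    with the highest cumulative scores."""
    scores = defaultdict(float)
    for per_query in matches:
        for candidate, sim in per_query:
            scores[candidate] += sim
    return sorted(scores, key=scores.get, reverse=True)[:topn]

# q1, q2 and q3 each retrieve three candidate question-answer data streams.
matches = [
    [('d1', 0.7), ('d2', 0.3), ('d3', 0.5)],   # q1
    [('d2', 0.2), ('d3', 0.3), ('d4', 0.6)],   # q2
    [('d3', 0.4), ('d4', 0.3), ('d5', 0.1)],   # q3
]
selected = cumulative_top(matches, topn=2)
```

A candidate retrieved only once simply keeps its single similarity value as its score, matching the rule stated above; with these figures the accumulation yields d3 (1.2) and d4 (0.9) as the second associated question-answer data streams.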
According to the intelligent customer service response method provided by the embodiment of the invention, the problem to be solved corresponding to the problem information and the problem description corresponding to the problem to be solved can be obtained by cleaning the data of the problem information input by the user, the candidate question-answer data stream and the candidate service background knowledge associated with the problem description are retrieved from the question-answer knowledge base, then the second associated question-answer data stream is obtained by screening the candidate question-answer data stream in a mode of accumulating similarity values to obtain similarity scores according to similarity calculation results, and the second associated service background knowledge is obtained by screening the candidate service background knowledge, so that the language model response is guided to generate response information by fully utilizing the history question-answer logs and the service background knowledge.
The invention further discloses an intelligent customer service response device, equipment and a readable storage medium corresponding to the intelligent customer service response method.
The seventh embodiment of the present invention will be described.
Fig. 11 is a schematic structural diagram of an intelligent customer service answering device according to an embodiment of the present invention.
As shown in fig. 11, the intelligent customer service response device provided by the embodiment of the invention includes:
A library building unit 1101, configured to build a question-answer knowledge base based on the historical question-answer log and the business background knowledge;
the training unit 1102 is configured to, from the initial language model, retrieve a first associated question-answer data stream and a first associated business background knowledge that match the first sample question information from the question-answer knowledge base, generate a first prompt according to the first associated question-answer data stream and the first associated business background knowledge, input the first prompt into the intermediate language model, output first predicted answer information, and iteratively train the intermediate language model with the first sample answer information corresponding to the first sample question information as a true value until the iterative training is completed, so as to obtain a language model;
the processing unit 1103 is configured to retrieve a second associated question-answer data stream and second associated business background knowledge that match the question information input by the user from the question-answer knowledge base, generate a second prompt according to the second associated question-answer data stream and the second associated business background knowledge, input the second prompt into the language model, and output the answer information.
In some implementations, the library-building unit 1101 builds a question-answer knowledge base based on the historical question-answer logs and business background knowledge, including:
data cleaning is carried out on the historical question and answer logs, and a historical question and answer data stream is obtained;
Data cleaning is carried out on the business background knowledge document, and a business background knowledge list is obtained;
and constructing a question-answer knowledge base by using the historical question-answer data stream and the business background knowledge list.
In some implementations, the database unit 1101 performs data cleansing on the historical question-answer logs to obtain a historical question-answer data stream, including:
and executing a system information removing operation, a non-business information removing operation, an incomplete dialogue removing operation and a question and answer data merging operation on the historical question and answer log to obtain a historical question and answer data stream.
In some implementations, the library-building unit 1101 performs operations to remove non-business information, including:
classifying the data in the history question-answer log by using a first classification model to obtain business information and non-business information;
and deleting the non-business information from the historical question-answer log.
In some implementations, the training method of the first classification model includes:
performing cyclic annotation training on the initial classification model by using a first sample dialogue stream marked with business data and non-business data and a second sample dialogue stream not marked with business data and non-business data to obtain a first classification model;
in each cycle of labeling training, training the initial classification model by using the first sample dialogue stream to obtain an optimized classification model; inputting the second sample dialogue stream into the optimized classification model to obtain a first coarse classification result; and re-labeling the first coarse classification result, and then training the optimized classification model again.
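The cyclic labeling loop (train on labeled data, coarsely classify unlabeled data, re-label, retrain) can be sketched as follows; the keyword-overlap "model" and the relabel function are illustrative stand-ins for a real classifier and a human annotator:

```python
def train(labeled):
    # trivial stand-in model: a message is "business" if it shares a
    # word with any message labeled "business" in the training data
    vocab = {w for text, label in labeled if label == "business"
             for w in text.split()}
    return lambda text: "business" if set(text.split()) & vocab else "non-business"

def cyclic_label_training(labeled, unlabeled, relabel, rounds=2):
    model = train(labeled)                           # initial classification model
    for _ in range(rounds):
        coarse = [(t, model(t)) for t in unlabeled]  # coarse classification result
        labeled = labeled + relabel(coarse)          # re-labeling step
        model = train(labeled)                       # train again
    return model

seed = [("refund my order", "business"), ("hello there", "non-business")]
pool = ["order status please", "good morning"]
# re-labeling simulated with a keyword rule standing in for a human annotator
fix = lambda coarse: [(t, "business" if "order" in t else "non-business")
                      for t, _ in coarse]
model = cyclic_label_training(seed, pool, fix)
```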
In some implementations, the library building unit 1101 performs an incomplete dialog removal operation, including:
detecting the end information of the dialogue flow from the history question-answer log by using a second classification model;
if the end information of the dialogue flow is detected, determining that the dialogue flow is a complete dialogue flow;
if the end information of the dialogue flow is not detected, determining that the dialogue flow is an incomplete dialogue flow;
incomplete dialog flows in the historical question-answer log are deleted.
In some implementations, the training method of the second classification model includes:
performing cyclic annotation training on the initial classification model by using a third sample dialogue stream marked with complete dialogue and incomplete dialogue and a fourth sample dialogue stream not marked with complete dialogue and incomplete dialogue to obtain a second classification model;
in each cycle labeling training, training an initial classification model by using a third sample dialogue flow to obtain an optimized classification model; inputting the fourth sample dialogue flow into the optimized classification model to obtain a second coarse classification result; and re-labeling the second coarse classification result, and then training the optimized classification model again.
In some implementations, the library-building unit 1101 performs a question-answer data merge operation, including:
identifying a start message, an end message and an intermediate message from the historical question-answer log by using a third classification model;
The start message, the end message and the intermediate message belonging to the same problem description are combined.
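The merge step can be sketched as grouping tagged messages between a start tag and an end tag (the tag names below are illustrative):

```python
def merge_messages(tagged):
    """Combine (text, tag) pairs into complete descriptions, where tag is
    "start", "middle", or "end" as produced by the third classification
    model (tag names are illustrative)."""
    merged, buf = [], []
    for text, tag in tagged:
        if tag == "start":
            buf = [text]           # a new description begins
        elif tag == "middle":
            buf.append(text)       # continuation of the current description
        else:                      # "end": close the current description
            buf.append(text)
            merged.append(" ".join(buf))
            buf = []
    return merged

questions = merge_messages([
    ("my app", "start"), ("crashes on", "middle"), ("startup", "end"),
    ("how do I", "start"), ("export logs?", "end"),
])
```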
In some implementations, the library-building unit 1101 identifies a start message, an end message, and an intermediate message from the input data using a third classification model, including:
identifying a problem start message, a problem end message and a problem intermediate message from the historical question-answer log by using a fourth classification model;
and identifying an answer start message, an answer end message and an answer intermediate message from the historical question and answer log by using a fifth classification model.
In some implementations, the training method of the third classification model includes:
using sentences as units, and using a start message label, an end message label and an intermediate message label to carry out sequence labeling to obtain first sample sequence data;
performing cyclic labeling training on the initial classification model by using the first sample sequence data and the unlabeled fifth sample dialogue stream to obtain a third classification model;
in each cycle labeling training, training an initial classification model by using first sample sequence data to obtain an optimized classification model; inputting the fifth sample dialogue flow into the optimized classification model to obtain a third coarse classification result; and re-labeling the third coarse classification result, and then training the optimized classification model again.
In some implementations, the loss function of the third classification model is:
$$L = -\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{k} y_{ij}\,\log p_{ij}$$
wherein, $L$ is the loss function of the third classification model, $n$ is the number of sentences in one sample sequence data, $k$ is the number of tag classes, $i$ is the sequence number of the sentence, $j$ is the label number, $y_{ij}$ is the probability that the $i$-th sentence is marked with the $j$-th tag, and $p_{ij}$ is the probability that the model predicts the $j$-th tag at the $i$-th sentence.
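The loss above (sentence-level label probabilities y against model predictions p) is the standard cross-entropy; a direct sketch, assuming the usual per-sequence averaged form:

```python
import math

def sequence_label_loss(y, p):
    """Cross-entropy over n sentences and k tag classes:
    y[i][j] is the probability that sentence i carries tag j (one-hot
    for hard labels), p[i][j] is the model's predicted probability of
    tag j at sentence i. Averaging over n is an assumption."""
    n = len(y)
    return -sum(y[i][j] * math.log(p[i][j])
                for i in range(n) for j in range(len(y[i]))) / n

# one-hot labels for two sentences over two tag classes
loss = sequence_label_loss([[1, 0], [0, 1]], [[0.9, 0.1], [0.2, 0.8]])
```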
In some implementations, the library-building unit 1101 builds a question-answer knowledge base based on the historical question-answer logs and business background knowledge, including:
extracting historical question-answer data stream vectors from the historical question-answer logs by using a vectorization processing model;
extracting a business background knowledge vector from business background knowledge by using a vectorization processing model;
constructing a question-answer knowledge base by using the historical question-answer data stream vector and the business background knowledge vector;
training unit 1102 retrieves a first associated question-answer data stream and first associated business background knowledge from a question-answer knowledge base that matches the first sample question information, comprising:
extracting a first problem vector from the first sample problem information by using a vectorization processing model;
retrieving a first associated question-answer data stream vector matched with the first question vector and a first associated business background knowledge vector from a question-answer knowledge base based on the vector similarity;
The processing unit 1103 retrieves a second associated question-answer data stream and second associated business background knowledge matching the question information input by the user from the question-answer knowledge base, including:
extracting a second problem vector from the problem information by using the vectorization processing model;
and retrieving a second associated question-answer data stream vector matched with the second question vector and a second associated business background knowledge vector from the question-answer knowledge base based on the vector similarity.
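Vector-similarity retrieval against the knowledge base can be sketched with brute-force cosine similarity (production systems typically use an approximate-nearest-neighbour index; the two-dimensional vectors and keys below are illustrative):

```python
from math import sqrt

def cosine(u, v):
    # cosine similarity of two equal-length vectors
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

def retrieve(query_vec, kb, top_k=2):
    """kb: list of (key, vector) pairs; returns the top_k keys whose
    vectors are most similar to the query vector."""
    ranked = sorted(kb, key=lambda kv: -cosine(query_vec, kv[1]))
    return [key for key, _ in ranked[:top_k]]

kb = [("reset-password", [1.0, 0.0]),
      ("billing", [0.0, 1.0]),
      ("reset-pin", [0.9, 0.1])]
matches = retrieve([1.0, 0.05], kb)
```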
In some implementations, a training method of a vectorization processing model includes:
pre-training a vectorization encoder in the initial vectorization processing model by using business background knowledge to obtain a pre-trained initial vectorization processing model;
starting from the pre-trained initial vectorization processing model, performing cyclic labeling training on the pre-trained initial vectorization processing model by using a sixth sample dialogue stream, in which expression-similar dialogue streams and expression-dissimilar dialogue streams are labeled, and an unlabeled seventh sample dialogue stream, to obtain a vectorization processing model;
in each cycle labeling training, training the pre-trained initial vectorization processing model by using a sixth sample dialogue stream to obtain an optimized classification model; inputting the seventh sample dialogue flow into the optimized classification model to obtain a fourth coarse classification result; and re-labeling the fourth coarse classification result, and then training the optimized classification model again.
In some implementations, the loss function of the vectorization processing model is:
$$L = \sum_{p\in P}\sum_{n\in N}\max\bigl(0,\; m - \cos(a,p) + \cos(a,n)\bigr)$$
wherein, $L$ is the loss function of the vectorization processing model, $a$ is a first sample vector extracted from the sample question set, $p$ is a second sample vector whose question expression is similar to that of the first sample vector, $n$ is a third sample vector whose question expression is dissimilar to that of the first sample vector, $m$ is the margin constant, $P$ is the set of sample questions with similar expressions, $N$ is the set of sample questions with dissimilar expressions, $\cos(a,p)$ is the cosine similarity of the first sample vector and the second sample vector, and $\cos(a,n)$ is the cosine similarity of the first sample vector and the third sample vector.
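The loss above is a margin (triplet-style) loss over cosine similarity; a sketch, assuming the reduction is a sum over positive/negative pairs:

```python
from math import sqrt

def cos_sim(u, v):
    return sum(x * y for x, y in zip(u, v)) / (
        sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v)))

def vectorization_loss(a, positives, negatives, margin=0.2):
    """Hinge/triplet loss: push cos(a, p) above cos(a, n) by at least
    the margin m for every positive p and negative n (the pairwise
    summation is an assumption about the reduction)."""
    return sum(max(0.0, margin - cos_sim(a, p) + cos_sim(a, n))
               for p in positives for n in negatives)

# a well-separated triplet incurs zero loss
zero = vectorization_loss([1.0, 0.0], [[1.0, 0.0]], [[0.0, 1.0]])
# an unseparated one is penalised by the margin
penalty = vectorization_loss([1.0, 0.0], [[1.0, 0.0]], [[1.0, 0.0]])
```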
In some implementations, extracting a vector from input data includes:
extracting a text vector from the text in the input data by using a text vectorization model;
converting an image in the input data into an image vector using an image vectorization model;
and splicing the image vector and the text vector corresponding to the context of the image.
In some implementations, stitching the image vector with a text vector corresponding to the context of the image includes:
performing image recognition on the image to obtain an image recognition result;
comparing the image recognition result with the context of the image, removing the repeated characters, updating the context of the image by using the remaining image recognition result, and updating the text vector of the context of the image;
And splicing the image vector with the text vector of the context of the updated image.
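The de-duplication step can be sketched at word level (the patent describes a character-level comparison; word granularity is used here for readability):

```python
def update_context(image_text, context):
    """Drop from the image-recognition result the words already present
    in the image's surrounding text, then append what remains, yielding
    the updated context whose text vector is later spliced with the
    image vector (word-level stand-in for character-level comparison)."""
    seen = set(context.split())
    remainder = [w for w in image_text.split() if w not in seen]
    return context + (" " + " ".join(remainder) if remainder else "")

updated = update_context("error code 503", "the page shows error code")
```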
In some implementations, building a question-answer knowledge base from historical question-answer data stream vectors and business background knowledge vectors includes:
constructing a historical question-answer data stream vector index based on the historical question-answer data stream vector by using an index tool;
constructing a business background knowledge vector index based on the business background knowledge vector by using an index tool;
and storing the historical question-answer data stream vector index, the historical question-answer data stream vector, the service background knowledge vector index and the service background knowledge vector into a question-answer knowledge base.
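A minimal stand-in for the index tool (libraries such as FAISS play this role in practice); the question-answer knowledge base then holds one such index per vector collection:

```python
from math import sqrt

class FlatVectorIndex:
    """Flat (brute-force) index over id/vector pairs, queried by cosine
    similarity; a sketch, not a production index structure."""
    def __init__(self):
        self._entries = []

    def add(self, vid, vec):
        self._entries.append((vid, vec))

    def search(self, query, top_k=1):
        def cos(u, v):
            return sum(a * b for a, b in zip(u, v)) / (
                sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))
        ranked = sorted(self._entries, key=lambda e: -cos(query, e[1]))
        return [vid for vid, _ in ranked[:top_k]]

# the knowledge base stores one index for question-answer data stream
# vectors and one for business background knowledge vectors
knowledge_base = {"qa": FlatVectorIndex(), "background": FlatVectorIndex()}
knowledge_base["qa"].add("qa-001", [0.9, 0.1])
knowledge_base["background"].add("bg-001", [0.1, 0.9])
hit = knowledge_base["qa"].search([1.0, 0.0])
```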
In some implementations, the processing unit 1103 retrieves a second associated question-answer data stream and second associated business background knowledge from the question-answer knowledge base that match the user-entered question information, including:
carrying out data cleaning on the problem information to obtain a problem to be solved corresponding to the problem information and a problem description corresponding to the problem to be solved;
retrieving candidate question-answer data streams and candidate business background knowledge associated with the question descriptions from a question-answer knowledge base;
screening the candidate question-answering data streams according to the similarity calculation result of the candidate question-answering data streams and the question description to obtain a second associated question-answering data stream;
And screening the candidate business background knowledge to obtain second associated business background knowledge according to the similarity calculation result of the candidate business background knowledge and the problem description.
In some implementations, the processing unit 1103 screens out a second associated question-answer data stream from the candidate question-answer data streams according to the similarity calculation result between the candidate question-answer data stream and the question description, including:
if the candidate question-answer data stream is searched once, taking the similarity value of the candidate question-answer data stream and the corresponding question description as a similarity score of the candidate question-answer data stream; if the candidate question-answer data stream is searched for a plurality of times, accumulating similarity values of the candidate question-answer data stream and corresponding question descriptions as similarity scores of the candidate question-answer data stream;
taking the first preset number of candidate question-answer data streams with highest similarity scores as second associated question-answer data streams;
the processing unit 1103 screens out the second related business background knowledge from the candidate business background knowledge according to the similarity calculation result of the candidate business background knowledge and the problem description, including:
if the candidate business background knowledge is searched once, taking the similarity value of the candidate business background knowledge and the corresponding problem description as the similarity score of the candidate business background knowledge; if the candidate business background knowledge is searched for a plurality of times, accumulating similarity values of the candidate business background knowledge and corresponding problem descriptions as similarity scores of the candidate business background knowledge;
And taking the second preset number of candidate business background knowledge with the highest similarity score as second associated business background knowledge.
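The accumulate-then-rank screening rule above can be sketched as follows (candidate ids and scores are illustrative):

```python
from collections import defaultdict

def screen_candidates(retrievals, top_n=2):
    """retrievals: (candidate_id, similarity) pairs; a candidate retrieved
    for several question descriptions appears several times, and its
    similarity values are accumulated before the top_n are kept."""
    scores = defaultdict(float)
    for cid, sim in retrievals:
        scores[cid] += sim          # accumulate repeated retrievals
    return sorted(scores, key=lambda c: -scores[c])[:top_n]

# "qa1" is retrieved twice, so its accumulated score overtakes "qa3"
selected = screen_candidates([("qa1", 0.8), ("qa2", 0.7),
                              ("qa1", 0.5), ("qa3", 0.9)])
```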
Since the embodiments of the apparatus portion correspond to the embodiments of the method portion, reference is made to the description of the method embodiments for details of the apparatus embodiments, which are not repeated herein.
The eighth embodiment of the present invention will be described.
Fig. 12 is a schematic structural diagram of an intelligent customer service answering device according to an embodiment of the present invention.
As shown in fig. 12, the intelligent customer service response device provided by the embodiment of the present invention includes:
a memory 1210 for storing a computer program 1211;
a processor 1220 for executing the computer program 1211, where the computer program 1211, when executed by the processor 1220, implements the steps of the intelligent customer service answering method according to any one of the embodiments described above.
Processor 1220 may include one or more processing cores, such as a 3-core processor or an 8-core processor. Processor 1220 may be implemented in at least one hardware form of a digital signal processor (DSP), a field-programmable gate array (FPGA), or a programmable logic array (PLA). Processor 1220 may also include a main processor and a coprocessor; the main processor, also referred to as a central processing unit (CPU), is the processor that handles data in the awake state, while the coprocessor is a low-power processor that handles data in the standby state. In some embodiments, the processor 1220 may be integrated with a graphics processing unit (GPU), which is responsible for rendering the content to be displayed on the display screen. In some embodiments, the processor 1220 may also include an artificial intelligence (AI) processor for handling computing operations related to machine learning.
Memory 1210 may include one or more readable storage media, which may be non-transitory. Memory 1210 may also include high-speed random access memory and non-volatile memory, such as one or more magnetic disk storage devices or flash memory storage devices. In this embodiment, the memory 1210 is at least used for storing a computer program 1211, which, when loaded and executed by the processor 1220, implements the relevant steps of the intelligent customer service answering method disclosed in any one of the foregoing embodiments. In addition, the resources stored by the memory 1210 may also include an operating system 1212, data 1213, and the like, stored either transiently or persistently. The operating system 1212 may be Windows. The data 1213 may include, but is not limited to, data related to the methods described above.
In some embodiments, the intelligent customer service answering device may also include a display 1230, a power supply 1240, a communication interface 1250, an input-output interface 1260, sensors 1270, and a communication bus 1280.
Those skilled in the art will appreciate that the structure shown in fig. 12 is not limiting of the intelligent customer service response device and may include more or fewer components than shown.
The intelligent customer service response device provided by the embodiment of the invention comprises the memory and the processor, wherein the processor can realize the intelligent customer service response method when executing the program stored in the memory, and the effects are the same as the above.
The ninth embodiment of the present invention will be described below.
It should be noted that the apparatus and device embodiments described above are merely exemplary, and for example, the division of modules is merely a logic function division, and there may be other division manners in actual implementation, for example, multiple modules or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or modules, which may be in electrical, mechanical, or other forms. The modules illustrated as separate components may or may not be physically separate, and components shown as modules may or may not be physical modules, i.e., may be located in one place, or may be distributed over a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional module in each embodiment of the present invention may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module. The integrated modules may be implemented in hardware or in software functional modules.
The integrated modules, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a readable storage medium. Based on this understanding, the technical solution of the present invention may be embodied essentially or in part or all of the technical solution or in part in the form of a software product stored in a storage medium for performing all or part of the steps of the method according to the embodiments of the present invention.
To this end, an embodiment of the present invention further provides a readable storage medium storing a computer program which, when executed by a processor, implements the steps of the intelligent customer service answering method described above.
The readable storage medium may include: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or various other media capable of storing program code.
The computer program included in the readable storage medium provided in this embodiment can implement the steps of the intelligent customer service response method described above when executed by the processor, and the same effects are achieved.
The method, device, equipment and readable storage medium for intelligent customer service response provided by the present invention are described in detail above. Each embodiment in the description is described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another. Since the apparatus, device and readable storage medium disclosed in the embodiments correspond to the methods disclosed in the embodiments, their description is relatively brief, and the relevant points can be found in the description of the method section. It should be noted that those skilled in the art can make various improvements and modifications to the present invention without departing from the principles of the present invention, and these improvements and modifications also fall within the protection scope of the claims of the present invention.
It should also be noted that in this specification, relational terms such as first and second are used solely to distinguish one entity or action from another, and do not necessarily require or imply any actual relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.

Claims (22)

1. An intelligent customer service response method is characterized by comprising the following steps:
constructing a question-answer knowledge base based on the historical question-answer logs and business background knowledge;
from an initial language model, searching a first associated question-answer data stream and first associated business background knowledge which are matched with first sample question information from the question-answer knowledge base, generating a first prompt according to the first associated question-answer data stream and the first associated business background knowledge, inputting the first prompt into an intermediate language model, outputting first prediction answer information, and carrying out iterative training on the intermediate language model by taking the first sample answer information corresponding to the first sample question information as a true value until the iterative training is finished to obtain a language model;
and retrieving a second associated question-answer data stream and second associated service background knowledge matched with the question information input by the user from the question-answer knowledge base, generating a second prompt according to the second associated question-answer data stream and the second associated service background knowledge, inputting the second prompt into the language model, and outputting answer information.
2. The intelligent customer service response method according to claim 1, wherein the constructing a question-answer knowledge base based on the historical question-answer log and the business background knowledge comprises:
Performing data cleaning on the historical question-answer logs to obtain a historical question-answer data stream;
data cleaning is carried out on the business background knowledge document, and a business background knowledge list is obtained;
and constructing the question-answer knowledge base by using the historical question-answer data stream and the service background knowledge list.
3. The intelligent customer service response method according to claim 2, wherein the step of performing data cleansing on the historical question-answer log to obtain a historical question-answer data stream comprises the steps of:
and executing system information removing operation, non-business information removing operation, incomplete dialogue removing operation and question and answer data merging operation on the historical question and answer log to obtain the historical question and answer data stream.
4. An intelligent customer service answering method according to claim 3, wherein performing the operation of removing non-business information comprises:
classifying the data in the history question-answer log by using a first classification model to obtain business information and non-business information;
and deleting the non-business information from the historical question-answer log.
5. The intelligent customer service response method according to claim 4, wherein the training method of the first classification model comprises:
Performing cyclic annotation training on the initial classification model by using a first sample dialogue stream marked with business data and non-business data and a second sample dialogue stream not marked with business data and non-business data to obtain a first classification model;
in each cycle of annotation training, training the initial classification model by using the first sample dialogue stream to obtain an optimized classification model; inputting the second sample dialogue stream into the optimized classification model to obtain a first coarse classification result; and re-labeling the first coarse classification result, and then re-training the optimized classification model.
6. A method of intelligent customer service answering according to claim 3, wherein performing the incomplete conversation removal operation includes:
detecting end information of the dialogue flow from the history question-answer log by using a second classification model;
if the end information of the dialogue flow is detected, determining that the dialogue flow is a complete dialogue flow;
if the end information of the dialogue flow is not detected, determining that the dialogue flow is an incomplete dialogue flow;
and deleting the incomplete dialogue stream in the history question-answer log.
7. The intelligent customer service response method according to claim 6, wherein the training method of the second classification model comprises:
Performing cyclic annotation training on the initial classification model by using a third sample dialogue stream marked with complete dialogue and incomplete dialogue and a fourth sample dialogue stream not marked with complete dialogue and incomplete dialogue to obtain the second classification model;
in each cycle labeling training, training the initial classification model by using the third sample dialogue flow to obtain an optimized classification model; inputting the fourth sample dialogue flow into the optimized classification model to obtain a second coarse classification result; and re-labeling the second coarse classification result, and then re-training the optimized classification model.
8. An intelligent customer service answering method according to claim 3, wherein performing the question-answer data merging operation includes:
identifying a start message, an end message and an intermediate message from the historical question-answer log by using a third classification model;
and combining the start message, the end message and the intermediate message belonging to the same problem description.
9. The intelligent customer service response method according to claim 8, wherein the identifying the start message, the end message, and the intermediate message from the input data using the third classification model comprises:
Identifying a question start message, a question end message and a question intermediate message from the historical question-answer log by using a fourth classification model;
and identifying an answer start message, an answer end message and an answer intermediate message from the historical question and answer log by using a fifth classification model.
10. The intelligent customer service response method according to claim 8, wherein the training method of the third classification model comprises:
using sentences as units, and using a start message label, an end message label and an intermediate message label to carry out sequence labeling to obtain first sample sequence data;
performing cyclic labeling training on the initial classification model by using the first sample sequence data and the unlabeled fifth sample dialogue stream to obtain the third classification model;
in each cycle labeling training, training the initial classification model by using the first sample sequence data to obtain an optimized classification model; inputting the fifth sample dialogue flow into the optimized classification model to obtain a third coarse classification result; and re-labeling the third coarse classification result, and then re-training the optimized classification model.
11. The intelligent customer service response method according to claim 10, wherein the loss function of the third classification model is:
$$L = -\frac{1}{n}\sum_{i=1}^{n}\sum_{j=1}^{k} y_{ij}\,\log p_{ij}$$
wherein, $L$ is the loss function of the third classification model, $n$ is the number of sentences in one of the sample sequence data, $k$ is the number of tag classes, $i$ is the sequence number of the sentence, $j$ is the label number, $y_{ij}$ is the probability that the $i$-th sentence is marked with the $j$-th tag, and $p_{ij}$ is the probability that the model predicts the $j$-th tag at the $i$-th sentence.
12. The intelligent customer service response method according to claim 1, wherein the constructing a question-answer knowledge base based on the historical question-answer log and the business background knowledge comprises:
extracting a historical question-answer data stream vector from the historical question-answer log by using a vectorization processing model;
extracting a business background knowledge vector from the business background knowledge by using the vectorization processing model;
constructing the question-answer knowledge base by using the historical question-answer data stream vector and the business background knowledge vector;
the retrieving, from the question-answer knowledge base, a first associated question-answer data stream and a first associated business background knowledge that match the first sample question information, including:
extracting a first problem vector from the first sample problem information by using the vectorization processing model;
retrieving a first associated question-answer data stream vector matched with the first question vector and a first associated business background knowledge vector from the question-answer knowledge base based on vector similarity;
The retrieving, from the question-answer knowledge base, a second associated question-answer data stream and second associated business background knowledge that match the question information input by the user, including:
extracting a second problem vector from the problem information by using the vectorization processing model;
and retrieving a second associated question-answer data stream vector matched with the second question vector from the question-answer knowledge base based on vector similarity and a second associated business background knowledge vector.
13. The intelligent customer service response method according to claim 12, wherein the training method of the vectorization processing model comprises:
pre-training a vectorization encoder in an initial vectorization processing model by utilizing the business background knowledge to obtain the initial vectorization processing model after pre-training;
starting from the pre-trained initial vectorization processing model, performing cyclic labeling training on the pre-trained initial vectorization processing model by using a sixth sample dialogue stream, in which expression-similar dialogue streams and expression-dissimilar dialogue streams are labeled, and an unlabeled seventh sample dialogue stream, to obtain the vectorization processing model;
in each cycle labeling training, training the initial vectorization processing model after pre-training by using the sixth sample dialogue stream to obtain an optimized classification model; inputting the seventh sample dialogue flow into the optimized classification model to obtain a fourth coarse classification result; and re-labeling the fourth coarse classification result, and then re-training the optimized classification model.
14. The intelligent customer service response method according to claim 13, wherein the loss function of the vectorization processing model is:
$$L = \sum_{p\in P}\sum_{n\in N}\max\bigl(0,\; m - \cos(a,p) + \cos(a,n)\bigr)$$
wherein, $L$ is the loss function of the vectorization processing model, $a$ is a first sample vector extracted from the sample question set, $p$ is a second sample vector whose question expression is similar to that of the first sample vector, $n$ is a third sample vector whose question expression is dissimilar to that of the first sample vector, $m$ is the margin constant, $P$ is the set of sample questions with similar expressions, $N$ is the set of sample questions with dissimilar expressions, $\cos(a,p)$ is the cosine similarity of the first sample vector and the second sample vector, and $\cos(a,n)$ is the cosine similarity of the first sample vector and the third sample vector.
15. The intelligent customer service response method according to claim 12, wherein extracting the vector from the input data comprises:
extracting a text vector from the text in the input data by using a text vectorization model;
converting an image in the input data into an image vector using an image vectorization model;
and splicing the image vector and the text vector corresponding to the context of the image.
16. The intelligent customer service response method according to claim 15, wherein splicing the image vector with the text vector corresponding to the context of the image comprises:
performing image recognition on the image to obtain an image recognition result;
comparing the image recognition result with the context of the image, removing repeated characters, updating the context of the image with the remaining image recognition result, and then updating the text vector of the context of the image;
and splicing the image vector with the text vector of the updated image context.
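The splicing of claims 15 and 16 — recognize text in the image, drop what already appears in the image's context, re-embed the updated context, and concatenate the vectors — might look as follows. `embed` and all names are hypothetical, and word-level de-duplication stands in for the claimed removal of repeated characters:

```python
def splice_image_and_context(image_vec, context, ocr_result, embed):
    """Illustrative vector splicing: keep only OCR words absent from the
    image's textual context, extend the context with them, re-embed the
    updated context, and concatenate image and text vectors."""
    remaining = " ".join(w for w in ocr_result.split() if w not in context)
    updated = (context + " " + remaining).strip() if remaining else context
    return image_vec + embed(updated)  # list concatenation = vector splicing
```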
17. The intelligent customer service answering method according to claim 12, wherein said constructing the question-answer knowledge base from the historical question-answer data stream vector and the business background knowledge vector comprises:
constructing a historical question-answer data stream vector index based on the historical question-answer data stream vector by using an index tool;
constructing a business background knowledge vector index based on the business background knowledge vector by utilizing the index tool;
and storing the historical question-answer data stream vector index, the historical question-answer data stream vector, the service background knowledge vector index and the service background knowledge vector into the question-answer knowledge base.
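A minimal brute-force stand-in for the index tool of claim 17 (a production system would more likely use a dedicated vector index such as FAISS; everything here is illustrative):

```python
import math

def _cos(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

class VectorIndex:
    """Toy exhaustive-search index mapping keys (e.g. Q&A stream ids or
    background-knowledge ids) to vectors."""
    def __init__(self):
        self._keys, self._vecs = [], []

    def add(self, key, vec):
        self._keys.append(key)
        self._vecs.append(vec)

    def search(self, query, k=3):
        scored = sorted(((_cos(query, v), key)
                         for key, v in zip(self._keys, self._vecs)),
                        key=lambda t: t[0], reverse=True)
        return [key for _, key in scored[:k]]
```

The claim stores two such indexes, one over historical question-answer data stream vectors and one over business background knowledge vectors, in the question-answer knowledge base.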
18. The intelligent customer service answering method according to claim 1, wherein retrieving, from the question-answer knowledge base, a second associated question-answer data stream and second associated business background knowledge that match the question information input by the user comprises:
performing data cleaning on the question information to obtain a question to be solved corresponding to the question information and a question description corresponding to the question to be solved;
retrieving candidate question-answer data streams and candidate business background knowledge associated with the question description from the question-answer knowledge base;
screening the second associated question-answer data stream from the candidate question-answer data streams according to a similarity calculation result between the candidate question-answer data streams and the question description;
and screening the second associated business background knowledge from the candidate business background knowledge according to a similarity calculation result between the candidate business background knowledge and the question description.
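The retrieval steps of claim 18 can be outlined as below; `kb_search` and the whitespace-only cleaning are hypothetical simplifications of the claimed data cleaning and knowledge-base lookup:

```python
import re

def retrieve_candidates(question_info, kb_search):
    """Illustrative pipeline: clean the raw question information into a
    question description, then query the knowledge base for candidate
    question-answer data streams and business background knowledge."""
    description = re.sub(r"\s+", " ", question_info).strip()  # minimal cleaning
    qa_candidates = kb_search("qa", description)
    bg_candidates = kb_search("background", description)
    return description, qa_candidates, bg_candidates
```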
19. The intelligent customer service answering method according to claim 18, wherein screening the second associated question-answer data stream from the candidate question-answer data streams according to the similarity calculation result between the candidate question-answer data streams and the question description comprises:
if a candidate question-answer data stream is retrieved once, taking the similarity value between the candidate question-answer data stream and the corresponding question description as its similarity score; if a candidate question-answer data stream is retrieved multiple times, accumulating the similarity values between the candidate question-answer data stream and the corresponding question descriptions as its similarity score;
taking a first preset number of candidate question-answer data streams with the highest similarity scores as the second associated question-answer data streams;
and wherein screening the second associated business background knowledge from the candidate business background knowledge according to the similarity calculation result between the candidate business background knowledge and the question description comprises:
if a candidate business background knowledge item is retrieved once, taking the similarity value between the candidate business background knowledge and the corresponding question description as its similarity score; if a candidate business background knowledge item is retrieved multiple times, accumulating the similarity values between the candidate business background knowledge and the corresponding question descriptions as its similarity score;
and taking a second preset number of candidate business background knowledge items with the highest similarity scores as the second associated business background knowledge.
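The scoring rule of claim 19 — a single hit keeps its similarity value, repeated hits accumulate, and the top-N candidates survive — can be sketched as (names illustrative):

```python
from collections import defaultdict

def screen_candidates(hits, top_n):
    """hits: [(candidate_id, similarity), ...]; the same candidate may
    appear once (score = its value) or several times (score = the sum).
    Returns the top_n candidate ids by accumulated similarity score."""
    scores = defaultdict(float)
    for cand, sim in hits:
        scores[cand] += sim
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [cand for cand, _ in ranked[:top_n]]
```

The same function serves both screening steps, with `top_n` set to the first or second preset number respectively.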
20. An intelligent customer service answering device, comprising:
the database building unit is used for building a question-answer knowledge base based on the historical question-answer logs and the business background knowledge;
the training unit is used for: starting from an initial language model, retrieving a first associated question-answer data stream and first associated business background knowledge matched with first sample question information from the question-answer knowledge base; generating a first prompt according to the first associated question-answer data stream and the first associated business background knowledge; inputting the first prompt into an intermediate language model to output first predicted answer information; and iteratively training the intermediate language model by taking first sample answer information corresponding to the first sample question information as a ground truth, until the iterative training ends, so as to obtain a language model;
and the processing unit is used for retrieving a second associated question-answer data stream and second associated service background knowledge matched with the question information input by the user from the question-answer knowledge base, generating a second prompt according to the second associated question-answer data stream and the second associated service background knowledge, inputting the second prompt into the language model, and outputting answer information.
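The processing unit's prompt generation (claim 20) might assemble the retrieved material into a template like the following; the template wording is an assumption for illustration, not taken from the patent:

```python
def build_prompt(question, qa_streams, background):
    """Illustrative second-prompt template combining retrieved
    question-answer data streams and business background knowledge."""
    qa_block = "\n".join(f"Q: {q}\nA: {a}" for q, a in qa_streams)
    bg_block = "\n".join(background)
    return (f"Background knowledge:\n{bg_block}\n\n"
            f"Similar historical Q&A:\n{qa_block}\n\n"
            f"User question: {question}\nAnswer:")
```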
21. Intelligent customer service answering equipment, comprising:
a memory for storing a computer program;
a processor for executing the computer program, wherein the computer program, when executed by the processor, implements the steps of the intelligent customer service answering method according to any one of claims 1 to 19.
22. A readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps of the intelligent customer service answering method according to any one of claims 1 to 19.
CN202311757102.3A 2023-12-20 2023-12-20 Intelligent customer service response method, device, equipment and readable storage medium Active CN117453895B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311757102.3A CN117453895B (en) 2023-12-20 2023-12-20 Intelligent customer service response method, device, equipment and readable storage medium


Publications (2)

Publication Number Publication Date
CN117453895A (en) 2024-01-26
CN117453895B (en) 2024-03-01

Family

ID=89585770


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116932708A (en) * 2023-04-18 2023-10-24 清华大学 Open domain natural language reasoning question-answering system and method driven by large language model
CN117112769A (en) * 2023-10-23 2023-11-24 南京国睿信维软件有限公司 Intelligent fault maintenance question-answering system and method based on large language model
CN117194602A (en) * 2023-09-06 2023-12-08 书音(上海)文化科技有限公司 Local knowledge base updating method and system based on large language model and BERT model
CN117235220A (en) * 2023-09-15 2023-12-15 之江实验室 Extensible large language model calling method and device based on graph database knowledge enhancement

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11347803B2 (en) * 2019-03-01 2022-05-31 Cuddle Artificial Intelligence Private Limited Systems and methods for adaptive question answering




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant