CN117539984A - Method, device and equipment for generating reply text - Google Patents

Method, device and equipment for generating reply text

Info

Publication number
CN117539984A
Authority
CN
China
Prior art keywords
target
information
reply
vector
dialogue
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311301780.9A
Other languages
Chinese (zh)
Inventor
张星亮
刘独刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Workec Technology Co ltd
Original Assignee
Shenzhen Workec Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Workec Technology Co ltd filed Critical Shenzhen Workec Technology Co ltd
Priority to CN202311301780.9A priority Critical patent/CN117539984A/en
Publication of CN117539984A publication Critical patent/CN117539984A/en
Pending legal-status Critical Current

Classifications

    • G06F16/3329 Natural language query formulation or dialogue systems
    • G06F16/3334 Selection or weighting of terms from queries, including natural language queries
    • G06F16/334 Query execution
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G06N3/0475 Generative networks
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • G06N3/094 Adversarial learning
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Machine Translation (AREA)

Abstract

The application is applicable to the technical field of computers and provides a method, a device and equipment for generating a reply text. The method for generating the reply text comprises the following steps: obtaining target information to be replied to; performing word segmentation on the target information to obtain target word segmentation; determining a target vector sequence corresponding to the target information according to a preset vector conversion strategy and the target word segmentation; acquiring history associated information corresponding to the target vector sequence from a history association database; and inputting the target vector sequence and its corresponding history associated information into a pre-trained dialogue neural network model for processing to obtain a reply vector corresponding to the target information. With this scheme, the target information to be replied to can be processed effectively, and the dialogue can be realized without considering and processing dynamically changing historical dialogue information, thereby reducing the computation and storage burden on the natural language model, improving the accuracy of natural language understanding, improving the efficiency of the dialogue system, reducing expensive hardware costs, and avoiding performance bottlenecks.

Description

Method, device and equipment for generating reply text
Technical Field
The application belongs to the technical field of computers, and particularly relates to a method, a device and equipment for generating reply text.
Background
Significant advances have been made in the field of natural language processing (Natural Language Processing, NLP), where natural language models (Natural Language Models) such as GPT (Generative Pre-trained Transformer) have become key technologies. Current natural language models such as GPT employ an autoregressive approach that relies on the inputs and outputs of previous time steps each time a text output is generated, meaning that the model needs to consider and process dynamically changing historical dialogue information to ensure that the generated text has contextual consistency and semantic rationality. In particular, in a dialogue system, the model needs to take the dialogue history into account to understand and respond to the questions or statements of the user, which enables the dialogue system to better simulate interpersonal communication.
However, this approach has the following problems: since the autoregressive model must consider the historical context, including the user's questions, previous answers, and so on, at each time step, it requires a significant amount of computational and memory resources. This leads to expensive hardware costs and performance bottlenecks in practical use, limiting the scale and efficiency of the dialogue system. In addition, historical dialogue information often contains a large amount of useless content, such as greetings, irrelevant remarks, or lengthy background statements, which is unimportant for natural language understanding and dialogue management but must still be processed and analyzed by the model, reducing the accuracy and efficiency of natural language understanding.
Disclosure of Invention
The embodiment of the application provides a method, a device and equipment for generating a reply text, which can solve the problems.
In a first aspect, an embodiment of the present application provides a method for generating a reply text, including:
obtaining target information to be replied;
performing word segmentation processing on the target information to obtain target word segmentation;
determining a target vector sequence corresponding to the target information according to a preset vector conversion strategy and the target word segmentation;
acquiring history associated information corresponding to the target vector sequence from a history associated database;
inputting the target vector sequence and the corresponding history associated information into a pre-trained dialogue neural network model for processing to obtain a reply vector corresponding to the target information; when the dialogue neural network model is trained, the input of the dialogue neural network model is sample information and sample history information corresponding to the sample information, and the output of the dialogue neural network model is a sample reply vector corresponding to the sample information;
and determining a reply text corresponding to the target information according to the reply vector.
Further, the determining the target vector sequence corresponding to the target information according to the preset vector conversion strategy and the target word segmentation includes:
And inputting the target word segmentation into a preset word embedding model for processing to obtain a target vector sequence corresponding to the target information.
Further, the determining the target vector sequence corresponding to the target information according to the preset vector conversion strategy and the target word segmentation includes:
determining a first vector according to a preset vector conversion strategy and the target word segmentation;
and screening target vectors meeting a first preset condition from the first vectors, and determining a target vector sequence corresponding to the target information according to the target vectors.
Further, the determining, according to the reply vector, the reply text corresponding to the target information includes:
and inputting the reply vector into a preset conversion neural network model for processing to obtain a reply text corresponding to the target information.
Further, the inputting the target vector sequence and the corresponding history associated information into a pre-trained dialogue neural network model for processing to obtain a reply vector corresponding to the target information includes:
inputting the target vector sequence and the history associated information corresponding to the target vector sequence into a pre-trained dialogue neural network model for processing to obtain first reply information corresponding to the target information;
When the first reply information does not meet a second preset condition, inputting the target vector sequence and the initial reply corresponding to the target vector sequence into the pre-trained dialogue neural network model for processing to obtain second reply information;
and when the second reply information meets the second preset condition, taking the second reply information as a reply vector corresponding to the target information.
Further, after the history association information corresponding to the target vector sequence is obtained from the history association database, the method further includes:
when the history associated information corresponding to the target vector sequence is not obtained from the history associated database, inputting the target vector sequence into a pre-trained dialogue neural network model for processing to obtain a reply vector corresponding to the target information;
and determining a reply text corresponding to the target information according to the reply vector.
Further, after the reply vector corresponding to the target information is obtained, the method further includes:
and updating the history association database according to the reply vector corresponding to the target information.
In a second aspect, an embodiment of the present application provides a reply text generating device, including:
The first acquisition unit is used for acquiring target information to be replied;
the first processing unit is used for performing word segmentation processing on the target information to obtain target word segmentation;
the second processing unit is used for determining a target vector sequence corresponding to the target information according to a preset vector conversion strategy and the target word segmentation;
the second acquisition unit is used for acquiring the history associated information corresponding to the target vector sequence from a history associated database;
the third processing unit is used for inputting the target vector sequence and the corresponding history associated information into a pre-trained dialogue neural network model for processing to obtain a reply vector corresponding to the target information; when the dialogue neural network model is trained, the input of the dialogue neural network model is sample information and sample history information corresponding to the sample information, and the output of the dialogue neural network model is a sample reply vector corresponding to the sample information;
and the fourth processing unit is used for determining a reply text corresponding to the target information according to the reply vector.
Further, the second processing unit is specifically configured to:
and inputting the target word segmentation into a preset word embedding model for processing to obtain a target vector sequence corresponding to the target information.
Further, the second processing unit is specifically configured to:
determining a first vector according to a preset vector conversion strategy and the target word segmentation;
and screening target vectors meeting a first preset condition from the first vectors, and determining a target vector sequence corresponding to the target information according to the target vectors.
Further, the fourth processing unit is specifically configured to:
and inputting the reply vector into a preset conversion neural network model for processing to obtain a reply text corresponding to the target information.
Further, the third processing unit is specifically configured to:
inputting the target vector sequence and the history associated information corresponding to the target vector sequence into a pre-trained dialogue neural network model for processing to obtain first reply information corresponding to the target information;
when the first reply information does not meet a second preset condition, inputting the target vector sequence and the initial reply corresponding to the target vector sequence into the pre-trained dialogue neural network model for processing to obtain second reply information;
and when the second reply information meets the second preset condition, taking the second reply information as a reply vector corresponding to the target information.
Further, the generating device of the reply text further comprises:
a fifth processing unit, configured to input, when history association information corresponding to the target vector sequence is not obtained from the history association database, the target vector sequence into a pre-trained dialogue neural network model for processing, so as to obtain a reply vector corresponding to the target information;
and the sixth processing unit is used for determining a reply text corresponding to the target information according to the reply vector.
Further, the generating device of the reply text further comprises:
and the seventh processing unit is used for updating the history associated database according to the reply vector corresponding to the target information.
In a third aspect, embodiments of the present application provide an apparatus comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, the processor implementing the method according to the first aspect described above when executing the computer program.
In a fourth aspect, embodiments of the present application provide a computer readable storage medium storing a computer program which, when executed by a processor, implements a method as described in the first aspect above.
In the embodiment of the application, target information to be replied to is acquired; word segmentation is performed on the target information to obtain target word segmentation; a target vector sequence corresponding to the target information is determined according to a preset vector conversion strategy and the target word segmentation; history associated information corresponding to the target vector sequence is acquired from a history association database; the target vector sequence and its corresponding history associated information are input into a pre-trained dialogue neural network model for processing to obtain a reply vector corresponding to the target information; when the dialogue neural network model is trained, the input of the dialogue neural network model is sample information and sample history information corresponding to the sample information, and the output of the dialogue neural network model is a sample reply vector corresponding to the sample information; and a reply text corresponding to the target information is determined according to the reply vector. With this scheme, the target information to be replied to can be processed effectively, and the dialogue can be realized without considering and processing dynamically changing historical dialogue information, thereby reducing the computation and storage burden on the natural language model, improving the accuracy of natural language understanding, improving the efficiency of the dialogue system, and reducing expensive hardware costs.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required for the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic flowchart of a method for generating reply text according to a first embodiment of the present application;
fig. 2 is a schematic diagram of a reply text generating device provided in a second embodiment of the present application;
fig. 3 is a schematic diagram of a reply text generating device provided in a third embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system configurations, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It should be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
As used in this specification and the appended claims, the term "if" may be interpreted as "when", "once", "in response to determining", or "in response to detecting", depending on the context. Similarly, the phrases "if it is determined" or "if [the described condition or event] is detected" may be interpreted, depending on the context, as meaning "upon determining", "in response to determining", "upon detecting [the described condition or event]", or "in response to detecting [the described condition or event]".
In addition, in the description of the present application and the appended claims, the terms "first," "second," "third," and the like are used merely to distinguish between descriptions and are not to be construed as indicating or implying relative importance.
Reference in the specification to "one embodiment" or "some embodiments" or the like means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," and the like in the specification are not necessarily all referring to the same embodiment, but mean "one or more but not all embodiments" unless expressly specified otherwise. The terms "comprising," "including," "having," and variations thereof mean "including but not limited to," unless expressly specified otherwise.
Referring to fig. 1, fig. 1 is a schematic flowchart of a reply text generating method according to a first embodiment of the present application. The execution subject of the method for generating the reply text in this embodiment is a device with a function of generating the reply text, and the device may be a personal computer, a server, or the like. The method for generating the reply text as shown in fig. 1 may include:
S101: and obtaining target information to be replied.
The device acquires target information to be replied, wherein the target information to be replied can be input by a user or other dialogue participants, and can be a question, a request or a statement and the like which are presented by the user.
The target information to be replied to may be a piece of text or voice information.
S102: and performing word segmentation processing on the target information to obtain target word segmentation.
The target information is segmented by the device, and a word segmenter or a word segmentation tool can be stored in the device to divide the target information into words, phrases or sub-units. The selection of the segmenter typically depends on the natural language used and the analysis requirements.
Common word segmenters include rule-based word segmenters and statistics-based word segmenters.
The device performs word segmentation processing on the target information through the word segmentation device, and converts the target information into a series of words or phrases, wherein each word or phrase represents a language unit in the target information.
It should be understood that the device may perform text cleaning and preprocessing on the target information prior to word segmentation, including but not limited to removing unnecessary special characters, punctuation marks, and whitespace characters, to ensure the consistency and tractability of the text.
Further, after segmentation is completed, the device may perform stop-word filtering. Stop words are common words that are typically ignored in text analysis, such as "yes" and "in". Filtering out these stop words helps to increase the information density of the word segments.
The device segments the target information to be replied into words or phrases that can be more easily understood and processed, providing a basis for subsequent natural language processing tasks, such as semantic analysis and reply generation, helping the system to more accurately understand the intent of the user and generate relevant replies.
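As a non-limiting illustration of S101 to S102, the following Python sketch performs cleaning, word segmentation, and stop-word filtering, assuming Chinese input text and the open-source jieba tokenizer; the stop-word list here is a hypothetical stand-in for whatever list the device actually stores.

import re
import jieba  # assumed tokenizer; any rule-based or statistics-based segmenter works

STOP_WORDS = {"的", "了", "是", "在"}  # hypothetical stop-word list

def segment(target_info):
    # Text cleaning and preprocessing: strip special characters and punctuation.
    cleaned = re.sub(r"[^\w\u4e00-\u9fff]+", " ", target_info).strip()
    # Word segmentation, then stop-word filtering.
    tokens = jieba.lcut(cleaned)
    return [t for t in tokens if t.strip() and t not in STOP_WORDS]

print(segment("你好，我想预订一张从深圳到广州的高铁票。"))  # prints the surviving word segments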
S103: and determining a target vector sequence corresponding to the target information according to a preset vector conversion strategy and the target segmentation.
The device stores a preset vector conversion strategy, the setting of which is related to specific requirements and the nature of the data. The preset vector conversion strategy may include, but is not limited to, a word embedding model or a deep learning encoder.
Vectorization is based on neural networks: semantic relationships between words are learned from a large text corpus, and each word is mapped into a vector space of fixed dimension. These vectors are commonly referred to as word embeddings (Word Embedding) and capture the semantic and grammatical information of words.
For example, the sentence "Zhang San likes reading" may be converted into a vector representation such as [0.123, 0.456, 0.789, 0.321], where each number is a component of the word embedding vectors.
Specifically, the device may input the target word segmentation into a preset word embedding model for processing to obtain the target vector sequence corresponding to the target information. That is, the device may use a pre-trained word embedding model, such as Word2Vec, GloVe, or FastText, to vectorize the target word segments, mapping each target word to a word vector. These word vectors capture the semantic information of each word.
Alternatively, a deep learning encoder, such as a recurrent neural network (RNN) or a convolutional neural network (CNN), may be used to encode the target word sequence into the target vector sequence.
In one embodiment, the device may perform some filtering when determining the target vector sequence corresponding to the target information, so as to reject interference terms and reduce their influence. Specifically, the device determines a first vector according to a preset vector conversion strategy and the target word segmentation; screens target vectors meeting a first preset condition from the first vectors; and determines the target vector sequence corresponding to the target information according to the target vectors.
The device determines the first vector according to the preset vector conversion policy and the target word segmentation, and reference may be made to the specific details of determining the target vector sequence in S103 above, which are not described herein. The first preset condition is stored in the device, wherein the first preset condition is defined according to specific application requirements and can be a condition related to content, grammar, semantics and the like of the target information. The first preset condition is used for screening when the target vector sequence corresponding to the target information is determined, so that interference items are removed, and the influence of the interference items is reduced.
The first preset condition may be selecting target vectors containing a specific keyword or phrase; the first preset condition may also be selecting target vectors that do not contain a specific keyword or phrase. In particular, the first preset condition may be a set of thresholds: since the vectorized text is a sequence of numbers, words that occur too rarely or too frequently may be removed by setting thresholds.
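As an illustration of S103, the sketch below uses a gensim Word2Vec model as the preset word embedding model and a simple frequency threshold as the first preset condition; the toy corpus, vector dimension, and thresholds are all assumptions, not the actual strategy.

from collections import Counter
from gensim.models import Word2Vec

# Toy training corpus of pre-segmented sentences (an assumption; a real model
# would be trained on a large corpus or loaded pre-trained).
corpus = [["张三", "喜欢", "读书"], ["李四", "喜欢", "旅游"]]
w2v = Word2Vec(corpus, vector_size=50, window=2, min_count=1)

def to_vector_sequence(tokens, low=1, high=100):
    counts = Counter(tokens)
    # First preset condition (hypothetical): screen out words that occur
    # too rarely or too often, to reject interference terms.
    kept = [t for t in tokens if low <= counts[t] <= high and t in w2v.wv]
    return [w2v.wv[t] for t in kept]  # one 50-dimensional vector per surviving word

vectors = to_vector_sequence(["张三", "喜欢", "读书"])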
S104: and acquiring the history associated information corresponding to the target vector sequence from a history associated database.
A history association database is stored in the device, the history association database being used to store history dialogues and related information. This database may employ a different database management system (e.g., an SQL database or a NoSQL database) to store data. The history association database may include a conversation history, i.e., a historical conversation between the user and the device or other participant, including questions, statements, replies, and the like.
The device retrieves historical dialogues that are similar or correlated to the target vector sequence by querying the history association database. This query may be performed based on the features and attributes of the target vector sequence, for example, using support and confidence calculations to obtain the history associated information corresponding to the target vector sequence. The support refers to how frequently a word or word pair occurs in the data set, and the confidence refers to how frequently two words occur together relative to the occurrences of the first word. By calculating support and confidence, it is possible to determine which terms have a significant association with each other, and association rules may be generated from the calculation results. Association rules typically take the form "A -> B", representing the relationship between word A and word B.
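A minimal sketch of this retrieval step, computing support and confidence over a toy history store, follows; the plain-list storage layout and the thresholds are simplifying assumptions, not the actual database design.

from itertools import combinations

# Hypothetical history association database: each record is the set of word
# segments seen in one historical dialogue turn.
history = [
    {"高铁票", "深圳", "广州", "身份证号"},
    {"高铁票", "身份证号", "出发日期"},
    {"天气", "深圳"},
]

def association_rules(min_support=0.3, min_confidence=0.6):
    n = len(history)
    freq = lambda *words: sum(all(w in rec for w in words) for rec in history)
    rules = []
    for a, b in combinations(set().union(*history), 2):
        support = freq(a, b) / n                 # how often A and B co-occur
        if support >= min_support and freq(a):
            confidence = freq(a, b) / freq(a)    # P(B | A)
            if confidence >= min_confidence:
                rules.append((a, b))             # rule of the form "A -> B"
    return rules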
S105: inputting the target vector sequence and the corresponding history associated information into a pre-trained dialogue neural network model for processing to obtain a reply vector corresponding to the target information; when the dialogue neural network model is trained, the input of the dialogue neural network model is sample information and sample history information corresponding to the sample information, and the output of the dialogue neural network model is a sample reply vector corresponding to the sample information.
The device may store a pre-trained dialog neural network model, which is a deep learning neural network, for processing natural language text data. The pre-trained session neural network model may be pre-trained on the device, or may be trained on another device and transplanted to the local device, which is not limited herein.
Pre-trained conversation neural network models have been trained on large-scale conversation data to learn how to understand conversation history and generate replies related thereto.
This model may be a recurrent neural network (RNN), a long short-term memory network (LSTM), a Transformer model, or another neural network architecture suitable for dialogue processing.
Specifically, the pre-trained dialogue neural network model may be a recurrent neural network (RNN) model. The model may receive a sequence of input semantic vectors and generate new reply content based on the previous dialogue context. In a dialogue with a chatbot (such as GPT), previous user questions and robot answers may be used as input, and a new answer may then be generated using the RNN model.
For example, the input user question is "Hello, I want to book a high-speed rail ticket from Shenzhen to Guangzhou." and the robot answers "Hello, please provide your identification card number.". That is, the target vector sequence is the vector sequence of "Hello, I want to book a high-speed rail ticket from Shenzhen to Guangzhou.", and the history associated information is the vector corresponding to "Hello, please provide your identification card number.".
The target vector sequence and its corresponding history associated information are thus [user question vector, robot answer vector]. A pre-trained dialogue neural network model is then used to generate a new answer. Input: [user question vector, robot answer vector]; output: the new answer vector for "May I ask your departure date?". In this way, the device may generate new reply content based on the previous semantics and the dialogue context.
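The following sketch shows what such an inference step could look like, with a PyTorch GRU standing in for the dialogue neural network; the dimensions, the final linear projection, and the random placeholder vectors are assumptions for illustration only.

import torch
import torch.nn as nn

class DialogueModel(nn.Module):
    def __init__(self, dim=50, hidden=128):
        super().__init__()
        self.rnn = nn.GRU(dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, dim)  # project the final hidden state to a reply vector

    def forward(self, seq):                # seq: (batch, time, dim)
        _, h = self.rnn(seq)
        return self.out(h[-1])             # one reply vector per batch item

model = DialogueModel()
question = torch.randn(1, 12, 50)  # placeholder for the target vector sequence
answer = torch.randn(1, 8, 50)     # placeholder for the historical reply vectors
# [user question vector, robot answer vector] concatenated along the time axis.
reply_vector = model(torch.cat([question, answer], dim=1))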
Prior to training and applying the conversational neural network model, the device needs to prepare a data set including sample information, sample history information, and corresponding sample reply vectors. These datasets typically consist of historical conversations and associated replies.
Wherein the sample information is a sample of the target information to be replied to, typically consisting of a sequence of vectors. Sample history information is history-related information related to the sample information, and is typically composed of a history dialogue and information extracted from a history-related database. The sample reply vector is a desired reply vector corresponding to the sample information. These reply vectors may be text, vectors, or other forms of representation.
In training a conversational neural network model, the device takes as input sample information and sample history information, with the goal of enabling the model to generate a reply vector that is similar to the sample reply vector. The device encodes the sample information and sample history information into a form understandable to the model. This may involve using an embedding layer to convert text into a vector representation and combining the historical association information with sample information. The device trains the conversational neural network model through a back propagation algorithm and a loss function to minimize the difference between the generated reply vector and the sample reply vector. The training process requires a large number of sample pairs and iterative training.
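Continuing the previous sketch, a training step could look as follows, using mean-squared error between the generated reply vector and the sample reply vector; the single sample pair, learning rate, and loss choice are illustrative assumptions.

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

sample_input = torch.cat([question, answer], dim=1)  # sample info + sample history
sample_reply = torch.randn(1, 50)                    # placeholder desired reply vector

for _ in range(100):                                 # iterative training over sample pairs
    optimizer.zero_grad()
    loss = loss_fn(model(sample_input), sample_reply)  # difference to minimize
    loss.backward()                                    # backpropagation
    optimizer.step()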
In one embodiment, in order to ensure accuracy of the result, a second preset condition may be set, and the output of the model may be determined. The equipment inputs the target vector sequence and the corresponding history associated information into a pre-trained dialogue neural network model for processing to obtain first reply information corresponding to the target information; when the first reply information does not meet a second preset condition, inputting the target vector sequence and the corresponding initial reply into a pre-trained dialogue neural network model for processing to obtain second reply information; and when the second reply information meets the second preset condition, the second reply information is used as a reply vector corresponding to the target information.
Specifically, the model predicts the first reply information based on the existing target vector sequence and its corresponding history associated information. This prediction is probability-based: the model generates a plurality of alternatives and assigns a probability value to each alternative. The device then evaluates whether the first reply information meets the second preset condition. The second preset condition may be a condition related to reply content, grammar, semantics, or the like.
For example, the second preset condition may be a length-related condition: when the first reply information does not meet the preset length, the target vector sequence and the first reply may be input into the model again for processing until second reply information of the specified length is generated.
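A sketch of this regeneration loop, continuing the earlier model sketch, is given below; the norm threshold is a hypothetical stand-in for the second preset condition, where a real check would test the length, grammar, or semantics of the decoded candidate reply.

def meets_condition(reply_vec, threshold=0.5):
    # Hypothetical second preset condition; replace with a length, grammar,
    # or semantics check on the decoded candidate reply.
    return reply_vec.norm().item() >= threshold

def generate_reply(model, seq, max_tries=3):
    reply = model(seq)                                 # first reply information
    for _ in range(max_tries):
        if meets_condition(reply):
            return reply                               # accepted as the reply vector
        # Feed the target sequence plus the initial reply back into the model.
        seq = torch.cat([seq, reply.unsqueeze(1)], dim=1)
        reply = model(seq)                             # second reply information
    return reply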
As mentioned above, the device needs to check whether the history association database contains history associated information related to the target vector sequence. If a match is found, this information can be further used for context understanding and reply generation. If no history associated information related to the target vector sequence is found, the device processes the target information and generates a reply in the following manner: when the history associated information corresponding to the target vector sequence is not obtained from the history association database, the target vector sequence is input into the pre-trained dialogue neural network model for processing to obtain a reply vector corresponding to the target information; and the reply text corresponding to the target information is determined according to the reply vector.
After the reply vector is obtained, in order to keep the data in the history association database rich, the device may update the history association database according to the reply vector corresponding to the target information. When updating, the historical data may be overwritten as required, or new data may be appended to the historical data, which is not limited here.
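The update step could be as simple as the sketch below, assuming a SQLite table as the history association database; the schema and JSON serialization are hypothetical choices.

import json
import sqlite3

conn = sqlite3.connect("history.db")
conn.execute("CREATE TABLE IF NOT EXISTS history (question TEXT, reply TEXT)")

def update_history(question, reply_vec):
    # Append a new record; overwriting matching historical rows instead is
    # equally possible, as the embodiment notes. reply_vec is a flat sequence
    # of floats (detach it first if it comes from a torch model).
    serialized = json.dumps([float(x) for x in reply_vec])
    conn.execute("INSERT INTO history VALUES (?, ?)", (question, serialized))
    conn.commit()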
S106: and determining a reply text corresponding to the target information according to the reply vector.
The device may decode the reply vector into natural language text to generate the final reply text. This may involve text generation techniques such as Generative Adversarial Networks (GANs) or sequence-to-sequence models.
Specifically, the device may input the reply vector into a preset conversion neural network model for processing, so as to obtain a reply text corresponding to the target information.
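As one simple stand-in for the preset conversion neural network model, the reply vector could be decoded by nearest-neighbour lookup in the embedding space of the Word2Vec model from the earlier sketch; a production system would more likely use a trained sequence-to-sequence decoder.

import numpy as np

def decode(reply_vec, w2v, topn=5):
    # reply_vec: a flat embedding-sized vector.
    vec = np.asarray(reply_vec, dtype=np.float32)
    # gensim's similar_by_vector returns the vocabulary words closest to vec.
    return [word for word, _ in w2v.wv.similar_by_vector(vec, topn=topn)]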
In the embodiment of the application, target information to be replied to is acquired; word segmentation is performed on the target information to obtain target word segmentation; a target vector sequence corresponding to the target information is determined according to a preset vector conversion strategy and the target word segmentation; history associated information corresponding to the target vector sequence is acquired from a history association database; the target vector sequence and its corresponding history associated information are input into a pre-trained dialogue neural network model for processing to obtain a reply vector corresponding to the target information; when the dialogue neural network model is trained, the input of the dialogue neural network model is sample information and sample history information corresponding to the sample information, and the output of the dialogue neural network model is a sample reply vector corresponding to the sample information; and a reply text corresponding to the target information is determined according to the reply vector. With this scheme, the target information to be replied to can be processed effectively, and the dialogue can be realized without considering and processing dynamically changing historical dialogue information, thereby reducing the computation and storage burden on the natural language model, improving the accuracy of natural language understanding, improving the efficiency of the dialogue system, and reducing expensive hardware costs.
It should be understood that the sequence number of each step in the foregoing embodiment does not mean that the execution sequence of each process should be determined by the function and the internal logic of each process, and should not limit the implementation process of the embodiment of the present application in any way.
Referring to fig. 2, fig. 2 is a schematic diagram of a reply text generating device according to a second embodiment of the present application. The units included are for performing the steps in the corresponding embodiment of fig. 1. Refer specifically to the description of the corresponding embodiment in fig. 1. For convenience of explanation, only the portions related to the present embodiment are shown.
Referring to fig. 2, the reply text generation apparatus 2 includes:
a first acquiring unit 21 for acquiring target information to be replied to;
a first processing unit 22, configured to perform word segmentation processing on the target information to obtain a target word;
the second processing unit 23 is configured to determine a target vector sequence corresponding to the target information according to a preset vector conversion policy and the target word segmentation;
a second obtaining unit 24, configured to obtain, from a history association database, history association information corresponding to the target vector sequence;
the third processing unit 25 is configured to input the target vector sequence and the history associated information corresponding to the target vector sequence into a pre-trained dialogue neural network model for processing, so as to obtain a reply vector corresponding to the target information; when the dialogue neural network model is trained, the input of the dialogue neural network model is sample information and sample history information corresponding to the sample information, and the output of the dialogue neural network model is a sample reply vector corresponding to the sample information;
The fourth processing unit 26 is configured to determine a reply text corresponding to the target information according to the reply vector.
Further, the second processing unit 23 is specifically configured to:
and inputting the target word segmentation into a preset word embedding model for processing to obtain a target vector sequence corresponding to the target information.
Further, the second processing unit 23 is specifically configured to:
determining a first vector according to a preset vector conversion strategy and the target word segmentation;
and screening target vectors meeting a first preset condition from the first vectors, and determining a target vector sequence corresponding to the target information according to the target vectors.
Further, the fourth processing unit 26 is specifically configured to:
and inputting the reply vector into a preset conversion neural network model for processing to obtain a reply text corresponding to the target information.
Further, the third processing unit 25 is specifically configured to:
inputting the target vector sequence and the history associated information corresponding to the target vector sequence into a pre-trained dialogue neural network model for processing to obtain first reply information corresponding to the target information;
when the first reply information does not meet a second preset condition, inputting the target vector sequence and the initial reply corresponding to the target vector sequence into the pre-trained dialogue neural network model for processing to obtain second reply information;
And when the second reply information meets the second preset condition, taking the second reply information as a reply vector corresponding to the target information.
Further, the reply text generating device 2 further includes:
a fifth processing unit, configured to input, when history association information corresponding to the target vector sequence is not obtained from the history association database, the target vector sequence into a pre-trained dialogue neural network model for processing, so as to obtain a reply vector corresponding to the target information;
and the sixth processing unit is used for determining a reply text corresponding to the target information according to the reply vector.
Further, the reply text generating device 2 further includes:
and the seventh processing unit is used for updating the history associated database according to the reply vector corresponding to the target information.
Referring to fig. 3, fig. 3 is a schematic diagram of a reply text generating device according to a third embodiment of the present application. As shown in fig. 3, the reply text generation device 3 includes: a processor 30, a memory 31 and a computer program 32, such as a reply text generation program, stored in the memory 31 and executable on the processor 30. The steps in the above-described respective reply text generation method embodiments, such as steps S101 to S106 shown in fig. 1, are implemented when the processor 30 executes the computer program 32. Alternatively, the processor 30 may perform the functions of the modules/units in the above-described apparatus embodiments when executing the computer program 32, for example, the functions of the first obtaining unit 21 to the fourth processing unit 26 shown in fig. 2.
By way of example, the computer program 32 may be partitioned into one or more modules/units that are stored in the memory 31 and executed by the processor 30 to complete the present application. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, the instruction segments describing the execution of the computer program 32 in the reply text generating device 3. For example, the computer program 32 may be divided into a first acquisition unit, a first processing unit, a second processing unit, a second acquisition unit, a third processing unit, and a fourth processing unit, each unit functioning specifically as follows:
the first acquisition unit is used for acquiring target information to be replied;
the first processing unit is used for performing word segmentation processing on the target information to obtain target word segmentation;
the second processing unit is used for determining a target vector sequence corresponding to the target information according to a preset vector conversion strategy and the target word segmentation;
the second acquisition unit is used for acquiring the history associated information corresponding to the target vector sequence from a history associated database;
The third processing unit is used for inputting the target vector sequence and the corresponding history associated information into a pre-trained dialogue neural network model for processing to obtain a reply vector corresponding to the target information; when the dialogue neural network model is trained, the input of the dialogue neural network model is sample information and sample history information corresponding to the sample information, and the output of the dialogue neural network model is a sample reply vector corresponding to the sample information;
and the fourth processing unit is used for determining a reply text corresponding to the target information according to the reply vector.
The reply text generation device may include, but is not limited to, a processor 30, a memory 31. It will be appreciated by those skilled in the art that fig. 3 is merely an example of the reply text generating device 3 and does not constitute a limitation of the reply text generating device 3, and may include more or less components than illustrated, or may combine certain components, or different components, e.g., the reply text generating device may further include an input-output device, a network access device, a bus, etc.
The processor 30 may be a central processing unit (Central Processing Unit, CPU), other general purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), off-the-shelf programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 31 may be an internal storage unit of the reply text generating device 3, for example, a hard disk or a memory of the reply text generating device 3. The memory 31 may be an external storage device of the reply text generating device 3, for example, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash Card (Flash Card) or the like, which are provided in the reply text generating device 3. Further, the reply text generating device 3 may further include both an internal storage unit and an external storage device of the reply text generating device 3. The memory 31 is used for storing the computer program and other programs and data required by the device for generating the reply text. The memory 31 may also be used for temporarily storing data that has been output or is to be output.
It should be noted that, because the content of information interaction and execution process between the above devices/units is based on the same concept as the method embodiment of the present application, specific functions and technical effects thereof may be referred to in the method embodiment section, and will not be described herein again.
The embodiment of the application also provides a network device, which comprises: at least one processor, a memory, and a computer program stored in the memory and executable on the at least one processor, which when executed by the processor performs the steps of any of the various method embodiments described above.
Embodiments of the present application also provide a computer readable storage medium storing a computer program which, when executed by a processor, implements steps that may implement the various method embodiments described above.
Embodiments of the present application provide a computer program product which, when run on a mobile terminal, causes the mobile terminal to perform steps that may be performed in the various method embodiments described above.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the present application implements all or part of the flow of the method of the above embodiments, which may be implemented by a computer program instructing related hardware, where the computer program may be stored in a computer readable storage medium and, when executed by a processor, may implement the steps of each of the method embodiments described above. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer readable medium may include at least: any entity or device capable of carrying computer program code to a photographing device/terminal apparatus, a recording medium, computer memory, read-only memory (ROM), random access memory (RAM), electrical carrier signals, telecommunications signals, and software distribution media, such as a USB flash drive, a removable hard disk, a magnetic disk, or an optical disk. In some jurisdictions, computer readable media may not include electrical carrier signals and telecommunications signals in accordance with legislation and patent practice.
In the foregoing embodiments, the description of each embodiment has its own emphasis; for parts that are not described or illustrated in a particular embodiment, reference may be made to the related descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/network device and method may be implemented in other manners. For example, the apparatus/network device embodiments described above are merely illustrative, e.g., the division of the modules or units is merely a logical functional division, and there may be additional divisions in actual implementation, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection via interfaces, devices or units, which may be in electrical, mechanical or other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
The above embodiments are only for illustrating the technical solution of the present application, and are not limiting; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application, and are intended to be included in the scope of the present application.

Claims (10)

1. A method for generating a reply text, comprising:
obtaining target information to be replied;
performing word segmentation processing on the target information to obtain target word segmentation;
determining a target vector sequence corresponding to the target information according to a preset vector conversion strategy and the target word segmentation;
Acquiring history associated information corresponding to the target vector sequence from a history associated database;
inputting the target vector sequence and the corresponding history associated information into a pre-trained dialogue neural network model for processing to obtain a reply vector corresponding to the target information; when the dialogue neural network model is trained, the input of the dialogue neural network model is sample information and sample history information corresponding to the sample information, and the output of the dialogue neural network model is a sample reply vector corresponding to the sample information;
and determining a reply text corresponding to the target information according to the reply vector.
2. The method for generating a reply text according to claim 1, wherein the determining a target vector sequence corresponding to the target information according to a preset vector conversion strategy and the target word comprises:
and inputting the target word segmentation into a preset word embedding model for processing to obtain a target vector sequence corresponding to the target information.
3. The method for generating a reply text according to claim 1, wherein the determining a target vector sequence corresponding to the target information according to a preset vector conversion strategy and the target word comprises:
Determining a first vector according to a preset vector conversion strategy and the target word segmentation;
and screening target vectors meeting a first preset condition from the first vectors, and determining a target vector sequence corresponding to the target information according to the target vectors.
4. The method for generating a reply text according to claim 1, wherein the determining a reply text corresponding to the target information according to the reply vector includes:
and inputting the reply vector into a preset conversion neural network model for processing to obtain a reply text corresponding to the target information.
5. The method for generating a reply text according to claim 1, wherein inputting the target vector sequence and the corresponding history association information into a pre-trained dialogue neural network model for processing to obtain a reply vector corresponding to the target information comprises:
inputting the target vector sequence and the history associated information corresponding to the target vector sequence into a pre-trained dialogue neural network model for processing to obtain first reply information corresponding to the target information;
when the first reply information does not meet a second preset condition, inputting the target vector sequence and the initial reply corresponding to the target vector sequence into the pre-trained dialogue neural network model for processing to obtain second reply information;
And when the second reply information meets the second preset condition, taking the second reply information as a reply vector corresponding to the target information.
6. The method for generating a reply text according to claim 1, further comprising, after the obtaining, from a history association database, history association information corresponding to the target vector sequence:
when the history associated information corresponding to the target vector sequence is not obtained from the history associated database, inputting the target vector sequence into a pre-trained dialogue neural network model for processing to obtain a reply vector corresponding to the target information;
and determining a reply text corresponding to the target information according to the reply vector.
7. The method for generating reply text according to any one of claims 1 to 6, further comprising, after the obtaining of the reply vector corresponding to the target information:
and updating the history association database according to the reply vector corresponding to the target information.
8. A reply text generation apparatus, characterized by comprising:
the first acquisition unit is used for acquiring target information to be replied;
the first processing unit is used for performing word segmentation processing on the target information to obtain target word segmentation;
The second processing unit is used for determining a target vector sequence corresponding to the target information according to a preset vector conversion strategy and the target word segmentation;
the second acquisition unit is used for acquiring the history associated information corresponding to the target vector sequence from a history associated database;
the third processing unit is used for inputting the target vector sequence and the corresponding history associated information into a pre-trained dialogue neural network model for processing to obtain a reply vector corresponding to the target information; when the dialogue neural network model is trained, the input of the dialogue neural network model is sample information and sample history information corresponding to the sample information, and the output of the dialogue neural network model is a sample reply vector corresponding to the sample information;
and the fourth processing unit is used for determining a reply text corresponding to the target information according to the reply vector.
9. An apparatus, the apparatus comprising: a processor, a memory and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any one of claims 1 to 7 when executing the computer program.
10. A computer readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 7.
CN202311301780.9A 2023-10-09 2023-10-09 Method, device and equipment for generating reply text Pending CN117539984A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311301780.9A CN117539984A (en) 2023-10-09 2023-10-09 Method, device and equipment for generating reply text

Publications (1)

Publication Number Publication Date
CN117539984A true CN117539984A (en) 2024-02-09

Family

ID=89794630

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311301780.9A Pending CN117539984A (en) 2023-10-09 2023-10-09 Method, device and equipment for generating reply text

Country Status (1)

Country Link
CN (1) CN117539984A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination