CN117909483A - Personalized dialogue method, system, equipment and medium

Info

Publication number: CN117909483A
Application number: CN202410309950.6A
Authority: CN
Inventors: 吴朋书; 彭英杰
Original assignee: Xunfeng Technology Guizhou Co ltd
Current assignee: Xunfeng Technology Guizhou Co ltd
Priority date: 2024-03-19
Filing date: 2024-03-19
Publication date: 2024-04-19
Anticipated expiration: 2044-03-19
Also published as: CN117909483B

Abstract

The application provides a personalized dialogue method, system, equipment and medium. The method comprises: in response to a dialogue role being created by a user, or a preset time threshold being reached since the end of the last session, invoking a preset small-talk strategy to determine whether the dialogue role proactively initiates a session; in response to the dialogue role initiating a session, selecting a corpus or a large language model to generate text data to be optimized; in response to the text data to be optimized being generated, judging, according to the text inclusion relation, the emotion analysis result and the current period of the text data to be optimized, whether to invoke a preset interception strategy to optimize the text data to be optimized; if it is judged that the preset interception strategy is to be invoked, taking the optimized text data as target text data; otherwise, taking the text data to be optimized as the target text data; and determining the speaking content of the dialogue role for the user in the session according to the target text data. The realism of the dialogue role is improved, and the user experience is improved accordingly.

Description

Personalized dialogue method, system, equipment and medium

Technical Field

The application relates to the technical field of artificial intelligence, in particular to a personalized dialogue method, a personalized dialogue system, personalized dialogue equipment and a personalized dialogue medium.

Background

With the development of artificial intelligence technology, man-machine dialogue has received extensive attention and has been applied in various fields, for example in increasingly sophisticated intelligent assistants, intelligent customer service and various chat robots.

To further improve the personalization of man-machine dialogue, most companies have begun to use large language models to continuously refine it. However, a large language model reflects only users' general perception of a particular character, and training the model requires collecting a large amount of data for calibration, whereas for most particular characters sufficient data cannot be accumulated for the model to be calibrated continuously. In addition, dialogue produced directly by a large language model generally lacks proactivity.

There is a need for a personalized dialog method that solves the above-mentioned problems.

Disclosure of Invention

Based on the foregoing, it is necessary to provide a personalized dialogue method, system, device and medium.

In a first aspect, the present application provides a personalized dialog method, the method comprising:

in response to a dialogue role being created by a user, or a preset time threshold being reached since the end of the last session, invoking a preset small-talk strategy to determine whether the dialogue role proactively initiates a session;

Responding to the dialogue role to initiate a session, and selecting a corpus or a large language model to generate text data to be optimized;

responding to the generated text data to be optimized, and judging whether to call a preset interception strategy to perform optimization processing on the text data to be optimized according to a text inclusion relation, an emotion analysis result and a current period aiming at the text data to be optimized;

If judging that a preset interception strategy is called to optimize the text data to be optimized, taking the optimized text data to be optimized as target text data; otherwise, taking the text data to be optimized as target text data;

determining speaking contents of the dialogue roles for the user in a session according to the target text data;

The determining, according to the preset small-talk strategy, of whether the dialogue role proactively initiates the session comprises the following steps:

Searching historical session data corresponding to the user and determining the user emotion of the user according to the historical session data;

determining whether a user attribute is a new user according to whether the historical session data matched with the user exists;

determining whether the dialogue role initiates a session according to the emotion of the user, the current date characteristic, the current weather characteristic of the place of the user and the user attribute;

the calling of the preset interception policy to optimize the text data to be optimized comprises the following steps:

in response to the preset interception strategy being invoked to perform first optimization processing on the text data to be optimized, invoking a preset topic strategy to select a new topic from a preset topic corpus so as to generate initial optimized text data, or generating the initial optimized text data according to a preset prompt word and a large language model;

if the text data to be optimized does not need to be optimized, taking the text data to be optimized as target text data;

and in response to second optimization processing being performed on the text data to be optimized, replacing the specified text or the synonymous text of the specified text in the text data to be optimized by using the configured word pairs to generate target text data.

In some embodiments, the corpus comprises a fixed corpus and a topic corpus, the method further comprising:

responding to the speech of the user in the session, and calling a preset topic strategy to judge whether a new topic needs to be initiated;

After judging that a new topic needs to be initiated, selecting the new topic in a preset topic corpus to generate text data to be optimized, or generating the text data to be optimized according to a prompt word and a large language model;

Responding to the generated text data to be optimized, and carrying out optimization processing on the text data to be optimized according to a preset interception strategy to generate target text data;

And determining speaking contents of the dialogue role for the user in the session according to the target text data.

In some embodiments, the determining whether to invoke a preset interception policy to perform optimization processing on the text data to be optimized according to the text inclusion relation, the emotion analysis result and the current period of the text data to be optimized includes:

judging whether the text data to be optimized needs to be subjected to first optimization processing according to the current period and the optimization mapping relation between the emotion analysis result corresponding to the text data to be optimized and the optimization processing judgment result, wherein the optimization processing judgment result comprises an optimization processing result and a non-optimization processing result;

if the optimization processing judgment result matched with the emotion analysis result and the current period is an optimization processing result, performing first optimization processing on the text data to be optimized;

If the optimization processing judging result matched with the emotion analysis result and the current period is a result without optimization processing, judging whether second optimization processing is needed to be performed on the text data to be optimized according to the text containing relation of the text data to be optimized;

if the text inclusion relation does not contain the specified text or a synonymous text of the specified text, the text data to be optimized does not need to be optimized;

and if the text inclusion relation contains the specified text or a synonymous text of the specified text, performing second optimization processing on the text data to be optimized.

In some embodiments, the invoking the preset interception policy to perform optimization processing on the text data to be optimized includes:

Judging whether second optimization processing is needed to be carried out on the initial optimization text data according to whether a specified text or a synonymous text of the specified text is contained in a text containing relation corresponding to the initial optimization text data;

if the text containing relation corresponding to the initial optimized text data contains a designated text or a synonymous text of the designated text, judging that the initial optimized text data needs to be subjected to second optimization processing; otherwise, judging that the initial optimized text data does not need to be optimized;

In response to the need for second optimization processing of the initial optimized text data, replacing the specified text or the specified text synonymous text in the initial optimized text data with the configured word pairs to generate target text data;

and taking the initial optimized text data as target text data in response to no optimization processing is required for the initial optimized text data.

In some embodiments, the invoking the preset topic policy to determine whether a new topic needs to be initiated includes:

Acquiring and judging a current user state according to the historical speaking content of the user in the session, wherein the user state comprises an awkward chat state, an emotion state and a normal state;

Responding to the state of the user as an awkward chat state or an emotion state, judging that a new topic needs to be initiated;

Responding to the state of the user as a normal state, acquiring a current topic and judging whether a new topic needs to be initiated according to the number of dialogue turns in the session;

if the current topic in the session is unchanged and the number of the dialog turns exceeds a preset threshold, initiating a new topic;

If the current topic in the session changes, no new topic is initiated.
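
The topic-strategy judgment described in this embodiment can be illustrated with a minimal Python sketch. The names `UserState`, `needs_new_topic` and the default threshold of 10 dialogue turns are assumptions introduced for illustration only; the application does not specify them, and how the user state is detected from the historical speaking content is outside the scope of this sketch.

```python
from enum import Enum


class UserState(Enum):
    """User states named in the embodiment (identifiers are illustrative)."""
    AWKWARD_CHAT = "awkward_chat"
    EMOTIONAL = "emotional"
    NORMAL = "normal"


def needs_new_topic(state: UserState, topic_changed: bool,
                    dialog_turns: int, max_turns: int = 10) -> bool:
    """Decide whether the dialogue role should open a new topic.

    max_turns stands in for the "preset threshold"; 10 is an assumed value.
    """
    # Awkward-chat or emotion state: always initiate a new topic.
    if state in (UserState.AWKWARD_CHAT, UserState.EMOTIONAL):
        return True
    # Normal state: a topic that has already changed is left alone.
    if topic_changed:
        return False
    # Normal state, unchanged topic: switch once the turn count exceeds the threshold.
    return dialog_turns > max_turns
```

For instance, `needs_new_topic(UserState.NORMAL, topic_changed=False, dialog_turns=12)` evaluates to true under the assumed threshold, while the same call with `topic_changed=True` does not.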

In a second aspect, the present application provides a personalized dialog system, the system comprising:

The small-talk module is used for, in response to a dialogue role being created by a user, or a preset time threshold being reached since the end of the last session, invoking a preset small-talk strategy to determine whether the dialogue role proactively initiates a session;

The text generation module is used for responding to the dialogue role to initiate a session, and selecting a fixed corpus or a large language model to generate text data to be optimized;

The interception replacing module is used for, in response to the text data to be optimized being generated, judging, according to the text inclusion relation, the emotion analysis result and the current period of the text data to be optimized, whether to invoke a preset interception strategy to optimize the text data to be optimized;

the interception replacing module is further used for optimizing the text data to be optimized if judging that a preset interception strategy is called, and taking the optimized text data to be optimized as target text data; otherwise, taking the text data to be optimized as target text data;

And the speaking output module is used for determining speaking contents of the dialogue roles for the user in the session according to the target text data.

In a third aspect, the present application provides an electronic device, including:

one or more processors;

And a memory associated with the one or more processors, the memory for storing program instructions that, when read for execution by the one or more processors, perform the following:

and determining speaking contents of the dialogue roles in the conversation aiming at the user according to the target text data.

In a fourth aspect, the present application also provides a computer-readable storage medium having stored thereon a computer program that causes a computer to perform the operations of:

The beneficial effects achieved by the application are as follows:

The application provides a personalized dialogue method, which comprises: in response to a dialogue role being created by a user, or a preset time threshold being reached since the end of the last session, invoking a preset small-talk strategy to determine whether the dialogue role proactively initiates a session; in response to the dialogue role initiating a session, selecting a corpus or a large language model to generate text data to be optimized; in response to the text data to be optimized being generated, judging, according to the text inclusion relation, the emotion analysis result and the current period of the text data to be optimized, whether to invoke a preset interception strategy to optimize the text data to be optimized; if it is judged that the preset interception strategy is to be invoked, taking the optimized text data as target text data; otherwise, taking the text data to be optimized as the target text data; and determining the speaking content of the dialogue role for the user in the session according to the target text data. The small-talk strategy associates the speaking content with real-time events, which improves the proactivity of the dialogue; the interception strategy avoids sensitive words and sensitive topics, allows the topic to be actively shifted to a suitable scene, and adds catchphrases that match the role setting to the generated speaking content, which improves the realism of the role; the topic strategy further allows the rhythm of the dialogue to be controlled, and the dialogue role can chat about fields that the user likes according to the characteristics of the user, which further improves the user experience.

Drawings

For a clearer description of the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly introduced below, it being obvious that the drawings in the description below are only some embodiments of the present application, and that other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art, wherein:

FIG. 1 is a flowchart of a personalized dialog method provided by an embodiment of the present application;

FIG. 2 is a flow chart of yet another personalized dialog method provided by an embodiment of the present application;

FIG. 3 is a schematic diagram of policy maintenance provided by an embodiment of the present application;

FIG. 4 is a diagram of a personalized dialog system architecture according to an embodiment of the present application;

FIG. 5 is a block diagram of an electronic device according to an embodiment of the present application.

Detailed Description

For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.

It should be understood that throughout this specification and the claims, unless the context clearly requires otherwise, the words "comprise", "comprising", and the like, are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is, it is the meaning of "including but not limited to".

It should also be appreciated that the terms "first," "second," and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. Furthermore, in the description of the present application, unless otherwise indicated, the meaning of "a plurality" is two or more.

It should be noted that the labels "S1", "S2" and the like are used only to facilitate description of the steps of the method; they are not to be construed as indicating or limiting the order or sequence of the steps, nor as limiting the present application. In addition, the technical solutions of the embodiments may be combined with each other, provided that the combination can be realized by those skilled in the art; when a combination of technical solutions is contradictory or cannot be realized, the combination should be considered not to exist and not to fall within the scope of protection claimed in the present application.

Example 1

The embodiment of the application provides a personalized dialogue method which is applied to an intelligent dialogue system, and particularly, the method disclosed by the embodiment of the application is used for carrying out man-machine dialogue in the intelligent dialogue system, as shown in figure 1, and comprises the following contents:

S1, in response to a dialogue role being created by a user, or a preset time threshold being reached since the end of the last session, invoking a preset small-talk strategy to determine whether the dialogue role proactively initiates a session.

In this application, a session refers to the whole dialogue process, from the first utterance to the last, between the user and the dialogue role created by the user. Before a user initiates a session for the first time, the intelligent dialogue system generates a role creation prompt to prompt the user to create a dialogue role; the role creation prompt may take the form of a pop-up window or a page on the user interface and prompt the user in text, or it may prompt the user by voice. Specifically, the user can set parameters such as prompt words, character description words and a background story to create a role that meets the user's expectations, for example "You are a senior student. You are gentle but find math questions bothersome", "You are ...", and so on, thereby defining the personality and features of the role through the prompt words and the like.

After the user creates the role, the user can be considered to have a tendency to talk to the intelligent dialogue system; likewise, when the user has already initiated a session, a preset time threshold has elapsed since that session ended, and the user has not left the intelligent dialogue system, the user is considered to have a need to talk with the intelligent dialogue system. At this point, the preset small-talk strategy is invoked to determine whether the dialogue role needs to proactively initiate a session. Preferably, the preset time threshold may be 3 minutes or 5 minutes; the present application does not limit this value, and it may be adjusted by staff according to actual conditions.

When the user has exited the session with the role and, after a preset time threshold has been reached, re-enters the dialogue system, the user can be considered to have a new need to talk with the intelligent dialogue system. At this point, the preset small-talk strategy is invoked to determine whether the dialogue role needs to proactively initiate a session. Preferably, in this case the preset time threshold may be 30 minutes or 1 hour; the present application does not limit this value, and it may be adjusted by staff according to actual conditions.
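
The two trigger conditions above (role creation, or a time threshold elapsing since the last session ended) can be sketched as follows. This is a minimal illustration rather than the application's implementation: the function name `strategy_should_run`, its parameters, and the concrete threshold constants (5 minutes while the user stays in the system, 30 minutes when the user re-enters) are assumptions chosen from the preferred values mentioned in the text.

```python
from datetime import datetime, timedelta
from typing import Optional

# Assumed values taken from the "preferred" examples in the description.
IN_APP_IDLE_THRESHOLD = timedelta(minutes=5)   # user has not left the system
RETURN_THRESHOLD = timedelta(minutes=30)       # user exited and re-entered


def strategy_should_run(role_just_created: bool,
                        last_session_end: Optional[datetime],
                        now: datetime,
                        user_stayed_in_app: bool) -> bool:
    """Return True when the proactive-session strategy should be invoked."""
    # Trigger 1: the user has just created a dialogue role.
    if role_just_created:
        return True
    # No earlier session to measure from.
    if last_session_end is None:
        return False
    # Trigger 2: enough time has passed since the last session ended.
    threshold = IN_APP_IDLE_THRESHOLD if user_stayed_in_app else RETURN_THRESHOLD
    return now - last_session_end >= threshold
```

The function only decides whether the strategy is consulted; whether the dialogue role actually speaks first is a separate judgment made by the strategy itself.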

In one embodiment, determining, according to the preset small-talk strategy, whether the dialogue role proactively initiates the session includes: searching for historical session data corresponding to the user and determining the user emotion of the user according to the historical session data; determining whether the user attribute is a new user according to whether historical session data matching the user exists; and determining whether the dialogue role initiates the session according to the user emotion, the current date characteristic, the current weather characteristic of the user's location and the user attribute. By deciding through the small-talk strategy whether the dialogue role proactively initiates the session, the application improves the proactivity of the dialogue role, enhances interactivity with the user, and further improves how human-like the dialogue role is.

Specifically, the intelligent dialogue system is searched for historical session data. If historical session data matching the current user exists, the user attribute is determined not to be a new user, and the most recent historical session data is analyzed to determine whether the user's emotion during the last session included negative emotion. Whether to proactively initiate the session is then judged from the user emotion of the last session, the user attribute, the current date characteristic and the current weather characteristic of the user's location. The specific judgment rules can be configured by staff in the background. For example: a session is proactively initiated whenever the last session of the user contained negative emotion or the user attribute is a new user; or, although the last session contained no negative emotion and the user is not a new user, the current date characteristic belongs to a special period such as the approach of a holiday, a weekend, a holiday or a special festival, in which case it is also judged that a session needs to be proactively initiated; or, although the last session contained no negative emotion and the user is not a new user, the current weather characteristic of the user's location belongs to severe weather such as a storm or snowstorm, or to abnormal weather such as a large temperature fluctuation compared with the preceding days, in which case it is also judged that a session needs to be proactively initiated. It can be appreciated that, as actual scenarios increase, those skilled in the art can add or remove judgment rules as appropriate so as to cover as many cases as possible and thereby widen the application range of the small-talk strategy. In addition, the judgment may also be made by a preset model, in which case the judgment rules need to be configured into the model in advance for training.
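
The example judgment rules above can be expressed as a short rule-based sketch. The data class `SessionContext`, the function `should_initiate_session` and the label sets `SPECIAL_DATES` and `ABNORMAL_WEATHER` are hypothetical names and values used only for illustration; in the application the rules are configured by staff in the background and may differ.

```python
from dataclasses import dataclass

# Assumed labels; a real deployment would take these from background configuration.
SPECIAL_DATES = {"near_holiday", "weekend", "holiday", "special_festival"}
ABNORMAL_WEATHER = {"storm", "snowstorm", "large_temperature_swing"}


@dataclass
class SessionContext:
    has_history: bool            # historical session data exists for this user
    last_session_negative: bool  # last session contained negative emotion
    date_feature: str            # current date characteristic
    weather_feature: str         # current weather at the user's location


def should_initiate_session(ctx: SessionContext) -> bool:
    """Apply the example rules: any one satisfied rule triggers a proactive session."""
    is_new_user = not ctx.has_history
    # Rule 1: new user, or negative emotion in the last session.
    if is_new_user or ctx.last_session_negative:
        return True
    # Rule 2: special period such as a weekend or holiday.
    if ctx.date_feature in SPECIAL_DATES:
        return True
    # Rule 3: severe or abnormal weather at the user's location.
    return ctx.weather_feature in ABNORMAL_WEATHER
```

Because each rule is an independent condition, adding or removing rules as scenarios change only requires editing the configured sets or appending another check.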

S2, responding to the dialogue role to initiate a session, and selecting a corpus or a large language model to generate text data to be optimized.

When the dialogue role initiates a session (i.e., the small-talk strategy determines that a session needs to be proactively initiated), a corpus or a large language model is used to output a piece of text that simulates speech. The corpus comprises a fixed corpus and a topic corpus. If the corpus is selected to generate the text data to be optimized, either an entry in the fixed corpus is intelligently selected according to the characteristics of the user, or the topic strategy is invoked to intelligently output an entry for one topic from the configured topic corpus, so as to generate the text data to be optimized. The fixed corpus is configured in the background as fixed script records stored in a database, in which the name is a placeholder expression; the configuration background also provides an interactive interface in which a large language model assists in generating fixed scripts, and staff select and decide which scripts are incorporated into the fixed corpus. It will be appreciated that the fixed corpus contains a plurality of fixed entries, and the choice among them is made randomly. After an entry from the fixed corpus is output, the intelligent dialogue system also collects statistics on the user's reply effect for that entry, the reply effect comprising the reply rate and the reply content, ranks the entries according to the collected reply rate and reply content, and eliminates entries with a poor reply effect from the fixed corpus. The topic corpus can be configured manually, with topics designed by people and entered through operational tools; topics can also be automatically analyzed and summarized from public news content or trending-search content of media websites, formed into scripts by a large language model and stored; or the content can be generated by the large language model in real time when topic content needs to be output to the dialogue. When it is determined that the topic corpus is to be used to generate the text data to be optimized, one topic is intelligently selected from the existing configured topic list and the related topic corpus entry is output.
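
A minimal sketch of the fixed-corpus handling described above is given here: a random choice among fixed entries with a name placeholder filled in, and pruning of entries with a poor reply effect. The placeholder syntax `{name}`, the function names and the 0.2 reply-rate cut-off are assumptions for illustration; the sketch also ranks by reply rate only, whereas the description additionally considers the reply content.

```python
import random


def pick_fixed_utterance(corpus, name, rng=random):
    """Randomly choose a fixed entry and fill the (assumed) {name} placeholder."""
    template = rng.choice(corpus)
    return template.replace("{name}", name)


def prune_corpus(stats, min_reply_rate=0.2):
    """Rank entries by reply rate and drop those replied to too rarely.

    stats maps each entry to (times_sent, times_replied).
    Entries that were never sent are kept, since no evidence exists yet.
    """
    def reply_rate(item):
        sent, replied = item[1]
        return replied / sent if sent else 0.0

    ranked = sorted(stats.items(), key=reply_rate, reverse=True)
    return [entry for entry, (sent, replied) in ranked
            if sent == 0 or replied / sent >= min_reply_rate]
```

Keeping never-sent entries is a design choice of this sketch, so that newly configured scripts are not eliminated before they have been tried.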

If the large language model is used to output a piece of text that simulates speech, the intelligent dialogue system needs to be configured with a preset parameter in advance so that the system first simulates a user in a dialogue with the large language model, allowing the large language model to output the text data to be optimized. For example, the system proactively sends "I am a senior student and I want to know what college life is like." The large model then gives a reply suggestion, and the system extracts the content between double quotation marks as the text data to be optimized and outputs it. Different models (such as BLOOM, the GPT series of OpenAI, the large language models of Google, Alibaba Cloud M6, Baidu ERNIE Bot and the like) can be adopted as the large language model, and various corpora with character characteristics are collected for training. For example, for an anime-style (two-dimensional) character, dialogue lines from anime series and light novels are used; dialogue lines from literary novels, documentaries and the like may likewise be selected as a training set. A successfully trained large language model can thus automatically generate a role that better matches the user's expectations when the user creates a dialogue role.
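
The step of taking the content between double quotation marks out of the model's reply can be sketched with a regular expression. The function name `extract_quoted` and the decision to accept both straight and curly double quotes are assumptions of this sketch; the actual reply format of a given large language model is not specified in the application.

```python
import re
from typing import Optional

# Matches the first span enclosed in straight ("...") or curly (“...”) double quotes.
_QUOTED = re.compile(r'"([^"]+)"|“([^”]+)”')


def extract_quoted(reply: str) -> Optional[str]:
    """Return the first double-quoted span of the reply, or None if there is none."""
    match = _QUOTED.search(reply)
    if match is None:
        return None
    # Exactly one of the two groups is populated, depending on the quote style.
    return match.group(1) or match.group(2)
```

When the reply contains no quoted span the function returns `None`, leaving it to the caller to decide whether to retry the model or fall back to the corpus.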

S3, in response to the text data to be optimized being generated, judging, according to the text inclusion relation, the emotion analysis result and the current period of the text data to be optimized, whether to invoke a preset interception strategy to optimize the text data to be optimized.

After the text data to be optimized is generated, in order to make the finally output text better match the characteristics of the dialogue role and avoid sensitive words, the application first judges whether the text data to be optimized needs to be optimized and, if so, optimizes it to generate the target text data. The text data to be optimized is analyzed by a pre-trained emotion analysis model to generate an emotion analysis result; the text inclusion relation corresponding to the text data to be optimized is analyzed; and whether the text data to be optimized needs optimization processing, and the type of optimization processing, are determined according to the text inclusion relation, the emotion analysis result and the current period, wherein the optimization processing types comprise a first optimization processing and a second optimization processing.

In one embodiment, judging, according to the text inclusion relation, the emotion analysis result and the current period of the text data to be optimized, whether to invoke the preset interception strategy to optimize the text data to be optimized includes: judging whether first optimization processing needs to be performed on the text data to be optimized according to an optimization mapping relation that maps the current period and the emotion analysis result corresponding to the text data to be optimized to an optimization processing judgment result, wherein the optimization processing judgment result comprises an optimization processing result and a non-optimization processing result; if the optimization processing judgment result matched with the emotion analysis result and the current period is an optimization processing result, performing first optimization processing on the text data to be optimized; if the optimization processing judgment result matched with the emotion analysis result and the current period is a non-optimization processing result, judging whether second optimization processing needs to be performed on the text data to be optimized according to the text inclusion relation of the text data to be optimized. For example, when the current period is the Spring Festival and the detected emotion analysis result is frustration, the judgment result matched in the mapping relation is the optimization processing result. It can be understood that staff can configure the optimization processing judgment results matched with different periods and different emotion analysis results, and can continuously refine the configuration according to actual conditions. If the text inclusion relation does not contain the specified text or a synonymous text of the specified text, the text data to be optimized does not need to be optimized; and if the text inclusion relation contains the specified text or a synonymous text of the specified text, second optimization processing is performed on the text data to be optimized.

Specifically, as can be seen from the above, whether first optimization processing is required is judged first, in order to determine whether text data on a new topic needs to be generated; only after it is judged that first optimization processing is not required is it judged whether second optimization processing is required. This effectively reduces optimization time and avoids the redundant work that would arise if the second optimization processing were performed first and the current topic were then found to need replacing.
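
The two-stage judgment described above can be sketched as follows: the mapping from (current period, emotion analysis result) is consulted first, and the text inclusion check runs only when no first optimization is required. The names `Action`, `OPTIMIZATION_MAP` and `decide_action` are hypothetical, the single mapping entry mirrors the Spring Festival example from the description, and plain substring matching stands in for whatever inclusion check is actually configured.

```python
from enum import Enum


class Action(Enum):
    FIRST = "first_optimization"    # switch to a new topic
    SECOND = "second_optimization"  # word-pair replacement
    NONE = "no_optimization"


# Assumed configuration: (period, emotion) pairs that require first optimization.
OPTIMIZATION_MAP = {("spring_festival", "frustration"): True}


def decide_action(text, emotion, period, specified_texts, mapping=OPTIMIZATION_MAP):
    """Return which optimization, if any, applies to the text to be optimized."""
    # Stage 1: period + emotion decide whether the topic must be replaced.
    if mapping.get((period, emotion), False):
        return Action.FIRST
    # Stage 2: only reached when stage 1 says "no optimization";
    # check whether any specified (or synonymous) text is contained.
    if any(word in text for word in specified_texts):
        return Action.SECOND
    return Action.NONE
```

Ordering the checks this way means word replacement is never performed on text that is about to be discarded in favour of a new topic.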

Here, the specified text is text preset by staff, such as sensitive words that need to be replaced, names that need to be replaced, chat habit words, and words that do not conform to the role setting. The specified text synonymous text is text with the same meaning as the specified text; it can be generated automatically from the specified text with a Natural Language Processing (NLP) tool, an automatic synonym generation tool, an online synonym dictionary, or similar tools, and the application does not restrict the specific generation method. Note that synonymous text is generated only for the sensitive words contained in the specified text; it does not need to be generated for chat habit words or for words related to the role setting. The configured word pairs are the associations between a specified text (or specified text synonymous text) and the optimized word used to replace it; for example, the optimized word corresponding to the specified text "I" is 'j', and the optimized word corresponding to a sensitive word is ' ' (i.e., a blank space). The word pairs are configured by staff in the background and can be extended according to the actual scenario. If neither the first nor the second optimization process is required, that is, no optimization is needed, some catchphrases may be added to the text data to be optimized as appropriate so that it conforms to the role setting, for example a habit of speaking in the third person, or a fondness for certain sentence-final particles or internet slang. The text data to be optimized, with the catchphrases added, is then used as the target text data.
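As a non-limiting illustration, the word-pair replacement and the catchphrase step described above might be sketched as follows. The function names, the `WORD_PAIRS` table, and the catchphrase are hypothetical and are not taken from the application; a single-pass regular-expression substitution is one possible replacement technique among many.

```python
import re


def apply_word_pairs(text: str, word_pairs: dict[str, str]) -> str:
    """Replace every configured specified text with its optimized word."""
    keys = [k for k in word_pairs if k]
    if not keys:
        return text
    # Longest keys first so a short key cannot pre-empt a longer phrase.
    keys.sort(key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    # A single pass means replacement output is never replaced again.
    return pattern.sub(lambda m: word_pairs[m.group(0)], text)


def add_catchphrase(text: str, catchphrase: str, separator: str = ", ") -> str:
    """Append a role catchphrase when no optimization was needed."""
    return f"{text}{separator}{catchphrase}"


# Hypothetical word pairs; real ones are configured by staff in the background.
WORD_PAIRS = {"damn": "darn", "shut up": "hush"}

print(apply_word_pairs("damn, just shut up", WORD_PAIRS))
print(add_catchphrase("See you tomorrow", "meow"))
```

Mapping a sensitive word to `" "` in the table reproduces the blank-space replacement mentioned above.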

Specifically, the emotion analysis model may be any one of several kinds of model that take text data as input and determine the emotion it contains: an emotion model trained with the KNN algorithm; a dictionary-based emotion analysis model (which uses an emotion dictionary and calculates an emotion score from the emotion words occurring in the text); a machine-learning emotion analysis model (which is trained with an algorithm such as a Support Vector Machine (SVM), naive Bayes, or a decision tree); or a deep-learning model (which uses a neural network such as a Recurrent Neural Network (RNN) or a Convolutional Neural Network (CNN)). Alternatively, several of these models may be used together, in which case the emotion analysis result output by the best-performing model is selected as the final analysis result.
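Of the options listed, the dictionary-based model is the simplest to illustrate. The following is a minimal sketch only: the tiny `EMOTION_DICT`, the function names, and the three output labels are assumptions made for illustration, and a practical emotion dictionary would be far larger.

```python
import re

# Hypothetical miniature emotion dictionary: word -> polarity.
EMOTION_DICT = {
    "happy": 1, "great": 1, "love": 1,
    "angry": -1, "sad": -1, "hate": -1,
}


def emotion_score(text: str) -> int:
    """Sum the polarity of every emotion word that occurs in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(EMOTION_DICT.get(word, 0) for word in words)


def analyze_emotion(text: str) -> str:
    """Map the score to a coarse emotion analysis result."""
    score = emotion_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(analyze_emotion("I love this great day"))
print(analyze_emotion("I hate being sad"))
```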

The application provides specific implementation steps for determining the text inclusion relation, as follows:

First, the text data to be optimized is preprocessed: punctuation marks, stop words, and special characters are removed, and letter case is unified, so that the text is clean and uniform and ready for the subsequent matching step. Second, exact matching is performed on the text data to be optimized: a string matching function provided by the programming language, such as the find() or index() method in Python, is used directly to judge whether a specified keyword or phrase exists in the text. If the match succeeds, it is concluded that the text data to be optimized contains a specified text or specified text synonymous text; otherwise, it is determined that it does not. Exact matching is the simplest way to judge whether a text contains a given text. In practice, fuzzy matching can also be used: even if the text contained in the text data to be optimized differs slightly from a keyword or phrase in the specified text or specified text synonymous text, the match is still judged successful, that is, the text data to be optimized is deemed to contain the specified text or specified text synonymous text. Fuzzy matching may be implemented with regular expressions or with a fuzzy matching algorithm such as the Levenshtein distance algorithm. It should be understood that the above is only one way of determining the text inclusion relation; the application also covers any other method of identifying a text inclusion relation, which is not described in detail here.
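The three steps above (preprocessing, exact matching, fuzzy matching) can be sketched as follows. This is an illustration under stated assumptions: the `STOP_WORDS` set, the function names, and the `max_distance` tolerance are hypothetical, and the fuzzy check compares single tokens only.

```python
import re

STOP_WORDS = {"a", "an", "the", "is", "of"}  # hypothetical stop-word list


def preprocess(text: str) -> str:
    """Lower-case, strip punctuation/special characters, drop stop words."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(t for t in text.split() if t not in STOP_WORDS)


def contains_exact(text: str, keywords: list[str]) -> bool:
    """Exact matching with str.find(), as suggested in the description."""
    cleaned = preprocess(text)
    keys = [preprocess(k) for k in keywords]
    return any(k and cleaned.find(k) != -1 for k in keys)


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def contains_fuzzy(text: str, keywords: list[str], max_distance: int = 1) -> bool:
    """Fuzzy matching: any token within max_distance edits of a keyword."""
    tokens = preprocess(text).split()
    keys = [preprocess(k) for k in keywords]
    return any(levenshtein(t, k) <= max_distance for t in tokens for k in keys if k)


print(contains_exact("What a STUPID idea!", ["stupid"]))
print(contains_fuzzy("so stupiid", ["stupid"]))
```

Exact matching misses the misspelled "stupiid", while the fuzzy check accepts it because it is one edit away from the keyword.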

S4, if it is determined that the preset interception strategy is to be invoked, optimizing the text data to be optimized and taking the optimized text data as the target text data; otherwise, taking the text data to be optimized as the target text data.

In one embodiment, invoking the preset interception policy to perform optimization processing on the text data to be optimized further includes: in response to a first optimization process being required for the text data to be optimized, invoking a preset topic policy to select a new topic from a preset topic corpus so as to generate initial optimized text data, or generating the initial optimized text data according to a preset prompt word and a large language model; if the text data to be optimized does not need optimization, taking the text data to be optimized as the target text data; and in response to a second optimization process being required for the text data to be optimized, replacing the specified text or specified text synonymous text in the text data to be optimized using the configured word pairs to generate the target text data.

Specifically, if the intelligent dialogue system determines that the text data to be optimized requires the first optimization process, it invokes the topic policy to select a new topic and generate text data. A topic can be intelligently selected from the topic corpus and the corresponding corpus entry output as new text data, which is called the initial optimized text data. The intelligent selection of topics can be implemented by a trained model that takes the user's characteristics, the current period, the role characteristics, and the like as input features and outputs a topic; the model can be trained on a training set built from the topics, user characteristics, periods, and role characteristics in the user's historical conversation content in the intelligent dialogue system, and the application does not limit the specific model structure. The intelligent dialogue system can also configure preset dialogue parameters; when the dialogue is handled by the large language model, the large language model completes the text output and passes the text on to the next step. For example, when the system decides that a new topic is needed, it actively sends a prompt such as "I am chatting with a friend and want to change the topic to how to get rid of acne. Please teach me how to gracefully break off the current topic and then bring up the topic I want." The large model then gives a suggested reply, and the system extracts the content between the double quotation marks in the large model's reply and sends it directly into the dialogue with the user.
If the intelligent dialogue system determines that the text data to be optimized requires the second optimization process, the specified text or specified text synonymous text it contains is replaced as required, that is, replaced in a targeted manner according to the configured word pairs.
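The prompt-and-extract step in the example above could look like the sketch below. The call to the large language model itself is deliberately left out, since the application does not name a model or an API; `build_topic_prompt` and `extract_quoted_reply` are hypothetical helper names, and the sample reply is invented for illustration.

```python
import re
from typing import Optional

# Matches text inside straight or curly double quotation marks.
_QUOTED = re.compile(r'"([^"]+)"|“([^”]+)”')


def build_topic_prompt(new_topic: str) -> str:
    """Build the prompt the system sends to the large language model."""
    return (
        f"I am chatting with a friend and want to change the topic to {new_topic}. "
        "Please teach me how to gracefully break off the current topic "
        "and then bring up the topic I want."
    )


def extract_quoted_reply(model_reply: str) -> Optional[str]:
    """Return the first quoted passage in the model's reply, if any."""
    match = _QUOTED.search(model_reply)
    if match is None:
        return None
    return match.group(1) or match.group(2)


# Invented model reply, standing in for real large-language-model output.
reply = 'You could say: "By the way, have you found anything that helps with acne?" Then move on.'
print(build_topic_prompt("how to get rid of acne"))
print(extract_quoted_reply(reply))
```

Returning `None` when no quoted passage is found lets the caller fall back to another source, such as the topic corpus.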

It can be understood that the preset topic strategy comprises a judgment stage, which decides whether the topic needs to be replaced with a new one, and a text output stage, which generates text data for the new topic, either by selecting a topic from the topic corpus or by using a prompt word determined by the new topic together with the large language model. When the topic strategy logic is entered from the cold-talk strategy or the interception strategy in order to output a topic, the judgment stage of the topic strategy is skipped.

Specifically, in one embodiment, invoking the preset interception policy to optimize the text data to be optimized includes: judging whether the initial optimized text data requires the second optimization process according to whether the text inclusion relation corresponding to the initial optimized text data contains a specified text or specified text synonymous text; if it does, determining that the initial optimized text data requires the second optimization process; otherwise, determining that the initial optimized text data needs no optimization; in response to the second optimization process being required, replacing the specified text or specified text synonymous text in the initial optimized text data using the configured word pairs to generate the target text data; and in response to no optimization being required, taking the initial optimized text data as the target text data. It can be understood that, if the interception policy decides to perform the first optimization process, the resulting text (i.e., the initial optimized text data) is new text data generated from a re-selected topic, and this new text may itself contain sensitive words or words that do not conform to the role setting. The application therefore proposes judging again whether the initial optimized text data needs optimization. If the text inclusion relation corresponding to the initial optimized text data does not contain a specified text or specified text synonymous text, no optimization is needed and the initial optimized text data is used as the target text data; if it does, the second optimization process is still required, and, to avoid falling into an infinite loop, the specified text or specified text synonymous text in the initial optimized text data is replaced directly to generate the target text data.
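The overall interception flow (first optimization, a single re-check, then direct replacement) can be summarized in the following sketch. It assumes the first-optimization decision is a lookup keyed on the emotion analysis result and the current period; the parameter names and the callables passed in are hypothetical placeholders for the components described above.

```python
from typing import Callable


def intercept(
    text: str,
    emotion: str,
    period: str,
    first_opt_map: dict[tuple[str, str], bool],
    contains_specified: Callable[[str], bool],
    replace_specified: Callable[[str], str],
    new_topic_text: Callable[[], str],
) -> str:
    """Return the target text data for one piece of text to be optimized."""
    # First optimization: regenerate the text on a new topic if the
    # (emotion, period) pair is mapped to "optimization required".
    if first_opt_map.get((emotion, period), False):
        text = new_topic_text()  # initial optimized text data
    # Second optimization: checked exactly once, whether or not the text
    # was regenerated, so the flow can never loop back to a new topic.
    if contains_specified(text):
        return replace_specified(text)
    return text


# Hypothetical configuration used only to exercise the flow.
mapping = {("negative", "late_night"): True}
result = intercept(
    "I feel awful", "negative", "late_night", mapping,
    contains_specified=lambda t: "damn" in t,
    replace_specified=lambda t: t.replace("damn", "darn"),
    new_topic_text=lambda: "damn, seen any good films lately?",
)
print(result)
```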

S5, determining speaking contents of the dialogue roles for the users in the conversation according to the target text data.

At this point, the target text data, obtained by optimizing the text data output from the fixed corpus, the topic corpus, or the large language model, is actually output to the user; that is, the role created for the current user speaks for the first time in the current session.

Example two

This embodiment of the present application discloses another personalized dialogue method applied to an intelligent dialogue system. Specifically, as shown in fig. 2, the method is applied to an existing session in the intelligent dialogue system and includes the following:

A1, in response to the user speaking in the session, invoking a preset topic strategy to judge whether a new topic needs to be initiated.

Specifically, when a session exists in the intelligent dialogue system, the system detects whether the user has spoken and, once speech is detected, invokes the preset topic strategy to judge whether a new topic needs to be initiated. In addition, if the current user has just created the dialogue role for the first time and the cold-talk strategy disclosed in the first embodiment has determined that the dialogue role does not actively initiate a session, the system likewise detects whether the user has spoken and, once speech is detected, uses the preset topic strategy to judge whether a new topic needs to be initiated.

In one embodiment, invoking the preset topic policy to judge whether a new topic needs to be initiated includes: determining the current user state according to the user's historical speech content in the session, wherein the user state is one of an awkward chat state, an emotional state, and a normal state; in response to the user state being the awkward chat state or the emotional state, determining that a new topic needs to be initiated; in response to the user state being the normal state, obtaining the current topic and judging whether a new topic needs to be initiated according to the number of dialogue rounds in the session; if the current topic in the session is unchanged and the number of dialogue rounds exceeds a preset threshold, initiating a new topic; if the current topic in the session has changed, not initiating a new topic. This is the judgment stage of the topic strategy. Dialogue between the user and the intelligent dialogue system is counted in rounds: each round starts with the user's message and ends with the system's reply. The preset threshold is preferably set to 5; the application does not limit this value, which can be adjusted according to the actual situation.

Specifically, the topic strategy judges that a new topic is needed in two ways. The first is to detect the user state: for example, the user repeatedly sending replies of only one or two words indicates an awkward chat (the awkward chat state), and the user repeatedly sending angry sentences indicates the emotional state; in both cases it is appropriate to switch to a new topic. The second is to compare the number of dialogue rounds: when the current topic has continued for many rounds, a new topic is randomly selected to replace it, which makes the role more human-like. The user state may be determined by performing emotion analysis on the user's historical speech content in the current session, for example with an emotion analysis model as described above; the application does not limit this.
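The judgment stage of the topic strategy can be sketched as below. The default threshold of 5 comes from the description; the `UserState` enum, the function names, and the `window` of three consecutive short replies used to detect an awkward chat are assumptions made for illustration.

```python
from enum import Enum


class UserState(Enum):
    AWKWARD_CHAT = "awkward_chat"
    EMOTIONAL = "emotional"
    NORMAL = "normal"


def is_awkward_chat(user_messages: list[str], window: int = 3) -> bool:
    """True if the last `window` user replies each contain at most two words."""
    recent = user_messages[-window:]
    return len(recent) == window and all(len(m.split()) <= 2 for m in recent)


def needs_new_topic(
    state: UserState, topic_changed: bool, rounds: int, threshold: int = 5
) -> bool:
    """Judgment stage: decide whether the role should start a new topic."""
    if state in (UserState.AWKWARD_CHAT, UserState.EMOTIONAL):
        return True
    # Normal state: switch only when the same topic has run past the threshold.
    return (not topic_changed) and rounds > threshold


print(is_awkward_chat(["ok", "uh huh", "sure"]))
print(needs_new_topic(UserState.NORMAL, topic_changed=False, rounds=6))
```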

A2, after determining that a new topic needs to be initiated, selecting a new topic from the preset topic corpus to generate text data to be optimized, or generating the text data to be optimized according to a prompt word and a large language model. Specifically, selecting a new topic from the preset topic corpus, or using a prompt word and a large language model, to generate the text data to be optimized is consistent with the relevant steps disclosed in the first embodiment and is not repeated here.

In addition, the application provides a method for avoiding an infinite loop. The above process involves multiple calls to the interception strategy and the topic strategy. Each time the intelligent dialogue system needs to output text, the dialogue request is given a unique number, and each numbered request passes through each strategy at most 2 times in sequence; once this count is exceeded, the strategy is skipped, so no infinite loop is entered. The threshold of 2 may be replaced by any reasonable value, and the application does not limit this setting.
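One way to realize this guard is a per-request, per-strategy pass counter, sketched below. The class name `StrategyPassGuard`, the use of a UUID as the unique number, and the strategy labels are assumptions; only the limit of 2 passes comes from the description.

```python
import uuid
from collections import Counter


class StrategyPassGuard:
    """Limits how many times one dialogue request may pass through a strategy."""

    def __init__(self, max_passes: int = 2) -> None:
        self.max_passes = max_passes
        self._passes: Counter = Counter()

    def new_request(self) -> str:
        """Assign a unique number to a dialogue request (a UUID here)."""
        return uuid.uuid4().hex

    def allow(self, request_id: str, strategy: str) -> bool:
        """Record one pass; return False once the limit is reached."""
        key = (request_id, strategy)
        if self._passes[key] >= self.max_passes:
            return False  # caller should skip the strategy
        self._passes[key] += 1
        return True


guard = StrategyPassGuard()
request_id = guard.new_request()
for attempt in range(3):
    print(attempt, guard.allow(request_id, "interception"))
```

The third attempt is refused, so the caller skips the interception strategy for that request while other strategies and other requests keep their own counts.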

A3, in response to the generated text data to be optimized, optimizing the text data to be optimized according to a preset interception strategy to generate target text data. Specifically, optimizing the text data to be optimized according to the preset interception strategy is consistent with the relevant steps disclosed in the first embodiment and is not repeated here.

A4, determining the speech content of the dialogue role for the user in the session according to the target text data. The generated target text data is output at this point as the reply to the user.

It will be appreciated that steps A1-A4 constitute one round of dialogue and may be performed multiple times in the intelligent dialogue system.

In addition, as shown in fig. 3, the embodiment of the present application further provides a method for maintaining the strategies used in the personalized dialogue method disclosed in the foregoing embodiments, specifically as follows:

The preset cold-talk strategy, interception strategy, and topic strategy are strategies in a strategy library that is built up through proactive manual operation and proactive technical collection; the strategy library stores and manages the three strategies in a database. When a strategy is used, fast access methods such as an in-memory cache or a non-relational data cache can be adopted. The strategies are continuously configured and upgraded by associating the user's characteristics, the role setting, the current session content, the historical session content, the content and emotion of the most recent dialogue rounds, and the like. Strategy judgment takes conditions (such as the emotion analysis result and user attribute mentioned for the cold-talk strategy) as input and judges, according to the configuration recorded for the strategy in the database, whether they fall within the strategy's output range; if not, the originally generated text data is output directly. Strategy judgment is implemented as an abstract, independent module that all strategies can use, although the topic, cold-talk, and interception strategies may each call a different judgment mode depending on the situation.
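A shared strategy-judgment module of this kind might be sketched as a simple condition matcher over configuration records. The `StrategyConfig` structure, the condition names, and the accepted values below are hypothetical; the application does not specify how the configuration is stored or matched.

```python
from dataclasses import dataclass, field


@dataclass
class StrategyConfig:
    """One strategy record as it might be loaded from the strategy library."""
    name: str
    # condition name -> set of input values that satisfy the condition
    conditions: dict[str, set[str]] = field(default_factory=dict)


def strategy_applies(config: StrategyConfig, inputs: dict[str, str]) -> bool:
    """True if every configured condition is met by the given inputs.

    A config with no conditions always applies; a missing input never matches.
    """
    return all(
        inputs.get(key) in accepted for key, accepted in config.conditions.items()
    )


# Hypothetical cold-talk configuration.
COLD_TALK = StrategyConfig(
    name="cold_talk",
    conditions={"user_attribute": {"returning"}, "weather": {"storm", "heavy_rain"}},
)

inputs = {"user_attribute": "returning", "weather": "storm", "emotion": "neutral"}
print(strategy_applies(COLD_TALK, inputs))
```

When `strategy_applies` returns `False`, the caller would output the originally generated text data unchanged, as described above.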

When strategy judgment determines that output is needed, the system intelligently decides which corpus entry to output, based either on the strategy's existing fixed corpus and topic corpus or on the large language model configuration fixed for the strategy. The cold-talk strategy and the interception strategy can also enter the topic strategy logic to output a topic; notably, when the topic strategy is entered directly in this way, its judgment stage is skipped.

For the cold-talk strategy, the conditions under which cold talk is required may be designed manually through operational means. The conditions support geographic location, the interval since the last conversation, user attributes, weather (such as severe or abnormal weather), upcoming date features (such as an approaching special festival or holiday), and even the emotion and topic at the end of the last conversation, and they can be continuously enriched and upgraded. The output logic of the cold-talk strategy may output corpus entries from a manually designed fixed corpus or topic corpus, or the entries may be generated in batches by having a large language model analyze conversation-opening sentences found on the public internet. The application does not restrict the specific scope of the strategy.

For the interception policy, the conditions requiring interception may be designed manually through operational means. The conditions support text inclusion, emotion analysis, the current period, and the like, and can be continuously enriched and upgraded. The output logic of the interception policy may be configured, also through operational means, as fixed replacement conditions or a fixed large language model. The application does not restrict the specific scope of the policy.

For the topic strategy, the conditions for changing topics may be designed manually through operational means. The conditions support scenarios such as user emotion judgment and dialogue round count judgment, and can be continuously enriched and upgraded. The application does not restrict the specific scope of the strategy.

Example three

For the first and second embodiments, as shown in fig. 4, the present application further provides a personalized dialogue system, including:

A cold-talk module 410, configured to, in response to a user creating a dialogue role or a preset time threshold being reached since the end of the last session, invoke a preset cold-talk strategy to determine whether the dialogue role actively initiates a session;

a text generation module 420, configured to select a fixed corpus or a large language model to generate text data to be optimized in response to the dialog role initiating a session;

The interception replacing module 430 is configured to respond to the generated text data to be optimized, and determine whether to invoke a preset interception policy to perform optimization processing on the text data to be optimized according to a text inclusion relation, an emotion analysis result and a current period for the text data to be optimized;

And a speech output module 440, configured to determine, according to the target text data, speech content of the dialog role for the user in the session.

In some implementation scenarios, the system further includes a topic module (not illustrated in the figure) configured to invoke a preset topic policy to determine whether a new topic needs to be initiated in response to a utterance of a user in the session; the topic module is also used for selecting new topics in a preset topic corpus to generate text data to be optimized after judging that the new topics need to be initiated, or generating the text data to be optimized according to the prompt words and the large language model; the interception replacing module 430 is further configured to, in response to the generated text data to be optimized, perform optimization processing on the text data to be optimized according to a preset interception policy to generate target text data; the speech output module 440 is further configured to determine, according to the target text data, speech content of the dialog character for the user in the session.

In some implementations, the cold-talk module 410 is further configured to search historical session data corresponding to the user and determine a user emotion of the user according to the historical session data; determine whether the user attribute is a new user according to whether historical session data matching the user exists; and determine whether the dialogue role initiates a session according to the user emotion, the current date characteristic, the current weather characteristic of the user's location, and the user attribute.

In some implementation scenarios, the interception replacing module 430 is further configured to determine, according to a current period and an optimized mapping relationship between an emotion analysis result corresponding to the text data to be optimized and an optimization processing determination result, whether a first optimization process is required for the text data to be optimized, where the optimization processing determination result includes an optimization processing result and a non-optimization processing result; if the optimization processing judgment result matched with the emotion analysis result and the current period is an optimization processing result, performing first optimization processing on the text data to be optimized; if the optimization processing judging result matched with the emotion analysis result and the current period is a result without optimization processing, judging whether second optimization processing is needed to be performed on the text data to be optimized according to the text containing relation of the text data to be optimized; if the text inclusion relation does not contain the appointed text or the appointed text synonymous text, the text data to be optimized does not need to be optimized; and if the text inclusion relation contains the appointed text or the appointed text synonymous text, performing second optimization processing on the text data to be optimized.

In some implementations, the interception replacing module 430 is further configured to perform a first optimization process on the text data to be optimized in response to invoking a preset interception policy, and invoke a preset topic policy to select a new topic from a preset topic corpus to generate initial optimized text data, or generate the initial optimized text data according to a preset prompt word and a large language model; if the text data to be optimized does not need to be optimized, taking the text data to be optimized as target text data; and in response to the second optimization processing of the text data to be optimized, replacing the appointed text or the appointed text synonymous text in the text data to be optimized by using the configured word pairs to generate target text data.

In some implementation scenarios, the interception replacing module 430 is further configured to determine whether a second optimization process is required to be performed on the initial optimized text data according to whether a specified text or a synonymous text of the specified text is included in the text inclusion relationship corresponding to the initial optimized text data; if the text containing relation corresponding to the initial optimized text data contains a designated text or a synonymous text of the designated text, judging that the initial optimized text data needs to be subjected to second optimization processing; otherwise, judging that the initial optimized text data does not need to be optimized; in response to the need for second optimization processing of the initial optimized text data, replacing the specified text or the specified text synonymous text in the initial optimized text data with the configured word pairs to generate target text data; and taking the initial optimized text data as target text data in response to no optimization processing is required for the initial optimized text data.

In some implementation scenarios, the topic module is further configured to acquire and determine a current user state according to the historical speech content of the user in the session, where the user state includes an awkward chat state, an emotional state, and a normal state; responding to the state of the user as an awkward chat state or an emotion state, judging that a new topic needs to be initiated; responding to the state of the user as a normal state, acquiring a current topic and judging whether a new topic needs to be initiated according to the number of dialogue turns in the session; if the current topic in the session is unchanged and the number of the dialog turns exceeds a preset threshold, initiating a new topic; if the current topic in the session changes, no new topic is initiated.

Example four

Corresponding to all the embodiments described above, an embodiment of the present application provides an electronic device, including: one or more processors; and a memory associated with the one or more processors, the memory for storing program instructions that, when read for execution by the one or more processors, perform the following:

Fig. 5 illustrates an architecture of an electronic device, which may include a processor 510, a video display adapter 511, a disk drive 512, an input/output interface 513, a network interface 514, and a memory 520, among others. The processor 510, the video display adapter 511, the disk drive 512, the input/output interface 513, the network interface 514, and the memory 520 may be communicatively connected by a bus 530.

The processor 510 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits, and is used to execute related programs to implement the technical solution provided by the present application.

The memory 520 may be implemented in the form of ROM (Read Only Memory), RAM (Random Access Memory), static storage, dynamic storage, etc. The memory 520 may store an operating system 521 for controlling the execution of the electronic device 500, and a Basic Input Output System (BIOS) 522 for controlling the low-level operation of the electronic device 500. In addition, a web browser 523, a data storage management system 524, an icon font processing system 525, and the like may also be stored. The icon font processing system 525 may be an application program that implements the operations of the foregoing steps in the embodiment of the present application. In general, when the technical solution provided by the present application is implemented by software or firmware, relevant program codes are stored in the memory 520 and invoked by the processor 510 to be executed.

The input/output interface 513 is used for connecting with an input/output module to realize information input and output. The input/output module may be configured as a component in a device (not shown) or may be external to the device to provide corresponding functionality. Wherein the input devices may include a keyboard, mouse, touch screen, microphone, various types of sensors, etc., and the output devices may include a display, speaker, vibrator, indicator lights, etc.

The network interface 514 is used to connect communication modules (not shown) to enable communication between the device and other devices. The communication module may communicate in a wired manner (such as USB, network cable, etc.) or in a wireless manner (such as mobile network, Wi-Fi, Bluetooth, etc.).

Bus 530 includes a path to transfer information between components of the device (e.g., processor 510, video display adapter 511, disk drive 512, input/output interface 513, network interface 514, and memory 520).

In addition, the electronic device 500 may also obtain information of specific acquisition conditions from the virtual resource object acquisition condition information database, for performing condition judgment, and so on.

It should be noted that although the above devices only show the processor 510, the video display adapter 511, the disk drive 512, the input/output interface 513, the network interface 514, the memory 520, the bus 530, etc., in the specific implementation, the device may include other components necessary to achieve normal execution. Furthermore, it will be appreciated by those skilled in the art that the apparatus may include only the components necessary to implement the present application, and not all of the components shown in the drawings.

From the above description of embodiments, it will be apparent to those skilled in the art that the present application may be implemented in software plus a necessary general hardware platform. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., and includes several instructions for causing a computer device (which may be a personal computer, a cloud server, or a network device, etc.) to execute the method described in the embodiments or some parts of the embodiments of the present application.

Example five

Corresponding to all the above embodiments, the embodiments of the present application further provide a computer-readable storage medium, characterized in that it stores a computer program that causes a computer to perform the operations of:

In this specification, the embodiments are described in a progressive manner; identical and similar parts of the embodiments may be referred to each other, and each embodiment mainly describes its differences from the other embodiments. In particular, the system embodiments are described relatively briefly because they are substantially similar to the method embodiments; for relevant details, refer to the description of the method embodiments. The system embodiments described above are merely illustrative: the units illustrated as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the application without creative effort.

The foregoing description of the preferred embodiments of the application is not intended to limit the application to the precise form disclosed, and any such modifications, equivalents, and alternatives falling within the spirit and scope of the application are intended to be included within the scope of the application.

Claims

1. A method of personalizing a conversation, the method comprising:

2. The method of claim 1, wherein the corpus comprises a fixed corpus and a topic corpus, the method further comprising:

3. The method according to claim 1, wherein the determining whether to invoke a preset interception policy to perform optimization processing on the text data to be optimized according to the text inclusion relation, the emotion analysis result and the current period of time for the text data to be optimized includes:

4. The method of claim 1, wherein invoking a preset interception policy to optimize the text data to be optimized comprises:

5. The method of claim 2, wherein the invoking the preset topic policy to determine whether a new topic needs to be initiated comprises:

If the current topic in the session changes, no new topic is initiated.

6. A personalized dialog system, the system comprising:

The cold-talk module is used for, in response to a user creating a dialogue role or a preset time threshold being reached since the end of the last session, determining according to a preset cold-talk strategy whether the dialogue role actively initiates a session;

The interception replacing module is used for responding to the generated text data to be optimized, and carrying out optimization processing on the text data to be optimized according to a preset interception strategy so as to generate target text data;

7. An electronic device, the electronic device comprising:

one or more processors;

and a memory associated with the one or more processors, the memory for storing program instructions that, when read for execution by the one or more processors, perform the method of any of claims 1-5.

8. A computer readable storage medium, characterized in that it stores a computer program, which causes a computer to perform the method of any one of claims 1-5.