CN114090755A - Reply sentence determination method and device based on knowledge graph and electronic equipment - Google Patents

Info

Publication number
CN114090755A
Authority
CN
China
Prior art keywords
node
user
chain
knowledge graph
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111446164.3A
Other languages
Chinese (zh)
Inventor
詹奕深
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202111446164.3A priority Critical patent/CN114090755A/en
Publication of CN114090755A publication Critical patent/CN114090755A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/3332 Query translation
    • G06F16/3334 Selection or weighting of terms from queries, including natural language queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3343 Query execution using phonetics
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Animal Behavior & Ethology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Acoustics & Sound (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a knowledge-graph-based reply sentence determination method and apparatus, and an electronic device. The method comprises: determining service information corresponding to the voice information of the user at the current moment; acquiring a historical dialogue data set of the application field corresponding to the service information, and generating a flow knowledge graph from it; querying the flow knowledge graph according to the service information to determine a first node corresponding to the service information; generating paths that take the first node as the start node and each leaf node in the flow knowledge graph as an end node, to obtain a candidate path set; performing intention recognition, entity extraction, and flow extraction on the voice information at the current moment to obtain user intention information, at least one entity, and a user flow chain, respectively; determining a target path in the candidate path set according to the user intention information and the at least one entity; and determining a jump node of the user flow chain according to the user flow chain and the target path, and querying a reply knowledge graph according to the jump node to obtain the target sentence.

Description

Reply sentence determination method and device based on knowledge graph and electronic equipment
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a reply sentence determination method and device based on a knowledge graph and electronic equipment.
Background
Dialog management (DM) refers to the process that controls a human-machine dialog. Specifically, the DM determines how to recognize and respond to the user's real intention at the current moment based on the history of the human-machine dialog. Its most common application is the task-driven multi-turn dialog, in which the user either has an explicit goal, such as ordering a meal or booking a ticket, or has a complex requirement with several constraints that must be stated over multiple turns before the final goal becomes clear. The user may also keep modifying or refining the requirement during the conversation. When the user's stated requirement is not specific or clear enough, the machine can help the user find a satisfactory result by asking, clarifying, or confirming.
Current dialog management is generally divided into two tasks: dialog state tracking (DST), which maintains the dialog state, and dialog policy (DP), which generates system decisions. Specifically, in DST the dialog state at time T+1 depends on the state at the previous time T, the system behavior at time T, and the user behavior at the current time T+1. The DP generates a system behavior from the dialog state maintained by the DST, deciding what to do next. Here, the behaviors are the user input observed during Natural Language Understanding (NLU) and the feedback behavior of the system during Natural Language Generation (NLG).
However, conventional DST and DP rely on a conventional relational database and describe the state and use of the current session through multi-table relationships between fields and sessions. This makes session state transitions difficult to maintain and provides no intuitive display of the session logic or the current session state. In addition, existing dialog management systems are implemented with full-text retrieval or deep-learning semantic matching, so they cannot truly understand the user's real intention, lack inference capability, and are of limited practical use.
Disclosure of Invention
In order to solve the above problems in the prior art, embodiments of the present application provide a knowledge-graph-based reply sentence determination method and apparatus, and an electronic device, which can display the session logic and the current session state intuitively and introduce inference capability, so as to better understand the real intention of the user and improve practicability.
In a first aspect, an embodiment of the present application provides a method for determining a reply sentence based on a knowledge graph, including:
determining, according to the voice information of the user at the current moment, the service information corresponding to that voice information;
acquiring a historical dialogue data set of the application field corresponding to the service information, and generating a flow knowledge graph according to the historical dialogue data set, wherein the flow knowledge graph is used for identifying the flow logic of the dialogue stages between a user and an agent in the application field corresponding to the service information;
querying the flow knowledge graph according to the service information, and determining a first node corresponding to the service information;
generating paths by taking the first node as the start node and each leaf node in the flow knowledge graph as an end node, and combining all generated paths to obtain a candidate path set;
performing intention recognition on the voice information at the current moment to obtain user intention information, performing entity extraction on the voice information at the current moment to obtain at least one entity, and performing flow extraction on the voice information at the current moment to obtain a user flow chain;
determining a target path in the candidate path set according to the user intention information and the at least one entity;
determining a jump node of the user flow chain according to the user flow chain and the target path, querying a reply knowledge graph according to the jump node to obtain a target sentence, and pushing the target sentence to the agent to reply to the voice information of the user at the current moment, wherein the reply knowledge graph is used for identifying the reply sentences whose use frequency in the application field is greater than a threshold value and the logical relationships among those reply sentences.
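The candidate path step above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the flow knowledge graph is stored as an adjacency list keyed by stage name; the `FlowGraph` type, the `candidate_paths` function, and the sample graph are hypothetical and are not taken from the application.

```python
from typing import Dict, List

# Hypothetical encoding (an assumption): each key is a node of the flow
# knowledge graph, each value lists the nodes it can jump to.
FlowGraph = Dict[str, List[str]]


def candidate_paths(graph: FlowGraph, first_node: str) -> List[List[str]]:
    """Enumerate every path from first_node to a leaf node (no unvisited successors)."""
    paths: List[List[str]] = []

    def walk(node: str, path: List[str]) -> None:
        # Ignore successors already on the path so that a cycle cannot loop forever.
        successors = [n for n in graph.get(node, []) if n not in path]
        if not successors:
            paths.append(path)
            return
        for nxt in successors:
            walk(nxt, path + [nxt])

    walk(first_node, [first_node])
    return paths


# Illustrative graph using outbound call stage names as nodes.
graph = {
    "connection": ["promotion"],
    "promotion": ["question_answer", "confirmation"],
    "question_answer": ["confirmation"],
    "confirmation": ["ending"],
    "ending": [],
}
paths = candidate_paths(graph, "promotion")
```

With `"promotion"` as the first node, the sketch collects two candidate paths, one passing through the question-and-answer node and one jumping straight to confirmation. How the target path is then selected from this set is a separate step and is not modeled here.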
In a second aspect, an embodiment of the present application provides a reply sentence determination apparatus based on a knowledge graph, including:
an analysis module, used for determining, according to the voice information of the user at the current moment, the service information corresponding to that voice information;
a graph construction module, used for acquiring a historical dialogue data set of the application field corresponding to the service information and generating a flow knowledge graph according to the historical dialogue data set, wherein the flow knowledge graph is used for identifying the flow logic of the dialogue stages between a user and an agent in the application field corresponding to the service information;
a query module, used for querying the flow knowledge graph according to the service information and determining a first node corresponding to the service information;
a processing module, used for generating paths by taking the first node as the start node and each leaf node in the flow knowledge graph as an end node, and combining all generated paths to obtain a candidate path set;
the analysis module is further used for performing intention recognition on the voice information at the current moment to obtain user intention information, performing entity extraction on the voice information at the current moment to obtain at least one entity, and performing flow extraction on the voice information at the current moment to obtain a user flow chain;
the processing module is further used for determining a target path in the candidate path set according to the user intention information and the at least one entity;
and the query module is further used for determining a jump node of the user flow chain according to the user flow chain and the target path, querying a reply knowledge graph according to the jump node to obtain a target sentence, and pushing the target sentence to the agent to reply to the voice information of the user at the current moment, wherein the reply knowledge graph is used for identifying the reply sentences whose use frequency in the application field is greater than a threshold value and the logical relationships among those reply sentences.
In a third aspect, an embodiment of the present application provides an electronic device, including: a processor coupled to the memory, the memory for storing a computer program, the processor for executing the computer program stored in the memory to cause the electronic device to perform the method of the first aspect.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium having a computer program stored thereon, the computer program causing a computer to perform the method according to the first aspect.
In a fifth aspect, embodiments of the present application provide a computer program product comprising a non-transitory computer-readable storage medium storing a computer program, the computer program being operable to cause a computer to perform the method according to the first aspect.
The implementation of the embodiment of the application has the following beneficial effects:
In the embodiments of the present application, the voice information of the user at the current moment is analyzed to determine the corresponding service information, and historical dialogue data of the application field to which the service information belongs is then acquired to build a flow knowledge graph. The flow knowledge graph is queried with the service information to determine a first node corresponding to the service information; paths are then generated with the first node as the start point and each leaf node of the flow knowledge graph as an end point, and the generated paths are screened to obtain a target path. Finally, a jump node of the user flow chain is determined from the target path and the user flow chain corresponding to the voice information of the user at the current moment, and a reply knowledge graph is queried according to the jump node to obtain a target sentence with which to reply to the user. In this way, information such as services, scenes, stages, nodes, intentions, and entities is extracted in outbound session management to construct a flow knowledge graph that reflects the session flow logic, so that the session logic and the current session state are displayed intuitively. Meanwhile, the knowledge graph is used to imitate the abstraction pattern of neurons in the human brain, and reasoning capability is introduced to analyze the flow of the conversation script and the real intention of the user, so that the resulting reply sentence better matches the current scene and the user's real intention, improving practicability, service quality, and customer satisfaction.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of the hardware structure of a knowledge-graph-based reply sentence determination apparatus according to an embodiment of the present application;
Fig. 2 is a schematic flowchart of a knowledge-graph-based reply sentence determination method according to an embodiment of the present application;
Fig. 3 is a schematic flowchart of a method for generating a flow knowledge graph from a historical dialogue data set according to an embodiment of the present application;
Fig. 4 is a schematic diagram of dialogue data according to an embodiment of the present application;
Fig. 5 is a schematic diagram of a flow chain according to an embodiment of the present application;
Fig. 6 is a schematic flowchart of a method for clustering at least one flow data chain N times to obtain a cluster data chain set according to an embodiment of the present application;
Fig. 7 is a schematic flowchart of a method for determining a target path in a candidate path set according to user intention information and at least one entity according to an embodiment of the present application;
Fig. 8 is a schematic flowchart of a method for determining a jump node of a user flow chain according to the user flow chain and a target path, and querying a reply knowledge graph according to the jump node to obtain a target sentence, according to an embodiment of the present application;
Fig. 9 is a schematic diagram of a user flow chain with an inserted undefined state node according to an embodiment of the present application;
Fig. 10 is a schematic diagram of a target path according to an embodiment of the present application;
Fig. 11 is a block diagram of the functional modules of a knowledge-graph-based reply sentence determination apparatus according to an embodiment of the present application;
Fig. 12 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art without any inventive work based on the embodiments in the present application are within the scope of protection of the present application.
The terms "first," "second," "third," and "fourth," etc. in the description and claims of this application and in the accompanying drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, result, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
First, referring to fig. 1, fig. 1 is a schematic diagram of a hardware structure of a reply statement determination device based on a knowledge graph according to an embodiment of the present application. The knowledge-graph based reply sentence determination apparatus 100 includes at least one processor 101, a communication line 102, a memory 103, and at least one communication interface 104.
In this embodiment, the processor 101 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling the execution of programs according to the present application.
The communication line 102 may include a path that carries information between the aforementioned components.
The communication interface 104 may be any transceiver-like device (e.g., an antenna) for communicating with other devices or communication networks, such as Ethernet, a radio access network (RAN), or a wireless local area network (WLAN).
The memory 103 may be, but is not limited to, a read-only memory (ROM) or other type of static storage device that can store static information and instructions, a random access memory (RAM) or other type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact disc, laser disc, optical disc, digital versatile disc, Blu-ray disc, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.
In this embodiment, the memory 103 may be independent and connected to the processor 101 through the communication line 102. The memory 103 may also be integrated with the processor 101. The memory 103 provided in the embodiments of the present application may generally have a nonvolatile property. The memory 103 is used for storing computer-executable instructions for executing the scheme of the application, and is controlled by the processor 101 to execute. The processor 101 is configured to execute computer-executable instructions stored in the memory 103, thereby implementing the methods provided in the embodiments of the present application described below.
In alternative embodiments, computer-executable instructions may also be referred to as application code, which is not specifically limited in this application.
In alternative embodiments, processor 101 may include one or more CPUs, such as CPU0 and CPU1 of FIG. 1.
In alternative embodiments, the knowledge-graph based reply sentence determination apparatus 100 may include a plurality of processors, such as the processor 101 and the processor 107 in fig. 1. Each of these processors may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor. A processor herein may refer to one or more devices, circuits, and/or processing cores for processing data (e.g., computer program instructions).
In an alternative embodiment, the knowledge-graph-based reply sentence determination apparatus 100 may be a server, for example an independent server, or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a content delivery network (CDN), and big data and artificial intelligence platforms. The knowledge-graph-based reply sentence determination apparatus 100 may further include an output device 105 and an input device 106. The output device 105 is in communication with the processor 101 and may display information in a variety of ways. For example, the output device 105 may be a liquid crystal display (LCD), a light emitting diode (LED) display device, a cathode ray tube (CRT) display device, a projector, or the like. The input device 106 is in communication with the processor 101 and may receive user input in a variety of ways. For example, the input device 106 may be a mouse, a keyboard, a touch screen device, or a sensing device, among others.
The above-described knowledge-graph-based reply sentence determination apparatus 100 may be a general-purpose device or a dedicated device. The present embodiment does not limit the type of the knowledge-graph-based reply sentence determination apparatus 100.
Next, it should be noted that the embodiments disclosed in the present application may acquire and process related data based on artificial intelligence technology. Among them, Artificial Intelligence (AI) is a theory, method, technique and application system that simulates, extends and expands human Intelligence using a digital computer or a machine controlled by a digital computer, senses the environment, acquires knowledge and uses the knowledge to obtain the best result.
The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.
Finally, the knowledge-graph-based reply sentence determination method can be applied to scenarios such as e-commerce sales, offline in-store sales, service promotion, outbound telephone calls, and social platform promotion. The present application mainly takes the outbound call scenario as an example; implementations in other scenarios are similar and are not described again here.
The knowledge-graph-based reply sentence determination method disclosed in the present application is explained below.
Referring to Fig. 2, Fig. 2 is a schematic flowchart of a knowledge-graph-based reply sentence determination method according to an embodiment of the present application. The method comprises the following steps:
201: Determine the service information corresponding to the voice information at the current moment according to the voice information of the user at the current moment.
In this embodiment, the voice information may be recognized to obtain a text, and the text may then be analyzed to determine the service information requested by the voice information at the current moment. Specifically, acoustic features of the voice information may be obtained, and the dialect category of the voice information determined from them. An audio transposition formula corresponding to the dialect category is then acquired and used to convert the voice information into standard speech. Feature extraction is performed on the standard speech to obtain audio features, which are matched in a preset neural network to obtain a pinyin text. The pinyin text may consist of at least one first pinyin element text, where a first pinyin element text is either an initial or a final.
In this embodiment, after the pinyin text is obtained, each first pinyin element text in the pinyin text may be matched in the neural network to obtain at least one first character in one-to-one correspondence with the at least one first pinyin element text. The at least one first character is then arranged in the order in which the corresponding first pinyin element texts appear in the pinyin text, which yields the text.
In this embodiment, after the text is obtained, keyword extraction may be performed on it, and the service information corresponding to the text determined from the extracted keywords. The keywords can further be used to determine the scene information of the text under the service corresponding to the service information, as well as the flow stage information. For example, the flow stage information identifies which stage of the whole session the current dialogue is in. Taking the outbound call scenario as an example, the session may be divided into a connection stage, a promotion stage, a question-and-answer stage, a confirmation stage, and an ending stage. This further narrows the service range of the voice information of the user at the current moment, so that the subsequent intention analysis and reply recommendation are more accurate.
Specifically, suppose the voice information of the user yields the text "How much is the interest on the loan?". Keyword extraction then yields the keywords "loan" and "interest". Matching on these keywords shows that the service information requested by the voice information is the loan service and that the scene information is the loan scene. Meanwhile, by analyzing the tone of the voice information and the sentence pattern of the text, together with the keywords, the flow stage of the voice information can be determined: the question-and-answer stage.
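The loan example can be sketched in Python as follows. This is only an illustration under stated assumptions: the lookup table `SERVICE_TABLE`, the substring-based `extract_keywords`, and the question-mark heuristic in `guess_stage` are hypothetical stand-ins, since the application does not specify how keyword matching or sentence pattern analysis is implemented.

```python
from typing import Dict, FrozenSet, List, Optional

# Hypothetical lookup table (an assumption): a required keyword set mapped to
# service and scene labels.
SERVICE_TABLE: Dict[FrozenSet[str], Dict[str, str]] = {
    frozenset({"loan", "interest"}): {"service": "loan service", "scene": "loan scene"},
}


def extract_keywords(text: str, vocabulary: List[str]) -> List[str]:
    """Return the vocabulary words that occur in the text (case-insensitive substring match)."""
    lowered = text.lower()
    return [word for word in vocabulary if word in lowered]


def match_service(keywords: List[str]) -> Optional[Dict[str, str]]:
    """Return the first table entry whose required keywords were all extracted."""
    found = set(keywords)
    for required, info in SERVICE_TABLE.items():
        if required <= found:
            return info
    return None


def guess_stage(text: str) -> str:
    """Crude stand-in for sentence pattern analysis: a question maps to the Q&A stage."""
    return "question-and-answer" if text.strip().endswith("?") else "unknown"


text = "How much is the interest on the loan?"
keywords = extract_keywords(text, ["loan", "interest", "insurance"])
service_info = match_service(keywords)
stage = guess_stage(text)
```

Here the extracted keywords are "loan" and "interest", which match the single table entry, and the trailing question mark places the utterance in the question-and-answer stage. A production system would likely replace each of these helpers with a trained model.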
202: Acquire a historical dialogue data set of the application field corresponding to the service information, and generate a flow knowledge graph according to the historical dialogue data set.
In this embodiment, the flow knowledge graph is used to identify the flow logic of the dialogue stages between the user and the agent in the application field corresponding to the service information. For example, as described above, the dialogue in the outbound call scenario can be divided into a connection stage, a promotion stage, a question-and-answer stage, a confirmation stage, and an ending stage. The connection stage and the ending stage are usually fixed as the beginning and the end of the dialogue flow, but the order of jumps among the intermediate stages often differs. Meanwhile, because services of the same type are similar, the dialogue stages of such services also resemble one another. Therefore, in this embodiment, historical dialogue data of the application field corresponding to the service information determined in step 201 may be acquired, and a flow knowledge graph of that application field generated from the historical dialogue data.
Based on this, in the present embodiment, there is provided a method for generating a flow knowledge graph from a historical dialogue data set, as shown in fig. 3, the method comprising:
301: Perform dialogue flow extraction on each piece of historical dialogue data in the historical dialogue data set to obtain at least one process data chain.
In the present embodiment, the at least one process data chain corresponds one-to-one to the historical dialogue data in the historical dialogue data set; in other words, each piece of historical dialogue data in the historical dialogue data set corresponds to one process data chain.
In this embodiment, a process data chain (also referred to as a flow chain) is composed of flow nodes and directed line segments that identify the jump relationships between the flow nodes. A flow node identifies the stage in which the corresponding dialogue data is located. Taking an outbound call scenario as an example, the flow nodes may include an engagement stage node, a promotion stage node, a question-and-answer stage node, a confirmation stage node, and an ending stage node. Additionally, a flow node may also carry the corresponding outbound text, reply responses, user intent, and the like. Illustratively, fig. 4 shows a piece of dialogue data; performing flow extraction on it yields the flow chain shown in fig. 5.
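The structure of a flow chain described above can be sketched as follows. This is only an illustrative sketch: the class names `FlowNode` and `FlowChain`, the helper `extract_chain`, and the assumption that each dialogue turn already carries a stage label are all hypothetical and are not specified by the embodiment.

```python
from dataclasses import dataclass, field


@dataclass
class FlowNode:
    """One flow node: the stage plus optional attached data (hypothetical fields)."""
    stage: str            # e.g. "engagement", "promotion"
    utterance: str = ""   # outbound text or reply response, if any
    intent: str = ""      # user intent, if any


@dataclass
class FlowChain:
    """An ordered list of flow nodes; consecutive nodes imply a directed jump."""
    nodes: list = field(default_factory=list)

    def append(self, node):
        self.nodes.append(node)

    def jumps(self):
        # Directed (source stage, target stage) pairs between consecutive nodes.
        return [(a.stage, b.stage) for a, b in zip(self.nodes, self.nodes[1:])]


def extract_chain(turns):
    """Build a chain from (stage, utterance) pairs.

    Assumption: the stage of each turn has already been recognized upstream.
    """
    chain = FlowChain()
    for stage, text in turns:
        chain.append(FlowNode(stage=stage, utterance=text))
    return chain


chain = extract_chain([
    ("engagement", "Hello, this is the bank calling."),
    ("promotion", "We have a new loan product."),
    ("question_and_answer", "How much is the interest on the loan?"),
])
print(chain.jumps())
```

Storing only the ordered nodes and deriving the directed jumps from adjacency keeps the chain compact; an explicit edge list would be an equally valid choice.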
302: Perform clustering on the at least one process data chain N times to obtain a clustered data chain set.
In the present embodiment, N is an integer greater than or equal to 1. Specifically, multiple clustering processes are performed on at least one process data chain, similar process data chains are fused, and the difference between one or more cluster data chains included in a finally obtained cluster data chain set is large, so that one or more conversation process logics in the application field can be comprehensively reflected. In other words, through N clusters, a representative conversation process in the application field can be extracted comprehensively.
For example, the present embodiment provides a method for performing clustering processing on at least one flow data chain N times to obtain a clustered data chain set, as shown in fig. 6, where the method includes:
601: In the i-th clustering process, perform clustering on the initial data chain set A_i to obtain at least one clustering result.
In this embodiment, the distance between any two data chains included in each of the at least one clustering result is smaller than the distance threshold B_i corresponding to the i-th clustering process. In short, the distance threshold B_i is different for each clustering round; in particular, B_i may be a value that gradually increases as the number of clustering rounds increases. i is an integer greater than or equal to 0 and less than or equal to N, and when i is equal to 1, the initial data chain set A_i is the data chain set formed by the at least one process data chain.
602: Perform fusion processing on the data chains in each clustering result to obtain at least one fused data chain corresponding one-to-one to the at least one clustering result.
In this embodiment, feature extraction may be performed on each data chain in each clustering result to obtain at least one feature vector corresponding to the data chain in each clustering result. Then, an average vector of the at least one feature vector is obtained, and data chain reconstruction is performed according to the average vector to obtain the at least one fused data chain.
603: Take the data chain set formed by the at least one fused data chain as the initial data chain set A_(i+1) of the (i+1)-th clustering process, and perform the (i+1)-th clustering process, until N clustering processes have been performed, to obtain the clustered data chain set.
In an optional embodiment, when the data chain set formed by at least one fused data chain obtained by a certain round of clustering processing is the same as the initial data chain set of the current round, the clustering processing may be stopped, and the data chain set formed by the at least one fused data chain is output as the clustered data chain set.
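Steps 601 to 603, together with the optional early stop, can be sketched as below. The sketch makes several assumptions that the embodiment does not fix: each data chain is already represented by a numeric feature vector, the distance is Euclidean, the grouping is a simple greedy pass that enforces the pairwise distance condition, and the reconstruction of a data chain from the average vector is omitted. The function names are hypothetical.

```python
import math


def cluster(vectors, threshold):
    """Greedy grouping: a vector joins a cluster only if it is closer than
    `threshold` to every member, so all pairs inside a cluster satisfy B_i."""
    clusters = []
    for v in vectors:
        for members in clusters:
            if all(math.dist(v, m) < threshold for m in members):
                members.append(v)
                break
        else:
            clusters.append([v])
    return clusters


def fuse(members):
    """Fuse one cluster into a single vector by averaging (step 602)."""
    n = len(members)
    return tuple(sum(xs) / n for xs in zip(*members))


def iterative_clustering(vectors, thresholds):
    """Run one clustering round per threshold B_1..B_N (assumed increasing)."""
    current = [tuple(v) for v in vectors]
    for b in thresholds:
        fused = [fuse(members) for members in cluster(current, b)]
        if fused == current:  # nothing merged this round: optional early stop
            break
        current = fused       # becomes A_(i+1) for the next round
    return current


# Hypothetical feature vectors for five process data chains.
chains = [(0, 0), (0, 1), (10, 10), (10, 11), (50, 50)]
print(iterative_clustering(chains, thresholds=[2, 20]))
```

With the toy input, the first round merges the two tight pairs and the second, looser round merges their averages, leaving two representative vectors.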
303: Extract the stage nodes and jump relations of each clustered data chain in the clustered data chain set, and perform deduplication on the extraction result to obtain at least one stage node and at least one jump relation.
Exemplarily, suppose that two flow chains are finally obtained after the N clustering processes: flow chain C [engagement stage, promotion stage, question-and-answer stage, promotion stage, confirmation stage, question-and-answer stage, ending stage]; and flow chain D [engagement stage, promotion stage, confirmation stage, ending stage, question-and-answer stage, confirmation stage, ending stage]. Through extraction and deduplication, the following stage nodes can be obtained: an engagement stage node, a promotion stage node, a question-and-answer stage node, a confirmation stage node, and an ending stage node. The jump relations are: the engagement stage can jump to the promotion stage; the promotion stage can jump to the confirmation stage and the question-and-answer stage; the question-and-answer stage can jump to the promotion stage and the confirmation stage; the confirmation stage can jump to the question-and-answer stage; and the ending stage can jump to the question-and-answer stage.
304: Take each of the at least one stage node as a knowledge node of the process knowledge graph, and, according to the at least one jump relation, connect each pair of knowledge nodes that have a jump relation with a directed line segment to obtain the process knowledge graph.
In this embodiment, the direction of the directed line segment is used to identify the direction of the jump between two knowledge nodes having a jump relationship.
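Steps 303 and 304 can be illustrated with a short sketch that extracts deduplicated stage nodes and directed jump relations from the clustered chains and stores them as an adjacency list. The function name `build_process_graph` and the adjacency-list representation are assumptions made for illustration; the stage labels follow flow chains C and D of the example.

```python
def build_process_graph(chains):
    """Return (nodes, edges): deduplicated stage nodes in first-seen order and,
    for each node, the deduplicated list of stages it can jump to."""
    nodes = []
    edges = {}
    for chain in chains:
        for stage in chain:
            if stage not in edges:       # deduplicate stage nodes
                nodes.append(stage)
                edges[stage] = []
        for src, dst in zip(chain, chain[1:]):
            if dst not in edges[src]:    # deduplicate jump relations
                edges[src].append(dst)
    return nodes, edges


chain_c = ["engagement", "promotion", "question_and_answer", "promotion",
           "confirmation", "question_and_answer", "ending"]
chain_d = ["engagement", "promotion", "confirmation", "ending",
           "question_and_answer", "confirmation", "ending"]

nodes, edges = build_process_graph([chain_c, chain_d])
for src in nodes:
    print(src, "->", edges[src])
```

Each key in `edges` is a knowledge node and each listed successor corresponds to one directed line segment, so the direction of a jump is preserved by which side of the mapping a stage appears on.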
In addition, in an optional implementation, in order to distinguish the process knowledge graphs of different application fields, a service node, scene nodes, and the like corresponding to the application field may further be set above the stage nodes. Specifically, the service node may be the root node of the process knowledge graph, the scene nodes representing the scenes of that service are placed at the next level, and each scene node may be connected to each stage node.
Based on this, in an optional implementation, the process knowledge graph of each field may be generated in advance. In actual use, the application field of the voice information of the user at the current time is determined, and the process knowledge graph of the corresponding field is matched among the pre-generated graphs, which improves processing efficiency.
203: Query the process knowledge graph according to the service information, and determine a first node corresponding to the service information.
In this embodiment, the corresponding service node may be determined in the process knowledge graph according to the service information, the corresponding scene node may be determined among the child nodes of the service node according to the scene information, and finally the corresponding stage node may be determined among the child nodes of the scene node according to the flow stage information and taken as the first node.
204: Generate paths with the first node as the start node and each leaf node in the process knowledge graph as an end node, and combine all generated paths to obtain a candidate path set.
In this embodiment, a candidate path set may be generated by using all leaf nodes that can be reached from the start node as end nodes and using paths from the start node to each end node as candidate paths.
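Candidate path generation can be sketched as a depth-first enumeration from the start node to every reachable leaf node. The sketch assumes the graph is given as an adjacency list, treats a node without outgoing jumps as a leaf, and skips nodes already on the current path so that cycles between stages do not cause endless paths; none of these details is prescribed by the embodiment, and the toy graph below is hypothetical.

```python
def candidate_paths(graph, start):
    """Enumerate simple paths from `start` to each reachable leaf node."""
    paths = []

    def dfs(node, path):
        successors = graph.get(node, [])
        if not successors:           # leaf node: the path is complete
            paths.append(path)
            return
        for nxt in successors:
            if nxt not in path:      # avoid revisiting a node on this path
                dfs(nxt, path + [nxt])

    dfs(start, [start])
    return paths


# Hypothetical process knowledge graph fragment (stage nodes only).
graph = {
    "promotion": ["question_and_answer", "confirmation"],
    "question_and_answer": ["promotion", "confirmation"],
    "confirmation": ["ending"],
    "ending": [],
}

for p in candidate_paths(graph, "question_and_answer"):
    print(" -> ".join(p))
```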
205: Perform intention recognition on the voice information at the current time to obtain user intention information, perform entity extraction on the voice information at the current time to obtain at least one entity, and perform flow extraction on the voice information at the current time to obtain a user flow chain.
In this embodiment, semantic extraction may be performed on the text obtained from the voice information at the current time, and the resulting semantic vector may then be matched against a preset intention library to determine the intention feature corresponding to the text. Specifically, the intention features pre-stored in the intention library are set according to the application field. For example, for a bank telemarketing scenario, strongly related intention features such as interest, amount, repayment time, and dividend proportion may be preset. During matching, similarity is calculated, and the intention feature most similar to the semantic vector of the text is taken as the corresponding intention feature.
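The similarity-based matching against the intention library can be sketched as follows. Cosine similarity is used here as one common choice; the embodiment does not name a specific similarity measure, and the three-dimensional vectors and the library contents are made-up placeholders rather than real semantic vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def match_intent(semantic_vector, intent_library):
    """Return the name of the intention feature most similar to the text vector."""
    return max(intent_library,
               key=lambda name: cosine(semantic_vector, intent_library[name]))


# Hypothetical intention library for a bank telemarketing scenario.
intent_library = {
    "interest": (1.0, 0.0, 0.0),
    "amount": (0.0, 1.0, 0.0),
    "repayment time": (0.0, 0.0, 1.0),
}

print(match_intent((0.9, 0.1, 0.2), intent_library))
```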
Meanwhile, in the present embodiment, each keyword in the text may be used as an entity of the text. Illustratively, for the text "How much is the interest on a credit card overdraft?" converted from the voice information, keyword extraction yields the entities "credit card", "overdraft", and "interest".
Finally, in this embodiment, the extraction manner of the user flow chain is similar to the extraction manner of the flow chain of the historical dialogue data in step 301, and details are not repeated here.
206: Determine a target path in the candidate path set according to the user intention information and the at least one entity.
Specifically, the present embodiment provides a method for determining a target path in the candidate path set according to the user intention information and the at least one entity, as shown in fig. 7, the method including:
701: Acquire end point intention information corresponding to the end point of each candidate path in the candidate path set.
In this embodiment, the dialogue data corresponding to a stage node may be looked up through the stage node at the end point of each candidate path. Intention extraction is then performed on that dialogue data to obtain the end point intention information.
702: Determine final intention information of the user according to the user intention information.
In this embodiment, the final intention information of the user may be determined according to the current intention information of the user in combination with the service and the scene corresponding to the current conversation. For example, if the current intention information of the user is "interest reduction", the service corresponding to the current session is "loan service", and the scene is "loan scene", the analysis is performed in combination with the above information, and the final intention information of the user is "loan handling with low interest".
703: Determine at least one first candidate path in the candidate path set according to the final intention information of the user.
In this embodiment, the similarity between the end point intention information corresponding to each of the at least one first candidate path and the final intention information of the user is greater than a first threshold. In other words, candidate paths whose end point intention information is similar to the final intention information of the user are selected as the first candidate paths.
704: Match the at least one entity against each first candidate path to obtain at least one path score corresponding one-to-one to the at least one first candidate path.
In this embodiment, whenever any one of the at least one entity appears in a first candidate path, the score of that path is incremented by 1. Specifically, each stage node in a candidate path has several related entities mounted on it. For each candidate path, each time one of the at least one entity is found among the entities mounted on one of its nodes, the score of the path is incremented by 1. For example, suppose the extracted entities are "credit card", "overdraft", and "interest", and candidate path 1 has four stage nodes, where the entity corresponding to stage node 1 is "loan"; the entities corresponding to stage node 2 are "loan", "borrow", and "repayment"; the entities corresponding to stage node 3 are "credit card", "bond", "overdraft", and "savings card"; and the entities corresponding to stage node 4 are "interest" and "principal". The entities "credit card", "overdraft", and "interest" then occur 3 times in total in candidate path 1, so the path score of candidate path 1 is 3.
705: Take the first candidate path corresponding to the highest of the at least one path score as the target path.
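Steps 703 to 705 can be sketched as a filter followed by entity-count scoring. The sketch assumes that the similarity between each path's end point intention and the user's final intention has already been computed and is supplied as `end_similarity`; the dictionary layout, the function names, and the second and third candidate paths are hypothetical. Candidate path 1 reproduces the worked example above.

```python
def path_score(node_entities, user_entities):
    """Add 1 each time a user entity appears among the entities of a stage node."""
    return sum(1 for mounted in node_entities
               for entity in user_entities if entity in mounted)


def select_target_path(candidates, user_entities, first_threshold):
    """Keep paths whose end point intention similarity exceeds the threshold,
    then return the one with the highest entity score (None if none qualify)."""
    first = [c for c in candidates if c["end_similarity"] > first_threshold]
    if not first:
        return None
    return max(first, key=lambda c: path_score(c["node_entities"], user_entities))


user_entities = ["credit card", "overdraft", "interest"]
candidates = [
    {"name": "path 1", "end_similarity": 0.9, "node_entities": [
        {"loan"},
        {"loan", "borrow", "repayment"},
        {"credit card", "bond", "overdraft", "savings card"},
        {"interest", "principal"},
    ]},
    {"name": "path 2", "end_similarity": 0.8, "node_entities": [
        {"loan"}, {"interest"},
    ]},
    {"name": "path 3", "end_similarity": 0.2, "node_entities": [
        {"credit card", "overdraft", "interest"}, {"interest"},
    ]},
]

print(path_score(candidates[0]["node_entities"], user_entities))  # 3
print(select_target_path(candidates, user_entities, first_threshold=0.5)["name"])
```

Path 3 would score highest on entities alone, but it is excluded first because its end point intention is not similar enough to the user's final intention.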
207: Determine a jump node of the user flow chain according to the user flow chain and the target path, query a reply knowledge graph according to the jump node to obtain a target sentence, and push the target sentence to the agent so as to reply to the voice information of the user at the current time.
In this embodiment, the reply knowledge graph is used to identify reply sentences in the application field whose frequency of use is greater than a threshold, and the logical relationships between those reply sentences. In particular, some nodes in the reply knowledge graph each represent a dialogue script, while other nodes each correspond to an entity. Therefore, after the target path is determined, the jump node of the user flow chain and the stage corresponding to the jump node may be determined according to the target path, and the jump node, together with the user intention information and the at least one entity obtained in step 205, is then input into the reply knowledge graph to generate the target sentence.
Specifically, the present embodiment provides a method for determining a jump node of the user flow chain according to the user flow chain and the target path, and obtaining a target sentence by querying the reply knowledge graph according to the jump node, as shown in fig. 8, where the method includes:
801: Insert an undetermined state node at the end of the user flow chain to obtain a first user flow chain.
802: Perform graph embedding processing on the first user flow chain to obtain a first node vector of the undetermined state node.
803: Determine, among the nodes of the target path, a second node corresponding to the last node of the user flow chain.
804: Perform graph embedding processing on the target path to obtain at least one second node vector.
In this embodiment, the node corresponding to each of the at least one second node vector is located after the second node.
805: Acquire the similarity between each second node vector and the first node vector, and take the node corresponding to the second node vector with the greatest similarity as the jump node of the user flow chain.
Specifically, in the present embodiment, an undetermined state node is used as a placeholder for the jump node of the user flow chain, so that the first node vector of the undetermined state node can be obtained by graph embedding. The second node corresponding to the last node of the user flow chain is then determined in the target path according to the mapping between the stage nodes of the dialogue that has already taken place in the user flow chain and the nodes in the target path. Finally, the first node vector of the undetermined state node (namely, the jump node) is compared for similarity with the second node vector of each stage node located after the second node in the target path, and the stage corresponding to the stage node with the highest similarity is taken as the stage of the undetermined state node.
In an optional embodiment, other nodes in the first user flow chain and the target path may also be vectorized, and then the same node is matched in the target path according to the vector of the node that has occurred in the first user flow chain, so as to find a second node corresponding to the last node of the user flow chain in the target path.
For example, fig. 9 shows a schematic diagram of a user flow chain with an inserted undetermined state node. After graph embedding processing is performed on this user flow chain, the vector of the engagement node is denoted E and the vector of the undetermined state node is denoted F. Fig. 10 is a schematic diagram of the target path corresponding to the user flow chain. After graph embedding processing is performed on the target path, the vector of the engagement node is denoted G, the vector of the promotion node is denoted H, and the vector of the question-and-answer node is denoted I.
Vector comparison readily shows that engagement node E and engagement node G are nodes of the same stage and map to each other. The undetermined state node F follows engagement node E, and the nodes that follow engagement node G in the target path are the promotion node H and the question-and-answer node I. Therefore, the similarity between the vector F of the undetermined state node and each of the vector H of the promotion node and the vector I of the question-and-answer node is calculated. Assuming that the similarity between the vector F and the vector H of the promotion node is the highest, the stage of the undetermined state node F may be set to the promotion stage.
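The comparison in step 805 and in the example of figs. 9 and 10 can be sketched as below. The graph embedding itself is assumed to have been done already; the two-dimensional vectors standing in for F, G, H, and I are invented purely for illustration, cosine similarity is one possible similarity measure, and the function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pick_jump_stage(first_node_vector, target_path, second_node_index):
    """Compare the undetermined state node's vector with every node that comes
    after the second node in the target path; return the most similar stage."""
    following = target_path[second_node_index + 1:]
    if not following:
        return None
    stage, _ = max(following, key=lambda item: cosine(first_node_vector, item[1]))
    return stage


# Placeholder embeddings: G (engagement), H (promotion), I (question and answer).
target_path = [
    ("engagement", (1.0, 0.0)),           # G, maps to engagement node E
    ("promotion", (0.6, 0.8)),            # H
    ("question_and_answer", (0.0, 1.0)),  # I
]
vector_f = (0.7, 0.7)                     # F, the undetermined state node

print(pick_jump_stage(vector_f, target_path, second_node_index=0))
```

Restricting the comparison to nodes after the second node keeps the jump from pointing back to a stage the dialogue has already passed on the target path.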
In this embodiment, after the jump node is determined, the jump node is given the properties of the corresponding stage node in the process knowledge graph, so that the reply knowledge graph can be queried according to the jump node, the corresponding script node can be found, and the script template corresponding to that script node can be extracted. Then, based on the user intention and the at least one entity, the corresponding entity nodes are searched for in the reply knowledge graph, and an entity node path is generated from those entity nodes according to the order in which the at least one entity appears in the original text. Finally, the knowledge data corresponding to each node on the entity node path is extracted and combined with the script template to generate the target sentence.
In summary, in the knowledge-graph-based reply sentence determination method provided by the present invention, the voice information of the user at the current time is analyzed to determine the corresponding service information, and historical dialogue data of the application field to which the service information belongs is then acquired to build the process knowledge graph. Next, the process knowledge graph is queried using the service information to determine the first node corresponding to the service information; paths are generated with the first node as the start point and each leaf node in the process knowledge graph as an end point; and the generated paths are screened to obtain the target path. Finally, the jump node of the user flow chain is determined from the target path and the user flow chain corresponding to the voice information of the user at the current time, and the reply knowledge graph is queried according to the jump node to obtain the target sentence with which to reply to the user. In this way, information such as services, scenes, stages, nodes, intentions, and entities is extracted during outbound session management to construct a process knowledge graph that reflects the dialogue flow logic and displays the dialogue logic and the current dialogue state intuitively. Meanwhile, the knowledge graph is used to simulate the abstraction mode of human brain neurons, introducing reasoning capability to analyze the script flow and the real intention of the user, so that the resulting reply sentence better fits the current scene and the real intention of the user, improving service quality as well as practicability and customer satisfaction.
Referring to fig. 11, fig. 11 is a block diagram illustrating functional modules of a knowledge graph-based reply sentence determination apparatus according to an embodiment of the present disclosure. As shown in fig. 11, the knowledge-graph-based reply sentence determination apparatus 1100 includes:
the analysis module 1101 is configured to determine, according to the voice information of the user at the current time, service information corresponding to the voice information at the current time;
the graph building module 1102 is configured to obtain a historical dialogue data set of an application field corresponding to the service information, and generate a process knowledge graph according to the historical dialogue data set, where the process knowledge graph is used to identify a process logic of a dialogue link between a user and an agent in the application field corresponding to the service information;
the query module 1103 is configured to query the process knowledge graph according to the service information, and determine a first node corresponding to the service information;
a processing module 1104, configured to perform path generation using the first node as a start node and each leaf node in the process knowledge graph as an end node, and combine all generated paths to obtain a candidate path set;
the analysis module 1101 is further configured to perform intent recognition on the voice information at the current time to obtain user intent information, perform entity extraction on the voice information at the current time to obtain at least one entity, and perform flow extraction on the voice information at the current time to obtain a user flow chain;
the processing module 1104 is further configured to determine a target path in the candidate path set according to the user intention information and the at least one entity;
the query module 1103 is further configured to determine a jump node of the user flow chain according to the user flow chain and the target path, query a reply knowledge graph according to the jump node to obtain a target sentence, and push the target sentence to the agent to reply the voice information of the user at the current time, where the reply knowledge graph is used to identify a reply sentence in the application field, where the usage frequency of the reply sentence is greater than a threshold, and a logical relationship between the reply sentences.
In an embodiment of the present invention, in generating a process knowledge graph from a historical dialog data set, the graph building module 1102 is specifically configured to:
performing conversation process extraction on each historical conversation data in the historical conversation data set to obtain at least one process data chain, wherein the at least one process data chain is in one-to-one correspondence with the historical conversation data in the historical conversation data set;
performing clustering processing on at least one process data chain for N times to obtain a clustered data chain set, wherein N is an integer greater than or equal to 1;
extracting the stage node and the jump relation of each clustered data chain in the clustered data chain set, and performing duplicate removal processing on the extraction result to obtain at least one stage node and at least one jump relation;
and taking each stage node in at least one stage node as a knowledge node of the process knowledge graph, and connecting two knowledge nodes with a jump relation through a directed line segment according to at least one jump relation to obtain the process knowledge graph, wherein the direction of the directed line segment is used for identifying the jump direction between the two knowledge nodes with the jump relation.
In an embodiment of the present invention, in terms of performing clustering processing on at least one process data chain for N times to obtain a clustered data chain set, the graph building module 1102 is specifically configured to:
in the i-th clustering process, performing clustering on the initial data chain set A_i to obtain at least one clustering result, wherein the distance between any two data chains contained in each clustering result in the at least one clustering result is less than a distance threshold B_i corresponding to the i-th clustering process, i is an integer greater than or equal to 0 and less than or equal to N, and when i is equal to 1, the initial data chain set A_i is a data chain set composed of the at least one process data chain;
performing fusion processing on the data chains in each clustering result to obtain at least one fused data chain, wherein the at least one fused data chain corresponds to the at least one clustering result one to one;
taking a data chain set formed by the at least one fused data chain as the initial data chain set A_(i+1) of the (i+1)-th clustering process, and performing the (i+1)-th clustering process, until N clustering processes have been performed, to obtain the clustered data chain set.
In an embodiment of the present invention, in terms of performing fusion processing on data chains in each clustering result to obtain at least one fused data chain, the graph building module 1102 is specifically configured to:
extracting features of each data chain in each clustering result to obtain at least one feature vector, wherein the at least one feature vector corresponds to the data chain in each clustering result one to one;
obtaining an average vector of at least one feature vector;
and reconstructing the data chain according to the average vector to obtain at least one fused data chain.
In an embodiment of the present invention, in determining a target path in a candidate path set according to the user intention information and the at least one entity, the processing module 1104 is specifically configured to:
acquiring end point intention information corresponding to an end point of each candidate path in the candidate path set;
determining final intention information of the user according to the intention information of the user;
determining at least one first candidate path in the candidate path set according to the final intention information of the user, wherein the similarity between the end point intention information corresponding to each first candidate path in the at least one first candidate path and the final intention information of the user is greater than a first threshold;
matching at least one entity with each first candidate path to obtain at least one path score, wherein the at least one path score is in one-to-one correspondence with the at least one first candidate path;
and taking the first candidate path corresponding to the highest score in the at least one path score as the target path.
In an embodiment of the present invention, in determining a jump node of a user flow chain according to the user flow chain and a target path, and querying a reply knowledge graph according to the jump node to obtain a target sentence, the query module 1103 is specifically configured to:
inserting an undetermined state node at the end of the user flow chain to obtain a first user flow chain;
performing graph embedding processing on the first user flow chain to obtain a first node vector of the undetermined state node;
determining, among the nodes of the target path, a second node corresponding to the last node of the user flow chain;
performing graph embedding processing on the target path to obtain at least one second node vector, wherein the node corresponding to each second node vector in the at least one second node vector is located after the second node;
and acquiring the similarity between each second node vector and the first node vector, and taking the node corresponding to the second node vector with the greatest similarity as the jump node of the user flow chain.
Referring to fig. 12, fig. 12 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure. As shown in fig. 12, the electronic device 1200 includes a transceiver 1201, a processor 1202, and a memory 1203, which are connected to each other by a bus 1204. The memory 1203 is used for storing computer programs and data, and the data stored in the memory 1203 may be transferred to the processor 1202.
The processor 1202 is configured to read the computer program in the memory 1203 to perform the following operations:
determining service information corresponding to the voice information at the current moment according to the voice information at the current moment of the user;
acquiring a historical dialogue data set of an application field corresponding to the service information, and generating a flow knowledge graph according to the historical dialogue data set, wherein the flow knowledge graph is used for identifying the flow logic of a dialogue link between a user and an agent in the application field corresponding to the service information;
inquiring a process knowledge graph according to the service information, and determining a first node corresponding to the service information;
taking the first node as an initial node, taking each leaf node in the process knowledge graph as an end node to generate a path, and combining all generated paths to obtain a candidate path set;
performing intention recognition on the voice information at the current moment to obtain user intention information, performing entity extraction on the voice information at the current moment to obtain at least one entity, and performing flow extraction on the voice information at the current moment to obtain a user flow chain;
determining a target path in the candidate path set according to the user intention information and at least one entity;
according to the user flow chain and the target path, determining a jump node of the user flow chain, inquiring a reply knowledge graph according to the jump node to obtain a target sentence, pushing the target sentence to an agent to reply the voice information of the user at the current moment, wherein the reply knowledge graph is used for identifying reply sentences of which the use frequency is greater than a threshold value in the application field and logic relations among the reply sentences.
In an embodiment of the present invention, in generating a flow knowledge graph from historical dialogue data sets, the processor 1202 is specifically configured to:
performing conversation process extraction on each historical conversation data in the historical conversation data set to obtain at least one process data chain, wherein the at least one process data chain is in one-to-one correspondence with the historical conversation data in the historical conversation data set;
performing clustering processing on at least one process data chain for N times to obtain a clustered data chain set, wherein N is an integer greater than or equal to 1;
extracting the stage node and the jump relation of each clustered data chain in the clustered data chain set, and performing duplicate removal processing on the extraction result to obtain at least one stage node and at least one jump relation;
and taking each stage node in at least one stage node as a knowledge node of the process knowledge graph, and connecting two knowledge nodes with a jump relation through a directed line segment according to at least one jump relation to obtain the process knowledge graph, wherein the direction of the directed line segment is used for identifying the jump direction between the two knowledge nodes with the jump relation.
In an embodiment of the present invention, in terms of performing clustering processing on at least one process data chain N times to obtain a clustered data chain set, the processor 1202 is specifically configured to perform the following operations:
in the i-th clustering process, performing clustering on the initial data chain set A_i to obtain at least one clustering result, wherein the distance between any two data chains contained in each clustering result in the at least one clustering result is less than a distance threshold B_i corresponding to the i-th clustering process, i is an integer greater than or equal to 0 and less than or equal to N, and when i is equal to 1, the initial data chain set A_i is a data chain set composed of the at least one process data chain;
performing fusion processing on the data chains in each clustering result to obtain at least one fused data chain, wherein the at least one fused data chain corresponds to the at least one clustering result one to one;
taking a data chain set formed by at least one fused data chain as an initial data chain set A of the (i + 1) th clustering processi+1And performing clustering processing for the (i + 1) th time until N times of clustering processing are performed to obtain a clustered data chain set.
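The iterative scheme above can be sketched as follows. The application does not specify the distance measure or the clustering algorithm, so this sketch makes two labeled assumptions: the distance between chains is one minus the `difflib.SequenceMatcher` ratio, and clusters are grown greedily so that every pair inside a cluster stays under the round's threshold B_i. Fusion is replaced here by a simple stand-in (the cluster medoid); all function names are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Callable, List

Chain = List[str]


def chain_distance(a: Chain, b: Chain) -> float:
    # Assumption: 1 - sequence similarity stands in for the chain distance.
    return 1.0 - SequenceMatcher(None, a, b).ratio()


def cluster_once(chains: List[Chain], threshold: float) -> List[List[Chain]]:
    """Greedy grouping: a chain joins a cluster only if its distance to
    every member is below the threshold, so all pairs satisfy the bound."""
    clusters: List[List[Chain]] = []
    for chain in chains:
        for cluster in clusters:
            if all(chain_distance(chain, m) < threshold for m in cluster):
                cluster.append(chain)
                break
        else:
            clusters.append([chain])
    return clusters


def medoid(cluster: List[Chain]) -> Chain:
    # Stand-in fusion: the member closest to all other members.
    return min(cluster, key=lambda c: sum(chain_distance(c, o) for o in cluster))


def iterative_clustering(
    chains: List[Chain],
    thresholds: List[float],  # one threshold B_i per round, so N = len(thresholds)
    fuse: Callable[[List[Chain]], Chain] = medoid,
) -> List[Chain]:
    current = list(chains)  # A_1: the extracted process data chains
    for threshold in thresholds:
        clusters = cluster_once(current, threshold)
        current = [fuse(cluster) for cluster in clusters]  # becomes A_{i+1}
    return current


chains = [
    ["greet", "verify", "quote", "close"],
    ["greet", "verify", "quote", "upsell", "close"],
    ["greet", "complaint", "escalate", "close"],
]
print(iterative_clustering(chains, [0.3]))
print(iterative_clustering(chains, [0.3, 0.6]))
```

Using a looser threshold in later rounds, as in the usage lines, lets the first round merge near-duplicates and later rounds merge broader variants; whether thresholds should grow or shrink across rounds is a design choice the text leaves open.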
In an embodiment of the present invention, in terms of performing fusion processing on the data chains in each clustering result to obtain at least one fused data chain, the processor 1202 is specifically configured to perform the following operations:
extracting features of each data chain in each clustering result to obtain at least one feature vector, wherein the at least one feature vector corresponds to the data chain in each clustering result one to one;
obtaining an average vector of at least one feature vector;
and reconstructing the data chain according to the average vector to obtain at least one fused data chain.
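One way to picture the fusion steps above is the following sketch. The application does not say which features are extracted or how a chain is reconstructed from the average vector, so the encoding here is purely an assumption: each chain becomes a vector of per-stage presence flags followed by per-stage relative positions, and reconstruction keeps stages present in at least half of the chains, ordered by their mean position. The names `encode`, `decode`, and `fuse` are hypothetical.

```python
from typing import List

Chain = List[str]


def encode(chain: Chain, vocab: List[str]) -> List[float]:
    """Feature vector: presence flags, then relative positions in [0, 1]."""
    n = max(len(chain) - 1, 1)
    presence = [0.0] * len(vocab)
    position = [0.0] * len(vocab)
    for idx, stage in enumerate(chain):
        k = vocab.index(stage)
        if presence[k] == 0.0:  # only the first occurrence is recorded
            presence[k] = 1.0
            position[k] = idx / n
    return presence + position


def average(vectors: List[List[float]]) -> List[float]:
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def decode(avg: List[float], vocab: List[str], min_presence: float = 0.5) -> Chain:
    """Rebuild a chain: keep frequent stages, order them by mean position."""
    v = len(vocab)
    kept = [
        (avg[v + k] / avg[k], vocab[k])  # mean position among chains that contain the stage
        for k in range(v)
        if avg[k] >= min_presence
    ]
    return [stage for _, stage in sorted(kept)]


def fuse(cluster: List[Chain]) -> Chain:
    vocab = sorted({stage for chain in cluster for stage in chain})
    return decode(average([encode(c, vocab) for c in cluster]), vocab)


cluster = [
    ["greet", "verify", "quote", "close"],
    ["greet", "verify", "quote", "upsell", "close"],
    ["greet", "verify", "quote", "close"],
]
print(fuse(cluster))
```

In this example the `upsell` stage appears in only one of three chains, so it falls below the presence cutoff and is dropped from the fused chain.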
In an embodiment of the present invention, in determining a target path in a candidate path set according to user intention information and at least one entity, the processor 1202 is specifically configured to:
acquiring end point intention information corresponding to the end point of each candidate path in the candidate path set;
determining final intention information of the user according to the intention information of the user;
determining at least one first candidate path in the candidate path set according to the final intention information of the user, wherein the similarity between the end point intention information corresponding to each first candidate path in the at least one first candidate path and the final intention information of the user is greater than a first threshold;
matching at least one entity with each first candidate path to obtain at least one path score, wherein the at least one path score is in one-to-one correspondence with the at least one first candidate path;
and taking the first candidate path corresponding to the highest score in the at least one path score as the target path.
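The two-stage selection above (filter candidate paths by intent similarity, then rank the survivors by entity matching) can be sketched as follows. The similarity measure and the scoring rule are not fixed by the text, so this sketch assumes word-level Jaccard similarity for intents and the fraction of matched entities as the path score; `CandidatePath`, its fields, and the threshold value are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class CandidatePath:
    nodes: List[str]
    end_intent: str      # intention information attached to the path's end point
    keywords: Set[str]   # assumption: terms associated with the path's nodes


def jaccard(a: str, b: str) -> float:
    # Assumption: word-overlap similarity stands in for intent similarity.
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def select_target_path(
    paths: List[CandidatePath],
    final_intent: str,
    entities: List[str],
    first_threshold: float = 0.5,
) -> Optional[CandidatePath]:
    # Stage 1: keep paths whose end point intent is close to the user's final intent.
    first_candidates = [
        p for p in paths if jaccard(p.end_intent, final_intent) > first_threshold
    ]
    if not first_candidates:
        return None

    # Stage 2: score each remaining path by the share of entities it covers.
    def score(path: CandidatePath) -> float:
        return sum(1 for e in entities if e in path.keywords) / max(len(entities), 1)

    return max(first_candidates, key=score)


paths = [
    CandidatePath(["verify", "quote", "pay"], "buy car insurance", {"car", "premium", "deductible"}),
    CandidatePath(["verify", "renew"], "buy car insurance", {"car", "renewal"}),
    CandidatePath(["verify", "cancel"], "cancel policy", {"policy", "refund"}),
]
target = select_target_path(paths, "buy car insurance", ["car", "deductible"])
print(target.nodes if target else None)
```

Filtering before scoring means a path that mentions many of the user's entities but ends at an unrelated intent can never be chosen.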
In an embodiment of the present invention, in determining a jump node of the user flow chain according to the user flow chain and the target path, and querying a reply knowledge graph according to the jump node to obtain a target sentence, the processor 1202 is specifically configured to perform the following operations:
inserting an undetermined state node at the end of the user flow chain to obtain a first user flow chain;
performing graph embedding processing on the first user flow chain to obtain a first node vector of the undetermined state node;
determining, among the nodes of the target path, a second node corresponding to the last node of the user flow chain;
performing graph embedding processing on the target path to obtain at least one second node vector, wherein the node corresponding to each of the at least one second node vector is located after the second node;
and acquiring the similarity between each second node vector and the first node vector, and taking the node corresponding to the second node vector with the maximum similarity as the jump node of the user flow chain.
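The jump-node search above can be pictured with a toy sketch. A real system would use a trained graph embedding model, which the text does not name; as a clearly labeled substitute, each node is embedded here as a distance-decayed bag of its preceding nodes, and vectors are compared with cosine similarity. The placeholder name `PENDING`, the decay value, and all function names are hypothetical.

```python
import math
from typing import Dict, List, Optional

PENDING = "<pending>"  # placeholder for the undetermined state node


def context_vector(chain: List[str], index: int, decay: float = 0.5) -> Dict[str, float]:
    """Toy embedding: weight each predecessor by decay ** distance."""
    vec: Dict[str, float] = {}
    for dist, j in enumerate(range(index - 1, -1, -1)):
        vec[chain[j]] = vec.get(chain[j], 0.0) + decay ** dist
    return vec


def cosine(u: Dict[str, float], v: Dict[str, float]) -> float:
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def find_jump_node(user_chain: List[str], target_path: List[str]) -> Optional[str]:
    # Append the undetermined node and embed it from the user's own history.
    first_chain = user_chain + [PENDING]
    first_vec = context_vector(first_chain, len(first_chain) - 1)
    # Locate the node of the target path that matches the user's last node
    # (raises ValueError if the target path does not contain it).
    second_idx = target_path.index(user_chain[-1])
    best_node, best_sim = None, -1.0
    # Only nodes after the second node are candidates.
    for idx in range(second_idx + 1, len(target_path)):
        sim = cosine(first_vec, context_vector(target_path, idx))
        if sim > best_sim:
            best_node, best_sim = target_path[idx], sim
    return best_node


target_path = ["greet", "verify", "quote", "pay", "close"]
print(find_jump_node(["greet", "verify"], target_path))
```

The function returns `None` when the user's last node is already the end of the target path, since no later node is available to jump to.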
It should be understood that the knowledge-graph-based reply sentence determination device in the present application may be a smart phone (e.g., an Android phone, an iOS phone, or a Windows Phone), a tablet computer, a palmtop computer, a notebook computer, a Mobile Internet Device (MID), a robot, a wearable device, or the like. The devices listed above are examples only and not an exhaustive list. In practical applications, the knowledge-graph-based reply sentence determination device may also be an intelligent vehicle-mounted terminal, a computer device, or the like.
Through the above description of the embodiments, those skilled in the art will clearly understand that the present invention can be implemented by software in combination with a hardware platform. Based on this understanding, all or part of the contribution that the technical solution of the present invention makes over the background art can be embodied in the form of a software product, which can be stored in a storage medium such as a ROM/RAM, a magnetic disk, or an optical disk, and which includes instructions for causing a computer device (which can be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments or in some parts of the embodiments.
Accordingly, the present application also provides a computer readable storage medium, which stores a computer program, wherein the computer program is executed by a processor to implement part or all of the steps of any one of the knowledge-graph based reply sentence determination methods as described in the above method embodiments. For example, the storage medium may include a hard disk, a floppy disk, an optical disk, a magnetic tape, a magnetic disk, a flash memory, and the like.
Embodiments of the present application also provide a computer program product comprising a non-transitory computer readable storage medium storing a computer program operable to cause a computer to perform some or all of the steps of any one of the methods for knowledge-graph based reply sentence determination as set forth in the method embodiments above.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present application is not limited by the order of acts described, as some steps may occur in other orders or concurrently depending on the application. Further, those skilled in the art should also appreciate that the embodiments described in the specification are all alternative embodiments and that the acts and modules referred to are not necessarily required by the application.
In the above embodiments, the description of each embodiment has its own emphasis, and for parts not described in detail in a certain embodiment, reference may be made to the description of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other manners. The apparatus embodiments described above are merely illustrative. For example, the division into units is merely a division by logical function, and other divisions are possible in an actual implementation: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical or take another form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or may be implemented in the form of a software program module.
The integrated unit, if implemented in the form of a software program module and sold or used as a stand-alone product, may be stored in a computer-readable memory. Based on this understanding, the technical solution of the present application in essence, or the part of it that contributes over the prior art, or all or part of the technical solution, may be embodied in the form of a software product that is stored in a memory and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned memory includes a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, and other media capable of storing program code.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing the associated hardware. The program may be stored in a computer-readable memory, and the memory may include a flash memory disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and the like.
The foregoing detailed description of the embodiments of the present application has been presented to illustrate the principles and implementations of the present application, and the above description of the embodiments is only provided to help understand the methods and their core ideas of the present application; meanwhile, for a person skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Claims (10)

1. A method for determining reply sentences based on knowledge graph, the method comprising:
determining service information corresponding to the voice information at the current moment according to the voice information at the current moment of the user;
acquiring a historical dialogue data set of an application field corresponding to the service information, and generating a flow knowledge graph according to the historical dialogue data set, wherein the flow knowledge graph is used for identifying flow logic of a dialogue link between a user and an agent in the application field corresponding to the service information;
querying the flow knowledge graph according to the service information, and determining a first node corresponding to the service information;
taking the first node as a start node and each leaf node in the flow knowledge graph as an end node to generate paths, and combining all the generated paths to obtain a candidate path set;
performing intention recognition on the voice information at the current moment to obtain user intention information, performing entity extraction on the voice information at the current moment to obtain at least one entity, and performing flow extraction on the voice information at the current moment to obtain a user flow chain;
determining a target path in the candidate path set according to the user intention information and the at least one entity;
and determining a jump node of the user flow chain according to the user flow chain and the target path, querying a reply knowledge graph according to the jump node to obtain a target sentence, and pushing the target sentence to an agent to reply to the voice information of the user at the current moment, wherein the reply knowledge graph is used for identifying reply sentences whose use frequency in the application field is greater than a threshold value and the logical relationship between the reply sentences.
2. The method of claim 1, wherein generating a flow knowledge graph from the historical dialogue data set comprises:
performing dialogue process extraction on each historical dialogue data in the historical dialogue data set to obtain at least one process data chain, wherein the at least one process data chain is in one-to-one correspondence with the historical dialogue data in the historical dialogue data set;
performing clustering processing on the at least one process data chain for N times to obtain a clustered data chain set, wherein N is an integer greater than or equal to 1;
extracting the stage node and the jump relation of each clustered data chain in the clustered data chain set, and performing duplicate removal processing on the extraction result to obtain at least one stage node and at least one jump relation;
and taking each stage node in the at least one stage node as a knowledge node of the flow knowledge graph, and connecting two knowledge nodes with a jump relationship through a directed line segment according to the at least one jump relationship to obtain the flow knowledge graph, wherein the direction of the directed line segment is used for identifying the jump direction between the two knowledge nodes with the jump relationship.
3. The method of claim 2, wherein the clustering the at least one process data chain N times to obtain a clustered data chain set comprises:
in the ith clustering process, clustering the initial data chain set A_i to obtain at least one clustering result, wherein the distance between any two data chains contained in each of the at least one clustering result is less than a distance threshold B_i corresponding to the ith clustering process, i is an integer greater than or equal to 0 and less than or equal to N, and when i is equal to 1, the initial data chain set A_i is the data chain set composed of the at least one process data chain;
performing fusion processing on the data chains in each clustering result to obtain at least one fused data chain, wherein the at least one fused data chain is in one-to-one correspondence with the at least one clustering result;
taking the data chain set formed by the at least one fused data chain as the initial data chain set A_{i+1} of the (i+1)th clustering process, and performing the (i+1)th clustering process, until the N clustering processes have been performed, to obtain the clustered data chain set.
4. The method according to claim 3, wherein the fusing the data chains in each clustering result to obtain at least one fused data chain comprises:
performing feature extraction on each data chain in each clustering result to obtain at least one feature vector, wherein the at least one feature vector is in one-to-one correspondence with the data chain in each clustering result;
obtaining an average vector of the at least one feature vector;
and reconstructing a data chain according to the average vector to obtain the at least one fused data chain.
5. The method of claim 1, wherein determining a target path in the set of candidate paths based on the user intent information and the at least one entity comprises:
acquiring end point intention information corresponding to the end point of each candidate path in the candidate path set;
determining final intention information of the user according to the user intention information;
determining at least one first candidate path in the candidate path set according to the final intention information of the user, wherein the similarity between the end point intention information corresponding to each first candidate path in the at least one first candidate path and the final intention information of the user is greater than a first threshold;
matching the at least one entity with each first candidate path to obtain at least one path score, wherein the at least one path score is in one-to-one correspondence with the at least one first candidate path;
and taking the first candidate path corresponding to the highest score in the at least one path score as the target path.
6. The method of claim 1, wherein determining a jump node of the user flow chain according to the user flow chain and the target path, and querying a reply knowledge graph according to the jump node to obtain a target sentence comprises:
inserting an undetermined state node at the end of the user flow chain to obtain a first user flow chain;
performing graph embedding processing on the first user flow chain to obtain a first node vector of the undetermined state node;
determining, among the nodes of the target path, a second node corresponding to the last node of the user flow chain;
performing graph embedding processing on the target path to obtain at least one second node vector, wherein the node corresponding to each of the at least one second node vector is located after the second node;
and acquiring the similarity between each second node vector and the first node vector, and taking the node corresponding to the second node vector with the maximum similarity as the jump node of the user flow chain.
7. A knowledge-graph-based reply sentence determination apparatus, the apparatus comprising:
the analysis module is used for determining the service information corresponding to the voice information at the current moment according to the voice information at the current moment of the user;
the graph construction module is used for acquiring a historical dialogue data set of an application field corresponding to the service information and generating a flow knowledge graph according to the historical dialogue data set, wherein the flow knowledge graph is used for identifying flow logic of a dialogue link between a user and an agent in the application field corresponding to the service information;
the query module is used for querying the flow knowledge graph according to the service information and determining a first node corresponding to the service information;
the processing module is used for generating paths by taking the first node as a start node and each leaf node in the flow knowledge graph as an end node, and combining all the generated paths to obtain a candidate path set;
the analysis module is further configured to perform intent recognition on the voice information at the current time to obtain user intent information, perform entity extraction on the voice information at the current time to obtain at least one entity, and perform flow extraction on the voice information at the current time to obtain a user flow chain;
the processing module is further configured to determine a target path in the candidate path set according to the user intention information and the at least one entity;
the query module is further configured to determine a jump node of the user flow chain according to the user flow chain and the target path, query a reply knowledge graph according to the jump node to obtain a target sentence, and push the target sentence to an agent to reply to the voice information of the user at the current moment, wherein the reply knowledge graph is used for identifying reply sentences whose use frequency in the application field is greater than a threshold value and the logical relationship between the reply sentences.
8. The apparatus of claim 7, wherein in generating a flow knowledge graph from the historical dialogue data set, the graph construction module is specifically configured to perform the following operations:
performing dialogue process extraction on each historical dialogue data in the historical dialogue data set to obtain at least one process data chain, wherein the at least one process data chain is in one-to-one correspondence with the historical dialogue data in the historical dialogue data set;
performing clustering processing on the at least one process data chain for N times to obtain a clustered data chain set, wherein N is an integer greater than or equal to 1;
extracting the stage node and the jump relation of each clustered data chain in the clustered data chain set, and performing duplicate removal processing on the extraction result to obtain at least one stage node and at least one jump relation;
and taking each stage node in the at least one stage node as a knowledge node of the flow knowledge graph, and connecting two knowledge nodes with a jump relationship through a directed line segment according to the at least one jump relationship to obtain the flow knowledge graph, wherein the direction of the directed line segment is used for identifying the jump direction between the two knowledge nodes with the jump relationship.
9. An electronic device comprising a processor, a memory, a communication interface, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the processor, the one or more programs including instructions for performing the steps in the method of any of claims 1-6.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which is executed by a processor to implement the method according to any one of claims 1-6.
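The candidate path set recited in claim 1 (every path that starts at the first node and ends at a leaf node of the flow knowledge graph) can be illustrated with a minimal sketch. It assumes the graph is stored as an adjacency mapping from each node to its successors and that a path may not revisit a node; that cycle guard, the function name `candidate_paths`, and the sample graph are assumptions, not part of the claims.

```python
from typing import Dict, List


def candidate_paths(graph: Dict[str, List[str]], first_node: str) -> List[List[str]]:
    """Enumerate all simple paths from first_node to any leaf node."""
    paths: List[List[str]] = []
    stack: List[List[str]] = [[first_node]]
    while stack:
        path = stack.pop()
        successors = graph.get(path[-1], [])
        if not successors:
            paths.append(path)  # a node without outgoing jumps is a leaf
            continue
        for nxt in successors:
            if nxt not in path:  # assumption: skip nodes already on the path
                stack.append(path + [nxt])
    return paths


# Hypothetical flow knowledge graph.
graph = {
    "greet": ["verify"],
    "verify": ["quote", "complaint"],
    "quote": ["close"],
    "complaint": ["close"],
    "close": [],
}
for path in sorted(candidate_paths(graph, "verify")):
    print(" -> ".join(path))
```

Starting the search at the first node rather than at the graph root means stages the dialogue has already passed are excluded from every candidate path.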
CN202111446164.3A 2021-11-30 2021-11-30 Reply sentence determination method and device based on knowledge graph and electronic equipment Pending CN114090755A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111446164.3A CN114090755A (en) 2021-11-30 2021-11-30 Reply sentence determination method and device based on knowledge graph and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111446164.3A CN114090755A (en) 2021-11-30 2021-11-30 Reply sentence determination method and device based on knowledge graph and electronic equipment

Publications (1)

Publication Number Publication Date
CN114090755A true CN114090755A (en) 2022-02-25

Family

ID=80305958

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111446164.3A Pending CN114090755A (en) 2021-11-30 2021-11-30 Reply sentence determination method and device based on knowledge graph and electronic equipment

Country Status (1)

Country Link
CN (1) CN114090755A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114840178A (en) * 2022-07-01 2022-08-02 浙江西图盟数字科技有限公司 Process file generation method, device and equipment based on digital simulation platform
CN116167605A (en) * 2023-04-26 2023-05-26 北京中关村科金技术有限公司 Business process generation method, device, equipment and medium
CN116578692A (en) * 2023-07-13 2023-08-11 江西微博科技有限公司 AI intelligent service calculation method based on big data
CN117114695A (en) * 2023-10-19 2023-11-24 本溪钢铁(集团)信息自动化有限责任公司 Interaction method and device based on intelligent customer service in steel industry

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111949787A (en) * 2020-08-21 2020-11-17 平安国际智慧城市科技股份有限公司 Automatic question-answering method, device, equipment and storage medium based on knowledge graph
CN112559709A (en) * 2020-12-16 2021-03-26 中国平安人寿保险股份有限公司 Knowledge graph-based question and answer method, device, terminal and storage medium
CN112732882A (en) * 2020-12-30 2021-04-30 平安科技(深圳)有限公司 User intention identification method, device, equipment and computer readable storage medium
CN113239178A (en) * 2021-07-09 2021-08-10 肇庆小鹏新能源投资有限公司 Intention generation method, server, voice control system and readable storage medium

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114840178A (en) * 2022-07-01 2022-08-02 浙江西图盟数字科技有限公司 Process file generation method, device and equipment based on digital simulation platform
CN116167605A (en) * 2023-04-26 2023-05-26 北京中关村科金技术有限公司 Business process generation method, device, equipment and medium
CN116578692A (en) * 2023-07-13 2023-08-11 江西微博科技有限公司 AI intelligent service calculation method based on big data
CN116578692B (en) * 2023-07-13 2023-09-15 江西微博科技有限公司 AI intelligent service calculation method based on big data
CN117114695A (en) * 2023-10-19 2023-11-24 本溪钢铁(集团)信息自动化有限责任公司 Interaction method and device based on intelligent customer service in steel industry
CN117114695B (en) * 2023-10-19 2024-01-26 本溪钢铁(集团)信息自动化有限责任公司 Interaction method and device based on intelligent customer service in steel industry

Similar Documents

Publication Publication Date Title
US11551007B2 (en) Determining intent from a historical vector of a to-be-analyzed statement
CN114090755A (en) Reply sentence determination method and device based on knowledge graph and electronic equipment
CN113590776B (en) Knowledge graph-based text processing method and device, electronic equipment and medium
EP4113357A1 (en) Method and apparatus for recognizing entity, electronic device and storage medium
CN110399473B (en) Method and device for determining answers to user questions
CN112926308B (en) Method, device, equipment, storage medium and program product for matching text
CN113407677B (en) Method, apparatus, device and storage medium for evaluating consultation dialogue quality
CN112579733A (en) Rule matching method, rule matching device, storage medium and electronic equipment
CN113641805A (en) Acquisition method of structured question-answering model, question-answering method and corresponding device
CN110059172B (en) Method and device for recommending answers based on natural language understanding
CN117421398A (en) Man-machine interaction method, device, equipment and storage medium
CN113821588A (en) Text processing method and device, electronic equipment and storage medium
US20230206007A1 (en) Method for mining conversation content and method for generating conversation content evaluation model
CN114647739B (en) Entity chain finger method, device, electronic equipment and storage medium
CN112148939A (en) Data processing method and device and electronic equipment
CN115510193A (en) Query result vectorization method, query result determination method and related device
CN114519094A (en) Method and device for conversational recommendation based on random state and electronic equipment
CN114444514A (en) Semantic matching model training method, semantic matching method and related device
CN113806541A (en) Emotion classification method and emotion classification model training method and device
CN114118937A (en) Information recommendation method and device based on task, electronic equipment and storage medium
CN111949776A (en) Method and device for evaluating user tag and electronic equipment
CN118093839B (en) Knowledge operation question-answer dialogue processing method and system based on deep learning
CN113344405B (en) Method, device, equipment, medium and product for generating information based on knowledge graph
CN115080845A (en) Recommendation reason generation method and device, electronic device and readable storage medium
CN118093839A (en) Knowledge operation question-answer dialogue processing method and system based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination