CN111090740B

CN111090740B - Knowledge graph generation method for dialogue system

Info

Publication number: CN111090740B
Application number: CN201911237107.7A
Authority: CN
Inventors: 余轲
Original assignee: Beijing Lun Zi Technology Co ltd
Current assignee: Beijing Lun Zi Technology Co ltd
Priority date: 2019-12-05
Filing date: 2019-12-05
Publication date: 2023-09-29
Anticipated expiration: 2039-12-05
Also published as: CN111090740A

Abstract

The embodiment of the application discloses a method for generating a knowledge graph of a question-answering system. One embodiment of the method comprises the following steps: initializing a knowledge graph, acquiring an input sentence, determining each node in the initialized knowledge graph corresponding to the input sentence, determining the structural features and the unstructured features of each node, determining the graph embedded features of each node in the initialized knowledge graph by using a confidence propagation mechanism, and generating the knowledge graph of the question-answering system. The method utilizes the structured knowledge base information and unstructured dialogue sentence information in the question-answering system in the process of generating the knowledge graph, and can better assist in simulating and generating dialogue sentences of a real speaker in the question-answering process.

Description

Knowledge graph generation method for dialogue system

Technical Field

The embodiment of the application relates to the technical field of computers, in particular to the technical field of computer natural language processing, and particularly relates to a method for generating a knowledge graph of a question-answering system.

Background

In an open dialogue environment, an open question-answering system can, based on a knowledge base, learn the sentence embedding feature of the current input sentence from the previous dialogue sentence and thereby output a target sentence. A knowledge graph is a knowledge representation method that uses a graph model to describe knowledge and to model relationships between entities. A knowledge-graph-based question-answering system answers questions using the knowledge graph: semantic understanding results are mapped onto the knowledge graph, and graph embedding features are then generated to answer the question. Embedding the knowledge graph to generate graph embedding features means mapping the content of the knowledge graph, including entities and relations, into a continuous vector space, so that each node corresponds to a graph embedding feature.

The knowledge graph uses a vector expression mode, and the calculation efficiency of the application in the question-answering system is improved by using a numerical calculation method. The vector expression mode in the knowledge graph can effectively utilize the currently popular neural network, deep learning and other machine learning methods, so that the diversity of the design of the question-answering system can be increased.

The existing knowledge graph for the question-answering system can capture the open characteristics of a dialogue, but cannot be directly applied to a scene of knowledge interaction depending on the structural characteristics due to the lack of the structural dialogue state characteristics. If the knowledge graph can be constructed based on the structured features and the unstructured features in the question-answering system at the same time, new nodes are added, context knowledge is propagated and the knowledge graph is updated along with continuous input of sentences, the dialogue effect of the question-answering system can be improved.

Disclosure of Invention

The embodiment of the application provides a method for generating a knowledge graph of a question-answering system.

In a first aspect, an embodiment of the present application provides a method for generating information, the method including: generating an initialization knowledge graph based on the dialogue sample library; acquiring an input sentence; determining each node in the initialized knowledge-graph corresponding to the input sentence, determining a structural feature of each node, and determining an unstructured feature of each node; and determining graph embedding characteristics of each node in the initialized knowledge graph based on the determined structured characteristics and unstructured characteristics by using a confidence propagation mechanism, and generating the knowledge graph of the question-answering system.

In some embodiments, generating an initialization knowledge-graph based on a dialog sample library includes: generating nodes and edges based on a structured knowledge base in the dialogue sample library; updating the nodes and edges based on unstructured dialogue statements in the dialogue sample library; the dialogue sample library comprises a structured knowledge base and unstructured dialogue sentences, and the nodes and the edges form an initialization knowledge graph.

In some embodiments, generating nodes and edges based on a structured knowledge base in a dialog sample library includes: generating a node in the initialized knowledge-graph, the node comprising: project nodes, attribute nodes and entity nodes; and generating edges in the initialized knowledge graph, wherein the edges represent the relationship between different nodes.

In some embodiments, updating the nodes and edges based on unstructured dialogue statements in a dialogue sample library includes: if the unstructured sentences in the dialogue sample library contain nodes which are not in the initialized knowledge graph, adding new nodes, and updating edges according to node relations.

In some embodiments, determining the structured feature of each node includes: determining a one-hot vector of the occurrence count of each node, where the occurrence-count one-hot vector represents the number of times each node occurs in all sentences stored in the question-answering system; determining a one-hot vector of the type of each node, where the type one-hot vector represents the type of each node; determining a one-hot vector of the occurrence status of each node, where the occurrence-status one-hot vector represents whether each node appears in the input sentence; and concatenating the occurrence-count, type, and occurrence-status one-hot vectors to determine the structured feature of each node.

In some embodiments, determining unstructured characteristics of each node includes: generating an entity set based on the input sentence; determining statement embedding characteristics of the input statement; and determining unstructured characteristics of each node based on the entity set and the statement embedded characteristics.

In some embodiments, generating the set of entities based on the input statement includes: initializing an entity set as an empty set; if the input sentence contains entity nodes in the initialized knowledge graph, determining an entity node set as the entity set; and if the input sentence does not contain the entity node in the initialization knowledge graph, using an entity set corresponding to a last sentence as the entity set, wherein the last sentence is a last sentence stored in the question-answering system for the input sentence.

In some embodiments, determining statement embedding characteristics of the input statement includes: taking the input sentence as the input of a recurrent neural network; and taking the value of the last hidden layer of the recurrent neural network as the statement embedded feature of the input statement to be output.

In some embodiments, determining graph embedding features for each node in the initialized knowledge-graph based on the determined structured features and unstructured features using a confidence propagation mechanism, generating a knowledge-graph of a question-answering system includes: layering nodes in the initialized knowledge graph; determining the serial connection result of the determined structured features and unstructured features as graph embedding features of nodes in the initialization knowledge graph of layer 0; updating graph embedding characteristics of each layer of nodes in the initialized knowledge graph by using a confidence propagation mechanism, wherein the confidence propagation mechanism updates the knowledge graph by using a method of transmitting information between nodes; and embedding features into the graphs of each layer of nodes in series, and generating a knowledge graph of the question-answering system.

In some embodiments, the method further comprises: and sending the knowledge graph of the question-answering system to target display equipment, and controlling the target display equipment to display the knowledge graph.

In a second aspect, an embodiment of the present application provides an electronic device, including: one or more processors; and a storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method as described in any of the implementations of the first aspect.

In a third aspect, embodiments of the present application provide a computer-readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements a method as described in any of the implementations of the first aspect.

The method for generating information provided by the embodiment of the application comprises the steps of initializing a knowledge graph, inputting sentences, determining each node in the initialized knowledge graph corresponding to the input sentences, determining the structural characteristics and the unstructured characteristics of each node, and determining the graph embedding characteristics of each node in the initialized knowledge graph, thereby generating the knowledge graph for a question-answering system.

One of the above embodiments of the present application has the following advantageous effects: an initialization knowledge graph is generated based on the structured knowledge base and unstructured dialogue sentences in the dialogue sample library, and the graph embedding feature of each node in the initialization knowledge graph is determined by using the structured knowledge base information and the unstructured dialogue sentence information in the question-answering system. Because both the structured features and the unstructured features are considered when generating the graph embedding features of the knowledge graph nodes, the real dialogue environment information of the question-answering system can be effectively captured. The embodiment of the application can therefore better assist in simulating and generating the dialogue sentences of a real speaker in the question-answering process.

Drawings

Other features, objects and advantages of the present application will become more apparent upon reading of the detailed description of non-limiting embodiments, made with reference to the accompanying drawings in which:

FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;

FIG. 2 is a flow chart of one embodiment of a method for generating a knowledge graph of a question-answering system, in accordance with the present application;

FIG. 3 is a flow diagram of one embodiment for generating an initialization knowledge-graph in accordance with the application;

FIG. 4 is a flow chart of yet another embodiment of a knowledge-graph method for generating a question-answering system, in accordance with the present application;

FIG. 5 is a schematic diagram of a computer system suitable for use in implementing an embodiment of the application.

Detailed Description

The application is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the application and are not limiting of the application. It should be noted that, for convenience of description, only the portions related to the present application are shown in the drawings.

It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other. The application will be described in detail below with reference to the drawings in connection with embodiments.

FIG. 1 illustrates an exemplary system architecture 100 to which an embodiment of a method of generating information of the present application may be applied.

As shown in fig. 1, a system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.

The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages or the like. Various communication client applications, such as a text processing application, a natural language processing application, a question-answering system application, etc., may be installed on the terminal devices 101, 102, 103.

The terminal devices 101, 102, 103 may be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices with display screens, including but not limited to smartphones, tablets, laptop and desktop computers, and the like. When the terminal devices 101, 102, 103 are software, they can be installed in the above-listed electronic devices. They may be implemented as multiple pieces of software or software modules (e.g., to provide conversational speech input) or as a single piece of software or software module. The present application is not particularly limited herein.

The server 105 may be a server that provides various services, such as a knowledge graph generation server that analyzes sentences input by the terminal devices 101, 102, 103 and generates corresponding knowledge graphs. The knowledge graph generation server may analyze and process received data such as sentences, and feed back the processing result (for example, a knowledge graph) to the terminal device.

It should be noted that, the method for generating the knowledge graph of the question-answering system provided by the embodiment of the present application is generally executed by the server 105, and accordingly, the device for generating the knowledge graph of the question-answering system is generally disposed in the server 105.

It should be noted that, the sentence may be directly stored locally in the server 105, and the server 105 may directly extract the local dialogue sentence to generate the knowledge graph of the question-answering system, where the exemplary system architecture 100 may not include the terminal devices 101, 102, 103 and the network 104.

It should also be noted that the knowledge-graph generation class application may also be installed in the terminal devices 101, 102, 103, and in this case, the method for generating the knowledge graph of the question-answering system may also be performed by the terminal devices 101, 102, 103. At this point, the exemplary system architecture 100 may also not include the server 105 and the network 104.

The server 105 may be hardware or software. When the server 105 is hardware, it may be implemented as a distributed server cluster formed by a plurality of servers, or as a single server. When the server is software, it may be implemented as a plurality of software or software modules (for example, to provide a knowledge-graph generation service), or may be implemented as a single software or software module. The present application is not particularly limited herein.

It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.

With continued reference to FIG. 2, a flow 200 of one embodiment of a knowledge-graph method for generating a question-answering system in accordance with the present application is shown. The method for generating the knowledge graph of the question-answering system comprises the following steps:

Step 201, generating an initialization knowledge graph based on a dialogue sample library.

In this embodiment, nodes and edges in the knowledge-graph are generated based on structured knowledge bases within defined specific areas contained in the dialog sample library. The nodes and edges are updated based on unstructured dialogue statements stored in a dialogue sample library. The nodes and edges generated based on the dialog sample library constitute an initialization knowledge graph.

Step 202, an input sentence is obtained.

In the present embodiment, an execution subject (e.g., a server shown in fig. 1) of a method for generating a knowledge graph may acquire an input sentence. Here, the input sentence may be a dialogue sentence of arbitrary content. For example, the input sentence may be a dialogue sentence regarding weather conditions.

The input sentence may be uploaded to the execution entity by a terminal device (for example, the terminal devices 101, 102, 103 shown in fig. 1) communicatively connected to the execution entity by a wired connection or a wireless connection, or may be stored locally in the execution entity. It should be noted that the wireless connection may include, but is not limited to, 3G/4G connections, WiFi connections, Bluetooth connections, WiMAX connections, Zigbee connections, UWB (ultra-wideband) connections, and other now known or later developed wireless connection means.

Step 203, determining each node in the initialized knowledge-graph corresponding to the input sentence, determining a structural feature of each node, and determining an unstructured feature of each node.

In this embodiment, the execution body (for example, the server shown in fig. 1) determines, according to the input sentence, each node in the initialization knowledge graph that matches each word in the input sentence. Based on the input sentence and the dialogue sample library, the structured feature of each node is calculated, and based on the input sentence, the unstructured feature of each node is calculated.

Step 204, determining graph embedding features of each node in the initialized knowledge graph based on the determined structured features and unstructured features by using a confidence propagation mechanism, and generating a knowledge graph of the question-answering system.

In this embodiment, the execution body (for example, the server shown in fig. 1) connects in series the structured feature and the unstructured feature of each node, and propagates the structured feature and the unstructured feature of each node that have been determined to the neighborhood node corresponding to each node in the initialized knowledge graph by using a confidence propagation mechanism, so as to determine the graph embedding feature of the node in the initialized knowledge graph, and generate the knowledge graph of the question-answering system. The confidence propagation mechanism updates the knowledge graph by using a method for transmitting information between nodes.

One embodiment, as illustrated in fig. 2, has the following beneficial effects: an initialization knowledge graph is generated based on the structured knowledge base and unstructured dialogue sentences in the dialogue sample library, and the graph embedding feature of each node in the initialization knowledge graph is determined by using the structured knowledge base information and the unstructured dialogue sentence information in the question-answering system. Because both the structured features and the unstructured features are considered when generating the graph embedding features of the knowledge graph nodes, the real dialogue environment information of the question-answering system can be effectively captured. This embodiment can therefore better assist in simulating and generating the dialogue sentences of a real speaker in the question-answering process.

With continued reference to fig. 3, fig. 3 illustrates a flow 300 of one embodiment of initializing a knowledge-graph in accordance with the application. The initialization process may include the steps of:

Step 301, determining a knowledge-graph basic structure.

In this embodiment, the basic structure of the initialization knowledge-graph is first determined. Knowledge graph is a semantic network that characterizes relationships between entities, describes real world things and their interrelationships in a structured form, and stores the things and their interrelationships as structured knowledge. In this embodiment, the basic structure of the knowledge graph includes nodes and edges, where each node represents an entity of specific knowledge, and a connection edge of a node represents a relationship between the entities.

Step 302, a dialogue sample library is obtained.

In this embodiment, the question-answering system is implemented based on a dialogue sample library. The dialogue sample library comprises a structured knowledge base and unstructured dialogue sentences in a limited specific field, and the question-answering system obtains answers by matching questions in the dialogue sample library. The structured knowledge base is stored in a list form, and specifically, the items and the corresponding attributes thereof. Unstructured conversation sentences are stored in the form of sentences of daily chat conversations. In this embodiment, the structured knowledge base of the dialogue sample library may include public knowledge bases, such as a proprietary knowledge base of natural language understanding, autonomous learning, scientific knowledge, and the like. Unstructured conversational sentences may contain daily conversational sentences or the like collected in a defined application scenario.

Step 303, generating nodes and edges based on the structured knowledge base in the dialogue sample library; the nodes and edges are updated based on unstructured dialogue statements in a dialogue sample library.

In this embodiment, the nodes in the initialized knowledge-graph are generated based on the structured text in the dialogue sample library, and the nodes include: item nodes, attribute nodes, and entity nodes. Edges in the initialization knowledge-graph are generated based on structured text in a dialog sample library, wherein the edges represent relationships between different nodes. A triplet of "node-edge-node" is determined.

The nodes and edges are updated based on the unstructured sentences in the dialogue sample library. For each sentence in the dialogue sample library, if the sentence contains nodes that are not in the initialized knowledge graph, new nodes are added and the edges are updated according to the relations between the added nodes and the other nodes. If the sentence contains no such nodes, the nodes and edges are not updated.

Step 304, determining the obtained knowledge graph as the initialized knowledge graph.

In this embodiment, the already obtained knowledge graph of the node and edge composition is used as the initialization knowledge graph.
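Steps 301 to 304 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `KnowledgeGraph` class, the `build_initial_graph` function, the `has_attribute` relation label, and the shape of the input data are all assumptions made for the example, and extracting mentions from unstructured sentences is left outside the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeGraph:
    """Nodes keyed by name with a type label; edges are (head, relation, tail) triples."""
    nodes: dict = field(default_factory=dict)  # name -> node type
    edges: set = field(default_factory=set)    # {(head, relation, tail)}

    def add_node(self, name, node_type):
        # Keep the first type assigned to a name; later mentions do not overwrite it.
        self.nodes.setdefault(name, node_type)

    def add_edge(self, head, relation, tail):
        self.edges.add((head, relation, tail))


def build_initial_graph(knowledge_base, dialogue_mentions):
    """knowledge_base: list of {"item": str, "attributes": {attribute: entity}} rows.
    dialogue_mentions: (entity, relation, existing_node) tuples taken from
    unstructured sentences (hypothetical pre-extracted input)."""
    graph = KnowledgeGraph()
    # Nodes and edges from the structured knowledge base.
    for row in knowledge_base:
        item = row["item"]
        graph.add_node(item, "item")
        for attribute, entity in row["attributes"].items():
            graph.add_node(attribute, "attribute")
            graph.add_node(entity, "entity")
            graph.add_edge(item, "has_attribute", attribute)  # assumed relation label
            graph.add_edge(item, attribute, entity)
    # Update with nodes that appear only in unstructured dialogue sentences.
    for entity, relation, anchor in dialogue_mentions:
        if entity not in graph.nodes:
            graph.add_node(entity, "entity")
            graph.add_edge(anchor, relation, entity)
    return graph


kb = [{"item": "Beijing", "attributes": {"weather": "sunny", "temperature": "25C"}}]
mentions = [("windy", "weather", "Beijing"), ("sunny", "weather", "Beijing")]
graph = build_initial_graph(kb, mentions)
print(sorted(graph.nodes.items()))
```

In this example the mention of "sunny" leaves the graph unchanged because the node already exists, while "windy" is added as a new entity node with one new edge.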

One embodiment, as illustrated in fig. 3, has the following beneficial effects: an initialization knowledge-graph is generated based on the structured knowledge base and unstructured dialogue statements in the dialogue sample library. The knowledge graph consists of nodes and edges. Knowledge bases and dialogue sentences in the dialogue sample library are considered in determining nodes and edges, so nodes in the knowledge bases stored in structured form and nodes in dialogue sentences stored in unstructured form can be completely contained. Therefore, the method and the device can better generate the knowledge graph based on the question-answering process.

With further reference to fig. 4, a flow 400 of yet another embodiment of a method for generating a knowledge graph of a question-answering system is shown. The process 400 of the method for generating a knowledge graph includes the steps of:

Step 401, an input sentence is obtained.

In the present embodiment, an execution subject (e.g., a server shown in fig. 1) of the knowledge graph method for generating the question-answering system may acquire an input sentence. Here, the input sentence may be any dialogue sentence.

Here, the input sentence may be uploaded to the execution entity by a terminal device (for example, terminal devices 101, 102, 103 shown in fig. 1) communicatively connected to the execution entity by a wired connection or a wireless connection, or may be stored locally in the execution entity.

Step 402, determining each node in the initialized knowledge-graph corresponding to the input sentence.

In this embodiment, all the words in the input sentence are matched with all the nodes in the initialized knowledge-graph, and each node corresponding to all the words in the input sentence is determined.
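One simple way to picture this matching step is exact, case-insensitive token lookup. The sketch below assumes whitespace tokenization and exact name matching; the source does not specify either, and `match_nodes` is a hypothetical helper name.

```python
def match_nodes(sentence, node_names):
    """Return graph nodes whose names appear as tokens of the sentence, in sentence order."""
    tokens = sentence.lower().split()  # whitespace tokenization is an assumption
    lookup = {name.lower(): name for name in node_names}
    matched = []
    for token in tokens:
        word = token.strip(".,!?")  # drop trailing punctuation before lookup
        if word in lookup and lookup[word] not in matched:
            matched.append(lookup[word])
    return matched


nodes = ["Beijing", "weather", "sunny", "windy"]
print(match_nodes("Is the weather in Beijing sunny today?", nodes))
```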

Step 403, concatenating the one-hot vectors of the occurrence status, occurrence count, and type of each node to determine the structured feature of each node.

In this embodiment, after obtaining the input sentence, the execution body may determine each node in the initialized knowledge graph corresponding to the input sentence. Based on the dialogue sample library and the input sentence, it judges whether each node appears in the input sentence, judges the type of each node, and counts the number of times each node appears in the dialogue sentences of the dialogue sample library stored in the question-answering system. The occurrence status, the type, and the occurrence count of each node are each expressed as a one-hot vector, and the three one-hot vectors are concatenated to obtain the structured feature $F_t(v)$ of the node in the initialized knowledge graph, where $v$ denotes the node, $t$ denotes the computation for the current input sentence, and $F$ denotes the feature vector.
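The concatenation of the three one-hot vectors might look like the sketch below. The bucket size `MAX_COUNT`, the ordering of `NODE_TYPES`, and the function names are assumptions for illustration; the source does not fix the vector lengths.

```python
NODE_TYPES = ["item", "attribute", "entity"]
MAX_COUNT = 3  # counts above this value share the last bucket (an assumption)


def one_hot(index, size):
    vector = [0] * size
    vector[index] = 1
    return vector


def structured_feature(node, node_type, input_tokens, stored_sentences):
    """F_t(v): concatenation of occurrence-status, type, and occurrence-count one-hot vectors."""
    # Occurrence status: does the node appear in the current input sentence?
    presence = one_hot(1 if node in input_tokens else 0, 2)
    # Type: item, attribute, or entity.
    type_vec = one_hot(NODE_TYPES.index(node_type), len(NODE_TYPES))
    # Occurrence count over all stored sentences, clipped to MAX_COUNT.
    count = sum(tokens.count(node) for tokens in stored_sentences)
    count_vec = one_hot(min(count, MAX_COUNT), MAX_COUNT + 1)
    return presence + type_vec + count_vec


history = [["sunny", "in", "Beijing"], ["Beijing", "is", "windy"]]
print(structured_feature("Beijing", "item", ["weather", "in", "Beijing"], history))
```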

Step 404, calculating the entity node condition in the input sentence, and generating an entity set.

In this embodiment, the entity set $E_t$ is generated according to the entity-type nodes in the input sentence, where $E$ denotes the entity set and $t$ denotes the computation for the current input sentence. The entity set is initialized as an empty set. If the input sentence contains entity nodes in the initialized knowledge graph, the corresponding entity nodes are included in the entity set. If the input sentence does not contain entity nodes in the initialized knowledge graph, the entity set $E_{t-1}$ corresponding to the previous sentence is used as the entity set, where $t-1$ denotes the computation for the previous sentence. The previous sentence is the sentence that precedes the input sentence in the question-answering system. As the question-answering system is used, input sentences arrive continuously and the entity set is updated accordingly.
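The fallback rule for the entity set can be written in a few lines. This is a sketch under the assumption that the input sentence is already tokenized and that entity nodes are matched by exact name; `update_entity_set` is a hypothetical name.

```python
def update_entity_set(input_tokens, entity_nodes, previous_entities):
    """E_t: entity nodes mentioned in the input sentence, otherwise E_{t-1}."""
    current = set()  # the entity set starts empty
    for token in input_tokens:
        if token in entity_nodes:
            current.add(token)
    # No entity node in the sentence: reuse the previous sentence's entity set.
    return current if current else set(previous_entities)


entities = {"sunny", "windy"}
e1 = update_entity_set(["is", "it", "sunny"], entities, set())
e2 = update_entity_set(["really"], entities, e1)
print(e1, e2)
```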

Step 405, inputting the input sentence into the recurrent neural network, generating the embedded feature of the input sentence.

In this embodiment, the current input sentence is

$$x_t = (x_{t,1}, \ldots, x_{t,n_t})$$

where $t$ denotes the computation for the current input sentence, $t-1$ denotes the computation for the previous sentence, $n_t$ indicates that the input sentence contains $n_t$ words (tokens) in total, $x_t$ denotes the input sentence, and $x_{t,j}$ denotes a word in the input sentence. The input sentence is fed into the recurrent neural network; this embodiment takes a Long Short-Term Memory (LSTM) network as an example:

$$h_{t,j} = \mathrm{LSTM}(h_{t,j-1}, x_{t,j})$$

where $x_{t,j}$ denotes the $j$-th word in the input sentence, that is, the words of the input sentence $x_t$ are fed into the recurrent neural network one by one. $h_{t,j-1}$ denotes the output of the recurrent neural network corresponding to the $(j-1)$-th word of the input sentence, namely the embedding feature of the $(j-1)$-th word. $h_{t,j}$ denotes the embedding feature corresponding to the $j$-th word of the input sentence. In addition, $h_{t,0} = h_{t-1,n_{t-1}}$; that is, the recurrent neural network output for the previous sentence ($t-1$) is used as the recurrent input for the first word of the input sentence ($t$), where the previous sentence is the sentence that precedes the input sentence in the question-answering system. The state value of the last hidden layer of the recurrent neural network is used as the output $h_{t,j}$ of the recurrent neural network, and the final output $h_{t,n_t}$ is used as the sentence embedding feature $u_t$ of the input sentence $x_t$, where $u$ denotes the sentence embedding feature vector.
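The recurrence and the carry-over of state between sentences can be illustrated with a small pure-Python LSTM cell. The sketch assumes untrained random weights, word vectors supplied by the caller, and that the cell state is carried across sentences along with the hidden state (the source mentions only the hidden output). `TinyLSTM` and `embed_sentence` are hypothetical names.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TinyLSTM:
    """Single-layer LSTM cell with randomly initialized (untrained) weights."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        width = input_size + hidden_size
        # One weight matrix and bias per gate: input, forget, output, candidate.
        self.weights = {g: [[rng.uniform(-0.1, 0.1) for _ in range(width)]
                            for _ in range(hidden_size)] for g in "ifoc"}
        self.bias = {g: [0.0] * hidden_size for g in "ifoc"}

    def _gate(self, name, z, activation):
        return [activation(sum(w * v for w, v in zip(row, z)) + b)
                for row, b in zip(self.weights[name], self.bias[name])]

    def step(self, x, state):
        h_prev, c_prev = state
        z = list(x) + list(h_prev)
        i = self._gate("i", z, sigmoid)
        f = self._gate("f", z, sigmoid)
        o = self._gate("o", z, sigmoid)
        g = self._gate("c", z, math.tanh)
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        return h, c


def embed_sentence(lstm, word_vectors, previous_state=None):
    """Apply h_{t,j} = LSTM(h_{t,j-1}, x_{t,j}) word by word; return (u_t, final state)."""
    if previous_state is None:
        zeros = [0.0] * lstm.hidden_size
        previous_state = (zeros, zeros)
    state = previous_state  # h_{t,0} = h_{t-1,n_{t-1}}
    for x in word_vectors:
        state = lstm.step(x, state)
    return state[0], state  # u_t = h_{t,n_t}


lstm = TinyLSTM(input_size=2, hidden_size=3)
u1, s1 = embed_sentence(lstm, [[1.0, 0.0], [0.0, 1.0]])
u2, s2 = embed_sentence(lstm, [[1.0, 1.0]], previous_state=s1)
print(u1, u2)
```

Passing `s1` into the second call is what links consecutive sentences; calling `embed_sentence` without it starts from a zero state instead.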

Step 406, determining unstructured features of each node using the entity sets and the statement embedded features.

In the present embodiment, the unstructured feature of each node is calculated according to $E_t$ and $u_t$:

$$M_t(v) = \lambda_t M_{t-1}(v) + (1 - \lambda_t)\,u_t$$

where $M_{t-1}(v)$ is the unstructured feature corresponding to the previous sentence, $M$ denotes the unstructured feature set, and the previous sentence is the sentence that precedes the input sentence in the question-answering system. $\lambda_t$ is a control parameter, where $t$ denotes the computation for the current input sentence.

In the expression for $\lambda_t$, $W^{inc}$ is the control transformation matrix and $\sigma$ is a scale parameter. For nodes in the initialized knowledge graph that do not correspond to any word in the input sentence, $M_t(v) = M_{t-1}(v)$.
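The interpolation between the previous unstructured feature and the sentence embedding can be sketched as below. Because the expression for the control parameter is not given in this text, the sketch takes it as a caller-supplied function `gate`; that function, the use of the entity set to decide which nodes are updated, and the name `update_unstructured` are all assumptions.

```python
def update_unstructured(previous, sentence_embedding, entity_set, gate):
    """previous: dict node -> M_{t-1}(v).
    gate: callable (M_prev, u_t) -> lambda_t in [0, 1]; a stand-in for the
    control parameter, whose exact form is not specified here."""
    updated = {}
    for node, m_prev in previous.items():
        if node in entity_set:
            lam = gate(m_prev, sentence_embedding)
            # M_t(v) = lambda_t * M_{t-1}(v) + (1 - lambda_t) * u_t
            updated[node] = [lam * m + (1.0 - lam) * u
                             for m, u in zip(m_prev, sentence_embedding)]
        else:
            updated[node] = list(m_prev)  # M_t(v) = M_{t-1}(v)
    return updated


prev = {"sunny": [0.0, 0.0], "windy": [1.0, 1.0]}
out = update_unstructured(prev, [1.0, 2.0], {"sunny"}, lambda m, u: 0.25)
print(out)
```

With a constant gate of 0.25, the node mentioned in the current sentence moves three quarters of the way toward the sentence embedding, while the unmentioned node keeps its previous feature.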

Step 407, determining graph embedding features of each node in the initialized knowledge graph based on the structured features and the unstructured features, and generating a knowledge graph of the question-answering system.

In this embodiment, the determined structured features and unstructured features are propagated to the neighborhood nodes in the initialized knowledge graph by using a confidence propagation mechanism. The nodes in the initialized knowledge graph are layered into $K$ layers in total, where the graph embedding feature of node $v$ at layer $k$ is denoted $V_t^k(v)$ and $t$ denotes the computation for the current input sentence. The graph embedding feature of a layer-0 node is the concatenation of the determined structured and unstructured features, $V_t^0(v) = [F_t(v), M_t(v)]$. The layer-$k$ feature $V_t^k(v)$ is computed from the following quantities.

$N_t(v)$ denotes the set of neighborhood nodes of node $v$. The message from a neighborhood node $v' \in N_t(v)$ depends on the graph embedding feature $V_t^{k-1}(v')$ of $v'$ at layer $k-1$, the edge label $e_{v \to v'}$, and the parameter matrix $W^{mp}$, where $e_{v \to v'}$ is represented by the embedding function $R$. $V_t^k(v)$ is aggregated from the messages of all nodes in the neighborhood by element-wise maximization (element-wise max).

The graph embedding features of each layer are concatenated to obtain the graph embedding feature $V_t(v)$ of each node:

$$V_t(v) = [V_t^0(v), \ldots, V_t^K(v)]$$

The knowledge graph of the question-answering system is then generated from the initialized knowledge graph and the computed graph embedding feature of each node.
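The layer-wise propagation and final concatenation can be sketched as follows. Several details are assumptions for illustration: the message is taken to be `tanh` of $W^{mp}$ applied to the concatenation of the neighbor's previous-layer feature and the edge embedding, the same $W^{mp}$ is reused at every layer (so its output size must equal the layer-0 feature size), and isolated nodes receive a zero vector. The text above names only $W^{mp}$, $R$, and the element-wise max.

```python
import math


def propagate(layer0, neighbors, edge_embedding, weight, num_layers):
    """layer0: dict node -> V_t^0(v) = [F_t(v), M_t(v)].
    neighbors: dict node -> list of (neighbor, edge_label).
    edge_embedding: dict edge_label -> vector R(e).
    weight: W^mp as a list of rows."""
    layers = [layer0]
    for _ in range(num_layers):
        previous, current = layers[-1], {}
        for node in layer0:
            messages = []
            for other, label in neighbors.get(node, []):
                z = previous[other] + edge_embedding[label]
                # tanh is an assumed nonlinearity, not stated in the text.
                messages.append([math.tanh(sum(w * v for w, v in zip(row, z)))
                                 for row in weight])
            if messages:
                # Element-wise max over the messages from all neighbors.
                current[node] = [max(values) for values in zip(*messages)]
            else:
                current[node] = [0.0] * len(weight)  # isolated node (assumption)
        layers.append(current)
    # V_t(v) = [V_t^0(v), ..., V_t^K(v)]
    return {node: [x for layer in layers for x in layer[node]] for node in layer0}


layer0 = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.5, 0.5]}
neighbors = {"a": [("b", "r"), ("c", "r")], "b": [("a", "r")], "c": [("a", "r")]}
R = {"r": [0.0]}
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
embeddings = propagate(layer0, neighbors, R, W, num_layers=1)
print(embeddings["a"])
```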

One embodiment, as presented in fig. 4, has the following benefits: the graph embedding feature of each node in the initialized knowledge graph is determined by using the structured knowledge base information and the unstructured dialogue sentence information in the question-answering system. Because both the structured features and the unstructured features are considered when generating the graph embedding features of the knowledge graph nodes, the real dialogue environment information of the question-answering system can be effectively captured. This embodiment can therefore better assist in simulating and generating the dialogue sentences of a real speaker in the question-answering process.

Referring now to FIG. 5, there is illustrated a schematic diagram of a computer system 500 suitable for use in implementing an electronic device of an embodiment of the present application. The electronic device shown in fig. 5 is only an example and should not be construed as limiting the functionality and scope of use of the embodiments of the present application.

As shown in FIG. 5, the computer system 500 includes a central processing unit (CPU) 501, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 502 or a program loaded from a storage section 506 into a random access memory (RAM) 503. The RAM 503 also stores various programs and data required for the operation of the system 500. The CPU 501, ROM 502, and RAM 503 are connected to each other through a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.

The following components are connected to the I/O interface 505: a storage section 506 including a hard disk or the like; and a communication section 507 including a network interface card such as a LAN (local area network) card, a modem, or the like. The communication section 507 performs communication processing via a network such as the Internet. A drive 508 is also connected to the I/O interface 505 as needed. A removable medium 509, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 508 as needed, so that a computer program read from it can be installed into the storage section 506 as needed.

In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network via the communication portion 507 and/or installed from the removable medium 509. The above-described functions defined in the method of the present application are performed when the computer program is executed by a Central Processing Unit (CPU) 501. The computer readable medium according to the present application may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. 
In the present application, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.

Computer program code for carrying out operations of the present application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

As another aspect, the present application also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments, or may exist alone without being assembled into the apparatus. The computer-readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to: acquire an input sentence; determine, based on an initialized knowledge graph obtained by pre-initialization, the nodes in the initialized knowledge graph corresponding to the input sentence; calculate the structured features and the unstructured features of each node; and determine the graph embedding features of each node in the initialized knowledge graph based on the determined structured features and unstructured features by using a confidence propagation mechanism, and generate the knowledge graph of the question-answering system.

The above description is only illustrative of the preferred embodiments of the present application and of the principles of the technology employed. Persons skilled in the art will appreciate that the scope of the application is not limited to technical solutions formed by the specific combinations of the technical features described above, but also covers other technical solutions formed by any combination of those technical features or their equivalents without departing from the inventive concept described above, for example, technical solutions formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) the present application.

Claims

1. A knowledge graph generation method for a question-answering system, comprising the following steps:

generating an initialized knowledge graph based on a dialogue sample library;

acquiring an input sentence;

determining each node in the initialized knowledge graph corresponding to the input sentence, determining structured features of each node, and determining unstructured features of each node;

determining graph embedding characteristics of each node in the initialized knowledge graph based on the determined structured characteristics and unstructured characteristics by using a confidence propagation mechanism, and generating a knowledge graph of a question-answering system;

wherein the determining each node in the initialized knowledge-graph corresponding to the input sentence comprises:

matching all words in the input sentence with all nodes in the initialized knowledge graph, and determining each node corresponding to all words in the input sentence;

wherein said determining the structured characteristics of each node comprises:

determining a one-hot vector of the number of occurrences of each node, wherein the one-hot vector of the number of occurrences represents the number of times each node occurs in all sentences stored in the question-answering system;

determining a one-hot vector of the node type of each node, wherein the one-hot vector of the node type represents the type of each node;

determining a one-hot vector of the occurrence status of each node, wherein the one-hot vector of the occurrence status represents whether each node occurs in the input sentence;

concatenating the one-hot vectors of the number of occurrences, the node type, and the occurrence status to determine the structured features of each node;

wherein said determining unstructured characteristics of each node comprises:

generating an entity set based on the input sentence;

determining statement embedding characteristics of the input statement;

and determining unstructured characteristics of each node based on the entity set and the statement embedded characteristics.

2. The method of claim 1, the generating an initialization knowledge-graph based on a dialog sample library, comprising:

generating nodes and edges based on a structured knowledge base in the dialogue sample library;

updating the nodes and edges based on unstructured dialogue statements in the dialogue sample library;

the dialogue sample library comprises a structured knowledge base and unstructured dialogue sentences, and the nodes and the edges form an initialization knowledge graph.

3. The method of claim 2, the generating nodes and edges based on structured knowledge bases in the dialog sample library, comprising:

generating a node in the initialized knowledge-graph, the node comprising: project nodes, attribute nodes and entity nodes;

and generating edges in the initialized knowledge graph, wherein the edges represent the relationship between different nodes.

4. The method of claim 2, the updating the nodes and edges based on unstructured dialogue statements in the dialogue sample library, comprising:

if the unstructured sentences in the dialogue sample library contain nodes which are not in the initialized knowledge graph, adding new nodes, and updating edges according to node relations.

5. The method of claim 1, the generating an entity set based on the input statement, comprising:

initializing the entity set as an empty set;

if the input sentence contains entity nodes in the initialized knowledge graph, determining an entity node set as the entity set;

and if the input sentence does not contain any entity node in the initialized knowledge graph, using the entity set corresponding to the previous sentence as the entity set, wherein the previous sentence is the sentence stored in the question-answering system immediately before the input sentence.

6. The method of claim 1, the determining statement embedding characteristics of the input statement, comprising:

taking the input sentence as the input of a recurrent neural network;

and taking the value of the last hidden layer of the recurrent neural network as the statement embedded feature of the input statement to be output.

7. The method of claim 1, the determining graph-embedded features of each node in the initialized knowledge-graph based on the determined structured features and unstructured features using a confidence propagation mechanism, generating a knowledge-graph of a question-answering system, comprising:

layering nodes in the initialized knowledge graph;

determining the concatenation of the determined structured features and unstructured features as the graph embedding features of the layer-0 nodes in the initialized knowledge graph;

updating graph embedding characteristics of each layer of nodes in the initialized knowledge graph by using a confidence propagation mechanism, wherein the confidence propagation mechanism updates the knowledge graph by using a method of transmitting information between nodes;

and concatenating the graph embedding features of each layer of nodes, and generating a knowledge graph of the question-answering system.

8. The method of claim 1, the method further comprising:

and sending the knowledge graph of the question-answering system to target display equipment, and controlling the target display equipment to display the knowledge graph.

9. An electronic device, comprising:

one or more processors;

a storage means for storing one or more programs;

the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-8.

10. A computer readable medium having stored thereon a computer program, wherein the program when executed by a processor implements the method of any of claims 1-8.
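The structured-feature construction of claim 1 and the entity-set rule of claim 5 can be illustrated with a short sketch. It is an illustration under stated assumptions, not the claimed implementation: the function names, the clipping of the occurrence count at `max_count` so that the one-hot vector has a fixed length, and the word-level matching of entities are all hypothetical choices made for the example. The node type names follow claim 3.

```python
NODE_TYPES = ("project", "attribute", "entity")  # node types named in claim 3


def one_hot(index, size):
    """Return a list of `size` zeros with a single 1 at position `index`."""
    vec = [0] * size
    vec[index] = 1
    return vec


def structured_feature(count, node_type, in_sentence, max_count=5):
    """Concatenate the three one-hot vectors recited in claim 1.

    count:       number of times the node occurs in the stored sentences
    node_type:   one of NODE_TYPES
    in_sentence: whether the node occurs in the current input sentence
    max_count:   counts above this value share one bucket (an assumption)
    """
    count_vec = one_hot(min(count, max_count), max_count + 1)
    type_vec = one_hot(NODE_TYPES.index(node_type), len(NODE_TYPES))
    seen_vec = one_hot(1 if in_sentence else 0, 2)
    return count_vec + type_vec + seen_vec


def entity_set(sentence_words, entity_nodes, previous_entities):
    """Entity set of claim 5: entities found in the input sentence, or the
    previous sentence's entity set when the input sentence mentions none."""
    found = {word for word in sentence_words if word in entity_nodes}
    return found if found else set(previous_entities)
```

For example, `structured_feature(2, "entity", True, max_count=3)` yields a 9-element vector: 4 positions for the count bucket, 3 for the node type, and 2 for the occurrence status.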