CN112163109A - Entity disambiguation method and system based on picture

Info

Publication number: CN112163109A
Application number: CN202011015942.9A
Authority: CN
Inventors: 沈志宏; 曾成林; 周园春; 赵子豪
Original assignee: Computer Network Information Center of CAS
Current assignee: Computer Network Information Center of CAS
Priority date: 2020-09-24
Filing date: 2020-09-24
Publication date: 2021-01-01

Abstract

The invention relates to a picture-based entity disambiguation method and system. The method comprises the following steps: organizing multimodal data as an attribute graph, wherein an entity serves as a node of the attribute graph and data describing the entity serves as node attributes; storing the nodes and node attributes of the attribute graph in an attribute graph storage system and the pictures corresponding to the nodes in a picture storage system, the two storage systems being associated through the picture attributes of the nodes; and, for two entities to be disambiguated, retrieving the pictures corresponding to the two entities from the attribute graph storage system and the picture storage system, and comparing whether the pictures refer to the same entity, thereby achieving entity disambiguation. The invention provides a new approach to entity disambiguation based on multimodal data: it uses the correspondence between pictures and entities to resolve unclear entity references caused by textual ambiguity, and it exploits the rich information contained in multimodal data to simplify the disambiguation process and improve overall efficiency.

Description

Entity disambiguation method and system based on picture

Technical Field

The invention relates to the fields of entity disambiguation, multimodal data fusion management, and data processing and analysis, and provides a picture-based entity disambiguation technique and a system implementing it.

Background

Entity disambiguation, also known as semantic disambiguation, is a technique for resolving the ambiguity produced by entities that share the same name. The diversity and ambiguity of text leave entity references unclear, and the purpose of entity disambiguation is to resolve these unclear references. Many entity disambiguation techniques exist, and they fall mainly into three types: clustering-based methods, entity-linking-based methods, and methods oriented to structured text. Clustering-based methods must compute a series of metrics over the set of entities to be disambiguated and disambiguate through the similarity of those metrics. Entity-linking-based methods must introduce a new knowledge base, normalize both the knowledge base and the set of entities to be disambiguated, and then disambiguate by comparison. Methods oriented to structured text operate on data that usually lacks context, so they require additional information and do not generalize. Existing entity disambiguation techniques are mostly based on text analysis and processing; they are complex, computationally expensive, and perform poorly.

In a multimodal dataset, entities have corresponding information such as pictures, recordings, and videos. Image recognition and classification based on artificial intelligence currently perform well, so a picture-based entity disambiguation method can use these techniques to resolve the ambiguity caused by textual semantics, and it is easy to compute, operate, and implement. Research on picture-based entity disambiguation is therefore of significant value.

Disclosure of Invention

The invention aims to provide an entity disambiguation technology based on pictures and a system implementation of the technology.

The technical scheme of the invention is as follows:

A picture-based entity disambiguation method, comprising the following steps:

1) Data organization: organizing the multimodal data as an attribute graph, wherein an entity serves as a node of the attribute graph and data describing the entity serves as node attributes; the attribute graph storage system stores the nodes and node attributes of the attribute graph, the picture storage system stores the pictures corresponding to the nodes, and the attribute graph storage system and the picture storage system are associated through the picture attributes of the nodes;

2) Conversion of the disambiguation task: for two entities to be disambiguated, retrieving the pictures corresponding to the two entities from the attribute graph storage system and the picture storage system, and comparing whether the pictures corresponding to the two entities refer to the same entity, thereby achieving entity disambiguation.

Further, associating the attribute graph storage system and the picture storage system through the picture attributes of the nodes includes: storing each picture held in the picture storage system in the attribute graph storage system as a picture attribute of the entity, wherein the picture attribute is an identifier and the real picture data is obtained from the picture storage system through that identifier.

Further, retrieving the pictures corresponding to the two entities from the attribute graph storage system and the picture storage system includes: returning node data through the attribute graph storage system to obtain the picture attribute of the node, and using the content of the picture attribute as an id identifier to obtain the real picture data from the picture storage system.

Further, the attribute graph storage system is implemented with a neo4j graph database, and the picture storage system is implemented with TFS; when entity and picture data are stored, the picture is first stored in TFS to obtain its id identifier, and the id identifier is then stored in the neo4j graph database as a picture attribute of the node.
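The store-then-link order described above can be illustrated with a minimal Python sketch. It uses in-memory dictionaries as stand-ins for neo4j and TFS; the class names PictureStore and GraphStore, their methods, and the content-hash id scheme are assumptions made for illustration and are not the APIs of either system.

```python
import hashlib


class PictureStore:
    """In-memory stand-in for the picture storage system (TFS in the text)."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # Hypothetical id scheme: a content hash; a real store assigns its own ids.
        pic_id = hashlib.sha256(data).hexdigest()[:16]
        self._blobs[pic_id] = data
        return pic_id

    def get(self, pic_id: str) -> bytes:
        return self._blobs[pic_id]


class GraphStore:
    """In-memory stand-in for the attribute graph storage system (neo4j in the text)."""

    def __init__(self):
        self._nodes = {}
        self._next_id = 0

    def create_node(self, label: str, **attrs) -> int:
        node_id = self._next_id
        self._next_id += 1
        self._nodes[node_id] = {"label": label, **attrs}
        return node_id

    def get_node(self, node_id: int) -> dict:
        return self._nodes[node_id]


def store_entity(graph, pictures, name, picture_bytes):
    # Step 1: store the picture first and obtain its id identifier.
    pic_id = pictures.put(picture_bytes)
    # Step 2: store the id (not the picture itself) as the node's picture attribute.
    return graph.create_node("person", name=name, photo=pic_id)


def load_picture(graph, pictures, node_id):
    # Read the node, take its picture attribute, and resolve it in the picture store.
    return pictures.get(graph.get_node(node_id)["photo"])


graph, pictures = GraphStore(), PictureStore()
node = store_entity(graph, pictures, "test", b"raw image bytes")
```

In this sketch the graph node holds only a short string id, so the graph store stays small while the picture bytes live in the separate store.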

Further, comparing whether the pictures corresponding to the two entities refer to the same entity to achieve entity disambiguation includes: calling a built-in integrated AI algorithm to compare the picture data, judging the two to be the same entity if the AI algorithm reports a high picture similarity, and judging them to be different entities if the similarity is low.
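A similarity-based decision of this kind might look like the following Python sketch. The text does not specify the AI algorithm or what counts as high or low similarity, so the cosine similarity over feature vectors, the 0.8 threshold, and the sample vectors are all assumptions chosen for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def same_entity_by_similarity(features_a, features_b, threshold=0.8):
    # The threshold is a hypothetical cut-off; the text only says "high" vs "low".
    return cosine_similarity(features_a, features_b) >= threshold


photo_1 = [0.9, 0.1, 0.4]  # made-up feature vector for the first entity's picture
photo_2 = [0.8, 0.2, 0.5]  # made-up feature vector for the second entity's picture
photo_3 = [0.1, 0.9, 0.0]  # made-up feature vector for an unrelated picture
```

In practice the feature vectors would come from an image model, and the threshold would need to be tuned on labeled pairs.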

Further, comparing whether the pictures corresponding to the two entities refer to the same entity to achieve entity disambiguation includes: calling the picture classification algorithm of the AI algorithm module to classify the pictures, judging the two to be the same entity if the classification results are the same, and judging them to be different entities if the classification results differ.
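The classification-based variant can be sketched in the same way. The nearest-centroid classifier below is only a stand-in for the unspecified picture classification algorithm, and the labels and centroid values are hypothetical.

```python
def classify(features, centroids):
    """Return the label of the nearest centroid (squared Euclidean distance)."""
    def dist(label):
        return sum((f - c) ** 2 for f, c in zip(features, centroids[label]))
    return min(centroids, key=dist)


def same_entity_by_class(features_a, features_b, centroids):
    # Same entity if and only if both pictures receive the same class label.
    return classify(features_a, centroids) == classify(features_b, centroids)


# Hypothetical class centroids; a real system would learn these from labeled pictures.
centroids = {"person_A": [1.0, 0.0], "person_B": [0.0, 1.0]}
```

Unlike the similarity variant, this decision needs no threshold, but it can only separate entities that the classifier has classes for.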

A picture-based entity disambiguation system using the above method, comprising:

a multi-modal data management module for organizing the multimodal data as an attribute graph, wherein an entity serves as a node of the attribute graph and data describing the entity serves as node attributes; the multi-modal data management module comprises an attribute graph storage system and a picture storage system, wherein the attribute graph storage system stores the nodes and node attributes of the attribute graph, the picture storage system stores the pictures corresponding to the nodes, and the attribute graph storage system and the picture storage system are associated through the picture attributes of the nodes;

a query language parsing module for parsing the query language and generating an execution plan;

and an execution engine for executing the execution plan generated by the query language parsing module: for two entities to be disambiguated, it retrieves the pictures corresponding to the two entities from the attribute graph storage system and the picture storage system and compares whether the pictures refer to the same entity, thereby achieving entity disambiguation.

Furthermore, the query language is the cypher language. Cypher statements entered by the user are first parsed by the query language parsing module, which accepts three classes of cypher statements and generates an execution plan. The three classes are storage statements, query statements, and operation statements: a storage statement stores or updates data, a query statement queries data, and an operation statement contains an operation, for which the corresponding AI algorithm is called to operate on the pictures and the result is returned.

The beneficial effects of the invention are as follows:

A new approach to entity disambiguation based on multimodal data is provided. The correspondence between pictures and entities is used to resolve unclear entity references caused by textual ambiguity. The rich information carried by multimodal data is exploited to simplify the entity disambiguation process and improve overall efficiency.

Drawings

FIG. 1 is a system architecture diagram of the present invention.

FIG. 2 is an entity disambiguation flow diagram of the present invention.

FIG. 3 is a schematic diagram of the user disambiguation case of the present invention.

Detailed Description

The invention is further described below through specific embodiments in conjunction with the accompanying drawings.

The entity disambiguation method based on the picture comprises a data organization method and a disambiguation task conversion method.

The data organization method organizes the multimodal data as an attribute graph. An entity serves as a node of the attribute graph, and data describing the entity serves as node attributes. Taking a person as an example, the person is constructed as a node of the attribute graph, and data related to the person, such as age, name, contact information, and photo, serve as the attributes of that node. Data storage uses a dual storage system consisting of an attribute graph storage system and a picture storage system. The attribute graph storage system stores the nodes of the attribute graph and their attributes, and the picture storage system stores the pictures of the nodes (that is, the pictures corresponding to the entities). The two storage systems are associated through the picture attributes of the nodes: each picture stored in the picture storage system is recorded in the attribute graph storage system as a picture attribute of its entity. The picture attribute stored in the attribute graph storage system is not the actual picture but an identifier through which the actual picture data can be obtained from the picture storage system.

The conversion method of the disambiguation task converts the entity disambiguation problem into a picture comparison problem. To disambiguate two entities, the representative pictures belonging to the two entities are retrieved and compared to determine whether they refer to the same entity. For example, to decide whether two entities are the same person, the photos of the two persons are retrieved first and then compared. The disambiguation task is expressed in the extended cypher language as match [n1, n2] return isEqual(n1.photo, n2.photo), where n1 and n2 represent person nodes, n1.photo and n2.photo represent the photos corresponding to those persons, and isEqual is a method call supported by the extended cypher language. Whether the statement returns true or false determines whether the two entities are the same person.

The system implementing the picture-based entity disambiguation technique comprises the following components: 1) a multi-modal data management module; 2) a query language parsing module; 3) an execution engine. FIG. 1 is an architecture diagram of the system, which includes a storage layer, an intermediate layer, and an interface layer. The multi-modal data management module is located in the storage layer and comprises a neo4j storage system and a TFS storage system. The query language parsing module and the execution engine are located in the intermediate layer, which also contains the built-in AI algorithms.

The multi-modal data management module comprises an attribute graph storage system and a picture storage system. The attribute graph storage system is implemented with a neo4j graph database and stores the entities. The picture storage system is implemented with TFS (Taobao File System, a highly scalable, highly available, high-performance distributed file system for Internet services) and stores the pictures corresponding to the entities. The contents of the two storage systems are associated by storing the picture as an attribute of the entity. In cypher this attribute is written n.photo, where n is a node stored in neo4j and photo is an attribute of that node. The attribute is an id identifier through which the real picture data can be obtained from the TFS picture storage system, so the photo attribute of a neo4j node corresponds to one picture in TFS. When entity and picture data are stored, the picture is first stored in the TFS system to obtain its id identifier, and the id is then stored in the neo4j database as the content of the node's photo attribute. When the real picture data corresponding to an entity is to be obtained from the system, the node data is first returned by neo4j to obtain the node's photo attribute, and the attribute content is then used as the id identifier to obtain the real picture data from the TFS system.

The query language parsing module is responsible for parsing the extended cypher language and generating an execution plan. Cypher statements entered by the user through the interface are first parsed by this module. The parsing module accepts three classes of cypher statements and generates an execution plan, which the execution engine executes before returning the result. The three classes are storage statements, query statements, and operation statements. A storage statement stores or updates data. For example, for the statement create (n:person {name: 'test', photo: tfs.put('test.png')}) return n, which stores entity and picture data, the parsing module recognizes tfs.put and parses the operation into storing the picture in the TFS picture storage system, using the returned id as the photo attribute, and then creating the neo4j node. A query statement is a simple query of data. For example, the statement that finds the nodes labeled person and returns their photo attribute is match (n:person) return n.photo, which the parsing module parses into the following operations: first match and return the qualifying nodes in the neo4j system, then read the corresponding photo attributes. The photo attribute here is only an id identifier, not the real data. To obtain the real picture data, the query statement is changed to match (n:person) return tfs.get(n.photo); after parsing it, the parsing module generates a new operation sequence in which the real picture data is obtained from the picture system through the photo attribute. An operation statement is a statement that contains an operation. For example, for the statement that disambiguates two entities, match [n1:person name='test1', n2:person name='test2'] return isEqual(n1.photo, n2.photo), the parsing module parses the statement into the following operations: first obtain the qualifying n1 and n2 from neo4j, then pass the corresponding photo attributes into the isEqual method. Inside isEqual, the real picture data is obtained from the picture system according to the photo attributes passed in, the corresponding AI algorithm is called to operate on the pictures, and the result is returned.
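The routing of the three statement classes to operation sequences can be sketched in Python as follows. This is a keyword-based approximation rather than a real cypher parser; the function names, the step names in the plans, and the use of standard node-pattern syntax in the sample statements are assumptions made for illustration.

```python
import re


def classify_statement(statement: str) -> str:
    """Classify an extended-cypher statement as storage, query, or operation."""
    text = statement.strip().lower()
    # An operation statement contains a method call such as isEqual(...).
    if "isequal(" in text.replace(" ", ""):
        return "operation"
    # A storage statement creates or updates data.
    if re.search(r"\b(create|merge|set)\b", text) or "tfs.put" in text:
        return "storage"
    return "query"


def plan_for(statement: str) -> list:
    """Return a hypothetical operation sequence for the statement."""
    kind = classify_statement(statement)
    compact = statement.lower().replace(" ", "")
    if kind == "storage":
        steps = ["create_node"]
        if "tfs.put(" in compact:
            # The picture must be stored first so its id can become the attribute.
            steps.insert(0, "store_picture")
        return steps
    if kind == "query":
        steps = ["match_nodes", "read_photo_id"]
        if "tfs.get(" in compact:
            steps.append("fetch_picture")
        return steps
    return ["match_nodes", "read_photo_id", "fetch_picture", "run_ai_algorithm"]


store_stmt = "create (n:person {name: 'test', photo: tfs.put('test.png')}) return n"
id_stmt = "match (n:person) return n.photo"
data_stmt = "match (n:person) return tfs.get(n.photo)"
op_stmt = "match (n1:person), (n2:person) return isEqual(n1.photo, n2.photo)"
```

For example, plan_for(data_stmt) differs from plan_for(id_stmt) only by the extra fetch_picture step, mirroring the change from returning the id to returning the real picture data.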

The execution engine executes the operation sequence generated by the query language parsing module; the specific disambiguation flow is shown in FIG. 2. For the storage statement example, the execution engine first stores test.png in the TFS system, uses the returned id identifier as the photo attribute of the node, and then creates a person node in neo4j containing the name and photo attributes. For the operation statement example (entity disambiguation), the execution engine first queries the qualifying nodes from neo4j through the match query and obtains their photo attributes, then passes the photo attributes into the isEqual method. Inside isEqual, the tfs.get method is first called with the photo attributes to obtain the real picture data, and the built-in integrated AI algorithm is then called to compare the picture data and return a result. If the AI algorithm reports a high picture similarity, the two are judged to be the same entity; if the similarity is low, they are judged to be different entities. Alternatively, the picture classification algorithm of the AI algorithm module can be called to classify the retrieved pictures: if the classification results are the same, the two are judged to be the same entity, and if they differ, different entities.
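The execution of an operation statement can be traced with a short Python sketch: match the nodes, read their photo ids, resolve the ids to picture data, and hand the data to a comparison function. The dictionaries stand in for neo4j and TFS, and byte_overlap is a toy placeholder for the AI algorithm; all names, sample bytes, and the 0.8 threshold are hypothetical.

```python
def match_by_name(nodes, name):
    """Return the first node whose name attribute equals the given name."""
    return next(n for n in nodes if n["name"] == name)


def is_equal(photo_id_1, photo_id_2, picture_store, compare):
    # First resolve the ids to real picture data (the tfs.get step in the text) ...
    pic_1 = picture_store[photo_id_1]
    pic_2 = picture_store[photo_id_2]
    # ... then hand the data to the comparison algorithm and return its verdict.
    return compare(pic_1, pic_2)


def byte_overlap(pic_1, pic_2, threshold=0.8):
    # Toy stand-in for the AI algorithm: fraction of positions with equal bytes.
    # Assumes non-empty pictures; the threshold is a made-up cut-off.
    same = sum(a == b for a, b in zip(pic_1, pic_2))
    return same / max(len(pic_1), len(pic_2)) >= threshold


# Stand-in picture store: id identifier -> picture bytes.
picture_store = {"id-1": b"AAAAAAAAAA", "id-2": b"AAAAAAAAAB", "id-3": b"ZZZZZZZZZZ"}
# Stand-in graph store: person nodes whose photo attribute is only an id.
nodes = [
    {"label": "person", "name": "test1", "photo": "id-1"},
    {"label": "person", "name": "test2", "photo": "id-2"},
    {"label": "person", "name": "test3", "photo": "id-3"},
]

n1, n2, n3 = (match_by_name(nodes, name) for name in ("test1", "test2", "test3"))
result_same = is_equal(n1["photo"], n2["photo"], picture_store, byte_overlap)
result_diff = is_equal(n1["photo"], n3["photo"], picture_store, byte_overlap)
```

Passing the comparison function as a parameter keeps the retrieval steps unchanged when a similarity algorithm is swapped for a classification algorithm.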

An example of a picture-based entity disambiguation process on a person relationship graph is provided below:

An organization has constructed a person relationship graph from the authors of papers in a specific research field. The graph takes persons as nodes, with attributes such as name and photo. Because the graph is constructed automatically by a program, a node may have been created several times for the same person, and two entities suspected of being the same person now need to be disambiguated.

1) As shown in FIG. 3, both entities are labeled person. Their names differ but share the same initials: one is TG and the other is Tom Grem. The two entities must be disambiguated to determine whether they are the same person.

2) Construct the operation statement match [n1:person n1.name='TG', n2:person n2.name='Tom Grem'] return isEqual(n1.photo, n2.photo).

3) By constructing the operation statement, the entity disambiguation task is converted into a picture comparison problem.

4) The parsing module of the system parses the statement and generates a query plan.

5) The execution engine executes the query plan generated in step 4): it retrieves the qualifying n1 and n2 from neo4j, obtains their photo attributes, calls the tfs.get method inside isEqual to obtain the real picture data from the TFS storage system, and finally calls the built-in integrated AI algorithm to compare the picture data and return the result.

6) The two entities are disambiguated according to the returned result.

Based on the same inventive concept, another embodiment of the present invention provides an electronic device (computer, server, smartphone, etc.) comprising a memory and a processor, the memory storing a computer program configured to be executed by the processor, the computer program comprising instructions for performing the steps of the inventive method.

Based on the same inventive concept, another embodiment of the present invention provides a computer-readable storage medium (e.g., ROM/RAM, magnetic disk, optical disk) storing a computer program, which when executed by a computer, performs the steps of the inventive method.

The attribute graph storage system of the present invention may be implemented with a suitable graph database other than neo4j, and the picture storage system may be implemented with a suitable storage system other than TFS.

Parts of the invention not described in detail are well known to the person skilled in the art.

The above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and a person skilled in the art can make modifications or equivalent substitutions to the technical solution of the present invention without departing from the spirit and scope of the present invention, and the scope of the present invention should be determined by the claims.

Claims

1. A picture-based entity disambiguation method, comprising the steps of:

organizing the multimodal data as an attribute graph, wherein an entity serves as a node of the attribute graph, and data describing the entity serves as a node attribute;

the attribute graph storage system is used for storing the nodes and the node attributes of the attribute graph, the picture storage system is used for storing the pictures corresponding to the nodes, and the attribute graph storage system and the picture storage system are associated through the picture attributes of the nodes;

and for two entities to be disambiguated, searching pictures corresponding to the two entities according to the attribute graph storage system and the picture storage system, and comparing whether the pictures corresponding to the two entities refer to the same entity to realize entity disambiguation.

2. The method of claim 1, wherein associating the attribute graph storage system and the picture storage system through the picture attributes of the nodes comprises: storing each picture held in the picture storage system in the attribute graph storage system as a picture attribute of the entity, wherein the picture attribute is an identifier, and the real picture data is obtained from the picture storage system through the identifier.

3. The method of claim 1, wherein retrieving the pictures corresponding to the two entities from the attribute graph storage system and the picture storage system comprises: returning node data through the attribute graph storage system to obtain the picture attribute of the node, and using the content of the picture attribute as an id identifier to obtain the real picture data from the picture storage system.

4. The method according to claim 1, wherein the attribute graph storage system is implemented with a neo4j graph database, and the picture storage system is implemented with TFS; when entity and picture data are stored, the picture is first stored in TFS to obtain its id identifier, and the id identifier is then stored in the neo4j graph database as a picture attribute of the node.

5. The method of claim 1, wherein comparing whether the pictures corresponding to the two entities refer to the same entity to achieve entity disambiguation comprises: calling a built-in integrated AI algorithm to compare the picture data, judging the two to be the same entity if the AI algorithm reports a high picture similarity, and judging them to be different entities if the similarity is low.

6. The method of claim 1, wherein comparing whether the pictures corresponding to the two entities refer to the same entity to achieve entity disambiguation comprises: calling the picture classification algorithm of the AI algorithm module to classify the pictures, judging the two to be the same entity if the classification results are the same, and judging them to be different entities if the classification results differ.

7. A system for image-based entity disambiguation using the method of any of claims 1 through 6, comprising:

8. The system according to claim 7, wherein the query language is the cypher language, cypher statements entered by the user are first parsed by the query language parsing module, and the query language parsing module accepts three classes of cypher statements and generates the execution plan; the three classes of cypher statements are storage statements, query statements, and operation statements, wherein a storage statement stores or updates data, a query statement queries data, and an operation statement contains an operation, for which the corresponding AI algorithm is called to operate on the pictures and the result is returned.

9. An electronic apparatus, comprising a memory and a processor, the memory storing a computer program configured to be executed by the processor, the computer program comprising instructions for performing the method of any of claims 1 to 6.

10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a computer, implements the method of any one of claims 1 to 6.