CN111046213B - Knowledge base construction method based on image recognition - Google Patents
- Publication number: CN111046213B (application CN201911309368.5A)
- Authority
- CN
- China
- Prior art keywords
- image
- knowledge base
- cnn
- entity
- attribute
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/75—Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
- G06V10/751—Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
Abstract
The invention discloses a knowledge base construction method based on image recognition. The method first obtains a series of images related to a target knowledge base and preprocesses them; it then uses a group of three neural networks to recognize each input image and obtain the scene, entity, and attribute information of the image content; the recognized entities and attributes are matched according to the degree of coincidence between the pixel range occupied by an entity and the pixel range occupied by an attribute; finally, the knowledge base is populated with (scene, entity, attribute) triples. By using images as a knowledge base construction source, the method broadens the ways in which knowledge bases can be built, reduces knowledge base redundancy, improves query efficiency, and enables the accumulation and reuse of image-derived knowledge.
Description
Technical Field
The invention belongs to the technical field of machine learning and information processing, and particularly relates to a knowledge base construction method based on image recognition.
Background
In the information age, Internet technology has steadily shortened communication distances around the world, allowing information to spread and be exchanged rapidly; moreover, with the rapid iteration of industrial technology, every end user has become not only a receiver of information but also, in some measure, a producer of it. One consequence of this enormous change is an explosive output of knowledge. This flood of knowledge can serve as the data foundation for new technologies; for example, deep learning, currently the technology of greatest public interest, is a compute-intensive approach built on large amounts of available training data, which can be collected, processed, organized, and aggregated into structured knowledge base data.
A knowledge base stores and extracts knowledge according to established rules. Because it uses an explicit knowledge storage format, a knowledge base can not only serve as an expert system in a specific domain but can also use the same storage rules to build an open knowledge base; for example, well-known knowledge bases such as DBpedia and Freebase all adopt the Resource Description Framework (RDF), so that they can be linked to one another. Constructing knowledge bases simplifies how people acquire information and improves the efficiency with which information is used, and is considered the foundation of the Web 3.0 semantic web.
Most existing knowledge base construction methods target structured or unstructured text, extracting linked data from a text source with a specific data-processing method. For example, DBpedia obtains and stores linked data as a knowledge base by parsing the WikiText markup used by Wikipedia. Patent CN106844723A, "Medical knowledge base construction method based on a question-answering system," processes data from public medical websites and medical datasets and derives associations between data items to build a knowledge base. Patent CN106650940A, "A method and apparatus for constructing a domain knowledge base," likewise extracts core concepts from text, based on the core concepts of the target domain and the target texts in which they occur, and decides whether to add each core concept to the knowledge base, thereby automating knowledge base construction.
Extracting entities, and the relationships among them, from text is a feasible and common knowledge base construction method, but it restricts the knowledge source to text alone. This limits the diversity of knowledge sources, and in some scenarios the approach cannot be applied at all. For example, in police surveillance equipment, a large amount of knowledge and information resides in long video recordings: whether suspicious persons appear in the footage, their movement paths, their clothing characteristics, the times at which people enter and leave, and so on, all play a critical role in helping the police solve cases quickly. At present, however, to obtain the information and knowledge in such imagery, people essentially have to identify useful information by watching for long periods and then manually integrate it to reach a judgment. This way of acquiring knowledge is not only inefficient; the knowledge obtained has low reusability, and it cannot be accumulated.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a knowledge base construction method based on image recognition. The method builds a target knowledge base using image recognition technology, fills the gap left by methods that cannot use images as a knowledge base construction source, improves the efficiency of knowledge acquisition, and is simple to implement.
In order to achieve the above object, the present invention provides a knowledge base construction method based on image recognition, which is characterized by comprising the following steps:
(1) acquiring a target image
Acquiring images containing a plurality of scenes and entities related to the knowledge base to be constructed, wherein each image contains a plurality of entities E1, E2, …, Ei, …, and each entity has multiple attribute values A1,1, A1,2, …, A2,1, A2,2, …, Ai,j, …;
(2) Target image preprocessing
Firstly, converting each image into a gray-scale image, then smoothing and sharpening the gray-scale image, and finally, storing all the images in an image library;
(3) building a neural network model
The neural network model is composed of three convolutional neural networks (CNNs): the second and third CNNs are arranged in parallel and cascaded after the first CNN, and the three are used for scene recognition, entity recognition, and attribute recognition, respectively. The result of scene recognition is represented by a one-hot vector S, the result of entity recognition by a one-hot vector E, and the result of attribute recognition by a one-hot vector A;
(4) recognizing images using neural network models
(4.1) randomly selecting an image from the image library and inputting the image into the neural network model;
(4.2) the first CNN recognizes the scene in the image and outputs a one-hot vector Sk, where the subscript k denotes the k-th image;
(4.3) after the first CNN finishes recognition, the second and third CNNs recognize the image simultaneously;
(4.3.1) the second CNN identifies the entities E1, E2, …, Ei, … in the image and records the pixel range of each entity in the image: P(E1), P(E2), …, P(Ei), …;
(4.3.2) the third CNN identifies the attributes A1,1, A1,2, …, A2,1, A2,2, …, Ai,j, … under each entity in the image and records the pixel range of each attribute in the image: P(A1,1), P(A1,2), …, P(A2,1), P(A2,2), …, P(Ai,j), …;
(5) Calculating the coincidence degree
Computing the coincidence degree O between the pixel range P(Ei) of entity Ei and the pixel range P(Ai,j) of attribute Ai,j;
(6) Constructing a target knowledge base
The coincidence degree O is compared with a preset coincidence threshold. If O is greater than or equal to the threshold, the recognized entity and attribute lie in the same pixel region, i.e., entity Ei and attribute Ai,j match each other, and the triple (Sk, Ei, Ai,j) is stored in the target knowledge base;
if O is less than the threshold, the coincidence degree between the pixel range P(Ei) of entity Ei and the pixel range P(Ai,j+1) of attribute Ai,j+1 is computed according to the method of step (5), and the process returns to step (6); this is repeated until all entities and attributes in the image have been matched;
the process then returns to step (4.1), where the neural network model recognizes the next image, which is processed according to steps (5) to (6) in turn, until all entities and attributes in all images in the image library have been matched, thereby completing the target knowledge base.
The object of the invention is achieved as follows:
the invention relates to a knowledge base construction method based on image recognition, which comprises the steps of firstly obtaining a series of images related to a target knowledge base and carrying out image preprocessing, then using three neural network groups to recognize input images so as to obtain scene, entity and attribute information of image contents, matching the output entities and attributes according to the pixel range occupied by the entities and the coincidence degree of the attributes in the pixel range, and finally completing construction of the knowledge base with scene, entity and attribute triples; therefore, the image is used as a knowledge base construction source, so that the construction method of the knowledge base is expanded, the redundancy of the knowledge base can be reduced, the query efficiency is improved, and the accumulation and reusability of image knowledge can be realized.
Drawings
FIG. 1 is a flow chart of a knowledge base construction method based on image recognition according to the invention;
FIG. 2 is a flow diagram of entity and attribute matching;
Detailed Description
The following description of embodiments of the invention, given with reference to the accompanying drawings, is provided so that those skilled in the art can better understand the invention. Note that in the following description, detailed descriptions of known functions and designs are omitted where they would obscure the subject matter of the invention.
Examples
FIG. 1 is a flow chart of a knowledge base construction method based on image recognition.
In this embodiment, as shown in fig. 1, the method for constructing a knowledge base based on image recognition of the present invention includes the following steps:
S1, acquiring a target image
Acquiring images containing a plurality of scenes and entities related to the knowledge base to be constructed, wherein each image contains a plurality of entities E1, E2, …, Ei, …, and each entity has multiple attribute values A1,1, A1,2, …, A2,1, A2,2, …, Ai,j, …;
In this embodiment, each image contains a scene, such as the sky, and multiple entities, such as people, objects, and animals. Each image must meet a basic standard of clarity, i.e., be sharp enough for the human eye to recognize its contents.
S2, preprocessing the target image
Firstly, converting each image into a gray-scale image, then smoothing and sharpening the gray-scale image, and finally, storing all the images in an image library;
In this embodiment, gray-level transformation is used to adjust the gray values of the image and hence its brightness, so that the image has moderate luminance; this makes it easier for the neural network group to distinguish object contours clearly and improves recognition accuracy. Smoothing filters out spurious noise in the image, and sharpening enhances regions of abrupt gray-level change, highlighting contour information in the image and aiding entity recognition;
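As an illustrative sketch of the preprocessing in step S2 (grayscale conversion, smoothing, sharpening), the following pure-Python fragment applies the ITU-R BT.601 luma weights, a 3x3 box blur, and unsharp masking to a tiny synthetic image. The function names and the specific filters are our assumptions; the patent does not fix a particular smoothing or sharpening operator, and a production pipeline would use a library such as OpenCV.

```python
def to_gray(rgb):
    # ITU-R BT.601 luma weights for RGB-to-gray conversion
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in rgb]

def box_blur(img):
    # 3x3 mean filter; border pixels are left unchanged for brevity
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(img[y + dy][x + dx]
                            for dy in (-1, 0, 1) for dx in (-1, 0, 1)) / 9.0
    return out

def sharpen(img, amount=1.0):
    # Unsharp masking: boost the difference from the smoothed image
    # (a real pipeline would clip the result back to [0, 255])
    blurred = box_blur(img)
    return [[img[y][x] + amount * (img[y][x] - blurred[y][x])
             for x in range(len(img[0]))] for y in range(len(img))]

# A 3x3 synthetic RGB "image" with a single bright center pixel
rgb = [[(0, 0, 0)] * 3, [(0, 0, 0), (255, 255, 255), (0, 0, 0)], [(0, 0, 0)] * 3]
gray = to_gray(rgb)
pre = sharpen(gray)
print(round(gray[1][1]))  # 255: the bright center survives grayscale conversion
```

Sharpening raises the center value above its grayscale level, which is exactly the contour-emphasizing effect the paragraph describes.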
S3, building a neural network model
The neural network model is composed of three convolutional neural networks (CNNs): the second and third CNNs are arranged in parallel and cascaded after the first CNN, and the three are used for scene recognition, entity recognition, and attribute recognition, respectively. The result of scene recognition is represented by a one-hot vector S, the result of entity recognition by a one-hot vector E, and the result of attribute recognition by a one-hot vector A;
In this embodiment the neural network model is built with CNNs; alternatively, networks such as AlexNet, VGGNet, ResNet, or GoogLeNet can be used. After the model is built it must be pre-trained so that it can accurately recognize scenes such as parks, parking lots, and campuses; entities such as people, objects, and animals; and attributes such as color, shape, and size. The specific training process is not detailed here;
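The three-CNN arrangement of step S3 can be sketched as a simple dataflow, with trivial lookup stubs standing in for the trained networks. Everything below (the label sets, the `recognize` function, the input dictionary) is illustrative only; the patent specifies just that the first CNN outputs a scene one-hot vector and the second and third CNNs output entity and attribute one-hot vectors in parallel.

```python
SCENES = ["park", "parking lot", "campus"]
ENTITIES = ["person", "object", "animal"]
ATTRIBUTES = ["color", "shape", "size"]

def one_hot(index, size):
    # e.g. one_hot(1, 3) -> [0, 1, 0]
    v = [0] * size
    v[index] = 1
    return v

def recognize(image):
    # Stand-ins for CNN inference; a real model maps pixels to class scores
    scene = one_hot(image["scene_id"], len(SCENES))                        # first CNN
    entities = [one_hot(i, len(ENTITIES)) for i in image["entity_ids"]]    # second CNN
    attributes = [one_hot(i, len(ATTRIBUTES)) for i in image["attr_ids"]]  # third CNN
    return scene, entities, attributes

S, E, A = recognize({"scene_id": 0, "entity_ids": [0], "attr_ids": [0, 2]})
print(S)  # [1, 0, 0]
```

The parallel placement of the second and third CNNs matters only for latency; logically each image yields one S vector plus one E vector per entity and one A vector per attribute, which is what the matching in steps S5 and S6 consumes.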
S4, recognizing images using the neural network model
S4.1, randomly selecting an image from the image library and inputting it into the neural network model;
S4.2, the first CNN recognizes the scene in the image and outputs a one-hot vector Sk, where the subscript k denotes the k-th image;
S4.3, after the first CNN finishes recognition, the second and third CNNs recognize the image simultaneously; the specific flow is shown in FIG. 2;
S4.3.1, the second CNN identifies the entities E1, E2, …, Ei, … in the image and records the pixel range of each entity in the image: P(E1), P(E2), …, P(Ei), …;
S4.3.2, the third CNN identifies the attributes A1,1, A1,2, …, A2,1, A2,2, …, Ai,j, … under each entity in the image and records the pixel range of each attribute in the image: P(A1,1), P(A1,2), …, P(A2,1), P(A2,2), …, P(Ai,j), …;
In this embodiment, the feature map output by the last convolutional layer of each CNN may be up-sampled until it is restored to the resolution of the input image; this enables pixel-level image segmentation, so that the pixel range of each entity and attribute in the image can be recorded.
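The up-sampling mentioned above can be illustrated with the simplest scheme, nearest-neighbor repetition; the patent does not fix a method, and real segmentation decoders typically use bilinear interpolation or transposed convolutions instead.

```python
def upsample_nearest(fmap, factor):
    # Repeat each feature-map cell `factor` times horizontally and vertically,
    # growing an h x w map to (h*factor) x (w*factor)
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(factor)]
        out.extend([wide[:] for _ in range(factor)])
    return out

# A coarse 2x2 "label map" restored toward input resolution
fmap = [[0, 1],
        [1, 0]]
up = upsample_nearest(fmap, 2)
print(len(up), len(up[0]))  # 4 4
```

Once the per-class map is back at input resolution, the pixel range P(·) of an entity or attribute is simply the set of pixel coordinates carrying its label.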
S5, calculating the coincidence degree
Computing the coincidence degree O between the pixel range P(Ei) of entity Ei and the pixel range P(Ai,j) of attribute Ai,j, where O lies in the range [0, 1];
S6, constructing a target knowledge base
The coincidence degree O is compared with a preset coincidence threshold of 0.7. If O is greater than or equal to the threshold, the recognized entity and attribute lie in the same pixel region, i.e., entity Ei and attribute Ai,j match each other, and the triple (Sk, Ei, Ai,j) is stored in the target knowledge base;
if O is less than the threshold, the coincidence degree between the pixel range P(Ei) of entity Ei and the pixel range P(Ai,j+1) of attribute Ai,j+1 is computed according to the method of step S5, and the process returns to step S6; this is repeated until all entities and attributes in the image have been matched;
The process then returns to step S4.1, where the neural network model recognizes the next image, which is processed according to steps S5 to S6 in turn, until all entities and attributes in all images in the image library have been matched, thereby completing the target knowledge base.
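Steps S5 and S6 can be sketched end to end as follows. The 0.7 threshold comes from this embodiment; the overlap measure (IoU) and the all-pairs matching loop are our assumptions, since the patent iterates over attributes per entity without fixing the overlap formula.

```python
THRESHOLD = 0.7  # preset coincidence threshold from the embodiment

def overlap(e_px, a_px):
    # IoU of two pixel sets (one plausible coincidence degree; see step S5)
    e, a = set(e_px), set(a_px)
    return len(e & a) / len(e | a) if (e or a) else 0.0

def build_triples(scene, entities, attributes):
    # entities / attributes: dicts mapping a name to its pixel set
    kb = []
    for e_name, e_px in entities.items():
        for a_name, a_px in attributes.items():
            if overlap(e_px, a_px) >= THRESHOLD:
                kb.append((scene, e_name, a_name))  # store (Sk, Ei, Ai,j)
    return kb

region = {(x, y) for x in range(3) for y in range(3)}
kb = build_triples("park",
                   {"person": region},
                   {"clothing:red": region,            # same pixels -> matches
                    "height:1.7-1.8m": {(9, 9)}})      # disjoint -> rejected
print(kb)  # [('park', 'person', 'clothing:red')]
```

In a real system the rejected attribute would simply be tested against the next entity, matching the step-(6) loop over j in the patent.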
Examples of the invention
This example constructs a personnel-information knowledge base from surveillance footage, to help public security organs quickly identify suspects, plot their movement trajectories, and solve cases faster.
First, an image set of case-related monitoring devices is collected and preprocessed according to the above method.
Second, the images are recognized using the neural network model. In this example the main entity recognition task is person recognition, and the attributes involved are body build (degree of thinness or fatness), clothing color, and height. These can be subdivided as follows: body build (thin, normal, fat); clothing color (red, orange, yellow, green, blue, purple); height (below 1.5 m, 1.5-1.6 m, 1.6-1.7 m, 1.7-1.8 m, 1.8-1.9 m, above 1.9 m).
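The height attribute above is a discretization of a continuous measurement into bins. A minimal sketch of that binning follows; the bin edges are copied from this example, while the function name and representation are hypothetical.

```python
HEIGHT_BINS = [1.5, 1.6, 1.7, 1.8, 1.9]  # bin upper edges, in metres
HEIGHT_LABELS = ["below 1.5 m", "1.5-1.6 m", "1.6-1.7 m",
                 "1.7-1.8 m", "1.8-1.9 m", "above 1.9 m"]

def height_label(h):
    # Return the label of the first bin whose upper edge exceeds h;
    # anything at or above the last edge falls into the open-ended top bin
    for edge, label in zip(HEIGHT_BINS, HEIGHT_LABELS):
        if h < edge:
            return label
    return HEIGHT_LABELS[-1]

print(height_label(1.75))  # 1.7-1.8 m
```

The body-build and clothing-color attributes are already categorical, so they map directly to one-hot vectors without such a binning step.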
Finally, the coincidence degree between the pixel range of each entity and the pixel range of each corresponding attribute is calculated, and entities are matched with their attributes according to the coincidence degree, thereby constructing the personnel-information knowledge base.
Although illustrative embodiments of the invention have been described above to help those skilled in the art understand it, the invention is not limited to the scope of those embodiments. To those skilled in the art, various changes are permissible as long as they remain within the spirit and scope of the invention as defined by the appended claims; everything that makes use of the inventive concept falls under the invention's protection.
Claims (2)
1. A knowledge base construction method based on image recognition is characterized by comprising the following steps:
(1) acquiring a target image
Acquiring images containing a plurality of scenes and entities related to the knowledge base to be constructed, wherein each image contains a plurality of entities E1, E2, …, Ei, …, and each entity has multiple attribute values A1,1, A1,2, …, A2,1, A2,2, …, Ai,j, …;
(2) Target image preprocessing
Firstly, converting each image into a gray-scale image, then smoothing and sharpening the gray-scale image, and finally, storing all the images in an image library;
(3) building a neural network model
The neural network model is composed of three convolutional neural networks (CNNs): the second and third CNNs are arranged in parallel and cascaded after the first CNN, and the three are used for scene recognition, entity recognition, and attribute recognition, respectively. The result of scene recognition is represented by a one-hot vector S, the result of entity recognition by a one-hot vector E, and the result of attribute recognition by a one-hot vector A;
(4) recognizing images using neural network models
(4.1) randomly selecting an image from the image library and inputting the image into the neural network model;
(4.2) the first CNN recognizes the scene in the image and outputs a one-hot vector Sk, where the subscript k denotes the k-th image;
(4.3) after the first CNN finishes recognition, the second and third CNNs recognize the image simultaneously;
(4.3.1) the second CNN identifies the entities E1, E2, …, Ei, … in the image and records the pixel range of each entity in the image: P(E1), P(E2), …, P(Ei), …;
(4.3.2) the third CNN identifies the attributes A1,1, A1,2, …, A2,1, A2,2, …, Ai,j, … in the image and records the pixel range of each attribute in the image: P(A1,1), P(A1,2), …, P(A2,1), P(A2,2), …, P(Ai,j), …;
(5) Calculating the coincidence degree
Computing the coincidence degree O between the pixel range P(Ei) of entity Ei and the pixel range P(Ai,j) of attribute Ai,j;
(6) Constructing a target knowledge base
The coincidence degree O is compared with a preset coincidence threshold. If O is greater than or equal to the threshold, the recognized entity and attribute lie in the same pixel region, i.e., entity Ei and attribute Ai,j match each other, and the triple (Sk, Ei, Ai,j) is stored in the target knowledge base;
if O is less than the threshold, the coincidence degree between the pixel range P(Ei) of entity Ei and the pixel range P(Ai,j+1) of attribute Ai,j+1 is computed according to the method of step (5), and the process returns to step (6); this is repeated until all entities and attributes in the image have been matched;
the process then returns to step (4.1), where the neural network model recognizes the next image, which is processed according to steps (5) to (6) in turn, until all entities and attributes in all images in the image library have been matched, thereby completing the target knowledge base.
2. The knowledge base construction method based on image recognition according to claim 1, wherein the value range of the coincidence degree O is [0, 1].
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911309368.5A CN111046213B (en) | 2019-12-18 | 2019-12-18 | Knowledge base construction method based on image recognition |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911309368.5A CN111046213B (en) | 2019-12-18 | 2019-12-18 | Knowledge base construction method based on image recognition |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111046213A CN111046213A (en) | 2020-04-21 |
CN111046213B (en) | 2021-12-10
Family
ID=70237877
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911309368.5A Active CN111046213B (en) | 2019-12-18 | 2019-12-18 | Knowledge base construction method based on image recognition |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111046213B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111598036B (en) * | 2020-05-22 | 2021-01-01 | 广州地理研究所 | Urban group geographic environment knowledge base construction method and system of distributed architecture |
CN112287656B (en) * | 2020-10-12 | 2024-05-28 | 四川语言桥信息技术有限公司 | Text comparison method, device, equipment and storage medium |
CN113854780A (en) * | 2021-09-30 | 2021-12-31 | 重庆清微文化旅游有限公司 | Intelligent dispensing method and system without cross contamination of materials |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1184810A3 (en) * | 1996-02-05 | 2004-03-10 | Texas Instruments Incorporated | Improvements in or relating to motion event detection |
CN108280132A (en) * | 2017-01-06 | 2018-07-13 | Tcl集团股份有限公司 | Establish the method and system in the individualized knowledge library of semantic image segmentation |
CN109983507A (en) * | 2016-12-21 | 2019-07-05 | 英特尔公司 | The positioning returned based on extensive CNN is carried out via two-dimensional map |
-
2019
- 2019-12-18 CN CN201911309368.5A patent/CN111046213B/en active Active
Non-Patent Citations (1)
Title |
---|
Zhang Bing, "The era of remote sensing big data and intelligent information extraction" (遥感大数据时代与智能信息提取), Geomatics and Information Science of Wuhan University, Dec. 2018, full text *
Also Published As
Publication number | Publication date |
---|---|
CN111046213A (en) | 2020-04-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108388888B (en) | Vehicle identification method and device and storage medium | |
CN113936339B (en) | Fighting identification method and device based on double-channel cross attention mechanism | |
CN104679863B (en) | It is a kind of based on deep learning to scheme to search drawing method and system | |
CN111046213B (en) | Knowledge base construction method based on image recognition | |
CN105930402A (en) | Convolutional neural network based video retrieval method and system | |
CA3069365A1 (en) | Generation of point of interest copy | |
CN109657715B (en) | Semantic segmentation method, device, equipment and medium | |
CN113689382B (en) | Tumor postoperative survival prediction method and system based on medical images and pathological images | |
CN105590099A (en) | Multi-user behavior identification method based on improved convolutional neural network | |
CN111783712A (en) | Video processing method, device, equipment and medium | |
CN106845513A (en) | Staff detector and method based on condition random forest | |
CN114332573A (en) | Multi-mode information fusion recognition method and system based on attention mechanism | |
CN104156464A (en) | Micro-video retrieval method and device based on micro-video feature database | |
Li et al. | Image manipulation localization using attentional cross-domain CNN features | |
Setiawan et al. | Sequential inter-hop graph convolution neural network (SIhGCN) for skeleton-based human action recognition | |
CN109766918A (en) | Conspicuousness object detecting method based on the fusion of multi-level contextual information | |
CN108446605B (en) | Double interbehavior recognition methods under complex background | |
Mohammad et al. | Searching surveillance video contents using convolutional neural network | |
CN113066074A (en) | Visual saliency prediction method based on binocular parallax offset fusion | |
CN111914772A (en) | Method for identifying age, and training method and device of age identification model | |
CN116701706A (en) | Data processing method, device, equipment and medium based on artificial intelligence | |
CN115953832A (en) | Semantic decoupling-based combined action recognition method of self-attention model | |
Zhang et al. | Skeleton-based action recognition with attention and temporal graph convolutional network | |
CN116110074A (en) | Dynamic small-strand pedestrian recognition method based on graph neural network | |
CN116955707A (en) | Content tag determination method, device, equipment, medium and program product |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |