CN111046213B - Knowledge base construction method based on image recognition - Google Patents
- Publication number: CN111046213B (application CN201911309368.5A)
- Authority
- CN
- China
- Prior art keywords
- image
- knowledge base
- cnn
- entity
- attribute
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/75—Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
- G06V10/751—Comparing pixel values or logical combinations thereof, or feature values having positional relevance, e.g. template matching
Abstract
The invention discloses a knowledge base construction method based on image recognition. The method first obtains a series of images related to a target knowledge base and preprocesses them; it then uses a group of three neural networks to recognize each input image and obtain the scene, entity, and attribute information of the image content; the recognized entities and attributes are matched according to the degree of coincidence between the pixel range occupied by an entity and the pixel range occupied by an attribute; finally, the knowledge base is populated with (scene, entity, attribute) triples. By using images as a knowledge base construction source, the method broadens the ways in which knowledge bases can be built, reduces knowledge base redundancy, improves query efficiency, and enables the accumulation and reuse of image-derived knowledge.
Description
Technical Field
The invention belongs to the technical field of machine learning and information processing, and particularly relates to a knowledge base construction method based on image recognition.
Background
In the information age, Internet technology has steadily shortened communication distances around the world, allowing information to spread and be exchanged rapidly; moreover, with the rapid iteration of industrial technology, every end user has become not only a receiver of information but also, in some measure, a producer of it. One consequence of this enormous change is an explosive output of knowledge. This flood of knowledge can serve as the data foundation for new technologies; for example, deep learning, currently the technology of greatest public interest, is a compute-intensive approach built on large amounts of available training data, which can be collected, processed, organized, and aggregated into structured knowledge base data.
A knowledge base stores and extracts knowledge according to established rules. Because it uses an explicit knowledge storage format, a knowledge base can not only serve as an expert system in a specific domain but can also use the same storage rules to build an open knowledge base; for example, well-known knowledge bases such as DBpedia and Freebase all adopt the Resource Description Framework (RDF), so that they can be linked to one another. Constructing knowledge bases simplifies how people acquire information and improves the efficiency with which information is used, and is considered the foundation of the Web 3.0 semantic web.
Most existing knowledge base construction methods target structured or unstructured text, extracting linked data from a text source with a specific data-processing method. For example, DBpedia obtains and stores linked data as a knowledge base by parsing the WikiText markup used by Wikipedia. Patent CN106844723A, "Medical knowledge base construction method based on a question-answering system," processes data from public medical websites and medical datasets and derives associations between data items to build a knowledge base. Patent CN106650940A, "A method and apparatus for constructing a domain knowledge base," likewise extracts core concepts from text, based on the core concepts of the target domain and the target texts in which they occur, and decides whether to add each core concept to the knowledge base, thereby automating knowledge base construction.
Extracting entities, and the relationships among them, from text is a feasible and common knowledge base construction method, but it restricts the knowledge source to text alone. This limits the diversity of knowledge sources, and in some scenarios the approach cannot be applied at all. For example, in police surveillance equipment, a large amount of knowledge and information resides in long video recordings: whether suspicious persons appear in the footage, their movement paths, their clothing characteristics, the times at which people enter and leave, and so on, all play a critical role in helping the police solve cases quickly. At present, however, to obtain the information and knowledge in such imagery, people essentially have to identify useful information by watching for long periods and then manually integrate it to reach a judgment. This way of acquiring knowledge is not only inefficient; the knowledge obtained has low reusability, and it cannot be accumulated.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a knowledge base construction method based on image recognition. The method builds a target knowledge base using image recognition technology, fills the gap left by methods that cannot use images as a knowledge base construction source, improves the efficiency of knowledge acquisition, and is simple to implement.
In order to achieve the above object, the present invention provides a knowledge base construction method based on image recognition, which is characterized by comprising the following steps:
(1) acquiring a target image
Acquiring images containing a plurality of scenes and entities related to the knowledge base to be constructed, wherein each image contains a plurality of entities E1, E2, …, Ei, …, and each entity has multiple attribute values A1,1, A1,2, …, A2,1, A2,2, …, Ai,j, …;
(2) Target image preprocessing
Firstly, converting each image into a gray-scale image, then smoothing and sharpening the gray-scale image, and finally, storing all the images in an image library;
(3) building a neural network model
The neural network model is composed of three convolutional neural networks (CNNs): the second and third CNNs are arranged in parallel and cascaded after the first CNN, and the three are used for scene recognition, entity recognition, and attribute recognition, respectively. The result of scene recognition is represented by a one-hot vector S, the result of entity recognition by a one-hot vector E, and the result of attribute recognition by a one-hot vector A;
(4) recognizing images using neural network models
(4.1) randomly selecting an image from the image library and inputting the image into the neural network model;
(4.2) the first CNN recognizes the scene in the image and outputs a one-hot vector Sk, where the subscript k denotes the k-th image;
(4.3) after the first CNN finishes recognition, the second and third CNNs recognize the image simultaneously;
(4.3.1) the second CNN identifies the entities E1, E2, …, Ei, … in the image and records the pixel range of each entity in the image: P(E1), P(E2), …, P(Ei), …;
(4.3.2) the third CNN identifies the attributes A1,1, A1,2, …, A2,1, A2,2, …, Ai,j, … under each entity in the image and records the pixel range of each attribute in the image: P(A1,1), P(A1,2), …, P(A2,1), P(A2,2), …, P(Ai,j), …;
(5) Calculating the coincidence degree
Computing the coincidence degree O between the pixel range P(Ei) of entity Ei and the pixel range P(Ai,j) of attribute Ai,j;
(6) Constructing a target knowledge base
The coincidence degree O is compared with a preset coincidence threshold. If O is greater than or equal to the threshold, the recognized entity and attribute lie in the same pixel region, i.e., entity Ei and attribute Ai,j match each other, and the triple (Sk, Ei, Ai,j) is stored in the target knowledge base;
if O is less than the threshold, the coincidence degree between the pixel range P(Ei) of entity Ei and the pixel range P(Ai,j+1) of attribute Ai,j+1 is computed according to the method of step (5), and the process returns to step (6); this is repeated until all entities and attributes in the image have been matched;
the process then returns to step (4.1), where the neural network model recognizes the next image, which is processed according to steps (5) to (6) in turn, until all entities and attributes in all images in the image library have been matched, thereby completing the target knowledge base.
The object of the invention is achieved as follows:
the invention relates to a knowledge base construction method based on image recognition, which comprises the steps of firstly obtaining a series of images related to a target knowledge base and carrying out image preprocessing, then using three neural network groups to recognize input images so as to obtain scene, entity and attribute information of image contents, matching the output entities and attributes according to the pixel range occupied by the entities and the coincidence degree of the attributes in the pixel range, and finally completing construction of the knowledge base with scene, entity and attribute triples; therefore, the image is used as a knowledge base construction source, so that the construction method of the knowledge base is expanded, the redundancy of the knowledge base can be reduced, the query efficiency is improved, and the accumulation and reusability of image knowledge can be realized.
Drawings
FIG. 1 is a flow chart of a knowledge base construction method based on image recognition according to the invention;
FIG. 2 is a flow diagram of entity and attribute matching;
Detailed Description
The following description of embodiments of the invention, given with reference to the accompanying drawings, is provided so that those skilled in the art can better understand the invention. Note that in the following description, detailed descriptions of known functions and designs are omitted where they would obscure the subject matter of the invention.
Examples
FIG. 1 is a flow chart of a knowledge base construction method based on image recognition.
In this embodiment, as shown in fig. 1, the method for constructing a knowledge base based on image recognition of the present invention includes the following steps:
S1, acquiring a target image
Acquiring images containing a plurality of scenes and entities related to the knowledge base to be constructed, wherein each image contains a plurality of entities E1, E2, …, Ei, …, and each entity has multiple attribute values A1,1, A1,2, …, A2,1, A2,2, …, Ai,j, …;
In this embodiment, each image contains a scene, such as the sky, and multiple entities, such as people, objects, and animals. Each image must meet a basic standard of clarity, i.e., be sharp enough for the human eye to recognize its contents.
S2, preprocessing the target image
Firstly, converting each image into a gray-scale image, then smoothing and sharpening the gray-scale image, and finally, storing all the images in an image library;
In this embodiment, gray-level transformation is used to adjust the gray values of the image and hence its brightness, so that the image has moderate luminance; this makes it easier for the neural network group to distinguish object contours clearly and improves recognition accuracy. Smoothing filters out spurious noise in the image, and sharpening enhances regions of abrupt gray-level change, highlighting contour information in the image and aiding entity recognition;
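As an illustrative sketch of the preprocessing in step S2 (grayscale conversion, smoothing, sharpening), the following pure-Python fragment applies the ITU-R BT.601 luma weights, a 3x3 box blur, and unsharp masking to a tiny synthetic image. The function names and the specific filters are our assumptions; the patent does not fix a particular smoothing or sharpening operator, and a production pipeline would use a library such as OpenCV.

```python
def to_gray(rgb):
    # ITU-R BT.601 luma weights for RGB-to-gray conversion
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in rgb]

def box_blur(img):
    # 3x3 mean filter; border pixels are left unchanged for brevity
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(img[y + dy][x + dx]
                            for dy in (-1, 0, 1) for dx in (-1, 0, 1)) / 9.0
    return out

def sharpen(img, amount=1.0):
    # Unsharp masking: boost the difference from the smoothed image
    # (a real pipeline would clip the result back to [0, 255])
    blurred = box_blur(img)
    return [[img[y][x] + amount * (img[y][x] - blurred[y][x])
             for x in range(len(img[0]))] for y in range(len(img))]

# A 3x3 synthetic RGB "image" with a single bright center pixel
rgb = [[(0, 0, 0)] * 3, [(0, 0, 0), (255, 255, 255), (0, 0, 0)], [(0, 0, 0)] * 3]
gray = to_gray(rgb)
pre = sharpen(gray)
print(round(gray[1][1]))  # 255: the bright center survives grayscale conversion
```

Sharpening raises the center value above its grayscale level, which is exactly the contour-emphasizing effect the paragraph describes.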
S3, building a neural network model
The neural network model is composed of three convolutional neural networks (CNNs): the second and third CNNs are arranged in parallel and cascaded after the first CNN, and the three are used for scene recognition, entity recognition, and attribute recognition, respectively. The result of scene recognition is represented by a one-hot vector S, the result of entity recognition by a one-hot vector E, and the result of attribute recognition by a one-hot vector A;
In this embodiment the neural network model is built with CNNs; alternatively, networks such as AlexNet, VGGNet, ResNet, or GoogLeNet can be used. After the model is built it must be pre-trained so that it can accurately recognize scenes such as parks, parking lots, and campuses; entities such as people, objects, and animals; and attributes such as color, shape, and size. The specific training process is not detailed here;
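The three-CNN arrangement of step S3 can be sketched as a simple dataflow, with trivial lookup stubs standing in for the trained networks. Everything below (the label sets, the `recognize` function, the input dictionary) is illustrative only; the patent specifies just that the first CNN outputs a scene one-hot vector and the second and third CNNs output entity and attribute one-hot vectors in parallel.

```python
SCENES = ["park", "parking lot", "campus"]
ENTITIES = ["person", "object", "animal"]
ATTRIBUTES = ["color", "shape", "size"]

def one_hot(index, size):
    # e.g. one_hot(1, 3) -> [0, 1, 0]
    v = [0] * size
    v[index] = 1
    return v

def recognize(image):
    # Stand-ins for CNN inference; a real model maps pixels to class scores
    scene = one_hot(image["scene_id"], len(SCENES))                        # first CNN
    entities = [one_hot(i, len(ENTITIES)) for i in image["entity_ids"]]    # second CNN
    attributes = [one_hot(i, len(ATTRIBUTES)) for i in image["attr_ids"]]  # third CNN
    return scene, entities, attributes

S, E, A = recognize({"scene_id": 0, "entity_ids": [0], "attr_ids": [0, 2]})
print(S)  # [1, 0, 0]
```

The parallel placement of the second and third CNNs matters only for latency; logically each image yields one S vector plus one E vector per entity and one A vector per attribute, which is what the matching in steps S5 and S6 consumes.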
S4, recognizing images using the neural network model
S4.1, randomly selecting an image from the image library and inputting it into the neural network model;
S4.2, the first CNN recognizes the scene in the image and outputs a one-hot vector Sk, where the subscript k denotes the k-th image;
S4.3, after the first CNN finishes recognition, the second and third CNNs recognize the image simultaneously; the specific flow is shown in FIG. 2;
S4.3.1, the second CNN identifies the entities E1, E2, …, Ei, … in the image and records the pixel range of each entity in the image: P(E1), P(E2), …, P(Ei), …;
S4.3.2, the third CNN identifies the attributes A1,1, A1,2, …, A2,1, A2,2, …, Ai,j, … under each entity in the image and records the pixel range of each attribute in the image: P(A1,1), P(A1,2), …, P(A2,1), P(A2,2), …, P(Ai,j), …;
In this embodiment, the feature map output by the last convolutional layer of each CNN may be up-sampled until it is restored to the resolution of the input image; this enables pixel-level image segmentation, so that the pixel range of each entity and attribute in the image can be recorded.
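The up-sampling mentioned above can be illustrated with the simplest scheme, nearest-neighbor repetition; the patent does not fix a method, and real segmentation decoders typically use bilinear interpolation or transposed convolutions instead.

```python
def upsample_nearest(fmap, factor):
    # Repeat each feature-map cell `factor` times horizontally and vertically,
    # growing an h x w map to (h*factor) x (w*factor)
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(factor)]
        out.extend([wide[:] for _ in range(factor)])
    return out

# A coarse 2x2 "label map" restored toward input resolution
fmap = [[0, 1],
        [1, 0]]
up = upsample_nearest(fmap, 2)
print(len(up), len(up[0]))  # 4 4
```

Once the per-class map is back at input resolution, the pixel range P(·) of an entity or attribute is simply the set of pixel coordinates carrying its label.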
S5, calculating the coincidence degree
Computing the coincidence degree O between the pixel range P(Ei) of entity Ei and the pixel range P(Ai,j) of attribute Ai,j, where O lies in the range [0, 1];
S6, constructing a target knowledge base
The coincidence degree O is compared with a preset coincidence threshold of 0.7. If O is greater than or equal to the threshold, the recognized entity and attribute lie in the same pixel region, i.e., entity Ei and attribute Ai,j match each other, and the triple (Sk, Ei, Ai,j) is stored in the target knowledge base;
if O is less than the threshold, the coincidence degree between the pixel range P(Ei) of entity Ei and the pixel range P(Ai,j+1) of attribute Ai,j+1 is computed according to the method of step S5, and the process returns to step S6; this is repeated until all entities and attributes in the image have been matched;
The process then returns to step S4.1, where the neural network model recognizes the next image, which is processed according to steps S5 to S6 in turn, until all entities and attributes in all images in the image library have been matched, thereby completing the target knowledge base.
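Steps S5 and S6 can be sketched end to end as follows. The 0.7 threshold comes from this embodiment; the overlap measure (IoU) and the all-pairs matching loop are our assumptions, since the patent iterates over attributes per entity without fixing the overlap formula.

```python
THRESHOLD = 0.7  # preset coincidence threshold from the embodiment

def overlap(e_px, a_px):
    # IoU of two pixel sets (one plausible coincidence degree; see step S5)
    e, a = set(e_px), set(a_px)
    return len(e & a) / len(e | a) if (e or a) else 0.0

def build_triples(scene, entities, attributes):
    # entities / attributes: dicts mapping a name to its pixel set
    kb = []
    for e_name, e_px in entities.items():
        for a_name, a_px in attributes.items():
            if overlap(e_px, a_px) >= THRESHOLD:
                kb.append((scene, e_name, a_name))  # store (Sk, Ei, Ai,j)
    return kb

region = {(x, y) for x in range(3) for y in range(3)}
kb = build_triples("park",
                   {"person": region},
                   {"clothing:red": region,            # same pixels -> matches
                    "height:1.7-1.8m": {(9, 9)}})      # disjoint -> rejected
print(kb)  # [('park', 'person', 'clothing:red')]
```

In a real system the rejected attribute would simply be tested against the next entity, matching the step-(6) loop over j in the patent.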
Examples of the invention
This example constructs a personnel-information knowledge base from surveillance footage, to help public security organs quickly identify suspects, plot their movement trajectories, and solve cases faster.
First, an image set of case-related monitoring devices is collected and preprocessed according to the above method.
Second, the images are recognized using the neural network model. In this example the main entity recognition task is person recognition, and the attributes involved are body build (degree of thinness or fatness), clothing color, and height. These can be subdivided as follows: body build (thin, normal, fat); clothing color (red, orange, yellow, green, blue, purple); height (below 1.5 m, 1.5-1.6 m, 1.6-1.7 m, 1.7-1.8 m, 1.8-1.9 m, above 1.9 m).
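The height attribute above is a discretization of a continuous measurement into bins. A minimal sketch of that binning follows; the bin edges are copied from this example, while the function name and representation are hypothetical.

```python
HEIGHT_BINS = [1.5, 1.6, 1.7, 1.8, 1.9]  # bin upper edges, in metres
HEIGHT_LABELS = ["below 1.5 m", "1.5-1.6 m", "1.6-1.7 m",
                 "1.7-1.8 m", "1.8-1.9 m", "above 1.9 m"]

def height_label(h):
    # Return the label of the first bin whose upper edge exceeds h;
    # anything at or above the last edge falls into the open-ended top bin
    for edge, label in zip(HEIGHT_BINS, HEIGHT_LABELS):
        if h < edge:
            return label
    return HEIGHT_LABELS[-1]

print(height_label(1.75))  # 1.7-1.8 m
```

The body-build and clothing-color attributes are already categorical, so they map directly to one-hot vectors without such a binning step.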
Finally, the coincidence degree between the pixel range of each entity and the pixel range of each corresponding attribute is calculated, and entities are matched with their attributes according to the coincidence degree, thereby constructing the personnel-information knowledge base.
Although illustrative embodiments of the invention have been described above to help those skilled in the art understand it, the invention is not limited to the scope of those embodiments. To those skilled in the art, various changes are permissible as long as they remain within the spirit and scope of the invention as defined by the appended claims; everything that makes use of the inventive concept falls under the invention's protection.
Claims (2)
1. A knowledge base construction method based on image recognition is characterized by comprising the following steps:
(1) acquiring a target image
Acquiring images containing a plurality of scenes and entities related to the knowledge base to be constructed, wherein each image contains a plurality of entities E1, E2, …, Ei, …, and each entity has multiple attribute values A1,1, A1,2, …, A2,1, A2,2, …, Ai,j, …;
(2) Target image preprocessing
Firstly, converting each image into a gray-scale image, then smoothing and sharpening the gray-scale image, and finally, storing all the images in an image library;
(3) building a neural network model
The neural network model is composed of three convolutional neural networks (CNNs): the second and third CNNs are arranged in parallel and cascaded after the first CNN, and the three are used for scene recognition, entity recognition, and attribute recognition, respectively. The result of scene recognition is represented by a one-hot vector S, the result of entity recognition by a one-hot vector E, and the result of attribute recognition by a one-hot vector A;
(4) recognizing images using neural network models
(4.1) randomly selecting an image from the image library and inputting the image into the neural network model;
(4.2) the first CNN recognizes the scene in the image and outputs a one-hot vector Sk, where the subscript k denotes the k-th image;
(4.3) after the first CNN finishes recognition, the second and third CNNs recognize the image simultaneously;
(4.3.1) the second CNN identifies the entities E1, E2, …, Ei, … in the image and records the pixel range of each entity in the image: P(E1), P(E2), …, P(Ei), …;
(4.3.2) the third CNN identifies the attributes A1,1, A1,2, …, A2,1, A2,2, …, Ai,j, … in the image and records the pixel range of each attribute in the image: P(A1,1), P(A1,2), …, P(A2,1), P(A2,2), …, P(Ai,j), …;
(5) Calculating the coincidence degree
Computing the coincidence degree O between the pixel range P(Ei) of entity Ei and the pixel range P(Ai,j) of attribute Ai,j;
(6) Constructing a target knowledge base
The coincidence degree O is compared with a preset coincidence threshold. If O is greater than or equal to the threshold, the recognized entity and attribute lie in the same pixel region, i.e., entity Ei and attribute Ai,j match each other, and the triple (Sk, Ei, Ai,j) is stored in the target knowledge base;
if O is less than the threshold, the coincidence degree between the pixel range P(Ei) of entity Ei and the pixel range P(Ai,j+1) of attribute Ai,j+1 is computed according to the method of step (5), and the process returns to step (6); this is repeated until all entities and attributes in the image have been matched;
the process then returns to step (4.1), where the neural network model recognizes the next image, which is processed according to steps (5) to (6) in turn, until all entities and attributes in all images in the image library have been matched, thereby completing the target knowledge base.
2. The knowledge base construction method based on image recognition according to claim 1, wherein the value range of the coincidence degree O is [0, 1].
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911309368.5A CN111046213B (en) | 2019-12-18 | 2019-12-18 | Knowledge base construction method based on image recognition |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911309368.5A CN111046213B (en) | 2019-12-18 | 2019-12-18 | Knowledge base construction method based on image recognition |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111046213A CN111046213A (en) | 2020-04-21 |
CN111046213B (en) | 2021-12-10
Family
ID=70237877
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911309368.5A Active CN111046213B (en) | 2019-12-18 | 2019-12-18 | Knowledge base construction method based on image recognition |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111046213B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111598036B (en) * | 2020-05-22 | 2021-01-01 | 广州地理研究所 | Urban group geographic environment knowledge base construction method and system of distributed architecture |
CN112287656B (en) * | 2020-10-12 | 2024-05-28 | 四川语言桥信息技术有限公司 | Text comparison method, device, equipment and storage medium |
CN113854780A (en) * | 2021-09-30 | 2021-12-31 | 重庆清微文化旅游有限公司 | Intelligent dispensing method and system without cross contamination of materials |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1184810A3 (en) * | 1996-02-05 | 2004-03-10 | Texas Instruments Incorporated | Improvements in or relating to motion event detection |
CN108280132A (en) * | 2017-01-06 | 2018-07-13 | Tcl集团股份有限公司 | Establish the method and system in the individualized knowledge library of semantic image segmentation |
CN109983507A (en) * | 2016-12-21 | 2019-07-05 | 英特尔公司 | The positioning returned based on extensive CNN is carried out via two-dimensional map |
-
2019
- 2019-12-18 CN CN201911309368.5A patent/CN111046213B/en active Active
Non-Patent Citations (1)
Title |
---|
Zhang Bing, "The era of remote sensing big data and intelligent information extraction" (遥感大数据时代与智能信息提取), Geomatics and Information Science of Wuhan University, Dec. 2018, full text *
Also Published As
Publication number | Publication date |
---|---|
CN111046213A (en) | 2020-04-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108388888B (en) | Vehicle identification method and device and storage medium | |
CN113936339B (en) | Fighting identification method and device based on double-channel cross attention mechanism | |
CN104679863B (en) | It is a kind of based on deep learning to scheme to search drawing method and system | |
CN111046213B (en) | Knowledge base construction method based on image recognition | |
CN105930402A (en) | Convolutional neural network based video retrieval method and system | |
CA3069365A1 (en) | Generation of point of interest copy | |
CN109657715B (en) | Semantic segmentation method, device, equipment and medium | |
CN113689382B (en) | Tumor postoperative survival prediction method and system based on medical images and pathological images | |
CN105590099A (en) | Multi-user behavior identification method based on improved convolutional neural network | |
CN111783712A (en) | Video processing method, device, equipment and medium | |
CN106845513A (en) | Staff detector and method based on condition random forest | |
CN114332573A (en) | Multi-mode information fusion recognition method and system based on attention mechanism | |
CN104156464A (en) | Micro-video retrieval method and device based on micro-video feature database | |
Li et al. | Image manipulation localization using attentional cross-domain CNN features | |
Setiawan et al. | Sequential inter-hop graph convolution neural network (SIhGCN) for skeleton-based human action recognition | |
CN109766918A (en) | Conspicuousness object detecting method based on the fusion of multi-level contextual information | |
CN108446605B (en) | Double interbehavior recognition methods under complex background | |
Mohammad et al. | Searching surveillance video contents using convolutional neural network | |
CN113066074A (en) | Visual saliency prediction method based on binocular parallax offset fusion | |
CN111914772A (en) | Method for identifying age, and training method and device of age identification model | |
CN116701706A (en) | Data processing method, device, equipment and medium based on artificial intelligence | |
CN115953832A (en) | Semantic decoupling-based combined action recognition method of self-attention model | |
Zhang et al. | Skeleton-based action recognition with attention and temporal graph convolutional network | |
CN116110074A (en) | Dynamic small-strand pedestrian recognition method based on graph neural network | |
CN116955707A (en) | Content tag determination method, device, equipment, medium and program product |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |