CN114491122A - Graph matching method for searching similar images - Google Patents

Graph matching method for searching similar images

Info

Publication number
CN114491122A
CN114491122A (application number CN202111634430.5A)
Authority
CN
China
Prior art keywords
edge
point
graph
matrix
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111634430.5A
Other languages
Chinese (zh)
Other versions
CN114491122B (en)
Inventor
杨益枘
林旭滨
何力
管贻生
张宏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology filed Critical Guangdong University of Technology
Priority to CN202111634430.5A priority Critical patent/CN114491122B/en
Publication of CN114491122A publication Critical patent/CN114491122A/en
Application granted granted Critical
Publication of CN114491122B publication Critical patent/CN114491122B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Molecular Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Library & Information Science (AREA)
  • Databases & Information Systems (AREA)
  • Image Processing (AREA)
  • Editing Of Facsimile Originals (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a graph matching method for similar image retrieval, which mainly comprises two stages: off-line data set construction and on-line deep learning training. In the first stage, the Pascal VOC data set is selected as the training data set, and a number of annotated images covering all categories of the data set are selected as the training set. The second stage comprises the following steps: a pre-trained VGG-16 neural network is adopted as the feature extractor; each image generates a topological structure of bidirectional edges through fully-connected Delaunay triangulation; after point feature embedding of the topological-geometric information is completed, the edges are described on the basis of a point-edge incidence matrix; from the edge feature description vectors of the respective graphs, an edge-to-edge similarity matrix is constructed; through the above steps, the final point features are obtained, and the similarity matrix of point-to-point matching is then calculated. The scheme offers high retrieval performance, high efficiency and easy implementation.

Description

Graph matching method for searching similar images
Technical Field
The invention relates to the technical field of image retrieval, in particular to a graph matching method for similar image retrieval.
Background
With the development of the internet, efficiently retrieving images that meet users' requirements in a network environment has become a core technical problem. Image retrieval techniques mainly fall into two branches: text-based retrieval and content-based retrieval. Text-based image retrieval typically queries images by keywords, or browses images under specific categories in a hierarchical directory. Content-based image retrieval, by contrast, retrieves images with similar characteristics from an image database according to the semantic content and features of a query image.
An existing content-based image retrieval system first extracts feature information from image content and stores it in a feature library; the features of a query image are then compared against the library and ranked to obtain the retrieval result. Content-based image retrieval uses a computer to describe images mathematically under a unified rule, reducing the manual labor of annotating image keywords and thus improving retrieval efficiency. With the improvement of computer performance and the development of deep learning, computers can extract rich features such as object color, shape and structure from images. However, similarity matching of such structured feature information is a problem of high computational complexity.
From the perspective of mathematical optimization, graph matching of structured information is an NP-hard second-order combinatorial problem. Graph matching aims to find node-to-node correspondences between objects using graph structure information. Meanwhile, the booming fields of deep learning and graph convolutional neural networks show great potential for the graph matching problem. With graph embedding techniques based on graph convolutional networks, the second-order combinatorial problem, previously difficult to solve exactly in polynomial time, is converted into a first-order problem that can be solved in polynomial time. However, existing deep graph matching methods based on graph embedding do not consider second-order edge-to-edge similarity information; introducing this information as cross-graph embedding information improves both accuracy and efficiency. For this reason, the prior art requires further improvement and refinement.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provide a graph matching method for the retrieval of similar images.
The purpose of the invention is realized by the following technical scheme:
a graph matching method for similar image retrieval mainly comprises two stages, off-line data set construction and on-line deep learning training; the specific steps are as follows:
Stage one: constructing the data set for off-line deep graph matching.
Step S1: The Pascal VOC data set is chosen as the training data set.
Step S2: A number of annotated images covering all categories of the data set are selected as the training set.
Stage two: training the deep graph matching network on line.
Step S3: A pre-trained VGG-16 neural network is adopted as the feature extractor; the network parameters are pre-trained on the ImageNet data set.
Step S4: Each image generates a topological structure of bidirectional edges through the fully-connected Delaunay triangulation technique.
Step S5: After the point feature embedding of the topological-geometric information is completed, the feature description of the edges is carried out on the basis of the point-edge incidence matrix.
Step S6: From the edge feature description vectors of the respective graphs, the edge-to-edge similarity matrix Ke can be constructed.
Step S61: The point-edge pairing relations of graph matching can be constructed into an association graph model.
Step S62: According to the topological structure of the association graph, the edge similarity scores and the point similarities can be associated to obtain the cross-graph transformation matrix.
Step S63: The cross-graph point embedding operation is carried out by taking the cross-graph assignment matrix as prior information.
Step S7: Through the above steps, the final point features V1 and V2 (the cross-graph-enhanced point features of the two graphs) are obtained, and the similarity matrix of point-to-point matching is then calculated.
As a preferable embodiment of the present invention, the step S3 further includes the steps of: the feature extractor yields the point features F1 ∈ R^(n1×d) and F2 ∈ R^(n2×d) of the two images to be matched, where d is the dimension of the feature vectors, and n1 and n2 are the numbers of feature points of the two images; F1 and F2 are obtained by concatenating the outputs of the relu4_2 and relu5_1 layers of the VGG-16 neural network.
As a preferable embodiment of the present invention, the step S4 further includes the steps of: the attribute of each edge is formed by the normalized coordinates of its two endpoints, and the connectivity of the edges represents the topological structure of each graph; the point feature information and the edge attribute information are then fed as inputs into the graph neural network SplineCNN; SplineCNN serves as the geometric-topological information embedding technique, with MAX aggregation adopted for structural information aggregation; finally, the point features U1 and U2, embedded with the respective geometric-topological information, are obtained.
as a preferable embodiment of the present invention, the step S5 further includes the steps of: the point-edge correlation matrices of the two graphs are respectively
Figure BDA0003441409920000033
And
Figure BDA0003441409920000034
wherein e1And e2Respectively representing the number of edges of the two figures, when Gi,k=Hj,kWhen 1, it indicates that the edge k starts from node i to node j ends; characteristics of the edge
Figure BDA0003441409920000035
And
Figure BDA0003441409920000036
is defined as follows:
Figure BDA0003441409920000037
as a preferable embodiment of the present invention, the step S6 further includes the steps of: edge-to-edge recognition matrix Ke
Figure BDA0003441409920000038
Wherein the content of the first and second substances,
Figure BDA0003441409920000039
is a training parameter; keEach element of the matrix represents edge-to-edge matching information, and in order to expand the difference between the similarity values of the edges, i.e. emphasize the high similarity values and compress the low similarity values, the Ke matrix is normalized to obtain a normalized epsilon matrix:
ε=softmax(Ke) Formula (3)
Then, the normalized epsilon matrix is passed through a companion graphIs converted into a cross-edge conversion matrix
Figure BDA00034414099200000310
Figure BDA00034414099200000311
Conversion matrix based on cross-graph
Figure BDA00034414099200000312
We can get the feature embedding information across the graph; for node
Figure BDA00034414099200000313
Figure BDA00034414099200000314
Cross-map feature information mj→iIs calculated as follows:
Figure BDA00034414099200000315
finally, vector addition operation is carried out on the cross-map feature information and the point feature information:
Figure BDA00034414099200000316
similar operations are also performed for the feature points of the second graph.
As a preferable embodiment of the present invention, the step S7 further includes the steps of: the similarity matrix is computed from the final point features V1 and V2:
S = V1 V2^T (formula (7))
The linear solution of the graph matching problem relies on the Sinkhorn iterative algorithm: starting from the score matrix S, rows and columns are normalized alternately to obtain the soft assignment matrix P ∈ [0,1]^(n1×n2):
P = Sinkhorn(exp(S)) (formula (8))
As a preferable aspect of the present invention, the graph matching method further includes step S8: given the ground-truth assignment matrix P* ∈ {0,1}^(n1×n2) and the soft assignment matrix P, the error is obtained by constructing a cross-entropy loss function:
L = −Σ_(i,j) [P*(i,j) log P(i,j) + (1 − P*(i,j)) log(1 − P(i,j))] (formula (9))
as a preferable embodiment of the present invention, the step S1 further includes the steps of: the data set contains several different categories of images: airplanes, bicycles, birds, boats, bottles, buses, cars, cats, chairs, cattle, tables, dogs, horses, motorcycles, people, plants, sheep, sofas, trains, televisions; each image contains 6 to 23 annotated feature point image coordinates.
As a preferable embodiment of the present invention, the step S2 further includes the steps of: 1682 images are correspondingly selected as the test set; for each image to be trained, a bounding box containing all annotated feature points is extracted, the image is resized to 256 × 256, and it then enters the deep learning network for training.
The working process and principle of the invention are as follows: existing deep graph matching schemes based on graph embedding ignore second-order edge-to-edge similarity information, which causes a loss of accuracy. The invention introduces this information through a deep graph matching model based on cross-graph embedding and applies it to the retrieval of images of similar objects; this improves matching performance while markedly reducing memory consumption, and ultimately greatly improves the performance and efficiency of image retrieval.
Drawings
Fig. 1 is a schematic flow chart of a graph matching method for the same kind of image retrieval according to the present invention.
Fig. 2 is a schematic diagram of a graph matching method for searching similar images according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention is further described below with reference to the accompanying drawings and examples.
Example 1:
as shown in fig. 1 to fig. 2, the present embodiment discloses a graph matching method for retrieving similar images, which mainly includes two stages of offline data set construction and online deep learning training, and the specific steps are as follows:
Stage one: constructing the data set for off-line deep graph matching.
Step S1: The Pascal VOC data set is chosen as the training data set.
Step S2: A number of annotated images covering all categories of the data set are selected as the training set.
Stage two: training the deep graph matching network on line.
Step S3: A pre-trained VGG-16 neural network is adopted as the feature extractor; the network parameters are pre-trained on the ImageNet data set.
Step S4: Each image generates a topological structure of bidirectional edges through the fully-connected Delaunay triangulation technique.
Step S5: After the point feature embedding of the topological-geometric information is completed, the feature description of the edges is carried out on the basis of the point-edge incidence matrix.
Step S6: From the edge feature description vectors of the respective graphs, the edge-to-edge similarity matrix Ke can be constructed.
Step S61: The point-edge pairing relations of graph matching can be constructed into an association graph model.
Step S62: According to the topological structure of the association graph, the edge similarity scores and the point similarities can be associated to obtain the cross-graph transformation matrix.
Step S63: The cross-graph point embedding operation is carried out by taking the cross-graph assignment matrix as prior information.
Step S7: Through the above steps, the final point features V1 and V2 (the cross-graph-enhanced point features of the two graphs) are obtained, and the similarity matrix of point-to-point matching is then calculated.
As a preferable embodiment of the present invention, the step S3 further includes the steps of: the feature extractor yields the point features F1 ∈ R^(n1×d) and F2 ∈ R^(n2×d) of the two images to be matched, where d is the dimension of the feature vectors, and n1 and n2 are the numbers of feature points of the two images; F1 and F2 are obtained by concatenating the outputs of the relu4_2 and relu5_1 layers of the VGG-16 neural network.
As a preferable embodiment of the present invention, the step S4 further includes the steps of: the attribute of each edge is formed by the normalized coordinates of its two endpoints, and the connectivity of the edges represents the topological structure of each graph; the point feature information and the edge attribute information are then fed as inputs into the graph neural network SplineCNN; SplineCNN serves as the geometric-topological information embedding technique, with MAX aggregation adopted for structural information aggregation; finally, the point features U1 and U2, embedded with the respective geometric-topological information, are obtained.
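The bidirectional-edge construction of step S4 can be sketched as follows, assuming SciPy's Delaunay triangulation is available; every undirected triangle edge (i, j) is duplicated as (i, j) and (j, i) to form the bidirectional topology:

```python
import numpy as np
from scipy.spatial import Delaunay  # assumed available

def delaunay_biedges(points):
    """Return the bidirectional edge list of the Delaunay triangulation."""
    tri = Delaunay(points)
    und = set()
    for a, b, c in tri.simplices:
        und.update(tuple(sorted(e)) for e in ((a, b), (b, c), (a, c)))
    # duplicate every undirected edge in both directions
    return sorted(und | {(j, i) for i, j in und})

pts = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 2.0], [1.0, 0.7]])  # toy keypoints
edges = delaunay_biedges(pts)
assert all((j, i) in set(edges) for i, j in edges)  # edges come in both directions
```

With one keypoint inside the triangle of the other three, the triangulation produces six undirected edges and therefore twelve directed ones.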
as a preferable embodiment of the present invention, the step S5 further includes the steps of: of two figuresThe point-edge correlation matrix is respectively
Figure BDA0003441409920000057
And
Figure BDA0003441409920000058
wherein e1And e2Respectively representing the number of edges of the two figures, when Gi,k=Hj,kWhen 1, it indicates that the edge k starts from node i to node j ends; characteristics of the edge
Figure BDA0003441409920000061
And
Figure BDA0003441409920000062
is defined as follows:
Figure BDA0003441409920000063
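The incidence-based edge description can be sketched in NumPy as below. Taking the edge descriptor to be the start-point features concatenated with the end-point features is an assumption (the original formula (1) is an image that is not reproduced in the text), but it is consistent with the described point-edge incidence construction:

```python
import numpy as np

def incidence(n, edges):
    """G[i, k] = 1 if edge k starts at node i; H[j, k] = 1 if edge k ends at j."""
    G = np.zeros((n, len(edges)))
    H = np.zeros((n, len(edges)))
    for k, (i, j) in enumerate(edges):
        G[i, k] = 1.0
        H[j, k] = 1.0
    return G, H

rng = np.random.default_rng(1)
n, d = 5, 8
edges = [(0, 1), (1, 0), (1, 2), (2, 1), (2, 3), (3, 4)]
F_tilde = rng.standard_normal((n, d))  # point features after topological embedding
G, H = incidence(n, edges)

# Edge descriptor: start-point features next to end-point features.
X = np.concatenate([G.T @ F_tilde, H.T @ F_tilde], axis=1)  # shape (e, 2d)
```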
as a preferable embodiment of the present invention, the step S6 further includes the steps of: edge-to-edge recognition matrix Ke
Figure BDA0003441409920000064
Wherein the content of the first and second substances,
Figure BDA0003441409920000065
is a training parameter; keEach element of the matrix represents edge-to-edge matching information, and in order to expand the difference between the similarity values of the edges, i.e. emphasize the high similarity values and compress the low similarity values, the Ke matrix is normalized to obtain a normalized epsilon matrix:
ε=softmax(Ke) Formula (3)
Then, the normalized epsilon matrix is converted into a cross-edge conversion matrix through the structure of the accompanying graph
Figure BDA0003441409920000066
Figure BDA0003441409920000067
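The edge-affinity scoring and its normalization (formulas (2) and (3)) can be sketched as follows. The bilinear form with a learned parameter Λ is an assumption, since the original formula image is not reproduced in the text, and the softmax is applied row-wise here:

```python
import numpy as np

def softmax(m, axis):
    z = m - m.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(2)
e1, e2, d2 = 6, 7, 16                    # edge counts and edge-feature dimension
X1 = rng.standard_normal((e1, d2))       # edge features of graph 1
X2 = rng.standard_normal((e2, d2))       # edge features of graph 2
Lam = rng.standard_normal((d2, d2))      # stand-in for the trained parameter

K_e = X1 @ Lam @ X2.T                    # edge-to-edge similarity scores, (e1, e2)
eps = softmax(K_e, axis=1)               # emphasize high scores, compress low ones
```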
Based on the cross-graph transformation matrix T, the cross-graph feature embedding information is obtained; for node i of the first graph, the cross-graph message m(i), aggregated over the nodes j of the second graph, is calculated as follows:
m(i) = Σ_j T(i,j) U2(j) (formula (5))
Finally, a vector addition is performed between the cross-graph feature information and the point feature information:
V1(i) = U1(i) + m(i) (formula (6))
A similar operation is performed for the feature points of the second graph.
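The cross-graph point embedding of formulas (5) and (6) reduces to a matrix product followed by a vector addition. In this sketch the cross-graph transformation matrix is a random row-stochastic stand-in, since the construction of formula (4) is an image that is not reproduced in the text:

```python
import numpy as np

rng = np.random.default_rng(3)
n1, n2, d = 4, 5, 8
U1 = rng.standard_normal((n1, d))   # embedded point features of graph 1
U2 = rng.standard_normal((n2, d))   # embedded point features of graph 2
T = rng.random((n1, n2))
T /= T.sum(axis=1, keepdims=True)   # row-stochastic cross-graph transformation

# Each node of graph 1 receives a weighted combination of graph-2 features,
# which is then added to its own feature vector.
M = T @ U2
V1 = U1 + M
```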
As a preferable embodiment of the present invention, the step S7 further includes the steps of: the similarity matrix is computed from the final point features V1 and V2:
S = V1 V2^T (formula (7))
The linear solution of the graph matching problem relies on the Sinkhorn iterative algorithm: starting from the score matrix S, rows and columns are normalized alternately to obtain the soft assignment matrix P ∈ [0,1]^(n1×n2):
P = Sinkhorn(exp(S)) (formula (8))
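The Sinkhorn step of formula (8) can be sketched in plain NumPy: exponentiate the score matrix, then alternately normalize rows and columns until the result is (near) doubly stochastic:

```python
import numpy as np

def sinkhorn(S, iters=200):
    """Soft assignment from a score matrix by alternating row/column normalization."""
    P = np.exp(S - S.max())  # stabilized exponential of the scores
    for _ in range(iters):
        P /= P.sum(axis=1, keepdims=True)  # rows sum to 1
        P /= P.sum(axis=0, keepdims=True)  # columns sum to 1
    return P

rng = np.random.default_rng(4)
S = rng.standard_normal((5, 5))
P = sinkhorn(S)
```

For square score matrices the iteration converges to a doubly stochastic matrix; a hard matching can then be read off from P with, e.g., the Hungarian algorithm.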
As a preferable aspect of the present invention, the graph matching method further includes step S8: given the ground-truth assignment matrix P* ∈ {0,1}^(n1×n2) and the soft assignment matrix P, the error is obtained by constructing a cross-entropy loss function:
L = −Σ_(i,j) [P*(i,j) log P(i,j) + (1 − P*(i,j)) log(1 − P(i,j))] (formula (9))
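Assuming the loss of formula (9) is the usual element-wise binary cross-entropy between the ground-truth permutation matrix and the soft assignment matrix (the original formula image is not reproduced in the text), it can be sketched as:

```python
import numpy as np

def assignment_cross_entropy(P_gt, P, eps=1e-9):
    """Element-wise binary cross-entropy between a 0/1 ground-truth
    assignment matrix and a soft assignment matrix."""
    P = np.clip(P, eps, 1.0 - eps)  # avoid log(0)
    return -np.sum(P_gt * np.log(P) + (1.0 - P_gt) * np.log(1.0 - P))

P_gt = np.eye(3)                                   # toy ground truth: identity match
P_good = np.full((3, 3), 0.05) + np.eye(3) * 0.85  # close to the ground truth
P_bad = np.full((3, 3), 1.0 / 3.0)                 # uninformative assignment
```

A soft assignment close to the ground truth yields a lower loss than the uninformative one, which is what drives the training.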
as a preferable embodiment of the present invention, the step S1 further includes the steps of: the data set contains several different categories of images: airplanes, bicycles, birds, boats, bottles, buses, cars, cats, chairs, cattle, tables, dogs, horses, motorcycles, people, plants, sheep, sofas, trains, televisions; each image contains 6 to 23 annotated feature point image coordinates.
As a preferable embodiment of the present invention, the step S2 further includes the steps of: 1682 images are correspondingly selected as the test set; for each image to be trained, a bounding box containing all annotated feature points is extracted, the image is resized to 256 × 256, and it then enters the deep learning network for training.
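The keypoint side of this preprocessing can be sketched as follows; the helper name is hypothetical, and the image resampling itself is omitted here, with only the remapping of annotated coordinates into the 256 × 256 crop shown:

```python
import numpy as np

def remap_keypoints(kpts, out_size=256):
    """Map keypoint coordinates into an out_size x out_size crop whose
    bounding box tightly contains all annotated keypoints."""
    x0, y0 = kpts.min(axis=0)
    x1, y1 = kpts.max(axis=0)
    scale = out_size / np.array([x1 - x0, y1 - y0])  # per-axis stretch
    return (kpts - np.array([x0, y0])) * scale

kpts = np.array([[30.0, 40.0], [200.0, 90.0], [120.0, 300.0]])
new = remap_keypoints(kpts)
```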
The working process and principle of the invention are as follows: existing deep graph matching schemes based on graph embedding ignore second-order edge-to-edge similarity information, which causes a loss of accuracy. The invention introduces this information through a deep graph matching model based on cross-graph embedding and applies it to the retrieval of images of similar objects; this improves matching performance while markedly reducing memory consumption, and ultimately greatly improves the performance and efficiency of image retrieval.
The above embodiments are preferred embodiments of the present invention, but the present invention is not limited to the above embodiments, and any other changes, modifications, substitutions, combinations, and simplifications which do not depart from the spirit and principle of the present invention should be construed as equivalents thereof, and all such changes, modifications, substitutions, combinations, and simplifications are intended to be included in the scope of the present invention.

Claims (9)

1. A graph matching method for similar image retrieval, characterized in that the graph matching method mainly comprises two stages, off-line data set construction and on-line deep learning training, with the following specific steps:
stage one: constructing the data set for off-line deep graph matching;
step S1: selecting the Pascal VOC data set as the training data set;
step S2: selecting a number of annotated images covering all categories of the data set as the training set;
stage two: training the deep graph matching network on line;
step S3: adopting a pre-trained VGG-16 neural network as the feature extractor, wherein the parameters of the neural network are pre-trained on the ImageNet data set;
step S4: generating, for each image, a topological structure of bidirectional edges through the fully-connected Delaunay triangulation technique;
step S5: after the point feature embedding of the topological-geometric information is completed, carrying out the feature description of the edges on the basis of the point-edge incidence matrix;
step S6: constructing the edge-to-edge similarity matrix Ke from the edge feature description vectors of the respective graphs;
step S61: constructing the point-edge pairing relations of graph matching into an association graph model;
step S62: associating, according to the topological structure of the association graph, the edge similarity scores and the point similarities to obtain the cross-graph transformation matrix;
step S63: performing the cross-graph point embedding operation by taking the cross-graph assignment matrix as prior information;
step S7: through the above steps, obtaining the final point features V1 and V2 (the cross-graph-enhanced point features of the two graphs), and then calculating the similarity matrix of point-to-point matching.
2. The graph matching method for similar image retrieval according to claim 1, wherein said step S3 further includes the steps of: the feature extractor yields the point features F1 ∈ R^(n1×d) and F2 ∈ R^(n2×d) of the two images to be matched, where d is the dimension of the feature vectors, and n1 and n2 are the numbers of feature points of the two images; F1 and F2 are obtained by concatenating the outputs of the relu4_2 and relu5_1 layers of the VGG-16 neural network.
3. The graph matching method for similar image retrieval according to claim 1, wherein said step S4 further includes the steps of: the attribute of each edge is formed by the normalized coordinates of its two endpoints, and the connectivity of the edges represents the topological structure of each graph; the point feature information and the edge attribute information are then fed as inputs into the graph neural network SplineCNN; SplineCNN serves as the geometric-topological information embedding technique, with MAX aggregation adopted for structural information aggregation; finally, the point features U1 and U2, embedded with the respective geometric-topological information, are obtained.
4. The graph matching method for similar image retrieval according to claim 1, wherein said step S5 further includes the steps of: the point-edge incidence matrices of the two graphs are G1, H1 ∈ {0,1}^(n1×e1) and G2, H2 ∈ {0,1}^(n2×e2), where e1 and e2 are the numbers of edges of the two graphs; G(i,k) = H(j,k) = 1 indicates that edge k starts at node i and ends at node j; the edge features X1 and X2 are defined from the embedded point features U1 and U2 as follows:
X1 = [G1^T U1, H1^T U1], X2 = [G2^T U2, H2^T U2] (formula (1))
5. The graph matching method for similar image retrieval according to claim 1, wherein said step S6 further includes the steps of: the edge-to-edge similarity matrix Ke is computed from the edge feature matrices X1 and X2 of the two graphs:
Ke = X1 Λ X2^T (formula (2))
where Λ is a training parameter; each element of Ke represents edge-to-edge matching information; to widen the gap between edge similarity values, i.e. to emphasize high similarity values and compress low ones, the Ke matrix is normalized to obtain the normalized ε matrix:
ε = softmax(Ke) (formula (3))
then the normalized ε matrix is converted, through the structure of the association graph, into the cross-graph transformation matrix T ∈ R^(n1×n2) (formula (4)); based on T, the cross-graph feature embedding information is obtained; for node i of the first graph, the cross-graph message m(i), aggregated over the nodes j of the second graph, is calculated as follows:
m(i) = Σ_j T(i,j) U2(j) (formula (5))
finally, a vector addition is performed between the cross-graph feature information and the point feature information:
V1(i) = U1(i) + m(i) (formula (6))
a similar operation is also performed for the feature points of the second graph.
6. The graph matching method for similar image retrieval according to claim 1, wherein said step S7 further includes the steps of: the similarity matrix is computed from the final point features V1 and V2:
S = V1 V2^T (formula (7))
the linear solution of the graph matching problem relies on the Sinkhorn iterative algorithm: starting from the score matrix S, rows and columns are normalized alternately to obtain the soft assignment matrix P ∈ [0,1]^(n1×n2):
P = Sinkhorn(exp(S)) (formula (8))
7. The graph matching method for similar image retrieval according to claim 1, further comprising step S8: given the ground-truth assignment matrix P* ∈ {0,1}^(n1×n2) and the soft assignment matrix P, the error is obtained by constructing a cross-entropy loss function:
L = −Σ_(i,j) [P*(i,j) log P(i,j) + (1 − P*(i,j)) log(1 − P(i,j))] (formula (9))
8. The graph matching method for similar image retrieval according to claim 1, wherein said step S1 further includes the steps of: the data set contains images of several different categories: airplanes, bicycles, birds, boats, bottles, buses, cars, cats, chairs, cattle, tables, dogs, horses, motorcycles, people, plants, sheep, sofas, trains and televisions; each image contains 6 to 23 annotated feature point image coordinates.
9. The graph matching method for similar image retrieval according to claim 1, wherein said step S2 further includes the steps of: 1682 images are correspondingly selected as the test set; for each image to be trained, a bounding box containing all annotated feature points is extracted, the image is resized to 256 × 256, and it then enters the deep learning network for training.
CN202111634430.5A 2021-12-29 2021-12-29 Picture matching method for similar image retrieval Active CN114491122B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111634430.5A CN114491122B (en) 2021-12-29 2021-12-29 Picture matching method for similar image retrieval


Publications (2)

Publication Number Publication Date
CN114491122A true CN114491122A (en) 2022-05-13
CN114491122B CN114491122B (en) 2023-07-14

Family

ID=81496804

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111634430.5A Active CN114491122B (en) 2021-12-29 2021-12-29 Picture matching method for similar image retrieval

Country Status (1)

Country Link
CN (1) CN114491122B (en)


Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106126581A (en) * 2016-06-20 2016-11-16 复旦大学 Cartographical sketching image search method based on degree of depth study
CN108595636A (en) * 2018-04-25 2018-09-28 复旦大学 The image search method of cartographical sketching based on depth cross-module state correlation study
CN110263795A (en) * 2019-06-04 2019-09-20 华东师范大学 One kind is based on implicit shape and schemes matched object detection method
CN111488498A (en) * 2020-04-08 2020-08-04 浙江大学 Node-graph cross-layer graph matching method and system based on graph neural network
CN112801206A (en) * 2021-02-23 2021-05-14 中国科学院自动化研究所 Image key point matching method based on depth map embedded network and structure self-learning


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHOU F et al.: "Factorized graph matching", IEEE Transactions on Pattern Analysis and Machine Intelligence *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115063789A (en) * 2022-05-24 2022-09-16 中国科学院自动化研究所 3D target detection method and device based on key point matching
CN115063789B (en) * 2022-05-24 2023-08-04 中国科学院自动化研究所 3D target detection method and device based on key point matching

Also Published As

Publication number Publication date
CN114491122B (en) 2023-07-14

Similar Documents

Publication Publication Date Title
CN111753060B (en) Information retrieval method, apparatus, device and computer readable storage medium
CN110232152B (en) Content recommendation method, device, server and storage medium
CN109829104B (en) Semantic similarity based pseudo-correlation feedback model information retrieval method and system
US7437349B2 (en) Adaptive probabilistic query expansion
CN108268600B (en) AI-based unstructured data management method and device
CN111353030A (en) Knowledge question and answer retrieval method and device based on travel field knowledge graph
CN110765277B (en) Knowledge-graph-based mobile terminal online equipment fault diagnosis method
CN110399515B (en) Picture retrieval method, device and system
CN106033426B (en) Image retrieval method based on latent semantic minimum hash
Shi et al. Deep adaptively-enhanced hashing with discriminative similarity guidance for unsupervised cross-modal retrieval
CN111985228B (en) Text keyword extraction method, text keyword extraction device, computer equipment and storage medium
CN109408578B (en) Monitoring data fusion method for heterogeneous environment
CN108509543B (en) Streaming RDF data multi-keyword parallel search method based on Spark Streaming
CN115796181A (en) Text relation extraction method for chemical field
CN114911951A (en) Knowledge graph construction method for man-machine cooperation assembly task
CN111325030A (en) Text label construction method and device, computer equipment and storage medium
CN110866129A (en) Cross-media retrieval method based on cross-media uniform characterization model
CN114491122B (en) Picture matching method for similar image retrieval
CN113449066B (en) Method, processor and storage medium for storing cultural relic data by using knowledge graph
Prasomphan Toward Fine-grained Image Retrieval with Adaptive Deep Learning for Cultural Heritage Image.
CN116756347A (en) Semantic information retrieval method based on big data
CN116523041A (en) Knowledge graph construction method, retrieval method and system for equipment field and electronic equipment
CN115797795A (en) Remote sensing image question-answering type retrieval system and method based on reinforcement learning
CN111506754B (en) Picture retrieval method, device, storage medium and processor
CN114036946B (en) Text feature extraction and auxiliary retrieval system and method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant