LU501692B1 - Depth weighted hash learning method based on spatial importance - Google Patents


Info

Publication number
LU501692B1
Authority
LU
Luxembourg
Prior art keywords
image
importance
hash
depth
region
Prior art date
Application number
LU501692A
Other languages
German (de)
Inventor
Xiushan Nie
Yilong Yin
Yang Shi
Original Assignee
Univ Shandong Jianzhu
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Univ Shandong Jianzhu filed Critical Univ Shandong Jianzhu
Priority to LU501692A priority Critical patent/LU501692B1/en
Application granted granted Critical
Publication of LU501692B1 publication Critical patent/LU501692B1/en


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G06N 3/09 Supervised learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F 16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The present disclosure provides a depth weighted hash learning method based on spatial importance, comprising the following steps: (1) spatial importance extraction: constructing a depth spatial importance learning model to obtain an importance region and a non-importance region of an image; (2) hash learning of the importance region and the non-importance region: feeding the importance region and the non-importance region of the image into different deep convolutional neural networks to learn hash codes, and combining the two types of hash codes into a final hash representation. Compared with the prior art, the present disclosure realizes hierarchical hash code learning over different regions of an image, performs hash coding according to the importance of the different regions, and finally fuses the results to obtain the hash code of the image. The present disclosure embodies the effect of different parts of the image on hash learning and improves the accuracy of hash retrieval.

Description

DEPTH WEIGHTED HASH LEARNING METHOD BASED ON SPATIAL IMPORTANCE
TECHNICAL FIELD
[01] The present disclosure relates to a depth weighted hash learning method based on spatial importance, and falls within the technical fields of multimedia signal processing and large-scale data retrieval.
BACKGROUND ART
[02] With the rapid development of the Internet, cloud computing, social media and other information technologies in recent years, people can use sensors more easily, so the data uploaded by sensors include a large number of images and videos. According to a briefing by the China Mobile Institute, humans created 18 billion gigabytes (GB) of data in 2011, and this volume was growing by more than 60% per year; global data was expected to reach 350 gigabytes (GB) per year by 2020. How to process these data has become an urgent problem, comparing the similarity of these data is a particular difficulty, and the nearest neighbor search method has emerged in response.
[03] Traditional nearest neighbor search finds the items in a database that are most similar to the target data based on the similarity of the data. This similarity is typically quantified as the distance between the data points in a space, on the premise that the closer two data points are in space, the more similar they are. However, with the continuous upgrading of image acquisition equipment, nearest neighbor search struggles to meet people's needs because it is slow on high-dimensional data; that is, traditional retrieval methods cannot obtain ideal retrieval results within an acceptable time. There is thus an urgent need for a method that overcomes the shortcomings of nearest neighbor retrieval, and the outstanding retrieval speed of approximate nearest neighbor retrieval has attracted the attention of researchers.
[04] Approximate nearest neighbor retrieval exploits the tendency of data to form cluster-like distributions as data volume grows. By classifying or encoding the data in a database through cluster analysis, the category to which the target data belongs can be predicted from its characteristics, and some or all of that category is returned as the retrieval result. The core idea of approximate nearest neighbor retrieval is to search for data items that are likely to be neighbors rather than guaranteeing the exact nearest items, trading a tolerable loss of accuracy for retrieval efficiency, which makes it possible to obtain satisfactory results within an acceptable time. As an approach to approximate nearest neighbor retrieval, hashing maps high-dimensional data from the visual space into compact binary codes in the Hamming space, and has attracted extensive attention from researchers due to its low storage cost and efficient computation.
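As a minimal illustration of why Hamming-space codes retrieve quickly (not part of the claimed method; the code length, database size and variable names are illustrative assumptions), distances between binary codes reduce to cheap bitwise comparisons:

```python
# Minimal sketch: ranking binary hash codes by Hamming distance.
# All sizes and names are illustrative, not taken from the disclosure.
import numpy as np

rng = np.random.default_rng(0)
db_codes = rng.integers(0, 2, size=(100_000, 64), dtype=np.uint8)  # 64-bit codes
query = rng.integers(0, 2, size=64, dtype=np.uint8)

# Hamming distance = number of differing bits between two codes
dists = np.count_nonzero(db_codes != query, axis=1)
top10 = np.argsort(dists)[:10]  # indices of the 10 nearest database codes
print(top10, dists[top10])
```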
SUMMARY
[05] The present disclosure considers spatial importance information from the perspective of the degree of contribution to image recognition; that is, if the data at a certain pixel position contributes greatly to recognizing the image, the spatial importance of that pixel position is considered high, and otherwise it is considered low. On the basis of researching and utilizing this spatial importance information, the present disclosure proposes a depth weighted hash learning method based on spatial importance, which improves the performance of hash learning. Compared with existing hashing techniques, the present disclosure can learn spatial importance information and use it to learn hash codes, which improves the efficiency and accuracy of large-scale data retrieval using hashing techniques. In the existing literature and techniques, there is no technique or method that obtains a hash code by weighting spatial importance information.
[06] The technical solution adopted by the present disclosure is as follows:
[07] A depth weighted hash learning method based on spatial importance, characterized in that the method comprises the steps of:
[08] (1) learning spatial importance information using a depth network: constructing a depth spatial importance learning model, i.e. feeding an image into the depth network, which learns the spatial importance information of the image from the sensitivity of each pixel position to image classification and from the classification label information of the image, wherein the spatial importance information characterizes the degree to which the data at each pixel position of the original image contributes to recognizing the whole image; if the data at a certain pixel position greatly helps recognition of the image, the spatial importance of that pixel position is considered high, and otherwise it is considered low;
[09] (2) hash learning of an importance region and a non-importance region, wherein the specific steps include
[10] (1) generating the importance region of the image and the non-importance region of the image through the importance information obtained in step (1) and the original image;
[11] (2) feeding the importance region of the image and the non-importance region of the image into two different depth networks, respectively;
[12] (3) using the two depth networks to establish a mapping relationship between a hash code and an original feature so as to obtain a hash code of the importance region of the image and a hash code of the non-importance region of the image; and
[13] (4) concatenating the hash code of the importance region of the image and the hash code of the non-importance region of the image to obtain a final hash code, as sketched in the example below.
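A minimal sketch of steps (1)-(4), assuming PyTorch; the backbone architecture, the 32-bit code length and the threshold-based region masking are illustrative assumptions rather than the disclosure's fixed choices:

```python
# Two-branch hash learning sketch. The tiny CNNs, 32-bit codes and the 0.5
# masking threshold are assumptions for illustration only.
import torch
import torch.nn as nn

class HashBranch(nn.Module):
    """Small CNN mapping an image region to a k-bit real-valued code."""
    def __init__(self, k: int = 32):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(32, k)

    def forward(self, x):
        return torch.tanh(self.fc(self.features(x).flatten(1)))  # values in (-1, 1)

def split_by_importance(img, importance, thresh=0.5):
    """Step (1): mask the image into an importance and a non-importance region."""
    mask = (importance > thresh).float()            # (B, 1, H, W)
    return img * mask, img * (1.0 - mask)

imp_net, non_net = HashBranch(32), HashBranch(32)   # step (2): two depth networks
img = torch.randn(4, 3, 64, 64)                     # a batch of images
importance = torch.rand(4, 1, 64, 64)               # from the importance model
imp_region, non_region = split_by_importance(img, importance)
d_imp, d_non = imp_net(imp_region), non_net(non_region)    # step (3): two codes
final_code = torch.sign(torch.cat([d_imp, d_non], dim=1))  # step (4): 64 bits
```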
[14] Preferably, in step (2), a hash joint optimization objective function is established via sample label information, sample similarity information and quantization information, and a hash representation is obtained via the optimization objective function, wherein the objective function is as follows:

$$\min_{B} L = L_s + \eta L_q + \beta L_c = -\sum_{s_{ij} \in S} \left( s_{ij}\,\theta_{ij} - \log\!\left(1 + e^{\theta_{ij}}\right) \right) + \eta \sum_{i=1}^{N} \lVert b_i - d_i \rVert_2^2 - \beta \sum_{i=1}^{N} \left( y_i \log \hat{y}_i + (1 - y_i) \log\!\left(1 - \hat{y}_i\right) \right),$$
[15] $B$ being the hash codes of all pictures, $L_s$ representing a similarity loss, $L_q$ representing a quantization loss, $L_c$ representing a classification loss, and $\eta$ and $\beta$ being trade-off parameters; in $L_s$, $S$ being a similarity matrix and $s_{ij}$ being the similarity of an image $i$ and an image $j$ in the similarity matrix, $s_{ij}$ being 1 if image $i$ and image $j$ are of the same type and 0 if they are of different types, and $b_i$ and $b_j$ being the hash codes of image $i$ and image $j$; in $L_q$, $b_i$ being the hash code of image $i$ and $d_i$ being the output of the depth network; and in $L_c$, $y_i$ being the label information of image $i$ and $\hat{y}_i$ being the prediction obtained by the network.
[16] Preferably, the depth network comprises a convolutional neural network (CNN) and a fully convolutional network (FCN).
[17] The present disclosure implements a depth hash learning method with weighted spatial importance, making full use of spatial importance information in each image and improving the performance of hash retrieval.
BRIEF DESCRIPTION OF THE DRAWING
[18] FIG 1 is a schematic diagram of a depth weighted hash learning method based on spatial importance of the present disclosure.
DETAILED DESCRIPTION OF THE EMBODIMENTS
[19] Hereinafter, the present disclosure will be described in detail with reference to the accompanying drawings.
[20] The method of the present disclosure comprises the following specific steps according to the scheme shown in FIG 1.
[21] (1) Spatial importance extraction
[22] The input to the network is the original image and the output is the degree of importance of each pixel position of the image. The weights of the feature maps can be learned from the classification information of the image, and the degree of importance is obtained by weighting the feature maps accordingly.
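One common way to realise "weighting feature maps by classification information" is a class-activation-style map; the disclosure does not fix a specific formula, so the following is a hedged Grad-CAM-style sketch in PyTorch, with the ResNet-18 backbone chosen purely for illustration:

```python
# Grad-CAM-style importance map: feature maps of the last conv stage are
# weighted by the gradient of the classification score. An illustrative
# sketch only; the disclosure does not prescribe this exact formula.
import torch
import torch.nn.functional as F
import torchvision.models as models

model = models.resnet18(weights=None).eval()
feats = {}
model.layer4.register_forward_hook(
    lambda module, inputs, output: feats.update(maps=output)
)

img = torch.randn(1, 3, 224, 224)                     # stand-in input image
logits = model(img)
score = logits[0, logits.argmax()]                    # classification signal
grads = torch.autograd.grad(score, feats["maps"])[0]  # (1, C, h, w)

weights = grads.mean(dim=(2, 3), keepdim=True)        # one weight per feature map
cam = F.relu((weights * feats["maps"]).sum(dim=1, keepdim=True))
cam = F.interpolate(cam, size=img.shape[2:], mode="bilinear", align_corners=False)
importance = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # map in [0, 1]
```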
[23] In the feature extraction stage, two types of networks are available according to the actual application requirements:
[24] (1) convolutional neural network (CNN): an existing convolutional neural network (CNN) model can be used;
[25] (2) fully convolutional network (FCN): an existing fully convolutional network (FCN) model can be used, or one can be obtained by modifying an existing convolutional neural network (CNN), as sketched below.
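A hedged sketch of option (2), assuming PyTorch and a VGG-16 backbone (an illustrative choice): a classification CNN becomes fully convolutional once its fully connected head is replaced by a 1×1 convolution, so it outputs a spatial score map instead of a single vector:

```python
# Converting a classification CNN into an FCN by swapping the classifier
# head for a 1x1 convolution. Backbone and class count are assumptions.
import torch
import torch.nn as nn
import torchvision.models as models

cnn = models.vgg16(weights=None)
features = cnn.features                    # convolutional layers, kept as-is
head = nn.Conv2d(512, 10, kernel_size=1)   # 10 illustrative output classes

fcn = nn.Sequential(features, head)
out = fcn(torch.randn(1, 3, 224, 224))
print(out.shape)  # torch.Size([1, 10, 7, 7]) -- one score map per class
```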
[26] (2) Hash learning of an importance region and a non-importance region
[27] In the hash learning stage, a hash joint optimization objective function is established through sample label information, sample similarity information and quantization information, and a hash representation is obtained through the optimization objective function; the present disclosure proposes the following optimization problem:
[28]
$$\min_{B} L = L_s + \eta L_q + \beta L_c = -\sum_{s_{ij} \in S} \left( s_{ij}\,\theta_{ij} - \log\!\left(1 + e^{\theta_{ij}}\right) \right) + \eta \sum_{i=1}^{N} \lVert b_i - d_i \rVert_2^2 - \beta \sum_{i=1}^{N} \left( y_i \log \hat{y}_i + (1 - y_i) \log\!\left(1 - \hat{y}_i\right) \right),$$
[29] $B$ being the hash codes of all pictures, $L_s$ representing a similarity loss, $L_q$ representing a quantization loss, $L_c$ representing a classification loss, and $\eta$ and $\beta$ being trade-off parameters; in $L_s$, $S$ being a similarity matrix and $s_{ij}$ being the similarity of an image $i$ and an image $j$ in the similarity matrix, $s_{ij}$ being 1 if image $i$ and image $j$ are of the same type and 0 if they are of different types, and $b_i$ and $b_j$ being the hash codes of image $i$ and image $j$; in $L_q$, $b_i$ being the hash code of image $i$ and $d_i$ being the output of the depth network; and in $L_c$, $y_i$ being the label information of image $i$ and $\hat{y}_i$ being the prediction obtained by the network.
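A hedged PyTorch sketch of this joint objective; the pairwise term $\theta_{ij} = \tfrac{1}{2} d_i^\top d_j$ is a common choice in deep pairwise hashing and is an assumption here, as are the default parameter values:

```python
# Joint loss L = L_s + eta*L_q + beta*L_c from the disclosure; theta_ij and
# the default eta/beta values are illustrative assumptions.
import torch
import torch.nn.functional as F

def joint_loss(d, b, labels, y_hat, y, eta=0.1, beta=0.1):
    """d: (N, k) network outputs; b: (N, k) binary codes in {-1, +1};
    labels: (N,) class ids used to build S; y_hat, y: (N, C) predictions/targets."""
    s = (labels.unsqueeze(0) == labels.unsqueeze(1)).float()  # similarity matrix S
    theta = 0.5 * d @ d.t()
    l_s = -(s * theta - F.softplus(theta)).sum()   # softplus(x) = log(1 + e^x)
    l_q = (b - d).pow(2).sum()                     # quantization ||b_i - d_i||^2
    l_c = F.binary_cross_entropy(y_hat, y, reduction="sum")  # classification loss
    return l_s + eta * l_q + beta * l_c
```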
[30] Table 1 reports a simulation experiment of the method of the present disclosure, measured by MAP (Mean Average Precision), performed on three common databases: CIFAR-10, MS-COCO and NUS-WIDE. Table 1 compares the performance of the present disclosure (SIWH) with other algorithms. From Table 1 it can be observed that SIWH is significantly better than the other algorithms across different code lengths and data sets. The MAP values of SIWH on the CIFAR-10 and NUS-WIDE data sets achieved average performance improvements of 2.57% and 1.29%, respectively, compared to the best-performing deep hashing method ADSH. On the MS-COCO data set, the average performance of SIWH improved by 4.47% compared to the best-performing deep hashing method DOH. These substantial improvements demonstrate the effectiveness of the proposed method.
[31] Table 1 MAP performance comparison of the present disclosure with other algorithms.
[Table 1: MAP of SIWH versus baselines including LSH, SH, PCAH, FSSH, DSH, DPSH, DCH, ADSH and DOH at code lengths from 24 to 128 bits on CIFAR-10, NUS-WIDE and MS-COCO; SIWH attains the highest MAP in every setting. The individual numeric entries are garbled in the source scan.]

Claims (2)

WHAT IS CLAIMED IS:
1. A depth weighted hash learning method based on spatial importance, the method comprising the steps of: (1) learning spatial importance information using a depth network: constructing a depth spatial importance learning model, i.e. feeding an image into the depth network, which learns the spatial importance information of the image from the sensitivity of each pixel position of the image to image classification and from the classification label information of the image; (2) hash learning of an importance region and a non-importance region, the depth network comprising a convolutional neural network (CNN) and a fully convolutional network (FCN), wherein the specific steps comprise (1) generating the importance region of the image and the non-importance region of the image through the importance information obtained in step (1) and the original image; (2) feeding the importance region of the image and the non-importance region of the image into two different depth networks; (3) using the two depth networks to establish a mapping relationship between a hash code and an original feature so as to obtain a hash code of the importance region of the image and a hash code of the non-importance region of the image; and (4) concatenating the hash code of the importance region of the image and the hash code of the non-importance region of the image to obtain a final hash code.
2. The depth weighted hash learning method based on spatial importance according to claim 1, characterized in that: in step (2), a hash joint optimization objective function is established via sample label information, sample similarity information and quantization information, and a hash representation is obtained via the optimization objective function, wherein the objective function is as follows:
$$\min_{B} L = L_s + \eta L_q + \beta L_c = -\sum_{s_{ij} \in S} \left( s_{ij}\,\theta_{ij} - \log\!\left(1 + e^{\theta_{ij}}\right) \right) + \eta \sum_{i=1}^{N} \lVert b_i - d_i \rVert_2^2 - \beta \sum_{i=1}^{N} \left( y_i \log \hat{y}_i + (1 - y_i) \log\!\left(1 - \hat{y}_i\right) \right),$$
$B$ being the hash codes of all pictures, $L_s$ representing a similarity loss, $L_q$ representing a quantization loss, $L_c$ representing a classification loss, and $\eta$ and $\beta$ being parameters; in $L_s$, $S$ being a similarity matrix and $s_{ij}$ being the similarity of an image $i$ and an image $j$ in the similarity matrix, $s_{ij}$ being 1 if image $i$ and image $j$ are of the same type and 0 if they are of different types, and $b_i$ and $b_j$ being the hash codes of image $i$ and image $j$; in $L_q$, $b_i$ being the hash code of image $i$ and $d_i$ being the output of the depth network; and in $L_c$, $y_i$ being the label information of image $i$ and $\hat{y}_i$ being the prediction obtained by the network.
LU501692A 2022-03-18 2022-03-18 Depth weighted hash learning method based on spatial importance LU501692B1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
LU501692A LU501692B1 (en) 2022-03-18 2022-03-18 Depth weighted hash learning method based on spatial importance

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
LU501692A LU501692B1 (en) 2022-03-18 2022-03-18 Depth weighted hash learning method based on spatial importance

Publications (1)

Publication Number Publication Date
LU501692B1 true LU501692B1 (en) 2022-09-20

Family

ID=83322754

Family Applications (1)

Application Number Title Priority Date Filing Date
LU501692A LU501692B1 (en) 2022-03-18 2022-03-18 Depth weighted hash learning method based on spatial importance

Country Status (1)

Country Link
LU (1) LU501692B1 (en)

Similar Documents

Publication Publication Date Title
CN111198959B (en) Two-stage image retrieval method based on convolutional neural network
CN104008395B (en) A kind of bad video intelligent detection method based on face retrieval
CN101923653B (en) Multilevel content description-based image classification method
CN102902826B (en) A kind of image method for quickly retrieving based on reference picture index
CN104036012A (en) Dictionary learning method, visual word bag characteristic extracting method and retrieval system
CN101539930A (en) Search method of related feedback images
CN105718555A (en) Hierarchical semantic description based image retrieving method
万华林 Image semantic classification by using SVM
CN104899326A (en) Image retrieval method based on binary multi-index Hash technology
CN114565053A (en) Deep heterogeneous map embedding model based on feature fusion
CN115357747A (en) Ordinal hash-based image retrieval method and system
CN111506760B (en) Depth integration measurement image retrieval method based on difficult perception
CN104361135A (en) Image retrieval method
CN110442736A (en) A kind of semantically enhancement subspace cross-media retrieval method based on quadratic discriminatory analysis
CN114612747A (en) Remote sensing image retrieval method based on unsupervised weighted hash
CN117953297A (en) Bottom reasoning method for scene recognition and classification in traffic field
LU501692B1 (en) Depth weighted hash learning method based on spatial importance
CN117763185A (en) Hash image retrieval method based on thinking space dimension
CN108985346A (en) Fusion low layer pictures feature and showing for CNN feature survey image search method
CN116186350B (en) Power transmission line engineering searching method and device based on knowledge graph and topic text
CN114168782B (en) Deep hash image retrieval method based on triplet network
Min et al. Overview of content-based image retrieval with high-level semantics
CN111222003B (en) Deep weighted hash learning method based on spatial importance
CN110717068A (en) Video retrieval method based on deep learning
CN113516209B (en) Comparison task adaptive learning method for few-sample intention recognition

Legal Events

Date Code Title Description
FG Patent granted

Effective date: 20220920