CN107133348A

CN107133348A - Approximate nearest-neighbor search method based on semantic consistency for large-scale image sets

Info

Publication number: CN107133348A
Application number: CN201710368677.4A
Authority: CN
Inventors: 胡鸣珂; ***; 吕成钢
Original assignee: Individual
Current assignee: Individual
Priority date: 2017-05-23
Filing date: 2017-05-23
Publication date: 2017-09-05
Anticipated expiration: 2037-05-23
Also published as: CN107133348B

Abstract

The invention discloses an approximate nearest-neighbor search method based on semantic consistency for large-scale image sets. The method comprises a transformation-matrix training process and a hash encoding process. In the training process, semantic consistency is introduced when computing the similarity between the images in the image set and the sampled images, which yields the transformation matrix needed in the next stage. In the hash encoding process, the transformation matrix obtained from training is used to compute the optimized similarity between each image and the sampled images; a similarity matrix is built from the optimized similarities, and each image in the image set is then binary-encoded using a hash coding technique. The Hamming distances between the binary code of a new query image and the binary code of each image are then compared to find the neighbors of the query image. By introducing semantic consistency into the similarity measure, the invention measures the similarity between images more accurately; by using stochastic gradient descent, it reduces the training time of the algorithm, so that it can be applied effectively to large-scale image datasets.

Description

Approximate nearest-neighbor search method based on semantic consistency for large-scale image sets

Technical field

The present invention relates to a method for approximate nearest-neighbor search of images in large-scale image datasets, and belongs to the technical field of machine learning.

Background art

An important application of nearest-neighbor queries is the nearest-neighbor search of images. In the big-data era, the most notable characteristics of image data are that the data scale is very large and that the feature dimension of each image is very high. Efficient and accurate nearest-neighbor queries over massive high-dimensional image collections are therefore of great practical value to research in frontier disciplines such as computer vision and machine learning.

Traditional nearest-neighbor query algorithms, such as those based on tree index structures, all suffer from the curse of dimensionality. Because their performance degrades sharply when performing approximate nearest-neighbor search on high-dimensional image data, they are no longer suitable for the current big-data era. The most popular approach today is approximate nearest-neighbor search based on hashing. Classical hashing algorithms for this task, such as locality-sensitive hashing (LSH), solve the neighbor-search problem by converting it into a search for similar binary codes. Hashing-based approximate search algorithms have a simpler index structure and need less storage space. However, to guarantee precision and recall, LSH has to build multiple hash tables at the same time, which substantially increases query time and storage overhead.

Hashing algorithms that generate more efficient codes appeared subsequently. Graph-based hashing algorithms, such as spectral hashing (SH) and anchor graph hashing (AGH), achieve better performance because they measure the similarity between image samples more accurately. However, these algorithms are too one-sided when finding neighboring images: they consider only the actual positions of the images in the dataset and ignore the semantic label information that the images may carry, so they perform poorly in approximate image search. In real large-scale image datasets, many images carry semantic label information, and different class labels indicate that images belong to different categories. For example, two images may lie far apart in the dataset yet share the same class label "sky"; these two images are then also approximate images of each other. Currently popular approximate image search algorithms often perform poorly on large-scale image datasets and cannot solve practical problems well.

Summary of the invention

The technical problem to be solved by the invention is to provide an approximate image search method based on semantic consistency for large-scale image datasets. It mainly solves the problem of approximate image search by mapping similar images to identical or similar binary codes through hashing.

To solve the above technical problem, the present invention adopts the following technical scheme:

The present invention proposes an approximate nearest-neighbor search method based on semantic consistency for large-scale image sets, the method comprising the following steps:

Step 1: Input the image-set sample matrix X and the corresponding semantic class-label matrix Y, where X is an n×d matrix, Y is an n×c matrix, n is the number of image samples, d is the dimension of the image features, and c is the number of class labels;

Step 2: Randomly select a portion of the images from the image set as the sampled image set;

Step 3: Define the relation matrix W between the images in the image set and the sampled image set, build the objective function by combining the relation matrix with semantic consistency, and solve it iteratively by stochastic gradient descent; once the objective converges, the optimized transformation matrix A is obtained;

Step 4: For each image sample x, substitute the transformation matrix A into the relation matrix defined in step 3 to obtain the value of each element of the relation matrix; construct the similarity matrix Z from the relation matrix and obtain the encoding matrix from the similarity matrix; hash-encode each image in the large-scale image dataset using the encoding matrix, compressing each image from its original d-dimensional features to a k-bit binary code;

Step 5: For a new query image q, compute its binary code with the encoding matrix and compare its Hamming distance to the binary code of each image in the image dataset; if the Hamming distance is less than the set threshold r, the two images are considered approximate images.
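As an illustration of steps 1 and 2, the following minimal Python sketch builds toy inputs with the stated shapes and draws a sampled image set uniformly at random. The function name `sample_anchor_images`, the toy sizes, and the one-hot label layout are assumptions made for this example only; the text does not prescribe how many images to sample or how to draw them.

```python
import numpy as np

def sample_anchor_images(X, Y, m, seed=0):
    """Randomly pick m images (rows of X) and their label rows as the sampled set."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(X.shape[0], size=m, replace=False)  # m distinct row indices
    return X[idx], Y[idx], idx

# Toy data (hypothetical sizes): n images, d-dimensional features, c classes.
n, d, c, m = 200, 16, 4, 20
rng = np.random.default_rng(42)
X = rng.normal(size=(n, d))            # image feature matrix, n x d
labels = rng.integers(0, c, size=n)
Y = np.eye(c)[labels]                  # class-label matrix, n x c (one-hot rows)
U, F_U, anchor_idx = sample_anchor_images(X, Y, m)
```

Here `U` holds the features of the sampled images and `F_U` their class-label vectors, which is all the later steps need from the sampled set.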

Further, in the approximate nearest-neighbor search method of the invention, the transformation matrix A is computed as follows:

Step (1): Define the relation matrix W between images as an n×n matrix, each element of which is defined as:

$W_{ij} = \exp\left(-\|A(x_i - u_j)\|^2\right)$ (1)

In the above formula, A denotes the transformation matrix, $x_i$ denotes the i-th image in the image set, and $u_j$ denotes the j-th image in the sampled image set;
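Formula (1) can be evaluated for all image and sample pairs at once. The sketch below is one possible NumPy rendering; the function name `relation_matrix` is hypothetical, and because the column index j runs over the sampled image set, the example stores only the n × m entries that formula (1) actually defines, which is an assumption about layout rather than something the text states.

```python
import numpy as np

def relation_matrix(X, U, A):
    """W[i, j] = exp(-||A (x_i - u_j)||^2) for images X (n x d) and samples U (m x d)."""
    XA = X @ A.T                              # row i is (A x_i)
    UA = U @ A.T                              # row j is (A u_j)
    diff = XA[:, None, :] - UA[None, :, :]    # shape (n, m, d), entries A (x_i - u_j)
    sq_dist = np.sum(diff ** 2, axis=2)       # squared norms, shape (n, m)
    return np.exp(-sq_dist)
```

The broadcasted difference needs O(n·m·d) memory, so a real dataset would be processed in row chunks; the chunking is omitted here for brevity.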

Step (2): The objective function is:

$O(A) = \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{m} W_{ij}\,\|f_i - f_j\|^2$ (2)

where $f_i$ denotes the class-label vector of the i-th image sample; a class-label vector is a c-dimensional column vector whose elements are 1 or 0, indicating that the image does or does not belong to the corresponding class; $f_j$ denotes the class-label vector of an image in the sampled image set; and $\|f_i - f_j\|^2$ is the semantic-consistency term introduced when training the transformation matrix;

Step (3): Update the transformation matrix A by stochastic gradient descent, with the following iterative update rule:

$A_{t+1} = A_t - \gamma_t \frac{\partial O(A)}{\partial A}$ (3)

where $\gamma_t$ is the optimization step size in each iteration; the initial value of the transformation matrix is $I/\delta$, where I is the d×d identity matrix and δ is the median of the Euclidean distances between the images in the image set;

Step (4): After all image samples in the image set have been traversed, the final optimized transformation matrix A is obtained.

Further, in the approximate nearest-neighbor search method of the invention, the step size $\gamma_t$ takes one of the following values: $1\times10^{-5}$, $1\times10^{-4}$, $1\times10^{-3}$, or $1\times10^{-2}$.

Further, in the approximate nearest-neighbor search method of the invention, step 4 is specifically as follows:

Step a: After the transformation matrix A is obtained, formula (1) is used to compute the optimized similarity, with semantic consistency introduced, between every image and the sampled images, which gives the value of each element of the relation matrix W. Let the sampled image set contain m image samples; the similarity matrix Z is then constructed from the relation matrix.

The Z matrix is defined as follows:

$Z_{ij} = \begin{cases} \dfrac{\exp\left(-\|A(x_i - u_j)\|^2\right)}{\sum_{j \in \langle i \rangle} \exp\left(-\|A(x_i - u_j)\|^2\right)}, & \forall j \in \langle i \rangle \\ 0, & \text{otherwise} \end{cases}$ (4)

where $\langle i \rangle$ denotes the sampled image set; that is, the corresponding element of Z is computed only for images that belong to the sampled image set, and the element is 0 otherwise;

Step b: Let the number of images in the sampled image set be m, and construct an m×m matrix M defined as follows:

$M = \Lambda^{-1/2} Z^{T} Z \Lambda^{-1/2}$ (5)

where $\Lambda = \mathrm{diag}(Z^{T}\mathbf{1})$ is a diagonal matrix. Compute the k largest eigenvalues of M, which form the k×k diagonal matrix $\Sigma = \mathrm{diag}(\delta_1, \ldots, \delta_k) \in \mathbb{R}^{k \times k}$, and the eigenvectors corresponding to these k largest eigenvalues, which form the m×k matrix $V = [v_1, \ldots, v_k] \in \mathbb{R}^{m \times k}$;

Step c: From the matrices obtained above, construct the final encoding matrix Y, defined as follows:

$Y = \sqrt{n}\, Z \Lambda^{-1/2} V \Sigma^{-1/2}$ (6)

Y is an n×k matrix, where n is the number of images in the image set and k is the number of bits of the binary code. Each row of the encoding matrix Y is a coding function: each image is passed through the coding function to obtain a k-dimensional vector y, which is then binarized with sgn(y) to give the binary code of each image in the image set.

Further, in the approximate nearest-neighbor search method of the invention, r takes one of the following values: 1, 2, 3, or 4.
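The threshold test on Hamming distance can be sketched as follows. The function name `hamming_neighbors` and the toy 4-bit codes are illustrative assumptions; codes are stored as 0/1 arrays, and the comparison is strict ("less than r") as the text states.

```python
import numpy as np

def hamming_neighbors(query_code, codes, r):
    """Return indices of codes whose Hamming distance to query_code is below r."""
    dist = np.count_nonzero(codes != query_code, axis=1)  # differing bits per row
    return np.flatnonzero(dist < r), dist

# Toy database of three 4-bit codes (illustrative values only).
codes = np.array([[1, 0, 1, 1],
                  [1, 0, 1, 0],
                  [0, 1, 0, 0]], dtype=np.uint8)
query = np.array([1, 0, 1, 1], dtype=np.uint8)
neighbors, dist = hamming_neighbors(query, codes, r=2)
# dist is [0, 1, 4], so with r = 2 the first two images are returned.
```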

The key techniques of the present invention are described below:

(1) Approximate search algorithm based on semantic consistency

The approximate search algorithm based on semantic consistency introduces semantic consistency when computing the similarity between each image in the image dataset and the images in the sampled image set, and constructs an objective function that contains semantic information. The objective is then solved iteratively by stochastic gradient descent; once it converges, the resulting transformation matrix reflects the intrinsic semantic consistency among the images. Hashing is then used to map each image to a k-bit binary code, such that similar input images are mapped to binary codes with small Hamming distance.

(2) Stochastic gradient descent (SGD) algorithm:

Stochastic gradient descent is an improved variant of the gradient descent (GD) algorithm. It mainly addresses the problems that the original gradient descent algorithm converges too slowly and is easily trapped in local optima, and it is an iterative method for minimizing a loss function or risk function. The present invention uses stochastic gradient descent to reduce the training time of the transformation matrix in the semantic-consistency approximate search method.

Compared with the prior art, the present invention, by adopting the above technical scheme, has the following technical effects:

1. It solves the problem that the traditional gradient descent algorithm converges too slowly.

2. By computing the optimized similarity between images with the transformation matrix, it removes the excessive dependence on a sensitive parameter that arises when image similarity is measured with a traditional Gaussian kernel function.

3. By using hashing to compress the original d-dimensional image features into a k-bit binary code, it greatly improves the efficiency of the algorithm and greatly reduces memory usage.
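The memory effect in item 3 can be made concrete with a small calculation. The sizes below (n = 1000 images, d = 512 float32 features, k = 32 bits) are hypothetical values chosen for the example, and `pack_codes` is an illustrative helper built on NumPy's `packbits`.

```python
import numpy as np

def pack_codes(bits):
    """Pack an (n, k) array of 0/1 bits into bytes, 8 bits per byte."""
    return np.packbits(bits.astype(np.uint8), axis=1)

n, d, k = 1000, 512, 32                         # hypothetical sizes
features = np.zeros((n, d), dtype=np.float32)   # original d-dimensional features
bits = np.zeros((n, k), dtype=np.uint8)         # one k-bit code per image
packed = pack_codes(bits)                       # shape (n, k // 8)
ratio = features.nbytes / packed.nbytes         # storage reduction factor
```

With these assumed sizes each image shrinks from 2048 bytes of features to 4 bytes of code, a factor of 512.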

Brief description of the drawings

Fig. 1 is the system framework diagram of the present invention.

Fig. 2 is the flow chart of the method of the present invention.

Detailed description of the embodiments

The technical scheme of the invention is described in further detail below with reference to the accompanying drawings:

The present invention introduces semantic consistency when measuring the similarity between images, so the similarity can be measured more accurately. It uses a sampled image set to compute the optimized similarity after semantic consistency is introduced, and uses stochastic gradient descent to reduce the training time, so that the algorithm can be applied effectively to large-scale image datasets. Efficient binary codes are then generated by a hash coding technique, giving better performance in approximate image search.

As shown in Fig. 1, the invention provides a method that, based on semantic consistency, binary-encodes the images of a large-scale image set with a hash coding technique and then finds approximate images by comparing the Hamming distances between the image codes.

The invention mainly comprises two parts: the process of training the transformation matrix and the process of hash encoding.

The process of training the transformation matrix mainly introduces semantic consistency when computing the similarity between the images in the image set and the sampled images, and obtains the transformation matrix needed in the next stage.

The process of hash encoding mainly uses the transformation matrix obtained from training to compute the optimized similarity between each image and the sampled images, builds the similarity matrix from the optimized similarities, and then binary-encodes each image in the image set with a hash coding technique. The Hamming distances between the binary code of a new query image and that of each image are then compared to find the neighbors of the query image.

1. Training the transformation matrix:

The training process builds a model based on the idea of semantic consistency and obtains the transformation matrix needed in the encoding stage; the transformation matrix reflects the intrinsic semantic consistency among images. During training, the invention uses stochastic gradient descent (SGD) to reduce the training time. If the image features are d-dimensional, the trained transformation matrix is a square matrix with d rows and d columns.

The basic idea of the approximate nearest-neighbor search method based on semantic consistency for large-scale image sets is to introduce semantic consistency and compress each image from its initial d dimensions to a k-bit binary code, such that similar input images are mapped to binary codes with small Hamming distance.

Step 1: When computing the optimized similarity, let the image dataset contain n image samples. The relation matrix W between images is defined as an n×n matrix, each element of which is defined as:

$W_{ij} = \exp\left(-\|A(x_i - u_j)\|^2\right)$ (1)

In the above formula, A denotes the transformation matrix, $x_i$ denotes the i-th image in the dataset, and $u_j$ denotes the j-th image in the sampled image set.

Step 2: When training the transformation matrix A, semantic consistency is introduced to build the objective function, which is solved iteratively by stochastic gradient descent (SGD) to obtain the transformation matrix required in the encoding stage. The objective function is:

$O(A) = \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{m} W_{ij}\,\|f_i - f_j\|^2$ (2)

In the above objective function, $f_i$ denotes the class-label vector of the i-th image sample (a class-label vector is a c-dimensional column vector, where c is the number of classes; each element is 1 or 0, indicating that the image does or does not belong to the corresponding class), and $f_j$ denotes the class-label vector of an image in the sampled image set. $\|f_i - f_j\|^2$ is the semantic-consistency term introduced when training the transformation matrix. Combined with the feature similarity between images, it yields more accurate binary codes.
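To make the objective concrete, the following sketch evaluates formula (2) for a given A. The function name `objective` and the argument names (`F` for the label vectors of all images, `F_U` for those of the sampled images) are assumptions introduced for the example.

```python
import numpy as np

def objective(A, X, F, U, F_U):
    """O(A) = 1/2 * sum_i sum_j W_ij * ||f_i - f_j||^2, with W_ij from formula (1)."""
    diff = (X @ A.T)[:, None, :] - (U @ A.T)[None, :, :]      # A (x_i - u_j), (n, m, d)
    W = np.exp(-np.sum(diff ** 2, axis=2))                    # relation values, (n, m)
    label_gap = np.sum((F[:, None, :] - F_U[None, :, :]) ** 2, axis=2)  # (n, m)
    return 0.5 * np.sum(W * label_gap)
```

A pair contributes to the objective only when its labels differ, and the contribution is larger the closer the two images are after transformation by A, so minimizing O(A) penalizes placing semantically different images near each other.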

Step 3: The invention uses stochastic gradient descent in the optimization to reduce the time needed to train the transformation matrix. The initial value of the transformation matrix is $I/\delta$, where I is the d×d identity matrix and δ is the median of the Euclidean distances between the images in the dataset. The transformation matrix A is then updated by stochastic gradient descent with the following iterative update rule:

$A_{t+1} = A_t - \gamma_t \frac{\partial O(A)}{\partial A}$ (3)

where $\gamma_t$ is the optimization step size in each iteration; the step size can take one of the following values: $1\times10^{-5}$, $1\times10^{-4}$, $1\times10^{-3}$, $1\times10^{-2}$. After all image samples in the image dataset have been traversed, the final optimized transformation matrix A is obtained. Training then ends and the transformation matrix A is output.
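A minimal training loop under stated assumptions is sketched below. The text gives the update rule (3) but not the gradient itself; the expression used here, $-\sum_j W_{ij}\|f_i - f_j\|^2 A (x_i - u_j)(x_i - u_j)^T$ for the i-th image, is obtained by differentiating formula (2) with formula (1) substituted and should be read as this example's derivation. The function names, the single random-order pass, and the all-pairs median (feasible only for small toy sets) are likewise assumptions.

```python
import numpy as np

def median_pairwise_distance(X):
    """Median Euclidean distance over all image pairs (toy-sized inputs only)."""
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt(np.sum(diff ** 2, axis=2))
    return np.median(dist[np.triu_indices(X.shape[0], k=1)])

def train_transformation(X, F, U, F_U, step=1e-3, seed=0):
    """One stochastic pass over the images, updating A with rule (3)."""
    n, d = X.shape
    A = np.eye(d) / median_pairwise_distance(X)       # initial value I / delta
    for i in np.random.default_rng(seed).permutation(n):
        D = X[i] - U                                   # rows are x_i - u_j, (m, d)
        w = np.exp(-np.sum((D @ A.T) ** 2, axis=1))    # W_ij for this image
        gap = np.sum((F[i] - F_U) ** 2, axis=1)        # ||f_i - f_j||^2
        grad = -A @ (D.T * (w * gap)) @ D              # gradient of the i-th term
        A = A - step * grad                            # A_{t+1} = A_t - gamma_t * grad
    return A
```

The default `step=1e-3` is one of the step sizes listed in the text; any of the other listed values can be passed instead.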

2. Hash encoding process:

As shown in Fig. 2, the hash encoding process mainly uses the transformation matrix obtained in the previous stage to construct the similarity matrix Z, which reflects the optimized similarity between the samples and the sampled image set. Each image in the large-scale image dataset is then hash-encoded. To find the approximate images of a new image in the dataset, the Hamming distances between the binary codes are compared; if the Hamming distance is less than the set threshold r, the two images are considered approximate images.

Step 1: After the transformation matrix A is obtained, formula (1) is used to compute the optimized similarity, with semantic consistency introduced, between every image and the sampled images. This gives the value of each element of the relation matrix W. Let the sampled image set contain m image samples; the similarity matrix Z required by the hash coding technique can then be constructed from the relation matrix. The Z matrix is defined as follows:

$Z_{ij} = \begin{cases} \dfrac{\exp\left(-\|A(x_i - u_j)\|^2\right)}{\sum_{j \in \langle i \rangle} \exp\left(-\|A(x_i - u_j)\|^2\right)}, & \forall j \in \langle i \rangle \\ 0, & \text{otherwise} \end{cases}$ (4)

where $\langle i \rangle$ denotes the sampled image set. That is, the corresponding element of Z is computed only for images that belong to the sampled image set, and the element is 0 otherwise.
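Under the reading that the index set in formula (4) covers all m sampled images, Z is simply the relation matrix with each row normalized to sum to 1. The sketch below follows that reading; the function name `similarity_matrix` is hypothetical.

```python
import numpy as np

def similarity_matrix(X, U, A):
    """Z[i, j] = W_ij / sum over sampled images of W_ij, so every row of Z sums to 1."""
    diff = (X @ A.T)[:, None, :] - (U @ A.T)[None, :, :]
    W = np.exp(-np.sum(diff ** 2, axis=2))         # relation values from formula (1)
    return W / W.sum(axis=1, keepdims=True)        # row-wise normalization
```

For images very far from every sampled image the row sum can underflow to zero; a production version would need to guard that division, which this sketch does not do.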

Step 2: Let the number of images in the sampled image set be m, and construct an m×m matrix M defined as follows:

$M = \Lambda^{-1/2} Z^{T} Z \Lambda^{-1/2}$ (5)

where $\Lambda = \mathrm{diag}(Z^{T}\mathbf{1})$ is a diagonal matrix. Compute the k largest eigenvalues of M, which form the k×k diagonal matrix $\Sigma = \mathrm{diag}(\delta_1, \ldots, \delta_k) \in \mathbb{R}^{k \times k}$, and the eigenvectors corresponding to these k largest eigenvalues, which form the m×k matrix $V = [v_1, \ldots, v_k] \in \mathbb{R}^{m \times k}$.
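The small eigenproblem of formula (5) can be sketched with NumPy's symmetric eigensolver. The function name `top_k_spectrum` and its return layout are assumptions of this example.

```python
import numpy as np

def top_k_spectrum(Z, k):
    """Build M = Lambda^-1/2 Z^T Z Lambda^-1/2 and return its k largest eigenpairs."""
    lam = Z.sum(axis=0)                              # diagonal of Lambda = diag(Z^T 1)
    inv_sqrt = 1.0 / np.sqrt(lam)
    M = (Z.T @ Z) * np.outer(inv_sqrt, inv_sqrt)     # scales row a and column b of Z^T Z
    eigvals, eigvecs = np.linalg.eigh(M)             # M is symmetric; eigh sorts ascending
    top = np.argsort(eigvals)[::-1][:k]              # indices of the k largest eigenvalues
    return M, eigvals[top], eigvecs[:, top], lam
```

Because M is only m × m, this step is cheap regardless of n. When every row of Z sums to 1, the largest eigenvalue of M equals 1 and its eigenvector produces a constant code column; the text does not say whether that eigenpair should be skipped, and the sketch keeps it as written.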

Step 3: From the matrices obtained above, construct the final encoding matrix Y, defined as follows:

$Y = \sqrt{n}\, Z \Lambda^{-1/2} V \Sigma^{-1/2}$ (6)

Y is an n×k matrix, where n is the number of images in the image set and k is the number of bits of the binary code. Each row of the encoding matrix Y is a coding function: each image is passed through the coding function to obtain a k-dimensional vector y, which is then binarized with sgn(y). This gives the binary code of each image in the image dataset.
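Formulas (5) and (6) can be combined into one routine that turns Z into binary codes. In the sketch below, the function name `hash_codes`, the intermediate m × k matrix `P`, and the choice to store sgn(y) as bits (1 for positive entries, 0 otherwise) are assumptions made for illustration.

```python
import numpy as np

def hash_codes(Z, k):
    """Compute Y = sqrt(n) Z Lambda^-1/2 V Sigma^-1/2 and binarize it by sign."""
    n = Z.shape[0]
    lam = Z.sum(axis=0)                               # diagonal of Lambda
    inv_sqrt = 1.0 / np.sqrt(lam)
    M = (Z.T @ Z) * np.outer(inv_sqrt, inv_sqrt)      # formula (5)
    eigvals, eigvecs = np.linalg.eigh(M)
    top = np.argsort(eigvals)[::-1][:k]
    sigma, V = eigvals[top], eigvecs[:, top]          # k largest eigenpairs
    P = np.sqrt(n) * (inv_sqrt[:, None] * V) / np.sqrt(sigma)   # m x k factor
    Y = Z @ P                                         # formula (6), n x k real values
    return (Y > 0).astype(np.uint8), Y, P
```

By construction the columns of Y are mutually orthogonal with squared norm n, which is the usual reason for the $\Sigma^{-1/2}$ scaling; the routine requires the selected eigenvalues to be strictly positive.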

Step 4: To search for the approximate images of a new query image, its binary code is likewise computed with the coding function. The code of the query image is then compared, by Hamming distance, with the codes of all images in the image dataset. Define a Hamming-distance threshold r (r can take one of the values 1, 2, 3, 4); if the Hamming distance between the query image and some image is less than r, that image is considered an approximate image of the query image. Traversing the image dataset finds all approximate images of the query image.
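The text does not spell out the coding function applied to a new query. One plausible reading, sketched here as an assumption, is to build the query's similarity row to the sampled images exactly as in formula (4) and multiply it by the m × k factor $\sqrt{n}\,\Lambda^{-1/2} V \Sigma^{-1/2}$ from formula (6), so that a query identical to a database image receives that image's code. The name `encode_query` and the argument `P` (that precomputed factor) are hypothetical.

```python
import numpy as np

def encode_query(q, U, A, P):
    """Encode one query vector q using sampled images U, transformation A, factor P."""
    d2 = np.sum(((q - U) @ A.T) ** 2, axis=1)   # ||A (q - u_j)||^2 for each sample
    w = np.exp(-d2)
    z = w / w.sum()                             # similarity row, as in formula (4)
    y = z @ P                                   # k real-valued code entries
    return (y > 0).astype(np.uint8), y
```

Encoding a query this way costs O(m·d + m·k) and never touches the n database images, which is what makes the subsequent Hamming comparison the only per-image work.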

The overall flow of the method of the invention is as follows:

Step 1: Input the image-dataset sample matrix X (X is an n×d matrix, where n is the number of images, which can be very large, and d is the dimension of the image features) and the corresponding semantic class-label matrix Y (Y is an n×c matrix, where n is the number of samples and c is the number of class labels).

Step 2: Randomly select a portion of the images from the image set as the sampled image set. The purpose of the sampled image set is that computing similarities only between each image and the sampled images greatly reduces the computation time and improves the efficiency of the algorithm.

Step 3: For each image in the image dataset, introduce semantic consistency to build the objective function O(A), where A (a d×d matrix, d being the dimension of the image features) is the transformation matrix needed in the encoding stage. The objective is solved iteratively by stochastic gradient descent; once it converges, the optimized transformation matrix A is obtained.

Step 4: For each image sample x, use the transformation matrix A to compute the similarity between x and the sampled images, which gives the optimized similarity with semantic consistency introduced. Hashing is then used for encoding, compressing each image from its original d-dimensional features to a k-bit binary code.

Step 5: For a new query image q, find its approximate images. First use the transformation matrix A trained in step 3 to compute the similarity between q and the sampled images, which gives the optimized similarity with semantic consistency introduced. Then compute the binary code of the query image with the coding function, and compare its Hamming distance to the binary code of each image in the image dataset. If the Hamming distance is less than the set threshold r, the two images are considered approximate images.
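The saving that step 2 attributes to sampling can be quantified by counting similarity evaluations. The sizes below (one million images, 300 sampled images) are illustrative assumptions, not figures from the text, and `similarity_evaluations` is a hypothetical helper.

```python
def similarity_evaluations(n, m=None):
    """Count similarity values: n*n for all pairs, n*m against m sampled images."""
    return n * n if m is None else n * m

n, m = 1_000_000, 300                      # illustrative sizes, not from the text
full = similarity_evaluations(n)           # every image against every image
sampled = similarity_evaluations(n, m)     # every image against the sampled set
```

The reduction factor is n / m, so under these assumed sizes the sampled scheme evaluates roughly 3,333 times fewer similarities.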

By adopting the above embodiment, the present invention solves the following problems compared with the prior art:

(1) Poor performance caused by not introducing semantic consistency in the training of traditional approximate search algorithms: many algorithms traditionally used for image neighbor search are too one-sided when finding the neighbors of a query image, because they do not consider the semantic information the images may carry, so they perform poorly in practical approximate image search. The present invention introduces semantic consistency when measuring image similarity, so the similarity between images is measured more accurately and the algorithm can be applied effectively to practical approximate image search.

(2) Slow similarity computation on large-scale image datasets, solved by computing the optimized similarity against a sampled image set: in a large-scale image dataset, the traditional approach of computing the similarity between every pair of images incurs a very large time overhead and is infeasible in practice. The present invention randomly selects a very small portion of the images from the massive image set as the sampled image set and computes only the optimized similarity between each image and the sampled image set. This greatly reduces the time overhead and improves efficiency.

(3) Slow convergence of the objective function, solved by stochastic gradient descent: the original gradient descent algorithm is also called batch gradient descent. It minimizes the loss function over all training data, so that the final solution is the global optimum, that is, the parameters that minimize the loss function value. However, every iteration of batch gradient descent uses all the data in the training set, so it is very slow when the dataset contains a very large number of images. Stochastic gradient descent uses only one data sample per update, so each update is fast. This speed advantage is especially pronounced for large-scale image datasets. Moreover, for the target loss function, stochastic gradient descent can reach convergence without traversing the entire dataset. The present invention replaces batch gradient descent with stochastic gradient descent for iteratively solving the objective function, which solves the problem of slow convergence.
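The cost difference described in item (3) shows up directly in code: a stochastic update touches one image and the m sampled images, while a batch update sums the same quantity over all n images. The gradient expression is this example's own derivation from formulas (1) and (2), and both function names are hypothetical.

```python
import numpy as np

def sample_gradient(A, x_i, f_i, U, F_U):
    """Gradient of the i-th term of O(A); uses one image and the m sampled images."""
    D = x_i - U                                     # rows are x_i - u_j
    w = np.exp(-np.sum((D @ A.T) ** 2, axis=1))     # W_ij for this image
    gap = np.sum((f_i - F_U) ** 2, axis=1)          # ||f_i - f_j||^2
    return -A @ (D.T * (w * gap)) @ D

def batch_gradient(A, X, F, U, F_U):
    """Full gradient of O(A); a single update requires a pass over all n images."""
    return sum(sample_gradient(A, X[i], F[i], U, F_U) for i in range(X.shape[0]))
```

One batch step therefore costs n times as much as one stochastic step, which is the overhead the stochastic variant avoids on large image sets.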

In summary, the present invention uses a transformation matrix that reflects the intrinsic semantic consistency of the images to compute the optimized similarity between images. To improve search efficiency, a portion of the images is randomly selected from the large-scale image set as the sampled image set for measuring the similarity between images, and stochastic gradient descent is used to reduce the training time of the transformation matrix. After the similarity matrix used for encoding is obtained from the optimized similarities, the original images are mapped to k-bit binary codes by a hash coding technique. When searching for the neighbors of a new query image, the binary code of the query image is first obtained with the coding function of the model, and the Hamming distances between this code and the codes of all images in the image set are then compared. When the Hamming distance between an image and the query image is less than the given threshold, that image is considered an approximate image of the query image.

The above are only some embodiments of the present invention. It should be noted that those of ordinary skill in the art can make improvements and modifications without departing from the principles of the invention, and such improvements and modifications shall also fall within the scope of protection of the invention.

Claims

1. An approximate nearest-neighbor search method based on semantic consistency for large-scale image sets, characterized in that the method comprises the following steps:

Step 1: Input the image-set sample matrix X and the corresponding semantic class-label matrix Y, where X is an n×d matrix, Y is an n×c matrix, n is the number of image samples, d is the dimension of the image features, and c is the number of class labels;

Step 2: Randomly select a portion of the images from the image set as the sampled image set;

Step 3: Define the relation matrix W between the images in the image set and the sampled image set, build the objective function by combining the relation matrix with semantic consistency, and solve it iteratively by stochastic gradient descent; once the objective converges, the optimized transformation matrix A is obtained;

Step 4: For each image sample x, substitute the transformation matrix A into the relation matrix defined in step 3 to obtain the value of each element of the relation matrix; construct the similarity matrix Z from the relation matrix and obtain the encoding matrix from the similarity matrix; hash-encode each image in the large-scale image dataset using the encoding matrix, compressing each image from its original d-dimensional features to a k-bit binary code;

Step 5: For a new query image q, compute its binary code with the encoding matrix and compare its Hamming distance to the binary code of each image in the image dataset; if the Hamming distance is less than the set threshold r, the two images are considered approximate images.

2. The approximate nearest-neighbor search method according to claim 1, characterized in that the transformation matrix A is computed as follows:

Step (1): Define the relation matrix W between images as an n×n matrix, each element of which is defined as:

$W_{ij} = \exp\left(-\|A(x_i - u_j)\|^2\right)$ (1)

In the above formula, A denotes the transformation matrix, $x_i$ denotes the i-th image in the image set, and $u_j$ denotes the j-th image in the sampled image set;

Step (2): The objective function is:

$O(A) = \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{m} W_{ij}\,\|f_i - f_j\|^2$ (2)

where $f_i$ denotes the class-label vector of the i-th image sample; a class-label vector is a c-dimensional column vector whose elements are 1 or 0, indicating that the image does or does not belong to the corresponding class; $f_j$ denotes the class-label vector of an image in the sampled image set; and $\|f_i - f_j\|^2$ is the semantic-consistency term introduced when training the transformation matrix;

Step (3): Update the transformation matrix A by stochastic gradient descent, with the following iterative update rule:

$A_{t+1} = A_t - \gamma_t \frac{\partial O(A)}{\partial A}$ (3)

where $\gamma_t$ is the optimization step size in each iteration; the initial value of the transformation matrix is $I/\delta$, where I is the d×d identity matrix and δ is the median of the Euclidean distances between the images in the image set;

Step (4): After all image samples in the image set have been traversed, the final optimized transformation matrix A is obtained.

3. The approximate nearest-neighbor search method according to claim 2, characterized in that the step size $\gamma_t$ takes one of the following values: $1\times10^{-5}$, $1\times10^{-4}$, $1\times10^{-3}$, or $1\times10^{-2}$.

4. The approximate nearest-neighbor search method according to claim 2, characterized in that step 4 is specifically as follows:

Step a: After the transformation matrix A is obtained, formula (1) is used to compute the optimized similarity, with semantic consistency introduced, between every image and the sampled images, which gives the value of each element of the relation matrix W. Let the sampled image set contain m image samples; the similarity matrix Z is then constructed from the relation matrix.

The Z matrix is defined as follows:

$Z_{ij} = \begin{cases} \dfrac{\exp\left(-\|A(x_i - u_j)\|^2\right)}{\sum_{j \in \langle i \rangle} \exp\left(-\|A(x_i - u_j)\|^2\right)}, & \forall j \in \langle i \rangle \\ 0, & \text{otherwise} \end{cases}$ (4)

where $\langle i \rangle$ denotes the sampled image set; that is, the corresponding element of Z is computed only for images that belong to the sampled image set, and the element is 0 otherwise;

Step b: Let the number of images in the sampled image set be m, and construct an m×m matrix M defined as follows:

$M = \Lambda^{-1/2} Z^{T} Z \Lambda^{-1/2}$ (5)

where $\Lambda = \mathrm{diag}(Z^{T}\mathbf{1})$ is a diagonal matrix. Compute the k largest eigenvalues of M, which form the k×k diagonal matrix $\Sigma = \mathrm{diag}(\delta_1, \ldots, \delta_k) \in \mathbb{R}^{k \times k}$, and the eigenvectors corresponding to these k largest eigenvalues, which form the m×k matrix $V = [v_1, \ldots, v_k] \in \mathbb{R}^{m \times k}$;

Step c: From the matrices obtained above, construct the final encoding matrix Y, defined as follows:

$Y = \sqrt{n}\, Z \Lambda^{-1/2} V \Sigma^{-1/2}$ (6)

Y is an n×k matrix, where n is the number of images in the image set and k is the number of bits of the binary code. Each row of the encoding matrix Y is a coding function: each image is passed through the coding function to obtain a k-dimensional vector y, which is then binarized with sgn(y) to give the binary code of each image in the image set.

5. The approximate nearest-neighbor search method according to claim 1, characterized in that r takes one of the following values: 1, 2, 3, or 4.