CN102789489B

CN102789489B - Image retrieval method and system based on hand-held terminal

Info

Publication number: CN102789489B
Application number: CN201210228711.5A
Authority: CN
Inventors: 杨震群
Original assignee: Individual
Current assignee: Individual
Priority date: 2012-07-04
Filing date: 2012-07-04
Publication date: 2014-08-27
Anticipated expiration: 2032-07-04
Also published as: CN102789489A

Abstract

The invention provides an image retrieval system. The system lets a user photograph an object or scene of interest with a hand-held terminal. After the picture is uploaded to a server, the system matches it against the existing pictures in a database to identify the object or scene photographed, thereby determining the user's point of interest or photographing position, and then returns the related information to the user's hand-held terminal. The picture matching algorithm developed by the invention is robust to changes in photographing angle, scale, rotation and illumination; its matching precision exceeds 95.34 percent, which is sufficient for practical application.

Description

Image retrieval method and system based on a handheld terminal

Technical field

The present invention relates to the technical field of image retrieval and provides an image retrieval method and system based on a handheld terminal.

Background technology

Current mainstream search engines are all text-based, but in many cases ordinary users cannot, or cannot conveniently, describe their query in words. The present invention lets the user take a photograph with a handheld terminal and express the query as a picture, which is intuitive and convenient and meets a practical everyday need. At present, the more successful similar-image retrieval algorithms are mostly based on local interest points (Local Interesting Points, LIPs). However, because extracting the required LIPs has very high time complexity, existing algorithms are ill-suited to real-time network picture retrieval. The algorithm of the present invention roughly approaches the precision of traditional LIP-based algorithms while having far lower time complexity, and so meets the needs of picture retrieval in a real-time network environment.

Summary of the invention

The object of the present invention is to overcome the deficiencies of the prior art by proposing an image matching method that is robust to changes in shooting angle, scale, rotation and illumination.

To achieve the above object, the present invention adopts the following technical solution:

1. An image retrieval method based on a handheld terminal, characterized by comprising the following steps:

1) calling the camera program interface of the handheld terminal to capture a photograph;

2) calling the network transmission interface of the handheld terminal to send the picture to a server;

3) finally, displaying the result returned by the server in the user interface.

In the above scheme, the processing performed by the server after receiving the picture in step 2) comprises the following steps:

21) receiving the picture uploaded by the client program;

22) decomposing the picture into pasters;

23) mapping the pasters into the paster space;

24) matching against the existing pictures in the database;

25) predicting the user's point of interest or shooting position from the matching result;

26) sending the related information to the client program.

In the above scheme, the paster decomposition method in step 22) comprises the following steps:

31) building an orthogonal paster space model:

A. every picture is modeled as a combination of a group of pasters, and the statistical features computed over all the member pasters of a picture are used to represent the features of that picture;

B. a large number of pictures are collected, and the pasters of all pictures are pooled and subjected to cluster analysis to find a most compact paster set (the reference paster set); by measuring the similarity of every pair of pasters in this set, a similarity matrix R is constructed; spectral decomposition of R yields a group of eigenvectors, which are the vector representations of the pasters in the set, and the space spanned by these eigenvectors is the required paster space;

C. spectral decomposition (Spectral Decomposition) is used to optimize the orthogonality of the space, as follows: first, the cosine similarity of every paster pair (u_i, u_j) in the reference paster set is computed and placed in the matrix R = [r_{ij}]_{m \times m}, where

r_{ij} = \mathrm{Sim}(\vec{u}_{i}, \vec{u}_{j}) = \cos(\vec{u}_{i}, \vec{u}_{j}) = \frac{\vec{u}_{i} \cdot \vec{u}_{j}}{|\vec{u}_{i}| \, |\vec{u}_{j}|}

and \vec{u}_{i} and \vec{u}_{j} are the detail feature vectors corresponding to pasters u_i and u_j; then R is spectrally decomposed:

R = V \Lambda V^{T} = (V \Lambda^{1/2} V^{T})(V \Lambda^{1/2} V^{T}) = C^{T} C

where Λ is the diagonal matrix containing the eigenvalues of R, V is the matrix of the corresponding eigenvectors, and the matrix C contains the vectors corresponding to all the pasters in the reference paster set. The C thus obtained in effect contains the basis vectors of a linear space, and the space spanned by all these basis vectors is the required orthogonal paster space.

32) transformation-independent feature representation: on the basis of the paster space of 31), any non-reference paster is expressed in the paster space as:

C^{T} \vec{u} = R_{u}

\vec{u} = (C^{T})^{-1} R_{u}

where R_u is an m-dimensional vector expressing the relation of paster u to all the reference pasters. Any image T can then be expressed in the paster space as the vector sum of the vectors that all of its pasters form in the paster space:

T = \sum_{k=1}^{n} \vec{u}_{k}

33) adaptive matching algorithm: pictures whose similarity to the source picture exceeds a certain threshold are treated as candidate pictures; a hidden Markov model (Hidden Markov Model) is used to model the spatial relations between pasters, so that the patterns contained in the picture are recognized automatically; these patterns are then searched for in the set of all candidate pictures to find the final result.
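The decomposition in step 31) C can be checked by hand on the smallest non-trivial case. The following worked example is illustrative only and is not part of the original disclosure: it assumes m = 2 reference pasters whose cosine similarity is r, with 0 ≤ r < 1, and verifies that C = V Λ^{1/2} V^T satisfies C^T C = R.

```latex
R = \begin{pmatrix} 1 & r \\ r & 1 \end{pmatrix},
\qquad
V = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix},
\qquad
\Lambda = \begin{pmatrix} 1 + r & 0 \\ 0 & 1 - r \end{pmatrix}

C = V \Lambda^{1/2} V^{T}
  = \frac{1}{2} \begin{pmatrix} a + b & a - b \\ a - b & a + b \end{pmatrix},
\qquad a = \sqrt{1 + r}, \quad b = \sqrt{1 - r}

C^{T} C
  = \frac{1}{4} \begin{pmatrix} (a+b)^{2} + (a-b)^{2} & 2(a+b)(a-b) \\ 2(a+b)(a-b) & (a+b)^{2} + (a-b)^{2} \end{pmatrix}
  = \frac{1}{2} \begin{pmatrix} a^{2} + b^{2} & a^{2} - b^{2} \\ a^{2} - b^{2} & a^{2} + b^{2} \end{pmatrix}
  = \begin{pmatrix} 1 & r \\ r & 1 \end{pmatrix} = R
```

Because R is a matrix of cosine similarities between feature vectors, it is symmetric and positive semidefinite, so Λ^{1/2} is real. In this example det C = ab = \sqrt{1 - r^{2}}, so the inverse required in step 32) exists only when r < 1, that is, when the two reference pasters do not point in the same direction.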

Compared with traditional image matching techniques, the model proposed by the present invention has the following advantages:

One, balance between accuracy and running time: compared with global-feature methods (e.g. color moments extracted on a grid), the feature representation proposed by the present invention has a degree of resistance to transformations and is therefore more precise; compared with traditional local-feature methods (e.g. SIFT, SURF), the proposed algorithm has lower time complexity and faster detection, and is better suited to large-scale network computation.

Two, extensibility: the vectors in the paster space built by the present invention can be computed with one another, which leaves room for further mining algorithms. In addition, the construction of the space takes orthogonality into account (that is, the basis vectors are not linearly dependent), which ensures that computations in the space conform more closely to the mathematical model of a linear space and are therefore more accurate.

Three, the core of the present invention is an image matching model based on the paster space. The model constructs a paster space for image feature representation, improving on the traditional approach of comparing all the pasters (or feature points) of two target pictures one by one, and thereby reducing the time complexity of the copy-image detection algorithm. At the same time, decomposing pictures into pasters breaks the strong spatial-position constraint between local feature points in a picture and reduces the effect of the various copy-picture editing operations on the feature representation, thereby improving the robustness of the algorithm.

Brief description of the drawings

Fig. 1 is a block diagram of the system of the present invention;

Fig. 2 is a schematic diagram of the paster decomposition of an image according to the present invention;

Fig. 3 is a schematic flow diagram of the system of the present invention.

Embodiment

The embodiment of the present invention consists of two parts: a client algorithm and a server-side algorithm.

The technical scheme of the present invention is as follows:

First, a paster space model capable of representing pictures is built, such that the representation of a picture in this space is affected as little as possible by the various picture transformations; on this basis, an algorithm is created that satisfies the speed and precision requirements of picture matching over a network (mainly the mobile phone network). The work is accordingly divided into three parts: the orthogonal paster space model, the transformation-independent feature representation, and the adaptive matching algorithm. Each is briefly introduced below:

Orthogonal paster space model: this model treats every picture as a combination of a group of pasters (small image blocks), as shown in Fig. 2, so the statistical features computed over all the member pasters of a picture can be used to represent the features of that picture.
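The decomposition of a picture into pasters can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of the invention: the description does not fix how pasters are cut, so a regular grid of non-overlapping blocks is assumed; the picture is assumed to be a grayscale list of pixel rows; and the per-paster statistic is taken to be the first two moments (mean and standard deviation) of the pixel values. The function names `split_into_pasters` and `moment_features` are hypothetical.

```python
from statistics import mean, pstdev


def split_into_pasters(image, size):
    """Cut a 2-D grayscale image (list of rows) into size x size pasters.

    Assumption: a regular, non-overlapping grid; incomplete blocks at the
    right and bottom edges are dropped.
    """
    rows, cols = len(image), len(image[0])
    pasters = []
    for top in range(0, rows - size + 1, size):
        for left in range(0, cols - size + 1, size):
            block = [row[left:left + size] for row in image[top:top + size]]
            pasters.append(block)
    return pasters


def moment_features(paster):
    """Mean and (population) standard deviation of the pixels in one paster."""
    pixels = [p for row in paster for p in row]
    return [mean(pixels), pstdev(pixels)]


# A toy 4 x 4 picture, cut into four 2 x 2 pasters.
image = [
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [5, 7, 20, 40],
    [5, 7, 20, 40],
]
pasters = split_into_pasters(image, 2)
features = [moment_features(p) for p in pasters]
```

Each entry of `features` is the detail feature vector of one paster; richer descriptors would take the place of the two moments in a real system.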

By collecting a large number of pictures, pooling the pasters of all of them and performing cluster analysis, a most compact paster set (the reference paster set) can be found. This set contains the most representative group of pasters and reflects the various detail features a picture may carry. By measuring the similarity of every pair of pasters in the set, a similarity matrix R can be constructed. Spectral decomposition of R yields a group of eigenvectors, which are the vector representations of the pasters in the set. The space spanned by these eigenvectors is the required paster space. Because the set is the most representative group of pasters, the detail category carried by any paster, even one that is not in the set, can be assessed by measuring its similarity to all the pasters in the set. For example, if a paster X is most similar to the two pasters u_i and u_j in the set, the details carried by X are a combination of the detail features carried by u_i and u_j. The vector x formed by the similarities between a picture and all the pasters in the set therefore expresses the statistical distribution of the details the picture carries.
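The construction of the similarity matrix R, and of the similarity vector of a paster that is not in the reference set, can be sketched as follows. This is only an illustration: cosine similarity is the measure the description names, but the two-dimensional feature vectors, the function names and the handling of zero-length vectors are assumptions made for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity of two feature vectors (0.0 if either has zero length)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: a featureless paster is similar to nothing
    return dot / (norm_u * norm_v)


def similarity_matrix(reference):
    """R = [r_ij], the cosine similarity of every pair of reference pasters."""
    return [[cosine(u, v) for v in reference] for u in reference]


def similarity_vector(x, reference):
    """Similarities of an arbitrary paster x to all reference pasters."""
    return [cosine(x, u) for u in reference]


# Two hypothetical reference pasters and one paster X outside the set.
reference = [[1.0, 0.0], [0.6, 0.8]]
R = similarity_matrix(reference)
x = [0.8, 0.4]
r_x = similarity_vector(x, reference)
```

In this toy case X is equally similar to both reference pasters, which corresponds to the situation described above in which the details of X combine those of u_i and u_j.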

However, because the reference pasters (all the pasters in the reference paster set) also have some similarity to one another, a space built directly from the pasters and their similarities is, in theory, non-orthogonal. This non-orthogonality can affect the accuracy of the final result. Spectral decomposition (Spectral Decomposition) can therefore be used to optimize the orthogonality of the space. The process is as follows: first, the cosine similarity of every paster pair (u_i, u_j) in the reference paster set is computed and placed in the matrix R = [r_{ij}]_{m \times m}, where

r_{ij} = \mathrm{Sim}(\vec{u}_{i}, \vec{u}_{j}) = \cos(\vec{u}_{i}, \vec{u}_{j}) = \frac{\vec{u}_{i} \cdot \vec{u}_{j}}{|\vec{u}_{i}| \, |\vec{u}_{j}|} \qquad (1)

and \vec{u}_{i} and \vec{u}_{j} are the detail feature vectors corresponding to pasters u_i and u_j (color moments, wavelet texture and the like can be used). Then the matrix R is spectrally decomposed:

R = V \Lambda V^{T} = (V \Lambda^{1/2} V^{T})(V \Lambda^{1/2} V^{T}) = C^{T} C \qquad (2)

where C^{T} is the transpose of the matrix C, Λ is the diagonal matrix containing the eigenvalues of R, and V is the matrix of the corresponding eigenvectors. The matrix C contains the vectors corresponding to all the pasters in the reference paster set; in effect, C contains the basis vectors of a linear space. The space spanned by these basis vectors is the orthogonal paster space we require. Its orthogonality ensures that the representation of pasters in the space has global consistency (Global Consistency). Finally, any non-reference paster can be expressed in the paster space as:

C^{T} \vec{u} = R_{u}

\vec{u} = (C^{T})^{-1} R_{u} \qquad (3)

where R_u is an m-dimensional vector expressing the relation of paster u to all the reference pasters (the cosine similarity of formula (1)). Any image T can be expressed in the paster space as the vector sum of the vectors that all of its pasters form in the paster space:

T = \sum_{k=1}^{n} \vec{u}_{k} \qquad (4)
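Formulas (3) and (4) can be exercised on a toy case. The sketch below is an assumption-laden illustration rather than the implementation of the invention: it takes only m = 2 reference pasters with cosine similarity r, for which C = V Λ^{1/2} V^T has the closed form written in `paster_space_matrix`, and it solves C^T u = R_u with Cramer's rule, which only works for this 2 x 2 case. All function names are hypothetical.

```python
import math


def paster_space_matrix(r):
    """C = V L^(1/2) V^T for R = [[1, r], [r, 1]] (closed form, m = 2 only)."""
    a, b = math.sqrt(1 + r), math.sqrt(1 - r)
    return [[(a + b) / 2, (a - b) / 2],
            [(a - b) / 2, (a + b) / 2]]


def map_paster(C, r_u):
    """Solve C^T u = R_u for u, as in formula (3), by Cramer's rule (2 x 2)."""
    ct = [[C[0][0], C[1][0]], [C[0][1], C[1][1]]]  # transpose of C
    det = ct[0][0] * ct[1][1] - ct[0][1] * ct[1][0]
    return [(r_u[0] * ct[1][1] - ct[0][1] * r_u[1]) / det,
            (ct[0][0] * r_u[1] - ct[1][0] * r_u[0]) / det]


def image_vector(C, similarity_vectors):
    """Formula (4): sum the paster-space vectors of all pasters of one image."""
    mapped = [map_paster(C, r_u) for r_u in similarity_vectors]
    return [sum(component) for component in zip(*mapped)]


C = paster_space_matrix(0.6)
# The similarity vector of the first reference paster is the first column of R.
first_reference = map_paster(C, [1.0, 0.6])
# A toy image made of exactly the two reference pasters.
T = image_vector(C, [[1.0, 0.6], [0.6, 1.0]])
```

Mapping a reference paster returns the corresponding column of C, which is consistent with the statement that C contains the vectors of the reference pasters. For larger m, a general eigendecomposition and linear solver would replace the closed form and Cramer's rule.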

With the above model, any image can be mapped into the paster space and expressed as a single vector. The similarity between images can therefore be judged simply by computing the similarity (or distance) between their corresponding vectors. This way of computing similarity avoids pairwise matching of multi-point features, and the time complexity of the algorithm drops from O(N²) to O(cN) (where c is a constant representing the dimension of the paster space).

Transformation-independent feature representation: the paster decomposition that the paster space model applies to an image breaks up the strong positional constraints among the local features of the picture. Because local details are affected very little by the various editing operations, the vector representation obtained in the paster space is more robust and copes more effectively with rotation, translation and cropping of the image and with editing operations such as small-scale zooming.

Adaptive image matching algorithm: whether two pictures contain the same object or scene can be judged simply from their similarity. However, because the positional constraints of the local features have been broken up, the patterns (Pattern) inherent in some objects are lost as well. In this work, therefore, similarity is used only as a reference value for selecting candidate copies; that is, pictures whose similarity to the source picture exceeds a certain threshold are treated as candidate pictures. In the subsequent matching algorithm, a hidden Markov model is used to model the spatial relations between pasters, so that the patterns contained in a picture can be recognized automatically. These patterns are then searched for in the set of all candidate pictures to find the final result. The discovery of these patterns requires no prior training and is adaptive.
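The candidate-selection stage described above can be sketched as follows. Only the thresholding step is shown; the subsequent hidden Markov model stage is not implemented here. The picture names, the two-dimensional image vectors and the threshold value 0.9 are assumptions chosen for illustration, since the description does not state a threshold.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero image vectors in the paster space."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def select_candidates(query, database, threshold):
    """Return (name, similarity) pairs whose similarity exceeds the threshold,
    ordered from most to least similar."""
    scored = [(name, cosine(query, vector)) for name, vector in database.items()]
    kept = [(name, score) for name, score in scored if score > threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical database of image vectors already mapped into the paster space.
database = {
    "tower": [1.0, 0.2],
    "bridge": [0.1, 1.0],
    "tower_night": [0.9, 0.4],
}
candidates = select_candidates([1.0, 0.3], database, threshold=0.9)
candidate_names = [name for name, _ in candidates]
```

Each query costs one vector comparison per database picture, which is the reason the description gives for the reduced time complexity; the pattern search would then run only on `candidates`.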

The operational flow of the image matching system built on the paster space model is shown in Fig. 3. The system is divided into two parts: offline training and online matching. In the offline part, a large number of pictures are first collected as a training set; pasters are extracted from the training set, all of them are subjected to cluster analysis, and the paster space is then constructed by the method described above. In the online matching part, when a test picture arrives, its pasters are first extracted, and the trained paster space is then used to produce the paster-level representation and the picture-level vector representation of the test picture. Similarity is then computed from the vector representations, and the result serves as the basis for judging copy pictures (pictures containing the same object or scene).
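The cluster-analysis step of the offline part can be sketched as follows. The description does not name a clustering algorithm, so k-means (Lloyd's iterations) is substituted here purely for illustration; taking the cluster centers as the features of the reference pasters, initializing with the first k points, and the toy feature values are likewise assumptions.

```python
import math


def nearest(point, centers):
    """Index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))


def k_means(points, k, iterations=10):
    """Plain Lloyd iterations; the first k points are the initial centers."""
    centers = [list(p) for p in points[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        for i, group in enumerate(groups):
            if group:  # keep the old center if a cluster becomes empty
                centers[i] = [sum(c) / len(group) for c in zip(*group)]
    return centers


# Hypothetical paster feature vectors pooled from a training set.
paster_features = [
    [0.0, 0.0], [10.0, 10.0], [0.0, 1.0],
    [10.0, 11.0], [1.0, 0.0], [11.0, 10.0],
]
reference_features = k_means(paster_features, k=2)
```

The resulting `reference_features` would then feed the construction of the similarity matrix R and the paster space in the offline part, while the online part reuses the trained space for every incoming test picture.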

Claims

1. An image retrieval method based on a handheld terminal, characterized by comprising the following steps:

3) finally, displaying the result returned by the server in the user interface;

wherein the processing performed by the server after receiving the picture in step 2) comprises the following steps:

21) receiving the picture uploaded by the client program;

22) decomposing the picture into pasters;

23) mapping the pasters into the paster space;

24) matching against the existing pictures in the database;

26) sending the related information to the client program;

the paster decomposition method in step 22) comprises the following steps:

31) building an orthogonal paster space model:

B. a large number of pictures are collected, and the pasters of all pictures are pooled and subjected to cluster analysis to find a most compact paster set (the reference paster set); by measuring the similarity of every pair of pasters in this set, a similarity matrix R is constructed; spectral decomposition of R yields a group of eigenvectors, which are the vector representations of the pasters in the set, and the space spanned by these eigenvectors is the required paster space;

C. spectral decomposition (Spectral Decomposition) is used to optimize the orthogonality of the space, as follows:

first, the cosine similarity of every paster pair (u_i, u_j) in the reference paster set is computed and placed in the matrix R = [r_{ij}]_{m \times m}, where \vec{u}_{i} and \vec{u}_{j} are the detail feature vectors corresponding to pasters u_i and u_j; then R is spectrally decomposed:

R = V \Lambda V^{T} = (V \Lambda^{1/2} V^{T})(V \Lambda^{1/2} V^{T}) = C^{T} C

where Λ is the diagonal matrix containing the eigenvalues of R, V is the matrix of the corresponding eigenvectors, and the matrix C contains the vectors corresponding to all the pasters in the reference paster set; the C thus obtained in effect contains the basis vectors of a linear space, and the space spanned by all these basis vectors is the required orthogonal paster space;

where R_u is an m-dimensional vector expressing the relation of paster u to all the reference pasters; any image T can then be expressed in the paster space as the vector sum of the vectors that all of its pasters form in the paster space,