CN104331513A

CN104331513A - High-efficiency prediction method for image retrieval performance

Info

Publication number: CN104331513A
Application number: CN201410685896.1A
Authority: CN
Inventors: 贾强槐; 田新梅
Original assignee: University of Science and Technology of China USTC
Current assignee: University of Science and Technology of China USTC
Priority date: 2014-11-24
Filing date: 2014-11-24
Publication date: 2015-02-04

Abstract

The invention discloses an efficient method for predicting image retrieval performance. The method comprises: adaptively selecting sample images from the retrieval results based on pseudo-relevance feedback; determining, through a voting decision mechanism, the votes received by each image according to the visual similarity between that image and the sample images, and thereby estimating the relevance probability between each image in the retrieval results and the input query; and combining the relevance probabilities of all images in the retrieval results to calculate the average precision of the retrieval, thereby predicting image retrieval performance efficiently. The disclosed method has low complexity, is easy to implement, is robust to noise, and predicts image retrieval performance accurately.

Description

An efficient method for predicting image retrieval performance

Technical field

The present invention relates to the field of image retrieval technology, and in particular to an efficient method for predicting image retrieval performance.

Background art

Given a query text or query image, image retrieval searches an image database for images relevant to the user's query according to content information or specified query criteria, ranks the images in descending order of their relevance to the query, and returns the ranked results to the user. Research on image retrieval has made great progress: many retrieval models have been proposed and have been continually verified and improved in practice. Most retrieval systems nevertheless suffer from a serious robustness problem. For some queries the retrieval results are of high quality, while for others the results contain many images irrelevant to the query, so retrieval quality often differs greatly from query to query. Moreover, even retrieval systems with good average performance return unsatisfactory results for some queries. It is therefore desirable for a retrieval system to identify automatically the queries whose results are poor and to handle them accordingly.

Query performance prediction is a frontier problem in information retrieval research. It attempts to assess the quality of the results a retrieval system returns for a given query when no relevance information is available, that is, when the relevance of the retrieval results to the query is unknown. Query performance prediction benefits both users and retrieval systems. For the user, it provides valuable feedback: when the system predicts that the results for a query are poor, it can report this to the user, who can then reformulate the query or interact further with the system to obtain better results. For the retrieval system, ideally, if the system can predict the performance of a query, it can automatically adjust its parameters or algorithms to suit different queries and thereby achieve better retrieval performance.

Compared with the long-standing effort to improve retrieval system performance through better retrieval models, research on query performance prediction for image retrieval is still in its infancy. Early work was based on textual information: it mainly studied the relationship between the text associated with the returned images (surrounding context text, image URLs, and so on) and the input query text, using properties of the text such as specificity, generality, ambiguity, and iconicity. This work has two main shortcomings: 1) it ignores the visual content of the images and relies only on text, yet text cannot adequately describe rich visual information and often contains a great deal of noise; 2) it does not estimate the actual quality of the image search results, but merely classifies queries as easy or difficult.

Most current research predicts query performance from the visual content of the returned images. The main idea is first to represent each image as a "document" composed of visual words and then to compute statistics with text-analysis methods, for example: calculating the divergence between the probability distribution of the language model built from the returned images and that of the language model built from the whole image collection; estimating the spatial consistency among the returned images; studying the visual consistency of the top-ranked images in the returned list; and studying the distribution of visual similarity among the returned images. These studies start from reasonable assumptions and make full use of the visual information in the images to estimate query performance, and they have advanced query performance prediction for image retrieval. However, these methods only give an indicator of whether query performance is high or low; they do not estimate the true value of the retrieval result quality.

Besides the above, current research also includes a method that estimates the true retrieval performance directly. Its general idea is first to estimate the relevance probability between each returned image and the given query and then to use these probabilities to estimate the average precision; the key is how accurately the relevance probabilities can be estimated. Because this method estimates the true value of retrieval quality directly, inaccurate relevance probabilities inevitably degrade the accuracy of the final result.

Summary of the invention

The object of the present invention is to provide an efficient method for predicting image retrieval performance that has low complexity, is easy to implement, is robust to noise, and predicts image retrieval performance relatively accurately.

The object of the invention is achieved through the following technical solution:

An efficient method for predicting image retrieval performance, the method comprising:

adaptively selecting sample images from the retrieval results based on pseudo-relevance feedback;

determining, based on a voting decision mechanism, the number of votes of each image according to the visual similarity between each image in the retrieval results and the sample images, and thereby estimating the relevance probability between each image in the retrieval results and the input query request;

combining the relevance probabilities between all images in the retrieval results and the input query request to calculate the average precision of the image retrieval, thereby achieving efficient prediction of image retrieval performance.

Further, adaptively selecting sample images from the retrieval results based on pseudo-relevance feedback comprises:

re-ranking the retrieval results with an image retrieval re-ranking method, and then adaptively selecting sample images from the re-ranked retrieval results based on pseudo-relevance feedback; the sample images comprise a positive sample image set and a negative sample image set, or only a positive sample image set; the positive sample image set consists of the top K images in the re-ranked retrieval results, and the negative sample image set consists of the bottom K images in the re-ranked retrieval results; the value of K is determined by the coherence score (CoS) method, expressed as:

K^{*} = \arg\max_{K \in [L, M]} CoS(K);

where K^{*} denotes the optimal value of K, and L and M denote the minimum and maximum values of K, respectively; CoS(K) is the proportion of image pairs among the selected positive sample images that satisfy the consistency condition, expressed as:

CoS(K) = \frac{1}{K(K - 1)} \sum_{i, j = 1, \ldots, K; \; i \neq j} \delta(I^{i}, I^{j});

\delta(I^{i}, I^{j}) = \begin{cases} 1, & \text{if } s(I^{i}, I^{j}) > \mu \\ 0, & \text{otherwise} \end{cases};

In the above formulas, s(I^{i}, I^{j}) denotes the visual similarity between images I^{i} and I^{j}, and μ is a preset threshold.

Further, the step of calculating the visual similarity between each image in the retrieval results and the sample images comprises:

representing all images as a set of vectors using the bag-of-visual-words model and the vector space model, which comprises: extracting the SIFT features of each image with Dense SIFT (densely sampled scale-invariant feature transform); then clustering all extracted SIFT features with the K-means clustering algorithm into a codebook of S visual words; quantizing each SIFT feature to its corresponding visual word according to the nearest-neighbor rule; weighting the importance of each visual word in each image with the TF-IDF weighting scheme; and finally representing each image in the vector space model;

The vector representation of image I^{i} is x^{i}, each component of which is calculated as follows:

x_{l}^{i} = tf_{l} \cdot idf_{l}, \quad l = 1, 2, \ldots, S;

where tf_{l} denotes the frequency with which visual word w_{l} occurs in image I^{i}, and idf_{l} denotes the inverse document frequency, which measures the importance of visual word w_{l} across the whole image collection;

The cosine function is used to calculate the visual similarity s(I^{i}, D^{k}) between image I^{i} in the retrieval results and sample image D^{k}, expressed as:

s(I^{i}, D^{k}) = \frac{x^{i} \cdot x_{D}^{k}}{\| x^{i} \| \, \| x_{D}^{k} \|}, \quad k = 1, 2, \ldots, K.

Further, determining, based on the voting decision mechanism, the number of votes of each image according to the similarity between each image in the retrieval results and the sample images comprises:

calculating the visual similarity between each image in the retrieval results and each sample image in the positive sample image set P and in the negative sample image set N; if the result exceeds a threshold, the image receives a corresponding initial vote, expressed as:

vote_{ik}^{+} = \begin{cases} 1, & s(I^{i}, P^{k}) > \mu \\ 0, & \text{otherwise} \end{cases}, \quad i = 1, 2, \ldots, M, \; k = 1, 2, \ldots, K;

vote_{ik}^{-} = \begin{cases} 1, & s(I^{i}, N^{k}) > \mu \\ 0, & \text{otherwise} \end{cases}, \quad i = 1, 2, \ldots, M, \; k = 1, 2, \ldots, K;

where vote_{ik}^{+} and vote_{ik}^{-} denote the initial positive and negative votes that image I^{i} receives after its visual similarity to the k-th image in the positive sample image set P and to the k-th image in the negative sample image set N is calculated, M denotes the number of images in the retrieval results, and μ is a preset threshold;

The final vote of each image is then determined from the sum of its initial votes, expressed as:

r_{i}^{+} = \begin{cases} 1, & \sum_{k = 1}^{K} vote_{ik}^{+} \geq K / 2 \\ 0, & \text{otherwise} \end{cases}, \quad i = 1, 2, \ldots, M;

r_{i}^{-} = \begin{cases} 1, & \sum_{k = 1}^{K} vote_{ik}^{-} \geq K / 2 \\ 0, & \text{otherwise} \end{cases}, \quad i = 1, 2, \ldots, M;

where r_{i}^{+} and r_{i}^{-} denote the final positive vote and the final negative vote of image I^{i}, respectively.

Further, estimating the relevance probability between each image in the retrieval results and the input query request comprises:

if the sample images comprise a positive sample image set and a negative sample image set, the relevance probability p_{i} between image I^{i} and the input query request is calculated as:

p_{i} = \begin{cases} 1, & \text{if } r_{i}^{+} = 1 \text{ and } r_{i}^{-} = 0 \\ 0, & \text{otherwise} \end{cases}, \quad i = 1, 2, \ldots, M;

if the sample images comprise only a positive sample image set, each positive sample image in the set is treated as a classifier, and a regression algorithm is used to combine the classification results of the classifiers to obtain the relevance probability p_{i} between image I^{i} and the input query request:

p_{i} = \frac{\exp\left(\sum_{k = 1}^{K} f_{k}(I^{i}) / K - 0.5\right)}{1 + \exp\left(\sum_{k = 1}^{K} f_{k}(I^{i}) / K - 0.5\right)}, \quad i = 1, 2, \ldots, M;

where f_{k}(I^{i}) denotes the classification result of the k-th classifier for image I^{i}: f_{k}(I^{i}) = 1 when the image receives a positive vote, and f_{k}(I^{i}) = 0 otherwise.

Further, combining the relevance probabilities between all images in the retrieval results and the input query request to calculate the average precision of the image retrieval comprises:

where the sample images comprise a positive sample image set and a negative sample image set, the average precision of the image retrieval is calculated as:

EAP@T = \frac{1}{Z_{T}} \sum_{i = 1}^{T} rel(i) \cdot \frac{1}{i} \sum_{j = 1}^{i} rel(j);

where T indicates that the top T images of the re-ranked retrieval results are considered; rel(i) is a binary function, with rel(i) = 1 if the image ranked i-th is relevant to the input query request and rel(i) = 0 otherwise; i ∈ [1, T] and j ∈ [1, i]; and Z_{T} is a normalization coefficient;

where the sample images comprise only a positive sample image set, the average precision of the image retrieval is calculated as:

E(AP@T) = E\left[\frac{1}{Z_{T}} \sum_{i = 1}^{T} rel(i) \frac{\sum_{j = 1}^{i} rel(j)}{i}\right];

where E denotes the mathematical expectation.

As can be seen from the above technical solution, the invention selects positive and negative sample images from the retrieval results based on enhanced pseudo-relevance feedback, estimates the relevance probability between each image in the retrieval results and the query request based on a voting decision mechanism, and finally combines all relevance probabilities to calculate the average precision of the retrieval. Compared with existing methods, the method has lower complexity, is easy to implement, is robust to noise, and predicts image retrieval performance relatively accurately.

Brief description of the drawings

To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below are evidently only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flowchart of an efficient method for predicting image retrieval performance provided by an embodiment of the present invention.

Detailed description of the embodiments

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

Embodiment one

Fig. 1 is a flowchart of an efficient method for predicting image retrieval performance provided by an embodiment of the present invention. As shown in Fig. 1, the method mainly comprises the following steps:

Step 11: adaptively select sample images from the retrieval results based on pseudo-relevance feedback.

Sample image selection plays an important role in the present invention and strongly affects the relevance probability estimation step. Pseudo-relevance feedback is a concept from text retrieval: it assumes that documents ranked high in the retrieval result list tend to be relevant to the query, and that documents at the end of the list tend to be irrelevant to it.

In the embodiment of the present invention, pseudo-relevance feedback directly selects the images ranked high or low in the retrieval result list as positive and negative sample images, respectively. The selected images may, however, contain noise, and deciding how many images to select is also difficult. To address these problems, the present invention improves the application of pseudo-relevance feedback in three respects:

1) Re-ranking the initial retrieval results.

In general, the better the image search results, the more reliable the images selected by pseudo-relevance feedback. The initial retrieval results are therefore re-ranked, and pseudo-relevance feedback is then applied to the re-ranked results to select the positive and negative sample images. Any image retrieval re-ranking method may be used in the embodiment of the present invention, for example VisualRank or Bayesian Reranking.

2) Selecting positive and negative sample images.

Pseudo-relevance feedback usually selects positive and negative samples at the same time. In image retrieval, images relevant to the query tend to resemble one another, whereas irrelevant images vary widely. In other words, the number of images irrelevant to the query is huge, the selected negative sample images are only a small part of them, and the negative sample images may themselves contain noise. Further study of the effect of negative sample images left it in doubt whether they should be taken into account in the present invention. Two strategies are therefore proposed: 1) select the images ranked in the top K (top-K) and in the bottom K (bottom-K) of the re-ranked list as the positive sample image set and the negative sample image set, respectively; 2) select only the top-K images as pseudo-positive sample images.
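By way of illustration only, the two selection strategies can be sketched in Python as follows. The function name `select_samples`, its parameters, and the image identifiers are assumptions introduced for this example and do not appear in the original disclosure; the input list is assumed to be already re-ranked.

```python
from typing import List, Sequence, Tuple


def select_samples(reranked: Sequence[str], k: int,
                   use_negatives: bool = True) -> Tuple[List[str], List[str]]:
    """Return (positives, negatives) taken from a re-ranked result list.

    Strategy 1 (use_negatives=True): top-K as positives, bottom-K as negatives.
    Strategy 2 (use_negatives=False): top-K as pseudo-positives only.
    """
    if not 0 < k <= len(reranked):
        raise ValueError("k must lie in [1, len(reranked)]")
    positives = list(reranked[:k])
    negatives = list(reranked[-k:]) if use_negatives else []
    return positives, negatives


# Hypothetical re-ranked list of image identifiers, best first.
ranked = ["img_a", "img_b", "img_c", "img_d", "img_e", "img_f"]
pos, neg = select_samples(ranked, k=2)
pos_only, no_neg = select_samples(ranked, k=2, use_negatives=False)
```

With K = 2, the first call yields the two highest-ranked images as positives and the two lowest-ranked images as negatives; the second call yields the same positives and an empty negative set.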

3) Adaptively selecting the number of sample images.

Because the subsequent steps rely on the selected sample images, the number of sample images matters considerably. When K is too large, the positive sample image set takes in noise (negative samples); when K is too small, the selected positive sample image set cannot adequately characterize the user's query. In other words, a larger K can be assigned to queries whose returned results are good, and a smaller K to queries whose returned results are poor. To this end, the embodiment of the present invention proposes a method for adaptively selecting the number of samples: the CoS (Coherence Score) method is used to select a K value adaptively for each query, formulated as follows:

K^{*} = \arg\max_{K \in [L, M]} CoS(K);

CoS(K) = \frac{1}{K(K - 1)} \sum_{i, j = 1, \ldots, K; \; i \neq j} \delta(I^{i}, I^{j});

\delta(I^{i}, I^{j}) = \begin{cases} 1, & \text{if } s(I^{i}, I^{j}) > \mu \\ 0, & \text{otherwise} \end{cases};

In the above formulas, δ(I^{i}, I^{j}) behaves like an impulse function (an indicator), and s(I^{i}, I^{j}) denotes the visual similarity between images I^{i} and I^{j}; μ is a preset threshold, generally set so that the similarity of 80% of the image pairs in the data set falls below it.
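The adaptive choice of K can be sketched as below. This is a minimal illustration under stated assumptions: the names `coherence_score` and `select_k`, the toy one-dimensional "images", and the toy similarity function are hypothetical, and ties between equal CoS values are broken in favour of the smallest K, which the source does not specify.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def coherence_score(images: Sequence[T], k: int,
                    sim: Callable[[T, T], float], mu: float) -> float:
    """CoS(K): share of ordered pairs (i != j) among the top-k whose similarity exceeds mu."""
    if k < 2:
        raise ValueError("k must be at least 2 so that K(K-1) is non-zero")
    hits = sum(1 for i in range(k) for j in range(k)
               if i != j and sim(images[i], images[j]) > mu)
    return hits / (k * (k - 1))


def select_k(images: Sequence[T], k_min: int, k_max: int,
             sim: Callable[[T, T], float], mu: float) -> int:
    """Return the K in [k_min, k_max] that maximises CoS(K).

    Assumption: on ties, max() keeps the first (smallest) K.
    """
    return max(range(k_min, k_max + 1),
               key=lambda k: coherence_score(images, k, sim, mu))


# Toy data: each "image" is a single number and similarity is 1 - |a - b|.
scores = [0.0, 0.4, 0.2, 0.9, 1.0]
toy_sim = lambda a, b: 1.0 - abs(a - b)
best_k = select_k(scores, 2, 5, toy_sim, mu=0.7)
```

In this toy list the first three items are the most mutually coherent group, so the sketch selects K = 3; adding the fourth and fifth items lowers the share of similar pairs.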

Step 12: based on a voting decision mechanism, determine the number of votes of each image according to the visual similarity between each image in the retrieval results and the sample images, and thereby estimate the relevance probability between each image in the retrieval results and the input query request.

After the sample images have been selected adaptively, the relevance probability between each image in the retrieval results and the input query request must be estimated. In image re-ranking, much existing work treats this estimation as a classification problem: a classifier such as an SVM (support vector machine) is trained on the selected positive and negative samples and then used to classify each returned image as relevant or irrelevant. Unlike these methods, the embodiment of the present invention proposes a simple and efficient voting mechanism and designs two relevance probability estimation methods on top of it.

1) Visual similarity calculation.

The voting mechanism of the embodiment of the present invention votes on the basis of similarity, so the visual similarity between each image in the retrieval results and the sample images must be calculated first. Specifically, the embodiment of the present invention represents all images as a set of vectors using the bag-of-visual-words model and the vector space model, as follows: the SIFT features of each image are extracted with Dense SIFT (densely sampled scale-invariant feature transform), for example with a patch size of 16*16 and a step of 6; K-means clustering is then used to group all extracted SIFT features into a codebook of S visual words (for example, S may be 1000); each SIFT feature is quantized to its corresponding visual word according to the nearest-neighbor rule; the TF-IDF weighting scheme is used to weigh the importance of each visual word in each image; and finally each image is represented in the vector space model:

x_{l}^{i} = tf_{l} \cdot idf_{l}, \quad l = 1, 2, \ldots, S;

where tf_{l} denotes the frequency with which visual word w_{l} occurs in image I^{i}, and idf_{l} denotes the inverse document frequency, which measures the importance of visual word w_{l} across the whole image collection. The whole image collection is the set of retrieval results that the current retrieval system returns for a number of input query requests; a typical data set such as Web353 or MSRA-MM_V1.0 may also be used.
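The quantization and weighting steps might be sketched as follows. The sketch does not implement Dense SIFT or K-means; it assumes descriptors and a codebook are already available. The function names, the use of raw counts for tf, and the formula idf_l = log(N / df_l) are assumptions of this example, since the source does not give the exact tf and idf definitions.

```python
import math
from collections import Counter
from typing import List, Sequence


def quantize(descriptor: Sequence[float],
             codebook: Sequence[Sequence[float]]) -> int:
    """Nearest-neighbor rule: index of the closest codeword (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda c: sum((a - b) ** 2
                                 for a, b in zip(descriptor, codebook[c])))


def tfidf_vectors(word_ids_per_image: Sequence[Sequence[int]],
                  vocab_size: int) -> List[List[float]]:
    """Build x^i with components tf_l * idf_l for every image.

    Assumptions: tf_l is the raw count of visual word l in the image and
    idf_l = log(N / df_l), with idf_l = 0 for words that never occur.
    """
    n_images = len(word_ids_per_image)
    df = [0] * vocab_size
    for words in word_ids_per_image:
        for w in set(words):
            df[w] += 1
    idf = [math.log(n_images / d) if d else 0.0 for d in df]
    vectors = []
    for words in word_ids_per_image:
        counts = Counter(words)
        vectors.append([counts[l] * idf[l] for l in range(vocab_size)])
    return vectors


# Hypothetical 2-D descriptors and a 2-word codebook.
word = quantize([0.9, 0.1], [[0.0, 0.0], [1.0, 0.0]])

# Three hypothetical images, each given as its list of visual-word indices (S = 3).
docs = [[0, 0, 1], [1, 2], [1, 2, 2]]
vecs = tfidf_vectors(docs, vocab_size=3)
```

Visual word 1 occurs in every image of the toy collection, so its idf is zero and it contributes nothing to any vector, which is the intended effect of the TF-IDF weighting.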

The cosine function is used to calculate the visual similarity s(I^{i}, D^{k}) between image I^{i} in the retrieval results and sample image D^{k}, whose vector representation is

x_{D}^{k} = [x_{D1}^{k}, x_{D2}^{k}, \ldots, x_{DS}^{k}], \quad k = 1, 2, \ldots, K.

The similarity is expressed as:

s(I^{i}, D^{k}) = \frac{x^{i} \cdot x_{D}^{k}}{\| x^{i} \| \, \| x_{D}^{k} \|}.
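A minimal sketch of the cosine similarity follows. The function name is hypothetical, and returning 0 for an all-zero vector is an assumption of this example; the source does not address that case.

```python
import math
from typing import Sequence


def cosine_similarity(x: Sequence[float], y: Sequence[float]) -> float:
    """s = (x . y) / (||x|| ||y||); returns 0.0 if either vector is all zeros (assumption)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0
    return dot / (norm_x * norm_y)


s_same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])
s_orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
s_partial = cosine_similarity([3.0, 4.0], [4.0, 3.0])
```

Because TF-IDF components are non-negative, similarities computed on such vectors lie between 0 and 1.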

2) Voting decision mechanism.

The assumption is that an image relevant to the query should be visually similar to the other relevant images and different from the irrelevant ones. Based on this assumption, a simple and efficient voting mechanism is proposed. Each image in the retrieval results is compared for similarity with each image in the selected positive sample image set P and in the negative sample image set N; if the similarity between the two exceeds a given threshold, the image receives a corresponding initial vote, formulated as follows:

vote_{ik}^{+} = \begin{cases} 1, & s(I^{i}, P^{k}) > \mu \\ 0, & \text{otherwise} \end{cases}, \quad i = 1, 2, \ldots, M, \; k = 1, 2, \ldots, K;

vote_{ik}^{-} = \begin{cases} 1, & s(I^{i}, N^{k}) > \mu \\ 0, & \text{otherwise} \end{cases}, \quad i = 1, 2, \ldots, M, \; k = 1, 2, \ldots, K;

where vote_{ik}^{+} and vote_{ik}^{-} denote the initial positive and negative votes that image I^{i} receives after its visual similarity to the k-th image in the positive sample image set P and to the k-th image in the negative sample image set N is calculated, M denotes the number of images in the retrieval results, and μ is a preset threshold.
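The initial voting rule can be illustrated with the short sketch below, which assumes the similarities have already been computed. The function name `initial_votes` and the similarity values are hypothetical; the same function serves for the positive set P and the negative set N.

```python
from typing import List, Sequence


def initial_votes(sim_to_samples: Sequence[Sequence[float]],
                  mu: float) -> List[List[int]]:
    """vote[i][k] = 1 if s(I^i, sample^k) > mu, else 0."""
    return [[1 if s > mu else 0 for s in row] for row in sim_to_samples]


# Hypothetical similarities of M = 2 result images to K = 3 positive samples.
sim_pos = [[0.9, 0.8, 0.2],
           [0.1, 0.3, 0.6]]
votes_pos = initial_votes(sim_pos, mu=0.5)
```

Here the first image receives positive votes from two of the three samples and the second image from only one.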

Because retrieval results are often complex and the sample images selected by pseudo-relevance feedback may contain noise, some of the votes an image receives from the selected sample images may be invalid. To make the voting robust, a decision mechanism is added: an image truly obtains a vote only if it receives votes from a majority (for example, at least half) of the selected sample images, formulated as follows:

r_{i}^{+} = \begin{cases} 1, & \sum_{k = 1}^{K} vote_{ik}^{+} \geq K / 2 \\ 0, & \text{otherwise} \end{cases}, \quad i = 1, 2, \ldots, M;

r_{i}^{-} = \begin{cases} 1, & \sum_{k = 1}^{K} vote_{ik}^{-} \geq K / 2 \\ 0, & \text{otherwise} \end{cases}, \quad i = 1, 2, \ldots, M;

The above describes the case in which the sample images comprise both a positive sample image set and a negative sample image set; it applies equally when the sample images comprise only a positive sample image set. In that case the negative samples are not considered (if they were, their result would be 0) and only the positive voting results are calculated; the formulas are as given above and are not repeated here.
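The majority decision can be sketched as follows. The function name `final_vote` is hypothetical, and returning 0 for an empty sample set is this example's way of reflecting the remark that the negative result is 0 when no negative samples are used.

```python
from typing import Sequence


def final_vote(votes_row: Sequence[int]) -> int:
    """r_i = 1 if the image collected votes from at least half of the K samples, else 0."""
    k = len(votes_row)
    if k == 0:
        # No samples of this polarity were selected, so no vote can be obtained.
        return 0
    return 1 if sum(votes_row) >= k / 2 else 0


r_majority = final_vote([1, 1, 0])   # 2 of 3 votes
r_minority = final_vote([0, 0, 1])   # 1 of 3 votes
r_exact_half = final_vote([1, 0])    # exactly half of 2 votes
r_no_samples = final_vote([])        # positive-only strategy, negative side
```

Note that the threshold is inclusive: an image that receives votes from exactly half of the samples still obtains the final vote.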

3) Relevance probability estimation.

After every image in the retrieval results has obtained its voting result, the embodiment of the present invention provides two methods for estimating the relevance probability of each image, called hard decision and soft estimation. Here p_{i} denotes the relevance probability between the i-th image in the retrieval list and the given query.

A) Hard decision.

The hard decision applies when the sample images comprise both a positive sample image set and a negative sample image set. The embodiment of the present invention follows the assumption that a relevant image should be visually similar to other relevant images and dissimilar to irrelevant ones, so the hard decision method is defined as follows:

p_{i} = \begin{cases} 1, & \text{if } r_{i}^{+} = 1 \text{ and } r_{i}^{-} = 0 \\ 0, & \text{otherwise} \end{cases}, \quad i = 1, 2, \ldots, M;

As the above formula shows, the present invention marks an image as relevant if and only if it obtains a positive vote and no negative vote.
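A minimal sketch of the hard decision, assuming the final votes have already been computed; the function name and the vote values are hypothetical.

```python
from typing import List, Sequence


def hard_relevance(r_pos: Sequence[int], r_neg: Sequence[int]) -> List[int]:
    """p_i = 1 only when the final positive vote is 1 and the final negative vote is 0."""
    return [1 if p == 1 and n == 0 else 0 for p, n in zip(r_pos, r_neg)]


# Hypothetical final votes for four result images.
p_hard = hard_relevance(r_pos=[1, 1, 0, 0], r_neg=[0, 1, 0, 1])
```

Only the first image is marked relevant: the second also resembles the negative samples, and the last two do not resemble the positive samples.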

B) Soft estimation.

The soft estimation applies when the sample images comprise only a positive sample image set.

As discussed in step 11, using only pseudo-positive samples may work better, so only the pseudo-positive sample images are used here to estimate the relevance probability. Following the idea of classification, each positive sample image in the positive sample image set is treated as a separate classifier, and a logistic regression algorithm is then used to combine the classification results of the classifiers, formulated as follows:

p_{i} = \frac{\exp\left(\sum_{k = 1}^{K} f_{k}(I^{i}) / K - 0.5\right)}{1 + \exp\left(\sum_{k = 1}^{K} f_{k}(I^{i}) / K - 0.5\right)}, \quad i = 1, 2, \ldots, M;
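The soft estimate can be sketched as below, taking f_k(I^i) to be the initial positive vote of the k-th positive sample, as the summary of the invention states. The function name `soft_relevance` is an assumption of this example.

```python
import math
from typing import Sequence


def soft_relevance(positive_votes_row: Sequence[int]) -> float:
    """p_i = exp(z) / (1 + exp(z)) with z = (sum_k f_k) / K - 0.5."""
    k = len(positive_votes_row)
    if k == 0:
        raise ValueError("at least one positive sample is required")
    z = sum(positive_votes_row) / k - 0.5
    return math.exp(z) / (1.0 + math.exp(z))


p_all = soft_relevance([1, 1, 1, 1])    # every classifier votes for the image
p_half = soft_relevance([1, 1, 0, 0])   # half of the classifiers vote for it
p_none = soft_relevance([0, 0])         # no classifier votes for it
```

Since the vote fraction lies in [0, 1], the argument z lies in [-0.5, 0.5], so with this formula the estimate ranges from about 0.378 (no votes) to about 0.622 (all votes), with 0.5 at exactly half.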

Step 13: combine the relevance probabilities between all images in the retrieval results and the input query request to calculate the average precision of the image retrieval, thereby achieving efficient prediction of image retrieval performance.

Corresponding to the relevance probabilities obtained by the hard decision and the soft estimation in step 12, the embodiment of the present invention proposes two methods for calculating the average precision.

1) For the hard decision, the relevance probability p_{i} obtained is either 0 or 1, so the formula for average precision is applied directly:

EAP@T = \frac{1}{Z_{T}} \sum_{i = 1}^{T} rel(i) \cdot \frac{1}{i} \sum_{j = 1}^{i} rel(j);

where T indicates that the top T images of the initial retrieval results re-ranked in step 11 are considered; rel(i) is a binary function, with rel(i) = 1 if the image ranked i-th is relevant to the input query request (as determined by the hard decision) and rel(i) = 0 otherwise; rel(j) has the same meaning as rel(i) and differs only in the range of its index, with i ∈ [1, T] and j ∈ [1, i]; and Z_{T} is a normalization coefficient.
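The formula for the hard-decision case can be sketched as follows. The source does not define the normalization coefficient Z_T, so it is passed in explicitly; the example assumes, as in the conventional definition of average precision, that Z_T is the number of relevant images among the top T. The function name and the relevance labels are hypothetical.

```python
from typing import Sequence


def ap_at_t(rel: Sequence[int], t: int, z_t: float) -> float:
    """EAP@T = (1 / Z_T) * sum_i rel(i) * (1 / i) * sum_{j <= i} rel(j)."""
    total = 0.0
    hits = 0  # running value of sum_{j <= i} rel(j)
    for i in range(1, t + 1):
        hits += rel[i - 1]
        if rel[i - 1]:
            total += hits / i
    return total / z_t


# Hypothetical hard-decision labels for the top 5 re-ranked images.
rel = [1, 0, 1, 1, 0]
# Assumption: Z_T is the number of relevant images in the top T.
eap = ap_at_t(rel, t=5, z_t=sum(rel))
```

For these labels the precision at the three relevant ranks is 1, 2/3 and 3/4, and their mean is the estimated average precision.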

2) For the soft estimation, the relevance probability p_{i} obtained lies between 0 and 1, so the above formula cannot be applied directly. The formula is therefore re-derived from the viewpoint of mathematical expectation, as follows:

\begin{aligned}
E(AP@T) &= E\left[\frac{1}{Z_{T}} \sum_{i = 1}^{T} rel(i) \frac{\sum_{j = 1}^{i} rel(j)}{i}\right] \\
&= \frac{1}{Z_{T}} \sum_{i = 1}^{T} \sum_{j = 1}^{i} \frac{E[rel(i)\, rel(j)]}{i} \\
&= \frac{1}{Z_{T}} \sum_{i = 1}^{T} \frac{1}{i} \left\{ E[rel(i)^{2}] + \sum_{j = 1}^{i - 1} E[rel(i)\, rel(j)] \right\} \\
&= \frac{1}{Z_{T}} \sum_{i = 1}^{T} \frac{1}{i} \left\{ p(rel(i) = 1) + \sum_{j = 1}^{i - 1} p(rel(i) = 1, rel(j) = 1) \right\} \\
&= \frac{1}{Z_{T}} \sum_{i = 1}^{T} \frac{1}{i} \left\{ p(rel(i) = 1) + \sum_{j = 1}^{i - 1} p(rel(i) = 1)\, p(rel(j) = 1) \right\}
\end{aligned}

E(AP@T) is obtained by taking the mathematical expectation of EAP@T and is an approximation; E denotes the mathematical expectation, and p denotes the relevance probability between an image and the input query request; for example, p_{i} denotes the relevance probability between the i-th image in the retrieval list and the given query.

In deriving the above formula, it is assumed that the probabilities that two images at different positions of the returned list are relevant to the given query are mutually independent.
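The last line of the derivation can be evaluated in a single pass, as in the sketch below, which substitutes p_i for p(rel(i) = 1). The function name is hypothetical, and Z_T is again passed in explicitly because the source does not define it; the values used for it here are assumptions chosen for illustration.

```python
from typing import Sequence


def expected_ap_at_t(p: Sequence[float], t: int, z_t: float) -> float:
    """E(AP@T) = (1 / Z_T) * sum_i (1 / i) * (p_i + sum_{j < i} p_i * p_j)."""
    total = 0.0
    prefix = 0.0  # running value of sum_{j < i} p_j
    for i in range(1, t + 1):
        p_i = p[i - 1]
        total += (p_i + p_i * prefix) / i
        prefix += p_i
    return total / z_t


# With 0/1 probabilities the expectation reduces to the hard-decision formula.
e_binary = expected_ap_at_t([1, 0, 1, 1, 0], t=5, z_t=3)

# Hypothetical soft probabilities; Z_T = T is an assumption for this example.
e_soft = expected_ap_at_t([0.5, 0.5], t=2, z_t=2)
```

The running prefix sum keeps the cost linear in T, in line with the low complexity claimed for the method.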

On the other hand, also many experiments has been carried out for the solution of the present invention.

1) twice experiment is carried out for relative coefficient, once experiment compares to adopt positive and negative sample picture and the relative coefficient comparing result (as shown in table 1) only using positive sample picture, and table 1 tables of data Benq is in only using the query performance prediction method of positive sample picture more effective; The relative coefficient comparative result (as shown in table 2) using the adaptively selected positive sample quantity shown in step 11 and existing fixed qty is compared in another experiment, and table 2 tables of data Benq is better in the method for adaptively selected positive sample number.In table 1, EAP_PN represents the Average Accuracy using positive and negative sample to estimate, EAP_P represents the Average Accuracy only using positive sample to estimate; In table 2, Fixed represents existing fixed qty, and Adaptive represents adaptively selected positive sample quantity of the present invention; Table 1 is with table 2, and Kendall ' s τ is Kendall's coefficient, and Pearson ' s r is Pearson's coefficient, and Spearman ' s ρ is Spearman coefficient; Three kinds of related coefficient balancing methods above, it is the recognised standard evaluating query performance prediction method, their span is all [-1,1] between, 1 represents best positive correlation, and-1 represents perfect negative correlation, and 0 represents completely uncorrelated, the larger expression degree of correlation of coefficient value is better, illustrates that coupling is between the two also better; Hard represents rigid decision method, and Soft represents soft estimation.

Table 1. Correlation coefficients obtained using positive and negative sample pictures versus positive sample pictures only

Table 2. Correlation coefficients obtained with the adaptively selected number of positive samples versus the existing fixed number

2) The scheme of the present invention was compared with prior-art schemes. The results are given in Tables 3 and 4, where Table 3 is the comparison on the typical Web353 data set and Table 4 is the comparison on the typical MSRA-MM_V1.0 data set. In Tables 3 and 4, VCS denotes the visual clarity score, COS the coherence score, RS the representativeness score, QAPE adaptive query performance prediction, and Ours the scheme of the present invention. In Table 4, Method denotes the compared method, and NDCG denotes Normalized Discounted Cumulative Gain, which, like the AP used by the method of the present invention, is a standard measure of the true quality of retrieval results. The comparisons in Tables 3 and 4 show that the method of the present invention predicts query performance better than the other methods.

Table 3. Comparison results on the typical Web353 data set

Table 4. Comparison results on the typical MSRA-MM_V1.0 data set

The advantages and beneficial effects of the embodiments of the present invention are as follows:

(1) The present invention proposes a new query performance prediction method. Compared with most previous methods, this algorithm automatically predicts the retrieval performance of the retrieval system for a given query rather than merely providing an indicator; that is, the estimated value approximates the true value.

(2) The present invention proposes an adaptive positive sample selection method, which solves the problem of how to choose the number of positive samples for different queries.

(3) The present invention proposes two new relevance probability estimation methods. Existing methods estimate the relevance probability of each picture by analyzing the visual hyperlinks among the pictures returned for a given query. Whereas existing methods must analyze the similarity between all returned pictures, the method adopted by the present invention has lower complexity and is easy to implement; moreover, owing to its simple and efficient voting decision mechanism, it is less susceptible to noise.

From the above description of the embodiments, those skilled in the art will clearly understand that the above embodiments can be implemented in software, or in software together with a necessary general-purpose hardware platform. Based on this understanding, the technical scheme of the above embodiments can be embodied in the form of a software product. The software product can be stored in a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a portable hard disk) and includes instructions that cause a computer device (such as a personal computer, a server, or a network device) to perform the methods described in the embodiments of the present invention.

The above is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited thereto. Any change or replacement that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be determined by the protection scope of the claims.

Claims

1. An efficient image retrieval performance prediction method, characterized in that the method comprises:

2. The method according to claim 1, characterized in that adaptively selecting sample pictures from the retrieval result based on the pseudo-relevance feedback technique comprises:

K^{*} = \max_{K \in [L, M]} CoS(K);

CoS(K) = \frac{1}{|K(K - 1)|} \sum_{i, j = 1, \ldots, K;\ i \neq j} \delta(I^{i}, I^{j});

\delta(I^{i}, I^{j}) = \begin{cases} 1, & \text{if } s(I^{i}, I^{j}) > \mu \\ 0, & \text{else} \end{cases};
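The selection rule in the formulas above can be sketched in Python as follows. This is an illustrative sketch only: the function names and the pairwise similarity matrix `sim` are hypothetical, K* is read as the value of K in [L, M] at which CoS(K) is largest, ties are broken toward the smaller K (an assumption), and L is assumed to be at least 2 so that K(K - 1) is non-zero.

```python
def coherence_score(sim, k, mu):
    """CoS(K): share of ordered pairs (i != j) among the top-k pictures
    whose pairwise similarity exceeds the threshold mu."""
    hits = sum(
        1
        for i in range(k)
        for j in range(k)
        if i != j and sim[i][j] > mu
    )
    return hits / (k * (k - 1))


def select_num_samples(sim, lower, upper, mu):
    """Return the K in [lower, upper] with the largest CoS(K).
    Assumption: ties go to the smallest such K; lower must be >= 2."""
    best_k, best_score = lower, -1.0
    for k in range(lower, upper + 1):
        score = coherence_score(sim, k, mu)
        if score > best_score:
            best_k, best_score = k, score
    return best_k


# Made-up similarities for four top-ranked pictures: the first three are
# mutually similar, the fourth is unlike the others.
sim = [
    [1.0, 0.9, 0.8, 0.1],
    [0.9, 1.0, 0.85, 0.2],
    [0.8, 0.85, 1.0, 0.1],
    [0.1, 0.2, 0.1, 1.0],
]
k_star = select_num_samples(sim, lower=2, upper=4, mu=0.5)
```

Including the fourth picture halves the coherence score in this toy example, so the selection stops short of it.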

3. The method according to claim 1 or 2, characterized in that the step of calculating the visual similarity between each picture in the retrieval result and the sample pictures comprises:

\begin{matrix} x_{l}^{i} = {tf}_{l} \cdot i {df}_{l}, & l = 1,2, . . ., S; \end{matrix}

Cosine function cosine is adopted to calculate picture I in result for retrieval ⁱwith sample picture D ^kvision similarity s (I ⁱ, D ^k), be expressed as:

s (I^{i}, D^{k}) = \frac{x^{i} \cdot x_{D}^{k}}{| | x^{i} | | | | x_{D}^{k} | |}, k = 1,2, . . ., K .
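The two formulas above amount to tf-idf weighting followed by cosine similarity, which can be sketched as below. The function names are hypothetical, the term frequencies and inverse document frequencies are assumed to be given as plain lists of length S, and returning 0.0 for an all-zero vector is an assumption made here to avoid division by zero.

```python
from math import sqrt


def tfidf_vector(tf, idf):
    """x_l = tf_l * idf_l for l = 1..S."""
    return [t * d for t, d in zip(tf, idf)]


def cosine_similarity(x, y):
    """s = (x . y) / (||x|| ||y||)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = sqrt(sum(a * a for a in x))
    norm_y = sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0  # assumption: an empty feature vector is similar to nothing
    return dot / (norm_x * norm_y)


# Made-up values for a vocabulary of S = 3 visual words.
idf = [0.5, 1.0, 2.0]
x_picture = tfidf_vector([2, 0, 1], idf)
x_sample = tfidf_vector([4, 0, 2], idf)
similarity = cosine_similarity(x_picture, x_sample)
```

Because the cosine ignores vector length, two pictures with the same visual word proportions are maximally similar even if one contains twice as many features.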

4. The method according to claim 1 or 2, characterized in that determining the number of votes of each picture, based on the voting decision mechanism and according to the similarity between each picture in the retrieval result and the sample pictures, comprises:

vote_{ik}^{+} = \begin{cases} 1, & s(I^{i}, P^{k}) > \mu \\ 0, & \text{else} \end{cases}, \quad i = 1, 2, \ldots, M, \; k = 1, 2, \ldots, K;

vote_{ik}^{-} = \begin{cases} 1, & s(I^{i}, N^{k}) > \mu \\ 0, & \text{else} \end{cases}, \quad i = 1, 2, \ldots, M, \; k = 1, 2, \ldots, K;

r_{i}^{+} = \begin{cases} 1, & \sum_{k = 1}^{K} vote_{ik}^{+} \geq K / 2 \\ 0, & \text{else} \end{cases}, \quad i = 1, 2, \ldots, M;

r_{i}^{-} = \begin{cases} 1, & \sum_{k = 1}^{K} vote_{ik}^{-} \geq K / 2 \\ 0, & \text{else} \end{cases}, \quad i = 1, 2, \ldots, M;

Wherein r_{i}^{+} and r_{i}^{-} denote the final positive vote and the final negative vote of picture I^{i}, respectively.
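The voting rule above can be sketched as a single helper that is applied once to the similarities against the positive sample pictures P^k and once to those against the negative sample pictures N^k. The function name and the sample values are hypothetical; the threshold comparison and the K/2 majority rule follow the formulas above.

```python
def majority_vote(similarities, mu):
    """Given s(I^i, D^k) for k = 1..K, cast one vote per sample picture
    (1 if the similarity exceeds mu) and return 1 if at least K/2 votes are cast."""
    votes = [1 if s > mu else 0 for s in similarities]
    return 1 if sum(votes) >= len(votes) / 2 else 0


# Made-up similarities of one retrieved picture to K = 3 positive
# and K = 3 negative sample pictures.
r_pos = majority_vote([0.9, 0.8, 0.1], mu=0.5)  # two of three positive votes
r_neg = majority_vote([0.2, 0.6, 0.1], mu=0.5)  # one of three negative votes
```

Note that with an even K the rule as written lets exactly half of the votes carry the decision, since the comparison is "greater than or equal to K/2".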

5. The method according to claim 4, characterized in that estimating the relevance probability between each picture in the retrieval result and the input query comprises:

p_{i} = \begin{cases} 1, & \text{if } r_{i}^{+} = 1 \text{ and } r_{i}^{-} = 0 \\ 0, & \text{else} \end{cases}, \quad i = 1, 2, \ldots, M;

p_{i} = \frac{\exp\left(\sum_{k = 1}^{K} f_{k}(I^{i}) / K - 0.5\right)}{1 + \exp\left(\sum_{k = 1}^{K} f_{k}(I^{i}) / K - 0.5\right)}, \quad i = 1, 2, \ldots, M;
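The two estimators above can be sketched as follows. The function names are hypothetical. Since f_k is not defined in this claim, the sketch simply takes the K values f_k(I^i) as an input list, and it reads the exponent as the mean of those values minus 0.5; both points are assumptions.

```python
from math import exp


def hard_probability(r_pos, r_neg):
    """Hard decision: relevant only if the positive vote is 1 and the negative vote is 0."""
    return 1 if (r_pos == 1 and r_neg == 0) else 0


def soft_probability(scores):
    """Soft estimate: logistic function of (mean of f_k(I^i) over k) - 0.5.
    `scores` holds the K values f_k(I^i); their definition is assumed given."""
    z = sum(scores) / len(scores) - 0.5
    return exp(z) / (1.0 + exp(z))


p_hard = hard_probability(r_pos=1, r_neg=0)
p_soft = soft_probability([1, 1, 0, 1])  # made-up scores from K = 4 sample pictures
```

Under this reading, a picture supported by exactly half of the sample pictures receives a soft probability of 0.5, and more support pushes the value above 0.5.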

6. The method according to claim 5, characterized in that calculating the average precision of this image retrieval by combining the relevance probabilities between all pictures in the retrieval result and the input query comprises:

EAP@T = \frac{1}{Z_{T}} \sum_{i = 1}^{T} rel(i) \cdot \frac{1}{i} \sum_{j = 1}^{i} rel(j);

E(AP@T) = E\left[\frac{1}{Z_{T}} \sum_{i = 1}^{T} rel(i) \frac{\sum_{j = 1}^{i} rel(j)}{i}\right];

Wherein E denotes mathematical expectation.
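As a rough illustration of how such an expectation can be evaluated from the relevance probabilities, the sketch below assumes that rel(i) is binary with E[rel(i)] = p_i, that relevance at different positions is independent, and that Z_T is supplied by the caller as a fixed normalizer. Under those assumptions each term reduces to (p_i / i)(1 + p_1 + ... + p_{i-1}). The function name is hypothetical and this is not presented as the exact formula of the invention.

```python
def expected_ap_at_t(p, t, z_t):
    """Approximate E(AP@T) from relevance probabilities p[0..t-1].

    Assumptions: binary, mutually independent relevance with E[rel(i)] = p_i,
    and z_t treated as a constant normalizer chosen by the caller.
    """
    total = 0.0
    prefix = 0.0  # running sum p_1 + ... + p_{i-1}
    for i in range(1, t + 1):
        p_i = p[i - 1]
        total += p_i / i * (1.0 + prefix)
        prefix += p_i
    return total / z_t


# With hard 0/1 probabilities the estimate coincides with ordinary AP:
# a ranking (relevant, relevant, irrelevant) with two relevant pictures.
estimate = expected_ap_at_t([1, 1, 0], t=3, z_t=2)
```

The running prefix sum keeps the computation linear in T instead of recomputing the inner sum for every position.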