CN100592297C - Polysemantic digital image retrieval method based on representation conversion

Info

Publication number: CN100592297C
Application number: CN200810020716A
Authority: CN
Inventors: 周志华; 张敏灵
Original assignee: Nanjing University
Current assignee: Nanjing University
Priority date: 2008-02-22
Filing date: 2008-02-22
Publication date: 2010-02-24
Anticipated expiration: 2028-02-22
Also published as: CN101236565A

Abstract

The invention discloses a polysemantic digital image retrieval method based on representation conversion. The method comprises the following steps: first, the user selects query images, including relevant images and irrelevant images, from an existing multi-label image library; second, the polysemantic information implicit in the query images is described explicitly by means of representation conversion; third, a preset classification method learns a prediction model from the converted query images; fourth, the concept labels of the images to be retrieved in a digital image storage device are predicted with the prediction model, retrieval is performed using the predictions, and the retrieved images are returned; fifth, if the user is satisfied with the retrieval results, the sixth step is executed; otherwise, more query images are selected from the multi-label image library as feedback and the second step is executed again; sixth, the operation ends. Because the method processes the polysemantic information of images explicitly through representation conversion, it overcomes the limitation that most existing image retrieval methods can handle only monosemantic images.

Description

A polysemantic digital image retrieval method based on representation conversion

Technical field

The present invention relates to a digital image retrieval method, and in particular to a retrieval method suitable for polysemantic digital images.

Background technology

As the data collection and processing capabilities of computers keep improving, digital images are becoming ever easier to acquire. The number of available digital images is therefore growing rapidly, and they are being applied in more and more industries. Image retrieval is a technique that helps users obtain image information efficiently: given the query images that a user submits to the retrieval device, it quickly and accurately finds the images the user wants in an image library and returns them. One effective retrieval strategy treats retrieval as a learning process: the query images submitted by the user serve as training samples, and a machine learning technique learns a prediction model from them, which is then used to retrieve digital images.

Current image retrieval techniques mainly address monosemantic digital images. An image of this type corresponds to a single concept class, so its semantics are definite and unambiguous. In the real world, however, polysemantic digital images are widespread. For example, an interior decoration image may correspond simultaneously to several concept classes such as desk, sofa, and wardrobe; a natural scene image may carry several concept labels such as blue sky, sun, and mountains; and a wildlife image may belong to several categories such as grassland, lion, and giraffe. Because existing digital image retrieval techniques can handle only monosemantic digital images, they cannot exploit the multiple semantic information contained in polysemantic digital images, which hinders effective retrieval of the images relevant to the user.

Summary of the invention

1. Purpose of the invention: to address the problem that current digital image retrieval techniques can handle only monosemantic digital images, the invention proposes a method that handles polysemantic digital images effectively. The method applies a representation conversion to the original feature vector of each image so as to describe explicitly the multiple semantic information embedded in a polysemantic digital image, thereby improving the performance of the digital image retrieval device.

2. Technical solution: to achieve the above purpose, the polysemantic digital image retrieval method based on representation conversion of the present invention comprises the following steps: (1) the user selects query images, including relevant images and irrelevant images, from an existing multi-label image library; (2) representation conversion is used to describe explicitly the polysemantic information contained in the query images; (3) a preset classification method learns a prediction model from the converted query images; (4) the concept labels of the images to be retrieved in the digital image storage device are predicted with the prediction model, and the predictions are used to perform retrieval and return the retrieved images; (5) if the user is satisfied with the retrieval results, step 6 is executed; otherwise more query images are selected from the multi-label image library as feedback and step 2 is executed; (6) end.

3. Beneficial effects: the invention provides a retrieval method for polysemantic digital images. Based on representation conversion, the method processes the multiple semantic information of images explicitly and overcomes the limitation that most current image retrieval methods can handle only monosemantic images.

Description of drawings

Fig. 1 is a workflow diagram of the digital image retrieval device.

Fig. 2 is a flow chart of the method of the invention.

Fig. 3 is a flow chart of the digital image representation conversion used by the invention.

Fig. 4 is a flow chart of the classification method used by the invention.

Embodiment

The preferred embodiment is described in detail below with reference to the accompanying drawings.

As shown in Fig. 1, the digital image storage device holds the digital images to be retrieved. There is in addition a multi-label digital image library containing a number of polysemantic digital images, each of which is associated with a set of manually annotated concept labels. The user selects M query images from the multi-label digital image library and submits them to the digital image retrieval device; some of them are relevant images that interest the user, and the others are irrelevant images that do not. Classical methods from digital image processing textbooks can be used to generate suitable image features, such as color, texture, and shape, so that every image is represented by a feature vector. Once the image features are available, the polysemantic digital images are processed by representation conversion, a preset classification method is trained to obtain the corresponding prediction model, and the images to be retrieved in the digital image storage device are retrieved on that basis. If the user is not satisfied with the results, more query images can be selected from the multi-label digital image library and fed back to the digital image retrieval device.

The method of the present invention is shown in Fig. 2. Step 10 is the start. Suppose the query images submitted by the user correspond to the set S = {(x_i, Y_i) | 1 ≤ i ≤ M}, where Y_i is the polysemantic information associated with image object x_i and is represented by a set of concept labels Y_i ⊆ {1, 2, ..., Q} (Q is the number of all possible concept labels). Step 11 applies representation conversion to all query images so as to describe the polysemantic information of the image objects explicitly; the details are shown in Fig. 3. Step 12 then trains the preset classification method on the converted image objects to obtain the required prediction model; the details are shown in Fig. 4. Step 13 uses the trained prediction model to retrieve the images to be retrieved in the digital image storage device. Specifically, the retrieval device first applies the same representation conversion as in step 11 to each image to be retrieved and then submits the converted image object to the trained model for prediction. Once the set of concept labels to which an image to be retrieved belongs has been obtained, the images can be ranked for output in several ways. One intuitive way is to examine the similarity between the concept label set of the image to be retrieved and the concept label sets of the relevant and irrelevant images: the higher its similarity to the former and the lower its similarity to the latter, the higher the image is ranked. After the retrieval results are output, the retrieval device enters the end state shown in step 14.
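The ranking rule of step 13 can be illustrated with a minimal sketch. The description only says that similarity between concept label sets is examined, so the Jaccard similarity, the averaging over query images, and the subtraction used to combine the two similarities are all assumptions made for illustration; the function names are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two label sets (two empty sets count as identical)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mean_similarity(labels, label_sets):
    """Average similarity between one label set and a list of label sets."""
    if not label_sets:
        return 0.0
    return sum(jaccard(labels, other) for other in label_sets) / len(label_sets)


def rank(predicted, relevant, irrelevant):
    """Sort image ids so that images resembling the relevant queries come first.

    Assumed score: similarity to relevant label sets minus similarity to
    irrelevant label sets.
    """
    def score(image_id):
        labels = predicted[image_id]
        return mean_similarity(labels, relevant) - mean_similarity(labels, irrelevant)

    return sorted(predicted, key=score, reverse=True)


# Predicted concept label sets of three images to be retrieved (toy values).
predicted = {"a": {1, 2}, "b": {3}, "c": {1, 3}}
ranking = rank(predicted, relevant=[{1, 2}], irrelevant=[{3}])
```

Any other set similarity could be substituted for `jaccard` without changing the structure of the ranking.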

Fig. 3 gives the details of step 11 in Fig. 2, showing specifically how the vector features of a digital image undergo representation conversion. Step 1100 in Fig. 3 is the start state. Steps 1101 to 1105 form a loop; each round of the loop generates the prototype vector v_q corresponding to class q. Step 1103 first constructs the set U_q: if an image object (x_i, Y_i) carries label q, its vector x_i is placed in this set. Step 1104 then averages all vectors in U_q to obtain the required prototype vector v_q. Intuitively, v_q approximately summarizes class q. After this process is complete, steps 1106 to 1109 form another loop, each round of which applies representation conversion to one query image. Specifically, step 1108 uses the prototype vectors to convert each image object (x_i, Y_i) into the new representation (X_i, Y_i), in which the single vector x_i becomes a set X_i of vectors, each obtained as the difference between x_i and one prototype vector. Intuitively, each difference vector reflects the spatial relationship between x_i and one class. After the conversion, the initial query image data set S has become the new image object data set S^New, as shown in step 1110. Step 1111 is the end state.
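The two loops of Fig. 3 can be sketched as follows. This is an illustrative sketch rather than the patented implementation: the function names are hypothetical, feature vectors are plain Python lists, and raising an error for a label that no query image carries is an assumption, since the description does not cover that case.

```python
def prototype_vectors(X, Y, Q):
    """First loop: prototype v_q = mean of all vectors whose image carries label q."""
    V = []
    for q in range(1, Q + 1):
        U_q = [x for x, labels in zip(X, Y) if q in labels]
        if not U_q:
            # Assumption: the source does not say what happens for an unused label.
            raise ValueError(f"no query image carries label {q}")
        V.append([sum(column) / len(U_q) for column in zip(*U_q)])
    return V


def convert(x, V):
    """Second loop body: X_i holds one difference vector x_i - v_q per label q."""
    return [[a - b for a, b in zip(x, v_q)] for v_q in V]


# Toy query set S: three 2-D feature vectors with their concept label sets.
X = [[0.0, 0.0], [2.0, 0.0], [0.0, 4.0]]
Y = [{1}, {1, 2}, {2}]
V = prototype_vectors(X, Y, Q=2)
S_new = [(convert(x, V), labels) for x, labels in zip(X, Y)]
```

Each converted object keeps its label set Y_i unchanged; only the single vector x_i is replaced by a set of Q difference vectors.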

Fig. 4 gives the details of step 12 in Fig. 2, showing specifically how the preset classification method learns the corresponding prediction model. Step 1200 in Fig. 4 is the start state. Step 1201 first places the set representations X_i (1 ≤ i ≤ M) of all image objects in data set S^New into the unlabeled data set U. Then, once a distance measure between objects of the set has been chosen, the objects in U can be clustered with a classical unsupervised learning method from machine learning and data mining textbooks, yielding k cluster centers M_j (1 ≤ j ≤ k). The number of clusters k is specified in advance by the user. Here, the Hausdorff distance commonly found in pattern recognition textbooks is used to measure the distance between a set object A and a set object B. This measure examines, for each element of A, its distance to the nearest element of B, and for each element of B, its distance to the nearest element of A; the maximum of all these nearest distances is taken as the final distance between A and B. Steps 1202 to 1205 form a loop, each round of which uses the clustering result to convert an object X_i into a vector representation z_i whose j-th dimension z_ij is the Hausdorff distance between X_i and the j-th cluster center M_j. After this process is complete, the image retrieval device trains the matrix representation W of the prediction model by minimizing the sum of squared errors. The system of equations for this matrix is shown in step 1208, and the matrices Φ and T needed to solve it are set up in steps 1206 and 1207 respectively. Because solving the system in step 1208 directly may run into the difficulty that the left-hand matrix (Φ^T Φ) is not invertible, singular value decomposition, a classical method from linear algebra textbooks, is used here to solve it. Step 1209 is the end state.
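The Hausdorff distance, the conversion of X_i into z_i, and the SVD-based least-squares fit can be sketched with NumPy as below. Several details are assumptions, because the text leaves them to the figures: Euclidean distance between individual vectors, cluster centers picked by hand instead of by a clustering run, Φ formed by stacking the vectors z_i, T coded as +1 or -1 per label, and the system taken to be the normal equations (Φ^T Φ) W = Φ^T T. All function names are hypothetical.

```python
import numpy as np


def hausdorff(A, B):
    """Hausdorff distance between two sets of vectors (one vector per row)."""
    A, B = np.asarray(A, dtype=float), np.asarray(B, dtype=float)
    # Pairwise distances; Euclidean distance between vectors is an assumption.
    D = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
    # Nearest-neighbour distance for every element of A and of B, then the maximum.
    return float(max(D.min(axis=1).max(), D.min(axis=0).max()))


def distance_vector(X_i, centers):
    """z_i: the j-th entry is the Hausdorff distance between X_i and center M_j."""
    return np.array([hausdorff(X_i, M_j) for M_j in centers])


def fit_weights(Phi, T):
    """Minimum-norm least-squares solution of Phi @ W = T.

    np.linalg.pinv computes the pseudo-inverse through an SVD, so this also
    works when Phi.T @ Phi is singular.
    """
    return np.linalg.pinv(Phi) @ T


# Toy data: three converted image objects, each a set of two 2-D vectors.
sets = [np.array([[0.0, 0.0], [1.0, 0.0]]),
        np.array([[0.0, 0.0], [0.0, 3.0]]),
        np.array([[5.0, 5.0], [6.0, 5.0]])]
labels = [{1}, {1}, {2}]
centers = [sets[0], sets[2]]  # hypothetical: centers picked by hand, not clustered

Phi = np.vstack([distance_vector(X_i, centers) for X_i in sets])
# Assumed target coding: +1 if the image carries label q, otherwise -1.
T = np.array([[1.0 if q in Y_i else -1.0 for q in (1, 2)] for Y_i in labels])
W = fit_weights(Phi, T)
scores = Phi @ W  # one row of label scores per image
```

Under these assumptions, an image to be retrieved would be converted to its own distance vector z and scored with `z @ W`; how scores are turned into a concept label set is not specified in the text.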

Those skilled in the art will understand that, although specific embodiments have been described here for ease of explanation, various changes can be made without departing from the spirit and scope of the invention. Accordingly, the invention is not limited except by the claims.

Claims

1. A polysemantic digital image retrieval method based on representation conversion, comprising the following steps:

(1) the user selects query images, including relevant images and irrelevant images, from an existing multi-label image library;

(2) representation conversion is used to describe explicitly the polysemantic information contained in the query images;

(3) a preset classification method learns a prediction model from the converted query images;

(4) the concept labels of the images to be retrieved in the digital image storage device are predicted with the prediction model, and the predictions are used to perform retrieval and return the retrieved images;

(5) if the user is satisfied with the retrieval results, step 6 is executed; otherwise more query images are selected from the multi-label image library as feedback and step 2 is executed;

(6) end.

2. The polysemantic digital image retrieval method based on representation conversion according to claim 1, characterized in that step (2) goes through two stages, each corresponding to a loop:

(1) the loop of the first stage runs for Q rounds, where Q is the number of all possible concept labels; each round first constructs the data set U_q corresponding to the concept label q currently under consideration, this set consisting of all image objects that carry label q, and then averages all image vectors in U_q to obtain the prototype vector v_q corresponding to class q;

(2) the loop of the second stage runs for M rounds, where M is the number of query images selected by the user; each round uses the prototype vectors v_q obtained in the previous stage to convert the vector representation x_i of the i-th image into the vector set representation X_i, where X_i contains Q vectors in total and each vector is the difference between x_i and one prototype vector v_q, with 1 ≤ q ≤ Q and 1 ≤ i ≤ M;

after the above two stages are complete, the initial training set S has been converted into the new training set S^New.

3. The polysemantic digital image retrieval method based on representation conversion according to claim 1, characterized in that step (3) goes through three stages:

(1) in the first stage, the set representations X_i of all image objects in data set S^New are first placed into the unlabeled data set U; an unsupervised machine learning method then performs cluster analysis on U to obtain k cluster centers M_j, where 1 ≤ j ≤ k;

(2) the second stage corresponds to a loop of M rounds in total; each round uses the clustering result of the previous stage to convert the set representation X_i of an image object into the vector representation z_i, where z_i is a k-dimensional attribute vector whose j-th dimension z_ij is the Hausdorff distance between X_i and M_j;

(3) in the third stage, the matrix representation W of the prediction model is trained by minimizing the sum of squared errors; to overcome the difficulty caused by a singular matrix arising during the solution, singular value decomposition is used to solve the target system of equations;

wherein data set S^New is the new training set obtained by applying representation conversion to the training images, and M is the number of query images selected by the user;

after the above three stages are complete, the required prediction model is obtained by combining the clustering result with the matrix representation W.