CN104462111A

CN104462111A - Image retrieval database establishing method

Info

Publication number: CN104462111A
Application number: CN201310424717.4A
Authority: CN
Inventors: 陈卓; 李薪宇
Original assignee: Chengdu Idealsee Technology Co Ltd
Current assignee: Chengdu Idealsee Technology Co Ltd
Priority date: 2013-09-17
Filing date: 2013-09-17
Publication date: 2015-03-25

Abstract

The invention discloses a method for building an image retrieval database. The original target images to be trained are first preprocessed to form a training image set. During feature extraction on the training image set, feature points in text regions are removed, and the feature data corresponding to each original target image in the image retrieval database is generated from the feature points of the non-text regions. This effectively removes interfering key points in text regions and improves retrieval accuracy; removing the large number of interfering points also reduces the size of the retrieval database and therefore the computation time during real-time retrieval. In addition, preprocessing adds blurred images, which simulate out-of-focus captures of the original training images, and affine-transformed images to the training image set, so that the target image can still be found accurately in the retrieval database when the image captured by a camera is blurred by defocus or when the user's viewing angle exceeds 80 degrees.

Description

Image retrieval database building method

Technical field

The present invention relates to the field of image recognition, and in particular to a method for building an image retrieval database.

Background art

In content-based image retrieval, sample images must first be trained on the server side: features are extracted from the sample images to form an image retrieval feature database.

When image retrieval feature data is generated with the prior art, many target images contain a large amount of printed text. Because the pixel color distribution within text regions is highly similar, a large number of key points with near-identical key point descriptors are extracted. These key points strongly interfere with retrieval and can produce incorrect retrieval results.

In addition, image recognition is often hampered by imprecise feature matching. The imprecision is aggravated by affine transformation (the captured image appears deformed because the viewing angle or viewpoint changes) and by other distortions (for example, the features of an image change markedly when the image obtained at the capture end is blurred), which reduces correct matches and increases false matches.

Summary of the invention

The object of the present invention is to provide a method for building an image retrieval database that removes from the retrieval database the text-region features that tend to cause interference and keeps only the image features of non-text regions, thereby effectively eliminating the interference of similar text features with image retrieval results.

To achieve the above object, the present invention provides a method for building an image retrieval database, comprising: preprocessing the original target image to be trained to form a training image set; extracting feature points from each image in the training image set; segmenting each image in the training image set to obtain text regions and non-text regions; and rejecting the feature points whose pixel positions coincide with a text region, and generating from the remaining feature points the feature data corresponding to the original target image in the image retrieval database.

Correspondingly, the present invention also provides a method for building an image retrieval database, comprising: preprocessing the target image to be trained to form a training image set; segmenting each image in the training image set to obtain text regions and non-text regions; and extracting feature points from the non-text regions of each image in the training image set to generate the feature data corresponding to the target image in the image retrieval database.

The two methods differ as follows: the first method extracts feature points from each image in the training image set and then rejects the feature points that fall in text regions, whereas the second method extracts feature points only from the non-text regions of each image in the training image set and performs no feature extraction on text regions.

In both methods, segmenting each image in the training image set to obtain text regions and non-text regions further comprises: cutting the image recursively along its white space in both the vertical and the horizontal direction until a set of rectangular regions is obtained that cannot be cut any further; and, when the size of a single rectangular region is less than or equal to 6% of the whole training image, judging that rectangular region to be a text region, the remaining regions being non-text regions. The white space in the image comprises: page margins, column edges, indentation white space, white space where images adjoin text, and white space between words.

In both methods, the preprocessing of the original target image to be trained to form the training image set is preferably performed as follows: the original target image to be trained is processed with Gaussian blur to simulate out-of-focus imaging, yielding a blurred target image that approximates a retrieval image; affine transformation is applied in N directions to the original target image and to the blurred target image, yielding 2N new training images, where 2 ≤ N ≤ 8; and the 2N new training images, together with the original target image and the blurred target image, form the training image set.

In both methods, text recognition may additionally be performed on the text regions of the original target image, and the recognized text stored as the second retrieval data corresponding to the original target image in the image retrieval database.

Compared with the prior art, the present invention has the following beneficial effects:

1. The present invention removes from the retrieval database the text-region features that tend to cause interference and keeps only the image features of non-text regions, which effectively eliminates the interference of similar text features with image retrieval results. In addition, OCR can be applied to the text regions to extract their text, and this text can serve as an auxiliary feature for image retrieval in certain specific kinds of retrieval (such as business card retrieval);

2. By adding a blurred version of the original training image (the original target image to be trained) to the training image set, the present invention ensures that the correct target image can still be found in the retrieval database when the image captured by the camera of a handheld or wearable device is blurred by defocus (that is, captured without focusing). Likewise, adding affine-transformed images to the training image set allows the target image to be found accurately in the retrieval database even when the user's viewing angle is greater than 80 degrees.

Brief description of the drawings

To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort:

Fig. 1 is a flowchart of the method for building an image retrieval database according to embodiment one of the present invention;

Fig. 2 is a flowchart of the method for building an image retrieval database according to embodiment two of the present invention;

Fig. 3 is a schematic of an image to be segmented in an embodiment of the present invention;

Fig. 4 is a schematic of the rectangular regions that cannot be cut any further, obtained by segmenting Fig. 3.

Detailed description of the embodiments

The technical solutions of the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.

In content-based image recognition, when retrieval data is generated with the prior art, many target images contain a large amount of printed text. Because the pixel color distribution within text regions is highly similar, a large number of key points with near-identical key point descriptors are extracted. These key points strongly interfere with retrieval and can produce incorrect retrieval results.

The present invention proposes a method for building an image retrieval database that effectively rejects interfering key points in text regions. Rejecting this large number of interfering points also reduces the size of the retrieval database and hence the computation time during real-time retrieval.

Referring to Fig. 1, a flowchart of the method for building an image retrieval database according to embodiment one of the present invention, the method comprises the following steps:

S101: Preprocess the original target image to be trained to form a training image set. Specifically, the original target image to be trained is processed with Gaussian blur to simulate out-of-focus imaging, yielding a blurred target image that approximates a retrieval image; affine transformation is applied in N directions to the original target image and to the blurred target image, yielding 2N new training images, where 2 ≤ N ≤ 8; and the 2N new training images, together with the original target image and the blurred target image, form the training image set.

S102: Extract feature points from each image in the training image set. Feature detection and extraction here may use the SIFT algorithm or an improved variant of it, such as SURF, Ferns or ORB, and the features are extracted from the grayscale image. Each feature datum contains the key point's position information (its two-dimensional coordinates on the image plane, its scale and its orientation) and its descriptor.

S103: Segment each image in the training image set to obtain text regions and non-text regions. Specifically, the image is cut recursively along its white space (tab-stops) in both the vertical and the horizontal direction until a set of rectangular regions is obtained that cannot be cut any further (see Fig. 3 and Fig. 4: Fig. 3 is the image to be segmented and Fig. 4 is a schematic of the segmentation result). When the size of a single rectangular region is less than or equal to 6% of the whole training image, that region is judged to be a text region; the remaining regions are non-text regions. In this embodiment, when the ratio of a single character's size to the training image size is higher than 0.06, the region formed by that character is treated as image data. For example, in the image retrieval system of the present invention the training images used to train the retrieval database generally have a resolution of 512 × 512, so a character whose width or height is greater than 30 pixels (0.06 × 512 ≈ 30.7) can be treated as ordinary image data. Accordingly, when the page layout is analyzed and the segmentation result has been obtained, the rectangular regions whose width and/or height is greater than 30 are taken out; based on the image-plane coordinate ranges of these regions, the previously extracted features of the training image that lie outside these regions are rejected, and the remaining features are used to train the retrieval database.

The white space referred to in step S103 comprises: page margins, column edges, indentation white space, white space where images adjoin text, and white space between words. Before the image is cut, it may first be binarized to increase contrast so that the white space is clearly defined.

For the segmentation of each image in the training image set, reference may be made to a hybrid page layout analysis method based on tab-stop detection (Raymond W. Smith, "Hybrid Page Layout Analysis via Tab-Stop Detection", ICDAR 2009, pp. 241-245, IEEE Computer Society). That method is commonly the first processing step in OCR and segments the image into rectangles made up of text regions and non-text regions. The present invention uses physical page layout analysis rather than logical page layout analysis; that is, segmentation is not limited to plain-text images but is meant to handle any image that contains text (for example, page images from books, magazines, newspapers and reports), so that the text regions and non-text regions of the image are separated and processed separately.
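The recursive cutting and the 6% rule of step S103 can be sketched as follows. This is a minimal illustration, not the tab-stop method of the cited paper: it assumes a binarized image given as nested lists (1 for ink, 0 for white space), cuts only along fully blank rows and columns, and reads "size" as both width and height being at most 6% of the corresponding image dimension. The names `xy_cut` and `is_text_region` are hypothetical.

```python
from typing import List, Tuple

# (top, left, bottom, right); bottom and right are exclusive.
Box = Tuple[int, int, int, int]


def _runs(profile: List[bool]) -> List[Tuple[int, int]]:
    """Return the [start, end) index ranges where the profile contains ink."""
    runs, start = [], None
    for i, has_ink in enumerate(profile):
        if has_ink and start is None:
            start = i
        elif not has_ink and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def xy_cut(img: List[List[int]], box: Box) -> List[Box]:
    """Recursively cut `box` along blank rows, then blank columns."""
    top, left, bottom, right = box
    row_runs = _runs([any(img[y][left:right]) for y in range(top, bottom)])
    if not row_runs:
        return []  # the box holds nothing but white space
    if len(row_runs) > 1 or row_runs[0] != (0, bottom - top):
        boxes: List[Box] = []
        for a, b in row_runs:
            boxes += xy_cut(img, (top + a, left, top + b, right))
        return boxes
    col_runs = _runs([any(img[y][x] for y in range(top, bottom))
                      for x in range(left, right)])
    if len(col_runs) > 1 or col_runs[0] != (0, right - left):
        boxes = []
        for a, b in col_runs:
            boxes += xy_cut(img, (top, left + a, bottom, left + b))
        return boxes
    return [box]  # no blank row or column left: cannot be cut further


def is_text_region(box: Box, img_h: int, img_w: int, ratio: float = 0.06) -> bool:
    """Assumed reading of the 6% rule: small in both dimensions means text."""
    top, left, bottom, right = box
    return (bottom - top) <= ratio * img_h and (right - left) <= ratio * img_w
```

For a 512 × 512 training image the default ratio gives a limit of about 30.7 pixels, which matches the 30-pixel figure used in the embodiment.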

S104: Reject the feature points whose pixel positions coincide with a text region, and use the remaining feature points to generate the feature data corresponding to the original target image in the image retrieval database. Because a key point descriptor is computed from the pixels of the key point and its surrounding small patch, the many similar small patches within one image or across different images (for example, regions of printed text) give rise to identical or near-identical key point descriptors. Therefore, once the key point detector has finished, rejecting the key points whose pixel positions coincide with text regions avoids the corresponding false matches and also greatly reduces the computation time of the key point descriptor (text regions usually yield more key points).
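The rejection in step S104 amounts to a point-in-rectangle filter. The sketch below assumes each feature carries the fields listed in step S102 and that text regions are axis-aligned boxes; the `Feature` type and `reject_text_features` function are hypothetical names introduced for illustration.

```python
from typing import List, NamedTuple, Sequence, Tuple

# (top, left, bottom, right); bottom and right are exclusive.
Box = Tuple[int, int, int, int]


class Feature(NamedTuple):
    x: float                       # column coordinate on the image plane
    y: float                       # row coordinate on the image plane
    scale: float                   # key point scale
    angle: float                   # key point orientation
    descriptor: Tuple[float, ...]  # descriptor content


def reject_text_features(features: Sequence[Feature],
                         text_boxes: Sequence[Box]) -> List[Feature]:
    """Keep only the features that lie outside every text region."""
    def inside(f: Feature, box: Box) -> bool:
        top, left, bottom, right = box
        return left <= f.x < right and top <= f.y < bottom

    return [f for f in features
            if not any(inside(f, box) for box in text_boxes)]
```

Filtering right after detection, before descriptors are computed, is what saves the descriptor time mentioned above; the same filter also works on features that already carry descriptors.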

For certain special image retrieval systems, such as those for business cards and other images in which text is one of the important pieces of information, text recognition (for example by OCR) may also be performed on the text regions of the original target image, and the recognized text stored as the second retrieval data corresponding to the original target image in the image retrieval database.

Referring to Fig. 2, a flowchart of the method for building an image retrieval database according to embodiment two of the present invention, the method comprises the following steps:

S201: Preprocess the target image to be trained to form a training image set;

S202: Segment each image in the training image set to obtain text regions and non-text regions;

S203: Extract feature points from the non-text regions of each image in the training image set to generate the feature data corresponding to the target image in the image retrieval database. Feature detection in this step may use the SIFT algorithm or an improved variant of it, such as SURF, Ferns or ORB, and the features are extracted from the grayscale image.
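One way to restrict extraction to non-text regions, as step S203 requires, is to hand the detector a mask that is zero inside every text box. The sketch below only builds such a mask; the function name `non_text_mask` and the 255/0 convention are assumptions (the convention mirrors the 8-bit masks that detectors in libraries such as OpenCV accept), and the detector call itself is left out.

```python
from typing import List, Sequence, Tuple

# (top, left, bottom, right); bottom and right are exclusive.
Box = Tuple[int, int, int, int]


def non_text_mask(height: int, width: int,
                  text_boxes: Sequence[Box]) -> List[List[int]]:
    """Return a mask with 255 where features may be extracted, 0 in text regions."""
    mask = [[255] * width for _ in range(height)]
    for top, left, bottom, right in text_boxes:
        # Clamp each box to the image so out-of-range boxes are harmless.
        for y in range(max(top, 0), min(bottom, height)):
            for x in range(max(left, 0), min(right, width)):
                mask[y][x] = 0
    return mask
```

Compared with the filter-after-extraction approach of embodiment one, masking means that no detection or description work is spent on text regions at all.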

Step S201 of this embodiment is the same as step S101 of the first embodiment, and step S202 is the same as step S103 of the first embodiment; they are not described again here. The only difference between this embodiment and the previous one is that the first embodiment extracts feature points from each image in the training image set and then rejects the feature points that fall in text regions, whereas this embodiment extracts feature points only from the non-text regions of each image in the training image set and performs no feature extraction on text regions.

For certain special image retrieval systems (for example, systems for business cards and other images in which text is one of the important pieces of information), the sample images include many pictures with identical or similar image regions as well as many pictures with similar text (for example, business cards from the same company usually share a background image, company name and address). Given the limited accuracy of text recognition, retrieval by text alone cannot produce accurate results; performing general image retrieval first and text matching afterwards solves this problem. Moreover, when the retrieval database is built, because such images share a common background, only one image record needs to be stored for the business cards of one company or organization, which greatly reduces image retrieval time.

For a business card recognition system, processing may first proceed according to either of the two methods above, with the resulting non-text region features used to train the retrieval database. Then, on the basis of the image segmentation described above, text recognition (using OCR) is performed on each text region represented by a rectangle, and the resulting text is stored as the second retrieval data of that training image. In actual use of the retrieval system, the image retrieval data is searched first, and text matching is then performed within the resulting candidate set to obtain a unique, accurate result. This approach does, of course, require one OCR pass on the client over the target image to be retrieved, so that the text of the retrieval image can be matched against the text in the database.
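The second stage of this two-stage lookup, text matching within the candidate set returned by image retrieval, might look like the sketch below. The source does not say how text is compared; the use of `difflib.SequenceMatcher`, the `min_score` threshold of 0.5, and the record identifiers are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from typing import Dict, Iterable, Optional


def text_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]; tolerant of small OCR errors."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def refine_by_text(candidate_ids: Iterable[str],
                   query_text: str,
                   stored_text: Dict[str, str],
                   min_score: float = 0.5) -> Optional[str]:
    """Pick the candidate whose stored OCR text best matches the query text.

    `candidate_ids` is the result set of the image retrieval stage and
    `stored_text` maps each record id to its second retrieval data.
    Returns None when no candidate scores above `min_score`.
    """
    best_id, best_score = None, min_score
    for record_id in candidate_ids:
        score = text_similarity(query_text, stored_text.get(record_id, ""))
        if score > best_score:
            best_id, best_score = record_id, score
    return best_id
```

Because business cards from one company share a background, the image stage is expected to return several candidates, and the text stage is what separates them.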

The present invention removes from the retrieval database the text-region features that tend to cause interference and keeps only the image features of non-text regions, which effectively eliminates the interference of similar text features with image retrieval results. In addition, OCR can be applied to the text regions to extract their text, and this text can serve as an auxiliary feature for image retrieval in certain specific kinds of retrieval (such as business card retrieval), which effectively improves retrieval accuracy.

In addition, consider a real-time image retrieval system in which handheld and wearable devices act as clients that capture the retrieval image and retrieval runs on the server. In content-based image recognition, recognition is often hampered by imprecise feature matching, and the imprecision is aggravated by affine transformation (deformation of the captured image caused by a change of viewing angle or viewpoint) and by other distortions (for example, the features of an image change markedly when the image obtained at the capture end is blurred), which reduces correct matches and increases false matches. In systems where handheld and wearable devices capture the retrieval image, viewing angles above 60 degrees and blur caused by the user are very common. To address these two problems, the inventors found that, when the database of known image features is produced, suitable preprocessing of the images to be stored, carried out before feature point detection, improves retrieval accuracy.

Therefore, in steps S101 and S201, the original target image to be trained is preprocessed to form the training image set as follows. First, the original target image to be trained is processed with Gaussian blur to simulate out-of-focus imaging, yielding a blurred target image that approximates a retrieval image. Then affine transformation is applied in N directions to the original target image and to the blurred target image, yielding 2N new training images, where 2 ≤ N ≤ 8. The 2N new training images, together with the original target image and the blurred target image, form the training image set.
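The composition of the training image set can be sketched as below. The blur and the N transformations are passed in as callables because this sketch only illustrates how the 2N + 2 images are assembled; `build_training_set` and its parameters are hypothetical names.

```python
from typing import Callable, List, Sequence, TypeVar

Image = TypeVar("Image")


def build_training_set(original: Image,
                       blur: Callable[[Image], Image],
                       warps: Sequence[Callable[[Image], Image]]) -> List[Image]:
    """Return [original, blurred] followed by the 2N transformed images."""
    n = len(warps)
    if not 2 <= n <= 8:
        raise ValueError("the method requires 2 <= N <= 8 directions")
    blurred = blur(original)
    # Each of the N transformations is applied to both base images: 2N results.
    transformed = [warp(base) for base in (original, blurred) for warp in warps]
    return [original, blurred] + transformed
```

With N = 2 this produces 6 images per target; with N = 8 it produces 18.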

The Gaussian blur that simulates out-of-focus imaging is performed as follows:

First, the target image to be trained is converted to a grayscale image, and the normal distribution is then evaluated for each pixel of the image:

The normal distribution in N-dimensional space is:

G(r) = \frac{1}{\left(\sqrt{2\pi\sigma^{2}}\right)^{N}} e^{-r^{2}/(2\sigma^{2})} \qquad (1)

In two-dimensional space it is defined as:

G(r) = \frac{1}{2\pi\sigma^{2}} e^{-r^{2}/(2\sigma^{2})} \qquad (2)

Here r is the blur radius and σ is the standard deviation of the normal distribution. In two-dimensional space, the contour lines of the surface generated by this formula are concentric circles, normally distributed outward from the center. k is the size of the Gaussian kernel; the larger k is, the blurrier the resulting image. The relationship between σ and k is given by formula (3):

\sigma = \left(0.5\,(k - 1) - 1\right) \times 0.3 + 0.8 \qquad (3)

r is determined by k, as expressed by formula (4):

r = n - \frac{k - 1}{2} \qquad (4)

Here n ranges from 0 to k − 1. To simulate out-of-focus imaging and obtain a blurred target image that approximates a retrieval image, k is preferably an odd number between 13 and 21, that is, 13, 15, 17, 19 or 21. A k × 1 Gaussian matrix can be computed from formulas (2), (3) and (4).
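As a worked illustration of formulas (2), (3) and (4), the k × 1 Gaussian matrix can be computed as in the sketch below. The final normalization, which makes the weights sum to 1 so that blurring preserves overall brightness, is an assumption not stated in the text; it also cancels the constant factor of formula (2). The function names are hypothetical.

```python
import math
from typing import List


def sigma_from_kernel_size(k: int) -> float:
    """Formula (3): standard deviation derived from the kernel size k."""
    return (0.5 * (k - 1) - 1) * 0.3 + 0.8


def gaussian_kernel_1d(k: int) -> List[float]:
    """Build the k x 1 Gaussian matrix from formulas (2), (3) and (4)."""
    if k < 1 or k % 2 == 0:
        raise ValueError("k must be a positive odd number")
    sigma = sigma_from_kernel_size(k)
    weights = []
    for n in range(k):                 # n runs from 0 to k - 1
        r = n - (k - 1) / 2            # formula (4): offset from the center
        g = math.exp(-r * r / (2 * sigma * sigma)) / (2 * math.pi * sigma * sigma)
        weights.append(g)              # formula (2)
    total = sum(weights)
    return [w / total for w in weights]  # normalization (assumed)
```

For the preferred range, formula (3) gives σ = 2.3 at k = 13 and σ = 3.5 at k = 21, so larger kernels do produce stronger blur.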

Then, using the Gaussian matrix thus obtained, the gray value of every pixel in the original target image to be trained is convolved in the vertical direction and in the horizontal direction, yielding a blurred version of the image to be trained, that is, a blurred target image that approximates an out-of-focus capture of the original target image. When a retrieval image is captured with a smartphone camera and turns out to be fairly blurred, retrieval in the database is then more likely to match the blurred version of the target image. At this point a training image set consisting of the original training image and the blurred image has been obtained. The present invention further proposes affine transformation processing to increase the robustness of the image retrieval system.
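The two-pass convolution described here can be sketched in plain Python as follows. The image is assumed to be a grayscale array given as nested lists, and pixels beyond the border are taken from the nearest edge pixel; the border rule and the function names are assumptions, since the text does not specify them.

```python
from typing import List, Sequence

Gray = List[List[float]]


def convolve_rows(img: Gray, kernel: Sequence[float]) -> Gray:
    """Convolve every row with a 1-D kernel, replicating edge pixels."""
    half = len(kernel) // 2
    width = len(img[0])
    out = []
    for row in img:
        new_row = []
        for x in range(width):
            acc = 0.0
            for i, weight in enumerate(kernel):
                xx = min(max(x + i - half, 0), width - 1)  # clamp to the border
                acc += weight * row[xx]
            new_row.append(acc)
        out.append(new_row)
    return out


def transpose(img: Gray) -> Gray:
    return [list(col) for col in zip(*img)]


def separable_blur(gray: Gray, kernel: Sequence[float]) -> Gray:
    """Apply the 1-D kernel horizontally, then vertically."""
    horizontal = convolve_rows(gray, kernel)
    return transpose(convolve_rows(transpose(horizontal), kernel))
```

Running two one-dimensional passes costs on the order of 2k multiplications per pixel rather than k², which matters for the kernel sizes of 13 to 21 preferred above.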

Affine transformation is applied to the original target image and the blurred target image in N directions. Assuming N = 2, this is done as follows: from the original target image and the blurred target image, rotated images are generated in four orientations, namely upright, upside down, rotated 45 degrees clockwise, and rotated 45 degrees counterclockwise. Each rotated image is then given a viewing-angle transformation of 50 to 60 degrees in the horizontal direction and 10 to 20 degrees in the vertical direction (the longitudinal direction of the image plane); preferably 57 degrees horizontally and 13 degrees vertically. This yields the new training images after affine transformation. The transformation itself proceeds as follows: the cosine and sine of the viewing angle are computed and multiplied by the width and height of the source image (here, the image after rotation into one of the four orientations) to obtain the image-region coordinates after the viewing-angle transformation; a viewing-angle transformation matrix is computed from the source image-region coordinates and the transformed image-region coordinates; and the source image is transformed by this matrix to obtain the affine-transformed result image.
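The step of computing a transformation matrix from the source and transformed region coordinates can be sketched as a four-point solve. The text does not give the exact corner formula, so `illustrative_corners` below is purely an assumed reading (width scaled by the cosine of the horizontal angle, far edge shortened by the sine of the vertical angle); `perspective_matrix` itself is the standard solution of the 3 × 3 matrix from four point correspondences. All names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def perspective_matrix(src: Sequence[Point], dst: Sequence[Point]) -> List[List[float]]:
    """3x3 matrix mapping four source corners onto four destination corners."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], h[6:8] + [1.0]]


def apply_matrix(mat: List[List[float]], p: Point) -> Point:
    x, y = p
    d = mat[2][0] * x + mat[2][1] * y + mat[2][2]
    return ((mat[0][0] * x + mat[0][1] * y + mat[0][2]) / d,
            (mat[1][0] * x + mat[1][1] * y + mat[1][2]) / d)


def illustrative_corners(w: float, h: float,
                         horiz_deg: float, vert_deg: float) -> List[Point]:
    """Assumed destination corners; NOT a formula taken from the source."""
    new_w = w * math.cos(math.radians(horiz_deg))
    inset = 0.5 * h * math.sin(math.radians(vert_deg))
    return [(0.0, 0.0), (new_w, inset), (new_w, h - inset), (0.0, float(h))]
```

For a 512 × 512 image and the preferred angles, the source corners would be `[(0, 0), (512, 0), (512, 512), (0, 512)]` and the destination `illustrative_corners(512, 512, 57, 13)`; the resulting matrix would then be handed to an image-warping routine, which is outside this sketch.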

In this way, once a blurred version of the original training image (the original target image to be trained) has been added to the training image set, the correct target image can still be found in the retrieval database when the image captured by the camera of a handheld or wearable device is blurred by defocus (that is, captured without focusing). Likewise, adding affine-transformed images to the training image set allows the target image to be found accurately in the retrieval database even when the user's viewing angle is greater than 80 degrees.

All features disclosed in this specification, and all steps of any method or process disclosed, may be combined in any manner, except for mutually exclusive features and/or steps.

Any feature disclosed in this specification (including any accompanying claims, abstract and drawings) may, unless specifically stated otherwise, be replaced by another equivalent feature or an alternative feature serving a similar purpose. That is, unless specifically stated otherwise, each feature is only one example of a series of equivalent or similar features.

The present invention is not limited to the foregoing embodiments. It extends to any new feature or any new combination of features disclosed in this specification, and to the steps of any new method or process disclosed, or any new combination thereof.

Claims

1. A method for building an image retrieval database, characterized by comprising:

preprocessing the original target image to be trained to form a training image set;

extracting feature points from each image in the training image set;

segmenting each image in the training image set to obtain text regions and non-text regions;

rejecting the feature points whose pixel positions coincide with a text region, and generating from the remaining feature points the feature data corresponding to the original target image in the image retrieval database.

2. The method as claimed in claim 1, characterized in that segmenting each image in the training image set to obtain text regions and non-text regions further comprises:

cutting the image recursively along the white space in the image in both the vertical and the horizontal direction until a set of rectangular regions is obtained that cannot be cut any further;

when the size of a single rectangular region is less than or equal to 6% of the whole training image, judging that rectangular region to be a text region, the remaining regions being non-text regions.

3. The method as claimed in claim 2, characterized in that the white space in the image comprises: page margins, column edges, indentation white space, white space where images adjoin text, and white space between words.

4. The method as claimed in any one of claims 1 to 3, characterized in that preprocessing the original target image to be trained to form the training image set comprises:

processing the original target image to be trained with Gaussian blur to simulate out-of-focus imaging, obtaining a blurred target image that approximates a retrieval image;

applying affine transformation in N directions to the original target image and to the blurred target image, obtaining 2N new training images, where 2 ≤ N ≤ 8;

forming the training image set from the 2N new training images together with the original target image and the blurred target image.

5. The method as claimed in claim 4, characterized in that text recognition is performed on the text regions of the original target image, and the recognized text is stored as the second retrieval data corresponding to the original target image in the image retrieval database.

6. A method for building an image retrieval database, characterized by comprising:

preprocessing the target image to be trained to form a training image set;

segmenting each image in the training image set to obtain text regions and non-text regions;

extracting feature points from the non-text regions of each image in the training image set to generate the feature data corresponding to the target image in the image retrieval database.

7. The method as claimed in claim 6, characterized in that segmenting each image in the training image set to obtain text regions and non-text regions further comprises:

8. The method as claimed in claim 7, characterized in that the white space in the image comprises: page margins, column edges, indentation white space, white space where images adjoin text, and white space between words.

9. The method as claimed in any one of claims 6 to 8, characterized in that preprocessing the original target image to be trained to form the training image set comprises:

10. The method as claimed in claim 9, characterized in that text recognition is performed on the text regions of the original target image, and the recognized text is stored as the second retrieval data corresponding to the original target image in the image retrieval database.