CN103679218A - Handwritten form keyword detection method - Google Patents

Handwritten form keyword detection method

Info

Publication number
CN103679218A
CN103679218A (Application CN201310582398.XA; granted as CN103679218B)
Authority
CN
China
Prior art keywords
keyword
image
sliding window
feature point
point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310582398.XA
Other languages
Chinese (zh)
Other versions
CN103679218B (en
Inventor
吕岳
张文超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
East China Normal University
Original Assignee
East China Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by East China Normal University filed Critical East China Normal University
Priority to CN201310582398.XA priority Critical patent/CN103679218B/en
Publication of CN103679218A publication Critical patent/CN103679218A/en
Application granted granted Critical
Publication of CN103679218B publication Critical patent/CN103679218B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Character Discrimination (AREA)

Abstract

The invention discloses a handwritten keyword detection method comprising the following steps: feature points of the keyword images in a keyword image database are extracted to build a keyword feature database; feature points of the text in the text image under test are extracted to obtain a feature-point database for that image; sliding windows of the text image are extracted, the feature-point set corresponding to each sliding window is retrieved from the feature-point database, and these sets are compared with the keyword feature database to obtain initial sets of matching point pairs; the initial matching point pairs are screened to obtain accurate matching point pairs; and the sliding windows of the text image are screened according to the matching point pairs and merged to produce the detection result. Feature extraction uses the SIFT representation. The method is applicable to keyword detection in large volumes of handwritten documents, such as historical literature, correspondence, and notes. Once a blacklist image database has been established, the keywords on the blacklist can be detected effectively, and the handwriting of different authors can also be distinguished.

Description

A handwritten keyword spotting method
Technical field
The present invention relates to image detection technology, and in particular to a handwritten keyword spotting method for a specific writer based on local character features.
Background technology
Handwritten keyword spotting means detecting specific keywords in large numbers of handwritten text images. Current handwritten keyword spotting methods usually perform character retrieval on top of character recognition.
However, Chinese has many character classes and highly variable writing styles. Existing recognition-based handwritten keyword detection methods first need to build a huge template library of Chinese characters and spend a great deal of time on feature training and classification; early-stage image preprocessing and text segmentation also strongly affect the character recognition results, and hence the keyword spotting results. Moreover, such recognition-based methods do not take the writing styles of different writers into account, so they cannot effectively identify handwritten text from a specific writer.
Summary of the invention
The present invention overcomes the inability of the prior art to identify handwritten text from a specific writer, and proposes a handwritten keyword spotting method.
The proposed handwritten keyword spotting method comprises the following steps. Step 1: obtain a keyword image library, extract the feature points of the keyword images in said library, and build a keyword feature database. Step 2: extract the feature points of the text in the text image under test to obtain a feature-point database of said text image. Step 3: extract sliding windows of the text image, retrieve the feature-point set corresponding to each sliding window from said feature-point database, and compare said feature-point set with said keyword feature database to obtain an initial set of matching point pairs. Step 4: screen said initial set of matching point pairs according to the geometric information of the characters in said sliding window to obtain accurate matching point pairs. Step 5: screen and merge the sliding windows of said text image according to said matching point pairs to obtain the detection result.
In the handwritten keyword spotting method proposed by the present invention, a feature point of the feature-point set in step 3 and a feature point of the keyword feature database are detected as an initial matching point pair when their differences are below the thresholds of the following formula:

|θ_s − θ_w| < p1,  p2 < σ_s/σ_w < 1/p2,  ‖d_s − d_w‖₂ < p3;

where θ_s, σ_s, and d_s are the orientation, scale, and descriptor vector of the keyword-image feature point, θ_w, σ_w, and d_w are the orientation, scale, and descriptor vector of the feature point of the text image under test, and p1, p2, p3 are thresholds.
In the handwritten keyword spotting method proposed by the present invention, step 4 screens said initial set of matching point pairs under geometric-information constraints of the characters, comprising the steps of:
Step b1: building a geometric-information constraint graph between said feature-point set and the feature-point set of said keyword image;
Step b2: screening said geometric-information constraint graph with a clique search algorithm, and deleting the mismatched pairs from said initial set of matching point pairs.
In the handwritten keyword spotting method proposed by the present invention, the constraint conditions of said geometric-information constraint graph are:

|(x_si − x_sj) − (x_wi − x_wj)| < p4 × Avg(σ_s)
|(y_si − y_sj) − (y_wi − y_wj)| < p5 × Avg(σ_s)

where x_s is the x coordinate of a feature point in the keyword image, x_w the x coordinate of a feature point in the sliding-window image, y_s the y coordinate of a feature point in the keyword image, y_w the y coordinate of a feature point in the sliding-window image, Avg(σ_s) the average feature-point scale of the keyword image, and p4 and p5 thresholds.
In the handwritten keyword spotting method proposed by the present invention, screening the sliding windows of said text image in step 5 comprises the steps of:
Step c1: if the centroid offset ratio between the matched points of said sliding window and the original keyword is greater than 0.15, delete the sliding window; otherwise continue to the next step;
Step c2: if the proportion of the sliding window's matched points lying in the left half deviates from the distribution proportion of the original keyword by more than 0.18, delete the sliding window; otherwise keep it.
The method detects handwritten keywords of a specific writer by computing image matching scores: features are extracted from the retained keyword images written by that person and from the text image under test, and a sliding window is moved across the text image to detect the specific handwritten keywords. Feature extraction uses the SIFT representation. The method is applicable to keyword spotting in large volumes of handwritten documents, such as historical literature, correspondence, and notes. Once a blacklist image database has been established, the keywords on the blacklist can be detected effectively, and the handwriting of different authors can also be distinguished.
Brief description of the drawings
Fig. 1 is the flowchart of the handwritten keyword spotting method of the present invention;
Fig. 2 is a schematic diagram of the text image under test in this embodiment;
Fig. 3 is the flowchart of the keyword spotting stage of the present invention;
Fig. 4 is a schematic diagram of the preliminary matching point pairs in this embodiment;
Fig. 5 is a schematic diagram of the geometric-information graph in this embodiment;
Fig. 6 is a schematic diagram of a detection result of this embodiment;
Fig. 7 is a schematic diagram of the centroid in a detection result of this embodiment;
Fig. 8 is a schematic diagram of mismatches in a detection result of this embodiment;
Fig. 9 is a schematic diagram of a detection result of this embodiment.
Embodiment
The present invention is described in further detail with reference to the following embodiments and drawings. Except for the content specifically mentioned below, the processes, conditions, and experimental methods for implementing the invention are general and common knowledge in the art, and the invention places no particular limitation on them.
As shown in Fig. 1, the method is divided into two stages: a feature-extraction stage and a keyword-spotting stage. In the feature-extraction stage, the feature-point sets of the keyword images in the keyword image library and the feature-point set of the text in the text image under test are extracted, forming the keyword feature-point database and the feature-point set under test, respectively. The keyword-spotting stage compares the two to obtain matching point pairs between the text and the keyword images, refines them by successive screening into the final matching point pairs, and thereby completes the keyword detection.
In the feature-extraction stage, SIFT features are first extracted from the keyword image library in batch. Each image can be described as a set of SIFT feature points S = {F_i}, where each feature point F_i consists of five items of data whose meanings are given in Table 1. The feature-point set of each keyword image is stored in a text file, forming the feature database.
Table 1: SIFT feature-point data
For the text image under test, line segmentation is performed first. The document under test is then divided into several sliding windows, each containing several characters. Preferably, when the SIFT feature-point set of each text line is extracted, the points are stored in ascending order of their x coordinate, to simplify the subsequent sliding-window extraction.
Because of the special repetitive structure of Chinese characters, matching the keyword image features directly against the whole text image performs very poorly; the result is almost entirely mismatched feature points. The text image is therefore cut up and a sliding window is used, narrowing the feature matching to the span of a few Chinese characters, which yields many more correct matches.
In this embodiment, a sliding window is represented by the set Wind(p, q), where p and q are the start and end coordinates of the window. Let the text-image width be W_d, the sliding-window width be w, the step length be s, and the number of windows be n. As shown in Fig. 2, the text image is segmented and the sliding windows are extracted according to the following three rules:
1. If W_d < w, then n = 1, p_1 = 0, q_1 = W_d;
2. otherwise, the window coordinates are computed according to formula (1): p_i = (i − 1) × s, q_i = p_i + w;
3. if q_i > W_d, set q_i = W_d; the final segment of insufficient width also becomes a window.
After the set of sliding windows is obtained, each feature-matching pass is performed only within one window at a time, starting from the left end of the text image and moving to the right, exactly like a window sliding from left to right. Matching inside window i uses only the text feature points whose x coordinate falls between p_i and q_i.
In the formula above, the window width w and step length s are given by formula (2):

w = W_s × Avg(σ_d) / Avg(σ_s),  s = w / 6   (2)

where W_s is the keyword-image width, Avg(σ_s) is the average scale of the keyword-image feature points, and Avg(σ_d) is the average scale of the text-image feature points. Adjusting the window width by feature-point scale adapts the method to scale variation between the images.
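The three segmentation rules and formula (2) can be sketched as follows. This is a minimal illustration, not the patent's implementation; the function and argument names are the editor's, and rule 2 is interpreted as windows starting every s pixels.

```python
def extract_windows(W_d, W_s, avg_sigma_s, avg_sigma_d):
    """Sketch of the sliding-window segmentation rules above.

    W_d: text-image width; W_s: keyword-image width;
    avg_sigma_s / avg_sigma_d: average SIFT feature-point scales of the
    keyword image and the text image. Names are illustrative.
    """
    # Formula (2): scale the window width by the ratio of average scales.
    w = W_s * avg_sigma_d / avg_sigma_s
    s = w / 6  # step length is one sixth of the window width

    # Rule 1: the whole line fits inside one window.
    if W_d < w:
        return [(0, W_d)]

    # Rule 2: windows start every s pixels; Rule 3: clamp the last window
    # to the image border (the short final segment is still a window).
    windows = []
    p = 0.0
    while p < W_d:
        q = min(p + w, W_d)
        windows.append((p, q))
        if q == W_d:
            break
        p += s
    return windows
```

For example, a 50-pixel line with a 30-pixel window and step 5 yields five windows, the last one clamped to the border.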
The keyword-spotting stage is the core of the method; its main flow is shown in Fig. 3. Within each sliding window, a Kd-tree is used to find, for every keyword-image feature point, its 5 nearest neighbors among the window's feature-point set, forming the initial matching point pairs.
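The nearest-neighbor step can be sketched as below. The patent uses a Kd-tree; this illustration substitutes a brute-force search over descriptor vectors, which returns the same pairs on small inputs. All names are the editor's.

```python
import math

def initial_matches(keyword_desc, window_desc, k=5):
    """For each keyword descriptor, pair it with its k nearest neighbours
    among the window's descriptors (brute force in place of a Kd-tree)."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    pairs = []
    for i, d in enumerate(keyword_desc):
        # indices of window descriptors sorted by distance to d
        order = sorted(range(len(window_desc)), key=lambda j: dist(d, window_desc[j]))
        for j in order[:k]:
            pairs.append((i, j))
    return pairs
```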
The matching point pairs are first screened using the orientation, scale, and related information of the feature points: if rotation, distortion, and scaling of the document image are ignored, the orientation and scale of two genuinely similar feature points cannot differ greatly. In addition, pairs whose descriptor vectors are far apart in Euclidean distance are deleted, because the descriptor vector summarizes the gradient statistics around the feature point. Formula (3) performs this preliminary screening:

|θ_s − θ_w| < p1,  p2 < σ_s/σ_w < 1/p2,  ‖d_s − d_w‖₂ < p3   (3)

where θ_s, σ_s, d_s and θ_w, σ_w, d_w are the orientation, scale, and descriptor vector of the keyword-image and window-image feature points, respectively, and p1, p2, p3 are the corresponding thresholds. The matches remaining after preliminary screening are shown in Fig. 4.
Fig. 4 shows a fragment of the text image: the rectangle marks the position of the sliding window, the keyword image is shown below, and thin lines connect the matched feature points of the two images.
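The three-part test of formula (3) can be sketched directly. The threshold values p1, p2, p3 are left to the caller, since the patent does not publish them; the function name is the editor's.

```python
import math

def passes_preliminary_screen(theta_s, sigma_s, d_s,
                              theta_w, sigma_w, d_w,
                              p1, p2, p3):
    """Formula (3): keep a candidate pair only if the orientation difference,
    the scale ratio, and the descriptor distance all fall within thresholds."""
    descriptor_dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(d_s, d_w)))
    return (abs(theta_s - theta_w) < p1
            and p2 < sigma_s / sigma_w < 1.0 / p2
            and descriptor_dist < p3)
```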
Because of the repetitive structure of Chinese characters, many mismatched points remain (Fig. 5). Building a geometric constraint from the spatial distribution of the feature points within the characters allows further mismatched pairs to be deleted. The first step is to build the geometric-constraint graph, as follows:
1. Let S = {s_1, …, s_m} be the keyword-image feature points and W = {w_1, …, w_m} the matched feature points in the window image. The geometric-constraint graph is an undirected graph G = (V, E). V = {v_1, …, v_m} is the vertex set of G, where v_i = (s_i, w_i): every matching pair in S and W corresponds to one vertex of G. E is the edge set of G.
2. The edges of G are added as follows: for any two matching pairs v_i = (s_i, w_i) and v_j = (s_j, w_j) that satisfy the constraint conditions

|(x_si − x_sj) − (x_wi − x_wj)| < p4 × Avg(σ_s)   (4)
|(y_si − y_sj) − (y_wi − y_wj)| < p5 × Avg(σ_s)

an edge is added between vertices v_i and v_j. Here x_s and x_w are the x coordinates of the feature point in the keyword image and the window image, and y_s and y_w the corresponding y coordinates (see Fig. 5 for an example). p4 and p5 are two thresholds; they are multiplied by the average feature-point scale of the keyword image to adapt to keyword images of different sizes. Although geometric-constraint graphs are a fairly common way of modeling geometric relations between images, different constraint conditions build different graphs; the specific constraint conditions proposed here, shown in formula (4), match the structural characteristics of written characters, are well suited to screening point pairs, and contribute greatly to the accuracy of the invention.
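The graph construction of formula (4) can be sketched as follows. Feature points are reduced to (x, y) coordinates; the function signature and names are the editor's.

```python
def build_constraint_graph(S, W, avg_sigma_s, p4, p5):
    """Formula (4): vertices are the matched pairs (S[i], W[i]); an edge joins
    two pairs whose relative displacements agree in both x and y to within
    p4/p5 times the keyword image's average feature-point scale.

    S, W: lists of (x, y) coordinates of the matched points."""
    m = len(S)
    edges = set()
    for i in range(m):
        for j in range(i + 1, m):
            # displacement of the pair in the keyword image vs. the window
            dx = (S[i][0] - S[j][0]) - (W[i][0] - W[j][0])
            dy = (S[i][1] - S[j][1]) - (W[i][1] - W[j][1])
            if abs(dx) < p4 * avg_sigma_s and abs(dy) < p5 * avg_sigma_s:
                edges.add((i, j))
    return edges
```

Two correct matches preserve their relative displacement between the images and get an edge; a mismatch shifts the displacement and stays disconnected.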
Once the geometric-constraint graph (GCG) is built, suppose all the feature points were matched correctly; then all vertices of the graph would be connected to each other, forming a tight clump. When mismatches exist, the task becomes finding a clique in the graph, i.e. a subgraph in which all vertices are mutually connected. This is realized with a clique search algorithm, whose concrete steps are:
1. initialize the candidate vertex set C = V and the clique vertex set M = ∅;
2. compute the degree of each vertex in V, recorded as the set Deg(V);
3. select the vertex v_0 of maximum Deg(V) in C and add it to M, i.e. M = M ∪ {v_0};
4. delete v_0 from C and keep in C only the vertices connected to v_0, i.e. C = C \ {v_0}, C = C ∩ N(v_0), where N(v_0) is the set of vertices adjacent to v_0;
5. if C = ∅, the algorithm terminates; otherwise go to step 3 and repeat.
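Steps 1 to 5 above can be sketched as a short greedy routine. Names are the editor's; degrees are computed once over the whole graph, as step 2 states.

```python
def greedy_clique(vertices, edges):
    """Greedy clique search following steps 1-5 above.

    vertices: iterable of hashable vertex ids;
    edges: set of undirected (a, b) pairs."""
    adj = {v: set() for v in vertices}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)

    C = set(vertices)                         # step 1: candidate set
    M = set()                                 # step 1: clique set
    deg = {v: len(adj[v]) for v in vertices}  # step 2: degrees in G
    while C:                                  # step 5: stop when C is empty
        v0 = max(C, key=lambda v: deg[v])     # step 3: max-degree vertex
        M.add(v0)
        C.discard(v0)                         # step 4: C = (C \ {v0}) ∩ N(v0)
        C &= adj[v0]
    return M
```

On a triangle {0, 1, 2} with a pendant vertex 3, the routine returns the triangle and discards the pendant.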
Building the geometric-constraint graph and running the clique search deletes a large number of mismatched pairs; Fig. 6 shows the result of applying the geometric constraints to the initial matches of Fig. 4. Compared with Fig. 4, many mismatched pairs have clearly been deleted. Comparing Fig. 8 and Fig. 9: the sliding window there covers the two characters "impelling", and because some strokes are similar, this window produces a large number of false matching pairs with the keyword image "food". After the mismatched pairs are deleted using the geometric information of the text, far fewer matching pairs remain in Fig. 9, showing that the invention can judge that the two are different character images. The geometric-constraint graph makes effective use of the structural information of the characters, can greatly improve the correctness of feature-point matching, and plays a central role in the method of the invention.
After the matches have been screened by the geometric constraints, each sliding window has a corresponding number of matched points, and windows with too few points are deleted. Because Chinese characters vary in complexity, their feature-point counts vary, and the match counts of different characters therefore differ greatly; an adaptive threshold is used to screen the windows. Formula (5) adjusts the threshold according to the number of SIFT feature points in the keyword image:

tn = k × S_fnum   (5)

where tn is the threshold, S_fnum is the feature-point count of the keyword image, and k takes the empirical value 0.25.
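Formula (5) reduces to a one-line check. The function name is the editor's, and whether a window exactly at the threshold survives is not specified by the patent; "at least tn" is assumed here.

```python
def window_passes_count(num_matches, keyword_feature_count, k=0.25):
    """Formula (5): a window survives only if it has at least
    tn = k * S_fnum matched points (k = 0.25 is the empirical value)."""
    return num_matches >= k * keyword_feature_count
```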
When only some of the keyword's characters appear in the text image, the number of matched points is often still high. The method therefore applies two further judgments: one based on centroid offset, the other based on point distribution.
When the feature-point set of a sliding window is matched against the keyword image's feature points, two point sets are obtained: the initial matches and the matches remaining after screening. If the offset between the x coordinate of the centroid of the matched points remaining after mismatch screening and that of the corresponding keyword-image feature points exceeds a threshold, the window is deleted; likewise, if the proportion of the remaining matched points lying in one side (left or right) of the keyword image deviates from the corresponding distribution proportion of the keyword image's own feature points by more than a threshold, the window is deleted. That is, the current window is deleted when

|XCent(S_0) − XCent(S_x)| > tc   (6)
|LPoi(S_0) − LPoi(S_x)| > tp

where XCent(S) is the x coordinate of the centroid of a feature-point set and LPoi(S) its proportion of points in the left half; tc and tp are two thresholds with empirical values 0.15 and 0.18. In Fig. 7, the two black circles mark the centroids of the keyword image: the centroid has moved from the middle position to the left, because after mismatch screening the matched points are almost all concentrated on the character "food". Although the number of matched points is still high, the centroid-offset judgment shows that the distribution of the matches deviates strongly from the keyword image, so this window is still deleted, and "property food" is not detected as "food", which reduces the error rate of the handwritten-text detection.
Finally, the remaining sliding windows are merged, because several overlapping windows may all contain the same keyword. When the overlap of two windows exceeds 60% of a whole window, they are merged into one. The merged windows are the final candidate keyword windows detected.
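The merging step can be sketched over (start, end) intervals. The patent does not say which window the 60% refers to; the narrower of the two is assumed here, and the names are the editor's.

```python
def merge_windows(windows, overlap_thresh=0.6):
    """Merge surviving windows: when the overlap of two windows exceeds
    60% of a (narrower) window's width, fuse them into one interval."""
    merged = []
    for p, q in sorted(windows):
        if merged:
            mp, mq = merged[-1]
            overlap = min(mq, q) - max(mp, p)
            width = min(mq - mp, q - p)  # assumption: compare to the narrower window
            if width > 0 and overlap / width > overlap_thresh:
                merged[-1] = (min(mp, p), max(mq, q))
                continue
        merged.append((p, q))
    return merged
```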
This embodiment was tested on 2764 text images written by 50 authors, roughly 50 texts each. 15 keywords were extracted, with 99 keyword positions to detect, and 10 groups of experiments were run. The results are evaluated with the recall rate R, the false-alarm rate E, and a new F value. Suppose W images are examined and S keywords are detected in total, of which Y are correct detections and N are false detections (S = Y + N), and T keywords should have been detected. The three criteria are computed by formula (7):

R = Y / T,  E = N / W,  F = 2 / (0.8 × (1/R) + 1.2 × (1/(1 − E)))   (7)

From formula (7), R reflects the method's ability to spot keywords and should be as high as possible; E should be as low as possible, introducing as few false judgments as possible when examining a large number of images, and reflects the method's power to discriminate keywords. The new F value balances the two standards: it is a harmonic mean of R and (1 − E) computed with different weights. The concrete results are given in Table 2.
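Formula (7) can be computed directly; the function name is the editor's, and R = Y/T follows the definition of Y as the number of correct detections.

```python
def evaluate(Y, N, W, T):
    """Formula (7): recall R = Y/T, false-alarm rate E = N/W, and the
    weighted harmonic-style F value combining R and (1 - E)."""
    R = Y / T
    E = N / W
    F = 2.0 / (0.8 * (1.0 / R) + 1.2 * (1.0 / (1.0 - E)))
    return R, E, F
```

A perfect run (all T keywords found, no false alarms) gives R = 1, E = 0, F = 1.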
Table 2: keyword spotting results of this embodiment

              R        E       F
Best value    91.92%   0.65%   94.91%
Worst value   85.86%   3.12%   93.18%
Mean value    88.64%   1.85%   94.09%
The protected content of the present invention is not limited to the above embodiments. Without departing from the spirit and scope of the inventive concept, variations and advantages that occur to those skilled in the art are all included in the present invention, with the appended claims as the scope of protection.

Claims (5)

1. A handwritten keyword spotting method, characterized by comprising the steps of:
Step 1: obtaining a keyword image library, extracting the feature points of the keyword images in said keyword image library, and building a keyword feature database;
Step 2: extracting the feature points of the text in a text image under test to obtain a feature-point database of said text image;
Step 3: extracting sliding windows of the text image, retrieving the feature-point set corresponding to each said sliding window from said feature-point database, and comparing said feature-point set with said keyword feature database to obtain an initial set of matching point pairs;
Step 4: screening said initial set of matching point pairs according to the geometric information of the characters in said sliding window to obtain accurate matching point pairs;
Step 5: screening and merging the sliding windows in said text image according to said matching point pairs to obtain a detection result.
2. The handwritten keyword spotting method of claim 1, characterized in that in said step 3 a feature point of said feature-point set and a feature point of said keyword feature database are detected as an initial matching point pair when their differences are below the thresholds of the following formula:

|θ_s − θ_w| < p1,  p2 < σ_s/σ_w < 1/p2,  ‖d_s − d_w‖₂ < p3;

where θ_s, σ_s, and d_s are the orientation, scale, and descriptor vector of the keyword-image feature point, θ_w, σ_w, and d_w are the orientation, scale, and descriptor vector of the feature point of the text image under test, and p1, p2, p3 are thresholds.
3. The handwritten keyword spotting method of claim 1, characterized in that in said step 4 said initial set of matching point pairs is screened under geometric-information constraints of the characters, comprising the steps of:
Step b1: building a geometric-information constraint graph between said feature-point set and the feature-point set of said keyword image;
Step b2: screening said geometric-information constraint graph with a clique search algorithm, and deleting the mismatched pairs from said initial set of matching point pairs.
4. The handwritten keyword spotting method of claim 3, characterized in that the constraint conditions of said geometric-information constraint graph are expressed by:

|(x_si − x_sj) − (x_wi − x_wj)| < p4 × Avg(σ_s)
|(y_si − y_sj) − (y_wi − y_wj)| < p5 × Avg(σ_s)

where x_s is the x coordinate of a feature point in the keyword image, x_w the x coordinate of a feature point in the sliding-window image, y_s the y coordinate of a feature point in the keyword image, y_w the y coordinate of a feature point in the sliding-window image, Avg(σ_s) the average feature-point scale of the keyword image, and p4 and p5 thresholds.
5. The handwritten keyword spotting method of claim 1, characterized in that screening the sliding windows in said text image in said step 5 comprises the steps of:
Step c1: if the centroid offset ratio between the matched points of said sliding window and the original keyword is greater than 0.15, deleting said sliding window, otherwise continuing to the next step;
Step c2: if the deviation between the proportion of said sliding window's matched points distributed in the left half and the distribution proportion of the original keyword is greater than 0.18, deleting said sliding window, otherwise keeping said sliding window.
CN201310582398.XA 2013-11-19 2013-11-19 A kind of handwritten form keyword detection method Active CN103679218B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310582398.XA CN103679218B (en) 2013-11-19 2013-11-19 A kind of handwritten form keyword detection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310582398.XA CN103679218B (en) 2013-11-19 2013-11-19 A kind of handwritten form keyword detection method

Publications (2)

Publication Number Publication Date
CN103679218A true CN103679218A (en) 2014-03-26
CN103679218B CN103679218B (en) 2017-01-04

Family

ID=50316706

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310582398.XA Active CN103679218B (en) 2013-11-19 2013-11-19 A kind of handwritten form keyword detection method

Country Status (1)

Country Link
CN (1) CN103679218B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104408191A (en) * 2014-12-15 2015-03-11 北京国双科技有限公司 Method and device for obtaining correlated keywords of keywords
CN110851605A (en) * 2019-11-14 2020-02-28 携程计算机技术(上海)有限公司 Detection method and system for image-text information matching of OTA hotel and electronic equipment
CN111931508A (en) * 2020-08-24 2020-11-13 上海携旅信息技术有限公司 Digital conversion method and system, text processing method and system, device and medium
CN112150464A (en) * 2020-10-23 2020-12-29 腾讯科技(深圳)有限公司 Image detection method and device, electronic equipment and storage medium
CN112199545A (en) * 2020-11-23 2021-01-08 湖南蚁坊软件股份有限公司 Keyword display method and device based on picture character positioning and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2166488A2 (en) * 2008-09-18 2010-03-24 Xerox Corporation Handwritten word spotter using synthesized typed queries
CN101819680A (en) * 2010-05-12 2010-09-01 上海交通大学 Detection method of picture matching point pair

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2166488A2 (en) * 2008-09-18 2010-03-24 Xerox Corporation Handwritten word spotter using synthesized typed queries
CN101819680A (en) * 2010-05-12 2010-09-01 上海交通大学 Detection method of picture matching point pair

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
PENG WANG ET AL.: "Exploring Interest Points and Local Descriptors", 《15TH INTERNATIONAL CONFERENCE, CAIP 2013》, 31 August 2013 (2013-08-31), pages 408 - 415, XP047038805, DOI: doi:10.1007/978-3-642-40246-3_51 *
ZHANG JIEYU ET AL.: "Printed image detection method based on SIFT feature point matching", Journal of Jiangnan University (Natural Science Edition), vol. 6, no. 6, 31 December 2007 (2007-12-31), pages 850 - 854 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104408191A (en) * 2014-12-15 2015-03-11 北京国双科技有限公司 Method and device for obtaining correlated keywords of keywords
CN104408191B (en) * 2014-12-15 2017-11-21 北京国双科技有限公司 The acquisition methods and device of the association keyword of keyword
CN110851605A (en) * 2019-11-14 2020-02-28 携程计算机技术(上海)有限公司 Detection method and system for image-text information matching of OTA hotel and electronic equipment
CN111931508A (en) * 2020-08-24 2020-11-13 上海携旅信息技术有限公司 Digital conversion method and system, text processing method and system, device and medium
CN111931508B (en) * 2020-08-24 2023-05-12 上海携旅信息技术有限公司 Digital conversion method and system, text processing method and system, equipment and medium
CN112150464A (en) * 2020-10-23 2020-12-29 腾讯科技(深圳)有限公司 Image detection method and device, electronic equipment and storage medium
CN112150464B (en) * 2020-10-23 2024-01-30 腾讯科技(深圳)有限公司 Image detection method and device, electronic equipment and storage medium
CN112199545A (en) * 2020-11-23 2021-01-08 湖南蚁坊软件股份有限公司 Keyword display method and device based on picture character positioning and storage medium
CN112199545B (en) * 2020-11-23 2021-09-07 湖南蚁坊软件股份有限公司 Keyword display method and device based on picture character positioning and storage medium

Also Published As

Publication number Publication date
CN103679218B (en) 2017-01-04

Similar Documents

Publication Publication Date Title
CN104766084B (en) A kind of nearly copy image detection method of multiple target matching
CN105205488B (en) Word area detection method based on Harris angle points and stroke width
CN103679218A (en) Handwritten form keyword detection method
Pal et al. Touching numeral segmentation using water reservoir concept
CN103049750B (en) Character identifying method
CN103914680B (en) A kind of spray printing character picture identification and check system and method
CN101719142B (en) Method for detecting picture characters by sparse representation based on classifying dictionary
CN101299236B (en) Method for recognizing Chinese hand-written phrase
CN103258037A (en) Trademark identification searching method for multiple combined contents
CN103593464A (en) Video fingerprint detecting and video sequence matching method and system based on visual features
CN103473545B (en) A kind of text image method for measuring similarity based on multiple features
CN104715254A (en) Ordinary object recognizing method based on 2D and 3D SIFT feature fusion
CN102388392A (en) Pattern recognition device
CN108520254A (en) A kind of Method for text detection, device and relevant device based on formatted image
Miller et al. A set of handwriting features for use in automated writer identification
Li et al. Detecting text lines in handwritten documents
CN104408711A (en) Multi-scale region fusion-based salient region detection method
CN106127243A (en) A kind of image matching method describing son based on binaryzation SIFT
CN104951788A (en) Extracting method of strokes of separate character in calligraphy work
Sharma et al. High‐level feature aggregation for fine‐grained architectural floor plan retrieval
CN107679401A (en) A kind of malicious web pages recognition methods and device
CN104835174A (en) Robustness model fitting method based on supermap mode search
CN106503706B (en) The method of discrimination of Chinese character pattern cutting result correctness
Thean et al. Textual summarisation of flowcharts in patent drawings for CLEF-IP 2012
Bhattacharya et al. Overwriting repetition and crossing-out detection in online handwritten text

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant