CN109815948B - Test paper segmentation algorithm under complex scene - Google Patents


Info

Publication number
CN109815948B
Authority
CN
China
Prior art keywords
areas, area, dividing, image, test paper
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910031875.0A
Other languages
Chinese (zh)
Other versions
CN109815948A (en)
Inventor
李晓光
王麟腾
高猛
王世禹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Liaoning University
Original Assignee
Liaoning University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Liaoning University filed Critical Liaoning University
Priority to CN201910031875.0A priority Critical patent/CN109815948B/en
Publication of CN109815948A publication Critical patent/CN109815948A/en
Application granted granted Critical
Publication of CN109815948B publication Critical patent/CN109815948B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)
  • Character Input (AREA)

Abstract

A test paper segmentation algorithm under a complex scene comprises the following steps: 1) obtain the image edge from the original image through gray-level transformation and an edge detection algorithm; 2) compute the connected regions and their circumscribed rectangles from the image edge; 3) merge connected regions according to the IoU (intersection over union) between the circumscribed rectangles of different connected regions; 4) construct a character discrimination model based on a convolutional neural network to guide the merging of connected regions, and segment character regions from the circumscribed rectangles of the merged connected regions; 5) segment text-line regions from the result of step 4) by a text-line construction method; 6) construct a question-number discrimination model based on a convolutional neural network and, combined with the result of step 5), segment the test-question regions. The proposed segmentation algorithm performs well on the test paper segmentation task in complex scenes and segments character regions, text-line regions, and test-question regions well.

Description

Test paper segmentation algorithm under complex scene
Technical Field
The invention belongs to the field of electronic image segmentation technology and convolutional neural networks, and particularly relates to a test paper segmentation algorithm under a complex scene.
Background
With the development of intelligent education and artificial intelligence, segmenting test paper images captured by mobile devices has become a key step in many application scenarios. Because the shooting scene and angle of a mobile device are not fixed, the size, sharpness, and local exposure of the image are often uncertain, and the test paper itself presents complex conditions such as mixed handwritten and printed text. Thousands of image segmentation methods exist, but on the test paper segmentation task in complex scenes their segmentation of character regions, text-line regions, and test-question regions often fails to meet requirements.
Disclosure of Invention
To solve these problems, the invention provides a test paper segmentation algorithm for complex scenes that segments character regions, text-line regions, and test-question regions well in test paper images captured under complex conditions.
To achieve this, the technical scheme adopted by the invention is as follows: a test paper segmentation algorithm under a complex scene, characterized by comprising the following steps:
1) Obtain the image edge from the original image through gray-level transformation and an edge detection algorithm;
2) Compute the connected regions and their circumscribed rectangles from the image edge;
3) Merge connected regions according to the IoU between the circumscribed rectangles of different connected regions;
4) Construct a character discrimination model based on a convolutional neural network to guide the merging of connected regions, and segment character regions from the circumscribed rectangles of the merged connected regions;
5) Segment text-line regions from the result of step 4) by a text-line construction method;
6) Construct a question-number discrimination model based on a convolutional neural network and, combined with the result of step 5), segment the test-question regions.
In step 1), the specific method is as follows:
1.1) convert the original image O into a grayscale image G via color-space transformation;
1.2) extract the image edge E of G with an edge detection algorithm.
In step 2), the specific method is as follows:
2.1) extract the connected-region set A of E, where A consists of n small connected regions A_i, i.e. A = {A_i}, i ∈ [1, n], and A_i ≠ A_j when i ≠ j;
2.2) compute the circumscribed rectangle D_i of A_i, denoted D_i = F(A_i), where D_i = (X_i, Y_i, W_i, H_i) and X_i, Y_i, W_i, H_i are respectively the abscissa of the upper-left vertex, the ordinate of the upper-left vertex, the width, and the height of D_i.
In step 3), the specific method is as follows:
3.1) compute the IoU of D_i and D_j, where D_j ∈ {D_i | i ∈ [1, n], i ≠ j};
3.2) set a threshold α; if IoU > α, then D_i ← F(A_i ∪ A_j) and D ← D − {D_j}.
In step 4), the specific method is as follows:
4.1) construct a two-class character discrimination model M_1 with a convolutional neural network;
4.2) the discrimination result of M_1 is R_1 = M_1(D_i), where R_1 = 1 indicates that D_i is a character region and R_1 = 0 that it is not;
4.3) set the parameter β;
if R_1 = M_1(D_i) = 0 and K = {A_j | j ∈ [1, n], i ≠ j, and the center distance between D_i and D_j is less than β} is non-empty, compute R'_1 = M_1(F(A_i ∪ K)), where K is the set of elements A_j whose circumscribed rectangles have center distance < β from the circumscribed rectangle D_i of A_i;
if R'_1 = 0, then D ← D − {D_i}; otherwise D_i ← F(A_i ∪ K) and D ← D − K.
In step 5), the specific method is as follows:
5.1) group every two adjacent elements D_i and D_j of D, merge the groups until no further merging is possible, and denote the result L, i.e. L ← D;
5.2) set parameters γ and δ; D_i and D_j form a group only if both D_i Θ D_j and D_j Θ D_i hold, where D_i Θ D_j requires: (1) D_j is the element of D closest to D_i, with distance less than γ; (2) the vertical overlap of D_i and D_j is greater than δ.
In step 6), the specific method is as follows:
6.1) construct a two-class question-number discrimination model M_2 with a convolutional neural network;
6.2) the discrimination result of M_2 is R_2 = M_2(T_i), where R_2 = 1 indicates that T_i contains a question number and R_2 = 0 that it does not;
6.3) sort by Y_i in ascending order; take the H_i × H_i square region with (X_i, Y_i) as upper-left vertex, denoted T_i; compute R_2 = M_2(T_i); if R_2 = 1, merge {L_k | k ∈ [1, i−1]} and set L ← L − {L_k | k ∈ [1, i−1]}; the merged result is denoted Q.
The beneficial effects of the invention are as follows: by guiding the merging of connected regions with a character discrimination model based on a convolutional neural network and segmenting test-question regions with a question-number discrimination model based on a convolutional neural network, the method achieves good results on the test paper segmentation task in complex scenes, segmenting character regions, text-line regions, and test-question regions well.
Drawings
Fig. 1: flow chart of the algorithm of the invention.
Fig. 2: a test paper image in a complex scene.
Fig. 3: character-region segmentation result for the test paper image in a complex scene.
Fig. 4: text-line-region segmentation result for the test paper image in a complex scene.
Fig. 5: test-question-region segmentation result for the test paper image in a complex scene.
Detailed Description
1) Obtain the image edge from the original image through gray-level transformation and an edge detection algorithm. The specific method is as follows:
1.1) convert the original image O into a grayscale image G via color-space transformation;
1.2) extract the image edge E of G with an edge detection algorithm.
2) Compute the connected regions and their circumscribed rectangles from the image edge. The specific method is as follows:
2.1) extract the connected-region set A of E, where A consists of n small connected regions A_i, i.e. A = {A_i}, i ∈ [1, n], and A_i ≠ A_j when i ≠ j;
2.2) compute the circumscribed rectangle D_i of A_i, denoted D_i = F(A_i), where D_i = (X_i, Y_i, W_i, H_i) and X_i, Y_i, W_i, H_i are respectively the abscissa of the upper-left vertex, the ordinate of the upper-left vertex, the width, and the height of D_i.
3) Merge connected regions according to the IoU between the circumscribed rectangles of different connected regions. The specific method is as follows:
3.1) compute the IoU of D_i and D_j, where D_j ∈ {D_i | i ∈ [1, n], i ≠ j};
3.2) set a threshold α; if IoU > α, then D_i ← F(A_i ∪ A_j) and D ← D − {D_j}.
4) Construct a character discrimination model based on a convolutional neural network to guide the merging of connected regions, and segment character regions from the circumscribed rectangles of the merged connected regions. The specific method is as follows:
4.1) construct a two-class character discrimination model M_1 with a convolutional neural network;
4.2) the discrimination result of M_1 is R_1 = M_1(D_i), where R_1 = 1 indicates that D_i is a character region and R_1 = 0 that it is not;
4.3) set the parameter β, the maximum allowed distance between the centers of D_i and D_j, typically 10;
if R_1 = M_1(D_i) = 0 and K = {A_j | j ∈ [1, n], i ≠ j, and the center distance between D_i and D_j is less than β} is non-empty, compute R'_1 = M_1(F(A_i ∪ K)), where K is the set of elements A_j whose circumscribed rectangles have center distance < β from the circumscribed rectangle D_i of A_i;
if R'_1 = 0, then D ← D − {D_i}; otherwise D_i ← F(A_i ∪ K) and D ← D − K.
5) Segment text-line regions from the result of step 4) by a text-line construction method. The specific method is as follows:
5.1) group every two adjacent elements D_i and D_j of D, merge the groups until no further merging is possible, and denote the result L, i.e. L ← D;
5.2) set parameters γ and δ; D_i and D_j form a group only if both D_i Θ D_j and D_j Θ D_i hold, where D_i Θ D_j requires: (1) D_j is the element of D closest to D_i, with distance less than γ; (2) the vertical overlap of D_i and D_j is greater than δ. In text-line construction, γ is the maximum allowed value of the minimum distance between D_i and D_j, typically 10; δ is the minimum vertical overlap of D_i and D_j, typically 0.7.
6) Construct a question-number discrimination model based on a convolutional neural network and, combined with the result of step 5), segment the test-question regions. The specific method is as follows:
6.1) construct a two-class question-number discrimination model M_2 with a convolutional neural network;
6.2) the discrimination result of M_2 is R_2 = M_2(T_i), where R_2 = 1 indicates that T_i contains a question number and R_2 = 0 that it does not;
6.3) sort by Y_i in ascending order; take the H_i × H_i square region with (X_i, Y_i) as upper-left vertex, denoted T_i; compute R_2 = M_2(T_i); if R_2 = 1, merge {L_k | k ∈ [1, i−1]} and set L ← L − {L_k | k ∈ [1, i−1]}; the merged result is denoted Q.
Example 1:
The invention is further described below with reference to the accompanying drawings. The following example is only intended to illustrate the technical solution of the invention more clearly and should not be taken to limit its scope.
Fig. 2 shows a test paper image in a complex scene. The image edge is obtained through gray-level transformation and an edge detection algorithm; the connected regions and their circumscribed rectangles are computed from the image edge; connected regions are merged according to the IoU between the circumscribed rectangles of different connected regions; a character discrimination model based on a convolutional neural network is constructed to guide the merging of connected regions, and character regions are segmented from the circumscribed rectangles of the merged regions; text-line regions are segmented from the character-region result by a text-line construction method; and a question-number discrimination model based on a convolutional neural network is constructed to segment test-question regions from the text-line result. The flow chart of the algorithm is shown in Fig. 1. The concrete implementation comprises the following steps:
step one: converting fig. 2 into a gray image G through a color image space; and then extracting the edge of G through a Canny algorithm to generate an image edge E.
Step two: extract the connected-region set A of the edge image E, where A consists of n small connected regions A_i, i.e. A = {A_i}, i ∈ [1, n], and A_i ≠ A_j when i ≠ j; compute the circumscribed rectangle D_i of A_i, i.e. D_i = F(A_i), where D_i = (X_i, Y_i, W_i, H_i) and X_i, Y_i, W_i, H_i are respectively the abscissa of the upper-left vertex, the ordinate of the upper-left vertex, the width, and the height of D_i.
Step three: compute the IoU of D_i and D_j, where D_j ∈ {D_i | i ∈ [1, n], i ≠ j}; set the threshold α = 0.5; if IoU > α, then D_i ← F(A_i ∪ A_j) and D ← D − {D_j}.
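The IoU computation and the merge rule D_i ← F(A_i ∪ A_j), D ← D − {D_j} of step three can be sketched in pure Python over (x, y, w, h) rectangles:

```python
def iou(a, b):
    """Intersection over union of two (x, y, w, h) rectangles."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0

def union_rect(a, b):
    """F(A_i ∪ A_j) at the rectangle level: the circumscribed rectangle of both."""
    x = min(a[0], b[0]); y = min(a[1], b[1])
    X = max(a[0] + a[2], b[0] + b[2]); Y = max(a[1] + a[3], b[1] + b[3])
    return (x, y, X - x, Y - y)

def merge_by_iou(rects, alpha=0.5):
    """Repeatedly merge any pair with IoU > alpha, as in step three."""
    rects = list(rects)
    merged = True
    while merged:
        merged = False
        for i in range(len(rects)):
            for j in range(i + 1, len(rects)):
                if iou(rects[i], rects[j]) > alpha:
                    rects[i] = union_rect(rects[i], rects[j])  # D_i <- F(A_i ∪ A_j)
                    del rects[j]                               # D <- D - {D_j}
                    merged = True
                    break
            if merged:
                break
    return rects

out = merge_by_iou([(0, 0, 10, 10), (1, 1, 10, 10), (100, 100, 5, 5)])
```

The first two rectangles overlap with IoU ≈ 0.68 > 0.5 and are merged; the distant third one survives unchanged.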
Step four: construct the two-class character discrimination model M_1 with a convolutional neural network. M_1 consists of three convolution-pooling groups, a fully connected layer, and a softmax classifier.
Each convolution-pooling group consists of, in order: a 5×5 convolution layer (stride 1, padding 2), a BN layer, a ReLU activation layer, a 3×3 convolution layer (stride 1), a BN layer, a ReLU activation layer, and a 2×2 max-pooling layer. The three groups have 16, 32, and 64 channels, respectively. The softmax classifier outputs two classes, (0, 1), representing (non-character, character).
All parameters of M_1 are initialized from a normal distribution; the learning rate is set to 0.001, the number of training epochs to 10, and the Adam optimizer is used. The loss function is the cross-entropy loss L = −Σ_x q(x) log p(x), where p(x) is the model prediction and q(x) is the data label.
The input of M_1 is a single-channel image of 48×48 pixels. The training set is built by randomly generating 20,000 character images of 48×48 pixels with the regular-script (KaiTi), SongTi, Microsoft YaHei, and Times New Roman font libraries, and by randomly sampling 20,000 handwritten character images from the HWDB1.1 dataset and normalizing them to 48×48 pixels; 40,000 non-character images are then generated by randomly cropping partial regions from these 40,000 images. The 40,000 character images and 40,000 non-character images are processed as in step one and used to train M_1.
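The M_1 architecture described above can be sketched in PyTorch. The padding of the 3×3 convolutions is not stated in the text; zero padding is assumed here, which yields a 64×4×4 feature map (hence a 1024-unit fully connected input) for a 48×48 input. Softmax is left to the cross-entropy loss, as is conventional:

```python
import torch
import torch.nn as nn

def conv_pool_group(cin, cout):
    """5x5 conv (pad 2) + BN + ReLU, 3x3 conv (pad 0 assumed) + BN + ReLU, 2x2 max-pool."""
    return nn.Sequential(
        nn.Conv2d(cin, cout, 5, stride=1, padding=2), nn.BatchNorm2d(cout), nn.ReLU(),
        nn.Conv2d(cout, cout, 3, stride=1),           nn.BatchNorm2d(cout), nn.ReLU(),
        nn.MaxPool2d(2),
    )

class M1(nn.Module):
    """Two-class character discriminator: 3 conv-pool groups (16, 32, 64 channels) + FC."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            conv_pool_group(1, 16),    # 48 -> 48 -> 46 -> 23
            conv_pool_group(16, 32),   # 23 -> 23 -> 21 -> 10
            conv_pool_group(32, 64),   # 10 -> 10 ->  8 ->  4
        )
        self.fc = nn.Linear(64 * 4 * 4, 2)  # softmax applied inside the loss

    def forward(self, x):
        return self.fc(self.features(x).flatten(1))

model = M1()
logits = model(torch.zeros(1, 1, 48, 48))  # single-channel 48x48 input
```

Training as described would use `torch.optim.Adam(model.parameters(), lr=0.001)` with `nn.CrossEntropyLoss()` for 10 epochs.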
Set the parameter β = 10. If R_1 = M_1(D_i) = 0 and K = {A_j | j ∈ [1, n], i ≠ j, and the center distance between D_i and D_j is less than β} is non-empty, compute R'_1 = M_1(F(A_i ∪ K)); if R'_1 = 0, then D ← D − {D_i}; otherwise D_i ← F(A_i ∪ K) and D ← D − K. This produces the character-region segmentation image shown in Fig. 3.
Step five: group every two adjacent elements D_i and D_j of D, merge the groups until no further merging is possible, and denote the result L, i.e. L ← D; this produces the text-line segmentation image shown in Fig. 4.
The parameters are set to γ = 10 and δ = 0.7; D_i and D_j form a group only if both D_i Θ D_j and D_j Θ D_i hold, where D_i Θ D_j requires: (1) D_j is the element of D closest to D_i, with distance less than γ; (2) the vertical overlap of D_i and D_j is greater than δ.
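The grouping relation Θ of step five can be sketched in pure Python. Two assumptions are made that the text leaves open: "vertical overlap" is taken as the overlap of the y-intervals divided by the smaller height, and the iterative merging of groups is approximated with a union-find over mutually nearest pairs:

```python
import math

def center(r):
    return (r[0] + r[2] / 2, r[1] + r[3] / 2)

def dist(a, b):
    (ax, ay), (bx, by) = center(a), center(b)
    return math.hypot(ax - bx, ay - by)

def v_overlap(a, b):
    """Overlap of the y-intervals over the smaller height (assumed definition)."""
    top = max(a[1], b[1])
    bot = min(a[1] + a[3], b[1] + b[3])
    return max(0, bot - top) / min(a[3], b[3])

def theta(i, j, rects, gamma, delta):
    """D_i Θ D_j: D_j is D_i's nearest element, closer than gamma, overlapping more than delta."""
    others = [k for k in range(len(rects)) if k != i]
    nearest = min(others, key=lambda k: dist(rects[i], rects[k]))
    return (nearest == j and dist(rects[i], rects[j]) < gamma
            and v_overlap(rects[i], rects[j]) > delta)

def build_lines(rects, gamma=10, delta=0.7):
    """Union mutually related pairs (D_i Θ D_j and D_j Θ D_i) into text lines."""
    parent = list(range(len(rects)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for i in range(len(rects)):
        for j in range(len(rects)):
            if i != j and theta(i, j, rects, gamma, delta) and theta(j, i, rects, gamma, delta):
                parent[find(i)] = find(j)
    groups = {}
    for i in range(len(rects)):
        groups.setdefault(find(i), []).append(rects[i])
    return list(groups.values())

# Two characters on one line plus a distant outlier.
lines = build_lines([(0, 0, 8, 10), (10, 0, 8, 10), (100, 0, 8, 10)], gamma=15, delta=0.7)
```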
step six: construction of two-classification question number discrimination model M by convolutional neural network 2 ;M 2 The method consists of two convolution pooling groups, a full connection layer and a softmax classifier;
one of the convolution pooling groups consists of a 3×3 convolution layer, a step size of 1, a packing of 1, a BN layer, a RELU activation layer, a 3×3 convolution layer, a step size of 1, a BN layer, a RELU activation layer, and a 2×2 max pooling layer according to the sequence order; a first set 32 channels and a second set 64 channels in the two convolutionally pooled sets; the softmax classifier is divided into two classes of (0, 1) representations (non-question mark, question mark).
All parameters of M_2 are initialized from a uniform distribution; the learning rate is set to 0.001, the number of training epochs to 50, and the Adam optimizer is used. The loss function is the cross-entropy loss L = −Σ_x q(x) log p(x), where p(x) is the model prediction and q(x) is the data label.
model input is a single-channel 48X 48 pixel size image, a training set adopts a font library of regular script, song Ti, microsoft elegant black and Times New Roman to generate 6000 characters with 48X 48 pixel size, the characters with the characters including Chinese characters, english characters and digital characters, simultaneously randomly generates 6000 characters with 48X 48 pixel size and non-character characters, and the generated 12000 images are processed in a step one mode and then are used for M 2 Training;
according to Y i Ordered from small to large, in (X i ,Y i ) Taking H as the upper left vertex i ×H i The rectangular area of the size is marked as T i Calculating R 2 =M 2 (T i ) If R is 2 =1, then merge { L k |k∈[1,i-1]And L++L- { L k |k∈[1,i-1]And (3) marking the combined result as Q, and generating a test question segmentation area as shown in figure 5.

Claims (3)

1. A test paper segmentation algorithm under a complex scene, characterized by comprising the following steps:
1) obtaining the image edge from the original image through gray-level transformation and an edge detection algorithm;
2) computing the connected regions and their circumscribed rectangles from the image edge;
2.1) extracting the connected-region set A of E, where A consists of n small connected regions A_i, i.e. A = {A_i}, i ∈ [1, n], and A_i ≠ A_j when i ≠ j;
2.2) computing the circumscribed rectangle D_i of A_i, denoted D_i = F(A_i), where D_i = (X_i, Y_i, W_i, H_i) and X_i, Y_i, W_i, H_i are respectively the abscissa of the upper-left vertex, the ordinate of the upper-left vertex, the width, and the height of D_i;
3) merging connected regions according to the IoU between the circumscribed rectangles of different connected regions;
3.1) computing the IoU of D_i and D_j, where D_j ∈ {D_i | i ∈ [1, n], i ≠ j};
3.2) setting a threshold α; if IoU > α, then D_i ← F(A_i ∪ A_j) and D ← D − {D_j};
4) constructing a character discrimination model based on a convolutional neural network to guide the merging of connected regions, and segmenting character regions from the circumscribed rectangles of the merged connected regions;
4.1) constructing a two-class character discrimination model M_1 with a convolutional neural network;
4.2) the discrimination result of M_1 is R_1 = M_1(D_i), where R_1 = 1 indicates that D_i is a character region and R_1 = 0 that it is not;
4.3) setting the parameter β;
if R_1 = M_1(D_i) = 0 and K = {A_j | j ∈ [1, n], i ≠ j, and the center distance between D_i and D_j is less than β} is non-empty, computing R'_1 = M_1(F(A_i ∪ K));
where K is the set of elements A_j whose circumscribed rectangles have center distance < β from the circumscribed rectangle D_i of A_i;
if R'_1 = 0, then D ← D − {D_i}; otherwise D_i ← F(A_i ∪ K) and D ← D − K;
5) segmenting text-line regions from the result of step 4) by a text-line construction method;
6) constructing a question-number discrimination model based on a convolutional neural network and, combined with the result of step 5), segmenting the test-question regions;
6.1) constructing a two-class question-number discrimination model M_2 with a convolutional neural network;
6.2) the discrimination result of M_2 is R_2 = M_2(T_i), where R_2 = 1 indicates that T_i contains a question number and R_2 = 0 that it does not;
6.3) sorting by Y_i in ascending order; taking the H_i × H_i square region with (X_i, Y_i) as upper-left vertex, denoted T_i; computing R_2 = M_2(T_i); if R_2 = 1, merging {L_k | k ∈ [1, i−1]} and setting L ← L − {L_k | k ∈ [1, i−1]}; the merged result is denoted Q.
2. The test paper segmentation algorithm under a complex scene according to claim 1, wherein in step 1) the specific method is:
1.1) converting the original image O into a grayscale image G via color-space transformation;
1.2) extracting the image edge E of G with an edge detection algorithm.
3. The test paper segmentation algorithm under a complex scene according to claim 1, wherein in step 5) the specific method is:
5.1) grouping every two adjacent elements D_i and D_j of D, merging the groups until no further merging is possible, and denoting the result L, i.e. L ← D;
5.2) setting parameters γ and δ; D_i and D_j form a group only if both D_i Θ D_j and D_j Θ D_i hold, where D_i Θ D_j requires: (1) D_j is the element of D closest to D_i, with distance less than γ; (2) the vertical overlap of D_i and D_j is greater than δ.
CN201910031875.0A 2019-01-14 2019-01-14 Test paper segmentation algorithm under complex scene Active CN109815948B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910031875.0A CN109815948B (en) 2019-01-14 2019-01-14 Test paper segmentation algorithm under complex scene

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910031875.0A CN109815948B (en) 2019-01-14 2019-01-14 Test paper segmentation algorithm under complex scene

Publications (2)

Publication Number Publication Date
CN109815948A CN109815948A (en) 2019-05-28
CN109815948B true CN109815948B (en) 2023-05-30

Family

ID=66604206

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910031875.0A Active CN109815948B (en) 2019-01-14 2019-01-14 Test paper segmentation algorithm under complex scene

Country Status (1)

Country Link
CN (1) CN109815948B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111080664B (en) * 2019-12-30 2022-03-08 合肥联宝信息技术有限公司 Data processing method and device, computer storage medium and computer
CN111368848B (en) * 2020-05-28 2020-08-21 北京同方软件有限公司 Character detection method under complex scene
CN111652141B (en) * 2020-06-03 2023-05-05 广东小天才科技有限公司 Question segmentation method, device, equipment and medium based on question numbers and text lines
CN112560849B (en) * 2021-01-24 2021-08-20 中天恒星(上海)科技有限公司 Neural network algorithm-based grammar segmentation method and system
CN115880704B (en) * 2023-02-16 2023-06-16 中国人民解放军总医院第一医学中心 Automatic cataloging method, system, equipment and storage medium for cases

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103679168B (en) * 2012-08-30 2018-11-09 北京百度网讯科技有限公司 Detection method and detection device for character region
CN106603838A (en) * 2016-12-06 2017-04-26 深圳市金立通信设备有限公司 Image processing method and terminal
CN107067399A (en) * 2017-02-13 2017-08-18 杭州施强教育科技有限公司 A kind of paper image segmentation processing method
CN107895142A (en) * 2017-10-26 2018-04-10 湖南考神信息科技有限责任公司 A kind of the paper contents of test question automatic division method and system of view-based access control model mark

Also Published As

Publication number Publication date
CN109815948A (en) 2019-05-28

Similar Documents

Publication Publication Date Title
CN109815948B (en) Test paper segmentation algorithm under complex scene
CN109886121B (en) Human face key point positioning method for shielding robustness
CN106682629B (en) Identification algorithm for identity card number under complex background
CN105740909B (en) Text recognition method under a kind of natural scene based on spatial alternation
CN112183233A (en) Ship board identification method and system based on deep learning
CN108121991B (en) Deep learning ship target detection method based on edge candidate region extraction
CN111914935B (en) Ship image target detection method based on deep learning
CN112287941B (en) License plate recognition method based on automatic character region perception
CN109657612B (en) Quality sorting system based on facial image features and application method thereof
CN110866529A (en) Character recognition method, character recognition device, electronic equipment and storage medium
CN110188762B (en) Chinese-English mixed merchant store name identification method, system, equipment and medium
CN113344826B (en) Image processing method, device, electronic equipment and storage medium
CN109886978A (en) A kind of end-to-end warning information recognition methods based on deep learning
CN112926565B (en) Picture text recognition method, system, equipment and storage medium
CN111680690A (en) Character recognition method and device
CN112052845A (en) Image recognition method, device, equipment and storage medium
CN111353956A (en) Image restoration method and device, computer equipment and storage medium
CN110084136A (en) Context based on super-pixel CRF model optimizes indoor scene semanteme marking method
Li et al. Braille recognition using deep learning
CN111507356A (en) Segmentation method of handwritten characters of lower case money of financial bills
CN111242829A (en) Watermark extraction method, device, equipment and storage medium
CN112036290B (en) Complex scene text recognition method and system based on class mark coding representation
CN111126173A (en) High-precision face detection method
CN113065407B (en) Financial bill seal erasing method based on attention mechanism and generation countermeasure network
WO2022252613A1 (en) Method for identifying multiple types of lines in pdf on basis of desktop software by means of function fitting

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant