CN109815948B - Test paper segmentation algorithm under complex scene - Google Patents


Info

Publication number
CN109815948B
Authority
CN
China
Prior art keywords
areas, area, dividing, image, test paper
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910031875.0A
Other languages
Chinese (zh)
Other versions
CN109815948A (en)
Inventor
李晓光
王麟腾
高猛
王世禹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Liaoning University
Original Assignee
Liaoning University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Liaoning University filed Critical Liaoning University
Priority to CN201910031875.0A priority Critical patent/CN109815948B/en
Publication of CN109815948A publication Critical patent/CN109815948A/en
Application granted granted Critical
Publication of CN109815948B publication Critical patent/CN109815948B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)
  • Character Input (AREA)

Abstract

A test paper segmentation algorithm under a complex scene comprises the following steps: 1) obtain the image edge from the original image through gray-level transformation and an edge detection algorithm; 2) compute the connected regions and their circumscribed rectangles from the image edge; 3) merge connected regions according to the IoU (intersection over union) between the circumscribed rectangles of different connected regions; 4) construct a character discrimination model based on a convolutional neural network to guide the merging of connected regions, and segment character regions from the circumscribed rectangles of the merged connected regions; 5) segment text-line regions from the result of step 4) by a text-line construction method; 6) construct a question-number discrimination model based on a convolutional neural network and, combined with the result of step 5), segment the test-question regions. The proposed segmentation algorithm performs well on the test paper segmentation task in complex scenes and segments character regions, text-line regions, and test-question regions well.

Description

Test paper segmentation algorithm under complex scene
Technical Field
The invention belongs to the field of electronic image segmentation technology and convolutional neural networks, and particularly relates to a test paper segmentation algorithm under a complex scene.
Background
With the development of intelligent education and artificial intelligence, segmenting test paper images captured by mobile devices has become a key step in many application scenarios. Because the shooting scene and angle of a mobile device are not fixed, the size, sharpness, and local exposure of the image are often uncertain, and the test paper itself presents complex conditions such as mixed handwritten and printed text. Thousands of image segmentation methods exist, but on the test paper segmentation task in complex scenes their segmentation of character regions, text-line regions, and test-question regions often fails to meet requirements.
Disclosure of Invention
To solve these problems, the invention provides a test paper segmentation algorithm for complex scenes that segments character regions, text-line regions, and test-question regions well in test paper images captured under complex conditions.
To achieve this, the technical scheme adopted by the invention is as follows: a test paper segmentation algorithm under a complex scene, characterized by comprising the following steps:
1) Obtain the image edge from the original image through gray-level transformation and an edge detection algorithm;
2) Compute the connected regions and their circumscribed rectangles from the image edge;
3) Merge connected regions according to the IoU between the circumscribed rectangles of different connected regions;
4) Construct a character discrimination model based on a convolutional neural network to guide the merging of connected regions, and segment character regions from the circumscribed rectangles of the merged connected regions;
5) Segment text-line regions from the result of step 4) by a text-line construction method;
6) Construct a question-number discrimination model based on a convolutional neural network and, combined with the result of step 5), segment the test-question regions.
In step 1), the specific method is as follows:
1.1) convert the original image O into a grayscale image G via color-space transformation;
1.2) extract the image edge E of G with an edge detection algorithm.
In step 2), the specific method is as follows:
2.1) extract the connected-region set A of E, where A consists of n small connected regions A_i, i.e. A = {A_i}, i ∈ [1, n], and A_i ≠ A_j when i ≠ j;
2.2) compute the circumscribed rectangle D_i of A_i, denoted D_i = F(A_i), where D_i = (X_i, Y_i, W_i, H_i) and X_i, Y_i, W_i, H_i are respectively the abscissa of the upper-left vertex, the ordinate of the upper-left vertex, the width, and the height of D_i.
In step 3), the specific method is as follows:
3.1) compute the IoU of D_i and D_j, where D_j ∈ {D_i | i ∈ [1, n], i ≠ j};
3.2) set a threshold α; if IoU > α, then D_i ← F(A_i ∪ A_j) and D ← D − {D_j}.
In step 4), the specific method is as follows:
4.1) construct a two-class character discrimination model M_1 with a convolutional neural network;
4.2) the discrimination result of M_1 is R_1 = M_1(D_i), where R_1 = 1 indicates that D_i is a character region and R_1 = 0 that it is not;
4.3) set the parameter β;
if R_1 = M_1(D_i) = 0 and K = {A_j | j ∈ [1, n], i ≠ j, and the center distance between D_i and D_j is less than β} is non-empty, compute R'_1 = M_1(F(A_i ∪ K)), where K is the set of elements A_j whose circumscribed rectangles have center distance < β from the circumscribed rectangle D_i of A_i;
if R'_1 = 0, then D ← D − {D_i}; otherwise D_i ← F(A_i ∪ K) and D ← D − K.
In step 5), the specific method is as follows:
5.1) group every two adjacent elements D_i and D_j of D, merge the groups until no further merging is possible, and denote the result L, i.e. L ← D;
5.2) set parameters γ and δ; D_i and D_j form a group only if both D_i Θ D_j and D_j Θ D_i hold, where D_i Θ D_j requires: (1) D_j is the element of D closest to D_i, with distance less than γ; (2) the vertical overlap of D_i and D_j is greater than δ.
In step 6), the specific method is as follows:
6.1) construct a two-class question-number discrimination model M_2 with a convolutional neural network;
6.2) the discrimination result of M_2 is R_2 = M_2(T_i), where R_2 = 1 indicates that T_i contains a question number and R_2 = 0 that it does not;
6.3) sort by Y_i in ascending order; take the H_i × H_i square region with (X_i, Y_i) as upper-left vertex, denoted T_i; compute R_2 = M_2(T_i); if R_2 = 1, merge {L_k | k ∈ [1, i−1]} and set L ← L − {L_k | k ∈ [1, i−1]}; the merged result is denoted Q.
The beneficial effects of the invention are as follows: by guiding the merging of connected regions with a character discrimination model based on a convolutional neural network and segmenting test-question regions with a question-number discrimination model based on a convolutional neural network, the method achieves good results on the test paper segmentation task in complex scenes, segmenting character regions, text-line regions, and test-question regions well.
Drawings
Fig. 1: flow chart of the algorithm of the invention.
Fig. 2: a test paper image in a complex scene.
Fig. 3: character-region segmentation result for the test paper image in a complex scene.
Fig. 4: text-line-region segmentation result for the test paper image in a complex scene.
Fig. 5: test-question-region segmentation result for the test paper image in a complex scene.
Detailed Description
1) Obtain the image edge from the original image through gray-level transformation and an edge detection algorithm. The specific method is as follows:
1.1) convert the original image O into a grayscale image G via color-space transformation;
1.2) extract the image edge E of G with an edge detection algorithm.
2) Compute the connected regions and their circumscribed rectangles from the image edge. The specific method is as follows:
2.1) extract the connected-region set A of E, where A consists of n small connected regions A_i, i.e. A = {A_i}, i ∈ [1, n], and A_i ≠ A_j when i ≠ j;
2.2) compute the circumscribed rectangle D_i of A_i, denoted D_i = F(A_i), where D_i = (X_i, Y_i, W_i, H_i) and X_i, Y_i, W_i, H_i are respectively the abscissa of the upper-left vertex, the ordinate of the upper-left vertex, the width, and the height of D_i.
3) Merge connected regions according to the IoU between the circumscribed rectangles of different connected regions. The specific method is as follows:
3.1) compute the IoU of D_i and D_j, where D_j ∈ {D_i | i ∈ [1, n], i ≠ j};
3.2) set a threshold α; if IoU > α, then D_i ← F(A_i ∪ A_j) and D ← D − {D_j}.
4) Construct a character discrimination model based on a convolutional neural network to guide the merging of connected regions, and segment character regions from the circumscribed rectangles of the merged connected regions. The specific method is as follows:
4.1) construct a two-class character discrimination model M_1 with a convolutional neural network;
4.2) the discrimination result of M_1 is R_1 = M_1(D_i), where R_1 = 1 indicates that D_i is a character region and R_1 = 0 that it is not;
4.3) set the parameter β, the maximum allowed distance between the centers of D_i and D_j, typically 10;
if R_1 = M_1(D_i) = 0 and K = {A_j | j ∈ [1, n], i ≠ j, and the center distance between D_i and D_j is less than β} is non-empty, compute R'_1 = M_1(F(A_i ∪ K)), where K is the set of elements A_j whose circumscribed rectangles have center distance < β from the circumscribed rectangle D_i of A_i;
if R'_1 = 0, then D ← D − {D_i}; otherwise D_i ← F(A_i ∪ K) and D ← D − K.
5) Segment text-line regions from the result of step 4) by a text-line construction method. The specific method is as follows:
5.1) group every two adjacent elements D_i and D_j of D, merge the groups until no further merging is possible, and denote the result L, i.e. L ← D;
5.2) set parameters γ and δ; D_i and D_j form a group only if both D_i Θ D_j and D_j Θ D_i hold, where D_i Θ D_j requires: (1) D_j is the element of D closest to D_i, with distance less than γ; (2) the vertical overlap of D_i and D_j is greater than δ. In text-line construction, γ is the maximum allowed value of the minimum distance between D_i and D_j, typically 10; δ is the minimum vertical overlap of D_i and D_j, typically 0.7.
6) Construct a question-number discrimination model based on a convolutional neural network and, combined with the result of step 5), segment the test-question regions. The specific method is as follows:
6.1) construct a two-class question-number discrimination model M_2 with a convolutional neural network;
6.2) the discrimination result of M_2 is R_2 = M_2(T_i), where R_2 = 1 indicates that T_i contains a question number and R_2 = 0 that it does not;
6.3) sort by Y_i in ascending order; take the H_i × H_i square region with (X_i, Y_i) as upper-left vertex, denoted T_i; compute R_2 = M_2(T_i); if R_2 = 1, merge {L_k | k ∈ [1, i−1]} and set L ← L − {L_k | k ∈ [1, i−1]}; the merged result is denoted Q.
Example 1:
The invention is further described below with reference to the accompanying drawings. The following example is only intended to illustrate the technical solution of the invention more clearly and should not be taken to limit its scope.
Fig. 2 shows a test paper image in a complex scene. The image edge is obtained through gray-level transformation and an edge detection algorithm; the connected regions and their circumscribed rectangles are computed from the image edge; connected regions are merged according to the IoU between the circumscribed rectangles of different connected regions; a character discrimination model based on a convolutional neural network is constructed to guide the merging of connected regions, and character regions are segmented from the circumscribed rectangles of the merged regions; text-line regions are segmented from the character-region result by a text-line construction method; and a question-number discrimination model based on a convolutional neural network is constructed to segment test-question regions from the text-line result. The flow chart of the algorithm is shown in Fig. 1. The concrete implementation comprises the following steps:
step one: converting fig. 2 into a gray image G through a color image space; and then extracting the edge of G through a Canny algorithm to generate an image edge E.
Step two: extract the connected-region set A of the edge image E, where A consists of n small connected regions A_i, i.e. A = {A_i}, i ∈ [1, n], and A_i ≠ A_j when i ≠ j; compute the circumscribed rectangle D_i of A_i, i.e. D_i = F(A_i), where D_i = (X_i, Y_i, W_i, H_i) and X_i, Y_i, W_i, H_i are respectively the abscissa of the upper-left vertex, the ordinate of the upper-left vertex, the width, and the height of D_i.
Step three: compute the IoU of D_i and D_j, where D_j ∈ {D_i | i ∈ [1, n], i ≠ j}; set the threshold α = 0.5; if IoU > α, then D_i ← F(A_i ∪ A_j) and D ← D − {D_j}.
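The IoU computation and the merge rule D_i ← F(A_i ∪ A_j), D ← D − {D_j} of step three can be sketched in pure Python over (x, y, w, h) rectangles:

```python
def iou(a, b):
    """Intersection over union of two (x, y, w, h) rectangles."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0

def union_rect(a, b):
    """F(A_i ∪ A_j) at the rectangle level: the circumscribed rectangle of both."""
    x = min(a[0], b[0]); y = min(a[1], b[1])
    X = max(a[0] + a[2], b[0] + b[2]); Y = max(a[1] + a[3], b[1] + b[3])
    return (x, y, X - x, Y - y)

def merge_by_iou(rects, alpha=0.5):
    """Repeatedly merge any pair with IoU > alpha, as in step three."""
    rects = list(rects)
    merged = True
    while merged:
        merged = False
        for i in range(len(rects)):
            for j in range(i + 1, len(rects)):
                if iou(rects[i], rects[j]) > alpha:
                    rects[i] = union_rect(rects[i], rects[j])  # D_i <- F(A_i ∪ A_j)
                    del rects[j]                               # D <- D - {D_j}
                    merged = True
                    break
            if merged:
                break
    return rects

out = merge_by_iou([(0, 0, 10, 10), (1, 1, 10, 10), (100, 100, 5, 5)])
```

The first two rectangles overlap with IoU ≈ 0.68 > 0.5 and are merged; the distant third one survives unchanged.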
Step four: construct the two-class character discrimination model M_1 with a convolutional neural network. M_1 consists of three convolution-pooling groups, a fully connected layer, and a softmax classifier.
Each convolution-pooling group consists of, in order: a 5×5 convolution layer (stride 1, padding 2), a BN layer, a ReLU activation layer, a 3×3 convolution layer (stride 1), a BN layer, a ReLU activation layer, and a 2×2 max-pooling layer. The three groups have 16, 32, and 64 channels, respectively. The softmax classifier outputs two classes, (0, 1), representing (non-character, character).
All parameters of M_1 are initialized from a normal distribution; the learning rate is set to 0.001, the number of training epochs to 10, and the Adam optimizer is used. The loss function is the cross-entropy loss L = −Σ_x q(x) log p(x), where p(x) is the model prediction and q(x) is the data label.
The input of M_1 is a single-channel image of 48×48 pixels. The training set is built by randomly generating 20,000 character images of 48×48 pixels with the regular-script (KaiTi), SongTi, Microsoft YaHei, and Times New Roman font libraries, and by randomly sampling 20,000 handwritten character images from the HWDB1.1 dataset and normalizing them to 48×48 pixels; 40,000 non-character images are then generated by randomly cropping partial regions from these 40,000 images. The 40,000 character images and 40,000 non-character images are processed as in step one and used to train M_1.
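The M_1 architecture described above can be sketched in PyTorch. The padding of the 3×3 convolutions is not stated in the text; zero padding is assumed here, which yields a 64×4×4 feature map (hence a 1024-unit fully connected input) for a 48×48 input. Softmax is left to the cross-entropy loss, as is conventional:

```python
import torch
import torch.nn as nn

def conv_pool_group(cin, cout):
    """5x5 conv (pad 2) + BN + ReLU, 3x3 conv (pad 0 assumed) + BN + ReLU, 2x2 max-pool."""
    return nn.Sequential(
        nn.Conv2d(cin, cout, 5, stride=1, padding=2), nn.BatchNorm2d(cout), nn.ReLU(),
        nn.Conv2d(cout, cout, 3, stride=1),           nn.BatchNorm2d(cout), nn.ReLU(),
        nn.MaxPool2d(2),
    )

class M1(nn.Module):
    """Two-class character discriminator: 3 conv-pool groups (16, 32, 64 channels) + FC."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            conv_pool_group(1, 16),    # 48 -> 48 -> 46 -> 23
            conv_pool_group(16, 32),   # 23 -> 23 -> 21 -> 10
            conv_pool_group(32, 64),   # 10 -> 10 ->  8 ->  4
        )
        self.fc = nn.Linear(64 * 4 * 4, 2)  # softmax applied inside the loss

    def forward(self, x):
        return self.fc(self.features(x).flatten(1))

model = M1()
logits = model(torch.zeros(1, 1, 48, 48))  # single-channel 48x48 input
```

Training as described would use `torch.optim.Adam(model.parameters(), lr=0.001)` with `nn.CrossEntropyLoss()` for 10 epochs.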
Set the parameter β = 10. If R_1 = M_1(D_i) = 0 and K = {A_j | j ∈ [1, n], i ≠ j, and the center distance between D_i and D_j is less than β} is non-empty, compute R'_1 = M_1(F(A_i ∪ K)); if R'_1 = 0, then D ← D − {D_i}; otherwise D_i ← F(A_i ∪ K) and D ← D − K. This produces the character-region segmentation image shown in Fig. 3.
Step five: group every two adjacent elements D_i and D_j of D, merge the groups until no further merging is possible, and denote the result L, i.e. L ← D; this produces the text-line segmentation image shown in Fig. 4.
The parameters are set to γ = 10 and δ = 0.7; D_i and D_j form a group only if both D_i Θ D_j and D_j Θ D_i hold, where D_i Θ D_j requires: (1) D_j is the element of D closest to D_i, with distance less than γ; (2) the vertical overlap of D_i and D_j is greater than δ.
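The grouping relation Θ of step five can be sketched in pure Python. Two assumptions are made that the text leaves open: "vertical overlap" is taken as the overlap of the y-intervals divided by the smaller height, and the iterative merging of groups is approximated with a union-find over mutually nearest pairs:

```python
import math

def center(r):
    return (r[0] + r[2] / 2, r[1] + r[3] / 2)

def dist(a, b):
    (ax, ay), (bx, by) = center(a), center(b)
    return math.hypot(ax - bx, ay - by)

def v_overlap(a, b):
    """Overlap of the y-intervals over the smaller height (assumed definition)."""
    top = max(a[1], b[1])
    bot = min(a[1] + a[3], b[1] + b[3])
    return max(0, bot - top) / min(a[3], b[3])

def theta(i, j, rects, gamma, delta):
    """D_i Θ D_j: D_j is D_i's nearest element, closer than gamma, overlapping more than delta."""
    others = [k for k in range(len(rects)) if k != i]
    nearest = min(others, key=lambda k: dist(rects[i], rects[k]))
    return (nearest == j and dist(rects[i], rects[j]) < gamma
            and v_overlap(rects[i], rects[j]) > delta)

def build_lines(rects, gamma=10, delta=0.7):
    """Union mutually related pairs (D_i Θ D_j and D_j Θ D_i) into text lines."""
    parent = list(range(len(rects)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for i in range(len(rects)):
        for j in range(len(rects)):
            if i != j and theta(i, j, rects, gamma, delta) and theta(j, i, rects, gamma, delta):
                parent[find(i)] = find(j)
    groups = {}
    for i in range(len(rects)):
        groups.setdefault(find(i), []).append(rects[i])
    return list(groups.values())

# Two characters on one line plus a distant outlier.
lines = build_lines([(0, 0, 8, 10), (10, 0, 8, 10), (100, 0, 8, 10)], gamma=15, delta=0.7)
```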
step six: construction of two-classification question number discrimination model M by convolutional neural network 2 ;M 2 The method consists of two convolution pooling groups, a full connection layer and a softmax classifier;
one of the convolution pooling groups consists of a 3×3 convolution layer, a step size of 1, a packing of 1, a BN layer, a RELU activation layer, a 3×3 convolution layer, a step size of 1, a BN layer, a RELU activation layer, and a 2×2 max pooling layer according to the sequence order; a first set 32 channels and a second set 64 channels in the two convolutionally pooled sets; the softmax classifier is divided into two classes of (0, 1) representations (non-question mark, question mark).
All parameters of M_2 are initialized from a uniform distribution; the learning rate is set to 0.001, the number of training epochs to 50, and the Adam optimizer is used. The loss function is the cross-entropy loss L = −Σ_x q(x) log p(x), where p(x) is the model prediction and q(x) is the data label.
model input is a single-channel 48X 48 pixel size image, a training set adopts a font library of regular script, song Ti, microsoft elegant black and Times New Roman to generate 6000 characters with 48X 48 pixel size, the characters with the characters including Chinese characters, english characters and digital characters, simultaneously randomly generates 6000 characters with 48X 48 pixel size and non-character characters, and the generated 12000 images are processed in a step one mode and then are used for M 2 Training;
according to Y i Ordered from small to large, in (X i ,Y i ) Taking H as the upper left vertex i ×H i The rectangular area of the size is marked as T i Calculating R 2 =M 2 (T i ) If R is 2 =1, then merge { L k |k∈[1,i-1]And L++L- { L k |k∈[1,i-1]And (3) marking the combined result as Q, and generating a test question segmentation area as shown in figure 5.

Claims (3)

1. A test paper segmentation algorithm under a complex scene, characterized by comprising the following steps:
1) obtaining the image edge from the original image through gray-level transformation and an edge detection algorithm;
2) computing the connected regions and their circumscribed rectangles from the image edge;
2.1) extracting the connected-region set A of E, where A consists of n small connected regions A_i, i.e. A = {A_i}, i ∈ [1, n], and A_i ≠ A_j when i ≠ j;
2.2) computing the circumscribed rectangle D_i of A_i, denoted D_i = F(A_i), where D_i = (X_i, Y_i, W_i, H_i) and X_i, Y_i, W_i, H_i are respectively the abscissa of the upper-left vertex, the ordinate of the upper-left vertex, the width, and the height of D_i;
3) merging connected regions according to the IoU between the circumscribed rectangles of different connected regions;
3.1) computing the IoU of D_i and D_j, where D_j ∈ {D_i | i ∈ [1, n], i ≠ j};
3.2) setting a threshold α; if IoU > α, then D_i ← F(A_i ∪ A_j) and D ← D − {D_j};
4) constructing a character discrimination model based on a convolutional neural network to guide the merging of connected regions, and segmenting character regions from the circumscribed rectangles of the merged connected regions;
4.1) constructing a two-class character discrimination model M_1 with a convolutional neural network;
4.2) the discrimination result of M_1 is R_1 = M_1(D_i), where R_1 = 1 indicates that D_i is a character region and R_1 = 0 that it is not;
4.3) setting the parameter β;
if R_1 = M_1(D_i) = 0 and K = {A_j | j ∈ [1, n], i ≠ j, and the center distance between D_i and D_j is less than β} is non-empty, computing R'_1 = M_1(F(A_i ∪ K));
where K is the set of elements A_j whose circumscribed rectangles have center distance < β from the circumscribed rectangle D_i of A_i;
if R'_1 = 0, then D ← D − {D_i}; otherwise D_i ← F(A_i ∪ K) and D ← D − K;
5) segmenting text-line regions from the result of step 4) by a text-line construction method;
6) constructing a question-number discrimination model based on a convolutional neural network and, combined with the result of step 5), segmenting the test-question regions;
6.1) constructing a two-class question-number discrimination model M_2 with a convolutional neural network;
6.2) the discrimination result of M_2 is R_2 = M_2(T_i), where R_2 = 1 indicates that T_i contains a question number and R_2 = 0 that it does not;
6.3) sorting by Y_i in ascending order; taking the H_i × H_i square region with (X_i, Y_i) as upper-left vertex, denoted T_i; computing R_2 = M_2(T_i); if R_2 = 1, merging {L_k | k ∈ [1, i−1]} and setting L ← L − {L_k | k ∈ [1, i−1]}; the merged result is denoted Q.
2. The test paper segmentation algorithm under a complex scene according to claim 1, wherein in step 1) the specific method is:
1.1) converting the original image O into a grayscale image G via color-space transformation;
1.2) extracting the image edge E of G with an edge detection algorithm.
3. The test paper segmentation algorithm under a complex scene according to claim 1, wherein in step 5) the specific method is:
5.1) grouping every two adjacent elements D_i and D_j of D, merging the groups until no further merging is possible, and denoting the result L, i.e. L ← D;
5.2) setting parameters γ and δ; D_i and D_j form a group only if both D_i Θ D_j and D_j Θ D_i hold, where D_i Θ D_j requires: (1) D_j is the element of D closest to D_i, with distance less than γ; (2) the vertical overlap of D_i and D_j is greater than δ.
CN201910031875.0A 2019-01-14 2019-01-14 Test paper segmentation algorithm under complex scene Active CN109815948B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910031875.0A CN109815948B (en) 2019-01-14 2019-01-14 Test paper segmentation algorithm under complex scene

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910031875.0A CN109815948B (en) 2019-01-14 2019-01-14 Test paper segmentation algorithm under complex scene

Publications (2)

Publication Number Publication Date
CN109815948A CN109815948A (en) 2019-05-28
CN109815948B true CN109815948B (en) 2023-05-30

Family

ID=66604206

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910031875.0A Active CN109815948B (en) 2019-01-14 2019-01-14 Test paper segmentation algorithm under complex scene

Country Status (1)

Country Link
CN (1) CN109815948B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111080664B (en) * 2019-12-30 2022-03-08 合肥联宝信息技术有限公司 Data processing method and device, computer storage medium and computer
CN111368848B (en) * 2020-05-28 2020-08-21 北京同方软件有限公司 Character detection method under complex scene
CN111652141B (en) * 2020-06-03 2023-05-05 广东小天才科技有限公司 Question segmentation method, device, equipment and medium based on question numbers and text lines
CN112560849B (en) * 2021-01-24 2021-08-20 中天恒星(上海)科技有限公司 Neural network algorithm-based grammar segmentation method and system
CN115880704B (en) * 2023-02-16 2023-06-16 中国人民解放军总医院第一医学中心 Automatic cataloging method, system, equipment and storage medium for cases

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103679168B (en) * 2012-08-30 2018-11-09 北京百度网讯科技有限公司 Detection method and detection device for character region
CN106603838A (en) * 2016-12-06 2017-04-26 深圳市金立通信设备有限公司 Image processing method and terminal
CN107067399A (en) * 2017-02-13 2017-08-18 杭州施强教育科技有限公司 A kind of paper image segmentation processing method
CN107895142A (en) * 2017-10-26 2018-04-10 湖南考神信息科技有限责任公司 A kind of the paper contents of test question automatic division method and system of view-based access control model mark

Also Published As

Publication number Publication date
CN109815948A (en) 2019-05-28

Similar Documents

Publication Publication Date Title
CN109815948B (en) Test paper segmentation algorithm under complex scene
CN109886121B (en) Human face key point positioning method for shielding robustness
CN106682629B (en) Identification algorithm for identity card number under complex background
CN105740909B (en) Text recognition method under a kind of natural scene based on spatial alternation
CN112183233A (en) Ship board identification method and system based on deep learning
CN108121991B (en) Deep learning ship target detection method based on edge candidate region extraction
CN111914935B (en) Ship image target detection method based on deep learning
CN112287941B (en) License plate recognition method based on automatic character region perception
CN109657612B (en) Quality sorting system based on facial image features and application method thereof
CN110866529A (en) Character recognition method, character recognition device, electronic equipment and storage medium
CN110188762B (en) Chinese-English mixed merchant store name identification method, system, equipment and medium
CN113344826B (en) Image processing method, device, electronic equipment and storage medium
CN109886978A (en) A kind of end-to-end warning information recognition methods based on deep learning
CN112926565B (en) Picture text recognition method, system, equipment and storage medium
CN111680690A (en) Character recognition method and device
CN112052845A (en) Image recognition method, device, equipment and storage medium
CN111353956A (en) Image restoration method and device, computer equipment and storage medium
CN110084136A (en) Context based on super-pixel CRF model optimizes indoor scene semanteme marking method
Li et al. Braille recognition using deep learning
CN111507356A (en) Segmentation method of handwritten characters of lower case money of financial bills
CN111242829A (en) Watermark extraction method, device, equipment and storage medium
CN112036290B (en) Complex scene text recognition method and system based on class mark coding representation
CN111126173A (en) High-precision face detection method
CN113065407B (en) Financial bill seal erasing method based on attention mechanism and generation countermeasure network
WO2022252613A1 (en) Method for identifying multiple types of lines in pdf on basis of desktop software by means of function fitting

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant