CN113743318A - Table structure identification method based on row and column division, storage medium and electronic device - Google Patents

Info

Publication number
CN113743318A
Authority
CN
China
Prior art keywords
row
column
distribution
features
feature map
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202111042986.5A
Other languages
Chinese (zh)
Inventor
孔令军
包云超
王茜雯
侯文涛
刘伟光
周耀威
闫佳艺
李华康
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jinling Institute of Technology
Original Assignee
Jinling Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jinling Institute of Technology
Priority to CN202111042986.5A
Publication of CN113743318A
Legal status: Withdrawn

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a table structure identification method based on row-column division, a storage medium and an electronic device. The method comprises: obtaining a table image; extracting a table feature map comprising row features and column features; processing the row features and the column features to obtain a row distribution and a column distribution of the table, respectively; and judging whether the regions of the row distribution and the column distribution overlap, wherein an overlapping region is a table cell and any other region is background. The invention simplifies table row and column prediction and makes the prediction more stable; the prediction of table rows and columns is completed in the same convolutional network, which facilitates debugging and deployment; and the row and column distributions are obtained first and the cell distribution is derived from them, so this bottom-up approach improves robustness.

Description

Table structure identification method based on row and column division, storage medium and electronic device
Technical Field
The invention belongs to the technical field of computer vision and artificial intelligence, and particularly relates to a table structure identification method based on row-column division, a storage medium and an electronic device.
Background
In daily life, a table is a general and common text object, and detecting and identifying tables in massive data has become a necessary and challenging task. Table detection and table structure identification together form the complete table identification task. The purpose of table detection is to locate the table region in a page, which many researchers define as a target detection problem. Table structure identification, whose goal is to obtain the structure information of the table, is a more difficult task than table detection. Early table structure identification studies were primarily based on heuristic rules, i.e., a series of rules was developed to detect tables that met specific conditions. However, table identification methods based on heuristic rules are difficult to design, are limited to particular scenes, and do not generalize well. At present, most researchers use deep learning methods such as target detection and image segmentation to identify the table structure. Given the special structure of a table, the frame lines between rows and columns can be used as the objects to be identified, but the frame lines occupy few pixels, which causes an imbalance between positive and negative samples. Some studies propose consistency assumptions for the table structure: all rows of the table start at the start of the first column and end at the end of the last column; all columns start at the start of the first row and end at the end of the last row. Under these assumptions, for the column features only the classification of the first row of pixels needs to be predicted and then expanded to obtain the whole column prediction map, and for the row features only the classification of the first column of pixels needs to be predicted.
Although this reduces the complexity of the row-column division, it has poor fault tolerance: a classification error at a single pixel position affects the whole prediction map.
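The fault-tolerance problem of the consistency-assumption approach can be illustrated with a minimal sketch. It is not taken from any cited study; the function name `expand_first_row`, the 4 × 6 image size and the label values are hypothetical and chosen only for illustration.

```python
import numpy as np

def expand_first_row(first_row_labels: np.ndarray, height: int) -> np.ndarray:
    """Expand a per-column label vector of length W into an H x W column map."""
    return np.tile(first_row_labels[np.newaxis, :], (height, 1))

# Hypothetical labels for the first row of pixels: 1 = column region, 0 = separator.
truth_first_row = np.array([1, 1, 0, 1, 1, 1])
pred_first_row = truth_first_row.copy()
pred_first_row[4] = 0  # a single misclassified pixel position

truth_map = expand_first_row(truth_first_row, height=4)
pred_map = expand_first_row(pred_first_row, height=4)

# The single 1-D error is copied into every row of the expanded map.
wrong_pixels = int((truth_map != pred_map).sum())
print(wrong_pixels)
```

Because the expansion simply copies the predicted row, one wrong position in the first row becomes one wrong pixel in every row of the expanded map.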
Disclosure of Invention
In view of the defects in the prior art, the present invention divides the table structure identification task into table row division and table column division tasks, and constructs complete table structure information from the resulting row and column information.
In a first aspect of the present invention, a method for identifying a table structure based on row-column division is provided, which includes the following steps,
S1, obtaining a table image;
S2, extracting a table feature map comprising row features and column features;
S3, processing the row features and the column features to obtain a row distribution and a column distribution of the table, respectively;
S4, judging whether the regions of the row distribution and the column distribution overlap, wherein an overlapping region is a table cell and any other region is background.
Further, the extracting of the row features and the column features of the table in step S2 is specifically to perform feature extraction by using a convolutional neural network based on deep learning as a backbone network, where the convolutional neural network is VGG, ResNet, or MobileNet.
Further, in step S3, specifically,
S31, respectively extracting the maximum value of each row and each column of the feature map on the channel dimension by using a network based on an attention mechanism;
S32, correspondingly generating a distribution of a column of pixels and a distribution of a row of pixels;
assuming that the input table feature map has a size of H × W × C, a row feature map F_row of size H × 1 × C and a column feature map F_col of size 1 × W × C are output;
[Formulas for F_row and F_col: images in the original publication, not reproduced]
S33, tiling the row feature map F_row and the column feature map F_col to obtain a row distribution and a column distribution, each of size H × W × C.
[Formulas for the tiled row and column distributions: images in the original publication, not reproduced]
In a second aspect of the present invention, a computer-readable storage medium is provided, in which a computer program is stored, wherein the computer program is configured to perform the method according to any one of the above-mentioned aspects when the computer program runs.
In a third aspect of the present invention, an electronic device is provided, comprising a memory and a processor, wherein the memory stores a computer program, and the processor is configured to execute the computer program to perform the method according to any one of the above technical solutions.
The invention has the following beneficial effects: the table row and column prediction is simplified, and the higher stability of the prediction is ensured; the prediction of table rows and columns is completed in the same convolutional network, so that the debugging and the deployment are facilitated; the table row and column distribution is obtained first, and then the table cell distribution is obtained, so that the robustness is improved by the bottom-up method.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart illustrating a method for identifying a table structure based on row-column division according to an embodiment of the present invention;
FIG. 2 is a schematic illustration of the row distribution obtained in the embodiment of FIG. 1;
FIG. 3 is a schematic illustration of the column distribution obtained in the embodiment of FIG. 1;
FIG. 4 is a schematic diagram of the cell distribution in the embodiment of FIG. 1.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only partial embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover non-exclusive inclusions, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
As shown in fig. 1, a first aspect of this embodiment is a method for identifying a table structure based on row-column division, including the following steps:
S1, acquiring a table image.
In the embodiment of the present invention, the picture including the table may be obtained by a scanner, a high-speed scanner, a digital camera, a mobile terminal with a camera, and the like, which is not limited in the present invention.
In the embodiment of the present invention, the picture may include contents such as tables, characters, and pictures, and colors of the background, the tables, and the characters in the picture may be white, black, red, yellow, blue, and the like, which is not limited in the present invention.
S2, extracting a table feature map comprising row features and column features.
In the embodiment of the invention, a table is an ordered organization formed by a plurality of rows and columns, and the intersection regions of the rows and the columns form the cells of the table. A cell distribution can therefore be constructed from the row distribution and the column distribution, which yields the structure of the table.
Specifically, the convolutional neural network based on deep learning is used as a backbone network for feature extraction, and the backbone network may be VGG, ResNet, MobileNet, or the like, which is not limited in this disclosure.
S3, processing the row features and the column features to obtain the row distribution and the column distribution of the table, respectively.
Specifically, the method comprises the following steps:
S31, slicing the table feature map, namely, respectively extracting the maximum value of each row and each column of the feature map on the channel dimension by using a network based on an attention mechanism;
S32, correspondingly generating a distribution of a column of pixels and a distribution of a row of pixels;
assuming that the input table feature map has a size of H × W × C, a row feature map F_row of size H × 1 × C and a column feature map F_col of size 1 × W × C are output;
[Formulas for F_row and F_col: images in the original publication, not reproduced]
S33, tiling the row feature map F_row and the column feature map F_col, i.e., copying F_row W times along the width axis and copying F_col H times along the height axis, to obtain a row distribution and a column distribution, each of size H × W × C.
[Formulas for the tiled row and column distributions: images in the original publication, not reproduced]
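The published formulas for steps S32 and S33 are not available in this text. One reconstruction that agrees with the stated sizes and with the copying described for the tiling step is given below. The index notation (i, j, c), the tilde symbols and the use of a plain maximum in place of the attention-based network are assumptions, not the published formulas.

```latex
\begin{aligned}
F_{row}(i, 1, c) &= \max_{1 \le j \le W} F(i, j, c), & F_{row} &\in \mathbb{R}^{H \times 1 \times C}, \\
F_{col}(1, j, c) &= \max_{1 \le i \le H} F(i, j, c), & F_{col} &\in \mathbb{R}^{1 \times W \times C}, \\
\tilde{F}_{row}(i, j, c) &= F_{row}(i, 1, c), & \tilde{F}_{row} &\in \mathbb{R}^{H \times W \times C}, \\
\tilde{F}_{col}(i, j, c) &= F_{col}(1, j, c), & \tilde{F}_{col} &\in \mathbb{R}^{H \times W \times C}.
\end{aligned}
```

Here F is the H × W × C table feature map, i indexes height, j indexes width and c indexes channels.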
The slicing operation simplifies the column division on each channel into the prediction of a row of elements, and the tiling operation restores the feature map to the size before slicing, so that on one hand, rough soft prediction is generated to guide the learning of a row and column prediction network, and on the other hand, error correction can be performed by means of the row and column prediction network to avoid generating large errors.
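The slicing and tiling operations can be sketched in NumPy as follows. This is a minimal sketch under stated assumptions, not the implementation of the invention: the function names `slice_rows_cols` and `tile_rows_cols` are hypothetical, the feature map is random data, and a plain maximum reduction stands in for the attention-based network mentioned in S31.

```python
import numpy as np

def slice_rows_cols(feature_map: np.ndarray):
    """Slice an H x W x C feature map into F_row (H x 1 x C) and F_col (1 x W x C).

    Assumption: a plain per-channel maximum replaces the attention-based network.
    """
    f_row = feature_map.max(axis=1, keepdims=True)  # maximum over the width axis
    f_col = feature_map.max(axis=0, keepdims=True)  # maximum over the height axis
    return f_row, f_col

def tile_rows_cols(f_row: np.ndarray, f_col: np.ndarray, height: int, width: int):
    """Copy F_row W times along the width axis and F_col H times along the height axis."""
    row_dist = np.tile(f_row, (1, width, 1))
    col_dist = np.tile(f_col, (height, 1, 1))
    return row_dist, col_dist

# Hypothetical 4 x 5 x 3 feature map filled with random values.
rng = np.random.default_rng(0)
fmap = rng.random((4, 5, 3))
f_row, f_col = slice_rows_cols(fmap)
row_dist, col_dist = tile_rows_cols(f_row, f_col, height=4, width=5)
print(f_row.shape, f_col.shape, row_dist.shape, col_dist.shape)
```

After tiling, every position in a given row of `row_dist` carries the same per-channel value, and every position in a given column of `col_dist` does likewise, which is what restores the H × W × C size.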
The feature values of the tiled feature map are normalized to the range 0 to 1 by Softmax; the row information stream and the column information stream are each added to the upsampled overall information stream. Finally, the normalized feature map is multiplied by the summed feature map to obtain the output feature map. For the column branch, for example, this operation extracts attention on the columns from the column information stream, suppresses irrelevant information, and applies that attention to the information stream enhanced by the overall information stream.
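The normalize, add and multiply sequence for one branch can be sketched as follows. The text does not state the axis over which Softmax is applied or how the overall stream is upsampled, so this sketch makes both explicit as assumptions: the Softmax axis is a parameter that defaults to the channel axis, and the overall stream is assumed to be already upsampled to H × W × C. The names `softmax` and `attend` are hypothetical.

```python
import numpy as np

def softmax(x: np.ndarray, axis: int) -> np.ndarray:
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

def attend(tiled: np.ndarray, branch_stream: np.ndarray,
           overall_stream: np.ndarray, axis: int = -1) -> np.ndarray:
    """Weight (branch stream + overall stream) by the softmax of the tiled map.

    Assumption: all three inputs already share the shape H x W x C.
    """
    attention = softmax(tiled, axis=axis)      # values in the range 0 to 1
    enhanced = branch_stream + overall_stream  # stream enhanced by the overall stream
    return attention * enhanced

# Hypothetical column branch on a 4 x 5 x 3 feature map.
rng = np.random.default_rng(0)
h, w, c = 4, 5, 3
col_stream = rng.random((h, w, c))
overall_stream = rng.random((h, w, c))
tiled_col = np.tile(col_stream.max(axis=0, keepdims=True), (h, 1, 1))
col_out = attend(tiled_col, col_stream, overall_stream)
print(col_out.shape)
```

The row branch would follow the same pattern with the tiled row map and the row information stream.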
S4, judging whether the regions of the row distribution and the column distribution overlap, wherein an overlapping region is a table cell and any other region is background.
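Step S4 can be sketched as a logical AND of two binary masks. This is an illustration only: the text does not say how the row and column distributions are binarized, so the sketch assumes they are already available as boolean masks, and the helper names `cell_mask` and `runs` are hypothetical.

```python
import numpy as np

def cell_mask(row_mask: np.ndarray, col_mask: np.ndarray) -> np.ndarray:
    """A pixel belongs to a cell where the row region and the column region overlap."""
    return np.logical_and(row_mask, col_mask)

def runs(flags: np.ndarray):
    """Return half-open (start, end) intervals of consecutive True values in a 1-D array."""
    padded = np.concatenate(([False], flags.astype(bool), [False]))
    changes = np.flatnonzero(padded[1:] != padded[:-1])
    return list(zip(changes[0::2].tolist(), changes[1::2].tolist()))

# Hypothetical 5 x 6 image: rows occupy y = 0..1 and 3..4; columns occupy x = 0, 2..3 and 5.
row_flags = np.array([1, 1, 0, 1, 1], dtype=bool)
col_flags = np.array([1, 0, 1, 1, 0, 1], dtype=bool)
row_mask = np.tile(row_flags[:, np.newaxis], (1, col_flags.size))
col_mask = np.tile(col_flags[np.newaxis, :], (row_flags.size, 1))

cells = cell_mask(row_mask, col_mask)

# Each (row interval, column interval) pair gives one cell box (y0, y1, x0, x1).
boxes = [(r0, r1, c0, c1)
         for r0, r1 in runs(row_flags)
         for c0, c1 in runs(col_flags)]
print(len(boxes), int(cells.sum()))
```

Pixels outside the intersection, such as those on the separator between the two rows, remain False in `cells` and are treated as background.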
In a second aspect of this embodiment, a computer-readable storage medium is provided, in which a computer program is stored, where the computer program is configured to execute the method in any one of the above technical solutions when the computer program is executed.
In a third aspect of the present embodiment, an electronic device is provided, which includes a memory and a processor, where the memory stores a computer program, and the processor is configured to execute the computer program to perform the method in any one of the above technical solutions.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (5)

1. The table structure identification method based on row-column division is characterized by comprising the following steps,
S1, obtaining a table image;
S2, extracting a table feature map comprising row features and column features;
S3, processing the row features and the column features to obtain a row distribution and a column distribution of the table, respectively;
S4, judging whether the regions of the row distribution and the column distribution overlap, wherein an overlapping region is a table cell and any other region is background.
2. The table structure recognition method of claim 1, wherein the extracting of the row features and the column features of the table in S2 specifically includes performing feature extraction by using a convolutional neural network based on deep learning as a backbone network, where the convolutional neural network is VGG, ResNet, or MobileNet.
3. The table structure recognition method according to claim 1, wherein step S3 is specifically,
S31, respectively extracting the maximum value of each row and each column of the feature map on the channel dimension by using a network based on an attention mechanism;
S32, correspondingly generating a distribution of a column of pixels and a distribution of a row of pixels;
assuming that the input table feature map has a size of H × W × C, a row feature map F_row of size H × 1 × C and a column feature map F_col of size 1 × W × C are output;
[Formulas for F_row and F_col: images in the original publication, not reproduced]
S33, tiling the row feature map F_row and the column feature map F_col to obtain a row distribution and a column distribution, each of size H × W × C.
[Formulas for the tiled row and column distributions: images in the original publication, not reproduced]
4. A computer-readable storage medium, in which a computer program is stored, wherein the computer program is configured to carry out the method of any one of claims 1 to 3 when executed.
5. An electronic device comprising a memory and a processor, wherein the memory has stored therein a computer program, and wherein the processor is arranged to execute the computer program to perform the method of any of claims 1 to 3.
CN202111042986.5A 2021-09-07 2021-09-07 Table structure identification method based on row and column division, storage medium and electronic device Withdrawn CN113743318A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111042986.5A CN113743318A (en) 2021-09-07 2021-09-07 Table structure identification method based on row and column division, storage medium and electronic device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111042986.5A CN113743318A (en) 2021-09-07 2021-09-07 Table structure identification method based on row and column division, storage medium and electronic device

Publications (1)

Publication Number Publication Date
CN113743318A true CN113743318A (en) 2021-12-03

Family

ID=78736459

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111042986.5A Withdrawn CN113743318A (en) 2021-09-07 2021-09-07 Table structure identification method based on row and column division, storage medium and electronic device

Country Status (1)

Country Link
CN (1) CN113743318A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115331245A (en) * 2022-10-12 2022-11-11 中南民族大学 Table structure identification method based on image instance segmentation
TWI806392B (en) * 2022-01-27 2023-06-21 國立高雄師範大學 Table detection method of table text

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI806392B (en) * 2022-01-27 2023-06-21 國立高雄師範大學 Table detection method of table text
CN115331245A (en) * 2022-10-12 2022-11-11 中南民族大学 Table structure identification method based on image instance segmentation
CN115331245B (en) * 2022-10-12 2023-02-03 中南民族大学 Table structure identification method based on image instance segmentation

Similar Documents

Publication Publication Date Title
US8712188B2 (en) System and method for document orientation detection
US8947736B2 (en) Method for binarizing scanned document images containing gray or light colored text printed with halftone pattern
US8693779B1 (en) Segmenting printed media pages into articles
CN109241861B (en) Mathematical formula identification method, device, equipment and storage medium
CN109635805B (en) Image text positioning method and device and image text identification method and device
JPH0721319A (en) Automatic determination device of asian language
CN112070649B (en) Method and system for removing specific character string watermark
CN113743318A (en) Table structure identification method based on row and column division, storage medium and electronic device
CN110443235B (en) Intelligent paper test paper total score identification method and system
US10423851B2 (en) Method, apparatus, and computer-readable medium for processing an image with horizontal and vertical text
CN111680690A (en) Character recognition method and device
US20190005325A1 (en) Identification of emphasized text in electronic documents
US10586125B2 (en) Line removal method, apparatus, and computer-readable medium
CN115035539B (en) Document anomaly detection network model construction method and device, electronic equipment and medium
CN114283156A (en) Method and device for removing document image color and handwriting
CN111461070A (en) Text recognition method and device, electronic equipment and storage medium
CN111626145A (en) Simple and effective incomplete form identification and page-crossing splicing method
CN114565927A (en) Table identification method and device, electronic equipment and storage medium
CN109948598B (en) Document layout intelligent analysis method and device
US20080310715A1 (en) Applying a segmentation engine to different mappings of a digital image
CN115797939A (en) Two-stage italic character recognition method and device based on deep learning
CN112580738B (en) AttentionOCR text recognition method and device based on improvement
CN114926829A (en) Certificate detection method and device, electronic equipment and storage medium
US10185885B2 (en) Tex line detection
CN113793264A (en) Archive image processing method and system based on convolution model and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20211203
