CN110490193B - Single character area detection method and bill content identification method - Google Patents

Info

Publication number
CN110490193B
Authority
CN
China
Prior art keywords
bill
picture
region
identified
single character
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910668919.0A
Other languages
Chinese (zh)
Other versions
CN110490193A (en)
Inventor
张汉宁
苏斌
廖野
李煜
田福康
弋渤海
王长辉
杨宏德
张俊杰
方红超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shaanxi Taoding Information Technology Co ltd
Original Assignee
Xi'an Network Computing Data Technology Co ltd
Application filed by Xi'an Network Computing Data Technology Co ltd
Priority to CN201910668919.0A
Publication of CN110490193A
Application granted
Publication of CN110490193B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/62 Text, e.g. of license plates, overlay texts or captions on TV images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Character Input (AREA)
  • Image Analysis (AREA)

Abstract

The invention belongs to the technical field of intelligent bookkeeping and provides a single character region detection method and a bill content identification method. The detection method comprises: obtaining a field region picture to be identified and labeling the single character regions in it to obtain single character region pictures; scaling field region pictures of different sizes to a fixed size; obtaining a first-layer feature map through convolution and pooling operations; extracting a field region feature map through a VGG-Net16 network; setting initial detection frames, sending them into a softmax layer, and selecting proposal windows according to the output probability scores; pooling the proposal windows and normalizing them into feature vectors of fixed size and uniform dimension; and sending the feature vectors into a fully connected layer and calculating frame regression to obtain the frame offsets. This technical scheme addresses the low accuracy of bill content identification in the prior art.

Description

Single character area detection method and bill content identification method
Technical Field
The invention belongs to the technical field of intelligent bookkeeping, and relates to a single character region detection method and a bill content identification method.
Background
In the field of finance and taxation, bills of various types need to be scanned or photographed before accounting, and the important text content in the resulting bill pictures, such as the amount, the date and the name of the billing company, needs to be identified. When a scanner or other imaging device captures a bill picture, it often also captures a large amount of background information unrelated to the bill. In addition, external factors such as the variety of bill types, unclear printing and complex shooting scenes can blur or deform the content of the fields to be identified, which lowers the accuracy of bill content identification.
Disclosure of Invention
The invention provides a single character region detection method and a bill content identification method, and solves the problem of low bill content identification accuracy rate in the prior art.
The technical scheme of the invention is realized as follows. The single character region detection method comprises:
S10: obtaining a field region picture to be identified, and labeling the single character regions in the field region picture to be identified to obtain single character region pictures;
S11: scaling the field region pictures to be identified, which have different sizes, to a fixed size to obtain uniform-size pictures, and recording the height of a uniform-size picture as H pixels and its width as W pixels, so that the size of the uniform-size picture is H × W pixels;
S12: performing convolution and pooling operations on the uniform-size picture to obtain a first-layer feature map;
S13: extracting a field region feature map from the first-layer feature map through a VGG-Net16 network;
S14: setting, for each pixel of the field region feature map, M initial detection frames of different sizes and 4 corresponding offsets, wherein the 4 offsets comprise the center coordinates of the initial detection frame, the length of the initial detection frame and the width of the initial detection frame; sending the H × W × M initial detection frames into a softmax layer; and obtaining two probability scores for each initial detection frame;
S15: screening out the initial detection frames belonging to the foreground according to the probability scores;
S16: sorting the initial detection frames obtained in step S15 by probability score using a non-maximum suppression method, and selecting the first N results as the proposal output for single character regions, thereby completing the extraction of proposal windows;
S17: mapping the proposal windows onto the field region feature map, pooling the proposal windows through a region-of-interest (ROI) pooling layer, and normalizing the proposal windows of different sizes into feature vectors of fixed size and uniform dimension;
S18: sending the feature vectors into a fully connected layer, calculating frame regression with the Smooth L1 loss function, and outputting the frame offsets of the single character regions, thereby completing the detection of the single character regions.
Further, the criterion in step S15 for judging, according to the probability score, whether an initial detection frame belongs to the foreground or the background is as follows: when the IOU between an initial detection frame and the labeled single character region picture is greater than or equal to 0.8, the initial detection frame is judged to be foreground.
Further, the value range of M in step S14 is 8 to 10, and the value range of N in step S16 is 280 to 320.
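The frame regression of step S18 can be illustrated with a minimal sketch of the Smooth L1 loss. The transition point `beta=1.0`, the averaging over the four offsets, and the sample offset values are assumptions made for illustration; the source only names the loss function.

```python
def smooth_l1(pred, target, beta=1.0):
    """Mean Smooth L1 loss over paired offset values.

    Quadratic for small errors (|d| < beta), linear for large ones.
    beta=1.0 is an assumed default, not a value given in the source.
    """
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total / len(pred)


# Hypothetical predicted and labeled offsets (center x, center y, length, width).
pred_offsets = [0.2, -0.1, 1.5, 0.0]
true_offsets = [0.0, 0.0, 0.0, 0.0]
loss = smooth_l1(pred_offsets, true_offsets)
```

Compared with a plain squared error, the linear branch keeps a single badly placed frame from dominating the gradient during training.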
The invention also provides a bill content identification method, comprising:
S21: acquiring a bill picture set;
S22: using a picture labeling tool from the deep learning field to label the bill region of every bill picture in the bill picture set, labeling the field regions to be identified and the single character regions within each bill region, and storing the recorded information of the field regions to be identified; then randomly selecting 80% of the picture files in the labeled bill picture set to form a training sample set, with the remaining 20% forming a test sample set;
S23: counting the training samples by bill type, and constructing and expanding samples for bill types with fewer than 20 training samples, to obtain a training sample set balanced in number;
S24: taking the first 4 layers of the deep learning network VGG-Net16 as base network layers and combining them with a pyramid network to form the network structure of a bill region detection model; taking the bill pictures in the training sample set as the input of the bill region detection model and the labeled bill region data as its output; and training iteratively until the output accuracy of the bill region detection model on the test sample set is greater than a preset threshold, to obtain a trained bill region detection model;
S25: taking the first 4 layers of the deep learning network VGG-Net16 as base network layers and combining them with a pyramid network to form the network structure of a detection model for the field regions to be identified; taking the bill region labeled pictures in the training sample set as the input of this model and the labeled data of the field regions to be identified as its output; and training iteratively until the output accuracy of the model on the test sample set is greater than a preset threshold, to obtain a trained detection model for the field regions to be identified;
S26: detecting the single character regions in the field region picture to be identified according to steps S11 to S17, to obtain single character region images;
S27: taking VGG-Net16 as the network structure, the single character region images as input and the recorded information of the field regions to be identified as output, training a recorded-information identification model for the regions to be identified until its output accuracy on the test sample set is greater than a preset threshold, to obtain a trained recorded-information identification model;
S28: loading, in sequence, the trained bill region detection model file, the detection model file for the field regions to be identified and the recorded-information identification model file; starting a Web interface service for bill region segmentation; and returning the recorded information of each bill in Base64 encoding, thereby completing the identification of the bill content.
Further, the methods for expanding the training samples in step S23 include an image mixing method and a layer mixing method. The image mixing method is specifically: superimposing the sample bill picture and another bill background in a ratio of 6:4 to form a new picture, the new picture containing both the content of the sample bill picture and the other bill background.
The layer mixing method specifically comprises the following steps:
S231: opening a sample bill picture and a bill background picture in picture editing software;
S232: selecting, in the bill background picture, the selection area to be used for replacement, copying it to the layer of the sample bill picture, and recording it as the first selection area;
S233: resizing the first selection area to fit the sample bill picture, loading the first selection area, contracting it by 3 to 5 pixels, and deleting the corresponding selected region from the sample bill layer;
S234: selecting both the layer containing the sample bill and the layer containing the first selection area, and applying the automatic layer mixing command with the panorama option to obtain the layer-mixed picture, thereby completing the expansion of the sample bill.
Further, step S21 comprises:
S211: connecting a scanner to read the image information of the bill;
S212: processing the image information of the bill, including picture compression, picture enhancement, background removal and picture orientation correction.
The working principle and the beneficial effects of the invention are as follows:
1. By extracting the field region feature map, extracting proposal windows, normalizing the proposal windows into feature vectors of fixed size and finally completing the detection of single character regions, the invention facilitates the identification of character content. For example, suppose the amount on a bill is 23.4 yuan. The existing approach identifies all characters of the whole bill at once, and because the characters on a bill differ in size, font and print quality, the accuracy of identifying the whole bill directly is low. With the single character region detection method, the regions of the character 2, the character 3, the character 4 and the character yuan (元) are detected first, and character identification is then performed on each detected region separately, which is more targeted and yields higher identification accuracy.
Step S11 scales the field region pictures to be identified, which have different sizes, to a fixed size, and can be implemented with existing functions of OpenCV. Steps S12 to S13 extract the field region feature map. Step S14 sets a plurality of initial detection frames, and steps S15 to S16 then select the N initial detection frames closest to the actually labeled single character regions. Steps S17 to S18 jointly consider the N initial detection frames selected in step S16 to obtain the final single character regions.
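As an illustration of the fixed-size scaling in step S11, the following is a dependency-free sketch using nearest-neighbour sampling on a picture stored as a nested list. It is a stand-in only: the source states that OpenCV can be used, where a call such as `cv2.resize` would normally take its place, and the interpolation method and the tiny sample picture are assumptions.

```python
def resize_nearest(img, out_h, out_w):
    """Scale a 2-D picture (list of rows) to out_h x out_w pixels.

    Nearest-neighbour sampling is an assumed choice for this sketch;
    the source does not specify the interpolation method.
    """
    in_h, in_w = len(img), len(img[0])
    return [
        [img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


# Hypothetical 2 x 2 picture scaled to the fixed size H = 4, W = 4.
small = [[1, 2], [3, 4]]
fixed = resize_nearest(small, 4, 4)
```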
2. IOU stands for Intersection-over-Union, a concept from the field of target detection. Here the concern is the field region to be identified, which belongs to the foreground, so the initial detection frames belonging to the foreground are selected by comparing IOU values.
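A minimal sketch of the IOU comparison follows, using the threshold of 0.8 stated for step S15. The corner-coordinate box format `(x1, y1, x2, y2)`, the function names and the sample boxes are assumptions for illustration.

```python
def iou(a, b):
    """Intersection-over-Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_foreground(frame, labeled, threshold=0.8):
    """Judge a detection frame as foreground when IOU >= threshold."""
    return iou(frame, labeled) >= threshold


# Hypothetical labeled single character region and two candidate frames.
labeled = (10, 10, 30, 40)
close_frame = (11, 10, 30, 40)   # nearly coincides with the label
far_frame = (20, 10, 40, 40)     # overlaps only half of the label
```

Here `close_frame` has an IOU of 0.95 with the label and is kept, while `far_frame` has an IOU of about 0.33 and is treated as background.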
3. Fig. 1 is a schematic diagram of bill region labeling, labeling of the field regions to be identified, and single character region labeling. Each bill region is labeled with a rectangular frame that contains only one bill image, and each field region to be identified and each single character region is likewise labeled with its own rectangular frame.
The bill content identification method is based on the deep learning theory, and sequentially performs bill region detection, field region detection to be identified and single character region detection from a bill picture set, and after the single character region detection is completed, only records in the single character region are identified, so that the accuracy of character identification can be greatly improved, and the accuracy of the whole bill content identification is improved.
The invention constructs and expands a small number of training samples to ensure that the data of each type of bill is roughly the same, so that the learning accuracy is very high, the phenomenon that the characteristics of a certain type of bill cannot be learned is avoided, and the accurate identification of various bills is facilitated.
4. The image mixing method can be easily realized through graphic editing software such as Photoshop, and the expansion of rare samples can be completed; the layer mixing method can also realize the character replacement in the bill pictures in batch by using the scripting language of Photoshop software, so as to achieve the purpose of expanding rare samples. The training sample expansion method adopted in the invention can realize effective expansion of rare samples, and has the advantages of simple operation and strong practicability.
5. According to the invention, after the bill image information is obtained through the scanner, the bills with fuzzy content, shooting deformation and complex shooting scene are preprocessed, so that the bill information is easy to identify, and the accuracy of bill content identification is further improved.
Drawings
The present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
FIG. 1 is a schematic diagram of bill region labeling, labeling of the field region to be identified and single character region labeling in the present invention;
FIG. 2 is a flow chart of single character region detection according to the present invention;
FIG. 3 is a flow chart of bill content identification in the present invention;
in the figure: 1-bill picture set, 2-bill area, 3-field area to be identified and 4-single character area.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without inventive step based on the embodiments of the present invention, are within the scope of protection of the present invention.
As shown in Figs. 1 to 3, the single character region detection method comprises:
S10: obtaining a field region picture to be identified, and labeling the single character regions in the field region picture to be identified to obtain single character region pictures;
S11: scaling the field region pictures to be identified, which have different sizes, to a fixed size to obtain uniform-size pictures, and recording the height of a uniform-size picture as H pixels and its width as W pixels, so that the size of the uniform-size picture is H × W pixels;
S12: performing convolution and pooling operations on the uniform-size picture to obtain a first-layer feature map;
S13: extracting a field region feature map from the first-layer feature map through a VGG-Net16 network;
S14: setting, for each pixel of the field region feature map, 9 initial detection frames of different sizes and 4 corresponding offsets, wherein the 4 offsets comprise the center coordinates of the initial detection frame, the length of the initial detection frame and the width of the initial detection frame; sending the H × W × 9 initial detection frames into a softmax layer; and obtaining two probability scores for each initial detection frame;
S15: screening out the initial detection frames belonging to the foreground according to the probability scores;
S16: sorting the initial detection frames obtained in step S15 by probability score using a non-maximum suppression method, and selecting the first N results as the proposal output for single character regions, thereby completing the extraction of proposal windows;
S17: mapping the proposal windows onto the field region feature map, pooling the proposal windows through a region-of-interest (ROI) pooling layer, and normalizing the proposal windows of different sizes into feature vectors of fixed size and uniform dimension;
S18: sending the feature vectors into a fully connected layer, calculating frame regression with the Smooth L1 loss function, and outputting the frame offsets of the single character regions, thereby completing the detection of the single character regions.
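The placement of 9 initial detection frames at each feature-map position in step S14 can be sketched as follows. Using 3 scales and 3 aspect ratios to obtain the 9 frames, the stride of 16 and the `(cx, cy, w, h)` layout are assumptions borrowed from common region-proposal practice; the source only fixes the count of 9 and the four quantities per frame.

```python
from itertools import product


def make_initial_frames(feat_h, feat_w, stride=16,
                        scales=(8, 16, 32), ratios=(0.5, 1.0, 2.0)):
    """Return (cx, cy, w, h) frames, 9 per feature-map position.

    stride, scales and ratios are assumed values for this sketch.
    """
    frames = []
    for row, col in product(range(feat_h), range(feat_w)):
        # Center of this feature-map cell in input-picture coordinates.
        cx = (col + 0.5) * stride
        cy = (row + 0.5) * stride
        for scale, ratio in product(scales, ratios):
            w = scale * ratio ** 0.5   # wider for ratio > 1
            h = scale / ratio ** 0.5   # area stays scale * scale
            frames.append((cx, cy, w, h))
    return frames


# Hypothetical 2 x 3 feature map: 2 * 3 * 9 = 54 frames.
frames = make_initial_frames(feat_h=2, feat_w=3)
```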
By extracting the field region feature map, extracting proposal windows, normalizing the proposal windows into feature vectors of fixed size and finally completing the detection of single character regions, the invention facilitates the identification of character content. For example, suppose the amount on a bill is 23.4 yuan. The conventional approach identifies all characters of the whole bill at once, and because the characters on a bill differ in size, font and print quality, the accuracy of identifying the whole bill directly is low. With the single character region detection method, the regions of the character 2, the character 3, the character 4 and the character yuan (元) are detected first, and character identification is then performed on each detected region separately, which is more targeted and yields higher identification accuracy.
Step S11 scales the field region pictures to be identified, which have different sizes, to a fixed size, and can be implemented with existing functions of OpenCV. Steps S12 to S13 extract the field region feature map. Step S14 sets a plurality of initial detection frames, and steps S15 to S16 then select the N initial detection frames closest to the actually labeled single character regions. Steps S17 to S18 jointly consider the N initial detection frames selected in step S16 to obtain the final single character regions.
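The normalization in step S17, in which proposal windows of different sizes become feature vectors of the same length, can be sketched with a simple region-of-interest max pooling over a single-channel feature map. The 2 × 2 output grid, the half-open `(r1, c1, r2, c2)` window format and the sample feature map are assumptions for illustration.

```python
def roi_max_pool(feature, window, out_h=2, out_w=2):
    """Pool one proposal window into out_h * out_w values.

    feature: 2-D list (one channel of the feature map).
    window:  (r1, c1, r2, c2) in feature-map cells, end-exclusive.
    """
    r1, c1, r2, c2 = window
    h, w = r2 - r1, c2 - c1
    pooled = []
    for i in range(out_h):
        rs = r1 + i * h // out_h
        re = max(rs + 1, r1 + (i + 1) * h // out_h)  # at least one row per bin
        for j in range(out_w):
            cs = c1 + j * w // out_w
            ce = max(cs + 1, c1 + (j + 1) * w // out_w)
            pooled.append(max(feature[r][c]
                              for r in range(rs, re)
                              for c in range(cs, ce)))
    return pooled


feature_map = [[1, 2, 3, 4],
               [5, 6, 7, 8],
               [9, 10, 11, 12],
               [13, 14, 15, 16]]
large = roi_max_pool(feature_map, (0, 0, 4, 4))  # 4 x 4 window
small = roi_max_pool(feature_map, (1, 1, 4, 3))  # 3 x 2 window
```

Both windows produce a vector of four values, which is what allows a fully connected layer with a fixed input size to follow.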
Further, the criterion in step S15 for judging, according to the probability score, whether an initial detection frame belongs to the foreground or the background is as follows: when the IOU between an initial detection frame and the labeled single character region picture is greater than or equal to 0.8, the initial detection frame is judged to be foreground.
IOU stands for Intersection-over-Union, a concept from the field of target detection. Here the concern is the field region to be identified, which belongs to the foreground, so the initial detection frames belonging to the foreground are selected by comparing IOU values.
Further, the value range of N in step S16 is 280 to 320.
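The selection of the first N results in step S16 can be sketched as a greedy non-maximum suppression followed by a cut-off. The default `top_n=300` lies within the stated range of 280 to 320; the suppression threshold of 0.7, the box format and the sample data are assumptions, since the source does not give them.

```python
def iou(a, b):
    """Intersection-over-Union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def nms_top_n(boxes, scores, overlap_threshold=0.7, top_n=300):
    """Return indices of kept boxes, highest score first.

    overlap_threshold is an assumed value for this sketch.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        # Keep a box only if it does not heavily overlap a better one.
        if all(iou(boxes[i], boxes[k]) <= overlap_threshold for k in kept):
            kept.append(i)
        if len(kept) == top_n:
            break
    return kept


# Hypothetical foreground frames and their probability scores.
boxes = [(0, 0, 10, 10), (1, 0, 11, 10), (20, 0, 30, 10)]
scores = [0.9, 0.8, 0.7]
proposals = nms_top_n(boxes, scores)
```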
The invention also provides a bill content identification method, comprising:
S21: acquiring a bill picture set;
S22: using a picture labeling tool from the deep learning field to label the bill region of every bill picture in the bill picture set, labeling the field regions to be identified and the single character regions within each bill region, and storing the recorded information of the field regions to be identified; then randomly selecting 80% of the picture files in the labeled bill picture set to form a training sample set, with the remaining 20% forming a test sample set;
S23: counting the training samples by bill type, and constructing and expanding samples for bill types with fewer than 20 training samples, to obtain a training sample set balanced in number;
S24: taking the first 4 layers of the deep learning network VGG-Net16 as base network layers and combining them with a pyramid network to form the network structure of a bill region detection model; taking the bill pictures in the training sample set as the input of the bill region detection model and the labeled bill region data as its output; and training iteratively until the output accuracy of the bill region detection model on the test sample set is greater than a preset threshold, to obtain a trained bill region detection model;
S25: taking the first 4 layers of the deep learning network VGG-Net16 as base network layers and combining them with a pyramid network to form the network structure of a detection model for the field regions to be identified; taking the bill region labeled pictures in the training sample set as the input of this model and the labeled data of the field regions to be identified as its output; and training iteratively until the output accuracy of the model on the test sample set is greater than a preset threshold, to obtain a trained detection model for the field regions to be identified;
S26: detecting the single character regions in the field region picture to be identified according to steps S11 to S17, to obtain single character region images;
S27: taking VGG-Net16 as the network structure, the single character region images as input and the recorded information of the field regions to be identified as output, training a recorded-information identification model for the regions to be identified until its output accuracy on the test sample set is greater than a preset threshold, to obtain a trained recorded-information identification model;
S28: loading, in sequence, the trained bill region detection model file, the detection model file for the field regions to be identified and the recorded-information identification model file; starting a Web interface service for bill region segmentation; and returning the recorded information of each bill in Base64 encoding, thereby completing the identification of the bill content.
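The Base64 return of step S28 can be sketched as below. Serializing the recorded information as JSON before encoding, the function names and the field names `amount` and `issuer` are assumptions; the source only states that the information is returned in Base64 encoding.

```python
import base64
import json


def encode_bill_record(record):
    """Serialize a bill record and return it as a Base64 ASCII string."""
    payload = json.dumps(record, ensure_ascii=False).encode("utf-8")
    return base64.b64encode(payload).decode("ascii")


def decode_bill_record(text):
    """Reverse of encode_bill_record, as a client of the service might do."""
    return json.loads(base64.b64decode(text).decode("utf-8"))


# Hypothetical recognition result for one bill.
record = {"amount": "23.4元", "issuer": "Example Co"}
encoded = encode_bill_record(record)
decoded = decode_bill_record(encoded)
```

Encoding the UTF-8 bytes rather than the text keeps Chinese characters such as 元 intact when the response passes through ASCII-only channels.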
Fig. 1 is a schematic diagram of bill region labeling, labeling of the field regions to be identified, and single character region labeling. Each bill region is labeled with a rectangular frame that contains only one bill image, and each field region to be identified and each single character region is likewise labeled with its own rectangular frame.
The bill content identification method is based on the deep learning theory, and sequentially performs bill region detection, field region detection to be identified and single character region detection from a bill picture set, and after the single character region detection is completed, only records in the single character region are identified, so that the accuracy of character identification can be greatly improved, and the accuracy of the whole bill content identification is improved.
The invention constructs and expands a small number of training samples to ensure that the data of each type of bill is roughly the same, so that the learning accuracy is very high, the phenomenon that the characteristics of a certain type of bill cannot be learned is avoided, and the accurate identification of various bills is facilitated.
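The 80%/20% split of step S22 and the per-type counting of step S23 can be sketched as follows. The fixed random seed, the `(file name, bill type)` sample layout and the sample type names are assumptions for illustration; the ratio of 80% and the threshold of 20 samples come from the text.

```python
import random
from collections import Counter


def split_samples(samples, train_ratio=0.8, seed=0):
    """Randomly split labeled samples into training and test sets."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)  # seed fixed only for repeatability
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def types_needing_expansion(train, all_types, min_count=20):
    """Bill types whose training sample count is below min_count."""
    counts = Counter(bill_type for _, bill_type in train)
    return sorted(t for t in all_types if counts[t] < min_count)


# Hypothetical labeled set: 95 invoices and 5 train tickets.
samples = ([(f"invoice_{i}.jpg", "invoice") for i in range(95)]
           + [(f"ticket_{i}.jpg", "train_ticket") for i in range(5)])
train, test = split_samples(samples)
rare = types_needing_expansion(train, {t for _, t in samples})
```

Types listed in `rare` are the ones that would then be expanded by the image mixing or layer mixing methods.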
Further, the methods for expanding the training samples in step S23 include an image mixing method and a layer mixing method. The image mixing method is specifically: superimposing the sample bill picture and another bill background in a ratio of 6:4 to form a new picture, the new picture containing both the content of the sample bill picture and the other bill background.
The layer mixing method specifically comprises the following steps:
S231: opening a sample bill picture and a bill background picture in picture editing software;
S232: selecting, in the bill background picture, the selection area to be used for replacement, copying it to the layer of the sample bill picture, and recording it as the first selection area;
S233: resizing the first selection area to fit the sample bill picture, loading the first selection area, contracting it by 3 to 5 pixels, and deleting the corresponding selected region from the sample bill layer;
S234: selecting both the layer containing the sample bill and the layer containing the first selection area, and applying the automatic layer mixing command with the panorama option to obtain the layer-mixed picture, thereby completing the expansion of the sample bill.
The image mixing method can be easily realized through graphic editing software such as Photoshop, and the expansion of rare samples can be completed; the layer mixing method can also use the scripting language of Photoshop software to realize the text replacement in the bill images in batch, so as to achieve the purpose of expanding rare samples. The training sample expansion method adopted by the invention can realize effective expansion of rare samples, and has simple operation and strong practicability.
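Outside graphic editing software, the image mixing method can also be approximated programmatically. The sketch below reads the 6:4 ratio as per-pixel weights of 0.6 for the sample bill and 0.4 for the other background, which is an assumption about how the ratio is applied; the grayscale nested-list pictures are likewise illustrative.

```python
def mix_images(sample, background, sample_weight=0.6):
    """Blend two same-size grayscale pictures pixel by pixel.

    sample_weight=0.6 reflects the 6:4 ratio; treating the ratio as
    linear pixel weights is an assumption of this sketch.
    """
    bg_weight = 1.0 - sample_weight
    return [
        [round(s * sample_weight + b * bg_weight) for s, b in zip(srow, brow)]
        for srow, brow in zip(sample, background)
    ]


# Hypothetical 2 x 2 grayscale pictures (0 = black, 255 = white).
sample_bill = [[0, 100], [200, 250]]
other_background = [[250, 250], [0, 50]]
mixed = mix_images(sample_bill, other_background)
```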
Further, step S21 comprises:
S211: connecting a scanner to read the image information of the bill;
S212: processing the image information of the bill, including picture compression, picture enhancement, background removal and picture orientation correction.
According to the invention, after the bill image information is obtained through the scanner, the bills with fuzzy content, shooting deformation and complex shooting scene are preprocessed, so that the bill information is easy to identify, and the accuracy of bill content identification is further improved.
The present invention is not limited to the above preferred embodiments, and any modifications, equivalent substitutions, improvements, etc. within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (3)

1. A bill content recognition method for recognizing a single character area from a bill picture is characterized by comprising
S21: acquiring a bill picture set;
s22: marking the bill regions of all the bill region pictures in the bill picture set by using a picture marking tool in the deep learning field, marking the field region to be identified and a single character region of each bill region, storing the recorded information of the field region to be identified, randomly selecting 80% of picture files in the marked bill shooting picture set to form a training sample set, and taking the rest 20% of the picture files as a testing sample set;
s23: counting the number of training samples according to the bill types, and constructing and expanding the bill types with the number of the training samples smaller than 20 to obtain a training sample set with balanced number;
the method for expanding the training sample in the step S23 includes an image mixing method and a layer mixing method, where the image mixing method specifically includes: superposing the sample bill picture and another bill background according to the proportion of 6:4 to form a new picture, wherein the new picture contains the content of the sample bill picture and the other bill background;
the layer mixing method specifically comprises the following steps:
s231: opening a sample bill picture and a bill background picture by using picture editing software;
s232: selecting a pre-replaced selection area in the bill background picture, copying the selection area to the layer of the sample bill picture, and recording the selection area as a first selection area;
s233: adjusting the size of the first selection area to adapt to the sample bill picture, loading the first selection area, then contracting the first selection area by 3-5 pixels, deleting the selection area corresponding to the sample bill layer,
s234: simultaneously selecting the layer where the sample bill is located and the layer where the first selection area is located, and obtaining a picture after the panoramic image generation layer is mixed by using an automatic layer mixing command, so as to complete the expansion of the sample bill;
S24: using the first 4 layers of the deep learning network VGG-Net16 as the base network layers and combining them with a pyramid network to form the network structure of a bill region detection model; taking the bill pictures in the training sample set as the input of the bill region detection model and the labeled bill region data as its output, and training iteratively until the output accuracy of the bill region detection model on the test sample set exceeds a preset threshold, thereby obtaining the trained bill region detection model;
S25: using the first 4 layers of the deep learning network VGG-Net16 as the base network layers and combining them with a pyramid network to form the network structure of a detection model for field regions to be identified; taking the labeled bill region pictures in the training sample set as the input of this model and the labeled data of the field regions to be identified as its output, and training iteratively until the output accuracy of the model on the test sample set exceeds a preset threshold, thereby obtaining the trained detection model for field regions to be identified;
S26: detecting the single character regions in the field region picture to be identified according to steps S11 to S17 to obtain single character region images;
S27: using VGG-Net16 as the network structure, taking the single character region images as input and the recorded information of the field region to be identified as output, and training the recorded-information recognition model for the region to be identified until its output accuracy on the test sample set exceeds a preset threshold, thereby obtaining the trained recorded-information recognition model for the region to be identified;
S28: loading, in sequence, the trained bill region detection model file, the detection model file for field regions to be identified, and the recorded-information recognition model file for the region to be identified; starting a Web interface service for bill region segmentation; and returning the information recorded on each bill in Base64-encoded form to complete the bill content identification;
wherein the single character region detection method comprises:
S10: obtaining a field region picture to be identified, and labeling the single character regions in it to obtain a single character region picture;
S11: scaling field region pictures to be identified of different sizes to a fixed size to obtain a uniform-size picture, whose height is denoted H pixels and whose width is denoted W pixels, so that the size of the uniform-size picture is H × W pixels;
S12: performing convolution and pooling operations on the uniform-size picture to obtain a first-layer feature map;
S13: extracting a field region feature map from the first-layer feature map through a VGG-Net16 network;
S14: setting, for each pixel of the field region feature map, M initial detection frames of different sizes together with 4 corresponding offsets, the 4 offsets being the center coordinates of the initial detection frame (two values), its length and its width; feeding the H × W × M initial detection frames into a softmax layer, and obtaining two probability scores for each initial detection frame;
S15: screening out the initial detection frames that belong to the foreground according to the probability scores;
wherein, in step S15, the specific criterion for judging from the probability score whether each initial detection frame belongs to the foreground or the background is: when the IOU between the probability score of an initial detection frame and the probability score of the single character region picture is greater than or equal to 0.8, the initial detection frame is judged to be foreground;
S16: sorting the initial detection frames obtained in step S15 by probability score using non-maximum suppression, selecting the top N results as the single character region proposals, and thereby completing the extraction of the proposal windows;
S17: mapping the proposal windows onto the field region feature map, pooling them through a region-of-interest pooling layer, and normalizing proposal windows of different sizes into fixed-size feature vectors of uniform dimension;
S18: feeding the feature vectors into a fully connected layer, computing the frame regression with the Smooth L1 loss function, and outputting the frame offsets of the single character regions to complete the single character region detection.
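Steps S15 and S16, together with the Smooth L1 loss named in step S18, can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes boxes in corner format (x1, y1, x2, y2), reads the 0.8 criterion of step S15 as an IoU between an initial detection frame and a labeled single character region, and uses a suppression threshold of 0.7 that the claims do not specify. All function names and parameters are hypothetical.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed corner format: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_foreground(box: Box, labeled: Sequence[Box], thresh: float = 0.8) -> bool:
    """Assumed reading of S15: foreground if IoU with any labeled region >= 0.8."""
    return any(iou(box, gt) >= thresh for gt in labeled)


def nms_top_n(boxes: Sequence[Box], scores: Sequence[float],
              overlap_thresh: float = 0.7, top_n: int = 300) -> List[int]:
    """Greedy non-maximum suppression; returns indices of at most top_n boxes.

    overlap_thresh is an assumption; top_n=300 lies in the 280-320 range of claim 2.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with a better-scoring kept box.
        if all(iou(boxes[i], boxes[j]) <= overlap_thresh for j in keep):
            keep.append(i)
        if len(keep) == top_n:
            break
    return keep


def smooth_l1(x: float, beta: float = 1.0) -> float:
    """Smooth L1 loss for one regression residual (quadratic near 0, linear elsewhere)."""
    ax = abs(x)
    return 0.5 * ax * ax / beta if ax < beta else ax - 0.5 * beta
```

For example, `nms_top_n(boxes, scores, overlap_thresh=0.5)` drops a lower-scoring frame that overlaps a kept frame by more than half, while well-separated frames survive.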
2. The bill content identification method according to claim 1, wherein M in step S14 ranges from 8 to 10, and N in step S16 ranges from 280 to 320.
3. The bill content identification method according to claim 1, wherein step S21 comprises:
S211: connecting a scanner to read the image information of the bill;
S212: processing the image information of the bill, including picture compression, picture enhancement, background removal and picture orientation correction.
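The frame count in step S14 follows directly from M: with M initial detection frames per feature-map pixel, an H × W map yields H × W × M frames. The short Python sketch below illustrates this under stated assumptions: the frame sizes, the choice M = 9 (inside the claimed range of 8 to 10), and the (cx, cy, width, height) layout are illustrative and are not taken from the patent.

```python
from itertools import product
from typing import List, Sequence, Tuple

Frame = Tuple[float, float, float, float]  # assumed layout: (cx, cy, width, height)


def make_initial_frames(h: int, w: int,
                        sizes: Sequence[Tuple[float, float]]) -> List[Frame]:
    """Place one frame per (pixel, size) pair, centered on the pixel."""
    return [
        (x + 0.5, y + 0.5, fw, fh)
        for y, x in product(range(h), range(w))
        for fw, fh in sizes
    ]


# Hypothetical sizes: 3 scales x 3 aspect ratios gives M = 9.
SIZES = [(s * r, s / r) for s in (8.0, 16.0, 32.0) for r in (0.5, 1.0, 2.0)]

frames = make_initial_frames(4, 6, SIZES)
print(len(frames))  # 4 * 6 * 9 frames
```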
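The sample preparation in steps S22 and S23 of claim 1 (the 80%/20% split, the check for bill types with fewer than 20 training samples, and the 6:4 image mixing) can be sketched in plain Python. This is an illustration only: pictures are modeled as nested lists of grayscale values, and the file names, bill type labels, seed and function names are hypothetical.

```python
import random
from collections import Counter
from typing import List, Sequence, Tuple


def split_samples(files: Sequence[str], train_ratio: float = 0.8,
                  seed: int = 0) -> Tuple[List[str], List[str]]:
    """Randomly split labeled picture files into training and test sets (S22)."""
    shuffled = list(files)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def types_needing_expansion(bill_types: Sequence[str],
                            min_count: int = 20) -> List[str]:
    """Return bill types with fewer than min_count training samples (S23)."""
    counts = Counter(bill_types)
    return sorted(t for t, n in counts.items() if n < min_count)


def blend_6_4(sample: Sequence[Sequence[int]],
              background: Sequence[Sequence[int]]) -> List[List[int]]:
    """Superimpose sample and background pixel by pixel at a 6:4 ratio (S23)."""
    if len(sample) != len(background) or any(
        len(rs) != len(rb) for rs, rb in zip(sample, background)
    ):
        raise ValueError("sample and background must have the same size")
    # Integer arithmetic with rounding: (6*s + 4*b) / 10.
    return [
        [(6 * s + 4 * b + 5) // 10 for s, b in zip(rs, rb)]
        for rs, rb in zip(sample, background)
    ]
```

A type reported by `types_needing_expansion` would then be padded with pictures produced by `blend_6_4` (or by the layer mixing method) until the sample counts are balanced.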
CN201910668919.0A 2019-07-24 2019-07-24 Single character area detection method and bill content identification method Active CN110490193B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910668919.0A CN110490193B (en) 2019-07-24 2019-07-24 Single character area detection method and bill content identification method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910668919.0A CN110490193B (en) 2019-07-24 2019-07-24 Single character area detection method and bill content identification method

Publications (2)

Publication Number Publication Date
CN110490193A CN110490193A (en) 2019-11-22
CN110490193B CN110490193B (en) 2022-11-08

Family

ID=68548038

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910668919.0A Active CN110490193B (en) 2019-07-24 2019-07-24 Single character area detection method and bill content identification method

Country Status (1)

Country Link
CN (1) CN110490193B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111027443B (en) * 2019-12-04 2023-04-07 华南理工大学 Bill text detection method based on multitask deep learning
CN112925837B (en) * 2019-12-06 2022-08-02 上海高德威智能交通***有限公司 Text structuring method and device
CN111507352B (en) * 2020-04-16 2021-09-28 腾讯科技(深圳)有限公司 Image processing method and device, computer equipment and storage medium
CN112308036A (en) * 2020-11-25 2021-02-02 杭州睿胜软件有限公司 Bill identification method and device and readable storage medium
CN112733726A (en) * 2021-01-12 2021-04-30 海尔数字科技(青岛)有限公司 Bill sample capacity expansion method and device, electronic equipment and storage medium
CN113468906B (en) * 2021-07-12 2024-03-26 深圳思谋信息科技有限公司 Graphic code extraction model construction method, identification device, equipment and medium

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109284750A (en) * 2018-08-14 2019-01-29 北京市商汤科技开发有限公司 Bank slip recognition method and device, electronic equipment and storage medium

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106650721B (en) * 2016-12-28 2019-08-13 吴晓军 A kind of industrial character identifying method based on convolutional neural networks
KR101858099B1 (en) * 2017-02-03 2018-06-27 인천대학교 산학협력단 Method and apparatus for detecting vehicle plates
CN107766809B (en) * 2017-10-09 2020-05-19 平安科技(深圳)有限公司 Electronic device, bill information identification method, and computer-readable storage medium
CN107798299B (en) * 2017-10-09 2020-02-07 平安科技(深圳)有限公司 Bill information identification method, electronic device and readable storage medium
CN108446621A (en) * 2018-03-14 2018-08-24 平安科技(深圳)有限公司 Bank slip recognition method, server and computer readable storage medium
CN108596066B (en) * 2018-04-13 2020-05-26 武汉大学 Character recognition method based on convolutional neural network
CN109784342B (en) * 2019-01-24 2021-03-12 厦门商集网络科技有限责任公司 OCR (optical character recognition) method and terminal based on deep learning model

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109284750A (en) * 2018-08-14 2019-01-29 北京市商汤科技开发有限公司 Bank slip recognition method and device, electronic equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
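Research on natural scene text detection technology based on link lines; 王家伟 (Wang Jiawei); China Master's Theses Full-text Database, Information Science and Technology; 20190115; 1-24 *
Design and implementation of a mobile-terminal object detection ***; 肖学锋 (Xiao Xuefeng); China Master's Theses Full-text Database, Information Science and Technology; 20190115; 9-12 *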

Also Published As

Publication number Publication date
CN110490193A (en) 2019-11-22

Similar Documents

Publication Publication Date Title
CN110490193B (en) Single character area detection method and bill content identification method
CN110555433B (en) Image processing method, device, electronic equipment and computer readable storage medium
CN109784342B (en) OCR (optical character recognition) method and terminal based on deep learning model
WO2022142611A1 (en) Character recognition method and apparatus, storage medium and computer device
WO2021042505A1 (en) Note generation method and apparatus based on character recognition technology, and computer device
JP7246104B2 (en) License plate identification method based on text line identification
JP2016134175A (en) Method and system for performing text-to-image queries with wildcards
CN109711407B (en) License plate recognition method and related device
CN109740515B (en) Evaluation method and device
CN109377494B (en) Semantic segmentation method and device for image
CN110427819B (en) Method for identifying PPT frame in image and related equipment
CN111401374A (en) Model training method based on multiple tasks, character recognition method and device
CN112070649A (en) Method and system for removing specific character string watermark
CA3166091A1 (en) An identification method, device computer equipment and storage medium for identity document reproduction
CN113591866A (en) Special job certificate detection method and system based on DB and CRNN
CN112115950B (en) Wine mark identification method, wine information management method, device, equipment and storage medium
CN115187456A (en) Text recognition method, device, equipment and medium based on image enhancement processing
CN111626145A (en) Simple and effective incomplete form identification and page-crossing splicing method
CN112365451B (en) Method, device, equipment and computer readable medium for determining image quality grade
CN113256643A (en) Portrait segmentation model training method, storage medium and terminal equipment
CN116311322A (en) Document layout element detection method, device, storage medium and equipment
CN114359739B (en) Target identification method and device
CN115147852A (en) Ancient book identification method, ancient book identification device, ancient book storage medium and ancient book storage equipment
CN114625872A (en) Risk auditing method, system and equipment based on global pointer and storage medium
US11928872B2 (en) Methods and apparatuses for recognizing text, recognition devices and storage media

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20240409

Address after: 710100 11a-1-5, Chang'an innovation and entrepreneurship center, Wenyuan Middle Road, Guodu street, Chang'an District, Xi'an City, Shaanxi Province

Patentee after: Shaanxi taoding Information Technology Co.,Ltd.

Country or region after: China

Address before: 710000 Room 102, block a, Chang'an cultural center, Wenyuan South Road, Guodu Street office, Chang'an District, Xi'an City, Shaanxi Province

Patentee before: Xi'an Network Computing Data Technology Co.,Ltd.

Country or region before: China