CN113065407A - Financial bill seal erasing method based on attention mechanism and generation countermeasure network - Google Patents


Info

Publication number
CN113065407A
Authority
CN
China
Prior art keywords
convolutional neural
picture
neural network
seal
original
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110254233.4A
Other languages
Chinese (zh)
Other versions
CN113065407B (en
Inventor
刘义江
陈蕾
侯栋梁
池建昆
范辉
阎鹏飞
魏明磊
李云超
姜琳琳
辛锐
陈曦
杨青
沈静文
吴彦巧
姜敬
檀小亚
师孜晗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiongan New Area Power Supply Company State Grid Hebei Electric Power Co
State Grid Hebei Electric Power Co Ltd
Original Assignee
Xiongan New Area Power Supply Company State Grid Hebei Electric Power Co
State Grid Hebei Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiongan New Area Power Supply Company State Grid Hebei Electric Power Co, State Grid Hebei Electric Power Co Ltd
Priority to CN202110254233.4A
Publication of CN113065407A
Application granted
Publication of CN113065407B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/412 Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/12 Accounting
    • G06Q40/125 Finance or payroll
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/56 Extraction of image or video features relating to colour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Finance (AREA)
  • Artificial Intelligence (AREA)
  • Accounting & Taxation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Image Analysis (AREA)

Abstract

The invention belongs to the field of bill text recognition and relates to a financial bill seal erasing method based on an attention mechanism and a generative adversarial network (GAN). The method is implemented by a processor and comprises the following steps: receiving an original picture of the financial bill; determining a first feature map of the original picture using a feature extraction module of a convolutional neural network; extracting from the first feature map, using the convolutional neural network, a background color map of the original picture and an attention heat map reflecting the position distribution of the seal on the original picture; and generating, in an adversarial manner, the picture with the seal erased, using the convolutional neural network and a second feature map formed by concatenating the original picture, the background color map and the attention heat map along the channel direction. The convolutional neural network is trained adversarially. The invention solves the difficulty of recognizing financial bills that carry seals and erases the seal without losing the original text information.

Description

Financial bill seal erasing method based on attention mechanism and generation countermeasure network
Technical Field
The invention belongs to the technical field of graph convolution neural networks, and particularly relates to a method for erasing and filling partial areas from a picture.
Background
In the automated computer processing of financial bill reimbursement, the workflow starts with the digital entry of financial bills. Physical financial bills include various business documents such as invoices, train tickets, plane tickets and approval forms. A bill is scanned into a digital image file by an image capture device such as a scanner, and an algorithm model then detects and recognizes the bill's content from that file. A practical problem is that nearly all physical financial bills carry seals, and the seals randomly cover, cut across and mix with the content, so the accuracy of detecting and recognizing content in the digital picture files of financial bills through the graph convolution neural network is low.
The bill seal erasing methods in common use today are based on image convolution processing of image color channels. For example, the original picture is split into red, green and blue grayscale channel images according to its RGB three-channel information, and a threshold, set manually or learned in training, is then used to force pixels above the threshold to white and pixels below it to black. Chinese patent publication CN108146093B also discloses a method for removing a bill seal. The prior-art methods based on image color channels have three defects. First, because the type, size, position and background complexity of the seals differ from picture to picture, each picture needs a different threshold; the color data captured for the same ink color varies considerably under different lighting conditions; and text that is similar in color to the seal but is not part of it cannot be excluded. Second, erasing a seal by channel threshold alone works poorly and cannot remove the seal content completely: to keep over-erasing from harming the detection and recognition of the covered text, the boundary of the seal content tends to be judged conservatively rather than greedily, so obvious seal traces usually remain. Third, the position of the seal cannot be determined in order to reduce resource consumption: the method must evaluate the full-size three color channels of the whole picture and apply the erasure to the whole picture, which costs considerable running time and computing resources.
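The channel-threshold baseline described above can be sketched in a few lines. This is only an illustration of the general idea, not the method of CN108146093B; the function name `threshold_channel`, the toy picture and the threshold value 128 are assumptions made for the example.

```python
import numpy as np

def threshold_channel(image_rgb, channel=0, threshold=128):
    """Binarize one color channel of an (H, W, 3) picture.

    Pixels above the threshold become white (255), the rest black (0).
    """
    gray = image_rgb[..., channel]
    return np.where(gray > threshold, 255, 0).astype(np.uint8)

# Toy 1 x 3 picture: white paper, red seal ink, black printed text.
picture = np.array([[[255, 255, 255], [220, 30, 30], [20, 20, 20]]], dtype=np.uint8)

# On the red channel, red ink is as bright as paper, so it is pushed to white,
# while the dark text stays black.
binary = threshold_channel(picture, channel=0, threshold=128)
```

The result depends entirely on the chosen channel and threshold, which is the per-picture tuning problem the first defect refers to.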
Disclosure of Invention
The invention aims to provide a method for erasing seals in financial bills based on an attention mechanism and a generative adversarial network (GAN) from deep learning. Using the attention mechanism and the generative adversarial network, the method specifically addresses the difficulty of recognizing financial bills that carry seals and erases the seal without losing the original text information. Solving the seal-erasing problem promotes the informatization of financial office work, saves labor costs and simplifies the reimbursement workflow.
The technical solution provided by several embodiments of the invention is a financial bill seal erasing method based on an attention mechanism and a generative adversarial network, implemented by a processor and comprising the following steps:
receiving an original picture of the financial bill; determining a first feature map of the original picture using a feature extraction module of a convolutional neural network; extracting from the first feature map, using the convolutional neural network, a background color map of the original picture and an attention heat map reflecting the position distribution of the seal on the original picture; and erasing the seal from the original picture using the convolutional neural network according to a second feature map formed by concatenating the original picture, the background color map and the attention heat map along the channel direction.
Preferably, the feature extraction module of the convolutional neural network is configured to evaluate the feature vectors distributed over each channel of the color space of the original picture, forming a first feature map with the same height and width as the original picture.
Preferably, the convolutional neural network comprises a background color separation module configured to apply global max pooling to the first feature map to determine a one-dimensional feature vector of the original picture in the channel direction, map the one-dimensional feature vector to coordinate values in a color space, and replicate the coordinate values to create a background color map with the same height and width as the original picture.
Preferably, the convolutional neural network comprises an attention mechanism module configured to evaluate the first feature map to obtain the attention heat map reflecting the position distribution of the seal on the original picture.
Preferably, evaluating the first feature map to obtain the attention heat map comprises: replicating the one-dimensional feature vector, obtained from the first feature map by global max pooling, along the height and width directions of the original picture into a multi-dimensional feature vector with the same height and width as the original picture; multiplying this multi-dimensional feature vector element-wise with the first feature map; and summing the product over the channel dimension to obtain the attention heat map.
Preferably, the convolutional neural network comprises a U-net network that generates the seal-erased picture from the second feature map. During training, a discriminator is constructed for the U-net network; it judges the authenticity of the erased picture generated by the U-net network against the seal-free picture of the input paired sample, so that the two form a generative adversarial network. Within a training cycle, the U-net network is trained multiple times so that it keeps learning to generate more realistic seal-erased pictures, while the discriminator keeps improving its ability to judge the authenticity of erased pictures; the convolutional neural network is trained until the discriminator judges the seal-erased pictures generated by the U-net network to be real.
Preferably, the convolutional neural network is trained on a training sample set composed of paired samples, each pair comprising a seal-containing picture and the corresponding seal-free picture. Further preferably, the loss function configured for training the convolutional neural network comprises a predicted-data deviation loss function L_data. Further preferably, the loss function configured for training the convolutional neural network comprises a generative adversarial network loss function L_GAN.
Preferably, the original picture of the financial bill is received without being preprocessed. Because the preferred embodiment applies global pooling to the first feature map and, on that basis, trains the seal-erasing convolutional neural network on single pictures, preprocessing can be omitted at the application stage, which improves the efficiency of recognizing the text content of bills.
In the technical solution provided by the invention, the feature extraction module first extracts the features of the bill picture, the background color separation module then separates out a background color map, the attention mechanism module next locates the seal region, and the generative adversarial network module finally learns to erase the seal. Thus, for the problem of seal erasure in financial bills, the solution improves text recognition accuracy by erasing the seal: the seal region is first located by the attention mechanism and the seal is then erased by the generative adversarial network. Because erasure can be concentrated on regions with high attention weight, the computation and running time of the algorithm are reduced, and the difficulty of recognizing financial bills that carry seals is effectively resolved.
Drawings
FIG. 1 is a schematic structural diagram of the convolutional neural network in a financial bill seal erasing method based on an attention mechanism and a generative adversarial network according to an embodiment of the invention;
FIG. 2 shows the seal-containing picture of a paired sample in the training sample set according to an embodiment of the invention;
FIG. 3 shows the seal-free picture of a paired sample in the training sample set according to an embodiment of the invention;
FIG. 4 is a schematic data flow diagram of training the convolutional neural network in a financial bill seal erasing method based on an attention mechanism and a generative adversarial network according to an embodiment of the invention;
FIG. 5 is a schematic data flow diagram of applying the convolutional neural network of the embodiment of FIG. 4 to erase a financial bill seal.
Detailed Description
It should be noted that the idea of the invention is as follows: receive an original picture of the financial bill; determine a first feature map of the original picture using a feature extraction module of a convolutional neural network; extract from the first feature map, using the convolutional neural network, a background color map of the original picture and an attention heat map reflecting the position distribution of the seal on the original picture; and generate, in an adversarial manner, the picture with the seal erased, using the convolutional neural network and a second feature map formed by concatenating the original picture, the background color map and the attention heat map along the channel direction. The convolutional neural network is trained adversarially.
Referring to FIG. 1, in one embodiment of the invention, the convolutional neural network in the financial bill seal erasing method based on an attention mechanism and a generative adversarial network has a network model structure of three parts.
The first part is a picture feature extraction module based on deep-learning multi-layer convolution. It extracts features from the original picture of size C0 × H × W and outputs a first feature map of size C1 × H × W, where C1, H and W denote the number of channels, the height and the width of the first feature map, respectively.
The second part is a background color separation module based on an attention mechanism. Global max pooling reduces the first feature map obtained by the first part along the height and width directions, giving a one-dimensional feature vector of the picture's feature data over the channel dimension; that is, the first feature map of size C1 × H × W is globally max-pooled into a one-dimensional feature vector of size C1. Passing this vector through several fully connected layers yields a background color map (Background Color) with the same size as the original picture, C0 × H × W. An attention heat map (Attention Map) of size 1 × H × W is obtained by using the one-dimensional vector as weights for a weighted sum of the last convolution layer over the channel direction. The background color map and the attention heat map are concatenated with the original picture along the channel direction to form a second feature map of size (2C0 + 1) × H × W. In the second feature map, the attention weight of each pixel in the attention heat map reflects the distribution of the region where the seal lies on the picture, and the background color map provides feature information that later processing uses to distinguish the foreground distribution containing the seal from background content whose color is similar to the seal.
The third part is a seal erasing module based on a generative adversarial network, which erases the approximate seal region extracted by the second part. The attention heat map (Attention Map) and the background color map (Background Color) output by the background separation module are concatenated along the channel direction, together with the original picture as described for the second part, and input as a whole as the second feature map. A deep learning network with segmentation and generation functions segments the bill picture, removes the seal part and regenerates the bill picture without the seal; the seal erasure is supervised by a corresponding label. In addition, when the convolutional neural network of this embodiment is trained, a discriminator judges the authenticity of the output seal-erased picture, forming a generative adversarial network, and a generative adversarial network loss function is introduced.
Referring to FIGS. 2, 3 and 4, as a specific example of the above embodiment, the picture feature extraction module that outputs the first feature map uses a 10-layer convolution structure. The training samples and the pictures whose seals are to be erased in application are preprocessed to a resolution of 600 × 800 (height × width), and the RGB color space is chosen for the input channel direction, so the input layer of the picture feature extraction module receives a matrix of size 3 × 600 × 800, i.e. C0 = 3. Specifically, the structure of the picture feature extraction module is configured as in the following table:
Operation type | Parameters | Output size
Input | - | 3 × 600 × 800
Block1, convolution layer × 3 | Kernel 3, stride 1, padding 1 | 32 × 600 × 800
Block2, convolution layer × 3 | Kernel 3, stride 1, padding 1 | 64 × 600 × 800
Block3, convolution layer × 4 | Kernel 3, stride 1, padding 1 | 128 × 600 × 800
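The sizes in the table can be checked with the standard convolution output-size formula, floor((n + 2p - k) / s) + 1. The sketch below walks the three blocks; the helper name `conv_output_size` and the tuple layout of `blocks` are assumptions made for the example.

```python
def conv_output_size(size, kernel=3, stride=1, padding=1):
    """Spatial output size of one convolution along a single axis."""
    return (size + 2 * padding - kernel) // stride + 1

# (block name, number of convolution layers, output channels), as in the table.
blocks = [("Block1", 3, 32), ("Block2", 3, 64), ("Block3", 4, 128)]

shape = (3, 600, 800)  # C0 x H x W input
shapes = []
for name, n_layers, out_channels in blocks:
    h, w = shape[1], shape[2]
    for _ in range(n_layers):
        h, w = conv_output_size(h), conv_output_size(w)
    shape = (out_channels, h, w)
    shapes.append((name, shape))

total_layers = sum(n_layers for _, n_layers, _ in blocks)
```

With kernel 3, stride 1 and padding 1, every layer preserves height and width, so only the channel count changes from block to block, and the three blocks add up to the 10 convolution layers mentioned above.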
It will be readily appreciated that choosing the 3 channels of the RGB color space for the input is a preference: data in common picture formats such as bmp or jpg is easily converted to three-channel RGB vectors, while for uncompressed formats such as RAW, channel settings that carry more detail can be used to further improve erasure accuracy.
Specifically, in this embodiment, the background color separation module that produces the background color map includes a global max pooling layer, which reduces the first feature map to the channel direction and generates a one-dimensional feature vector whose size equals the number of channels of the first feature map. The length of this vector is determined by the output size of the final convolution of the feature extraction module: in this embodiment the final convolution block, Block3, of the picture feature extraction module outputs 128 × 600 × 800, i.e. C1 = 128, so the one-dimensional feature vector has size 128. The background color separation module further includes three fully connected layers, which map the one-dimensional feature vector to a color space vector with as many channels as the original picture; in this embodiment the fully connected output has size C0 = 3. Finally, the background color separation module replicates this size-3 vector over the height and width of the picture to generate a background color map of size 3 × 600 × 800. By way of example, during training the background color separation module is trained with supervision under an L1 loss function, so that the resulting background color map reflects the original color of the bill. It is easy to understand that, when visualized, the background color map is a solid-color picture of the same size as the original picture; the selected color space vector carries the overall color feature information of the original picture rather than its actual background color, and its main use is to supply features for separating foreground from background on each color space channel in subsequent processing.
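The pooling, fully connected mapping and replication steps of the background color separation module can be sketched with NumPy as follows. This is a minimal forward-pass illustration with random weights, not the trained module: the hidden widths 64 and 32, the ReLU between layers and the function name `background_color_map` are all assumptions, since the text only states that there are three fully connected layers from 128 to 3.

```python
import numpy as np

def background_color_map(feature_map, layers):
    """feature_map: (C1, H, W); layers: list of (weight, bias) pairs.

    Returns a solid-color map of shape (C0, H, W).
    """
    _, h, w = feature_map.shape
    x = feature_map.max(axis=(1, 2))          # global max pooling -> (C1,)
    for i, (weight, bias) in enumerate(layers):
        x = weight @ x + bias                 # fully connected layer
        if i < len(layers) - 1:
            x = np.maximum(x, 0.0)            # ReLU between layers (assumed)
    # x now holds C0 color coordinates; copy them to every pixel position.
    return np.tile(x[:, None, None], (1, h, w))

rng = np.random.default_rng(0)
sizes = [128, 64, 32, 3]                      # hidden widths 64 and 32 are assumed
layers = [(rng.standard_normal((n_out, n_in)) * 0.05, np.zeros(n_out))
          for n_in, n_out in zip(sizes[:-1], sizes[1:])]

features = rng.standard_normal((128, 6, 8))   # small H, W to keep the example light
bg = background_color_map(features, layers)   # shape (3, 6, 8)
```

Because the pooling collapses height and width before the fully connected layers, the same weights work for any H and W, which is the size independence the description relies on.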
Specifically, in this embodiment, the attention mechanism module that produces the attention heat map first takes the one-dimensional feature vector of size C1 obtained by global max pooling in the background color separation module and replicates it H × W times to obtain a feature vector of size C1 × H × W. It then multiplies this feature vector element-wise with the first feature map of size C1 × H × W obtained by the first-part feature extraction module, and sums the product over the channel dimension to obtain the attention heat map of this embodiment, of size 1 × H × W. The attention heat map reflects the position distribution of the seal in the bill picture. It will be readily appreciated that those skilled in the art may use other prior-art attention mechanisms to extract the position distribution of the seal in the bill picture.
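The replicate, multiply and sum steps of the attention heat map can be written directly in NumPy. The sketch below is an illustration of the arithmetic only; the function name `attention_heat_map` and the toy input are assumptions.

```python
import numpy as np

def attention_heat_map(feature_map):
    """feature_map: (C1, H, W). Returns the heat map of shape (1, H, W)."""
    c1, h, w = feature_map.shape
    pooled = feature_map.max(axis=(1, 2))                        # (C1,)
    tiled = np.broadcast_to(pooled[:, None, None], (c1, h, w))   # copy over H and W
    product = tiled * feature_map                                # element-wise product
    return product.sum(axis=0, keepdims=True)                    # sum over channels

feature_map = np.arange(24, dtype=float).reshape(2, 3, 4)        # toy C1=2, H=3, W=4
heat = attention_heat_map(feature_map)
```

Each heat map value is the dot product between the pooled channel vector and the feature vector at that pixel, so pixels whose features align with the strongest channel responses receive the largest weights.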
Specifically, the seal erasing module that generates the seal-erased picture includes a U-net network, which generates the seal-erased picture from the second feature map. During training, a discriminator is built for the U-net network; it judges the authenticity of the erased picture generated by the U-net network against the seal-free picture of the input paired sample, so that the two form a generative adversarial network. Within a training cycle that takes the paired sample pictures as input, the U-net network is trained multiple times so that it keeps learning to generate more realistic seal-erased pictures, while the discriminator keeps improving its ability to judge the authenticity of erased pictures, until the discriminator judges the pictures erased by the U-net network to be real. In this embodiment, the discriminator is a convolutional neural network of 2 to 3 layers.
The structure of the convolutional neural network of this embodiment is described above. The following training and application process discloses in detail how this embodiment obtains a seal-erased picture from an original picture. The process comprises the following steps:
Step 100, create a training sample set. Specifically, in this embodiment, the initial training sample set includes N pairs of bill pictures, with and without a seal, and is represented as
Figure BDA0002967300320000061
where S_i denotes a bill picture containing a seal, hereinafter referred to as a seal-containing picture, and N_i denotes the picture corresponding to S_i that does not contain a seal, hereinafter referred to as a seal-free picture. FIG. 2 shows the seal-containing picture of a paired sample in the training sample set, which would normally be in color, and FIG. 3 shows the seal-free picture corresponding to FIG. 2. It is easy to understand that the training sample set of this embodiment is very simple to build: it only requires scanning or photographing a bill before and after stamping. The invoice seal erasing method based on a generative adversarial network provided by the invention is therefore easy to use and to transfer to various application scenarios.
Step 200, construct the convolutional neural network and configure the training environment. The convolutional neural network is built, and its training environment configured, according to the structural description of the convolutional neural network of this embodiment. Specifically, in the training stage, to keep all data in a batch the same size, the training environment fixes the paired training pictures obtained in step 100 to 600 × 800; pictures that do not meet this size are transformed by bilinear interpolation. The data augmentation methods used during training are random brightness adjustment and saturation/color adjustment. It is easy to understand that conventional data augmentation can partially enlarge the training sample set, and the invention does not restrict conventional picture data preprocessing.
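The resizing and brightness steps of this training setup can be sketched with NumPy. Several details here are assumptions made for the example and are not stated in the text: corner-aligned sampling in the bilinear interpolation, pictures stored as H × W × C floats in [0, 1], a brightness factor drawn from [0.8, 1.2], and the same factor being applied to both pictures of a pair. The names `resize_bilinear`, `adjust_brightness` and `prepare_pair` are hypothetical.

```python
import numpy as np

def resize_bilinear(img, out_h, out_w):
    """Resize an (H, W, C) float picture by bilinear interpolation."""
    in_h, in_w = img.shape[:2]
    ys = np.linspace(0.0, in_h - 1, out_h)    # source row for each output row
    xs = np.linspace(0.0, in_w - 1, out_w)    # source column for each output column
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, in_h - 1)
    x1 = np.minimum(x0 + 1, in_w - 1)
    wy = (ys - y0)[:, None, None]             # vertical blend weights
    wx = (xs - x0)[None, :, None]             # horizontal blend weights
    top = img[y0][:, x0] * (1 - wx) + img[y0][:, x1] * wx
    bottom = img[y1][:, x0] * (1 - wx) + img[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy

def adjust_brightness(img, factor):
    """Scale pixel values and clip back to [0, 1]."""
    return np.clip(img * factor, 0.0, 1.0)

def prepare_pair(sealed, clean, rng, size=(600, 800)):
    """Bring both pictures of a pair to the fixed size and jitter brightness."""
    factor = rng.uniform(0.8, 1.2)            # assumed range, shared by the pair
    out = []
    for img in (sealed, clean):
        if img.shape[:2] != size:
            img = resize_bilinear(img, *size)
        out.append(adjust_brightness(img, factor))
    return out[0], out[1]

rng = np.random.default_rng(0)
sealed = rng.random((300, 400, 3))
clean = rng.random((300, 400, 3))
sealed_t, clean_t = prepare_pair(sealed, clean, rng)
```

Sharing one brightness factor across the pair keeps the seal-free target consistent with its seal-containing input, which matters when the loss compares them pixel by pixel.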
In this embodiment, training the whole convolutional neural network model uses the paired sample picture data set of step 100, so the loss function for training the whole model has two parts: the first part is the pixel deviation L_data computed between the seal-erased picture produced by the network and the original seal-free picture; the second part is the generative adversarial network loss function L_GAN. The loss function L used to train the convolutional neural network model is configured as:
$$L = \lambda_1 L_{data} + \lambda_2 L_{GAN} \qquad (1)$$
where the predicted-data deviation loss function L_data is configured as:
Figure BDA0002967300320000071
where S_i denotes a seal-containing picture and N_i denotes the seal-free picture corresponding to S_i;
Figure BDA0002967300320000072
denotes averaging the loss over all paired samples in the data, i.e. taking its expected value, with P_data being the sample data set; and
Figure BDA0002967300320000073
denotes the seal-free picture obtained by prediction.
The generative adversarial network loss function L_GAN is configured as:
Figure BDA0002967300320000074
where S_i denotes a seal-containing picture, N_i denotes the seal-free picture corresponding to S_i, E denotes the expected value over the indicated distribution, and D denotes the discriminator.
It is easy to understand that
Figure BDA0002967300320000075
is the generator loss computed in the adversarial game, and
Figure BDA0002967300320000076
is the discriminator loss in the adversarial game.
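How the two loss terms combine can be sketched numerically. Because the formula images are not reproduced in this text, the exact forms below are assumptions: L_data is taken as a mean absolute (L1) pixel deviation, and L_GAN as the standard GAN objective E[log D(real)] + E[log(1 - D(fake))], written from the generator's side of the game. The function names and the example weights are likewise hypothetical.

```python
import numpy as np

def l_data(predicted, target):
    """Assumed form: mean absolute pixel deviation between the pictures."""
    return np.mean(np.abs(predicted - target))

def l_gan(d_real, d_fake, eps=1e-12):
    """Assumed form: standard GAN objective on discriminator outputs in (0, 1).

    d_real: D's scores for true seal-free pictures.
    d_fake: D's scores for seal-erased pictures produced by the generator.
    """
    return np.mean(np.log(d_real + eps)) + np.mean(np.log(1.0 - d_fake + eps))

def total_loss(predicted, target, d_real, d_fake, lam1, lam2):
    """Weighted sum of the two parts: lam1 * L_data + lam2 * L_GAN."""
    return lam1 * l_data(predicted, target) + lam2 * l_gan(d_real, d_fake)

rng = np.random.default_rng(0)
target = rng.random((3, 8, 8))                 # toy seal-free picture
predicted = np.clip(target + 0.1, 0.0, 1.0)    # toy generator output
loss = total_loss(predicted, target,
                  d_real=np.array([0.9]), d_fake=np.array([0.2]),
                  lam1=1.0, lam2=0.01)
```

In an actual adversarial setup the discriminator and the generator optimize the adversarial term in opposite directions, so a training loop would alternate between the two updates rather than minimize one scalar.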
By way of example, the hyper-parameters are λ1 = 1 and λ2 = 0.01. The ADADELTA optimizer is chosen to compute gradients and perform back-propagation. The learning rate is initialized to 0.1 and multiplied by 0.9 every 10 epochs; the batch size is set to 64, and training runs for 100 epochs in total.
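The step decay of the learning rate can be written as a one-line function. The sketch assumes epochs are counted from 0 and that the factor 0.9 is applied at epochs 10, 20, and so on; the function name `learning_rate` is hypothetical.

```python
def learning_rate(epoch, base_lr=0.1, decay=0.9, step=10):
    """Learning rate for a 0-indexed epoch under step decay."""
    return base_lr * decay ** (epoch // step)

# Learning rate for each of the 100 training epochs.
schedule = [learning_rate(epoch) for epoch in range(100)]
```

Under these assumptions the rate stays at 0.1 for the first 10 epochs and ends at 0.1 × 0.9^9 for the last 10.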
Step 300, train the convolutional neural network on the training sample set of step 100. Training for 100 epochs yields multiple models, from which the best one, the model with the smallest objective function value, is selected for practical use.
Referring to the data flow of the overall training of the neural network model shown in FIG. 4, it is easy to understand that, during training, the first feature map of size 128 × 600 × 800 extracted by the feature extraction module of the convolutional neural network is fed into global max pooling to obtain a 128-dimensional feature vector; global max pooling also allows the whole network to process input pictures of different sizes. After the resulting one-dimensional feature vector passes through the fully connected layers, a vector of three RGB values is output, representing the color of one pixel, and this pixel is replicated 600 × 800 times to obtain a background color map of size 3 × 600 × 800, the same size as the original picture. The L1 loss function is chosen here for supervised training, so that the resulting background color map reflects the original color of the bill.
The attention mechanism module first copies the one-dimensional feature vector of length 128, obtained by global max pooling in the background color separation module, 600 × 800 times into a feature map of size 128 × 600 × 800. It then performs point multiplication of this feature map with the feature map obtained by the feature extraction module in the first step, giving a result of size 128 × 600 × 800, and finally sums over the 128 channel dimensions to obtain an attention heat map of size 1 × 600 × 800. The attention heat map reflects the position distribution of the seal in the bill picture.
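The three operations of the attention mechanism module (copy, point multiplication, channel sum) can be sketched as follows. This is an illustrative NumPy version under assumptions: the function name `attention_heat_map` is hypothetical, and a small random feature map stands in for the real 128 × 600 × 800 one.

```python
import numpy as np


def attention_heat_map(feature_map):
    """Compute a (1, H, W) attention heat map from a (C, H, W) feature map."""
    channels, height, width = feature_map.shape
    # Global max pooling, as in the background color separation module.
    pooled = feature_map.max(axis=(1, 2))                        # (C,)
    # Copy the pooled vector to every spatial position.
    tiled = np.broadcast_to(pooled[:, None, None],
                            (channels, height, width))           # (C, H, W)
    # Element-wise (point) multiplication with the first feature map.
    product = tiled * feature_map                                # (C, H, W)
    # Sum over the channel dimension, keeping a singleton channel axis.
    return product.sum(axis=0, keepdims=True)                    # (1, H, W)


rng = np.random.default_rng(0)
fmap = rng.standard_normal((128, 60, 80))   # small stand-in for 128 x 600 x 800
heat = attention_heat_map(fmap)
print(heat.shape)
```

Each value in the heat map is the inner product between the pooled vector and the feature vector at that pixel, so positions whose features resemble the globally strongest responses receive larger values.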
The seal erasing module takes three inputs: the 3 × 600 × 800 bill picture, the 1 × 600 × 800 attention heat map output by the attention mechanism module, and the 3 × 600 × 800 background color map obtained by the background color separation module. The three are concatenated along the channel dimension. The concatenated 7 × 600 × 800 feature map is then fed into a U-net network for seal erasure, producing a 3 × 600 × 800 seal-erased picture. Meanwhile, the module builds a discriminator that judges the authenticity of the picture erased by the U-net network, forming a generative adversarial network: the U-net continuously learns to generate more realistic seal-erased pictures, and the discriminator continuously improves its ability to judge the authenticity of erased pictures, until, after training, the discriminator judges the pictures erased by the U-net network to be real.
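The channel-wise concatenation that builds the U-net input can be sketched as below. The function name `build_second_feature_map` is hypothetical, the channel order (picture, heat map, background) simply follows the order in which the text lists the inputs, and the U-net and discriminator themselves are outside the scope of this sketch.

```python
import numpy as np


def build_second_feature_map(picture, heat_map, background):
    """Concatenate picture (3,H,W), heat map (1,H,W) and background (3,H,W).

    Returns the (7, H, W) array that would be fed to the U-net.
    """
    spatial = picture.shape[1:]
    if heat_map.shape[1:] != spatial or background.shape[1:] != spatial:
        raise ValueError("all inputs must share the same height and width")
    # Channel dimension is axis 0 in the C x H x W layout used in the text.
    return np.concatenate([picture, heat_map, background], axis=0)


height, width = 60, 80                      # small stand-in for 600 x 800
picture = np.zeros((3, height, width))
heat_map = np.ones((1, height, width))
background = np.full((3, height, width), 2.0)

second = build_second_feature_map(picture, heat_map, background)
print(second.shape)
```

The 3 + 1 + 3 channels account for the 7 in the 7 × 600 × 800 input size stated above.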
Step 400, applying the parameters of the convolutional neural network model trained in step 300 to erase seals from the seal-containing pictures collected by the system. Because the convolutional neural network constructed in step 200 reduces the dimension of the first feature map by global pooling and is trained on single seal-containing pictures on that basis, the trained parameters are insensitive to basic picture properties such as size and contrast. In application, bills such as train tickets therefore need neither preprocessing nor data enhancement: whatever the values of H and W of the collected picture, the C1 × H × W feature map obtained by feature extraction is reduced by global pooling to a C1 × 1 × 1 feature vector. Referring to fig. 5, the directly trained seal erasing convolutional neural network model can erase the seal in a bill without the judgment of a discriminator. The result generated by the U-net part of the convolutional neural network model recovers the picture of the sample financial bill shown in fig. 3 as it was before stamping; that is, seal erasure and bill picture repair are achieved, and the result is available for character recognition by an external system.
The above description is only exemplary of the present invention and should not be taken as limiting; any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention falls within the protection scope of the present invention.

Claims (10)

1. A financial bill seal erasing method based on an attention mechanism and a generative adversarial network, implemented by a processor, the method comprising:
receiving an original picture of the financial bill; determining a first feature map of the original picture by using a feature extraction module in a convolutional neural network; extracting, by using the convolutional neural network and according to the first feature map, a background color map of the original picture and an attention heat map reflecting the position distribution of the seal on the original picture, respectively; generating, by using the convolutional neural network in an adversarial manner, a seal-erased picture of the original picture according to a second feature map obtained by concatenating the original picture, the background color map, and the attention heat map along the channel direction; wherein the convolutional neural network is trained in a generative adversarial manner.
2. The financial bill seal erasing method according to claim 1, wherein: the feature extraction module of the convolutional neural network is configured to evaluate feature vectors distributed over each channel of the color space of the original picture, so as to form a first feature map with the same length and width as the original picture.
3. The financial bill seal erasing method according to claim 2, wherein: the convolutional neural network comprises a background color separation module configured to apply global pooling to the first feature map to determine a one-dimensional feature vector of the original picture in the channel direction, map the one-dimensional feature vector to coordinate values of a color space, and copy the coordinate values to create a background color map with the same length and width as the original picture.
4. The financial bill seal erasing method according to claim 3, wherein: the convolutional neural network comprises an attention mechanism module configured to evaluate the first feature map to obtain the attention heat map reflecting the position distribution of the seal on the original picture.
5. The financial bill seal erasing method according to claim 4, wherein evaluating the first feature map to obtain the attention heat map reflecting the position distribution of the seal on the original picture comprises: first copying the one-dimensional feature vector, determined by global max pooling of the first feature map, along the length and width directions of the original picture into a multi-dimensional feature vector with the same length and width as the original picture; performing point multiplication of the multi-dimensional feature vector with the first feature map; and summing the point multiplication result over the channel dimension to obtain the attention heat map.
6. The financial bill seal erasing method according to claim 1, wherein: the convolutional neural network comprises a U-net network configured to generate a seal-erased picture according to the second feature map; during training, a discriminator is constructed for the U-net network, the discriminator being used to judge the authenticity of the erased picture generated by the U-net network against the seal-free picture in the input paired sample, so as to form a generative adversarial network; in one round of training, the U-net network is trained multiple times so that it continuously learns to generate more realistic seal-erased pictures, and the discriminator continuously improves its ability to judge the authenticity of erased pictures, until the convolutional neural network is trained to the point where the discriminator determines that the seal-erased pictures generated by the U-net network are real.
7. The financial bill seal erasing method according to claim 1, wherein: the convolutional neural network is trained with a training sample set consisting of paired samples; each paired sample comprises a seal-containing picture and the corresponding seal-free picture.
8. The financial bill seal erasing method according to claim 7, wherein: when the convolutional neural network is trained, the configured loss function includes a predicted data deviation loss function L_data.
9. The financial bill seal erasing method according to claim 8, wherein: when the convolutional neural network is trained, the configured loss function includes a generative adversarial network loss function L_GAN.
10. The financial bill seal erasing method according to any one of claims 1 to 9, wherein: the original picture of the financial bill is received without being preprocessed.
CN202110254233.4A 2021-03-09 2021-03-09 Financial bill seal erasing method based on attention mechanism and generation countermeasure network Active CN113065407B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110254233.4A CN113065407B (en) 2021-03-09 2021-03-09 Financial bill seal erasing method based on attention mechanism and generation countermeasure network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110254233.4A CN113065407B (en) 2021-03-09 2021-03-09 Financial bill seal erasing method based on attention mechanism and generation countermeasure network

Publications (2)

Publication Number Publication Date
CN113065407A true CN113065407A (en) 2021-07-02
CN113065407B CN113065407B (en) 2022-07-12

Family

ID=76559889

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110254233.4A Active CN113065407B (en) 2021-03-09 2021-03-09 Financial bill seal erasing method based on attention mechanism and generation countermeasure network

Country Status (1)

Country Link
CN (1) CN113065407B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115273123A (en) * 2022-09-26 2022-11-01 山东豸信认证服务有限公司 Bill identification method, device and equipment and computer storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109359550A (en) * 2018-09-20 2019-02-19 大连民族大学 Language of the Manchus document seal Abstraction and minimizing technology based on depth learning technology
CN109492627A (en) * 2019-01-22 2019-03-19 华南理工大学 A kind of scene text method for deleting of the depth model based on full convolutional network
CN110163194A (en) * 2019-05-08 2019-08-23 腾讯科技(深圳)有限公司 A kind of image processing method, device and storage medium
CN110517186A (en) * 2019-07-30 2019-11-29 金蝶软件(中国)有限公司 Eliminate method, apparatus, storage medium and the computer equipment of invoice seal
CN110619642A (en) * 2019-09-05 2019-12-27 四川大学 Method for separating seal and background characters in bill image
CN111915522A (en) * 2020-07-31 2020-11-10 天津中科智能识别产业技术研究院有限公司 Image restoration method based on attention mechanism
CN111931769A (en) * 2020-06-30 2020-11-13 北京来也网络科技有限公司 Invoice processing device, invoice processing apparatus, invoice computing device and invoice storage medium combining RPA and AI

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
BIN DING,ET AL.: "RGAN: Attentive Recurrent Generative Adversarial Network for Shadow Detection and Removal", 《ARXIV》 *
HAJAR EMAMI,ET AL.: "SPA-GAN: Spatial Attention GAN for Image-to-Image Translation", 《ARXIV》 *
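LIU ZONGPENG (刘宗鹏): "Research on image rain removal algorithms based on attention generative adversarial networks", 《China Master's Theses Full-text Database, Information Science and Technology》 *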

Also Published As

Publication number Publication date
CN113065407B (en) 2022-07-12

Similar Documents

Publication Publication Date Title
CN110516201B (en) Image processing method, image processing device, electronic equipment and storage medium
CN111401372B (en) Method for extracting and identifying image-text information of scanned document
CN101443785B (en) Detecting compositing in a previously conpressed image
CN112287941B (en) License plate recognition method based on automatic character region perception
CN110880000B (en) Picture character positioning method and device, computer equipment and storage medium
US20230206487A1 (en) Detection and identification of objects in images
CN106096610A (en) A kind of file and picture binary coding method based on support vector machine
CN109740572A (en) A kind of human face in-vivo detection method based on partial color textural characteristics
CN111583201B (en) Transfer learning method for constructing super-resolution pathology microscope
CN111680690A (en) Character recognition method and device
CN108710893A (en) A kind of digital image cameras source model sorting technique of feature based fusion
CN110400362A (en) A kind of ABAQUS two dimension crack modeling method, system and computer readable storage medium based on image
Mahale et al. Image inconsistency detection using local binary pattern (LBP)
CN111696021A (en) Image self-adaptive steganalysis system and method based on significance detection
CN112884758A (en) Defective insulator sample generation method and system based on style migration method
CN110222217B (en) Shoe print image retrieval method based on segmented weighting
CN116030453A (en) Digital ammeter identification method, device and equipment
CN113065407B (en) Financial bill seal erasing method based on attention mechanism and generation countermeasure network
US7586627B2 (en) Method and system for optimizing print-scan simulations
CN111898544B (en) Text image matching method, device and equipment and computer storage medium
CN112365451A (en) Method, device and equipment for determining image quality grade and computer readable medium
CN112561782A (en) Method for improving reality degree of simulation picture of offshore scene
CN112085727A (en) Intelligent identification method for scale structure on surface of hot rolled steel
CN114519788A (en) Image processing method, image processing device, electronic equipment and computer readable storage medium
CN116644422A (en) Malicious code detection method based on malicious block labeling and image processing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant