CN114550194B - Method and device for identifying letters and visitors - Google Patents

Method and device for identifying letters and visitors

Info

Publication number
CN114550194B
CN114550194B
Authority
CN
China
Prior art keywords
image
letter
identified
channel
official seal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210441221.7A
Other languages
Chinese (zh)
Other versions
CN114550194A (en)
Inventor
温立强
王晓娟
韦崟屹
杨跃
翁璐嵩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Peking University Software Engineering Co ltd
Original Assignee
Beijing Peking University Software Engineering Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Peking University Software Engineering Co ltd filed Critical Beijing Peking University Software Engineering Co ltd
Priority to CN202210441221.7A priority Critical patent/CN114550194B/en
Publication of CN114550194A publication Critical patent/CN114550194A/en
Application granted granted Critical
Publication of CN114550194B publication Critical patent/CN114550194B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Character Input (AREA)

Abstract

The embodiment of the application provides a method and a device for identifying a letter, wherein the method comprises the following steps: acquiring a letter to be identified; performing format conversion on the letter to be identified to obtain an image of the letter to be identified; and identifying specified content in the image of the letter to be identified to obtain an identification result, wherein the specified content includes at least one of a red header title, a document number, a date, an official seal and a handwritten signature. By means of this technical scheme, the workload and pressure on letter-and-visit staff can be reduced, and letter-and-visit processing efficiency can be improved.

Description

Method and device for identifying letters and visitors
Technical Field
The application relates to the technical field of computers, in particular to the technical fields of computer vision and natural language processing, and more particularly to a method and a device for identifying letters and visits.
Background
A letter and visit (petition) system is one in which citizens contact the organization or personnel responsible for letter-and-visit work in the corresponding unit, through participation channels such as letters, e-mails, visits, telephone calls, faxes and short messages, in order to report situations, express opinions and request that problems be solved, and in which the organization or personnel concerned process these submissions in a prescribed manner. Among the large volume of letters received there are many non-standard ones, for example letters lacking a red header title, document number, date, official seal or handwritten signature, so letter-and-visit documents need to undergo a standardization review.
At present, the standardization review of letter-and-visit documents is performed manually by letter-and-visit staff.
In the process of implementing the invention, the inventors found the following problem in the prior art: because the existing review method relies on manual checking, it suffers at least from low letter review efficiency.
Disclosure of Invention
The embodiment of the application aims to provide a method and a device for identifying letter-and-visit documents, so as to improve the efficiency of their review.
In a first aspect, an embodiment of the present application provides a method for identifying a letter, where the method includes: acquiring a letter to be identified; performing format conversion on the letter to be identified to obtain an image of the letter to be identified; and identifying specified content in the image of the letter to be identified to obtain an identification result; wherein the specified content includes at least one of a red header title, a document number, a date, an official seal and a handwritten signature.
Therefore, by means of this technical scheme, the drawbacks of manually screening and processing non-standard letters can be remedied, the workload and pressure on letter-and-visit staff reduced, letter-and-visit processing efficiency improved, the timeliness of responses requested by letter writers in emergencies enhanced, the accuracy of identifying valid letters improved, and both intelligent business services in the letter-and-visit field and the modernization of social governance capability promoted.
In one possible embodiment, the specified content includes a red header title;
the identifying of the specified content in the image of the letter to be identified to obtain the identification result includes: performing Optical Character Recognition (OCR) on an image of the home page content of the letter to be recognized to obtain a first OCR result; selecting a target text portion from the first OCR result, wherein the target text portion is the text portion with the widest vertical pixel span and the greatest height in the first OCR result; performing HSV three-channel separation on the target text portion to obtain a chrominance channel, a saturation channel and a luminance channel; if the chrominance channel, the saturation channel and the luminance channel are all within the red color gamut, determining that the letter to be identified has a red header title; and if any one of the chrominance channel, the saturation channel and the luminance channel is not within the red color gamut, determining that the letter to be identified does not have a red header title.
In one possible embodiment, the specified content includes a document number;
the identifying of the specified content in the image of the letter to be identified to obtain the identification result includes: performing Optical Character Recognition (OCR) on an image of the home page content of the letter to be recognized to obtain a first OCR result; matching at least one document number template with the first OCR result; if a target document number template matching the first OCR result exists among the at least one document number template, determining that a document number exists in the letter to be identified; and if none of the at least one document number template matches the first OCR result, determining that no document number exists in the letter to be identified.
In one possible embodiment, the specified content includes a date;
the identifying of the specified content in the image of the letter to be identified to obtain the identification result includes: performing Optical Character Recognition (OCR) on an image of the end page content of the letter to be recognized to obtain a second OCR result; matching at least one date template with the second OCR result; if a target date template matching the second OCR result exists among the at least one date template, determining that a date exists in the letter to be identified; and if none of the at least one date template matches the second OCR result, determining that no date exists in the letter to be identified.
In one possible embodiment, the specified content includes an official seal;
the identifying of the specified content in the image of the letter to be identified to obtain the identification result includes: performing grayscale processing on the image of the letter to be identified to obtain a grayscale image; and inputting the grayscale image into a pre-trained official seal recognition model based on YOLOv5 to obtain an official seal recognition result.
In one possible embodiment, the data enhancement process for the sample official seal images used to train the official seal recognition model includes: acquiring a sample official seal image; performing transparentization processing on the sample official seal image to obtain a first image to be cloned; and fusing and pasting the first image to be cloned into a first background image to obtain a first fused image for training the official seal recognition model, wherein the first background image is a text data image that does not contain an official seal.
Therefore, by means of this technical scheme, the problem of a small amount of originally labeled data can be alleviated, since more training data is acquired through data enhancement.
In a possible embodiment, the fusing and pasting of the first image to be cloned into the first background image to obtain the first fused image for training the official seal recognition model includes: respectively obtaining the gradient field of the sample official seal image and the gradient field of the first background image; overlaying the gradient field of the sample official seal image on the gradient field of the first background image to obtain the gradient field of the first fused image; taking partial derivatives of the gradient field of the first fused image to obtain a first divergence value; and processing the first divergence value with a Poisson reconstruction algorithm to obtain the pixel color value of each pixel in the first fused image.
In one possible embodiment, the specified content includes a handwritten signature;
the identifying of the specified content in the image of the letter to be identified to obtain the identification result includes: performing grayscale processing on the image of the letter to be identified to obtain a grayscale image; and inputting the grayscale image into a pre-trained handwritten signature recognition model based on YOLOv5 to obtain a handwritten signature recognition result.
In one possible embodiment, the data enhancement process for the sample handwritten signature images used to train the handwritten signature recognition model includes: acquiring a sample handwritten signature image; performing transparentization processing on the sample handwritten signature image to obtain a second image to be cloned; and fusing and pasting the second image to be cloned into a second background image to obtain a second fused image for training the handwritten signature recognition model, wherein the second background image is a text data image that does not contain a handwritten signature.
Therefore, by means of this technical scheme, the problem of a small amount of originally labeled data can likewise be alleviated, since more training data is acquired through data enhancement.
In one possible embodiment, the fusing and pasting of the second image to be cloned into the second background image to obtain the second fused image for training the handwritten signature recognition model includes: respectively obtaining the gradient field of the sample handwritten signature image and the gradient field of the second background image; overlaying the gradient field of the sample handwritten signature image on the gradient field of the second background image to obtain the gradient field of the second fused image; taking partial derivatives of the gradient field of the second fused image to obtain a second divergence value; and processing the second divergence value with a Poisson reconstruction algorithm to obtain the pixel color value of each pixel in the second fused image.
In a second aspect, an embodiment of the present application provides an apparatus for identifying a letter, where the apparatus includes: an acquisition module, configured to acquire a letter to be identified; a format conversion module, configured to perform format conversion on the letter to be identified to obtain an image of the letter to be identified; and an identification module, configured to identify specified content in the image of the letter to be identified to obtain an identification result; wherein the specified content includes at least one of a red header title, a document number, a date, an official seal and a handwritten signature.
In a third aspect, an embodiment of the present application provides a storage medium, where a computer program is stored on the storage medium, and when the computer program is executed by a processor, the computer program performs the method described in the first aspect or any optional implementation manner of the first aspect.
In a fourth aspect, an embodiment of the present application provides an electronic device, including: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating via the bus when the electronic device is running, the machine-readable instructions when executed by the processor performing the method of the first aspect or any of the alternative implementations of the first aspect.
In a fifth aspect, the present application provides a computer program product which, when run on a computer, causes the computer to perform the method of the first aspect or any possible implementation manner of the first aspect.
In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments of the present application will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and that those skilled in the art can also obtain other related drawings based on the drawings without inventive efforts.
FIG. 1 is a flow chart illustrating a method for identifying a letter provided by an embodiment of the application;
FIG. 2 is a flow chart illustrating a method for identifying red headers according to an embodiment of the present application;
FIG. 3 is a flow chart illustrating a method for identifying a document number according to an embodiment of the present application;
FIG. 4 is a flow chart illustrating a method of identifying a date provided by an embodiment of the present application;
FIG. 5 is a flowchart illustrating a method for identifying official seal according to an embodiment of the present application;
FIG. 6 is a flowchart illustrating a data enhancement method for training a sample official seal image of an official seal recognition model according to an embodiment of the present application;
FIG. 7 is a flowchart illustrating an image fusion method provided in an embodiment of the present application;
FIG. 8 is a flow chart illustrating a method of recognizing a handwritten signature provided in an embodiment of the present application;
FIG. 9 is a flowchart illustrating a data enhancement method for training a sample handwritten signature image of a handwritten signature recognition model according to an embodiment of the present application;
FIG. 10 is a flow chart of another image fusion method provided by the embodiment of the present application;
FIG. 11 is a block diagram illustrating an apparatus for identifying a letter according to an embodiment of the present application;
FIG. 12 is a block diagram of an electronic device provided in an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments.
Thus, the following detailed description of the embodiments of the present application, as presented in the figures, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without making any creative effort belong to the protection scope of the present application.
In order to solve the problem of the low review efficiency of letters in the prior art, the embodiment of the application provides a standardization review scheme for letter-and-visit documents: documents in different formats are acquired or received and uniformly processed to obtain images of the documents, and whether the current document has a red header title, a document number, a date, an official seal and a handwritten signature is identified from these images. In this way, the drawbacks of manually screening and processing non-standard documents can be remedied, the workload and pressure on letter-and-visit staff reduced, processing efficiency improved, the timeliness of responses requested by letter writers in emergencies enhanced, the accuracy of identifying valid letters improved, and both intelligent business services in the letter-and-visit field and the modernization of social governance capability promoted.
To facilitate understanding of the embodiments of the present application, some terms referred to in the embodiments of the present application are explained below:
"pixel color value": which includes three color channels of red (R), green (G), and blue (B), and RGB channels of an image) as an effective monochrome image color tool that can hold image color information.
"HSV (Hue, Saturation, brightness Value) channel": its color description is more consistent with the way humans describe and interpret colors than the RGB channel.
The "regular expression": the method is a common text matching tool in the field of computer science, and can be used for retrieving and replacing texts conforming to a certain mode (rule) on the premise of providing an effective template. The Elasticissearch for distributed full-text retrieval is used as a common retrieval tool and can also realize a text matching task.
"target detection algorithm based on deep learning": the method is divided into two algorithms, namely a two-stage algorithm and a one-stage algorithm, wherein the former algorithm has slightly higher accuracy but lower speed, and the latter algorithm has slightly lower accuracy but higher speed. And, the target detection based on deep learning can be divided into four parts, Input, Backbones, Neck and Head. The Input is responsible for inputting and processing images, the Backbones are responsible for extracting image features, the Neck is responsible for enhancing the extracted features, and the Head is used for outputting a final detection result.
"YOLO series": the method represents the most advanced target detection level in the industry, belongs to a one-stage algorithm, can keep higher detection accuracy, and can greatly improve the detection speed of the model, so that a user can balance speed and accuracy. And, the YOLO series is currently updated to YOLOv5, and YOLOv5 is optimized in terms of data enhancement, network structure, and the like, compared with YOLOv 4.
Referring to fig. 1, fig. 1 is a flowchart illustrating a method for identifying a letter according to an embodiment of the present application. The method shown in fig. 1 may be performed by an apparatus for identifying a letter, which may be the apparatus shown in fig. 11. The specific form of the apparatus can be set according to actual requirements, and the embodiment of the application is not limited to this; for example, the apparatus may be a computer, a server, or the like. Specifically, the method shown in fig. 1 includes:
and step S110, obtaining the letters to be identified.
And step S120, carrying out format conversion on the letters to be identified to obtain images of the letters to be identified.
Specifically, since there are many forms of the letters to be identified (for example, it may be a Word version, a PDF version, etc.), at least one acquired letter to be identified may be subjected to a uniform format conversion, so as to convert all the letters to be identified into a uniform image format. For example, all of the letters to be identified may be converted into a JPG image.
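By way of illustration, this conversion may be sketched as follows (a minimal sketch assuming PDF input and the third-party pdf2image library; the function and file names are illustrative, not taken from the patent):

```python
from pathlib import Path

from pdf2image import convert_from_path  # wraps the poppler PDF renderer


def letter_to_jpg_images(letter_path: str, out_dir: str) -> list[str]:
    """Convert each page of a PDF letter into a JPG image."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    pages = convert_from_path(letter_path, dpi=200)  # one PIL image per page
    paths = []
    for i, page in enumerate(pages):
        target = out / f"page_{i:03d}.jpg"
        page.convert("RGB").save(target, "JPEG")
        paths.append(str(target))
    return paths
```

Word documents would first need a separate export to PDF or images; the patent does not prescribe a particular conversion tool.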
Step S130, identifying the specified content in the image of the letter to be identified to obtain an identification result.
It should be understood that, since different letter formats have different requirements on the standardization of the letter and visit documents, the specific content of the specified content may be set according to actual requirements, and the embodiment of the present application is not limited thereto.
For example, the specified content includes at least one of a red header title, a document number, a date, an official seal and a handwritten signature.
It should also be understood that the red header title, the document number, the date, the official seal and the handwritten signature may also be regarded as format elements of the letter.
It should also be understood that, the specific manner of obtaining the recognition result by recognizing the specified content in the image of the letter to be recognized may be set according to actual requirements, and the embodiment of the present application is not limited to this.
Optionally, in the case that the specified content includes a red header title, please refer to fig. 2, where fig. 2 shows a flowchart of a method for identifying a red header title according to an embodiment of the present application. Specifically, the method comprises the following steps:
step S210, selecting the image of the home page content of the letters to be identified from the images of the letters to be identified. The home page content refers to the entire content of the home page.
Step S220, performing Optical Character Recognition (OCR) on the image of the first page content of the letter to be recognized, to obtain a first OCR result.
Step S230, selecting a target text portion from the first OCR result. The target text portion is the text portion with the widest vertical pixel span and the greatest height in the first OCR result.
That is, according to the first OCR result (or OCR character recognition result), the text portion with the widest vertical pixel span and the greatest height among the recognized characters is selected.
Step S240, performing HSV three-channel separation on the target text portion to obtain a chrominance channel, a saturation channel and a luminance channel.
It should be understood that the chrominance channel may also be referred to as the H channel, the saturation channel may also be referred to as the S channel, and the luminance channel may also be referred to as the V channel.
Step S250, according to the range of the red color gamut in the three channels, respectively checking whether the chrominance channel, the saturation channel and the luminance channel of the text portion are within the red color gamut.
If the chrominance channel, the saturation channel and the luminance channel of the text portion are all within the red color gamut, step S260 is executed; if any one of them is not within the red color gamut, step S270 is executed.
Step S260, determining that the letter to be identified has a red header title.
Step S270, determining that the letter to be identified does not have a red header title.
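By way of illustration, steps S240 to S270 may be sketched with OpenCV as follows; the red gamut thresholds (H in [0, 10] or [156, 180], S ≥ 43, V ≥ 46 in OpenCV's 8-bit HSV space) and the per-pixel majority test are assumptions, not values taken from the patent:

```python
import cv2
import numpy as np


def has_red_header(text_region_bgr: np.ndarray, red_ratio: float = 0.5) -> bool:
    """Return True if the target text portion lies in the red color gamut."""
    hsv = cv2.cvtColor(text_region_bgr, cv2.COLOR_BGR2HSV)
    h, s, v = cv2.split(hsv)  # chrominance, saturation, luminance channels
    in_red_gamut = ((h <= 10) | (h >= 156)) & (s >= 43) & (v >= 46)
    # Steps S260/S270: decide by whether most pixels fall inside the gamut.
    return in_red_gamut.mean() >= red_ratio
```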
It should be noted that, besides the above method embodiment, a deep learning model for target detection may also be used to accomplish the same red-header identification task, and the embodiment of the present application is not limited to this.
Optionally, in the case that the specified content includes a document number, please refer to fig. 3, where fig. 3 shows a flowchart of a method for identifying a document number provided in an embodiment of the present application. Specifically, the method shown in fig. 3 includes:
step S310, selecting the image of the home page content of the letter to be identified from the image of the letter to be identified.
Step S320, performing OCR character recognition on the image of the home page content of the letter to be recognized to obtain a first OCR result.
Step S330, matching at least one document number template with the first OCR result to obtain a first matching result.
It should be understood that the manner of obtaining each of the at least one document number template may be set according to actual requirements, and the embodiment of the present application is not limited to this.
For example, the document number formats of all letters in the database may be counted and analyzed, and a plurality of document number templates formed in the form of regular expressions.
Step S340, determining a document number identification result according to the first matching result.
Specifically, if the first matching result indicates that a target document number template matching the first OCR result exists among the at least one document number template, it is determined that a document number exists in the letter to be identified; if the first matching result indicates that none of the document number templates matches the first OCR result, it is determined that no document number exists in the letter to be identified.
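By way of illustration, the template matching of steps S330 and S340 may be sketched as follows; the two document number patterns are hypothetical examples of templates that would, as described above, actually be derived from statistics over the database:

```python
import re

# Hypothetical templates for document numbers such as "X字〔2022〕第15号".
DOCUMENT_NUMBER_TEMPLATES = [
    re.compile(r"[\u4e00-\u9fa5]{1,10}〔\d{4}〕第?\d{1,5}号"),
    re.compile(r"[\u4e00-\u9fa5]{1,10}\[\d{4}\]第?\d{1,5}号"),
]


def match_document_number(first_ocr_result: str) -> str | None:
    """Return the matched document number, or None if no template matches."""
    for template in DOCUMENT_NUMBER_TEMPLATES:
        match = template.search(first_ocr_result)
        if match:
            return match.group(0)  # a document number exists
    return None  # no document number exists
```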
It should be noted here that, besides the above method embodiment, Elasticsearch may also be used to search the current letter to identify whether it has a document number.
Optionally, in a case that the specified content includes a date, please refer to fig. 4, where fig. 4 shows a flowchart of a method for identifying a date according to an embodiment of the present application. Specifically, the method shown in fig. 4 includes:
step S410, selecting an image of the last page content of the letters to be identified from the images of the letters to be identified.
And step S420, performing OCR character recognition on the image of the end page content of the letter to be recognized to obtain a second OCR result.
And step S430, matching the at least one date template with the second OCR result to obtain a second matching result.
It should be understood that the obtaining manner of each of the at least one date template may be set according to actual requirements, and the embodiment of the present application is not limited thereto.
For example, the date formats of all letters in the database may be counted and analyzed, and a plurality of date templates formed in the form of regular expressions.
Step S440, determining a date identification result according to the second matching result.
Specifically, if the second matching result indicates that a target date template matching the second OCR result exists among the at least one date template, it is determined that a date exists in the letter to be identified; if the second matching result indicates that none of the date templates matches the second OCR result, it is determined that no date exists in the letter to be identified.
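Similarly, a sketch of the date matching in steps S430 and S440, with two hypothetical date templates (the common Chinese format "2022年4月25日" and a numeric form); the real templates would again come from analysis of the database:

```python
import re

DATE_TEMPLATES = [
    re.compile(r"\d{4}年\d{1,2}月\d{1,2}日"),      # e.g. 2022年4月25日
    re.compile(r"\d{4}[-/.]\d{1,2}[-/.]\d{1,2}"),  # e.g. 2022-04-25
]


def match_date(second_ocr_result: str) -> str | None:
    """Return the matched date from the end page OCR text, or None."""
    for template in DATE_TEMPLATES:
        match = template.search(second_ocr_result)
        if match:
            return match.group(0)
    return None
```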
It should be noted here that, besides the above method embodiment, Elasticsearch may also be used to search the current letter to identify whether it has a date.
Optionally, in the case that the specified content includes an official seal, please refer to fig. 5, where fig. 5 shows a flowchart of a method for identifying an official seal provided in an embodiment of the present application. Specifically, the method shown in fig. 5 includes:
and step S510, carrying out gray processing on the image of the letter to be identified to obtain a gray image.
Step S520, inputting the gray image into a pre-trained official seal recognition model based on YOLOv5 to obtain an official seal recognition result.
It should be understood that the content included in the official seal identification result may be set according to actual requirements, and the embodiment of the present application is not limited thereto.
For example, the official seal identification result is whether the letter to be identified contains the official seal.
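By way of illustration, steps S510 and S520 may be sketched as follows, assuming the publicly available ultralytics/yolov5 repository loaded through torch.hub; the weight file name seal_best.pt is hypothetical:

```python
import cv2
import torch

# Load a custom-trained YOLOv5 official seal detector (hypothetical weights).
model = torch.hub.load("ultralytics/yolov5", "custom", path="seal_best.pt")


def detect_official_seal(letter_image_path: str) -> bool:
    """Gray the letter image and return True if any official seal is detected."""
    gray = cv2.imread(letter_image_path, cv2.IMREAD_GRAYSCALE)  # step S510
    gray_3ch = cv2.cvtColor(gray, cv2.COLOR_GRAY2BGR)  # YOLOv5 expects 3 channels
    results = model(gray_3ch)  # step S520
    return len(results.xyxy[0]) > 0  # any detected box means a seal is present
```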
It should also be understood that the specific model and layer structure of the official seal identification model and the like may be set according to actual requirements, and the embodiment of the present application is not limited to this.
It should also be understood that the training process of the official seal recognition model may also be set according to actual requirements, and the embodiments of the present application are not limited thereto.
Optionally, after the data set is acquired, the data in the data set needs to be labeled, providing the position information of the official seal for each piece of image data. The label content can be the type of the detected object (official seal), the abscissa and ordinate of the object's midpoint, and the object's length and width. Apart from the detected object type, the other label data needs to be normalized.
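By way of illustration, a sketch of this normalization, assuming YOLO-style text labels in which all coordinates are divided by the image dimensions (the function name is illustrative):

```python
def to_yolo_label(cls_id: int, x_mid: float, y_mid: float,
                  box_w: float, box_h: float,
                  img_w: int, img_h: int) -> str:
    """Normalize a box (midpoint, width, height) into a YOLO label line."""
    return (f"{cls_id} {x_mid / img_w:.6f} {y_mid / img_h:.6f} "
            f"{box_w / img_w:.6f} {box_h / img_h:.6f}")


# e.g. an official seal centered at (500, 400), 180x180 px, on a 1240x1754 page:
# to_yolo_label(0, 500, 400, 180, 180, 1240, 1754)
```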
When the YOLOv5 network structure is used to train the official seal recognition model, the letter-and-visit document images are first grayed; the purpose of graying is to eliminate the influence of color on the trained model and to accelerate model training and detection. After graying, the model is trained on the images and then tested on a test set; the training condition of the model is evaluated with indexes such as accuracy, precision, recall, AP and IoU, the hyper-parameters with the best index results are selected, the model is trained again on the full data set with these hyper-parameters, and the trained model parameters are saved.
It should be noted here that this model training method relies on the YOLOv5 network structure, but model training faces problems such as an insufficient data set, insufficient training and weak generalization ability. Therefore, in order to enlarge the data set of training data containing official seals, increase the generalization ability of the model and obtain a better training effect, more effective data can be provided for model training and testing by means of image fusion.
For example, referring to fig. 6, fig. 6 shows a flowchart of a data enhancement method for training a sample official seal image of an official seal recognition model according to an embodiment of the present application. Specifically, the data enhancement method shown in fig. 6 includes:
step S610, a sample official seal image is obtained.
And step S620, performing transparentization processing on the sample official seal image to obtain a first image to be cloned.
Step S630, fusing and pasting the first image to be cloned into the first background image to obtain a first fused image for training the official seal recognition model; the first background image is a text data image which does not contain official seal.
In other words, in the process of generating model training data, in order to make the generated images closer to real ones, the sample official seal image is first made transparent and used as the image region to be cloned, and an ordinary text data image without an official seal is used as the background picture. The purpose of image fusion is to fuse and paste the region to be cloned into the background picture, generating text data containing an official seal, which together with the original data containing official seals forms the data set.
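By way of illustration, the transparentization step may be sketched as follows, assuming the sample seal was scanned on a near-white background (the 240 threshold is an assumption):

```python
from PIL import Image


def make_seal_transparent(sample_seal_path: str,
                          threshold: int = 240) -> Image.Image:
    """Turn near-white background pixels fully transparent, keeping the seal."""
    img = Image.open(sample_seal_path).convert("RGBA")
    pixels = [
        (r, g, b, 0) if min(r, g, b) > threshold else (r, g, b, a)
        for (r, g, b, a) in img.getdata()
    ]
    img.putdata(pixels)
    return img
```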
It should be understood that, the specific process of obtaining the training data of the official seal recognition model by fusing and pasting the first to-be-cloned image into the first background image may be set according to actual requirements, and the embodiment of the present application is not limited thereto.
Optionally, please refer to fig. 7, and fig. 7 shows a flowchart of an image fusion method provided in an embodiment of the present application. Specifically, the method shown in fig. 7 includes:
step S710, a gradient field of the sample official seal image and a gradient field of the first background image are respectively obtained.
It should be understood that, the specific manner of respectively acquiring the gradient field of the sample official seal image and the gradient field of the first background image may be set according to actual requirements, and the embodiment of the present application is not limited thereto.
For example, the gradient field of the sample official seal image and the gradient field of the first background image may be obtained by a finite-difference method.
Step S720, the gradient field of the sample official seal image is covered on the gradient field of the first background image to obtain the gradient field of the first fusion image.
Step S730, performing a partial derivation on the gradient field of the first fusion image to obtain a first divergence value.
Step S740, processing the first divergence value by using a poisson reconstruction algorithm to obtain a pixel color value of each pixel point in the first fusion image.
It should be understood that the specific process of obtaining the pixel color value of each pixel point in the first fusion image by processing the first divergence value through the poisson reconstruction algorithm may be set according to actual requirements, and the embodiment of the present application is not limited thereto.
For example, Poisson reconstruction reduces to the linear system Ax = b, where b is the divergence, so only the coefficient matrix A needs to be constructed to obtain the pixel values x of the fused image. For a three-channel image, the equation only needs to be solved once per channel to obtain the R, G and B values of each point in the fused image.
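By way of illustration, the gradient-overlay and Poisson reconstruction pipeline of steps S710 to S740 is what OpenCV's seamlessClone function implements, so a practical sketch can delegate to it instead of assembling Ax = b by hand; the choice of MIXED_CLONE here is an assumption, made so that background text under the seal is preserved:

```python
import cv2
import numpy as np


def fuse_seal_into_page(seal_bgr: np.ndarray, page_bgr: np.ndarray,
                        center_xy: tuple[int, int]) -> np.ndarray:
    """Poisson-fuse a seal image into a background text page at center_xy."""
    mask = np.full(seal_bgr.shape[:2], 255, dtype=np.uint8)  # clone whole region
    return cv2.seamlessClone(seal_bgr, page_bgr, mask, center_xy,
                             cv2.MIXED_CLONE)
```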
It should be noted here that, besides the above method embodiment, a two-stage target detection algorithm may also be used to identify whether an official seal exists in the current letter.
Optionally, in a case that the designated content includes a handwritten signature, please refer to fig. 8, and fig. 8 shows a flowchart of a method for recognizing a handwritten signature according to an embodiment of the present application. Specifically, the method shown in fig. 8 includes:
and step S810, carrying out gray processing on the image of the letter to be identified to obtain a gray image.
Step S820, inputting the gray image into a pre-trained handwriting signature recognition model based on YOLOv5 to obtain a handwriting signature recognition result.
It should also be understood that the specific model and layer structure of the handwritten signature recognition model may be set according to actual requirements, and the embodiment of the present application is not limited to this.
It should also be understood that the training process of the handwritten signature recognition model may also be set according to actual requirements, and the embodiments of the present application are not limited thereto.
Optionally, after the data set is acquired, the data in the data set needs to be labeled, providing the position information of the handwritten signature for each piece of image data. The label content can be the type of the detected object (handwritten signature), the abscissa and ordinate of the object's midpoint, and the object's length and width. Apart from the detected object type, the other label data needs to be normalized.
When the YOLOv5 network structure is used to train the handwritten signature recognition model, the letter images are likewise grayed to eliminate the influence of color on the trained model and to accelerate model training and detection. After graying, the model is trained on the images and then tested on a test set; the training condition of the model is evaluated with indexes such as accuracy, precision, recall, AP and IoU, the hyper-parameters with the best index results are selected, the model is trained again on the full data set with these hyper-parameters, and the trained model parameters are saved.
It should be noted here that this model training method relies on the YOLOv5 network structure, but model training faces problems such as an insufficient data set, insufficient training and weak generalization ability. Therefore, in order to enlarge the data set of training data containing handwritten signatures, increase the generalization ability of the model and obtain a better training effect, more effective data can be provided for model training and testing by means of image fusion.
For example, referring to fig. 9, fig. 9 is a flowchart illustrating a data enhancement method for training a sample handwritten signature image of a handwritten signature recognition model according to an embodiment of the present application. Specifically, the data enhancement method shown in fig. 9 includes:
step S910, a sample handwritten signature image is obtained.
And step S920, performing transparentization processing on the sample handwritten signature image to obtain a second image to be cloned.
Step S930, fusing and pasting the second image to be cloned into a second background image to obtain a second fused image for training the handwritten signature recognition model; the second background image is a text data image not containing a handwritten signature.
In other words, in the process of generating model training data, in order to make the generated images closer to real ones, the sample handwritten signature image is first made transparent and used as the image region to be cloned, and an ordinary text data image without a handwritten signature is used as the background picture. The purpose of image fusion is to fuse and paste the region to be cloned into the background picture, generating text data containing a handwritten signature, which together with the original data containing handwritten signatures forms the data set.
It should be understood that the specific process of obtaining the training data of the handwritten signature recognition model by fusing and pasting the second image to be cloned into the second background image may be set according to actual requirements, and the embodiment of the present application is not limited thereto.
Optionally, referring to fig. 10, fig. 10 shows a flowchart of another image fusion method provided in the embodiment of the present application. Specifically, the method shown in fig. 10 includes:
step S1010, a gradient field of the sample handwritten signature image and a gradient field of the second background image are respectively obtained.
It should be understood that the specific manner of respectively acquiring the gradient field of the sample handwritten signature image and the gradient field of the second background image may be set according to actual requirements, and the embodiments of the present application are not limited thereto.
For example, the gradient field of the sample handwritten signature image and the gradient field of the second background image may be obtained by a finite-difference method.
Step S1020, the gradient field of the sample handwritten signature image is overlaid on the gradient field of the second background image to obtain a gradient field of a second fusion image.
Step S1030, performing partial derivation on the gradient field of the second fused image to obtain a second divergence value.
Step S1040, processing the second divergence value by using a poisson reconstruction algorithm, to obtain a pixel color value of each pixel point in the second fusion image.
It should be understood that the specific process of obtaining the pixel color value of each pixel point in the second fused image by processing the second divergence value through the poisson reconstruction algorithm is similar to the specific process of step S740, and specific reference may be made to the related description of step S740.
It should also be noted here that, besides the above method embodiment, a two-stage target detection algorithm may also be used to identify whether a handwritten signature exists on the current letter.
It should be understood that the above method for identifying letters is only exemplary, and those skilled in the art can make various changes, modifications or variations according to the above method and also fall within the protection scope of the present application.
Referring to fig. 11, fig. 11 is a block diagram illustrating a device 1100 for identifying a letter according to an embodiment of the present application. It should be understood that the apparatus 1100 corresponds to the above method embodiment, and can perform the steps related to the above method embodiment, and the specific functions can be referred to the above description, and the detailed description is appropriately omitted here to avoid redundancy. The device 1100 includes at least one software functional module that can be stored in a memory in the form of software or firmware (firmware) or solidified in an Operating System (OS) of the device 1100. Specifically, the apparatus 1100 includes:
an acquisition module 1110, configured to acquire the letter to be identified;
a format conversion module 1120, configured to perform format conversion on the letter to be identified to obtain an image of the letter to be identified;
an identification module 1130, configured to identify the specified content in the image of the letter to be identified to obtain an identification result; wherein the specified content includes at least one of a red header title, a document number, a date, an official seal and a handwritten signature.
In one possible embodiment, the specified content includes a red header title; the identification module 1130 is specifically configured to: perform Optical Character Recognition (OCR) on an image of the home page content of the letter to be recognized to obtain a first OCR result; select a target text portion from the first OCR result, wherein the target text portion is the text portion with the widest vertical pixel span and the greatest height in the first OCR result; perform HSV three-channel separation on the target text portion to obtain a chrominance channel, a saturation channel and a luminance channel; if the chrominance channel, the saturation channel and the luminance channel are all within the red color gamut, determine that the letter to be identified has a red header title; and if any one of the chrominance channel, the saturation channel and the luminance channel is not within the red color gamut, determine that the letter to be identified does not have a red header title.
In one possible embodiment, the specified content includes a document number; the identification module 1130 is specifically configured to: perform Optical Character Recognition (OCR) on an image of the home page content of the letter to be recognized to obtain a first OCR result; match at least one document number template with the first OCR result; if a target document number template matching the first OCR result exists among the at least one document number template, determine that a document number exists in the letter to be identified; and if none of the document number templates matches the first OCR result, determine that no document number exists in the letter to be identified.
In one possible embodiment, the specified content includes a date; the identification module 1130 is specifically configured to: perform Optical Character Recognition (OCR) on the image of the end page content of the letter to be recognized to obtain a second OCR result; match at least one date template with the second OCR result; if a target date template matching the second OCR result exists among the at least one date template, determine that a date exists in the letter to be identified; and if none of the date templates matches the second OCR result, determine that no date exists in the letter to be identified.
In one possible embodiment, the specified content includes an official seal; the identification module 1130 is specifically configured to: perform grayscale processing on the image of the letter to be identified to obtain a grayscale image; and input the grayscale image into a pre-trained official seal recognition model based on YOLOv5 to obtain an official seal recognition result.
In one possible embodiment, the apparatus 1100 further comprises a first data enhancement module (not shown);
the first data enhancement module is specifically configured to: acquiring a sample official seal image; performing transparentization processing on the sample official seal image to obtain a first image to be cloned; fusing and pasting the first image to be cloned into the first background image to obtain a first fused image for training the official seal recognition model; the first background image is a text data image which does not contain official seal.
In a possible embodiment, the acquisition module 1110 is further configured to: respectively obtain the gradient field of the sample official seal image and the gradient field of the first background image; overlay the gradient field of the sample official seal image on the gradient field of the first background image to obtain the gradient field of the first fused image; take partial derivatives of the gradient field of the first fused image to obtain a first divergence value; and process the first divergence value with a Poisson reconstruction algorithm to obtain the pixel color value of each pixel in the first fused image.
In one possible embodiment, the specified content includes a handwritten signature; the identification module 1130 is specifically configured to: perform grayscale processing on the image of the letter to be identified to obtain a grayscale image; and input the grayscale image into a pre-trained handwritten signature recognition model based on YOLOv5 to obtain a handwritten signature recognition result.
In a possible embodiment, the acquisition module 1110 is further configured to: acquire a sample handwritten signature image; perform transparentization processing on the sample handwritten signature image to obtain a second image to be cloned; and fuse and paste the second image to be cloned into a second background image to obtain a second fused image for training the handwritten signature recognition model; the second background image is a text data image that does not contain a handwritten signature.
In one possible embodiment, the apparatus 1100 further comprises a second data enhancement module (not shown);
the second data enhancement module is specifically configured to: respectively acquiring a gradient field of a sample handwritten signature image and a gradient field of a second background image; covering the gradient field of the sample handwritten signature image on the gradient field of a second background image to obtain a gradient field of a second fusion image; performing partial derivation on the gradient field of the second fusion image to obtain a second divergence value; and processing the second divergence value by using a Poisson reconstruction algorithm to obtain the pixel color value of each pixel point in the second fusion image.
It can be clearly understood by those skilled in the art that, for convenience and simplicity of description, the specific working process of the apparatus described above may refer to the corresponding process in the foregoing method, and redundant description is not repeated here.
Referring to fig. 12, fig. 12 is a block diagram illustrating a structure of an electronic device 1200 provided in an embodiment of the present application. As shown in fig. 12, the electronic device 1200 may include a processor 1210, a communication interface 1220, a memory 1230, and at least one communication bus 1240. The communication bus 1240 is used to enable direct, connected communication between these components. The communication interface 1220 in the embodiment of the present application is used for communicating signaling or data with other devices. The processor 1210 may be an integrated circuit chip having signal processing capabilities. The processor 1210 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components, capable of implementing or performing the various methods, steps, and logic blocks disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor, or the processor 1210 may be any conventional processor or the like.
The memory 1230 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), and the like. The memory 1230 stores computer-readable instructions that, when executed by the processor 1210, enable the electronic device 1200 to perform the steps involved in the above method embodiments.
The electronic device 1200 may further include a memory controller, an input-output unit, an audio unit, and a display unit.
The memory 1230, the memory controller, the processor 1210, the peripheral interface, the input/output unit, the audio unit, and the display unit are electrically connected to each other directly or indirectly to implement data transmission or interaction. For example, these components may be electrically coupled to each other via one or more communication buses 1240. The processor 1210 is configured to execute the executable modules stored in the memory 1230. The electronic device 1200 is configured to perform the following method: acquiring a letter to be identified; performing format conversion on the letter to be identified to obtain an image of the letter to be identified; and identifying the specified content in the image of the letter to be identified to obtain an identification result; wherein the specified content includes at least one of a red header title, a document number, a date, an official seal and a handwritten signature.
The input and output unit is used for providing input data for a user to realize the interaction of the user and the server (or the local terminal). The input/output unit may be, but is not limited to, a mouse, a keyboard, and the like.
The audio unit provides an audio interface to the user, which may include one or more microphones, one or more speakers, and audio circuitry.
The display unit provides an interactive interface (e.g., a user interface) between the electronic device and a user, or displays image data for the user's reference. In this embodiment, the display unit may be a liquid crystal display or a touch display. In the case of a touch display, it can be a capacitive or resistive touch screen supporting single-point and multi-point touch operations, meaning that the touch display can sense touch operations generated simultaneously from one or more positions on it and pass the sensed touch operations to the processor for calculation and processing.
It is to be understood that the configuration shown in fig. 12 is merely exemplary, and that the electronic device 1200 may include more or fewer components than shown in fig. 12, or have a different configuration than shown in fig. 12. The components shown in fig. 12 may be implemented in hardware, software, or a combination thereof.
The present application also provides a storage medium having a computer program stored thereon, which, when executed by a processor, performs the method of the method embodiments.
The present application also provides a computer program product which, when run on a computer, causes the computer to perform the method of the method embodiments.
It can be clearly understood by those skilled in the art that, for convenience and simplicity of description, the specific working process of the system described above may refer to the corresponding process in the foregoing method, and redundant description is not repeated here.
It should be noted that, in this specification, each embodiment is described in a progressive manner, and each embodiment focuses on differences from other embodiments, and portions that are the same as and similar to each other in each embodiment may be referred to. For the device-like embodiment, since it is basically similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative and, for example, the flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk, and various media capable of storing program codes. It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The above description concerns only preferred embodiments of the present application and is not intended to limit the present application; those skilled in the art may make various modifications and changes to it. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall fall within its protection scope. It should be noted that like reference numerals and letters denote like items across the figures; once an item is defined in one figure, it need not be further defined or explained in subsequent figures.
The above description covers only specific embodiments of the present application, but the protection scope of the present application is not limited thereto; any changes or substitutions that a person skilled in the art could readily conceive within the technical scope disclosed in the present application shall be covered by that scope. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A method of identifying a letter, comprising:
acquiring a letter to be identified;
performing format conversion on the letter to be identified to obtain an image of the letter to be identified;
identifying the specified content in the image of the letter to be identified to obtain an identification result; wherein the specified content comprises at least one of a red header title, a character number, a date, an official seal and a handwritten signature; the specified content includes the red header title; wherein the identifying the specified content in the image of the letter to be identified to obtain the identification result comprises: performing Optical Character Recognition (OCR) on the image of the first-page content of the letter to be identified to obtain a first OCR result; selecting a target text portion from the first OCR result, wherein the target text portion is the text portion in the first OCR result whose pixel span is the widest and whose height is the greatest; performing HSV three-channel separation on the target text portion to obtain a hue channel, a saturation channel and a value (brightness) channel; if the hue channel, the saturation channel and the value channel all fall within the red color gamut, determining that the red header title exists in the letter to be identified; and if any one of the hue channel, the saturation channel and the value channel does not fall within the red color gamut, determining that the red header title does not exist in the letter to be identified.
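By way of illustration only, a minimal Python sketch of the red-gamut check in claim 1 might look as follows. It assumes OpenCV is available, that an upstream OCR engine supplies the bounding box of the target text portion, and that the hue/saturation/value thresholds are illustrative placeholders rather than values taken from this application:

```python
import cv2
import numpy as np

def has_red_header(first_page_bgr: np.ndarray, text_box: tuple) -> bool:
    """Decide whether the OCR-selected text portion lies in the red gamut.

    text_box is a hypothetical (x, y, w, h) box for the widest and tallest
    text portion of the first page; all thresholds are illustrative only.
    """
    x, y, w, h = text_box
    region = first_page_bgr[y:y + h, x:x + w]
    hsv = cv2.cvtColor(region, cv2.COLOR_BGR2HSV)
    # Red wraps around hue 0 on OpenCV's 0-179 hue scale, so test two
    # bands; inRange checks all three HSV channels at once.
    low_band = cv2.inRange(hsv, (0, 80, 80), (10, 255, 255))
    high_band = cv2.inRange(hsv, (170, 80, 80), (179, 255, 255))
    red_ratio = np.count_nonzero(low_band | high_band) / float(w * h)
    return red_ratio > 0.5  # most of the region must be red
```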
2. The method according to claim 1, wherein the specified content includes the character number;
wherein the identifying the specified content in the image of the letter to be identified to obtain the identification result further comprises:
performing Optical Character Recognition (OCR) on the image of the first-page content of the letter to be identified to obtain a first OCR result;
matching at least one character number template against the first OCR result;
if a target character number template matching the first OCR result exists among the at least one character number template, determining that the character number exists in the letter to be identified;
and if none of the at least one character number template matches the first OCR result, determining that the character number does not exist in the letter to be identified.
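As a hedged illustration of the template matching in claim 2, the character number templates could be implemented as regular expressions run over the first OCR result. The two patterns below (covering forms like 京信发〔2022〕12号) are hypothetical examples, not the templates actually used by this application:

```python
import re

# Hypothetical character number templates; a deployment would load
# agency-specific patterns from configuration.
CHARACTER_NUMBER_TEMPLATES = [
    re.compile(r"[\u4e00-\u9fa5]{1,8}[发函字]\s*[〔\[（(]\s*\d{4}\s*[〕\]）)]\s*\d+\s*号"),
    re.compile(r"第\s*\d+\s*号"),
]

def find_character_number(first_ocr_text: str):
    """Return the first matching character number, or None if absent."""
    for template in CHARACTER_NUMBER_TEMPLATES:
        match = template.search(first_ocr_text)
        if match:
            return match.group(0)
    return None
```

A None return would mark the letter to be identified as lacking a character number.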
3. The method of claim 1, wherein the specified content includes the date;
wherein the identifying the specified content in the image of the letter to be identified to obtain the identification result further comprises:
performing Optical Character Recognition (OCR) on the image of the last-page content of the letter to be identified to obtain a second OCR result;
matching at least one date template against the second OCR result;
if a target date template matching the second OCR result exists among the at least one date template, determining that the date exists in the letter to be identified;
and if none of the at least one date template matches the second OCR result, determining that the date does not exist in the letter to be identified.
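The date templates of claim 3 admit the same treatment over the second OCR result; the two illustrative patterns below cover Arabic-numeral and Chinese-numeral dates and are, again, assumptions rather than the application's actual templates:

```python
import re

DATE_TEMPLATES = [
    # e.g. 2022年4月26日
    re.compile(r"\d{4}\s*年\s*\d{1,2}\s*月\s*\d{1,2}\s*日"),
    # e.g. 二〇二二年四月二十六日
    re.compile(r"[〇零一二三四五六七八九]{4}年[一二三四五六七八九十]{1,2}月[一二三四五六七八九十]{1,3}日"),
]

def has_date(second_ocr_text: str) -> bool:
    """True if any date template matches the last-page OCR result."""
    return any(t.search(second_ocr_text) for t in DATE_TEMPLATES)
```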
4. The method according to claim 1, wherein the specified content includes the official seal;
wherein the identifying the specified content in the image of the letter to be identified to obtain the identification result further comprises:
performing grayscale processing on the image of the letter to be identified to obtain a grayscale image;
and inputting the grayscale image into a pre-trained YOLOv5-based official seal recognition model to obtain an official seal recognition result.
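A sketch of the inference step of claim 4, assuming the public ultralytics/yolov5 hub API and a hypothetical fine-tuned checkpoint named seal_yolov5s.pt; the same structure would serve the handwritten signature model of claim 7:

```python
import cv2
import torch

# Hypothetical checkpoint name; the hub call is the documented way to
# load a custom YOLOv5 model.
model = torch.hub.load("ultralytics/yolov5", "custom", path="seal_yolov5s.pt")

def detect_official_seal(letter_image_path: str):
    """Run the grayscale letter image through the seal detector."""
    gray = cv2.imread(letter_image_path, cv2.IMREAD_GRAYSCALE)
    # YOLOv5 expects three channels, so replicate the grayscale plane.
    gray_3c = cv2.cvtColor(gray, cv2.COLOR_GRAY2RGB)
    results = model(gray_3c)
    return results.xyxy[0]  # rows of (x1, y1, x2, y2, confidence, class)
```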
5. The method of claim 4, wherein the data enhancement processing of sample official seal images used to train the official seal recognition model comprises:
acquiring the sample official seal image;
performing transparentization processing on the sample official seal image to obtain a first image to be cloned;
and fusing and pasting the first image to be cloned into a first background image to obtain a first fused image for training the official seal recognition model; wherein the first background image is a text data image that does not contain an official seal.
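The transparentization of claim 5 (and, analogously, claim 8) can be sketched as turning the near-white background pixels of a cropped seal transparent; the whiteness threshold of 230 is an illustrative assumption:

```python
import numpy as np

def transparentize(sample_bgr: np.ndarray) -> np.ndarray:
    """Attach an alpha channel that hides near-white background pixels."""
    # A pixel counts as background if all three channels are near white.
    alpha = np.where(sample_bgr.min(axis=2) > 230, 0, 255).astype(np.uint8)
    return np.dstack([sample_bgr, alpha])  # BGRA image to be cloned
```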
6. The method according to claim 5, wherein the fusing and pasting the first image to be cloned into a first background image to obtain a first fused image for training the official seal recognition model comprises:
respectively acquiring the gradient field of the sample official seal image and the gradient field of the first background image;
overlaying the gradient field of the sample official seal image onto the gradient field of the first background image to obtain the gradient field of the first fused image;
taking partial derivatives of the gradient field of the first fused image to obtain a first divergence value;
and processing the first divergence value with a Poisson reconstruction algorithm to obtain the pixel color value of each pixel in the first fused image.
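The gradient-field overlay, divergence and Poisson reconstruction of claim 6 (and of claim 9 for signatures) form the classic Poisson image editing pipeline, and OpenCV's seamlessClone carries out these same steps internally, so a sketch can delegate to it; whether the application solves the Poisson equation itself or via a library is not specified:

```python
import cv2
import numpy as np

def poisson_paste(bgra_to_clone: np.ndarray,
                  background_bgr: np.ndarray,
                  center_xy: tuple) -> np.ndarray:
    """Fuse a transparentized image into a background via Poisson blending.

    seamlessClone overlays the source gradient field on the target's,
    takes the divergence, and solves the resulting Poisson equation for
    the fused pixel values, as claim 6 describes.
    """
    source_bgr = np.ascontiguousarray(bgra_to_clone[:, :, :3])
    mask = np.ascontiguousarray(bgra_to_clone[:, :, 3])  # alpha as mask
    # NORMAL_CLONE replaces the target gradients with the source gradients
    # inside the mask, matching the "covering" step of claim 6;
    # cv2.MIXED_CLONE would instead keep the stronger of the two gradients.
    return cv2.seamlessClone(source_bgr, background_bgr, mask,
                             center_xy, cv2.NORMAL_CLONE)
```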
7. The method of claim 1, wherein the specified content comprises the handwritten signature;
wherein the identifying the specified content in the image of the letter to be identified to obtain the identification result further comprises:
performing grayscale processing on the image of the letter to be identified to obtain a grayscale image;
and inputting the grayscale image into a pre-trained YOLOv5-based handwritten signature recognition model to obtain a handwritten signature recognition result.
8. The method of claim 7, wherein the data enhancement processing of sample handwritten signature images used to train the handwritten signature recognition model comprises:
acquiring a sample handwritten signature image;
performing transparentization processing on the sample handwritten signature image to obtain a second image to be cloned;
and fusing and pasting the second image to be cloned into a second background image to obtain a second fused image for training the handwritten signature recognition model; wherein the second background image is a text data image that does not contain a handwritten signature.
9. The method according to claim 8, wherein the fusing and pasting the second image to be cloned into a second background image to obtain a second fused image for training the handwritten signature recognition model comprises:
respectively acquiring the gradient field of the sample handwritten signature image and the gradient field of the second background image;
overlaying the gradient field of the sample handwritten signature image onto the gradient field of the second background image to obtain the gradient field of the second fused image;
taking partial derivatives of the gradient field of the second fused image to obtain a second divergence value;
and processing the second divergence value with a Poisson reconstruction algorithm to obtain the pixel color value of each pixel in the second fused image.
10. An apparatus for identifying a letter, comprising:
the acquisition module is used for acquiring a letter to be identified;
the format conversion module is used for performing format conversion on the letter to be identified to obtain an image of the letter to be identified;
the identification module is used for identifying the specified content in the image of the letter to be identified to obtain an identification result; wherein the specified content comprises at least one of a red header title, a character number, a date, an official seal and a handwritten signature; the specified content includes the red header title; wherein the identification module is specifically configured to: perform Optical Character Recognition (OCR) on the image of the first-page content of the letter to be identified to obtain a first OCR result; select a target text portion from the first OCR result, wherein the target text portion is the text portion in the first OCR result whose pixel span is the widest and whose height is the greatest; perform HSV three-channel separation on the target text portion to obtain a hue channel, a saturation channel and a value (brightness) channel; if the hue channel, the saturation channel and the value channel all fall within the red color gamut, determine that the red header title exists in the letter to be identified; and if any one of the hue channel, the saturation channel and the value channel does not fall within the red color gamut, determine that the red header title does not exist in the letter to be identified.
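Purely as an illustration of how the apparatus of claim 10 could be wired together, the sketch below uses hypothetical placeholders for the PDF rasterizer and the per-content checkers, which a caller would inject:

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class LetterIdentificationApparatus:
    """Acquisition, format conversion and identification modules of claim 10."""
    rasterize: Callable[[bytes], List]               # format conversion backend
    checkers: Dict[str, Callable] = field(default_factory=dict)

    def acquire(self, path: str) -> bytes:
        """Acquisition module: read the letter to be identified."""
        with open(path, "rb") as f:
            return f.read()

    def identify(self, path: str) -> Dict[str, object]:
        """Format conversion followed by identification of specified content."""
        pages = self.rasterize(self.acquire(path))
        return {name: check(pages) for name, check in self.checkers.items()}
```

A caller might register, for example, a red header checker built from the claim 1 sketch and a date checker from the claim 3 sketch, then call identify() on each incoming letter file.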
CN202210441221.7A 2022-04-26 2022-04-26 Method and device for identifying letters and visitors Active CN114550194B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210441221.7A CN114550194B (en) 2022-04-26 2022-04-26 Method and device for identifying letters and visitors

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210441221.7A CN114550194B (en) 2022-04-26 2022-04-26 Method and device for identifying letters and visitors

Publications (2)

Publication Number Publication Date
CN114550194A CN114550194A (en) 2022-05-27
CN114550194B (en) 2022-08-19

Family

ID=81666731

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210441221.7A Active CN114550194B (en) 2022-04-26 2022-04-26 Method and device for identifying letters and visitors

Country Status (1)

Country Link
CN (1) CN114550194B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105528604A (en) * 2016-01-31 2016-04-27 华南理工大学 Bill automatic identification and processing system based on OCR
CN110866388A (en) * 2019-11-19 2020-03-06 重庆华龙网海数科技有限公司 Publishing PDF layout analysis and identification method based on mixing of multiple neural networks
CN112132710A (en) * 2020-09-23 2020-12-25 平安国际智慧城市科技股份有限公司 Legal element processing method and device, electronic equipment and storage medium
CN113901933A (en) * 2021-10-14 2022-01-07 中国平安人寿保险股份有限公司 Electronic invoice information extraction method, device and equipment based on artificial intelligence
CN113903024A (en) * 2021-09-28 2022-01-07 合肥高维数据技术有限公司 Handwritten bill numerical value information identification method, system, medium and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TW201115252A (en) * 2009-10-26 2011-05-01 Avermedia Information Inc Document camera with image-associated data searching and displaying function and method applied thereto

Also Published As

Publication number Publication date
CN114550194A (en) 2022-05-27

Similar Documents

Publication Publication Date Title
US10572725B1 (en) Form image field extraction
US8988450B1 (en) Color palette maps for color-aware search
KR102002024B1 (en) Method for processing labeling of object and object management server
CN105654057A (en) Picture auditing system and picture auditing method based on picture contents
US20210064859A1 (en) Image processing system, image processing method, and storage medium
CN112036295B (en) Bill image processing method and device, storage medium and electronic equipment
WO2023029353A1 (en) Service data processing method and apparatus based on multi-modal hybrid model
CN104978565A (en) Universal on-image text extraction method
US9471676B1 (en) System and method for suggesting keywords based on image contents
US20150206031A1 (en) Method and system of identifying an entity from a digital image of a physical text
CN110532449B (en) Method, device, equipment and storage medium for processing service document
RU2603495C1 (en) Classification of document images based on parameters of colour layers
CN109145907B (en) Text image inversion detection method and device based on common word frequency statistics
CN110991303A (en) Method and device for positioning text in image and electronic equipment
WO2019149065A1 (en) Sticker-compatible display method, device, terminal, and computer readable storage medium
CN114550194B (en) Method and device for identifying letters and visitors
US20220067361A1 (en) Form processing and analysis system
US20220067469A1 (en) Method for generating word code, method and device for recognizing codes
CN114303352B (en) Push content processing method and device, electronic equipment and storage medium
WO2013145445A1 (en) Color evaluation device, color evaluation method, and computer program
JP6964891B2 (en) Counter business management device, counter business management method and counter business management program
US11170211B2 (en) Information processing apparatus for extracting portions filled with characters from completed document without user intervention and non-transitory computer readable medium
CN116018623A (en) Improved product label inspection
KR20220014015A (en) Texts of goods analyzing and matching method
JP7190768B2 (en) Window business management device, window business management method, and window business management program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: A method and device for identifying letters and visits

Granted publication date: 20220819

Pledgee: Beijing first financing Company limited by guarantee

Pledgor: BEIJING PEKING UNIVERSITY SOFTWARE ENGINEERING CO.,LTD.

Registration number: Y2024980019811