CN111291726A - Medical bill sorting method, device, equipment and medium - Google Patents

Publication number
CN111291726A
Authority
CN
China
Prior art keywords
bill
bill image
ellipse
image
medical
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010169928.8A
Other languages
Chinese (zh)
Other versions
CN111291726B (en)
Inventor
王亚领
刘设伟
沈程秀
马文伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Taikang Insurance Group Co Ltd
Taikang Online Property Insurance Co Ltd
Original Assignee
Taikang Insurance Group Co Ltd
Taikang Online Property Insurance Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Taikang Insurance Group Co Ltd, Taikang Online Property Insurance Co Ltd
Priority to CN202010169928.8A
Publication of CN111291726A
Application granted
Publication of CN111291726B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/412 Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/255 Detecting or recognising potential candidate objects based on visual cues, e.g. shapes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a medical bill sorting method, device, equipment and medium, which are used for improving the efficiency of medical bill sorting. The method comprises: identifying each ellipse in a bill image and the lengths of the major axis and the minor axis of each ellipse; judging whether a target ellipse whose ratio of major axis to minor axis is within a preset ratio range exists in the bill image; if so, determining an area containing the target ellipse in the bill image; inputting the area into a trained character recognition model, and determining a character recognition result of the character recognition model for the area; and if the character recognition result comprises a preset keyword corresponding to the medical bill, determining that the bill image is a bill image of a medical bill. Because only the characters at the header of a bill image that contains the target ellipse are recognized, and the bill image of a medical bill is identified from the character recognition result and the preset keywords, the efficiency and accuracy of medical bill sorting are improved.

Description

Medical bill sorting method, device, equipment and medium
Technical Field
The invention relates to the technical field of computers, in particular to a medical bill sorting method, a medical bill sorting device, medical bill sorting equipment and a medical bill sorting medium.
Background
In the insurance claim verification stage, a client uploads a number of claim documents, among which the information on medical bills is particularly important. Quickly and effectively sorting the medical bills out of the uploaded claim documents is a key step before performing Optical Character Recognition (OCR) on the whole bill, and a simple, fast and effective medical bill sorting method is a strong guarantee of successful OCR.
In most scenarios, existing medical bill sorting still relies on manual visual inspection, which requires a large amount of labor and is slow. To improve sorting efficiency, the prior art can sort bills with an artificial intelligence deep learning model that recognizes the text content of a bill image and classifies the bill image accordingly. However, this approach must recognize all of the text in every bill image before classifying the bill, so the efficiency of medical bill sorting remains low.
Disclosure of Invention
The embodiments of the invention provide a medical bill sorting method, device, equipment and medium, which are used for solving the problem that existing medical bill sorting methods waste the resources of the artificial intelligence deep learning model and therefore sort medical bills inefficiently.
The embodiment of the invention provides a medical bill sorting method, which comprises the following steps:
identifying each ellipse in the bill image and the lengths of the major axis and the minor axis of each ellipse;
judging whether a target ellipse with the ratio of the long axis to the short axis within a preset ratio range exists in the bill image;
if yes, determining an area containing the target ellipse in the bill image;
inputting the region into a trained character recognition model, and determining a character recognition result of the character recognition model for the region;
and if the character recognition result comprises a preset keyword corresponding to the medical bill, determining that the bill image is the bill image of the medical bill.
Further, the method further comprises:
and if the bill image does not have a target ellipse with the ratio of the long axis to the short axis within a preset ratio range, determining that the bill image is not the bill image of the medical bill.
Further, the method further comprises:
judging whether a first target character group matched with a preset region keyword exists in the character recognition result;
and if so, determining the region to which the bill image belongs according to the matched first target character group.
Further, the determining the region containing the target ellipse in the bill image comprises:
identifying the center position of the target ellipse in the bill image;
and determining a rectangular area centered on that center position according to the center position of the target ellipse and a preset length value and width value.
Further, before determining a rectangular region with the center position as a center position according to the center position of the target ellipse and a preset length value and width value, the method further includes:
identifying an included angle between a target ellipse in the bill image and a preset reference direction;
and rotating the bill image to the preset reference direction according to the included angle.
Further, the identifying an included angle between the target ellipse in the bill image and a preset reference direction includes:
identifying an included angle between the long axis direction of the target ellipse in the bill image and the horizontal direction; or
identifying an included angle between the short axis direction of the target ellipse in the bill image and the vertical direction.
Further, the method further comprises:
judging whether a second target character group matched with a preset category keyword exists in the character recognition result;
and if so, determining the category to which the bill image belongs according to the matched second target character group.
Accordingly, an embodiment of the present invention provides a medical bill sorting apparatus, including:
the identification module is used for identifying each ellipse in the bill image and the length of the long axis and the short axis of each ellipse;
the judging module is used for judging whether a target ellipse with the ratio of the long axis to the short axis within a preset ratio range exists in the bill image;
the determining module is used for determining an area containing the target ellipse in the bill image if the target ellipse exists in the bill image; inputting the region into a trained character recognition model, and determining a character recognition result of the character recognition model for the region; and if the character recognition result comprises a preset keyword corresponding to the medical bill, determining that the bill image is the bill image of the medical bill.
Further, the determining module is further configured to determine that the bill image is not a bill image of the medical bill if a target ellipse with a ratio of the major axis to the minor axis within a preset ratio range does not exist in the bill image.
Further, the judging module is further configured to judge whether a first target text group matching a preset region keyword exists in the text recognition result;
and the determining module is further used for determining the region to which the bill image belongs according to the matched first target character group if the first target character group exists.
Further, the determining module is specifically configured to identify the center position of the target ellipse in the bill image, and to determine a rectangular area centered on that center position according to the center position of the target ellipse and a preset length value and width value.
Further, the identification module is further configured to identify an included angle between a target ellipse in the bill image and a preset reference direction;
the device further comprises:
and the rotating module is used for rotating the bill image to the preset reference direction according to the included angle.
Further, the identification module is specifically configured to identify an included angle between a major axis direction of the target ellipse in the bill image and a horizontal direction; or identifying the included angle between the short axis direction and the vertical direction of the target ellipse in the bill image.
Further, the judging module is further configured to judge whether a second target character group matching a preset category keyword exists in the character recognition result;
and the determining module is further used for determining the category to which the bill image belongs according to the matched second target character group if the second target character group exists.
Accordingly, embodiments of the present invention provide an electronic device comprising a processor and a memory, the memory being configured to store program instructions, and the processor being configured to implement the steps of any one of the above-described medical bill sorting methods when executing a computer program stored in the memory.
Accordingly, embodiments of the present invention provide a computer-readable storage medium storing a computer program, which when executed by a processor implements the steps of any of the above-mentioned medical bill sorting methods.
The embodiments of the invention provide a medical bill sorting method, device, equipment and medium. The method comprises: identifying each ellipse in a bill image and the lengths of the major axis and the minor axis of each ellipse; judging whether a target ellipse whose ratio of major axis to minor axis is within a preset ratio range exists in the bill image; if so, determining an area containing the target ellipse in the bill image; inputting the area into a trained character recognition model, and determining a character recognition result of the character recognition model for the area; and if the character recognition result comprises a preset keyword corresponding to the medical bill, determining that the bill image is a bill image of a medical bill. By identifying the target ellipse in the bill image, determining the area that contains the target ellipse and the header characters, recognizing only the characters at the header of the bill image, and identifying the bill image of a medical bill from the character recognition result and the preset keywords, the method improves the efficiency and accuracy of medical bill sorting.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a schematic process diagram of a medical bill sorting method according to an embodiment of the present invention;
FIG. 2 is a schematic view of a medical bill image according to an embodiment of the present invention;
FIG. 3 is a schematic process diagram of another medical bill sorting method according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a determined rectangular area provided by an embodiment of the present invention;
FIG. 5 is a schematic diagram of an image of a rectangular region input to a character recognition model according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of a medical bill sorting device according to an embodiment of the present invention;
FIG. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail with reference to the accompanying drawings, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In order to improve the efficiency of medical bill sorting, the embodiment of the invention provides a medical bill sorting method, a medical bill sorting device, medical bill sorting equipment and a medical bill sorting medium.
Example 1:
FIG. 1 is a schematic process diagram of a medical bill sorting method according to an embodiment of the present invention, where the process includes the following steps:
S101: identifying each ellipse in the bill image and the lengths of the major axis and the minor axis of each ellipse.
The medical bill sorting method is applied to an electronic device such as a smart phone, a PC, a server or a tablet computer. The various bills are converted into bill images, each bill image is input into the electronic device, and the electronic device identifies each received bill image in turn.
A bill image generally includes several stamps, which are generally elliptical, and the ratio of the major axis to the minor axis of the elliptical stamp at the header of a medical bill falls within a certain range. Therefore, in order to pick out the bill images of medical bills, it is necessary to identify each ellipse in the bill image and the lengths of the major axis and the minor axis of each ellipse.
Specifically, the electronic device applies an ellipse detection algorithm to the bill image to identify each ellipse it contains and, for each identified ellipse, determines the lengths of the major axis and the minor axis.
The ellipse detection algorithm may be a Kalman filter ellipse detection algorithm or another ellipse detection algorithm; the embodiment of the present invention does not limit this.
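As an illustration of how axis lengths can be read off a detected ellipse, the following minimal sketch estimates them from second-order moments of boundary points. This is not the Kalman filter algorithm named above; it is a swapped-in moment-based estimate, and it assumes the boundary points of a single ellipse have already been isolated and are sampled uniformly in the ellipse's parametric angle. The function names `ellipse_axes_from_points` and `sample_ellipse` are hypothetical.

```python
import math
from typing import List, Tuple


def ellipse_axes_from_points(points: List[Tuple[float, float]]) -> Tuple[float, float, float]:
    """Estimate (major_axis, minor_axis, angle_deg) from boundary points.

    Assumes the points are sampled uniformly in the ellipse's parametric
    angle, so each semi-axis equals sqrt(2 * variance) along that axis.
    """
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    # Eigenvalues of the 2x2 covariance matrix [[sxx, sxy], [sxy, syy]].
    mean = (sxx + syy) / 2
    spread = math.hypot((sxx - syy) / 2, sxy)
    major = 2 * math.sqrt(2 * (mean + spread))
    minor = 2 * math.sqrt(2 * max(mean - spread, 0.0))
    # Direction of the major axis relative to the x axis of the image frame.
    angle_deg = math.degrees(0.5 * math.atan2(2 * sxy, sxx - syy))
    return major, minor, angle_deg


def sample_ellipse(cx, cy, a, b, angle_deg, n=360):
    """Generate n boundary points of an ellipse with semi-axes a and b."""
    t0 = math.radians(angle_deg)
    pts = []
    for i in range(n):
        t = 2 * math.pi * i / n
        x, y = a * math.cos(t), b * math.sin(t)
        pts.append((cx + x * math.cos(t0) - y * math.sin(t0),
                    cy + x * math.sin(t0) + y * math.cos(t0)))
    return pts


# Synthetic stamp: semi-axes 75 and 50, tilted by 30 degrees.
major, minor, angle = ellipse_axes_from_points(sample_ellipse(200, 100, 75, 50, 30))
```

For contours extracted from real images, which are not sampled this way, a least-squares ellipse fit would likely be more appropriate.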
S102: judging whether a target ellipse whose ratio of major axis to minor axis is within a preset ratio range exists in the bill image.
After identifying each ellipse in the bill image and the lengths of its major and minor axes, the electronic device can determine the ratio of the major axis to the minor axis of each ellipse. Because the ratio for the elliptical stamp at the header of a medical bill image lies within a certain ratio range, whether the bill image may be a medical bill image can be determined by judging whether it contains a target ellipse whose ratio lies within that range.
The ratio range is determined statistically from the ratios of the major axis to the minor axis of the target ellipses at the headers of a plurality of medical bill images.
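The text does not say which statistic is used. One simple possibility, shown here purely as an assumption, is to take the smallest and largest ratios observed on known medical bills and widen them by a small margin. The function name `ratio_range_from_samples` and the `margin` parameter are hypothetical.

```python
from typing import Sequence, Tuple


def ratio_range_from_samples(ratios: Sequence[float], margin: float = 0.02) -> Tuple[float, float]:
    """Return (low, high) covering all sample ratios plus a safety margin."""
    if not ratios:
        raise ValueError("at least one sample ratio is required")
    return min(ratios) - margin, max(ratios) + margin


# Ratios measured on header stamps of known medical bills (made-up values).
low, high = ratio_range_from_samples([1.48, 1.50, 1.52], margin=0.03)
```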
S103: if yes, determining the area containing the target ellipse in the bill image.
If a target ellipse whose ratio of major axis to minor axis is within the preset ratio range exists in the bill image, the bill image is probably a bill image of a medical bill. Recognizing the characters at the header of the bill image then allows a more accurate judgment of whether it is.
To that end, the area containing the target ellipse in the bill image, namely the area at the header of the bill image, needs to be determined. Specifically, the area may be a rectangular area or an area of another set shape, for example an elliptical area.
For example, suppose the bill image contains two ellipses whose ratios of major axis length to minor axis length are 1.1 and 1.5 respectively, and the preset ratio range is [1.45, 1.55]. The ratio 1.5 lies within [1.45, 1.55], so the area of the bill image containing the ellipse with ratio 1.5 is determined.
S104: if no target ellipse whose ratio of major axis to minor axis is within the preset ratio range exists in the bill image, determining that the bill image is not a bill image of a medical bill.
The ratio of the major axis to the minor axis of the ellipse at the header of a medical bill image lies within the preset ratio range. Therefore, if no target ellipse with such a ratio exists in the bill image, the bill image is determined not to be a bill image of a medical bill.
For example, suppose the bill image contains three ellipses whose ratios of major axis length to minor axis length are 1.1, 1.0 and 1.3 respectively, and the preset ratio range is [1.45, 1.55]. None of the three ratios lies within [1.45, 1.55], so the bill image is determined not to be a bill image of a medical bill.
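The two worked examples for S102 to S104 can be sketched as follows. The `Ellipse` record and the function name `find_target_ellipse` are illustrative assumptions, and returning the first matching ellipse is a simplification the text does not prescribe.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Ellipse:
    """A detected ellipse; field names are illustrative assumptions."""
    cx: float     # x coordinate of the ellipse center, in pixels
    cy: float     # y coordinate of the ellipse center, in pixels
    major: float  # length of the major axis
    minor: float  # length of the minor axis


def find_target_ellipse(
    ellipses: List[Ellipse],
    ratio_range: Tuple[float, float] = (1.45, 1.55),
) -> Optional[Ellipse]:
    """Return the first ellipse whose major/minor ratio is in ratio_range."""
    low, high = ratio_range
    for e in ellipses:
        if e.minor <= 0:
            continue  # skip degenerate detections
        if low <= e.major / e.minor <= high:
            return e
    return None


# Ratios 1.1 and 1.5: the second ellipse is the target (S103 follows).
bill_a = [Ellipse(50, 40, 110, 100), Ellipse(300, 60, 150, 100)]
# Ratios 1.1, 1.0 and 1.3: no target, so the image is rejected (S104).
bill_b = [Ellipse(0, 0, 110, 100), Ellipse(0, 0, 100, 100), Ellipse(0, 0, 130, 100)]
```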
S105: inputting the region into a trained character recognition model, and determining a character recognition result of the character recognition model for the region.
In order to further ensure the accuracy of medical bill sorting, after the region containing the target ellipse in the bill image is determined, the characters in that region can be recognized to judge further whether the bill image is a bill image of a medical bill.
Therefore, the image of the region is input into a character recognition model trained in advance, the characters in the region are recognized, and the character recognition result of the region is determined. The character recognition model may be trained on medical bill images only or on bill images of different types, or it may be an existing character recognition model known in the art.
S106: if the character recognition result comprises a preset keyword corresponding to the medical bill, determining that the bill image is a bill image of a medical bill.
The header of a medical bill includes words specific to medical bills, for example words identifying the type of medical bill, such as a specific outpatient bill or a specific hospitalization bill. Keywords can therefore be set in advance according to these header words, forming the keywords corresponding to the medical bill.
Whether the character recognition result of the character recognition model for the region comprises a keyword corresponding to the medical bill is then judged; if it does, the bill image is determined to be a bill image of a medical bill.
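A minimal sketch of the keyword check in S106, assuming the recognition result is available as a plain string. The keyword list and the whitespace and case normalization are assumptions made for illustration; real header text depends on the bill template and language.

```python
from typing import Iterable

# Hypothetical keywords; real header text varies by bill template.
MEDICAL_KEYWORDS = ("outpatient", "hospitalization")


def is_medical_bill(ocr_text: str, keywords: Iterable[str] = MEDICAL_KEYWORDS) -> bool:
    """Return True if any preset keyword occurs in the recognized header text."""
    # Drop whitespace and case so that spacing noise in the recognition
    # output does not prevent a match.
    normalized = "".join(ocr_text.split()).lower()
    return any("".join(k.split()).lower() in normalized for k in keywords)
```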
The embodiments of the invention provide a medical bill sorting method, device, equipment and medium. The method comprises: identifying each ellipse in a bill image and the lengths of the major axis and the minor axis of each ellipse; judging whether a target ellipse whose ratio of major axis to minor axis is within a preset ratio range exists in the bill image; if so, determining an area containing the target ellipse in the bill image; inputting the area into a trained character recognition model, and determining a character recognition result of the character recognition model for the area; and if the character recognition result comprises a preset keyword corresponding to the medical bill, determining that the bill image is a bill image of a medical bill. By identifying the target ellipse in the bill image, determining the area that contains the target ellipse and the header characters, recognizing only the characters at the header of the bill image, and identifying the bill image of a medical bill from the character recognition result and the preset keywords, the method improves the efficiency and accuracy of medical bill sorting.
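Steps S101 to S106 can be tied together as in the sketch below. The ellipse detector, the cropping step and the character recognition model are passed in as callables because the text leaves their implementations open; `sort_bill` and all parameter names are hypothetical. The stubs at the end only stand in for real components.

```python
from typing import Callable, List, Sequence, Tuple

# (center_x, center_y, major_axis_length, minor_axis_length)
EllipseInfo = Tuple[float, float, float, float]


def sort_bill(
    image: object,
    detect_ellipses: Callable[[object], List[EllipseInfo]],
    crop_region: Callable[[object, EllipseInfo], object],
    recognize_text: Callable[[object], str],
    keywords: Sequence[str],
    ratio_range: Tuple[float, float] = (1.45, 1.55),
) -> bool:
    """Return True if `image` is judged to be a medical bill image."""
    low, high = ratio_range
    # S101/S102: look for an ellipse whose axis ratio is in range.
    targets = [e for e in detect_ellipses(image)
               if e[3] > 0 and low <= e[2] / e[3] <= high]
    if not targets:
        return False  # S104: no target ellipse, so not a medical bill
    # S103: determine the region containing the target ellipse.
    region = crop_region(image, targets[0])
    # S105: run character recognition on that region only.
    text = recognize_text(region)
    # S106: check the result against the preset keywords.
    return any(k in text for k in keywords)


# Stubs standing in for a real detector and recognition model.
ocr_calls = []


def fake_ocr(region):
    ocr_calls.append(region)
    return "outpatient fee bill"


accepted = sort_bill("img", lambda img: [(10, 10, 150, 100)],
                     lambda img, e: "header", fake_ocr, ["outpatient"])
rejected = sort_bill("img", lambda img: [(0, 0, 100, 100)],
                     lambda img, e: "header", fake_ocr, ["outpatient"])
```

In the rejected case the recognition stub is never called, which is where the efficiency gain over recognizing all text in every bill image comes from.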
Example 2:
In order to improve the accuracy of sorting medical bills, on the basis of the above embodiments, in an embodiment of the present invention, the determining an area containing the target ellipse in the bill image includes:
identifying the center position of the target ellipse in the bill image;
and determining a rectangular area centered on that center position according to the center position of the target ellipse and a preset length value and width value.
In order to further determine whether the bill image is a bill image of a medical bill, the determined area containing the target ellipse must contain all the information needed for medical bill identification. The area must not be too large, however, or it burdens the character recognition model and slows the output of the recognition result. Therefore, the header areas of medical bills that contain the elliptical stamp and the text specific to medical bills can be statistically analyzed in advance, and the attribute information of the area containing the target ellipse can be preset according to the result. Specifically, in the embodiment of the present invention, the area containing the target ellipse is set as a rectangular area whose length value and width value are preset according to the statistical analysis and whose center position is the center position of the target ellipse.
The length value and the width value of the rectangular area are related to the size of the bill image, so they can be set according to the size of the bill image.
When identifying the ellipses in the bill image with the ellipse detection algorithm, the electronic device can identify the center position of each ellipse as well as the lengths of its major and minor axes, and it stores the axis lengths and the center position of each ellipse in the bill image.
When determining the area, the center position of the target ellipse in the bill image is first determined according to the known lengths of the major axis and the minor axis of the target ellipse and is used as the center position of the area; the rectangular area is then determined from this center position and the preset length value and width value.
In the embodiment of the invention, the characters at the header of a medical bill lie within a certain range. Identifying the center position of the target ellipse in the bill image and determining a rectangular area centered on it with the preset length value and width value therefore ensures that the area containing the target ellipse also contains the header characters, so that medical bills can be sorted more effectively and accurately.
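The rectangle construction can be sketched as below. The function name `header_region` and the tuple layouts are assumptions; clipping the rectangle to the image bounds is also an added assumption that the text does not mention, and a clipped rectangle is no longer exactly centered on the ellipse.

```python
from typing import Tuple


def header_region(
    center: Tuple[float, float],
    region_size: Tuple[int, int],
    image_size: Tuple[int, int],
) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of a rectangle centered on `center`.

    region_size is the preset (length, width) of the rectangle and
    image_size is the (width, height) of the bill image, all in pixels.
    The rectangle is clipped to the image bounds.
    """
    cx, cy = center
    length, width = region_size
    img_w, img_h = image_size
    left = max(0, int(round(cx - length / 2)))
    top = max(0, int(round(cy - width / 2)))
    right = min(img_w, int(round(cx + length / 2)))
    bottom = min(img_h, int(round(cy + width / 2)))
    return left, top, right, bottom


# Target ellipse centered at (500, 100) in a 1000 x 700 image,
# with a preset rectangle of 600 x 160 pixels.
box = header_region((500, 100), (600, 160), (1000, 700))
```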
Example 3:
In order to improve the accuracy of sorting medical bills, on the basis of the foregoing embodiments, in an embodiment of the present invention, before determining a rectangular area centered on the center position of the target ellipse according to that center position and a preset length value and width value, the method further includes:
identifying an included angle between the target ellipse in the bill image and a preset reference direction;
and rotating the bill image to the preset reference direction according to the included angle.
When a bill is converted into a bill image, the bill may not be placed squarely, so the transverse direction of the bill in the converted image deviates from the horizontal direction of the bill image, and the longitudinal direction of the bill deviates from the vertical direction of the bill image. In order to improve the accuracy of character recognition and thus of medical bill sorting, before the rectangular area of the bill image is determined, the bill image needs to be rotated to a reference direction if the transverse direction of the bill in the bill image deviates from it. The reference direction is preset and may be the horizontal direction or the vertical direction.
In order to rotate the bill image to the preset reference direction, the included angle between the target ellipse and the preset reference direction needs to be determined. The electronic device provided by the embodiment of the invention identifies this included angle with the ellipse detection algorithm.
The electronic device then rotates the bill image to the reference direction according to the included angle between the target ellipse in the bill image and the reference direction.
Preferably, the electronic device uses the ellipse detection algorithm to identify the included angle between the major axis direction of the target ellipse and the horizontal direction of the bill image, and rotates the bill image by that angle so that the bill in the bill image is horizontal.
Example 4:
In order to determine the deviation between a bill image and a preset reference direction, on the basis of the foregoing embodiments, in an embodiment of the present invention, the identifying an included angle between the target ellipse in the bill image and the preset reference direction includes:
identifying an included angle between the major axis direction of the target ellipse in the bill image and the horizontal direction; or
identifying an included angle between the minor axis direction of the target ellipse in the bill image and the vertical direction.
In order to rotate the bill image to the preset reference direction, the included angle between the target ellipse and the preset reference direction needs to be determined first, and the preset reference direction may be the horizontal direction or the vertical direction. Therefore, in the embodiment of the present invention, either the included angle between the major axis direction of the target ellipse and the horizontal direction or the included angle between the minor axis direction of the target ellipse and the vertical direction may be identified.
According to the determined included angle, the bill image can be rotated to the horizontal direction or the vertical direction.
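The angle computation and the corrective rotation can be sketched with plain geometry. The sketch assumes the two endpoints of the major axis are available from the ellipse detector; `major_axis_angle` and `rotate_point` are hypothetical names, and angles are expressed in whatever coordinate frame the points use. Because the minor axis is perpendicular to the major axis, its angle to the vertical is the same value, so either measurement yields the same correction.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def major_axis_angle(p1: Point, p2: Point) -> float:
    """Angle in degrees between the major axis (p1 to p2) and the horizontal.

    The result is normalized to (-90, 90] because an axis has no direction.
    """
    angle = math.degrees(math.atan2(p2[1] - p1[1], p2[0] - p1[0]))
    if angle > 90:
        angle -= 180
    elif angle <= -90:
        angle += 180
    return angle


def rotate_point(p: Point, center: Point, angle_deg: float) -> Point:
    """Rotate p about center by angle_deg using the 2D rotation matrix."""
    t = math.radians(angle_deg)
    dx, dy = p[0] - center[0], p[1] - center[1]
    return (
        center[0] + dx * math.cos(t) - dy * math.sin(t),
        center[1] + dx * math.sin(t) + dy * math.cos(t),
    )


# A major axis running from (0, 0) to (10, 10) is tilted by 45 degrees;
# rotating by the negative angle brings its endpoint onto the horizontal.
tilt = major_axis_angle((0, 0), (10, 10))
x, y = rotate_point((10, 10), (0, 0), -tilt)
```

Rotating a whole image applies the same transform to every pixel, typically through an image library's affine warp rather than point by point.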
Example 5:
In order to sort medical bills in finer detail, on the basis of the foregoing embodiments, in an embodiment of the present invention, the method further includes:
judging whether a first target character group matching a preset region keyword exists in the character recognition result;
and if so, determining the region to which the bill image belongs according to the matched first target character group.
Once the bill image has been determined to be a bill image of a medical bill, the region can be determined as well, because the text in the header of a medical bill also includes specific text identifying the region, for example text identifying the province of the medical bill, or text identifying its city and county. Region keywords can therefore be preset according to this region-identifying text, and the region to which the medical bill belongs can be determined by checking which region keyword appears in the character recognition result.
Specifically, in order to identify the region to which the bill image belongs, the character recognition result of the region including the target ellipse in the bill image may be determined according to the character recognition model.
If the character recognition result contains a first target character group matching a region keyword, the region to which the bill image belongs is determined according to that first target character group. For example, taking Heilongjiang as the first target character group, if the character group Heilongjiang exists in the character recognition result, the bill image is determined to be a bill image of a medical bill of Heilongjiang.
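The region lookup can be illustrated with a short sketch. The keyword table and the function name `find_region` are hypothetical; the description only states that region keywords are preset and matched against the character recognition result.

```python
from typing import Optional, Sequence

# Hypothetical preset region keywords (assumption, for illustration only).
REGION_KEYWORDS = ("Heilongjiang", "Beijing", "Shanghai")


def find_region(ocr_text: str,
                keywords: Sequence[str] = REGION_KEYWORDS) -> Optional[str]:
    """Return the first preset region keyword found in the OCR text, else None."""
    for keyword in keywords:
        if keyword in ocr_text:
            return keyword
    return None
```

A plain substring test is used here for simplicity; a real system might need fuzzy matching to tolerate recognition errors.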
The text at the head of the medical bill also comprises some specific text for identifying the category of the medical bill, such as a specific outpatient bill, a specific hospitalization bill, and the like. Therefore, the category key words can be preset according to the specific characters for identifying the category of the medical bill, and the category to which the medical bill belongs can be determined by judging the category key words included in the character recognition result.
After the region to which the bill image belongs is determined, the category to which the bill image belongs is also determined in order to sort the bill image in finer detail. The method further includes:
judging whether a second target character group matching a preset category keyword exists in the character recognition result;
and if so, determining the category to which the bill image belongs according to the matched second target character group.
The category keywords are preset and may be keywords such as outpatient or hospitalization; the embodiment of the present invention does not limit the specific keywords.
In order to determine the category to which the bill image belongs, it is necessary to judge whether a second target character group matching a category keyword exists in the character recognition result. If such a second target character group exists, the category to which the bill image belongs can be determined according to the matched second target character group.
For example, taking hospitalization as the second target character group, if the character recognition result includes the character group hospitalization, the bill image is determined to be a bill image of a hospitalization medical bill.
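The category check can be sketched in the same spirit. The mapping from keyword to category label and the function name `find_category` are assumptions introduced for illustration; only the idea of matching preset category keywords comes from the description.

```python
from typing import Mapping, Optional

# Hypothetical mapping from a category keyword to a sorting label.
CATEGORY_KEYWORDS = {
    "outpatient": "outpatient bill",
    "hospitalization": "hospitalization bill",
}


def find_category(ocr_text: str,
                  keywords: Mapping[str, str] = CATEGORY_KEYWORDS) -> Optional[str]:
    """Return the label of the first category keyword found in the OCR text."""
    text = ocr_text.lower()  # match regardless of capitalization
    for keyword, label in keywords.items():
        if keyword in text:
            return label
    return None
```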
The medical bill sorting method of the present invention is described below with a specific embodiment, taking a computer as the electronic device and a medical institution fee bill of Heilongjiang Province as the bill image. Fig. 2 is a schematic diagram of a bill image of a medical bill provided by an embodiment of the present invention; as shown in fig. 2, the bill image of the medical bill includes two ellipses.
Fig. 3 is a schematic process diagram of another medical bill sorting method provided by an embodiment of the present invention. After the bill image of the medical bill is received through the scanner, the process includes the following steps:
S301: the computer identifies the two ellipses in the bill image by using an ellipse detection algorithm, and identifies the lengths of the major axis and the minor axis of each of the two ellipses.
S302: determine the ratio of the major axis to the minor axis of each of the two ellipses, and judge whether a target ellipse whose ratio of the major axis to the minor axis is within the preset ratio range exists among the two ellipses.
S303: identify the included angle between the major axis direction of the target ellipse in the bill image and the horizontal direction of the bill image, and rotate the bill image of the medical bill according to the value of the included angle and the position of the bill in the bill image, so that the bill in the bill image is in the horizontal position.
S304: identify the circle center position of the target ellipse in the bill image, and determine the rectangular area by taking the circle center position as the center position of the rectangular area, the length value of the bill image as the length value of the rectangular area, and the minor axis value of the target ellipse as the width value of the rectangular area.
Fig. 4 is a schematic diagram of determining a rectangular area according to an embodiment of the present invention, and the rectangular area is selected by a black frame, as shown in fig. 4.
S305: input the image of the rectangular area into the character recognition model to obtain a character recognition result.
Fig. 5 is a schematic diagram of the image of the rectangular area input to the character recognition model according to an embodiment of the present invention; as shown in fig. 5, the text in the rectangular area is that of a Heilongjiang medical outpatient fee bill.
S306: the character recognition result includes the keyword outpatient, so the bill image is determined to be a bill image of a medical bill.
S307: the character recognition result includes the region keyword Heilongjiang, so the bill image is determined to be a bill image of a medical bill of Heilongjiang; the keyword outpatient in the character recognition result is also a category keyword, so the bill image is determined to be a bill image of an outpatient medical bill of Heilongjiang Province.
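Steps S302 and S304 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the `Ellipse` record, the function names, and the ratio range used in the usage note are hypothetical, and clamping the rectangle to the image bounds is an added assumption that the description does not specify.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Ellipse:
    cx: float     # x coordinate of the ellipse center
    cy: float     # y coordinate of the ellipse center
    major: float  # major-axis length
    minor: float  # minor-axis length


def find_target_ellipse(ellipses: List[Ellipse],
                        ratio_range: Tuple[float, float]) -> Optional[Ellipse]:
    """S302: return the first ellipse whose major/minor ratio is in range."""
    low, high = ratio_range
    for ellipse in ellipses:
        if ellipse.minor > 0 and low <= ellipse.major / ellipse.minor <= high:
            return ellipse
    return None


def header_rectangle(target: Ellipse, image_length: float,
                     image_height: float) -> Tuple[float, float, float, float]:
    """S304: rectangle centered on the center of the target ellipse.

    Its length is the length of the bill image and its width is the
    minor-axis value of the target ellipse. The result is clamped to the
    image bounds (an assumption) and returned as (left, top, right, bottom).
    """
    left = max(0.0, target.cx - image_length / 2)
    right = min(image_length, target.cx + image_length / 2)
    top = max(0.0, target.cy - target.minor / 2)
    bottom = min(image_height, target.cy + target.minor / 2)
    return (left, top, right, bottom)
```

For instance, with a hypothetical ratio range of (1.3, 1.7), an ellipse with axes 90 and 60 qualifies as the target while a near-circular one with axes 40 and 38 does not; the resulting rectangle is what would be cropped and passed to the character recognition model in S305.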
Example 6:
Fig. 6 is a schematic structural diagram of a medical bill sorting apparatus according to an embodiment of the present invention. On the basis of the foregoing embodiments, an embodiment of the present invention further provides a medical bill sorting apparatus, including:
the identification module 601 is used for identifying each ellipse in the bill image and the lengths of the major axis and the minor axis of each ellipse;
a judging module 602, configured to judge whether a target ellipse whose ratio of the major axis to the minor axis is within a preset ratio range exists in the bill image;
a determining module 603, configured to determine, if the target ellipse exists in the bill image, an area in the bill image that includes the target ellipse; input the region into a trained character recognition model, and determine a character recognition result of the character recognition model for the region; and if the character recognition result comprises a preset keyword corresponding to the medical bill, determine that the bill image is the bill image of the medical bill.
The determining module 603 is further configured to determine that the bill image is not a bill image of a medical bill if a target ellipse whose ratio of the major axis to the minor axis is within a preset ratio range does not exist in the bill image.
The judging module 602 is further configured to judge whether a first target character group matching a preset region keyword exists in the character recognition result;
the determining module 603 is further configured to determine, if the first target character group exists, the region to which the bill image belongs according to the matched first target character group.
The determining module 603 is specifically configured to identify a circle center position of a target ellipse in the bill image; and determining a rectangular area taking the circle center position as a center position according to the circle center position of the target ellipse and the preset length value and width value.
The identification module 601 is further configured to identify an included angle between a target ellipse in the bill image and a preset reference direction;
the device further comprises:
and a rotating module 604, configured to rotate the bill image to the preset reference direction according to the included angle.
The identification module 601 is specifically configured to identify an included angle between the major axis direction of the target ellipse in the bill image and the horizontal direction; or identifying the included angle between the short axis direction and the vertical direction of the target ellipse in the bill image.
The judging module 602 is further configured to judge whether a second target character group matching a preset category keyword exists in the character recognition result;
the determining module 603 is further configured to determine, if the second target character group exists, the category to which the bill image belongs according to the matched second target character group.
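One way the modules could be wired together is sketched below. The class `MedicalBillSorter`, its constructor parameters, and the injected callables are hypothetical; the ellipse detector and the character recognition model are passed in as plain functions so that the sketch stays self-contained, and region cropping and rotation are omitted.

```python
from typing import Callable, Iterable, List, Tuple

# (major-axis length, minor-axis length) of one detected ellipse.
Axes = Tuple[float, float]


class MedicalBillSorter:
    """Hypothetical wiring of the identification, judging and determining modules."""

    def __init__(self,
                 detect_ellipses: Callable[[object], List[Axes]],
                 recognize_text: Callable[[object, Axes], str],
                 ratio_range: Tuple[float, float],
                 medical_keywords: Iterable[str]) -> None:
        self.detect_ellipses = detect_ellipses  # role of identification module 601
        self.recognize_text = recognize_text    # stands in for the recognition model
        self.ratio_range = ratio_range
        self.medical_keywords = tuple(medical_keywords)

    def is_medical_bill(self, image: object) -> bool:
        low, high = self.ratio_range
        for major, minor in self.detect_ellipses(image):
            # Role of judging module 602: axis-ratio test for a target ellipse.
            if minor > 0 and low <= major / minor <= high:
                # Role of determining module 603: keyword check on the OCR result.
                text = self.recognize_text(image, (major, minor))
                return any(k in text for k in self.medical_keywords)
        # No target ellipse: the image is not a bill image of a medical bill.
        return False
```

Injecting the detector and recognizer as callables keeps the decision logic testable with stubs, independent of any particular image or OCR library.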
Example 7:
Fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. On the basis of the above embodiments, an embodiment of the present invention further provides an electronic device including a processor 701 and a memory 702, where the processor 701 is configured to implement the steps of the medical bill sorting method when executing a computer program stored in the memory 702.
Optionally, the processor 701 may be a CPU (Central Processing Unit), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array), or a CPLD (Complex Programmable Logic Device).
The processor 701 is configured to perform the following steps when executing the computer program stored in the memory 702:
identifying each ellipse in the bill image and the lengths of the long axis and the short axis of each ellipse;
judging whether a target ellipse with the ratio of the long axis to the short axis within a preset ratio range exists in the bill image;
if yes, determining an area containing the target ellipse in the bill image;
inputting the region into a trained character recognition model, and determining a character recognition result of the character recognition model for the region;
and if the character recognition result comprises a preset keyword corresponding to the medical bill, determining that the bill image is the bill image of the medical bill.
And if the bill image does not have a target ellipse with the ratio of the long axis to the short axis within a preset ratio range, determining that the bill image is not the bill image of the medical bill.
Judging whether a first target character group matched with a preset region keyword exists in the character recognition result;
and if so, determining the region to which the bill image belongs according to the matched first target character group.
The determining the region containing the target ellipse in the bill image comprises:
identifying the circle center position of a target ellipse in the bill image;
and determining a rectangular area taking the circle center position as a center position according to the circle center position of the target ellipse and the preset length value and width value.
Before determining a rectangular region with the center of circle as the center position according to the position of the center of circle of the target ellipse and the preset length value and width value, the method further comprises:
identifying an included angle between a target ellipse in the bill image and a preset reference direction;
and rotating the bill image to the preset reference direction according to the included angle.
Identifying the included angle between the target ellipse in the bill image and the preset reference direction comprises:
identifying an included angle between the long axis direction of the target ellipse in the bill image and the horizontal direction; or
identifying an included angle between the short axis direction of the target ellipse in the bill image and the vertical direction.
Judging whether a second target character group matched with a preset category keyword exists in the character recognition result;
and if so, determining the category to which the bill image belongs according to the matched second target character group.
The embodiment of the invention provides a medical bill sorting method, a device, equipment and a medium, wherein the method comprises the steps of identifying each ellipse in a bill image and the lengths of the long axis and the short axis of each ellipse; judging whether a target ellipse with the ratio of the long axis to the short axis within a preset ratio range exists in the bill image; and if so, determining whether the bill image is the bill image of a medical bill according to the character recognition result of the area containing the target ellipse. Because medical bill images are identified by detecting the target ellipse in the bill image, the sorting efficiency of medical bills is improved.
Example 8:
On the basis of the foregoing embodiments, an embodiment of the present invention further provides a computer-readable storage medium storing a computer program which, when executed by a processor, performs the following steps:
identifying each ellipse in the bill image and the lengths of the long axis and the short axis of each ellipse;
judging whether a target ellipse with the ratio of the long axis to the short axis within a preset ratio range exists in the bill image;
if yes, determining an area containing the target ellipse in the bill image;
inputting the region into a trained character recognition model, and determining a character recognition result of the character recognition model for the region;
and if the character recognition result comprises a preset keyword corresponding to the medical bill, determining that the bill image is the bill image of the medical bill.
And if the bill image does not have a target ellipse with the ratio of the long axis to the short axis within a preset ratio range, determining that the bill image is not the bill image of the medical bill.
Judging whether a first target character group matched with a preset region keyword exists in the character recognition result;
and if so, determining the region to which the bill image belongs according to the matched first target character group.
The determining the region containing the target ellipse in the bill image comprises:
identifying the circle center position of a target ellipse in the bill image;
and determining a rectangular area taking the circle center position as a center position according to the circle center position of the target ellipse and the preset length value and width value.
Before determining a rectangular region with the center of circle as the center position according to the position of the center of circle of the target ellipse and the preset length value and width value, the method further comprises:
identifying an included angle between a target ellipse in the bill image and a preset reference direction;
and rotating the bill image to the preset reference direction according to the included angle.
Identifying the included angle between the target ellipse in the bill image and the preset reference direction comprises:
identifying an included angle between the long axis direction of the target ellipse in the bill image and the horizontal direction; or
identifying an included angle between the short axis direction of the target ellipse in the bill image and the vertical direction.
Judging whether a second target character group matched with a preset category keyword exists in the character recognition result;
and if so, determining the category to which the bill image belongs according to the matched second target character group.
The embodiment of the invention provides a medical bill sorting method, a device, equipment and a medium, wherein the method comprises the steps of identifying each ellipse in a bill image and the lengths of the long axis and the short axis of each ellipse; judging whether a target ellipse with the ratio of the long axis to the short axis within a preset ratio range exists in the bill image; and if so, determining whether the bill image is the bill image of a medical bill according to the character recognition result of the area containing the target ellipse. Because medical bill images are identified by detecting the target ellipse in the bill image, the sorting efficiency of medical bills is improved.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims (10)

1. A method of sorting medical bills, the method comprising:
identifying each ellipse in the bill image and the lengths of the long axis and the short axis of each ellipse;
judging whether a target ellipse with the ratio of the long axis to the short axis within a preset ratio range exists in the bill image;
if yes, determining an area containing the target ellipse in the bill image;
inputting the region into a trained character recognition model, and determining a character recognition result of the character recognition model for the region;
and if the character recognition result comprises a preset keyword corresponding to the medical bill, determining that the bill image is the bill image of the medical bill.
2. The method of claim 1, further comprising:
and if the bill image does not have a target ellipse with the ratio of the long axis to the short axis within a preset ratio range, determining that the bill image is not the bill image of the medical bill.
3. The method of claim 1, further comprising:
judging whether a first target character group matched with a preset region keyword exists in the character recognition result;
and if so, determining the region to which the bill image belongs according to the matched first target character group.
4. The method of claim 1, wherein the determining the region of the bill image containing the target ellipse comprises:
identifying the circle center position of a target ellipse in the bill image;
and determining a rectangular area taking the circle center position as a center position according to the circle center position of the target ellipse and the preset length value and width value.
5. The method according to claim 4, wherein before determining the rectangular region centered at the center of the circle according to the center of the target ellipse and the preset length value and width value, the method further comprises:
identifying an included angle between a target ellipse in the bill image and a preset reference direction;
and rotating the bill image to the preset reference direction according to the included angle.
6. The method of claim 5, wherein the identifying the included angle between the target ellipse in the bill image and the preset reference direction comprises:
identifying an included angle between the long axis direction of the target ellipse in the bill image and the horizontal direction; or
identifying an included angle between the short axis direction of the target ellipse in the bill image and the vertical direction.
7. The method according to claim 1 or 3, characterized in that the method further comprises:
judging whether a second target character group matched with a preset category keyword exists in the character recognition result;
and if so, determining the category to which the bill image belongs according to the matched second target character group.
8. A medical bill sorting apparatus, comprising:
the identification module is used for identifying each ellipse in the bill image and the length of the long axis and the short axis of each ellipse;
the judging module is used for judging whether a target ellipse with the ratio of the long axis to the short axis within a preset ratio range exists in the bill image;
the determining module is used for determining an area containing the target ellipse in the bill image if the target ellipse exists in the bill image; inputting the region into a trained character recognition model, and determining a character recognition result of the character recognition model for the region; and if the character recognition result comprises a preset keyword corresponding to the medical bill, determining that the bill image is the bill image of the medical bill.
9. An electronic device, characterized in that the electronic device comprises a processor and a memory for storing program instructions, the processor being adapted to carry out the steps of the method according to any of claims 1-7 when executing a computer program stored in the memory.
10. A computer-readable storage medium, characterized in that it stores a computer program which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
CN202010169928.8A 2020-03-12 2020-03-12 Medical bill sorting method, device, equipment and medium Active CN111291726B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010169928.8A CN111291726B (en) 2020-03-12 2020-03-12 Medical bill sorting method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010169928.8A CN111291726B (en) 2020-03-12 2020-03-12 Medical bill sorting method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN111291726A true CN111291726A (en) 2020-06-16
CN111291726B CN111291726B (en) 2023-08-08

Family

ID=71027464

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010169928.8A Active CN111291726B (en) 2020-03-12 2020-03-12 Medical bill sorting method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN111291726B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112348024A (en) * 2020-10-29 2021-02-09 北京信工博特智能科技有限公司 Image-text identification method and system based on deep learning optimization network

Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004164674A (en) * 2004-01-23 2004-06-10 Oki Electric Ind Co Ltd Format recognition device and character reader
JP2013061839A (en) * 2011-09-14 2013-04-04 Ricoh Co Ltd Image processor, image processing method, image forming device and image processing program
WO2013052812A1 (en) * 2011-10-05 2013-04-11 Siemens Healthcare Diagnostics Inc. Generalized fast radial symmetry transform for ellipse detection
CN107610138A (en) * 2017-10-20 2018-01-19 四川长虹电器股份有限公司 A kind of bill seal regional sequence dividing method
WO2019174130A1 (en) * 2018-03-14 2019-09-19 平安科技(深圳)有限公司 Bill recognition method, server, and computer readable storage medium
CN108446621A (en) * 2018-03-14 2018-08-24 平安科技(深圳)有限公司 Bank slip recognition method, server and computer readable storage medium
JP2019168856A (en) * 2018-03-22 2019-10-03 セイコーエプソン株式会社 Image processing apparatus, image processing method, and image processing program
CN110321760A (en) * 2018-03-29 2019-10-11 北京和缓医疗科技有限公司 A kind of medical document recognition methods and device
WO2019223391A1 (en) * 2018-05-23 2019-11-28 阿里巴巴集团控股有限公司 Bill photographing interaction method and apparatus, processing device, and client
CN108921166A (en) * 2018-06-22 2018-11-30 深源恒际科技有限公司 Medical bill class text detection recognition method and system based on deep neural network
CN109800747A (en) * 2018-12-14 2019-05-24 平安科技(深圳)有限公司 Medical invoice recognition methods, user equipment, storage medium and device
CN110659647A (en) * 2019-09-11 2020-01-07 杭州睿琪软件有限公司 Seal image identification method and device, intelligent invoice identification equipment and storage medium
CN110688998A (en) * 2019-09-27 2020-01-14 中国银行股份有限公司 Bill identification method and device
CN110781877A (en) * 2019-10-28 2020-02-11 京东方科技集团股份有限公司 Image recognition method, device and storage medium

Also Published As

Publication number Publication date
CN111291726B (en) 2023-08-08

Similar Documents

Publication Publication Date Title
US11195006B2 (en) Multi-modal document feature extraction
CN112860841B (en) Text emotion analysis method, device, equipment and storage medium
WO2017214073A1 (en) Document field detection and parsing
CN108170468B (en) Method and system for automatically detecting annotation and code consistency
CN109446345A (en) Nuclear power file verification processing method and system
CN110309301B (en) Enterprise category classification method and device and intelligent terminal
CN111353491A (en) Character direction determining method, device, equipment and storage medium
CN114881698A (en) Advertisement compliance auditing method and device, electronic equipment and storage medium
CN113486664A (en) Text data visualization analysis method, device, equipment and storage medium
CN112434970A (en) Qualification data verification method and device based on intelligent data acquisition
WO2022183991A1 (en) Document classification method and apparatus, and electronic device
CN110858353A (en) Method and system for obtaining case referee result
CN111291726B (en) Medical bill sorting method, device, equipment and medium
CN111597805B (en) Method and device for auditing short message text links based on deep learning
CN112990142A (en) Video guide generation method, device and equipment based on OCR (optical character recognition), and storage medium
CN111488452A (en) Webpage tampering detection method, detection system and related equipment
CN116401343A (en) Data compliance analysis method
CN114443834A (en) Method and device for extracting license information and storage medium
CN114049215A (en) Abnormal transaction identification method, device and application
CN116092094A (en) Image text recognition method and device, computer readable medium and electronic equipment
RU2739342C1 (en) Method and system for intelligent document processing
CN114064893A (en) Abnormal data auditing method, device, equipment and storage medium
CN116029280A (en) Method, device, computing equipment and storage medium for extracting key information of document
CN113673368B (en) Method for judging main text direction of document
WO2021054850A1 (en) Method and system for intelligent document processing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant