CN115482538A - Material label extraction method and system based on Mask R-CNN


Info

Publication number
CN115482538A
CN115482538A (application CN202211420644.7A)
Authority
CN
China
Prior art keywords
mask
detection frame
material label
control points
cnn
Prior art date
Legal status
Granted
Application number
CN202211420644.7A
Other languages
Chinese (zh)
Other versions
CN115482538B (en)
Inventor
范柘 (Fan Zhe)
Current Assignee
Wuxi Dingshi Technology Co ltd
Shanghai Aware Information Technology Co ltd
Original Assignee
Wuxi Dingshi Technology Co ltd
Shanghai Aware Information Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Wuxi Dingshi Technology Co ltd, Shanghai Aware Information Technology Co ltd filed Critical Wuxi Dingshi Technology Co ltd
Priority to CN202211420644.7A priority Critical patent/CN115482538B/en
Publication of CN115482538A publication Critical patent/CN115482538A/en
Application granted granted Critical
Publication of CN115482538B publication Critical patent/CN115482538B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/146 Aligning or centring of the image pick-up or image-field
    • G06V30/1463 Orientation detection or correction, e.g. rotation of multiples of 90 degrees
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/146 Aligning or centring of the image pick-up or image-field
    • G06V30/1475 Inclination or skew detection or correction of characters or of image to be recognised
    • G06V30/1478 Inclination or skew detection or correction of characters or of image to be recognised of characters or characters lines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a Mask R-CNN-based material label extraction method and system, belonging to the technical field of image recognition. A Mask R-CNN model extracts a first rectangular detection frame for a material label, several pairs of first control points lying on the boundary of the label's text region, and a Mask of the label; from these, a corrective transformation of the material label image is computed, and a more suitable circumscribed rectangle, determined from the transformation matrix, is used to cut the label out of the corrected image so that the corresponding material label characters can be extracted accurately.

Description

Material label extraction method and system based on Mask R-CNN
Technical Field
The invention relates to the technical field of image recognition, and in particular to a Mask R-CNN-based material label extraction method and system, an electronic device, and a computer storage medium.
Background
With the development of artificial intelligence and computer vision, automatic identification and sorting of materials have become common in logistics and transportation, and they depend heavily on the character recognition results of material labels. Character recognition for material labels is mainly a two-stage process: a text detection network first produces a text region of interest (RoI), and the RoI is then handed to a text recognition network. However, because many materials have circular cross sections and other irregular shapes, their printed labels are arranged in arches, fans, and other non-rectangular layouts, and label character detection in the prior art has the following shortcomings: 1) for labels arranged in an arch, the rectangular RoI produced by existing text detection algorithms can hardly wrap the label tightly, which introduces a large amount of background interference and makes the result of the subsequent character recognition algorithm inaccurate; 2) segmentation-based text detection methods can effectively segment the arched region where the label sits, but for lack of a corresponding correction step, cropping with only the minimum circumscribed rectangle still leaves the label irregularly arranged inside the segmented text RoI, which the character recognition network struggles to read accurately.
The material label extraction methods of the prior art therefore have these defects and cannot meet practical requirements.
Disclosure of Invention
In order to at least solve the technical problems described in the background art, the invention provides a Mask R-CNN-based material label extraction method and system, an electronic device, and a computer storage medium.
A first aspect of the invention provides a Mask R-CNN-based material label extraction method, comprising the following steps:
s1, standardizing a first material picture to be detected, and performing character detection on the first material picture subjected to the standardized processing by adopting a Mask R-CNN model to obtain a first rectangular detection frame of a material label, a plurality of pairs of first control points positioned on the boundary of a character area of the material label and a Mask of the material label;
s2, correcting any one first rectangular detection frame in the step S1 to obtain a second rectangular detection frame;
s3, uniformly taking second control points with the same number as the first control points on the upper and lower boundaries of the second rectangular detection frame, and calculating the corrected coordinates of the second control points;
s4, calculating a transformation matrix corresponding to a transformation algorithm according to the first coordinate of the first control point and the second coordinate of the second control point;
s5, transforming the first material picture in the step S1 by using the transformation matrix obtained in the step S4 to obtain a corrected second material picture;
s6, calculating a first ordered point set of the circumscribed polygon of the Mask obtained in the step S1, and performing transformation processing on the first ordered point set by using the transformation matrix in the step S4 to obtain a second ordered point set;
s7, calculating coordinates of a circumscribed rectangle of the second ordered point set;
s8, cutting the second material picture according to the coordinates of the circumscribed rectangle to obtain a corrected material label character picture;
s9, performing character recognition on the material label character picture to obtain the material label;
and S10, repeating steps S2 to S9 until all detected material labels have been traversed.
Further, the coordinates $(x_j, y_j)$ of the jth first control point in step S1 are calculated from the relative distances $(\Delta x_j, \Delta y_j)$ regressed by the Mask R-CNN model together with the first rectangular detection frame, according to:

$x_j = x_0 + \lambda_x \cdot w \cdot \Delta x_j$

$y_j = y_0 + \lambda_y \cdot h \cdot \Delta y_j$

where the first rectangular detection frame is defined by the coordinates of its upper-left corner $(x_0, y_0)$, its width $w$, and its height $h$, i.e. $B = (x_0, y_0, w, h)$, and $\lambda_x$, $\lambda_y$ are preset normalized weights described below.
Further, the step S2 specifically includes:

taking any one of the first rectangular detection frames in step S1, taking the average of the lengths of the upper and lower boundaries of the text region as the width of a rectangular template and the average height of the text region as the height of the rectangular template, and scaling the rectangular template to a preset size while keeping the aspect ratio, to obtain the second rectangular detection frame.
Further, in step S3, uniformly taking the same number of second control points as first control points on the upper and lower boundaries of the second rectangular detection frame, and calculating the corrected coordinates of the second control points, includes:

if the number of pairs of first control points is k, so that the total number of first control points is 2k, and the size of the second rectangular detection frame obtained in step S2 is $W \times H$, uniformly taking k second control points from left to right on the upper boundary of the second rectangular detection frame obtained in step S2, and likewise taking k second control points at the corresponding positions on the lower boundary;

wherein the coordinates of the jth second control point on the upper boundary of the second rectangular detection frame are:

$p_j^{top} = \left(\tfrac{(j-1)W}{k-1},\; 0\right), \quad j = 1, \dots, k$

and the coordinates of the jth second control point on the lower boundary are:

$p_j^{bot} = \left(\tfrac{(j-1)W}{k-1},\; H\right), \quad j = 1, \dots, k$
further, the corner point at the upper left corner of the second rectangular detection frame is placed on the first material picture
Figure 202673DEST_PATH_IMAGE012
Obtaining the coordinates of each second control point in the first material picture, wherein,
Figure DEST_PATH_IMAGE013
Figure 865DEST_PATH_IMAGE014
is an offset.
Further, the transformation algorithm in step S4 uses a thin-plate spline interpolation (TPS) transformation.
Further, in step S6, the Mask is subjected to connected region extraction using the findContours function of OpenCV.
The second aspect of the invention provides a material label extraction system based on Mask R-CNN, which comprises a shooting module, a processing module and a storage module; the processing module is connected with the shooting module and the storage module;
the storage module is used for storing executable computer program codes;
the shooting module is used for shooting a first material picture and transmitting the first material picture to the processing module;
the processing module is configured to execute the method according to any one of the preceding claims by calling the executable computer program code in the storage module.
A third aspect of the present invention provides an electronic device comprising: a memory storing executable program code; a processor coupled with the memory; the processor calls the executable program code stored in the memory to perform the method of any of the preceding claims.
A fourth aspect of the invention provides a computer storage medium having stored thereon a computer program which, when executed by a processor, performs the method as set out in any one of the preceding claims.
According to the above scheme, for material labels arranged non-rectangularly, a Mask R-CNN model extracts a first rectangular detection frame for each material label, several pairs of first control points on the boundary of the label's text region, and a Mask of the label; a corrective transformation of the material label image is then computed, and a more suitable circumscribed rectangle determined from the transformation matrix is used to cut the label out of the corrected image, so that the corresponding material label characters can be extracted accurately.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
FIG. 1 is a schematic flow chart of a material label extraction method based on Mask R-CNN disclosed in the embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a Mask R-CNN model disclosed in an embodiment of the present invention;
fig. 3 is a schematic diagram of a result of performing text detection on a first material picture according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a process of cutting a material label text picture according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a material label extraction system based on Mask R-CNN disclosed in the embodiment of the present invention;
fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail with reference to the accompanying drawings, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
The terminology used in the embodiments of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in the examples of this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise, and "a plurality" typically includes at least two.
It should be understood that the term "and/or" as used herein merely describes an association between associated objects and means that three relationships may exist; for example, A and/or B may mean: A alone, both A and B, or B alone. In addition, the character "/" herein generally indicates an "or" relationship between the objects before and after it.
It should be understood that, although the terms first, second, third, and so on may be used in the embodiments of the present application to describe various elements, these elements should not be limited by these terms. The terms are used only to distinguish one element from another; for example, a first element could also be termed a second element and, similarly, a second element could be termed a first element, without departing from the scope of the embodiments of the present application.
Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", "in response to determining", or "in response to detecting". Similarly, the phrases "if it is determined" or "if (a stated condition or event) is detected" may be interpreted as "when it is determined", "in response to determining", "when (the stated condition or event) is detected", or "in response to detecting (the stated condition or event)", depending on the context.
It should also be noted that the terms "comprises", "comprising", and any other variations thereof are intended to cover a non-exclusive inclusion, so that an article or system that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such an article or system. Without further limitation, an element introduced by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the article or system that comprises it.
Preferred embodiments of the present invention will be described in detail below with reference to the accompanying drawings.
Example one
Referring to fig. 1, fig. 1 is a schematic flow chart of a material label extraction method based on Mask R-CNN according to an embodiment of the present invention. As shown in fig. 1, the method for extracting a material label based on Mask R-CNN of this embodiment includes the following steps:
s1, standardizing a first material picture to be detected, and performing character detection on the first material picture subjected to the standardized processing by adopting a Mask R-CNN model to obtain a first rectangular detection frame of a material label, a plurality of pairs of first control points positioned on the boundary of a character area of the material label and a Mask of the material label;
s2, carrying out correction processing on any one first rectangular detection frame in the step S1 to obtain a second rectangular detection frame;
s3, uniformly taking second control points with the same number as the first control points on the upper and lower boundaries of the second rectangular detection frame, and calculating the corrected coordinates of the second control points;
s4, calculating a transformation matrix corresponding to a transformation algorithm according to the first coordinate of the first control point and the second coordinate of the second control point;
s5, transforming the first material picture in the step S1 by using the transformation matrix obtained by calculation in the step S4 to obtain a corrected second material picture;
s6, calculating a first ordered point set of the circumscribed polygon of the Mask obtained in the step S1, and performing transformation processing on the first ordered point set by using the transformation matrix in the step S4 to obtain a second ordered point set;
s7, calculating coordinates of a circumscribed rectangle of the second ordered point set;
s8, cutting the second material picture according to the coordinates of the external rectangle to obtain a corrected material label character picture;
s9, performing character recognition on the material label character picture to obtain the material label;
and S10, steps S2 to S9 are repeated until all detected material labels have been traversed.
In the embodiment of the invention, because many material labels are not in a standard rectangular form (for example, the arc-shaped arrangement shown in fig. 2), a text detector (i.e., a trained Mask R-CNN model) first performs preliminary detection of the text region on the initial first material picture, determining a first rectangular detection frame, several pairs of first control points, and a Mask for each material label. Each first rectangular detection frame containing a material label is then corrected so that the label characters are arranged from left to right, giving a second rectangular detection frame. Next, the same number of second control points are taken on the second rectangular detection frame, a transformation matrix is determined from the coordinates of the first and second control points, and the initial first material picture is transformed with this matrix to obtain a corrected second material picture. Finally, the first ordered point set of the circumscribed polygon of the Mask is transformed with the same matrix to obtain a second ordered point set, the coordinates of the circumscribed rectangle of the second ordered point set are determined, and the second material picture is cut with these coordinates to obtain the material label character picture, i.e., the extracted material label region image, from which the material label is obtained with a suitable character recognition algorithm. A high-level sketch of the whole pipeline follows.
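To make the flow concrete, here is a hedged end-to-end sketch of steps S1 to S10 in Python. `model`, `reader`, the constants `MEAN`, `STD`, `DX`, `DY`, and every helper function named here are illustrative stand-ins sketched in the sections that follow, not names from the patent; the OpenCV TPS direction convention is discussed at the TPS step below.

```python
import numpy as np

def extract_labels(image, model, reader):
    # S1: standardize the picture, then run the trained Mask R-CNN text detector.
    detections = model(standardize(image, MEAN, STD))
    texts = []
    for box, first_pts, mask_poly in detections:            # S10: one pass per label
        k = len(first_pts) // 2                             # k pairs of control points
        W, H = build_template(first_pts[:k], first_pts[k:])         # S2
        second_pts = template_control_points(W, H, k, DX, DY)       # S3
        corrected, tps = tps_rectify(image, first_pts, second_pts)  # S4-S5
        poly = np.asarray(mask_poly, np.float32).reshape(1, -1, 2)
        _, moved = tps.applyTransformation(poly)            # S6: move the mask polygon
        crop = crop_label(corrected, moved)                 # S7-S8: bound and cut
        texts.append(reader(crop))                          # S9: character recognition
    return texts
```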
It should be noted that, in step S1, standardizing the material picture to be detected may include: dividing each pixel of the picture to be detected by 255 for normalization, and then subtracting the mean and dividing by the standard deviation to obtain the standardized picture to be detected.
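A minimal sketch of this standardization; the per-channel ImageNet statistics below are placeholder assumptions, not values from the patent:

```python
import numpy as np

MEAN = np.array([0.485, 0.456, 0.406], np.float32)  # assumed dataset mean
STD = np.array([0.229, 0.224, 0.225], np.float32)   # assumed dataset std

def standardize(img, mean=MEAN, std=STD):
    """Scale pixels to [0, 1], subtract the mean, then divide by the
    standard deviation, as described for step S1."""
    x = img.astype(np.float32) / 255.0
    return (x - mean) / std
```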
Further, the coordinates $(x_j, y_j)$ of the jth first control point in step S1 are calculated from the relative distances $(\Delta x_j, \Delta y_j)$ regressed by the Mask R-CNN model together with the first rectangular detection frame, according to:

$x_j = x_0 + \lambda_x \cdot w \cdot \Delta x_j$

$y_j = y_0 + \lambda_y \cdot h \cdot \Delta y_j$

where the first rectangular detection frame is defined by the coordinates of its upper-left corner $(x_0, y_0)$, its width $w$, and its height $h$, i.e. $B = (x_0, y_0, w, h)$, and $\lambda_x$, $\lambda_y$ are preset normalized weights described below.
in the embodiment of the invention, referring to fig. 3, the logarithm of the control points obtained by the Mask R-CNN model regression is 7 pairs, and the total number of the control points is 14, wherein the number of the control points is 7 for each upper text boundary and lower text boundary. The control points on the upper and lower boundaries are uniformly and orderly arranged, and the upper and lower boundary control points form a pair pairwise.
In particular, the relative distances $(\Delta x_j, \Delta y_j)$ can be calculated by the following formulas:

$\Delta x_j = \dfrac{x_j - x_0}{\lambda_x \cdot w}$

$\Delta y_j = \dfrac{y_j - y_0}{\lambda_y \cdot h}$

wherein $\lambda_x$ and $\lambda_y$ are the normalized weights corresponding to $\Delta x_j$ and $\Delta y_j$.
As shown in fig. 3, the regression targets of the control points in the Mask R-CNN model are the normalized relative distances between each control point and the corner point at the upper left of the first rectangular detection frame. The normalized weights $\lambda_x$ and $\lambda_y$ can be preset manually. A small decoding sketch follows.
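A sketch of this decoding under the reconstruction above; the default weights $\lambda_x = \lambda_y = 1$ are placeholders, since the patent presets them without reproducing concrete values here:

```python
import numpy as np

def decode_control_points(deltas, box, lam_x=1.0, lam_y=1.0):
    """Turn regressed relative distances into absolute control-point
    coordinates: x_j = x0 + lam_x * w * dx_j, y_j = y0 + lam_y * h * dy_j.
    `deltas` is a (2k, 2) array of (dx_j, dy_j); `box` is (x0, y0, w, h)."""
    x0, y0, w, h = box
    xs = x0 + lam_x * w * deltas[:, 0]
    ys = y0 + lam_y * h * deltas[:, 1]
    return np.stack([xs, ys], axis=1)
```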
Further, the Mask R-CNN model adopts the following loss function during training:

$L = L_{cls} + L_{box} + L_{cp} + L_{mask}$

wherein $L_{cls}$ represents the regression box classification loss; $L_{box}$ represents the regression loss of the rectangular detection frame; $L_{cp}$ represents the regression loss of the control points (e.g., the 14 control points shown in fig. 3); and $L_{mask}$ represents the Mask segmentation loss.
In the embodiment of the invention, $L_{cls}$ may specifically use a cross-entropy loss function; $L_{box}$ may specifically use a Smooth L1 loss function; $L_{cp}$ may specifically use a Smooth L1 loss function; and $L_{mask}$ may specifically use a binary cross-entropy loss function.
Further, the step S2 specifically includes:

taking any one of the first rectangular detection frames in step S1, taking the average of the lengths of the upper and lower boundaries of the text region as the width of a rectangular template and the average height of the text region as the height of the rectangular template, and scaling the rectangular template to a preset size while keeping the aspect ratio, to obtain the second rectangular detection frame.

In the embodiment of the invention, the length of the upper boundary of the text region of the material label is calculated as follows: compute the first distance values between adjacent control points on the upper boundary obtained in step S1 and take the sum of the first distance values as the length of the upper boundary. The length of the lower boundary is calculated in the same way: compute the second distance values between adjacent control points on the lower boundary and take their sum as the length of the lower boundary. The average height of the text region is calculated as the mean of the third distance values between the control points of each pair obtained in step S1.
Specifically, scaling the rectangular template to a preset size to obtain the second rectangular detection frame includes:

setting the target short-side length to $S$, denoting the short side of the rectangular template by $s$, scaling the short side of the rectangular template to $S$, calculating the scaling ratio $r = S / s$, and scaling the long side of the rectangular template by the same ratio $r$ to obtain the second rectangular detection frame. The target short-side length $S$ can be preset manually. A sketch of the template construction and scaling follows.
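A sketch of the template construction and scaling just described, assuming the k upper and k lower control points arrive as separate (k, 2) arrays; the target short side of 32 is a placeholder, not a value from the patent:

```python
import numpy as np

def build_template(top_pts, bottom_pts, target_short=32.0):
    """Step S2: width = average of upper/lower boundary lengths, height =
    average pairwise control-point distance, then scale so the short side
    equals `target_short` while keeping the aspect ratio."""
    top = np.asarray(top_pts, np.float32)
    bot = np.asarray(bottom_pts, np.float32)
    # Boundary length: sum of distances between consecutive control points.
    upper_len = np.linalg.norm(np.diff(top, axis=0), axis=1).sum()
    lower_len = np.linalg.norm(np.diff(bot, axis=0), axis=1).sum()
    width = (upper_len + lower_len) / 2.0
    # Average height: mean distance between each matched pair of points.
    height = np.linalg.norm(top - bot, axis=1).mean()
    r = target_short / min(width, height)   # scaling ratio r = S / s
    return width * r, height * r
```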
Further, as shown in fig. 4, in step S3, uniformly taking the same number of second control points as first control points on the upper and lower boundaries of the second rectangular detection frame, and calculating the corrected coordinates of the second control points, includes:

if the number of pairs of first control points is k, so that the total number of first control points is 2k, and the size of the second rectangular detection frame obtained in step S2 is $W \times H$, uniformly taking k second control points from left to right on the upper boundary of the second rectangular detection frame obtained in step S2, and likewise taking k second control points at the corresponding positions on the lower boundary;

wherein the coordinates of the jth second control point on the upper boundary of the second rectangular detection frame are:

$p_j^{top} = \left(\tfrac{(j-1)W}{k-1},\; 0\right), \quad j = 1, \dots, k$

and the coordinates of the jth second control point on the lower boundary are:

$p_j^{bot} = \left(\tfrac{(j-1)W}{k-1},\; H\right), \quad j = 1, \dots, k$
In the embodiment of the invention, the second control points are marked out in the corrected second rectangular detection frame in the same number as the corresponding first control points, which yields the coordinates of the second control points on the upper and lower boundaries of the second rectangular detection frame.
Further, the corner point at the upper left corner of the second rectangular detection frame is placed at position $(\delta_x, \delta_y)$ on the first material picture to obtain the coordinates of each second control point in the first material picture, where $\delta_x$ and $\delta_y$ are offsets.

In the embodiment of the invention, a limited number of control points cannot represent the character outline exactly, so some characters may fall outside the control points during the transformation. The rectangular template is therefore not placed flush with the boundary; that is, the upper-left corner point of the second rectangular detection frame is placed not at (0, 0) but at $(\delta_x, \delta_y)$ on the picture, where the offsets can be preset manually. At this time, as shown in fig. 4, the coordinates of the jth second control point on the upper boundary of the second rectangular detection frame become:

$p_j^{top} = \left(\delta_x + \tfrac{(j-1)W}{k-1},\; \delta_y\right), \quad j = 1, \dots, k$

and the coordinates of the jth second control point on the lower boundary become:

$p_j^{bot} = \left(\delta_x + \tfrac{(j-1)W}{k-1},\; \delta_y + H\right), \quad j = 1, \dots, k$

A sketch generating these offset control points follows.
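A sketch that generates these offset second control points; the default offsets are placeholders, since the patent's concrete values were not reproduced in this text:

```python
import numpy as np

def template_control_points(W, H, k, dx=8.0, dy=8.0):
    """Step S3 with the placement offset: k points uniformly spaced on the
    upper boundary of the (W, H) template and k matching points on the lower
    boundary, all shifted by (dx, dy). Returns a (2k, 2) array, upper points
    first, matching the order of the first control points."""
    xs = dx + np.linspace(0.0, W, k)       # uniform from left to right
    top = np.stack([xs, np.full(k, dy)], axis=1)
    bot = np.stack([xs, np.full(k, dy + H)], axis=1)
    return np.concatenate([top, bot], axis=0).astype(np.float32)
```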
further, the transformation algorithm in step S4 uses a thin-plate spline interpolation (TPS) transformation.
Further, in step S6, the Mask is subjected to connected region extraction using the findContours function of OpenCV.
In the embodiment of the invention, the first ordered point set of the circumscribed polygon is transformed with the transformation matrix obtained in step S4 to obtain the transformed second ordered point set; the circumscribed rectangle of the transformed ordered point set can then be calculated with the boundingRect function of OpenCV and represented as $(x_r, y_r, w_r, h_r)$.

Further, in step S8, cutting the second material picture to obtain the corrected material label character picture includes:

cutting out of the second material picture the rectangular region whose upper-left corner is at $(x_r, y_r)$, whose width is $w_r$ and whose height is $h_r$, to obtain the corrected material label character picture.
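A sketch of steps S7 and S8 under the above representation; the helper name is illustrative:

```python
import cv2
import numpy as np

def crop_label(corrected_img, moved_polygon):
    """Bounding rectangle (x_r, y_r, w_r, h_r) of the transformed mask
    polygon, then crop the corrected second material picture to it."""
    pts = np.asarray(moved_polygon, np.float32).reshape(-1, 2)
    x, y, w, h = cv2.boundingRect(pts)
    return corrected_img[y:y + h, x:x + w]
```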
Example two
Referring to fig. 5, fig. 5 is a schematic structural diagram of a material label extraction system based on Mask R-CNN according to an embodiment of the present invention. As shown in fig. 5, the system for extracting a material label based on Mask R-CNN of the present embodiment includes a shooting module 101, a processing module 102, and a storage module 103; the processing module 102 is connected with the shooting module 101 and the storage module 103;
the storage module 103 is used for storing executable computer program codes;
the shooting module 101 is used for shooting a first material picture and transmitting the first material picture to the processing module 102;
the processing module 102 is configured to execute the method according to the first embodiment by calling the executable computer program code in the storage module 103.
The specific functions of the Mask R-CNN-based material label extraction system in this embodiment refer to the first embodiment, and since the system in this embodiment adopts all technical solutions of the first embodiment, at least all beneficial effects brought by the technical solutions of the first embodiment are achieved, and details are not repeated herein.
EXAMPLE III
Referring to fig. 6, fig. 6 is an electronic device according to an embodiment of the present invention, including: a memory storing executable program code; a processor coupled with the memory; the processor calls the executable program code stored in the memory to execute the method according to the first embodiment.
Example four
The embodiment of the invention also discloses a computer storage medium, wherein a computer program is stored on the storage medium, and when the computer program is executed by a processor, the method in the first embodiment is executed.
Computer storage media for embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, Smalltalk, or C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It should be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in more detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention.

Claims (10)

1. A material label extraction method based on Mask R-CNN is characterized by comprising the following steps:
s1, standardizing a first material picture to be detected, and performing character detection on the first material picture after standardized processing by adopting a Mask R-CNN model to obtain a first rectangular detection frame of a material label, a plurality of pairs of first control points positioned on the boundary of a character area of the material label and a Mask of the material label;
s2, correcting any one first rectangular detection frame in the step S1 to obtain a second rectangular detection frame;
s3, uniformly taking second control points with the same number as the first control points on the upper and lower boundaries of the second rectangular detection frame, and calculating the corrected coordinates of the second control points;
s4, calculating a transformation matrix corresponding to a transformation algorithm according to the first coordinate of the first control point and the second coordinate of the second control point;
s5, transforming the first material picture in the step S1 by using the transformation matrix obtained by calculation in the step S4 to obtain a corrected second material picture;
s6, calculating a first ordered point set of a circumscribed polygon of the Mask obtained in the step S1, and performing transformation processing on the first ordered point set by using the transformation matrix in the step S4 to obtain a second ordered point set;
s7, calculating coordinates of a circumscribed rectangle of the second ordered point set;
s8, cutting the second material picture according to the coordinates of the circumscribed rectangle to obtain a corrected material label character picture;
s9, performing character recognition on the material label character picture to obtain the material label;
and S10, steps S2 to S9 are repeated until all detected material labels have been traversed.
2. The Mask R-CNN-based material label extraction method as claimed in claim 1, wherein the coordinates $(x_j, y_j)$ of the jth first control point in step S1 are calculated from the relative distances $(\Delta x_j, \Delta y_j)$ regressed by the Mask R-CNN model together with the first rectangular detection frame, according to:

$x_j = x_0 + \lambda_x \cdot w \cdot \Delta x_j$

$y_j = y_0 + \lambda_y \cdot h \cdot \Delta y_j$

wherein the first rectangular detection frame is defined by the coordinates of its upper-left corner $(x_0, y_0)$, its width $w$, and its height $h$, i.e. $B = (x_0, y_0, w, h)$, and $\lambda_x$, $\lambda_y$ are preset normalized weights.
3. The Mask R-CNN-based material label extraction method according to claim 1 or 2, characterized in that the step S2 specifically includes:

taking any one of the first rectangular detection frames in step S1, taking the average of the lengths of the upper and lower boundaries of the text region as the width of a rectangular template and the average height of the text region as the height of the rectangular template, and scaling the rectangular template to a preset size while keeping the aspect ratio, to obtain the second rectangular detection frame.
4. The Mask R-CNN-based material label extraction method according to claim 3, characterized in that in step S3, uniformly taking the same number of second control points as first control points on the upper and lower boundaries of the second rectangular detection frame, and calculating the corrected coordinates of the second control points, includes:

if the number of pairs of first control points is k, so that the total number of first control points is 2k, and the size of the second rectangular detection frame obtained in step S2 is $W \times H$, uniformly taking k second control points from left to right on the upper boundary of the second rectangular detection frame obtained in step S2, and likewise taking k second control points at the corresponding positions on the lower boundary;

wherein the coordinates of the jth second control point on the upper boundary of the second rectangular detection frame are:

$p_j^{top} = \left(\tfrac{(j-1)W}{k-1},\; 0\right), \quad j = 1, \dots, k$

and the coordinates of the jth second control point on the lower boundary are:

$p_j^{bot} = \left(\tfrac{(j-1)W}{k-1},\; H\right), \quad j = 1, \dots, k$
5. The Mask R-CNN-based material label extraction method according to claim 4, characterized in that the corner point at the upper left corner of the second rectangular detection frame is placed at position $(\delta_x, \delta_y)$ on the first material picture to obtain the coordinates of each second control point in the first material picture, where $\delta_x$ and $\delta_y$ are offsets.
6. The material label extraction method based on Mask R-CNN as claimed in claim 1, 2, 4 or 5, wherein: the transformation algorithm in step S4 uses a thin-plate spline interpolation (TPS) transformation.
7. The Mask R-CNN-based material label extraction method as claimed in claim 1, 2, 4 or 5, wherein: in step S6, the Mask is subjected to connected region extraction using the findContours function of OpenCV.
8. A material label extraction system based on Mask R-CNN comprises a shooting module, a processing module and a storage module; the processing module is connected with the shooting module and the storage module;
the storage module is used for storing executable computer program codes;
the shooting module is used for shooting a first material picture and transmitting the first material picture to the processing module;
the method is characterized in that: the processing module for performing the method of any one of claims 1-7 by invoking the executable computer program code in the storage module.
9. An electronic device, comprising:
a memory storing executable program code;
a processor coupled with the memory;
the method is characterized in that: the processor calls the executable program code stored in the memory to perform the method of any of claims 1-7.
10. A computer storage medium having a computer program stored thereon, characterized in that: the computer program, when executed by a processor, performs the method of any one of claims 1-7.
CN202211420644.7A 2022-11-15 2022-11-15 Material label extraction method and system based on Mask R-CNN Active CN115482538B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211420644.7A CN115482538B (en) 2022-11-15 2022-11-15 Material label extraction method and system based on Mask R-CNN

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211420644.7A CN115482538B (en) 2022-11-15 2022-11-15 Material label extraction method and system based on Mask R-CNN

Publications (2)

Publication Number Publication Date
CN115482538A (this publication): published 2022-12-16
CN115482538B (en): published 2023-04-18

Family

ID=84396506

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211420644.7A Active CN115482538B (en) 2022-11-15 2022-11-15 Material label extraction method and system based on Mask R-CNN

Country Status (1)

Country Link
CN (1) CN115482538B (en)


Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130094764A1 * 2011-10-17 2013-04-18 Richard John Campbell Methods, Systems and Apparatus for Correcting Perspective Distortion in a Document Image
CN109886896A * 2019-02-28 2019-06-14 Minjiang University Blue license plate segmentation and correction method
CN112001406A * 2019-05-27 2020-11-27 Hangzhou Hikvision Digital Technology Co., Ltd. Text region detection method and device
CN110287960A * 2019-07-02 2019-09-27 Institute of Information Engineering, Chinese Academy of Sciences Detection and recognition method for curved text in natural scene images
US20220346885A1 * 2019-09-20 2022-11-03 Canon U.S.A., Inc. Artificial intelligence coregistration and marker detection, including machine learning and using results thereof
CN110751151A * 2019-10-12 2020-02-04 Shanghai Eye Control Technology Co., Ltd. Text character detection method and equipment for vehicle body images
CN110837835A * 2019-10-29 2020-02-25 Huazhong University of Science and Technology End-to-end scene text recognition method based on boundary point detection
CN111612009A * 2020-05-21 2020-09-01 Tencent Technology (Shenzhen) Co., Ltd. Text recognition method, device, equipment and storage medium
CN112258426A * 2020-11-27 2021-01-22 Fuzhou University Automatic scaffold image inclination correction method based on Mask RCNN
CN112883964A * 2021-02-07 2021-06-01 Hohai University Method for detecting characters in natural scenes
CN113205090A * 2021-04-29 2021-08-03 Beijing Baidu Netcom Science and Technology Co., Ltd. Picture rectification method and device, electronic equipment and computer-readable storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
PENGYUAN LYU et al.: "Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes", ECCV
LI Du et al.: "Practical preprocessing techniques in industrial character recognition", Journal of Jiangnan University (Natural Science Edition)
CHENG Yao et al.: "Design of infrared image acquisition ***", Journal of Chongqing Institute of Technology (Natural Science)

Also Published As

Publication number Publication date
CN115482538B (en) 2023-04-18

Similar Documents

Publication Publication Date Title
US11275961B2 (en) Character image processing method and apparatus, device, and storage medium
CN108009543B (en) License plate recognition method and device
CN110147786B (en) Method, apparatus, device, and medium for detecting text region in image
WO2019128646A1 (en) Face detection method, method and device for training parameters of convolutional neural network, and medium
WO2019169772A1 (en) Picture processing method, electronic apparatus, and storage medium
CN111353497B (en) Identification method and device for identity card information
CN110956171A (en) Automatic nameplate identification method and device, computer equipment and storage medium
WO2018233055A1 (en) Method and apparatus for entering policy information, computer device and storage medium
CN110210297B (en) Method for locating and extracting Chinese characters in customs clearance image
CN110659574A (en) Method and system for outputting text line contents after status recognition of document image check box
US20200372248A1 (en) Certificate recognition method and apparatus, electronic device, and computer-readable storage medium
US11600091B2 (en) Performing electronic document segmentation using deep neural networks
CN114529459B (en) Method, system and medium for enhancing image edge
CN111091123A (en) Text region detection method and equipment
CN110443184A (en) ID card information extracting method, device and computer storage medium
CN111368632A (en) Signature identification method and device
US9483834B1 (en) Object boundary detection in an image
CN115482538B (en) Material label extraction method and system based on Mask R-CNN
CN115830604A (en) Surface single image correction method, device, electronic apparatus, and readable storage medium
CN111046770A (en) Automatic annotation method for photo file figures
CN107330470B (en) Method and device for identifying picture
CN111079749A (en) End-to-end commodity price tag character recognition method and system with attitude correction function
CN115424254A (en) License plate recognition method, system, equipment and storage medium
CN115273126A (en) Identification method and device for components in constructional engineering drawing and electronic equipment
CN113569859A (en) Image processing method and device, electronic equipment and storage medium

Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant