CN109685870A - Information labeling method and device, tagging equipment and storage medium - Google Patents

Publication number
CN109685870A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811388635.8A
Other languages
Chinese (zh)
Other versions
CN109685870B (en)
Inventor
陈意浩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Huiliu Technology Co Ltd
Original Assignee
Beijing Huiliu Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Huiliu Technology Co Ltd
Priority to CN201811388635.8A
Publication of CN109685870A
Application granted
Publication of CN109685870B
Legal status: Active
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G06T11/60 Editing figures and text; Combining figures or text
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The embodiments of the invention disclose an information labeling method and device, tagging equipment and a storage medium. The method comprises: detecting the position of information to be marked in an image; displaying, according to the detected position, a callout box in an editable state on the image; if a first edit operation acting on the callout box is detected, adjusting the callout box according to the first edit operation; and, taking the information to be marked framed by the callout box as a recognition unit, recognizing the information to be marked to obtain marked content.

Description

Information labeling method and device, tagging equipment and storage medium
Technical field
The present invention relates to the field of information technology, and in particular to an information labeling method and device, tagging equipment and a storage medium.
Background
In some scenarios, text information in an image needs to be labeled. Existing labeling tools only provide an interface for manual labeling, which requires the user to enter the labeling information by hand. Although this is simple for the user to operate, it is highly repetitive, and manual errors easily occur during long-term work, which lowers the accuracy of the labeling.
Summary of the invention
In view of this, embodiments of the present invention are intended to provide an information labeling method and device, tagging equipment and a storage medium.
The technical solution of the present invention is implemented as follows:
An information labeling method, comprising:
detecting the position of information to be marked in an image;
displaying, according to the detected position, a callout box in an editable state on the image;
if a first edit operation acting on the callout box is detected, adjusting the callout box according to the first edit operation;
taking the information to be marked framed by the callout box as a recognition unit, recognizing the information to be marked to obtain marked content.
Based on the above solution, adjusting the callout box according to the first edit operation includes at least one of the following:
adjusting, according to the first edit operation, the position of the callout box in the editable state;
adjusting, according to the first edit operation, the size of the callout box in the editable state;
adjusting, according to the first edit operation, the shape of the callout box in the editable state.
Based on the above solution, the method further includes:
displaying the image in a first region;
displaying the marked content in an editable state in a second region, where the first region is different from the second region;
detecting a second edit operation acting on the marked content;
updating the marked content based on the second edit operation.
Based on the above solution, the method further includes:
generating a mark file, where the mark file includes: an identifier of the information to be marked and the marked content corresponding to the information to be marked.
Based on the above solution, generating the mark file includes:
generating the mark file including at least one key-value pair;
where the identifier of the information to be marked is the key, and the marked content corresponding to the information to be marked is the value;
the identifier of the information to be marked includes: the coordinate values of the callout box framing the information to be marked.
Based on the above solution, the method further includes:
determining the shape of the callout box;
displaying, according to the detected position, the callout box in an editable state on the image includes:
displaying, according to the position of the information to be marked, the callout box of the determined shape in an editable state on the image.
Based on the above solution, determining the shape of the callout box includes at least one of the following:
determining the shape of the callout box according to an operation specifying the shape of the callout box;
determining the shape of the callout box according to the distribution form of the information to be marked on the image.
Based on the above solution, the method further includes:
opening a working directory;
loading the images under the working directory in batch;
if the n-th image meets a labeling completion condition, entering the labeling of the (n+1)-th image, where n is a positive integer.
An information labeling device, comprising:
an image detection module, configured to detect the position of information to be marked in an image;
a display module, configured to display, according to the detected position, a callout box in an editable state on the image;
an adjustment module, configured to, if a first edit operation acting on the callout box is detected, adjust the callout box according to the first edit operation;
a recognition module, configured to take the information to be marked framed by the callout box as a recognition unit and recognize the information to be marked to obtain marked content.
Tagging equipment, comprising:
a memory;
a processor, connected to the memory and configured to implement the information labeling method provided by one or more of the foregoing technical solutions by executing computer-executable code stored in the memory.
A computer storage medium storing computer-executable code; when the computer-executable code is executed, the information labeling method provided by one or more of the foregoing technical solutions can be implemented.
In the technical solution provided by the embodiments of the present invention, the position of the information to be marked is detected on the image, and a callout box is then displayed at that position. The callout box is in an editable state: if the tagging equipment detects a first edit operation input by the user, the callout box can be edited so as to change the information to be marked that a single callout box frames. The image region framed by a single callout box is then treated as a whole for recognition of the information to be marked. In this way, reliance on fully automatic recognition of the information to be marked by the equipment is reduced, and the introduction of the callout box improves the accuracy of the marked content obtained by labeling.
Detailed description of the invention
Fig. 1 is a schematic flowchart of a first information labeling method provided in an embodiment of the present invention;
Fig. 2 is a schematic display diagram of a labeling interface provided in an embodiment of the present invention;
Fig. 3 is a schematic flowchart of a second information labeling method provided in an embodiment of the present invention;
Fig. 4 is a schematic flowchart of a third information labeling method provided in an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of an information labeling device provided in an embodiment of the present invention;
Fig. 6 is a schematic flowchart of a fourth information labeling method provided in an embodiment of the present invention.
Specific embodiment
The technical solution of the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
As shown in Fig. 1, this embodiment provides an information labeling method, comprising:
Step S110: detecting the position of information to be marked in an image;
Step S120: displaying, according to the detected position, a callout box in an editable state on the image;
Step S130: if a first edit operation acting on the callout box is detected, adjusting the callout box according to the first edit operation;
Step S140: taking the information to be marked framed by the callout box as a recognition unit, recognizing the information to be marked to obtain marked content.
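The flow of steps S110 to S140 can be sketched roughly as follows. This is only an illustrative sketch: the function name `annotate`, the `(left, top, right, bottom)` box format and the three callbacks (`detect`, `get_edit`, `recognize`) are assumptions made here and are not defined in the patent.

```python
from typing import Callable, List, Optional, Tuple

# Assumed box format: (left, top, right, bottom) in image pixels.
Box = Tuple[int, int, int, int]


def annotate(
    image: object,
    detect: Callable[[object], List[Box]],
    get_edit: Callable[[List[Box]], Optional[Tuple[int, Box]]],
    recognize: Callable[[object, Box], str],
) -> List[Tuple[Box, str]]:
    """Hypothetical driver for steps S110 to S140."""
    # S110/S120: detect positions; each detected position becomes an editable box.
    boxes = list(detect(image))
    # S130: apply first edit operations until the user has no more edits.
    edit = get_edit(boxes)
    while edit is not None:
        index, new_box = edit
        boxes[index] = new_box
        edit = get_edit(boxes)
    # S140: each box is one recognition unit.
    return [(box, recognize(image, box)) for box in boxes]
```

Passing the detector, the edit source and the recognizer as callbacks keeps the sketch independent of any particular detection model or OCR engine.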
The information labeling method provided in this embodiment can be applied in various electronic devices, for example a terminal device or a labeling server on the network side.
In this embodiment, after an image is loaded and opened, the position of the information to be marked in the image is detected. The information to be marked includes, but is not limited to, one of the following:
text, including but not limited to characters, digits and punctuation marks;
icons, including but not limited to enterprise icons, trademarks and the like;
labels.
In this embodiment, the information to be marked is part of the image content and is stored in image format. To obtain the information to be marked in text format, a labeling application or labeling plug-in can be used in this embodiment to perform image content recognition on the image and convert it into marked content in text format; for example, the marked content may be, but is not limited to, text information.
In this embodiment, before the information to be marked is recognized, its position on the image is detected, and callout boxes are then displayed on the displayed image. The image region framed by one callout box (that is, the information to be marked) is one recognition unit.
As shown in Fig. 2, seven callout boxes are displayed on the image; the image region framed by each callout box is the information to be marked of one recognition unit and is converted into marked content. For example, the image content framed by the first callout box in Fig. 2 is the region where "WELCOME" is located; the content framed by this callout box is one recognition unit and can therefore be recognized as the single word "WELCOME", which reduces recognition errors caused by splitting the word into "WEL" and "COME" and recognizing them as two words. A recognition unit is treated as a whole during recognition.
In this embodiment, the callout box is an editable box. The callout box includes multiple key points whose coordinates can be moved; the first edit operation can move the coordinates of one or more key points of a callout box so as to edit the callout box. Editing a callout box adjusts the information to be marked that the single callout box frames.
Classified by shape, the callout box may include: a rectangular callout box, a triangular callout box, a polygonal callout box with five or more sides, a circular callout box, an elliptical callout box, and so on.
Callout boxes of different shapes have different key points. For example, the key points of a rectangular callout box may include: the four vertices of the rectangular callout box; or the four vertices and the center point of the rectangular callout box. As another example, the key points of a circular callout box may include: the center of the circle and any N points on its circumference, where N is a positive integer, for example 2, 3 or 4. In short, changing the coordinates of the key points can be used to change the position of the callout box and/or the size of the callout box.
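One possible way to represent a callout box by its shape and movable key points is sketched below. The class name `CalloutBox` and its methods are hypothetical and chosen only for illustration; the patent does not prescribe a data structure.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class CalloutBox:
    """Hypothetical container: a shape name plus its movable key points."""
    shape: str
    key_points: List[Point]

    def move_key_point(self, index: int, new_pos: Point) -> None:
        # Moving a single key point changes the size (and possibly the shape).
        self.key_points[index] = new_pos

    def translate(self, dx: float, dy: float) -> None:
        # Moving every key point by the same offset changes only the position.
        self.key_points = [(x + dx, y + dy) for x, y in self.key_points]

    def bounds(self) -> Tuple[float, float, float, float]:
        # Axis-aligned extent (left, top, right, bottom) of the key points.
        xs = [x for x, _ in self.key_points]
        ys = [y for _, y in self.key_points]
        return (min(xs), min(ys), max(xs), max(ys))
```

For a rectangular callout box the key points would be its four vertices; for a circular one they could be the center plus points on the circumference, in which case `bounds` would only approximate the true extent.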
On the one hand, providing callout boxes of multiple shapes makes it convenient to supply a suitable callout box for each piece of information, and can solve the problem of adjacent callout boxes overlapping each other because their shape is restricted. The image may be acquired from a paper document that shows the information to be marked, and perspective may occur during image acquisition because of the acquisition angle or other reasons. If rectangular callout boxes were used uniformly, then in an image with relatively strong perspective two adjacent rectangular callout boxes might overlap in order to each frame the required content.
On the other hand, providing callout boxes of various shapes can satisfy the labeling of content to be marked with different layouts or information to be marked of different shapes. Taking a circular icon as an example, the most suitable callout box may be a circular callout box; in this way the callout box encloses as few pixels not belonging to the icon as possible, which reduces the processing of irrelevant pixels during recognition of the information to be marked, lowers the load and improves processing efficiency.
In yet another aspect, providing callout boxes of various shapes can satisfy users' demands for callout boxes of different shapes.
In step S110, a deep learning model may be used to examine the image and determine the display position of the callout box on the image. The deep learning model includes, but is not limited to, a neural network. When the image is examined, the information to be marked that forms one unit can be preliminarily determined, so that a callout box is added at the corresponding position on the image. For example, a processing model such as a deep learning model may, according to the similarity of the pixel values of the pixels of the image and the spacing between pixels, frame with one callout box the image region containing multiple pixels that are adjacent or whose spacing is within a first distance. Similarity of pixel values here may mean that, if the difference between the pixel values is within a preset difference range, the pixels are considered similar; otherwise they are considered not similar. As another example, if the gray values of two pixels differ by an amount within a preset difference range, the two pixels may be determined to be similar; otherwise they are not. Taking English words as an example, the spacing between two characters within one word differs from the spacing between characters of two different words. In this embodiment, a first distance threshold and a second distance threshold may therefore be set from statistics on the average spacing of the characters over the whole image. If the spacing between two similar pixels is less than the first distance threshold, the two pixels may be framed by the same callout box; if the spacing between two similar pixels is greater than the second distance threshold, the two pixels need to be framed by different callout boxes. The second distance threshold is greater than the first distance threshold.
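The two-threshold rule can be illustrated with a simplified one-dimensional sketch that groups character spans on a single text line. Working on horizontal spans instead of individual pixels, the function name `group_spans` and the `merge_ambiguous` flag (for gaps that fall between the two thresholds, a case the text leaves open) are all assumptions of this sketch.

```python
from typing import List, Tuple

Span = Tuple[float, float]  # assumed (left, right) extent of one character


def group_spans(
    spans: List[Span],
    first_threshold: float,
    second_threshold: float,
    merge_ambiguous: bool = False,
) -> List[Span]:
    """Group character spans into callout-box extents using two gap thresholds."""
    if first_threshold > second_threshold:
        raise ValueError("second threshold must not be smaller than the first")
    groups: List[Span] = []
    for left, right in sorted(spans):
        if groups:
            gap = left - groups[-1][1]
            # Gap below the first threshold: same callout box.
            # Gap above the second threshold: different callout boxes.
            # Gap in between: decided by the assumed merge_ambiguous flag.
            if gap < first_threshold or (gap <= second_threshold and merge_ambiguous):
                groups[-1] = (groups[-1][0], max(groups[-1][1], right))
                continue
        groups.append((left, right))
    return groups
```

For instance, five characters with gaps of 1, 1, 4 and 1 units and thresholds of 2 and 3 would be grouped into two callout boxes, one per word.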
As shown in Fig. 2, the region framed by a callout box displayed on the image may deviate from the intended region. For example, "SPENCER" should be framed by one callout box, but the callout box in Fig. 2 clearly does not frame "SPENCER" completely. Because the callout box provided in this embodiment is in an editable state, the user can move the coordinates of the key points of the corresponding callout box by mouse dragging, touch operation or the like, so that one callout box frames "SPENCER" completely.
If a first edit operation is detected, the callout box is adjusted according to the first edit operation, for example by adjusting the information to be marked that the callout box frames, or by adjusting the position and/or size of the callout box.
If the content to be recognized that the callout box frames has changed, step S140 can be executed again to update the marked content obtained by recognition.
Step S140 may include: performing content recognition by optical character recognition (OCR), taking the image region framed by each callout box as a unit, to obtain the marked content.
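Treating each framed region as one recognition unit could look like the sketch below. The image is modelled as a plain list of rows, and `ocr` is a placeholder callable standing in for whatever OCR engine is used; neither the helper names nor the `(left, top, right, bottom)` box format come from the patent.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # assumed (left, top, right, bottom), right/bottom exclusive


def crop(image: Sequence[Sequence], box: Box) -> List[list]:
    """Cut out the region framed by one callout box."""
    left, top, right, bottom = box
    return [list(row[left:right]) for row in image[top:bottom]]


def recognize_boxes(
    image: Sequence[Sequence],
    boxes: List[Box],
    ocr: Callable[[List[list]], str],
) -> List[str]:
    """Run the (placeholder) OCR callable once per callout box."""
    return [ocr(crop(image, box)) for box in boxes]
```

Because the recognizer only ever sees one cropped region at a time, a word such as "WELCOME" that sits inside a single callout box is handed over as a whole rather than in fragments.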
In short, in this embodiment the introduction of callout boxes in an editable state makes it convenient for the user to adjust the content framed by a callout box, which improves the accuracy of information recognition.
When labeling for the first time, the tagging equipment generates callout boxes that have not yet been adjusted; in this case the tagging equipment can likewise take the information to be marked framed by each callout box as a recognition unit.
In some embodiments, in step S130, adjusting the callout box according to the first edit operation includes at least one of the following:
adjusting, according to the first edit operation, the position of the callout box in the editable state;
adjusting, according to the first edit operation, the size of the callout box in the editable state;
adjusting, according to the first edit operation, the shape of the callout box in the editable state.
The key points of the callout box can be divided into two kinds:
first-class key points, which are key points displayed on the image; referring to the rectangular callout box shown in Fig. 2, the vertices of the rectangular callout box indicated by black circles are first-class key points;
second-class key points, which are hidden key points that do not appear on the image; referring to the rectangular callout box shown in Fig. 2, the center point inside the rectangular callout box is a second-class key point.
For example, a first moving operation acting inside the callout box or on a first-class key point is detected, and the callout box is moved based on the first moving operation. The moving operation may be clicking the content of the callout box with the mouse and moving the mouse, or touching the inside of the callout box with a finger and dragging the callout box.
For example, a second moving operation acting on a second-class key point of the callout box is detected, and the key point is moved based on the second moving operation to change the area of the callout box or the content it frames. Specifically, the area of the callout box and the framed content can be changed by changing the coordinates of the second-class key point while maintaining the coordinates of the first-class key points. For example, moving the coordinates of any vertex on the right side of the callout box corresponding to "SPENCER" in Fig. 2 increases the area of that callout box so that it frames "SPENCER" completely. In some embodiments, there is an association between multiple key points. For example, in order to maintain the shape of the callout box, when a first edit operation acting on one key point is detected and changes the coordinates of that key point, the coordinates of other key points associated with it change synchronously and correspondingly. Taking a rectangular callout box as an example, when an operation stretching the coordinates of key point A is detected, the coordinates of key point A are changed, and the coordinates of key point B, which is associated with key point A, are changed synchronously. For example, the association between key point A and key point B may be embodied as follows: the line connecting key point A and key point B is perpendicular to the moving direction of key point A, or the angle between that line and the moving direction is not zero.
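The association between key points of a rectangular callout box can be sketched as follows: dragging one corner keeps the diagonally opposite corner fixed and moves the two neighbouring corners so that the box stays rectangular. The function name `drag_corner`, the corner ordering and the restriction to axis-aligned rectangles are assumptions made for this sketch.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def drag_corner(corners: List[Point], index: int, new_pos: Point) -> List[Point]:
    """Drag one corner of an axis-aligned rectangle and keep it rectangular.

    corners are assumed to be ordered top-left, top-right, bottom-right,
    bottom-left. The corner diagonally opposite the dragged one stays fixed;
    the other two corners follow so that the shape is preserved.
    """
    opposite = corners[(index + 2) % 4]
    x_min, x_max = sorted((new_pos[0], opposite[0]))
    y_min, y_max = sorted((new_pos[1], opposite[1]))
    return [(x_min, y_min), (x_max, y_min), (x_max, y_max), (x_min, y_max)]
```

Dragging the top-right corner horizontally to the right, for example, also moves the bottom-right corner, whose connecting line to the dragged corner is perpendicular to the drag direction.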
The first edit operation may also include a shape change operation, for example switching the callout box from a rectangular callout box to a circular callout box or a hexagonal callout box. Switching the shape of the callout box can reduce overlap between different callout boxes, allows different types of information to be marked to be labeled with different callout boxes, or simply meets the user's wish to label with callout boxes of different shapes.
In some embodiments, the first edit operation may also include a callout box delete operation, which can be used to delete redundant callout boxes and thus reduce the recognition of unnecessary information.
In other embodiments, the first edit operation may also include a callout box merge operation. One piece of information to be marked should correspond to one callout box, but during automatic generation of callout boxes a single piece of information to be marked may be treated as multiple pieces and split across multiple callout boxes. In that case, the multiple callout boxes can be merged based on the callout box merge operation.
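A merge operation on rectangular callout boxes could, for instance, replace the selected boxes with the smallest rectangle that covers all of them. The helper name `merge_boxes` and the `(left, top, right, bottom)` box format are assumptions of this sketch; the patent does not specify how the merged box is computed.

```python
from typing import Iterable, Tuple

Box = Tuple[int, int, int, int]  # assumed (left, top, right, bottom)


def merge_boxes(boxes: Iterable[Box]) -> Box:
    """Return the smallest axis-aligned box covering all given boxes."""
    boxes = list(boxes)
    if not boxes:
        raise ValueError("at least one callout box is required")
    lefts, tops, rights, bottoms = zip(*boxes)
    return (min(lefts), min(tops), max(rights), max(bottoms))
```

Merging the boxes around "WEL" and "COME" in this way would yield a single box around "WELCOME", which can then be recognized as one unit.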
In some embodiments, the method further includes:
displaying the image in a first region;
displaying the marked content in an editable state in a second region, where the first region is different from the second region;
detecting a second edit operation acting on the marked content;
updating the marked content based on the second edit operation.
As shown in Fig. 2, the image is displayed on the left side of the display screen with the callout boxes superimposed on it, and the marked content obtained by recognition is displayed on the right side of the display screen. The marked content is in an editable state, so that when the tagging equipment recognizes it incorrectly the user can correct the marked content by manual editing and obtain accurate marked content.
In some embodiments, as shown in Fig. 3, the method further includes:
Step S150: generating a mark file, where the mark file includes: an identifier of the information to be marked and the marked content corresponding to the information to be marked.
A mark file can be generated in this embodiment, and the mark file corresponds to the labeled image. In some embodiments, to make later access easier for the user, the mark file may be given the file name of the labeled image.
In this embodiment, the mark file may be any file in which marked content can be edited, for example a file with the suffix TXT or DOC.
Step S150 may include:
generating the mark file including at least one key-value pair;
where the identifier of the information to be marked is the key, and the marked content corresponding to the information to be marked is the value;
the identifier of the information to be marked includes: the coordinate values of the callout box framing the information to be marked.
In this embodiment, the content of the mark file includes one or more key-value pairs. When the mark file is open and the user operates on a key-value pair, the labeling application can be triggered to open and display the image corresponding to the mark file, with the corresponding callout box also displayed on the image, which makes re-editing convenient.
In this embodiment, the identifier of the information to be marked includes the coordinate values of the callout box, which may be image coordinates in the image coordinate system of the corresponding image. For example, with a plane coordinate system established at the center point of the image, the coordinate values of a rectangular callout box may include the coordinate values of its four vertices, and the coordinate values of a circular callout box may include the coordinate values of its center and its radius or diameter.
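A TXT mark file with one key-value pair per line could be written and read back as sketched below. The concrete line layout (vertex coordinates joined by `;`, a tab, then the marked content) and the function names are assumptions chosen for illustration; the patent only states that the key contains the callout box coordinates and the value is the marked content.

```python
from typing import List, Tuple

Vertex = Tuple[int, int]
Pair = Tuple[List[Vertex], str]  # (callout box vertices, marked content)


def format_key(vertices: List[Vertex]) -> str:
    """Encode the vertex coordinates of a callout box as the key."""
    return ";".join(f"{x},{y}" for x, y in vertices)


def dump_mark_file(pairs: List[Pair]) -> str:
    """Serialize key-value pairs, one per line, as 'key<TAB>value'."""
    return "\n".join(f"{format_key(v)}\t{content}" for v, content in pairs)


def load_mark_file(text: str) -> List[Pair]:
    """Parse the assumed line format back into key-value pairs."""
    pairs: List[Pair] = []
    for line in text.splitlines():
        key, content = line.split("\t", 1)
        vertices = [tuple(int(n) for n in p.split(",")) for p in key.split(";")]
        pairs.append((vertices, content))
    return pairs
```

Being able to parse the coordinates back is what would let a labeling application redraw the callout boxes when the mark file is reopened for editing.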
In some embodiments, the identifier may further include a mark label. The mark label may be the serial number of the callout box. For example, if there are S callout boxes on an image, the callout boxes are numbered according to their order on the image (for example, from front to back and from left to right, or from front to back and from right to left), and the number is the serial number of the callout box.
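Numbering callout boxes by their order on the image might be done as in the sketch below. It assumes that "from front to back" means top to bottom, that boxes are `(left, top, right, bottom)` tuples, and that serial numbers start at 1; none of these details are fixed by the text.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # assumed (left, top, right, bottom)


def number_boxes(boxes: List[Box]) -> Dict[int, Box]:
    """Assign serial numbers by top edge first, then by left edge."""
    ordered = sorted(boxes, key=lambda b: (b[1], b[0]))
    return {number: box for number, box in enumerate(ordered, start=1)}
```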
In some embodiments, the mark label may also be the number of the key-value pair in the mark file. For example, if the mark file is a TXT file, each line of the TXT file corresponds to one mark label, and the line number in the TXT file can serve as the mark label.
Of course, the above are only examples; in a specific implementation the mark label is not limited to them.
In some embodiments, as shown in Fig. 4, the method further includes: step S111: determining the shape of the callout box.
Step S120 may include: displaying, according to the position of the information to be marked, the callout box of the determined shape in an editable state on the image.
In this embodiment, the shape of the callout box to be displayed can be determined before the callout box is displayed on the image, so that a callout box of a suitable shape is used to mark the corresponding information to be marked.
In some embodiments, step S111 may include at least one of the following:
determining the shape of the callout box according to an operation specifying the shape of the callout box;
determining the shape of the callout box according to the distribution form of the information to be marked on the image.
In some embodiments, the user may specify a callout box of a particular shape; to meet the user's demand, the shape of the callout box is determined according to the shape-specifying operation.
In some embodiments, the shape of the callout box is determined automatically according to the distribution form of the information to be marked on the image. If the characters in the image show obvious perspective, appearing large at one end and small at the other, a trapezoidal callout box rather than a rectangular callout box can be selected automatically with priority. On the one hand this reduces the overlap between rectangular callout boxes; on the other hand it reduces the work of excluding pixels that need not take part in recognition, which speeds up the generation of the subsequent marked content.
In some embodiments, step S111 may include: determining the shape of the callout box according to the type of the information to be marked.
For example, rectangular or trapezoidal callout boxes can be used for words or Chinese characters of various languages. The area of a punctuation mark is smaller than that of a word or Chinese character, so to reduce the number of pixels processed during recognition, punctuation marks can be labeled with a smaller circular or elliptical callout box. As another example, for a round icon, a circular callout box can be selected with priority according to the shape of the icon.
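The shape selection rules described for step S111 can be condensed into a small decision function. The function name `choose_shape`, the type strings and the fallback to a rectangle are assumptions of this sketch rather than rules stated in the patent.

```python
from typing import Optional


def choose_shape(
    info_type: str,
    has_perspective: bool = False,
    icon_shape: Optional[str] = None,
) -> str:
    """Pick a callout box shape from the type of information to be marked."""
    if info_type == "icon" and icon_shape == "circle":
        # A round icon is best enclosed by a circular callout box.
        return "circle"
    if info_type == "punctuation":
        # Punctuation is small; a small circular box limits the pixels processed.
        return "circle"
    if info_type == "text":
        # Text with visible perspective gets a trapezoid, otherwise a rectangle.
        return "trapezoid" if has_perspective else "rectangle"
    # Assumed default when no rule applies.
    return "rectangle"
```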
In some embodiments, step S140 may further include:
If the shape of the callout box is determined according to the type of the information to be marked, the pixel values of the image region framed by the callout box can also be input, according to the type of the information to be marked or the shape of the callout box, into the corresponding recognition model to recognize the information to be marked. Different recognition models have different accuracy for different types of information to be marked. For example, compared with a text recognition model and an icon recognition model, a punctuation mark recognition model recognizes punctuation marks more accurately. Inputting the information to be marked into the corresponding recognition model in this way can improve recognition accuracy.
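Routing each framed region to a type-specific recognition model can be sketched as a simple lookup. The function name `recognize_by_type`, the type strings and the use of plain callables as stand-ins for recognition models are assumptions made for illustration.

```python
from typing import Callable, Dict


def recognize_by_type(
    region: object,
    info_type: str,
    models: Dict[str, Callable[[object], str]],
) -> str:
    """Send the framed region to the recognition model registered for its type."""
    try:
        model = models[info_type]
    except KeyError:
        raise ValueError(f"no recognition model registered for type {info_type!r}")
    return model(region)
```

The same lookup could be keyed by callout box shape instead of information type, since the text allows either to select the model.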
In some embodiments, the method further includes: opening a working directory; loading the images under the working directory in batch;
if the n-th image meets a labeling completion condition, entering the labeling of the (n+1)-th image, where n is a positive integer.
In this embodiment, batch recognition of images can be performed in a labeling application or a labeling plug-in.
After the labeling of one image is completed, the labeling of another image is entered automatically or based on a user operation.
For example, after the marked content is generated or the mark file is saved, an operation by the user to switch to the next image is taken to mean that the labeling completion condition is met, and the labeling of the next image is entered. Alternatively, when an operation by which the user manually triggers saving of all marked content or of the mark file is detected, the labeling completion condition is determined to be met and the labeling of the next image is entered.
Entering the labeling of the (n+1)-th image may include: displaying the (n+1)-th image in the interface, and executing at least steps S110 to S150.
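Batch labeling over a working directory can be sketched with the standard `pathlib` module. The set of image suffixes, the function names and the two callbacks (`annotate_image` for steps S110 to S150, `is_complete` for the labeling completion condition) are assumptions of this sketch.

```python
from pathlib import Path
from typing import Callable, List

# Assumed set of image file suffixes to load from the working directory.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp"}


def list_images(working_dir: str) -> List[Path]:
    """Collect the image files directly under the working directory, sorted by name."""
    return sorted(
        p for p in Path(working_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def annotate_directory(
    working_dir: str,
    annotate_image: Callable[[Path], None],
    is_complete: Callable[[Path], bool],
) -> List[Path]:
    """Label images one after another; stop at the first unfinished image."""
    finished: List[Path] = []
    for path in list_images(working_dir):
        annotate_image(path)           # would run steps S110 to S150 on this image
        if not is_complete(path):      # completion condition not met: stay here
            break
        finished.append(path)          # condition met: move on to image n+1
    return finished
```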
In some embodiments, the method further includes:
detecting a callout box add operation acting on the image, and adding a callout box to the image at the position on which the add operation acts. For example, when a double-click or click is detected on part of the image where no callout box is displayed, a callout box is generated. In some embodiments, the center point of the newly generated callout box may be the position on which the add operation acts. In this way, if the callout boxes miss some information to be marked, the number of callout boxes can be increased through the add operation.
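Adding a callout box centered on a click position could be sketched as follows. The function name `add_box_at`, the default box size and the rule that clicks inside an existing box add nothing are assumptions of this sketch.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # assumed (left, top, right, bottom)


def add_box_at(
    boxes: List[Box],
    click: Tuple[int, int],
    width: int = 40,
    height: int = 20,
) -> List[Box]:
    """Return a new list with a default-sized box centered on the click position.

    If the click lands inside an existing callout box, nothing is added.
    """
    x, y = click
    for left, top, right, bottom in boxes:
        if left <= x <= right and top <= y <= bottom:
            return list(boxes)
    left = x - width // 2
    top = y - height // 2
    return list(boxes) + [(left, top, left + width, top + height)]
```

The new box starts at an assumed default size and can then be resized through the first edit operation like any other callout box.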
As shown in Fig. 5, this embodiment provides an information labeling device, comprising:
an image detection module 110, configured to detect the position of information to be marked in an image;
a display module 120, configured to display, according to the detected position, a callout box in an editable state on the image;
an adjustment module 130, configured to, if a first edit operation acting on the callout box is detected, adjust the callout box according to the first edit operation;
a recognition module 140, configured to take the information to be marked framed by the callout box as a recognition unit and recognize the information to be marked to obtain marked content.
In some embodiments, the image detection module 110, display module 120, adjustment module 130 and recognition module 140 may be program modules which, when executed, implement the detection of the image, the display of the callout box, the adjustment of the callout box and the obtaining of the marked content.
In some embodiments, the image detection module 110, display module 120, adjustment module 130 and recognition module 140 may be combined software-hardware modules, for example programmable arrays; the programmable arrays may include field-programmable gate arrays or complex programmable arrays.
In other embodiments, the image detection module 110, display module 120, adjustment module 130 and recognition module 140 may be pure hardware modules, for example application-specific integrated circuits.
In some embodiments, the adjustment module 130 is specifically configured to execute at least one of the following:
Adjusting, according to the first edit operation, the position of the callout box in the editable state;
Adjusting, according to the first edit operation, the size of the callout box in the editable state;
Adjusting, according to the first edit operation, the shape of the callout box in the editable state.
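The three kinds of adjustment can be sketched as follows. The sketch assumes a callout box is stored as a list of (x, y) vertices, which covers rectangles and polygons alike; the function names and the choice to scale about the vertex centroid are illustrative assumptions, not details from the patent.

```python
def move(points, dx, dy):
    """Adjust position: translate every vertex by (dx, dy)."""
    return [(x + dx, y + dy) for x, y in points]


def resize(points, factor):
    """Adjust size: scale the vertices about their centroid by the given factor."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return [(cx + (x - cx) * factor, cy + (y - cy) * factor) for x, y in points]


def reshape(points, index, new_xy):
    """Adjust shape: drag a single vertex to a new location."""
    out = list(points)
    out[index] = new_xy
    return out
```

Each function returns a new vertex list, so the previous state of the box is left untouched and could be kept for an undo step.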
In some embodiments, the display module 120 is specifically configured to display the image in a first area and to display the marked content in an editable state in a second area, wherein the first area is different from the second area. The device further includes:
An operation detection module, configured to detect a second edit operation acting on the marked content;
An update module, configured to update the marked content based on the second edit operation.
In some embodiments, the device further includes:
A generation module, configured to generate a mark file, wherein the mark file includes: an identifier of the information to be marked and the marked content corresponding to the information to be marked.
In some embodiments, the generation module is specifically configured to generate a mark file including at least one key-value pair, wherein the identifier of the information to be marked is the key and the marked content corresponding to the information to be marked is the value; the identifier of the information to be marked includes the coordinate values of the callout box framing the information to be marked.
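A mark file of this key-value form might look like the following sketch. The patent does not name a file format; JSON is assumed here, and because JSON keys must be strings the box coordinates are joined into a comma-separated string. The helpers `box_key` and `build_mark_file` are hypothetical.

```python
import json


def box_key(points):
    """Encode the callout box vertices as a string usable as a JSON key."""
    return ",".join(f"{x},{y}" for x, y in points)


def build_mark_file(annotations):
    """Build the file text: box coordinates are the key, marked content is the value."""
    pairs = {box_key(points): text for points, text in annotations}
    return json.dumps(pairs, ensure_ascii=False, indent=2)
```

For a box with vertices (10, 20), (110, 20), (110, 50), (10, 50) framing the text SPENCER, the resulting pair is `"10,20,110,20,110,50,10,50": "SPENCER"`.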
In some embodiments, the device further includes:
A shape determination module, configured to determine the shape of the callout box;
The display module 120 is specifically configured to display on the image, according to the position of the information to be marked, a callout box of the determined shape in an editable state.
In some embodiments, the shape determination module is specifically configured to execute at least one of the following:
Determining the shape of the callout box according to a shape-specifying operation for the callout box;
Determining the shape of the callout box according to the distribution pattern of the information to be marked on the image.
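One possible reading of the two options is sketched below. It assumes the distribution pattern is summarized by the text baseline (two end points), that a user-specified shape takes priority, and that a tilt tolerance of 2 degrees separates horizontal from rotated rectangles; the function name, the shape labels and the tolerance are all assumptions made for illustration.

```python
import math


def choose_shape(baseline_start, baseline_end, user_shape=None, tol_deg=2.0):
    """Pick a callout box shape from a user choice or from the tilt of the text baseline."""
    if user_shape is not None:
        return user_shape  # a shape-specifying operation overrides the automatic choice
    dx = baseline_end[0] - baseline_start[0]
    dy = baseline_end[1] - baseline_start[1]
    angle = abs(math.degrees(math.atan2(dy, dx)))
    angle = min(angle, 180.0 - angle)  # fold so the direction of the baseline is ignored
    return "horizontal_rectangle" if angle <= tol_deg else "rotated_rectangle"
```

Curved or irregular text would need a polygon, which this sketch leaves to the user-specified path.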
In some embodiments, the device further includes:
An opening module, configured to open a working directory;
A loading module, configured to load the images under the working directory in batches;
A switching module, configured to enter the marking of the (n+1)-th image if the n-th image meets a marking completion condition, wherein n is a positive integer.
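The opening, loading and switching modules can be sketched together. The sketch assumes images are recognized by file suffix and processed in sorted order; `IMAGE_SUFFIXES`, `load_image_paths` and `LabelingSession` are hypothetical names, and the marking completion condition is passed in as a plain boolean.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}  # assumed set of image types


def load_image_paths(working_dir):
    """Batch-load: collect every image file directly under the working directory."""
    return sorted(
        p for p in Path(working_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


class LabelingSession:
    """Tracks which image is being marked and switches to the next one when done."""

    def __init__(self, image_paths):
        self.image_paths = list(image_paths)
        self.index = 0

    @property
    def current(self):
        # None once every image has been marked
        if self.index < len(self.image_paths):
            return self.image_paths[self.index]
        return None

    def finish_current(self, is_complete):
        """Advance to the next image only if the completion condition is met."""
        if is_complete and self.index < len(self.image_paths):
            self.index += 1
        return self.current
```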
Several specific examples are provided below in conjunction with the above embodiments:
This example provides an OCR intelligent labeling tool based on OCR technology and picture labeling technology (for example the open-source picture labeling tools labelme and labelimg). The tool provides callout boxes of various shapes, including rectangles and polygons, and supports Chinese, English, digits, symbols and so on.
After pre-labeling by the OCR model, most of the marked content is correct, and only a small portion of the text regions deviate (for example SPENCER shown in Fig. 2). In this case, the user only needs to correct the callout box of SPENCER to complete the marking of the picture, which greatly reduces the workload.
This OCR intelligent labeling tool can run directly in a PC environment. If the PC includes a graphics processing unit (GPU), the GPU is automatically called for acceleration; by default the CPU is used for computation. As shown in Fig. 6, the workflow may include the following steps:
After entering the interface of the OCR intelligent labeling tool, a working directory is first selected;
After the working directory is confirmed, the images are loaded, which may include reading all images under the working directory;
The OCR model performs pre-labeling; for example, the embedded OCR model is called to predict on all data pictures and obtain callout boxes; the resulting pre-labeled callout boxes are displayed on the picture, and a mark file is generated;
The labeling results are displayed;
Whether the marked content is correct is judged; if so, the tool switches to the marking of the next image; if not, the user manually adjusts the marked content.
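The pre-label-then-review loop of these steps can be sketched as below. The OCR model, the correctness check and the manual adjustment are passed in as callables because the patent does not specify their interfaces; `pre_label`, `review` and their parameters are hypothetical names.

```python
def pre_label(image_names, ocr_model):
    """Call the OCR model on every image and collect (box, text) pairs per image."""
    return {name: list(ocr_model(name)) for name in image_names}


def review(pre_labels, is_correct, adjust):
    """Keep pre-labels judged correct; hand the others to a manual adjustment step."""
    final = {}
    for name, items in pre_labels.items():
        # adjust() stands in for the user editing boxes or text in the interface
        final[name] = items if is_correct(name, items) else adjust(name, items)
    return final
```

In an interactive tool `is_correct` and `adjust` would be driven by user input; here they are ordinary functions so the flow can be exercised without a user interface.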
The OCR model has been trained on a general data set. Before a new data set is labeled, the recognition accuracy of the OCR model on that data set is evaluated, yielding labeling information that is accurate or has only small errors; labeling then continues on this basis, which greatly reduces the labeling cost. The OCR model supports generating pre-labeling results based on a variety of callout boxes, such as horizontal rectangles, rotated rectangles and parallelograms, to accommodate a variety of user needs.
This embodiment provides a tagging equipment, comprising:
A memory;
A processor, connected to the memory and configured to implement, by executing computer-executable instructions stored on the memory, the information labeling method provided by any one of the foregoing technical solutions, for example one or more of the methods shown in Fig. 1, Fig. 3, Fig. 4 and Fig. 6.
This embodiment provides a computer storage medium storing computer-executable instructions; when the computer-executable instructions are executed, the information labeling method provided by any one of the foregoing technical solutions can be implemented, for example one or more of the methods shown in Fig. 1, Fig. 3, Fig. 4 and Fig. 6.
In the several embodiments provided in this application, it should be understood that the disclosed device and method may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be an indirect coupling or communication connection of devices or units through some interfaces, and may be electrical, mechanical or in other forms.
The units described above as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may all be integrated into one processing module, or each unit may serve separately as one unit, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
Those of ordinary skill in the art will understand that all or part of the steps of the above method embodiments may be completed by hardware related to program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes various media that can store program code, such as a removable storage device, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.
The above is merely a specific embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (11)

1. An information labeling method, characterized by comprising:
Detecting the position of information to be marked in an image;
Displaying, according to the detected position, a callout box in an editable state on the image;
If a first edit operation acting on the callout box is detected, adjusting the callout box according to the first edit operation;
Identifying the information to be marked, using the information framed by the callout box as the recognition unit, to obtain marked content.
2. The method according to claim 1, characterized in that
The adjusting the callout box according to the first edit operation includes at least one of the following:
Adjusting, according to the first edit operation, the position of the callout box in the editable state;
Adjusting, according to the first edit operation, the size of the callout box in the editable state;
Adjusting, according to the first edit operation, the shape of the callout box in the editable state.
3. The method according to claim 1 or 2, characterized in that the method further comprises:
Displaying the image in a first area;
Displaying the marked content in an editable state in a second area, wherein the first area is different from the second area;
Detecting a second edit operation acting on the marked content;
Updating the marked content based on the second edit operation.
4. The method according to claim 1 or 2, characterized in that the method further comprises:
Generating a mark file, wherein the mark file includes: an identifier of the information to be marked and the marked content corresponding to the information to be marked.
5. The method according to claim 4, characterized in that
The generating a mark file comprises:
Generating a mark file including at least one key-value pair;
Wherein the identifier of the information to be marked is the key, and the marked content corresponding to the information to be marked is the value;
The identifier of the information to be marked includes: the coordinate values of the callout box framing the information to be marked.
6. The method according to claim 1, characterized in that the method further comprises:
Determining the shape of the callout box;
The displaying, according to the detected position, a callout box in an editable state on the image comprises:
Displaying on the image, according to the position of the information to be marked, a callout box of the determined shape in an editable state.
7. The method according to claim 6, characterized in that
The determining the shape of the callout box includes at least one of the following:
Determining the shape of the callout box according to a shape-specifying operation for the callout box;
Determining the shape of the callout box according to the distribution pattern of the information to be marked on the image.
8. The method according to claim 1 or 2, characterized in that the method further comprises:
Opening a working directory;
Loading the images under the working directory in batches;
If the n-th image meets a marking completion condition, entering the marking of the (n+1)-th image, wherein n is a positive integer.
9. An information labeling device, characterized by comprising:
An image detection module, configured to detect the position of information to be marked in an image;
A display module, configured to display, according to the detected position, a callout box in an editable state on the image;
An adjustment module, configured to, if a first edit operation acting on the callout box is detected, adjust the callout box according to the first edit operation;
An identification module, configured to identify the information to be marked, using the information framed by the callout box as the recognition unit, to obtain marked content.
10. A tagging equipment, characterized by comprising:
A memory;
A processor, connected to the memory and configured to implement the method provided by any one of claims 1 to 8 by executing computer-executable code stored on the memory.
11. A computer storage medium storing computer-executable code; when the computer-executable code is executed, the method provided by any one of claims 1 to 8 can be implemented.
CN201811388635.8A 2018-11-21 2018-11-21 Information labeling method and device, labeling equipment and storage medium Active CN109685870B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811388635.8A CN109685870B (en) 2018-11-21 2018-11-21 Information labeling method and device, labeling equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811388635.8A CN109685870B (en) 2018-11-21 2018-11-21 Information labeling method and device, labeling equipment and storage medium

Publications (2)

Publication Number Publication Date
CN109685870A true CN109685870A (en) 2019-04-26
CN109685870B CN109685870B (en) 2023-10-31

Family

ID=66184886

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811388635.8A Active CN109685870B (en) 2018-11-21 2018-11-21 Information labeling method and device, labeling equipment and storage medium

Country Status (1)

Country Link
CN (1) CN109685870B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102317955A (en) * 2009-04-20 2012-01-11 万涛国际有限公司 Data managing method and system based on image
CN106682667A (en) * 2016-12-29 2017-05-17 成都数联铭品科技有限公司 Image-text OCR (optical character recognition) system for uncommon fonts
CN106845530A (en) * 2016-12-30 2017-06-13 百度在线网络技术(北京)有限公司 character detection method and device
CN106951893A (en) * 2017-05-08 2017-07-14 奇酷互联网络科技(深圳)有限公司 Text information acquisition methods, device and mobile terminal
CN107153822A (en) * 2017-05-19 2017-09-12 北京航空航天大学 A kind of smart mask method of the semi-automatic image based on deep learning
CN108416279A (en) * 2018-02-26 2018-08-17 阿博茨德(北京)科技有限公司 Form analysis method and device in file and picture
CN108573279A (en) * 2018-03-19 2018-09-25 精锐视觉智能科技(深圳)有限公司 Image labeling method and terminal device
CN108665513A (en) * 2017-03-27 2018-10-16 腾讯科技(深圳)有限公司 Drawing practice based on user behavior data and device

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110298823A (en) * 2019-06-17 2019-10-01 天津大学 A kind of infrared image auxiliary mask method based on MSER algorithm
CN110263753B (en) * 2019-06-28 2020-12-22 北京海益同展信息科技有限公司 Object statistical method and device
CN110263753A (en) * 2019-06-28 2019-09-20 北京海益同展信息科技有限公司 A kind of object statistical method and device
CN110796201A (en) * 2019-10-31 2020-02-14 深圳前海达闼云端智能科技有限公司 Method for correcting label frame, electronic equipment and storage medium
CN110796201B (en) * 2019-10-31 2023-07-11 达闼机器人股份有限公司 Correction method of annotation frame, electronic equipment and storage medium
CN111145215A (en) * 2019-12-25 2020-05-12 北京迈格威科技有限公司 Target tracking method and device
CN111145215B (en) * 2019-12-25 2023-09-05 北京迈格威科技有限公司 Target tracking method and device
CN111626233B (en) * 2020-05-29 2021-07-13 江苏云从曦和人工智能有限公司 Key point marking method, system, machine readable medium and equipment
CN111626233A (en) * 2020-05-29 2020-09-04 江苏云从曦和人工智能有限公司 Key point marking method, system, machine readable medium and equipment
CN111724374A (en) * 2020-06-22 2020-09-29 林晨 Evaluation method of analysis result and terminal
CN111724374B (en) * 2020-06-22 2024-03-01 智眸医疗(深圳)有限公司 Evaluation method and terminal of analysis result
CN112036443A (en) * 2020-07-31 2020-12-04 上海图森未来人工智能科技有限公司 Method and device for labeling object contour in image data
CN116543392A (en) * 2023-04-19 2023-08-04 钛玛科(北京)工业科技有限公司 Labeling method for deep learning character recognition
CN116543392B (en) * 2023-04-19 2024-03-12 钛玛科(北京)工业科技有限公司 Labeling method for deep learning character recognition

Also Published As

Publication number Publication date
CN109685870B (en) 2023-10-31

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant