CN106485246A - Character identifying method and device - Google Patents

Publication number
CN106485246A
Authority
CN
China
Prior art keywords
character
identified
image
sliding window
character segmentation
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610833487.0A
Other languages
Chinese (zh)
Other versions
CN106485246B (en)
Inventor
杨松
王百超
龙飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xiaomi Mobile Software Co Ltd
Original Assignee
Beijing Xiaomi Mobile Software Co Ltd
Application filed by Beijing Xiaomi Mobile Software Co Ltd
Priority to CN201610833487.0A (patent CN106485246B)
Publication of CN106485246A
Application granted
Publication of CN106485246B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/24 Aligning, centring, orientation detection or correction of the image
    • G06V10/245 Aligning, centring, orientation detection or correction of the image by locating a pattern; Special marks for positioning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Character Input (AREA)

Abstract

The present disclosure relates to a character recognition method and device. The method includes: obtaining a character row image to be recognized; identifying character segmentation lines in the character row image to be recognized by means of image features; and segmenting the character row image to be recognized according to the character segmentation lines to obtain character segmentation images. Because the technical solution identifies the character segmentation lines from image features and then uses those lines to segment the character row image, it can effectively improve the robustness of segmentation and eliminate the influence of noise and illumination.

Description

Character identifying method and device
Technical field
The present disclosure relates to the technical field of character recognition, and in particular to a character recognition method and device.
Background
Optical character recognition (OCR) uses image recognition methods to recognize the text in scanned documents. Its typical steps are image preprocessing, row segmentation, single-character segmentation, single-character feature extraction and recognition, and language-model post-processing. Existing single-character segmentation methods rely on projection profiles and connected components. They are easily affected by noise, so the results of single-character segmentation are poor.
Summary of the invention
Embodiments of the present disclosure provide a character recognition method and device. The technical solutions are as follows:
According to a first aspect of the embodiments of the present disclosure, a character recognition method is provided, including:
obtaining a character row image to be recognized;
identifying character segmentation lines in the character row image to be recognized by means of image features;
segmenting the character row image to be recognized according to the character segmentation lines to obtain character segmentation images.
Before the character segmentation lines in the character row image to be recognized are identified by means of image features, the method further includes:
adjusting the character row image to be recognized to a predetermined size while keeping the original aspect ratio unchanged.
Identifying the character segmentation lines in the character row image to be recognized by means of image features includes:
setting a sliding window, where the height of the sliding window is equal to the height of the character row image to be recognized;
moving the sliding window over the character row image to be recognized according to a predetermined step size;
extracting the image features of the character row image to be recognized within the sliding window;
identifying, according to the extracted image features, whether the character row image to be recognized contains a character segmentation line within the sliding window.
Identifying, according to the extracted image features, whether the character row image to be recognized contains a character segmentation line within the sliding window includes:
extracting gradient features within the sliding window;
reducing the dimensionality of the gradient features;
using the dimension-reduced gradient features, identifying by means of a classifier whether the vertical center line of the sliding window is a character segmentation line.
Identifying the character segmentation lines in the character row image to be recognized by means of image features further includes:
after all character segmentation lines in the character row image to be recognized have been identified, removing duplicate character segmentation lines.
Removing the duplicate character segmentation lines includes:
removing the duplicate character segmentation lines using a non-maximum suppression method.
According to a second aspect of the embodiments of the present disclosure, a character recognition device is provided, including:
an acquisition module, configured to obtain a character row image to be recognized;
an identification module, configured to identify character segmentation lines in the character row image to be recognized by means of image features;
a segmentation module, configured to segment the character row image to be recognized according to the character segmentation lines to obtain character segmentation images.
Upstream of the identification module, the device further includes:
an adjusting module, configured to adjust the character row image to be recognized to a predetermined size while keeping the original aspect ratio unchanged.
The identification module includes:
a setting submodule, configured to set a sliding window, where the height of the sliding window is equal to the height of the character row image to be recognized;
a moving submodule, configured to move the sliding window over the character row image to be recognized according to a predetermined step size;
a first extraction submodule, configured to extract the image features of the character row image to be recognized within the sliding window;
a first identification submodule, configured to identify, according to the extracted image features, whether the character row image to be recognized contains a character segmentation line within the sliding window.
The first identification submodule includes:
a second extraction submodule, configured to extract gradient features within the sliding window;
a dimensionality reduction submodule, configured to reduce the dimensionality of the gradient features;
a second identification submodule, configured to use the dimension-reduced gradient features to identify, by means of a classifier, whether the vertical center line of the sliding window is a character segmentation line.
The identification module further includes:
a first denoising submodule, configured to remove duplicate character segmentation lines after all character segmentation lines in the character row image to be recognized have been identified.
The first denoising submodule includes:
a second denoising submodule, configured to remove the duplicate character segmentation lines using a non-maximum suppression method.
According to a third aspect of the embodiments of the present disclosure, a character recognition device is provided, including:
a processor;
a memory for storing processor-executable instructions;
where the processor is configured to:
obtain a character row image to be recognized;
identify character segmentation lines in the character row image to be recognized by means of image features;
segment the character row image to be recognized according to the character segmentation lines to obtain character segmentation images.
The technical solutions provided by the embodiments of the present disclosure can have the following beneficial effects:
In the above technical solution, character recognition is performed on the obtained character row image to be recognized, that is, an image containing one row of characters. During recognition, a sliding window is used to identify the character segmentation lines in the character row image to be recognized; after all character segmentation lines have been identified, the image is segmented according to them to obtain character segmentation images. By identifying the character segmentation lines from image features and then segmenting the character row image with those lines, the technical solution of the present disclosure can effectively improve the robustness of segmentation and eliminate the influence of noise and illumination.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present disclosure.
Brief description of the drawings
The accompanying drawings are incorporated into and form part of this specification. They show embodiments consistent with the present disclosure and, together with the specification, serve to explain its principles.
Fig. 1 is a flow chart of a character recognition method according to an exemplary embodiment.
Fig. 2 is a flow chart of step 102 of the character recognition method according to an exemplary embodiment.
Fig. 3 is a flow chart of step 204 of the character recognition method according to an exemplary embodiment.
Fig. 4 is a block diagram of a character recognition device according to an exemplary embodiment.
Fig. 5 is a block diagram of the identification module 402 of the character recognition device according to an exemplary embodiment.
Fig. 6 is a block diagram of the first identification submodule 504 of the character recognition device according to an exemplary embodiment.
Fig. 7 is a block diagram of a steering control device suitable for an electric self-balancing vehicle according to an exemplary embodiment.
Detailed description
Exemplary embodiments are described in detail here, and examples of them are shown in the accompanying drawings. Where the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure. Rather, they are merely examples of devices and methods consistent with some aspects of the present disclosure, as detailed in the appended claims.
Fig. 1 is a flow chart of a character recognition method according to an exemplary embodiment. As shown in Fig. 1, the method includes the following steps 101-103:
In step 101, a character row image to be recognized is obtained;
In step 102, character segmentation lines in the character row image to be recognized are identified by means of image features;
In step 103, the character row image to be recognized is segmented according to the character segmentation lines to obtain character segmentation images.
In this embodiment, character recognition is performed on the obtained character row image to be recognized, that is, an image containing one row of characters. During recognition, a sliding window is used to identify the character segmentation lines in the character row image to be recognized; after all character segmentation lines have been identified, the image is segmented according to them to obtain character segmentation images. By identifying the character segmentation lines from image features and then segmenting the character row image with those lines, the technical solution of the present disclosure can effectively improve the robustness of segmentation and eliminate the influence of noise and illumination.
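As an illustration of step 103 only, the following minimal sketch cuts a row image at already identified segmentation lines. It assumes the image is a list of pixel rows and that each segmentation line is given as an x coordinate; the names `split_by_lines` and `cut_columns` are hypothetical and do not come from the disclosure.

```python
def split_by_lines(row_image, cut_columns):
    """Split a row image (a list of pixel rows) at the given x positions."""
    width = len(row_image[0])
    # Keep only cuts strictly inside the image, then add both image borders
    # so that the first and last characters are not lost.
    inner = sorted(c for c in set(cut_columns) if 0 < c < width)
    bounds = [0] + inner + [width]
    pieces = []
    for left, right in zip(bounds, bounds[1:]):
        pieces.append([row[left:right] for row in row_image])
    return pieces


image = [
    [0, 1, 0, 0, 1, 1, 0, 1],
    [0, 1, 0, 0, 1, 0, 0, 1],
]
pieces = split_by_lines(image, cut_columns=[3, 6])
widths = [len(piece[0]) for piece in pieces]
```

Each returned piece keeps the full height of the row image, which matches the idea that a segmentation line is a vertical cut through the whole row.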
In one embodiment, the image feature may be a texture feature, a grayscale feature, a gradient feature or a shape feature of the image. The choice of image feature can be set according to the specific recognition approach and the characters to be recognized.
In one embodiment, before the character segmentation lines in the character row image to be recognized are identified by means of image features, the method further includes: adjusting the character row image to be recognized to a predetermined size while keeping the original aspect ratio unchanged. In this embodiment, the size of the obtained character row image depends on the size of the original image, whereas the recognition process usually requires character row images of a uniform size to facilitate subsequent recognition. The predetermined size can be set according to the specific recognition approach. Note that the aspect ratio of the image must stay constant during the adjustment. Adjusting only the width or only the height would deform the image and affect the recognition result.
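The size adjustment can be sketched as follows. This is a minimal example under stated assumptions: the predetermined size is expressed as a target height (the disclosure does not say which dimension is fixed), the image is a list of pixel rows, and nearest-neighbour sampling is used purely for brevity. The function names are hypothetical.

```python
def scaled_size(width, height, target_height):
    """Return (new_width, new_height) with the original aspect ratio kept."""
    if width <= 0 or height <= 0 or target_height <= 0:
        raise ValueError("all sizes must be positive")
    new_width = max(1, round(width * target_height / height))
    return new_width, target_height


def resize_nearest(image, target_height):
    """Resize a list-of-rows image with nearest-neighbour sampling."""
    src_h, src_w = len(image), len(image[0])
    dst_w, dst_h = scaled_size(src_w, src_h, target_height)
    resized = []
    for y in range(dst_h):
        src_y = min(src_h - 1, y * src_h // dst_h)
        row = [image[src_y][min(src_w - 1, x * src_w // dst_w)]
               for x in range(dst_w)]
        resized.append(row)
    return resized


size = scaled_size(300, 60, 32)
small = [[1, 2, 3, 4], [5, 6, 7, 8]]
big = resize_nearest(small, target_height=4)
```

Width and height are scaled by the same factor, so the characters are not stretched in either direction.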
In one embodiment, as shown in Fig. 2, identifying the character segmentation lines in the character row image to be recognized by means of image features in step 102 includes the following steps 201-204:
In step 201, a sliding window is set, where the height of the sliding window is equal to the height of the character row image to be recognized;
In step 202, the sliding window is moved over the character row image to be recognized according to a predetermined step size;
In step 203, the image features of the character row image to be recognized within the sliding window are extracted;
In step 204, whether the character row image to be recognized contains a character segmentation line within the sliding window is identified according to the extracted image features.
In this embodiment, a sliding window is set first. Its height equals the height of the character row image to be recognized; its width may or may not equal its height, and is set according to the actual situation. During identification, the sliding window is moved from the leftmost side of the character row image to be recognized according to the predetermined step size. After each slide, the image features within the sliding window are extracted, and whether the character row image contains a character segmentation line within the sliding window is identified from those features. The predetermined step size can be set as the case requires: where high recognition accuracy is required, the step size can be set to 1, so that consecutive window positions differ by one pixel; where high efficiency is required, the step size can be increased appropriately.
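The window movement described above can be sketched as a generator of window positions. This is an illustrative sketch only; `sliding_windows` and its parameter names are hypothetical, and the window is assumed to stay fully inside the image.

```python
def sliding_windows(image_width, window_width, step=1):
    """Yield (left_edge, center_x) for each window position, left to right."""
    if step < 1 or window_width < 1:
        raise ValueError("step and window_width must be at least 1")
    # The window always spans the full image height, so only x varies.
    for left in range(0, image_width - window_width + 1, step):
        yield left, left + window_width // 2


positions = list(sliding_windows(image_width=10, window_width=4, step=2))
```

With `step=1` every column that can be a window center is tested, which favours accuracy; a larger step visits fewer positions and favours speed.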
In one embodiment, as shown in Fig. 3, identifying in step 204, according to the extracted image features, whether the character row image to be recognized contains a character segmentation line within the sliding window includes the following steps 301-303:
In step 301, gradient features within the sliding window are extracted;
In step 302, the dimensionality of the gradient features is reduced;
In step 303, using the dimension-reduced gradient features, a classifier identifies whether the vertical center line of the sliding window is a character segmentation line.
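To make step 301 concrete, the sketch below computes a magnitude-weighted histogram of gradient orientations for one window. It is a deliberately simplified stand-in, assumed here for illustration: it is neither a full HOG descriptor (no cells or block normalization) nor WDCH, and the name `gradient_histogram` is hypothetical.

```python
import math


def gradient_histogram(window, bins=8):
    """Magnitude-weighted histogram of unsigned gradient orientations."""
    height, width = len(window), len(window[0])
    hist = [0.0] * bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Central differences approximate the horizontal and vertical gradient.
            gx = window[y][x + 1] - window[y][x - 1]
            gy = window[y + 1][x] - window[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            angle = math.atan2(gy, gx) % math.pi  # unsigned orientation in [0, pi)
            index = min(bins - 1, int(angle / math.pi * bins))
            hist[index] += magnitude
    total = sum(hist)
    # Normalize so that windows with different contrast are comparable.
    return [value / total for value in hist] if total else hist


edge_window = [[0, 0, 10, 10] for _ in range(4)]  # a vertical edge
features = gradient_histogram(edge_window)
flat = gradient_histogram([[5] * 4 for _ in range(4)])
```

A vertical stroke edge puts all of its weight into the first orientation bin, while a uniform gap between characters produces an all-zero histogram.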
In this embodiment, gradient features are selected as the image features. The gradient features extracted may be HOG (Histogram of Oriented Gradients) features and/or WDCH (Weighted Direction Code Histogram) features. If both HOG and WDCH features are extracted, the two can be combined into a single feature vector. Because the extracted gradient features have a high dimensionality, dimensionality reduction is usually required. The degree of reduction can be chosen according to the actual situation: if higher recognition accuracy is required, a smaller reduction can be chosen, for example one that retains about 99 percent of the original information; if lower recognition accuracy is acceptable, a larger reduction can be chosen, for example one that retains about 80 percent of the original information. After the gradient features have been extracted and reduced, they are input into a pre-trained classifier for classification. The classifier may be a logistic regression classifier, which determines from the gradient features whether the vertical center line of the sliding window is a character segmentation line. To train the classifier, a sample set is obtained first. The sample set consists of positive and negative samples, namely the gradient features of the images within different sliding windows taken from multiple character row images whose character segmentation lines are known. A positive sample is the gradient features within a sliding window whose vertical center line is a character segmentation line; a negative sample is the gradient features within a sliding window whose vertical center line is not a character segmentation line. The classifier is then trained on the samples in the sample set using the selected training method, which yields the classification parameters of the classifier. After training is complete, during identification the extracted gradient features are input into the classifier; with the classification parameters known, the output of the classifier can be obtained. When the output of the classifier is greater than or equal to a predetermined threshold, the vertical center line of the sliding window corresponding to the gradient features is a character segmentation line of the image within the sliding window; when the output of the classifier is less than the predetermined threshold, the vertical center line of the sliding window corresponding to the gradient features is not a character segmentation line of the image within the sliding window. With the method of this embodiment, character segmentation lines can be identified quickly and accurately.
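The two decisions in this paragraph can be sketched as follows. The sketch makes two assumptions that the disclosure does not state: that the dimensionality reduction is PCA-like, so "retained information" is read as the share of explained variance, and that the trained logistic regression parameters are available as a weight vector and a bias. All names and numbers are hypothetical.

```python
import math


def components_to_keep(eigenvalues, retain):
    """Smallest number of leading components whose variance share reaches `retain`."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for count, value in enumerate(ordered, start=1):
        running += value
        if running / total >= retain:
            return count
    return len(ordered)


def is_segmentation_line(features, weights, bias, threshold=0.5):
    """Logistic regression decision for the window's vertical center line."""
    score = bias + sum(w * f for w, f in zip(weights, features))
    probability = 1.0 / (1.0 + math.exp(-score))
    # Output at or above the threshold: the center line is a segmentation line.
    return probability >= threshold, probability


variances = [6.0, 3.0, 0.5, 0.3, 0.2]       # made-up eigenvalues
keep_loose = components_to_keep(variances, 0.80)
keep_strict = components_to_keep(variances, 0.99)

weights, bias = [2.0, -1.0], -0.5            # made-up trained parameters
accepted, p_high = is_segmentation_line([1.0, 0.5], weights, bias)
rejected, p_low = is_segmentation_line([0.0, 1.0], weights, bias)
```

Retaining more information keeps more dimensions and so costs more computation per window, which is the accuracy versus speed trade-off the paragraph describes. The probability returned alongside the decision can be kept for later ranking of candidate lines.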
In one embodiment, identifying the character segmentation lines in the character row image to be recognized by means of image features further includes:
after all character segmentation lines in the character row image to be recognized have been identified, removing duplicate character segmentation lines.
In this embodiment, duplicate character segmentation lines are multiple character segmentation lines identified between the same two characters. Because the step size of the sliding window may be small while the gap between characters is comparatively wide, several character segmentation lines may be identified between two characters. Before the characters are separated, the redundant lines must therefore be removed, and only the most distinct one retained.
In one embodiment, removing the duplicate character segmentation lines includes: removing the duplicate character segmentation lines using a non-maximum suppression method.
In this embodiment, duplicate character segmentation lines are removed using a non-maximum suppression method. For each identified character segmentation line, a non-maximum suppression window is set with that line as its vertical center line. From the multiple character segmentation lines within the non-maximum suppression window, the most accurate one is selected as the final character segmentation line, and the other character segmentation lines within the window are removed. The height of the non-maximum suppression window can equal the height of the character row image to be recognized. The width of the non-maximum suppression window is less than the sum of the width of one character and two character gaps in the character row image to be recognized; otherwise a single non-maximum suppression window could contain character segmentation lines from different character gaps, and lines that are not duplicates would be removed by mistake. In this embodiment, the most accurate character segmentation line within a non-maximum suppression window can be determined from the output value that the classifier described in the previous embodiment produced when it identified the line. When a character segmentation line is identified, the classifier produces an output value from the image features; if the output value reaches the predetermined threshold, the vertical center line of the sliding window corresponding to those image features is considered a character segmentation line. Each character segmentation line therefore has a corresponding classifier output value, and the larger that value, the more accurate the corresponding character segmentation line. In this embodiment, among the multiple character segmentation lines within one non-maximum suppression window, the line with the largest classifier output value is taken as the most accurate character segmentation line, and the other lines are removed. This method removes the redundant lines quickly and with high accuracy.
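The suppression step can be sketched in one dimension, because every segmentation line is vertical and is fully described by its x coordinate. The sketch below uses the common greedy form of non-maximum suppression (visit candidates from highest to lowest score), which is an assumption about the exact procedure; the name `suppress_duplicate_lines` and the sample scores are hypothetical.

```python
def suppress_duplicate_lines(candidates, window_width):
    """Keep, within each suppression window, only the highest-scoring line.

    candidates: list of (x, classifier_output) pairs.
    """
    half = window_width / 2
    kept = []
    # Visit candidates from the highest classifier output to the lowest.
    for x, score in sorted(candidates, key=lambda c: c[1], reverse=True):
        # A candidate survives only if no stronger line lies inside the
        # suppression window centered on it.
        if all(abs(x - kept_x) > half for kept_x, _ in kept):
            kept.append((x, score))
    return sorted(kept)


candidates = [(10, 0.70), (11, 0.95), (12, 0.80), (30, 0.60), (31, 0.90)]
lines = suppress_duplicate_lines(candidates, window_width=6)
```

The window width of 6 pixels here is smaller than the distance between the two gaps (about 20 pixels), so lines from different gaps never suppress each other, in line with the width constraint stated above.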
The following are device embodiments of the present disclosure, which can be used to carry out the method embodiments of the present disclosure.
Fig. 4 is a block diagram of a character recognition device according to an exemplary embodiment. The device can be implemented as part or all of an electronic device through software, hardware or a combination of both. As shown in Fig. 4, the character recognition device includes the following modules 401-403:
an acquisition module 401, configured to obtain a character row image to be recognized;
an identification module 402, configured to identify character segmentation lines in the character row image to be recognized by means of image features;
a segmentation module 403, configured to segment the character row image to be recognized according to the character segmentation lines to obtain character segmentation images.
In this embodiment, character recognition is performed on the obtained character row image to be recognized, that is, an image containing one row of characters. During recognition, a sliding window is used to identify the character segmentation lines in the character row image to be recognized; after all character segmentation lines have been identified, the image is segmented according to them to obtain character segmentation images. By identifying the character segmentation lines from image features and then segmenting the character row image with those lines, the technical solution of the present disclosure can effectively improve the robustness of segmentation and eliminate the influence of noise and illumination.
In one embodiment, the image feature may be a texture feature, a grayscale feature, a gradient feature or a shape feature of the image. The choice of image feature can be set according to the specific recognition approach and the characters to be recognized.
In one embodiment, the device further includes the following module upstream of the identification module 402:
an adjusting module, configured to adjust the character row image to be recognized to a predetermined size while keeping the original aspect ratio unchanged.
In this embodiment, the size of the obtained character row image depends on the size of the original image, whereas the recognition process usually requires character row images of a uniform size to facilitate subsequent recognition. The predetermined size can be set according to the specific recognition approach. Note that the aspect ratio of the image must stay constant during the adjustment. Adjusting only the width or only the height would deform the image and affect the recognition result.
In one embodiment, as shown in Fig. 5, the identification module 402 includes the following submodules 501-504:
a setting submodule 501, configured to set a sliding window, where the height of the sliding window is equal to the height of the character row image to be recognized;
a moving submodule 502, configured to move the sliding window over the character row image to be recognized according to a predetermined step size;
a first extraction submodule 503, configured to extract the image features of the character row image to be recognized within the sliding window;
a first identification submodule 504, configured to identify, according to the extracted image features, whether the character row image to be recognized contains a character segmentation line within the sliding window.
In this embodiment, the setting submodule 501 sets a sliding window. Its height equals the height of the character row image to be recognized; its width may or may not equal its height, and is set according to the actual situation. During identification, the moving submodule 502 moves the sliding window from the leftmost side of the character row image to be recognized according to the predetermined step size. After each slide, the first extraction submodule 503 extracts the image features within the sliding window, and the first identification submodule 504 identifies from those features whether the character row image contains a character segmentation line within the sliding window. The predetermined step size can be set as the case requires: where high recognition accuracy is required, the step size can be set to 1, so that consecutive window positions differ by one pixel; where high efficiency is required, the step size can be increased appropriately.
In one embodiment, as shown in fig. 6, described first recognizes submodule 504, including with lower module 601-603:
In the second extracting sub-module 601, it is configured to extract the Gradient Features in the sliding window;
In dimensionality reduction submodule 602, it is configured to carry out dimensionality reduction to the Gradient Features;
In the second identification submodule 603, the Gradient Features after dimensionality reduction are configured to, with, are recognized by grader Whether the vertical center line of the sliding window is the Character segmentation line.
In the present embodiment, the second extracting sub-module 601 selects to use Gradient Features, and the Gradient Features may be selected to extract HOG (Histogram of oriented gradient, histograms of oriented gradients) and/or WDCH (Weighted Direction Code Histogram) feature.If two kinds of features of selective extraction HOG and WDCH, can be by both groups It is combined to form a characteristic vector.As the dimension of the Gradient Features for extracting is higher, so being generally required for carrying out which Dimensionality reduction, concrete dimensionality reduction amplitude can be selected according to actual conditions, if accuracy of identification will be gone higher, can select less dimensionality reduction Amplitude, for example, carry out dimensionality reduction according to the amplitude of retain prime information percent 99 or so, if requiring that accuracy of identification is relatively low, permissible Select larger dimensionality reduction amplitude, for example, dimensionality reduction etc. 
is carried out so that about 80 percent of the source information is retained. After the gradient features are extracted and reduced in dimensionality, the second identification submodule 603 inputs the gradient features into a pre-trained classifier for identification. The classifier may be a logistic regression classifier, which determines from the gradient features whether the vertical center line of the sliding window is a character segmentation line. To train the classifier, a sample set is first obtained. The sample set is built from multiple character row images whose character segmentation lines are known, and consists of the gradient features of the image within different sliding windows. It includes positive samples and negative samples: a positive sample is the gradient features in a sliding window whose vertical center line is a character segmentation line, and a negative sample is the gradient features in a sliding window whose vertical center line is not a character segmentation line. The classifier is then trained on the samples in the sample set using the selected training method, which yields the classification parameters of the classifier. After training is complete, during identification the extracted gradient features are input into the classifier, and with the classification parameters known, the output of the classifier can be obtained. When the output of the classifier is greater than or equal to a predetermined threshold, the vertical center line of the sliding window corresponding to the gradient features is a character segmentation line of the image in that sliding window; when the output is less than the predetermined threshold, it is not. With the above method of this embodiment, character segmentation lines can be identified quickly and accurately.
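The training and thresholding steps described above can be illustrated with a minimal sketch. It assumes plain batch gradient descent as the training method and a threshold of 0.5; the text names neither, and the function names `train_logistic` and `is_segmentation_line` and the toy data are hypothetical.

```python
import numpy as np

def train_logistic(features, labels, lr=0.1, epochs=500):
    """Fit logistic-regression weights and bias by batch gradient descent (assumed training method)."""
    n_samples, n_dims = features.shape
    weights = np.zeros(n_dims)
    bias = 0.0
    for _ in range(epochs):
        scores = 1.0 / (1.0 + np.exp(-(features @ weights + bias)))
        error = scores - labels  # gradient of the log-loss with respect to the logit
        weights -= lr * (features.T @ error) / n_samples
        bias -= lr * error.mean()
    return weights, bias

def is_segmentation_line(feature, weights, bias, threshold=0.5):
    """Return (decision, score); decision is True when score >= threshold."""
    score = 1.0 / (1.0 + np.exp(-(feature @ weights + bias)))
    return bool(score >= threshold), float(score)

# Toy sample set: positive samples stand in for windows centered on a segmentation line.
rng = np.random.default_rng(0)
positives = rng.normal(loc=-1.0, scale=0.3, size=(50, 4))
negatives = rng.normal(loc=1.0, scale=0.3, size=(50, 4))
X = np.vstack([positives, negatives])
y = np.concatenate([np.ones(50), np.zeros(50)])
weights, bias = train_logistic(X, y)
```

The score returned alongside the decision is kept because the later denoising step ranks candidate lines by classifier output.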
In one embodiment, the identification module 402 further includes:
A first denoising submodule, configured to remove repeated character segmentation lines after all character segmentation lines in the character row image to be identified have been identified.
In this embodiment, repeated character segmentation lines are multiple character segmentation lines identified between the same two characters. Because the moving step of the sliding window is small during identification while the spacing between characters is relatively large, multiple character segmentation lines may be identified between two characters. Before the characters are separated, the most distinct of these lines is therefore retained and the others are removed.
In one embodiment, the first denoising submodule includes:
A second denoising submodule, configured to remove the repeated character segmentation lines by non-maximum suppression.
In this embodiment, repeated character segmentation lines are removed by non-maximum suppression. For each identified character segmentation line, a non-maximum suppression window is set with that line as its vertical center line. From the character segmentation lines inside the window, the most accurate one is selected as the final character segmentation line, and the other lines in the window are removed. The height of the non-maximum suppression window may equal the height of the character row image to be identified, and its width is less than the sum of the width of a character and two character spacings in the character row image to be identified. Otherwise, a single non-maximum suppression window could contain character segmentation lines from different character gaps, and lines that are not repeats would be removed by mistake. To select the most accurate line among those in one non-maximum suppression window, the output value that the classifier of the previous embodiment produced when identifying each line can be used. When a character segmentation line is identified, the classifier produces an output value from the image features; if the output value is greater than the predetermined threshold, the vertical center line of the sliding window corresponding to the image features is considered a character segmentation line. Each character segmentation line therefore has a corresponding classifier output value, and the larger the value, the more accurate the line. In this embodiment, among the character segmentation lines in one non-maximum suppression window, the one with the largest classifier output value is taken as the most accurate and the others are removed. This method removes repeated lines quickly and with high accuracy.
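The suppression step can be sketched as follows. This is a greedy variant of one-dimensional non-maximum suppression, offered as an illustration rather than as the exact procedure of the embodiment; the function name `suppress_duplicate_lines` and the `(x, score)` representation of a candidate line are assumptions.

```python
def suppress_duplicate_lines(candidates, window_width):
    """Keep only the highest-scoring line within each suppression window.

    candidates: list of (x, score) pairs, where x is the column of a candidate
    segmentation line and score is the classifier output for that line.
    window_width: width of the suppression window centered on a kept line.
    """
    half = window_width / 2.0
    kept = []
    # Visit candidates from highest to lowest score; a candidate survives only
    # if no already-kept line lies within half a window of it.
    for x, score in sorted(candidates, key=lambda c: c[1], reverse=True):
        if all(abs(x - kept_x) > half for kept_x, _ in kept):
            kept.append((x, score))
    return sorted(kept)

candidates = [(10, 0.7), (12, 0.9), (14, 0.6), (40, 0.8), (42, 0.75)]
final_lines = suppress_duplicate_lines(candidates, window_width=10)
```

In this example the three candidates around column 12 and the two around column 40 each collapse to the single line with the largest score.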
According to a third aspect of the embodiments of the present disclosure, a character recognition device is provided, including:
A processor;
A memory for storing instructions executable by the processor;
Wherein the processor is configured to:
Obtain a character row image to be identified;
Identify the character segmentation lines in the character row image to be identified by means of image features;
Split the character row image to be identified according to the character segmentation lines to obtain segmented character images.
The above processor is further configured such that:
Identifying, according to the extracted image features, whether the character row image to be identified contains a character segmentation line within the sliding window includes:
Extracting the gradient features in the sliding window;
Reducing the dimensionality of the gradient features;
Using the dimensionality-reduced gradient features, identifying by a classifier whether the vertical center line of the sliding window is the character segmentation line.
Identifying the character segmentation lines in the character row image to be identified by means of image features further includes:
After all character segmentation lines in the character row image to be identified have been identified, removing repeated character segmentation lines.
Removing the repeated character segmentation lines includes:
Removing the repeated character segmentation lines by non-maximum suppression.
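The final splitting step, in which the row image is cut at the retained segmentation lines, could look like the sketch below. It assumes the image is a NumPy array of shape height by width and that each segmentation line is given as a column index; `split_line_image` is a hypothetical name.

```python
import numpy as np

def split_line_image(line_image, cut_columns):
    """Split an H x W image at the given column indices, returning pieces left to right."""
    width = line_image.shape[1]
    # Ignore cuts on or outside the image border, and drop duplicate columns.
    inner = sorted(c for c in set(cut_columns) if 0 < c < width)
    bounds = [0] + inner + [width]
    return [line_image[:, a:b] for a, b in zip(bounds[:-1], bounds[1:])]

image = np.zeros((8, 30))
pieces = split_line_image(image, [10, 22])
```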
With regard to the devices in the above embodiments, the specific manner in which each module performs its operations has been described in detail in the embodiments of the method and will not be elaborated here.
Fig. 7 is a block diagram of a device for character recognition according to an exemplary embodiment; the device is applicable to terminal equipment. For example, device 700 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, fitness equipment, a personal digital assistant, and the like.
Device 700 may include one or more of the following components: a processing component 702, a memory 704, a power component 706, a multimedia component 708, an audio component 710, an input/output (I/O) interface 712, a sensor component 714, and a communication component 716.
The processing component 702 generally controls the overall operation of device 700, such as operations associated with display, telephone calls, data communication, camera operations, and recording operations. The processing component 702 may include one or more processors 720 to execute instructions so as to complete all or part of the steps of the above method. In addition, the processing component 702 may include one or more modules that facilitate interaction between the processing component 702 and other components. For example, the processing component 702 may include a multimedia module to facilitate interaction between the multimedia component 708 and the processing component 702.
The memory 704 is configured to store various types of data to support operation of device 700. Examples of such data include instructions for any application or method operated on device 700, contact data, phone book data, messages, pictures, videos, and so on. The memory 704 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disc.
The power component 706 supplies power to the various components of device 700. The power component 706 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for device 700.
The multimedia component 708 includes a screen that provides an output interface between device 700 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. A touch sensor may sense not only the boundary of a touch or swipe action, but also the duration and pressure associated with the touch or swipe operation. In some embodiments, the multimedia component 708 includes a front camera and/or a rear camera. When device 700 is in an operating mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have focusing and optical zoom capability.
The audio component 710 is configured to output and/or input audio signals. For example, the audio component 710 includes a microphone (MIC), which is configured to receive external audio signals when device 700 is in an operating mode, such as a call mode, a recording mode, or a speech recognition mode. The received audio signal may be further stored in the memory 704 or sent via the communication component 716. In some embodiments, the audio component 710 also includes a loudspeaker for outputting audio signals.
The I/O interface 712 provides an interface between the processing component 702 and peripheral interface modules, which may be a keyboard, a click wheel, buttons, and the like. These buttons may include, but are not limited to, a home button, a volume button, a start button, and a lock button.
The sensor component 714 includes one or more sensors for providing state assessments of various aspects of device 700. For example, the sensor component 714 can detect the open/closed state of device 700 and the relative positioning of components, such as the display and keypad of device 700. The sensor component 714 can also detect a change in position of device 700 or of one of its components, the presence or absence of user contact with device 700, the orientation or acceleration/deceleration of device 700, and a change in the temperature of device 700. The sensor component 714 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 714 may also include an optical sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 714 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 716 is configured to facilitate wired or wireless communication between device 700 and other equipment. Device 700 can access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In an exemplary embodiment, the communication component 716 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 716 also includes a near-field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, device 700 may be implemented by one or more application-specific integrated circuits (ASIC), digital signal processors (DSP), digital signal processing devices (DSPD), programmable logic devices (PLD), field-programmable gate arrays (FPGA), controllers, microcontrollers, microprocessors, or other electronic components, for performing the above method.
In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, for example the memory 704 including instructions, where the instructions can be executed by the processor 720 of device 700 to complete the above method. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
A non-transitory computer-readable storage medium, wherein when the instructions in the storage medium are executed by the processor of device 700, device 700 is enabled to perform the above character identifying method, the method including:
Obtaining a character row image to be identified;
Identifying the character segmentation lines in the character row image to be identified by means of image features;
Splitting the character row image to be identified according to the character segmentation lines to obtain segmented character images.
Before identifying the character segmentation lines in the character row image to be identified by means of image features, the method further includes:
Adjusting the character row image to be identified to a predetermined size while keeping its original aspect ratio unchanged.
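As a small worked example of the size adjustment, the sketch below computes the new dimensions when the height is fixed and the width follows from the original aspect ratio. Fixing the height at 32 pixels is an assumption made for illustration (the text only says "predetermined size"), and `scaled_size` is a hypothetical helper.

```python
def scaled_size(width, height, target_height=32):
    """Return (new_width, new_height) that preserves the width-to-height ratio."""
    scale = target_height / height
    # Round to whole pixels and never let the width collapse to zero.
    return max(1, round(width * scale)), target_height

new_size = scaled_size(200, 64)
```

Normalizing the height this way means one fixed sliding-window size can be used for every input row.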
Identifying the character segmentation lines in the character row image to be identified by means of image features includes:
Setting a sliding window, wherein the height of the sliding window is equal to the height of the character row image to be identified;
Moving the sliding window across the character row image to be identified according to a predetermined step;
Extracting the image features of the character row image to be identified within the sliding window;
Identifying, according to the extracted image features, whether the character row image to be identified contains the character segmentation line within the sliding window.
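The window traversal listed above can be sketched as a generator. The window width and step are free parameters here because the text leaves them as predetermined values, and `sliding_windows` is a hypothetical name; the image is assumed to be a NumPy array whose first axis is height.

```python
import numpy as np

def sliding_windows(line_image, window_width, step):
    """Yield (center_x, window) for full-height windows moved left to right."""
    width = line_image.shape[1]
    for left in range(0, width - window_width + 1, step):
        window = line_image[:, left:left + window_width]
        # center_x is the column of the window's vertical center line.
        yield left + window_width // 2, window

image = np.zeros((32, 100))
windows = list(sliding_windows(image, window_width=16, step=4))
```

Each yielded center column is the candidate segmentation line that the classifier would accept or reject for that window.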
Identifying, according to the extracted image features, whether the character row image to be identified contains a character segmentation line within the sliding window includes:
Extracting the gradient features in the sliding window;
Reducing the dimensionality of the gradient features;
Using the dimensionality-reduced gradient features, identifying by a classifier whether the vertical center line of the sliding window is the character segmentation line.
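The feature extraction and dimensionality reduction steps might be sketched as follows. Two assumptions are made that the steps above do not confirm: per-pixel gradient magnitude is used as the gradient feature, and principal component analysis is used for the reduction, keeping the fewest components that explain at least 80 percent of the variance. The names `gradient_features`, `fit_pca`, and `project` are hypothetical.

```python
import numpy as np

def gradient_features(window):
    """Flatten the per-pixel gradient magnitude of a grayscale window into a vector."""
    dy, dx = np.gradient(window.astype(float))
    return np.hypot(dx, dy).ravel()

def fit_pca(features, retained=0.8):
    """Return (mean, components) with the fewest components reaching the retained variance share."""
    mean = features.mean(axis=0)
    _, singular, vt = np.linalg.svd(features - mean, full_matrices=False)
    variance = singular ** 2
    share = np.cumsum(variance) / variance.sum()
    k = int(np.searchsorted(share, retained)) + 1
    return mean, vt[:k]

def project(feature, mean, components):
    """Map a feature vector onto the retained principal components."""
    return (feature - mean) @ components.T

rng = np.random.default_rng(0)
sample_windows = rng.random((20, 8, 6))  # 20 toy windows, 8 pixels high, 6 wide
features = np.stack([gradient_features(w) for w in sample_windows])
mean, components = fit_pca(features)
reduced = project(features[0], mean, components)
```

The reduced vector is what would be passed to the classifier in place of the full gradient feature vector.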
Identifying the character segmentation lines in the character row image to be identified by means of image features further includes:
After all character segmentation lines in the character row image to be identified have been identified, removing repeated character segmentation lines.
Removing the repeated character segmentation lines includes:
Removing the repeated character segmentation lines by non-maximum suppression.
Other embodiments of the disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the disclosure herein. This application is intended to cover any variations, uses, or adaptations of the disclosure that follow its general principles and include common knowledge or customary technical means in the art that are not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the disclosure indicated by the following claims.
It should be understood that the disclosure is not limited to the precise structures described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the disclosure is limited only by the appended claims.

Claims (13)

1. A character identifying method, characterized by including:
Obtaining a character row image to be identified;
Identifying the character segmentation lines in the character row image to be identified by means of image features;
Splitting the character row image to be identified according to the character segmentation lines to obtain segmented character images.
2. The method of claim 1, characterized in that, before identifying the character segmentation lines in the character row image to be identified by means of image features, the method further includes:
Adjusting the character row image to be identified to a predetermined size while keeping its original aspect ratio unchanged.
3. The method of claim 1, characterized in that identifying the character segmentation lines in the character row image to be identified by means of image features includes:
Setting a sliding window, wherein the height of the sliding window is equal to the height of the character row image to be identified;
Moving the sliding window across the character row image to be identified according to a predetermined step;
Extracting the image features of the character row image to be identified within the sliding window;
Identifying, according to the extracted image features, whether the character row image to be identified contains the character segmentation line within the sliding window.
4. The method of claim 3, characterized in that identifying, according to the extracted image features, whether the character row image to be identified contains a character segmentation line within the sliding window includes:
Extracting the gradient features in the sliding window;
Reducing the dimensionality of the gradient features;
Using the dimensionality-reduced gradient features, identifying by a classifier whether the vertical center line of the sliding window is the character segmentation line.
5. The method of claim 1, characterized in that identifying the character segmentation lines in the character row image to be identified by means of image features further includes:
After all character segmentation lines in the character row image to be identified have been identified, removing repeated character segmentation lines.
6. The method of claim 5, characterized in that removing the repeated character segmentation lines includes:
Removing the repeated character segmentation lines by non-maximum suppression.
7. A character recognition device, characterized by including:
An acquisition module, configured to obtain a character row image to be identified;
An identification module, configured to identify the character segmentation lines in the character row image to be identified by means of image features;
A segmentation module, configured to split the character row image to be identified according to the character segmentation lines to obtain segmented character images.
8. The device of claim 7, characterized by further including, before the identification module:
An adjusting module, configured to adjust the character row image to be identified to a predetermined size while keeping its original aspect ratio unchanged.
9. The device of claim 7, characterized in that the identification module includes:
A setting submodule, configured to set a sliding window, wherein the height of the sliding window is equal to the height of the character row image to be identified;
A moving submodule, configured to move the sliding window across the character row image to be identified according to a predetermined step;
A first extraction submodule, configured to extract the image features of the character row image to be identified within the sliding window;
A first identification submodule, configured to identify, according to the extracted image features, whether the character row image to be identified contains the character segmentation line within the sliding window.
10. The device of claim 9, characterized in that the first identification submodule includes:
A second extraction submodule, configured to extract the gradient features in the sliding window;
A dimensionality reduction submodule, configured to reduce the dimensionality of the gradient features;
A second identification submodule, configured to use the dimensionality-reduced gradient features to identify by a classifier whether the vertical center line of the sliding window is the character segmentation line.
11. The device of claim 7, characterized in that the identification module further includes:
A first denoising submodule, configured to remove repeated character segmentation lines after all character segmentation lines in the character row image to be identified have been identified.
12. The device of claim 11, characterized in that the first denoising submodule includes:
A second denoising submodule, configured to remove the repeated character segmentation lines by non-maximum suppression.
13. A character recognition device, characterized by including:
A processor;
A memory for storing instructions executable by the processor;
Wherein the processor is configured to:
Obtain a character row image to be identified;
Identify the character segmentation lines in the character row image to be identified by means of image features;
Split the character row image to be identified according to the character segmentation lines to obtain segmented character images.
CN201610833487.0A 2016-09-19 2016-09-19 Character identifying method and device Active CN106485246B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610833487.0A CN106485246B (en) 2016-09-19 2016-09-19 Character identifying method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610833487.0A CN106485246B (en) 2016-09-19 2016-09-19 Character identifying method and device

Publications (2)

Publication Number Publication Date
CN106485246A true CN106485246A (en) 2017-03-08
CN106485246B CN106485246B (en) 2019-07-16

Family

ID=58267403

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610833487.0A Active CN106485246B (en) 2016-09-19 2016-09-19 Character identifying method and device

Country Status (1)

Country Link
CN (1) CN106485246B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108830278A (en) * 2018-05-17 2018-11-16 河南思维轨道交通技术研究院有限公司 A kind of character string picture recognition methods
CN111401173A (en) * 2020-03-06 2020-07-10 埃洛克航空科技(北京)有限公司 City orthoscopic segmentation identification method based on multi-window state identification process
CN111539438A (en) * 2020-04-28 2020-08-14 北京百度网讯科技有限公司 Text content identification method and device and electronic equipment
CN112613348A (en) * 2020-12-01 2021-04-06 浙江华睿科技有限公司 Character recognition method and electronic equipment

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101377855A (en) * 2007-08-27 2009-03-04 富士施乐株式会社 Document image processing apparatus, and information processing method
CN101571921A (en) * 2008-04-28 2009-11-04 富士通株式会社 Method and device for identifying key words
CN103268489A (en) * 2013-05-29 2013-08-28 电子科技大学 Motor vehicle plate identification method based on sliding window searching
CN104573688A (en) * 2015-01-19 2015-04-29 电子科技大学 Mobile platform tobacco laser code intelligent identification method and device based on deep learning
CN105407245A (en) * 2014-09-08 2016-03-16 柯尼卡美能达株式会社 Electronic Document Generation Apparatus, Recording Medium, And Electronic Document Generation System
CN105528604A (en) * 2016-01-31 2016-04-27 华南理工大学 Bill automatic identification and processing system based on OCR
CN105574524A (en) * 2015-12-11 2016-05-11 北京大学 Cartoon image page identification method and system based on dialogue and storyboard united identification
CN105654082A (en) * 2014-11-12 2016-06-08 佳能株式会社 Method and equipment for character identification post-processing and image picking equipment comprising equipment

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101377855A (en) * 2007-08-27 2009-03-04 富士施乐株式会社 Document image processing apparatus, and information processing method
CN101571921A (en) * 2008-04-28 2009-11-04 富士通株式会社 Method and device for identifying key words
CN103268489A (en) * 2013-05-29 2013-08-28 电子科技大学 Motor vehicle plate identification method based on sliding window searching
CN105407245A (en) * 2014-09-08 2016-03-16 柯尼卡美能达株式会社 Electronic Document Generation Apparatus, Recording Medium, And Electronic Document Generation System
CN105654082A (en) * 2014-11-12 2016-06-08 佳能株式会社 Method and equipment for character identification post-processing and image picking equipment comprising equipment
CN104573688A (en) * 2015-01-19 2015-04-29 电子科技大学 Mobile platform tobacco laser code intelligent identification method and device based on deep learning
CN105574524A (en) * 2015-12-11 2016-05-11 北京大学 Cartoon image page identification method and system based on dialogue and storyboard united identification
CN105528604A (en) * 2016-01-31 2016-04-27 华南理工大学 Bill automatic identification and processing system based on OCR

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108830278A (en) * 2018-05-17 2018-11-16 河南思维轨道交通技术研究院有限公司 A kind of character string picture recognition methods
CN108830278B (en) * 2018-05-17 2021-11-02 河南思维轨道交通技术研究院有限公司 Character string image recognition method
CN111401173A (en) * 2020-03-06 2020-07-10 埃洛克航空科技(北京)有限公司 City orthoscopic segmentation identification method based on multi-window state identification process
CN111401173B (en) * 2020-03-06 2022-08-02 埃洛克航空科技(北京)有限公司 City orthoscopic segmentation identification method based on multi-window state identification process
CN111539438A (en) * 2020-04-28 2020-08-14 北京百度网讯科技有限公司 Text content identification method and device and electronic equipment
US11810384B2 (en) 2020-04-28 2023-11-07 Beijing Baidu Netcom Science Technology Co., Ltd. Method and apparatus for recognizing text content and electronic device
CN111539438B (en) * 2020-04-28 2024-01-12 北京百度网讯科技有限公司 Text content identification method and device and electronic equipment
CN112613348A (en) * 2020-12-01 2021-04-06 浙江华睿科技有限公司 Character recognition method and electronic equipment

Also Published As

Publication number Publication date
CN106485246B (en) 2019-07-16

Similar Documents

Publication Publication Date Title
CN105528607A (en) Region extraction method and model training method and device
CN106228168B (en) The reflective detection method of card image and device
CN106503617A (en) Model training method and device
CN105631408B (en) Face photo album processing method and device based on video
CN106228556B (en) image quality analysis method and device
CN106557768A (en) The method and device is identified by word in picture
CN105426857A (en) Training method and device of face recognition model
CN109934275B (en) Image processing method and device, electronic equipment and storage medium
JP2018500706A (en) Region recognition method and apparatus
CN105095881A (en) Method, apparatus and terminal for face identification
CN106650575A (en) Face detection method and device
CN106295511A (en) Face tracking method and device
CN108010060A (en) Object detection method and device
CN107944447A (en) Image classification method and device
CN106485246A (en) Character identifying method and device
CN104239879A (en) Character segmentation method and device
CN104077597B (en) Image classification method and device
CN107145859A (en) E-book conversion process method, device and computer-readable recording medium
CN104284240A (en) Video browsing method and device
CN109670458A (en) A kind of licence plate recognition method and device
CN105354560A (en) Fingerprint identification method and device
CN107784279A (en) Method for tracking target and device
CN106372603A (en) Shielding face identification method and shielding face identification device
CN106296665A (en) Card image obscures detection method and device
CN107038428A (en) Vivo identification method and device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant