CN109558875A - Method, apparatus, terminal and storage medium based on image automatic identification - Google Patents

Method, apparatus, terminal and storage medium based on image automatic identification

Info

Publication number
CN109558875A
Authority
CN
China
Prior art keywords
image
character
area
area image
automatic identification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811352690.1A
Other languages
Chinese (zh)
Inventor
刘国平
冯德明
梁文佳
廖梅芳
黄蔚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mdt Infotech Ltd Guangzhou
Original Assignee
Mdt Infotech Ltd Guangzhou
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mdt Infotech Ltd Guangzhou
Priority to CN201811352690.1A
Publication of CN109558875A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Character Input (AREA)
  • Character Discrimination (AREA)

Abstract

The present invention relates to the field of character recognition and provides a method, apparatus, terminal and storage medium based on automatic image recognition, intended to solve the problem of local distortion in an image to be recognized. The method comprises the following steps. Image acquisition step: acquire an image by scanning or photographing. Region image acquisition step: delimit a region on the acquired image and obtain the image within that region. Region character acquisition step: recognize the characters in the region image and convert them into enterable characters. Because recognition is performed on a region selected on the image, the recognition of the characters in that region improves, which in turn improves input efficiency and quality.

Description

Method, apparatus, terminal and storage medium based on image automatic identification
Technical field
The present invention relates to the field of character recognition, and in particular to a method, apparatus, terminal and storage medium based on automatic image recognition.
Background art
At present, in the various data entry processes of computer applications, characters are usually entered by typing on a keyboard with the help of input method software. With manual keyboard entry, input efficiency is limited by the operator's typing speed or by the speed of selecting candidate characters and words, and input quality is strongly affected by human factors.
The prior art can already obtain a digital image of an original document directly with a digital image acquisition device, run automatic character recognition (OCR) over the entire image, and use the extracted characters as the input content.
Because of factors such as the acquisition device itself, the acquisition environment and the clarity of the original content, some regions within an image may be distorted. In that case, applying optimization and preprocessing with the same parameters to the full image makes it difficult to resolve the image-quality problems of the different regions at once, or leaves some regions with good image quality and others with poor quality. When the full image is recognized uniformly, recognition efficiency and accuracy are also poor for images that contain text in several languages or character sets, owing to the limitations of OCR software. In addition, character content recognized from the full image in one pass is difficult to divide into the specific items to be entered, which affects input efficiency and quality.
Summary of the invention
The technical problem solved by the present invention is the local distortion of an image to be recognized. To this end, a method, apparatus, terminal and storage medium based on automatic image recognition are provided.
To solve the above technical problem, the present invention provides the following technical solution:
A method based on automatic image recognition, comprising the following steps:
Image acquisition step: acquire the image to be recognized;
Region image acquisition step: delimit a region on the acquired image and obtain the image within the region;
Region character acquisition step: recognize the characters in the region image and convert them into enterable characters.
A region is delimited on part of the acquired image, either manually or by computer; the region image is then obtained from the region, text is extracted from the region image by OCR, and the resulting enterable text is stored.
Recognition is performed on a region selected on the image, which improves the recognition of the characters in that region and thus improves input efficiency and quality.
Preferably, a character entry step follows the region character acquisition step. The character entry step is: store the enterable characters.
Preferably, an image optimization step follows the region image acquisition step. The image optimization step comprises:
Region feature acquisition step: obtain the features of the region image;
Region image optimization step: match an optimization algorithm according to the obtained region features, and optimize the region image with the matched algorithm to obtain an optimized region image. The features of the image are extracted (mainly features of the text), and the image is then optimized according to those features, which improves the quality of the region image and in turn the efficiency of recognizing the text in it.
Preferably, a matching step follows the image optimization step. The matching step is:
Match OCR parameters according to the region features and select a suitable language and character set. The language and character set are screened according to the region features, mainly the language type and the font of the characters.
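The matching step can be pictured as a table lookup from region features to OCR parameters. The following is a minimal sketch only: the `OcrParams` type, the feature keys `script` and `font`, and every table entry are hypothetical names introduced for illustration and are not specified by the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OcrParams:
    language: str  # language code handed to the OCR engine (hypothetical)
    charset: str   # character set the recognizer is restricted to (hypothetical)


# Hypothetical lookup table: (script, font style) -> OCR parameters.
PARAM_TABLE = {
    ("han", "printed"): OcrParams("zh", "GB2312"),
    ("latin", "printed"): OcrParams("en", "ASCII"),
    ("digits", "printed"): OcrParams("en", "0123456789"),
}
DEFAULT_PARAMS = OcrParams("en", "ASCII")


def match_ocr_params(region_features: dict) -> OcrParams:
    """Pick the language and character set for one region from its features."""
    key = (region_features.get("script"), region_features.get("font"))
    return PARAM_TABLE.get(key, DEFAULT_PARAMS)
```

A region whose features are not in the table falls back to `DEFAULT_PARAMS`; a real system would probably prefer to flag such a region for manual confirmation.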
Preferably, at least one region is delimited in the region image acquisition step. An image may contain more than one unclear region, so delimiting multiple regions can further improve recognition.
Preferably, a region image quantity acquisition step precedes the region image acquisition step:
Obtain the graphic features of the region images;
Determine the number n of region images according to the graphic features. The graphic feature of a region image may be a regular shape such as a rectangle or a circle, or an irregular hand-drawn shape.
Preferably, the number of region images is n ≥ 1 and the automatic image recognition is processed in a single thread: after automatic recognition and input of region image i is complete, automatic recognition and input of another region image is carried out, where 1 ≤ i ≤ n. Single-thread processing offers higher accuracy and helps improve text recognition.
Preferably, the number of region images is n ≥ 1 and the automatic image recognition is processed in multiple threads, the number of threads being n. Multi-thread processing is more efficient and can complete image recognition and input in a shorter time.
Preferably, a classification step follows the region feature acquisition step. The classification step is:
Classify the region images according to the region features to obtain m classes of images to be processed, where m ≤ n.
The region images are classified and images of the same class are processed together, which reduces the difficulty of optimization and character recognition and improves recognition efficiency.
An apparatus based on automatic image recognition, comprising:
Image acquisition module: acquires the image to be recognized;
Region image acquisition module: delimits a region on the acquired image and obtains the image within the region;
Character acquisition module: recognizes the characters in the region image and converts them into enterable characters. The image acquisition module is connected to the region image acquisition module, and the region image acquisition module is connected to the character acquisition module.
The image acquisition module acquires an image through a scanner or camera and passes it to the region image acquisition module, which delimits a region; after the region is delimited, the character acquisition module obtains the characters from the image within the region.
Preferably, the character acquisition module is connected to a character entry module. The character entry module enters the characters into the region delimited in the region image acquisition step and finally stores the acquired characters.
A terminal based on automatic image recognition, comprising a processor and a memory, wherein the processor is configured to execute a program, stored in the memory, of the method based on automatic image recognition, so as to implement the above method. The terminal may take various forms, but the processor and the memory must be retained.
A storage medium based on automatic image recognition, storing one or more programs executable by one or more processors to implement the above method. The storage medium may take various forms and can be applied to different scenarios.
Compared with the prior art, the present invention has the following beneficial effect: recognition is performed on a region selected on the image, which improves the recognition of the characters in that region and thus improves input efficiency and quality.
Brief description of the drawings
Fig. 1 is a flow diagram of the method based on automatic image recognition.
Fig. 2 is a flow diagram of single-thread automatic image recognition.
Fig. 3 is a flow diagram of multi-thread automatic image recognition.
Fig. 4 is a schematic diagram of the apparatus based on automatic image recognition.
Fig. 5 is another schematic diagram of the apparatus based on automatic image recognition.
Fig. 6 is a schematic diagram of the apparatus based on image classification with single-thread automatic recognition.
Fig. 7 is a schematic diagram of the terminal based on single-thread automatic image recognition.
Detailed description of the embodiments
The following embodiments further illustrate the present invention and do not limit it.
A method based on automatic image recognition, in some embodiments of the present application, comprises the following steps:
Image acquisition step: acquire the image to be recognized;
Region image acquisition step: delimit a region on the acquired image and obtain the image within the region;
Region character acquisition step: recognize the characters in the region image and convert them into enterable characters.
A region is delimited on part of the acquired image, either manually or by computer; the region image is then obtained from the region, text is extracted from the region image by OCR, and the resulting enterable text is stored.
Methods of acquiring the image include, but are not limited to, scanning and photographing.
Recognition is performed on a region selected on the image, which improves the recognition of the characters in that region and thus improves input efficiency and quality.
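The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the image is modelled as a list of pixel rows, each region is an axis-aligned rectangle, and the OCR engine is passed in as a callable, since the source does not name one. The names `crop_region` and `recognize_regions` are hypothetical.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]          # rows of pixel values
Box = Tuple[int, int, int, int]  # (top, left, bottom, right); bottom and right exclusive


def crop_region(image: Image, box: Box) -> Image:
    """Region image acquisition: return the pixels inside the delimited box."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


def recognize_regions(image: Image, boxes: List[Box],
                      ocr: Callable[[Image], str]) -> List[str]:
    """Region character acquisition: run the supplied OCR callable per region."""
    return [ocr(crop_region(image, box)) for box in boxes]
```

Passing the recognizer in as a parameter keeps the sketch independent of any particular OCR library, and it lets each region later be recognized with its own settings.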
In other embodiments of the present application, a character entry step follows the region character acquisition step. The character entry step is: store the acquired enterable characters.
In other embodiments of the present application, an image optimization step follows the region image acquisition step. The image optimization step comprises:
Region feature acquisition step: obtain the features of the region image;
Region image optimization step: match an optimization algorithm according to the obtained region features, and optimize the region image with the matched algorithm to obtain an optimized region image.
The features of the image are extracted (mainly features of the text), and the image is then optimized according to those features, which improves the quality of the region image and in turn the efficiency of recognizing the text in it.
In this embodiment, image optimization comprises grayscale conversion, binarization, denoising, tilt detection and correction, and text feature extraction. The raw image data sent by the client is received. The raw image data is converted to grayscale using the weighted mean method. The grayscale image is binarized so that the image contains only black foreground information and white background information, which improves the efficiency and accuracy of recognition. The image to be recognized is denoised according to the characteristics of the noise, which further improves recognition accuracy. Tilt detection and correction: the edge lines of the paper document's border are detected by the Hough transform to obtain the paper document region, and the tilt angle of the paper document is determined and corrected. Text feature extraction: the raw image data is converted into a density matrix, in which a pixel is recorded as a "0" element when fewer than 2 of its adjacent pixels in the 8 directions (up, down, left, right and the four diagonals) are black. The density matrix is examined row by row, consecutive "0" elements in a row are connected into a "detection line", and whether an area is a text area is judged from the depth of the detection line's starting point, the difference in length and the end position. A comparison database is established; its content should include all character sets to be recognized, with feature groups obtained by the same feature extraction method as for the input text.
It should be noted that this embodiment only briefly introduces the process of OCR parsing of the original image and is not limiting. Users can design and optimize the OCR parsing process according to their actual situation and needs.
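The grayscale and binarization stages described above can be sketched as follows. The source names the weighted mean method but gives no weights or threshold, so the weights 0.299, 0.587 and 0.114 (the common luma weights) and the fixed threshold of 128 are assumptions, as are the function names.

```python
from typing import List, Tuple

RGBImage = List[List[Tuple[int, int, int]]]
GrayImage = List[List[int]]

# Assumed weights; the source only says "weighted mean method".
W_R, W_G, W_B = 0.299, 0.587, 0.114


def to_gray(rgb: RGBImage) -> GrayImage:
    """Weighted-mean grayscale conversion, one value in 0..255 per pixel."""
    return [[round(W_R * r + W_G * g + W_B * b) for (r, g, b) in row] for row in rgb]


def binarize(gray: GrayImage, threshold: int = 128) -> GrayImage:
    """Map each pixel to black (0) foreground or white (255) background."""
    return [[0 if value < threshold else 255 for value in row] for row in gray]
```

A fixed threshold is the simplest choice; for unevenly lit scans an adaptive threshold computed per region would fit the region-by-region idea of the method better.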
In other embodiments of the present application, a matching step follows the image optimization step. The matching step is:
Match OCR parameters according to the region features and select a suitable language and character set.
The language and character set are screened according to the region features, mainly the language type and the font of the characters.
Selecting a suitable language and character set can be achieved as follows: depending on the features, different mathematical distance functions are chosen, for example Euclidean-space comparison, relaxation comparison (Relaxation), dynamic programming comparison (Dynamic Programming, DP), database construction and comparison with neural networks, and HMM (Hidden Markov Model). Because the recognition rate of OCR cannot reach 100%, the correctness and confidence of the comparison are reinforced by a correction function: the recognized text after comparison is used to determine its possible similar candidates, and the most plausible word is chosen according to the recognized text before and after it.
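Of the comparison methods listed, the Euclidean-space comparison followed by candidate-based correction is the easiest to illustrate. The sketch below is an assumption-laden example, not the patented procedure: feature vectors are short tuples, `templates` stands in for the comparison database, and the lexicon lookup is one simple way to pick the most plausible word from the candidates.

```python
import math
from itertools import product
from typing import Dict, List, Sequence, Set


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_candidates(features: Sequence[float],
                    templates: Dict[str, Sequence[float]],
                    top_k: int = 2) -> List[str]:
    """Return the top_k template characters closest to the input features."""
    ranked = sorted(templates, key=lambda ch: euclidean(features, templates[ch]))
    return ranked[:top_k]


def correct_with_context(candidates_per_char: List[List[str]],
                         lexicon: Set[str]) -> str:
    """Prefer a candidate combination that forms a known word."""
    for combo in product(*candidates_per_char):
        word = "".join(combo)
        if word in lexicon:
            return word
    # No known word found: fall back to the best candidate at each position.
    return "".join(cands[0] for cands in candidates_per_char)
```

For example, a glyph whose nearest templates are "0" and "O" is resolved to "O" when the surrounding characters make "CODE" the only word in the lexicon.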
Manual correction means that, after matching and recognition, interaction with the user is required to manually confirm and adjust any information that could not be recognized, yielding the target text data.
In other embodiments of the present application, at least one region is delimited in the region image acquisition step.
An image may contain more than one unclear region, so delimiting multiple regions can further improve recognition.
In other embodiments of the present application, a region image quantity acquisition step precedes the region image acquisition step:
Obtain the graphic features of the region images;
Determine the number n of region images according to the graphic features.
The graphic feature of a region image may be a regular shape such as a rectangle or a circle, or an irregular hand-drawn shape.
Rectangles, circles and other regular shapes are easy to identify, but since the distorted regions on some images are not necessarily regular, irregular shapes are sometimes also necessary.
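One way to support rectangular, circular and irregular hand-drawn regions alike is to reduce each shape to a point-membership test. The sketch below is illustrative only: the shape dictionaries and their keys are hypothetical, a hand-drawn outline is approximated by a polygon, and the polygon test uses the standard ray-casting technique, which the source does not prescribe. The number n of region images is then simply the number of shapes drawn.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def in_polygon(x: float, y: float, vertices: List[Point]) -> bool:
    """Ray casting: count edge crossings of a horizontal ray starting at (x, y)."""
    inside = False
    count = len(vertices)
    for i in range(count):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % count]
        if (y1 > y) != (y2 > y):  # the edge spans the ray's height
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def contains(shape: Dict, x: float, y: float) -> bool:
    """Membership test for the three kinds of delimited region."""
    if shape["kind"] == "rect":
        return shape["left"] <= x < shape["right"] and shape["top"] <= y < shape["bottom"]
    if shape["kind"] == "circle":
        return (x - shape["cx"]) ** 2 + (y - shape["cy"]) ** 2 <= shape["r"] ** 2
    if shape["kind"] == "polygon":
        return in_polygon(x, y, shape["vertices"])
    raise ValueError("unknown shape kind: %r" % shape["kind"])


def count_regions(shapes: List[Dict]) -> int:
    """The number n of region images equals the number of delimited shapes."""
    return len(shapes)
```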
In other embodiments of the present application, the number of region images is n ≥ 1 and the automatic image recognition is processed in a single thread: after automatic recognition and input of region image i is complete, automatic recognition and input of another region image is carried out, where 1 ≤ i ≤ n.
Single-thread processing offers higher accuracy and helps improve text recognition.
On the basis of the above embodiments, as shown in Fig. 2, the number of region images is n and the automatic image recognition is processed in a single thread. That is, after regions are delimited on the acquired image, a region i is selected and its image features are obtained; an image optimization algorithm is selected according to these features and applied to the region image to obtain an optimized region image; OCR is then performed on the optimized region image to recognize the characters and obtain enterable characters, which are then stored. After automatic recognition of region i, recognition of the next region j is carried out, until all regions have been recognized.
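The single-thread flow of Fig. 2 amounts to a plain loop over the regions. The sketch below is a minimal illustration: the mean-brightness feature, the `brighten` optimizer and the threshold of 128 are placeholders chosen for the example, and the OCR engine is again an injected callable.

```python
from typing import Callable, Dict, List

Region = List[List[int]]  # grayscale pixel rows, 0 (black) to 255 (white)


def extract_features(region: Region) -> Dict[str, float]:
    """Placeholder feature: mean brightness of the region."""
    pixels = [value for row in region for value in row]
    return {"mean": sum(pixels) / len(pixels)}


def brighten(region: Region) -> Region:
    """Placeholder optimization for dark regions."""
    return [[min(255, value * 2) for value in row] for row in region]


def select_optimizer(features: Dict[str, float]) -> Callable[[Region], Region]:
    """Match an optimization algorithm to the region features."""
    return brighten if features["mean"] < 128 else (lambda region: region)


def recognize_sequentially(regions: List[Region],
                           ocr: Callable[[Region], str]) -> List[str]:
    """Process region i completely before moving on to the next region."""
    stored = []
    for region in regions:
        features = extract_features(region)
        optimize = select_optimizer(features)
        stored.append(ocr(optimize(region)))  # recognize, then store
    return stored
```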
In other embodiments of the present application, the number of region images is n ≥ 1 and the automatic image recognition is processed in multiple threads, the number of threads being n.
Multi-thread processing is more efficient and can complete image recognition and input in a shorter time.
On the basis of the above embodiments, as shown in Fig. 3, the number of region images is n and the automatic image recognition is processed in multiple threads. That is, after regions are delimited on the acquired image, automatic recognition and input processing is carried out on all regions 1 to n together. For a region i', its image features are obtained; an image optimization algorithm is selected according to these features and applied to region image i' to obtain an optimized region image; OCR is then performed on the optimized region image to recognize the characters and obtain enterable characters, which are then stored. Region image i' is any one of the region images 1 to n. Automatic recognition of regions 1 to n starts at the same time, but may finish at different times because the difficulty of recognizing the text differs from image to image.
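The multi-thread flow of Fig. 3 can be sketched with a thread pool sized to the number of regions. This is one possible realization, assuming Python's standard `concurrent.futures` module; `process_region` stands for the per-region chain of feature extraction, optimization and OCR and is supplied by the caller.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

Region = List[List[int]]


def recognize_concurrently(regions: List[Region],
                           process_region: Callable[[Region], str]) -> List[str]:
    """Start one worker thread per region, matching a thread count of n."""
    n = len(regions)
    if n == 0:
        return []
    with ThreadPoolExecutor(max_workers=n) as pool:
        # map() yields results in region order even if threads finish out of order.
        return list(pool.map(process_region, regions))
```

Although the threads may finish at different times, the results come back in region order, so each recognized string can still be stored against the region it came from.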
In other embodiments of the present application, a classification step follows the region feature acquisition step. The classification step is:
Classify the region images according to the region features to obtain m classes of images to be processed, where m ≤ n.
The region images are classified and images of the same class are processed together, which reduces the difficulty of optimization and character recognition and improves recognition efficiency.
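The classification step can be sketched as grouping region indices by a key derived from their features. The feature dictionaries and the `key` function are hypothetical; the point of the sketch is only that n regions collapse into m ≤ n classes.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, List


def classify_regions(region_features: List[dict],
                     key: Callable[[dict], Hashable]) -> Dict[Hashable, List[int]]:
    """Group region indices by feature class; the result has m <= n entries."""
    classes = defaultdict(list)
    for index, features in enumerate(region_features):
        classes[key(features)].append(index)
    return dict(classes)
```

Keeping the region indices inside each class makes it possible to write the recognized characters back to the storage area of the region they came from.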
On the basis of the above embodiments, in other embodiments of the present application, the number of region images is n and the automatic image recognition is processed in a single thread. That is, after regions are delimited on the acquired image, the image features of each region are obtained and the region images are classified according to those features to obtain classified images to be processed. The number of classes is m, with m less than or equal to n, so there are m classes of images to be processed in total. One class of images to be processed, screened out according to feature k, is selected; an image optimization algorithm is selected according to feature k and applied to the region images of that class to obtain the optimized region images of the class; OCR is then performed on them to recognize the characters and obtain enterable characters, which are stored in the storage areas corresponding to the respective region images of the class. After automatic recognition of the images screened out by feature k, the images to be processed of the next class l are recognized, until all regions have been recognized.
On the basis of the above embodiments, in other embodiments of the present application, the number of region images is n and the automatic image recognition is processed in multiple threads. That is, after regions are delimited on the acquired image, the region features of all regions are obtained, m features in total with m less than or equal to n; the region images are classified according to the region features and numbered 1 to m, giving m classes of images to be processed, and automatic recognition and input processing is carried out on the images of all classes together. For example, for the class of images to be processed screened out according to feature k', an image optimization algorithm is selected according to feature k' and applied to the region images of that class to obtain the optimized region images of the class; OCR is then performed on them to recognize the characters and obtain enterable characters, which are stored in the storage areas corresponding to the respective region images of the class. Class k' is any one of the classes 1 to m of images to be processed. Automatic recognition of the images of classes 1 to m starts at the same time, but may finish at different times because the difficulty of recognizing the text differs between classes.
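The class-based multi-thread variant can be sketched with one worker per class rather than per region. All names below are hypothetical: `classes` maps a class key to the indices of its regions, `optimizers` maps the same key to the optimization chosen for that class, and the result dictionary plays the role of the per-region storage areas.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Hashable, List

Region = List[List[int]]


def process_class(class_key: Hashable, members: List[int], regions: List[Region],
                  optimizers: Dict[Hashable, Callable[[Region], Region]],
                  ocr: Callable[[Region], str]) -> Dict[int, str]:
    """Optimize and recognize every region of one class with the same algorithm."""
    optimize = optimizers[class_key]
    return {index: ocr(optimize(regions[index])) for index in members}


def recognize_by_class(classes: Dict[Hashable, List[int]], regions: List[Region],
                       optimizers: Dict[Hashable, Callable[[Region], Region]],
                       ocr: Callable[[Region], str]) -> Dict[int, str]:
    """Run the m classes in parallel and store each result under its region index."""
    results: Dict[int, str] = {}
    if not classes:
        return results
    with ThreadPoolExecutor(max_workers=len(classes)) as pool:
        futures = [pool.submit(process_class, key, members, regions, optimizers, ocr)
                   for key, members in classes.items()]
        for future in futures:
            results.update(future.result())
    return results
```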
An apparatus based on automatic image recognition, in some embodiments of the present application, comprises:
Image acquisition module: acquires an image by scanning or photographing;
Region image acquisition module: delimits a region on the acquired image and obtains the image within the region;
Character acquisition module: recognizes the characters in the region image and converts them into enterable characters. The image acquisition module is connected to the region image acquisition module, and the region image acquisition module is connected to the character acquisition module.
The image acquisition module acquires an image through a scanner or camera and passes it to the region image acquisition module, which delimits a region; after the region is delimited, the character acquisition module obtains the characters from the image within the region, and finally the character entry module stores the acquired characters.
In other embodiments of the present application, the character entry module enters the characters into the region delimited in the region image acquisition step; as shown in Fig. 4, the character acquisition module is connected to the character entry module.
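The module chain of Fig. 4 can be pictured as four small classes wired in sequence. This is a structural sketch only: the class and method names are hypothetical, the scanner or camera is replaced by a callable image source, regions are rectangles, and the OCR engine is injected.

```python
from typing import Callable, Dict, List, Tuple

Image = List[List[int]]
Box = Tuple[int, int, int, int]  # (top, left, bottom, right); bottom and right exclusive


class ImageAcquisitionModule:
    def __init__(self, source: Callable[[], Image]):
        self.source = source  # stands in for a scanner or camera

    def acquire(self) -> Image:
        return self.source()


class RegionImageAcquisitionModule:
    def __init__(self, boxes: List[Box]):
        self.boxes = boxes  # the delimited regions

    def region_images(self, image: Image) -> Dict[Box, Image]:
        return {(t, l, b, r): [row[l:r] for row in image[t:b]]
                for (t, l, b, r) in self.boxes}


class CharacterAcquisitionModule:
    def __init__(self, ocr: Callable[[Image], str]):
        self.ocr = ocr

    def characters(self, region_image: Image) -> str:
        return self.ocr(region_image)


class CharacterEntryModule:
    def __init__(self) -> None:
        self.stored: Dict[Box, str] = {}

    def enter(self, box: Box, text: str) -> None:
        self.stored[box] = text  # characters are kept per delimited region


def run_device(acquisition: ImageAcquisitionModule,
               region_module: RegionImageAcquisitionModule,
               character_module: CharacterAcquisitionModule,
               entry_module: CharacterEntryModule) -> Dict[Box, str]:
    """Pass the image along the connected modules and return what was stored."""
    image = acquisition.acquire()
    for box, region_image in region_module.region_images(image).items():
        entry_module.enter(box, character_module.characters(region_image))
    return entry_module.stored
```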
In other embodiments of the present application, the region image acquisition module is connected to an image optimization module. The image optimization module comprises:
Region feature acquisition module: obtains the features of the region image;
Region image optimization module: matches an optimization algorithm according to the obtained region features, and optimizes the region image with the matched algorithm to obtain an optimized region image.
The region feature acquisition module extracts the features of the image (mainly features of the text), and the region image optimization module optimizes the image according to those features, which improves the quality of the region image and in turn the efficiency of recognizing the text in it.
In this embodiment, image optimization comprises grayscale conversion, binarization, denoising, tilt detection and correction, and text feature extraction. The raw image data sent by the client is received. The raw image data is converted to grayscale using the weighted mean method. The grayscale image is binarized so that the image contains only black foreground information and white background information, which improves the efficiency and accuracy of recognition. The image to be recognized is denoised according to the characteristics of the noise, which further improves recognition accuracy. Tilt detection and correction: the edge lines of the paper document's border are detected by the Hough transform to obtain the paper document region, and the tilt angle of the paper document is determined and corrected. Text feature extraction: the raw image data is converted into a density matrix, in which a pixel is recorded as a "0" element when fewer than 2 of its adjacent pixels in the 8 directions (up, down, left, right and the four diagonals) are black. The density matrix is examined row by row, consecutive "0" elements in a row are connected into a "detection line", and whether an area is a text area is judged from the depth of the detection line's starting point, the difference in length and the end position. A comparison database is established; its content should include all character sets to be recognized, with feature groups obtained by the same feature extraction method as for the input text.
It should be noted that this embodiment only briefly introduces the process of OCR parsing of the original image and is not limiting. Users can design and optimize the OCR parsing process according to their actual situation and needs.
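The density matrix and detection lines used in text feature extraction can be sketched as follows. The sketch reads the description literally and rests on assumptions: the input is a binary image with 1 for black and 0 for white, every pixel with fewer than 2 black neighbours becomes a "0" element and every other pixel a 1, and a detection line is reported as `(row, start column, end column)`. The final judgement of whether an area is text is not specified closely enough in the source to be reproduced here.

```python
from typing import List, Tuple

Binary = List[List[int]]  # 1 = black pixel, 0 = white pixel (assumed encoding)


def density_matrix(binary: Binary) -> Binary:
    """Mark a pixel as 0 when fewer than 2 of its 8 neighbours are black."""
    height, width = len(binary), len(binary[0])
    matrix = []
    for y in range(height):
        row = []
        for x in range(width):
            black = sum(
                binary[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            row.append(0 if black < 2 else 1)
        matrix.append(row)
    return matrix


def detection_lines(matrix: Binary) -> List[Tuple[int, int, int]]:
    """Join consecutive 0 elements of each row into (row, start, end) lines."""
    lines = []
    for y, row in enumerate(matrix):
        start = None
        for x, value in enumerate(row + [1]):  # trailing sentinel closes an open run
            if value == 0 and start is None:
                start = x
            elif value != 0 and start is not None:
                lines.append((y, start, x - 1))
                start = None
    return lines
```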
In other embodiments of the present application, the image optimization module further comprises a matching module. The matching module is connected to the region image optimization module, and its function is:
Match OCR parameters according to the region features and select a suitable language and character set.
The matching module screens the language and character set according to the region features, mainly the language type and the font of the characters.
Selecting a suitable language and character set can be achieved as follows: depending on the features, different mathematical distance functions are chosen, for example Euclidean-space comparison, relaxation comparison (Relaxation), dynamic programming comparison (Dynamic Programming, DP), database construction and comparison with neural networks, and HMM (Hidden Markov Model). Because the recognition rate of OCR cannot reach 100%, the correctness and confidence of the comparison are reinforced by a correction function: the recognized text after comparison is used to determine its possible similar candidates, and the most plausible word is chosen according to the recognized text before and after it.
Manual correction means that, after matching and recognition, interaction with the user is required to manually confirm and adjust any information that could not be recognized, yielding the target text data.
In other embodiments of the present application, at least one region is delimited in the region image acquisition module.
An image may contain more than one unclear region, so having the region image acquisition module delimit multiple regions can further improve recognition.
In other embodiments of the present application, the region image acquisition module is connected to a region image quantity acquisition module. The function of the region image quantity acquisition module is:
Obtain the graphic features of the region images;
Determine the number n of region images according to the graphic features.
The graphic feature of a region image obtained by the region image quantity acquisition module may be a regular shape such as a rectangle or a circle, or an irregular hand-drawn shape.
Rectangles, circles and other regular shapes are easy to identify, but since the distorted regions on some images are not necessarily regular, irregular shapes are sometimes also necessary.
In other embodiments of the application, as shown in figure 5, the image optimization module, matching module, character are known Other module is integrated in character and obtains in module, and the image optimization module includes that provincial characteristics acquisition module and area image are excellent Change module, the quantity of the area image is n >=1, and the area image obtains module and passes through string with character acquisition module Line interface connection, the image optimization module, matching module, character module are connected by serial line interface, described based on figure As being automatically recognized as single thread processing;After the completion of image automatic identification input in the area image i, then carry out another The image automatic identification of area image inputs, and the i is 1≤i≤n.
The accuracy of single thread processing is higher, helps to improve the effect of Text region.
In other embodiments of the application, as shown in Fig. 5, the image optimization module, the matching module and the character recognition module are integrated in the character acquisition module, and the image optimization module includes an area feature acquisition module and an area image optimization module. The number of area images is n ≥ 1. The area image acquisition module is connected to the character acquisition module through a parallel interface, and the image optimization module, the matching module and the character recognition module are connected to one another through serial interfaces. The image-based automatic recognition is processed in multiple threads, and the number of threads is n.
Multi-thread processing is more efficient and can complete image recognition and input in a relatively short time.
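A minimal sketch of the multi-thread variant, assuming one worker thread per area image (n threads for n area images), is given below. It uses Python's standard `concurrent.futures.ThreadPoolExecutor`; the `recognize` callable is again a hypothetical placeholder for the per-area optimization, matching and OCR chain.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def recognize_in_parallel(
    area_images: List[str],
    recognize: Callable[[str], str],
) -> Dict[int, str]:
    """Recognize n area images with n worker threads, one per area image."""
    n = len(area_images)
    if n == 0:
        return {}
    with ThreadPoolExecutor(max_workers=n) as pool:
        # map() returns results in input order, so index i still refers
        # to area image i even if the threads finish in a different order.
        texts = list(pool.map(recognize, area_images))
    return {i: text for i, text in enumerate(texts, start=1)}
```

Whether threads actually shorten the total time depends on the recognition engine and runtime; the sketch only shows the structure of the n-thread arrangement.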
The area feature acquisition module is connected to a classification module, whose function is as follows:
The area images are classified according to their area features to obtain m classes of images to be processed, where m ≤ n.
Classifying the area images and processing images of the same class together reduces the difficulty of optimization and character recognition and improves recognition efficiency.
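The classification step can be sketched as grouping area images by their feature, which yields m classes with m ≤ n. In this sketch each area is assumed to carry a single feature label such as "printed" or "handwritten"; these labels and the function name are hypothetical examples, not values given in the application.

```python
from collections import defaultdict
from typing import Dict, List


def classify_areas(area_features: Dict[int, str]) -> Dict[str, List[int]]:
    """Group area image ids by feature label.

    The number of returned classes m can never exceed the number of
    area images n, because each area falls into exactly one class.
    """
    classes: Dict[str, List[int]] = defaultdict(list)
    for area_id, feature in area_features.items():
        classes[feature].append(area_id)
    return dict(classes)
```

For example, three areas labelled "printed", "handwritten" and "printed" give two classes, so the optimization algorithm and OCR settings only need to be chosen twice.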
On the basis of the above embodiments, in other embodiments of the application, as shown in Fig. 6, the number of area images is n, with n ≥ 1, and the area image optimization module, the matching module and the character recognition module are integrated in the character acquisition module. The area image acquisition module is connected to the area image quantity acquisition module, the area image quantity acquisition module is connected to the area feature acquisition module, and the area feature acquisition module is connected to the classification module. The classification module is connected to the character acquisition module through a serial interface, and the area image optimization module, the matching module and the character recognition module are connected through serial interfaces. The image-based automatic recognition is processed in a single thread. That is, after areas are delimited on the acquired image, the image features of each area are obtained and the area images are classified according to those features, giving the classified images to be processed. The number of classes of images to be processed is m, with m ≤ n. One class of images to be processed is selected according to feature k; after an optimization algorithm is selected according to feature k, the area images of that class are optimized to give a class of optimized area images. OCR processing is then performed on the optimized area images of that class to recognize the characters and obtain enterable characters, which are stored in the storage areas corresponding to the respective area images of that class. After the automatic recognition of the images selected by feature k is complete, the images to be processed of the next class l are recognized, and so on until all areas have been recognized.
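The class-by-class flow of this embodiment might look like the following sketch: group the areas by feature, look up one optimization routine per feature, run OCR on the optimized areas, and store each result under its own area id. The `optimizers` table and the `ocr` callable are hypothetical stand-ins supplied by the caller, and strings are used in place of image data so that the sketch stays self-contained.

```python
from typing import Callable, Dict, List, Tuple

# (area id, feature label, raw area content); strings stand in for pixels.
Area = Tuple[int, str, str]


def process_by_class(
    areas: List[Area],
    optimizers: Dict[str, Callable[[str], str]],
    ocr: Callable[[str], str],
) -> Dict[int, str]:
    """Single-thread, class-by-class recognition of area images."""
    # Step 1: classify the area images by feature (m classes, m <= n).
    classes: Dict[str, List[Tuple[int, str]]] = {}
    for area_id, feature, raw in areas:
        classes.setdefault(feature, []).append((area_id, raw))

    # Step 2: handle one class completely before moving to the next.
    storage: Dict[int, str] = {}
    for feature, members in classes.items():
        optimize = optimizers[feature]  # algorithm matched to this feature
        for area_id, raw in members:
            # Optimize, recognize, then store under the area's own slot.
            storage[area_id] = ocr(optimize(raw))
    return storage
```

Because the optimizer is looked up once per class rather than once per area, the sketch reflects the stated benefit of processing same-class images together.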
On the basis of the above embodiments, in other embodiments of the application, as shown in Fig. 6, the number of area images is n, with n ≥ 1, and the area image optimization module, the matching module and the character recognition module are integrated in the character acquisition module. The area image acquisition module is connected to the area image quantity acquisition module, the area image quantity acquisition module is connected to the area feature acquisition module, and the area feature acquisition module is connected to the classification module. The classification module is connected to the character acquisition module through a parallel interface, and the area image optimization module, the matching module and the character recognition module are connected through serial interfaces. The image-based automatic recognition is processed in multiple threads. That is, after areas are delimited on the acquired image, the area features of all areas are obtained, m features in total, with m ≤ n. The area images are classified according to the area features to obtain m classes of images to be processed, and the images to be processed of all classes undergo automatic recognition and input together. For example, for the class of images to be processed selected according to feature k', an optimization algorithm is selected according to feature k', the area images of that class are optimized to give a class of optimized area images, and OCR processing is performed on the optimized area images of that class to recognize the characters and obtain enterable characters; the number of threads in this case is m. The enterable characters are then stored in the storage areas corresponding to the respective area images of that class. Class k' is any one of the classes 1 to m of images to be processed. The automatic recognition of the images to be processed in classes 1 to m is carried out simultaneously; the finishing times may differ because the difficulty of recognizing the text differs between classes.
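For the class-level multi-thread arrangement, one possible sketch assigns one thread to each of the m classes and collects the results as each class finishes. The `recognize_class` callable is a hypothetical placeholder for optimizing and recognizing every area image of one class; classes may finish at different times, and the sketch simply waits for all of them.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def recognize_classes_in_parallel(
    classes: Dict[str, List[str]],
    recognize_class: Callable[[str, List[str]], List[str]],
) -> Dict[str, List[str]]:
    """Run one worker thread per class of area images (m threads)."""
    m = len(classes)
    if m == 0:
        return {}
    with ThreadPoolExecutor(max_workers=m) as pool:
        futures = {
            feature: pool.submit(recognize_class, feature, images)
            for feature, images in classes.items()
        }
        # result() blocks until that class is done, so slower classes
        # only delay their own entry, not the submission of the others.
        return {feature: future.result() for feature, future in futures.items()}
```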
As shown in Fig. 7, a terminal based on image automatic identification includes a processor and a memory, wherein the processor is configured to execute the program of the method based on image automatic identification stored in the memory, so as to implement the above method.
The terminal may take various forms, but must retain the processor and the memory.
A storage medium based on image automatic identification stores one or more programs, and the one or more programs can be executed by one or more processors to implement the above method.
The storage medium may take various forms and can be applied to different scenarios.
The detailed description above is a specific illustration of feasible embodiments of the present invention. These embodiments are not intended to limit the patent scope of the invention; any equivalent implementation or modification that does not depart from the invention shall fall within the patent scope of this case.

Claims (10)

1. A method based on image automatic identification, characterized by comprising the following steps:
Image acquisition step: acquiring an image to be recognized;
Area image acquisition step: delimiting an area on the acquired image and obtaining the image within the area;
Character acquisition step for the area image: recognizing the characters in the area image and converting them into enterable characters.
2. The method based on image automatic identification according to claim 1, characterized in that an image optimization step is further included after the area image acquisition step; the image optimization step includes:
Area feature acquisition step: obtaining the features of the area image;
Area image optimization step: matching an optimization algorithm according to the obtained area features, and optimizing the area image with the matched algorithm to obtain an optimized area image.
3. The method based on image automatic identification according to claim 2, characterized in that a matching step is included after the image optimization step; the matching step is:
Matching OCR parameters according to the area features, and selecting a suitable language and character set.
4. The method based on image automatic identification according to claim 1, characterized in that at least one area is delimited in the area image acquisition step.
5. The method based on image automatic identification according to claim 1, characterized in that an area image quantity acquisition step is included before the area image acquisition step:
Obtaining the graphic features of the area images;
Determining the number n of area images according to the graphic features.
6. The method based on image automatic identification according to claim 5, characterized in that the number of area images is n ≥ 1 and the image-based automatic recognition is processed in a single thread; after the automatic recognition and input of area image i is completed, the automatic recognition and input of another area image is carried out, where 1 ≤ i ≤ n.
7. The method based on image automatic identification according to claim 5, characterized in that the number of area images is n ≥ 1 and the image-based automatic recognition is processed in multiple threads; the number of threads is n.
8. A device based on image automatic identification, characterized by comprising:
Image acquisition module: acquires an image to be recognized;
Area image acquisition module: delimits an area on the acquired image and obtains the image within the area;
Character acquisition module: recognizes the characters in the area image and converts them into enterable characters; the image acquisition module is connected to the area image acquisition module, and the area image acquisition module is connected to the character acquisition module.
9. A terminal based on image automatic identification, characterized in that the terminal includes a processor and a memory, wherein the processor is configured to execute the program of the method based on image automatic identification stored in the memory, so as to implement the method of any one of claims 1 to 7.
10. A storage medium based on image automatic identification, characterized in that the storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to implement the method of any one of claims 1 to 7.
CN201811352690.1A 2018-11-14 2018-11-14 Method, apparatus, terminal and storage medium based on image automatic identification Pending CN109558875A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811352690.1A CN109558875A (en) 2018-11-14 2018-11-14 Method, apparatus, terminal and storage medium based on image automatic identification

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811352690.1A CN109558875A (en) 2018-11-14 2018-11-14 Method, apparatus, terminal and storage medium based on image automatic identification

Publications (1)

Publication Number Publication Date
CN109558875A true CN109558875A (en) 2019-04-02

Family

ID=65866017

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811352690.1A Pending CN109558875A (en) 2018-11-14 2018-11-14 Method, apparatus, terminal and storage medium based on image automatic identification

Country Status (1)

Country Link
CN (1) CN109558875A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110188745A (en) * 2019-05-30 2019-08-30 北京爱尖子教育科技有限责任公司 The online code method and system of the content of courses
CN112712058A (en) * 2021-01-15 2021-04-27 深圳市悦创进科技有限公司 Character recognition and extraction method

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101344925A (en) * 2007-07-10 2009-01-14 富士通株式会社 Character recognition method
CN102306173A (en) * 2011-08-25 2012-01-04 重庆理工大学 Image similarity comparison method
CN105046254A (en) * 2015-07-17 2015-11-11 腾讯科技(深圳)有限公司 Character recognition method and apparatus
CN107203763A (en) * 2016-03-18 2017-09-26 北大方正集团有限公司 Character recognition method and device
CN107832757A (en) * 2017-11-03 2018-03-23 深圳航天信息有限公司 A kind of recognition methods of invoice image
CN108229463A (en) * 2018-02-07 2018-06-29 众安信息技术服务有限公司 Character recognition method based on image
CN108462866A (en) * 2018-03-26 2018-08-28 福州大学 A kind of 3D stereo-picture color calibration methods based on matching and optimization
CN108615058A (en) * 2018-05-10 2018-10-02 苏州大学 A kind of method, apparatus of character recognition, equipment and readable storage medium storing program for executing

Similar Documents

Publication Publication Date Title
CN107609549B (en) Text detection method for certificate image in natural scene
US20190355113A1 (en) Multi-sample Whole Slide Image Processing in Digital Pathology via Multi-resolution Registration and Machine Learning
US9298981B1 (en) Categorizer assisted capture of customer documents using a mobile device
JP4516778B2 (en) Data processing system
US11055584B2 (en) Image processing apparatus, image processing method, and non-transitory computer-readable storage medium that perform class identification of an input image using a discriminator that has undergone learning to perform class identification at different granularities
US11151402B2 (en) Method of character recognition in written document
CN110598566A (en) Image processing method, device, terminal and computer readable storage medium
CN107451582A (en) A kind of graphics context identifying system and its recognition methods
Sidhwa et al. Text extraction from bills and invoices
CN108564079A (en) A kind of portable character recognition device and method
CN104182722A (en) Text detection method and device and text information extraction method and system
CN104598881B (en) Feature based compresses the crooked scene character recognition method with feature selecting
CN109558875A (en) Method, apparatus, terminal and storage medium based on image automatic identification
CN114140465B (en) Self-adaptive learning method and system based on cervical cell slice image
CN115240210A (en) System and method for auxiliary exercise of handwritten Chinese characters
CN113221778B (en) Method and device for detecting and identifying handwritten form
CN114445843A (en) Card image character recognition method and device of fixed format
Karanje et al. Survey on text detection, segmentation and recognition from a natural scene images
CN112749696A (en) Text detection method and device
Jia et al. Grayscale-projection based optimal character segmentation for camera-captured faint text recognition
CN115311666A (en) Image-text recognition method and device, computer equipment and storage medium
JP2017228297A (en) Text detection method and apparatus
CN114973394A (en) Gesture motion recognition method and device, electronic equipment and computer storage medium
Winger et al. Low-complexity character extraction in low-contrast scene images
CN111242047A (en) Image processing method and apparatus, electronic device, and computer-readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination