CN106886776A - An application model for digitizing licenses using image recognition
- Publication number
- CN106886776A (application number CN201710099520.6A)
- Authority
- CN
- China
- Prior art keywords
- image
- license
- word
- information
- ocr
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/148—Segmentation of character regions
- G06V30/153—Segmentation of character regions using recognition of characters or words
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G06T11/60—Editing figures and text; Combining figures or text
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/30—Noise filtering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Character Discrimination (AREA)
Abstract
The present invention provides an application model that uses image recognition to digitize licenses, and belongs to the field of image recognition. The invention is based on image analysis and OCR technology: after the image is binarized and its feature regions are identified, OCR is used to parse and normalize the text on the image; the image features of the license are analyzed; and a Chinese text library is read, the license is recognized from the text in the image, special characters are handled, and the result is normalized into key-value form. Digitizing licenses through image recognition makes it possible to digitize licenses issued in the past and reduces the manual data entry required of users.
Description
Technical field
The present invention relates to image recognition technology, and in particular to an application model for digitizing licenses using image recognition.
Background art
In recent years the state has continuously promoted the informatization of government services and has achieved fairly significant results; handling government business online is now largely possible. Following this trend, the General Office of the State Council recently forwarded the "Implementation Plan for Pilot Programs to Advance 'Internet + Government Services' and Bring the Benefits of Information to the People" (General Office of the State Council document No. [23]), issued by ten departments including the National Development and Reform Commission, the Ministry of Finance, the Ministry of Education, the Ministry of Public Security, the Ministry of Civil Affairs, the Ministry of Human Resources and Social Security, the Ministry of Housing and Urban-Rural Development, the National Health and Family Planning Commission, the Legislative Affairs Office of the State Council, and the Standardization Administration. The plan emphasizes accelerating "Internet + government services", deepening the information-for-the-people project, using modern information technologies such as big data, strengthening coordination between departments, breaking down information silos, promoting the interconnection of government services across departments, and building a government service system that is convenient, fair and inclusive, and of high quality and efficiency.
Online handling still requires scanned or photocopied licenses, which greatly restricts it. There is an urgent need to rely on new means, new thinking, and new approaches to build a modern online service system. An electronic license repository can remove this bottleneck and make genuinely online handling possible.
When electronic licenses are rolled out, the large stock of licenses issued before digitization was implemented must have electronic models created for them and be entered into the license system. Collecting these existing licenses into the electronic license system one by one through manual entry involves a huge workload and hinders the digitization effort. To promote electronic licenses faster and better, an effective method is urgently needed to assist digitization, reduce the data-entry workload, and solve the problem of digitizing the existing stock of licenses.
Summary of the invention
To solve the above technical problem, the present invention proposes an application model for digitizing licenses using image recognition. Digitizing licenses through image recognition makes it possible to digitize licenses issued in the past and reduces the manual data entry required of users.
The invention is based on image analysis and OCR technology. After the image is binarized and its feature regions are identified, OCR is used to parse and normalize the text on the image;
the image features of the license are analyzed;
a Chinese text library is read, the license is recognized from the text in the image, special characters are handled, and the result is normalized into key-value form.
The model mainly comprises three steps: image preprocessing, text extraction, and information mapping.
1) Preprocess the image with OpenCV to remove noise and extract the text regions, improving recognition efficiency;
2) Perform text recognition with Tesseract-OCR and normalize the recognition result;
3) Read the configuration information of the license template and match it against the recognition result, so that the information in the license image is mapped into the electronic model of the license, and store the resulting electronic license.
The image preprocessing stage is implemented with the open-source project OpenCV. Its purpose is to remove noise from the image and improve recognition efficiency. First, the license image uploaded by the user is converted to grayscale, and a suitable threshold in [0, 255] is selected to binarize it. To counter interference from the image edges, connected-region feature detection is performed with the MSER algorithm to find the stable regions of the image. Finally, small image regions are screened and linked to generate the image to be recognized.
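The grayscale and binarization steps can be illustrated with a minimal sketch. It uses plain Python lists rather than OpenCV so that it stays self-contained; in an OpenCV implementation these steps would typically correspond to `cv2.cvtColor` and `cv2.threshold`. The function names `to_gray` and `binarize`, the luma weights, and the threshold value of 128 are assumptions for illustration, since the patent does not say how the threshold is chosen.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel in 0..255


def to_gray(image: List[List[Pixel]]) -> List[List[int]]:
    """Convert an RGB image to 8-bit gray with the common BT.601 luma weights."""
    return [
        [int(round(0.299 * r + 0.587 * g + 0.114 * b)) for (r, g, b) in row]
        for row in image
    ]


def binarize(gray: List[List[int]], threshold: int) -> List[List[int]]:
    """Map every gray value above the threshold to 255 and the rest to 0."""
    if not 0 <= threshold <= 255:
        raise ValueError("threshold must lie in [0, 255]")
    return [[255 if value > threshold else 0 for value in row] for row in gray]


# Illustrative 2x2 image: bright paper pixels and dark ink pixels.
sample = [
    [(255, 255, 255), (0, 0, 0)],
    [(200, 200, 200), (30, 30, 30)],
]
gray = to_gray(sample)
binary = binarize(gray, threshold=128)  # 128 is an assumed threshold
```

A fixed threshold is the simplest choice; an adaptive method could be substituted without changing the rest of the pipeline.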
In the text extraction stage, the preprocessed image is fed into the open-source text recognition software Tesseract-OCR. Tesseract-OCR performs layout analysis on the image, distinguishing its tables, pictures, and text. It then segments the text using the spacing between characters and parses the characters using information such as the Chinese character library to obtain the recognized text, producing a preliminary recognition result. Once the preliminary result is obtained, special characters such as spaces are removed and the result is normalized so that it is organized in key-value form, and the processed recognition result is returned.
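The normalization step can be sketched as follows. This is a minimal illustration rather than the patent's implementation: it assumes that each recognized line has the form "label: value" with an ASCII or full-width colon, which the patent does not state, and the function name `normalize_ocr_text` and the sample text are hypothetical.

```python
import re
from typing import Dict


def normalize_ocr_text(raw: str) -> Dict[str, str]:
    """Strip whitespace from OCR output and organize it as key-value pairs.

    Assumption: label and value share one line and are separated by a colon.
    """
    result: Dict[str, str] = {}
    for line in raw.splitlines():
        # Remove all whitespace, including the full-width space U+3000.
        cleaned = re.sub(r"\s+", "", line)
        # Treat the full-width colon the same as the ASCII colon.
        cleaned = cleaned.replace("\uff1a", ":")
        if ":" not in cleaned:
            continue  # no label on this line, skip it
        key, value = cleaned.split(":", 1)
        if key and value:
            result[key] = value
    return result


# Hypothetical preliminary OCR output with stray spaces and a noise line.
raw_text = "名 称 \uff1a 某某公司\n注册号: 1234 5678\n|||\n"
pairs = normalize_ocr_text(raw_text)
```

Lines without a separator are dropped here; a real system might instead keep them for the manual correction step.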
In the information mapping stage, the configuration information of the license template is obtained to determine which items of information the license carries and how each item is labelled on the paper version. These labels are matched against the text recognition result so that the recognized text is mapped into the license model. After matching, the generated electronic license information is shown to the user in the software interface, and the user manually corrects the electronic license when necessary.
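The mapping stage can be sketched as a lookup from template labels to recognized values. The template format, the field names, and the helper names `map_to_model` and `missing_fields` below are all assumptions for illustration; the patent does not define the structure of the template configuration.

```python
from typing import Dict, List, Optional

# Hypothetical template configuration: electronic-model field -> label
# printed on the paper license.
BUSINESS_LICENSE_TEMPLATE: Dict[str, str] = {
    "name": "名称",
    "registration_no": "注册号",
    "legal_representative": "法定代表人",
}


def map_to_model(
    template: Dict[str, str], recognized: Dict[str, str]
) -> Dict[str, Optional[str]]:
    """Fill each model field with the value recognized under its paper label."""
    return {field: recognized.get(label) for field, label in template.items()}


def missing_fields(model: Dict[str, Optional[str]]) -> List[str]:
    """List the fields the user still has to fill in or correct by hand."""
    return [field for field, value in model.items() if value is None]


recognized = {"名称": "某某公司", "注册号": "12345678"}
model = map_to_model(BUSINESS_LICENSE_TEMPLATE, recognized)
todo = missing_fields(model)
```

Fields left as `None` are the natural candidates to highlight when the result is shown to the user for manual correction.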
The beneficial effects of the invention are as follows.
When electronic licenses are promoted, paper licenses that have already been issued must be digitized. Having users manually enter the information printed on the face of each license one by one is undoubtedly tedious work. Reading and mapping the license content automatically through image recognition digitizes the license, greatly reduces the user's workload, and improves efficiency.
Brief description of the drawings
Fig. 1 is a schematic diagram of the workflow of the invention;
Fig. 2 is a schematic diagram of the operating steps of the invention.
Detailed description of the embodiments
The invention is described in more detail below with reference to the drawings.
As shown in Fig. 1, the workflow of the invention is:
1) The user uploads a photo or scanned copy of the paper license;
2) The background program preprocesses the image with OpenCV, converting it to grayscale and identifying the text regions;
3) The program recognizes the text on the processed image with Tesseract-OCR and formats the recognition result in key-value form;
4) The program reads the configuration information of the license template, maps the text recognition result onto the electronic model of the license, and returns the mapping result to the user, who verifies and confirms it before it is saved.
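The four workflow steps can be read as a pipeline of interchangeable stages. The sketch below only shows that composition: the function `digitize_license` and its callback parameters are hypothetical names, and the stages passed in at the bottom are trivial stand-ins rather than real OpenCV or Tesseract-OCR calls.

```python
from typing import Any, Callable, Dict


def digitize_license(
    image: Any,
    preprocess: Callable[[Any], Any],
    recognize: Callable[[Any], Dict[str, str]],
    map_fields: Callable[[Dict[str, str]], Dict[str, Any]],
    review: Callable[[Dict[str, Any]], Dict[str, Any]],
    save: Callable[[Dict[str, Any]], None],
) -> Dict[str, Any]:
    """Run the upload -> preprocess -> recognize -> map -> review -> save flow."""
    prepared = preprocess(image)        # step 2: grayscale, text regions
    recognized = recognize(prepared)    # step 3: OCR result as key-value pairs
    draft = map_fields(recognized)      # step 4: map onto the electronic model
    confirmed = review(draft)           # the user verifies and may correct
    save(confirmed)                     # stored only after confirmation
    return confirmed


# Stand-in stages, used only to show how the pieces fit together.
stored = []
result = digitize_license(
    "img",
    preprocess=lambda img: img.upper(),
    recognize=lambda img: {"名称": img},
    map_fields=lambda rec: {"name": rec["名称"]},
    review=lambda draft: {**draft, "checked": True},
    save=stored.append,
)
```

Passing the stages in as callables keeps the user review step explicit, so nothing is saved before confirmation.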
The operating steps are shown in Fig. 2.
First, the license image uploaded by the user is converted to grayscale, and a suitable threshold in [0, 255] is selected to binarize it. To counter interference from the image edges, connected-region feature detection is performed with the MSER algorithm to find the stable regions of the image. Finally, small image regions are screened and linked to generate the image to be recognized.
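The patent does not spell out how small regions are screened and linked. One plausible reading, sketched below under that assumption, is to drop bounding boxes below a minimum area and then merge boxes on the same text line that sit close together. The box format `(x, y, width, height)`, the names `filter_small` and `link_boxes`, and the numeric limits are all illustrative.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


def filter_small(boxes: List[Box], min_area: int) -> List[Box]:
    """Discard regions whose area is below min_area (likely noise)."""
    return [b for b in boxes if b[2] * b[3] >= min_area]


def link_boxes(boxes: List[Box], max_gap: int) -> List[Box]:
    """Greedily merge boxes that overlap vertically and lie within max_gap
    pixels of the previously merged box in the horizontal direction."""
    merged: List[Box] = []
    for x, y, w, h in sorted(boxes):
        if merged:
            mx, my, mw, mh = merged[-1]
            same_line = y < my + mh and my < y + h
            close = x - (mx + mw) <= max_gap
            if same_line and close:
                left, top = min(mx, x), min(my, y)
                right = max(mx + mw, x + w)
                bottom = max(my + mh, y + h)
                merged[-1] = (left, top, right - left, bottom - top)
                continue
        merged.append((x, y, w, h))
    return merged


# Hypothetical detected regions: two adjacent characters, one distant
# character, and one speck of noise.
regions = [(0, 0, 10, 10), (12, 1, 10, 10), (100, 0, 10, 10), (50, 50, 2, 2)]
kept = filter_small(regions, min_area=20)
linked = link_boxes(kept, max_gap=5)
```

Because each box is compared only with the most recently merged one, this single pass suits roughly horizontal text; multi-column layouts would need a more careful grouping.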
In the text extraction stage, the preprocessed image is fed into the open-source text recognition software Tesseract-OCR. Tesseract-OCR performs layout analysis on the image, distinguishing its tables, pictures, and text. It then segments the text using the spacing between characters and parses the characters using information such as the Chinese character library to obtain the recognized text, producing a preliminary recognition result. Once the preliminary result is obtained, special characters such as spaces are removed and the result is normalized so that it is organized in key-value form, and the processed recognition result is returned.
In the information mapping stage, the configuration information of the license template is obtained to determine which items of information the license carries and how each item is labelled on the paper version. These labels are matched against the text recognition result so that the recognized text is mapped into the license model. After matching, the generated electronic license information is shown to the user in the software interface, and the user manually corrects the electronic license when necessary.
Claims (5)
1. An application model for digitizing licenses using image recognition, characterized in that:
based on image analysis and OCR technology, after the image is binarized and its feature regions are identified, OCR is used to parse and normalize the text on the image;
the image features of the license are analyzed;
a Chinese text library is read, the license is recognized from the text in the image, special characters are handled, and the result is normalized into key-value form.
2. The application model according to claim 1, characterized in that
it mainly comprises three steps: image preprocessing, text extraction, and information mapping:
1) preprocessing the image with OpenCV to remove noise and extract the text regions, improving recognition efficiency;
2) performing text recognition with Tesseract-OCR and normalizing the recognition result;
3) reading the configuration information of the license template and matching it against the recognition result, so that the information in the license image is mapped into the electronic model of the license, and storing the resulting electronic license.
3. The application model according to claim 2, characterized in that
the image preprocessing stage is implemented with the open-source project OpenCV: first, the license image uploaded by the user is converted to grayscale, and a suitable threshold in [0, 255] is selected to binarize it; to counter interference from the image edges, connected-region feature detection is performed with the MSER algorithm to find the stable regions of the image; finally, small image regions are screened and linked to generate the image to be recognized.
4. The application model according to claim 3, characterized in that
in the text extraction stage, the preprocessed image is fed into the open-source text recognition software Tesseract-OCR; Tesseract-OCR performs layout analysis on the image, distinguishing its tables, pictures, and text, then segments the text using the spacing between characters and parses the characters using information such as the Chinese character library to obtain the recognized text, producing a preliminary recognition result; once the preliminary result is obtained, special characters such as spaces are removed and the result is normalized so that it is organized in key-value form, and the processed recognition result is returned.
5. The application model according to claim 4, characterized in that
in the information mapping stage, the configuration information of the license template is obtained to determine which items of information the license carries and how each item is labelled on the paper version; these labels are matched against the text recognition result so that the recognized text is mapped into the license model; after matching, the generated electronic license information is shown to the user in the software interface, and the user manually corrects the electronic license when necessary.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710099520.6A CN106886776A (en) | 2017-02-23 | 2017-02-23 | An application model for digitizing licenses using image recognition |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710099520.6A CN106886776A (en) | 2017-02-23 | 2017-02-23 | An application model for digitizing licenses using image recognition |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106886776A (en) | 2017-06-23 |
Family
ID=59180208
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710099520.6A Pending CN106886776A (en) | An application model for digitizing licenses using image recognition | 2017-02-23 | 2017-02-23 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106886776A (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104933429A (en) * | 2015-06-01 | 2015-09-23 | 深圳市诺比邻科技有限公司 | Method and device for extracting information from image |
CN105046253A (en) * | 2015-06-24 | 2015-11-11 | 山西同方知网数字出版技术有限公司 | Paper front page automatic recognition system and method based on OCR (Optical Character Recognition) |
CN105320952A (en) * | 2015-10-15 | 2016-02-10 | 广东广信通信服务有限公司 | OCR based identification method for driving license information |
CN105528604A (en) * | 2016-01-31 | 2016-04-27 | 华南理工大学 | Bill automatic identification and processing system based on OCR |
CN106127659A (en) * | 2016-08-26 | 2016-11-16 | 南威软件股份有限公司 | A kind of community grid management system |
CN106326888A (en) * | 2016-08-16 | 2017-01-11 | 北京旷视科技有限公司 | Image recognition method and device |
CN106446898A (en) * | 2016-09-14 | 2017-02-22 | 宇龙计算机通信科技(深圳)有限公司 | Extraction method and extraction device of character information in image |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108304843A (en) * | 2017-12-25 | 2018-07-20 | 山东浪潮云服务信息科技有限公司 | A kind of image measures and procedures for the examination and approval and examination & approval device |
CN108256530A (en) * | 2017-12-29 | 2018-07-06 | 北京城市网邻信息技术有限公司 | Image-recognizing method, device and equipment |
CN108256530B (en) * | 2017-12-29 | 2021-12-07 | 北京城市网邻信息技术有限公司 | Image recognition method, device and equipment |
CN108171239A (en) * | 2018-02-02 | 2018-06-15 | 杭州清本科技有限公司 | The extracting method of certificate pictograph, apparatus and system, computer storage media |
CN112686237A (en) * | 2020-12-21 | 2021-04-20 | 福建新大陆软件工程有限公司 | Certificate OCR recognition method |
CN113642557A (en) * | 2021-08-10 | 2021-11-12 | 中国民用航空局信息中心 | System and method for supplementing historical data in airworthiness field |
CN115035520A (en) * | 2021-11-22 | 2022-09-09 | 荣耀终端有限公司 | Character recognition method for image, electronic device and storage medium |
CN115035520B (en) * | 2021-11-22 | 2023-04-18 | 荣耀终端有限公司 | Character recognition method for image, electronic device and storage medium |
CN115116060A (en) * | 2022-08-25 | 2022-09-27 | 深圳前海环融联易信息科技服务有限公司 | Key value file processing method, device, equipment, medium and computer program product |
CN115116060B (en) * | 2022-08-25 | 2023-01-24 | 深圳前海环融联易信息科技服务有限公司 | Key value file processing method, device, equipment and medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20170623 |