CN106886776A - An application model for license digitization using image recognition - Google Patents

An application model for license digitization using image recognition

Info

Publication number
CN106886776A
Authority
CN
China
Prior art keywords
image
license
word
information
ocr
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710099520.6A
Other languages
Chinese (zh)
Inventor
宁方刚
王冠军
陈兆亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Inspur Cloud Service Information Technology Co Ltd
Original Assignee
Shandong Inspur Cloud Service Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Inspur Cloud Service Information Technology Co Ltd
Priority to CN201710099520.6A
Publication of CN106886776A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/148 Segmentation of character regions
    • G06V30/153 Segmentation of character regions using recognition of characters or words
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T11/00 2D [Two Dimensional] image generation
    • G06T11/60 Editing figures and text; Combining figures or text
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/30 Noise filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Character Discrimination (AREA)

Abstract

The present invention provides an application model for license digitization using image recognition and belongs to the field of image recognition. The invention is based on image analysis and OCR technology: after the image is binarized and its feature regions are identified, the text in the image is parsed and normalized using OCR; the image features of the license are analyzed; a Chinese character library is read, the license is recognized from the text in the image, special characters are handled, and the result is normalized into key-value form. Digitizing licenses through image recognition makes it possible to digitize licenses that were issued in the past and reduces the manual data entry required of users.

Description

An application model for license digitization using image recognition
Technical field
The present invention relates to image recognition technology, and in particular to an application model that uses image recognition to digitize licenses.
Background
In recent years the state has steadily advanced the informatization of government services and achieved notable results, so that handling affairs online is now largely possible. In line with this trend, the General Office of the State Council recently forwarded the "Implementation Plan for Promoting 'Internet + Government Services' and Carrying Out Pilot Programs on Information Benefiting the People" (General Office of the State Council document No. 23), prepared by ten departments including the National Development and Reform Commission, the Ministry of Finance, the Ministry of Education, the Ministry of Public Security, the Ministry of Civil Affairs, the Ministry of Human Resources and Social Security, the Ministry of Housing and Urban-Rural Development, the National Health and Family Planning Commission, the Legislative Affairs Office of the State Council, and the Standardization Administration. The plan emphasizes accelerating "Internet + government services", deepening the information-benefiting-the-people project, applying modern information technologies such as big data, strengthening coordination between departments, breaking down information silos, linking the government services of different departments, and building a government service system that is convenient, fair, inclusive, high-quality and efficient.
Handling affairs online still requires scans or photocopies of licenses, which severely constrains it. New means, new thinking and new approaches are urgently needed to build a modern online service system. An electronic license repository can remove this bottleneck and make fully online handling possible.
During the rollout of electronic licenses, the large stock of licenses issued before digitization was introduced must have electronic models created for them and be entered into the license system. If these existing licenses were entered into the electronic license system one by one by hand, the workload would be enormous and would hinder the digitization effort. To promote electronic licenses faster and better, an effective method is urgently needed to assist with license digitization, reduce the data-entry workload, and solve the problem of digitizing the existing stock of licenses.
Summary of the invention
To solve the above technical problem, the present invention proposes an application model for license digitization using image recognition. Digitizing licenses through image recognition makes it possible to digitize licenses that were issued in the past and reduces the manual data entry required of users.
The invention is based on image analysis and OCR technology. After the image is binarized and its feature regions are identified, the text in the image is parsed and normalized using OCR;
the image features of the license are analyzed;
a Chinese character library is read, the license is recognized from the text in the image, special characters are handled, and the result is normalized into key-value form.
The model mainly comprises three steps: image preprocessing, text extraction, and information mapping.
1) The image is preprocessed with OpenCV to remove noise and extract text regions, which improves recognition efficiency;
2) Text recognition is performed with Tesseract-OCR, and the recognition result is normalized;
3) The configuration information of the license template is read and matched against the recognition result, so that the information in the license image is mapped into the electronic model of the license, and the digitized result is stored.
The image preprocessing stage is implemented with the open source project OpenCV. Its purpose is to remove noise from the image and improve recognition efficiency. First, the license image uploaded by the user is converted to grayscale, and a suitable threshold in [0, 255] is selected to binarize the image. To deal with interference at the image edges, connected-region feature detection is performed with the MSER algorithm to find the stable regions of the image. Finally, small image regions are filtered and linked to generate the image to be recognized.
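The grayscale and binarization step can be illustrated with a small standard-library sketch. The patent only says that a suitable threshold in [0, 255] is selected; using Otsu's method to choose it is an assumption made here for illustration, and the function names `to_gray`, `otsu_threshold` and `binarize` are hypothetical. A real implementation would presumably rely on OpenCV rather than nested Python lists.

```python
from typing import List, Sequence, Tuple

Gray = List[List[int]]  # rows of 0-255 grey levels


def to_gray(rgb_rows: Sequence[Sequence[Tuple[int, int, int]]]) -> Gray:
    """Convert rows of (r, g, b) tuples to grey levels using BT.601 luma weights."""
    return [[(299 * r + 587 * g + 114 * b) // 1000 for r, g, b in row]
            for row in rgb_rows]


def otsu_threshold(gray: Gray) -> int:
    """Return the threshold in [0, 255] that maximises between-class variance."""
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # intensity sum of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        variance = w0 * w1 * (mean0 - mean1) ** 2
        if variance > best_var:
            best_t, best_var = t, variance
    return best_t


def binarize(gray: Gray, threshold: int) -> Gray:
    """Pixels above the threshold become 255 (paper), the rest 0 (ink)."""
    return [[255 if value > threshold else 0 for value in row] for row in gray]
```

Applied to a scan, `binarize(gray, otsu_threshold(gray))` yields a two-level image in which dark strokes are 0 and the paper background is 255.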
In the text extraction stage, the preprocessed image is passed to the open source text recognition software Tesseract-OCR. Tesseract-OCR performs layout analysis on the image, distinguishes the tables, pictures and text in it, then segments the text using the spacing between characters and parses the characters with the help of information such as the Chinese character library to obtain the recognized text, which forms the preliminary recognition result. Once the preliminary result is available, special characters such as spaces are removed from it and the result is normalized so that it is organized in key-value form, and the processed recognition result is returned.
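The normalization of the preliminary OCR output into key-value form could look roughly like the sketch below. It assumes, purely for illustration, that each recognized line has the shape "label: value" with an ASCII or full-width colon; the patent does not specify the line format, and the names `normalize_line` and `to_key_value` are hypothetical.

```python
import re
from typing import Dict, Iterable

# Assumed separators between a label and its value: ASCII and full-width colon.
_SEPARATOR = re.compile(r"[:：]")


def normalize_line(line: str) -> str:
    """Remove every whitespace character, including the ideographic space."""
    return re.sub(r"\s+", "", line)


def to_key_value(lines: Iterable[str]) -> Dict[str, str]:
    """Turn raw OCR lines into a {label: value} dictionary.

    Lines without a separator, or with an empty label or value, are skipped.
    """
    result: Dict[str, str] = {}
    for raw in lines:
        text = normalize_line(raw)
        parts = _SEPARATOR.split(text, maxsplit=1)
        if len(parts) == 2 and parts[0] and parts[1]:
            result[parts[0]] = parts[1]
    return result
```

Stripping all whitespace suits Chinese text, where OCR engines often insert spurious spaces between characters; for values written in Latin script it would also merge separate words, so a production version would likely treat the two cases differently.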
In the information mapping stage, the configuration information of the license template is obtained to determine which items of information the license carries and how each item is labeled on the paper version. These labels are matched against the text recognition result so that the recognized text is mapped into the model of the license. After matching, the generated electronic license information is shown to the user in the software interface, and the user corrects the electronic license manually where necessary.
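A minimal sketch of the mapping step follows, under the assumption that the template configuration is a dictionary from model field names to the labels printed on the paper license. The `TEMPLATE` contents, the field names and the function `map_to_model` are all hypothetical; the patent does not describe the configuration format.

```python
from typing import Dict, List, Tuple

# Hypothetical template configuration: model field -> labels used on the paper version.
TEMPLATE: Dict[str, List[str]] = {
    "holder_name": ["名称", "Name"],
    "license_no": ["证号", "Number"],
    "issue_date": ["发证日期", "IssueDate"],
}


def map_to_model(
    recognized: Dict[str, str],
    template: Dict[str, List[str]],
) -> Tuple[Dict[str, str], List[str]]:
    """Match recognized labels against the template.

    Returns the filled model and the list of fields that could not be matched,
    which a user interface could highlight for manual correction.
    """
    model: Dict[str, str] = {}
    missing: List[str] = []
    for field, labels in template.items():
        value = next((recognized[lab] for lab in labels if lab in recognized), None)
        if value is None:
            missing.append(field)
        else:
            model[field] = value
    return model, missing
```

Returning the unmatched fields separately is one way to support the manual correction the text mentions: the interface can prefill what was recognized and ask the user only for the rest.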
Beneficial effects of the invention
When electronic licenses are promoted, paper licenses that have already been issued must be digitized. Having users enter the information printed on each license by hand, one license at a time, is tedious work. Reading and mapping the license content automatically through image recognition achieves the digitization of the license, greatly reduces the user's workload, and improves efficiency.
Brief description of the drawings
Fig. 1 is a schematic diagram of the workflow of the invention;
Fig. 2 is a schematic diagram of the operating steps of the invention.
Detailed description
The invention is described in more detail below with reference to the drawings:
As shown in Fig. 1, the workflow of the invention is:
1) The user uploads a photo or scanned copy of the paper license;
2) The background program preprocesses the image with OpenCV, converting it to grayscale and identifying the text regions;
3) The program recognizes the text in the processed image with Tesseract-OCR and formats the recognition result in key-value form;
4) The program reads the configuration information of the license template, maps the text recognition result onto the electronic model of the license, and returns the mapping result to the user, who checks it and confirms that it should be saved.
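The four workflow steps can be read as a pipeline of interchangeable stages. The skeleton below is only a sketch of that structure: the function `digitize_license` and its parameters are hypothetical, and each stage is passed in as a callable so that the actual OpenCV, Tesseract-OCR and user-interface code can be supplied separately.

```python
from typing import Any, Callable, Dict

KeyValues = Dict[str, str]


def digitize_license(
    image: Any,
    preprocess: Callable[[Any], Any],
    recognize: Callable[[Any], KeyValues],
    map_fields: Callable[[KeyValues], KeyValues],
    confirm: Callable[[KeyValues], KeyValues],
) -> KeyValues:
    """Run the workflow: preprocess, recognize, map, then user confirmation."""
    cleaned = preprocess(image)      # step 2: grayscale, text-region detection
    key_values = recognize(cleaned)  # step 3: OCR output in key-value form
    draft = map_fields(key_values)   # step 4: map onto the license model
    return confirm(draft)            # step 4: user checks and confirms
```

Keeping the stages as parameters also makes the flow easy to exercise with stand-in functions before any image library is wired in.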
The operating steps are shown in Fig. 2:
First, the license image uploaded by the user is converted to grayscale, and a suitable threshold in [0, 255] is selected to binarize the image. To deal with interference at the image edges, connected-region feature detection is performed with the MSER algorithm to find the stable regions of the image. Finally, small image regions are filtered and linked to generate the image to be recognized.
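To make the idea of finding connected regions and filtering out small ones concrete, here is a standard-library sketch. It is deliberately not MSER: it uses plain 4-connected component labeling on an already binarized image as a simpler stand-in, and the names `connected_regions` and `drop_small` as well as the `min_area` parameter are assumptions for illustration.

```python
from collections import deque
from typing import List, Tuple

Region = Tuple[int, int, int, int, int]  # top, left, bottom, right, area


def connected_regions(binary: List[List[int]], ink: int = 0) -> List[Region]:
    """Return bounding boxes of 4-connected ink regions in scan order."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    regions: List[Region] = []
    for y in range(height):
        for x in range(width):
            if binary[y][x] != ink or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right, area = y, x, y, x, 0
            while queue:  # breadth-first flood fill of one region
                cy, cx = queue.popleft()
                area += 1
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx] and binary[ny][nx] == ink):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append((top, left, bottom, right, area))
    return regions


def drop_small(regions: List[Region], min_area: int) -> List[Region]:
    """Discard regions with fewer than min_area pixels (likely noise specks)."""
    return [region for region in regions if region[4] >= min_area]
```

The surviving bounding boxes could then be merged with their neighbors to form text lines, which corresponds to the linking step described above.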
In the text extraction stage, the preprocessed image is passed to the open source text recognition software Tesseract-OCR. Tesseract-OCR performs layout analysis on the image, distinguishes the tables, pictures and text in it, then segments the text using the spacing between characters and parses the characters with the help of information such as the Chinese character library to obtain the recognized text, which forms the preliminary recognition result. Once the preliminary result is available, special characters such as spaces are removed from it and the result is normalized so that it is organized in key-value form, and the processed recognition result is returned.
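Segmentation by inter-character spacing can be illustrated with a vertical projection: columns that contain no ink are treated as gaps, and runs of inked columns become character candidates. This is a simplified sketch of the general idea and not Tesseract-OCR's actual algorithm; `split_on_gaps` and its `min_gap` parameter are hypothetical.

```python
from typing import List, Tuple


def split_on_gaps(
    binary: List[List[int]],
    ink: int = 0,
    min_gap: int = 1,
) -> List[Tuple[int, int]]:
    """Split a binarized text line into column spans (start, end), end exclusive.

    A span ends once at least min_gap consecutive blank columns follow it.
    """
    if not binary:
        return []
    width = len(binary[0])
    # Vertical projection: does column x contain any ink pixel?
    has_ink = [any(row[x] == ink for row in binary) for x in range(width)]

    spans: List[Tuple[int, int]] = []
    start = None  # first column of the span being built
    gap = 0       # blank columns seen since the last inked column
    for x, filled in enumerate(has_ink):
        if filled:
            if start is None:
                start = x
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                spans.append((start, x - gap + 1))
                start, gap = None, 0
    if start is not None:
        spans.append((start, width - gap))
    return spans
```

Raising `min_gap` merges strokes that are separated by only a narrow blank column, which matters for Chinese characters whose left and right components are often not connected.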
In the information mapping stage, the configuration information of the license template is obtained to determine which items of information the license carries and how each item is labeled on the paper version. These labels are matched against the text recognition result so that the recognized text is mapped into the model of the license. After matching, the generated electronic license information is shown to the user in the software interface, and the user corrects the electronic license manually where necessary.

Claims (5)

1. An application model for license digitization using image recognition, characterized in that:
based on image analysis and OCR technology, after the image is binarized and its feature regions are identified, the text in the image is parsed and normalized using OCR;
the image features of the license are analyzed;
a Chinese character library is read, the license is recognized from the text in the image, special characters are handled, and the result is normalized into key-value form.
2. The application model according to claim 1, characterized in that:
it mainly comprises three steps: image preprocessing, text extraction, and information mapping,
1) the image is preprocessed with OpenCV to remove noise and extract text regions, which improves recognition efficiency;
2) text recognition is performed with Tesseract-OCR, and the recognition result is normalized;
3) the configuration information of the license template is read and matched against the recognition result, so that the information in the license image is mapped into the electronic model of the license, and the digitized result is stored.
3. The application model according to claim 2, characterized in that:
the image preprocessing stage is implemented with the open source project OpenCV; first, the license image uploaded by the user is converted to grayscale, and a suitable threshold in [0, 255] is selected to binarize the image; to deal with interference at the image edges, connected-region feature detection is performed with the MSER algorithm to find the stable regions of the image; finally, small image regions are filtered and linked to generate the image to be recognized.
4. The application model according to claim 3, characterized in that:
in the text extraction stage, the preprocessed image is passed to the open source text recognition software Tesseract-OCR; Tesseract-OCR performs layout analysis on the image, distinguishes the tables, pictures and text in it, then segments the text using the spacing between characters and parses the characters with the help of information such as the Chinese character library to obtain the recognized text, which forms the preliminary recognition result; once the preliminary result is available, special characters such as spaces are removed from it and the result is normalized so that it is organized in key-value form, and the processed recognition result is returned.
5. The application model according to claim 4, characterized in that:
in the information mapping stage, the configuration information of the license template is obtained to determine which items of information the license carries and how each item is labeled on the paper version; these labels are matched against the text recognition result so that the recognized text is mapped into the model of the license; after matching, the generated electronic license information is shown to the user in the software interface, and the user corrects the electronic license manually where necessary.
CN201710099520.6A 2017-02-23 2017-02-23 An application model for license digitization using image recognition Pending CN106886776A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710099520.6A CN106886776A (en) 2017-02-23 2017-02-23 An application model for license digitization using image recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710099520.6A CN106886776A (en) 2017-02-23 2017-02-23 An application model for license digitization using image recognition

Publications (1)

Publication Number Publication Date
CN106886776A true CN106886776A (en) 2017-06-23

Family

ID=59180208

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710099520.6A Pending CN106886776A (en) 2017-02-23 2017-02-23 An application model for license digitization using image recognition

Country Status (1)

Country Link
CN (1) CN106886776A (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108171239A (en) * 2018-02-02 2018-06-15 杭州清本科技有限公司 The extracting method of certificate pictograph, apparatus and system, computer storage media
CN108256530A (en) * 2017-12-29 2018-07-06 北京城市网邻信息技术有限公司 Image-recognizing method, device and equipment
CN108304843A (en) * 2017-12-25 2018-07-20 山东浪潮云服务信息科技有限公司 A kind of image measures and procedures for the examination and approval and examination & approval device
CN112686237A (en) * 2020-12-21 2021-04-20 福建新大陆软件工程有限公司 Certificate OCR recognition method
CN113642557A (en) * 2021-08-10 2021-11-12 中国民用航空局信息中心 System and method for supplementing historical data in airworthiness field
CN115035520A (en) * 2021-11-22 2022-09-09 荣耀终端有限公司 Character recognition method for image, electronic device and storage medium
CN115116060A (en) * 2022-08-25 2022-09-27 深圳前海环融联易信息科技服务有限公司 Key value file processing method, device, equipment, medium and computer program product

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104933429A (en) * 2015-06-01 2015-09-23 深圳市诺比邻科技有限公司 Method and device for extracting information from image
CN105046253A (en) * 2015-06-24 2015-11-11 山西同方知网数字出版技术有限公司 Paper front page automatic recognition system and method based on OCR (Optical Character Recognition)
CN105320952A (en) * 2015-10-15 2016-02-10 广东广信通信服务有限公司 OCR based identification method for driving license information
CN105528604A (en) * 2016-01-31 2016-04-27 华南理工大学 Bill automatic identification and processing system based on OCR
CN106127659A (en) * 2016-08-26 2016-11-16 南威软件股份有限公司 A kind of community grid management system
CN106326888A (en) * 2016-08-16 2017-01-11 北京旷视科技有限公司 Image recognition method and device
CN106446898A (en) * 2016-09-14 2017-02-22 宇龙计算机通信科技(深圳)有限公司 Extraction method and extraction device of character information in image

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108304843A (en) * 2017-12-25 2018-07-20 山东浪潮云服务信息科技有限公司 A kind of image measures and procedures for the examination and approval and examination & approval device
CN108256530A (en) * 2017-12-29 2018-07-06 北京城市网邻信息技术有限公司 Image-recognizing method, device and equipment
CN108256530B (en) * 2017-12-29 2021-12-07 北京城市网邻信息技术有限公司 Image recognition method, device and equipment
CN108171239A (en) * 2018-02-02 2018-06-15 杭州清本科技有限公司 The extracting method of certificate pictograph, apparatus and system, computer storage media
CN112686237A (en) * 2020-12-21 2021-04-20 福建新大陆软件工程有限公司 Certificate OCR recognition method
CN113642557A (en) * 2021-08-10 2021-11-12 中国民用航空局信息中心 System and method for supplementing historical data in airworthiness field
CN115035520A (en) * 2021-11-22 2022-09-09 荣耀终端有限公司 Character recognition method for image, electronic device and storage medium
CN115035520B (en) * 2021-11-22 2023-04-18 荣耀终端有限公司 Character recognition method for image, electronic device and storage medium
CN115116060A (en) * 2022-08-25 2022-09-27 深圳前海环融联易信息科技服务有限公司 Key value file processing method, device, equipment, medium and computer program product
CN115116060B (en) * 2022-08-25 2023-01-24 深圳前海环融联易信息科技服务有限公司 Key value file processing method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170623