CN204856534U - Low-vision reading visual aid system based on OCR and TTS - Google Patents

Low-vision reading visual aid system based on OCR and TTS

Info

Publication number
CN204856534U
CN204856534U CN201520484407.6U CN201520484407U CN204856534U CN 204856534 U CN204856534 U CN 204856534U CN 201520484407 U CN201520484407 U CN 201520484407U CN 204856534 U CN204856534 U CN 204856534U
Authority
CN
China
Prior art keywords
image
ocr
module
text
visual acuity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201520484407.6U
Other languages
Chinese (zh)
Inventor
高铁塔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING AUMED GROUP CORP
Original Assignee
BEIJING AUMED GROUP CORP
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BEIJING AUMED GROUP CORP
Priority to CN201520484407.6U
Application granted
Publication of CN204856534U
Expired - Fee Related
Anticipated expiration

Landscapes

  • Character Discrimination (AREA)

Abstract

The utility model provides a low-vision reading visual aid system based on OCR and TTS, comprising: an image capture module for scanning a reading object and for acquiring and outputting an image; a processing module comprising an OCR character recognition unit, connected to the image capture module, for receiving the image and performing image preprocessing and single-character recognition on it to obtain the text corresponding to the image, and a TTS engine unit, connected to the OCR character recognition unit, for converting the text into an audio file; and an output module, connected to the processing module, for outputting the text and the audio file synchronously. The utility model combines OCR and TTS technology: the image capture module scans the reading object and acquires an image, the processing module processes the acquired image, and the output module outputs the text and the audio synchronously. The user thus obtains an assisted reading mode in which listening is primary and viewing is secondary, with advantages such as convenient use and reduced eye fatigue.

Description

Low-vision reading visual aid system based on OCR and TTS
Technical field
The utility model relates to the technical field of electronic reading devices, and in particular to a low-vision reading visual aid system based on OCR and TTS.
Background
Low-vision patients and the elderly have varying degrees of difficulty reading printed material such as books, newspapers, documents and instruction manuals. The traditional aid is the magnifying glass, but because it provides only optical magnification it suffers from problems such as limited magnification and edge distortion. In developed regions such as Europe and America the magnifying glass has therefore largely been replaced by high-tech products, such as electronic visual aids, that ease reading difficulties for low-vision users. However, prolonged use of the eyes can still cause a low-vision user's eyesight to deteriorate.
With the development of terminal technology and software technology, and in particular of intelligent terminal technology, OCR technology and TTS technology, combining OCR with TTS has become feasible.
Optical character recognition (OCR) identifies text by optical means and is an important technology in the research and application of automatic identification. Text can be recognized and entered into a computer automatically, which suits tasks such as building an online library: paper books are scanned and stored in the computer as files, and OCR software then recognizes the required text so that it can be displayed in text form.
Speech synthesis (text-to-speech, TTS) draws on several disciplines, including acoustics, linguistics, digital signal processing and multimedia technology, and is a frontier technology in the field of Chinese information processing.
Compared with applications that produce speech from prerecorded audio files, a TTS speech engine is only a few megabytes in size and needs no large collection of audio files, so it saves considerable storage space and can read aloud any sentence not known in advance. Many applications already use TTS to provide speech functions: some reading software can read novels aloud, assist with proofreading, or read e-mail aloud; some electronic dictionaries can pronounce words; and help centers can use TTS to play service information automatically.
Summary of the utility model
A brief overview of the utility model is given below to provide a basic understanding of some of its aspects. This overview is not an exhaustive summary of the utility model. It is not intended to identify key or essential parts of the utility model, nor to limit its scope. Its sole purpose is to present some concepts in simplified form as a prelude to the more detailed description that follows.
The utility model provides a low-vision reading visual aid system based on OCR and TTS that reduces how often the eyes must be used while still enabling reading.
The utility model provides a low-vision reading visual aid system based on OCR and TTS, comprising:
An image capture module for scanning a reading object and for acquiring and outputting an image;
A processing module, comprising:
An OCR character recognition unit, connected to the image capture module, for receiving the image and performing image preprocessing and single-character recognition on the image to obtain the text corresponding to the image;
A TTS engine unit, connected to the OCR character recognition unit, for converting the text into an audio file;
An output module, connected to the processing module, for synchronously outputting the text and the audio file.
The low-vision reading visual aid system based on OCR and TTS provided by the utility model combines OCR character recognition technology with TTS speech synthesis technology. The image capture module scans the reading object and acquires an image; the processing module processes the acquired image; and the output module finally displays the text being read while synchronously outputting the corresponding audio, giving the user an assisted reading mode in which listening is primary and viewing is secondary. The user can also set the display mode via a keyboard or touch screen, for example white text on a black background, black text on a white background, or an eye-protection mode, which further relieves eye strain and helps low-vision patients, presbyopic users and blind users to read. In summary, the utility model has advantages such as convenient use and reduced eye strain.
Brief description of the drawings
The above and other objects, features and advantages of the utility model will be more readily understood from the following description of its embodiments taken in conjunction with the accompanying drawings. The components in the drawings merely illustrate the principle of the utility model. In the drawings, identical or similar technical features or components are denoted by identical or similar reference numerals.
Fig. 1 is a schematic diagram of the system architecture of one embodiment of the low-vision reading visual aid system based on OCR and TTS of the utility model.
Fig. 2 is a schematic diagram of the system architecture of a preferred embodiment of the low-vision reading visual aid system based on OCR and TTS of the utility model.
Fig. 3 is a schematic diagram of the system architecture of another preferred embodiment of the low-vision reading visual aid system based on OCR and TTS of the utility model.
Description of reference numerals:
10 image capture module
20 user input module
30 processing module
50 output module
301 OCR character recognition unit
303 TTS engine unit
501 display unit
503 audio output unit
Detailed description
Embodiments of the utility model are described below with reference to the accompanying drawings. Elements and features described in one drawing or embodiment of the utility model may be combined with elements and features shown in one or more other drawings or embodiments. Note that, for clarity, the drawings and the description omit the representation and description of components and processes that are unrelated to the utility model or known to those of ordinary skill in the art.
Fig. 1 is a schematic diagram of the system architecture of one embodiment of the low-vision reading visual aid system based on OCR and TTS of the utility model.
As shown in Fig. 1, in this embodiment the low-vision reading visual aid system based on OCR and TTS of the utility model comprises:
An image capture module 10 for scanning a reading object and for acquiring and outputting an image;
A processing module 30, comprising:
An OCR character recognition unit 301, connected to the image capture module 10, for receiving the image and performing image preprocessing and single-character recognition on the image to obtain the text corresponding to the image;
A TTS engine unit 303, connected to the OCR character recognition unit 301, for converting the text into an audio file;
An output module 50, connected to the processing module 30, for synchronously outputting the text and the audio file.
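As a minimal illustrative sketch, and not the implementation of the utility model, the chain from module 10 through units 301 and 303 to module 50 could be expressed in Python as follows. All class and function names, and the stub OCR and TTS callables, are hypothetical; a real system would plug in actual capture hardware and engines.

```python
from dataclasses import dataclass
from typing import Callable, List

Image = List[List[int]]  # hypothetical stand-in for a captured page image


@dataclass
class Output:
    text: str     # shown by the display side of the output module
    audio: bytes  # played by the audio side of the output module


@dataclass
class ProcessingModule:
    # Both callables are placeholders for real OCR and TTS engines.
    ocr_unit: Callable[[Image], str]
    tts_unit: Callable[[str], bytes]

    def process(self, image: Image) -> Output:
        text = self.ocr_unit(image)   # unit 301: image -> text
        audio = self.tts_unit(text)   # unit 303: text -> audio data
        return Output(text=text, audio=audio)


def read_aloud(capture: Callable[[], Image], processor: ProcessingModule) -> Output:
    """Capture an image, process it, and return what the output module needs."""
    return processor.process(capture())


# Stub components so the sketch runs without hardware or engines.
def fake_capture() -> Image:
    return [[0, 1], [1, 0]]


def fake_ocr(image: Image) -> str:
    return "hello"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


result = read_aloud(fake_capture, ProcessingModule(fake_ocr, fake_tts))
```

Keeping the text and the audio in one result object is one simple way to let an output stage present both together; how the described system actually synchronizes them is not specified here.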
Specifically, the image capture module 10 is typically a scanner, a camera, or another scanning or imaging device with the same effect. The image capture module 10 captures reading objects such as newspapers and books and inputs them into the computer, thereby digitizing the original. High OCR accuracy presupposes a high-quality scan of the document image. Choosing the scanning resolution and related parameters appropriately, or using a camera with a higher resolution, is the key to keeping character images clear and their features intact. In addition, the reading object should be placed as squarely as possible so that the skew angle detected during preprocessing is small and the character images are distorted less after skew correction. These simple measures improve OCR accuracy. Conversely, if the scan settings are inappropriate, characters with too many broken strokes may be split so that only half a character is detected, and broken or merged strokes cause some features to be lost; when the features of the character image are then compared with the feature library, the feature distance increases and the recognition error rate rises.
Image preprocessing separates out each character image in the received image and carries out the preparatory work that precedes single-character recognition. This includes image cleaning, that is, removing noise (interference) from the original image; measuring the skew angle at which the document was placed; performing layout analysis on the document; confirming the typesetting of the selected text regions; segmenting horizontally and vertically typeset text lines; separating the character images within each line; and distinguishing punctuation marks. Preprocessing at this stage is extremely important, because its quality directly affects the accuracy of text recognition.
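One common way to carry out the line and character separation steps is a projection profile, sketched below under stated assumptions: the image is already binarized and deskewed, 1 marks an ink pixel, and text runs horizontally. The source does not say which segmentation method is used, so this is an illustration of the idea rather than the described system's method; the function names are hypothetical.

```python
from typing import List, Tuple

Binary = List[List[int]]  # 1 = ink pixel, 0 = background (assumed convention)


def runs(profile: List[int]) -> List[Tuple[int, int]]:
    """Return (start, end) spans of consecutive non-zero entries; end is exclusive."""
    spans, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i                      # a run of ink begins
        elif not value and start is not None:
            spans.append((start, i))       # the run ends at a blank entry
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans


def split_lines(image: Binary) -> List[Tuple[int, int]]:
    """Text lines are bands of rows that contain at least one ink pixel."""
    return runs([sum(row) for row in image])


def split_chars(image: Binary, top: int, bottom: int) -> List[Tuple[int, int]]:
    """Characters in one line are bands of columns that contain ink."""
    width = len(image[0])
    columns = [sum(image[r][c] for r in range(top, bottom)) for c in range(width)]
    return runs(columns)


page = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 0],
    [1, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
]
lines = split_lines(page)
chars_first_line = split_chars(page, *lines[0])
```

A plain profile like this splits a character wherever a stroke is broken and merges characters whose strokes touch, which is why scan quality and skew matter so much for the later recognition step.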
Single-character recognition is the process by which the computer converts each character image into the standard code of the character, that is, recognition proper. Feature information such as character structure and strokes is stored in the system in advance. Each character image is analyzed according to features such as its strokes, feature points, projection information and the regional distribution of its points. Recognized characters, or multiple candidate results, are then matched against their context using phrases: the single-character results are segmented into words and compared with the phrases in a dictionary, which raises the system's recognition rate and lowers its misrecognition rate. The final result is text made up of characters.
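The two stages described above, matching against stored features and then using dictionary context to choose among candidates, can be illustrated with a toy sketch. The zoning feature, the squared-distance measure, the brute-force search over candidate combinations, and every value in the feature library are assumptions made for illustration; the source does not specify them.

```python
from itertools import product
from typing import Dict, List, Set

Glyph = List[List[int]]  # 1 = ink, 0 = background (assumed convention)


def zone_features(glyph: Glyph, zones: int = 2) -> List[float]:
    """Ink density in each cell of a zones x zones grid over the glyph.

    Assumes the glyph is at least zones pixels high and wide.
    """
    height, width = len(glyph), len(glyph[0])
    features = []
    for zr in range(zones):
        for zc in range(zones):
            rows = range(zr * height // zones, (zr + 1) * height // zones)
            cols = range(zc * width // zones, (zc + 1) * width // zones)
            cells = [glyph[r][c] for r in rows for c in cols]
            features.append(sum(cells) / len(cells))
    return features


def candidates(glyph: Glyph, library: Dict[str, List[float]], k: int = 3) -> List[str]:
    """The k library characters with the smallest squared feature distance."""
    observed = zone_features(glyph)
    scored = sorted(
        (sum((a - b) ** 2 for a, b in zip(observed, stored)), char)
        for char, stored in library.items()
    )
    return [char for _, char in scored[:k]]


def pick_with_lexicon(per_char: List[List[str]], lexicon: Set[str]) -> str:
    """Prefer the first candidate combination that forms a known word."""
    for combo in product(*per_char):
        word = "".join(combo)
        if word in lexicon:
            return word
    return "".join(options[0] for options in per_char)  # fall back to top-1 choices


# Toy feature library: the values are made up for illustration only.
LIBRARY = {
    "c": [1.0, 1.0, 1.0, 0.0],
    "e": [1.0, 1.0, 1.0, 1.0],
    "a": [0.0, 1.0, 1.0, 1.0],
    "t": [1.0, 1.0, 0.0, 0.0],
}
glyphs = [[[1, 1], [1, 1]], [[0, 1], [1, 1]], [[1, 1], [0, 0]]]
ranked = [candidates(g, LIBRARY) for g in glyphs]
word = pick_with_lexicon(ranked, {"cat", "tea"})
```

In this toy run the nearest-feature choices alone spell a string that is not in the lexicon, and the lexicon check selects a lower-ranked candidate for the first character instead, which is the kind of correction the context matching step is meant to provide.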
The TTS engine unit 303 converts the text into an audio file and outputs it. The main work of this process is to decompose the text, character by character or word by word, into phonemes; to analyze items that need special handling, such as numbers, monetary units, word inflections and punctuation; and to generate digital audio from the phonemes, yielding the audio file.
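The front half of that process, handling special symbols and mapping words to phonemes, can be sketched as follows. This is a simplified illustration under stated assumptions: English input, digits read out one at a time, a single currency rule, and a tiny hypothetical pronunciation lexicon. Waveform generation is omitted, and none of these rules are taken from the source.

```python
import re
from typing import Dict, List

ONES = ["zero", "one", "two", "three", "four",
        "five", "six", "seven", "eight", "nine"]
PUNCTUATION = {".", ",", "!", "?"}


def spell_digits(match: "re.Match[str]") -> str:
    """Read a run of digits one digit at a time (a deliberate simplification)."""
    return " ".join(ONES[int(d)] for d in match.group())


def normalize(text: str) -> List[str]:
    """Expand currency and digits, then split into words and punctuation tokens."""
    text = re.sub(r"\$(\d+)", r"\1 dollars", text)  # currency symbol -> word
    text = re.sub(r"\d+", spell_digits, text)       # digits -> number words
    return re.findall(r"[a-z]+|[.,!?]", text.lower())


# Hypothetical pronunciation lexicon; a real engine would use a full dictionary.
TOY_LEXICON: Dict[str, List[str]] = {
    "page": ["P", "EY", "JH"],
    "one": ["W", "AH", "N"],
    "two": ["T", "UW"],
}


def to_phonemes(tokens: List[str], lexicon: Dict[str, List[str]]) -> List[str]:
    """Map tokens to phonemes; punctuation becomes a pause marker."""
    phonemes: List[str] = []
    for token in tokens:
        if token in PUNCTUATION:
            phonemes.append("<pause>")
        else:
            phonemes.extend(lexicon.get(token, ["<unk>"]))
    return phonemes


tokens = normalize("Page 12 costs $3.")
phonemes = to_phonemes(["page", "one", "."], TOY_LEXICON)
```

A production engine would read "12" as "twelve", handle decimals and other currencies, and fall back to letter-to-sound rules for unknown words; the sketch only shows where such rules sit in the chain.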
Fig. 2 is a schematic diagram of the system architecture of a preferred implementation of the embodiment shown in Fig. 1.
As shown in Fig. 2, compared with the embodiment shown in Fig. 1, in the embodiment shown in Fig. 2 the output module 50 comprises:
A display unit 501, connected to the OCR character recognition unit 301, for outputting the text;
An audio output unit 503, connected to the TTS engine unit 303 and the display unit 501, for outputting the audio file.
Specifically, the output modes of the output module 50 include VGA output synchronized with audio output, or HDMI output.
The display unit 501 is typically a display screen, and the audio output unit 503 is typically an audio output device such as a speaker system or loudspeaker.
Fig. 3 is a schematic diagram of the system architecture of a preferred implementation of the embodiment shown in Fig. 2.
As shown in Fig. 3, compared with the embodiment shown in Fig. 2, in the embodiment shown in Fig. 3 the low-vision reading visual aid system based on OCR and TTS of the utility model further comprises:
A user input module 20, connected to the processing module 30, for inputting a system start instruction, a system shutdown instruction, an output mode setting instruction and an output parameter setting instruction.
Specifically, the user input module 20 is typically a set of buttons on the device, an external keyboard, a mouse or a touch screen.
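The four kinds of instruction accepted through module 20 could be handled along the following lines. This is a hedged sketch: the command strings, the parameter names `font_size` and `speech_rate`, and the color values are all hypothetical, and only the three display modes themselves come from the description.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any


class DisplayMode(Enum):
    # (text color, background color); the hex values are illustrative assumptions
    WHITE_ON_BLACK = ("#FFFFFF", "#000000")
    BLACK_ON_WHITE = ("#000000", "#FFFFFF")
    EYE_PROTECTION = ("#000000", "#C7EDCC")


@dataclass
class Settings:
    running: bool = False
    mode: DisplayMode = DisplayMode.BLACK_ON_WHITE
    font_size: int = 32        # hypothetical output parameter
    speech_rate: float = 1.0   # hypothetical output parameter


ADJUSTABLE = {"font_size", "speech_rate"}


def handle(settings: Settings, command: str, value: Any = None) -> Settings:
    """Apply one user instruction to the settings and return them."""
    if command == "start":
        settings.running = True
    elif command == "shutdown":
        settings.running = False
    elif command == "set_mode":
        settings.mode = DisplayMode[value]   # look up the mode by its name
    elif command == "set_param":
        name, new_value = value
        if name not in ADJUSTABLE:
            raise ValueError(f"unknown output parameter: {name}")
        setattr(settings, name, new_value)
    else:
        raise ValueError(f"unknown instruction: {command}")
    return settings


settings = handle(Settings(), "start")
handle(settings, "set_mode", "WHITE_ON_BLACK")
handle(settings, "set_param", ("font_size", 48))
```

Rejecting unknown instructions and parameters explicitly keeps a mistyped key or an unsupported button from silently changing the output.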
Preferably, the image capture module 10 is also used to capture and output video of the reading object.
Preferably, the OCR character recognition unit 301 is also used to extract images from the video according to preset parameters.
Preferably, the output module 50 is also used to output the video.
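The source does not say what the preset parameters are. Assuming, purely for illustration, that they are a frame rate and a sampling interval in seconds, extracting still images from the video could look like this; the function name and the frame representation are hypothetical.

```python
from typing import Iterable, Iterator, List, Tuple

Frame = List[List[int]]  # hypothetical stand-in for one decoded video frame


def sample_frames(
    frames: Iterable[Frame], fps: float, interval_s: float
) -> Iterator[Tuple[float, Frame]]:
    """Yield (timestamp in seconds, frame) for one frame per sampling interval."""
    step = max(1, round(fps * interval_s))  # number of frames between samples
    for index, frame in enumerate(frames):
        if index % step == 0:
            yield index / fps, frame


video = [[[i]] for i in range(10)]          # ten dummy one-pixel frames
picked = list(sample_frames(video, fps=5, interval_s=1))
```

Because the function is a generator over any iterable, it can consume frames as they arrive from a camera without holding the whole video in memory.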
Preferably, when performing image preprocessing, the OCR character recognition unit 301 determines the language of the text contained in the image, calls the corresponding language library to perform single-character recognition, and sends the language category information to the TTS engine unit 303.
Preferably, the TTS engine unit 303 calls the speech library of the corresponding language according to the language category information to perform text-to-speech conversion.
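The hand-off of language category information from unit 301 to unit 303 can be sketched as a lookup keyed by a language tag. How the language is detected from the image is not described, so the detector is passed in as a placeholder; the tags, library tables and stub engines below are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Image = List[List[int]]  # hypothetical stand-in for a captured page image


@dataclass(frozen=True)
class Recognition:
    text: str
    language: str  # language category passed from the OCR unit to the TTS unit


def recognize(
    image: Image,
    detect_language: Callable[[Image], str],
    ocr_libraries: Dict[str, Callable[[Image], str]],
) -> Recognition:
    """Detect the language, then recognize with the matching language library."""
    language = detect_language(image)
    text = ocr_libraries[language](image)
    return Recognition(text=text, language=language)


def synthesize(
    result: Recognition, voice_libraries: Dict[str, Callable[[str], bytes]]
) -> bytes:
    """Convert text to audio using the speech library for the same language."""
    return voice_libraries[result.language](result.text)


# Stub libraries so the sketch runs without real engines.
ocr_libraries = {"zh": lambda image: "你好", "en": lambda image: "hello"}
voice_libraries = {
    "zh": lambda text: b"zh:" + text.encode("utf-8"),
    "en": lambda text: b"en:" + text.encode("utf-8"),
}

result = recognize([[0]], lambda image: "en", ocr_libraries)
audio = synthesize(result, voice_libraries)
```

Carrying the language tag alongside the text means the speech side never has to guess the language again, which mirrors the information flow described above.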
In summary, the low-vision reading visual aid system based on OCR and TTS provided by the utility model combines OCR character recognition technology with TTS speech synthesis technology. The image capture module scans the reading object and acquires an image; the processing module processes the acquired image; and the output module finally displays the text being read while synchronously outputting the corresponding audio, giving the user an assisted reading mode in which listening is primary and viewing is secondary. The user can also set the display mode via a keyboard or touch screen, for example white text on a black background, black text on a white background, or an eye-protection mode, which further relieves eye strain and helps low-vision patients, presbyopic users and blind users to read. The utility model has advantages such as convenient use and reduced eye strain.
Finally, it should be noted that the above embodiments merely illustrate the technical solutions of the utility model and do not limit them. Although the utility model has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the utility model.

Claims (9)

1. A low-vision reading visual aid system based on OCR and TTS, characterized by comprising:
An image capture module for scanning a reading object and for acquiring and outputting an image;
A processing module, comprising:
An OCR character recognition unit, connected to the image capture module, for receiving the image and performing image preprocessing and single-character recognition on the image to obtain the text corresponding to the image;
A TTS engine unit, connected to the OCR character recognition unit, for converting the text into an audio file;
An output module, connected to the processing module, for synchronously outputting the text and the audio file.
2. The low-vision reading visual aid system according to claim 1, characterized in that the output module comprises:
A display unit, connected to the OCR character recognition unit, for outputting the text;
An audio output unit, connected to the TTS engine unit and the display unit, for outputting the audio file.
3. The low-vision reading visual aid system according to claim 1, characterized by further comprising:
A user input module, connected to the processing module, for inputting a system start instruction, a system shutdown instruction, an output mode setting instruction and an output parameter setting instruction.
4. The low-vision reading visual aid system according to claim 1, characterized in that the image capture module is also used to capture and output video of the reading object.
5. The low-vision reading visual aid system according to claim 4, characterized in that the OCR character recognition unit is also used to extract images from the video according to preset parameters.
6. The low-vision reading visual aid system according to claim 4, characterized in that the output module is also used to output the video.
7. The low-vision reading visual aid system according to claim 1, characterized in that, when performing image preprocessing, the OCR character recognition unit determines the language of the text contained in the image, calls the corresponding language library to perform single-character recognition, and sends the language category information to the TTS engine unit.
8. The low-vision reading visual aid system according to claim 7, characterized in that the TTS engine unit calls the speech library of the corresponding language according to the language category information to perform text-to-speech conversion.
9. The low-vision reading visual aid system according to claim 1, characterized in that the output modes of the output module include VGA output synchronized with audio output, or HDMI output.
CN201520484407.6U 2015-07-07 2015-07-07 Low-vision reading visual aid system based on OCR and TTS Expired - Fee Related CN204856534U (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201520484407.6U CN204856534U (en) 2015-07-07 2015-07-07 System of looking that helps is read to low eyesight based on OCR and TTS

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201520484407.6U CN204856534U (en) 2015-07-07 2015-07-07 System of looking that helps is read to low eyesight based on OCR and TTS

Publications (1)

Publication Number Publication Date
CN204856534U true CN204856534U (en) 2015-12-09

Family

ID=54747065

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201520484407.6U Expired - Fee Related CN204856534U (en) 2015-07-07 2015-07-07 Low-vision reading visual aid system based on OCR and TTS

Country Status (1)

Country Link
CN (1) CN204856534U (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108182432A (en) * 2017-12-28 2018-06-19 北京百度网讯科技有限公司 Information processing method and device
US10963760B2 (en) 2017-12-28 2021-03-30 Beijing Baidu Netcom Science And Technology Co., Ltd. Method and apparatus for processing information
CN111273558A (en) * 2020-02-10 2020-06-12 南通大学 Intelligent reading system

Similar Documents

Publication Publication Date Title
CN104966084A (en) OCR (Optical Character Recognition) and TTS (Text To Speech) based low-vision reading visual aid system
US10741167B2 (en) Document mode processing for portable reading machine enabling document navigation
Mithe et al. Optical character recognition
US9626000B2 (en) Image resizing for optical character recognition in portable reading machine
US8538087B2 (en) Aiding device for reading a printed text
US8036895B2 (en) Cooperative processing for portable reading machine
US20100331043A1 (en) Document and image processing
Ani et al. Smart Specs: Voice assisted text reading system for visually impaired persons using TTS method
Rajesh et al. Text recognition and face detection aid for visually impaired person using Raspberry PI
Hagargund et al. Image to speech conversion for visually impaired
Manage et al. An intelligent text reader based on python
CN111723653A (en) Drawing book reading method and device based on artificial intelligence
CN204856534U (en) Low-vision reading visual aid system based on OCR and TTS
Kaur et al. Conversion of Hindi Braille to speech using image and speech processing
CN112163513A (en) Information selection method, system, device, electronic equipment and storage medium
Hairuman et al. OCR signage recognition with skew & slant correction for visually impaired people
CN115988149A (en) Method for generating video by AI intelligent graphics context
KR20200058026A (en) Operating methed in electronic device for kanji study using agumented reality
Dhulekar et al. Automatic voice generation system after street board identification for visually impaired
CN113392847A (en) OCR (optical character recognition) handheld scanning translation device and translation method for Tibetan Chinese and English
Jadhav et al. Raspberry pi based reader for blind
CN111428569A (en) Visual identification method and device for picture book or teaching material based on artificial intelligence
Shanmugam et al. Hardcopy Text Recognition and Vocalization for Visually Impaired and Illiterates in Bilingual Language
Ramkishor et al. Artificial vision for blind people using OCR technology
CN115410061B (en) Image-text emotion analysis system based on natural language processing

Legal Events

Date Code Title Description
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20151209

Termination date: 20170707
