CN204856534U

CN204856534U - Low-vision reading aid system based on OCR and TTS

Info

Publication number: CN204856534U
Application number: CN201520484407.6U
Authority: CN
Inventors: 高铁塔
Original assignee: BEIJING AUMED GROUP CORP
Current assignee: BEIJING AUMED GROUP CORP
Priority date: 2015-07-07
Filing date: 2015-07-07
Publication date: 2015-12-09
Anticipated expiration: 2025-07-07

Abstract

The utility model provides a low-vision reading aid system based on OCR and TTS, comprising: an image capture module for scanning a reading object and capturing and outputting an image; a processing module comprising an OCR character recognition unit, connected to the image capture module, for receiving the image and performing image preprocessing and single-character recognition on it to obtain the text corresponding to the image, and a TTS engine unit, connected to the OCR character recognition unit, for converting the text into an audio file; and an output module, connected to the processing module, for outputting the text and the audio file synchronously. The utility model combines OCR and TTS technology: the image capture module scans the reading object and captures an image, the processing module processes the captured image, and the output module outputs the text and audio synchronously, giving the user an assisted reading mode in which listening is primary and viewing is auxiliary. Its advantages include ease of use and reduced eye fatigue.

Description

A low-vision reading aid system based on OCR and TTS

Technical field

The utility model relates to the technical field of electronic reading devices, and in particular to a low-vision reading aid system based on OCR and TTS.

Background art

Low-vision patients and the elderly experience varying degrees of difficulty when reading printed material such as books, newspapers, documents and instruction manuals. The traditional aid is a magnifying glass, but because it offers only optical magnification, it suffers from problems such as limited magnification and edge distortion. In developed regions such as Europe and America, magnifying glasses have therefore largely been replaced by high-tech products, such as electronic visual aids, that ease the reading difficulties of people with low vision. However, prolonged use of the eyes can still cause the vision of low-vision users to deteriorate.

With the development of terminal and software technology, and in particular of smart terminal technology, OCR technology and TTS technology, combining OCR with TTS has become feasible.

Optical character recognition (OCR) identifies text by optical means and is an important technology in the research and application of automatic identification. Text can be recognized and entered into a computer automatically, which makes OCR suitable for building online libraries: paper books are scanned, OCR software recognizes the required text, and the text stored in the computer as a file can then be displayed in text form.

Speech synthesis (text-to-speech, TTS) draws on multiple disciplines, including acoustics, linguistics, digital signal processing and multimedia technology, and is a leading-edge technology in the field of Chinese information processing.

Compared with applications that produce speech from prerecorded audio files, a TTS speech engine is only a few megabytes in size and does not need a large number of audio files, so it saves considerable storage space and can read aloud any sentence not known in advance. Many applications already use TTS to provide speech functions: for example, some reading software can read novels aloud or assist with proofreading and can also read e-mail aloud; some electronic dictionaries can pronounce words; and help centers can use TTS to play service information automatically.

Summary of the utility model

A brief overview of the utility model is given below to provide a basic understanding of certain aspects of it. It should be understood that this overview is not an exhaustive summary of the utility model. It is not intended to identify key or critical parts of the utility model, nor to limit its scope. Its sole purpose is to present some concepts in simplified form as a prelude to the more detailed description that follows.

The utility model provides a low-vision reading aid system based on OCR and TTS that reduces how much the user has to use their eyes while still enabling reading.

The utility model provides a low-vision reading aid system based on OCR and TTS, comprising:

an image capture module for scanning a reading object and capturing and outputting an image;

a processing module, comprising:

an OCR character recognition unit, connected to the image capture module, for receiving the image and performing image preprocessing and single-character recognition on the image to obtain the text corresponding to the image;

a TTS engine unit, connected to the OCR character recognition unit, for converting the text into an audio file;

an output module, connected to the processing module, for outputting the text and the audio file synchronously.

The low-vision reading aid system based on OCR and TTS provided by the utility model combines OCR character recognition technology with TTS speech synthesis technology. The image capture module scans the reading object and captures an image, the processing module processes the captured image, and the output module finally displays the text being read in synchrony with the corresponding audio output, giving the user an assisted reading mode in which listening is primary and viewing is auxiliary. The user can also set the display mode by keyboard or touch screen, for example white text on a black background, black text on a white background, or an eye-protection mode, which further relieves eye fatigue and helps low-vision patients, people with presbyopia and blind users to read. In summary, the utility model has advantages such as ease of use and reduced eye fatigue.

Brief description of the drawings

The above and other objects, features and advantages of the utility model will be more readily understood from the following description of its embodiments in conjunction with the accompanying drawings. The components in the drawings serve only to illustrate the principles of the utility model. In the drawings, identical or similar technical features or components are denoted by identical or similar reference numerals.

Fig. 1 is a system architecture diagram of one embodiment of the low-vision reading aid system based on OCR and TTS of the utility model.

Fig. 2 is a system architecture diagram of a preferred implementation of the low-vision reading aid system based on OCR and TTS of the utility model.

Fig. 3 is a system architecture diagram of another preferred implementation of the low-vision reading aid system based on OCR and TTS of the utility model.

Description of reference numerals:

10 image capture module

20 user input module

30 processing module

50 output module

301 OCR character recognition unit

303 TTS engine unit

501 display unit

503 audio output unit

Detailed description of the embodiments

Embodiments of the utility model are described below with reference to the accompanying drawings. Elements and features described in one drawing or embodiment of the utility model may be combined with elements and features shown in one or more other drawings or embodiments. It should be noted that, for the sake of clarity, the drawings and the description omit the representation and description of components and processes that are unrelated to the utility model and are known to those of ordinary skill in the art.

As shown in Fig. 1, in this embodiment the low-vision reading aid system based on OCR and TTS of the utility model comprises:

an image capture module 10 for scanning a reading object and capturing and outputting an image;

a processing module 30, comprising:

an OCR character recognition unit 301, connected to the image capture module 10, for receiving the image and performing image preprocessing and single-character recognition on the image to obtain the text corresponding to the image;

a TTS engine unit 303, connected to the OCR character recognition unit 301, for converting the text into an audio file;

an output module 50, connected to the processing module 30, for outputting the text and the audio file synchronously.
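The data flow between the modules listed above can be illustrated with a minimal Python sketch. It is not the implementation of the utility model: the class and function names (ReadingResult, ProcessingModule, run_once) and the fake capture, OCR and TTS stand-ins are hypothetical and exist only to show how the image passes from module 10 through units 301 and 303 to module 50.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ReadingResult:
    text: str     # text recognized by the OCR unit (301)
    audio: bytes  # audio produced by the TTS engine unit (303)


class ProcessingModule:
    """Processing module 30: chains an OCR unit into a TTS unit."""

    def __init__(self, ocr: Callable[[bytes], str], tts: Callable[[str], bytes]):
        self.ocr = ocr
        self.tts = tts

    def process(self, image: bytes) -> ReadingResult:
        text = self.ocr(image)   # unit 301: image -> text
        audio = self.tts(text)   # unit 303: text -> audio
        return ReadingResult(text, audio)


def run_once(capture, processing, output):
    """Image capture module 10 -> processing module 30 -> output module 50."""
    image = capture()
    result = processing.process(image)
    output(result)  # the output module receives text and audio together
    return result


# Hypothetical stand-ins; a real system would call an OCR library and a TTS engine.
def fake_capture() -> bytes:
    return b"HELLO"


def fake_ocr(image: bytes) -> str:
    return image.decode("ascii").lower()


def fake_tts(text: str) -> bytes:
    return ("<audio:" + text + ">").encode("ascii")


shown = []
result = run_once(fake_capture, ProcessingModule(fake_ocr, fake_tts), shown.append)
```

Passing the OCR and TTS units in as callables mirrors the way the description treats them as separate connected units, so either one can be swapped without touching the rest of the pipeline.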

Specifically, the image capture module 10 is typically a scanner, a camera, or another scanning or photographing device with the same effect. It captures reading objects such as newspapers and books and inputs them into the computer, thereby digitizing the original. A precondition for accurate OCR recognition is a high-quality scan of the document image. Choosing the scanning resolution and related parameters appropriately, or using a camera with a higher resolution, is the key to ensuring that character images are clear and that their features are not lost. In addition, the reading object should be placed as straight as possible so that the skew angle detected during preprocessing is small and the character images are less distorted after skew correction. These simple measures raise the OCR recognition accuracy. Conversely, with improper scan settings, characters with too many broken strokes may be split so that only half a character is detected, and broken or merged strokes cause some features to be lost; when the features of the character image are then compared with the feature library, the feature distance increases and the recognition error rate rises.

Image preprocessing detects and sorts out each character image in the received image and carries out the preparatory work needed before single-character recognition. This includes image cleaning, that is, removing noise (interference) from the original image; measuring the skew angle at which the document was placed; performing layout analysis of the document; confirming the typesetting of the selected text region; segmenting the lines of text in horizontal or vertical layout; separating the character images within each line; and distinguishing punctuation marks. The preprocessing at this stage is extremely important, because its quality directly affects the accuracy of character recognition.
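The line segmentation and character separation steps described above are often done with projection profiles. The following Python sketch shows that technique on a tiny binarized bitmap. It is an illustrative assumption, since the description does not name a segmentation algorithm, and the function names segment_rows and segment_columns are hypothetical. Noise removal and skew correction are assumed to have been done already.

```python
def segment_rows(bitmap):
    """Return (start, end) row ranges, end exclusive, that contain ink.

    bitmap is a list of rows; each row is a list of 0 (background) or 1 (ink).
    """
    profile = [sum(row) for row in bitmap]  # ink pixels per row
    ranges, start = [], None
    for y, count in enumerate(profile):
        if count > 0 and start is None:
            start = y                       # a text line begins
        elif count == 0 and start is not None:
            ranges.append((start, y))       # a blank row ends the line
            start = None
    if start is not None:
        ranges.append((start, len(profile)))
    return ranges


def segment_columns(bitmap, row_range):
    """Within one text line, split characters at empty columns."""
    top, bottom = row_range
    band = bitmap[top:bottom]
    transposed = [list(col) for col in zip(*band)]  # columns become rows
    return segment_rows(transposed)


page = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 0],
    [1, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
]
lines = segment_rows(page)                          # two text lines
chars = [segment_columns(page, r) for r in lines]   # character spans per line
```

For vertically typeset text the same two functions apply with the roles of rows and columns exchanged.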

Single-character recognition is the process by which the computer converts each character image into the standard code of the character, that is, recognition proper. Feature information such as the structure and strokes of characters is stored in the system in advance. Each character image is analysed according to its strokes, feature points, projection information, the regional distribution of its points and so on. The recognized character, or its multiple candidate results, is then matched against its context using phrases: the single-character results are segmented into words and compared with the phrases in a dictionary. This raises the recognition rate of the system, lowers the misrecognition rate, and finally yields text composed of characters.
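The phrase-matching step can be sketched as follows. This is a simplified illustration under stated assumptions, not the method of the utility model: each position is assumed to carry a short list of single-character candidates ordered best first, the dictionary is a plain Python set, and the function name correct_with_dictionary and the max_len parameter are hypothetical. Latin letters are used only to keep the example short.

```python
from itertools import product


def correct_with_dictionary(candidates, dictionary, max_len=4):
    """Choose characters so that dictionary phrases are preferred.

    candidates[i] lists the possible characters for position i, best match
    first. Phrases of up to max_len characters are tried, longest first.
    """
    out, i = [], 0
    while i < len(candidates):
        chosen, consumed = None, 1
        for n in range(min(max_len, len(candidates) - i), 1, -1):
            for combo in product(*candidates[i:i + n]):
                word = "".join(combo)
                if word in dictionary:
                    chosen, consumed = word, n
                    break
            if chosen is not None:
                break
        if chosen is None:
            chosen = candidates[i][0]  # no phrase matched: keep the top candidate
        out.append(chosen)
        i += consumed
    return "".join(out)


# The top candidates alone would read "e1ock"; the dictionary recovers "clock".
fixed = correct_with_dictionary(
    [["e", "c"], ["1", "l"], ["o"], ["c"], ["k"]], {"clock"}, max_len=5
)
```

Enumerating every combination is only practical for short candidate lists; a production recognizer would more likely score paths with a language model.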

The TTS engine unit 303 converts the text into an audio file and outputs it. This process mainly decomposes the text, character by character or word by word, into phonemes; analyses symbols that need special handling, such as numbers, currency units, word inflections and punctuation; and generates digital audio from the phonemes to obtain the audio file.
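The special handling of numbers and currency units is usually a text normalization pass that runs before phoneme conversion. The Python sketch below shows one possible form of it for English text. It is an assumption for illustration only: the function names number_to_words and normalize_for_tts are hypothetical, only integers from 0 to 999 and the dollar sign are covered, and a Chinese-language engine would need its own rules.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n):
    """Spell out an integer from 0 to 999."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    words = ONES[hundreds] + " hundred"
    return words + (" " + number_to_words(rest) if rest else "")


def _dollars(match):
    amount = int(match.group(1))
    unit = "dollar" if amount == 1 else "dollars"
    return number_to_words(amount) + " " + unit


def normalize_for_tts(text):
    # Currency first, so "$5" becomes "five dollars" and not "$five".
    text = re.sub(r"\$(\d{1,3})\b", _dollars, text)
    # Then any remaining stand-alone numbers of one to three digits.
    # Longer numbers such as years are left unchanged by this sketch.
    return re.sub(r"\b(\d{1,3})\b",
                  lambda m: number_to_words(int(m.group(1))), text)


spoken = normalize_for_tts("Page 42 costs $5")
```

The normalized string is what would then be decomposed into phonemes.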

Fig. 2 is a system architecture diagram of a preferred implementation of the embodiment shown in Fig. 1.

As shown in Fig. 2, compared with the embodiment shown in Fig. 1, the output module 50 in the embodiment shown in Fig. 2 comprises:

a display unit 501, connected to the OCR character recognition unit 301, for outputting the text;

an audio output unit 503, connected to the TTS engine unit 303 and to the display unit 501, for outputting the audio file.

Specifically, the output modes of the output module 50 include VGA with synchronized audio output, or HDMI output.

The display unit 501 is typically a display screen, and the audio output unit 503 is typically an audio output device such as a speaker system or a loudspeaker.
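The description does not say how the display unit 501 and the audio output unit 503 are kept in step. One simple possibility, shown here purely as an assumption, is to split the text into segments, synthesize each segment separately, and use the audio durations to decide when each segment should appear on screen. The function name build_schedule is hypothetical.

```python
def build_schedule(segments):
    """Compute when each text segment should be shown.

    segments is a list of (text, audio_duration_seconds) pairs in reading
    order. Returns a list of (start_time_seconds, text) pairs.
    """
    schedule, elapsed = [], 0.0
    for text, duration in segments:
        schedule.append((elapsed, text))  # show this text when its audio starts
        elapsed += duration
    return schedule


timeline = build_schedule([("First sentence.", 1.5), ("Second sentence.", 2.0)])
```

With such a timeline the display unit can highlight or scroll to each segment at the moment the audio output unit starts playing it.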

Fig. 3 is a system architecture diagram of a preferred implementation of the embodiment shown in Fig. 2.

As shown in Fig. 3, compared with the embodiment shown in Fig. 2, the low-vision reading aid system based on OCR and TTS of the utility model in the embodiment shown in Fig. 3 further comprises:

a user input module 20, connected to the processing module 30, for inputting a system start instruction, a system shutdown instruction, an output mode setting instruction and an output parameter setting instruction.

Specifically, the user input module 20 is typically a button on the device, an external keyboard, a mouse or a touch screen.
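The four instruction types handled by the user input module 20 can be sketched as a small dispatcher. Everything in this Python example is a hypothetical illustration: the instruction strings, the SystemState fields, and the colour values (in particular the green background for the eye-protection mode) are assumptions. Only the three display modes themselves come from the description.

```python
from dataclasses import dataclass, field

# Display modes named in the description; the colour values are assumed.
DISPLAY_MODES = {
    "white_on_black": {"foreground": "#FFFFFF", "background": "#000000"},
    "black_on_white": {"foreground": "#000000", "background": "#FFFFFF"},
    "eye_protection": {"foreground": "#000000", "background": "#C7EDCC"},
}


@dataclass
class SystemState:
    running: bool = False
    display_mode: str = "black_on_white"
    parameters: dict = field(default_factory=dict)


def handle_instruction(state, instruction, value=None):
    """Apply one instruction from the user input module to the system state."""
    if instruction == "start":
        state.running = True
    elif instruction == "shutdown":
        state.running = False
    elif instruction == "set_output_mode":
        if value not in DISPLAY_MODES:
            raise ValueError(f"unknown display mode: {value!r}")
        state.display_mode = value
    elif instruction == "set_output_parameter":
        name, setting = value  # e.g. ("speech_rate", 0.8)
        state.parameters[name] = setting
    else:
        raise ValueError(f"unknown instruction: {instruction!r}")
    return state


state = SystemState()
handle_instruction(state, "start")
handle_instruction(state, "set_output_mode", "white_on_black")
handle_instruction(state, "set_output_parameter", ("speech_rate", 0.8))
```

A button, keyboard, mouse or touch screen would each translate its events into these instructions before they reach the processing module.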

Preferably, the image capture module 10 is further used to capture and output video of the reading object.

Preferably, the OCR character recognition unit 301 is further used to capture images from the video according to preset parameters.

Preferably, the output module 50 is further used to output the video.
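Capturing images from the video according to preset parameters can be as simple as taking one frame per fixed time interval. The sketch below assumes that the preset parameter is such an interval. The description does not specify the parameters, and the function name sample_frames and its arguments are hypothetical.

```python
def sample_frames(frames, fps, interval_seconds):
    """Yield (timestamp_seconds, frame) once per interval_seconds of video.

    frames is any iterable of frames in playback order; fps is the frame
    rate of the video.
    """
    step = max(1, round(fps * interval_seconds))  # frames between samples
    for index, frame in enumerate(frames):
        if index % step == 0:
            yield index / fps, frame


# Ten dummy frames at 5 frames per second, sampled once per second.
picked = list(sample_frames(range(10), fps=5, interval_seconds=1))
```

Each sampled frame would then go through the same preprocessing and recognition path as a scanned image.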

Preferably, when performing image preprocessing, the OCR character recognition unit 301 determines the language of the text contained in the image, calls the corresponding language library to perform single-character recognition, and sends the language information to the TTS engine unit 303.

Preferably, the TTS engine unit 303 calls the voice library of the corresponding language according to the language information to perform text-to-speech conversion.
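The hand-off of language information from unit 301 to unit 303 can be sketched as follows. The library names, the two-letter language tags, and the functions ocr_unit and tts_unit are hypothetical. The detection, recognition and synthesis steps are passed in as callables because the description does not specify how they are implemented.

```python
# Hypothetical registries mapping a language tag to the library to load.
OCR_LIBRARIES = {"zh": "ocr-lib-chinese", "en": "ocr-lib-english"}
VOICE_LIBRARIES = {"zh": "voice-chinese", "en": "voice-english"}


def ocr_unit(image, detect_language, recognize):
    """Unit 301: pick the language library, recognize, forward the language."""
    language = detect_language(image)  # decided during preprocessing
    text = recognize(image, OCR_LIBRARIES[language])
    return text, language              # the language tag travels with the text


def tts_unit(text, language, synthesize):
    """Unit 303: pick the voice library that matches the forwarded language."""
    return synthesize(text, VOICE_LIBRARIES[language])


# Stand-ins for the real detection, recognition and synthesis steps.
def fake_detect(image):
    return "zh" if image.startswith(b"ZH") else "en"


def fake_recognize(image, library):
    return "text via " + library


def fake_synthesize(text, voice):
    return (voice + ":" + text).encode("utf-8")


text, language = ocr_unit(b"ZH-page", fake_detect, fake_recognize)
audio = tts_unit(text, language, fake_synthesize)
```

Forwarding the tag rather than detecting the language again in the TTS unit keeps the two units consistent for the same page.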

In summary, the low-vision reading aid system based on OCR and TTS provided by the utility model combines OCR character recognition technology with TTS speech synthesis technology. The image capture module scans the reading object and captures an image, the processing module processes the captured image, and the output module finally displays the text being read in synchrony with the corresponding audio output, giving the user an assisted reading mode in which listening is primary and viewing is auxiliary. The user can also set the display mode by keyboard or touch screen, for example white text on a black background, black text on a white background, or an eye-protection mode, which further relieves eye fatigue and helps low-vision patients, people with presbyopia and blind users to read. The utility model has advantages such as ease of use and reduced eye fatigue.

Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the utility model, not to limit it. Although the utility model has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the utility model.

Claims

1. A low-vision reading aid system based on OCR and TTS, characterized in that it comprises:

an image capture module for scanning a reading object and capturing and outputting an image;

a processing module, comprising:

2. The low-vision reading aid system according to claim 1, characterized in that the output module comprises:

a display unit, connected to the OCR character recognition unit, for outputting the text;

an audio output unit, connected to the TTS engine unit and to the display unit, for outputting the audio file.

3. The low-vision reading aid system according to claim 1, characterized in that it further comprises:

a user input module, connected to the processing module, for inputting a system start instruction, a system shutdown instruction, an output mode setting instruction and an output parameter setting instruction.

4. The low-vision reading aid system according to claim 1, characterized in that the image capture module is further used to capture and output video of the reading object.

5. The low-vision reading aid system according to claim 4, characterized in that the OCR character recognition unit is further used to capture images from the video according to preset parameters.

6. The low-vision reading aid system according to claim 4, characterized in that the output module is further used to output the video.

7. The low-vision reading aid system according to claim 1, characterized in that, when performing image preprocessing, the OCR character recognition unit determines the language of the text contained in the image, calls the corresponding language library to perform single-character recognition, and sends the language information to the TTS engine unit.

8. The low-vision reading aid system according to claim 7, characterized in that the TTS engine unit calls the voice library of the corresponding language according to the language information to perform text-to-speech conversion.

9. The low-vision reading aid system according to claim 1, characterized in that the output modes of the output module include VGA with synchronized audio output, or HDMI output.