CN106507321A - A Uyghur-Chinese bilingual GSM short-message voice conversion and broadcasting system - Google Patents

Info

Publication number
CN106507321A
Authority
CN
China
Prior art keywords
chinese
voice
note
syllable
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611044873.8A
Other languages
Chinese (zh)
Inventor
白涛
王磊
寇晓斌
杨抒
吴乃宁
吴艳
程鲁玉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xinjiang Agricultural University
Original Assignee
Xinjiang Agricultural University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xinjiang Agricultural University
Priority to CN201611044873.8A
Publication of CN106507321A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/12 Messaging; Mailboxes; Announcements
    • H04W4/14 Short messaging services, e.g. short message services [SMS] or unstructured supplementary service data [USSD]
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/08 Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/18 Information format or content conversion, e.g. adaptation by the network of the transmitted or received information for the purpose of wireless delivery to users or terminals

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention discloses a Uyghur-Chinese bilingual GSM short-message voice conversion and broadcasting system consisting of a software system and a hardware system. The software system is divided into four parts: an SMS receiving module, a text preprocessing module, a corpus construction module, and a waveform concatenation synthesis module. The hardware system uses an ARM processor with a Cortex-M3 core as the control unit and a SIM900A GSM module as the SMS receiving and voice transmission unit. The invention converts Uyghur and Chinese short messages into speech and processes only messages from authorized mobile numbers; an automatic callback mechanism and a DTMF password verification mechanism ensure the legitimacy of the message sender. The system is secure, stable, and highly integrated. Working together with a PC, it converts emergency short messages that pass the security authentication into speech and broadcasts them to a designated area, which substantially reduces the deployment cost of emergency broadcasting while improving flexibility.

Description

A Uyghur-Chinese bilingual GSM short-message voice conversion and broadcasting system
Technical field
The present invention relates to the technical field of message-to-speech conversion and broadcasting systems, and specifically to a Uyghur-Chinese bilingual GSM short-message voice conversion and broadcasting system.
Background art
Mobile communication is one of the most valuable electronic information technologies to emerge in the twentieth century, and the short message service (SMS) was one of the earliest major ways in which people communicated by mobile phone. Converting short messages into speech and broadcasting them has considerable practical value in daily life, because it offers people a cheap and convenient communication service.
The SMS voice broadcast technology provided by some existing mobile phones does not meet users' needs well. Some solutions must connect to a server, which synthesizes the speech and sends it back to the phone; this restricts the usage environment and increases cost. Others synthesize speech from syllables stored locally, and the resulting voice quality is very poor.
Uyghur-Chinese bilingual SMS voice conversion and broadcasting in particular still has open problems. For example, emergency short messages cannot be securely converted into speech and broadcast to a designated area. In speech synthesis there are two main approaches: parametric synthesis and waveform concatenation. Parametric synthesis depends heavily on speech parameter extraction, and because research on speech production models is not yet mature, the clarity and naturalness of parametric synthesis have not reached a practical level. Waveform concatenation, by contrast, splices speech waveforms together and outputs the combined speech data. It uses natural speech waveforms instead of parameters; the waveforms are taken from recorded words, phrases, and sentences and therefore carry the natural prosody of the original speech, so the synthesized speech is clear and natural and its quality is generally higher than that of parametric synthesis. However, if waveforms are simply spliced directly, two problems usually appear at the splice points. First, a harsh click can be heard at the splice point. Second, if two units differ in pitch, the pitch of the spliced speech is inconsistent and jumps abruptly between high and low.
To address these problems, we have designed and developed a Uyghur-Chinese bilingual SMS voice conversion and broadcasting application platform in greater depth and detail, which has broad practical application value.
Summary of the invention
The technical problem solved by the present invention is to provide a low-power, low-cost Uyghur-Chinese bilingual GSM short-message voice conversion and broadcasting system that solves the problem that emergency short messages cannot be securely converted into speech and broadcast to a designated area.
The technical solution of the invention is as follows: a Uyghur-Chinese bilingual GSM short-message voice conversion and broadcasting system consisting of a software system and a hardware system.
The software system is divided into four parts: an SMS receiving module, a text preprocessing module, a corpus construction module, and a waveform concatenation synthesis module.
SMS receiving module: responsible for receiving short messages and obtaining the newest message text in real time. When a message is received, the system calls the sender back, decodes the sender's keypad input with an MT8870 DTMF decoder chip to obtain the entered password, and verifies it; the terminal controller then processes the pushed information in real time. This implements information reception based on SMS push. SMS reception is an open system in which the message source is not restricted, so in a real deployment the system will inevitably receive messages that are not push requests. A verification step is therefore added at the receiving end to authenticate push requests: the system calls back, waits for the user to enter a password, and verifies it. The main difficulty in this process is obtaining the password that the user enters on the dial keypad during the callback.
Text preprocessing module: first performs language identification to distinguish Chinese text from Uyghur text; then normalizes the Chinese and Uyghur text separately according to the normalization rules of each language; then segments the Chinese text into words using an existing dictionary and splits the Uyghur text into syllables using an existing syllable inventory, yielding the pronunciation units of the text.
Corpus construction module: builds the Chinese word and single-character corpus and the Uyghur syllable corpus.
Waveform concatenation synthesis module: for Chinese, selects the audio files corresponding to words and splices them; for Uyghur, selects the audio files corresponding to syllables and splices them.
The hardware system is as follows. An ARM processor with a Cortex-M3 core serves as the control unit. A SIM900A GSM module serves as the SMS receiving and voice transmission unit. A VS1003 audio decoder chip decodes MP3 audio files. An SD card is driven over SDIO and carries a FAT32 file system that stores configuration files and audio files. A 3-inch 400×240 TFT color LCD displays the running status of the system. An MX3232 serves as the driver chip of the RS232 interface circuit that connects the system to the PC. An RTC clock circuit provides an accurate real-time clock. An AMS1117 linear regulator steps the power adapter voltage down to the operating voltage of the ARM processor and the other chips. The embedded SMS broadcasting system can receive short messages from designated mobile phones and actively call back the authorized number; it identifies the authorized phone's password by dual-tone multi-frequency (DTMF) signaling and, once verification succeeds, sends the message to be announced to the PC for speech synthesis and broadcasting. The SIM900A is a compact GSM/GPRS wireless communication module in an SMT package, which makes it easy for customers to design with flexibly, and it is feature-rich.
Further, in the above scheme, the text preprocessing module identifies the language from the position of each character in Unicode and so distinguishes Chinese from Uyghur text. From an engineering standpoint, it uses a rule-based approach to normalize the Chinese and Uyghur text. Chinese is segmented into words with the forward maximum matching algorithm against an existing dictionary, and Uyghur is likewise split with the forward maximum matching algorithm against an existing syllable inventory. Language identification covers spoken language identification, which distinguishes languages from audio files, and text language identification, which distinguishes languages from text.
Encoding detection is normally a precondition for language identification. Because the present invention uses UTF-8 throughout, the encoding detection step is avoided, and the work concentrates on identifying the minority language. For this we use regular-expression matching based on the specific position of each language's characters in Unicode.
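The Unicode-position approach described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the specific code-point ranges (CJK Unified Ideographs for Chinese, the Arabic block and its presentation forms for Uyghur written in Arabic script) and the function name `identify_language` are assumptions chosen for the example.

```python
import re

# Assumed ranges: CJK Unified Ideographs for Chinese; the Arabic block and
# Arabic Presentation Forms A/B for Uyghur written in Arabic script.
CHINESE_RE = re.compile(r"[\u4e00-\u9fff]")
UYGHUR_RE = re.compile(r"[\u0600-\u06ff\ufb50-\ufdff\ufe70-\ufefc]")


def identify_language(text: str) -> str:
    """Return 'zh', 'ug', or 'unknown' by counting characters in each range."""
    zh_count = len(CHINESE_RE.findall(text))
    ug_count = len(UYGHUR_RE.findall(text))
    if zh_count == 0 and ug_count == 0:
        return "unknown"
    return "zh" if zh_count >= ug_count else "ug"


if __name__ == "__main__":
    print(identify_language("应急广播系统"))
    print(identify_language("سالام"))
```

Counting matches rather than testing only the first character keeps the decision stable when a message mixes digits or punctuation with one dominant script.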
Real text often contains a large number of non-standard word strings, for example "2016" and "1" in "January 2016", "10000" in "10,000 metres", and likewise numbers such as "1000" and "15". These non-standard tokens consist mainly of Arabic numerals, English characters, and various symbols. During speech synthesis, the dates, telephone numbers, numeric values, and so on that are made up of such tokens need special handling; this handling is text normalization.
Take the normalization of Chinese as an example. To distinguish the reading rules for digits, a reading type is first assigned to each number. A telephone-type number, such as "101" read as "one zero one", has type P. A digit-string number, such as "145" read as "one four five", has type S. A numeric-value number, such as "165" read as "one hundred and sixty-five", has type N. In this notation, the date "January 1, 2016" is written as "S year S month S day"; "1,000,000" (one hundred ten-thousands) is written as "N ten-thousand"; "2016/1/1" is written as "S/S/S"; and "010-25124585" is written as "P-P". Numbers are normalized by constructing such primitive formulas dynamically. Special symbols must also be pronounced, so a phonetic expression formula is introduced that states the reading of the symbol directly. For example, "12.25%" has the primitive formula "N.N%" and the phonetic expression "percent N.N". Special symbols are normalized by constructing such phonetic expression formulas dynamically.
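The three reading types can be illustrated with a short sketch. Everything here is an assumption made for illustration: the function names, the limit of 0 to 9999 for value reading, the use of 幺 for the digit 1 in telephone-type reading (a common convention, not confirmed by the source), and reading the decimal part of a percentage digit by digit.

```python
import re

DIGITS = "零一二三四五六七八九"
UNITS = ["", "十", "百", "千"]


def read_digits(s: str) -> str:
    """Type S: read each digit separately, e.g. '145' as 一四五."""
    return "".join(DIGITS[int(ch)] for ch in s)


def read_phone(s: str) -> str:
    """Type P: digit by digit, with 1 read as 幺 (assumed convention)."""
    return read_digits(s).replace("一", "幺")


def read_number(s: str) -> str:
    """Type N: read as a value; this sketch only covers 0..9999."""
    n = int(s)
    if not 0 <= n <= 9999:
        raise ValueError("sketch only supports 0..9999")
    if n == 0:
        return DIGITS[0]
    digits = str(n)
    parts = []
    pending_zero = False
    for i, ch in enumerate(digits):
        d = int(ch)
        if d == 0:
            pending_zero = True      # emit a single 零 before the next non-zero digit
            continue
        if pending_zero:
            parts.append(DIGITS[0])
            pending_zero = False
        parts.append(DIGITS[d] + UNITS[len(digits) - 1 - i])
    result = "".join(parts)
    # 10..19 are read 十, 十一, ... rather than 一十, 一十一, ...
    return result[1:] if result.startswith("一十") else result


def normalize_percent(token: str) -> str:
    """Phonetic expression for 'N.N%': 百分之 + integer part + 点 + decimals."""
    m = re.fullmatch(r"(\d+)(?:\.(\d+))?%", token)
    if m is None:
        raise ValueError("not a percentage token")
    spoken = "百分之" + read_number(m.group(1))
    if m.group(2):
        spoken += "点" + read_digits(m.group(2))
    return spoken
```

A fuller normalizer would first classify each token (date, telephone number, value) and then pick the reading function from the token's primitive formula.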
Chinese speech synthesis uses a method based on a large-scale corpus. The advantage of large-corpus waveform concatenation is that it preserves the linguistic characteristics of the original speech as far as possible; for out-of-vocabulary words, the syllable is used as the basic concatenation unit. Units are selected from the corpus so that synthesis uses words and phrases that are as long as possible, which reduces the number of splice points in the synthesized sentence. This has two benefits. First, a considerable number of prosodic words and prosodic phrases are taken directly from the original recordings in the corpus, which keeps their natural quality, and the reduced number of splice points protects the overall naturalness of the synthesized speech. Second, out-of-vocabulary words can still be synthesized well.
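Preferring the longest available unit is what the forward maximum matching algorithm named in the scheme does. The sketch below is a generic version of that algorithm under assumed names (`forward_max_match`, `lexicon`); falling back to a single character for out-of-vocabulary text stands in for the syllable-level fallback described above.

```python
def forward_max_match(text, lexicon, max_len=None):
    """Segment text greedily from left to right, longest lexicon entry first.

    Characters that start no lexicon entry are emitted as single-character
    units, so out-of-vocabulary text still yields a complete segmentation.
    """
    if max_len is None:
        max_len = max((len(entry) for entry in lexicon), default=1)
    units = []
    i = 0
    while i < len(text):
        # Try the longest window first and shrink until a match is found.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                units.append(candidate)
                i += size
                break
    return units


if __name__ == "__main__":
    lexicon = {"应急", "广播", "应急广播", "系统"}
    print(forward_max_match("应急广播系统", lexicon))
```

The same function works for Uyghur if `lexicon` holds the syllable inventory instead of a word list.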
Further, in the above scheme, building the Uyghur corpus at the syllable level means that pronunciation rules inside a syllable do not have to be modeled during synthesis, which makes the speech within each syllable more natural. Including long syllables avoids, to some extent, liaison problems where syllables are joined directly and increases the naturalness of the speech between syllables. The Uyghur corpus contains about 6000 Uyghur syllables; apart from more than 2000 common syllables, the rest are long syllables. The raw speech in the corpus is about 0.72 GB in size.
Further, in the above scheme, the Chinese word and single-character corpus is built by collecting Internet dictionaries, association-analysis dictionaries, input-method dictionaries, and popular comprehensive lexicons from the network, then merging and deduplicating all of them to obtain a lexicon without repeated entries. The result contains more than 50,000 entries, mainly Chinese words, together with a suitable number of long phrases and more than 7000 single characters. The speech data corresponds one-to-one with the lexicon entries. The recordings use a female voice with a sampling rate of 8000 Hz and 16-bit quantization, are stored in WAV format, and total 1.2 GB.
Further, in the above scheme, to deal with the speech boundary problem in the corpus audio files, the corpus construction module performs speech endpoint detection: a general-purpose endpoint detection technique is used to mark the start and end of speech in each file.
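The source does not say which general-purpose endpoint detection technique is used. As one common possibility, the sketch below marks endpoints by short-time energy; the frame length of 160 samples (20 ms at the corpus sampling rate of 8000 Hz), the threshold value, and the function name are all assumptions for illustration.

```python
def detect_endpoints(samples, frame_len=160, threshold=1e6):
    """Return (start, end) sample indices of the speech region, or None.

    A frame counts as speech when its mean squared amplitude exceeds the
    threshold; the region runs from the first to the last such frame.
    """
    active_starts = []
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / len(frame)
        if energy > threshold:
            active_starts.append(start)
    if not active_starts:
        return None
    end = min(active_starts[-1] + frame_len, len(samples))
    return active_starts[0], end


if __name__ == "__main__":
    # Two silent frames, three loud frames, two silent frames.
    signal = [0] * 320 + [3000] * 480 + [0] * 320
    print(detect_endpoints(signal))
```

In practice the threshold would be tuned to the recording conditions, for example relative to the noise floor measured at the start of each file.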
Further, in the above scheme, the corpus construction module analyzes the speech corpus after it has been built. The analysis found a considerable amount of non-speech data in the recordings, so speech endpoint detection that combines Mel-frequency cepstral coefficients (MFCC) with a kNN classifier was then applied to the speech data.
Further, in the above scheme, the waveform concatenation synthesis module applies a smoothing algorithm during splicing: the amplitude of each syllable or word waveform is shaped to fade in and fade out, which prevents the click noise caused by discontinuities at the splice points. The synthesized speech is then optimized with strategies such as a prosody model and duration control.
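The fade-in/fade-out smoothing can be sketched as below. The linear ramp, the default fade length of 80 samples, and the function names are assumptions; the source only states that unit amplitudes are shaped to fade in and out before splicing.

```python
def fade_in_out(samples, fade_len):
    """Apply a linear fade-in and fade-out of fade_len samples to one unit."""
    n = len(samples)
    fade_len = min(fade_len, n // 2)   # never let the two ramps overlap
    out = list(samples)
    for i in range(fade_len):
        gain = i / fade_len            # 0.0 at the edge, rising toward 1.0
        out[i] = int(out[i] * gain)
        out[n - 1 - i] = int(out[n - 1 - i] * gain)
    return out


def splice(units, fade_len=80):
    """Concatenate units after fading each one, so every splice point meets at zero."""
    result = []
    for unit in units:
        result.extend(fade_in_out(unit, fade_len))
    return result


if __name__ == "__main__":
    print(fade_in_out([1000] * 10, 4))
```

Because each unit starts and ends at zero amplitude, adjacent units never meet at mismatched sample values, which is what produces the audible click in naive concatenation.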
Prosody model: more natural speech is obtained mainly by choosing among different syllable combinations while keeping syllable groups intact as far as possible. Within a word, the largest matching syllable combination is looked up first; if a recording of two syllables pronounced together (with liaison) exists, that two-syllable recording is used. Giving priority to the largest syllable unit in this way preserves the natural prosody of the speech.
For Chinese, splice points occur mainly between character and character, character and word, word and word, word and phrase, and sentence and sentence. They are classified into three kinds of gap. Gaps between sentences mainly cover half sentences and whole sentences delimited by commas, full stops, and similar punctuation. Gaps between words cover word and word, word and phrase, and word and character. Gaps between characters form the third kind. The gap between characters is smaller than the gap between words, and the gap between words is smaller than the gap between sentences. The gap for each kind of splice is obtained by repeated adjustment during splicing.
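The gap hierarchy can be illustrated by inserting silence of different lengths between units. The millisecond values below are hypothetical placeholders chosen only to respect the ordering character gap < word gap < sentence gap; the tuned values are not given in the text. The 8000 Hz rate is the corpus sampling rate stated earlier.

```python
SAMPLE_RATE = 8000  # Hz, the corpus sampling rate

# Hypothetical gap lengths in milliseconds; only their ordering comes from the text.
GAP_MS = {"char": 20, "word": 60, "sentence": 300}


def silence(kind):
    """Return a run of zero samples for the given gap kind."""
    return [0] * (SAMPLE_RATE * GAP_MS[kind] // 1000)


def join_with_gaps(units):
    """Join (samples, gap_kind_after) pairs; the gap after the last unit is dropped."""
    out = []
    for idx, (samples, gap_kind) in enumerate(units):
        out.extend(samples)
        if idx < len(units) - 1:
            out.extend(silence(gap_kind))
    return out


if __name__ == "__main__":
    audio = join_with_gaps([([1] * 10, "word"), ([2] * 10, "sentence")])
    print(len(audio))
```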
The Uyghur-Chinese bilingual GSM short-message voice conversion and broadcasting system works as follows. When the SIM900A receives a new short message, it sends a frame of AT indication to the STM32 controller over the UART to signal that a new message has arrived. The STM32 controller then sends a read command to the SIM900A, reads the unread message, and extracts the reception time, the sender's mobile number, the message content, and related information. At the same time it reads the configuration file on the SD card and checks whether the sender's number is an authorized number; if it is not, the system returns to the state of waiting for a new message. If it is, the controller uses AT commands to make the SIM900A call the sender back and prompt for a password. After answering the call, the sender enters the SMS push password as instructed by the voice prompt, and the STM32 controller judges whether the password is correct from the DTMF characters decoded via the SIM900A. If the password is verified, the STM32 controller sends the previously received message to the PC over the RS232 interface in a specific frame format, and the PC converts the message content into speech and broadcasts it.
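The decision logic of this flow can be sketched in a few lines. `+CMTI` and `AT+CMGR` are the standard GSM new-message indication and read command; the function names, the return labels, and the idea of passing the decoded DTMF digits in as a string are assumptions made for the example. The real firmware runs on the STM32, so this Python version only models the control flow.

```python
import re


def parse_new_sms_indication(line):
    """Parse an indication such as '+CMTI: "SM",3' and return the storage index."""
    m = re.match(r'\+CMTI:\s*"(\w+)",\s*(\d+)', line.strip())
    return int(m.group(2)) if m else None


def read_command(index):
    """Build the standard command that reads the message at a storage index."""
    return f"AT+CMGR={index}"


def decide(sender, authorized_numbers, entered_digits, password):
    """Model the authorization flow for one received message.

    Returns 'ignore' for an unauthorized sender (no callback is made),
    'reject' when the DTMF digits entered during the callback do not match,
    and 'forward_to_pc' when the message may be sent on for synthesis.
    """
    if sender not in authorized_numbers:
        return "ignore"
    # In the real system the callback and DTMF capture happen at this point.
    if entered_digits != password:
        return "reject"
    return "forward_to_pc"


if __name__ == "__main__":
    index = parse_new_sms_indication('+CMTI: "SM",3')
    print(read_command(index))
```

Checking the sender against the authorized list before calling back means unauthorized messages never trigger an outgoing call.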
The beneficial effects of the invention are as follows. By implementing SMS reception, text preprocessing, corpus construction, and waveform concatenation, the invention converts Uyghur and Chinese short messages into speech. It processes only messages from authorized mobile numbers, and the automatic callback mechanism and DTMF password verification mechanism ensure the legitimacy of the message sender. The system is secure, stable, and highly integrated. Working together with a PC, it converts emergency short messages that pass the security authentication into speech and broadcasts them to a designated area, which substantially reduces the deployment cost of emergency broadcasting while improving flexibility.
Brief description of the drawings
Fig. 1 is the flow chart of SMS data reception;
Fig. 2 is the flow chart of push verification;
Fig. 3 is the flow chart of the forward maximum matching algorithm;
Fig. 4 is a schematic diagram of speech endpoint detection;
Fig. 5 is the functional block diagram of the hardware system;
Fig. 6 is a schematic diagram of the housing structure of the conversion and broadcasting system of the invention.
Reference numerals: 1, base plate; 2, PCB protection board; 3, short side plate without interface; 4, copper pillar one; 5, PCB; 6, long side plate one; 7, display screen; 8, copper pillar two; 9, screen protection plate; 10, signal transmitting board; 11, RS232 connector; 12, panel; 13, short side plate with interface; 14, RS232 connector lock plate; 15, long side plate two; 16, SD memory card.
Detailed description of the embodiments
The present invention is described in further detail below with reference to the accompanying drawings.
A Uyghur-Chinese bilingual GSM short-message voice conversion and broadcasting system consists of a software system and a hardware system.
The software system is divided into four parts: an SMS receiving module, a text preprocessing module, a corpus construction module, and a waveform concatenation synthesis module.
SMS receiving module: responsible for receiving short messages (the reception flow is shown in Fig. 1) and obtaining the newest message text in real time. When a message is received, the system calls the sender back, decodes the sender's keypad input with an MT8870 DTMF decoder chip to obtain the entered password, and verifies it (the verification flow is shown in Fig. 2); the terminal controller then processes the pushed information in real time. This implements information reception based on SMS push. SMS reception is an open system in which the message source is not restricted, so in a real deployment the system will inevitably receive messages that are not push requests. A verification step is therefore added at the receiving end to authenticate push requests: the system calls back, waits for the user to enter a password, and verifies it. The main difficulty in this process is obtaining the password that the user enters on the dial keypad during the callback.
Text preprocessing module: first performs language identification to distinguish Chinese text from Uyghur text; then normalizes the Chinese and Uyghur text separately according to the normalization rules of each language; then segments the Chinese text into words using an existing dictionary and splits the Uyghur text into syllables using an existing syllable inventory, yielding the pronunciation units of the text.
Corpus construction module: builds the Chinese word and single-character corpus and the Uyghur syllable corpus.
Waveform concatenation synthesis module: for Chinese, selects the audio files corresponding to words and splices them; for Uyghur, selects the audio files corresponding to syllables and splices them.
The preprocessing module identifies the language from the position of each character in Unicode and so distinguishes Chinese from Uyghur text. From an engineering standpoint, it uses a rule-based approach to normalize the Chinese and Uyghur text. Chinese is segmented into words with the forward maximum matching algorithm (Fig. 3) against an existing dictionary, and Uyghur is likewise split with the forward maximum matching algorithm (Fig. 3) against an existing syllable inventory. Language identification covers spoken language identification, which distinguishes languages from audio files, and text language identification, which distinguishes languages from text.
Encoding detection is normally a precondition for language identification. Because the present invention uses UTF-8 throughout, the encoding detection step is avoided, and the work concentrates on identifying the minority language. For this we use regular-expression matching based on the specific position of each language's characters in Unicode.
Chinese speech synthesis uses a method based on a large-scale corpus. The advantage of large-corpus waveform concatenation is that it preserves the linguistic characteristics of the original speech as far as possible; for out-of-vocabulary words, the syllable is used as the basic concatenation unit. Units are selected from the corpus so that synthesis uses words and phrases that are as long as possible, which reduces the number of splice points in the synthesized sentence. This has two benefits. First, a considerable number of prosodic words and prosodic phrases are taken directly from the original recordings in the corpus, which keeps their natural quality, and the reduced number of splice points protects the overall naturalness of the synthesized speech. Second, out-of-vocabulary words can still be synthesized well.
Building the Uyghur corpus at the syllable level means that pronunciation rules inside a syllable do not have to be modeled during synthesis, which makes the speech within each syllable more natural. Including long syllables avoids, to some extent, liaison problems where syllables are joined directly and increases the naturalness of the speech between syllables. The Uyghur corpus contains about 6000 Uyghur syllables; apart from more than 2000 common syllables, the rest are long syllables. The raw speech in the corpus is about 0.72 GB in size. The Chinese word and single-character corpus is built by collecting Internet dictionaries, association-analysis dictionaries, input-method dictionaries, and popular comprehensive lexicons from the network, then merging and deduplicating all of them to obtain a lexicon without repeated entries. The result contains more than 50,000 entries, mainly Chinese words, together with a suitable number of long phrases and more than 7000 single characters. The speech data corresponds one-to-one with the lexicon entries. The recordings use a female voice with a sampling rate of 8000 Hz and 16-bit quantization, are stored in WAV format, and total 1.2 GB.
To deal with the speech boundary problem in the corpus audio files, the corpus construction module performs speech endpoint detection: a general-purpose endpoint detection technique is used to mark the start and end of speech in each file. Analysis of the completed speech corpus found a considerable amount of non-speech data in the recordings, so speech endpoint detection that combines Mel-frequency cepstral coefficients (MFCC) with a kNN classifier was then applied to the speech data (the cleaning principle is shown in Fig. 4).
During splicing, the waveform concatenation synthesis module applies a smoothing algorithm: the amplitude of each syllable or word waveform is shaped to fade in and fade out, which prevents the click noise caused by discontinuities at the splice points. The synthesized speech is then optimized with strategies such as a prosody model and duration control.
The prosody model obtains more natural speech mainly by choosing among different syllable combinations while keeping syllable groups intact as far as possible. Within a word, the largest matching syllable combination is looked up first; if a recording of two syllables pronounced together (with liaison) exists, that two-syllable recording is used. Giving priority to the largest syllable unit in this way preserves the natural prosody of the speech.
For Chinese, splice points occur mainly between character and character, character and word, word and word, word and phrase, and sentence and sentence. They are classified into three kinds of gap. Gaps between sentences mainly cover half sentences and whole sentences delimited by commas, full stops, and similar punctuation. Gaps between words cover word and word, word and phrase, and word and character. Gaps between characters form the third kind. The gap between characters is smaller than the gap between words, and the gap between words is smaller than the gap between sentences. The gap for each kind of splice is obtained by repeated adjustment during splicing; the final settings are given in Table 1.
Table 1: Gap settings
The hardware system (functional block diagram in Fig. 5) is as follows. An ARM processor with a Cortex-M3 core serves as the control unit. A SIM900A GSM module serves as the SMS receiving and voice transmission unit. A VS1003 audio decoder chip decodes MP3 audio files. An SD card is driven over SDIO and carries a FAT32 file system that stores configuration files and audio files. A 3-inch 400×240 TFT color LCD displays the running status of the system. An MX3232 serves as the driver chip of the RS232 interface circuit that connects the system to the PC. An RTC clock circuit provides an accurate real-time clock. An AMS1117 linear regulator steps the power adapter voltage down to the operating voltage of the ARM processor and the other chips. The embedded SMS broadcasting system can receive short messages from designated mobile phones and actively call back the authorized number; it identifies the authorized phone's password by dual-tone multi-frequency (DTMF) signaling and, once verification succeeds, sends the message to be announced to the PC for speech synthesis and broadcasting. The SIM900A is a compact GSM/GPRS wireless communication module in an SMT package, which makes it easy for customers to design with flexibly, and it is feature-rich.
The housing structure of the conversion and broadcasting system is shown in Fig. 6. It comprises a base plate 1, a PCB protection board 2, a short side plate without interface 3, copper pillar one 4, a PCB 5, long side plate one 6, a display screen 7, copper pillar two 8, a screen protection plate 9, a signal transmitting board 10, an RS232 connector 11, a panel 12, a short side plate with interface 13, an RS232 connector lock plate 14, long side plate two 15, and an SD memory card 16. The base plate 1, the panel 12, the short side plate without interface 3, the short side plate with interface 13, long side plate one 6, and long side plate two 15 form the six faces that enclose a cuboid. The PCB protection board 2, the PCB 5, the display screen 7, the signal transmitting board 10, the RS232 connector 11, and the SD memory card 16 are all located inside the cuboid. The PCB protection board 2 is located on the inner side of the base plate 1, and the signal transmitting board 10 is located on the surface of the PCB protection board 2. The PCB 5 is mounted above the signal transmitting board 10 by copper pillar one 4, and the display screen 7 is mounted on the PCB 5. There are four of copper pillar two 8, one at each corner of the signal transmitting board 10; one end of each is fixed to the signal transmitting board 10 and the other end bears against the panel 12. The connector lock plate 14 is located on the short side plate with interface 13, and the screen protection plate 9 is set into the panel 12. One end of the RS232 connector 11 is at the side of the PCB 5 and the other end is at the short side plate with interface 13. The SD memory card 16 is located on the signal transmitting board 10.
The working method of the Uyghur-Chinese bilingual GSM short message voice conversion broadcasting system is as follows. When the SIM900A receives a new short message, it sends a frame of AT indication to the STM32 controller over UART, signalling that a new short message has arrived. The STM32 controller then sends a read command to the SIM900A, reads the unread short message, and extracts the reception time, the sender's phone number, and the message content. It also reads the configuration file on the SD card and checks whether the sender's phone number is an authorized number; if it is not, the system returns to the state of waiting for a new short message. If the number is authorized, the controller uses AT commands to make the SIM900A call back the sender and prompt for a password. After answering the call, the sender enters the short message push password as directed by the voice prompt, and the STM32 controller judges whether the password is correct from the DTMF characters decoded by the SIM900A. If the password is verified, the STM32 controller sends the previously received short message to the PC over the RS232 communication interface in a specific frame format, where the message content is converted into speech and broadcast.
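The control flow of this working method can be sketched as a few small functions. This is a hedged illustration, not the controller firmware: it is written in Python rather than for the STM32, the function names are hypothetical, the behavior after a wrong password (returning to the waiting state) is an assumption, and the frame layout in `build_frame` is invented for the example because the text only says "a specific frame format". The `+CMTI` indication, `AT+CMGR`, and `ATD` are standard GSM AT commands.

```python
import re
from typing import Iterable, Optional

# Unsolicited new-message indication from the GSM modem, e.g. +CMTI: "SM",3
_CMTI = re.compile(r'^\+CMTI:\s*"(?P<mem>\w+)",\s*(?P<index>\d+)$')


def parse_new_sms_index(line: str) -> Optional[int]:
    """Return the storage index of a newly received message, or None."""
    match = _CMTI.match(line.strip())
    return int(match.group("index")) if match else None


def read_command(index: int) -> str:
    """AT+CMGR reads the message stored at `index`."""
    return f"AT+CMGR={index}\r"


def callback_command(number: str) -> str:
    """ATD<number>; starts a voice call to `number`."""
    return f"ATD{number};\r"


def next_step(
    sender: str,
    authorized: Iterable[str],
    entered_password: Optional[str] = None,
    expected_password: str = "",
) -> str:
    """Decide what the controller does next for a message from `sender`."""
    if sender not in set(authorized):
        return "wait_for_sms"      # unauthorized sender: ignore the message
    if entered_password is None:
        return "call_back"         # authorized: call back and ask for the password
    if entered_password != expected_password:
        return "wait_for_sms"      # assumption: a wrong password discards the request
    return "forward_to_pc"         # verified: send the message to the PC


def build_frame(text: str) -> bytes:
    """Hypothetical frame: STX, 2-byte big-endian length, UTF-8 payload, XOR checksum."""
    payload = text.encode("utf-8")
    checksum = 0
    for byte in payload:
        checksum ^= byte
    return b"\x02" + len(payload).to_bytes(2, "big") + payload + bytes([checksum])
```

For example, `next_step("200", {"200"})` returns `"call_back"`, and after a correct password the message text would be wrapped with `build_frame` before being written to the RS232 port.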
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments, or make equivalent replacements for some of the technical features, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the invention.

Claims (8)

1. A Uyghur-Chinese bilingual GSM short message voice conversion broadcasting system, characterized in that it comprises two components, a software system and a hardware system;
The software system is functionally divided into four modules: a short message reception module, a text preprocessing module, a corpus construction module, and a waveform concatenation synthesis module;
Short message reception module: responsible for receiving short messages and obtaining the latest short message text in real time; when a short message is received, a callback is made, the keypad input is DTMF-decoded by an MT8870 chip to obtain the entered password, the password is verified, and the terminal then controls the real-time processing of the pushed information;
Text preprocessing module: first performs language identification to distinguish Chinese text from Uyghur text; then, based on the normalization rules for Chinese and Uyghur, normalizes the Chinese and Uyghur text respectively; segments Chinese into words according to a dictionary and splits Uyghur according to an existing syllable library, obtaining the pronunciation units of the text;
Corpus construction module: builds a Chinese word and single-character corpus, and builds a Uyghur syllable corpus;
Waveform concatenation synthesis module: for Chinese, selects the audio files corresponding to words and concatenates them; for Uyghur, selects the audio files corresponding to syllables and concatenates them;
The hardware system uses an ARM processor with a Cortex-M3 core as the control unit and a SIM900A GSM module as the short message reception and voice transmission unit; a VS1003 audio decoder chip decodes MP3 audio files; an SD card, driven over SDIO and formatted with a FAT32 file system, stores the configuration files and audio files; a 3-inch 400×240 color TFT LCD displays each running state of the system; an MX3232 serves as the driver chip of the RS232 interface circuit and provides the communication link between the system and the PC; an RTC clock circuit provides the system with an accurate real-time clock; an AMS1117 linear voltage regulator steps the power adapter voltage down to the operating voltage of the ARM processor and the other chips; the embedded short message broadcasting system can receive short messages from designated mobile phones and can actively call back the authorized phone number, identifies the authorized phone's password by dual-tone multi-frequency (DTMF) recognition, and, once verification succeeds, sends the short message to be broadcast to the PC for speech synthesis and broadcasting.
2. The text preprocessing module according to claim 1, characterized in that the text preprocessing module identifies the language from the position of each character in Unicode, distinguishing Chinese information from Uyghur information; from an engineering and technical standpoint, it applies rule-based normalization to Chinese and Uyghur text; Chinese is segmented with the forward maximum matching algorithm according to an existing dictionary, and Uyghur is likewise split with the forward maximum matching algorithm according to an existing syllable library; language identification includes speech language identification, i.e. distinguishing the language from a voice file, and text language identification, i.e. distinguishing the language from the text.
3. The Uyghur-Chinese bilingual GSM short message voice conversion broadcasting system according to claim 1, characterized in that building the Uyghur syllable corpus makes it possible, during speech synthesis, to shield the pronunciation rules inside a syllable, so that the speech within a syllable is more natural; by including long syllables, the artifacts of joining syllables directly can be avoided to some extent, increasing the naturalness of the speech between syllables.
4. The Uyghur-Chinese bilingual GSM short message voice conversion broadcasting system according to claim 1, characterized in that the Chinese word and single-character corpus is built by collecting input-method dictionaries from the Internet dictionary and the association analysis dictionary, together with popular comprehensive vocabulary libraries on the network, and then de-duplicating and merging all of the dictionaries to obtain a lexicon with no repeated entries.
5. The Uyghur-Chinese bilingual GSM short message voice conversion broadcasting system according to claim 1, characterized in that, to address the speech boundary problem in the voice files of the corpus, the corpus construction module performs speech endpoint detection and marks the speech endpoints.
6. The Uyghur-Chinese bilingual GSM short message voice conversion broadcasting system according to claim 1, characterized in that the corpus construction module performs speech endpoint detection on the speech data by combining Mel-frequency cepstral coefficients (Mel-scale Frequency Cepstral Coefficients) with the kNN classification algorithm.
7. The Uyghur-Chinese bilingual GSM short message voice conversion broadcasting system according to claim 1, characterized in that, during concatenation, the waveform concatenation synthesis module uses a smoothing algorithm to process the amplitude of each syllable or word audio waveform into a fade-in and fade-out; the synthesized speech is then optimized using strategies such as a prosody model and duration control.
8. The Uyghur-Chinese bilingual GSM short message voice conversion broadcasting system according to claims 1 to 7, characterized in that the working method is: when the SIM900A receives a new short message, it sends a frame of AT indication to the STM32 controller over UART, signalling that a new short message has arrived; the STM32 controller then sends a read command to the SIM900A, reads the unread short message, and extracts the reception time, the sender's phone number, and the message content, and also reads the configuration file on the SD card to check whether the sender's phone number is an authorized number, returning to the state of waiting for a new short message if it is not; the controller then uses AT commands to make the SIM900A call back the sender and prompt for a password; after answering the call, the sender enters the short message push password as directed by the voice prompt, and the STM32 controller judges whether the password is correct from the DTMF characters decoded by the SIM900A; if the password is verified, the STM32 controller sends the previously received short message to the PC over the RS232 communication interface in a specific frame format, where the message content is converted into speech and broadcast.
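The text preprocessing and waveform concatenation steps named in the claims can be illustrated with a short Python sketch: language identification from Unicode code point ranges, forward maximum matching against a lexicon, and a linear fade-in and fade-out applied to each unit before joining. All function names are hypothetical, the tiny lexicon in the usage note is invented, and the linear fade is only one possible smoothing choice. The Arabic-script ranges cannot tell Uyghur apart from other languages written in that script, so the sketch assumes the input contains only Chinese and Uyghur.

```python
from itertools import groupby
from typing import List, Sequence, Set, Tuple


def classify_char(ch: str) -> str:
    """Label a character by Unicode block: 'zh', 'ug', or 'other'."""
    code = ord(ch)
    if 0x4E00 <= code <= 0x9FFF:  # CJK Unified Ideographs
        return "zh"
    if (0x0600 <= code <= 0x06FF          # Arabic
            or 0xFB50 <= code <= 0xFDFF   # Arabic Presentation Forms-A
            or 0xFE70 <= code <= 0xFEFF): # Arabic Presentation Forms-B
        return "ug"
    return "other"


def split_by_language(text: str) -> List[Tuple[str, str]]:
    """Group consecutive characters of the same language; drop 'other' runs."""
    runs = [(lang, "".join(chars)) for lang, chars in groupby(text, key=classify_char)]
    return [(lang, run) for lang, run in runs if lang != "other"]


def forward_max_match(text: str, lexicon: Set[str], max_len: int) -> List[str]:
    """At each position take the longest lexicon entry, else a single character."""
    units: List[str] = []
    pos = 0
    while pos < len(text):
        for size in range(min(max_len, len(text) - pos), 0, -1):
            piece = text[pos:pos + size]
            if size == 1 or piece in lexicon:
                units.append(piece)
                pos += size
                break
    return units


def fade_and_join(waveforms: Sequence[Sequence[float]], fade_len: int) -> List[float]:
    """Apply a linear fade-in and fade-out to each unit, then concatenate."""
    joined: List[float] = []
    for wave in waveforms:
        samples = list(wave)
        n = min(fade_len, len(samples) // 2)
        for i in range(n):
            gain = i / n
            samples[i] *= gain        # fade in from the start
            samples[-1 - i] *= gain   # fade out toward the end
        joined.extend(samples)
    return joined
```

With the invented lexicon `{"短信", "短信息", "语音", "播报"}` and `max_len=3`, `forward_max_match("短信息语音播报", ...)` yields `["短信息", "语音", "播报"]`; each unit would then be mapped to its audio samples and passed to `fade_and_join`.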

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611044873.8A CN106507321A (en) 2016-11-22 2016-11-22 A Uyghur-Chinese bilingual GSM short message voice conversion broadcasting system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611044873.8A CN106507321A (en) 2016-11-22 2016-11-22 A Uyghur-Chinese bilingual GSM short message voice conversion broadcasting system

Publications (1)

Publication Number Publication Date
CN106507321A (en) 2017-03-15

Family

ID=58328489

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611044873.8A Pending CN106507321A (en) 2016-11-22 2016-11-22 A Uyghur-Chinese bilingual GSM short message voice conversion broadcasting system

Country Status (1)

Country Link
CN (1) CN106507321A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101082908A (en) * 2007-06-26 2007-12-05 腾讯科技(深圳)有限公司 Method and system for dividing Chinese sentences
CN103165126A (en) * 2011-12-15 2013-06-19 无锡中星微电子有限公司 Method for voice playing of mobile phone text short messages
CN103164396A (en) * 2011-12-19 2013-06-19 新疆新能信息通信有限责任公司 Chinese-Uygur language-Kazakh-Kirgiz language electronic dictionary and automatic translating Chinese-Uygur language-Kazakh-Kirgiz language method thereof
CN103164398A (en) * 2011-12-19 2013-06-19 新疆新能信息通信有限责任公司 Chinese-Uygur language electronic dictionary and automatic translating Chinese-Uygur language method thereof

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
白涛等: "基于词典和全切分的中文农业网页分词算法的研究", 《新疆农业大学学报》 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107680579A (en) * 2017-09-29 2018-02-09 百度在线网络技术(北京)有限公司 Text regularization model training method and device, text regularization method and device
CN107680579B (en) * 2017-09-29 2020-08-14 百度在线网络技术(北京)有限公司 Text regularization model training method and device, and text regularization method and device
CN109031474A (en) * 2018-08-31 2018-12-18 成都润联科技开发有限公司 A kind of weather information hiding Chinese phonetic broadcasting terminals and its working method based on Beidou satellite communication
WO2021134591A1 (en) * 2019-12-31 2021-07-08 深圳市优必选科技股份有限公司 Speech synthesis method, speech synthesis apparatus, smart terminal and storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170315