CN105427855A - Voice broadcast system and voice broadcast method of intelligent software - Google Patents

Publication number
CN105427855A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510757022.7A
Other languages
Chinese (zh)
Inventor
王程程
刘青松
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Unisound Shanghai Intelligent Technology Co Ltd
Original Assignee
SHANGHAI YUZHIYI INFORMATION TECHNOLOGY Co Ltd
Application filed by SHANGHAI YUZHIYI INFORMATION TECHNOLOGY Co Ltd
Priority to CN201510757022.7A
Publication of CN105427855A

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00: Speech synthesis; Text to speech systems
    • G10L13/02: Methods for producing synthetic speech; Speech synthesisers
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04M: TELEPHONIC COMMUNICATION
    • H04M1/00: Substation equipment, e.g. for use by subscribers
    • H04M1/72: Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724: User interfaces specially adapted for cordless or mobile telephones
    • H04M1/72403: User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04M: TELEPHONIC COMMUNICATION
    • H04M1/00: Substation equipment, e.g. for use by subscribers
    • H04M1/72: Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724: User interfaces specially adapted for cordless or mobile telephones
    • H04M1/72475: User interfaces specially adapted for cordless or mobile telephones specially adapted for disabled users
    • H04M1/72481: User interfaces specially adapted for cordless or mobile telephones specially adapted for disabled users for visually impaired users

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses a voice broadcast system and a voice broadcast method of intelligent software. The voice broadcast system comprises a character information acquisition module for acquiring character information; a text front-end processing module, connected with the character information acquisition module, for converting the character information into text information with a specific reading; a model storage module for building and storing a sound model; a voice synthesis module connected with the text front-end processing module and the model storage module; and a voice playback module, connected with the voice synthesis module, for playing voice files. The voice synthesis module calls the sound model, obtains the acoustic parameters corresponding to the text information according to the sound model and decision tree prediction, performs voice synthesis on the acoustic parameters, and outputs the synthesized voice file. Technologies such as text processing, parametric modeling and voice synthesis are used in combination to provide a text broadcast function on smartphones and tablet computers and to broadcast text in a specific timbre.

Description

Voice broadcast system and voice broadcast method of intelligent software
Technical field
The present invention relates to the field of voice broadcasting, and in particular to a voice broadcast system and a voice broadcast method of intelligent software.
Background
As health levels rise and life expectancy lengthens, the elderly make up a growing share of the population, and the idea of healthy ageing in China is receiving increasing attention from the international community. The United Nations has proposed healthy ageing as a global goal for addressing the problem of population ageing. In China, a country that values etiquette, respect and care for the elderly have long been deeply rooted in people's minds. According to China's sixth national census in 2010, people aged 60 and above account for 13.26% of the population. As population ageing accelerates, how to create a comfortable old age for the elderly, together with their health and cultural life, is drawing more and more attention, and product design, marketing and similar fields will follow this trend.
According to a professional survey of middle-aged and elderly people, 85.59% of those in this group under 60 years of age own a mobile phone, but 62.38% of these phones have been in use for more than two years. According to another survey, smartphones and tablet computers make up the majority of the gifts that children give to their elderly parents, because the large display of a tablet computer goes a long way toward solving the difficulty the elderly have in reading text.
Starting from the physiological functions and living habits of the elderly, we carried out a large number of in-depth interviews and surveys and found that, for a large proportion of the elderly, cultural life is related to the stock market. After retirement, many of them have plenty of time to watch the market on trading days and are more focused than some young people; trading also brings them a sense of accomplishment and of still being useful and provided for in old age. The greatest difficulty the elderly face in stock trading, compared with young people, is that densely packed numbers and transaction codes are a great challenge for people with failing eyesight. This can lead to "fat-finger" errors, or to mistyping an amount and missing the current trading price, which very easily causes economic loss and emotional distress.
In summary, it is very necessary to build stock voice broadcast software for the elderly based on the Android or iOS platform.
Summary of the invention
The technical problem to be solved by the present invention is to provide a voice broadcast system and a voice broadcast method of intelligent software that can be applied to stock software based on the Android or iOS platform. To address the problem that the elderly cannot see stock figures clearly, it broadcasts a voice prompt and confirmation for each operation and can broadcast the current stock market overview in real time.
To achieve the above technical effect, the present invention discloses a voice broadcast system of intelligent software, comprising:
A character information acquisition module, for acquiring character information in the intelligent software;
A text front-end processing module, connected with the character information acquisition module, for converting the acquired character information into text information with a specific reading;
A model storage module, for building and storing a sound model;
A voice synthesis module, connected with the text front-end processing module and the model storage module, for calling the sound model stored by the model storage module, obtaining, according to the sound model and decision tree prediction, the acoustic parameters corresponding to the text information sent by the text front-end processing module, performing voice synthesis on the acoustic parameters, and outputting the synthesized voice file; and
A voice playback module, connected with the voice synthesis module, for playing the voice file.
In a further improvement of the voice broadcast system of intelligent software, the character information acquisition module is communicatively connected with an intelligent broadcast client, and the intelligent broadcast client is a plug-in inserted into the intelligent software to acquire character information.
In a further improvement of the voice broadcast system of intelligent software, the text front-end processing module comprises:
A regularization rule setting unit, connected with the character information acquisition module, for regularizing the acquired character information based on specific rules; and
A text conversion and annotation unit, connected with the regularization rule setting unit, for annotating the regularized character information and converting it, through annotation, into text information with a specific reading.
In a further improvement of the voice broadcast system of intelligent software, the model storage module comprises:
A voice annotation front-end processing unit, for collecting a sound data source and performing voice annotation front-end processing on the collected sound data source to obtain text annotation information;
A feature extraction unit, connected with the voice annotation front-end processing unit, for extracting the acoustic features, namely fundamental frequency and spectrum, of the text annotation information;
A training unit, connected with the feature extraction unit, for forming the sound model of the acoustic features through parameter clustering and training based on a hidden Markov model; and
A model storage unit, connected with the training unit, for storing the sound model.
In a further improvement of the voice broadcast system of intelligent software, the voice synthesis module comprises:
An annotation storage unit, connected with the text front-end processing module, for performing part-of-speech analysis and prosody prediction on the text information sent by the text front-end processing module;
A parameter prediction unit, connected with the annotation storage unit and the model storage module, for calling the sound model stored by the model storage module and obtaining, according to the sound model and decision tree prediction, the acoustic parameters corresponding to the text information that has undergone part-of-speech analysis and prosody prediction; and
A synthesizer speech synthesis unit, connected with the parameter prediction unit, for sending the acoustic parameters to a parametric synthesizer for voice synthesis and outputting the synthesized voice file.
The present invention also discloses a voice broadcast method of intelligent software, comprising:
Acquiring character information in the intelligent software;
Converting the acquired character information into text information with a specific reading;
Building and storing a sound model;
Calling the stored sound model, obtaining the acoustic parameters corresponding to the text information according to the sound model and decision tree prediction, performing voice synthesis on the acoustic parameters, and outputting the synthesized voice file; and
Playing the voice file.
In a further improvement of the voice broadcast method of intelligent software, acquiring character information comprises: inserting into the intelligent software an intelligent broadcast client for acquiring character information.
In a further improvement of the voice broadcast method of intelligent software, converting the acquired character information into text information with a specific reading comprises:
Regularizing the acquired character information based on specific rules; and
Annotating the regularized character information and converting it, through annotation, into text information with a specific reading.
In a further improvement of the voice broadcast method of intelligent software, building and storing a sound model comprises:
Collecting a sound data source and performing voice annotation front-end processing on the collected sound data source to obtain text annotation information;
Extracting the acoustic features, namely fundamental frequency and spectrum, of the text annotation information;
Forming the sound model of the acoustic features through parameter clustering and training based on a hidden Markov model; and
Storing the sound model.
In a further improvement of the voice broadcast method of intelligent software, calling the stored sound model, obtaining the acoustic parameters corresponding to the text information according to the sound model and decision tree prediction, performing voice synthesis on the acoustic parameters, and outputting the synthesized voice file comprises:
Performing part-of-speech analysis and prosody prediction on the text information;
Calling the stored sound model and obtaining, according to the sound model and decision tree prediction, the acoustic parameters corresponding to the text information that has undergone part-of-speech analysis and prosody prediction; and
Sending the acoustic parameters to a parametric synthesizer for voice synthesis and outputting the synthesized voice file.
By adopting the above technical solution, the present invention has the following beneficial effects:
The invention combines technologies such as text processing, parametric modeling and voice synthesis to provide a comprehensive voice broadcast system of intelligent software. The intelligent broadcast client inserted into the intelligent software acquires the character information that the user needs broadcast. The text front-end processing module then applies special processing according to the text rules of different fields, yielding text information with a specific reading suited to each field. The model storage module builds and stores sound models with specific timbres for the voice synthesis module to call. The voice synthesis module then calls a sound model of a specific timbre and synthesizes the text information in that timbre, producing a text broadcast in the specific timbre. This lets the user listen to a broadcast instead of merely reading, and operate only after hearing the broadcast information, which avoids misoperation and makes operation accurate and convenient. In addition, the sound model in the model storage module can be replaced at any time, so that both the broadcast text and the pronunciation timbre can be adjusted at any time: when a new alert scenario requires updated broadcast text, or when the user wants to switch to the voice of the latest internet celebrity, the adjustment can be made at once, which is very convenient, saves cost and makes listening more enjoyable.
Brief description of the drawings
Fig. 1 is a functional block diagram of the voice broadcast system of intelligent software of the present invention.
Fig. 2 is a flow chart of the voice broadcast method of intelligent software of the present invention.
Detailed description of the embodiments
The present invention is described in further detail below with reference to the drawings and specific embodiments.
Referring to Fig. 1, the voice broadcast system of intelligent software of the present invention mainly consists of a character information acquisition module 11, a text front-end processing module 12, a model storage module 13, a voice synthesis module 14 and a voice playback module 15.
The character information acquisition module 11 is used for acquiring character information. It is communicatively connected with an intelligent broadcast client 111, which can generally be inserted as a plug-in into intelligent software based on the Android or iOS platform, for example stock software (such as a brokerage's client, Tonghuashun or Dazhihui), to acquire character information and provide a text broadcast function on smartphones and tablets. When the user needs voice broadcast, the intelligent broadcast client 111 can be started; it is responsible for acquiring the character information that the user needs broadcast, for example stock-related text. To address the problem that the elderly cannot see stock figures clearly, it can broadcast a voice prompt and confirmation for each operation and can broadcast the current stock market overview in real time. Because the intelligent broadcast client 111 is placed in the stock software as a plug-in, the user can freely choose whether to broadcast by tapping a switch, which is practical and unobtrusive.
The text front-end processing module 12 is connected with the character information acquisition module 11 and converts the character information acquired by that module into text information with a specific reading. Stock text, for example, needs special processing: in the stock context, "+" must be read as "rise", "-" must be read as "fall", an index value of "3542" must be read as "three thousand five hundred and forty-two points", and so on. The acquired character information must be specially processed so that it follows the readings specific to stocks, that is, so that stock semantics are parsed. Specifically, the text front-end processing module 12 comprises a regularization rule setting unit 121 and a text conversion and annotation unit 122. The regularization rule setting unit 121 is connected with the character information acquisition module 11 and regularizes the acquired character information based on specific rules. For example, based on rules such as reading "." as "point" and "%" as "percent", "1.2%" is regularized to "one point two percent", and the regularized character information, such as "one point two percent", is then output. The text conversion and annotation unit 122 is connected with the regularization rule setting unit 121. It receives the regularized character information output by unit 121 and annotates it, for example annotating "one point two percent" as "baifenzhiyidianer", followed by further phone-level, part-of-speech and prosodic annotation. Through annotation the information is converted into text information with a specific reading, which is passed to the next unit.
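The sign and index rules mentioned above can be sketched as a small rule table. This is only an illustration: the function names are hypothetical, and the Chinese readings (涨 for "+", 跌 for "-", 点 for "points") are assumed back-translations of the readings named in the description, not text taken from the patent.

```python
# Hypothetical rule-based normalizer for stock text (illustrative only).
DIGITS = "零一二三四五六七八九"
UNITS = ["", "十", "百", "千"]

# Assumed spoken readings for the signs that prefix a price change.
SIGN_READINGS = {"+": "涨", "-": "跌"}


def read_integer(n: int) -> str:
    """Read an integer in 0..9999 aloud in Chinese, e.g. 3542 -> 三千五百四十二."""
    if not 0 <= n <= 9999:
        raise ValueError("this sketch only covers 0..9999")
    if n == 0:
        return DIGITS[0]
    digits = [int(ch) for ch in str(n)]
    parts = []
    pending_zero = False
    for pos, d in enumerate(digits):
        if d == 0:
            # Remember the gap; a single 零 is emitted before the next non-zero digit.
            pending_zero = True
            continue
        if pending_zero and parts:
            parts.append(DIGITS[0])
        pending_zero = False
        parts.append(DIGITS[d] + UNITS[len(digits) - 1 - pos])
    return "".join(parts)


def normalize_index_token(token: str) -> str:
    """Turn '3542', '+35' or '-12' into a spoken form ending in 点 (points)."""
    sign = ""
    if token[0] in SIGN_READINGS:
        sign, token = SIGN_READINGS[token[0]], token[1:]
    return sign + read_integer(int(token)) + "点"
```

A real front end would need many more rules (decimals, stock codes read digit by digit, currency amounts); the point here is only that each rule is a lookup applied before annotation.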
The model storage module 13 builds and stores sound models, which is a vital step of the present invention. Through the model storage module 13, sound models of announcers with different timbres can be built and stored (for example an announcer with the timbre of the robot WALL-E, the timbre of cartoon characters such as RNB or Chibi Maruko-chan, or the voice of a celebrity currently popular online). This provides pre-trained speaker sound models for subsequent voice synthesis, available for the voice synthesis module 14 to call at any time so as to broadcast text in a specific timbre. Specifically, the model storage module 13 comprises a voice annotation front-end processing unit 131, a feature extraction unit 132, a training unit 133 and a model storage unit 134. The voice annotation front-end processing unit 131 collects 2 to 3 hours of speech from one or more announcers as the sound data source and performs voice annotation front-end processing on it to obtain the text annotation information of the sound data source. The feature extraction unit 132 is connected with the voice annotation front-end processing unit 131 and extracts the acoustic features, namely fundamental frequency and spectrum, of the text annotation information. The training unit 133 is connected with the feature extraction unit 132 and forms the sound model of the extracted acoustic features through parameter clustering and training based on a hidden Markov model (HMM). The model storage unit 134 is connected with the training unit 133 and stores, offline, the sound models of announcers with various timbres. Once the model storage module 13 has built and stored the sound models of announcers with various timbres, the sound model of the relevant announcer can be called when a synthesis request arrives and voice synthesis can be performed, thereby achieving voice broadcast.
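The store-then-call relationship between the model storage unit and the voice synthesis module can be pictured as a keyed registry. The sketch below is an assumption about one possible shape, not the patent's implementation; `SoundModel`, `ModelStore` and their fields are hypothetical names, and the trained parameters are represented by a plain dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class SoundModel:
    """Placeholder for a trained announcer model; the fields are illustrative."""
    announcer: str
    parameters: dict = field(default_factory=dict)


class ModelStore:
    """Keeps one sound model per announcer and hands it out on request."""

    def __init__(self) -> None:
        self._models = {}

    def store(self, model: SoundModel) -> None:
        # Storing under an existing name replaces the old model,
        # which is how a timbre could be swapped at any time.
        self._models[model.announcer] = model

    def call(self, announcer: str) -> SoundModel:
        try:
            return self._models[announcer]
        except KeyError:
            raise LookupError(f"no sound model stored for {announcer!r}") from None

    def announcers(self) -> list:
        return sorted(self._models)
```

Keying the registry by announcer keeps the synthesis side unaware of how a model was trained; it only asks for a timbre by name.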
The voice synthesis module 14 is the core technology of the present invention and the module that runs through the whole system. It is connected with both the text front-end processing module 12 and the model storage module 13. It calls the sound model stored by the model storage module 13, obtains, according to this sound model and decision tree prediction, the acoustic parameters corresponding to the text information sent by the text front-end processing module 12, performs voice synthesis on the acoustic parameters, and outputs the synthesized voice file. Voice synthesis, also known as text-to-speech (TTS) technology, can convert arbitrary character information into standard, fluent speech in real time, which is equivalent to fitting the machine with an artificial mouth. It draws on multiple disciplines such as acoustics, linguistics, digital signal processing and computer science, and is a cutting-edge technology in the field of Chinese information processing. The main problem it solves is how to convert character information into audible sound, that is, how to let a machine speak like a person.
Specifically, the voice synthesis module 14 comprises an annotation storage unit 141, a parameter prediction unit 142 and a synthesizer speech synthesis unit 143. The annotation storage unit 141 is connected with the text conversion and annotation unit 122 of the text front-end processing module 12 and performs part-of-speech analysis and prosody prediction on the text information sent by unit 122, such as "the broad market rose thirty-five point six points today". The parameter prediction unit 142 is connected with the annotation storage unit 141 and with the model storage unit 134 of the model storage module 13. It sends a synthesis request to the model storage unit 134 and calls the pre-trained sound model of a particular announcer stored there (which may be an announcer with the timbre of the robot WALL-E, the timbre of cartoon characters such as RNB or Chibi Maruko-chan, or the voice of a celebrity currently popular online), then obtains, according to this sound model and decision tree prediction, the acoustic parameters corresponding to the text information that has undergone part-of-speech analysis and prosody prediction. A decision tree is a decision analysis method that, given the known probabilities of various situations, builds a tree to obtain the probability that the expected net present value is greater than or equal to zero, so as to assess project risk and judge feasibility; it is an intuitive graphical method that applies probability analysis. The synthesizer speech synthesis unit 143 is connected with the parameter prediction unit 142 and sends the acoustic parameters predicted by unit 142 to a parametric synthesizer for voice synthesis, outputting the synthesized voice file, such as the sound of "the broad market rose 35.6 points today".
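The description does not spell out how decision tree prediction maps annotated text to acoustic parameters. As a minimal sketch, assume a binary tree of yes/no questions about the annotated context whose leaves hold parameter values. The questions, context keys (`tone`, `phrase_final`) and numbers below are all hypothetical and chosen only to make the traversal concrete.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Union

Context = Dict[str, object]


@dataclass
class Leaf:
    """Acoustic parameters attached to one cluster of contexts (made-up values)."""
    f0_hz: float
    duration_ms: float


@dataclass
class Node:
    """A yes/no question about the annotated context."""
    question: Callable[[Context], bool]
    yes: Union["Node", Leaf]
    no: Union["Node", Leaf]


def predict(tree: Union[Node, Leaf], context: Context) -> Leaf:
    """Walk from the root to a leaf by answering each question for this context."""
    node = tree
    while isinstance(node, Node):
        node = node.yes if node.question(context) else node.no
    return node


# Hypothetical tree: first ask about the tone, then about phrase position.
TREE = Node(
    question=lambda c: c["tone"] == 4,
    yes=Leaf(f0_hz=180.0, duration_ms=160.0),
    no=Node(
        question=lambda c: bool(c["phrase_final"]),
        yes=Leaf(f0_hz=140.0, duration_ms=220.0),
        no=Leaf(f0_hz=160.0, duration_ms=150.0),
    ),
)
```

Calling `predict(TREE, {"tone": 2, "phrase_final": True})` walks the "no" branch and then the "yes" branch, returning the leaf for phrase-final syllables; a parametric synthesizer would consume a sequence of such leaves.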
The voice playback module 15 is connected with the synthesizer speech synthesis unit 143 of the voice synthesis module 14 and plays the synthesized voice file, that is, the sound of "the broad market rose 35.6 points today". This completes the whole process of broadcasting text in a specific timbre.
The present invention combines technologies such as text processing, parametric modeling and voice synthesis to give the elderly a comprehensive stock broadcast solution. The intelligent broadcast client inserted into the stock software acquires the character information that the user needs broadcast. The text front-end processing module then applies special processing to the stock text, yielding text information with a specific reading suited to stocks. The model storage module builds and stores sound models with specific timbres for the voice synthesis module to call. The voice synthesis module then calls a sound model of a specific timbre and synthesizes the text information in that timbre, producing a text broadcast in the specific timbre. This lets the user listen to a broadcast instead of merely reading, and operate only after hearing the broadcast information, which avoids misoperation and makes operation accurate and convenient. In addition, the sound model in the model storage module can be replaced at any time, so that both the broadcast text and the pronunciation timbre can be adjusted at any time: when a new alert scenario requires updated broadcast text, or when the user wants to switch to the voice of the latest internet celebrity, the adjustment can be made at once, which is very convenient, saves cost and makes listening more enjoyable.
Referring to Fig. 2, voice broadcast using the voice broadcast system of the present invention mainly comprises the following steps:
S001: acquire character information in the intelligent software;
S002: convert the acquired character information into text information with a specific reading;
S003: build and store a sound model;
S004: call the stored sound model, obtain the acoustic parameters corresponding to the text information according to the sound model and decision tree prediction, perform voice synthesis on the acoustic parameters, and output the synthesized voice file; and
S005: play the voice file.
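The five steps chain together as a simple pipeline. The sketch below only shows the data flow between them; every function name and signature is hypothetical, and each stage is passed in as a callable so that the real acquisition, normalization, model and synthesizer components can be substituted.

```python
from typing import Callable


def broadcast(
    acquire: Callable[[], str],
    normalize: Callable[[str], str],
    load_model: Callable[[], object],
    synthesize: Callable[[str, object], bytes],
    play: Callable[[bytes], None],
) -> bytes:
    """Run steps S001 to S005 in order and return the synthesized voice data."""
    characters = acquire()                # S001: acquire character information
    text = normalize(characters)          # S002: convert to text with a specific reading
    model = load_model()                  # S003: sound model, built and stored beforehand
    voice_file = synthesize(text, model)  # S004: predict parameters and synthesize
    play(voice_file)                      # S005: play the voice file
    return voice_file
```

In this arrangement S003 is the only step that can run ahead of time: the model is trained and stored once, and `load_model` merely fetches it when a synthesis request arrives.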
Step S001, acquiring character information, comprises: inserting into the intelligent software an intelligent broadcast client for acquiring character information.
The intelligent broadcast client can generally be inserted as a plug-in into intelligent software based on the Android or iOS platform, for example stock software (such as a brokerage's client, Tonghuashun or Dazhihui), to acquire character information and provide a text broadcast function on smartphones and tablets. When the user needs voice broadcast, the intelligent broadcast client can be started; it is responsible for acquiring the character information that the user needs broadcast, for example stock-related text. To address the problem that the elderly cannot see stock figures clearly, the present invention can broadcast a voice prompt and confirmation for each operation and can broadcast the current stock market overview in real time. Because the intelligent broadcast client is placed in the stock software as a plug-in, the user can freely choose whether to broadcast by tapping a switch, which is practical and unobtrusive.
Step S002: convert the acquired character information into text information with a specific reading. Stock text, for example, needs special processing: in the stock context, "+" must be read as "rise", "-" must be read as "fall", an index value of "3542" must be read as "three thousand five hundred and forty-two points", and so on. The acquired character information must be specially processed so that it follows the readings specific to stocks, that is, so that stock semantics are parsed. This specifically comprises:
First, regularizing the acquired character information based on specific rules. For example, based on rules such as reading "." as "point" and "%" as "percent", "1.2%" is regularized to "one point two percent", and the regularized character information, such as "one point two percent", is then output;
Then, annotating the regularized character information, for example annotating "one point two percent" as "baifenzhiyidianer", followed by further phone-level, part-of-speech and prosodic annotation, so that through annotation it is converted into text information with a specific reading.
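The "1.2%" example can be traced end to end with two small lookups: one that regularizes the token into its spoken Chinese form and one that annotates that form with pinyin. This is a sketch under stated assumptions: the function names are hypothetical, the spoken form 百分之一点二 is inferred from the pinyin "baifenzhiyidianer" given in the description, and the digit-by-digit reading is only correct when the integer part has a single digit.

```python
# Hypothetical lookup tables for the percent example (illustrative only).
HANZI_DIGITS = dict(zip("0123456789", "零一二三四五六七八九"))
PINYIN = {
    "百": "bai", "分": "fen", "之": "zhi", "点": "dian",
    "零": "ling", "一": "yi", "二": "er", "三": "san", "四": "si",
    "五": "wu", "六": "liu", "七": "qi", "八": "ba", "九": "jiu",
}


def normalize_percent(token: str) -> str:
    """Regularize a token such as '1.2%' into its spoken form 百分之一点二."""
    if not token.endswith("%"):
        raise ValueError("expected a percentage such as '1.2%'")
    number = token[:-1]
    # '.' is read as 点 (point); every digit is read on its own.
    spoken = "".join("点" if ch == "." else HANZI_DIGITS[ch] for ch in number)
    # '%' is read as 百分之 and, in Chinese, comes before the number.
    return "百分之" + spoken


def to_pinyin(text: str) -> str:
    """Annotate the regularized text with toneless pinyin, character by character."""
    return "".join(PINYIN[ch] for ch in text)
```

Chaining the two, `to_pinyin(normalize_percent("1.2%"))` yields the annotation string quoted in the description; the phone-level, part-of-speech and prosodic annotation that follows is outside this sketch.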
Step S003: build and store a sound model, comprising:
First, collecting 2 to 3 hours of speech from a particular announcer as the sound data source, and performing voice annotation front-end processing on the collected sound data source to obtain text annotation information;
Second, extracting the acoustic features, namely fundamental frequency and spectrum, of the text annotation information;
Then, forming the sound model of the acoustic features through parameter clustering and training based on an HMM;
Finally, storing the sound model.
By building and storing sound models of announcers with different timbres (for example an announcer with the timbre of the robot WALL-E, the timbre of cartoon characters such as RNB or Chibi Maruko-chan, or the voice of a celebrity currently popular online), pre-trained speaker sound models are provided for subsequent voice synthesis and can be called at any time, so that text is broadcast in a specific timbre and voice broadcast becomes more enjoyable.
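The description names fundamental frequency as one of the extracted acoustic features but does not say how it is measured. One common approach, assumed here purely for illustration, is to pick the lag at which a frame's autocorrelation peaks. The function names, the search range and the synthetic test tone are all hypothetical.

```python
import math


def estimate_f0(samples, sample_rate, f0_min=50.0, f0_max=400.0):
    """Estimate fundamental frequency (Hz) of one frame by autocorrelation peak picking."""
    min_lag = int(sample_rate / f0_max)
    max_lag = int(sample_rate / f0_min)
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Correlate the frame with a copy of itself shifted by `lag` samples.
        score = sum(samples[i] * samples[i + lag] for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


def sine_frame(freq_hz, sample_rate, n):
    """Generate n samples of a pure tone, used here as stand-in 'speech'."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]
```

Real speech needs windowing, voicing decisions and octave-error handling, none of which this sketch attempts; spectral features would be extracted alongside the fundamental frequency before HMM training.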
Step S004: call the stored sound model, obtain the acoustic parameters corresponding to the text information according to the sound model and decision tree prediction, perform voice synthesis on the acoustic parameters, and output the synthesized voice file, comprising:
First, performing part-of-speech analysis and prosody prediction on the incoming text information, such as "the broad market rose thirty-five point six points today";
Next, sending a synthesis request, calling the stored sound model of the trained announcer, and obtaining, according to the called sound model and decision tree prediction, the acoustic parameters corresponding to the text information that has undergone part-of-speech analysis and prosody prediction;
Finally, sending the predicted acoustic parameters to a parametric synthesizer for voice synthesis and outputting the synthesized voice file, such as the sound of "the broad market rose 35.6 points today". This completes the whole process of broadcasting text in a specific timbre.
With the voice broadcast system and voice broadcast method of the present invention, an elderly user can view a particular stock in the stock software, and the broadcast plug-in appears on that page. Tapping the switch starts a broadcast of the basic information on the page, for example: stock code: 600001, stock name: Pudong Development Bank, current price: fifteen point four zero yuan. If the user needs to buy or sell, then once the broadcast plug-in is switched on, the user's operation is read back for confirmation before the order is placed, which prevents misoperation. For example: buy stock code 600001, stock name Pudong Development Bank, 1000 shares, order price sixteen yuan even. After hearing the broadcast, the user confirms that it is correct and can then place the order, which makes operation accurate and convenient.
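The read-back-then-confirm flow described above can be sketched as a guard in front of order submission. The `Order` fields, the wording of the confirmation text and the three callables are hypothetical; the example values are the ones given in the description.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Order:
    side: str          # for example "Buy" or "Sell"
    code: str
    name: str
    shares: int
    price_yuan: float


def confirmation_text(order: Order) -> str:
    """Build the sentence that is read back to the user before the order is placed."""
    return (
        f"{order.side} stock code {order.code}, stock name {order.name}, "
        f"{order.shares} shares, order price {order.price_yuan:.2f} yuan"
    )


def place_order(
    order: Order,
    speak: Callable[[str], None],
    confirm: Callable[[], bool],
    submit: Callable[[Order], None],
) -> bool:
    """Broadcast the order, then submit it only if the user confirms."""
    speak(confirmation_text(order))
    if not confirm():
        return False
    submit(order)
    return True
```

For the example in the text, `confirmation_text(Order("Buy", "600001", "Pudong Development Bank", 1000, 16.0))` produces the sentence to be synthesized, and nothing reaches `submit` unless `confirm` returns true.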
The present invention has been described in detail above with reference to the accompanying drawings and embodiments, and those of ordinary skill in the art can make many variations of the invention based on the above description. Therefore, certain details of the embodiments should not be construed as limiting the invention, and the scope of protection of the invention is the scope defined by the appended claims.

Claims (10)

1. A voice broadcast system for intelligent software, characterized by comprising:
A word information acquisition module, configured to collect word information in the intelligent software;
A text front-end processing module, connected with the word information acquisition module and configured to convert the collected word information into text information with a specific pronunciation;
A model storage module, configured to build and store a sound model;
A speech synthesis module, connected with the text front-end processing module and the model storage module, and configured to call the sound model stored by the model storage module, predict, according to the sound model and a decision tree, the acoustic parameters corresponding to the text information transmitted by the text front-end processing module, perform speech synthesis on the acoustic parameters, and output a synthesized voice file; and
A voice playing module, connected with the speech synthesis module and configured to play the voice file.
2. The voice broadcast system for intelligent software according to claim 1, characterized in that the word information acquisition module is communicatively connected with an intelligent broadcast client, and the intelligent broadcast client is a plug-in embedded in the intelligent software to collect the word information.
3. The voice broadcast system for intelligent software according to claim 1, characterized in that the text front-end processing module comprises:
A regularization rule setting unit, connected with the word information acquisition module and configured to regularize the collected word information based on specific rules; and
A text conversion and annotation unit, connected with the regularization rule setting unit and configured to annotate the regularized word information and convert it, through the annotation, into text information with a specific pronunciation.
4. The voice broadcast system for intelligent software according to claim 1, characterized in that the model storage module comprises:
A voice annotation front-end processing unit, configured to collect a sound data source and perform voice annotation front-end processing on the collected sound data source to obtain text annotation information;
A feature extraction unit, connected with the voice annotation front-end processing unit and configured to extract the acoustic features of fundamental frequency and spectrum of the text annotation information;
A training unit, connected with the feature extraction unit and configured to perform parameter clustering and training based on a hidden Markov model to form the sound model of the acoustic features; and
A model storage unit, connected with the training unit and configured to store the sound model.
5. The voice broadcast system for intelligent software according to claim 1, characterized in that the speech synthesis module comprises:
An annotation storage unit, connected with the text front-end processing module and configured to perform part-of-speech analysis and prosody prediction on the text information transmitted by the text front-end processing module;
A parameter prediction unit, connected with the annotation storage unit and the model storage module, and configured to call the sound model stored by the model storage module and predict, according to the sound model and the decision tree, the acoustic parameters corresponding to the text information that has undergone part-of-speech analysis and prosody prediction; and
A synthesizer speech synthesis unit, connected with the parameter prediction unit and configured to pass the acoustic parameters to a parametric synthesizer for speech synthesis and output the synthesized voice file.
6. A voice broadcast method for intelligent software, characterized by comprising:
Collecting word information in the intelligent software;
Converting the collected word information into text information with a specific pronunciation;
Building and storing a sound model;
Calling the stored sound model, predicting the acoustic parameters corresponding to the text information according to the sound model and a decision tree, performing speech synthesis on the acoustic parameters, and outputting a synthesized voice file; and
Playing the voice file.
7. The voice broadcast method for intelligent software according to claim 6, characterized in that collecting the word information in the intelligent software comprises: embedding in the intelligent software an intelligent broadcast client for collecting the word information.
8. The voice broadcast method for intelligent software according to claim 6, characterized in that converting the collected word information into text information with a specific pronunciation comprises:
Regularizing the collected word information based on specific rules; and
Annotating the regularized word information and converting it, through the annotation, into text information with a specific pronunciation.
9. The voice broadcast method for intelligent software according to claim 6, characterized in that building and storing the sound model comprises:
Collecting a sound data source and performing voice annotation front-end processing on the collected sound data source to obtain text annotation information;
Extracting the acoustic features of fundamental frequency and spectrum of the text annotation information;
Performing parameter clustering and training based on a hidden Markov model to form the sound model of the acoustic features; and
Storing the sound model.
10. The voice broadcast method for intelligent software according to claim 6, characterized in that calling the stored sound model, predicting the acoustic parameters corresponding to the text information according to the sound model and the decision tree, performing speech synthesis on the acoustic parameters, and outputting the synthesized voice file comprises:
Performing part-of-speech analysis and prosody prediction on the text information;
Calling the stored sound model, and predicting, according to the sound model and the decision tree, the acoustic parameters corresponding to the text information that has undergone part-of-speech analysis and prosody prediction; and
Passing the acoustic parameters to a parametric synthesizer for speech synthesis, and outputting the synthesized voice file.
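
Claims 4 and 9 recite extracting fundamental frequency and spectrum as acoustic features. As a rough illustration of the fundamental-frequency part only, the sketch below estimates F0 for a single frame by picking the peak of its autocorrelation. The method, the function name `estimate_f0`, the search range of 80 to 400 Hz, and the frame length are assumptions chosen for the example; the application does not state which extraction algorithm is used.

```python
import math


def estimate_f0(frame, sample_rate, f0_min=80.0, f0_max=400.0):
    """Estimate the fundamental frequency (Hz) of one frame from the peak of
    its unnormalized autocorrelation. Returns None if no positive peak is
    found, for example on a silent frame."""
    lag_min = int(sample_rate / f0_max)  # shortest period considered
    lag_max = int(sample_rate / f0_min)  # longest period considered
    best_lag, best_score = None, 0.0
    for lag in range(lag_min, lag_max + 1):
        # Correlate the frame with a copy of itself shifted by `lag` samples.
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else None


# A 40 ms frame of a pure 200 Hz tone sampled at 16 kHz.
sample_rate = 16000
frame = [math.sin(2 * math.pi * 200.0 * i / sample_rate) for i in range(640)]
f0 = estimate_f0(frame, sample_rate)
```

The unnormalized sum shrinks as the lag grows, which biases the search toward the shortest matching period and avoids picking a multiple of the true period on a clean tone. Real recordings would need voicing decisions and smoothing across frames, which this sketch leaves out.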
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510757022.7A CN105427855A (en) 2015-11-09 2015-11-09 Voice broadcast system and voice broadcast method of intelligent software

Publications (1)

Publication Number Publication Date
CN105427855A true CN105427855A (en) 2016-03-23

Family

ID=55506010

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510757022.7A Pending CN105427855A (en) 2015-11-09 2015-11-09 Voice broadcast system and voice broadcast method of intelligent software

Country Status (1)

Country Link
CN (1) CN105427855A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101178896A (en) * 2007-12-06 2008-05-14 安徽科大讯飞信息科技股份有限公司 Unit selection voice synthetic method based on acoustics statistical model
CN102254550A (en) * 2010-05-21 2011-11-23 腾讯科技(深圳)有限公司 Method and system for reading characters on webpage
CN103632663A (en) * 2013-11-25 2014-03-12 飞龙 HMM-based method of Mongolian speech synthesis and front-end processing
CN104392716A (en) * 2014-11-12 2015-03-04 百度在线网络技术(北京)有限公司 Method and device for synthesizing high-performance voices
CN104464716A (en) * 2014-11-20 2015-03-25 北京云知声信息技术有限公司 Voice broadcasting system and method

Cited By (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106202238A (en) * 2016-06-30 2016-12-07 马根昌 Real person's analogy method
CN106601228A (en) * 2016-12-09 2017-04-26 百度在线网络技术(北京)有限公司 Sample marking method and device based on artificial intelligence prosody prediction
CN106601228B (en) * 2016-12-09 2020-02-04 百度在线网络技术(北京)有限公司 Sample labeling method and device based on artificial intelligence rhythm prediction
CN106856091A (en) * 2016-12-21 2017-06-16 北京智能管家科技有限公司 The automatic broadcasting method and system of a kind of multi-language text
WO2018121757A1 (en) * 2016-12-31 2018-07-05 深圳市优必选科技有限公司 Method and system for speech broadcast of text
CN107958415A (en) * 2017-10-31 2018-04-24 阿里巴巴集团控股有限公司 Securities information broadcasting method and device
CN107958415B (en) * 2017-10-31 2021-07-27 创新先进技术有限公司 Security information broadcasting method and device
CN108965600A (en) * 2018-07-24 2018-12-07 Oppo(重庆)智能科技有限公司 Voice pick-up method and Related product
CN108965600B (en) * 2018-07-24 2021-05-04 Oppo(重庆)智能科技有限公司 Voice pickup method and related product
CN109036388A (en) * 2018-07-25 2018-12-18 李智彤 A kind of intelligent sound exchange method based on conversational device
CN111261139B (en) * 2018-11-30 2023-12-26 上海擎感智能科技有限公司 Literal personification broadcasting method and system
CN111261139A (en) * 2018-11-30 2020-06-09 上海擎感智能科技有限公司 Character personification broadcasting method and system
CN109686361A (en) * 2018-12-19 2019-04-26 深圳前海达闼云端智能科技有限公司 A kind of method, apparatus of speech synthesis calculates equipment and computer storage medium
CN109686361B (en) * 2018-12-19 2022-04-01 达闼机器人有限公司 Speech synthesis method, device, computing equipment and computer storage medium
CN111627417B (en) * 2019-02-26 2023-08-08 北京地平线机器人技术研发有限公司 Voice playing method and device and electronic equipment
CN111627417A (en) * 2019-02-26 2020-09-04 北京地平线机器人技术研发有限公司 Method and device for playing voice and electronic equipment
CN112150233A (en) * 2019-06-26 2020-12-29 三竹资讯股份有限公司 Device and method for controlling financial quotation of television application program by voice
TWI778273B (en) * 2019-06-26 2022-09-21 三竹資訊股份有限公司 Device and method of a voice-activated financial quotes application on a tv
CN111145722A (en) * 2019-12-30 2020-05-12 出门问问信息科技有限公司 Text processing method and device, computer storage medium and electronic equipment
CN111145722B (en) * 2019-12-30 2022-09-02 出门问问信息科技有限公司 Text processing method and device, computer storage medium and electronic equipment
CN111178042A (en) * 2019-12-31 2020-05-19 出门问问信息科技有限公司 Data processing method and device and computer storage medium
CN111178042B (en) * 2019-12-31 2023-04-28 出门问问信息科技有限公司 Data processing method and device and computer storage medium
US11380327B2 (en) 2020-04-28 2022-07-05 Nanjing Silicon Intelligence Technology Co., Ltd. Speech communication system and method with human-machine coordination
CN111246027A (en) * 2020-04-28 2020-06-05 南京硅基智能科技有限公司 Voice communication system and method for realizing man-machine cooperation
CN111667812B (en) * 2020-05-29 2023-07-18 北京声智科技有限公司 Speech synthesis method, device, equipment and storage medium
CN113763920A (en) * 2020-05-29 2021-12-07 广东美的制冷设备有限公司 Air conditioner, voice generation method thereof, voice generation device and readable storage medium
CN111667812A (en) * 2020-05-29 2020-09-15 北京声智科技有限公司 Voice synthesis method, device, equipment and storage medium
CN113763920B (en) * 2020-05-29 2023-09-08 广东美的制冷设备有限公司 Air conditioner, voice generating method thereof, voice generating device and readable storage medium
CN111785247A (en) * 2020-07-13 2020-10-16 北京字节跳动网络技术有限公司 Voice generation method, device, equipment and computer readable medium
CN113223492B (en) * 2021-04-08 2023-02-28 北京戴纳实验科技有限公司 Voice broadcasting system
CN113223492A (en) * 2021-04-08 2021-08-06 北京戴纳实验科技有限公司 Voice broadcasting system
CN113159925A (en) * 2021-04-30 2021-07-23 中国银行股份有限公司 Financing information prompting method and device
CN113506558A (en) * 2021-07-07 2021-10-15 深圳汇商通盈科技有限公司 Method, device and equipment for collection and broadcast and storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20170929

Address after: 200233 Shanghai City, Xuhui District Guangxi 65 No. 1 Jinglu room 702 unit 03

Applicant after: Unisound (Shanghai) Intelligent Technology Co., Ltd.

Address before: 200031 Shanghai Xuhui District Qinzhou North Road 1198, 82 buildings, 2 stories, 01 rooms

Applicant before: SHANGHAI YUZHIYI INFORMATION TECHNOLOGY CO., LTD.

RJ01 Rejection of invention patent application after publication

Application publication date: 20160323
