CN1267384A - Method for determining representative speech sound block from voice signal comprising speech units - Google Patents
- Publication number
- CN1267384A
- Authority
- CN
- China
- Prior art keywords
- voice segments
- speech
- group
- representative
- voice
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/06—Elementary speech units used in speech synthesisers; Concatenation rules
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Machine Translation (AREA)
Abstract
After a voice signal has been segmented into individual speech units, the units that represent one speech sound block are assembled into a group. The speech units in a group describe the sound block with differing degrees of quality. Different selection criteria are provided to evaluate the usability of the individual speech units. One advantage of combining the selection criteria is that several criteria can be taken into account when a representative speech unit is selected. Each selection criterion includes a membership function, which indicates the 'usability' of an individual speech unit as a representative of the group. Preferably, the speech unit that attains the maximum of the membership functions for the selection criteria among the speech units of the group is selected as the representative of the corresponding sound block.
Description
The present invention relates to a method for determining a representative (Repräsentant) of a speech block (Sprachbaustein) of a language from a voice signal comprising a number of voice segments (Lautabschnitte).
As is known to those skilled in the art, a signal spoken by a person, i.e. a voice signal, can be divided into voice segments (segmentation), each voice segment comprising a part of the voice signal.
A language, for its part, can be described as a combination of many modular speech blocks.
A membership function states the degree of membership with which a voice segment represents a corresponding speech block.
Many methods exist for selecting speech blocks from a database. They optimise with respect to prosodic [1], linguistic [2] or continuity criteria [3]. Document [4] describes the automatic generation of a database.
Hidden Markov models (HMMs) are known from document [5].
A voice signal can be segmented by "fast Viterbi alignment" using HMMs trained on the voice signal (see document [4]).
Dividing a voice signal into individual voice segments by hand is unsatisfactory, because it requires great effort and experience and must be carried out separately for each speaker.
A further, more important disadvantage is that the suitability of the selected representative is not tested. If a poor representative is selected for a speech block, the result of the speech synthesis is correspondingly poor.
The object underlying the invention is to specify a method for determining a representative of a speech block of a language from a voice signal comprising a number of voice segments, which avoids the above disadvantages and ensures an improved selection of the representative.
The segmentation is evaluated statistically on the basis of the individual voice segments, so that a statistically "good" representative of the voice segments concerned can be determined.
This object is achieved according to the features of claim 1.
According to the invention, a method is specified for determining a representative of a speech block of a language from a voice signal comprising a number of voice segments. In this method, the voice segments of the voice signal are each combined, according to their membership of a speech block, into the group belonging to that speech block. This yields, for each of a plurality of speech blocks, a group containing at least one voice segment. Using a selection criterion, a selection value is determined from the voice signal for each voice segment, and the frequency of the selection values obtained for the voice segments of a group is determined. A membership function is determined from the frequencies obtained in this way. For each voice segment in the group, the membership function yields a membership measure, which indicates whether the voice segment can be used as a representative (i.e. as the selected voice segment). A voice segment whose membership measure lies above a predefined threshold is then determined as the representative of the group for the selected speech block.
A major advantage of this method is that the representative is not just any member of the group for the selected speech block, but one that describes the selected speech block with sufficiently high quality (a sufficiently high membership measure).
The voice segments belonging to the group of a speech block are statistically distributed in the voice signal with regard to their usability. The voice signal is preferably provided to the computer as a long speech sample of naturally spoken language. With respect to the selected speech block there are therefore so-called "good" and "bad" voice segments. The invention makes it possible, in particular, to avoid determining a bad voice segment as the representative of the selected speech block.
A development of the invention uses at least one further selection criterion for the voice segments. For each voice segment, at least one further selection value is thereby obtained. For each group of voice segments (i.e. for each selected speech block), the frequencies of all selection values are determined, and a membership function is derived from each of them as described above.
In an additional development, the representative of the selected speech block is determined from the group of voice segments by multiplying the individual membership measures (each selection criterion yields a membership function with one membership measure) to form an overall measure. If the overall measure of a voice segment lies above a predefined total threshold, that voice segment is suitable as the representative of the selected speech block and is selected from the group of voice segments belonging to that speech block.
The advantage of using several selection criteria to determine the representative is that none of its selection values can be too bad. Multiplying the membership measures to form the overall measure corresponds to a weighted AND operation on the probability density functions. The representative then satisfies all selection criteria with sufficient quality.
In another development of the invention, a voice segment is a phoneme, a diphone, a triphone, a syllable, a demisyllable or a word of the language. Combinations of these voice segments are also possible.
In another development, a voice segment is a single state of a hidden Markov model (HMM).
In yet another development, the selection criterion is determined by one of the following quantities:
a) the energy of each voice segment;
b) the length of each voice segment;
c) the fundamental frequency of each voice segment;
d) the length control of each voice segment;
e) a statistical measure of the suitability of each voice segment.
In a particular development of the invention, synthetic speech is produced from the representatives obtained. Once the representatives of the speech blocks have been determined according to the invention, the language defined by the speech blocks can be recombined in entirely new arrangements by means of these representatives. This yields a synthetic speech output in which the speech blocks embodied by the individual representatives (voice segments) are output in a new order.
In a further development of the invention, the voice segment determined as the representative of the selected speech block is the one whose membership measure has the highest value or, if several selection criteria are considered, whose overall measure has the highest value. In this way the "best" voice segment in the group for the selected speech block is obtained.
Developments of the invention also follow from the dependent claims.
Exemplary embodiments of the invention are described in more detail below with reference to the following figures.
The figures show:
Figure 2 shows a sketch of a language structure and its mapping onto a voice signal, in particular that of a text read aloud,
Figure 3 shows a sketch of the 'length control' selection criterion,
Figure 4 shows a sketch of the 'fundamental frequency' selection criterion,
Figure 5 shows a sketch of the 'energy' selection criterion,
Figure 6 shows a sketch of the 'SCORE' selection criterion.
Determining speech blocks from a voice signal, preferably from a sufficiently long speech sample of one speaker, is important for concatenative speech synthesis, that is, for rearranging the speech blocks found into new meaningful sequences. The more precisely the individual voice segments are "cut out" of the voice signal, the higher the quality of the synthetic speech.
Figure 1 shows the individual steps of the method for determining a representative of a speech block of a language from a voice signal comprising voice segments. In step 101, the voice segments of the voice signal are combined, according to their membership of a speech block, into one group per speech block. This combination can be performed automatically, as described for example in document [4]. Preferably, HMMs (hidden Markov models) are trained on the voice signal. The voice signal can be an arbitrary speech sample with a length of about one to three hours. After step 101 the voice segments have been combined into groups, each group comprising at least one voice segment that belongs to a predefined speech block of the language.
In most cases each such group contains several voice segments, and one representative per group must be determined for speech synthesis. The voice segments in a group are not all alike but follow a statistical distribution. Knowledge of this distribution is used below to find and select a suitable representative among the voice segments of a group.
To this end, the voice segments are evaluated according to predefined selection criteria, each voice segment receiving one selection value per selection criterion. Preferably, several different selection criteria are evaluated, so that each voice segment receives a separate selection value for each criterion (see step 102).
For each group, the frequency of the selection values obtained for all voice segments of the group is determined (see step 103). This corresponds to plotting a two-dimensional diagram whose abscissa is the value of the selection criterion and whose ordinate is the frequency. One such diagram is produced per selection criterion for all voice segments in the group; it represents the statistical distribution of the voice segments evaluated according to that criterion.
In the next step, 104, the frequencies obtained are used to determine a membership function (one for each of the diagrams above). The membership function is preferably the envelope over the frequencies of the statistical distribution of the selection values. This step, too, is carried out for every selection criterion of every group. As stated above, a group comprises all voice segments that express the predefined speech block. The membership function yields a membership measure for each voice segment. The membership measure indicates how usable the voice segment is, within its group, with respect to the selection criterion concerned.
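Steps 103 and 104 can be illustrated with a minimal Python sketch. The text does not say how the envelope over the frequencies is computed, so the sketch makes an assumption: it bins the selection values of one group and uses the histogram, normalised to its peak, as a crude stand-in for the envelope. The function name `membership_from_frequencies`, the parameter `bin_width` and the sample values are hypothetical.

```python
from collections import Counter


def membership_from_frequencies(values, bin_width):
    """Build a membership function for one group and one selection criterion.

    Assumption: the envelope over the frequencies is approximated by the
    histogram of the selection values, normalised so that its peak is 1.
    """
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    # Step 103: count how often the selection values fall into each bin.
    counts = Counter(int(v // bin_width) for v in values)
    peak = max(counts.values())

    # Step 104: the normalised count of a value's bin is its membership measure.
    def membership(value):
        return counts.get(int(value // bin_width), 0) / peak

    return membership


# Hypothetical selection values (e.g. relative energies) of one group.
energies = [0.9, 1.0, 1.1, 1.05, 0.95, 1.6, 0.4]
z_energy = membership_from_frequencies(energies, bin_width=0.25)
```

With these sample values, segments near the most frequent energy receive a membership measure of 1, and values that never occur in the group receive 0. A smoothed envelope would avoid the hard bin edges of this sketch.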
Subsequently, in step 105, a voice segment whose membership measure lies above a predefined threshold is selected as the representative. As mentioned above, several selection criteria are preferably used, which yields several membership measures for each voice segment. Multiplying these membership measures together gives an overall measure. Accordingly, a voice segment whose overall measure lies above a predefined total threshold is then selected as the representative of the group.
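Step 105 with several selection criteria can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the function name `select_representatives`, the segment labels and all numeric values are hypothetical. The sketch returns every segment whose overall measure exceeds the total threshold and, in addition, the segment with the highest overall measure, which corresponds to the development that picks the "best" segment.

```python
from math import prod


def select_representatives(group, total_threshold):
    """Return (candidates, best) for one group of voice segments.

    group maps a segment label to its membership measures, one per
    selection criterion. The overall measure is their product.
    """
    overall = {label: prod(measures) for label, measures in group.items()}
    # Segments whose overall measure lies above the total threshold.
    candidates = [label for label, z in overall.items() if z > total_threshold]
    # The segment with the highest overall measure, if the group is not empty.
    best = max(overall, key=overall.get) if overall else None
    return candidates, best


# Hypothetical membership measures for three segments of one group.
group_gr1 = {
    "LA1-1": [0.9, 0.8, 1.0],
    "LA1-2": [1.0, 0.1, 1.0],
    "LA1-3": [0.7, 0.9, 0.95],
}
candidates, best = select_representatives(group_gr1, total_threshold=0.5)
```

Segment LA1-2 shows the effect of the multiplication: one poor membership measure (0.1) is enough to push the product below the threshold, although its other two measures are perfect.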
For clarity, Figure 2 shows the relationship between a language SPR comprising speech blocks SBSi (i=1,2,...,n) and a voice signal SSI comprising voice segments LAi-j (j=1,2,...,n) combined into groups GRi.
Logical link 201 indicates that the speech block SBS1 can be expressed by the voice segments LA1-1, LA1-2, LA1-3, ..., LA1-m. These voice segments belonging to the speech block SBS1 are combined in group GR1. Each voice segment in group GR1 was obtained from the voice signal, and all of them describe the speech block SBS1. Depending on the voice signal, each voice segment has a different quality with respect to the various selection criteria. The aim is therefore to obtain a "usable" representative from the voice segments of group GR1. This representative can realise the speech block SBS1 in speech synthesis.
The same relationship applies analogously to logical link 202. An arbitrary speech block SBSn can be expressed by a number (here 'p') of voice segments combined in a group GR2.
The selection criteria mentioned above are examined next. Many selection criteria are possible; a selection of them is proposed here. These criteria can be used individually, in combination with each other, or in combination with other selection criteria, so that a representative can advantageously be determined from a group of voice segments.
Figure 3 shows length control as a selection criterion, i.e. a measure of the duration of the voice segment in synthesis relative to its original duration. Deviations up to a lower threshold L_UG and an upper threshold L_OG are regarded as unproblematic. Beyond these thresholds, i.e. below the lower threshold L_UG or above the upper threshold L_OG, the membership function Z_L-syn falls off exponentially. The membership function Z_L-syn is then given by the following formula:
(1).
Normalising by the mean length l_Φ to 1 makes the deviation relative. The membership function Z_L-syn is likewise normalised to 1. ZG denotes the membership measure.
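Formula (1) is not reproduced in this text, so the following Python sketch only illustrates the shape described above: a plateau of 1 between the two thresholds and an exponential fall-off outside them. The decay constant `decay`, the function name `z_length_syn` and the example thresholds are assumptions made for illustration and are not taken from the patent.

```python
import math


def z_length_syn(length, mean_length, lower, upper, decay=5.0):
    """Membership measure for the 'length control' criterion (illustrative).

    lower and upper are the thresholds L_UG and L_OG, expressed relative to
    the mean length, so the mean length itself maps to 1.
    """
    if mean_length <= 0:
        raise ValueError("mean_length must be positive")
    rel = length / mean_length  # normalise by the mean length
    if rel < lower:
        # Below the lower threshold: exponential fall-off.
        return math.exp(-decay * (lower - rel))
    if rel > upper:
        # Above the upper threshold: exponential fall-off.
        return math.exp(-decay * (rel - upper))
    return 1.0  # deviations inside the thresholds are unproblematic


inside = z_length_syn(100.0, 100.0, lower=0.8, upper=1.2)
too_short = z_length_syn(60.0, 100.0, lower=0.8, upper=1.2)
too_long = z_length_syn(140.0, 100.0, lower=0.8, upper=1.2)
```

The sketch uses the same decay rate on both sides of the plateau. Whether the original formula is symmetric cannot be determined from this text.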
Figure 4 shows fundamental-frequency control as a selection criterion. Here the deviation of the fundamental frequency of the voice segment from a target fundamental frequency (in speech synthesis) should be minimal. The membership function Z_L-syn has the following form:
(2).
For clarity, the frequency f is again normalised to the mean frequency f_Φ, and the membership function Z_L-syn is normalised to 1. f_OG denotes the upper frequency limit and f_UG the lower frequency limit.
Figure 5 shows the energy of the voice segment as a selection criterion. The relative deviation of this energy from a mean energy is the criterion for the membership function Z_E-al:
(3).
The mean value of the energy E is E_Φ (expected value), E_UG is a lower threshold of the energy, E_OG is an upper threshold of the energy, and σ_E is the variance of the energy. The membership function Z_E-al is normalised to 1.
If the length of the voice segment is used as the selection criterion instead of the energy, a membership function Z_L-al is obtained analogously to Figure 5 for evaluating the relative deviation of the voice segment length. With an upper threshold L_OG, a lower threshold L_UG and a variance σ_l of the length, the membership function Z_L-al is:
(4).
Figure 6 shows the score SCORE as a selection criterion. The score SCORE is a measure of how suitable a voice segment is as a representative, that is, of whether a candidate voice segment is a typical, characteristically pronounced voice segment and is therefore 'suitable' as the representative of the corresponding speech block.
Between the voice segments with the "best" score (Z_S(s_max) = 1) and the "worst" score (Z_S(s_min) = 1 - s_G), the membership function Z_S(s) of the SCORE selection criterion is assumed to be linear (see the characteristic curve Z_S(s) in Figure 6). This membership function Z_S(s) can be determined by the following formula:
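The formula itself is not reproduced in this text. A linear function that is consistent with the two end points stated above, Z_S(s_max) = 1 and Z_S(s_min) = 1 - s_G, would be the following; it is a reconstruction from those end points and may differ in notation from the original:

```latex
Z_S(s) = 1 - s_G \,\frac{s_{\max} - s}{s_{\max} - s_{\min}},
\qquad s_{\min} \le s \le s_{\max}
```

Inserting s = s_max gives 1, and inserting s = s_min gives 1 - s_G, as required.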
To judge whether a voice segment is suitable as a representative of the corresponding speech block, several of the membership functions established above are preferably considered. To ensure that no membership function of a selected representative has a value below a predefined threshold, the individual membership measures are combined by an AND operation. This is realised by multiplying the individual membership measures to form an overall measure. Taking the membership functions listed above into account, this gives:
The product over all states for the membership functions Z_E-al and Z_L-al refers to the individual states of the HMMs used to describe a voice segment. Depending on the modelling, HMMs with different numbers of states can be used; all states of each voice segment enter individually into the overall measure Zges obtained from the membership functions.
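The expression for Zges is not reproduced in this text. The Python sketch below shows one reading of the description: the segment-level membership measures are multiplied together, and the energy and length measures contribute one factor per HMM state. The function name `overall_measure`, its parameter names and the example values are hypothetical, and the exact set of factors in the original formula may differ.

```python
from math import prod


def overall_measure(z_length_syn, z_pitch_syn, z_score,
                    z_energy_states, z_length_states):
    """Combine membership measures into an overall measure (illustrative).

    z_energy_states and z_length_states hold one membership measure per
    HMM state of the voice segment.
    """
    if len(z_energy_states) != len(z_length_states):
        raise ValueError("expected one energy and one length measure per state")
    # Product over all states of the per-state energy and length measures.
    per_state = prod(e * l for e, l in zip(z_energy_states, z_length_states))
    # Multiplication acts as an AND: every factor must be high.
    return z_length_syn * z_pitch_syn * z_score * per_state


# Hypothetical measures for a voice segment modelled by a three-state HMM.
z_ges = overall_measure(1.0, 0.9, 0.8, [1.0, 0.5, 1.0], [1.0, 1.0, 0.5])
```

Because every state contributes its own factors, a segment modelled with more states is tested more strictly: a single poorly matching state lowers the overall measure of the whole segment.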
The following documents are cited in this document:
[1] Nick Campbell, Alan W. Black: "Prosody and the Selection of Source Units for Concatenative Synthesis", in Progress in Speech Synthesis, ISBN 0-387-94701-9, Springer Verlag New York, 1997, pp. 279-292.
[2] Andrew J. Hunt, Alan W. Black: "Unit Selection in a concatenative speech synthesis system using a large speech data base", Proc. EUROSPEECH 1995, Madrid, pp. 373-376.
[3] Alistair D. Conkie, Stephen Isard: "Optimal Coupling of Diphones", in Progress in Speech Synthesis, ISBN 0-387-94701-9, Springer Verlag New York, 1997, pp. 293-304.
[4] R. E. Donovan, P. C. Woodland: "Improvements in an HMM-based speech synthesiser", Proc. ICASSP 1995, Michigan, pp. 573-576.
[5] G. Ruske: "Automatische Spracherkennung: Methoden der Klassifikation u. Merkmalsextraktion", Oldenbourg Verlag, Muenchen, 1988, pp. 160-171.
Claims (8)
1. A method for determining a representative of a predefined speech block of a language from a voice signal comprising a number of voice segments,
a) wherein the voice segments of the voice signal are combined into one group per speech block of the language,
b) wherein, for the voice segments of each group, selection values are determined from the voice signal according to a predefined selection criterion,
c) wherein the frequencies of the selection values of the group are determined,
d) wherein a membership function is determined from the frequencies, said membership function specifying a membership measure for the usability of each voice segment of the group concerned,
e) wherein, from the group of voice segments of the selected speech block, a voice segment whose membership measure lies above a predefined threshold is determined as the representative.
2. The method according to claim 1,
wherein further selection values of the voice segments in the group are determined by means of at least one further selection criterion, further frequencies of the further selection values are determined, and, for each further frequency, a further membership function with a corresponding further membership measure is determined.
3. The method according to claim 2,
wherein the individual membership measures are multiplied to form an overall measure, and a voice segment whose overall measure lies above a predefined total threshold is determined from the group of voice segments as the representative.
4. The method according to one of the preceding claims,
wherein a voice segment is a phoneme, a diphone, a triphone, a syllable, a demisyllable or a word of the language, or a combination of these.
5. The method according to one of the preceding claims,
wherein a voice segment is a single state of a hidden Markov model.
6. The method according to one of the preceding claims,
wherein the selection criterion is one of the following quantities:
a) the energy of each voice segment;
b) the length of each voice segment;
c) the fundamental frequency of each voice segment;
d) the length control of each voice segment;
e) a statistical measure of the suitability of each voice segment.
7. The method according to one of the preceding claims,
wherein speech is composed from the representatives obtained.
8. The method according to one of the preceding claims,
wherein the voice segment determined as the representative of the speech block is the one whose membership measure has the highest value or, if several selection criteria are considered, whose overall measure has the highest value.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE19736465.9 | 1997-08-21 | ||
DE19736465 | 1997-08-21 |
Publications (2)
Publication Number | Publication Date |
---|---|
CN1267384A true CN1267384A (en) | 2000-09-20 |
CN1115664C CN1115664C (en) | 2003-07-23 |
Family
ID=7839772
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN98808350A Expired - Fee Related CN1115664C (en) | 1997-08-21 | 1998-07-27 | Method for determining representative speech sound block from voice signal comprising speech units |
Country Status (6)
Country | Link |
---|---|
EP (1) | EP1005694B1 (en) |
JP (1) | JP2001514400A (en) |
CN (1) | CN1115664C (en) |
DE (1) | DE59801989D1 (en) |
ES (1) | ES2167945T3 (en) |
WO (1) | WO1999010878A1 (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108269589A (en) * | 2016-12-31 | 2018-07-10 | ***通信集团贵州有限公司 | For the speech quality assessment method and its device of call |
CN110246490A (en) * | 2019-06-26 | 2019-09-17 | 合肥讯飞数码科技有限公司 | Voice keyword detection method and relevant apparatus |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE10120513C1 (en) | 2001-04-26 | 2003-01-09 | Siemens Ag | Method for determining a sequence of sound modules for synthesizing a speech signal of a tonal language |
US8918316B2 (en) * | 2003-07-29 | 2014-12-23 | Alcatel Lucent | Content identification system |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2590414B2 (en) * | 1991-03-12 | 1997-03-12 | 科学技術庁長官官房会計課長 | Fuzzy pattern recognition method |
-
1998
- 1998-07-27 WO PCT/DE1998/002120 patent/WO1999010878A1/en active IP Right Grant
- 1998-07-27 JP JP2000508109A patent/JP2001514400A/en not_active Withdrawn
- 1998-07-27 EP EP98948677A patent/EP1005694B1/en not_active Expired - Lifetime
- 1998-07-27 CN CN98808350A patent/CN1115664C/en not_active Expired - Fee Related
- 1998-07-27 DE DE59801989T patent/DE59801989D1/en not_active Expired - Fee Related
- 1998-07-27 ES ES98948677T patent/ES2167945T3/en not_active Expired - Lifetime
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108269589A (en) * | 2016-12-31 | 2018-07-10 | ***通信集团贵州有限公司 | For the speech quality assessment method and its device of call |
CN108269589B (en) * | 2016-12-31 | 2021-01-29 | ***通信集团贵州有限公司 | Voice quality evaluation method and device for call |
CN110246490A (en) * | 2019-06-26 | 2019-09-17 | 合肥讯飞数码科技有限公司 | Voice keyword detection method and relevant apparatus |
Also Published As
Publication number | Publication date |
---|---|
EP1005694A1 (en) | 2000-06-07 |
DE59801989D1 (en) | 2001-12-06 |
JP2001514400A (en) | 2001-09-11 |
CN1115664C (en) | 2003-07-23 |
ES2167945T3 (en) | 2002-05-16 |
WO1999010878A1 (en) | 1999-03-04 |
EP1005694B1 (en) | 2001-10-31 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Lluís et al. | End-to-end music source separation: Is it possible in the waveform domain? | |
CN1152365C (en) | Apparatus and method for pitch tracking | |
CN1162839C (en) | Method and device for producing acoustics model | |
CN1169115C (en) | Prosodic databases holding fundamental frequency templates for use in speech synthesis | |
DE60020434T2 (en) | Generation and synthesis of prosody patterns | |
EP1213705B1 (en) | Method and apparatus for speech synthesis | |
CN1275746A (en) | Equipment for converting text into audio signal by using nervus network | |
US20090254349A1 (en) | Speech synthesizer | |
CN101064104A (en) | Emotion voice creating method based on voice conversion | |
CN1750120A (en) | Indexing apparatus and indexing method | |
CN101075432A (en) | Speech synthesis apparatus and method | |
CN101051462A (en) | Feature-vector compensating apparatus and feature-vector compensating method | |
JPH0782348B2 (en) | Subword model generation method for speech recognition | |
CN1835075A (en) | Speech synthetizing method combined natural sample selection and acaustic parameter to build mould | |
CN106295717A (en) | A kind of western musical instrument sorting technique based on rarefaction representation and machine learning | |
CN1308911C (en) | Method and system for identifying status of speaker | |
CN1924994A (en) | Embedded language synthetic method and system | |
WO2014183411A1 (en) | Method, apparatus and speech synthesis system for classifying unvoiced and voiced sound | |
CN1115664C (en) | Method for determining representative speech sound block from voice signal comprising speech units | |
Steffman et al. | An automated method for detecting F measurement jumps based on sample-to-sample differences | |
CN1787072A (en) | Method for synthesizing pronunciation based on rhythm model and parameter selecting voice | |
Jacewicz et al. | Variability in within-category implementation of stop consonant voicing in American English-speaking children | |
US7454347B2 (en) | Voice labeling error detecting system, voice labeling error detecting method and program | |
CN1238805C (en) | Method and apparatus for compressing voice library | |
CN105719641A (en) | Voice selection method and device used for waveform splicing of voice synthesis |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
C19 | Lapse of patent right due to non-payment of the annual fee | ||
CF01 | Termination of patent right due to non-payment of annual fee |