US7069217B2 - Waveform synthesis - Google Patents
Waveform synthesis
- Publication number
- US7069217B2 (application US09/043,171)
- Authority
- US
- United States
- Prior art keywords
- waveform
- sequence
- cycles
- point
- successive
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Links
- 230000015572 biosynthetic process Effects 0.000 title claims description 24
- 238000003786 synthesis reaction Methods 0.000 title claims description 24
- 238000000034 method Methods 0.000 claims abstract description 45
- 230000009466 transformation Effects 0.000 claims description 30
- 239000013598 vector Substances 0.000 claims description 27
- 238000006073 displacement reaction Methods 0.000 claims description 5
- 238000004422 calculation algorithm Methods 0.000 claims description 3
- 230000002123 temporal effect Effects 0.000 claims 8
- 125000004122 cyclic group Chemical group 0.000 abstract 1
- 230000000875 corresponding effect Effects 0.000 description 29
- 239000000523 sample Substances 0.000 description 28
- 239000011159 matrix material Substances 0.000 description 19
- 230000008569 process Effects 0.000 description 15
- 238000010586 diagram Methods 0.000 description 13
- 230000015654 memory Effects 0.000 description 10
- 230000007704 transition Effects 0.000 description 7
- 238000004364 calculation method Methods 0.000 description 5
- 238000012545 processing Methods 0.000 description 5
- 238000004891 communication Methods 0.000 description 4
- 230000002250 progressing effect Effects 0.000 description 4
- NPOJQCVWMSKXDN-UHFFFAOYSA-N Dacthal Chemical compound COC(=O)C1=C(Cl)C(Cl)=C(C(=O)OC)C(Cl)=C1Cl NPOJQCVWMSKXDN-UHFFFAOYSA-N 0.000 description 3
- 238000004458 analytical method Methods 0.000 description 3
- 230000000694 effects Effects 0.000 description 3
- 238000004519 manufacturing process Methods 0.000 description 3
- 238000001208 nuclear magnetic resonance pulse sequence Methods 0.000 description 3
- 230000003252 repetitive effect Effects 0.000 description 3
- 238000012552 review Methods 0.000 description 3
- 238000012512 characterization method Methods 0.000 description 2
- 230000001934 delay Effects 0.000 description 2
- 238000009795 derivation Methods 0.000 description 2
- 230000004048 modification Effects 0.000 description 2
- 238000012986 modification Methods 0.000 description 2
- 230000001755 vocal effect Effects 0.000 description 2
- 230000006399 behavior Effects 0.000 description 1
- 230000000739 chaotic effect Effects 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 238000005183 dynamical system Methods 0.000 description 1
- 230000001747 exhibiting effect Effects 0.000 description 1
- 238000000605 extraction Methods 0.000 description 1
- 230000006870 function Effects 0.000 description 1
- 230000006872 improvement Effects 0.000 description 1
- 230000007246 mechanism Effects 0.000 description 1
- 238000012544 monitoring process Methods 0.000 description 1
- 238000010606 normalization Methods 0.000 description 1
- 230000009467 reduction Effects 0.000 description 1
- 238000005070 sampling Methods 0.000 description 1
- 230000002459 sustained effect Effects 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
- 230000001131 transforming effect Effects 0.000 description 1
- 238000002604 ultrasonography Methods 0.000 description 1
- 230000003936 working memory Effects 0.000 description 1
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/06—Elementary speech units used in speech synthesisers; Concatenation rules
- G10L13/07—Concatenation rules
Definitions
- the corresponding point $s_i$ in the state sequence space is represented by the value of that point $x_i$ together with those of a preceding and a succeeding point, $x_{i-j}$ and $x_{i+k}$ (where j is conveniently equal to k, and in this case both are equal to 10).
- the attractor of FIG. 4 consists of a double loop (which, in the projection indicated, appears to cross itself but does not in fact do so in three dimensions).
- each voiced sound gives rise to an attractor of this nature, all of which can adequately be represented in a three dimensional state space, although it might also be possible to use as few as two dimensions or as many as four, five or more.
- the important parameters for an effective representation of voiced sounds in such a state space are the number of dimensions selected and the time delay between adjacent samples.
- the shapes of the attractors vary considerably (with the corresponding shapes of the speech waveforms to which they correspond) although there is some relationship between the topologies of respective attractors and the sounds to which they correspond.
- voiced sounds such as vowels and voiced consonants
- the state space representation will not follow successive closely similar loops with a well defined topology, but instead will follow a trajectory which passes in an apparently random fashion through a volume in the state sequence space.
- a speech synthesizer comprises a loudspeaker 2, fed from the analogue output of a digital-to-analogue converter 4, coupled to an output port of a central processing unit 6 in communication with a storage system 8 (comprising random access memory 8a, for use by the CPU 6 in calculation; program memory 8b, for storing the CPU operating program; and data constant memory 8c, for storing data for use in synthesis).
- the apparatus of FIG. 6 may conveniently be provided by a personal computer and sound card, such as an Elonex (TM) Personal Computer comprising a 33 MHz Intel 486 microprocessor as the CPU 6 and an Ultrasound Max. (TM) soundcard providing the digital-to-analogue converter 4 and output to a loudspeaker 2.
- Any other digital processor of similar or higher power could be used instead.
- the storage system 8 comprises a mass storage device (e.g. a hard disk) containing the operating program and data to be used in synthesis, and a random access memory comprising partitioned areas 8a, 8b, 8c, the program and data being loaded into the latter two areas, respectively, prior to use of the apparatus of FIG. 6.
- the stored data held within the stored data memory 8c comprises a set of records 10a, 10b, ... 10c, each of which represents a small segment of a word which may be considered to be unambiguously distinguishable regardless of its context in a word or phrase (i.e. each corresponds to a phoneme or allophone).
- the phonemes can be represented by any of a number of different phonetic alphabets; in this embodiment, the SAMPA (Speech Assessment Methodology Phonetic Alphabet, as disclosed in A. Breen, “Speech Synthesis Models: A Review”, Electronics and Communication Engineering Journal, pages 19–31, February 1992) is used.
- Each of the records comprises a respective waveform recording 11, comprising successive digital values (e.g. sampled at 20 kHz) of the waveform of an actual utterance of the phoneme in question as successive samples $x_1, x_2, \ldots, x_N$.
- each of the records 10 associated with a voiced sound comprises, for each stored sample $x_i$, a transform matrix defined by nine stored constant values.
- the data memory 8c comprises on the order of thirty to forty records 10 (depending on the phonetic alphabet chosen), each consisting of on the order of half a second of recorded digital waveforms (i.e., for sampling at 20 kHz, around ten thousand samples $x_i$, each of the sample records for voiced sounds having an associated nine-element transform matrix).
- an utterance to be synthesised by the speech synthesizer consists of a sequence of portions each with an associated duration, comprising a silence portion 14a followed by a word comprising a sequence of portions 14b–14f each consisting of a phoneme of predetermined duration, followed by a further silence portion 14g, followed by a further word comprised of phoneme portions 14h–14j each of an associated duration, and so on.
- the sequence of phonemes, together with their durations, is either stored or derived by one of several well known rule systems forming no part of the present invention, but comprised within the control program.
- the closest point selected in step 508 will in fact be the last point on the current strand (in this case $s_{21}$). However, it may correspond instead to one of the nearest neighbours on that strand (as in this case, where $s_{22}$ is closer), or to a point on another strand of the trajectory where this is closely spaced in the state sequence space, as indicated in FIG. 9c.
- In step 520, the CPU 6 determines whether the required predetermined duration of the phoneme being synthesised has been reached. If not, the CPU 6 returns to step 508 of the control program and determines the new closest point on the trajectory to the most recently synthesised point. In many cases, this may be the same as the point $s_{i+1}$ from which the synthesised point was itself calculated, but this is not necessarily so.
- a human speaker recites a single utterance of a desired sound (e.g. a vowel)
- the CPU 26 and analog-to-digital converter 24 sample the analog waveform thus produced at the output of the microphone 22 and store successive samples (e.g. around 10,000 samples, corresponding to around half a second of speech) in the working memory area 28a.
- the CPU 26 is arranged to normalise the pitch of the recorded utterance by determining the start and end of each pitch pulse period (illustrated in FIG. 1), for example by determining its zero crossing points, and then equalising the number of samples within each pitch period (for example, to 140 samples per pitch period) by interpolating between the originally stored samples.
- the stored data are transferred (either by communications link or a removable carrier such as a floppy disk) to the memory 8 of the synthesis apparatus of FIG. 6.
- unvoiced sounds do not exhibit stable low dimensional behaviour; hence they do not follow regular, repeating attractors in state sequence space, and synthesis of an attractor as described above is therefore unstable. Accordingly, unvoiced sounds are produced in this embodiment by simply outputting, in succession, the waveform values $x_i$ stored for the unvoiced sound to the DAC 4. The same is true of plosive sounds.
- the present invention interpolates between two waveforms, one representing each sound, in state sequence space.
- the state space representation is useful where one or both of the waveforms between which interpolation is performed are being synthesised (i.e. one or both are voiced waveforms).
- the synthesised points in state space are derived, and then the interpolated point is calculated between them; in fact, as discussed below, it is only necessary to interpolate on one co-ordinate axis, so that the state space representation plays no part in the actual interpolation process.
- the interpolation is performed over more than one pitch pulse cycle (for example 10 cycles) by progressively linearly varying the Euclidean distance between the two waveforms in state sequence space.
- an index j is initialised (e.g. at zero).
- the transformation matrix is calculated directly at each newly synthesised point; in this case, the synthesizer of FIG. 6 incorporates the functionality of the apparatus of FIG. 10 .
- Such calculation reduces the required storage space by around one order of magnitude, although higher processing speed is required.
- a corresponding pair of points $s^a_k$, $s^b_l$ are read from the stored waveform records 10; as described in the first embodiment, the points correspond to matching parts of the respective pitch pulse cycles of the two waveforms.
- In step 814, the CPU 6 performs the steps 610–622 of FIG. 12, to calculate the transform matrices $T_k$ for each point along this stored track.
- although each interpolated trajectory and set of transformation vectors is used only once, to calculate a single output value, fewer interpolated trajectories and sets of transformation matrices could in fact be calculated, and the same trajectory used for several successive output samples.
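The synthesis loop of steps 508 to 520 (find the stored trajectory point closest to the most recently synthesised point, then compute the next point using the nine-value transform matrix stored for that point) might look roughly like the sketch below. The exact update rule is not given in the excerpts above; the rule used here, in which the matrix acts on the offset between the synthesised point and its nearest stored point, is an assumption, as are all function names.

```python
def nearest_index(trajectory, p):
    """Index of the stored point with the smallest squared Euclidean distance to p."""
    return min(
        range(len(trajectory)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(trajectory[i], p)),
    )


def mat_vec(m, v):
    """Multiply a 3x3 matrix (nested sequences) by a 3-vector."""
    return tuple(sum(m[r][c] * v[c] for c in range(3)) for r in range(3))


def synth_step(trajectory, transforms, p):
    """Advance the synthesised point p by one sample.

    ASSUMED update (not quoted from the source):
        p_next = s[i + 1] + T[i] @ (p - s[i])
    where s[i] is the stored point nearest to p and T[i] its transform matrix.
    """
    # The last stored point is excluded so that a successor s[i + 1] exists.
    i = nearest_index(trajectory[:-1], p)
    offset = tuple(a - b for a, b in zip(p, trajectory[i]))
    moved = mat_vec(transforms[i], offset)
    return tuple(s + d for s, d in zip(trajectory[i + 1], moved))


# Hypothetical toy data: a straight "trajectory" and matrices that halve the offset.
trajectory = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]
half = [[0.5, 0, 0], [0, 0.5, 0], [0, 0, 0.5]]
transforms = [half] * len(trajectory)

p_next = synth_step(trajectory, transforms, (1.0, 0.2, 0.0))
print(p_next)
```

In a full synthesiser, one coordinate of each synthesised point would be sent to the DAC as the output sample, and the loop would repeat until the phoneme duration is reached.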
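The pitch normalisation step described above (locate the pitch periods, for example from zero crossings, then interpolate so that every period holds the same number of samples, such as 140) can be illustrated as follows. This is a simplified sketch under stated assumptions: linear interpolation is assumed because the excerpt does not name the interpolation method, rising zero crossings are used as period boundaries although real speech can cross zero several times per period, and all function names are hypothetical.

```python
def rising_zero_crossings(x):
    """Indices i where the signal goes from negative to non-negative."""
    return [i for i in range(1, len(x)) if x[i - 1] < 0 <= x[i]]


def resample_period(segment, target_len=140):
    """Linearly interpolate one pitch period to exactly target_len samples."""
    n = len(segment)
    if n < 2 or target_len < 2:
        raise ValueError("need at least two input and two output samples")
    out = []
    for k in range(target_len):
        pos = k * (n - 1) / (target_len - 1)  # fractional index into segment
        i = int(pos)
        frac = pos - i
        nxt = segment[min(i + 1, n - 1)]
        out.append(segment[i] * (1 - frac) + nxt * frac)
    return out


def normalise_pitch(x, marks, target_len=140):
    """Resample every period x[marks[k]:marks[k + 1]] to target_len samples."""
    out = []
    for a, b in zip(marks, marks[1:]):
        out.extend(resample_period(x[a:b], target_len))
    return out


# Hypothetical data: two "periods" of 100 and 200 samples on a ramp signal.
x = [float(n) for n in range(300)]
normalised = normalise_pitch(x, marks=[0, 100, 300], target_len=140)
print(len(normalised))
```

After normalisation, matching positions in different pitch cycles share the same index within the cycle, which is what makes the point-to-point correspondence between cycles straightforward.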
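The transition between two sounds, interpolated over several pitch pulse cycles on a single co-ordinate axis, can be illustrated with a linear cross-weighting of two pitch-normalised cycles. The sketch below assumes that both cycles have the same number of samples and that the weight moves linearly from the first waveform to the second across the whole transition; the excerpts above do not say whether the weight changes per sample or per cycle, so that choice, like the function name `crossfade`, is an assumption.

```python
def crossfade(cycle_a, cycle_b, n_cycles=10):
    """Interpolate linearly from cycle_a to cycle_b over n_cycles pitch cycles.

    Both cycles must hold the same number of samples (pitch-normalised).
    The weight w rises from 0 (pure A) to 1 (pure B) over the transition.
    """
    if len(cycle_a) != len(cycle_b):
        raise ValueError("cycles must have equal length")
    period = len(cycle_a)
    total = n_cycles * period
    if total < 2:
        raise ValueError("transition must span at least two samples")
    out = []
    for n in range(total):
        w = n / (total - 1)   # interpolation weight, 0.0 .. 1.0
        k = n % period        # matching position within the pitch cycle
        out.append((1 - w) * cycle_a[k] + w * cycle_b[k])
    return out


# Hypothetical two-sample "cycles", faded over two cycles.
faded = crossfade([0.0, 1.0], [10.0, 11.0], n_cycles=2)
print(faded)
```

When one or both sounds are synthesised rather than replayed, the values fed to such an interpolation would come from the synthesised state-space points instead of fixed stored cycles.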
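The statement that calculating the transformation matrix at synthesis time reduces storage by around one order of magnitude follows from the record layout described above: roughly ten thousand samples per record, each voiced sample carrying nine matrix values. A small count of stored values makes this concrete. The function name is hypothetical, and the sketch counts values rather than bytes because the excerpts do not state a word size.

```python
SAMPLES_PER_RECORD = 10_000  # about half a second at 20 kHz
MATRIX_ELEMENTS = 9          # one 3x3 transform matrix per voiced sample


def values_per_voiced_record(n_samples, store_matrices):
    """Count stored values: the sample itself plus, optionally, its matrix."""
    per_sample = 1 + (MATRIX_ELEMENTS if store_matrices else 0)
    return n_samples * per_sample


stored = values_per_voiced_record(SAMPLES_PER_RECORD, store_matrices=True)
on_the_fly = values_per_voiced_record(SAMPLES_PER_RECORD, store_matrices=False)
print(stored, on_the_fly, stored / on_the_fly)
```

The ratio of ten matches the "around one order of magnitude" figure, at the cost of computing each matrix during synthesis.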
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Electrophonic Musical Instruments (AREA)
- Lasers (AREA)
- Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB9600774-5 | 1996-01-15 | ||
GBGB9600774.5A GB9600774D0 (en) | 1996-01-15 | 1996-01-15 | Waveform synthesis |
PCT/GB1997/000060 WO1997026648A1 (en) | 1996-01-15 | 1997-01-09 | Waveform synthesis |
Publications (2)
Publication Number | Publication Date |
---|---|
US20010018652A1 (en) | 2001-08-30 |
US7069217B2 (en) | 2006-06-27 |
Family
ID=10787066
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US09/043,171 (US7069217B2 (en), Expired - Fee Related) | Waveform synthesis | 1996-01-15 | 1997-01-09 |
Country Status (8)
Country | Link |
---|---|
US (1) | US7069217B2 (de) |
EP (1) | EP0875059B1 (de) |
JP (1) | JP4194656B2 (de) |
AU (1) | AU724355B2 (de) |
CA (1) | CA2241549C (de) |
DE (1) | DE69722585T2 (de) |
GB (1) | GB9600774D0 (de) |
WO (1) | WO1997026648A1 (de) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040034530A1 (en) * | 2002-05-31 | 2004-02-19 | Tomomi Hara | Data structure for waveform synthesis data and method and apparatus for synthesizing waveform |
US20040133585A1 (en) * | 2000-07-11 | 2004-07-08 | Fabrice Pautot | Data-processing arrangement comprising confidential data |
US20080172349A1 (en) * | 2007-01-12 | 2008-07-17 | Toyota Engineering & Manufacturing North America, Inc. | Neural network controller with fixed long-term and adaptive short-term memory |
US20110226116A1 (en) * | 2010-03-17 | 2011-09-22 | Casio Computer Co., Ltd. | Waveform generation apparatus and waveform generation program |
US20120016672A1 (en) * | 2010-07-14 | 2012-01-19 | Lei Chen | Systems and Methods for Assessment of Non-Native Speech Using Vowel Space Characteristics |
US20120310650A1 (en) * | 2011-05-30 | 2012-12-06 | Yamaha Corporation | Voice synthesis apparatus |
US8719030B2 (en) * | 2012-09-24 | 2014-05-06 | Chengjun Julian Chen | System and method for speech synthesis |
US9933990B1 (en) * | 2013-03-15 | 2018-04-03 | Sonitum Inc. | Topological mapping of control parameters |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
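JP3912913B2 (ja) * | 1998-08-31 | 2007-05-09 | キヤノン株式会社 | Speech synthesis method and apparatus |
JP4656443B2 (ja) * | 2007-04-27 | 2011-03-23 | カシオ計算機株式会社 | Waveform generation apparatus and waveform generation processing program |
JP5347405B2 (ja) * | 2008-09-25 | 2013-11-20 | カシオ計算機株式会社 | Waveform generation apparatus and waveform generation processing program |
JP5224552B2 (ja) * | 2010-08-19 | 2013-07-03 | 達 伊福部 | Speech generation apparatus and control program therefor |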
US11373672B2 (en) | 2016-06-14 | 2022-06-28 | The Trustees Of Columbia University In The City Of New York | Systems and methods for speech separation and neural decoding of attentional selection in multi-speaker environments |
WO2017218492A1 (en) * | 2016-06-14 | 2017-12-21 | The Trustees Of Columbia University In The City Of New York | Neural decoding of attentional selection in multi-speaker environments |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4022974A (en) | 1976-06-03 | 1977-05-10 | Bell Telephone Laboratories, Incorporated | Adaptive linear prediction speech synthesizer |
US4622877A (en) | 1985-06-11 | 1986-11-18 | The Board Of Trustees Of The Leland Stanford Junior University | Independently controlled wavetable-modification instrument and method for generating musical sound |
US4635520A (en) * | 1983-07-28 | 1987-01-13 | Nippon Gakki Seizo Kabushiki Kaisha | Tone waveshape forming device |
US4718093A (en) * | 1984-03-27 | 1988-01-05 | Exxon Research And Engineering Company | Speech recognition method including biased principal components |
EP0385444A2 (de) | 1989-03-02 | 1990-09-05 | Yamaha Corporation | Apparatus for generating a musical tone signal |
US5111505A (en) * | 1988-07-21 | 1992-05-05 | Sharp Kabushiki Kaisha | System and method for reducing distortion in voice synthesis through improved interpolation |
US5745651A (en) * | 1994-05-30 | 1998-04-28 | Canon Kabushiki Kaisha | Speech synthesis apparatus and method for causing a computer to perform speech synthesis by calculating product of parameters for a speech waveform and a read waveform generation matrix |
US5832437A (en) * | 1994-08-23 | 1998-11-03 | Sony Corporation | Continuous and discontinuous sine wave synthesis of speech signals from harmonic data of different pitch periods |
1996
- 1996-01-15 GB GBGB9600774.5A patent/GB9600774D0/en active Pending
1997
- 1997-01-09 US US09/043,171 patent/US7069217B2/en not_active Expired - Fee Related
- 1997-01-09 CA CA002241549A patent/CA2241549C/en not_active Expired - Fee Related
- 1997-01-09 EP EP97900309A patent/EP0875059B1/de not_active Expired - Lifetime
- 1997-01-09 WO PCT/GB1997/000060 patent/WO1997026648A1/en active IP Right Grant
- 1997-01-09 AU AU13897/97A patent/AU724355B2/en not_active Ceased
- 1997-01-09 DE DE69722585T patent/DE69722585T2/de not_active Expired - Lifetime
- 1997-01-09 JP JP52576897A patent/JP4194656B2/ja not_active Expired - Fee Related
Non-Patent Citations (12)
Title |
---|
Daniel P. Lathrop et al. (Characterization of an experimental strange attractor by periodic orbits), Physical Review, p. 4028-4031, 1989. * |
Gabriel B. Mindlin et al. (Topological analysis and synthesis of chaotic time series), Physica D, pp. 229-242, 1992. * |
IBM Technical Disclosure Bulletin, vol. 28, No. 3, Aug. 1985, New York, US, pp. 1248-1249, Anonymous, Use of the Grid Search Technique for Improving Synthetic Speech Control-Data. |
IEE Colloquium on 'Exploiting Chaos in Signal Processing' (Digest No. 1994/143), Jun. 6, 1994, London, GB, pp. 8/1-10, Banbrook et al, "Is speech chaotic?: invariant geometrical measures for speech data". |
IEEE 100 The Authoritative Dictionary of IEEE Standards Terms, Seventh Edition, Standards Information Network IEEE Press 2000. p. 1000. * |
IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, vol. E76-A, No. 11, Nov. 1993, JP, pp. 1964-1970, Hirokawa et al, "High quality speech synthesis system based on waveform concatenation of phoneme segment". |
International Conference on Acoustics, Speech, and Signal Processing 1988, vol. 1, Apr. 11-14, 1988, New York, NY, pp. 675-678, Everett, "Word synthesis based on line spectrum pairs". |
Kleijn, W.B. and Paliwal, K.K. (Eds), 'Speech Coding and Synthesis', pp. 557-559, 581-587, 600-610 Elsevier Science B.V., 1995. |
M. Banbrook and S. McLaughlin, "Speech Characterisation by Non-Linear Methods", presented at IEEE workshop on Nonlinear Signal and Image Processing NSIP '95, pp. 396-400, Jun. 1995. |
M. Casdagli, "Chaos and Deterministic versus Stochastic Non-Linear Modelling", Journal of the Royal Statistical Society B, vol. 54, No. 2, pp. 303-328, 1991. |
Mark Shelhamer (Correlation Dimension of Optokinetic Nystagmus as Evidence of Chaos in the Oculomotor System), IEEE Transactions on Biomedical Engineering, vol. 39, No. 12, p. 1319-1321, 1992. * |
Westall, F.A. and Ip, S.F.A, "Digital Signal Processing in Telecommunications", pp. 295-297, Chapman & Hall, 1993. |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040133585A1 (en) * | 2000-07-11 | 2004-07-08 | Fabrice Pautot | Data-processing arrangement comprising confidential data |
US7486794B2 (en) * | 2000-07-11 | 2009-02-03 | Gemalto Sa | Data-processing arrangement comprising confidential data |
US7714935B2 (en) * | 2002-05-31 | 2010-05-11 | Leader Electronics Corporation | Data structure for waveform synthesis data and method and apparatus for synthesizing waveform |
US20040034530A1 (en) * | 2002-05-31 | 2004-02-19 | Tomomi Hara | Data structure for waveform synthesis data and method and apparatus for synthesizing waveform |
US20080172349A1 (en) * | 2007-01-12 | 2008-07-17 | Toyota Engineering & Manufacturing North America, Inc. | Neural network controller with fixed long-term and adaptive short-term memory |
US8373056B2 (en) * | 2010-03-17 | 2013-02-12 | Casio Computer Co., Ltd | Waveform generation apparatus and waveform generation program |
US20110226116A1 (en) * | 2010-03-17 | 2011-09-22 | Casio Computer Co., Ltd. | Waveform generation apparatus and waveform generation program |
US20120016672A1 (en) * | 2010-07-14 | 2012-01-19 | Lei Chen | Systems and Methods for Assessment of Non-Native Speech Using Vowel Space Characteristics |
US9262941B2 (en) * | 2010-07-14 | 2016-02-16 | Educational Testing Services | Systems and methods for assessment of non-native speech using vowel space characteristics |
US20120310650A1 (en) * | 2011-05-30 | 2012-12-06 | Yamaha Corporation | Voice synthesis apparatus |
US8996378B2 (en) * | 2011-05-30 | 2015-03-31 | Yamaha Corporation | Voice synthesis apparatus |
US8719030B2 (en) * | 2012-09-24 | 2014-05-06 | Chengjun Julian Chen | System and method for speech synthesis |
US9933990B1 (en) * | 2013-03-15 | 2018-04-03 | Sonitum Inc. | Topological mapping of control parameters |
Also Published As
Publication number | Publication date |
---|---|
AU724355B2 (en) | 2000-09-21 |
DE69722585D1 (de) | 2003-07-10 |
EP0875059A1 (de) | 1998-11-04 |
CA2241549A1 (en) | 1997-07-24 |
JP2000503412A (ja) | 2000-03-21 |
DE69722585T2 (de) | 2004-05-13 |
EP0875059B1 (de) | 2003-06-04 |
CA2241549C (en) | 2002-09-10 |
JP4194656B2 (ja) | 2008-12-10 |
US20010018652A1 (en) | 2001-08-30 |
AU1389797A (en) | 1997-08-11 |
GB9600774D0 (en) | 1996-03-20 |
WO1997026648A1 (en) | 1997-07-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US6836761B1 (en) | Voice converter for assimilation by frame synthesis with temporal alignment | |
US5740320A (en) | Text-to-speech synthesis by concatenation using or modifying clustered phoneme waveforms on basis of cluster parameter centroids | |
EP2276019B1 (de) | Apparatus and method for creating a singing synthesizing database, and apparatus and method for pitch curve generation | |
US7069217B2 (en) | Waveform synthesis | |
EP2270773B1 (de) | Apparatus and method for creating a singing synthesizing database, and apparatus and method for pitch curve generation | |
US7035791B2 (en) | Feature-domain concatenative speech synthesis | |
US8280724B2 (en) | Speech synthesis using complex spectral modeling | |
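JP2000172285A (ja) | Demisyllable-concatenation formant-based speech synthesizer with independent cross-fade in the filter parameter and source domains | |
EP0380572A1 (de) | Speech generation from digitally stored coarticulated speech segments. | |
JPS63285598A (ja) | Phoneme-concatenation parameter rule synthesis system | |
US5890118A (en) | Interpolating between representative frame waveforms of a prediction error signal for speech synthesis | |
JPH0727397B2 (ja) | Speech synthesis apparatus | |
JP4430174B2 (ja) | Voice conversion apparatus and voice conversion method | |
WO2004027753A1 (en) | Method of synthesis for a steady sound signal | |
JP4454780B2 (ja) | Speech information processing apparatus, method therefor, and storage medium | |
JP2000099020A (ja) | Vibrato control method and program recording medium | |
JP3904871B2 (ja) | Prosody generation method and prosody generation program for singing voice synthesis, and recording medium storing the program | |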
Jayasinghe | Machine Singing Generation Through Deep Learning | |
Rodet | Sound analysis, processing and synthesis tools for music research and production | |
CN118262696A (en) | Singing voice synthesis model training method, singing voice synthesis method, device and storage medium | |
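CN117995163A (zh) | Speech editing method and apparatus | |
JPH0962295A (ja) | Speech segment creation method, speech synthesis method, and apparatus therefor | |
JPS58105198A (ja) | Speech analysis and synthesis method | |
JPH07104795A (ja) | Rule-based speech synthesis apparatus |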
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: BRITISH TELECOMMUNICATIONS PUBLIC LIMITED COMPANY, Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:MCLAUGHLIN, STEPHEN;BANBROOK, MICHAEL;REEL/FRAME:009456/0754 Effective date: 19980206 |
|
FEPP | Fee payment procedure |
Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
FPAY | Fee payment |
Year of fee payment: 4 |
|
REMI | Maintenance fee reminder mailed | ||
LAPS | Lapse for failure to pay maintenance fees | ||
STCH | Information on status: patent discontinuation |
Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362 |
|
FP | Lapsed due to failure to pay maintenance fee |
Effective date: 20140627 |