EP0144731A2 - Speech synthesizer - Google Patents
- Publication number
- EP0144731A2 (application EP84113186A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- speech
- waveform
- articulation
- unit
- memory
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Electrophonic Musical Instruments (AREA)
Description
- This invention relates to a speech synthesizer.
- In conventional speech synthesis, appropriate syllable waveforms represented by vowel - consonant - vowel (VCV) combinations are prepared in advance and connected together. However, since the number of VCV combinations is very large, an enormous memory capacity is required to store them. On the other hand, a method has been proposed in which waveforms corresponding to consonant - vowel (CV) or vowel - consonant (VC) combinations, namely demisyllables or diphones, which have a time length of about half that of a single syllable, are prepared in advance, and the waveforms corresponding to the CV or VC units required for the synthesized speech are selected and connected together (compiled and synthesized). This method requires less memory than preparing VCV waveforms, but a relatively large memory capacity is still required because of the large quantity of speech waveform information corresponding to CV and VC.
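To give a feel for why VCV inventories grow so quickly, the following sketch counts units for a hypothetical phoneme inventory. The inventory sizes (5 vowels, 20 consonants) and the function name `unit_counts` are illustrative assumptions and do not come from the patent. The sketch counts units only and ignores the fact that a VCV unit is longer than a CV or VC unit.

```python
def unit_counts(num_vowels: int, num_consonants: int) -> dict:
    """Count the waveform units needed by each scheme (illustrative sizes only)."""
    # Every vowel-consonant-vowel triple needs its own stored waveform.
    vcv = num_vowels * num_consonants * num_vowels
    # CV and VC units each need one waveform per consonant-vowel pairing.
    cv_vc = 2 * num_vowels * num_consonants
    return {"VCV": vcv, "CV+VC": cv_vc}


counts = unit_counts(num_vowels=5, num_consonants=20)
print(counts)
```

Under these assumed sizes the VCV inventory is several times larger than the CV/VC inventory, which is the trade-off the paragraph above describes.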
- Accordingly, it is an object of the invention to provide a speech synthesizer which requires a comparatively small memory capacity in respect of speech data such as speech waveforms to be prepared in advance.
- It is another object of the invention to provide a speech synthesizer which has the above advantage and by which synthesized speech of high quality can be obtained.
- According to the present invention, there is provided a speech synthesizer comprising a converting means for converting the input sequence of characters to a sequence of articulation symbols corresponding to a unit speech waveform which is obtained by dividing a diphone, a memory for storing the unit speech waveform corresponding to the predetermined articulation symbols and a synthesizing means for reading the unit speech waveforms corresponding to the articulation symbols of the converted sequence of articulation symbols from the memory and synthesizing them.
- These and other objects and features of the present invention will become clear from the following description of a preferred embodiment of the present invention with reference to the accompanying drawings.
- Fig. 1 is a block diagram showing the structure of an embodiment of a speech synthesizer according to the invention;
- Fig. 2 is a table of the information stored in a memory 32 of a phoneme symbol/articulation symbol converting part 30 of the synthesizer shown in Fig. 1;
- Fig. 3 illustrates the concept of the articulatory organs of the human body for explaining the principle of the invention;
- Figs. 4A and 4B show examples of articulatory segments for explaining the principle of the invention;
- Fig. 5 shows waveforms interpolated by a synchronous pitch method used in the present invention; and
- Fig. 6 is a waveform of synthesized speech formed by compiling and synthesizing waveforms of articulation element pieces.
- Referring to Fig. 1, the speech to be synthesized is first specified on a keyboard 10. The keyboard 10 generates a character sequence signal, a stress strength signal (three-levelled in this embodiment) and a boundary signal marking the boundaries between utterances. Hereinafter, the structure and operation of the speech synthesizer shown in Fig. 1 will be described on the assumption that the speech to be synthesized is "kite".
- The alphabetical character sequence signals for "kite" are generated by pressing the keys "K", "I", "T" and "E". The boundary signal B, which indicates boundaries such as the beginning, ending and pauses of the word "kite", and the stress strength signal ST are supplied, together with the character sequence signal, to a phoneme symbol/articulation symbol converting circuit 20. The stress strength is determined based on the pitch and strength of each syllable; for example, a high stress strength indicates a high pitch frequency. The converting circuit 20 has a processing part 21 and a memory 22. The memory 22 stores, in advance, the phoneme symbols corresponding to the speech. For example, the phoneme symbol /kait/ is stored in correspondence with the word "kite". The processing part 21 supplies address information to the memory 22 in response to the input character sequence signal. The phoneme symbol signal /kait/ is then read from the memory 22 and supplied to a phoneme symbol/articulation symbol converting circuit 30. Like the converting circuit 20, the converting circuit 30 has a processing part 31 and a memory 32. The memory 32 stores articulation symbols (each determined by the phonemes located before and after it), which are determined in advance in correspondence with the phoneme symbols by the method peculiar to the present invention, described below.
- The articulatory organs of the human being include the vocal cords, the tongue, the lips, the velum palatinum (soft palate), etc., as shown in Fig. 3, and various speech sounds are generated by controlling these articulatory organs in accordance with nerve pulse signals. Therefore, if two articulations of the articulatory organs are similar, two similar speech waveforms are generated. Likewise, if the articulation parameter values representing the movements of these articulatory organs are close to each other, the generated speech waveforms are similar. As described above, in the conventional synthesizing method of the CV, VC waveform connecting type, many speech waveforms corresponding to CV and VC are prepared, but from the viewpoint of the movement of the articulation parameters they include considerable redundancy. For example, in the CV, VC waveform connecting type method, the speech waveform corresponding to /ka/ and that corresponding to /ga/ are prepared separately. However, the movements of the articulatory organs for /ka/ and for /ga/ are very similar. The relationship between the tongue, palate, etc. is almost the same, and the main difference is whether the vocal cords are vibrating or not (voiced or unvoiced) in the consonant part. Therefore, in the voiced section that follows the unvoiced section of the consonant part /k/ in /ka/ (the section shifting to the normal part of the vowel /a/, which corresponds to C in Fig. 4A), the articulation parameter is almost the same as that of /ga/ (C in Fig. 4B), so the waveform of /ga/ can take the place of the partial waveform of /ka/ in that section with a fairly good approximation. It is clear that in the pairs /kV/ - /gV/, /tV/ - /dV/ and /pV/ - /bV/ (V represents a vowel) the waveforms in the part shifting to the vowel can likewise be shared. In Figs. 4A and 4B, part A is the silent part at the beginning of /ka/ or /ga/ (represented as *), part B is the waveform of "k" in /ka/ or "g" in /ga/, B' is the waveform of the part affected by the phoneme following "g" in /ga/, and C and D are, as described above, the speech waveforms of the vowel "a" following the consonant of /ka/ and /ga/.
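The sharing of vowel transitions between /kV/ - /gV/, /tV/ - /dV/ and /pV/ - /bV/ can be sketched as a small lookup. This is only an illustration: the names `VOICED_COUNTERPART` and `cv_transition_symbol` are hypothetical, the five-vowel list is an assumption, and the `(g)a`-style notation is borrowed from the articulation symbols of Fig. 2.

```python
# Unvoiced explosive -> voiced counterpart whose vowel transition is reused.
VOICED_COUNTERPART = {"k": "g", "t": "d", "p": "b"}


def cv_transition_symbol(consonant: str, vowel: str) -> str:
    """Return the shared symbol for the consonant-to-vowel transient part."""
    # An unvoiced explosive borrows the transition of its voiced counterpart;
    # a voiced consonant simply uses its own transition.
    donor = VOICED_COUNTERPART.get(consonant, consonant)
    return f"({donor}){vowel}"


# Assumed vowel inventory, for illustration only.
vowels = ["a", "i", "u", "e", "o"]
consonants = ["k", "g", "t", "d", "p", "b"]
shared = {cv_transition_symbol(c, v) for c in consonants for v in vowels}
print(len(consonants) * len(vowels), "CV pairs share", len(shared), "transitions")
```

In this sketch /ka/ and /ga/ both resolve to `(g)a`, so only the voiced member of each pair needs a stored transition waveform.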
- Here, a time section which is determined in consideration of the manner of articulation, is shorter than a CV or VC waveform, and can be substituted by a speech waveform based on a different phoneme series, as shown in Figs. 4A and 4B, is called an articulation segment, and the speech waveform in an articulation segment is called an articulation element piece waveform. That is, the syllables /ka/ and /ga/ are divided into the time sections B and C so that the transient parts of those syllables can be used in the synthesis of other speech.
- As described above, articulation segments whose manner of articulation is the same are represented by the same articulation symbol, and the articulation element piece waveform corresponding to this articulation symbol is stored in the memory 32 in advance. In this way, the articulation symbols corresponding to a sequence of phoneme symbols are stored in the memory 32 in advance. Fig. 2 shows the classified articulation symbols, in which * represents the silent part placed at the beginning of speech or immediately before an explosive; "p", "t", "k" represent explosive parts; and (b)a, (d)a, (g)a represent the transient parts of the vowel "a" which follow the consonants "b", "d", "g". On the other hand, i(b), i(d), i(g) represent the transient parts of the vowel "i" which precede the consonants "b", "d", "g", and ai, au, ao represent the transient parts where the vowel "a" is followed by the vowels "i", "u" and "o".
- Returning now to Fig. 1, in response to an address corresponding to the phoneme signal /kait/ sent from the processing part 21, a sequence of articulation symbols /* - k - (g)a - ai - i(d) - * - t/ corresponding to the phoneme signal /kait/ is read as an articulation signal from the memory 32 in the phoneme symbol/articulation symbol converting circuit 30. Here, * represents the silent part described above (#1 in Fig. 2), "k" and "t" the explosive parts of /k/ and /t/ (#2, #6 in Fig. 2), "(g)a" the transient part shifting from the consonant to the vowel of /ga/ (#3), "ai" the transient part of the vowel link /ai/ (#4), and "i(d)" the transient part shifting from the vowel to the consonant of /id/ (#5). In this example, /ka/ in the phoneme symbol /kait/ is replaced by the unvoiced explosive "k" and by "(g)a", which represents the transient part shifting from the consonant to the vowel of the phoneme symbol /ga/ resembling /ka/. The phoneme symbol /it/ is replaced by the transient part i(d) shifting from the vowel to the consonant of the phoneme symbol /id/ resembling /it/, and a silent part * is placed immediately before the unvoiced explosive "t".
- As described above, synthesizing speech by using, in place of /ka/ and /it/, waveforms taken from /ga/ and /id/, whose phoneme sequences differ from but whose articulation is similar to that of /ka/ and /it/, dispenses with the need to store the transient parts of /ka/ and /it/ and enables a reduction in memory capacity. These articulation element piece waveforms can be easily obtained from, for example, waveforms of uttered speech.
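The /kait/ example can be reproduced with a few rules. Note that the patent reads such sequences from the memory 32 as stored data; deriving them by rule, as below, is an assumption made purely for illustration. The function name `to_articulation_symbols` is hypothetical and the rules cover only vowels and the unvoiced explosives p, t, k.

```python
VOWELS = set("aiueo")
# Unvoiced explosive -> voiced counterpart that donates the transient part.
VOICED_COUNTERPART = {"p": "b", "t": "d", "k": "g"}


def to_articulation_symbols(phonemes):
    """Convert a phoneme list into articulation symbols (illustrative rules)."""
    symbols = []
    for idx, ph in enumerate(phonemes):
        nxt = phonemes[idx + 1] if idx + 1 < len(phonemes) else None
        if ph in VOICED_COUNTERPART:
            # Silent part, then the explosive part itself.
            symbols += ["*", ph]
            if nxt in VOWELS:
                # Consonant-to-vowel transient borrowed from the voiced pair.
                symbols.append(f"({VOICED_COUNTERPART[ph]}){nxt}")
        elif ph in VOWELS:
            if nxt in VOWELS:
                # Vowel-to-vowel link such as "ai".
                symbols.append(ph + nxt)
            elif nxt in VOICED_COUNTERPART:
                # Vowel-to-consonant transient borrowed from the voiced pair.
                symbols.append(f"{ph}({VOICED_COUNTERPART[nxt]})")
        else:
            raise ValueError(f"phoneme not covered by this sketch: {ph!r}")
    return symbols


print(" - ".join(to_articulation_symbols(list("kait"))))
```

For the input /kait/ these rules yield the same seven symbols as the sequence quoted in the text. Other inputs are not guaranteed to match what the patent's table would hold.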
- The articulation signal thus obtained is supplied to a waveform address generation circuit 50. The waveform address generation circuit 50 reads the articulation element piece waveform corresponding to each articulation symbol contained in the articulation signal and to the stress signal ST from an articulation waveform memory which is selected by the stress signal ST from among memories 80a, 80b and 80c included in an articulation waveform memory 80. In other words, the articulation element piece waveform is read from the memory 80 on the basis of the address corresponding to each articulation symbol. The stress signal ST from the processing part 21 is detected in a stress strength detection circuit 40, and the articulation element piece waveform of the strength corresponding to the detected stress strength is read from the memory 80. The articulation waveform memory 80 stores the articulation element piece waveforms corresponding to the articulation symbols shown in Fig. 2.
- An interpolation method selection circuit 60 judges whether the articulation symbol (two continuous waveforms) from the phoneme symbol/articulation symbol converting circuit 30 is voiced or unvoiced. The interpolation circuit 70 is controlled by the result of this judgment as follows: when the articulation symbol is unvoiced (or silent), the two continuous articulation element piece waveforms read from the memory 80 are directly connected; when the articulation symbol is voiced, these waveforms are interpolated, for example, synchronously with the pitch.
- Generally, direct connection of the articulation waveforms produces unnatural synthesized speech because of the discontinuous change in pitch or spectrum. To eliminate this drawback, in this invention any spoken word is synthesized by connecting articulation waveforms having several levels of pitch through pitch-synchronous interpolation between waveforms. For example, as shown in Fig. 5, if one pitch period of the waveform (element piece waveform) at the connected ending part of a temporally preceding unit speech waveform is f(n) and its time length (pitch period) is Nf, the element piece waveform at the connected beginning part of the succeeding unit speech waveform is g(n) and its time length (pitch period) is Ng, and the element piece waveform in the i-th section of an interpolation waveform of k pitch sections is hi(n), then hi(n) is generated on the basis of the following formulae:
-
- Ni, namely the time length of hi(n), is assumed to be the value obtained by interpolating Nf and Ng. In this case, when Nf or Ng is shorter than Ni, the final sample value of the waveform may be repeated, and when Nf or Ng is longer than Ni, the surplus waveform may be discarded.
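The patent's own formulae are not reproduced in this text, so the following is only a sketch of one way pitch-synchronous interpolation could work. It assumes linear weighting between f(n) and g(n) and a linearly interpolated, rounded period length Ni; both choices, and the names `fit_length` and `interpolate_pitch_periods`, are assumptions rather than the patent's method. The length adjustment follows the rule stated above: repeat the final sample when a period is too short, discard the surplus when it is too long.

```python
def fit_length(wave, length):
    """Stretch or trim one pitch period to `length` samples (wave must be non-empty)."""
    if len(wave) >= length:
        # Discard the surplus part of the waveform.
        return list(wave[:length])
    # Repeat the final sample value until the target length is reached.
    return list(wave) + [wave[-1]] * (length - len(wave))


def interpolate_pitch_periods(f, g, k):
    """Return k intermediate pitch periods between f and g (linear weights assumed)."""
    periods = []
    for i in range(1, k + 1):
        w = i / (k + 1)  # weight of g grows with the section index i
        n_i = round((1 - w) * len(f) + w * len(g))  # interpolated period length
        f_i = fit_length(f, n_i)
        g_i = fit_length(g, n_i)
        periods.append([(1 - w) * a + w * b for a, b in zip(f_i, g_i)])
    return periods


f = [0.0, 0.0, 0.0, 0.0]  # preceding pitch period, length 4
g = [1.0, 1.0]            # succeeding pitch period, length 2
print(interpolate_pitch_periods(f, g, k=1))
```

With a single interpolated section the sketch produces a period of length 3 whose samples lie midway between the two inputs. Using more sections gives a more gradual change in both pitch period and waveform shape.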
- In this way, a continuous articulation element piece waveform (in this example, a digital waveform) corresponding to the input sequence of characters is supplied to a D/A converter 90, where the interpolated synthesized articulation waveform is converted to an analogue waveform and output as synthesized speech. The waveform of the synthesized speech obtained in this way is shown in Fig. 6.
- As described above, this invention uses a unit of speech which is shorter in time than unit speech waveforms such as the CV, VC waveforms of the CV, VC waveform compiling type synthesizing method. It therefore not only requires a small waveform memory capacity but also reflects the articulation of the articulatory organs exactly, so that synthesized speech of high quality is obtained.
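The stress-dependent memory selection and the choice between direct connection and interpolation, described for the circuits 40 to 70 and the memory 80, can be summarized in a short sketch. Several points are assumptions: that stress levels 1 to 3 map to the memories 80a, 80b and 80c in that order, that the silent part and the explosive parts count as unvoiced, and that a join is interpolated only when both neighbouring symbols are voiced. All function names are hypothetical.

```python
# Assumption: stress levels 1..3 select memories 80a, 80b, 80c in that order.
BANK_BY_STRESS = {1: "80a", 2: "80b", 3: "80c"}

# Assumption: the silent part and the explosive parts are treated as unvoiced.
UNVOICED_SYMBOLS = {"*", "p", "t", "k"}


def select_bank(stress_level: int) -> str:
    """Pick the waveform memory for a three-level stress signal."""
    return BANK_BY_STRESS[stress_level]


def connection_method(prev_symbol: str, next_symbol: str) -> str:
    """Connect directly if either neighbour is unvoiced, else interpolate."""
    if prev_symbol in UNVOICED_SYMBOLS or next_symbol in UNVOICED_SYMBOLS:
        return "direct"
    return "interpolate"


def plan_joins(symbols):
    """Return the connection method for each adjacent pair of symbols."""
    return [connection_method(a, b) for a, b in zip(symbols, symbols[1:])]


kite = ["*", "k", "(g)a", "ai", "i(d)", "*", "t"]
print(select_bank(2), plan_joins(kite))
```

For the "kite" sequence this plan interpolates only across the voiced run (g)a, ai, i(d) and connects every join that touches a silent or explosive part directly.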
- In the embodiment described above, articulation element piece waveforms corresponding to articulation symbols are compiled and synthesized, but it is clear that a reduction in memory capacity is also possible when this invention is applied to a synthesizing method using what is called a "characteristic parameter" such as a formant parameter.
Claims (12)
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP205227/83 | 1983-11-01 | ||
JP58205227A JPH0642158B2 (en) | 1983-11-01 | 1983-11-01 | Speech synthesizer |
Publications (3)
Publication Number | Publication Date |
---|---|
EP0144731A2 (en) | 1985-06-19 |
EP0144731A3 (en) | 1985-07-03 |
EP0144731B1 (en) | 1988-09-07 |
Family
ID=16503507
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP84113186A Expired EP0144731B1 (en) | 1983-11-01 | 1984-11-02 | Speech synthesizer |
Country Status (3)
Country | Link |
---|---|
EP (1) | EP0144731B1 (en) |
JP (1) | JPH0642158B2 (en) |
DE (1) | DE3473956D1 (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5463713A (en) * | 1991-05-07 | 1995-10-31 | Kabushiki Kaisha Meidensha | Synthesis of speech from text |
WO1997034291A1 (en) * | 1996-03-14 | 1997-09-18 | G Data Software Gmbh | Microsegment-based speech-synthesis process |
EP1617408A2 (en) | 2004-07-15 | 2006-01-18 | Yamaha Corporation | Voice synthesis apparatus and method |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH02141054A (en) * | 1988-11-21 | 1990-05-30 | Nec Home Electron Ltd | Terminal equipment for personal computer communication |
JP5782751B2 (en) * | 2011-03-07 | 2015-09-24 | ヤマハ株式会社 | Speech synthesizer |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE2531006A1 (en) * | 1975-07-11 | 1977-01-27 | Deutsche Bundespost | Speech synthesis system from diphthongs and phonemes - uses time limit for stored diphthongs and their double application |
EP0058130A2 (en) * | 1981-02-11 | 1982-08-18 | Eberhard Dr.-Ing. Grossmann | Method for speech synthesizing with unlimited vocabulary, and arrangement for realizing the same |
DE3220281A1 (en) * | 1981-05-29 | 1982-12-23 | Matsushita Electric Industrial Co., Ltd., Kadoma, Osaka | System for composing a voice through compilation of phoneme components |
DE3246712A1 (en) * | 1981-12-17 | 1983-06-30 | Matsushita Electric Industrial Co., Ltd., Kadoma, Osaka | METHOD FOR COMPOSING A VOICE ANALYSIS |
EP0087199A1 (en) * | 1982-02-24 | 1983-08-31 | Koninklijke Philips Electronics N.V. | Device for generating audio information of individual characters |
EP0107945A1 (en) * | 1982-10-19 | 1984-05-09 | Kabushiki Kaisha Toshiba | Speech synthesizing apparatus |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPS5331561A (en) * | 1976-09-04 | 1978-03-24 | Mitsukawa Shiyouichi | Method of manufacturing ssshaped springs |
JPS5868099A (en) * | 1981-10-19 | 1983-04-22 | 富士通株式会社 | Voice synthesizer |
1983
- 1983-11-01 JP JP58205227A patent/JPH0642158B2/en not_active Expired - Lifetime
1984
- 1984-11-02 DE DE8484113186T patent/DE3473956D1/en not_active Expired
- 1984-11-02 EP EP84113186A patent/EP0144731B1/en not_active Expired
Non-Patent Citations (4)
Title |
---|
ICASSP 80, PROCEEDINGS OF THE IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, Vol.2, April 9-11, 1980, Fairmont Hotel, DENVER, COLORADO, (US), pages 557-560, IEEE, NEW YORK, (US) S. IMAI et al.:"Cepstral synthesis of Japanese from CV syllable parameters". * |
ICASSP 82, PROCEEDINGS OF THE IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, Vol. 2, May 3-5, 1982, Palais des Congres, PARIS, (FR), pages 936-939, IEEE, NEW YORK, (US) E.R. GROSSMANN:"Speech synthesis in the time domain from text". * |
PROCEEDINGS OF THE SEMINAR ON PATTERN RECOGNITION, Vol. 1, November 19-20, 1977, University Sart-Tilman, LIEGE, (BE) pages 4.4.1 - 4.4.6, Sitel, OPHAIN, (BE) D. TEIL:"Un peripherique a reponse vocale: L'ICOPHONE 5". * |
THE JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA, Vol. 25, No. 1, January 1953, pages 105-113, (US) A.S. HOUSE et al.:"The influence of consonant environment upon the secondary acoustical characteristics of vowels". * |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5463713A (en) * | 1991-05-07 | 1995-10-31 | Kabushiki Kaisha Meidensha | Synthesis of speech from text |
WO1997034291A1 (en) * | 1996-03-14 | 1997-09-18 | G Data Software Gmbh | Microsegment-based speech-synthesis process |
DE19610019A1 (en) * | 1996-03-14 | 1997-09-18 | Data Software Gmbh G | Digital speech synthesis process |
DE19610019C2 (en) * | 1996-03-14 | 1999-10-28 | Data Software Gmbh G | Digital speech synthesis process |
US6308156B1 (en) | 1996-03-14 | 2001-10-23 | G Data Software Gmbh | Microsegment-based speech-synthesis process |
EP1617408A2 (en) | 2004-07-15 | 2006-01-18 | Yamaha Corporation | Voice synthesis apparatus and method |
EP1617408A3 (en) * | 2004-07-15 | 2007-06-20 | Yamaha Corporation | Voice synthesis apparatus and method |
US7552052B2 (en) | 2004-07-15 | 2009-06-23 | Yamaha Corporation | Voice synthesis apparatus and method |
Also Published As
Publication number | Publication date |
---|---|
EP0144731A3 (en) | 1985-07-03 |
DE3473956D1 (en) | 1988-10-13 |
EP0144731B1 (en) | 1988-09-07 |
JPH0642158B2 (en) | 1994-06-01 |
JPS6097396A (en) | 1985-05-31 |
Similar Documents
Publication | Title |
---|---|
US4862504A (en) | Speech synthesis system of rule-synthesis type |
US6778962B1 (en) | Speech synthesis with prosodic model data and accent type |
US7240005B2 (en) | Method of controlling high-speed reading in a text-to-speech conversion system |
EP0831460B1 (en) | Speech synthesis method utilizing auxiliary information |
JP3361066B2 (en) | Voice synthesis method and apparatus |
US5463713A (en) | Synthesis of speech from text |
EP0427485A2 (en) | Speech synthesis apparatus and method |
US6035272A (en) | Method and apparatus for synthesizing speech |
EP0239394B1 (en) | Speech synthesis system |
KR20000005183A (en) | Image synthesizing method and apparatus |
US5212731A (en) | Apparatus for providing sentence-final accents in synthesized american english speech |
US6970819B1 (en) | Speech synthesis device |
EP0144731B1 (en) | Speech synthesizer |
EP0107945B1 (en) | Speech synthesizing apparatus |
JPS6050600A (en) | Rule synthesization system |
JP3060276B2 (en) | Speech synthesizer |
Furtado et al. | Synthesis of unlimited speech in Indian languages using formant-based rules |
JP3771565B2 (en) | Fundamental frequency pattern generation device, fundamental frequency pattern generation method, and program recording medium |
JP3081300B2 (en) | Residual driven speech synthesizer |
JP3318290B2 (en) | Voice synthesis method and apparatus |
JP3086333B2 (en) | Voice synthesis device and voice synthesis method |
JP2573586B2 (en) | Rule-based speech synthesizer |
JPH11161297A (en) | Method and device for voice synthesizer |
JPS58168096A (en) | Multi-language voice synthesizer |
Eady et al. | Pitch assignment rules for speech synthesis by word concatenation |
Legal Events
Code | Title | Description |
---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase | Free format text: ORIGINAL CODE: 0009012 |
PUAL | Search report despatched | Free format text: ORIGINAL CODE: 0009013 |
17P | Request for examination filed | Effective date: 19841102 |
AK | Designated contracting states | Designated state(s): DE FR GB |
AK | Designated contracting states | Designated state(s): DE FR GB |
17Q | First examination report despatched | Effective date: 19861003 |
GRAA | (expected) grant | Free format text: ORIGINAL CODE: 0009210 |
AK | Designated contracting states | Kind code of ref document: B1; Designated state(s): DE FR GB |
REF | Corresponds to: | Ref document number: 3473956; Country of ref document: DE; Date of ref document: 19881013 |
ET | Fr: translation filed | |
PLBE | No opposition filed within time limit | Free format text: ORIGINAL CODE: 0009261 |
STAA | Information on the status of an ep patent application or granted ep patent | Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
26N | No opposition filed | |
REG | Reference to a national code | Ref country code: GB; Ref legal event code: IF02 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: GB; Payment date: 20021030; Year of fee payment: 19 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: DE; Payment date: 20021107; Year of fee payment: 19 |
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] | Ref country code: FR; Payment date: 20021108; Year of fee payment: 19 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: GB; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20031102 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: DE; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20040602 |
GBPC | Gb: european patent ceased through non-payment of renewal fee | Effective date: 20031102 |
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] | Ref country code: FR; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20040730 |
REG | Reference to a national code | Ref country code: FR; Ref legal event code: ST |