JPH0266600A

JPH0266600A - Speech synthesis system

Info

Publication number: JPH0266600A
Application number: JP63219371A
Authority: JP
Inventors: Kazuhiko Iwata; 和彦岩田
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1988-08-31
Filing date: 1988-08-31
Publication date: 1990-03-06

Abstract

PURPOSE: To generate good-quality synthetic speech with little quality degradation by finding a sound source parameter that represents the spectral characteristics of a sound source signal and generating a sound source signal with a desired pitch period from the sound source parameter and the sound source signal. CONSTITUTION: A parameter representing vocal tract characteristics, stored in a filter parameter storage part 104, is selected according to a filter parameter select signal and sent to a speech synthesizing filter part 109. A sound source signal storage part 105 selects a sound source signal according to a sound source select signal input from a sound source signal select signal input terminal 102 and sends it to a changeover switch part 106. The speech synthesizing filter part 109 forms a speech synthesizing filter from the parameter sent from the filter parameter storage part 104, synthesizes a speech signal by feeding the filter the sound source signal sent from the sound source signal storage part 105 through the changeover switch part 106, and outputs the synthesized signal to a synthetic speech output terminal 110. Consequently, good-quality synthetic speech with little quality degradation is obtained.

Description

[Detailed Description of the Invention] (Industrial Field of Application) The present invention relates to a speech synthesis method.

(Prior Art) One known way to synthesize speech is to separate a speech signal uttered by a human into parameters representing the characteristics of the human vocal tract and a signal representing the sound source. A speech synthesis filter with the characteristics given by the vocal tract parameters is then constructed, the sound source signal is fed into that filter, and the filter output is taken as the synthesized speech. Linear prediction coefficients, line spectrum pairs, and formants are known parameters for representing vocal tract characteristics. As the sound source signal, a pulse train is often used for voiced sounds, which involve vocal cord vibration, and random noise for unvoiced sounds, which do not. However, the quality of speech synthesized from pulse trains or random noise is quite low. To improve quality, the modeling error that remains when the vocal tract characteristics are modeled by parameters is sometimes used as the sound source. Below, this modeling error is called the residual. A speech synthesis method that uses the residual as the sound source is described in detail in, for example, Sato, "Speech synthesis using phoneme chains and residual waveforms," Proceedings of the 1981 Autumn Meeting of the Acoustical Society of Japan, 1-2-16 (Reference 1).
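As a rough illustration of the residual defined above, the sketch below computes it by inverse filtering. It assumes the vocal tract is modeled by linear prediction with the convention s[n] ≈ Σ a[k]·s[n-k]; the function name `lpc_residual` and that sign convention are assumptions made for illustration and do not come from the patent text.

```python
from typing import List, Sequence


def lpc_residual(speech: Sequence[float], lpc: Sequence[float]) -> List[float]:
    """Return e[n] = s[n] - sum_k lpc[k-1] * s[n-k].

    Samples before n = 0 are treated as zero (an assumption of this sketch).
    """
    residual = []
    for n, sample in enumerate(speech):
        # Prediction of s[n] from the previous len(lpc) samples.
        predicted = sum(
            a * speech[n - k]
            for k, a in enumerate(lpc, start=1)
            if n - k >= 0
        )
        residual.append(sample - predicted)
    return residual
```

For a signal that the model predicts exactly, such as a decaying sequence with ratio 0.5 and a single coefficient 0.5, the residual is zero everywhere except at the first sample.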

In general, in a method that generates synthesized speech from speech uttered by a human in advance, voiced sections, which involve vocal cord vibration, must be synthesized with a pitch period different from that of the original speech. As the reference cited above also notes, in a speech synthesis method that uses the residual as the sound source, the residual runs short when synthesizing with a pitch period longer than the original. To make up the missing length, the pitch period is extended by appending a zero-valued signal to the end of the residual. When synthesizing with a pitch period shorter than the original, the pitch period is shortened by truncating the residual partway.

In FIG. 4, 401 shows an example of the original residual, 402 shows an example of 401 with its pitch period extended, and 403 shows an example of the synthesized speech obtained by feeding 402 into the speech synthesis filter. In the figure, the solid lines drawn perpendicular to the time axis mark the boundaries between pitch sections. The pitch period of the original residual is T1, as in 401 of FIG. 4. When the pitch period T2 of the speech to be synthesized is longer than T1, as in 402 of FIG. 4, a zero-valued signal (section B) is appended after the original residual (section A) to make up the missing length. The result is fed into the speech synthesis filter to obtain synthesized speech like that in 403 of FIG. 4. This is how speech with a pitch period T2 longer than the original pitch period T1 was synthesized conventionally.
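The conventional handling described here (truncate when the target period is shorter, append zeros when it is longer) can be sketched in a few lines. The function name `resize_pitch_segment_conventional` and the representation of one pitch section as a list of samples are assumptions for illustration.

```python
from typing import List, Sequence


def resize_pitch_segment_conventional(
    residual: Sequence[float], target_len: int
) -> List[float]:
    """Fit one pitch section of residual to target_len samples.

    Shorter target: cut the residual off partway.
    Longer target: keep section A (the residual) and append section B (zeros).
    """
    if target_len <= len(residual):
        return list(residual[:target_len])
    padding = [0.0] * (target_len - len(residual))
    return list(residual) + padding
```

With a target twice the original length, half of every pitch section fed to the synthesis filter is silence, which is the situation the patent identifies as degrading quality.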

(Problems to Be Solved by the Invention) As described above, speech synthesis methods that use the residual as the sound source have conventionally made up the missing length by appending zeros after the residual when synthesizing with a pitch period longer than that of the residual. However, when the synthesis pitch period is considerably longer than the original, for example twice as long or more, the quality of the synthesized speech degrades markedly.

In view of this, an object of the present invention is to provide a speech synthesis method that uses the residual as the sound source and can generate high-quality synthesized speech with little degradation even when synthesizing with a pitch period longer than that of the original residual.

(Means for Solving the Problems) The first invention is a speech synthesis method that separates speech into parameters representing vocal tract characteristics and a sound source signal, controls the pitch period of the sound source signal so that it becomes a desired pitch period, and synthesizes a speech signal with the desired pitch period from the vocal tract parameters and the sound source signal having the desired pitch period. The method is characterized by obtaining a sound source parameter representing the spectral characteristics of the sound source signal, generating a sound source signal with the desired pitch period from the sound source parameter and the sound source signal, and synthesizing a speech signal with the desired pitch period from the vocal tract parameters and that sound source signal.

The second invention is characterized in that the sound source parameters are obtained and stored in advance, and a sound source signal with the desired pitch period is generated from the stored sound source parameters and the sound source signal.

(Operation) In the present invention, in a speech synthesis method that uses the residual as the sound source, when the pitch period to be synthesized is longer than that of the original speech, linear prediction analysis is applied to the residual, which is the input to the speech synthesis filter, to obtain linear prediction coefficients representing the spectral envelope of the residual. A linear prediction filter is formed from these coefficients, and the signal for the missing section is predicted and synthesized from the original residual.

FIG. 3 illustrates the pitch period extension method in the speech synthesis method of the present invention. In FIG. 3, 301 shows an example of the original residual, 302 shows an example in which the pitch period of the residual in 301 has been extended with the linear prediction filter, and 303 shows an example of the synthesized speech obtained by feeding 302 into the speech synthesis filter. In the figure, the solid lines drawn perpendicular to the time axis mark the boundaries between pitch sections. The pitch period of the original residual is T1, as in 301 of FIG. 3. When the pitch period T2 of the speech to be synthesized is longer than T1, as in 302 of FIG. 3, the residual in 301 of FIG. 3 is subjected to linear prediction analysis pitch section by pitch section to obtain the linear prediction coefficients of the residual. A linear prediction filter is constructed from these coefficients, and the residual of section A is input to it to predict the signal of section B, which the original residual lacks. This yields a residual with pitch period T2, as in 302 of FIG. 3. It is fed into the speech synthesis filter to obtain synthesized speech like that in 303 of FIG. 3.
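One possible reading of this extension step is sketched below: estimate linear prediction coefficients from section A, then fill section B by running the predictor forward. The patent does not specify the analysis method, the prediction order, or how the predictor is driven; the autocorrelation method with the Levinson-Durbin recursion, the default order of 4, the recursive extrapolation that feeds predictions back in, and all function names are assumptions of this sketch.

```python
from typing import List, Sequence


def autocorrelation(x: Sequence[float], max_lag: int) -> List[float]:
    """r[k] = sum_n x[n] * x[n-k] for k = 0..max_lag."""
    return [
        sum(x[n] * x[n - k] for n in range(k, len(x)))
        for k in range(max_lag + 1)
    ]


def levinson_durbin(r: Sequence[float], order: int) -> List[float]:
    """Return predictor coefficients a[1..order] with x[n] ~ sum_k a[k] * x[n-k]."""
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate (e.g. all-zero) input: leave remaining terms at 0
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:]


def extend_residual(
    residual: Sequence[float], target_len: int, order: int = 4
) -> List[float]:
    """Lengthen one pitch section of residual to target_len by linear prediction."""
    x = [float(v) for v in residual]
    if target_len <= len(x):
        return x[:target_len]  # shortening is plain truncation
    if len(x) < 2:
        return x + [0.0] * (target_len - len(x))  # too short to analyze
    order = min(order, len(x) - 1)
    coeffs = levinson_durbin(autocorrelation(x, order), order)
    out = x[:]
    while len(out) < target_len:
        # Predict the next sample of section B from the most recent samples.
        out.append(sum(c * out[-k] for k, c in enumerate(coeffs, start=1)))
    return out
```

Section A is passed through unchanged, and section B continues the waveform with the spectral envelope estimated from section A instead of dropping to zero. The autocorrelation method was chosen here because it is known to yield a stable predictor, so the extrapolated samples do not grow without bound.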

Thus, unlike the conventional method, which appends zeros to the section where the residual runs short, the signal for the missing section is predicted from the original residual and used as the input to the speech synthesis filter, that is, as the sound source. This makes it possible to generate high-quality synthesized speech with little degradation even when synthesizing with a pitch period longer than that of the original residual.

(Embodiment) FIG. 1 is a block diagram showing an embodiment that implements the speech synthesis method according to the first invention.

A speech signal uttered by a human is analyzed in advance and separated into parameters representing vocal tract characteristics and a sound source signal. Linear prediction coefficients, line spectrum pairs, the cepstrum, the improved cepstrum, and the like can be used as the vocal tract parameters. These parameters are stored in the filter parameter storage section 104, and the sound source signal is stored in the sound source signal storage section 105.

To obtain synthesized speech, signals that select the vocal tract parameters and the sound source signal for the speech to be synthesized are first input from the filter parameter selection signal input terminal 101 and the sound source signal selection signal input terminal 102, respectively. The pitch period of the speech to be synthesized is input from the pitch period input terminal 103.

According to the filter parameter selection signal, the vocal tract parameters stored in the filter parameter storage section 104 are selected and sent to the speech synthesis filter section 109. The sound source signal storage section 105 selects a sound source signal according to the sound source signal selection signal input from the sound source signal selection signal input terminal 102 and sends it to the changeover switch section 106. The changeover switch section 106 compares the input pitch period with the pitch period of the selected sound source signal. If the pitch period of the sound source signal is equal to or longer than the input pitch period, the switch sends a portion of the sound source signal equal in length to the input pitch period to the speech synthesis filter section 109.

If, on the other hand, the pitch period of the sound source signal is shorter than the input pitch period, the sound source signal is sent to the linear prediction analysis section 107 and the linear prediction synthesis section 108. As described under (Operation), the linear prediction analysis section 107 performs linear prediction analysis on the sound source signal to obtain linear prediction coefficients and sends them to the linear prediction synthesis section 108. The linear prediction synthesis section 108 synthesizes, from the linear prediction coefficients and the sound source signal, a sound source signal equal in length to the input pitch period and sends it to the speech synthesis filter section 109.
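The routing performed by the changeover switch section 106 amounts to a length comparison. The sketch below shows that decision only; the extension step is passed in as a callable standing in for sections 107 and 108. The names `changeover_switch` and `Extender`, and the use of sample counts for pitch periods, are assumptions for illustration.

```python
from typing import Callable, List, Sequence

# Hypothetical stand-in for the analysis (107) + synthesis (108) path:
# takes the stored source signal and the requested length, returns the
# lengthened source signal.
Extender = Callable[[Sequence[float], int], List[float]]


def changeover_switch(
    source: Sequence[float], requested_period: int, extend: Extender
) -> List[float]:
    """Return a source signal of exactly requested_period samples."""
    if len(source) >= requested_period:
        # Stored period is equal or longer: pass the leading part straight on.
        return list(source[:requested_period])
    # Stored period is shorter: route through the linear prediction path.
    return extend(source, requested_period)
```

Keeping the extension behind a callable mirrors the block diagram, where the switch only decides the path and the two linear prediction sections do the work.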

The speech synthesis filter section 109 forms a speech synthesis filter from the vocal tract parameters sent from the filter parameter storage section 104. Taking as the filter input either the sound source signal sent from the sound source signal storage section 105 via the changeover switch section 106 or the sound source signal synthesized by the linear prediction synthesis section 108, it synthesizes a speech signal and outputs it to the synthesized speech output terminal 110.
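If the vocal tract parameters are taken to be linear prediction coefficients (one of the options the text lists), the speech synthesis filter can be sketched as an all-pole recursion driven by the sound source signal. The convention y[n] = e[n] + Σ a[k]·y[n-k], the function name `synthesis_filter`, and the explicit state argument for carrying filter memory across pitch sections are assumptions of this sketch.

```python
from typing import List, Optional, Sequence, Tuple


def synthesis_filter(
    excitation: Sequence[float],
    lpc: Sequence[float],
    state: Optional[Sequence[float]] = None,
) -> Tuple[List[float], List[float]]:
    """Run y[n] = e[n] + sum_k lpc[k-1] * y[n-k] over one block.

    state holds the previous outputs, most recent first; it is returned
    updated so consecutive pitch sections can be filtered without a gap.
    """
    order = len(lpc)
    history = list(state) if state is not None else [0.0] * order
    output = []
    for e in excitation:
        y = e + sum(a * h for a, h in zip(lpc, history))
        output.append(y)
        if order:
            history = [y] + history[:-1]
    return output, history
```

Feeding a single unit pulse through a one-coefficient filter with a[1] = 0.5 gives the decaying impulse response 1, 0.5, 0.25, and so on, and passing the returned state into the next call continues that decay.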

In the embodiment above, the linear prediction analysis section 107 obtains the linear prediction coefficients representing the spectral characteristics of the sound source signal at synthesis time, and the linear prediction synthesis section 108 uses them to predict the residual signal and lengthen the pitch period. Alternatively, the linear prediction coefficients can be analyzed for each pitch period and stored in advance, then read out and used at synthesis time. This is the principle of the second invention. In one example of such a configuration, a linear prediction coefficient storage section takes the place of the linear prediction analysis section 107 of FIG. 1. FIG. 2 is a block diagram showing an embodiment that implements the speech synthesis method according to the second invention. FIG. 2 differs from FIG. 1 in that the signal for selecting a sound source signal, input from the sound source signal selection signal input terminal 202, is sent to both the sound source signal storage section 205 and the linear prediction coefficient storage section 207. The linear prediction coefficient storage section 207 holds linear prediction coefficients, analyzed in advance, that represent the spectral characteristics of the corresponding sound source signals stored in the sound source signal storage section 205. The coefficients are read out according to the input sound source selection signal and sent to the linear prediction synthesis section 208. The remaining parts are the same as in FIG. 1, and their operation is clear from the description above, so further explanation is omitted.
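The difference between the two embodiments is when the analysis runs. A minimal sketch of the second arrangement follows: coefficients are computed once when a sound source signal is registered, and synthesis only performs a lookup with the same selection key. The class name `SourceStore`, the string keys, and the injected `analyze` callable are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical analysis routine: sound source signal in, LPC coefficients out.
Analyzer = Callable[[Sequence[float]], List[float]]


class SourceStore:
    """Pairs the source signal storage (205) with the coefficient storage (207)."""

    def __init__(self, analyze: Analyzer) -> None:
        self._analyze = analyze
        self._signals: Dict[str, List[float]] = {}
        self._coeffs: Dict[str, List[float]] = {}

    def register(self, key: str, residual: Sequence[float]) -> None:
        """Store the signal and analyze it once, ahead of synthesis."""
        self._signals[key] = list(residual)
        self._coeffs[key] = self._analyze(residual)

    def select(self, key: str) -> Tuple[List[float], List[float]]:
        """One selection signal reads out both the signal and its coefficients."""
        return self._signals[key], self._coeffs[key]
```

The trade-off is the usual one between memory and computation: the coefficients take extra storage, but no linear prediction analysis is needed while synthesizing.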

(Effects of the Invention) As described above, the present invention makes it possible to generate high-quality synthesized speech with little degradation not only when synthesizing with a pitch period shorter than that of the original residual but also when synthesizing with a very long pitch period. The invention is therefore effective as a speech synthesis method that can generate high-quality synthesized speech with little degradation even when the pitch period must be varied arbitrarily.

[Brief Description of the Drawings]

FIG. 1 is a block diagram showing an embodiment that implements the speech synthesis method according to the first invention. FIG. 2 is a block diagram showing an embodiment that implements the speech synthesis method according to the second invention. FIG. 3 illustrates the pitch period extension method in the speech synthesis method of the present invention. FIG. 4 illustrates the pitch period extension method in the conventional method.

In FIG. 1: 101, filter parameter selection signal input terminal; 102, sound source signal selection signal input terminal; 103, pitch period input terminal; 104, filter parameter storage section; 105, sound source signal storage section; 106, changeover switch section; 107, linear prediction analysis section; 108, linear prediction synthesis section; 109, speech synthesis filter section; 110, synthesized speech output terminal.

In FIG. 2: 201, filter parameter selection signal input terminal; 202, sound source signal selection signal input terminal; 203, pitch period input terminal; 204, filter parameter storage section; 205, sound source signal storage section; 206, changeover switch section; 207, linear prediction coefficient storage section; 208, linear prediction synthesis section; 209, speech synthesis filter section; 210, synthesized speech output terminal.

Claims

[Claims]

(1) A speech synthesis method that separates speech into parameters representing vocal tract characteristics and a sound source signal, controls the pitch period of the sound source signal so that it becomes a desired pitch period, and synthesizes a speech signal with the desired pitch period from the parameters representing the vocal tract characteristics and the sound source signal having the desired pitch period, characterized by: obtaining a sound source parameter representing the spectral characteristics of the sound source signal; generating a sound source signal with the desired pitch period from the sound source parameter and the sound source signal; and synthesizing a speech signal with the desired pitch period from the parameters representing the vocal tract characteristics and the sound source signal having the desired pitch period.

(2) The speech synthesis method according to claim 1, characterized in that the sound source parameters are obtained and stored in advance, and a sound source signal with the desired pitch period is generated from the stored sound source parameters and the sound source signal.