JPS5924439B2

JPS5924439B2 - Control method for speech analysis and synthesis equipment

Info

Publication number: JPS5924439B2
Application number: JP53001282A
Authority: JP
Inventors: 哲田口
Original assignee: Nippon Electric Co Ltd
Current assignee: NEC Corp
Priority date: 1978-01-09
Filing date: 1978-01-09
Publication date: 1984-06-09
Also published as: JPS5494209A

Abstract

PURPOSE: To stably produce synthesized speech that closely approximates the original under finite-precision arithmetic, by computing a plurality of linear prediction coefficients recursively while monitoring the prediction residual power, and by adjusting the number of stages of the synthesis filter on the synthesis side accordingly. CONSTITUTION: The waveform reproducibility calculator 102 calculates coefficients expressing the degree of waveform reproduction, and the linear prediction coefficient calculator 103 calculates the first-order linear prediction coefficient and the prediction residual power. The controller 105 judges whether the prediction residual power is at or above a given value, and if it is no greater than that value, a calculation stop signal is given to the calculator 103. If no stop signal is given, the second-order linear prediction coefficient and the prediction residual power are calculated. The calculator 103 then repeats the calculation recursively until the controller 105 issues the stop signal. The variable-stage synthesis filter 108 has its coefficients and its number of stages controlled accordingly, is driven by the excitation signal from terminal 110, and outputs the synthesized speech waveform to terminal 111.

Description

DETAILED DESCRIPTION OF THE INVENTION The present invention relates to a speech analysis and synthesis apparatus using linear prediction coefficients, and in particular to a method of controlling such an apparatus so that suitable synthesized speech is obtained when nearly periodic speech and the like is analyzed and synthesized under finite-precision arithmetic.

In general, a speech analysis and synthesis apparatus may use prediction coefficients obtained as the direct solution of a linear equation, partial autocorrelation coefficients that are a variant of those prediction coefficients, or linear prediction coefficients obtained by transforming them. Such an apparatus usually computes multiple linear prediction coefficients in order to improve the approximation of the spectral envelope of the speech being analyzed.

Various methods are known for obtaining these linear prediction coefficients (for example, "Speech analysis and synthesis system based on partial autocorrelation coefficients", Proceedings of the Acoustical Society of Japan Research Meeting, October 1969; "Feature extraction of speech by statistical methods", Proceedings of the 8th Symposium sponsored by the Research Institute of Electrical Communication, Tohoku University, February 1971). In particular, it is known that an algebraic method that recursively derives, from autocorrelation or covariance coefficients, multiple linear prediction coefficients together with the prediction residual power, which expresses how closely the spectral parameters approximate the speech, can be implemented on a digital computing system. In this recursive method, for example, the linear prediction coefficient of order N+1 is obtained from the linear prediction coefficients of orders 1 through N and the prediction residual power. It is generally known that the prediction residual power is relatively small for highly periodic sounds such as voiced speech and relatively large for weakly periodic sounds such as unvoiced speech.

Conventional speech analysis and synthesis apparatuses of this kind are limited in computation speed and hardware scale, so they use finite-precision arithmetic such as fixed-point arithmetic with a finite word length. Because the analysis order of the linear prediction coefficients and the corresponding number of stages of the synthesis filter on the synthesis side are fixed, the higher-order normalized prediction residual power becomes still smaller in passages such as the stationary part of a voiced sound, where the waveform is extremely periodic (the spectral structure is well defined), and may fall outside the range that the finite-precision arithmetic can represent. Since the partial autocorrelation coefficients (K parameters) corresponding to the linear prediction coefficients are a function of the prediction residual power, |K| > 1 may then occur, and the stability of the synthesis filter is lost. If the analysis order and the number of filter stages are reduced to remove this drawback, the approximation of the spectral envelope deteriorates markedly for speech of relatively low stationarity, such as unvoiced sounds, and for voiced sounds whose prediction residual is large relative to the arithmetic precision, and the quality of the synthesized speech is degraded.

An object of the present invention is to provide a control method for a speech analysis and synthesis apparatus that enables highly stable speech analysis and synthesis without degrading the quality of the synthesized speech.
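The recursive computation described above can be written out as follows. The source does not name the algorithm; the Levinson-Durbin form below is an assumption that is consistent with the description (the order m+1 coefficients are obtained from the order 1 to m coefficients, the residual power, and the autocorrelation coefficients). The symbols r(i) for autocorrelation, a_i^{(m)} for the order-m prediction coefficients, k_m for the K parameter, and E_m for the prediction residual power are notation introduced here, and sign conventions for k_m vary between texts.

```latex
\begin{aligned}
E_0 &= r(0), \\
k_{m+1} &= \frac{r(m+1) - \sum_{i=1}^{m} a_i^{(m)}\, r(m+1-i)}{E_m}, \\
a_{m+1}^{(m+1)} &= k_{m+1}, \qquad
a_i^{(m+1)} = a_i^{(m)} - k_{m+1}\, a_{m+1-i}^{(m)} \quad (1 \le i \le m), \\
E_{m+1} &= \left(1 - k_{m+1}^{2}\right) E_m .
\end{aligned}
```

Under this reading, k_{m+1} is a quotient with E_m in the denominator. When E_m shrinks toward the smallest value the fixed-point format can represent, rounding error in E_m can push the computed |k_{m+1}| above 1, which is the instability the text describes.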

The present invention relates to a linear-prediction speech analysis and synthesis apparatus. It comprises means for recursively obtaining a plurality of linear prediction coefficients while monitoring the prediction residual power, and means for synthesizing speech with a synthesis filter whose number of stages corresponds to the prediction residual power and whose coefficients are determined by those linear prediction coefficients. The invention is characterized in that a plurality of linear prediction coefficients are obtained recursively while the prediction residual power is monitored, the order of the linear prediction coefficients is determined according to the prediction residual power, and on the synthesis side the number of stages of the synthesis filter is increased or decreased to match that order.

As a result, linear prediction coefficients can be obtained stably under finite-precision arithmetic, and synthesized speech that approximates the original well can be generated stably. The present invention is now described in detail with reference to the drawing.

The figure is a block diagram showing one embodiment of the present invention. Speech waveform data is input to a waveform reproducibility calculator 102 via a waveform input terminal 101. The waveform reproducibility calculator 102 calculates coefficients that express waveform reproducibility, such as autocorrelation coefficients or covariance coefficients, and outputs them to a linear prediction coefficient calculator 103. The linear prediction coefficient calculator 103 calculates the first-order linear prediction coefficient and the prediction residual power from these coefficients. The prediction residual power is supplied to a controller 105 via a prediction residual power transmission line 104. The controller 105 determines whether the prediction residual power is at or above a predetermined value. If the prediction residual power is no greater than the predetermined value, the controller judges that the system is very likely to become unstable, and a calculation stop signal is supplied to the linear prediction coefficient calculator 103 via a calculation stop signal transmission line 106. The linear prediction coefficient calculator 103 stops calculating when it receives the calculation stop signal. When it does not, it calculates the second-order linear prediction coefficients and prediction residual power from the coefficients expressing waveform reproducibility, the first-order linear prediction coefficient, and the prediction residual power. The linear prediction coefficient calculator 103 continues in this recursive manner to calculate linear prediction coefficients until the controller 105 generates the calculation stop signal.

Clearly, a maximum prediction order N1 can be set in advance so that the linear prediction coefficient calculator stops automatically after calculating the linear prediction coefficients of order N1, whether or not a calculation stop signal has been issued, which prevents the order of the linear prediction coefficients from becoming unnecessarily high. Suppose that the calculation is stopped by the calculation stop signal after the linear prediction coefficients of order N2 have been calculated. The linear prediction coefficients of order N2 are then supplied to a variable-stage synthesis filter 108 via a linear prediction coefficient transmission line 107.
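The analysis loop described above can be sketched as follows. This is a minimal illustration, not the patented circuit: it assumes autocorrelation input and a Levinson-Durbin style recursion (the source names neither), uses floating point rather than fixed point, and the names `autocorrelation`, `analyze`, `max_order` (standing for N1) and `min_residual` (the predetermined value checked by controller 105) are hypothetical.

```python
def autocorrelation(x, max_lag):
    """Return r[0..max_lag] for the sample sequence x (role of calculator 102)."""
    n = len(x)
    return [sum(x[t] * x[t - lag] for t in range(lag, n))
            for lag in range(max_lag + 1)]


def analyze(r, max_order, min_residual):
    """Recursive linear prediction with residual-power monitoring.

    r            : autocorrelation values, at least r[0..max_order]
    max_order    : upper bound on the prediction order (N1 in the text)
    min_residual : assumed threshold on the normalized residual power
    Returns (a, order, e): a[i - 1] is the coefficient for lag i, order is
    the order actually reached (N2), e is the normalized residual power.
    """
    if r[0] <= 0.0:
        raise ValueError("r[0] must be positive")
    a = []      # prediction coefficients of the current order
    e = 1.0     # normalized residual power E_m / r[0]
    for m in range(1, max_order + 1):
        # Order-m reflection coefficient from the order-(m-1) solution.
        acc = r[m] - sum(a[i] * r[m - 1 - i] for i in range(m - 1))
        k = acc / (e * r[0])
        a = [a[i] - k * a[m - 2 - i] for i in range(m - 1)] + [k]
        e *= 1.0 - k * k
        # Role of controller 105: stop once the residual power is too small.
        if e <= min_residual:
            break
    return a, len(a), e
```

For example, `analyze([1.0, 0.9, 0.81, 0.729], 3, 0.2)` stops after the first order because the normalized residual power has already fallen to about 0.19. With a threshold of 0.0 the same input runs to the maximum order 3.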

The controller 105 also supplies a variable-stage synthesis filter control signal to the variable-stage synthesis filter 108 via a variable-stage synthesis filter control signal transmission line 109. In the variable-stage synthesis filter 108, the filter coefficients are set by the linear prediction coefficients of order N2 and the number of filter stages is set by the variable-stage synthesis filter control signal. The filter is driven by the excitation signal input from an excitation signal input terminal 110 and outputs the synthesized speech waveform to a waveform output terminal 111. Clearly, such a variable-stage synthesis filter can be realized easily.

It is also clear that, instead of supplying the linear prediction coefficients of order N2 from the linear prediction coefficient calculator 103 and the variable-stage synthesis filter control signal from the controller 105 to the synthesis filter, linear prediction coefficients of order N3 can always be transmitted, with the coefficients of orders N2+1 through N3 set to zero. A fixed synthesis filter with N3 stages can then be expected to give an effect equivalent to that of the variable-stage synthesis filter.
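The equivalence between the variable-stage filter and a fixed N3-stage filter with zeroed higher-order coefficients can be checked with a small sketch. It assumes a direct-form all-pole filter, y[n] = u[n] + Σ a_i y[n-i]. The source does not specify the filter structure, and a lattice filter driven by K parameters would be an equally plausible reading. The function names `synthesize` and `pad_to_fixed_order` are hypothetical.

```python
def synthesize(a, excitation):
    """All-pole synthesis: y[n] = u[n] + sum over i of a[i - 1] * y[n - i].

    The number of stages is simply len(a), so passing a shorter or longer
    coefficient list plays the role of the stage-count control signal.
    """
    y = []
    for n, u in enumerate(excitation):
        acc = u
        for i, coeff in enumerate(a, start=1):
            if n - i >= 0:
                acc += coeff * y[n - i]
        y.append(acc)
    return y


def pad_to_fixed_order(a, fixed_order):
    """Zero-fill orders len(a)+1 .. fixed_order (the fixed N3-stage variant)."""
    if fixed_order < len(a):
        raise ValueError("fixed_order must be at least len(a)")
    return list(a) + [0.0] * (fixed_order - len(a))


# One-stage filter and the same coefficients padded to three stages.
impulse = [1.0, 0.0, 0.0, 0.0]
variable = synthesize([0.9], impulse)
fixed = synthesize(pad_to_fixed_order([0.9], 3), impulse)
```

Here `variable` and `fixed` hold the same samples, because the zeroed stages contribute nothing to the output. The fixed-stage form trades a little wasted arithmetic and transmission capacity for not having to send a separate stage-count signal.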

BRIEF DESCRIPTION OF THE DRAWINGS The figure is a block diagram for explaining an embodiment of the present invention.

101: waveform input terminal; 102: waveform reproducibility calculator; 103: linear prediction coefficient calculator; 104: prediction residual power transmission line; 105: controller; 106: calculation stop signal transmission line; 107: linear prediction coefficient transmission line; 108: variable-stage synthesis filter; 109: variable-stage synthesis filter control signal transmission line; 110: excitation signal input terminal; 111: waveform output terminal.

Claims

1. A method for controlling a speech analysis and synthesis apparatus in which, on the analysis side, linear prediction coefficients representing spectral information and a normalized prediction residual power representing sound source information are obtained from an input speech signal at predetermined time intervals, and, on the synthesis side, the coefficients of a synthesis filter, the excitation source, and the like are determined from parameters such as the linear prediction coefficients and the normalized prediction residual power, the method being characterized in that, when the normalized prediction residual power falls to or below a predetermined value on the analysis side, calculation of the linear prediction coefficients of higher order is stopped and a control signal indicating the order obtained before the stop is sent out, and, on the synthesis side, the number of stages of the synthesis filter is made variable and is determined by the control signal.

2. The method for controlling a speech analysis and synthesis apparatus according to claim 1, characterized in that the order of the linear prediction coefficients to be analyzed and the number of stages of the synthesis filter are determined in advance.

3. A method for controlling a speech analysis and synthesis apparatus in which, on the analysis side, linear prediction coefficients representing spectral information and a normalized prediction residual power representing sound source information are obtained from an input speech signal at predetermined time intervals, and, on the synthesis side, the coefficients of a synthesis filter, the excitation source, and the like are determined from parameters such as the linear prediction coefficients and the normalized prediction residual power, the method being characterized in that, when the normalized prediction residual power falls to or below a predetermined value on the analysis side, the values of the linear prediction coefficients of higher order are output as zero.

4. The method for controlling a speech analysis and synthesis apparatus according to claim 3, characterized in that the order of the linear prediction coefficients to be analyzed and the number of stages of the synthesis filter are determined in advance.