JP2775533B2

JP2775533B2 - Long-term speech prediction device

Info

Publication number: JP2775533B2
Application number: JP3212288A
Authority: JP
Inventors: 田幸司吉
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1991-08-23
Filing date: 1991-08-23
Publication date: 1998-07-16
Anticipated expiration: 2013-07-16
Also published as: JPH0553600A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of Industrial Application] The present invention relates to a long-term prediction device that is incorporated in a speech coding device used for digital mobile communications, voice mail, and the like, and that performs long-term prediction of speech.

[0002]

[Prior Art] FIG. 4 shows the configuration of a conventional long-term prediction device of this kind. In FIG. 4, 41 denotes a long-term predictor that performs long-term prediction at a given long-term prediction delay for each section of the input speech residual signal; 42 denotes a prediction error minimizer that determines the long-term prediction delay for that section; and 43 denotes the long-term prediction device comprising these components.

[0003] Next, the operation of the conventional example is described. First, the input speech residual signal is divided into fixed-length sections, and for each section the long-term predictor 41 performs long-term prediction for a given long-term prediction delay according to the following equation:

$$y[n] = b \cdot x[n - L] \quad (n = 0, \ldots, N - 1)$$

where
y[n]: long-term prediction signal
x[n]: input residual signal
L: long-term prediction delay
b: long-term prediction coefficient
N: fixed section length
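
As a rough illustration of the prediction formula in paragraph [0003], the following Python sketch computes y[n] for one section. The function name, the argument names, and the buffer layout (past residual samples followed by the current section in one list) are assumptions made for this example and do not come from the patent.

```python
def long_term_predict(x, start, section_len, delay, gain):
    """Conventional long-term prediction: y[n] = gain * x[n - delay].

    x:           residual samples, past samples first, then the current section
    start:       index in x where the current section begins (n = 0)
    section_len: N, the number of samples in the section
    delay:       L, the long-term prediction delay in samples
    gain:        b, the long-term prediction coefficient
    """
    if delay < 1:
        raise ValueError("delay must be at least 1 sample")
    if start - delay < 0:
        raise ValueError("not enough past samples for this delay")
    if start + section_len > len(x):
        raise ValueError("section extends past the end of the buffer")
    # Each predicted sample is a scaled copy of the sample `delay` positions earlier.
    return [gain * x[start + n - delay] for n in range(section_len)]
```

With `x = [1, 2, 3, 4, 5, 6, 7, 8]`, a section starting at index 4, a delay of 3, and a gain of 0.5, the prediction is a scaled copy of the samples `x[1]` through `x[4]`.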

[0004] Next, the prediction error minimizer 42 determines the long-term prediction delay L that minimizes the sum of squares E(L) of the prediction error between the long-term prediction signal y[n] obtained by the long-term predictor 41 and the input residual signal x[n],

(Equation 1)

$$E(L) = \sum_{n=0}^{N-1} \left( x[n] - y[n] \right)^2$$

and outputs the long-term prediction signal y[n] and the long-term prediction delay L at that time.
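
The delay search of paragraph [0004] can be sketched as an exhaustive loop over candidate delays. This is a minimal sketch under stated assumptions: the patent does not say how the coefficient b is chosen, so the example picks the least-squares gain for each candidate delay; the function names, the candidate range `min_delay..max_delay`, and the buffer layout are likewise hypothetical.

```python
def optimal_gain(x, start, section_len, delay):
    """Least-squares gain for one delay (an assumption; the patent leaves b unspecified)."""
    num = sum(x[start + n] * x[start + n - delay] for n in range(section_len))
    den = sum(x[start + n - delay] ** 2 for n in range(section_len))
    return num / den if den > 0 else 0.0


def search_delay(x, start, section_len, min_delay, max_delay):
    """Return (delay, gain, y, error) minimizing the squared prediction error E(L)."""
    best = None
    for delay in range(max(1, min_delay), max_delay + 1):
        if start - delay < 0:
            break  # larger delays would reach before the start of the buffer
        gain = optimal_gain(x, start, section_len, delay)
        y = [gain * x[start + n - delay] for n in range(section_len)]
        # E(L): sum of squared differences between the residual and its prediction.
        error = sum((x[start + n] - y[n]) ** 2 for n in range(section_len))
        if best is None or error < best[3]:
            best = (delay, gain, y, error)
    if best is None:
        raise ValueError("no valid candidate delay")
    return best
```

For a residual that repeats exactly every 5 samples, the search over delays 2 to 8 selects a delay of 5 with zero error.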

[0005] Thus, the conventional long-term prediction device described above can also perform long-term prediction of speech by combining a long-term predictor, which performs long-term prediction at a given long-term prediction delay for each section of the input speech residual signal, with a prediction error minimizer, which determines the long-term prediction delay for that section.

[0006]

[Problem to Be Solved by the Invention] However, when the conventional long-term prediction device described above is incorporated into an analysis-by-synthesis speech coding device such as CELP (Code Excited Linear Prediction) and speech is transmitted at a low bit rate, the section over which long-term prediction is performed becomes long; that is, the update period of the long-term prediction delay becomes long. As a result, the device tracks changes in the long-term prediction delay poorly, and long-term prediction performance degrades.

[0007] The present invention solves this conventional problem. Its object is to provide an excellent long-term speech prediction device that improves long-term prediction performance when the section is long, by performing long-term prediction in consideration of the change in long-term prediction delay within the long-term prediction section.

[0008]

[Means for Solving the Problem] To achieve the above object, the present invention comprises a long-term predictor that, for each section of the input speech residual signal, performs long-term prediction at a given long-term prediction delay in consideration of the change in long-term prediction delay within the section, and a prediction error minimizer that determines the long-term prediction delay for that section.

[0009] The present invention also incorporates the long-term prediction device configured as above into a CELP speech coding device.

[0010]

[Operation] Therefore, according to the present invention, performing long-term prediction in consideration of the change in long-term prediction delay within the long-term prediction section improves long-term prediction performance when the section is long.

[0011] Furthermore, incorporating the long-term prediction device configured as above into a CELP speech coding device improves long-term prediction performance and thereby improves speech quality.

[0012]

[Embodiments] FIG. 1 shows the configuration of a first embodiment of the present invention. In FIG. 1, 1 denotes a long-term predictor that, for each section of the input speech residual signal, performs long-term prediction at a given long-term prediction delay in consideration of the change in long-term prediction delay within the long-term prediction section; 2 denotes a prediction error minimizer that determines the long-term prediction delay and the long-term prediction delay difference for that section; and 3 denotes the long-term prediction device comprising these components.

[0013] Next, the operation of the first embodiment is described. First, the input speech residual signal is divided into fixed-length sections, and for each section the long-term predictor 1 performs long-term prediction for a given long-term prediction delay. To follow the change in long-term prediction delay within the section, the section is split into a front half and a rear half, a difference in long-term prediction delay is introduced between the two sub-sections, and long-term prediction is performed according to the following equation. FIG. 2 illustrates this long-term prediction.

$$y[n] = \begin{cases} b \cdot x[n - L] & (n = 0, \ldots, N/2 - 1) \\ b \cdot x[n - (L + k)] & (n = N/2, \ldots, N - 1) \end{cases}$$

where
y[n]: long-term prediction signal
x[n]: input residual signal
L: long-term prediction delay
k: difference in long-term prediction delay between the two sub-sections (k = 0, ±1, ...)
b: long-term prediction coefficient
N: fixed section length
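
The two-part prediction of paragraph [0013] might look as follows in Python. This is only a sketch: the function name, the argument names, and the single buffer holding past samples followed by the current section are assumptions for illustration, and an even section length N is assumed so that N/2 is an integer.

```python
def split_delay_predict(x, start, section_len, delay, delta, gain):
    """Long-term prediction with a delay difference between the two half-sections.

    Front half (n < N/2):  y[n] = gain * x[n - delay]
    Rear half  (n >= N/2): y[n] = gain * x[n - (delay + delta)]

    x:     residual samples, past samples first, then the current section
    start: index in x where the current section begins (n = 0)
    delta: k, the delay difference between the two halves (0, +1, -1, ...)
    """
    half = section_len // 2
    y = []
    for n in range(section_len):
        d = delay if n < half else delay + delta
        index = start + n - d
        if d < 1 or index < 0:
            raise ValueError("delay reaches outside the available past samples")
        y.append(gain * x[index])
    return y
```

Setting `delta` to 0 reduces this to the conventional single-delay prediction, which is why the conventional device can be seen as a special case of this scheme.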

[0014] Next, the prediction error minimizer 2 determines the long-term prediction delay L and the long-term prediction delay difference k that minimize the sum of squares E(L) of the prediction error between the long-term prediction signal y[n] obtained by the long-term predictor 1 and the input residual signal x[n],

(Equation 2)

$$E(L) = \sum_{n=0}^{N-1} \left( x[n] - y[n] \right)^2$$

and outputs the long-term prediction signal y[n] at that time.
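
One way to picture the joint determination of L and k in paragraph [0014] is an exhaustive search over both parameters. The sketch below makes several assumptions that are not in the patent: the search is a plain double loop, a single least-squares gain b is computed per candidate pair, the candidate ranges (`min_delay`, `max_delay`, `max_delta`) are hypothetical parameters, and the section length is even.

```python
def search_delay_and_delta(x, start, section_len, min_delay, max_delay, max_delta):
    """Return (delay, delta, gain, y, error) minimizing the squared prediction error."""
    half = section_len // 2
    target = x[start:start + section_len]
    best = None
    for delay in range(max(1, min_delay), max_delay + 1):
        for delta in range(-max_delta, max_delta + 1):
            rear = delay + delta
            # Skip candidates that would read before the start of the buffer.
            if rear < 1 or start - delay < 0 or start + half - rear < 0:
                continue
            # Delayed reference samples: delay L in the front half, L + k in the rear half.
            ref = [x[start + n - (delay if n < half else rear)]
                   for n in range(section_len)]
            # Least-squares gain shared by both halves (assumption).
            den = sum(r * r for r in ref)
            gain = sum(t * r for t, r in zip(target, ref)) / den if den > 0 else 0.0
            y = [gain * r for r in ref]
            error = sum((t - p) ** 2 for t, p in zip(target, y))
            if best is None or error < best[4]:
                best = (delay, delta, gain, y, error)
    if best is None:
        raise ValueError("no valid candidate (delay, delta) pair")
    return best
```

The search cost grows with the number of k candidates, so a practical coder would presumably keep the range of k small; the patent itself only lists k = 0, ±1, and so on.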

[0015] Thus, according to the first embodiment, long-term prediction is performed in consideration of the change in long-term prediction delay within the section, so long-term prediction performance can be improved.

[0016] FIG. 3 shows the configuration of a second embodiment of the present invention. In this second embodiment, the long-term prediction device of the first embodiment is incorporated into an analysis-by-synthesis CELP speech coding device; FIG. 3 shows the encoder side. In FIG. 3, 31 denotes an LPC analyzer that performs LPC (Linear Predictive Coding) analysis of the input speech; 32, a weighting filter that applies perceptual weighting to the input speech; 33, a zero-input response subtractor that removes the influence of the previous frame; 34, an adaptive codebook store that holds the past state of long-term prediction; 35, the long-term prediction device of the first embodiment; 36, a stochastic codebook store that holds Gaussian excitation sources; 37, an excitation generator that generates the excitation signal; 38, a weighted synthesis filter that generates synthesized speech from the excitation signal; 39, an excitation encoder that encodes the excitation signal; and 40, a multiplexer that multiplexes the data to be transmitted to the decoder side.

[0017] Next, the operation of the second embodiment is described. Using the linear prediction coefficients obtained for the input speech by the LPC analyzer 31, the weighting filter 32 generates perceptually weighted input speech, and the zero-input response subtractor 33 removes the influence of the previous frame. Meanwhile, the long-term prediction device 35 performs long-term prediction that takes into account the change in long-term prediction delay within the section, as described above. From the resulting long-term prediction signal and the Gaussian excitation in the stochastic codebook store 36, synthesized speech is generated through the excitation generator 37 and the weighted synthesis filter 38. The excitation encoder 39 selects, from the adaptive codebook store 34 or the stochastic codebook store 36, the excitation that minimizes the squared error between the synthesized speech output from the weighted synthesis filter 38 and the weighted input speech after zero-input response subtraction obtained by the zero-input response subtractor 33, and encodes it. The multiplexer 40 then multiplexes the result and outputs the coded speech data.

[0018] Thus, according to the second embodiment, as in the first embodiment, long-term prediction is performed in consideration of the change in long-term prediction delay within the long-term prediction section, so long-term prediction performance can be improved. This improves speech quality when the speech transmission rate is low, that is, when the long-term prediction section is long.

[0019] As described above, the long-term speech prediction device of the present invention follows the change in long-term prediction delay within the long-term prediction section by splitting the section into a front half and a rear half and introducing a difference in long-term prediction delay between the two sub-sections, and thereby improves long-term prediction performance.

[0020] Furthermore, incorporating the long-term prediction device into a CELP speech coding device improves long-term prediction performance and thereby improves speech quality.

[Brief Description of the Drawings]

FIG. 1 is a schematic block diagram of a long-term prediction device according to a first embodiment of the present invention.

FIG. 2 is a signal waveform diagram for explaining the long-term prediction operation in the first embodiment of the present invention.

FIG. 3 is a schematic block diagram of a speech coding device incorporating the long-term prediction device, according to a second embodiment of the present invention.

FIG. 4 is a schematic block diagram showing an example of a conventional long-term prediction device.

[Explanation of Symbols]

1 Long-term predictor
2 Prediction error minimizer
3 Long-term prediction device
31 LPC analyzer
32 Weighting filter
33 Zero-input response subtractor
34 Adaptive codebook store
35 Long-term prediction device
36 Stochastic codebook store
37 Excitation generator
38 Weighted synthesis filter
39 Excitation encoder
40 Multiplexer

Claims

(57) [Claims]

1. A long-term speech prediction device comprising: a long-term predictor that divides an input speech residual signal into fixed-length sections, splits each section into two sub-sections, a preceding one and a following one, provides a difference in long-term prediction delay between the two sub-sections, and performs long-term prediction for a given long-term prediction delay; and a prediction error minimizer that determines the long-term prediction delay that minimizes the error between the long-term prediction signal obtained by the long-term predictor and the input residual signal, and outputs the long-term prediction delay, the long-term prediction delay difference, and the long-term prediction signal at that time.

2. The long-term speech prediction device according to claim 1, incorporated in a CELP speech coding device that performs an adaptive codebook search.