JPH10161696A

JPH10161696A - Voice encoding device and voice decoding device

Info

Publication number: JPH10161696A
Application number: JP8317517A
Authority: JP
Inventors: Kazuhisa Murakami; 和久村上; Keiichi Kayahara; 桂一茅原
Original assignee: OKI SYST KAIHATSU TOKAI KK; Oki Electric Industry Co Ltd
Current assignee: OKI SYST KAIHATSU TOKAI KK; Oki Electric Industry Co Ltd
Priority date: 1996-11-28
Filing date: 1996-11-28
Publication date: 1998-06-19

Abstract

PROBLEM TO BE SOLVED: To provide a speech encoding device and a speech decoding device that smooth the change in speech between frames with low delay. SOLUTION: The speech encoding device sends to a transmission path the linear prediction parameter αq[k] of the k-th frame, calculated from the input speech (q denotes the quantized state), together with an excitation (sound source) parameter calculated for every subframe on the basis of the interpolated linear prediction parameter αq'[k, j] of the (j+1)-th subframe (j = 0, 1, 2, 3) of the k-th frame. The speech decoding device reproduces speech by driving a synthesis filter, whose filter coefficients are α'[k, j], with the received and restored excitation parameter. The subframe interpolation circuit provided in each device forms a weighted sum of α(q)[k] and α(q)[k-1] of the preceding frame to generate α(q)'[k, 0] and α(q)'[k, 1], and uses α(q)[k] as it is for α(q)'[k, 2] and α(q)'[k, 3], thereby generating α(q)'[k, j].

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a speech encoding device and a speech decoding device that are used in speech coding and transmission systems such as digital telephony and voice mail, and that are based on a linear predictive coding method such as CELP (Code Excited Linear Prediction) coding.

[0002]

2. Description of the Related Art

In recent years, coding methods based on CELP coding have been widely applied to speech encoding devices in speech coding and transmission systems with low transmission bit rates of about 4.8 to 8.0 kb/s. In CELP coding, linear prediction parameters and excitation parameters are calculated as the transmission parameters. Details of CELP coding are given in, for example, M. R. Schroeder and Atal, "Code-Excited Linear Prediction (CELP): high quality speech at very low bit rates", in Proc. ICASSP '85, pp. 937-939, 1985.

[0003] A speech encoding device using CELP coding has a linear prediction parameter calculation circuit, a linear prediction parameter quantization circuit, a subframe interpolation circuit, an excitation parameter calculation circuit, an excitation parameter quantization circuit, and a multiplexer. The linear prediction parameter calculation circuit and the linear prediction parameter quantization circuit multiply the input speech data by a center-weighted analysis window to cut out one frame length (typically about 20 to 40 ms), perform linear prediction analysis on it to calculate a linear prediction parameter for each frame, and quantize this to generate a quantized linear prediction parameter. The subframe interpolation circuit interpolates the frame-by-frame quantized linear prediction parameters to generate an interpolated quantized linear prediction parameter for each subframe (a subframe being a division of a frame). The excitation parameter calculation circuit and the excitation parameter quantization circuit calculate an excitation parameter for each subframe from the interpolated quantized linear prediction parameter and the input speech of that subframe, and quantize it to generate a quantized excitation parameter. The multiplexer multiplexes the frame-by-frame quantized linear prediction parameters with the quantized excitation parameters and sends them to the transmission path as a bitstream (multiplexed parameter signal).

[0004] The speech decoding device in the above speech coding and transmission system has a demultiplexer, a linear prediction parameter restoration circuit, a subframe interpolation circuit, an excitation parameter restoration circuit, and a synthesis circuit. The demultiplexer receives the bitstream and separates it into the quantized linear prediction parameters and the quantized excitation parameters. The linear prediction parameter restoration circuit restores the quantized linear prediction parameters to frame-by-frame linear prediction parameters, and the subframe interpolation circuit interpolates these to generate an interpolated linear prediction parameter for each subframe. The excitation parameter restoration circuit restores the quantized excitation parameters to an excitation parameter for each subframe. Speech is reproduced by using the interpolated linear prediction parameters as the filter coefficients of the synthesis circuit and driving the filter of the synthesis circuit with the excitation parameters.

[0005] FIG. 4 shows the relationship among the analysis window, frames, and subframes. In FIG. 4, input speech data 300 is multiplied by a center-weighted analysis window 311 and cut out frame by frame. The time length of the analysis window 311 defines one frame length. A subframe is an equal division of a frame in time, and a frame is usually divided into four (for example, the k-th frame 302, where k is an integer, has a first subframe 307 through a fourth subframe 310). The linear prediction parameter calculated by linear prediction analysis of one frame of input speech data cut out with the analysis window 311 best represents the characteristics of the speech data near the center of that frame. That is, the linear prediction parameter calculated from the input speech data of the (k−1)-th frame 301 best represents the characteristics of the input speech data 300 near 304, and the linear prediction parameter of the next k-th frame 302 best represents the characteristics of the input speech data 300 near 305. Therefore, to smooth the change in reproduced speech between frames, the frame-by-frame (quantized) linear prediction parameters are interpolated for each subframe to generate interpolated (quantized) linear prediction parameters (for example, the (quantized) linear prediction parameter of the k-th frame 302 is interpolated to generate four interpolated (quantized) linear prediction parameters, one for each of the first subframe 307 through the fourth subframe 310).

[0006] FIG. 5 is a configuration diagram of a conventional subframe interpolation circuit in a speech encoding device. In FIG. 5, the quantized linear prediction parameter of the k-th frame is denoted αq[k], and the interpolated quantized linear prediction parameter of the (j+1)-th subframe (j = 0, 1, 2, 3) of the k-th frame is denoted αq′[k, j]. The subframe interpolation circuit has multipliers 3031 to 3038, which multiply their input signals by the constants a₀ to a₃ and b₀ to b₃, and adders 3039 to 3042. It calculates the interpolated quantized linear prediction parameters αq′[k, j] from the quantized linear prediction parameters αq[k−1], αq[k], and αq[k+1] by performing the following operations:

αq′[k, 0] = αq[k−1] × a₀ + αq[k] × b₀
αq′[k, 1] = αq[k−1] × a₁ + αq[k] × b₁
αq′[k, 2] = αq[k] × a₂ + αq[k+1] × b₂
αq′[k, 3] = αq[k] × a₃ + αq[k+1] × b₃

Here the constants a₀ to a₃ and b₀ to b₃ are set to the following values:

a₀ = 3/8, b₀ = 5/8
a₁ = 1/8, b₁ = 7/8
a₂ = 7/8, b₂ = 1/8
a₃ = 5/8, b₃ = 3/8
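The conventional interpolation can be sketched in a few lines of Python. This is only an illustration of the four formulas and the constants quoted above; the function name `interpolate_conventional` and the representation of each parameter as a list of n values are assumptions, not part of the source. Note that the third and fourth subframes need the parameter of the next frame.

```python
from fractions import Fraction

# Constants a0..a3 and b0..b3 as quoted for the conventional circuit.
A = [Fraction(3, 8), Fraction(1, 8), Fraction(7, 8), Fraction(5, 8)]
B = [Fraction(5, 8), Fraction(7, 8), Fraction(1, 8), Fraction(3, 8)]


def interpolate_conventional(prev, cur, nxt):
    """Return the four subframe parameter vectors of frame k.

    prev, cur and nxt stand for aq[k-1], aq[k] and aq[k+1], each given
    as an equal-length list of parameter values (hypothetical layout).
    """
    out = []
    for j in range(4):
        # Subframes 1-2 mix previous and current frame,
        # subframes 3-4 mix current and next frame.
        left, right = (prev, cur) if j < 2 else (cur, nxt)
        out.append([A[j] * l + B[j] * r for l, r in zip(left, right)])
    return out
```

For a parameter that rises by a fixed step per frame, the four outputs form an even ramp across the frame, which is the smoothing effect the interpolation is meant to give.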

[0007] FIG. 6 illustrates the subframe interpolation method used in the above subframe interpolation circuit and shows the reasoning behind the interpolation formulas and constants given above. This method uses the quantized linear prediction parameter αq[k] of the k-th frame 502, together with the quantized linear prediction parameters αq[k−1] and αq[k+1] of the adjacent preceding (k−1)-th frame 501 and following (k+1)-th frame 503, to generate the interpolated quantized linear prediction parameter αq′[k, j] of the (j+1)-th subframe in the k-th frame 502. The procedure is described below.

[0008] A linear weighting 509 is applied to αq[k] of the k-th frame 502, taking the value 1 at the center of the k-th frame and 0 at the centers of the (k−1)-th frame 501 and the (k+1)-th frame 503. Similar linear weightings 508 and 510 are applied to αq[k−1] of the (k−1)-th frame and αq[k+1] of the (k+1)-th frame. Next, αq[k−1] of the (k−1)-th frame 501 and αq[k] of the k-th frame 502 are added according to the values of the linear weightings 508 and 509 at the centers of the first and second subframes of the k-th frame 502, which generates αq′[k, 0] and αq′[k, 1] for the first and second subframes. Likewise, the quantized linear prediction parameter αq[k] of the k-th frame 502 and the quantized linear prediction parameter αq[k+1] of the (k+1)-th frame are added according to the values of the linear weightings 509 and 510 at the centers of the third and fourth subframes of the k-th frame 502, which generates αq′[k, 2] and αq′[k, 3] for the third and fourth subframes.

[0009] That is, at the center 504 of the first subframe of the k-th frame 502, the weight for αq[k−1] is 3/8 and the weight for αq[k] is 5/8, so αq′[k, 0] of the first subframe is generated by adding αq[k−1] of the (k−1)-th frame multiplied by 3/8 to αq[k] of the k-th frame 502 multiplied by 5/8. Hence the constants a₀ = 3/8 and b₀ = 5/8. Similarly, at the center 506 of the third subframe, the weight for αq[k] is 7/8 and the weight for αq[k+1] is 1/8, so in the formula for αq′[k, 2] of the third subframe the constants are a₂ = 7/8 and b₂ = 1/8. The constants above are determined in this way. The subframe interpolation method in the subframe interpolation circuit of the speech decoding device is the same as described above.
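The derivation of the constants can be checked numerically. The sketch below evaluates the triangular weightings at each subframe center; the function name `conventional_weights` and the choice to measure position in units of one frame length from the center of frame k are assumptions made for illustration.

```python
from fractions import Fraction


def conventional_weights(num_subframes=4):
    """Weights on (aq[k-1], aq[k], aq[k+1]) at each subframe center of frame k."""
    rows = []
    for j in range(num_subframes):
        # Position of the subframe center relative to the center of
        # frame k, in units of one frame length.
        offset = Fraction(2 * j + 1, 2 * num_subframes) - Fraction(1, 2)
        w_cur = 1 - abs(offset)     # weighting 509: 1 at the center of frame k
        w_prev = max(-offset, 0)    # weighting 508: non-zero in the first half
        w_next = max(offset, 0)     # weighting 510: non-zero in the second half
        rows.append((w_prev, w_cur, w_next))
    return rows
```

With four subframes the rows reproduce the pairs quoted in the text: 3/8 and 5/8 for the first subframe, 1/8 and 7/8 for the second, and the mirrored values with the next frame for the third and fourth.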

[0010] In the above speech encoding device, the time from speech input until the multiplexed parameter signal is sent depends on the time from speech input until the quantized excitation parameters are generated. In the above speech decoding device, the time from reception of the multiplexed parameter signal until speech is reproduced depends on the time from signal reception until the interpolated linear prediction parameters are calculated.

[0011] The frame-by-frame linear prediction parameters are subframe-interpolated as described above in order to smooth the temporal change of the linear prediction parameters and to preserve phonemic continuity between frames. If no subframe interpolation is performed at all, discontinuities in the linear prediction parameters affect the sound quality, and the speech becomes choppy and very harsh to the ear.

[0012]

PROBLEMS TO BE SOLVED BY THE INVENTION

However, the conventional subframe interpolation described above uses the linear prediction parameters of both the preceding and the following adjacent frames. The linear prediction parameters of the subframes from the center of the frame to its temporal end (the third and fourth subframes) therefore cannot be generated until the linear prediction parameter of the next frame has been calculated (or restored). Consequently, the speech encoding device cannot calculate the excitation parameters of the third and fourth subframes until the linear prediction parameter of the next frame has been calculated, and the speech decoding device cannot start reproducing the speech corresponding to the third and fourth subframes until the linear prediction parameter of the next frame has been calculated. Compared with the case without subframe interpolation, this causes a delay of half a frame in each device. When such a speech coding system is used for digital telephony, this delay makes conversation feel unnatural.
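As a rough worked example of the delay described here, the sketch below converts the half-frame delay per device into milliseconds. The function name `lookahead_delay_ms` is hypothetical, and the end-to-end figure assumes that the encoder and decoder delays simply add, which the source does not state explicitly.

```python
def lookahead_delay_ms(frame_ms, devices=2):
    """Extra delay caused by waiting for the next frame's parameter.

    Returns (delay per device, summed delay over `devices` devices),
    assuming each device waits half a frame and the delays add up.
    """
    per_device = frame_ms / 2
    return per_device, per_device * devices
```

Under these assumptions, the 20 to 40 ms frame lengths mentioned earlier give 10 to 20 ms of extra delay in each device.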

[0013] The present invention solves the above conventional problem, and its object is to provide a speech encoding device and a speech decoding device with low delay and smooth changes in speech between frames.

[0014]

MEANS FOR SOLVING THE PROBLEMS

To achieve the above object, the speech encoding device of the present invention comprises: linear prediction parameter calculation means for analyzing the input speech frame by frame and calculating a linear prediction parameter for each frame; subframe interpolation means for interpolating the linear prediction parameter of each frame and generating an interpolated linear prediction parameter for each of the subframes into which the frame is divided; and excitation parameter calculation means for calculating excitation parameters on the basis of the interpolated linear prediction parameter and the input speech of each subframe. The subframe interpolation means generates the interpolated linear prediction parameters of the frame to be interpolated by using the linear prediction parameter of that frame and the linear prediction parameter of the frame immediately preceding it.

[0015] In the speech encoding device according to claim 2, for each subframe of the frame to be interpolated that lies between the temporal start of the frame and a predetermined position, the subframe interpolation means generates the interpolated linear prediction parameter by linearly adding, with predetermined weights, the linear prediction parameter of the frame to be interpolated and the linear prediction parameter of the immediately preceding frame. For each subframe between the predetermined position and the temporal end of the frame, it uses the linear prediction parameter of the frame to be interpolated, unchanged, as the interpolated linear prediction parameter.

[0016] Next, the speech decoding device of the present invention comprises: subframe interpolation means for interpolating the input linear prediction parameter of each frame and generating an interpolated linear prediction parameter for each of the subframes into which the frame is divided; and synthesis means for reproducing speech on the basis of the interpolated linear prediction parameters and the input subframe-by-subframe excitation parameters. The subframe interpolation means generates the interpolated linear prediction parameters of the frame to be interpolated by using the linear prediction parameter of that frame and the linear prediction parameter of the frame immediately preceding it.

[0017] In the speech decoding device according to claim 4, for each subframe of the frame to be interpolated that lies between the temporal start of the frame and a predetermined position, the subframe interpolation means generates the interpolated linear prediction parameter by linearly adding, with predetermined weights, the linear prediction parameter of the frame to be interpolated and the linear prediction parameter of the immediately preceding frame. For each subframe between the predetermined position and the temporal end of the frame, it uses the linear prediction parameter of the frame to be interpolated, unchanged, as the interpolated linear prediction parameter.

[0018]

DESCRIPTION OF THE PREFERRED EMBODIMENTS

FIG. 1 is a block diagram of a speech coding and transmission system showing an embodiment of the present invention, in which a speech encoding device 1 and a speech decoding device 2 are connected by a transmission path. The speech encoding device 1 generates linear prediction parameters and excitation parameters from input speech data using CELP coding, and has a linear prediction parameter calculation circuit 101 (linear prediction parameter calculation means), a linear prediction parameter quantization circuit 102, a subframe interpolation circuit 103 (subframe interpolation means), an excitation parameter calculation circuit 104 (excitation parameter calculation means), an excitation parameter quantization circuit 105, and a multiplexer 106.

[0019] The linear prediction parameter calculation circuit 101 multiplies the input speech data by a center-weighted analysis window one frame long, such as a Hamming window or a Hanning window, to cut out one frame length of input speech data, and performs linear prediction analysis on this frame of input speech data to calculate a linear prediction parameter for the frame. LSP (Line Spectrum Pair) parameters, which have good interpolation and quantization characteristics, are used as the linear prediction parameters. One frame length is 20 to 40 ms. When linear prediction analysis of order n (n being a positive integer) is performed, the linear prediction parameter consists of n parameter values. The linear prediction parameter quantization circuit 102 quantizes the frame-by-frame linear prediction parameter calculated by the linear prediction parameter calculation circuit 101 to generate a frame-by-frame quantized linear prediction parameter.
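The windowing and analysis step can be illustrated as follows. The source does not say which analysis algorithm circuit 101 uses, so this sketch assumes the common autocorrelation method with the Levinson-Durbin recursion and returns direct-form predictor coefficients; the conversion to LSP parameters is omitted. The names `hamming` and `lpc` are hypothetical.

```python
import math


def hamming(n):
    """Center-weighted Hamming window of n samples (n >= 2)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def lpc(frame, order):
    """Predictor coefficients a[1..order] with s[t] ~ sum(a[i] * s[t - i])."""
    # Cut out one frame by applying the analysis window.
    x = [s * w for s, w in zip(frame, hamming(len(frame)))]
    # Autocorrelation at lags 0..order.
    r = [sum(x[t] * x[t - lag] for t in range(lag, len(x)))
         for lag in range(order + 1)]
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0:
            break  # silent or degenerate frame: keep remaining coefficients at 0
        # Reflection coefficient for this order.
        k = (r[i] - sum(a[m] * r[i - m] for m in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for m in range(1, i):
            new_a[m] = a[m] - k * a[i - m]
        a = new_a
        err *= 1 - k * k
    return a[1:]
```

An order-n analysis returns n values per frame, matching the statement that the linear prediction parameter consists of n parameter values.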

[0020] The subframe interpolation circuit 103 interpolates the frame-by-frame quantized linear prediction parameter generated by the linear prediction parameter quantization circuit 102, using this quantized linear prediction parameter and the quantized linear prediction parameter of the immediately preceding frame (the frame one before), and generates an interpolated quantized linear prediction parameter for each subframe. A subframe is a quarter of a frame, so one frame has first to fourth subframes and one subframe length is 5 to 10 ms. This subframe interpolation serves to reduce discontinuity between frames in the reproduced speech.

[0021] The excitation parameter calculation circuit 104 calculates excitation parameters for each subframe on the basis of the interpolated quantized linear prediction parameter calculated by the subframe interpolation circuit 103 and the input speech data of that subframe. The excitation parameters indicate the pitch (height) of the speech, the pitch gain (strength) of the speech, the table lookup index into the noise codebook, the noise codebook gain, and so on. The excitation parameter quantization circuit 105 quantizes the excitation parameters calculated by the excitation parameter calculation circuit 104 to generate quantized excitation parameters for each subframe.

[0022] The multiplexer 106 multiplexes the frame-by-frame quantized linear prediction parameters generated by the linear prediction parameter quantization circuit 102 with the subframe-by-subframe quantized excitation parameters generated by the excitation parameter quantization circuit 105 to generate a bitstream (multiplexed parameter signal).

[0023] FIG. 2 is a block diagram of the subframe interpolation circuit 103. In FIG. 2, αq[k] denotes the quantized linear prediction parameter of the k-th frame generated by the linear prediction parameter quantization circuit 102, and αq[k−1] denotes the quantized linear prediction parameter of the (k−1)-th frame. αq′[k, j] (j = 0, 1, 2, 3) denotes the interpolated quantized linear prediction parameter of the (j+1)-th subframe of the k-th frame. The subframe interpolation circuit 103 has multipliers 1031 to 1034 and adders 1035 and 1036. The multiplier 1031 multiplies the quantized linear prediction parameter αq[k−1] of the (k−1)-th frame by the constant a₀ and supplies the result to the first input terminal of the adder 1035, and the multiplier 1032 multiplies αq[k−1] by the constant a₁ and supplies the result to the first input terminal of the adder 1036. The multiplier 1033 multiplies the quantized linear prediction parameter αq[k] of the k-th frame by the constant b₀ and supplies the result to the second input terminal of the adder 1035, and the multiplier 1034 multiplies αq[k] by the constant b₁ and supplies the result to the second input terminal of the adder 1036. The adder 1035 adds the output of the multiplier 1031 and the output of the multiplier 1033 to generate the interpolated quantized linear prediction parameter αq′[k, 0] of the first subframe, and the adder 1036 adds the output of the multiplier 1032 and the output of the multiplier 1034 to generate αq′[k, 1] of the second subframe. The quantized linear prediction parameter αq[k] is used as it is as the interpolated quantized linear prediction parameters αq′[k, 2] and αq′[k, 3] of the third and fourth subframes.

[0024] Returning to FIG. 1, the speech decoding device 2 receives the multiplexed parameter signal from the speech encoding device 1 via the transmission path and reproduces speech on the basis of these parameters. It has a demultiplexer 208, a linear prediction parameter restoration circuit 210, a subframe interpolation circuit 211 (subframe interpolation means), an excitation parameter restoration circuit 212, and a synthesis circuit 213 (synthesis means).

[0025] The demultiplexer 208 receives the bitstream (multiplexed parameter signal) sent over the transmission path and separates it into frame-by-frame quantized linear prediction parameters and subframe-by-subframe quantized excitation parameters. It sends the quantized linear prediction parameters to the linear prediction parameter restoration circuit 210 and the quantized excitation parameters to the excitation parameter restoration circuit 212. The linear prediction parameter restoration circuit 210 decodes the quantized linear prediction parameters separated by the demultiplexer 208 to restore the frame-by-frame linear prediction parameters.

[0026] The subframe interpolation circuit 211 interpolates the frame-by-frame linear prediction parameter restored by the linear prediction parameter restoration circuit 210, using this linear prediction parameter and the linear prediction parameter of the immediately preceding frame, and generates an interpolated linear prediction parameter for each subframe. The circuit configuration of the subframe interpolation circuit 211 is the same as that of the subframe interpolation circuit 103 in FIG. 2.

[0027] The excitation parameter restoration circuit 212 decodes the quantized excitation parameters separated by the demultiplexer 208 to restore the excitation parameters for each subframe. The synthesis circuit 213 has a synthesis filter consisting of an all-pole digital filter. It sets the filter coefficients of the synthesis filter to the parameter values of the interpolated linear prediction parameter generated by the subframe interpolation circuit 211, and reproduces speech by driving the synthesis filter with an excitation signal, such as a periodic pulse train, generated according to the excitation parameters restored by the excitation parameter restoration circuit 212.
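A minimal sketch of an all-pole synthesis filter whose coefficients change at every subframe is shown below. It assumes that the interpolated parameters have already been converted to direct-form predictor coefficients (the source does not describe that step) and that the excitation signal is already available as samples; the name `synthesize` and its arguments are hypothetical.

```python
def synthesize(excitation, coeffs_per_subframe, subframe_len):
    """Drive an all-pole filter with an excitation signal.

    y[t] = e[t] + sum(a[i] * y[t - 1 - i]), where a is the coefficient
    list of the subframe that sample t falls in.
    """
    order = len(coeffs_per_subframe[0])
    history = [0.0] * order  # past outputs, most recent first
    out = []
    for t, e in enumerate(excitation):
        a = coeffs_per_subframe[t // subframe_len]
        y = e + sum(ai * hi for ai, hi in zip(a, history))
        out.append(y)
        history = [y] + history[:-1]
    return out
```

Because the filter memory is carried across subframe boundaries, an abrupt jump in the coefficients shows up directly in the output, which is why smoothly interpolated coefficients matter.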

[0028] Next, the operation of the speech coding system of FIG. 1 configured as described above will be explained, beginning with the speech encoding device 1. The linear prediction parameter calculation circuit 101 calculates the linear prediction parameter α[k−1] of the (k−1)-th frame and then the linear prediction parameter α[k] of the k-th frame. Specifically, the input speech data is multiplied by an analysis window one frame long to cut out one frame of input speech data (see FIG. 4), and linear prediction analysis of this frame-unit data yields the linear prediction parameters α[k−1] and α[k] of the (k−1)-th and k-th frames in sequence. Next, the linear prediction parameter quantization circuit 102 quantizes α[k−1] and α[k] in sequence to generate the quantized linear prediction parameter αq[k−1] of the (k−1)-th frame and the quantized linear prediction parameter αq[k] of the k-th frame. Any quantization method may be used.
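The description does not tie the linear prediction analysis to a particular algorithm. By way of illustration only, the sketch below uses the autocorrelation method with the Levinson-Durbin recursion; this choice is an assumption, not something the text specifies. The function names `autocorrelation` and `levinson_durbin` and the predictor convention s[n] ≈ Σ a_i·s[n−i] are likewise hypothetical.

```python
def autocorrelation(frame, max_lag):
    """Return autocorrelation values r[0..max_lag] of one (already windowed) frame."""
    return [
        sum(frame[n] * frame[n - lag] for n in range(lag, len(frame)))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the normal equations for predictor coefficients a[1..order].

    Assumed convention (not from the source): s[n] is predicted as
    sum(a[i] * s[n - i]) for i = 1..order.
    Returns (coefficients, residual prediction error energy).
    """
    a = [0.0] * (order + 1)  # a[0] is unused
    error = r[0]
    for i in range(1, order + 1):
        if error == 0.0:
            break  # silent frame: leave remaining coefficients at zero
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error  # reflection coefficient for this order
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        error *= 1.0 - k * k
    return a[1:], error
```

In such a sketch, one call per frame would produce the parameter vector that the text denotes α[k]; the windowing step of FIG. 4 is assumed to have been applied to `frame` beforehand.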

[0029] Next, the subframe interpolation circuit 103 generates the interpolated quantized linear prediction parameter αq′[k, j] of the (j+1)-th subframe (j = 0, 1, 2, 3) of the k-th frame by interpolation using the quantized linear prediction parameters αq[k−1] and αq[k] of the (k−1)-th and k-th frames. This subframe interpolation is carried out by the following operations (see FIG. 2).

$$\alpha_q'[k,0] = \alpha_q[k-1] \times a_0 + \alpha_q[k] \times b_0 \tag{1}$$
$$\alpha_q'[k,1] = \alpha_q[k-1] \times a_1 + \alpha_q[k] \times b_1 \tag{2}$$
$$\alpha_q'[k,2] = \alpha_q[k] \tag{3}$$
$$\alpha_q'[k,3] = \alpha_q[k] \tag{4}$$

In the above equations, a₀, a₁, b₀, and b₁ are the constants mentioned earlier and are set to a₀ = 3/4, b₀ = 1/4, a₁ = 1/4, and b₁ = 3/4. The subframe interpolation method underlying these interpolation equations and constant values is described later.
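Equations (1) to (4) can be sketched in a few lines of Python. The function name `interpolate_subframes`, the `WEIGHTS` table, and the representation of each linear prediction parameter as a plain list of numbers are assumptions made for illustration; only the constants 3/4 and 1/4 come from the text.

```python
# (a_j, b_j) pairs: a_j multiplies the previous frame's parameter,
# b_j multiplies the current frame's parameter (equations (1) to (4)).
WEIGHTS = [(0.75, 0.25), (0.25, 0.75), (0.0, 1.0), (0.0, 1.0)]


def interpolate_subframes(prev_params, cur_params, weights=WEIGHTS):
    """Return one interpolated parameter vector per subframe of frame k.

    prev_params: quantized parameter vector of frame k-1
    cur_params:  quantized parameter vector of frame k
    """
    if len(prev_params) != len(cur_params):
        raise ValueError("parameter vectors must have the same length")
    return [
        [a * p + b * c for p, c in zip(prev_params, cur_params)]
        for a, b in weights
    ]
```

Note that only the previous and current frames appear as inputs, so the result for frame k can be computed as soon as the parameter of frame k is available.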

[0030] Next, the excitation parameter calculation circuit 104 sequentially calculates the excitation parameters of the first to fourth subframes of the k-th frame, based on the interpolated quantized linear prediction parameter αq′[k, j] and the input speech data of the (j+1)-th subframe of the k-th frame. The excitation parameter quantization circuit 105 then quantizes these excitation parameters in sequence to generate the quantized excitation parameters of the first to fourth subframes.

[0031] Finally, the multiplexer 106 multiplexes the quantized linear prediction parameter αq[k] of the k-th frame from the linear prediction parameter quantization circuit 102 with the quantized excitation parameters of the first to fourth subframes of the k-th frame from the excitation parameter quantization circuit 105 to generate a bitstream (multiplexed parameter signal), and sends it to the transmission path.

[0032] The subframe interpolation method used in the subframe interpolation circuit 103 is now described. FIG. 3 illustrates this method and shows the reasons for choosing the interpolation equations (equations (1) to (4)) and the values of the constants a₀, a₁, b₀, and b₁. The method uses the quantized linear prediction parameter αq[k] of the k-th frame 602 and the quantized linear prediction parameter αq[k−1] of the (k−1)-th frame 601, the adjacent previous frame, to generate the interpolated quantized linear prediction parameter αq′[k, j] of the (j+1)-th subframe of the k-th frame 602. The procedure is as follows.

[0033] A trapezoidal linear weighting 609 is applied to the quantized linear prediction parameter αq[k] of the k-th frame 602. This weighting is 0 at the temporal start of the k-th frame, 1 from the center of the k-th frame to its temporal end, and 0 again at the center of the (k+1)-th frame, the adjacent next frame, with linear transitions in between. Similar linear weightings 608 and 610 are applied to the quantized linear prediction parameter αq[k−1] of the (k−1)-th frame, the adjacent previous frame, and to the quantized linear prediction parameter αq[k+1] of the (k+1)-th frame. Next, αq[k] and αq[k−1] are added according to the values of the linear weightings 609 and 608 at the center of the (j+1)-th subframe of the k-th frame 602, which yields the interpolated quantized linear prediction parameter αq′[k, j]. The quantized linear prediction parameter αq[k+1] of the (k+1)-th frame is not used here.

[0034] That is, at the center 604 of the first subframe of the k-th frame 602, the weight for αq[k−1] is 3/4 and the weight for αq[k] is 1/4, so the interpolated quantized linear prediction parameter αq′[k, 0] of the first subframe is the sum of αq[k−1] multiplied by 3/4 and αq[k] multiplied by 1/4. Hence a₀ = 3/4 and b₀ = 1/4. Similarly, at the center 605 of the second subframe, the weight for αq[k−1] is 1/4 and the weight for αq[k] is 3/4, so the interpolated quantized linear prediction parameter αq′[k, 1] of the second subframe is the sum of αq[k−1] multiplied by 1/4 and αq[k] multiplied by 3/4. Hence a₁ = 1/4 and b₁ = 3/4. At the centers 606 and 607 of the third and fourth subframes, the weight for αq[k−1] is 0 and the weight for αq[k] is 1, so αq[k] is used unchanged as the interpolated quantized linear prediction parameters αq′[k, 2] and αq′[k, 3] of the third and fourth subframes.
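The constants can be checked by sampling the trapezoidal weighting at the subframe centers. The sketch below is a minimal illustration under stated assumptions: time is measured in frame lengths from the start of the frame that owns the trapezoid, the frame is divided into equal subframes, and the names `trapezoid_weight` and `subframe_weights` are hypothetical.

```python
def trapezoid_weight(t):
    """Weight of a frame's parameter at time t (in frame lengths, measured
    from that frame's start): ramps 0 -> 1 over the first half of the frame,
    holds 1 until the frame ends, then ramps back to 0 at the center of the
    next frame.
    """
    if t <= 0.0 or t >= 1.5:
        return 0.0
    if t < 0.5:
        return 2.0 * t
    if t <= 1.0:
        return 1.0
    return 3.0 - 2.0 * t


def subframe_weights(num_subframes=4):
    """Return (a_j, b_j) sampled at the center of each subframe of frame k."""
    weights = []
    for j in range(num_subframes):
        center = (j + 0.5) / num_subframes
        a = trapezoid_weight(center + 1.0)  # trapezoid of frame k-1
        b = trapezoid_weight(center)        # trapezoid of frame k
        weights.append((a, b))
    return weights
```

With four subframes this reproduces the pairs (3/4, 1/4), (1/4, 3/4), (0, 1), and (0, 1), and each pair sums to 1.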

[0035] By interpolating with the quantized linear prediction parameters αq[k−1] and αq[k] of the (k−1)-th and k-th frames, without using the quantized linear prediction parameter αq[k+1] of the (k+1)-th frame, the interpolated quantized linear prediction parameters αq′[k, j] of each subframe can be generated half a frame earlier (measured from the time of speech input) than in the conventional speech encoding device. Consequently, the excitation parameters of each subframe can be calculated, and the multiplexed parameter signal sent, half a frame earlier than before.

[0036] Returning to FIG. 1, the operation of the speech decoding device 2 is described next. The bitstream (multiplexed parameter signal) sent by the speech encoding device 1 is carried over the transmission path to the speech decoding device 2. The demultiplexer 208 receives the bitstream and separates it into frame-unit quantized linear prediction parameters and subframe-unit quantized excitation parameters. That is, it receives the bitstream of the (k−1)-th frame and separates it into the quantized linear prediction parameter αq[k−1] of the (k−1)-th frame and the quantized excitation parameters of each subframe of the (k−1)-th frame. It then receives the bitstream of the k-th frame and separates it into the quantized linear prediction parameter αq[k] of the k-th frame and the quantized excitation parameters of each subframe of the k-th frame.
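The per-frame payload handled by the multiplexer 106 and the demultiplexer 208 (one quantized linear prediction parameter per frame plus four subframe excitation parameters) can be sketched as a round trip. The 16-bit big-endian index fields and the names `multiplex` and `demultiplex` are assumptions made purely for illustration; the description does not specify a bit allocation or field order.

```python
import struct

SUBFRAMES_PER_FRAME = 4
# Hypothetical layout: one 16-bit LPC index followed by four 16-bit
# excitation indices, all big-endian unsigned integers.
FRAME_FORMAT = ">H4H"


def multiplex(lpc_index, excitation_indices):
    """Pack one frame's quantized parameters into a byte string."""
    if len(excitation_indices) != SUBFRAMES_PER_FRAME:
        raise ValueError("expected one excitation index per subframe")
    return struct.pack(FRAME_FORMAT, lpc_index, *excitation_indices)


def demultiplex(payload):
    """Split one frame's byte string back into its quantized parameters."""
    lpc_index, *excitation_indices = struct.unpack(FRAME_FORMAT, payload)
    return lpc_index, excitation_indices
```

The point of the sketch is only the granularity: the linear prediction parameter travels once per frame, while the excitation parameters travel once per subframe.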

[0037] Next, the linear prediction parameter restoring circuit 210 decodes the quantized linear prediction parameters αq[k−1] and αq[k] in sequence to restore the linear prediction parameters α[k−1] and α[k] of the (k−1)-th and k-th frames.

[0038] Next, the subframe interpolation circuit 211 generates the interpolated linear prediction parameter α′[k, j] of the (j+1)-th subframe of the k-th frame by interpolation using the linear prediction parameters α[k−1] and α[k]. This subframe interpolation uses the same interpolation method as the subframe interpolation circuit 103 and is carried out by the following operations.

$$\alpha'[k,0] = \alpha[k-1] \times a_0 + \alpha[k] \times b_0$$
$$\alpha'[k,1] = \alpha[k-1] \times a_1 + \alpha[k] \times b_1$$
$$\alpha'[k,2] = \alpha[k]$$
$$\alpha'[k,3] = \alpha[k]$$

In the above equations, a₀, a₁, b₀, and b₁ are the constants described earlier.

[0039] In addition, the excitation parameter restoring circuit 212 decodes, in sequence, the subframe-unit quantized excitation parameters separated by the demultiplexer 208 to restore the excitation parameters of the first to fourth subframes of the k-th frame.

[0040] Finally, the synthesis circuit 213 sets the filter coefficients of the synthesis filter to the values of the interpolated linear prediction parameter α′[k, j] of the (j+1)-th subframe of the k-th frame and drives the synthesis filter with an excitation signal generated according to the excitation parameters of that (j+1)-th subframe, thereby reproducing the speech of the first to fourth subframes of the k-th frame in sequence.
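The synthesis step can be sketched as an all-pole filter whose coefficients are switched at each subframe boundary while the filter memory carries over. This is a minimal sketch under assumptions not stated in the text: the sign convention s[n] = e[n] + Σ a_i·s[n−i], the list-based data layout, and the name `synthesize_frame` are all hypothetical.

```python
def synthesize_frame(subframe_coeffs, subframe_excitations, history=None):
    """Drive an all-pole synthesis filter one subframe at a time.

    subframe_coeffs:      one coefficient list [a_1..a_p] per subframe
    subframe_excitations: one excitation sample list per subframe
    history:              previous p output samples, most recent first
    Returns (output samples, updated history).
    Assumed convention: s[n] = e[n] + sum(a_i * s[n - i]).
    """
    order = len(subframe_coeffs[0])
    state = list(history) if history is not None else [0.0] * order
    output = []
    for coeffs, excitation in zip(subframe_coeffs, subframe_excitations):
        for e in excitation:
            s = e + sum(a * past for a, past in zip(coeffs, state))
            output.append(s)
            # Shift the filter memory: newest sample goes to the front.
            state = [s] + state[:-1]
    return output, state
```

Passing the returned history into the call for the next frame keeps the filter memory continuous across frame boundaries.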

[0041] By interpolating with the linear prediction parameters α[k−1] and α[k] of the (k−1)-th and k-th frames, without using the linear prediction parameter α[k+1] of the (k+1)-th frame, the interpolated linear prediction parameters α′[k, j] of each subframe can be generated half a frame earlier (measured from reception of the multiplexed parameter signal) than in the conventional speech decoding device, so speech can be reproduced half a frame earlier than before. In this subframe interpolation method, the third and fourth subframes are not interpolated with an adjacent frame. However, the interpolated linear prediction parameters of the third and fourth subframes influence the first and second subframes of the next frame, so phonemic continuity between frames is preserved and there is almost no audible degradation in sound quality.

[0042] As described above, according to this embodiment, when the subframe interpolation circuits 103 and 211 linearly interpolate the frame-unit linear prediction parameters to generate a linear prediction parameter for each subframe, the first and second subframes, which precede the center of the interpolation target frame in time, are linearly interpolated from the linear prediction parameter of that frame and that of the adjacent previous frame, while the third and fourth subframes, which follow the center of the frame, use the linear prediction parameter of that frame unchanged. Subframe interpolation can therefore be performed without the linear prediction parameter of the adjacent following frame, which eliminates the half-frame delay incurred by conventional subframe interpolation while preserving the smoothness of speech changes between frames.

[0043] In the above embodiment, the subframes interpolated using the linear prediction parameter of the adjacent previous frame are the first and second subframes, but this is only one example of smoothing the transition of the linear prediction parameters between frames. For example, only the first subframe, or the first to third subframes, may be interpolated instead, and the values of the constants a₀, a₁, b₀, and b₁ are not limited to those given above. Furthermore, the subframe interpolation means may adopt an interpolation method other than the linear interpolation described above, provided that it is based on the linear prediction parameter of the interpolation target frame and that of the adjacent previous frame.

[0044] In the above embodiment, the linear prediction parameter calculating means, the subframe interpolation means, and the excitation parameter calculating means are implemented as circuits, but these means are not limited to hardware. The procedures of each means may instead be carried out using a personal computer or the like.

[0045]

[Effects of the Invention] As described above, according to the speech encoding device and speech decoding device of the present invention, the subframe interpolation means generates the interpolated linear prediction parameters using the linear prediction parameter of the interpolation target frame and that of the adjacent previous frame. This eliminates the half-frame delay incurred by conventional subframe interpolation while preserving the smoothness of speech changes between frames, and so removes the unnatural impression that the delay causes in the reproduced speech.

[Brief description of the drawings]

FIG. 1 is a block diagram of a speech coding system according to an embodiment of the present invention.

FIG. 2 is a configuration diagram of the subframe interpolation circuit in the speech coding system according to the embodiment of the present invention.

FIG. 3 is a diagram illustrating the subframe interpolation method in the speech coding system according to the embodiment of the present invention.

FIG. 4 is a diagram showing the relationship among the analysis window, frames, and subframes in the speech coding system.

FIG. 5 is a configuration diagram of the subframe interpolation circuit in a conventional speech coding system.

FIG. 6 is a diagram illustrating the subframe interpolation method in a conventional speech coding system.

[Explanation of symbols]

1 speech encoding device, 2 speech decoding device, 101 linear prediction parameter calculation circuit, 102 linear prediction parameter quantization circuit, 103 subframe interpolation circuit, 104 excitation parameter calculation circuit, 105 excitation parameter quantization circuit, 106 multiplexer, 208 demultiplexer, 210 linear prediction parameter restoring circuit, 211 subframe interpolation circuit, 212 excitation parameter restoring circuit, 213 synthesis circuit, 1031 to 1034 multipliers, 1035 and 1036 adders

Claims

[Claims]

1. A speech encoding device comprising: linear prediction parameter calculating means for analyzing input speech in frame units and calculating a linear prediction parameter for each frame; subframe interpolation means for interpolating the linear prediction parameters of the frames to generate an interpolated linear prediction parameter for each of the subframes into which a frame is divided; and excitation parameter calculating means for calculating excitation parameters based on the interpolated linear prediction parameters and the input speech in subframe units, wherein the subframe interpolation means generates the interpolated linear prediction parameters of an interpolation target frame using the linear prediction parameter of the interpolation target frame and the linear prediction parameter of the adjacent previous frame of the interpolation target frame.

2. The speech encoding device according to claim 1, wherein the subframe interpolation means, for each subframe of the interpolation target frame lying between the temporal start of the frame and a predetermined position, generates the interpolated linear prediction parameter by linearly adding the linear prediction parameter of the interpolation target frame and the linear prediction parameter of the adjacent previous frame with predetermined weights, and, for each subframe lying between the predetermined position and the temporal end of the frame, uses the linear prediction parameter of the interpolation target frame unchanged as the interpolated linear prediction parameter.

3. A speech decoding device comprising: subframe interpolation means for interpolating input linear prediction parameters of frames to generate an interpolated linear prediction parameter for each of the subframes into which a frame is divided; and synthesis means for reproducing speech based on the interpolated linear prediction parameters and input excitation parameters in subframe units, wherein the subframe interpolation means generates the interpolated linear prediction parameters of an interpolation target frame using the linear prediction parameter of the interpolation target frame and the linear prediction parameter of the adjacent previous frame of the interpolation target frame.

4. The speech decoding device according to claim 3, wherein the subframe interpolation means, for each subframe of the interpolation target frame lying between the temporal start of the frame and a predetermined position, generates the interpolated linear prediction parameter by linearly adding the linear prediction parameter of the interpolation target frame and the linear prediction parameter of the adjacent previous frame with predetermined weights, and, for each subframe lying between the predetermined position and the temporal end of the frame, uses the linear prediction parameter of the interpolation target frame unchanged as the interpolated linear prediction parameter.