JPH10293599A - Sound signal encoding method - Google Patents

Sound signal encoding method

Info

Publication number
JPH10293599A
JPH10293599A JP9104308A JP10430897A
Authority
JP
Japan
Prior art keywords
signal
linear prediction
filter
linear
input signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP9104308A
Other languages
Japanese (ja)
Other versions
JP3520955B2 (en)
Inventor
Shigeaki Sasaki
茂明 佐々木
Akitoshi Kataoka
章俊 片岡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nippon Telegraph and Telephone Corp
Original Assignee
Nippon Telegraph and Telephone Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nippon Telegraph and Telephone Corp filed Critical Nippon Telegraph and Telephone Corp
Priority to JP10430897A priority Critical patent/JP3520955B2/en
Publication of JPH10293599A publication Critical patent/JPH10293599A/en
Application granted granted Critical
Publication of JP3520955B2 publication Critical patent/JP3520955B2/en
Anticipated expiration legal-status Critical
Expired - Fee Related legal-status Critical Current

Links

Abstract

PROBLEM TO BE SOLVED: To perform effective perceptual weighting for wide-band signals such as music.

SOLUTION: A p'-th order linear prediction (41) is performed on the input signal to construct an inverse filter 42, the prediction residual of the input signal is obtained through the inverse filter 42, and this residual signal is subjected to an n-th order (n > p) linear prediction analysis (43) to construct a linear filter (45), whose impulse response is calculated (47), transformed into the frequency domain (49), and modified (51) according to auditory characteristics. In parallel, a p-th order linear prediction filter 44 is constructed, and the frequency-domain representation of its filter coefficients is likewise modified (50) according to auditory characteristics. The error signal between the input signal and the synthesized signal is transformed (53) into the frequency domain and multiplied by the outputs of modification means 50 and 51 to apply perceptual weighting (52, 54); encoding is carried out so that the weighted error is minimized.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to an acoustic signal encoding method that determines codes so as to minimize the error between an acoustic input signal, such as speech or general audio, and the decoded synthesized signal, and more particularly to a method of weighting that error in consideration of auditory characteristics.

[0002]

2. Description of the Related Art

CELP (Code Excited Linear Prediction) is a typical conventional method for encoding acoustic signals at a low bit rate by linear predictive coding. Its overall processing is shown in FIG. 5A. The input speech signal from input terminal 11 is subjected to linear prediction analysis by linear prediction analysis means 12 for each frame of about 5 to 20 ms, yielding p-th order linear prediction coefficients α_i (i = 1, 2, ..., p). These linear prediction coefficients α_i are quantized by quantization means 13, and the quantized linear prediction coefficients are set as filter coefficients of linear prediction synthesis filter 14. The transfer function of synthesis filter 14 is expressed by equation (1) below.

[0003]

H(z) = 1 / (1 + Σ_{i=1}^{p} α_i z^{-i})    (1)

Excitation signals for synthesis filter 14 are stored in adaptive codebook 15. An excitation signal (vector) is cut out of adaptive codebook 15 at the pitch period corresponding to the input code from control means 16 and repeated to fill the frame length; a gain is applied by gain means 17, and the result is supplied through adding means 18 to synthesis filter 14 as the excitation signal. Subtracting means 19 subtracts the synthesized signal of synthesis filter 14 from the input signal, the difference signal is weighted by perceptual weighting filter 21 in correspondence with the masking property of the auditory characteristics, and control means 16 searches for the input code of adaptive codebook 15 (that is, the pitch period) that minimizes the energy of this weighted difference signal.

[0004]

Thereafter, control means 16 sequentially takes excitation vectors out of noise codebook (fixed codebook) 22; after a gain is applied by gain means 23, each vector is added to the previously selected excitation vector from adaptive codebook 15 and supplied to synthesis filter 14 as the excitation signal, and, as before, the excitation vector that minimizes the energy of the difference signal from perceptual weighting filter 21 is selected. Finally, for the excitation vectors selected from adaptive codebook 15 and noise codebook 22, the gains applied by gain means 17 and 23 are determined by searching, in the same manner as above, for the combination that minimizes the energy of the output signal of perceptual weighting filter 21. Perceptual weighting filter 21 uses the unquantized linear prediction coefficients α_i and two constants γ_1 and γ_2 not greater than 1, and is expressed by equation (2) below.

[0005]

W(z) = (1 + Σ_{i=1}^{p} α_i γ_1^i z^{-i}) / (1 + Σ_{i=1}^{p} α_i γ_2^i z^{-i})    (2)

The code indicating the quantized linear prediction coefficients, the codes indicating the vectors selected from adaptive codebook 15 and noise codebook 22, and the codes indicating the optimum gains given to gain means 17 and 23 together form the encoded output. The linear prediction filter 14 and the perceptual weighting filter 21 in FIG. 5A may also be combined into a perceptually weighted synthesis filter 24, as shown in FIG. 6. In this case, the input signal from input terminal 11 is supplied to subtracting means 19 through perceptual weighting filter 21.
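Equation (2) amounts to bandwidth expansion of the LPC polynomial. A sketch of this conventional weighting filter follows (illustrative only; the γ values shown are typical CELP choices, not values taken from the patent).

```python
import numpy as np
from scipy.signal import lfilter

def perceptual_weight(alpha: np.ndarray, x: np.ndarray,
                      gamma1: float = 0.9, gamma2: float = 0.6) -> np.ndarray:
    """Apply W(z) of eq. (2): numerator taps alpha_i * gamma1**i,
    denominator taps alpha_i * gamma2**i, i.e. A(z/γ1) / A(z/γ2)."""
    i = np.arange(1, len(alpha) + 1)
    b = np.concatenate(([1.0], alpha * gamma1 ** i))  # A(z/γ1)
    a = np.concatenate(([1.0], alpha * gamma2 ** i))  # A(z/γ2)
    return lfilter(b, a, x)
```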

[0006]

Decoding for this CELP coding is performed as shown in FIG. 5B. The linear prediction coefficient code in the input code from input terminal 31 is dequantized by dequantization means 32, and the dequantized linear prediction coefficients are set as filter coefficients of linear prediction filter 33. An excitation vector is cut out of adaptive codebook 34 according to the pitch code in the input code, and an excitation vector is selected from noise codebook 35 according to the noise code; the excitation vectors from codebooks 34 and 35 are given gains by gain means 36 and 37 according to the gain codes in the input code, then added together and supplied to synthesis filter 33 as the excitation signal. The synthesized signal from synthesis filter 33 is processed by postfilter 39 so that the quantization noise is reduced in consideration of the auditory characteristics, and is then output.
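The decoder's excitation construction can be sketched as follows; this is a simplified illustration with an assumed integer-lag adaptive codebook model and hypothetical argument names, not the patent's exact procedure.

```python
import numpy as np

def build_excitation(past_exc: np.ndarray, pitch_lag: int,
                     fixed_vec: np.ndarray, g_pitch: float,
                     g_fixed: float) -> np.ndarray:
    """Gain-scaled sum of adaptive and fixed codebook contributions.

    The adaptive part repeats the last `pitch_lag` excitation samples to
    fill one frame, mirroring the cut-out-and-repeat step of codebooks 15/34.
    """
    n = len(fixed_vec)
    segment = past_exc[-pitch_lag:]
    reps = int(np.ceil(n / pitch_lag))
    adaptive = np.tile(segment, reps)[:n]
    return g_pitch * adaptive + g_fixed * fixed_vec
```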

[0007]

SUMMARY OF THE INVENTION

In time-domain acoustic signal coding such as CELP described above, the conventional weighting that takes auditory characteristics into account is performed with an autoregressive moving-average linear filter based on linear prediction of about 10th to 20th order, which can model the formants of speech, possibly combined with a comb filter based on a single pitch frequency. It therefore cannot realize weighting that accounts for the fine spectral structure of an acoustic signal having many stationary, unequally spaced peaks in the frequency domain. As a means of reflecting this fine spectral structure in the weighting, simply increasing the order of the linear prediction does not work: the conventional technique of determining the filter coefficients by bandwidth expansion of the linear prediction coefficients does not allow fine control, and the computational precision required in the process of obtaining the prediction coefficients prevents the prediction order from being raised substantially. The object of the present invention is to provide, for time-domain acoustic signal coding, a weighting method that can be finely controlled in the frequency domain on the basis of the fine spectral structure of the acoustic signal.

[0008]

MEANS FOR SOLVING THE PROBLEMS

According to the invention of claim 1, in an acoustic signal encoding method that determines codes so that the error between the input signal and the synthesized signal is minimized, linear prediction is performed on the input signal or a past synthesized signal; filter coefficients of an autoregressive, moving-average, or autoregressive moving-average linear filter are determined from the linear prediction coefficients; the filter coefficients are converted into a frequency characteristic; that frequency characteristic is modified according to auditory characteristics; the error signal between the input signal and the synthesized signal of the codebook vector is transformed into the frequency domain; the frequency-domain error signal is weighted with the frequency characteristic modified according to the auditory characteristics; and the fixed codebook code is determined so that the weighted error signal is minimized.

[0009]

According to the invention of claim 2, in an acoustic signal encoding method that determines codes so that the error between the linear prediction residual signal of the input signal and a codebook vector is minimized, linear prediction is performed on the input signal or a past synthesized signal; filter coefficients of an autoregressive, moving-average, or autoregressive moving-average linear filter are determined from the linear prediction coefficients; the filter coefficients are converted into a frequency characteristic; that frequency characteristic is modified according to auditory characteristics; the error signal between the linear prediction residual signal of the input signal and the codebook vector is transformed into the frequency domain; the frequency-domain error signal is weighted with the frequency characteristic modified according to the auditory characteristics; and the fixed codebook code is determined so that the weighted error signal is minimized.

[0010]

In the invention of claim 3, based on claim 1 or 2, the modified frequency characteristic is obtained as follows. Linear prediction coefficients of order p are obtained, first filter coefficients of the linear filter are determined from these p-th order prediction coefficients and converted into a first frequency characteristic, and this first frequency characteristic is modified by the auditory characteristics. In addition, p'-th order linear prediction is performed on the input signal or a past synthesized signal, a linear prediction inverse filter is constructed from the p'-th order linear prediction coefficients, and n-th order (n > p) linear prediction is performed on the prediction residual signal obtained from the input signal or the past synthesized signal through that inverse filter. Second filter coefficients of the linear filter are determined from the n-th order linear prediction coefficients, converted into a second frequency characteristic, and then modified according to the auditory characteristics; this is multiplied by the modified first frequency characteristic to form the weight.

[0011]

In the invention of claim 4, p = p' in the invention of claim 3; the linear prediction inverse filter is therefore constructed using the linear prediction coefficients that were used when determining the first filter coefficients.

[0012]

DESCRIPTION OF THE PREFERRED EMBODIMENTS

FIG. 1 shows the processing procedure in an embodiment of the inventions of claims 1 and 3. In this embodiment the present invention is applied to the perceptual weighting filter 21 of the coding scheme shown in FIG. 5A. First, the input signal of the current frame from input terminal 11 is subjected to linear prediction analysis to obtain p-th order linear prediction coefficients α_i (i = 1, 2, ..., p). The coefficients obtained by linear prediction analysis means 12 in FIG. 5A can be used as these linear prediction coefficients α_i. Usually, p is set to about 10 to 20.

[0013]

Next, the input signal in the current frame is subjected to linear prediction analysis by linear prediction analysis means 41 to obtain p'-th order linear prediction coefficients α_k' (k = 1, 2, ..., p'). As mentioned above, p' may be equal to p or somewhat different; that is, it is not required that p = p'. When performing the linear prediction analysis, the window applied to the signal sequence to be analyzed may be either an asymmetric window or a symmetric window such as a Hamming window.
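A windowed linear prediction analysis of this kind can be sketched with the autocorrelation method and the Levinson-Durbin recursion, as below. This generic formulation is an assumption for illustration (the patent does not prescribe an algorithm), and the Hamming window is just one of the admissible choices mentioned above.

```python
import numpy as np

def lpc(frame: np.ndarray, order: int) -> np.ndarray:
    """Return a_1..a_order of A(z) = 1 + sum_i a_i z^-i via the
    autocorrelation method and the Levinson-Durbin recursion."""
    w = frame * np.hamming(len(frame))  # symmetric window (one option)
    r = np.correlate(w, w, mode="full")[len(w) - 1:len(w) + order]
    a = np.zeros(order)
    err = r[0]
    for i in range(order):
        k = -(r[i + 1] + a[:i] @ r[i:0:-1]) / err  # reflection coefficient
        a[:i] = a[:i] + k * a[:i][::-1]            # update lower-order taps
        a[i] = k
        err *= 1.0 - k * k                         # remaining error energy
    return a
```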

[0014]

Next, linear prediction inverse filtering is applied to the input signal by digital filter 42, whose transfer characteristic A(z) is given by equation (3) below, using the linear prediction coefficients α_k' as filter coefficients, to obtain the prediction residual signal.

A(z) = 1 + Σ_{i=1}^{p'} α_i' z^{-i}    (3)

The prediction residual signal of the input signal is then subjected to linear prediction analysis by linear prediction analysis means 43 to obtain n-th order linear prediction coefficients β_j (j = 1, 2, ..., n). Since the n-th order linear prediction is meant to express higher-order correlation that the p'-th order linear prediction cannot fully capture, n should be larger than p', preferably by a factor of about 5 to 10 or more. For example, when the encoding target is music, an analysis of 100th order or higher by linear prediction analysis means 43 may be optimal. Accordingly, the input signal used for this linear prediction spans about 5 to 10 frames. The window used in this linear prediction is chosen to weight samples more heavily the closer they are to the current time.
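The two-stage analysis described here, low-order inverse filtering followed by high-order prediction of the residual, can be sketched as follows, reusing the hypothetical `lpc` helper above; the orders shown are illustrative values from the ranges in the text.

```python
import numpy as np
from scipy.signal import lfilter

def two_stage_lpc(x: np.ndarray, p_prime: int = 16, n: int = 100):
    """Low-order LPC on x, inverse filtering per eq. (3) (filter 42),
    then n-th order LPC on the residual (analysis means 43)."""
    alpha_p = lpc(x, p_prime)             # p'-th order coefficients
    a = np.concatenate(([1.0], alpha_p))  # A(z) of eq. (3)
    residual = lfilter(a, [1.0], x)       # prediction residual signal
    beta = lpc(residual, n)               # high-order coefficients beta_j
    return alpha_p, beta, residual
```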

[0015]

Next, using the obtained linear prediction coefficients α_i and β_j, digital filters 44 and 45 are constructed whose transfer characteristics are given by equations (4) and (5) below, respectively.

F(z) = 1 / (1 + Σ_{i=1}^{p} α_i z^{-i})    (4)
G(z) = 1 / (1 + Σ_{i=1}^{n} β_i z^{-i})    (5)

The impulse responses of the digital filters of equations (4) and (5) are obtained by impulse response calculation means 46 and 47, respectively, and then transformed into the frequency domain by frequency domain transform means 48 and 49 using the FFT, DCT, MDCT, or the like, to obtain the signal sequences U_m and V_m (m = 1, 2, ..., M) of the frequency characteristics of the respective filters. The frequency-characteristic signal sequences U_m and V_m may instead be computed directly from the filter coefficients α_i and β_i.
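One way to realize this step, assuming the FFT-of-impulse-response option the text names, is sketched below; the number of bins M is a placeholder.

```python
import numpy as np
from scipy.signal import lfilter

def allpole_magnitude(coeffs: np.ndarray, m_bins: int = 256) -> np.ndarray:
    """Magnitude frequency characteristic of 1 / (1 + sum_i c_i z^-i),
    from a truncated impulse response followed by an FFT."""
    a = np.concatenate(([1.0], coeffs))
    impulse = np.zeros(2 * m_bins)
    impulse[0] = 1.0
    h = lfilter([1.0], a, impulse)          # truncated impulse response
    return np.abs(np.fft.rfft(h))[:m_bins]  # |F| or |G| sampled at M bins

# U = allpole_magnitude(alpha_p)   # U_m from filter 44
# V = allpole_magnitude(beta)      # V_m from filter 45
```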

[0016]

Next, the signal sequences U_m and V_m representing the respective frequency characteristics F(z) and G(z) are modified by modification means 50 and 51, respectively, and multiplied together by multiplication means 52 to obtain the signal sequence W_m (m = 1, 2, ..., M) representing the weight, as given by equation (6):

W_m = U_m^{-q1} · V_m^{-q2}    (0 < q1 < 1, 0 < q2 < 1)    (6)

Here q1 and q2 are determined, as when determining γ_1 and γ_2 in equation (2) of the conventional perceptual weighting filter 21, by actually listening and choosing appropriate values.
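A sketch of the weight of equation (6) follows; q1 and q2 are hypothetical placeholder values to be fixed by listening tests, as the text prescribes.

```python
import numpy as np

def weight_eq6(U: np.ndarray, V: np.ndarray,
               q1: float = 0.5, q2: float = 0.5) -> np.ndarray:
    """W_m = U_m**(-q1) * V_m**(-q2): modification means 50/51 followed
    by multiplication means 52 in FIG. 1."""
    eps = 1e-12  # guard against zero magnitudes
    return (U + eps) ** (-q1) * (V + eps) ** (-q2)
```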

[0017]

If the difference signal obtained by subtracting means 19 between the input signal and the synthesized signal of a codebook vector is transformed into the frequency domain by frequency domain transform means 53, weighting is realized by having multiplication means 54 multiply the frequency-domain difference signal by the weight W_m of equation (6), and the codebook vector with the minimum weighted error can be selected. In this embodiment, then, the perceptual weighting (FIG. 2C) obtained by modifying, on the basis of the auditory characteristics, the envelope (FIG. 2B) of the input signal's frequency characteristic (FIG. 2A) represented by the conventionally used linear prediction coefficients α_i is multiplied by the fine structure (FIG. 2D), represented by the high-order filter coefficients β_j, that remains after the envelope is removed from the input signal's frequency characteristic, likewise modified on the basis of the auditory characteristics (FIG. 2E); together these constitute the perceptual weighting (FIG. 2F). Even when encoding an input signal in which multiple pitches are mixed, weighting can thus be controlled more finely in accordance with the auditory characteristics.

[0018]

In a single linear prediction analysis, it can be difficult to obtain the prediction coefficients stably when the analysis order is high enough to express the fine structure. In the present invention, however, the signal subjected to the high-order analysis is a prediction residual signal from which the low-order correlation has been removed, so the prediction coefficients are easy to obtain stably even at high order. If stable prediction coefficients cannot be obtained, the values obtained in the previous frame may be used, or initialized values, that is, filter coefficients whose frequency characteristic is flat so that the output is exactly identical to the input. Even when the initialized values are used, as long as the low-order prediction coefficients representing the envelope have been obtained, at least the prediction gain due to that prediction is guaranteed.
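One possible realization of this fallback is sketched below; the stability test via the pole radii is an assumed implementation detail (the patent specifies the fallback values, not the test), and `lpc` is the hypothetical helper above.

```python
import numpy as np

def safe_lpc(frame: np.ndarray, order: int,
             prev: np.ndarray | None = None) -> np.ndarray:
    """High-order LPC with the described fallback: reuse the previous
    frame's coefficients, or all-zero coefficients (flat frequency
    characteristic, output identical to input), when unstable."""
    coeffs = lpc(frame, order)
    poles = np.roots(np.concatenate(([1.0], coeffs)))
    if np.all(np.abs(poles) < 1.0):  # all poles inside the unit circle
        return coeffs
    return prev if prev is not None else np.zeros(order)
```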

[0019]

Moreover, unlike the conventional method of performing perceptual weighting with a time-domain linear filter, the perceptual weighting here is performed in the frequency domain. As ways of modifying the signal sequences U_m and V_m, it therefore becomes possible to use not only the operation of equation (6) but also methods that would be complicated to realize with a linear filter, such as taking powers of moving averages, or dividing the sequence into multiple blocks and varying the operation block by block.

[0020]

It is also not necessary to apply this weighting to all codebooks; it may be applied, for example, only when selecting a codebook vector from the noise codebook. Next, FIG. 3 shows an embodiment of the inventions of claims 2 and 3. In this embodiment the present invention is applied to the perceptual weighting filter and the synthesis filter of the coding scheme shown in FIG. 5A. First, the input signal of the current frame from input terminal 11 is subjected to linear prediction analysis to obtain p-th order linear prediction coefficients α_i (i = 1, 2, ..., p). The coefficients obtained by linear prediction analysis means 12 in FIG. 5A can be used as these linear prediction coefficients α_i. These prediction coefficients are further quantized by quantization means 13 to obtain quantized prediction coefficients (i = 1, 2, ..., p). Usually, p is set to about 10 to 20.

[0021]

Next, the past synthesized signal from input terminal 61 (for example, the whole of about 5 to 10 frames immediately preceding the current frame) is subjected to linear prediction analysis by linear prediction analysis means 62 to obtain p'-th order linear prediction coefficients α_k' (k = 1, 2, ..., p'). As mentioned above, p' may be made equal to p. When performing the linear prediction analysis, the window applied to the signal sequence to be analyzed may be either an asymmetric window or a symmetric window such as a Hamming window.

[0022]

Next, linear prediction inverse filtering is applied to the past synthesized signal by digital filter 63, whose transfer characteristic is given by the same expression as equation (3) above, using the linear prediction coefficients α_k' as filter coefficients, to obtain the prediction residual signal. The prediction residual signal of the past synthesized signal thus obtained is subjected to linear prediction analysis by linear prediction analysis means 64 to obtain n-th order linear prediction coefficients β_j (j = 1, 2, ..., n). Since the n-th order linear prediction expresses higher-order correlation that the p'-th order linear prediction cannot fully capture, n is desirably larger than p'. For example, when the encoding target is music, a prediction of 100th order or higher, that is, 5 to 10 times p' or more, may be optimal.

[0023]

Next, using the obtained prediction coefficients α_i and β_j, digital filters 65 and 66 are constructed whose transfer characteristics are given by the same expressions as equations (4) and (5) above. The impulse responses of the digital filters of equations (4) and (5) are obtained by impulse response calculation means 67 and 68, respectively, and the obtained impulse responses are transformed by frequency domain transform means 69 and 70 using the FFT, DCT, MDCT, or the like, to obtain the signal sequences U_m and V_m (m = 1, 2, ..., M) of the frequency characteristics of the respective filters. The frequency-characteristic signal sequences U_m and V_m may instead be computed directly from the filter coefficients α_i and β_i.

[0024]

Next, the obtained signal sequences U_m and V_m representing the respective frequency characteristics are modified by modification means 71 and 72 and multiplied together by multiplication means 73 to obtain the signal sequence W_m (m = 1, 2, ..., M) representing the weight, as given by equation (7):

W_m = U_m^{1-q1'} · V_m^{1-q2'}    (0 < q1' < 1, 0 < q2' < 1)    (7)

If the difference signal obtained by subtracting means 75 between the prediction residual signal, obtained from the input signal by linear prediction inverse filter 74, and the codebook vector is transformed into the frequency domain by frequency domain transform means 76, weighting is realized by having multiplication means 77 multiply the frequency-domain difference signal by the signal sequence of equation (7), and the codebook vector with the minimum weighted error can be selected. The actual synthesized signal can be obtained by feeding the selected codebook vector into the time-domain synthesis filter.
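A sketch of the residual-domain criterion of equation (7) follows. Note the exponents 1 - q1' and 1 - q2', in contrast to -q1 and -q2 in equation (6), reflecting that the error is now measured on residuals rather than on signals; q1' and q2' are again hypothetical tuning constants.

```python
import numpy as np

def weighted_residual_error(residual: np.ndarray, codevec: np.ndarray,
                            U: np.ndarray, V: np.ndarray,
                            q1p: float = 0.5, q2p: float = 0.5) -> float:
    """Weighted error between the input's prediction residual (from
    inverse filter 74) and a codebook vector, per eq. (7)."""
    eps = 1e-12
    W = (U + eps) ** (1.0 - q1p) * (V + eps) ** (1.0 - q2p)  # eq. (7)
    E = np.fft.rfft(residual - codevec)[:len(W)]             # means 76
    return float(np.sum(np.abs(W * E) ** 2))                 # means 77
```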

[0025]

Thus, in this embodiment, the weighting (FIG. 4C) obtained by modifying, on the basis of the auditory characteristics, the envelope (FIG. 4B) of the input signal's frequency characteristic (FIG. 4A) represented by the linear prediction coefficients α_i is multiplied by the fine structure (FIG. 4D), represented by the high-order filter coefficients β_j, that remains after the envelope is removed from the input signal's frequency characteristic, likewise modified on the basis of the auditory characteristics (FIG. 4E); together these constitute the weighting (FIG. 4F). Even when encoding an input signal in which multiple pitches are mixed, weighting can thus be controlled more finely in accordance with the auditory characteristics.

[0026]

In FIG. 1, the signal input to the p'-th order linear prediction analysis means 41 may be the past synthesized signal immediately preceding the current frame, and in FIG. 3, the signal input to the p'-th order linear prediction analysis means 62 may be the input signal from input terminal 11. Quantized p'-th order linear prediction coefficients may be used to construct the p'-th order linear prediction inverse filters 42 and 63, quantized p-th order linear prediction coefficients may be used to construct the p-th order linear prediction filters 44 and 65, and quantized n-th order linear prediction coefficients may be used to construct the n-th order linear prediction filters 45 and 66.

[0027]

In FIGS. 1 and 3, when p-th order filters are used as the linear prediction inverse filters 42 and 63, the p-th order linear prediction coefficients obtained by linear prediction analysis means 12, or their quantized versions, are used, and the p'-th order linear prediction analysis means 41 and 62 can be omitted. Furthermore, in FIG. 1, means 41, 42, 43, 45, 47, 49, 51, and 52 may be omitted and the output of modification means 50 supplied directly to multiplication means 54. Similarly, in FIG. 3, means 62, 63, 64, 66, 68, 70, 72, and 73 may be omitted and the output of modification means 71 supplied directly to multiplication means 77.

[0028]

That is, in FIGS. 1 and 3, the frequency-domain signal modified by the auditory characteristics could be converted into a time-domain signal, and that converted time-domain signal used to weight the error signal from subtracting means 19 in FIG. 1 or the error signal from subtracting means 75 in FIG. 3. The conversion into the time domain, however, is relatively complicated; in the present invention, by transforming the error signal into the frequency domain, the perceptual weighting can be performed with comparatively simple processing.

[0029]

In FIG. 3, the difference between the output of gain multiplication means 17 and the prediction residual signal from inverse filter 74 may be transformed into the frequency domain, the difference then taken between that frequency-domain signal and a fixed vector stored in advance in noise codebook 22 as a frequency-domain signal, and the result supplied to multiplication means 77 as the frequency-domain error signal. Claim 2 differs from claim 1 in that claim 1 weights the error between the input signal and the synthesized signal of the fixed codebook vector, whereas claim 2 weights the error between the residual signal of the input signal and the fixed codebook vector. The same effect is obtained, however, in that weighting according to the frequency characteristic of the input signal and the auditory characteristics can be performed.

[0030]

EFFECTS OF THE INVENTION

As described above, according to the present invention, a low-order linear prediction analysis is performed on the synthesized signal or the input signal, a high-order linear prediction analysis is performed on its prediction residual signal, and the obtained prediction coefficients can represent the power spectrum characteristics of signals more complex than speech, such as musical tones. By independently modifying, in the frequency domain, the low-order prediction coefficients representing the envelope and the high-order prediction coefficients representing the fine structure, the invention is effective in that, in coding that determines codebook codes so as to minimize a weighted error taking auditory characteristics into account, as in the CELP scheme, the weighting according to the auditory characteristics can be controlled more finely than before.

[0031]

The encoder shown in FIG. 3 and the conventional encoder shown in FIG. 5A were designed at 16 kbit/s and at 24 kbit/s, a music signal was encoded, and the average segmental SNR (SNRseg) was measured. At 16 kbit/s the conventional method gave 11.5 dB whereas the method of this invention improved this to 12.1 dB; at 24 kbit/s the conventional method gave 13.9 dB whereas this invention improved it to 14.7 dB. From this it can be seen that the present invention is superior.
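For reference, the segmental SNR figure quoted above can be computed as follows; this is the generic definition, sketched under the assumption of non-overlapping frames, since the patent does not specify the segmentation.

```python
import numpy as np

def snr_seg(ref: np.ndarray, coded: np.ndarray, frame: int = 160) -> float:
    """Average segmental SNR in dB over non-overlapping frames."""
    vals = []
    for i in range(0, len(ref) - frame + 1, frame):
        s = ref[i:i + frame]
        e = s - coded[i:i + frame]
        num, den = float(s @ s), float(e @ e)
        if num > 0 and den > 0:  # skip silent or exactly coded frames
            vals.append(10.0 * np.log10(num / den))
    return float(np.mean(vals))
```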

BRIEF DESCRIPTION OF THE DRAWINGS

[FIG. 1] A block diagram showing the functional configuration of an encoder to which an embodiment of the inventions of claims 1 and 3 is applied.

[FIG. 2] In the embodiment of FIG. 1: A shows an example of the frequency characteristic of the input signal; B shows an example of the frequency characteristic of the linear prediction coefficients representing the envelope; C shows an example of the weight obtained from the linear prediction coefficients representing the envelope; D shows an example of the frequency characteristic of the linear prediction coefficients representing the fine structure; E shows an example of the weight obtained from the linear prediction coefficients representing the fine structure; and F shows an example of the weight obtained by multiplying the envelope-based and fine-structure-based weights shown in C and E.

[FIG. 3] A block diagram showing the functional configuration of an encoder to which an embodiment of the inventions of claims 2 and 3 is applied.

[FIG. 4] In the embodiment of FIG. 3: A shows an example of the frequency characteristic of the input signal; B shows an example of the frequency characteristic of the linear prediction coefficients representing the envelope; C shows an example of the weight obtained from the linear prediction coefficients representing the envelope; D shows an example of the frequency characteristic of the linear prediction coefficients representing the fine structure; E shows an example of the weight obtained from the linear prediction coefficients representing the fine structure; and F shows an example of the weight obtained by multiplying the envelope-based and fine-structure-based weights shown in C and E.

[FIG. 5] A is a block diagram showing the functional configuration of an encoder of the conventional CELP encoding method, and B is a block diagram showing the functional configuration of its decoder.

[FIG. 6] A block diagram showing the functional configuration of another encoder of the conventional CELP encoding method.

Claims (4)

[Claims]

[Claim 1] In an acoustic signal encoding method in which the spectral envelope of an input signal such as speech or a musical tone is obtained by linear prediction analysis, the gain-weighted sum of vectors from an adaptive codebook having a pitch period component and from a fixed codebook is used as the excitation source, and the pitch period, fixed codebook code, and gains are determined so as to minimize the error between the input signal and the signal synthesized by a synthesis filter based on the previously obtained prediction coefficients, the method comprising:

a step of transforming the error signal between the input signal and the synthesized signal of the codebook vectors into the frequency domain;

a weighting step of applying, to the transformed error signal, weighting according to auditory characteristics, the weighted error signal being used in determining said fixed codebook code; and

a weight generation step of performing linear prediction on the input signal or a past synthesized signal, constructing an autoregressive, moving-average, or autoregressive moving-average linear filter based on the obtained linear prediction coefficients, and modifying the frequency characteristic of that linear filter according to the auditory characteristics to form the weight used in said weighting.
[Claim 2] In an acoustic signal encoding method in which the spectral envelope of an input signal such as speech or a musical tone is obtained by linear prediction analysis, the gain-weighted sum of vectors from an adaptive codebook having a pitch period component and from a fixed codebook is taken as a codebook vector, and the pitch period, fixed codebook code, and gains are determined so as to minimize the error between that codebook vector and the linear prediction residual of the input signal, the method comprising:

a step of transforming the error signal between the linear prediction residual signal of the input signal and the codebook vector into the frequency domain;

a weighting step of applying, to the transformed error signal, weighting according to auditory characteristics, the weighted error signal being used in determining said fixed codebook code; and

a weight generation step of performing linear prediction on the input signal or a past synthesized signal, constructing an autoregressive, moving-average, or autoregressive moving-average linear filter based on the obtained linear prediction coefficients, and modifying the frequency characteristic of that linear filter according to the auditory characteristics to form the weight used in said weighting.
[Claim 3] The acoustic signal encoding method according to claim 1 or 2, wherein said weight generation step comprises:

a first step of obtaining p-th order prediction coefficients by said linear prediction, constructing said linear filter based on the p-th order prediction coefficients, and modifying its frequency characteristic according to the auditory characteristics;

a step of obtaining p'-th order prediction coefficients by said linear prediction, constructing a linear prediction inverse filter based on the p'-th order linear prediction coefficients, and obtaining the linear prediction residual signal of the input signal or the past synthesized signal with that linear prediction inverse filter;

a second step of performing n-th order (n > p') linear prediction on that linear prediction residual signal, constructing an autoregressive, moving-average, or autoregressive moving-average linear filter based on the obtained n-th order linear prediction coefficients, and modifying the frequency characteristic of that linear filter according to the auditory characteristics; and

a step of multiplying the frequency characteristics modified independently in said first step and said second step to form the weight used in said weighting step.
[Claim 4] The acoustic signal encoding method according to claim 3, wherein said p and p' are equal, and said linear prediction inverse filter is constructed using the p-th order prediction coefficients obtained in said first step.
JP10430897A 1997-04-22 1997-04-22 Acoustic signal coding Expired - Fee Related JP3520955B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP10430897A JP3520955B2 (en) 1997-04-22 1997-04-22 Acoustic signal coding

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP10430897A JP3520955B2 (en) 1997-04-22 1997-04-22 Acoustic signal coding

Publications (2)

Publication Number Publication Date
JPH10293599A true JPH10293599A (en) 1998-11-04
JP3520955B2 JP3520955B2 (en) 2004-04-19

Family

ID=14377303

Family Applications (1)

Application Number Title Priority Date Filing Date
JP10430897A Expired - Fee Related JP3520955B2 (en) 1997-04-22 1997-04-22 Acoustic signal coding

Country Status (1)

Country Link
JP (1) JP3520955B2 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
USRE43190E1 (en) 1999-11-08 2012-02-14 Mitsubishi Denki Kabushiki Kaisha Speech coding apparatus and speech decoding apparatus
USRE43209E1 (en) 1999-11-08 2012-02-21 Mitsubishi Denki Kabushiki Kaisha Speech coding apparatus and speech decoding apparatus


Also Published As

Publication number Publication date
JP3520955B2 (en) 2004-04-19

Similar Documents

Publication Publication Date Title
JP3235703B2 (en) Method for determining filter coefficient of digital filter
US7171355B1 (en) Method and apparatus for one-stage and two-stage noise feedback coding of speech and audio signals
US10026411B2 (en) Speech encoding utilizing independent manipulation of signal and noise spectrum
JP4005154B2 (en) Speech decoding method and apparatus
JP3680380B2 (en) Speech coding method and apparatus
KR100304682B1 (en) Fast Excitation Coding for Speech Coders
JPH09127991A (en) Voice coding method, device therefor, voice decoding method, and device therefor
JPH1091194A (en) Method of voice decoding and device therefor
JPH1097298A (en) Vector quantizing method, method and device for voice coding
JPH1097300A (en) Vector quantizing method, method and device for voice coding
JPH0869299A (en) Voice coding method, voice decoding method and voice coding/decoding method
JPH10124092A (en) Method and device for encoding speech and method and device for encoding audible signal
JP3236592B2 (en) Speech coding method for use in a digital speech coder
JP3248668B2 (en) Digital filter and acoustic encoding / decoding device
JPH0341500A (en) Low-delay low bit-rate voice coder
JP3520955B2 (en) Acoustic signal coding
JP4820954B2 (en) Harmonic noise weighting in digital speech encoders
JP2968109B2 (en) Code-excited linear prediction encoder and decoder
JP3192051B2 (en) Audio coding device
EP1334486B1 (en) System for vector quantization search for noise feedback based coding of speech
JP3274451B2 (en) Adaptive postfilter and adaptive postfiltering method
JP3071800B2 (en) Adaptive post filter
JP3479495B2 (en) Acoustic signal encoding method, device thereof, acoustic signal decoding method, device thereof and program recording medium thereof
JPH09244698A (en) Voice coding/decoding system and device
Koishida et al. CELP speech coding based on mel‐generalized cepstral analyses

Legal Events

Date Code Title Description
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20040106

RD01 Notification of change of attorney

Free format text: JAPANESE INTERMEDIATE CODE: A7426

Effective date: 20040128

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20040128

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20090213

Year of fee payment: 5


FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20100213

Year of fee payment: 6

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20110213

Year of fee payment: 7


FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20120213

Year of fee payment: 8

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20130213

Year of fee payment: 9

LAPS Cancellation because of no payment of annual fees