JPH1020894A

JPH1020894A - Speech encoding device and recording medium

Info

Publication number: JPH1020894A
Application number: JP8171484A
Authority: JP
Inventors: Riyuutarou Yamanaka (山中隆太朗)
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 1996-07-01
Filing date: 1996-07-01
Publication date: 1998-01-23
Anticipated expiration: 2016-07-01
Also published as: JP3462958B2

Abstract

PROBLEM TO BE SOLVED: To construct an excitation (sound source) code vector that is faithful to the input speech with a relatively small number of bits, in a low-bit-rate speech coding scheme of about 4 kbit/s. SOLUTION: An adaptive codebook search circuit 17 generates an adaptive code vector. A pitch-adaptive excitation codebook search circuit 19 searches for the pitch position in the adaptive code vector. Using excitation codebooks 20_1 to 20_n, which were trained with the distance relative to the pitch position as the horizontal axis, it shifts the excitation code vectors to the pitch position and searches for the vector that minimizes the mean square error of the weighted residual signal. Taking the linear sum of the n excitation code vectors, shifted to the pitch position, yields an excitation code vector adapted to the pitch position. When speech is coded at a low bit rate of about 4 kbit/s, an excitation vector faithful to the input speech can therefore be built with a small amount of computation and memory.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

TECHNICAL FIELD OF THE INVENTION: The present invention relates to a speech encoding device and to a recording medium on which a program implementing that device in software is recorded.

[0002]

PRIOR ART: Conjugate-Structure Algebraic-Code-Excited Linear-Predictive (CS-ACELP) coding, standardized by ITU-T (Draft Recommendation G.729), is known as an 8 kbit/s speech coding scheme. FIG. 6 shows a block diagram of CS-ACELP. In FIG. 6, 31 is a buffer memory that buffers the input signal; 32 is an LPC analysis circuit that performs LPC analysis of the input signal; 33 is a quantizer that quantizes the LPC coefficients; 34 is a synthesis filter that reproduces synthesized speech from the excitation signal; 35 is an adder that subtracts the synthesized speech from the input speech signal to obtain a residual signal; 36 is a weighting circuit that applies perceptual weighting to the residual signal; 37 is an adaptive codebook search circuit that searches for an adaptive code vector in an adaptive codebook 38, which stores past excitation; 39 is a fixed codebook search circuit that searches for a fixed code vector in a fixed codebook 40, which stores fixed excitation vectors such as noise excitation; 41 is a gain quantizer that predicts the gain using a gain codebook 42; and 43 is a multiplexer that multiplexes and encodes the quantized LPC coefficients, the code vectors found by each search, and the quantized gains. This scheme uses a frame length of 80 dimensions (10 ms), and the excitation codebooks 38 and 40 are searched for every subframe of 40 dimensions (5 ms). Seventeen bits are allocated to the excitation information, and these 17 bits represent the positions and signs of four excitation pulses.

[0003]

PROBLEM TO BE SOLVED BY THE INVENTION: Suppose that, to lower the bit rate of CS-ACELP further (to 4 kbit/s), the frame length is set to 160 dimensions (20 ms) and the subframe length to 80 dimensions (10 ms), with everything else configured exactly as in the 8 kbit/s case. The problem is then that only four excitation pulses can be searched and quantized per subframe, which limits how faithfully the input speech can be reproduced.

[0004] A low-bit-rate speech coding scheme of about 4 kbit/s therefore needs to reproduce the input speech faithfully at as low a bit rate as possible.

[0005] An object of the present invention is to construct an excitation code vector that is faithful to the input speech at a relatively low bit rate.

[0006]

MEANS FOR SOLVING THE PROBLEM: To achieve the above object, the present invention exploits the redundancy of the residual signal in the vicinity of a pitch pulse. Pitch-adaptive excitation code vectors are obtained by a training method that is centered on the pitch pulse and takes the relative distance from the pitch pulse into account, and a pitch-adaptive excitation codebook is created from them. The excitation codebook is thereby configured so that, when speech is coded at a low bit rate of about 4 kbit/s, an excitation code vector faithful to the input speech is constructed at as low a bit rate as possible.

[0007] In addition, dividing the pitch-adaptive excitation codebook into a plurality of codebooks requires less memory and less computation than the excitation codebook of a conventional CELP scheme. It also yields an excitation code vector that reproduces the input speech more faithfully than the excitation code vector obtained when conventional CS-ACELP is run at 4 kbit/s.

[0008]

EMBODIMENTS OF THE INVENTION: The invention according to claim 1 is a speech encoding device comprising: a buffer memory that buffers an input signal; an LPC analysis circuit that performs LPC analysis of the input signal; a quantizer that quantizes the LPC coefficients; a synthesis filter that reproduces synthesized speech from an excitation signal; a weighting circuit that applies perceptual weighting to the residual signal obtained by subtracting the synthesized speech from the input speech; an adaptive codebook search circuit that predicts the pitch period from the weighted residual signal and searches for an adaptive code vector using an adaptive codebook; a pitch-adaptive noise source that generates and quantizes a noise source adapted to the pitch; a gain quantizer and a gain codebook that perform gain prediction; and a multiplexer that multiplexes the quantized parameters. By exploiting the redundancy of the residual signal in the vicinity of the pitch pulse, the device can code speech faithfully to the input speech at a low bit rate of about 4 kbit/s.

[0009] The invention according to claim 2 is the speech encoding device according to claim 1, wherein the pitch-adaptive noise source comprises a pitch-adaptive excitation codebook, trained with the relative distance from the pitch position as the horizontal axis, and a pitch-adaptive excitation codebook search circuit. The pitch-adaptive noise source thus consists of two parts: a pitch-adaptive excitation codebook trained by fixing the pitch position and taking the relative distance of the neighboring noise pulses from the pitch position as the horizontal axis, and a pitch-adaptive excitation codebook search circuit that searches this codebook and generates a pitch-adaptive excitation code vector. This configuration makes it possible to code speech faithfully to the input speech at a low bit rate of about 4 kbit/s.

[0010] The invention according to claim 3 is the speech encoding device according to claim 2, wherein the pitch-adaptive excitation codebook search circuit comprises a pitch position detection circuit and a codebook search circuit. The codebook search is performed using the pitch position detected by the pitch position detection circuit, and an excitation code vector adapted to the pitch is generated. This makes it possible to code speech faithfully to the input speech at a low bit rate of about 4 kbit/s.

[0011] The invention according to claim 4 is the speech encoding device according to claim 3, wherein the pitch position detection circuit searches for the position at which the amplitude of the adaptive code vector is largest. This makes the search for the pitch position simple.

[0012] The invention according to claim 5 is the speech encoding device according to any one of claims 2 to 4, wherein the pitch-adaptive excitation code vector is represented by a combination of n codebooks, pulses are placed at every n-th position so that the code vectors of each codebook are orthogonal to those of the other codebooks, and the vector length of the codebooks is compressed to 1/n of the subframe length. Searching the codebooks separately allows the search to be performed with little computation, and setting the vector length to 1/n of the subframe length keeps the memory requirement low.

[0013] The invention according to claim 6 is the speech encoding device according to claim 3 or 4, wherein the codebook search circuit searches for and generates the pitch-adaptive excitation code vector that minimizes the mean square error of the weighted residual signal, that is, the residual signal obtained by subtracting the synthesized speech from the input speech and then weighting it. The pitch-adaptive excitation code vectors are shifted by the detected pitch position and the optimum vector is searched for among them, which makes it possible to code speech faithfully to the input speech at a low bit rate of about 4 kbit/s.

[0014] The invention according to claim 7 is the speech encoding device according to claim 3 or 4, wherein the pitch-adaptive excitation codebook according to claim 5 is used, each codebook is searched by the codebook search circuit according to claim 6, and the pitch-adaptive excitation code vector is obtained by taking the linear sum of the n code vectors thus obtained. This makes it possible to code speech faithfully to the input speech at a low bit rate of about 4 kbit/s with little computation and little memory.

[0015] The invention according to claim 8 is a recording medium, such as a magnetic disk, a magneto-optical disk, or a ROM cartridge, on which is recorded a program that implements the speech encoding device according to any one of claims 1 to 7 in software. By loading such a recording medium into, for example, a personal computer, the speech encoding device according to any one of claims 1 to 7 can be realized in software.

[0016] An embodiment of the present invention is described below with reference to FIGS. 1 to 4. (Embodiment 1) In FIG. 1, 11 is a buffer memory that buffers the input signal; 12 is an LPC analysis circuit that performs LPC analysis of the input signal; 13 is a quantizer that quantizes the LPC coefficients; 14 is a synthesis filter that reproduces synthesized speech from the excitation signal; 15 is an adder that subtracts the synthesized speech from the input speech signal to obtain a residual signal; 16 is a weighting circuit that applies perceptual weighting to the residual signal; 17 is an adaptive codebook search circuit that searches for an adaptive code vector in an adaptive codebook 18, which stores past excitation; and 19 is a pitch-adaptive excitation codebook search circuit that uses a pitch-adaptive excitation codebook 20 to search for the vector that minimizes the mean square error of the weighted residual signal. The search circuit 19 consists of a pitch position detection circuit 19A and a codebook search circuit 19B. 21 is a gain quantizer that predicts the gain using a gain codebook 22, and 23 is a multiplexer that multiplexes and encodes the quantized LPC coefficients, the code vectors found by each search, and the quantized gains.

[0017] The operation of this embodiment is described next. The input signal is buffered in the buffer memory 11 and divided into subframes. For each subframe of the speech signal, the LPC analysis circuit 12 computes the LPC coefficients and the quantizer 13 quantizes them. One output of the quantizer is input to the multiplexer 23 and encoded; the other is dequantized back into LPC coefficients and supplied as the coefficients of the synthesis filter 14. The synthesis filter 14 receives the sum of the adaptive code vector and the pitch-adaptive excitation vector, each scaled by the gain quantizer 21, and produces the synthesized speech. The adder 15 subtracts the synthesized speech from the input speech to obtain the residual signal, and passing this residual signal through the weighting circuit 16 gives the weighted residual signal. The adaptive codebook search circuit 17 takes the weighted residual signal as input and computes the pitch period that minimizes its mean square error. The computed pitch period is input to the multiplexer 23 and encoded. An adaptive code vector is then generated from the adaptive codebook 18 on the basis of this pitch period.
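The adaptive codebook step in paragraph [0017] can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the circuit of the patent: the perceptual weighting and synthesis filtering are omitted, the error is measured directly against a target vector, the gain is assumed optimal for each candidate lag, and the function names `adaptive_code_vector` and `search_pitch_lag` are hypothetical.

```python
def adaptive_code_vector(past_excitation, lag, subframe_length):
    """Build an adaptive code vector by repeating past excitation with period `lag`."""
    if lag <= 0 or lag > len(past_excitation):
        raise ValueError("lag must be between 1 and len(past_excitation)")
    vector = []
    for j in range(subframe_length):
        if j < lag:
            # Copy from the stored past excitation, `lag` samples back.
            vector.append(past_excitation[len(past_excitation) - lag + j])
        else:
            # Lag shorter than the subframe: repeat what was already generated.
            vector.append(vector[j - lag])
    return vector


def search_pitch_lag(target, past_excitation, candidate_lags):
    """Pick the lag whose vector, with its optimal gain, minimizes the squared error.

    Minimizing ||target - g * v||^2 over g leaves ||target||^2 - corr^2 / energy,
    so the best lag maximizes corr^2 / energy.
    """
    best_lag, best_score = None, -1.0
    for lag in candidate_lags:
        v = adaptive_code_vector(past_excitation, lag, len(target))
        energy = sum(x * x for x in v)
        if energy == 0.0:
            continue  # an all-zero vector cannot reduce the error
        corr = sum(t * x for t, x in zip(target, v))
        score = corr * corr / energy
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

For a past excitation with period 3, such as `[1, 0, 0, 1, 0, 0]`, and a target with the same shape, the search over lags 2 to 5 selects lag 3.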

[0018] The pitch-adaptive excitation codebook search circuit 19 receives the adaptive code vector and the weighted residual signal. First, the pulse position with the largest amplitude in the adaptive code vector is searched for and taken as the pitch position. Next, using the pitch-adaptive excitation codebooks 20_1 to 20_n, which were trained with the relative distance from the pitch position as the horizontal axis, the code vectors are shifted to the pitch position and searched. Within each codebook 20_1 to 20_n, the search looks for the vector that minimizes the mean square error of the weighted residual signal. The search yields n excitation code vectors and their indices; the indices are input to the multiplexer 23 and encoded. The linear sum of the n code vectors is then taken to generate the final pitch-adaptive excitation code vector. The gain quantizer 21 receives the adaptive code vector, the pitch-adaptive excitation code vector, and the weighted residual signal, and determines the gains of the adaptive code vector and the pitch-adaptive excitation code vector so that the mean square error of the weighted residual signal is minimized. The gains are quantized with the gain codebook 22 and output to the multiplexer 23. Each vector is also scaled by its gain and the two are added to generate the excitation signal. Passing the excitation signal through the synthesis filter 14 gives the synthesized speech.
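The gain step at the end of paragraph [0018] amounts to a two-variable least-squares problem. The sketch below solves it in closed form and mixes the two vectors into an excitation signal. It is an illustration only: the weighting and synthesis filters and the gain-codebook quantization are left out, and the names `optimal_gains` and `mix_excitation` are hypothetical.

```python
def dot(x, y):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(x, y))


def optimal_gains(target, adaptive_vec, pitch_adaptive_vec):
    """Gains (ga, gp) minimizing ||target - ga*adaptive_vec - gp*pitch_adaptive_vec||^2.

    Solves the 2x2 normal equations with Cramer's rule.
    """
    aa = dot(adaptive_vec, adaptive_vec)
    pp = dot(pitch_adaptive_vec, pitch_adaptive_vec)
    ap = dot(adaptive_vec, pitch_adaptive_vec)
    ta = dot(target, adaptive_vec)
    tp = dot(target, pitch_adaptive_vec)
    det = aa * pp - ap * ap
    if abs(det) < 1e-12:
        raise ValueError("vectors are (nearly) linearly dependent")
    ga = (ta * pp - tp * ap) / det
    gp = (tp * aa - ta * ap) / det
    return ga, gp


def mix_excitation(ga, adaptive_vec, gp, pitch_adaptive_vec):
    """Scale each vector by its gain and add them to form the excitation signal."""
    return [ga * a + gp * p for a, p in zip(adaptive_vec, pitch_adaptive_vec)]
```

In an actual encoder the unquantized gains would then be replaced by the nearest entry of the gain codebook before mixing; that step is not shown here.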

[0019] The pitch-adaptive excitation codebook search circuit 19 is described in detail next. FIG. 2 shows the processing procedure of the pitch-adaptive excitation codebook search circuit 19 as flowcharts. FIG. 2(a) shows the overall processing flow of the search circuit 19: step S1 is the operation of the pitch position detection circuit 19A, and steps S2 to S6 are the operation of the codebook search circuit 19B. FIG. 2(b) shows the operation of the pitch position detection circuit 19A in detail: it searches the input adaptive code vector for the position at which the amplitude is largest.
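The pitch position detection of FIG. 2(b) reduces to finding the peak of the adaptive code vector. A minimal Python sketch follows; treating "amplitude" as the absolute value of each sample and returning the first peak on ties are assumptions, and the name `detect_pitch_position` is hypothetical.

```python
def detect_pitch_position(adaptive_code_vector):
    """Return the index of the sample with the largest absolute amplitude."""
    if not adaptive_code_vector:
        raise ValueError("adaptive code vector must not be empty")
    best_index = 0
    best_amplitude = abs(adaptive_code_vector[0])
    for index, sample in enumerate(adaptive_code_vector):
        if abs(sample) > best_amplitude:
            best_index, best_amplitude = index, abs(sample)
    return best_index
```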

[0020] The accuracy of pitch position detection can be raised by other detection methods. One is to detect the pitch position that minimizes the waveform distortion between a pulse train and the adaptive code vector. Another applies the same idea in the synthesized-waveform domain and detects the pulse position that minimizes the waveform distortion between the waveform synthesized from the pulse train and the waveform synthesized from the adaptive code vector. In addition, performance in voiced segments can be improved by using the pitch period to make the resulting pitch-adaptive excitation code vector periodic at the pitch period.

[0021] FIG. 3 shows the processing procedure of the codebook search circuit 19B. Using the detected pitch position, the first codebook is shifted to the pitch position and weighted, and the code vector that minimizes the mean square error with respect to the input speech is searched for.

[0022] The second and subsequent codebooks are handled in the same way, except that their search uses the vector formed by adding each candidate to the code vectors already determined. Repeating this operation gives the optimum code vector in each codebook, and their sum is the final pitch-adaptive excitation code vector.
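The codebook-by-codebook search of paragraphs [0021] and [0022] is a greedy search over n codebooks. The Python sketch below illustrates it under simplifying assumptions: the code vectors are taken to be already expanded to the subframe length and shifted to the pitch position, the weighting and synthesis filters are omitted, the gain is taken as 1, and the names `squared_error` and `sequential_search` are hypothetical.

```python
def squared_error(target, candidate):
    """Sum of squared differences between two equal-length vectors."""
    return sum((t - c) ** 2 for t, c in zip(target, candidate))


def sequential_search(target, codebooks):
    """Search the codebooks one after another.

    Each candidate is added to the sum of the vectors already chosen, and the
    candidate giving the smallest squared error against `target` is kept.
    Returns the chosen indices and the final summed code vector.
    """
    accumulated = [0.0] * len(target)
    indices = []
    for codebook in codebooks:
        best_index, best_error, best_sum = None, float("inf"), None
        for index, vector in enumerate(codebook):
            trial = [a + v for a, v in zip(accumulated, vector)]
            error = squared_error(target, trial)
            if error < best_error:
                best_index, best_error, best_sum = index, error, trial
        indices.append(best_index)
        accumulated = best_sum
    return indices, accumulated
```

With n codebooks of 8 entries each, this evaluates 8n candidates per subframe rather than all 8^n combinations.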

[0023] FIG. 4 illustrates the above processing. It shows that the generated pitch-adaptive excitation code vector is expressed as a sum of mutually orthogonal vectors.

[0024] In the above description the code vectors are determined one at a time, starting from the first codebook. A better pitch-adaptive excitation code vector can be obtained by adding a preselection process: several of the best candidates are first picked from each codebook, and the best combination is then found by an exhaustive search among those candidates.
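The preselection variant of paragraph [0024] can be sketched as a shortlist followed by an exhaustive search over combinations. The ranking criterion used for the shortlist (squared error of each candidate on its own), the shortlist size `keep`, and the function names are assumptions made for illustration; as in a plain sketch of this kind, the code vectors are taken to be already shifted to the pitch position and no filtering or gain is applied.

```python
from itertools import product


def squared_error(target, candidate):
    """Sum of squared differences between two equal-length vectors."""
    return sum((t - c) ** 2 for t, c in zip(target, candidate))


def preselect_then_full_search(target, codebooks, keep=2):
    """Shortlist `keep` candidates per codebook, then try every combination."""
    shortlists = []
    for codebook in codebooks:
        # Rank the entries of this codebook by their individual error.
        ranked = sorted(
            range(len(codebook)),
            key=lambda i, cb=codebook: squared_error(target, cb[i]),
        )
        shortlists.append(ranked[:keep])

    best_combo, best_error = None, float("inf")
    for combo in product(*shortlists):
        total = [0.0] * len(target)
        for codebook, index in zip(codebooks, combo):
            total = [s + v for s, v in zip(total, codebook[index])]
        error = squared_error(target, total)
        if error < best_error:
            best_combo, best_error = combo, error
    return list(best_combo), best_error
```

The exhaustive stage evaluates `keep ** n` combinations, so `keep` trades search quality against computation.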

[0025] FIG. 5 illustrates how a code vector is shifted to the pitch position. Each code vector has been trained with the relative distance from the pitch position as the horizontal axis. Once the pitch position is known, the vector used in the search is therefore obtained by shifting each code vector to the pitch position and discarding any components that fall outside the subframe length.

[0026] At this point, the conversion formula below is used to shift a pulse located at position i in the memory domain to position i' at search time. With it, a code vector that is compressed to 1/n of the subframe length in memory can be restored to the subframe length at search time.

[0027]

[Equation 1] (the equation appears as an image in the original publication and is not reproduced in this text)

where i is the pulse position in the memory domain, i' is the pulse position at search time, F is the subframe length, n is the number of codebook divisions, k is the codebook number (0, 1, ..., n-1), and L is the pitch position.

[0028] FIG. 5 is an example in which the subframe length is 80 dimensions, the pitch-adaptive excitation codebook is divided into 5, and the pitch position is L = 58. In this case the code vector length in the memory domain is 80 ÷ 5 = 16. With the pitch position taken as the time origin, pulse positions (0, 1, ..., 7) are trained as the negative region of the time axis and pulse positions (8, 9, ..., 15) as the origin or the positive region. For this reason, when the code vector is converted at search time, the first term on the right-hand side of Equation (1), i - 8, subtracts 8 from the pulse position in the memory domain. The second term, k - 1, shifts each vector by its codebook number minus 1, which guarantees orthogonality. The third term shifts by the pitch position L, which provides the pitch adaptation.
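The expansion and shift described in paragraphs [0025] to [0028] can be sketched in Python. The mapping used here, `i' = n * (i - F // (2 * n)) + (k - 1) + L`, is an assumed reconstruction: the three terms follow paragraph [0028], while the factor `n` is inferred from the 1/n compression and is not confirmed by the source. The function name `expand_and_shift` is hypothetical.

```python
def expand_and_shift(stored_vector, k, n, subframe_length, pitch_position):
    """Expand a compressed code vector to the subframe length and shift it to the pitch.

    stored_vector  : code vector of length subframe_length // n held in memory
    k              : codebook number (the offset applied is k - 1, per paragraph [0028])
    n              : number of codebook divisions
    pitch_position : detected pitch position L within the subframe
    """
    if len(stored_vector) != subframe_length // n:
        raise ValueError("stored vector must have length subframe_length // n")
    half = subframe_length // (2 * n)  # 8 when F = 80 and n = 5
    expanded = [0.0] * subframe_length
    for i, amplitude in enumerate(stored_vector):
        # Assumed mapping from memory position i to search-time position i'.
        position = n * (i - half) + (k - 1) + pitch_position
        if 0 <= position < subframe_length:
            expanded[position] = amplitude  # components outside the subframe are discarded
    return expanded
```

With F = 80, n = 5, L = 58, and k = 1, stored position 8 lands exactly on the pitch position 58, and stored positions 13 to 15 fall beyond the subframe and are dropped. Vectors from different codebooks occupy different offsets modulo n, so their inner product is zero.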

[0029]

EFFECT OF THE INVENTION: As described above, the present invention exploits the redundancy of the residual signal in the vicinity of the pitch pulse. For example, with a subframe length of 80 dimensions, the excitation codebook divided into 5, and 3 bits (8 entries) allocated to each excitation codebook, an excitation code vector adapted to the pitch can be represented with 15 bits per subframe. When speech is coded at a low bit rate of about 4 kbit/s, an excitation vector faithful to the input speech can thus be constructed with little computation and little memory.
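The bit allocation in paragraph [0029] can be checked with a few lines of arithmetic. The sketch below uses the figures from the text (5 codebooks, 3 bits each, 80-dimension subframes of 10 ms); the function name `pitch_adaptive_budget` and the derived quantities it reports are illustrative.

```python
def pitch_adaptive_budget(n_codebooks, bits_per_codebook, subframe_length, subframe_ms):
    """Summarize the bit and memory budget of a split pitch-adaptive codebook."""
    bits_per_subframe = n_codebooks * bits_per_codebook
    entries_per_codebook = 2 ** bits_per_codebook
    stored_vector_length = subframe_length // n_codebooks  # vectors compressed to 1/n
    return {
        "bits_per_subframe": bits_per_subframe,
        # Excitation-index bit rate in bits per second.
        "bit_rate": bits_per_subframe * 1000 / subframe_ms,
        # Number of distinct summed code vectors the indices can select.
        "combinations": entries_per_codebook ** n_codebooks,
        # Amplitude values held in memory across all codebooks.
        "stored_values": n_codebooks * entries_per_codebook * stored_vector_length,
        # Candidates evaluated by a codebook-by-codebook search.
        "sequential_evaluations": n_codebooks * entries_per_codebook,
    }


budget = pitch_adaptive_budget(5, 3, 80, 10)
print(budget)
```

For these figures the excitation indices take 15 bits per 10 ms subframe, that is 1.5 kbit/s, compared with 1.7 kbit/s if the 17-bit CS-ACELP pulse code were sent once per 10 ms subframe.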

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram showing the configuration of a speech encoding device according to an embodiment of the present invention.

FIG. 2(a) is a flowchart showing the search procedure of the pitch-adaptive excitation codebook search circuit in the device. FIG. 2(b) is a flowchart showing the processing procedure of the pitch position detection circuit in the pitch-adaptive excitation codebook search circuit.

FIG. 3 is a flowchart showing the processing procedure of the codebook search circuit in the pitch-adaptive excitation codebook search circuit.

FIG. 4 is a diagram listing the adaptive excitation code vectors represented as sums of code vectors.

FIG. 5 is a transition diagram showing how a pitch-adaptive excitation code vector is shifted to the pitch position L.

FIG. 6 is a block diagram showing the configuration of a conventional CS-ACELP speech encoding device.

EXPLANATION OF REFERENCE NUMERALS

11 buffer memory
12 LPC analysis circuit
13 quantizer
14 synthesis filter
15 adder
16 weighting circuit
17 adaptive codebook search circuit
18 adaptive codebook
19 pitch-adaptive excitation codebook search circuit
19A pitch position detection circuit
19B codebook search circuit
20_1 to 20_n pitch-adaptive excitation codebooks
21 gain quantizer
22 gain codebook
23 multiplexer

Claims

1. A speech encoding device comprising: a buffer memory that buffers an input signal; an LPC analysis circuit that performs LPC analysis of the input signal; a quantizer that quantizes the LPC coefficients; a synthesis filter that reproduces synthesized speech from an excitation signal; a weighting circuit that applies perceptual weighting to the residual signal obtained by subtracting the synthesized speech from the input speech; an adaptive codebook search circuit that predicts the pitch period from the weighted residual signal and searches for an adaptive code vector using an adaptive codebook; a pitch-adaptive noise source that generates and quantizes a noise source adapted to the pitch; a gain quantizer and a gain codebook that perform gain prediction; and a multiplexer that multiplexes the quantized parameters.

2. The speech encoding device according to claim 1, wherein the pitch-adaptive noise source comprises a pitch-adaptive excitation codebook, trained with the relative distance from the pitch position as the horizontal axis, and a pitch-adaptive excitation codebook search circuit.

3. The speech encoding device according to claim 2, wherein the pitch-adaptive excitation codebook search circuit comprises a pitch position detection circuit and a codebook search circuit.

4. The speech encoding device according to claim 3, wherein the pitch position detection circuit searches for the position at which the amplitude of the adaptive code vector is largest.

5. The speech encoding device according to any one of claims 2 to 4, wherein the pitch-adaptive excitation code vector is represented by a combination of n codebooks, pulses are placed at every n-th position so that the code vectors of each codebook are orthogonal to those of the other codebooks, and the codebooks have a vector length compressed to 1/n of the subframe length.

6. The speech encoding device according to claim 3 or 4, wherein the codebook search circuit searches for and generates the pitch-adaptive excitation code vector that minimizes the mean square error of the weighted residual signal obtained by weighting the residual signal given by subtracting the synthesized speech from the input speech.

7. The speech encoding device according to claim 3 or 4, wherein the pitch-adaptive excitation codebook according to claim 5 is used, each codebook is searched by the codebook search circuit according to claim 6, and the pitch-adaptive excitation code vector is obtained by taking the linear sum of the n code vectors thus obtained.

8. A recording medium on which is recorded a program that implements the speech encoding device according to any one of claims 1 to 7 in software.