JP3132031B2

JP3132031B2 - High-efficiency coding of digital signals.

Info

Publication number: JP3132031B2
Application number: JP03091190A
Authority: JP
Inventors: Kenzo Akagiri (赤桐 健三)
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 1991-03-29
Filing date: 1991-03-29
Publication date: 2001-02-05
Anticipated expiration: 2016-02-05
Also published as: JPH04302537A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of Industrial Application] The present invention relates to a high-efficiency coding method for digital signals that compresses and codes an input digital audio signal.

[0002]

[Prior Art] In high-efficiency coding of signals such as audio and speech, there is a coding technique based on bit allocation, in which an input signal such as audio or speech is divided into a plurality of channels on the time axis or the frequency axis and the number of bits for each channel is allocated adaptively. Examples of such bit-allocation-based coding techniques for audio signals and the like include: sub-band coding (SBC), in which an audio signal or the like on the time axis is divided into a plurality of frequency bands and coded; so-called adaptive transform coding (ATC), in which a time-axis signal is converted into a signal on the frequency axis (orthogonal transform), divided into a plurality of frequency bands, and coded adaptively band by band; and so-called adaptive bit allocation (APC-AB), which combines SBC with so-called adaptive predictive coding (APC), dividing the time-axis signal into bands, converting each band signal to baseband (low band), and then performing predictive coding using linear prediction analysis of multiple orders.

[0003] That is, in such high-efficiency coding, an audio signal or the like on the time axis is, for example, split in frequency by a filter array made up of a plurality of band-pass filters or the like, and the resulting band signals are coded with adaptive bit allocation. Alternatively, an audio signal or the like on the time axis is converted, for each predetermined unit time (orthogonal transform block), onto an axis orthogonal to the time axis (the frequency axis) by an orthogonal transform such as a fast Fourier transform (FFT), then divided into a plurality of bands, and the FFT coefficient data of each band is coded with adaptive bit allocation; or the signal is first split in frequency by a filter array, then orthogonally transformed into FFT coefficient data, and then coded with adaptive bit allocation. The coded data is then transmitted.

[0004] Furthermore, when the FFT coefficient data of each band is coded with this adaptive bit allocation, the FFT coefficient data on the frequency axis is often grouped into blocks (floating blocks), and so-called floating processing is applied to each floating block to compress the bits further. In this case, what is transmitted to the later decoding stage is the FFT coefficient data floated for each floating block, together with sub-information consisting of the floating coefficient of each floating block and word-length information corresponding to the number of allocated bits.
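The floating processing mentioned above can be illustrated with a minimal sketch. It assumes integer coefficients, a power-of-two shared scale (the "floating coefficient" is modelled as a right-shift count), and signed mantissas of `word_length` bits. The names `float_block` and `unfloat_block` are hypothetical and are not taken from the patent, which does not specify the arithmetic.

```python
def float_block(coeffs, word_length):
    """Block floating (illustrative): one shared shift per block, word_length-bit mantissas.

    Returns (shift, mantissas). `shift` plays the role of the floating
    coefficient and `word_length` that of the word-length information.
    """
    peak = max((abs(c) for c in coeffs), default=0)
    needed = peak.bit_length() + 1           # magnitude bits plus one sign bit
    shift = max(0, needed - word_length)     # how far the block must be scaled down
    mantissas = [c >> shift for c in coeffs] # arithmetic shift (rounds toward -inf)
    return shift, mantissas


def unfloat_block(shift, mantissas):
    """Decoder side: undo the shared scaling (the low bits dropped by the shift are lost)."""
    return [m << shift for m in mantissas]
```

With this model the decoder needs exactly the items the paragraph lists: the floated coefficients plus the per-block floating coefficient and word length.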

[0005]

[Problem to Be Solved by the Invention] In such high-efficiency coding, still higher compression efficiency is desired. In the coding described above, however, a fixed number of bits is consumed within each orthogonal transform block or floating block, for example, so the transmission bit rate per block could not be lowered and the compression efficiency could not be raised.

[0006] The present invention has been proposed in view of these circumstances, and its object is to provide a high-efficiency coding method for digital signals that achieves higher bit compression.

[0007]

[Means for Solving the Problem] The high-efficiency coding method for digital signals according to the present invention has been proposed to achieve the above object. In a high-efficiency coding method in which an input digital audio signal is converted into a signal on the frequency axis and divided into blocks, and the digital signal of each block is coded with adaptively allocated bits and transmitted, the deviation between the coding-related data of an arbitrary representative block among the blocks and the coding-related data of another block is obtained, and this deviation calculation output is transmitted. Here, the representative block and the other block are blocks adjacent in time. The blocks are orthogonal transform blocks, obtained by converting the input digital audio signal into a signal on the frequency axis every predetermined number of samples, or floating blocks formed after that conversion, and the representative block and the other block are blocks adjacent in frequency. Furthermore, when the data amount of the deviation calculation output is smaller than the data amount of the coding-related data, the deviation calculation output is transmitted in place of the coding-related data. The coded data before replacement by the deviation calculation output has a constant bit rate, and the coded data after replacement has a variable bit rate.

That is, in the method of the present invention, when the input digital audio signal is a succession of stationary digital audio signals, such as a sine wave, or a succession of quasi-stationary digital audio signals approximating such a stationary signal, the deviation (difference) between an arbitrary representative block and another block is obtained, and the deviation data is transmitted. Whether the input digital audio signal is stationary or quasi-stationary is judged by comparing the information amount of the deviation data with the information amount of the coding-related data of each block. When the deviation data carries less information than the per-block data, the input digital audio signal is judged to be stationary or quasi-stationary and the deviation data is transmitted. Conversely, when the deviation data carries more information, the data of each block is transmitted. The deviation data obtained between blocks may be the difference of the coefficient data (spectral components) produced by the orthogonal transform of each block; the difference of the floating coefficients and word-length information (the sub-information difference) when so-called floating processing is applied to each block; or both. Moreover, when floating processing is performed, the deviation of the orthogonal-transform coefficient data may be taken on the data either before or after the floating processing.

[0008]

[Operation] According to the present invention, the deviation, that is, the difference, between the coding-related data of an arbitrary representative block and the coding-related data of another block is transmitted, so the transmission bit rate is lowered. In addition, the deviation calculation output is transmitted in place of the coding-related data when its data amount is the smaller of the two, which reduces the amount of transmitted data further.

[0009]

[Embodiment] Embodiments to which the present invention is applied are described below with reference to the drawings. A high-efficiency coding apparatus for digital signals according to one embodiment, to which the coding method of the present invention is applied, divides an input digital signal such as audio or speech into bands, for example by the sub-band coding (SBC) described above, orthogonally transforms the band signals into signals on the frequency axis, and then codes them.

[0010] That is, in the high-efficiency coding apparatus of this embodiment, as shown in FIG. 1, the input digital signal supplied through input terminal 1 is divided by QMFs (quadrature mirror filters) 41 and 42, which serve as band-splitting filters, into a plurality of bands (for example, roughly three bands) whose bandwidth widens toward higher frequencies, in consideration of division into so-called critical bands. For each of the divided bands, blocks of a plurality of samples (orthogonal transform blocks) are formed, and each orthogonal transform block is orthogonally transformed (from the time axis to the frequency axis) by fast Fourier transform (FFT) circuits 43, 44, and 45 to obtain coefficient data (FFT coefficient data). The FFT coefficient data of each of the three bands is then coded with an adaptively allocated number of bits based on the allowable noise level obtained by the allowable noise level calculation circuit 60, described later. This coding is performed by encoding circuit 50, and the coded data is output from output terminal 2 via deviation calculation circuit 70.

[0011] Here, the apparatus of this embodiment has the deviation calculation circuit 70, which obtains the deviation between the coding-related data of an arbitrary representative block (representative orthogonal transform block) among the orthogonal transform blocks and the coding-related data of another orthogonal transform block, and the output of this deviation calculation circuit 70 is transmitted. That is, when the input digital signal is, for example, a succession of stationary digital signals such as a sine wave, or a succession of quasi-stationary digital signals approximating a stationary signal, the deviation between an arbitrary representative orthogonal transform block and another orthogonal transform block is obtained and the deviation data is transmitted. For example, when quasi-stationary digital signals are supplied continuously as shown in FIG. 2, the frequency components (amplitude values, described later) Sa obtained by orthogonally transforming the data of an arbitrary representative orthogonal transform block Ba and the frequency components (amplitude values) Sb obtained by orthogonally transforming the data of another orthogonal transform block Bb are components at substantially the same frequencies. The deviation calculation circuit 70 therefore takes their difference to obtain a difference value Sd as shown in FIG. 3, and transmits the difference value Sd. Note that FIG. 3 depicts the digital signal as an analog signal.

[0012] When the input digital signal is quasi-stationary in this way, the difference operation described above is repeated continuously. In this repetition, the difference between the representative orthogonal transform block Ba and the next block Bb is taken first, then the difference between block Ba and block Bc (not shown), which follows block Bb, then the difference between block Ba and block Bd (not shown), which follows block Bc, and so on. As another way of repeating the difference operation, the representative orthogonal transform block need not be fixed to block Ba as above but may instead be updated successively. For example, the difference between block Bb and the representative block Ba is taken; the representative block is then updated from Ba to Bb and the difference between block Bc and block Bb is taken; the representative block is then updated from Bb to Bc and the difference between block Bd and block Bc is taken; and so on.
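The two ways of repeating the difference operation can be sketched as follows. This is only an illustration under stated assumptions: each block is modelled as a list of integer amplitude values of equal length, the difference is taken as "other block minus representative block" (the sign convention is not stated in the text), and all function names are hypothetical.

```python
from typing import List, Sequence

Block = Sequence[int]  # assumed: quantised amplitude values of one block


def diff(reference: Block, other: Block) -> List[int]:
    """Element-wise difference other - reference for two equal-length blocks."""
    if len(reference) != len(other):
        raise ValueError("blocks must have the same length")
    return [o - r for r, o in zip(reference, other)]


def diffs_fixed_representative(blocks: Sequence[Block]) -> List[List[int]]:
    """First method: every later block is differenced against the first block (Ba)."""
    representative = blocks[0]
    return [diff(representative, blk) for blk in blocks[1:]]


def diffs_updated_representative(blocks: Sequence[Block]) -> List[List[int]]:
    """Second method: the representative is updated, so each block is
    differenced against its immediate predecessor (Bb-Ba, Bc-Bb, Bd-Bc)."""
    return [diff(prev, cur) for prev, cur in zip(blocks, blocks[1:])]
```

For a slowly drifting signal the second method tends to keep the differences small, whereas the first avoids accumulating errors because every difference refers back to the same block; the text leaves the choice open.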

[0013] The input digital signal does not always consist only of stationary or quasi-stationary signals; in many cases, as shown for example in FIG. 4, non-stationary signals occur in succession. In that case, the apparatus decides whether to transmit the difference data or to send the data of each block as is, and transmits whichever yields the smaller amount of transmitted information. For example, for a non-stationary signal such as that in FIG. 4, the frequency components obtained from the representative orthogonal transform block Ba and from the next block Bb differ, so taking their difference may not reduce the amount of information. The deviation calculation circuit 70 therefore compares the amount of information needed to transmit the difference data between block Ba and block Bb with the amount needed to transmit the data of blocks Ba and Bb themselves, and transmits whichever is smaller. In this case, however, mode information indicating whether difference data or unmodified per-block data was sent must be transmitted at the same time. FIG. 4 also depicts the digital signal as an analog signal for convenience.
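The selection between difference data and unmodified block data can be sketched as below. The text does not say how the "amount of information" is measured, so this sketch assumes a simple cost model: every value in a block is sent with one shared word length (sign bit plus the bits needed for the largest magnitude). The cost model, the mode values 0 and 1, and the function names are all assumptions made for illustration.

```python
def word_length(values):
    """Assumed cost model: bits per value = sign bit + bits of the largest magnitude."""
    peak = max((abs(v) for v in values), default=0)
    return peak.bit_length() + 1


def block_bits(values):
    """Total bits for a block when all values share one word length."""
    return len(values) * word_length(values)


def select_mode(representative, other):
    """Return (mode, payload): mode 1 sends the difference, mode 0 the block itself."""
    delta = [o - r for r, o in zip(representative, other)]
    if block_bits(delta) < block_bits(other):
        return 1, delta
    return 0, list(other)


def reconstruct(representative, mode, payload):
    """Decoder side: the mode information tells how to interpret the payload."""
    if mode == 1:
        return [r + d for r, d in zip(representative, payload)]
    return list(payload)
```

The one-bit mode flag adds the same cost to both alternatives, so it is left out of the comparison here; a real bitstream would still have to carry it, as the paragraph notes.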

[0014] Furthermore, as described above, the deviation data that the deviation calculation circuit 70 obtains between blocks may be the difference of the FFT coefficient data (spectral components) between orthogonal transform blocks; the difference of the floating coefficients and word-length information (the sub-information difference) when, for example, so-called floating processing is performed; or both. The word-length information corresponds to the number of bits allocated in coding. When the difference of the floating coefficients and word-length information obtained by floating processing is transmitted, the difference is likewise taken between consecutive blocks (floating blocks). When both differences are transmitted, the difference of the FFT coefficient data may be computed either before or after the floating processing.

[0015] As described above, because this embodiment transmits the difference data, data that had a constant bit rate for every block (fixed bit rate) is transmitted at a bit rate that differs from block to block (that is, a variable bit rate), which makes it possible to raise the data compression ratio.

[0016] FIG. 5 shows a specific configuration of the deviation calculation circuit 70. In FIG. 5, coded data from the encoding circuit 50 is supplied to terminal 71. As described above, this coded data consists of FFT coefficient data in units of orthogonal transform blocks and floating coefficients and word-length data in units of floating blocks. Of this data, the data of the N-th block, for example, is stored in data storage means 72, such as a memory, as the representative block (orthogonal transform block or floating block), and the data of the (N+1)-th block following it (the other block) is stored in data storage means 73. The outputs of storage means 72 and 73 are both sent to subtraction means 74, where the difference value described above is obtained. This difference value is sent to comparison and selection means 75. The per-block data from storage means 72 and 73 is also supplied to the comparison and selection means 75, which compares the amount of information in the block difference value from subtraction means 74 with the amount of information in each block, and selects and outputs whichever requires less transmitted information. The mode information is also output. This output is delivered from terminal 76.

[0017] Returning to FIG. 1, input terminal 1 is supplied with a digital signal (0 to 20 kHz) obtained by sampling an analog audio signal or the like (for example, 1024 samples). QMFs 41 and 42 divide this digital signal roughly into three bands (0 to 5 kHz, 5 kHz to 10 kHz, and 10 kHz to 20 kHz) whose bandwidth widens toward higher frequencies. QMF 41 splits the 0 to 20 kHz digital signal in two, giving a 10 kHz to 20 kHz output and a 0 to 10 kHz output; the 10 kHz to 20 kHz output is sent to fast Fourier transform circuit 43, and the 0 to 10 kHz output is sent to QMF 42. QMF 42 splits the 0 to 10 kHz signal in two again, giving a 5 kHz to 10 kHz output and a 0 to 5 kHz output. The 5 kHz to 10 kHz output is sent to fast Fourier transform circuit 44, and the 0 to 5 kHz output is sent to fast Fourier transform circuit 45.
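The two-stage split performed by QMFs 41 and 42 can be sketched with the simplest possible quadrature mirror pair, the 2-tap Haar filters. This choice is an assumption made only to keep the example short: the patent does not give the filter taps, and a practical coder would use much longer filters with far better band separation. The function names are hypothetical.

```python
import math

_S = 1.0 / math.sqrt(2.0)  # normalisation of the 2-tap (Haar) QMF pair


def qmf_split(x):
    """Split x into (low, high) half-band signals, each decimated by 2."""
    if len(x) % 2:
        raise ValueError("an even number of samples is required")
    low = [(x[i] + x[i + 1]) * _S for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) * _S for i in range(0, len(x), 2)]
    return low, high


def qmf_merge(low, high):
    """Inverse of qmf_split: interleave the reconstructed sample pairs."""
    x = []
    for l, h in zip(low, high):
        x.append((l + h) * _S)
        x.append((l - h) * _S)
    return x


def three_band_split(x):
    """Tree of FIG. 1: the first split plays the role of QMF 41, the second of QMF 42."""
    low_0_10, high_10_20 = qmf_split(x)       # 0-10 kHz and 10-20 kHz
    low_0_5, mid_5_10 = qmf_split(low_0_10)   # 0-5 kHz and 5-10 kHz
    return low_0_5, mid_5_10, high_10_20
```

Because each split halves the sample rate, the high band keeps half of the input samples and the middle and low bands a quarter each, which mirrors the unequal bandwidths of the three bands.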

[0018] Here, the orthogonal transform blocks of the three bands handled by fast Fourier transform circuits 43, 44, and 45 have different block lengths. For example, as shown in FIG. 6, fast Fourier transform circuit 43, for the 10 kHz to 20 kHz high band, uses orthogonal transform block lengths b_H1, b_H2, b_H3, b_H4 of, for example, 5 msec each; fast Fourier transform circuit 44, for the 5 kHz to 10 kHz middle band, uses orthogonal transform block lengths b_M1, b_M2 of, for example, 10 msec each; and fast Fourier transform circuit 45, for the 0 to 5 kHz low band, uses an orthogonal transform block length b_L of, for example, 20 msec.

[0019] The orthogonal transform block lengths of the high and middle bands are made shorter than that of the low band, and the low-band block length is made long, for the following reasons. The frequency analysis capability (frequency resolution) of human hearing is generally not very high in the high band but is high in the low band; to secure frequency resolution in the low band, the orthogonal transform block length there cannot in practice be made very short. In addition, stationary intervals are generally long in low-band signals and short in high-band signals, so shortening the orthogonal transform block length (raising the time resolution) in the high band (and the middle band) is effective. For these reasons, in this embodiment, for signals other than stationary ones, the orthogonal transform block lengths of the high and middle bands are made shorter than that of the low band, and the low-band block length is made long.

[0020] In this way, this embodiment is configured to satisfy at the same time the frequency-axis resolution and the time-axis resolution that hearing requires: in the low band (0 to 5 kHz) the number of samples processed is increased to raise the frequency resolution, and in the high band (10 kHz to 20 kHz) the time resolution is raised. The time resolution is also raised in the middle band (5 kHz to 10 kHz).

[0021] The orthogonal transform is not limited to the fast Fourier transform described above; a discrete cosine transform (DCT), an MDCT, or the like may also be applied.

[0022] The outputs of fast Fourier transform circuits 43, 44, and 45 are sent to encoding circuit 50. Because encoding circuit 50 of this embodiment codes the FFT coefficient data of the three bands with an adaptively allocated number of bits based on human auditory characteristics, the FFT coefficient data is mapped onto the critical bands (for example, 25 bands). Accordingly, the output of fast Fourier transform circuit 43 corresponds to, for example, two critical bands in the high range, the output of fast Fourier transform circuit 44 to, for example, three critical bands in the middle range, and the output of fast Fourier transform circuit 45 to, for example, 20 critical bands in the low range. The critical bandwidths take human auditory characteristics (frequency analysis capability) into account: for example, 0 to 20 kHz is divided into 25 bands, with wider bandwidths chosen for higher frequency bands. In other words, human hearing behaves like a bank of band-pass filters, and the bands separated by these filters are called critical bands.

[0023] Coding in encoding circuit 50 is performed adaptively, with the number of allocated bits based on the allowable noise level that the allowable noise level calculation circuit 60 determines for each critical band.

[0024] FIG. 7 shows a specific configuration of the allowable noise level calculation circuit 60 of the high-efficiency coding apparatus for digital signals of this embodiment.

[0025] In FIG. 7, input terminal 61 is supplied, from fast Fourier transform circuits 43, 44, and 45, with only the amplitude information Am of the FFT coefficient data for each critical band. Human hearing is generally sensitive to amplitude (power) in the frequency domain but quite insensitive to phase, so this example calculates the allowable noise level from the amplitude information Am alone.

[0026] The amplitude information Am for each critical band is sent to sum detection circuit 14. Sum detection circuit 14 obtains the energy of each band (the spectral intensity in each band) by taking the sum of the amplitude information Am within that band (the peak, the average, or the energy sum of the amplitude information Am). The output of sum detection circuit 14, that is, the spectrum of per-band sums, is generally called the Bark spectrum, and the Bark spectrum SB of each band is, for example, as shown in FIG. 8. To simplify the illustration, FIG. 8 represents the critical bands with only 12 bands (B₁ to B₁₂).
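The per-band summation performed by sum detection circuit 14 can be sketched as follows. The band boundaries are passed in as bin indices because the text does not list the critical-band edges; the `mode` argument covers the alternatives the paragraph names (sum, peak, average, energy sum). The function name and argument layout are hypothetical.

```python
def bark_spectrum(amplitudes, band_edges, mode="sum"):
    """Collapse amplitude values into one value per band.

    band_edges = [e0, e1, ..., eK] are bin indices; band k covers
    amplitudes[e_k : e_(k+1)]. The edges are an input here because the
    source does not list the critical-band boundaries.
    """
    spectrum = []
    for lo, hi in zip(band_edges, band_edges[1:]):
        band = amplitudes[lo:hi]
        if not band:
            raise ValueError("empty band %d..%d" % (lo, hi))
        if mode == "sum":
            spectrum.append(sum(band))
        elif mode == "peak":
            spectrum.append(max(band))
        elif mode == "mean":
            spectrum.append(sum(band) / len(band))
        elif mode == "energy":
            spectrum.append(sum(a * a for a in band))
        else:
            raise ValueError("unknown mode: " + mode)
    return spectrum
```

For the 25-band case described earlier, `band_edges` would hold 26 indices, with wider bands toward higher frequencies.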

【００２７】ここで、上記バークスペクトルＳＢのいわ
ゆるマスキングに於ける影響を考慮するため、該バーク
スペクトルＳＢに所定の重みづけの関数を畳込む（コン
ボリューション）。このため、上記総和検出回路１４の
出力すなわち該バークスペクトルＳＢの各値は、フィル
タ回路１５に送られる。該フィルタ回路１５は、例え
ば、入力データを順次遅延させる複数の遅延素子と、こ
れら遅延素子からの出力にフィルタ係数（重みづけの関
数）を乗算する複数の乗算器（例えば各帯域に対応する
２５個の乗算器）と、各乗算器出力の総和をとる総和加
算器とから構成されるものである。このフィルタ回路１
５の各乗算器において、例えば、任意の帯域に対応する
乗算器Ｍでフィルタ係数１を、乗算器Ｍ−１でフィルタ
係数０．１５を、乗算器Ｍ−２でフィルタ係数０．００
１９を、乗算器Ｍ−３でフィルタ係数０．０００００８
６を、乗算器Ｍ＋１でフィルタ係数０．４を、乗算器Ｍ
＋２でフィルタ係数０．０６を、乗算器Ｍ＋３でフィル
タ係数０．００７を各遅延素子の出力に乗算することに
より、上記バークスペクトルＳＢの畳込み処理が行われ
る。ただし、Ｍは１〜２５の任意の整数である。この畳
込み処理により、図８中点線で示す部分の総和がとられ
る。なお、上記マスキングとは、人間の聴覚上の特性に
より、ある信号によって他の信号がマスクされて聞こえ
なくなる現象をいうものであり、このマスキング効果に
は、時間軸上のオーディオ信号に対するマスキング効果
と周波数軸上の信号に対するマスキング効果とがある。
すなわち、該マスキング効果により、マスキングされる
部分にノイズがあったとしても、このノイズは聞こえな
いことになる。このため、実際のオーディオ信号では、
このマスキングされる部分内のノイズは許容可能なノイ
ズとされる。Here, to take into account the effect of the Bark spectrum SB on so-called masking, a predetermined weighting function is convolved with the Bark spectrum SB. For this purpose, the output of the sum detection circuit 14, that is, each value of the Bark spectrum SB, is sent to the filter circuit 15. The filter circuit 15 comprises, for example, a plurality of delay elements that sequentially delay the input data, a plurality of multipliers (for example, 25 multipliers, one for each band) that multiply the outputs of these delay elements by filter coefficients (the weighting function), and a summing adder that sums the multiplier outputs. In the filter circuit 15, the convolution of the Bark spectrum SB is performed by multiplying the outputs of the delay elements by, for example, filter coefficient 1 at the multiplier M corresponding to an arbitrary band, 0.15 at multiplier M-1, 0.0019 at multiplier M-2, 0.0000086 at multiplier M-3, 0.4 at multiplier M+1, 0.06 at multiplier M+2, and 0.007 at multiplier M+3. Here, M is an arbitrary integer from 1 to 25. This convolution yields the sum of the portions indicated by the dotted lines in FIG. 8. Masking refers to the phenomenon in which, owing to the characteristics of human hearing, one signal masks another and makes it inaudible; there is a masking effect for audio signals on the time axis and a masking effect for signals on the frequency axis. Because of this masking effect, any noise in the masked portion is inaudible. In an actual audio signal, therefore, noise within the masked portion is regarded as allowable noise.
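A minimal sketch of this convolution, using the seven coefficients quoted above. It assumes that multiplier M+k weights the value of band M+k when forming the output for band M, and that bands outside the available range contribute nothing; the text does not settle either point, and the opposite index mapping is equally plausible. The names `COEFFS` and `convolve_bark` are illustrative.

```python
# Filter coefficients quoted in the text, keyed by band offset relative to M.
COEFFS = {-3: 0.0000086, -2: 0.0019, -1: 0.15, 0: 1.0, 1: 0.4, 2: 0.06, 3: 0.007}


def convolve_bark(sb):
    """Convolve a Bark spectrum with the weighting function.

    Assumption: out[M] = sum over k of COEFFS[k] * sb[M + k],
    skipping neighbours that fall outside the list.
    """
    out = []
    for m in range(len(sb)):
        total = 0.0
        for offset, coeff in COEFFS.items():
            k = m + offset
            if 0 <= k < len(sb):
                total += coeff * sb[k]
        out.append(total)
    return out


# A single active band shows how its energy is spread over its neighbours.
spread = convolve_bark([0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0])
```

Feeding in a single non-zero band, as in the last line, makes the shape of the weighting function directly visible in the output.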

【００２８】ここで、上記マスキングとは、人間の聴覚
特性に関するものである。すなわち、一般に音に対する
人間の聴覚特性には、マスキング効果と呼ばれるものが
あり、当該マスキング効果には、テンポラルマスキング
効果と同時刻マスキング効果等がある。上記同時刻マス
キング効果とは、ある大きな音と同時刻に発生する小さ
な音（或いはノイズ）が当該大きな音によってマスクさ
れて聞こえなくなるような効果であり、上記テンポラル
マスキング効果とは、大きな音の時間的な前後の小さな
音（ノイズ）が、この大きな音にマスクされて聞こえな
くなるような効果である。このテンポラルマスキング効
果において、上記大きな音の時間的に後方のマスキング
はフォワードマスキングと呼ばれ、また、時間的に前方
のマスキングはバックワードマスキングと呼ばれてい
る。また、テンポラルマスキングにおいては、人間の聴
覚特性から、フォワードマスキングの効果は長時間（例
えば100ｍｓｅｃ程度）効くようになっているのに対
し、バックワードマスキングの効果の持続時間は短時間
（例えば５ｍｓｅｃ程度）となっている。更に、上記マ
スキング効果のレベル（マスキング量）は、フォワード
マスキングが２０ｄＢ程度で、バックワードマスキング
が３０ｄＢ程度となっている。Here, masking relates to the characteristics of human hearing. In general, human hearing exhibits what is called the masking effect, which includes the temporal masking effect and the simultaneous masking effect. The simultaneous masking effect is one in which a quiet sound (or noise) occurring at the same time as a loud sound is masked by the loud sound and becomes inaudible. The temporal masking effect is one in which quiet sounds (noise) shortly before and after a loud sound are masked by the loud sound and become inaudible. In temporal masking, masking that follows the loud sound in time is called forward masking, and masking that precedes it is called backward masking. Owing to the characteristics of human hearing, forward masking lasts a long time (for example, about 100 msec), whereas backward masking lasts only a short time (for example, about 5 msec). Furthermore, the level of the masking effect (the masking amount) is about 20 dB for forward masking and about 30 dB for backward masking.

【００２９】したがって、このマスキング効果を上記ブ
ロック間でのビット割当ての際に考慮すれば、よりビッ
ト圧縮が可能になる。すなわち、マスキングされる部分
の信号に対してはビット数を少なくしても聴感上何ら悪
影響がないため、このマスキングされる部分のビット数
を減らして圧縮効果をより高めることができる。なお、
上記マスキング効果におけるマスキング量は、例えば上
記臨界帯域毎のエネルギの総和を求め、この臨界帯域毎
のエネルギに基づいて求められる。また、ある臨界帯域
の信号による他の臨界帯域（或いは当該ある臨界帯域自
身）の他の時間へのマスキング量を求めるようにするこ
とも可能である。このようなマスキング量に基づいて各
帯域毎の許容可能なノイズレベルが求められ、更に、こ
の各帯域毎の許容可能なノイズレベルに基づいて上記符
号化の際の割当てビット数を決定することができる。Therefore, if this masking effect is taken into account when allocating bits among the blocks, further bit compression becomes possible. That is, reducing the number of bits for the signal in the masked portion has no adverse effect on perceived sound quality, so the number of bits for the masked portion can be reduced to raise the compression efficiency. The masking amount is obtained, for example, by finding the sum of the energy in each critical band and deriving the amount from that per-band energy. It is also possible to obtain the masking amount that a signal in one critical band exerts on another critical band (or on that critical band itself) at another time. An allowable noise level for each band is obtained on the basis of this masking amount, and the number of bits to be allocated at the time of encoding can then be determined on the basis of the allowable noise level for each band.

【００３０】その後、上記フィルタ回路１５の出力は引
算器１６に送られる。該引算器１６は、上記畳込んだ領
域での後述する許容可能なノイズレベルに対応するレベ
ルαを求めるものである。なお、当該許容可能なノイズ
レベル（許容ノイズレベル）に対応するレベルαは、後
述するように、逆コンボリューション処理を行うことに
よって、臨界帯域の各帯域毎の許容ノイズレベルとなる
ようなレベルである。ここで、上記引算器１６には、上
記レベルαを求めるための許容関数（マスキングレベル
を表現する関数）が供給される。この許容関数を増減さ
せることで上記レベルαの制御を行っている。当該許容
関数は、後述する関数発生回路２９から供給されている
ものである。Thereafter, the output of the filter circuit 15 is sent to the subtractor 16. The subtractor 16 obtains a level α corresponding to the allowable noise level, described later, in the convolved domain. As described later, the level α corresponding to the allowable noise level is the level that becomes the allowable noise level of each critical band once inverse convolution (deconvolution) is performed. The subtractor 16 is supplied with an allowance function (a function expressing the masking level) for obtaining the level α. The level α is controlled by increasing or decreasing this allowance function. The allowance function is supplied from a function generation circuit 29, described later.

【００３１】すなわち、許容ノイズレベルに対応するレ
ベルαは、臨界帯域幅の帯域の低域から順に与えられる
番号をｉとすると、次の式で求めることができる。 α＝Ｓ−（ｎ−ａｉ）この式において、ｎ，ａは定数でａ＞０、Ｓは畳込み処
理されたバークスペクトルの強度であり、該式中（ｎ−
ａｉ）が許容関数となる。本具体例ではｎ＝３８，ａ＝
１としており、この時の音質劣化はなく、良好な符号化
が行えた。That is, the level α corresponding to the allowable noise level can be obtained from the following equation, where i is the number assigned to the critical bands in order from the lowest band:
α = S − (n − a·i)
In this equation, n and a are constants with a > 0, and S is the intensity of the convolved Bark spectrum; the term (n − a·i) is the allowance function. In this specific example, n = 38 and a = 1, with which there was no degradation in sound quality and good encoding was achieved.
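The formula for the level α can be evaluated directly, as in the sketch below. Two points are assumptions rather than statements from the text: band numbering is taken to start at i = 1, and S is taken to be expressed in units in which a subtraction is meaningful (for example dB). The function name `level_alpha` is illustrative.

```python
def level_alpha(s, n=38.0, a=1.0):
    """Return alpha_i = S_i - (n - a*i) for each band.

    s: convolved Bark spectrum values, lowest band first.
    n, a: constants of the allowance function (n = 38, a = 1 in the text).
    Assumption: the band number i starts at 1.
    """
    return [s_i - (n - a * i) for i, s_i in enumerate(s, start=1)]


# Three bands with equal intensity: the allowance shrinks as i grows,
# so alpha rises towards the higher bands.
alpha = level_alpha([60.0, 60.0, 60.0])
```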

【００３２】このようにして、上記レベルαが求めら
れ、このデータは、割算器１７に伝送される。当該割算
器１７では、上記畳込みされた領域での上記レベルαを
逆コンボリューションするためのものである。したがっ
て、この逆コンボリューション処理を行うことにより、
上記レベルαからマスキングスペクトルが得られるよう
になる。すなわち、このマスキングスペクトルが許容ノ
イズスペクトルとなる。なお、上記逆コンボリューショ
ン処理は、複雑な演算を必要とするが、本具体例では簡
略化した割算器１７を用いて逆コンボリューションを行
っている。In this way, the level α is obtained, and this data is transmitted to the divider 17. The divider 17 performs inverse convolution (deconvolution) of the level α in the convolved domain. By performing this inverse convolution, a masking spectrum is obtained from the level α. This masking spectrum is the allowable noise spectrum. Inverse convolution normally requires complicated computation, but in this specific example it is performed with the simplified divider 17.

【００３３】次に、上記マスキングスペクトルは、合成
回路１８を介して減算器１９に伝送される。ここで、当
該減算器１９には、上記総和検出回路１４の出力すなわ
ち前述した総和検出回路１４からのバークスペクトルＳ
Ｂが、遅延回路２１を介して供給されている。したがっ
て、この減算器１９で上記マスキングスペクトルとバー
クスペクトルＳＢとの減算演算が行われることで、図９
に示すように、上記バークスペクトルＳＢは、該マスキ
ングスペクトルＭＳのレベルで示すレベル以下がマスキ
ングされることになる。Next, the masking spectrum is transmitted to the subtractor 19 via the synthesis circuit 18. The subtractor 19 is also supplied, via the delay circuit 21, with the output of the sum detection circuit 14, that is, the Bark spectrum SB described above. The subtractor 19 therefore performs a subtraction between the masking spectrum and the Bark spectrum SB, so that, as shown in FIG. 9, the portion of the Bark spectrum SB at or below the level indicated by the masking spectrum MS is masked.

【００３４】当該減算器１９の出力は、上記許容ノイズ
レベル補正回路２０を介してＲＯＭ３０に送られる。該
ＲＯＭ３０には、上記符号化回路５０におけるＦＦＴ係
数データの符号化に用いる複数の割当ビット数情報が格
納されており、上記減算回路１９の出力（上記各帯域の
エネルギと上記ノイズレベル設定手段の出力との差分の
レベル）に応じた割当ビット数情報を出力するようにな
っている。なお、出力端子６２からは、ＦＦＴ係数デー
タの符号化出力と共に、フローティング処理のフローテ
ィングブロック毎のフローティング係数及びワード長情
報からなるサブ情報も出力される。The output of the subtractor 19 is sent to the ROM 30 via the allowable noise level correction circuit 20. The ROM 30 stores a plurality of items of allocated-bit-number information used for encoding the FFT coefficient data in the encoding circuit 50, and outputs the allocated-bit-number information that corresponds to the output of the subtractor 19 (the level of the difference between the energy of each band and the output of the noise level setting means). In addition to the encoded FFT coefficient data, the output terminal 62 also outputs sub-information consisting of the floating coefficient and the word length information for each floating block of the floating process.
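As a rough illustration of the table lookup described above, the sketch below maps the difference between band energy and allowable noise to a bit count. The table contents, the 6 dB step used to form the table address, and the names `ROM_TABLE`, `STEP_DB`, and `allocated_bits` are all hypothetical; the patent does not disclose what the ROM 30 actually holds.

```python
import math

ROM_TABLE = [0, 1, 2, 3, 4, 5, 6, 7, 8]  # hypothetical contents: bits per address
STEP_DB = 6.0  # hypothetical quantisation step used to form the table address


def allocated_bits(band_energy_db, allowable_noise_db):
    """Look up a bit count per band from the energy-minus-noise difference."""
    result = []
    for energy, noise in zip(band_energy_db, allowable_noise_db):
        diff = energy - noise
        if diff <= 0:
            address = 0  # band lies at or below the allowable noise: no bits
        else:
            address = min(math.ceil(diff / STEP_DB), len(ROM_TABLE) - 1)
        result.append(ROM_TABLE[address])
    return result


bits = allocated_bits([60.0, 30.0, 10.0], [30.0, 25.0, 20.0])
```

Bands whose energy does not exceed the allowable noise receive no bits in this sketch, which matches the idea that masked portions need not be coded finely.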

【００３５】また、合成回路１８での合成の際には、最
小可聴カーブ発生回路２２から供給される図１０に示す
ような人間の聴覚特性であるいわゆる最小可聴カーブＲ
Ｃを示すデータと、上記マスキングスペクトルＭＳとを
合成することができる。この最小可聴カーブにおいて、
雑音絶対レベルがこの最小可聴カーブ以下ならば該雑音
は聞こえないことになる。更に、該最小可聴カーブは、
コーディングが同じであっても例えば再生時の再生ボリ
ュームの違いで異なるものとなる。ただし、現実的なデ
ィジタルシステムでは、例えば１６ビットダイナミック
レンジへの音楽のはいり方にはさほど違いがないので、
例えば４ｋＨｚ付近の最も耳に聞こえやすい周波数帯域
の量子化雑音が聞こえないとすれば、他の周波数帯域で
はこの最小可聴カーブのレベル以下の量子化雑音は聞こ
えないと考えられる。したがって、このように例えばシ
ステムの持つワードレングスの４ｋＨｚ付近の雑音が聞
こえない使い方をすると仮定し、この最小可聴カーブＲ
ＣとマスキングスペクトルＭＳとを共に合成することで
許容ノイズレベルを得るようにすると、この場合の許容
ノイズレベルは、図中斜線で示す部分までとすることが
できるようになる。なお、本具体例では、上記最小可聴
カーブの４ｋＨｚのレベルを、例えば２０ビット相当の
最低レベルに合わせている。また、この図１０は、信号
スペクトルＳＳも同時に示している。At the time of synthesis in the synthesis circuit 18, data representing the so-called minimum audible curve RC, a characteristic of human hearing shown in FIG. 10 and supplied from the minimum audible curve generating circuit 22, can be combined with the masking spectrum MS. Noise whose absolute level is at or below this minimum audible curve is inaudible. Moreover, even for the same coding, the minimum audible curve differs depending on, for example, the playback volume. In practical digital systems, however, there is little difference in how music fits into, for example, a 16-bit dynamic range, so if quantization noise in the frequency band to which the ear is most sensitive, around 4 kHz, is inaudible, quantization noise below the level of the minimum audible curve in other frequency bands can also be considered inaudible. Accordingly, assuming that the system is used such that noise around 4 kHz at the word length of the system is inaudible, and that the allowable noise level is obtained by combining the minimum audible curve RC with the masking spectrum MS, the allowable noise level in this case can extend up to the hatched portion in the figure. In this specific example, the 4 kHz level of the minimum audible curve is matched to the lowest level corresponding to, for example, 20 bits. FIG. 10 also shows the signal spectrum SS.
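The text does not say how the synthesis circuit 18 combines the two curves. One plausible reading, sketched here purely as an assumption, is that the allowable noise in each band is the larger of the masking spectrum MS and the minimum audible curve RC, since noise below either of them is inaudible. The function name and the sample values are illustrative.

```python
def combine_allowable_noise(masking_spectrum, minimum_audible):
    """Per-band allowable noise, assuming an element-wise maximum of MS and RC."""
    return [max(ms, rc) for ms, rc in zip(masking_spectrum, minimum_audible)]


# Toy per-band levels in arbitrary but matching units.
ms = [10.0, 5.0, 20.0]
rc = [8.0, 12.0, 20.0]
allowable = combine_allowable_noise(ms, rc)
```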

【００３６】ここで、上記許容ノイズレベル補正回路２
０では、補正値決定回路２８から送られてくるいわゆる
等ラウドネス曲線の情報に基づいて、上記減算器１９か
らの許容ノイズレベルを補正している。すなわち、上記
補正値決定回路２８からは、上記減算器１９からの許容
ノイズレベルを、いわゆる等ラウドネス曲線の情報デー
タに基づいて補正させるための補正値データが出力さ
れ、この補正値データが上記許容ノイズレベル補正回路
２０に伝送されることで、上記減算器１９からの許容ノ
イズレベルの等ラウドネス曲線を考慮した補正がなされ
るようになる。なお、上記等ラウドネス曲線とは、人間
の聴覚特性に関するものであり、例えば１ｋＨｚの純音
と同じ大きさに聞こえる各周波数での音の音圧を求めて
曲線で結んだもので、ラウドネスの等感度曲線とも呼ば
れる。また、該等ラウドネス曲線は、図１０に示した最
小可聴カーブＲＣと略同じ曲線を描くものである。該等
ラウドネス曲線においては、例えば４ｋＨｚ付近では１
ｋＨｚのところより音圧が８〜１０ｄＢ下がっても１ｋ
Ｈｚと同じ大きさに聞こえ、逆に５０ｋＨｚ付近では１
ｋＨｚでの音圧よりも約１５ｄＢ高くないと同じ大きさ
に聞こえない。このため、上記最小可聴カーブのレベル
を越えた雑音（許容ノイズレベル）は、該等ラウドネス
曲線に応じたカーブで与えられる周波数特性を持つよう
にするのが良いことがわかる。このようなことから、上
記等ラウドネス曲線を考慮して上記許容ノイズレベルを
補正することは人間の聴覚特性に適合していることがわ
かる。Here, the allowable noise level correction circuit 20 corrects the allowable noise level from the subtractor 19 on the basis of information on the so-called equal loudness curve sent from the correction value determination circuit 28. That is, the correction value determination circuit 28 outputs correction value data for correcting the allowable noise level from the subtractor 19 on the basis of equal loudness curve data, and this correction value data is transmitted to the allowable noise level correction circuit 20, so that the allowable noise level from the subtractor 19 is corrected with the equal loudness curve taken into account. The equal loudness curve relates to the characteristics of human hearing; it is obtained, for example, by finding the sound pressure at each frequency that sounds as loud as a 1 kHz pure tone and joining these points with a curve, and it is also called an equal-sensitivity curve of loudness. The equal loudness curve follows substantially the same shape as the minimum audible curve RC shown in FIG. 10. On the equal loudness curve, for example, a sound near 4 kHz sounds as loud as at 1 kHz even when its sound pressure is 8 to 10 dB lower than at 1 kHz, whereas near 50 kHz a sound does not sound equally loud unless its sound pressure is about 15 dB higher than at 1 kHz. It follows that noise exceeding the level of the minimum audible curve (the allowable noise level) should preferably have a frequency characteristic given by a curve corresponding to the equal loudness curve. Correcting the allowable noise level with the equal loudness curve taken into account is therefore well suited to the characteristics of human hearing.

【００３７】なお、本具体例においては、上述した最小
可聴カーブの合成処理を行わない構成とすることもでき
る。すなわち、この場合には、最小可聴カーブ発生回路
２２，合成回路１８が不要となり、上記引算器１６から
の出力は、割算器１７で逆コンボリューションされた
後、すぐに減算器１９に伝送されることになる。In this specific example, a configuration that omits the minimum audible curve synthesis described above is also possible. In that case, the minimum audible curve generating circuit 22 and the synthesis circuit 18 are unnecessary, and the output of the subtractor 16 is inversely convolved by the divider 17 and then transmitted directly to the subtractor 19.

【００３８】ここで、上述した本実施例による可変ビッ
トレートでの圧縮データの伝送は、例えば、一定ビット
レートの記録媒体と可変ビットレートの記録媒体との間
でデータ転送し記録するような場合に特に有効である。The transmission of compressed data at a variable bit rate according to the embodiment described above is particularly effective when, for example, data is transferred and recorded between a constant-bit-rate recording medium and a variable-bit-rate recording medium.

【００３９】すなわち、例えば、記録媒体として例えば
いわゆるＣＤ−Ｉ（ＣＤ−インタラクティブ）、ＣＤ−
ＲＯＭＸＡ等、或いは、光磁気ディスク等を用い、こ
れらディスクからの上記一定ビットレートのデータを上
述した本実施例での可変ビットレートで更に圧縮して、
例えば半導体メモリ等の記録媒体（例えばいわゆるＩＣ
カード）に対して転送するような場合に特に有効であ
る。That is, it is particularly effective when, for example, a so-called CD-I (CD-Interactive) or CD-ROM XA disc, or a magneto-optical disc or the like, is used as the recording medium, and the constant-bit-rate data from such a disc is further compressed at the variable bit rate of this embodiment and transferred to a recording medium such as a semiconductor memory (for example, a so-called IC card).

【００４０】[0040]

【発明の効果】本発明のディジタル信号の高能率符号化
方法においては、入力ディジタル音声信号を周波数軸上
の信号に変換すると共にブロック化し、該ブロック毎の
ディジタル信号を適応的な割り当てビットで符号化して
伝送するに際し、各ブロックのうちの任意の代表ブロッ
クの上記符号化に関連するデータと、他のブロックの上
記符号化に関連するデータとの間の偏差を求め、上記偏
差算出出力を伝送するようにしたことにより、ブロック
毎の伝送ビットレートを下げることが可能となり、圧縮
効率を高めることができるようになった。According to the digital signal high efficiency coding method of the present invention, an input digital audio signal is converted into a signal on the frequency axis and divided into blocks, and the digital signal for each block is encoded with adaptively assigned bits. In transmitting the data, the deviation between the data related to the encoding of any representative block of each block and the data related to the encoding of the other blocks is determined, and the deviation calculation output is transmitted. By doing so, the transmission bit rate for each block can be reduced, and the compression efficiency can be increased.
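The idea of sending deviations from a representative block can be sketched as below. The encoding-related data is modelled as a list of small integers (for instance per-band word lengths), and `bits_needed` is a deliberately crude, hypothetical cost model; neither is prescribed by the patent. The choice between raw data and deviations follows the condition stated in claim 5.

```python
def encode_deviation(representative, other):
    """Deviation of another block's encoding data from the representative block."""
    return [o - r for r, o in zip(representative, other)]


def decode_deviation(representative, deviation):
    """Recover the other block's data from the representative block and the deviation."""
    return [r + d for r, d in zip(representative, deviation)]


def bits_needed(values):
    """Hypothetical cost: fixed-width signed fields wide enough for every value."""
    widest = max(abs(v) for v in values)
    return len(values) * (widest.bit_length() + 1)


def choose_payload(representative, other):
    """Send the deviation only when it is smaller than the raw data."""
    deviation = encode_deviation(representative, other)
    if bits_needed(deviation) < bits_needed(other):
        return "deviation", deviation
    return "raw", other


rep = [12, 11, 10, 8]   # encoding data of the representative block (toy values)
nxt = [12, 10, 10, 7]   # encoding data of a neighbouring block (toy values)
kind, payload = choose_payload(rep, nxt)
restored = decode_deviation(rep, payload) if kind == "deviation" else payload
```

For quasi-stationary signals, neighbouring blocks differ little, so the deviations are small and cost fewer bits; for non-stationary signals the raw data may be the cheaper choice.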

【図面の簡単な説明】[Brief description of the drawings]

【図１】本発明実施例のディジタル信号の高能率符号化
装置の概略構成を示すブロック回路図である。FIG. 1 is a block circuit diagram showing a schematic configuration of a high efficiency digital signal encoding apparatus according to an embodiment of the present invention.

【図２】準定常的な信号を示す図である。FIG. 2 is a diagram showing a quasi-stationary signal.

【図３】差分値を示す図である。FIG. 3 is a diagram showing a difference value.

【図４】非定常的な信号を示す図である。FIG. 4 is a diagram showing a non-stationary signal.

【図５】偏差算出のための具体的構成を示すブロック回
路図である。FIG. 5 is a block circuit diagram showing a specific configuration for calculating a deviation.

【図６】高速フーリエ変換処理のブロック長を示す図で
ある。FIG. 6 is a diagram illustrating a block length of a fast Fourier transform process.

【図７】許容ノイズレベル算出のための具体的構成を示
すブロック図である。FIG. 7 is a block diagram showing a specific configuration for calculating an allowable noise level.

【図８】バークスペクトルを示す図である。FIG. 8 is a diagram showing a bark spectrum.

【図９】マスキングスペクトルを示す図である。FIG. 9 is a diagram showing a masking spectrum.

【図１０】最小可聴カーブ，マスキングスペクトルを合
成した図である。FIG. 10 is a diagram in which a minimum audible curve and a masking spectrum are combined.

【符号の説明】[Explanation of symbols]

４１，４２・・・ＱＭＦ
４３，４４，４５・・・高速フーリエ変換回路
５０・・・符号化回路
６０・・・許容ノイズレベル算出回路
７０・・・偏差算出回路
41, 42: QMF
43, 44, 45: fast Fourier transform circuits
50: encoding circuit
60: allowable noise level calculation circuit
70: deviation calculation circuit

Claims

(57)【特許請求の範囲】(57) [Claims]

【請求項１】入力ディジタル音声信号を周波数軸上の
信号に変換すると共にブロック化し、該ブロック毎のデ
ィジタル信号を適応的な割り当てビットで符号化して伝
送するディジタル信号の高能率符号化方法において、上記各ブロックのうちの任意の代表ブロックの上記符号
化に関連するデータと、他のブロックの上記符号化に関
連するデータとの間の偏差を求め、上記偏差算出出力を伝送することを特徴とするディジタ
ル信号の高能率符号化方法。1. A high-efficiency encoding method for a digital signal, in which an input digital audio signal is converted into a signal on the frequency axis and divided into blocks, and the digital signal of each block is encoded with adaptively allocated bits and transmitted, the method being characterized by obtaining a deviation between data relating to the encoding of an arbitrary representative block among the blocks and data relating to the encoding of another block, and transmitting the deviation calculation output.

【請求項２】上記代表ブロックと他のブロックは時間
的に前後するブロックであることを特徴とする請求項１
記載のディジタル信号の高能率符号化方法。2. The high-efficiency encoding method for a digital signal according to claim 1, wherein the representative block and the other block are blocks that precede or follow one another in time.

【請求項３】上記ブロックは、入力ディジタル音声信
号を所定サンプル毎に周波数軸上の信号に変換した直交
変換ブロックであることを特徴とする請求項１記載のデ
ィジタル信号の高能率符号化方法。3. The high-efficiency encoding method for a digital signal according to claim 1, wherein the block is an orthogonal transform block obtained by converting the input digital audio signal into a signal on the frequency axis every predetermined number of samples.

【請求項４】上記ブロックは、入力ディジタル音声信
号を所定サンプル毎に周波数軸上の信号に変換した後の
フローティングブロックであり、上記代表ブロックと他
のブロックは周波数的に前後するブロックであることを
特徴とする請求項１記載のディジタル信号の高能率符号
化方法。4. The high-efficiency encoding method for a digital signal according to claim 1, wherein the block is a floating block obtained after the input digital audio signal has been converted into a signal on the frequency axis every predetermined number of samples, and the representative block and the other block are blocks adjacent to one another in frequency.

【請求項５】上記偏差算出出力のデータ量が上記符号
化に関するデータのデータ量より少ない場合、上記偏差
算出出力を上記符号化に関するデータに代えて伝送する
ことを特徴とするディジタル信号の高能率符号化方法。5. A high-efficiency encoding method for a digital signal, characterized in that, when the data amount of the deviation calculation output is smaller than the data amount of the data relating to the encoding, the deviation calculation output is transmitted in place of the data relating to the encoding.

【請求項６】上記偏差算出出力に代えられる前の符号
化データは一定ビットレートであり、上記偏差算出出力
に代えられた後の符号化データは可変ビットレートであ
ることを特徴とする請求項５記載のディジタル信号の高
能率符号化方法。6. The high-efficiency encoding method for a digital signal according to claim 5, wherein the encoded data before being replaced by the deviation calculation output has a constant bit rate, and the encoded data after being replaced by the deviation calculation output has a variable bit rate.