JP2011059685A

JP2011059685A - Method for detecting audio signal that has basic layer and enhancement layer

Info

Publication number: JP2011059685A
Application number: JP2010196542A
Authority: JP
Inventors: Peter Jax; ヤクスペーター; Sven Kordon; コルドンスヴェン
Original assignee: Thomson Licensing SAS
Current assignee: Thomson Licensing SAS
Priority date: 2009-09-04
Filing date: 2010-09-02
Publication date: 2011-03-24
Anticipated expiration: 2030-09-02
Also published as: BRPI1002734A2; EP2306454B1; ATE534989T1; JP5808092B2; US20110060596A1; KR20110025616A; EP2306456A1; CN102013255B; EP2306454A1; US8566083B2; CN102013255A

Abstract

PROBLEM TO BE SOLVED: To provide an efficient method for reducing the power needed to decode a dual-layer audio signal.

SOLUTION: The audio signal has a base layer (BL) and an enhancement layer (EL), wherein the EL carries additional information for enhancing the quality of the BL audio content. Conventional decoding of such a dual-layer signal includes partially decoding the BL data (21), restoring the frequency bins of the BL (22), mapping the restored frequency bins to the MDCT domain (23), adding them to the decoded EL data, and performing an inverse integer MDCT. A low-complexity decoding method includes the steps of inverse-mapping the decoded EL data (45), adding the inverse-mapped EL data to the partially decoded BL data (42), and filtering the sum with an inverse BL filter bank.

COPYRIGHT: (C)2011,JPO&INPIT

Description

The present invention relates to a method for decoding an audio signal having a base layer and an enhancement layer.

An audio signal may have a base layer and an enhancement layer, collectively referred to as dual-layer, wherein the base layer represents a limited-quality version of the encoded audio content and the enhancement layer represents encoded additional information that enhances the quality of the audio content. For example, a bit stream may have a low bit-rate layer, such as an MP3 (MPEG-1 Layer III) bit stream, and an additional layer that extends the basic quality to an enhanced quality. In principle, more than one additional layer may be used. The highest of the additional layers may enable bit-exact reproduction of the original PCM (pulse code modulation) samples.

Encoding of such a dual-layer signal is usually done by encoding the base layer, thereby omitting certain information of the input signal, and then at least partially reconstructing the encoded base layer to obtain a prediction signal. The difference signal between the prediction signal and the full-quality input signal is then determined and encoded. The encoded difference signal serves as the enhancement layer.

FIG. 1 shows the encoder of an embedded lossless audio codec. In the upper signal path, the input signal is used to encode the base layer bit stream. The base layer encoder may, for example, conform to MP3. The base layer codec uses a filter bank 11 for time-frequency decomposition that differs from the MDCT filter bank 13 used in the enhancement layer signal path. In the MP3 example, the base layer filter bank 11 is a hybrid filter bank comprising a 32-band polyphase filter bank followed by an independent MDCT analysis block in each sub-band. In the second signal path, the input signal is fed to an integer MDCT block 13 that performs a perfectly invertible MDCT decomposition of the signal. The integer-valued MDCT frequency bins are the basis for lossless encoding of the enhancement layer information.

Since the hybrid base layer filter bank 11 differs from the integer MDCT filter bank 13 of the enhancement layer, a mapping process is required to obtain a prediction signal. For this purpose, the base layer frequency bins (in the domain of the hybrid filter bank 11) are restored by partial decoding 16 and then mapped to the MDCT domain. The mapping 17 may be performed efficiently, for example as described in EP2064700A1 (PD060080). The mapped base layer information is then subtracted 14 from the integer-valued MDCT coefficients. The residual coefficients s14 are fed to the entropy encoder 15 in order to minimize the bit rate required to transmit the lossless enhancement layer.

Decoding of such a dual-layer signal usually follows the procedure shown in FIG. 2. In the upper signal path, the base layer information is partially decoded 21 to restore the frequency bin information. No synthesis filtering into the time domain is performed at this point, because it is needed only for decoding the base layer signal on its own. Next, exactly the same processing as in the encoder is performed: the frequency bins of the base layer information are restored (decoded) 22, and the restored frequency bins are mapped 23 to the MDCT domain. In parallel, the lower signal path decodes the enhancement bit stream. The output s24 of the entropy decoder 24 is identical to the residual error s14 of the base layer in the MDCT domain, as computed by the subtraction block 14 of the encoder. The error residual s24 is added 25 to the coefficients s23 mapped from the base layer information, and the sum is fed to the inverse integer MDCT block 26. The output signal of the inverse integer MDCT block is bit-exactly identical to the original input signal that was fed to the encoder.

A similar example is shown in Fig. 4 of R. Geiger, J. Herre, J. Koller, K.-H. Brandenburg, "IntMDCT - A Link Between Perceptual and Lossless Audio Coding", IEEE, 2002.

Audio decoders are often implemented in small, portable, battery-powered devices. It is therefore generally desirable to decode encoded audio signals in a power-saving manner. In processor-based decoder implementations, this is equivalent to reducing the number of processing cycles that the processor has to execute.

The present invention provides an efficient solution for reducing the power required to decode a dual-layer audio signal.

According to one general aspect of the invention, a method is provided for decoding an audio signal having a base layer portion and an enhancement layer portion, wherein the enhancement layer portion is predicted from the base layer signal portion using a filter bank domain mapping. The method comprises the steps of partially decoding the encoded base layer portion, inverse-mapping the enhancement layer portion according to a simplified inversion of the filter bank domain mapping, adding the inverse-mapped enhancement layer portion to the partially decoded base layer portion, and synthesis filtering the output signal of the adding step using an inverse base layer filter bank.

According to another general aspect of the invention, a decoder is provided for decoding an audio signal having a base layer signal portion and an enhancement layer signal portion, wherein the enhancement layer portion is predicted from the base layer signal portion using a filter bank domain mapping. The decoder comprises a partial decoder for partially decoding the encoded base layer portion, a first mapper for inverse-mapping the enhancement layer portion according to a simplified inversion of the filter bank domain mapping, a first adder for adding the inverse-mapped enhancement layer portion to the partially decoded base layer portion, and a first synthesis filter that synthesis-filters the summed output signal and operates as an inverse base layer filter bank.

According to one aspect of the invention, a method is provided for decoding an audio signal having a base layer signal portion and an enhancement layer signal portion. The base layer signal portion and the enhancement layer signal portion result from different filter types and are in different filter bank domains, and the enhancement layer signal portion is predicted from the base layer signal portion using a filter bank domain mapping and is then entropy-coded. The method comprises the steps of partially decoding the encoded base layer portion, entropy-decoding the enhancement layer portion, inverse-mapping the entropy-decoded enhancement layer portion according to a simplified inversion of the filter bank domain mapping, adding the inverse-mapped enhancement layer portion to the partially decoded base layer portion, and synthesis filtering the output signal of the adding step using an inverse base layer filter bank.

According to another aspect of the invention, a decoder is provided for decoding an audio signal having a base layer portion and an enhancement layer portion. The base layer portion and the enhancement layer portion are in different filter bank domains, and the enhancement layer portion is predicted from the base layer portion using a filter bank domain mapping and is then entropy-coded. The decoder comprises a partial decoder for partially decoding the base layer portion, an entropy decoder for entropy-decoding the enhancement layer portion, a first mapping element for inverse-mapping the entropy-decoded enhancement layer signal according to a simplified inversion of the filter bank domain mapping, a first adder for adding the inverse-mapped enhancement layer to the partially decoded base layer, and a first synthesis filter that filters the summed output signal and operates as an inverse base layer filter bank.

In one embodiment, the base layer portion comprises frequency bins, and the step of partially decoding the base layer signal comprises restoring the frequency bins.

Note that a simplified inversion of the filter bank domain mapping means an inverse process that is performed with lower accuracy than the original filter bank domain mapping. The lower accuracy may refer to numerical rounding and to simplification of the filtering functions for a more efficient implementation.

One advantage of the invention is that it is applicable to existing coding formats; no special format is required.

Further advantages of embodiments of the invention are given in the dependent claims, the following description and the drawings.

Exemplary embodiments of the invention are described with reference to the accompanying drawings, in which:
FIG. 1 shows the encoder of an embedded lossless audio codec.
FIG. 2 shows a bit-exact audio decoder for encoded dual-layer audio data.
FIG. 3 shows the structure of an enhanced low-complexity decoder.
FIG. 4 shows the relative computational complexity of the bit-exact decoder.
FIG. 5 shows the relative computational complexity of the enhanced low-complexity decoder.
FIG. 6 shows the structure of a flexible decoder having a bit-exact decoding portion and a low-complexity decoding portion.
FIG. 7 shows exemplary power spectra of a source audio signal, a conventionally decoded audio signal and an enhanced decoded audio signal, and the corresponding error spectra.

The following exemplary embodiments of the invention are described with reference to MPEG-1 Layer III (MP3). However, the invention may also be used in embodiments with similar filter-bank-based audio coding formats, in particular where a filter bank domain mapping is required.

FIG. 3 shows a block diagram of a decoding approach according to one aspect of the invention. The input signal In may be obtained from any kind of data source, for example by reading a file from any storage element, or from a receiver for wireless or wired data broadcast or unicast. The input signal In is pre-processed, e.g. by file I/O processing, in order to separate the base layer portion from the enhancement layer portion. The base layer signal is then input to the partial base layer decoder 41, which generates a base layer signal s41 in the base layer filter bank domain.

The partial base layer decoder 41 performs only partial decoding, i.e. no inverse transform into the time domain. In a conventional base layer decoder, this signal s41 in the base layer filter bank domain would be input directly to the inverse base layer filter bank 43 to obtain a time-domain signal. In contrast, the enhanced decoder has an adder 42 that adds enhancement data before the sum of the base layer and enhancement layer signals is input to the inverse base layer filter bank 43. Advantageously, the filter bank 43 may be the same as for conventional MP3 base layer decoding. The enhancement data are generated from the enhancement layer by the inverse mapper 45, which maps data from the MDCT domain of the enhancement layer to the filter bank domain of the base layer. Since the input data are often entropy-coded, the enhancement layer data are obtained from an entropy decoder 44 in one embodiment of the invention. If the input data are coded differently or not coded at all, the entropy decoder 44 may be replaced by a corresponding decoder or skipped, respectively.

Compared with the conventional bit-exact, fully lossless decoder described above with reference to FIG. 2, the signal flow is changed in part of the low-complexity decoder. Instead of mapping frequency bins from the filter bank domain of the base layer codec to the MDCT domain of the enhancement layer codec, the mapping is performed in the opposite direction. The enhancement layer decoder uses an inverse mapping 45 from the MDCT domain to the domain of the MP3 base layer codec. The output of the mapping (i.e. the mapped error residual) is thus added 42 directly to the decoded frequency bins of the base layer. As a result, the enhanced time-domain signal can be obtained using the synthesis filter bank (FB) of the base layer codec.
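The reversed signal flow can be illustrated with a toy numerical sketch. It is not the actual MP3/MDCT mapping: the 2x2 rotation `M`, the coefficient values and all names are assumptions made for illustration, and entropy coding and integer rounding are omitted. It only shows that adding a coarsely inverse-mapped residual in the base layer domain recovers the signal far more closely than the base layer alone.

```python
import math

def matvec(m, v):
    """Multiply matrix m (a list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

# Hypothetical stand-in for the forward mapping (base layer domain -> MDCT domain).
THETA = 0.3
M = [[math.cos(THETA), -math.sin(THETA)],
     [math.sin(THETA),  math.cos(THETA)]]
# The exact inverse of a rotation is its transpose; rounding the coefficients
# to one decimal mimics a low-accuracy ("simplified") inverse mapping.
M_INV_COARSE = [[round(c, 1) for c in col] for col in zip(*M)]

b_true = [100.0, -40.0]   # source signal expressed in the base layer domain
b_hat = [96.0, -44.0]     # coarsely quantized base layer bins (like s41)

# Encoder side: residual in the MDCT domain (like s14 / s24).
x_mdct = matvec(M, b_true)
s23 = matvec(M, b_hat)
s24 = [x - p for x, p in zip(x_mdct, s23)]

# Conventional decoder: add in the MDCT domain (inverse MDCT would follow).
conventional = [p + r for p, r in zip(s23, s24)]

# Low-complexity decoder: inverse-map the residual, add in the base layer domain.
low_complexity = [b + e for b, e in zip(b_hat, matvec(M_INV_COARSE, s24))]

err_base = max(abs(t - h) for t, h in zip(b_true, b_hat))
err_low = max(abs(t - l) for t, l in zip(b_true, low_complexity))
print(f"base layer only error: {err_base:.3f}, low-complexity error: {err_low:.3f}")
```

Because the coarse inverse acts only on the small residual, its inaccuracy contributes an error that is a fraction of the residual rather than a fraction of the full signal.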

One advantage of the enhanced decoder is that it uses significantly less power for decoding than the bit-exact decoder while producing an audio output signal of similar quality. FIG. 4 shows the relative computational complexity of the blocks of the conventional bit-exact decoder. Computational complexity is broadly equivalent to power consumption, because it corresponds to the number of processing cycles of the one or more processing elements, e.g. processors, that perform the computation. The inventors' measurements and calculations revealed the following. The partial base layer decoder accounts for about 8% of the total power consumption of the conventional decoder, and the enhancement layer entropy decoder for about 19%. The mapping block and the inverse integer MDCT block require comparatively high shares of 35% and 38% of the total power consumption, respectively. The adder has a relatively simple structure compared with the other blocks and requires practically no power. The power consumption of the partial base layer decoder, the enhancement layer entropy decoder, the mapping block and the inverse integer MDCT block therefore adds up to 100% in total.

FIG. 5 shows the computational complexity of the blocks of the enhanced dual-layer decoder relative to the conventional decoder. As the comparison shows, both embodiments use the same partial base layer decoder and entropy decoder, which consume about 8% and 19% of the total power consumption of the conventional decoder. The main reduction in power consumption is obtained by using the inverse mapper 45 instead of the conventional mapper, and the inverse base layer filter bank 43 instead of the inverse integer MDCT filter bank. The inverse mapper 45 consumes only about 10% of the total power consumption of the conventional decoder and replaces the mapping block, which consumes 35%. This measure thus saves 25% (35% - 10%). Furthermore, the inverse base layer filter bank 43 requires only about 8% of the conventional total power consumption and replaces the inverse integer MDCT block, which used 38%. This measure saves 30% (38% - 8%) of the total power consumption. The adder differs slightly because it adds signal portions in the base layer filter bank domain instead of the MDCT domain. Since the adder does not need to follow a specific data format or arithmetic operation, its complexity may be reduced; in any case it still requires practically no power. Consequently, the total power consumption of the enhanced decoder is reduced by 55%, down to 45% of the power consumption of the conventional decoder. This makes the enhanced decoder of the invention attractive for low-power applications, e.g. battery-powered devices.
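The complexity budget can be checked with a few lines of arithmetic. The percentages are the ones stated in the text for FIG. 4 and FIG. 5; the dictionary layout and names are only an illustrative choice, and the adder is left out because it is described as needing practically no power.

```python
# Shares of the conventional decoder's total power consumption, in percent.
conventional = {
    "partial base layer decoder": 8,
    "enhancement layer entropy decoder": 19,
    "forward mapping": 35,
    "inverse integer MDCT": 38,
}
enhanced = {
    "partial base layer decoder": 8,
    "enhancement layer entropy decoder": 19,
    "inverse mapper 45": 10,
    "inverse base layer filter bank 43": 8,
}

total_conventional = sum(conventional.values())
total_enhanced = sum(enhanced.values())
saving_mapping = conventional["forward mapping"] - enhanced["inverse mapper 45"]
saving_filter_bank = (conventional["inverse integer MDCT"]
                      - enhanced["inverse base layer filter bank 43"])

print(f"conventional total: {total_conventional}%")
print(f"enhanced total: {total_enhanced}%")
print(f"saving: {saving_mapping}% + {saving_filter_bank}% = "
      f"{total_conventional - total_enhanced}%")
```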

From the viewpoint of computational complexity, the new approach has two advantages.

First, the inverse mapping in the inverse mapper 45 may have a much lower signal-to-distortion ratio (SDR) than the forward mapping shown in FIG. 2. The reason for the much lower accuracy requirement is that the input to the mapping is the error residual. Any distortion produced by the inverse mapping procedure is added directly to the low-power residual signal. Therefore, the absolute distortion of the inverse mapping may be of the same order of magnitude as for the forward mapping, while the SDR requirement can be relaxed by as much as the power of the input signal is reduced. In practice, it is sufficient for the inverse mapper 45 to have a mapping accuracy of about 20 dB, instead of the 50 dB required for the forward mapping. Owing to the lower SDR requirement, the computational complexity of the inverse mapping 45 is much lower than that of the forward mapping.
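The relaxed accuracy requirement can be made concrete with a small dB calculation. The 20 dB and 50 dB figures come from the text; the residual level of -30 dB relative to the full signal is a hypothetical value chosen only to illustrate the argument, and the function name is likewise an assumption.

```python
def distortion_level_db(input_level_db: float, mapping_sdr_db: float) -> float:
    """Level of the mapping distortion relative to the full signal, in dB.

    The distortion sits mapping_sdr_db below whatever is fed into the mapping.
    """
    return input_level_db - mapping_sdr_db

# Forward mapping: operates on the full-level signal (0 dB) with 50 dB SDR.
forward = distortion_level_db(0.0, 50.0)

# Inverse mapping: operates on the error residual only.
RESIDUAL_LEVEL_DB = -30.0  # hypothetical: residual 30 dB below the signal
inverse = distortion_level_db(RESIDUAL_LEVEL_DB, 20.0)

print(f"forward mapping distortion: {forward} dB relative to the signal")
print(f"inverse mapping distortion: {inverse} dB relative to the signal")
```

Under this assumption both mappings leave the distortion at the same absolute level, although the inverse mapping is 30 dB less accurate.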

Second, the less complex inverse filter bank 43 of the base layer codec can be used. In the above example, the synthesis filter bank of the MP3 codec can be used. This synthesis filter bank requires only about 8% of the overall complexity of the lossless decoder, instead of about 38% for the inverse integer MDCT. The inverse base layer filter bank 43 performs significantly less processing than the conventional inverse integer MDCT.

As mentioned above, a simplified inversion of the filter bank domain mapping, as performed by the inverse mapper 45, means an inverse process performed with lower accuracy than the original filter bank domain mapping. The lower accuracy may refer to numerical rounding and to simplification of the filtering functions for a more efficient implementation, for example skipping one or more stages or using shorter phase-correction filters. Further examples are given in EP2064700A1.

In summary, the enhanced signal flow yields a new near-lossless decoding structure that is easy to implement and suitable for obtaining significantly better audio quality than a plain base layer decoder. This is achieved by using the information from the enhancement layer via an inverse mapping of the error residual signal.

Because of the different processing, the output signal of the enhanced low-complexity decoder is not bit-exactly identical to the original input signal. However, the low-complexity enhanced decoder of the invention provides all frequency portions of the original input signal in its output signal. Advantageously, there is no audible difference between the signals. From a quality point of view, the low-complexity decoder is therefore fully comparable to the bit-exact decoder.

A more detailed analysis of the distortions reveals the following. The inverse mapping actually transforms three signal components into the base layer filter bank domain: the quantization error of the MP3 base layer, the quantization error of the integer MDCT, and the accumulated quantization errors or distortions of the forward and inverse mappings. The following applies to these error types.

The quantization error of the MP3 base layer, taken alone, perfectly complements the decoded frequency components of the MP3 base layer. That is, considering only this error type, the low-complexity decoding of the invention yields a perfect reconstruction of the input signal as far as the frequency spectrum is concerned.

The quantization error of the integer MDCT inevitably results from the integer MDCT analysis filter. It is spectrally flat and uncorrelated. In the decoding of the invention, this error results in substantially stationary, additive white Gaussian noise with a variance of about 2.6/12 LSB^2 in the resulting time-domain signal. The effect of this error type is comparable to a reduction of the PCM word width, e.g. from 16 bits/sample to 15 bits/sample. For audio content at typical, well-controlled levels, this error type is inaudible and can be neglected.
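The comparison with a one-bit reduction in word width can be checked with a rough calculation. This sketch assumes, beyond what the text states, that the original PCM quantization noise is uniform with variance 1/12 LSB^2 and that it is uncorrelated with the integer MDCT noise, so that the two variances add.

```python
import math

PCM_QUANT_VAR = 1.0 / 12.0   # assumed PCM quantization noise, in LSB^2
INT_MDCT_VAR = 2.6 / 12.0    # integer MDCT noise variance from the text, in LSB^2

total_var = PCM_QUANT_VAR + INT_MDCT_VAR

# Rise of the noise floor relative to plain PCM quantization noise.
increase_db = 10.0 * math.log10(total_var / PCM_QUANT_VAR)

# One bit of word width corresponds to 20*log10(2), about 6.02 dB.
equivalent_bits_lost = increase_db / (20.0 * math.log10(2.0))

print(f"noise floor rises by {increase_db:.2f} dB")
print(f"equivalent word-width loss: {equivalent_bits_lost:.2f} bit")
```

Under these assumptions the loss comes out slightly below one bit, which is in line with the 16-to-15 bits/sample comparison.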

The mapping errors are signal-dependent and comprise linear and non-linear distortions with a signal-to-noise ratio (SNR) of about 50-60 dB. That is, the error power varies with the signal power, staying at a roughly constant distance of about 50-60 dB below it.

In summary, the output signal of the low-complexity decoder of the invention is comparable to that of the bit-exact enhancement layer decoder and has much better audio quality than the output signal of a base layer decoder, while the required computational effort is much lower than that of the conventional bit-exact enhancement layer decoder. For example, the low-complexity decoder provides an SNR of 50-60 dB, compared with 20 dB for conventional MP3 at a typical bit rate of 128 kbit/s. Subjectively, the degree of quality improvement depends on the MP3 bit rate of the base layer. The improvement is large in particular at the common low and medium bit rates.

FIG. 7 shows exemplary power spectra of an example source audio signal p_S, a conventionally decoded audio signal p_C and an enhanced decoded audio signal p_E, together with the corresponding deviation (error) spectra e_C and e_E. A bit-exact decoder provides a full-quality audio signal identical to the input signal p_S. In the conventionally decoded base layer audio signal p_C, such as the output signal of an ordinary MP3 player, the high-frequency portions are cut off. Typically, spectral portions above a cut-off frequency f_C have little impact on audio quality and are therefore removed by the (base layer) encoder. Consequently, the error e_C of the conventional MP3 signal is particularly high at high frequencies. The actual cut-off frequency f_C may vary slightly depending on the current signal energy. However, at least in certain audio situations, these frequency portions are at least partially perceivable by many people, and their absence can significantly degrade audio quality.

In contrast, the output signal p_E of the low-complexity dual-layer decoder of the invention deviates little from the input signal p_S and contains all frequency components of the input signal p_S. Accordingly, the error signal e_E has very low power and is far more constant over the entire frequency range. Note that FIG. 7 shows exemplary short-term spectra and uses a logarithmic scale on the vertical (power) axis; that the error power generally depends on the signal power of the input and output signals; and that the actual powers p_C and p_E of the decoded audio signals vary between minimum and maximum values, p_{C,min} to p_{C,max} and p_{E,min} to p_{E,max} respectively, but are generally identical to the original signal p_S at least well below the cut-off frequency f_C. Although FIG. 7 is drawn in an exaggerated manner to make the differences clear, the range p_{E,min} to p_{E,max} is closer to the original p_S than the range p_{C,min} to p_{C,max}. This means that the audio quality of p_E is better.

The new decoding approach is particularly beneficial for devices with low computational power or with a limited power supply, such as battery-powered devices. To make the low-complexity decoding feature more transparent and easier to use, automatic switching between fully lossless (bit-exact) decoding and low-complexity near-lossless decoding may be applied. Examples are as follows.

Automatically switched decoding mode depending on the power supply:
When the device is battery powered, the near-lossless mode is used. When the device is connected to a more reliable power source, such as mains voltage, the bit-exact lossless mode is used. The switching may be performed automatically in response to a power supply detector.

Automatically switched decoding mode depending on the total processor load:
When other executables impose a high load on the processor, the near-lossless mode is used. When the processor load is low, the bit-exact lossless mode is used. The switching may be performed automatically in response to a processing load detector.

Automatically switched decoding mode depending on the required signal output:
When a low-quality output is required, for example an analog line-level output, the near-lossless mode is used. When a high-quality output is required, for example a digital SPDIF output, the bit-exact lossless mode is used. The switching may be performed automatically in response to an output type detector.

The above examples may make use of thresholds (a voltage threshold, a processing load threshold) and corresponding detectors. For example, the condition for enabling the power-saving mode may be that the processing load of at least one processing element that executes one or more steps of the decoding method exceeds a threshold. Various combinations of two or more different conditions are possible, for example high processing load together with low power supply.
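The switching rules above can be sketched as a small decision function. This is only an illustration under stated assumptions: the names `DecodingMode`, `OperatingConditions` and `select_mode`, the normalized load value, the default threshold of 0.8 and the `require_all` flag for combining conditions are all hypothetical and are not taken from the text.

```python
from dataclasses import dataclass
from enum import Enum


class DecodingMode(Enum):
    LOSSLESS = "lossless"            # bit-exact, full-power path
    NEAR_LOSSLESS = "near_lossless"  # low-complexity, power-saving path


@dataclass
class OperatingConditions:
    on_battery: bool      # as reported by a power supply detector
    cpu_load: float       # 0.0 to 1.0, as reported by a processing load detector
    digital_output: bool  # True for e.g. SPDIF, False for analog line level


def select_mode(cond, load_threshold=0.8, require_all=False):
    """Pick a decoding mode from the detected operating conditions.

    load_threshold is an assumed value; require_all=True models a combined
    condition in which every trigger must hold before power saving starts.
    """
    triggers = [
        cond.on_battery,
        cond.cpu_load > load_threshold,
        not cond.digital_output,
    ]
    power_saving = all(triggers) if require_all else any(triggers)
    return DecodingMode.NEAR_LOSSLESS if power_saving else DecodingMode.LOSSLESS
```

With the default `require_all=False`, any single trigger (battery operation, high load or a low-quality output) selects the near-lossless mode; a real device would choose its own combination of conditions and thresholds.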

FIG. 6 shows an exemplary decoder that uses automatically switched decoding modes depending on the current operating conditions. A mechanical or electronic power supply detector, an electronic voltage threshold detector, a processing load threshold detector or the like provides a control signal Ctr that is used to control a switch 50. The switch 50 enables either a power-saving mode that uses the near-lossless low-complexity decoding mode of the invention as shown in FIG. 3, or a full-power mode that uses the conventional bit-exact lossless decoding mode as shown in FIG. 2.

In the power-saving mode, the switch 50 enables the inverse mapper 45, the first adder 42 and the inverse base layer filter bank 43, and disables the mapper 47, the second adder 48 and the inverse integer MDCT 49. In the full-power mode, by contrast, the switch 50 enables the mapper 47, the second adder 48 and the inverse integer MDCT 49, and disables the inverse mapper 45, the first adder 42 and the inverse base layer filter bank 43. The partial base layer decoder 41 and the enhancement layer entropy decoder 44 are used in both modes. The mapper 47 may perform both the restoration of the frequency bins and the actual mapping to the MDCT domain, as shown in FIG. 2. Since the first and/or second adders 42, 48 require practically no power, disabling or enabling them may be unnecessary.
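The routing performed by the switch 50 can be sketched as follows. This is a simplified illustration, not the actual decoder: the function name `decode_frame`, the dictionary keys and the representation of each block as a plain callable on a list of numbers are assumptions made for the example. The inputs stand for the outputs of the partial base layer decoder 41 and the enhancement layer entropy decoder 44, which run in both modes.

```python
from typing import Callable, Dict, List

Vector = List[float]
Block = Callable[[Vector], Vector]


def decode_frame(bl_bins: Vector, el_residual: Vector,
                 power_saving: bool, blocks: Dict[str, Block]) -> Vector:
    """Route one frame through the path selected by switch 50."""
    if power_saving:
        # Near-lossless path: bring the EL data into the BL filter bank domain.
        el_in_bl_domain = blocks["inverse_mapper_45"](el_residual)
        summed = [b + e for b, e in zip(bl_bins, el_in_bl_domain)]   # adder 42
        return blocks["inverse_bl_filter_bank_43"](summed)
    # Lossless path: bring the BL data into the MDCT domain.
    bl_in_mdct_domain = blocks["mapper_47"](bl_bins)
    summed = [b + e for b, e in zip(bl_in_mdct_domain, el_residual)]  # adder 48
    return blocks["inverse_integer_mdct_49"](summed)
```

Only the blocks of the selected path are looked up and called, so in this sketch the blocks of the other path can stay unconfigured, which mirrors the idea that they are disabled.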

In principle, more than one enhancement layer may be used, so that a hierarchical multi-layer structure exists. In that case, the invention may be applied to any two consecutive layers in the hierarchy: one of the two layers serves to predict the other, and a filter bank domain mapping is used for the prediction.

It should be noted that, although shown simply as adders 42, 48, more sophisticated superposition elements other than adders may be used, as will be apparent to those skilled in the art. All of these are within the spirit and scope of the invention.

While the fundamental novel features of the invention have been shown, described and pointed out as applied to preferred embodiments thereof, it will be understood that various omissions, substitutions and changes in the apparatus and method described, in the form and details of the devices disclosed, and in their operation, may be made by those skilled in the art without departing from the spirit of the invention. Although the invention has been described with regard to MP3, those skilled in the art will recognize that the method and devices described herein may be applied to various types of dual-layer audio decoding. It is expressly intended that all combinations of elements that perform substantially the same function in substantially the same way to achieve the same results are within the scope of the invention. Substitutions of elements from one described embodiment to another are also fully intended and contemplated.

It will be understood that the invention has been described purely by way of example, and that modifications of detail can be made without departing from the scope of the invention. Each feature disclosed in the description and (where appropriate) the claims and drawings may be provided independently or in any appropriate combination. Features may, where appropriate, be implemented in hardware, software, or a combination of the two. Connections may, where applicable, be implemented as wireless or wired connections, and not necessarily as direct or dedicated connections. Like reference numerals designate identical or corresponding elements throughout. Reference numerals appearing in the claims are by way of illustration only and have no limiting effect on the scope of the claims.

10 Base layer encoder
11 Base layer filter bank
12 Base layer entropy encoder
13 Integer MDCT
14 Subtraction
15 Entropy encoder
16 Restoration of frequency bins
17 Mapping to MDCT domain
21 Partial base layer decoder
22 Restoration of frequency bins
23 Mapping to MDCT domain
24 Entropy decoder
25 Addition
26 Inverse integer MDCT
41 Partial base layer decoder
42 Adder
43 Inverse base layer filter bank
44 EL entropy decoder
45 Inverse mapping
47 Mapping
48 Adder
49 Inverse integer MDCT
50 Switch

Claims

A method for decoding an audio signal having a base layer portion and an enhancement layer portion,
wherein the base layer portion and the enhancement layer portion are in different filter bank domains, and
wherein the enhancement layer portion was predicted from the base layer portion using a filter bank domain mapping and was subsequently entropy encoded,
the method comprising the steps of:
- partially decoding the encoded base layer portion;
- entropy decoding the enhancement layer portion;
- inverse mapping the entropy-decoded enhancement layer portion according to a simplified inversion of the filter bank domain mapping, the simplified inversion representing reduced processing accuracy;
- adding the inverse-mapped enhancement layer portion to the partially decoded base layer portion; and
- synthesis filtering the output signal of the adding step using an inverse base layer filter bank.

The method according to claim 1,
wherein the base layer portion comprises frequency bins, and
the step of partially decoding the base layer signal comprises restoring the frequency bins.

The method according to claim 1 or 2,
wherein the step of partially decoding the base layer signal does not perform an inverse transform to the time domain.

The method according to any one of claims 1 to 3,
wherein the step of synthesis filtering yields a signal that has the same frequency spectrum as the original signal but is not a bit-exact replica of the original signal.

The method according to any one of claims 1 to 4,
wherein the steps of inverse mapping the entropy-decoded enhancement layer portion, adding the inverse-mapped enhancement layer to the partially decoded base layer portion, and synthesis filtering are referred to as a simplified decoding mode,
the method further comprising the steps of:
- providing a lossless decoding mode in which the partially decoded base layer signal is mapped from the base layer filter bank domain to the MDCT domain, the resulting MDCT domain signal is added to the entropy-decoded enhancement layer signal to obtain all spectral frequency bins, and an inverse integer MDCT is applied to all the spectral frequency bins to obtain a losslessly decoded signal; and
- switching between the simplified decoding mode and the lossless decoding mode.

The method according to claim 5, further comprising the steps of:
- detecting a condition for enabling or disabling a power-saving mode; and
- upon detecting the condition, automatically switching to the simplified decoding mode if a condition for enabling the power-saving mode is detected, or to the lossless decoding mode if a condition for disabling the power-saving mode is detected.

The method according to claim 6,
wherein the condition for enabling the power-saving mode comprises that power is supplied from a battery or that only low power is available.

The method according to claim 6 or 7,
wherein the condition for enabling the power-saving mode comprises that the processing load of at least one processing element that executes one or more steps of the method exceeds a threshold.

The method according to any one of claims 5 to 8,
wherein the losslessly decoded signal of the lossless decoding mode is a bit-exact representation of the original signal at the encoder.

The method according to any one of claims 1 to 9,
wherein the reduced accuracy represents rounding of numerical values or a simplification of the filtering function.

The method according to any one of claims 1 to 10,
wherein the base layer signal is an audio signal in MP3 format.

A decoder for decoding an audio signal having a base layer portion and an enhancement layer portion,
wherein the base layer portion and the enhancement layer portion are in different filter bank domains, and
wherein the enhancement layer portion was predicted from the base layer portion using a filter bank domain mapping and was subsequently entropy encoded,
the decoder comprising:
- a partial decoder for partially decoding the base layer portion;
- an entropy decoder for entropy decoding the enhancement layer portion;
- a first mapping element for inverse mapping the entropy-decoded enhancement layer signal according to a simplified inversion of the filter bank domain mapping, the simplified inversion representing reduced processing accuracy;
- a first adder for adding the inverse-mapped enhancement layer to the partially decoded base layer; and
- a first synthesis filter for filtering the output signal of the first adder, the first synthesis filter operating as an inverse base layer filter bank.

The decoder according to claim 12,
wherein the base layer portion comprises frequency bins, and
the partial decoder restores the frequency bins.

The decoder according to claim 12,
wherein the partial decoder does not perform an inverse transform to the time domain.

The decoder according to any one of claims 12 to 14,
wherein the first synthesis filter yields a signal that has the same frequency spectrum as the original signal before encoding but is not a bit-exact replica of the original signal.

The decoder according to any one of claims 12 to 15,
wherein the mapping element, the adder and the synthesis filter represent a unit for simplified decoding,
the decoder further comprising:
- a second, lossless decoder providing a lossless decoding mode, having a second mapping element for mapping the partially decoded base layer signal from the filter bank domain to the MDCT domain, a second adding unit for adding the resulting MDCT domain signal to the entropy-decoded enhancement layer signal to obtain the original source frequency bins, and an inverse integer MDCT filter bank for filtering the original source frequency bins to obtain a losslessly decoded signal; and
- a switching element for switching between the unit for simplified decoding and the lossless decoder.

The decoder according to claim 16, further comprising:
- a detector for detecting a condition for enabling or disabling a power-saving mode; and
- a switch for automatically switching to the simplified decoding mode if a condition for enabling the power-saving mode is detected, or to the lossless decoding mode if a condition for disabling the power-saving mode is detected.

The decoder according to any one of claims 12 to 17,
wherein the base layer signal is an audio signal in MP3 format.

The decoder according to any one of claims 12 to 18,
wherein the reduced accuracy represents rounding of numerical values or a simplification of the filtering function.