JP2007271916A

JP2007271916A - Speech data compressing device and expanding device

Info

Publication number: JP2007271916A
Application number: JP2006097357A
Authority: JP
Inventors: Norio Suzuki; 典雄鈴木
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2006-03-31
Filing date: 2006-03-31
Publication date: 2007-10-18

Abstract

PROBLEM TO BE SOLVED: To provide a speech data compressing device and an expanding device that reduce deterioration of tone quality in the high frequency range when compressing with a linear predictive encoding compression system, an ADPCM compression system, or the like.

SOLUTION: A high-frequency attenuating circuit 1 attenuates the high frequency range of input audio data (PCM data); for example, an IIR shelving type filter is used. A selecting circuit 2 selects and outputs either the audio data output from the high-frequency attenuating circuit 1 after high-frequency attenuation or the audio data as it was before being input to the high-frequency attenuating circuit 1. The selecting operation of the selecting circuit 2 is performed by the user. A linear predictive encoding compressing circuit 3 compresses the audio data output from the selecting circuit 2 by the linear predictive encoding compression system and outputs the resulting data.

COPYRIGHT: (C)2008, JPO&INPIT

Description

The present invention relates to an audio data compression apparatus that performs compression using a predictive coding compression method, such as linear predictive coding or ADPCM (Adaptive Differential Pulse Code Modulation), and to an audio data decompression apparatus that decompresses the compressed audio data.

As is well known, linear predictive coding is a compression method that predicts the data of the next sample from the preceding several samples and records the difference (error) between the prediction and the actual sample; it gives better results than ADPCM compression. At high frequencies, however, the correlation between adjacent samples weakens and the prediction accuracy falls. A waveform with large high-frequency components therefore has a large error component, and when the number of quantization bits is limited the error (in other words, the noise) increases.
Patent Document 1 is known as a waveform signal forming apparatus that uses linear prediction. Patent Document 2 is known as a prior art document related to this application.
Japanese Patent No. 2897377; Japanese Patent Laid-Open No. 5-268114
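
The effect described above can be illustrated with a minimal sketch. It assumes a fixed second-order predictor and a 44.1 kHz sample rate, neither of which comes from the patent; a real linear predictive coder adapts its coefficients, so the numbers only show the trend that a fixed predictor leaves a much larger residual for a high-frequency tone than for a low-frequency one.

```python
import math

SAMPLE_RATE_HZ = 44100  # assumed sample rate, not stated in the source


def sine(freq_hz, sample_rate_hz, count):
    """Generate a unit-amplitude test tone."""
    return [math.sin(2.0 * math.pi * freq_hz * n / sample_rate_hz) for n in range(count)]


def prediction_residual(samples):
    """Residual of the fixed predictor x_hat[n] = 2*x[n-1] - x[n-2] (illustrative choice)."""
    residual = []
    for n in range(2, len(samples)):
        predicted = 2.0 * samples[n - 1] - samples[n - 2]
        residual.append(samples[n] - predicted)
    return residual


def rms(values):
    """Root-mean-square level of a non-empty sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values))


low_residual = rms(prediction_residual(sine(200.0, SAMPLE_RATE_HZ, 4096)))
high_residual = rms(prediction_residual(sine(8000.0, SAMPLE_RATE_HZ, 4096)))
print(f"residual RMS for a 200 Hz tone: {low_residual:.6f}")
print(f"residual RMS for an 8 kHz tone: {high_residual:.6f}")
```

The residual is what has to be quantized, so a larger residual at a fixed bit budget means more quantization noise.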

The present invention has been made in view of the above circumstances. Its object is to provide an audio data compression apparatus that performs compression using a predictive coding compression method and that can keep the deterioration of high-frequency sound quality small, together with a corresponding decompression apparatus.

The present invention has been made to solve the above problem. The invention according to claim 1 is an audio data compression apparatus comprising: high-frequency attenuation means for attenuating the high-frequency range of the audio data to be compressed; and compression means for compressing the audio data output from the high-frequency attenuation means by a predictive coding compression method.

The invention according to claim 2 is the audio data compression apparatus according to claim 1, further provided with selection means for selecting either the output data of the high-frequency attenuation means or the input data of the high-frequency attenuation means and outputting it to the compression means.
The invention according to claim 3 is the audio data compression apparatus according to claim 1 or claim 2, wherein the compression means compresses the audio data by a linear predictive coding compression method or an ADPCM compression method.

The invention according to claim 4 is an audio data decompression apparatus comprising: decompression means for decompressing compressed audio data using a decompression method corresponding to the method used at compression; and high-frequency restoration means for boosting the high-frequency range of the audio data output from the decompression means to restore the original audio data.

The invention according to claim 5 is the audio data decompression apparatus according to claim 4, further provided with selection means for selecting and outputting either the output data of the high-frequency restoration means or the input data of the high-frequency restoration means.

The invention according to claim 6 is the audio data decompression apparatus according to claim 4 or claim 5, wherein the decompression means decompresses the audio data based on a decompression method corresponding to a linear predictive coding compression method or an ADPCM compression method.

According to the present invention, in an audio data compression apparatus and an audio data decompression apparatus that compress data using a coding compression method such as linear predictive coding or ADPCM, the deterioration of high-frequency sound quality can be kept small.

Embodiments of the present invention are described below with reference to the drawings. FIG. 1(a) is a block diagram showing the configuration of the main part of an audio data compression apparatus according to an embodiment of the present invention. In this figure, reference numeral 1 denotes a high-frequency attenuation circuit that attenuates the high-frequency range of the input audio data (PCM data); an IIR shelving filter, for example, is used. In this specification, audio data includes not only speech but also data obtained by converting musical sound into digital data. Curve L1 in FIG. 2 shows an example of the characteristic of the high-frequency attenuation circuit 1: as the figure shows, the circuit attenuates the band of the input audio data at and above 1 kHz by, for example, 12 dB.
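
As a rough illustration of such a shelving characteristic, the following sketch builds a second-order IIR high-shelf filter using the widely used Audio EQ Cookbook formulas. The patent does not specify the filter design, so the biquad form, the 44.1 kHz sample rate, the slope parameter, and all function names are assumptions. In this particular design the 1 kHz corner is the midpoint of the shelf (about -6 dB), and the full 12 dB cut is reached well above it.

```python
import cmath
import math

SAMPLE_RATE_HZ = 44100  # assumed sample rate, not stated in the source


def high_shelf_coefficients(gain_db, corner_hz, sample_rate_hz, slope=1.0):
    """Biquad high-shelf coefficients (Audio EQ Cookbook form), normalized so a[0] == 1."""
    amp = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * math.pi * corner_hz / sample_rate_hz
    cos_w0 = math.cos(w0)
    alpha = math.sin(w0) / 2.0 * math.sqrt((amp + 1.0 / amp) * (1.0 / slope - 1.0) + 2.0)
    k = 2.0 * math.sqrt(amp) * alpha
    b0 = amp * ((amp + 1.0) + (amp - 1.0) * cos_w0 + k)
    b1 = -2.0 * amp * ((amp - 1.0) + (amp + 1.0) * cos_w0)
    b2 = amp * ((amp + 1.0) + (amp - 1.0) * cos_w0 - k)
    a0 = (amp + 1.0) - (amp - 1.0) * cos_w0 + k
    a1 = 2.0 * ((amp - 1.0) - (amp + 1.0) * cos_w0)
    a2 = (amp + 1.0) - (amp - 1.0) * cos_w0 - k
    return (b0 / a0, b1 / a0, b2 / a0), (1.0, a1 / a0, a2 / a0)


def gain_db_at(freq_hz, sample_rate_hz, b, a):
    """Magnitude response in dB, evaluated on the unit circle."""
    z1 = cmath.exp(-1j * 2.0 * math.pi * freq_hz / sample_rate_hz)
    numerator = b[0] + b[1] * z1 + b[2] * z1 * z1
    denominator = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20.0 * math.log10(abs(numerator / denominator))


def apply_biquad(samples, b, a):
    """Direct form I: y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2]."""
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b[0] * x + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out


cut_b, cut_a = high_shelf_coefficients(-12.0, 1000.0, SAMPLE_RATE_HZ)
for freq in (100.0, 1000.0, 10000.0):
    print(f"{freq:8.0f} Hz: {gain_db_at(freq, SAMPLE_RATE_HZ, cut_b, cut_a):6.2f} dB")
```

A shelf, unlike a low-pass filter, leaves a finite gain in the high band, which is what makes the later restoration step possible.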

An FIR filter may be used instead of the IIR shelving filter. The IIR filter, however, has the advantage of lower latency than the FIR filter, and also the advantage of a smaller circuit scale.

Reference numeral 2 denotes a selection circuit that selects and outputs either the high-frequency-attenuated audio data output from the high-frequency attenuation circuit 1 or the audio data as it was before being input to the high-frequency attenuation circuit 1. The selection is made by the user. That is, during data compression, a user who feels that the high-frequency range of the uncompressed audio data contains many noise components switches the selection circuit 2 to the output side of the high-frequency attenuation circuit 1; otherwise, the user switches it to the input side of the high-frequency attenuation circuit 1.

Reference numeral 3 denotes a linear predictive coding compression circuit that compresses the audio data output from the selection circuit 2 by the linear predictive coding compression method and outputs the result. This circuit is conventionally known.

FIG. 1(b) is a block diagram showing the configuration of a data decompression circuit that decompresses audio data compressed by the data compression circuit described above and returns it to the audio data as it was before compression. In this figure, reference numeral 5 denotes a linear predictive coding decompression circuit that decompresses the compressed audio data based on the linear predictive coding method and outputs the result. The linear predictive coding decompression circuit 5 is conventionally known.

Reference numeral 6 denotes a high-frequency restoration circuit that restores the high-frequency range of the audio data output from the linear predictive coding decompression circuit 5 to its state before attenuation and outputs the result. An IIR shelving filter, for example, is used for the high-frequency restoration circuit 6. Curve L2 in FIG. 2 shows an example of its characteristic: as the figure shows, the high-frequency restoration circuit 6 raises the level of the band of the input audio data at and above 1 kHz by, for example, 12 dB, matching the amount of attenuation applied at compression.
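
One way to see how a restoration filter can undo the attenuation is to build the boost as the exact inverse of the cut. The sketch below uses a simplified first-order shelf (a one-pole low-pass plus a scaled remainder) rather than whatever filter the patent's circuits actually use, so its high-band gain is only approximately -12 dB; the design, the 44.1 kHz sample rate, and the function names are all assumptions. The codec between the two filters is omitted, so the round trip here is lossless.

```python
import math

SAMPLE_RATE_HZ = 44100  # assumed sample rate, not stated in the source


def first_order_high_shelf(high_gain, corner_hz, sample_rate_hz):
    """Coefficients (b0, b1, a1) of y[n] = b0*x[n] + b1*x[n-1] - a1*y[n-1]."""
    c = 1.0 - math.exp(-2.0 * math.pi * corner_hz / sample_rate_hz)  # one-pole smoothing factor
    b0 = high_gain + c * (1.0 - high_gain)
    b1 = -high_gain * (1.0 - c)
    a1 = -(1.0 - c)
    return b0, b1, a1


def invert(coefficients):
    """Swap numerator and denominator so that the cascade with the original is unity."""
    b0, b1, a1 = coefficients
    return 1.0 / b0, a1 / b0, b1 / b0


def run_filter(samples, coefficients):
    """Apply the first-order difference equation with zero initial state."""
    b0, b1, a1 = coefficients
    out = []
    x_prev = y_prev = 0.0
    for x in samples:
        y = b0 * x + b1 * x_prev - a1 * y_prev
        out.append(y)
        x_prev, y_prev = x, y
    return out


cut = first_order_high_shelf(10.0 ** (-12.0 / 20.0), 1000.0, SAMPLE_RATE_HZ)
boost = invert(cut)

original = [math.sin(0.3 * n) + 0.5 * math.sin(1.7 * n) for n in range(2000)]
attenuated = run_filter(original, cut)      # compression side (circuit 1)
restored = run_filter(attenuated, boost)    # decompression side (circuit 6)
worst_error = max(abs(p - q) for p, q in zip(original, restored))
print(f"worst round-trip error: {worst_error:.3e}")
```

The inverse is stable here because the zero of the cut filter lies inside the unit circle; a filter without that property could not be inverted this way.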

An FIR filter may be used instead of the IIR shelving filter. When high-frequency attenuation is performed at compression using an IIR shelving filter or an FIR filter and the high-frequency boost at decompression uses a filter of the same type, there is the advantage that both the phase characteristic and the frequency characteristic can be made linear.

Reference numeral 7 denotes a selection circuit that selects and outputs either the high-frequency-restored audio data output from the high-frequency restoration circuit 6 or the audio data as it was before being input to the high-frequency restoration circuit 6. The selection operation of the selection circuit 7 is performed by the user. That is, if high-frequency attenuation was performed during data compression, the selection circuit 7 is switched to the output side of the high-frequency restoration circuit 6; if it was not, the selection circuit 7 is switched to the input side of the high-frequency restoration circuit 6. This switching may also be performed automatically, in which case data for switching the selection circuit 7 is added to the compressed data.
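
The automatic switching could be sketched as follows. The patent only says that switching data is added to the compressed data; the frame layout used here (a one-byte flag followed by a four-byte payload length) and the function names are hypothetical.

```python
import struct

HEADER_FORMAT = "<BI"  # hypothetical layout: 1-byte attenuation flag, 4-byte payload length


def pack_frame(compressed_payload: bytes, high_band_attenuated: bool) -> bytes:
    """Prefix the compressed data with the flag that will drive selection circuit 7."""
    flag = 1 if high_band_attenuated else 0
    return struct.pack(HEADER_FORMAT, flag, len(compressed_payload)) + compressed_payload


def unpack_frame(frame: bytes):
    """Return (high_band_attenuated, compressed_payload)."""
    header_size = struct.calcsize(HEADER_FORMAT)
    flag, length = struct.unpack(HEADER_FORMAT, frame[:header_size])
    return bool(flag), frame[header_size:header_size + length]


def select_output(restored, unrestored, high_band_attenuated):
    """Role of selection circuit 7: use the restored path only if the encoder attenuated."""
    return restored if high_band_attenuated else unrestored


frame = pack_frame(b"\x10\x20\x30", high_band_attenuated=True)
attenuated_flag, payload = unpack_frame(frame)
print(attenuated_flag, payload)
```

Carrying the flag with the data removes the need for the user on the decompression side to know how the data was compressed.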

Thus, according to the above embodiment, when the high-frequency range of the uncompressed audio data contains a lot of noise, the high-frequency range is attenuated before compression. Attenuating the high-frequency range in advance increases the correlation between adjacent samples, which raises the accuracy of the linear prediction, reduces the error, and as a result improves the reproduced sound quality. Furthermore, if the filter used at compression and the filter used at decompression are chosen appropriately, the change in phase characteristic caused by the filters can be cancelled out. Also, because the selection circuits 2 and 7 are provided, the high-frequency attenuation circuit 1 and the high-frequency restoration circuit 6 can be bypassed when high-frequency attenuation is not needed.
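
The claim that attenuating the high-frequency range raises the correlation between adjacent samples can be checked numerically. This is a minimal sketch under assumed conditions: a 200 Hz tone with added white noise as the test signal, a 44.1 kHz sample rate, and a simple one-pole split as the attenuator, none of which are taken from the patent.

```python
import math
import random

SAMPLE_RATE_HZ = 44100  # assumed sample rate, not stated in the source


def lag1_correlation(samples):
    """Normalized correlation between each sample and its predecessor."""
    mean = sum(samples) / len(samples)
    centered = [s - mean for s in samples]
    energy = sum(c * c for c in centered)
    cross = sum(centered[n] * centered[n - 1] for n in range(1, len(centered)))
    return cross / energy if energy else 0.0


def attenuate_high_band(samples, high_gain, smoothing):
    """Split into a one-pole low-pass part and the remainder, then scale the remainder."""
    low = 0.0
    out = []
    for x in samples:
        low += smoothing * (x - low)
        out.append(low + high_gain * (x - low))
    return out


rng = random.Random(0)  # fixed seed so the run is repeatable
noisy = [
    math.sin(2.0 * math.pi * 200.0 * n / SAMPLE_RATE_HZ) + 0.5 * rng.gauss(0.0, 1.0)
    for n in range(8192)
]
smoothing = 1.0 - math.exp(-2.0 * math.pi * 1000.0 / SAMPLE_RATE_HZ)
attenuated = attenuate_high_band(noisy, 10.0 ** (-12.0 / 20.0), smoothing)

before = lag1_correlation(noisy)
after = lag1_correlation(attenuated)
print(f"lag-1 correlation before attenuation: {before:.3f}")
print(f"lag-1 correlation after attenuation:  {after:.3f}")
```

A higher lag-1 correlation means each sample is easier to predict from the previous one, which is the property a predictive coder relies on.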

Although the above embodiment uses the linear predictive coding compression method, the present invention is also applicable to audio data compression and decompression apparatuses that compress data using the ADPCM compression method or other coding compression methods.
The high-frequency signal level may also be detected, and the attenuation rate increased when the level is high. In this case, the attenuation rate is made variable from 0 (pass-through) up to full cutoff (in which case the filter becomes a low-pass filter). The attenuation rate is transmitted to the decompression side together with the data.
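
A level-dependent attenuation rate might be derived along the following lines. The patent does not say how the high-frequency level is detected or how the rate is encoded, so the first-difference level estimate, the two thresholds, the 8-bit side-information code, and the 44.1 kHz sample rate are all assumptions made for illustration.

```python
import math

SAMPLE_RATE_HZ = 44100  # assumed sample rate, not stated in the source


def rms(values):
    """Root-mean-square level; 0.0 for an empty sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values)) if values else 0.0


def high_band_ratio(samples):
    """Crude high-band level estimate: RMS of the first difference over RMS of the signal."""
    signal_rms = rms(samples)
    if len(samples) < 2 or signal_rms == 0.0:
        return 0.0
    differences = [samples[n] - samples[n - 1] for n in range(1, len(samples))]
    return rms(differences) / signal_rms


def attenuation_rate(samples, low_threshold=0.2, high_threshold=1.0):
    """Map the level estimate to a rate in [0, 1]: 0 = pass-through, 1 = full cutoff."""
    ratio = high_band_ratio(samples)
    rate = (ratio - low_threshold) / (high_threshold - low_threshold)
    return min(1.0, max(0.0, rate))


def encode_rate(rate):
    """Quantize the rate to one byte of side information for the decompression side."""
    return round(rate * 255)


def decode_rate(code):
    """Recover the rate on the decompression side."""
    return code / 255.0


def tone(freq_hz, count=4096):
    return [math.sin(2.0 * math.pi * freq_hz * n / SAMPLE_RATE_HZ) for n in range(count)]


for freq in (200.0, 4000.0, 10000.0):
    rate = attenuation_rate(tone(freq))
    print(f"{freq:7.0f} Hz tone: rate {rate:.2f}, side-info code {encode_rate(rate)}")
```

Note that a rate of exactly 1 removes the high band entirely, so in that case the decompression side has nothing left to restore.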

The present invention is used in apparatuses that compress and decompress audio data.

FIG. 1 is a block diagram showing the configuration of the audio data compression apparatus and the audio data decompression apparatus according to an embodiment of the present invention. FIG. 2 is a waveform diagram for explaining the operation of the audio data compression apparatus and the audio data decompression apparatus according to the same embodiment.

Explanation of symbols

1: high-frequency attenuation circuit; 2, 7: selection circuits; 3: linear predictive coding compression circuit; 5: linear predictive coding decompression circuit; 6: high-frequency restoration circuit.

Claims

1. An audio data compression apparatus comprising:
high-frequency attenuation means for attenuating the high-frequency range of audio data to be compressed; and
compression means for compressing the audio data output from the high-frequency attenuation means by a predictive coding compression method.

2. The audio data compression apparatus according to claim 1, further comprising selection means for selecting either the output data of the high-frequency attenuation means or the input data of the high-frequency attenuation means and outputting the selected data to the compression means.

3. The audio data compression apparatus according to claim 1 or claim 2, wherein the compression means compresses the audio data by a linear predictive coding compression method or an ADPCM compression method.

4. An audio data decompression apparatus comprising:
decompression means for decompressing compressed audio data using a decompression method corresponding to the method used at compression; and
high-frequency restoration means for boosting the high-frequency range of the audio data output from the decompression means to restore the original audio data.

5. The audio data decompression apparatus according to claim 4, further comprising selection means for selecting and outputting either the output data of the high-frequency restoration means or the input data of the high-frequency restoration means.

6. The audio data decompression apparatus according to claim 4 or claim 5, wherein the decompression means decompresses the audio data based on a decompression method corresponding to a linear predictive coding compression method or an ADPCM compression method.