JP2007271686A

JP2007271686A - Audio signal processor

Info

Publication number: JP2007271686A
Application number: JP2006093944A
Authority: JP
Inventors: Tsukasa Suenaga; 司末永; Kenichi Yamauchi; 健一山内; Noriaki Shime; 範明七五三
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2006-03-30
Filing date: 2006-03-30
Publication date: 2007-10-18
Anticipated expiration: 2026-03-30
Also published as: JP4175376B2

Abstract

PROBLEM TO BE SOLVED: To appropriately improve the sound quality of a compressed audio signal whose information amount has been reduced by exploiting the masking effect.

SOLUTION: In an audio signal such as music, many of the signal components (maskees) omitted by compression are components that were previously maskers and have since attenuated. These components, formerly maskers and now maskees, are therefore reintroduced into the present signal so that the original audio signal is restored in a pseudo manner. Because human auditory masking characteristics depend on frequency, the audio signal is divided into partial band signals of a plurality of frequency bands, and reverberation whose characteristic corresponds to the masking characteristic of each band is applied.

COPYRIGHT: (C)2008, JPO&INPIT

Description

The present invention relates to an audio signal processing device, an audio signal processing method, and an audio signal processing program that compensate for the deterioration in sound quality that occurs when a compressed digital audio signal is decoded.

In recent years, audio signal compression methods such as MP3, ATRAC, and AAC have been developed and put into practical use. Most of these methods exploit the masking effect of human hearing: they omit those components of the raw audio signal that the listener would not hear because of masking, thereby reducing the amount of information while limiting the audible loss of sound quality.

FIG. 1 illustrates the masking effect in human hearing. FIGS. 1(A) and 1(B) show simultaneous masking, and FIG. 1(C) shows temporal masking. Simultaneous masking is the effect whereby, when a high-level signal component (masker) is present at a certain frequency in the audio signal, signal components at nearby frequencies (maskees) cannot be heard by the listener. As shown in FIG. 1(A), the higher the level of the masker, the higher the level of the maskees that are masked and the wider the affected frequency range. As shown in FIG. 1(B), even when maskers have the same level (on the equal-loudness curve), the spread along the frequency axis depends on the masker frequency: below 1 kHz the affected range is about 100 Hz, and above 1 kHz the range grows as the frequency increases.

As shown in FIG. 1(C), after a high-level signal component has sounded, components at nearby frequencies remain inaudible for a short time. In addition to simultaneous masking and (forward) temporal masking, some compression methods also exploit backward temporal masking, which reaches back in time, and harmonic masking.

However, because these audio signal compression methods reduce the amount of information irreversibly, some deterioration in sound quality is unavoidable, and the deterioration is said to be especially pronounced in the high-frequency range. Techniques have therefore been developed that compensate for the deterioration by emphasizing the high-frequency range of the demodulated (decoded) audio signal (see, for example, Patent Document 1).

Patent Document 1: JP 2003-140696 A

However, the device of Patent Document 1 emphasizes only the high-frequency range and cannot compensate for the deterioration in sound quality in other frequency bands.

Moreover, because that device merely emphasizes the high-frequency range to generate new frequency components, it does not reproduce the signal components that actually existed and were omitted. It improves the perceived sound quality but does not restore the original sound.

The present invention provides an audio signal processing device, an audio signal processing method, and an audio signal processing program that can appropriately improve the sound quality of a compressed audio signal whose information amount has been reduced by exploiting the masking effect.

The present invention is an audio signal processing device comprising: a band dividing unit that divides a demodulated audio signal, obtained by demodulating a compressed audio signal, into partial band signals of a plurality of frequency bands; reverberation filters, one provided for each partial band signal, each of which generates, based on its partial band signal, a reverberation signal that imitates the temporal change of the auditory masking characteristic in the corresponding frequency band; and an adding unit that adds the reverberation signals generated for the respective frequency bands by the reverberation filters to the demodulated audio signal.

The present invention is also an audio signal processing device comprising: a high-band separation unit that extracts a high-range partial band signal from a demodulated audio signal obtained by demodulating a compressed audio signal; a reverberation filter that generates, based on the partial band signal, a reverberation signal that imitates the temporal change of the auditory masking characteristic in the high range; and an adding unit that adds the reverberation signal generated by the reverberation filter to the demodulated audio signal.

The present invention is also an audio signal processing device further comprising an amplifying unit that amplifies the reverberation signal with a gain corresponding to the level masked by the demodulated audio signal.

The present invention is also an audio signal processing method comprising: dividing a demodulated audio signal, obtained by demodulating a compressed audio signal, into partial band signals of a plurality of frequency bands; generating, based on each partial band signal, a reverberation signal that imitates the temporal change of the auditory masking characteristic in the corresponding frequency band; and adding the reverberation signal of each frequency band to the demodulated audio signal.

The present invention is also an audio signal processing method comprising: extracting a high-range partial band signal from a demodulated audio signal obtained by demodulating a compressed audio signal; generating, based on the partial band signal, a reverberation signal that imitates the temporal change of the auditory masking characteristic in the high range; and adding the reverberation signal to the demodulated audio signal.

The present invention is also an audio signal processing program that causes a digital signal processing device to execute: a process of dividing a demodulated audio signal, obtained by demodulating a compressed audio signal, into partial band signals of a plurality of frequency bands; a process of generating, based on each partial band signal, a reverberation signal that imitates the temporal change of the auditory masking characteristic in the corresponding frequency band; and a process of adding the reverberation signal of each frequency band to the demodulated audio signal.

The present invention is also an audio signal processing program that causes a digital signal processing device to execute: a process of extracting a high-range partial band signal from a demodulated audio signal obtained by demodulating a compressed audio signal; a process of generating, based on the partial band signal, a reverberation signal that imitates the temporal change of the auditory masking characteristic in the high range; and a process of adding the reverberation signal to the demodulated audio signal.

In an audio signal such as music, many of the signal components omitted by compression are musical sounds that were sounding at a high level immediately beforehand and have since attenuated. For example, a decaying sound such as that of a percussion instrument attenuates over time, so a component that is initially a masker turns into a maskee partway through. Likewise, even if the hall in which the music was recorded is reverberant, the direct sound and early reflections are reproduced but the reverberation becomes a maskee and is omitted. By adding reverberation to the demodulated signal obtained by demodulating the compressed audio signal, signal components that were formerly maskers but are now maskees can be restored, and the original audio signal can be restored in a pseudo manner.

A compressed audio signal omits signal components on the basis of human auditory masking characteristics, and these characteristics differ with frequency. In the present invention, therefore, the restored (demodulated) audio signal is divided into partial band signals of a plurality of frequency bands, and reverberation whose characteristic matches the masking characteristic of each frequency band is applied.

Because the deterioration in sound quality caused by signal compression is most noticeable in the high-frequency range, sound quality can be improved in a simple way by adding reverberation to the high-frequency range only.

In addition, the characteristics of the frequency components that are actually masked vary with the level of the frequency components (maskers) of the demodulated audio signal. The gain with which the reverberation signal is amplified is therefore adjusted according to the demodulated audio signal, so that the added reverberation is close to the signal components that were actually masked.

According to the present invention, the audio signal components omitted on the basis of the masking effect of human hearing can be reproduced in a pseudo manner, so the sound quality degraded by masking-based compression can be improved more effectively.

Embodiments of the present invention will be described with reference to the drawings. FIG. 2 is a block diagram of an audio signal processing device according to an embodiment of the present invention. When reproducing an audio signal that was compressed by omitting the components made inaudible by the masking effect shown in FIG. 1, this device retains the immediately preceding (earlier in time) audio signal as reverberation and thereby restores, in a pseudo manner, the signal components omitted by the compression.

The reasoning is as follows. When the audio signal is a musical sound such as a music performance, an omitted signal component is likely to be one that, immediately beforehand, had a high level and was sounding without being omitted, and that has since attenuated or been played decrescendo. Adding the past signal at a low level as reverberation therefore makes it possible to restore these lost signal components in a pseudo manner.

As shown in FIG. 2, the audio signal processing device comprises: a decoder 10 that decompresses (decodes) the compressed audio signal; a band division filter 11 (the band dividing unit) that divides the decoded audio signal into three frequency bands, high (H), middle (M), and low (L); reverberation filters 12 (12H, 12M, 12L) that generate, from the signal of each band, a reverberation signal with a characteristic specific to that band; amplifiers 13 (13H, 13M, 13L) that amplify the generated reverberation signal of each band at its own level; a band synthesis unit 15 that combines the reverberation signals of the high, middle, and low bands; and an adder 16 that combines the reverberation signal output from the band synthesis unit 15 with the demodulated audio signal decoded by the decoder 10.

The audio signal processing device also includes an analysis unit 14 that sets the gains of the amplifiers 13 (13H, 13M, 13L) according to the frequency characteristics of the audio signal decoded by the decoder 10. The analysis unit 14 may also set the filter characteristics of the reverberation filters 12 (12H, 12M, 12L) according to the frequency characteristics of the decoded audio signal.

The block diagram of FIG. 2 is a functional representation. When the audio signal processing device is built into, for example, an audio device or AV equipment, it is implemented with hardware such as a DSP.

The decoder 10 decodes an audio signal compressed with a method such as MP3, ATRAC, or AAC into an ordinary PCM audio signal. The decoded audio signal is input to the band dividing unit 11, the adder 16, and the analysis unit 14. The decoder also outputs to the analysis unit 14 the frequency and level information of the reproduced signal components contained in the compressed audio signal before decoding (that is, the components that remained without being omitted).

The band dividing unit 11 divides the input audio signal into a high band (H), a middle band (M), and a low band (L) and inputs each to its own reverberation filter 12 (12H, 12M, 12L). The bands are divided so that the low band is 200 Hz and below, the middle band is 200 Hz to 2 kHz, and the high band is 2 kHz and above.
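The three-band division can be sketched as follows. This is only an illustration under stated assumptions: the description does not specify the filter type, so first-order low-pass filters are used here, and the names `one_pole_lowpass` and `split_three_bands` are hypothetical. Only the 200 Hz and 2 kHz band edges come from the description.

```python
import math

def one_pole_lowpass(x, cutoff_hz, sample_rate):
    """First-order low-pass filter (assumed filter type, chosen for brevity)."""
    a = math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    y, out = 0.0, []
    for s in x:
        y = (1.0 - a) * s + a * y
        out.append(y)
    return out

def split_three_bands(x, sample_rate, low_edge=200.0, high_edge=2000.0):
    """Split x into low / middle / high bands that sum back to x."""
    lp_low = one_pole_lowpass(x, low_edge, sample_rate)
    lp_high = one_pole_lowpass(x, high_edge, sample_rate)
    low = lp_low                                         # roughly 200 Hz and below
    mid = [h - l for h, l in zip(lp_high, lp_low)]       # roughly 200 Hz to 2 kHz
    high = [s - h for s, h in zip(x, lp_high)]           # roughly 2 kHz and above
    return low, mid, high
```

Because the middle and high bands are formed by subtraction, the three outputs add up to the input sample by sample. A practical device would likely use steeper crossover filters.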

The frequency allocation of the bands is not limited to the above, nor is the number of bands limited to the three bands of high, middle, and low.

The reverberation filters 12H, 12M, and 12L apply reverberation to the audio signals input from the band dividing unit 11, each with a filter characteristic suited to its band, and generate reverberation signals.

The filter characteristics set in the reverberation filters 12H, 12M, and 12L are described with reference to FIG. 3. FIGS. 3(A), 3(B), and 3(C) show the filter characteristics set in the high-band, middle-band, and low-band reverberation filters, respectively.

In the high-frequency range, human auditory masking is such that immediately after a loud sound even high-level signal components are masked, but the masking level decays rapidly. The high-band reverberation filter 12H is therefore given filter coefficients with a high initial level and fast decay, imitating this characteristic. In the low-frequency range, the level of the signal components masked immediately after a loud sound is not high, but the masking level then decays slowly and persists for a long time. The low-band reverberation filter 12L is therefore given filter coefficients with a low initial level and slow decay. Masking in the middle range is intermediate between the high and low ranges, so the middle-band reverberation filter 12M is given filter coefficients with characteristics intermediate between those of the high-band filter 12H and the low-band filter 12L.
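The qualitative relationship between the three sets of filter coefficients can be illustrated with exponentially decaying tap envelopes. This is a minimal sketch: the description gives no numerical values, so the initial levels and decay times in `BAND_PROFILES`, the exponential shape, and the function name `reverb_taps` are all assumptions made for illustration.

```python
# Hypothetical per-band settings: (initial level, time in seconds to decay by 60 dB).
# Only the ordering (high: strong and fast, low: weak and slow) follows the description.
BAND_PROFILES = {
    "high": (0.30, 0.05),
    "mid": (0.20, 0.10),
    "low": (0.10, 0.20),
}

def reverb_taps(initial_level, decay_time_s, sample_rate, floor_db=-60.0):
    """Return a dense, exponentially decaying tap envelope."""
    n_taps = max(1, int(decay_time_s * sample_rate))
    # Per-sample ratio chosen so the envelope falls by floor_db over n_taps samples.
    ratio = 10.0 ** (floor_db / 20.0 / n_taps)
    return [initial_level * ratio ** n for n in range(n_taps)]

taps_by_band = {
    name: reverb_taps(level, decay, 8000)
    for name, (level, decay) in BAND_PROFILES.items()
}
```

With these assumed values the high-band envelope starts highest and is shortest, the low-band envelope starts lowest and is longest, and the middle band lies between them.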

Each of the reverberation filters 12H, 12M, and 12L outputs only the reverberation component and not the direct sound, but the filters may instead be configured to output the direct sound as well.

These filter coefficients are fixed, but the characteristics may be changed according to the frequency characteristics of the decoded audio signal (based on the analysis by the analysis unit 14). For example, when a particularly high-level component is output in a band, the filter may be given a frequency characteristic that emphasizes the frequencies around that component.

Because the reverberation filter 12 is intended to restore, in a pseudo manner, the frequency components omitted by signal compression, its reverberation characteristic does not need to be divided in time into a delay (imitating the size of the space), early reflections (imitating direct reflections), and reverberation (imitating diffuse reflections), as in a general reverberation effect device that simulates the acoustics of a space such as a hall. It is sufficient to set the filter to output the reverberation signal with densely spaced taps from the start.
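A dense-tap filter that outputs only the reverberation component might look like the following sketch. The function name `reverb_only` is hypothetical, and the one-sample offset used to keep the direct sound out of the output is an assumption; the description only states that the direct sound is not output and that the taps are dense from the start.

```python
def reverb_only(x, taps):
    """Dense FIR reverberation: tap k weights the input sample k + 1 steps in the past.

    Starting at a lag of one sample (an assumed choice) keeps the direct
    sound out of the output while leaving no pre-delay gap.
    """
    out = [0.0] * len(x)
    for n in range(len(x)):
        acc = 0.0
        for k, h in enumerate(taps):
            idx = n - 1 - k
            if idx < 0:
                break  # no earlier input samples exist
            acc += h * x[idx]
        out[n] = acc
    return out
```

For a unit impulse at sample 0, the output is zero at sample 0 and then reproduces the taps one after another, with no gap between the input and the first tap.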

Returning to FIG. 2, the reverberation signals output from the reverberation filters 12H, 12M, and 12L are input to the amplifiers 13H, 13M, and 13L and amplified with the gain set in each amplifier. The analysis unit 14 sets these gains according to the frequency characteristics of the audio signal decoded by the decoder 10.

The method of determining the gains of the amplifiers 13H, 13M, and 13L is described with reference to FIG. 4. FIG. 4 shows the peak frequency components of the decoded audio signal at a given instant and the simultaneous masking ranges of those components. As illustrated, many low-level signal components are masked by the peak frequency components, and the higher the peak level, the higher the level of the signals that are masked. The maximum peak level is therefore detected in each frequency band (high, middle, low), and a value correlated with that level is set as the gain of the amplifier for that band.
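The gain determination can be sketched as below. The description only says that the gain is "a value correlated with" the maximum peak level, so the linear mapping and the `scale` parameter are assumptions, as are the names `magnitude_spectrum` and `band_gains`. A naive DFT is used so that the sketch needs no external library; an FFT would normally take its place.

```python
import cmath
import math

# Band edges from the description; the half-open intervals are an assumed convention.
BANDS_HZ = {
    "low": (0.0, 200.0),
    "mid": (200.0, 2000.0),
    "high": (2000.0, float("inf")),
}

def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2, scaled to sine amplitude
    (the scaling is exact for bins other than DC and Nyquist)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(s * cmath.exp(-2j * math.pi * k * i / n) for i, s in enumerate(frame))
        mags.append(abs(acc) * 2.0 / n)
    return mags

def band_gains(frame, sample_rate, scale=0.1):
    """Set each band's gain proportional to its largest spectral peak (assumed mapping)."""
    mags = magnitude_spectrum(frame)
    bin_hz = sample_rate / len(frame)
    gains = {}
    for name, (lo, hi) in BANDS_HZ.items():
        peak = max((m for k, m in enumerate(mags) if lo <= k * bin_hz < hi), default=0.0)
        gains[name] = scale * peak
    return gains
```

For a frame containing a single 1 kHz sine, only the middle-band gain is appreciably non-zero, so reverberation would be added mainly in the band where masking by that component is expected.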

The peak frequency components of the audio signal may be obtained by applying an FFT to the decoded audio signal, or the frequency component information contained in the compressed signal before decoding may be supplied by the decoder 10.

In the above, a value correlated with the maximum peak level of each band is set as the amplifier gain for that band, but the gain may instead be set on the basis of, for example, the average of the peak levels in each band. In this embodiment the gains are set with only simultaneous masking taken into account, but they may also be set with temporal masking taken into account, based on the frequency components of the past audio signal.

Returning to FIG. 2, the band synthesis unit 15 combines the reverberation signals generated for the high, middle, and low bands into a full-band reverberation signal and outputs it to the adder 16. The adder 16 combines this reverberation signal with the demodulated audio signal output by the decoder 10 and outputs the result to a downstream device such as an equalizer or a D/A converter.

The demodulated audio signal output from the decoder 10 and the reverberation signal, which is output only after processing by the units from the band dividing unit 11 to the band synthesis unit 15, become available to the adder 16 at different times. The demodulated audio signal is therefore buffered before being input to the adder 16 so that the two signals are combined with the appropriate timing.
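The buffering step can be illustrated with a simple delay line. This sketch assumes the latency of the reverberation path is known as a whole number of samples (`path_delay`); the function name `mix_with_alignment` and the zero-padding behavior are hypothetical choices, not taken from the description.

```python
def mix_with_alignment(decoded, reverb, path_delay):
    """Delay the decoded signal by path_delay samples, then add the reverberation signal.

    path_delay stands for the assumed latency of the reverberation path.
    """
    delayed = [0.0] * path_delay + list(decoded)
    length = max(len(delayed), len(reverb))
    delayed += [0.0] * (length - len(delayed))
    padded_reverb = list(reverb) + [0.0] * (length - len(reverb))
    return [d + r for d, r in zip(delayed, padded_reverb)]
```

In a streaming implementation the same effect would typically be obtained with a fixed-length FIFO on the decoded signal rather than by padding whole arrays.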

FIG. 5 is a block diagram showing the configuration of an audio signal processing device according to a second embodiment of the present invention. Parts with the same configuration as in FIG. 2 are given the same reference numerals and their description is omitted. This device focuses on the fact that, in signal compression, the deterioration in sound quality caused by omitting high-band signal components is especially pronounced, and it is designed to improve sound quality efficiently with a simple configuration. To that end, it omits the middle-band and low-band processing paths of the device of FIG. 2 and includes only the high-band path, namely the reverberation filter 12H and the amplifier 13H. The high-band signal component extracted by a high-pass filter (HPF) 11′ is input to the reverberation filter 12H. The high-band reverberation signal output from the amplifier 13H is combined with the demodulated audio signal in the adder 16 directly, without passing through a band synthesis unit.
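The signal flow of this second embodiment (HPF 11′, reverberation filter 12H, amplifier 13H, adder 16) can be sketched end to end as follows. The first-order high-pass filter, the 2 kHz default cutoff, the one-sample offset that excludes the direct sound, and the names `high_pass` and `enhance_high_band` are assumptions for illustration; timing alignment is ignored for brevity.

```python
import math

def high_pass(x, cutoff_hz, sample_rate):
    """First-order high-pass: input minus a one-pole low-pass (assumed filter type)."""
    a = math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    lp, out = 0.0, []
    for s in x:
        lp = (1.0 - a) * s + a * lp
        out.append(s - lp)
    return out

def enhance_high_band(decoded, sample_rate, taps, gain, cutoff_hz=2000.0):
    """Add gain-scaled high-band reverberation to the decoded signal."""
    high = high_pass(decoded, cutoff_hz, sample_rate)  # stands in for HPF 11'
    out = list(decoded)
    for n in range(len(decoded)):
        acc = 0.0
        for k, h in enumerate(taps):                   # stands in for filter 12H
            idx = n - 1 - k                            # reverberation only, no direct sound
            if idx < 0:
                break
            acc += h * high[idx]
        out[n] += gain * acc                           # amplifier 13H, then adder 16
    return out
```

Setting `gain` to zero returns the decoded signal unchanged, which makes it easy to compare the processed and unprocessed outputs.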

FIG. 6 is a block diagram showing the configuration of an audio signal processing device according to a third embodiment of the present invention. Parts with the same configuration as in FIG. 2 are given the same reference numerals and their description is omitted. This device adds a high-frequency emphasis unit to the device of FIG. 2. In addition to restoring the omitted signal components in a pseudo manner using the reverberation signal, it includes a high-frequency emphasis unit 20 that emphasizes or generates high-range signal components not contained even in the reverberation. The device described in JP 2003-140696 A, for example, can be used as the high-frequency emphasis unit 20.

In FIG. 6 the high-frequency emphasis unit 20 is provided in parallel with the reverberation-applying section from the band dividing unit 11 to the adder 16, but a high-frequency emphasis unit may instead be provided downstream of the adder 16 so as to further emphasize the high range of the audio signal to which reverberation has been applied.

The audio signal processing device described above may be implemented not only on a dedicated processor such as a DSP but also in software on a personal computer or the like.

FIG. 1: Diagram explaining the masking characteristics of human hearing
FIG. 2: Block diagram of an audio signal processing device according to an embodiment of the present invention
FIG. 3: Diagram explaining the filter coefficients of the reverberation filters of the audio signal processing device
FIG. 4: Diagram explaining the method of determining the amplifier gains of the audio signal processing device
FIG. 5: Block diagram of an audio signal processing device according to a second embodiment of the present invention
FIG. 6: Block diagram of an audio signal processing device according to a third embodiment of the present invention

Explanation of symbols

10 … Decoder
11 … Band dividing unit
11′ … High-pass filter
12 (12H, 12M, 12L) … Reverberation filter
13 (13H, 13M, 13L) … Amplifier
14 … Analysis unit
15 … Band synthesis unit
16 … Adder
20 … High-frequency emphasis unit

Claims

An audio signal processing device comprising:
a band dividing unit that divides a demodulated audio signal, obtained by demodulating a compressed audio signal, into partial band signals of a plurality of frequency bands;
reverberation filters, one provided for each partial band signal, each of which generates, based on its partial band signal, a reverberation signal that imitates the temporal change of the auditory masking characteristic in the corresponding frequency band; and
an adding unit that adds the reverberation signals generated for the respective frequency bands by the reverberation filters to the demodulated audio signal.

An audio signal processing device comprising:
a high-band separation unit that extracts a high-range partial band signal from a demodulated audio signal obtained by demodulating a compressed audio signal;
a reverberation filter that generates, based on the partial band signal, a reverberation signal that imitates the temporal change of the auditory masking characteristic in the high range; and
an adding unit that adds the reverberation signal generated by the reverberation filter to the demodulated audio signal.

The audio signal processing device according to claim 1 or claim 2, further comprising an amplifying unit that amplifies the reverberation signal with a gain corresponding to the level masked by the demodulated audio signal.

An audio signal processing method comprising:
dividing a demodulated audio signal, obtained by demodulating a compressed audio signal, into partial band signals of a plurality of frequency bands;
generating, based on each partial band signal, a reverberation signal that imitates the temporal change of the auditory masking characteristic in the corresponding frequency band; and
adding the reverberation signal of each frequency band to the demodulated audio signal.

An audio signal processing method comprising:
extracting a high-range partial band signal from a demodulated audio signal obtained by demodulating a compressed audio signal;
generating, based on the partial band signal, a reverberation signal that imitates the temporal change of the auditory masking characteristic in the high range; and
adding the reverberation signal to the demodulated audio signal.

An audio signal processing program that causes a digital signal processing device to execute:
a process of dividing a demodulated audio signal, obtained by demodulating a compressed audio signal, into partial band signals of a plurality of frequency bands;
a process of generating, based on each partial band signal, a reverberation signal that imitates the temporal change of the auditory masking characteristic in the corresponding frequency band; and
a process of adding the reverberation signal of each frequency band to the demodulated audio signal.

An audio signal processing program that causes a digital signal processing device to execute:
a process of extracting a high-range partial band signal from a demodulated audio signal obtained by demodulating a compressed audio signal;
a process of generating, based on the partial band signal, a reverberation signal that imitates the temporal change of the auditory masking characteristic in the high range; and
a process of adding the reverberation signal to the demodulated audio signal.