JP2005284163A - Noise spectrum estimating method, noise suppressing method and noise suppressing device - Google Patents

Noise spectrum estimating method, noise suppressing method and noise suppressing device

Info

Publication number
JP2005284163A
Authority
JP
Japan
Prior art keywords
noise
spectrum
observed signal
signal
estimated
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP2004100935A
Other languages
Japanese (ja)
Other versions
JP4434813B2 (en)
Inventor
Michiko Kazama
道子 風間
Mikio Higashiyama
三樹夫 東山
Toru Hirai
徹 平井
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Waseda University
Yamaha Corp
Original Assignee
Waseda University
Yamaha Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Waseda University, Yamaha Corp
Priority to JP2004100935A (patent JP4434813B2)
Priority to US11/093,672 (patent US7596495B2)
Priority to GB0506434A (patent GB2413469B)
Priority to CA2502980A (patent CA2502980C)
Publication of JP2005284163A
Application granted
Publication of JP4434813B2
Anticipated expiration
Expired - Fee Related

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04B TRANSMISSION
    • H04B1/00 Details of transmission systems, not covered by a single one of groups H04B3/00 - H04B13/00; Details of transmission systems not characterised by the medium used for transmission
    • H04B1/06 Receivers
    • H04B1/10 Means associated with receiver for limiting or suppressing noise or interference

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)
  • Soundproofing, Sound Blocking, And Sound Damping (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

PROBLEM TO BE SOLVED: To provide a new method for estimating the spectrum of noise from a speech signal in which noise is mixed.

SOLUTION: A correlation value is calculated between the spectral envelope of the observed signal in the signal segment currently being observed and the spectral envelope of the noise estimated for the observed signal in the previously observed signal segment. The spectrum of the observed signal in the current segment and the noise spectrum estimated for the previous segment are added together in a ratio corresponding to the correlation value. That is, when the correlation value is high, the mixing ratio of the spectrum of the observed signal in the current segment is made relatively higher than when the correlation value is low, and the mixing ratio of the noise spectrum estimated for the previous segment is made relatively lower. The mixed spectrum is taken as the estimate of the spectrum of the noise contained in the observed signal in the current segment.

COPYRIGHT: (C)2006, JPO&NCIPI

Description

The present invention relates to a method of estimating a noise spectrum from a speech signal mixed with noise. The present invention also relates to a method and an apparatus for generating, on the basis of that estimate, a speech signal in which the noise is suppressed.

Techniques for estimating the noise spectrum from a speech signal mixed with noise are used to suppress noise (that is, to remove the noise from the noisy speech signal and extract the target speech signal) in fields such as speech recognition and voice communication by telephone. One technique for suppressing noise contained in a speech signal is spectral subtraction. Spectral subtraction estimates the noise spectrum from the noisy speech signal and subtracts the estimated noise spectrum from the spectrum of the noisy speech signal, thereby suppressing the noise.

Prior art disclosing spectral subtraction techniques includes the following patent documents.
Japanese Patent Laid-Open No. 11-3094; JP 2002-14694 A; JP 2003-223186 A

The present invention aims to provide a novel method for estimating the spectrum of noise from a speech signal mixed with that noise. The present invention also aims to provide a method and an apparatus for generating, on the basis of that estimate, a speech signal in which the noise is suppressed.

The noise spectrum estimation method of the present invention estimates the spectrum of noise from a speech signal mixed with that noise. It obtains the correlation between the spectral envelope of the observed signal in the currently observed signal section and the envelope of the noise spectrum estimated for the observed signal in the previously observed signal section, and estimates the noise spectrum for the observed signal in the currently observed signal section according to the obtained correlation.

In the noise spectrum estimation method of the present invention, the spectrum of the observed signal in the currently observed signal section and the noise spectrum estimated for the observed signal in the previously observed signal section can be mixed at a ratio according to the correlation, and the mixed spectrum can be taken as the estimated noise spectrum for the observed signal in the currently observed signal section.

In the noise spectrum estimation method of the present invention, when the correlation is high, the mixing ratio of the spectrum of the observed signal in the currently observed signal section can be made relatively higher, and the mixing ratio of the noise spectrum estimated for the observed signal in the previously observed signal section relatively lower, than when the correlation is low. Conversely, when the correlation is low, the mixing ratio of the spectrum of the observed signal in the currently observed signal section can be made relatively lower, and the mixing ratio of the noise spectrum estimated for the previously observed signal section relatively higher, than when the correlation is high.

In the noise spectrum estimation method of the present invention, the change in the mixing ratio of the spectrum of the observed signal in the currently observed signal section per unit change in the correlation can be made larger as the correlation becomes higher.

In the noise spectrum estimation method of the present invention, the noise spectrum estimated for the observed signal in the currently observed signal section can be the value obtained as

$$N(k) = \left[1 - \left\{\frac{\rho^{l}}{1+\rho^{l}}\right\}^{m}\right] N_0(k) + \left\{\frac{\rho^{l}}{1+\rho^{l}}\right\}^{m} X(k)$$

where
$N(k)$: noise spectrum estimated for the observed signal in the currently observed signal section
$N_0(k)$: noise spectrum estimated for the observed signal in the previously observed signal section
$X(k)$: spectrum of the observed signal in the currently observed signal section
$\rho$: correlation value between the spectral envelope of the observed signal in the currently observed signal section and the envelope of the noise spectrum estimated for the observed signal in the previously observed signal section
$l, m$: constants ($l$ is 1 or more, $m$ is 0 or more)
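As a quick numeric illustration of this weighting, the values below evaluate the mixing weight for a few correlation values. This is only a sketch: the symbol $w(\rho)$ is introduced here for convenience, and the choices $l = 1, 2$ with $m = 1$ are illustrative assumptions, not values given in the source.

```latex
\begin{align*}
w(\rho) &= \left\{\frac{\rho^{l}}{1+\rho^{l}}\right\}^{m},
\qquad N(k) = \{1 - w(\rho)\}\,N_0(k) + w(\rho)\,X(k) \\[4pt]
l = 1,\ m = 1 &:\quad w(0) = 0,\quad w(0.5) = \tfrac{0.5}{1.5} = \tfrac{1}{3},\quad w(1) = \tfrac{1}{2} \\
l = 2,\ m = 1 &:\quad w(0) = 0,\quad w(0.5) = \tfrac{0.25}{1.25} = 0.2,\quad w(1) = \tfrac{1}{2}
\end{align*}
```

With $m = 1$ and $0 \le \rho \le 1$, the weight on the current observation never exceeds $1/2$, and a larger $l$ keeps that weight smaller at low correlation.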

In the noise spectrum estimation method of the present invention, the noise spectrum estimated for the observed signal in the currently observed signal section can be used, in the next signal section, as the noise spectrum estimated for the observed signal in the previously observed signal section.

In the noise spectrum estimation method of the present invention, the envelope of the spectrum can be the envelope of an amplitude spectrum.

The noise spectrum suppression method of the present invention estimates the amplitude spectrum of noise from a speech signal mixed with that noise and generates a speech signal in which the noise is suppressed. The method Fourier-transforms the observed signal in the currently observed signal section to obtain an amplitude spectrum and a phase spectrum; obtains the correlation between the envelope of that amplitude spectrum and the envelope of the noise amplitude spectrum estimated for the observed signal in the previously observed signal section; estimates the noise amplitude spectrum for the observed signal in the currently observed signal section according to the obtained correlation; subtracts the estimated noise amplitude spectrum from the amplitude spectrum of the observed signal in the currently observed signal section; recombines the amplitude spectrum obtained by the subtraction with the obtained phase spectrum and applies an inverse Fourier transform; and outputs the signal obtained by the inverse Fourier transform as the noise-suppressed speech signal. The noise amplitude spectrum estimated for the currently observed signal section is used, in the next signal section, as the noise amplitude spectrum estimated for the observed signal in the previously observed signal section.

The noise suppression apparatus of the present invention estimates the amplitude spectrum of noise from a speech signal mixed with that noise and generates a speech signal in which the noise is suppressed. The apparatus comprises: Fourier transform means for Fourier-transforming the observed signal in the currently observed signal section; amplitude spectrum calculation means for obtaining an amplitude spectrum from the Fourier-transformed data; phase spectrum calculation means for obtaining a phase spectrum from the Fourier-transformed data; correlation calculation means for obtaining the correlation between the envelope of the amplitude spectrum of the observed signal in the currently observed signal section and the envelope of the noise amplitude spectrum estimated for the observed signal in the previously observed signal section; noise amplitude spectrum calculation means for estimating the noise amplitude spectrum for the observed signal in the currently observed signal section according to the obtained correlation; subtraction means for subtracting the estimated noise amplitude spectrum from the amplitude spectrum of the observed signal in the currently observed signal section; resynthesis means for recombining the amplitude spectrum obtained by the subtraction with the phase spectrum; and inverse Fourier transform means for applying an inverse Fourier transform to the recombined data. The signal generated by the inverse Fourier transform is output as the noise-suppressed speech signal, and the noise amplitude spectrum estimated for the currently observed signal section is used, in the next signal section, as the noise amplitude spectrum estimated for the observed signal in the previously observed signal section.

According to the noise spectrum estimation method of the present invention, a noise spectrum can be estimated for the observed signal in the currently observed signal section. According to the noise spectrum suppression method and noise spectrum suppression apparatus of the present invention, the noise spectrum estimation method of the present invention can be used to remove and suppress noise mixed in a speech signal and to extract the target speech signal.

(Embodiment 1)
An embodiment in which the noise spectrum estimation method of the present invention is applied to noise suppression by spectral subtraction is described below. FIG. 1 shows an embodiment of a noise suppression apparatus according to the present invention. The portion enclosed by the dash-dot line 10 is common to conventional noise suppression apparatuses based on spectral subtraction. The portion enclosed by the dash-dot line 11 is a noise amplitude spectrum estimation unit that estimates the amplitude spectrum of the noise by the method of the present invention. The input signal (observed signal) $x_0(n)$ ($n = 0, 1, 2, \ldots, N-1$, where $N$ is the number of samples in one frame) is a sample sequence of a speech signal containing noise, input by a microphone or the like (for example, a signal input for speech recognition or a speech signal received in telephone communication). Stationary noise such as background noise is mixed into the input signal $x_0(n)$. The input signal $x_0(n)$ is input to the input signal cutout unit 12 and cut out into frames, each consisting of a predetermined number of samples. So that no discontinuity arises between frames when the output signal is finally synthesized after noise suppression, the frames are cut out while being shifted successively by half a frame, as shown in FIGS. 2(a) and 2(b). A frame length $N$ of about 125 to 500 msec is preferable for sound quality. When the sampling frequency of the input signal $x_0(n)$ is about 8 kHz, this corresponds to a frame of 1024 to 4096 samples.
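The half-frame-shifted frame cutout described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `cut_frames` and the use of NumPy are assumptions introduced here.

```python
import numpy as np

def cut_frames(x0, frame_len):
    """Cut x0 into frames of frame_len samples, each shifted by half a frame.

    Hypothetical helper: trailing samples that do not fill a frame are dropped.
    """
    x0 = np.asarray(x0, dtype=float)
    if frame_len > len(x0):
        raise ValueError("signal is shorter than one frame")
    hop = frame_len // 2  # half-frame shift between consecutive frames
    n_frames = 1 + (len(x0) - frame_len) // hop
    return np.array([x0[i * hop : i * hop + frame_len] for i in range(n_frames)])
```

For example, at a sampling frequency of about 8 kHz, `frame_len=1024` gives frames of roughly 128 msec, which lies within the range the text recommends.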

The input signal $x(n)$ cut out by the input signal cutout unit 12 is Fourier-transformed frame by frame in the Fourier transform unit 14. The discrete Fourier transform $X(k)$ ($k = 0, 1, 2, \ldots, N-1$) obtained successively by this transform is input to the amplitude spectrum calculation unit 16 and the phase spectrum calculation unit 18. The amplitude spectrum calculation unit 16 obtains the amplitude spectrum $|X(k)|$ of the discrete Fourier transform $X(k)$ by equation (1).

$$|X(k)| = \left\{X_R(k)^2 + X_I(k)^2\right\}^{1/2} \qquad (1)$$

where $X_R(k)$ is the real part of $X(k)$ and $X_I(k)$ is the imaginary part of $X(k)$.

The phase spectrum calculation unit 18 obtains the phase spectrum $\theta(k)$ of the discrete Fourier transform $X(k)$ by equation (2).

$$\theta(k) = \tan^{-1}\left\{X_I(k)/X_R(k)\right\} \qquad (2)$$
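Equations (1) and (2) can be sketched with NumPy as below. The function name is hypothetical, and using the four-quadrant `np.arctan2` in place of a plain arctangent is an assumption made here so that the phase stays well defined when the real part is zero or negative.

```python
import numpy as np

def amplitude_and_phase(frame):
    """Return (|X(k)|, theta(k)) for one frame, following equations (1) and (2)."""
    X = np.fft.fft(np.asarray(frame, dtype=float))
    amplitude = np.sqrt(X.real ** 2 + X.imag ** 2)  # equation (1)
    phase = np.arctan2(X.imag, X.real)              # equation (2), four-quadrant form (assumption)
    return amplitude, phase
```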

The noise amplitude spectrum estimation unit 11 estimates, from the obtained amplitude spectrum $|X(k)|$, the amplitude spectrum $|N(k)|$ of the noise signal contained in the input signal $x(n)$ (the noise amplitude spectrum) by the method described later. For each cut-out frame, the spectrum subtraction unit 15 subtracts the noise amplitude spectrum $|N(k)|$ of the current frame, obtained by the noise amplitude spectrum estimation unit 11, from the amplitude spectrum $|X(k)|$ of the current frame, obtained by the amplitude spectrum calculation unit 16, according to equation (3). This yields the amplitude spectrum $|Y(k)|$ of the current frame with the noise amplitude spectrum removed.

$$|Y(k)| = |X(k)| - |N(k)| \qquad (3)$$
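A minimal sketch of the subtraction in equation (3) follows. The function name and the optional clipping of negative results to zero are assumptions introduced here; the text above states only the subtraction itself.

```python
import numpy as np

def subtract_noise(amp_x, amp_n, floor_at_zero=True):
    """Subtract the estimated noise amplitude spectrum per equation (3)."""
    amp_y = np.asarray(amp_x, dtype=float) - np.asarray(amp_n, dtype=float)
    if floor_at_zero:
        # Assumption, not stated in the source: an amplitude cannot be negative,
        # so bins where the noise estimate exceeds the observation are set to 0.
        amp_y = np.maximum(amp_y, 0.0)
    return amp_y
```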

The resynthesis unit 17 recombines the amplitude spectrum $|Y(k)|$ of the current frame, obtained by the spectrum subtraction unit 15, with the phase spectrum $\theta(k)$ of the input signal $x(n)$ of the current frame, obtained by the phase spectrum calculation unit 18, and returns them to the complex spectrum data $G(k)$ given by equation (4).

$$G(k) = |Y(k)|\, e^{j\theta(k)} \qquad (4)$$
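The recombination in equation (4) amounts to one line of complex arithmetic. The sketch below assumes the exponent carries the imaginary unit, as is usual for a polar-form complex spectrum; the function name is hypothetical.

```python
import numpy as np

def resynthesize(amp_y, phase):
    """Rebuild the complex spectrum G(k) from amplitude and phase, per equation (4)."""
    return np.asarray(amp_y, dtype=float) * np.exp(1j * np.asarray(phase, dtype=float))
```

If nothing is subtracted, recombining $|X(k)|$ with $\theta(k)$ returns the original $X(k)$, which is a convenient sanity check.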

The inverse Fourier transform unit 19 applies an inverse Fourier transform to the complex spectrum data $G(k)$ and returns it to time waveform data $g(n)$. The output signal concatenation unit 21 multiplies each one-frame-long block of time waveform data $g(n)$, obtained every half frame (that is, with half-frame overlap), by the triangular window shown in FIG. 2(c) (a gain that rises linearly from 0 to 1 over the first half of the frame and falls from 1 to 0 over the second half). It then adds and concatenates the windowed blocks as shown in FIG. 2(d) to produce the output signal $g_0(n)$. In this way, the output signal $g_0(n)$ (the target speech signal), in which noise has been removed from the input signal $x_0(n)$, is obtained. Although a triangular window is used here, other window functions such as a Hanning window, a Hamming window, or a trapezoidal window may be used instead.
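The inverse transform and the triangular-window overlap-add can be sketched as follows. The function names, the even frame length, and the exact discretization of the triangular window are assumptions made for this illustration.

```python
import numpy as np

def triangular_window(frame_len):
    """Gain rising linearly from 0 over the first half frame, then falling back."""
    hop = frame_len // 2
    ramp = np.arange(hop) / hop               # 0, 1/hop, ..., (hop-1)/hop
    return np.concatenate([ramp, 1.0 - ramp])  # overlapping halves sum to 1

def overlap_add(spectra, frame_len):
    """Inverse-transform each G(k), window it, and add at half-frame spacing.

    Assumes frame_len is even and spectra were taken at a half-frame shift.
    """
    hop = frame_len // 2
    window = triangular_window(frame_len)
    out = np.zeros(hop * (len(spectra) + 1))
    for i, G in enumerate(spectra):
        g = np.fft.ifft(G).real               # time waveform data g(n)
        out[i * hop : i * hop + frame_len] += window * g
    return out
```

Because the rising and falling halves of adjacent windows sum to one, unmodified frames are reconstructed exactly everywhere except the first and last half frames.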

The noise amplitude spectrum estimation unit 11 in FIG. 1 is described next. The spectrum envelope extraction unit 20 removes the fine irregularities contained in the amplitude spectrum $|X(k)|$ and extracts its envelope $|X'(k)|$ (that is, it smooths the amplitude spectrum $|X(k)|$). This is done because, if the amplitude spectrum $|X(k)|$ itself were used in the correlation calculation described later, the correlation value between spectra would be low and the distinction between "speech sections" and "noise sections" would become unclear. Viewed as a long-term average, the noise spectrum can be expected to have a smooth distribution that is nearly uniform over a wide band. Viewed over a short time, however, the spectrum fluctuates with many peaks and valleys. Speech, unlike noise, has an overall frequency characteristic with large amplitude values in particular frequency bands and is not uniformly distributed over the whole band. The noise spectrum estimation method of this embodiment distinguishes "noise distributed uniformly over the whole frequency band" from "speech with large amplitude values in particular frequency bands" by the magnitude of the spectral correlation value, so the fine irregularities in the noise amplitude spectrum are removed.

The spectrum envelope extraction unit 20 extracts the envelope by, for example, treating the amplitude spectrum $|X(k)|$ as if it were a time waveform and low-pass filtering it (applying a low-pass filter directly to $|X(k)|$, or applying a moving average to $|X(k)|$ along the frequency axis). When a low-pass filter is applied directly to $|X(k)|$, speech features cannot be extracted if the cutoff frequency is either too high or too low. If the cutoff frequency is too high, the fine irregularities of the noise spectrum cannot be removed. If it is too low, the speech component itself is removed. According to experiments, speech features were extracted well when the cutoff frequency was set in the range from fs/300 Hz (equivalent to about 50 Hz when the spectrum is regarded as a time sequence sampled at fs = 16 kHz, where fs is the sampling frequency of the input signal $x(n)$) to fs/16 Hz (equivalent to about 1000 Hz when regarded as a time sequence sampled at fs = 16 kHz). Specifically, when the cutoff frequency is fs/300 Hz, the spectrum envelope extraction unit 20 can be implemented as an 8th-order Butterworth low-pass filter whose cutoff frequency corresponds to 50 Hz.
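Of the two smoothing options mentioned above, the moving average along the frequency axis is the simpler to sketch. In the example below the function name, the default window width, and the edge padding are illustrative assumptions; the source specifies only the Butterworth variant in detail.

```python
import numpy as np

def envelope_moving_average(amplitude, width=9):
    """Smooth |X(k)| along the frequency axis with a centered moving average.

    width must be odd; its default value is a hypothetical choice.
    """
    if width % 2 == 0:
        raise ValueError("width must be odd")
    amplitude = np.asarray(amplitude, dtype=float)
    kernel = np.ones(width) / width
    # Repeat the edge values so the output keeps the input length (assumption).
    padded = np.pad(amplitude, width // 2, mode="edge")
    return np.convolve(padded, kernel, mode="valid")
```

A wider window plays the role of a lower cutoff frequency: it removes more of the fine irregularities but also blurs the broad peaks that characterize speech.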

The spectrum envelope extraction unit 20 is not limited to the above method. As another way of extracting the envelope of the amplitude spectrum $|X(k)|$, the amplitude spectrum can be Fourier-transformed again to obtain the cepstrum. When the cepstrum is used, the spectral envelope is extracted by applying a window function that passes only the low-quefrency part of the cepstrum, using a calculation such as that described in "Digital Signal Processing" (The Institute of Electronics, Information and Communication Engineers, Corona Publishing), Section 3.3.5 "Cepstrum" (pp. 66-70), or in Kenichi Kido, "Introduction to Digital Signal Processing" (Maruzen), Section 8.3 "Calculation of the cepstrum" (pp. 158-162).
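The cepstral alternative can be sketched as below. This follows the common textbook definition (logarithm of the amplitude spectrum, inverse transform, low-quefrency window, forward transform), which is an assumption here because the cited textbooks' exact procedure is not reproduced in the source. The function name and the parameters `n_keep` and `eps` are hypothetical.

```python
import numpy as np

def envelope_cepstrum(amplitude, n_keep=20, eps=1e-12):
    """Estimate a spectral envelope by keeping only low-quefrency cepstral terms.

    amplitude is the full-length amplitude spectrum (k = 0 .. N-1) of a real frame.
    """
    amplitude = np.asarray(amplitude, dtype=float)
    log_amp = np.log(amplitude + eps)       # eps avoids log(0); assumption
    cepstrum = np.fft.ifft(log_amp).real
    lifter = np.zeros(len(cepstrum))
    lifter[:n_keep] = 1.0                   # low-quefrency part
    if n_keep > 1:
        lifter[-(n_keep - 1):] = 1.0        # its mirror image, keeping symmetry
    smoothed_log = np.fft.fft(cepstrum * lifter).real
    return np.exp(smoothed_log)
```

Here `n_keep` plays the role of the cutoff: fewer retained terms give a smoother envelope.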

The noise amplitude spectrum initial value output unit 22 outputs an initial value of the noise amplitude spectrum. When the apparatus first starts, there is no noise amplitude spectrum data to refer to, so an initial value is set. The following methods, for example, can be used to set it.
(Method 1) Data consisting only of background noise with no speech mixed in, input immediately after startup, is Fourier-transformed, and the amplitude spectrum data obtained from the transformed data by equation (1) is set as the initial value of the noise amplitude spectrum.
(Method 2) Amplitude spectrum data corresponding to background noise is stored in memory in advance, read out at startup, and set as the initial value of the noise amplitude spectrum. Alternatively, envelope data of the amplitude spectrum data corresponding to background noise is stored in memory in advance, read out at startup, and set as the initial value of the noise amplitude spectrum envelope data.
(Method 3) Amplitude spectrum data of white noise or pink noise is set as the initial value of the noise amplitude spectrum.
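Methods 1 and 3 can be sketched as follows. Both function names are hypothetical, and for Method 3 only the white-noise case is shown, idealized as a flat amplitude spectrum whose `level` is an arbitrary illustrative parameter.

```python
import numpy as np

def initial_noise_from_recording(noise_only_frame):
    """Method 1: amplitude spectrum of a speech-free frame captured at startup."""
    return np.abs(np.fft.fft(np.asarray(noise_only_frame, dtype=float)))

def initial_noise_flat(frame_len, level=1.0):
    """Method 3 (white-noise case): a flat amplitude spectrum of a chosen level."""
    return np.full(frame_len, float(level))
```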

The noise amplitude spectrum update unit 24 successively receives the noise amplitude spectrum $|N(k)|$ obtained every half frame by the noise amplitude spectrum calculation unit 30 (described later), delays it by half a frame, and successively outputs it as the noise amplitude spectrum estimate $|N_0(k)|$ for the observed signal in the previously observed signal section (half a frame earlier). At startup, $|N(k)|$ has not yet been estimated, so the noise amplitude spectrum update unit 24 outputs the initial value of the noise amplitude spectrum set by the noise amplitude spectrum initial value output unit 22. The spectrum envelope extraction unit 26 extracts the envelope $|N_0'(k)|$ of the noise amplitude spectrum $|N_0(k)|$ by the same method as the spectrum envelope extraction unit 20.

The correlation value calculation unit 28 obtains the correlation value (correlation coefficient) $\rho$ between the amplitude spectrum envelope $|X'(k)|$ of the current frame, extracted by the spectrum envelope extraction unit 20, and the noise amplitude spectrum envelope $|N_0'(k)|$, extracted by the spectrum envelope extraction unit 26. With the input signal amplitude spectrum envelope written as $|X'(k)| = (x_1, x_2, \ldots, x_k)$ and the noise amplitude spectrum envelope written as $|N_0'(k)| = (y_1, y_2, \ldots, y_k)$, the correlation value $\rho$ is obtained by equation (5).

[Equation (5): given in the source as an image (Figure 2005284163), not reproduced in this text.]
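Because equation (5) survives only as an image reference, the sketch below substitutes the standard Pearson correlation coefficient between the two envelopes. This is an assumption and may differ from the formula in the original figure; the function name is also hypothetical.

```python
import numpy as np

def envelope_correlation(env_x, env_n):
    """Pearson correlation coefficient between two envelopes.

    Stand-in for equation (5), whose exact form is not available here.
    """
    x = np.asarray(env_x, dtype=float)
    y = np.asarray(env_n, dtype=float)
    x = x - x.mean()
    y = y - y.mean()
    denom = np.sqrt(np.sum(x ** 2) * np.sum(y ** 2))
    if denom == 0.0:
        return 0.0  # a perfectly flat envelope has no defined correlation; return 0 by convention
    return float(np.sum(x * y) / denom)
```

Note that a Pearson coefficient can be negative, so a caller that feeds it into a weighting formula may need to restrict it to a non-negative range.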

The noise amplitude spectrum calculation unit 30 obtains, according to the obtained correlation value $\rho$, the noise amplitude spectrum $|N(k)|$ for the speech signal in the currently observed signal section by equation (6).

$$|N(k)| = \left[1 - \left\{\frac{\rho^{l}}{1+\rho^{l}}\right\}^{m}\right] |N_0(k)| + \left\{\frac{\rho^{l}}{1+\rho^{l}}\right\}^{m} |X(k)| \qquad (6)$$

where
$|N(k)|$: noise amplitude spectrum estimated for the speech signal of the currently observed frame
$|N_0(k)|$: noise amplitude spectrum estimated for the speech signal of the previously observed frame (half a frame earlier)
$|X(k)|$: amplitude spectrum of the speech signal of the currently observed frame
$\rho$: correlation value between the spectral envelope of the speech signal of the currently observed frame and the envelope of the noise spectrum estimated for the speech signal of the previously observed frame
$l, m$: constants ($l$ is 1 or more, $m$ is 0 or more)

Equation (6) estimates a new amplitude spectrum |N(k)| by adding the noise amplitude spectrum |N_0(k)| estimated last time (half a frame earlier) and the amplitude spectrum |X(k)| of the input signal calculated this time, at a ratio that depends on the obtained correlation value ρ. When ρ is low, the input signal is judged to contain many speech components (a voiced section), so the previous estimate |N_0(k)| is given a high ratio and the current input |X(k)| a low ratio. This keeps the noise amplitude spectrum estimate |N(k)| from changing under the influence of speech components. When ρ is high, the input signal is judged to contain few speech components (a silent section), so |N_0(k)| is given a lower ratio and |X(k)| a higher ratio. This lets the estimate |N(k)| follow gradual changes in stationary noise. As ρ approaches 1, the previous estimate |N_0(k)| and the current input |X(k)| are added at the same ratio (0.5 : 0.5, for m = 1). In this way, the noise amplitude spectrum is updated mainly in silent sections.
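
The update of equation (6) can be sketched in a few lines. This is an illustrative sketch, not the patented implementation: the function names are hypothetical, the defaults l = 70 and m = 1 are the values reported for the experiment in this document, and clamping ρ to [0, 1] is an added assumption to keep the weight well defined.

```python
def mixing_weight(rho, l=70, m=1):
    """Weight of the current input spectrum |X(k)| in equation (6)."""
    rho = min(max(rho, 0.0), 1.0)  # assumption: clamp so rho**l stays in [0, 1]
    r = rho ** l
    return (r / (1.0 + r)) ** m


def update_noise_spectrum(prev_noise, current, rho, l=70, m=1):
    """Blend the previous noise estimate |N_0(k)| with the current input |X(k)|.

    prev_noise and current are equal-length lists of amplitudes, one per bin k.
    """
    w = mixing_weight(rho, l, m)
    return [(1.0 - w) * n0 + w * x for n0, x in zip(prev_noise, current)]
```

The same weight is applied to every frequency bin, because ρ is a single value per frame computed from the whole envelope.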

In equation (6), l is a constant for adjusting the sensitivity to low correlation values. Fig. 3 shows how the coefficient values 1 − {ρ^l/(1+ρ^l)}^m and {ρ^l/(1+ρ^l)}^m of equation (6) vary with the correlation value ρ for different values of l. Fig. 3 is for m = 1. As Fig. 3 shows, the larger l is, the smaller the update of the noise amplitude spectrum estimate at low correlation.

In equation (6), m is a constant for adjusting the update amount. Fig. 4 shows how the coefficient values 1 − {ρ^l/(1+ρ^l)}^m and {ρ^l/(1+ρ^l)}^m of equation (6) vary with the correlation value ρ for different values of m. Fig. 4 is for l = 2. As Fig. 4 shows, the larger m is, the smaller the update amount.
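
The effect of l and m can be checked numerically by tabulating the coefficient applied to |X(k)|. The sketch below is illustrative only: the function names are hypothetical, and apart from m = 1 (Fig. 3) and l = 2 (Fig. 4) the parameter values are arbitrary choices, not the curves actually plotted in the figures.

```python
def input_weight(rho, l, m):
    """Coefficient {rho^l / (1 + rho^l)}^m applied to |X(k)| in equation (6)."""
    r = rho ** l
    return (r / (1.0 + r)) ** m


def weight_table(rhos, params):
    """Return {(l, m): [weight for each rho]} for the given (l, m) pairs."""
    return {(l, m): [input_weight(rho, l, m) for rho in rhos] for l, m in params}


if __name__ == "__main__":
    rhos = [0.2, 0.5, 0.8, 1.0]
    # Rows with m = 1 vary l (setting of Fig. 3); rows with l = 2 vary m (Fig. 4).
    table = weight_table(rhos, [(1, 1), (2, 1), (10, 1), (2, 2), (2, 4)])
    for (l, m), row in table.items():
        print(f"l={l:<3} m={m}: " + "  ".join(f"{w:.4f}" for w in row))
```

The weight given to the previous estimate |N_0(k)| is one minus each tabulated value.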

A noise suppression experiment was performed using the noise suppression device of Fig. 1, and the noise suppression effect was measured. In an environment with stationary noise from a projector, a female-voice announcement and a male-voice announcement were recorded, and PESQ-MOS values were measured for the recorded signals both with noise suppression by the device of Fig. 1 and with no noise suppression. The sampling frequency of the recorded signal was 16 kHz, the frame length was 1024 samples, and the processing of Fig. 2 was applied (frames are cut out with a shift of half a frame before noise suppression, and after noise suppression a triangular window is applied and the frames are added together). Equation (6) was used to calculate the noise amplitude spectrum, with l = 70 and m = 1. The PESQ-MOS value is a speech quality index that takes values in the range 0.5 to 4.5; the higher the value, the better the speech quality is judged to be. Table 1 shows the measurement results, and Fig. 5 plots the results of Table 1.

[Table 1: Figure 2005284163]

According to Table 1, in every case, whether the background noise level was low (S/N ratio = 24 dB) or high (S/N ratio = 12 dB), and whether the announcement was in a female or a male voice, the PESQ-MOS value was higher with noise suppression by the device of Fig. 1 than with no noise suppression, showing that the noise suppression processing improves speech quality.
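
The framing described for the experiment (1024-sample frames cut out with a half-frame shift, then a triangular window and addition after processing) can be sketched as follows. This is a sketch under assumptions: the exact window shape of Fig. 2 is not given in this text, so a triangular window whose half-overlapped copies sum to 1 is assumed, and all function names are hypothetical.

```python
def triangular_window(n):
    """Triangular window of even length n whose half-overlapped copies sum to 1."""
    half = n // 2
    return [i / half if i < half else (n - i) / half for i in range(n)]


def split_frames(signal, frame_len=1024):
    """Cut out frames of frame_len samples, shifted by half a frame each time."""
    hop = frame_len // 2
    return [signal[s:s + frame_len]
            for s in range(0, len(signal) - frame_len + 1, hop)]


def overlap_add(frames, frame_len=1024):
    """Window each (processed) frame and add the frames at half-frame offsets."""
    hop = frame_len // 2
    win = triangular_window(frame_len)
    out = [0.0] * (hop * (len(frames) - 1) + frame_len)
    for idx, frame in enumerate(frames):
        start = idx * hop
        for i, v in enumerate(frame):
            out[start + i] += win[i] * v
    return out
```

With this window, frames that pass through unmodified reconstruct the input exactly wherever two frames overlap, so any change in the output comes from the noise suppression applied between the two steps.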

(Modifications)
In the above embodiment, equation (6) is used to calculate the noise amplitude spectrum, but the calculation is not limited to this. For example, the noise amplitude spectrum |N(k)| can also be obtained by equation (7) below.

$$|N(k)| = (1-\rho^{l})\cdot|N_0(k)| + \rho^{l}\cdot|X(k)| \qquad (7)$$

Also, when the correlation value ρ is at or below a predetermined value, the addition ratio of the amplitude spectrum |X(k)| of the input signal of the currently observed frame can be set to 0 (that is, the noise amplitude spectrum estimate |N(k)| is not updated).
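
Both modifications, the simpler weight of equation (7) and the option of skipping the update at low correlation, fit in one short function. This is a sketch only: the function name, the default l = 2, and the `threshold` parameter are hypothetical, since the text gives no value for the "predetermined value"; clamping ρ to [0, 1] is also an added assumption.

```python
def update_noise_eq7(prev_noise, current, rho, l=2, threshold=None):
    """Noise update per equation (7), optionally frozen at low correlation.

    If threshold is given and rho <= threshold, the previous estimate is
    returned unchanged (the addition ratio of |X(k)| is 0).
    """
    rho = min(max(rho, 0.0), 1.0)  # assumption: keep the weight in [0, 1]
    if threshold is not None and rho <= threshold:
        return list(prev_noise)
    w = rho ** l
    return [(1.0 - w) * n0 + w * x for n0, x in zip(prev_noise, current)]
```

Unlike equation (6), whose input weight reaches 0.5 at ρ = 1 (for m = 1), equation (7) replaces the estimate entirely with the current input when ρ = 1.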

In the above embodiment, the amplitude spectral subtraction method is used: the noise amplitude spectrum |N(k)| is estimated based on the envelope of the amplitude spectrum |X(k)| of the input signal, and noise is suppressed by subtracting |N(k)| from |X(k)|. Instead, the power spectral subtraction method can be used: the noise power spectrum |N(k)|^2 is estimated based on the envelope of the power spectrum |X(k)|^2 of the input signal, and noise is suppressed by subtracting |N(k)|^2 from |X(k)|^2. The noise spectrum estimation method of this invention can be applied to the estimation of this noise power spectrum |N(k)|^2.
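
The difference between the two subtraction variants can be seen side by side in a small sketch. The function names are hypothetical, and flooring negative results at zero is an assumption (a common practice in spectral subtraction); this passage does not say how negative differences are handled.

```python
import math


def amplitude_subtract(x_amp, n_amp):
    """Amplitude spectral subtraction: |X(k)| - |N(k)|, floored at 0."""
    return [max(x - n, 0.0) for x, n in zip(x_amp, n_amp)]


def power_subtract(x_amp, n_amp):
    """Power spectral subtraction: sqrt(|X(k)|^2 - |N(k)|^2), floored at 0.

    Inputs and output are amplitudes, so the result can be recombined with
    the phase spectrum in the same way as in the amplitude variant.
    """
    return [math.sqrt(max(x * x - n * n, 0.0)) for x, n in zip(x_amp, n_amp)]
```

For the same inputs, the power variant removes less amplitude than the amplitude variant whenever the bin is not floored, for example 5 and 3 give 4 rather than 2.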

In the above embodiment, the noise amplitude spectrum |N(k)| is estimated based on the envelope of the amplitude spectrum |X(k)| of the input signal, and noise is suppressed by subtracting |N(k)| from |X(k)|. Instead, the complex spectrum X(k) itself, in which the amplitude information and phase information of the input signal are not separated, can be used: the complex spectrum N(k) of the noise is estimated based on the envelope of the complex spectrum X(k), and noise is suppressed by subtracting N(k) from the complex spectrum X(k) of the input signal.

The noise spectrum estimation method of this invention can also be applied to uses other than noise suppression.

Fig. 1 is a block diagram showing an embodiment of the noise suppression device according to this invention.
Fig. 2 is a time chart explaining how the input signal is cut out and how the output signals are joined in the noise suppression device of Fig. 1.
Fig. 3 is a characteristic diagram showing how the coefficient values 1 − {ρ^l/(1+ρ^l)}^m and {ρ^l/(1+ρ^l)}^m of equation (6) vary with the correlation value ρ depending on the value of l.
Fig. 4 is a characteristic diagram showing how the coefficient values 1 − {ρ^l/(1+ρ^l)}^m and {ρ^l/(1+ρ^l)}^m of equation (6) vary with the correlation value ρ depending on the value of m.
Fig. 5 is a diagram showing the noise suppression effect of the noise suppression device of Fig. 1, plotting the measurement results of Table 1.

Explanation of symbols

14: Fourier transform unit (Fourier transform means); 15: spectrum subtraction unit (subtraction means); 16: amplitude spectrum calculation unit (amplitude spectrum calculation means); 17: recombination unit (recombination means); 18: phase spectrum calculation unit (phase spectrum calculation means); 19: inverse Fourier transform unit (inverse Fourier transform means); 20, 26: spectrum envelope extraction units; 28: correlation value calculation unit (correlation calculation means); 30: noise amplitude spectrum calculation unit (noise amplitude spectrum calculation means)

Claims (9)

1. A noise spectrum estimation method for estimating the spectrum of noise from a speech signal mixed with the noise, the method comprising:
obtaining a correlation between the spectral envelope of the observed signal in the currently observed signal section and the spectral envelope of the noise estimated for the observed signal in the previously observed signal section; and estimating, according to the obtained correlation, the noise spectrum for the observed signal in the currently observed signal section.
2. The noise spectrum estimation method according to claim 1, wherein the spectrum of the observed signal in the currently observed signal section and the noise spectrum estimated for the observed signal in the previously observed signal section are mixed at a ratio according to the correlation, and the mixed spectrum is estimated as the noise spectrum for the observed signal in the currently observed signal section.
3. The noise spectrum estimation method according to claim 2, wherein:
when the correlation is high, compared with when the correlation is low, the mixing ratio of the spectrum of the observed signal in the currently observed signal section is made relatively high, and the mixing ratio of the noise spectrum estimated for the observed signal in the previously observed signal section is made relatively low; and
when the correlation is low, compared with when the correlation is high, the mixing ratio of the spectrum of the observed signal in the currently observed signal section is made relatively low, and the mixing ratio of the noise spectrum estimated for the observed signal in the previously observed signal section is made relatively high.
4. The noise spectrum estimation method according to claim 3, wherein, as the correlation becomes higher, the change in the mixing ratio of the spectrum of the observed signal in the currently observed signal section with respect to a change in the correlation is made larger.
5. The noise spectrum estimation method according to claim 3, wherein the noise spectrum estimated for the observed signal in the currently observed signal section is the value obtained as

$$N(k) = \left[1 - \left\{\frac{\rho^{l}}{1+\rho^{l}}\right\}^{m}\right]\cdot N_0(k) + \left\{\frac{\rho^{l}}{1+\rho^{l}}\right\}^{m}\cdot X(k)$$

where
N(k): noise spectrum estimated for the observed signal in the currently observed signal section
N_0(k): noise spectrum estimated for the observed signal in the previously observed signal section
X(k): spectrum of the observed signal in the currently observed signal section
ρ: correlation value between the spectral envelope of the observed signal in the currently observed signal section and the spectral envelope of the noise estimated for the observed signal in the previously observed signal section
l, m: constants (l is 1 or more, m is 0 or more)
6. The noise spectrum estimation method according to any one of claims 1 to 5, wherein the noise spectrum estimated for the observed signal in the currently observed signal section is used, in the next signal section, as the noise spectrum estimated for the observed signal in the previously observed signal section.
7. The noise spectrum estimation method according to any one of claims 1 to 6, wherein the spectral envelope is an amplitude spectrum envelope.
8. A noise suppression method for estimating the amplitude spectrum of noise from a speech signal mixed with the noise and generating a speech signal in which the noise is suppressed, the method comprising:
Fourier-transforming the observed signal in the currently observed signal section to obtain an amplitude spectrum and a phase spectrum;
obtaining a correlation between the envelope of the obtained amplitude spectrum of the observed signal in the currently observed signal section and the envelope of the noise amplitude spectrum estimated for the observed signal in the previously observed signal section;
estimating, according to the obtained correlation, the noise amplitude spectrum for the observed signal in the currently observed signal section, and subtracting the noise amplitude spectrum estimated for the observed signal in the currently observed signal section from the amplitude spectrum of that observed signal;
recombining the amplitude spectrum obtained by the subtraction with the obtained phase spectrum and applying an inverse Fourier transform;
outputting the signal obtained by the inverse Fourier transform as the speech signal in which the noise is suppressed; and
using the noise amplitude spectrum estimated for the observed signal in the currently observed signal section, in the next signal section, as the noise amplitude spectrum estimated for the observed signal in the previously observed signal section.
9. A noise suppression device for estimating the amplitude spectrum of noise from a speech signal mixed with the noise and generating a speech signal in which the noise is suppressed, the device comprising:
Fourier transform means for Fourier-transforming the observed signal in the currently observed signal section;
amplitude spectrum calculation means for obtaining an amplitude spectrum from the Fourier-transformed data;
phase spectrum calculation means for obtaining a phase spectrum from the Fourier-transformed data;
correlation calculation means for obtaining a correlation between the envelope of the obtained amplitude spectrum of the observed signal in the currently observed signal section and the envelope of the noise amplitude spectrum estimated for the observed signal in the previously observed signal section;
noise amplitude spectrum calculation means for estimating, according to the obtained correlation, the noise amplitude spectrum for the observed signal in the currently observed signal section;
subtraction means for subtracting the noise amplitude spectrum estimated for the observed signal in the currently observed signal section from the amplitude spectrum of that observed signal;
recombination means for recombining the amplitude spectrum obtained by the subtraction with the phase spectrum; and
inverse Fourier transform means for applying an inverse Fourier transform to the recombined data,
wherein the signal generated by the inverse Fourier transform is output as the speech signal in which the noise is suppressed, and the noise amplitude spectrum estimated for the observed signal in the currently observed signal section is used, in the next signal section, as the noise amplitude spectrum estimated for the observed signal in the previously observed signal section.
JP2004100935A 2004-03-30 2004-03-30 Noise spectrum estimation method, noise suppression method, and noise suppression device Expired - Fee Related JP4434813B2 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
JP2004100935A JP4434813B2 (en) 2004-03-30 2004-03-30 Noise spectrum estimation method, noise suppression method, and noise suppression device
US11/093,672 US7596495B2 (en) 2004-03-30 2005-03-29 Current noise spectrum estimation method and apparatus with correlation between previous noise and current noise signal
GB0506434A GB2413469B (en) 2004-03-30 2005-03-30 Noise spectrum estimation method and apparatus
CA2502980A CA2502980C (en) 2004-03-30 2005-03-30 Noise spectrum estimation method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP2004100935A JP4434813B2 (en) 2004-03-30 2004-03-30 Noise spectrum estimation method, noise suppression method, and noise suppression device

Publications (2)

Publication Number Publication Date
JP2005284163A true JP2005284163A (en) 2005-10-13
JP4434813B2 JP4434813B2 (en) 2010-03-17

Family

ID=34567592

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2004100935A Expired - Fee Related JP4434813B2 (en) 2004-03-30 2004-03-30 Noise spectrum estimation method, noise suppression method, and noise suppression device

Country Status (4)

Country Link
US (1) US7596495B2 (en)
JP (1) JP4434813B2 (en)
CA (1) CA2502980C (en)
GB (1) GB2413469B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007212704A (en) * 2006-02-09 2007-08-23 Univ Waseda Noise spectrum estimating method, and noise suppressing method and device
JP2008058343A (en) * 2006-08-29 2008-03-13 Casio Comput Co Ltd Mechanism driving sound reduction apparatus and mechanism driving sound reduction method
JP2014044313A (en) * 2012-08-27 2014-03-13 Xacti Corp Noise reduction device
KR20150032736A (en) * 2012-08-29 2015-03-27 니폰 덴신 덴와 가부시끼가이샤 Decoding method, decoding device, program, and recording method thereof

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2422237A (en) * 2004-12-21 2006-07-19 Fluency Voice Technology Ltd Dynamic coefficients determined from temporally adjacent speech frames
JP4827661B2 (en) * 2006-08-30 2011-11-30 富士通株式会社 Signal processing method and apparatus
US8744798B2 (en) * 2007-06-12 2014-06-03 Tektronix International Sales Gmbh Signal generator and user interface for adding amplitude noise to selected portions of a test signal
CA2690433C (en) * 2007-06-22 2016-01-19 Voiceage Corporation Method and device for sound activity detection and sound signal classification
JP5566846B2 (en) * 2010-10-15 2014-08-06 本田技研工業株式会社 Noise power estimation apparatus, noise power estimation method, speech recognition apparatus, and speech recognition method
JP6182895B2 (en) * 2012-05-01 2017-08-23 株式会社リコー Processing apparatus, processing method, program, and processing system
US10032462B2 (en) 2015-02-26 2018-07-24 Indian Institute Of Technology Bombay Method and system for suppressing noise in speech signals in hearing aids and speech communication devices
JP6668995B2 (en) * 2016-07-27 2020-03-18 富士通株式会社 Noise suppression device, noise suppression method, and computer program for noise suppression
CN117934487B (en) * 2024-03-25 2024-05-28 板石智能科技(深圳)有限公司 Detection method and device for scanning noise and error, electronic equipment and medium

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP0681730A4 (en) * 1993-11-30 1997-12-17 At & T Corp Transmitted noise reduction in communications systems.
US5774846A (en) 1994-12-19 1998-06-30 Matsushita Electric Industrial Co., Ltd. Speech coding apparatus, linear prediction coefficient analyzing apparatus and noise reducing apparatus
JP3183104B2 (en) 1995-07-14 2001-07-03 松下電器産業株式会社 Noise reduction device
JPH113094A (en) 1997-06-12 1999-01-06 Kobe Steel Ltd Noise eliminating device
WO2001031640A1 (en) * 1999-10-29 2001-05-03 Koninklijke Philips Electronics N.V. Elimination of noise from a speech signal
JP2002014694A (en) 2000-06-30 2002-01-18 Toyota Central Res & Dev Lab Inc Voice recognition device
JP3693022B2 (en) 2002-01-29 2005-09-07 株式会社豊田中央研究所 Speech recognition method and speech recognition apparatus

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007212704A (en) * 2006-02-09 2007-08-23 Univ Waseda Noise spectrum estimating method, and noise suppressing method and device
JP2008058343A (en) * 2006-08-29 2008-03-13 Casio Comput Co Ltd Mechanism driving sound reduction apparatus and mechanism driving sound reduction method
JP2014044313A (en) * 2012-08-27 2014-03-13 Xacti Corp Noise reduction device
KR20150032736A (en) * 2012-08-29 2015-03-27 니폰 덴신 덴와 가부시끼가이샤 Decoding method, decoding device, program, and recording method thereof
KR101629661B1 (en) * 2012-08-29 2016-06-13 니폰 덴신 덴와 가부시끼가이샤 Decoding method, decoding apparatus, program, and recording medium therefor
US9640190B2 (en) 2012-08-29 2017-05-02 Nippon Telegraph And Telephone Corporation Decoding method, decoding apparatus, program, and recording medium therefor

Also Published As

Publication number Publication date
US7596495B2 (en) 2009-09-29
CA2502980C (en) 2010-05-04
GB2413469A (en) 2005-10-26
GB0506434D0 (en) 2005-05-04
CA2502980A1 (en) 2005-09-30
JP4434813B2 (en) 2010-03-17
GB2413469B (en) 2006-05-03
US20050256705A1 (en) 2005-11-17

Similar Documents

Publication Publication Date Title
JP4958303B2 (en) Noise suppression method and apparatus
JP5528538B2 (en) Noise suppressor
Nakatani et al. Robust and accurate fundamental frequency estimation based on dominant harmonic components
JP4434813B2 (en) Noise spectrum estimation method, noise suppression method, and noise suppression device
JP2005195955A (en) Device and method for noise suppression
JP4454591B2 (en) Noise spectrum estimation method, noise suppression method, and noise suppression device
JP3960834B2 (en) Speech enhancement device and speech enhancement method
US9418677B2 (en) Noise suppressing device, noise suppressing method, and a non-transitory computer-readable recording medium storing noise suppressing program
Tsilfidis et al. Blind single-channel suppression of late reverberation based on perceptual reverberation modeling
JPH08160994A (en) Noise suppression device
JP3849679B2 (en) Noise removal method, noise removal apparatus, and program
JP2004020679A (en) System and method for suppressing noise
JP2020160290A (en) Signal processing apparatus, signal processing system and signal processing method
Fang et al. Speech enhancement based on modified a priori SNR estimation
JP6707914B2 (en) Gain processing device and program, and acoustic signal processing device and program
Fingscheidt et al. Towards objective quality assessment of speech enhancement systems in a black box approach
JP2006201622A (en) Device and method for suppressing band-division type noise
JP2001249676A (en) Method for extracting fundamental period or fundamental frequency of periodical waveform with added noise
JP2005284016A (en) Method for inferring noise of speech signal and noise-removing device using the same
JP2899533B2 (en) Sound quality improvement device
JP4950971B2 (en) Reverberation removal apparatus, dereverberation method, dereverberation program, recording medium
JP7508179B2 (en) Noise Suppression Circuit
KR100468829B1 (en) Method for noise cancellation
KR20180087021A (en) Method for estimating room transfer function in noise environment and signal process method for estimating room transfer function in noise environment
Shi et al. Subband dereverberation algorithm for noisy environments

Legal Events

Date Code Title Description
A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20060524

A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20090515

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20090526

A521 Request for written amendment filed

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20090710

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20091215

A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20091222

R150 Certificate of patent or registration of utility model

Free format text: JAPANESE INTERMEDIATE CODE: R150

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20130108

Year of fee payment: 3

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20130108

Year of fee payment: 3

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20140108

Year of fee payment: 4

R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250

R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250

R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250

R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250

LAPS Cancellation because of no payment of annual fees