JP4581114B2 - Adaptive beamformer

Info

Publication number: JP4581114B2
Application number: JP2005143289A
Authority: JP
Inventors: Wolfgang Herbordt; Satoshi Nakamura
Original assignee: ATR Advanced Telecommunications Research Institute International
Current assignee: ATR Advanced Telecommunications Research Institute International
Priority date: 2005-05-16
Filing date: 2005-05-16
Publication date: 2010-11-17
Anticipated expiration: 2025-05-16
Also published as: JP2006319925A

Abstract

PROBLEM TO BE SOLVED: To provide an adaptive filter that combines fast convergence, computational simplicity, high tracking capability, and sufficiently low delay with step-size control for each frequency bin.

SOLUTION: An adaptive filter 80 includes: an FIR filter 90 having a matrix W(k) of adaptive coefficients and receiving input signals x(k); a subtracter 98 for computing an error signal e(k) based on a reference signal y_ref(k) and an output y(k) of the FIR filter 90; a coefficient updating module 94 for updating the matrix W(k) in response to the signals x(k) and e(k), each transformed into DFT bins, and based on a predetermined probability density distribution of the error signal e(k); and a double-talk detector 96 for adaptively controlling the module 94 such that, for each DFT bin, the module 94 updates the matrix W(k) only when interference and noise are present in the input signals x(k) and freezes the update when a desired signal is present in the input signals x(k).

COPYRIGHT: (C)2007, JPO&INPIT

Description

The present invention relates generally to techniques for reducing noise by adaptive beamforming, and more particularly to a method for improving adaptive filtering so that it is robust against outliers.

Demand for natural and comfortable human/machine interaction is growing. The acoustic interface of any terminal for multimedia or telecommunication services is therefore expected to support seamless, hands-free acoustic communication, and to do so in a wide range of acoustic environments such as passenger cars, offices, homes, and public places. Typical applications include audio/video conferencing, dialogue systems, computer games, command-and-control interfaces, dictation systems, and high-quality audio recording.

Compared with speech and audio capture using a microphone placed close to the desired source, a seamless acoustic interface delivers a desired source signal that is impaired by reverberation caused by reflective acoustic environments, by local interference and noise, and by acoustic echoes from loudspeakers. These disturbances are not only unpleasant for human listeners but, more importantly, harmful to applications such as speech recognition. Acoustic echo cancellation is desirable whenever a reference loudspeaker signal is available, because it suppresses this kind of interference to the greatest possible extent.

Beamforming microphone arrays are very effective at suppressing local interference and noise: they suppress interference and noise by spatio-temporal filtering and, in contrast to single-channel speech enhancement based on temporal filtering, do not distort the desired signal. Adaptive, data-dependent beamformers appear to be the best choice here, because they take the characteristics of the desired signal and the interference into account to maximize the suppression of interference and noise.

In adaptive beamforming, interference and noise are usually suppressed by placing an adaptive filter in each microphone channel and summing the filter output signals. The adaptive filters are optimized to minimize distortion of the desired signal while maximizing the suppression of interference and noise. Distortion of the desired signal is usually controlled either by linear constraints (linearly constrained minimum variance (LCMV) beamforming or linearly constrained least-squares error beamforming) or by using a reference for the desired signal (minimum mean-square error (MMSE) beamforming or least-squares error (LSE) beamforming).
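The filter-and-sum structure described above can be sketched in a few lines. This is a minimal illustration with fixed (non-adaptive) coefficients, assuming real-valued sample lists; the function names `fir_filter` and `filter_and_sum` and the two-microphone example are hypothetical and not taken from the patent.

```python
from typing import List, Sequence


def fir_filter(signal: Sequence[float], taps: Sequence[float]) -> List[float]:
    """Causal FIR filtering: y[k] = sum_n taps[n] * signal[k - n]."""
    out = []
    for k in range(len(signal)):
        acc = 0.0
        for n, w in enumerate(taps):
            if k - n >= 0:  # samples before the start are treated as zero
                acc += w * signal[k - n]
        out.append(acc)
    return out


def filter_and_sum(channels: Sequence[Sequence[float]],
                   filters: Sequence[Sequence[float]]) -> List[float]:
    """Filter each microphone channel with its own FIR filter, then sum."""
    if len(channels) != len(filters):
        raise ValueError("one filter per channel is required")
    filtered = [fir_filter(x, w) for x, w in zip(channels, filters)]
    return [sum(samples) for samples in zip(*filtered)]


# Illustrative data: microphone 1 receives the source one sample later
# than microphone 0.
mic0 = [1.0, 2.0, 3.0, 4.0]
mic1 = [0.0, 1.0, 2.0, 3.0]

# Delay channel 0 by one sample and average, so both channels align.
filters = [[0.0, 0.5], [0.5, 0.0]]
beam_output = filter_and_sum([mic0, mic1], filters)
```

In an adaptive beamformer the entries of `filters` would be updated over time by the adaptation algorithm rather than fixed as they are here.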

Minimizing distortion of the desired signal while maximizing the suppression of interference and noise often requires a classifier that distinguishes "desired signal only", "interference and noise only", and "double-talk" (the desired signal together with disturbances such as interference and noise). To prevent the adaptive beamformer from cancelling the desired signal, the adaptive filters that maximize the suppression of interference and noise must be adapted only when interference and noise alone are present.

Some adaptive beamformers minimize distortion of the desired signal by having adaptive filters implicitly estimate predetermined second-order statistics of the desired signal only while the desired signal is present (Non-Patent Document 1). Because such classifiers inevitably make classification errors, the adaptive filters are sometimes adapted during double-talk, which degrades the performance of the adaptive beamformer. One remedy is to choose a small step size for the adaptation algorithm. A small step size, however, slows convergence, which is a particular drawback in strongly time-varying acoustic environments. It is therefore often difficult to achieve sufficiently fast adaptation and, at the same time, resistance to the perturbations caused by classifier errors. This resistance to perturbations is referred to here as "robustness".

With regard to acoustic echo cancellation, Non-Patent Document 2 points out the need for robustness against double-talk bursts caused by the presence of a local talker and addresses it by using a nonlinear function of the error signal for adaptation. In Non-Patent Document 3, robustness against double-talk bursts is obtained for subband echo cancellers by introducing robust statistics (Non-Patent Document 16) into subband adaptive filtering, using a contaminated Gaussian model for the residual echo signal. Non-Patent Document 4 uses the concept of robust statistics to derive double-talk-robust versions of the normalized least-mean-squares (NLMS) algorithm, the proportional NLMS (PNLMS) algorithm, and the affine projection algorithm (APA). Non-Patent Document 5 derives a robust recursive least-squares (RLS) algorithm, which is patented in Patent Document 1.

Patent Document 1: J. Benesty and T. F. Gaensler. A robust adaptive filter for use in acoustic and network cancellation. European Patent EP1170864A1, January 2002.
Non-Patent Document 1: O. Hoshuyama, A. Sugiyama, and A. Hirano. A robust adaptive beamformer for microphone arrays with a blocking matrix using constrained adaptive filters. IEEE Trans. on Signal Processing, 47(10):2677-2684, October 1999.
Non-Patent Document 2: M. M. Sondhi. An adaptive echo canceller. The Bell System Technical Journal, XLVI(3):497-510, March 1967.
Non-Patent Document 3: T. Gaensler. A double-talk resistant subband echo canceller. Signal Processing, 65(1):89-101, February 1998.
Non-Patent Document 4: T. Gaensler, S. L. Gay, M. M. Sondhi, and J. Benesty. Double-talk robust fast converging algorithms for network echo cancellation. IEEE Trans. on Speech and Audio Processing, 8(6):656-663, November 2000.
Non-Patent Document 5: J. Benesty and T. Gaensler. A robust fast converging least-squares adaptive algorithm. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, 6:3785-3788, May 2001.
Non-Patent Document 6: J. J. Shynk. Frequency-domain and multirate adaptive filtering. IEEE Signal Processing Magazine, pages 14-37, January 1992.
Non-Patent Document 7: J.-S. Soo and K. K. Pang. Multidelay block frequency domain adaptive filter. IEEE Trans. on Acoustics, Speech, and Signal Processing, 38(2):373-376, February 1990.
Non-Patent Document 8: J. Benesty and D. R. Morgan. Multi-channel frequency-domain adaptive filtering. In S. L. Gay and J. Benesty, editors, Acoustic Signal Processing for Telecommunication, chapter 7, pages 121-133. Kluwer Academic Publishers, Boston, MA, 2000.
Non-Patent Document 9: H. Buchner, J. Benesty, and W. Kellermann. Multichannel frequency-domain adaptive filtering with application to multichannel acoustic echo cancellation. In J. Benesty and Y. Huang, editors, Adaptive Signal Processing: Applications to Real-World Problems. Springer, Berlin, 2003.
Non-Patent Document 10: B. H. Nitsch. A frequency-selective stepfactor control for an adaptive filter algorithm working in the frequency domain. Signal Processing, 80(9):1733-1745, September 2000.
Non-Patent Document 11: W. Herbordt. Sound capture for human/machine interfaces: Practical aspects of microphone array signal processing. Springer, Heidelberg, Germany, 2005.
Non-Patent Document 12: H. Buchner, J. Benesty, T. Gaensler, and W. Kellermann. An outlier robust extended multidelay filter with application to acoustic echo cancellation. Int. Workshop on Acoustic Echo and Noise Control, September 2003.
Non-Patent Document 13: R. Martin. Freisprecheinrichtungen mit mehrkanaliger Echokompensation und Stoergeraeuschreduktion. PhD thesis, Aachener Institut fuer Nachrichtengeraete und Datenkommunikation, 1995.
Non-Patent Document 14: W. Kellermann. Acoustic echo cancellation for beamforming microphone arrays. In M. S. Brandstein and D. B. Ward, editors, Microphone Arrays: Signal Processing Techniques and Applications, chapter 13, pages 281-306. Springer, Berlin, 2001.
Non-Patent Document 15: W. Herbordt, W. Kellermann, and S. Nakamura. Combined optimization of LCMV beamforming and acoustic echo cancellation. Proc. EURASIP European Signal Processing Conference, 2004.
Non-Patent Document 16: P. J. Huber. Robust Statistics. Wiley, New York, 1981.
Non-Patent Document 17: W. Herbordt. Combination of robust adaptive beamforming with acoustic echo cancellation for acoustic human/machine front-ends. PhD thesis, University Erlangen-Nuremberg, Germany, 2004.
Non-Patent Document 18: L. J. Griffiths and C. W. Jim. An alternative approach to linearly constrained adaptive beamforming. IEEE Trans. on Antennas and Propagation, 30(1):27-34, January 1982.
Non-Patent Document 19: S. M. Kay. Fundamentals of Statistical Signal Processing: Estimation Theory. Prentice Hall, Upper Saddle River, NJ, 1993.

In recent years, DFT-domain adaptive algorithms (frequency-domain adaptive filters, FDAF) (Non-Patent Document 6) have attracted great interest for acoustic echo cancellation and adaptive beamforming, because they (a) combine fast convergence with computational simplicity and (b) can be realized so that sufficiently fast tracking and sufficiently low delay are obtained in many applications (Non-Patent Document 7).

FDAF generalizes well to the multi-channel (MC) case (MC-FDAF) (Non-Patent Documents 8 and 9). With the RLS algorithm, the convergence speed is independent of the condition number of the cross-correlation matrix of the input signals. This is especially important for ensuring fast convergence with highly auto-correlated and cross-correlated input signals such as speech or music. In addition, realizing acoustic echo cancellation or an adaptive beamformer in the DFT domain allows bin-wise adaptation. This is particularly advantageous for signals that are sparse in the time-frequency domain, because the step size of the adaptation algorithm can be adjusted individually for each DFT bin. The result is more frequent adaptation and faster convergence of the adaptive filters and, especially for colored signals with time-varying spectra, much greater interference suppression (Non-Patent Documents 10 and 11).
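As a rough illustration of bin-wise step-size control, the sketch below normalizes a base step size by a recursively smoothed power estimate in each DFT bin and sets the step to zero in bins where adaptation should be frozen. This is one common normalization scheme, assumed here for illustration; the names `update_bin_power` and `binwise_step_sizes`, the forgetting factor, and the regularization constant `delta` are hypothetical and do not come from the patent.

```python
from typing import List, Optional, Sequence


def update_bin_power(prev_power: Sequence[float],
                     spectrum: Sequence[complex],
                     forgetting: float = 0.9) -> List[float]:
    """Recursively smooth the input power |X_b|^2 separately in each bin b."""
    if not 0.0 <= forgetting < 1.0:
        raise ValueError("forgetting factor must lie in [0, 1)")
    return [forgetting * p + (1.0 - forgetting) * abs(x) ** 2
            for p, x in zip(prev_power, spectrum)]


def binwise_step_sizes(power: Sequence[float],
                       mu0: float = 0.5,
                       delta: float = 1e-6,
                       adapt_mask: Optional[Sequence[bool]] = None) -> List[float]:
    """Per-bin step size mu0 / (power + delta); zero where adaptation is frozen."""
    if adapt_mask is None:
        adapt_mask = [True] * len(power)
    return [mu0 / (p + delta) if adapt else 0.0
            for p, adapt in zip(power, adapt_mask)]


# Illustrative spectrum of one input block with three DFT bins.
spectrum = [4 + 0j, 0.1j, 2 - 2j]
power = update_bin_power([0.0, 0.0, 0.0], spectrum, forgetting=0.0)

# Bin 1 is frozen (for example, because a classifier flagged it);
# the other bins get a step size inversely proportional to their power.
steps = binwise_step_sizes(power, mu0=1.0, delta=0.0,
                           adapt_mask=[True, False, True])
```

Because each bin carries its own step size, a bin that is momentarily free of the desired signal can keep adapting while a neighboring bin stays frozen, which is the benefit the paragraph above attributes to sparse signals.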

To improve the robustness of this class of algorithms, Non-Patent Document 12 derives a robust DFT-domain adaptive filter based on robust statistics and a nonlinear least-squares error (LSE) criterion and applies it to acoustic echo cancellation. In contrast to subband robust adaptive filters (Non-Patent Document 3), in which a nonlinear cost function is applied to each subband ("narrowband decomposition") and the error signal of each subband is minimized individually, Non-Patent Document 12 minimizes the fullband error signal in the discrete time domain. Because of this time-domain optimization criterion, however, the method of Non-Patent Document 12 cannot be combined with bin-wise step-size control in the DFT domain.

Accordingly, one object of the present invention is to provide an adaptive beamformer that combines fast convergence and computational simplicity, as well as high tracking capability and sufficiently low delay, with step-size control for each frequency bin.

It has been found that cost functions for outlier-robust multiple-input multiple-output (MIMO) frequency-domain adaptive algorithms can be derived using a predetermined non-Gaussian probability density function of the error signal on which the adaptation is based, together with the maximum-likelihood estimation principle. Formulating these cost functions makes bin-wise adaptation possible for adaptive beamformers, or for adaptive beamformers combined with acoustic echo cancellation. The predetermined probability density function is typically super-Gaussian, that is, it has heavier tails (or higher kurtosis) than a Gaussian distribution.

Specifically, the present invention provides a MIMO adaptive filter and an adaptive filtering algorithm applied to adaptive beamforming and to joint adaptive beamforming and acoustic echo cancellation. The MIMO adaptive filter consists of a matrix of adaptive filter coefficients and a module for adaptively updating the coefficient matrix; by exploiting the statistics of the error signal, it minimizes the error signal with respect to a predetermined optimization criterion. The error signal advantageously has a non-Gaussian probability density distribution, which makes the adaptive algorithm robust against outliers. The adaptation is performed in a transform domain in which the sparseness of the input signals can be exploited for more frequent coefficient updates.

In particular, the probability density distribution is given by

where ε ∈ [0, 1] is the outlier probability given in Non-Patent Document 16, and the constant k₀ depends on ε and is chosen such that
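The formulas for the density and for k₀ are not reproduced in this text. As a hedged illustration only, the sketch below assumes the density is the least-favorable distribution for an ε-contaminated Gaussian from Huber's robust statistics (Non-Patent Document 16), for a real-valued, unit-scale error: Gaussian in the center, exponential in the tails, with k₀ tied to ε by 2φ(k₀)/k₀ − 2Φ(−k₀) = ε/(1 − ε). Whether the patent uses exactly this form (for example, for complex DFT-domain errors) is an assumption, and the names `solve_k0` and `huber_density` are hypothetical.

```python
import math


def std_normal_pdf(z: float) -> float:
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z: float) -> float:
    """Standard normal distribution function Phi(z)."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def solve_k0(eps: float, lo: float = 1e-6, hi: float = 10.0,
             iterations: int = 200) -> float:
    """Solve 2*phi(k)/k - 2*Phi(-k) = eps/(1 - eps) for k by bisection.

    The left-hand side decreases monotonically in k, so bisection applies.
    The default bracket is adequate for moderate eps, not for eps near 0 or 1.
    """
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    target = eps / (1.0 - eps)

    def g(k: float) -> float:
        return 2.0 * std_normal_pdf(k) / k - 2.0 * std_normal_cdf(-k)

    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) > target:
            lo = mid  # g is still too large, so the root lies at larger k
        else:
            hi = mid
    return 0.5 * (lo + hi)


def huber_density(z: float, eps: float, k0: float) -> float:
    """Assumed density: Gaussian for |z| <= k0, exponential tails beyond."""
    a = abs(z)
    rho = 0.5 * z * z if a <= k0 else k0 * a - 0.5 * k0 * k0
    return (1.0 - eps) / math.sqrt(2.0 * math.pi) * math.exp(-rho)


# Illustrative value: 5 % outlier probability gives k0 of roughly 1.4.
k0 = solve_k0(0.05)
```

Under this assumption the negative log-density is quadratic for small errors and linear for large ones, so large errors (outliers such as double-talk bursts) influence a maximum-likelihood update far less than they would under a Gaussian model.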

Accordingly, an adaptive filter according to the present invention includes: a finite impulse response (FIR) filter that has a vector of adaptive coefficients and is connected to receive a plurality of input signals; means for computing an error signal based on a reference signal and the output of the FIR filter; means for updating the adaptive coefficient vector in response to the input signals and the error signal, each transformed into a plurality of DFT bins, and based on a predetermined probability density distribution of the error signal; and means for adaptively controlling the updating means, based on the input signals and the reference signal each transformed into DFT bins, such that for each DFT bin the coefficient vector is updated when no disturbance is present.

The predetermined probability density distribution may be a non-Gaussian probability density distribution.

Preferably, the non-Gaussian probability density distribution is given by

where ε ∈ [0, 1] is the outlier probability of the input signal, and the constant k₀ depends on ε and is chosen such that

More preferably, the FIR filter has a plurality of output paths and a coefficient matrix containing a coefficient vector for each combination of input signal and output path. The means for computing the error signal includes means for computing an error signal for each combination of input signal and output signal based on the reference signal and the output of the FIR filter. The updating means includes means for updating the matrix of adaptive coefficients in response to the input signals and the error signals, each transformed into DFT bins, and based on a predetermined probability density distribution of each error signal.

Most preferably, the updating means includes means for updating the coefficient vector using the multi-channel bin-wise robust FDAF (MC-BRFDAF) algorithm.

1. Introduction
The following description derives a DFT-domain adaptive filter that is resilient to double-talk and uses an LSE cost function in the DFT domain, so that bin-wise step-size control can be used together with a double-talk-robust algorithm. To apply this technique to MIMO systems, the algorithm is formulated for the multi-channel case and is called the multi-channel bin-wise robust FDAF (MC-BRFDAF) (Section 2). For brevity, only the unconstrained case is considered. The derivation is similar to that in Non-Patent Documents 8, 9, and 12.

MC-BRFDAF is then applied to adaptive beamforming for multi-channel speech enhancement with microphone arrays (Section 3). Using the example of the generalized sidelobe canceller (GSC) (Non-Patent Document 18) with an adaptive blocking matrix (Non-Patent Document 1), Non-Patent Document 17 shows that DFT-domain adaptive filtering can be applied effectively to adaptive beamforming and, in particular, that the sparseness of the desired speech signal helps to solve the tracking problem of GSCs that use an adaptive realization of the blocking matrix. These DFT-domain GSCs deal especially efficiently with the challenges facing beamforming microphone arrays, such as robustness against reverberation and physical tolerances of the sensors, or time variation of the desired signal or the interference. In this application, experiments on a GSC using MC-BRFDAF showed that, even for small microphone arrays, robustness against double-talk is greatly improved over a GSC using MC-FDAF, so that larger step sizes can be chosen for adaptation. This led to faster convergence and greater noise reduction while the output signal of the beamformer retained good quality.

2. Double-Talk-Robust Frequency-Domain Adaptive Filter
In this section, MC-BRFDAF is formulated for a linear multiple-input single-output (MISO) filter. The generalization to the MIMO case is summarized at the end of the section. The derivation follows Non-Patent Documents 8, 9, and 12.

In the equations, bold lowercase and uppercase letters denote vectors and matrices, respectively. (·)^*, (·)^T, and (·)^H denote complex conjugation, vector or matrix transposition, and conjugate transposition, respectively. Underlined quantities are DFT-domain variables, and k is the discrete time index. In this specification, the underline is written as an underscore in front of the variable, and vectors and matrices are distinguished by name.

2.1 Computing the output signal using overlap-save
The output signal vector e(k) of an adaptive MIMO system with Q input channels is given by

Here, y_ref(k) is the reference signal. The MISO filter is described by the QN × 1 vector w(k), which stacks Q column vectors w_q(k) of length N with filter coefficients w_{n,q}(k), n = 0, 1, …, N−1.

適応フィルタの入力信号は、ＱＮ×１ベクトルｘ（ｋ）で表わされる。

The input signal of the adaptive filter is represented by a QN × 1 vector x (k).

To compute the output signal of the MISO system in the DFT domain by fast convolution with overlap-save, blocks of N samples of the error signal vector e(k) are formed as follows.

We define the block overlap factor α = N/R, where R is the number of new samples per block, and replace the discrete time k in equation (6) by the block time r, which is related to k by rR = k. The data matrix X(rR) is then transformed into a block-diagonal matrix X(r) of size 2NQ × 2N in the DFT domain using the DFT matrix F_{2N×2N} of size 2N × 2N.

ここでｗ（ｒＲ）はＤＦＴドメインで以下のように書くことができる。

Here, w (rR) can be written in the DFT domain as follows.

Here, the window matrix

appends N zeros to the coefficient vector w_q(rR) in order to avoid circular convolution. I_{N×N} is the identity matrix of size N × N, and 0_{N×N} is the N × N matrix of zeros. From equation (6), we obtain:

Here, the window matrix

extracts N samples. The block of R samples of the output signal of the adaptive filter is given by the last R samples of the vector e(rR).
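The zero-padding and sample-extraction steps of Section 2.1 can be illustrated with a small overlap-save sketch. It is not the patent's formulation: it assumes a single channel (Q = 1), no block overlap (R = N, so α = 1), and a naive DFT helper named `dft` that is introduced here only to keep the example self-contained.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(n^2) DFT; adequate for a small demonstration."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(acc / n if inverse else acc)
    return out

def overlap_save_block(x_block, w):
    """Filter one block of 2N input samples with an N-tap filter w.

    x_block holds the N previous samples followed by the N newest ones.
    Returns the N valid output samples (second half of the circular result).
    """
    n = len(w)
    assert len(x_block) == 2 * n
    w_padded = list(w) + [0.0] * n            # append N zeros to avoid wrap-around
    X = dft(x_block)
    W = dft(w_padded)
    y_circ = dft([a * b for a, b in zip(X, W)], inverse=True)
    return [v.real for v in y_circ[n:]]       # discard the first N aliased samples

def direct_fir(x, w):
    """Time-domain FIR output for the same N newest samples (reference)."""
    n = len(w)
    return [sum(w[j] * x[i - j] for j in range(n)) for i in range(n, 2 * n)]

x = [0.5, -1.0, 2.0, 0.0, 1.5, -0.5, 3.0, 1.0]    # 2N = 8 samples
w = [1.0, 0.5, -0.25, 0.125]                       # N = 4 taps
y_fast = overlap_save_block(x, w)
y_ref = direct_fir(x, w)
```

The last N samples of the circular convolution agree with the linear convolution, which is why only those samples are kept.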

2.2 Optimization criterion
To formulate the optimization criterion, the block error vector e(rR) is transformed into the DFT domain by multiplying equation (14) from the left by

where


(Non-Patent Document 8). This yields:

The elements of the vector _e(r) are denoted by _e_n(r), n = 0, 1, …, 2N−1.

Following Non-Patent Documents 3, 4, and 12, and applying Parseval's theorem to the reference signal, the DFT-domain cost function ξ(r) is defined as follows.

The parameter k₀ is a constant. Because the scale of |_e_n(r)| is generally unknown, the variable s_n(r) is introduced in equation (20) to make ρ(·) scale-invariant (Non-Patent Document 16); it must reflect the residual noise level at the system output (Non-Patent Documents 3 and 4). For |_e_n(r)|/s_n(r) ≤ k₀, equation (20) corresponds to the LSE criterion (Non-Patent Document 8) with a quadratic error surface in each DFT bin, whereas for |_e_n(r)|/s_n(r) > k₀ the quadratic criterion is replaced by a 1-norm criterion. Choosing ρ(·) in this way makes the estimator robust against outliers because, relative to the quadratic cost function, for |_e_n(r)|/s_n(r) > k₀ the gradient

is reduced. The choice of k₀ is a trade-off between convergence speed and robustness: a smaller k₀ treats more errors as outliers, which increases the robustness of the algorithm at the cost of slower convergence. MC-FDAF (Non-Patent Document 8) is obtained with

or, equivalently, with

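The behaviour described in Section 2.2, quadratic for small normalized errors and 1-norm for large ones, matches the standard Huber loss. The sketch below uses that standard form as an assumption, since equation (20) itself is not reproduced in this text; the function names `rho` and `psi` are chosen for illustration.

```python
def rho(z, k0):
    """Huber-type loss: quadratic for |z| <= k0, linear (1-norm) beyond."""
    a = abs(z)
    if a <= k0:
        return 0.5 * a * a
    return k0 * a - 0.5 * k0 * k0

def psi(z, k0):
    """Derivative of rho with respect to |z|: identity inside, clipped to k0 outside."""
    return min(abs(z), k0)

k0 = 1.5
inside = (rho(1.0, k0), psi(1.0, k0))     # quadratic region
outside = (rho(3.0, k0), psi(3.0, k0))    # linear region, gradient limited to k0
```

Because `psi` never exceeds k₀, a single large error cannot dominate the update, which is the sense in which the gradient is reduced relative to the quadratic cost.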

2.3 Adaptive algorithm
The cost function (20) is minimized with respect to the vector _w(r) using an iterative Newton algorithm (Non-Patent Document 19) of the following form:

The matrix μ(r) is a diagonal matrix of size 2N × 2N with the step sizes μ_n(r), n = 0, 1, …, 2N−1, on its main diagonal; it controls the adaptation separately in each frequency bin. The DFT-domain Newton step (22) is analogous to the discrete-time-domain Newton step of Non-Patent Document 5 and extends the DFT-domain Newton step of Non-Patent Document 12 to bin-wise operation.

1) Gradient of the cost function
Following Non-Patent Document 12,

and using the chain rule, the gradient ∇ξ(r) is obtained as follows.

With the column vector of length 2N

equation (24) can be written as follows.

2) Hessian of the cost function
The Hessian matrix ∇²ξ(r) of size 2N × 2N can be computed from equation (29) using the following.

Denoting the n-th element of the vector _Ψ(r) in equation (28) by Ψ_n(r) and applying the chain rule, the n-th element

in equation (30) can be computed as follows:

where

式（３１）に、式（２５）、（２６）、（３２）及び（３３）を代入することにより、以下が得られる。

By substituting equations (25), (26), (32), and (33) into equation (31), the following is obtained.

Placing γ_n(r), n = 0, 1, …, 2N−1, on the main diagonal to form the 2N × 2N diagonal matrix _Γ(r) gives

and substituting equations (27) and (34) into equation (30) gives

An estimate of the expectation matrix _Λ(r) = E{∇²ξ(r)} is obtained by recursively averaging ∇²ξ(r) with a forgetting factor 0 < λ < 1, as follows (Non-Patent Documents 5 and 12).

Estimating the matrix _Λ(r) recursively in this way yields, with respect to

an MC-FDAF with RLS-like properties (see equation (42)).
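The recursive averaging with a forgetting factor can be sketched as a per-bin exponentially weighted average. The (1 − λ) weighting of the new term is an assumption made here, since the recursion itself is not reproduced in this text and may use a different normalization; `recursive_average` is an illustrative name.

```python
def recursive_average(prev, current, lam):
    """Exponentially weighted running average with forgetting factor 0 < lam < 1."""
    if not 0.0 < lam < 1.0:
        raise ValueError("forgetting factor must satisfy 0 < lam < 1")
    return [lam * p + (1.0 - lam) * c for p, c in zip(prev, current)]

# Per-bin example: the instantaneous value stays at 4.0 in every bin.
estimate = [0.0, 0.0, 0.0]
for _ in range(200):
    estimate = recursive_average(estimate, [4.0, 4.0, 4.0], lam=0.9)
```

A forgetting factor close to 1 gives a smoother but slower estimate; a smaller value tracks changes faster at the cost of a noisier estimate.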

3) Approximation
Because the Newton-type adaptation step (22) requires the inverse of the 2NQ × 2NQ matrix ∇²ξ(r), a practical system needs an approximation of equation (36) to reduce the computational complexity. Following Non-Patent Documents 8 and 9, for sufficiently large N we can approximate

which yields the following equation.

To compute, using equation (37),

the block-diagonal structure of ∇²ξ(r) can be exploited to rearrange the 2NQ × 2NQ matrix ∇²ξ(r) into 2N matrices of size Q × Q. This reduces the inversion of one 2NQ × 2NQ matrix to the inversion of 2N matrices of size Q × Q (Non-Patent Document 9).
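The complexity reduction can be illustrated by solving one small Q × Q system per DFT bin in place of one large system. The sketch assumes Q = 2 and uses made-up Hermitian per-bin matrices; `solve_2x2` and `solve_per_bin` are illustrative helpers, not part of the specification.

```python
def solve_2x2(s, b):
    """Solve the 2x2 complex system s @ x = b with the explicit inverse."""
    (a11, a12), (a21, a22) = s
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("singular per-bin matrix")
    return [(a22 * b[0] - a12 * b[1]) / det,
            (-a21 * b[0] + a11 * b[1]) / det]

def solve_per_bin(s_bins, b_bins):
    """One Q x Q solve per DFT bin instead of one 2NQ x 2NQ inversion."""
    return [solve_2x2(s, b) for s, b in zip(s_bins, b_bins)]

# Two bins with Hermitian 2x2 matrices (illustrative values only).
s_bins = [[[2.0 + 0j, 0.5 - 0.5j], [0.5 + 0.5j, 1.0 + 0j]],
          [[3.0 + 0j, 1j], [-1j, 2.0 + 0j]]]
b_bins = [[1.0 + 1j, 0.0], [2.0, -1j]]
x_bins = solve_per_bin(s_bins, b_bins)
```

For general Q, each per-bin system would be solved with a standard linear solver; the cost then grows with 2N small solves and not with the cube of 2NQ.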

For one output channel, the MC-BRFDAF adaptation algorithm of the MISO system is finally given by equations (17), (22), (29), (37), and (38). Generalizing the MC-BRFDAF algorithm to a MIMO system with Q input channels and P output channels

is straightforward: the algorithm is repeated for each of the P output channels. In summary, one iteration of the MC-BRFDAF adaptation algorithm can be written for the MIMO case as follows.

In contrast to MC-FDAF, the dependence on the matrix _Γ_p(r) means that the inverse of the weighted cross-power spectral density matrix

must be computed for each of the P output channels. MC-FDAF is obtained by setting, in equations (40) to (42),

and

Beyond Non-Patent Document 10, equation (29), the update equation (42) allows bin-wise operation with a bin-dependent step size μ_n(r) and a bin-dependent scale parameter s_n(r). Moreover, the derivation based on the DFT-domain cost function (20) does not require the approximation of the weighting matrix corresponding to _Γ(r) (Non-Patent Document 10, equation (31)) in order to obtain an efficient realization of the algorithm.

3 Embodiment of the MIMO adaptive filter
3.1 Overview of the MIMO adaptive filter
FIG. 1 shows the structure of a linear finite impulse response (FIR) MIMO filter 20 with M input channels 22 and P output channels 24. The FIR filter 30 between the m-th input and the p-th output is represented by the vector w_{m,p}(k), and the FIR filters 30 between all input and output channels are collected in the matrix W(k). The system W(k) is driven by the M input signals x_m(k), which are represented by the matrix X(k) and summed by adders 32, 34, …, 36. The output signal of the MIMO system is denoted y(k) 24.

適応ＭＩＭＯフィルタリングでは、システムＷ（ｋ）は図２に示される構造を用いて最適化される。図２を参照して、適応線形ＭＩＭＯフィルタ５０は、入力信号ｘ（ｋ）５２を受け信号ｙ（ｋ）を出力するＦＩＲフィルタ６０と、外乱と組合わされた参照信号ｙ_ref（ｋ）から信号ｙ（ｋ）を減算して誤差信号ｅ（ｋ）５４を出すための減算器６６と、入力信号ｘ（ｋ）５２と誤差信号ｅ（ｋ）５４とを用いて、ある所与の最適化基準に従ってＷ（ｋ）を決定するためにコスト関数を定式化するための係数更新モジュール６２と、入力信号ｘ（ｋ）５２及び参照信号ｙ_ref（ｋ）を用いてダブルトークを検出し、係数更新モジュール６２の更新を制御するダブルトーク検出器６４とを含む。 In adaptive MIMO filtering, the system W (k) is optimized using the structure shown in FIG. Referring to FIG. 2, adaptive linear MIMO filter 50 receives a signal from FIR filter 60 that receives input signal x (k) 52 and outputs signal y (k), and reference signal y _ref (k) combined with disturbance. Using a subtractor 66 to subtract y (k) to produce an error signal e (k) 54, an input signal x (k) 52 and an error signal e (k) 54, a given optimization A coefficient update module 62 for formulating a cost function to determine W (k) according to a criterion, a double talk is detected using an input signal x (k) 52 and a reference signal y _ref (k), and a coefficient And a double-talk detector 64 that controls the update of the update module 62.

ＭＩＭＯフィルタ６０の出力信号ｙ（ｋ）は減算器６６で参照信号ｙ_ref（ｋ）から減算され、この結果誤差信号ｅ（ｋ）５４が得られる。誤差信号ｅ（ｋ）５４は、係数更新モジュール６２で所与の最適化基準に従ってＷ（ｋ）を決定するためにコスト関数を定式化するのに用いられる。特に、最適化は、ＬＣＭＶ又はＬＣＬＳＥビーム形成等でのような制約に従って行なわれ得る。誤差信号ｅ（ｋ）５４は通常、コスト関数に従って最小化される。適応フィルタリングでは、この最小化問題は、入力信号ｘ（ｋ）５２と誤差信号ｅ（ｋ）５４とを用いてＷ（ｋ）の係数更新を行なう何らかの適応アルゴリズムにより、反復して解決される。もし参照信号ｙ_ref（ｋ）が何らかの外乱信号とのダブルトークを含んでいれば、誤差信号ｅ（ｋ）５４全体を最小化するのは望ましくない。なぜなら、信号はその場合、外乱と参照信号とのダブルトークも含んでいるからである。従って、外乱の存在を認識し、係数更新モジュール６２によるＷ（ｋ）の適応を遅くするか又は停止するダブルトーク検出器が必要とされる。 The output signal y (k) of the MIMO filter 60 is subtracted from the reference signal y _ref (k) by the subtractor 66, and as a result, an error signal e (k) 54 is obtained. The error signal e (k) 54 is used by the coefficient update module 62 to formulate a cost function to determine W (k) according to a given optimization criterion. In particular, the optimization may be performed according to constraints such as in LCMV or LCLSE beamforming. The error signal e (k) 54 is typically minimized according to a cost function. In adaptive filtering, this minimization problem is solved iteratively by some adaptive algorithm that uses the input signal x (k) 52 and the error signal e (k) 54 to update the coefficients of W (k). If the reference signal y _ref (k) includes double talk with some disturbance signal, it is not desirable to minimize the entire error signal e (k) 54. This is because the signal then includes double talk between the disturbance and the reference signal. Therefore, there is a need for a double talk detector that recognizes the presence of disturbances and slows or stops the adaptation of W (k) by the coefficient update module 62.

音声及びオーディオ信号処理等の多くの応用では、ある予め定められた変換ドメインにおいて、信号ｘ（ｋ）５２及びｙ_ref（ｋ）はスパースである。図３はこのような応用の処理に適した適応線形ＭＩＭＯフィルタ８０の構造を示す。 In many applications, such as voice and audio signal processing, the signals x (k) 52 and y _ref (k) are sparse in certain predetermined transform domains. FIG. 3 shows the structure of an adaptive linear MIMO filter 80 suitable for processing such applications.

図３を参照して、適応線形ＭＩＭＯフィルタ８０は、入力信号ｘ（ｋ）８２を受け、信号ｙ（ｋ）を出力するＦＩＲフィルタ９０と、外乱と組合わされた参照信号ｙ_ref（ｋ）から信号ｙ（ｋ）を減算する減算器９８と、入力信号ｘ（ｋ）８２、参照信号ｙ_ref（ｋ）、及び誤差信号ｅ（ｋ）８４をそれぞれ変換（ＤＦＴ）ドメインに変換するための変換器９２、１００及び１０２と、いずれも変換ドメインに変換された入力信号ｘ（ｋ）８２と誤差信号ｅ（ｋ）８４とを用いて、ある所与の最適化基準に従ってＷ（ｋ）を決定するためにコスト関数を定式化するための係数更新モジュール９４と、各ＤＦＴ領域区分について、係数更新モジュール９４が、入力信号に干渉やノイズがあるときのみ係数行列を更新し、入力信号に所望の信号があるときには更新を凍結するように各ＤＦＴ領域区分に別個に作用する、ダブルトークを検出するためのダブルトーク検出器９６とを含む。 Referring to FIG. 3, adaptive linear MIMO filter 80 includes an FIR filter 90 that receives input signal x (k) 82 and outputs signal y (k), and reference signal y _ref (k) combined with disturbance. A subtractor 98 for subtracting the signal y (k), and a conversion for converting the input signal x (k) 82, the reference signal y _ref (k), and the error signal e (k) 84 into the conversion (DFT) domain, respectively. Determine the W (k) according to a given optimization criterion using the devices 92, 100 and 102 and the input signal x (k) 82 and the error signal e (k) 84, both transformed into the transform domain. A coefficient updating module 94 for formulating a cost function, and for each DFT region segment, the coefficient updating module 94 updates the coefficient matrix only when there is interference or noise in the input signal, There is a signal And a double-talk detector 96 for detecting double-talk that acts separately on each DFT region segment to freeze updates.

According to FIG. 3, the signals x(k) 82, y_ref(k), and e(k) 84 can thus be transformed into this domain by a transform T, so that sparseness can be exploited for the coefficient update of the adaptive filter W(k) 90. In this context, sparseness means that the reference signal and the disturbance often do not overlap in the transform domain. In that case, the coefficient update module 94 can update the coefficients in non-overlapping data segments of the transform domain even during double-talk in the discrete time domain. Choosing the DFT as the transform has many advantageous properties. In the DFT domain, the coefficients can be updated in those DFT bins in which only the reference signal is detected. This requires a double-talk detector 96 that operates separately in each DFT bin.

However, the double-talk detector 96 shown in FIG. 3 cannot detect double-talk perfectly, so there is always a risk that adaptation takes place during double-talk. To adapt the adaptive filter quickly, a fast-converging adaptive algorithm is generally desirable, but undetected double-talk can make such an algorithm diverge and thereby degrade the performance of the adaptive filter.

従って、先行技術の適応アルゴリズムは多くの状況で満足のいく性能を発揮するものの、高速で収束する適応アルゴリズムを使用しながら、一方では検出されないダブルトークによって引起される摂動に対抗することが望ましい。 Thus, while prior art adaptive algorithms perform satisfactorily in many situations, it is desirable to counteract perturbations caused by undetected double talk while using fast convergent adaptive algorithms.

3.2 Outlier-robust cost function for the adaptive filter
To ensure fast convergence of the MIMO filter for adaptive beamforming, or for acoustic echo cancellation combined with adaptive beamforming, while avoiding divergence, we propose an adaptive algorithm based on the well-known outlier-robust maximum likelihood estimation. Typical non-robust, fast-converging adaptive algorithms such as the RLS algorithm or MC-FDAF can be interpreted as maximum likelihood estimators for a Gaussian-distributed error signal e(k). Although this assumption is very convenient, such algorithms are known to be highly sensitive to even small deviations from it and are therefore easily affected by outliers.

さらに、統計学の文献から、ガウス分布よりも分布の裾部分が厚くなる単一分布を用いることによって、異常値に対し頑健な最大尤度推定器が導出され得ることがよく知られている。このような頑健な最大尤度推定を用いて、頑健な適応アルゴリズム（特に、変換ドメインでのスパースネスを活用した適応アルゴリズム）を導出することができ、これは、適応ビーム形成又は適応ビーム形成とアコースティックエコーキャンセレーションとの結合に有利に利用することができる。 Furthermore, it is well known from statistical literature that a maximum likelihood estimator that is robust against outliers can be derived by using a single distribution in which the tail of the distribution is thicker than a Gaussian distribution. Such robust maximum likelihood estimation can be used to derive a robust adaptation algorithm (especially an adaptation algorithm that takes advantage of sparseness in the transform domain), which is adaptive beamforming or adaptive beamforming and acoustics. It can be advantageously used for coupling with echo cancellation.

このような確率密度関数の一つは以下で与えられる。 One such probability density function is given below.

In statistics, this is known as the least informative distribution. ε ∈ [0, 1] is the outlier probability introduced in Non-Patent Document 16, and the constant k₀ depends on ε and is chosen such that

holds. The least informative distribution is Gaussian in its center and Laplacian in its tails (with larger kurtosis than the Gaussian distribution).
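The dependence of k₀ on ε can be illustrated numerically. The sketch below assumes the relation commonly given for Huber's least informative distribution, 2φ(k₀)/k₀ − 2Φ(−k₀) = ε/(1 − ε), with φ and Φ the standard normal density and distribution function; whether this is exactly the condition omitted above cannot be confirmed from this text. The function names are illustrative.

```python
import math

def outlier_mass(k0):
    """Left-hand side 2*phi(k0)/k0 - 2*Phi(-k0) of the assumed relation."""
    phi = math.exp(-0.5 * k0 * k0) / math.sqrt(2.0 * math.pi)
    tail = 0.5 * math.erfc(k0 / math.sqrt(2.0))      # Phi(-k0)
    return 2.0 * phi / k0 - 2.0 * tail

def k0_from_epsilon(eps, lo=1e-6, hi=10.0, iters=200):
    """Bisection for k0 such that outlier_mass(k0) = eps / (1 - eps)."""
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    target = eps / (1.0 - eps)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if outlier_mass(mid) > target:    # the mass decreases as k0 grows
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

k0_5_percent = k0_from_epsilon(0.05)
```

Under this assumed relation, a larger outlier probability ε gives a smaller k₀, so more of the error range is handled by the Laplacian tails.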

例として、ＤＦＴ領域区分ごとのダブルトーク検出器のためのＤＦＴドメイン適応フィルタを導出する基本的ステップを示す。この場合、ｚはＤＦＴドメインにおける最適線形フィルタ＿Ｗ（ｒ）のｐ番目の誤差信号＿ｅ_p,n（ｒ）であると解釈され、＿ｗ_p（ｒ）のＭ−推定器（又は最大尤度型推定器）は、＿ｗ_p（ｒ）に関して以下に示すコスト関数を最小化することによって得られる。 As an example, the basic steps for deriving a DFT domain adaptive filter for a double-talk detector per DFT region partition are shown. In this case, z is interpreted as the p-th error signal _e _{p, n} (r) of the optimal linear filter _W (r) in the DFT domain, and the M-estimator (or maximum likelihood type) of _w _p (r) The estimator is obtained by minimizing the cost function shown below for _w _p (r).

これは、等価的には以下のようにも表せる。

This can be equivalently expressed as follows.

The scale factor s_{p,n}(r) normalizes the variance of the argument of ρ(·). For |_e_{p,n}(r)|/s_{p,n}(r) ≤ k₀, equation (46) corresponds to the LSE criterion with a quadratic cost function, whereas for |_e_{p,n}(r)|/s_{p,n}(r) > k₀, which is likely to correspond to an outlier, equation (46) is a 1-norm criterion. The gradient of ξ_p(r) is thus limited, which increases the robustness against outliers.
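The effect of the limited gradient on a coefficient update can be seen in a single-bin toy example. This is a simplified normalized gradient step with a clipped error, not the Newton step of the specification; the step size, scale, and signal values are made up for illustration.

```python
def clipped_error(e, s, k0):
    """Limit the magnitude of e to k0 * s while keeping its phase."""
    mag = abs(e)
    if mag <= k0 * s:
        return e
    return e * (k0 * s / mag)

def update(w, x, d, mu, s=None, k0=None):
    """One normalized gradient step for a single complex coefficient in one bin."""
    e = d - w * x
    if s is not None:
        e = clipped_error(e, s, k0)       # robust variant
    return w + mu * x.conjugate() * e / (abs(x) ** 2)

w_true = 2.0 - 1.0j
x = 1.0 + 0.0j
d = w_true * x + 100.0                    # one sample hit by a strong burst
# Both filters start from the converged solution w_true.
w_plain = update(w_true, x, d, mu=0.5)
w_robust = update(w_true, x, d, mu=0.5, s=1.0, k0=1.5)
```

The non-robust step is pulled far away from the converged solution by the burst, whereas the clipped step moves by at most μ·k₀·s.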

式（４６）は以下の形の反復ニュートンアルゴリズムによって解くことができる。 Equation (46) can be solved by an iterative Newton algorithm of the form:

Here,

is the gradient of the cost function ξ(r) with respect to _w(r), and

is the expectation of the Hessian of ξ(r) with respect to _w(r). _μ(r) is a diagonal matrix of size 2N × 2N with the step sizes μ_n(r), n = 0, 1, …, 2N−1, on its main diagonal; it controls the adaptation separately in each frequency bin. The DFT-domain Newton step (47) is analogous to the discrete-time-domain Newton step of Non-Patent Document 5 and extends the DFT-domain Newton step of Non-Patent Document 12 to bin-wise operation.

3.3 Direct realization of the beamformer
FIG. 4 shows an adaptive beamformer 120. Referring to FIG. 4, the adaptive beamformer 120 can be viewed as an adaptive linear MIMO filter 50 according to FIG. 2 with M input channels and P = 1 output channel (a MISO system). The M input channels correspond to M microphone signals 122, which contain a mixture of the desired signal, interference, and noise. The adaptive filter 130 is adapted so that interference and noise are suppressed as much as possible while distortion of the desired signal is minimized.

図４を参照して、適応ビーム形成器１２０は、入力チャネルの各々に対し、適応フィルタ１５０、１５２、…、１５４を含む適応フィルタ１３０と、適応フィルタ１５０、１５２、…、１５４の出力を加算して、信号ｙ（ｋ）を得るための加算器１３８と、参照信号ｙ_ref（ｋ）から信号ｙ（ｋ）を減算し、誤差信号ｅ（ｋ）１２４を出力するための減算器１４０と、入力信号ｘ（ｋ）１２２、参照信号ｙ_ref（ｋ）、及び誤差信号ｅ（ｋ）１２４をそれぞれＤＦＴドメインに変換するための変換器１３２、１４２及び１４４と、いずれも変換器１３２及び１４４によって変換ドメインに変換された入力信号ｘ（ｋ）１２２及び誤差信号ｅ（ｋ）１２４を用いて、ある所与の最適化基準に従ってＷ（ｋ）（ｗ₀（ｋ），ｗ₁（ｋ），…,ｗ_M-1（ｋ））を決定するためにコスト関数を定式化する係数更新モジュール１３４と、変換器１３２及び１４２によっていずれもＤＦＴドメインに変換された入力信号ｘ（ｋ）１２２及び参照信号ｙ_ref（ｋ）を用いてＤＦＴ領域区分の各々においてダブルトークを検出するためのダブルトーク検出器１３６とを含む。 Referring to FIG. 4, adaptive beamformer 120 adds, for each of the input channels, adaptive filter 130 including adaptive filters 150, 152,... 154, and the outputs of adaptive filters 150, 152,. An adder 138 for obtaining the signal y (k), and a subtractor 140 for subtracting the signal y (k) from the reference signal y _ref (k) and outputting an error signal e (k) 124. , Converters 132, 142 and 144 for converting the input signal x (k) 122, the reference signal y _ref (k), and the error signal e (k) 124 to the DFT domain, respectively. Using the input signal x (k) 122 and the error signal e (k) 124 transformed into the transform domain by W (k) (w ₀ (k), w ₁ (k) according to a given optimization criterion. , ..., w _M-1 ( k)), a coefficient update module 134 that formulates a cost function, and an input signal x (k) 122 and a reference signal y _ref (k), both of which have been converted to the DFT domain by the converters 132 and 142. And a double-talk detector 136 for detecting double-talk in each of the DFT region sections.

適応ＬＣＬＳＥ又はＬＣＭＶビーム形成において、参照信号ｙ_ref（ｋ）はゼロに等しく、従って、適応フィルタ１３０の出力信号ｙ（ｋ）は誤差信号ｅ（ｋ）１２４に等しい。従って、誤差信号を最小化することは、干渉及びノイズのみならず、所望の信号をも抑制することになる。所望の信号の抑制を防ぐために、最適化基準（式（４６）等）に対し時間−空間的制約を導入して、ビーム形成器出力ｙ（ｋ）において所望の信号を保存する。 In adaptive LCLSE or LCMV beamforming, the reference signal y _ref (k) is equal to zero, so the output signal y (k) of the adaptive filter 130 is equal to the error signal e (k) 124. Therefore, minimizing the error signal suppresses not only interference and noise but also a desired signal. In order to prevent suppression of the desired signal, a time-spatial constraint is introduced to the optimization criterion (such as equation (46)) to preserve the desired signal at the beamformer output y (k).

However, whenever the spatio-temporal characteristics assumed in the design of the constraints do not match the actual spatio-temporal characteristics of the desired signal (for example, because of reverberation in the acoustic environment, microphone mismatch, or a mismatch in the position of the desired source), the desired signal is mistaken for interference and cancelled by the adaptive filter. In the literature, this effect is known as signal leakage and cancellation or distortion of the desired signal. It is also well known that cancellation of the desired signal can be avoided by adapting the adaptive filter only when interference or noise is present and freezing the adaptation whenever the desired signal is present. This requires the double-talk detector 136 to detect double-talk between the desired signal and interference or noise and to disable or slow down the adaptation of the adaptive beamformer. Exploiting the sparseness of the sensor signals in the transform domain improves the convergence of the adaptive filter.

In summary, robust MIMO adaptive filtering in the transform domain is applicable wherever robustness against double-talk and fast convergence of the adaptive filter are desired. Studies show that robust DFT-domain adaptive filtering (MC-BRFDAF), which exploits the sparseness of the sensor signals, performs particularly well.

3.4 Realization as a generalized sidelobe canceller with adaptive blocking matrix
Instead of the direct realization of FIG. 4, the adaptive beamformer can also be realized as a GSC.

図５に適応ブロッキング行列を用いたＧＳＣを示す。図５を参照して、ＧＳＣ１７０は、入力信号ｘ（ｋ）１７２を受け、信号ｙ（ｋ）を出力するための固定ビーム形成器１８０と、入力信号ｘ（ｋ）１７２と固定ビーム形成器１８０からの信号ｙ（ｋ）とを受け、Ｂ（ｋ）の出力が干渉の参照基準となるように、所望の信号を抑制するとともに干渉を通過させる適応ブロッキング行列Ｂ（ｋ）１８２と、適応ブロッキング行列Ｂ（ｋ）１８２の出力を受け、適応フィルタリングを用いて参照経路からの残存する干渉を適応的に減算するための干渉キャンセラａ（ｋ）１８６と、信号ｙ（ｋ）から干渉キャンセラ１８６の出力を減算してビーム形成器出力信号ｚ（ｋ）１７４を得るための減算器１８４とを含む。適応ブロッキング行列Ｂ（ｋ）１８２と干渉キャンセラａ（ｋ）１８６とは変換ドメインで頑健な適応ＭＩＭＯフィルタリングを系統的に適用することによって実現される。 FIG. 5 shows GSC using an adaptive blocking matrix. Referring to FIG. 5, GSC 170 receives an input signal x (k) 172 and outputs a signal y (k), a fixed beam former 180, an input signal x (k) 172 and a fixed beam former 180. And an adaptive blocking matrix B (k) 182 that suppresses a desired signal and passes the interference so that the output of B (k) becomes a reference standard for interference, and adaptive blocking. Interference canceller a (k) 186 for receiving the output of matrix B (k) 182 and adaptively subtracting the remaining interference from the reference path using adaptive filtering, and interference canceller 186 from signal y (k) And a subtractor 184 for subtracting the output to obtain a beamformer output signal z (k) 174. The adaptive blocking matrix B (k) 182 and the interference canceller a (k) 186 are realized by systematically applying adaptive MIMO filtering robust in the transform domain.

3.4.1 Fixed beamformer 180
The fixed beamformer 180 steers the sensor array toward the position of the desired source and enhances the desired signal relative to the interference. The fixed beamformer 180 forms the reference path of the GSC 170. Often, the fixed beamformer 180 is designed to tolerate movement of the desired source within a given region so that the desired signal is not attenuated. Alternatively, an adaptive beamformer or an adaptive beam-steering unit may be used to steer the fixed beamformer toward the desired source position. Especially with small microphone arrays, the interference suppression of the fixed beamformer is insufficient for many applications, so an adaptive sidelobe cancelling path consisting of an adaptive blocking matrix and an interference canceller is needed.

3.4.2 Adaptive blocking matrix 182
The blocking matrix B(k) 182 is a spatial filter that suppresses the desired signal and passes the interference, so that the output of B(k) serves as a reference for the interference. The adaptive blocking matrix B(k) 182 comprises M pairs of adaptive filters 190, 192, …, 194 and subtractors 200, 202, …, 204, and adaptively filters each channel of the input signal x(k) 172 using the output y(k) of the fixed beamformer 180.

A fixed blocking matrix B cannot suppress the desired signal completely whenever its spatial filtering does not match the actual wave field of the desired signal, whereas an adaptive blocking matrix can track changes in the wave field of the desired signal. This is particularly important in time-varying reverberant environments, in which a fixed blocking matrix continually passes desired-signal components. The adaptive blocking matrix 182 can be realized with multi-channel adaptive filtering by taking the output signal y(k) of the fixed beamformer 180 as the reference and subtracting this reference, filtered by the adaptive filters 190, 192, …, 194, from each channel of the sidelobe cancelling path.
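The signal flow through the fixed beamformer and the blocking matrix can be sketched for an idealized case. The example assumes an anechoic broadside source (the desired signal is identical at all microphones), a delay-and-sum beamformer with zero delays, and one-tap blocking filters fixed at b_m = 1 in place of the adaptive FIR filters 190, 192, …, 194; all signal values are made up.

```python
def fixed_beamformer(x):
    """Delay-and-sum with zero delays: average the M channels per sample."""
    m = len(x)
    return [sum(ch[k] for ch in x) / m for k in range(len(x[0]))]

def blocking_matrix(x, y, b):
    """Subtract the scaled beamformer output from each channel (one-tap filters)."""
    return [[ch[k] - b_m * y[k] for k in range(len(y))] for ch, b_m in zip(x, b)]

desired = [1.0, -2.0, 0.5, 3.0]
interferer = [0.3, 0.1, -0.4, 0.2]
gains = [1.0, -1.0, 0.0]          # interferer gains per microphone, summing to zero

# Desired signal only: the blocking matrix outputs should vanish.
x_desired_only = [list(desired) for _ in gains]
y = fixed_beamformer(x_desired_only)
refs = blocking_matrix(x_desired_only, y, b=[1.0] * len(gains))

# Desired signal plus interferer: only the interferer should pass.
x_mixed = [[d + g * i for d, i in zip(desired, interferer)] for g in gains]
y_mixed = fixed_beamformer(x_mixed)
refs_mixed = blocking_matrix(x_mixed, y_mixed, b=[1.0] * len(gains))
```

In a reverberant room the desired signal differs between microphones, so fixed coefficients no longer cancel it; that mismatch is what the adaptive filters are meant to track.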

3.4.3 Interference canceller 186
Using the output signals of the blocking matrix 182 as interference references, the interference canceller a(k) 186 adaptively subtracts the residual interference from the reference path by adaptive filtering.

3.4.4 Adaptive control (double-talk detection)
The fixed beamformer 180 cannot produce an interference-free estimate of the desired signal. The blocking matrix 182 should therefore be adapted only when the signal-to-interference ratio (SIR) is high, so that the blocking matrix 182 does not suppress the interference. Interference components suppressed by the blocking matrix 182 cannot be cancelled by the interference canceller 186 and therefore leak to the output of the GSC 170. Likewise, the blocking matrix 182 generally does not produce an interference estimate that is entirely free of the desired signal. The interference canceller 186 should therefore be adapted only when the SIR is low, to prevent cancellation and distortion of the desired signal.

Clearly higher tracking performance is obtained when the decision to adapt B(k) or a(k) is made in separate frequency bins rather than on the full-band signal, because the sparseness of the desired signal and of the interference in the time-frequency domain can then be exploited. Double-talk detectors are therefore required for both the adaptive blocking matrix 182 and the interference canceller 186.

3.4.5 Motivation for using robust transform-domain adaptive filters
As stated above, the adaptive control (or double-talk detector) cannot always detect the activity of the desired signal and of the interference accurately. The blocking matrix 182 and the interference canceller 186 may therefore be adapted during double talk, which leads to outliers in the adaptive filters. Robust MIMO adaptive filtering in the transform domain prevents divergence of the adaptive filters while still allowing fast-converging adaptive algorithms that exploit the sparseness of the sensor signals.

FIG. 8 shows an example of the typical behavior of the adaptive control for a desired male voice with orchestral music played from a loudspeaker in the background. The experimental setup corresponds to that of Section 4. FIGS. 8(A) and 8(B) show the desired signal and the interference signal recorded at the (M/2)-th microphone. FIG. 8(C) shows the ratio SIR(r, n) of the recursively averaged PSD estimates of the desired signal and the interference as a function of the frequency n (kHz) and the block time r. FIG. 8(D) shows the decisions based on SIR(r, n): the blocking matrix (BM) and the interference canceller (IC) are adapted for 10 log₁₀ SIR(r, n) ≥ 15 dB and 10 log₁₀ SIR(r, n) ≤ 15 dB, respectively. FIG. 8(E) illustrates the decisions of the adaptive control using Υ(r, n).

As can be seen, the adaptive control of FIG. 8(E) does not always detect the activity of the desired signal and of the interference accurately. The blocking matrix 182 and the interference canceller 186 may therefore be adapted during double talk, which leads to outliers in the adaptive filters. These outliers, and the possible divergence of the adaptive filters, could be prevented by (a) reducing the step size of the adaptive filters, or (b) reducing the adaptation threshold so that the adaptive filters are less likely to be adapted during double talk.

However, both options reduce the tracking capability and thus the interference suppression of the GSC 170. To avoid this trade-off, the MC-BRFDAF is applied to the blocking matrix 182 and the interference canceller 186.

From FIG. 5 it can be seen that the blocking matrix 182 corresponds to a single-input multiple-output (SIMO) system with one input channel and M output channels, and that the interference canceller 186 corresponds to a MISO system with M input channels and P = 1 output channel. By identifying the adaptive filters b_m(k) of the blocking matrix with w_{m,p}(k) for m = 0 and p = 0, 1, ..., M−1, and the adaptive filter a(k) with w_{m,p}(k) for m = 0, 1, ..., M−1 and p = 0, robust MIMO adaptive filters in the transform domain can be used systematically to realize the GSC 170 with the adaptive blocking matrix 182.

In particular, the MC-BRFDAF algorithm is useful for adapting this GSC. The step-size vector μ(r) is determined by the adaptive control and is switched between 0 and a constant, frequency-independent vector μ_c in order to disable and enable the adaptation.
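The switching of the step size between 0 and a constant value can be sketched per frequency bin as follows. This is a minimal illustration that uses the SIR-based decision rule quoted for FIG. 8(D), with a 15 dB threshold for both the blocking matrix and the interference canceller. The function name `step_size_vectors`, its parameters, and the assumption that a linear-scale SIR estimate is available for every bin are choices made for this example and are not taken from the patent.

```python
import math


def step_size_vectors(sir, mu_c, bm_threshold_db=15.0, ic_threshold_db=15.0):
    """Return per-bin step sizes (mu_bm, mu_ic) for one block r.

    sir:  list of linear-scale SIR estimates, one per frequency bin (all > 0)
    mu_c: constant step size used in every bin where adaptation is enabled
    """
    mu_bm, mu_ic = [], []
    for s in sir:
        sir_db = 10.0 * math.log10(s)
        # Blocking matrix: adapt only where the desired signal dominates.
        mu_bm.append(mu_c if sir_db >= bm_threshold_db else 0.0)
        # Interference canceller: adapt only where the SIR is at or below its threshold.
        mu_ic.append(mu_c if sir_db <= ic_threshold_db else 0.0)
    return mu_bm, mu_ic
```

For example, with SIR estimates of 20 dB, 0 dB, and −20 dB in three bins, only the first bin enables adaptation of the blocking matrix, and only the other two enable adaptation of the interference canceller.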

3.5 Realization of a generalized sidelobe canceller with a fixed blocking matrix
As an alternative to the GSC with the adaptive blocking matrix 182, the GSC can also be realized with a fixed (time-invariant) blocking matrix B. In this case, the robust MIMO adaptive filter is applied only to the interference canceller 186.

3.6 Combination of acoustic echo cancellation and adaptive beamforming
In practical multimedia terminals, it is desirable to combine a beamforming microphone array with acoustic echo cancellation in order to optimally suppress acoustic echoes, interference, and noise (Non-Patent Documents 13 and 14). The general challenge is to minimize the computational complexity of the combined system while making maximum use of the positive synergies between acoustic echo cancellation and beamforming.

FIG. 6 shows the structure of a combined system 220 for acoustic echo cancellation and adaptive beamforming. Studies have shown that the combination of acoustic echo cancellation and adaptive beamforming according to FIG. 6 has advantageous properties in situations with strongly time-varying echo paths and frequent double talk of acoustic echoes e(k), desired signal d(k), and interference or noise n(k) (Non-Patent Document 15).

3.6.1 Direct realization
In FIG. 6, the Q loudspeaker signals v(k) on line 226 can be regarded as additional input channels of the adaptive filter 240. The combined system 220 for acoustic echo cancellation and adaptive beamforming thus corresponds to a MISO system with M + Q input channels 222 and 228, in which the acoustic echoes are treated as interference. Acoustic echoes, interference, and noise can therefore be suppressed by optimizing the adaptive MIMO system with the LCLSE or LCMV optimization criterion. Because adapting the combined system raises the same double-talk problem as conventional LCLSE/LCMV beamforming, it is desirable to use robust MIMO adaptive filters and to exploit sparseness in the transform domain.

The adaptive filter 240 includes an adaptive filter w(k) 250 for receiving the input signal x(k) 222 and an acoustic echo canceller a(k) 252 for receiving the loudspeaker signals v(k) on line 228.

The adaptive beamformer 220 further includes an adder 242 that adds the output of the adaptive filter 250 and the output of the acoustic echo canceller a(k) 252 to generate the output signal y(k) on line 224.

In particular, the MC-BRFDAF is useful for adapting the combined system of FIG. 6.

3.6.2 Realization as a GSC
The combined system 220 for acoustic echo cancellation and adaptive beamforming can be realized as the GSC 270 shown in FIG. 7.

Referring to FIG. 7, the GSC 270 includes an adaptive filter 290 that receives the input signal x(k) 272 and outputs a signal y_wc(k); a blocking matrix B(k) 292 that receives the input signal x(k) 272 and outputs a signal y_B(k); and an interference canceller 298 that receives the signal y_B(k) and the loudspeaker signal v(k) on line 278.

The interference canceller 298 includes an adaptive filter w_a(k) 310 for receiving the signal y_B(k) and an acoustic echo canceller a(k) 312 for receiving the loudspeaker signal v(k).

The GSC 270 further includes an adder 296 for adding the output of the adaptive filter w_a(k) 310 and the output of the acoustic echo canceller a(k) 312, and a subtractor 294 for subtracting the output of the adder 296 from the signal y_wc(k) to generate the output y(k) of the GSC 270.
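The combination performed by the adder 296 and the subtractor 294 can be sketched per sample as follows. This is a minimal illustration that assumes the outputs of the adaptive filter w_a(k) 310 and of the acoustic echo canceller a(k) 312 have already been computed. The function name `gsc_output` and its argument names are hypothetical.

```python
def gsc_output(y_wc, ic_filter_out, aec_out):
    """Combine the reference path and the interference-canceller path.

    y_wc:          output of adaptive filter 290 (reference path)
    ic_filter_out: output of adaptive filter w_a(k) 310
    aec_out:       output of acoustic echo canceller a(k) 312
    """
    y = []
    for ywc, u, v in zip(y_wc, ic_filter_out, aec_out):
        interference_estimate = u + v  # adder 296
        y.append(ywc - interference_estimate)  # subtractor 294
    return y
```

Because the echo canceller output enters the same sum as the interference estimate, the acoustic echo is handled as one more channel of the interference canceller.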

The acoustic echo canceller a(k) 312 is an additional channel of the interference canceller. Because adapting the combined system raises the same double-talk problem as a conventional GSC, it is desirable to use robust MIMO adaptive filters and to exploit sparseness in the transform domain. In particular, the MC-BRFDAF is useful for adapting the combined system of FIG. 7.

4. Experimental results
The GSC realized with the MC-BRFDAF and with the MC-FDAF was applied to a microphone array with M = 4 equally spaced sensors and a 12 cm aperture, in a room with reverberation time T_60 = 250 ms. The desired signal of FIG. 9(A) arrived from the lateral direction at a distance of 60 cm. The interference of FIG. 9(B) arrived from the rear at a distance of 120 cm. The average SIR at the sensors was 3 dB. The parameters were optimized for maximum convergence speed and maximum noise suppression after convergence. The parameters are shown in Table 1 below.

The parameters are the same for both GSC realizations except for the constant step-size parameter μ_c. FIGS. 9(A) to 9(C) show, as functions of time after system initialization, the suppression of the desired signal by the blocking matrix TR_BM(k), the interference suppression of the GSC IR(k), and the distortion of the desired signal by the GSC SSNR(k), respectively. SSNR(k) is the segmental SNR between the output of the fixed beamformer and the output of the GSC for the desired signal only. Since the interference canceller should not distort the desired signal, ideally SSNR(k) = ∞. It can be seen that the blocking matrix (FIG. 9(A)) and the interference canceller (FIG. 9(B)) converge faster with the MC-BRFDAF than with the MC-FDAF. This is because the improved robustness to double talk allows a larger step size to be chosen for the MC-BRFDAF. TR_BM(k) converges to approximately the same value for both GSCs, but IR(k) after convergence is about 4 dB greater with the MC-BRFDAF than with the MC-FDAF. This result was confirmed by applying the algorithm to various mixtures of speech signals. The distortion measure SSNR(k) (FIG. 9(C)) was slightly higher with the MC-BRFDAF than with the MC-FDAF.
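A segmental SNR of the kind used as the distortion measure can be sketched as follows. The text does not give the exact definition, so this example assumes a common form: the signal-to-difference energy ratio in dB is computed for each non-overlapping segment and then averaged. The function name `segmental_snr_db`, the segment length, and the handling of silent segments are assumptions made for this example.

```python
import math


def segmental_snr_db(reference, output, segment_len):
    """Average per-segment SNR in dB between a reference and a processed signal.

    reference: desired-signal component at the fixed-beamformer output
    output:    desired-signal component at the GSC output
    """
    values = []
    for start in range(0, len(reference) - segment_len + 1, segment_len):
        ref_seg = reference[start:start + segment_len]
        out_seg = output[start:start + segment_len]
        signal = sum(r * r for r in ref_seg)
        noise = sum((r - o) ** 2 for r, o in zip(ref_seg, out_seg))
        if signal == 0.0:
            continue  # skip silent segments (an assumption of this sketch)
        if noise == 0.0:
            values.append(math.inf)  # no distortion in this segment
        else:
            values.append(10.0 * math.log10(signal / noise))
    # Return NaN when no segment carried any signal energy.
    return sum(values) / len(values) if values else math.nan
```

When the two signals are identical, every segment contributes infinity, which matches the ideal case SSNR(k) = ∞ stated above.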

The embodiments disclosed herein are merely examples, and the present invention is not limited to the embodiments described above. The scope of the present invention is indicated by each of the claims, read in light of the detailed description of the invention, and includes all modifications within the meaning and scope equivalent to the wording of the claims.

FIG. 1 shows the structure of a linear finite impulse response (FIR) MIMO filter with M input channels and P output channels.
FIG. 2 shows the structure for optimizing the system W(k).
FIG. 3 shows an adaptive linear MIMO filter with a coefficient update module and a double-talk detector in the transform domain.
FIG. 4 shows an adaptive beamformer that uses a reference signal y_ref(k).
FIG. 5 shows a GSC with an adaptive blocking matrix.
FIG. 6 shows the structure for the joint optimization of adaptive beamforming and acoustic echo cancellation.
FIG. 7 shows a generalized echo and interference canceller.
FIG. 8 shows the typical behavior of the adaptive control of the GSC for a desired male voice with orchestral music played from a loudspeaker in the background.
FIG. 9 shows a comparison of the GSC using the MC-FDAF and using the MC-BRFDAF for "continuous" double talk.

Explanation of symbols

20 FIR MIMO filter
30, 60, 90 FIR filter
32, 34, 36 Adder
50, 80 Adaptive linear MIMO filter
62, 94 Coefficient update module
64, 96 Double-talk detector
66, 98 Subtractor
92, 100, 102 Converter

Claims

An adaptive beamformer comprising:
a finite impulse response (FIR) filter having a vector of adaptive coefficients and connected to receive a plurality of input signals;
means for calculating an error signal based on a reference signal and the output of the FIR filter;
updating means for updating the adaptive coefficient vector, consisting of the adaptive coefficients, using a multichannel bin-wise robust frequency-domain adaptive filter (MC-BRFDAF) algorithm, in response to the input signals and the error signal, each transformed into a plurality of discrete Fourier transform (DFT) bins, and based on a predetermined probability density distribution of the error signal; and
means for adaptively controlling the updating means, based on the input signals and the reference signal, each transformed into DFT bins, such that the coefficient vector is updated for each DFT bin when no disturbance is present,
wherein the frequency-domain adaptive filter used by the updating means is obtained by the MC-BRFDAF algorithm, which, for each channel of the multiple channels, repeats the processing expressed by the following equations (1) to (3) until the expected value matrix Λ̂_p(r) converges.

The adaptive beamformer according to claim 1, wherein the predetermined probability density distribution is a non-Gaussian probability density distribution.

The adaptive beamformer according to claim 2, wherein the non-Gaussian probability density distribution is given by the following equation:

where ε ∈ [0, 1] is the outlier of the input signal, and the constant k₀ depends on ε and is chosen so as to satisfy the following equation:
