JP2020012976A

JP2020012976A - Sound source separation evaluation device and sound source separation device

Info

Publication number: JP2020012976A
Application number: JP2018135067A
Authority: JP
Inventors: 勇気太刀岡; Yuki Tachioka
Original assignee: Denso IT Laboratory Inc
Current assignee: Denso IT Laboratory Inc
Priority date: 2018-07-18
Filing date: 2018-07-18
Publication date: 2020-01-23
Anticipated expiration: 2038-07-18
Also published as: JP7014682B2

Abstract

To provide a sound source separation evaluation device that, when a mixed sound composed of a plurality of sound sources is separated into individual sources, evaluates whether the separation has been achieved without using the source signals before mixing.
SOLUTION: A sound source separation evaluation device 1 includes: a microphone 10 that collects sounds arriving from a plurality of sound sources; a sound source separation unit 11 that separates the sound sources on the basis of a spectrogram of the sound collected by the microphone 10; a spatial correlation matrix calculation unit 12 that obtains a spatial correlation matrix for each sound source; an eigenvalue decomposition unit 13 that obtains eigenvalues and eigenvectors by eigenvalue decomposition of the spatial correlation matrix; and an arrival direction estimation unit 14 that uses the eigenvalues and eigenvectors to obtain, by the MUSIC (multiple signal classification) method, a MUSIC spectrum for each frequency of each sound source.
SELECTED DRAWING: Figure 1

Description

The present invention relates to a technique for extracting a target sound source by separating the input by direction when sounds from a plurality of sound sources (speakers, musical instruments, broadcasting equipment, noise sources, etc.) are input.

When sound sources are separated on the basis of collected sound, the components of each source may be separated inconsistently from one frequency to the next, so that the outputs no longer correspond across frequencies. This is called the permutation problem, and it is described in detail in Patent Document 1. As a way of solving the permutation problem for independent component analysis, Patent Document 1 discloses a method that estimates the direction of arrival from each row of the separation matrix and calculates a similarity based on reliability.

In recent years, methods that solve the permutation problem by modeling the sound sources, rather than only by explicitly using the direction of arrival, have come into common use. For example, the invention described in Patent Document 2 separates sound sources "using an evaluation function that gives a higher evaluation value the more closely the time series of the likelihood of each sound source are synchronized across frequency bins."

Patent Document 1: JP 2004-145172 A
Patent Document 2: JP 2014-215385 A

As can be seen from FIG. 7 of Patent Document 1, the method described there is error-prone because the gain peaks are indistinct, and its performance depends on which frequencies are trusted. A modeling-based method such as that of Patent Document 2 does not explicitly estimate the direction of arrival, so when separation accuracy is low because of modeling error or because the permutation was resolved incorrectly during optimization, the cause cannot be identified. Consequently, when separation is run with different initial values or optimization methods and the results differ, it has been difficult to judge which result is better without using information about the original source signals.

In view of the above background, an object of the present invention is to provide a sound source separation evaluation device and a sound source separation device that can evaluate whether sound source separation has been achieved without using the source signals.

A sound source separation evaluation device of the present invention includes: a sound collection unit that collects sounds arriving from a plurality of sound sources; a sound source separation unit that separates the sound sources of the sound collected by the sound collection unit; a spatial correlation matrix calculation unit that obtains a spatial correlation matrix for each of the sound sources; an eigenvalue decomposition unit that performs eigenvalue decomposition of the spatial correlation matrix to obtain eigenvalues and eigenvectors; and an arrival direction estimation unit that uses the eigenvalues and eigenvectors to obtain, by the MUSIC (multiple signal classification) method, a MUSIC spectrum for each frequency of each sound source. Here, the MUSIC method is one of the subspace methods, which estimate the position of a sound source using nulls (blind spots).

In the present invention, the MUSIC spectrum is obtained for each frequency of each sound source by the MUSIC method using the eigenvalues and eigenvectors obtained from the spatial correlation matrix, so the reliability of the direction of arrival is introduced naturally through the magnitude of the eigenvalues. In addition, because a clear peak appears in the MUSIC spectrum, the direction of arrival of the sound is known explicitly, which makes it possible to evaluate whether the sound sources have been separated.

Furthermore, by comparing, for each sound source, the MUSIC spectrum obtained by summing the MUSIC spectra of all frequencies with the MUSIC spectrum in each frequency bin, it is possible to determine in which frequency bins permutation has occurred. Moreover, because the method of the present invention does not use the separation matrix itself, it can be used not only when the number of sound sources equals the number of microphones used for recording but also when the number of sound sources is larger or smaller than the number of microphones.

The present invention can be applied both when sound source separation is performed by a mixing-system approach, which decomposes the observed received signal into bases and activations, and when it is performed by a separation-system approach, which separates the sound sources so as to maximize a quantity that measures the degree of separation, such as the independence of the sources.

The sound source separation evaluation device of the present invention may further include a separation degree calculation unit that evaluates the closeness of the MUSIC spectra of the sound sources obtained by the arrival direction estimation unit. The closeness between MUSIC spectra may be evaluated, for example, by the difference between the peak positions of the MUSIC spectra or by the overlap between the MUSIC spectra. With this configuration, whether the sound sources have been separated can be evaluated quantitatively.

A sound source separation device of the present invention includes: a sound collection unit that collects sounds arriving from a plurality of sound sources; a sound source separation unit that separates the sound sources of the sound collected by the sound collection unit; a spatial correlation matrix calculation unit that obtains a spatial correlation matrix for each of the sound sources; an eigenvalue decomposition unit that performs eigenvalue decomposition of the spatial correlation matrix to obtain eigenvalues and eigenvectors; an arrival direction estimation unit that uses the eigenvalues and eigenvectors to obtain, by the MUSIC method, a MUSIC spectrum for each frequency of each sound source; and a permutation calculation unit that compares the MUSIC spectrum of each sound source with the MUSIC spectrum for each frequency to determine whether permutation has occurred. The sound source separation unit uses the determination result of the permutation calculation unit for separating the sound sources. The present invention can be applied both when sound source separation is performed by a mixing-system approach and when it is performed by a separation-system approach.

Comparing the MUSIC spectrum of each sound source with the MUSIC spectrum for each frequency makes it possible to determine, for each frequency bin, whether permutation has occurred. If permutation has occurred, the determination result can be used to correct it, which improves the performance of sound source separation. If the determination result of the permutation calculation unit indicates that separation has not succeeded, the sound source separation processing by the separation unit can also be stopped.

A sound source separation evaluation method of the present invention is a method of separating the sound sources of incoming sound and evaluating the separation performance, and includes the steps of: separating the sound sources of collected sound; obtaining a spatial correlation matrix for each of the sound sources; performing eigenvalue decomposition of the spatial correlation matrix to obtain eigenvalues and eigenvectors; and obtaining, by the MUSIC method using the eigenvalues and eigenvectors, a MUSIC spectrum for each frequency of each sound source.

A sound source separation method of the present invention is a method of separating the sound sources of incoming sound, and includes the steps of: separating the sound sources of collected sound; obtaining a spatial correlation matrix for each of the sound sources; performing eigenvalue decomposition of the spatial correlation matrix to obtain eigenvalues and eigenvectors; obtaining, by the MUSIC method using the eigenvalues and eigenvectors, a MUSIC spectrum for each frequency of each sound source; and comparing the MUSIC spectrum of each sound source with the MUSIC spectrum for each frequency to determine whether permutation has occurred. In the step of separating the sound sources, the result of determining whether permutation has occurred is used for the separation.

A program of the present invention is a program that executes each step of the sound source separation evaluation method or the sound source separation method described above.

According to the present invention, it is possible to evaluate whether sound source separation has been achieved without using the source signals.

FIG. 1 is a diagram illustrating the sound source separation evaluation device according to the first embodiment.
FIG. 2 is a diagram illustrating an example in which sound sources were separated by multichannel non-negative matrix factorization from a plurality of initial values and the direction of arrival was estimated for the separated sound sources.
FIG. 3 is a diagram illustrating the sound source separation evaluation device according to the second embodiment.
FIG. 4 is a diagram illustrating the sound source separation evaluation device according to the third embodiment.
FIG. 5 is a diagram illustrating the basic idea of permutation resolution.
FIG. 6 is a diagram illustrating the sound source separation evaluation device according to the fourth embodiment.
FIG. 7 is a diagram illustrating the sound source separation evaluation device according to the fourth embodiment.

The sound source separation evaluation device and the sound source separation device of the present invention are described below with reference to embodiments. The description is given in terms of time-frequency bins, and indices for the time-frequency bin are omitted unless otherwise noted. The number of microphones is denoted by M and the number of sound sources by L.

(First Embodiment)
FIG. 1 is a diagram illustrating the configuration of a sound source separation evaluation device 1 according to the first embodiment. The sound source separation evaluation device 1 of the first embodiment performs sound source separation by a mixing-system approach and evaluates the separation performance. FIG. 1 shows the case where the number of sound sources is L = 3.

The sound source separation evaluation device 1 includes a plurality of microphones 10 serving as the sound collection unit, a sound source separation unit 11, spatial correlation matrix calculation units 12, eigenvalue decomposition units 13, and arrival direction estimation units 14. The sound source separation unit 11 decomposes the spectrogram of the sound collected by the microphones 10 into a plurality of bases and their corresponding activations, and separates the sound sources by clustering the bases and activations. As one example, the sound source separation unit 11 uses multichannel non-negative matrix factorization to decompose the spectrogram into spatial correlation matrices, a basis matrix, and an activation matrix. Suitable initial values are given to the spatial correlation matrices, the basis matrix, and the activation matrix, and these are updated until the error between their product and the spectrogram of the collected sound converges to a predetermined threshold or less. If appropriate initial values are given, the sound sources can be separated accurately; otherwise, the separation accuracy is low. The evaluation device 1 of the present embodiment evaluates the performance of the separation performed by the sound source separation unit 11.

The sound source separation evaluation device 1 has as many spatial correlation matrix calculation units 12 as there are sound sources (L = 3). Each spatial correlation matrix calculation unit 12 obtains the spatial correlation matrix H_fl for sound source l. H_fl is obtained as follows. For each of the separated sound sources, the spatial correlation matrix calculation unit 12 calculates, for each frequency bin f, the spatial correlation matrices H_f = [H_f1, ..., H_fl, ..., H_fL] from the M-dimensional observed spectrum x = [x_1, ..., x_M]^T in a given time-frequency bin. Letting y (= [y_1, ..., y_L]^T) be the L-dimensional spectrum of the sound sources, the spatial correlation cov(x) of x is expressed by the following equation (1) using H_fl and the power spectrum |y_l|^2 of each sound source.

In equation (1), H_f and |y_l|^2 are estimated by optimizing so that the error e between the left-hand side and the right-hand side becomes small. Here, cov is a function that takes correlations between the elements of a vector. For example, when its argument is the two-dimensional vector x = [x_1, x_2]^T (T denotes transposition), it is expressed by the following equation (2).

Here, * is the complex-conjugate operator. For three or more dimensions, the same operation is realized by taking the correlation of each pair of elements.
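The images of equations (1) and (2) are not preserved in this text. A form consistent with the surrounding definitions, offered as an assumption rather than as the published equations, treats cov(x) as approximately the sum of the per-source matrices weighted by the source powers (with e denoting the remaining discrepancy), and cov of a two-element vector as its outer product with its own conjugate:

```latex
\operatorname{cov}(x) \approx \sum_{l=1}^{L} H_{fl}\,\lvert y_l\rvert^{2} \tag{1}

\operatorname{cov}\!\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right)
  = \begin{bmatrix}
      x_1 x_1^{*} & x_1 x_2^{*} \\
      x_2 x_1^{*} & x_2 x_2^{*}
    \end{bmatrix} \tag{2}
```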

The eigenvalue decomposition unit 13 performs eigenvalue decomposition of the spatial correlation matrix H_fl for sound source l obtained by the above procedure. The M-by-M positive definite spatial correlation matrix H_fl can be decomposed by eigenvalue decomposition into the form of the following equation (3).

Here, D_fl is an M-by-M diagonal matrix whose diagonal elements are the real eigenvalues, assumed to be sorted in descending order. V_fl is an M-by-M complex matrix whose columns are the eigenvectors corresponding to the eigenvalues.
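The decomposition described here can be sketched with NumPy. The sketch assumes that equation (3), whose image is not preserved, has the usual Hermitian form H = V D V^H; the function name `sorted_eigh` and the random test matrix are illustrative and do not come from the patent.

```python
import numpy as np

def sorted_eigh(H):
    """Eigendecompose a Hermitian matrix; eigenvalues in descending order."""
    w, V = np.linalg.eigh(H)           # eigh returns real eigenvalues, ascending
    order = np.argsort(w)[::-1]        # reorder to descending, as the text assumes
    return w[order], V[:, order]

# Illustrative Hermitian positive definite matrix (M = 3), not patent data.
rng = np.random.default_rng(0)
B = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
H = B @ B.conj().T + 0.1 * np.eye(3)

D, V = sorted_eigh(H)
# Assumed form of equation (3): H = V diag(D) V^H
H_rebuilt = V @ np.diag(D) @ V.conj().T
```

`np.linalg.eigh` is used rather than `np.linalg.eig` because the spatial correlation matrix is Hermitian, which guarantees real eigenvalues and orthonormal eigenvectors.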

Assuming a linear array with microphone spacing d and plane-wave propagation, the steering vector a(f, θ) = [a_1(f, θ), ..., a_m(f, θ), ..., a_M(f, θ)]^T of a plane wave from direction θ is expressed by the following equation (4).

Here, φ(f) is a function that converts frequency bin f into a frequency [Hz], j is the imaginary unit, and c is the speed of sound.
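The image of equation (4) is not preserved in this text. The sketch below assumes the common uniform-linear-array convention a_m = exp(-j 2π φ(f) (m-1) d sin θ / c); the sign and the use of sin θ are assumptions, as are the function name and the numeric values. The frequency is passed directly in Hz, so the bin-to-frequency mapping φ(f) is left to the caller.

```python
import cmath
import math

def steering_vector(freq_hz, theta, num_mics, spacing_m, sound_speed=343.0):
    """Plane-wave steering vector for a uniform linear array (assumed convention)."""
    # Delay between adjacent microphones for a wave arriving from angle theta [rad].
    delay = spacing_m * math.sin(theta) / sound_speed
    # Element m (0-based) carries the phase of m inter-microphone delays.
    return [cmath.exp(-2j * math.pi * freq_hz * m * delay) for m in range(num_mics)]

# Illustrative values only: 1 kHz, theta = 0.4 rad, 4 microphones, 5 cm spacing.
a = steering_vector(1000.0, 0.4, num_mics=4, spacing_m=0.05)
# Doubling the spacing raises every element to the power d'/d = 2.
a_double = steering_vector(1000.0, 0.4, num_mics=4, spacing_m=0.10)
```

Under this convention the spacing enters only through the exponent, which is why a different true spacing d' turns each element into a power of the original one.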

Note that the spacing of the microphones 10 does not actually need to be known. If the actual spacing is d', a_m simply becomes a_m^(d'/d), so the overall shape of the MUSIC spectrum does not change. The blind source separation framework is therefore preserved in the present method as well.

On the basis of the eigenvalues D and eigenvectors V of the spatial correlation matrix H obtained by the above procedure, the arrival direction estimation unit 14 calculates the MUSIC spectrum S_fl(θ) expressed by the following equation (5) for each of the L sound sources and each of the F frequency bins.

The MUSIC spectrum takes the form of the reciprocal of the inner product between the steering vector a(f, θ) and the eigenvectors V_fl(:, 2:M), that is, the eigenvectors other than the one corresponding to the largest eigenvalue for the sound source. Because the signal subspace and the noise subspace are orthogonal, the denominator becomes small in the direction of arrival of the sound source, and the MUSIC spectrum S_fl(θ) peaks there. Compared with the method of Patent Document 1, the method of the present embodiment forms clearer peaks, and because it uses the eigenvalues, the reliability need not be obtained separately. The direction of arrival of the sound from each sound source can thus be estimated from the spatial correlation matrix.
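A minimal NumPy sketch of this computation follows. Since the image of equation (5) is not preserved, it assumes the standard MUSIC form S(θ) = a^H a / Σ_{m=2..M} |a^H v_m|^2 and a uniform-linear-array steering vector with a sin θ phase term; all function names, the toy matrix, and the numeric values are illustrative assumptions.

```python
import numpy as np

def steering(freq_hz, theta, num_mics, spacing_m, c=343.0):
    """Assumed uniform-linear-array steering vector (sign convention is an assumption)."""
    m = np.arange(num_mics)
    return np.exp(-2j * np.pi * freq_hz * m * spacing_m * np.sin(theta) / c)

def music_spectrum(H, freq_hz, thetas, spacing_m, c=343.0):
    """MUSIC spectrum of one source at one frequency from its spatial correlation matrix."""
    M = H.shape[0]
    _, V = np.linalg.eigh(H)           # ascending eigenvalues
    noise = V[:, :M - 1]               # every eigenvector except the largest eigenvalue's
    spec = np.empty(len(thetas))
    for i, th in enumerate(thetas):
        a = steering(freq_hz, th, M, spacing_m, c)
        proj = noise.conj().T @ a      # projection onto the noise subspace
        spec[i] = np.real(a.conj() @ a) / (np.real(proj.conj() @ proj) + 1e-12)
    return spec

# Toy check: a rank-one matrix built from a source at theta = 0.4 rad.
thetas = np.linspace(-np.pi / 2, np.pi / 2, 181)
a0 = steering(1000.0, 0.4, 4, 0.05)
H = np.outer(a0, a0.conj()) + 1e-3 * np.eye(4)
spec = music_spectrum(H, 1000.0, thetas, 0.05)
peak_theta = thetas[np.argmax(spec)]
```

The small constant in the denominator only guards against division by zero when a grid direction coincides exactly with the source direction.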

FIG. 2 shows the MUSIC spectrum Σ_f S_fl(θ), obtained by summing the MUSIC spectrum S_fl(θ) of the above equation over the frequency bins.

FIG. 2 shows an example in which sound sources were separated by multichannel non-negative matrix factorization starting from a plurality of initial values, and the direction-of-arrival estimation described above was applied to each separated sound source l. The case with the best SDR (signal-to-distortion ratio) [dB], an index of separation performance, is shown on the left, and the case with the worst SDR on the right. In the left graph, sound source S1 has a peak near θ = 0.4, sound source S2 near θ = -1, and sound source S3 near θ = -1.25. In the right graph, by contrast, sound sources S1 and S2 both have a peak near θ = 0.25, and sound source S3 has a peak near θ = -1. In the right-hand example the peaks of S1 and S2 coincide, which shows that the sound sources were not separated well. Obtaining the MUSIC spectrum in this way makes it easy to distinguish a case in which separation succeeded (left) from one in which it did not (right).

The sound source separation evaluation device 1 of the present embodiment operates with each component functioning in the order indicated by the arrows in the configuration diagram of FIG. 1. That is, the sound source separation unit 11 separates the sound sources on the basis of the spectrogram of the collected sound, and the spatial correlation matrix calculation unit 12 then obtains the spatial correlation matrix for each sound source. Next, the eigenvalue decomposition unit 13 performs eigenvalue decomposition of the spatial correlation matrix to obtain eigenvalues and eigenvectors, and the arrival direction estimation unit 14 uses the eigenvalues and eigenvectors to obtain, by the MUSIC method, the MUSIC spectrum for each frequency of each sound source.

The configuration of the sound source separation evaluation device 1 of the present embodiment has been described above. An example of hardware for the evaluation device is a computer connected to the plurality of microphones 10 serving as the sound collection unit. The computer includes a CPU, RAM, ROM, a hard disk, a display, a keyboard, a mouse, a communication interface, and the like. The evaluation device is realized by storing a program having modules that implement the functions described above in the RAM or ROM and executing the program on the CPU. Such a program is also included in the scope of the present invention.

(Second Embodiment)
FIG. 3 is a diagram illustrating the configuration of a sound source separation evaluation device 2 according to the second embodiment. The sound source separation evaluation device 2 of the second embodiment separates sound sources by a separation-system approach such as independent component analysis or independent vector analysis. FIG. 3 shows the case where the number of sound sources is L = 3.

The sound source separation evaluation device 2 includes a plurality of microphones 10 serving as the sound collection unit, an inverse matrix calculation unit 15, a sound source separation unit 11, spatial correlation matrix calculation units 12, eigenvalue decomposition units 13, and arrival direction estimation units 14.

The sound source separation unit 11 estimates, for each frequency bin of the sound spectrogram, a separation matrix that separates the sound collected by the microphones 10 into independent signals. Specifically, as shown in the following equation (6), it estimates the separation matrix W_f that relates the observed spectrum x to the source spectrum y.
y = W_f x   (6)

The inverse matrix calculation unit 15 obtains the inverse of the separation matrix W_f. If W_f is not a square matrix, the Moore-Penrose pseudoinverse is obtained instead.

The spatial correlation matrix calculation unit 12 multiplies both sides of equation (6) from the left by the inverse matrix to obtain the following equation (7). In equation (7), the index f on a is omitted for readability.

From this, the spatial correlation matrix H_fl for sound source l is expressed by the following equation (8).

Since |y_l|^2 is a real number and does not affect the phase differences, the spatial correlation matrix is in practice obtained by the following equation (9).
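The images of equations (7) to (9) are not preserved in this text. One reading that agrees with the surrounding sentences, offered as an assumption rather than as the published equations, writes the columns of the (pseudo)inverse of W_f as a_1, ..., a_L, takes the contribution of source l to the observation as a_l y_l, and then drops the real scale factor:

```latex
x = W_f^{-1}\, y = [a_1, \dots, a_L]\, y = \sum_{l=1}^{L} a_l\, y_l \tag{7}

H_{fl} = \operatorname{cov}(a_l\, y_l) = \lvert y_l\rvert^{2}\, a_l a_l^{H} \tag{8}

H_{fl} \simeq a_l a_l^{H} \tag{9}
```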

The processing after the spatial correlation matrix H_fl has been calculated is the same as in the first embodiment: H_fl is eigenvalue-decomposed, and the MUSIC spectrum S_fl(θ) representing the direction of arrival is estimated by the MUSIC method using the eigenvalues and eigenvectors.
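The second-embodiment path from a separation matrix to per-source spatial correlation matrices can be sketched as follows. The sketch assumes that each H_fl is the rank-one outer product of column l of the (pseudo)inverse of W_f with itself, which is one reading of the text since the equation images are not preserved; the function name and the toy mixing matrix are hypothetical.

```python
import numpy as np

def spatial_correlation_from_separation(W):
    """Return [H_1, ..., H_L], assuming H_l = a_l a_l^H with a_l = column l of pinv(W)."""
    # pinv equals the ordinary inverse when W is square and nonsingular,
    # and gives the Moore-Penrose pseudoinverse otherwise.
    A = np.linalg.pinv(W)
    return [np.outer(A[:, l], A[:, l].conj()) for l in range(A.shape[1])]

# Hypothetical 3-microphone, 2-source mixing matrix, used only for illustration.
A_true = np.array([[1.0, 1.0], [1j, -1j], [-1.0, -1.0]])
W = np.linalg.pinv(A_true)             # an L x M separation matrix for this toy mixture
H_list = spatial_correlation_from_separation(W)
```

Each matrix in `H_list` can then be passed to the same eigenvalue decomposition and MUSIC steps as in the first embodiment.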

(Third Embodiment)
FIG. 4 is a diagram illustrating the configuration of a sound source separation evaluation device 3 according to the third embodiment. As shown in FIG. 2, the separation result is related to the overlap of the MUSIC spectra. The sound source separation evaluation device 3 of the third embodiment includes a separation degree calculation unit 16. The separation degree calculation unit 16 quantitatively evaluates the degree of separation of the sound sources using the MUSIC spectra S_fl(θ) estimated in the same manner as in the first embodiment.

The separation degree calculation unit 16 evaluates how far apart the direction-of-arrival peaks determined to belong to different sound sources are from one another. It calculates an evaluation value by summing the absolute values of the peak-position differences over all C(L, 2) pairs of sound sources. The larger this value, the farther apart the peaks can be judged to be.

This is explained using FIG. 2 as an example. In the left case of FIG. 2, the peak positions of sound sources S1, S2, and S3 are 0.4, -1, and -1.25, respectively. The sum of the absolute values of the differences between the peak positions is
|0.4 - (-1)| + |-1 - (-1.25)| + |0.4 - (-1.25)| = 3.3
In the right case, by contrast, the peak positions of sound sources S1, S2, and S3 are 0.25, 0.25, and -1, respectively. The sum of the absolute values of the differences between the peak positions is
|0.25 - 0.25| + |0.25 - (-1)| + |0.25 - (-1)| = 2.5
The differences between the peak positions are therefore larger in the left case, and the degree of separation of the sound sources can be judged to be higher.
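The pairwise sum used in this worked example can be reproduced in a few lines. The function name `peak_separation_score` is illustrative; the peak positions are the values read from FIG. 2 in the text.

```python
from itertools import combinations

def peak_separation_score(peaks):
    """Sum of |p - q| over all unordered pairs of peak positions."""
    return sum(abs(p - q) for p, q in combinations(peaks, 2))

best = peak_separation_score([0.4, -1.0, -1.25])    # left case of FIG. 2
worst = peak_separation_score([0.25, 0.25, -1.0])   # right case of FIG. 2
```

`itertools.combinations(peaks, 2)` enumerates exactly the C(L, 2) pairs, so the same function applies to any number of sound sources.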

The separation degree calculation unit 16 may also calculate the evaluation value from the MUSIC spectra S_fl(θ) of the sound sources by another method. For example, a value obtained by evaluating the overlap ratio of the MUSIC spectra two at a time and summing over all C(L, 2) pairs, or the reciprocal of the value obtained by dividing the overlapping area of all the MUSIC spectra by the total area, can be used as the degree of separation.

Although this embodiment has shown an example in which the degree-of-separation calculation unit 16 is added to the configuration of the first embodiment, it is of course also possible to add the degree-of-separation calculation unit 16 to the configuration of the second embodiment.

(Fourth embodiment)
A sound source separation device according to a fourth embodiment will now be described. In the fourth embodiment, the information in the MUSIC spectra is used to resolve the permutation. The MUSIC spectrum obtained by summing over all frequency bins is compared with the MUSIC spectrum S_fl(θ) at each frequency bin to determine whether a permutation has occurred at that frequency bin.

FIG. 5 illustrates the basic idea behind permutation resolution. It shows the MUSIC spectra of the sound sources S1, S2, and S3, reproducing the Best SDR case of FIG. 2; that is, for each sound source, the MUSIC spectra of all frequencies are summed. Overlaid on the same graph, as a dash-dot line, is the MUSIC spectrum of a frequency bin f that was judged to belong to sound source S3. The peak of this spectrum, however, is far closer to the peak of sound source S1 than to that of sound source S3. In such a case, a permutation is considered to have occurred at the frequency bin f judged to be sound source S3, and the sound source separation unit 11 performs sound source separation on the basis of this comparison result.
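The comparison described for FIG. 5 can be sketched as a nearest-peak check. The code below is an illustration under stated assumptions, not the procedure of the embodiment: the function names are hypothetical, the source peak positions are borrowed from the left case of FIG. 2 purely as sample numbers, and the bin peak value 0.35 is invented for the example.

```python
def nearest_source(bin_peak, source_peaks):
    """Index of the source whose summed-spectrum peak is closest to bin_peak."""
    return min(range(len(source_peaks)),
               key=lambda l: abs(bin_peak - source_peaks[l]))

def is_permuted(bin_peak, assigned_source, source_peaks):
    """True if the bin's peak lies nearer to another source than to the assigned one."""
    return nearest_source(bin_peak, source_peaks) != assigned_source

# Peaks of the summed spectra for S1, S2, S3 (sample values).
source_peaks = [0.4, -1.0, -1.25]

# A bin labelled S3 (index 2) whose own peak sits at 0.35, close to S1.
flag = is_permuted(0.35, 2, source_peaks)
print(flag, nearest_source(0.35, source_peaks))
```

Here the bin labelled S3 is flagged because its peak is nearest to S1 (index 0).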

FIG. 6 shows the configuration of the sound source separation device 4 using a mixing-system method. In addition to the configuration of the evaluation device 1 of the first embodiment, the sound source separation device 4 includes a permutation calculation unit 17. The permutation calculation unit 17 compares the MUSIC spectrum of each sound source with the MUSIC spectrum of each frequency bin to determine whether a permutation has occurred.

The sound source separation unit 11 performs sound source separation by the mixing method, also taking into account the permutation determination result from the permutation calculation unit 17. For example, the sound source separation unit 11 reorders the sound sources l at each frequency bin f so as to minimize the absolute value of the sum, taken over the number of sound sources, of the differences between the peak position of a sound source's MUSIC spectrum and the peak position of the frequency bin's MUSIC spectrum. Alternatively, some distance between spectra may be introduced (for example, the Euclidean distance or the Itakura-Saito pseudo-distance), and the sound sources l reordered so that the total distance becomes small. Introducing such a procedure allows the evaluation result of the separation performance to be used for permutation resolution. As a result, the spatial correlation matrix calculation unit 12 obtains spatial correlation matrices H in which the permutation has been resolved.
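The reordering step can be sketched as a brute-force search over the orderings of the separated outputs in one frequency bin. One assumption deserves emphasis: the cost used here is the sum of the absolute peak differences, because a signed sum of differences would take the same value for every ordering and could not discriminate between them. The function name and the peak values are hypothetical.

```python
from itertools import permutations

def resolve_bin(bin_peaks, source_peaks):
    """Return `order` such that output order[l] of this bin is assigned to source l.

    Cost (assumed): sum over sources of |source peak - assigned bin peak|.
    """
    n = len(source_peaks)
    return min(
        permutations(range(n)),
        key=lambda order: sum(abs(source_peaks[l] - bin_peaks[order[l]])
                              for l in range(n)),
    )

source_peaks = [0.4, -1.0, -1.25]   # peaks of the summed spectra (sample values)
bin_peaks = [-1.2, -0.95, 0.45]     # peaks of one bin's outputs, in scrambled order
order = resolve_bin(bin_peaks, source_peaks)
print(order)
```

The search cost grows factorially with the number of sources, which is acceptable for a handful of sources. A spectral distance such as the Euclidean distance could be substituted for the peak difference in the cost without changing the structure of the search.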

FIG. 7 shows the configuration of the sound source separation device 4 using a separation-system method. Here the sound source separation unit 11 obtains separation matrices W in which the permutation has been resolved. By performing the separation again using these, or by inserting this permutation resolution partway through the optimization of the sound source separation, sound source separation can be performed with the permutation resolved.
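One natural reading of "separation matrices W in which the permutation has been resolved" is that the rows of W are reordered independently in each frequency bin. The sketch below assumes that W is stored as an array of shape (F, L, M) with one row per separated output, and that a per-bin ordering has already been decided; the function name, the array layout, and the sample data are all assumptions.

```python
import numpy as np

def reorder_separation_matrices(W, orders):
    """Reorder the rows of each per-bin separation matrix.

    W:      array of shape (F, L, M), one L x M separation matrix per bin.
    orders: orders[f][l] is the row of W[f] that belongs to source l.
    """
    W = np.asarray(W)
    return np.stack([W[f][list(orders[f])] for f in range(W.shape[0])])

# Two bins, three outputs, three microphones (dummy values).
W = np.arange(2 * 3 * 3).reshape(2, 3, 3)
orders = [(0, 1, 2), (2, 1, 0)]   # bin 0 is already consistent, bin 1 is swapped
W_fixed = reorder_separation_matrices(W, orders)
print(W_fixed[1])
```

After the reordering, row l of every bin refers to the same sound source, so the matrices can be reused for a second separation pass.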

This embodiment has described an example in which the determination result from the permutation calculation unit 17 is fed back to the sound source separation unit 11, which then uses it for sound source separation. However, if the determination result from the permutation calculation unit 17 does not satisfy a predetermined criterion, the sound source separation by the sound source separation unit 11 may instead be aborted.

The operation of the sound source separation device 4 of this embodiment is realized by each component functioning in accordance with the arrows in the configuration diagram of FIG. 6 or FIG. 7. An example of the hardware of the sound source separation device of this embodiment is a computer connected to a plurality of microphones 10 serving as the sound pickup unit. The computer includes a CPU, RAM, ROM, a hard disk, a display, a keyboard, a mouse, a communication interface, and the like. The sound source separation device described above is realized by storing a program having modules that implement the functions described above in the RAM or ROM and having the CPU execute the program. Such a program also falls within the scope of the present invention.
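As a rough illustration of what such program modules might look like, the following sketch goes from one source's spatial correlation matrix through eigenvalue decomposition to a MUSIC spectrum. It is not the implementation of the embodiment. It assumes a far-field source, microphones on a straight line, a single signal eigenvector per source, and the textbook MUSIC pseudo-spectrum; the function names, array geometry, and test signal are hypothetical.

```python
import numpy as np

def steering_vector(theta, freq_hz, mic_x, c=343.0):
    """Far-field steering vector for microphones at positions mic_x (metres) on a line."""
    delay = mic_x * np.sin(theta) / c
    return np.exp(-2j * np.pi * freq_hz * delay)

def music_spectrum(H, thetas, freq_hz, mic_x, n_signal=1):
    """MUSIC pseudo-spectrum for one M x M spatial correlation matrix H."""
    _, vecs = np.linalg.eigh(H)                  # eigenvalues in ascending order
    noise = vecs[:, : H.shape[0] - n_signal]     # noise-subspace eigenvectors
    spec = np.empty(len(thetas))
    for i, th in enumerate(thetas):
        a = steering_vector(th, freq_hz, mic_x)
        proj = noise.conj().T @ a                # projection onto the noise subspace
        # Small constant avoids division by zero at an exact match.
        spec[i] = np.real(a.conj() @ a) / (np.real(proj.conj() @ proj) + 1e-12)
    return spec

# Synthetic check: 4 microphones spaced 5 cm apart, one source at 20 degrees, 1 kHz.
mic_x = np.arange(4) * 0.05
thetas = np.linspace(-np.pi / 2, np.pi / 2, 181)   # 1-degree grid
true_theta = thetas[110]                           # 20 degrees
a0 = steering_vector(true_theta, 1000.0, mic_x)
H = np.outer(a0, a0.conj()) + 1e-6 * np.eye(4)     # rank-1 matrix plus a small floor
spec = music_spectrum(H, thetas, 1000.0, mic_x)
peak = thetas[np.argmax(spec)]
print(np.rad2deg(peak))
```

The peak of the spectrum falls at the direction used to build the synthetic matrix, which is the behaviour the peak-based evaluation relies on.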

The present invention is useful as a technique for separating sounds input from a plurality of sound sources by direction and extracting a target sound source.

1, 2, 3 Sound source separation evaluation device
4 Sound source separation device
10 Microphone
11 Sound source separation unit
12 Spatial correlation matrix calculation unit
13 Eigenvalue decomposition unit
14 Direction-of-arrival estimation unit
15 Inverse matrix calculation unit
16 Degree-of-separation calculation unit
17 Permutation calculation unit

Claims

A sound source separation evaluation device comprising:
a sound pickup unit that picks up sounds arriving from a plurality of sound sources;
a sound source separation unit that separates the sound sources of the sound picked up by the sound pickup unit;
a spatial correlation matrix calculation unit that obtains a spatial correlation matrix for each of the sound sources;
an eigenvalue decomposition unit that performs eigenvalue decomposition of the spatial correlation matrix to obtain eigenvalues and eigenvectors; and
a direction-of-arrival estimation unit that obtains, by the MUSIC method using the eigenvalues and eigenvectors, a MUSIC spectrum for each frequency of each sound source.

The sound source separation evaluation device according to claim 1, wherein
the sound source separation unit decomposes a spectrogram of the sound picked up by the sound pickup unit into a plurality of bases and corresponding activations, and separates the sound sources by clustering the bases and activations, and
the spatial correlation matrix calculation unit obtains a spatial correlation matrix for each sound source separated by the sound source separation unit.

The sound source separation evaluation device according to claim 1, wherein
the sound source separation unit separates the sound sources by estimating, for each frequency bin of a spectrogram of the sound, a separation matrix that separates the sound picked up by the sound pickup unit into independent signals, and
the spatial correlation matrix calculation unit obtains the spatial correlation matrix of each sound source by obtaining the inverse of the separation matrix.

The sound source separation evaluation device according to any one of claims 1 to 3, further comprising a degree-of-separation calculation unit that evaluates the closeness of the MUSIC spectra of the sound sources obtained by the direction-of-arrival estimation unit.

A sound source separation device comprising:
a sound pickup unit that picks up sounds arriving from a plurality of sound sources;
a sound source separation unit that separates the sound sources of the sound picked up by the sound pickup unit;
a spatial correlation matrix calculation unit that obtains a spatial correlation matrix for each of the sound sources;
an eigenvalue decomposition unit that performs eigenvalue decomposition of the spatial correlation matrix to obtain eigenvalues and eigenvectors;
a direction-of-arrival estimation unit that obtains, by the MUSIC method using the eigenvalues and eigenvectors, a MUSIC spectrum for each frequency of each sound source; and
a permutation calculation unit that compares the MUSIC spectrum of each sound source with the MUSIC spectrum of each frequency to determine whether a permutation has occurred,
wherein the sound source separation unit uses the determination result of the permutation calculation unit for separating the sound sources.

The sound source separation device according to claim 5, wherein
the sound source separation unit decomposes a spectrogram of the sound picked up by the sound pickup unit into a plurality of bases and corresponding activations, and separates the sound sources by clustering the bases and activations, and
the spatial correlation matrix calculation unit obtains a spatial correlation matrix for each sound source separated by the sound source separation unit.

The sound source separation device according to claim 5, wherein
the sound source separation unit separates the sound sources by estimating, for each frequency bin of a spectrogram of the sound, a separation matrix that separates the sound picked up by the sound pickup unit into independent signals, and
the spatial correlation matrix calculation unit obtains the spatial correlation matrix of each sound source by obtaining the inverse of the separation matrix.

A sound source separation evaluation method for separating the sound sources of incoming sound and evaluating the separation performance, the method comprising:
a step of separating the sound sources of the picked-up sound;
a step of obtaining a spatial correlation matrix for each of the sound sources;
a step of performing eigenvalue decomposition of the spatial correlation matrix to obtain eigenvalues and eigenvectors; and
a step of obtaining, by the MUSIC method using the eigenvalues and eigenvectors, a MUSIC spectrum for each frequency of each sound source.

A sound source separation method for separating the sound sources of incoming sound, the method comprising:
a step of separating the sound sources of the picked-up sound;
a step of obtaining a spatial correlation matrix for each of the sound sources;
a step of performing eigenvalue decomposition of the spatial correlation matrix to obtain eigenvalues and eigenvectors;
a step of obtaining, by the MUSIC method using the eigenvalues and eigenvectors, a MUSIC spectrum for each frequency of each sound source; and
a step of comparing the MUSIC spectrum of each sound source with the MUSIC spectrum of each frequency to determine whether a permutation has occurred,
wherein, in the step of separating the sound sources, the result of determining whether the permutation has occurred is used for separating the sound sources.

A program for separating the sound sources of incoming sound and evaluating the separation performance, the program causing a computer to execute:
a step of separating the sound sources of the picked-up sound;
a step of obtaining a spatial correlation matrix for each of the sound sources;
a step of performing eigenvalue decomposition of the spatial correlation matrix to obtain eigenvalues and eigenvectors; and
a step of obtaining, by the MUSIC method using the eigenvalues and eigenvectors, a MUSIC spectrum for each frequency of each sound source.

A program for separating the sound sources of incoming sound, the program causing a computer to execute:
a step of separating the sound sources of the picked-up sound;
a step of obtaining a spatial correlation matrix for each of the sound sources;
a step of performing eigenvalue decomposition of the spatial correlation matrix to obtain eigenvalues and eigenvectors;
a step of obtaining, by the MUSIC method using the eigenvalues and eigenvectors, a MUSIC spectrum for each frequency of each sound source; and
a step of comparing the MUSIC spectrum of each sound source with the MUSIC spectrum of each frequency to determine whether a permutation has occurred,
wherein, in the step of separating the sound sources, the result of determining whether the permutation has occurred is used for separating the sound sources.