JP2000097758A - Sound-source signal estimating device

Info

Publication number: JP2000097758A
Application number: JP10267877A
Authority: JP
Inventors: Yasushige Nakayama; 靖茂中山; Tetsuo Umeda; 哲夫梅田; Takashi Nishi; 隆司西; Satoru Koizumi; 悟小泉
Original assignee: Nippon Hoso Kyokai NHK; Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 1998-09-22
Filing date: 1998-09-22
Publication date: 2000-04-07
Anticipated expiration: 2018-09-22
Also published as: JP3927701B2

Abstract

PROBLEM TO BE SOLVED: To provide a sound-source signal estimating device in which successive correction remains stable and its convergence is not slowed down, by normalizing the correction vector with the square of the norm of the estimated signal vector serving as the mixed signal before outputting it. SOLUTION: The device receives an estimated signal serving as the mixed signal and an estimated signal estimated as the sound-source signal, and generates and outputs, by a linear operation, a vector for successively correcting a separation coefficient vector. The correction vector generating means normalizes by the square of the norm of the estimated signal vector serving as the mixed signal. A vector having the magnitude of the estimated signal estimated as the sound-source signal and the direction of the estimated signal vector serving as the mixed signal is generated as the correction vector, multiplied by a specified factor, and output. Thus, when individual signals are estimated and separated from a signal in which a voice signal is mixed with other signals and noise, the effect of signal power fluctuation is reduced.

Description

【発明の詳細な説明】DETAILED DESCRIPTION OF THE INVENTION

【０００１】[0001]

[Technical Field of the Invention] The present invention relates to a technique for estimating each of a plurality of sound source signals, source by source, when those signals are mixed with one another and input through a plurality of channels.

【０００２】[0002]

[Prior Art] When a plurality of sound source signals are mixed with one another and input through a plurality of channels, it is generally impossible to estimate the signals source by source. The reason is that, when the sound source signals are unknown, the mixing process cannot be determined uniquely from the channel signals alone, which are the signals received after mixing. Attempts have therefore been made to model the mixing process on the assumption that the sound source signals are statistically independent of one another, and to estimate and separate the sound source signals on that basis. A conventional method of this kind is the ICA (Independent Component Analysis) method described in C. Jutten et al., "Blind separation of sources, Part 1: An adaptive algorithm based on neuromimetic architecture," Signal Process. 24, 1-10 (1991).

【０００３】[0003]

The ICA method estimates and separates sound source signals by modeling the mixing process of a plurality of sound source signals and exploiting the statistical independence of the original sound source signals. FIG. 4 shows the principle of the mixing and separation processes for the case of two sound source signals, two channel signals, and two estimated signals. Each of the two channel signals serving as input signals (the first channel signal and the second channel signal) is given as a vector, that is, a set of consecutive sample values at a given time. In FIG. 4 the number of consecutive sample values is m (m > 1). When the two channel signal vectors [Outside 1], each consisting of m sample values, are applied to the input of the sound source separation process, they undergo separation processing, and the two sound source signal vectors [Outside 2] of the mixing process model, likewise consisting of m sample values each, are estimated as the two estimated signal vectors [Outside 3].

In FIG. 4 there are two signals of each kind (channel signals and estimated signals), but generality is not lost if the numbers of channel signals and estimated signals are extended to N and M, respectively (N > 2, N ≥ M > 2). However, even when there are more sound sources than channels, it is not possible to estimate more sound sources than there are channels.

【０００４】[0004]

The mixing process model shown in FIG. 4 is one in which each of the following two sound source signal vectors of m sample values

[Equation 1]

is subjected to an inner product operation with its own mixing coefficient vector [Outside 4], and the result is added to the other sound source signal. The two mixed signals are input to the sound source separation process shown in FIG. 4 as the channel signal vectors

[Equation 2]

Here, the mixing coefficient vector [Outside 5] shown in the figure denotes the mixing coefficient vector with which the k-th sound source signal is mixed into the n-th sound source signal.

【０００５】[0005]

In other words, the ICA method estimates and separates the sound source signal vectors on the premise that, based on the mixing process model of FIG. 4, the n-th input channel signal can be expressed by the following equation.

[Equation 3]

Here, the notation denotes the inner product of [Outside 6], that is,

[Equation 4]

To separate a given sound source signal, as the estimated signal vector [Outside 7], from a channel signal in which other sound source signals are mixed on the basis of equation (1), it suffices to define the separation coefficient vector [Outside 8] described below, and to subtract, from the channel signal whose sound source signal is to be separated, the result of the inner product of each estimated signal vector corresponding to another sound source signal with its separation coefficient vector.
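The subtraction described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the patent's implementation: the names `inner`, `estimate_source`, `y_n`, `c_nk`, `x_hat_k`, `a_nk`, and `s_n` are assumptions introduced here, and a single interfering source k is assumed, as in the two-signal case of FIG. 4.

```python
def inner(u, v):
    """Inner product of two equal-length vectors of m samples."""
    return sum(a * b for a, b in zip(u, v))


def estimate_source(y_n, c_nk, x_hat_k):
    """Estimate the current sample of source n.

    y_n     : current sample of the n-th channel signal (hypothetical name)
    c_nk    : separation coefficient vector (m values)
    x_hat_k : estimated signal vector of the interfering source k (m values)
    """
    # Subtract the weighted interfering estimate from the channel signal.
    return y_n - inner(c_nk, x_hat_k)


# If the separation coefficients equal the mixing coefficients and the
# interfering estimate is exact, the wanted source sample is recovered.
a_nk = [0.5, -0.2]            # assumed mixing coefficient vector
x_k = [1.0, 2.0]              # interfering source vector
s_n = 0.3                     # wanted source sample
y_n = s_n + inner(a_nk, x_k)  # channel sample according to the mixing model
recovered = estimate_source(y_n, a_nk, x_k)
```

In practice the mixing coefficients are unknown, which is why the text goes on to estimate the separation coefficient vector by successive correction.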

【０００６】[0006]

Let [Outside 8] be the separation coefficient vector used when the sound source signal corresponding to the k-th input channel is removed from the n-th input channel signal in order to estimate the sound source signal corresponding to the n-th input channel. The n-th estimated signal [Outside 9] is then expressed by

[Equation 5]

Substituting equation (1) into equation (2), setting [Outside 10] on the assumption that the estimated signals are quite close to the sound source signals, and rearranging gives

[Equation 6]

If the estimated signals coincide exactly with the sound source signals, the mixing coefficient vector [Outside 5] and the separation coefficient vector [Outside 8] should in theory coincide. Hence, if [Outside 11] holds, the sound source signal has been estimated perfectly. In practice, however, the mixing coefficient vector [Outside 5] is the one defined by the mixing process model and its value is unknown, so the expected value [Outside 12] is taken as the index of separation.

【０００７】[0007]

That is, if the sound source signals are assumed to be mutually uncorrelated, the so-called cross terms vanish and the expected value is expressed by

[Equation 7]

Here, [Outside 13] is the norm of a vector, namely

[Equation 8]

From equation (4), the expected value [Outside 12] is minimized when [Outside 14] is the zero vector, that is, when [Outside 15] holds. Therefore [Outside 12] is regarded as the index of separation, and the separation coefficient vector [Outside 8] that minimizes it is estimated by successive correction, using the value of [Outside 8] at the immediately preceding time.

【０００８】[0008]

The sound source signal corresponding to the k-th input channel is noise for the sound source signals corresponding to the other input channels. Attention is therefore paid only to the sound source signal corresponding to the k-th input channel, regarded as noise, and it is assumed that no sound source signal other than the k-th exists. For example, if in the example of FIG. 4 there is assumed to be no sound source signal other than the one corresponding to the k-th input channel, FIG. 4 can be redrawn as FIG. 5. In FIG. 5, the separation coefficient vector [Outside 8] at a time j is written as [Outside 16], and the mixing coefficient vector [Outside 5] is likewise written as [Outside 17]. Similarly, to make clear that they are values at time j, the k-th sound source signal vector [Outside 18], the n-th channel signal vector [Outside 19], and the k-th estimated signal vector [Outside 20] are written as [Outside 21], respectively.

【０００９】[0009]

In FIG. 5, to correct [Outside 16] so that the separation coefficient vector [Outside 16] becomes equal to the mixing coefficient vector [Outside 17], the residual [Outside 22] shown in the figure should be made as small as possible. The process of successively correcting the separation coefficient vector [Outside 16] so as to minimize the residual [Outside 22], thereby removing the k-th sound source signal vector [Outside 23] as noise, can be described as follows.

1. As the initial setting, the separation coefficient vector [Outside 24] is given an arbitrary initial value. In general, the zero vector is often used.

2. Since FIG. 5 assumes that the n-th sound source signal vector is the zero vector, [Outside 25] holds, and since

[Equation 9]

the residual e_j is obtained as

[Equation 10]

Next, the correction vector [Outside 26] is set to

[Equation 11]

and this correction vector [Outside 26] is multiplied by μ as in

[Equation 12]

and added to the separation coefficient vector [Outside 16] at that time. This successive correction yields the separation coefficient vector [Outside 27] at the next time.

【００１０】[0010]

The element [Outside 28] shown in FIG. 5 denotes a delay circuit whose delay corresponds to the interval between time j and time (j+1), and FIG. 5 as a whole shows a circuit configuration that implements this successive correction process. In the figure, μ is the convergence coefficient, a predetermined value (0 < μ ≤ 1). The successively corrected separation coefficient vector [Outside 27] is then used to estimate the sound source signal at the next time (j+1) by equation (2) above. In reality, contrary to the above assumption, sound source signals other than the k-th exist and are also mixed into the k-th channel signal, so the successive processing described above cannot estimate the sound source signals perfectly. Nevertheless, the correction vector [Outside 26] that minimizes the n-th estimated signal, which corresponds to the residual [Outside 22] in FIG. 5, is the one that brings [Outside 16] closest to [Outside 17], and sound source signals have therefore conventionally been estimated using this idea of successive correction.
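The conventional successive correction described above can be sketched as follows. Because the equation images are not reproduced in this text, the formulas are reconstructed from the prose (equation (5) gives z_j, equation (6) the residual e_j, and equation (7) the correction vector as the product of the residual and the estimated signal vector); the function and variable names are assumptions made for illustration.

```python
def inner(u, v):
    """Inner product of two equal-length vectors of m samples."""
    return sum(a * b for a, b in zip(u, v))


def conventional_step(c_j, x_hat_k, y_n, mu):
    """One unnormalized correction of the separation coefficient vector.

    c_j     : separation coefficient vector at time j
    x_hat_k : k-th estimated signal vector at time j
    y_n     : n-th channel signal sample at time j
    mu      : convergence coefficient, 0 < mu <= 1
    """
    z_j = inner(c_j, x_hat_k)             # output of the inner product, eq. (5)
    e_j = y_n - z_j                       # residual, eq. (6)
    delta = [e_j * x for x in x_hat_k]    # conventional correction vector, eq. (7)
    # Scale by mu and add to obtain the coefficient vector for time j+1.
    c_next = [c + mu * d for c, d in zip(c_j, delta)]
    return c_next, e_j


# Starting from the zero vector, as the text says is common.
c_next, e_j = conventional_step([0.0, 0.0], [1.0, 2.0], 1.0, 0.1)
```

Note that the length of `delta` grows with the power of `x_hat_k`, which is the weakness the invention addresses.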

【００１１】[0011]

[Problems to be Solved by the Invention] As described above, in the conventional ICA method the correction direction of the correction vector [Outside 26] for the separation coefficient vector at a time j is determined by equation (7) on the basis of the estimated signal vector [Outside 29] at time j. However, the magnitude of the correction vector [Outside 26] also depends on the magnitude of the estimated signal vector [Outside 29]. Consequently, as the separation coefficient vector [Outside 16] is successively corrected toward its optimum value, power fluctuations of the estimated signal vector [Outside 29] cause the magnitude of the correction vector [Outside 26] to vary widely, and the successive correction becomes unstable. To avoid this, the convergence coefficient μ has conventionally been set to an unnecessarily small value, which slows the convergence of the successive correction. Moreover, because the convergence speed depends on the magnitude of the input signal, unstable operation can persist even when the convergence coefficient μ is made smaller than necessary.
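A small numerical sketch may make the power dependence concrete. It assumes the single-source situation of FIG. 5, in which the channel sample is the inner product of the mixing coefficient vector and the signal vector; all names and values are hypothetical. Under that assumption, multiplying the signal amplitude by 10 multiplies the length of the unnormalized correction vector by 100.

```python
def inner(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def length(v):
    """Euclidean norm of a vector."""
    return inner(v, v) ** 0.5


def conventional_correction(c_j, a_j, x_hat_k):
    """Unnormalized correction e_j * x_hat_k under the FIG. 5 assumption."""
    e_j = inner(a_j, x_hat_k) - inner(c_j, x_hat_k)  # residual
    return [e_j * x for x in x_hat_k]


a_j = [0.5, -0.2]                 # assumed (unknown in practice) mixing vector
c_j = [0.0, 0.0]                  # current separation coefficient vector
quiet = [0.1, 0.2]                # a low-level signal frame
loud = [10 * x for x in quiet]    # the same frame at 10 times the amplitude

ratio = (length(conventional_correction(c_j, a_j, loud))
         / length(conventional_correction(c_j, a_j, quiet)))
```

Here `ratio` comes out at about 100, so a single fixed μ cannot suit both quiet and loud passages.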

【００１２】[0012]

An object of the present invention is to provide a sound source signal estimating device in which, unlike the conventional case, the successive correction neither becomes unstable nor converges slowly. The device in question is one that, on the assumption that a plurality of (for example, M) sound source signals are mixed with one another and input in digital form as a plurality of (for example, N) channel signals, obtains by successive correction a separation coefficient vector for separating a mixed signal from a channel signal by operating pairwise on the M estimated signals estimated as the sound source signals on the basis of the channel signals, and estimates the sound source signals by subtracting from the channel signal the result of the inner product of the separation coefficient vector and the estimated signal vector serving as the mixed signal.

【００１３】[0013]

[Means for Solving the Problems] To achieve the above object, the present invention is characterized by normalization processing means that normalizes the correction vector [Outside 26] of the separation coefficient vector by the square of the norm of the estimated signal vector [Outside 29], so that the estimated signal vector at time j is used only to determine the correction direction of the correction vector of the separation coefficient vector at time j.

【００１４】[0014]

That is, the sound source signal estimating device according to the present invention is a device that, on the assumption that M (M ≥ 2) sound source signals are mixed with one another and input in digital form as N (N ≥ M ≥ 2) channel signals, obtains by successive correction a separation coefficient vector for separating a mixed signal from a channel signal by operating pairwise on the M estimated signals estimated as the sound source signals on the basis of the N channel signals, and estimates the sound source signals by subtracting from the channel signal the result of the inner product of the separation coefficient vector and the estimated signal vector serving as the mixed signal. The device comprises at least one set of sound source signal estimating means that includes: correction vector generating means that receives the estimated signal serving as the mixed signal and the estimated signal estimated as the sound source signal, and generates and outputs, by a linear operation, a vector for successively correcting the separation coefficient vector; successive correction means that successively corrects the separation coefficient vector by adding the output vector of the correction vector generating means to the separation coefficient vector and outputting the sum as the separation coefficient vector at the next time; and subtracting means that subtracts from the channel signal the result of the inner product of the separation coefficient vector successively corrected by the successive correction means and the estimated signal vector serving as the mixed signal, and outputs the estimated signal estimated as the sound source signal. The correction vector generating means performs a normalization operation using the square of the norm of the estimated signal vector serving as the mixed signal, generates as the correction vector a vector that has the magnitude of the estimated signal estimated as the sound source signal and the direction of the estimated signal vector serving as the mixed signal, and outputs the correction vector multiplied by a predetermined factor.

【００１５】[0015]

[Embodiments of the Invention] The present invention is described in detail below on the basis of an embodiment, with reference to the accompanying drawings. As stated above, the present invention modifies the ICA method so that the correction vector [Outside 26] of the separation coefficient vector [Outside 16] is normalized by the estimated signal vector [Outside 29]. FIG. 1 shows the circuit configuration of one embodiment of the device of the invention, which includes a normalization circuit for that purpose. The normalization performed by the illustrated normalization circuit is expressed by the following equation.

[Equation 13]
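Since the equation image is not reproduced here, the following LaTeX is only a hedged reconstruction of the normalized correction from the surrounding prose (the conventional correction vector, residual times estimated signal vector, divided by the square of the norm of the estimated signal vector). The symbols for the separation coefficient vector and the estimated signal vector are assumptions chosen for this sketch; e_j and μ are the residual and convergence coefficient named in the text.

```latex
% Assumed symbols: \hat{\mathbf{c}}_j      separation coefficient vector at time j
%                  \hat{\mathbf{x}}_{k,j}  k-th estimated signal vector at time j
\Delta\hat{\mathbf{c}}_j
  = \frac{e_j\,\hat{\mathbf{x}}_{k,j}}{\lVert \hat{\mathbf{x}}_{k,j} \rVert^{2}},
\qquad
\hat{\mathbf{c}}_{j+1} = \hat{\mathbf{c}}_j + \mu\,\Delta\hat{\mathbf{c}}_j ,
\qquad 0 < \mu \le 1 .
```

Written this way, scaling the estimated signal vector by any nonzero gain leaves the step in coefficient space unchanged when the channel sample scales with it, which is the power independence the embodiment aims for.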

【００１６】[0016]

FIG. 1 assumes two input channels and two sound source signals, and the other estimated signal itself is input as the residual for each of the two sound source signals. FIG. 1 shows an example circuit configuration in which the estimated signal vector [Outside 29] serving as noise is normalized by the square of its norm and then multiplied by the residual e_j to compute the correction vector [Outside 26] of the separation coefficient vector [Outside 16]. Instead of the normalization circuit of FIG. 1, the normalization of the correction vector may be implemented by placing, immediately before the convergence coefficient multiplier, a multiplier circuit that multiplies the conventional correction vector of equation (7), obtained by multiplying each estimated signal vector by the residual, by [Outside 30]. Alternatively, the normalization may be applied after multiplication by the convergence coefficient. The normalization may also be implemented in software that performs the same operation as the normalization hardware described above.

【００１７】[0017]

The operation of the embodiment shown in FIG. 1 is as follows. First, the two input channel signals (the first channel signal vector [Outside 31] and the second channel signal vector [Outside 32]) are input to the device in parallel, each as a set of m consecutive sample values at a time j. Taking the first channel signal as an example, the inner product of the second estimated signal vector, which is regarded as the estimated signal serving as the mixed signal, and the separation coefficient vector [Outside 33], obtained as described below, is subtracted from the first channel signal by a subtractor (denoted in FIG. 1 by the symbol [Outside 34]), and the result is output as the first estimated signal vector [Outside 35] serving as the sound source signal. In FIG. 1, the symbol [Outside 36] denotes an inner product circuit.

【００１８】[0018]

The separation coefficient vector [Outside 33], one of the two vectors entering the inner product above, is obtained by successive correction as follows. To extract only the direction component of the second estimated signal, regarded as noise, when it is expressed as a vector, the normalization circuit normalizes the second estimated signal vector [Outside 37] by the square of its norm. A multiplier (denoted in FIG. 1 by the symbol [Outside 38]) multiplies the result by the first estimated signal to obtain the correction vector [Outside 39]. The correction vector is then multiplied by the convergence coefficient μ (0 < μ ≤ 1) and added to the separation coefficient vector [Outside 33] at time j (the output of the delay circuit denoted by the symbol [Outside 28] in FIG. 1, which forms the successive correction means and whose delay corresponds to the interval between time j and time (j+1)). The output of the adder denoted by the symbol [Outside 40] in FIG. 1 is taken as the separation coefficient vector at time (j+1).
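The first-channel path just described can be sketched in Python as one update step. This is a minimal sketch under stated assumptions, not the circuit of FIG. 1 itself: the names are hypothetical, and the small `eps` guard against a zero-power frame is an addition that the source does not mention.

```python
def inner(u, v):
    """Inner product of two equal-length vectors of m samples."""
    return sum(a * b for a, b in zip(u, v))


def normalized_step(c_12, x2_hat, y1, mu=0.5, eps=1e-12):
    """One normalized correction for the first channel.

    c_12   : separation coefficient vector at time j
    x2_hat : second estimated signal vector (treated as the mixed-in noise)
    y1     : first channel signal sample at time j
    mu     : convergence coefficient, 0 < mu <= 1
    eps    : assumed guard value, not part of the source description
    """
    # Subtractor: first estimated signal, which also serves as the residual.
    x1_hat = y1 - inner(c_12, x2_hat)
    power = inner(x2_hat, x2_hat)          # square of the norm
    if power <= eps:
        return list(c_12), x1_hat          # skip the update for a silent frame
    # Normalization circuit and multiplier: correction vector.
    delta = [x1_hat * v / power for v in x2_hat]
    # Convergence coefficient multiplier and adder: coefficients for time j+1.
    c_next = [c + mu * d for c, d in zip(c_12, delta)]
    return c_next, x1_hat


c_next, x1_hat = normalized_step([0.0, 0.0], [3.0, 4.0], 5.0, mu=1.0)
residual_after = 5.0 - inner(c_next, [3.0, 4.0])
```

With μ = 1 the corrected coefficients cancel the current frame completely (`residual_after` is zero up to rounding), whatever the level of `x2_hat`. The second channel would use the same step with the roles of the two estimated signals exchanged.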

【００１９】[0019]

The present invention is the result of a theoretical study of how to perform optimal successive correction in the method of the above-mentioned paper by C. Jutten et al., so its theoretical basis is briefly explained next. FIG. 2 shows geometrically the correction direction and correction amount of the separation coefficient vector [Outside 16] for μ = 1 (μ being the convergence coefficient). In particular, for the case of FIG. 5, it shows the sound source vector [Outside 41] at a time j, the set of separation coefficient vectors [Outside 16] used to remove it, and the set of separation coefficient vectors [Outside 42] at the next time j+1 obtained by correcting [Outside 16] with the correction vector [Outside 26].

【００２０】[0020]

In FIG. 2, suppose that the separation coefficient vector [Outside 16] used to remove the sound source vector [Outside 41] at a time j lies at the illustrated position in the vector space. The separation coefficient vector [Outside 42] used at the next time (j+1) is then [Outside 16] corrected by the correction vector [Outside 26]. The direction of the correction vector [Outside 26] is determined by the estimated signal vector [Outside 29] at time j, but since [Outside 43] is assumed, the direction of the correction vector [Outside 26] may be regarded as coinciding with the direction of [Outside 41].

【００２１】[0021]

Accordingly, if the mixing coefficient vector [Outside 17] of the mixing process model lies at the position illustrated in FIG. 2 in the vector space, the plane containing the vector [Outside 44] (mixing coefficient vector minus separation coefficient vector) and the correction vector [Outside 26] is as shown in FIG. 3, and the theoretically optimal correction vector [Outside 26] is the projection of the vector [Outside 44] onto the vector [Outside 41]. Here, let the plane π_j be the set of separation coefficient vectors [Outside 16] for which z_j, obtained by equation (5) for the k-th sound source signal vector [Outside 45] of FIG. 5, equals the n-th channel signal vector y_{n,j}. The plane π_j is then a plane in m-dimensional Euclidean space, as in the following equation.

[Equation 14]

【００２２】[0022]

The plane π_{j+1} is the set of separation coefficient vectors [Outside 42] at time (j+1) obtained by correcting the separation coefficient vectors [Outside 16] on the plane π_j with the correction vector [Outside 26]. Theoretically, therefore, the position in the vector space of the successively corrected separation coefficient vector [Outside 42] is the foot of the perpendicular dropped from the position of the separation coefficient vector [Outside 16] onto the plane π_{j+1}. From FIG. 3, the optimal correction vector [Outside 26] is thus obtained as

[Equation 15]

Here, rearranging equation (6), the residual e_j becomes

[Equation 16]

and using this to rewrite equation (10) gives the following equation.

[Equation 17]

Equation (11) is precisely equation (7), which expresses the correction vector [Outside 26] of the conventional method, normalized by the square of the norm of the estimated signal vector [Outside 29].
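The projection argument can be written out as a short worked derivation. The equation images are not available in this text, so what follows is a hedged reconstruction from the prose under the single-source assumption of FIG. 5, in which the estimated signal vector stands in for the source vector; the symbols for the mixing, separation, and estimated signal vectors are assumptions introduced for this sketch.

```latex
% Assumed symbols: \mathbf{a}_j mixing coefficient vector,
% \hat{\mathbf{c}}_j separation coefficient vector,
% \hat{\mathbf{x}}_{k,j} k-th estimated signal vector, all at time j.
\begin{align*}
y_{n,j} &= \langle \mathbf{a}_j, \hat{\mathbf{x}}_{k,j} \rangle
  && \text{(channel sample, single-source assumption)} \\
e_j &= y_{n,j} - \langle \hat{\mathbf{c}}_j, \hat{\mathbf{x}}_{k,j} \rangle
     = \langle \mathbf{a}_j - \hat{\mathbf{c}}_j, \hat{\mathbf{x}}_{k,j} \rangle
  && \text{(residual)} \\
\Delta\hat{\mathbf{c}}_j
  &= \frac{\langle \mathbf{a}_j - \hat{\mathbf{c}}_j, \hat{\mathbf{x}}_{k,j} \rangle}
          {\lVert \hat{\mathbf{x}}_{k,j} \rVert^{2}}\,\hat{\mathbf{x}}_{k,j}
  && \text{(projection of } \mathbf{a}_j - \hat{\mathbf{c}}_j
     \text{ onto } \hat{\mathbf{x}}_{k,j}\text{)} \\
  &= \frac{e_j\,\hat{\mathbf{x}}_{k,j}}{\lVert \hat{\mathbf{x}}_{k,j} \rVert^{2}}
  && \text{(computable without knowing } \mathbf{a}_j\text{)}
\end{align*}
```

The last line uses only quantities available to the device, which is why the unknown mixing coefficient vector drops out of the final correction.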

【００２３】[0023]

That is, as is clear from FIG. 3, the correction vector [Outside 26] of equation (11) is the optimal amount by which to correct the separation coefficient vector [Outside 16] in the direction of the estimated signal vector [Outside 29]. Accordingly, when the separation coefficient vector [Outside 16] is corrected in the direction of the estimated signal vector [Outside 29], the correction vector [Outside 26] of equation (11) is the correction that minimizes the difference between the corrected separation coefficient vector [Outside 42] and the target mixing coefficient vector [Outside 17]. The successive correction is therefore guaranteed to converge monotonically regardless of the magnitude of the estimated signal vector [Outside 29].
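The monotonic behaviour can be checked numerically with a short Python sketch. It assumes the idealized single-source case of FIG. 5 (the channel sample is exactly the inner product of a fixed mixing vector and the signal frame); the mixing vector, the frames, and all names are hypothetical test values, deliberately spanning a wide range of signal levels.

```python
import math


def inner(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def run_normalized(a, frames, mu):
    """Apply the normalized correction frame by frame.

    Returns the distance between the mixing vector `a` and the separation
    coefficient vector before the first frame and after every frame.
    """
    c = [0.0] * len(a)                      # zero initial coefficients
    errors = [math.dist(a, c)]
    for x in frames:
        e = inner(a, x) - inner(c, x)       # residual for this frame
        power = inner(x, x)                 # square of the norm (nonzero here)
        c = [ci + mu * e * xi / power for ci, xi in zip(c, x)]
        errors.append(math.dist(a, c))
    return errors


a = [0.5, -0.2]
frames = [[1.0, 0.0], [0.0, 100.0], [0.01, 0.01], [5.0, -3.0]]
errors_full = run_normalized(a, frames, mu=1.0)
errors_half = run_normalized(a, frames, mu=0.5)
```

In both runs the distance to the mixing vector never increases from one frame to the next, even though the frame amplitudes differ by four orders of magnitude. Real signals with several active sources will not satisfy the idealized assumption exactly, so this illustrates the geometric argument rather than measured performance.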

【００２４】[0024]

This embodiment has been described for the case of two channel signals and two estimated signals. Because the separation coefficient vectors are obtained by operations between pairs of signals, the present invention is of course equally applicable when there are more than two channel signals and estimated signals.

【００２５】[0025]

[Effects of the Invention] According to the present invention, when individual signals are estimated and separated from a signal in which a voice signal is mixed with a plurality of other musical sounds, voice signals, or noise, the influence of power fluctuations of the individual signals on the estimation and separation can be reduced. In addition, because the convergence coefficient can be made larger, stable and fast signal separation becomes possible.

【図面の簡単な説明】[Brief description of the drawings]

FIG. 1 shows an embodiment of the sound source signal estimating device according to the present invention.

FIG. 2 shows geometrically the correction direction and correction amount of the separation coefficient vector when the separation coefficient vector [Outside 16] is obtained by successive correction.

FIG. 3 shows the plane containing the vector [Outside 44] (mixing coefficient vector minus separation coefficient vector) and the correction vector [Outside 26] of FIG. 2.

FIG. 4 shows the principle of the mixing and separation processes for a plurality of sound source signals when mixed sound source signals are estimated and separated by the ICA method.

FIG. 5 is FIG. 4 redrawn on the assumption that there is no sound source signal other than the one corresponding to the k-th input channel.

Continuation of front page
(72) Inventor: Takashi Nishi, c/o Japan Broadcasting Corporation (NHK) Broadcasting Technology Research Laboratories, 1-10-11 Kinuta, Setagaya-ku, Tokyo
(72) Inventor: Satoru Koizumi, c/o Japan Broadcasting Corporation (NHK) Broadcasting Technology Research Laboratories, 1-10-11 Kinuta, Setagaya-ku, Tokyo
F-terms (reference): 2G064 AB16 AB21 CC13 CC29 5J083 AA05 AB20 AC18 AE08 BE60 9A001 GG03 GG05 HH15

Claims

【特許請求の範囲】[Claims]

[Claim 1] A sound source signal estimating device that, on the assumption that M (M ≥ 2) sound source signals are mixed with one another and input in digital form as N (N ≥ M ≥ 2) channel signals, obtains by successive correction a separation coefficient vector for separating a mixed signal from a channel signal by operating pairwise on the M estimated signals estimated as the sound source signals on the basis of the N channel signals, and estimates the sound source signals by subtracting from the channel signal the result of the inner product of the separation coefficient vector and the estimated signal vector serving as the mixed signal, the device comprising at least one set of sound source signal estimating means that includes:
correction vector generating means that receives the estimated signal serving as the mixed signal and the estimated signal estimated as the sound source signal, and generates and outputs, by a linear operation, a vector for successively correcting the separation coefficient vector;
successive correction means that successively corrects the separation coefficient vector by adding the output vector of the correction vector generating means to the separation coefficient vector and outputting the sum as the separation coefficient vector at the next time; and
subtracting means that subtracts from the channel signal the result of the inner product of the separation coefficient vector successively corrected by the successive correction means and the estimated signal vector serving as the mixed signal, and outputs the estimated signal estimated as the sound source signal,
wherein the correction vector generating means performs a normalization operation using the square of the norm of the estimated signal vector serving as the mixed signal, generates as the correction vector a vector that has the magnitude of the estimated signal estimated as the sound source signal and the direction of the estimated signal vector serving as the mixed signal, and outputs the correction vector multiplied by a predetermined factor.