JP2003078988A - Sound pickup device, method and program, recording medium - Google Patents

Sound pickup device, method and program, recording medium

Info

Publication number
JP2003078988A
Authority
JP
Japan
Prior art keywords
frequency band
signal
sound
level
phase
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP2001269751A
Other languages
Japanese (ja)
Other versions
JP3716918B2 (en)
Inventor
Mariko Aoki
Kenichi Furuya
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nippon Telegraph and Telephone Corp
Original Assignee
Nippon Telegraph and Telephone Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nippon Telegraph and Telephone Corp filed Critical Nippon Telegraph and Telephone Corp
Priority to JP2001269751A priority Critical patent/JP3716918B2/en
Publication of JP2003078988A publication Critical patent/JP2003078988A/en
Application granted granted Critical
Publication of JP3716918B2 publication Critical patent/JP3716918B2/en
Anticipated expiration legal-status Critical
Expired - Fee Related legal-status Critical Current

Landscapes

  • Circuit For Audible Band Transducer (AREA)

Abstract

PROBLEM TO BE SOLVED: To provide a sound pickup device that separates one sound source signal from the acoustic signals of a plurality of sound sources with a high S/N ratio. SOLUTION: Two-channel sound signals from microphones are divided into a plurality of frequency bands for each frame, a level or phase is calculated for each channel and each frequency band, and the levels and phases from past frames up to the current frame are averaged with weights. The inter-channel difference of the weighted-averaged level or phase is calculated, and on the basis of this inter-channel level or phase difference it is determined to which sound source each frequency band component belongs. Based on this determination, the frequency band component signals of each channel are synthesized across the frequency bands to obtain a sound source signal.

Description

Detailed Description of the Invention

[0001]

FIELD OF THE INVENTION: The present invention relates to a device and a method that, when a plurality of sound sources are placed at different positions in a space, use at least two microphones to divide the space into a plurality of zones and pick up the sound from a source in a target zone independently of the sources in the other zones, and to a program that causes a computer to execute the method and a medium on which the program is recorded.

[0002]

DESCRIPTION OF THE RELATED ART: A conventional zone-separated sound pickup technique exploits, for example, the fact that a sound can be expressed as a sum of several frequency components. When a plurality of sounds (a plurality of sound sources) are active at the same time, they are picked up by a plurality of microphones placed apart from one another, and each output channel signal of each microphone is divided, frame by frame, into a plurality of frequency bands chosen so that each band contains mainly only one source-signal component. For each identical band of the divided output channel signals, the differences in the parameters of the acoustic signal arriving at the microphones that vary with the microphone positions, namely level (power) and arrival time, are detected as a per-band inter-channel parameter value difference. Based on this per-band inter-channel parameter difference it is determined which band-divided output channel signal of that band originates from which sound source; on the basis of this determination, at least one of the band-divided output channel signals originating from the same source is selected, and the selected band signals are combined into a source signal. Such a source separation method has been proposed (see Japanese Patent Application Laid-Open No. 10-313497 (Japanese Patent Application No. 09-252312), "Sound source separation method, device and recording medium").

[0003]

PROBLEMS TO BE SOLVED BY THE INVENTION: In the conventional technique, however, as the reverberation time of the room grows longer, errors arise in the computed inter-channel arrival level difference and arrival phase difference (time difference). As a result, sounds from different zones become mixed, and the sound emitted in the target zone is degraded.

[0004]

MEANS FOR SOLVING THE PROBLEMS: To solve the above problems, the present invention takes the inter-channel arrival phase difference and arrival level difference, conventionally computed from the information of a single frame, and averages them with weights over a plurality of frames. This reduces the error in the computed arrival phase and level differences, so that only the sound of the target zone (source) is extracted with a higher S/N than before.

[0005]

DESCRIPTION OF THE PREFERRED EMBODIMENTS: FIG. 1 shows the configuration of a sound pickup device embodying the present invention. The sound pickup means consist of microphones 2-1 and 2-2, which pick up the acoustic signals s1(n), s2(n) (n: time) from sound sources 1-1 and 1-2 and convert them into electrical signals (channel signals) x1(n), x2(n). Band dividing means 3 divides the signal from each pickup means, frame by frame, into frequency bands with a resolution of about 10 to 20 Hz. The divided frequency band signals X1 and X2 are given by equations (1) and (2). The band division can be performed, for example, by a Fourier transform or a wavelet transform.
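As a concrete illustration (not part of the patent text), the frame-wise band division via the Fourier transform might be sketched as follows; the sampling rate, window, and hop size are assumptions chosen to match the stated 10-20 Hz band resolution:

```python
import numpy as np

def band_divide(x, frame_len, hop=None):
    """Split a time signal into overlapping frames and transform each
    frame to frequency band components X(omega, l) via the FFT
    (one band per FFT bin)."""
    hop = hop or frame_len // 2
    n_frames = 1 + (len(x) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack([x[l * hop : l * hop + frame_len] * window
                       for l in range(n_frames)])
    return np.fft.rfft(frames, axis=1)   # shape: (n_frames, n_bins)

# At 16 kHz, a 1600-sample frame gives 16000/1600 = 10 Hz per bin.
fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 440 * t)          # a 440 Hz test tone
X = band_divide(x, frame_len=1600)       # 10 Hz band resolution
print(np.argmax(np.abs(X[0])) * fs / 1600)   # peak band ≈ 440 Hz
```

A shorter frame (e.g. 400 samples, 25 ms) widens the bands to 40 Hz; the patent's 20-40 ms frames trade resolution against latency.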

[0006]

[Equation 1] Here, ω (= 2πf) denotes the angular frequency, and l is the frame index of the signal analysis (for the Fourier transform, the frame length is about 20 to 40 ms).

[0007] The level weighted-averaging means 4 for parameter value difference detection applies to the signal levels of X1 and X2 the weighted average given by equation (3).

[Equation 2] Here 0 < α ≤ 1, L is the number of frames used in the weighted average, and i is the channel index. The phase weighted-averaging means 5 for parameter value difference detection applies to the phases of X1 and X2 the weighted average given by equation (4).

[Equation 3] Here 0 < β ≤ 1, and M is the number of frames used in the weighted average.

[0008] The level weighted-averaging means 6 for signal synthesis applies to the signal levels of X1 and X2 the weighted average given by equation (5).

[Equation 4] Here 0 < γ ≤ 1, and N is the number of frames used in the weighted average.

[0009] The phase weighted-averaging means 7 for signal synthesis applies to the phases of X1 and X2 the weighted average given by equation (6).

[Equation 5] Here 0 < δ ≤ 1, and O is the number of frames used in the weighted average. Providing separate weighted-averaging means 4 and 5 for parameter value difference detection and 6 and 7 for signal synthesis allows α, β, γ, δ and L, M, N, O to be set to different values.
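The weighting equations (3)-(6) appear only as images above, so the sketch below assumes one plausible form: a geometric weight α^k on the frame k steps in the past, normalized over the most recent L frames. The function and its parameters are illustrative stand-ins, not the patented formulas:

```python
import numpy as np

def weighted_average(frames, alpha, L):
    """Weighted average of per-frame values (levels or phases) over the
    L most recent frames, weighting the frame k steps in the past by
    alpha**k (a hypothetical form of the patent's equations (3)-(6)).
    frames: array of shape (n_frames, n_bins)."""
    n_frames = frames.shape[0]
    w = alpha ** np.arange(L)                  # weights 1, alpha, alpha^2, ...
    out = np.empty_like(frames, dtype=float)
    for l in range(n_frames):
        k = min(L, l + 1)                      # frames available so far
        past = frames[l - k + 1 : l + 1][::-1] # newest frame first
        out[l] = (w[:k, None] * past).sum(axis=0) / w[:k].sum()
    return out

levels = np.array([[1.0], [3.0], [5.0]])       # one band, three frames
avg = weighted_average(levels, alpha=0.5, L=2)
print(avg.ravel())   # frame 2: (5 + 0.5*3) / 1.5 ≈ 4.333
```

The same routine serves all four means (4-7); setting distinct (alpha, L) pairs per means mirrors the patent's point that α, β, γ, δ and L, M, N, O may differ.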

[0010] The parameter value difference detecting means 8 uses the levels and phases weighted-averaged by means 4 and 5 to compute the inter-channel level difference (ΔLev) and the inter-channel phase difference (Δarg), given by equations (7) and (8), respectively.
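Equations (7) and (8) are likewise not reproduced; one plausible reading, consistent with the level-ratio threshold test used by the signal determination means below, is the ratio of averaged magnitudes and the wrapped difference of averaged phases:

```python
import numpy as np

def channel_differences(W1, W2, argU1, argU2):
    """Hypothetical forms of equations (7)-(8): DeltaLev as the ratio of
    weighted-averaged magnitudes, Deltaarg as the difference of
    weighted-averaged phases wrapped to (-pi, pi]."""
    d_lev = np.abs(W1) / np.abs(W2)
    d_arg = np.angle(np.exp(1j * (argU1 - argU2)))
    return d_lev, d_arg

d_lev, d_arg = channel_differences(np.array([2.0]), np.array([1.0]),
                                   np.array([0.3]), np.array([0.1]))
print(d_lev[0], d_arg[0])   # 2.0 and ≈ 0.2
```

Wrapping the phase difference through the complex exponential avoids spurious ±2π jumps when the two averaged phases straddle the branch cut.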

[Equation 6]

[0011] The signal determination means 9 decides, based on the inter-channel level difference or inter-channel phase difference computed by the parameter value difference detecting means 8, the weight values Wei1(ω) and Wei2(ω) by which |V1(ω,l)|, arg(Y1(ω,l)) and |V2(ω,l)|, arg(Y2(ω,l)) are multiplied. For example, when the inter-channel level ratio |W1(ω,l)|/|W2(ω,l)| exceeds some value τ greater than 1, |V1(ω,l)| and arg(Y1(ω,l)) are multiplied by 1, while |V2(ω,l)| and arg(Y2(ω,l)) are multiplied by 0 or by some value a(ω) smaller than 1. That is, when channel 1 has the larger level at a given ω, the source is judged to lie in zone 1 (the zone represented by channel 1); when channel 2 has the larger level, the source is judged to lie in zone 2. Similarly, when Δarg is positive, |V1(ω,l)| and arg(Y1(ω,l)) are multiplied by 1 and |V2(ω,l)| and arg(Y2(ω,l)) by 0 or by a value a(ω) smaller than 1: since channel 2 lags, the source is judged to lie in zone 1. Note that arg(X1(ω,l)) and the like are here taken as the negative of the phase angle of the frequency-domain coefficient X1(ω,l) on the complex plane; if they were instead defined as positive, a positive Δarg would mean that channel 1 lags, and zone 2 would be chosen.
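The level-ratio decision can be sketched as below; the threshold τ and the attenuation floor a(ω) (taken as a constant here) are free parameters in the patent, and the symmetric lower threshold 1/τ for zone 2 is an assumption:

```python
import numpy as np

def decide_weights(d_lev, tau, a=0.0):
    """Per-band weights Wei1, Wei2 from the inter-channel level ratio:
    if |W1|/|W2| exceeds tau (> 1) the band is attributed to zone 1
    (Wei1 = 1, Wei2 = a); if it falls below 1/tau, to zone 2;
    otherwise both channels keep the band."""
    wei1 = np.where(d_lev > tau, 1.0, np.where(d_lev < 1.0 / tau, a, 1.0))
    wei2 = np.where(d_lev < 1.0 / tau, 1.0, np.where(d_lev > tau, a, 1.0))
    return wei1, wei2

d_lev = np.array([3.0, 0.2, 1.0])   # zone 1, zone 2, ambiguous
wei1, wei2 = decide_weights(d_lev, tau=2.0)
print(wei1, wei2)   # [1. 0. 1.] [0. 1. 1.]
```

A nonzero floor a(ω) attenuates rather than zeroes the other zone's bands, which softens musical-noise artifacts at the cost of separation.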

[0012] The signal selection means 10 multiplies the level and phase signals output from the signal-synthesis weighted-averaging means 6 and 7 by the weight values Wei1(ω) and Wei2(ω) decided by the signal determination means 9. For simplicity, the weighted averaging of the level and phase for signal synthesis may be omitted in this case.

[0013] The signal synthesis means 11 converts the signals, i.e. the levels |V1(ω,l)|, |V2(ω,l)| and phases arg(Y1(ω,l)), arg(Y2(ω,l)) multiplied by the weight values Wei1(ω), Wei2(ω), from the frequency domain back to the time domain, thereby extracting the sound s1^(n), s2^(n) from each source with a high S/N.
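A minimal sketch of the frequency-to-time conversion, assuming inverse-FFT frames recombined by overlap-add (the patent does not specify the resynthesis method):

```python
import numpy as np

def synthesize(levels, phases, frame_len, hop):
    """Rebuild a time signal from per-frame band levels |V(omega, l)|
    and phases arg(Y(omega, l)) by inverse FFT and overlap-add."""
    spec = levels * np.exp(1j * phases)              # recombine magnitude and phase
    frames = np.fft.irfft(spec, n=frame_len, axis=1)
    n_frames = frames.shape[0]
    out = np.zeros((n_frames - 1) * hop + frame_len)
    for l in range(n_frames):
        out[l * hop : l * hop + frame_len] += frames[l]
    return out

# Round trip a single frame: magnitude/phase -> time domain recovers it.
x = np.sin(2 * np.pi * np.arange(256) / 16)
X = np.fft.rfft(x.reshape(1, -1), axis=1)
y = synthesize(np.abs(X), np.angle(X), frame_len=256, hop=256)
print(np.allclose(y, x))   # True
```

With overlapping analysis windows, the hop and window must satisfy a constant-overlap-add condition for the synthesis to be exact.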

[0014] The processing of the sound pickup device of the present invention is described with reference to FIG. 2. The output channel signals of a plurality of microphones placed apart from one another are input and divided into a plurality of frequency bands for each frame (s1). For each identical band of the band-divided channel signals X1 and X2, the levels and phases computed for each frame are retained over a plurality of past frames, and the weighted averages for parameter value difference detection, |W1(ω,l)|, |W2(ω,l)|, arg(U1(ω,l)), arg(U2(ω,l)), are computed (s2). Likewise, for each identical band of the band-divided output channel signals, the per-frame levels and phases are retained over a plurality of frames and the weighted averages for signal synthesis, |V1(ω,l)|, |V2(ω,l)|, arg(Y1(ω,l)), arg(Y2(ω,l)), are computed (s3). As the difference in the parameters of the acoustic signals reaching the microphones, which vary with the microphone positions, the inter-channel level difference ΔLev and inter-channel phase difference Δarg are detected from the averages of s2 (s4). Based on the per-band inter-channel parameter value difference, it is determined which of the level- and phase-weighted-averaged output signals of that band originates from which source, i.e. the weight values Wei1(ω) and Wei2(ω) to be multiplied are decided (s5). Based on the decision of s5, the weight values are multiplied into the signal-synthesis level and phase weighted-average signals computed in s3, thereby extracting at least one signal originating from the same source (s6). The band signals selected as originating from the same source are combined and output as the source signals s1^(n), s2^(n) (s7).
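The steps s1-s7 can be condensed into a toy end-to-end run under strong simplifications: rectangular non-overlapping frames, a level-only decision, no multi-frame averaging (L = 1), and made-up source frequencies and mixing gains standing in for the two zones:

```python
import numpy as np

fs, frame = 8000, 256
t = np.arange(fs) / fs
s1 = np.sin(2 * np.pi * 500 * t)      # source in zone 1
s2 = np.sin(2 * np.pi * 1500 * t)     # source in zone 2
x1 = 1.0 * s1 + 0.3 * s2              # mic 1: closer to source 1
x2 = 0.3 * s1 + 1.0 * s2              # mic 2: closer to source 2

def frames_fft(x):
    n = len(x) // frame
    return np.fft.rfft(x[:n * frame].reshape(n, frame), axis=1)

X1, X2 = frames_fft(x1), frames_fft(x2)            # s1: band division
d_lev = np.abs(X1) / (np.abs(X2) + 1e-12)          # s2, s4: level difference
tau = 2.0
wei1 = (d_lev > tau).astype(float)                 # s5: zone decision
wei2 = (d_lev < 1.0 / tau).astype(float)
y1 = np.fft.irfft(wei1 * X1, n=frame, axis=1).ravel()   # s6, s7: select
y2 = np.fft.irfft(wei2 * X2, n=frame, axis=1).ravel()   # and synthesize

# y1 should retain the 500 Hz source and suppress the 1500 Hz one.
spec = np.abs(np.fft.rfft(y1))
f = np.fft.rfftfreq(len(y1), 1 / fs)
print(f[np.argmax(spec)])   # ≈ 500
```

The multi-frame weighted averaging of s2/s3, omitted here, is precisely what the invention adds to stabilize d_lev under reverberation.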

[0015] The sound pickup device of the present invention can also be built from a computer having a CPU, memory and the like, a user terminal operated by the accessing user, and a recording medium. The recording medium is a machine-readable medium such as a CD-ROM, a magnetic disk device or a semiconductor memory; the control program recorded on it is read by the computer, controls its operation, and realizes on the computer each component of the embodiment described above: the band dividing means, the level and phase weighted-averaging means for parameter value difference detection, the level and phase weighted-averaging means for signal synthesis, the parameter value difference detecting means, the signal determination means, the signal selection means, the signal synthesis means, and so on.

[0016]

EFFECT OF THE INVENTION: By weighted-averaging the inter-channel arrival time difference and arrival level difference, the present invention reduces the error in computing these values and makes it possible to extract the sound of the target zone (source) with a higher S/N than conventional methods.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a configuration diagram of a sound pickup device embodying the present invention.

FIG. 2 is a diagram explaining the processing of the sound pickup device embodying the present invention.

EXPLANATION OF SYMBOLS

1: sound source; 2: microphone; 3: band dividing means; 4: level weighted-averaging means for parameter value difference detection; 5: phase weighted-averaging means for parameter value difference detection; 6: level weighted-averaging means for signal synthesis; 7: phase weighted-averaging means for signal synthesis; 8: parameter value difference detecting means; 9: signal determination means; 10: signal selection means; 11: signal synthesis means

Continuation of front page: (51) Int.Cl.7 identification code FI theme code (reference): H04R 3/04

Claims (8)

CLAIMS

Claim 1. A sound pickup device for separating at least one sound source from a plurality of sound sources, comprising: band dividing means for dividing signals of at least two channels into a plurality of frequency bands for each frame; weighted-averaging means for parameter value difference detection, which computes a level or phase for each channel and each frequency band and averages it with weights over past frames up to the current frame; inter-channel parameter difference computing means for computing the inter-channel difference of the weighted-averaged level or phase; signal determination means for determining, based on the inter-channel parameter difference, to which sound source the corresponding frequency band component belongs; and source synthesis means for synthesizing a source signal across the frequency bands from the frequency band component signals determined by the signal determination means to originate from the same source.
Claim 2. The sound pickup device according to claim 1, further comprising weighted-averaging means for signal synthesis, which computes a level and phase for each channel and each frequency band, averages them with weights over past frames up to the current frame to form frequency band component signals, and outputs them to the source synthesis means.
Claim 3. A sound pickup method for separating at least one sound source from a plurality of sound sources, comprising: dividing signals of at least two channels into a plurality of frequency bands for each frame and computing a level or phase for each channel and each frequency band; averaging the level and phase with weights over past frames up to the current frame; computing the inter-channel difference of the weighted-averaged level or phase; determining, based on the inter-channel difference of the weighted-averaged level or phase, to which sound source the corresponding frequency band component belongs; and synthesizing a source signal across the frequency bands from the frequency band component signals determined to originate from the same source.
Claim 4. The sound pickup method according to claim 3, wherein each frequency band component signal is a signal whose level and phase have been weighted-averaged, for each channel and each frequency band, over past frames up to the current frame.
Claim 5. A program of a sound pickup method for separating at least one sound source from a plurality of sound sources, the program causing a computer to execute: a process of dividing signals of at least two channels into a plurality of frequency bands for each frame; a process of computing a level or phase for each channel and each frequency band and averaging it with weights over past frames up to the current frame; a process of computing the inter-channel difference of the weighted-averaged level or phase; a process of determining, based on the inter-channel difference of the weighted-averaged level or phase, to which sound source the corresponding frequency band component belongs; and a process of synthesizing a source signal across the frequency bands from the frequency band component signals determined to originate from the same source.
Claim 6. The program of the sound pickup method according to claim 5, further comprising a process of computing a level and phase for each channel and each frequency band and averaging them with weights over past frames up to the current frame to form frequency band component signals.
Claim 7. A computer-readable recording medium on which is recorded a program of a sound pickup method for separating at least one sound source from a plurality of sound sources, the program causing a computer to execute: a process of dividing signals of at least two channels into a plurality of frequency bands for each frame; a process of computing a level or phase for each channel and each frequency band and averaging it with weights over past frames up to the current frame; a process of computing the inter-channel difference of the weighted-averaged level or phase; a process of determining, based on the inter-channel difference of the weighted-averaged level or phase, to which sound source the corresponding frequency band component belongs; and a process of synthesizing a source signal across the frequency bands from the frequency band component signals determined to originate from the same source.
Claim 8. The recording medium according to claim 7, wherein the program further comprises a process of computing a level and phase for each channel and each frequency band and averaging them with weights over past frames up to the current frame to form frequency band component signals.
JP2001269751A 2001-09-06 2001-09-06 Sound collection device, method and program, and recording medium Expired - Fee Related JP3716918B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2001269751A JP3716918B2 (en) 2001-09-06 2001-09-06 Sound collection device, method and program, and recording medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP2001269751A JP3716918B2 (en) 2001-09-06 2001-09-06 Sound collection device, method and program, and recording medium

Publications (2)

Publication Number Publication Date
JP2003078988A true JP2003078988A (en) 2003-03-14
JP3716918B2 JP3716918B2 (en) 2005-11-16

Family

ID=19095514

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2001269751A Expired - Fee Related JP3716918B2 (en) 2001-09-06 2001-09-06 Sound collection device, method and program, and recording medium

Country Status (1)

Country Link
JP (1) JP3716918B2 (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007010897A (en) * 2005-06-29 2007-01-18 Toshiba Corp Sound signal processing method, device, and program
JP2007074665A (en) * 2005-09-09 2007-03-22 Nippon Telegr & Teleph Corp <Ntt> Collected sound reproducing apparatus
JP2007306373A (en) * 2006-05-12 2007-11-22 Nippon Telegr & Teleph Corp <Ntt> Apparatus, method, and program for signal separation, and recording medium
JPWO2006090589A1 (en) * 2005-02-25 2008-07-24 パイオニア株式会社 Sound separation device, sound separation method, sound separation program, and computer-readable recording medium
JP2009010992A (en) * 2008-09-01 2009-01-15 Sony Corp Audio signal processing apparatus, audio signal processing method, and program
JP2009010996A (en) * 2008-09-11 2009-01-15 Sony Corp Sound signal processor and processing method
JP2009506363A (en) * 2005-08-26 2009-02-12 ステップ・コミュニケーションズ・コーポレーション Method and apparatus for adapting to device and / or signal mismatch in a sensor array
JP2009147654A (en) * 2007-12-13 2009-07-02 Sony Corp Sound processor, sound processing system, and sound processing program
JP2012507049A (en) * 2008-10-24 2012-03-22 クゥアルコム・インコーポレイテッド System, method, apparatus and computer readable medium for coherence detection
US8155927B2 (en) 2005-08-26 2012-04-10 Dolby Laboratories Licensing Corporation Method and apparatus for improving noise discrimination in multiple sensor pairs
US8620672B2 (en) 2009-06-09 2013-12-31 Qualcomm Incorporated Systems, methods, apparatus, and computer-readable media for phase-based processing of multichannel signal
US8886499B2 (en) 2011-12-27 2014-11-11 Fujitsu Limited Voice processing apparatus and voice processing method

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4767247B2 (en) * 2005-02-25 2011-09-07 パイオニア株式会社 Sound separation device, sound separation method, sound separation program, and computer-readable recording medium
JPWO2006090589A1 (en) * 2005-02-25 2008-07-24 パイオニア株式会社 Sound separation device, sound separation method, sound separation program, and computer-readable recording medium
JP2007010897A (en) * 2005-06-29 2007-01-18 Toshiba Corp Sound signal processing method, device, and program
US7995767B2 (en) 2005-06-29 2011-08-09 Kabushiki Kaisha Toshiba Sound signal processing method and apparatus
US8155927B2 (en) 2005-08-26 2012-04-10 Dolby Laboratories Licensing Corporation Method and apparatus for improving noise discrimination in multiple sensor pairs
JP2009506363A (en) * 2005-08-26 2009-02-12 ステップ・コミュニケーションズ・コーポレーション Method and apparatus for adapting to device and / or signal mismatch in a sensor array
JP2007074665A (en) * 2005-09-09 2007-03-22 Nippon Telegr & Teleph Corp <Ntt> Collected sound reproducing apparatus
JP4616736B2 (en) * 2005-09-09 2011-01-19 日本電信電話株式会社 Sound collection and playback device
JP2007306373A (en) * 2006-05-12 2007-11-22 Nippon Telegr & Teleph Corp <Ntt> Apparatus, method, and program for signal separation, and recording medium
JP4676920B2 (en) * 2006-05-12 2011-04-27 日本電信電話株式会社 Signal separation device, signal separation method, signal separation program, and recording medium
JP2009147654A (en) * 2007-12-13 2009-07-02 Sony Corp Sound processor, sound processing system, and sound processing program
JP2009010992A (en) * 2008-09-01 2009-01-15 Sony Corp Audio signal processing apparatus, audio signal processing method, and program
JP2009010996A (en) * 2008-09-11 2009-01-15 Sony Corp Sound signal processor and processing method
JP2012507049A (en) * 2008-10-24 2012-03-22 クゥアルコム・インコーポレイテッド System, method, apparatus and computer readable medium for coherence detection
US8724829B2 (en) 2008-10-24 2014-05-13 Qualcomm Incorporated Systems, methods, apparatus, and computer-readable media for coherence detection
US8620672B2 (en) 2009-06-09 2013-12-31 Qualcomm Incorporated Systems, methods, apparatus, and computer-readable media for phase-based processing of multichannel signal
US8886499B2 (en) 2011-12-27 2014-11-11 Fujitsu Limited Voice processing apparatus and voice processing method

Also Published As

Publication number Publication date
JP3716918B2 (en) 2005-11-16

Similar Documents

Publication Publication Date Title
JP5229053B2 (en) Signal processing apparatus, signal processing method, and program
CN100356445C (en) Method and apparatus for separating sound-source signal and method and device for detecting pitch
US7095865B2 (en) Audio amplifier unit
JP4449987B2 (en) Audio processing apparatus, audio processing method and program
US9154895B2 (en) Apparatus of generating multi-channel sound signal
JP3716918B2 (en) Sound collection device, method and program, and recording medium
US20130021502A1 (en) Sound corrector, sound recording device, sound reproducing device, and sound correcting method
JP2010187363A (en) Acoustic signal processing apparatus and reproducing device
JPH0997091A (en) Method for pitch change of prerecorded background music and karaoke system
JP4670682B2 (en) Audio apparatus and directional sound generation method
WO2006057131A1 (en) Sound reproducing device and sound reproduction system
JP2003270034A (en) Sound information analyzing method, apparatus, program, and recording medium
JP3033061B2 (en) Voice noise separation device
JP2005512434A (en) Circuit and method for enhancing a stereo signal
JP2005266797A (en) Method and apparatus for separating sound-source signal and method and device for detecting pitch
KR20060034637A (en) Method and device for removing known acoustic signal
JPH07240990A (en) Microphone device
JP2016163135A (en) Sound collection device, program and method
JP2004325127A (en) Sound source detection method, sound source separation method, and apparatus for executing them
JP3588576B2 (en) Sound pickup device and sound pickup method
JP4249697B2 (en) Sound source separation learning method, apparatus, program, sound source separation method, apparatus, program, recording medium
EP3513573B1 (en) A method, apparatus and computer program for processing audio signals
JP3787103B2 (en) Speech processing apparatus, speech processing method, speech processing program
JP2005062096A (en) Detection method of speaker position, system, program and record medium
JP2009282536A (en) Method and device for removing known acoustic signal

Legal Events

Date Code Title Description
A977 Report on retrieval

Free format text: JAPANESE INTERMEDIATE CODE: A971007

Effective date: 20050728

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20050802

RD03 Notification of appointment of power of attorney

Free format text: JAPANESE INTERMEDIATE CODE: A7423

Effective date: 20050823

A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20050823

R150 Certificate of patent or registration of utility model

Free format text: JAPANESE INTERMEDIATE CODE: R150

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20080909

Year of fee payment: 3

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20090909

Year of fee payment: 4

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20100909

Year of fee payment: 5

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20110909

Year of fee payment: 6

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20120909

Year of fee payment: 7

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20130909

Year of fee payment: 8

LAPS Cancellation because of no payment of annual fees