JP2007318373A

JP2007318373A - Voice input unit, and audio source separation unit

Info

Publication number: JP2007318373A
Application number: JP2006144818A
Authority: JP
Inventors: Hiroshi Hashimoto; 裕志橋本
Original assignee: Kobe Steel Ltd
Current assignee: Kobe Steel Ltd
Priority date: 2006-05-25
Filing date: 2006-05-25
Publication date: 2007-12-06

Abstract

PROBLEM TO BE SOLVED: To prevent the directions of sound sources relative to the microphones used for signal input from being interchanged when audio signals obtained from a plurality of microphones mounted on a moving body whose orientation changes are transmitted to a sound source separation unit for processing. SOLUTION: Based on the detection result of a gyro sensor 10a (the orientation of a microphone unit 20a), two of the eight input audio signals obtained from eight (an example of three or more) microphones 1L-4L and 1R-4R arranged around a reference axis M0 are selected and transmitted to a sound source separation processing section 31, so that the directions of the sound sources relative to the selected microphones are not interchanged. COPYRIGHT: (C)2008,JPO&INPIT

Description

The present invention relates to a voice input device that includes a plurality of microphones for inputting sound in a predetermined acoustic space and that transmits the audio signals obtained by those microphones to a predetermined sound source separation device, and to a sound source separation device including such a voice input device.

When a plurality of sound sources and a plurality of microphones (sound input means) are present in a predetermined acoustic space, each microphone receives an audio signal (hereinafter, a mixed audio signal) in which the individual audio signals from the respective sound sources (hereinafter, sound source signals) are superimposed. A sound source separation method that identifies (separates) each of the sound source signals based only on the plurality of mixed audio signals input in this way is called a blind source separation method (hereinafter, the BSS method).
Furthermore, one type of BSS sound source separation processing is based on independent component analysis (hereinafter, the ICA method). The BSS method based on the ICA method exploits the fact that, in the plurality of mixed audio signals (time-series audio signals) input through the plurality of microphones, the sound source signals are statistically independent of one another. It optimizes a predetermined separation matrix (inverse mixing matrix) and identifies the sound source signals (sound source separation) by filtering the input mixed audio signals with the optimized separation matrix. The separation matrix is optimized by sequential calculation (learning calculation): the separation matrix to be used subsequently is computed from the signals (separated signals) identified (separated) by filtering with the separation matrix set at a given point in time.
Here, in BSS sound source separation processing based on the ICA method, the separated signals are output through output terminals (also called output channels) equal in number to the input mixed audio signals (= the number of microphones). Such processing is described in detail in, for example, Non-Patent Document 1 and Non-Patent Document 2.
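For orientation, the simplest (instantaneous, non-convolutive) form of the model behind ICA-based BSS can be written as below. The symbols s, x, A, W, y, η and φ are not taken from this document, and the natural-gradient update shown is only one well-known learning rule, not necessarily the one used in Non-Patent Documents 1 and 2, which deal with filter-based (convolutive) separation.

```latex
% Mixing of the source signals s(t) into the microphone signals x(t),
% and separation with the matrix W (ideally W \approx A^{-1} up to
% permutation and scaling of the outputs):
x(t) = A\,s(t), \qquad y(t) = W\,x(t)

% One common sequential (learning) update of the separation matrix,
% with step size \eta and a nonlinear function \varphi:
W \leftarrow W + \eta\,\bigl(I - \mathbb{E}\bigl[\varphi(y)\,y^{\mathsf T}\bigr]\bigr)\,W
```

The permutation ambiguity noted in the comment is the reason the assignment of sources to output channels can change when the geometry between the sources and the microphones changes.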
Binaural blind sound source separation is also known as a form of blind sound source separation processing. It separates sound sources by applying time-varying gain adjustment to a plurality of input audio signals based on a model of human hearing, and can be realized with a relatively low computational load. It is described in detail in, for example, Non-Patent Document 3 and Non-Patent Document 4. One type of binaural blind sound source separation processing is blind sound source separation by the binary mask method.
Non-Patent Document 1: Hiroshi Saruwatari, "Basics of Blind Sound Source Separation Using Array Signal Processing," IEICE Technical Report, vol. EA2001-7, pp. 49-56, April 2001.
Non-Patent Document 2: Tomoya Takatani et al., "High-Fidelity Blind Source Separation Using ICA Based on the SIMO Model," IEICE Technical Report, vol. US2002-87, EA2002-108, January 2003.
Non-Patent Document 3: R. F. Lyon, "A computational model of binaural localization and separation," in Proc. ICASSP, 1983.
Non-Patent Document 4: M. Bodden, "Modeling human sound-source localization and the cocktail-party-effect," Acta Acustica, vol. 1, pp. 43-55, 1993.

In BSS sound source separation processing based on the ICA method and in binaural blind sound source separation processing, when the orientation of the microphones relative to a plurality of sound sources changes so that the directions (left and right) in which the sound sources lie relative to the microphones are interchanged, the separated signals output to the respective output terminals (output channels) are interchanged accordingly. Consequently, when the plurality of microphones that input the mixed audio signals to be processed by the sound source separation device are mounted on a moving body whose orientation changes in the acoustic space, such as a worker or a robot, the sound source separation device cannot track a specific sound source; that is, it cannot ensure that the separated signal corresponding to a specific sound source is always output through a specific output terminal.
The present invention has been made in view of the above circumstances. Its object is to provide a voice input device, and a sound source separation device including it, which, when a plurality of microphones are mounted on a moving body whose orientation changes in an acoustic space and the audio signals (mixed audio signals) obtained by those microphones are transmitted to a sound source separation device, can prevent the directions of the sound sources relative to the microphones used to input the audio signals to be processed by the sound source separation device from being interchanged.

In order to achieve the above object, the present invention is applied to a voice input device that includes a plurality of microphones for inputting sound in a predetermined acoustic space and transmits the audio signals obtained by the microphones to a predetermined sound source separation device, and is configured as the voice input device according to the first invention or the voice input device according to the second invention described below.
The first invention is characterized by comprising the components shown in (1-1) to (1-3) below.
(1-1) A first microphone unit having three or more microphones and a support portion that supports them arranged at predetermined positions around a predetermined reference axis.
(1-2) First direction detecting means for detecting the orientation of the first microphone unit about the reference axis as the center of rotation.
(1-3) Signal selection means for selecting, based on the detection result of the first direction detecting means, a subset of two or more signals from the three or more input audio signals obtained by the three or more microphones, and transmitting them to the sound source separation device.
The second invention is characterized by comprising the components shown in (2-1) to (2-4) below.
(2-1) A second microphone unit having a plurality of microphones and a support portion that supports them at predetermined positions around a predetermined reference axis.
(2-2) Rotation driving means for rotationally driving the second microphone unit about the reference axis.
(2-3) Second direction detecting means for detecting, about the reference axis as the center of rotation, the orientation of the part that rotatably supports the second microphone unit or the orientation of the second microphone unit.
(2-4) Direction adjusting means for adjusting the orientation of the second microphone unit by controlling the rotation driving means based on the detection result of the second direction detecting means.
The first direction detecting means and the second direction detecting means may, for example, detect the rotation angle relative to a predetermined reference direction with a gyro sensor.
According to the first invention, when the first microphone unit rotates about the reference axis and its orientation changes, the plurality of microphones used to input the audio signals transmitted to the sound source separation device (that is, the audio signals to be processed by the sound source separation device) can be selected so that the directions of the sound sources relative to those microphones are not interchanged.
According to the second invention, even when the part that rotatably supports the second microphone unit rotates about the reference axis, the second microphone unit can be kept facing a fixed direction.
Therefore, according to the first invention or the second invention, when a plurality of microphones are mounted on a moving body whose orientation changes in an acoustic space and the audio signals (mixed audio signals) obtained by those microphones are transmitted to a sound source separation device, the directions of the sound sources relative to the microphones used to input the audio signals to be processed by the sound source separation device can be prevented from being interchanged.
The present invention can also be regarded as a sound source separation device that includes the voice input device according to the first or second invention described above and that generates, from the plurality of audio signals transmitted from the voice input device, separated signals corresponding to one or more sound sources present in the acoustic space in which the voice input device is placed.
Such a sound source separation device may be, for example, one that performs blind sound source separation processing based on independent component analysis, or one that performs blind sound source separation processing by the binary mask method.

According to the present invention, when a plurality of microphones are mounted on a moving body whose orientation changes in an acoustic space and the audio signals (mixed audio signals) obtained by those microphones are transmitted to a sound source separation device, the directions of the sound sources relative to the microphones used to input the audio signals to be processed by the sound source separation device can be prevented from being interchanged.

Embodiments of the present invention are described below with reference to the accompanying drawings to aid understanding of the invention. The following embodiments are examples embodying the present invention and do not limit its technical scope.
FIG. 1 is a block diagram showing the schematic configuration of a sound source separation device X1 according to a first embodiment of the present invention; FIG. 2 is a schematic external view of the sound source separation device X1; FIG. 3 is a plan view of the microphone unit of the sound source separation device X1; FIG. 4 is a plan view of the microphone unit for explaining the microphone selection processing according to the orientation of the microphone unit in the sound source separation device X1; FIG. 5 is a block diagram showing the schematic configuration of a sound source separation device X2 according to a second embodiment of the present invention; and FIG. 6 is a plan view of the microphone unit for explaining the rotation control of the microphone unit in the sound source separation device X2.

[First Embodiment]
First, the configuration of the sound source separation device X1 according to the first embodiment of the present invention will be described with reference to FIGS. 1 to 3.
As shown in FIG. 1, the sound source separation device X1 includes a microphone unit 20a provided with eight (an example of three or more) microphones 1L to 4L and 1R to 4R for inputting sound in a predetermined acoustic space, a control unit 30a, and a gyro sensor 10a. The control unit 30a includes a sound source separation processing unit 31, an MPU 32a, and a multiplexer 33a.
The sound source separation device X1 transmits two (a plurality) of the audio signals obtained by the microphones 1L to 4L and 1R to 4R, via the multiplexer 33a, to the two input channels In1 and In2 (signal input terminals) of the sound source separation processing unit 31. From the audio signals input to the two input channels In1 and In2, it generates (identifies) separated signals corresponding to two sound sources present in the acoustic space in which the microphones 1L to 4L and 1R to 4R are arranged, and outputs them through output channels Out1 and Out2 (signal output terminals). In the example shown in FIG. 1, the separated signal output through one output channel, Out1, is output to a speaker 40.
The input audio signals on the input channels In1 and In2 are mixed audio signals, each a superposition of the signals of a plurality of sound sources (sound source signals). The sound source separation processing unit 31 executes blind sound source separation processing based on the input audio signals on the input channels In1 and In2, thereby identifying the plurality of sound source signals (here, two) superimposed in each input audio signal, and outputs the identified signals as separated signals. As described above, the blind sound source separation processing executed by the sound source separation processing unit 31 may be, for example, BSS sound source separation processing based on the ICA method or blind sound source separation processing by the binary mask method. The details of these processes are given in Non-Patent Documents 1 to 4 cited above, so their description is omitted here.

As shown in FIG. 3, the microphone unit 20a comprises eight microphones 1L to 4L and 1R to 4R and a support portion 21a that supports them arranged over the full 360° around a predetermined reference axis M0 (an example of the first microphone unit). In the example shown in FIG. 3, the eight microphones 1L to 4L and 1R to 4R are arranged at equal intervals along a circle centered on the reference axis M0 (at intervals such that the central angle formed by adjacent microphone positions and the reference axis M0 is 45°).
In the example shown in FIG. 3, the microphones 3L, 2L, 1L, 4R, 3R, 2R, 1R, and 4L are placed, in that order, at positions spaced 45° apart counterclockwise about the reference axis M0, starting from 0° relative to the front direction D1 of the microphone unit 20a.
The microphone unit 20a, the gyro sensor 10a, the MPU 32a, and the multiplexer 33a constitute an example of the voice input device according to the first invention.
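The microphone arrangement of FIG. 3 can be written down as a small lookup table, which makes it easy to check that the like-numbered L and R microphones sit on opposite sides of the reference axis M0. This is only an illustrative sketch: the names `MIC_ORDER`, `MIC_ANGLE_DEG` and `opposite` are hypothetical and do not appear in the specification.

```python
# Microphone order as described for FIG. 3: starting at the front direction D1
# (0 degrees) and going counterclockwise around M0 in 45-degree steps.
MIC_ORDER = ["3L", "2L", "1L", "4R", "3R", "2R", "1R", "4L"]
STEP_DEG = 360 // len(MIC_ORDER)  # 45 degrees between adjacent microphones

# Hypothetical helper table: angle of each microphone relative to D1.
MIC_ANGLE_DEG = {name: i * STEP_DEG for i, name in enumerate(MIC_ORDER)}


def opposite(name: str) -> str:
    """Return the microphone mounted 180 degrees away from `name`."""
    angle = (MIC_ANGLE_DEG[name] + 180) % 360
    return MIC_ORDER[angle // STEP_DEG]


if __name__ == "__main__":
    for k in range(1, 5):
        print(f"{k}L", "is opposite", opposite(f"{k}L"))
```

With this layout, microphone 1L lies 90° counterclockwise from D1 (to the left of the front direction) and 1R lies at 270° (to the right).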

As shown in the plan view (a) and the side view (b) of FIG. 2, the sound source separation device X1 is mounted on a wearing tool 50 such as a helmet or hat worn by a worker, and the whole device rotates with the wearing tool 50. The sound source separation device X1 is mounted on the wearing tool 50 so that the reference axis M0 substantially coincides with the rotation axis of the wearing tool 50.
The gyro sensor 10a is a sensor that detects, for example, a rotation angle about one axis on the gyroscope principle, and detects the orientation of the microphone unit 20a about the reference axis M0 as the center of rotation (an example of the first direction detecting means). Specifically, the front direction D1 of the microphone unit 20a at the time a predetermined initialization process is executed is taken as the reference direction D0 (see FIG. 4), and the angle of the front direction D1 of the microphone unit 20a relative to the reference direction D0 (the rotation angle relative to D0) is detected as the orientation (rotation angle) of the microphone unit 20a.
The multiplexer 33a selects two of the eight input audio signals obtained by the eight microphones 1L to 4L and 1R to 4R and transmits them to the sound source separation processing unit 31.
The MPU 32a initializes the gyro sensor 10a to set the reference direction D0, receives the detection result of the gyro sensor 10a (the orientation of the microphone unit 20a), and controls the multiplexer 33a based on that result, thereby switching which two of the eight input audio signals obtained by the eight microphones 1L to 4L and 1R to 4R are transmitted to the sound source separation processing unit 31. This control is realized by the MPU 32a executing a predetermined program. The multiplexer 33a and the MPU 32a are an example of the signal selection means.

The control of the multiplexer 33a by the MPU 32a is described below with reference to FIG. 4. FIG. 4 is a plan view of the microphone unit 20a for explaining the selection of the microphones 1L to 4L and 1R to 4R according to the orientation of the microphone unit 20a.
The MPU 32a monitors the rotation angle ω of the microphone unit 20a detected by the gyro sensor 10a (the angle of the front direction D1 of the microphone unit 20a relative to the reference direction D0) and, according to that rotation angle ω, controls which two of the eight audio signals are selected and transmitted to the sound source separation processing unit 31, following the eight rules below. The two input channels of the sound source separation processing unit 31 are referred to as the first input channel In1 and the second input channel In2.

[Rule 1]
When (0° ≤ ω < 22.5°) or (337.5° ≤ ω < 360°), the audio signal of the microphone 1L is transmitted to the first input channel In1, and the audio signal of the microphone 1R is transmitted to the second input channel In2. The state at this time is shown in FIG. 4(a).
[Rule 2]
When (22.5° ≤ ω < 67.5°), the audio signal of the microphone 2L is transmitted to the first input channel In1, and the audio signal of the microphone 2R is transmitted to the second input channel In2. The state at this time is shown in FIG. 4(b).
[Rule 3]
When (67.5° ≤ ω < 112.5°), the audio signal of the microphone 3L is transmitted to the first input channel In1, and the audio signal of the microphone 3R is transmitted to the second input channel In2.
[Rule 4]
When (112.5° ≤ ω < 157.5°), the audio signal of the microphone 4L is transmitted to the first input channel In1, and the audio signal of the microphone 4R is transmitted to the second input channel In2.
[Rule 5]
When (157.5° ≤ ω < 202.5°), the audio signal of the microphone 1R is transmitted to the first input channel In1, and the audio signal of the microphone 1L is transmitted to the second input channel In2.
[Rule 6]
When (202.5° ≤ ω < 247.5°), the audio signal of the microphone 2R is transmitted to the first input channel In1, and the audio signal of the microphone 2L is transmitted to the second input channel In2.
[Rule 7]
When (247.5° ≤ ω < 292.5°), the audio signal of the microphone 3R is transmitted to the first input channel In1, and the audio signal of the microphone 3L is transmitted to the second input channel In2.
[Rule 8]
When (292.5° ≤ ω < 337.5°), the audio signal of the microphone 4R is transmitted to the first input channel In1, and the audio signal of the microphone 4L is transmitted to the second input channel In2. The state at this time is shown in FIG. 4(c).
In this way, the MPU 32a selects the audio signals obtained by two microphones located on opposite sides of the reference axis M0 (1L and 1R, 2L and 2R, 3L and 3R, or 4L and 4R) and has them transmitted to the sound source separation processing unit 31.
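The eight rules amount to dividing the full turn into 45° sectors centered on multiples of 45° and switching to the next microphone pair in each sector, with the L and R roles swapped in the second half-turn. The following sketch shows one way the selection could be computed. The function name `select_microphones` is hypothetical, and Rule 8 is taken to cover 292.5° ≤ ω < 337.5°, the only range that leaves neither a gap nor an overlap with the other rules.

```python
from typing import Tuple


def select_microphones(omega_deg: float) -> Tuple[str, str]:
    """Return (microphone for In1, microphone for In2) for rotation angle omega.

    omega_deg is the angle of the front direction D1 relative to the
    reference direction D0, in degrees; any real value is wrapped to [0, 360).
    """
    omega = omega_deg % 360.0
    # Sector 0 is Rule 1 (centered on 0 degrees), sector 1 is Rule 2, and so on.
    sector = int((omega + 22.5) // 45.0) % 8
    pair = sector % 4 + 1  # microphone pair number 1..4
    if sector < 4:
        # Rules 1 to 4: the L microphone feeds In1, the R microphone feeds In2.
        return f"{pair}L", f"{pair}R"
    # Rules 5 to 8: the roles are swapped because the unit has turned past 157.5 degrees.
    return f"{pair}R", f"{pair}L"


if __name__ == "__main__":
    for omega in (0.0, 45.0, 180.0, 315.0):
        print(omega, select_microphones(omega))
```

In an actual device the MPU 32a would presumably apply the result by setting the select lines of the multiplexer 33a; that hardware interface is not described in the text and is left out here.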

In the sound source separation device X1, which selects audio signals as described above, when the microphone unit 20a rotates about the reference axis M0 and its orientation changes, the two microphones used to input the audio signals transmitted to the sound source separation processing unit 31 (that is, the mixed audio signals to be processed by the sound source separation device) are selected so that the directions of the sound sources (sound source 1 and sound source 2 in FIG. 4) relative to those microphones are not interchanged.
As a result, a worker wearing the wearing tool 50 fitted with the sound source separation device X1 can, even after turning to face a different direction, selectively listen through the speaker 40 to audio in which only the sound from a specific sound source has been separated out (extracted).

[Second Embodiment]
Next, the configuration of the sound source separation device X2 according to the second embodiment of the present invention will be described with reference to the block diagram shown in FIG. 5.
As shown in FIG. 5, the sound source separation device X2 includes a microphone unit 20b provided with two microphones 1L and 1R for inputting sound in a predetermined acoustic space, a control unit 30b, a gyro sensor 10b, and a motor 60. The control unit 30b includes a sound source separation processing unit 31, an MPU 32b, and a driver 33b that operates the motor 60.
The sound source separation device X2 transmits the two (a plurality of) audio signals obtained by the two microphones 1L and 1R to the two input channels In1 and In2 of the sound source separation processing unit 31, respectively. From the audio signals input through the two input channels In1 and In2, it generates (identifies) separated signals corresponding to two sound sources present in the acoustic space in which the microphones 1L and 1R are arranged, and outputs the separated signals through the output channels Out1 and Out2. In the example shown in FIG. 5, the separated signal output through one output channel, Out1, is output to the speaker 40.
The input audio signals on the input channels In1 and In2 are mixed audio signals, each a superposition of the signals of a plurality of sound sources (sound source signals), and the sound source separation processing unit 31 is the same as the sound source separation processing unit 31 in the sound source separation device X1 described above.
The microphone unit 20b, the gyro sensor 10b, the MPU 32b, and the driver 33b constitute an example of a voice input device according to the second invention.

The microphone unit 20b comprises two microphones 1L and 1R and a support portion 21b that supports them at predetermined positions around a predetermined reference axis M0 (an example of the second microphone unit). In the example shown in FIG. 5, the two microphones 1L and 1R are placed equidistant from the reference axis M0 on opposite sides of it.
Hereinafter, the direction orthogonal to the arrangement direction of the two microphones 1L and 1R, in which one microphone 1L is the left direction and the other microphone 1R is the right direction is referred to as a front direction of the microphone unit 20b. In FIG. 5, the direction toward the paper surface is the front direction of the microphone unit 20b.
The motor 60 is constituted by, for example, a stepping motor or the like, and is a driving unit that rotates the microphone unit 20b around the reference axis M0 and stops it in a desired direction (adjusts the direction) (an example of a rotation driving unit). ).
The gyro sensor 10b is a sensor for detecting a rotation angle similar to the gyro sensor 10a in the sound source separation device X1 described above, and a part that rotatably supports the microphone unit 20b when the reference axis M0 is the rotation center (FIG. 5). In this example, the direction of the main body of the motor 60 is detected (an example of second direction detection means). Specifically, the direction of the front direction D2 (see FIG. 6) of the main body of the motor 60 when a predetermined initialization process is executed is set as the reference direction D0 (see FIG. 6), and the main body of the motor 60 with respect to the reference direction D0. The angle of the front direction D2 (rotation angle with respect to the reference direction D0) is detected as the direction (rotation angle) of the motor 60 main body. Here, in the example shown in FIG. 5, the motor 60 main body (support unit of the Mac unit 20 b), the gyro sensor 10 b, and the wearing tool 50 are connected and fixed at the position of the reference axis M 0, and the front direction D 2 of the motor 60 main body is It is set to be the front direction of the wearing tool 50.

The driver 33b is a motor drive circuit that operates the motor 60 in accordance with control commands from the MPU 32b, thereby adjusting the rotation angle of the rotation shaft of the motor 60, that is, the rotation angle of the microphone unit 20b supported by that shaft.
The MPU 32b initializes the gyro sensor 10b in order to set the predetermined reference direction D0 (see FIG. 6). It also receives the detection result of the gyro sensor 10b (the orientation of the part supporting the microphone unit 20b, namely the motor 60 main body) and, based on that result, outputs control commands to the driver 33b to adjust the orientation of the microphone unit 20b. In other words, the MPU 32b controls, through the driver 33b, the rotation angle of the rotation shaft of the motor 60 and hence the orientation of the microphone unit 20b. The MPU 32b and the driver 33b are an example of the orientation adjusting means.
As shown in FIG. 5, the sound source separation device X2 is also attached to a wearing tool 50, such as a helmet or a hat worn by a worker, and the whole device rotates together with the wearing tool 50. The sound source separation device X2 is attached to the wearing tool 50 so that the reference axis M0 substantially coincides with the rotation axis of the wearing tool 50.

The control of the motor 60 by the MPU 32b is described below with reference to FIG. 6. FIG. 6 is a plan view of the microphone unit 20b for explaining the rotation control of the microphone unit 20b according to its orientation.
The MPU 32b monitors the rotation angle ω of the motor 60 main body detected by the gyro sensor 10b (the angle of the front direction D2 of the motor 60 main body, or of the wearing tool 50, relative to the reference direction D0), and controls the orientation of the microphone unit 20b (the rotation angle of the rotation shaft of the motor 60) according to that rotation angle ω.
FIG. 6(a) shows the orientation of the microphone unit 20b in the initial state, in which the front direction D2 of the motor 60 main body (the front direction of the wearing tool 50) faces the predetermined reference direction D0. In this initial state, the microphone unit 20b is initially set so that its front direction faces the reference direction D0. The gyro sensor 10b is also initialized in this state, so that its detection angle ω becomes 0°.
FIG. 6(b) shows a state in which the motor 60 main body (wearing tool 50) has rotated counterclockwise by an angle ω from the initial state (detection angle of the gyro sensor 10b = ω). In this case, the MPU 32b rotates the microphone unit 20b by −ω. As a result, as shown in FIG. 6(c), the front direction of the microphone unit 20b faces the reference direction D0.
FIG. 6(d) shows a state in which the motor 60 main body (wearing tool 50) has rotated clockwise by an angle ω from the initial state (detection angle of the gyro sensor 10b = ω). In this case as well, the MPU 32b rotates the microphone unit 20b by −ω. As a result, as shown in FIG. 6(e), the front direction of the microphone unit 20b faces the reference direction D0.
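The counter-rotation described for FIG. 6 can be sketched as follows. This is a minimal illustration, not the MPU 32b firmware: it assumes that ω is a signed angle in degrees (counterclockwise positive, clockwise negative), and the function names and the `steps_per_rev` value of the stepping motor are hypothetical.

```python
# Minimal sketch of the counter-rotation: the shaft is turned by -omega so
# that the microphone unit keeps facing the reference direction D0.
# Assumptions (not from the source): omega is signed, counterclockwise
# positive; steps_per_rev is a hypothetical stepping-motor resolution.

def wrap_deg(angle):
    """Wrap an angle in degrees into the interval (-180, 180]."""
    wrapped = (angle + 180.0) % 360.0 - 180.0
    return 180.0 if wrapped == -180.0 else wrapped


def shaft_angle_for(body_angle_deg):
    """Shaft angle, relative to the motor body, that cancels the body rotation."""
    return wrap_deg(-body_angle_deg)


def angle_to_steps(angle_deg, steps_per_rev=200):
    """Convert a shaft angle to the nearest whole number of motor steps."""
    return round(angle_deg * steps_per_rev / 360.0)


# Body turned 30 degrees counterclockwise: the shaft is commanded to -30 degrees.
command_ccw = shaft_angle_for(30.0)
# Body turned 45 degrees clockwise (omega = -45): the shaft is commanded to +45 degrees.
command_cw = shaft_angle_for(-45.0)
```

The orientation of the microphone unit relative to D0 is the sum of the body angle and the shaft angle, so it stays at 0° in both cases. Rounding to whole motor steps leaves a small residual error bounded by half a step.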

As described above, the sound source separation device X2 keeps the microphone unit 20b facing a fixed direction even when the worker wearing the wearing tool 50 turns, so that the motor 60 main body, which is the part that rotatably supports the microphone unit 20b, rotates about the reference axis M0.
As a result, a worker wearing the wearing tool 50 fitted with the sound source separation device X2 can, even after turning to face a different direction, selectively listen through the speaker 40 to audio in which only the sound from a specific sound source has been separated (extracted).
The sound source separation device X2 shown in FIG. 5 is configured so that the gyro sensor 10b detects the orientation of the part that rotatably supports the microphone unit 20b (the motor 60 main body), but other configurations are also conceivable.
For example, the gyro sensor 10b may be provided on the support portion 21b of the microphone unit 20b or the like, so that the gyro sensor 10b detects the orientation of the microphone unit 20b itself. In this case, the angle of the rotation shaft of the motor 60 is adjusted (controlled) so that the angle detected by the gyro sensor 10b always remains constant (= 0°).
In addition, although the sound source separation device X2 shown in FIG. 5 has a configuration in which the rotation shaft of the motor 60 coincides with the reference axis M0, a configuration in which the reference axis M0 and the rotation shaft of the motor 60 do not coincide is also conceivable if a linkage mechanism such as gears is used.
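For the variant in which the gyro sensor is mounted on the microphone unit itself, the control becomes a feedback loop that drives the detected angle back to 0°. The sketch below assumes a simple proportional correction and a simulated microphone unit; the controller type, the `gain` and `tolerance` values, and all names are hypothetical and are not specified in the source.

```python
# Minimal feedback sketch: the sensor reads the microphone unit's own angle,
# and the shaft is corrected until that reading returns to 0 degrees.
# The proportional rule, gain, and tolerance are illustrative assumptions.

def regulate_to_zero(read_angle, rotate_by, gain=0.5, tolerance=0.1, max_iters=100):
    """Repeatedly apply a correction of -gain * error until |error| <= tolerance.

    Returns True if the angle settled within the tolerance, False otherwise.
    """
    for _ in range(max_iters):
        error = read_angle()
        if abs(error) <= tolerance:
            return True
        rotate_by(-gain * error)
    return False


class SimulatedMicUnit:
    """Stand-in for the microphone unit with a sensor attached to it."""

    def __init__(self, angle_deg):
        self.angle_deg = angle_deg

    def read_angle(self):
        return self.angle_deg

    def rotate_by(self, delta_deg):
        self.angle_deg += delta_deg


# The wearer has turned, leaving the microphone unit 40 degrees off D0.
unit = SimulatedMicUnit(40.0)
settled = regulate_to_zero(unit.read_angle, unit.rotate_by)
```

With this arrangement the controller does not need to know the rotation of the wearing tool; it only needs the deviation of the microphone unit from the reference direction.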

In the sound source separation devices X1 and X2 described above, a sound source separation processing unit 31 with two inputs and two outputs was used as an example, but a sound source separation processing unit with three or more input/output channels (n channels) may also be used. In that case, in the sound source separation device X1, the number of microphones is n + 1 or more and the number of signals selected by the multiplexer 33a is n. In the sound source separation device X2, the number of microphones is n.
It is also conceivable to use a rotation angle detection sensor other than the gyro sensors 10a and 10b as the sensor for detecting the rotation angle.
Further, the microphone unit 20a in the sound source separation device X1 described above supports three or more microphones 1L to 4L and 1R to 4R arranged over a 360° range around the reference axis M0, but a configuration in which the microphones 1L to 4L and 1R to 4R are arranged over a narrower range is also conceivable. For example, when the rotation angle of the wearing tool 50 is constrained to a range of 90° (±45°), the microphones 1L to 4L and 1R to 4R may be supported in a state arranged over a 270° range around the reference axis M0.

The present invention is applicable to voice input devices.

FIG. 1 is a block diagram showing the schematic configuration of the sound source separation device X1 according to the first embodiment of the present invention.
FIG. 2 is a schematic external view of the sound source separation device X1.
FIG. 3 is a plan view of the microphone unit of the sound source separation device X1.
FIG. 4 is a plan view of the microphone unit for explaining the microphone selection process according to the orientation of the microphone unit in the sound source separation device X1.
FIG. 5 is a block diagram showing the schematic configuration of the sound source separation device X2 according to the second embodiment of the present invention.
FIG. 6 is a plan view of the microphone unit for explaining the rotation control of the microphone unit in the sound source separation device X2.

Explanation of symbols

X1, X2 ... sound source separation devices according to embodiments of the present invention
1, 2 ... sound sources
1L to 4L, 1R to 4R ... microphones
10a, 10b ... gyro sensors
20a, 20b ... microphone units
21a, 21b ... microphone support portions
30a, 30b ... control units
40 ... speaker
In1, In2 ... input channels
Out1, Out2 ... output channels

Claims

A voice input device that includes a plurality of microphones for inputting sound in a predetermined acoustic space and transmits audio signals obtained by the microphones to a predetermined sound source separation device, the voice input device comprising:
a first microphone unit having a plurality of microphones and a support portion that supports them arranged at predetermined positions around a predetermined reference axis;
first orientation detection means for detecting the orientation of the first microphone unit about the reference axis as the center of rotation; and
signal selection means for selecting, based on the detection result of the first orientation detection means, a subset consisting of a plurality of signals from among the plurality of input audio signals obtained by the plurality of microphones, and transmitting the selected signals to the sound source separation device.

A voice input device that includes a plurality of microphones for inputting sound in a predetermined acoustic space and transmits audio signals obtained by the microphones to a predetermined sound source separation device, the voice input device comprising:
a second microphone unit having a plurality of microphones and a support portion that supports them at predetermined positions around a predetermined reference axis;
rotation driving means for rotating the second microphone unit about the reference axis;
second orientation detection means for detecting, about the reference axis as the center of rotation, either the orientation of the part that rotatably supports the second microphone unit or the orientation of the second microphone unit; and
orientation adjusting means for adjusting the orientation of the second microphone unit by controlling the rotation driving means based on the detection result of the second orientation detection means.

The voice input device according to claim 1 or 2, wherein the first orientation detection means or the second orientation detection means detects a rotation angle relative to a predetermined reference direction by means of a gyro sensor.

A sound source separation device comprising the voice input device according to any one of claims 1 to 3, wherein the sound source separation device generates, from a plurality of audio signals transmitted from the voice input device, separated signals corresponding to one or more sound sources existing in the acoustic space in which the voice input device is arranged.