JP7154530B2

JP7154530B2 - Sound source direction identification device

Info

Publication number: JP7154530B2
Application number: JP2018004926A
Authority: JP
Inventors: 昌浩和田; 慶介高橋
Original assignee: 株式会社ユピテル; 株式会社ユピテル鹿児島
Priority date: 2018-01-16
Filing date: 2018-01-16
Publication date: 2022-10-18
Anticipated expiration: 2038-01-16
Also published as: JP2019124570A; JP7403778B2; JP2022180571A

Description

The present invention relates to a sound source direction identification device and the like.

Patent Literature 1 describes a sound source direction estimation device that includes three microphones and estimates the direction of a sound source. Specifically, it discloses a method of calculating the horizontal angle of the sound source direction using three arrival time differences.

Patent Literature 1: JP 2015-161659 A

With the method of Patent Literature 1 for identifying a sound source using three microphones, there was a risk that the direction could not be identified accurately.

The present application has been proposed in view of various problems such as the one described above. One of its objects is to provide a sound source direction identification device and the like that can accurately identify the direction of a sound source, for example using three microphones, by a method different from the conventional technology.
The object of the invention of the present application is not limited to this. The applicant also intends to acquire rights, by divisional application, amendment, or the like, for configurations whose object is to obtain the effects produced by parts of the configurations disclosed in this specification, the drawings, and the like. For example, this specification discloses problems obtained by reading each passage written as "can do ..." as "doing ... is the problem". Each problem is described as independent of the others, and the applicant intends to acquire rights, by divisional application, amendment, or the like, for each configuration that solves such a problem on its own. Even if a problem is only implicitly understood from the description, the applicant intends to claim part of the configurations described in this specification by amendment or divisional application.

(1) The sound source direction identification device of the present application comprises three microphones arranged at the vertices of a triangle, and an identification unit that identifies, based on the differences in arrival time of sound from a sound source to each of the three microphones, a sound source direction running from the position obtained by projecting the position of the sound source onto the plane containing the triangle, along the direction perpendicular to that plane, toward a reference position located inside the region of the plane enclosed by the triangle. In this way, the sound source direction can be identified with three microphones.

For example, when the midpoint of the line segment connecting two microphones is taken as the reference position and a plane containing the two microphones is assumed, identification of the sound source direction with only two microphones is inevitably restricted.
As shown in FIG. 1, assume a plane containing the microphones MIC (Ach) (hereinafter referred to as MICa) and MIC (Bch) (hereinafter referred to as MICb), which are arranged at an interval Dab, and a sound source. In FIG. 1, an arrow indicates the sound source direction, which runs from the position of the sound source (hereinafter referred to as the sound source position) toward the reference position. The sound source direction is expressed by the angle θ (hereinafter referred to as the sound source angle) between the sound source direction and the perpendicular to the line segment connecting the microphones MICa and MICb that passes through the midpoint of that segment (labeled 0° in FIG. 1). In the description below, this perpendicular may be referred to as the 0° line.
Assuming that the sound is a plane wave, the distance difference Ddiff, which is the difference between the distance from the sound source position to the microphone MICa and the distance from the sound source position to the microphone MICb, is the length of one leg of a right triangle whose hypotenuse is the line segment connecting the microphones MICa and MICb and whose other leg is perpendicular to the sound source direction. The distance difference Ddiff is therefore given by (Equation 1) below. In other words, the microphone MICa is farther from the sound source position than the microphone MICb by the distance difference Ddiff.
Ddiff=Dab×sin θ (Equation 1)
Rearranging (Equation 1), the angle θ is given by (Equation 2) below.
θ=arcsin(Ddiff/Dab) (Equation 2)
Further, using the speed of sound Vs, the relationship between the distance difference Ddiff and the arrival time difference Tdiff, which is the difference between the arrival time of the sound at the microphone MICa and the arrival time of the sound at the microphone MICb, is given by (Equation 3) below.
Tdiff=Ddiff/Vs (Equation 3)
Rearranging (Equation 3), the distance difference Ddiff is given by (Equation 4) below.
Ddiff=Vs×Tdiff (Equation 4)
Substituting (Equation 4) into (Equation 2) yields (Equation 5) below.
θ=arcsin(Vs×Tdiff/Dab) (Equation 5)
In (Equation 5), if the interval Dab and the speed of sound Vs are known, the angle θ can be calculated by obtaining the arrival time difference Tdiff, for example by measurement.
However, as shown in FIG. 2, for the microphones MICa and MICb there are two sound source directions that give the same arrival time difference Tdiff: the direction indicated by the angle θ and the direction indicated by the angle θ′. Here, the angle θ′ is 180° minus the angle θ. Therefore, the arrival time difference Tdiff alone cannot determine whether the sound source angle is θ or θ′. If one side of the line passing through the microphones MICa and MICb is called the front direction and the other side the back direction, then when the sound source is in the front direction, the front direction is the real image and the back direction is a virtual image, but the arrival time difference Tdiff alone cannot distinguish which one is the real image.
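As a minimal sketch of (Equation 5) and of the two-microphone ambiguity, the following Python function returns both candidate angles for one microphone pair. The function name, the clamping of the arcsine argument, and the default speed of sound of 343 m/s are assumptions made for illustration and do not come from the specification.

```python
import math

def candidate_angles(t_diff, d_ab, v_s=343.0):
    """Return (theta, theta_prime) in degrees for one microphone pair.

    t_diff: arrival time difference in seconds (Tdiff)
    d_ab:   microphone spacing in metres (Dab)
    v_s:    speed of sound in m/s (Vs); 343 m/s is an assumed default
    """
    ratio = v_s * t_diff / d_ab
    # Measurement noise can push the ratio slightly outside [-1, 1]; clamp it
    ratio = max(-1.0, min(1.0, ratio))
    theta = math.degrees(math.asin(ratio))   # Equation 5
    theta_prime = 180.0 - theta              # mirror-image direction
    return theta, theta_prime

# A distance difference of 5 cm across a 10 cm pair corresponds to sin(theta) = 0.5
theta, theta_prime = candidate_angles(t_diff=0.05 / 343.0, d_ab=0.1)
print(round(theta, 3), round(theta_prime, 3))
```

Both returned angles are equally consistent with the single measured time difference, which is why a further measurement is needed to choose between them.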

In contrast, with the configuration of the present application, the sound source direction can be identified from the differences in arrival time among three microphones that are not on the same straight line and are arranged at the vertices of a triangle. For example, suppose that a third microphone, MIC (Cch) (hereinafter referred to as MICc), is placed on the plane containing the microphones MICa and MICb and the sound source, at a position that, relative to the microphones MICa and MICb, is far from sound 1, whose sound source direction is the angle θ. In this case, if the sound reaches the microphone MICc earlier than the microphones MICa and MICb, the sound source direction can be identified as sound 2 at the angle θ′, and if the sound reaches the microphone MICc later than the microphones MICa and MICb, the sound source direction can be identified as sound 1 at the angle θ.
That is, consider the angle θ, which is calculated by substituting into (Equation 5) the arrival time difference Tab between the arrival time of the sound at the microphone MICa and the arrival time of the sound at the microphone MICb, and the angle θ′, which is 180° minus the angle θ. One of the two can be identified as the sound source angle based on the arrival time difference Tca, which is the difference between the arrival times at the microphones MICa and MICc, or on the arrival time difference Tbc, which is the difference between the arrival times at the microphones MICb and MICc. In the following description, the arrival time differences are written as Tab, Tbc, and Tca when it is necessary to distinguish which set of the microphones MICa to MICc they belong to, and as Tdiff when they are referred to collectively.
Here, the arrival time difference Tdiff is the time obtained by subtracting, for two microphones, the shorter of the times required for the sound to arrive from the longer one. Naturally, which of the two arrival times is longer also reveals which of the two microphones is farther from the sound source.
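The choice between θ and θ′ using the third microphone can be sketched as follows. This is only an illustration under the stated placement of MICc (on the side of the MICa-MICb line away from sound 1); the function name is hypothetical, and returning `None` when MICc is neither the first nor the last microphone reached is an assumption, since the passage does not discuss that case.

```python
def resolve_front_back(theta, t_a, t_b, t_c):
    """Pick theta or 180 - theta (degrees) using the arrival time at MICc.

    Assumes MICc lies on the far side of the MICa-MICb line from sound 1.
    Returns None when MICc is neither first nor last, a case the passage
    does not cover.
    """
    if t_c < min(t_a, t_b):
        return 180.0 - theta   # sound 2: MICc is reached first
    if t_c > max(t_a, t_b):
        return theta           # sound 1: MICc is reached last
    return None

print(resolve_front_back(30.0, t_a=0.0012, t_b=0.0010, t_c=0.0016))  # 30.0
print(resolve_front_back(30.0, t_a=0.0012, t_b=0.0010, t_c=0.0005))  # 150.0
```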

In the following description, to distinguish between the angles θ and θ′ shown in FIG. 2, for a set of two microphones out of the three, the side of the line through the two microphones on which the remaining microphone is not located is called the "front", and the side on which the remaining microphone is located is called the "back". For example, in FIG. 2, for the set of microphones MICa and MICb, when the microphone MICc is at the position shown in FIG. 2, the side where sound 1 is located is the "front" and the side where sound 2 is located is the "back". In the following description, a set of two microphones is referred to as a microphone set; for example, the set of microphones MICa and MICb may be written as the microphone set MICa⇔MICb.

Now, consider the six regions bounded by three lines that pass through the orthocenter of the triangle ABC, whose vertices are the positions of the three microphones, and that are parallel to the respective sides of the triangle. If it is identified which of these six regions contains the sound source direction, the front and back of the sound source direction can be distinguished efficiently for each of the three microphone sets.
As shown in FIG. 3, the six regions bounded by the parallel lines PLab, PLbc, and PLca, which are three lines that pass through the orthocenter of the triangle ABC whose vertices are the positions of the microphones MICa to MICc and that are parallel to the respective sides of the triangle ABC, are referred to as regions 1 to 6. Here, the orthocenter of the triangle ABC is used as the reference position. The direction from the sound source position toward the reference position is the sound source direction, and an arrow in FIG. 3 shows an example. Because the sound is regarded as a plane wave, the sound source direction can be translated freely within the plane containing the microphones MICa to MICc. For the microphone set MICa⇔MICb, regions 1 to 3 are on the "front" and regions 4 to 6 are on the "back". For the microphone set MICb⇔MICc, regions 3 to 5 are on the "front" and regions 1, 2, and 6 are on the "back". For the microphone set MICc⇔MICa, regions 1, 5, and 6 are on the "front" and regions 2 to 4 are on the "back". Therefore, for example, once the sound source direction is identified as being in region 3, it can be efficiently determined to be on the "front" of the microphone set MICa⇔MICb, on the "front" of the microphone set MICb⇔MICc, and on the "back" of the microphone set MICc⇔MICa.
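The front and back assignments listed for regions 1 to 6 amount to a small lookup table. The sketch below encodes the "front" regions exactly as enumerated in the text; the dictionary and function names are hypothetical.

```python
# "Front" regions for each microphone set, as enumerated for FIG. 3
FRONT_REGIONS = {
    "ab": {1, 2, 3},   # microphone set MICa<->MICb
    "bc": {3, 4, 5},   # microphone set MICb<->MICc
    "ca": {1, 5, 6},   # microphone set MICc<->MICa
}

def sides_for_region(region):
    """Return 'front' or 'back' for each microphone set, given a region 1..6."""
    if region not in range(1, 7):
        raise ValueError("region must be an integer from 1 to 6")
    return {pair: ("front" if region in front else "back")
            for pair, front in FRONT_REGIONS.items()}

print(sides_for_region(3))  # {'ab': 'front', 'bc': 'front', 'ca': 'back'}
```

A single region number therefore settles the front and back question for all three microphone sets at once.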

Incidentally, as shown in FIG. 4, consider the 12 regions bounded by a total of six lines: the parallel lines PLab, PLbc, and PLca, which pass through the orthocenter O of the triangle ABC and are parallel to the respective sides of the triangle ABC, and the perpendiculars PLa, PLb, and PLc dropped from each of the vertices A to C to its opposite side. In each of these 12 regions, the descending order of the arrival time differences Tdiff is uniquely determined. For the explanation, the 12 regions are named as follows. Taking the orthocenter O as the base point, the region extending clockwise from the part of the perpendicular PLa on the vertex A side of the orthocenter O to the parallel line PLca is called region R1a, and the regions that follow clockwise from region R1a are called, in order, regions R1b, R2a, R2b, R3a, R3b, R4a, R4b, R5a, R5b, R6a, and R6b. The orthocenter O is the intersection of the perpendiculars PLa, PLb, and PLc.

Here, for simplicity, the case where the triangle ABC is an equilateral triangle is described as an example.
For example, when the sound source position is on the perpendicular PLa, the arrival time difference Tab of the microphone set MICa⇔MICb and the arrival time difference Tca of the microphone set MICa⇔MICc are equal, and the arrival time difference Tbc of the microphone set MICb⇔MICc is 0. That is, the arrival time differences Tab and Tca are the largest and the arrival time difference Tbc is the smallest. When the sound source position is on the perpendicular PLb or PLc, the descending order of the arrival time differences Tdiff is determined in the same way.
Further, for example, when the sound source position is on the parallel line PLab, the arrival time differences Tbc and Tca are equal, and the arrival time difference Tab is twice each of them. That is, the arrival time difference Tab is the largest and the arrival time differences Tbc and Tca are the smallest. This is because the distance difference DDab, which is the difference between the distance from the sound source position to the microphone MICa and the distance from the sound source position to the microphone MICb, equals the length of the side AB; the distance difference DDbc, which is the difference between the distance from the sound source position to the microphone MICb and the distance from the sound source position to the microphone MICc, equals the distance from the vertex B to the midpoint of the side AB; and the distance difference DDca, which is the difference between the distance from the sound source position to the microphone MICc and the distance from the sound source position to the microphone MICa, equals the distance from the vertex A to the midpoint of the side AB. When the sound source position is on the parallel line PLbc or PLca, the descending order of the arrival time differences Tdiff is determined in the same way. In the following description, the distance differences are written as DDab, DDbc, and DDca when it is necessary to distinguish which set of the microphones MICa to MICc they belong to, and as Ddiff when they are referred to collectively.
Next, the case where the sound source position is not on any of these lines is described with reference to FIG. 5, taking as an example the case where the sound source position is in the region R2a and the angle between the sound source direction and the perpendicular PLc is θ. Because the sound source position is in the region R2a, the angle θ is less than 30°.

FIG. 5 is a diagram in which, in order to calculate the distance difference Ddiff for each set, right triangles whose hypotenuses are the respective sides of the triangle ABC, whose vertices are the positions of the microphones MICa to MICc, are drawn in the same manner as in FIG. 1. Specifically, the right triangle ABD has the side AB as its hypotenuse and the side BD, which is perpendicular to the sound source direction, as one leg. The right triangle BCF has the side BC as its hypotenuse and the side FB, which is perpendicular to the sound source direction, as one leg. The right triangle CAE has the side CA as its hypotenuse and the side AE, which is perpendicular to the sound source direction, as one leg.
Here, the angle DBA is θ, the angle FBC is (60°+θ), and the angle CAE is (60°−θ). The distance difference DDab is the length of the side AD of the right triangle ABD, the distance difference DDbc is the length of the side CF of the right triangle BCF, and the distance difference DDca is the length of the side EC of the right triangle CAE.
Letting Dbc be the interval between the microphones MICb and MICc and Dca be the interval between the microphones MICc and MICa, the distance difference DDab is Dab×sin θ, the distance difference DDbc is Dbc×sin(60°+θ), and the distance difference DDca is Dca×sin(60°−θ). Because θ<30°, sin θ<sin(60°−θ)<sin(60°+θ), and because Dab=Dbc=Dca, Dab×sin θ<Dca×sin(60°−θ)<Dbc×sin(60°+θ). That is, DDab<DDca<DDbc.
The descending order of the distance differences Ddiff is determined in the same way for the other regions. Since the descending order of the distance differences Ddiff is the same as the descending order of the arrival time differences Tdiff, the descending order of the arrival time differences Tdiff in each region is as shown in FIG. 6. In FIG. 6, the three arrival time differences Tdiff are listed in descending order for each region. As described above, which arrival time is longer reveals which of the two microphones is farther from the sound source. In FIG. 6, the microphone farther from the sound source is shown in parentheses. For example, in the region R1a, the largest arrival time difference Tdiff is the arrival time difference Tca, and the microphone farther from the sound source position is the microphone MICc.
The descending order of the distance differences Ddiff in each region has been described above taking the case where the triangle ABC is an equilateral triangle as an example. Even when the triangle ABC is not equilateral, as long as it is a triangle whose orthocenter is inside the region enclosed by the triangle ABC, that is, a triangle in which every angle is 90° or less, the descending order of the distance differences Ddiff in each region is likewise uniquely determined. Triangles in which every angle is 90° or less include, for example, right triangles and acute triangles. For a triangle that does not meet this condition, namely an obtuse triangle, the descending order of the distance differences Ddiff does not follow FIG. 6.
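For the equilateral case with the sound source in region R2a, the ordering of the three distance differences can be checked numerically from the sine expressions given for FIG. 5. The following is a small sketch only; the function name and the unit side length are assumptions for illustration.

```python
import math

def distance_differences_r2a(theta_deg, side=1.0):
    """Distance differences for an equilateral triangle, source in region R2a.

    theta_deg is the angle between the sound source direction and the
    perpendicular PLc, so 0 < theta_deg < 30. `side` is Dab = Dbc = Dca.
    """
    if not 0.0 < theta_deg < 30.0:
        raise ValueError("region R2a requires 0 < theta < 30 degrees")
    t = math.radians(theta_deg)
    sixty = math.radians(60.0)
    return {
        "DDab": side * math.sin(t),
        "DDbc": side * math.sin(sixty + t),
        "DDca": side * math.sin(sixty - t),
    }

for theta in (1.0, 15.0, 29.0):
    dd = distance_differences_r2a(theta)
    # Sort the set names from largest to smallest distance difference
    print(theta, sorted(dd, key=dd.get, reverse=True))
```

For each sampled angle the largest value is DDbc and the smallest is DDab, in agreement with DDab<DDca<DDbc.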

Incidentally, in (Equation 1), the closer the direction is to directly sideways, that is, the closer the angle θ is to 90°, the smaller the change in the distance difference Ddiff for a given change in the angle θ. This is clear from (Equation 6) below, obtained by differentiating (Equation 1) with respect to θ.
Ddiff′=(Dab×sin θ)′=Dab×cos θ (Equation 6)
Therefore, as shown in FIG. 7, the difference between the distance difference Ddiff when the sound source angle is θ and the distance difference Ddiff when it is (θ+Δθ) becomes smaller as the angle θ approaches 90°. Consequently, the closer the angle θ is to 90°, the more strongly the calculated angle θ is affected by measurement error in the distance difference Ddiff, and the worse the accuracy of the calculated angle θ becomes. FIG. 7 is similar to FIG. 1 and shows the sound source direction by the angle θ between the sound source direction and the perpendicular (labeled 0° in FIG. 7) to the line segment connecting the microphones MICa and MICb that passes through the midpoint of that segment. Here, the angle θ approaching 90° means that, for the microphone set MICa⇔MICb in question, the distance difference Ddiff and the arrival time difference Tdiff approach their maximum values.
In view of the above, a total of three angles θ can be calculated, one from each of the three arrival time differences Tdiff obtained by treating two of the three microphones as one set. However, the inventors found that the accuracy of the angle θ indicating the sound source direction can be improved by excluding the largest of the three arrival time differences Tdiff from the calculation of the angle θ.
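The loss of accuracy near 90° can be illustrated with a first-order error estimate: differentiating (Equation 1) with respect to θ gives Dab×cos θ, so a small error in Ddiff maps to an angle error of roughly that error divided by Dab×cos θ. The sketch below uses a hypothetical 10 cm spacing and a hypothetical 1 mm error in the distance difference.

```python
import math

def angle_error_deg(theta_deg, d_ab, dd_error):
    """First-order angle error in degrees caused by an error dd_error in Ddiff."""
    slope = d_ab * math.cos(math.radians(theta_deg))   # dDdiff/dtheta
    return math.degrees(dd_error / slope)

for theta in (0, 30, 60, 85):
    print(theta, round(angle_error_deg(theta, d_ab=0.1, dd_error=0.001), 2))
```

The printed error grows as θ moves toward 90°, which is the motivation for discarding the microphone set with the largest arrival time difference.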

Now, if the arrival time differences Tdiff of the remaining two sets, excluding the one set with the largest arrival time difference Tdiff, are used to calculate the angle θ, it is not necessary to identify which of the 12 regions (see FIG. 6) contains the sound source direction. It is sufficient to identify which of the six regions determined by the largest arrival time difference Tdiff contains it.
The six regions determined by the largest arrival time difference Tdiff are, as shown in FIG. 8, the region R1 comprising the regions R1a and R1b, the region R2 comprising the regions R2a and R2b, the region R3 comprising the regions R3a and R3b, the region R4 comprising the regions R4a and R4b, the region R5 comprising the regions R5a and R5b, and the region R6 comprising the regions R6a and R6b. As shown in FIG. 8, the largest arrival time differences Tdiff in the regions R1 to R6 are the arrival time differences Tca, Tbc, Tab, Tca, Tbc, and Tab, respectively.
For example, if the microphone set with the largest arrival time difference Tdiff is the microphone set MICa⇔MICb and the farther microphone is the microphone MICa, the sound source direction is identified as being in the region R3. The region R3 in FIG. 8 straddles the regions 3 and 4 in FIG. 3. Therefore, once the sound source direction is identified as being in the region R3, it cannot be determined whether the sound source direction is on the front or the back for the microphone set MICa⇔MICb, but it can be determined that the sound source direction is on the "front" for the microphone set MICb⇔MICc and on the "back" for the microphone set MICc⇔MICa. Incidentally, even if the set with the second or third largest arrival time difference Tdiff is known, the front and back cannot be determined for the remaining two sets other than that set.
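Selecting one of the regions R1 to R6 from the largest arrival time difference can be sketched as a table lookup. Note the assumptions: the text states the largest set for each region and gives the farther microphone only for R1 (MICc, from the region R1a example) and R3 (MICa); the farther microphones for R2, R4, R5, and R6 in the table below are inferred from the symmetry of an equilateral layout and are not taken from FIG. 6 or FIG. 8. All names are hypothetical.

```python
# (set with the largest Tdiff, farther microphone) -> region of FIG. 8.
# Only the R1 and R3 rows are stated in the text; the others are inferred
# by symmetry for an equilateral triangle (an assumption of this sketch).
REGION_BY_MAX_SET = {
    ("ca", "c"): "R1", ("bc", "c"): "R2", ("ab", "a"): "R3",
    ("ca", "a"): "R4", ("bc", "b"): "R5", ("ab", "b"): "R6",
}

def region_from_arrival_times(t_a, t_b, t_c):
    """Return the region name for arrival times (seconds) at MICa, MICb, MICc."""
    t = {"a": t_a, "b": t_b, "c": t_c}
    sets = {"ab": ("a", "b"), "bc": ("b", "c"), "ca": ("c", "a")}
    # Microphone set whose two arrival times differ the most
    name = max(sets, key=lambda s: abs(t[sets[s][0]] - t[sets[s][1]]))
    m1, m2 = sets[name]
    far = m1 if t[m1] > t[m2] else m2   # later arrival means farther microphone
    return REGION_BY_MAX_SET[(name, far)]

print(region_from_arrival_times(t_a=3.0e-4, t_b=0.5e-4, t_c=1.5e-4))  # R3
```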

The above description assumed that the sound source lies in the plane containing the microphones MICa, MICb, and MICc. However, the configuration of (1) is not limited to that case. FIG. 9 shows a case where the sound source is not in the plane containing the microphones MICa, MICb, and MICc. Here, the plane containing the microphones MICa, MICb, and MICc is called the XY plane, the distance from the sound source to the XY plane is called the distance Dz, the position obtained by projecting the sound source position onto the XY plane along the direction perpendicular to the XY plane is called the projection position, the distance from the projection position to the reference position is called the distance Dxy, and the distance from the sound source to the reference position is called the distance Dd. In the triangle whose vertices are the sound source, the reference position, and the projection position, the angle between the line segment connecting the sound source and the reference position and the line segment connecting the reference position and the projection position is called the angle θz. If the distance Dxy is sufficiently long relative to the distance Dz, the angle θz is small, so the distance Dxy can be approximated by the distance Dd. Similarly, the distances from the projection position to each of the three microphones MICa, MICb, and MICc can be approximated by the distances from the sound source to each of the three microphones. Therefore, for example, the difference between the distance from the projection position to the microphone MICa and the distance from the projection position to the microphone MICb can be approximated by the distance difference DDab.
Therefore, the differences in distance from the projection position to each of the three microphones MICa, MICb, and MICc can be obtained based on the differences in arrival time of the sound from the sound source to each of the three microphones. In other words, even when the sound source is not in the XY plane, the sound source direction running from the projection position, obtained by projecting the sound source position onto the XY plane along the direction perpendicular to the XY plane, toward the reference position can be identified based on the differences in arrival time of the sound from the sound source to each of the three microphones MICa, MICb, and MICc.
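How well the projected geometry approximates the true one can be checked numerically. The sketch below compares the distance difference seen from an elevated source with the one seen from its projection onto the XY plane. The microphone coordinates and source positions are hypothetical values chosen for illustration.

```python
import math

def distance_difference(point, mic_1, mic_2):
    """Difference between the distances from `point` to the two microphones."""
    return math.dist(point, mic_1) - math.dist(point, mic_2)

# Hypothetical layout: two microphones 10 cm apart in the XY plane (z = 0)
mic_a, mic_b = (-0.05, 0.0, 0.0), (0.05, 0.0, 0.0)

def relative_error(dz, x=3.0, y=4.0):
    """Relative error of the projected distance difference for source height dz."""
    true_dd = distance_difference((x, y, dz), mic_a, mic_b)
    projected_dd = distance_difference((x, y, 0.0), mic_a, mic_b)
    return abs(projected_dd - true_dd) / abs(true_dd)

for dz in (0.1, 1.0, 3.0):   # Dxy is 5 m in every case
    print(dz, round(relative_error(dz), 4))
```

The error is small while Dz is much shorter than Dxy and grows as the source rises out of the plane, which matches the condition stated for the approximation.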
In view of the above, the inventors found that the following configuration (2) is preferable for identifying the sound source direction using three microphones.

(2) In the sound source direction identification device of the present application, the reference position is the orthocenter of the triangle. The identification unit determines, based on the one set having the largest arrival time difference among the three arrival time differences calculated from the respective sets, each set consisting of two of the three microphones, which of six regions the sound source direction belongs to, the six regions surrounding the reference position and being partitioned by three perpendiculars that pass through the reference position and are drawn to the respective lines passing through two of the microphones. The identification unit then calculates, based on the arrival time differences of the remaining two sets excluding the one set having the largest arrival time difference, a sound source angle, which is the angle between the sound source direction lying in the determined region among the six regions and one of the three perpendiculars. In this way, the sound source direction can be identified with high accuracy.

That is, the pair with the largest arrival time difference Tdiff determines which of the six regions the sound source direction belongs to. The arrival time differences Tdiff of the remaining two pairs yield the candidate sound source angles θ and θ′, and whichever candidate falls in the determined region is used to calculate the sound source angle. The pair with the largest Tdiff thus resolves, for the remaining two pairs, the front/back ambiguity of the sound source direction, that is, whether the angle is θ or θ′. In addition, excluding the pair with the largest Tdiff improves the accuracy of the sound source angle.
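The selection step described above can be sketched as follows. This is an illustrative sketch only, not the procedure of the embodiment: the function name `classify_by_largest_difference`, the pair labels, the sample values, and the sign convention (the value stored under "BC" is the arrival time at C minus the arrival time at B) are assumptions introduced here. The sketch only picks the pair with the largest |Tdiff| and reports which microphone of that pair the sound reached later; mapping that result to one of the six regions and computing θ are not shown.

```python
def classify_by_largest_difference(tdiff):
    """Split three pairwise arrival time differences into a reference pair and the rest.

    tdiff maps a two-letter pair label (for example "BC") to the arrival time
    at the second microphone minus the arrival time at the first, in seconds.
    This sign convention is an assumption of this sketch.
    """
    # Pair with the largest absolute arrival time difference.
    reference = max(tdiff, key=lambda pair: abs(tdiff[pair]))
    first, second = reference
    # A positive difference means the sound reached the second microphone later.
    farther = second if tdiff[reference] > 0 else first
    # The two remaining pairs are the ones used for the angle calculation.
    remaining = [pair for pair in tdiff if pair != reference]
    return reference, farther, remaining


# Made-up arrival time differences (seconds) that sum to zero around the triangle.
example = {"AB": 0.00005, "BC": -0.00028, "CA": 0.00023}
reference, farther, remaining = classify_by_largest_difference(example)
```

Three possible reference pairs combined with two possible signs give six cases, which matches the number of regions around the reference position.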

(3) The triangle is preferably an equilateral triangle. This simplifies the calculation for deriving the angle θ and reduces the load on the identifying unit during the calculation.

(4) The identifying unit preferably takes each two of the three electrical signals output by the three microphones as a pair and calculates the arrival time difference based on the phase difference calculated for each pair.

The electrical signals contain errors caused by variations in the frequency characteristics of the microphones, the environment, and the like. Consequently, if the arrival time difference were calculated from, for example, the times at which the level of each electrical signal exceeds a threshold, the accuracy of the sound source direction could deteriorate. Calculating the arrival time difference from the phase difference allows the sound source direction to be identified accurately. When the sound contains multiple frequency components, as with the human voice in (5), it is preferable to frequency-analyze the electrical signals and calculate the phase difference for each frequency component.
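As a minimal sketch of how a phase difference at one frequency component translates into an arrival time difference, the relation Tdiff = Δφ / (2πf) can be evaluated as below. The function names, the single-bin DFT used for the frequency analysis, and the test signal are assumptions made for illustration; the text does not state which frequency analysis the device uses.

```python
import cmath
import math


def bin_phase(samples, freq_hz, sample_rate_hz):
    """Phase (radians) of one frequency component, via a single-bin DFT."""
    acc = 0j
    for n, x in enumerate(samples):
        acc += x * cmath.exp(-2j * math.pi * freq_hz * n / sample_rate_hz)
    return cmath.phase(acc)


def arrival_time_difference(sig_a, sig_b, freq_hz, sample_rate_hz):
    """Delay of sig_b relative to sig_a in seconds (positive when b lags a)."""
    dphi = bin_phase(sig_a, freq_hz, sample_rate_hz) - bin_phase(sig_b, freq_hz, sample_rate_hz)
    # Wrap the phase difference into [-pi, pi) before converting to time.
    dphi = (dphi + math.pi) % (2 * math.pi) - math.pi
    return dphi / (2 * math.pi * freq_hz)


# Synthetic check: a 1 kHz tone sampled at 32 kHz, channel b delayed by 100 microseconds.
fs = 32000.0
f = 1000.0
delay = 1.0e-4
a = [math.sin(2 * math.pi * f * n / fs) for n in range(320)]
b = [math.sin(2 * math.pi * f * (n / fs - delay)) for n in range(320)]
estimated = arrival_time_difference(a, b, f, fs)
```

The window of 320 samples holds exactly ten periods of the tone, which keeps the single-bin estimate free of leakage in this synthetic case; real signals would need windowing and a choice of which components to trust.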

(5) Preferably, the sound of the sound source is a human voice, and the distance between each two of the three microphones is at least 57 mm and at most 170 mm. In this way, the direction of the person who is the sound source can be identified accurately.

In configuration (4), where the arrival time difference Tdiff is calculated from the phase difference, the inter-microphone distance is preferably determined based on the frequency of the component used to calculate the phase difference. For example, if the inter-microphone distance equals one wavelength at that frequency, the wave entering one microphone may be accompanied at the other microphone by a wave anywhere from one period ahead to one period behind, and the phase difference can no longer be determined uniquely.
For example, suppose a wave whose phase difference is −π/2 relative to the wave entering one microphone enters the other microphone. The wave actually entering the other microphone lags by 1/4 period, but a wave that leads by 3/4 period could equally have entered, so the electrical signals alone cannot tell whether the phase difference is −π/2 or +3π/2.
If instead the inter-microphone distance is half a wavelength at the frequency used to calculate the phase difference, the wave entering the other microphone is limited to the range from 1/2 period ahead of to 1/2 period behind the wave entering the one microphone. In the case above, the wave entering the other microphone is then known to lag by 1/4 period, and the phase difference can be determined to be −π/2.
To give specific numbers, with a speed of sound of 340 m/s and an inter-microphone distance of 57 mm, the phase difference can be determined for waves at frequencies of 3 kHz or less. With an inter-microphone distance of 170 mm, the phase difference can be determined for waves at frequencies of 1 kHz or less.
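The half-wavelength condition above amounts to f_max = c / (2d), where c is the speed of sound and d the inter-microphone distance. The following sketch reproduces the two figures quoted in the text; the function name `max_unambiguous_frequency` is chosen here for illustration.

```python
def max_unambiguous_frequency(mic_distance_m, speed_of_sound=340.0):
    """Highest frequency (Hz) whose half wavelength is at least the microphone spacing."""
    return speed_of_sound / (2.0 * mic_distance_m)


f_57mm = max_unambiguous_frequency(0.057)   # roughly 3 kHz
f_170mm = max_unambiguous_frequency(0.170)  # 1 kHz
```

With a spacing of 57 mm the limit comes out slightly below 3 kHz (about 2982 Hz), which the text rounds to 3 kHz.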

Thus, the shorter the inter-microphone distance, the higher the frequency at which the phase difference can still be determined, so the range of usable frequencies widens. However, a short inter-microphone distance also makes the phase difference small, particularly at low frequencies, which may introduce errors in the phase difference.

It is known that the upper limit of the fundamental frequency of the human voice is about 200 Hz, that of the first formant frequency about 1 kHz, and that of the second formant frequency about 3 kHz. Here, a formant frequency is a frequency at which the sound pressure level peaks and which characterizes a vowel. Even for a short utterance such as "ah", the electrical signal output by a microphone contains two frequency components at or below 1 kHz, namely the fundamental frequency and the first formant frequency, and three at or below 3 kHz, namely the fundamental frequency, the first formant frequency, and the second formant frequency.
The inventors found that, to identify the sound source direction accurately, the frequency range used for identifying the sound source position is preferably 1 kHz or less, and particularly preferably 3 kHz or less. As noted above, a range of 1 kHz or less contains at least two frequency components, the fundamental frequency and the first formant frequency, and widening the range to 3 kHz or less brings in three, the fundamental frequency, the first formant frequency, and the second formant frequency. Moreover, the sound source can be identified accurately without using frequencies above 3 kHz.
As described above, an inter-microphone distance of 170 mm suffices to calculate phase differences at frequencies of 1 kHz or less, and an inter-microphone distance of 57 mm suffices for frequencies of 3 kHz or less. With an inter-microphone distance in the range of 57 mm to 170 mm, the upper limit of the frequency at which the phase difference can be determined lies between 1 kHz and 3 kHz. At least the fundamental frequency and the first formant frequency can therefore be used to calculate the phase difference. Furthermore, limiting the highest frequency used for the phase difference calculation to about the second formant frequency improves the accuracy of the phase difference at low frequencies. In this way, the sound source direction can be identified accurately for a human voice.

(6) The identifying unit preferably executes repeatedly a sampling process that, over a predetermined period, converts the electrical signals output from each of the three microphones into digital values, and an identifying process that identifies the direction based on the digital values obtained by the sampling process.

In this way, even when it is not known when a sound will be emitted, the sound source direction can be identified whenever a sound occurs. This is suited to devices that operate in response to sound, such as a communication robot that responds to a human voice or a surveillance camera that records where a sound originated. When applied to such a device, it is preferable to control the device so that it moves toward the sound source direction. The device preferably has a function of identifying the position from which a person spoke, a voice recognition function, and a function of turning a predetermined part toward the identified direction of the speaker. A communication robot is a particularly suitable application. The time from the start of one sampling process to the start of the next is preferably 200 ms or less. A communication robot, for example, can then locate the sound source and respond reliably even to a short call such as "Hey". Moreover, with the configuration of the present application, and (5) in particular, the inter-microphone distance can be set to a size suitable for a communication robot.
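To make the 200 ms cycle concrete, the sketch below splits a stream of samples into consecutive cycles. It is an illustration under assumptions: the function names are hypothetical, back-to-back cycles with no overlap are assumed, and the 20 kHz rate is taken from the lower end of the sampling frequency range given for the embodiment.

```python
def samples_per_cycle(sample_rate_hz, cycle_s=0.2):
    """Number of samples collected in one sampling cycle (200 ms by default)."""
    return int(round(sample_rate_hz * cycle_s))


def split_into_cycles(samples, sample_rate_hz, cycle_s=0.2):
    """Cut a sample sequence into consecutive cycles, one per identifying process."""
    step = samples_per_cycle(sample_rate_hz, cycle_s)
    return [samples[i:i + step] for i in range(0, len(samples), step)]


# One second of placeholder samples at 20 kHz yields five 200 ms cycles.
one_second = [0.0] * 20000
frames = split_into_cycles(one_second, 20000.0)
```

Each element of `frames` would be handed to one run of the identifying process, so a call shorter than a second still falls inside at least one complete cycle.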

The inventions in (1) to (6) above can be combined arbitrarily. For example, at least part of the configuration of at least one of inventions (2) onward may be added to all or part of the configuration of invention (1). In particular, an invention obtained by adding at least part of the configuration of at least one of inventions (2) onward to invention (1) is preferable. Any configurations may also be extracted from inventions (1) to (6) and combined. The applicant intends to obtain rights to inventions including these configurations.

The inventions in (A) to (I) described below may likewise be combined arbitrarily. For example, at least part of the configuration of at least one of inventions (B) onward may be added to all or part of the configuration of invention (A). In particular, an invention obtained by adding at least part of the configuration of at least one of inventions (B) onward to invention (A) is preferable. Inventions (1) to (6) above and inventions (A) to (I) below can also be combined arbitrarily. Any configurations may be extracted from inventions (A) to (I) and combined, and configurations extracted from inventions (1) to (6) may be combined with configurations extracted from inventions (A) to (I). The applicant intends to obtain rights to inventions including these configurations.

(A) A device comprising a plurality of microphones preferably has an angle calculation function and a sound source direction identifying function. The angle calculation function obtains the angle between a predetermined reference direction and the direction of a sound source, based on the order in which a sound arrives at two of the microphones and the difference between its arrival times at those two microphones. The sound source direction identifying function, using another microphone different from the two microphones, identifies on which side the sound source lies of the two regions separated by the plane that contains the positions of the two microphones and is perpendicular to the plane containing the positions of these microphones.
In this way, it can be determined on which side the sound source lies of the two regions separated by the plane that contains the positions of the two microphones and is perpendicular to the plane containing the positions of the three microphones, and the angle between the predetermined reference direction and the direction of the sound source can be obtained.
The predetermined reference direction may be, for example, a predetermined direction within the plane containing the positions of the three microphones, and the direction of the sound source may be a direction within that plane (for example, the in-plane component of a three-dimensional vector).
The other microphone used by the sound source direction identifying function may be a single microphone or a plurality of microphones.
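The two functions in (A) can be illustrated with a small far-field sketch. Everything specific in it is an assumption made for illustration and not the method of the embodiment: the plane-wave (far-field) model, the microphone layout, the choice of the perpendicular bisector of the two microphones as the reference direction, and the use of the mean arrival time of the two microphones when testing the third one.

```python
import math

SPEED_OF_SOUND = 340.0  # m/s, the value used in the text


def source_angle(t_p, t_q, t_r, mic_distance_m):
    """Angle (radians) of the sound source from the reference direction.

    Assumed layout: microphones P and Q lie on the x axis at -d/2 and +d/2,
    the third microphone R lies on the +y side, and the reference direction
    is +y. Positive angles turn toward +x (toward Q).
    """
    # Angle calculation: far-field model, sin(theta) = c * (t_p - t_q) / d.
    s = SPEED_OF_SOUND * (t_p - t_q) / mic_distance_m
    s = max(-1.0, min(1.0, s))  # guard against measurement noise
    theta = math.asin(s)        # candidate on R's side of line PQ
    # Side identification: R is reached early when the source is on R's side.
    on_r_side = t_r < 0.5 * (t_p + t_q)
    if on_r_side:
        return theta
    return math.copysign(math.pi, theta) - theta  # mirror image across line PQ


def far_field_times(angle_rad, positions):
    """Arrival times of a plane wave from direction angle_rad (test helper)."""
    ux, uy = math.sin(angle_rad), math.cos(angle_rad)
    return [-(x * ux + y * uy) / SPEED_OF_SOUND for x, y in positions]


d = 0.1  # 100 mm spacing
mics = [(-d / 2, 0.0), (d / 2, 0.0), (0.0, d * math.sqrt(3) / 2)]  # P, Q, R
t_p, t_q, t_r = far_field_times(math.radians(150.0), mics)
estimated_deg = math.degrees(source_angle(t_p, t_q, t_r, d))
```

The two microphones alone cannot distinguish 30° from 150°, since both give the same arrival time difference; the third microphone settles which of the two mirror-image candidates is kept.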

(B) The phrase "using another microphone" is preferably read as "using the order in which the sound arrives at two other microphones that are arranged so as not to be parallel to the two microphones".
In this way, it can be determined more reliably and more accurately on which side the sound source lies of the two regions separated by the plane that contains the positions of the two microphones and is perpendicular to the plane containing the positions of the three microphones. For example, microphones A to E may be placed at the vertices of a regular pentagon ABCDE, with microphones A and B serving as the two microphones used by the angle calculation function and microphones C and D as the other microphones used by the sound source direction identifying function.

(C) The phrase "using another microphone" is preferably read as "using the order in which the sound arrives at one microphone other than the two microphones and at either one of the two microphones".
In this way, adding at least one microphone is enough to determine more reliably and accurately on which side the sound source lies of the two regions separated by the plane that contains the positions of the two microphones and is perpendicular to the plane containing the positions of the three microphones. For example, microphones X to Z may be placed at the vertices of an equilateral triangle XYZ, with microphones X and Y serving as the two microphones used by the angle calculation function, microphone Z as the "one microphone other than the two microphones", and microphone X as "either one of the two microphones".

(D) The device preferably has a function of deciding, based on a predetermined rule, which pair among the plurality of microphones serves as the two microphones used by the angle calculation function and which microphone serves as the other microphone used by the sound source direction identifying function.
In this way, even when the position of the sound source changes, it can be determined more reliably and more accurately on which side the sound source lies of the two regions separated by the plane that contains the positions of the two microphones and is perpendicular to the plane containing the positions of the three microphones, and the angle between the predetermined reference direction and the direction of the sound source can be obtained.
In particular, the predetermined rule is preferably based on the sound detected by each of the plurality of microphones, and preferably on the result of comparing those sounds. For example, the rule may be based on arrival time differences, such as the phase shift between the sounds detected by the microphones.

(E) The predetermined rule is preferably that at least one of the two microphones used by the angle calculation function is a microphone other than the two microphones of the reference pair, the reference pair being the pair of microphones with the largest sound arrival time difference among the plurality of microphones.
In this way, wherever the sound source is located, a large drop in the accuracy with which the angle calculation function calculates the angle between the reference direction and the direction of the sound source can be prevented.
(F) The predetermined rule is preferably that at least one of the two microphones of the reference pair, the reference pair being the pair of microphones with the largest sound arrival time difference among the plurality of microphones, serves as the other microphone used by the sound source direction identifying function.
(G) Among the pairs of microphones that form the sides of a polygon whose vertices are the plurality of microphones, let the reference pair be the pair with the largest sound arrival time difference. For each side of the polygon other than the side formed by the two microphones of the reference pair, the side formed by the reference pair, taken in the direction from the microphone the sound reached earlier toward the microphone it reached later, crosses that side either from the inside of the polygon to the outside or from the outside to the inside. Based on this property, the sound source direction identifying function preferably identifies on which side the sound source lies of the two regions separated by the plane containing the positions of the two microphones that form at least one of those sides.
In this way, wherever the sound source is located, it can be identified more reliably on which side of the two regions the sound source lies. For example, when a microphone is placed at each vertex of a triangle ABC and microphones B and C form the pair with the largest sound arrival time difference, there is a geometric property whereby, for side BC, the direction with respect to side AB (taken from A toward B) is from the outside of triangle ABC toward the inside.

(H) The plurality of microphones preferably comprises a first microphone, a second microphone, and a third microphone at the vertices of a triangle. Of the three pairs that form the three sides of the triangle, namely the pair of the first and second microphones, the pair of the second and third microphones, and the pair of the third and first microphones, the pair with the largest sound arrival time difference is taken as the reference pair. The sound source direction identifying function then preferably does at least one of the following: it treats the sound as having arrived from the region outside the triangle, of the two regions separated by the plane containing the positions of the microphone of the reference pair that the sound reached first and the microphone not in the reference pair; or it treats the sound as having arrived from the region on the inside of the triangle, of the two regions separated by the plane containing the positions of the microphone of the reference pair that the sound reached later and the microphone not in the reference pair.
In this way, three microphones suffice to identify more reliably in which region the sound source lies.

(I) The triangle is preferably an equilateral triangle.
In this way, the three pairs have equal accuracy, and the full 360° can be covered with little directional bias. This is highly effective in a device that detects from which direction around its entire circumference a sound has arrived.

According to the present application, a sound source direction identifying device and the like can be provided that identifies the sound source direction accurately, for example using three microphones, by a method different from the prior art. The effects of the invention are not limited to this. Effects produced by parts of the configurations disclosed in this specification, the drawings, and the like are also disclosed, and the applicant intends to obtain rights to configurations producing those effects through divisional applications, amendments, and the like. For example, passages in this specification stating that something "can be done" explicitly state an effect, and some passages indicate an effect without such wording. Even without such statements, there are effects that can be understood from the configurations themselves.

FIG. 1 is a diagram explaining the relationship between the sound source direction and the difference in distance from the sound source to each of two microphones.
FIG. 2 is a diagram explaining that there are two sound source positions for which the difference in arrival time of the sound at the two microphones is the same.
FIG. 3 is a diagram showing the relationship between the six regions surrounding the orthocenter of triangle ABC and the front and back of the sound source direction for each pair of the three microphones.
FIG. 4 is a diagram showing the arrival time differences on the boundary lines of the 12 regions surrounding the orthocenter O of triangle ABC.
FIG. 5 is a diagram for deriving the differences in distance from the sound source to each of the three microphones when the sound source is in region R2a shown in FIG. 4.
FIG. 6 is a diagram showing the descending order of the arrival time differences in each of the 12 regions surrounding the orthocenter O of triangle ABC.
FIG. 7 is a diagram explaining that the measurement error of the distance difference grows as the sound source approaches the straight line through two microphones.
FIG. 8 is a diagram showing the six regions surrounding the orthocenter O of triangle ABC that are identified by the pair with the largest difference in sound arrival time.
FIG. 9 is a diagram showing the case where the sound source is not in the plane containing the three microphones.
FIG. 10 is a perspective view of a robot according to the embodiment.
FIG. 11 is a perspective view of the sound source direction identifying device, shown together with the fixed part lower housing.
FIG. 12 is an electrical configuration diagram of the sound source direction identifying device.
FIG. 13 is a diagram explaining the polarity of the sound source angle for one pair of microphones.
FIG. 14 is a table showing, for each of the six cases specified by the pair with the largest distance difference and the farther of the two microphones in that pair, which of the front and back sound source angles of each pair is adopted for calculating the sound source angle.
FIG. 15 is a diagram explaining the relationship between the sound source angle for each pair and the overall sound source angle.

The robot 1 shown in FIG. 10 is a communication robot that operates in response to a human voice. The robot 1 includes a fixed part 2 and a movable part 3 that moves relative to the fixed part 2. The directions shown in FIGS. 10 and 11 are used in the following description. The fixed part 2 has a fixed part lower housing 21, a fixed part upper housing 22, the sound source direction identifying device 10, and the like. The fixed part lower housing 21 is bowl-shaped with a bottom surface 23 and houses the sound source direction identifying device 10 and the like. A plane parallel to the bottom surface 23 is the XY plane. The fixed part upper housing 22 is cylindrical, sits on top of the fixed part lower housing 21, and covers the lower part of the movable part 3. A slight gap is provided between the fixed part lower housing 21 and the fixed part upper housing 22, and the microphones MICa to MICc of the sound source direction identifying device 10 (see FIG. 11) are arranged in this gap. The interior of the fixed part upper housing 22 is not tightly packed with components and is not a sound-insulating structure. Sound therefore passes through the interior of the fixed part upper housing 22 in practice, and the microphones MICa to MICc can also pick up sound arriving from behind the sub-boards 12a to 12c (FIG. 11), respectively. The movable part 3 includes a movable part housing 31, a display device 32, and the like. The display device 32 is implemented by, for example, a touch panel or a liquid crystal display. The movable part housing 31 is a sphere with a portion cut away to form a flat face, and the display device 32 is attached to that flat face. Driven by a motor (not shown), the movable part 3 can rotate 360° about the Z axis, which is perpendicular to the bottom surface 23 of the fixed part lower housing 21. When a sound is emitted, the robot 1 rotates the movable part 3 so that the display device 32 faces the sound source, for example a person, that emitted the sound. The sound source direction identifying device 10 identifies the direction of the sound source so that the movable part 3 can be rotated accordingly.

As shown in FIG. 11, the sound source direction identifying device 10 includes a disk-shaped substrate 11, the microphones MICa to MICc, and the like. The substrate 11 has sub-boards 12a to 12c to which the microphones MICa to MICc are attached. Each of the sub-boards 12a to 12c carries one of the microphones MICa to MICc on one face and is mounted on the substrate 11 so that its other face is perpendicular to the substrate 11. The microphones MICa to MICc are omnidirectional condenser microphones and are fixed to the substrate 11. The substrate 11 is substantially parallel to the bottom surface 23 of the fixed part lower housing 21, and the microphones MICa to MICc are at substantially the same Z position relative to the bottom surface 23. The microphones MICa to MICc are arranged at the vertices of an equilateral triangle ABC drawn on the XY plane (see FIG. 3). Because the inter-microphone distance is then the same for all three pairs, the derivation method, such as the derivation formula of (Equation 5), can be shared by the three pairs, which simplifies the calculation for deriving the sound source angle θ. The side length of the equilateral triangle ABC, that is, the distance between each two of the microphones MICa to MICc, is about 100 mm, for example. As a result, the phase difference calculation in (Process 5), described later, includes at least the fundamental frequency and the first formant frequency, and the accuracy of the phase difference at low frequencies is improved, so the microcomputer 41 (described later) can identify the sound source angle θ accurately for a human voice.

As shown in FIG. 12, the sound source direction identification device 10 also includes, below the board 11, amplifiers AMPa to AMPc, sample-and-hold circuits SHa to SHc, a microcomputer 41, and the like. The microphone MICa, the amplifier AMPa, and the sample-and-hold circuit SHa are connected in series in this order. Likewise, the microphone MICb, the amplifier AMPb, and the sample-and-hold circuit SHb are connected in series, as are the microphone MICc, the amplifier AMPc, and the sample-and-hold circuit SHc. The sound source direction identification device 10 therefore has three channels, each running from one of the microphones MICa to MICc to the corresponding sample-and-hold circuit SHa to SHc. The three channels are referred to as channels Ach, Bch, and Cch. Each of the amplifiers AMPa to AMPc amplifies the electrical signal output from the microphone connected to it and outputs the amplified signal to the sample-and-hold circuit connected to it. Each of the sample-and-hold circuits SHa to SHc holds its input signal in synchronization with a sampling clock signal output from the microcomputer 41 and outputs the held signal to the microcomputer 41. The frequency of the sampling clock signal, that is, the sampling frequency, is about 20 to 40 kHz.

When the robot 1 is powered on and the microcomputer 41 starts up, the microcomputer 41 begins (Process 1), described later. Over a predetermined period, it repeatedly executes (Process 1) to (Process 8) for identifying the sound source direction. Thus, even when it is not known when a sound will be emitted, the sound source direction can be identified in response to the sound. (Process 1) is executed at intervals of 200 ms or less, so the sound source direction can be identified even for a short utterance such as "hey". When the robot 1 is powered off, the microcomputer 41 terminates whichever of (Process 1) to (Process 8) it is executing.

(Process 1) The microcomputer 41 AD-converts the electrical signals output from the sample-and-hold circuits SHa to SHc and stores the results in a separate array for each channel. Specifically, the microcomputer 41 stores the AD-converted data in its built-in memory, arranged sequentially channel by channel.

(Process 2) Once a fixed amount of data has been acquired, the microcomputer 41 performs a fast Fourier transform (FFT) on each of the three channels. Specifically, when a predetermined time has elapsed, long enough for the number of AD-converted samples from each of the sample-and-hold circuits SHa to SHc to reach a predetermined count, the microcomputer 41 applies the FFT to the data stored in memory, channel by channel. The predetermined time is about 50 ms to 100 ms. If it exceeds, for example, 200 ms, a time lag arises between being spoken to and responding, and the behavior appears more unnatural. If it is shorter than 50 ms, the number of samples is small and the direction accuracy drops. Keeping the predetermined time within this range allows smooth communication while securing the accuracy of the sound source angle. Because a voice contains a mixture of wavelengths, frequency analysis is performed by FFT. For the FFT, the number of samples is preferably a power of two, for example about 2^8, 2^9, or 2^10.
The microcomputer 41 stores the complex data of each frequency component obtained by the FFT in memory, associated with the frequency index assigned to that component. So that a single phase can be determined when the absolute phase is calculated in (Process 4), described later, the subsequent processes handle only frequencies below 1.7 kHz, for which the half wavelength is longer than the distance between the microphones MICa to MICc. The speed of sound is taken to be 340 m/s in this calculation (with a spacing of about 100 mm, 340 m/s ÷ 0.2 m = 1.7 kHz).
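As a rough check of the numbers in (Process 2), the following sketch computes the frequency limit from the microphone spacing and the frame duration for a power-of-two sample count. The function names and the choice of 20 kHz from the stated 20 to 40 kHz range are illustrative assumptions, not values fixed by the description.

```python
SPEED_OF_SOUND = 340.0   # m/s, value used in the description
MIC_SPACING = 0.100      # m, approximate side length of triangle ABC

def max_unambiguous_frequency(spacing_m, c=SPEED_OF_SOUND):
    """Frequency at which the half wavelength equals the microphone spacing."""
    return c / (2.0 * spacing_m)

def frame_duration_ms(n_samples, fs_hz):
    """Time needed to collect n_samples at sampling frequency fs_hz."""
    return 1000.0 * n_samples / fs_hz

def bin_frequency(index, n_samples, fs_hz):
    """Frequency of FFT bin `index` for an n_samples-point transform."""
    return index * fs_hz / n_samples

f_limit = max_unambiguous_frequency(MIC_SPACING)   # about 1700 Hz
fs = 20_000.0                                      # assumed sampling frequency
n = 2 ** 10                                        # assumed frame length
# Frequency indices below the limit; bin 0 (DC) is skipped.
usable = [k for k in range(1, n // 2) if bin_frequency(k, n, fs) < f_limit]
print(f_limit, frame_duration_ms(n, fs), len(usable))
```

With these assumed values a 1024-sample frame spans roughly 51 ms, which falls inside the 50 ms to 100 ms window stated above.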

(Process 3) Next, for the frequency components below 1.7 kHz, the microcomputer 41 calculates the power from the complex data at each frequency index obtained by the FFT. The power is the sum of the square of the real part and the square of the imaginary part. The microcomputer 41 then stores in memory the frequency indices whose power exceeds a preset threshold. These indices are hereinafter called voiced frequency indices. A frequency index whose power does not exceed the threshold indicates that no speech component is present at that frequency. In the subsequent processes, the microcomputer 41 therefore handles only the voiced frequency indices.
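The selection in (Process 3) can be sketched as follows for one channel's spectrum. The function name, the threshold value, and the skipping of the DC bin are assumptions made for illustration; the description does not say how the three channels are combined when deciding whether an index is voiced.

```python
def voiced_indices(spectrum, threshold, max_index):
    """Return the frequency indices whose power exceeds `threshold`.

    spectrum  : list of complex FFT outputs, indexed by frequency index
    threshold : preset power threshold (application-specific, assumed here)
    max_index : highest index still below the 1.7 kHz limit
    """
    voiced = []
    for k in range(1, max_index + 1):      # skip the DC bin (assumption)
        z = spectrum[k]
        power = z.real ** 2 + z.imag ** 2  # real^2 + imag^2, as in (Process 3)
        if power > threshold:
            voiced.append(k)
    return voiced

example = [0j, 3 + 4j, 0.1 + 0.1j, -2 + 0j, 5j]
print(voiced_indices(example, threshold=3.0, max_index=3))
```

In this toy spectrum, index 4 is never examined because it lies above `max_index`, mirroring the 1.7 kHz restriction.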

(Process 4) Next, the microcomputer 41 calculates the absolute phase from the complex data for each voiced frequency index. As the following formula shows, the calculation covers all four quadrants.
absolute phase = ArcTan[imaginary part, real part]
The absolute phase here is referenced to the start time at which the sample-and-hold circuits SHa to SHc first held the real-time data they sampled. Because the complex data can lie in any of the four quadrants of the complex plane, the calculated absolute phase ranges from −π to +π.
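A minimal sketch of the four-quadrant phase calculation in (Process 4), using Python's standard `math.atan2`. The function name is an assumption; the comparison with a single-argument arctangent is included only to show why the four-quadrant form is needed.

```python
import math

def absolute_phase(z):
    """Four-quadrant phase of a complex FFT value, in radians (-pi..+pi)."""
    return math.atan2(z.imag, z.real)

z = complex(-1.0, -1.0)                 # a value in the third quadrant
four_quadrant = absolute_phase(z)       # lies in the third quadrant
single_argument = math.atan(z.imag / z.real)  # quadrant information is lost
print(four_quadrant, single_argument)
```

The single-argument form cannot distinguish the third quadrant from the first, which is why the two-argument form is used.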

(Process 5) Next, for each voiced frequency index, the microcomputer 41 takes the absolute phases of the three channels two at a time and obtains three phase differences in total: channel Ach versus channel Bch, channel Bch versus channel Cch, and channel Cch versus channel Ach. Here, the phase difference of channel Ach versus channel Bch is calculated by subtracting the absolute phase of channel Ach from that of channel Bch, the phase difference of channel Bch versus channel Cch by subtracting the absolute phase of channel Bch from that of channel Cch, and the phase difference of channel Cch versus channel Ach by subtracting the absolute phase of channel Cch from that of channel Ach.
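The three subtractions of (Process 5) might look like the sketch below. The wrapping step is an assumption added here: the description states only the subtraction order, but the difference of two phases in the range −π to +π can fall outside that range, and wrapping brings it back to a single value. The function names and dictionary keys are illustrative.

```python
import math

def wrap_phase(angle):
    """Wrap an angle in radians into the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi

def pair_phase_differences(phase_a, phase_b, phase_c):
    """Phase differences for the three channel pairs, in the stated order."""
    return {
        "AB": wrap_phase(phase_b - phase_a),  # Bch minus Ach
        "BC": wrap_phase(phase_c - phase_b),  # Cch minus Bch
        "CA": wrap_phase(phase_a - phase_c),  # Ach minus Cch
    }

diffs = pair_phase_differences(3.0, -3.0, 0.5)
print(diffs)
```

In this example the raw difference for the first pair is −6.0 rad, which wraps to a small positive value.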

The plus and minus polarity of the sound source angle for one pair of microphones is defined as shown in FIG. 11, which illustrates the definition using the pair of microphones MICa and MICb out of the three pairs. The definitions of the sound source angle, front and back, and so on are the same as given above. That is, the perpendicular to the line segment connecting the microphones MICa and MICb that passes through the midpoint of that segment is called the 0° line. The sound source direction is the direction from the projected position, obtained by projecting the sound source position onto the plane containing the triangle ABC along the direction perpendicular to that plane, toward the reference position, which is the orthocenter of the triangle ABC. The sound is treated as a plane wave, and the sound source angle θ is the angle between the 0° line and the direction from the projected position of the sound source toward the midpoint of the segment connecting the microphones MICa and MICb. With respect to the line through the microphones MICa and MICb, the side on which the microphone MICc is not located is the front, and the side on which it is located is the back.
For the microphones MICa and MICb, channel Ach is the channel whose phase is subtracted when the phase difference is calculated. Relative to the 0° line, the side away from the microphone MICa of channel Ach is defined as plus, and the side of the microphone MICa is defined as minus. In other words, if the phase difference is positive, the microphone MICa is farther from the sound source than the microphone MICb, and if it is negative, the microphone MICb is farther from the sound source than the microphone MICa.
The other pairs are defined in the same way. For the microphones MICb and MICc, channel Bch is the channel whose phase is subtracted; relative to the 0° line, the side away from the microphone MICb is defined as plus, and the side of the microphone MICb is defined as minus. For the microphones MICc and MICa, channel Cch is the channel whose phase is subtracted; relative to the 0° line, the side away from the microphone MICc is defined as plus, and the side of the microphone MICc is defined as minus. In the following description, the sound source angle θ may be referred to as a direction value.

(Process 6) Next, for each voiced frequency index, the microcomputer 41 calculates the arrival time difference Tdiff from the phase difference and the frequency of that index. Deriving Tdiff from the phase difference yields an accurate value. An alternative would be to take, as the delay time difference, the difference between the times at which the two real-time waveforms, before the FFT, each exceed a preset volume threshold. Such a real-time-waveform method, however, is easily affected by the frequency characteristics of the microphones and by differences in those characteristics between the two microphones. For example, if one microphone has poor sensitivity in a certain frequency band and the level in that band drops, its real-time waveform differs from that of the other microphone. The delay time difference then deviates from the actual value, and the accuracy of Tdiff suffers. In the real-time-waveform method, the threshold setting also strongly affects the delay time difference: because the two waveforms differ, as noted above, the delay time difference varies with the volume threshold. That method is also easily affected by the surroundings, for example by sound reflected from walls. By contrast, with the phase-difference method of this embodiment, the frequency characteristics of the microphones and reflected sound are less likely to affect Tdiff than with the real-time-waveform method, so Tdiff can be obtained accurately. As described later, Tdiff is obtained for each frequency component of the voice and the sound source angle θ is obtained from it; there is thus no correlation between frequency components, and the result is less affected by the frequency characteristics of the microphones and the environment.

After calculating Tdiff for each voiced frequency index, the microcomputer 41 calculates the weighted average of Tdiff over all voiced frequency indices. The weight (level) used is √(real part^2 + imaginary part^2) at the frequency index concerned. In the subsequent processes, the microcomputer 41 uses this single weighted-average value rather than the per-index values.
After obtaining one arrival time difference Tdiff for each pair, the microcomputer 41 calculates the distance difference Ddiff from Tdiff, the speed of sound, and (Equation 4). The speed of sound is taken to be 340 m/s. The plus and minus polarity of the phase difference is carried over to Tdiff and Ddiff. Thus, for the microphones MICa and MICb, for example, a positive Ddiff indicates that the microphone MICa is the farther one, and a negative Ddiff indicates that the microphone MICb is the farther one.
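The chain from phase difference to distance difference in (Process 6) can be sketched as below for one microphone pair. Two points are assumptions: Tdiff is taken as the phase difference divided by 2πf, and (Equation 4), which does not appear in this passage, is taken to be Ddiff = speed of sound × Tdiff. Which channel's complex value supplies the weight is also not stated, so the tuples here simply carry one complex value per index.

```python
import math

SPEED_OF_SOUND = 340.0  # m/s, value used in the description

def arrival_time_difference(phase_diff_rad, freq_hz):
    """Tdiff for one frequency index: phase difference / (2*pi*f)."""
    return phase_diff_rad / (2.0 * math.pi * freq_hz)

def weighted_mean_tdiff(entries):
    """entries: iterable of (phase_diff_rad, freq_hz, complex_value).

    The weight is the magnitude sqrt(real^2 + imag^2) of the complex value.
    """
    num = 0.0
    den = 0.0
    for phase_diff, freq_hz, z in entries:
        weight = math.hypot(z.real, z.imag)
        num += weight * arrival_time_difference(phase_diff, freq_hz)
        den += weight
    return num / den

def distance_difference(tdiff_s, c=SPEED_OF_SOUND):
    """Assumed form of (Equation 4): Ddiff = c * Tdiff, sign preserved."""
    return c * tdiff_s

# Two voiced indices that both correspond to a 100 microsecond delay.
delay = 1e-4
entries = [
    (2 * math.pi * 500.0 * delay, 500.0, 3 + 4j),
    (2 * math.pi * 1000.0 * delay, 1000.0, 1 + 0j),
]
tdiff = weighted_mean_tdiff(entries)
print(tdiff, distance_difference(tdiff))
```

Because both indices encode the same delay, the weighted average returns that delay regardless of the weights, and the distance difference comes out at about 34 mm.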

(Process 7-1) Next, based on which of the three calculated distance differences Ddiff has the largest absolute value and on the plus and minus polarities of the three distance differences, the microcomputer 41 selects the matching row from the six rows of Table 51 shown in FIG. 14.

Table 51, shown in FIG. 14, summarizes FIG. 8 in tabular form. For each pair among channels Ach to Cch, it indicates which side, front or back, should be used in calculating the sound source angle. Each row of Table 51 corresponds to one of the regions R1 to R6 shown in FIG. 8. The columns of Table 51 correspond to the front and back of each of the three pairs of channels Ach to Cch. In Table 51, a side that should be used in calculating the sound source angle is marked "○", and a side that should not be used is marked "―".
For example, the first row of Table 51 covers the case where the pair with the largest distance difference Ddiff is the pair of channels Ach and Cch and the microphone MICa of channel Ach is the one farther from the sound source. This is the case where the sound source direction falls in region R4 of FIG. 8. The sound source direction then lies on the front of the pair of microphones MICb and MICc and on the back of the pair of microphones MICa and MICb, so in Table 51 "○" is entered under "Bch-Cch front" and "Ach-Bch back". In this case, as described above, the sound source angle calculated from the distance difference DDca of the pair of channels Ach and Cch has poor accuracy, so the microcomputer 41 uses neither the front nor the back of that pair in calculating the sound source angle. Table 51 therefore has "―" under both "Ach-Cch front" and "Ach-Cch back".

Based on the channel pair with the largest distance difference Ddiff and on its polarity, the microcomputer 41 refers to Table 51 and, for the channel pairs other than the one with the largest Ddiff, selects whether the front or the back is marked "○". For example, if the pair with the largest Ddiff is channels Ach and Cch and the polarity of that Ddiff is plus, the first row of Table 51 matches, so the microcomputer 41 selects the front of channels Bch and Cch and the back of channels Ach and Bch and stores the selection in memory. The microcomputer 41 also calculates the sound source angle θ for each of these two channel pairs, for each voiced frequency index, using (Equation 5). The plus and minus polarity of the distance difference Ddiff is carried over to the sound source angle θ. As described above, the sound source angle θ calculated with (Equation 5) indicates the sound source direction from the projected position toward the reference position, under the assumption that the distance from the projected position to the reference position approximates the distance from the sound source to the reference position. The projected position is the sound source position projected onto the XY plane containing the microphones MICa to MICc along the direction perpendicular to that plane.
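The first half of (Process 7-1) can be sketched as follows. The arcsine form used for the per-pair angle is an assumption based on plane-wave geometry, since (Equation 5) itself does not appear in this passage; Table 51 is likewise not reproduced here, so the row lookup is left out. Function names, dictionary keys, and the clamping of the ratio are illustrative.

```python
import math

MIC_SPACING = 0.100  # m, approximate distance between microphones

def largest_pair(ddiff):
    """ddiff: dict such as {"AB": ..., "BC": ..., "CA": ...} in metres.

    Returns (pair_name, polarity) for the entry with the largest |Ddiff|.
    """
    name = max(ddiff, key=lambda k: abs(ddiff[k]))
    polarity = "+" if ddiff[name] >= 0 else "-"
    return name, polarity

def pair_angle_deg(ddiff_m, spacing_m=MIC_SPACING):
    """Assumed plane-wave angle from the 0-degree line: asin(Ddiff / spacing).

    The sign of Ddiff is carried over to the angle.
    """
    ratio = max(-1.0, min(1.0, ddiff_m / spacing_m))  # guard against noise
    return math.degrees(math.asin(ratio))

ddiff = {"AB": 0.05, "BC": 0.0366, "CA": -0.0866}
excluded, polarity = largest_pair(ddiff)
angles = {p: pair_angle_deg(d) for p, d in ddiff.items() if p != excluded}
print(excluded, polarity, angles)
```

The pair with the largest absolute distance difference is dropped, and angles are computed only for the remaining two pairs, as the description requires.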

(Process 7-2) Next, the microcomputer 41 converts the sound source angle θ calculated for each pair into an overall sound source angle, for which the three pairs share a single reference direction. As shown in FIG. 15, the 0° line on the front side of the microphones MICa and MICc is taken as the overall reference direction, and the overall sound source angle is expressed in the range 0° to 360°, measured clockwise about the midpoint of the line segment connecting the microphones MICa and MICc. FIG. 15 shows the case where the pair with the largest distance difference Ddiff is the pair of channels Bch and Cch and the microphone MICc of channel Cch is farther from the sound source than the microphone MICb. Let the sound source angle for the microphones MICa and MICc be +θca and that for the microphones MICa and MICb be +θab. Because +θca lies on the back and +θab lies on the front, the overall sound source angles are 180° − θca and 120° + θab, respectively.
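The conversion for the single case described for FIG. 15 can be written out as a short sketch. Only the two offsets given in the text (180° − θca and 120° + θab) are used; the offsets for the other rows of Table 51 are not stated in this passage and are not guessed here. The function name and the sample angles are assumptions.

```python
def overall_angles_fig15_case(theta_ca_deg, theta_ab_deg):
    """Overall angles for the FIG. 15 case only.

    theta_ca_deg : pair angle for MICa/MICc, lying on the back side
    theta_ab_deg : pair angle for MICa/MICb, lying on the front side
    """
    from_ca = (180.0 - theta_ca_deg) % 360.0
    from_ab = (120.0 + theta_ab_deg) % 360.0
    return from_ca, from_ab

# Sample values chosen so that both pairs point the same way.
print(overall_angles_fig15_case(20.0, 40.0))
```

When the two per-pair angles are mutually consistent, the two converted values coincide; in practice they differ slightly and are combined in (Process 8).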

(Process 8) Next, the microcomputer 41 statistically calculates the final sound source direction from the overall sound source angles calculated in (Process 7-2). Specifically, the microcomputer 41 averages those overall sound source angles to obtain a single sound source direction.
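The averaging in (Process 8) is sketched below as a plain arithmetic mean, which is what the description states. A circular mean is shown alongside it purely as an alternative that is not part of the description: it is included because a plain mean of angles that straddle the 0°/360° boundary lands on the opposite side of the circle. Both function names are assumptions.

```python
import math

def arithmetic_mean_deg(angles_deg):
    """Plain average of the overall sound source angles, as described."""
    return sum(angles_deg) / len(angles_deg)

def circular_mean_deg(angles_deg):
    """Alternative (not in the description): mean direction on the circle."""
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(s, c)) % 360.0

print(arithmetic_mean_deg([158.0, 162.0]))
print(arithmetic_mean_deg([350.0, 10.0]), circular_mean_deg([350.0, 10.0]))
```

For angles well away from the boundary the two agree; near the boundary the plain mean of 350° and 10° is 180°, whereas the circular mean stays near 0°.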

The microcomputer 41 controls the motor that rotates the movable part 3 so that the display device 32 faces the calculated sound source direction. The display device 32 of the robot 1 thereby faces the sound source.

The advantages of the sound source direction identification of this embodiment over another method are now described.
One other method uses a plurality of directional microphones and obtains the sound source direction from the differences or ratios of their volume levels. In that method, the accuracy of locating the sound source depends on the directivity performance of the microphones. This embodiment, by contrast, uses omnidirectional microphones and does not depend on directivity performance. That method also requires, for example, about ten directional microphones, whereas this embodiment can identify the sound source direction with three microphones. That method is moreover easily affected by the surroundings. For example, when there are walls nearby, sound reflects off them and the microphones pick up indirect sound, which reduces the level differences among the sounds picked up by the microphones. This embodiment looks at phase rather than volume, so the sound source angle can be obtained with high resolution and accuracy.

Here, the sound source direction identification device 10 is an example of the sound source direction identification device, the microcomputer 41 is an example of the identification unit, (Process 1) is an example of the sampling process, and (Process 2) to (Process 8) are an example of the identification process. The overall sound source angle calculated in (Process 7-2) is an example of the "sound source angle, which is the angle between the sound source direction in the determined one of the six regions and one of the three perpendiculars".

The embodiment described above provides the following effects.
The sound source direction identification device 10 includes the three microphones MICa to MICc placed at the vertices of the equilateral triangle ABC, and the microcomputer 41. Based on the arrival time differences Tdiff of sound from the sound source to the three microphones, the microcomputer 41 identifies the sound source direction, which runs from the position obtained by projecting the sound source position onto the plane containing the equilateral triangle ABC, along the direction perpendicular to that plane, toward a reference position located inside the region of that plane enclosed by the equilateral triangle ABC. The sound source direction identification device 10 can thus identify the sound source direction with the three microphones MICa to MICc. Moreover, because the three microphones are placed at the vertices of the equilateral triangle ABC, the computation for deriving the sound source angle θ can be simplified.

In (Process 7-1), the microcomputer 41 treats each two of the microphones MICa to MICc as one pair and, based on the largest of the three arrival time differences Tdiff calculated from the pairs, determines which row of Table 51 matches. The rows correspond to regions R1 to R6 (FIG. 8), the six regions surrounding the reference position that are delimited by the three perpendiculars PLa to PLc (FIG. 8), each drawn through the reference position to one of the lines through two of the microphones MICa to MICc. Then, based on the arrival time differences Tdiff of the two pairs other than the pair with the largest Tdiff, the microcomputer 41 calculates the sound source angle θ, with front/back and polarity information attached, that falls in the determined one of the regions R1 to R6. In the embodiment, the sound source angle θ ranges from 0° to 90°, and the full 360° is represented by attaching the plus or minus polarity and the front/back information to θ. In (Process 7-2), the microcomputer 41 converts the sound source angle θ calculated in (Process 7-1) into the overall sound source angle, for which the three pairs share a single reference direction. The sound source direction identification device 10 can thereby identify the sound source angle θ accurately.

In (Process 6), the microcomputer 41 calculates the arrival time difference Tdiff based on the phase difference. The sound source direction identification device 10 can thereby identify the sound source angle θ accurately.

The distance between the microphones MICa to MICc is about 100 mm. The sound source direction identification device 10 can thereby accurately identify the direction of a person who is the sound source.

Over a predetermined period, the microcomputer 41 repeatedly executes (Process 1), which converts the electrical signals output from the microphones MICa to MICc into digital values, and (Process 2) to (Process 8), which identify the direction based on the digital values converted in (Process 1). The sound source direction identification device 10 can thereby identify the sound source position in response to the sound, and operate reliably, even for a short call.

The present invention is not limited to the embodiment described above, and it goes without saying that various improvements and modifications are possible without departing from the gist of the present invention.
For example, the microphones MICa to MICc are described above as being placed at the vertices of the equilateral triangle ABC, but the arrangement is not limited to this. Instead of an equilateral triangle, any triangle whose angles are all 90° or less may be used.

The sound source direction identification device 10 is described above as including the sample-and-hold circuits SHa to SHc, but it is not limited to this. For example, if the microcomputer 41 is configured so that it can sample the output signals of the amplifiers AMPa to AMPc simultaneously, the sample-and-hold circuits SHa to SHc may be omitted.

In the above description, the phase difference is calculated in (Process 4) and (Process 5) by subtracting the absolute phase of one channel of a pair from that of the other. The calculation is not limited to this. One channel of the pair may be chosen as the reference, and the other channel may be rotated in the complex plane so that the absolute phase of the reference channel becomes 0°. The absolute phase of the other channel after this rotation is the phase difference.
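The coordinate-rotation variant can be sketched with complex arithmetic. Multiplying the other channel's value by the complex conjugate of the reference channel's value rotates it by the negative of the reference phase (and scales its magnitude, which does not affect the angle). The use of the conjugate product is one possible way to realize the rotation and is an assumption; the function name is illustrative.

```python
import cmath

def phase_difference_by_rotation(z_ref, z_other):
    """Phase of z_other after rotating so that z_ref has phase 0.

    The product with the conjugate has the same angle as z_other / z_ref.
    """
    rotated = z_other * z_ref.conjugate()
    return cmath.phase(rotated)   # already within -pi..+pi

z_a = cmath.rect(1.0, 3.0)    # reference channel, phase 3.0 rad
z_b = cmath.rect(2.0, -3.0)   # other channel, phase -3.0 rad
print(phase_difference_by_rotation(z_a, z_b))
```

One practical consequence is that the result is obtained directly within −π to +π, so no separate wrapping of the subtracted phases is needed.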

Although the final sound source direction is described above as being calculated statistically in (process 7-2), the calculation is not limited to this. Instead of taking the arithmetic mean of the multiple sound source directions, it is preferable to take a weighted average in which an angle closer to 0° is given a greater weight.
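As a rough illustration of such a weighted average, the following Python sketch weights each direction estimate by the cosine of an associated angle, so that angles near 0° count most. The cosine weight, the pairing of each direction with a separate per-pair angle, and the function name are all assumptions; the specification does not state the weight function. The sketch also assumes the directions do not straddle the ±180° wrap-around, so a plain linear average is meaningful.

```python
import math


def weighted_direction(estimates, min_weight=1e-6):
    """Weighted mean of direction estimates, in degrees.

    `estimates` is a list of (direction_deg, pair_angle_deg) tuples.
    The weight is cos(pair_angle_deg), floored at `min_weight`, so an
    estimate whose angle is closer to 0 deg contributes more.
    """
    if not estimates:
        raise ValueError("at least one estimate is required")
    total_weight = 0.0
    weighted_sum = 0.0
    for direction_deg, pair_angle_deg in estimates:
        w = max(math.cos(math.radians(pair_angle_deg)), min_weight)
        total_weight += w
        weighted_sum += w * direction_deg
    return weighted_sum / total_weight


# Hypothetical estimates: the first (angle 0 deg) gets twice the weight
# of the second (angle 60 deg, cos = 0.5).
print(weighted_direction([(30.0, 0.0), (40.0, 60.0)]))
```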

Although the microphones MICa to MICc are described above as omnidirectional microphones, they are not limited to this and may be unidirectional microphones. Even a condenser microphone that is nominally unidirectional does not have sharp directivity. A unidirectional condenser microphone still picks up sound when someone speaks from behind it, that is, from the side opposite the sound-collecting side, so the difference between omnidirectional and unidirectional types is slight. In the present embodiment, however, it is desirable that each of the microphones MICa to MICc pick up sound equally well regardless of where the sound source is located within the full 360° of the plane containing the microphones MICa to MICc. For this reason, the microphones MICa to MICc are preferably omnidirectional microphones.

The above description gives, for example, a definition of the front and back of the sound source direction and a definition of the polarity of the sound source angle, but these definitions are not limiting. They can be chosen arbitrarily, provided that the overall sound source direction remains consistent with the calculated phase differences.

In the above description, (process 1) to (process 8) are repeatedly executed while the robot 1 is powered on and the microcomputer 41 is running, but the operation is not limited to this. For example, (process 1) may be started when the volume level exceeds a threshold, using that event as a trigger. With this configuration, the device can reliably capture the sound and reliably operate in response to it.
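A level-triggered start could look like the following Python sketch. The use of RMS as the volume measure, the block-based input, and the function names are assumptions made for illustration; the specification only says that the volume level exceeding a threshold serves as the trigger.

```python
import math


def rms(block):
    """Root-mean-square level of one block of samples."""
    return math.sqrt(sum(s * s for s in block) / len(block))


def first_triggered_block(blocks, threshold):
    """Return the index of the first block whose RMS exceeds `threshold`.

    Returns None if no block exceeds it. Empty blocks are skipped.
    In a real device, (process 1) would start at the returned block.
    """
    for index, block in enumerate(blocks):
        if block and rms(block) > threshold:
            return index
    return None


# Hypothetical sample blocks: quiet, loud, quiet.
blocks = [[0.0, 0.01, -0.01], [0.5, -0.5, 0.5], [0.0]]
print(first_triggered_block(blocks, threshold=0.1))
```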

Although the sound source direction identification device 10 is described above as being provided in the robot 1, it is not limited to this. For example, the sound source direction identification device 10 may be provided in a movable surveillance camera. With this configuration, the camera can be pointed in the direction identified by the sound source direction identification device 10. The sound source direction identification device 10 may also have a function of recording the sound source direction it has determined, or may output that sound source direction to a recording device. Likewise, the sound source direction identification device 10 may have a function of recording the sound collected by the microphones MICa to MICc, or may output that sound to a processing device such as a PC.

In the above description, (process 6) obtains one arrival time difference Tdiff for each pair by weighted averaging, and the subsequent processes are then performed. The configuration is not limited to this. Instead of obtaining a single arrival time difference Tdiff, the processes from (process 7-1) onward may be performed for each sounded frequency index in each pair. In this configuration, in (process 8), the microcomputer 41 statistically calculates the final sound source direction based on the (number of sounded indices × 2) overall sound source angles calculated in (process 7-2). Specifically, the microcomputer 41 excludes outliers from those (number of sounded indices × 2) overall sound source angles and averages the rest to obtain one sound source direction. Because a sound source angle is obtained for each frequency component and the final sound source angle is derived from their statistics, the result is less affected by errors from poorly conditioned frequency components. For example, the frequency characteristics of microphones, amplifiers, and similar parts vary. The frequency components are therefore likely to include some for which the error in the calculated phase difference is relatively large and others for which it is relatively small. Obtaining the final sound source angle from multiple frequency components thus gives a more accurate sound source angle θ than obtaining it from a single frequency component.
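The outlier exclusion and averaging described above can be sketched in Python as follows. The specification does not state how outliers are identified, so the rule used here (discard angles farther than a fixed deviation from the median) is an assumption, as are the function name and the default deviation of 15°. The sketch also assumes the angles do not straddle the ±180° wrap-around.

```python
import statistics


def robust_mean_angle(angles_deg, max_dev_deg=15.0):
    """Average the angles after discarding outliers.

    An angle is treated as an outlier when it lies more than
    `max_dev_deg` from the low median of all angles. The low median is
    always one of the inputs, so at least one angle is always kept.
    """
    if not angles_deg:
        raise ValueError("at least one angle is required")
    center = statistics.median_low(angles_deg)
    kept = [a for a in angles_deg if abs(a - center) <= max_dev_deg]
    return statistics.fmean(kept)


# Hypothetical per-frequency angles; 80.0 stands in for a poorly
# conditioned frequency component and is discarded.
print(robust_mean_angle([20.0, 22.0, 21.0, 19.0, 80.0, 23.0]))
```

A median-based rule is used here because a single badly wrong component shifts the median far less than it shifts the mean.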

The scope of the present invention is not limited to the configurations explicitly described in this specification, and also includes combinations of the various aspects of the invention disclosed herein. The configurations for which a patent is sought are specified in the appended claims, but the applicant intends to claim in the future any configuration disclosed in this specification, even if it is not currently specified in the claims.
The present invention is not limited to the configurations described in the above embodiments. The constituent elements of the embodiments and modifications described above may be freely selected and combined. Any constituent element of an embodiment or modification may also be freely combined with any constituent element described in the Means for Solving the Problem section, or with a constituent element that embodies any constituent element described in that section. The applicant intends to acquire rights to these as well through amendment of the present application, divisional applications, or the like.
The applicant also intends to acquire rights to the whole design or a partial design by converting the application to a design application. The drawings depict the entire device in solid lines, but they cover not only the whole design but also partial designs claimed for a portion of the device. For example, the drawings cover a partial design consisting of a member of the device, and also a partial design consisting of a portion of the device irrespective of its members. A portion of the device may be a member of the device or a part of such a member. The applicant intends to acquire rights not only to the whole design but also to partial designs in which any of the solid-line portions of the drawings are shown as broken lines.

1 Robot
10 Sound source direction identification device
41 Microcomputer
MICa, MICb, MICc Microphones

Claims

A sound source direction identification device comprising:
three microphones arranged at the vertices of a triangle; and
an identification unit that, based on differences in the arrival times of sound from a sound source to each of the three microphones, identifies a sound source direction that runs from a position, obtained by projecting the position of the sound source onto the plane containing the triangle along a direction perpendicular to that plane, toward a reference position located inside the region of the plane enclosed by the triangle,
wherein the device has a function of calculating the differences in arrival time based on phase differences calculated from each pair, each pair consisting of two of the three electrical signals output by the three microphones,
the sound of the sound source is a human voice, and the distance between each pair of the three microphones is 57 mm or more and 170 mm or less, and
the identification unit repeatedly executes, in a predetermined period,
a sampling process that converts the electrical signals output from each of the three microphones into digital values with a sampling period of 200 ms or less, and
an identification process that identifies the direction based on the digital values converted in the sampling process.

A sound source direction identification device comprising:
three microphones arranged at the vertices of a triangle; and
an identification unit that, based on differences in the arrival times of sound from a sound source to each of the three microphones, identifies a sound source direction that runs from a position, obtained by projecting the position of the sound source onto the plane containing the triangle along a direction perpendicular to that plane, toward a reference position located inside the region of the plane enclosed by the triangle,
wherein the device has a function of calculating the differences in arrival time based on phase differences calculated from each pair, each pair consisting of two of the three electrical signals output by the three microphones,
the sound of the sound source is a human voice, and the distance between each pair of the three microphones is 57 mm or more and 170 mm or less, and
the interior of the housing that accommodates the three microphones is structured to let sound pass through, and each of the three microphones is mounted on the front side of a daughter board and is configured to also pick up sound from behind its daughter board.

A sound source direction identification device comprising:
three microphones arranged at the vertices of a triangle; and
an identification unit that, based on differences in the arrival times of sound from a sound source to each of the three microphones, identifies a sound source direction that runs from a position, obtained by projecting the position of the sound source onto the plane containing the triangle along a direction perpendicular to that plane, toward a reference position located inside the region of the plane enclosed by the triangle,
wherein the reference position is the orthocenter of the triangle, and
the identification unit
determines, based on the pair having the largest of the three arrival time differences calculated from the pairs, each pair consisting of two of the three microphones, to which of six regions surrounding the reference position the sound source direction belongs, the six regions being delimited by three perpendiculars that pass through the reference position and are drawn to the respective lines passing through two of the microphones, and
calculates, based on the arrival time differences of the remaining two pairs other than the pair having the largest arrival time difference, a sound source angle, which is the angle between the sound source direction lying in the determined region among the six regions and one of the three perpendiculars.