JP2018077139A

JP2018077139A - Sound field estimation device, sound field estimation method and program

Info

Publication number: JP2018077139A
Application number: JP2016219140A
Authority: JP
Inventors: Akira Emura (江村 暁)
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2016-11-09
Filing date: 2016-11-09
Publication date: 2018-05-17

Abstract

PROBLEM TO BE SOLVED: To provide a sound field estimation technique capable of accurately estimating a sound field from pickup signals picked up with two spherical microphone arrays, even when a point sound source exists near the spherical microphone arrays. SOLUTION: The sound field estimation device includes: a sound field representation vector estimation unit 120 that estimates, from frequency-domain pickup signals u(ω, m, j) and using a cost function J(ω), a sound field representation vector a(ω), which is a vector composed of the decomposition coefficients a_p(ω, q) and a_c(ω, q) of the mixed waves constituting the sound field; and a sound field calculation unit 130 that calculates a virtual-microphone frequency-domain pickup signal u(ω, r) from the sound field representation vector a(ω) and a virtual microphone position r. SELECTED DRAWING: Figure 3

Description

The present invention relates to sound field estimation, and more particularly to a technique for estimating a sound field from pickup signals picked up with two spherical microphone arrays.

In recent years, the number of channels and loudspeakers used for audio playback has increased from 2 to 5.1 and further to 22.1, in order to enhance the sense of presence or to enlarge the playback area. To evaluate and verify playback methods with such an expanded channel count, it is important to measure and estimate the reproduced sound field.

As a sound field estimation method of this kind, Non-Patent Document 1 proposes a method that uses a spherical microphone array. In this method, signals are picked up by a plurality of microphones arranged on a rigid-sphere spherical microphone array having a baffle, a plane wave decomposition of the sound field is obtained from the multichannel signal, and the sound field is estimated. Non-Patent Document 2 describes a method that obtains the sound field by computing a plane wave decomposition with two open-sphere spherical microphone arrays, which have no baffle. Unlike the method of Non-Patent Document 1, the method of Non-Patent Document 2 (1) uses two spherical microphone arrays rather than a single one, and (2) obtains the set of plane waves constituting the sound field directly from the frequency-domain pickup signals instead of going through the spherical wave spectrum. It thereby estimates the sound field accurately over a wider region.

Non-Patent Document 1: B. Rafaely, “Analysis and Design of Spherical Microphone Arrays”, IEEE Trans. Speech Audio Processing, Vol.13, No.1, pp.135-143, Jan. 2005.
Non-Patent Document 2: 江村暁 (Emura), “Sound Field Interpolation Estimation Using Two Spherical Microphone Arrays” (２つの球面マイクアレーによる音場補間推定), Proceedings of the Acoustical Society of Japan (日本音響学会講演論文集), pp.589-590, March 2016.

Real sound fields are often generated by a collection of point sound sources. A wave from a point sound source propagates through space as a spherical wave and reaches the microphones arranged on a spherical microphone array. If the point sound source is sufficiently far away, the wave is well approximated by a plane wave. However, the closer the point sound source is to the spherical microphone array, the less the curvature of the wavefront can be ignored, and the larger the error caused by approximating the wavefront with a plane wave.

Therefore, the method of Non-Patent Document 2, which decomposes the sound field as a collection of plane waves in order to estimate it, has the problem that the estimation accuracy decreases when a point sound source exists close to the spherical microphone arrays.

An object of the present invention is therefore to provide a sound field estimation technique that can accurately estimate a sound field from pickup signals picked up with two spherical microphone arrays, even when a point sound source exists close to the spherical microphone arrays.

In one aspect of the present invention, a first spherical microphone array and a second spherical microphone array are spherical microphone arrays that have the same number of microphones arranged on the spherical surface, the same microphone arrangement positions, and the same sphere radius. Let J be the number of microphones, (Θ_j, Φ_j) (j=1, 2,…, J) the pairs of elevation and azimuth angles that specify the microphone arrangement positions, r_a the radius of the sphere, d₁ the center position of the first spherical microphone array, and d₂ the center position of the second spherical microphone array. The mixed wave model is a model of plane waves (q=1, 2,…, Q) incident on the origin of a three-dimensional coordinate system at incident angles (θ_q, φ_q), and of spherical waves (q=1, 2,…, Q) that are generated by point sound sources located at a distance R from the origin and are incident on the origin at the same incident angles (θ_q, φ_q). Let v_p(ω, q, m, j) be the plane-wave frequency-domain observation signal, that is, the frequency-domain observation signal of a plane wave observed by the j-th microphone (j=1, 2,…, J) on the m-th spherical microphone array (m=1, 2), where ω is an index representing frequency, ω=1, 2,…, F, and q=1, 2,…, Q. Let v_c(ω, q, m, j) (ω=1, 2,…, F; q=1, 2,…, Q) be the spherical-wave frequency-domain observation signal, that is, the frequency-domain observation signal of a spherical wave observed by the j-th microphone on the m-th spherical microphone array. The aspect is a sound field estimation device that estimates, from the frequency-domain pickup signals u(ω, 1, j) (ω=1, 2,…, F; j=1, 2,…, J) picked up by the first spherical microphone array, the frequency-domain pickup signals u(ω, 2, j) (ω=1, 2,…, F; j=1, 2,…, J) picked up by the second spherical microphone array, and a virtual microphone position r, which is the position of a virtual microphone for which the sound field is to be estimated, the virtual-microphone frequency-domain pickup signal u(ω, r) (ω=1, 2,…, F), which is the frequency-domain pickup signal at the virtual microphone position r. The device includes a sound field representation vector estimation unit that estimates, from the frequency-domain pickup signals u(ω, m, j) (ω=1, 2,…, F; m=1, 2; j=1, 2,…, J) and using the cost function J(ω) (ω=1, 2,…, F) given as Equation (12) in the description, the sound field representation vector a(ω) (ω=1, 2,…, F), which is a vector composed of the decomposition coefficients a_p(ω, q) and a_c(ω, q) (ω=1, 2,…, F; q=1, 2,…, Q) of the mixed waves constituting the sound field,

where D(ω) (ω=1, 2,…, F) is the 2J×2Q dictionary matrix given as Equation (11) in the description,

d_p(ω, q) (ω=1, 2,…, F; q=1, 2,…, Q) is the 2J-dimensional q-th plane wave vector given as Equation (9) in the description,

d_c(ω, q) (ω=1, 2,…, F; q=1, 2,…, Q) is the 2J-dimensional q-th spherical wave vector given as Equation (10) in the description,

the sound field representation vector a(ω) (ω=1, 2,…, F) is the 2Q-dimensional vector given as Equation (7) in the description in terms of the decomposition coefficients a_p(ω, q) and a_c(ω, q) (ω=1, 2,…, F; q=1, 2,…, Q),

u_m(ω) (m=1, 2; ω=1, 2,…, F) is the J-dimensional frequency-domain pickup signal vector of the m-th spherical microphone array given as Equation (3) in the description,

and λ is a regularization parameter; and a sound field calculation unit that calculates the virtual-microphone frequency-domain pickup signal u(ω, r) (ω=1, 2,…, F) from the sound field representation vector a(ω) (ω=1, 2,…, F) and the virtual microphone position r.

According to the present invention, the sound field is regarded as a collection of mixed waves consisting of plane waves and spherical waves, and the sound field is estimated by obtaining their decomposition coefficients. This makes it possible to accurately estimate the sound field from pickup signals picked up with two spherical microphone arrays, even when a point sound source exists close to the spherical microphone arrays.

FIG. 1 is a diagram showing an example of the configuration of the sound field estimation device 100.
FIG. 2 is a diagram showing an example of the operation of the sound field estimation device 100.
FIG. 3 is a diagram showing an example of the configuration of the sound field estimation device 101.
FIG. 4 is a diagram showing an example of the configuration of the sound field estimation device 200.
FIG. 5 is a diagram showing an example of the operation of the sound field estimation device 200.
FIG. 6 is a diagram showing an example of the configuration of the sound field estimation device 201.

Embodiments of the present invention are described in detail below. Components having the same function are given the same reference numerals, and redundant description is omitted.

<First Embodiment>
First, the open-sphere spherical microphone array model and the mixed wave model are described.

[Open-sphere spherical microphone array model]
An open-sphere spherical microphone array is a spherical microphone array in which each microphone is supported by a thin frame but can be regarded, acoustically, as floating in free space. In what follows, the first spherical microphone array and the second spherical microphone array are assumed to be open-sphere spherical microphone arrays. The two arrays are also assumed to have the same number of microphones arranged on the spherical surface, the same microphone arrangement positions, and the same sphere radius. On the m-th spherical microphone array (m=1, 2), J microphones (J≧1) are arranged on a sphere of radius r_a (>0), and the microphone arrangement positions are specified by pairs of elevation and azimuth angles (Θ_j, Φ_j) (j=1, 2,…, J). The center position of the first spherical microphone array is denoted d₁ and that of the second spherical microphone array d₂. Each of the center positions d₁ and d₂ is a position in a three-dimensional coordinate system (a three-dimensional position).

Because the pairs of elevation and azimuth angles (Θ_j, Φ_j) are measured from the center position of each spherical microphone array, the three-dimensional position r(m, j) of the j-th microphone (j=1, 2,…, J) on the m-th spherical microphone array (m=1, 2) is given by Equation (1).
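The microphone position computation can be sketched as follows. This is a minimal illustration, not the patent's own formula: it assumes that the elevation Θ_j is measured from the horizontal x-y plane and the azimuth Φ_j from the x-axis, so that r(m, j) = d_m + r_a (cos Θ_j cos Φ_j, cos Θ_j sin Φ_j, sin Θ_j). The function names and numeric values are hypothetical.

```python
import math

def unit(elev, azim):
    """Unit vector for an (elevation, azimuth) pair in radians.

    Assumed convention: elevation from the x-y plane, azimuth from the x-axis.
    """
    return (math.cos(elev) * math.cos(azim),
            math.cos(elev) * math.sin(azim),
            math.sin(elev))

def mic_position(center, radius, elev, azim):
    """Assumed form of r(m, j): array center d_m plus r_a times the unit vector."""
    return tuple(c + radius * u for c, u in zip(center, unit(elev, azim)))

# Hypothetical example: array centered at d_1 = (-0.1, 0, 0) with r_a = 0.05 m.
d1 = (-0.1, 0.0, 0.0)
on_equator = mic_position(d1, 0.05, 0.0, 0.0)          # offset along +x
on_pole = mic_position(d1, 0.05, math.pi / 2, 0.0)     # offset along +z
```

Every position returned by `mic_position` lies at distance r_a from the array center, which is a quick sanity check on any chosen angle convention.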

[Mixed wave model]
The mixed wave model is a model of plane waves and spherical waves incident on the origin of the three-dimensional coordinate system. Specifically, it is a model of plane waves (q=1, 2,…, Q) incident on the origin at incident angles (θ_q, φ_q), and of spherical waves (q=1, 2,…, Q) that are generated by point sound sources located at a distance R from the origin and are incident on the origin at the same incident angles (θ_q, φ_q). Here θ_q is the elevation angle, φ_q is the azimuth angle, and Q is an integer of 1 or more.

The Q incident angles (θ_q, φ_q) are set so that the Q plane waves and spherical waves cover all directions evenly. For example, the Q incident angles (θ_q, φ_q) may be set so that the plane waves and spherical waves arrive from the directions of the vertices of a regular polyhedron.

The three-dimensional positions R_q of the Q point sound sources that generate the spherical waves are given by Equation (2) (q=1, 2,…, Q).
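One way to choose the incident angles and the point source positions is sketched below. It is an illustration under assumptions: the regular octahedron (Q = 6) is just one possible regular polyhedron, the point sources are placed at R_q = R (cos θ_q cos φ_q, cos θ_q sin φ_q, sin θ_q), and the names `VERTICES`, `to_angles` and `source_position` as well as the value of R are hypothetical.

```python
import math

# Vertices of a regular octahedron, used here as one possible set of
# evenly spread incidence directions (Q = 6).
VERTICES = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def to_angles(v):
    """Convert a direction vector to (elevation, azimuth) in radians."""
    x, y, z = v
    norm = math.sqrt(x * x + y * y + z * z)
    return math.asin(z / norm), math.atan2(y, x)

def source_position(R, elev, azim):
    """Assumed form of R_q: the point at distance R in direction (elev, azim)."""
    return (R * math.cos(elev) * math.cos(azim),
            R * math.cos(elev) * math.sin(azim),
            R * math.sin(elev))

angles = [to_angles(v) for v in VERTICES]     # the Q incident angles
R = 1.5                                       # hypothetical source distance in meters
sources = [source_position(R, th, ph) for th, ph in angles]
```

A denser direction set (for example the vertices of an icosahedron, or a finer sampling of the sphere) can be substituted without changing the rest of the sketch.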

The sound field estimation device 100 is described below with reference to FIGS. 1 and 2. As shown in FIG. 1, the sound field estimation device 100 includes a short-time Fourier transform unit 110, a sound field representation vector estimation unit 120, a sound field calculation unit 130, a short-time inverse Fourier transform unit 140, and a recording unit 190. The recording unit 190 is a component that records, as appropriate, the information needed for the processing of the sound field estimation device 100.

The first spherical microphone array 901 and the second spherical microphone array 902 are connected to the sound field estimation device 100, and the signals picked up by the first spherical microphone array 901 and the second spherical microphone array 902 are the inputs to the sound field estimation device 100.

The time-domain pickup signal, that is, the signal picked up in the time domain by the j-th microphone on the m-th spherical microphone array, is denoted y(t, m, j), where t is a parameter representing time, m=1, 2, and j=1, 2,…, J.

The sound field estimation device 100 estimates and outputs the virtual-microphone time-domain pickup signal y(t, r), which is the time-domain pickup signal at the virtual microphone position r, from the time-domain pickup signals y(t, 1, j) (j=1, 2,…, J) picked up by the first spherical microphone array, the time-domain pickup signals y(t, 2, j) (j=1, 2,…, J) picked up by the second spherical microphone array, and the virtual microphone position r, which is the position of the virtual microphone for which the sound field is to be estimated.

The operation of the sound field estimation device 100 is described with reference to FIG. 2. Using the short-time Fourier transform, the short-time Fourier transform unit 110 converts the time-domain pickup signals y(t, m, j) (m=1, 2; j=1, 2,…, J) into the frequency-domain pickup signals u(n, ω, m, j), which are the frequency-domain representation of the time-domain pickup signal y(t, m, j) of the j-th microphone on the m-th spherical microphone array, and outputs them (S110). Here n is an index representing the frame number and ω is an index representing frequency, with ω=1, 2,…, F.
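The transform in S110 can be illustrated for a single microphone channel as follows. This is only a sketch: the source does not specify the window, frame length or hop size, so the Hann window, the 32-sample frames and the 16-sample hop are assumptions, a naive DFT is used for self-containment, and the bins are indexed from 0 rather than from 1 as in the text. The function name `stft` is hypothetical.

```python
import cmath
import math

def stft(y, frame_len=32, hop=16):
    """Return frames[n][w]: windowed DFT of frame n at bin w (0 .. frame_len/2)."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * t / frame_len)
              for t in range(frame_len)]
    frames = []
    for start in range(0, len(y) - frame_len + 1, hop):
        seg = [y[start + t] * window[t] for t in range(frame_len)]
        spec = [sum(seg[t] * cmath.exp(-2j * math.pi * w * t / frame_len)
                    for t in range(frame_len))
                for w in range(frame_len // 2 + 1)]
        frames.append(spec)
    return frames

# Hypothetical test signal: a sinusoid that falls exactly on bin 4.
y = [math.sin(2 * math.pi * 4 * t / 32) for t in range(128)]
frames = stft(y)
peak_bins = [max(range(len(spec)), key=lambda w: abs(spec[w])) for spec in frames]
```

In the device, the same transform would be applied to each of the 2J channels y(t, m, j), giving u(n, ω, m, j) per frame n.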

The subsequent processing (specifically, the processing in the sound field representation vector estimation unit 120 and the sound field calculation unit 130) is executed for each frame n. To simplify the notation, the frame number n is omitted and u(n, ω, m, j) is written simply as u(ω, m, j). In addition, using the frequency-domain pickup signals u(ω, m, j) (j=1, 2,…, J), the frequency-domain pickup signal vector u_m(ω) of the m-th spherical microphone array is defined by Equation (3) as the J-dimensional vector whose elements are these signals (m=1, 2; ω=1, 2,…, F).

Any method other than the short-time Fourier transform may be used, as long as it converts a time-domain signal into a frequency-domain signal.

From the frequency-domain pickup signals u(ω, m, j) (ω=1, 2,…, F; m=1, 2; j=1, 2,…, J), the sound field representation vector estimation unit 120 estimates and outputs the sound field representation vector a(ω) (ω=1, 2,…, F), which is a vector composed of the decomposition coefficients a_p(ω, q) and a_c(ω, q) (ω=1, 2,…, F; q=1, 2,…, Q) of the mixed waves constituting the sound field (S120). The method of calculating the sound field representation vector a(ω) is described below.

First, the model of the signal observed by each microphone (the linear combination model of the frequency-domain pickup signal u(ω, m, j)) is described.

[Model of the signal observed by each microphone]
The signal observed by each microphone is assumed to be a collection of Q plane waves and Q spherical waves. Specifically, the frequency-domain pickup signal u(ω, m, j) is assumed to be approximated by a linear combination of the frequency-domain signals of Q plane waves and the frequency-domain signals of Q spherical waves.

The plane-wave frequency-domain observation signal v_p(ω, q, m, j), which is the frequency-domain observation signal of a plane wave observed by the j-th microphone (j=1, 2,…, J) on the m-th spherical microphone array (m=1, 2), that is, the frequency-domain signal observed by the microphone when the plane wave arrives, is given by Equation (4) (ω=1, 2,…, F; q=1, 2,…, Q).

Here, k is the wavenumber determined from the frequency index ω, and k̂_q is the vector given by Equation (5). In addition, i is the imaginary unit and · denotes the inner product.

Similarly, the spherical-wave frequency-domain observation signal v_c(ω, q, m, j), which is the frequency-domain observation signal of a spherical wave observed by the j-th microphone (j=1, 2,…, J) on the m-th spherical microphone array (m=1, 2), that is, the frequency-domain signal observed by the microphone when the spherical wave arrives, is given by Equation (6) (ω=1, 2,…, F; q=1, 2,…, Q).
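The two observation signals can be sketched in code as follows. The exact expressions are not reproduced here, so the forms below are assumptions: a unit-amplitude plane wave exp(i k k̂_q · r) with k̂_q the unit vector toward the incident direction, and a spherical wave (R / |r − R_q|) exp(−i k (|r − R_q| − R)) normalized to amplitude 1 and phase 0 at the origin. The sign of the exponent depends on the time convention, which is also assumed. All function names are hypothetical.

```python
import cmath
import math

def unit(elev, azim):
    """Assumed form of k-hat: unit vector pointing toward the incident direction."""
    return (math.cos(elev) * math.cos(azim),
            math.cos(elev) * math.sin(azim),
            math.sin(elev))

def plane_wave_obs(k, elev, azim, r):
    """Assumed plane-wave observation at position r: exp(i k k_hat . r)."""
    khat = unit(elev, azim)
    return cmath.exp(1j * k * sum(a * b for a, b in zip(khat, r)))

def spherical_wave_obs(k, elev, azim, R, r):
    """Assumed spherical-wave observation at r from a source at distance R,
    normalized to amplitude 1 and phase 0 at the origin."""
    src = tuple(R * c for c in unit(elev, azim))
    dist = math.dist(src, r)
    return (R / dist) * cmath.exp(-1j * k * (dist - R))

k = 2 * math.pi * 1000 / 343          # wavenumber at 1 kHz, c = 343 m/s (hypothetical)
r = (0.05, 0.02, -0.01)               # hypothetical microphone position
near = spherical_wave_obs(k, 0.3, 1.1, 0.5, r)    # source 0.5 m away
far = spherical_wave_obs(k, 0.3, 1.1, 1e6, r)     # source very far away
plane = plane_wave_obs(k, 0.3, 1.1, r)
```

Under these assumed forms, `far` approaches `plane` as R grows, while `near` differs noticeably. This is the wavefront-curvature effect that motivates including spherical waves in the model.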

Here, the sound field representation vector a(ω) (ω=1, 2,…, F) is defined by Equation (7) as the 2Q-dimensional vector obtained by arranging, in order, the decomposition coefficients a_p(ω, 1),…, a_p(ω, Q) of the Q plane waves constituting the sound field and the decomposition coefficients a_c(ω, 1),…, a_c(ω, Q) of the Q spherical waves.

The frequency-domain pickup signal u(ω, m, j) is then approximated, as in Equation (8), by a linear combination of the Q plane-wave frequency-domain observation signals v_p(ω, q, m, j) and the Q spherical-wave frequency-domain observation signals v_c(ω, q, m, j).

Next, the method of calculating the sound field representation vector a(ω) (ω=1, 2,…, F) is described.

[Method of calculating the sound field representation vector a(ω)]
First, the q-th plane wave vector d_p(ω, q) (ω=1, 2,…, F; q=1, 2,…, Q) is defined as a 2J-dimensional vector by Equation (9).

Similarly, the q-th spherical wave vector d_c(ω, q) (ω=1, 2,…, F; q=1, 2,…, Q) is defined as a 2J-dimensional vector by Equation (10).

Furthermore, using the q-th plane wave vectors d_p(ω, q) (ω=1, 2,…, F; q=1, 2,…, Q) and the q-th spherical wave vectors d_c(ω, q) (ω=1, 2,…, F; q=1, 2,…, Q), the 2J×2Q dictionary matrix D(ω) (ω=1, 2,…, F) is defined by Equation (11).

That is, the q-th column (q=1, 2,…, Q) of the dictionary matrix D(ω) is the q-th plane wave vector d_p(ω, q): the vector of plane-wave frequency-domain observation signals at the first and second spherical microphone arrays when a plane wave of amplitude 1 arrives at the origin from the direction specified by the elevation and azimuth pair (θ_q, φ_q), with its phase at the origin equal to 0. Likewise, the (Q+q)-th column (q=1, 2,…, Q) of the dictionary matrix D(ω) is the q-th spherical wave vector d_c(ω, q): the vector of spherical-wave frequency-domain observation signals at the first and second spherical microphone arrays when a spherical wave arrives at the origin from the direction specified by the elevation and azimuth pair (θ_q, φ_q), with its phase at the origin equal to 0.
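The column structure of D(ω) described above can be sketched as follows. The sketch is illustrative only: the plane-wave and spherical-wave expressions are the same assumed forms (exp(i k k̂_q · r) and (R / |r − R_q|) exp(−i k (|r − R_q| − R))), the dictionary is stored as a list of 2Q columns of length 2J, and every name and numeric value is hypothetical.

```python
import cmath
import math

def unit(elev, azim):
    return (math.cos(elev) * math.cos(azim),
            math.cos(elev) * math.sin(azim),
            math.sin(elev))

def array_mics(center, radius, mic_angles):
    """Positions of the J microphones of one array (assumed angle convention)."""
    return [tuple(c + radius * u for c, u in zip(center, unit(th, ph)))
            for th, ph in mic_angles]

def plane_column(k, elev, azim, mics):
    """Assumed q-th plane wave vector: unit-amplitude plane wave at all 2J mics."""
    khat = unit(elev, azim)
    return [cmath.exp(1j * k * sum(a * b for a, b in zip(khat, r))) for r in mics]

def spherical_column(k, elev, azim, R, mics):
    """Assumed q-th spherical wave vector, normalized to phase 0 at the origin."""
    src = tuple(R * c for c in unit(elev, azim))
    col = []
    for r in mics:
        dist = math.dist(src, r)
        col.append((R / dist) * cmath.exp(-1j * k * (dist - R)))
    return col

def dictionary(k, wave_angles, R, mics):
    """Columns 1..Q are plane wave vectors, columns Q+1..2Q spherical wave vectors."""
    plane = [plane_column(k, th, ph, mics) for th, ph in wave_angles]
    sphere = [spherical_column(k, th, ph, R, mics) for th, ph in wave_angles]
    return plane + sphere

# Hypothetical setup: J = 4 microphones per array, Q = 6 octahedron directions.
mic_angles = [(0.0, 0.0), (0.0, math.pi / 2), (0.0, math.pi), (math.pi / 2, 0.0)]
mics = (array_mics((-0.1, 0.0, 0.0), 0.05, mic_angles)
        + array_mics((0.1, 0.0, 0.0), 0.05, mic_angles))   # array 1, then array 2
wave_angles = [(0.0, 0.0), (0.0, math.pi), (0.0, math.pi / 2),
               (0.0, -math.pi / 2), (math.pi / 2, 0.0), (-math.pi / 2, 0.0)]
k = 2 * math.pi * 1000 / 343
D = dictionary(k, wave_angles, 1.5, mics)
```

Stacking the two arrays' microphones in a fixed order (array 1 first, then array 2) keeps the rows of D(ω) aligned with the stacked pickup vector formed from u₁(ω) and u₂(ω).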

Using the dictionary matrix D(ω), the sound field representation vector a(ω), the frequency-domain pickup signal vector u₁(ω) of the first spherical microphone array, and the frequency-domain pickup signal vector u₂(ω) of the second spherical microphone array, the cost function J(ω) is defined by Equation (12) (ω=1, 2,…, F).

The first term of Equation (12) derives from the error of the linear combination in Equation (8). The second term of Equation (12) is a regularization term consisting of Σ_q (a_p²(ω, q) + a_c²(ω, q))^(1/2), which corresponds to the L1 norm of the sound field representation vector, and the regularization parameter λ. The form Σ_q (a_p²(ω, q) + a_c²(ω, q))^(1/2) is used because the sound field produced by the wave incident on the origin at the incident angle (θ_q, φ_q) is interpreted as the linear combination a_p(ω, q) v_p(ω, q, m, j) + a_c(ω, q) v_c(ω, q, m, j) of the plane-wave frequency-domain observation signal v_p(ω, q, m, j) and the spherical-wave frequency-domain observation signal v_c(ω, q, m, j) (Equation (13)), and the decomposition coefficients a_p(ω, q) and a_c(ω, q), which are the elements of the sound field representation vector, are to be obtained on that basis.

The sound field representation vector a(ω) is calculated as the vector that minimizes the cost function J(ω). Using the second term of Equation (12) as the regularization term has the effect of driving a_p(ω, q) and a_c(ω, q) toward 0 for every q other than those of particular pairs a_p(ω, q), a_c(ω, q). In other words, a sound field representation vector a(ω) that is sparse with respect to q is obtained. As a result, even in a redundant setting where the number Q of plane waves and spherical waves assumed in advance (used for the approximation by linear combination) greatly exceeds the number of microphones, the plane waves and spherical waves can be extracted well by setting λ appropriately.

A problem that has the second term of Equation (12) as its regularization term is formulated as the group Lasso in Reference Non-Patent Document 1, and the sound field representation vector a(ω) can be calculated accordingly.
(Reference Non-Patent Document 1) M. Yuan and Y. Lin, “Model selection and estimation in regression with grouped variables”, Journal of the Royal Statistical Society: Series B, Vol.68, Issue 1, pp.49-67, 2006.
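A group-Lasso minimization of this kind can be sketched as follows. This is not the algorithm of Reference Non-Patent Document 1. It is a plain proximal-gradient (ISTA) iteration with block soft-thresholding, swapped in for brevity. It assumes the cost has the form J = ½‖u − D a‖² + λ Σ_q (|a_p(q)|² + |a_c(q)|²)^(1/2), where u stacks u₁(ω) and u₂(ω). The factor ½, the row-wise storage of D, the function names and the toy data are all assumptions.

```python
import math

def matvec(D, a):
    """D a, with D stored as a list of rows."""
    return [sum(d * x for d, x in zip(row, a)) for row in D]

def rmatvec(D, r):
    """D^H r (conjugate transpose times a vector)."""
    return [sum(D[i][j].conjugate() * r[i] for i in range(len(D)))
            for j in range(len(D[0]))]

def group_lasso_ista(D, u, lam, Q, n_iter=500):
    """Minimize 0.5*||u - D a||^2 + lam * sum_q ||(a[q], a[Q+q])||_2 by ISTA."""
    # The squared Frobenius norm upper-bounds the largest eigenvalue of D^H D,
    # so 1/L is a safe (conservative) step size.
    L = sum(abs(x) ** 2 for row in D for x in row)
    a = [0j] * (2 * Q)
    for _ in range(n_iter):
        residual = [v - ui for v, ui in zip(matvec(D, a), u)]
        grad = rmatvec(D, residual)
        z = [x - g / L for x, g in zip(a, grad)]
        for q in range(Q):
            # Block soft-thresholding of the pair (plane, spherical) for direction q.
            norm = math.hypot(abs(z[q]), abs(z[Q + q]))
            shrink = max(0.0, 1.0 - lam / (L * norm)) if norm > 0 else 0.0
            a[q], a[Q + q] = shrink * z[q], shrink * z[Q + q]
    return a

# Toy problem (hypothetical): identity dictionary, Q = 2 direction groups.
D_toy = [[1 + 0j if i == j else 0j for j in range(4)] for i in range(4)]
u_toy = [3 + 0j, 0.1 + 0j, 4 + 0j, 0.1 + 0j]
a_hat = group_lasso_ista(D_toy, u_toy, lam=1.0, Q=2)
```

In the toy problem, the pair for q = 1 has norm 5 and is shrunk to about (2.4, 3.2), while the weak pair for q = 2 is set exactly to zero. This is the group sparsity with respect to q that the regularization term is meant to produce.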

From the sound field representation vector a(ω) estimated in S120 and the virtual microphone position r, the sound field calculation unit 130 calculates, using Equation (14), the virtual-microphone frequency-domain pickup signal u(ω, r) (ω=1, 2,…, F), which is the frequency-domain pickup signal at the virtual microphone position r, and outputs it (S130).
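The calculation in S130 can be sketched as the same linear combination evaluated at the virtual microphone position. The sketch assumes the form u(ω, r) = Σ_q a_p(ω, q) exp(i k k̂_q · r) + Σ_q a_c(ω, q) (R / |r − R_q|) exp(−i k (|r − R_q| − R)), which follows from the assumed plane-wave and spherical-wave expressions rather than from the source's own equation. The function names and example values are hypothetical.

```python
import cmath
import math

def unit(elev, azim):
    return (math.cos(elev) * math.cos(azim),
            math.cos(elev) * math.sin(azim),
            math.sin(elev))

def virtual_mic_signal(a, k, wave_angles, R, r):
    """Evaluate the mixed-wave model at position r.

    a holds the Q plane-wave coefficients followed by the Q spherical-wave
    coefficients, in the same order as the sound field representation vector.
    """
    Q = len(wave_angles)
    total = 0j
    for q, (elev, azim) in enumerate(wave_angles):
        khat = unit(elev, azim)
        # Plane-wave contribution (assumed form).
        total += a[q] * cmath.exp(1j * k * sum(x * y for x, y in zip(khat, r)))
        # Spherical-wave contribution from the point source at R * khat (assumed form).
        dist = math.dist(tuple(R * c for c in khat), r)
        total += a[Q + q] * (R / dist) * cmath.exp(-1j * k * (dist - R))
    return total

# Hypothetical example with Q = 2 directions.
wave_angles = [(0.0, 0.0), (0.0, math.pi / 2)]
a = [0.5 + 0.2j, 0j, 0j, 1.0 - 0.3j]
k = 2 * math.pi * 500 / 343
at_origin = virtual_mic_signal(a, k, wave_angles, 1.5, (0.0, 0.0, 0.0))
at_r = virtual_mic_signal(a, k, wave_angles, 1.5, (0.2, -0.1, 0.05))
```

Because both wave types are normalized to amplitude 1 and phase 0 at the origin, the value at the origin reduces to the sum of all coefficients, which gives a simple consistency check.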

The virtual microphone position r given as input can be set freely by the user of the sound field estimation device 100 as the position at which the sound field is to be obtained.

Using the short-time inverse Fourier transform, the short-time inverse Fourier transform unit 140 converts the virtual-microphone frequency-domain pickup signal u(ω, r) (ω=1, 2,…, F) (written without omitting the frame number, u(n, ω, r) (ω=1, 2,…, F)) into the virtual-microphone time-domain pickup signal y(t, r), which is the time-domain pickup signal of the virtual microphone, and outputs it (S140).

If a method other than the short-time Fourier transform is used to convert the time-domain signals into frequency-domain signals, the inverse transform corresponding to that method may be used.

[Modification 1]
Equation (15) may be used instead of Equation (6) as the spherical-wave frequency-domain observation signal v_c(ω, q, m, j) observed by the j-th microphone (j=1, 2,…, J) on the m-th spherical microphone array (m=1, 2).

In this case, the sound field calculation unit 130 calculates the virtual-microphone frequency-domain pickup signal u(ω, r) (ω=1, 2,…, F) using Equation (16) instead of Equation (14).

[Modification 2]
The sound field estimation device 100 is configured to take the signals picked up by the first spherical microphone array 901 and the second spherical microphone array 902 as input in the form of the time-domain pickup signals y(t, 1, j) and y(t, 2, j) (j=1, 2,…, J).

しかし、第1球面マイクロホンアレー９０１、第2球面マイクロホンアレー９０２による収音信号を周波数領域収音信号u(ω, 1, j)、u(ω, 2, j) (ω=1, 2,…, F、j=1, 2,…, J)として入力とするように構成してもよい。 However, the apparatus may instead be configured to receive the signals collected by the first spherical microphone array 901 and the second spherical microphone array 902 as the frequency domain collected signals u(ω, 1, j) and u(ω, 2, j) (ω = 1, 2, ..., F; j = 1, 2, ..., J).

周波数領域収音信号u(ω, 1, j)、u(ω, 2, j)を入力とするように構成した音場推定装置１０１を図３に示す。図３に示すように音場推定装置１０１は、音場表現ベクトル推定部１２０、音場計算部１３０、記録部１９０を含む。音場表現ベクトル推定部１２０、音場計算部１３０、記録部１９０は、音場推定装置１００のそれと同一である。 FIG. 3 shows a sound field estimation apparatus 101 configured to receive frequency domain collected signals u (ω, 1, j) and u (ω, 2, j). As shown in FIG. 3, the sound field estimation apparatus 101 includes a sound field expression vector estimation unit 120, a sound field calculation unit 130, and a recording unit 190. The sound field expression vector estimation unit 120, the sound field calculation unit 130, and the recording unit 190 are the same as those of the sound field estimation device 100.

音場推定装置１０１は、第1球面マイクロホンアレーが収音した周波数領域収音信号u(ω, 1, j)、第2球面マイクロホンアレーが収音した周波数領域収音信号u(ω, 2, j) (ω=1, 2,…, F、j=1, 2,…, J)、音場推定の対象となる仮想マイクロホンの配置位置である仮想マイクロホン位置rから、仮想マイクロホン位置rにおける周波数領域での収音信号である仮想マイクロホン周波数領域収音信号u(ω, r)を推定し、出力する。 The sound field estimation apparatus 101 takes as input the frequency domain collected signals u(ω, 1, j) collected by the first spherical microphone array, the frequency domain collected signals u(ω, 2, j) collected by the second spherical microphone array (ω = 1, 2, ..., F; j = 1, 2, ..., J), and the virtual microphone position r, which is the placement position of the virtual microphone for which the sound field is estimated. From these it estimates and outputs the virtual microphone frequency domain collected signal u(ω, r), which is the collected signal in the frequency domain at the virtual microphone position r.

本実施形態の発明によれば、音場を平面波と球面波からなる混合波の集まりとみなし、その分解係数を周波数領域で求めることにより音場を推定することで、球面マイクロホンアレーに近い位置に点音源が存在する場合であっても、２つの球面マイクロホンアレーを用いて収音した収音信号から音場を精度よく推定することが可能となる。特に、分解係数からなる音場表現ベクトルのL1ノルムに相当する項を含む正則化項を用いて定義されるコスト関数J(ω)を用いることで、スパースな音場表現ベクトルが算出され、近似に用いるべき平面波と球面波をうまく抽出することが可能となる。 According to the invention of this embodiment, the sound field is regarded as a collection of mixed waves consisting of plane waves and spherical waves, and the sound field is estimated by obtaining their decomposition coefficients in the frequency domain. As a result, the sound field can be estimated accurately from the signals collected with two spherical microphone arrays even when a point sound source exists close to the spherical microphone arrays. In particular, using a cost function J(ω) defined with a regularization term that includes a term corresponding to the L1 norm of the sound field expression vector composed of the decomposition coefficients yields a sparse sound field expression vector, which makes it possible to select the plane waves and spherical waves that should be used for the approximation.

＜第二実施形態＞ <Second embodiment>
第一実施形態では、2つの開球型球面マイクロホンアレーを用いて収音した収音信号から音場を推定する方法について説明した。本実施形態では、2つの剛球型球面マイクロホンアレーを用いて収音した収音信号から音場を推定する方法について説明する。 The first embodiment described a method for estimating the sound field from signals collected with two open-sphere type spherical microphone arrays. This embodiment describes a method for estimating the sound field from signals collected with two rigid-sphere type spherical microphone arrays.

まず、剛球型球面マイクロホンアレーモデル、混合波モデルについて説明する。 First, the rigid-sphere type spherical microphone array model and the mixed wave model are described.

［剛球型球面マイクロホンアレーモデル］ [Rigid-sphere type spherical microphone array model]
第1球面マイクロホンアレー、第2球面マイクロホンアレーは、剛球型球面マイクロホンアレーであるとする。また、第1球面マイクロホンアレーと第2球面マイクロホンアレーの球面上に配置されたマイクロホンの数、第1球面マイクロホンアレーと第2球面マイクロホンアレーにおけるマイクロホンの配置位置、第1球面マイクロホンアレーと第2球面マイクロホンアレーの球面の半径は、同一であるとする。第m球面マイクロホンアレー(m=1, 2)には、半径r_a(>0)の球面上にJ個(J≧1)のマイクロホンが配置されており、マイクロホンの配置位置は仰角と方位角の組(Θ_j, Φ_j) (j=1, 2,…, J)で指定されるものとする。また、第1球面マイクロホンアレーの中心位置をd₁、第2球面マイクロホンアレーの中心位置をd₂とする。 The first spherical microphone array and the second spherical microphone array are assumed to be rigid-sphere type spherical microphone arrays. The two arrays are also assumed to have the same number of microphones arranged on the spherical surface, the same microphone placement positions, and the same spherical radius. In the m-th spherical microphone array (m = 1, 2), J microphones (J ≥ 1) are arranged on a spherical surface of radius r_a (> 0), and the microphone placement positions are specified by the pairs of elevation and azimuth angles (Θ_j, Φ_j) (j = 1, 2, ..., J). The center position of the first spherical microphone array is d₁, and the center position of the second spherical microphone array is d₂.

つまり、第1球面マイクロホンアレー、第2球面マイクロホンアレーが剛球型球面マイクロホンアレーに変更された点以外は第一実施形態の球面マイクロホンアレーモデルと同一である。 That is, the model is the same as the spherical microphone array model of the first embodiment except that the first and second spherical microphone arrays are rigid-sphere type spherical microphone arrays.

［混合波モデル］ [Mixed wave model]
混合波モデルは、第一実施形態のそれと同一とする。 The mixed wave model is the same as that of the first embodiment.

なお、第m球面マイクロホンアレーの中心と球面波を生成するQ個の点音源との距離をR⁺ _m,q、第m球面マイクロホンアレーの中心から見た球面波を生成するQ個の点音源の仰角と方位角の組を(θ⁺ _m,q, φ⁺ _m,q)とする(m=1, 2、q=1, 2,…, Q)。 The distance between the center of the mth spherical microphone array and the Q point sound sources that generate spherical waves is R ⁺ _{m, q} , and the Q point sound sources that generate spherical waves viewed from the center of the mth spherical microphone array Let (θ ⁺ _{m, q} , φ ⁺ _{m, q} ) be the set of elevation angle and azimuth angle (m = 1, 2, q = 1, 2,..., Q).
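The distance R⁺_{m,q} and the angle pair (θ⁺_{m,q}, φ⁺_{m,q}) follow from simple vector geometry once the point sound source position and the array center d_m are known. The following Python sketch illustrates one way to compute them. It assumes that the "elevation" angle is the polar angle measured from the z-axis (the usual spherical harmonic convention, which this text does not confirm), and the function names are hypothetical.

```python
import math

def direction_to_position(distance, elevation, azimuth):
    """Cartesian position at `distance` from the origin.
    Assumption: `elevation` is the polar angle measured from the z-axis."""
    return (distance * math.sin(elevation) * math.cos(azimuth),
            distance * math.sin(elevation) * math.sin(azimuth),
            distance * math.cos(elevation))

def source_seen_from(center, source):
    """Return (R_plus, theta_plus, phi_plus): the distance and the
    (elevation, azimuth) pair of `source` as seen from `center`.
    The two points must not coincide."""
    dx, dy, dz = (s - c for s, c in zip(source, center))
    r_plus = math.sqrt(dx * dx + dy * dy + dz * dz)
    theta_plus = math.acos(dz / r_plus)   # polar angle from the z-axis
    phi_plus = math.atan2(dy, dx)         # azimuth in (-pi, pi]
    return r_plus, theta_plus, phi_plus

# Example: a point source 1.5 m from the origin, seen from an array center d_1.
source = direction_to_position(1.5, math.pi / 3, math.pi / 4)
d_1 = (0.1, 0.0, 0.0)
print(source_seen_from(d_1, source))
```

The same helper applies to the virtual microphone position r by passing r as `center`.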

以下、図４〜図５を参照して音場推定装置２００について説明する。図４に示すように音場推定装置２００は、短時間フーリエ変換部１１０、音場表現ベクトル推定部２２０、音場計算部２３０、短時間逆フーリエ変換部１４０、記録部１９０を含む。記録部１９０は、音場推定装置２００の処理に必要な情報を適宜記録する構成部である。 The sound field estimation apparatus 200 is described below with reference to FIGS. 4 and 5. As shown in FIG. 4, the sound field estimation apparatus 200 includes a short-time Fourier transform unit 110, a sound field expression vector estimation unit 220, a sound field calculation unit 230, a short-time inverse Fourier transform unit 140, and a recording unit 190. The recording unit 190 is a component that records, as appropriate, the information needed for the processing of the sound field estimation apparatus 200.

第1球面マイクロホンアレー９０３、第2球面マイクロホンアレー９０４は音場推定装置２００に接続しており、第1球面マイクロホンアレー９０３、第2球面マイクロホンアレー９０４による収音信号が音場推定装置２００の入力となっている。 The first spherical microphone array 903 and the second spherical microphone array 904 are connected to the sound field estimation apparatus 200, and the signals they collect are the input to the sound field estimation apparatus 200.

音場推定装置２００は、第1球面マイクロホンアレーが収音した時間領域収音信号y(t, 1, j)(j=1, 2,…, J)、第2球面マイクロホンアレーが収音した時間領域収音信号y(t, 2, j)(j=1, 2,…, J)、音場推定の対象となる仮想マイクロホンの配置位置である仮想マイクロホン位置rから、仮想マイクロホン位置rにおける時間領域での収音信号である仮想マイクロホン時間領域収音信号y(t, r)を推定し、出力する。 The sound field estimation apparatus 200 takes as input the time domain collected signals y(t, 1, j) (j = 1, 2, ..., J) collected by the first spherical microphone array, the time domain collected signals y(t, 2, j) (j = 1, 2, ..., J) collected by the second spherical microphone array, and the virtual microphone position r, which is the placement position of the virtual microphone for which the sound field is estimated. From these it estimates and outputs the virtual microphone time domain collected signal y(t, r), which is the collected signal in the time domain at the virtual microphone position r.

図５に従い音場推定装置２００の動作について説明する。短時間フーリエ変換部１１０は、時間領域収音信号y(t, m, j)(m=1, 2、j=1, 2,…, J)から、短時間フーリエ変換を用いて、第m球面マイクロホンアレー上の第jマイクロホンによる時間領域収音信号y(t, m, j)の周波数領域表現である周波数領域収音信号u(n, ω, m, j)（ただし、nはフレーム番号を表すインデックス、ωは周波数を表すインデックスとし、ω=1, 2,…, Fとする)に変換し、出力する（Ｓ１１０）。 The operation of the sound field estimation apparatus 200 is described with reference to FIG. 5. The short-time Fourier transform unit 110 applies the short-time Fourier transform to the time domain collected signals y(t, m, j) (m = 1, 2; j = 1, 2, ..., J), converts them into the frequency domain collected signals u(n, ω, m, j), which are the frequency domain representations of the time domain collected signals y(t, m, j) from the j-th microphone on the m-th spherical microphone array (where n is an index representing the frame number and ω is an index representing the frequency, with ω = 1, 2, ..., F), and outputs them (S110).
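As an illustration of step S110 and of its inverse (S140), the following Python sketch implements a short-time Fourier transform and the matching inverse with NumPy. The frame length, the hop size, and the square-root Hann window are assumptions chosen so that overlap-add reconstructs the signal exactly away from the edges; the text does not specify these parameters, and the function names are hypothetical.

```python
import numpy as np

def _sqrt_hann(frame_len):
    # Square root of a periodic Hann window (assumed analysis/synthesis window).
    n = np.arange(frame_len)
    return np.sqrt(0.5 - 0.5 * np.cos(2.0 * np.pi * n / frame_len))

def stft(y, frame_len=256, hop=128):
    """Return u[n, w]: frame index n, frequency index w (one microphone)."""
    win = _sqrt_hann(frame_len)
    n_frames = 1 + (len(y) - frame_len) // hop
    frames = np.stack([y[n * hop:n * hop + frame_len] * win
                       for n in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(u, frame_len=256, hop=128):
    """Overlap-add inverse of `stft` (valid for hop = frame_len / 2)."""
    win = _sqrt_hann(frame_len)
    frames = np.fft.irfft(u, n=frame_len, axis=1) * win
    y = np.zeros(hop * (u.shape[0] - 1) + frame_len)
    for n, frame in enumerate(frames):
        y[n * hop:n * hop + frame_len] += frame
    return y

# Example: transform one microphone signal and return to the time domain.
rng = np.random.default_rng(0)
y_mic = rng.standard_normal(2048)
u_mic = stft(y_mic)          # per-frame spectra, processed frame by frame
y_back = istft(u_mic)        # matches y_mic except in the first and last half frame
```

In the apparatus this transform would be applied to each of the 2J microphone signals; the subsequent estimation then runs independently for every frame n and frequency index ω.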

以降の処理（具体的には音場表現ベクトル推定部２２０、音場計算部２３０での処理）がフレームnごとに実行されるのは、第一実施形態と同様であり、記載を簡略化するため、フレーム番号nを省略し、u(n, ω, m, j)を単にu(ω, m, j)と表すことにする。 The subsequent processing (specifically, processing in the sound field expression vector estimation unit 220 and the sound field calculation unit 230) is executed for each frame n, as in the first embodiment, and is simplified. Therefore, the frame number n is omitted and u (n, ω, m, j) is simply expressed as u (ω, m, j).

音場表現ベクトル推定部２２０は、周波数領域収音信号u(ω,m,j)(ω=1, 2,…, F、m=1, 2、j=1, 2,…, J)から、音場を構成する混合波の分解係数a_p(ω, q)、a_c(ω, q) (ω=1, 2,…, F、q=1, 2,…, Q)からなるベクトルである音場表現ベクトルa(ω)(ω=1, 2,…, F)を推定し、出力する（Ｓ２２０）。以下、音場表現ベクトルa(ω)の算出方法について、説明する。 From the frequency domain collected signals u(ω, m, j) (ω = 1, 2, ..., F; m = 1, 2; j = 1, 2, ..., J), the sound field expression vector estimation unit 220 estimates and outputs the sound field expression vector a(ω) (ω = 1, 2, ..., F), which is a vector composed of the decomposition coefficients a_p(ω, q) and a_c(ω, q) (ω = 1, 2, ..., F; q = 1, 2, ..., Q) of the mixed waves constituting the sound field (S220). The method for calculating the sound field expression vector a(ω) is described below.

まず、各マイクロホンで観測される信号のモデル（周波数領域収音信号u(ω,m,j)の線形結合モデル）について説明する。 First, a model of a signal observed by each microphone (linear combination model of frequency domain sound pickup signal u (ω, m, j)) will be described.

［各マイクロホンで観測される信号のモデル］ [Model of the signal observed at each microphone]
第一実施形態と同様、周波数領域収音信号u(ω,m,j)は、Q個の平面波の周波数領域信号とQ個の球面波の周波数領域信号の線形結合として近似表現されるものとする。 As in the first embodiment, the frequency domain collected signal u(ω, m, j) is assumed to be approximated as a linear combination of the frequency domain signals of Q plane waves and the frequency domain signals of Q spherical waves.

まず、第m球面マイクロホンアレー(m=1, 2)上の第jマイクロホン(j=1, 2,…, J)で観測される平面波の周波数領域での観測信号である平面波周波数領域観測信号v_p(ω, q, m, j)について説明する(ω=1, 2,…, F、q=1, 2,…, Q)。 First, the plane wave frequency domain observation signal v_p(ω, q, m, j), which is the observation signal in the frequency domain of the plane wave observed by the j-th microphone (j = 1, 2, ..., J) on the m-th spherical microphone array (m = 1, 2), is described (ω = 1, 2, ..., F; q = 1, 2, ..., Q).

第1球面マイクロホンアレー９０３、第2球面マイクロホンアレー９０４と同一の特徴を有する剛球型球面マイクロホンアレーを仮定し、その剛球型球面マイクロホンアレーの中心が3次元座標系の原点と一致している場合を考える。このとき、仰角と方位角の組(θ_q, φ_q)で指定された方向から振幅1の平面波が入射した場合、音場は入射波と散乱波から構成されることを考慮すると、剛球型球面マイクロホンアレー上の第jマイクロホン(j=1, 2,…, J)で観測される平面波の周波数領域での観測信号v_p,O(ω, q, j)は、式(17)で表される(ω=1, 2,…, F、q=1, 2,…, Q)。 Assume a rigid-sphere type spherical microphone array that has the same characteristics as the first spherical microphone array 903 and the second spherical microphone array 904, and consider the case where its center coincides with the origin of the three-dimensional coordinate system. When a plane wave of amplitude 1 arrives from the direction specified by the pair of elevation and azimuth angles (θ_q, φ_q), the sound field consists of the incident wave and the scattered wave. Taking this into account, the observation signal v_{p,O}(ω, q, j) in the frequency domain of the plane wave observed by the j-th microphone (j = 1, 2, ..., J) on the rigid-sphere type spherical microphone array is given by Equation (17) (ω = 1, 2, ..., F; q = 1, 2, ..., Q).

ただし、Y_n ^w(x, y)は次数n, wの球面調和関数、Y_n ^w*(x, y)は次数n, wの球面調和関数の複素共役、j_n(x)はオーダーnの球ベッセル関数、h_n(x)はオーダーnの第1種ハンケル関数である。また、j_n'(x)、h_n'(x)は、それぞれj_n(x)、h_n(x)の微分関数である。なお、kは周波数インデックスωに基づき決定される波数である。 Here, Y_n^w(x, y) is the spherical harmonic function of orders n, w, Y_n^{w*}(x, y) is the complex conjugate of the spherical harmonic function of orders n, w, j_n(x) is the spherical Bessel function of order n, and h_n(x) is the Hankel function of the first kind of order n. In addition, j_n'(x) and h_n'(x) are the derivatives of j_n(x) and h_n(x), respectively. k is the wavenumber determined from the frequency index ω.

第m球面マイクロホンアレー(m=1, 2)上の第jマイクロホン(j=1, 2,…, J)で観測される平面波周波数領域観測信号v_p(ω, q, m, j)は、位相差を考慮して式(17)から導出される式(18)で表される(ω=1, 2,…, F、q=1, 2,…, Q)。 The plane wave frequency domain observation signal v _p (ω, q, m, j) observed by the j-th microphone (j = 1, 2, ..., J) on the m-th spherical microphone array (m = 1, 2) is It is expressed by the equation (18) derived from the equation (17) in consideration of the phase difference (ω = 1, 2,..., F, q = 1, 2,..., Q).

なお、k^_qは式(5)で表されるベクトルである。 Note that k ^ _q is a vector represented by Equation (5).
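Equations (17) and (18) appear as figures in the original publication and are not reproduced in this text. For orientation only, the textbook expression for a unit-amplitude plane wave scattered by a rigid sphere, written with the symbols defined above, takes the following form. The factor i^n and the sign of the phase term correspond to one particular time and propagation convention, and the normalization of Y_n^w is likewise an assumption, so the patent's own equations may differ in these details.

```latex
v_{p,O}(\omega, q, j) = \sum_{n=0}^{\infty} \sum_{w=-n}^{n} 4\pi i^{n}
  \left[ j_n(k r_a) - \frac{j_n'(k r_a)}{h_n'(k r_a)}\, h_n(k r_a) \right]
  Y_n^{w}(\Theta_j, \Phi_j)\, Y_n^{w*}(\theta_q, \phi_q)

v_p(\omega, q, m, j) = \exp\!\left( i k\, \hat{k}_q^{\mathsf{T}} d_m \right)
  v_{p,O}(\omega, q, j)
```

In the bracket, j_n(k r_a) is the incident part and the remaining term is the part scattered by the rigid surface. The exponential factor is the phase of the incident plane wave at the array center d_m relative to the origin, where the vector k^_q is assumed here to be the unit vector in the direction (θ_q, φ_q).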

次に、第m球面マイクロホンアレー(m=1, 2)上の第jマイクロホン(j=1, 2,…, J)で観測される球面波の周波数領域での観測信号である球面波周波数領域観測信号v_c(ω, q, m, j)について説明する(ω=1, 2,…, F、q=1, 2,…, Q)。 Next, a spherical wave frequency domain that is an observation signal in the frequency domain of the spherical wave observed by the jth microphone (j = 1, 2,..., J) on the mth spherical microphone array (m = 1, 2). The observation signal v _c (ω, q, m, j) will be described (ω = 1, 2,..., F, q = 1, 2,..., Q).

先ほどと同様、第1球面マイクロホンアレー９０３、第2球面マイクロホンアレー９０４と同一の特徴を有する剛球型球面マイクロホンアレーを仮定し、その剛球型球面マイクロホンアレーの中心が3次元座標系の原点と一致している場合を考える。このとき、剛球型球面マイクロホンアレー上の第jマイクロホン(j=1, 2,…, J)で観測される球面波の周波数領域での観測信号v_c,O(ω, q, j)は、式(19)で表される(ω=1, 2,…, F、q=1, 2,…, Q)。 As before, assume a rigid-sphere type spherical microphone array that has the same characteristics as the first spherical microphone array 903 and the second spherical microphone array 904, and consider the case where its center coincides with the origin of the three-dimensional coordinate system. Then the observation signal v_{c,O}(ω, q, j) in the frequency domain of the spherical wave observed by the j-th microphone (j = 1, 2, ..., J) on the rigid-sphere type spherical microphone array is given by Equation (19) (ω = 1, 2, ..., F; q = 1, 2, ..., Q).

第m球面マイクロホンアレー(m=1, 2)上の第jマイクロホン(j=1, 2,…, J)で観測される球面波周波数領域観測信号v_c(ω, q, m, j)は、式(20)で表される(q=1, 2,…, Q)。 The spherical wave frequency domain observation signal v_c(ω, q, m, j) observed by the j-th microphone (j = 1, 2, ..., J) on the m-th spherical microphone array (m = 1, 2) is given by Equation (20) (q = 1, 2, ..., Q).
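Equations (19) and (20) are likewise not reproduced in this text. As a hedged reference, the textbook expansion of the field of a point source exp(ikR')/R' around a rigid sphere of radius r_a gives the following form for a source at distance R > r_a in the direction (θ_q, φ_q). The overall amplitude normalization and the sign convention are assumptions and may differ from the patent's equations.

```latex
v_{c,O}(\omega, q, j) = \sum_{n=0}^{\infty} \sum_{w=-n}^{n} 4\pi i k\, h_n(k R)
  \left[ j_n(k r_a) - \frac{j_n'(k r_a)}{h_n'(k r_a)}\, h_n(k r_a) \right]
  Y_n^{w}(\Theta_j, \Phi_j)\, Y_n^{w*}(\theta_q, \phi_q)

v_c(\omega, q, m, j) = \sum_{n=0}^{\infty} \sum_{w=-n}^{n} 4\pi i k\, h_n(k R^{+}_{m,q})
  \left[ j_n(k r_a) - \frac{j_n'(k r_a)}{h_n'(k r_a)}\, h_n(k r_a) \right]
  Y_n^{w}(\Theta_j, \Phi_j)\, Y_n^{w*}(\theta^{+}_{m,q}, \phi^{+}_{m,q})
```

Under this reading, the signal for array m is obtained by replacing the distance and direction seen from the origin with the distance R⁺_{m,q} and the angle pair (θ⁺_{m,q}, φ⁺_{m,q}) seen from the center of the m-th array.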

周波数領域収音信号u(ω,m,j)は、式(18)の平面波周波数領域観測信号v_p(ω, q, m, j)と式(20)の球面波周波数領域観測信号v_c(ω, q, m, j)を用いて式(8)と同一の線形結合として近似表現される。 Using the plane wave frequency domain observation signal v_p(ω, q, m, j) of Equation (18) and the spherical wave frequency domain observation signal v_c(ω, q, m, j) of Equation (20), the frequency domain collected signal u(ω, m, j) is approximated as the same linear combination as in Equation (8).
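Equation (8) is not reproduced in this text. Based on the description of the model and on the dimensions stated in the claims (a 2J×2Q dictionary matrix and a 2Q-dimensional sound field expression vector), the linear combination can be written as follows. The ordering of the columns of D(ω) and of the entries of a(ω) is an assumption.

```latex
u(\omega, m, j) \approx \sum_{q=1}^{Q}
  \left[ a_p(\omega, q)\, v_p(\omega, q, m, j) + a_c(\omega, q)\, v_c(\omega, q, m, j) \right]

\begin{bmatrix} u_1(\omega) \\ u_2(\omega) \end{bmatrix} \approx D(\omega)\, a(\omega), \qquad
D(\omega) = \left[ d_p(\omega, 1), \ldots, d_p(\omega, Q),\; d_c(\omega, 1), \ldots, d_c(\omega, Q) \right]
```

Here u_m(ω) stacks the J microphone signals of array m, and each column d_p(ω, q) or d_c(ω, q) stacks the corresponding observation signals v_p(ω, q, m, j) or v_c(ω, q, m, j) over both arrays.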

［音場表現ベクトルa(ω)の算出方法］ [Method for calculating the sound field expression vector a(ω)]
式(18)で定義される平面波周波数領域観測信号v_p(ω, q, m, j)と式(20)で定義される球面波周波数領域観測信号v_c(ω, q, m, j)を用いて、辞書行列D(ω)を式(11)と同様に、コスト関数J(ω)を式(12)と同様に定義する。第一実施形態と同様のgroup Lasso型の最適化問題を解くことに帰着するため、音場表現ベクトルa(ω)を算出することができる。 Using the plane wave frequency domain observation signal v_p(ω, q, m, j) defined by Equation (18) and the spherical wave frequency domain observation signal v_c(ω, q, m, j) defined by Equation (20), the dictionary matrix D(ω) is defined in the same way as in Equation (11) and the cost function J(ω) in the same way as in Equation (12). Because the problem reduces to solving the same group-Lasso-type optimization problem as in the first embodiment, the sound field expression vector a(ω) can be calculated.
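To make the group-Lasso step concrete, the following Python sketch solves a problem of this type with proximal gradient descent (ISTA). It is not the patent's solver: the exact form of the cost function in Equation (12) is not reproduced in this text, so the sketch assumes a squared-error data term plus λ times the sum over q of the Euclidean norm of the pair (a_p(ω, q), a_c(ω, q)). The column ordering of D and the function names are also assumptions.

```python
import numpy as np

def group_lasso_ista(D, u, lam, n_iter=500):
    """Minimise ||u - D a||^2 + lam * sum_q ||(a_p[q], a_c[q])||_2 by
    proximal gradient descent (ISTA).

    Assumption: D has shape (2J, 2Q) with columns [d_p(1..Q), d_c(1..Q)],
    so group q consists of entries q and Q + q of a."""
    Q = D.shape[1] // 2
    step = 1.0 / (2.0 * np.linalg.norm(D, 2) ** 2)   # 1 / Lipschitz constant
    a = np.zeros(D.shape[1], dtype=complex)
    for _ in range(n_iter):
        grad = 2.0 * D.conj().T @ (D @ a - u)         # gradient of the data term
        z = (a - step * grad).reshape(2, Q)           # row 0: a_p, row 1: a_c
        norms = np.linalg.norm(z, axis=0)             # one l2 norm per group q
        shrink = np.maximum(0.0, 1.0 - step * lam / np.maximum(norms, 1e-12))
        a = (z * shrink).reshape(-1)                  # group soft-thresholding
    return a

def cost(D, u, a, lam):
    """Value of the assumed cost function for a given coefficient vector."""
    Q = D.shape[1] // 2
    group_norms = np.linalg.norm(a.reshape(2, Q), axis=0)
    return np.linalg.norm(u - D @ a) ** 2 + lam * group_norms.sum()

# Example with a random stand-in dictionary (J = 8 microphones per array, Q = 6).
rng = np.random.default_rng(0)
J, Q = 8, 6
D = rng.standard_normal((2 * J, 2 * Q)) + 1j * rng.standard_normal((2 * J, 2 * Q))
a_true = np.zeros(2 * Q, dtype=complex)
a_true[2], a_true[Q + 2] = 1.0, 0.5j                  # only wave pair q = 3 is active
u = D @ a_true
a_hat = group_lasso_ista(D, u, lam=0.1)
print(np.round(np.abs(a_hat), 3))
```

Grouping a_p(ω, q) with a_c(ω, q) means that a whole direction q is switched on or off together, which is one way to obtain the sparse sound field expression vector described in the text. In practice this would be run once per frame and per frequency index ω.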

なお、式(18)の観測信号v_p(ω, q, m, j)と式(20)の観測信号v_c(ω, q, m, j)は極限（nについての無限和）をもって定義されているため、実際には有限のn（以下、Nとすると、Nは0以上の整数となる）を用いて数値計算することにより観測信号v_p(ω, q, m, j)と観測信号v_c(ω, q, m, j)の値を求める必要がある。つまり、観測信号v_p(ω, q, m, j)と観測信号v_c(ω, q, m, j)の計算式として、式(18)と式(20)の代わりに、式(18)’と式(20)’を用いる。 The observation signal v_p(ω, q, m, j) of Equation (18) and the observation signal v_c(ω, q, m, j) of Equation (20) are defined as limits (infinite sums over n), so in practice their values must be computed numerically with a finite upper limit for n (denoted N below, where N is an integer greater than or equal to 0). That is, Equations (18)' and (20)' are used instead of Equations (18) and (20) to compute the observation signals v_p(ω, q, m, j) and v_c(ω, q, m, j).

例えば、r_a=4cmのとき、N=10程度にとればよい。 For example, when r_a = 4 cm, N may be set to about 10.
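The choice of N can be explored numerically. The following Python sketch evaluates the radial factor j_n(x) - j_n'(x)/h_n'(x) h_n(x) with x = k r_a, which multiplies every term of the rigid-sphere expansions, and prints its magnitude for n = 0, ..., 10 at r_a = 4 cm. The speed of sound, the example frequencies, and the function names are assumptions. The factor is computed through the mathematically equivalent Wronskian form i/(x² h_n'(x)).

```python
import math

def spherical_jy(n_max, x):
    """Spherical Bessel functions j_n(x), y_n(x) for n = 0..n_max by upward
    recurrence. The recurrence loses accuracy for j_n when n > x; that is
    harmless below because h_n'(x) is then dominated by y_n'(x)."""
    j = [math.sin(x) / x, math.sin(x) / x**2 - math.cos(x) / x]
    y = [-math.cos(x) / x, -math.cos(x) / x**2 - math.sin(x) / x]
    for n in range(1, n_max):
        j.append((2 * n + 1) / x * j[n] - j[n - 1])
        y.append((2 * n + 1) / x * y[n] - y[n - 1])
    return j[:n_max + 1], y[:n_max + 1]

def rigid_sphere_mode_strength(n_max, x):
    """b_n(x) = j_n(x) - j_n'(x) / h_n'(x) * h_n(x) for n = 0..n_max,
    evaluated through the equivalent Wronskian form i / (x^2 h_n'(x))."""
    j, y = spherical_jy(n_max + 1, x)
    b = []
    for n in range(n_max + 1):
        dj = n / x * j[n] - j[n + 1]      # j_n'(x)
        dy = n / x * y[n] - y[n + 1]      # y_n'(x)
        dh = complex(dj, dy)              # h_n'(x), Hankel function of the first kind
        b.append(1j / (x * x * dh))
    return b

c = 343.0      # speed of sound in m/s (assumed)
r_a = 0.04     # array radius from the text: 4 cm
for f in (1000.0, 4000.0, 8000.0):        # example frequencies (assumed)
    x = 2.0 * math.pi * f / c * r_a       # k * r_a
    magnitudes = [abs(b) for b in rigid_sphere_mode_strength(10, x)]
    print(f, ["%.2e" % m for m in magnitudes])
```

Terms with n well above k r_a contribute very little, which is why a modest truncation order can suffice for a small array radius.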

なお、剛球型球面マイクロホンアレーによる観測信号については、参考非特許文献２に詳しい。 The observation signals of a rigid-sphere type spherical microphone array are described in detail in Reference Non-Patent Document 2.
（参考非特許文献２）D. P. Jarrett, E. A. P. Habets, M. R. P. Thomas and P. A. Naylor, “Rigid sphere room impulse response simulation: algorithm and applications”, Journal of the Acoustical Society of America, Vol.132, Issue 3, pp.1462-1472, 2012. (Reference Non-Patent Document 2) D. P. Jarrett, E. A. P. Habets, M. R. P. Thomas and P. A. Naylor, “Rigid sphere room impulse response simulation: algorithm and applications”, Journal of the Acoustical Society of America, Vol. 132, Issue 3, pp. 1462-1472, 2012.

音場計算部２３０は、Ｓ２２０で推定した音場表現ベクトルa(ω)、仮想マイクロホン位置rから、式(21)を用いて、仮想マイクロホンの位置rでの周波数領域での収音信号である仮想マイクロホン周波数領域収音信号u(ω,r)(ω=1, 2,…, F)を計算し、出力する（Ｓ２３０）。 The sound field calculation unit 230 is a sound collection signal in the frequency domain at the virtual microphone position r using the equation (21) from the sound field expression vector a (ω) estimated in S220 and the virtual microphone position r. The virtual microphone frequency domain collected signal u (ω, r) (ω = 1, 2,..., F) is calculated and output (S230).

ここで、(Θ⁺, Φ⁺)は3次元座標系の原点からみた仮想マイクロホン位置rの仰角と方位角の組、R⁺ _q(q=1, 2,…, Q)は仮想マイクロホン位置rと球面波を生成するQ個の点音源との距離、(θ⁺ _q, φ⁺ _q)(q=1, 2,…, Q)は仮想マイクロホン位置rから見た球面波を生成するQ個の点音源の仰角と方位角の組である。 Here, (Θ⁺, Φ⁺) is the pair of elevation and azimuth angles of the virtual microphone position r as seen from the origin of the three-dimensional coordinate system, R⁺_q (q = 1, 2, ..., Q) is the distance between the virtual microphone position r and each of the Q point sound sources that generate the spherical waves, and (θ⁺_q, φ⁺_q) (q = 1, 2, ..., Q) is the pair of elevation and azimuth angles of each of those Q point sound sources as seen from the virtual microphone position r.
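Equation (21) is not reproduced in this text. Purely as an illustration of how estimated coefficients can be turned into a signal at a virtual microphone, the following Python sketch sums free-field plane and spherical waves at the position r for a single frequency index. It is an assumed simplification, not the patent's Equation (21): the phase sign convention, the amplitude normalization exp(ikR⁺_q)/R⁺_q of the spherical waves, and the function and parameter names are all assumptions.

```python
import cmath
import math

def virtual_mic_signal(a_p, a_c, k, k_hat, sources, r):
    """Free-field pressure at position r for one frequency index.

    a_p, a_c : decomposition coefficients of the Q plane and spherical waves
    k        : wavenumber
    k_hat    : Q unit vectors of the plane-wave arrival directions
    sources  : Q point sound source positions
    Assumption: plane wave q contributes exp(i k k_hat_q . r) and spherical
    wave q contributes exp(i k R_q) / R_q with R_q = |r - source_q|."""
    u = 0j
    for ap, ac, kh, s in zip(a_p, a_c, k_hat, sources):
        phase = k * sum(a * b for a, b in zip(kh, r))
        dist = math.dist(r, s)            # R_q: distance from r to source q
        u += ap * cmath.exp(1j * phase) + ac * cmath.exp(1j * k * dist) / dist
    return u

# Example: one plane wave and one spherical wave pair (Q = 1).
k = 2.0 * math.pi * 1000.0 / 343.0        # wavenumber at 1 kHz (assumed c = 343 m/s)
u_r = virtual_mic_signal([0.8], [0.3j], k, [(1.0, 0.0, 0.0)],
                         [(1.5, 0.0, 0.0)], (0.2, 0.1, 0.0))
print(abs(u_r))
```

Evaluating this for every frequency index ω and every frame, and then applying the short-time inverse Fourier transform, would give a time domain signal at the virtual microphone under these assumptions.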

［変形例］ [Modification]
第１実施形態の変形例２と同様、音場推定装置２００を構成してもよい。つまり、音場推定装置２００では、第1球面マイクロホンアレー９０３、第2球面マイクロホンアレー９０４による収音信号を時間領域収音信号y(t, 1, j)、y(t, 2, j) (j=1, 2,…, J)として入力とするように構成したが、第1球面マイクロホンアレー９０３、第2球面マイクロホンアレー９０４による収音信号を周波数領域収音信号u(ω, 1, j)、u(ω, 2, j) (ω=1, 2,…, F、j=1, 2,…, J)として入力とするように構成してもよい。 The sound field estimation apparatus 200 may be modified in the same way as Modification 2 of the first embodiment. That is, the sound field estimation apparatus 200 is configured to receive the signals collected by the first spherical microphone array 903 and the second spherical microphone array 904 as the time domain collected signals y(t, 1, j) and y(t, 2, j) (j = 1, 2, ..., J), but it may instead be configured to receive them as the frequency domain collected signals u(ω, 1, j) and u(ω, 2, j) (ω = 1, 2, ..., F; j = 1, 2, ..., J).

周波数領域収音信号u(ω, 1, j)、u(ω, 2, j)を入力とするように構成した音場推定装置２０１を図６に示す。図６に示すように音場推定装置２０１は、音場表現ベクトル推定部２２０、音場計算部２３０、記録部１９０を含む。音場表現ベクトル推定部２２０、音場計算部２３０、記録部１９０は、音場推定装置２００のそれと同一である。 FIG. 6 shows a sound field estimation apparatus 201 configured to receive the frequency domain collected signals u (ω, 1, j) and u (ω, 2, j) as inputs. As shown in FIG. 6, the sound field estimation apparatus 201 includes a sound field expression vector estimation unit 220, a sound field calculation unit 230, and a recording unit 190. The sound field expression vector estimation unit 220, the sound field calculation unit 230, and the recording unit 190 are the same as those of the sound field estimation device 200.

音場推定装置２０１は、第1球面マイクロホンアレーが収音した周波数領域収音信号u(ω, 1, j)、第2球面マイクロホンアレーが収音した周波数領域収音信号u(ω, 2, j) (ω=1, 2,…, F、j=1, 2,…, J)、音場推定の対象となる仮想マイクロホンの配置位置である仮想マイクロホン位置rから、仮想マイクロホン位置rにおける周波数領域での収音信号である仮想マイクロホン周波数領域収音信号u(ω, r)を推定し、出力する。 The sound field estimation apparatus 201 takes as input the frequency domain collected signals u(ω, 1, j) collected by the first spherical microphone array, the frequency domain collected signals u(ω, 2, j) collected by the second spherical microphone array (ω = 1, 2, ..., F; j = 1, 2, ..., J), and the virtual microphone position r, which is the placement position of the virtual microphone for which the sound field is estimated. From these it estimates and outputs the virtual microphone frequency domain collected signal u(ω, r), which is the collected signal in the frequency domain at the virtual microphone position r.

＜変形例＞ <Modifications>
この発明は上述の実施形態に限定されるものではなく、この発明の趣旨を逸脱しない範囲で適宜変更が可能であることはいうまでもない。上記実施形態において説明した各種の処理は、記載の順に従って時系列に実行されるのみならず、処理を実行する装置の処理能力あるいは必要に応じて並列的にあるいは個別に実行されてもよい。 It goes without saying that the present invention is not limited to the above-described embodiments and can be modified as appropriate without departing from the spirit of the invention. The various processes described in the above embodiments may be executed not only in time series in the order described, but also in parallel or individually, depending on the processing capability of the apparatus that executes them or as needed.

＜補記＞ <Supplementary note>
本発明の装置は、例えば単一のハードウェアエンティティとして、キーボードなどが接続可能な入力部、液晶ディスプレイなどが接続可能な出力部、ハードウェアエンティティの外部に通信可能な通信装置（例えば通信ケーブル）が接続可能な通信部、ＣＰＵ（Central Processing Unit、キャッシュメモリやレジスタなどを備えていてもよい）、メモリであるＲＡＭやＲＯＭ、ハードディスクである外部記憶装置並びにこれらの入力部、出力部、通信部、ＣＰＵ、ＲＡＭ、ＲＯＭ、外部記憶装置の間のデータのやり取りが可能なように接続するバスを有している。また必要に応じて、ハードウェアエンティティに、ＣＤ−ＲＯＭなどの記録媒体を読み書きできる装置（ドライブ）などを設けることとしてもよい。このようなハードウェア資源を備えた物理的実体としては、汎用コンピュータなどがある。 The apparatus of the present invention has, for example as a single hardware entity, an input unit to which a keyboard or the like can be connected, an output unit to which a liquid crystal display or the like can be connected, a communication unit to which a communication device (for example, a communication cable) capable of communicating with the outside of the hardware entity can be connected, a CPU (Central Processing Unit, which may include a cache memory, registers, and the like), RAM and ROM as memory, an external storage device such as a hard disk, and a bus that connects the input unit, output unit, communication unit, CPU, RAM, ROM, and external storage device so that data can be exchanged among them. If necessary, the hardware entity may also be provided with a device (drive) that can read from and write to a recording medium such as a CD-ROM. A general-purpose computer is an example of a physical entity with such hardware resources.

ハードウェアエンティティの外部記憶装置には、上述の機能を実現するために必要となるプログラムおよびこのプログラムの処理において必要となるデータなどが記憶されている（外部記憶装置に限らず、例えばプログラムを読み出し専用記憶装置であるＲＯＭに記憶させておくこととしてもよい）。また、これらのプログラムの処理によって得られるデータなどは、ＲＡＭや外部記憶装置などに適宜に記憶される。 The external storage device of the hardware entity stores a program necessary for realizing the above functions and data necessary for processing the program (not limited to the external storage device, for example, reading a program) It may be stored in a ROM that is a dedicated storage device). Data obtained by the processing of these programs is appropriately stored in a RAM or an external storage device.

ハードウェアエンティティでは、外部記憶装置（あるいはＲＯＭなど）に記憶された各プログラムとこの各プログラムの処理に必要なデータが必要に応じてメモリに読み込まれて、適宜にＣＰＵで解釈実行・処理される。その結果、ＣＰＵが所定の機能（上記、…部、…手段などと表した各構成要件）を実現する。 In the hardware entity, each program stored in the external storage device (or ROM or the like) and the data needed to process it are read into memory as needed and are interpreted, executed, and processed by the CPU as appropriate. As a result, the CPU realizes predetermined functions (the components referred to above as "... unit", "... means", and so on).

本発明は上述の実施形態に限定されるものではなく、本発明の趣旨を逸脱しない範囲で適宜変更が可能である。また、上記実施形態において説明した処理は、記載の順に従って時系列に実行されるのみならず、処理を実行する装置の処理能力あるいは必要に応じて並列的にあるいは個別に実行されるとしてもよい。 The present invention is not limited to the above-described embodiment, and can be appropriately changed without departing from the spirit of the present invention. In addition, the processing described in the above embodiment may be executed not only in time series according to the order of description but also in parallel or individually as required by the processing capability of the apparatus that executes the processing. .

既述のように、上記実施形態において説明したハードウェアエンティティ（本発明の装置）における処理機能をコンピュータによって実現する場合、ハードウェアエンティティが有すべき機能の処理内容はプログラムによって記述される。そして、このプログラムをコンピュータで実行することにより、上記ハードウェアエンティティにおける処理機能がコンピュータ上で実現される。 As described above, when the processing functions in the hardware entity (the apparatus of the present invention) described in the above embodiments are realized by a computer, the processing contents of the functions that the hardware entity should have are described by a program. Then, by executing this program on a computer, the processing functions in the hardware entity are realized on the computer.

この処理内容を記述したプログラムは、コンピュータで読み取り可能な記録媒体に記録しておくことができる。コンピュータで読み取り可能な記録媒体としては、例えば、磁気記録装置、光ディスク、光磁気記録媒体、半導体メモリ等どのようなものでもよい。具体的には、例えば、磁気記録装置として、ハードディスク装置、フレキシブルディスク、磁気テープ等を、光ディスクとして、ＤＶＤ（Digital Versatile Disc）、ＤＶＤ−ＲＡＭ（Random Access Memory）、ＣＤ−ＲＯＭ（Compact Disc Read Only Memory）、ＣＤ−Ｒ（Recordable）／ＲＷ（ReWritable）等を、光磁気記録媒体として、ＭＯ（Magneto-Optical disc）等を、半導体メモリとしてＥＥＰ−ＲＯＭ（Electronically Erasable and Programmable-Read Only Memory）等を用いることができる。 The program describing the processing contents can be recorded on a computer-readable recording medium. As the computer-readable recording medium, for example, any recording medium such as a magnetic recording device, an optical disk, a magneto-optical recording medium, and a semiconductor memory may be used. Specifically, for example, as a magnetic recording device, a hard disk device, a flexible disk, a magnetic tape or the like, and as an optical disk, a DVD (Digital Versatile Disc), a DVD-RAM (Random Access Memory), a CD-ROM (Compact Disc Read Only). Memory), CD-R (Recordable) / RW (ReWritable), etc., magneto-optical recording medium, MO (Magneto-Optical disc), etc., semiconductor memory, EEP-ROM (Electronically Erasable and Programmable-Read Only Memory), etc. Can be used.

また、このプログラムの流通は、例えば、そのプログラムを記録したＤＶＤ、ＣＤ−ＲＯＭ等の可搬型記録媒体を販売、譲渡、貸与等することによって行う。さらに、このプログラムをサーバコンピュータの記憶装置に格納しておき、ネットワークを介して、サーバコンピュータから他のコンピュータにそのプログラムを転送することにより、このプログラムを流通させる構成としてもよい。 The program is distributed by selling, transferring, or lending a portable recording medium such as a DVD or CD-ROM in which the program is recorded. Furthermore, the program may be distributed by storing the program in a storage device of the server computer and transferring the program from the server computer to another computer via a network.

このようなプログラムを実行するコンピュータは、例えば、まず、可搬型記録媒体に記録されたプログラムもしくはサーバコンピュータから転送されたプログラムを、一旦、自己の記憶装置に格納する。そして、処理の実行時、このコンピュータは、自己の記録媒体に格納されたプログラムを読み取り、読み取ったプログラムに従った処理を実行する。また、このプログラムの別の実行形態として、コンピュータが可搬型記録媒体から直接プログラムを読み取り、そのプログラムに従った処理を実行することとしてもよく、さらに、このコンピュータにサーバコンピュータからプログラムが転送されるたびに、逐次、受け取ったプログラムに従った処理を実行することとしてもよい。また、サーバコンピュータから、このコンピュータへのプログラムの転送は行わず、その実行指示と結果取得のみによって処理機能を実現する、いわゆるＡＳＰ（Application Service Provider）型のサービスによって、上述の処理を実行する構成としてもよい。なお、本形態におけるプログラムには、電子計算機による処理の用に供する情報であってプログラムに準ずるもの（コンピュータに対する直接の指令ではないがコンピュータの処理を規定する性質を有するデータ等）を含むものとする。 A computer that executes such a program first stores, for example, the program recorded on the portable recording medium or the program transferred from the server computer in its own storage device. When executing processing, the computer reads the program stored in its own recording medium and executes processing according to the read program. As another form of execution, the computer may read the program directly from the portable recording medium and execute processing according to it, or it may execute processing according to the received program each time a program is transferred to it from the server computer. The above processing may also be executed by a so-called ASP (Application Service Provider) type service, which does not transfer the program from the server computer to the computer and realizes the processing functions only through execution instructions and result acquisition. The program in this embodiment includes information that is provided for processing by an electronic computer and is equivalent to a program (such as data that is not a direct command to the computer but has the property of defining the computer's processing).

また、この形態では、コンピュータ上で所定のプログラムを実行させることにより、ハードウェアエンティティを構成することとしたが、これらの処理内容の少なくとも一部をハードウェア的に実現することとしてもよい。 In this embodiment, a hardware entity is configured by executing a predetermined program on a computer. However, at least a part of these processing contents may be realized by hardware.

Claims

第1球面マイクロホンアレー、第2球面マイクロホンアレーを球面上に配置されたマイクロホンの数、マイクロホンの配置位置、球面の半径がいずれも同一である球面状のマイクロホンアレーとし、
前記マイクロホンの数をJ、前記マイクロホンの配置位置を指定する仰角と方位角の組を(Θ_j, Φ_j) (j=1, 2,…, J)、前記球面の半径をr_a、前記第1球面マイクロホンアレーの中心位置をd₁、前記第2球面マイクロホンアレーの中心位置をd₂とし、
混合波モデルを3次元座標系の原点へ入射角(θ_q,φ_q)で入射してくる平面波(q=1, 2,…, Q)と、前記原点から距離Rだけ離れた位置にある点音源から生成され、前記原点へ同じ入射角(θ_q,φ_q)で入射してくる球面波(q=1, 2,…, Q)のモデルとし、
前記第m球面マイクロホンアレー(m=1, 2)上の第jマイクロホン(j=1, 2,…, J)で観測される平面波の周波数領域での観測信号である平面波周波数領域観測信号をv_p(ω, q, m, j) (ωは周波数を表すインデックスとし、ω=1, 2,…, F、q=1, 2,…, Qとする)、前記第m球面マイクロホンアレー(m=1, 2)上の第jマイクロホン(j=1, 2,…, J)で観測される球面波の周波数領域での観測信号である球面波周波数領域観測信号をv_c(ω, q, m, j) (ω=1, 2,…, F、q=1, 2,…, Q)とし、
前記第1球面マイクロホンアレーが収音した周波数領域収音信号u(ω, 1, j) (ω=1, 2,…, F、j=1, 2,…, J)、前記第2球面マイクロホンアレーが収音した周波数領域収音信号u(ω, 2, j) (ω=1, 2,…, F、j=1, 2,…, J)と、音場推定の対象となる仮想マイクロホンの配置位置である仮想マイクロホン位置rから、前記仮想マイクロホン位置rにおける周波数領域での収音信号である仮想マイクロホン周波数領域収音信号u(ω, r) (ω=1, 2,…, F)を推定する音場推定装置であって、
前記周波数領域収音信号u(ω, m, j)(ω=1, 2,…, F、m=1, 2、j=1, 2,…, J)から、次式で定義されるコスト関数J(ω) (ω=1, 2,…, F)を用いて、音場を構成する混合波の分解係数a_p(ω, q)、a_c(ω, q) (ω=1, 2,…, F、q=1, 2,…, Q)からなるベクトルである音場表現ベクトルa(ω)(ω=1, 2,…, F)を推定する音場表現ベクトル推定部と、

ただし、D(ω) (ω=1, 2,…, F)は、次式で定義される2J×2Qの辞書行列であり、

d_p(ω, q)(ω=1, 2,…, F、q=1, 2,…, Q)は、次式で定義される2J次元の第q平面波ベクトルであり、

d_c(ω, q)(ω=1, 2,…, F、q=1, 2,…, Q)は、次式で定義される2J次元の第q球面波ベクトルであり、

前記音場音場表現ベクトルa(ω)(ω=1, 2,…, F)は、前記分解係数a_p(ω, q)、a_c(ω, q) (ω=1, 2,…, F、q=1, 2,…, Q)を用いて次式で定義される2Q次元ベクトルであり、

u_m(ω) (m=1, 2、ω=1, 2,…, F)は、次式で定義されるJ次元の第m球面マイクロホンアレーの周波数領域収音信号ベクトルであり、

λは正則化パラメータであるとし、
前記音場表現ベクトルa(ω) (ω=1, 2,…, F)、前記仮想マイクロホン位置rから、前記仮想マイクロホン周波数領域収音信号u(ω, r)(ω=1, 2,…, F)を計算する音場計算部と
を含む音場推定装置。 Let a first spherical microphone array and a second spherical microphone array be spherical microphone arrays that are identical in the number of microphones arranged on the sphere, the placement positions of the microphones, and the radius of the sphere,
let J be the number of the microphones, (Θ_j, Φ_j) (j = 1, 2, ..., J) be the pairs of elevation and azimuth angles that specify the placement positions of the microphones, r_a be the radius of the sphere, d_1 be the center position of the first spherical microphone array, and d_2 be the center position of the second spherical microphone array,
let a mixed wave model be a model of plane waves (q = 1, 2, ..., Q) arriving at the origin of a three-dimensional coordinate system at incidence angles (θ_q, φ_q) and of spherical waves (q = 1, 2, ..., Q) that are generated by point sound sources located at a distance R from the origin and arrive at the origin at the same incidence angles (θ_q, φ_q),
let v_p(ω, q, m, j) (where ω is an index representing frequency, ω = 1, 2, ..., F, q = 1, 2, ..., Q) be a plane wave frequency domain observation signal, which is the frequency domain observation signal of the plane wave observed by the j-th microphone (j = 1, 2, ..., J) on the m-th spherical microphone array (m = 1, 2), and let v_c(ω, q, m, j) (ω = 1, 2, ..., F, q = 1, 2, ..., Q) be a spherical wave frequency domain observation signal, which is the frequency domain observation signal of the spherical wave observed by the j-th microphone (j = 1, 2, ..., J) on the m-th spherical microphone array (m = 1, 2),
a sound field estimation device that estimates, from frequency domain collected signals u(ω, 1, j) (ω = 1, 2, ..., F, j = 1, 2, ..., J) collected by the first spherical microphone array, frequency domain collected signals u(ω, 2, j) (ω = 1, 2, ..., F, j = 1, 2, ..., J) collected by the second spherical microphone array, and a virtual microphone position r, which is the placement position of a virtual microphone targeted by the sound field estimation, a virtual microphone frequency domain collected signal u(ω, r) (ω = 1, 2, ..., F), which is the collected signal in the frequency domain at the virtual microphone position r, the device comprising:
a sound field expression vector estimation unit that estimates, from the frequency domain collected signals u(ω, m, j) (ω = 1, 2, ..., F, m = 1, 2, j = 1, 2, ..., J), using a cost function J(ω) (ω = 1, 2, ..., F) defined by the following equation, a sound field expression vector a(ω) (ω = 1, 2, ..., F), which is a vector composed of the decomposition coefficients a_p(ω, q), a_c(ω, q) (ω = 1, 2, ..., F, q = 1, 2, ..., Q) of the mixed waves constituting the sound field,

where D(ω) (ω = 1, 2, ..., F) is a 2J × 2Q dictionary matrix defined by the following equation,

d_p(ω, q) (ω = 1, 2, ..., F, q = 1, 2, ..., Q) is the 2J-dimensional q-th plane wave vector defined by the following equation,

d_c(ω, q) (ω = 1, 2, ..., F, q = 1, 2, ..., Q) is the 2J-dimensional q-th spherical wave vector defined by the following equation,

the sound field expression vector a(ω) (ω = 1, 2, ..., F) is a 2Q-dimensional vector defined by the following equation using the decomposition coefficients a_p(ω, q), a_c(ω, q) (ω = 1, 2, ..., F, q = 1, 2, ..., Q),

u_m(ω) (m = 1, 2, ω = 1, 2, ..., F) is the J-dimensional frequency domain collected signal vector of the m-th spherical microphone array defined by the following equation,

and λ is a regularization parameter; and
a sound field calculation unit that calculates the virtual microphone frequency domain collected signal u(ω, r) (ω = 1, 2, ..., F) from the sound field expression vector a(ω) (ω = 1, 2, ..., F) and the virtual microphone position r.
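
As a rough illustration of how the quantities named in claim 1 fit together, the Python sketch below stacks the two array signal vectors u_1(ω) and u_2(ω), assembles a 2J × 2Q dictionary from plane wave and spherical wave vectors, and evaluates a cost for a candidate a(ω). The equations of the claim are not reproduced in this text, so the column order of D(ω), the ordering of a(ω), and the cost form (squared residual plus λ times the sum of coefficient magnitudes) are assumptions made only for illustration. The function names `build_dictionary` and `cost` are hypothetical.

```python
# Hypothetical sketch: the claim's equations are not reproduced in the text,
# so the layouts and the cost form below are assumptions, not the claimed formulas.

def build_dictionary(d_p, d_c):
    """Assumed layout: D = [d_p(1) ... d_p(Q), d_c(1) ... d_c(Q)].

    d_p, d_c: lists of Q column vectors, each a list of 2J complex numbers.
    Returns D as a list of 2J rows with 2Q entries each.
    """
    columns = list(d_p) + list(d_c)
    n_rows = len(columns[0])
    return [[col[i] for col in columns] for i in range(n_rows)]


def cost(D, u1, u2, a, lam):
    """Assumed cost: ||[u1; u2] - D a||^2 + lam * sum_q |a_q|.

    u1, u2: J-dimensional signal vectors of the first and second array.
    a: 2Q-dimensional candidate vector, assumed ordered as [a_p(1..Q), a_c(1..Q)].
    """
    u = list(u1) + list(u2)  # stacked 2J-dimensional observation
    residual = [
        u_i - sum(d_ij * a_j for d_ij, a_j in zip(row, a))
        for u_i, row in zip(u, D)
    ]
    data_term = sum(abs(r) ** 2 for r in residual)
    penalty = lam * sum(abs(a_j) for a_j in a)
    return data_term + penalty
```

With J = 2 and Q = 1, for instance, `build_dictionary([[1, 1j, -1, -1j]], [[0.5, 0.5, 0.5, 0.5]])` gives a 4 × 2 matrix, and a vector a that reproduces the stacked observation exactly leaves only the penalty term in the cost.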

請求項１に記載の音場推定装置であって、
前記第1球面マイクロホンアレー、前記第2球面マイクロホンアレーは、開球型球面マイクロホンアレーであり、
iを虚数単位、・を内積記号とし、
前記平面波周波数領域観測信号v_p(ω, q, m, j) は、次式で表され、

ただし、kは周波数インデックスωに基づき決定される波数、k^_qは次式で表されるベクトルであり、

r(m, j) (m=1, 2、j=1, 2,…, J)は、次式で表される前記第m球面マイクロホンアレー上の第jマイクロホンの3次元位置であるとし、

前記球面波周波数領域観測信号v_c(ω, q, m, j)は、次式で表され、

ただし、R_q(q=1, 2,…, Q)は、次式で表される前記点音源の3次元位置であるとし、

前記音場表現ベクトル推定部は、前記音場表現ベクトルa(ω)を前記コスト関数J(ω)の最小値を実現するベクトルとして算出し、
前記音場計算部は、次式を用いて前記仮想マイクロホン周波数領域収音信号u(ω,r)を計算する

ことを特徴とする音場推定装置。 The sound field estimation device according to claim 1, wherein
the first spherical microphone array and the second spherical microphone array are open-sphere spherical microphone arrays,
i is the imaginary unit and · denotes the inner product,
the plane wave frequency domain observation signal v_p(ω, q, m, j) is expressed by the following equation,

where k is the wave number determined based on the frequency index ω, and k^_q is a vector expressed by the following equation,

r(m, j) (m = 1, 2, j = 1, 2, ..., J) is the three-dimensional position of the j-th microphone on the m-th spherical microphone array, expressed by the following equation,

the spherical wave frequency domain observation signal v_c(ω, q, m, j) is expressed by the following equation,

where R_q (q = 1, 2, ..., Q) is the three-dimensional position of the point sound source, expressed by the following equation,

the sound field expression vector estimation unit calculates the sound field expression vector a(ω) as the vector that attains the minimum value of the cost function J(ω), and
the sound field calculation unit calculates the virtual microphone frequency domain collected signal u(ω, r) using the following equation:

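
The open-sphere case of claim 2 can be sketched in Python as follows. Because the equations are not reproduced in this text, the sketch rests on labeled assumptions: the elevation angle is taken as the polar angle measured from the z axis, a plane wave arriving from direction k^_q is modeled as exp(-i k k^_q · r), a point source is placed at R_q = R k^_q and modeled as exp(i k |r - R_q|) / |r - R_q|, and the virtual microphone signal is taken as the coefficient-weighted sum of both wave types. Sign and normalization conventions in the actual claim may differ, and all function names are hypothetical.

```python
import cmath
import math

# Hypothetical sketch; angle, sign and normalization conventions are assumptions.


def unit_vector(theta, phi):
    """Unit vector for (elevation, azimuth); theta is assumed to be the polar angle."""
    return (
        math.sin(theta) * math.cos(phi),
        math.sin(theta) * math.sin(phi),
        math.cos(theta),
    )


def mic_position(center, r_a, theta_j, phi_j):
    """Position r(m, j): array center d_m plus a point on the sphere of radius r_a."""
    e = unit_vector(theta_j, phi_j)
    return tuple(c + r_a * x for c, x in zip(center, e))


def plane_wave(k, theta_q, phi_q, r):
    """Assumed plane wave observed at r, arriving from direction (theta_q, phi_q)."""
    k_hat = unit_vector(theta_q, phi_q)
    return cmath.exp(-1j * k * sum(a * b for a, b in zip(k_hat, r)))


def spherical_wave(k, R, theta_q, phi_q, r):
    """Assumed spherical wave observed at r from a point source at distance R."""
    src = tuple(R * x for x in unit_vector(theta_q, phi_q))
    dist = math.sqrt(sum((s - x) ** 2 for s, x in zip(src, r)))
    return cmath.exp(1j * k * dist) / dist


def virtual_mic_signal(k, R, directions, a_p, a_c, r):
    """Assumed synthesis: sum over q of a_p(q) * plane wave + a_c(q) * spherical wave."""
    return sum(
        ap * plane_wave(k, th, ph, r) + ac * spherical_wave(k, R, th, ph, r)
        for (th, ph), ap, ac in zip(directions, a_p, a_c)
    )
```

Under these conventions the two wave types agree in the far field: for very large R, the spherical wave multiplied by R exp(-i k R) approaches the plane wave from the same direction, which is one way to sanity-check the chosen signs.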

請求項１に記載の音場推定装置であって、
前記第1球面マイクロホンアレー、前記第2球面マイクロホンアレーは、剛球型球面マイクロホンアレーであり、
iを虚数単位、・を内積記号とし、
Nを0以上の整数、Y_n ^w(x, y)を次数n, wの球面調和関数、Y_n ^w*(x, y)を次数n, wの球面調和関数の複素共役、j_n(x)をオーダーnの球ベッセル関数、h_n(x)をオーダーnの第1種ハンケル関数、j_n'(x)をj_n(x)の微分関数、h_n'(x)をh_n(x)の微分関数とし、
前記平面波周波数領域観測信号v_p(ω, q, m, j) は、次式で表され、

R⁺ _m,q(m=1, 2、q=1, 2,…, Q)を第m球面マイクロホンアレーの中心と球面波を生成するQ個の点音源との距離、(θ⁺ _m,q, φ⁺ _m,q)(m=1, 2、q=1, 2,…, Q)を第m球面マイクロホンアレーの中心から見た球面波を生成するQ個の点音源の仰角と方位角の組とし、
前記球面波周波数領域観測信号v_c(ω, q, m, j)は、次式で表され、

前記音場表現ベクトル推定部は、前記音場表現ベクトルa(ω)を前記コスト関数J(ω)の最小値を実現するベクトルとして算出し、
(Θ⁺, Φ⁺)を3次元座標系の原点からみた仮想マイクロホン位置rの仰角と方位角の組、R⁺ _q(q=1, 2,…, Q)を仮想マイクロホン位置rと球面波を生成するQ個の点音源との距離、(θ⁺ _q, φ⁺ _q)(q=1, 2,…, Q)は仮想マイクロホン位置rから見た球面波を生成するQ個の点音源の仰角と方位角の組とし、
前記音場計算部は、次式を用いて前記仮想マイクロホン周波数領域収音信号u(ω,r)を計算する

ことを特徴とする音場推定装置。 The sound field estimation device according to claim 1, wherein
the first spherical microphone array and the second spherical microphone array are rigid-sphere spherical microphone arrays,
i is the imaginary unit and · denotes the inner product,
N is an integer greater than or equal to 0, Y_n^w(x, y) is the spherical harmonic function of order n, w, Y_n^w*(x, y) is the complex conjugate of the spherical harmonic function of order n, w, j_n(x) is the spherical Bessel function of order n, h_n(x) is the Hankel function of the first kind of order n, j_n'(x) is the derivative of j_n(x), and h_n'(x) is the derivative of h_n(x),
the plane wave frequency domain observation signal v_p(ω, q, m, j) is expressed by the following equation,

R⁺_{m,q} (m = 1, 2, q = 1, 2, ..., Q) is the distance between the center of the m-th spherical microphone array and each of the Q point sound sources that generate the spherical waves, and (θ⁺_{m,q}, φ⁺_{m,q}) (m = 1, 2, q = 1, 2, ..., Q) is the pair of elevation and azimuth angles of each of the Q point sound sources that generate the spherical waves, as seen from the center of the m-th spherical microphone array,
the spherical wave frequency domain observation signal v_c(ω, q, m, j) is expressed by the following equation,

the sound field expression vector estimation unit calculates the sound field expression vector a(ω) as the vector that attains the minimum value of the cost function J(ω),
(Θ⁺, Φ⁺) is the pair of elevation and azimuth angles of the virtual microphone position r as seen from the origin of the three-dimensional coordinate system, R⁺_q (q = 1, 2, ..., Q) is the distance between the virtual microphone position r and each of the Q point sound sources that generate the spherical waves, and (θ⁺_q, φ⁺_q) (q = 1, 2, ..., Q) is the pair of elevation and azimuth angles of each of the Q point sound sources that generate the spherical waves, as seen from the virtual microphone position r, and
the sound field calculation unit calculates the virtual microphone frequency domain collected signal u(ω, r) using the following equation:

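
Claim 3 names the spherical Bessel function j_n, the Hankel function h_n, and their derivatives, which are the ingredients of the textbook expression for a plane wave scattered by a rigid sphere. Since the claim's equation is not reproduced in this text, the Python sketch below uses that textbook form as an assumption: the radial term b_n(x) = j_n(x) - j_n'(x) h_n(x) / h_n'(x) with x = k r_a, and the sum over w of Y_n^w Y_n^w* replaced by (2n + 1) P_n(cos γ) / (4π) through the spherical harmonic addition theorem. The plane wave is assumed to have the form exp(i k k^ · r), γ is the angle between the microphone direction and k^, and all function names are hypothetical.

```python
import math

# Hypothetical sketch of the textbook rigid-sphere plane wave response;
# the claim's own equation is not reproduced in the text.


def spherical_bessel_jy(n_max, x):
    """j_n(x) and y_n(x) for n = 0..n_max by upward recurrence.

    Upward recurrence for j_n loses accuracy when n is much larger than x,
    so this is only suitable for small truncation orders.
    """
    j = [math.sin(x) / x]
    y = [-math.cos(x) / x]
    if n_max >= 1:
        j.append(math.sin(x) / x**2 - math.cos(x) / x)
        y.append(-math.cos(x) / x**2 - math.sin(x) / x)
    for n in range(1, n_max):
        j.append((2 * n + 1) / x * j[n] - j[n - 1])
        y.append((2 * n + 1) / x * y[n] - y[n - 1])
    return j, y


def derivative(f, n, x):
    """f_n'(x) using f_0' = -f_1 and f_n' = f_{n-1} - (n + 1) / x * f_n."""
    return -f[1] if n == 0 else f[n - 1] - (n + 1) / x * f[n]


def rigid_mode_strength(N, x):
    """b_n(x) = j_n(x) - j_n'(x) / h_n'(x) * h_n(x) for n = 0..N, h_n = j_n + i y_n."""
    j, y = spherical_bessel_jy(N + 1, x)
    b = []
    for n in range(N + 1):
        h = complex(j[n], y[n])
        dj = derivative(j, n, x)
        dh = complex(dj, derivative(y, n, x))
        b.append(j[n] - dj / dh * h)
    return b


def legendre(N, c):
    """Legendre polynomials P_0(c)..P_N(c) by Bonnet's recurrence."""
    P = [1.0, c]
    for n in range(1, N):
        P.append(((2 * n + 1) * c * P[n] - n * P[n - 1]) / (n + 1))
    return P[: N + 1]


def rigid_plane_wave(N, k, r_a, cos_gamma):
    """Pressure on a rigid sphere of radius r_a for a unit plane wave, truncated at order N."""
    b = rigid_mode_strength(N, k * r_a)
    P = legendre(N, cos_gamma)
    return sum((1j ** n) * (2 * n + 1) * b[n] * P[n] for n in range(N + 1))
```

A convenient check on the radial term is the Wronskian identity j_n(x) y_n'(x) - j_n'(x) y_n(x) = 1 / x², which implies b_n(x) = i / (x² h_n'(x)).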

第1球面マイクロホンアレー、第2球面マイクロホンアレーを球面上に配置されたマイクロホンの数、マイクロホンの配置位置、球面の半径がいずれも同一である球面状のマイクロホンアレーとし、
前記マイクロホンの数をJ、前記マイクロホンの配置位置を指定する仰角と方位角の組を(Θ_j, Φ_j) (j=1, 2,…, J)、前記球面の半径をr_a、前記第1球面マイクロホンアレーの中心位置をd₁、前記第2球面マイクロホンアレーの中心位置をd₂とし、
混合波モデルを3次元座標系の原点へ入射角(θ_q,φ_q)で入射してくる平面波(q=1, 2,…, Q)と、前記原点から距離Rだけ離れた位置にある点音源から生成され、前記原点へ同じ入射角(θ_q,φ_q)で入射してくる球面波(q=1, 2,…, Q)のモデルとし、
前記第m球面マイクロホンアレー(m=1, 2)上の第jマイクロホン(j=1, 2,…, J)で観測される平面波の周波数領域での観測信号である平面波周波数領域観測信号をv_p(ω, q, m, j) (ωは周波数を表すインデックスとし、ω=1, 2,…, F、q=1, 2,…, Qとする)、前記第m球面マイクロホンアレー(m=1, 2)上の第jマイクロホン(j=1, 2,…, J)で観測される球面波の周波数領域での観測信号である球面波周波数領域観測信号をv_c(ω, q, m, j) (ω=1, 2,…, F、q=1, 2,…, Q)とし、
音場推定装置が、前記第1球面マイクロホンアレーが収音した周波数領域収音信号u(ω, 1, j) (ω=1, 2,…, F、j=1, 2,…, J)、前記第2球面マイクロホンアレーが収音した周波数領域収音信号u(ω, 2, j) (ω=1, 2,…, F、j=1, 2,…, J)と、音場推定の対象となる仮想マイクロホンの配置位置である仮想マイクロホン位置rから、前記仮想マイクロホン位置rにおける周波数領域での収音信号である仮想マイクロホン周波数領域収音信号u(ω, r) (ω=1, 2,…, F)を推定する音場推定方法であって、
前記音場推定装置が、前記周波数領域収音信号u(ω, m, j)(ω=1, 2,…, F、m=1, 2、j=1, 2,…, J)から、次式で定義されるコスト関数J(ω) (ω=1, 2,…, F)を用いて、音場を構成する混合波の分解係数a_p(ω, q)、a_c(ω, q) (ω=1, 2,…, F、q=1, 2,…, Q)からなるベクトルである音場表現ベクトルa(ω)(ω=1, 2,…, F)を推定する音場表現ベクトル推定ステップと、

λは正則化パラメータであるとし、
前記音場推定装置が、前記音場表現ベクトルa(ω) (ω=1, 2,…, F)、前記仮想マイクロホン位置rから、前記仮想マイクロホン周波数領域収音信号u(ω, r)(ω=1, 2,…, F)を計算する音場計算ステップと
を含む音場推定方法。 Let a first spherical microphone array and a second spherical microphone array be spherical microphone arrays that are identical in the number of microphones arranged on the sphere, the placement positions of the microphones, and the radius of the sphere,
let J be the number of the microphones, (Θ_j, Φ_j) (j = 1, 2, ..., J) be the pairs of elevation and azimuth angles that specify the placement positions of the microphones, r_a be the radius of the sphere, d_1 be the center position of the first spherical microphone array, and d_2 be the center position of the second spherical microphone array,
let a mixed wave model be a model of plane waves (q = 1, 2, ..., Q) arriving at the origin of a three-dimensional coordinate system at incidence angles (θ_q, φ_q) and of spherical waves (q = 1, 2, ..., Q) that are generated by point sound sources located at a distance R from the origin and arrive at the origin at the same incidence angles (θ_q, φ_q),
let v_p(ω, q, m, j) (where ω is an index representing frequency, ω = 1, 2, ..., F, q = 1, 2, ..., Q) be a plane wave frequency domain observation signal, which is the frequency domain observation signal of the plane wave observed by the j-th microphone (j = 1, 2, ..., J) on the m-th spherical microphone array (m = 1, 2), and let v_c(ω, q, m, j) (ω = 1, 2, ..., F, q = 1, 2, ..., Q) be a spherical wave frequency domain observation signal, which is the frequency domain observation signal of the spherical wave observed by the j-th microphone (j = 1, 2, ..., J) on the m-th spherical microphone array (m = 1, 2),
a sound field estimation method in which a sound field estimation device estimates, from frequency domain collected signals u(ω, 1, j) (ω = 1, 2, ..., F, j = 1, 2, ..., J) collected by the first spherical microphone array, frequency domain collected signals u(ω, 2, j) (ω = 1, 2, ..., F, j = 1, 2, ..., J) collected by the second spherical microphone array, and a virtual microphone position r, which is the placement position of a virtual microphone targeted by the sound field estimation, a virtual microphone frequency domain collected signal u(ω, r) (ω = 1, 2, ..., F), which is the collected signal in the frequency domain at the virtual microphone position r, the method comprising:
a sound field expression vector estimation step in which the sound field estimation device estimates, from the frequency domain collected signals u(ω, m, j) (ω = 1, 2, ..., F, m = 1, 2, j = 1, 2, ..., J), using a cost function J(ω) (ω = 1, 2, ..., F) defined by the following equation, a sound field expression vector a(ω) (ω = 1, 2, ..., F), which is a vector composed of the decomposition coefficients a_p(ω, q), a_c(ω, q) (ω = 1, 2, ..., F, q = 1, 2, ..., Q) of the mixed waves constituting the sound field,

where λ is a regularization parameter; and
a sound field calculation step in which the sound field estimation device calculates the virtual microphone frequency domain collected signal u(ω, r) (ω = 1, 2, ..., F) from the sound field expression vector a(ω) (ω = 1, 2, ..., F) and the virtual microphone position r.

請求項４に記載の音場推定方法であって、
前記第1球面マイクロホンアレー、前記第2球面マイクロホンアレーは、開球型球面マイクロホンアレーであり、
iを虚数単位、・を内積記号とし、
前記平面波周波数領域観測信号v_p(ω, q, m, j) は、次式で表され、

前記音場表現ベクトル推定ステップは、前記音場表現ベクトルa(ω)を前記コスト関数J(ω)の最小値を実現するベクトルとして算出し、
前記音場計算ステップは、次式を用いて前記仮想マイクロホン周波数領域収音信号u(ω,r)を計算する

ことを特徴とする音場推定方法。 The sound field estimation method according to claim 4, wherein
the first spherical microphone array and the second spherical microphone array are open-sphere spherical microphone arrays,
i is the imaginary unit and · denotes the inner product,
the plane wave frequency domain observation signal v_p(ω, q, m, j) is expressed by the following equation,

the sound field expression vector estimation step calculates the sound field expression vector a(ω) as the vector that attains the minimum value of the cost function J(ω), and
the sound field calculation step calculates the virtual microphone frequency domain collected signal u(ω, r) using the following equation:


請求項４に記載の音場推定方法であって、
前記第1球面マイクロホンアレー、前記第2球面マイクロホンアレーは、剛球型球面マイクロホンアレーであり、
iを虚数単位、・を内積記号とし、
Nを0以上の整数、Y_n ^w(x, y)を次数n, wの球面調和関数、Y_n ^w*(x, y)を次数n, wの球面調和関数の複素共役、j_n(x)をオーダーnの球ベッセル関数、h_n(x)をオーダーnの第1種ハンケル関数、j_n'(x)をj_n(x)の微分関数、h_n'(x)をh_n(x)の微分関数とし、
前記平面波周波数領域観測信号v_p(ω, q, m, j) は、次式で表され、

前記音場表現ベクトル推定ステップは、前記音場表現ベクトルa(ω)を前記コスト関数J(ω)の最小値を実現するベクトルとして算出し、
(Θ⁺, Φ⁺)を3次元座標系の原点からみた仮想マイクロホン位置rの仰角と方位角の組、R⁺ _q(q=1, 2,…, Q)を仮想マイクロホン位置rと球面波を生成するQ個の点音源との距離、(θ⁺ _q, φ⁺ _q)(q=1, 2,…, Q)は仮想マイクロホン位置rから見た球面波を生成するQ個の点音源の仰角と方位角の組とし、
前記音場計算ステップは、次式を用いて前記仮想マイクロホン周波数領域収音信号u(ω,r)を計算する

ことを特徴とする音場推定方法。 The sound field estimation method according to claim 4, wherein
the first spherical microphone array and the second spherical microphone array are rigid-sphere spherical microphone arrays,
i is the imaginary unit and · denotes the inner product,
N is an integer greater than or equal to 0, Y_n^w(x, y) is the spherical harmonic function of order n, w, Y_n^w*(x, y) is the complex conjugate of the spherical harmonic function of order n, w, j_n(x) is the spherical Bessel function of order n, h_n(x) is the Hankel function of the first kind of order n, j_n'(x) is the derivative of j_n(x), and h_n'(x) is the derivative of h_n(x),
the plane wave frequency domain observation signal v_p(ω, q, m, j) is expressed by the following equation,

the sound field expression vector estimation step calculates the sound field expression vector a(ω) as the vector that attains the minimum value of the cost function J(ω),
(Θ⁺, Φ⁺) is the pair of elevation and azimuth angles of the virtual microphone position r as seen from the origin of the three-dimensional coordinate system, R⁺_q (q = 1, 2, ..., Q) is the distance between the virtual microphone position r and each of the Q point sound sources that generate the spherical waves, and (θ⁺_q, φ⁺_q) (q = 1, 2, ..., Q) is the pair of elevation and azimuth angles of each of the Q point sound sources that generate the spherical waves, as seen from the virtual microphone position r, and
the sound field calculation step calculates the virtual microphone frequency domain collected signal u(ω, r) using the following equation:


請求項１ないし３のいずれか１項に記載の音場推定装置としてコンピュータを機能させるためのプログラム。 A program for causing a computer to function as the sound field estimation device according to any one of claims 1 to 3.